Commit graph

4303 commits

Author SHA1 Message Date
siivonek 2e468b7014 Add full w8 y coordinate table. 2024-05-09 13:06:16 +03:00
Joose Sainio 4e4084434e Remove setr from the loop in hor_w4 2024-05-09 13:06:15 +03:00
Joose Sainio 8bbf01c376 Change the right shift in pred_planar_avx2 to use a 128 bit register version of the right shift intrinsics, since when the integer version does not have a compile time constant the compiler is forced to generate the 128 bit register using version anyways, but also has to convert the integer to the 128 bit register, and the compiler does not optimize this properly and instead does the conversion on every call of the loop. ***THIS NEEDS TO BE DONE FOR ALL SHIFTS THAT DO NOT USE COMPILE TIME CONSTANT SHIFTS*** 2024-05-09 13:06:15 +03:00
siivonek b02fb1b1af Remove left shift from planar half functions. Implement the left shift with madd. Planar preds of width 4, 8 and 16 should work now without overflows. Add loop unroll macros to vertical half functions. Will be added to hor half functions later. 2024-05-09 13:06:15 +03:00
siivonek 0eb0f110c2 Add missing packus to the end of planar calculation. 2024-05-09 13:05:49 +03:00
siivonek 4ae234ef24 Implement 16xN planar prediction. 2024-05-09 13:05:49 +03:00
siivonek 4bbd5f4c9a Implement 4xN planar prediction. 2024-05-09 13:05:49 +03:00
siivonek e888be80e2 Add intra avx2 planar placeholder functions. Implement 8xN planar prediction. Note: does not work with height < 4 yet. Initial plan is to produce the planar prediction as two halves. This is subject to change; at this point, it seems only planar functions for different widths are needed. 2024-05-09 13:05:49 +03:00
siivonek c91b76a668 Add globals required by intra avx2 code. Remove unnecessary stuff from intra avx2. 2024-05-09 12:42:38 +03:00
José Pedro 9bc8e59bcf Update flow control to print CTU stats to stats file when the stats-file-prefix is set 2024-01-19 09:52:16 +01:00
Joose Sainio e5e32d67f4 [avx2] Remove a define that was never meant to be committed 2023-09-27 12:54:53 +03:00
Joose Sainio 4a1cd926fb [rdoq] Fix rdoq using uninitialized values that do not matter 2023-09-26 14:26:07 +03:00
Joose Sainio 69c1c948fa [cfg] Specify that MTT and ISP are currently experimental 2023-09-26 10:41:31 +03:00
Joose Sainio e32cf4fb52 [avx2] Re-enable disabled avx2 functions that do not work with non-square blocks 2023-09-26 10:38:29 +03:00
Joose Sainio ff77346527 [dct2] Remove unnecessary memsets 2023-09-26 09:57:47 +03:00
Joose Sainio 64d222d17c [dep_quant] Remove dead code and fix small issue 2023-09-26 09:42:30 +03:00
siivonek 284724398e Add some comments. 2023-09-26 09:21:49 +03:00
Joose Sainio 3d4e732952 [avx2] Fix issue with 16x32 inverse transform 2023-09-26 09:21:49 +03:00
Joose Sainio d62a3f888e [avx2] static all transform tables 2023-09-26 09:21:48 +03:00
Joose Sainio 1f9955bdda [avx2] Fix compilation errors 2023-09-26 09:21:35 +03:00
Joose Sainio 13d4313e02 [avx2] Mostly working 2023-09-26 09:21:29 +03:00
Joose Sainio b78f9aff17 [avx2] Inverses work when ISP is not enabled 2023-09-26 09:21:24 +03:00
siivonek 4dccbcc30d [avx2] Forward transforms seem to be working 2023-09-26 09:21:24 +03:00
Joose Sainio 19829da152 Disable all avx2 optimizations that cannot be used with mtt/isp 2023-09-26 09:21:23 +03:00
Joose Sainio 1c293b8253 pass context_store as pointer. This reverts commit 47c5ea3d5c. 2023-09-26 09:21:23 +03:00
Joose Sainio 2caf077cff Remove avx512 intrinsics 2023-09-26 09:21:23 +03:00
Joose Sainio 254826d396 [avx2] Add comments 2023-09-26 09:21:19 +03:00
Joose Sainio f2fb641acb [avx2] Replace inefficient loop with AVX2 code 2023-09-26 09:21:19 +03:00
Joose Sainio bc24601369 [avx2] Improve avx2 version of update_common_context 2023-09-26 09:21:19 +03:00
Joose Sainio 915104cf10 [dep_quant] Change order of absLevels 2023-09-26 09:21:18 +03:00
Joose Sainio d850c346d6 [dep_quant] Change order of ctxInit 2023-09-26 09:21:18 +03:00
Joose Sainio a624988c91 [dep_quant] Separate abs levels and ctx init 2023-09-26 09:21:18 +03:00
Joose Sainio dda972c665 [avx2] Try to do lnz decision with avx2 2023-09-26 09:21:18 +03:00
Joose Sainio cf6f03b73b [avx2] This has worked but I'm pretty sure these should be unaligned 2023-09-26 09:20:56 +03:00
Joose Sainio b4c84e820c [avx2] Simplify 2023-09-26 09:20:56 +03:00
Joose Sainio 2811ce58f4 [avx2] AVX2 version of depquant now exactly matches scalar version 2023-09-26 09:20:56 +03:00
Joose Sainio 48ea4bff4d [dep_quant] Fix rate_estimator and quant_block init cases 2023-09-26 09:20:55 +03:00
Joose Sainio dfff9a8030 [avx2] Move dep quant stuff to strategies 2023-09-26 09:20:55 +03:00
Joose Sainio 0591342b3a [avx2] replace or 2023-09-26 09:20:38 +03:00
Joose Sainio 8b1d6fab59 [avx2] Replace loads and stores with non-avx512 stores 2023-09-26 09:20:37 +03:00
Joose Sainio 6d0a3fa5fc [avx2] Replace _mm_and_epi32 with _mm_and_si128 2023-09-26 09:20:37 +03:00
Joose Sainio 7fdc045690 [dep_quant] Clean up 2023-09-26 09:20:37 +03:00
Joose Sainio 8eb0f66734 [depquant] update_state_eos_avx2 working 2023-09-26 09:20:37 +03:00
Joose Sainio 00cc58bc55 [depquant] Only initialize rate_estimator when necessary 2023-09-26 09:20:37 +03:00
Joose Sainio 00f838306f [depquant] Initialize quant_block only when necessary 2023-09-26 09:20:37 +03:00
Joose Sainio 9e27b4056a [avx2] WIP update_state_eos_avx2 2023-09-26 09:20:36 +03:00
Joose Sainio c56350b8d6 [avx2] and last 2023-09-26 09:20:36 +03:00
Joose Sainio 9f69713c24 [depquant] remove an unnecessary memcpy 2023-09-26 09:20:36 +03:00
Joose Sainio aa48943c22 [avx2] Do decision cost comparison with avx2 2023-09-26 09:20:36 +03:00
Joose Sainio cd6110cfac [depquant] Pre calculate things: sig_ctx_offset gtx_ctx_offset cg_pos pos_y pos_x next_sbb_right next_sbb_below 2023-09-26 09:20:30 +03:00
Joose Sainio 8f4c3cecbf [avx2] update_states_avx2 working 2023-09-26 09:20:29 +03:00
Joose Sainio 58a66c0654 [avx2] WIP update_states_avx2 2023-09-26 09:20:29 +03:00
Joose Sainio 04be92a8ec [avx2] simplify 2023-09-26 09:20:29 +03:00
Joose Sainio 8b19c468cf [avx2] check_rd_costs_avx2 done 2023-09-26 09:20:29 +03:00
Joose Sainio c6e6f5da33 [avx2] WIP check_rd_costs_avx2, almost? 2023-09-26 09:20:29 +03:00
Joose Sainio 8caabcde1a [avx2] WIP check_rd_costs_avx2 2023-09-26 09:20:28 +03:00
Joose Sainio 2912db5fca [dep_quant.c] Small refactor 2023-09-26 09:20:28 +03:00
Joose Sainio 64d34f8559 [depquant] AoS -> SoA pre quant 2023-09-26 09:20:28 +03:00
Joose Sainio 2f1e9c4020 [depquant] AoS -> SoA all states 2023-09-26 09:20:28 +03:00
Joose Sainio 73442f1bba [depquant] AoS -> SoA for Decision 2023-09-26 09:20:28 +03:00
Marko Viitanen 26ef1dda09 [ibc] Fix chroma SAD handling and disable chroma SAD for now 2023-08-30 15:06:08 +03:00
Marko Viitanen 0239572796 [ibc] Fix some instances where CU_INTER was checked instead of !CU_INTRA 2023-08-23 15:21:45 +03:00
Marko Viitanen 312ac6731c [ibc] dual-tree rebase fixes 2023-08-15 13:24:22 +03:00
Joose Sainio 805afb1331 [fix] Minor fixes 2023-08-15 13:11:50 +03:00
Joose Sainio 8c14fa94ba [mtt] Fix small issues with luma and chroma searches 2023-08-15 13:11:49 +03:00
Joose Sainio 7a5245c5a4 [dual-tree] Fix chroma tree split model context derivation during search 2023-08-15 13:11:31 +03:00
Joose Sainio 707e11dbcf [dual-tree] Small fixes 2023-08-15 13:11:30 +03:00
Joose Sainio 91591c7e7c [dual-tree] Remove the limitation of not allowing 2 height chroma blocks in dual tree 2023-08-15 13:11:29 +03:00
Joose Sainio 146e1cb85e [dual-tree] WIP simplification 2023-08-15 13:11:28 +03:00
Joose Sainio 0f50caa2d0 [mtt] Fix various small issues and DepQuant for non-square blocks 2023-08-15 13:11:27 +03:00
Joose Sainio d222718c22 [mtt] Minor fixes 2023-08-15 13:11:26 +03:00
Joose Sainio d69bdf79f4 [mtt] Fix couple of issues with 64x32 CUs and non square tr skip rdoq 2023-08-15 13:10:13 +03:00
Joose Sainio 7d787c6b22 [ISP] Fix ISP cost calculation and DepQuant with mts 2023-08-15 13:10:13 +03:00
Joose Sainio 6e24b9a7f9 [DepQuant] Fix isp+depquant and trskip + isp 2023-08-15 13:10:12 +03:00
Joose Sainio 93c7e9c296 [DepQuant] Fix for mts and lfnst being quantized incorrectly during search 2023-08-15 13:10:11 +03:00
Joose Sainio dc652c75f9 [DepQuant] Isp and chroma 2023-08-15 13:10:10 +03:00
Joose Sainio 505c26eef3 [DepQuant] Fix 2023-08-15 13:10:10 +03:00
Joose Sainio c6087230a8 [DepQuant] Fix 2023-08-15 13:10:09 +03:00
Joose Sainio 5abe9e57c6 [DepQuant] Working but not necessarily improving 2023-08-15 13:10:09 +03:00
Joose Sainio 5236bc93be [DepQuant] WIP: doesn't crash but bitstream is illegal and quality a lot worse 2023-08-15 13:10:08 +03:00
Joose Sainio bfa699fac6 [DepQuant] WPP: API 2023-08-15 13:10:07 +03:00
Joose Sainio f8994a7fae [DepQuant] WIP: dequant 2023-08-15 13:08:54 +03:00
Joose Sainio 3e66a897d4 [DepQuant] WIP: compiles 2023-08-15 13:08:53 +03:00
Joose Sainio 4dbe0cd6c3 [DepQuant] WIP: easy part done 2023-08-15 13:08:24 +03:00
Joose Sainio 2a33af283e [DepQuant] WIP: initialization done 2023-08-15 13:08:24 +03:00
Joose Sainio 1373a7ac1d [mtt] correct indexing for chroma tree 2023-08-15 13:08:23 +03:00
Joose Sainio d3f42949a7 [mtt] Only consider termination if the cu is completely inside the frame 2023-08-15 13:08:22 +03:00
Joose Sainio 0c63743fc0 [mtt] Early terminations for all intra 2023-08-15 13:08:21 +03:00
Joose Sainio bd3ec75173 [mtt] search early terminations 2023-08-15 13:08:20 +03:00
Joose Sainio 2d00cab4b9 [isp] properly reset cabac context during intra search 2023-08-15 13:08:20 +03:00
Joose Sainio b27eca7c37 [deblock] fix width and height to correct order 2023-08-15 13:08:19 +03:00
Joose Sainio eae7d72384 [isp] Keep cabac contexts up to date for the different isp tus 2023-08-15 13:08:18 +03:00
Joose Sainio c744f79117 [mtt] Fix rdoq for non-square blocks 2023-08-15 13:08:17 +03:00
Joose Sainio 3b09c66d25 [deblock] Use the isp block dimensions instead of cu dimensions for deblock 2023-08-15 13:08:17 +03:00
Joose Sainio 73956a9a46 [isp] Fix isp bitcost calculation 2023-08-15 13:08:16 +03:00
Joose Sainio f3c8a4f5db [lfnst] Also chroma can only use lfnst if dimensions are minimum 4 2023-08-15 13:08:15 +03:00
Joose Sainio a36a1fb5ff [mtt] There is always at least the height or width amount reference pixels available 2023-08-15 13:08:15 +03:00
Joose Sainio af23c81afa [mtt] Fix reading uninitialized data for local chroma tree 2023-08-15 13:08:14 +03:00
Joose Sainio 9acdab3209 [mtt] Fix lfnst bit counting for 64 wide or tall chroma tree cus 2023-08-15 13:08:13 +03:00
Joose Sainio 812377db45 [mtt] Set cus outside of the frame to zero for initializing partial worktree 2023-08-15 13:08:12 +03:00
Joose Sainio 27d114bc08 [mtt] Fix negative indexing 2023-08-15 13:08:11 +03:00
Joose Sainio 4e203108bc [mtt] Fix ref pixel generation for the second half of 32x2 chroma cus 2023-08-15 13:08:11 +03:00
Joose Sainio 446c53fd00 [mtt] Fix cclm for non 64 divisible heights 2023-08-15 13:08:10 +03:00
Joose Sainio ad2bb20f23 [mtt] Fix deblock for isp and properly set the limit for cclm 2023-08-15 13:08:09 +03:00
Joose Sainio c89ebf8bf1 [cclm] Fix heap corruption for non 64 divisible frames 2023-08-15 13:08:08 +03:00
Joose Sainio d296cac7c3 [mtt] fix reference building for 16x1 2023-08-15 13:08:07 +03:00
Joose Sainio 926ed7e145 [rdoq] partly fix rdoq for 16x1 and 1x16 2023-08-15 13:08:06 +03:00
Joose Sainio 8e4b864e6b [deblock] Fix incorrect direction for transform split of tall blocks at the top CTU row also for chroma 2023-08-15 13:08:05 +03:00
Joose Sainio 34aed10ec1 [mtt] fix 2023-08-15 13:08:05 +03:00
Joose Sainio 1333ab55d9 [mtt] Fix ref building for 32x64 cus 2023-08-15 13:08:04 +03:00
Joose Sainio 1493a2616c [mtt] fix getting collocated chroma for edge cus 2023-08-15 13:08:03 +03:00
Joose Sainio ffe17e48d7 [mtt] minor fixes 2023-08-15 13:08:02 +03:00
Joose Sainio 06fa86c340 [isp] Fix coordinates 2023-08-15 13:07:59 +03:00
Joose Sainio 71516b8155 [mtt] Make sure mtt splits cannot reach a situation where search cannot be performed 2023-08-15 13:07:58 +03:00
Joose Sainio 23e6b9f56c [mtt] Check that we are inside the CTU before checking the ctu data 2023-08-15 13:07:58 +03:00
Joose Sainio facbc794bf [mtt] Fix trying to get split data from depth -1 2023-08-15 13:07:57 +03:00
Joose Sainio 567fa7b2bd [deblock] Fix incorrect direction for transform split of tall blocks at the top CTU row 2023-08-15 13:07:56 +03:00
Joose Sainio 9c2574880a [mtt] Fix deblock for --combine-intra 2023-08-15 13:07:55 +03:00
Joose Sainio 90ce1390c0 [mtt] static 2023-08-15 13:07:54 +03:00
Joose Sainio f6ecb15ced [mtt] Fix implicit splits when mtt is not enabled 2023-08-15 13:07:53 +03:00
Joose Sainio 05218bae21 [jccr] jccr=4 hasn't been necessary for a long time 2023-08-15 13:07:52 +03:00
Joose Sainio b69e9b2958 [mtt] Fix final issues? 2023-08-15 13:07:51 +03:00
Joose Sainio 6620ba8d76 [mtt] fix deblock 2023-08-15 13:07:50 +03:00
Joose Sainio 09baddef17 [mtt] Fix lfnst and chroma coeffs and tests 2023-08-15 13:07:49 +03:00
Joose Sainio 992182dafb WIP 2023-08-15 13:07:48 +03:00
Joose Sainio ba0d43d846 [mtt] Fill chroma data for the whole area covered by the local separate tree chroma cu 2023-08-15 13:07:47 +03:00
Joose Sainio 412dd20f09 [mtt] Fix implicit splits for non ctu divisible frames. 2023-08-15 13:07:46 +03:00
Joose Sainio 2da1a34ff3 [mtt] Fix isp for MTT 2023-08-15 13:07:45 +03:00
Joose Sainio b988c60dd1 [mtt] search works completely with everything except RDOQ deblock and ISP 2023-08-15 13:07:44 +03:00
Joose Sainio 6a6bed7f1f [mtt] WIP 2023-08-15 13:07:43 +03:00
Joose Sainio 065eb6fc03 [mtt] fix lfnst 2023-08-15 13:05:38 +03:00
Joose Sainio 9e644fafd0 [mtt] search with mtt depth 2 and dual tree works without lfnst 2023-08-15 13:05:37 +03:00
Joose Sainio fb146cb6ed [mtt] proper split availability checking for split flag 2023-08-15 13:05:35 +03:00
Joose Sainio d5d9afb1e2 [mtt] fix dual tree 2023-08-15 13:05:19 +03:00
Joose Sainio 8fbefc0de3 [mtt] fix cost calculation 2023-08-15 13:04:29 +03:00
Joose Sainio 657254d38a [mtt] search with depth 1 mtt kinda working 2023-08-15 13:04:28 +03:00
Joose Sainio 13aae7d03d [mtt] All individual mtt splits should be working + uvg_get_possible_splits 2023-08-15 13:04:27 +03:00
Joose Sainio 7b117f171f [mtt] WIP 16x16 TT split 2023-08-15 13:04:26 +03:00
Joose Sainio 43a710e104 fix rebase 2023-08-15 13:04:25 +03:00
Joose Sainio d257376ca0 [mtt] Single mtt split works for everything else, except 16x16 with TT 2023-08-15 13:04:24 +03:00
Joose Sainio 26ee443d2f [mtt] 64x32 and 32x64 2023-08-15 13:04:23 +03:00
Joose Sainio ab21c7e1d7 [mtt] Fix sqrt adjustment, cclm calculation on edges of CTU and waip for lfnst 2023-08-15 13:04:22 +03:00
Joose Sainio 5875dc1ef4 [mtt] Fix counting the number of reference pixels and implement WAIP adjustment 2023-08-15 13:04:21 +03:00
Joose Sainio b893a9268c [mtt] WIP 2023-08-15 13:04:20 +03:00
Joose Sainio 5ba8d45981 WIP 2023-08-15 13:04:19 +03:00
Joose Sainio 70cbaae619 [mtt] square root adjustment for quantization 2023-08-15 13:04:18 +03:00
Joose Sainio f19084569d WIP 2023-08-15 13:04:17 +03:00
Joose Sainio bbbd391b9e [mtt] WIP 2023-08-15 13:03:40 +03:00
Joose Sainio 02a5adf768 [mtt] remove work_tree 2023-08-15 13:01:57 +03:00
Joose Sainio f2abdd6424 [mtt] Remove work_tree_copy_down and change work_tree_copy_up not to require the whole work tree as input parameter 2023-08-15 13:01:56 +03:00
Joose Sainio 03b91992a3 [mtt] fix dual tree 2023-08-15 13:01:55 +03:00
Joose Sainio 536c0ff2ef [quant] fix fast coeff cost 2023-08-15 13:01:54 +03:00
Joose Sainio cf5f7658a0 [mtt] fix 2023-08-15 13:01:53 +03:00
Joose Sainio 1668b65f3f [mtt] fix 2023-08-15 13:01:53 +03:00
Joose Sainio e931c096db [mtt] fix 2023-08-15 13:01:52 +03:00
Joose Sainio c590e5ec73 [mtt] also copy top right CU 2023-08-15 13:01:51 +03:00
Joose Sainio a1e7664db3 [mtt] temporarily disable zero coeff rdo 2023-08-15 13:01:50 +03:00
Joose Sainio 239ee88306 [mtt] fix 2023-08-15 13:01:49 +03:00
Joose Sainio 1cf1501542 [mtt] fix 2023-08-15 13:01:48 +03:00
Joose Sainio 924a93b60e [mtt] Only initialize higher depth ctus partially 2023-08-15 13:01:48 +03:00
Joose Sainio 274e71dff6 [transform] Simplify chroma transform search a bit 2023-08-15 13:01:47 +03:00
Joose Sainio 58c6af8c87 [mtt] Add function for easily getting all split cu_locs 2023-08-15 13:01:46 +03:00
Joose Sainio cfc6aebe3c [mtt] Remove depth from cu_info_t 2023-08-15 13:01:45 +03:00
Joose Sainio b14f6f98ec [mtt] Completely remove tr_depth 2023-08-15 13:00:53 +03:00
Joose Sainio 9a29d9ded3 [mtt] remove depth from cbf 2023-08-15 13:00:51 +03:00
Joose Sainio e3dbeda7f7 [mtt] remove dependency to depth from deblock 2023-08-15 13:00:50 +03:00
Joose Sainio 89af7bda8e [mtt] remove unnecessary depth dependency from split flag 2023-08-15 13:00:49 +03:00
Joose Sainio 0b6f666a1b [mtt] remove lfnst dependency to depth 2023-08-15 13:00:08 +03:00
Joose Sainio 790b1fad48 wip 2023-08-15 13:00:07 +03:00
Joose Sainio 6a0864839c [mtt] Actually remove the last width dependency to depth 2023-08-15 13:00:06 +03:00
Joose Sainio dcf879e5ed [mtt] remove all rest usages of deriving width and height from depth 2023-08-15 12:59:39 +03:00
Joose Sainio 26dcadc149 [mtt] change most if not all of search hierarchy to use cu_loc_t 2023-08-15 12:47:11 +03:00
siivonek 0ec16967a1 [isp] Fix reference building. When ISP was in use, not enough samples were generated. Uninitialized memory was referenced. Fix some typos. 2023-08-14 12:21:30 +03:00
siivonek b16c404362 [isp] Remove some obsolete TODOs and old commented out code. 2023-08-14 12:21:30 +03:00
siivonek 95d73116f9 [isp] Fix some CI errors. Some const modifiers were discarded. 2023-08-14 12:21:29 +03:00
siivonek 90e2a17759 [lfnst] Fix LFNST error when MIP enabled. 2023-08-14 12:21:28 +03:00
siivonek 7005d222d5 [isp] Fix lfnst constraint check when ISP is used. Remove some obsolete comments. 2023-08-14 12:21:28 +03:00
siivonek 3c861e4c02 [isp] Fix search. Best LFNST and MTS modes were not selected correctly for ISP modes. 2023-08-14 12:21:27 +03:00
siivonek b4cc321349 [isp] Fix transform selection when MTS & ISP is used. Wrong transform was selected. Change mts parameter name to better reflect its purpose. 2023-08-14 12:21:26 +03:00
siivonek 85f6b00394 [isp] Add lfnst asserts. Fix error in MTS search. Fix chroma lfnst index when no coefficients present. 2023-08-14 12:21:26 +03:00
siivonek b9822398a0 [isp] Fix lfnst constraint checks when ISP is in use. Add some asserts. 2023-08-14 12:21:25 +03:00
siivonek 701257cdd2 [isp] Remove unnecessary code from forward dct 32. 2023-08-14 12:21:25 +03:00
siivonek 89db34d4e0 [isp] Use TR_MAX_WIDTH in ISP checks instead of parameter. 2023-08-14 12:21:24 +03:00
siivonek c4bc2d6b10 [isp] Limit ISP search to block size 32. Size 64 is not allowed. 2023-08-14 12:21:23 +03:00
siivonek 5713fbff1a [isp] Add ISP checks to search. LFNST can be used with ISP for larger blocks. Transform skip cannot be used with ISP. 2023-08-14 12:21:22 +03:00
siivonek 7282534879 [isp] Fix CI errors. 2023-08-14 12:21:22 +03:00
siivonek 01c4d1ddb0 [isp] Fix cabac issues. There are always four transform blocks even if there are only two ISP splits. Fix prediction issues. PDPC filter was applied when it should be disabled. Fix reference building issues. Left reference was built incorrectly for blocks with height 2. 2023-08-14 12:21:21 +03:00
siivonek b8e36bbc4a [isp] Fix storing cbfs for small ISP splits. Fix pdpc filtering. Cannot be used if width or height is less than 4. Fix dct related CI errors. 2023-08-14 12:21:20 +03:00
siivonek 99495c331b [isp] Fix some asserts to allow log2_dim 1 block sizes. Fix coefficient group scan order for small dimensions. 2023-08-14 12:21:19 +03:00
siivonek d39fddf0d8 [isp] Implement DCT for small blocks. 2023-08-14 12:21:19 +03:00
siivonek 910501012f [isp] Fix reference building for depth 2 blocks. Flip horizontal mode dimensions during prediction. Fix reference length during prediction when ISP enabled. 2023-08-14 12:21:18 +03:00
siivonek 7ba557af6b [isp] Fix luma cbf writing for ISP splits. Do not write luma cbf if first three splits had luma cbf 0. 2023-08-14 12:21:17 +03:00
siivonek a28e61eff7 [isp] Fix CI errors. 2023-08-14 12:21:17 +03:00
siivonek 4794104ecc [isp] Fix errors in reference building. Use cubic filter during prediction if ISP enabled. 2023-08-14 12:21:16 +03:00
Joose Sainio 662f31d61d [isp] Use correct coordinates for depth 4 chroma tu cost calculation 2023-08-14 12:17:36 +03:00
Joose Sainio 08942a5394 [tr-skip] fix transform skip flag writing 2023-08-14 12:17:35 +03:00
siivonek a261d4c5b3 [isp] WIP 2023-08-14 12:17:34 +03:00
siivonek 6340dfe4ce [isp] Fix mistake in pu_loc argument passing, was not used after passing. 2023-08-14 12:17:33 +03:00
Joose Sainio 88c33c0489 [lfnst] Fix lfnst constraint checking for the new coeff order 2023-08-14 12:17:33 +03:00
Joose Sainio e0e96068cc [lfnst] lfnst is not allowed for transform split 2023-08-14 12:17:32 +03:00
Joose Sainio cb7f9919e3 [jccr] Fix jccr coefficient copying 2023-08-14 12:17:32 +03:00
Joose Sainio 3e23fd0601 [cabac] fix cbf_y context for tr splits 2023-08-14 12:17:31 +03:00
siivonek 59292d8808 [isp] Add extra logic to reference building to accommodate ISP. Remove some asserts which were invalidated by ISP. 2023-08-14 12:17:31 +03:00
siivonek 33cd44f11b [isp] Fix chroma coeff writing for ISP. 2023-08-14 12:17:30 +03:00
siivonek d8d206365c [isp] Fix jccr coeffs. 2023-08-14 12:17:29 +03:00
siivonek 7398e58431 [isp] Fix coeff cost calculation. Coeff arrays were indexed wrongly. 2023-08-14 12:17:29 +03:00
siivonek d050efcb87 [isp] Fix error in last sig coeff function call. Height was not used. Fix cbf writing. Fix transform skip flag writing. 2023-08-14 12:17:28 +03:00
siivonek 33ae02aae0 [isp] Fix mistake in isp cbf writing. Loop index was increased twice. 2023-08-14 12:17:28 +03:00
siivonek 4a21039e23 [isp] Fix mistake in function declaration. 2023-08-14 12:17:27 +03:00
siivonek b8506c757c [isp] Convert functions to handle new coeff array order. Add function for getting coeff array subset. Fix assert. 2023-08-14 12:17:26 +03:00
siivonek 69dcb04c99 [isp] Use temporary coeff array when quantizing coeffs. After deriving coeffs, copy temp coeffs from linear order to correct arrays with cu order. 2023-08-14 12:17:26 +03:00
siivonek 0ae71feae4 [isp] Fix assert. 2023-08-14 12:17:25 +03:00
siivonek 2e8f008de4 [isp] Redo call hierarchy to include x, y coordinates. 2023-08-14 12:17:24 +03:00
siivonek 10f9b2be26 [isp] Keep lfnst constraint up to date during search. 2023-08-14 12:16:42 +03:00
siivonek 39f30563c5 [isp] Fix chroma width error when writing cu loc. Remove redundant ISP mode checks. 2023-08-14 12:16:42 +03:00
siivonek b53308f258 [isp] Fix mistake in setting cbfs. Skip setting if ISP is not used. 2023-08-14 12:16:41 +03:00
siivonek 56ebea7358 [isp] Set cbfs for isp splits after search. Add helper function for isp split number. 2023-08-14 12:16:40 +03:00
siivonek 510798cb3d [isp] Fix mistake in isp cabac write. Intra luma mpm flag bit was checking isp when it did not need to. 2023-08-14 12:16:40 +03:00
siivonek f86dc29ce7 [isp] Fix mistake in cost calculation. Remove some commented out code blocks. 2023-08-14 12:16:39 +03:00
siivonek bbb8faea98 [isp] Modify encode transform coeff func to handle non-square blocks, use cu_loc_t where possible. Fix mistake in mts idct generic. 2023-08-14 12:16:38 +03:00
siivonek 7062697beb [isp] Resolve TODOs. Make scan order tables const. 2023-08-14 12:16:37 +03:00
siivonek 93317cafa4 [isp] Write isp config bit to sps. 2023-08-14 12:16:37 +03:00
siivonek 182d0f4e66 [isp] Remove old_scan tables and related asserts. Fix coefficient group indexing. 2023-08-14 12:16:36 +03:00
siivonek f8641f7436 [isp] Fix assert. Implement coef cost calculation for isp splits. 2023-08-14 12:16:35 +03:00
siivonek ae0336fdfc [isp] Add non-square block handling to functions. 2023-08-14 12:16:34 +03:00
siivonek 031a758d6c [isp] Count isp cbfs. 2023-08-14 12:16:01 +03:00
siivonek 75175ee2e2 [isp] Fix isp search. 2023-08-14 12:16:00 +03:00
siivonek 8d914ce849 [isp] Implement coefficient encoding for isp splits. Make get_split_dim non static, it was needed elsewhere after all. 2023-08-14 12:15:59 +03:00
siivonek 573ecf80e3 [isp] Move can_use_lfnst_with_isp to intra.c. Remove duplicate functions. Move isp related functions from search to intra. Make isp_split_dim static. Move isp related defines from search to intra. 2023-08-14 12:15:58 +03:00
siivonek bcbd952dfd [isp] Add height handling to avx2 reconstruction. 2023-08-14 12:15:57 +03:00
siivonek 7c340fd92b [isp] Add height to inverse transform skip. 2023-08-14 12:15:56 +03:00
siivonek 318d925028 [isp] Add new convert_to_log2 table. Change all instances which used old convert_to_bit table to change dimensions into log2. 2023-08-14 12:15:55 +03:00
siivonek 6922157ed3 [isp] Fix quantization function calls. Some were not getting height as input. 2023-08-14 12:15:54 +03:00
siivonek 50ad91a94e [isp] Modify quantization functions to work with non-square blocks. 2023-08-14 12:15:53 +03:00
siivonek 31c8f1356f [isp] Add height to sig coeff group context calculation function. 2023-08-14 12:15:53 +03:00
siivonek 936256e750 [isp] Fix sig coeff flag context calculation function call. Width & height was swapped. 2023-08-14 12:15:52 +03:00
siivonek 9e7f4eac99 [isp] Change variable name 'type' to 'color'. 2023-08-14 12:15:51 +03:00
siivonek 09b905c6c4 [isp] Add height to get_tr_type function. 2023-08-14 12:15:51 +03:00
siivonek 8b7d573ae7 [isp] Add height to idct getter function. Check block dimensions in transform 2d functions. 2023-08-14 12:15:50 +03:00
siivonek 370bd07c55 [isp] Fix error in mts dct and idct. 2023-08-14 12:15:49 +03:00
siivonek 3a874ab5dd [isp] Comment out dct non square function. It is not needed since mts dct function will handle transform for non square blocks. 2023-08-14 12:15:49 +03:00
siivonek f9116441da [isp] Fix avx2 function call. Missing height parameter. 2023-08-14 12:15:48 +03:00
siivonek 6f756e831d [isp] Uncomment old scan order code to test against new one. Add assert to ensure old and new tables are the same. 2023-08-14 12:15:47 +03:00
siivonek c4d1f80f8f [isp] Fix error in scan order getter. Change define names to better reflect what they do. Add more accurate bookmark comments to scan order buffer table. 2023-08-14 12:15:46 +03:00
siivonek 8131e970e5 [isp] Modify existing scan table calls to use new getter. Add safety assert to getter. 2023-08-14 12:15:46 +03:00
siivonek 6ff9ae074e [isp] Add scan order getter. Add bookmark comments to scan order buffer. 2023-08-14 12:15:45 +03:00
siivonek 55d77c6b50 [isp] Add scan order tables for all possible block sizes. 2023-08-14 12:15:45 +03:00
siivonek 35271648db [isp] Fix some errors. Pass height to functions. Some WIP comments. 2023-08-14 12:15:44 +03:00
siivonek a9090c99b5 [isp] Fix error in inverse transform shifting. 2023-08-14 12:15:43 +03:00
siivonek cd7e091992 [isp] Fix mistake in transform if clause. 2023-08-14 12:15:43 +03:00
siivonek 6a3ddfd0bc [isp] Modify inverse transform to handle non-square blocks. 2023-08-14 12:15:42 +03:00
siivonek 626c9b02ea [isp] Modify transform and quantization functions to handle non-square blocks. Add strategy headers to CMakelist. 2023-08-14 12:15:41 +03:00
siivonek 06532dce02 [isp] Implement ISP search and partitioning. Add helper function for constructing cu_loc types. WIP stuff for transform. 2023-08-14 12:15:40 +03:00
siivonek 6236cc29be [isp] Fix avx2 function call. 2023-08-14 12:15:39 +03:00
siivonek ec4909095c [isp] Do not filter references if ISP is used. 2023-08-14 12:15:38 +03:00
siivonek 96df3ffd64 [isp] Change function calls to cu_loc_t. 2023-08-14 12:15:38 +03:00
siivonek 9406c5c31d [isp] Modify generic intra pred functions to handle non-square blocks. 2023-08-14 12:15:37 +03:00
siivonek 03b22e561e [isp] Add ISP command line option. 2023-08-14 12:15:36 +03:00
Marko Viitanen 1a1fea1a19 Merge branch 'implement_ibc' 2023-08-09 09:34:29 +03:00
Marko Viitanen 18b4a8be79 [ibc] Include the chroma in crc 2023-07-27 10:58:20 +03:00
Marko Viitanen 20875a9819 [ibc] Calculate hashes every 4 pixels and change the IBC costs a bit 2023-07-27 10:29:55 +03:00
Marko Viitanen 0fefd3f621 [ibc] Add hash based starting point finder for regular inter search *experimental* 2023-07-24 22:55:52 +03:00
Marko Viitanen 6f4d538f4f [ibc] Clean up the ibc search, utilize hash based starting points if ibc=2 2023-07-24 22:07:22 +03:00
Marko Viitanen 6fe629e666 [ibc] A bit of cleanup and skip IBC search if cost is already less than 500 2023-07-21 20:40:00 +03:00
Marko Viitanen 3cef3c0119 Change the hardcoded general_level_idc from 5.2 to 6.3 2023-07-21 20:19:43 +03:00
Marko Viitanen 95dc4aa0cb [ibc] Fix the IBC buffer limitation, 256x64 pixels allowed 2023-07-21 20:15:24 +03:00
Marko Viitanen 8ff184a6b3 [ibc] Fill the IBC hashmap at the start of LCU search and use reverse map for "pos to hash" 2023-07-21 20:14:23 +03:00
Marko Viitanen 457d650f49 [ibc] Fix for CRC calculations - Input for the 64bit crc intrinsic was 32bit 2023-07-19 09:57:24 +03:00
Marko Viitanen 15fb6f8183 [ibc] Add first version of the IBC hash search 2023-06-29 21:57:06 +03:00
Marko Viitanen 8cec02280f [ibc] Use IBC hashmap in LCU row basis 2023-06-28 23:06:04 +03:00
Marko Viitanen 76d66591c5 [ibc] Implement CRC for 8x8 block and generate a full hashmap at the frame load 2023-06-26 21:24:10 +03:00
Marko Viitanen 4b1f5ca7e2 [ibc] Add the hashmap to the frame and fix some small issues with hashmap and crc32c - crc32c_4x4 strategy was not working, made some changes to the initialization 2023-06-22 21:44:49 +03:00
Marko Viitanen 30321e6dd4 [ibc] Fix uvg_hashmap_hash definition 2023-06-22 14:45:05 +03:00
Marko Viitanen a32a318d18 [ibc] Add CRC32C functions, with SSE 4.2 optimized CRC calculations 2023-06-22 14:39:56 +03:00
Marko Viitanen 7252befc17 [ibc] Add a hashmap implementation for IBC hash search 2023-06-22 14:39:56 +03:00
Marko Viitanen 68382f9e25 [ibc] Handle 4x4 block cases 2023-06-22 14:39:55 +03:00
Marko Viitanen 31fbf453c1 [ibc] Fix IBCFlag writing with I-frames and clean up some code 2023-06-22 14:39:55 +03:00
Marko Viitanen 8aded6406b [ibc] Fix issue in search 2023-05-08 12:26:06 +03:00
Marko Viitanen 34c7c432f9 [ibc] Fix deblocking for the IBC blocks 2023-05-08 11:58:40 +03:00
Joose Sainio e636db489f [cfg] Parameterize intra rough search granularity 2022-09-28 08:49:38 +03:00
Joose Sainio ed6a0528fe Further make things faster 2022-08-30 15:17:05 +03:00
Joose Sainio b0b2b0e536 Try making ultrafast all intra a bit faster 2022-08-29 14:11:08 +03:00
Marko Viitanen 6de2e2d581 [ibc] Fix some git merge issues and IBC merge candidate selection 2022-08-03 10:46:02 +03:00
Marko Viitanen 6a0e2a062d [ibc] Implement a proper search for IBC based on Inter search 2022-07-29 11:53:11 +03:00
Marko Viitanen 65c017c2f2 [ibc] Add check for above block in IBC search 2022-07-29 11:53:11 +03:00
Marko Viitanen 09e62a68fe [ibc] Fix merge candidate selection bug and IBC HMVP size reset at the start of the lcu row 2022-07-29 11:53:10 +03:00
Marko Viitanen d288cc46e9 [ibc] Fix coding of IBC in P and B slices, enable in search 2022-07-29 11:53:10 +03:00
Marko Viitanen 48584eead9 [ibc] Reset the jccr flags to fix a bug with IBC 2022-07-29 11:53:10 +03:00
Marko Viitanen 7ce01b4826 [ibc] Tune search costs a bit and revert debug vector scaling 2022-07-29 11:53:09 +03:00
Marko Viitanen 0fdf96fab2 [ibc] Change internal MV storage to INTERNAL_MV_PREC and code it as full-pel 2022-07-29 11:53:09 +03:00
Marko Viitanen cc4c757695 [ibc] Fix bugs on IBC reconstruction and add a simple search for I-frames 2022-07-29 11:53:08 +03:00
Marko Viitanen d9164f3cfe [ibc] Simplify the IBC merge candidate and mv cand selection 2022-07-29 11:53:08 +03:00
Marko Viitanen a46a4531a3 [ibc] Add HMVP for IBC and correct AMVP selection 2022-07-29 11:52:17 +03:00
Marko Viitanen dbc2006ba9 [ibc] Implement IBC reconstruction function when blocks are completely in the ibc buffer 2022-07-29 11:52:16 +03:00
Marko Viitanen b49d32af21 [ibc] Add IBC buffers 2022-07-29 11:52:15 +03:00
Marko Viitanen 6ec4c37b47 [ibc] Add IBC Flag context and code the bits, disable by default for now 2022-07-29 11:49:49 +03:00
Marko Viitanen 20d0a9b65e [ibc] Add --ibc parameter and config values for Intra Block Copy 2022-07-29 11:49:49 +03:00
Marko Viitanen cd2d4066d5 Fix scaled MV clipping and remove some unused variables 2022-07-28 13:59:11 +03:00
Marko Viitanen 3dd738ebb5 Fix mv_t rounding problems in some functions 2022-07-27 13:02:37 +03:00
Marko Viitanen 5ce1035291 [debug] Fix Motion Vector debug code not to overflow on videos not divisible by LCU_WIDTH 2022-07-27 12:48:39 +03:00
Marko Viitanen b7b7c22e44 Change mv_t to int32_t because of possible overflow in large videos 2022-07-27 12:48:39 +03:00
Joose Sainio ea32ef33ac [lfnst] handle transform skip correctly during search 2022-07-08 10:57:26 +03:00
Joose Sainio 03fb6ce92e [lfnst] Fix lfnst+tr_skip for dual tree 2022-07-08 10:57:26 +03:00
Joose Sainio 450cd00290 [mts] Fix cost calculation 2022-07-08 10:56:35 +03:00
Joose Sainio f9212b4e44 [mts] Don't do tr-skip when tr-skip is disabled 2022-07-06 15:15:28 +03:00
Joose Sainio dc7c8eeb41 [tr-skip] fix uvg_encode_ts_residual 2022-07-06 10:51:01 +03:00
Joose Sainio 427d611a00 [intra] Perform chroma search for rd2 2022-07-05 12:29:27 +03:00
Joose Sainio e2c34e7c25 [lfnst] Fix lfnst for --rd 2 2022-07-05 12:19:21 +03:00
Joose Sainio 02aa36f1a2 [tests] Fix final issue with avx2 satd and update test results 2022-07-05 10:28:59 +03:00
Joose Sainio 42adfb52a7 [satd] Satd scaling on avx2 implementations and re-enable satd tests 2022-07-05 09:34:59 +03:00
Joose Sainio 1f6a62e70e [fix-up] Force lfnst off when trying the mode from below depth block 2022-07-04 13:45:16 +03:00
Joose Sainio 3de4b99aec [jccr] Fix cost calculation 2022-07-04 13:41:14 +03:00
Joose Sainio 3a6414c31d [dual-tree] Fix deblock 2022-06-30 14:21:03 +03:00
Joose Sainio 5fefea025f [lfnst] get constraints for jccr mode 2022-06-29 16:35:55 +03:00
Joose Sainio b35a75b2eb [lfnst] Fix lfnst with rdoq 2022-06-29 16:25:25 +03:00
Joose Sainio 6ef532775b [intra] Fix various issues with cclm, mip, dual-tree, and lfnst 2022-06-29 15:09:34 +03:00
Joose Sainio 06d277bc78 [doc] update manpage and readme 2022-06-28 16:25:25 +03:00
Joose Sainio 68243e284f [cleanup] fix warnings 2022-06-28 16:02:22 +03:00
Joose Sainio b4ab9debf1 [lfnst] fix lfnst with cclm 2022-06-28 15:32:34 +03:00
Joose Sainio e25ea52f6f [lfnst] Fix mistakes 2022-06-28 15:32:33 +03:00
Joose Sainio 2fbbae834b [cclm] fix cclm for 4x4 2022-06-28 15:32:33 +03:00
Joose Sainio b8b603feb7 [lfnst] fix compile 2022-06-28 15:32:33 +03:00
Joose Sainio a0dd412811 [cclm] fix cclm bound calculation 2022-06-28 15:32:32 +03:00
Joose Sainio 75e500da10 [lfnst] LFNST working with dual tree 2022-06-28 15:32:32 +03:00
Joose Sainio faba18fe17 [dual-tree] only perform lfnst search when lfnst is enabled 2022-06-28 15:32:32 +03:00
Joose Sainio d16d6e3dd8 [dual-tree] [lfnst] allow counting lfnst bits for chroma in dual-tree 2022-06-28 15:32:31 +03:00
Joose Sainio 37590add20 [lfnst] [dual-tree] LFNST should work with dual tree 2022-06-28 15:32:31 +03:00
Joose Sainio 6c7dc9004c [dual-tree] Fix split context state updating 2022-06-28 15:32:31 +03:00
Joose Sainio b0d616b03c [dual-tree][tests] Fix some issues and enable cabac state test to test for dual tree 2022-06-28 15:32:30 +03:00
Joose Sainio 345c50ecee [dual-tree] rename kvz_ to uvg_ 2022-06-28 15:32:30 +03:00
Joose Sainio 3f12ee58b0 [dual-tree] fix --pu-depth-intra 4-4 for dual tree 2022-06-28 15:32:30 +03:00
Joose Sainio b8215baa30 [dual-tree] Fix CCLM+dual tree 2022-06-28 15:32:30 +03:00
Joose Sainio cf144e2724 [dual-tree] Works for all depths with basic tools 2022-06-28 15:32:29 +03:00
Joose Sainio 1c313e9c19 [dual-tree] works for depths 1 and 2 2022-06-28 15:32:29 +03:00
Joose Sainio 15cb06ded1 [dual-tree] Fix at least for implicit splits 2022-06-28 15:32:29 +03:00
Joose Sainio ed8496e57e [dual-tree] Matches except for cutoff bottom CTUs for forced depth=1 2022-06-28 15:32:28 +03:00
Joose Sainio 2017cb122a [dual-tree] Actually does whole frame 2022-06-28 15:32:28 +03:00
Joose Sainio abd00d04a1 [dual-tree] Still not working but bitstream valid 2022-06-28 15:32:28 +03:00
Joose Sainio 0adb0846d2 [dual-tree] Bitstream valid, hash mismatches 2022-06-28 15:32:28 +03:00
Joose Sainio be2ef18fea [dual-tree] Not working 2022-06-28 15:32:27 +03:00
Joose Sainio 8fba042e02 [dual-tree] preliminary preparation for dual tree 2022-06-28 15:32:27 +03:00
Joose Sainio 74c931a7c7 [lfnst] cost on chroma when necessary and fixes 2022-06-28 15:32:27 +03:00
Joose Sainio 20010cf759 [lfnst] Fix hash mismatches for depth 4 chroma 2022-06-28 15:32:10 +03:00
Joose Sainio ed602d1c07 [lfnst] Cabac state matches for all sizes but hash mismatches 2022-06-28 15:31:55 +03:00
Joose Sainio 7a7bf045e6 [lfnst] 16x16 2022-06-28 15:31:42 +03:00
Joose Sainio d7f7a2d99b [lfnst] working for 32x32 2022-06-28 15:31:42 +03:00
Joose Sainio b75ce57fce [intra] Fix chroma search for rd=2 2022-06-28 15:30:56 +03:00
Joose Sainio 6413854f3d [intra] fix intra recon 2022-06-28 15:30:47 +03:00
Joose Sainio a6d79407ab [lfnst] various small fixes 2022-06-28 15:30:28 +03:00
Joose Sainio cfc3fa9f09 [lfnst] Include lfnst in chroma search 2022-06-28 15:29:56 +03:00