siivonek
|
2e468b7014
|
Add full w8 y coordinate table.
|
2024-05-09 13:06:16 +03:00 |
|
Joose Sainio
|
4e4084434e
|
Remove setr from the loop in hor_w4
|
2024-05-09 13:06:15 +03:00 |
|
Joose Sainio
|
8bbf01c376
|
Change the right shift in pred_planar_avx2 to use the 128-bit register version of the right shift intrinsics. When the integer version is given a shift that is not a compile-time constant, the compiler is forced to generate the 128-bit register version anyway, but it also has to convert the integer to a 128-bit register. The compiler does not optimize this properly and instead does the conversion on every iteration of the loop. ***THIS NEEDS TO BE DONE FOR ALL SHIFTS THAT DO NOT USE COMPILE TIME CONSTANT SHIFTS***
|
2024-05-09 13:06:15 +03:00 |
|
siivonek
|
b02fb1b1af
|
Remove left shift from planar half functions. Implement the left shift with madd. Planar preds of width 4, 8 and 16 should work now without overflows. Add loop unroll macros to vertical half functions. Will be added to hor half functions later.
|
2024-05-09 13:06:15 +03:00 |
|
siivonek
|
0eb0f110c2
|
Add missing packus to the end of planar calculation.
|
2024-05-09 13:05:49 +03:00 |
|
siivonek
|
4ae234ef24
|
Implement 16xN planar prediction.
|
2024-05-09 13:05:49 +03:00 |
|
siivonek
|
4bbd5f4c9a
|
Implement 4xN planar prediction.
|
2024-05-09 13:05:49 +03:00 |
|
siivonek
|
e888be80e2
|
Add intra avx2 planar placeholder functions. Implement 8xN planar prediction. Note: does not work with height < 4 yet. Initial plan is to produce the planar prediction as two halves. This is subject to change; at this point, it seems only planar functions for different widths are needed.
|
2024-05-09 13:05:49 +03:00 |
|
siivonek
|
c91b76a668
|
Add globals required by intra avx2 code. Remove unnecessary stuff from intra avx2.
|
2024-05-09 12:42:38 +03:00 |
|
Joose Sainio
|
e5e32d67f4
|
[avx2] Remove a define that was never meant to be committed
|
2023-09-27 12:54:53 +03:00 |
|
Joose Sainio
|
9add13b705
|
update version and docs
|
2023-09-27 09:47:05 +03:00 |
|
Joose Sainio
|
84580aebb0
|
Merge branch 'release-prep' into master
|
2023-09-27 08:11:09 +03:00 |
|
Joose Sainio
|
4a1cd926fb
|
[rdoq] Fix rdoq using uninitialized values that do not matter
|
2023-09-26 14:26:07 +03:00 |
|
Joose Sainio
|
079d7e9a1a
|
[tests] Fix mts_tests.c to not consider irrelevant elements
|
2023-09-26 11:36:43 +03:00 |
|
Joose Sainio
|
69c1c948fa
|
[cfg] Specify that MTT and ISP are currently experimental
|
2023-09-26 10:41:31 +03:00 |
|
Joose Sainio
|
e32cf4fb52
|
[avx2] Re-enable disabled avx2 functions that do not work with non-square blocks
|
2023-09-26 10:38:29 +03:00 |
|
Joose Sainio
|
ff77346527
|
[dct2] Remove unnecessary memsets
|
2023-09-26 09:57:47 +03:00 |
|
Joose Sainio
|
64d222d17c
|
[dep_quant] Remove dead code and fix small issue
|
2023-09-26 09:42:30 +03:00 |
|
siivonek
|
284724398e
|
Add some comments.
|
2023-09-26 09:21:49 +03:00 |
|
Joose Sainio
|
3d4e732952
|
[avx2] Fix issue with 16x32 inverse transform
|
2023-09-26 09:21:49 +03:00 |
|
Joose Sainio
|
d62a3f888e
|
[avx2] static all transform tables
|
2023-09-26 09:21:48 +03:00 |
|
Joose Sainio
|
1f9955bdda
|
[avx2] Fix compilation errors
|
2023-09-26 09:21:35 +03:00 |
|
Joose Sainio
|
13d4313e02
|
[avx2] Mostly working
|
2023-09-26 09:21:29 +03:00 |
|
Joose Sainio
|
b78f9aff17
|
[avx2] Inverses work when ISP is not enabled
|
2023-09-26 09:21:24 +03:00 |
|
siivonek
|
4dccbcc30d
|
[avx2] Forward transforms seem to be working
|
2023-09-26 09:21:24 +03:00 |
|
Joose Sainio
|
19829da152
|
Disable all avx2 optimizations that cannot be used with mtt/isp
|
2023-09-26 09:21:23 +03:00 |
|
Joose Sainio
|
1c293b8253
|
pass context_store as pointer
This reverts commit 47c5ea3d5c.
|
2023-09-26 09:21:23 +03:00 |
|
Joose Sainio
|
2caf077cff
|
Remove avx512 intrinsics
|
2023-09-26 09:21:23 +03:00 |
|
Joose Sainio
|
254826d396
|
[avx2] Add comments
|
2023-09-26 09:21:19 +03:00 |
|
Joose Sainio
|
f2fb641acb
|
[avx2] Replace inefficient loop with AVX2 code
|
2023-09-26 09:21:19 +03:00 |
|
Joose Sainio
|
bc24601369
|
[avx2] Improve avx2 version of update_common_context
|
2023-09-26 09:21:19 +03:00 |
|
Joose Sainio
|
915104cf10
|
[dep_quant] Change order of absLevels
|
2023-09-26 09:21:18 +03:00 |
|
Joose Sainio
|
d850c346d6
|
[dep_quant] Change order of ctxInit
|
2023-09-26 09:21:18 +03:00 |
|
Joose Sainio
|
a624988c91
|
[dep_quant] Separate abs levels and ctx init
|
2023-09-26 09:21:18 +03:00 |
|
Joose Sainio
|
dda972c665
|
[avx2] Try to do lnz decision with avx2
|
2023-09-26 09:21:18 +03:00 |
|
Joose Sainio
|
cf6f03b73b
|
[avx2] This has worked but I'm pretty sure these should be unaligned
|
2023-09-26 09:20:56 +03:00 |
|
Joose Sainio
|
b4c84e820c
|
[avx2] Simplify
|
2023-09-26 09:20:56 +03:00 |
|
Joose Sainio
|
2811ce58f4
|
[avx2] AVX2 version of depquant now exactly matches scalar version
|
2023-09-26 09:20:56 +03:00 |
|
Joose Sainio
|
48ea4bff4d
|
[dep_quant] Fix rate_estimator and quant_block init cases
|
2023-09-26 09:20:55 +03:00 |
|
Joose Sainio
|
dfff9a8030
|
[avx2] Move dep quant stuff to strategies
|
2023-09-26 09:20:55 +03:00 |
|
Joose Sainio
|
0591342b3a
|
[avx2] replace or
|
2023-09-26 09:20:38 +03:00 |
|
Joose Sainio
|
8b1d6fab59
|
[avx2] Replace loads and stores with non-avx512 stores
|
2023-09-26 09:20:37 +03:00 |
|
Joose Sainio
|
6d0a3fa5fc
|
[avx2] Replace _mm_and_epi32 with _mm_and_si128
|
2023-09-26 09:20:37 +03:00 |
|
Joose Sainio
|
7fdc045690
|
[dep_quant] Clean up
|
2023-09-26 09:20:37 +03:00 |
|
Joose Sainio
|
8eb0f66734
|
[depquant] update_state_eos_avx2 working
|
2023-09-26 09:20:37 +03:00 |
|
Joose Sainio
|
00cc58bc55
|
[depquant] Only initialize rate_estimator when necessary
|
2023-09-26 09:20:37 +03:00 |
|
Joose Sainio
|
00f838306f
|
[depquant] Initialize quant_block only when necessary
|
2023-09-26 09:20:37 +03:00 |
|
Joose Sainio
|
9e27b4056a
|
[avx2] WIP update_state_eos_avx2
|
2023-09-26 09:20:36 +03:00 |
|
Joose Sainio
|
c56350b8d6
|
[avx2] and last
|
2023-09-26 09:20:36 +03:00 |
|
Joose Sainio
|
9f69713c24
|
[depquant] remove an unnecessary memcpy
|
2023-09-26 09:20:36 +03:00 |
|
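Note on commit 8bbf01c376: the shift change it describes can be sketched roughly as follows. This is a minimal illustration, not the code from pred_planar_avx2. The function names are hypothetical, and the sketch assumes an x86 target and uses the SSE2 pair `_mm_srli_epi16` / `_mm_srl_epi16` so that it compiles without AVX2 flags. The AVX2 intrinsics `_mm256_srli_epi16` and `_mm256_srl_epi16` follow the same pattern, with the latter also taking its count as a `__m128i`.

```c
#include <assert.h>
#include <emmintrin.h> /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: shifts `rows` rows of eight 16-bit values right by a
 * runtime amount using the integer-count intrinsic. Because `shift` is not a
 * compile-time constant, the compiler has to move it into a vector register,
 * and per the commit message it may do so on every loop iteration. */
static void shift_rows_integer_count(uint16_t *data, int rows, int shift)
{
  assert(data != NULL);
  for (int i = 0; i < rows; ++i) {
    __m128i v = _mm_loadu_si128((const __m128i *)(data + 8 * i));
    v = _mm_srli_epi16(v, shift);
    _mm_storeu_si128((__m128i *)(data + 8 * i), v);
  }
}

/* Hypothetical helper: same result, but the count is converted to a 128-bit
 * register once, outside the loop, and the register-count intrinsic is used. */
static void shift_rows_register_count(uint16_t *data, int rows, int shift)
{
  assert(data != NULL);
  const __m128i count = _mm_cvtsi32_si128(shift); /* hoisted conversion */
  for (int i = 0; i < rows; ++i) {
    __m128i v = _mm_loadu_si128((const __m128i *)(data + 8 * i));
    v = _mm_srl_epi16(v, count);
    _mm_storeu_si128((__m128i *)(data + 8 * i), v);
  }
}
```

Both helpers produce identical output. The second only makes the integer-to-register conversion explicit so it happens once. Whether this gives a measurable gain depends on the compiler and should be checked in the generated assembly.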