Reima Hyvönen
|
3496a57f7a
|
Edited sao_edge_ddistortion_avx2 to avoid memory overflow
|
2019-08-07 16:35:24 +03:00 |
|
Reima Hyvönen
|
267ba1d6ce
|
Modified sao_band_ddistortion_avx2
|
2019-08-07 16:35:24 +03:00 |
|
Reima Hyvönen
|
e70663b245
|
added some sub commands to avoid memory read errors
|
2019-08-07 16:35:24 +03:00 |
|
Reima Hyvönen
|
59dfb4570c
|
Converted some loads to load int8_t instead of ints
|
2019-08-07 16:35:24 +03:00 |
|
Reima Hyvönen
|
8b253209a8
|
Found an incorrect address load in calc_sao_edge_dir. Should now work like the generic version
|
2019-08-07 16:35:24 +03:00 |
|
Reima Hyvönen
|
50e0a47b7a
|
Took away __restrict
|
2019-08-07 16:35:24 +03:00 |
|
Reima Hyvönen
|
8a39eb674e
|
Removed c-variable from calc_sao_edge_dir_avx2
|
2019-08-07 16:35:24 +03:00 |
|
Reima Hyvönen
|
bc0a36830d
|
Clarified some 6-pixel loads
|
2019-08-07 16:35:24 +03:00 |
|
Reima Hyvönen
|
1a8b211e05
|
Added break to line 170
|
2019-08-07 16:35:24 +03:00 |
|
Reima Hyvönen
|
d05e750ebe
|
Added some switches to prevent segmentation fault from reading
|
2019-08-07 16:35:24 +03:00 |
|
Reima Hyvönen
|
203580047d
|
Defined some AVX functions
|
2019-08-07 16:35:24 +03:00 |
|
Reima Hyvönen
|
c884c738b1
|
Updated some commands to match the standard
|
2019-08-07 16:35:24 +03:00 |
|
Reima Hyvönen
|
b412ed2f59
|
Removed some setr calls and used loads in calc_sao_edge_dir_avx2
|
2019-08-07 16:35:24 +03:00 |
|
Reima Hyvönen
|
c6cc063534
|
Converted some hadd operations in calc_sao_edge_dir_avx2 to cast and extract
|
2019-08-07 16:35:24 +03:00 |
|
Reima Hyvönen
|
47ac109b10
|
Optimized part of sao_reconstruct_color_avx2 for when sao->type == SAO_TYPE_BAND
|
2019-08-07 16:35:24 +03:00 |
|
Reima Hyvönen
|
96dc60a1ed
|
First working optimization
|
2019-08-07 16:35:24 +03:00 |
|
Reima Hyvönen
|
c148aff9fb
|
Some optimization done to function sao_reconstruct_color_avx2
|
2019-08-07 16:35:24 +03:00 |
|
Reima Hyvönen
|
bf16ba6cc4
|
Remade sao_edge_ddistortion_avx2 and calc_sao_edge_dir_avx2
|
2019-08-07 16:35:24 +03:00 |
|
Reima Hyvönen
|
79dc39a676
|
Some editing for sao_edge_ddistortion_avx2
|
2019-08-07 16:35:24 +03:00 |
|
Reima Hyvönen
|
06ee52924e
|
some reconst done to calc_sao_edge_dir_avx2
|
2019-08-07 16:35:24 +03:00 |
|
Reima Hyvönen
|
5fbc65d823
|
reconst optimization doesn't work yet
|
2019-08-07 16:35:24 +03:00 |
|
Reima Hyvönen
|
d29f834a69
|
Remove useless function
|
2019-08-07 16:35:24 +03:00 |
|
Reima Hyvönen
|
a232a12160
|
calc_sao_edge_dir_avx2 updated
|
2019-08-07 16:35:24 +03:00 |
|
Reima Hyvönen
|
b1febc02a5
|
sao_edge_ddistortion_avx2 now working properly
|
2019-08-07 16:35:24 +03:00 |
|
Reima Hyvönen
|
cd6092a1ec
|
Still too many bits, looking for where they appear
|
2019-08-07 16:35:24 +03:00 |
|
Reima Hyvönen
|
7853be8eeb
|
Incomplete optimization
|
2019-08-07 16:35:24 +03:00 |
|
Pauli Oikkonen
|
8d48bee180
|
Tidy fast coeff cost code
|
2019-07-09 18:01:54 +03:00 |
|
Pauli Oikkonen
|
201a43b08e
|
Clean up the RD-estimation code
|
2019-07-09 18:01:54 +03:00 |
|
Pauli Oikkonen
|
b111df5073
|
Create preliminary version of improved cost estimator
|
2019-07-09 18:01:54 +03:00 |
|
Pauli Oikkonen
|
081d16fc33
|
Fix intrinsics that may be missing on some systems
Create a header to collect all the workarounds for missing intrinsics
in one place
|
2019-05-23 19:59:40 +03:00 |
|
Pauli Oikkonen
|
7175d20bb2
|
Still include stdint.h for non-vector builds
|
2019-04-15 19:36:01 +03:00 |
|
Pauli Oikkonen
|
1315c7e2b0
|
Do not compile any vector code for non-SSE4/AVX2 builds
|
2019-04-15 19:10:48 +03:00 |
|
Pauli Oikkonen
|
f5f70e7bc5
|
Merge branch 'sad-optimization'
|
2019-04-15 19:02:01 +03:00 |
|
Pauli Oikkonen
|
6d43759604
|
Create a border-respecting 32-wide AVX hor_sad
|
2019-03-07 18:01:22 +02:00 |
|
Pauli Oikkonen
|
f218cecb38
|
Remove offending hor_sad_avx2_w32 function
Consider possibly creating a non-offending AVX2 version instead, the
way hor_sad_sse41_w32 works. Or maybe there's more essential work to
do.
|
2019-03-05 22:51:41 +02:00 |
|
Pauli Oikkonen
|
bcd9879359
|
Include quant coeff range check in non-scaling list execution path too
|
2019-02-27 17:26:44 +02:00 |
|
Pauli Oikkonen
|
24e6363f64
|
Remove the kvz_quant_avx2 wrapper function
|
2019-02-27 16:32:58 +02:00 |
|
Pauli Oikkonen
|
748820f3c5
|
Eliminate unnecessary loading of coeffs if scaling lists are off
|
2019-02-27 16:26:35 +02:00 |
|
Pauli Oikkonen
|
5994350f40
|
Allow quant_flat_avx2 to be used with scaling lists on
|
2019-02-27 16:25:59 +02:00 |
|
Pauli Oikkonen
|
d8b8923028
|
Add LGPL notices to reg_sad headers
|
2019-02-18 17:52:47 +02:00 |
|
Pauli Oikkonen
|
2d05ca8520
|
Remove width from constant-width hor_sad func params
They should kinda know it already
|
2019-02-04 20:41:40 +02:00 |
|
Pauli Oikkonen
|
dd7d989a39
|
Implement 32-wide hor_sad on AVX2
|
2019-02-04 20:41:40 +02:00 |
|
Pauli Oikkonen
|
f5ff4db01f
|
4-wide hor_sad border agnostic
|
2019-02-04 20:41:40 +02:00 |
|
Pauli Oikkonen
|
35e7f9a700
|
Fix hor_sad w8 to work with both borders
|
2019-02-04 20:41:40 +02:00 |
|
Pauli Oikkonen
|
836783dd6e
|
Use hor_sad_w32 for both left and right borders
|
2019-02-04 20:41:40 +02:00 |
|
Pauli Oikkonen
|
69687c8d24
|
Modify hor_sad_sse41_w16 to work over left and right borders
|
2019-02-04 20:41:40 +02:00 |
|
Pauli Oikkonen
|
768203a2de
|
First version of arbitrary-width SSE4.1 hor_sad
|
2019-02-04 20:41:40 +02:00 |
|
Pauli Oikkonen
|
ccf683b9b6
|
Start work on left and right border aware hor_sad
Comes with 4, 8, 16 and 32 pixel wide implementations now, at some point
investigate if this can start to thrash icache
|
2019-02-04 20:41:40 +02:00 |
|
Pauli Oikkonen
|
f781dc31f0
|
Create strategy for ver_sad
Easy to vectorize
|
2019-02-04 20:41:40 +02:00 |
|
Pauli Oikkonen
|
91cb0fbd45
|
Create strategy for directly obtaining pointer to constant-width SAD function
|
2019-02-04 20:41:40 +02:00 |
|