Commit graph

265 commits

Author SHA1 Message Date
Reima Hyvönen 3496a57f7a Edited sao_edge_ddistortion_avx2 to avoid memory overflow 2019-08-07 16:35:24 +03:00
Reima Hyvönen 267ba1d6ce Modified sao_band_ddistortion_avx2 2019-08-07 16:35:24 +03:00
Reima Hyvönen e70663b245 added some sub commands to avoid memory read errors 2019-08-07 16:35:24 +03:00
Reima Hyvönen 59dfb4570c Converted some loads to load int8_t instead of ints 2019-08-07 16:35:24 +03:00
Reima Hyvönen 8b253209a8 Found false address load from calc_sao_edge_dir. Should now work like generic 2019-08-07 16:35:24 +03:00
Reima Hyvönen 50e0a47b7a Took away __restrict 2019-08-07 16:35:24 +03:00
Reima Hyvönen 8a39eb674e Removed c-variable from calc_sao_edge_dir_avx2 2019-08-07 16:35:24 +03:00
Reima Hyvönen bc0a36830d Clarified some 6 pixel loads 2019-08-07 16:35:24 +03:00
Reima Hyvönen 1a8b211e05 Added break to line 170 2019-08-07 16:35:24 +03:00
Reima Hyvönen d05e750ebe Added some switches to prevent segmentation fault from reading 2019-08-07 16:35:24 +03:00
Reima Hyvönen 203580047d Defined some AVX functions 2019-08-07 16:35:24 +03:00
Reima Hyvönen c884c738b1 Updated some commands to match the standard 2019-08-07 16:35:24 +03:00
Reima Hyvönen b412ed2f59 Removed some setr and used loads calc_sao_edge_dir_avx2 2019-08-07 16:35:24 +03:00
Reima Hyvönen c6cc063534 converted some hadd operations at calc_sao_edge_dir_avx2 to cast and extract 2019-08-07 16:35:24 +03:00
Reima Hyvönen 47ac109b10 optimized some sao_reconstruct_color_avx2 when sao->type == SAO_TYPE_BAND 2019-08-07 16:35:24 +03:00
Reima Hyvönen 96dc60a1ed first working optimization 2019-08-07 16:35:24 +03:00
Reima Hyvönen c148aff9fb Some optimization done to function sao_reconstruct_color_avx2 2019-08-07 16:35:24 +03:00
Reima Hyvönen bf16ba6cc4 Remade sao_edge_ddistortion_avx2 and calc_sao_edge_dir_avx2 2019-08-07 16:35:24 +03:00
Reima Hyvönen 79dc39a676 Some editing for sao_edge_ddistortion_avx2 2019-08-07 16:35:24 +03:00
Reima Hyvönen 06ee52924e some reconst done to calc_sao_edge_dir_avx2 2019-08-07 16:35:24 +03:00
Reima Hyvönen 5fbc65d823 reconst optimization doesn't work yet 2019-08-07 16:35:24 +03:00
Reima Hyvönen d29f834a69 Remove useless function 2019-08-07 16:35:24 +03:00
Reima Hyvönen a232a12160 calc_sao_edge_dir_avx2 updated 2019-08-07 16:35:24 +03:00
Reima Hyvönen b1febc02a5 sao_edge_ddistortion_avx2 now working properly 2019-08-07 16:35:24 +03:00
Reima Hyvönen cd6092a1ec Still too many bits, looking for where they appear 2019-08-07 16:35:24 +03:00
Reima Hyvönen 7853be8eeb Incomplete optimization 2019-08-07 16:35:24 +03:00
Pauli Oikkonen 8d48bee180 Tidy fast coeff cost code 2019-07-09 18:01:54 +03:00
Pauli Oikkonen 201a43b08e Clean up the RD-estimation code 2019-07-09 18:01:54 +03:00
Pauli Oikkonen b111df5073 Create preliminary version of improved cost estimator 2019-07-09 18:01:54 +03:00
Pauli Oikkonen 081d16fc33 Fix intrinsics that may be missing on some systems
Create a header to collect all the workarounds for missing intrinsics
in one place
2019-05-23 19:59:40 +03:00
Pauli Oikkonen 7175d20bb2 Still include stdint.h for non-vector builds 2019-04-15 19:36:01 +03:00
Pauli Oikkonen 1315c7e2b0 Do not compile any vector code for non-SSE4/AVX2 builds 2019-04-15 19:10:48 +03:00
Pauli Oikkonen f5f70e7bc5 Merge branch 'sad-optimization' 2019-04-15 19:02:01 +03:00
Pauli Oikkonen 6d43759604 Create a border-respecting 32-wide AVX hor_sad 2019-03-07 18:01:22 +02:00
Pauli Oikkonen f218cecb38 Remove offending hor_sad_avx2_w32 function
Consider possibly creating a non-offending AVX2 version instead, the
way hor_sad_sse41_w32 works. Or maybe there's more essential work to
do.
2019-03-05 22:51:41 +02:00
Pauli Oikkonen bcd9879359 Include quant coeff range check in non-scaling list execution path too 2019-02-27 17:26:44 +02:00
Pauli Oikkonen 24e6363f64 Remove the kvz_quant_avx2 wrapper function 2019-02-27 16:32:58 +02:00
Pauli Oikkonen 748820f3c5 Eliminate unnecessary loading of coeffs if scaling lists are off 2019-02-27 16:26:35 +02:00
Pauli Oikkonen 5994350f40 Allow quant_flat_avx2 to be used with scaling lists on 2019-02-27 16:25:59 +02:00
Pauli Oikkonen d8b8923028 Add LGPL notices to reg_sad headers 2019-02-18 17:52:47 +02:00
Pauli Oikkonen 2d05ca8520 Remove width from constant-width hor_sad func params
They should kinda know it already
2019-02-04 20:41:40 +02:00
Pauli Oikkonen dd7d989a39 Implement 32-wide hor_sad on AVX2 2019-02-04 20:41:40 +02:00
Pauli Oikkonen f5ff4db01f 4-wide hor_sad border agnostic 2019-02-04 20:41:40 +02:00
Pauli Oikkonen 35e7f9a700 Fix hor_sad w8 to work with both borders 2019-02-04 20:41:40 +02:00
Pauli Oikkonen 836783dd6e Use hor_sad_w32 for both left and right borders 2019-02-04 20:41:40 +02:00
Pauli Oikkonen 69687c8d24 Modify hor_sad_sse41_w16 to work over left and right borders 2019-02-04 20:41:40 +02:00
Pauli Oikkonen 768203a2de First version of arbitrary-width SSE4.1 hor_sad 2019-02-04 20:41:40 +02:00
Pauli Oikkonen ccf683b9b6 Start work on left and right border aware hor_sad
Comes with 4, 8, 16 and 32 pixel wide implementations now, at some point
investigate if this can start to thrash icache
2019-02-04 20:41:40 +02:00
Pauli Oikkonen f781dc31f0 Create strategy for ver_sad
Easy to vectorize
2019-02-04 20:41:40 +02:00
Pauli Oikkonen 91cb0fbd45 Create strategy for directly obtaining pointer to constant-width SAD function 2019-02-04 20:41:40 +02:00