Commit graph

2917 commits

Author SHA1 Message Date
Joose Sainio 5b10e5fb7e parameterize the clipping option 2019-12-06 09:51:04 +02:00
Pauli Oikkonen da370ea36d Implement AVX2 8x8 filtered DC algorithm 2019-11-28 14:10:10 +02:00
Pauli Oikkonen 5d9b7019ca Implement a 4x4 filtered DC pred function 2019-11-26 17:05:54 +02:00
Joose Sainio ca0060cbba try the original clipping 2019-11-26 15:13:04 +02:00
Pauli Oikkonen f1485ab087 Start doing an arbitrary size filtered DC pred - maybe easier to just create separate functions for fixed block sizes? 2019-11-25 15:20:29 +02:00
Joose Sainio ab2fded8af Update threadwrapper to enable pthread_rwlock_t 2019-11-21 13:38:40 +02:00
Joose Sainio eb78aead1f Fix additional potential data races 2019-11-21 11:03:12 +02:00
Joose Sainio 35d7e0d88b Fix data race 2019-11-21 10:25:04 +02:00
Marko Viitanen 94d89f03c7 Added cfg variable intra_smoothing_disabled and some cleanup 2019-11-20 08:38:33 +02:00
Marko Viitanen eb2caf9118 Fix intra angle filter, changed from gauss filter table to run-time calculated 4-tap filter 2019-11-19 15:15:21 +02:00
Pauli Oikkonen 979d66031c Create a strategy out of intra_pred_filtered_dc 2019-11-19 14:50:31 +02:00
Marko Viitanen 466d8772b0 Apply JVET_P0170_ZERO_POS_SIMPLIFICATION in coeff bypass coding 2019-11-19 14:32:38 +02:00
Joose Sainio 0e8815a3d8 test clipping qp to previous frame instead of previous ctus 2019-11-19 14:32:31 +02:00
Joose Sainio ddb4e5a131 move the intra bit calculation so that it is used also with lambda rc 2019-11-19 14:16:48 +02:00
Joose Sainio a07833f3e6 check that mallocs in rc initialization were successful
only call kvz_update_after_picture when using the OBA rc
2019-11-19 13:59:44 +02:00
Joose Sainio 50d410a316 re-enable static qp encoding and lambda rc 2019-11-19 13:45:58 +02:00
Pauli Oikkonen fa4bb86406 Optimize intra_pred_planar_avx2 for 4x4 blocks 2019-11-19 13:39:02 +02:00
Marko Viitanen 3df2642b03 Fix qt cbf context init value 2019-11-19 13:27:36 +02:00
Joose Sainio 57e5615ece Fix incorrect intra rc calculation skipping 2019-11-19 13:25:31 +02:00
Joose Sainio 6cc3bcd87e Command line parameters for oba rc and implementation of the usage of the intra parameter 2019-11-19 09:29:06 +02:00
Joose Sainio eb73548af5 Encode first frame completely before starting others to enable owf 2019-11-18 09:51:37 +02:00
Marko Viitanen 17a53230fd Code cleanup, remove unused arrays and remove tabs 2019-11-18 09:01:23 +02:00
Pauli Oikkonen 4761d228f9 Start to vectorize the 4x4 loop 2019-11-15 17:32:40 +02:00
Pauli Oikkonen 8d45ab4951 Stupidify the 4x4 planar loop for vectorization 2019-11-14 17:14:04 +02:00
Marko Viitanen 91528f3292 Update contexts 2019-11-14 13:46:51 +02:00
Marko Viitanen b309ed90be Fix NAL packet and missing fields in SPS 2019-11-14 09:21:11 +02:00
Marko Viitanen 74514981a9 Fixed PPS, SPS and slice headers and NAL unit types 2019-11-13 15:59:36 +02:00
Joose Sainio c759c138ed Prepare the rc data structure to be shared among all frame encoders 2019-11-13 11:56:25 +02:00
Joose Sainio cdb7c851a4 Fix weight calculation 2019-11-13 08:55:31 +02:00
Joose Sainio b9b01f8036 WPP with threading 2019-11-12 12:12:57 +02:00
Joose Sainio 615973adca should enable threading with wpp when owf is not used 2019-11-12 09:03:00 +02:00
Pauli Oikkonen 6f13f6525c Merge branch 'new_prints' 2019-11-07 17:04:21 +02:00
Joose Sainio d353f7dd1a Disable debug prints, fix multiple bugs in the calculation 2019-11-07 15:08:57 +02:00
mercat 57e8c3ebc2 Merge branch 'ML-cplx_red_ICIP' 2019-11-07 13:25:47 +02:00
Pauli Oikkonen 558f0ec401 Mbps, not mbps 2019-11-05 18:06:00 +02:00
Pauli Oikkonen 2edf533925 Tidy the end report printing
Also fix a bug with non-integer target FPS
2019-11-05 17:20:00 +02:00
Joose Sainio 408fd4ccb6 Fix lambda and qp calculation for intra frames
also fixes a bug with selecting the clip neighbor lambda and clip neighbor qp
selection for inter frames
2019-11-05 10:51:39 +02:00
Pauli Oikkonen c7313ce567 Store AVG QP information in encmain 2019-11-04 17:08:07 +02:00
Reima Hyvönen 80575c59bf Some updates done to get right bitrate and avg QP 2019-10-31 15:56:24 +02:00
Reima Hyvönen 252bab8820 Added prints to bitrate and AVG QP 2019-10-31 15:56:24 +02:00
Pauli Oikkonen 6d7a4f555c Also remove 16x16 (A * B^T)^T matrix multiply
Can be done using (B * A^T) instead, it's the exact same
2019-10-28 16:19:42 +02:00
Pauli Oikkonen 2c2deb2366 Tidy AVX2 32x32 matrix multiply 2019-10-28 16:19:42 +02:00
Pauli Oikkonen 98ad78b333 Tidy the old AVX2 32x32 matrix multiply
It was actually a very good algorithm, just looked messy!
2019-10-28 16:19:42 +02:00
Pauli Oikkonen 4a921cbdb5 Retain data as much in YMM registers as possible
This seems to make it a whole lot quicker
2019-10-28 16:19:42 +02:00
Pauli Oikkonen ac4d710e23 Unroll 32x32 matrix multiply, use all regs 2019-10-28 16:19:42 +02:00
Pauli Oikkonen a58608d0b8 Remove totally unnecessary (A * B^T)^T 32x32 multiply 2019-10-28 16:19:42 +02:00
Pauli Oikkonen 043f53539f Implement a streamlined matrix-multiply 32x32 DCT 2019-10-28 16:19:42 +02:00
Pauli Oikkonen e9da2d851b Tidy 32x32 fast DCT's helper functions 2019-10-28 16:19:42 +02:00
Pauli Oikkonen e382339182 Implement fast (butterfly) 32x32 DCT in AVX2 2019-10-28 16:19:42 +02:00
Pauli Oikkonen b5962dadac Tidy indentation in AVX2 16x16 iDCT 2019-10-28 16:19:42 +02:00
Pauli Oikkonen 36a8f89025 Fine-tune 16x16 AVX2 iDCT 2019-10-28 16:19:42 +02:00
Pauli Oikkonen ca9409de2b Implement 16x16 DCT as butterfly algorithm in AVX2 2019-10-28 16:19:42 +02:00
Pauli Oikkonen 7c69a26717 Use aligned loads and stores for AVX2 DCT 2019-10-28 16:19:42 +02:00
Pauli Oikkonen 8e9c65dca6 Align DCT matrices and temp transform buffers 2019-10-28 16:19:42 +02:00
Pauli Oikkonen 148a150522 Align DCT source and dest blocks to cache line 2019-10-28 16:19:42 +02:00
Pauli Oikkonen 8e60bbf6a6 Slightly tune 16x16 forward DCT
Use an array of __m256i's to store temporary value, essentially letting
the compiler enforce alignment and use aligned loads and stores.
2019-10-28 16:19:42 +02:00
Pauli Oikkonen c0cc0e8a75 Optimize 16x16 multiply by only slicing right mat once 2019-10-28 16:19:42 +02:00
Pauli Oikkonen e463d27f22 Implement streamlined generic 16x16 matrix multiply
It can't be this fast for real, can it?
2019-10-28 16:19:42 +02:00
Pauli Oikkonen beb85ce9d6 Reorder parameters for 8x8 matrix multiplies 2019-10-28 16:19:42 +02:00
Pauli Oikkonen 292af62256 Implement tailored 16x16 forward DCT 2019-10-28 16:19:42 +02:00
Pauli Oikkonen 30ce461d98 Redo 4x4 matrix multiplication 2019-10-28 16:19:42 +02:00
Pauli Oikkonen 07970ea82f Streamline by-the-book 8x8 matrix multiplication
Also chop up the forward transform into two tailored multiply functions
2019-10-28 16:19:42 +02:00
Pauli Oikkonen 7ec7ab3361 Implement a tailored AVX2 8x8 DCT 2019-10-28 16:19:42 +02:00
Joose Sainio 372934c7db Fix division by zero 2019-10-10 16:35:56 +03:00
Joose Sainio 9bdfdeaf5c Rest of the owl 2019-10-09 15:48:58 +03:00
Joose Sainio 1ba8525faf WIP 2019-10-09 10:35:07 +03:00
Joose Sainio 19496d2692 ? 2019-10-03 14:50:11 +03:00
Joose Sainio 4b111e339e fix couple of bugs in the implementation, bit calculation seems still bit off 2019-10-01 15:08:39 +03:00
Joose Sainio 84615e406a fix compiler warnings 2019-09-27 14:20:08 +03:00
Joose Sainio 14b7a75713 Call the new functions and fix bugs 2019-09-27 14:14:24 +03:00
Joose Sainio ef74bfb182 unify naming 2019-09-27 10:16:21 +03:00
Joose Sainio e36f481bda qp calculation for frame 2019-09-27 09:05:40 +03:00
Joose Sainio 47019ca1cd intra ck update 2019-09-26 16:04:53 +03:00
Joose Sainio 7c8f4da7cb Update c and k except after first intra 2019-09-26 13:09:28 +03:00
Joose Sainio 0577d481c1 CTU level code 2019-09-25 12:12:21 +03:00
pkubaj 1d7fcf4227 Fix build on powerpc64 with LLVM 2019-09-12 15:05:00 +02:00
mercat 0de567bfa4 Fix memory leak 2019-09-12 09:45:32 +03:00
mercat fa116de619 Add static 2019-09-11 16:18:12 +03:00
mercat b8753a9293 Fucking INLINE fixed 2019-09-11 16:12:07 +03:00
mercat b855144e68 INLINE fix 2019-09-11 16:12:07 +03:00
mercat 694337b803 Add const and more const 2019-09-11 16:12:07 +03:00
mercat 21c07638ed Remove const into kvz_init_constraint. 2019-09-11 16:12:06 +03:00
mercat 2bca507abe Clean version of machine learning constraint code. (ICIP paper) 2019-09-11 16:12:06 +03:00
Alexandre Mercat 0f4b7be6ee First version of ML ICIP code for master 2019-09-11 16:12:06 +03:00
Pauli Oikkonen 99597b828a Work around the ancient Win32 calling convention hassle
See if this'll work now
2019-09-06 13:14:42 +03:00
Pauli Oikkonen c5ca18950c Revert "Revert to 6924d90052 due to broken visual studio build"
This reverts commit 1dd0619bd7.
2019-09-05 18:21:55 +03:00
Pauli Oikkonen 55529decd5 Implement _mm256_insert_epi32 and extract pseudo-ops
Visual Studio headers apparently lack these guys
2019-09-05 18:20:52 +03:00
Marko Viitanen 28dc4fa2ed Fix intra MPM selection 2019-09-05 09:39:13 +03:00
Ari Lemmetti 147378e1f9 Prevent 8x4 and 4x8 bipred in merge analysis 2019-09-03 16:32:50 +03:00
Ari Lemmetti ef1fdbf259 Separate prediction of single PU/PB from CU/CB 2019-09-03 16:32:50 +03:00
Joose Sainio 7d2737bdf6 WIP picture lambda calculation 2019-09-03 11:03:35 +03:00
Ari Lemmetti 3bc510712f Enable merge analysis for smp and amp 2019-09-02 17:31:51 +03:00
Ari Lemmetti 557bcbc6aa Make luma or chroma only inter "recon" or predict possible 2019-09-02 17:15:28 +03:00
Marko Viitanen 6d5e20ca13 Header changes to match VTM 6.1 2019-09-02 09:42:35 +03:00
RLamm 60be6d411c Intra filtering fixed at least for luma. All intra modes output valid luma (hashes match), but chroma is still broken. 2019-08-30 16:14:00 +03:00
RLamm 83ac39094a Use new PDPC filtering for planar and DC modes 2019-08-29 12:51:34 +03:00
Joose Sainio 131c04f65c Fix incorrect weight for intra frame 2019-08-29 12:01:13 +03:00
Joose Sainio 8f96678d13 Fix issue with intra frames being part of gop when they shouldn't 2019-08-29 09:28:10 +03:00
Ari Lemmetti aa8ab195d1 Compare rough cost of the best merge mode against AMVP to make mode decision 2019-08-26 22:49:09 +03:00
Ari Lemmetti 8f866ff83a Use correct index 2019-08-26 20:10:10 +03:00
Ari Lemmetti 2343958a14 Fix transform split for small luma blocks 2019-08-24 21:50:17 +03:00
Ari Lemmetti 800fc8644d Reset CBFs because CBFs might have been set earlier for depth earlier. 2019-08-24 21:49:33 +03:00
Ari Lemmetti a80de22bc7 Add only different candidates to the list 2019-08-24 21:49:33 +03:00
Ari Lemmetti 45c7961412 Remove tr depth fill. It should not be needed. 2019-08-24 21:49:32 +03:00
Ari Lemmetti ff8711aaab Add missing logic to add valid indices to list 2019-08-24 21:49:29 +03:00
Marko Viitanen cb0d7c340a Use the new PDPC filtering in angular intra 2019-08-23 14:44:41 +03:00
Marko Viitanen 5bebb18943 Change intra filtering according to VTM6 2019-08-23 08:56:35 +03:00
Marko Viitanen a16efe6b52 Merge remote-tracking branch 'remotes/github_kvazaar/master'
# Conflicts:
#	build/kvazaar_VS2013.sln
#	build/kvazaar_VS2015.sln
#	build/kvazaar_VS2017.sln
#	build/kvazaar_cli/kvazaar_cli.vcxproj
#	build/kvazaar_lib/kvazaar_lib.vcxproj
#	build/kvazaar_tests/kvazaar_tests.vcxproj
#	src/encode_coding_tree.c
#	src/encode_coding_tree.h
#	src/encoder_state-bitstream.c
#	src/inter.c
#	src/strategies/avx2/quant-avx2.c
2019-08-22 15:12:01 +03:00
Marko Viitanen 01ea762c1f Fix coeff coding and remove bdpcm flag -> CABAC bits match with VTM 6.0 2019-08-22 14:33:42 +03:00
Marko Viitanen 210af8adbe Remove joint_cb_cr flag and fix split_flag context selection 2019-08-22 11:23:24 +03:00
Marko Viitanen c713d31c93 Fix sig_coeff context selection 2019-08-22 10:57:50 +03:00
Marko Viitanen 48b8898e53 Fix CBF context init and use 2019-08-22 10:44:47 +03:00
Marko Viitanen db94ec1a84 Rename intra_mode_model -> intra_luma_mpm_flag_model and update the contexts 2019-08-19 15:17:25 +03:00
Marko Viitanen 1c6ffc0a7e Fix wrong variable types in context init 2019-08-19 14:33:55 +03:00
Marko Viitanen cd6be15e10 Fix context init to match VTM6.0 2019-08-19 13:57:31 +03:00
Marko Viitanen 3de198d2db Sync contexts with VTM6.0 2019-08-19 09:39:59 +03:00
Marko Viitanen e644b03615 Fix headers to match VTM6.0rc1 2019-08-16 15:33:20 +03:00
Ari Lemmetti 1dd0619bd7 Revert to 6924d90052 due to broken visual studio build 2019-08-08 15:15:34 +03:00
Pauli Oikkonen 2852baa673 Separate sign3_diff_epu8 from calc_eo_cat
Just to keep things simple, clear and obvious
2019-08-07 16:35:24 +03:00
Pauli Oikkonen 17947b79ee Add sao_shared_generics.h in Makefile.am 2019-08-07 16:35:24 +03:00
Pauli Oikkonen a8dd6ce351 Add a note about having implemented a separate AVX2 version of SAO offset array calculation 2019-08-07 16:35:24 +03:00
Pauli Oikkonen a858e7dd4b Combine duplicate code into inline functions 2019-08-07 16:35:24 +03:00
Pauli Oikkonen de0e97f711 Take 8/16/24b loads and stores into separate functions 2019-08-07 16:35:24 +03:00
Pauli Oikkonen 10979f58fe Tidy up code 2019-08-07 16:35:24 +03:00
Pauli Oikkonen 9cc11976c0 Combine the delta accumulation from edge and band ddistortion into shared func
This won't reduce object size, but there'll be less duplicate code
2019-08-07 16:35:24 +03:00
Pauli Oikkonen 55d877bd66 Vectorize sao_edge_ddistortion 2019-08-07 16:35:24 +03:00
Pauli Oikkonen aef0f301d3 Fix function signatures
Mark anything intended as read-only to be const, and fix alignment
2019-08-07 16:35:24 +03:00
Pauli Oikkonen 997fd369b3 Redo calc_sao_edge_dir_avx2
Do it wider, 32 pixels at once!
2019-08-07 16:35:24 +03:00
Pauli Oikkonen db1e475e02 Use i32 instead of i8 for x/y offsets
Doesn't matter too much, because this number isn't used in SIMD
computation, only as a memory reference offset.
2019-08-07 16:35:24 +03:00
Pauli Oikkonen 12de466ef5 Reimplement non-band SAO color reconstruction in AVX2
Streamline things to work on 32 pixels at once instead of 8
2019-08-07 16:35:24 +03:00
Pauli Oikkonen e8bff99329 Redo the SAO_TYPE_BAND subsection of AVX2 SAO color reconstruction
Vectorize it all, hope this helps with perf
2019-08-07 16:35:24 +03:00
Pauli Oikkonen 7b5dffa855 Implement calc_sao_offset_array in AVX2
To be efficient, the AVX2 color reconstruction algorithm will need
offsets in byte, not dword, arrays. This is completely specific to 8-bit
pixels and the function signature is fundamentally distinct from the
generic algorithm, so it's better to not strategize SAO offset array
calculation.
2019-08-07 16:35:24 +03:00
Pauli Oikkonen 29563b7039 Make kvz_calc_sao_offset_array more obvious
Name temporary values from array lookups etc that are referred multiple
times to, to make the behavior of the mechanism more transparent. Define
all the constant values at the beginning of the function and declare as
const.
2019-08-07 16:35:24 +03:00
Pauli Oikkonen 08881f5e9b (TEMP) (TODO) (whatever) Avoid compiler warnings
I want the CI to not crash on its -Wall -Werror, but instead to actually
build the thing and report me about actual memory errors etc
2019-08-07 16:35:24 +03:00
Pauli Oikkonen c18adc5ee0 Redo sao_band_ddistortion_avx2
Avoid branching and do the entire thing on 32 pixels at once in YMMs.
Also make the sao_bands function parameter const.
2019-08-07 16:35:24 +03:00
Pauli Oikkonen 2827c3e3ab Make calc_sao_bands less opaque 2019-08-07 16:35:24 +03:00
Pauli Oikkonen 1bb9a079a8 Fix indentation 2019-08-07 16:35:24 +03:00
Reima Hyvönen 7bc959c7c5 3 sao functions are now working 2019-08-07 16:35:24 +03:00
Reima Hyvönen 0e0f2d3490 made to clear sum vector after it has been set to memory 2019-08-07 16:35:24 +03:00
Reima Hyvönen f146de7acb removed some variables to prevent memory losses 2019-08-07 16:35:24 +03:00
Reima Hyvönen 247c3a7a71 converted signed to unsigned int 2019-08-07 16:35:24 +03:00
Reima Hyvönen ac5c216974 Some more memory error preventing to sao_edge_ddistortion_avx2 2019-08-07 16:35:24 +03:00
Reima Hyvönen 3fb1cbca35 more editing sao_edge_ddistortion_avx2 2019-08-07 16:35:24 +03:00
Reima Hyvönen afbb6fb960 some more modifications to sao_edge_ddistortion_avx2 to prevent memory failures 2019-08-07 16:35:24 +03:00
Reima Hyvönen 3496a57f7a Edited sao_edge_ddistortion_avx2 to avoid memory overflow 2019-08-07 16:35:24 +03:00
Reima Hyvönen 267ba1d6ce Modified sao_band_ddistortion_avx2 2019-08-07 16:35:24 +03:00
Reima Hyvönen e70663b245 added some sub commands to avoid memory read errors 2019-08-07 16:35:24 +03:00
Reima Hyvönen 59dfb4570c Converted some loads to load int8_t instead ints 2019-08-07 16:35:24 +03:00
Reima Hyvönen 8b253209a8 Found false address load from calc_sao_edge_dir. Should now work like generic 2019-08-07 16:35:24 +03:00
Reima Hyvönen 50e0a47b7a Took away __restrict 2019-08-07 16:35:24 +03:00
Reima Hyvönen 8a39eb674e Removed c-variable from calc_sao_edge_dir_avx2 2019-08-07 16:35:24 +03:00
Reima Hyvönen bc0a36830d Clarified some 6 pixel loads 2019-08-07 16:35:24 +03:00
Reima Hyvönen 1a8b211e05 Added break to line 170 2019-08-07 16:35:24 +03:00
Reima Hyvönen d05e750ebe Added some switches to prevent segmentation fault from reading 2019-08-07 16:35:24 +03:00
Reima Hyvönen 203580047d Defined some AVX functions 2019-08-07 16:35:24 +03:00
Reima Hyvönen c884c738b1 Updated some commands to match the standard 2019-08-07 16:35:24 +03:00
Reima Hyvönen b412ed2f59 Removed some setr and used loads calc_sao_edge_dir_avx2 2019-08-07 16:35:24 +03:00
Reima Hyvönen c6cc063534 converted some hadd operations at calc_sao_edge_dir_avx2 to cast and extract 2019-08-07 16:35:24 +03:00
Reima Hyvönen 47ac109b10 optimized some sao_reconstruct_color_avx2 when sao->type == SAO_TYPE_BAND 2019-08-07 16:35:24 +03:00
Reima Hyvönen 96dc60a1ed first working optimization 2019-08-07 16:35:24 +03:00
Reima Hyvönen c148aff9fb Some optimization done to function sao_reconstruct_color_avx2 2019-08-07 16:35:24 +03:00
Reima Hyvönen bf16ba6cc4 Remade sao_edge_ddistortion_avx2 and calc_sao_edge_dir_avx2 2019-08-07 16:35:24 +03:00
Reima Hyvönen 79dc39a676 Some editing for sao_edge_ddistortion_avx2 2019-08-07 16:35:24 +03:00
Reima Hyvönen 06ee52924e some reconst done to calc_sao_edge_dir_avx2 2019-08-07 16:35:24 +03:00
Reima Hyvönen 5fbc65d823 reconst optimation doesn't work yet 2019-08-07 16:35:24 +03:00
Reima Hyvönen d29f834a69 Remove useless function 2019-08-07 16:35:24 +03:00
Reima Hyvönen a232a12160 calc_sao_edge_dir_avx2 updated 2019-08-07 16:35:24 +03:00
Reima Hyvönen b1febc02a5 sao_edge_ddistortion_avx2 now working properly 2019-08-07 16:35:24 +03:00
Reima Hyvönen cd6092a1ec Still too much bits, looking for where they appear 2019-08-07 16:35:24 +03:00
Reima Hyvönen 7853be8eeb Incomplete optimization 2019-08-07 16:35:24 +03:00
Marko Viitanen dfa5621024 Intrapred cleanup 2019-07-16 14:23:10 +03:00
Ari Lemmetti 40609aa865 Add missing headers to Makefile.am 2019-07-12 19:15:51 +03:00
Ari Lemmetti 5db3a78499 Bump versions for release 1.3 2019-07-09 22:09:32 +03:00
Ari Lemmetti d513ab1999 Add missing newline 2019-07-09 21:06:05 +03:00
Ari Lemmetti 4967072625 Do not bypass search on skip cu if early_skip is not enabled 2019-07-09 20:20:12 +03:00
Ari Lemmetti b20992a9f3 Rename functions more descriptive 2019-07-09 20:20:11 +03:00
Ari Lemmetti a348a0ec23 Fix transform depth in early skip 2019-07-09 20:05:48 +03:00
Pauli Oikkonen 8d48bee180 Tidy fast coeff cost code 2019-07-09 18:01:54 +03:00
Pauli Oikkonen 201a43b08e Clean up the RD-estimation code 2019-07-09 18:01:54 +03:00
Pauli Oikkonen b111df5073 Create preliminary version of improved cost estimator 2019-07-09 18:01:54 +03:00
Ari Lemmetti be08a87d94 Add missing parameter max-merge to the help message 2019-07-09 16:28:46 +03:00
Ari Lemmetti d0bb9b4a6d Add parameter max-merge to presets 2019-07-09 16:26:03 +03:00
Ari Lemmetti 4097331fd6 Early skip 2019-07-09 15:59:31 +03:00
Marko Viitanen 10d850e98a Use index_offset in intra angular and change the offset to width+1 2019-07-08 14:23:19 +03:00
Marko Viitanen 3d1fa2a9cf Fixing angular intra prediction reference pixels 2019-07-08 14:00:02 +03:00
Marko Viitanen 0656c54cab Fix some problems with reference pixels in angular intra prediction kvz_angular_pred_generic() 2019-07-05 15:54:51 +03:00
Marko Viitanen 89ca2d4ba1 Use correct type for modedisp2sampledisp array 2019-07-05 14:12:10 +03:00
Marko Viitanen 2e8a0d08f9 Fix mvp_idx_model initialization and use 2019-07-05 14:11:29 +03:00
Joose Sainio 977e885ea2 Fix issue with gop=0 introduced in 1c36f68d0c 2019-07-05 12:57:27 +03:00
Marko Viitanen c6217e236f Enable 4-tap filtering for the intra angular 2019-07-04 16:26:10 +03:00
Marko Viitanen cda6d951c0 Change DCT arrays back to 8-bit -> some frames are now correct 2019-07-04 15:59:10 +03:00
Marko Viitanen 8280bd3217 Add channel info to angular_pred and fix the displacement tables.
Also includes 4-tap intra filtering code commented out
2019-07-04 09:35:47 +03:00
Marko Viitanen 5e4369d6b0 Fix the kvz_cabac_encode_aligned_bins_ep function -> cabac coding now correct 2019-07-03 15:55:52 +03:00
Marko Viitanen 3fad4b0a98 Disable kvz_cabac_encode_aligned_bins_ep for now and add a ToDo message 2019-07-03 15:44:35 +03:00
Sami Ahovainio ce1e67cc3a Modified header flags to match VTM commit b9080ff45bec368c44f0c43a32dcd6804ef9f5d6 2019-07-01 13:58:15 +03:00
Sami Ahovainio 3863064d90 Fixed bugs in split decision and coefficient coding. 2019-07-01 13:00:43 +03:00
Mikko Pitkänen a7f09c8114 Merge branch 'threadwrapper' 2019-06-24 16:54:59 +03:00
Sami Ahovainio db5c0230e5 Fixed coefficient sign hiding 2019-06-20 16:26:01 +03:00
Sami Ahovainio b51254cafd Fixed significant coefficient group context calculation 2019-06-20 15:47:13 +03:00
Sami Ahovainio 5e0bea962c Fixed split context decision 2019-06-20 15:30:49 +03:00
Sami Ahovainio 12322144f0 Removed debug print from context.c 2019-06-20 15:18:22 +03:00
Sami Ahovainio 3a9800d07d Fixed coefficient coding. Fixed headers to match VTM commit e65075531471a68632bc9252d607655a0feeabc6 2019-06-20 14:43:03 +03:00
Mikko Pitkänen 3dd606ce2e Add new threadwrapper 2019-06-18 18:45:45 +03:00
Sami Ahovainio 2c78aa0642 Fixes to coeff coding. 2019-06-13 12:01:29 +03:00
Joose Sainio c94077d15e remove hardcoded value 2019-06-12 14:37:41 +03:00
Joose Sainio ac68c8444d remove negation that wasn't supposed to be there 2019-06-12 14:35:24 +03:00
Joose Sainio 5851dcc3be missing negation 2019-06-12 14:08:18 +03:00
Joose Sainio 1c36f68d0c Fix owf>=9 gop=8 and add test to catch such problem in future 2019-06-12 14:04:41 +03:00
Sami Ahovainio 3564b4829e Fixed split context decision. Modified intra mode initialization to match VTM version aa76fc5c04cf43390f43d63f9977bea8ee31997a. 2019-06-12 12:59:16 +03:00
Sami Ahovainio a8a53e15b5 Fixed headers to match VTM commit aa76fc5c04cf43390f43d63f9977bea8ee31997a. Added multi_ref_line flag coding. 2019-06-07 13:37:45 +03:00
Ari Lemmetti 933ff6ed55 Merge branch 'set-qp-in-cu-fix' 2019-06-07 09:01:03 +03:00
Sami Ahovainio 8d2581e58c Fixed issue with kvz_go_rice_par_abs where passing a unsigned argument caused MIN function to return wrong value. Modified coefficient coding to match VTM 5.0. Some issues still remain. 2019-06-05 15:57:18 +03:00
Sami Ahovainio 367f1b2129 Fixed splitting bug caused by wrong values in the headers. Fixed header flags to match VTM commit 5703e81b2de677d976ec15423f5768b17619ba6a 2019-06-05 11:21:02 +03:00
Sami Ahovainio 76d56290ed Fixed VUI header writing. Fixed debug prints of NAL headers and rbsp_stop_one_bit. 2019-05-31 11:13:11 +03:00
Ari Lemmetti c6da839002 Set lcu sqrt lambda according to lcu lambda instead of frame lambda when ROI is used 2019-05-29 18:32:10 +03:00
Marko Viitanen 8282a18c36 Fixed headers and NAL writing to match the latest VTM master 988c22cbb9c58584cac3ef0ec7794cafbea6dfd6 2019-05-29 16:18:35 +03:00
Sami Ahovainio 4768ba0628 Minor fixes to header writing. Added contexts for multi_ref_line and BDPCM. Functions added for writing both in bitstream, but they are both disabled for now. 2019-05-29 13:00:19 +03:00
Sami Ahovainio 3339e12169 Fixed some header flags 2019-05-27 09:56:56 +03:00
Ari Lemmetti 9339845e8b Set QP completely at CU level as the name '--set-qp-in-cu' implies
-Move slice delta QP to CU level when using --set-qp-in-cu
-Separate functionality from roi
2019-05-24 20:38:39 +03:00
Pauli Oikkonen 081d16fc33 Fix intrinsics that may be missing on some systems
Create a header to collect all the workarounds for missing intrinsics
in one place
2019-05-23 19:59:40 +03:00
Sami Ahovainio 5b46fbd878 Added multi_ref_idx variable for intra coding (is 0 throughout the code for now). Modified prediction flag writing. Chroma pred flag remains unchanged (ToDo). Added bitstream debug printing on VERBOSE mode. 2019-05-21 12:28:05 +03:00
Sami Ahovainio ed4e218702 Updated coefficient coding to match VTM 5.0 2019-05-13 15:30:43 +03:00
Sami Ahovainio 504c3dfd1b Modified the headers to match current VTM headers 2019-05-07 16:30:06 +03:00
Marko Viitanen 30a8a7b97c WIP fixing the last significant xy coding 2019-05-07 15:01:02 +03:00
Pauli Oikkonen 87a9208db8 Eliminate cvtsi64_si128 intrinsic
Apparently it'll cause Win32 builds to break because it emits the movq
instruction or something..
2019-04-17 16:30:40 +03:00
Pauli Oikkonen 7175d20bb2 Still include stdint.h for non-vector builds 2019-04-15 19:36:01 +03:00
Pauli Oikkonen 1315c7e2b0 Do not compile any vector code for non-SSE4/AVX2 builds 2019-04-15 19:10:48 +03:00
Pauli Oikkonen f5f70e7bc5 Merge branch 'sad-optimization' 2019-04-15 19:02:01 +03:00
Jan Beich 85f46e17a9 Detect AltiVec via elf_aux_info() on FreeBSD 12+ 2019-04-01 13:08:04 +00:00
Jan Beich 82486255da Simplify AltiVec detection on Linux 2019-04-01 13:08:04 +00:00
Marko Viitanen 1546acfdb9 New NAL unit IDs and header changes 2019-03-28 10:11:36 +02:00
Marko Viitanen 36eab9c170 New cabac context models with "rate" 2019-03-27 12:38:19 +02:00
Marko Viitanen 3bdc8ac8d3 Fix intra_chroma_pred_mode and cbf contexts 2019-03-26 09:10:09 +02:00
Marko Viitanen d15f58517f Changed intra coding to use 6 MPM, implemented merge sort and MPM selection 2019-03-20 15:20:31 +02:00
Marko Viitanen 1081336868 Updated intra pred mode init values 2019-03-20 15:18:32 +02:00
Marko Viitanen f3acd245ae New cabac coding function: kvz_cabac_encode_trunc_bin 2019-03-20 15:17:54 +02:00
Marko Viitanen 80d6e4bf05 New split flag calculations 2019-03-20 09:07:58 +02:00
Marko Viitanen 8c84348010 New entropy bit table 2019-03-20 09:07:22 +02:00
Marko Viitanen 2d0348aa6d New context models 2019-03-20 09:06:57 +02:00
Marko Viitanen 052080747e New CABAC functions 2019-03-20 09:06:26 +02:00
Marko Viitanen 20667fdba6 Update header bits to VTM 4.0+ 2019-03-11 14:02:12 +02:00
Pauli Oikkonen 6d43759604 Create a border-respecting 32-wide AVX hor_sad 2019-03-07 18:01:22 +02:00
Pauli Oikkonen f218cecb38 Remove offending hor_sad_avx2_w32 function
Consider possibly creating a non-offending AVX2 version instead, the
way hor_sad_sse41_w32 works. Or maybe there's more essential work to
do.
2019-03-05 22:51:41 +02:00
Pauli Oikkonen df2e6c54fd 4-unroll hor_sad_sse41_arbitrary
This may not increase perf though because it's so rarely used
function, so keeping icache footprint may be more essential...
2019-03-05 22:45:23 +02:00
Pauli Oikkonen 448eacba7b Avoid overreading block borders in hor_sad_sse41_arbitrary 2019-03-05 22:34:50 +02:00
Eemeli Kallio c159e275b7 Merge branch 'max_merge' 2019-03-05 14:39:03 +02:00
Pauli Oikkonen 41f51c08c4 Avoid overrunning buffer in hor_sad_sse41_w32 2019-03-01 15:37:38 +02:00
Pauli Oikkonen bcd9879359 Include quant coeff range check in non-scaling list execution path too 2019-02-27 17:26:44 +02:00
Pauli Oikkonen 24e6363f64 Remove the kvz_quant_avx2 wrapper function 2019-02-27 16:32:58 +02:00
Pauli Oikkonen 748820f3c5 Eliminate unnecessary loading of coeffs if scaling lists are off 2019-02-27 16:26:35 +02:00