Commit graph

2812 commits

Author SHA1 Message Date
Ari Lemmetti 33295bf350 Use AVX2 luma interpolation for SMP and AMP as well 2021-03-08 22:36:09 +02:00
Ari Lemmetti 7ce68761c2 Add a reminder to fix a rare case for bipred 2021-03-08 22:36:09 +02:00
Ari Lemmetti 475f1d79d5 Add some defines for important interpolation related sizes 2021-03-08 22:36:09 +02:00
Ari Lemmetti 4314f3a9a7 Rename some interpolation functions and strategies for consistency 2021-03-08 22:36:08 +02:00
Ari Lemmetti 5a70b49f69 Require 64-bit build for AVX2 interpolation filter functions 2021-03-08 22:36:08 +02:00
Ari Lemmetti 5631651469 Remove unused functions and variables 2021-03-08 22:36:08 +02:00
Ari Lemmetti d8e7aac380 Do not use nonstandard extension for struct initialization. 2021-03-08 22:36:07 +02:00
Ari Lemmetti e38219e489 Fix epol_func signature and function definition 2021-03-08 22:36:07 +02:00
Ari Lemmetti 7e6ba9750f Add new AVX2 ip filters for chroma 2021-03-08 22:36:07 +02:00
Ari Lemmetti 3476fc62c7 Fix parameter to signed 2021-03-08 22:36:06 +02:00
Ari Lemmetti e572066e46 Add new AVX2 vertical ip filter for pixel precision 2021-03-08 22:36:06 +02:00
Ari Lemmetti 9e4b62a891 Use the new horizontal filter for pixel precision as well 2021-03-08 22:36:06 +02:00
Ari Lemmetti 2175023843 Relocate function 2021-03-08 22:36:06 +02:00
Ari Lemmetti f5b0e3c52b Add new AVX2 horizontal ip filter capable of every luma PB 2021-03-08 22:36:05 +02:00
Ari Lemmetti d9a3225ae5 Add new AVX2 vertical ip filter for high-precision 2021-03-08 22:36:05 +02:00
Ari Lemmetti 84222cf3e7 Replace old block extrapolation with more capable one.
Separate paddings for different directions can now be specified.
2021-03-08 22:36:04 +02:00
Jaakko Laitinen 845902062c Fix warning and limit intra qp offset to -3 2021-03-04 18:08:59 +02:00
siivonek bf0bf73665 Fix mistake in define. 2021-02-16 20:21:33 +02:00
siivonek 6f455f29cc Add MINGW64 to define. Try to fix tsan test path error to suppressions.txt. 2021-02-16 15:44:18 +02:00
siivonek 9a65617a34 Disable thread exit call in encmain when MINGW is used. This should fix the issue with media auto-build suite. 2021-02-15 14:47:18 +02:00
Pauli Oikkonen fcc2c1fa7b return-type error does not know that you don't return from assert(0) 2021-01-12 13:28:55 +02:00
Pauli Oikkonen fa8cfb92e8 Maybe this would work with VC++
Our threadwrapper does not support PTHREAD_MUTEX_INITIALIZER, apparently
that's a toughie to implement on Windows or something, dunno. Use
dynamic initialization instead, then.
2021-01-11 18:22:53 +02:00
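The distinction drawn in the message above can be sketched with plain POSIX threads: the static form relies on `PTHREAD_MUTEX_INITIALIZER` being available, while dynamic initialization only needs `pthread_mutex_init` to be called once before first use. The mutex name and the two helper functions are hypothetical and not taken from the kvazaar sources or its thread wrapper.

```c
#include <pthread.h>

/* Static form that the commit moves away from:
 *   static pthread_mutex_t outfile_mutex = PTHREAD_MUTEX_INITIALIZER;
 */

/* Hypothetical mutex, initialized at run time instead. */
static pthread_mutex_t outfile_mutex;

/* Call once before any thread locks the mutex.
 * Returns 0 on success, like pthread_mutex_init itself. */
static int outfile_mutex_setup(void)
{
  return pthread_mutex_init(&outfile_mutex, NULL);
}

/* Call once after the last user is done. Returns 0 on success. */
static int outfile_mutex_teardown(void)
{
  return pthread_mutex_destroy(&outfile_mutex);
}
```

The cost of the dynamic form is that setup and teardown need an explicit place in the program's lifetime, for example next to other one-time initialization.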
Pauli Oikkonen 20758a77e3 document fastrd measurement tools 2021-01-11 18:22:53 +02:00
Pauli Oikkonen 0e07308ea5 new weights 2021-01-11 18:22:53 +02:00
Pauli Oikkonen 5827ecc5a6 this little piggy wasn't on board, obviously.. 2021-01-11 18:22:53 +02:00
Pauli Oikkonen 643e70d4ca also move the readme file :^) 2021-01-11 18:22:53 +02:00
Pauli Oikkonen 1c1807f80b move rdcost stuff into a separate directory 2021-01-11 18:22:53 +02:00
Pauli Oikkonen a37095b061 new weights using new scripts 2021-01-11 18:22:53 +02:00
Pauli Oikkonen 17bedc9751 script to average out results by qp over sequences 2021-01-11 18:22:53 +02:00
Pauli Oikkonen ab13018b7c tidy it up 2021-01-11 18:22:53 +02:00
Pauli Oikkonen 8aa9a29e24 what if this were to work now 2021-01-11 18:22:53 +02:00
Pauli Oikkonen 4deed04eb9 you know what, fread returns number of elements, not bytes 2021-01-11 18:22:53 +02:00
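The point of the message above, as a minimal sketch: `fread(ptr, size, nmemb, stream)` returns the number of complete elements read, so the result is compared against the element count, not the byte count. The helper name and the 32-bit element type are assumptions for illustration only.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: read `count` 32-bit values from `f` into `dst`.
 * Returns 1 if all elements were read, 0 otherwise. */
static int read_u32_array(FILE *f, uint32_t *dst, size_t count)
{
  assert(f != NULL && dst != NULL);

  size_t got = fread(dst, sizeof(*dst), count, f);

  /* fread counts complete elements, so the expected value is `count`,
   * not `count * sizeof(*dst)`. */
  return got == count;
}
```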
Pauli Oikkonen c89477bb41 Ditto for 2nd part of least squares 2021-01-11 18:22:52 +02:00
Pauli Oikkonen 3dd4f0e00b Process some fault conditions in filter_rdcosts 2021-01-11 18:22:52 +02:00
Pauli Oikkonen 98a082cdcd last fixes to extract_rdcosts 2021-01-11 18:22:52 +02:00
Pauli Oikkonen b26e9c68c8 extract rdcosts works with the block qp fix 2021-01-11 18:22:52 +02:00
Pauli Oikkonen 40ae353820 Fix RD sampling to take the block QP into account 2021-01-11 18:22:52 +02:00
Pauli Oikkonen 03087fb44c Fix RDO sampling to work thru a CLI parameter, implement accuracy check
TODO: write into encoder->fastrd_learning_outfile instead of stdout.
It's a toughie tho, because fwrite takes in FILE* instead of const FILE*
but the encoder_control_t is passed as a const.
2021-01-11 18:22:52 +02:00
Pauli Oikkonen 33dd9c95cd Tool to extract RDO bitrates 2021-01-11 18:22:52 +02:00
siivonek e833354cdd Merge branch 10-bit-assert-fix 2020-12-07 20:36:50 +02:00
Pauli Oikkonen be19fd996b Add default value for fast coeff table filename
..oops
2020-11-02 14:02:51 +02:00
Pauli Oikkonen 46301e9857 Document the --fast-coeff-table option 2020-10-29 15:23:26 +02:00
Pauli Oikkonen 816789c9f4 Allow fast coeff weights to be read from a file 2020-10-29 15:22:51 +02:00
Pauli Oikkonen 6799019db0 Move fast coeff table to transform.h
Guess this is a more logical place for it
2020-10-29 15:20:27 +02:00
Pauli Oikkonen 4712ce5f59 Round the fast coeff result instead of flooring 2020-10-29 15:20:27 +02:00
Pauli Oikkonen 0fb09c9920 New filtered coeff weight by QP values 2020-10-29 15:20:27 +02:00
Pauli Oikkonen 9bf0cb27b1 Constrain fast cost estimation to QPs we have weights for 2020-10-29 15:20:27 +02:00
Pauli Oikkonen 24d487f553 New weights for 12 <= QP <= 42
Trained using MSU ultrafast settings now
2020-10-29 15:20:27 +02:00
Pauli Oikkonen 3e1c6d84b8 Fix issues in fast coeff estimation
Allow weight table to start from nonzero QP, and round weights to Q8.8
instead of flooring them
2020-10-29 15:20:27 +02:00
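A small sketch of the difference between flooring and rounding when a floating-point weight is converted to unsigned Q8.8 (8 integer bits, 8 fractional bits). The function names and the `float` input type are assumptions; the actual conversion code is not shown in this log.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a weight to unsigned Q8.8 by truncation.
 * Assumes 0 <= w < 255.99 so the result fits in 16 bits. */
static uint16_t weight_to_q8_8_floor(float w)
{
  assert(w >= 0.0f && w < 255.99f);
  return (uint16_t)(w * 256.0f);
}

/* Same conversion, but rounding to the nearest step (ties round up). */
static uint16_t weight_to_q8_8_round(float w)
{
  assert(w >= 0.0f && w < 255.99f);
  return (uint16_t)(w * 256.0f + 0.5f);
}
```

For a weight of exactly 1.5/256, truncation gives 1 and rounding gives 2. Flooring is always biased downward by up to one step, while rounding halves the worst-case error.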
Pauli Oikkonen 5f91bda762 Use newer data for fast coeff cost estimation
Same training dataset, but this time only buckets 0...3 were used to
approximate the function, no sign/cg width bucket.
2020-10-29 15:20:27 +02:00
Pauli Oikkonen 2abd733199 Use unsigned min() to correctly clip -32768
If a coeff happens to be -32768 (0x8000), its 16-bit abs() is also
0x8000. It should ultimately be clipped to 3, so interpret absolute
values as unsigned instead to make that happen.
2020-10-29 15:20:27 +02:00
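The corner case described above, as a scalar sketch. The commit itself concerns AVX2 code; the functions below only mimic the 16-bit lane arithmetic in plain C, and all names are hypothetical.

```c
#include <stdint.h>

/* 16-bit absolute value with wraparound, as a packed 16-bit abs would
 * compute it: the bit pattern of abs(-32768) stays 0x8000. */
static uint16_t abs16_wrapping(int16_t v)
{
  uint16_t u = (uint16_t)v;
  return (v < 0) ? (uint16_t)(0u - u) : u;
}

/* Signed min(): 0x8000 is read as -32768, which is "less than" 3,
 * so the value passes through unclipped. */
static uint16_t clip_abs_to_3_signed(int16_t coeff)
{
  uint16_t a = abs16_wrapping(coeff);
  int32_t  s = (a & 0x8000u) ? (int32_t)a - 65536 : (int32_t)a;
  int32_t  m = (s < 3) ? s : 3;
  return (uint16_t)m;
}

/* Unsigned min(): 0x8000 is read as 32768 and is clipped to 3. */
static uint16_t clip_abs_to_3_unsigned(int16_t coeff)
{
  uint16_t a = abs16_wrapping(coeff);
  return (a < 3u) ? a : 3u;
}
```

Both variants agree for every input except -32768, which is why the bug only shows up when a coefficient hits that exact value.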
Pauli Oikkonen b93b90c0d7 Implement new fast coeff cost estimator in AVX2 2020-10-29 15:20:27 +02:00
Pauli Oikkonen 2f74a112b3 Try first lookup table based fast coeff estimation 2020-10-29 15:20:27 +02:00
siivonek bc1206a4d3 Define qp_delta_min & max in global.h instead of calculating them locally. 2020-09-29 13:46:27 +02:00
siivonek 0f3ef786b9 Modify delta QP range assert so it will work with any valid bit depth. Modify VAQ code so it will clip the QP to a proper range which is dependent on bit depth 2020-09-22 20:15:23 +02:00
siivonek fe6f93a951 Fix delta QP range check assert. Add separate asserts based on bit depth. 2020-09-22 20:15:22 +02:00
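For reference, a sketch of how the delta QP range depends on bit depth. HEVC constrains the CU delta QP to the range -(26 + QpBdOffset/2) to +(25 + QpBdOffset/2), with QpBdOffset = 6 * (bit depth - 8). The macro and function names below are hypothetical and are not the definitions added to global.h.

```c
#include <assert.h>

/* Hypothetical helpers, parameterized by bit depth. */
#define QP_BD_OFFSET(bit_depth)  (6 * ((bit_depth) - 8))
#define QP_DELTA_MIN(bit_depth)  (-(26 + QP_BD_OFFSET(bit_depth) / 2))
#define QP_DELTA_MAX(bit_depth)  (25 + QP_BD_OFFSET(bit_depth) / 2)

/* Clip a delta QP into the valid range for the given bit depth. */
static int clip_qp_delta(int delta, int bit_depth)
{
  assert(bit_depth >= 8);

  if (delta < QP_DELTA_MIN(bit_depth)) return QP_DELTA_MIN(bit_depth);
  if (delta > QP_DELTA_MAX(bit_depth)) return QP_DELTA_MAX(bit_depth);
  return delta;
}
```

With these definitions the range is -26..25 for 8-bit and -32..31 for 10-bit, so a single assert written for 8-bit input would reject valid 10-bit values.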
Joose Sainio 8143ab971c Merge branch 'stats-files'
# Conflicts:
#	src/cfg.c
#	src/cli.c
#	src/kvazaar.h
2020-09-16 09:25:00 +03:00
Joose Sainio 1c06bd7f3d Fix POC to be correct for all GOPs and Intra periods, fix issue with vaq 2020-09-14 14:25:48 +03:00
Sami Ahovainio 4d87fb2397 fixed potential out of bounds iteration 2020-09-10 12:59:39 +03:00
Sami Ahovainio 5d521a2444 Added option to force yuv as file format and made the options and file endings case insensitive 2020-09-09 16:05:59 +03:00
Joose Sainio 3fb8b7ebc6 Add --stats-file-prefix option
When the option is given a prefix, four files are written:
prefixlambda.txt, prefixqp.txt, prefixdist.txt, and prefixbits.txt, each
holding the corresponding data for every CTU. This is a debug feature.
2020-09-09 12:35:47 +03:00
Sami Ahovainio 84cabd9c20 Fixed sign match 2020-09-07 15:39:31 +03:00
Sami Ahovainio d691849594 Added frame header reading for both read and seek functions 2020-09-07 15:31:08 +03:00
Sami Ahovainio cbcee67821 y4m start header parsing ready 2020-09-07 15:31:07 +03:00
Joose Sainio c10b841e7c Merge remote-tracking branch 'remotes/origin/fix-sao-parameter' into master 2020-09-07 13:10:36 +03:00
Joose Sainio da09d49890 Remove optionality from --sao
The SAO parameter's argument was optional, so passing a value required
the "=" form. That is confusing because no other parameter requires it.
2020-09-07 12:35:40 +03:00
Pauli Oikkonen 3f7f0d7ed7 Allow bit depth to be defined from the outside
For a 10-bit build, just use:
env CFLAGS="-DKVZ_BIT_DEPTH=10" ./configure && make clean && make
2020-09-02 17:55:22 +03:00
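One common way to make a compile-time setting overridable from the command line, sketched here under assumptions: the guard lets `-DKVZ_BIT_DEPTH=10` take precedence over a default of 8. The pixel typedef and the `PIXEL_MAX` macro are shown as an illustration of what typically depends on the setting and may not match the actual headers.

```c
#include <stdint.h>

/* Keep a value passed via CFLAGS, otherwise fall back to 8 bits. */
#ifndef KVZ_BIT_DEPTH
#define KVZ_BIT_DEPTH 8
#endif

/* Assumed layout: one byte per sample for 8-bit, two bytes otherwise. */
#if KVZ_BIT_DEPTH == 8
typedef uint8_t kvz_pixel;
#else
typedef uint16_t kvz_pixel;
#endif

/* Largest sample value representable at the configured bit depth. */
#define PIXEL_MAX ((1 << KVZ_BIT_DEPTH) - 1)
```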
Pauli Oikkonen 780da4568a Exclude 8-bit-only code from 10-bit builds and use uint8_t instead of kvz_pixel for code that assumes 8-bit pixels 2020-09-02 17:46:33 +03:00
Pauli Oikkonen 31ef4e4216 Fix ml functions to accept kvz_pixel*, not uint8_t* 2020-09-02 17:46:33 +03:00
Joose Sainio faf5cc858d Merge branch 'fix-lp-gop-rc' 2020-06-25 09:41:57 +03:00
Joose Sainio 138651ee85 Fix the bit and frame counts for calculating the gop allocation
Additionally dynamically adjust the smoothing window if there are rapid changes
2020-06-24 15:26:54 +03:00
Ari Lemmetti f8ff6dd567
Merge pull request #262 from jbeich/truncate-freebsd
Unbreak build on FreeBSD
2020-06-22 18:08:01 +03:00
Ari Lemmetti d1abf85229 Add MV constraint check to motion estimation start point 2020-06-01 23:51:38 +03:00
Jan Beich 1fa69c705d Rename truncate() from 30ce461d98 to avoid conflict with POSIX version
strategies/avx2/dct-avx2.c:55:23: error: static declaration of 'truncate' follows non-static declaration
static INLINE __m256i truncate(__m256i v, __m256i debias, int32_t shift)
                      ^
/usr/include/stdio.h:448:6: note: previous declaration is here
int      truncate(const char *, __off_t);
         ^
2020-04-22 16:09:42 +00:00
Ari Lemmetti 9753820b3a Update version to 2.0.0 2020-04-22 01:03:36 +03:00
Ari Lemmetti 40e81f3243 Update preset tables. Update docs. 2020-04-22 01:03:21 +03:00
siivonek 54f438a75c Update VAQ help text. Update docs. Change some lingering tabs to spaces. 2020-04-20 16:52:07 +02:00
Ari Lemmetti f31dddc019 Bypass inverse quantization and inverse transform when trying early skip 2020-04-10 16:02:09 +03:00
Pauli Oikkonen fbdb1e2d15 Add correct path to sao_shared_generics.h in makefile 2020-04-08 19:27:12 +03:00
Pauli Oikkonen 8617530b13 Use _mm_store_epi64 instead of _mm_cvtsi128_si64
Fix 32-bit builds that tend to lack the cvt intrinsic. Hope it will be
optimized to a movq r64, xmm on modern platforms though
2020-04-07 23:51:54 +03:00
Pauli Oikkonen a82966c0f5 Fix lacking _mm256_cvtss_f32 intrinsic on VS
Cast __m256 into __m128 first, the XMM variant of the intrinsic has been
around for a long enough time to be supported
2020-04-07 22:38:10 +03:00
Joose Sainio c369ff8873 Fix a potential division by zero in a floating point operation
If K is not clipped before C is calculated from it, K can in some cases
become so large and negative that bpp^K rounds to zero. In real-life cases
this is extremely rare, and clipping beforehand has very little to no
effect.

Also remove commented debug prints
2020-04-06 11:05:49 +03:00
Ari Lemmetti 901c25c0c8 Merge branch 'vaq' 2020-04-03 19:51:17 +03:00
Ari Lemmetti 51451be5ef Handle cases where the number of pixels is not divisible by 32 2020-04-03 19:37:47 +03:00
siivonek ee544304f1 Make function static to not mess up tests. 2020-04-03 15:22:34 +02:00
siivonek e5267f7706 Fix define for use with Visual Studio. 2020-04-03 15:11:01 +02:00
siivonek 9e34369304 Merge branch 'vaq' of https://gitlab.tut.fi/TIE/ultravideo/kvazaar into vaq 2020-04-03 12:35:04 +02:00
siivonek d025977949 Clamp edge lcu pixels if dimensions are not 64 divisible. 2020-04-03 12:33:14 +02:00
Pauli Oikkonen addc1c3ede Fix warning about potentially unused hsum_8x32b
There's a lot of alternative options available, such as making it
globally visible with a kvz_ prefix, force inlining it, or anything.
This could be good too, hope it won't be compiled at all to translation
units where it's not used.
2020-04-02 16:44:22 +03:00
siivonek e3ba0bfb8c Fix memory leak. 2020-04-02 14:15:36 +02:00
siivonek 566680af7b Move function hsum to file where it is used to avoid errors. 2020-04-02 14:03:06 +02:00
siivonek 58be514e2a Fix pipeline error. 2020-04-02 13:50:08 +02:00
siivonek 2aa0d97589 Add VAQ test in test_tools. Bump minor version number in configure.ac. Update help text for VAQ. 2020-04-01 18:16:39 +02:00
siivonek c6e421019e Merge vaq-simd 2020-03-31 21:40:29 +02:00
Jaakko Laitinen 8e4b738900 Fix error when first value in pu depth list is omitted 2020-03-31 16:57:12 +03:00
Jaakko Laitinen 54ef0bbfd2 Fix unintended functionality when giving multiple --pu-depth-intra/inter list parameters 2020-03-31 16:39:56 +03:00
Jaakko Laitinen cb0c7b23b5 Merge branch 'intra_qp_offset_auto' into 'master'
Add auto option to intra-qp-offset

See merge request TIE/ultravideo/kvazaar!7
2020-03-31 16:17:36 +03:00
Pauli Oikkonen 99889dab15 Fix switch(bool) in picture-avx2.c
It passes on GCC but warns on Clang
2020-03-31 15:42:19 +03:00
Jaakko Laitinen e0440c3de1 Update docs 2020-03-31 15:27:48 +03:00
Jaakko Laitinen 7760dcf441 Remove intra qp offset from preset parameters 2020-03-31 14:06:07 +03:00