Ari Lemmetti
7ccd1a571c
[SIMD] Initial AVX2 code for 4-tap filtering in angular prediction.
2021-09-06 21:20:50 +03:00
Ari Lemmetti
20f0ff976d
[SIMD] Transform angular pred loops for SIMD processing.
2021-09-06 21:20:49 +03:00
Ari Lemmetti
3dfe09e850
[SIMD] Copy generic implementation of angular prediction as a skeleton.
2021-09-06 21:20:46 +03:00
Joose Sainio
450cbd356c
Merge branch 'joint_cbcr' into 'master'
...
[jccr] Add joint coding of chroma residual
See merge request cs/ultravideo/vvc/uvg266!6
2021-09-06 11:43:06 +03:00
Joose Sainio
0592cc65a0
[jccr] enable rdoq with jccr
2021-09-06 11:28:20 +03:00
Joose Sainio
072b84711a
[jccr] fix 64×64 CUs
2021-09-06 11:28:20 +03:00
Joose Sainio
042b5078d8
[jccr] WIP initial implementation
...
Add some kind of search for joint chroma residual coding.
The bitstream is currently correct, but prediction is incorrect because jccr
is not actually used in the search.
Hard-coded to be enabled.
2021-09-06 11:28:08 +03:00
Marko Viitanen
26f18865f7
[alf] Change the processing in alf_get_blk_stats_avx2() to allow utilizing the whole 256bit register
2021-08-27 13:40:28 +03:00
Marko Viitanen
fdf125f406
[alf] Fix incorrect conversion in alf_get_blk_stats_avx2
2021-08-27 10:25:20 +03:00
Marko Viitanen
6714973264
[alf] Change _mm_store_si128 to _mm_storeu_si128 in alf_get_blk_stats_avx2()
2021-08-26 18:05:06 +03:00
Marko Viitanen
5df8add046
[alf] Change order of alf_covariance.y array for better AVX2 optimization in alf_get_blk_stats_avx2()
2021-08-26 15:37:01 +03:00
Marko Viitanen
be9527cf1d
[alf] Change the order of alf_covariance.ee values to get better optimized solution for alf_get_blk_stats_avx2()
2021-08-26 11:07:13 +03:00
Marko Viitanen
f4de5cfd0f
[alf] Cleanup alf_calc_covariance_avx2() and use integers in alf_get_blk_stats_avx2()
2021-08-26 10:20:57 +03:00
Marko Viitanen
915bf3ca24
[alf] Fix AVX2 priority
2021-08-25 20:29:58 +03:00
Marko Viitanen
8ef3e6a126
[alf] Add strategy for alf_get_blk_stats() and an initial AVX2 version
2021-08-25 20:22:24 +03:00
Marko Viitanen
f61b9138cd
[alf] Import SSE4.1 optimized 5x5 and 7x7 filters from VTM13
...
* Modified to work with 8-bit pixels
2021-08-25 11:50:37 +03:00
Marko Viitanen
dc6a29b0d8
[alf] Initial generic strategies for 5x5 and 7x7 filtering
2021-08-25 10:50:00 +03:00
Marko Viitanen
c3c96d69c2
[alf] Add modified alf_derive_classification_blk_sse41() from VTM 13.0
...
* Modified to work with bitdepth 8
2021-08-20 11:45:02 +03:00
Marko Viitanen
b158d05bca
[alf] rename strategy function to include prefix
2021-08-19 17:19:17 +03:00
Marko Viitanen
3efaeede76
[alf] Define the strategy for alf_derive_classification_blk()
2021-08-19 17:04:35 +03:00
Marko Viitanen
d742f57779
Remove angular_pred_avx2 so we don't need extra parameter
2021-08-15 10:43:48 +03:00
Marko Viitanen
5604b6f946
[cleanup] remove all crypto related stuff, fix warnings, move estimate.m to tools/
2021-07-27 09:27:51 +03:00
Marko Viitanen
99a2b0384d
[cleanup] remove some warnings
2021-07-26 11:42:19 +03:00
Marko Viitanen
0cad1ac3c9
[mts] Add a comment about idct8/idst7 16x16 being unoptimized
2021-07-21 14:02:23 +03:00
Marko Viitanen
d5ef036d35
[mts] change mts_subset tables back to static
2021-07-21 13:54:59 +03:00
Marko Viitanen
60caf2c378
[mts] fix 32x32 idst/idct
2021-07-21 13:44:25 +03:00
Marko Viitanen
c2cd5fb98e
[mts] replace AVX2 DST7/DCT8 16x16 with unoptimized for now
2021-07-21 13:38:17 +03:00
Marko Viitanen
7e089f518d
[mts] add optimized versions of DCT8 and DST7, inverse not yet working properly
...
* Includes new unit tests for the mts
2021-07-21 11:53:15 +03:00
Marko Viitanen
7f67009511
Fix MD5 calculations from HEVC to VVC way
2021-06-24 15:03:29 +03:00
Marko Viitanen
c004735821
[LMCS] Fix casting of the chroma scaled residual
2021-06-18 09:35:06 +03:00
Joose Sainio
cfffd7166c
Use correct context for calculating coeff costs for transform skip
2021-06-07 13:06:03 +03:00
Marko Viitanen
4594bf0ca8
Merge branch 'lmcs_chroma'
2021-06-02 15:05:04 +03:00
Marko Viitanen
5babb14ee7
[LMCS] Use chroma scaling
2021-06-01 12:17:03 +03:00
Joose Sainio
f9de8ebc4f
Merge branch 'master' into '4x4-rd'
...
# Conflicts:
# src/encoder.c
# tests/test_intra.sh
2021-05-28 11:43:55 +00:00
Marko Viitanen
dbc7fd48bf
[LMCS] Initialize some m_reshapeCW values to avoid division by zero
2021-05-24 18:57:37 +03:00
Marko Viitanen
73ac3b68bf
[LMCS] add missing header in quant-avx2.c
2021-05-24 17:25:38 +03:00
Marko Viitanen
4cd5bc38a1
[LMCS] Luma mapping working after some rework, have to keep the reconstruction in the mapped domain
2021-05-24 17:23:17 +03:00
Joose Sainio
cfd7d2666b
slightly optimize intra-generic.c
2021-05-14 10:23:37 +03:00
Joose Sainio
7674e94fd1
[rdoq] transform skip RDOQ
...
Copy the implementation from VTM
2021-05-03 12:52:10 +03:00
Joose Sainio
d2b9893bb7
[transform skip] Fix misunderstanding that caused TS to use QP >= 52
2021-04-30 10:55:23 +03:00
Joose Sainio
a998f3ed74
[transform-skip] Convert the HEVC transform skip to VVC
...
For some reason transform skip uses QP MAX(52, QP) and the coeffs are
no longer shifted
2021-04-30 10:55:23 +03:00
Joose Sainio
2ab005692d
Enable 4x4 intra CUs
2021-04-23 10:57:29 +03:00
Joose Sainio
1aaa95601c
Merge remote-tracking branch 'remotes/kvz_github/master' into Fix-monochrome
...
# Conflicts:
# .gitlab-ci.yml
# build/kvazaar_lib/kvazaar_lib.vcxproj.filters
# src/cfg.c
# src/encoder.h
# src/kvazaar.h
# src/rdo.c
2021-04-23 10:56:50 +03:00
Joose Sainio
e8eab326fb
Update context selection to match VVC
2021-04-23 10:51:01 +03:00
Joose Sainio
b2076d3b39
Enable chroma scaling
...
WIP: user defined scaling array
2021-03-16 10:31:26 +02:00
Joose Sainio
412781db41
[scalinglist] Fix quant-generic
2021-03-09 10:42:40 +02:00
Joose Sainio
30e573c261
[scalinglist] WIP: Update scalinglist for VVC
...
Seems to work when rdoq is enabled but not when it is disabled
2021-03-09 09:51:49 +02:00
Ari Lemmetti
dad3d6818e
Only read left and right border pixels if necessary
2021-03-08 22:36:10 +02:00
Ari Lemmetti
b72ab583b4
Handle "don't care" rows in the end separately
2021-03-08 22:36:09 +02:00
Ari Lemmetti
33295bf350
Use AVX2 luma interpolation for SMP and AMP as well
2021-03-08 22:36:09 +02:00
Ari Lemmetti
7ce68761c2
Add a reminder to fix a rare case for bipred
2021-03-08 22:36:09 +02:00
Ari Lemmetti
475f1d79d5
Add some defines for important interpolation related sizes
2021-03-08 22:36:09 +02:00
Ari Lemmetti
4314f3a9a7
Rename some interpolation functions and strategies for consistency
2021-03-08 22:36:08 +02:00
Ari Lemmetti
5a70b49f69
Require 64-bit build for AVX2 interpolation filter functions
2021-03-08 22:36:08 +02:00
Ari Lemmetti
5631651469
Remove unused functions and variables
2021-03-08 22:36:08 +02:00
Ari Lemmetti
e38219e489
Fix epol_func signature and function definition
2021-03-08 22:36:07 +02:00
Ari Lemmetti
7e6ba9750f
Add new AVX2 ip filters for chroma
2021-03-08 22:36:07 +02:00
Ari Lemmetti
3476fc62c7
Fix parameter to signed
2021-03-08 22:36:06 +02:00
Ari Lemmetti
e572066e46
Add new AVX2 vertical ip filter for pixel precision
2021-03-08 22:36:06 +02:00
Ari Lemmetti
9e4b62a891
Use the new horizontal filter for pixel precision as well
2021-03-08 22:36:06 +02:00
Ari Lemmetti
2175023843
Relocate function
2021-03-08 22:36:06 +02:00
Ari Lemmetti
f5b0e3c52b
Add new AVX2 horizontal ip filter capable of every luma PB
2021-03-08 22:36:05 +02:00
Ari Lemmetti
d9a3225ae5
Add new AVX2 vertical ip filter for high-precision
2021-03-08 22:36:05 +02:00
Ari Lemmetti
84222cf3e7
Replace old block extrapolation with more capable one.
...
Separate paddings for different directions can be now specified.
2021-03-08 22:36:04 +02:00
Marko Viitanen
e05dcdb193
Enable sign hiding in quant_avx2 and fix a bug in kvz_encode_coeff_nxn_generic()
2021-02-12 16:40:28 +02:00
Marko Viitanen
79c36f6aeb
Enable RDOQ and sign hiding
2021-02-12 13:24:02 +02:00
Arttu Makinen
7098a94a6f
Implemented implicit MTS.
...
Added selection of implicit MTS to command parameters.
Updated the transform selection to support implicit MTS.
2021-02-11 15:11:15 +02:00
Arttu Mäkinen
8f34685a8f
Merge branch 'master' into 'mts'
...
# Conflicts:
# src/cfg.c
# src/kvazaar.h
2021-02-10 13:05:18 +02:00
Arttu Makinen
c5570abe1b
Removed 'emt' variable from cu_info_t and changed 'emt' globally to 'mts' for consistency.
2021-02-10 12:08:05 +02:00
Arttu Makinen
d0b7dd95f7
MTS works on intra mode.
...
Fixed usage of MTS constraints.
Fixed DCT8 transforms.
Added sorting function of MTS modes with intra modes and costs to search.c.
2021-02-10 11:01:58 +02:00
Arttu Makinen
2e7c342645
Implemented DCT2, DST7, and DCT8 transforms, and search for selecting transform for MTS. Using MTS results in a mismatch for the luma component.
2021-02-02 11:09:43 +02:00
Arttu Makinen
b9c3336f0e
MTS bitstream encoding added for intra. Works with depths 0-3.
2021-01-18 20:44:36 +02:00
Arttu Makinen
98a8e78e93
avx2/encode_coding_tree-avx2.c update, because it caused errors
2020-12-30 14:25:16 +02:00
Pauli Oikkonen
816789c9f4
Allow fast coeff weights to be read from a file
2020-10-29 15:22:51 +02:00
Pauli Oikkonen
6799019db0
Move fast coeff table to transform.h
...
Guess this is a more logical place for it
2020-10-29 15:20:27 +02:00
Pauli Oikkonen
4712ce5f59
Round the fast coeff result instead of flooring
2020-10-29 15:20:27 +02:00
Pauli Oikkonen
0fb09c9920
New filtered coeff weight by QP values
2020-10-29 15:20:27 +02:00
Pauli Oikkonen
24d487f553
New weights for 12 <= QP <= 42
...
Trained using MSU ultrafast settings now
2020-10-29 15:20:27 +02:00
Pauli Oikkonen
3e1c6d84b8
Fix issues in fast coeff estimation
...
Allow weight table to start from nonzero QP, and round weights to Q8.8
instead of flooring them
2020-10-29 15:20:27 +02:00
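The difference between flooring and rounding a weight to Q8.8, as mentioned in the commit above, can be illustrated with a small scalar sketch. The function names and the sample weight are hypothetical and not taken from the repository; the sketch only assumes that Q8.8 means a 16-bit value with 8 fractional bits and that weights are non-negative and below 256.

```c
#include <assert.h>
#include <stdint.h>

#define Q8_8_ONE 256.0 /* 8 fractional bits: 1.0 is stored as 256 */

/* Hypothetical helper: convert by truncation, always rounds toward zero. */
static uint16_t to_q8_8_floor(double w)
{
  return (uint16_t)(w * Q8_8_ONE);
}

/* Hypothetical helper: convert by rounding to the nearest Q8.8 step. */
static uint16_t to_q8_8_round(double w)
{
  return (uint16_t)(w * Q8_8_ONE + 0.5);
}

/* Self-check: 1.999 * 256 = 511.744, so flooring and rounding disagree. */
static void check_q8_8(void)
{
  assert(to_q8_8_floor(1.999) == 511);
  assert(to_q8_8_round(1.999) == 512);
}
```

For the sample weight 1.999, flooring is off by about 0.74 of a Q8.8 step while rounding is off by about 0.26, which is the kind of bias the commit message refers to.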
Pauli Oikkonen
5f91bda762
Use newer data for fast coeff cost estimation
...
Same training dataset, but this time only buckets 0...3 were used to
approximate the function, no sign/cg width bucket.
2020-10-29 15:20:27 +02:00
Pauli Oikkonen
2abd733199
Use unsigned min() to correctly clip -32768
...
If a coeff happens to be -32768 (0x8000), its 16-bit abs() is also
0x8000. It should ultimately be clipped to 3, so interpret absolute
values as unsigned instead to make that happen.
2020-10-29 15:20:27 +02:00
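The clipping problem described in the commit above can be reproduced in scalar C. This is a minimal sketch and not the AVX2 code from the commit: the function names are hypothetical, and the only values taken from the commit message are the coefficient -32768 and the clip limit 3.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: 16-bit absolute value that wraps the way a 16-bit
 * SIMD lane would, so -32768 (0x8000) maps to 0x8000 again. */
static uint16_t abs16_lane(int16_t v)
{
  uint16_t u = (uint16_t)v;
  return (v < 0) ? (uint16_t)(0u - u) : u;
}

/* Clip while reading the lane as signed: 0x8000 is seen as -32768 and
 * therefore wins the min() against the limit. */
static int32_t clip_signed(uint16_t lane, int16_t limit)
{
  int32_t s = (lane >= 0x8000u) ? (int32_t)lane - 65536 : (int32_t)lane;
  return (s < limit) ? s : limit;
}

/* Clip while reading the lane as unsigned: 0x8000 is seen as 32768 and
 * is clipped to the limit as intended. */
static uint16_t clip_unsigned(uint16_t lane, uint16_t limit)
{
  return (lane < limit) ? lane : limit;
}

/* Self-check of the behaviour described in the commit message. */
static void check_clip(void)
{
  uint16_t a = abs16_lane(INT16_MIN);
  assert(a == 0x8000u);
  assert(clip_signed(a, 3) == -32768); /* signed min: wrong result */
  assert(clip_unsigned(a, 3) == 3);    /* unsigned min: expected result */
}
```

For every other coefficient the two variants agree, since the absolute value then fits in the positive signed range.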
Pauli Oikkonen
b93b90c0d7
Implement new fast coeff cost estimator in AVX2
2020-10-29 15:20:27 +02:00
Pauli Oikkonen
2f74a112b3
Try first lookup table based fast coeff estimation
2020-10-29 15:20:27 +02:00
Marko Viitanen
2db3a07b14
Prevent cu_sig_model_chroma array from being indexed over the limit
2020-10-13 14:14:57 +03:00
Marko Viitanen
bddfb47a55
Merge remote-tracking branch 'remotes/kvazaar_github/master'
2020-09-25 11:49:11 +03:00
Marko Viitanen
449975b0fb
Fixed cubic filter usage in intra angular modes
2020-09-21 14:58:34 +03:00
Pauli Oikkonen
780da4568a
Exclude 8-bit-only code from 10-bit builds and use uint8_t instead of kvz_pixel for code that assumes 8-bit pixels
2020-09-02 17:46:33 +03:00
Marko Viitanen
574c4d06ee
Fix use of log2_cg_size in coeff coding -> smaller blocks also decoded correctly
2020-08-27 18:26:16 +03:00
Marko Viitanen
20b66c9949
Sync to VTM 8.2 and add separate height to last_sig coding
2020-04-29 08:52:38 +03:00
Jan Beich
1fa69c705d
Rename truncate() from 30ce461d98 to avoid conflict with POSIX version
...
strategies/avx2/dct-avx2.c:55:23: error: static declaration of 'truncate' follows non-static declaration
static INLINE __m256i truncate(__m256i v, __m256i debias, int32_t shift)
^
/usr/include/stdio.h:448:6: note: previous declaration is here
int truncate(const char *, __off_t);
^
2020-04-22 16:09:42 +00:00
Marko Viitanen
86d76b19a4
Fix intra neighboring block selection and clean some unused code
2020-04-16 14:12:40 +03:00
Ari Lemmetti
f31dddc019
Bypass inverse quantization and inverse transform when trying early skip
2020-04-10 16:02:09 +03:00
Pauli Oikkonen
8617530b13
Use _mm_store_epi64 instead of _mm_cvtsi128_si64
...
Fix 32-bit builds that tend to lack the cvt intrinsic. Hope it will be
optimized to a movq r64, xmm on modern platforms though
2020-04-07 23:51:54 +03:00
Pauli Oikkonen
a82966c0f5
Fix lacking _mm256_cvtss_f32 intrinsic on VS
...
Cast __m256 into __m128 first, the XMM variant of the intrinsic has been
around for a long enough time to be supported
2020-04-07 22:38:10 +03:00
Ari Lemmetti
901c25c0c8
Merge branch 'vaq'
2020-04-03 19:51:17 +03:00
Ari Lemmetti
51451be5ef
Handle cases where the number of pixels is not divisible by 32
2020-04-03 19:37:47 +03:00
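A common way to handle a pixel count that is not a multiple of 32 is to process full 32-pixel chunks first and finish with a scalar tail loop. The sketch below illustrates that pattern under assumptions: the function name, the use of a plain sum as the per-pixel operation, and the scalar inner loop standing in for a 256-bit SIMD step are all hypothetical and are not taken from the commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CHUNK 32 /* 8-bit pixels per full step, e.g. one 256-bit register */

/* Hypothetical example: sum n pixels, where n need not be divisible by 32. */
static uint64_t sum_pixels(const uint8_t *px, size_t n)
{
  uint64_t sum = 0;
  size_t full = n - (n % CHUNK); /* largest multiple of CHUNK not above n */
  size_t i;

  /* Full chunks: this inner loop stands in for one SIMD iteration. */
  for (i = 0; i < full; i += CHUNK) {
    for (size_t j = 0; j < CHUNK; j++) {
      sum += px[i + j];
    }
  }

  /* Tail: the remaining n % CHUNK pixels are handled one at a time. */
  for (; i < n; i++) {
    sum += px[i];
  }
  return sum;
}
```

An alternative to a scalar tail is a masked or zero-padded final SIMD load; which approach the commit uses is not stated in the log.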
siivonek
e5267f7706
Fix define for use with Visual Studio.
2020-04-03 15:11:01 +02:00
Pauli Oikkonen
addc1c3ede
Fix warning about potentially unused hsum_8x32b
...
There's a lot of alternative options available, such as making it
globally visible with a kvz_ prefix, force inlining it, or anything.
This could be good too, hope it won't be compiled at all to translation
units where it's not used.
2020-04-02 16:44:22 +03:00
siivonek
566680af7b
Move function hsum to file where it is used to avoid errors.
2020-04-02 14:03:06 +02:00
siivonek
58be514e2a
Fix pipeline error.
2020-04-02 13:50:08 +02:00