hashirama/uvg266

mirror of https://github.com/ultravideo/uvg266.git synced 2024-11-30 12:44:07 +00:00

Author	SHA1	Message	Date
Ari Lemmetti	7ccd1a571c	[SIMD] Initial AVX2 code for 4-tap filtering in angular prediction.	2021-09-06 21:20:50 +03:00
Ari Lemmetti	20f0ff976d	[SIMD] Transform angular pred loops for SIMD processing.	2021-09-06 21:20:49 +03:00
Ari Lemmetti	3dfe09e850	[SIMD] Copy generic implementation of angular prediction as a skeleton.	2021-09-06 21:20:46 +03:00
Joose Sainio	450cbd356c	Merge branch 'joint_cbcr' into 'master' [jccr] Add joint coding of chroma residual See merge request cs/ultravideo/vvc/uvg266!6	2021-09-06 11:43:06 +03:00
Joose Sainio	0592cc65a0	[jccr] enable rdoq with jccr	2021-09-06 11:28:20 +03:00
Joose Sainio	072b84711a	[jccr] fix 64×64 CUs	2021-09-06 11:28:20 +03:00
Joose Sainio	042b5078d8	[jccr] WIP initial implementation Add some kind of search for joint chroma residual coding. Bitstream is currently correct but prediction is incorrect because the jccr is actually not used in the search. Hard coded to be enabled	2021-09-06 11:28:08 +03:00
Marko Viitanen	26f18865f7	[alf] Change the processing in alf_get_blk_stats_avx2() to allow utilizing the whole 256bit register	2021-08-27 13:40:28 +03:00
Marko Viitanen	fdf125f406	[alf] Fix incorrect conversion in alf_get_blk_stats_avx2	2021-08-27 10:25:20 +03:00
Marko Viitanen	6714973264	[alf] Change _mm_store_si128 to _mm_storeu_si128 in alf_get_blk_stats_avx2()	2021-08-26 18:05:06 +03:00
Marko Viitanen	5df8add046	[alf] Change order of alf_covariance.y array for better AVX2 optimization in alf_get_blk_stats_avx2()	2021-08-26 15:37:01 +03:00
Marko Viitanen	be9527cf1d	[alf] Change the order of alf_covariance.ee values to get better optimized solution for alf_get_blk_stats_avx2()	2021-08-26 11:07:13 +03:00
Marko Viitanen	f4de5cfd0f	[alf] Cleanup alf_calc_covariance_avx2() and use integers in alf_get_blk_stats_avx2()	2021-08-26 10:20:57 +03:00
Marko Viitanen	915bf3ca24	[alf] Fix AVX2 priority	2021-08-25 20:29:58 +03:00
Marko Viitanen	8ef3e6a126	[alf] Add strategy for alf_get_blk_stats() and an initial AVX2 version	2021-08-25 20:22:24 +03:00
Marko Viitanen	f61b9138cd	[alf] Import SSE4.1 optimized 5x5 and 7x7 filters from VTM13 * Modified to work with 8-bit pixels	2021-08-25 11:50:37 +03:00
Marko Viitanen	dc6a29b0d8	[alf] Initial generic strategies for 5x5 and 7x7 filtering	2021-08-25 10:50:00 +03:00
Marko Viitanen	c3c96d69c2	[alf] Add modified alf_derive_classification_blk_sse41() from VTM 13.0 * Modified to work with bitdepth 8	2021-08-20 11:45:02 +03:00
Marko Viitanen	b158d05bca	[alf] rename strategy function to include prefix	2021-08-19 17:19:17 +03:00
Marko Viitanen	3efaeede76	[alf] Define the strategy for alf_derive_classification_blk()	2021-08-19 17:04:35 +03:00
Marko Viitanen	d742f57779	Remove angular_pred_avx2 so we don't need extra parameter	2021-08-15 10:43:48 +03:00
Marko Viitanen	5604b6f946	[cleanup] remove all crypto related stuff, fix warnings, move estimate.m to tools/	2021-07-27 09:27:51 +03:00
Marko Viitanen	99a2b0384d	[cleanup] remove some warnings	2021-07-26 11:42:19 +03:00
Marko Viitanen	0cad1ac3c9	[mts] Add a comment about idct8/idst7 16x16 being unoptimized	2021-07-21 14:02:23 +03:00
Marko Viitanen	d5ef036d35	[mts] change mts_subset tables back to static	2021-07-21 13:54:59 +03:00
Marko Viitanen	60caf2c378	[mts] fix 32x32 idst/idct	2021-07-21 13:44:25 +03:00
Marko Viitanen	c2cd5fb98e	[mts] replace AVX2 DST7/DCT8 16x16 with unoptimized for now	2021-07-21 13:38:17 +03:00
Marko Viitanen	7e089f518d	[mts] add optimized versions of DCT8 and DST7, inverse not yet working properly * Includes new unit tests for the mts	2021-07-21 11:53:15 +03:00
Marko Viitanen	7f67009511	Fix MD5 calculations from HEVC to VVC way	2021-06-24 15:03:29 +03:00
Marko Viitanen	c004735821	[LMCS] Fix casting of the chroma scaled residual	2021-06-18 09:35:06 +03:00
Joose Sainio	cfffd7166c	Use correct context for calculating coeff costs for transform skip	2021-06-07 13:06:03 +03:00
Marko Viitanen	4594bf0ca8	Merge branch 'lmcs_chroma'	2021-06-02 15:05:04 +03:00
Marko Viitanen	5babb14ee7	[LMCS] Use chroma scaling	2021-06-01 12:17:03 +03:00
Joose Sainio	f9de8ebc4f	Merge branch 'master' into '4x4-rd' # Conflicts: # src/encoder.c # tests/test_intra.sh	2021-05-28 11:43:55 +00:00
Marko Viitanen	dbc7fd48bf	[LMCS] Initialize some m_reshapeCW values to avoid division by zero	2021-05-24 18:57:37 +03:00
Marko Viitanen	73ac3b68bf	[LMCS] add missing header in quant-avx2.c	2021-05-24 17:25:38 +03:00
Marko Viitanen	4cd5bc38a1	[LMCS] Luma mapping working after some rework, have to keep the reconstruction in the mapped domain	2021-05-24 17:23:17 +03:00
Joose Sainio	cfd7d2666b	slightly optimize intra-generic.c	2021-05-14 10:23:37 +03:00
Joose Sainio	7674e94fd1	[rdoq] transform skip RDOQ Copy the implementation from VTM	2021-05-03 12:52:10 +03:00
Joose Sainio	d2b9893bb7	[transform skip] Fix misunderstanding that caused TS to use QP 52>=	2021-04-30 10:55:23 +03:00
Joose Sainio	a998f3ed74	[transform-skip] Convert the HEVC transform skip to VVC For some reason transform skip uses QP MAX(52, QP) and the coeffs are no longer shifted	2021-04-30 10:55:23 +03:00
Joose Sainio	2ab005692d	Enable 4x4 intra CUs	2021-04-23 10:57:29 +03:00
Joose Sainio	1aaa95601c	Merge remote-tracking branch 'remotes/kvz_github/master' into Fix-monochrome # Conflicts: # .gitlab-ci.yml # build/kvazaar_lib/kvazaar_lib.vcxproj.filters # src/cfg.c # src/encoder.h # src/kvazaar.h # src/rdo.c	2021-04-23 10:56:50 +03:00
Joose Sainio	e8eab326fb	Update context selection to match VVC	2021-04-23 10:51:01 +03:00
Joose Sainio	b2076d3b39	Enable chroma scaling WIP: user defined scaling array	2021-03-16 10:31:26 +02:00
Joose Sainio	412781db41	[scalinglist] Fix quant-generic	2021-03-09 10:42:40 +02:00
Joose Sainio	30e573c261	[scalinglist] WIP: Update scalinglist for VVC Seems to work when rdoq is enabled but not when it is disabled	2021-03-09 09:51:49 +02:00
Ari Lemmetti	dad3d6818e	Only read left and right border pixels if necessary	2021-03-08 22:36:10 +02:00
Ari Lemmetti	b72ab583b4	Handle "don't care" rows in the end separately	2021-03-08 22:36:09 +02:00
Ari Lemmetti	33295bf350	Use AVX2 luma interpolation for SMP and AMP as well	2021-03-08 22:36:09 +02:00
Ari Lemmetti	7ce68761c2	Add a reminder to fix a rare case for bipred	2021-03-08 22:36:09 +02:00
Ari Lemmetti	475f1d79d5	Add some defines for important interpolation related sizes	2021-03-08 22:36:09 +02:00
Ari Lemmetti	4314f3a9a7	Rename some interpolation functions and strategies for consistency	2021-03-08 22:36:08 +02:00
Ari Lemmetti	5a70b49f69	Require 64-bit build for AVX2 interpolation filter functions	2021-03-08 22:36:08 +02:00
Ari Lemmetti	5631651469	Remove unused functions and variables	2021-03-08 22:36:08 +02:00
Ari Lemmetti	e38219e489	Fix epol_func signature and function definition	2021-03-08 22:36:07 +02:00
Ari Lemmetti	7e6ba9750f	Add new AVX2 ip filters for chroma	2021-03-08 22:36:07 +02:00
Ari Lemmetti	3476fc62c7	Fix parameter to signed	2021-03-08 22:36:06 +02:00
Ari Lemmetti	e572066e46	Add new AVX2 vertical ip filter for pixel precision	2021-03-08 22:36:06 +02:00
Ari Lemmetti	9e4b62a891	Use the new horizontal filter for pixel precision as well	2021-03-08 22:36:06 +02:00
Ari Lemmetti	2175023843	Relocate function	2021-03-08 22:36:06 +02:00
Ari Lemmetti	f5b0e3c52b	Add new AVX2 horizontal ip filter capable of every luma PB	2021-03-08 22:36:05 +02:00
Ari Lemmetti	d9a3225ae5	Add new AVX2 vertical ip filter for high-precision	2021-03-08 22:36:05 +02:00
Ari Lemmetti	84222cf3e7	Replace old block extrapolation with more capable one. Separate paddings for different directions can be now specified.	2021-03-08 22:36:04 +02:00
Marko Viitanen	e05dcdb193	Enable sign hiding in quant_avx2 and fix a bug in kvz_encode_coeff_nxn_generic()	2021-02-12 16:40:28 +02:00
Marko Viitanen	79c36f6aeb	Enable RDOQ and sign hiding	2021-02-12 13:24:02 +02:00
Arttu Makinen	7098a94a6f	Implemented implicit MTS. Added selection of implicit MTS to command parameters. Updated the transform selection to support implicit MTS.	2021-02-11 15:11:15 +02:00
Arttu Mäkinen	8f34685a8f	Merge branch 'master' into 'mts' # Conflicts: # src/cfg.c # src/kvazaar.h	2021-02-10 13:05:18 +02:00
Arttu Makinen	c5570abe1b	Removed 'emt' variable from cu_info_t and changed 'emt' globally to 'mts' for consistency.	2021-02-10 12:08:05 +02:00
Arttu Makinen	d0b7dd95f7	MTS works on intra mode. Fixed usage of MTS constraints. Fixed DCT8 transforms. Added sorting function of MTS modes with intra modes and costs to search.c.	2021-02-10 11:01:58 +02:00
Arttu Makinen	2e7c342645	Implemented DCT2, DST7, and DCT8 transforms, and search for selecting transform for MTS. Using MTS results mismatch for luma component.	2021-02-02 11:09:43 +02:00
Arttu Makinen	b9c3336f0e	MTS bitstream encoding added for intra. Work with depths 0-3.	2021-01-18 20:44:36 +02:00
Arttu Makinen	98a8e78e93	avx2/encode_coding_tree-avx2.c update, because it caused errors	2020-12-30 14:25:16 +02:00
Pauli Oikkonen	816789c9f4	Allow fast coeff weights to be read from a file	2020-10-29 15:22:51 +02:00
Pauli Oikkonen	6799019db0	Move fast coeff table to transform.h Guess this is a more logical place for it	2020-10-29 15:20:27 +02:00
Pauli Oikkonen	4712ce5f59	Round the fast coeff result instead of flooring	2020-10-29 15:20:27 +02:00
Pauli Oikkonen	0fb09c9920	New filtered coeff weight by QP values	2020-10-29 15:20:27 +02:00
Pauli Oikkonen	24d487f553	New weights for 12 <= QP <= 42 Trained using MSU ultrafast settings now	2020-10-29 15:20:27 +02:00
Pauli Oikkonen	3e1c6d84b8	Fix issues in fast coeff estimation Allow weight table to start from nonzero QP, and round weights to Q8.8 instead of flooring them	2020-10-29 15:20:27 +02:00
Pauli Oikkonen	5f91bda762	Use newer data for fast coeff cost estimation Same training dataset, but this time only buckets 0...3 were used to approximate the function, no sign/cg width bucket.	2020-10-29 15:20:27 +02:00
Pauli Oikkonen	2abd733199	Use unsigned min() to correctly clip -32768 If a coeff happens to be -32768 (0x8000), its 16-bit abs() is also 0x8000. It should ultimately be clipped to 3, so interpret absolute values as unsigned instead to make that happen.	2020-10-29 15:20:27 +02:00
Pauli Oikkonen	b93b90c0d7	Implement new fast coeff cost estimator in AVX2	2020-10-29 15:20:27 +02:00
Pauli Oikkonen	2f74a112b3	Try first lookup table based fast coeff estimation	2020-10-29 15:20:27 +02:00
Marko Viitanen	2db3a07b14	Prevent cu_sig_model_chroma array from being indexed over the limit	2020-10-13 14:14:57 +03:00
Marko Viitanen	bddfb47a55	Merge remote-tracking branch 'remotes/kvazaar_github/master'	2020-09-25 11:49:11 +03:00
Marko Viitanen	449975b0fb	Fixed cubic filter usage in intra angular modes	2020-09-21 14:58:34 +03:00
Pauli Oikkonen	780da4568a	Exclude 8-bit-only code from 10-bit builds and use uint8_t instead of kvz_pixel for code that assumes 8-bit pixels	2020-09-02 17:46:33 +03:00
Marko Viitanen	574c4d06ee	Fix use of log2_cg_size in coeff coding -> smaller blocks also decoded correctly	2020-08-27 18:26:16 +03:00
Marko Viitanen	20b66c9949	Sync to VTM 8.2 and add separate height to last_sig coding	2020-04-29 08:52:38 +03:00
Jan Beich	1fa69c705d	Rename truncate() from `30ce461d98` to avoid conflict with POSIX version strategies/avx2/dct-avx2.c:55:23: error: static declaration of 'truncate' follows non-static declaration static INLINE __m256i truncate(__m256i v, __m256i debias, int32_t shift) ^ /usr/include/stdio.h:448:6: note: previous declaration is here int truncate(const char *, __off_t); ^	2020-04-22 16:09:42 +00:00
Marko Viitanen	86d76b19a4	Fix intra neighboring block selection and clean some unused code	2020-04-16 14:12:40 +03:00
Ari Lemmetti	f31dddc019	Bypass inverse quantization and inverse transform when trying early skip	2020-04-10 16:02:09 +03:00
Pauli Oikkonen	8617530b13	Use _mm_store_epi64 instead of _mm_cvtsi128_si64 Fix 32-bit builds that tend to lack the cvt intrinsic. Hope it will be optimized to a movq r64, xmm on modern platforms though	2020-04-07 23:51:54 +03:00
Pauli Oikkonen	a82966c0f5	Fix lacking _mm256_cvtss_f32 intrinsic on VS Cast __m256 into __m128 first, the XMM variant of the intrinsic has been around for a long enough time to be supported	2020-04-07 22:38:10 +03:00
Ari Lemmetti	901c25c0c8	Merge branch 'vaq'	2020-04-03 19:51:17 +03:00
Ari Lemmetti	51451be5ef	Handle cases where the number of pixels is not divisible by 32	2020-04-03 19:37:47 +03:00
siivonek	e5267f7706	Fix define for use with Visual Studio.	2020-04-03 15:11:01 +02:00
Pauli Oikkonen	addc1c3ede	Fix warning about potentially unused hsum_8x32b There's a lot of alternative options available, such as making it globally visible with a kvz_ prefix, force inlining it, or anything. This could be good too, hope it won't be compiled at all to translation units where it's not used.	2020-04-02 16:44:22 +03:00
siivonek	566680af7b	Move function hsum to file where it is used to avoid errors.	2020-04-02 14:03:06 +02:00
siivonek	58be514e2a	Fix pipeline error.	2020-04-02 13:50:08 +02:00
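
Commit 2abd733199 in the list above explains why the absolute values are compared as unsigned: a 16-bit abs() of -32768 wraps back to 0x8000, so a signed min against 3 keeps the wrapped negative value instead of clipping it. The following is a minimal scalar sketch of that effect for a single 16-bit lane. It is not the code from the commit; the function names `abs16_wrapping`, `clip3_signed`, `clip3_unsigned` and `self_check` are hypothetical, and the assumption is that the AVX2 version relies on lane-wise operations with the same semantics (for example `_mm256_abs_epi16` followed by `_mm256_min_epi16` or `_mm256_min_epu16`).

```c
#include <assert.h>
#include <stdint.h>

/* One 16-bit lane of a wrapping abs: abs(-32768) stays 0x8000. */
static uint16_t abs16_wrapping(int16_t v)
{
  uint16_t u = (uint16_t)v;
  return (v < 0) ? (uint16_t)(0u - u) : u;
}

/* Clip to 3 with a signed compare: the lane bits are read as int16. */
static int32_t clip3_signed(int16_t coeff)
{
  uint16_t bits = abs16_wrapping(coeff);
  int32_t as_signed = (bits & 0x8000u) ? (int32_t)bits - 65536 : (int32_t)bits;
  return (as_signed < 3) ? as_signed : 3;
}

/* Clip to 3 with an unsigned compare: 0x8000 is read as 32768. */
static uint16_t clip3_unsigned(int16_t coeff)
{
  uint16_t bits = abs16_wrapping(coeff);
  return (uint16_t)((bits < 3u) ? bits : 3u);
}

/* Hypothetical self-check showing where the two variants differ. */
static void self_check(void)
{
  assert(clip3_signed(INT16_MIN) == -32768); /* wrapped value survives */
  assert(clip3_unsigned(INT16_MIN) == 3);    /* clipped as intended */
  assert(clip3_signed(-7) == 3);             /* ordinary inputs agree */
  assert(clip3_unsigned(-7) == 3);
  assert(clip3_unsigned(2) == 2);
}
```

The two variants agree on every input except -32768, which is why the problem only shows up when a coefficient happens to hit that value.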


624 commits
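
Commits 3e1c6d84b8 and 4712ce5f59 in the list above mention rounding fast coeff weights and results instead of flooring them, with weights stored as Q8.8. As a rough illustration of the difference, the sketch below converts a non-negative weight to Q8.8 (8 integer bits, 8 fractional bits) both ways. The function names are hypothetical, the input is assumed to lie in [0, 256), and the actual conversion used in the repository may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Q8.8 by truncation: any fractional part of w * 256 is dropped. */
static uint16_t to_q8_8_floor(double w)
{
  return (uint16_t)(w * 256.0);
}

/* Q8.8 by rounding to nearest: add half an LSB before truncating. */
static uint16_t to_q8_8_round(double w)
{
  return (uint16_t)(w * 256.0 + 0.5);
}

/* Hypothetical self-check: 1.999 * 256 = 511.744. */
static void q8_8_self_check(void)
{
  assert(to_q8_8_floor(1.999) == 511);
  assert(to_q8_8_round(1.999) == 512);
  assert(to_q8_8_floor(0.5) == 128);
  assert(to_q8_8_round(0.5) == 128);
}
```

Flooring always errs downward by up to one LSB, while rounding keeps the error within half an LSB in either direction, so summed weights carry less systematic bias.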