hashirama/uvg266

mirror of https://github.com/ultravideo/uvg266.git synced 2024-11-30 20:54:07 +00:00

Author	SHA1	Message	Date
siivonek	1bbc598d75	Merge branch 'master' into vaq	2020-03-19 20:19:43 +02:00
Joose Sainio	b53911d637	Merge branch 'rc-intra'	2020-03-19 13:34:15 +02:00
Joose Sainio	a304a8ea6e	Add weights for GOP 16 based on fitting a power curve to bits spent by HM	2020-03-19 11:13:43 +02:00
Joose Sainio	e823ac1dae	miscellaneous fixes - bump library version - add help desk for --clip-neighbour - update the default values of --clip-neighbour and --intra-bits - update tests to more sensible	2020-03-19 10:47:28 +02:00
Jaakko Laitinen	b2ddba38c2	Set correct size for pu-depth min/max data structure	2020-03-19 09:29:43 +02:00
Joose Sainio	2c345bc3cf	try to fix tsan issue	2020-03-18 14:58:54 +02:00
Jaakko Laitinen	fe428dcbe1	Fix no gop functionality	2020-03-18 11:03:33 +02:00
Jaakko Laitinen	af3d559d8d	Let pu-depth be defined per gop-layer	2020-03-17 17:57:18 +02:00
Ari Lemmetti	cbd77944d8	Costs in rough intra search may be negative. Get rid of UBSan error.	2020-03-16 22:13:14 +02:00
Ari Lemmetti	aa0ade3f65	Cast values to unsigned to make UBSan not trigger due to left-shifting negatives	2020-03-16 19:52:34 +02:00
RLamm	27fe716654	Fixed reference POC indexing	2020-03-11 15:33:37 +02:00
RLamm	bf24831780	Attempt to fix random crashes	2020-03-11 15:31:47 +02:00
RLamm	887659db1f	Attempted to scale the extra_mvs	2020-03-11 15:31:46 +02:00
siivonek	8d9719ff90	Merge branch 'master' into vaq	2020-03-05 14:17:01 +02:00
Joose Sainio	c9a8f2a596	Completely disable intra based model for frame 1	2020-03-04 12:52:13 +02:00
Joose Sainio	19c79c3e58	don't use the intra frame based estimation if the result is bad	2020-03-04 09:26:22 +02:00
Ari Lemmetti	7b7358c25a	Update presets veryslow and placebo a bit Both use now --gop 16, --intra-qp-offset -3, --me tz, and --transform-skip	2020-03-03 20:41:01 +02:00
Pauli Oikkonen	60e7956dc5	Disable inaccurate integer variance calculation for now	2020-03-02 19:18:55 +02:00
Pauli Oikkonen	fc1b91335b	Implement variance calculation in integer math Maybe this is a bit faster than FP, it's not accurate though	2020-03-02 18:17:18 +02:00
Pauli Oikkonen	35c825c75f	Move hsum_8x32b to avx2_common_functions	2020-02-27 17:52:17 +02:00
Pauli Oikkonen	b00ac7d1c4	AVX2 version of buffer variance calculation	2020-02-25 15:57:56 +02:00
siivonek	a380e43bda	Add chroma channels to variance calculation.	2020-02-24 19:54:34 +02:00
Pauli Oikkonen	1bd9c6dd93	Make a strategy out of pixel_var	2020-02-24 19:37:36 +02:00
Pauli Oikkonen	86ebf366e1	fix typo	2020-02-24 18:18:10 +02:00
Joose Sainio	f81de41775	Merge branch 'master' into rc-intra	2020-02-24 15:30:57 +02:00
siivonek	5688bcd646	Merge branch 'master' into vaq	2020-02-21 17:11:10 +02:00
siivonek	908ecb1767	Add rounding to aq offsets. Fix typo	2020-02-21 13:51:43 +02:00
Ari Lemmetti	1dfc69b42e	Consider merge index bits in merge analysis and early skip	2020-02-20 09:43:58 +02:00
Joose Sainio	7deb22c8e8	Merge branch 'master' into rc-intra	2020-02-19 15:01:04 +02:00
Kari Siivonen (TAU)	c972ca9067	Add assert to check if deltaQP out of bounds. Clip adaptive QP to [-13, 12].	2020-02-18 13:20:26 +02:00
Kari Siivonen (TAU)	f07990794f	Fix error in vaq pixel blit range calculation	2020-02-18 13:20:26 +02:00
Kari Siivonen (TAU)	57ed40c263	Fix application of aq offset	2020-02-18 13:20:26 +02:00
Kari Siivonen (TAU)	be2f420d61	Change: vaq requires parameter. Parameter defines vaq strength ex. 15 == 1.5	2020-02-18 13:20:26 +02:00
Kari Siivonen (TAU)	bf1b2c1e22	Add define for vaq strength parameter	2020-02-18 13:20:26 +02:00
Kari Siivonen (TAU)	150559a7e8	Fix bugs. Enable set_qp_in_cu when using vaq	2020-02-18 13:20:26 +02:00
Kari Siivonen (TAU)	c8c71274ee	Change tabs to spaces.	2020-02-18 13:20:26 +02:00
siivonek	888382953d	Implement calculation of vaq values. Values not used yet.	2020-02-18 13:20:25 +02:00
siivonek	ad40a88c09	Add no-vaq option to vaq	2020-02-18 13:20:25 +02:00
siivonek	09f0a1c52e	Fix typo in comment	2020-02-18 13:20:25 +02:00
siivonek	84fb3fd7d1	aq: Add --vaq commandline option	2020-02-18 13:20:25 +02:00
Joose Sainio	2a98f5db1e	fix intra-bits for lp-gop	2020-02-18 10:38:29 +02:00
Ari Lemmetti	71d9327f62	Further improve fast bipred	2020-02-17 20:32:52 +02:00
Ari Lemmetti	80c26870d5	Update docs	2020-02-15 23:29:18 +02:00
Ari Lemmetti	ebb183cc01	Add option to make intra QP offset configurable	2020-02-15 22:54:48 +02:00
Ari Lemmetti	be3e08d6db	Add gop.h to Makefile	2020-02-15 22:54:47 +02:00
Ari Lemmetti	1354acd358	Prevent negative values being written to SPS with --gop=0	2020-02-15 22:54:47 +02:00
Ari Lemmetti	fe4869916c	Disable GOP and intra qp offset for all-intra coding automatically	2020-02-15 22:54:46 +02:00
Ari Lemmetti	9849fb7c77	Enable experimental rate control for GOP 16	2020-02-15 22:54:46 +02:00
Ari Lemmetti	a0a22dec8a	Remove deprecated / unused lambda adjustments	2020-02-15 22:54:46 +02:00
Arttu Ylä-Outinen	829a70e6a7	Copy lowdelay GOP definition from HM	2020-02-15 22:36:58 +02:00
Arttu Ylä-Outinen	28f99c0b87	Change definition of 8-GOP to match HM	2020-02-15 22:36:58 +02:00
Arttu Ylä-Outinen	636fa8fbdd	Fix maximum decoded picture buffer size	2020-02-15 22:36:57 +02:00
Arttu Ylä-Outinen	ebd5156db5	Add definition for random access GOP of length 16	2020-02-15 22:36:57 +02:00
Arttu Ylä-Outinen	6653f06dd0	Only compute GOP layer weights when RC is enabled	2020-02-15 22:36:57 +02:00
Arttu Ylä-Outinen	c8fff1e0d6	Use a larger number of bits for POC lsb when needed Changes the number of bits used for coding the least significant bits of the POC based on the GOP size.	2020-02-15 22:36:56 +02:00
Arttu Ylä-Outinen	d757a832c2	Change GOP QP offset handling to match HM Adds fields qp_model_scale and qp_model_offset to kvz_gop_config and intra_qp_offset to kvz_config.	2020-02-15 22:36:56 +02:00
Arttu Ylä-Outinen	f37dcd5879	Move GOP definition to a separate file Moves definition of the 8-GOP from cfg.c to gop.h.	2020-02-15 22:36:55 +02:00
Ari Lemmetti	6e1007a3e7	Get rid of LAMBA! (Commit #3000 )	2020-02-15 22:32:52 +02:00
Ari Lemmetti	0c02e71b43	Remove minor error from readme	2020-02-15 22:29:08 +02:00
Joose Sainio	e90d3141a2	Merge branch 'master' into rc-intra	2020-02-05 11:06:56 +02:00
Ari Lemmetti	9a0236bb4e	Add option 'zero-coeff-rdo'	2020-02-04 21:26:29 +02:00
Ari Lemmetti	886ff36d12	Initial implementation of fast bipred.	2020-02-04 15:46:23 +02:00
Ari Lemmetti	3c7dd0752f	Remove the broken "no mov" branch. Causes hash mismatches for example in SlideShow sequence.	2020-02-03 15:26:31 +02:00
RLamm	bf8941ddb8	Added comment about partial-coding usage	2020-01-31 16:19:48 +02:00
RLamm	b8488ab48d	Changed "partial-coding" variables to uint32_t	2020-01-31 16:02:29 +02:00
RLamm	76e3249754	Changed parameter "slicer" to "partial-coding" to avoid confusion.	2020-01-31 14:22:32 +02:00
RLamm	30d5df40c5	Custom headers for the distributed coding	2020-01-29 15:54:49 +02:00
Joose Sainio	54571529a4	Fix accessing previous frame that didn't exist	2020-01-17 10:48:35 +02:00
Joose Sainio	5c671d20e1	Use the new clipping only in situations where it actually helps	2020-01-17 09:08:21 +02:00
Joose Sainio	3c34d7c863	Fix qp estimation and checking of previous frames that dont exist	2020-01-15 09:32:04 +02:00
Joose Sainio	1a35c22a52	Change clipping of lambda and qp for ctus on OBA rc instead of clipping qp and lambda to the value of last value from the state clip to previous frame with same layer and if such frame doesn't exist, clip to previous frame	2020-01-14 14:46:05 +02:00
Pauli Oikkonen	c3d9e97e9f	Fix VS build	2019-12-12 18:34:55 +02:00
Pauli Oikkonen	7f238ca299	Remove debug print functions Whoops	2019-12-12 18:19:31 +02:00
Pauli Oikkonen	eefb5e50b3	De-inline pred_filtered_dc functions, shouldn't make much difference though	2019-12-12 17:30:00 +02:00
Pauli Oikkonen	169314de4f	32x32 filtered DC prediction in AVX2	2019-12-11 18:17:06 +02:00
Pauli Oikkonen	fb2481b7e4	16x16 filtered DC implemented in AVX2	2019-12-10 15:54:50 +02:00
Joose Sainio	b78aa7b272	save c and k to frame	2019-12-06 10:52:54 +02:00
Joose Sainio	5b10e5fb7e	parameterize the clipping option	2019-12-06 09:51:04 +02:00
Pauli Oikkonen	da370ea36d	Implement AVX2 8x8 filtered DC algorithm	2019-11-28 14:10:10 +02:00
Pauli Oikkonen	5d9b7019ca	Implement a 4x4 filtered DC pred function	2019-11-26 17:05:54 +02:00
Joose Sainio	ca0060cbba	try the original clipping	2019-11-26 15:13:04 +02:00
Pauli Oikkonen	f1485ab087	Start doing an arbitrary size filtered DC pred - maybe easier to just create separate functions for fixed block sizes?	2019-11-25 15:20:29 +02:00
Joose Sainio	ab2fded8af	Update threadwrapper to enable pthread_rwlock_t	2019-11-21 13:38:40 +02:00
Joose Sainio	eb78aead1f	Fix additional potential data races	2019-11-21 11:03:12 +02:00
Joose Sainio	35d7e0d88b	Fix data race	2019-11-21 10:25:04 +02:00
Marko Viitanen	94d89f03c7	Added cfg variable intra_smoothing_disabled and some cleanup	2019-11-20 08:38:33 +02:00
Marko Viitanen	eb2caf9118	Fix intra angle filter, changed from gauss filter table to run-time calculated 4-tap filter	2019-11-19 15:15:21 +02:00
Pauli Oikkonen	979d66031c	Create a strategy out of intra_pred_filtered_dc	2019-11-19 14:50:31 +02:00
Marko Viitanen	466d8772b0	Apply JVET_P0170_ZERO_POS_SIMPLIFICATION in coeff bypass coding	2019-11-19 14:32:38 +02:00
Joose Sainio	0e8815a3d8	test clipping qp to previous frame instead of previous ctus	2019-11-19 14:32:31 +02:00
Joose Sainio	ddb4e5a131	move the intra bit calculation so that it is used also with lambda rc	2019-11-19 14:16:48 +02:00
Joose Sainio	a07833f3e6	check that mallocs in rc initialization were successful only call kvz_update_after_picture when using the OBA rc	2019-11-19 13:59:44 +02:00
Joose Sainio	50d410a316	re-enable static qp encoding and lambda rc	2019-11-19 13:45:58 +02:00
Pauli Oikkonen	fa4bb86406	Optimize intra_pred_planar_avx2 for 4x4 blocks	2019-11-19 13:39:02 +02:00
Marko Viitanen	3df2642b03	Fix qt cbf context init value	2019-11-19 13:27:36 +02:00
Joose Sainio	57e5615ece	Fix incorrect intra rc calculation skipping	2019-11-19 13:25:31 +02:00
Joose Sainio	6cc3bcd87e	Command line parameters for oba rc and implementation of the usage of the intra parameter	2019-11-19 09:29:06 +02:00
Joose Sainio	eb73548af5	Encode first frame completely before starting others to enable owf	2019-11-18 09:51:37 +02:00
Marko Viitanen	17a53230fd	Code cleanup, remove unused arrays and remove tabs	2019-11-18 09:01:23 +02:00
Pauli Oikkonen	4761d228f9	Start to vectorize the 4x4 loop	2019-11-15 17:32:40 +02:00
Pauli Oikkonen	8d45ab4951	Stupidify the 4x4 planar loop for vectorization	2019-11-14 17:14:04 +02:00
Marko Viitanen	91528f3292	Update contexts	2019-11-14 13:46:51 +02:00
Marko Viitanen	b309ed90be	Fix NAL packet and missing fields in SPS	2019-11-14 09:21:11 +02:00
Marko Viitanen	74514981a9	Fixed PPS, SPS and slice headers and NAL unit types	2019-11-13 15:59:36 +02:00
Joose Sainio	c759c138ed	Prepare the rc data structure to be shared among all frame encoders	2019-11-13 11:56:25 +02:00
Joose Sainio	cdb7c851a4	Fix weight calculation	2019-11-13 08:55:31 +02:00
Joose Sainio	b9b01f8036	WPP with threading	2019-11-12 12:12:57 +02:00
Joose Sainio	615973adca	should enable threading with wpp when owf is not used	2019-11-12 09:03:00 +02:00
Pauli Oikkonen	6f13f6525c	Merge branch 'new_prints'	2019-11-07 17:04:21 +02:00
Joose Sainio	d353f7dd1a	Disable debug prints, fix multiple bugs in the calculation	2019-11-07 15:08:57 +02:00
mercat	57e8c3ebc2	Merge branch 'ML-cplx_red_ICIP'	2019-11-07 13:25:47 +02:00
Pauli Oikkonen	558f0ec401	Mbps, not mbps	2019-11-05 18:06:00 +02:00
Pauli Oikkonen	2edf533925	Tidy the end report printing Also fix a bug with non-integer target FPS	2019-11-05 17:20:00 +02:00
Joose Sainio	408fd4ccb6	Fix lambda and qp calcualtion for intra frames also fixes a bug with selecting the clip neighbor lambda and clip neighbor qp selection for inter frames	2019-11-05 10:51:39 +02:00
Pauli Oikkonen	c7313ce567	Store AVG QP information in encmain	2019-11-04 17:08:07 +02:00
Reima Hyvönen	80575c59bf	Some updates done to get right bitrate and avg QP	2019-10-31 15:56:24 +02:00
Reima Hyvönen	252bab8820	Added prints to bitrate and AVG QP	2019-10-31 15:56:24 +02:00
Pauli Oikkonen	6d7a4f555c	Also remove 16x16 (A * B^T)^T matrix multiply Can be done using (B * A^T) instead, it's the exact same	2019-10-28 16:19:42 +02:00
Pauli Oikkonen	2c2deb2366	Tidy AVX2 32x32 matrix multiply	2019-10-28 16:19:42 +02:00
Pauli Oikkonen	98ad78b333	Tidy the old AVX2 32x32 matrix multiply It was actually a very good algorithm, just looked messy!	2019-10-28 16:19:42 +02:00
Pauli Oikkonen	4a921cbdb5	Retain data as much in YMM registers as possible This seems to make it a whole lot quicker	2019-10-28 16:19:42 +02:00
Pauli Oikkonen	ac4d710e23	Unroll 32x32 matrix multiply, use all regs	2019-10-28 16:19:42 +02:00
Pauli Oikkonen	a58608d0b8	Remove totally unnecessary (A * B^T)^T 32x32 multiply	2019-10-28 16:19:42 +02:00
Pauli Oikkonen	043f53539f	Implement a streamlined matrix-multiply 32x32 DCT	2019-10-28 16:19:42 +02:00
Pauli Oikkonen	e9da2d851b	Tidy 32x32 fast DCT's helper functions	2019-10-28 16:19:42 +02:00
Pauli Oikkonen	e382339182	Implement fast (butterfly) 32x32 DCT in AVX2	2019-10-28 16:19:42 +02:00
Pauli Oikkonen	b5962dadac	Tidy indentation in AVX2 16x16 iDCT	2019-10-28 16:19:42 +02:00
Pauli Oikkonen	36a8f89025	Fine-tune 16x16 AVX2 iDCT	2019-10-28 16:19:42 +02:00
Pauli Oikkonen	ca9409de2b	Implement 16x16 DCT as butterfly algorithm in AVX2	2019-10-28 16:19:42 +02:00
Pauli Oikkonen	7c69a26717	Use aligned loads and stores for AVX2 DCT	2019-10-28 16:19:42 +02:00
Pauli Oikkonen	8e9c65dca6	Align DCT matrices and temp transform buffers	2019-10-28 16:19:42 +02:00
Pauli Oikkonen	148a150522	Align DCT source and dest blocks to cache line	2019-10-28 16:19:42 +02:00
Pauli Oikkonen	8e60bbf6a6	Slightly tune 16x16 forward DCT Use an array of __m256i's to store temporary value, essentially letting the compiler enforce alignment and use aligned loads and stores.	2019-10-28 16:19:42 +02:00
Pauli Oikkonen	c0cc0e8a75	Optimize 16x16 multiply by only slicing right mat once	2019-10-28 16:19:42 +02:00
Pauli Oikkonen	e463d27f22	Implement streamlined generic 16x16 matrix multiply It can't be this fast for real, can it?	2019-10-28 16:19:42 +02:00
Pauli Oikkonen	beb85ce9d6	Reorder parameters for 8x8 matrix multiplies	2019-10-28 16:19:42 +02:00
Pauli Oikkonen	292af62256	Implement tailored 16x16 forward DCT	2019-10-28 16:19:42 +02:00
Pauli Oikkonen	30ce461d98	Redo 4x4 matrix multiplication	2019-10-28 16:19:42 +02:00
Pauli Oikkonen	07970ea82f	Streamline by-the-book 8x8 matrix multiplication Also chop up the forward transform into two tailored multiply functions	2019-10-28 16:19:42 +02:00
Pauli Oikkonen	7ec7ab3361	Implement a tailored AVX2 8x8 DCT	2019-10-28 16:19:42 +02:00
Joose Sainio	372934c7db	Fix division by zero	2019-10-10 16:35:56 +03:00
Joose Sainio	9bdfdeaf5c	Rest of the owl	2019-10-09 15:48:58 +03:00
Joose Sainio	1ba8525faf	WIP	2019-10-09 10:35:07 +03:00
Joose Sainio	19496d2692	?	2019-10-03 14:50:11 +03:00
Joose Sainio	4b111e339e	fix couple of bugs in the implementation, bit calculation seems still bit off	2019-10-01 15:08:39 +03:00
Joose Sainio	84615e406a	fix compiler warnings	2019-09-27 14:20:08 +03:00
Joose Sainio	14b7a75713	Call the new functions and fix bugs	2019-09-27 14:14:24 +03:00
Joose Sainio	ef74bfb182	unify naming	2019-09-27 10:16:21 +03:00
Joose Sainio	e36f481bda	qp calculation for frame	2019-09-27 09:05:40 +03:00
Joose Sainio	47019ca1cd	intra ck update	2019-09-26 16:04:53 +03:00


2894 commits