hashirama/uvg266

mirror of https://github.com/ultravideo/uvg266.git synced 2024-11-24 02:24:07 +00:00

Author	SHA1	Message	Date
Arttu Ylä-Outinen	6653f06dd0	Only compute GOP layer weights when RC is enabled	2020-02-15 22:36:57 +02:00
Arttu Ylä-Outinen	c8fff1e0d6	Use a larger number of bits for POC lsb when needed. Changes the number of bits used for coding the least significant bits of the POC based on the GOP size.	2020-02-15 22:36:56 +02:00
Arttu Ylä-Outinen	d757a832c2	Change GOP QP offset handling to match HM. Adds fields qp_model_scale and qp_model_offset to kvz_gop_config and intra_qp_offset to kvz_config.	2020-02-15 22:36:56 +02:00
Arttu Ylä-Outinen	f37dcd5879	Move GOP definition to a separate file. Moves definition of the 8-GOP from cfg.c to gop.h.	2020-02-15 22:36:55 +02:00
Ari Lemmetti	6e1007a3e7	Get rid of LAMBA! (Commit #3000 )	2020-02-15 22:32:52 +02:00
Ari Lemmetti	0c02e71b43	Remove minor error from readme	2020-02-15 22:29:08 +02:00
Ari Lemmetti	9a0236bb4e	Add option 'zero-coeff-rdo'	2020-02-04 21:26:29 +02:00
Ari Lemmetti	886ff36d12	Initial implementation of fast bipred.	2020-02-04 15:46:23 +02:00
Ari Lemmetti	3c7dd0752f	Remove the broken "no mov" branch. Causes hash mismatches for example in SlideShow sequence.	2020-02-03 15:26:31 +02:00
RLamm	bf8941ddb8	Added comment about partial-coding usage	2020-01-31 16:19:48 +02:00
RLamm	b8488ab48d	Changed "partial-coding" variables to uint32_t	2020-01-31 16:02:29 +02:00
RLamm	76e3249754	Changed parameter "slicer" to "partial-coding" to avoid confusion.	2020-01-31 14:22:32 +02:00
RLamm	30d5df40c5	Custom headers for the distributed coding	2020-01-29 15:54:49 +02:00
Pauli Oikkonen	c3d9e97e9f	Fix VS build	2019-12-12 18:34:55 +02:00
Pauli Oikkonen	7f238ca299	Remove debug print functions. Whoops	2019-12-12 18:19:31 +02:00
Pauli Oikkonen	eefb5e50b3	De-inline pred_filtered_dc functions, shouldn't make much difference though	2019-12-12 17:30:00 +02:00
Pauli Oikkonen	169314de4f	32x32 filtered DC prediction in AVX2	2019-12-11 18:17:06 +02:00
Pauli Oikkonen	fb2481b7e4	16x16 filtered DC implemented in AVX2	2019-12-10 15:54:50 +02:00
Pauli Oikkonen	da370ea36d	Implement AVX2 8x8 filtered DC algorithm	2019-11-28 14:10:10 +02:00
Pauli Oikkonen	5d9b7019ca	Implement a 4x4 filtered DC pred function	2019-11-26 17:05:54 +02:00
Pauli Oikkonen	f1485ab087	Start doing an arbitrary size filtered DC pred - maybe easier to just create separate functions for fixed block sizes?	2019-11-25 15:20:29 +02:00
Pauli Oikkonen	979d66031c	Create a strategy out of intra_pred_filtered_dc	2019-11-19 14:50:31 +02:00
Pauli Oikkonen	fa4bb86406	Optimize intra_pred_planar_avx2 for 4x4 blocks	2019-11-19 13:39:02 +02:00
Pauli Oikkonen	4761d228f9	Start to vectorize the 4x4 loop	2019-11-15 17:32:40 +02:00
Pauli Oikkonen	8d45ab4951	Stupidify the 4x4 planar loop for vectorization	2019-11-14 17:14:04 +02:00
Pauli Oikkonen	6f13f6525c	Merge branch 'new_prints'	2019-11-07 17:04:21 +02:00
mercat	57e8c3ebc2	Merge branch 'ML-cplx_red_ICIP'	2019-11-07 13:25:47 +02:00
Pauli Oikkonen	558f0ec401	Mbps, not mbps	2019-11-05 18:06:00 +02:00
Pauli Oikkonen	2edf533925	Tidy the end report printing. Also fix a bug with non-integer target FPS	2019-11-05 17:20:00 +02:00
Pauli Oikkonen	c7313ce567	Store AVG QP information in encmain	2019-11-04 17:08:07 +02:00
Reima Hyvönen	80575c59bf	Some updates done to get right bitrate and avg QP	2019-10-31 15:56:24 +02:00
Reima Hyvönen	252bab8820	Added prints to bitrate and AVG QP	2019-10-31 15:56:24 +02:00
Pauli Oikkonen	6d7a4f555c	Also remove 16x16 (A * B^T)^T matrix multiply. Can be done using (B * A^T) instead, it's the exact same	2019-10-28 16:19:42 +02:00
Pauli Oikkonen	2c2deb2366	Tidy AVX2 32x32 matrix multiply	2019-10-28 16:19:42 +02:00
Pauli Oikkonen	98ad78b333	Tidy the old AVX2 32x32 matrix multiply. It was actually a very good algorithm, just looked messy!	2019-10-28 16:19:42 +02:00
Pauli Oikkonen	4a921cbdb5	Retain data as much in YMM registers as possible. This seems to make it a whole lot quicker	2019-10-28 16:19:42 +02:00
Pauli Oikkonen	ac4d710e23	Unroll 32x32 matrix multiply, use all regs	2019-10-28 16:19:42 +02:00
Pauli Oikkonen	a58608d0b8	Remove totally unnecessary (A * B^T)^T 32x32 multiply	2019-10-28 16:19:42 +02:00
Pauli Oikkonen	043f53539f	Implement a streamlined matrix-multiply 32x32 DCT	2019-10-28 16:19:42 +02:00
Pauli Oikkonen	e9da2d851b	Tidy 32x32 fast DCT's helper functions	2019-10-28 16:19:42 +02:00
Pauli Oikkonen	e382339182	Implement fast (butterfly) 32x32 DCT in AVX2	2019-10-28 16:19:42 +02:00
Pauli Oikkonen	b5962dadac	Tidy indentation in AVX2 16x16 iDCT	2019-10-28 16:19:42 +02:00
Pauli Oikkonen	36a8f89025	Fine-tune 16x16 AVX2 iDCT	2019-10-28 16:19:42 +02:00
Pauli Oikkonen	ca9409de2b	Implement 16x16 DCT as butterfly algorithm in AVX2	2019-10-28 16:19:42 +02:00
Pauli Oikkonen	7c69a26717	Use aligned loads and stores for AVX2 DCT	2019-10-28 16:19:42 +02:00
Pauli Oikkonen	8e9c65dca6	Align DCT matrices and temp transform buffers	2019-10-28 16:19:42 +02:00
Pauli Oikkonen	148a150522	Align DCT source and dest blocks to cache line	2019-10-28 16:19:42 +02:00
Pauli Oikkonen	8e60bbf6a6	Slightly tune 16x16 forward DCT. Use an array of __m256i's to store temporary value, essentially letting the compiler enforce alignment and use aligned loads and stores.	2019-10-28 16:19:42 +02:00
Pauli Oikkonen	c0cc0e8a75	Optimize 16x16 multiply by only slicing right mat once	2019-10-28 16:19:42 +02:00
Pauli Oikkonen	e463d27f22	Implement streamlined generic 16x16 matrix multiply. It can't be this fast for real, can it?	2019-10-28 16:19:42 +02:00

2557 commits