hashirama/uvg266

mirror of https://github.com/ultravideo/uvg266.git synced 2024-11-24 18:34:06 +00:00

Author	SHA1	Message	Date
Ari Lemmetti	1ed34f2df8	Add some planar pred optimization for blocks larger than 8x8	2016-01-13 14:50:17 +02:00
Ari Lemmetti	0df88697ff	Copy generic function to AVX2 strategy	2016-01-12 23:51:18 +02:00
Ari Lemmetti	62799a9fc3	Create generic strategy of planar prediction	2016-01-12 23:50:47 +02:00
Ari Lemmetti	3cb1cebfe5	Add missing inlines	2016-01-12 23:03:31 +02:00
Ari Lemmetti	6a0b13b8b6	Remove unused functions	2016-01-12 22:55:37 +02:00
Ari Lemmetti	61155f0edd	Add 128-bit version of the functions as well	2016-01-12 22:52:00 +02:00
Ari Lemmetti	a6afb8a8f4	Small refactoring	2016-01-12 22:29:33 +02:00
Ari Lemmetti	a756f6133a	Manually unroll vertical Hadamard transform	2016-01-12 21:45:02 +02:00
Ari Lemmetti	66350aa20e	Experiment with alternative implementation of FWHT	2016-01-11 16:25:56 +02:00
Ari Koivula	947bae24f9	Update Doxygen documentation Add module information to all header files. Update all header file documentations to briefly say what they are, and to use the javadoc format so the brief actually gets included into the doxygen documentation. Remove \file from implementation files, in order to not repeat the info from the header files. Add files under strategies and tools to Doxygen and update the Doxygen settings to be just plain better. Make README be the main page of Doxygen documentation.	2015-12-17 14:05:50 +02:00
Arttu Ylä-Outinen	864c77f6eb	Use kvz_satd_any_size in inter search. Changes search_frac and kvz_search_cu_iter to use kvz_satd_any_size for computing the SATDs instead of getting the SATD function with kvz_pixels_get_satd_func.	2015-12-15 11:21:45 +02:00
Arttu Ylä-Outinen	056fa09ba5	Add arbitrary-sized SATD functions. Adds strategy satd_any_size for generic and AVX2. The satd_any_size functions are implemented with macro SATD_ANY_SIZE defined in strategies-picture.h.	2015-12-15 11:21:45 +02:00
Arttu Ylä-Outinen	728a6abecc	Extract macro SATD_NxN. Combines definitions of macros SATD_NXN and SATD_NXN_AVX2 to macro SATD_NxN and moves it to strategies-picture.h.	2015-12-15 11:21:44 +02:00
Arttu Ylä-Outinen	4402e251ae	Fix kvz_get_extended_block functions. The buffers allocated in functions kvz_get_extended_block_avx2 and kvz_get_extended_block_generic were too small when the width of the block was less than its height. Fixed to allocate correctly sized buffers.	2015-12-15 11:21:43 +02:00
Ari Lemmetti	b78460b02c	Optimize another loop	2015-12-11 11:21:43 +02:00
Ari Lemmetti	ee8c2d0218	Add 4x4 dual SATD for AVX2	2015-12-03 17:13:11 +02:00
Ari Lemmetti	00736fa708	Generate larger than 8x8 dual satd functions with macro	2015-12-03 17:13:11 +02:00
Ari Lemmetti	bd3e1922cd	Add AVX2 8x8 dual hadamard transform	2015-12-03 17:13:11 +02:00
Ari Lemmetti	d575b94357	Implement generic functions for dual sad / satd	2015-12-03 17:13:11 +02:00
Ari Lemmetti	183ee53f47	Add alternative version of rough intra search. Calculate two costs simultaneously to exploit larger SIMD registers. Implementation for dual functions missing currently.	2015-12-03 17:12:38 +02:00
Arttu Ylä-Outinen	940ada4c0d	Mark AVX2 intra filter functions as static. Marks functions filter_4x4_avx2, filter_16x16_avx2 and filter_NxN_avx2 static as they are not used outside strategies/avx2/intra-avx2.	2015-11-09 12:48:20 +02:00
Ari Lemmetti	fbd0596114	Merge branch 'avx2-pixels-blit'	2015-11-04 11:06:10 +02:00
Ari Lemmetti	57ea7d223b	Pass SIMD registers to functions as pointers to fix 32-bit compilation in visual studio	2015-11-04 10:51:26 +02:00
Ari Lemmetti	a3855652e9	Add AVX2 version with separate handling of basic blocks and strideless copy.	2015-11-04 10:07:25 +02:00
Ari Lemmetti	0816fbea2c	Create generic strategy of blit function	2015-11-04 10:07:25 +02:00
Marko Viitanen	821d5c478b	Added missing parameter to kvz_strategy_register_picture_generic()	2015-11-02 08:55:54 +02:00
Ari Lemmetti	d71f1b5bd0	Disable incompatible optimizations for 32-bit version	2015-10-24 15:32:27 +03:00
Ari Lemmetti	df995d85e8	Utilize AVX2 for dequantization.	2015-10-23 20:17:08 +03:00
Ari Lemmetti	cf347e33c4	Move dequant to strategies. Copy generic to AVX2 as well.	2015-10-23 19:53:50 +03:00
Ari Lemmetti	47082738aa	...and the same tricks for quantized reconstruction	2015-10-23 19:44:38 +03:00
Ari Lemmetti	7961ba80d8	Add functions for bigger block sizes to calculate more residual simultaneously and reduce memory accesses	2015-10-23 19:11:56 +03:00
Ari Lemmetti	15edd5060d	Load and store multiple elements simultaneously. Use 128-bit wide zero test. wip	2015-10-23 17:03:16 +03:00
Ari Lemmetti	b37cca87c8	Copy generic to avx2	2015-10-23 17:03:15 +03:00
Ari Lemmetti	cad2ea9d6e	Move quantize_residual to quant strategies.	2015-10-23 17:03:15 +03:00
Ari Lemmetti	0c63041ba7	Add filtering functions for different block sizes. Simplify logic a bit to reduce branching. Sorry for the large commit!	2015-10-23 16:54:15 +03:00
Ari Lemmetti	5af7a42ebe	Enable AVX2 strategy. Add first version of optimizations.	2015-10-08 12:36:20 +03:00
Ari Lemmetti	f4fe3dca5e	Add AVX2 strategy. Copy generic implementation there.	2015-10-08 12:36:15 +03:00
Ari Lemmetti	54e8b346a3	Add intra strategy. Move angular prediction there.	2015-10-08 12:36:05 +03:00
Ari Lemmetti	38106afa50	Add AVX2 version of quantization.	2015-10-02 16:18:52 +03:00
Ari Lemmetti	ef0ad292ef	Add quantization strategy.	2015-10-02 16:17:02 +03:00
Ari Lemmetti	989cee1b04	Add 4x4 function as well	2015-10-01 22:14:56 +03:00
Ari Lemmetti	8b57b2bb1a	Refactor SATD to inline most of the function. Replace full horizontal add with shuffle and regular packed add.	2015-10-01 21:29:25 +03:00
Ari Lemmetti	55da2a9958	Add intrinsic version of SATD for 8x8 and larger blocks	2015-10-01 19:42:22 +03:00
Ari Lemmetti	d68fc4c41e	Add header for common utilities to use with strategies.	2015-10-01 19:40:35 +03:00
Ari Koivula	9a23ae3d92	Resolve remaining Visual Studio warnings. - Ignore most of them and fix the ones that can't be ignored.	2015-08-31 15:02:25 +03:00
Arttu Ylä-Outinen	3a10e9e3e0	Prefix all non-static symbols with "kvz_".	2015-08-26 13:02:28 +03:00
Arttu Ylä-Outinen	bfe2b31cee	Make generic satd functions static.	2015-08-26 12:10:27 +03:00
Ari Lemmetti	923f4a74d5	Fix filtering over limits	2015-08-17 17:39:56 +03:00
Ari Lemmetti	82cf4e8ff4	Output error messages to stderr	2015-08-17 15:01:46 +03:00
Ari Lemmetti	3da71b62bf	Add checks if malloc fails	2015-08-17 15:01:46 +03:00
Ari Lemmetti	4718fe7fda	Change variable names to match used convention	2015-08-17 15:01:46 +03:00
Ari Lemmetti	6a5eaf08de	Rename extend_borders to get_extended_block. Add kvz_ prefix to type definition.	2015-08-17 15:01:46 +03:00
Ari Lemmetti	d82582c37c	Changes to extend border function. Now outputs a pointer to a block with guaranteed padding for filtering. Only generate extra pixels if samples are needed out of bounds. Use memcpy otherwise.	2015-08-17 15:01:46 +03:00
Ari Lemmetti	5d96dbc6c0	Make strategy selection use bit depth given via parameter instead of excluding registration with defines	2015-08-12 13:33:38 +03:00
Ari Lemmetti	4122f36089	Prevent the registration of strategies that are incompatible when KVZ_BIT_DEPTH != 8 Remove unnecessary or misleading mentions of "8bit"	2015-08-12 11:29:53 +03:00
Ari Lemmetti	348d7780fc	Remove third shift and offset from 14-bit sampling functions (change missing from rebase)	2015-08-11 15:06:16 +03:00
Marko Viitanen	8409317bd9	Fixed rebasing errors for 10bit branch	2015-08-11 14:56:45 +03:00
Marko Viitanen	6453a511d7	Scale SAD/SATD costs to match bit depth Conflicts: src/image.c	2015-08-11 08:18:14 +03:00
Marko Viitanen	0304b6c412	Fixed luma interpolation filter when 10bit coding and some other minor fixes	2015-08-11 08:17:48 +03:00
Marko Viitanen	450b5e64ca	Fixed overflow on generic ipol filters when 10bit encoding Conflicts: src/strategies/generic/ipol-generic.c	2015-08-11 08:17:48 +03:00
Marko Viitanen	414ebe6101	Fixed checksum on bitdepth > 8 cases Conflicts: src/nal.c src/nal.h src/strategies/generic/nal-generic.c src/strategies/strategies-nal.c src/strategies/strategies-nal.h	2015-08-11 08:14:35 +03:00
Marko Viitanen	57ab46f110	Small fixes all around to enable 10bit encoding Conflicts: src/encmain.c src/encoder.c src/encoderstate.c src/global.h	2015-08-11 07:59:20 +03:00
Ari Lemmetti	5887c96991	Add and use 14bit reconstruction for fractional motion vectors with bipred	2015-08-10 18:45:29 +03:00
Ari Lemmetti	8b4a6c92da	Add 14bit precision sample functions.	2015-08-10 18:02:06 +03:00
Ari Lemmetti	b30f17d4b8	Add fractional pixel sampling for chroma	2015-08-10 17:55:37 +03:00
Ari Lemmetti	01f40ec104	Add fractional pixel sampling for luma	2015-08-10 17:51:48 +03:00
Ari Koivula	0c3c93d456	Optimize intra SAD intrinsics. - Added 64x64 version for completeness. - With the exception of 16x16, these were all slightly slower than the ASM versions, as measured by "kvazaar_test -s speed -t intra_sad", but now they are on par or slightly faster. - None of these actually use any AVX2 intrinsics, and probably never will, unless someone adds an interface for doing more than one block at a time, in which case the non-destructive versions might come in handy.	2015-08-06 19:35:00 +03:00
Arttu Ylä-Outinen	f7f17a060c	Rename pixel_t to kvz_pixel.	2015-07-02 16:58:28 +03:00
Arttu Ylä-Outinen	fab07d80da	Rename macro BIT_DEPTH to KVZ_BIT_DEPTH.	2015-07-02 16:55:47 +03:00
Marko Viitanen	8ed5d06ebe	Fixed compiler warnings caused by the bipred branch merge	2015-04-23 15:12:48 +03:00
Ari Lemmetti	b9ec4b0a54	AVX2 acceleration for new luma filtering.	2015-03-11 15:33:38 +02:00
Ari Lemmetti	39eceec38d	Rewrite of luma fractional pixel filtering. Utilizes intermediate values instead of calculating everything again.	2015-03-06 17:58:22 +02:00
Ari Koivula	ded6fd9ee8	Renamed typedef pixel to pixel_t.	2015-03-04 16:35:53 +02:00
Ari Koivula	f6147b410a	Rename struct encoder_control to encoder_control_t. Conflicts: src/encoder_state-geometry.h src/encoderstate.h	2015-03-04 14:01:14 +02:00
Ari Koivula	d7383ccb25	Change license to LGPL. - Everyone who has contributed code to the project has been asked to license their contributions under LPGL and they have agreed. - COPYING file changed to say LGPLv2.1 instead of GPLv2. - GPL changed to LGPL in the header of every single file that a header and header added to the few that were missing one. - Also.. Happy new year!	2015-02-25 15:19:05 +02:00
Ari Lemmetti	7430622038	Copy ipol-generic strategy as a base for avx2 strategy	2015-02-05 13:28:07 +02:00
Ari Lemmetti	8495870df8	Using BIT_DEPTH macro because it is constant	2015-02-05 13:19:54 +02:00
Ari Lemmetti	c82adae0c4	Use four tap functions in octpel chroma interpolation	2015-02-04 18:23:57 +02:00
Ari Lemmetti	2f11caeb73	Added generic four tap functions. Use them in halfpel chroma interpolation.	2015-02-04 17:50:12 +02:00
Ari Lemmetti	041d970ece	Apply fast clipping also to chroma filtering.	2015-01-29 16:19:04 +02:00
Ari Lemmetti	c21351cc12	Added fast clipping function for clamping values to bit depth.	2015-01-21 17:53:06 +02:00
Ari Lemmetti	f037ed580c	Improved data layout	2015-01-15 16:31:18 +02:00
Ari Lemmetti	465f718eeb	Move value clipping away from separate loop	2015-01-15 16:14:00 +02:00
Ari Lemmetti	9d12ce21d5	Cleaned luma interpolation, added functions for 8-tap filtering.	2015-01-15 16:13:12 +02:00
Ari Lemmetti	0e56d13b5d	Use smaller bit depth for fractional pixel interpolation	2015-01-15 15:00:09 +02:00
Ari Lemmetti	cc061b4c3d	Added ipol strategy for interpolation filters. Added initial files for AVX2 and generic strategies.	2015-01-15 14:59:37 +02:00
Ari Koivula	fcb6fa6d4b	Fix compilation error on PowerPC. - Need abs from stdlib.	2014-10-21 18:14:32 +03:00
Ari Koivula	55ab08c213	Fix incorrect const qualifiers. - Change input pointers to const in dct-generic, like they should have been. - Fixes compilation error on GCC.	2014-10-13 16:57:15 +03:00
Ari Koivula	8a5b24bcbe	Remove usages of GCC __attribute__. - To allow clang to compile, as it doesn't according to #58. - The target attributes are not needed anymore due to makefile handling targetting now. - The __attribute__((unused)) used for debugging. I don't know if clang supports this attribute or not but it doesn't seem very important so I'm removing it just in case.	2014-10-13 16:46:26 +03:00
Ari Koivula	d893a489d6	Fix mingw compilation issue. strategies/avx2/dct-avx2.c:334:25: error: pasting "g_dct_16" and "[" does not give a valid preprocessing token - The [ is not part of the token so compilation failed on mingw GCC 4.9.1. - Fixes #86.	2014-10-10 16:32:39 +03:00
Ari Lemmetti	bcf12567d0	Added some comments.	2014-10-03 17:51:58 +03:00
Ari Lemmetti	fea517c2ae	Misc code cleanup	2014-10-03 17:06:09 +03:00
Ari Lemmetti	85682c3b6a	Removed unused transpose functions.	2014-10-03 11:39:31 +03:00
Ari Koivula	f6272f06fc	Unify signature for transform functions. - Some used block, coeff and some src, dst. Now all signatures are const input and non-const output.	2014-10-03 11:21:43 +03:00
Ari Koivula	b932cf4b21	Clean up avx2 dct macros.	2014-10-03 11:16:25 +03:00
Ari Koivula	47244a15c3	Merge branch 'dct-optimizations' Conflicts: src/strategies/avx2/dct-avx2.c src/strategies/generic/dct-generic.c	2014-10-02 13:45:21 +03:00
Ari Lemmetti	61e1510480	Transform functions in dct-avx2.c are now generated with macros.	2014-10-02 13:24:30 +03:00
Ari Lemmetti	9407610555	Moved DCT / DST matrices to dct-generic.c	2014-10-02 13:24:30 +03:00
Ari Lemmetti	7255112bd8	Added transposed DCT/DST tables. Use them while calculating transforms instead of doing runtime transpose. Added separate functions for DST and IDST.	2014-10-02 13:24:30 +03:00
Ari Lemmetti	e7bcb58846	Added 32x32 IDCT	2014-10-02 13:24:30 +03:00

191 commits