hashirama/uvg266

mirror of https://github.com/ultravideo/uvg266.git synced 2024-11-25 10:54:05 +00:00

Author	SHA1	Message	Date
Ari Koivula	fa1af14637	Fix includes to include global.h first everywhere	2016-01-22 15:07:49 +02:00
Ari Lemmetti	44656aeb19	Remove useless calculation	2016-01-19 16:35:16 +02:00
Ari Lemmetti	a2fc9920e6	Merge branch 'alternative-satd'	2016-01-13 15:00:43 +02:00
Ari Lemmetti	1ed34f2df8	Add some planar pred optimization for blocks larger than 8x8	2016-01-13 14:50:17 +02:00
Ari Lemmetti	0df88697ff	Copy generic function to AVX2 strategy	2016-01-12 23:51:18 +02:00
Ari Lemmetti	3cb1cebfe5	Add missing inlines	2016-01-12 23:03:31 +02:00
Ari Lemmetti	6a0b13b8b6	Remove unused functions	2016-01-12 22:55:37 +02:00
Ari Lemmetti	61155f0edd	Add 128-bit version of the functions as well	2016-01-12 22:52:00 +02:00
Ari Lemmetti	a6afb8a8f4	Small refactoring	2016-01-12 22:29:33 +02:00
Ari Lemmetti	a756f6133a	Manually unroll vertical Hadamard transform	2016-01-12 21:45:02 +02:00
Ari Lemmetti	66350aa20e	Experiment with alternative implementation of FWHT	2016-01-11 16:25:56 +02:00
Ari Koivula	947bae24f9	Update Doxygen documentation. Add module information to all header files. Update all header file documentations to briefly say what they are, and to use the javadoc format so the brief actually gets included into the doxygen documentation. Remove \file from implementation files, in order to not repeat the info from the header files. Add files under strategies and tools to Doxygen and update the Doxygen settings to be just plain better. Make README be the main page of Doxygen documentation.	2015-12-17 14:05:50 +02:00
Arttu Ylä-Outinen	056fa09ba5	Add arbitrary-sized SATD functions. Adds strategy satd_any_size for generic and AVX2. The satd_any_size functions are implemented with macro SATD_ANY_SIZE defined in strategies-picture.h.	2015-12-15 11:21:45 +02:00
Arttu Ylä-Outinen	728a6abecc	Extract macro SATD_NxN. Combines definitions of macros SATD_NXN and SATD_NXN_AVX2 to macro SATD_NxN and moves it to strategies-picture.h.	2015-12-15 11:21:44 +02:00
Arttu Ylä-Outinen	4402e251ae	Fix kvz_get_extended_block functions. The buffers allocated in functions kvz_get_extended_block_avx2 and kvz_get_extended_block_generic were too small when the width of the block was less than its height. Fixed to allocate correctly sized buffers.	2015-12-15 11:21:43 +02:00
Ari Lemmetti	b78460b02c	Optimize another loop	2015-12-11 11:21:43 +02:00
Ari Lemmetti	ee8c2d0218	Add 4x4 dual SATD for AVX2	2015-12-03 17:13:11 +02:00
Ari Lemmetti	00736fa708	Generate larger than 8x8 dual satd functions with macro	2015-12-03 17:13:11 +02:00
Ari Lemmetti	bd3e1922cd	Add AVX2 8x8 dual hadamard transform	2015-12-03 17:13:11 +02:00
Arttu Ylä-Outinen	940ada4c0d	Mark AVX2 intra filter functions as static. Marks functions filter_4x4_avx2, filter_16x16_avx2 and filter_NxN_avx2 static as they are not used outside strategies/avx2/intra-avx2.	2015-11-09 12:48:20 +02:00
Ari Lemmetti	fbd0596114	Merge branch 'avx2-pixels-blit'	2015-11-04 11:06:10 +02:00
Ari Lemmetti	57ea7d223b	Pass SIMD registers to functions as pointers to fix 32-bit compilation in visual studio	2015-11-04 10:51:26 +02:00
Ari Lemmetti	a3855652e9	Add AVX2 version with separate handling of basic blocks and strideless copy.	2015-11-04 10:07:25 +02:00
Ari Lemmetti	d71f1b5bd0	Disable incompatible optimizations for 32-bit version	2015-10-24 15:32:27 +03:00
Ari Lemmetti	df995d85e8	Utilize AVX2 for dequantization.	2015-10-23 20:17:08 +03:00
Ari Lemmetti	cf347e33c4	Move dequant to strategies. Copy generic to AVX2 as well.	2015-10-23 19:53:50 +03:00
Ari Lemmetti	47082738aa	...and the same tricks for quantized reconstruction	2015-10-23 19:44:38 +03:00
Ari Lemmetti	7961ba80d8	Add functions for bigger block sizes to calculate more residual simultaneously and reduce memory accesses	2015-10-23 19:11:56 +03:00
Ari Lemmetti	15edd5060d	Load and store multiple elements simultaneously. Use 128-bit wide zero test. wip	2015-10-23 17:03:16 +03:00
Ari Lemmetti	b37cca87c8	Copy generic to avx2	2015-10-23 17:03:15 +03:00
Ari Lemmetti	0c63041ba7	Add filtering functions for different block sizes. Simplify logic a bit to reduce branching. Sorry for the large commit!	2015-10-23 16:54:15 +03:00
Ari Lemmetti	5af7a42ebe	Enable AVX2 strategy. Add first version of optimizations.	2015-10-08 12:36:20 +03:00
Ari Lemmetti	f4fe3dca5e	Add AVX2 strategy. Copy generic implementation there.	2015-10-08 12:36:15 +03:00
Ari Lemmetti	38106afa50	Add AVX2 version of quantization.	2015-10-02 16:18:52 +03:00
Ari Lemmetti	989cee1b04	Add 4x4 function as well	2015-10-01 22:14:56 +03:00
Ari Lemmetti	8b57b2bb1a	Refactor SATD to inline most of the function. Replace full horizontal add with shuffle and regular packed add.	2015-10-01 21:29:25 +03:00
Ari Lemmetti	55da2a9958	Add intrinsic version of SATD for 8x8 and larger blocks	2015-10-01 19:42:22 +03:00
Arttu Ylä-Outinen	3a10e9e3e0	Prefix all non-static symbols with "kvz_".	2015-08-26 13:02:28 +03:00
Ari Lemmetti	923f4a74d5	Fix filtering over limits	2015-08-17 17:39:56 +03:00
Ari Lemmetti	82cf4e8ff4	Output error messages to stderr	2015-08-17 15:01:46 +03:00
Ari Lemmetti	3da71b62bf	Add checks if malloc fails	2015-08-17 15:01:46 +03:00
Ari Lemmetti	4718fe7fda	Change variable names to match used convention	2015-08-17 15:01:46 +03:00
Ari Lemmetti	6a5eaf08de	Rename extend_borders to get_extended_block. Add kvz_ prefix to type definition.	2015-08-17 15:01:46 +03:00
Ari Lemmetti	d82582c37c	Changes to extend border function. Now outputs a pointer to a block with guaranteed padding for filtering. Only generate extra pixels if samples are needed out of bounds. Use memcpy otherwise.	2015-08-17 15:01:46 +03:00
Ari Lemmetti	5d96dbc6c0	Make strategy selection use bit depth given via parameter instead of excluding registration with defines	2015-08-12 13:33:38 +03:00
Ari Lemmetti	4122f36089	Prevent the registration of strategies that are incompatible when KVZ_BIT_DEPTH != 8. Remove unnecessary or misleading mentions of "8bit".	2015-08-12 11:29:53 +03:00
Ari Koivula	0c3c93d456	Optimize intra SAD intrinsics. Added 64x64 version for completeness. With the exception of 16x16, these were all slightly slower than the ASM versions, as measured by "kvazaar_test -s speed -t intra_sad", but now they are on par or slightly faster. None of these actually use any AVX2 intrinsics, and probably never will, unless someone adds an interface for doing more than one block at a time, in which case the non-destructive versions might come in handy.	2015-08-06 19:35:00 +03:00
Arttu Ylä-Outinen	f7f17a060c	Rename pixel_t to kvz_pixel.	2015-07-02 16:58:28 +03:00
Arttu Ylä-Outinen	fab07d80da	Rename macro BIT_DEPTH to KVZ_BIT_DEPTH.	2015-07-02 16:55:47 +03:00
Marko Viitanen	8ed5d06ebe	Fixed compiler warnings caused by the bipred branch merge	2015-04-23 15:12:48 +03:00


80 commits