hashirama/uvg266

mirror of https://github.com/ultravideo/uvg266.git synced 2024-11-24 18:34:06 +00:00

Author	SHA1	Message	Date
Reima Hyvönen	17babfffa4	25.6 working optimization, ~50% faster than original	2018-06-25 17:06:16 +03:00
Reima Hyvönen	9fed29f950	optimization for inter_recon_bipred	2018-04-18 15:25:44 +03:00
Arttu Ylä-Outinen	0a69e6d18f	Fix selection of transform function for 4x4 blocks DST function was returned for inter luma transform blocks of size 4x4 even though they must use DCT. Fixed by checking the prediction mode of the block in addition to whether it is chroma or luma.	2018-01-18 10:36:25 +02:00
Arttu Ylä-Outinen	9694bd2fae	Fix build on 32-bit systems Function coeff_abs_sum_avx2 that was added in `e950c9b` was outside the AVX2 #if directive.	2017-07-28 09:19:29 +03:00
Arttu Ylä-Outinen	e950c9b101	Add AVX2 implementation for coefficient sum	2017-07-28 07:39:36 +03:00
Arttu Ylä-Outinen	d50ae6990c	Add sum of absolute coefficients to strategies	2017-07-28 07:39:15 +03:00
Arttu Ylä-Outinen	fdb3480b54	Enable strategies for SAO reconstruction Re-enables strategies for SAO reconstruction. They were disabled in commit `ec9ff42`.	2017-07-11 10:35:18 +03:00
Arttu Ylä-Outinen	333dba3884	Add static to SAO strategies	2017-07-11 10:02:01 +03:00
Arttu Ylä-Outinen	563bc26e71	Fix out-of-bounds read in AVX2 SAO AVX2 version of SAO loaded offsets with a 256 bit read even though there are only five 32 bit integers.	2017-07-06 13:04:52 +03:00
Arttu Ylä-Outinen	2c66e0bbd2	Fix warnings about invalid reads in AVX2 ipol AVX2 filter functions read pixels in chunks of 8 or 16 bytes. At the end of the block, the read goes out of the bounds of the pixels array. The extra pixels do not affect the result. Fixes valgrind complaining about the invalid reads by allocating 5 extra pixels in kvz_get_extended_block_avx2	2017-06-22 09:37:55 +03:00
Arttu Ylä-Outinen	95775a1645	Change coefficient storage order Changes coefficient storage order to a zig-zag order. Reduces unnecessary copying of coefficients to temporary arrays.	2017-05-12 16:46:57 +03:00
Arttu Ylä-Outinen	51786eda67	Drop redundant fields in encoder_control_t Some of the fields in encoder_control_t were simply copies of the corresponding fields in kvz_config. This commit drops the copied fields in favor of using the fields in encoder_control_t.cfg directly.	2017-02-09 14:05:28 +09:00
Arttu Ylä-Outinen	e78a8dfcf5	Copy the kvz_config passed to encoder_open The kvz_config struct is created by the user but kvazaar keeps a pointer to it. It is easy to break things by modifying the configuration outside kvazaar. In addition, kvazaar modifies the struct even though it has a const modifier. This commit changes the field cfg in encoder_control_t to be a copy of the kvz_config struct instead of a pointer, removing modifications to the const struct and allowing users to do whatever they want with it after opening the encoder.	2017-02-09 13:23:54 +09:00
Arttu Ylä-Outinen	640ff94ecd	Use separate lambda and QP for each LCU Adds fields lambda, lambda_sqrt and qp to encoder_state_t. Drops field cur_lambda_cost_sqrt from encoder_state_config_frame_t and renames cur_lambda_cost to lambda.	2017-01-09 01:24:23 +09:00
Ari Lemmetti	70a52f0e48	10-bit: add missing bit depth adjustment to ssd	2016-11-17 19:28:04 +02:00
Ari Lemmetti	29153ed503	Remove unused variable	2016-10-21 17:28:42 +03:00
Ari Lemmetti	778e46dfd8	Add AVX2 version of SSD	2016-10-21 15:07:53 +03:00
Ari Lemmetti	6f5d7c9e06	Move SSD to strategies	2016-10-21 15:07:23 +03:00
Ari Lemmetti	89b941eab4	Fix typo	2016-10-21 15:07:02 +03:00
Ari Koivula	cbfa824d1a	Merge branch 'simd'	2016-09-27 20:49:45 +03:00
Ari Koivula	14a7bcba25	Use a faster function for clipped inter SAD Use the vectorized general SSE41 inter SAD in AVX reg_sad for shapes for which we don't have AVX versions yet. Also improves speed of --smp and --amp a lot. Got a 1.25x speedup for: --preset=ultrafast -q 27 --gop=lp-g4d3r3t1 --me-early-termination=on --rd=1 --pu-depth-inter=1-3 --smp --amp * Suite speed_tests: -PASS inter_sad: 0.898M x reg_sad(64x63):x86_asm_avx (1000 ticks, 1.000 sec) +PASS inter_sad: 2.503M x reg_sad(64x63):x86_asm_avx (1000 ticks, 1.000 sec) -PASS inter_sad: 115.054M x reg_sad(1x1):x86_asm_avx (1000 ticks, 1.000 sec) +PASS inter_sad: 133.577M x reg_sad(1x1):x86_asm_avx (1000 ticks, 1.000 sec)	2016-09-27 20:48:30 +03:00
Eemeli Kallio	f41e428e5f	Removed kvz_skip_unnecessary_rdoq and reworked --rdoq-skip to skip 4x4 blocks when it is on.	2016-09-09 10:26:07 +03:00
Ari Koivula	02cd17b427	Add faster AVX inter SAD for 32x32 and 64x64 Add implementations for these functions that process the image line by line instead of using the 16x16 function to process block by block. The 32x32 is around 30% faster, and 64x64 is around 15% faster, on Haswell. PASS inter_sad: 28.744M x reg_sad(32x32):x86_asm_avx (1014 ticks, 1.014 sec) PASS inter_sad: 7.882M x reg_sad(64x64):x86_asm_avx (1014 ticks, 1.014 sec) to PASS inter_sad: 37.828M x reg_sad(32x32):x86_asm_avx (1014 ticks, 1.014 sec) PASS inter_sad: 9.081M x reg_sad(64x64):x86_asm_avx (1014 ticks, 1.014 sec)	2016-09-01 21:36:39 +03:00
Ari Lemmetti	28c4174d0e	Fix incorrect shuffle parameters _MM_SHUFFLE uses reverse order	2016-08-23 19:40:46 +03:00
Ari Lemmetti	ce77bfa15b	Replace KVZ_PERMUTE with _MM_SHUFFLE The same exact macro already exists	2016-08-22 19:08:46 +03:00
Eemeli Kallio	99d8b9abeb	Changed skip_rdoq name to kvz_skip_unnecessary_rdoq. Changed the order it uses when it goes through CGs and tuned its sum calculation.	2016-08-18 14:02:56 +03:00
Eemeli Kallio	1fb4755f31	Added rdoq-skip to quant-generic.c	2016-08-18 12:17:54 +03:00
Eemeli Kallio	d20ac03ca2	Added --rdoq-skip option	2016-08-18 12:17:53 +03:00
Arttu Ylä-Outinen	2a946bd88e	Rename encoder_state_t.global to frame "Frame" is more accurate than "global" since when OWF is used, encoder states for each frame have their own struct.	2016-08-10 13:22:36 +09:00
Arttu Ylä-Outinen	5fbb0a8c27	Fix includes	2016-08-10 13:05:40 +09:00
Ari Lemmetti	6bcba004ff	Comment out to fix unused code error on clang.	2016-07-14 14:12:16 +03:00
Ari Lemmetti	c0979ebdcb	Implement AVX2 luma sampling	2016-07-14 12:53:02 +03:00
Ari Lemmetti	6244560426	Add avx2 strategy for kvz_filter_frac_blocks_luma.	2016-07-14 12:53:02 +03:00
Ari Lemmetti	9c4e9e049b	Load only what is needed. Eliminate latency from hadds.	2016-07-14 12:53:01 +03:00
Ari Lemmetti	fccfbd2f28	Add strategy for kvz_filter_frac_blocks_luma	2016-07-14 12:51:02 +03:00
Ari Lemmetti	2b0c8db349	Add quad satd for avx2	2016-07-14 12:50:24 +03:00
Ari Lemmetti	0ff69fd6f8	Add any size multi satd	2016-07-14 12:48:37 +03:00
Arttu Ylä-Outinen	bf26661782	Add support for 4x4 blocks to SATD_ANY_SIZE. Makes functions satd_any_size_generic and satd_any_size_8bit_avx2 work on blocks whose width and/or height are not multiples of 8.	2016-06-16 18:53:17 +09:00
Ari Lemmetti	3107a93eaf	Fix avx2 chroma sampling for amp	2016-05-17 14:09:57 +03:00
Ari Lemmetti	efbdc5dade	Utilize registers more efficiently for 8x8 and larger blocks	2016-04-21 13:26:38 +03:00
Ari Lemmetti	192cee95b2	Vectorize vertical filtering	2016-04-21 13:26:38 +03:00
Ari Lemmetti	0be35f72b8	Filter 4 pixels simultaneously in x direction	2016-04-21 13:26:38 +03:00
Ari Lemmetti	10484bda9f	Make strategies out of fractional pixel sample functions	2016-04-21 13:26:38 +03:00
Ari Lemmetti	8247faf8e0	Remove 64-bit only instruction to fix 32-bit compilation.	2016-04-19 18:05:11 +03:00
Ari Lemmetti	eb55d6b6b9	Fix writing over boundary.	2016-04-19 16:03:43 +03:00
Ari Lemmetti	bcabc6fadd	Remove pixel blit from strategies. Use memcpy instead.	2016-04-06 18:44:04 +03:00
Ari Koivula	61fc3e87ba	Run include-what-you-use fix_includes.py fix_includes.py The includes should make more sense now and not just happen to compile due to headers included from other headers. Used a modified version of IWYU. Modifications were to attribute int8_t and so on to stdint.h instead of sys/types.h and immintrin.h instead of more specific headers. include-what-you-use 0.7 (git:b70df35) based on clang version 3.9.0 (trunk 264728)	2016-04-01 17:46:55 +03:00
Ari Koivula	8908d85d66	Change all relative includes to absolute	2016-04-01 17:46:44 +03:00
Ari Koivula	4876879b82	Add IWYU pragmas	2016-03-31 12:33:34 +03:00
Ari Koivula	5b66578f71	Add kvz_ prefix to md5 functions The non kvz_ symbols were being exported in the static lib, which got caught by Travis tests.	2016-03-18 13:13:35 +02:00
Ari Koivula	4125218cfa	Add --hash=md5 Add md5 through extras/libmd5 taken from HM with BSD license. It's implemented as a generic strategy using the same interface as checksum, so we can write a SIMD version if it seems necessary.	2016-03-18 05:23:57 +02:00
Ari Lemmetti	e502292ba8	Remove old function	2016-03-16 20:18:55 +02:00
Ari Lemmetti	c6cc96f5ec	Optimize sao band ddistortion	2016-03-16 20:16:00 +02:00
Ari Lemmetti	ab577f476f	Optimize sao reconstruct color	2016-03-16 20:15:32 +02:00
Ari Lemmetti	48bfddf4ec	Optimize calc sao edge dir	2016-03-16 20:14:50 +02:00
Ari Lemmetti	ba69992941	Optimize sao edge ddistortion	2016-03-16 20:14:19 +02:00
Ari Lemmetti	941b6b3e27	Optimize calc eo cat	2016-03-16 20:13:30 +02:00
Ari Lemmetti	04fbb48a09	Add strategy for avx2. Copy generic functions there.	2016-03-16 20:13:15 +02:00
Ari Lemmetti	4e30a215d8	Create generic strategy for sao.	2016-03-16 20:11:15 +02:00
Ari Lemmetti	99e37ec235	Update old pixel type to the current one	2016-01-30 19:33:09 +02:00
Ari Koivula	fa1af14637	Fix includes to include global.h first everywhere	2016-01-22 15:07:49 +02:00
Ari Lemmetti	44656aeb19	Remove useless calculation	2016-01-19 16:35:16 +02:00
Ari Lemmetti	a2fc9920e6	Merge branch 'alternative-satd'	2016-01-13 15:00:43 +02:00
Ari Lemmetti	1ed34f2df8	Add some planar pred optimization for blocks larger than 8x8	2016-01-13 14:50:17 +02:00
Ari Lemmetti	0df88697ff	Copy generic function to AVX2 strategy	2016-01-12 23:51:18 +02:00
Ari Lemmetti	62799a9fc3	Create generic strategy of planar prediction	2016-01-12 23:50:47 +02:00
Ari Lemmetti	3cb1cebfe5	Add missing inlines	2016-01-12 23:03:31 +02:00
Ari Lemmetti	6a0b13b8b6	Remove unused functions	2016-01-12 22:55:37 +02:00
Ari Lemmetti	61155f0edd	Add 128-bit version of the functions as well	2016-01-12 22:52:00 +02:00
Ari Lemmetti	a6afb8a8f4	Small refactoring	2016-01-12 22:29:33 +02:00
Ari Lemmetti	a756f6133a	Manually unroll vertical Hadamard transform	2016-01-12 21:45:02 +02:00
Ari Lemmetti	66350aa20e	Experiment with alternative implementation of FWHT	2016-01-11 16:25:56 +02:00
Ari Koivula	947bae24f9	Update Doxygen documentation Add module information to all header files. Update all header file documentations to briefly say what they are, and to use the javadoc format so the brief actually gets included into the doxygen documentation. Remove \file from implementation files, in order to not repeat the info from the header files. Add files under strategies and tools to Doxygen and update the Doxygen settings to be just plain better. Make README be the main page of Doxygen documentation.	2015-12-17 14:05:50 +02:00
Arttu Ylä-Outinen	864c77f6eb	Use kvz_satd_any_size in inter search. Changes search_frac and kvz_search_cu_iter to use kvz_satd_any_size for computing the SATDs instead of getting the SATD function with kvz_pixels_get_satd_func.	2015-12-15 11:21:45 +02:00
Arttu Ylä-Outinen	056fa09ba5	Add arbitrary-sized SATD functions. Adds strategy satd_any_size for generic and AVX2. The satd_any_size functions are implemented with macro SATD_ANY_SIZE defined in strategies-picture.h.	2015-12-15 11:21:45 +02:00
Arttu Ylä-Outinen	728a6abecc	Extract macro SATD_NxN. Combines definitions of macros SATD_NXN and SATD_NXN_AVX2 to macro SATD_NxN and moves it to strategies-picture.h.	2015-12-15 11:21:44 +02:00
Arttu Ylä-Outinen	4402e251ae	Fix kvz_get_extended_block functions. The buffers allocated in functions kvz_get_extended_block_avx2 and kvz_get_extended_block_generic were too small when the width of the block was less than its height. Fixed to allocate correctly sized buffers.	2015-12-15 11:21:43 +02:00
Ari Lemmetti	b78460b02c	Optimize another loop	2015-12-11 11:21:43 +02:00
Ari Lemmetti	ee8c2d0218	Add 4x4 dual SATD for AVX2	2015-12-03 17:13:11 +02:00
Ari Lemmetti	00736fa708	Generate larger than 8x8 dual satd functions with macro	2015-12-03 17:13:11 +02:00
Ari Lemmetti	bd3e1922cd	Add AVX2 8x8 dual hadamard transform	2015-12-03 17:13:11 +02:00
Ari Lemmetti	d575b94357	Implement generic functions for dual sad / satd	2015-12-03 17:13:11 +02:00
Ari Lemmetti	183ee53f47	Add alternative version of rough intra search. Calculate two costs simultaneously to exploit larger SIMD registers. Implementation for dual functions missing currently.	2015-12-03 17:12:38 +02:00
Arttu Ylä-Outinen	940ada4c0d	Mark AVX2 intra filter functions as static. Marks functions filter_4x4_avx2, filter_16x16_avx2 and filter_NxN_avx2 static as they are not used outside strategies/avx2/intra-avx2.	2015-11-09 12:48:20 +02:00
Ari Lemmetti	fbd0596114	Merge branch 'avx2-pixels-blit'	2015-11-04 11:06:10 +02:00
Ari Lemmetti	57ea7d223b	Pass SIMD registers to functions as pointers to fix 32-bit compilation in visual studio	2015-11-04 10:51:26 +02:00
Ari Lemmetti	a3855652e9	Add AVX2 version with separate handling of basic blocks and strideless copy.	2015-11-04 10:07:25 +02:00
Ari Lemmetti	0816fbea2c	Create generic strategy of blit function	2015-11-04 10:07:25 +02:00
Marko Viitanen	821d5c478b	Added missing parameter to kvz_strategy_register_picture_generic()	2015-11-02 08:55:54 +02:00
Ari Lemmetti	d71f1b5bd0	Disable incompatible optimizations for 32-bit version	2015-10-24 15:32:27 +03:00
Ari Lemmetti	df995d85e8	Utilize AVX2 for dequantization.	2015-10-23 20:17:08 +03:00
Ari Lemmetti	cf347e33c4	Move dequant to strategies. Copy generic to AVX2 as well.	2015-10-23 19:53:50 +03:00
Ari Lemmetti	47082738aa	...and the same tricks for quantized reconstruction	2015-10-23 19:44:38 +03:00
Ari Lemmetti	7961ba80d8	Add functions for bigger block sizes to calculate more residual simultaneously and reduce memory accesses	2015-10-23 19:11:56 +03:00
Ari Lemmetti	15edd5060d	Load and store multiple elements simultaneously. Use 128-bit wide zero test. wip	2015-10-23 17:03:16 +03:00
Ari Lemmetti	b37cca87c8	Copy generic to avx2	2015-10-23 17:03:15 +03:00
Ari Lemmetti	cad2ea9d6e	Move quantize_residual to quant strategies.	2015-10-23 17:03:15 +03:00
Ari Lemmetti	0c63041ba7	Add filtering functions for different block sizes. Simplify logic a bit to reduce branching. Sorry for the large commit!	2015-10-23 16:54:15 +03:00
Ari Lemmetti	5af7a42ebe	Enable AVX2 strategy. Add first version of optimizations.	2015-10-08 12:36:20 +03:00
Ari Lemmetti	f4fe3dca5e	Add AVX2 strategy. Copy generic implementation there.	2015-10-08 12:36:15 +03:00
Ari Lemmetti	54e8b346a3	Add intra strategy. Move angular prediction there.	2015-10-08 12:36:05 +03:00
Ari Lemmetti	38106afa50	Add AVX2 version of quantization.	2015-10-02 16:18:52 +03:00
Ari Lemmetti	ef0ad292ef	Add quantization strategy.	2015-10-02 16:17:02 +03:00
Ari Lemmetti	989cee1b04	Add 4x4 function as well	2015-10-01 22:14:56 +03:00
Ari Lemmetti	8b57b2bb1a	Refactor SATD to inline most of the function. Replace full horizontal add with shuffle and regular packed add.	2015-10-01 21:29:25 +03:00
Ari Lemmetti	55da2a9958	Add intrinsic version of SATD for 8x8 and larger blocks	2015-10-01 19:42:22 +03:00
Ari Lemmetti	d68fc4c41e	Add header for common utilities to use with strategies.	2015-10-01 19:40:35 +03:00
Ari Koivula	9a23ae3d92	Resolve remaining Visual Studio warnings. - Ignore most of them and fix the ones that can't be ignored.	2015-08-31 15:02:25 +03:00
Arttu Ylä-Outinen	3a10e9e3e0	Prefix all non-static symbols with "kvz_".	2015-08-26 13:02:28 +03:00
Arttu Ylä-Outinen	bfe2b31cee	Make generic satd functions static.	2015-08-26 12:10:27 +03:00
Ari Lemmetti	923f4a74d5	Fix filtering over limits	2015-08-17 17:39:56 +03:00
Ari Lemmetti	82cf4e8ff4	Output error messages to stderr	2015-08-17 15:01:46 +03:00
Ari Lemmetti	3da71b62bf	Add checks if malloc fails	2015-08-17 15:01:46 +03:00
Ari Lemmetti	4718fe7fda	Change variable names to match used convention	2015-08-17 15:01:46 +03:00
Ari Lemmetti	6a5eaf08de	Rename extend_borders to get_extended_block. Add kvz_ prefix to type definition.	2015-08-17 15:01:46 +03:00
Ari Lemmetti	d82582c37c	Changes to extend border function. Now outputs a pointer to a block with guaranteed padding for filtering. Only generate extra pixels if samples are needed out of bounds. Use memcpy otherwise.	2015-08-17 15:01:46 +03:00
Ari Lemmetti	5d96dbc6c0	Make strategy selection use bit depth given via parameter instead of excluding registration with defines	2015-08-12 13:33:38 +03:00
Ari Lemmetti	4122f36089	Prevent the registration of strategies that are incompatible when KVZ_BIT_DEPTH != 8 Remove unnecessary or misleading mentions of "8bit"	2015-08-12 11:29:53 +03:00
Ari Lemmetti	348d7780fc	Remove third shift and offset from 14-bit sampling functions (change missing from rebase)	2015-08-11 15:06:16 +03:00
Marko Viitanen	8409317bd9	Fixed rebasing errors for 10bit branch	2015-08-11 14:56:45 +03:00
Marko Viitanen	6453a511d7	Scale SAD/SATD costs to match bit depth Conflicts: src/image.c	2015-08-11 08:18:14 +03:00
Marko Viitanen	0304b6c412	Fixed luma interpolation filter when 10bit coding and some other minor fixes	2015-08-11 08:17:48 +03:00
Marko Viitanen	450b5e64ca	Fixed overflow on generic ipol filters when 10bit encoding Conflicts: src/strategies/generic/ipol-generic.c	2015-08-11 08:17:48 +03:00
Marko Viitanen	414ebe6101	Fixed checksum on bitdepth > 8 cases Conflicts: src/nal.c src/nal.h src/strategies/generic/nal-generic.c src/strategies/strategies-nal.c src/strategies/strategies-nal.h	2015-08-11 08:14:35 +03:00
Marko Viitanen	57ab46f110	Small fixes all around to enable 10bit encoding Conflicts: src/encmain.c src/encoder.c src/encoderstate.c src/global.h	2015-08-11 07:59:20 +03:00
Ari Lemmetti	5887c96991	Add and use 14bit reconstruction for fractional motion vectors with bipred	2015-08-10 18:45:29 +03:00
Ari Lemmetti	8b4a6c92da	Add 14bit precision sample functions.	2015-08-10 18:02:06 +03:00
Ari Lemmetti	b30f17d4b8	Add fractional pixel sampling for chroma	2015-08-10 17:55:37 +03:00
Ari Lemmetti	01f40ec104	Add fractional pixel sampling for luma	2015-08-10 17:51:48 +03:00
Ari Koivula	0c3c93d456	Optimize intra SAD intrinsics. - Added 64x64 version for completeness. - With the exception of 16x16, these were all slightly slower than the ASM versions, as measured by "kvazaar_test -s speed -t intra_sad", but now they are on par or slightly faster. - None of these actually use any AVX2 intrinsics, and probably never will, unless someone adds an interface for doing more than one block at a time, in which case the non-destructive versions might come in handy.	2015-08-06 19:35:00 +03:00
Arttu Ylä-Outinen	f7f17a060c	Rename pixel_t to kvz_pixel.	2015-07-02 16:58:28 +03:00
Arttu Ylä-Outinen	fab07d80da	Rename macro BIT_DEPTH to KVZ_BIT_DEPTH.	2015-07-02 16:55:47 +03:00
Marko Viitanen	8ed5d06ebe	Fixed compiler warnings caused by the bipred branch merge	2015-04-23 15:12:48 +03:00
Ari Lemmetti	b9ec4b0a54	AVX2 acceleration for new luma filtering.	2015-03-11 15:33:38 +02:00
Ari Lemmetti	39eceec38d	Rewrite of luma fractional pixel filtering. Utilizes intermediate values instead of calculating everything again.	2015-03-06 17:58:22 +02:00
Ari Koivula	ded6fd9ee8	Renamed typedef pixel to pixel_t.	2015-03-04 16:35:53 +02:00
Ari Koivula	f6147b410a	Rename struct encoder_control to encoder_control_t. Conflicts: src/encoder_state-geometry.h src/encoderstate.h	2015-03-04 14:01:14 +02:00
Ari Koivula	d7383ccb25	Change license to LGPL. - Everyone who has contributed code to the project has been asked to license their contributions under LGPL and they have agreed. - COPYING file changed to say LGPLv2.1 instead of GPLv2. - GPL changed to LGPL in the header of every single file that had a header and header added to the few that were missing one. - Also.. Happy new year!	2015-02-25 15:19:05 +02:00
Ari Lemmetti	7430622038	Copy ipol-generic strategy as a base for avx2 strategy	2015-02-05 13:28:07 +02:00
Ari Lemmetti	8495870df8	Using BIT_DEPTH macro because it is constant	2015-02-05 13:19:54 +02:00
Ari Lemmetti	c82adae0c4	Use four tap functions in octpel chroma interpolation	2015-02-04 18:23:57 +02:00
Ari Lemmetti	2f11caeb73	Added generic four tap functions. Use them in halfpel chroma interpolation.	2015-02-04 17:50:12 +02:00
Ari Lemmetti	041d970ece	Apply fast clipping also to chroma filtering.	2015-01-29 16:19:04 +02:00
Ari Lemmetti	c21351cc12	Added fast clipping function for clamping values to bit depth.	2015-01-21 17:53:06 +02:00
Ari Lemmetti	f037ed580c	Improved data layout	2015-01-15 16:31:18 +02:00
Ari Lemmetti	465f718eeb	Move value clipping away from separate loop	2015-01-15 16:14:00 +02:00
Ari Lemmetti	9d12ce21d5	Cleaned luma interpolation, added functions for 8-tap filtering.	2015-01-15 16:13:12 +02:00
Ari Lemmetti	0e56d13b5d	Use smaller bit depth for fractional pixel interpolation	2015-01-15 15:00:09 +02:00
Ari Lemmetti	cc061b4c3d	Added ipol strategy for interpolation filters. Added initial files for AVX2 and generic strategies.	2015-01-15 14:59:37 +02:00
Ari Koivula	fcb6fa6d4b	Fix compilation error on PowerPC. - Need abs from stdlib.	2014-10-21 18:14:32 +03:00

304 commits