Ari Lemmetti
b234897e8a
Fix smp and amp blocks in fme and revert previous change.
...
Filter 8x8 (sub)blocks even with 8x4, 4x8, 16x4, 4x16 etc.
Calculate SATD on the 8x4, ... part
2018-12-19 21:30:53 +02:00
Ari Lemmetti
a832206bb6
Replace 32-bit incompatible intrinsics
2018-11-12 18:54:33 +02:00
Ari Lemmetti
5c774c4105
Rewrite most of FME and interpolation filters
...
Changes had to break a lot of stuff and were just squashed into this horrible code dump
2018-11-08 20:21:16 +02:00
Arttu Ylä-Outinen
2c66e0bbd2
Fix warnings about invalid reads in AVX2 ipol
...
AVX2 filter functions read pixels in chunks of 8 or 16 bytes. At the end
of the block, the read goes out of the bounds of the pixels array. The
extra pixels do not affect the result.
Fixes valgrind complaining about the invalid reads by allocating 5 extra
pixels in kvz_get_extended_block_avx2
2017-06-22 09:37:55 +03:00
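The over-read fix described in the commit above can be illustrated with a minimal sketch. It only assumes what the message states: loads happen in fixed-size chunks, and 5 extra pixels are allocated after the block. The names `padded_block_size`, `alloc_padded_block` and `load8` are hypothetical and are not taken from the kvazaar sources.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef uint8_t pixel;

/* Extra pixels appended after the block; the value 5 is the one
 * quoted in the commit message. */
#define OVERREAD_PIXELS 5

/* Hypothetical helper: number of pixels to allocate for a block of
 * `rows` rows with `stride` pixels per row, plus the over-read tail. */
static size_t padded_block_size(size_t stride, size_t rows)
{
  return stride * rows + OVERREAD_PIXELS;
}

/* Hypothetical helper: allocate a zero-initialized block with the tail. */
static pixel *alloc_padded_block(size_t stride, size_t rows)
{
  return calloc(padded_block_size(stride, rows), sizeof(pixel));
}

/* Scalar stand-in for an 8-byte vector load. It always reads 8 bytes,
 * even when fewer of them are used by the caller. */
static uint64_t load8(const pixel *p)
{
  uint64_t v;
  memcpy(&v, p, sizeof v);
  return v;
}
```

With an 11x11 block, an 8-byte load starting 3 pixels before the end of the last row touches indices 118 to 125, which stays inside the 126 pixels allocated here. Without the tail the same load would run past the buffer.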
Ari Lemmetti
28c4174d0e
Fix incorrect shuffle parameters
...
_MM_SHUFFLE uses reverse order
2016-08-23 19:40:46 +03:00
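The "reverse order" remark in the commit above refers to the argument order of `_MM_SHUFFLE`: the first argument selects the highest output element, the last argument the lowest. The sketch below models this in portable C. `SHUFFLE_IMM` mirrors the bit layout of `_MM_SHUFFLE` from `<xmmintrin.h>`, and `shuffle_epi32_model` is a hypothetical scalar model of `_mm_shuffle_epi32`; neither name comes from the kvazaar sources.

```c
#include <stdint.h>

/* Same bit layout as _MM_SHUFFLE(e3, e2, e1, e0): two bits per output
 * element, with e0 in the lowest bits. Redefined here so the sketch
 * compiles without SSE headers. */
#define SHUFFLE_IMM(e3, e2, e1, e0) \
  (((e3) << 6) | ((e2) << 4) | ((e1) << 2) | (e0))

/* Scalar model of _mm_shuffle_epi32: output element i takes the source
 * element selected by bits [2i+1 : 2i] of the immediate. */
static void shuffle_epi32_model(const int32_t src[4], int32_t dst[4], int imm)
{
  for (int i = 0; i < 4; i++) {
    dst[i] = src[(imm >> (2 * i)) & 3];
  }
}
```

Under this model `SHUFFLE_IMM(3, 2, 1, 0)` is the identity permutation and `SHUFFLE_IMM(0, 1, 2, 3)` reverses the four elements, so a selector list written in memory order (lowest element first) has to be passed in reverse.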
Ari Lemmetti
ce77bfa15b
Replace KVZ_PERMUTE with _MM_SHUFFLE
...
The same exact macro already exists
2016-08-22 19:08:46 +03:00
Ari Lemmetti
6bcba004ff
Comment out to fix unused code error on clang.
2016-07-14 14:12:16 +03:00
Ari Lemmetti
c0979ebdcb
Implement AVX2 luma sampling
2016-07-14 12:53:02 +03:00
Ari Lemmetti
6244560426
Add avx2 strategy for kvz_filter_frac_blocks_luma.
2016-07-14 12:53:02 +03:00
Ari Lemmetti
9c4e9e049b
Load only what is needed. Eliminate latency from hadds.
2016-07-14 12:53:01 +03:00
Ari Lemmetti
3107a93eaf
Fix avx2 chroma sampling for amp
2016-05-17 14:09:57 +03:00
Ari Lemmetti
efbdc5dade
Utilize registers more efficiently for 8x8 and larger blocks
2016-04-21 13:26:38 +03:00
Ari Lemmetti
192cee95b2
Vectorize vertical filtering
2016-04-21 13:26:38 +03:00
Ari Lemmetti
0be35f72b8
Filter 4 pixels simultaneously in x direction
2016-04-21 13:26:38 +03:00
Ari Koivula
61fc3e87ba
Run include-what-you-use fix_includes.py
...
The includes should make more sense now and not just happen to compile
due to headers included from other headers.
Used a modified version of IWYU. Modifications were to attribute int8_t
and so on to stdint.h instead of sys/types.h and immintrin.h instead of
more specific headers.
include-what-you-use 0.7 (git:b70df35)
based on clang version 3.9.0 (trunk 264728)
2016-04-01 17:46:55 +03:00
Ari Koivula
8908d85d66
Change all relative includes to absolute
2016-04-01 17:46:44 +03:00
Arttu Ylä-Outinen
4402e251ae
Fix kvz_get_extended_block functions.
...
The buffers allocated in functions kvz_get_extended_block_avx2 and
kvz_get_extended_block_generic were too small when the width of the
block was less than its height. Fixed to allocate correctly sized
buffers.
2015-12-15 11:21:43 +02:00
Ari Lemmetti
57ea7d223b
Pass SIMD registers to functions as pointers to fix 32-bit compilation in Visual Studio
2015-11-04 10:51:26 +02:00
Arttu Ylä-Outinen
3a10e9e3e0
Prefix all non-static symbols with "kvz_".
2015-08-26 13:02:28 +03:00
Ari Lemmetti
923f4a74d5
Fix filtering over limits
2015-08-17 17:39:56 +03:00
Ari Lemmetti
82cf4e8ff4
Output error messages to stderr
2015-08-17 15:01:46 +03:00
Ari Lemmetti
3da71b62bf
Add checks if malloc fails
2015-08-17 15:01:46 +03:00
Ari Lemmetti
4718fe7fda
Change variable names to match used convention
2015-08-17 15:01:46 +03:00
Ari Lemmetti
6a5eaf08de
Rename extend_borders to get_extended_block. Add kvz_ prefix to type definition.
2015-08-17 15:01:46 +03:00
Ari Lemmetti
d82582c37c
Changes to extend border function.
...
Now outputs a pointer to a block with guaranteed padding for filtering.
Only generate extra pixels if samples are needed out of bounds.
Use memcpy otherwise.
2015-08-17 15:01:46 +03:00
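The behaviour described in the commit above (padding generated only when samples fall outside the frame, plain `memcpy` otherwise) might look roughly like the following sketch. It is an illustration under assumptions, not the kvazaar implementation: `get_padded_block`, its parameters, and the choice of edge-pixel replication for out-of-bounds samples are all hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef uint8_t pixel;

static int clampi(int v, int lo, int hi)
{
  return v < lo ? lo : (v > hi ? hi : v);
}

/* Hypothetical helper: return a newly allocated (w + 2*pad) x (h + 2*pad)
 * copy of the window around the block at (x, y), or NULL if malloc fails.
 * The caller frees the result. */
static pixel *get_padded_block(const pixel *frame, int frame_w, int frame_h,
                               int x, int y, int w, int h, int pad)
{
  int out_w = w + 2 * pad;
  int out_h = h + 2 * pad;
  int x0 = x - pad;
  int y0 = y - pad;
  pixel *out = malloc((size_t)out_w * (size_t)out_h);
  if (!out) return NULL;

  /* True when the whole padded window lies inside the frame. */
  int inside = x0 >= 0 && y0 >= 0 &&
               x0 + out_w <= frame_w && y0 + out_h <= frame_h;

  for (int row = 0; row < out_h; row++) {
    pixel *dst = out + (size_t)row * out_w;
    if (inside) {
      /* Fast path: copy the row directly. */
      memcpy(dst, frame + (size_t)(y0 + row) * frame_w + x0, (size_t)out_w);
    } else {
      /* Slow path: replicate the nearest edge pixel (an assumption). */
      int sy = clampi(y0 + row, 0, frame_h - 1);
      for (int col = 0; col < out_w; col++) {
        int sx = clampi(x0 + col, 0, frame_w - 1);
        dst[col] = frame[(size_t)sy * frame_w + sx];
      }
    }
  }
  return out;
}
```

The sketch always allocates a copy, which keeps ownership simple for the caller; an implementation could just as well return a pointer into the frame when no padding is needed.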
Ari Lemmetti
5d96dbc6c0
Make strategy selection use bit depth given via parameter instead of excluding registration with defines
2015-08-12 13:33:38 +03:00
Ari Lemmetti
4122f36089
Prevent the registration of strategies that are incompatible when KVZ_BIT_DEPTH != 8
...
Remove unnecessary or misleading mentions of "8bit"
2015-08-12 11:29:53 +03:00
Arttu Ylä-Outinen
f7f17a060c
Rename pixel_t to kvz_pixel.
2015-07-02 16:58:28 +03:00
Arttu Ylä-Outinen
fab07d80da
Rename macro BIT_DEPTH to KVZ_BIT_DEPTH.
2015-07-02 16:55:47 +03:00
Marko Viitanen
8ed5d06ebe
Fixed compiler warnings caused by the bipred branch merge
2015-04-23 15:12:48 +03:00
Ari Lemmetti
b9ec4b0a54
AVX2 acceleration for new luma filtering.
2015-03-11 15:33:38 +02:00
Ari Koivula
ded6fd9ee8
Renamed typedef pixel to pixel_t.
2015-03-04 16:35:53 +02:00
Ari Koivula
f6147b410a
Rename struct encoder_control to encoder_control_t.
...
Conflicts:
src/encoder_state-geometry.h
src/encoderstate.h
2015-03-04 14:01:14 +02:00
Ari Koivula
d7383ccb25
Change license to LGPL.
...
- Everyone who has contributed code to the project has been asked to license
their contributions under LGPL and they have agreed.
- COPYING file changed to say LGPLv2.1 instead of GPLv2.
- GPL changed to LGPL in the header of every single file that had a header, and
a header added to the few that were missing one.
- Also.. Happy new year!
2015-02-25 15:19:05 +02:00
Ari Lemmetti
7430622038
Copy ipol-generic strategy as a base for avx2 strategy
2015-02-05 13:28:07 +02:00
Ari Lemmetti
0e56d13b5d
Use smaller bit depth for fractional pixel interpolation
2015-01-15 15:00:09 +02:00
Ari Lemmetti
cc061b4c3d
Added ipol strategy for interpolation filters.
...
Added initial files for AVX2 and generic strategies.
2015-01-15 14:59:37 +02:00