Pauli Oikkonen | c36482a11a | Fix bug in 24-wide SAD. *facepalm* | 2019-02-04 20:41:40 +02:00
Pauli Oikkonen | f781dc31f0 | Create strategy for ver_sad. Easy to vectorize | 2019-02-04 20:41:40 +02:00
Pauli Oikkonen | 91cb0fbd45 | Create strategy for directly obtaining pointer to constant-width SAD function | 2019-02-04 20:41:40 +02:00
Pauli Oikkonen | 94035be342 | Unify unrolling naming conventions | 2019-02-04 20:41:40 +02:00
Pauli Oikkonen | 517a4338f6 | Unroll SSE SAD for 8-wide blocks to process 4 lines at once | 2019-02-04 20:41:40 +02:00
Pauli Oikkonen | 0f665b28f6 | Unroll arbitrary width SSE4.1 SAD by 4 | 2019-02-04 20:41:40 +02:00
Pauli Oikkonen | 84cf771dea | Unroll 32 and 16 wide SAD vector implementations by 4 | 2019-02-04 20:41:40 +02:00
Pauli Oikkonen | 5df5c5f8a4 | Cast all pointers to const types in vector SAD funcs. Also tidy up the pointer arithmetic | 2019-02-04 20:41:40 +02:00
Pauli Oikkonen | a711ce3df5 | Inline fixed width vectorized SAD functions | 2019-02-04 20:41:40 +02:00
Pauli Oikkonen | 4cb371184b | Add SSE4.1 strategy for 24px wide SAD and an AVX2 strategy for 16 | 2019-02-04 20:41:40 +02:00
Pauli Oikkonen | 796568d9cc | Add SSE4.1 strategies for SAD on widths 4 and 12 and AVX2 strategies for 32 and 64 | 2019-02-04 20:41:40 +02:00
Pauli Oikkonen | 2eaa7bc9d2 | Move SSE4.1 SAD functions to separate header | 2019-02-04 20:41:40 +02:00