Pauli Oikkonen
|
7c69a26717
|
Use aligned loads and stores for AVX2 DCT
|
2019-10-28 16:19:42 +02:00 |
|
Pauli Oikkonen
|
8e9c65dca6
|
Align DCT matrices and temp transform buffers
|
2019-10-28 16:19:42 +02:00 |
|
Pauli Oikkonen
|
148a150522
|
Align DCT source and dest blocks to cache line
|
2019-10-28 16:19:42 +02:00 |
|
Pauli Oikkonen
|
8e60bbf6a6
|
Slightly tune 16x16 forward DCT
Use an array of __m256i's to store temporary value, essentially letting
the compiler enforce alignment and use aligned loads and stores.
|
2019-10-28 16:19:42 +02:00 |
|
Pauli Oikkonen
|
c0cc0e8a75
|
Optimize 16x16 multiply by only slicing right mat once
|
2019-10-28 16:19:42 +02:00 |
|
Pauli Oikkonen
|
e463d27f22
|
Implement streamlined generic 16x16 matrix multiply
It can't be this fast for real, can it?
|
2019-10-28 16:19:42 +02:00 |
|
Pauli Oikkonen
|
beb85ce9d6
|
Reorder parameters for 8x8 matrix multiplies
|
2019-10-28 16:19:42 +02:00 |
|
Pauli Oikkonen
|
292af62256
|
Implement tailored 16x16 forward DCT
|
2019-10-28 16:19:42 +02:00 |
|
Pauli Oikkonen
|
30ce461d98
|
Redo 4x4 matrix multiplication
|
2019-10-28 16:19:42 +02:00 |
|
Pauli Oikkonen
|
07970ea82f
|
Streamline by-the-book 8x8 matrix multiplication
Also chop up the forward transform into two tailored multiply functions
|
2019-10-28 16:19:42 +02:00 |
|
Pauli Oikkonen
|
7ec7ab3361
|
Implement a tailored AVX2 8x8 DCT
|
2019-10-28 16:19:42 +02:00 |
|
Joose Sainio
|
372934c7db
|
Fix division by zero
|
2019-10-10 16:35:56 +03:00 |
|
Joose Sainio
|
9bdfdeaf5c
|
Rest of the owl
|
2019-10-09 15:48:58 +03:00 |
|
Joose Sainio
|
1ba8525faf
|
WIP
|
2019-10-09 10:35:07 +03:00 |
|
Joose Sainio
|
19496d2692
|
?
|
2019-10-03 14:50:11 +03:00 |
|
Joose Sainio
|
4b111e339e
|
Fix a couple of bugs in the implementation; bit calculation still seems a bit off
|
2019-10-01 15:08:39 +03:00 |
|
Joose Sainio
|
84615e406a
|
fix compiler warnings
|
2019-09-27 14:20:08 +03:00 |
|
Joose Sainio
|
14b7a75713
|
Call the new functions and fix bugs
|
2019-09-27 14:14:24 +03:00 |
|
Joose Sainio
|
ef74bfb182
|
unify naming
|
2019-09-27 10:16:21 +03:00 |
|
Joose Sainio
|
e36f481bda
|
qp calculation for frame
|
2019-09-27 09:05:40 +03:00 |
|
Joose Sainio
|
47019ca1cd
|
intra ck update
|
2019-09-26 16:04:53 +03:00 |
|
Joose Sainio
|
7c8f4da7cb
|
Update c and k except after first intra
|
2019-09-26 13:09:28 +03:00 |
|
Joose Sainio
|
0577d481c1
|
CTU level code
|
2019-09-25 12:12:21 +03:00 |
|
Marko Viitanen
|
ad7c8d40bc
|
Merge pull request #247 from pkubaj/master
Fix build on powerpc64 with LLVM
|
2019-09-12 16:11:19 +03:00 |
|
pkubaj
|
1d7fcf4227
|
Fix build on powerpc64 with LLVM
|
2019-09-12 15:05:00 +02:00 |
|
mercat
|
0de567bfa4
|
Fix memory leak
|
2019-09-12 09:45:32 +03:00 |
|
mercat
|
fa116de619
|
Add static
|
2019-09-11 16:18:12 +03:00 |
|
mercat
|
5cb2fbba16
|
Merge branch 'ML-cplx_red_ICIP' of gitlab.tut.fi:TIE/ultravideo/kvazaar into ML-cplx_red_ICIP
|
2019-09-11 16:12:47 +03:00 |
|
mercat
|
b8753a9293
|
Fucking INLINE fixed
|
2019-09-11 16:12:07 +03:00 |
|
mercat
|
b855144e68
|
INLINE fixe
|
2019-09-11 16:12:07 +03:00 |
|
mercat
|
694337b803
|
Add const and more const
|
2019-09-11 16:12:07 +03:00 |
|
mercat
|
21c07638ed
|
Remove const into kvz_init_constraint.
|
2019-09-11 16:12:06 +03:00 |
|
mercat
|
2bca507abe
|
Clean version of machine learning constraint code. (ICIP paper)
|
2019-09-11 16:12:06 +03:00 |
|
Alexandre Mercat
|
0f4b7be6ee
|
First version of ML ICIP code for master
|
2019-09-11 16:12:06 +03:00 |
|
mercat
|
808eb4ff96
|
Fucking INLINE fixed
|
2019-09-11 16:08:31 +03:00 |
|
mercat
|
35fd556321
|
INLINE fixe
|
2019-09-11 16:05:31 +03:00 |
|
mercat
|
5beb23d91c
|
Add const and more const
|
2019-09-11 16:03:03 +03:00 |
|
mercat
|
6cda8036c9
|
Remove const into kvz_init_constraint.
|
2019-09-11 15:57:15 +03:00 |
|
mercat
|
1dac29d9a0
|
Clean version of machine learning constraint code. (ICIP paper)
|
2019-09-11 15:49:56 +03:00 |
|
Marko Viitanen
|
4007485420
|
Update the ffmpeg version used in the tests
|
2019-09-11 14:52:30 +03:00 |
|
Marko Viitanen
|
da5dca057d
|
Change libtool path in tests to fix travis builds
|
2019-09-11 09:33:43 +03:00 |
|
Pauli Oikkonen
|
99597b828a
|
Work around the ancient Win32 calling convention hassle
See if this'll work now
|
2019-09-06 13:14:42 +03:00 |
|
Pauli Oikkonen
|
c5ca18950c
|
Revert "Revert to 6924d90052 due to broken visual studio build"
This reverts commit 1dd0619bd7.
|
2019-09-05 18:21:55 +03:00 |
|
Pauli Oikkonen
|
55529decd5
|
Implement _mm256_insert_epi32 and extract pseudo-ops
Visual Studio headers apparently lack these guys
|
2019-09-05 18:20:52 +03:00 |
|
Ari Lemmetti
|
4e94d60552
|
Merge branch 'smp-merge-analysis'
|
2019-09-03 16:47:07 +03:00 |
|
Ari Lemmetti
|
147378e1f9
|
Prevent 8x4 and 4x8 bipred in merge analysis
|
2019-09-03 16:32:50 +03:00 |
|
Ari Lemmetti
|
ef1fdbf259
|
Separate prediction of single PU/PB from CU/CB
|
2019-09-03 16:32:50 +03:00 |
|
Joose Sainio
|
7d2737bdf6
|
WIP picture lambda calculation
|
2019-09-03 11:03:35 +03:00 |
|
Ari Lemmetti
|
3bc510712f
|
Enable merge analysis for smp and amp
|
2019-09-02 17:31:51 +03:00 |
|
Ari Lemmetti
|
557bcbc6aa
|
Make luma or chroma only inter "recon" or predict possible
|
2019-09-02 17:15:28 +03:00 |
|