Reima Hyvönen
7406c33a42
Some more cleaning
2018-10-26 12:25:18 +03:00
Reima Hyvönen
4c71546b2e
Cleaned some coding
2018-10-26 12:19:44 +03:00
Reima Hyvönen
4fe3909e48
Switched luma to use 32-bit ints instead of 16-bit
2018-10-24 18:24:46 +03:00
Eemeli Kallio
284e73839e
Calculating zero cost moved to its own function
2018-10-16 11:02:01 +03:00
Reima Hyvönen
381e786e10
Trying to find the bug in luma
2018-10-11 18:08:41 +03:00
Marko Viitanen
c589e5ed36
Fix closed-gop frame feed, the ordering was incorrect after the first GOP
2018-10-10 11:12:03 +03:00
Reima Hyvönen
2f5f81bac3
Removed the non-optimized bipred function
2018-10-09 11:19:23 +03:00
Marko Viitanen
75dce4f3ce
Fix low-delay-gop usage with --no-open-gop
2018-10-04 15:16:02 +03:00
Marko Viitanen
de71b58f76
Change closed GOP structure to include an additional IDR between GOPs
2018-10-04 11:17:03 +03:00
Reima Hyvönen
212a8e68fa
Modified to avoid memory overflow, still some bug inside luma
2018-10-02 20:23:32 +03:00
Marko Viitanen
954f07e3d7
Add --(no-)open-gop option
2018-10-02 10:05:32 +03:00
Marko Viitanen
8bef85e056
Merge branch 'set-qp-in-cu'
2018-09-03 08:33:33 +03:00
Ari Lemmetti
2fdcc2b79d
Add option --set-qp-in-cu
2018-09-03 08:32:45 +03:00
Reima Hyvönen
896034b7cf
Renamed some functions back
2018-08-28 15:31:10 +03:00
Reima Hyvönen
e8b5e6db4c
Did some merging
2018-08-28 15:26:27 +03:00
Reima Hyvönen
7de5c74434
Updated bipred_recon to work faster
2018-08-28 15:12:31 +03:00
Reima Hyvönen
47b357cca2
Comment one test
2018-08-27 18:52:14 +03:00
Reima Hyvönen
2ca99a44e8
Updated shuffle operation to be in right order
2018-08-27 18:16:38 +03:00
Marko Viitanen
b85ae3688e
Signal QP in slice header if tiles and slices=tiles are enabled
...
Keeps the PPS constant for various purposes
2018-08-16 08:44:39 +03:00
Reima Hyvönen
508b218a12
some modifications made to prevent reading too much
2018-08-14 10:50:39 +03:00
Reima Hyvönen
1d935ee888
some useless stuff removed
2018-08-13 16:47:11 +03:00
Reima Hyvönen
ce3ac4c05e
some modifications to no_mov
2018-08-13 16:41:02 +03:00
Reima Hyvönen
15a613ae94
test if no_mov breaks testing
2018-08-13 16:02:56 +03:00
Reima Hyvönen
97a2049e58
removed pointer declaration out from switch
2018-08-10 16:42:26 +03:00
Reima Hyvönen
aa94bcedbc
Stream is now pointer
2018-08-10 16:38:49 +03:00
Reima Hyvönen
fa5b227ece
256 to 32 doesn't work, made them by hand
2018-08-10 16:01:20 +03:00
Reima Hyvönen
408dedbcc8
removed _mm256_extract_epi8 and replaced with _mm_stream
2018-08-10 15:53:26 +03:00
Reima Hyvönen
31c35091c6
_mm256_cvtsi256_si32 removed
2018-08-10 10:06:40 +03:00
Reima Hyvönen
99dc43074f
_mm256_cvtsi256_si32 breaks the system, too many bits. Back to extract
2018-08-10 09:59:33 +03:00
Reima Hyvönen
4f1f80b2cb
Transformed convert from 256 to cast 256 -> 128 and then convert from 128
2018-08-09 15:35:54 +03:00
Reima Hyvönen
4957555eb3
Removed leftover from 939
2018-08-09 15:25:03 +03:00
Reima Hyvönen
28b165c971
Clarified some sections, added _MM_SHUFFLE macro
2018-08-09 15:23:01 +03:00
Reima Hyvönen
dd04df8667
testing if error in both avx2 functions
2018-08-03 11:49:00 +03:00
Reima Hyvönen
ed50d71fde
Switched some variables to different location, altered inter_recon_bipred_avx2 function
2018-08-02 16:08:59 +03:00
Reima Hyvönen
f5739a0028
Renaming and removing useless prints
2018-08-02 14:47:17 +03:00
Reima Hyvönen
bc09f59bb6
Edited some definitions
2018-08-02 11:54:53 +03:00
Arttu Ylä-Outinen
83555c3d6d
Enable --fast-residual-cost with fastest presets
2018-07-16 12:31:20 +03:00
Arttu Ylä-Outinen
c438bb4a19
Add an option to skip CABAC for residual costs
...
Adds command line option --fast-residual-cost=<limit>. When QP is below
the limit, estimates the cost of coding the residual coefficients from
the sum of absolute coefficients. Skipping CABAC is not worth it with
high QPs because there are fewer coefficients so CABAC is not as slow.
2018-07-16 12:31:20 +03:00
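Note on c438bb4a19: the estimate described above can be sketched as follows. This is only an illustration, not the code from the commit. The names coeff_t, sum_abs_coeffs and use_fast_residual_cost are hypothetical, and the log does not say how the sum is mapped to a bit cost, so that step is left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical coefficient type; the real encoder may use another width. */
typedef int16_t coeff_t;

/* Sum of absolute coefficient values, the quantity the estimate is based on. */
uint32_t sum_abs_coeffs(const coeff_t *coeff, size_t count)
{
  uint32_t sum = 0;
  for (size_t i = 0; i < count; i++) {
    sum += (uint32_t)abs(coeff[i]);
  }
  return sum;
}

/* The estimate replaces CABAC only when QP is strictly below the limit. */
int use_fast_residual_cost(int qp, int limit)
{
  return qp < limit;
}
```

A caller would check use_fast_residual_cost(qp, limit) first and fall back to the CABAC-based cost when it returns 0.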
Reima Hyvönen
a4bf77f208
Tested some extract functions
2018-07-12 09:29:32 +03:00
Reima Hyvönen
c05033a893
Even more useless vectors removed
2018-07-11 15:09:14 +03:00
Reima Hyvönen
884cb77238
Removed some not used vectors
2018-07-11 15:06:11 +03:00
Reima Hyvönen
792689a5ff
Removed for-loops, added extract instead
2018-07-11 14:56:41 +03:00
Reima Hyvönen
f9c7f6ee66
Added some break operations for AVX2 optimization
2018-07-11 14:15:38 +03:00
Reima Hyvönen
cc064da143
Some more optimization for bipred
2018-07-11 11:27:54 +03:00
Reima Hyvönen
9a339eef89
Merge branch 'bipred_recon' of https://gitlab.tut.fi/TIE/ultravideo/kvazaar into HEAD
...
# Conflicts:
# build/kvazaar_lib/kvazaar_lib.vcxproj
2018-07-10 16:21:04 +03:00
Reima Hyvönen
a22cf03ddb
Added a no-movement function to the AVX2 strategies
2018-07-10 16:07:15 +03:00
Arttu Ylä-Outinen
b7474eb532
Fix SAO buffer sizes
...
Increases sizes of buffers used for SAO reconstruction to avoid stack
buffer overflow in AVX2 SAO reconstruction.
2018-07-05 15:56:30 +03:00
Arttu Ylä-Outinen
b37470e80f
Merge pull request #207 from jbeich/maltivec
...
Unbreak build on PowerPC if AltiVec isn't supported
2018-07-04 11:06:41 +03:00
Reima Hyvönen
ea83ae45f0
Working solution
2018-07-03 11:18:51 +03:00
Jan Beich
4f4bea7496
Check -maltivec is supported before using
...
PowerPC target may lack or have non-standard FPU:
$ cc -dumpmachine
powerpcspe-undermydesk-freebsd
$ cc -c -maltivec -Isrc src/strategies/altivec/picture-altivec.c
src/strategies/altivec/picture-altivec.c:1: error: AltiVec and E500 instructions cannot coexist
2018-07-02 23:25:23 +00:00
Jan Beich
b892d820f8
Clean up macOS includes on powerpc* after 93e1c9f1c3
...
strategyselector.c:426:25: machine/cpu.h: No such file or directory
2018-07-02 21:52:45 +00:00
Reima Hyvönen
17babfffa4
25.6 working optimization, ~50% faster than original
2018-06-25 17:06:16 +03:00
Arttu Ylä-Outinen
2f995f4325
Merge pull request #205 from jbeich/powerpc
...
Unbreak build on non-Linux powerpc*
2018-06-19 13:28:00 +03:00
Arttu Ylä-Outinen
c1398ef818
Permit --period=1 with any GOP structure
...
All intra coding is a special case so it can be permitted even though
Kvazaar normally only supports intra periods that are divisible by the
GOP length.
2018-06-18 12:26:11 +03:00
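Note on c1398ef818: a minimal sketch of the rule described above, assuming the check takes the intra period and the GOP length as plain integers. The function name intra_period_is_valid and the handling of a non-positive GOP length are assumptions for illustration only.

```c
/* Returns 1 if the intra period is acceptable for the given GOP length. */
int intra_period_is_valid(int period, int gop_len)
{
  /* All-intra coding is a special case and is always permitted. */
  if (period == 1) return 1;

  /* Assumption: without a GOP structure there is nothing to align with. */
  if (gop_len <= 0) return 1;

  /* Otherwise the intra period must be divisible by the GOP length. */
  return period % gop_len == 0;
}
```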
Arttu Ylä-Outinen
abdebe0bf9
Fix --owf help message
...
The number of parallel frames is --owf plus one, not --owf minus one.
Fixes #204.
2018-06-18 09:33:36 +03:00
Jan Beich
93e1c9f1c3
Add AltiVec detection for BSDs
...
strategyselector.c:377:26: linux/auxvec.h: No such file or directory
2018-06-17 15:38:24 +00:00
Miika Metsoila
98972d26c2
Document that the high tier requires level 4 or higher
2018-06-14 12:41:03 +03:00
Miika Metsoila
62b44efaa4
Write the encoding tier (main/high) into the bitstream
2018-06-14 12:41:03 +03:00
Arttu Ylä-Outinen
a343f6d587
Prepare for delta QPs at CU-level
...
- Replaces lcu_dqp_enabled with max_qp_delta_depth in encoder_control_t.
- Fixes set_cu_qps so that it can handle quantization groups of
arbitrary size.
- Fixes computation of QP predictors so that it works for quantization
groups of arbitrary size.
2018-06-13 15:36:19 +03:00
Arttu Ylä-Outinen
dc6b2024ea
Modify reference count asserts to fix data races
...
Changes asserts on the reference count of objects to assert the value
after KVZ_ATOMIC_INC instead of directly checking the value. Fixes some
data races detected by TSan.
2018-06-12 09:35:07 +03:00
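Note on dc6b2024ea: the pattern described above, asserting on the value produced by the atomic increment instead of reading the counter separately, might look like the sketch below. It uses C11 atomic_fetch_add as a stand-in for KVZ_ATOMIC_INC; object_t and object_copy_ref are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical reference-counted object. */
typedef struct {
  atomic_int refcount;
} object_t;

/* Takes a new reference to an object that the caller already holds. */
object_t *object_copy_ref(object_t *obj)
{
  /* atomic_fetch_add returns the old value, so +1 is the value after the
   * increment. Asserting on it involves no separate read of the counter
   * that could race with other threads. */
  int new_count = atomic_fetch_add(&obj->refcount, 1) + 1;

  /* The caller held one reference, so the count must now be at least 2. */
  assert(new_count >= 2);
  (void)new_count; /* Silence unused-variable warnings when NDEBUG is set. */

  return obj;
}
```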
Ari Lemmetti
4fb1c16c61
Add early termination for intra rdo when a zero coefficient block is found.
2018-06-08 21:03:07 +03:00
Ari Lemmetti
492529fb7a
Add the same comment to help message as well...
2018-05-30 14:13:15 +03:00
Ari Lemmetti
0d5972bf03
Add missing sort to intra transform split search so mode at 0 is the best
2018-05-21 13:10:38 +03:00
Sebastien Alaiwan
954bca7d6e
Fix memset parameter
2018-05-17 11:24:49 +02:00
Jaakko Laitinen
f9466efcbb
Close file on error
2018-05-15 11:50:16 +03:00
Reima Hyvönen
9fed29f950
Optimization for inter_recon_bipred
2018-04-18 15:25:44 +03:00
Arttu Ylä-Outinen
5c585c4fbc
Update help message
...
Updates the default option values to match the medium preset.
2018-04-03 10:40:37 +03:00
Arttu Ylä-Outinen
2b4e22111a
Update presets
...
The new presets are slower but have better coding efficiency.
2018-04-03 10:37:30 +03:00
Arttu Ylä-Outinen
7185519a1b
Update command line help
...
- Adds missing default values.
- Adds help for --crypto and --key.
- Adds help for --rd=3.
- Adds help for --sao options.
- Some changes to help wording.
2018-03-23 14:33:04 +02:00
Arttu Ylä-Outinen
3606860504
Add --no-cpuid option
...
Equivalent to --cpuid=0.
2018-03-23 12:32:27 +02:00
Arttu Ylä-Outinen
fb462b25ef
Fix transform skip for inter
...
The transform skip flag in cu_info_t was stored under the intra
substruct even though transform skip can be used for inter as well. This
caused bitstream errors. Fixed by moving the flag out of the substruct.
2018-03-20 11:01:33 +02:00
Arttu Ylä-Outinen
b64e46707d
Skip raster scan step in TZ search
...
Raster scan is very slow and the BD-rate improvement is marginal.
2018-03-01 14:04:03 +02:00
Arttu Ylä-Outinen
6877064230
Add zero neighborhood check to TZ search
...
Adds an additional grid search step that starts from the zero motion
vector after the normal grid search. The search range for this step is
half of the normal range.
2018-03-01 14:02:13 +02:00
Arttu Ylä-Outinen
74a413c46a
Switch to star refinement in TZ search
2018-03-01 13:06:14 +02:00
Arttu Ylä-Outinen
ebee428ee1
Add loop termination to TZ grid search
...
Terminates the grid search if no better motion vector was found in the
last three iterations.
2018-03-01 13:06:06 +02:00
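Note on ebee428ee1: the termination rule described above can be illustrated with a small stand-alone loop. This is a sketch under assumptions, not the search code itself. The per-iteration best costs are passed in as an array, and grid_search_iterations is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/* iter_cost[i] is the best cost found by grid iteration i (hypothetical
 * input). *best_cost is updated in place. Returns the number of iterations
 * that were actually run. */
size_t grid_search_iterations(const uint32_t *iter_cost, size_t max_iters,
                              uint32_t *best_cost)
{
  size_t done = 0;
  int stale = 0; /* Consecutive iterations without an improvement. */

  while (done < max_iters) {
    uint32_t cost = iter_cost[done++];
    if (cost < *best_cost) {
      *best_cost = cost;
      stale = 0;
    } else if (++stale == 3) {
      /* No better motion vector in the last three iterations: stop. */
      break;
    }
  }
  return done;
}
```

With this rule, an improvement that would only appear after three unproductive iterations is never reached, which is the trade of a small quality risk for less search time.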
Arttu Ylä-Outinen
4c175621dd
Fix TZ grid search and star refinement
...
- Changes TZ grid search and star refinement to keep the origin constant
instead of moving to the best position after each iteration.
- Changes star refinement to loop until there is no more improvement,
instead of running the step only once.
2018-03-01 12:56:57 +02:00
Arttu Ylä-Outinen
9c2d0074a2
Add rounding of motion vectors in inter search
...
When the starting point for integer motion estimation was selected among
the merge candidates, the candidate motion vectors were always rounded
down. This commit changes the rounding so that they are rounded to the
nearest integer MV instead.
2018-03-01 09:39:21 +02:00
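Note on 9c2d0074a2: the difference between the two rounding modes can be sketched as below, assuming motion vector components are stored in quarter-pixel units as in HEVC. The function names are hypothetical, and the division helper is written out so that the result is well defined for negative values.

```c
#include <stdint.h>

/* Floor division by 4 that is well defined for negative values. */
static int32_t floor_div4(int32_t v)
{
  return (v >= 0) ? v / 4 : -((-v + 3) / 4);
}

/* Old behaviour: always round toward negative infinity. */
int32_t mv_round_down(int32_t mv_qpel)
{
  return floor_div4(mv_qpel);
}

/* New behaviour: round to the nearest integer MV by adding half a pixel
 * (2 quarter-pixels) before dividing. */
int32_t mv_round_nearest(int32_t mv_qpel)
{
  return floor_div4(mv_qpel + 2);
}
```

For a component of 7 quarter-pixels (1.75 pixels), rounding down gives 1 while rounding to nearest gives 2.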
Ari Lemmetti
662430d441
Select CU type based on SSD, transform unit tree and mode cost of luma and chroma on --rd=2
2018-02-22 19:26:48 +02:00
Arttu Ylä-Outinen
cb06cfeadb
Drop temporary arrays in bipred search
...
Changes bipred search to use the original source and reconstruction
arrays directly instead of copying them.
2018-02-14 11:20:51 +02:00
Arttu Ylä-Outinen
0ea516ba30
Move bipred search to a separate function
2018-02-14 09:56:53 +02:00
Arttu Ylä-Outinen
6f506be12d
Drop dynamic allocation from bipred search
...
Moves the temporary LCU struct used in bipred search from the heap to
the stack. The single malloc call was a huge bottleneck in bipred.
2018-02-14 09:55:02 +02:00
Arttu Ylä-Outinen
7155dd0db7
Add negative references to L1 list
...
Changes reference index list creation so that the negative references
are added to L1 in addition to L0 when biprediction is enabled and no
reordering of pictures is done. Biprediction can now be used with the
low-delay GOP structure.
2018-02-07 14:54:52 +02:00
Arttu Ylä-Outinen
4b24cd03a2
Update for crypto++ 6.0.0 compatibility
...
Changes the crypto module to use unsigned char instead of byte. The byte
typedef is no longer included in the global namespace in crypto++ 6.0.0.
See https://github.com/weidai11/cryptopp/issues/442 .
Fixes #184.
2018-02-05 13:35:03 +02:00
Arttu Ylä-Outinen
8c53417006
Check zero coefficient cost for inter
...
Checks the cost of flushing all coefficients of an inter block to zero.
This is much faster than doing full RDOQ but can still reduce bitrate
significantly. Encoding speed is increased since fewer coefficient bits
have to be coded with CABAC.
2018-01-29 12:41:56 +02:00
Arttu Ylä-Outinen
018b5ffa64
Move inter CU reconstruction to a new function
...
Moves code for reconstructing all PUs in an inter CU to a new function
kvz_inter_recon_cu in inter.c.
2018-01-24 15:05:39 +02:00
Arttu Ylä-Outinen
405b8c1069
Refactor inter MVD cost functions
...
Moves duplicate code for writing the MVD of a single motion vector from
kvz_get_mvd_coding_cost_cabac and encoder_inter_prediction_unit to a new
function.
2018-01-19 08:29:17 +02:00
Arttu Ylä-Outinen
c1cca1ad7f
Refactor inter MV candidate selection
...
Moves duplicate code for checking the best MV candidate from functions
calc_mvd_cost, search_pu_inter_ref and search_pu_inter to a new
function.
2018-01-19 08:29:17 +02:00
Arttu Ylä-Outinen
9067aa4535
Remove an unnecessary copy in SMP/AMP search
...
SMP/AMP search is performed using a lower work tree level than the
normal inter search so the prediction info must be copied up if an
SMP/AMP mode is chosen. Previously pixels and coefficients were copied as
well. Changed to only copy prediction info.
2018-01-18 10:36:26 +02:00
Arttu Ylä-Outinen
89a930d6dd
Add part mode bitcost when using SMP/AMP blocks
2018-01-18 10:36:26 +02:00
Arttu Ylä-Outinen
fc43643ba5
Use a transform split for SMP and AMP blocks
2018-01-18 10:36:25 +02:00
Arttu Ylä-Outinen
c74ede148b
Fix CBF flags for 4x4 luma blocks
...
CBF flags were not being propagated to the upper level from blocks of
size 4x4.
2018-01-18 10:36:25 +02:00
Arttu Ylä-Outinen
0a69e6d18f
Fix selection of transform function for 4x4 blocks
...
DST function was returned for inter luma transform blocks of size 4x4
even though they must use DCT. Fixed by checking the prediction mode of
the block in addition to whether it is chroma or luma.
2018-01-18 10:36:25 +02:00
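Note on 0a69e6d18f: the corrected selection rule can be sketched as follows. In HEVC, DST is used only for 4x4 intra luma transform blocks, and everything else uses DCT. The enum and function names are hypothetical and do not mirror the encoder's actual types.

```c
/* Hypothetical types for illustration. */
typedef enum { COLOR_Y, COLOR_U, COLOR_V } color_t;
typedef enum { PRED_INTRA, PRED_INTER } pred_mode_t;
typedef enum { TRANSFORM_DCT, TRANSFORM_DST } transform_t;

/* Picks the transform type for a square block of the given width. */
transform_t select_transform(int width, color_t color, pred_mode_t mode)
{
  /* Checking luma alone is not enough: the prediction mode must be intra
   * as well, otherwise 4x4 inter luma blocks would wrongly get DST. */
  if (width == 4 && color == COLOR_Y && mode == PRED_INTRA) {
    return TRANSFORM_DST;
  }
  return TRANSFORM_DCT;
}
```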
Miika Metsoila
bcedfd6669
Remove the usage of errno in me-steps argument parsing
2018-01-16 14:38:43 +02:00
Miika Metsoila
39ed36830e
Merge branch 'me_steps'
2018-01-16 14:22:59 +02:00
Miika Metsoila
61213e3ad9
Improve step parameter parsing and usage
2018-01-10 15:16:52 +02:00
Arttu Ylä-Outinen
649113a821
Fix inter search being used for 4x4 blocks
...
When 4x4 intra blocks are enabled and inter search is limited to 16x16
and larger blocks, it is possible that inter search is accidentally done
for 4x4 blocks. Fixed by checking that block size is at least 8x8 before
doing inter search.
2018-01-10 14:21:48 +02:00
Miika Metsoila
e8e0e7596a
Add a step-cutoff parameter for motion estimation search
2017-12-22 14:04:25 +02:00
Miika Metsoila
4e13608b01
Merge branch 'diamond_search'
2017-12-18 14:11:53 +02:00
Miika Metsoila
2cde0d1a18
Document diamond search option
2017-12-12 14:45:01 +02:00
Miika Metsoila
b923b63b42
Add diamond search
2017-12-12 14:40:14 +02:00