Commit graph

2555 commits

Author SHA1 Message Date
Reima Hyvönen aa94bcedbc Stream is now pointer 2018-08-10 16:38:49 +03:00
Reima Hyvönen fa5b227ece 256 to 32 doesn't work, made them by hand 2018-08-10 16:01:20 +03:00
Reima Hyvönen 408dedbcc8 removed _mm256_extract_epi8 and replaced with _mm_stream 2018-08-10 15:53:26 +03:00
Reima Hyvönen 31c35091c6 _mm256_cvtsi256_si32 removed 2018-08-10 10:06:40 +03:00
Reima Hyvönen 99dc43074f _mm256_cvtsi256_si32 breaks system, too many bits. Back to extract 2018-08-10 09:59:33 +03:00
Reima Hyvönen 4f1f80b2cb Transformed convert from 256 to cast 256 -> 128 and then convert from 128 2018-08-09 15:35:54 +03:00
Reima Hyvönen 4957555eb3 Removed leftover from 939 2018-08-09 15:25:03 +03:00
Reima Hyvönen 28b165c971 Clarified some sections, added _MM_SHUFFLE macro 2018-08-09 15:23:01 +03:00
Reima Hyvönen dd04df8667 testing if error in both avx2 functions 2018-08-03 11:49:00 +03:00
Reima Hyvönen ed50d71fde Switched some variables to different location, altered inter_recon_bipred_avx2 function 2018-08-02 16:08:59 +03:00
Reima Hyvönen f5739a0028 Renaming and removing useless prints 2018-08-02 14:47:17 +03:00
Reima Hyvönen bc09f59bb6 Edited some definitions 2018-08-02 11:54:53 +03:00
Arttu Ylä-Outinen 83555c3d6d Enable --fast-residual-cost with fastest presets 2018-07-16 12:31:20 +03:00
Arttu Ylä-Outinen c438bb4a19 Add an option to skip CABAC for residual costs
Adds command line option --fast-residual-cost=<limit>. When QP is below
the limit, estimates the cost of coding the residual coefficients from
the sum of absolute coefficients. Skipping CABAC is not worth it with
high QPs because there are fewer coefficients so CABAC is not as slow.
2018-07-16 12:31:20 +03:00
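Note on c438bb4a19: the estimate is described as being derived from the sum of absolute coefficients. A minimal sketch of that sum is shown below. The function name is hypothetical, the 16-bit coefficient type is an assumption, and the commit message does not say how the sum is mapped to a bit cost, so that step is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not the Kvazaar implementation: returns the sum of
 * the absolute values of `count` residual coefficients. */
uint32_t coeff_abs_sum_sketch(const int16_t *coeffs, size_t count)
{
  assert(coeffs != NULL || count == 0);
  uint32_t sum = 0;
  for (size_t i = 0; i < count; ++i) {
    int32_t c = coeffs[i];  /* widen first so that -32768 negates safely */
    sum += (uint32_t)(c < 0 ? -c : c);
  }
  return sum;
}
```

A caller would compare QP against the --fast-residual-cost limit first and fall back to CABAC-based costing when QP is at or above it.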
Reima Hyvönen a4bf77f208 Tested some extract functions 2018-07-12 09:29:32 +03:00
Reima Hyvönen c05033a893 Even more useless vectors removed 2018-07-11 15:09:14 +03:00
Reima Hyvönen 884cb77238 Removed some not used vectors 2018-07-11 15:06:11 +03:00
Reima Hyvönen 792689a5ff Removed for-loops, added extract instead 2018-07-11 14:56:41 +03:00
Reima Hyvönen f9c7f6ee66 Added some break-operations for avx2 optimization 2018-07-11 14:15:38 +03:00
Reima Hyvönen cc064da143 some more optimization for bipred 2018-07-11 11:27:54 +03:00
Reima Hyvönen 9a339eef89 Merge branch 'bipred_recon' of https://gitlab.tut.fi/TIE/ultravideo/kvazaar into HEAD
# Conflicts:
#	build/kvazaar_lib/kvazaar_lib.vcxproj
2018-07-10 16:21:04 +03:00
Reima Hyvönen a22cf03ddb Updated to have no movement function to avx2 strategies 2018-07-10 16:07:15 +03:00
Arttu Ylä-Outinen b7474eb532 Fix SAO buffer sizes
Increases sizes of buffers used for SAO reconstruction to avoid stack
buffer overflow in AVX2 SAO reconstruction.
2018-07-05 15:56:30 +03:00
Arttu Ylä-Outinen b37470e80f
Merge pull request #207 from jbeich/maltivec
Unbreak build on PowerPC if AltiVec isn't supported
2018-07-04 11:06:41 +03:00
Reima Hyvönen ea83ae45f0 Working solution 2018-07-03 11:18:51 +03:00
Jan Beich 4f4bea7496 Check -maltivec is supported before using
PowerPC target may lack or have non-standard FPU:

$ cc -dumpmachine
powerpcspe-undermydesk-freebsd
$ cc -c -maltivec -Isrc src/strategies/altivec/picture-altivec.c
src/strategies/altivec/picture-altivec.c:1: error: AltiVec and E500 instructions cannot coexist
2018-07-02 23:25:23 +00:00
Jan Beich b892d820f8 Clean up macOS includes on powerpc* after 93e1c9f1c3
strategyselector.c:426:25: machine/cpu.h: No such file or directory
2018-07-02 21:52:45 +00:00
Reima Hyvönen 17babfffa4 25.6 working optimization, ~50% faster than original 2018-06-25 17:06:16 +03:00
Arttu Ylä-Outinen 2f995f4325
Merge pull request #205 from jbeich/powerpc
Unbreak build on non-Linux powerpc*
2018-06-19 13:28:00 +03:00
Arttu Ylä-Outinen c1398ef818 Permit --period=1 with any GOP structure
All intra coding is a special case so it can be permitted even though
Kvazaar normally only supports intra periods that are divisible by the
GOP length.
2018-06-18 12:26:11 +03:00
Arttu Ylä-Outinen abdebe0bf9 Fix --owf help message
The number of parallel frames is --owf plus one, not --owf minus one.

Fixes #204.
2018-06-18 09:33:36 +03:00
Jan Beich 93e1c9f1c3 Add AltiVec detection for BSDs
strategyselector.c:377:26: linux/auxvec.h: No such file or directory
2018-06-17 15:38:24 +00:00
Miika Metsoila 98972d26c2 Document that the high tier requires level 4 or higher 2018-06-14 12:41:03 +03:00
Miika Metsoila 62b44efaa4 Write the encoding tier (main/high) into the bitstream 2018-06-14 12:41:03 +03:00
Arttu Ylä-Outinen a343f6d587 Prepare for delta QPs at CU-level
- Replaces lcu_dqp_enabled with max_qp_delta_depth in encoder_control_t.
- Fixes set_cu_qps so that it can handle quantization groups of
  arbitrary size.
- Fixes computation of QP predictors so that it works for quantization
  groups of arbitrary size.
2018-06-13 15:36:19 +03:00
Arttu Ylä-Outinen dc6b2024ea Modify reference count asserts to fix data races
Changes asserts on the reference count of objects to assert the value
after KVZ_ATOMIC_INC instead of directly checking the value. Fixes some
data races detected by TSan.
2018-06-12 09:35:07 +03:00
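Note on dc6b2024ea: the pattern of asserting on the value produced by the atomic increment, rather than reading the counter separately, can be sketched as follows. This uses C11 `atomic_fetch_add` as a stand-in for KVZ_ATOMIC_INC, and the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical reference-counted object. */
typedef struct {
  atomic_int refcount;
} refcounted_sketch_t;

/* Take a new reference. The assert checks the value obtained from the
 * atomic increment itself instead of reading the counter a second time. */
int refcounted_sketch_retain(refcounted_sketch_t *obj)
{
  int new_count = atomic_fetch_add(&obj->refcount, 1) + 1;
  assert(new_count > 1);  /* the caller must already hold a reference */
  return new_count;
}
```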
Ari Lemmetti 4fb1c16c61 Add early termination for intra rdo when a zero coefficient block is found. 2018-06-08 21:03:07 +03:00
Ari Lemmetti 492529fb7a Add the same comment to help message as well... 2018-05-30 14:13:15 +03:00
Ari Lemmetti 0d5972bf03 Add missing sort to intra transform split search so mode at 0 is the best 2018-05-21 13:10:38 +03:00
Sebastien Alaiwan 954bca7d6e Fix memset parameter 2018-05-17 11:24:49 +02:00
Jaakko Laitinen f9466efcbb Close file on error 2018-05-15 11:50:16 +03:00
Reima Hyvönen 9fed29f950 optimization for inter_recon_bipred 2018-04-18 15:25:44 +03:00
Arttu Ylä-Outinen 5c585c4fbc Update help message
Updates the default option values to match the medium preset.
2018-04-03 10:40:37 +03:00
Arttu Ylä-Outinen 2b4e22111a Update presets
The new presets are slower but have better coding efficiency.
2018-04-03 10:37:30 +03:00
Arttu Ylä-Outinen 7185519a1b Update command line help
- Adds missing default values.
- Adds help for --crypto and --key.
- Adds help for --rd=3.
- Adds help for --sao options.
- Some changes to help wording.
2018-03-23 14:33:04 +02:00
Arttu Ylä-Outinen 3606860504 Add --no-cpuid option
Equivalent to --cpuid=0.
2018-03-23 12:32:27 +02:00
Arttu Ylä-Outinen fb462b25ef Fix transform skip for inter
The transform skip flag in cu_info_t was stored under the intra
substruct even though transform skip can be used for inter as well. This
caused bitstream errors. Fixed by moving the flag out of the substruct.
2018-03-20 11:01:33 +02:00
Arttu Ylä-Outinen b64e46707d Skip raster scan step in TZ search
Raster scan is very slow and the BD-rate improvement is marginal.
2018-03-01 14:04:03 +02:00
Arttu Ylä-Outinen 6877064230 Add zero neighborhood check to TZ search
Adds an additional grid search step that starts from the zero motion
vector after the normal grid search. The search range for this step is
half of the normal range.
2018-03-01 14:02:13 +02:00
Arttu Ylä-Outinen 74a413c46a Switch to star refinement in TZ search 2018-03-01 13:06:14 +02:00
Arttu Ylä-Outinen ebee428ee1 Add loop termination to TZ grid search
Terminates the grid search if no better motion vector was found in the
last three iterations.
2018-03-01 13:06:06 +02:00
Arttu Ylä-Outinen 4c175621dd Fix TZ grid search and star refinement
- Changes TZ grid search and star refinement to keep the origin constant
  instead of moving to the best position after each iteration.
- Changes star refinement to loop until there is no more improvement,
  instead of running the step only once.
2018-03-01 12:56:57 +02:00
Arttu Ylä-Outinen 9c2d0074a2 Add rounding of motion vectors in inter search
When the starting point for integer motion estimation was selected among
the merge candidates, the candidate motion vectors were always rounded
down. This commit changes the rounding so that they are rounded to the
nearest integer MV instead.
2018-03-01 09:39:21 +02:00
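Note on 9c2d0074a2: the difference between rounding down and rounding to the nearest integer MV can be sketched as below. This assumes motion vector components are stored in quarter-pel units, treats "rounded down" as rounding toward negative infinity, and sends ties upward. The function names are hypothetical and the actual Kvazaar code may differ.

```c
#include <assert.h>

/* Round a quarter-pel MV component down to an integer-pel position
 * (a multiple of 4), also for negative values. */
int mv_round_down_sketch(int mv_qpel)
{
  int rem = ((mv_qpel % 4) + 4) % 4;  /* non-negative remainder, 0..3 */
  return mv_qpel - rem;
}

/* Round to the nearest integer-pel position; half-pel ties go up. */
int mv_round_nearest_sketch(int mv_qpel)
{
  return mv_round_down_sketch(mv_qpel + 2);
}
```

For example, a component of 3 (0.75 px) rounds down to 0 but rounds to the nearest integer position as 4 (1 px).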
Ari Lemmetti 662430d441 Select CU type based on SSD, transform unit tree and mode cost of luma and chroma on --rd=2 2018-02-22 19:26:48 +02:00
Arttu Ylä-Outinen cb06cfeadb Drop temporary arrays in bipred search
Changes bipred search to use the original source and reconstruction
arrays directly instead of copying them.
2018-02-14 11:20:51 +02:00
Arttu Ylä-Outinen 0ea516ba30 Move bipred search to a separate function 2018-02-14 09:56:53 +02:00
Arttu Ylä-Outinen 6f506be12d Drop dynamic allocation from bipred search
Moves the temporary LCU struct used in bipred search from the heap to
the stack. The single malloc call was a huge bottleneck in bipred.
2018-02-14 09:55:02 +02:00
Arttu Ylä-Outinen 7155dd0db7 Add negative references to L1 list
Changes reference index list creation so that the negative references
are added to L1 in addition to L0 when biprediction is enabled and no
reordering of pictures is done. Biprediction can now be used with the
low-delay GOP structure.
2018-02-07 14:54:52 +02:00
Arttu Ylä-Outinen 4b24cd03a2 Update for crypto++ 6.0.0 compatibility
Changes the crypto module to use unsigned char instead of byte. The byte
typedef is no longer included in the global namespace in crypto++ 6.0.0.
See https://github.com/weidai11/cryptopp/issues/442.

Fixes #184.
2018-02-05 13:35:03 +02:00
Arttu Ylä-Outinen 8c53417006 Check zero coefficient cost for inter
Checks the cost of flushing all coefficients of an inter block to zero.
This is much faster than doing full RDOQ but can still reduce bitrate
significantly. Encoding speed is increased since fewer coefficient bits
have to be coded with CABAC.
2018-01-29 12:41:56 +02:00
Arttu Ylä-Outinen 018b5ffa64 Move inter CU reconstruction to a new function
Moves code for reconstructing all PUs in an inter CU to a new function
kvz_inter_recon_cu in inter.c.
2018-01-24 15:05:39 +02:00
Arttu Ylä-Outinen 405b8c1069 Refactor inter MVD cost functions
Moves duplicate code for writing the MVD of a single motion vector from
kvz_get_mvd_coding_cost_cabac and encoder_inter_prediction_unit to a new
function.
2018-01-19 08:29:17 +02:00
Arttu Ylä-Outinen c1cca1ad7f Refactor inter MV candidate selection
Moves duplicate code for checking the best MV candidate from functions
calc_mvd_cost, search_pu_inter_ref and search_pu_inter to a new
function.
2018-01-19 08:29:17 +02:00
Arttu Ylä-Outinen 9067aa4535 Remove an unnecessary copy in SMP/AMP search
SMP/AMP search is performed using a lower work tree level than the
normal inter search so the prediction info must be copied up if an
SMP/AMP mode is chosen. Previously pixels and coefficients were copied as
well. Changed to only copy prediction info.
2018-01-18 10:36:26 +02:00
Arttu Ylä-Outinen 89a930d6dd Add part mode bitcost when using SMP/AMP blocks 2018-01-18 10:36:26 +02:00
Arttu Ylä-Outinen fc43643ba5 Use a transform split for SMP and AMP blocks 2018-01-18 10:36:25 +02:00
Arttu Ylä-Outinen c74ede148b Fix CBF flags for 4x4 luma blocks
CBF flags were not being propagated to the upper level from blocks of
size 4x4.
2018-01-18 10:36:25 +02:00
Arttu Ylä-Outinen 0a69e6d18f Fix selection of transform function for 4x4 blocks
DST function was returned for inter luma transform blocks of size 4x4
even though they must use DCT. Fixed by checking the prediction mode of
the block in addition to whether it is chroma or luma.
2018-01-18 10:36:25 +02:00
Miika Metsoila bcedfd6669 Remove the usage of errno in me-steps argument parsing 2018-01-16 14:38:43 +02:00
Miika Metsoila 39ed36830e Merge branch 'me_steps' 2018-01-16 14:22:59 +02:00
Miika Metsoila 61213e3ad9 Improve step parameter parsing and usage 2018-01-10 15:16:52 +02:00
Arttu Ylä-Outinen 649113a821 Fix inter search being used for 4x4 blocks
When 4x4 intra blocks are enabled and inter search is limited to 16x16
and larger blocks, it is possible that inter search is accidentally done
for 4x4 blocks. Fixed by checking that block size is at least 8x8 before
doing inter search.
2018-01-10 14:21:48 +02:00
Miika Metsoila e8e0e7596a Add a step-cutoff parameter for motion estimation search 2017-12-22 14:04:25 +02:00
Miika Metsoila 4e13608b01 Merge branch 'diamond_search' 2017-12-18 14:11:53 +02:00
Miika Metsoila 2cde0d1a18 Document diamond search option 2017-12-12 14:45:01 +02:00
Miika Metsoila b923b63b42 Add diamond search 2017-12-12 14:40:14 +02:00
Ari Lemmetti 14892fda00 Replace simple coefficient cost estimation with CABAC. Substantial improvement.
Approximation proved to be too inaccurate while not actually giving much speedup.
2017-12-10 01:23:48 +02:00
Miika Metsoila ea79069dc8 Fix a type warning in encmain.c 2017-12-08 16:22:40 +02:00
Miika Metsoila 6aa4cd7528 Fix type warnings 2017-12-08 16:16:36 +02:00
Miika Metsoila b3486b5114 Fix gcc/clang warnings and errors in cfg.c 2017-12-08 16:09:00 +02:00
Miika Metsoila bac07457ea Merge branch 'hevc_level' 2017-12-08 15:57:38 +02:00
Miika Metsoila c67a24e6ec Update readme and --help text 2017-12-07 12:32:46 +02:00
Ari Lemmetti 713e694d82 Define HAVE_STRUCT_TIMESPEC on Visual Studio 2015 and later
Fixes redefinition of timespec that Pthreads-Win32 does even if it has been already defined.
2017-12-05 18:26:12 +02:00
Miika Metsoila f64d42169f Improve bitrate checking to accommodate non-integer and less than 1 framerates 2017-12-01 17:20:12 +02:00
Miika Metsoila 57cf92d35f Implement level's bitrate limit checking during encoding 2017-11-28 16:19:44 +02:00
Miika Metsoila 021fb27787 Add high-tier flag 2017-11-20 16:05:28 +02:00
Miika Metsoila d249059d61 Minor refactoring of level checking 2017-11-20 13:25:26 +02:00
Arttu Ylä-Outinen cf85d52b9d Kvazaar version 1.2.0 2017-11-17 15:23:33 +02:00
Miika Metsoila 4c1512e8c5 Add a check for maximum picture width and height for the given level 2017-11-15 16:39:59 +02:00
Arttu Ylä-Outinen 4cb054295a Fix linkers
Overrides the linkers used for kvazaar, libkvazaar.la and kvazaar_tests.
When crypto++ is enabled, the C++ linker is used and when it is
disabled, the C linker is used.

This removes the need to explicitly specify -lstdc++ in configure when
crypto++ is used and fixes the build with crypto++ when libstdc++ is not
installed.
2017-11-13 15:09:38 +02:00
Miika Metsoila f9a4aba867 Update documentation, fix input fps default value, remove 0 as default level 2017-11-09 16:53:31 +02:00
Miika Metsoila ebba0a4f01 Test if input conforms to its level's limits (excluding bitrate) 2017-11-08 16:15:41 +02:00
Miika Metsoila fb4d0c3cf2 Move level argument parsing to the correct place and give it initial values 2017-11-03 15:47:35 +02:00
Miika Metsoila 61a31054e1 Add level command-line parameter 2017-11-03 13:04:05 +02:00
Arttu Ylä-Outinen 9974380cdd Fix bipred and temporal MVP
- Fixes two errors in calculating the POC for the reference frame for
  temporal candidate MV scaling.
- Fixes using the MV for the wrong direction when the temporal MV
  predictor block uses bi-prediction.

Fixes #160.
2017-10-25 12:26:41 +03:00
Arttu Ylä-Outinen 841597e123 Fix picture and slice types
Changes handling of intra pictures for --gop=8 so that every picture
with POC divisible by the intra period is intra. The first picture is
IDR and the rest of the intra pictures are CRA. POC is not reset at CRA
pictures. The leading pictures that follow the CRA picture are changed
to RASL so they are allowed to refer to pictures before the CRA picture.

Changes inter slice types to P when the L1 reference list is empty and
to B otherwise.

In all-intra, all pictures are now IDR pictures with POC zero.
2017-10-20 13:35:26 +03:00
Jaakko Laitinen 957b6850c3 Change ref list printout to match hm decoded printout 2017-09-25 13:48:56 +03:00
Arttu Ylä-Outinen 20aea8df63 Fix POCs when using --gop=8
When using --gop=8 with an intra period greater than one, a single POC
would be skipped before every intra frame. This commit fixes the problem
by turning the intra frames into BLA frames with leading pictures when
using --gop=8.
2017-09-19 09:31:58 +03:00
Miika Metsoila 6e00f63469 Remove unused variables from search_pu_inter_ref function 2017-09-18 15:36:37 +03:00
Miika Metsoila 7b0101ce3d Merge branch 'reflist_changes'
# Conflicts:
#	src/encoderstate.c
#	src/search_inter.c
2017-09-18 14:59:37 +03:00
Miika Metsoila 769b17768d Change max function to MAX macro for clang/gcc compatibility.
Remove a couple of unnecessary comments
2017-09-15 14:21:51 +03:00
Miika Metsoila 5f7c5443a3 Remove inter.poc 2017-09-12 14:23:19 +03:00
Miika Metsoila 6bd78a3da7 Reverse L0 list sort direction 2017-09-12 14:23:18 +03:00
Miika Metsoila 83dc7e7f50 Made L0 to sort and fixed mv_ref_coded in search_pu_inter 2017-09-12 14:23:18 +03:00
Timothe FRIGNAC d3362a238e changed strtod to strtol 2017-08-31 15:14:31 +02:00
Timothe FRIGNAC 3a1ab54ff0 Fixed memory leaks 2017-08-31 11:51:41 +02:00
Timothe FRIGNAC 466297fd77 Fixed build error 2017-08-29 17:01:18 +02:00
Timothe FRIGNAC 2e130912cb Add --key opt 2017-08-28 17:15:13 +02:00
Miika Metsoila a5f4cf09b5 Switched from storing POCs in inter.poc to state->frame->refLXs array 2017-08-21 16:34:57 +03:00
Arttu Ylä-Outinen 409d2114f0 Fix motion vector constraints
Fixes integer motion vectors being constrained more than what was
necessary when using --mv-constraint or --wpp.
2017-08-11 14:41:36 +03:00
Arttu Ylä-Outinen 7144a00beb Rewrite thread queue
Changes thread queue so that only the jobs that are ready to run are
stored in the queue. Other jobs are kept track of by pointers in the
reverse dependency lists of other jobs. When a job is ready to run it is
appended to the queue. The job queue is stored as a linked list.

The definitions of threadqueue_queue_t and threadqueue_job_t are moved
to the .c file, turning them into opaque structs.

Makes thread queue code simpler. Fixes some TSan errors.
2017-08-11 14:18:12 +03:00
Arttu Ylä-Outinen bc47fe94af Drop thread queue debug code 2017-08-11 14:18:12 +03:00
Eemeli Kallio e5cbc7a205 --sao now enables full sao 2017-08-11 13:26:55 +03:00
Eemeli Kallio 4c3453d26f Fixed issue with no-sao argument 2017-08-11 13:12:22 +03:00
Eemeli Kallio 8674c0f5ee Added parameter for band and edge sao. 2017-08-11 11:57:09 +03:00
Eemeli Kallio d9b93ea368 Added possibility to skip edge or band sao. 2017-08-11 11:51:49 +03:00
Arttu Ylä-Outinen 4b73bdd9aa Skip checked motion vectors in early termination
Changes the second iteration of early termination to skip the motion
vectors that were already checked in the first iteration.
2017-08-09 14:29:09 +03:00
Arttu Ylä-Outinen 606d441362 Skip computing MV cost twice in hexagon search
Changes the first step of hexagon search to skip the zero offset since
the cost of the motion vector has already been computed.
2017-08-09 14:29:09 +03:00
Arttu Ylä-Outinen fa4648061d Add mv, cost and bitcost to inter_search_info_t 2017-08-09 14:29:08 +03:00
Arttu Ylä-Outinen 328f051d7f Put inter search parameters in a single struct
Adds struct inter_search_info_t for holding the parameters that are used
by most functions related to inter search. Passing the parameters in
a single struct greatly reduces the number of parameters for many
functions.
2017-08-09 14:27:53 +03:00
Miika Metsoila 0dd069f8af Fixed using wrong POC in add_temporal_candidate 2017-08-09 13:50:21 +03:00
Miika Metsoila 25e0a954c7 Fixed 2 bugs causing incorrect video output 2017-08-09 13:50:21 +03:00
Arttu Ylä-Outinen 24ecddd2a5 Fix wrong strides in SAO reconstruction
Functions kvz_sao_reconstruct and encoder_sao_reconstruct used
frame->width as the stride instead of frame->rec->stride when accessing
frame->rec->data. This caused errors when using tiles and SAO.
2017-08-01 15:40:49 +03:00
Arttu Ylä-Outinen f0bf959d17 Fix alignment errors in 32-bit build with MSVC
Changes the work_tree parameter in search.c functions from an array to
a pointer. Fixes "formal parameter with requested alignment of 8 won't
be aligned" errors.
2017-07-28 09:27:02 +03:00
Arttu Ylä-Outinen 9694bd2fae Fix build on 32-bit systems
Function coeff_abs_sum_avx2 that was added in e950c9b was outside the
AVX2 #if directive.
2017-07-28 09:19:29 +03:00
Arttu Ylä-Outinen ecb0275cdd Store CU arrays as pointers to the main array
Changes field state->tile->frame->cu_array->data to point to the CU
array in the main encoder state. Removes the need to copy the CU array
to the main CU array after search.
2017-07-28 08:36:45 +03:00
Arttu Ylä-Outinen e950c9b101 Add AVX2 implementation for coefficient sum 2017-07-28 07:39:36 +03:00
Arttu Ylä-Outinen d50ae6990c Add sum of absolute coefficients to strategies 2017-07-28 07:39:15 +03:00
Arttu Ylä-Outinen 59faca0646 Skip CABAC coefficient cost for --rd=0 2017-07-28 07:33:03 +03:00
Arttu Ylä-Outinen 19e051ea40 Reduce intra threshold
Reduces intra threshold for --rd=0 from 20 to 8. Threshold of 20
increased BD-Rate too much.
2017-07-25 13:26:38 +03:00
Arttu Ylä-Outinen e9cf15465e Fix inter cost in bipred
The cost of coding MV ref indices and MV direction was added to bitcost
but not inter cost. Fixed by adding the extra bits to inter as well.
2017-07-24 15:24:04 +03:00
Arttu Ylä-Outinen edbe00763e Drop extra parameter in kvz_image_calc_sad
Drops the parameter max_lcu_below which was always set to -1.
2017-07-24 15:21:19 +03:00
Arttu Ylä-Outinen ffac29061f Fix extrapolated inter SATD 2017-07-24 15:11:05 +03:00
Arttu Ylä-Outinen 631ef53d2a Fix inter cost calculations
Inter costs are computed using SAD except when fractional motion
estimation or bi-prediction is enabled. This commit changes
search_pu_inter_ref to recalculate the cost with SATD. Fixes inter/intra
cost comparisons since intra costs are always SATD costs.
2017-07-24 15:11:05 +03:00
Arttu Ylä-Outinen 6ce2fb1238 Add pixel offsets to encoder_state_config_tile_t
Adds fields offset_x and offset_y to encoder_state_config_tile_t.
2017-07-24 15:11:05 +03:00
Arttu Ylä-Outinen 2380ba0d41 Reduce copying in kvz_get_coeff_cost
Changes function kvz_get_coeff_cost to only copy the CABAC contexts and
not the whole encoder state.

Other threads could be simultaneously using the other parts of the
encoder state. Only copying the CABAC fixes a TSan data race warning.
2017-07-24 12:38:41 +03:00
Arttu Ylä-Outinen 24b462f801 Align coefficients to 8 bytes
Adds alignment attribute to lcu_coeff_t. The coefficients are sometimes
handled as 64-bit integers containing four coefficients so the arrays
should be aligned to 8 bytes.

Fixes a UBSan error about misaligned reads.
2017-07-24 12:37:37 +03:00
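Note on 24b462f801: one way to request 8-byte alignment for a coefficient array is the C11 `alignas` specifier, sketched below. The struct layout and names are hypothetical; the commit only says that an alignment attribute was added to lcu_coeff_t, not which mechanism was used.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

/* Hypothetical coefficient block whose array starts on an 8-byte boundary,
 * so that groups of four 16-bit coefficients can be read as one 64-bit
 * integer without a misaligned access. */
typedef struct {
  alignas(8) int16_t y[16];
} coeff_block_sketch_t;

/* Returns 1 if the pointer is suitably aligned for 64-bit reads. */
int coeffs_are_aligned_sketch(const void *p)
{
  return ((uintptr_t)p % 8) == 0;
}
```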
Arttu Ylä-Outinen 5ddb43c6fe Fix undefined left shifts in rdo
Replaces left shifts by multiplications when the operand may be
a negative value. Left shift of a negative value is undefined behavior.
2017-07-24 12:35:10 +03:00
Arttu Ylä-Outinen d1e64ad62b Fix undefined left shifts
Replaces left shifts by multiplications when the operand may be
a negative value. Left shift of a negative value is undefined behavior.
2017-07-20 11:15:30 +03:00
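Note on 5ddb43c6fe and d1e64ad62b: in C, left-shifting a negative signed value is undefined behavior, while multiplying by a power of two is well defined as long as the result fits in the type. A minimal sketch of the replacement, with a hypothetical function name:

```c
#include <assert.h>
#include <stdint.h>

/* Scale a possibly negative value by 2^shift without left-shifting a
 * negative operand. Only the positive constant 1 is shifted. The caller
 * must still ensure that the product fits in int32_t. */
int32_t scale_pow2_sketch(int32_t value, unsigned shift)
{
  assert(shift < 31);
  return value * (1 << shift);
}
```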
Arttu Ylä-Outinen 07b5fb9caf Fix out-of-bounds read in encoderstate
When calling encoder_state_encode_leaf with POC 0, index -1 of the GOP
array would be accessed. Fixed by skipping the code for I-frames.
2017-07-20 11:15:30 +03:00
Arttu Ylä-Outinen 8c4a3473a8 Change --owf=auto and --threads=auto selection
Changes OWF selection so that it is chosen based on the maximum number
of parallel CTUs. Number of threads is limited to prevent overhead from
extra threads.
2017-07-20 09:42:28 +03:00
Arttu Ylä-Outinen 4fc9b743c1 Drop an unnecessary pthread_cond_broadcast
Drop pthread_cond_broadcast on threadqueue->cond in function
kvz_threadqueue_waitfor. The broadcast caused threads to be woken up
more often than necessary.
2017-07-19 11:09:30 +03:00
Arttu Ylä-Outinen 14003c6a30 Disable printing PSNR with --no-psnr 2017-07-19 10:38:37 +03:00
Arttu Ylä-Outinen e90bde5c62 Clarify PSNR output
Adds letters Y, U and V to the PSNR output to make it clearer that the
printed values are the luma and chroma PSNR.
2017-07-19 10:33:43 +03:00
Arttu Ylä-Outinen fdb3480b54 Enable strategies for SAO reconstruction
Re-enables strategies for SAO reconstruction. They were disabled in
commit ec9ff42.
2017-07-11 10:35:18 +03:00
Arttu Ylä-Outinen 333dba3884 Add static to SAO strategies 2017-07-11 10:02:01 +03:00
Miika Metsoila e8cc2d8f6a Small fixes 2017-07-07 13:58:19 +03:00
Arttu Ylä-Outinen 67a60a35e3 Fix invalid calls to normalize_lcu_weights
Changes encoder_state_init_new_frame to only call normalize_lcu_weights
when the weights have been written to the array and rate control is
enabled. When rate control is disabled, the weights are not used.
2017-07-07 11:05:31 +03:00
Arttu Ylä-Outinen 563bc26e71 Fix out-of-bounds read in AVX2 SAO
AVX2 version of SAO loaded offsets with a 256 bit read even though there
are only five 32 bit integers.
2017-07-06 13:04:52 +03:00
Arttu Ylä-Outinen 0850b17f96 Drop get_wpp_limit in search_inter
WPP limit for motion vectors is now computed inside fracmv_within_tile.
2017-07-05 13:22:53 +03:00
Arttu Ylä-Outinen 2a85f0f5a4 Move hard-coded MV limits to encoder_control_t
Adds field max_inter_ref_lcu to encoder_control_t. It is used to set up
inter-LCU dependencies in encoder_state_encode_leaf and restrict motion
vectors in fracmv_within_tile.
2017-07-05 13:22:53 +03:00
Arttu Ylä-Outinen bb5354f7e2 Relax inter-CTU dependencies when SAO is off
When using WPP and OWF, the first CTU of a row depends on the last CTU
of the row below in the reference frame. This is necessary when SAO is
enabled since we currently do SAO for a whole CTU row at a time. When
SAO is disabled, however, it is unnecessary to wait for the whole row.

Changes CTUs to depend only on the CTU below in the reference frame
instead of the whole row when WPP and OWF are enabled and SAO disabled.
Gives a significant speedup when running on a machine with many CPU
cores.
2017-07-05 13:21:06 +03:00
Arttu Ylä-Outinen 1efa2708b2 Do SAO reconstruction for a single CTU at a time
Moves SAO reconstruction into encoder_state_worker_encode_lcu instead of
doing it in a separate step for the whole CTU row. Reconstruction of the
rightmost 10 pixels and bottommost 10 pixels of a CTU is delayed until
the neighboring CTU has been deblocked.

Doing SAO for the whole CTU row at a time caused unnecessary inter-CTU
dependencies when using WPP and OWF. The first CTU of a row would need
to wait until SAO was done for the row below in the previous frame.
Moving SAO reconstruction to immediately after deblocking each CTU fixes
this problem.
2017-07-04 15:14:31 +03:00
Arttu Ylä-Outinen ec9ff42077 Rewrite SAO recon to handle arbitrary sized blocks
Adds width and height parameters to function kvz_sao_reconstruct and
changes it to take coordinates in units of pixels. This will be useful
for doing SAO for areas smaller than a whole CTU.
2017-06-30 16:09:18 +03:00
Miika Metsoila dcd7acf4fd Fixed crash and incorrect info output 2017-06-27 16:05:15 +03:00
Miika Metsoila f8b6234fdb Changes to reference lists so they behave more like the L0/L1 lists from the specification 2017-06-27 16:05:15 +03:00
Arttu Ylä-Outinen 2c66e0bbd2 Fix warnings about invalid reads in AVX2 ipol
AVX2 filter functions read pixels in chunks of 8 or 16 bytes. At the end
of the block, the read goes out of the bounds of the pixels array. The
extra pixels do not affect the result.

Fixes valgrind complaining about the invalid reads by allocating 5 extra
pixels in kvz_get_extended_block_avx2.
2017-06-22 09:37:55 +03:00
Arttu Ylä-Outinen 4d20e156db Fix handling intra period not multiple of GOP length
With low delay GOP structure, it is possible to use an intra period that
is not a multiple of the GOP structure length. Commit 00c9f52 changed
encoder_state_init_new_frame to reset POC on intra frames. GOP offset,
however, was not reset, resulting in invalid POCs and references for the
following frames.

This commit changes function kvz_encoder_feed_frame so that GOP offset
is correctly reset on intra frames.
2017-06-22 09:29:00 +03:00
Arttu Ylä-Outinen 00c9f52bd4 Fix setting picture type when using GOP
Changes encoder_state_init_new_frame to set intra frame pictype to
KVZ_NAL_IDR_W_RADL even when using GOP.
2017-06-21 13:21:47 +03:00
Arttu Ylä-Outinen f54a25f112 Fix crash when immediately closing encoder
When closing the encoder, the pictures stored in the input frame buffer
are freed by repeatedly calling kvz_encoder_feed_frame. If the encoder
was closed immediately after opening it, kvz_encoder_feed_frame would be
called with an unprepared encoder state. This would trigger an assert.

Fixed by changing kvz_encoder_feed_frame so that it does not require the
encoder state to be prepared.
2017-06-15 11:57:46 +03:00
Arttu Ylä-Outinen b74e0458fd Set inter transform depth to zero
Sets max_transform_hierarchy_depth_inter to 0 in SPS. This saves some
bits because split_transform_flag does not need to be coded for inter
blocks.

When SMP and AMP blocks are enabled the depth is set to 1 instead.
Otherwise inter split flag would default to 1 for SMP and AMP blocks,
resulting in an unnecessary transform split.
2017-06-08 10:08:20 +03:00
Arttu Ylä-Outinen 8dd01ba5a9 Refactor helper functions in search
Combines functions lcu_set_intra_mode and lcu_set_inter_pu to a single
function. Removes some duplicated code.
2017-06-06 10:32:09 +03:00
Arttu Ylä-Outinen 1bbecf7584 Refactor work tree copy functions
Extracts common code shared by work_tree_copy_up and work_tree_copy_down
to a separate function.
2017-06-06 10:32:00 +03:00
Arttu Ylä-Outinen 2b169d5d63 Fix crash in kvazaar_close
Changes kvazaar_close to stop all threads before freeing encoder states.
Fixes a crash when the encoder is closed before all pictures have been
encoded.
2017-06-02 10:05:33 +03:00
Arttu Ylä-Outinen eb9a05b7ef Fix memory leak
Changes kvazaar_close to free the remaining pictures in the input
frame buffer. Fixes a memory leak when the encoder is closed while there
are pictures left in the buffer.
2017-06-01 15:39:35 +03:00
Arttu Ylä-Outinen 8b2483ca1c Combine intra reconstruction functions
Replaces function kvz_intra_recon_lcu_luma and
kvz_intra_recon_lcu_chroma in intra.c with function kvz_intra_recon_cu.
The new function can handle reconstruction for both luma and chroma.
Removes some duplicated code.
2017-05-24 12:07:31 +03:00
Arttu Ylä-Outinen e67fdb853d Move intra leaf TB recon to a separate function
Moves code for intra leaf transform block reconstruction from functions
kvz_intra_recon_lcu_luma and kvz_intra_recon_lcu_chroma to a new
function intra_recon_tb_leaf. Removes some duplicated code.
2017-05-24 12:07:31 +03:00
Arttu Ylä-Outinen 13d2fdbd21 Drop unused kvz_videoframe_get_cu functions 2017-05-24 11:15:31 +03:00
Arttu Ylä-Outinen f5eef7f33c Use luma pixel coordinates in encode_coding_tree
Changes functions encode_intra_coding_unit and encode_coding_tree to
take coordinate arguments in units of luma pixels instead of 8 px
blocks. This should make the code easier to understand.
2017-05-24 11:15:31 +03:00
Arttu Ylä-Outinen 525a5180ff Combine intra CU encoding functions
Merges functions encode_intra_coding_unit and
encode_intra_coding_unit_encry. Removes a lot of duplicated code.
2017-05-24 11:12:40 +03:00
Arttu Ylä-Outinen 610c91b0c5 Use luma pixel coordinates in TU coding functions
Changes functions encode_transform_unit and encode_transform_coeff to
take coordinate arguments in units of luma pixels instead of 4 px
blocks. This should make the code easier to understand.
2017-05-23 15:36:16 +03:00
Arttu Ylä-Outinen 2e8838de6e Fix crash when crypto compiled in but disabled
When kvazaar was built with crypto++ but running without using
encryption features, kvazaar attempted to delete an uninitialized crypto
handle. Fixed by setting the handle to NULL in kvz_encoder_state_init.
2017-05-23 14:01:48 +03:00
Arttu Ylä-Outinen 2f2c281e8e Fix a memory leak in crypto
A CryptoPP::CFB_Mode<CryptoPP::AES>::Encryption was allocated at the
beginning of encoder_state_encode_leaf and was never freed. This commit
changes encoder_state_worker_encode_lcu to delete the CFB_Mode. Also
moves crypto handle from encoder_state_config_tile_t to encoder_state_t
so that it can be safely deleted without affecting other threads in the
same tile.
2017-05-23 11:51:25 +03:00
Arttu Ylä-Outinen 22155950c1 Rewrite crypto to conform to kvazaar code style 2017-05-23 11:51:25 +03:00
Arttu Ylä-Outinen 6829865190 Fix inline declaration in intra_mode_encryption
Moves the inline declaration of intra_mode_encryption before the type
and changes it to use the INLINE macro. Inline declaration after type
triggered a warning on GCC.
2017-05-23 11:50:32 +03:00
Arttu Ylä-Outinen 5f8e17d4ba Eliminate a race condition in threadqueue
Fixes the order of acquiring locks for the job and its dependency in
kvz_threadqueue_job_dep_add. The dependency is locked before the job
that depends on it. This is the same order as in threadqueue_worker.

Acquiring the locks in different order in kvz_threadqueue_job_dep_add
and threadqueue_worker would sometimes result in a deadlock.
2017-05-18 12:25:53 +03:00
Arttu Ylä-Outinen 4b213477f0 Return best MV from inter early terminate
When using --me-early-termination=sensitive, early termination of inter
search used to always return the starting point if no tested motion
vector was good enough to continue the search. This commit changes
early_termination to always return the best motion vector and cost
found.
2017-05-18 09:05:14 +03:00
Arttu Ylä-Outinen 382636de55 Fix handling too large QPs
Changes kvz_config_validate to output an error if the given QP is out of
range and changes kvz_set_picture_lambda_and_qp to clip the QP to the
valid range if it is too large after applying QP offset from GOP structure.
2017-05-17 12:41:51 +03:00
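The clipping step described above can be sketched roughly as follows. This is a minimal illustration, not the code from kvz_set_picture_lambda_and_qp: the function name clip_qp and the constants are hypothetical, and the range 0..51 assumes 8-bit HEVC.

```c
/* Assumed valid QP range for 8-bit HEVC; names are hypothetical. */
#define SKETCH_MIN_QP 0
#define SKETCH_MAX_QP 51

/* Applies a GOP-level QP offset to the base QP and clips the result
 * to the valid range instead of letting it run past the limits. */
int clip_qp(int base_qp, int gop_qp_offset) {
  int qp = base_qp + gop_qp_offset;
  if (qp < SKETCH_MIN_QP) return SKETCH_MIN_QP;
  if (qp > SKETCH_MAX_QP) return SKETCH_MAX_QP;
  return qp;
}
```

For example, clip_qp(50, 3) yields 51 rather than 53, while clip_qp(22, 1) passes through unchanged as 23.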
Arttu Ylä-Outinen de8b59c681 Drop unused function kvz_coefficients_blit 2017-05-12 16:48:30 +03:00
Arttu Ylä-Outinen bcfa5a3cd9 Add a comment explaining the coefficient order 2017-05-12 16:46:57 +03:00
Arttu Ylä-Outinen 95775a1645 Change coefficient storage order
Changes coefficient storage order to a zig-zag order. Reduces
unnecessary copying of coefficients to temporary arrays.
2017-05-12 16:46:57 +03:00
Arttu Ylä-Outinen 9395867a9a Quantize all colors in a single traversal
Changes kvz_quantize_lcu_residual to process all three colors in
a single traversal of the TU tree.
2017-05-12 16:42:41 +03:00
Arttu Ylä-Outinen 1e58fd6b16 Split kvz_quantize_lcu_residual
Splits kvz_quantize_lcu_residual into two functions that handle the TU
tree recursion and quantization of a single TU.
2017-05-12 16:42:41 +03:00
Arttu Ylä-Outinen cc87e0dcc7 Combine luma and chroma quantization functions
Replaces functions kvz_quantize_lcu_luma_residual and
kvz_quantize_lcu_chroma_residual in transform.c with function
kvz_quantize_lcu_residual. The new function can handle any of the YUV
colors. Removes some duplicated code.
2017-05-12 16:42:41 +03:00
Arttu Ylä-Outinen 1357dd0599 Pass coeffs through encoder state
Changes the way coefficients are passed from kvz_search_lcu to
kvz_encode_coding_tree. Drops fields coeff_y, coeff_u and coeff_v in
videoframe_t and instead passes them through field coeff in
encoder_state_t.
2017-05-12 16:42:41 +03:00
Eemeli Kallio 2cad3173ec Reduced amount of modes for search_intra_rdo 2017-05-12 15:56:07 +03:00
Arttu Ylä-Outinen 26adef4492 Merge branch 'erp-aqp' 2017-05-12 15:05:24 +03:00
Eemeli Kallio 55e0e65733 Added INLINE to kvz_get_ic_rate and kvz_get_coded_level in rdo.c 2017-05-12 15:03:30 +03:00
Arttu Ylä-Outinen ee3d4d0e78 Add adaptive QP for 360 degree video
Adds option --erp-aqp for enabling adaptive QP for 360 degree video with
equirectangular projection. When projected into a spherical surface,
the middle part of the video covers relatively larger area than the top
and bottom parts. Enabling --erp-aqp sets up a ROI delta QP array which
uses higher QPs for the top and bottom of the video and lower QPs for
the middle part.
2017-05-11 12:31:53 +03:00
Arttu Ylä-Outinen 79cb3a2fd3 Permit negative QP deltas in ROI
Delta QPs should not be arbitrarily restricted to positive values.
2017-05-11 12:13:47 +03:00
Arttu Ylä-Outinen edfbd6f122 Add field lcu_dqp_enabled to encoder_control_t
Delta QPs for LCUs are enabled when either ROI coding or rate control is
enabled. Having a single field is simpler than always checking whether
ROI or rate control is enabled.
2017-05-11 12:13:47 +03:00
Arttu Ylä-Outinen 2f2405dfe6 Fix crash when PU depth is limited
When video width or height was not a multiple of the smallest CU size,
no prediction would be performed at the border CUs. Kvazaar would later
crash at an assertion failure when attempting to write the bitstream for
the CU.

Fixed by permitting inter and intra prediction when the CU split is
forced, even if CUs of that size would otherwise be disabled.
2017-04-27 10:35:48 +03:00
Arttu Ylä-Outinen 9130b5107c Change handling of infinite PSNR in encmain
Changes encmain to print 999.99 as PSNR when SSE is zero. This behavior
is in line with HM. Previously SSE was set to 99 when it was zero.
2017-04-27 10:35:13 +03:00
Arttu Ylä-Outinen a9c878b535 Fix crash with WPP when threads are disabled
When WPP is enabled, a reference to SAO reconstruction job is copied
from the wavefront to the main encoder state. However, when threads are
disabled, the job is a null pointer and dereferencing it crashes the
encoder. Fixed by adding a null pointer check.
2017-04-24 12:59:57 +03:00
Arttu Ylä-Outinen 2991962033 Add reference counting to threadqueue_job_t
Both the thread queue and the encoder states hold pointers to the thread
queue jobs. It is possible that a job is removed from the thread queue
and freed while the encoder state is still using it. This commit adds
reference counting to threadqueue_job_t in order to fix the problem.

Fixes #161.
2017-04-12 16:13:52 +03:00
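The idea of shared ownership through reference counting can be sketched as below. The type and function names (job_t, job_new, job_copy_ref, job_free_ref) are hypothetical and are not kvazaar's API. The counter is a plain int for brevity, so this sketch is not thread-safe; a real thread queue would presumably need atomic operations or a lock around the counter.

```c
#include <stdlib.h>

/* Hypothetical job type holding only the reference count. */
typedef struct job {
  int refcount;
} job_t;

/* Creates a job owned by one holder. Returns NULL if allocation fails. */
job_t *job_new(void) {
  job_t *job = malloc(sizeof *job);
  if (job) job->refcount = 1;
  return job;
}

/* Registers one more holder and returns the same pointer. */
job_t *job_copy_ref(job_t *job) {
  job->refcount++;
  return job;
}

/* Drops one holder. Frees the job when the last holder is gone.
 * Returns the number of holders that remain. */
int job_free_ref(job_t *job) {
  int remaining = --job->refcount;
  if (remaining == 0) free(job);
  return remaining;
}
```

With this scheme the queue and the encoder state would each hold their own reference, so removing the job from the queue does not free memory the encoder state still uses.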
Arttu Ylä-Outinen bd8adff43a Drop unused defines in threads.h 2017-04-12 03:41:07 -07:00
Arttu Ylä-Outinen 7ab0a7aff2 Fix semaphores on Mac
POSIX semaphores are deprecated on Mac. This commit replaces POSIX
semaphores by Grand Central Dispatch semaphores when building on Mac.
2017-04-12 03:41:02 -07:00
Arttu Ylä-Outinen 26693e1402 Fix reliance on undefined behaviour in encmain
Pthread mutexes were used for synchronization in encmain by locking and
unlocking them from different threads. However, according to the POSIX
standard, unlocking a mutex from a different thread is undefined
behaviour. This commit replaces the mutexes by semaphores which can be
used from different threads.
2017-04-12 03:23:58 -07:00
Ari Lemmetti 47a9f0de04 Modify and use FILL_ARRAY macro to prevent warning on GCC 7
The following warning was given and is a false positive

error: 'memset' used with length equal to number of elements without multiplication by element size [-Werror=memset-elt-size]
2017-04-11 14:04:25 +03:00
Eemeli Kallio f7e01b8ba1 Fixed error on rd=3 2017-04-05 13:27:14 +03:00
Eemeli Kallio 9f605152ae Changed intra to use best rough cost when using inter and rd=2 2017-04-05 13:01:32 +03:00
Ari Lemmetti 33ce101ab5 Revert "Use sizeof(uint32_t) to avoid warning in GCC7."
Did not fix the problem.

This reverts commit e3c3e74926.
2017-04-03 20:21:33 +03:00
Ari Lemmetti e3c3e74926 Use sizeof(uint32_t) to avoid warning in GCC7.
error: 'memset' used with length equal to number of elements without multiplication by element size [-Werror=memset-elt-size]
2017-04-03 19:16:09 +03:00
Arttu Ylä-Outinen df359b8f95 Fix indentation in encode_coding_tree.c
Fixes indentation of a for loop that was causing a misleading
indentation warning on GCC.

Fixes #163.
2017-03-08 22:56:28 +09:00
Pierre-Loup Cabarat 2b8ce5e47c Add intra prediction modes encryption 2017-03-06 17:27:39 +01:00
Arttu Ylä-Outinen aae141f2d3 Fix order of frames with --debug
When the decoding and presentation orders of pictures are different
(with GOP), the frames in YUV debug output would be in the decoding
order. This commit changes the kvazaar command line program to store the
reconstructed pictures in a buffer so that they can be output in the
presentation order.

Fixes #101.
2017-02-28 14:09:24 +09:00
Arttu Ylä-Outinen 094b39e7fc Refactor inter MV/merge candidate selection
Adds struct merge_candidates_t for holding the spatial and temporal
merge candidates. Changes functions with separate parameters for each
candidate to use the struct instead.
2017-02-22 15:56:36 +09:00
Arttu Ylä-Outinen 3409748a8f Refactor inter MVP candidate selection
Adds helper function add_mvp_candidate.
2017-02-22 15:56:27 +09:00
Arttu Ylä-Outinen ef6503c728 Refactor inter merge candidate selection
Adds helper function add_merge_candidate and replaces macro
CHECK_DUPLICATE with function is_duplicate_candidate.
2017-02-22 02:50:52 +09:00
Arttu Ylä-Outinen f12e09bc40 Refactor inter TMVP selection
Adds helper function add_temporal_candidate to inter.c.
2017-02-22 02:08:10 +09:00
Arttu Ylä-Outinen 4f88066740 Refactor MV and merge candidate selection
Replaces macros APPLY_MV_SCALING and CALCULATE_SCALE with helper
functions.
2017-02-22 01:14:16 +09:00
Arttu Ylä-Outinen db08041d9a Refactor inter TMVP selection
Merges three if-clauses to remove two levels of indentation.
2017-02-21 23:56:01 +09:00
Marko Viitanen 85e2a40da3 Clip scaled motion vectors, scale and td/tb values to appropriate limits
Fixes #158.
2017-02-20 15:40:20 +02:00
Ari Koivula 7369f25f64 Bump version to 1.1.0 2017-02-16 20:52:05 +02:00
Ari Lemmetti b021d2244e Reduce more unnecessary initializations. 2017-02-16 17:25:26 +02:00
Ari Lemmetti acd12cba1e Remove unnecessary memory initialization to zero
Values in interval [last_scanpos, 0] are overwritten in following for loop, except for the sig_coeff_inc value.
2017-02-16 16:48:48 +02:00
Ari Koivula 7ff33e1bf2 Fix default reference picture count
The default was 3, instead of the intended 1 of the medium preset.
2017-02-13 17:34:28 +02:00
Marko Viitanen 4251607c04 Fix a bug in TMVP reference POC list 2017-02-13 15:19:24 +02:00
Marko Viitanen 4270d451e6 Fixed some errors after rebase 2017-02-13 15:19:24 +02:00
Marko Viitanen 95effb00d0 Disable TMVP in frames with zero L0 references 2017-02-13 15:19:24 +02:00
Marko Viitanen b4de1878be Fixed TMVP scaling and candidate selection for B-frames 2017-02-13 15:19:23 +02:00
Marko Viitanen 23be633ad7 Added TMVP merge candidate scaling for L0 2017-02-13 15:19:23 +02:00
Marko Viitanen e6aa1b9b9a Renamed get_mv_cand_from_spatial() to get_mv_cand_from_candidates() 2017-02-13 15:19:23 +02:00
Marko Viitanen 1124bb5fd0 Cleaned up TMVP, mv candidate selection working, merge candidate selection not 2017-02-13 15:19:23 +02:00
Marko Viitanen d65d2ec88d WIP: add list of POCs used in the image when pushing to reference 2017-02-13 15:19:22 +02:00
Marko Viitanen 6a25cd3248 WIP: work on tmvp on inter 2017-02-13 15:19:22 +02:00
Marko Viitanen e538a94eda Enable TMVP with B-frames 2017-02-13 15:19:22 +02:00
Arttu Ylä-Outinen 363b8b49a2 Fix integer overflows with large resolutions
Limits video size so that the number of luma and chroma pixels can be
stored in an int. Fixes some integer overflows that resulted in
segmentation faults.
2017-02-12 11:40:13 +09:00
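A size check of this kind might look like the sketch below. The function name pixel_count_fits_int is hypothetical, and the sketch only checks the luma plane, on the assumption that no chroma plane has more samples than the luma plane.

```c
#include <limits.h>
#include <stdint.h>

/* Returns 1 if width * height can be stored in an int, 0 otherwise.
 * The product is computed in 64 bits so the check itself cannot overflow. */
int pixel_count_fits_int(int32_t width, int32_t height) {
  if (width <= 0 || height <= 0) return 0;
  int64_t luma_pixels = (int64_t)width * (int64_t)height;
  return luma_pixels <= INT_MAX;
}
```

A 1920x1080 frame passes, whereas 65536x65536 is rejected because its pixel count is 2^32.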
Arttu Ylä-Outinen a5a925fc28 Replace timed waits by normal waits in threadqueue
Replaces calls to pthread_cond_timedwait with pthread_cond_wait in
threadqueue.c. Simplifies code, as there should be no need for the
timeout.
2017-02-11 15:42:03 +09:00
Arttu Ylä-Outinen fd057498fc Simplify kvz_config_alloc 2017-02-11 15:42:03 +09:00
Arttu Ylä-Outinen 7f7844caad Fix finalizing uninitialized encoder states
Finalization functions for frame and tile encoder states accessed the
frame and tile fields of the encoder state even though they might be
NULL. This is the case when the initialization of an encoder state
fails. Fixed by adding NULL checks.
2017-02-09 14:05:28 +09:00
Arttu Ylä-Outinen 51786eda67 Drop redundant fields in encoder_control_t
Some of the fields in encoder_control_t were simply copies of the
corresponding fields in kvz_config. This commit drops the copied fields
in favor of using the fields in encoder_control_t.cfg directly.
2017-02-09 14:05:28 +09:00
Arttu Ylä-Outinen 6a178dee96 Fix leaking memory when --cqmfile given many times
Any previously allocated CQM file name was not freed when allocating
memory for the new file name.
2017-02-09 14:05:28 +09:00
Arttu Ylä-Outinen 63a567ad8a Fix leaking memory when --roi given many times
Any previously allocated delta QP array was not freed when allocating
a new array.
2017-02-09 14:05:21 +09:00
Arttu Ylä-Outinen bfd89136a4 Fix ROI delta QP array not getting freed 2017-02-09 13:23:55 +09:00
Arttu Ylä-Outinen e78a8dfcf5 Copy the kvz_config passed to encoder_open
The kvz_config struct is created by the user but kvazaar keeps a pointer
to it. It is easy to break things by modifying the configuration outside
kvazaar. In addition, kvazaar modifies the struct even though it is has
a const modifier.

This commit changes the field cfg in encoder_control_t to be a copy of
the kvz_config struct instead of a pointer, removing modifications to
the const struct and allowing users to do whatever they want with it
after opening the encoder.
2017-02-09 13:23:54 +09:00
Ari Koivula b8e3513a23 Fix crash with sub-LCU frame sizes and WPP
The end of slice was being calculated incorrectly, which led to no tile
being created inside the slice, which led to an assert triggering.

This fixes the wrong end of slice calculation, but also disallows
wavefront rows from being created, if there would be only one.
The wavefront initialization code assumes there is always more than
one row, so the inter-frame dependency doesn't get added properly.

Fixes #153.
2017-02-08 21:41:30 +02:00
Ari Koivula d893474bab Fix encoder getting stuck on OS-X
Main thread was stuck looping on pthread_cond_timedwait because
the abs time given on OS-X had already passed and the wait
returned immediately without releasing the mutex to allow worker
threads to proceed.

The fix was to use gettimeofday, which returns real time instead
of monotonic, which is what pthread_cond_timedwait wants.
2017-02-02 17:27:46 +02:00
Ari Koivula 4ceda1908b Fix OS-X compiler warning
rdo.c:475:25: warning: absolute value function 'abs' given an argument of type 'int64_t' (aka 'long long') but has parameter
       of type 'int' which may cause truncation of value [-Wabsolute-value]
         current.cost = -abs(quant_cost_in_bits) + (bits << PRECISION_INC);
                         ^
rdo.c:475:25: note: use function 'llabs' instead
         current.cost = -abs(quant_cost_in_bits) + (bits << PRECISION_INC);
2017-02-01 18:09:17 +02:00
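The difference the compiler note points at can be shown with a small sketch. The function name abs_cost is hypothetical; llabs is the standard C library function declared in stdlib.h.

```c
#include <stdint.h>
#include <stdlib.h>

/* abs() takes an int, so a 64-bit argument may be truncated before the
 * absolute value is computed. llabs() takes a long long and keeps all
 * 64 bits of the argument. */
int64_t abs_cost(int64_t cost_in_bits) {
  return (int64_t)llabs((long long)cost_in_bits);
}
```

Values that do not fit in 32 bits, such as -5000000000, keep their magnitude with this version.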
Ari Koivula c7d536bbcd Fix OS-X compiler warning
cfg.c:1024:74: warning: format specifies type 'size_t' (aka 'unsigned
long') but the argument has type 'unsigned long long'
       [-Wformat]
       fprintf(stderr, "Too large ROI size: %llu (maximum %zu).\n", size, SIZE_MAX);
2017-02-01 18:09:04 +02:00
Ari Koivula 4467506ef1 Add missing kvz_ prefix 2017-01-31 18:38:02 +02:00
Ari Koivula ed3bd898fd Remove Exp-Golomb lookup table
This table takes 256kB and isn't used very much. Au revoir!
2017-01-31 18:31:05 +02:00
Ari Koivula 5513744d24 Merge branch 'slices' 2017-01-31 16:14:30 +02:00
Ari Koivula 52904d3e9f Add --slices=tiles and --slices=wpp
This encapsulates tiles or WPP rows into their own slices, making
it possible to send them as soon as they are done, instead of waiting
for the other substreams to finish and coding the substream offsets
in the slice header.
2017-01-31 15:44:23 +02:00
Ari Koivula 0d4d0e869c Add support for independent slices
Not used yet, but they work.
2017-01-31 15:11:50 +02:00
Ari Koivula 46ae382498 Fix bugs with slice header
These fixes allow more than one slice to be used to code a picture.
- Use correct number of bits to code the slice segment address.
- Don't write offset_len_minus1 for slices without substreams.
2017-01-31 14:01:59 +02:00
Ari Koivula f1fc0de2bf Write slice headers to the parent stream
Appending to the child stream doesn't work if the child is a leaf
slice state.

Simplifies flow by removing distinction between tile and slice. Now
that slice headers are written in the parent stream, there is zero
difference between tiles and slices from bitstream point of view.
2017-01-31 13:55:05 +02:00
Ari Koivula 04cd875b2c Move substream finalization to LCU coding job
Having some of the termination bits in the LCU coding and some in the
substream finalization was needlessly confusing. Doing substream
finalization directly after LCU coding makes it easy to verify that the
finalization is done correctly.

Removes one job per WPP row from the job queue.

Removes kvz_cabac_flush, because I don't like bits being put into the
bitstream implicitly. Better to have it all in the open.
2017-01-31 13:01:57 +02:00
Ari Koivula ead490b7b7 Write a new slice NAL for every slice 2017-01-31 12:36:18 +02:00
Ari Koivula cd496bf50b Move first_nal_in_au to encoder_state->frame
Needed for writing NALs from encoder_state_write_bitstream_children
2017-01-31 12:28:28 +02:00
Arttu Ylä-Outinen 1e6463c08b Fix inter bipred search
When the number of merge candidates was five, biprediction search would
read past the bounds of the priority list arrays. Fixed to limit the
search to the first four candidates.
2017-01-31 18:23:12 +09:00
Ari Lemmetti 2c069a3e5f Prevent unnecessary cu search
Prevent further analysis as soon as it is known that splitting can not improve cost
2017-01-30 16:21:41 +02:00
Arttu Ylä-Outinen 9b889c3fab Fix reading ROI files
- Checks the return value of fopen when opening the ROI file. Fixes
  a segfault when the file cannot be opened.
- Check that the width and height are positive. Fixes reading past the
  end of the delta QP array in kvz_set_lcu_lambda_and_qp.
- Check for overflow in width * height. Fixes an overflow resulting in
  a segfault.
- Properly check that fscanf succeeds. Fixes silently accepting ROI
  files that are too short.
- Properly close the FILE pointer.
2017-01-29 18:57:27 +09:00
Arttu Ylä-Outinen 46c9a483c3 Fix inter search for small SMP and AMP blocks
The function search_pu_inter_ref incorrectly rounded the coordinates of
the block down to a multiple of 8 pixels. Small SMP and AMP blocks may
start at coordinates that are not multiples of 8. Fixed by removing the
rounding.

Fixes a failing assert when --mv-constraint is used with --smp or --amp.
2017-01-29 13:34:50 +09:00
Arttu Ylä-Outinen fb10b56b82 Fix checking if a low delay GOP structure is used
Stops assuming that having cfg->gop_lowdelay set means that GOP
structure is used since it is possible that cfg->gop_lowdelay is true
but cfg->gop_len is zero. Adds checks for cfg->gop_len where needed.

Fixes a possible division by zero in kvz_encoder_feed_frame.
2017-01-28 21:56:00 +09:00
Arttu Ylä-Outinen 4f56b04239 Drop an unnecessary conditional
Drop a conditional for depth > MAX_DEPTH in search_cu. The depth cannot
be greater than MAX_DEPTH (== 3) since an earlier if-clause checks that
it is less than MAX_PU_DEPTH (== 4).
2017-01-28 21:35:27 +09:00
Ari Koivula 937a764987 Fix bug in --mv-constraint
Subpixel motion estimation returns a zero vector when no subpixel vector is
within the constraint. Fix is to not call subpixel motion estimation
when the integer vector is not within the constraint.
2017-01-26 09:55:57 +02:00
Ari Koivula 4a0121ac42 Add --roi parameter
Adds region of interest coding capability.

Works by reading a file of delta QP values which will then be applied
to each frame at LCU level.
2017-01-26 09:14:14 +02:00
Ari Koivula 6f61836989 Refactor kvz_rdoq_sign_hiding
Rename and reorder everything to make more sense.

- Moved input tables into their own struct and renamed them to what
  they actually represent.
- Renamed pretty much every variable to conform to our style and
  to make sense.
- Removed the lastCG stuff, as the function already gets passed the
  last coeff anyway. (it was named width, what the hell?)
2017-01-19 23:58:17 +02:00
Ari Koivula a85390d0ac Clean up code using the fixed point frac bit tables
This is to prepare for changing the code using the floating point table
to use the fixed point table instead.

This also allows reducing the size of the fractional part, which was
useful for finding every place where the fixed point representation
is relied upon.
2017-01-19 20:20:51 +02:00
Ari Koivula 24a69c7467 Refactor luma deblocking
Changes luma deblocking to use gather and scatter instead of reading
from and writing to here and there in memory. Should make them
faster and easier to vectorize, or at least cleaner.

Splits strong and weak luma deblocking to two functions, as they have
almost nothing in common.
2017-01-17 22:13:39 +02:00
Ari Koivula 4cb2fca924 Refactor deblock decision 2017-01-17 19:34:17 +02:00
Arttu Ylä-Outinen 05794c3548 Add missing static to function lambda_to_qp 2017-01-11 15:53:55 +09:00
Arttu Ylä-Outinen ee518e8ac4 Take header bits into account in rate control 2017-01-11 15:53:55 +09:00
Arttu Ylä-Outinen c219d3cd94 Fix deblock when CU QP delta is enabled
Fixes deblock functions so that they use the correct QP for the filtered
edge. Adds field qp to cu_info_t.
2017-01-11 15:53:22 +09:00
Arttu Ylä-Outinen 82a98180e4 Clip LCU lambda to reduce quality fluctuation
Limits lambdas for each LCU based on the computed lambda from the
previous frame and the frame-level lambda.
2017-01-09 01:24:23 +09:00
Arttu Ylä-Outinen 93172fd251 Use separate alpha, beta and lambda for each LCU
Changes rate control to use the alpha and beta values stored in
lcu_stats_t instead of the frame-level values when selecting lambda and
QP for an LCU.
2017-01-09 01:24:23 +09:00
Arttu Ylä-Outinen 3af4e9cc8a Allocate bits separately for each LCU
Bits are allocated based on the costs of the LCUs in the previous
completely coded frame.

Breaks deblock when rate control is used.
2017-01-09 01:24:23 +09:00
Arttu Ylä-Outinen ff5e5ec6d4 Record info about coded LCUs
Adds field lcu_stats to encoder_state_config_frame_t. The following data
is recorded for each LCU:
    - number of bits
    - squared cost
    - used lambda value
    - alpha parameter used for rate control
    - beta parameter used for rate control
2017-01-09 01:24:23 +09:00
Arttu Ylä-Outinen 2a4243acbe Refactor rate control
Moves all code related to setting QP and lambda values to rate_control
module.
2017-01-09 01:24:23 +09:00
Arttu Ylä-Outinen 71633889ce Enable CU QP delta when using rate control
When rate control is enabled, enable cu_qp_delta_enabled_flag in PPS
with diff_cu_qp_delta_depth set to 0. Also adds code for writing the QP
deltas and a new cabac context.
2017-01-09 01:24:23 +09:00
Arttu Ylä-Outinen 640ff94ecd Use separate lambda and QP for each LCU
Adds fields lambda, lambda_sqrt and qp to encoder_state_t. Drops field
cur_lambda_cost_sqrt from encoder_state_config_frame_t and renames
cur_lambda_cost to lambda.
2017-01-09 01:24:23 +09:00
Arttu Ylä-Outinen 435c387357 Refactor rate control
- Defines MIN_LAMBDA and MAX_LAMBDA constants.
- Moves resetting state->frame->cur_gop_bits_coded to rate_control.c.
- Changes gop_allocate_bits to return the number of bits allocated like
  pic_allocate_bits does.
2017-01-09 01:24:23 +09:00
Arttu Ylä-Outinen 6c4f2d196a Move fields from encoder_state_t to frame
Moves fields prepared and frame_done from encoder_state_t to
encoder_state_config_frame_t.
2017-01-09 01:24:23 +09:00
Arttu Ylä-Outinen 97863cdaa2 Fail encoder init when CQM file cannot be opened 2017-01-08 19:17:43 +09:00
Arttu Ylä-Outinen db5e750c7f Fix --threads=auto
When --threads=auto was given on the command line, cfg->threads was
actually set to zero, disabling threads altogether. Fixed to set
cfg->threads to -1, so that the number of threads is chosen
automatically.
2017-01-08 17:58:22 +09:00
Ari Koivula a9e45efcfc Add a fast lane for byte-aligned bitstream writes
The CABAC engine only writes to the bitstream when it has a full byte.
These writes are also always byte-aligned, so there is no need to even
check for stream alignment.

Speedup was around 3% with ultrafast and low QP.
2016-12-23 17:01:44 +02:00
Jaakko Laitinen deb63f735f Fix gop disabling 2016-12-20 14:25:13 +02:00
Ari Lemmetti 70a52f0e48 10-bit: add missing bit depth adjustment to ssd 2016-11-17 19:28:04 +02:00
Ari Koivula fa078102f1 Fix 32bit compilation
Got a warning about implicit cast from uint64_t to void*.
2016-11-17 17:53:57 +02:00
Ari Koivula 5ceec06bd3 Merge pull request #148 from Venti-/crypto
Crypto
2016-11-16 21:33:55 +02:00
Ari Lemmetti c31207ea7d Optimize intra reference building
-Add function with reduced logic for the most common case
2016-11-16 18:28:42 +02:00
Ari Koivula 24f2a23ef8 Remove unnecessary crypto state
The frame does not need its own crypto state, since it always has at
least one sub tile.
2016-11-16 13:58:41 +02:00
Ari Koivula 8951e34fd2 Change crypto.h stubs to print instead of assert 2016-11-16 13:58:41 +02:00
Wassim Hamidouche ea82c38906 correct memory allocation 2016-11-16 12:35:28 +02:00
Wassim Hamidouche da3e2d1d07 resolve parallel encryption 2016-11-16 12:35:28 +02:00
Ari Koivula b8a618e666 Fix problems with >8 bit input
Enforce bit depth promised by --input-bitdepth to avoid crashes when
larger values are provided.

Do endianness byte swap for all bytes when the buffer gets extended
to multiple of 8 pixels, and not just the number of input pixels.

Don't swap bytes on a little-endian system.
2016-11-13 19:58:54 +02:00
Ari Koivula 2c005cda25 Fix bug with sub-pixel motion estimation in tiles
The width of the tile was being used to index the frame pixel buffer
instead of the width of the buffer.
2016-11-07 15:53:52 +02:00
Ari Koivula 78a28e0338 Reformat --help message
- Reduce indentation to 6 spaces
- Word wrap everything to under 80 characters
- Remove defaults from options covered by presets
- Add a dash in front of argument descriptions
- Add --(no-) to names of parameters that accept it and remove mention
  of enabling or disabling
- Add executable and scripts as a dependency to make docs
2016-11-04 15:40:28 +02:00
Ari Koivula d18de19d8a Fix DTS and PTS not being passed on through lib API
Fixes "cur_dts is invalid" warning from FFmpeg.
2016-10-28 19:05:47 +03:00
Ari Koivula 0c41c2ebd6 Make CLI set PTS for each input picture
This value is not represented in the HEVC bitstream, which is why it
was not set previously. FFmpeg sets and needs it however, so make the
CLI set it as well to make sure we handle it correctly.
2016-10-28 19:03:03 +03:00
Ari Koivula 5bf745460d Re-categorize options in the help message
- Move VUI stuff to the bottom
- Merge Parallel processing, WPP, Tiles and slices
- Add more categories for the other options
2016-10-27 03:26:15 +03:00
Ari Koivula cb6672b452 Disable WPP when Tiles are enabled
Closes #142.
2016-10-27 02:07:10 +03:00
darealshinji 488d042e5f Bump KVZ_VERSION 2016-10-25 12:32:13 +02:00
Ari Lemmetti 29153ed503 Remove unused variable 2016-10-21 17:28:42 +03:00
Ari Lemmetti 778e46dfd8 Add AVX2 version of SSD 2016-10-21 15:07:53 +03:00
Ari Lemmetti 6f5d7c9e06 Move SSD to strategies 2016-10-21 15:07:23 +03:00
Ari Lemmetti 89b941eab4 Fix typo 2016-10-21 15:07:02 +03:00
Alexis Ballier 1dcc993743 Include i386 & i486 for compiling intel asm.
x86_64-pc-linux-gnu-gcc -m32 that I use for building 32bits libraries on amd64 defines only __i386__.
2016-10-14 18:07:37 +02:00
Arttu Ylä-Outinen 5fb7afe8c4 Add --implicit-rdpcm command line parameter.
Makes it possible to use lossless coding without implicit residual DPCM.
2016-10-03 20:01:55 +09:00
Arttu Ylä-Outinen 5affc0f527 Use implicit RDPCM in lossless mode.
Sets implicit RDPCM flag in SPS when lossy coding is disabled and
applies DPCM to intra residual when prediction mode is horizontal or
vertical.
2016-10-03 19:31:38 +09:00
Ari Koivula 016dbe0894 Further refine presets
The rd-complexity of slow presets is better with a less aggressive GOP.

Adding the GOP as part of the preset improved BDRate enough that it
no longer made sense to have veryslow target the best BDRate.
Instead, push that responsibility to placebo by making it a little bit
faster.
2016-09-29 17:35:12 +03:00
Ari Koivula 31c5ff0f16 Add cross-platform core number detection
Well, turns out pthread_num_processors_np isn't standard so we need to
do this crap. Threw in hyper threading detection as a bonus.
2016-09-29 00:03:21 +03:00
Ari Koivula 8c7351eac8 Fix lp-gop with depth 1
GOPs with depth 1 had the same structure as those with depth 2:
g4d3t1 = 3 2 3 1
g4d2t1 = 2 2 2 1
g4d1t1 = 2 2 2 1

It now results in the correct:
g4d1t1 = 1 1 1 1
2016-09-29 00:03:21 +03:00
Ari Koivula a395aeaac9 Set default settings to those of --preset=medium 2016-09-29 00:03:21 +03:00
Ari Koivula 4388fe0d30 Set presets to ratedistortion-complexity optimized versions 2016-09-29 00:03:20 +03:00
Ari Koivula facb1e16df Use -p64 -q22 and --gop=lp-g4d3t1 by default
Coding inter without GOP of any kind really isn't a very sensible
default. Defaulting to B-GOP of some kind would be better,
but lp-gop is more robust for now.
2016-09-29 00:03:20 +03:00
Ari Koivula d7391a9593 Improve default for number of parallel frames 2016-09-29 00:03:20 +03:00
Ari Koivula 19d423ab29 Use all available cores by default 2016-09-29 00:03:20 +03:00
Ari Koivula 3f138f087a Allow non-gop-length --period for lp-gop 2016-09-29 00:03:19 +03:00
Ari Koivula 16790c9f15 Remove number of references from --gop=lp syntax
The number of references should be part of the presets, so gop should
be defined separately.
2016-09-29 00:03:19 +03:00
Ari Koivula cbfa824d1a Merge branch 'simd' 2016-09-27 20:49:45 +03:00
Ari Koivula 14a7bcba25 Use a faster function for clipped inter SAD
Use the vectorized general SSE41 inter SAD in AVX reg_sad for shapes
for which we don't have AVX versions yet.

Also improves speed of --smp and --amp a lot. Got a 1.25x speedup for:
--preset=ultrafast -q 27 --gop=lp-g4d3r3t1 --me-early-termination=on --rd=1 --pu-depth-inter=1-3 --smp --amp

* Suite speed_tests:
-PASS inter_sad: 0.898M x reg_sad(64x63):x86_asm_avx (1000 ticks, 1.000 sec)
+PASS inter_sad: 2.503M x reg_sad(64x63):x86_asm_avx (1000 ticks, 1.000 sec)
-PASS inter_sad: 115.054M x reg_sad(1x1):x86_asm_avx (1000 ticks, 1.000 sec)
+PASS inter_sad: 133.577M x reg_sad(1x1):x86_asm_avx (1000 ticks, 1.000 sec)
2016-09-27 20:48:30 +03:00
Arttu Ylä-Outinen 4313e56c2d Add --no-rdoq-skip command line switch 2016-09-11 17:40:16 +09:00
Ari Koivula a7a33b08ec Remove --slice-addresses from usage message
And give a warning if it's used.

Slices will have to be implemented at some point, but they aren't yet
so let's not advertize them.
2016-09-10 21:06:00 +03:00
Eemeli Kallio f41e428e5f Removed kvz_skip_unnecessary_rdoq and reworked --rdoq-skip to skip 4x4 blocks when it is on. 2016-09-09 10:26:07 +03:00
Eemeli Kallio ed9c0b0416 RDOQ reworked in rdo.c. rdoq_signhide now skips coeffs that are after best_last_idx. 2016-09-09 10:16:51 +03:00
Ari Koivula 02cd17b427 Add faster AVX inter SAD for 32x32 and 64x64
Add implementations for these functions that process the image line by
line instead of using the 16x16 function to process block by block.

The 32x32 is around 30% faster, and 64x64 is around 15% faster,
on Haswell.

PASS inter_sad: 28.744M x reg_sad(32x32):x86_asm_avx (1014 ticks, 1.014 sec)
PASS inter_sad: 7.882M x reg_sad(64x64):x86_asm_avx (1014 ticks, 1.014 sec)
to
PASS inter_sad: 37.828M x reg_sad(32x32):x86_asm_avx (1014 ticks, 1.014 sec)
PASS inter_sad: 9.081M x reg_sad(64x64):x86_asm_avx (1014 ticks, 1.014 sec)
2016-09-01 21:36:39 +03:00
Ari Koivula d0512d25c6 Use fixed point in get_mvd_coding_cost 2016-08-30 21:37:12 +03:00
Ari Koivula ec7507a935 Further optimize get_ep_ex_golomb_bitcost
Unrolled 16-bit log2 calculation.
2016-08-30 21:37:01 +03:00
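An unrolled 16-bit log2 can be written as a short chain of range tests, as in the sketch below. This is a generic illustration, not the code of get_ep_ex_golomb_bitcost: the names floor_log2_u16 and exp_golomb0_bits are hypothetical, and the bit cost shown is for a plain order-0 Exp-Golomb code, whose length is 2 * floor(log2(v + 1)) + 1 bits.

```c
#include <stdint.h>

/* Floor of log2(v) for v >= 1, unrolled into four range tests
 * instead of a loop. Returns 0 for v == 0. */
unsigned floor_log2_u16(uint16_t v) {
  unsigned result = 0;
  if (v & 0xFF00) { v >>= 8; result += 8; }
  if (v & 0x00F0) { v >>= 4; result += 4; }
  if (v & 0x000C) { v >>= 2; result += 2; }
  if (v & 0x0002) { result += 1; }
  return result;
}

/* Length in bits of the order-0 Exp-Golomb code for v (v < 65535). */
unsigned exp_golomb0_bits(uint16_t v) {
  return 2 * floor_log2_u16((uint16_t)(v + 1)) + 1;
}
```

Under these definitions the values 0, 1 and 3 cost 1, 3 and 5 bits respectively.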
Ari Koivula a4ba794587 Optimize get_ep_ex_golomb_bitcost
Arrange the decision tree such that there are only 3 branches on the
most common paths and the more likely branch is always fall-through.

A profile guided optimization pass would probably do something similar.
2016-08-30 05:24:16 +03:00
Ari Koivula 82cfab58f8 Improve fast mvd coding cost estimation
A lot of time is being taken up by this function on ultrafast, and it
doesn't do a very good job. This change aims to both simplify the
logic and make the estimate better.

The logic is simplified by using a lookup for the mvd bit cost
step function instead of mimicking the binarization process. The
estimation is made better by checking fractional cabac bit costs.

The new function returns the same results as
kvz_get_mvd_coding_cost_cabac, but is also faster than the old
function.
2016-08-30 04:55:09 +03:00
Ari Koivula d31be8eb27 Make mvd_coding_cost functions take const cabac 2016-08-30 04:46:46 +03:00
Ari Koivula 64d631c174 Fix 8bit to 10bit input conversion regression 2016-08-25 22:09:40 +03:00
Ari Koivula 27789125d8 Fix input bit depth conversion
The input was being shifted in the wrong direction.
2016-08-25 22:05:25 +03:00
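A small sketch of the shift direction at issue in 27789125d8, assuming samples are scaled by a plain bit shift. The helper convert_sample_depth is hypothetical and is not the encoder's actual input code.

```c
#include <stdint.h>

// Hypothetical helper: scale one sample from in_depth to out_depth bits.
// Widening (e.g. 8 -> 10 bits) shifts left; narrowing shifts right.
uint16_t convert_sample_depth(uint16_t sample, int in_depth, int out_depth)
{
  if (out_depth >= in_depth) {
    return (uint16_t)(sample << (out_depth - in_depth));
  }
  return (uint16_t)(sample >> (in_depth - out_depth));
}
```

Shifting the other way when widening would discard the low bits of every sample instead of scaling it up.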
Ari Koivula 4ec039004b Add monochrome encoding
Write bitstream without chroma when encoding with --input-format=P400.
This reduces bitstream size by 0-1 %, compared to coding monochrome in
420 format, and speeds up encoding slightly due to not processing
chroma.
2016-08-25 20:15:26 +03:00
Ari Koivula c5b70cf812 Add chroma format support to yuv_t 2016-08-24 19:20:53 +03:00
Ari Koivula 032ed30ff4 Add chroma format support to kvz_picture
Add picture_alloc_csp to libkvz API to allocate pictures with chroma
format different from 420.
2016-08-24 19:20:53 +03:00
Ari Koivula 48ccc26839 Add --input-format and --input-bitdepth
Adds reading of 10 bit input for 10-bit encoding.
2016-08-24 19:20:53 +03:00
Ari Koivula cc08073615 Refactor some indexing weirdness in init_lcu_t
I thought there might be a bug in this so I cleaned it up.
2016-08-24 19:12:48 +03:00
Ari Koivula b6d674d66e Refactor integer vector inter prediction
This code was pretty bad, so I cleaned it up a bit.
2016-08-24 19:09:26 +03:00
Ari Lemmetti 28c4174d0e Fix incorrect shuffle parameters
_MM_SHUFFLE uses reverse order
2016-08-23 19:40:46 +03:00
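To illustrate the reverse order noted in 28c4174d0e: _MM_SHUFFLE takes the selector for the highest lane first and the lowest lane last. The sketch below mirrors that bit layout with a hypothetical SHUFFLE_IMM macro and a scalar shuffle4 helper standing in for a 4-lane shuffle intrinsic, so it builds without SSE headers.

```c
#include <stdint.h>

// Same bit layout as _MM_SHUFFLE(e3, e2, e1, e0): the first argument
// selects the source for lane 3, the last argument for lane 0.
#define SHUFFLE_IMM(e3, e2, e1, e0) \
  (((e3) << 6) | ((e2) << 4) | ((e1) << 2) | (e0))

// Hypothetical scalar stand-in for a 4-lane shuffle:
// destination lane i takes source lane (imm >> (2 * i)) & 3.
void shuffle4(int32_t dst[4], const int32_t src[4], int imm)
{
  for (int i = 0; i < 4; i++) {
    dst[i] = src[(imm >> (2 * i)) & 3];
  }
}
```

With this layout, SHUFFLE_IMM(3, 2, 1, 0) is the identity and SHUFFLE_IMM(0, 1, 2, 3) reverses the lanes, which is easy to get backwards when reading the arguments left to right.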
Ari Lemmetti ce77bfa15b Replace KVZ_PERMUTE with _MM_SHUFFLE
The same exact macro already exists
2016-08-22 19:08:46 +03:00
Jovasa 68eef660bd Fixed search around mv_in in fullsearch not being saved. 2016-08-19 15:19:29 +03:00
Eemeli Kallio 99d8b9abeb Changed skip_rdoq name to kvz_skip_unnecessary_rdoq. Changed the order it uses when it goes through CGs and tuned its sum calculation. 2016-08-18 14:02:56 +03:00
Eemeli Kallio 1fb4755f31 Added rdoq-skip to quant-generic.c 2016-08-18 12:17:54 +03:00
Eemeli Kallio d20ac03ca2 Added --rdoq-skip option 2016-08-18 12:17:53 +03:00
Marko Viitanen 83cf801664 Fixed MV constraint condition in bipred 2016-08-18 08:53:17 +03:00
Marko Viitanen 5ae1c595f2 Fixed slice_temporal_mvp_enabled_flag and disabled TMVP with tiles
- slice_temporal_mvp_enabled_flag should be signalled also with non-IDR I-slices
2016-08-10 14:51:41 +03:00
Marko Viitanen 5326519182 TMVP cleanup and const qualifier fixes 2016-08-10 14:10:43 +03:00
Marko Viitanen f40907260d Added config parameter for TMVP and cmdline option --no-tmvp
- Enabled by default
- Cannot be used with GOP at the moment
2016-08-10 14:09:29 +03:00
Marko Viitanen fd52dac1f7 Fixed TMVP scaling 2016-08-10 14:09:28 +03:00
Marko Viitanen c664bc8cf7 Added flag collocated_ref_idx to the slice header 2016-08-10 14:09:28 +03:00
Marko Viitanen c5f2611a38 Fixes for TMVP to work with the new CU array 2016-08-10 14:09:28 +03:00
Marko Viitanen d85af5755b TMVP working when only 1 ref frame 2016-08-10 14:09:28 +03:00
Marko Viitanen 39f0165efe Fix a bug in TMVP, the reference cu_array was being overwritten 2016-08-10 14:09:27 +03:00
Marko Viitanen adab8c327e Clean TMVP code 2016-08-10 14:09:20 +03:00
Marko Viitanen 5fa8226ac9 Temporal merge candidate selection 2016-08-10 14:09:20 +03:00
Marko Viitanen f83042f4a1 Temporal MV candidate selection 2016-08-10 14:09:19 +03:00
Marko Viitanen f8671581e3 Implemented function kvz_inter_get_temporal_merge_candidates() 2016-08-10 14:09:19 +03:00