Arttu Ylä-Outinen
edbe00763e
Drop extra parameter in kvz_image_calc_sad
...
Drops the parameter max_lcu_below which was always set to -1.
2017-07-24 15:21:19 +03:00
Arttu Ylä-Outinen
ffac29061f
Fix extrapolated inter SATD
2017-07-24 15:11:05 +03:00
Arttu Ylä-Outinen
631ef53d2a
Fix inter cost calculations
...
Inter costs are computed using SAD except when fractional motion
estimation or bi-prediction is enabled. This commit changes
search_pu_inter_ref to recalculate the cost with SATD. Fixes inter/intra
cost comparisons since intra costs are always SATD costs.
2017-07-24 15:11:05 +03:00
Arttu Ylä-Outinen
6ce2fb1238
Add pixel offsets to encoder_state_config_tile_t
...
Adds fields offset_x and offset_y to encoder_state_config_tile_t.
2017-07-24 15:11:05 +03:00
Arttu Ylä-Outinen
2380ba0d41
Reduce copying in kvz_get_coeff_cost
...
Changes function kvz_get_coeff_cost to only copy the CABAC contexts and
not the whole encoder state.
Other threads could be simultaneously using the other parts of the
encoder state. Only copying the CABAC fixes a TSan data race warning.
2017-07-24 12:38:41 +03:00
Arttu Ylä-Outinen
24b462f801
Align coefficients to 8 bytes
...
Adds alignment attribute to lcu_coeff_t. The coefficients are sometimes
handled as 64-bit integers containing four coefficients so the arrays
should be aligned to 8 bytes.
Fixes a UBSan error about misaligned reads.
2017-07-24 12:37:37 +03:00
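The alignment requirement described in 24b462f801 can be sketched roughly as follows. This is a minimal illustration assuming C11; the type coeff_block_t and the function four_coeffs_are_zero are hypothetical names and are not taken from kvazaar.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a coefficient buffer; not the real lcu_coeff_t.
 * alignas(8) forces the array to start on an 8-byte boundary. */
typedef struct {
  alignas(8) int16_t y[16];
} coeff_block_t;

/* Checks four 16-bit coefficients at once by viewing them as one 64-bit
 * word. memcpy keeps the access well-defined; when the source is 8-byte
 * aligned the compiler is free to emit a single aligned load. */
static int four_coeffs_are_zero(const int16_t *coeffs)
{
  assert(coeffs != NULL);
  uint64_t packed;
  memcpy(&packed, coeffs, sizeof(packed));
  return packed == 0;
}
```

Code that instead casts the int16_t pointer to a 64-bit pointer depends on this alignment, which is presumably why UBSan flagged the unaligned arrays.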
Arttu Ylä-Outinen
5ddb43c6fe
Fix undefined left shifts in rdo
...
Replaces left shifts by multiplications when the operand may be
a negative value. Left shift of a negative value is undefined behavior.
2017-07-24 12:35:10 +03:00
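A minimal sketch of the replacement described in 5ddb43c6fe, assuming a 32-bit operand. The helper name scale_by_pow2 is hypothetical; the actual commit changes expressions in place rather than adding a helper.

```c
#include <assert.h>
#include <stdint.h>

/* Scales a possibly negative value by 2^shift without shifting it.
 * `value << shift` is undefined behavior in C when value is negative,
 * whereas the multiplication is well-defined as long as the result
 * does not overflow. */
static int32_t scale_by_pow2(int32_t value, unsigned shift)
{
  assert(shift < 31);
  return value * (int32_t)(1u << shift);
}
```

Compilers typically turn a multiplication by a power of two back into a shift instruction, so the change should not cost any speed.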
Arttu Ylä-Outinen
d1e64ad62b
Fix undefined left shifts
...
Replaces left shifts by multiplications when the operand may be
a negative value. Left shift of a negative value is undefined behavior.
2017-07-20 11:15:30 +03:00
Arttu Ylä-Outinen
07b5fb9caf
Fix out-of-bounds read in encoderstate
...
When calling encoder_state_encode_leaf with POC 0, index -1 of the GOP
array would be accessed. Fixed by skipping the code for I-frames.
2017-07-20 11:15:30 +03:00
Arttu Ylä-Outinen
8c4a3473a8
Change --owf=auto and --threads=auto selection
...
Changes OWF selection so that it is chosen based on the maximum number
of parallel CTUs. Number of threads is limited to prevent overhead from
extra threads.
2017-07-20 09:42:28 +03:00
Arttu Ylä-Outinen
4fc9b743c1
Drop an unnecessary pthread_cond_broadcast
...
Drop pthread_cond_broadcast on threadqueue->cond in function
kvz_threadqueue_waitfor. The broadcast caused threads to be woken up
more often than necessary.
2017-07-19 11:09:30 +03:00
Arttu Ylä-Outinen
14003c6a30
Disable printing PSNR with --no-psnr
2017-07-19 10:38:37 +03:00
Arttu Ylä-Outinen
e90bde5c62
Clarify PSNR output
...
Adds letters Y, U and V to the PSNR output to make it clearer that the
printed values are the luma and chroma PSNR.
2017-07-19 10:33:43 +03:00
Arttu Ylä-Outinen
fdb3480b54
Enable strategies for SAO reconstruction
...
Re-enables strategies for SAO reconstruction. They were disabled in
commit ec9ff42.
2017-07-11 10:35:18 +03:00
Arttu Ylä-Outinen
333dba3884
Add static to SAO strategies
2017-07-11 10:02:01 +03:00
Miika Metsoila
e8cc2d8f6a
Small fixes
2017-07-07 13:58:19 +03:00
Arttu Ylä-Outinen
67a60a35e3
Fix invalid calls to normalize_lcu_weights
...
Changes encoder_state_init_new_frame to only call normalize_lcu_weights
when the weights have been written to the array and rate control is
enabled. When rate control is disabled, the weights are not used.
2017-07-07 11:05:31 +03:00
Arttu Ylä-Outinen
563bc26e71
Fix out-of-bounds read in AVX2 SAO
...
AVX2 version of SAO loaded offsets with a 256 bit read even though there
are only five 32 bit integers.
2017-07-06 13:04:52 +03:00
Arttu Ylä-Outinen
0850b17f96
Drop get_wpp_limit in search_inter
...
WPP limit for motion vectors is now computed inside fracmv_within_tile.
2017-07-05 13:22:53 +03:00
Arttu Ylä-Outinen
2a85f0f5a4
Move hard-coded MV limits to encoder_control_t
...
Adds field max_inter_ref_lcu to encoder_control_t. It is used to set up
inter-LCU dependencies in encoder_state_encode_leaf and restrict motion
vectors in fracmv_within_tile.
2017-07-05 13:22:53 +03:00
Arttu Ylä-Outinen
bb5354f7e2
Relax inter-CTU dependencies when SAO is off
...
When using WPP and OWF, the first CTU of a row depends on the last CTU
of the row below in the reference frame. This is necessary when SAO is
enabled since we currently do SAO for a whole CTU row at a time. When
SAO is disabled, however, it is unnecessary to wait for the whole row.
Changes CTUs to depend only on the CTU below in the reference frame
instead of the whole row when WPP and OWF are enabled and SAO disabled.
Gives a significant speedup when running on a machine with many CPU
cores.
2017-07-05 13:21:06 +03:00
Arttu Ylä-Outinen
1efa2708b2
Do SAO reconstruction for a single CTU at a time
...
Moves SAO reconstruction into encoder_state_worker_encode_lcu instead of
doing it in a separate step for the whole CTU row. Reconstruction of the
rightmost 10 pixels and bottommost 10 pixels of a CTU is delayed until
the neighboring CTU has been deblocked.
Doing SAO for the whole CTU row at a time caused unnecessary inter-CTU
dependencies when using WPP and OWF. The first CTU of a row would need
to wait until SAO was done for the row below in the previous frame.
Moving SAO reconstruction to immediately after deblocking each CTU fixes
this problem.
2017-07-04 15:14:31 +03:00
Arttu Ylä-Outinen
ec9ff42077
Rewrite SAO recon to handle arbitrary sized blocks
...
Adds width and height parameters to function kvz_sao_reconstruct and
changes it to take coordinates in units of pixels. This will be useful
for doing SAO for areas smaller than a whole CTU.
2017-06-30 16:09:18 +03:00
Miika Metsoila
dcd7acf4fd
Fixed crash and incorrect info output
2017-06-27 16:05:15 +03:00
Miika Metsoila
f8b6234fdb
Changes to reference lists to behave more like L0/L1 lists from the specification
2017-06-27 16:05:15 +03:00
Arttu Ylä-Outinen
2c66e0bbd2
Fix warnings about invalid reads in AVX2 ipol
...
AVX2 filter functions read pixels in chunks of 8 or 16 bytes. At the end
of the block, the read goes out of the bounds of the pixels array. The
extra pixels do not affect the result.
Fixes valgrind complaining about the invalid reads by allocating 5 extra
pixels in kvz_get_extended_block_avx2.
2017-06-22 09:37:55 +03:00
Arttu Ylä-Outinen
4d20e156db
Fix handling intra period not multiple of GOP length
...
With low delay GOP structure, it is possible to use an intra period that
is not a multiple of the GOP structure length. Commit 00c9f52 changed
encoder_state_init_new_frame to reset POC on intra frames. GOP offset,
however, was not reset, resulting in invalid POCs and references for the
following frames.
This commit changes function kvz_encoder_feed_frame so that GOP offset
is correctly reset on intra frames.
2017-06-22 09:29:00 +03:00
Arttu Ylä-Outinen
00c9f52bd4
Fix setting picture type when using GOP
...
Changes encoder_state_init_new_frame to set intra frame pictype to
KVZ_NAL_IDR_W_RADL even when using GOP.
2017-06-21 13:21:47 +03:00
Arttu Ylä-Outinen
f54a25f112
Fix crash when immediately closing encoder
...
When closing the encoder, the pictures stored in the input frame buffer
are freed by repeatedly calling kvz_encoder_feed_frame. If the encoder
was closed immediately after opening it, kvz_encoder_feed_frame would be
called with an unprepared encoder state. This would trigger an assert.
Fixed by changing kvz_encoder_feed_frame so that it does not require the
encoder state to be prepared.
2017-06-15 11:57:46 +03:00
Arttu Ylä-Outinen
b74e0458fd
Set inter transform depth to zero
...
Sets max_transform_hierarchy_depth_inter to 0 in SPS. This saves some
bits because split_transform_flag does not need to be coded for inter
blocks.
When SMP and AMP blocks are enabled the depth is set to 1 instead.
Otherwise inter split flag would default to 1 for SMP and AMP blocks,
resulting in an unnecessary transform split.
2017-06-08 10:08:20 +03:00
Arttu Ylä-Outinen
8dd01ba5a9
Refactor helper functions in search
...
Combines functions lcu_set_intra_mode and lcu_set_inter_pu to a single
function. Removes some duplicated code.
2017-06-06 10:32:09 +03:00
Arttu Ylä-Outinen
1bbecf7584
Refactor work tree copy functions
...
Extracts common code shared by work_tree_copy_up and work_tree_copy_down
to a separate function.
2017-06-06 10:32:00 +03:00
Arttu Ylä-Outinen
2b169d5d63
Fix crash in kvazaar_close
...
Changes kvazaar_close to stop all threads before freeing encoder states.
Fixes a crash when the encoder is closed before all pictures have been
encoded.
2017-06-02 10:05:33 +03:00
Arttu Ylä-Outinen
eb9a05b7ef
Fix memory leak
...
Changes kvazaar_close to free the remaining pictures in the input
frame buffer. Fixes a memory leak when the encoder is closed while there
are pictures left in the buffer.
2017-06-01 15:39:35 +03:00
Arttu Ylä-Outinen
8b2483ca1c
Combine intra reconstruction functions
...
Replaces function kvz_intra_recon_lcu_luma and
kvz_intra_recon_lcu_chroma in intra.c with function kvz_intra_recon_cu.
The new function can handle reconstruction for both luma and chroma.
Removes some duplicated code.
2017-05-24 12:07:31 +03:00
Arttu Ylä-Outinen
e67fdb853d
Move intra leaf TB recon to a separate function
...
Moves code for intra leaf transform block reconstruction from functions
kvz_intra_recon_lcu_luma and kvz_intra_recon_lcu_chroma to a new
function intra_recon_tb_leaf. Removes some duplicated code.
2017-05-24 12:07:31 +03:00
Arttu Ylä-Outinen
13d2fdbd21
Drop unused kvz_videoframe_get_cu functions
2017-05-24 11:15:31 +03:00
Arttu Ylä-Outinen
f5eef7f33c
Use luma pixel coordinates in encode_coding_tree
...
Changes functions encode_intra_coding_unit and encode_coding_tree to
take coordinate arguments in units of luma pixels instead of 8 px
blocks. This should make the code easier to understand.
2017-05-24 11:15:31 +03:00
Arttu Ylä-Outinen
525a5180ff
Combine intra CU encoding functions
...
Merges functions encode_intra_coding_unit and
encode_intra_coding_unit_encry. Removes a lot of duplicated code.
2017-05-24 11:12:40 +03:00
Arttu Ylä-Outinen
610c91b0c5
Use luma pixel coordinates in TU coding functions
...
Changes functions encode_transform_unit and encode_transform_coeff to
take coordinate arguments in units of luma pixels instead of 4 px
blocks. This should make the code easier to understand.
2017-05-23 15:36:16 +03:00
Arttu Ylä-Outinen
2e8838de6e
Fix crash when crypto compiled in but disabled
...
When kvazaar was built with crypto++ but running without using
encryption features, kvazaar attempted to delete an uninitialized crypto
handle. Fixed by setting the handle to NULL in kvz_encoder_state_init.
2017-05-23 14:01:48 +03:00
Arttu Ylä-Outinen
2f2c281e8e
Fix a memory leak in crypto
...
A CryptoPP::CFB_Mode<CryptoPP::AES>::Encryption was allocated at the
beginning of encoder_state_encode_leaf and was never freed. This commit
changes encoder_state_worker_encode_lcu to delete the CFB_Mode. Also
moves crypto handle from encoder_state_config_tile_t to encoder_state_t
so that it can be safely deleted without affecting other threads in the
same tile.
2017-05-23 11:51:25 +03:00
Arttu Ylä-Outinen
22155950c1
Rewrite crypto to conform to kvazaar code style
2017-05-23 11:51:25 +03:00
Arttu Ylä-Outinen
6829865190
Fix inline declaration in intra_mode_encryption
...
Moves the inline declaration of intra_mode_encryption before the type
and changes it to use the INLINE macro. Inline declaration after type
triggered a warning on GCC.
2017-05-23 11:50:32 +03:00
Arttu Ylä-Outinen
5f8e17d4ba
Eliminate a race condition in threadqueue
...
Fixes the order of acquiring locks for the job and its dependency in
kvz_threadqueue_job_dep_add. The dependency is locked before the job
that depends on it. This is the same order as in threadqueue_worker.
Acquiring the locks in different order in kvz_threadqueue_job_dep_add
and threadqueue_worker would sometimes result in a deadlock.
2017-05-18 12:25:53 +03:00
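The lock ordering rule from 5f8e17d4ba can be sketched as below. The job_t type here is a hypothetical, heavily simplified stand-in for the real thread queue job, and job_init and job_dep_add are illustrative names only.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical, simplified job type. */
typedef struct job {
  pthread_mutex_t lock;
  int ndepends;          /* number of unfinished dependencies */
  struct job *rdepend;   /* one job waiting for this one (simplified) */
} job_t;

static void job_init(job_t *job)
{
  pthread_mutex_init(&job->lock, NULL);
  job->ndepends = 0;
  job->rdepend = NULL;
}

/* Makes `job` depend on `dependency`. The dependency is always locked
 * first and the dependent job second. Every code path that holds both
 * locks has to use the same order, otherwise two threads can each hold
 * one lock while waiting for the other. */
static void job_dep_add(job_t *job, job_t *dependency)
{
  assert(job != dependency);
  pthread_mutex_lock(&dependency->lock);
  pthread_mutex_lock(&job->lock);
  job->ndepends++;
  dependency->rdepend = job;
  pthread_mutex_unlock(&job->lock);
  pthread_mutex_unlock(&dependency->lock);
}
```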
Arttu Ylä-Outinen
4b213477f0
Return best MV from inter early terminate
...
When using --me-early-termination=sensitive, early termination of inter
search used to always return the starting point if no tested motion
vector was good enough to continue the search. This commit changes
early_termination to always return the best motion vector and cost
found.
2017-05-18 09:05:14 +03:00
Arttu Ylä-Outinen
382636de55
Fix handling too large QPs
...
Changes kvz_config_validate to output an error if the given QP is out of
range and changes kvz_set_picture_lambda_and_qp to clip the QP to the
valid range if it is too large after applying QP offset from GOP structure.
2017-05-17 12:41:51 +03:00
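The clipping step from 382636de55 might look roughly like this. The sketch assumes the 8-bit HEVC QP range of 0 to 51; the names clip_qp and picture_qp are hypothetical.

```c
#include <assert.h>

/* Assumed valid QP range for 8-bit video. */
enum { QP_MIN = 0, QP_MAX = 51 };

static int clip_qp(int qp)
{
  if (qp < QP_MIN) return QP_MIN;
  if (qp > QP_MAX) return QP_MAX;
  return qp;
}

/* The base QP is expected to have passed configuration validation
 * already. The GOP offset is applied afterwards, so the sum can still
 * leave the valid range and has to be clipped. */
static int picture_qp(int base_qp, int gop_qp_offset)
{
  assert(base_qp >= QP_MIN && base_qp <= QP_MAX);
  return clip_qp(base_qp + gop_qp_offset);
}
```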
Arttu Ylä-Outinen
de8b59c681
Drop unused function kvz_coefficients_blit
2017-05-12 16:48:30 +03:00
Arttu Ylä-Outinen
bcfa5a3cd9
Add a comment explaining the coefficient order
2017-05-12 16:46:57 +03:00
Arttu Ylä-Outinen
95775a1645
Change coefficient storage order
...
Changes coefficient storage order to a zig-zag order. Reduces
unnecessary copying of coefficients to temporary arrays.
2017-05-12 16:46:57 +03:00
Arttu Ylä-Outinen
9395867a9a
Quantize all colors in a single traversal
...
Changes kvz_quantize_lcu_residual to process all three colors in
a single traversal of the TU tree.
2017-05-12 16:42:41 +03:00
Arttu Ylä-Outinen
1e58fd6b16
Split kvz_quantize_lcu_residual
...
Splits kvz_quantize_lcu_residual to two functions that handle the TU
tree recursion and quantization of a single TU.
2017-05-12 16:42:41 +03:00
Arttu Ylä-Outinen
cc87e0dcc7
Combine luma and chroma quantization functions
...
Replaces functions kvz_quantize_lcu_luma_residual and
kvz_quantize_lcu_chroma_residual in transform.c with function
kvz_quantize_lcu_residual. The new function can handle any of the YUV
colors. Removes some duplicated code.
2017-05-12 16:42:41 +03:00
Arttu Ylä-Outinen
1357dd0599
Pass coeffs through encoder state
...
Changes the way coefficients are passed from kvz_search_lcu to
kvz_encode_coding_tree. Drops fields coeff_y, coeff_u and coeff_v in
videoframe_t and instead passes them through field coeff in
encoder_state_t.
2017-05-12 16:42:41 +03:00
Eemeli Kallio
2cad3173ec
Reduced amount of modes for search_intra_rdo
2017-05-12 15:56:07 +03:00
Arttu Ylä-Outinen
26adef4492
Merge branch 'erp-aqp'
2017-05-12 15:05:24 +03:00
Eemeli Kallio
55e0e65733
Added INLINE to kvz_get_ic_rate and kvz_get_coded_level in rdo.c
2017-05-12 15:03:30 +03:00
Arttu Ylä-Outinen
ee3d4d0e78
Add adaptive QP for 360 degree video
...
Adds option --erp-aqp for enabling adaptive QP for 360 degree video with
equirectangular projection. When projected into a spherical surface,
the middle part of the video covers relatively larger area than the top
and bottom parts. Enabling --erp-aqp sets up a ROI delta QP array which
uses higher QPs for the top and bottom of the video and lower QPs for
the middle part.
2017-05-11 12:31:53 +03:00
Arttu Ylä-Outinen
79cb3a2fd3
Permit negative QP deltas in ROI
...
Delta QPs should not be arbitrarily restricted to positive values.
2017-05-11 12:13:47 +03:00
Arttu Ylä-Outinen
edfbd6f122
Add field lcu_dqp_enabled to encoder_control_t
...
Delta QPs for LCUs are enabled when either ROI coding or rate control is
enabled. Having a single field is simpler than always checking whether
ROI or rate control is enabled.
2017-05-11 12:13:47 +03:00
Arttu Ylä-Outinen
2f2405dfe6
Fix crash when PU depth is limited
...
When video width or height was not a multiple of the smallest CU size,
no prediction would be performed at the border CUs. Kvazaar would later
crash at an assertion failure when attempting to write the bitstream for
the CU.
Fixed by permitting inter and intra prediction when the CU split is
forced, even if CUs of that size would otherwise be disabled.
2017-04-27 10:35:48 +03:00
Arttu Ylä-Outinen
9130b5107c
Change handling of infinite PSNR in encmain
...
Changes encmain to print 999.99 as PSNR when SSE is zero. This behavior
is in line with HM. Previously SSE was set to 99 when it was zero.
2017-04-27 10:35:13 +03:00
Arttu Ylä-Outinen
a9c878b535
Fix crash with WPP when threads are disabled
...
When WPP is enabled, a reference to SAO reconstruction job is copied
from the wavefront to the main encoder state. However, when threads are
disabled, the job is a null pointer and dereferencing it crashes the
encoder. Fixed by adding a null pointer check.
2017-04-24 12:59:57 +03:00
Arttu Ylä-Outinen
2991962033
Add reference counting to threadqueue_job_t
...
Both the thread queue and the encoder states hold pointers to the thread
queue jobs. It is possible that a job is removed from the thread queue
and freed while the encoder state is still using it. This commit adds
reference counting to threadqueue_job_t in order to fix the problem.
Fixes #161.
2017-04-12 16:13:52 +03:00
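Reference counting as described in 2991962033 can be sketched as follows. The job_t type and the functions job_new, job_copy_ref and job_release are hypothetical. The sketch is not thread-safe; real code would need a lock or atomic operations around the counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical job with a reference counter. */
typedef struct {
  int refcount;
  int payload;
} job_t;

/* Creates a job owned by the caller (refcount 1). */
static job_t *job_new(int payload)
{
  job_t *job = malloc(sizeof(*job));
  if (job == NULL) return NULL;
  job->refcount = 1;
  job->payload = payload;
  return job;
}

/* Registers one more holder of the job. */
static job_t *job_copy_ref(job_t *job)
{
  assert(job->refcount > 0);
  job->refcount++;
  return job;
}

/* Drops one reference and frees the job when the last holder lets go.
 * Returns true if the job was freed. */
static bool job_release(job_t *job)
{
  assert(job->refcount > 0);
  if (--job->refcount == 0) {
    free(job);
    return true;
  }
  return false;
}
```

With this scheme the queue and an encoder state can each hold a reference, and the job stays valid until both have released it.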
Arttu Ylä-Outinen
bd8adff43a
Drop unused defines in threads.h
2017-04-12 03:41:07 -07:00
Arttu Ylä-Outinen
7ab0a7aff2
Fix semaphores on Mac
...
POSIX semaphores are deprecated on Mac. This commit replaces POSIX
semaphores by Grand Central Dispatch semaphores when building on Mac.
2017-04-12 03:41:02 -07:00
Arttu Ylä-Outinen
26693e1402
Fix reliance on undefined behaviour in encmain
...
Pthread mutexes were used for synchronization in encmain by locking and
unlocking them from different threads. However, according to the POSIX
standard, unlocking a mutex from a different thread is undefined
behaviour. This commit replaces the mutexes by semaphores which can be
used from different threads.
2017-04-12 03:23:58 -07:00
Ari Lemmetti
47a9f0de04
Modify and use FILL_ARRAY macro to prevent warning on GCC 7
...
Following warning was given and is false positive
error: 'memset' used with length equal to number of elements without multiplication by element size [-Werror=memset-elt-size]
2017-04-11 14:04:25 +03:00
Eemeli Kallio
f7e01b8ba1
Fixed error on rd=3
2017-04-05 13:27:14 +03:00
Eemeli Kallio
9f605152ae
Changed intra to use best rough cost when using inter and rd=2
2017-04-05 13:01:32 +03:00
Ari Lemmetti
33ce101ab5
Revert "Use sizeof(uint32_t) to avoid warning in GCC7."
...
Did not fix the problem.
This reverts commit e3c3e74926.
2017-04-03 20:21:33 +03:00
Ari Lemmetti
e3c3e74926
Use sizeof(uint32_t) to avoid warning in GCC7.
...
error: 'memset' used with length equal to number of elements without multiplication by element size [-Werror=memset-elt-size]
2017-04-03 19:16:09 +03:00
Arttu Ylä-Outinen
df359b8f95
Fix indentation in encode_coding_tree.c
...
Fixes indentation of a for loop that was causing a misleading
indentation warning on GCC.
Fixes #163.
2017-03-08 22:56:28 +09:00
Pierre-Loup Cabarat
2b8ce5e47c
Add intra prediction modes encryption
2017-03-06 17:27:39 +01:00
Arttu Ylä-Outinen
aae141f2d3
Fix order of frames with --debug
...
When the decoding and presentation orders of pictures are different
(with GOP), the frames in YUV debug output would be in the decoding
order. This commit changes the kvazaar command line program to store the
reconstructed pictures in a buffer so that they can be output in the
presentation order.
Fixes #101.
2017-02-28 14:09:24 +09:00
Arttu Ylä-Outinen
094b39e7fc
Refactor inter MV/merge candidate selection
...
Adds struct merge_candidates_t for holding the spatial and temporal
merge candidates. Changes functions with separate parameters for each
candidate to use the struct instead.
2017-02-22 15:56:36 +09:00
Arttu Ylä-Outinen
3409748a8f
Refactor inter MVP candidate selection
...
Adds helper function add_mvp_candidate.
2017-02-22 15:56:27 +09:00
Arttu Ylä-Outinen
ef6503c728
Refactor inter merge candidate selection
...
Adds helper function add_merge_candidate and replaces macro
CHECK_DUPLICATE with function is_duplicate_candidate.
2017-02-22 02:50:52 +09:00
Arttu Ylä-Outinen
f12e09bc40
Refactor inter TMVP selection
...
Adds helper function add_temporal_candidate to inter.c.
2017-02-22 02:08:10 +09:00
Arttu Ylä-Outinen
4f88066740
Refactor MV and merge candidate selection
...
Replaces macros APPLY_MV_SCALING and CALCULATE_SCALE with helper
functions.
2017-02-22 01:14:16 +09:00
Arttu Ylä-Outinen
db08041d9a
Refactor inter TMVP selection
...
Merges three if-clauses to remove two levels of indentation.
2017-02-21 23:56:01 +09:00
Marko Viitanen
85e2a40da3
Clip scaled motion vectors, scale and td/tb values to appropriate limits
...
Fixes #158.
2017-02-20 15:40:20 +02:00
Ari Koivula
7369f25f64
Bump version to 1.1.0
2017-02-16 20:52:05 +02:00
Ari Lemmetti
b021d2244e
Reduce more unnecessary initializations.
2017-02-16 17:25:26 +02:00
Ari Lemmetti
acd12cba1e
Remove unnecessary memory initialization to zero
...
Values in interval [last_scanpos, 0] are overwritten in following for loop, except for the sig_coeff_inc value.
2017-02-16 16:48:48 +02:00
Ari Koivula
7ff33e1bf2
Fix default reference picture count
...
The default was 3, instead of the intended 1 of the medium preset.
2017-02-13 17:34:28 +02:00
Marko Viitanen
4251607c04
Fix a bug in TMVP reference POC list
2017-02-13 15:19:24 +02:00
Marko Viitanen
4270d451e6
Fixed some errors after rebase
2017-02-13 15:19:24 +02:00
Marko Viitanen
95effb00d0
Disable TMVP in frames with zero L0 references
2017-02-13 15:19:24 +02:00
Marko Viitanen
b4de1878be
Fixed TMVP scaling and candidate selection for B-frames
2017-02-13 15:19:23 +02:00
Marko Viitanen
23be633ad7
Added TMVP merge candidate scaling for L0
2017-02-13 15:19:23 +02:00
Marko Viitanen
e6aa1b9b9a
Renamed get_mv_cand_from_spatial() to get_mv_cand_from_candidates()
2017-02-13 15:19:23 +02:00
Marko Viitanen
1124bb5fd0
Cleaned up TMVP, mv candidate selection working, merge candidate selection not
2017-02-13 15:19:23 +02:00
Marko Viitanen
d65d2ec88d
WIP: add list of POCs used in the image when pushing to reference
2017-02-13 15:19:22 +02:00
Marko Viitanen
6a25cd3248
WIP: work on tmvp on inter
2017-02-13 15:19:22 +02:00
Marko Viitanen
e538a94eda
Enable TMVP with B-frames
2017-02-13 15:19:22 +02:00
Arttu Ylä-Outinen
363b8b49a2
Fix integer overflows with large resolutions
...
Limits video size so that the number of luma and chroma pixels can be
stored in an int. Fixes some integer overflows that resulted in
segmentation faults.
2017-02-12 11:40:13 +09:00
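A size check along the lines of 363b8b49a2 could be written as below. The function name pixel_count_fits_int is hypothetical, and the sketch assumes 4:2:0 chroma subsampling; the actual limit used in kvazaar may be computed differently.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true if the luma plane plus two chroma planes of a 4:2:0
 * picture contain few enough samples to be counted in an int. The
 * arithmetic is done in 64 bits so the check itself cannot overflow. */
static bool pixel_count_fits_int(int32_t width, int32_t height)
{
  assert(width > 0 && height > 0);
  int64_t luma = (int64_t)width * height;
  int64_t chroma_width = ((int64_t)width + 1) / 2;
  int64_t chroma_height = ((int64_t)height + 1) / 2;
  int64_t total = luma + 2 * chroma_width * chroma_height;
  return total <= INT_MAX;
}
```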
Arttu Ylä-Outinen
a5a925fc28
Replace timed waits by normal waits in threadqueue
...
Replaces calls to pthread_cond_timedwait with pthread_cond_wait in
threadqueue.c. Simplifies code, as there should be no need for the
timeout.
2017-02-11 15:42:03 +09:00
Arttu Ylä-Outinen
fd057498fc
Simplify kvz_config_alloc
2017-02-11 15:42:03 +09:00
Arttu Ylä-Outinen
7f7844caad
Fix finalizing uninitialized encoder states
...
Finalization functions for frame and tile encoder states accessed the
frame and tile fields of the encoder state even though they might be
NULL. This is the case when the initialization of an encoder state
fails. Fixed by adding NULL checks.
2017-02-09 14:05:28 +09:00
Arttu Ylä-Outinen
51786eda67
Drop redundant fields in encoder_control_t
...
Some of the fields in encoder_control_t were simply copies of the
corresponding fields in kvz_config. This commit drops the copied fields
in favor of using the fields in encoder_control_t.cfg directly.
2017-02-09 14:05:28 +09:00
Arttu Ylä-Outinen
6a178dee96
Fix leaking memory when --cqmfile given many times
...
Any previously allocated CQM file name was not freed when allocating
memory for the new file name.
2017-02-09 14:05:28 +09:00
Arttu Ylä-Outinen
63a567ad8a
Fix leaking memory when --roi given many times
...
Any previously allocated delta QP array was not freed when allocating
a new array.
2017-02-09 14:05:21 +09:00
Arttu Ylä-Outinen
bfd89136a4
Fix ROI delta QP array not getting freed
2017-02-09 13:23:55 +09:00
Arttu Ylä-Outinen
e78a8dfcf5
Copy the kvz_config passed to encoder_open
...
The kvz_config struct is created by the user but kvazaar keeps a pointer
to it. It is easy to break things by modifying the configuration outside
kvazaar. In addition, kvazaar modifies the struct even though it has
a const modifier.
This commit changes the field cfg in encoder_control_t to be a copy of
the kvz_config struct instead of a pointer, removing modifications to
the const struct and allowing users to do whatever they want with it
after opening the encoder.
2017-02-09 13:23:54 +09:00
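The difference between keeping a pointer and keeping a copy, as in e78a8dfcf5, can be illustrated with the sketch below. The types config_t and encoder_t and the function encoder_open are hypothetical stand-ins with only a few fields.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical subset of a configuration struct. */
typedef struct {
  int width;
  int height;
  int qp;
} config_t;

/* The encoder stores its own copy instead of a pointer to user memory. */
typedef struct {
  config_t cfg;
} encoder_t;

static encoder_t *encoder_open(const config_t *cfg)
{
  assert(cfg != NULL);
  encoder_t *enc = malloc(sizeof(*enc));
  if (enc == NULL) return NULL;
  enc->cfg = *cfg;  /* struct assignment copies every member */
  return enc;
}
```

Struct assignment is a shallow copy, so any member that points to heap data would still need its own duplicate.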
Ari Koivula
b8e3513a23
Fix crash with sub-LCU frame sizes and WPP
...
The end of slice was being calculated incorrectly, which led to no tile
being created inside the slice, which led to an assert triggering.
This fixes the wrong end of slice calculation, but also disallows
wavefront rows from being created, if there would be only one.
The wavefront initialization code assumes there are always more than
one row, so the inter-frame dependency doesn't get added properly.
Fixes #153.
2017-02-08 21:41:30 +02:00
Ari Koivula
d893474bab
Fix encoder getting stuck on OS-X
...
Main thread was stuck looping on pthread_cond_timedwait because
the abs time given on OS-X had already passed and the wait
returned immediately without releasing the mutex to allow worker
threads to proceed.
The fix was to use gettimeofday, which returns real time instead
of monotonic, which is what pthread_cond_timedwait wants.
2017-02-02 17:27:46 +02:00
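Building an absolute deadline from wall-clock time, as d893474bab describes, might look like the sketch below. The helper deadline_after_ms is a hypothetical name. The result is meant to be passed to pthread_cond_timedwait, which by default interprets the deadline against the real-time clock.

```c
#include <assert.h>
#include <sys/time.h>
#include <time.h>

/* Fills `deadline` with the wall-clock time `ms` milliseconds from now.
 * The nanosecond part is normalized to stay below one second. */
static void deadline_after_ms(struct timespec *deadline, long ms)
{
  assert(deadline != NULL && ms >= 0);
  struct timeval now;
  gettimeofday(&now, NULL);
  long long nsec = (long long)now.tv_usec * 1000
                 + (long long)(ms % 1000) * 1000000;
  deadline->tv_sec = now.tv_sec + ms / 1000 + (time_t)(nsec / 1000000000);
  deadline->tv_nsec = (long)(nsec % 1000000000);
}
```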
Ari Koivula
4ceda1908b
Fix OS-X compiler warning
...
rdo.c:475:25: warning: absolute value function 'abs' given an argument of type 'int64_t' (aka 'long long') but has parameter
of type 'int' which may cause truncation of value [-Wabsolute-value]
current.cost = -abs(quant_cost_in_bits) + (bits << PRECISION_INC);
^
rdo.c:475:25: note: use function 'llabs' instead
current.cost = -abs(quant_cost_in_bits) + (bits << PRECISION_INC);
2017-02-01 18:09:17 +02:00
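The warning quoted in 4ceda1908b comes from passing a 64-bit value to abs, which takes an int. A minimal sketch of the suggested fix follows; the function name negated_magnitude is hypothetical and only mirrors the shape of the expression in the warning.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Returns -|cost_in_bits| using llabs, which takes a long long.
 * abs() would convert the argument to int and could truncate it. */
static int64_t negated_magnitude(int64_t cost_in_bits)
{
  assert(cost_in_bits != INT64_MIN);  /* magnitude would not fit */
  return -(int64_t)llabs((long long)cost_in_bits);
}
```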
Ari Koivula
c7d536bbcd
Fix OS-X compiler warning
...
cfg.c:1024:74: warning: format specifies type 'size_t' (aka 'unsigned
long') but the argument has type 'unsigned long long'
[-Wformat]
fprintf(stderr, "Too large ROI size: %llu (maximum %zu).\n", size, SIZE_MAX);
2017-02-01 18:09:04 +02:00
Ari Koivula
4467506ef1
Add missing kvz_ prefix
2017-01-31 18:38:02 +02:00
Ari Koivula
ed3bd898fd
Remove Exp-Golomb lookup table
...
This table takes 256kB and isn't used very much. Au revoir!
2017-01-31 18:31:05 +02:00
Ari Koivula
5513744d24
Merge branch 'slices'
2017-01-31 16:14:30 +02:00
Ari Koivula
52904d3e9f
Add --slices=tiles and --slices=wpp
...
This encapsulates tiles or WPP rows into their own slices, making
it possible to send them as soon as they are done, instead of waiting
for the other substreams to finish and coding the substream offsets
in the slice header.
2017-01-31 15:44:23 +02:00
Ari Koivula
0d4d0e869c
Add support for independent slices
...
Not used yet, but they work.
2017-01-31 15:11:50 +02:00
Ari Koivula
46ae382498
Fix bugs with slice header
...
These fixes allow more than one slice to be used to code a picture.
- Use correct number of bits to code the slice segment address.
- Don't code offset_len_minus1 for slices without substreams.
2017-01-31 14:01:59 +02:00
Ari Koivula
f1fc0de2bf
Write slice headers to the parent stream
...
Appending to the child stream doesn't work if the child is a leaf
slice state.
Simplifies flow by removing distinction between tile and slice. Now
that slice headers are written in the parent stream, there is zero
difference between tiles and slices from bitstream point of view.
2017-01-31 13:55:05 +02:00
Ari Koivula
04cd875b2c
Move substream finalization to LCU coding job
...
Having some of the termination bits in the LCU coding and some in the
substream finalization was needlessly confusing. Doing substream
finalization directly after LCU coding makes it easy to verify that the
finalization is done correctly.
Removes one job per WPP row from the job queue.
Removes kvz_cabac_flush, because I don't like bits being put into the
bitstream implicitly. Better to have it all in the open.
2017-01-31 13:01:57 +02:00
Ari Koivula
ead490b7b7
Write a new slice NAL for every slice
2017-01-31 12:36:18 +02:00
Ari Koivula
cd496bf50b
Move first_nal_in_au to encoder_state->frame
...
Needed for writing NALs from encoder_state_write_bitstream_children
2017-01-31 12:28:28 +02:00
Arttu Ylä-Outinen
1e6463c08b
Fix inter bipred search
...
When the number of merge candidates was five, biprediction search would
read past the bounds of the priority list arrays. Fixed to limit the
search to the first four candidates.
2017-01-31 18:23:12 +09:00
Ari Lemmetti
2c069a3e5f
Prevent unnecessary cu search
...
Prevent further analysis as soon as it is known that splitting can not improve cost
2017-01-30 16:21:41 +02:00
Arttu Ylä-Outinen
9b889c3fab
Fix reading ROI files
...
- Checks the return value of fopen when opening the ROI file. Fixes
a segfault when the file cannot be opened.
- Check that the width and height are positive. Fixes reading past the
end of the delta QP array in kvz_set_lcu_lambda_and_qp.
- Check for overflow in width * height. Fixes an overflow resulting in
a segfault.
- Properly check that fscanf succeeds. Fixes silently accepting ROI
files that are too short.
- Properly close the FILE pointer.
2017-01-29 18:57:27 +09:00
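The checks listed in 9b889c3fab can be sketched as one defensive reader. The file layout assumed here (width and height followed by width * height integer delta QPs, all as text) and the names roi_read_stream and roi_read_file are assumptions for illustration, not the actual kvazaar code.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Reads "width height" and then width * height delta QP values from an
 * open stream. Returns a malloc'd array, or NULL on any error. */
static int8_t *roi_read_stream(FILE *f, int *width, int *height)
{
  assert(f != NULL && width != NULL && height != NULL);
  int w = 0, h = 0;
  if (fscanf(f, "%d %d", &w, &h) != 2) return NULL;
  if (w <= 0 || h <= 0) return NULL;   /* dimensions must be positive */
  if (w > INT_MAX / h) return NULL;    /* w * h would overflow an int */
  size_t count = (size_t)w * (size_t)h;
  int8_t *dqp = malloc(count);
  if (dqp == NULL) return NULL;
  for (size_t i = 0; i < count; i++) {
    int value;
    /* A failed fscanf means the file is too short or malformed. */
    if (fscanf(f, "%d", &value) != 1 ||
        value < INT8_MIN || value > INT8_MAX) {
      free(dqp);
      return NULL;
    }
    dqp[i] = (int8_t)value;
  }
  *width = w;
  *height = h;
  return dqp;
}

/* Opens the file, checks that fopen succeeded, and always closes it. */
static int8_t *roi_read_file(const char *path, int *width, int *height)
{
  FILE *f = fopen(path, "r");
  if (f == NULL) return NULL;
  int8_t *dqp = roi_read_stream(f, width, height);
  fclose(f);
  return dqp;
}
```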
Arttu Ylä-Outinen
46c9a483c3
Fix inter search for small SMP and AMP blocks
...
The function search_pu_inter_ref incorrectly rounded the coordinates of
the block down to a multiple of 8 pixels. Small SMP and AMP blocks may
start at coordinates that are not multiples of 8. Fixed by removing the
rounding.
Fixes a failing assert when --mv-constraint is used with --smp or --amp.
2017-01-29 13:34:50 +09:00
Arttu Ylä-Outinen
fb10b56b82
Fix checking if a low delay GOP structure is used
...
Stops assuming that having cfg->gop_lowdelay set means that GOP
structure is used since it is possible that cfg->gop_lowdelay is true
but cfg->gop_len is zero. Adds checks for cfg->gop_len where needed.
Fixes a possible division by zero in kvz_encoder_feed_frame.
2017-01-28 21:56:00 +09:00
Arttu Ylä-Outinen
4f56b04239
Drop an unnecessary conditional
...
Drop a conditional for depth > MAX_DEPTH in search_cu. The depth cannot
be greater than MAX_DEPTH (== 3) since an earlier if-clause checks that
it is less than MAX_PU_DEPTH (== 4).
2017-01-28 21:35:27 +09:00
Ari Koivula
937a764987
Fix bug in --mv-constraint
...
Subpixel motion estimation returns a 0-vector when no subpixel vector is
within the constraint. Fix is to not call subpixel motion estimation
when the integer vector is not within the constraint.
2017-01-26 09:55:57 +02:00
Ari Koivula
4a0121ac42
Add --roi parameter
...
Adds region of interest coding capability.
Works by reading a file of delta QP values which will then be applied
to each frame at LCU level.
2017-01-26 09:14:14 +02:00
Ari Koivula
6f61836989
Refactor kvz_rdoq_sign_hiding
...
Rename and reorder everything to make more sense.
- Moved input tables into their own struct and renamed them to what
they actually represent.
- Renamed pretty much every variable to conform to our style and
to make sense.
- Removed the lastCG stuff, as the function already gets passed the
last coeff anyway. (it was named width, what the hell?)
2017-01-19 23:58:17 +02:00
Ari Koivula
a85390d0ac
Clean up code using the fixed point frac bit tables
...
This is to prepare for changing the code using the floating point table
to use the fixed point table instead.
This also allows reducing the size of the fractional part, which was
useful for finding every place where the fixed point representation
is relied upon.
2017-01-19 20:20:51 +02:00
Ari Koivula
24a69c7467
Refactor luma deblocking
...
Changes luma deblocking to use gather and scatter instead of reading
from and writing to here and there in memory. Should make them
faster and easier to vectorize, or at least cleaner.
Splits strong and weak luma deblocking to two functions, as they have
almost nothing in common.
2017-01-17 22:13:39 +02:00
Ari Koivula
4cb2fca924
Refactor deblock decision
2017-01-17 19:34:17 +02:00
Arttu Ylä-Outinen
05794c3548
Add missing static to function lambda_to_qp
2017-01-11 15:53:55 +09:00
Arttu Ylä-Outinen
ee518e8ac4
Take header bits into account in rate control
2017-01-11 15:53:55 +09:00
Arttu Ylä-Outinen
c219d3cd94
Fix deblock when CU QP delta is enabled
...
Fixes deblock functions so that they use the correct QP for the filtered
edge. Adds field qp to cu_info_t.
2017-01-11 15:53:22 +09:00
Arttu Ylä-Outinen
82a98180e4
Clip LCU lambda to reduce quality fluctuation
...
Limits lambdas for each LCU based on the computed lambda from the
previous frame and the frame-level lambda.
2017-01-09 01:24:23 +09:00
Arttu Ylä-Outinen
93172fd251
Use separate alpha, beta and lambda for each LCU
...
Changes rate control to use the alpha and beta values stored in
lcu_stats_t instead of the frame-level values when selecting lambda and
QP for an LCU.
2017-01-09 01:24:23 +09:00
Arttu Ylä-Outinen
3af4e9cc8a
Allocate bits separately for each LCU
...
Bits are allocated based on the costs of the LCUs in the previous
completely coded frame.
Breaks deblock when rate control is used.
2017-01-09 01:24:23 +09:00
Arttu Ylä-Outinen
ff5e5ec6d4
Record info about coded LCUs
...
Adds field lcu_stats to encoder_state_config_frame_t. The following data
is recorded for each LCU:
- number of bits
- squared cost
- used lambda value
- alpha parameter used for rate control
- beta parameter used for rate control
2017-01-09 01:24:23 +09:00
Arttu Ylä-Outinen
2a4243acbe
Refactor rate control
...
Moves all code related to setting QP and lambda values to rate_control
module.
2017-01-09 01:24:23 +09:00
Arttu Ylä-Outinen
71633889ce
Enable CU QP delta when using rate control
...
When rate control is enabled, enable cu_qp_delta_enabled_flag in PPS
with diff_cu_qp_delta_depth set to 0. Also adds code for writing the QP
deltas and a new cabac context.
2017-01-09 01:24:23 +09:00
Arttu Ylä-Outinen
640ff94ecd
Use separate lambda and QP for each LCU
...
Adds fields lambda, lambda_sqrt and qp to encoder_state_t. Drops field
cur_lambda_cost_sqrt from encoder_state_config_frame_t and renames
cur_lambda_cost to lambda.
2017-01-09 01:24:23 +09:00
Arttu Ylä-Outinen
435c387357
Refactor rate control
...
- Defines MIN_LAMBDA and MAX_LAMBDA constants.
- Moves resetting state->frame->cur_gop_bits_coded to rate_control.c.
- Changes gop_allocate_bits to return the number of bits allocated like
pic_allocate_bits does.
2017-01-09 01:24:23 +09:00
Arttu Ylä-Outinen
6c4f2d196a
Move fields from encoder_state_t to frame
...
Moves fields prepared and frame_done from encoder_state_t to
encoder_state_config_frame_t.
2017-01-09 01:24:23 +09:00
Arttu Ylä-Outinen
97863cdaa2
Fail encoder init when CQM file cannot be opened
2017-01-08 19:17:43 +09:00
Arttu Ylä-Outinen
db5e750c7f
Fix --threads=auto
...
When --threads=auto was given on the command line, cfg->threads was
actually set to zero, disabling threads altogether. Fixed to set
cfg->threads to -1, so that the number of threads is chosen
automatically.
2017-01-08 17:58:22 +09:00
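The distinction described in db5e750c7f (0 disables threads, -1 requests automatic selection) can be sketched roughly as follows. The function name parse_threads is hypothetical and this is not the actual cfg.c code; it only illustrates treating "auto" as its own case before any numeric conversion.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical parser, not the actual Kvazaar code. Returns -1 for
 * "auto" (meaning: choose the thread count automatically) and the
 * parsed integer otherwise, so "auto" cannot collapse to 0. */
static int parse_threads(const char *value)
{
  assert(value != NULL);
  if (strcmp(value, "auto") == 0) {
    return -1;
  }
  return atoi(value);
}
```

Note that atoi("auto") evaluates to 0, which is one way such a bug can arise; whether the original code used atoi is an assumption.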
Ari Koivula
a9e45efcfc
Add a fast lane for byte-aligned bitstream writes
...
The CABAC engine only writes to the bitstream when it has a full byte.
These writes are also always byte-aligned, so there is no need to even
check for stream alignment.
Speedup was around 3% with ultrafast and low QP.
2016-12-23 17:01:44 +02:00
Jaakko Laitinen
deb63f735f
Fix gop disabling
2016-12-20 14:25:13 +02:00
Ari Lemmetti
70a52f0e48
10-bit: add missing bit depth adjustment to ssd
2016-11-17 19:28:04 +02:00
Ari Koivula
fa078102f1
Fix 32bit compilation
...
Got a warning about implicit cast from uint64_t to void*.
2016-11-17 17:53:57 +02:00
Ari Koivula
5ceec06bd3
Merge pull request #148 from Venti-/crypto
...
Crypto
2016-11-16 21:33:55 +02:00
Ari Lemmetti
c31207ea7d
Optimize intra reference building
...
- Add function with reduced logic for the most common case
2016-11-16 18:28:42 +02:00
Ari Koivula
24f2a23ef8
Remove unnecessary crypto state
...
The frame does not need its own crypto state, since it always has at
least one sub tile.
2016-11-16 13:58:41 +02:00
Ari Koivula
8951e34fd2
Change crypto.h stubs to print instead of assert
2016-11-16 13:58:41 +02:00
Wassim Hamidouche
ea82c38906
correct memory allocation
2016-11-16 12:35:28 +02:00
Wassim Hamidouche
da3e2d1d07
resolve parallel encryption
2016-11-16 12:35:28 +02:00
Ari Koivula
b8a618e666
Fix problems with >8 bit input
...
Enforce bit depth promised by --input-bitdepth to avoid crashes when
larger values are provided.
Do endianness byte swap for all bytes when the buffer gets extended
to multiple of 8 pixels, and not just the number of input pixels.
Don't swap bytes on a little-endian system.
2016-11-13 19:58:54 +02:00
Ari Koivula
2c005cda25
Fix bug with sub-pixel motion estimation in tiles
...
The width of the tile was being used to index the frame pixel buffer
instead of the width of the buffer.
2016-11-07 15:53:52 +02:00
Ari Koivula
78a28e0338
Reformat --help message
...
- Reduce indentation to 6 spaces
- Word wrap everything to under 80 characters
- Remove defaults from options covered by presets
- Add a dash in front of argument descriptions
- Add --(no-) to names of parameters that accept it and remove mention
of enabling or disabling
- Add executable and scripts as a dependency to make docs
2016-11-04 15:40:28 +02:00
Ari Koivula
d18de19d8a
Fix DTS and PTS not being passed on through lib API
...
Fixes "cur_dts is invalid" warning from FFmpeg.
2016-10-28 19:05:47 +03:00
Ari Koivula
0c41c2ebd6
Make CLI set PTS for each input picture
...
This value is not represented in the HEVC bitstream, which is why it
was not set previously. FFmpeg sets and needs it however, so make the
CLI set it as well to make sure we handle it correctly.
2016-10-28 19:03:03 +03:00
Ari Koivula
5bf745460d
Re-categorize options in the help message
...
- Move VUI stuff to the bottom
- Merge Parallel processing, WPP, Tiles and slices
- Add more categories for the other options
2016-10-27 03:26:15 +03:00
Ari Koivula
cb6672b452
Disable WPP when Tiles are enabled
...
Closes #142.
2016-10-27 02:07:10 +03:00
darealshinji
488d042e5f
Bump KVZ_VERSION
2016-10-25 12:32:13 +02:00
Ari Lemmetti
29153ed503
Remove unused variable
2016-10-21 17:28:42 +03:00
Ari Lemmetti
778e46dfd8
Add AVX2 version of SSD
2016-10-21 15:07:53 +03:00
Ari Lemmetti
6f5d7c9e06
Move SSD to strategies
2016-10-21 15:07:23 +03:00
Ari Lemmetti
89b941eab4
Fix typo
2016-10-21 15:07:02 +03:00
Alexis Ballier
1dcc993743
Include i386 & i486 for compiling intel asm.
...
x86_64-pc-linux-gnu-gcc -m32 that I use for building 32bits libraries on amd64 defines only __i386__.
2016-10-14 18:07:37 +02:00
Arttu Ylä-Outinen
5fb7afe8c4
Add --implicit-rdpcm command line parameter.
...
Makes it possible to use lossless coding without implicit residual DPCM.
2016-10-03 20:01:55 +09:00
Arttu Ylä-Outinen
5affc0f527
Use implicit RDPCM in lossless mode.
...
Sets implicit RDPCM flag in SPS when lossy coding is disabled and
applies DPCM to intra residual when prediction mode is horizontal or
vertical.
2016-10-03 19:31:38 +09:00
Ari Koivula
016dbe0894
Further refine presets
...
The rd-complexity of slow presets is better with a less aggressive GOP.
Adding the GOP as part of the preset improved BDRate enough, that it
didn't make sense anymore to have a veryslow target the best BDRate.
Instead, push that responsibility to placebo by making it a little bit
faster.
2016-09-29 17:35:12 +03:00
Ari Koivula
31c5ff0f16
Add cross-platform core number detection
...
Well, turns out pthread_num_processors_np isn't standard so we need to
do this crap. Threw in hyper threading detection as a bonus.
2016-09-29 00:03:21 +03:00
Ari Koivula
8c7351eac8
Fix lp-gop with depth 1
...
GOPs with depth 1 had the same structure as those with depth 2:
g4d3t1 = 3 2 3 1
g4d2t1 = 2 2 2 1
g4d1t1 = 2 2 2 1
It now results in the correct:
g4d1t1 = 1 1 1 1
2016-09-29 00:03:21 +03:00
Ari Koivula
a395aeaac9
Set default settings to those of --preset=medium
2016-09-29 00:03:21 +03:00
Ari Koivula
4388fe0d30
Set presets to ratedistortion-complexity optimized versions
2016-09-29 00:03:20 +03:00
Ari Koivula
facb1e16df
Use -p64 -q22 and --gop=lp-g4d3t1 by default
...
Coding inter without GOP of any kind really isn't a very sensible
default. Defaulting to B-GOP of some kind would be better,
but lp-gop is more robust for now.
2016-09-29 00:03:20 +03:00
Ari Koivula
d7391a9593
Improve default for number of parallel frames
2016-09-29 00:03:20 +03:00
Ari Koivula
19d423ab29
Use all available cores by default
2016-09-29 00:03:20 +03:00
Ari Koivula
3f138f087a
Allow non-gop-length --period for lp-gop
2016-09-29 00:03:19 +03:00
Ari Koivula
16790c9f15
Remove number of references from --gop=lp syntax
...
The number of references should be part of the presets, so gop should
be defined separately.
2016-09-29 00:03:19 +03:00
Ari Koivula
cbfa824d1a
Merge branch 'simd'
2016-09-27 20:49:45 +03:00
Ari Koivula
14a7bcba25
Use a faster function for clipped inter SAD
...
Use the vectorized general SSE41 inter SAD in AVX reg_sad for shapes
for which we don't have AVX versions yet.
Also improves speed of --smp and --amp a lot. Got a 1.25x speedup for:
--preset=ultrafast -q 27 --gop=lp-g4d3r3t1 --me-early-termination=on --rd=1 --pu-depth-inter=1-3 --smp --amp
* Suite speed_tests:
-PASS inter_sad: 0.898M x reg_sad(64x63):x86_asm_avx (1000 ticks, 1.000 sec)
+PASS inter_sad: 2.503M x reg_sad(64x63):x86_asm_avx (1000 ticks, 1.000 sec)
-PASS inter_sad: 115.054M x reg_sad(1x1):x86_asm_avx (1000 ticks, 1.000 sec)
+PASS inter_sad: 133.577M x reg_sad(1x1):x86_asm_avx (1000 ticks, 1.000 sec)
2016-09-27 20:48:30 +03:00
Arttu Ylä-Outinen
4313e56c2d
Add --no-rdoq-skip command line switch
2016-09-11 17:40:16 +09:00
Ari Koivula
a7a33b08ec
Remove --slice-addresses from usage message
...
And give a warning if it's used.
Slices will have to be implemented at some point, but they aren't yet
so let's not advertize them.
2016-09-10 21:06:00 +03:00
Eemeli Kallio
f41e428e5f
Removed kvz_skip_unnecessary_rdoq and reworked --rdoq-skip to skip 4x4 blocks when it is on.
2016-09-09 10:26:07 +03:00
Eemeli Kallio
ed9c0b0416
RDOQ reworked in rdo.c. rdoq_signhide now skips coeffs that are after best_last_idx.
2016-09-09 10:16:51 +03:00
Ari Koivula
02cd17b427
Add faster AVX inter SAD for 32x32 and 64x64
...
Add implementations for these functions that process the image line by
line instead of using the 16x16 function to process block by block.
The 32x32 is around 30% faster, and 64x64 is around 15% faster,
on Haswell.
PASS inter_sad: 28.744M x reg_sad(32x32):x86_asm_avx (1014 ticks, 1.014 sec)
PASS inter_sad: 7.882M x reg_sad(64x64):x86_asm_avx (1014 ticks, 1.014 sec)
to
PASS inter_sad: 37.828M x reg_sad(32x32):x86_asm_avx (1014 ticks, 1.014 sec)
PASS inter_sad: 9.081M x reg_sad(64x64):x86_asm_avx (1014 ticks, 1.014 sec)
2016-09-01 21:36:39 +03:00
Ari Koivula
d0512d25c6
Use fixed point in get_mvd_coding_cost
2016-08-30 21:37:12 +03:00
Ari Koivula
ec7507a935
Further optimize get_ep_ex_golomb_bitcost
...
Unrolled 16-bit log2 calculation.
2016-08-30 21:37:01 +03:00
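An unrolled 16-bit log2, as mentioned in ec7507a935, can be written without a loop by testing progressively narrower bit ranges. The following is a generic sketch of that technique and is not taken from get_ep_ex_golomb_bitcost; the name log2_u16 is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: floor(log2(v)) for a non-zero 16-bit value,
 * computed with four fixed tests instead of a loop. */
static unsigned log2_u16(uint16_t v)
{
  assert(v != 0);
  unsigned r = 0;
  if (v & 0xFF00) { v >>= 8; r |= 8; }  /* highest set bit is in 8..15 */
  if (v & 0x00F0) { v >>= 4; r |= 4; }  /* ... in the upper nibble */
  if (v & 0x000C) { v >>= 2; r |= 2; }  /* ... in the upper bit pair */
  if (v & 0x0002) { r |= 1; }           /* ... in the upper bit */
  return r;
}
```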
Ari Koivula
a4ba794587
Optimize get_ep_ex_golomb_bitcost
...
Arrange the decision tree such that there are only 3 branches on the
most common paths and the more likely branch is always fall-through.
A profile guided optimization pass would probably do something similar.
2016-08-30 05:24:16 +03:00
Ari Koivula
82cfab58f8
Improve fast mvd coding cost estimation
...
A lot of time is being taken up by this function on ultrafast, and it
doesn't do a very good job. This change aims to both simplify the
logic and make the estimate better.
The logic is simplified by using a lookup for the mvd bit cost
step function instead of mimicking the binarization process. The
estimation is made better by checking fractional cabac bit costs.
The new function returns the same results as
kvz_get_mvd_coding_cost_cabac, but is also faster than the old
function.
2016-08-30 04:55:09 +03:00
Ari Koivula
d31be8eb27
Make mvd_coding_cost functions take const cabac
2016-08-30 04:46:46 +03:00
Ari Koivula
64d631c174
Fix 8bit to 10bit input conversion regression
2016-08-25 22:09:40 +03:00
Ari Koivula
27789125d8
Fix input bit depth conversion
...
The input was being shifted in the wrong direction.
2016-08-25 22:05:25 +03:00
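For the common case of promoting samples to a higher bit depth (for example 8-bit input for 10-bit encoding), the shift direction matters: the value must be shifted left by the difference in bit depths. A minimal sketch, assuming a hypothetical helper promote_sample that is not part of the Kvazaar source:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: scales a sample from in_bits to out_bits
 * (out_bits >= in_bits) by shifting left. Shifting right instead
 * would throw away precision rather than add headroom. */
static uint16_t promote_sample(uint16_t sample, unsigned in_bits, unsigned out_bits)
{
  assert(in_bits <= out_bits && out_bits <= 16);
  return (uint16_t)(sample << (out_bits - in_bits));
}
```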
Ari Koivula
4ec039004b
Add monochrome encoding
...
Write bitstream without chroma when encoding with --input-format=P400.
This reduces bitstream size by 0-1 %, compared to coding monochrome in
420 format, and speeds up encoding slightly due to not processing
chroma.
2016-08-25 20:15:26 +03:00
Ari Koivula
c5b70cf812
Add chroma format support to yuv_t
2016-08-24 19:20:53 +03:00
Ari Koivula
032ed30ff4
Add chroma format support to kvz_picture
...
Add picture_alloc_csp to libkvz api to allocate pictures with chroma
format different from 420.
2016-08-24 19:20:53 +03:00
Ari Koivula
48ccc26839
Add --input-format and --input-bitdepth
...
Adds reading of 10 bit input for 10-bit encoding.
2016-08-24 19:20:53 +03:00
Ari Koivula
cc08073615
Refactor some indexing weirdness in init_lcu_t
...
I thought there might be a bug in this so I cleaned it up.
2016-08-24 19:12:48 +03:00
Ari Koivula
b6d674d66e
Refactor integer vector inter prediction
...
This code was pretty bad, so I cleaned it up a bit.
2016-08-24 19:09:26 +03:00
Ari Lemmetti
28c4174d0e
Fix incorrect shuffle parameters
...
_MM_SHUFFLE uses reverse order
2016-08-23 19:40:46 +03:00
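The "reverse order" noted in 28c4174d0e refers to the argument order of _MM_SHUFFLE: its first argument selects the source for the highest destination lane and its last argument selects the source for lane 0. The sketch below uses a locally defined stand-in macro (SHUFFLE_IMM, a hypothetical name) with the same bit layout, so that it builds on any target without <xmmintrin.h>.

```c
#include <assert.h>

/* Stand-in with the same bit layout as _MM_SHUFFLE(z, y, x, w);
 * defined locally so the sketch does not depend on x86 headers. */
#define SHUFFLE_IMM(z, y, x, w) (((z) << 6) | ((y) << 4) | ((x) << 2) | (w))

/* Returns which source lane a 4-lane shuffle places in
 * destination lane `dst` (0..3) for the given immediate. */
static int source_lane(int imm, int dst)
{
  assert(dst >= 0 && dst < 4);
  return (imm >> (2 * dst)) & 3;
}
```

With this layout SHUFFLE_IMM(3, 2, 1, 0) is the identity permutation, while SHUFFLE_IMM(0, 1, 2, 3) reverses the lanes.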
Ari Lemmetti
ce77bfa15b
Replace KVZ_PERMUTE with _MM_SHUFFLE
...
The same exact macro already exists
2016-08-22 19:08:46 +03:00
Jovasa
68eef660bd
Fixed search around mv_in in fullsearch not being saved.
2016-08-19 15:19:29 +03:00
Eemeli Kallio
99d8b9abeb
Changed skip_rdoq name to kvz_skip_unnecessary_rdoq. Changed the order it uses when it goes through CGs and tuned its sum calculation.
2016-08-18 14:02:56 +03:00
Eemeli Kallio
1fb4755f31
Added rdoq-skip to quant-generic.c
2016-08-18 12:17:54 +03:00
Eemeli Kallio
d20ac03ca2
Added --rdoq-skip option
2016-08-18 12:17:53 +03:00
Marko Viitanen
83cf801664
Fixed MV constraint condition in bipred
2016-08-18 08:53:17 +03:00
Marko Viitanen
5ae1c595f2
Fixed slice_temporal_mvp_enabled_flag and disabled TMVP with tiles
...
- slice_temporal_mvp_enabled_flag should be signalled also with non-IDR I-slices
2016-08-10 14:51:41 +03:00
Marko Viitanen
5326519182
TMVP cleanup and const qualifier fixes
2016-08-10 14:10:43 +03:00
Marko Viitanen
f40907260d
Added config parameter for TMVP and cmdline option --no-tmvp
...
- Enabled by default
- Cannot be used with GOP at the moment
2016-08-10 14:09:29 +03:00
Marko Viitanen
fd52dac1f7
Fixed TMVP scaling
2016-08-10 14:09:28 +03:00
Marko Viitanen
c664bc8cf7
Added flag collocated_ref_idx to the slice header
2016-08-10 14:09:28 +03:00
Marko Viitanen
c5f2611a38
Fixes for TMVP to work with the new CU array
2016-08-10 14:09:28 +03:00
Marko Viitanen
d85af5755b
TMVP working when only 1 ref frame
2016-08-10 14:09:28 +03:00
Marko Viitanen
39f0165efe
Fix a bug in TMVP, the reference cu_array was being overwritten
2016-08-10 14:09:27 +03:00
Marko Viitanen
adab8c327e
Clean TMVP code
2016-08-10 14:09:20 +03:00
Marko Viitanen
5fa8226ac9
Temporal merge candidate selection
2016-08-10 14:09:20 +03:00
Marko Viitanen
f83042f4a1
Temporal MV candidate selection
2016-08-10 14:09:19 +03:00
Marko Viitanen
f8671581e3
Implemented function kvz_inter_get_temporal_merge_candidates()
2016-08-10 14:09:19 +03:00
Marko Viitanen
2956bdb379
Added flag slice_temporal_mvp_enabled_flag
2016-08-10 14:09:19 +03:00
Arttu Ylä-Outinen
2a946bd88e
Rename encoder_state_t.global to frame
...
"Frame" is more accurate than "global" since when OWF is used, encoder
states for each frame have their own struct.
2016-08-10 13:22:36 +09:00
Arttu Ylä-Outinen
5fbb0a8c27
Fix includes
2016-08-10 13:05:40 +09:00
Arttu Ylä-Outinen
aabf6ca3ee
Extract encoding code from encoderstate.c
...
Moves functions kvz_encode_coding_tree and kvz_encode_coeff_nxn from
encoderstate.c to encode_coding_tree.c.
2016-08-09 22:16:50 +09:00
Arttu Ylä-Outinen
803f29be8f
Remove reconstructed picture allocation in lossless.
...
Changes encoder_set_source_picture to set the reconstructed picture to
a copy of the source picture instead of allocating a new picture when
lossless coding is used.
2016-08-03 14:25:08 +09:00
Arttu Ylä-Outinen
aaec473a19
Refactor encoder state initialization.
...
- Moves allocation of the reconstructed picture after the source picture
is set.
- Extracts main state initialization to a separate function from
encoder_state_new_frame.
- Changes kvz_encoder_feed_frame to return the frame.
- Renames some functions to better match their purpose.
2016-08-03 14:25:08 +09:00
Arttu Ylä-Outinen
cd7024b3a5
Skip computing SSD when using lossless coding.
...
The SSD is always zero since it is lossless.
2016-08-03 14:25:08 +09:00
Arttu Ylä-Outinen
fbbe5d1844
Use kvz_pixels_calc_ssd for SSD in search.c.
...
Replaces loops for computing SSDs by calling kvz_pixels_calc_ssd in
search.c.
2016-08-03 14:25:08 +09:00
Arttu Ylä-Outinen
22cc97ffb1
Fix missing field initializers.
2016-08-03 14:25:08 +09:00
Arttu Ylä-Outinen
06b82bf888
Disable filters, trskip and signhide in lossless.
...
When lossless coding is used, deblock and SAO are skipped, transform
skip flag is not written and sign hiding is not used.
2016-08-03 14:25:08 +09:00
Arttu Ylä-Outinen
97451ec401
Align assignments in encoder.c.
2016-08-03 14:25:08 +09:00
Arttu Ylä-Outinen
1dc94663c3
Bypass transform and quantization with --lossless.
...
When --lossless is given, set cu_transquant_bypass_flag for every CU and
bypass transform and quantization by directly copying reference pixels
to reconstruction and the residual to coefficients.
2016-08-03 14:25:08 +09:00
Arttu Ylä-Outinen
2113b0182d
Enable PPS-level tq bypass flag with --lossless.
...
Sets transquant_bypass_enable_flag to true in PPS when --lossless is
given.
2016-08-03 14:25:08 +09:00
Arttu Ylä-Outinen
a5897bbece
Make cabac context initialization tables static.
2016-08-03 14:25:08 +09:00
Arttu Ylä-Outinen
23e7d9bb37
Add --lossless command line parameter.
2016-08-03 14:25:08 +09:00
Arttu Ylä-Outinen
5372ea432f
Update README and manpage.
2016-08-03 14:25:08 +09:00
Ari Lemmetti
6bcba004ff
Comment out to fix unused code error on clang.
2016-07-14 14:12:16 +03:00
Ari Lemmetti
c0979ebdcb
Implement AVX2 luma sampling
2016-07-14 12:53:02 +03:00
Ari Lemmetti
6244560426
Add avx2 strategy for kvz_filter_frac_blocks_luma.
2016-07-14 12:53:02 +03:00
Ari Lemmetti
9c4e9e049b
Load only what is needed. Eliminate latency from hadds.
2016-07-14 12:53:01 +03:00
Ari Lemmetti
7f71cb423a
Check 4 fractional pixel positions simultaneously
2016-07-14 12:52:24 +03:00
Ari Lemmetti
ad445ab8a1
Transition to kvz_filter_frac_blocks_luma
2016-07-14 12:51:02 +03:00
Ari Lemmetti
fccfbd2f28
Add strategy for kvz_filter_frac_blocks_luma
2016-07-14 12:51:02 +03:00
Ari Lemmetti
e9c3074d32
Add buffers and definitions for upcoming filtering
...
Samples are to be filtered in separate blocks instead of
making one big picture with interpolated pixels
2016-07-14 12:51:02 +03:00
Ari Lemmetti
7afe7e963b
Use fme_level to control the search accuracy.
2016-07-14 12:51:01 +03:00
Ari Lemmetti
5fa323bf25
Skip searching best hpel twice. Make hpel and qpel loops similar.
2016-07-14 12:51:01 +03:00
Ari Lemmetti
bc98a9affa
Change the search order to suit lighter fme search
2016-07-14 12:51:01 +03:00
Ari Lemmetti
2b0c8db349
Add quad satd for avx2
2016-07-14 12:50:24 +03:00
Ari Lemmetti
0ff69fd6f8
Add any size multi satd
2016-07-14 12:48:37 +03:00
Ari Lemmetti
d17b9e7d6e
Allow subme parameters 0-4
...
Update usage, presets, defaults, lib version
2016-07-12 19:49:38 +03:00
Arttu Ylä-Outinen
62ad57d0bf
Fix kvz_image_list_add for zero-sized lists.
...
When a list does not have space for the new element, its size is
doubled. If the size of the list is zero, it would not be resized. Fixed
to always resize the list so that the new element can be added.
2016-06-22 13:35:16 +09:00
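The failure mode in 62ad57d0bf is that doubling a capacity of zero still yields zero. A minimal sketch of a growth rule that avoids this follows; grown_capacity is a hypothetical name and this is not the actual kvz_image_list_add code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical growth rule: double the capacity, but never return
 * less than used + 1, so a list of capacity 0 still makes room for
 * one new element. Overflow handling is omitted for brevity. */
static size_t grown_capacity(size_t capacity, size_t used)
{
  assert(used <= capacity);
  size_t doubled = capacity * 2;
  return doubled > used ? doubled : used + 1;
}
```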
Arttu Ylä-Outinen
433e528af7
Drop unused variable in search_pu_inter.
...
Removes unused variable max_px_below_lcu.
2016-06-22 13:35:16 +09:00
Arttu Ylä-Outinen
7836ff6ec9
Drop unused functions.
...
Removes functions kvz_coefficients_calc_abs, kvz_intra_rdo_cost_compare
and kvz_rdo_cost_intra which are no longer used.
2016-06-22 13:35:15 +09:00
Arttu Ylä-Outinen
e4b5840f56
Add parentheses around macro arguments in cabac.h.
2016-06-22 13:35:15 +09:00
Arttu Ylä-Outinen
a387b74e51
Fix resolution auto-detection.
...
Only try to guess the resolution from filename when neither width nor
height is given.
2016-06-22 13:35:15 +09:00
Arttu Ylä-Outinen
097bf8f3c0
Add a typedef for mvd coding cost functions.
2016-06-20 13:56:10 +09:00
Arttu Ylä-Outinen
d3c0e49286
Update comments.
2016-06-16 20:25:08 +09:00
Arttu Ylä-Outinen
ae832cda8c
Pack cbf flags in cu_info_t to two bytes.
...
Reduces size of cu_info_t.
2016-06-16 20:24:19 +09:00
Arttu Ylä-Outinen
cad2d496b8
Enable 4x8 and 4x16 partition modes
...
Enables search for 2NxN and Nx2N partition modes for 8x8 CUs and 2NxnU,
2NxnD, nLx2N and nRx2N partition modes for 16x16 CUs.
Changes the loop for copying reconstructed luma pixels in
kvz_inter_recon_lcu to use 4 byte chunks instead of 8 byte chunks since
it is now possible to have 4 pixel wide blocks.
2016-06-16 20:23:16 +09:00
Arttu Ylä-Outinen
90df7350f0
Make deblocking work with 4 pixel wide blocks.
2016-06-16 20:21:50 +09:00
Arttu Ylä-Outinen
bf26661782
Add support for 4x4 blocks to SATD_ANY_SIZE.
...
Makes functions satd_any_size_generic and satd_any_size_8bit_avx2 work
on blocks whose width and/or height are not multiples of 8.
2016-06-16 18:53:17 +09:00
Arttu Ylä-Outinen
2ae260e422
Change width of cells in lcu_t to 4 pixels.
...
Intra mode info for NxN partition units is now stored in the
corresponding 4x4 cell in lcu_t.cu array.
2016-06-16 18:53:17 +09:00
Arttu Ylä-Outinen
360f5bb8da
Always use pixel coordinates for indexing lcu_t.
...
Removes macro LCU_GET_CU and uses LCU_GET_CU_AT_PX in its place.
2016-06-16 18:53:17 +09:00
Arttu Ylä-Outinen
46e8122d27
Add functions for indexing cu_array_t structures.
...
Replaces macro CU_ARRAY_AT with functions kvz_cu_array_at and
kvz_cu_array_at_const.
2016-06-16 18:52:19 +09:00
Arttu Ylä-Outinen
c5afabdd3b
Change width of cells in cu_array_t to 4 pixels.
2016-06-15 12:25:11 +09:00
Arttu Ylä-Outinen
57a3d9b4b9
Add a function for copying CU data from LCUs.
...
Adds function kvz_cu_array_copy_from_lcu which copies CU info data from an
lcu_t structure to a cu_array_t structure.
2016-06-15 12:25:11 +09:00
Arttu Ylä-Outinen
2c85a00a55
Change kvz_cu_array_alloc to use pixel dimensions.
...
Changes function kvz_cu_array_alloc to take width and height parameters
in pixels instead of SCUs.
2016-06-15 12:25:11 +09:00
Arttu Ylä-Outinen
b276a347c0
Add a macro for indexing cu_array_t.
...
Adds macro CU_ARRAY_AT(cu_array, x, y) to cu.h.
2016-06-15 12:25:11 +09:00
Arttu Ylä-Outinen
8ac1f1986e
Move CU array copy to a separate function.
...
Moves code for copying parts of cu_array_t to a new function
kvz_cu_array_copy in cu module.
2016-06-15 12:25:11 +09:00
Arttu Ylä-Outinen
41e75daed7
Fix overlapping memcpy in kvz_search_cu_smp.
...
The destination and source pointers might be equal. Fixed by replacing
the memcpy call with a simple assignment.
2016-06-15 12:25:11 +09:00
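The reasoning behind 41e75daed7 is that memcpy has undefined behavior when source and destination overlap, including when they are the same object, whereas plain struct assignment is well defined for exact overlap. A minimal sketch, using a hypothetical cu_sketch_t record rather than the real cu_info_t:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a small CU record. */
typedef struct { int type; int depth; } cu_sketch_t;

/* Copies *src to *dst. Struct assignment stays well defined when
 * dst == src, whereas memcpy requires non-overlapping objects. */
static void copy_cu(cu_sketch_t *dst, const cu_sketch_t *src)
{
  assert(dst != NULL && src != NULL);
  *dst = *src;
}
```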
Ari Lemmetti
29af8bcd21
Remove const to match function signature
2016-06-14 18:19:40 +03:00
Eemeli Kallio
5af6ab320c
Merge branch 'me_early_terminate'
...
Conflicts:
configure.ac
src/cfg.c
src/cli.c
src/kvazaar.h
src/search_inter.c
2016-06-14 15:03:35 +03:00
Eemeli Kallio
43c7778b82
Updated version number.
2016-06-14 10:53:04 +03:00
Arttu Ylä-Outinen
23fdeeaf10
Move mv_cand and mv_dir into a bitfield in cu_info_t.
...
Reduces size of cu_info_t.
2016-06-14 12:21:57 +09:00
Arttu Ylä-Outinen
35aadf6776
Reduce size of type in cu_info_t to two bits.
...
Reduces size of cu_info_t.
2016-06-14 12:21:57 +09:00
Arttu Ylä-Outinen
1cbe844f79
Move inter and intra into an union in cu_info_t.
...
Reduces size of cu_info_t.
2016-06-14 12:21:57 +09:00
Arttu Ylä-Outinen
b6d793ef33
Drop field inter.mvd from cu_info_t
...
Instead of storing the mv differences in cu_info_t, they are computed
from the mv candidates and the motion vector. Reduces the size of
cu_info_t.
2016-06-14 12:21:57 +09:00
Arttu Ylä-Outinen
98aa906f30
Drop field coded from cu_info_t
...
It can be inferred from the position and size of the CU.
2016-06-14 12:21:57 +09:00
Arttu Ylä-Outinen
ebb10763f1
Drop field inter.mv_ref_coded from cu_info_t.
...
Storing inter.mv_ref_coded in cu_info_t is unnecessary since it can be
computed from refmap and inter.mv_ref.
2016-06-14 12:21:57 +09:00
Arttu Ylä-Outinen
4be5c8f349
Move flags into a bitfield in cu_info_t.
...
Reduces the size of cu_info_t.
2016-06-14 12:21:57 +09:00
Arttu Ylä-Outinen
30e9ee988d
Move bitcost field out of cu_info_t.inter.
...
The bitcost is only needed for the currently searched CU.
Fixes bitcost of the second PU being ignored when using SMP or AMP.
2016-06-14 12:21:57 +09:00
Arttu Ylä-Outinen
16d13ed046
Move cost field out of cu_info_t.inter
...
The cost is only needed for the currently searched CU.
2016-06-14 12:20:05 +09:00
Arttu Ylä-Outinen
c5c2c182d9
Drop unused field mode from cu_info_t.inter.
2016-06-14 12:18:17 +09:00
Eemeli Kallio
e4f1a74512
Added early termination option for motion estimation.
...
Conflicts:
src/search_inter.c
2016-06-13 16:20:35 +03:00
Wassim Hamidouche
5bc7287c67
add fix for crypto
2016-06-09 10:49:31 +03:00
Wassim Hamidouche
35634b5596
correct MV sign encryption
2016-06-09 10:49:31 +03:00
Wassim Hamidouche
15abdc6e81
correct sign encryption
2016-06-09 10:49:31 +03:00
Wassim Hamidouche
73c3203a26
encry coef transfs
2016-06-09 10:49:31 +03:00
Wassim Hamidouche
7ad5f8bbe5
encry coef transf sign
2016-06-09 10:49:31 +03:00
Wassim Hamidouche
02b0712973
fix g++ compilation
2016-06-09 10:48:44 +03:00
Ari Koivula
a2170f0763
Compile the cryptopp wrapper only when used
...
This should allow us to avoid an unnecessary dependency on a C++
compiler.
Conflicts:
configure.ac
2016-06-07 17:11:12 +03:00
Ari Koivula
182038c743
Don't allow enabling encryption when it's not compiled in
2016-06-07 16:58:09 +03:00
Ari Koivula
8eb087120e
Make VisualStudio ignore the crypto stuff
...
Add stubs for the crypto functions so we can refer to them, even if we
never use them.
2016-06-07 16:58:09 +03:00
Wassim Hamidouche
76cb6dc6c2
add check flags
2016-06-07 10:54:26 +02:00
Ari Koivula
60ea8a359f
Add --crypto parameter
2016-06-07 10:31:40 +02:00
Wassim Hamidouche
02308d1ba6
add MVs encryption
2016-06-07 10:28:30 +02:00
Wassim Hamidouche
4637c8a828
compile Kvazaar encoder with ITpp library
2016-06-07 08:33:04 +02:00
Eemeli Kallio
8f182ac6de
Added functions select_starting_point and mv_in_merge to search_inter.c
2016-06-06 17:16:04 +03:00
Ari Koivula
fe71638a96
Fix problem with ASM compilation
...
When compiling C++ files along with C, libtool would complain about
the --tag missing, even though CC should be the default.
2016-06-06 15:47:56 +03:00
Eemeli Kallio
836a3b1daa
Added functions select_starting_point and mv_in_merge.
2016-06-06 12:18:33 +03:00
Ari Koivula
4eaacbe23e
Fix bug with lp-gop and ratecontrol
...
The first frame was always qp51 due to gop_offset being -1 for the
first frame. This fix makes it so that bits are allocated as if it was
the last (high quality) frame from the previous GOP.
2016-05-27 15:53:55 +03:00
Ari Koivula
3fbd7ed97f
Add GOP layer weights for lowdelay-P
...
When using ratecontrol with lowdelay-P, this improves BDRate by 1-25%.
Strongest effect is when using 4 layers and multiple references.
Also allow using 1 or 2 layers with ratecontrol.
2016-05-27 13:46:26 +03:00
Ari Koivula
67acead4bc
Fix referring over IDR boundary when using --gop
...
This problem resulted in an illegal bitstream with --gop=lp, because it
uses IDR's. The --gop=8 would not code IDR pictures, even when told to
with -p, which masked this problem.
This fix solves the problem with --gop=lp and also prevents references
across the intra picture in --gop=8. The intra pictures should be set
to IDR in a later fix, or an alternate method of differentiating
between IDR and non-IDR intra should be made.
2016-05-27 13:20:53 +03:00
Ari Koivula
a77dc1610e
Refactor encoder_state_remove_refs
...
I needed to debug this, so I rewrote it to make sense. There is an
obvious bug with the IDR handling that I left in place to fix in a
separate commit.
2016-05-27 13:20:45 +03:00
Eemeli Kallio
b5c05e58e0
Fixed typo in strategyselector.c
2016-05-24 11:04:29 +03:00
Ari Lemmetti
68c6f0f7b8
Enable deblocking for every preset
...
Deblocking adds very little complexity
while giving massive coding performance boost
2016-05-17 18:50:31 +03:00
Ari Lemmetti
6a07761b46
Add smp and amp options to presets
2016-05-17 14:26:58 +03:00
Ari Lemmetti
3107a93eaf
Fix avx2 chroma sampling for amp
2016-05-17 14:09:57 +03:00
Ari Koivula
24d0f9f685
Fix usage message for --hash
2016-05-11 15:03:43 +03:00
Ari Koivula
a1c772b696
Merge pull request #136 from MrAsura/cu-split-termination
...
Cu split termination
Closes #133.
2016-05-10 17:22:08 +03:00
Jaakko Laitinen
7010526b1d
Removed tabs.
2016-05-10 15:52:44 +03:00
Jaakko Laitinen
a77eb5c874
Fixed type conversion error when parsing cu split termination.
2016-05-10 14:34:46 +03:00
Jaakko Laitinen
0d361d5bc7
Moved cu split termination from a pre-processor to an input parameter.
2016-05-10 14:15:41 +03:00
Ari Koivula
1dbe4eb852
Merge branch 'mv-full'
2016-05-10 13:28:07 +03:00
Ari Koivula
f6a9d237a3
Merge pull request #134 from miimiz/testink_eemeli
...
Strategyselector prints
2016-05-10 13:27:23 +03:00
Eemeli Kallio
8cfeed852c
Added print about SIMD optimizations available and in use to strategyselector.
2016-05-10 12:59:15 +03:00
Ari Koivula
f51a68b6fa
Add different sizes of search window for full search
2016-04-21 15:11:35 +03:00
Ari Lemmetti
efbdc5dade
Utilize registers more efficiently for 8x8 and larger blocks
2016-04-21 13:26:38 +03:00
Ari Lemmetti
192cee95b2
Vectorize vertical filtering
2016-04-21 13:26:38 +03:00
Ari Lemmetti
0be35f72b8
Filter 4 pixels simultaneously in x direction
2016-04-21 13:26:38 +03:00
Ari Lemmetti
10484bda9f
Make strategies out of fractional pixel sample functions
2016-04-21 13:26:38 +03:00
Ari Koivula
28e7548387
Fix bug in full mv search
...
This optimization led to some points not being searched.
2016-04-21 12:03:57 +03:00
Ari Koivula
2576aeee0b
Use merge candidates in full mv search
...
Perform a full search window around every mv candidate and the
0-vector.
2016-04-20 20:47:11 +03:00
Ari Lemmetti
8247faf8e0
Remove 64-bit only instruction to fix 32-bit compilation.
2016-04-19 18:05:11 +03:00
Ari Lemmetti
eb55d6b6b9
Fix writing over boundary.
2016-04-19 16:03:43 +03:00
Ari Lemmetti
bcabc6fadd
Remove pixel blit from strategies. Use memcpy instead.
2016-04-06 18:44:04 +03:00
Ari Lemmetti
2140197ccc
Tidy up coeff blit function and use memcpy again.
...
Give memcpy constants for fixed sizes to enable copying many bytes simultaneously.
2016-04-06 18:03:00 +03:00
Ari Koivula
08b4480d94
Re-add time.h include
...
Include-what-you-use wants to include sys/time.h instead, or if I
override it to include time.h it will remove the include completely.
2016-04-02 19:05:16 +03:00
Ari Koivula
61fc3e87ba
Run include-what-you-use fix_includes.py
...
The includes should make more sense now and not just happen to compile
due to headers included from other headers.
Used a modified version of IWYU. Modifications were to attribute int8_t
and so on to stdint.h instead of sys/types.h and immintrin.h instead of
more specific headers.
include-what-you-use 0.7 (git:b70df35)
based on clang version 3.9.0 (trunk 264728)
2016-04-01 17:46:55 +03:00
Ari Koivula
016810d982
Move COMPILE_ macro to global.h
...
While these are only used for strategies, it's non-intuitive to have
to include strategyselector.h in every file under strategies before
including anything else.
2016-04-01 17:46:55 +03:00
Ari Koivula
8908d85d66
Change all relative includes to absolute
2016-04-01 17:46:44 +03:00
Ari Koivula
4876879b82
Add IWYU pragmas
2016-03-31 12:33:34 +03:00
Marko Viitanen
41a5f9bbbe
Fix filetime conversion to timespec
2016-03-24 10:08:11 +02:00
Ari Koivula
9139e169fe
Fix unnecessary waiting in main thread
...
The main thread has to wait for the worker threads to finish. The
pthread_cond_timedwait call used to accomplish this was given
a relative instead of absolute time, which resulted in the call
returning immediately, because the time had already passed.
This removes the now unnecessary sleeps and fixes the time given to
the pthread_cond_timedwait such that it now waits until a job finishes
or 100ms have passed.
2016-03-23 22:23:04 +02:00
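Commit 9139e169fe above turns on the fact that POSIX pthread_cond_timedwait takes an absolute deadline, not a duration. As a rough sketch of the conversion, assuming a hypothetical helper named deadline_after_ms (not a function from the Kvazaar source) and a non-negative millisecond count, the deadline could be built like this:

```c
#include <time.h>

/* Hypothetical helper, not taken from the Kvazaar source.
 * Returns `now` advanced by `ms` milliseconds (ms >= 0 assumed),
 * keeping tv_nsec normalized to the range [0, 1e9). */
static struct timespec deadline_after_ms(struct timespec now, long ms)
{
  struct timespec deadline = now;
  deadline.tv_sec += ms / 1000;
  deadline.tv_nsec += (ms % 1000) * 1000000L;
  if (deadline.tv_nsec >= 1000000000L) {
    /* Carry the overflowing nanoseconds into the seconds field. */
    deadline.tv_sec += 1;
    deadline.tv_nsec -= 1000000000L;
  }
  return deadline;
}
```

The caller would read the current time from the clock the condition variable uses, pass it through the helper with 100 ms, and hand the result to pthread_cond_timedwait. Passing a bare duration such as {0, 100000000} is read as a point in time near the epoch, which is already in the past, so the wait returns at once.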
Ari Koivula
e23ed231fb
Fix race condition with owf and non-square motion partitions
...
The OWF wpp limit code assumed square blocks, and as such did not work
correctly when height != width. This changes the relevant code to consider
both height and width.
2016-03-22 16:46:38 +02:00
Arttu Ylä-Outinen
d6a3e02f16
Fix calculating reference CU index in inter search
...
Fixes a possible segfault when SMP or AMP blocks are used.
2016-03-22 12:55:58 +02:00
Ari Lemmetti
f4538ab474
Copy pixels more efficiently in lcu recon.
2016-03-18 20:10:03 +02:00
Ari Koivula
5b66578f71
Add kvz_ prefix to md5 functions
...
The non kvz_ symbols were being exported in the static lib, which got caught
by Travis tests.
2016-03-18 13:13:35 +02:00
Ari Koivula
4125218cfa
Add --hash=md5
...
Add md5 through extras/libmd5 taken from HM with BSD license. It's
implemented as a generic strategy using the same interface as checksum,
so we can write a SIMD version if it seems necessary.
2016-03-18 05:23:57 +02:00
Ari Koivula
883448b8fb
Add --hash parameter
...
Allows decoded picture hash to be selected among none and checksum.
2016-03-18 05:20:15 +02:00
Ari Lemmetti
6d5f8e3aec
Define KVZ_COMPILE_ASM for the correct files.
...
Enables asm strategies again.
2016-03-17 16:21:31 +02:00
Ari Lemmetti
e502292ba8
Remove old function
2016-03-16 20:18:55 +02:00
Ari Lemmetti
c6cc96f5ec
Optimize sao band ddistortion
2016-03-16 20:16:00 +02:00
Ari Lemmetti
ab577f476f
Optimize sao reconstruct color
2016-03-16 20:15:32 +02:00
Ari Lemmetti
48bfddf4ec
Optimize calc sao edge dir
2016-03-16 20:14:50 +02:00
Ari Lemmetti
ba69992941
Optimize sao edge ddistortion
2016-03-16 20:14:19 +02:00
Ari Lemmetti
941b6b3e27
Optimize calc eo cat
2016-03-16 20:13:30 +02:00
Ari Lemmetti
04fbb48a09
Add strategy for avx2. Copy generic functions there.
2016-03-16 20:13:15 +02:00
Ari Lemmetti
4e30a215d8
Create generic strategy for sao.
2016-03-16 20:11:15 +02:00
Ari Koivula
6f431e510c
Comment and tidy threadqueue_worker
...
Carefully avoided making any changes to the logic.
2016-03-14 20:08:04 +02:00
Ari Koivula
1165ae2e1f
Increase --mv-constraint=frametilemargin margin
...
Increase the margin to be 4 luma pixels to every direction.
2016-03-14 16:02:54 +02:00
Arttu Ylä-Outinen
0eda28ced6
Fix Visual Studio warnings
...
Initialization of a struct with addresses of local variables generated
warning C4221 in encmain.
2016-03-14 14:12:21 +02:00
Ari Koivula
e91ca74733
Refactor kvz_encode_last_significant_xy
2016-03-10 18:47:16 +02:00
Ari Koivula
1fc0e8076c
Format kvz_encode_last_significant_xy whitespace
2016-03-10 18:17:45 +02:00
Ari Koivula
df9a958ef2
Merge branch 'log2'
2016-03-10 18:16:41 +02:00
Ari Koivula
4112a4364d
Remove g_to_bits table
2016-03-10 15:59:51 +02:00
Ari Koivula
9fcfba637f
Remove duplicated inline functions
2016-03-10 15:28:31 +02:00
Ari Koivula
e27ec2cc53
Add kvz_math.h for common inline math functions
...
Calling it just math.h would have prevented including system math.h.
2016-03-10 15:26:18 +02:00
Ricardo Constantino
c515796a21
Only use version prefix in kvazaar binary
...
Fixes regression since 54f08f2 causing libkvazaar version checks to not
work (i.e. pkg-config)
2016-03-09 16:13:59 +00:00
Arttu Ylä-Outinen
54f08f2bdb
Use output of git describe as version.
2016-03-09 15:04:29 +02:00
Ari Koivula
f8edf28161
Fix const qualifier warning
...
Also set the warning to an error in VS.
2016-03-09 14:16:15 +02:00
Ari Koivula
b0c3ece31e
Fix race condition when deblocking is on but SAO is off
...
Already suspected this yesterday, but didn't want to add the code to
handle it before confirming that it's actually a problem. It is.
2016-03-09 14:02:46 +02:00
Ari Koivula
1671725c72
Fix non-determinism issue with OWF WPP margin
...
The previous reasoning used deblocking and fractional motion estimation
together to arrive at a margin of 4 pixels. This was wrong, and with
either of these off, half pixel chroma interpolation could use pixels
outside the intended region.
Deblocking does not currently affect the margin needed.
2016-03-08 20:18:38 +02:00
Ari Koivula
674bfa14ce
Comment WPP deblocking and SAO
...
I was a bit unclear about exactly what happens and when regarding SAO
and deblocking when we do frame-parallel WPP parallelism, so I checked
and commented the bits that were unclear to me.
2016-03-08 19:39:04 +02:00
Ari Koivula
aec152c953
Fix OWF mv restriction limit
...
The check was done in regard to the wrong dimension, allowing the
access to unfinished parts of the frame when coding multiple frames
at the same time.
2016-03-08 17:12:43 +02:00
Ari Koivula
fda103aa7c
Refactor cfg->tiles_width_count and cfg->tiles_height_count
...
Change code everywhere so these actually mean "width count" and not
"width count minus one".
2016-03-07 17:29:15 +02:00
Ari Koivula
a350eb3a1e
Fix --tiles to have the correct number of tiles.
...
The tiles_width_count etc. actually mean "count minus one".
2016-03-07 17:24:31 +02:00
Ari Koivula
49ea2d7b7f
Fix --mv-constraint=frametile
...
Option --mv-constraint=frametilemargin was being used instead of
frametile.
2016-03-07 16:41:00 +02:00
Ari Koivula
95b8dd99f6
Add --tiles parameter
...
Add new parameter --tiles that accepts only a uniform split. I considered
supporting the syntax of --tiles-width-split for this, but writing
--tiles=u2xu2 is just not as intuitive as --tiles=2x2, and there is
hardly ever any reason to use anything but uniform split. The more
cumbersome --tiles-width-split and --tiles-height-split parameters
are still there to allow finer control.
2016-03-07 16:33:51 +02:00
Ari Koivula
fd34dd9bc6
Fix race condition with OWF
...
There was an off-by-one error in the dependency-setting code, which
resulted in dependencies not being set and caused checksum errors,
for example when ref_neg=1 and owf=1.
2016-03-07 13:38:23 +02:00
Ari Koivula
81b439f4da
Optimize starting point selection in tz
...
Avoid checking zero motion vectors multiple times. The merge candidate
list often has only one or two candidates, the other being zeroes.
2016-03-04 16:48:46 +02:00
Ari Koivula
2436702c27
Optimize starting point selection in hexbs
...
Avoid checking zero motion vectors multiple times. The merge candidate
list often has only one or two candidates, the other being zeroes.
2016-03-04 16:48:12 +02:00
Ari Koivula
5327b59b45
Remove KVZ_PERF_SEARCHPX
...
It's too invasive and we don't really need it.
2016-03-04 16:48:12 +02:00
Arttu Ylä-Outinen
348ac4888b
Fix calc_mode_bits.
...
The CUs left and above the current one would be set to NULL when there
was only one CU between the current one and the left or top edge of the
frame.
2016-03-04 14:08:35 +02:00
Ari Koivula
86219aa0fc
Fix non-determinism with tiles
...
Earlier fix that fixed the supply side of the cu_array to take tile
coordinates into account should have been accompanied with this one
that does the same thing to demand side.
2016-03-03 17:39:20 +02:00
Arttu Ylä-Outinen
626b53ce85
Move sao search from encoderstate to sao.
...
Moves sao search from function encoder_state_worker_encode_lcu in
encoderstate.c to function kvz_sao_search_lcu in sao.c. Makes functions
kvz_init_sao_info, kvz_sao_search_chroma and kvz_sao_search_luma static
since they are no longer used outside sao.c.
2016-03-01 14:56:16 +02:00
Ari Koivula
cfa722e448
Reduce parallelism for tiles
...
There is still some race-condition with encoding tiles from multiple
frames, so disable this to keep the bitstream deterministic.
2016-02-29 20:20:21 +02:00
Ari Koivula
3dcc0957f8
Deal with impossible mv constraints
...
If the 0,0 vector is illegal, it's possible that no legal movement vector
is found, in which case a large cost is returned instead. The cost
overflowed and there is all sorts of silliness with converting from
double to int, but I'm not going to fix all of it because when we
remove the doubles it will all get fixed.
2016-02-29 19:18:14 +02:00
Ari Koivula
b1adf1576a
Add --mv-constraint=frametilemargin
...
Add an even stricter motion vector constraint to prevent motion vectors
to fractional pixel positions that would need pixels outside the tile.
2016-02-29 19:18:14 +02:00
Ari Koivula
f808cbf608
Allow increased parallelism for tiles
...
When movement vectors are constrained to tiles, only the same tile in
previous frame needs to be depended upon.
2016-02-29 14:33:06 +02:00
Ari Koivula
f4ebff12b0
Combine tile mv constraint with OWF mv constraint
...
This also fixes movement vectors in tiles when OWF is on. The OWF mv
constraint assumed WPP, so it didn't work with tiles.
2016-02-29 14:33:06 +02:00
Ari Koivula
7981609cd0
Add --mv-constraint=frametile
2016-02-29 14:33:06 +02:00
Ari Koivula
9dbbb7fdbc
Add --mv-constraint argument
2016-02-29 14:33:06 +02:00
Ari Koivula
1be877faf9
Fix chroma reconstruction with tiles
...
An incorrect frame boundary check caused a checksum error, because the
chroma reconstruction of the encoder was wrong. The encoder treated
horizontal tile boundaries as frame boundaries when the vertical
component of the movement vector was a multiple of 8.
2016-02-29 14:32:51 +02:00
Ari Koivula
c0dc490dd1
Fix inter non-determinism with tiles
...
CU data was being copied to the wrong place in the reference frames
cu_array, which led to uninitialized data being used as a starting
point for motion vector search.
Fixes #99.
2016-02-26 17:05:04 +02:00
Ari Koivula
719d72925b
Add loop-input option
...
This option is useful for testing long encodes, as you don't have to
find an actual infinite input.
2016-02-18 20:00:55 +02:00
Ari Koivula
d23a5a15f1
Fix overflow in rate control
...
A 32 bit int overflowed after 2^31 bits (2Gb). It will still overflow
eventually, after 500 years of outputting 1Gb/s, but by that time,
I reckon we will have fixed this properly and it's time to upgrade.
2016-02-18 16:48:21 +02:00
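The overflow described in d23a5a15f1 above is the usual result of accumulating a bit count in a signed 32-bit integer. A minimal sketch of a wider accumulator follows. The struct and function names are hypothetical, and the 64-bit width is an assumption, since the commit message does not name the replacement type:

```c
#include <stdint.h>

/* Hypothetical running total of coded bits. The real field name and
 * type in Kvazaar's rate control are not given in the commit message. */
typedef struct {
  uint64_t total_bits;
} rc_totals_t;

/* Add the size of one coded frame to the running total. A 64-bit
 * counter keeps counting well past the 2^31 bits where a signed
 * 32-bit int would overflow. */
static void rc_add_frame_bits(rc_totals_t *rc, uint32_t frame_bits)
{
  rc->total_bits += frame_bits;
}
```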
Ari Koivula
eeafe14946
Clean up search initialization
...
Copy lcu explicitly instead of initializing with the same parameters.
2016-02-17 14:57:31 +02:00
Arttu Ylä-Outinen
e5c84c361c
Eliminate a race condition with input thread.
...
Changes communication between the input thread and main thread in
encmain.c so that only one of them uses img_in and retval at a time.
Fixes a race condition which would sometimes result in a deadlock.
2016-02-17 12:09:19 +02:00
Ari Koivula
c40ede56ad
Allow more frame parallelism in LP-gop
...
Add dependency to the reference frame instead of the previous frame,
in order to allow more frames to be encoded in parallel when temporal
stepping >1 in LP-gop (such as --gop=lp-g8d4r1t2).
2016-02-05 17:08:24 +02:00
Arttu Ylä-Outinen
40c7198f7d
Add a script for updating README
...
Adds script tools/update_readme.sh for regenerating the "Using Kvazaar"
section of README.md from the output of "kvazaar --help".
2016-02-05 16:21:39 +02:00
Arttu Ylä-Outinen
aac5373095
Fix typos in documentation
...
Fixes a few typos in README and command line help.
2016-02-05 16:21:27 +02:00
Ari Koivula
a4915dc547
Update man and README
2016-02-04 14:16:58 +02:00
Ari Koivula
e941e21cd6
Enable errors about non-existing CLI options
...
Set opterr and optind to their normal default values.
2016-02-04 13:48:58 +02:00
Ari Koivula
7a4bf94a52
Add --version and --help
...
Also don't print help by default, because it's too long. Print a
shorter usage message instead.
2016-02-04 13:48:48 +02:00
Ari Lemmetti
99e37ec235
Update old pixel type to the current one
2016-01-30 19:33:09 +02:00
Ari Koivula
c76a0951cf
Change version to 0.8.3
2016-01-28 21:21:02 +02:00
Ari Koivula
cb2121b1aa
Double time scale when field coding is used
2016-01-28 21:04:52 +02:00
Ari Koivula
8ad7d2a714
Move interlacing stuff to libkvazaar API
...
This moves the interlacing from CLI code to api->encoder_encode, in
order to make it possible to use field coding through the lib API.
The field order is now determined per frame, as FFmpeg gives it per
frame and it's signaled per frame.
As a side effect, the CLI also now prints info from frames instead of
fields. While we might want to extend the API in the future to allow
printing of more detailed information about fields, for now it's
more important that the CLI uses the real lib API.
PSNR calculation for interlaced frames is disabled until we have a way to
avoid deinterlacing the frame when it's not necessary.
2016-01-27 15:29:45 +02:00
Ari Koivula
6952f0fcc6
Refactor interlaced reading
...
Doesn't change the way it works. Just rearranges things so it's easier
to see what is going on.
2016-01-26 13:42:41 +02:00
Ari Koivula
a46351efe1
Fix out of bounds error in interlacing
...
When field height was padded to a multiple of 8, yuv_io_extract_field
would read outside the buffer.
2016-01-26 13:41:52 +02:00
Arttu Ylä-Outinen
49677810b5
Rename config module to cfg.
...
Prevents a conflict with config.h and src/config.h so that the config.h
generated by configure is included in global.h. Fixes problems with
large input files on 32-bit systems.
2016-01-25 12:26:46 +02:00
Marko Viitanen
8e6c12b859
Merge branch 'input_reading_thread'
2016-01-25 12:00:03 +02:00
Marko Viitanen
b4a4ce848c
Use field parity for extracting correct fields from the interlaced picture
2016-01-25 10:58:12 +02:00
Marko Viitanen
441ce7728f
Fix for input_read_thread() in the case when interlaced source-scan-type is used
2016-01-25 10:57:51 +02:00
Marko Viitanen
198204a20a
Fix when using --source-scan-type=bff, offset was used for output lines
2016-01-25 10:13:51 +02:00
Ari Koivula
22b8ed43dc
Remove global.h include from kvazaar.h
...
It shouldn't have been put there as it's the lib interface.
2016-01-22 15:23:34 +02:00
Ari Koivula
249c88011e
Fix problem with >2GB input files on 32bit
2016-01-22 15:15:02 +02:00
Ari Koivula
fa1af14637
Fix includes to include global.h first everywhere
2016-01-22 15:07:49 +02:00
Ari Koivula
3bf278529c
Fix interlacing when using lib interface
...
Some flags used for interlacing were set in CLI interface, which
meant that interlacing didn't work correctly when used through
libkvazaar.
2016-01-22 14:35:20 +02:00
Marko Viitanen
0128ee26e7
Clear img_in pointer after reading it
2016-01-22 14:29:35 +02:00
Marko Viitanen
b5459c1f23
Fixed performance monitoring by adding KVZ_ prefix to GET_TIME
2016-01-22 11:27:25 +02:00
Marko Viitanen
e36237335e
Fixed memory leaks caused by the input handler thread and cleaned up the code
2016-01-22 11:27:25 +02:00
Marko Viitanen
ad9a1f6539
Input thread implementation
...
- Handle input processing in a separate thread to allow main thread more time with thread handling etc
- Significant speedup can be seen when run on ultrafast settings and on a system with a large number of cores
2016-01-22 11:27:25 +02:00
Ari Koivula
5e734593c0
Add psnr argument to CLI
...
To disable calculation of PSNR for frames, printing 0.0dB instead.
2016-01-21 15:08:34 +02:00
Ari Koivula
9eba3a83cc
Add compiler flag checking to configure
2016-01-20 16:32:34 +00:00
Arttu Ylä-Outinen
d452709795
Fix compiling AVX2 strategies.
...
Option -mavx2 was omitted when compiling AVX2 strategies. This commit
moves strategies to convenience libraries so that their compilation
flags can be easily set and adds -mavx2 to CFLAGS of the AVX2 library.
2016-01-20 11:04:12 +02:00
Ari Koivula
8060e2f6ec
Delete kvazaar_version.h
...
It's not used anymore.
2016-01-19 20:40:35 +02:00
Ari Lemmetti
44656aeb19
Remove useless calculation
2016-01-19 16:35:16 +02:00
Marko Viitanen
e822c16659
Removed unneeded cpu flags causing compiling to fail on powerpc, closes #121
2016-01-18 08:55:32 +02:00
Ari Koivula
c8c0b4e8e8
Change version number for v0.8.2
2016-01-15 19:42:07 +02:00
Ari Koivula
e2402c0000
Remove kvz_api_get versioning.
...
We have soname versioning now, so we should focus on getting that right
instead. This also serves as an example of correctly incrementing the
lib-version.
2016-01-15 19:39:24 +02:00
Ari Koivula
caf809f26d
Remove scons build scripts
...
Because we are not going to maintain them.
2016-01-15 17:35:35 +02:00
Ari Koivula
15e1110997
Remove reference to Makefile-old
...
Makefile-old was deleted and this reference breaks make dist.
2016-01-15 17:32:54 +02:00
Ari Lemmetti
a9decd2f40
Bump for yet another release
2016-01-14 23:23:11 +02:00
Ari Koivula
7718ac378f
Add fractional FPS support.
...
Now that we put the timing info into the bitstream, the time base must
be precisely known. Represent framerate as a fraction and add timing
info only if the old floating point framerate was not used.
Deprecate cfg->framerate so it can be removed once we get patches to
FFmpeg and libav.
Add support for (num)/(denom) format to --input-fps.
2016-01-14 22:16:53 +02:00
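Commit 7718ac378f above adds a (num)/(denom) form to --input-fps. One way such an argument could be parsed is sketched below. The function name parse_fps is hypothetical and this is not the parser from the Kvazaar source. It handles only the integer and fraction forms, not a floating point value such as 29.97:

```c
#include <stdio.h>

/* Hypothetical parser, not taken from the Kvazaar source.
 * Accepts "num" or "num/denom" with positive integers.
 * Returns 1 and fills *num and *denom on success, 0 on failure. */
static int parse_fps(const char *text, int *num, int *denom)
{
  int n = 0;
  int d = 1;  /* Stays 1 when no "/denom" part is present. */
  int fields = sscanf(text, "%d/%d", &n, &d);
  if (fields < 1 || n <= 0 || d <= 0) {
    return 0;
  }
  *num = n;
  *denom = d;
  return 1;
}
```

With this sketch, "30000/1001" yields the exact NTSC rate as the pair (30000, 1001), and a plain "25" yields (25, 1).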
Ari Lemmetti
a9bd7b9e63
Bump version numbers for release v0.8.0
2016-01-14 20:38:28 +02:00
Ari Lemmetti
b605e3866e
Bye bye Makefile
2016-01-14 20:38:01 +02:00
Marko Viitanen
242edf98ad
Added calculation and writing of VUI num_units_in_tick and time_scale
2016-01-14 15:32:33 +02:00
Ari Lemmetti
daf39e348f
Add dedicated handling for blitting NxN coeffs when N is 4, 8 or 16
2016-01-13 19:27:45 +02:00
Ari Lemmetti
a2fc9920e6
Merge branch 'alternative-satd'
2016-01-13 15:00:43 +02:00
Ari Lemmetti
1ed34f2df8
Add some planar pred optimization for blocks larger than 8x8
2016-01-13 14:50:17 +02:00
Ari Lemmetti
0df88697ff
Copy generic function to AVX2 strategy
2016-01-12 23:51:18 +02:00
Ari Lemmetti
62799a9fc3
Create generic strategy of planar prediction
2016-01-12 23:50:47 +02:00
Ari Lemmetti
3cb1cebfe5
Add missing inlines
2016-01-12 23:03:31 +02:00
Ari Lemmetti
6a0b13b8b6
Remove unused functions
2016-01-12 22:55:37 +02:00
Ari Lemmetti
61155f0edd
Add 128-bit version of the functions as well
2016-01-12 22:52:00 +02:00
Ari Lemmetti
a6afb8a8f4
Small refactoring
2016-01-12 22:29:33 +02:00
Ari Lemmetti
a756f6133a
Manually unroll vertical Hadamard transform
2016-01-12 21:45:02 +02:00
Ari Lemmetti
66350aa20e
Experiment with alternative implementation of FWHT
2016-01-11 16:25:56 +02:00
Arttu Ylä-Outinen
e14858f41a
Fix build and tests.
...
- Remove non-existent file interface_main.c from library sources.
- Add file mv_cand_tests.c to test sources.
2015-12-21 16:03:55 +02:00
Arttu Ylä-Outinen
9abdee7cc3
Merge branch 'autotools'
2015-12-21 15:54:30 +02:00
Arttu Ylä-Outinen
eb6fa3d980
Fix exporting functions in library.
...
Rewrites definition of macro KVZ_PUBLIC in kvazaar.h so that
KVZ_STATIC_LIB need not be defined when building a static library.
2015-12-21 14:38:59 +02:00
darealshinji
8427a85d36
Add tests
2015-12-19 08:24:35 +01:00
Ari Koivula
1270da3626
Move files under their modules in Visual Studio
...
Also moves CLI stuff under CLI project, so they are compiled as their
own lib just like when the Makefile is used.
The file interface_main.c was an artifact from a bygone era and should have
been deleted long ago.
2015-12-17 15:39:45 +02:00
Ari Koivula
947bae24f9
Update Doxygen documentation
...
Add module information to all header files.
Update all header file documentations to briefly say what they are, and
to use the javadoc format so the brief actually gets included into the
doxygen documentation.
Remove \file from implementation files, in order to not repeat the info
from the header files.
Add files under strategies and tools to Doxygen and update the Doxygen
settings to be just plain better.
Make README be the main page of Doxygen documentation.
2015-12-17 14:05:50 +02:00
Ari Koivula
a6ea705e19
Add missing lambda to some bit costs
...
Bits were being added to rate distortion without being multiplied by
lambda in a few places. Fixing this bug also finally allows us to remove
the magic bits from the Coding Unit split decision.
I tried to find new optimum value for CU_COST and it turned out to be 2
for veryslow and 0 for superfast. The difference between 0 and 2 on
veryslow was only 0.1% however, so I don't think this parameter is
needed any longer. Before this fix the effect of removing CU_COST would
have been 0.8%.
2015-12-15 16:32:38 +02:00
Arttu Ylä-Outinen
0e33049d9e
Enable full mv search once again.
...
- Updates function search_mv_full so that it compiles and handles
non-square blocks.
- Enables compilation of search_mv_full.
- Sets full search radius to 32.
- Enables selecting full mv search with "--me full".
2015-12-15 12:26:26 +02:00
Arttu Ylä-Outinen
dbb9b0df85
Enable search for AMP blocks.
2015-12-15 11:21:46 +02:00
Arttu Ylä-Outinen
7e4f4538a4
Implement encoding AMP part modes.
...
Also adds parameter --amp for enabling AMP blocks.
2015-12-15 11:21:45 +02:00
Arttu Ylä-Outinen
c3716f7803
Add --smp option for enabling SMP blocks.
2015-12-15 11:21:45 +02:00
Arttu Ylä-Outinen
38b881c36f
Implement search_frac for rectangular blocks.
...
Replaces parameter depth of function search_frac with parameters width
and height.
2015-12-15 11:21:45 +02:00
Arttu Ylä-Outinen
864c77f6eb
Use kvz_satd_any_size in inter search.
...
Changes search_frac and kvz_search_cu_inter to use kvz_satd_any_size for
computing the SATDs instead of getting the SATD function with
kvz_pixels_get_satd_func.
2015-12-15 11:21:45 +02:00
Arttu Ylä-Outinen
056fa09ba5
Add arbitrary-sized SATD functions.
...
Adds strategy satd_any_size for generic and AVX2. The satd_any_size
functions are implemented with macro SATD_ANY_SIZE defined in
strategies-picture.h.
2015-12-15 11:21:45 +02:00
Arttu Ylä-Outinen
6bdc08b6eb
Drop unused function declaration.
...
Removes declaration of non-existent function satd_8bit_8x8_generic in
strategyselector.h.
2015-12-15 11:21:44 +02:00
Arttu Ylä-Outinen
728a6abecc
Extract macro SATD_NxN.
...
Combines definitions of macros SATD_NXN and SATD_NXN_AVX2 to macro
SATD_NxN and moves it to strategies-picture.h.
2015-12-15 11:21:44 +02:00
Arttu Ylä-Outinen
1eebfde0c5
Make tz search work with non-square blocks.
...
Replaces parameter depth with parameters width and height.
2015-12-15 11:21:44 +02:00
Arttu Ylä-Outinen
e203883f3d
Refactor kvz_filter_deblock_lcu.
...
Moves code for filtering the rightmost 4 pixels of an LCU to a separate
function filter_deblock_lcu_rightmost.
2015-12-15 11:21:44 +02:00
Arttu Ylä-Outinen
21ca74fe86
Replace deblock filter with a simple loop.
...
- Adds function is_pu_boundary.
- Moves code for filtering an edge of a single PU or TU to a new
function filter_deblock_unit.
- Replaces recursive CU tree traversal in filter_deblock_cu with
a simple loop and renames it to filter_deblock_lcu_inside.
2015-12-15 11:21:43 +02:00
Arttu Ylä-Outinen
7516fda970
Make fractional recon work with non-square blocks.
...
Adds parameter block_height to functions inter_recon_frac_luma,
inter_recon_14bit_frac_luma and inter_recon_14bit_frac_chroma so that
they can handle SMP blocks.
2015-12-15 11:21:43 +02:00
Arttu Ylä-Outinen
591a1ce6db
Turn some inter recon functions static.
...
Makes the following functions static since they are not used outside
inter.c:
- kvz_inter_recon_frac_luma
- kvz_inter_recon_14bit_frac_luma
- kvz_inter_recon_frac_chroma
- kvz_inter_recon_14bit_frac_chroma
2015-12-15 11:21:43 +02:00
Arttu Ylä-Outinen
0f531362bf
Enable Nx2N partitions.
2015-12-15 11:21:43 +02:00
Arttu Ylä-Outinen
4402e251ae
Fix kvz_get_extended_block functions.
...
The buffers allocated in functions kvz_get_extended_block_avx2 and
kvz_get_extended_block_generic were too small when the width of the
block was less than its height. Fixed to allocate correctly sized
buffers.
2015-12-15 11:21:43 +02:00
Arttu Ylä-Outinen
bdd8b1c0aa
Implement 2NxN partitions in inter search.
...
- Try using 2NxN partitions after the usual 2Nx2N.
- Adds function kvz_search_cu_smp to search_inter module.
2015-12-15 11:21:42 +02:00
Arttu Ylä-Outinen
410064e880
Split lcu_set_inter into two functions.
...
Moves code for setting the inter modes for a single PU to a new function
lcu_set_inter_pu.
2015-12-15 11:21:42 +02:00
Arttu Ylä-Outinen
3236428e4d
Make hexbs search work with non-square blocks.
...
Replaces parameter depth with parameters width and height.
2015-12-15 11:21:42 +02:00
Arttu Ylä-Outinen
31ba8d61c3
Implement fractional chroma recon for SMP blocks.
...
Adds parameter block_height to function kvz_inter_recon_frac_chroma.
2015-12-15 11:21:42 +02:00
Arttu Ylä-Outinen
0b6cef7be5
Remove unused function kvz_inter_set_block.
2015-12-15 11:21:42 +02:00
Arttu Ylä-Outinen
e63486b23f
Make lcu_set_inter work with SMP blocks.
2015-12-15 11:21:41 +02:00
Arttu Ylä-Outinen
7b99eb2970
Call recon functions correctly for SMP blocks.
...
Makes calls to kvz_inter_recon_lcu and kvz_inter_recon_lcu_bipred in
function search_cu work correctly when using SMP blocks.
2015-12-15 11:21:41 +02:00
Arttu Ylä-Outinen
dc4525c0e3
Implement inter recon for non-square blocks.
...
Adds parameter height to functions kvz_inter_recon_lcu and
kvz_inter_recon_lcu_bipred and makes them work on non-square sizes.
Fractional reconstruction functions do not handle non-square blocks yet.
2015-12-15 11:21:41 +02:00
Arttu Ylä-Outinen
f874c8614e
Add part_mode binarization table comment.
2015-12-15 11:21:41 +02:00
Arttu Ylä-Outinen
c77074a7ff
Implement encoding SMP blocks.
2015-12-15 11:21:41 +02:00
Arttu Ylä-Outinen
98707a1288
Move encoding intra CU to a separate function.
...
Moves code for encoding a single intra coding unit from function
kvz_encode_coding_tree to a new function encode_intra_coding_unit.
2015-12-15 11:21:40 +02:00
Arttu Ylä-Outinen
c336674da3
Move encoding part mode to a separate function.
...
Moves code for encoding the part mode from function
kvz_encode_coding_tree to a new function encode_part_mode.
2015-12-15 11:21:40 +02:00
Arttu Ylä-Outinen
ac952cbb44
Move encoding inter PUs to a separate function.
...
Moves code for encoding a single inter prediction unit from function
kvz_encode_coding_tree to function encode_inter_prediction_unit.
2015-12-15 11:21:40 +02:00
Arttu Ylä-Outinen
5ee9f164e8
Add macros for getting PU location and size.
...
- Moves SIZE_* definitions to cu.h.
- Adds constant arrays kvz_part_mode_num_parts, kvz_part_mode_offsets
and kvz_part_mode_sizes for storing the number of PUs, PU offsets and
PU sizes.
- Adds macros PU_GET_X, PU_GET_Y, PU_GET_W and PU_GET_H for getting the
location and size of a PU.
2015-12-15 11:21:40 +02:00
Arttu Ylä-Outinen
a3df13fb99
Make kvz_inter_get_merge_cand work with SMP blocks.
...
- Replaces parameter depth with parameters width and height.
- Adds parameters use_a1 and use_b1 for disabling the use of merge
candidates A1 and B1.
2015-12-15 11:21:40 +02:00
Arttu Ylä-Outinen
1cd149fb97
Check merge/mv candidate types earlier.
...
Moves checks for motion vector prediction and merge candidate block
types (inter/intra) from functions kvz_inter_get_mv_cand and
kvz_inter_get_merge_cand to kvz_inter_get_spatial_merge_candidates.
2015-12-15 11:21:39 +02:00
Arttu Ylä-Outinen
969c91d7c4
Add a test for kvz_inter_get_spatial_merge_candidates.
2015-12-15 11:21:39 +02:00
Arttu Ylä-Outinen
02375bf7e5
Make kvz_inter_get_mv_cand work with SMP blocks.
...
Replaces the depth parameter of kvz_inter_get_mv_cand with parameters
width and height.
2015-12-15 11:21:39 +02:00
Ari Koivula
3a80c7de74
Further optimize coefficient coding
...
Remove the need to count the coefficients by populating the significant
coefficient group map first and finding the last coefficient from the
last group afterward. The speedup is about 2% on ultrafast.
The previous version of this patch was reverted due to a bug, which
has now been fixed.
2015-12-11 16:47:55 +02:00
Ari Lemmetti
b78460b02c
Optimize another loop
2015-12-11 11:21:43 +02:00
Ari Koivula
b32965925e
Revert "Further optimize coefficient coding"
...
This reverts commit 25462124f8.
That commit broke the bitstream. If it's not good enough to push on Friday
night, it's probably not good enough on Monday morning either.
2015-12-07 15:12:04 +02:00
Ari Koivula
865c86fef2
Remove unused variable
2015-12-07 10:32:18 +02:00
Ari Koivula
91631a1c36
Merge branch 'coeff-optimization'
2015-12-07 10:25:46 +02:00
Ari Koivula
25462124f8
Further optimize coefficient coding
...
Remove the need to count the coefficients by populating the significant
coefficient group map first and finding the last coefficient from the
last group afterward.
2015-12-07 10:23:01 +02:00
Ari Koivula
c94707e6e8
Fix bug with OWF+FME+deblocking
...
Increases the MV safety margin of OWF from 2 to 3 when deblocking
is used and 4 when both deblocking and FME are used.
Fractional pixel motion estimation can move the vector one more pixel
down causing checksum error. This fixes that error by increasing the
OWF safety margin and changes the interface, so that different margin
can be used when FME or deblocking are not in use.
2015-12-04 15:26:56 +02:00
darealshinji
fe2ff12244
fix building with autotools
2015-12-03 22:41:24 +01:00
darealshinji
b6d3510c2e
pkg-config: move -lm to Libs.private
2015-12-03 22:39:27 +01:00
Ari Lemmetti
6fe223c4dc
Nonzero calculation magic
2015-12-03 18:29:44 +02:00
Ari Lemmetti
f2d8cd4d64
Merge branch 'intra-search-multi'
2015-12-03 17:25:52 +02:00
Ari Lemmetti
c4e1552ef6
Replace original rough intra search
2015-12-03 17:13:11 +02:00
Ari Lemmetti
ee8c2d0218
Add 4x4 dual SATD for AVX2
2015-12-03 17:13:11 +02:00
Ari Lemmetti
00736fa708
Generate larger than 8x8 dual satd functions with macro
2015-12-03 17:13:11 +02:00
Ari Lemmetti
bd3e1922cd
Add AVX2 8x8 dual hadamard transform
2015-12-03 17:13:11 +02:00
Ari Lemmetti
d575b94357
Implement generic functions for dual sad / satd
2015-12-03 17:13:11 +02:00
Ari Lemmetti
183ee53f47
Add alternative version of rough intra search.
...
Calculate two costs simultaneously to exploit larger SIMD registers.
Implementation for dual functions missing currently.
2015-12-03 17:12:38 +02:00
darealshinji
8ff28ec974
Make dynamic linking easier
2015-12-01 14:34:08 +01:00
Arttu Ylä-Outinen
21e19067fe
Extract inter search in a single ref frame.
...
Moves code for doing inter search in a single reference frame from
function kvz_search_cu_inter to a new function search_cu_inter_ref.
2015-11-18 11:16:27 +02:00
Arttu Ylä-Outinen
f9f3d5929e
Use macros for indexing cu_array in lcu_t.
...
Replaces accesses to cu_array with macro calls and adds macros
LCU_GET_TOP_RIGHT_CU and CU_GET_CU.
2015-11-18 11:16:27 +02:00
Arttu Ylä-Outinen
8db8f3d523
Use macro SUB_SCU where possible.
...
Replaces expressions like (x & 0x3f) with SUB_SCU(x).
2015-11-18 11:16:26 +02:00
Arttu Ylä-Outinen
9532d79adb
Add macros for indexing cu array in lcu_t.
...
- Adds macros LCU_GET_CU and LCU_GET_CU_AT_PX to cu.h.
- Replaces accesses to the cu array of lcu_t by calls to these macros.
2015-11-18 11:16:26 +02:00
Arttu Ylä-Outinen
39302b0328
Refactor inter_clear_cu_unused.
...
Replaces duplicated code with a for loop.
2015-11-18 11:16:26 +02:00
Arttu Ylä-Outinen
33208ac9fb
Add a comment explaining the cu array in lcu_t.
2015-11-18 11:16:25 +02:00
Arttu Ylä-Outinen
e0b02599a5
Refactor filter_deblock_edge_ functions.
...
Replaces repetitive calls to kvz_filter_deblock_luma and
kvz_filter_deblock_chroma with loops in functions
filter_deblock_edge_luma and filter_deblock_edge_chroma.
2015-11-18 11:12:32 +02:00
Arttu Ylä-Outinen
43fc6ac419
Mark deblock functions static.
...
Marks the following functions static and removes them from filter.h
since they are not used outside the filter module.
- kvz_filter_deblock_luma
- kvz_filter_deblock_chroma
- kvz_filter_deblock_edge_luma
- kvz_filter_deblock_edge_chroma
- kvz_filter_deblock_cu
2015-11-18 11:12:31 +02:00
Arttu Ylä-Outinen
c93a190940
Refactor deblocking filter functions.
...
- Replace parameter depth in kvz_filter_deblock_edge_{luma,chroma} with
length.
- Move checking whether an edge needs to be filtered from functions
kvz_filter_deblock_edge_{luma,chroma} to functions
kvz_filter_deblock_{cu,lcu}.
- Use pixel coordinates instead of 8-pixel block coordinates in
kvz_filter_deblock_cu.
- Add comments.
These changes should make it easier to modify the deblocking filter to
handle SMP and AMP blocks.
2015-11-18 11:12:31 +02:00
Arttu Ylä-Outinen
863bd1c55d
Replace EDGE_ macros with an enum in filter.
...
Replaces macros EDGE_HOR and EDGE_VER with enum edge_dir.
2015-11-18 11:12:30 +02:00
Ari Koivula
b6e443f3ce
Fix bug in lp-gop parsing
...
Unnecessary mod operation resulted in 0 as the reference delta.
2015-11-16 11:58:57 +02:00
Ari Koivula
cfe834bb53
Merge branch 'lowdelay_GOP'
...
Conflicts:
README.md
2015-11-14 00:05:13 +02:00
Ari Koivula
5ae97b46c6
Remove redundant LP-GOP structures
...
These structures can now be defined with the LP-GOP syntax.
The syntax for lb is g4d3r4t1 and for ultralow it is g8d3r1t1.
2015-11-14 00:01:29 +02:00
Ari Koivula
d2de2aa6aa
Add "lp-g8d4r2t2" style GOP selection
...
This is my own syntax that I've been using when testing this feature.
It allows for defining some simple types of hierarchical GOP structures.
2015-11-14 00:01:28 +02:00
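The shape of an "lp-g8d4r2t2" string can be illustrated with a small parser. This is only a sketch: the struct and function names are hypothetical, the four numbers are stored under their letter tags without interpreting them, and the real option parser may differ.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical container: one integer per letter tag in the string. */
typedef struct {
  int g;
  int d;
  int r;
  int t;
} lp_gop_sketch;

/* Returns 1 if s has exactly the form "lp-g<N>d<N>r<N>t<N>", else 0. */
static int parse_lp_gop(const char *s, lp_gop_sketch *out)
{
  char extra = 0;
  assert(s != NULL && out != NULL);
  /* The trailing %c only converts if characters remain after the last number,
   * so a result of exactly 4 means the whole string matched. */
  int n = sscanf(s, "lp-g%dd%dr%dt%d%c", &out->g, &out->d, &out->r, &out->t, &extra);
  return n == 4;
}
```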
Ari Koivula
b10866cb1e
Fix SPS ref pic counts for lowdelay GOP
2015-11-13 23:11:11 +02:00
Ari Koivula
a6a713ac02
Use P-slices for lowdelay GOPs
2015-11-13 23:11:11 +02:00
Ari Koivula
0722f461c5
Fix compiling tests on mac
...
The mac version of KVZ_GET_TIME macro has many statements, which
prevented it being used inside a for loop statement. Added brackets
to all versions to prevent this issue arising in the future.
Fixes #115.
2015-11-13 22:57:29 +02:00
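The problem with multi-statement macros can be sketched as follows. `BUMP_BOTH` is a hypothetical stand-in, not the real KVZ_GET_TIME, and a braceless loop body is just one place where the issue shows up.

```c
#include <assert.h>

/* Hypothetical macro, not the real KVZ_GET_TIME: two statements wrapped in
 * braces so that the expansion is a single compound statement. Without the
 * braces, only ++(a) would belong to a braceless loop body. */
#define BUMP_BOTH(a, b) { ++(a); ++(b); }

static int bump_n_times(int n)
{
  int a = 0, b = 0;
  assert(n >= 0);
  for (int i = 0; i < n; ++i)
    BUMP_BOTH(a, b);
  return a + b;
}
```

A `do { ... } while (0)` wrapper is a common alternative when the macro also has to work before an `else`.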
Ari Koivula
93637f4683
Move macros in threads.h to KVZ_ namespace
2015-11-13 22:46:32 +02:00
Arttu Ylä-Outinen
95fb2ed9ed
Fix unexpected behavior of --deblock option.
...
Giving a single number as an argument to --deblock option would enable
deblocking and set both beta and tc to that value. This commit changes
a single number argument to be interpreted as a boolean specifying
whether to enable deblocking or not. As a result, "--deblock 0" can be
now used to disable deblocking.
This fixes deblocking being enabled in all presets.
2015-11-13 14:22:42 +02:00
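The argument handling described in this commit and the one below it might look roughly like this. The struct and function names are hypothetical, and treating a "beta:tc" argument as also enabling deblocking is an assumption of the sketch.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical settings struct; the caller fills in defaults first. */
typedef struct {
  int enable;
  int beta;
  int tc;
} deblock_sketch;

/* Accepts "beta:tc" (two integers) or a single number read as a boolean.
 * Returns 1 on success and 0 if the argument is malformed. */
static int parse_deblock(const char *arg, deblock_sketch *out)
{
  int a = 0, b = 0;
  char extra = 0;
  assert(arg != NULL && out != NULL);

  if (sscanf(arg, "%d:%d%c", &a, &b, &extra) == 2) {
    out->enable = 1;  /* assumption: explicit offsets imply deblocking is on */
    out->beta = a;
    out->tc = b;
    return 1;
  }
  if (sscanf(arg, "%d%c", &a, &extra) == 1) {
    out->enable = (a != 0);  /* beta and tc keep their previous values */
    return 1;
  }
  return 0;
}
```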
Arttu Ylä-Outinen
60ad19d0c8
Fix --deblock option.
...
Fixes --deblock option so that it takes a "beta:tc" argument as
advertised in the README and command line help.
2015-11-13 14:22:42 +02:00
Arttu Ylä-Outinen
8960ce369e
Reject extra command line arguments.
...
Changes the command line program to print an error and exit instead of
silently ignoring non-option arguments.
2015-11-13 13:57:00 +02:00
Arttu Ylä-Outinen
e42f1351f9
Call config_parse through the api struct in cli.
...
Replaces a call to kvz_config_parse with api->config_parse.
2015-11-09 14:31:04 +02:00
Arttu Ylä-Outinen
87ca9e1856
Drop an unnecessary call to kvz_threadqueue_flush.
...
Removes threadqueue dependency from the command line program.
2015-11-09 14:31:04 +02:00
Arttu Ylä-Outinen
b1abe65e83
Move kvz_get_padding to encmain.
2015-11-09 14:31:03 +02:00
Arttu Ylä-Outinen
0eb1e710e6
Move PSNR computation from videoframe to encmain.
...
Moves function kvz_videoframe_compute_psnr to encmain and renames it to
compute_psnr. Removes videoframe dependency from the command line
program.
2015-11-09 13:50:43 +02:00
Arttu Ylä-Outinen
940ada4c0d
Mark AVX2 intra filter functions as static.
...
Marks functions filter_4x4_avx2, filter_16x16_avx2 and filter_NxN_avx2
static as they are not used outside strategies/avx2/intra-avx2.
2015-11-09 12:48:20 +02:00
darealshinji
f51e3847b6
Fix cross-building on Linux
2015-11-06 21:53:44 +01:00
Marko Viitanen
94bec1b444
Cleanup of mv-rdo, removed unused functions
2015-11-05 14:40:06 +02:00
Marko Viitanen
0cb57961b0
Use dynamically selected get_mvd_cost function for MV candidate selection
2015-11-05 14:31:37 +02:00
Marko Viitanen
bb4f50aded
Added mv-rdo commandline parameter and use it in presets
2015-11-05 13:59:30 +02:00
Marko Viitanen
4e7e9eefbf
Enable usage of MV RDO with a config parameter (in hexbs, tz, frac, bipred)
2015-11-05 12:24:03 +02:00
Marko Viitanen
9a535e1c56
Added missing kvz_ prefixes and fixed some warnings
2015-11-05 09:07:59 +02:00
Marko Viitanen
822c174377
Set cabac to only count bits
2015-11-05 09:07:59 +02:00
Marko Viitanen
1ed0d85020
Added a function for cabac mvd coding cost get_mvd_coding_cost_cabac()
...
Conflicts:
src/rdo.c
2015-11-05 09:07:59 +02:00
Marko Viitanen
6a2658cc74
Added calc_mvd_cost_cabac() to calculate real bits used for motion vectors
...
Conflicts:
src/rdo.h
src/search_inter.c
Conflicts:
src/rdo.c
2015-11-05 09:07:59 +02:00
Ari Lemmetti
fbd0596114
Merge branch 'avx2-pixels-blit'
2015-11-04 11:06:10 +02:00
Ari Lemmetti
57ea7d223b
Pass SIMD registers to functions as pointers to fix 32-bit compilation in visual studio
2015-11-04 10:51:26 +02:00
Ari Lemmetti
a3855652e9
Add AVX2 version with separate handling of basic blocks and strideless copy.
2015-11-04 10:07:25 +02:00
Ari Lemmetti
0816fbea2c
Create generic strategy of blit function
2015-11-04 10:07:25 +02:00
Ari Koivula
5c1ff57f9f
Add corresponding option for every "--no-X" option
...
Needed in order to turn back options turned off by presets.
2015-11-04 00:12:26 +02:00
Ari Koivula
8d9e8aad73
Fix lambda calculation to match HM
...
The lambda was not being increased for non-key frames and was different
in other ways too. The new implementation matches HM.
2015-11-03 16:49:42 +02:00
Ari Koivula
ba47b3cdb1
Make --preset accept numbers
...
Ultrafast corresponds to 0 and placebo to 9.
2015-11-03 15:46:23 +02:00
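A sketch of accepting either a preset name or a number. The message only confirms that ultrafast is 0 and placebo is 9; mapping the numbers in between to the order of the names listed in commit 27e743a507 is an assumption, and the identifiers are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Assumed ordering: the ten names as listed in commit 27e743a507. */
static const char *const preset_names_sketch[] = {
  "ultrafast", "superfast", "veryfast", "faster", "fast",
  "medium", "slow", "slower", "veryslow", "placebo"
};

/* Returns the preset index 0..9 for a name or a single digit, or -1. */
static int preset_index(const char *arg)
{
  assert(arg != NULL);
  const int count = (int)(sizeof(preset_names_sketch) / sizeof(preset_names_sketch[0]));

  if (arg[0] >= '0' && arg[0] <= '9' && arg[1] == '\0') {
    return arg[0] - '0';
  }
  for (int i = 0; i < count; ++i) {
    if (strcmp(arg, preset_names_sketch[i]) == 0) {
      return i;
    }
  }
  return -1;
}
```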
Ari Koivula
74ee2f3b27
Redefine presets and include them in README.
2015-11-03 15:26:34 +02:00
Marko Viitanen
27e743a507
Added a commandline option for using a preset
...
- Defined presets: ultrafast, superfast, veryfast, faster, fast, medium,
slow, slower, veryslow, placebo
2015-11-03 12:25:06 +02:00
Marko Viitanen
641c204277
Use lowdelay flag in GOP for not using input picture caching
...
- Reduced layers to 3 in LB
2015-11-02 12:36:41 +02:00
Marko Viitanen
9a99f7972f
New GOP structure for ultralow delay
2015-11-02 11:33:16 +02:00
Marko Viitanen
388986399f
Added a definition for low-delay B GOP structure
2015-11-02 10:53:06 +02:00
Marko Viitanen
821d5c478b
Added missing parameter to kvz_strategy_register_picture_generic()
2015-11-02 08:55:54 +02:00
Ari Lemmetti
6dce1f1e33
Update versions for a new release
2015-10-30 17:31:55 +02:00
Ari Lemmetti
d71f1b5bd0
Disable incompatible optimizations for 32-bit version
2015-10-24 15:32:27 +03:00
Ari Lemmetti
df995d85e8
Utilize AVX2 for dequantization.
2015-10-23 20:17:08 +03:00
Ari Lemmetti
cf347e33c4
Move dequant to strategies. Copy generic to AVX2 as well.
2015-10-23 19:53:50 +03:00
Ari Lemmetti
47082738aa
...and the same tricks for quantized reconstruction
2015-10-23 19:44:38 +03:00
Ari Lemmetti
7961ba80d8
Add functions for bigger block sizes to calculate more residual simultaneously and reduce memory accesses
2015-10-23 19:11:56 +03:00
Ari Lemmetti
15edd5060d
Load and store multiple elements simultaneously. Use 128-bit wide zero
...
test. *wip*
2015-10-23 17:03:16 +03:00
Ari Lemmetti
b37cca87c8
Copy generic to avx2
2015-10-23 17:03:15 +03:00
Ari Lemmetti
cad2ea9d6e
Move quantize_residual to quant strategies.
2015-10-23 17:03:15 +03:00
Ari Lemmetti
c013e58f0c
Merge branch 'avx2-faster-angular'
2015-10-23 16:54:35 +03:00
Ari Lemmetti
0c63041ba7
Add filtering functions for different block sizes. Simplify logic a bit to reduce branching. Sorry for the large commit!
2015-10-23 16:54:15 +03:00
Arttu Ylä-Outinen
f7b6365db8
Merge pull request #109 from lu-zero/master
...
version: Bump
2015-10-23 12:26:01 +03:00
Luca Barbato
7ecd9c7284
version: Bump
...
d5f3778f72 provided a new interface
2015-10-23 10:02:28 +02:00
Arttu Ylä-Outinen
1cf55f066f
Fix memory leak in encoder_headers.
...
The header data was not freed when data_out was NULL.
2015-10-23 09:55:08 +03:00
Arttu Ylä-Outinen
a1272e98f8
Prevent disabling VPS from command line.
...
Disabling VPS when using the command line encoder would result in an
invalid bitstream.
2015-10-19 11:25:29 +03:00
Arttu Ylä-Outinen
024fedff57
Disable writing VPS when vps_period is negative.
...
Turns vps_period in struct encoder_control_t into a signed value.
Negative values are interpreted as "never send parameter sets."
2015-10-19 11:25:18 +03:00
Arttu Ylä-Outinen
d5f3778f72
Add function encoder_headers to API.
...
This provides means for obtaining the VPS, SPS and PPS separately from
the rest of the bitstream.
2015-10-16 11:47:27 +03:00
Arttu Ylä-Outinen
037b72c72b
Add parameter stream to VPS, SPS and PPS encoding.
2015-10-14 14:40:45 +03:00
Arttu Ylä-Outinen
db17d33b0b
Simplify code in encoder_state-bitstream.
2015-10-14 12:37:26 +03:00
Luca Barbato
15fd8241a9
build: Replace a sed expression with a simpler awk
...
The former does not work for sure on macosx.
2015-10-10 12:42:24 +02:00
Luca Barbato
a44d24ce40
build: Drop a trailing space
2015-10-10 12:42:10 +02:00
Ari Lemmetti
5af7a42ebe
Enable AVX2 strategy. Add first version of optimizations.
2015-10-08 12:36:20 +03:00
Ari Lemmetti
f4fe3dca5e
Add AVX2 strategy. Copy generic implementation there.
2015-10-08 12:36:15 +03:00
Ari Lemmetti
54e8b346a3
Add intra strategy. Move angular prediction there.
2015-10-08 12:36:05 +03:00
Ari Lemmetti
c123b97fec
Remove option -fno-lto from strategies. LTO is no longer used anyway.
2015-10-05 19:34:56 +03:00
Ari Koivula
ff976e2afc
Arrange parameters in intra fancily
2015-10-05 06:23:14 +03:00
Ari Koivula
d83d57df1a
Fix function names in intra
...
Prefix non-static functions with kvz_intra_ and static with intra_.
2015-10-05 06:23:14 +03:00
Ari Koivula
30b4fa4247
Rename intra prediction to kvz_intra_predict
2015-10-05 06:23:14 +03:00
Ari Koivula
7280dbf429
Remove unnecessary function
...
This function used to be more complicated, but now it's so simple that
it's just obfuscating what's happening.
2015-10-05 06:23:05 +03:00
Ari Koivula
1221e4c7d2
Remove old intra prediction code.
2015-10-05 05:30:47 +03:00
Ari Koivula
23439557e6
Remove remaining usages of old intra prediction
2015-10-05 05:23:22 +03:00
Ari Koivula
ca3ba997aa
Switch to new intra pred in search_intra_chroma_rough.
2015-10-05 05:03:58 +03:00
Ari Koivula
eaff6e29d9
Switch to new intra pred in kvz_search_cu_intra
2015-10-05 04:00:42 +03:00
Ari Koivula
55d741e250
Switch to new intra pred in kvz_intra_recon_lcu_chroma
2015-10-05 04:00:42 +03:00
Ari Koivula
678a1dd1dd
Switch to new intra pred in kvz_intra_recon_lcu_luma
2015-10-05 02:29:02 +03:00
Ari Koivula
cd2f1797bf
Add reimplemented intra prediction code
...
Just alongside for now to help with debugging.
The main difference with the new versions is that they take and output
width**2 blocks and two width*2+1 arrays of reference samples,
instead of the (2*width+8)**2 blocks the old ones do. This should make
the interface clearer and the memory footprint smaller.
Also commented the shit out of angular prediction, so hopefully Ari L.
will have an easier time with a SIMD implementation.
2015-10-05 02:29:02 +03:00
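The footprint claim in this message can be checked with a little arithmetic. The helpers below are hypothetical and only count samples, using the sizes stated in the message (a width**2 block plus two width*2+1 reference arrays, against a (2*width+8)**2 block).

```c
#include <assert.h>

/* Samples passed through the reimplemented interface: one width*width block
 * plus two reference arrays of 2*width+1 samples each. */
static int new_interface_samples(int width)
{
  assert(width > 0);
  return width * width + 2 * (2 * width + 1);
}

/* Samples in the old (2*width+8) by (2*width+8) block. */
static int old_interface_samples(int width)
{
  assert(width > 0);
  const int side = 2 * width + 8;
  return side * side;
}
```

For a 32-wide block this gives 1154 samples against 5184, and for a 4-wide block 34 against 256.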
Ari Koivula
115756b9d7
Accept --rd=3 parameter
2015-10-05 02:28:56 +03:00
Ari Lemmetti
7a3dabf43e
Merge branch 'avx2-quant'
2015-10-02 16:31:33 +03:00
Ari Lemmetti
38106afa50
Add AVX2 version of quantization.
2015-10-02 16:18:52 +03:00
Ari Lemmetti
ef0ad292ef
Add quantization strategy.
2015-10-02 16:17:02 +03:00
Ari Koivula
41dd44f7cf
Fix warnings with -DNDEBUG
2015-10-02 15:13:07 +03:00
Ari Koivula
81f5ca76cb
Accept tile configurations with either dimension as one
2015-10-02 15:06:31 +03:00
Ari Lemmetti
989cee1b04
Add 4x4 function as well
2015-10-01 22:14:56 +03:00
Ari Lemmetti
8b57b2bb1a
Refactor SATD to inline most of the function. Replace full horizontal add with shuffle and regular packed add.
2015-10-01 21:29:25 +03:00
Ari Lemmetti
55da2a9958
Add intrinsic version of SATD for 8x8 and larger blocks
2015-10-01 19:42:22 +03:00
Ari Lemmetti
d68fc4c41e
Add header for common utilities to use with strategies.
2015-10-01 19:40:35 +03:00
Arttu Ylä-Outinen
512e5bb25f
Bump version to 0.7.0
2015-09-30 15:20:57 +03:00
Arttu Ylä-Outinen
8f404a3b6f
Add NAL unit type to frame_info.
2015-09-28 10:30:59 +03:00
Arttu Ylä-Outinen
1c898a2f4a
Prefix NAL unit type enum constants with KVZ_.
2015-09-28 10:30:58 +03:00
Arttu Ylä-Outinen
4e5c7fe6e8
Remove function kvz_encoder_compute_stats.
...
Changes main function to compute frame PSNR by calling
kvz_videoframe_compute_psnr directly with the source and reconstructed
pictures returned from encoder_encode.
2015-09-28 10:30:58 +03:00
Arttu Ylä-Outinen
efd361ee8e
Return the original picture from encoder_encode.
2015-09-28 10:30:58 +03:00
Arttu Ylä-Outinen
afd0d3eee0
Remove encoderstate dependency from cli module.
...
Changes function print_frame_info to use a kvz_frame_info struct to get
the data to be printed.
2015-09-28 10:30:58 +03:00
Arttu Ylä-Outinen
7edc1b0b1c
Add reference picture lists to kvz_frame_info.
2015-09-28 10:30:57 +03:00
Arttu Ylä-Outinen
d5dceb45f1
Factor out a function for building ref lists.
...
The code for building the reference picture lists was duplicated in
functions encoder_state_ref_sort and print_frame_info. This commit moves
it to a new function kvz_encoder_get_ref_lists. Also makes
encoder_ref_insertion_sort static since it is not used outside the
encoderstate module any more.
2015-09-28 10:30:57 +03:00
Arttu Ylä-Outinen
c856a6b598
Output frame info from encoder_encode.
...
Adds a new output parameter info_out to encoder_encode. It returns
a struct containing information about the encoded frame, including POC,
QP and slice type.
2015-09-28 10:30:57 +03:00
Arttu Ylä-Outinen
173b70b53f
Rename SLICE_* enum constants to KVZ_SLICE_*.
2015-09-28 10:30:56 +03:00
Ari Koivula
63ab4068be
Clean up the makefile a bit
...
Use the existing TARGET_CPU_ARCH and TARGET_CPU_BITS instead of filtering
ARCH over and over again.
Comment some of the more obscure parts.
2015-09-18 15:13:13 +03:00
Ari Koivula
eb12fe0d98
Re-enable disabled -m32 and -m64 flags.
2015-09-18 12:13:31 +03:00
Ari Koivula
9537b996e7
Make makefile work on arm
...
Only compile x86 specific optimizations for x86 and don't give
-m32 or -m64 on arm.
2015-09-18 00:23:49 +03:00
Ari Koivula
d76890bbff
Bump version to 0.6.1
2015-09-16 18:42:20 +03:00
Ari Koivula
1d5cfbdcc2
Remove unused variable.
2015-09-16 18:39:46 +03:00
Ari Koivula
513e80bcca
Fix bug causing unnecessary copying of memory
...
This bug caused a single tile's worth of lcu_info_t structs to be copied
unnecessarily for every LCU in the frame. This obviously caused huge
memory bandwidth issues when coding large frames without tiles. The
effect was minimized somewhat with a large number of tiles, because
only the current tile was copied.
From context it is clear that this piece of code was supposed to copy
a single tile or frame, once the frame was done, but because it was
placed in a function which is called for every LCU, it copied the data
for the LCU, but also lots of extra stuff.
The fix is to copy only the current LCU instead of the whole tile.
2015-09-16 18:23:44 +03:00
Marko Viitanen
d8b50d6951
Bump version to 0.6.0
2015-09-15 15:44:55 +03:00
Marko Viitanen
5b3f2a6229
Merge branch 'pkgconfig-fix'
2015-09-15 15:37:49 +03:00
Ari Koivula
f1ac0e6bc2
Rename _DEBUG to KVZ_DEBUG
2015-09-15 13:04:03 +03:00
Ari Koivula
ec2d8d6ad7
Rename _DEBUG_PERF macros to KVZ_PERF
...
And move them to threadqueue.h, where the things that use them are.
2015-09-15 13:03:32 +03:00
Arttu Ylä-Outinen
4db06bcf07
Use correct version in kvazaar.pc.
...
Changes kvazaar.pc to use kvazaar version instead of the library
version. The version number is extracted from global.h using sed.
2015-09-15 12:57:34 +03:00
Marko Viitanen
3217e70f99
Revert "Revert "Fix keeping of reference frames over IDR boundary.""
...
This reverts commit 87936eb99f.
Conflicts:
src/encoderstate.c
2015-09-14 14:31:58 +03:00
Arttu Ylä-Outinen
b4ec664fc9
Set DTS values of output pictures.
...
Adds field dts to struct kvz_picture and rewrites kvz_encoder_feed_frame
to set the DTS when returning pictures.
2015-09-14 14:16:56 +03:00
Arttu Ylä-Outinen
25c23aa298
Remove static variables from kvz_encoder_feed_frame.
...
Adds struct input_frame_buffer_t for storing the input buffer state.
2015-09-14 14:12:19 +03:00
Arttu Ylä-Outinen
1d2a398197
Move function kvz_encoder_feed_frame to a separate module.
...
Adds module input_frame_buffer.
2015-09-14 14:12:18 +03:00
Arttu Ylä-Outinen
009717bf7c
Remove unused field bitstream_length from kvz_encoder.
2015-09-14 14:12:18 +03:00
Arttu Ylä-Outinen
97913cee40
Add pts field to kvz_picture.
...
The pts field can be used to set the presentation timestamp of the input
frames. The timestamps are copied to the reconstructed frames.
2015-09-14 14:12:00 +03:00
Ari Koivula
24618c90ce
Fix wrong type in debug code.
...
- This type is expected by outside debug scripts. It does not have to
match the function name.
2015-09-10 16:07:18 +03:00
Ari Koivula
0ac2bc31a3
Put parentheses around _DEBUG.
...
- To protect against precedence issues.
2015-09-10 16:06:19 +03:00
Ari Koivula
3958e8b6f7
Handle VS warnings with _DEBUG.
...
- Conditional expression is constant was being triggered by debug code.
2015-09-10 14:16:42 +03:00
Ari Koivula
cb1a206c74
Dump threading data structures only with _DEBUG_PRINT_THREADING_INFO.
...
- They are usually not needed when using _DEBUG.
2015-09-10 14:16:42 +03:00
Arttu Ylä-Outinen
70b3e10e27
Fix a crash with owf=4, gop=8, frames=10.
...
A call to kvz_threadqueue_waitfor caused the tqj_bitstream_written field
of the previous encoder state to become a dangling pointer, subsequently
causing an assertion to fail. This would only occur when the encoder
state used for a new frame was not the last finished one.
Fixed by setting tqj_bitstream_written to NULL after the job is done and
removing unnecessary calls to kvz_threadqueue_waitfor.
2015-09-07 15:37:04 +03:00
Arttu Ylä-Outinen
3c35f470a1
Fix get_ctx_cu_split_model.
2015-09-02 11:47:03 +03:00
Ari Koivula
9a23ae3d92
Resolve remaining Visual Studio warnings.
...
- Ignore most of them and fix the ones that can't be ignored.
2015-08-31 15:02:25 +03:00
Luca Barbato
efe5291427
build: Drop the gnu-only option Deterministic
...
Unbreak building on MacOSX and possibly other BSDs.
2015-08-29 10:38:10 +02:00
Ari Koivula
c52f7858ab
Use long start code in picture_timing_sei if it's first NAL in AU.
2015-08-27 15:23:34 +03:00
Ari Koivula
69d1059602
Fix access unit delimiter.
...
- The nal header was written after the pic_type.
2015-08-27 15:18:25 +03:00
Ari Koivula
9584cd7352
Move rbsp_trailing_bits elements to encapsulating functions.
...
- Also add missing bitstream align. It's unnecessary as the version can't not
be byte aligned.
2015-08-27 15:18:18 +03:00
Ari Koivula
207367f317
Add new kvz_bitstream_align which only aligns when needed.
...
- Changing picture_timing_sei_message to align doesn't change anything, but
protects against future changes if more data is added there in future.
2015-08-27 15:16:20 +03:00
Ari Koivula
b2fb1b6d4a
Rename kvz_bitstream_align to kvz_bitstream_rbsp_trailing_bits.
...
- The syntax is called rbsp_trailing_bits in spec and 1 byte is added
even when the bitstream is already aligned, so align is a bad name.
2015-08-27 14:33:30 +03:00
Arttu Ylä-Outinen
3a10e9e3e0
Prefix all non-static symbols with "kvz_".
2015-08-26 13:02:28 +03:00
Arttu Ylä-Outinen
bfe2b31cee
Make generic satd functions static.
2015-08-26 12:10:27 +03:00
Arttu Ylä-Outinen
d0bc58a874
Document the library API in more detail.
2015-08-26 12:10:27 +03:00
Arttu Ylä-Outinen
04ba5dca41
Make config_destroy accept a NULL pointer.
2015-08-26 12:10:26 +03:00
darealshinji
87a4e53c35
Makefile: add $(CPPFLAGS), use -Wl,-z,noexecstack and install development files on Linux
2015-08-22 16:00:26 +02:00
Ari Lemmetti
3661f3f3f5
Fix incorrect free on error
2015-08-21 18:26:12 +03:00
Ari Lemmetti
4103bd2786
Add missing padding for frame allocation
2015-08-21 17:25:54 +03:00
Ari Lemmetti
d6c3363dc8
Merge branch 'interlacing_experimental'
2015-08-21 15:50:24 +03:00
Ari Lemmetti
ed7948810e
Fix help message and update README.md
2015-08-21 15:29:48 +03:00
Ari Lemmetti
68fcc67a16
Add extraction of fields according to source scan type
2015-08-21 15:15:20 +03:00
Ari Lemmetti
581ff95412
Write flags and SEI messages for interlacing.
2015-08-21 14:46:05 +03:00
Arttu Ylä-Outinen
38893d2da1
Use the static lib to link the program.
...
Changes the Makefile to use the static library and additional object
files as linker input instead of all the object files when linking the
command line program.
2015-08-20 16:42:29 +03:00
Arttu Ylä-Outinen
cb49586d36
Add static library target to Makefile.
...
Adds targets libkvazaar.a and install-static to the Makefile.
2015-08-20 16:42:29 +03:00
Arttu Ylä-Outinen
23159e45b7
Rename LIB_OBJS to SHARED_OBJS in Makefile.
...
SHARED_OBJS is a more appropriate name since the objects are only used
for building shared libraries, not static ones.
2015-08-20 16:42:28 +03:00
Arttu Ylä-Outinen
8488ba1bb7
Remove unnecessary objects from libraries.
...
Removes the cli and yuv_io modules from libraries.
2015-08-20 16:42:28 +03:00
Arttu Ylä-Outinen
dd874a0a4a
Move writing of reconstructed picture to encmain.
...
- Removes parameter recout of function encoder_compute_stats.
- Now only encmain uses the yuv_io module.
2015-08-20 16:42:28 +03:00
Ari Lemmetti
923f4a74d5
Fix filtering over limits
2015-08-17 17:39:56 +03:00
Ari Lemmetti
febe66f148
Fix typo
2015-08-17 16:16:11 +03:00
Ari Lemmetti
82cf4e8ff4
Output error messages to stderr
2015-08-17 15:01:46 +03:00
Ari Lemmetti
3da71b62bf
Add checks if malloc fails
2015-08-17 15:01:46 +03:00
Ari Lemmetti
4718fe7fda
Change variable names to match used convention
2015-08-17 15:01:46 +03:00
Ari Lemmetti
6a5eaf08de
Rename extend_borders to get_extended_block. Add kvz_ prefix to type definition.
2015-08-17 15:01:46 +03:00
Ari Lemmetti
d82582c37c
Changes to extend border function.
...
Now outputs a pointer to a block with guaranteed padding for filtering.
Only generate extra pixels if samples are needed out of bounds.
Use memcpy otherwise.
2015-08-17 15:01:46 +03:00
Ari Lemmetti
fc038cb8bf
Add --source-scan-type parameter
...
Options progressive (default)
tff for top field first
bff for bottom field first
2015-08-13 12:53:14 +03:00
Ari Lemmetti
4dcc0d876d
Fix rate control when --owf=auto (default)
...
Value in cfg stays -1 if auto selection is used.
Use value in encoder instead.
2015-08-12 18:20:57 +03:00
Ari Lemmetti
5d96dbc6c0
Make strategy selection use bit depth given via parameter instead of excluding registration with defines
2015-08-12 13:33:38 +03:00
Ari Lemmetti
4122f36089
Prevent the registration of strategies that are incompatible when KVZ_BIT_DEPTH != 8
...
Remove unnecessary or misleading mentions of "8bit"
2015-08-12 11:29:53 +03:00
Ari Lemmetti
33b6481660
Remove unused variables.
2015-08-11 15:53:40 +03:00
Ari Lemmetti
348d7780fc
Remove third shift and offset from 14-bit sampling functions (change missing from rebase)
2015-08-11 15:06:16 +03:00
Marko Viitanen
8409317bd9
Fixed rebasing errors for 10bit branch
2015-08-11 14:56:45 +03:00
Marko Viitanen
58f12bd530
Changed frame 8bit to 10bit conversion to be done without memory allocation
2015-08-11 08:18:14 +03:00
Marko Viitanen
6453a511d7
Scale SAD/SATD costs to match bit depth
...
Conflicts:
src/image.c
2015-08-11 08:18:14 +03:00
Marko Viitanen
0304b6c412
Fixed luma interpolation filter when 10bit coding and some other minor fixes
2015-08-11 08:17:48 +03:00
Marko Viitanen
450b5e64ca
Fixed overflow on generic ipol filters when 10bit encoding
...
Conflicts:
src/strategies/generic/ipol-generic.c
2015-08-11 08:17:48 +03:00
Marko Viitanen
191d3e4d87
Fixed RDOQ on 10bit encoding
2015-08-11 08:14:35 +03:00
Marko Viitanen
414ebe6101
Fixed checksum on bitdepth > 8 cases
...
Conflicts:
src/nal.c
src/nal.h
src/strategies/generic/nal-generic.c
src/strategies/strategies-nal.c
src/strategies/strategies-nal.h
2015-08-11 08:14:35 +03:00
Marko Viitanen
57ab46f110
Small fixes all around to enable 10bit encoding
...
Conflicts:
src/encmain.c
src/encoder.c
src/encoderstate.c
src/global.h
2015-08-11 07:59:20 +03:00
Ari Lemmetti
7cd4f7a5c9
Enable fractional motion vectors with bipred
2015-08-10 18:49:12 +03:00
Ari Lemmetti
5887c96991
Add and use 14bit reconstruction for fractional motion vectors with bipred
2015-08-10 18:45:29 +03:00
Ari Lemmetti
a87dafb27a
Add hi_prec_buf_t for higher precision intermediate values for interpolation filter.
2015-08-10 18:35:06 +03:00
Ari Lemmetti
3e31ff2476
Use the function to sample half-pixels.
2015-08-10 18:30:41 +03:00
Ari Lemmetti
0a096c7040
Move qpel and octpel reconstruction in separate functions.
2015-08-10 18:25:52 +03:00
Ari Lemmetti
fc00b4795c
Preparations for more accurate reconstruction with bipred
2015-08-10 18:08:13 +03:00
Ari Lemmetti
8b4a6c92da
Add 14bit precision sample functions.
2015-08-10 18:02:06 +03:00
Ari Lemmetti
b30f17d4b8
Add fractional pixel sampling for chroma
2015-08-10 17:55:37 +03:00
Ari Lemmetti
650dd7d840
Use pixels_blit to copy necessary pixels.
2015-08-10 17:52:00 +03:00
Ari Lemmetti
01f40ec104
Add fractional pixel sampling for luma
2015-08-10 17:51:48 +03:00
Ari Koivula
0c3c93d456
Optimize intra SAD intrinsics.
...
- Added 64x64 version for completeness.
- With the exception of 16x16, these were all slightly slower than the ASM
versions, as measured by "kvazaar_test -s speed -t intra_sad", but now they
are on par or slightly faster.
- None of these actually use any AVX2 intrinsics, and probably never will,
unless someone adds an interface for doing more than one block at a time,
in which case the non-destructive versions might come in handy.
2015-08-06 19:35:00 +03:00
Ari Lemmetti
20b833bc8e
Fix mingw errors
2015-07-31 18:44:36 +03:00
Ari Lemmetti
12c391eb08
Add auto-detection for input resolution.
...
Use --input-res=auto as default.
2015-07-31 17:35:16 +03:00
Ari Koivula
0740a73fbb
Clean up Makefile.
...
- Move stuff around.
- LDFLAGS -shared and -dynamiclib imply -fpic.
2015-07-31 15:57:05 +03:00
Ari Koivula
beec2705b1
Add cli, lib-shared and lib-static to Makefile.
2015-07-31 15:57:05 +03:00
Ari Koivula
24b3306325
Fix incorrect pattern rules in Makefile.
...
- Having more than one rule in a pattern rule means that both of those files
are created at the same time with the rule. This only worked for debug,
because debug build was never done in the same invocation as release build.
2015-07-31 14:36:45 +03:00
Ari Koivula
1c27f67963
Remove -flto.
...
- Always use the compiler to invoke the linker. Clang will give additional
parameters to the linker when compiled with -flto.
- Giving a different optimization level to linker did not make any difference
in gcc-5.1.1.
2015-07-31 14:36:26 +03:00
Ari Koivula
54b1be341e
Don't compile executable with PIC.
...
- It's required for .so and .dylib, but not for .dll or the executable.
- It might be better to use libtool for this, but I'm not ready to go that
far yet.
2015-07-29 17:12:09 +03:00
Ari Koivula
f8154f8382
Merge branch 'make-dylib'
2015-07-29 11:28:43 +03:00
Ari Koivula
60437fd0c3
Add -lrt back the exe link command.
2015-07-29 11:28:11 +03:00
Luca Barbato
5c7a808bbd
build: Generate a pkg-conf file
2015-07-29 02:27:12 +02:00
Ari Koivula
04e1a21ded
Merge branch 'make-dylib'
...
Closes #94.
2015-07-28 11:42:46 +03:00
Ari Koivula
2211b90a24
Move comments for defines to a different line.
...
- Having comment as part of the define confuses doxygen. They get added
to every function that uses the macro.
2015-07-21 17:10:08 +03:00
Ari Koivula
022d28ab11
Fix small hexbs pattern.
...
- Who could mess this up? Oh.. right.
2015-07-21 16:12:44 +03:00
Ari Koivula
22e56f86c7
Move inter search patterns inside the search functions.
2015-07-21 16:06:31 +03:00
Ari Koivula
b73b275e08
Remove unused includes from search.
2015-07-21 15:06:06 +03:00
Ari Koivula
ae56118010
Move functions from search to search_intra.
2015-07-21 14:59:19 +03:00
Ari Koivula
bf7542c35d
Move functions from search to search_inter.
2015-07-21 12:16:05 +03:00
Ari Koivula
3c9b830d8f
Add modules search_intra and search_inter.
...
- For breaking up search module.
2015-07-21 12:04:16 +03:00
Arttu Ylä-Outinen
06ea593477
Change dylib file name to libkvazaar.X.dylib.
...
Changes the version number in the dylib filename from a three-digit
version (libkvazaar.X.Y.Z.dylib) to a single-digit one
(libkvazaar.X.dylib).
2015-07-20 15:09:46 +03:00
Arttu Ylä-Outinen
df749e032e
Add necessary linker options when building dylib.
...
Sets linker options -compatibility_version and -install_name when making
dylib.
2015-07-20 15:09:09 +03:00
Luca Barbato
9c414995c5
build: Add a MacOSX install target for the library
2015-07-17 19:44:20 +02:00
Arttu Ylä-Outinen
59f95b8e73
Add nasm support.
...
Makes it possible to build kvazaar using nasm instead of yasm.
- Adds trailing slashes to -I params in ASFLAGS.
- Disables CPU NOP directives when assembler is not yasm.
2015-07-17 13:59:25 +03:00
Arttu Ylä-Outinen
e307b7cec4
Check that input dimensions are multiples of two.
...
Fixes wrongly accepting non-multiple of two resolutions and a segfault
when one of the input dimensions is one.
2015-07-17 10:07:24 +03:00
Arttu Ylä-Outinen
d2c42cb303
Fix making tests.
...
Commit 9cfbd55e removed "./" prefix of the TESTS variable in the
Makefile but the recipe of target tests was expecting it. Fixed by
prepending "./" to the tests recipe.
2015-07-17 10:07:24 +03:00
Luca Barbato
56ff1c7805
build: Drop the non-standard -t
...
Should unbreak freebsd.
2015-07-16 16:50:09 +02:00
Arttu Ylä-Outinen
94e8fc1536
Build dylib on Darwin.
...
Adds target libkvazaar.dylib to Makefile. On Darwin, libkvazaar.dylib is
set as a prerequisite of the all target.
2015-07-16 14:15:09 +03:00
Arttu Ylä-Outinen
a4ec92081a
Make symbols hidden by default.
...
Adds "-fvisibility=hidden" to CFLAGS and LDFLAGS. Defines macro
KVZ_PUBLIC for marking symbols that should be visible.
2015-07-13 14:20:21 +03:00
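A sketch of how such an export macro is commonly written for GCC and Clang. The commit does not show the definition of KVZ_PUBLIC, so `KVZ_PUBLIC_SKETCH` and both functions below are hypothetical.

```c
#include <assert.h>

/* Assumed definition: override -fvisibility=hidden for selected symbols. */
#if defined(__GNUC__) && !defined(_WIN32)
#define KVZ_PUBLIC_SKETCH __attribute__((visibility("default")))
#else
#define KVZ_PUBLIC_SKETCH
#endif

int sketch_internal_scale(int x);
KVZ_PUBLIC_SKETCH int sketch_api_scale(int x);

/* Not marked: stays out of the shared library's exported symbols when the
 * library is built with -fvisibility=hidden. */
int sketch_internal_scale(int x)
{
  return 2 * x;
}

/* Marked: remains visible to users of the shared library. */
KVZ_PUBLIC_SKETCH int sketch_api_scale(int x)
{
  assert(x >= 0);
  return sketch_internal_scale(x) + 1;
}
```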
Arttu Ylä-Outinen
9cfbd55ea8
Add making symlinks to make install.
...
Running "make install" now creates symlinks libkvazaar.so and
libkvazaar.so.X pointing to libkvazaar.so.X.Y.Z.
2015-07-13 11:45:42 +03:00
Ari Koivula
c94d91061c
Merge branch 'cpuid-fix'
2015-07-09 11:40:46 +03:00
Arttu Ylä-Outinen
8550c6ccd8
Fix AVX2 detection.
...
Replaces calls to __get_cpuid by __cpuid_count on gcc and clang and
calls to __cpuid by __cpuidex on MSVC. Unlike __get_cpuid and __cpuid,
__cpuid_count and __cpuidex set the ecx register which is required for
AVX2 detection.
2015-07-09 11:20:37 +03:00
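A minimal sketch of the GCC/Clang side of this fix. The AVX2 flag is bit 5 of EBX in CPUID leaf 7, subleaf 0, which is why the subleaf in ECX has to be set. The function names are hypothetical, and the sketch leaves out the OS-support check that a complete detection routine also needs.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* Fills regs with EAX, EBX, ECX, EDX of CPUID leaf 7, subleaf 0.
 * Leaves them zeroed on other architectures or if leaf 7 is unavailable. */
static void read_cpuid_leaf7(unsigned regs[4])
{
  assert(regs != NULL);
  regs[0] = regs[1] = regs[2] = regs[3] = 0;
#if defined(__x86_64__) || defined(__i386__)
  if (__get_cpuid_max(0, NULL) >= 7) {
    /* __cpuid_count sets ECX to the subleaf before executing CPUID. */
    __cpuid_count(7, 0, regs[0], regs[1], regs[2], regs[3]);
  }
#endif
}

/* AVX2 is reported in bit 5 of EBX for leaf 7, subleaf 0. */
static int leaf7_has_avx2(const unsigned regs[4])
{
  assert(regs != NULL);
  return (int)((regs[1] >> 5) & 1u);
}
```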
Ari Koivula
9acf7795a2
Refactor cpuid capability detection.
...
- Moved cpuid data to a struct to make it easier to group data from one
cpuid call together.
- Renamed the bit masks to make it harder to mask the wrong register or
cpuid.
- Remove the .byte trick. We don't really need to support such ancient
compilers?
2015-07-09 11:20:37 +03:00
Arttu Ylä-Outinen
e69088026e
Write slice header before joining child streams.
...
The lengths of the leaf streams must be available when the slice header
is written. Writing the header before joining child streams removes the
need to copy leaf bitstreams instead of moving them.
2015-07-08 13:14:17 +03:00
Arttu Ylä-Outinen
907451590e
Fix encoding when both GOP and OWF are enabled.
...
Changes kvazaar_encode to not increase cur_state_num unless a frame is
started.
2015-07-07 10:05:42 +03:00
Arttu Ylä-Outinen
3efdee2c13
Fix compilation warnings when using clang.
...
Removes typedef redefinitions in kvazaar_internal.h.
2015-07-06 13:46:56 +03:00
Arttu Ylä-Outinen
cc580ac861
Only print PSNR if some frames were encoded.
2015-07-06 13:39:47 +03:00
Arttu Ylä-Outinen
089ff895ad
Fix seeking when input stream is not seekable.
2015-07-06 12:07:05 +03:00
Arttu Ylä-Outinen
aca5d7514f
Fix pocs reallocation in imagelist.
...
Replaced sizeof(int32_t*) by sizeof(int32_t).
2015-07-06 11:58:05 +03:00
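One way to avoid this class of mistake is to take the element size from the pointer itself. A sketch with a hypothetical helper, not the actual imagelist code:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Grows *pocs to hold new_count elements. Returns 1 on success; on failure
 * returns 0 and leaves the old array untouched. */
static int grow_pocs(int32_t **pocs, size_t new_count)
{
  assert(pocs != NULL && new_count > 0);
  /* sizeof **pocs is the element size (int32_t), so the pointer type and
   * the size expression cannot drift apart. */
  int32_t *tmp = realloc(*pocs, new_count * sizeof **pocs);
  if (tmp == NULL) {
    return 0;
  }
  *pocs = tmp;
  return 1;
}
```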
Arttu Ylä-Outinen
ca8435f581
Remove setting CC in Makefile.
2015-07-06 11:27:28 +03:00
Arttu Ylä-Outinen
3a47aab696
Fix allocating tile boundary arrays.
...
Column and row numbers had been mixed up.
2015-07-06 10:48:19 +03:00
Arttu Ylä-Outinen
a0865ff351
Change ime_algorithm in kvz_config to an enum.
...
Adds enum kvz_ime_algorithm to kvazaar.h.
2015-07-06 09:47:18 +03:00
Arttu Ylä-Outinen
66656fdebc
Move handling of command line args to cli module.
...
- Adds struct cmdline_opts_t.
- Adds functions cmdline_opts_parse and cmdline_opts_free to cli module.
- Removes fields input, output, debug, frames and seek from struct
kvz_config.
- Removes function config_read from config module.
2015-07-06 08:25:54 +03:00
Arttu Ylä-Outinen
581f740c59
Fix compilation when checkpoints are enabled.
...
- Include string.h in checkpoint.h
- Check return values of fgets calls in checkpoint.h.
- Replace variable length array in image.c by a dynamically allocated
array.
- Add -DCHECKPOINTS to CFLAGS in Makefile when CHECKPOINTS is defined.
2015-07-03 13:54:44 +03:00
Arttu Ylä-Outinen
6eb89a2813
Adjust Makefile for building kvazaar.dll.
...
Adds targets "kvazaar.dll" and "install-dll" to the Makefile.
2015-07-02 16:58:30 +03:00
Arttu Ylä-Outinen
af2b417809
Set up Makefile for building libkvazaar.so.
...
Adds targets "libkvazaar.so.0.0.0", "install", "install-prog" and
"install-lib" to the Makefile.
2015-07-02 16:58:30 +03:00
Arttu Ylä-Outinen
4ab9aa3e2f
Move kvz_encoder definition to kvazaar_internal.h.
2015-07-02 16:58:30 +03:00
Arttu Ylä-Outinen
b715ae9767
Return length of the data from encoder_encode.
...
Adds parameter len_out returning the length of the encoded data in bytes
to function encoder_encode.
2015-07-02 16:58:29 +03:00
Arttu Ylä-Outinen
538deaa9d6
Add functions picture_{alloc,free} to kvazaar API.
2015-07-02 16:58:29 +03:00
Arttu Ylä-Outinen
6451df9a4f
Move bitstream chunk definition to kvazaar.h.
...
- Renames struct bitstream_chunk_t to kvz_data_chunk.
- Renames macro BITSTREAM_MEMORY_CHUNK_SIZE to KVZ_DATA_CHUNK_SIZE.
- Removes kvz_payload typedef.
- Adds function chunk_free(kvz_data_chunk *chunk) to kvazaar API.
2015-07-02 16:58:28 +03:00
Arttu Ylä-Outinen
f7f17a060c
Rename pixel_t to kvz_pixel.
2015-07-02 16:58:28 +03:00
Arttu Ylä-Outinen
cecea44d37
Rename config_t to kvz_config.
2015-07-02 16:58:28 +03:00
Arttu Ylä-Outinen
17d720363a
Rename struct image_t to kvz_picture.
2015-07-02 16:55:48 +03:00
Arttu Ylä-Outinen
fab07d80da
Rename macro BIT_DEPTH to KVZ_BIT_DEPTH.
2015-07-02 16:55:47 +03:00
Arttu Ylä-Outinen
7b6178f6e0
Rename macro MAX_GOP to KVZ_MAX_GOP_LENGTH.
2015-07-02 16:55:47 +03:00
Arttu Ylä-Outinen
3f32d500e2
Move config_t structure to kvazaar.h.
2015-07-02 16:55:46 +03:00
Arttu Ylä-Outinen
cecdf4f34e
Move config validation to encoder_control_init.
...
Ensures that config is valid even when not initialized by config_read.
2015-07-02 16:47:28 +03:00
Arttu Ylä-Outinen
04a1fc07cf
Move all config validation to config_validate.
2015-07-02 16:43:19 +03:00
Arttu Ylä-Outinen
25706af770
Add a function for moving bitstream data.
...
Replaces calls to bitstream_append with bitstream_move where possible.
2015-07-02 16:35:47 +03:00
Arttu Ylä-Outinen
398f0c823b
Replace memory bitstreams with linked lists.
...
- Removes all bitstream types.
- Changes encoder_encode to return the encoded data as list of chunks.
- Moves writing of the encoded data to the main function.
2015-07-02 16:35:46 +03:00
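A rough sketch of the data structure this entry describes: encoded bytes held in a linked list of fixed-size chunks, plus the constant-time move that the following bitstream_move entry relies on. All names and the chunk size are assumptions made for illustration; they are not the definitions from kvazaar.h.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_CHUNK_SIZE 4096  // assumed value, for illustration only

typedef struct chunk_sketch {
  struct chunk_sketch *next;
  size_t len;                        // bytes used in data
  uint8_t data[SKETCH_CHUNK_SIZE];
} chunk_sketch_t;

typedef struct {
  chunk_sketch_t *first;
  chunk_sketch_t *last;
} chunk_list_t;

// Append n bytes, allocating new chunks as the last one fills up.
static int chunk_list_write(chunk_list_t *list, const uint8_t *src, size_t n)
{
  while (n > 0) {
    if (list->last == NULL || list->last->len == SKETCH_CHUNK_SIZE) {
      chunk_sketch_t *c = calloc(1, sizeof(*c));
      if (c == NULL) return 0;
      if (list->last) list->last->next = c; else list->first = c;
      list->last = c;
    }
    size_t room = SKETCH_CHUNK_SIZE - list->last->len;
    size_t take = n < room ? n : room;
    memcpy(list->last->data + list->last->len, src, take);
    list->last->len += take;
    src += take;
    n -= take;
  }
  return 1;
}

// Move all chunks of src to the end of dst by relinking, without copying.
static void chunk_list_move(chunk_list_t *dst, chunk_list_t *src)
{
  if (src->first == NULL) return;
  if (dst->last) dst->last->next = src->first; else dst->first = src->first;
  dst->last = src->last;
  src->first = src->last = NULL;
}

// Sum of the used bytes over all chunks.
static size_t chunk_list_total(const chunk_list_t *list)
{
  size_t total = 0;
  for (const chunk_sketch_t *c = list->first; c != NULL; c = c->next) total += c->len;
  return total;
}

// Free every chunk and leave the list empty.
static void chunk_list_free(chunk_list_t *list)
{
  chunk_sketch_t *c = list->first;
  while (c != NULL) {
    chunk_sketch_t *next = c->next;
    free(c);
    c = next;
  }
  list->first = list->last = NULL;
}
```

A caller that receives such a list can walk it and write each chunk's `len` bytes to the output, which matches the entry's note that writing moved to the main function.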
Arttu Ylä-Outinen
7e20e62cc7
Make kvazaar_encode consume one frame on each call.
...
- Replaces read_one_frame by encoder_feed_frame.
- Adds field "prepared" to encoderstate_t to indicate that
encoder_next_frame has been called.
- Input frames are read in the main function and passed to
encoder_encode.
2015-07-02 16:28:40 +03:00
Arttu Ylä-Outinen
012c0580df
Move writing reconstructed image to yuv_io module.
...
Adds function yuv_io_write.
2015-07-02 16:28:39 +03:00
Arttu Ylä-Outinen
7bd23f5dbb
Rename yuv_input module to yuv_io.
2015-07-02 16:28:39 +03:00
Arttu Ylä-Outinen
1f41717351
Rename stats_done to frame_done in encoderstate.
...
The new field frame_done is set to zero when starting to encode a new
frame and reset to one when the encoded data has been written.
2015-07-02 16:24:26 +03:00
Arttu Ylä-Outinen
50a5d5faa5
Let subimages have multiple references.
...
Adds function image_copy_ref to image module for getting a new reference
to an image. It can be used instead of image_make_subimage when the
sizes of the original and the subimage are same.
2015-07-02 16:24:26 +03:00
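The idea of handing out an extra reference instead of building a same-sized subimage can be sketched with a plain reference count. The struct and function names below are hypothetical, and the counter is not atomic, so a multithreaded encoder would need atomic updates or a lock around it.

```c
#include <stdlib.h>

// Hypothetical, simplified image with a reference count.
typedef struct {
  int refcount;
  int width;
  int height;
  unsigned char *pixels;
} image_sketch_t;

// Allocate an image owned by exactly one reference.
static image_sketch_t *image_sketch_alloc(int width, int height)
{
  image_sketch_t *im = malloc(sizeof(*im));
  if (im == NULL) return NULL;
  im->pixels = calloc((size_t)width * (size_t)height, 1);
  if (im->pixels == NULL) {
    free(im);
    return NULL;
  }
  im->refcount = 1;
  im->width = width;
  im->height = height;
  return im;
}

// Take another reference to the same image; no pixel data is copied.
static image_sketch_t *image_sketch_copy_ref(image_sketch_t *im)
{
  im->refcount++;
  return im;
}

// Drop one reference. Returns 1 if the image was actually freed.
static int image_sketch_free(image_sketch_t *im)
{
  if (--im->refcount > 0) return 0;
  free(im->pixels);
  free(im);
  return 1;
}
```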
Arttu Ylä-Outinen
cec9b937dc
Make image list resize use realloc.
...
Much simpler than allocating, copying and freeing the arrays manually.
2015-07-02 16:24:25 +03:00
Arttu Ylä-Outinen
fe3b629905
Move poc from image_t to image_list_t.
2015-07-02 16:24:25 +03:00
Arttu Ylä-Outinen
5d524c0290
Move seeking to yuv_input module.
2015-07-02 16:24:24 +03:00
Arttu Ylä-Outinen
f41ce04488
Refactor main function.
...
- Make sure that everything which is allocated gets deallocated.
- Move finalization of encoder states to kvazaar.c.
- Remove empty strategyselector_free function.
- Remove unused variable curpos.
- Fix includes.
2015-07-02 16:24:24 +03:00
Arttu Ylä-Outinen
9c20f96397
Move opening files in main to separate functions.
2015-07-02 16:24:24 +03:00
Arttu Ylä-Outinen
40b136cf48
Fix seeking when reading from stdin.
...
Seeking used read_one_frame to skip frames. Changed to simply use fread
instead.
2015-07-02 16:24:24 +03:00
Arttu Ylä-Outinen
970d0ec182
Move input reading functions to yuv_input module.
...
Adds function read_yuv_frame and moves functions fill_after_frame and
read_and_fill_frame_data from encoderstate to yuv_input.
2015-07-02 16:24:23 +03:00
Arttu Ylä-Outinen
4a7b86a43b
Make g_exp_table statically allocated.
...
Removes the need to free the table.
2015-07-02 16:14:52 +03:00
Arttu Ylä-Outinen
b130ecc9bb
Fix "reference not found" when GOP is enabled.
...
The encoder state must be cleared by calling encoder_next_frame before
calling read_one_frame.
2015-07-02 16:14:51 +03:00
Ari Koivula
7e98a483d7
Use the API for checking whether the encoding is finished.
2015-07-02 16:14:51 +03:00
Ari Koivula
fc58748ae8
Output bitstream through API.
...
- Use the existing bitstream_t type to give access to the bitstream.
We can extend it later to make it a linked list like I was planning
to do with the payload type.
- The main encoder now also stores the bitstream in memory.
2015-07-02 16:10:51 +03:00
Ari Koivula
df50a0dae6
Move config_parse into api.
2015-07-02 15:52:24 +03:00
Ari Koivula
4e5326d3d5
Move encoding to API.
...
- Api->encoder_encode can now be called repeatedly to start encoder
jobs and to retrieve the results.
Conflicts:
src/encmain.c
2015-07-02 15:52:23 +03:00
Ari Koivula
9a3edce3fc
Separate input and output from encoding.
...
- Move image_t and pixel_t to the kvazaar.h API.
- Try and arrange things such that image_t can be used as input and
output for encoding.
Conflicts:
src/encmain.c
2015-07-02 15:52:23 +03:00
Ari Koivula
f87cea78da
Wait for bitstream immediately after encoding the frame.
...
- This should reduce the encoding delay by one frame when encoding in
real time.
2015-07-02 15:52:23 +03:00
Ari Koivula
ad11d1bca5
Add kvazaar.h to hold high-level encoder API.
...
- Move encoder initialization from main to kvazaar.c.
- Have main use the API for initialization.
Conflicts:
src/encmain.c
2015-07-02 15:52:23 +03:00
Ari Koivula
0170e9280f
Move some initialization to encoder_control_init.
...
- Removed some members from encoder_control_t that weren't really used
very much anymore.
2015-07-02 15:45:35 +03:00
Ari Koivula
504f3d9c9b
Move some config initialization to config_read.
2015-07-02 15:45:34 +03:00
Ari Koivula
c99fe63860
Move seek functionality outside the main input loop.
2015-07-02 15:45:34 +03:00
Ari Koivula
4f4b62b13c
Fix owf.
2015-07-02 15:45:34 +03:00
Ari Koivula
5c28745457
Move OWF logic and CLI stuff out of encoder_compute_stats.
...
- CLI stuff is moved to either cli-module or to main function.
- OWF stuff is made more explicit by counting the frames instead of
communicating through encoder_state_t.stats_done.
2015-07-02 15:45:33 +03:00
Ari Koivula
ea50d03e52
Add cli module and move interface stuff to there.
2015-07-02 15:45:33 +03:00
Marko Viitanen
ff4fb64169
Fixed a precedence bug in bipred search
2015-06-12 09:49:56 +03:00
Marko Viitanen
44ba9d9f7c
Bump version number to 0.5.0
2015-06-11 10:33:27 +03:00
Ari Koivula
30f8640380
Disable trskip SAD calculation when trskip is not enabled.
2015-06-08 13:16:12 +03:00
Marko Viitanen
3253ba4812
Fixed non-deterministic behavior when using bipred and owf
2015-06-08 13:14:53 +03:00
Ari Koivula
cdb66baf16
Fix mutex being unlocked twice.
2015-06-01 16:28:50 +03:00
Ari Koivula
80ec1fda3a
Remove unnecessary dependency between I-frames.
...
- Inter OWF dependency was being added to non-IDR I-frames.
2015-06-01 16:26:30 +03:00
Arttu Ylä-Outinen
984e7cb4e0
Fix setting QP when rate control is disabled.
...
When rate control is disabled, QP and lambda are now selected like they
were before rate control was implemented.
2015-06-01 13:57:11 +03:00
Arttu Ylä-Outinen
b0435d37a9
Update rate control parameters.
...
On each frame, adjust the parameters alpha and beta in the equation
lambda = alpha * pow(R, beta)
2015-05-29 11:50:08 +03:00
Arttu Ylä-Outinen
b24d92bd6e
Move initialization of constants to encoder.c.
...
Some constants used in rate control are now initialized only once instead
of being computed on every frame. Adds pixels_per_pic, target_avg_bppic,
target_avg_bpp and gop_layer_weights to encoder_control_t.
2015-05-29 11:45:36 +03:00
Arttu Ylä-Outinen
b54d5aa91f
Select GOP picture weights according to bitrate.
...
Pictures in same layer have equal weights. At low bitrates, the difference
between low and high layers is greater than at high bitrates.
2015-05-29 11:43:42 +03:00
Arttu Ylä-Outinen
93d2a95ddc
Implement rate control in lambda domain.
...
- Rate control adjusts the lambda value.
- QP is selected according to lambda.
- Bits are allocated for GOPs and individual pictures.
2015-05-19 11:40:51 +03:00
Arttu Ylä-Outinen
664de9ade0
Keep track of bits written in current gop.
...
Adds cur_gop_bits_coded into encoder_state_config_global_t. The count is
updated whenever a frame is written.
2015-05-19 10:42:23 +03:00
Arttu Ylä-Outinen
4a5698a6ba
Implement basic rate control.
2015-05-19 10:42:17 +03:00
Arttu Ylä-Outinen
d27cde55a4
Add --input-fps and --bitrate parameters.
2015-05-15 13:57:51 +03:00
Arttu Ylä-Outinen
5b8cd76f01
Keep track of total number of bits coded.
...
Adds total_bits_coded into encoder_state_config_global_t. The count is
updated whenever a frame is written.
2015-05-15 13:57:50 +03:00
Arttu Ylä-Outinen
815a2bea55
Use bitstream_tell to get stream position.
2015-05-15 13:57:50 +03:00
Ari Koivula
56bb8e75ba
Fix non-deterministic behavior with tiles.
...
- Depend on the whole previous frame.
- We should really go through all these FIXME's sometime.
2015-05-12 12:00:32 +03:00
Ari Koivula
a48d91dacd
Fix WPP not working when SAO is off and OWF is on.
...
- Every wavefront row was being set to done when the first wavefront
row got done.
- Looks like I didn't understand how the data structure worked when I
"cleaned this up", and it didn't get caught in tests because it
needs OWF to be on to affect anything.
2015-05-11 12:01:17 +03:00
Ari Koivula
87936eb99f
Revert "Fix keeping of reference frames over IDR boundary."
...
This reverts commit b43f1cb9eb.
- This change resulted in use of uninitialized memory with owf != 0.
Conflicts:
src/encoderstate.c
2015-05-05 17:07:49 +03:00
Ari Koivula
c3b42291e1
Merge branches 'coverity-fix-5', 'coverity-fix-6', 'coverity-fix-7', 'coverity-fix-8', 'coverity-fix-9', 'coverity-fix-10', 'coverity-fix-11', 'coverity-fix-12', 'coverity-fix-13' and 'coverity-fix-14' into coverity-fixes2
2015-05-05 12:27:02 +03:00
Ari Koivula
62285c405c
Fix coverity warning.
...
- False positive about a shift with -1 when code_num overflows.
2015-05-05 12:20:07 +03:00
Ari Koivula
ed670f4185
Fix coverity warning.
...
- False positive about buffer overrun due to thinking work_tree_copy_up
could be called with depth == 4.
2015-05-05 11:54:09 +03:00
Ari Koivula
2276e0028f
Fix coverity warning.
...
- False positive about use of an uninitialized value. Actually just
copying uninitialized data from one struct to another.
2015-05-05 10:39:29 +03:00
Ari Koivula
41d9889e28
Fix coverity warning.
...
- False positive about coeff_y being uninitialized when width == 0.
2015-05-05 10:23:52 +03:00
Ari Koivula
80cbda364b
Fix coverity warning.
...
- Variable guards dead code. Although, maybe it will complain about the
dead code now instead.
2015-05-05 10:17:06 +03:00
Ari Koivula
08d079773f
Fix coverity warning.
...
- Dead code.
2015-05-05 10:12:01 +03:00
Ari Koivula
e225c5b302
Fix coverity warning.
...
- Dead code due to current value of MRG_MAX_NUM_CANDS. Not sure if this
fix will work but I think it looks better.
2015-05-05 10:06:18 +03:00
Ari Koivula
17bdc82b5e
Fix coverity warning.
...
- Dereferencing a pointer from realloc before checking if it's null.
2015-05-05 09:40:24 +03:00
Ari Koivula
cc980fb815
Fix coverity warning.
...
- False positive for use of uninitialized variable.
2015-05-05 09:37:34 +03:00
Ari Koivula
7a551bece5
Fix coverity warning.
...
- Remove dead code.
2015-05-05 09:29:40 +03:00
Ari Koivula
cf2a406aba
Fix coverity warning.
...
- False positive for overflow. Fixed the parameter declaration.
2015-05-04 17:38:16 +03:00
Ari Koivula
1c6c4963e7
Fix coverity warning.
...
- Mutex was left locked when malloc failed. Fixed.
2015-05-04 17:38:16 +03:00
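The class of bug named in the entry above, an early return that skips the unlock, can be illustrated as follows. This is a generic pthread sketch with hypothetical names, not the project's thread queue code.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

// Hypothetical buffer shared between threads and guarded by a mutex.
typedef struct {
  pthread_mutex_t lock;
  unsigned char *data;
  size_t size;
} shared_buf_t;

// Replace the buffer with a new one of the given size.
// Returns 1 on success, 0 if the allocation failed.
static int shared_buf_alloc(shared_buf_t *buf, size_t size)
{
  pthread_mutex_lock(&buf->lock);

  unsigned char *data = malloc(size);
  if (data == NULL) {
    // Every exit path has to release the lock, including this error path.
    pthread_mutex_unlock(&buf->lock);
    return 0;
  }

  free(buf->data);
  buf->data = data;
  buf->size = size;

  pthread_mutex_unlock(&buf->lock);
  return 1;
}
```

If the error path returned without unlocking, the next call on the same buffer would block forever on `pthread_mutex_lock`.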
Ari Koivula
1c3873f5b2
Fix coverity warning.
...
- Overflow from buggy implementation of modulo behavior for
pattern_type. As there is no need for such behavior I removed it.
2015-05-04 17:38:16 +03:00
Ari Koivula
6234b09461
Fix coverity warning.
...
- A false alarm about buffer overflow. No new modes are added if all modes
are already in the list.
- Skip checking predicted modes if all modes are in the list.
2015-05-04 17:38:16 +03:00
Ari Koivula
9015aab996
Clean up IDR handling code.
...
- IDR was called RADL, probably because the NAL type is IDR_W_RADL.
- Move things around to make it clearer what is happening.
2015-04-30 20:46:07 +03:00
Ari Koivula
b43f1cb9eb
Fix keeping of reference frames over IDR boundary.
...
-
2015-04-30 15:42:16 +03:00
Ari Koivula
c0c9bc619a
Fix valgrind warning.
...
- Attribute state->global->slicetype was used before being initialized.
- The reference frame lists should be updated based on current frame,
not on previous frame (or uninitialized data).
2015-04-30 13:18:28 +03:00
Ari Lemmetti
afcccb5c81
Merge branch 'memory_leak_test'
2015-04-29 16:26:59 +03:00
Ari Lemmetti
0081384727
Clean Makefile a bit. Add debug build option.
2015-04-24 20:45:19 +03:00
Marko Viitanen
8ed5d06ebe
Fixed compiler warnings caused by the bipred branch merge
2015-04-23 15:12:48 +03:00
Marko Viitanen
fd060cf2c6
Merge branch 'bipred'
...
Conflicts:
README.md
src/config.c
src/config.h
src/encmain.c
2015-04-23 14:45:44 +03:00
Marko Viitanen
79dc7e7270
Bi-pred search cleanup
2015-04-23 14:39:41 +03:00
Marko Viitanen
0e958ebe84
Fixed merge candidate selection
2015-04-23 12:18:33 +03:00
Marko Viitanen
3c694a8f6e
Fixed bipred mv candidate selection
2015-04-23 12:18:05 +03:00
Marko Viitanen
9951810910
Fixed deblocking with bi-dir blocks
2015-04-23 09:43:39 +03:00
Marko Viitanen
7f504b7808
Added a commandline parameter --bipred to enable bi-pred search
2015-04-21 14:35:16 +03:00
Marko Viitanen
fb74f86a5b
Bi-pred search now actually does cost calculations
2015-04-21 14:16:06 +03:00
Marko Viitanen
e12ba7c80f
Created function inter_recon_lcu_bipred() and moved bipred recon there
2015-04-21 12:05:21 +03:00
Marko Viitanen
50fce975d9
Clamp bi-pred motion vectors because ipol filtering requires modifications
2015-04-21 11:24:07 +03:00
Ari Koivula
13924a2057
Add --no-info parameter.
...
- Stops encoder information from being added to bitstream.
- The version information overhead is too big when doing comparisons with
very short sequences.
2015-04-16 17:30:36 +03:00
Ari Koivula
7028846423
Fix bug in intra mode search.
...
- The cost of the first mode in the mode list was returned instead of cost of
the selected mode, as this used to be the best mode when the list was
sorted. Should only matter when doing inter coding.
- This pretty much affects only --rd=1 in inter frames.
2015-04-09 16:05:53 +03:00
Marko Viitanen
da3fe9f199
Fixed rounding in bi-pred reconstruction
2015-04-02 15:55:13 +03:00
Marko Viitanen
c7a17cf1c4
Changed motion vector candidate derivation to work with bi-pred case
2015-04-02 14:05:24 +03:00
Marko Viitanen
73db9fec83
Fixed asserts for intra PU-depth configurations
2015-04-02 10:31:56 +03:00
Marko Viitanen
5d71fb3136
Fixed leaf aligning
2015-04-01 08:49:22 +03:00
Marko Viitanen
d26b89174b
Fixed intra chroma search ref_v pointer
2015-03-31 15:43:22 +03:00
Marko Viitanen
4b7db2e014
Added a dummy bi-pred search, always selects bi-pred block when possible
2015-03-31 15:02:43 +03:00
Marko Viitanen
2c676927f0
Fixed a bug in bipred reconstruction causing an overflow
2015-03-31 15:02:10 +03:00
Marko Viitanen
06bc4f3d5e
Fixed duplicate checking for merge cand and some cleanup
2015-03-31 12:23:46 +03:00
Marko Viitanen
004e8082ab
Fixed deblocking after L0/L1 mv changes
2015-03-31 12:22:48 +03:00
Marko Viitanen
c02e3b8e26
Fixed inter_get_mv_cand() reference picture checking
2015-03-30 15:22:56 +03:00
Marko Viitanen
f881d6bf8a
Modified structures and mv handling to use L0/L1 vectors
2015-03-30 14:40:29 +03:00
Marko Viitanen
d6f68d0950
Force clearing of references when GOP not used and I-slice
2015-03-30 10:21:41 +03:00
Marko Viitanen
f28ebbcd41
Moved GOP defining to config.c and added parameter --gop
...
* Checking that intra period and gop_len match
2015-03-30 10:09:54 +03:00
Marko Viitanen
c82915761f
Enabled insertion of I-slices when GOP is used
2015-03-30 10:09:49 +03:00
Marko Viitanen
815e0b8897
Moved reference list printing to encoder_compute_stats()
2015-03-30 10:09:32 +03:00
Marko Viitanen
2243d139bf
Fixed GOP reference usage when using owf
2015-03-26 14:11:13 +02:00
Marko Viitanen
1dc53be8fc
Fixed leaf aligning
2015-03-26 13:54:17 +02:00
Marko Viitanen
bbeb85f9ee
Fixed case when cfg->frames is zero
2015-03-26 11:24:41 +02:00
Marko Viitanen
5c04603421
Remove unused ref frames on GOP case even when number of ref frames is within limits
2015-03-26 11:14:13 +02:00
Marko Viitanen
5071b5c990
Moved reference list sorting and parsing to encoder_state_new_frame()
...
* fixed a bug in reference verification and added an error state
2015-03-26 10:58:56 +02:00
Marko Viitanen
c40ca49b6c
When GOP is used, verify the references are available
2015-03-26 10:38:21 +02:00
Marko Viitanen
fe581b881e
Changed GOP structure to enable coding sequences not divisible by gop_len
2015-03-25 16:00:20 +02:00
Marko Viitanen
42e02dbfd9
Fixed tr-skip cost calculation
2015-03-24 13:35:28 +02:00
Marko Viitanen
a7328ab008
Fixed tr-skip cost calculation
2015-03-24 12:40:01 +02:00
Marko Viitanen
c649c90f3a
Changes to enable adaptation to any GOP len
2015-03-24 12:01:57 +02:00
Marko Viitanen
9a828ae5da
Fixed merge candidate scaling in hexbs and excluded weighted pred candidates in cost calc
2015-03-24 09:38:24 +02:00
Marko Viitanen
2d8552d0d6
Fixed merge candidate usage by skipping weighted prediction candidate
2015-03-23 15:17:41 +02:00
Marko Viitanen
7952f892fc
Fixed GOP reference usage
2015-03-23 14:17:44 +02:00
Marko Viitanen
34e8f70c8c
Fixed temporal MV predictor offset
2015-03-23 09:22:47 +02:00
Marko Viitanen
eccf1c1a16
Fixed temporal MV predictor offset
2015-03-23 09:21:52 +02:00
Marko Viitanen
164b7a7743
Merge remote-tracking branch 'remotes/origin/master' into GOP
2015-03-20 11:40:15 +02:00
Marko Viitanen
5ae9a70e38
Disable usage of P-slices when GOP
2015-03-20 10:43:59 +02:00
Marko Viitanen
26082d5328
Zero merge candidate fix for B-frames
2015-03-20 10:33:05 +02:00
Marko Viitanen
0c1aa6f73c
Better reference picture removal function encoder_state_remove_refs()
2015-03-20 10:28:17 +02:00
Marko Viitanen
7dab3ea0f6
Replaced temporary reference lists with the ones in gop configurations
2015-03-20 10:25:40 +02:00
Marko Viitanen
f166d25dd0
Added positive and negative reference frames to the gop config
2015-03-20 10:22:53 +02:00
Arttu Ylä-Outinen
3f31e7bf47
Merge branch 'tz-search'
2015-03-19 19:04:44 +02:00
Arttu Ylä-Outinen
176dbb6a5b
Add --me parameter.
...
Selects the integer motion estimation algorithm (hexbs or tz).
2015-03-19 18:48:10 +02:00
Marko Viitanen
d72c560880
Generate sorted reference list for L0 and L1
2015-03-19 12:26:59 +02:00
Marko Viitanen
c761a6beb3
Added encoder state as an input parameter to inter_get_merge_cand()
2015-03-18 12:35:47 +02:00
Marko Viitanen
c56b4d5747
Added combined merge candidates on B-slices and struct inter_merge_cand_t
2015-03-18 10:03:06 +02:00
Ari Koivula
6ec177f75c
Improve handling of input vector to inter search.
2015-03-17 17:16:15 +02:00
Ari Koivula
55ae02f367
Copy cu_info from tiles to main state.
...
- The main state's cu_array can be accessed through state->global->ref, which
allows the use of cu_info data from reference frames.
- This was already used by giving the previous frame's motion vector to the next
frame as a starting point candidate, but that functionality was broken at
some point because the data wasn't being moved from the child tiles' cu_array
to the main cu_array.
- An alternative would be to access the child tiles' arrays directly, but
currently there isn't a mechanism to preserve those arrays for reference
frames.
2015-03-17 13:24:20 +02:00
Ari Koivula
4bec6cec93
Simplify wavefront handling.
...
- Move the reconstruction status assignment out of the main for job loop.
2015-03-17 13:23:27 +02:00
Ari Koivula
4a27f79f20
Update comments.
2015-03-17 13:23:16 +02:00
Marko Viitanen
1da1dc9578
Clean up reference index and mvd writing
2015-03-16 09:41:02 +02:00
Ari Koivula
ca09e8bfe3
Fix WPP not working with threads=0.
...
- Apparently threadqueue_submit runs the job if there are no threads.
2015-03-13 17:15:05 +02:00
SanteriS
913ade461b
tz_search step 1, first if: && -> ||
2015-03-12 17:57:17 +02:00
SanteriS
949ec57849
Merge branch 'master' of https://github.com/ultravideo/kvazaar
2015-03-12 17:55:03 +02:00
SanteriS
bdb0639ac9
fixed function interfaces for tz_search and its subfunctions.
2015-03-12 17:54:21 +02:00
Ari Koivula
d2bb71739f
Clean up and comment WPP threading code.
...
- Remove WPP row reconstruction dependency to the row above current one in
the previous frame. It's obviously unnecessary.
- Remove WPP row reconstruction dependency to the current row in the
previous frame, unless the current row is the last row.
2015-03-11 18:30:37 +02:00
Ari Lemmetti
b9ec4b0a54
AVX2 acceleration for new luma filtering.
2015-03-11 15:33:38 +02:00
Marko Viitanen
bc8ea9547e
Use P-frames when last GOP picture
2015-03-11 15:23:16 +02:00
Marko Viitanen
a4b5f46b46
Fixed reference list delta and num_ref_idx_lX_active values
2015-03-11 15:19:32 +02:00
Marko Viitanen
ac4973c544
Fixed deblocking strength in this configuration when B-slice
2015-03-10 15:20:02 +02:00
Marko Viitanen
1527822753
Fixed GOP POC order when not using threads
2015-03-10 14:12:51 +02:00
Marko Viitanen
866c3bfdf1
Setting gop_len to 0 now works
2015-03-10 12:16:57 +02:00
Marko Viitanen
1c38fbbd3b
Fixed GOP when no threads are used
2015-03-10 10:45:05 +02:00
Marko Viitanen
66660516b7
Merge remote-tracking branch 'remotes/github/master' into GOP
...
Conflicts:
src/cabac.h
src/config.h
src/cu.h
src/encoder_state-bitstream.c
src/encoderstate.c
2015-03-10 10:32:00 +02:00
Marko Viitanen
ff41ef557d
Fixed reference usage of top GOP layer pictures
2015-03-10 09:18:19 +02:00
Marko Viitanen
eba298e635
Added cu->inter.mv_ref_coded variable
2015-03-10 09:17:25 +02:00
Marko Viitanen
ec02642cc8
Added more bits to POC counter and fixed num_reorder_pic and max_dec_pic_buffering values
2015-03-10 09:06:32 +02:00
SanteriS
9e9f5e3150
Merge branch 'master' of https://github.com/ultravideo/kvazaar
2015-03-08 19:20:08 +02:00
SanteriS
e2f9fe130a
changed step 1 for tz_search
2015-03-08 19:19:23 +02:00
Ari Lemmetti
39eceec38d
Rewrite of luma fractional pixel filtering. Utilizes intermediate values instead of calculating everything again.
2015-03-06 17:58:22 +02:00
Marko Viitanen
42d3f2a8b0
Added B-frame encoding and reference list exceptions for top-layer GOP pictures
2015-03-06 16:32:50 +02:00
Marko Viitanen
1afba671e2
Added missing cabac bits to mv coding
2015-03-06 16:31:27 +02:00
Marko Viitanen
6095503918
Modified search to use correct reference id and mv directions
2015-03-06 16:29:24 +02:00
Marko Viitanen
13c925b701
Testset of data for reference picture lists
2015-03-06 16:28:23 +02:00
Marko Viitanen
43b086caed
Added missing slice header flag "mvd_l1_zero_flag"
2015-03-06 16:27:42 +02:00
Marko Viitanen
18d9789fab
Cabac context array for inter direction
2015-03-06 16:26:16 +02:00
Ari Koivula
2f79bfebf7
Rename parameter encoder_state to state in all functions.
...
- It's so widely used that there isn't really a need to emphasize that
it's the encoder's state. Also, it isn't really the encoder's state,
but the encoding job's state.
2015-03-04 17:31:07 +02:00
Ari Koivula
14fe1b6648
Rename enum color_index to color_t.
2015-03-04 16:37:35 +02:00
Ari Koivula
ded6fd9ee8
Renamed typedef pixel to pixel_t.
2015-03-04 16:35:53 +02:00
Ari Koivula
1f42adb1ea
Renamed typedef coefficient to coeff_t.
2015-03-04 16:33:47 +02:00
Ari Koivula
fedd05465d
Rename struct sao_info to sao_info_t.
2015-03-04 16:32:38 +02:00
Ari Koivula
3d135324da
Rename struct threadqueue_queue to threadqueue_queue_t.
2015-03-04 16:30:20 +02:00
Ari Koivula
b7fcb800b2
Rename struct threadqueue_job to threadqueue_job_t.
2015-03-04 16:28:56 +02:00
Ari Koivula
cf5f240604
Rename struct hardware_flags to hardware_flags_t.
2015-03-04 16:24:59 +02:00
Ari Koivula
e7754bb518
Rename struct strategy_to_select to strategy_to_select_t.
2015-03-04 16:24:06 +02:00
Ari Koivula
e95b138e62
Rename struct strategy_list to strategy_list_t.
2015-03-04 16:23:04 +02:00
Ari Koivula
95afc5af51
Rename struct strategy to strategy_t.
2015-03-04 16:17:45 +02:00
Ari Koivula
db42176a64
Rename struct image_list to image_list_t.
2015-03-04 16:13:57 +02:00
Ari Koivula
7bafd34cfa
Remove struct rd_stats.
2015-03-04 14:01:17 +02:00
Ari Koivula
fe55961f84
Rename struct image to image_t.
2015-03-04 14:01:17 +02:00
Ari Koivula
5431d0ce19
Rename struct lcu_order_element to lcu_order_element_t.
2015-03-04 14:01:17 +02:00
Ari Koivula
9e64ee3cee
Suffix encoder_state_config structs with _t.
2015-03-04 14:01:17 +02:00
Ari Koivula
cdb1a25f05
Inline struct me into encoder_control_t.
2015-03-04 14:01:16 +02:00
Ari Koivula
e5b18cd536
Inline cu_info_intra and cu_info_inter into cu_info_t.
2015-03-04 14:01:16 +02:00
Ari Koivula
a0767a76d2
Rename struct vector2d to vector2d_t.
2015-03-04 14:01:16 +02:00
Ari Koivula
5b12830756
Rename struct config to config_t.
2015-03-04 14:01:16 +02:00
Ari Koivula
1a62fee300
Rename struct cabac_data to cabac_data_t.
2015-03-04 14:01:16 +02:00
Ari Koivula
727fefacc4
Rename struct cabac_ctx to cabac_ctx_t.
2015-03-04 14:01:16 +02:00
Ari Koivula
4bc0308b7e
Rename struct bitstream_file to bitstream_file_t.
2015-03-04 14:01:15 +02:00
Ari Koivula
d6ec6a618d
Rename struct bitstream_mem to bitstream_mem_t.
2015-03-04 14:01:15 +02:00
Ari Koivula
106c9128ad
Rename struct bitstream_base to bitstream_base_t.
2015-03-04 14:01:15 +02:00
Ari Koivula
5d8498dc88
Rename struct bit_table to bit_table_t.
2015-03-04 14:01:15 +02:00
Ari Koivula
8cd8240f7a
Rename struct bitstream to bitstream_t.
2015-03-04 14:01:15 +02:00
Ari Koivula
7ca688b376
Rename struct videoframe to videoframe_t.
2015-03-04 14:01:15 +02:00
Ari Koivula
63e224574e
Rename struct cu_info to cu_info_t.
2015-03-04 14:01:15 +02:00
Ari Koivula
f3fab62d33
Rename struct cu_array to cu_array_t.
2015-03-04 14:01:15 +02:00
Ari Koivula
78f0c3a83b
Rename struct scaling_list to scaling_list_t.
2015-03-04 14:01:14 +02:00
Ari Koivula
f6147b410a
Rename struct encoder_control to encoder_control_t.
...
Conflicts:
src/encoder_state-geometry.h
src/encoderstate.h
2015-03-04 14:01:14 +02:00
Ari Koivula
b14f89c88f
Rename struct encoder_state to encoder_state_t.
2015-03-04 14:00:46 +02:00
Marko Viitanen
890b4c1e20
Modified image handling and QP calculations to support GOP
2015-03-03 12:22:50 +02:00
Marko Viitanen
c3d9e0b707
Added testset of data for GOP
2015-03-03 12:22:09 +02:00
Marko Viitanen
34b231378b
Modified config and encoder_state structs for GOP
2015-03-03 12:21:45 +02:00
SanteriS
b55bfe1729
Merge branch 'master' of https://github.com/ultravideo/kvazaar
2015-02-25 18:15:35 +02:00
SanteriS
bef7cae4f8
Merge branch 'master' of https://github.com/ultravideo/kvazaar
2015-02-25 15:29:11 +02:00
SanteriS
f478732b4c
tz search bugfix
2015-02-25 15:28:45 +02:00
Ari Koivula
d7383ccb25
Change license to LGPL.
...
- Everyone who has contributed code to the project has been asked to license
their contributions under LGPL and they have agreed.
- COPYING file changed to say LGPLv2.1 instead of GPLv2.
- GPL changed to LGPL in the header of every single file that has a header,
and a header added to the few that were missing one.
- Also.. Happy new year!
2015-02-25 15:19:05 +02:00
Ari Koivula
3e58e03b56
Select motion compensation search starting point from among merge candidates.
...
- Greatly reduces bdrate for most sequences.
2015-02-25 12:58:15 +02:00
SanteriS
2f68cf3847
(TZ search) Fixed missing check for owf mode. Added 6 point hexagon search pattern.
2015-02-23 16:59:48 +02:00
Ari Koivula
9865e73b90
Remove NetBSD getopt dependency to unistd.h.
...
- Remove the $NetBSD header as it wouldn't get updated and is wrong.
2015-02-19 16:26:14 +02:00
Ari Koivula
dd54b5ae10
Replace GNU getopt with NetBSD getopt.
...
- This doesn't compile, but I'm including it to have a version history for
changes required to make it work.
- We need this to have a getopt implementation on Windows.
- It's necessary to change the implementation to switch from GPL to LGPL.
2015-02-19 16:26:14 +02:00
Ari Koivula
c979db7e95
Avoid sorting intra modes unnecessarily.
2015-02-19 16:25:45 +02:00
Ari Koivula
1c2129fdcb
Improve sort_modes.
...
- When encoding with fast enough settings this function can use up to 5%
of the cpu time, so I tried to optimize it a little bit.
2015-02-19 16:25:38 +02:00
Ari Koivula
5fa6438b25
Clean up calls to memset.
...
- Replaces all calls to memset with new FILL and FILL_ARRAY macros. The use
of memset was inconsistent and we never use it for anything complicated.
2015-02-19 16:25:28 +02:00
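The macro names FILL and FILL_ARRAY come from the entry above; the definitions in this sketch are an assumption about how such wrappers around memset could look, not a copy of the project's header. The helper type and function are hypothetical.

```c
#include <string.h>

// Fill one whole object (scalar or struct) with a byte value.
#define FILL(var, val) memset(&(var), (val), sizeof(var))

// Fill the first `count` elements of an array or pointed-to buffer.
#define FILL_ARRAY(arr, val, count) memset((arr), (val), (count) * sizeof(*(arr)))

// Hypothetical type used only to demonstrate the macros.
typedef struct {
  int x;
  int y;
} point_sketch_t;

// Reset a point to all-zero bytes through a pointer.
static void reset_point(point_sketch_t *p)
{
  FILL(*p, 0);
}
```

Taking the size from the argument itself removes the most common memset mistake, which is passing a size that does not match the object being cleared.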
Arttu Ylä-Outinen
b6776a8cee
Add --vps-period parameter.
2015-02-18 13:55:27 +02:00
SanteriS
1a4d30d15a
fixed step 1 of TZ algorithm
2015-02-11 18:51:21 +02:00
SanteriS
ce4c251cd1
Merge branch 'master' of https://github.com/ultravideo/kvazaar
2015-02-09 17:29:49 +02:00
Ari Lemmetti
8aea1a0fa9
Updated version string. Fixed dct strategy registration error message.
2015-02-05 14:07:26 +02:00
Ari Lemmetti
7846cf3093
Merge branch 'faster_interpolation'
2015-02-05 13:29:43 +02:00
Ari Lemmetti
7430622038
Copy ipol-generic strategy as a base for avx2 strategy
2015-02-05 13:28:07 +02:00
Ari Lemmetti
8495870df8
Using BIT_DEPTH macro because it is constant
2015-02-05 13:19:54 +02:00
Ari Lemmetti
c82adae0c4
Use four tap functions in octpel chroma interpolation
2015-02-04 18:23:57 +02:00
Ari Lemmetti
2f11caeb73
Added generic four tap functions. Use them in halfpel chroma interpolation.
2015-02-04 17:50:12 +02:00
Ari Lemmetti
ff456c120a
Enabled link time optimizations. Disabled default rules.
2015-02-04 15:19:47 +02:00
SanteriS
50dd59eb21
Added different search patterns for TZ search.
2015-02-02 19:14:45 +02:00
Ari Lemmetti
041d970ece
Apply fast clipping also to chroma filtering.
2015-01-29 16:19:04 +02:00
Ari Koivula
ff721bab81
Fix possible non-determinism with owf.
...
- Triggers when owf is on, sao is off and deblocking is on.
2015-01-26 16:02:31 +02:00
Ari Koivula
f01cbbb5ca
Add --no-signhide parameter.
2015-01-24 21:29:37 +02:00
Ari Koivula
5f24c6b73d
Make normal dequant use runtime sign-hiding configuration.
2015-01-24 21:29:25 +02:00
Ari Koivula
1ccb3bd324
Move sign hiding stuff in rdoq to its own function.
...
- There is some stuff from sign hiding left intermingled with rdoq code,
but I don't want to change the code too much before testing that I didn't
break anything.
2015-01-24 21:27:20 +02:00
Ari Koivula
804a3b648b
Clean up quantization sign hiding.
...
- To allow for later configuration at runtime.
2015-01-23 16:03:59 +02:00
Ari Koivula
c940ccb549
Fix gcc error.
...
encmain.c:433:13: error: format ‘%llu’ expects argument of type ‘long long unsigned int’, but argument 4 has type ‘uint64_t’
2015-01-23 15:50:14 +02:00
Ari Koivula
5d16fa6c4f
Add VPS every intra frame.
...
- Just rdo=0 for now. Later this can be extended to be configured separately.
2015-01-22 13:13:23 +02:00
Ari Koivula
d685ee86d6
Record total bitstream length correctly when using stdout.
...
- If the output is not a file, we can't check the size of the file.
2015-01-22 12:29:06 +02:00
Ari Koivula
1b19afc706
Flush output buffer after every frame.
2015-01-22 12:29:06 +02:00
Ari Lemmetti
b4aab06073
Added new files in Makefile.
2015-01-21 18:38:09 +02:00
Ari Lemmetti
c21351cc12
Added fast clipping function for clamping values to bit depth.
2015-01-21 17:53:06 +02:00
SanteriS
4b3d77aaf2
Enable tz search.
2015-01-21 12:55:00 +02:00
Ari Koivula
f86def8ed8
Remove unused variables.
2015-01-20 17:50:19 +02:00
Ari Koivula
8ac66934c0
Clean up NAL header code.
...
- Use long start code for RADL NAL units if they are the first NAL in the
access unit.
- Ffmpeg mpegts was complaining about start codes not being present.
There wasn't anything wrong that I could find though, besides the
missing intra long start code.
2015-01-20 17:34:59 +02:00
Ari Koivula
81ad583e08
Use the same coeff cost calculation for all rd modes.
...
- It's not worth it to have these faster approximations for coefficient cost.
2015-01-20 17:34:59 +02:00
Ari Koivula
870171e6ad
Fix --rd=0 to actually work.
2015-01-20 17:34:59 +02:00
Ari Lemmetti
f037ed580c
Improved data layout
2015-01-15 16:31:18 +02:00
Ari Lemmetti
4382c2f088
Added missing -1 to PIXEL_MAX macro
2015-01-15 16:14:07 +02:00
Ari Lemmetti
465f718eeb
Move value clipping away from separate loop
2015-01-15 16:14:00 +02:00
Ari Lemmetti
9d12ce21d5
Cleaned luma interpolation, added functions for 8-tap filtering.
2015-01-15 16:13:12 +02:00
Ari Lemmetti
0e56d13b5d
Use smaller bit depth for fractional pixel interpolation
2015-01-15 15:00:09 +02:00
Ari Lemmetti
cc061b4c3d
Added ipol strategy for interpolation filters.
...
Added initial files for AVX2 and generic strategies.
2015-01-15 14:59:37 +02:00
Ari Lemmetti
73762062b6
Clarified comments a bit
2015-01-15 11:57:19 +02:00
Ari Koivula
ab3364afb4
Add skipping of intra search in inter frames for rd=0.
2015-01-15 11:54:35 +02:00
Ari Lemmetti
c9f310a6c2
Use pixel type instead of uint8_t
2015-01-15 11:47:00 +02:00
Ari Lemmetti
cad5f14372
Fixed compile errors (-Werror)
2015-01-14 18:27:35 +02:00
SanteriS
126569c737
Added first version of TZ search algorithm.
2015-01-14 14:54:09 +02:00
Ari Koivula
660547098a
Merge branch 'intra-fast-lcu'
2015-01-14 12:03:12 +02:00
Ari Koivula
01195aecbb
Move cu split model to a function.
2015-01-14 11:16:34 +02:00
Ari Koivula
8c89dcfc50
Move mode bit calculation to a function.
2015-01-14 10:44:52 +02:00
Daniel Eneyev
27d79ffae3
workaround for GET_TIME in Mac OS
2015-01-13 17:06:55 +03:00
Ari Koivula
fc79c2103e
Generalize the fast intra-mode tryout code to work for any depth.
2015-01-12 11:47:21 +02:00
Ari Koivula
f1364d297b
Fix bug resulting in incorrect bitstream.
...
- If 64x64 intra PUs were enabled and --rd was less than 2, no intra mode
search was performed for depth 0 resulting in incorrect bitstream.
2015-01-12 11:16:33 +02:00
Ari Koivula
bbae2e8a27
Update usage and readme.
2015-01-12 10:59:28 +02:00
Ari Koivula
f4bd322804
Add command line options for prediction unit depth.
2015-01-12 10:40:34 +02:00
Ari Koivula
edf2681ea4
Comment functions in search.c.
2015-01-07 14:56:14 +02:00
Ari Koivula
8c1e0b8a7f
Tweak owf=auto.
...
- Twice the required number is too little.
2014-12-10 11:23:51 +02:00
Ari Koivula
129c8e38e0
Set owf default to auto.
2014-12-09 19:00:11 +02:00
Ari Koivula
51b5692121
Rewrite owf=auto code to be more general.
...
- Change the definition to be a bit more general. The mapping from resolution
to owf frames stays mostly the same however, but should handle weird
resolutions better.
- Move everything to config module.
- Fix handling of tiles. It had a bug where owf for tiles was always
threads * 4/3 - 1. Works as intended now.
2014-12-09 19:00:11 +02:00
Ari Koivula
374012ab26
Merge branch 'intraskip'
2014-12-01 17:30:03 +02:00
Ari Lemmetti
24492adb02
Merge branch 'fme_merge'
2014-11-21 15:08:45 +02:00
Ari Koivula
21d221c075
Add fast 64x64 intra test.
...
- If intra search is not enabled for a depth, try the result from the
top left CU of the next depth. This seems to give most of the benefit
of at least 64x64 intra prediction units without costing very much
in performance.
2014-11-20 17:20:24 +02:00
Ari Lemmetti
4874f2662f
Added --subme commandline parameter for fractional pixel motion estimation: 1 == enable (default), 0 == disable.
2014-11-20 14:59:04 +02:00
Ari Lemmetti
d5d2e04995
Merge branch 'fme'
2014-11-19 16:40:22 +02:00
Ari Koivula
3ef88dfda5
Add --owf=auto option.
...
- The optimal value for Overlapping Wave Front (OWF) depends on a bunch of
variables. Attempt to set the optimal owf value, at least for all intra.
2014-11-18 02:19:40 +02:00
Ari Lemmetti
5a946f24ea
Fixed time output formatting.
2014-11-14 16:46:41 +02:00
Ari Lemmetti
56c537e145
Build fixes for MinGW.
...
threads.h: use windows.h headers for clock stuff on MinGW
strategyselector.c: assert with strlen for MinGW support
2014-11-14 16:46:41 +02:00
Daniel Eneyev
992a98c5c4
If output name is dash - write to stdout
2014-11-13 12:45:53 +03:00
Ari Lemmetti
c46b75a0ca
Fixed mingw build error. Modified function declaration in getopt.h.
...
A macro definition adds * in front of __argc and __argv, causing
build error with mingw. Renamed them to argc and argv to prevent this.
2014-10-31 17:40:18 +02:00
Ari Lemmetti
6a12bc406d
Load greatest submodule. Fixed loop that occurred during build process.
2014-10-30 15:17:50 +02:00
Ari Lemmetti
a64aae7c53
Makefile now compiles tests. Fixed test files. Removed unused stuff.
2014-10-29 15:32:47 +02:00
Ari Koivula
50643eeaf8
Merge pull request #88 from darealshinji/patch-2
...
version.h is no longer used
2014-10-27 20:23:15 +02:00
darealshinji
81ecef17d7
version.h is no longer used
2014-10-27 18:17:26 +01:00
darealshinji
e230fb2eab
make it possible to add custom CFLAGS
2014-10-27 17:19:05 +01:00
Ari Lemmetti
e93fa54838
Added -lrt to fix undefined references to clock_gettime on some systems
2014-10-23 14:51:28 +03:00
Ari Lemmetti
eb7cecc3dd
Added .travis.yml for continuous integration. Added env variable to disable AVX2 for Travis (GCC version doesn't support it yet).
2014-10-23 14:20:07 +03:00
Ari Lemmetti
20967cfafe
Allow CC to be defined other than gcc. If not defined, use gcc as default.
2014-10-23 13:25:00 +03:00
Ari Koivula
fcb6fa6d4b
Fix compilation error on PowerPC.
...
- Need abs from stdlib.
2014-10-21 18:14:32 +03:00
Ari Koivula
f6fead6221
Fix crash on inter frames.
...
- If the bitcost was 0 it would underflow for skip mode. The bitcost is now
checked before decrementing.
2014-10-21 18:11:39 +03:00
Ari Koivula
dfc67b766a
Disable rd1 chroma search.
...
- The bdrate improvement isn't really worth the time it takes, so enable it
only for rd3 until it can be made faster or better.
2014-10-16 13:59:20 +03:00
Ari Koivula
e9b8d9b889
Fix gcc warnings.
...
- Remove unused variables.
- Change intra prediction functions to take their inputs as const pointers.
- Change intra_get_pred to take two pointers instead of an array of pointers,
because the warnings got just too exotic.
2014-10-16 13:17:46 +03:00
Ari Koivula
4bac52d9b6
Merge branch 'intra'
2014-10-16 13:11:23 +03:00
Ari Koivula
afb9e8c3f4
Remove extra parameter sets.
2014-10-16 12:21:36 +03:00
Ari Koivula
02ec26fcea
Try different number of chroma intra modes for different depths.
...
- And avoid doing extra work if no extra modes are tested for certain depths.
2014-10-16 12:21:36 +03:00
Ari Koivula
3cf5e422e8
Make fast chroma mode search select modes for slower chroma search.
2014-10-16 12:21:36 +03:00
Ari Koivula
d12dbd4aa0
Add fast intra chroma mode search.
2014-10-16 12:21:08 +03:00
Ari Koivula
75a137c1e9
Add --cpuid parameter to disable runtime optimizations.
2014-10-16 12:01:36 +03:00
Ari Koivula
3e6023dfb5
Rename search constants and set sane defaults.
2014-10-16 03:08:11 +03:00
Ari Koivula
8a407b0313
Estimate luma and chroma intra mode bits separately.
...
- Remove cu_info.intra[].cost and bitcost as unnecessary.
- Add luma_mode_bits to complement chroma_mode_bits and remove
intra_pred_ratecost as unnecessary. Difference is that intra_pred_ratecost
was more coarse and included chroma mode with the assumption that it would
be the same as chroma.
2014-10-16 03:08:11 +03:00
Ari Koivula
c9e212ba92
Add intra chroma mode search.
...
- Based on full chroma reconstruction so enabled only for --rd=2.
2014-10-16 03:07:50 +03:00
Ari Koivula
b32867be2a
Remove -lrt from LDFLAGS.
...
- This might be required on some embedded system, but from what I can see
all the functions we use from real time extensions are included in libc
and the program seems to work fine without it.
- It doesn't exist on MinGW or Mac, so I think it's better to remove it
completely and add it later on any system that actually requires it.
- Related to #85.
2014-10-14 11:48:57 +03:00
Ari Koivula
6f8a976b12
Give ARCH_X86_64 to yasm on Mac.
...
- Issue raised in #85.
2014-10-14 09:47:56 +03:00
Ari Koivula
55ab08c213
Fix incorrect const qualifiers.
...
- Change input pointers to const in dct-generic, like they should have been.
- Fixes compilation error on GCC.
2014-10-13 16:57:15 +03:00
Ari Koivula
8a5b24bcbe
Remove usages of GCC __attribute__.
...
- To allow clang to compile, as it doesn't according to #58.
- The target attributes are not needed anymore due to makefile handling
targeting now.
- The __attribute__((unused)) was used for debugging. I don't know if clang
supports this attribute or not but it doesn't seem very important so
I'm removing it just in case.
2014-10-13 16:46:26 +03:00
Ari Koivula
04613bd5b3
Disable GET_TIME on Mac.
...
- This should fix the Mac version not compiling in issue #85.
2014-10-13 16:22:11 +03:00
Ari Koivula
a469c059a5
Take chroma tr-skip bits into account.
2014-10-13 10:48:39 +03:00
Ari Koivula
7a5cf5d865
Add trskip mode cost to fast trskip mode decision.
2014-10-13 10:45:41 +03:00
Ari Koivula
f164a5ba79
Add fast transform skip estimation to rough intra search.
2014-10-13 10:42:24 +03:00
Ari Koivula
d893a489d6
Fix mingw compilation issue.
...
strategies/avx2/dct-avx2.c:334:25: error: pasting "g_dct_16" and "[" does
not give a valid preprocessing token
- The [ is not part of the token so compilation failed on mingw GCC 4.9.1.
- Fixes #86.
2014-10-10 16:32:39 +03:00
Ari Koivula
28d1532578
Make rd=1 use cabac for coeff cost estimation.
2014-10-08 12:50:03 +03:00
Ari Koivula
cbb2aa75b7
Add macros for adjusting weight of distortion between luma and chroma.
...
- Everything needs to have a short name because windows has a maximum path
length limitation that is breaking my testing framework.
2014-10-08 10:31:54 +03:00
Ari Koivula
49ad845c33
Add cabac bits for part_mode.
2014-10-08 10:31:54 +03:00
Ari Koivula
b6710e7893
Add cabac bits for cu split flag.
2014-10-08 10:31:54 +03:00
Ari Koivula
38b224cf69
Change rest of cu split search costs to double.
2014-10-08 10:31:54 +03:00
Ari Koivula
17473624d3
Add transform tree bit costs for cbf_luma.
2014-10-08 10:31:54 +03:00
Ari Koivula
3b04d39db4
Take cabac bits into account on transform tree.
2014-10-08 10:31:54 +03:00
Ari Koivula
296f142d9e
Retain coded block flag data during transform split search.
2014-10-08 10:31:54 +03:00
Ari Koivula
85dea10f3f
Clean up transform split search.
...
- Remove unnecessary checks and comment.
2014-10-08 10:31:54 +03:00
Ari Koivula
e1b801eb6f
Add transform tree chroma cbf bits.
2014-10-08 10:31:23 +03:00
Ari Koivula
3868cc7ff1
Fix crash on inter search when --tr-depth-intra is used.
...
- Transform splits meant for intra modes were used for inter when inter mode
was chosen, which caused an assert to be triggered if the split transform
block didn't have any coefficients.
2014-10-03 19:29:06 +03:00
Ari Lemmetti
bcf12567d0
Added some comments.
2014-10-03 17:51:58 +03:00
Ari Lemmetti
fea517c2ae
Misc code cleanup
2014-10-03 17:06:09 +03:00
Ari Lemmetti
85682c3b6a
Removed unused transpose functions.
2014-10-03 11:39:31 +03:00
Ari Koivula
8a80845b91
Add chroma to transform split search.
2014-10-03 11:36:57 +03:00
Ari Koivula
51662e1081
Fix differences between cu_rd_cost_luma and rdo_cost_intra.
2014-10-03 11:36:57 +03:00
Ari Koivula
bc7d7d5cb6
Add cu_info* as parameter to reconstruction functions.
...
- This is required so these functions can be used for searching. When NULL
is given they take the CU from LCU struct as they did previously.
Conflicts:
src/search.c
2014-10-03 11:36:56 +03:00
Ari Koivula
ccc575e2c6
Disable transform tree bits.
2014-10-03 11:36:56 +03:00
Ari Koivula
a0ab469c89
Disable rdo_cost_intra.
2014-10-03 11:36:56 +03:00
Ari Koivula
c164978e21
Add FULL_CU_SPLIT_SEARCH macro for disabling cu split optimization.
2014-10-03 11:36:56 +03:00
Ari Koivula
549ac96438
Change costs to doubles to avoid rounding intermediate results.
...
- Helps with debugging.
2014-10-03 11:36:56 +03:00
Ari Koivula
e591e89ade
Add prediction mode to chroma reconstruction parameters.
...
- Just like in luma.
2014-10-03 11:36:56 +03:00
Ari Koivula
f6272f06fc
Unify signature for transform functions.
...
- Some used block, coeff and some src, dst. Now all signatures are const input
and non-const output.
2014-10-03 11:21:43 +03:00
Ari Koivula
b932cf4b21
Clean up avx2 dct macros.
2014-10-03 11:16:25 +03:00
Ari Koivula
47244a15c3
Merge branch 'dct-optimizations'
...
Conflicts:
src/strategies/avx2/dct-avx2.c
src/strategies/generic/dct-generic.c
2014-10-02 13:45:21 +03:00
Ari Lemmetti
61e1510480
Transform functions in dct-avx2.c are now generated with macros.
2014-10-02 13:24:30 +03:00
Ari Lemmetti
9407610555
Moved DCT / DST matrices to dct-generic.c
2014-10-02 13:24:30 +03:00
Ari Lemmetti
7255112bd8
Added transposed DCT/DST tables. Use them while calculating transforms instead of doing runtime transpose. Added separate functions for DST and IDST.
2014-10-02 13:24:30 +03:00
Ari Lemmetti
e7bcb58846
Added 32x32 IDCT
2014-10-02 13:24:30 +03:00
Ari Lemmetti
eacf173b7e
Added 32x32 DCT for AVX2
2014-10-02 13:24:30 +03:00
Ari Lemmetti
d2856a5d40
Added 32x32 transpose
2014-10-02 13:24:30 +03:00
Ari Lemmetti
7a33f08312
Added 16x16 DCT and IDCT for AVX2
2014-10-02 13:24:30 +03:00
Ari Lemmetti
d2fe2a5391
Added 16x16 transpose
2014-10-02 13:24:30 +03:00
Ari Lemmetti
d6af146a2e
Added part of the functions 16x16 DCT needs
2014-10-02 13:24:30 +03:00
Ari Lemmetti
aba3acdfff
Added AVX2 optimized transforms for 4x4 and 8x8 blocks
2014-10-02 13:24:30 +03:00
Ari Lemmetti
5856f32d81
Fixed incorrect shift values for inverse transforms in generic strategy
2014-10-02 13:24:29 +03:00
Ari Lemmetti
41b032664d
First version of 4x4 forward DCT
2014-10-02 13:24:29 +03:00
Ari Koivula
36232619ab
Fix broken cabac contexts in wpp.
...
- Fixes #84.
- The issue was caused by 241b9d6 naively copying the whole struct, which
contains data other than just the contexts. Rather than reverting the
change, the struct was refactored to have another struct that contained
just the contexts.
2014-09-24 01:02:52 +03:00
Ari Koivula
4e052d3f0f
Wrap contexts of cabac_data inside cabac_data.ctx struct.
2014-09-24 01:02:37 +03:00
Ari Koivula
b339004c4c
Rename cabac_state.ctx to cur_ctx.
2014-09-24 01:02:28 +03:00
Ari Koivula
8b8b53fba5
Merge branch 'sao_cabac'
2014-09-22 10:28:30 +03:00
Ari Koivula
bfa399c8fc
Fix compiler warnings.
...
- Non-parenthesized parameter in a macro.
- Unused variables.
- Wrong const qualifiers.
- Signed/unsigned comparison.
2014-09-22 10:04:57 +03:00
Marko Viitanen
6f65a9cbbd
Improved SAO merge decisions
2014-09-16 10:08:17 +03:00
Marko Viitanen
21df11ba4e
Implemented SAO search for both chroma components
2014-09-15 16:07:31 +03:00
Marko Viitanen
e8d1140a1a
Check SAO band offset for both chroma components and better SAO chroma cabac costs
2014-09-15 16:07:31 +03:00
Marko Viitanen
0c92031e8a
SAO merge checking cleanup
2014-09-15 16:07:31 +03:00
Marko Viitanen
b274e7adcd
Added cabac bit cost calculations to SAO search
2014-09-15 16:07:31 +03:00
Ari Koivula
5f732126c3
Add cabac bit costs float table.
2014-09-15 15:45:43 +03:00
Ari Koivula
0db7d8d20f
test cu split cost
2014-09-15 15:42:03 +03:00
Ari Koivula
35b2e6f755
Add missing cabac context for chroma cbf.
...
- The context was also missing from HM, but has been fixed in HM13.
2014-09-15 15:41:44 +03:00
Ari Koivula
241b9d6adb
Simplify cabac context copying.
...
Conflicts:
src/context.c
2014-09-15 15:41:44 +03:00
darealshinji
61a414bced
reposition colons in usage message to match with the rest
2014-09-15 03:40:18 +02:00
Ari Koivula
3c73892609
Fix transform split search.
...
- Redo the search with the best mode to make sure the tr_depth parameters are
correct.
2014-09-11 10:56:53 +03:00
Ari Koivula
46b6b1243b
Add --rd=3 mode and enable searching of intra depth 0.
...
- intra_build_reference_border was overflowing at depth 0 because it uses
arrays just large enough to accommodate 32x32 transforms, which is the
biggest transform.
- For similar reasons search_intra_rough doesn't work at depth 0.
- The --rd=3 mode tries all modes with transform search. It also works without
rough search so it was used to test depth 0 search. If --rd=3 is not on intra
split at depth 0 is not searched for.
Conflicts:
src/search.c
2014-09-11 10:54:41 +03:00
Ari Koivula
c5fa824347
Rebase transform split search.
2014-09-08 14:13:59 +03:00
Ari Koivula
79b86ce6e1
Add --tr-depth-intra command line option.
...
Conflicts:
src/encoder.c
2014-09-04 13:42:24 +03:00
Marko Viitanen
fe236de807
Fixed sps_max_dec_pic_buffering value to include current picture
2014-09-01 10:31:11 +03:00
Marko Viitanen
dbcc8d65aa
Removed duplicate function from RDOQ
2014-08-28 08:50:01 +03:00
Ari Koivula
931ec7301c
Put slice delta QP to bitstream.
...
- Previously, slice delta QP was always 0. Now if global->QP is changed before
contexts are set, the delta qp is put to the bitstream, allowing for rough
frame level rate control.
2014-08-25 16:43:23 +03:00
Ari Koivula
4c3bbd4a35
Rewrite the SConstruct.
...
- Works with new /strategy/ structure.
- Change architecture selection to use arch= instead of construction target.
2014-08-25 16:43:23 +03:00
Ari Lemmetti
f88c3b6f37
Removed unnecessary if (both branches did the same thing)
2014-08-20 11:54:35 +03:00
Laurent Fasnacht
f3c311fe1a
Fix commit 8502f3d
2014-08-11 15:17:15 +02:00
Laurent Fasnacht
f9bffe35a5
Log tile id in sad perf log
2014-08-11 11:57:08 +02:00
Laurent Fasnacht
6a937de9b2
Fix search_cu log
2014-08-11 11:57:08 +02:00
Laurent Fasnacht
8502f3d850
Improve logging
2014-08-11 11:57:07 +02:00
Laurent Fasnacht
f1b303a2d2
Fix compilation errors
2014-08-11 09:53:06 +02:00
Ari Lemmetti
47e3bcfb50
Fixed incorrect shift values for inverse transforms in generic strategy
2014-08-07 16:01:30 +03:00
Ari Lemmetti
709520a233
Removed all AVX2 instructions from SATD functions.
...
-Zero extend macro now returns results in 2 xmm registers instead of one ymm
2014-07-31 13:25:28 +03:00
Ari Lemmetti
0beb278f5b
Partial butterfly strategy is now called DCT strategy. Made changes to transform functions in preparation for optimizations.
...
-Moved fast_forward_dst and fast_inverse_dst to DCT strategies
2014-07-31 13:25:28 +03:00
Ari Lemmetti
6bf63bd171
Added AVX2 strategy for partial butterfly (no optimizations yet)
2014-07-31 13:25:28 +03:00
Ari Lemmetti
faccc4f09b
Partial butterfly functions now utilize the strategy selector
2014-07-31 13:25:28 +03:00
Ari Koivula
c2fac805d7
Give HAVE_ALIGNED_STACK to yasm on windows.
...
- Linux gets it through some other means but on windows it needs to be
given explicitly.
- Fixes issue #78.
2014-07-30 16:26:23 +03:00
Ari Koivula
669e99dd7f
Improve intra SAD AVX2 intrinsics.
...
- Moved implementations for different sizes to inline functions that are
defined using each other, reducing the amount of redundant code.
- Performance of sad_8bit_32x32_avx2 improved by about 10% due to unrolling of
the loop.
2014-07-25 15:59:55 +03:00
Ari Koivula
e00102f0ca
Compile asm optimizations only if yasm is present.
2014-07-23 14:57:40 +03:00
Ari Lemmetti
85fb0784e4
Fixed indentation and added some empty lines for readability
2014-07-23 12:32:27 +03:00
Ari Lemmetti
bd6e89c1f0
Updated include directories and file names to Makefile
2014-07-22 15:36:54 +03:00
Ari Lemmetti
4f88ebce5a
Added comments and made visual studio not to compile x86inc.asm
2014-07-22 15:07:57 +03:00
Ari Koivula
cfd3636e08
Move some repetitive SATD asm into a macro.
...
Conflicts:
src/strategies/x86_avx/picture_x86.asm
2014-07-22 12:46:39 +03:00
Ari Lemmetti
c81639dd09
Removed old unused macro
2014-07-22 11:11:20 +03:00
Ari Lemmetti
cf0797cafd
Reordered and indented assembly code
2014-07-22 11:07:42 +03:00
Ari Lemmetti
fea44c8234
Renaming AVX/asm files
...
-Split SAD and SATD functions into separate files
2014-07-21 18:02:01 +03:00
Ari Lemmetti
a64df6f0d0
Merge branch 'asm'
...
Conflicts:
build/kvazaar_lib/kvazaar_lib.vcxproj.filters
src/Makefile
src/strategies/strategies-picture.c
2014-07-21 16:41:09 +03:00
Ari Lemmetti
1be2c3aae5
Preparing push to master and misc
...
-Removed unnecessary <math.h> headers
-Updated AVX/asm optimizations to match the new file hierarchy
-Makefile only compiles .asm files if KVAZAAR_DISABLE_YASM is not set to 1 and TARGET_CPU_ARCH is x86
2014-07-21 12:39:56 +03:00
Ari Koivula
a8f7103797
Add AVX2 implementations for sad_8bit_ 8x8, 16x16 and 32x32.
2014-07-18 18:27:30 +03:00
Ari Koivula
3daa5dd1f1
Add sse2 implementation for sad_8bit_4x4.
2014-07-18 18:20:34 +03:00
Ari Koivula
f49332c9b8
Add missing includes.
2014-07-18 17:56:15 +03:00
Ari Koivula
291817667f
Tidy up the Makefile.
2014-07-18 17:31:18 +03:00
Ari Koivula
e241866f43
Compile intrinsic functions with appropriate flags in gcc.
...
- Remove -march=native as it's no longer necessary for intrinsics to work.
Closes #77 .
- I couldn't test altivec or sse4.1, but sse4.1 compiles so I expect it
to work.
2014-07-18 17:28:14 +03:00
Ari Koivula
5662621b3c
Free threadqueue jobs when they are not needed.
...
- Also add destroying the mutex when the job is freed.
- This makes Kvazaar no longer acquire thousands of OS handles on Windows.
2014-07-16 16:51:20 +03:00
Ari Lemmetti
1e94262f85
Made AVX asm compatible with the changed system
...
- x86inc.asm is now located in extras
- Removed unused cpu.asm/h
2014-07-14 18:51:17 +03:00
Ari Lemmetti
683eda1183
Merge branch 'master' into asm
...
Conflicts:
build/kvazaar_lib/kvazaar_lib.vcxproj
build/kvazaar_lib/kvazaar_lib.vcxproj.filters
src/Makefile
src/strategies/strategies-picture.c
2014-07-14 16:42:33 +03:00
Ari Lemmetti
7f873e037c
Updated Makefile to compile picture_x86.asm
2014-07-14 15:30:08 +03:00
Ari Lemmetti
2169f9ab8c
Added AVX asm comments and fixes
...
-Added vzeroupper to satd macro to prevent AVX-SSE transition penalties in picture_x86.asm
-Fixed the order of registers in zero extend macro in picture_x86.asm
-Fixed SATD checkers test pattern in satd_tests.c
2014-07-14 14:43:36 +03:00
Ari Koivula
5d0df56c94
Move optimizations to their own compilation units according to target.
...
- This is necessary in order to compile AVX intrinsics correctly in
Visual Studio. Having everything in their own units should also make
compiling normal C code with optimizations on easier.
- For now the makefile still relies on GCC __target__ attribute for compiling
intrinsics.
2014-07-11 17:26:19 +03:00
Ari Koivula
f605d6c35b
Align intra buffers to 32 bytes for 256 bit SIMD instructions.
2014-07-11 17:26:19 +03:00
Ari Koivula
fbd03b706e
Reconfigure VS project.
...
- Moved compilation flag stuff from project file to the abstraction layer.
- Disabled randomized base address as unnecessary.
- Disable stack buffer security check from release.
2014-07-11 17:26:19 +03:00
Laurent Fasnacht
72abc69b3d
Measure time for SAD in _DEBUG mode
2014-07-08 11:42:58 +02:00
Laurent Fasnacht
1a318c714d
log poc with new_frame
2014-07-08 11:42:19 +02:00
Laurent Fasnacht
e64a692780
Add CU type in threadqueue.log
2014-07-08 09:06:31 +02:00
Laurent Fasnacht
abfbb7cad3
Fix duplicate type key in threadqueue.log
2014-07-07 11:36:50 +02:00
Laurent Fasnacht
946e3b9651
Log search_cu to threadqueue.log
2014-07-07 10:50:05 +02:00
Laurent Fasnacht
f62e571c15
Add missing info to threadqueue.log
2014-07-07 10:49:40 +02:00
Ari Lemmetti
048127c7e3
AVX assembly optimizations improved
2014-07-02 16:57:06 +03:00
Ari Koivula
7ecf78bb70
Use sqrt lambda cost for searches not using SSD.
...
- Add encoder_state->global->cur_lambda_cost_sqrt.
- Use sqrt lambda for inter search and rough intra search.
- The effect on inter is around 10-20% bdrate. The effect on intra is smaller
and non-existent when --rd=2 is enabled, as the intra search refinement was
already done with SSD and correct lambda.
2014-06-26 13:56:38 +03:00
Laurent Fasnacht
1112dca933
Fix compilation issue with assertion disabled
2014-06-26 07:31:37 +02:00
Laurent Fasnacht
9ab9defe67
Bitstream length per frame works again
2014-06-19 10:24:03 +02:00
Laurent Fasnacht
45faadb2c9
Fix bug where the wrong number of frames could be encoded (if one frame takes longer than the others)
2014-06-19 10:24:02 +02:00
Ari Koivula
d5a77be4b8
Fix avx detection for gcc.
...
- GCC doesn't support _xgetbv intrinsic so we have to use inline assembler.
2014-06-18 11:50:17 +03:00
Ari Lemmetti
bdef5384ef
Added AVX strategy
2014-06-17 16:52:24 +03:00
Ari Koivula
d7abe6a7c2
Address compilation warning.
...
strategyselector.c:170:10: error: ‘__get_cpuid’ is static but used in inline function ‘get_cpuid’ which is not static [-Werror]
return __get_cpuid(level, eax, ebx, ecx, edx);
2014-06-17 16:26:55 +03:00
Ari Koivula
60ecc6baae
Remove unused stuff.
2014-06-17 16:20:01 +03:00
Ari Koivula
7532b789f8
Add -std=gnu99 for gcc.
...
- std=c99 doesn't work because then struct timespec won't be defined.
2014-06-17 16:15:39 +03:00
Ari Koivula
94bc457b6c
Add option to disable fast intra search.
2014-06-17 15:32:05 +03:00
Ari Koivula
e27fc875c0
Clean up intra search.
2014-06-17 15:09:12 +03:00
Ari Koivula
e4d70ac1ab
Use more starting points for smaller blocks in intra search.
2014-06-17 13:28:27 +03:00
Ari Koivula
9911c7553b
Avoid unnecessary intra dir searching.
2014-06-17 13:11:35 +03:00
Ari Koivula
bd16a55b9b
Always check DC and planar intra modes.
...
- At least one of them is always in predicted modes, but to make sure they
are both included add them explicitly.
2014-06-17 12:51:15 +03:00
Ari Koivula
70740da123
Add smarter rough intra search.
...
- Directional intra mode search is done using halving search from the best
known mode. Starting modes are vertical, horizontal and the 3 diagonal
modes.
Conflicts:
src/search.c
2014-06-17 12:33:10 +03:00
Marko Viitanen
0e2fe9e7ff
Changed intra search to skip some modes speeding it up
2014-06-17 12:32:29 +03:00
Marko Viitanen
a1c3cfe944
Moved intra mode cost calculation to a function
...
Conflicts:
src/search.c
2014-06-17 12:32:29 +03:00
Marko Viitanen
eb7d46f9ef
Modify CU split cost.
2014-06-17 12:30:32 +03:00
Marko Viitanen
bfa37b876b
Conformance fix: set sps_max_dec_pic_buffering to correct value
2014-06-17 12:30:32 +03:00
Ari Koivula
b3c15b8f94
Merge branch 'owf'
2014-06-16 16:07:41 +03:00
Laurent Fasnacht
91de92134f
Constrain the search not to go under the LCU below if OWF is enabled
2014-06-16 14:27:56 +02:00
Laurent Fasnacht
ef9c2258e9
Fix frame counter and stats
2014-06-16 13:21:52 +02:00
Ari Koivula
153b1ee41f
Merge branch 'intra-sad-strategies'
2014-06-16 12:34:37 +03:00
Laurent Fasnacht
84d34c2655
Fix compilation on non-intel
2014-06-16 11:24:02 +02:00
Ari Koivula
3f00592b96
Separate strategyselector debug prints from _DEBUG.
...
- I only want to see the strategy stuff.
2014-06-16 12:15:19 +03:00
Ari Koivula
1c97a10a6d
Move intra SAD and SATD functions under strategies.
2014-06-16 12:13:41 +03:00
Laurent Fasnacht
4b4702819b
Also print encoding FPS
2014-06-16 11:10:11 +02:00
Laurent Fasnacht
2347574a8e
Fix problems revealed by valgrind
2014-06-16 11:10:09 +02:00
Laurent Fasnacht
28c3f22ba1
Fix possible freeze
2014-06-16 11:03:48 +02:00
Laurent Fasnacht
a96c742ad4
Fix depends for wpp+owf
2014-06-16 11:03:47 +02:00
Laurent Fasnacht
f99e41d41f
Improved CPU time statistics
2014-06-16 11:03:46 +02:00
Laurent Fasnacht
8a33c0a688
Fix recon job for wfrow
2014-06-16 10:55:01 +02:00
Laurent Fasnacht
bf6024734a
Fix statistics with OWF
2014-06-16 10:55:00 +02:00
Laurent Fasnacht
0522a3d8e5
--owf option
2014-06-16 10:55:00 +02:00
Laurent Fasnacht
47d1ded7b0
Dependencies between frames
2014-06-16 10:54:59 +02:00
Laurent Fasnacht
003d3c504c
image_list_copy_contents
2014-06-16 10:54:58 +02:00
Laurent Fasnacht
f4187dd10c
cu_array data structure
2014-06-16 10:54:57 +02:00
Laurent Fasnacht
3be3fa8d6e
Use different processing order depending if we have OWF or not
2014-06-16 10:54:56 +02:00
Laurent Fasnacht
c32943f78b
OWF
2014-06-16 10:54:56 +02:00
Laurent Fasnacht
490dd15f3d
Remove flush between frame
2014-06-16 10:51:33 +02:00
Laurent Fasnacht
fddcbabe28
bitstream writing is now a "normal" job in a thread
2014-06-16 10:51:32 +02:00
Laurent Fasnacht
ff7143cc24
Assign thread_queue_jobs and move image_free to a more suitable place
2014-06-16 10:51:32 +02:00
Ari Koivula
87ca828a63
Correct intra sad function labels.
...
- These haven't been 16 bit for a long time.
2014-06-16 10:45:10 +03:00
Ari Koivula
fcce6ae823
Fix printing of AVX2 capability.
2014-06-14 01:24:19 +03:00
Ari Koivula
a49ba2633a
Add OS and CPU detection for AVX2 and AVX.
2014-06-13 16:57:53 +03:00
Ari Koivula
1de102be61
Move strategies to their own compilation units.
...
- Enforces a little bit more hierarchy. Compilation units are in strategies
and whatever inline includes they have are in a folder with the same name
as the strategy.
2014-06-13 15:30:23 +03:00
Ari Koivula
aa3549a717
Change SLEEP(0) to SLEEP(10) on Windows.
...
- This is a workaround for a performance problem on Windows where main thread
is busy looping.
2014-06-13 12:01:03 +03:00
Laurent Fasnacht
4acadccf89
Only signal the required number of threads
2014-06-13 08:34:59 +02:00
Laurent Fasnacht
70ce7cec20
Remove unnecessary locks by adding threadqueue->queue_running counter
2014-06-13 08:34:58 +02:00
Laurent Fasnacht
7ef34ff5a1
Ability to dump mutex_lock, mutex_unlock and cond_wait timing, if compiled with -D_PTHREAD_DUMP
2014-06-13 08:32:14 +02:00
Laurent Fasnacht
68ad323e84
Tentative fix for race condition
2014-06-12 14:01:33 +02:00
Laurent Fasnacht
b194e19708
Tentative fix for deadlock
2014-06-12 12:57:14 +02:00
Laurent Fasnacht
b765eca153
Remove unneeded encoder_state_blit_pixels
2014-06-12 11:47:46 +02:00
Laurent Fasnacht
da07b8b35d
No-copy works (SAO and deblocking enabled)
2014-06-12 11:47:38 +02:00
Laurent Fasnacht
2cc700fab8
No-copy works with --no-sao (deblocking enabled)
2014-06-12 11:47:31 +02:00
Laurent Fasnacht
6b408b5904
No-copy works with --no-sao --no-deblock
2014-06-12 11:47:30 +02:00
Laurent Fasnacht
0dbfa62698
Replace copy of images made for tiles by sub-images (no copy)
...
- replace width by stride where required in the source code
2014-06-12 11:47:30 +02:00
Laurent Fasnacht
b1347efef5
Add checkpoint in sao_reconstruct
2014-06-12 11:47:29 +02:00
Laurent Fasnacht
ae4dc4eb44
Fix uninitialized sao_info structure members, which was creating false positive when checkpointing SAO
2014-06-12 11:47:29 +02:00
Laurent Fasnacht
f371bdafc3
sao_info checkpoints
2014-06-12 11:47:28 +02:00
Laurent Fasnacht
b7fe81c55c
Checkpoint in pixels_blit, and avoid undefined behaviour when source and destination are the same.
...
Seems a reasonable point to observe when refactoring, since it's called on most image data.
2014-06-12 11:47:28 +02:00
Laurent Fasnacht
da8559fa34
Fix bug in CHECKPOINTS_FINALIZE() when checkpoints are disabled
2014-06-12 11:47:27 +02:00
Laurent Fasnacht
14df6de0d0
Checkpoint on frame checksum
2014-06-12 11:47:00 +02:00
Laurent Fasnacht
22df7cf98b
Use an assert instead of a dumb assignment
2014-06-12 11:47:00 +02:00
Laurent Fasnacht
cf123e317f
Code to checkpoint cu_info and lcu_t
2014-06-12 11:47:00 +02:00
Ari Koivula
ea830d3dd2
Add warning for VLAs in Makefile.
2014-06-12 09:57:08 +03:00
Ari Koivula
443f2f00aa
Fix compilation for VS.
...
- VS2013 does not support variable length arrays.
2014-06-11 17:51:55 +03:00
Laurent Fasnacht
87ed365053
typo fix
2014-06-11 10:29:05 +02:00
Laurent Fasnacht
6ca30367f9
Fix POC bug
2014-06-11 10:29:05 +02:00
Laurent Fasnacht
8437229885
Fix handling of cu_arrays
2014-06-11 10:29:04 +02:00
Laurent Fasnacht
e1d9cb015a
Basic checkpointing system
2014-06-11 10:29:03 +02:00
Laurent Fasnacht
27a49d287d
Big refactor to use videoframe, image_list, and image instead of picture*
2014-06-10 09:19:06 +02:00
Laurent Fasnacht
530faf3951
Move video frame related stuff to videoframe
2014-06-05 14:08:31 +02:00
Laurent Fasnacht
0fac77f9eb
Image now in separate module
2014-06-05 14:04:12 +02:00
Laurent Fasnacht
2456c65822
Replace accesses to picture->cu_array with picture_get_cu and picture_get_cu_const
2014-06-05 10:41:58 +02:00
Laurent Fasnacht
821b71910b
Move picture_list to its own module
2014-06-05 09:49:24 +02:00
Laurent Fasnacht
7372f9244d
Basic infrastructure for OWF
2014-06-05 09:09:25 +02:00
Laurent Fasnacht
16e3a58359
Performance improvement
2014-06-05 06:57:51 +02:00
Laurent Fasnacht
bad6d45e5f
Performance improvement
2014-06-05 06:57:51 +02:00
Laurent Fasnacht
aad2089fcf
Use -ftree-vectorize
2014-06-05 06:57:50 +02:00
Laurent Fasnacht
ea04bcd6a4
AltiVec support for SAD
2014-06-05 06:57:34 +02:00
Ari Koivula
3a7147baf4
Merge branch 't-20140602'
2014-06-04 18:11:15 +03:00
Ari Koivula
31b1bbc215
Address implicit declaration warnings.
2014-06-04 18:00:50 +03:00
Ari Koivula
4f5c87fc5e
Remove duplicate function definition.
2014-06-04 17:56:05 +03:00
Ari Koivula
cb7d7f9e15
Update Makefile.
2014-06-04 17:52:28 +03:00
Ari Koivula
bb47534b88
Make encoder_state .c files their own compilation units.
...
- It's good that this module has been chopped to smaller pieces, but let's
avoid including .c files unless we really have to. These make pretty good
submodules on their own so just make them their own compilation units.
- Move some stuff around to avoid having to forward declare them
in encoderstate.c.
2014-06-04 17:45:18 +03:00
Ari Lemmetti
9e649a8f38
Updated usage message
2014-06-04 15:23:27 +03:00
Laurent Fasnacht
b8acdc784a
Fix compilation of encoder.c with -D_DEBUG
2014-06-03 15:02:14 +02:00
Laurent Fasnacht
961da05235
Split encoderstate.c in multiple files
2014-06-03 14:47:49 +02:00
Laurent Fasnacht
3d07f8cc84
encoderstate refactor
2014-06-03 14:25:16 +02:00
Laurent Fasnacht
2e821b79a9
encoder_state is now in encoder_state.[ch]
2014-06-03 13:51:30 +02:00
Laurent Fasnacht
9bdecbe071
Better thread scheduling
2014-06-03 11:39:16 +02:00
Laurent Fasnacht
0811dbcfbe
Remove unneeded cond_broadcast. Limit contention
2014-06-03 09:45:17 +02:00
Laurent Fasnacht
5ee1319c08
Altivec detection
2014-06-03 07:55:39 +02:00
Laurent Fasnacht
58ad3b4d26
Log more performance data, also plot how many threads are running
2014-06-03 07:42:22 +02:00
Laurent Fasnacht
5ed69b063b
Strategy selector for array_checksum, basic implementation using precomputed 256*256 block with larger accesses than byte
2014-06-03 07:42:22 +02:00
Ari Koivula
a483e8cb0f
Move cpuid stuff away from compiler namespace.
...
Conflicts:
src/strategyselector.c
2014-05-30 10:08:14 +03:00
Marko Viitanen
6a72f87028
Merge commit '792a5a5dd1946a327f22b2daba05c6645dfa8037'
2014-05-30 08:47:01 +03:00
Marko Viitanen
792a5a5dd1
Small fix for __get_cpuid()
2014-05-30 08:37:03 +03:00
Laurent Fasnacht
642564b6fb
Remove unused variable
2014-05-28 15:04:45 +02:00
Laurent Fasnacht
4f86919d75
Get rid of assembly cpuid for x86, compilation works for powerpc
2014-05-28 15:04:00 +02:00
Ari Koivula
e585da37e5
Give correct transform depth to RDOQ.
...
Conflicts:
src/search.c
2014-05-28 15:47:49 +03:00
Ari Koivula
dceb3da9b8
Fix bug in search relating to transform with no non-zero coefficients.
...
- Because cost was calculated even though there were no coefficients, these
very good modes were less likely to be selected.
- Added assert to encode_coeff_nxn to avoid these problems in the future.
2014-05-28 15:22:18 +03:00
Ari Koivula
ddc02cc09e
Avoid regenerating reference pixels for every rdo mode.
2014-05-22 13:18:28 +03:00
Ari Koivula
dbe13d0cba
Separate sad intra search from rdo search.
2014-05-22 12:47:45 +03:00
Ari Koivula
19ce21e07c
Split final cost to luma and chroma functions.
2014-05-22 09:45:00 +03:00
Ari Koivula
a6962e2974
Separate intra transform coding to luma and chroma functions.
2014-05-22 09:40:34 +03:00
Laurent Fasnacht
3a30a886fc
FREE_POINTER of job->rdepends was at the wrong place (memory leak)
2014-05-22 07:15:18 +02:00
Laurent Fasnacht
3b38777b71
Fix condition depending on uninitialized value in SAO
2014-05-21 16:33:24 +02:00
Laurent Fasnacht
66e730ba94
Fix encoder_state_init, which was making out-of-bounds reads
2014-05-21 14:23:36 +02:00
Laurent Fasnacht
37c20b8ce5
Add dependency between SAO rows
2014-05-21 13:52:56 +02:00
Laurent Fasnacht
90f46dc56f
Threadqueue now has a start index to the first queue job. It improves the speed a little
2014-05-21 12:02:55 +02:00
Laurent Fasnacht
f4f9093cb5
Parallel SAO
2014-05-21 11:48:29 +02:00
Laurent Fasnacht
a3fcb141ed
lcu_order_element now has pointer to neighbor LCUs
2014-05-21 11:06:53 +02:00
Ari Koivula
de76d0a294
Don't add dependency to the above LCU in wavefront if it's not necessary.
...
- The top-right LCU already has dependency to the top LCU.
2014-05-20 10:48:19 +03:00
Laurent Fasnacht
bdc2d43180
Write bitstream directly after doing the search. This is required since we need the correct entropy status for wpp
2014-05-20 09:29:01 +02:00
Laurent Fasnacht
06532292fc
Wavefronts are in tile coordinates
2014-05-20 09:28:58 +02:00
Ari Koivula
4751a3744b
Fix intra mode search not doing boundary smoothing for DC.
...
- Move the boundary smoothing to the prediction function to make sure it's not
forgotten.
2014-05-19 16:23:17 +03:00
Ari Koivula
f9a603e4ea
Move intra mode search from intra module to search module.
...
- Make the actual intra prediction function global.
- Move the rdo stuff to rdo module.
2014-05-19 16:12:02 +03:00
Ari Koivula
1da94f2085
Stop deblocking from filtering edges not on 8x8 grid.
2014-05-19 15:58:54 +03:00
Ari Koivula
2224e18a46
Make deblocking work with transform splits.
...
- It used to work only with the implicit transform split from LCU size.
2014-05-19 15:58:54 +03:00
Ari Koivula
656b0a321b
Add chroma mode to lcu_set_intra_mode.
...
- This is needed for intra split.
2014-05-19 15:58:54 +03:00
Ari Koivula
921f58b249
Add tr_split to lcu_set_intra_mode.
2014-05-19 15:58:54 +03:00
Ari Koivula
846b608125
Add transform split recursion to intra reconstruction.
2014-05-19 15:58:54 +03:00
Ari Koivula
63f6cad5a0
Include global.h in thread modules.
2014-05-19 15:58:16 +03:00
Ari Koivula
551b087b47
Remove bunch of unnecessary code from encode_transform_unit.
...
- Really, it's useless. Selecting scan order isn't this hard.
- Checked from HM that ctx_idx doesn't have anything to do with contexts.
2014-05-16 17:42:40 +03:00
Ari Koivula
f73bef0941
Remove unused include.
2014-05-16 16:09:59 +03:00
Laurent Fasnacht
6fdb821b14
Fix memory leaks
2014-05-16 12:20:40 +02:00
Laurent Fasnacht
d4a6aed471
Multi-row jobs
2014-05-16 12:20:40 +02:00
Marko Viitanen
94285fbed7
Fixed compiling on visual studio with _DEBUG defined
2014-05-16 12:22:06 +03:00
Marko Viitanen
86155ef1ba
Added windows specific timing macros for thread debugging
2014-05-16 12:16:22 +03:00
Laurent Fasnacht
36945e89ce
Stubs to be able to make a portable version of the profiling
2014-05-16 10:15:05 +02:00
Laurent Fasnacht
53b0835316
Improve handling of jobs when not using threads
2014-05-16 08:50:43 +02:00
Laurent Fasnacht
519750d630
Write bitstream of a wavefront in a parallel way
2014-05-16 08:50:42 +02:00
Laurent Fasnacht
7473ac1bfc
Able to log time in a simple way
2014-05-16 08:50:42 +02:00
Laurent Fasnacht
86e01284b8
Add -lrt
2014-05-16 08:48:54 +02:00
Laurent Fasnacht
4f73a7fc91
Instrument threads in order to be able to do some visualization
2014-05-16 08:44:32 +02:00
Ari Koivula
a7cd31d87b
Update the names of some bins to the current spec.
...
- Helps with debugging.
2014-05-16 05:44:03 +03:00
Ari Koivula
ab4041c8fc
Change cabac debug statements to show information better.
...
- Show the number of bits when encoding multiple bins. I would like just the
bits themselves in string form, but that's too much trouble for this.
- Print them as unsigned and coerce them to unsigned, as they are going
to get coerced to unsigned by the function call anyway.
- Change state to be less verbose.
2014-05-16 05:44:03 +03:00
Ari Koivula
c9a8756fbd
Fix NxN scan mode for lcu_get_final_cost.
...
- Scan mode was always selected according to the first PU mode.
2014-05-15 16:20:35 +03:00
Marko Viitanen
b08047cce9
Fixed intra chroma mode selection
2014-05-15 09:50:05 +03:00
Tapio Katajisto
4d879945b2
Fixed cost calculations in fme
2014-05-15 03:42:42 +00:00
Ari Koivula
f0e990905e
Remove chroma mode "36".
...
- It's an unnecessary chore to handle this special case everywhere (it means
chroma_mode == intra_mode). Better just to use the actual mode.
2014-05-14 19:56:35 +03:00
Ari Koivula
60a0ba4280
Update VS project files to link win32-pthread.
...
- I haven't found a good way of including external dependencies to VS projects
yet. Win32-pthreads is assumed to be found at the same level as kvazaar dir
and has the files x86/pthreadVC2.lib and x64/pthreadVC2.lib.
- Win32-pthreads also requires the pthreadVC2.dll to be in PATH when running
the program. Not sure what to do about that yet. We might need an installer
for windows to handle that.
- Disable openmp as it's no longer used.
- Stop linking Ws2_32.lib as that hasn't been used for ages.
2014-05-14 17:54:34 +03:00
Laurent Fasnacht
8ff9ea0eee
Wavefront works with parallelism + deblock (still no SAO)
2014-05-14 14:01:26 +02:00
Laurent Fasnacht
38444a81a6
Threads should be put in queue in wait state if we want to add dependencies later
2014-05-14 14:01:25 +02:00
Laurent Fasnacht
e72408249b
Add encoder_state pointer to lcu_order_element, new worker_encoder_state_search_lcu function to run the search stuff on one LCU
2014-05-14 14:01:24 +02:00
Laurent Fasnacht
eb62696461
Fix problems when image dimensions are not a multiple of LCU
2014-05-14 13:27:14 +02:00
Laurent Fasnacht
1ba1683c05
search buffer has to be allocated tile-wise to avoid problems with wavefronts
2014-05-14 13:27:13 +02:00
Laurent Fasnacht
bb86f24000
Take advantage of the new buffers to remove unneeded item assignment
2014-05-14 13:27:13 +02:00
Laurent Fasnacht
6607c9f563
Use new buffers for search
2014-05-14 13:27:12 +02:00
Laurent Fasnacht
c257c4b863
Add const for the buffers
2014-05-14 13:27:12 +02:00
Laurent Fasnacht
1680273e80
Store search borders in a buffer for the whole picture
2014-05-14 13:27:11 +02:00
Laurent Fasnacht
0ceb1469a2
Improve decision about when to split into threads
2014-05-14 13:27:11 +02:00
Laurent Fasnacht
d4a303e7e6
Free jobs as soon as possible
2014-05-14 13:27:09 +02:00
Laurent Fasnacht
63adb54a3d
Add --threads <int> command line parameter
2014-05-14 13:27:09 +02:00
Laurent Fasnacht
e772799d5e
encoder_state_encode now uses the threadqueue
2014-05-14 13:27:08 +02:00
Laurent Fasnacht
baede7f6c4
threadqueue
2014-05-14 13:27:08 +02:00
Laurent Fasnacht
8b7774153f
Add SLEEP() define
2014-05-14 13:27:08 +02:00
Laurent Fasnacht
aac7fc55b1
Remove filter_deblock function, which is not used and somewhat dangerous, since it doesn't take into account specific stuff about subencoders.
2014-05-14 13:27:07 +02:00
Laurent Fasnacht
bc3ca90bdf
Fix tiles when SAO or deblock is enabled.
...
Was broken by previous commit.
2014-05-14 13:27:07 +02:00
Laurent Fasnacht
4815a0604b
Entropy coding sync works without parallelism, without SAO and without deblocking
2014-05-14 13:27:06 +02:00
Laurent Fasnacht
2c2a2528f3
Remove openmp stuff
2014-05-14 13:27:06 +02:00
Ari Koivula
aee9bf2875
Re-add rdo control to transformskip decision.
...
- It got left out when rewriting the function.
2014-05-14 12:39:23 +03:00
Ari Koivula
9147b7acbf
Split residual quantization to separate luma and chroma function.
2014-05-14 11:19:48 +03:00
Tapio Katajisto
cc92cfee18
Added a few warnings to Makefile
...
Cleaned fme code a bit
2014-05-14 01:49:34 +00:00
Tapio Katajisto
efc43c8b3a
Added fractional pixel motion estimation
...
Added fractional mv support for inter recon
Added 1/8-pel chroma and 1/4-pel luma interpolation
2014-05-14 01:42:02 +00:00
Ari Koivula
e947bd4c0e
Clean up trskip decision code and remove old code.
...
- You can define structs inside functions! This changes everything!!
- Bitstream changes a little bit compared to old trskip decision. Bdrate
change is insignificant though.
2014-05-13 22:00:04 +03:00
Ari Koivula
a3cdee9ec5
Move new trskip decision to a function.
2014-05-13 21:59:00 +03:00
Ari Koivula
2ff713ccb2
Add new implementation for trskip decision.
2014-05-13 21:57:45 +03:00
Ari Koivula
8b8da6f493
Make luma and chroma use the same quantization function.
...
- Only thing not working was transform skip.
2014-05-13 21:57:23 +03:00
Ari Koivula
f0bfcedba2
Clean up coeff reconstruction code.
2014-05-13 21:56:10 +03:00