Marko Viitanen
867304970e
[fme] Enable avx2 interpolation and fix some warnings about shifting mv's
...
* Also switch mv int16_t to mv_t in many places
2021-11-22 10:38:18 +02:00
Marko Viitanen
d183c78fad
[fme] Change fracmv_within_tile() to use internal MV resolution
2021-11-22 10:06:38 +02:00
Ari Lemmetti
2bdfb3b536
Rename variables to be less misleading
2021-11-21 02:20:42 +02:00
Ari Lemmetti
eb0f42aa96
Update comment
2021-11-21 02:11:50 +02:00
Ari Lemmetti
86b37a8e02
Minor formatting
2021-11-21 02:06:49 +02:00
Ari Lemmetti
fd20462202
Fix more newlines...
2021-11-21 02:01:57 +02:00
Ari Lemmetti
40ca21d221
Consistent naming part 3
2021-11-21 01:04:14 +02:00
Ari Lemmetti
6c0bdb45b9
Consistent naming part 2
2021-11-21 00:58:52 +02:00
Ari Lemmetti
a68d73674b
Consistent naming part 1
2021-11-21 00:32:01 +02:00
Ari Lemmetti
8f0e96162a
Formatting
2021-11-20 23:33:57 +02:00
Ari Lemmetti
a236b32c52
Fix newlines
2021-11-20 23:14:31 +02:00
Ari Lemmetti
5225dcea3c
Implement missing block sizes. Fix types and conditions.
2021-11-20 22:53:37 +02:00
Marko Viitanen
ad8bad3f94
[inter] Scale x and y correctly in fracmv_within_tile()
2021-11-19 17:51:46 +02:00
Marko Viitanen
8841ed9c21
[inter] Fix fracmv_within_tile() assert to use correct mv resolution
2021-11-19 17:37:11 +02:00
Marko Viitanen
4d20461410
[inter] Limit merge search of some blocks with sub 1/4 pixel mv's
2021-11-19 17:28:13 +02:00
Marko Viitanen
fa9a1db498
[inter] Fix mv precision in many places and add more mv_t usage and new vector2d_t rounding function
2021-11-19 16:20:49 +02:00
Marko Viitanen
5020f5f742
[inter] Fix incorrect mv scaling in unipred and change more mv types to mv_t
2021-11-18 11:49:08 +02:00
Marko Viitanen
21d7d2e4ed
[inter] Remove MV rounding from kvz_inter_get_merge_cand() and add it to where merge cand are used
...
* Should be adapted to AMVR later
* merge candidates match VTM at full precision
2021-11-18 11:09:26 +02:00
Marko Viitanen
bf06538f33
[inter] Change internal MV precision to "INTERNAL_MV_PREC" and add new type mv_t
2021-11-18 09:49:12 +02:00
Ari Lemmetti
5389842675
Add missing SIMD bipred functions for AMP blocks with size of 12 or larger
2021-11-17 21:33:13 +02:00
Marko Viitanen
c4a9d3dc83
[amvr] Add cmd line parameter for amvr and a field to the cu_array_t for setting it
...
* Still hardcoded to QPEL
2021-11-17 14:53:30 +02:00
Marko Viitanen
757772e8c4
[inter] Disable AMVR by default
...
* Can be used to reduce cost on signalling motion vectors later
2021-11-16 17:38:38 +02:00
Marko Viitanen
d4902cc840
[inter] Implement Adaptive Motion Vector Resolution bits, the resolution still in fullpel
2021-11-16 17:18:29 +02:00
Ari Lemmetti
e3aadd4272
Add missing things after rebase
2021-11-15 21:48:43 +02:00
Marko Viitanen
a91c9bd095
Fix sse41 ver_sad_arbitrary() reading over the boundary and disable ver_sad_w12(), since it always reads 16 bytes
...
* This fixes valgrind complaining about using uninitialised values
2021-11-15 12:42:29 +02:00
Marko Viitanen
9e0491ee79
[inter] Remove the deprecated B-priority list merge candidate selection
2021-11-10 15:56:54 +02:00
Marko Viitanen
f6011cf850
[inter] Fix inter_pred_idc signalling
2021-11-10 12:54:33 +02:00
Marko Viitanen
1656202dbc
[inter] Fix ref pic list signalling with GOP
2021-11-10 12:54:04 +02:00
Marko Viitanen
901bf561ff
[inter] Fix HMVP mv candidate derivation for more than one reference frame
2021-11-10 09:33:12 +02:00
Marko Viitanen
907fa6a36a
[inter] Fix how references are signalled for PU
2021-11-09 09:05:45 +02:00
Ari Lemmetti
146298a0df
New AVX2 block averaging *WIP* missing small chroma block and SMP/AMP
2021-11-08 23:01:13 +02:00
Ari Lemmetti
ef69c65c58
New bipred average functions
2021-11-08 23:01:12 +02:00
Ari Lemmetti
f47bd5d86f
Rename some bipred functions
2021-11-08 23:01:12 +02:00
Ari Lemmetti
b52a930bed
About working with generics
2021-11-08 23:01:12 +02:00
Ari Lemmetti
e7857cbb24
Remove avx2 blending
2021-11-08 22:45:45 +02:00
Marko Viitanen
4a42b5cbc4
[cleanup] Remove HMVP debug code and extra arrays in intra coding
2021-11-08 10:11:17 +02:00
Marko Viitanen
c9d8412682
[inter] use Merge regions to limit the merge candidates
2021-11-08 09:51:23 +02:00
Marko Viitanen
6944437e98
Disable top-right CU copy to LCU when WPP is used, since it's not available
2021-11-08 09:43:53 +02:00
Marko Viitanen
aea4e349f5
[inter] Implement HMVP LUT updates during the search
2021-11-05 13:13:11 +02:00
Marko Viitanen
30d97d9af6
[inter] Implement pairwise-average candidates for merge candidates
...
- Half-pel candidates are skipped for now because it needs some special handling
2021-11-01 13:24:23 +02:00
Marko Viitanen
4a7e4e3e20
[inter] Add HMVP to merge candidate list
2021-10-29 14:19:20 +03:00
Marko Viitanen
41c1b6172c
[inter] Fix picture headers for P/B slices and disable some features in tests
2021-10-29 10:30:12 +03:00
Marko Viitanen
73c4128100
[quant] Map scalinglistType correctly
2021-10-29 09:10:15 +03:00
Marko Viitanen
492d22e8be
Disable interpolation AVX2 optimizations for now
2021-10-29 08:43:52 +03:00
Marko Viitanen
852da3c4f0
[inter] Fix overflow in HMVP shifting
2021-10-29 08:36:34 +03:00
Marko Viitanen
e2bdf02acc
[inter] fix merge_candidates_t initialization
2021-10-26 11:50:32 +03:00
Marko Viitanen
b0e6ab9f96
[inter] MVP candidate order fix and limit b0 with wpp
2021-10-25 22:57:58 +03:00
Marko Viitanen
112ce66259
[inter] Disable merge and skip modes -> inter working
2021-10-25 11:26:07 +03:00
Marko Viitanen
08766c0bb3
[inter] Fix max-merge usage
2021-10-25 11:25:23 +03:00
Marko Viitanen
899c672ed1
Make sure the dpb is more than max_num_reorder_pics
2021-10-19 10:16:04 +03:00
Ari Lemmetti
d4880be6f2
Compute proper count of buffered frames for vps and sps. Use common function.
2021-10-19 02:34:32 +03:00
Marko Viitanen
cc22233117
Change version to v2.1.0
2021-10-13 15:24:01 +03:00
Marko Viitanen
57883369ca
Change all the license texts in source headers and LICENSE file to 3-clause BSD, closes #302
...
* All now have the same exact text string
2021-10-13 15:22:46 +03:00
Marko Viitanen
b68625b869
Add correct reorder and buffering values to VPS, as they were only in SPS
2021-10-13 10:54:35 +03:00
Marko Viitanen
7918628b8e
Offset output dts by -1 when num_out >= gop_len, otherwise there will be a gap of 2 dts. Fixes #310
2021-10-11 11:18:58 +03:00
Marko Viitanen
7a5eb7712b
Fix merge candidate derivation order
2021-10-08 16:34:02 +03:00
Marko Viitanen
a39bc69482
Move HMVP arrays to more suitable place
2021-10-08 16:33:32 +03:00
Marko Viitanen
f68ba68fb2
Push HMVP item also when coding a skipped cu
2021-10-08 16:29:15 +03:00
Marko Viitanen
b8ba814909
Fix mv cand selection from HMVP -> working if no merging
2021-10-08 16:29:15 +03:00
Marko Viitanen
76a7294e35
Implement HMVP look-up-table functions
2021-10-08 16:29:14 +03:00
Marko Viitanen
c4dcabe95b
Add config parameter "parallel_merge_level" and array for hmvp
2021-10-08 16:28:54 +03:00
Marko Viitanen
cb9f9381c3
[inter] Fix inter reconstruction, correct function was in wrong branch
2021-10-08 14:48:49 +03:00
Marko Viitanen
78363ccab0
Replace bitstream->simulation with cabac->only_count
2021-09-14 17:44:56 +03:00
Marko Viitanen
19ff5a21ca
[alf] Fix a problem with alf and not updating the cabac contexts
...
* Added a bitstream coding simulation after LCU search
2021-09-14 10:03:23 +03:00
Marko Viitanen
3bbb3b7e36
[thread] Add correct wavefront dependencies when ALF is used
2021-09-13 21:34:14 +03:00
Marko Viitanen
aa36c1e86b
[thread] change wavefront dependencies to depend on bitstream writing instead of recon
...
* Possible fix for non-deterministic behaviour
2021-09-13 20:37:31 +03:00
Marko Viitanen
5271659f76
[inter] write ref pic list to the bitstream
2021-09-08 13:50:35 +03:00
Ari Lemmetti
171b9c60b3
[SIMD] Convert planar and DC mode PDPC loops to AVX2
2021-09-08 03:40:38 +03:00
Ari Lemmetti
ad35d4a4c8
[SIMD] Loop transformation, prepare data for latter loop
2021-09-06 22:38:37 +03:00
Ari Lemmetti
22da8cfe65
[SIMD] Loop transformations for SIMD processing
2021-09-06 22:30:36 +03:00
Ari Lemmetti
c195d906d3
[SIMD] Copy generic implementation of planar/DC PDPC as a skeleton
2021-09-06 21:20:51 +03:00
Ari Lemmetti
c6b33c7b92
[SIMD] Move PDPC condition out of strategy
2021-09-06 21:20:51 +03:00
Ari Lemmetti
46cf9b6871
[SIMD] Make strategy out of PDPC for planar and DC
2021-09-06 21:20:51 +03:00
Ari Lemmetti
816e7a5a91
[SIMD] Replace PDPC remainder loop with masking operations
2021-09-06 21:20:51 +03:00
Ari Lemmetti
1926b4cc27
[SIMD] Initial AVX2 code for transpose in angular prediction
2021-09-06 21:20:50 +03:00
Ari Lemmetti
913573baca
[SIMD] Initial AVX2 code for PDPC in angular prediction
2021-09-06 21:20:50 +03:00
Ari Lemmetti
7ccd1a571c
[SIMD] Initial AVX2 code for 4-tap filtering in angular prediction.
2021-09-06 21:20:50 +03:00
Ari Lemmetti
20f0ff976d
[SIMD] Transform angular pred loops for SIMD processing.
2021-09-06 21:20:49 +03:00
Ari Lemmetti
3dfe09e850
[SIMD] Copy generic implementation of angular prediction as a skeleton.
2021-09-06 21:20:46 +03:00
Joose Sainio
450cbd356c
Merge branch 'joint_cbcr' into 'master'
...
[jccr] Add joint coding of chroma residual
See merge request cs/ultravideo/vvc/uvg266!6
2021-09-06 11:43:06 +03:00
Joose Sainio
91374e95a9
[MTS] Move chroma outside of mts search
2021-09-06 11:28:33 +03:00
Joose Sainio
276f0bf006
[jccr] fix undefined behaviour that did not really affect anything
2021-09-06 11:28:33 +03:00
Joose Sainio
3a73abd264
[jccr] disable jccr for blocks when tr-depth != depth, i.e. 64×64
2021-09-06 11:28:32 +03:00
Joose Sainio
0592cc65a0
[jccr] enable rdoq with jccr
2021-09-06 11:28:20 +03:00
Joose Sainio
072b84711a
[jccr] fix 64×64 CUs
2021-09-06 11:28:20 +03:00
Joose Sainio
29d86aea84
[jccr] cmdline option
2021-09-06 11:28:08 +03:00
Joose Sainio
042b5078d8
[jccr] WIP initial implementation
...
Add some kind of search for joint chroma residual coding.
Bitstream is currently correct but prediction is incorrect because the jccr
is actually not used in the search.
Hard coded to be enabled
2021-09-06 11:28:08 +03:00
Marko Viitanen
839b9527af
Fix nal unit debug printing when VERBOSE is used
2021-09-01 14:28:07 +03:00
Marko Viitanen
26f18865f7
[alf] Change the processing in alf_get_blk_stats_avx2() to allow utilizing the whole 256bit register
2021-08-27 13:40:28 +03:00
Marko Viitanen
fdf125f406
[alf] Fix incorrect conversion in alf_get_blk_stats_avx2
2021-08-27 10:25:20 +03:00
Marko Viitanen
6714973264
[alf] Change _mm_store_si128 to _mm_storeu_si128 in alf_get_blk_stats_avx2()
2021-08-26 18:05:06 +03:00
Marko Viitanen
5df8add046
[alf] Change order of alf_covariance.y array for better AVX2 optimization in alf_get_blk_stats_avx2()
2021-08-26 15:37:01 +03:00
Marko Viitanen
be9527cf1d
[alf] Change the order of alf_covariance.ee values to get better optimized solution for alf_get_blk_stats_avx2()
2021-08-26 11:07:13 +03:00
Marko Viitanen
f4de5cfd0f
[alf] Cleanup alf_calc_covariance_avx2() and use integers in alf_get_blk_stats_avx2()
2021-08-26 10:20:57 +03:00
Marko Viitanen
915bf3ca24
[alf] Fix AVX2 priority
2021-08-25 20:29:58 +03:00
Marko Viitanen
8ef3e6a126
[alf] Add strategy for alf_get_blk_stats() and an initial AVX2 version
2021-08-25 20:22:24 +03:00
Marko Viitanen
f61b9138cd
[alf] Import SSE4.1 optimized 5x5 and 7x7 filters from VTM13
...
* Modified to work with 8-bit pixels
2021-08-25 11:50:37 +03:00
Marko Viitanen
dc6a29b0d8
[alf] Initial generic strategies for 5x5 and 7x7 filtering
2021-08-25 10:50:00 +03:00
Marko Viitanen
c3c96d69c2
[alf] Add modified alf_derive_classification_blk_sse41() from VTM 13.0
...
* Modified to work with bitdepth 8
2021-08-20 11:45:02 +03:00
Marko Viitanen
b158d05bca
[alf] rename strategy function to include prefix
2021-08-19 17:19:17 +03:00
Marko Viitanen
3efaeede76
[alf] Define the strategy for alf_derive_classification_blk()
2021-08-19 17:04:35 +03:00
Marko Viitanen
dee8a167e4
Clean up entropy tables and some unused code / comments
2021-08-17 10:31:14 +03:00
Marko Viitanen
9e9a8058c5
[alf] Allocate alf covariance and classifier memory only when needed
2021-08-15 10:44:05 +03:00
Marko Viitanen
2007132937
[alf] Make the alf structs a bit more memory efficient
2021-08-15 10:44:04 +03:00
Marko Viitanen
d742f57779
Remove angular_pred_avx2 so we don't need extra parameter
2021-08-15 10:43:48 +03:00
Marko Viitanen
ef287ee00c
[alf] Add math.h header to alf.c for sqrt()
2021-08-15 10:41:55 +03:00
Marko Viitanen
b5bc981d2a
Add entropy bits back to intra luma mode cost
...
* Makes things better after the entropies were fixed
2021-08-15 08:10:45 +03:00
Marko Viitanen
1e925ec980
[rdoq] fix kvz_ts_rdoq error scale
2021-08-14 22:52:32 +03:00
Marko Viitanen
8fcf5cf55c
[rdoq] Fix a lot of things
...
* Fix entropy table
* fix float entropy
* use dest_coeff instead of coef in ctx_idx_abs calculation
* Calculate new ctx_sig in correct place
2021-08-14 22:12:08 +03:00
Marko Viitanen
b412a96820
[cleanup] Change mentions of Kvazaar to uvg266 in README.md and remove crypto parameters
2021-07-27 10:18:45 +03:00
Marko Viitanen
5604b6f946
[cleanup] remove all crypto related stuff, fix warnings, move estimate.m to tools/
2021-07-27 09:27:51 +03:00
Marko Viitanen
99a2b0384d
[cleanup] remove some warnings
2021-07-26 11:42:19 +03:00
Marko Viitanen
226d7a9f53
[alf] remove alf clipping functions and free tqj_alf_process also when new job is allocated
2021-07-26 11:21:57 +03:00
Marko Viitanen
eb491ecea2
[alf] free state->tqj_alf_process to not leak memory
2021-07-26 10:26:50 +03:00
Marko Viitanen
0f8f422ad6
[alf] use correct lcu index with wpp and use proper cabac context for alf search
2021-07-25 20:19:17 +03:00
Marko Viitanen
90ed51a6ad
[alf] remove encoder_state_worker_encode_lcu since it is not used
2021-07-23 21:58:36 +03:00
Marko Viitanen
070dcc1924
[alf] fix alf_info passing to sub_states
2021-07-23 21:54:52 +03:00
Marko Viitanen
dc6862051e
[alf] Initialize all the alf tables in one place
2021-07-23 21:44:09 +03:00
Marko Viitanen
9e70707fba
[alf] Add new wf_recon_jobs and change search/bitstream writing to use local coeff instead of state->coeff
2021-07-23 10:40:19 +03:00
Marko Viitanen
b538f33838
[alf] add new thread queue job alf_process run before the final bitstream writing
2021-07-22 23:21:00 +03:00
Marko Viitanen
20041740f2
[alf] move parameters to state and fix some static variables causing problems in multithreading
2021-07-22 23:18:56 +03:00
Marko Viitanen
3146f2d17f
[alf] Add job for ALF processing just before writing the bitstream out
2021-07-22 18:46:53 +03:00
Marko Viitanen
c188b1fdf9
[alf] Use correct LCU count
2021-07-22 18:45:33 +03:00
Marko Viitanen
0cad1ac3c9
[mts] Add a comment about idct8/idst7 16x16 being unoptimized
2021-07-21 14:02:23 +03:00
Marko Viitanen
d5ef036d35
[mts] change mts_subset tables back to static
2021-07-21 13:54:59 +03:00
Marko Viitanen
60caf2c378
[mts] fix 32x32 idst/idct
2021-07-21 13:44:25 +03:00
Marko Viitanen
c2cd5fb98e
[mts] replace AVX2 DST7/DCT8 16x16 with unoptimized for now
2021-07-21 13:38:17 +03:00
Marko Viitanen
7e089f518d
[mts] add optimized versions of DCT8 and DST7, inverse not yet working properly
...
* Includes new unit tests for the mts
2021-07-21 11:53:15 +03:00
Marko Viitanen
7f67009511
Fix MD5 calculations from HEVC to VVC way
2021-06-24 15:03:29 +03:00
Marko Viitanen
c9e48f253d
Fix hash message with monochrome
2021-06-24 14:48:48 +03:00
Marko Viitanen
1d436844da
Remove duplicated code from kvz_rdoq
2021-06-24 13:20:02 +03:00
Marko Viitanen
ca0c357268
[rdoq] Fix chroma bit calculations to include >>3 for width and height
2021-06-24 13:19:20 +03:00
Marko Viitanen
c004735821
[LMCS] Fix casting of the chroma scaled residual
2021-06-18 09:35:06 +03:00
Marko Viitanen
b22fd61c7f
[intrapred] Change kvz_luma_mode_bits to make it return more correct costs
2021-06-18 09:35:06 +03:00
Joose Sainio
cfffd7166c
Use correct context for calculating coeff costs for transform skip
2021-06-07 13:06:03 +03:00
Marko Viitanen
4594bf0ca8
Merge branch 'lmcs_chroma'
2021-06-02 15:05:04 +03:00
Marko Viitanen
cc6ff368df
[LMCS] Store calculated chroma scaling values for speedup
2021-06-02 09:33:45 +03:00
Marko Viitanen
5babb14ee7
[LMCS] Use chroma scaling
2021-06-01 12:17:03 +03:00
Marko Viitanen
fad11a5c92
[LMCS] Import LMCS chroma functions from VTM13.0
2021-06-01 09:01:55 +03:00
Joose Sainio
f9de8ebc4f
Merge branch 'master' into '4x4-rd'
...
# Conflicts:
# src/encoder.c
# tests/test_intra.sh
2021-05-28 11:43:55 +00:00
Marko Viitanen
ddea6d73c9
[LMCS] Fix blank references in some cases by selecting between source_lmcs and source in init_lcu_t()
2021-05-28 10:57:25 +03:00
Marko Viitanen
96a12d9830
Disable SPS extension writing if they are not used -> compatible with VTM 11 and 13
2021-05-28 10:17:19 +03:00
Marko Viitanen
1bbe1204e4
[LMCS] set ph_lmcs_enabled_flag according to the sliceReshaperEnableFlag
2021-05-27 16:09:34 +03:00
Marko Viitanen
4ea9bee0b6
Add rrc_rice extension flags to make bitstream correct with VTM 13.0 and update the CI VTM binary
2021-05-27 11:37:07 +03:00
Marko Viitanen
5aa04035d8
[LMCS] Fix a bug where floor_log2 function is used with 0 value
2021-05-27 08:39:58 +03:00
Joose Sainio
2df94f6a17
Fix rd=3
2021-05-27 08:39:41 +03:00
Marko Viitanen
9231ed4869
[LMCS] Update kvz_lmcs_preanalyzer inter side from VTM
2021-05-26 18:01:57 +03:00
Marko Viitanen
d040a4238c
[LMCS] Allocate LMCS images with the config flag since the actual enabled flag is checked later
2021-05-26 17:16:45 +03:00
Marko Viitanen
bb12894575
[LMCS] Always allocate the LMCS APS struct to simplify things
2021-05-26 17:01:19 +03:00
Marko Viitanen
a5ff9284a8
[LMCS] Enable LMCS per slice according to the pre-analyzer
2021-05-26 16:48:57 +03:00
Marko Viitanen
be9776e40f
Fix a bug causing tmvp related flag being written on intra frames
2021-05-26 14:31:34 +03:00
Marko Viitanen
e9044bfbc5
[LMCS] free source_lmcs and rec_lmcs in encoder_state_encode (as done with normal source and rec)
2021-05-25 17:42:34 +03:00
Marko Viitanen
3dae3f072e
[LMCS] Actually allocate the source_lmcs and rec_lmcs
2021-05-25 14:27:21 +03:00
Marko Viitanen
e5684b0be1
[LMCS] Free rec_lmcs and source_lmcs in kvz_encoder_prepare
2021-05-25 14:04:06 +03:00
Marko Viitanen
252d5c7eaf
[LMCS] Add top-level indicator for LMCS to know when we can free the images
2021-05-25 11:00:46 +03:00
Marko Viitanen
c69d456040
[LMCS] Fix memory leak and remove debug printing
2021-05-24 22:23:45 +03:00
Marko Viitanen
dbc7fd48bf
[LMCS] Initialize some m_reshapeCW values to avoid division by zero
2021-05-24 18:57:37 +03:00
Marko Viitanen
73ac3b68bf
[LMCS] add missing header in quant-avx2.c
2021-05-24 17:25:38 +03:00
Marko Viitanen
4cd5bc38a1
[LMCS] Luma mapping working after some rework, have to keep the reconstruction in the mapped domain
2021-05-24 17:23:17 +03:00
Marko Viitanen
88bec75306
[LMCS] keep the original reference data intact and keep lcu.rec in LMCS domain
2021-05-20 16:40:49 +03:00
Marko Viitanen
9b986c5359
[LMCS] fix division by zero
2021-05-20 16:38:46 +03:00
Marko Viitanen
3516972237
[LMCS] Move LMCS mapping / inverse to the source LCU data
2021-05-18 21:22:22 +03:00
Marko Viitanen
c6746b709c
[LMCS] Use calloc for lmcs_aps, makes it behave deterministic
2021-05-18 16:27:07 +03:00
Joose Sainio
cfd7d2666b
slightly optimize intra-generic.c
2021-05-14 10:23:37 +03:00
Marko Viitanen
178d62bde3
[LMCS] Move LMCS data structures under the frame
2021-05-12 11:42:34 +03:00
Joose Sainio
34fddeb85d
Re-enable LUMA_MULT and CHROMA_MULT
2021-05-07 14:20:48 +03:00
Joose Sainio
132a8b3d96
Try to fix rd=0 for 4x4 blocks
2021-05-07 09:30:12 +03:00
Marko Viitanen
f36c4e71ed
[LMCS] Fix source_lmcs and rec_lmcs deallocation
2021-05-06 13:15:39 +03:00
Marko Viitanen
d2670ccdc8
[LMCS] Create separate pictures for LMCS mapped pixels
2021-05-05 13:28:39 +03:00
Marko Viitanen
703cb155cb
[LMCS] Disable aps_chroma_present_flag -> decoded with hash mismatch
2021-05-04 16:54:14 +03:00
Marko Viitanen
e2ebfc946a
[LMCS] Free lmcs_aps in correct place
2021-05-04 16:44:05 +03:00
Marko Viitanen
73908b5237
[LMCS] Run the reshaper construction and fix an assert
2021-05-04 15:48:01 +03:00
Marko Viitanen
d5abc3eb17
[LMCS] fix ReshapeCW.binCW array size
2021-05-04 12:17:59 +03:00
Marko Viitanen
19a3274770
[LMCS] Enable initial LMCS processing and APS writing
2021-05-04 12:04:22 +03:00
Joose Sainio
7674e94fd1
[rdoq] transform skip RDOQ
...
Copy the implementation from VTM
2021-05-03 12:52:10 +03:00
Marko Viitanen
69c1c3f4ea
[LMCS] Add kvz_construct_reshaper_lmcs and related functions
2021-05-03 09:13:53 +03:00
Marko Viitanen
3fadd91fb5
[LMCS] Add an assert in deriveReshapeParametersSDR to remove static analyser warning
2021-04-30 16:41:06 +03:00
Marko Viitanen
915057c0c5
[LMCS] Replace some dynamic arrays with static
2021-04-30 16:37:00 +03:00
Marko Viitanen
81ec3c3a1a
[LMCS] Converted kvz_lmcs_preanalyzer and related functions from VTM
2021-04-30 16:25:03 +03:00
Marko Viitanen
291ec70ccd
[LMCS] Convert stats generation function kvz_calc_seq_stats from VTM
2021-04-30 11:38:15 +03:00
Joose Sainio
d2b9893bb7
[transform skip] Fix misunderstanding that caused TS to use QP 52>=
2021-04-30 10:55:23 +03:00
Joose Sainio
a998f3ed74
[transform-skip] Convert the HEVC transform skip to VVC
...
For some reason transform skip uses QP MAX(52, QP) and the coeffs are
no longer shifted
2021-04-30 10:55:23 +03:00
Joose Sainio
7ff904fd9d
[transform-skip] Bitstream generation for transform-skip
2021-04-30 10:54:45 +03:00
Marko Viitanen
38eafbbf78
[LMCS] initial bitstream writing and LMCS structures
2021-04-30 10:04:41 +03:00
Marko Viitanen
3d9d1930d8
[LMCS] Add commandline option to enable LMCS
2021-04-30 09:51:41 +03:00
Joose Sainio
0cc1bf197f
Add monochrome tests and fix monochrome
2021-04-23 13:50:09 +03:00
Joose Sainio
56f163357b
Fix minor mistake in rewriting the history
2021-04-23 11:06:07 +03:00
Joose Sainio
fda73ded4a
Parameterize chroma qp scaling.
2021-04-23 10:57:30 +03:00
Joose Sainio
09b738061c
Fix deblocking
2021-04-23 10:57:30 +03:00
Joose Sainio
4f0ce14e53
Make internal symbols static
2021-04-23 10:57:30 +03:00
Joose Sainio
a12f99b7a3
Fix deblocking for luma
2021-04-23 10:57:29 +03:00
Joose Sainio
2ab005692d
Enable 4x4 intra CUs
2021-04-23 10:57:29 +03:00
Joose Sainio
d5a62c96b0
Properly implement chroma filtering
2021-04-23 10:57:29 +03:00
Joose Sainio
e521a59cd5
Perform deblocking on 4x4 grid instead of 8x8
2021-04-23 10:57:29 +03:00
Joose Sainio
1aaa95601c
Merge remote-tracking branch 'remotes/kvz_github/master' into Fix-monochrome
...
# Conflicts:
# .gitlab-ci.yml
# build/kvazaar_lib/kvazaar_lib.vcxproj.filters
# src/cfg.c
# src/encoder.h
# src/kvazaar.h
# src/rdo.c
2021-04-23 10:56:50 +03:00
Joose Sainio
764d23cdf5
Update entropy tables and correct order
2021-04-23 10:54:11 +03:00
Joose Sainio
119f80054a
Update get_ic_rate
2021-04-23 10:53:20 +03:00
Joose Sainio
15b710f4f6
update calc_last_bits
2021-04-23 10:52:50 +03:00
Joose Sainio
27e46ab7f4
ctx_set was incorrect for second iteration of coefficient level estimation
2021-04-23 10:51:52 +03:00
Joose Sainio
e8eab326fb
Update context selection to match VVC
2021-04-23 10:51:01 +03:00
Joose Sainio
1fd583eae0
go_rice_param calculation fix
2021-04-23 10:49:31 +03:00
Joose Sainio
8049ebb597
Fix header writing for monochrome. WIP: checksum header still incorrect
2021-03-17 13:01:26 +02:00
Joose Sainio
bdcf2210ed
reverse
2021-03-17 08:23:07 +02:00
Joose Sainio
7929c4bfe5
Test c_lambda instead of CHROMA_MULT
2021-03-17 08:22:38 +02:00
Joose Sainio
b2076d3b39
Enable chroma scaling
...
WIP: user defined scaling array
2021-03-16 10:31:26 +02:00
Joose Sainio
412781db41
[scalinglist] Fix quant-generic
2021-03-09 10:42:40 +02:00
Joose Sainio
21bc9aa3c2
[scalinglist] Fix memory leak
2021-03-09 10:04:11 +02:00
Joose Sainio
30e573c261
[scalinglist] WIP: Update scalinglist for VVC
...
Seems to work when rdoq is enabled but not when it is disabled
2021-03-09 09:51:49 +02:00
Ari Lemmetti
dad3d6818e
Only read left and right border pixels if necessary
2021-03-08 22:36:10 +02:00
Ari Lemmetti
b72ab583b4
Handle "don't care" rows in the end separately
2021-03-08 22:36:09 +02:00
Ari Lemmetti
33295bf350
Use AVX2 luma interpolation for SMP and AMP as well
2021-03-08 22:36:09 +02:00
Ari Lemmetti
7ce68761c2
Add a reminder to fix a rare case for bipred
2021-03-08 22:36:09 +02:00
Ari Lemmetti
475f1d79d5
Add some defines for important interpolation related sizes
2021-03-08 22:36:09 +02:00
Ari Lemmetti
4314f3a9a7
Rename some interpolation functions and strategies for consistency
2021-03-08 22:36:08 +02:00
Ari Lemmetti
5a70b49f69
Require 64-bit build for AVX2 interpolation filter functions
2021-03-08 22:36:08 +02:00
Ari Lemmetti
5631651469
Remove unused functions and variables
2021-03-08 22:36:08 +02:00
Ari Lemmetti
d8e7aac380
Do not use nonstandard extension for struct initialization.
2021-03-08 22:36:07 +02:00
Ari Lemmetti
e38219e489
Fix epol_func signature and function definition
2021-03-08 22:36:07 +02:00
Ari Lemmetti
7e6ba9750f
Add new AVX2 ip filters for chroma
2021-03-08 22:36:07 +02:00
Ari Lemmetti
3476fc62c7
Fix parameter to signed
2021-03-08 22:36:06 +02:00
Ari Lemmetti
e572066e46
Add new AVX2 vertical ip filter for pixel precision
2021-03-08 22:36:06 +02:00
Ari Lemmetti
9e4b62a891
Use the new horizontal filter for pixel precision as well
2021-03-08 22:36:06 +02:00
Ari Lemmetti
2175023843
Relocate function
2021-03-08 22:36:06 +02:00
Ari Lemmetti
f5b0e3c52b
Add new AVX2 horizontal ip filter capable of every luma PB
2021-03-08 22:36:05 +02:00
Ari Lemmetti
d9a3225ae5
Add new AVX2 vertical ip filter for high-precision
2021-03-08 22:36:05 +02:00
Ari Lemmetti
84222cf3e7
Replace old block extrapolation with more capable one.
...
Separate paddings for different directions can be now specified.
2021-03-08 22:36:04 +02:00
Jaakko Laitinen
845902062c
Fix warning and limit intra qp offset to -3
2021-03-04 18:08:59 +02:00
Marko Viitanen
29dee4e32a
[rdoq] implement more parts of rdoq like in VTM related to reg_bins value usage
2021-02-26 22:11:47 +02:00
Marko Viitanen
7dcf00d536
[rdoq] Change kvz_get_coeff_cost() to match current VTM
2021-02-26 20:43:33 +02:00
Marko Viitanen
467a3d97cc
[rdoq] Update contexts to use correct chroma model
2021-02-26 20:26:08 +02:00
Marko Viitanen
6544c25daf
[rdoq] improve the cost calculations and clean up unused code
2021-02-26 20:23:06 +02:00
Marko Viitanen
d6379c02e0
[rdoq] implement kvz_get_ic_rate correct bit values
2021-02-26 20:23:06 +02:00
Marko Viitanen
3c75500cd4
Fix PSNR calculation, broken after the introduction of frame padding
2021-02-26 20:20:51 +02:00
Marko Viitanen
c6baa8ad62
[rdoq] rename some contexts and add gt2 context template, change kvz_context_get_sig_coeff_group width -> cg_width
...
* RDOQ is not working as it should, but no longer tries to access incorrect memory locations
2021-02-25 13:41:47 +02:00
siivonek
bf0bf73665
Fix mistake in define.
2021-02-16 20:21:33 +02:00
siivonek
6f455f29cc
Add MINGW64 to define. Try to fix tsan test path error to suppressions.txt.
2021-02-16 15:44:18 +02:00
siivonek
9a65617a34
Disable thread exit call in encmain when MINGW is used. This should fix the issue with media auto-build suite.
2021-02-15 14:47:18 +02:00
Marko Viitanen
e05dcdb193
Enable sign hiding in quant_avx2 and fix a bug in kvz_encode_coeff_nxn_generic()
2021-02-12 16:40:28 +02:00
Marko Viitanen
113b94f5e1
Add sh_sign_data_hiding_used_flag to slice header
2021-02-12 14:19:56 +02:00
Marko Viitanen
79c36f6aeb
Enable RDOQ and sign hiding
2021-02-12 13:24:02 +02:00
Arttu Makinen
7098a94a6f
Implemented implicit MTS.
...
Added selection of implicit MTS to command parameters.
Updated the transform selection to support implicit MTS.
2021-02-11 15:11:15 +02:00
Arttu Mäkinen
8f34685a8f
Merge branch 'master' into 'mts'
...
# Conflicts:
# src/cfg.c
# src/kvazaar.h
2021-02-10 13:05:18 +02:00
Arttu Makinen
c5570abe1b
Removed 'emt' variable from cu_info_t and changed 'emt' globally to 'mts' for consistency.
2021-02-10 12:08:05 +02:00
Arttu Makinen
d0b7dd95f7
MTS works on intra mode.
...
Fixed usage of MTS constraints.
Fixed DCT8 transforms.
Added sorting function of MTS modes with intra modes and costs to search.c.
2021-02-10 11:01:58 +02:00
Arttu Makinen
2e7c342645
Implemented DCT2, DST7, and DCT8 transforms, and search for selecting transform for MTS. Using MTS results in a mismatch for the luma component.
2021-02-02 11:09:43 +02:00
Marko Viitanen
c6b3065e7c
Merge branch 'deblocking_fix' into 'master'
...
Deblocking fix
See merge request cs/ultravideo/vvc/uvg266!1
2021-01-26 14:18:34 +02:00
Arttu Makinen
b9c3336f0e
MTS bitstream encoding added for intra. Work with depths 0-3.
2021-01-18 20:44:36 +02:00
Jaakko Laitinen
1c6bef2514
Fix luma large block deblocking bug
2021-01-14 17:22:12 +02:00
Arttu Makinen
65cbee85d7
Fix for sad_tests. Forced intra mode removed. Define for frame padding added.
2021-01-14 14:30:50 +02:00
Jaakko Laitinen
f19c049db7
Fix luma hor edge rightmost pixel filtering
2021-01-13 18:04:56 +02:00
Pauli Oikkonen
fcc2c1fa7b
return-type error does not know that you don't return from assert(0)
2021-01-12 13:28:55 +02:00
Pauli Oikkonen
fa8cfb92e8
Maybe this would work with VC++
...
Our threadwrapper does not support PTHREAD_MUTEX_INITIALIZER, apparently
that's a toughie to implement on Windows or something, dunno. Use
dynamic initialization instead, then.
2021-01-11 18:22:53 +02:00
Pauli Oikkonen
20758a77e3
document fastrd measurement tools
2021-01-11 18:22:53 +02:00
Pauli Oikkonen
0e07308ea5
new weights
2021-01-11 18:22:53 +02:00
Pauli Oikkonen
5827ecc5a6
this little piggy wasn't on board, obviously..
2021-01-11 18:22:53 +02:00
Pauli Oikkonen
643e70d4ca
also move the readme file :^)
2021-01-11 18:22:53 +02:00
Pauli Oikkonen
1c1807f80b
move rdcost stuff into a separate directory
2021-01-11 18:22:53 +02:00
Pauli Oikkonen
a37095b061
new weights using new scripts
2021-01-11 18:22:53 +02:00
Pauli Oikkonen
17bedc9751
script to average out results by qp over sequences
2021-01-11 18:22:53 +02:00
Pauli Oikkonen
ab13018b7c
tidy it up
2021-01-11 18:22:53 +02:00
Pauli Oikkonen
8aa9a29e24
what if this were to work now
2021-01-11 18:22:53 +02:00
Pauli Oikkonen
4deed04eb9
you know what, fread returns number of elements, not bytes
2021-01-11 18:22:53 +02:00
Pauli Oikkonen
c89477bb41
Ditto for 2nd part of least squares
2021-01-11 18:22:52 +02:00
Pauli Oikkonen
3dd4f0e00b
Process some fault conditions in filter_rdcosts
2021-01-11 18:22:52 +02:00
Pauli Oikkonen
98a082cdcd
last fixes to extract_rdcosts
2021-01-11 18:22:52 +02:00
Pauli Oikkonen
b26e9c68c8
extract rdcosts works with the block qp fix
2021-01-11 18:22:52 +02:00
Pauli Oikkonen
40ae353820
Fix RD sampling to take the block QP into account
2021-01-11 18:22:52 +02:00
Pauli Oikkonen
03087fb44c
Fix RDO sampling to work thru a CLI parameter, implement accuracy check
...
TODO: write into encoder->fastrd_learning_outfile instead of stdout.
It's a toughie tho, because fwrite takes in FILE* instead of const FILE*
but the encoder_control_t is passed as a const.
2021-01-11 18:22:52 +02:00
Pauli Oikkonen
33dd9c95cd
Tool to extract RDO bitrates
2021-01-11 18:22:52 +02:00
Arttu Makinen
46ed459790
Removed functions from alf.h that are not used outside of alf.c. Rearranged functions in alf.c.
2021-01-11 10:42:44 +02:00
Arttu Makinen
1ae1d7e630
Cast couple more ALF functions to static.
2021-01-08 12:10:40 +02:00
Arttu Makinen
15816125aa
Cast ALF functions to static or set them to have prefix "kvz_".
2021-01-08 12:03:22 +02:00
Jaakko Laitinen
ecdb1c4dce
Fix chroma clip range bug
2021-01-07 16:06:03 +02:00
Jaakko Laitinen
88b837c4f0
Fix more chroma deblocking issues
2021-01-06 19:06:14 +02:00
Arttu Makinen
2786e8f0e2
Fix of problems that appeared with rebase.
2021-01-05 11:43:15 +02:00
Jaakko Laitinen
b96753961c
Fix some more chroma bugs
2021-01-02 20:59:55 +02:00
Jaakko Laitinen
c71a0d1e6f
Fix most(?) chroma issues
2021-01-01 20:10:08 +02:00
Jaakko Laitinen
c736837ca7
Fix luma deblocking
2020-12-31 19:23:33 +02:00
Arttu Makinen
e06759eb6e
Fixed a bug of ALF failing when CC-ALF was not enabled. Added ALF to README.md parameters.
2020-12-30 16:27:15 +02:00
Arttu Makinen
75b51c1d27
Bug fixes for division by zero, APS initialization, and a missing "!".
2020-12-30 16:27:07 +02:00
Arttu Makinen
df375a055e
Small changes with VTM version 11.0.
2020-12-30 16:26:59 +02:00
Arttu Makinen
7109313161
Added forgotten memory release.
2020-12-30 16:26:50 +02:00
Arttu Makinen
b17e26511f
Removed/moved the last global variables from ALF.
2020-12-30 16:26:49 +02:00
Arttu Makinen
f5556a5d69
Moved cabac_estimator from globals to alf_info_t.
2020-12-30 16:26:30 +02:00
Arttu Makinen
ffdca81dca
ALF frame buffer moved.
2020-12-30 16:26:22 +02:00
Arttu Makinen
a3998450d0
Most of the remaining globals removed/moved.
2020-12-30 16:26:14 +02:00
Arttu Makinen
35233d2e17
Multiple global arrays placed in a struct of arrays.
...
Also g_ctb_distortion_unfilter and g_aps_id_start removed.
2020-12-30 16:25:54 +02:00
Arttu Makinen
aed4d29c79
Continuation of removal/moving of ALF globals.
...
Removed/moved globals: g_ctu_enable_flag, g_ctu_alternative, g_ctu_enable_flag_tmp, g_ctu_alternative_tmp.
2020-12-30 16:25:40 +02:00
Arttu Makinen
335ce2bdda
Moving ALF globals to alf_info struct inserted in videoframe_t.
...
g_alf_covariance and g_alf_covariance_frame moved.
2020-12-30 16:25:18 +02:00
Arttu Makinen
76cf8a16d9
Fixed couple of memory problem bugs.
2020-12-30 16:25:01 +02:00
Arttu Makinen
0914864300
Bug fix for reading alf type to cfg.
2020-12-30 16:24:59 +02:00
Arttu Makinen
9d56d6444d
Removed filter shape/type from variables and functions.
...
Filter shape/type size was only used and was always defined as 1.
2020-12-30 16:24:50 +02:00
Arttu Makinen
218d5b51d3
Cleaning ALF code.
2020-12-30 16:24:24 +02:00
Arttu Makinen
420ee4cc21
Changed alf_enabled and alf_cc_enabled flags into one alf_type enum as in sao.
2020-12-30 16:23:56 +02:00
Arttu Makinen
2b62b91589
Added CC ALF parameter for encoding.
2020-12-30 16:22:02 +02:00
Arttu Makinen
0e74bfb2a8
CC ALF now works properly.
2020-12-30 16:22:01 +02:00
Arttu Makinen
fc39b311bd
Added fixing of pixels outside of the actual frame before CC ALF.
2020-12-30 16:22:01 +02:00
Arttu Makinen
99745c2e5a
Added writing of CC ALF flag. Couple of bug fixes.
2020-12-30 16:22:00 +02:00
Arttu Makinen
1471448218
Bug fixes in derive_cc_alf_filter and get_blk_stats_cc_alf.
2020-12-30 16:22:00 +02:00
Arttu Makinen
f7fe8d9a27
Added more CC ALF functions.
...
Currently not working.
2020-12-30 16:21:59 +02:00
Arttu Makinen
9ed5169919
Finished functions get_blk_stats_cc_alf and calc_covariance_cc_alf for CC ALF.
2020-12-30 16:21:29 +02:00
Arttu Makinen
bf8bb62e50
Got rid of fair amount of global variables.
2020-12-30 16:21:28 +02:00
Arttu Makinen
7846796a4e
Removed #define FULL_FRAME.
2020-12-30 16:20:25 +02:00
Arttu Makinen
7bfb1ca6b4
Removal of useless comments.
2020-12-30 16:19:57 +02:00
Arttu Makinen
529bdb4dd2
Modify APS header writing.
2020-12-30 16:19:47 +02:00
Arttu Makinen
ee70bcfaec
Fixing warnings.
2020-12-30 16:19:07 +02:00
Arttu Makinen
d7eafc391f
Fixing uninitialized parameters.
2020-12-30 16:18:24 +02:00
Arttu Makinen
36ffdcaf3f
Disable output of debug stats.
2020-12-30 16:18:09 +02:00
Arttu Makinen
98768061db
Adding CC ALF.
2020-12-30 16:18:08 +02:00
Arttu Makinen
da04fffaec
Updated the creating of ALF parameters and init for them.
2020-12-30 16:17:54 +02:00
Arttu Makinen
bfa77e35c3
Fixed a bug where reconstruction for ALF was called multiple times for no reason.
...
Modified reconstruction of pixels after ALF search.
2020-12-30 16:17:43 +02:00
Arttu Makinen
bd292dab16
Fixed coding of headers for inter coding with ALF.
2020-12-30 16:15:12 +02:00
Arttu Makinen
26dc5b8c4e
Multiple APSs can now be signaled.
...
Can't test usage of multiple APSs properly because inter coding doesn't work.
2020-12-30 16:13:56 +02:00
Arttu Makinen
4ffb0b71a6
Chroma filtering works.
...
Also some code cleaning.
2020-12-30 16:13:25 +02:00
Arttu Makinen
a95fd73668
At least one APS can be signaled.
...
Problem with APS was in encoder_state-bitstream.c.
Cleaning of code.
2020-12-30 16:12:56 +02:00
Arttu Makinen
d7126520b2
Moving param_set_map from slices to cfg.
...
Bug fix in kvz_alf_encoder_ctb.
2020-12-30 16:12:38 +02:00
Arttu Makinen
c55a2a04e8
Bug fix in kvz_alf_encoder.
...
New bugs appeared with this fix.
2020-12-30 16:12:17 +02:00
Arttu Makinen
8aa91f320a
Bug fixes and cleaning.
2020-12-30 16:11:36 +02:00
Arttu Makinen
bfba8d43cb
Working on getting APS to work for ALF.
2020-12-30 16:10:01 +02:00
Arttu Makinen
b3ecc755e2
ALF search is now executed for full frame. Works for only 1 frame.
...
Checksum matches.
APSs are not used currently.
#define FULL_FRAME in alf.h is set to 1 in order to use ALF for full frame.
#define FULL_FRAME 0 produces working bitstream but checksum doesn't match.
2020-12-30 16:08:46 +02:00
Arttu Makinen
94787acb73
Divided encoder_state_worker_encode_lcu -function in encoderstate.c into encoder_state_worker_encode_lcu_search and encoder_state_worker_encode_lcu_bitstream.
...
ALF off. No changes in bitstream.
2020-12-30 16:07:46 +02:00
Arttu Mäkinen
ec62ed89cb
LCUs now have mismatches only on boundaries.
...
Fixed a bug in alf.c line 5451.
Modifications to copying the boundary pixels of CTU.
2020-12-30 16:07:45 +02:00
Arttu Mäkinen
f202aa43fa
WIP Updating VTM8.2 to VTM10.0.
...
Small update to ALF cabac flags.
Minor variable definition updates.
2020-12-30 16:07:44 +02:00
Arttu Mäkinen
bc90b731a5
ALF updated to VTM8.2. Checksum doesn't match.
...
ALF uses currently only ready defined coefficients, not APSs.
Produces a valid bitstream, but checksum doesn't match.
CC ALF is disabled.
2020-12-30 16:06:59 +02:00
Arttu Mäkinen
2f80216514
Some cleaning and updating.
...
Set to use only existing filters rather than signal APS.
2020-12-30 16:02:01 +02:00
Arttu Mäkinen
a430d48669
ALF works now with VTM7.0 as in VTM6.1.
...
VTM properly decodes bitstream from kvazaar but the checksum doesn't match.
Couple hard coded values needed for this in function "kvz_encode_alf_bits".
2020-12-30 15:59:08 +02:00
Arttu Mäkinen
7250f4549b
Merge fixes.
2020-12-30 15:12:32 +02:00
Arttu Mäkinen
21a4751875
Works with VTM decoder with one frame with one hard coded value.
...
APS NAL unit type writing added.
Bug fixes.
WIP.
2020-12-30 15:11:17 +02:00
Arttu Mäkinen
9cad95c94c
Bug fixes.
...
WIP.
2020-12-30 15:09:13 +02:00
Arttu Mäkinen
09c68d9de6
Outputs valid frame with kvazaar. Still problems with cabac when decoding with VTM.
...
Decided to use buffers that were added in last commit.
Some small fixes and adjustments.
WIP.
2020-12-30 15:09:12 +02:00
Arttu Mäkinen
2cac901cca
Testing different kind of buffer for alf image fulldata.
...
WIP
2020-12-30 15:09:12 +02:00
Arttu Mäkinen
feb201986a
Changed to process one CTU at a time rather than all CTUs.
...
WIP
2020-12-30 15:09:11 +02:00
Arttu Mäkinen
b04bb66160
Adjustments and cleaning.
...
WIP
2020-12-30 15:09:10 +02:00
Arttu Mäkinen
c76c445142
Cabac/ctx calculation added.
...
Bug fixing and adjusting.
WIP
2020-12-30 14:32:01 +02:00
Arttu Makinen
ade4fc4061
Update of contexts of ALF.
...
WIP
2020-12-30 14:32:00 +02:00
Arttu Makinen
ebb99a7223
Changed 'width's to 'stride's, because added more pixels to 'fulldata'.
...
Also some small fixes and changes.
Checksum correct in luma.
WIP
2020-12-30 14:30:47 +02:00
Arttu Makinen
377aa989ab
Updated to VTM6.1.
...
Done according to all #ifs enabled
2020-12-30 14:27:15 +02:00
Arttu Makinen
0fbbf1a7e2
Small fixes/adjustments
2020-12-30 14:25:58 +02:00
Arttu Makinen
98a8e78e93
avx2/encode_coding_tree-avx2.c update, because it caused errors
2020-12-30 14:25:16 +02:00
Arttu Makinen
ed76650fa5
Updating to VTM6.0
2020-12-30 14:25:09 +02:00
Arttu Makinen
a24f49c286
Doesn't crash anymore during debug. Added new allocator for fulldata in kvz_picture.
2020-12-30 14:24:16 +02:00
Arttu Makinen
2b7a8af23a
Crashes now in kvz_image_free.
2020-12-30 14:22:38 +02:00
Arttu Makinen
05495bb555
Not working. All the functions done.
...
Heap corruption occur during debugging.
2020-12-30 14:22:30 +02:00
Arttu Mäkinen
236224dbb9
Broken version with header mismatch
2020-12-30 14:07:34 +02:00
Arttu Mäkinen
06233b5d3b
added alf parameter to cli
2020-12-30 14:02:58 +02:00
Jaakko Laitinen
71751c3770
Fix max filter size derivation
2020-12-29 17:57:35 +02:00
Jaakko Laitinen
6a8d73252a
Fix runtime errors
2020-12-28 16:41:00 +02:00
Jaakko Laitinen
85be89a85c
Fix compilation errors
2020-12-28 15:20:30 +02:00
Jaakko Laitinen
95ff22f0db
Finish max filter length fixes
2020-12-28 14:26:36 +02:00
Jaakko Laitinen
13e605153a
Fix bugs
2020-12-22 19:11:47 +02:00
Jaakko Laitinen
50e9acd3f4
Add max filter length derivation
2020-12-21 18:47:02 +02:00
Arttu Makinen
bc8507cc8d
MTS context.
2020-12-18 18:35:11 +02:00
Arttu Makinen
fd2f73b460
MTS headers and commands.
2020-12-18 17:40:47 +02:00
Jaakko Laitinen
7a71b700fb
Add chroma deblock filtering
2020-12-18 11:06:41 +02:00
Marko Viitanen
0c5e1db0fa
Fix wpp chroma bug
2020-12-15 22:59:22 +02:00
Marko Viitanen
071fe7fd51
Limit the top-right intra references when wpp is turned on
...
Chroma hash still fails.
2020-12-15 22:33:32 +02:00
Marko Viitanen
6146610ec8
Fix the wpp sync point to be the first LCU
2020-12-15 14:51:46 +02:00
Jaakko Laitinen
78be0ccd05
Fix chroma deblocking logic
2020-12-15 14:10:09 +02:00
Marko Viitanen
c07a56179f
Fix Hash SEI message for VTM11.0
2020-12-15 13:47:28 +02:00
Arttu Makinen
30c4065dc0
Headers for threading.
2020-12-15 13:04:39 +02:00
Jaakko Laitinen
6128db961a
Finish up large block filtering
2020-12-11 19:34:56 +02:00
Jaakko Laitinen
976d1c8812
Start implementing large block filtering
2020-12-10 18:03:18 +02:00
Jaakko Laitinen
33cea17484
Add logic for large block filtering
2020-12-09 19:10:38 +02:00
Jaakko Laitinen
d3d55933b2
Finish up strong filtering condition check
2020-12-08 18:38:05 +02:00
siivonek
e833354cdd
Merge branch 10-bit-assert-fix
2020-12-07 20:36:50 +02:00
Jaakko Laitinen
5a90deb678
Add initial max filter length and large block stuff
2020-12-07 18:54:43 +02:00
Jaakko Laitinen
03dade8246
Prepare for large blocks
2020-12-04 18:31:48 +02:00
Jaakko Laitinen
7b0b864947
Fix mvd thresholds and tc/beta index calculations
2020-12-04 15:54:40 +02:00
Jaakko Laitinen
8f3de705eb
Add todo list of things to check
2020-12-01 13:53:52 +02:00
Pauli Oikkonen
be19fd996b
Add default value for fast coeff table filename
...
..oops
2020-11-02 14:02:51 +02:00
Pauli Oikkonen
46301e9857
Document the --fast-coeff-table option
2020-10-29 15:23:26 +02:00
Pauli Oikkonen
816789c9f4
Allow fast coeff weights to be read from a file
2020-10-29 15:22:51 +02:00
Pauli Oikkonen
6799019db0
Move fast coeff table to transform.h
...
Guess this is a more logical place for it
2020-10-29 15:20:27 +02:00
Pauli Oikkonen
4712ce5f59
Round the fast coeff result instead of flooring
2020-10-29 15:20:27 +02:00
Pauli Oikkonen
0fb09c9920
New filtered coeff weight by QP values
2020-10-29 15:20:27 +02:00
Pauli Oikkonen
9bf0cb27b1
Constrain fast cost estimation to QPs we have weights for
2020-10-29 15:20:27 +02:00
Pauli Oikkonen
24d487f553
New weights for 12 <= QP <= 42
...
Trained using MSU ultrafast settings now
2020-10-29 15:20:27 +02:00
Pauli Oikkonen
3e1c6d84b8
Fix issues in fast coeff estimation
...
Allow weight table to start from nonzero QP, and round weights to Q8.8
instead of flooring them
2020-10-29 15:20:27 +02:00
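The difference between flooring and rounding a weight to Q8.8 can be sketched as follows. This is an illustration only: the function names are hypothetical, and it assumes non-negative weights below 256 so the result fits in 16 bits.

```c
#include <stdint.h>

#define Q8_8_ONE 256  /* 1.0 expressed in Q8.8 (8 fractional bits) */

/* Truncating conversion: always rounds toward zero. */
uint16_t weight_to_q8_8_floor(double w)
{
  return (uint16_t)(w * Q8_8_ONE);
}

/* Round-to-nearest conversion: adds half an LSB before truncating. */
uint16_t weight_to_q8_8_round(double w)
{
  return (uint16_t)(w * Q8_8_ONE + 0.5);
}
```

For a weight of 1.999, the flooring variant yields 511 while the rounding variant yields 512, so flooring introduces a consistent downward bias of up to one LSB.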
Pauli Oikkonen
5f91bda762
Use newer data for fast coeff cost estimation
...
Same training dataset, but this time only buckets 0...3 were used to
approximate the function, no sign/cg width bucket.
2020-10-29 15:20:27 +02:00
Pauli Oikkonen
2abd733199
Use unsigned min() to correctly clip -32768
...
If a coeff happens to be -32768 (0x8000), its 16-bit abs() is also
0x8000. It should ultimately be clipped to 3, so interpret absolute
values as unsigned instead to make that happen.
2020-10-29 15:20:27 +02:00
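The corner case described above can be reproduced in scalar C. The sketch below is not the AVX2 code from the commit; the helper names are hypothetical and merely emulate a 16-bit lane whose absolute value wraps to 0x8000.

```c
#include <stdint.h>

/* 16-bit absolute value as a 16-bit lane would hold it:
 * |-32768| does not fit in int16_t, so the bit pattern stays 0x8000. */
uint16_t abs16_wrapping(int16_t x)
{
  int32_t wide = x < 0 ? -(int32_t)x : (int32_t)x;
  return (uint16_t)wide;  /* 32768 becomes 0x8000 */
}

/* min(a, 3) with the 16-bit pattern interpreted as signed:
 * 0x8000 is read as -32768 and wins the comparison. */
int32_t clip3_signed(uint16_t a)
{
  int32_t as_signed = a >= 0x8000u ? (int32_t)a - 0x10000 : (int32_t)a;
  return as_signed < 3 ? as_signed : 3;
}

/* min(a, 3) with the pattern interpreted as unsigned:
 * 0x8000 is read as 32768 and is clipped to 3 as intended. */
uint16_t clip3_unsigned(uint16_t a)
{
  return a < 3 ? a : 3;
}
```

With an input of -32768, the signed comparison returns -32768 while the unsigned comparison returns 3, which matches the behaviour the commit message asks for.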
Pauli Oikkonen
b93b90c0d7
Implement new fast coeff cost estimator in AVX2
2020-10-29 15:20:27 +02:00
Pauli Oikkonen
2f74a112b3
Try first lookup table based fast coeff estimation
2020-10-29 15:20:27 +02:00
Marko Viitanen
2db3a07b14
Prevent cu_sig_model_chroma array from being indexed over the limit
2020-10-13 14:14:57 +03:00
Marko Viitanen
f4948dda6f
Fix array size for bdpcm_mode[]
2020-10-13 12:51:20 +03:00
Marko Viitanen
9e3e8f51f6
Change kvz_g_tc_table_8x8 from uint8_t to uint16_t to fit all the values
2020-10-13 12:05:27 +03:00
Marko Viitanen
26f4f45c6d
Use correct pred_mode cabac models -> fixes inter cabac bits
2020-10-13 12:04:31 +03:00
Marko Viitanen
5a6806cbf7
[CI] Limit testing parameters to those that work
2020-10-09 09:37:15 +03:00
Marko Viitanen
3c7eb55292
Disable output of cabac debug when in "count only" mode
...
- Some code cleanup
2020-10-09 08:45:43 +03:00
Marko Viitanen
fa25621c77
Force certain intra modes off
2020-10-09 08:44:40 +03:00
Marko Viitanen
54b8fd054d
Fix Chroma QP scaling issue
2020-10-02 15:40:23 +03:00
Marko Viitanen
11229997b6
Fix NAL header layer_id
2020-10-01 11:10:40 +03:00
siivonek
bc1206a4d3
Define qp_delta_min & max in global.h instead of calculating them locally.
2020-09-29 13:46:27 +02:00
Marko Viitanen
ac2032eb65
Fixing P/B frame headers and debug output formatting
2020-09-28 14:58:07 +03:00
Marko Viitanen
bddfb47a55
Merge remote-tracking branch 'remotes/kvazaar_github/master'
2020-09-25 11:49:11 +03:00
Marko Viitanen
551a3991cf
Cleanup headers
2020-09-24 09:31:44 +03:00
siivonek
0f3ef786b9
Modify delta QP range assert so it will work with any valid bit depth. Modify VAQ code so it will clip the QP to a proper range which is dependent on bit depth
2020-09-22 20:15:23 +02:00
siivonek
fe6f93a951
Fix delta QP range check assert. Add separate asserts based on bit depth.
2020-09-22 20:15:22 +02:00
Marko Viitanen
449975b0fb
Fixed cubic filter usage in intra angular modes
2020-09-21 14:58:34 +03:00
Joose Sainio
8143ab971c
Merge branch 'stats-files'
...
# Conflicts:
# src/cfg.c
# src/cli.c
# src/kvazaar.h
2020-09-16 09:25:00 +03:00
Joose Sainio
1c06bd7f3d
Fix POC to be correct for all GOPs and Intra periods, fix issue with vaq
2020-09-14 14:25:48 +03:00
Sami Ahovainio
4d87fb2397
fixed potential out of bounds iteration
2020-09-10 12:59:39 +03:00
Sami Ahovainio
5d521a2444
Added option to force yuv as file format and made the options and file endings case insensitive
2020-09-09 16:05:59 +03:00
Joose Sainio
3fb8b7ebc6
Add --stats-file-prefix option
...
When the option is given a prefix, four files (prefixlambda.txt,
prefixqp.txt, prefixdist.txt, and prefixbits.txt) are written with the
corresponding data for each CTU. This is a debug feature.
2020-09-09 12:35:47 +03:00
Sami Ahovainio
84cabd9c20
Fixed sign match
2020-09-07 15:39:31 +03:00
Sami Ahovainio
d691849594
Added frame header reading for both read and seek functions
2020-09-07 15:31:08 +03:00
Sami Ahovainio
cbcee67821
y4m start header parsing ready
2020-09-07 15:31:07 +03:00
Joose Sainio
c10b841e7c
Merge remote-tracking branch 'remotes/origin/fix-sao-parameter' into master
2020-09-07 13:10:36 +03:00
Joose Sainio
da09d49890
Remove optionality from --sao
...
The SAO parameter was optional, so passing an argument required using "=",
which is confusing since this is not required for any other parameter
2020-09-07 12:35:40 +03:00
Pauli Oikkonen
3f7f0d7ed7
Allow bit depth to be defined from the outside
...
For a 10-bit build, just use:
env CFLAGS="-DKVZ_BIT_DEPTH=10" ./configure && make clean && make
2020-09-02 17:55:22 +03:00
Pauli Oikkonen
780da4568a
Exclude 8-bit-only code from 10-bit builds and use uint8_t instead of kvz_pixel for code that assumes 8-bit pixels
2020-09-02 17:46:33 +03:00
Pauli Oikkonen
31ef4e4216
Fix ml functions to accept kvz_pixel*, not uint8_t*
2020-09-02 17:46:33 +03:00
Marko Viitanen
574c4d06ee
Fix use of log2_cg_size in coeff coding -> smaller blocks also decoded correctly
2020-08-27 18:26:16 +03:00
Marko Viitanen
b3f3a9eae6
Add two EOS NAL units at the end of each picture to make intra sequence work
2020-08-25 15:30:21 +03:00
Marko Viitanen
b7638172ca
Use continuous POC for all intra and add aud_irap_or_gdr_au_flag
2020-08-25 11:53:55 +03:00
Marko Viitanen
b53b53ed09
Fixed SAO headers, SAO produces valid output
2020-08-20 15:37:29 +03:00
Marko Viitanen
b4907e6337
Fix deblocking headers and some cleanup, deblocking does not produce valid output
2020-08-20 15:25:18 +03:00
Arttu Mäkinen
4da90b3722
Update of contexts.
2020-08-17 18:18:35 +03:00
Arttu Mäkinen
232332dc5f
Update of contexts.
2020-08-17 14:23:26 +03:00
Marko Viitanen
2fc8558926
Set correct profile, level and inter flags in IDR
2020-08-17 11:51:57 +03:00
Marko Viitanen
0f8ada02c4
Fix VPS writing
2020-08-17 11:26:09 +03:00
Arttu Mäkinen
da9f542209
WIP updating VTM8.2 to VTM10.0rc
2020-08-17 10:27:03 +03:00
Joose Sainio
faf5cc858d
Merge branch 'fix-lp-gop-rc'
2020-06-25 09:41:57 +03:00
Joose Sainio
138651ee85
Fix the bit and frame counts for calculating the gop allocation
...
Additionally dynamically adjust the smoothing window if there are rapid changes
2020-06-24 15:26:54 +03:00
Ari Lemmetti
f8ff6dd567
Merge pull request #262 from jbeich/truncate-freebsd
...
Unbreak build on FreeBSD
2020-06-22 18:08:01 +03:00
Ari Lemmetti
d1abf85229
Add MV constraint check to motion estimation start point
2020-06-01 23:51:38 +03:00
Marko Viitanen
20b66c9949
Sync to VTM 8.2 and add separate height to last_sig coding
2020-04-29 08:52:38 +03:00
Jan Beich
1fa69c705d
Rename truncate() from 30ce461d98 to avoid conflict with POSIX version
...
strategies/avx2/dct-avx2.c:55:23: error: static declaration of 'truncate' follows non-static declaration
static INLINE __m256i truncate(__m256i v, __m256i debias, int32_t shift)
^
/usr/include/stdio.h:448:6: note: previous declaration is here
int truncate(const char *, __off_t);
^
2020-04-22 16:09:42 +00:00
Ari Lemmetti
9753820b3a
Update version to 2.0.0
2020-04-22 01:03:36 +03:00
Ari Lemmetti
40e81f3243
Update preset tables. Update docs.
2020-04-22 01:03:21 +03:00
siivonek
54f438a75c
Update VAQ help text. Update docs. Change some lingering tabs to spaces.
2020-04-20 16:52:07 +02:00
Marko Viitanen
86d76b19a4
Fix intra neighboring block selection and clean some unused code
2020-04-16 14:12:40 +03:00
Marko Viitanen
27b4dd50f8
Fix picture header to code Inter frame
2020-04-14 08:24:11 +03:00
Ari Lemmetti
f31dddc019
Bypass inverse quantization and inverse transform when trying early skip
2020-04-10 16:02:09 +03:00
Pauli Oikkonen
fbdb1e2d15
Add correct path to sao_shared_generics.h in makefile
2020-04-08 19:27:12 +03:00
Pauli Oikkonen
8617530b13
Use _mm_store_epi64 instead of _mm_cvtsi128_si64
...
Fix 32-bit builds that tend to lack the cvt intrinsic. Hope it will be
optimized to a movq r64, xmm on modern platforms though
2020-04-07 23:51:54 +03:00
Pauli Oikkonen
a82966c0f5
Fix lacking _mm256_cvtss_f32 intrinsic on VS
...
Cast __m256 into __m128 first, the XMM variant of the intrinsic has been
around for a long enough time to be supported
2020-04-07 22:38:10 +03:00
Marko Viitanen
27ffba2c9c
Fix terminating bit condition at the end of the slice
2020-04-07 15:30:02 +03:00
Marko Viitanen
e737a878a6
Fix split flags and remove an extra terminating bit
2020-04-07 09:57:30 +03:00
Joose Sainio
c369ff8873
Fix a potential division by zero in a floating point operation
...
When C is calculated from K and K has not been clipped beforehand, K can in
some cases become such a large negative value that bpp^K is rounded to
zero. In real-life cases this is extremely rare, and clipping beforehand
has very little to no effect.
Also remove commented debug prints
2020-04-06 11:05:49 +03:00
Ari Lemmetti
901c25c0c8
Merge branch 'vaq'
2020-04-03 19:51:17 +03:00
Ari Lemmetti
51451be5ef
Handle cases where the number of pixels is not divisible by 32
2020-04-03 19:37:47 +03:00
siivonek
ee544304f1
Make function static to not mess up tests.
2020-04-03 15:22:34 +02:00
siivonek
e5267f7706
Fix define for use with Visual Studio.
2020-04-03 15:11:01 +02:00
siivonek
9e34369304
Merge branch 'vaq' of https://gitlab.tut.fi/TIE/ultravideo/kvazaar into vaq
2020-04-03 12:35:04 +02:00
siivonek
d025977949
Clamp edge lcu pixels if dimensions are not 64 divisible.
2020-04-03 12:33:14 +02:00
Pauli Oikkonen
addc1c3ede
Fix warning about potentially unused hsum_8x32b
...
There's a lot of alternative options available, such as making it
globally visible with a kvz_ prefix, force inlining it, or anything.
This could be good too, hope it won't be compiled at all to translation
units where it's not used.
2020-04-02 16:44:22 +03:00
siivonek
e3ba0bfb8c
Fix memory leak.
2020-04-02 14:15:36 +02:00
siivonek
566680af7b
Move function hsum to file where it is used to avoid errors.
2020-04-02 14:03:06 +02:00
siivonek
58be514e2a
Fix pipeline error.
2020-04-02 13:50:08 +02:00
siivonek
2aa0d97589
Add VAQ test in test_tools. Bump minor version number in configure.ac. Update help text for VAQ.
2020-04-01 18:16:39 +02:00
siivonek
c6e421019e
Merge vaq-simd
2020-03-31 21:40:29 +02:00
Jaakko Laitinen
8e4b738900
Fix error when first value in pu depth list is omitted
2020-03-31 16:57:12 +03:00
Jaakko Laitinen
54ef0bbfd2
Fix unintended functionality when giving multiple --pu-depth-intra/inter list parameters
2020-03-31 16:39:56 +03:00
Jaakko Laitinen
cb0c7b23b5
Merge branch 'intra_qp_offset_auto' into 'master'
...
Add auto option to intra-qp-offset
See merge request TIE/ultravideo/kvazaar!7
2020-03-31 16:17:36 +03:00
Pauli Oikkonen
99889dab15
Fix switch(bool) in picture-avx2.c
...
It passes on GCC but warns on Clang
2020-03-31 15:42:19 +03:00
Jaakko Laitinen
e0440c3de1
Update docs
2020-03-31 15:27:48 +03:00
Jaakko Laitinen
7760dcf441
Remove intra qp offset from preset parameters
2020-03-31 14:06:07 +03:00
Jaakko Laitinen
8bd1a2b667
Update help message
2020-03-31 13:19:05 +03:00
Jaakko Laitinen
b4f5486190
Set intra qp offset default to auto
2020-03-31 12:58:40 +03:00
Jaakko Laitinen
740688c67d
Add auto option to intra qp offset
2020-03-31 11:56:44 +03:00
Marko Viitanen
a0af87bdc0
Update contexts to match VTM 8.0
2020-03-30 14:34:50 +03:00
Marko Viitanen
d36ba85861
Fixed PPS and slice header to match VTM 8.0 (only for I-Frame!)
2020-03-30 12:55:12 +03:00
Marko Viitanen
64b9177cf0
Fix SPS to match VTM 8.0
2020-03-30 09:56:38 +03:00
Pauli Oikkonen
0c7bfa7dc9
Fix AVX2 on Clang
...
Besides just -mavx2, AVX2 support depends on a couple minor instruction
set extensions that should always exist on AVX2-capable hardware. Too
bad the different bit twiddling instructions are invoked slightly
differently between GCC and Clang, but now Clang seems to also produce
an AVX2-capable build.
2020-03-26 18:48:48 +02:00
siivonek
89d3e674ce
Comment out code which possibly messes up OBA
2020-03-26 17:49:31 +02:00
siivonek
be7d9ddec5
Fix error in frame variance calculation. Chroma channels were not added to variance
2020-03-26 14:33:00 +02:00
Marko Viitanen
8908324df8
Fix PTL DPB HDR param headers to match VTM 8.0
2020-03-26 10:40:27 +02:00
Marko Viitanen
d622ebb1f4
Fix NAL types to match VTM 8.0
2020-03-26 10:39:35 +02:00
Jaakko Laitinen
45ca8f8113
Merge branch 'master' into 'extended_pu-depths'
2020-03-25 15:11:08 +02:00
siivonek
5986e71535
Fix mistake
2020-03-20 13:43:44 +02:00
Jaakko Laitinen
d6ffe9e495
Update docs
2020-03-20 13:27:07 +02:00
Jaakko Laitinen
621450cc1d
Update --help
2020-03-20 13:07:48 +02:00
Jaakko Laitinen
aaac3df69b
Add prefix to kvazaar.h define
2020-03-20 09:04:00 +02:00
siivonek
2a85be5752
Move qp_to_lambda so it is defined before use. Change some tabs to spaces
2020-03-19 22:13:53 +02:00
siivonek
0a4ce3c0aa
Add vaq to new rate control
2020-03-19 21:43:52 +02:00
siivonek
1bbc598d75
Merge branch 'master' into vaq
2020-03-19 20:19:43 +02:00
Joose Sainio
b53911d637
Merge branch 'rc-intra'
2020-03-19 13:34:15 +02:00
Joose Sainio
a304a8ea6e
Add weights for GOP 16 based on fitting a power curve to bits spent by HM
2020-03-19 11:13:43 +02:00
Joose Sainio
e823ac1dae
miscellaneous fixes
...
- bump library version
- add help description for --clip-neighbour
- update the default values of --clip-neighbour and --intra-bits
- update tests to more sensible
2020-03-19 10:47:28 +02:00
Jaakko Laitinen
b2ddba38c2
Set correct size for pu-depth min/max data structure
2020-03-19 09:29:43 +02:00
Joose Sainio
2c345bc3cf
try to fix tsan issue
2020-03-18 14:58:54 +02:00
Jaakko Laitinen
fe428dcbe1
Fix no gop functionality
2020-03-18 11:03:33 +02:00
Jaakko Laitinen
af3d559d8d
Let pu-depth be defined per gop-layer
2020-03-17 17:57:18 +02:00
Ari Lemmetti
cbd77944d8
Costs in rough intra search may be negative. Get rid of UBSan error.
2020-03-16 22:13:14 +02:00
Ari Lemmetti
aa0ade3f65
Cast values to unsigned to make UBSan not trigger due to left-shifting negatives
2020-03-16 19:52:34 +02:00
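The technique named in the commit above can be sketched in isolation. Left-shifting a negative signed value is undefined behaviour in C, whereas shifting the same bits as unsigned is well defined. The function name is hypothetical, the shift is assumed to be below 32, and the conversion back to a signed type is implementation-defined for results outside the `int32_t` range (mainstream compilers use two's-complement wraparound).

```c
#include <stdint.h>

/* Shifts the bit pattern of `value` left by `shift` bits without
 * performing a signed left shift. Assumes shift < 32. */
int32_t shift_left_via_unsigned(int32_t value, unsigned shift)
{
  return (int32_t)((uint32_t)value << shift);
}
```

For example, shifting -3 left by 2 this way gives -12 on two's-complement targets, the same result a signed shift typically produces, but without triggering the sanitizer.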
RLamm
27fe716654
Fixed reference POC indexing
2020-03-11 15:33:37 +02:00
RLamm
bf24831780
Attempt to fix random crashes
2020-03-11 15:31:47 +02:00
RLamm
887659db1f
Attempted to scale the extra_mvs
2020-03-11 15:31:46 +02:00
siivonek
8d9719ff90
Merge branch 'master' into vaq
2020-03-05 14:17:01 +02:00
Joose Sainio
c9a8f2a596
Completely disable intra based model for frame 1
2020-03-04 12:52:13 +02:00
Joose Sainio
19c79c3e58
don't use the intra frame based estimation if the result is bad
2020-03-04 09:26:22 +02:00
Ari Lemmetti
7b7358c25a
Update presets veryslow and placebo a bit
...
Both use now --gop 16, --intra-qp-offset -3, --me tz, and --transform-skip
2020-03-03 20:41:01 +02:00
Pauli Oikkonen
60e7956dc5
Disable inaccurate integer variance calculation for now
2020-03-02 19:18:55 +02:00
Pauli Oikkonen
fc1b91335b
Implement variance calculation in integer math
...
Maybe this is a bit faster than FP, it's not accurate though
2020-03-02 18:17:18 +02:00
Pauli Oikkonen
35c825c75f
Move hsum_8x32b to avx2_common_functions
2020-02-27 17:52:17 +02:00
Pauli Oikkonen
b00ac7d1c4
AVX2 version of buffer variance calculation
2020-02-25 15:57:56 +02:00
siivonek
a380e43bda
Add chroma channels to variance calculation.
2020-02-24 19:54:34 +02:00
Pauli Oikkonen
1bd9c6dd93
Make a strategy out of pixel_var
2020-02-24 19:37:36 +02:00
Pauli Oikkonen
86ebf366e1
fix typo
2020-02-24 18:18:10 +02:00
Joose Sainio
f81de41775
Merge branch 'master' into rc-intra
2020-02-24 15:30:57 +02:00
siivonek
5688bcd646
Merge branch 'master' into vaq
2020-02-21 17:11:10 +02:00
siivonek
908ecb1767
Add rounding to aq offsets. Fix typo
2020-02-21 13:51:43 +02:00
Ari Lemmetti
1dfc69b42e
Consider merge index bits in merge analysis and early skip
2020-02-20 09:43:58 +02:00
Joose Sainio
7deb22c8e8
Merge branch 'master' into rc-intra
2020-02-19 15:01:04 +02:00
Kari Siivonen (TAU)
c972ca9067
Add assert to check if deltaQP out of bounds. Clip adaptive QP to [-13, 12].
2020-02-18 13:20:26 +02:00
Kari Siivonen (TAU)
f07990794f
Fix error in vaq pixel blit range calculation
2020-02-18 13:20:26 +02:00
Kari Siivonen (TAU)
57ed40c263
Fix application of aq offset
2020-02-18 13:20:26 +02:00
Kari Siivonen (TAU)
be2f420d61
Change: vaq requires parameter. Parameter defines vaq strength ex. 15 == 1.5
2020-02-18 13:20:26 +02:00
Kari Siivonen (TAU)
bf1b2c1e22
Add define for vaq strength parameter
2020-02-18 13:20:26 +02:00
Kari Siivonen (TAU)
150559a7e8
Fix bugs. Enable set_qp_in_cu when using vaq
2020-02-18 13:20:26 +02:00
Kari Siivonen (TAU)
c8c71274ee
Change tabs to spaces.
2020-02-18 13:20:26 +02:00
siivonek
888382953d
Implement calculation of vaq values. Values not used yet.
2020-02-18 13:20:25 +02:00
siivonek
ad40a88c09
Add no-vaq option to vaq
2020-02-18 13:20:25 +02:00
siivonek
09f0a1c52e
Fix typo in comment
2020-02-18 13:20:25 +02:00
siivonek
84fb3fd7d1
aq: Add --vaq commandline option
2020-02-18 13:20:25 +02:00
Joose Sainio
2a98f5db1e
fix intra-bits for lp-gop
2020-02-18 10:38:29 +02:00
Ari Lemmetti
71d9327f62
Further improve fast bipred
2020-02-17 20:32:52 +02:00
Ari Lemmetti
80c26870d5
Update docs
2020-02-15 23:29:18 +02:00
Ari Lemmetti
ebb183cc01
Add option to make intra QP offset configurable
2020-02-15 22:54:48 +02:00
Ari Lemmetti
be3e08d6db
Add gop.h to Makefile
2020-02-15 22:54:47 +02:00
Ari Lemmetti
1354acd358
Prevent negative values being written to SPS with --gop=0
2020-02-15 22:54:47 +02:00
Ari Lemmetti
fe4869916c
Disable GOP and intra qp offset for all-intra coding automatically
2020-02-15 22:54:46 +02:00
Ari Lemmetti
9849fb7c77
Enable experimental rate control for GOP 16
2020-02-15 22:54:46 +02:00
Ari Lemmetti
a0a22dec8a
Remove deprecated / unused lambda adjustments
2020-02-15 22:54:46 +02:00
Arttu Ylä-Outinen
829a70e6a7
Copy lowdelay GOP definition from HM
2020-02-15 22:36:58 +02:00
Arttu Ylä-Outinen
28f99c0b87
Change definition of 8-GOP to match HM
2020-02-15 22:36:58 +02:00
Arttu Ylä-Outinen
636fa8fbdd
Fix maximum decoded picture buffer size
2020-02-15 22:36:57 +02:00
Arttu Ylä-Outinen
ebd5156db5
Add definition for random access GOP of length 16
2020-02-15 22:36:57 +02:00
Arttu Ylä-Outinen
6653f06dd0
Only compute GOP layer weights when RC is enabled
2020-02-15 22:36:57 +02:00
Arttu Ylä-Outinen
c8fff1e0d6
Use a larger number of bits for POC lsb when needed
...
Changes the number of bits used for coding the least significant bits of
the POC based on the GOP size.
2020-02-15 22:36:56 +02:00
Arttu Ylä-Outinen
d757a832c2
Change GOP QP offset handling to match HM
...
Adds fields qp_model_scale and qp_model_offset to kvz_gop_config and
intra_qp_offset to kvz_config.
2020-02-15 22:36:56 +02:00
Arttu Ylä-Outinen
f37dcd5879
Move GOP definition to a separate file
...
Moves definition of the 8-GOP from cfg.c to gop.h.
2020-02-15 22:36:55 +02:00
Ari Lemmetti
6e1007a3e7
Get rid of LAMBA! (Commit #3000 )
2020-02-15 22:32:52 +02:00
Ari Lemmetti
0c02e71b43
Remove minor error from readme
2020-02-15 22:29:08 +02:00
Joose Sainio
e90d3141a2
Merge branch 'master' into rc-intra
2020-02-05 11:06:56 +02:00
Ari Lemmetti
9a0236bb4e
Add option 'zero-coeff-rdo'
2020-02-04 21:26:29 +02:00
Ari Lemmetti
886ff36d12
Initial implementation of fast bipred.
2020-02-04 15:46:23 +02:00
Ari Lemmetti
3c7dd0752f
Remove the broken "no mov" branch.
...
Causes hash mismatches for example in SlideShow sequence.
2020-02-03 15:26:31 +02:00
RLamm
bf8941ddb8
Added comment about partial-coding usage
2020-01-31 16:19:48 +02:00
RLamm
b8488ab48d
Changed "partial-coding" variables to uint32_t
2020-01-31 16:02:29 +02:00
RLamm
76e3249754
Changed parameter "slicer" to "partial-coding" to avoid confusion.
2020-01-31 14:22:32 +02:00
RLamm
30d5df40c5
Custom headers for the distributed coding
2020-01-29 15:54:49 +02:00
Joose Sainio
54571529a4
Fix accessing previous frame that didn't exist
2020-01-17 10:48:35 +02:00
Joose Sainio
5c671d20e1
Use the new clipping only in situations where it actually helps
2020-01-17 09:08:21 +02:00
Joose Sainio
3c34d7c863
Fix qp estimation and checking of previous frames that don't exist
2020-01-15 09:32:04 +02:00
Joose Sainio
1a35c22a52
Change clipping of lambda and qp for ctus on OBA rc
...
instead of clipping qp and lambda to the value of last value from the state
clip to previous frame with same layer and if such frame doesn't exist, clip
to previous frame
2020-01-14 14:46:05 +02:00
Pauli Oikkonen
c3d9e97e9f
Fix VS build
2019-12-12 18:34:55 +02:00
Pauli Oikkonen
7f238ca299
Remove debug print functions
...
Whoops
2019-12-12 18:19:31 +02:00
Pauli Oikkonen
eefb5e50b3
De-inline pred_filtered_dc functions, shouldn't make much difference though
2019-12-12 17:30:00 +02:00
Pauli Oikkonen
169314de4f
32x32 filtered DC prediction in AVX2
2019-12-11 18:17:06 +02:00
Pauli Oikkonen
fb2481b7e4
16x16 filtered DC implemented in AVX2
2019-12-10 15:54:50 +02:00
Joose Sainio
b78aa7b272
save c and k to frame
2019-12-06 10:52:54 +02:00
Joose Sainio
5b10e5fb7e
parameterize the clipping option
2019-12-06 09:51:04 +02:00
Pauli Oikkonen
da370ea36d
Implement AVX2 8x8 filtered DC algorithm
2019-11-28 14:10:10 +02:00
Pauli Oikkonen
5d9b7019ca
Implement a 4x4 filtered DC pred function
2019-11-26 17:05:54 +02:00
Joose Sainio
ca0060cbba
try the original clipping
2019-11-26 15:13:04 +02:00
Pauli Oikkonen
f1485ab087
Start doing an arbitrary size filtered DC pred - maybe easier to just create separate functions for fixed block sizes?
2019-11-25 15:20:29 +02:00
Joose Sainio
ab2fded8af
Update threadwrapper to enable pthread_rwlock_t
2019-11-21 13:38:40 +02:00
Joose Sainio
eb78aead1f
Fix additional potential data races
2019-11-21 11:03:12 +02:00
Joose Sainio
35d7e0d88b
Fix data race
2019-11-21 10:25:04 +02:00
Marko Viitanen
94d89f03c7
Added cfg variable intra_smoothing_disabled and some cleanup
2019-11-20 08:38:33 +02:00
Marko Viitanen
eb2caf9118
Fix intra angle filter, changed from gauss filter table to run-time calculated 4-tap filter
2019-11-19 15:15:21 +02:00
Pauli Oikkonen
979d66031c
Create a strategy out of intra_pred_filtered_dc
2019-11-19 14:50:31 +02:00
Marko Viitanen
466d8772b0
Apply JVET_P0170_ZERO_POS_SIMPLIFICATION in coeff bypass coding
2019-11-19 14:32:38 +02:00
Joose Sainio
0e8815a3d8
test clipping qp to previous frame instead of previous ctus
2019-11-19 14:32:31 +02:00
Joose Sainio
ddb4e5a131
move the intra bit calculation so that it is used also with lambda rc
2019-11-19 14:16:48 +02:00
Joose Sainio
a07833f3e6
check that mallocs in rc initialization were successful
...
only call kvz_update_after_picture when using the OBA rc
2019-11-19 13:59:44 +02:00
Joose Sainio
50d410a316
re-enable static qp encoding and lambda rc
2019-11-19 13:45:58 +02:00
Pauli Oikkonen
fa4bb86406
Optimize intra_pred_planar_avx2 for 4x4 blocks
2019-11-19 13:39:02 +02:00
Marko Viitanen
3df2642b03
Fix qt cbf context init value
2019-11-19 13:27:36 +02:00
Joose Sainio
57e5615ece
Fix incorrect intra rc calculation skipping
2019-11-19 13:25:31 +02:00
Joose Sainio
6cc3bcd87e
Command line parameters for oba rc and implementation of the usage of the intra parameter
2019-11-19 09:29:06 +02:00
Joose Sainio
eb73548af5
Encode first frame completely before starting others to enable owf
2019-11-18 09:51:37 +02:00
Marko Viitanen
17a53230fd
Code cleanup, remove unused arrays and remove tabs
2019-11-18 09:01:23 +02:00
Pauli Oikkonen
4761d228f9
Start to vectorize the 4x4 loop
2019-11-15 17:32:40 +02:00
Pauli Oikkonen
8d45ab4951
Stupidify the 4x4 planar loop for vectorization
2019-11-14 17:14:04 +02:00
Marko Viitanen
91528f3292
Update contexts
2019-11-14 13:46:51 +02:00
Marko Viitanen
b309ed90be
Fix NAL packet and missing fields in SPS
2019-11-14 09:21:11 +02:00
Marko Viitanen
74514981a9
Fixed PPS, SPS and slice headers and NAL unit types
2019-11-13 15:59:36 +02:00
Joose Sainio
c759c138ed
Prepare the rc data structure to be shared among all frame encoders
2019-11-13 11:56:25 +02:00
Joose Sainio
cdb7c851a4
Fix weight calculation
2019-11-13 08:55:31 +02:00
Joose Sainio
b9b01f8036
WPP with threading
2019-11-12 12:12:57 +02:00
Joose Sainio
615973adca
should enable threading with wpp when owf is not used
2019-11-12 09:03:00 +02:00
Pauli Oikkonen
6f13f6525c
Merge branch 'new_prints'
2019-11-07 17:04:21 +02:00
Joose Sainio
d353f7dd1a
Disable debug prints, fix multiple bugs in the calculation
2019-11-07 15:08:57 +02:00
mercat
57e8c3ebc2
Merge branch 'ML-cplx_red_ICIP'
2019-11-07 13:25:47 +02:00
Pauli Oikkonen
558f0ec401
Mbps, not mbps
2019-11-05 18:06:00 +02:00
Pauli Oikkonen
2edf533925
Tidy the end report printing
...
Also fix a bug with non-integer target FPS
2019-11-05 17:20:00 +02:00
Joose Sainio
408fd4ccb6
Fix lambda and qp calculation for intra frames
...
also fixes a bug with selecting the clip neighbor lambda and clip neighbor qp
selection for inter frames
2019-11-05 10:51:39 +02:00
Pauli Oikkonen
c7313ce567
Store AVG QP information in encmain
2019-11-04 17:08:07 +02:00
Reima Hyvönen
80575c59bf
Some updates done to get right bitrate and avg QP
2019-10-31 15:56:24 +02:00
Reima Hyvönen
252bab8820
Added prints to bitrate and AVG QP
2019-10-31 15:56:24 +02:00
Pauli Oikkonen
6d7a4f555c
Also remove 16x16 (A * B^T)^T matrix multiply
...
Can be done using (B * A^T) instead, it's the exact same
2019-10-28 16:19:42 +02:00
Pauli Oikkonen
2c2deb2366
Tidy AVX2 32x32 matrix multiply
2019-10-28 16:19:42 +02:00
Pauli Oikkonen
98ad78b333
Tidy the old AVX2 32x32 matrix multiply
...
It was actually a very good algorithm, just looked messy!
2019-10-28 16:19:42 +02:00
Pauli Oikkonen
4a921cbdb5
Retain data as much in YMM registers as possible
...
This seems to make it a whole lot quicker
2019-10-28 16:19:42 +02:00
Pauli Oikkonen
ac4d710e23
Unroll 32x32 matrix multiply, use all regs
2019-10-28 16:19:42 +02:00
Pauli Oikkonen
a58608d0b8
Remove totally unnecessary (A * B^T)^T 32x32 multiply
2019-10-28 16:19:42 +02:00
Pauli Oikkonen
043f53539f
Implement a streamlined matrix-multiply 32x32 DCT
2019-10-28 16:19:42 +02:00
Pauli Oikkonen
e9da2d851b
Tidy 32x32 fast DCT's helper functions
2019-10-28 16:19:42 +02:00
Pauli Oikkonen
e382339182
Implement fast (butterfly) 32x32 DCT in AVX2
2019-10-28 16:19:42 +02:00
Pauli Oikkonen
b5962dadac
Tidy indentation in AVX2 16x16 iDCT
2019-10-28 16:19:42 +02:00
Pauli Oikkonen
36a8f89025
Fine-tune 16x16 AVX2 iDCT
2019-10-28 16:19:42 +02:00
Pauli Oikkonen
ca9409de2b
Implement 16x16 DCT as butterfly algorithm in AVX2
2019-10-28 16:19:42 +02:00
Pauli Oikkonen
7c69a26717
Use aligned loads and stores for AVX2 DCT
2019-10-28 16:19:42 +02:00
Pauli Oikkonen
8e9c65dca6
Align DCT matrices and temp transform buffers
2019-10-28 16:19:42 +02:00
Pauli Oikkonen
148a150522
Align DCT source and dest blocks to cache line
2019-10-28 16:19:42 +02:00
Pauli Oikkonen
8e60bbf6a6
Slightly tune 16x16 forward DCT
...
Use an array of __m256i's to store temporary value, essentially letting
the compiler enforce alignment and use aligned loads and stores.
2019-10-28 16:19:42 +02:00
Pauli Oikkonen
c0cc0e8a75
Optimize 16x16 multiply by only slicing right mat once
2019-10-28 16:19:42 +02:00
Pauli Oikkonen
e463d27f22
Implement streamlined generic 16x16 matrix multiply
...
It can't be this fast for real, can it?
2019-10-28 16:19:42 +02:00
Pauli Oikkonen
beb85ce9d6
Reorder parameters for 8x8 matrix multiplies
2019-10-28 16:19:42 +02:00
Pauli Oikkonen
292af62256
Implement tailored 16x16 forward DCT
2019-10-28 16:19:42 +02:00
Pauli Oikkonen
30ce461d98
Redo 4x4 matrix multiplication
2019-10-28 16:19:42 +02:00
Pauli Oikkonen
07970ea82f
Streamline by-the-book 8x8 matrix multiplication
...
Also chop up the forward transform into two tailored multiply functions
2019-10-28 16:19:42 +02:00
Pauli Oikkonen
7ec7ab3361
Implement a tailored AVX2 8x8 DCT
2019-10-28 16:19:42 +02:00
Joose Sainio
372934c7db
Fix division by zero
2019-10-10 16:35:56 +03:00
Joose Sainio
9bdfdeaf5c
Rest of the owl
2019-10-09 15:48:58 +03:00
Joose Sainio
1ba8525faf
WIP
2019-10-09 10:35:07 +03:00
Joose Sainio
19496d2692
?
2019-10-03 14:50:11 +03:00
Joose Sainio
4b111e339e
fix a couple of bugs in the implementation; bit calculation still seems a bit off
2019-10-01 15:08:39 +03:00
Joose Sainio
84615e406a
fix compiler warnings
2019-09-27 14:20:08 +03:00
Joose Sainio
14b7a75713
Call the new functions and fix bugs
2019-09-27 14:14:24 +03:00
Joose Sainio
ef74bfb182
unify naming
2019-09-27 10:16:21 +03:00
Joose Sainio
e36f481bda
qp calculation for frame
2019-09-27 09:05:40 +03:00
Joose Sainio
47019ca1cd
intra ck update
2019-09-26 16:04:53 +03:00
Joose Sainio
7c8f4da7cb
Update c and k except after first intra
2019-09-26 13:09:28 +03:00
Joose Sainio
0577d481c1
CTU level code
2019-09-25 12:12:21 +03:00
pkubaj
1d7fcf4227
Fix build on powerpc64 with LLVM
2019-09-12 15:05:00 +02:00
mercat
0de567bfa4
Fix memory leak
2019-09-12 09:45:32 +03:00
mercat
fa116de619
Add static
2019-09-11 16:18:12 +03:00
mercat
b8753a9293
Fucking INLINE fixed
2019-09-11 16:12:07 +03:00
mercat
b855144e68
INLINE fix
2019-09-11 16:12:07 +03:00
mercat
694337b803
Add const and more const
2019-09-11 16:12:07 +03:00
mercat
21c07638ed
Remove const into kvz_init_constraint.
2019-09-11 16:12:06 +03:00
mercat
2bca507abe
Clean version of machine learning constraint code. (ICIP paper)
2019-09-11 16:12:06 +03:00
Alexandre Mercat
0f4b7be6ee
First version of ML ICIP code for master
2019-09-11 16:12:06 +03:00
Pauli Oikkonen
99597b828a
Work around the ancient Win32 calling convention hassle
...
See if this'll work now
2019-09-06 13:14:42 +03:00
Pauli Oikkonen
c5ca18950c
Revert "Revert to 6924d90052 due to broken visual studio build"
...
This reverts commit 1dd0619bd7.
2019-09-05 18:21:55 +03:00
Pauli Oikkonen
55529decd5
Implement _mm256_insert_epi32 and extract pseudo-ops
...
Visual Studio headers apparently lack these guys
2019-09-05 18:20:52 +03:00
Marko Viitanen
28dc4fa2ed
Fix intra MPM selection
2019-09-05 09:39:13 +03:00
Ari Lemmetti
147378e1f9
Prevent 8x4 and 4x8 bipred in merge analysis
2019-09-03 16:32:50 +03:00
Ari Lemmetti
ef1fdbf259
Separate prediction of single PU/PB from CU/CB
2019-09-03 16:32:50 +03:00
Joose Sainio
7d2737bdf6
WIP picture lambda calculation
2019-09-03 11:03:35 +03:00
Ari Lemmetti
3bc510712f
Enable merge analysis for smp and amp
2019-09-02 17:31:51 +03:00
Ari Lemmetti
557bcbc6aa
Make luma or chroma only inter "recon" or predict possible
2019-09-02 17:15:28 +03:00
Marko Viitanen
6d5e20ca13
Header changes to match VTM 6.1
2019-09-02 09:42:35 +03:00
RLamm
60be6d411c
Intra filtering fixed at least for luma. All intra modes output valid luma (hashes match), but chroma is still broken.
2019-08-30 16:14:00 +03:00
RLamm
83ac39094a
Use new PDPC filtering for planar and DC modes
2019-08-29 12:51:34 +03:00
Joose Sainio
131c04f65c
Fix incorrect weight for intra frame
2019-08-29 12:01:13 +03:00
Joose Sainio
8f96678d13
Fix issue with intra frames being part of gop when they shouldn't
2019-08-29 09:28:10 +03:00
Ari Lemmetti
aa8ab195d1
Compare rough cost of the best merge mode against AMVP to make mode decision
2019-08-26 22:49:09 +03:00
Ari Lemmetti
8f866ff83a
Use correct index
2019-08-26 20:10:10 +03:00
Ari Lemmetti
2343958a14
Fix transform split for small luma blocks
2019-08-24 21:50:17 +03:00
Ari Lemmetti
800fc8644d
Reset CBFs because they might have been set earlier for a previous depth.
2019-08-24 21:49:33 +03:00
Ari Lemmetti
a80de22bc7
Add only different candidates to the list
2019-08-24 21:49:33 +03:00
Ari Lemmetti
45c7961412
Remove tr depth fill. It should not be needed.
2019-08-24 21:49:32 +03:00
Ari Lemmetti
ff8711aaab
Add missing logic to add valid indices to list
2019-08-24 21:49:29 +03:00
Marko Viitanen
cb0d7c340a
Use the new PDPC filtering in angular intra
2019-08-23 14:44:41 +03:00
Marko Viitanen
5bebb18943
Change intra filtering according to VTM6
2019-08-23 08:56:35 +03:00
Marko Viitanen
a16efe6b52
Merge remote-tracking branch 'remotes/github_kvazaar/master'
...
# Conflicts:
# build/kvazaar_VS2013.sln
# build/kvazaar_VS2015.sln
# build/kvazaar_VS2017.sln
# build/kvazaar_cli/kvazaar_cli.vcxproj
# build/kvazaar_lib/kvazaar_lib.vcxproj
# build/kvazaar_tests/kvazaar_tests.vcxproj
# src/encode_coding_tree.c
# src/encode_coding_tree.h
# src/encoder_state-bitstream.c
# src/inter.c
# src/strategies/avx2/quant-avx2.c
2019-08-22 15:12:01 +03:00
Marko Viitanen
01ea762c1f
Fix coeff coding and remove bdpcm flag -> CABAC bits match with VTM 6.0
2019-08-22 14:33:42 +03:00
Marko Viitanen
210af8adbe
Remove joint_cb_cr flag and fix split_flag context selection
2019-08-22 11:23:24 +03:00
Marko Viitanen
c713d31c93
Fix sig_coeff context selection
2019-08-22 10:57:50 +03:00
Marko Viitanen
48b8898e53
Fix CBF context init and use
2019-08-22 10:44:47 +03:00
Marko Viitanen
db94ec1a84
Rename intra_mode_model -> intra_luma_mpm_flag_model and update the contexts
2019-08-19 15:17:25 +03:00
Marko Viitanen
1c6ffc0a7e
Fix wrong variable types in context init
2019-08-19 14:33:55 +03:00
Marko Viitanen
cd6be15e10
Fix context init to match VTM6.0
2019-08-19 13:57:31 +03:00
Marko Viitanen
3de198d2db
Sync contexts with VTM6.0
2019-08-19 09:39:59 +03:00
Marko Viitanen
e644b03615
Fix headers to match VTM6.0rc1
2019-08-16 15:33:20 +03:00
Ari Lemmetti
1dd0619bd7
Revert to 6924d90052 due to broken visual studio build
2019-08-08 15:15:34 +03:00
Pauli Oikkonen
2852baa673
Separate sign3_diff_epu8 from calc_eo_cat
...
Just to keep things simple, clear and obvious
2019-08-07 16:35:24 +03:00
Pauli Oikkonen
17947b79ee
Add sao_shared_generics.h in Makefile.am
2019-08-07 16:35:24 +03:00
Pauli Oikkonen
a8dd6ce351
Add a note about having implemented a separate AVX2 version of SAO offset array calculation
2019-08-07 16:35:24 +03:00
Pauli Oikkonen
a858e7dd4b
Combine duplicate code into inline functions
2019-08-07 16:35:24 +03:00
Pauli Oikkonen
de0e97f711
Take 8/16/24b loads and stores into separate functions
2019-08-07 16:35:24 +03:00
Pauli Oikkonen
10979f58fe
Tidy up code
2019-08-07 16:35:24 +03:00
Pauli Oikkonen
9cc11976c0
Combine the delta accumulation from edge and band ddistortion into shared func
...
This won't reduce object size, but there'll be less duplicate code
2019-08-07 16:35:24 +03:00
Pauli Oikkonen
55d877bd66
Vectorize sao_edge_ddistortion
2019-08-07 16:35:24 +03:00
Pauli Oikkonen
aef0f301d3
Fix function signatures
...
Mark anything intended as read-only to be const, and fix alignment
2019-08-07 16:35:24 +03:00
Pauli Oikkonen
997fd369b3
Redo calc_sao_edge_dir_avx2
...
Do it wider, 32 pixels at once!
2019-08-07 16:35:24 +03:00
Pauli Oikkonen
db1e475e02
Use i32 instead of i8 for x/y offsets
...
Doesn't matter too much, because this number isn't used in SIMD
computation, only as a memory reference offset.
2019-08-07 16:35:24 +03:00
Pauli Oikkonen
12de466ef5
Reimplement non-band SAO color reconstruction in AVX2
...
Streamline things to work on 32 pixels at once instead of 8
2019-08-07 16:35:24 +03:00
Pauli Oikkonen
e8bff99329
Redo the SAO_TYPE_BAND subsection of AVX2 SAO color reconstruction
...
Vectorize it all, hope this helps with perf
2019-08-07 16:35:24 +03:00
Pauli Oikkonen
7b5dffa855
Implement calc_sao_offset_array in AVX2
...
To be efficient, the AVX2 color reconstruction algorithm will need
offsets in byte, not dword, arrays. This is completely specific to 8-bit
pixels and the function signature is fundamentally distinct from the
generic algorithm, so it's better to not strategize SAO offset array
calculation.
2019-08-07 16:35:24 +03:00
Pauli Oikkonen
29563b7039
Make kvz_calc_sao_offset_array more obvious
...
Name temporary values from array lookups etc that are referred multiple
times to, to make the behavior of the mechanism more transparent. Define
all the constant values at the beginning of the function and declare as
const.
2019-08-07 16:35:24 +03:00
Pauli Oikkonen
08881f5e9b
(TEMP) (TODO) (whatever) Avoid compiler warnings
...
I want the CI to not crash on its -Wall -Werror, but instead to actually
build the thing and report me about actual memory errors etc
2019-08-07 16:35:24 +03:00
Pauli Oikkonen
c18adc5ee0
Redo sao_band_ddistortion_avx2
...
Avoid branching and do the entire thing on 32 pixels at once in YMMs.
Also make the sao_bands function parameter const.
2019-08-07 16:35:24 +03:00
Pauli Oikkonen
2827c3e3ab
Make calc_sao_bands less opaque
2019-08-07 16:35:24 +03:00
Pauli Oikkonen
1bb9a079a8
Fix indentation
2019-08-07 16:35:24 +03:00
Reima Hyvönen
7bc959c7c5
3 sao functions are now working
2019-08-07 16:35:24 +03:00
Reima Hyvönen
0e0f2d3490
made to clear sum vector after it has been set to memory
2019-08-07 16:35:24 +03:00
Reima Hyvönen
f146de7acb
removed some variables to prevent memory losses
2019-08-07 16:35:24 +03:00
Reima Hyvönen
247c3a7a71
converted signed to unsigned int
2019-08-07 16:35:24 +03:00
Reima Hyvönen
ac5c216974
Some more memory error prevention in sao_edge_ddistortion_avx2
2019-08-07 16:35:24 +03:00
Reima Hyvönen
3fb1cbca35
more editing sao_edge_ddistortion_avx2
2019-08-07 16:35:24 +03:00
Reima Hyvönen
afbb6fb960
some more modifications to sao_edge_ddistortion_avx2 to prevent memory failures
2019-08-07 16:35:24 +03:00
Reima Hyvönen
3496a57f7a
Edited sao_edge_ddistortion_avx2 to avoid memory overflow
2019-08-07 16:35:24 +03:00
Reima Hyvönen
267ba1d6ce
Modified sao_band_ddistortion_avx2
2019-08-07 16:35:24 +03:00
Reima Hyvönen
e70663b245
added some sub commands to avoid memory read errors
2019-08-07 16:35:24 +03:00
Reima Hyvönen
59dfb4570c
Converted some loads to load int8_t instead of ints
2019-08-07 16:35:24 +03:00