Arttu Ylä-Outinen
435c387357
Refactor rate control
...
- Defines MIN_LAMBDA and MAX_LAMBDA constants.
- Moves resetting state->frame->cur_gop_bits_coded to rate_control.c.
- Changes gop_allocate_bits to return the number of bits allocated like
pic_allocate_bits does.
2017-01-09 01:24:23 +09:00
Arttu Ylä-Outinen
6c4f2d196a
Move fields from encoder_state_t to frame
...
Moves fields prepared and frame_done from encoder_state_t to
encoder_state_config_frame_t.
2017-01-09 01:24:23 +09:00
Arttu Ylä-Outinen
97863cdaa2
Fail encoder init when CQM file cannot be opened
2017-01-08 19:17:43 +09:00
Arttu Ylä-Outinen
db5e750c7f
Fix --threads=auto
...
When --threads=auto was given on the command line, cfg->threads was
actually set to zero, disabling threads altogether. Fixed to set
cfg->threads to -1, so that the number of threads is chosen
automatically.
2017-01-08 17:58:22 +09:00
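The intent of the --threads=auto fix (db5e750c7f) can be sketched roughly as follows. This is only an illustration: parse_threads is a hypothetical helper, not Kvazaar's actual parser, and the only thing taken from the commit message is that "auto" must map to -1 rather than 0.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not Kvazaar's actual option parser.
 * Returns -1 for "auto" (thread count chosen automatically),
 * otherwise the integer value of the argument. */
int parse_threads(const char *value)
{
  if (strcmp(value, "auto") == 0) {
    return -1;  /* -1 = choose automatically; 0 would disable threads */
  }
  return atoi(value);
}
```

Note that atoi("auto") returns 0, so a parser without the explicit check would silently disable threading for that input.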
Ari Koivula
a9e45efcfc
Add a fast lane for byte-aligned bitstream writes
...
The CABAC engine only writes to the bitstream when it has a full byte.
These writes are also always byte-aligned, so there is no need to even
check for stream alignment.
Speedup was around 3% with ultrafast and low QP.
2016-12-23 17:01:44 +02:00
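The idea behind the byte-aligned fast lane (a9e45efcfc) might look something like the sketch below. The bitwriter_t type and both function names are made up for illustration and do not come from Kvazaar's bitstream module; the general path writes bit by bit, while the fast path assumes alignment and stores the byte directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical minimal bit writer; all names are illustrative. */
typedef struct {
  uint8_t *data;     /* output buffer */
  size_t   len;      /* bytes written so far */
  uint8_t  cur;      /* partially filled byte */
  int      cur_bits; /* number of bits held in cur (0..7) */
} bitwriter_t;

/* General path: append the low `bits` bits of `value`, MSB first. */
void bw_put(bitwriter_t *bw, uint32_t value, int bits)
{
  while (bits-- > 0) {
    bw->cur = (uint8_t)((bw->cur << 1) | ((value >> bits) & 1));
    if (++bw->cur_bits == 8) {
      bw->data[bw->len++] = bw->cur;
      bw->cur = 0;
      bw->cur_bits = 0;
    }
  }
}

/* Fast lane: the caller guarantees the stream is byte-aligned,
 * so the byte can be stored without any shifting or checks. */
void bw_put_byte(bitwriter_t *bw, uint8_t byte)
{
  assert(bw->cur_bits == 0);
  bw->data[bw->len++] = byte;
}
```

Both paths produce the same output for aligned writes; the fast lane simply skips the per-bit loop.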
Jaakko Laitinen
deb63f735f
Fix gop disabling
2016-12-20 14:25:13 +02:00
Ari Lemmetti
70a52f0e48
10-bit: add missing bit depth adjustment to ssd
2016-11-17 19:28:04 +02:00
Ari Koivula
fa078102f1
Fix 32bit compilation
...
Got a warning about implicit cast from uint64_t to void*.
2016-11-17 17:53:57 +02:00
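The commit message for fa078102f1 does not say how the warning was silenced. One common way to convert a 64-bit integer to a pointer without an implicit-cast warning on 32-bit targets is to go through uintptr_t, as in this sketch (function names are hypothetical):

```c
#include <stdint.h>

/* Illustrative only: convert a 64-bit integer holding an address to a
 * pointer. The uintptr_t step makes the narrowing on 32-bit explicit. */
void *ptr_from_u64(uint64_t value)
{
  return (void *)(uintptr_t)value;
}

/* The reverse direction, widening a pointer to 64 bits. */
uint64_t u64_from_ptr(const void *ptr)
{
  return (uint64_t)(uintptr_t)ptr;
}
```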
Ari Koivula
5ceec06bd3
Merge pull request #148 from Venti-/crypto
...
Crypto
2016-11-16 21:33:55 +02:00
Ari Lemmetti
c31207ea7d
Optimize intra reference building
...
- Add function with reduced logic for the most common case
2016-11-16 18:28:42 +02:00
Ari Koivula
24f2a23ef8
Remove unnecessary crypto state
...
The frame does not need its own crypto state, since it always has at
least one sub tile.
2016-11-16 13:58:41 +02:00
Ari Koivula
8951e34fd2
Change crypto.h stubs to print instead of assert
2016-11-16 13:58:41 +02:00
Wassim Hamidouche
ea82c38906
correct memory allocation
2016-11-16 12:35:28 +02:00
Wassim Hamidouche
da3e2d1d07
resolve parallel encryption
2016-11-16 12:35:28 +02:00
Ari Koivula
b8a618e666
Fix problems with >8 bit input
...
Enforce bit depth promised by --input-bitdepth to avoid crashes when
larger values are provided.
Do endianness byte swap for all bytes when the buffer gets extended
to multiple of 8 pixels, and not just the number of input pixels.
Don't swap bytes on a little-endian system.
2016-11-13 19:58:54 +02:00
Ari Koivula
2c005cda25
Fix bug with sub-pixel motion estimation in tiles
...
The width of the tile was being used to index the frame pixel buffer
instead of the width of the buffer.
2016-11-07 15:53:52 +02:00
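The class of bug fixed in 2c005cda25 can be illustrated with a small sketch. pixel_at is a hypothetical accessor, not a Kvazaar function; the point is that rows of a tile inside a larger frame buffer are separated by the buffer's stride, not by the tile's width.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pixel accessor: consecutive rows are `stride` pixels
 * apart in memory, regardless of how wide the tile itself is. */
uint8_t pixel_at(const uint8_t *buf, size_t stride, size_t x, size_t y)
{
  return buf[y * stride + x];
}
```

For a 4 pixel wide tile starting at column 4 of an 8 pixel wide frame, passing 8 as the stride gives the pixel in the next frame row, whereas passing the tile width 4 would land on the wrong pixel.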
Ari Koivula
78a28e0338
Reformat --help message
...
- Reduce indentation to 6 spaces
- Word wrap everything to under 80 characters
- Remove defaults from options covered by presets
- Add a dash in front of argument descriptions
- Add --(no-) to names of parameters that accept it and remove mention
of enabling or disabling
- Add executable and scripts as a dependency to make docs
2016-11-04 15:40:28 +02:00
Ari Koivula
d18de19d8a
Fix DTS and PTS not being passed on through lib API
...
Fixes "cur_dts is invalid" warning from FFmpeg.
2016-10-28 19:05:47 +03:00
Ari Koivula
0c41c2ebd6
Make CLI set PTS for each input picture
...
This value is not represented in the HEVC bitstream, which is why it
was not set previously. FFmpeg sets and needs it however, so make the
CLI set it as well to make sure we handle it correctly.
2016-10-28 19:03:03 +03:00
Ari Koivula
5bf745460d
Re-categorize options in the help message
...
- Move VUI stuff to the bottom
- Merge Parallel processing, WPP, Tiles and slices
- Add more categories for the other options
2016-10-27 03:26:15 +03:00
Ari Koivula
cb6672b452
Disable WPP when Tiles are enabled
...
Closes #142.
2016-10-27 02:07:10 +03:00
darealshinji
488d042e5f
Bump KVZ_VERSION
2016-10-25 12:32:13 +02:00
Ari Lemmetti
29153ed503
Remove unused variable
2016-10-21 17:28:42 +03:00
Ari Lemmetti
778e46dfd8
Add AVX2 version of SSD
2016-10-21 15:07:53 +03:00
Ari Lemmetti
6f5d7c9e06
Move SSD to strategies
2016-10-21 15:07:23 +03:00
Ari Lemmetti
89b941eab4
Fix typo
2016-10-21 15:07:02 +03:00
Alexis Ballier
1dcc993743
Include i386 & i486 for compiling intel asm.
...
x86_64-pc-linux-gnu-gcc -m32 that I use for building 32bits libraries on amd64 defines only __i386__.
2016-10-14 18:07:37 +02:00
Arttu Ylä-Outinen
5fb7afe8c4
Add --implicit-rdpcm command line parameter.
...
Makes it possible to use lossless coding without implicit residual DPCM.
2016-10-03 20:01:55 +09:00
Arttu Ylä-Outinen
5affc0f527
Use implicit RDPCM in lossless mode.
...
Sets implicit RDPCM flag in SPS when lossy coding is disabled and
applies DPCM to intra residual when prediction mode is horizontal or
vertical.
2016-10-03 19:31:38 +09:00
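As a rough illustration of what applying DPCM to a residual means for the horizontal case in 5affc0f527, each residual sample is replaced by its difference to the left neighbour. The function below is a sketch under that assumption, with a hypothetical name and a plain row-major block layout; it is not taken from Kvazaar.

```c
#include <stdint.h>

/* Sketch of horizontal residual DPCM on a w*h block stored row by row:
 * each sample becomes its difference to the sample on its left.
 * Iterating right to left lets the block be processed in place. */
void rdpcm_horizontal(int16_t *res, int w, int h)
{
  for (int y = 0; y < h; y++) {
    for (int x = w - 1; x > 0; x--) {
      res[y * w + x] -= res[y * w + x - 1];
    }
  }
}
```

The vertical case would be analogous, differencing against the sample above instead of the sample to the left.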
Ari Koivula
016dbe0894
Further refine presets
...
The rd-complexity of slow presets is better with a less aggressive GOP.
Adding the GOP as part of the preset improved BDRate enough that it
no longer made sense to have veryslow target the best BDRate.
Instead, push that responsibility to placebo by making it a little bit
faster.
2016-09-29 17:35:12 +03:00
Ari Koivula
31c5ff0f16
Add cross-platform core number detection
...
Well, turns out pthread_num_processors_np isn't standard so we need to
do this crap. Threw in hyper threading detection as a bonus.
2016-09-29 00:03:21 +03:00
Ari Koivula
8c7351eac8
Fix lp-gop with depth 1
...
GOPs with depth 1 had the same structure as those with depth 2:
g4d3t1 = 3 2 3 1
g4d2t1 = 2 2 2 1
g4d1t1 = 2 2 2 1
It now results in the correct:
g4d1t1 = 1 1 1 1
2016-09-29 00:03:21 +03:00
Ari Koivula
a395aeaac9
Set default settings to those of --preset=medium
2016-09-29 00:03:21 +03:00
Ari Koivula
4388fe0d30
Set presets to ratedistortion-complexity optimized versions
2016-09-29 00:03:20 +03:00
Ari Koivula
facb1e16df
Use -p64 -q22 and --gop=lp-g4d3t1 by default
...
Coding inter without GOP of any kind really isn't a very sensible
default. Defaulting to B-GOP of some kind would be better,
but lp-gop is more robust for now.
2016-09-29 00:03:20 +03:00
Ari Koivula
d7391a9593
Improve default for number of parallel frames
2016-09-29 00:03:20 +03:00
Ari Koivula
19d423ab29
Use all available cores by default
2016-09-29 00:03:20 +03:00
Ari Koivula
3f138f087a
Allow non-gop-length --period for lp-gop
2016-09-29 00:03:19 +03:00
Ari Koivula
16790c9f15
Remove number of references from --gop=lp syntax
...
The number of references should be part of the presets, so gop should
be defined separately.
2016-09-29 00:03:19 +03:00
Ari Koivula
cbfa824d1a
Merge branch 'simd'
2016-09-27 20:49:45 +03:00
Ari Koivula
14a7bcba25
Use a faster function for clipped inter SAD
...
Use the vectorized general SSE41 inter SAD in AVX reg_sad for shapes
for which we don't have AVX versions yet.
Also improves speed of --smp and --amp a lot. Got a 1.25x speedup for:
--preset=ultrafast -q 27 --gop=lp-g4d3r3t1 --me-early-termination=on --rd=1 --pu-depth-inter=1-3 --smp --amp
* Suite speed_tests:
-PASS inter_sad: 0.898M x reg_sad(64x63):x86_asm_avx (1000 ticks, 1.000 sec)
+PASS inter_sad: 2.503M x reg_sad(64x63):x86_asm_avx (1000 ticks, 1.000 sec)
-PASS inter_sad: 115.054M x reg_sad(1x1):x86_asm_avx (1000 ticks, 1.000 sec)
+PASS inter_sad: 133.577M x reg_sad(1x1):x86_asm_avx (1000 ticks, 1.000 sec)
2016-09-27 20:48:30 +03:00
Arttu Ylä-Outinen
4313e56c2d
Add --no-rdoq-skip command line switch
2016-09-11 17:40:16 +09:00
Ari Koivula
a7a33b08ec
Remove --slice-addresses from usage message
...
And give a warning if it's used.
Slices will have to be implemented at some point, but they aren't yet
so let's not advertize them.
2016-09-10 21:06:00 +03:00
Eemeli Kallio
f41e428e5f
Removed kvz_skip_unnecessary_rdoq and reworked --rdoq-skip to skip 4x4 blocks when it is on.
2016-09-09 10:26:07 +03:00
Eemeli Kallio
ed9c0b0416
RDOQ reworked in rdo.c. rdoq_signhide now skips coeffs that are after best_last_idx.
2016-09-09 10:16:51 +03:00
Ari Koivula
02cd17b427
Add faster AVX inter SAD for 32x32 and 64x64
...
Add implementations for these functions that process the image line by
line instead of using the 16x16 function to process block by block.
The 32x32 is around 30% faster, and 64x64 is around 15% faster,
on Haswell.
PASS inter_sad: 28.744M x reg_sad(32x32):x86_asm_avx (1014 ticks, 1.014 sec)
PASS inter_sad: 7.882M x reg_sad(64x64):x86_asm_avx (1014 ticks, 1.014 sec)
to
PASS inter_sad: 37.828M x reg_sad(32x32):x86_asm_avx (1014 ticks, 1.014 sec)
PASS inter_sad: 9.081M x reg_sad(64x64):x86_asm_avx (1014 ticks, 1.014 sec)
2016-09-01 21:36:39 +03:00
Ari Koivula
d0512d25c6
Use fixed point in get_mvd_coding_cost
2016-08-30 21:37:12 +03:00
Ari Koivula
ec7507a935
Further optimize get_ep_ex_golomb_bitcost
...
Unrolled 16-bit log2 calculation.
2016-08-30 21:37:01 +03:00
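An unrolled 16-bit log2, as mentioned in ec7507a935, could look something like this. It is a generic sketch of the technique (a branch per halving step instead of a loop), not the code from the commit, and log2_u16 is a made-up name.

```c
#include <stdint.h>

/* Sketch of an unrolled floor(log2(v)) for 16-bit values, v > 0.
 * Each step tests the upper half of the remaining bits. */
unsigned log2_u16(uint16_t v)
{
  unsigned r = 0;
  if (v & 0xFF00) { v >>= 8; r += 8; }
  if (v & 0x00F0) { v >>= 4; r += 4; }
  if (v & 0x000C) { v >>= 2; r += 2; }
  if (v & 0x0002) { r += 1; }
  return r;
}
```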
Ari Koivula
a4ba794587
Optimize get_ep_ex_golomb_bitcost
...
Arrange the decision tree such that there are only 3 branches on the
most common paths and the more likely branch is always fall-through.
A profile guided optimization pass would probably do something similar.
2016-08-30 05:24:16 +03:00
Ari Koivula
82cfab58f8
Improve fast mvd coding cost estimation
...
A lot of time is being taken up by this function on ultrafast, and it
doesn't do a very good job. This change aims to both simplify the
logic and make the estimate better.
The logic is simplified by using a lookup for the mvd bit cost
step function instead of mimicking the binarization process. The
estimation is made better by checking fractional cabac bit costs.
The new function returns the same results as
kvz_get_mvd_coding_cost_cabac, but is also faster than the old
function.
2016-08-30 04:55:09 +03:00
Ari Koivula
d31be8eb27
Make mvd_coding_cost functions take const cabac
2016-08-30 04:46:46 +03:00
Ari Koivula
64d631c174
Fix 8bit to 10bit input conversion regression
2016-08-25 22:09:40 +03:00
Ari Koivula
27789125d8
Fix input bit depth conversion
...
The input was being shifted to the wrong direction.
2016-08-25 22:05:25 +03:00
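The direction of the shift in a bit depth conversion, which 27789125d8 corrects, can be sketched as follows. convert_bitdepth is a hypothetical helper and the simple shift (no rounding or bit replication) is an assumption made for illustration.

```c
#include <stdint.h>

/* Sketch: convert one sample between bit depths by shifting.
 * Going to a higher depth shifts left, to a lower depth shifts right. */
uint16_t convert_bitdepth(uint16_t sample, int from_bits, int to_bits)
{
  if (to_bits >= from_bits) {
    return (uint16_t)(sample << (to_bits - from_bits));
  }
  return (uint16_t)(sample >> (from_bits - to_bits));
}
```

For example, the 8-bit value 255 becomes 1020 at 10 bits, and shifting in the wrong direction would instead give 63.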
Ari Koivula
4ec039004b
Add monochrome encoding
...
Write bitstream without chroma when encoding with --input-format=P400.
This reduces bitstream size by 0-1 %, compared to coding monochrome in
420 format, and speeds up encoding slightly due to not processing
chroma.
2016-08-25 20:15:26 +03:00
Ari Koivula
c5b70cf812
Add chroma format support to yuv_t
2016-08-24 19:20:53 +03:00
Ari Koivula
032ed30ff4
Add chroma format support to kvz_picture
...
Add picture_alloc_csp to libkvz api to allocate pictures with chroma
format different from 420.
2016-08-24 19:20:53 +03:00
Ari Koivula
48ccc26839
Add --input-format and --input-bitdepth
...
Adds reading of 10 bit input for 10-bit encoding.
2016-08-24 19:20:53 +03:00
Ari Koivula
cc08073615
Refactor some indexing weirdness in init_lcu_t
...
I thought there might be a bug in this so I cleaned it up.
2016-08-24 19:12:48 +03:00
Ari Koivula
b6d674d66e
Refactor integer vector inter prediction
...
This code was pretty bad, so I cleaned it up a bit.
2016-08-24 19:09:26 +03:00
Ari Lemmetti
28c4174d0e
Fix incorrect shuffle parameters
...
_MM_SHUFFLE uses reverse order
2016-08-23 19:40:46 +03:00
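The "reverse order" remark in 28c4174d0e refers to the argument order of _MM_SHUFFLE: the first argument selects the source for the highest lane and the last argument for the lowest. The sketch below redefines the same bit layout under a different name (SHUFFLE_IMM, a made-up name) so that it compiles on any target, and models a 4-lane permute such as _mm_shuffle_epi32 in scalar code.

```c
/* Same bit layout as _MM_SHUFFLE from <xmmintrin.h>, redefined here so
 * the sketch does not depend on x86 headers. The first argument ends up
 * in the top two bits, the last argument in the bottom two bits. */
#define SHUFFLE_IMM(z, y, x, w) (((z) << 6) | ((y) << 4) | ((x) << 2) | (w))

/* Scalar model of a 4-lane permute such as _mm_shuffle_epi32:
 * dst[i] = src[(imm >> (2 * i)) & 3]. */
void permute4(int dst[4], const int src[4], int imm)
{
  for (int i = 0; i < 4; i++) {
    dst[i] = src[(imm >> (2 * i)) & 3];
  }
}
```

So SHUFFLE_IMM(3, 2, 1, 0) is the identity permutation, and SHUFFLE_IMM(0, 1, 2, 3) reverses the lanes.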
Ari Lemmetti
ce77bfa15b
Replace KVZ_PERMUTE with _MM_SHUFFLE
...
The same exact macro already exists
2016-08-22 19:08:46 +03:00
Jovasa
68eef660bd
Fixed search around mv_in in fullsearch not being saved.
2016-08-19 15:19:29 +03:00
Eemeli Kallio
99d8b9abeb
Changed skip_rdoq name to kvz_skip_unnecessary_rdoq. Changed the order it uses when it goes through CGs and tuned its sum calculation.
2016-08-18 14:02:56 +03:00
Eemeli Kallio
1fb4755f31
Added rdoq-skip to quant-generic.c
2016-08-18 12:17:54 +03:00
Eemeli Kallio
d20ac03ca2
Added --rdoq-skip option
2016-08-18 12:17:53 +03:00
Marko Viitanen
83cf801664
Fixed MV constraint condition in bipred
2016-08-18 08:53:17 +03:00
Marko Viitanen
5ae1c595f2
Fixed slice_temporal_mvp_enabled_flag and disabled TMVP with tiles
...
- slice_temporal_mvp_enabled_flag should be signalled also with non-IDR I-slices
2016-08-10 14:51:41 +03:00
Marko Viitanen
5326519182
TMVP cleanup and const qualifier fixes
2016-08-10 14:10:43 +03:00
Marko Viitanen
f40907260d
Added config parameter for TMVP and cmdline option --no-tmvp
...
- Enabled by default
- Cannot be used with GOP at the moment
2016-08-10 14:09:29 +03:00
Marko Viitanen
fd52dac1f7
Fixed TMVP scaling
2016-08-10 14:09:28 +03:00
Marko Viitanen
c664bc8cf7
Added flag collocated_ref_idx to the slice header
2016-08-10 14:09:28 +03:00
Marko Viitanen
c5f2611a38
Fixes for TMVP to work with the new CU array
2016-08-10 14:09:28 +03:00
Marko Viitanen
d85af5755b
TMVP working when only 1 ref frame
2016-08-10 14:09:28 +03:00
Marko Viitanen
39f0165efe
Fix a bug in TMVP, the reference cu_array was being overwritten
2016-08-10 14:09:27 +03:00
Marko Viitanen
adab8c327e
Clean TMVP code
2016-08-10 14:09:20 +03:00
Marko Viitanen
5fa8226ac9
Temporal merge candidate selection
2016-08-10 14:09:20 +03:00
Marko Viitanen
f83042f4a1
Temporal MV candidate selection
2016-08-10 14:09:19 +03:00
Marko Viitanen
f8671581e3
Implemented function kvz_inter_get_temporal_merge_candidates()
2016-08-10 14:09:19 +03:00
Marko Viitanen
2956bdb379
Added flag slice_temporal_mvp_enabled_flag
2016-08-10 14:09:19 +03:00
Arttu Ylä-Outinen
2a946bd88e
Rename encoder_state_t.global to frame
...
"Frame" is more accurate than "global" since when OWF is used, encoder
states for each frame have their own struct.
2016-08-10 13:22:36 +09:00
Arttu Ylä-Outinen
5fbb0a8c27
Fix includes
2016-08-10 13:05:40 +09:00
Arttu Ylä-Outinen
aabf6ca3ee
Extract encoding code from encoderstate.c
...
Moves functions kvz_encode_coding_tree and kvz_encode_coeff_nxn from
encoderstate.c to encode_coding_tree.c.
2016-08-09 22:16:50 +09:00
Arttu Ylä-Outinen
803f29be8f
Remove reconstructed picture allocation in lossless.
...
Changes encoder_set_source_picture to set the reconstructed picture to
a copy of the source picture instead of allocating a new picture when
lossless coding is used.
2016-08-03 14:25:08 +09:00
Arttu Ylä-Outinen
aaec473a19
Refactor encoder state initialization.
...
- Moves allocation of the reconstructed picture after the source picture
is set.
- Extracts main state initialization to a separate function from
encoder_state_new_frame.
- Changes kvz_encoder_feed_frame to return the frame.
- Renames some functions to better match their purpose.
2016-08-03 14:25:08 +09:00
Arttu Ylä-Outinen
cd7024b3a5
Skip computing SSD when using lossless coding.
...
The SSD is always zero since it is lossless.
2016-08-03 14:25:08 +09:00
Arttu Ylä-Outinen
fbbe5d1844
Use kvz_pixels_calc_ssd for SSD in search.c.
...
Replaces loops for computing SSDs by calling kvz_pixels_calc_ssd in
search.c.
2016-08-03 14:25:08 +09:00
Arttu Ylä-Outinen
22cc97ffb1
Fix missing field initializers.
2016-08-03 14:25:08 +09:00
Arttu Ylä-Outinen
06b82bf888
Disable filters, trskip and signhide in lossless.
...
When lossless coding is used, deblock and SAO are skipped, transform
skip flag is not written and sign hiding is not used.
2016-08-03 14:25:08 +09:00
Arttu Ylä-Outinen
97451ec401
Align assignments in encoder.c.
2016-08-03 14:25:08 +09:00
Arttu Ylä-Outinen
1dc94663c3
Bypass transform and quantization with --lossless.
...
When --lossless is given, set cu_transquant_bypass_flag for every CU and
bypass transform and quantization by directly copying reference pixels
to reconstruction and the residual to coefficients.
2016-08-03 14:25:08 +09:00
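The bypass described in 1dc94663c3 can be pictured with a small sketch. The function name, parameter layout and sample types are assumptions for illustration only; what is taken from the commit message is that the residual goes to the coefficients unchanged and the reconstruction is a copy of the reference pixels.

```c
#include <stdint.h>

/* Sketch of transquant bypass for one block of n samples.
 * The residual is stored as the coefficients without any transform,
 * and the reconstruction equals the original (reference) pixels. */
void bypass_block(const uint8_t *ref, const uint8_t *pred,
                  int16_t *coeff, uint8_t *rec, int n)
{
  for (int i = 0; i < n; i++) {
    coeff[i] = (int16_t)(ref[i] - pred[i]);  /* residual, no transform */
    rec[i]   = ref[i];                       /* lossless reconstruction */
  }
}
```

A decoder adding the coefficients back to the prediction recovers the reference pixels exactly, which is what makes the mode lossless.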
Arttu Ylä-Outinen
2113b0182d
Enable PPS-level tq bypass flag with --lossless.
...
Sets transquant_bypass_enable_flag to true in PPS when --lossless is
given.
2016-08-03 14:25:08 +09:00
Arttu Ylä-Outinen
a5897bbece
Make cabac context initialization tables static.
2016-08-03 14:25:08 +09:00
Arttu Ylä-Outinen
23e7d9bb37
Add --lossless command line parameter.
2016-08-03 14:25:08 +09:00
Arttu Ylä-Outinen
5372ea432f
Update README and manpage.
2016-08-03 14:25:08 +09:00
Ari Lemmetti
6bcba004ff
Comment out to fix unused code error on clang.
2016-07-14 14:12:16 +03:00
Ari Lemmetti
c0979ebdcb
Implement AVX2 luma sampling
2016-07-14 12:53:02 +03:00
Ari Lemmetti
6244560426
Add avx2 strategy for kvz_filter_frac_blocks_luma.
2016-07-14 12:53:02 +03:00
Ari Lemmetti
9c4e9e049b
Load only what is needed. Eliminate latency from hadds.
2016-07-14 12:53:01 +03:00
Ari Lemmetti
7f71cb423a
Check 4 fractional pixel positions simultaneously
2016-07-14 12:52:24 +03:00
Ari Lemmetti
ad445ab8a1
Transition to kvz_filter_frac_blocks_luma
2016-07-14 12:51:02 +03:00
Ari Lemmetti
fccfbd2f28
Add strategy for kvz_filter_frac_blocks_luma
2016-07-14 12:51:02 +03:00
Ari Lemmetti
e9c3074d32
Add buffers and definitions for upcoming filtering
...
Samples are to be filtered in separate blocks instead of
making one big picture with interpolated pixels
2016-07-14 12:51:02 +03:00
Ari Lemmetti
7afe7e963b
Use fme_level to control the search accuracy.
2016-07-14 12:51:01 +03:00
Ari Lemmetti
5fa323bf25
Skip searching best hpel twice. Make hpel and qpel loops similar.
2016-07-14 12:51:01 +03:00
Ari Lemmetti
bc98a9affa
Change the search order to suit lighter fme search
2016-07-14 12:51:01 +03:00
Ari Lemmetti
2b0c8db349
Add quad satd for avx2
2016-07-14 12:50:24 +03:00
Ari Lemmetti
0ff69fd6f8
Add any size multi satd
2016-07-14 12:48:37 +03:00
Ari Lemmetti
d17b9e7d6e
Allow subme parameters 0-4
...
Update usage, presets, defaults, lib version
2016-07-12 19:49:38 +03:00
Arttu Ylä-Outinen
62ad57d0bf
Fix kvz_image_list_add for zero-sized lists.
...
When a list does not have space for the new element, its size is
doubled. If the size of the list is zero, it would not be resized. Fixed
to always resize the list so that the new element can be added.
2016-06-22 13:35:16 +09:00
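The zero-size problem described in 62ad57d0bf is a general pitfall of growth by doubling, since doubling zero gives zero. A minimal sketch with a hypothetical int_list_t type (not Kvazaar's image list) might handle it like this:

```c
#include <stdlib.h>

/* Hypothetical growable list, loosely modeled on the description above. */
typedef struct {
  int   *items;
  size_t size;  /* allocated capacity */
  size_t used;  /* elements in use */
} int_list_t;

/* Append `value`, growing the list if it is full. Returns 1 on success
 * and 0 on allocation failure. A capacity of 0 is special-cased because
 * doubling it would leave no room for the new element. */
int int_list_add(int_list_t *list, int value)
{
  if (list->used == list->size) {
    size_t new_size = list->size ? list->size * 2 : 1;
    int *items = realloc(list->items, new_size * sizeof(*items));
    if (!items) return 0;
    list->items = items;
    list->size = new_size;
  }
  list->items[list->used++] = value;
  return 1;
}
```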
Arttu Ylä-Outinen
433e528af7
Drop unused variable in search_pu_inter.
...
Removes unused variable max_px_below_lcu.
2016-06-22 13:35:16 +09:00
Arttu Ylä-Outinen
7836ff6ec9
Drop unused functions.
...
Removes functions kvz_coefficients_calc_abs, kvz_intra_rdo_cost_compare
and kvz_rdo_cost_intra which are no longer used.
2016-06-22 13:35:15 +09:00
Arttu Ylä-Outinen
e4b5840f56
Add parentheses around macro arguments in cabac.h.
2016-06-22 13:35:15 +09:00
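Why parentheses around macro arguments matter (e4b5840f56) is easiest to see with a toy example. The macros below are invented for illustration and are unrelated to the ones in cabac.h.

```c
/* Unparenthesized argument: operator precedence leaks into the
 * expansion, so DOUBLE_BAD(1 + 2) expands to (1 + 2 * 2). */
#define DOUBLE_BAD(x)  (x * 2)

/* Parenthesized argument: the argument is evaluated as a unit,
 * so DOUBLE_GOOD(1 + 2) expands to ((1 + 2) * 2). */
#define DOUBLE_GOOD(x) ((x) * 2)
```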
Arttu Ylä-Outinen
a387b74e51
Fix resolution auto-detection.
...
Only try to guess the resolution from filename when neither width nor
height is given.
2016-06-22 13:35:15 +09:00
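The rule from a387b74e51, guessing the resolution from the file name only when neither dimension was given, could be sketched like this. Both function names are hypothetical, 0 is assumed to mean "not set", and the file name pattern is assumed to be of the form WIDTHxHEIGHT.

```c
#include <stdio.h>

/* Hypothetical helper: look for a "<width>x<height>" pattern in a file
 * name. Returns 1 and fills the outputs if found, 0 otherwise. */
int guess_resolution(const char *name, int *width, int *height)
{
  for (const char *p = name; *p; p++) {
    int w, h;
    if (sscanf(p, "%dx%d", &w, &h) == 2 && w > 0 && h > 0) {
      *width = w;
      *height = h;
      return 1;
    }
  }
  return 0;
}

/* Only guess when neither dimension was given (0 is assumed to mean
 * unset). A partially specified size is left untouched. */
void resolve_size(const char *name, int *width, int *height)
{
  if (*width == 0 && *height == 0) {
    guess_resolution(name, width, height);
  }
}
```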
Arttu Ylä-Outinen
097bf8f3c0
Add a typedef for mvd coding cost functions.
2016-06-20 13:56:10 +09:00
Arttu Ylä-Outinen
d3c0e49286
Update comments.
2016-06-16 20:25:08 +09:00
Arttu Ylä-Outinen
ae832cda8c
Pack cbf flags in cu_info_t to two bytes.
...
Reduces size of cu_info_t.
2016-06-16 20:24:19 +09:00
Arttu Ylä-Outinen
cad2d496b8
Enable 4x8 and 4x16 partition modes
...
Enables search for 2NxN and Nx2N partition modes for 8x8 CUs and 2NxnU,
2NxnD, nLx2N and nRx2N partition modes for 16x16 CUs.
Changes the loop for copying reconstructed luma pixels in
kvz_inter_recon_lcu to use 4 byte chunks instead of 8 byte chunks since
it is now possible to have 4 pixel wide blocks.
2016-06-16 20:23:16 +09:00
Arttu Ylä-Outinen
90df7350f0
Make deblocking work with 4 pixel wide blocks.
2016-06-16 20:21:50 +09:00
Arttu Ylä-Outinen
bf26661782
Add support for 4x4 blocks to SATD_ANY_SIZE.
...
Makes functions satd_any_size_generic and satd_any_size_8bit_avx2 work
on blocks whose width and/or height are not multiples of 8.
2016-06-16 18:53:17 +09:00
Arttu Ylä-Outinen
2ae260e422
Change width of cells in lcu_t to 4 pixels.
...
Intra mode info for NxN partition units is now stored in the
corresponding 4x4 cell in lcu_t.cu array.
2016-06-16 18:53:17 +09:00
Arttu Ylä-Outinen
360f5bb8da
Always use pixel coordinates for indexing lcu_t.
...
Removes macro LCU_GET_CU and uses LCU_GET_CU_AT_PX in its place.
2016-06-16 18:53:17 +09:00
Arttu Ylä-Outinen
46e8122d27
Add functions for indexing cu_array_t structures.
...
Replaces macro CU_ARRAY_AT with functions kvz_cu_array_at and
kvz_cu_array_at_const.
2016-06-16 18:52:19 +09:00
Arttu Ylä-Outinen
c5afabdd3b
Change width of cells in cu_array_t to 4 pixels.
2016-06-15 12:25:11 +09:00
Arttu Ylä-Outinen
57a3d9b4b9
Add a function for copying CU data from LCUs.
...
Adds function kvz_cu_array_copy_from_lcu which copies CU info data from an
lcu_t structure to a cu_array_t structure.
2016-06-15 12:25:11 +09:00
Arttu Ylä-Outinen
2c85a00a55
Change kvz_cu_array_alloc to use pixel dimensions.
...
Changes function kvz_cu_array_alloc to take width and height parameters
in pixels instead of SCUs.
2016-06-15 12:25:11 +09:00
Arttu Ylä-Outinen
b276a347c0
Add a macro for indexing cu_array_t.
...
Adds macro CU_ARRAY_AT(cu_array, x, y) to cu.h.
2016-06-15 12:25:11 +09:00
Arttu Ylä-Outinen
8ac1f1986e
Move CU array copy to a separate function.
...
Moves code for copying parts of cu_array_t to a new function
kvz_cu_array_copy in cu module.
2016-06-15 12:25:11 +09:00
Arttu Ylä-Outinen
41e75daed7
Fix overlapping memcpy in kvz_search_cu_smp.
...
The destination and source pointers might be equal. Fixed by replacing
the memcpy call with a simple assignment.
2016-06-15 12:25:11 +09:00
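The reasoning behind 41e75daed7 is that memcpy requires its source and destination not to overlap, while plain assignment is well defined when the two objects overlap exactly (that is, when they are the same object). A sketch with an invented mv_t type:

```c
typedef struct { int x, y; } mv_t;  /* illustrative type */

/* Safe even when dst == src: C permits assignment between exactly
 * overlapping objects, whereas memcpy(dst, src, n) with dst == src
 * violates its non-overlap requirement. */
void copy_mv(mv_t *dst, const mv_t *src)
{
  *dst = *src;
}
```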
Ari Lemmetti
29af8bcd21
Remove const to match function signature
2016-06-14 18:19:40 +03:00
Eemeli Kallio
5af6ab320c
Merge branch 'me_early_terminate'
...
Conflicts:
configure.ac
src/cfg.c
src/cli.c
src/kvazaar.h
src/search_inter.c
2016-06-14 15:03:35 +03:00
Eemeli Kallio
43c7778b82
Updated version number.
2016-06-14 10:53:04 +03:00
Arttu Ylä-Outinen
23fdeeaf10
Move mv_cand and mv_dir into a bitfield in cu_info_t.
...
Reduces size of cu_info_t.
2016-06-14 12:21:57 +09:00
Arttu Ylä-Outinen
35aadf6776
Reduce size of type in cu_info_t to two bits.
...
Reduces size of cu_info_t.
2016-06-14 12:21:57 +09:00
Arttu Ylä-Outinen
1cbe844f79
Move inter and intra into an union in cu_info_t.
...
Reduces size of cu_info_t.
2016-06-14 12:21:57 +09:00
Arttu Ylä-Outinen
b6d793ef33
Drop field inter.mvd from cu_info_t
...
Instead of storing the mv differences in cu_info_t, they are computed
from the mv candidates and the motion vector. Reduces the size of
cu_info_t.
2016-06-14 12:21:57 +09:00
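Computing the motion vector difference on demand, as b6d793ef33 describes, amounts to subtracting the selected predictor candidate from the motion vector. The types and the function below are illustrative assumptions, not the fields of cu_info_t.

```c
#include <stdint.h>

typedef struct { int16_t x, y; } vec2_t;  /* illustrative type */

/* The motion vector difference is the motion vector minus the selected
 * predictor candidate, so it does not have to be stored separately. */
vec2_t compute_mvd(vec2_t mv, const vec2_t cand[2], int cand_idx)
{
  vec2_t mvd;
  mvd.x = (int16_t)(mv.x - cand[cand_idx].x);
  mvd.y = (int16_t)(mv.y - cand[cand_idx].y);
  return mvd;
}
```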
Arttu Ylä-Outinen
98aa906f30
Drop field coded from cu_info_t
...
It can be inferred from the position and size of the CU.
2016-06-14 12:21:57 +09:00
Arttu Ylä-Outinen
ebb10763f1
Drop field inter.mv_ref_coded from cu_info_t.
...
Storing inter.mv_ref_coded in cu_info_t is unnecessary since it can be
computed from refmap and inter.mv_ref.
2016-06-14 12:21:57 +09:00
Arttu Ylä-Outinen
4be5c8f349
Move flags into a bitfield in cu_info_t.
...
Reduces the size of cu_info_t.
2016-06-14 12:21:57 +09:00
Arttu Ylä-Outinen
30e9ee988d
Move bitcost field out of cu_info_t.inter.
...
The bitcost is only needed for the currently searched CU.
Fixes bitcost of the second PU being ignored when using SMP or AMP.
2016-06-14 12:21:57 +09:00
Arttu Ylä-Outinen
16d13ed046
Move cost field out of cu_info_t.inter
...
The cost is only needed for the currently searched CU.
2016-06-14 12:20:05 +09:00
Arttu Ylä-Outinen
c5c2c182d9
Drop unused field mode from cu_info_t.inter.
2016-06-14 12:18:17 +09:00
Eemeli Kallio
e4f1a74512
Added early termination option for motion estimation.
...
Conflicts:
src/search_inter.c
2016-06-13 16:20:35 +03:00
Wassim Hamidouche
5bc7287c67
add fix for crypto
2016-06-09 10:49:31 +03:00
Wassim Hamidouche
35634b5596
correct MV sign encryption
2016-06-09 10:49:31 +03:00
Wassim Hamidouche
15abdc6e81
correct sign encryption
2016-06-09 10:49:31 +03:00
Wassim Hamidouche
73c3203a26
encry coef transfs
2016-06-09 10:49:31 +03:00
Wassim Hamidouche
7ad5f8bbe5
encry coef transf sign
2016-06-09 10:49:31 +03:00
Wassim Hamidouche
02b0712973
fix g++ compilation
2016-06-09 10:48:44 +03:00
Ari Koivula
a2170f0763
Compile the cryptopp wrapper only when used
...
This should allow us to avoid an unnecessary dependency to a C++
compiler.
Conflicts:
configure.ac
2016-06-07 17:11:12 +03:00
Ari Koivula
182038c743
Don't allow enabling encryption when it's not compiled in
2016-06-07 16:58:09 +03:00
Ari Koivula
8eb087120e
Make VisualStudio ignore the crypto stuff
...
Add stubs for the crypto functions so we can refer to them, even if we
never use them.
2016-06-07 16:58:09 +03:00
Wassim Hamidouche
76cb6dc6c2
add check flags
2016-06-07 10:54:26 +02:00
Ari Koivula
60ea8a359f
Add --crypto parameter
2016-06-07 10:31:40 +02:00
Wassim Hamidouche
02308d1ba6
add MVs encryption
2016-06-07 10:28:30 +02:00
Wassim Hamidouche
4637c8a828
compile Kvazaar encoder with ITpp library
2016-06-07 08:33:04 +02:00
Eemeli Kallio
8f182ac6de
Added functions select_starting_point and mv_in_merge to search_inter.c
2016-06-06 17:16:04 +03:00
Ari Koivula
fe71638a96
Fix problem with ASM compilation
...
When compiling C++ files along with C, libtool would complain about
the --tag missing, even though CC should be the default.
2016-06-06 15:47:56 +03:00
Eemeli Kallio
836a3b1daa
Added functions select_starting_point and mv_in_merge.
2016-06-06 12:18:33 +03:00
Ari Koivula
4eaacbe23e
Fix bug with lp-gop and ratecontrol
...
The first frame was always qp51 due to gop_offset being -1 for the
first frame. This fix makes it so that bits are allocated as if it was
the last (high quality) frame from the previous GOP.
2016-05-27 15:53:55 +03:00
Ari Koivula
3fbd7ed97f
Add GOP layer weights for lowdelay-P
...
When using ratecontrol with lowdelay-P, this improves BDRate by 1-25%.
Strongest effect is when using 4 layers and multiple references.
Also allow using 1 or 2 layers with ratecontrol.
2016-05-27 13:46:26 +03:00
Ari Koivula
67acead4bc
Fix referring over IDR boundary when using --gop
...
This problem resulted in an illegal bitstream with --gop=lp, because it
uses IDR's. The --gop=8 would not code IDR pictures, even when told to
with -p, which masked this problem.
This fix solves the problem with --gop=lp and also prevents references
across the intra picture in --gop=8. The intra pictures should be set
to IDR in a later fix, or an alternate method of differentiating
between IDR and non-IDR intra should be made.
2016-05-27 13:20:53 +03:00
Ari Koivula
a77dc1610e
Refactor encoder_state_remove_refs
...
I needed to debug this, so I rewrote it to make sense. There is an
obvious bug with the IDR handling that I left in place to fix in a
separate commit.
2016-05-27 13:20:45 +03:00
Eemeli Kallio
b5c05e58e0
Fixed typo in strategyselector.c
2016-05-24 11:04:29 +03:00
Ari Lemmetti
68c6f0f7b8
Enable deblocking for every preset
...
Deblocking adds very little complexity
while giving massive coding performance boost
2016-05-17 18:50:31 +03:00
Ari Lemmetti
6a07761b46
Add smp and amp options to presets
2016-05-17 14:26:58 +03:00
Ari Lemmetti
3107a93eaf
Fix avx2 chroma sampling for amp
2016-05-17 14:09:57 +03:00
Ari Koivula
24d0f9f685
Fix usage message for --hash
2016-05-11 15:03:43 +03:00
Ari Koivula
a1c772b696
Merge pull request #136 from MrAsura/cu-split-termination
...
Cu split termination
Closes #133.
2016-05-10 17:22:08 +03:00
Jaakko Laitinen
7010526b1d
Removed tabs.
2016-05-10 15:52:44 +03:00
Jaakko Laitinen
a77eb5c874
Fixed type conversion error when parsing cu split termination.
2016-05-10 14:34:46 +03:00
Jaakko Laitinen
0d361d5bc7
Moved cu split termination from the preprocessor to an input parameter.
2016-05-10 14:15:41 +03:00
Ari Koivula
1dbe4eb852
Merge branch 'mv-full'
2016-05-10 13:28:07 +03:00
Ari Koivula
f6a9d237a3
Merge pull request #134 from miimiz/testink_eemeli
...
Strategyselector prints
2016-05-10 13:27:23 +03:00
Eemeli Kallio
8cfeed852c
Added print about SIMD optimizations available and in use to strategyselector.
2016-05-10 12:59:15 +03:00
Ari Koivula
f51a68b6fa
Add different sizes of search window for full search
2016-04-21 15:11:35 +03:00
Ari Lemmetti
efbdc5dade
Utilize registers more efficiently for 8x8 and larger blocks
2016-04-21 13:26:38 +03:00
Ari Lemmetti
192cee95b2
Vectorize vertical filtering
2016-04-21 13:26:38 +03:00
Ari Lemmetti
0be35f72b8
Filter 4 pixels simultaneously in x direction
2016-04-21 13:26:38 +03:00
Ari Lemmetti
10484bda9f
Make strategies out of fractional pixel sample functions
2016-04-21 13:26:38 +03:00
Ari Koivula
28e7548387
Fix bug in full mv search
...
This optimization led to some points not being searched.
2016-04-21 12:03:57 +03:00
Ari Koivula
2576aeee0b
Use merge candidates in full mv search
...
Perform a full search window around every mv candidate and the
0-vector.
2016-04-20 20:47:11 +03:00
Ari Lemmetti
8247faf8e0
Remove 64-bit only instruction to fix 32-bit compilation.
2016-04-19 18:05:11 +03:00
Ari Lemmetti
eb55d6b6b9
Fix writing over boundary.
2016-04-19 16:03:43 +03:00
Ari Lemmetti
bcabc6fadd
Remove pixel blit from strategies. Use memcpy instead.
2016-04-06 18:44:04 +03:00
Ari Lemmetti
2140197ccc
Tidy up coeff blit function and use memcpy again.
...
Give memcpy constants for fixed sizes to enable copying many bytes simultaneously.
2016-04-06 18:03:00 +03:00
Ari Koivula
08b4480d94
Re-add time.h include
...
Include-what-you-use wants to include sys/time.h instead, or if I
override it to include time.h it will remove the include completely.
2016-04-02 19:05:16 +03:00
Ari Koivula
61fc3e87ba
Run include-what-you-use fix_includes.py
...
The includes should make more sense now and not just happen to compile
due to headers included from other headers.
Used a modified version of IWYU. Modifications were to attribute int8_t
and so on to stdint.h instead of sys/types.h and immintrin.h instead of
more specific headers.
include-what-you-use 0.7 (git:b70df35)
based on clang version 3.9.0 (trunk 264728)
2016-04-01 17:46:55 +03:00
Ari Koivula
016810d982
Move COMPILE_ macro to global.h
...
While these are only used for strategies, it's non-intuitive to have
to include strategyselector.h in every file under strategies before
including anything else.
2016-04-01 17:46:55 +03:00
Ari Koivula
8908d85d66
Change all relative includes to absolute
2016-04-01 17:46:44 +03:00
Ari Koivula
4876879b82
Add IWYU pragmas
2016-03-31 12:33:34 +03:00
Marko Viitanen
41a5f9bbbe
Fix filetime conversion to timespec
2016-03-24 10:08:11 +02:00
Ari Koivula
9139e169fe
Fix unnecessary waiting in main thread
...
The main thread has to wait for the worker threads to finish. The
pthread_cond_timedwait call used to accomplish this was given
a relative instead of absolute time, which resulted in the call
returning immediately, because the time had already passed.
This removes the now unnecessary sleeps and fixes the time given to
the pthread_cond_timedwait such that it now waits until a job finishes
or 100ms have passed.
2016-03-23 22:23:04 +02:00
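The bug fixed in 9139e169fe comes from pthread_cond_timedwait taking an absolute deadline rather than a relative timeout. A sketch of building such a deadline is shown below; deadline_after_ms is a hypothetical helper, not a function from Kvazaar's threadqueue.

```c
#include <time.h>

/* Build the absolute deadline that pthread_cond_timedwait expects:
 * "now + timeout", not the timeout itself. */
struct timespec deadline_after_ms(struct timespec now, long timeout_ms)
{
  struct timespec t = now;
  t.tv_sec  += timeout_ms / 1000;
  t.tv_nsec += (timeout_ms % 1000) * 1000000L;
  if (t.tv_nsec >= 1000000000L) {  /* carry into seconds */
    t.tv_sec  += 1;
    t.tv_nsec -= 1000000000L;
  }
  return t;
}
```

In use, `now` would typically come from clock_gettime(CLOCK_REALTIME, &now), which matches the default clock of a condition variable, and the result would be passed as the third argument of pthread_cond_timedwait. Passing a bare 100 ms timespec instead describes a moment shortly after the epoch, which has already passed, so the call returns immediately.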
Ari Koivula
e23ed231fb
Fix race condition with owf and non-square motion partitions
...
The OWF wpp limit code assumed square blocks, and as such did not work
correctly when height != width. This changes the relevant code to consider
both height and width.
2016-03-22 16:46:38 +02:00
Arttu Ylä-Outinen
d6a3e02f16
Fix calculating reference CU index in inter search
...
Fixes a possible segfault when SMP or AMP blocks are used.
2016-03-22 12:55:58 +02:00
Ari Lemmetti
f4538ab474
Copy pixels more efficiently in lcu recon.
2016-03-18 20:10:03 +02:00
Ari Koivula
5b66578f71
Add kvz_ prefix to md5 functions
...
The non kvz_ symbols were being exported in the static lib, which got caught
by Travis tests.
2016-03-18 13:13:35 +02:00
Ari Koivula
4125218cfa
Add --hash=md5
...
Add md5 through extras/libmd5 taken from HM with BSD license. It's
implemented as a generic strategy using the same interface as checksum,
so we can write a SIMD version if it seems necessary.
2016-03-18 05:23:57 +02:00
Ari Koivula
883448b8fb
Add --hash parameter
...
Allows decoded picture hash to be selected among none and checksum.
2016-03-18 05:20:15 +02:00
Ari Lemmetti
6d5f8e3aec
Define KVZ_COMPILE_ASM for the correct files.
...
Enables asm strategies again.
2016-03-17 16:21:31 +02:00
Ari Lemmetti
e502292ba8
Remove old function
2016-03-16 20:18:55 +02:00
Ari Lemmetti
c6cc96f5ec
Optimize sao band ddistortion
2016-03-16 20:16:00 +02:00
Ari Lemmetti
ab577f476f
Optimize sao reconstruct color
2016-03-16 20:15:32 +02:00
Ari Lemmetti
48bfddf4ec
Optimize calc sao edge dir
2016-03-16 20:14:50 +02:00
Ari Lemmetti
ba69992941
Optimize sao edge ddistortion
2016-03-16 20:14:19 +02:00
Ari Lemmetti
941b6b3e27
Optimize calc eo cat
2016-03-16 20:13:30 +02:00
Ari Lemmetti
04fbb48a09
Add strategy for avx2. Copy generic functions there.
2016-03-16 20:13:15 +02:00
Ari Lemmetti
4e30a215d8
Create generic strategy for sao.
2016-03-16 20:11:15 +02:00
Ari Koivula
6f431e510c
Comment and tidy threadqueue_worker
...
Carefully avoided making any changes to the logic.
2016-03-14 20:08:04 +02:00
Ari Koivula
1165ae2e1f
Increase --mv-constraint=frametilemargin margin
...
Increase the margin to be 4 luma pixels to every direction.
2016-03-14 16:02:54 +02:00
Arttu Ylä-Outinen
0eda28ced6
Fix Visual Studio warnings
...
Initialization of a struct with addresses of local variables generated
warning C4221 in encmain.
2016-03-14 14:12:21 +02:00
Ari Koivula
e91ca74733
Refactor kvz_encode_last_significant_xy
2016-03-10 18:47:16 +02:00
Ari Koivula
1fc0e8076c
Format kvz_encode_last_significant_xy whitespace
2016-03-10 18:17:45 +02:00
Ari Koivula
df9a958ef2
Merge branch 'log2'
2016-03-10 18:16:41 +02:00
Ari Koivula
4112a4364d
Remove g_to_bits table
2016-03-10 15:59:51 +02:00
Ari Koivula
9fcfba637f
Remove duplicated inline functions
2016-03-10 15:28:31 +02:00
Ari Koivula
e27ec2cc53
Add kvz_math.h for common inline math functions
...
Calling it just math.h would have prevented including system math.h.
2016-03-10 15:26:18 +02:00
Ricardo Constantino
c515796a21
Only use version prefix in kvazaar binary
...
Fixes a regression since 54f08f2 that caused libkvazaar version
checks (e.g. with pkg-config) to not work.
2016-03-09 16:13:59 +00:00
Arttu Ylä-Outinen
54f08f2bdb
Use output of git describe as version.
2016-03-09 15:04:29 +02:00
Ari Koivula
f8edf28161
Fix const qualifier warning
...
Also set the warning to an error in VS.
2016-03-09 14:16:15 +02:00
Ari Koivula
b0c3ece31e
Fix race condition when deblocking is on but SAO is off
...
Already suspected this yesterday, but didn't want to add the code to
handle it before confirming that it's actually a problem. It is.
2016-03-09 14:02:46 +02:00
Ari Koivula
1671725c72
Fix non-determinism issue with OWF WPP margin
...
The previous reasoning used deblocking and fractional motion estimation
together to arrive at a margin of 4 pixels. This was wrong, and with
either of these off, half pixel chroma interpolation could use pixels
outside the intended region.
Deblocking does not currently affect the margin needed.
2016-03-08 20:18:38 +02:00
Ari Koivula
674bfa14ce
Comment WPP deblocking and SAO
...
I was a bit unclear about exactly what happens and when regarding SAO
and deblocking when we do frame-parallel WPP parallelism, so I checked
and commented the bits that were unclear to me.
2016-03-08 19:39:04 +02:00
Ari Koivula
aec152c953
Fix OWF mv restriction limit
...
The check was done in regard to the wrong dimension, allowing the
access to unfinished parts of the frame when coding multiple frames
at the same time.
2016-03-08 17:12:43 +02:00
Ari Koivula
fda103aa7c
Refactor cfg->tiles_width_count and cfg->tiles_height_count
...
Change code everywhere so these actually mean "width count" and not
"width count minus one".
2016-03-07 17:29:15 +02:00
Ari Koivula
a350eb3a1e
Fix --tiles to have the correct number of tiles.
...
The tiles_width_count etc. actually mean "count minus one".
2016-03-07 17:24:31 +02:00
Ari Koivula
49ea2d7b7f
Fix --mv-constraint=frametile
...
Option --mv-constraint=frametilemargin was being used instead of
frametile.
2016-03-07 16:41:00 +02:00
Ari Koivula
95b8dd99f6
Add --tiles parameter
...
Add new parameter --tiles that accepts only a uniform split. I considered
supporting the syntax of --tiles-width-split for this, but writing
--tiles=u2xu2 is just not as intuitive as --tiles=2x2, and there is
hardly ever any reason to use anything but uniform split. The more
cumbersome --tiles-width-split and --tiles-height-split parameters
are still there to allow finer control.
2016-03-07 16:33:51 +02:00
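A rough sketch of how a uniform split such as --tiles=2x2 could be parsed. The function name parse_uniform_tiles and the assumption that the first number is the column count are illustrative only; the actual parsing code in Kvazaar is not shown in this log.

```c
#include <stdio.h>

// Hypothetical parser for a uniform tile split such as "2x2".
// Returns 1 and stores the counts on success, 0 on malformed input.
static int parse_uniform_tiles(const char *arg, int *cols, int *rows)
{
  int c = 0;
  int r = 0;
  char trailing = 0;
  // Exactly two integers separated by 'x' must match. If the trailing %c
  // also matches, there is junk after the second number.
  if (sscanf(arg, "%dx%d%c", &c, &r, &trailing) != 2) return 0;
  if (c < 1 || r < 1) return 0;
  *cols = c;
  *rows = r;
  return 1;
}
```

With this sketch "2x2" and "4x3" are accepted, while the --tiles-width-split style "u2xu2" is rejected.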
Ari Koivula
fd34dd9bc6
Fix race condition with OWF
...
There was an off-by-one error in the dependency setting code, which
resulted in dependencies not being set, causing checksum errors,
for example when ref_neg=1 and owf=1.
2016-03-07 13:38:23 +02:00
Ari Koivula
81b439f4da
Optimize starting point selection in tz
...
Avoid checking zero motion vectors multiple times. The merge candidate
list often has only one or two candidates, the rest being zeroes.
2016-03-04 16:48:46 +02:00
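One way the idea in 81b439f4da and 2436702c27 could look, as a sketch only. It assumes the zero vector is always evaluated once up front and that zero-valued candidates are then skipped. The types mv_t and mv_cost_fn, the function select_start and the toy cost function example_cost are all hypothetical and not taken from the Kvazaar search code.

```c
#include <stdint.h>
#include <stdlib.h>

typedef struct { int16_t x, y; } mv_t;    // hypothetical motion vector type
typedef uint32_t (*mv_cost_fn)(mv_t mv);  // hypothetical cost callback

// Toy cost for demonstration: city-block distance to the vector (3, -1).
static uint32_t example_cost(mv_t mv)
{
  return (uint32_t)(abs(mv.x - 3) + abs(mv.y + 1));
}

// Picks the cheapest starting point among the zero vector and the
// candidates. The zero vector is evaluated exactly once, no matter how
// many zero entries the candidate list contains.
static mv_t select_start(const mv_t *cands, int num_cands,
                         mv_cost_fn cost, int *num_checked)
{
  mv_t best = { 0, 0 };
  uint32_t best_cost = cost(best);
  int checked = 1;
  for (int i = 0; i < num_cands; i++) {
    if (cands[i].x == 0 && cands[i].y == 0) continue;  // already checked
    uint32_t c = cost(cands[i]);
    checked++;
    if (c < best_cost) {
      best_cost = c;
      best = cands[i];
    }
  }
  if (num_checked) *num_checked = checked;
  return best;
}
```

For a list of five candidates where four are zero, this performs two cost evaluations instead of five.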
Ari Koivula
2436702c27
Optimize starting point selection in hexbs
...
Avoid checking zero motion vectors multiple times. The merge candidate
list often has only one or two candidates, the rest being zeroes.
2016-03-04 16:48:12 +02:00
Ari Koivula
5327b59b45
Remove KVZ_PERF_SEARCHPX
...
It's too invasive and we don't really need it.
2016-03-04 16:48:12 +02:00
Arttu Ylä-Outinen
348ac4888b
Fix calc_mode_bits.
...
The CUs left and above the current one would be set to NULL when there
was only one CU between the current one and the left or top edge of the
frame.
2016-03-04 14:08:35 +02:00
Ari Koivula
86219aa0fc
Fix non-determinism with tiles
...
The earlier fix that made the supply side of the cu_array take tile
coordinates into account should have been accompanied by this one,
which does the same thing for the demand side.
2016-03-03 17:39:20 +02:00
Arttu Ylä-Outinen
626b53ce85
Move sao search from encoderstate to sao.
...
Moves sao search from function encoder_state_worker_encode_lcu in
encoderstate.c to function kvz_sao_search_lcu in sao.c. Makes functions
kvz_init_sao_info, kvz_sao_search_chroma and kvz_sao_search_luma static
since they are no longer used outside sao.c.
2016-03-01 14:56:16 +02:00
Ari Koivula
cfa722e448
Reduce parallelism for tiles
...
There is still some race-condition with encoding tiles from multiple
frames, so disable this to keep the bitstream deterministic.
2016-02-29 20:20:21 +02:00
Ari Koivula
3dcc0957f8
Deal with impossible mv constraints
...
If the 0,0 vector is illegal, it's possible that no legal movement vector
is found, in which case a large cost is returned instead. The cost
overflowed and there is all sorts of silliness with converting from
double to int, but I'm not going to fix all of it because when we
remove the doubles it will all get fixed.
2016-02-29 19:18:14 +02:00
Ari Koivula
b1adf1576a
Add --mv-constraint=frametilemargin
...
Add an even stricter motion vector constraint to prevent motion vectors
from pointing to fractional pixel positions that would need pixels
outside the tile.
2016-02-29 19:18:14 +02:00
Ari Koivula
f808cbf608
Allow increased parallelism for tiles
...
When movement vectors are constrained to tiles, only the same tile in
previous frame needs to be depended upon.
2016-02-29 14:33:06 +02:00
Ari Koivula
f4ebff12b0
Combine tile mv constraint with OWF mv constraint
...
This also fixes movement vectors in tiles when OWF is on. The OWF mv
constraint assumed WPP, so it didn't work with tiles.
2016-02-29 14:33:06 +02:00
Ari Koivula
7981609cd0
Add --mv-constraint=frametile
2016-02-29 14:33:06 +02:00
Ari Koivula
9dbbb7fdbc
Add --mv-constraint argument
2016-02-29 14:33:06 +02:00
Ari Koivula
1be877faf9
Fix chroma reconstruction with tiles
...
An incorrect frame boundary check caused a checksum error, because the
chroma reconstruction of the encoder was wrong. The encoder treated
horizontal tile boundaries as frame boundaries when the vertical
component of the movement vector was a multiple of 8.
2016-02-29 14:32:51 +02:00
Ari Koivula
c0dc490dd1
Fix inter non-determinism with tiles
...
CU data was being copied to the wrong place in the reference frames
cu_array, which led to uninitialized data being used as a starting
point for motion vector search.
Fixes #99.
2016-02-26 17:05:04 +02:00
Ari Koivula
719d72925b
Add loop-input option
...
This option is useful for testing long encodes, as you don't have to
find an actual infinite input.
2016-02-18 20:00:55 +02:00
Ari Koivula
d23a5a15f1
Fix overflow in rate control
...
A 32 bit int overflowed after 2^31 bits (2Gb). It will still overflow
eventually, after 500 years of outputting 1Gb/s, but by that time,
I reckon we will have fixed this properly and it's time to upgrade.
2016-02-18 16:48:21 +02:00
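The arithmetic behind d23a5a15f1 can be checked with a short sketch. The log does not say which 64-bit type replaced the 32-bit int, so both the signed and the unsigned case are considered here. The helper name seconds_until_overflow and the SECONDS_PER_YEAR macro are hypothetical.

```c
#include <stdint.h>

// 365-day year, which is precise enough for an order-of-magnitude estimate.
#define SECONDS_PER_YEAR (365ULL * 24 * 3600)

// Whole seconds until a bit counter with the given maximum value overflows
// when it grows by bits_per_second every second.
static uint64_t seconds_until_overflow(uint64_t counter_max,
                                       uint64_t bits_per_second)
{
  return counter_max / bits_per_second;
}
```

At 1 Gb/s, a signed 32-bit counter (INT32_MAX) lasts about 2 seconds. A signed 64-bit counter lasts roughly 292 years and an unsigned one roughly 584 years, which is the same order of magnitude as the 500 years quoted in the commit message.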
Ari Koivula
eeafe14946
Clean up search initialization
...
Copy lcu explicitly instead of initializing with the same parameters.
2016-02-17 14:57:31 +02:00
Arttu Ylä-Outinen
e5c84c361c
Eliminate a race condition with input thread.
...
Changes communication between the input thread and main thread in
encmain.c so that only one of them uses img_in and retval at a time.
Fixes a race condition which would sometimes result in a deadlock.
2016-02-17 12:09:19 +02:00
Ari Koivula
c40ede56ad
Allow more frame parallelism in LP-gop
...
Add dependency to the reference frame instead of the previous frame,
in order to allow more frames to be encoded in parallel when temporal
stepping >1 in LP-gop (such as --gop=lp-g8d4r1t2).
2016-02-05 17:08:24 +02:00
Arttu Ylä-Outinen
40c7198f7d
Add a script for updating README
...
Adds script tools/update_readme.sh for regenerating the "Using Kvazaar"
section of README.md from the output of "kvazaar --help".
2016-02-05 16:21:39 +02:00
Arttu Ylä-Outinen
aac5373095
Fix typos in documentation
...
Fixes a few typos in README and command line help.
2016-02-05 16:21:27 +02:00
Ari Koivula
a4915dc547
Update man and README
2016-02-04 14:16:58 +02:00
Ari Koivula
e941e21cd6
Enable errors about non-existing CLI options
...
Set opterr and optind to their normal default values.
2016-02-04 13:48:58 +02:00
Ari Koivula
7a4bf94a52
Add --version and --help
...
Also don't print help by default, because it's too long. Print a
shorter usage message instead.
2016-02-04 13:48:48 +02:00
Ari Lemmetti
99e37ec235
Update old pixel type to the current one
2016-01-30 19:33:09 +02:00
Ari Koivula
c76a0951cf
Change version to 0.8.3
2016-01-28 21:21:02 +02:00
Ari Koivula
cb2121b1aa
Double time scale when field coding is used
2016-01-28 21:04:52 +02:00
Ari Koivula
8ad7d2a714
Move interlacing stuff to libkvazaar API
...
This moves the interlacing from CLI code to api->encoder_encode, in
order to make it possible to use field coding through the lib API.
The field order is now determined per frame, as FFmpeg gives it per
frame and it's signaled per frame.
As a side effect, the CLI also now prints info from frames instead of
fields. While we might want to extend the API in the future to allow
printing of more detailed information about fields, for now it's
more important that the CLI uses the real lib API.
PSNR calculation for interlaced frames is disabled until we have a way to
avoid deinterlacing the frame when it's not necessary.
2016-01-27 15:29:45 +02:00
Ari Koivula
6952f0fcc6
Refactor interlaced reading
...
Doesn't change the way it works. Just rearranges things so it's easier
to see what is going on.
2016-01-26 13:42:41 +02:00
Ari Koivula
a46351efe1
Fix out of bounds error in interlacing
...
When field height was padded to a multiple of 8, yuv_io_extract_field
would read outside the buffer.
2016-01-26 13:41:52 +02:00
Arttu Ylä-Outinen
49677810b5
Rename config module to cfg.
...
Prevents a conflict with config.h and src/config.h so that the config.h
generated by configure is included in global.h. Fixes problems with
large input files on 32-bit systems.
2016-01-25 12:26:46 +02:00
Marko Viitanen
8e6c12b859
Merge branch 'input_reading_thread'
2016-01-25 12:00:03 +02:00
Marko Viitanen
b4a4ce848c
Use field parity for extracting correct fields from the interlaced picture
2016-01-25 10:58:12 +02:00
Marko Viitanen
441ce7728f
Fix for input_read_thread() in the case when interlaced source-scan-type is used
2016-01-25 10:57:51 +02:00
Marko Viitanen
198204a20a
Fix offset being used for output lines when using --source-scan-type=bff
2016-01-25 10:13:51 +02:00
Ari Koivula
22b8ed43dc
Remove global.h include from kvazaar.h
...
It shouldn't have been put there as it's the lib interface.
2016-01-22 15:23:34 +02:00
Ari Koivula
249c88011e
Fix problem with >2GB input files on 32bit
2016-01-22 15:15:02 +02:00
Ari Koivula
fa1af14637
Fix includes to include global.h first everywhere
2016-01-22 15:07:49 +02:00
Ari Koivula
3bf278529c
Fix interlacing when using lib interface
...
Some flags used for interlacing were set in CLI interface, which
meant that interlacing didn't work correctly when used through
libkvazaar.
2016-01-22 14:35:20 +02:00
Marko Viitanen
0128ee26e7
Clear img_in pointer after reading it
2016-01-22 14:29:35 +02:00
Marko Viitanen
b5459c1f23
Fixed performance monitoring by adding KVZ_ prefix to GET_TIME
2016-01-22 11:27:25 +02:00
Marko Viitanen
e36237335e
Fixed memory leaks caused by the input handler thread and cleaned up the code
2016-01-22 11:27:25 +02:00
Marko Viitanen
ad9a1f6539
Input thread implementation
...
- Handle input processing in a separate thread to allow main thread more time with thread handling etc
- Significant speedup can be seen when run on ultrafast settings and on a system with a large number of cores
2016-01-22 11:27:25 +02:00
Ari Koivula
5e734593c0
Add psnr argument to CLI
...
Allows disabling the calculation of PSNR for frames, printing 0.0dB instead.
2016-01-21 15:08:34 +02:00
Ari Koivula
9eba3a83cc
Add compiler flag checking to configure
2016-01-20 16:32:34 +00:00
Arttu Ylä-Outinen
d452709795
Fix compiling AVX2 strategies.
...
Option -mavx2 was omitted when compiling AVX2 strategies. This commit
moves strategies to convenience libraries so that their compilation
flags can be easily set and adds -mavx2 to CFLAGS of the AVX2 library.
2016-01-20 11:04:12 +02:00
Ari Koivula
8060e2f6ec
Delete kvazaar_version.h
...
It's not used anymore.
2016-01-19 20:40:35 +02:00
Ari Lemmetti
44656aeb19
Remove useless calculation
2016-01-19 16:35:16 +02:00
Marko Viitanen
e822c16659
Removed unneeded cpu flags causing compiling to fail on powerpc, closes #121
2016-01-18 08:55:32 +02:00
Ari Koivula
c8c0b4e8e8
Change version number for v0.8.2
2016-01-15 19:42:07 +02:00
Ari Koivula
e2402c0000
Remove kvz_api_get versioning.
...
We have soname versioning now, so we should focus on getting that right
instead. This also serves as an example of correctly incrementing the
lib-version.
2016-01-15 19:39:24 +02:00
Ari Koivula
caf809f26d
Remove scons build scripts
...
Because we are not going to maintain them.
2016-01-15 17:35:35 +02:00
Ari Koivula
15e1110997
Remove reference to Makefile-old
...
Makefile-old was deleted and this reference breaks make dist.
2016-01-15 17:32:54 +02:00
Ari Lemmetti
a9decd2f40
Bump for yet another release
2016-01-14 23:23:11 +02:00
Ari Koivula
7718ac378f
Add fractional FPS support.
...
Now that we put the timing info into the bitstream, the time base must
be precisely known. Represent framerate as a fraction and add timing
info only if the old floating point framerate was not used.
Deprecate cfg->framerate so it can be removed once we get patches to
FFmpeg and libav.
Add support for (num)/(denom) format to --input-fps.
2016-01-14 22:16:53 +02:00
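A minimal sketch of parsing the (num)/(denom) form mentioned in 7718ac378f. The function name parse_fps is hypothetical, and the sketch deliberately handles only integers and integer fractions; how Kvazaar treats the older floating point form is not covered here.

```c
#include <stdlib.h>

// Hypothetical parser for frame rates given as "num" or "num/denom".
// Returns 1 and stores the fraction on success, 0 on malformed input.
static int parse_fps(const char *arg, long *num, long *denom)
{
  char *end = NULL;
  long n = strtol(arg, &end, 10);
  long d = 1;
  if (end == arg) return 0;            // no digits at all
  if (*end == '/') {
    const char *denom_str = end + 1;
    d = strtol(denom_str, &end, 10);
    if (end == denom_str) return 0;    // missing denominator
  }
  // Reject trailing characters and non-positive values.
  if (*end != '\0' || n <= 0 || d <= 0) return 0;
  *num = n;
  *denom = d;
  return 1;
}
```

For example, "30000/1001" yields the NTSC rate as an exact fraction and "25" yields 25/1, whereas "30/" and "30/0" are rejected. Keeping the rate as a fraction is what allows the time base to be written into the bitstream without rounding.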
Ari Lemmetti
a9bd7b9e63
Bump version numbers for release v0.8.0
2016-01-14 20:38:28 +02:00
Ari Lemmetti
b605e3866e
Bye bye Makefile
2016-01-14 20:38:01 +02:00
Marko Viitanen
242edf98ad
Added calculation and writing of VUI num_units_in_tick and time_scale
2016-01-14 15:32:33 +02:00
Ari Lemmetti
daf39e348f
Add dedicated handling for blitting NxN coeffs when N is 4, 8 or 16
2016-01-13 19:27:45 +02:00
Ari Lemmetti
a2fc9920e6
Merge branch 'alternative-satd'
2016-01-13 15:00:43 +02:00
Ari Lemmetti
1ed34f2df8
Add some planar pred optimization for blocks larger than 8x8
2016-01-13 14:50:17 +02:00
Ari Lemmetti
0df88697ff
Copy generic function to AVX2 strategy
2016-01-12 23:51:18 +02:00
Ari Lemmetti
62799a9fc3
Create generic strategy of planar prediction
2016-01-12 23:50:47 +02:00
Ari Lemmetti
3cb1cebfe5
Add missing inlines
2016-01-12 23:03:31 +02:00
Ari Lemmetti
6a0b13b8b6
Remove unused functions
2016-01-12 22:55:37 +02:00
Ari Lemmetti
61155f0edd
Add 128-bit version of the functions as well
2016-01-12 22:52:00 +02:00
Ari Lemmetti
a6afb8a8f4
Small refactoring
2016-01-12 22:29:33 +02:00
Ari Lemmetti
a756f6133a
Manually unroll vertical Hadamard transform
2016-01-12 21:45:02 +02:00
Ari Lemmetti
66350aa20e
Experiment with alternative implementation of FWHT
2016-01-11 16:25:56 +02:00
Arttu Ylä-Outinen
e14858f41a
Fix build and tests.
...
- Remove non-existent file interface_main.c from library sources.
- Add file mv_cand_tests.c to test sources.
2015-12-21 16:03:55 +02:00
Arttu Ylä-Outinen
9abdee7cc3
Merge branch 'autotools'
2015-12-21 15:54:30 +02:00