Ari Lemmetti
33b6481660
Remove unused variables.
2015-08-11 15:53:40 +03:00
Ari Lemmetti
348d7780fc
Remove third shift and offset from 14-bit sampling functions (change missing from rebase)
2015-08-11 15:06:16 +03:00
Marko Viitanen
8409317bd9
Fixed rebasing errors for 10bit branch
2015-08-11 14:56:45 +03:00
Marko Viitanen
58f12bd530
Changed frame 8bit to 10bit conversion to be done without memory allocation
2015-08-11 08:18:14 +03:00
Marko Viitanen
6453a511d7
Scale SAD/SATD costs to match bit depth
...
Conflicts:
src/image.c
2015-08-11 08:18:14 +03:00
Marko Viitanen
0304b6c412
Fixed luma interpolation filter when 10bit coding and some other minor fixes
2015-08-11 08:17:48 +03:00
Marko Viitanen
450b5e64ca
Fixed overflow on generic ipol filters when 10bit encoding
...
Conflicts:
src/strategies/generic/ipol-generic.c
2015-08-11 08:17:48 +03:00
Marko Viitanen
191d3e4d87
Fixed RDOQ on 10bit encoding
2015-08-11 08:14:35 +03:00
Marko Viitanen
414ebe6101
Fixed checksum on bitdepth > 8 cases
...
Conflicts:
src/nal.c
src/nal.h
src/strategies/generic/nal-generic.c
src/strategies/strategies-nal.c
src/strategies/strategies-nal.h
2015-08-11 08:14:35 +03:00
Marko Viitanen
57ab46f110
Small fixes all around to enable 10bit encoding
...
Conflicts:
src/encmain.c
src/encoder.c
src/encoderstate.c
src/global.h
2015-08-11 07:59:20 +03:00
Ari Lemmetti
7cd4f7a5c9
Enable fractional motion vectors with bipred
2015-08-10 18:49:12 +03:00
Ari Lemmetti
5887c96991
Add and use 14bit reconstruction for fractional motion vectors with bipred
2015-08-10 18:45:29 +03:00
Ari Lemmetti
a87dafb27a
Add hi_prec_buf_t for higher precision intermediate values for interpolation filter.
2015-08-10 18:35:06 +03:00
Ari Lemmetti
3e31ff2476
Use the function to sample half-pixels.
2015-08-10 18:30:41 +03:00
Ari Lemmetti
0a096c7040
Move qpel and octpel reconstruction in separate functions.
2015-08-10 18:25:52 +03:00
Ari Lemmetti
fc00b4795c
Preparations for more accurate reconstruction with bipred
2015-08-10 18:08:13 +03:00
Ari Lemmetti
8b4a6c92da
Add 14bit precision sample functions.
2015-08-10 18:02:06 +03:00
Ari Lemmetti
b30f17d4b8
Add fractional pixel sampling for chroma
2015-08-10 17:55:37 +03:00
Ari Lemmetti
650dd7d840
Use pixels_blit to copy necessary pixels.
2015-08-10 17:52:00 +03:00
Ari Lemmetti
01f40ec104
Add fractional pixel sampling for luma
2015-08-10 17:51:48 +03:00
Ari Koivula
0c3c93d456
Optimize intra SAD intrinsics.
...
- Added 64x64 version for completeness.
- With the exception of 16x16, these were all slightly slower than the ASM
versions, as measured by "kvazaar_test -s speed -t intra_sad", but now they
are on par or slightly faster.
- None of these actually use any AVX2 intrinsics, and probably never will,
unless someone adds an interface for doing more than one block at a time,
in which case the non-destructive versions might come in handy.
2015-08-06 19:35:00 +03:00
Ari Lemmetti
20b833bc8e
Fix mingw errors
2015-07-31 18:44:36 +03:00
Ari Lemmetti
12c391eb08
Add auto-detection for input resolution.
...
Use --input-res=auto as default.
2015-07-31 17:35:16 +03:00
Ari Koivula
0740a73fbb
Clean up Makefile.
...
- Move stuff around.
- LDFLAGS -shared and -dynamiclib imply -fpic.
2015-07-31 15:57:05 +03:00
Ari Koivula
beec2705b1
Add cli, lib-shared and lib-static to Makefile.
2015-07-31 15:57:05 +03:00
Ari Koivula
24b3306325
Fix incorrect pattern rules in Makefile.
...
- Having more than one target in a pattern rule means that both of those files
are created at the same time with the rule. This only worked for debug,
because debug build was never done in the same invocation as release build.
2015-07-31 14:36:45 +03:00
Ari Koivula
1c27f67963
Remove -flto.
...
- Always use the compiler to invoke the linker. Clang will give additional
parameters to the linker when compiled with -flto.
- Giving a different optimization level to linker did not make any difference
in gcc-5.1.1.
2015-07-31 14:36:26 +03:00
Ari Koivula
54b1be341e
Don't compile executable with PIC.
...
- It's required for .so and .dylib, but not for .dll or the executable.
- It might be better to use libtool for this, but I'm not ready to go that
far yet.
2015-07-29 17:12:09 +03:00
Ari Koivula
f8154f8382
Merge branch 'make-dylib'
2015-07-29 11:28:43 +03:00
Ari Koivula
60437fd0c3
Add -lrt back to the exe link command.
2015-07-29 11:28:11 +03:00
Luca Barbato
5c7a808bbd
build: Generate a pkg-conf file
2015-07-29 02:27:12 +02:00
Ari Koivula
04e1a21ded
Merge branch 'make-dylib'
...
Closes #94.
2015-07-28 11:42:46 +03:00
Ari Koivula
2211b90a24
Move comments for defines to a different line.
...
- Having a comment as part of the define confuses doxygen. The comments get
added to every function that uses the macro.
2015-07-21 17:10:08 +03:00
Ari Koivula
022d28ab11
Fix small hexbs pattern.
...
- Who could mess this up? Oh.. right.
2015-07-21 16:12:44 +03:00
Ari Koivula
22e56f86c7
Move inter search patterns inside the search functions.
2015-07-21 16:06:31 +03:00
Ari Koivula
b73b275e08
Remove unused includes from search.
2015-07-21 15:06:06 +03:00
Ari Koivula
ae56118010
Move functions from search to search_intra.
2015-07-21 14:59:19 +03:00
Ari Koivula
bf7542c35d
Move functions from search to search_inter.
2015-07-21 12:16:05 +03:00
Ari Koivula
3c9b830d8f
Add modules search_intra and search_inter.
...
- For breaking up search module.
2015-07-21 12:04:16 +03:00
Arttu Ylä-Outinen
06ea593477
Change dylib file name to libkvazaar.X.dylib.
...
Changes the version number in the dylib filename from a three-digit
version (libkvazaar.X.Y.Z.dylib) to a single-digit one
(libkvazaar.X.dylib).
2015-07-20 15:09:46 +03:00
Arttu Ylä-Outinen
df749e032e
Add necessary linker options when building dylib.
...
Sets linker options -compatibility_version and -install_name when making
dylib.
2015-07-20 15:09:09 +03:00
Luca Barbato
9c414995c5
build: Add a MacOSX install target for the library
2015-07-17 19:44:20 +02:00
Arttu Ylä-Outinen
59f95b8e73
Add nasm support.
...
Makes it possible to build kvazaar using nasm instead of yasm.
- Adds trailing slashes to -I params in ASFLAGS.
- Disables CPU NOP directives when assembler is not yasm.
2015-07-17 13:59:25 +03:00
Arttu Ylä-Outinen
e307b7cec4
Check that input dimensions are multiples of two.
...
Fixes wrongly accepting non-multiple of two resolutions and a segfault
when one of the input dimensions is one.
2015-07-17 10:07:24 +03:00
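A minimal C sketch of the kind of check described in e307b7cec4. The function name `dimensions_are_valid` is hypothetical and not taken from the kvazaar source, and rejecting non-positive sizes is an added assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not the actual kvazaar function.
 * Accepts only positive, even dimensions. A width or height of 1 is odd
 * and is therefore rejected. */
static bool dimensions_are_valid(int32_t width, int32_t height)
{
  if (width <= 0 || height <= 0) {
    return false;  /* Assumption: non-positive sizes are also invalid. */
  }
  return (width % 2 == 0) && (height % 2 == 0);
}
```

Running a check like this before any frame buffers are allocated is one way to turn the reported segfault into an ordinary error.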
Arttu Ylä-Outinen
d2c42cb303
Fix making tests.
...
Commit 9cfbd55e removed "./" prefix of the TESTS variable in the
Makefile but the recipe of target tests was expecting it. Fixed by
prepending "./" to the tests recipe.
2015-07-17 10:07:24 +03:00
Luca Barbato
56ff1c7805
build: Drop the non-standard -t
...
Should unbreak freebsd.
2015-07-16 16:50:09 +02:00
Arttu Ylä-Outinen
94e8fc1536
Build dylib on Darwin.
...
Adds target libkvazaar.dylib to Makefile. On Darwin, libkvazaar.dylib is
set as a prerequisite of the all target.
2015-07-16 14:15:09 +03:00
Arttu Ylä-Outinen
a4ec92081a
Make symbols hidden by default.
...
Adds "-fvisibility=hidden" to CFLAGS and LDFLAGS. Defines macro
KVZ_PUBLIC for marking symbols that should be visible.
2015-07-13 14:20:21 +03:00
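A sketch of how a visibility macro such as the one in a4ec92081a is commonly defined when building with -fvisibility=hidden. The macro body and both function names are assumptions for illustration; the real KVZ_PUBLIC definition in kvazaar.h may differ.

```c
/* Assumed definition for illustration only. */
#if defined(__GNUC__)
  #define KVZ_PUBLIC __attribute__((visibility("default")))
#else
  #define KVZ_PUBLIC
#endif

/* Hypothetical internal helper: with -fvisibility=hidden it is not
 * exported from the shared library. */
int internal_double(int x)
{
  return 2 * x;
}

/* Hypothetical API function: marked visible so that users of the
 * library can link against it. */
KVZ_PUBLIC int example_api_double(int x)
{
  return internal_double(x);
}
```

Only the symbols marked with the macro end up in the dynamic symbol table, which keeps internal names from clashing with those of the host application.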
Arttu Ylä-Outinen
9cfbd55ea8
Add making symlinks to make install.
...
Running "make install" now creates symlinks libkvazaar.so and
libkvazaar.so.X pointing to libkvazaar.so.X.Y.Z.
2015-07-13 11:45:42 +03:00
Ari Koivula
c94d91061c
Merge branch 'cpuid-fix'
2015-07-09 11:40:46 +03:00
Arttu Ylä-Outinen
8550c6ccd8
Fix AVX2 detection.
...
Replaces calls to __get_cpuid by __cpuid_count on gcc and clang and
calls to __cpuid by __cpuidex on MSVC. Unlike __get_cpuid and __cpuid,
__cpuid_count and __cpuidex set the ecx register which is required for
AVX2 detection.
2015-07-09 11:20:37 +03:00
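A minimal sketch of the approach described in 8550c6ccd8 for gcc and clang, assuming an x86 target. The helper name `cpu_reports_avx2` is hypothetical. The sketch reads only the CPUID feature bit and does not check whether the operating system saves the YMM registers.

```c
#include <stdbool.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>

/* Returns true if CPUID reports the AVX2 feature bit.
 * Hypothetical helper; OS support for YMM state is not checked here. */
static bool cpu_reports_avx2(void)
{
  unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;

  /* Leaf 7 must exist before it can be queried. */
  if (__get_cpuid_max(0, 0) < 7) {
    return false;
  }

  /* Leaf 7 takes a sub-leaf index in ECX; __cpuid_count sets it (0 here). */
  __cpuid_count(7, 0, eax, ebx, ecx, edx);

  /* AVX2 is bit 5 of EBX for leaf 7, sub-leaf 0. */
  return (ebx & (1u << 5)) != 0;
}
#else
/* Fallback for other targets: report no AVX2. */
static bool cpu_reports_avx2(void) { return false; }
#endif
```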
Ari Koivula
9acf7795a2
Refactor cpuid capability detection.
...
- Moved cpuid data to a struct to make it easier to group data from one
cpuid call together.
- Renamed the bit masks to make it harder to mask the wrong register or
cpuid.
- Remove the .byte trick. We don't really need to support such ancient
compilers?
2015-07-09 11:20:37 +03:00
Arttu Ylä-Outinen
e69088026e
Write slice header before joining child streams.
...
The lengths of the leaf streams must be available when the slice header
is written. Writing the header before joining child streams removes the
need to copy leaf bitstreams instead of moving them.
2015-07-08 13:14:17 +03:00
Arttu Ylä-Outinen
907451590e
Fix encoding when both GOP and OWF are enabled.
...
Changes kvazaar_encode to not increase cur_state_num unless a frame is
started.
2015-07-07 10:05:42 +03:00
Arttu Ylä-Outinen
3efdee2c13
Fix compilation warnings when using clang.
...
Removes typedef redefinitions in kvazaar_internal.h.
2015-07-06 13:46:56 +03:00
Arttu Ylä-Outinen
cc580ac861
Only print PSNR if some frames were encoded.
2015-07-06 13:39:47 +03:00
Arttu Ylä-Outinen
089ff895ad
Fix seeking when input stream is not seekable.
2015-07-06 12:07:05 +03:00
Arttu Ylä-Outinen
aca5d7514f
Fix pocs reallocation in imagelist.
...
Replaced sizeof(int32_t*) by sizeof(int32_t).
2015-07-06 11:58:05 +03:00
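A sketch of the sizing pattern behind aca5d7514f: the allocation size is derived from the element type rather than from a pointer type. The helper `grow_pocs` is hypothetical and is not the imagelist code itself.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: resize an array of POC values to new_count
 * elements (new_count must be nonzero). Returns 0 on success and -1 on
 * failure; on failure *pocs is left untouched and still valid. */
static int grow_pocs(int32_t **pocs, size_t new_count)
{
  /* sizeof **pocs is sizeof(int32_t), the element size. */
  int32_t *resized = realloc(*pocs, new_count * sizeof **pocs);
  if (resized == NULL) {
    return -1;
  }
  *pocs = resized;
  return 0;
}
```

Writing `sizeof **pocs` instead of naming the type keeps the size correct even if the element type changes later. Assigning the result of realloc to a temporary first avoids losing the original pointer when the call fails.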
Arttu Ylä-Outinen
ca8435f581
Remove setting CC in Makefile.
2015-07-06 11:27:28 +03:00
Arttu Ylä-Outinen
3a47aab696
Fix allocating tile boundary arrays.
...
Column and row numbers had been mixed up.
2015-07-06 10:48:19 +03:00
Arttu Ylä-Outinen
a0865ff351
Change ime_algorithm in kvz_config to an enum.
...
Adds enum kvz_ime_algorithm to kvazaar.h.
2015-07-06 09:47:18 +03:00
Arttu Ylä-Outinen
66656fdebc
Move handling of command line args to cli module.
...
- Adds struct cmdline_opts_t.
- Adds functions cmdline_opts_parse and cmdline_opts_free to cli module.
- Removes fields input, output, debug, frames and seek from struct
kvz_config.
- Removes function config_read from config module.
2015-07-06 08:25:54 +03:00
Arttu Ylä-Outinen
581f740c59
Fix compilation when checkpoints are enabled.
...
- Include string.h in checkpoint.h
- Check return values of fgets calls in checkpoint.h.
- Replace variable length array in image.c by a dynamically allocated
array.
- Add -DCHECKPOINTS to CFLAGS in Makefile when CHECKPOINTS is defined.
2015-07-03 13:54:44 +03:00
Arttu Ylä-Outinen
6eb89a2813
Adjust Makefile for building kvazaar.dll.
...
Adds targets "kvazaar.dll" and "install-dll" to the Makefile.
2015-07-02 16:58:30 +03:00
Arttu Ylä-Outinen
af2b417809
Set up Makefile for building libkvazaar.so.
...
Adds targets "libkvazaar.so.0.0.0", "install", "install-prog" and
"install-lib" to the Makefile.
2015-07-02 16:58:30 +03:00
Arttu Ylä-Outinen
4ab9aa3e2f
Move kvz_encoder definition to kvazaar_internal.h.
2015-07-02 16:58:30 +03:00
Arttu Ylä-Outinen
b715ae9767
Return length of the data from encoder_encode.
...
Adds parameter len_out returning the length of the encoded data in bytes
to function encoder_encode.
2015-07-02 16:58:29 +03:00
Arttu Ylä-Outinen
538deaa9d6
Add functions picture_{alloc,free} to kvazaar API.
2015-07-02 16:58:29 +03:00
Arttu Ylä-Outinen
6451df9a4f
Move bitstream chunk definition to kvazaar.h.
...
- Renames struct bitstream_chunk_t to kvz_data_chunk.
- Renames macro BITSTREAM_MEMORY_CHUNK_SIZE to KVZ_DATA_CHUNK_SIZE.
- Removes kvz_payload typedef.
- Adds function chunk_free(kvz_data_chunk *chunk) to kvazaar API.
2015-07-02 16:58:28 +03:00
Arttu Ylä-Outinen
f7f17a060c
Rename pixel_t to kvz_pixel.
2015-07-02 16:58:28 +03:00
Arttu Ylä-Outinen
cecea44d37
Rename config_t to kvz_config.
2015-07-02 16:58:28 +03:00
Arttu Ylä-Outinen
17d720363a
Rename struct image_t to kvz_picture.
2015-07-02 16:55:48 +03:00
Arttu Ylä-Outinen
fab07d80da
Rename macro BIT_DEPTH to KVZ_BIT_DEPTH.
2015-07-02 16:55:47 +03:00
Arttu Ylä-Outinen
7b6178f6e0
Rename macro MAX_GOP to KVZ_MAX_GOP_LENGTH.
2015-07-02 16:55:47 +03:00
Arttu Ylä-Outinen
3f32d500e2
Move config_t structure to kvazaar.h.
2015-07-02 16:55:46 +03:00
Arttu Ylä-Outinen
cecdf4f34e
Move config validation to encoder_control_init.
...
Ensures that config is valid even when not initialized by config_read.
2015-07-02 16:47:28 +03:00
Arttu Ylä-Outinen
04a1fc07cf
Move all config validation to config_validate.
2015-07-02 16:43:19 +03:00
Arttu Ylä-Outinen
25706af770
Add a function for moving bitstream data.
...
Replaces calls to bitstream_append with bitstream_move where possible.
2015-07-02 16:35:47 +03:00
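A sketch of why moving can be cheaper than appending once the bitstream is a linked list of chunks, as in 25706af770. The types and the function `chunk_list_move` are simplified, hypothetical stand-ins and not the kvazaar definitions.

```c
#include <stddef.h>

/* Hypothetical, simplified types; the real kvazaar structures differ. */
typedef struct chunk {
  struct chunk *next;
  size_t len;
} chunk;

typedef struct {
  chunk *first;
  chunk *last;
} chunk_list;

/* Move all chunks from src to the end of dst without copying any data.
 * Afterwards src is empty. */
static void chunk_list_move(chunk_list *dst, chunk_list *src)
{
  if (src->first == NULL) {
    return;  /* Nothing to move. */
  }
  if (dst->first == NULL) {
    dst->first = src->first;
  } else {
    dst->last->next = src->first;
  }
  dst->last = src->last;
  src->first = NULL;
  src->last = NULL;
}
```

The move relinks a few pointers regardless of how much data the source holds, whereas an append that copies has to touch every byte.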
Arttu Ylä-Outinen
398f0c823b
Replace memory bitstreams with linked lists.
...
- Removes all bitstream types.
- Changes encoder_encode to return the encoded data as list of chunks.
- Moves writing of the encoded data to the main function.
2015-07-02 16:35:46 +03:00
Arttu Ylä-Outinen
7e20e62cc7
Make kvazaar_encode consume one frame on each call.
...
- Replaces read_one_frame by encoder_feed_frame.
- Adds field "prepared" to encoderstate_t to indicate that
encoder_next_frame has been called.
- Input frames are read in the main function and passed to
encoder_encode.
2015-07-02 16:28:40 +03:00
Arttu Ylä-Outinen
012c0580df
Move writing reconstructed image to yuv_io module.
...
Adds function yuv_io_write.
2015-07-02 16:28:39 +03:00
Arttu Ylä-Outinen
7bd23f5dbb
Rename yuv_input module to yuv_io.
2015-07-02 16:28:39 +03:00
Arttu Ylä-Outinen
1f41717351
Rename stats_done to frame_done in encoderstate.
...
The new field frame_done is set to zero when starting to encode a new
frame and reset to one when the encoded data has been written.
2015-07-02 16:24:26 +03:00
Arttu Ylä-Outinen
50a5d5faa5
Let subimages have multiple references.
...
Adds function image_copy_ref to image module for getting a new reference
to an image. It can be used instead of image_make_subimage when the
sizes of the original and the subimage are same.
2015-07-02 16:24:26 +03:00
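A sketch of the reference-counting idea behind 50a5d5faa5: taking another reference to an image instead of creating a full-size subimage. All names here are hypothetical, and the counter is a plain int; a multi-threaded encoder would presumably need an atomic counter.

```c
#include <stdlib.h>

/* Hypothetical, simplified image type; not the kvazaar definition. */
typedef struct image {
  int refcount;  /* Number of live references to this image. */
  int width;
  int height;
} image;

/* Allocate an image header with a single reference. */
static image *image_new(int width, int height)
{
  image *im = malloc(sizeof *im);
  if (im == NULL) {
    return NULL;
  }
  im->refcount = 1;
  im->width = width;
  im->height = height;
  return im;
}

/* Take another reference to the same image; nothing is copied. */
static image *image_ref(image *im)
{
  im->refcount += 1;
  return im;
}

/* Drop one reference. Returns 1 if the image was freed, 0 otherwise. */
static int image_unref(image *im)
{
  im->refcount -= 1;
  if (im->refcount > 0) {
    return 0;
  }
  free(im);
  return 1;
}
```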
Arttu Ylä-Outinen
cec9b937dc
Make image list resize use realloc.
...
Much simpler than allocating, copying and freeing the arrays manually.
2015-07-02 16:24:25 +03:00
Arttu Ylä-Outinen
fe3b629905
Move poc from image_t to image_list_t.
2015-07-02 16:24:25 +03:00
Arttu Ylä-Outinen
5d524c0290
Move seeking to yuv_input module.
2015-07-02 16:24:24 +03:00
Arttu Ylä-Outinen
f41ce04488
Refactor main function.
...
- Make sure that everything which is allocated gets deallocated.
- Move finalization of encoder states to kvazaar.c.
- Remove empty strategyselector_free function.
- Remove unused variable curpos.
- Fix includes.
2015-07-02 16:24:24 +03:00
Arttu Ylä-Outinen
9c20f96397
Move opening files in main to separate functions.
2015-07-02 16:24:24 +03:00
Arttu Ylä-Outinen
40b136cf48
Fix seeking when reading from stdin.
...
Seeking used read_one_frame to skip frames. Changed to simply use fread
instead.
2015-07-02 16:24:24 +03:00
Arttu Ylä-Outinen
970d0ec182
Move input reading functions to yuv_input module.
...
Adds function read_yuv_frame and moves functions fill_after_frame and
read_and_fill_frame_data from encoderstate to yuv_input.
2015-07-02 16:24:23 +03:00
Arttu Ylä-Outinen
4a7b86a43b
Make g_exp_table statically allocated.
...
Removes the need to free the table.
2015-07-02 16:14:52 +03:00
Arttu Ylä-Outinen
b130ecc9bb
Fix "reference not found" when GOP is enabled.
...
The encoder state must be cleared by calling encoder_next_frame before
calling read_one_frame.
2015-07-02 16:14:51 +03:00
Ari Koivula
7e98a483d7
Use the API for checking whether the encoding is finished.
2015-07-02 16:14:51 +03:00
Ari Koivula
fc58748ae8
Output bitstream through API.
...
- Use the existing bitstream_t type to give access to the bitstream.
We can extend it later to make it a linked list like I was planning
to do with the payload type.
- The main encoder now also stores the bitstream in memory.
2015-07-02 16:10:51 +03:00
Ari Koivula
df50a0dae6
Move config_parse into api.
2015-07-02 15:52:24 +03:00
Ari Koivula
4e5326d3d5
Move encoding to API.
...
- Api->encoder_encode can now be called repeatedly to start encoder
jobs and to retrieve the results.
Conflicts:
src/encmain.c
2015-07-02 15:52:23 +03:00
Ari Koivula
9a3edce3fc
Separate input and output from encoding.
...
- Move image_t and pixel_t to the kvazaar.h API.
- Try and arrange things such that image_t can be used as input and
output for encoding.
Conflicts:
src/encmain.c
2015-07-02 15:52:23 +03:00
Ari Koivula
f87cea78da
Wait for bitstream immediately after encoding the frame.
...
- This should reduce the encoding delay by one frame when encoding in
real time.
2015-07-02 15:52:23 +03:00
Ari Koivula
ad11d1bca5
Add kvazaar.h to hold high-level encoder API.
...
- Move encoder initialization from main to kvazaar.c.
- Have main use the API for initialization.
Conflicts:
src/encmain.c
2015-07-02 15:52:23 +03:00
Ari Koivula
0170e9280f
Move some initialization to encoder_control_init.
...
- Removed some members from encoder_control_t that weren't really used
very much anymore.
2015-07-02 15:45:35 +03:00
Ari Koivula
504f3d9c9b
Move some config initialization to config_read.
2015-07-02 15:45:34 +03:00
Ari Koivula
c99fe63860
Move seek functionality outside the main input loop.
2015-07-02 15:45:34 +03:00
Ari Koivula
4f4b62b13c
Fix owf.
2015-07-02 15:45:34 +03:00
Ari Koivula
5c28745457
Move OWF logic and CLI stuff out of encoder_compute_stats.
...
- CLI stuff is moved to either cli-module or to main function.
- OWF stuff is made more explicit by counting the frames instead of
communicating through encoder_state_t.stats_done.
2015-07-02 15:45:33 +03:00
Ari Koivula
ea50d03e52
Add cli module and move interface stuff to there.
2015-07-02 15:45:33 +03:00
Marko Viitanen
ff4fb64169
Fixed a precedence bug in bipred search
2015-06-12 09:49:56 +03:00
Marko Viitanen
44ba9d9f7c
Bump version number to 0.5.0
2015-06-11 10:33:27 +03:00
Ari Koivula
30f8640380
Disable trskip SAD calculation when trskip is not enabled.
2015-06-08 13:16:12 +03:00
Marko Viitanen
3253ba4812
Fixed non-deterministic behavior when using bipred and owf
2015-06-08 13:14:53 +03:00
Ari Koivula
cdb66baf16
Fix mutex being unlocked twice.
2015-06-01 16:28:50 +03:00
Ari Koivula
80ec1fda3a
Remove unnecessary dependency between I-frames.
...
- Inter OWF dependency was being added to non-IDR I-frames.
2015-06-01 16:26:30 +03:00
Arttu Ylä-Outinen
984e7cb4e0
Fix setting QP when rate control is disabled.
...
When rate control is disabled, QP and lambda are now selected like they
were before rate control was implemented.
2015-06-01 13:57:11 +03:00
Arttu Ylä-Outinen
b0435d37a9
Update rate control parameters.
...
On each frame, adjust the parameters alpha and beta in the equation
lambda = alpha * pow(R, beta)
2015-05-29 11:50:08 +03:00
Arttu Ylä-Outinen
b24d92bd6e
Move initialization of constants to encoder.c.
...
Some constants used in rate control are now initialized only once instead
of being computed on every frame. Adds pixels_per_pic, target_avg_bppic,
target_avg_bpp and gop_layer_weights to encoder_control_t.
2015-05-29 11:45:36 +03:00
Arttu Ylä-Outinen
b54d5aa91f
Select GOP picture weights according to bitrate.
...
Pictures in same layer have equal weights. At low bitrates, the difference
between low and high layers is greater than at high bitrates.
2015-05-29 11:43:42 +03:00
Arttu Ylä-Outinen
93d2a95ddc
Implement rate control in lambda domain.
...
- Rate control adjusts the lambda value.
- QP is selected according to lambda.
- Bits are allocated for GOPs and individual pictures.
2015-05-19 11:40:51 +03:00
Arttu Ylä-Outinen
664de9ade0
Keep track of bits written in current gop.
...
Adds cur_gop_bits_coded into encoder_state_config_global_t. The count is
updated whenever a frame is written.
2015-05-19 10:42:23 +03:00
Arttu Ylä-Outinen
4a5698a6ba
Implement basic rate control.
2015-05-19 10:42:17 +03:00
Arttu Ylä-Outinen
d27cde55a4
Add --input-fps and --bitrate parameters.
2015-05-15 13:57:51 +03:00
Arttu Ylä-Outinen
5b8cd76f01
Keep track of total number of bits coded.
...
Adds total_bits_coded into encoder_state_config_global_t. The count is
updated whenever a frame is written.
2015-05-15 13:57:50 +03:00
Arttu Ylä-Outinen
815a2bea55
Use bitstream_tell to get stream position.
2015-05-15 13:57:50 +03:00
Ari Koivula
56bb8e75ba
Fix non-deterministic behavior with tiles.
...
- Depend on the whole previous frame.
- We should really go through all these FIXME's sometime.
2015-05-12 12:00:32 +03:00
Ari Koivula
a48d91dacd
Fix WPP not working when SAO is off and OWF is on.
...
- Every wavefront row was being set to done when the first wavefront
row got done.
- Looks like I didn't understand how the data structure worked when I
"cleaned this up", and it didn't get caught in tests because it
needs OWF to be on to affect anything.
2015-05-11 12:01:17 +03:00
Ari Koivula
87936eb99f
Revert "Fix keeping of reference frames over IDR boundary."
...
This reverts commit b43f1cb9eb.
- This change resulted in use of uninitialized memory with owf != 0.
Conflicts:
src/encoderstate.c
2015-05-05 17:07:49 +03:00
Ari Koivula
c3b42291e1
Merge branches 'coverity-fix-5', 'coverity-fix-6', 'coverity-fix-7', 'coverity-fix-8', 'coverity-fix-9', 'coverity-fix-10', 'coverity-fix-11', 'coverity-fix-12', 'coverity-fix-13' and 'coverity-fix-14' into coverity-fixes2
2015-05-05 12:27:02 +03:00
Ari Koivula
62285c405c
Fix coverity warning.
...
- False positive about a shift with -1 when code_num overflows.
2015-05-05 12:20:07 +03:00
Ari Koivula
ed670f4185
Fix coverity warning.
...
- False positive about buffer overrun due to thinking work_tree_copy_up
could be called with depth == 4.
2015-05-05 11:54:09 +03:00
Ari Koivula
2276e0028f
Fix coverity warning.
...
- False positive about use of an uninitialized value. Actually just
copying uninitialized data from one struct to another.
2015-05-05 10:39:29 +03:00
Ari Koivula
41d9889e28
Fix coverity warning.
...
- False positive about coeff_y being uninitialized when width == 0.
2015-05-05 10:23:52 +03:00
Ari Koivula
80cbda364b
Fix coverity warning.
...
- Variable guards dead code. Although, maybe it will complain about the
dead code now instead.
2015-05-05 10:17:06 +03:00
Ari Koivula
08d079773f
Fix coverity warning.
...
- Dead code.
2015-05-05 10:12:01 +03:00
Ari Koivula
e225c5b302
Fix coverity warning.
...
- Dead code due to current value of MRG_MAX_NUM_CANDS. Not sure if this
fix will work but I think it looks better.
2015-05-05 10:06:18 +03:00
Ari Koivula
17bdc82b5e
Fix coverity warning.
...
- Dereferencing a pointer from realloc before checking if it's null.
2015-05-05 09:40:24 +03:00
Ari Koivula
cc980fb815
Fix coverity warning.
...
- False positive for use of uninitialized variable.
2015-05-05 09:37:34 +03:00
Ari Koivula
7a551bece5
Fix coverity warning.
...
- Remove dead code.
2015-05-05 09:29:40 +03:00
Ari Koivula
cf2a406aba
Fix coverity warning.
...
- False positive for overflow. Fixed the parameter declaration.
2015-05-04 17:38:16 +03:00
Ari Koivula
1c6c4963e7
Fix coverity warning.
...
- Mutex was left locked when malloc failed. Fixed.
2015-05-04 17:38:16 +03:00
Ari Koivula
1c3873f5b2
Fix coverity warning.
...
- Overflow from buggy implementation of modulo behavior for
pattern_type. As there is no need for such behavior I removed it.
2015-05-04 17:38:16 +03:00
Ari Koivula
6234b09461
Fix coverity warning.
...
- A false alarm about buffer overflow. No new modes are added if all modes
are already in the list.
- Skip checking predicted modes if all modes are in the list.
2015-05-04 17:38:16 +03:00
Ari Koivula
9015aab996
Clean up IDR handling code.
...
- IDR was called RADL, probably because the NAL type is IDR_W_RADL.
- Move things around to make it clearer what is happening.
2015-04-30 20:46:07 +03:00
Ari Koivula
b43f1cb9eb
Fix keeping of reference frames over IDR boundary.
...
-
2015-04-30 15:42:16 +03:00
Ari Koivula
c0c9bc619a
Fix valgrind warning.
...
- Attribute state->global->slicetype was used before being initialized.
- The reference frame lists should be updated based on current frame,
not on previous frame (or uninitialized data).
2015-04-30 13:18:28 +03:00
Ari Lemmetti
afcccb5c81
Merge branch 'memory_leak_test'
2015-04-29 16:26:59 +03:00
Ari Lemmetti
0081384727
Clean Makefile a bit. Add debug build option.
2015-04-24 20:45:19 +03:00
Marko Viitanen
8ed5d06ebe
Fixed compiler warnings caused by the bipred branch merge
2015-04-23 15:12:48 +03:00
Marko Viitanen
fd060cf2c6
Merge branch 'bipred'
...
Conflicts:
README.md
src/config.c
src/config.h
src/encmain.c
2015-04-23 14:45:44 +03:00
Marko Viitanen
79dc7e7270
Bi-pred search cleanup
2015-04-23 14:39:41 +03:00
Marko Viitanen
0e958ebe84
Fixed merge candidate selection
2015-04-23 12:18:33 +03:00
Marko Viitanen
3c694a8f6e
Fixed bipred mv candidate selection
2015-04-23 12:18:05 +03:00
Marko Viitanen
9951810910
Fixed deblocking with bi-dir blocks
2015-04-23 09:43:39 +03:00
Marko Viitanen
7f504b7808
Added a commandline parameter --bipred to enable bi-pred search
2015-04-21 14:35:16 +03:00
Marko Viitanen
fb74f86a5b
Bi-pred search now actually does cost calculations
2015-04-21 14:16:06 +03:00
Marko Viitanen
e12ba7c80f
Created function inter_recon_lcu_bipred() and moved bipred recon there
2015-04-21 12:05:21 +03:00
Marko Viitanen
50fce975d9
Clamp bi-pred motion vectors because ipol filtering requires modifications
2015-04-21 11:24:07 +03:00
Ari Koivula
13924a2057
Add --no-info parameter.
...
- Stops encoder information from being added to bitstream.
- The version information overhead is too big when doing comparisons with
very short sequences.
2015-04-16 17:30:36 +03:00
Ari Koivula
7028846423
Fix bug in intra mode search.
...
- The cost of the first mode in the mode list was returned instead of cost of
the selected mode, as this used to be the best mode when the list was
sorted. Should only matter when doing inter coding.
- This pretty much affects only --rd=1 in inter frames.
2015-04-09 16:05:53 +03:00
Marko Viitanen
da3fe9f199
Fixed rounding in bi-pred reconstruction
2015-04-02 15:55:13 +03:00
Marko Viitanen
c7a17cf1c4
Changed motion vector candidate derivation to work with bi-pred case
2015-04-02 14:05:24 +03:00
Marko Viitanen
73db9fec83
Fixed asserts for intra PU-depth configurations
2015-04-02 10:31:56 +03:00
Marko Viitanen
5d71fb3136
Fixed leaf aligning
2015-04-01 08:49:22 +03:00
Marko Viitanen
d26b89174b
Fixed intra chroma search ref_v pointer
2015-03-31 15:43:22 +03:00
Marko Viitanen
4b7db2e014
Added a dummy bi-pred search, always selects bi-pred block when possible
2015-03-31 15:02:43 +03:00
Marko Viitanen
2c676927f0
Fixed a bug in bipred reconstruction causing an overflow
2015-03-31 15:02:10 +03:00
Marko Viitanen
06bc4f3d5e
Fixed duplicate checking for merge cand and some cleanup
2015-03-31 12:23:46 +03:00
Marko Viitanen
004e8082ab
Fixed deblocking after L0/L1 mv changes
2015-03-31 12:22:48 +03:00
Marko Viitanen
c02e3b8e26
Fixed inter_get_mv_cand() reference picture checking
2015-03-30 15:22:56 +03:00
Marko Viitanen
f881d6bf8a
Modified structures and mv handling to use L0/L1 vectors
2015-03-30 14:40:29 +03:00
Marko Viitanen
d6f68d0950
Force clearing of references when GOP not used and I-slice
2015-03-30 10:21:41 +03:00
Marko Viitanen
f28ebbcd41
Moved GOP defining to config.c and added parameter --gop
...
* Checking that intra period and gop_len match
2015-03-30 10:09:54 +03:00
Marko Viitanen
c82915761f
Enabled insertion of I-slices when GOP is used
2015-03-30 10:09:49 +03:00
Marko Viitanen
815e0b8897
Moved reference list printing to encoder_compute_stats()
2015-03-30 10:09:32 +03:00
Marko Viitanen
2243d139bf
Fixed GOP reference usage when using owf
2015-03-26 14:11:13 +02:00
Marko Viitanen
1dc53be8fc
Fixed leaf aligning
2015-03-26 13:54:17 +02:00
Marko Viitanen
bbeb85f9ee
Fixed case when cfg->frames is zero
2015-03-26 11:24:41 +02:00
Marko Viitanen
5c04603421
Remove unused ref frames on GOP case even when number of ref frames is within limits
2015-03-26 11:14:13 +02:00
Marko Viitanen
5071b5c990
Moved reference list sorting and parsing to encoder_state_new_frame()
...
* fixed a bug in reference verification and added an error state
2015-03-26 10:58:56 +02:00
Marko Viitanen
c40ca49b6c
When GOP is used, verify the references are available
2015-03-26 10:38:21 +02:00
Marko Viitanen
fe581b881e
Changed GOP structure to enable coding sequences not divisible by gop_len
2015-03-25 16:00:20 +02:00
Marko Viitanen
42e02dbfd9
Fixed tr-skip cost calculation
2015-03-24 13:35:28 +02:00
Marko Viitanen
a7328ab008
Fixed tr-skip cost calculation
2015-03-24 12:40:01 +02:00
Marko Viitanen
c649c90f3a
Changes to enable adaptation to any GOP len
2015-03-24 12:01:57 +02:00
Marko Viitanen
9a828ae5da
Fixed merge candidate scaling in hexbs and excluded weighted pred candidates in cost calc
2015-03-24 09:38:24 +02:00
Marko Viitanen
2d8552d0d6
Fixed merge candidate usage by skipping weighted prediction candidate
2015-03-23 15:17:41 +02:00
Marko Viitanen
7952f892fc
Fixed GOP reference usage
2015-03-23 14:17:44 +02:00
Marko Viitanen
34e8f70c8c
Fixed temporal MV predictor offset
2015-03-23 09:22:47 +02:00
Marko Viitanen
eccf1c1a16
Fixed temporal MV predictor offset
2015-03-23 09:21:52 +02:00
Marko Viitanen
164b7a7743
Merge remote-tracking branch 'remotes/origin/master' into GOP
2015-03-20 11:40:15 +02:00
Marko Viitanen
5ae9a70e38
Disable usage of P-slices when GOP
2015-03-20 10:43:59 +02:00
Marko Viitanen
26082d5328
Zero merge candidate fix for B-frames
2015-03-20 10:33:05 +02:00
Marko Viitanen
0c1aa6f73c
Better reference picture removal function encoder_state_remove_refs()
2015-03-20 10:28:17 +02:00
Marko Viitanen
7dab3ea0f6
Replaced temporary reference lists with the ones in gop configurations
2015-03-20 10:25:40 +02:00
Marko Viitanen
f166d25dd0
Added positive and negative reference frames to the gop config
2015-03-20 10:22:53 +02:00
Arttu Ylä-Outinen
3f31e7bf47
Merge branch 'tz-search'
2015-03-19 19:04:44 +02:00
Arttu Ylä-Outinen
176dbb6a5b
Add --me parameter.
...
Selects the integer motion estimation algorithm (hexbs or tz).
2015-03-19 18:48:10 +02:00
Marko Viitanen
d72c560880
Generate sorted reference list for L0 and L1
2015-03-19 12:26:59 +02:00
Marko Viitanen
c761a6beb3
Added encoder state as an input parameter to inter_get_merge_cand()
2015-03-18 12:35:47 +02:00
Marko Viitanen
c56b4d5747
Added combined merge candidates on B-slices and struct inter_merge_cand_t
2015-03-18 10:03:06 +02:00
Ari Koivula
6ec177f75c
Improve handling of input vector to inter search.
2015-03-17 17:16:15 +02:00
Ari Koivula
55ae02f367
Copy cu_info from tiles to main state.
...
- The main state's cu_array can be accessed through state->global->ref, which
allows the use of cu_info data from reference frames.
- This was already used by giving the previous frame's motion vector to the next
frame as a starting point candidate, but that functionality was broken at
some point because the data wasn't being moved from the child tiles' cu_array
to the main cu_array.
- An alternative would be to access the child tiles' arrays directly, but
currently there isn't a mechanism to preserve those arrays for reference
frames.
2015-03-17 13:24:20 +02:00
Ari Koivula
4bec6cec93
Simplify wavefront handling.
...
- Move the reconstruction status assignment out of the main for job loop.
2015-03-17 13:23:27 +02:00
Ari Koivula
4a27f79f20
Update comments.
2015-03-17 13:23:16 +02:00
Marko Viitanen
1da1dc9578
Clean up reference index and mvd writing
2015-03-16 09:41:02 +02:00
Ari Koivula
ca09e8bfe3
Fix WPP not working with threads=0.
...
- Apparently threadqueue_submit runs the job if there are no threads.
2015-03-13 17:15:05 +02:00
SanteriS
913ade461b
tz_search step 1, first if: && -> ||
2015-03-12 17:57:17 +02:00
SanteriS
949ec57849
Merge branch 'master' of https://github.com/ultravideo/kvazaar
2015-03-12 17:55:03 +02:00
SanteriS
bdb0639ac9
fixed function interfaces for tz_search and its subfunctions.
2015-03-12 17:54:21 +02:00
Ari Koivula
d2bb71739f
Clean up and comment WPP threading code.
...
- Remove WPP row reconstruction dependency on the row above the current one in
the previous frame. It's obviously unnecessary.
- Remove WPP row reconstruction dependency on the current row in the
previous frame, unless the current row is the last row.
2015-03-11 18:30:37 +02:00
Ari Lemmetti
b9ec4b0a54
AVX2 acceleration for new luma filtering.
2015-03-11 15:33:38 +02:00
Marko Viitanen
bc8ea9547e
Use P-frames when last GOP picture
2015-03-11 15:23:16 +02:00
Marko Viitanen
a4b5f46b46
Fixed reference list delta and num_ref_idx_lX_active values
2015-03-11 15:19:32 +02:00
Marko Viitanen
ac4973c544
Fixed deblocking strength in this configuration when B-slice
2015-03-10 15:20:02 +02:00
Marko Viitanen
1527822753
Fixed GOP POC order when not using threads
2015-03-10 14:12:51 +02:00
Marko Viitanen
866c3bfdf1
Setting gop_len to 0 now works
2015-03-10 12:16:57 +02:00
Marko Viitanen
1c38fbbd3b
Fixed GOP when no threads are used
2015-03-10 10:45:05 +02:00
Marko Viitanen
66660516b7
Merge remote-tracking branch 'remotes/github/master' into GOP
...
Conflicts:
src/cabac.h
src/config.h
src/cu.h
src/encoder_state-bitstream.c
src/encoderstate.c
2015-03-10 10:32:00 +02:00
Marko Viitanen
ff41ef557d
Fixed reference usage of top GOP layer pictures
2015-03-10 09:18:19 +02:00
Marko Viitanen
eba298e635
Added cu->inter.mv_ref_coded variable
2015-03-10 09:17:25 +02:00
Marko Viitanen
ec02642cc8
Added more bits to POC counter and fixed num_reorder_pic and max_dec_pic_buffering values
2015-03-10 09:06:32 +02:00
SanteriS
9e9f5e3150
Merge branch 'master' of https://github.com/ultravideo/kvazaar
2015-03-08 19:20:08 +02:00
SanteriS
e2f9fe130a
changed step 1 for tz_search
2015-03-08 19:19:23 +02:00
Ari Lemmetti
39eceec38d
Rewrite of luma fractional pixel filtering. Utilizes intermediate values instead of calculating everything again.
2015-03-06 17:58:22 +02:00
Marko Viitanen
42d3f2a8b0
Added B-frame encoding and reference list exceptions for top-layer GOP pictures
2015-03-06 16:32:50 +02:00
Marko Viitanen
1afba671e2
Added missing cabac bits to mv coding
2015-03-06 16:31:27 +02:00
Marko Viitanen
6095503918
Modified search to use correct reference id and mv directions
2015-03-06 16:29:24 +02:00
Marko Viitanen
13c925b701
Testset of data for reference picture lists
2015-03-06 16:28:23 +02:00
Marko Viitanen
43b086caed
Added missing slice header flag "mvd_l1_zero_flag"
2015-03-06 16:27:42 +02:00
Marko Viitanen
18d9789fab
Cabac context array for inter direction
2015-03-06 16:26:16 +02:00
Ari Koivula
2f79bfebf7
Rename parameter encoder_state to state in all functions.
...
- It's so widely used that there isn't really a need to emphasize that
it's the encoder's state. Also, it isn't really the encoder's state,
but the encoding job's state.
2015-03-04 17:31:07 +02:00
Ari Koivula
14fe1b6648
Rename enum color_index to color_t.
2015-03-04 16:37:35 +02:00
Ari Koivula
ded6fd9ee8
Renamed typedef pixel to pixel_t.
2015-03-04 16:35:53 +02:00
Ari Koivula
1f42adb1ea
Renamed typedef coefficient to coeff_t.
2015-03-04 16:33:47 +02:00
Ari Koivula
fedd05465d
Rename struct sao_info to sao_info_t.
2015-03-04 16:32:38 +02:00
Ari Koivula
3d135324da
Rename struct threadqueue_queue to threadqueue_queue_t.
2015-03-04 16:30:20 +02:00
Ari Koivula
b7fcb800b2
Rename struct threadqueue_job to threadqueue_job_t.
2015-03-04 16:28:56 +02:00
Ari Koivula
cf5f240604
Rename struct hardware_flags to hardware_flags_t.
2015-03-04 16:24:59 +02:00
Ari Koivula
e7754bb518
Rename struct strategy_to_select to strategy_to_select_t.
2015-03-04 16:24:06 +02:00
Ari Koivula
e95b138e62
Rename struct strategy_list to strategy_list_t.
2015-03-04 16:23:04 +02:00
Ari Koivula
95afc5af51
Rename struct strategy to strategy_t.
2015-03-04 16:17:45 +02:00
Ari Koivula
db42176a64
Rename struct image_list to image_list_t.
2015-03-04 16:13:57 +02:00
Ari Koivula
7bafd34cfa
Remove struct rd_stats.
2015-03-04 14:01:17 +02:00
Ari Koivula
fe55961f84
Rename struct image to image_t.
2015-03-04 14:01:17 +02:00
Ari Koivula
5431d0ce19
Rename struct lcu_order_element to lcu_order_element_t.
2015-03-04 14:01:17 +02:00
Ari Koivula
9e64ee3cee
Suffix encoder_state_config structs with _t.
2015-03-04 14:01:17 +02:00
Ari Koivula
cdb1a25f05
Inline struct me into encoder_control_t.
2015-03-04 14:01:16 +02:00
Ari Koivula
e5b18cd536
Inline cu_info_intra and cu_info_inter into cu_info_t.
2015-03-04 14:01:16 +02:00
Ari Koivula
a0767a76d2
Rename struct vector2d to vector2d_t.
2015-03-04 14:01:16 +02:00
Ari Koivula
5b12830756
Rename struct config to config_t.
2015-03-04 14:01:16 +02:00
Ari Koivula
1a62fee300
Rename struct cabac_data to cabac_data_t.
2015-03-04 14:01:16 +02:00
Ari Koivula
727fefacc4
Rename struct cabac_ctx to cabac_ctx_t.
2015-03-04 14:01:16 +02:00
Ari Koivula
4bc0308b7e
Rename struct bitstream_file to bitstream_file_t.
2015-03-04 14:01:15 +02:00
Ari Koivula
d6ec6a618d
Rename struct bitstream_mem to bitstream_mem_t.
2015-03-04 14:01:15 +02:00
Ari Koivula
106c9128ad
Rename struct bitstream_base to bitstream_base_t.
2015-03-04 14:01:15 +02:00
Ari Koivula
5d8498dc88
Rename struct bit_table to bit_table_t.
2015-03-04 14:01:15 +02:00
Ari Koivula
8cd8240f7a
Rename struct bitstream to bitstream_t.
2015-03-04 14:01:15 +02:00
Ari Koivula
7ca688b376
Rename struct videoframe to videoframe_t.
2015-03-04 14:01:15 +02:00
Ari Koivula
63e224574e
Rename struct cu_info to cu_info_t.
2015-03-04 14:01:15 +02:00
Ari Koivula
f3fab62d33
Rename struct cu_array to cu_array_t.
2015-03-04 14:01:15 +02:00
Ari Koivula
78f0c3a83b
Rename struct scaling_list to scaling_list_t.
2015-03-04 14:01:14 +02:00
Ari Koivula
f6147b410a
Rename struct encoder_control to encoder_control_t.
...
Conflicts:
src/encoder_state-geometry.h
src/encoderstate.h
2015-03-04 14:01:14 +02:00
Ari Koivula
b14f89c88f
Rename struct encoder_state to encoder_state_t.
2015-03-04 14:00:46 +02:00
Marko Viitanen
890b4c1e20
Modified image handling and QP calculations to support GOP
2015-03-03 12:22:50 +02:00
Marko Viitanen
c3d9e0b707
Added testset of data for GOP
2015-03-03 12:22:09 +02:00
Marko Viitanen
34b231378b
Modified config and encoder_state structs for GOP
2015-03-03 12:21:45 +02:00
SanteriS
b55bfe1729
Merge branch 'master' of https://github.com/ultravideo/kvazaar
2015-02-25 18:15:35 +02:00
SanteriS
bef7cae4f8
Merge branch 'master' of https://github.com/ultravideo/kvazaar
2015-02-25 15:29:11 +02:00
SanteriS
f478732b4c
tz search bugfix
2015-02-25 15:28:45 +02:00
Ari Koivula
d7383ccb25
Change license to LGPL.
...
- Everyone who has contributed code to the project has been asked to license
their contributions under LGPL and they have agreed.
- COPYING file changed to say LGPLv2.1 instead of GPLv2.
- GPL changed to LGPL in the header of every single file that had a header and
header added to the few that were missing one.
- Also.. Happy new year!
2015-02-25 15:19:05 +02:00
Ari Koivula
3e58e03b56
Select motion compensation search starting point from among merge candidates.
...
- Greatly reduces bdrate for most sequences.
2015-02-25 12:58:15 +02:00
SanteriS
2f68cf3847
(TZ search) Fixed missing check for owf mode. Added 6 point hexagon search pattern.
2015-02-23 16:59:48 +02:00
Ari Koivula
9865e73b90
Remove NetBSD getopt dependency to unistd.h.
...
- Remove the $NetBSD header as it wouldn't get updated and is wrong.
2015-02-19 16:26:14 +02:00
Ari Koivula
dd54b5ae10
Replace GNU getopt with NetBSD getopt.
...
- This doesn't compile, but I'm including it to have a version history for
changes required to make it work.
- We need this to have a getopt implementation on Windows.
- It's necessary to change the implementation to switch from GPL to LGPL.
2015-02-19 16:26:14 +02:00
Ari Koivula
c979db7e95
Avoid sorting intra modes unnecessarily.
2015-02-19 16:25:45 +02:00
Ari Koivula
1c2129fdcb
Improve sort_modes.
...
- When encoding with fast enough settings this function can use up to 5%
of the cpu time, so I tried to optimize it a little bit.
2015-02-19 16:25:38 +02:00
Ari Koivula
5fa6438b25
Clean up calls to memset.
...
- Replaces all calls to memset with new FILL and FILL_ARRAY macros. The use
of memset was inconsistent and we never use it for anything complicated.
2015-02-19 16:25:28 +02:00
Arttu Ylä-Outinen
b6776a8cee
Add --vps-period parameter.
2015-02-18 13:55:27 +02:00
SanteriS
1a4d30d15a
fixed step 1 of TZ algorithm
2015-02-11 18:51:21 +02:00
SanteriS
ce4c251cd1
Merge branch 'master' of https://github.com/ultravideo/kvazaar
2015-02-09 17:29:49 +02:00
Ari Lemmetti
8aea1a0fa9
Updated version string. Fixed dct strategy registration error message.
2015-02-05 14:07:26 +02:00
Ari Lemmetti
7846cf3093
Merge branch 'faster_interpolation'
2015-02-05 13:29:43 +02:00
Ari Lemmetti
7430622038
Copy ipol-generic strategy as a base for avx2 strategy
2015-02-05 13:28:07 +02:00
Ari Lemmetti
8495870df8
Using BIT_DEPTH macro because it is constant
2015-02-05 13:19:54 +02:00
Ari Lemmetti
c82adae0c4
Use four tap functions in octpel chroma interpolation
2015-02-04 18:23:57 +02:00
Ari Lemmetti
2f11caeb73
Added generic four tap functions. Use them in halfpel chroma interpolation.
2015-02-04 17:50:12 +02:00
Ari Lemmetti
ff456c120a
Enabled link time optimizations. Disabled default rules.
2015-02-04 15:19:47 +02:00
SanteriS
50dd59eb21
Added different search patterns for TZ search.
2015-02-02 19:14:45 +02:00
Ari Lemmetti
041d970ece
Apply fast clipping also to chroma filtering.
2015-01-29 16:19:04 +02:00
Ari Koivula
ff721bab81
Fix possible non-determinism with owf.
...
- Triggers when owf is on, sao is off and deblocking is on.
2015-01-26 16:02:31 +02:00
Ari Koivula
f01cbbb5ca
Add --no-signhide parameter.
2015-01-24 21:29:37 +02:00
Ari Koivula
5f24c6b73d
Make normal dequant use runtime sign-hiding configuration.
2015-01-24 21:29:25 +02:00
Ari Koivula
1ccb3bd324
Move sign hiding stuff in rdoq to its own function.
...
- There is some stuff from sign hiding left intermingled with rdoq code,
but I don't want to change the code too much before testing that I didn't
break anything.
2015-01-24 21:27:20 +02:00
Ari Koivula
804a3b648b
Clean up quantization sign hiding.
...
- To allow for later configuration at runtime.
2015-01-23 16:03:59 +02:00
Ari Koivula
c940ccb549
Fix gcc error.
...
encmain.c:433:13: error: format ‘%llu’ expects argument of type ‘long long unsigned int’, but argument 4 has type ‘uint64_t’
2015-01-23 15:50:14 +02:00
Ari Koivula
5d16fa6c4f
Add VPS every intra frame.
...
- Just rdo=0 for now. Later this can be extended to be configured separately.
2015-01-22 13:13:23 +02:00
Ari Koivula
d685ee86d6
Record total bitstream length correctly when using stdout.
...
- If the output is not a file, we can't check the size of the file.
2015-01-22 12:29:06 +02:00
Ari Koivula
1b19afc706
Flush output buffer after every frame.
2015-01-22 12:29:06 +02:00
Ari Lemmetti
b4aab06073
Added new files in Makefile.
2015-01-21 18:38:09 +02:00
Ari Lemmetti
c21351cc12
Added fast clipping function for clamping values to bit depth.
2015-01-21 17:53:06 +02:00
SanteriS
4b3d77aaf2
Enable tz search.
2015-01-21 12:55:00 +02:00
Ari Koivula
f86def8ed8
Remove unused variables.
2015-01-20 17:50:19 +02:00
Ari Koivula
8ac66934c0
Clean up NAL header code.
...
- Use long start code for RADL NAL units if they are the first NAL in the
access unit.
- Ffmpeg mpegts was complaining about start codes not being present.
There wasn't anything wrong that I could find though, besides the
missing intra long start code.
2015-01-20 17:34:59 +02:00
Ari Koivula
81ad583e08
Use the same coeff cost calculation for all rd modes.
...
- It's not worth it to have these faster approximations for coefficient cost.
2015-01-20 17:34:59 +02:00
Ari Koivula
870171e6ad
Fix --rd=0 to actually work.
2015-01-20 17:34:59 +02:00
Ari Lemmetti
f037ed580c
Improved data layout
2015-01-15 16:31:18 +02:00
Ari Lemmetti
4382c2f088
Added missing -1 to PIXEL_MAX macro
2015-01-15 16:14:07 +02:00
Ari Lemmetti
465f718eeb
Move value clipping away from separate loop
2015-01-15 16:14:00 +02:00
Ari Lemmetti
9d12ce21d5
Cleaned luma interpolation, added functions for 8-tap filtering.
2015-01-15 16:13:12 +02:00
Ari Lemmetti
0e56d13b5d
Use smaller bit depth for fractional pixel interpolation
2015-01-15 15:00:09 +02:00
Ari Lemmetti
cc061b4c3d
Added ipol strategy for interpolation filters.
...
Added initial files for AVX2 and generic strategies.
2015-01-15 14:59:37 +02:00
Ari Lemmetti
73762062b6
Clarified comments a bit
2015-01-15 11:57:19 +02:00
Ari Koivula
ab3364afb4
Add skipping of intra search in inter frames for rd=0.
2015-01-15 11:54:35 +02:00
Ari Lemmetti
c9f310a6c2
Use pixel type instead of uint8_t
2015-01-15 11:47:00 +02:00
Ari Lemmetti
cad5f14372
Fixed compile errors (-Werror)
2015-01-14 18:27:35 +02:00
SanteriS
126569c737
Added first version of TZ search algorithm.
2015-01-14 14:54:09 +02:00
Ari Koivula
660547098a
Merge branch 'intra-fast-lcu'
2015-01-14 12:03:12 +02:00
Ari Koivula
01195aecbb
Move cu split model to a function.
2015-01-14 11:16:34 +02:00
Ari Koivula
8c89dcfc50
Move mode bit calculation to a function.
2015-01-14 10:44:52 +02:00
Daniel Eneyev
27d79ffae3
workaround for GET_TIME in Mac OS
2015-01-13 17:06:55 +03:00
Ari Koivula
fc79c2103e
Generalize the fast intra-mode tryout code to work for any depth.
2015-01-12 11:47:21 +02:00
Ari Koivula
f1364d297b
Fix bug resulting in incorrect bitstream.
...
- If 64x64 intra PUs were enabled and --rd was less than 2, no intra mode
search was performed for depth 0 resulting in incorrect bitstream.
2015-01-12 11:16:33 +02:00
Ari Koivula
bbae2e8a27
Update usage and readme.
2015-01-12 10:59:28 +02:00
Ari Koivula
f4bd322804
Add command line options for prediction unit depth.
2015-01-12 10:40:34 +02:00
Ari Koivula
edf2681ea4
Comment functions in search.c.
2015-01-07 14:56:14 +02:00
Ari Koivula
8c1e0b8a7f
Tweak owf=auto.
...
- Twice the required number is too little.
2014-12-10 11:23:51 +02:00
Ari Koivula
129c8e38e0
Set owf default to auto.
2014-12-09 19:00:11 +02:00
Ari Koivula
51b5692121
Rewrite owf=auto code to be more general.
...
- Change the definition to be a bit more general. The mapping from resolution
to owf frames stays mostly the same however, but should handle weird
resolutions better.
- Move everything to config module.
- Fix handling of tiles. It had a bug where owf for tiles was always
threads * 4/3 - 1. Works as intended now.
2014-12-09 19:00:11 +02:00
Ari Koivula
374012ab26
Merge branch 'intraskip'
2014-12-01 17:30:03 +02:00
Ari Lemmetti
24492adb02
Merge branch 'fme_merge'
2014-11-21 15:08:45 +02:00
Ari Koivula
21d221c075
Add fast 64x64 intra test.
...
- If intra search is not enabled for a depth, try the result from the
top left CU of the next depth. This seems to give most of the benefit
of at least 64x64 intra prediction units without costing very much
in performance.
2014-11-20 17:20:24 +02:00
Ari Lemmetti
4874f2662f
Added --subme commandline parameter for fractional pixel motion estimation: 1 == enable (default), 0 == disable.
2014-11-20 14:59:04 +02:00
Ari Lemmetti
d5d2e04995
Merge branch 'fme'
2014-11-19 16:40:22 +02:00
Ari Koivula
3ef88dfda5
Add --owf=auto option.
...
- The optimal value for Overlapping Wave Front (OWF) depends on a bunch of
variables. Attempt to set the optimal owf value, at least for all intra.
2014-11-18 02:19:40 +02:00
Ari Lemmetti
5a946f24ea
Fixed time output formatting.
2014-11-14 16:46:41 +02:00
Ari Lemmetti
56c537e145
Build fixes for MinGW.
...
threads.h: use windows.h headers for clock stuff on MinGW
strategyselector.c: assert with strlen for MinGW support
2014-11-14 16:46:41 +02:00
Daniel Eneyev
992a98c5c4
If output name is dash - write to stdout
2014-11-13 12:45:53 +03:00
Ari Lemmetti
c46b75a0ca
Fixed mingw build error. Modified function declaration in getopt.h.
...
A macro definition adds * in front of __argc and __argv, causing
build error with mingw. Renamed them to argc and argv to prevent this.
2014-10-31 17:40:18 +02:00
Ari Lemmetti
6a12bc406d
Load greatest submodule. Fixed loop that occurred during build process.
2014-10-30 15:17:50 +02:00
Ari Lemmetti
a64aae7c53
Makefile now compiles tests. Fixed test files. Removed unused stuff.
2014-10-29 15:32:47 +02:00
Ari Koivula
50643eeaf8
Merge pull request #88 from darealshinji/patch-2
...
version.h is no longer used
2014-10-27 20:23:15 +02:00
darealshinji
81ecef17d7
version.h is no longer used
2014-10-27 18:17:26 +01:00
darealshinji
e230fb2eab
make it possible to add custom CFLAGS
2014-10-27 17:19:05 +01:00
Ari Lemmetti
e93fa54838
Added -lrt to fix undefined references to clock_gettime on some systems
2014-10-23 14:51:28 +03:00
Ari Lemmetti
eb7cecc3dd
Added .travis.yml for continuous integration. Added env variable to disable AVX2 for Travis (GCC version doesn't support it yet).
2014-10-23 14:20:07 +03:00
Ari Lemmetti
20967cfafe
Allow CC to be defined other than gcc. If not defined, use gcc as default.
2014-10-23 13:25:00 +03:00
Ari Koivula
fcb6fa6d4b
Fix compilation error on PowerPC.
...
- Need abs from stdlib.
2014-10-21 18:14:32 +03:00
Ari Koivula
f6fead6221
Fix crash on inter frames.
...
- If the bitcost was 0 it would underflow for skip mode. The bitcost is now
checked before decrementing.
2014-10-21 18:11:39 +03:00
Ari Koivula
dfc67b766a
Disable rd1 chroma search.
...
- The bdrate improvement isn't really worth the time it takes, so enable it
only for rd3 until it can be made faster or better.
2014-10-16 13:59:20 +03:00
Ari Koivula
e9b8d9b889
Fix gcc warnings.
...
- Remove unused variables.
- Change intra prediction functions to take their inputs as const pointers.
- Change intra_get_pred to take two pointers instead of an array of pointers,
because the warnings got just too exotic.
2014-10-16 13:17:46 +03:00
Ari Koivula
4bac52d9b6
Merge branch 'intra'
2014-10-16 13:11:23 +03:00
Ari Koivula
afb9e8c3f4
Remove extra parameter sets.
2014-10-16 12:21:36 +03:00
Ari Koivula
02ec26fcea
Try different number of chroma intra modes for different depths.
...
- And avoid doing extra work if no extra modes are tested for certain depths.
2014-10-16 12:21:36 +03:00
Ari Koivula
3cf5e422e8
Make fast chroma mode search select modes for slower chroma search.
2014-10-16 12:21:36 +03:00
Ari Koivula
d12dbd4aa0
Add fast intra chroma mode search.
2014-10-16 12:21:08 +03:00
Ari Koivula
75a137c1e9
Add --cpuid parameter to disable runtime optimizations.
2014-10-16 12:01:36 +03:00
Ari Koivula
3e6023dfb5
Rename search constants and set sane defaults.
2014-10-16 03:08:11 +03:00
Ari Koivula
8a407b0313
Estimate luma and chroma intra mode bits separately.
...
- Remove cu_info.intra[].cost and bitcost as unnecessary.
- Add luma_mode_bits to complement chroma_mode_bits and remove
intra_pred_ratecost as unnecessary. Difference is that intra_pred_ratecost
was more coarse and included chroma mode with the assumption that it would
be the same as luma.
2014-10-16 03:08:11 +03:00
Ari Koivula
c9e212ba92
Add intra chroma mode search.
...
- Based on full chroma reconstruction so enabled only for --rd=2.
2014-10-16 03:07:50 +03:00
Ari Koivula
b32867be2a
Remove -lrt from LDFLAGS.
...
- This might be required on some embedded system, but from what I can see
all the functions we use from real time extensions are included in libc
and the program seems to work fine without it.
- It doesn't exist on MinGW or Mac, so I think it's better to remove it
completely and add it later on any system that actually requires it.
- Related to #85.
2014-10-14 11:48:57 +03:00
Ari Koivula
6f8a976b12
Give ARCH_X86_64 to yasm on Mac.
...
- Issue raised in #85.
2014-10-14 09:47:56 +03:00
Ari Koivula
55ab08c213
Fix incorrect const qualifiers.
...
- Change input pointers to const in dct-generic, like they should have been.
- Fixes compilation error on GCC.
2014-10-13 16:57:15 +03:00
Ari Koivula
8a5b24bcbe
Remove usages of GCC __attribute__.
...
- To allow clang to compile, as it doesn't according to #58.
- The target attributes are not needed anymore due to makefile handling
targeting now.
- The __attribute__((unused)) was used for debugging. I don't know if clang
supports this attribute or not but it doesn't seem very important so
I'm removing it just in case.
2014-10-13 16:46:26 +03:00
Ari Koivula
04613bd5b3
Disable GET_TIME on Mac.
...
- This should fix the Mac version not compiling in issue #85.
2014-10-13 16:22:11 +03:00
Ari Koivula
a469c059a5
Take chroma tr-skip bits into account.
2014-10-13 10:48:39 +03:00
Ari Koivula
7a5cf5d865
Add trskip mode cost to fast trskip mode decision.
2014-10-13 10:45:41 +03:00
Ari Koivula
f164a5ba79
Add fast transform skip estimation to rough intra search.
2014-10-13 10:42:24 +03:00
Ari Koivula
d893a489d6
Fix mingw compilation issue.
...
strategies/avx2/dct-avx2.c:334:25: error: pasting "g_dct_16" and "[" does
not give a valid preprocessing token
- The [ is not part of the token so compilation failed on mingw GCC 4.9.1.
- Fixes #86.
2014-10-10 16:32:39 +03:00
Ari Koivula
28d1532578
Make rd=1 use cabac for coeff cost estimation.
2014-10-08 12:50:03 +03:00
Ari Koivula
cbb2aa75b7
Add macros for adjusting weight of distortion between luma and chroma.
...
- Everything needs to have a short name because windows has a maximum path
length limitation that is breaking my testing framework.
2014-10-08 10:31:54 +03:00
Ari Koivula
49ad845c33
Add cabac bits for part_mode.
2014-10-08 10:31:54 +03:00
Ari Koivula
b6710e7893
Add cabac bits for cu split flag.
2014-10-08 10:31:54 +03:00
Ari Koivula
38b224cf69
Change rest of cu split search costs to double.
2014-10-08 10:31:54 +03:00
Ari Koivula
17473624d3
Add transform tree bit costs for cbf_luma.
2014-10-08 10:31:54 +03:00
Ari Koivula
3b04d39db4
Take cabac bits into account on transform tree.
2014-10-08 10:31:54 +03:00
Ari Koivula
296f142d9e
Retain coded block flag data during transform split search.
2014-10-08 10:31:54 +03:00
Ari Koivula
85dea10f3f
Clean up transform split search.
...
- Remove unnecessary checks and comment.
2014-10-08 10:31:54 +03:00
Ari Koivula
e1b801eb6f
Add transform tree chroma cbf bits.
2014-10-08 10:31:23 +03:00
Ari Koivula
3868cc7ff1
Fix crash on inter search when --tr-depth-intra is used.
...
- Transform splits meant for intra modes were used for inter when inter mode
was chosen, which caused an assert to be triggered if the split transform
block didn't have any coefficients.
2014-10-03 19:29:06 +03:00
Ari Lemmetti
bcf12567d0
Added some comments.
2014-10-03 17:51:58 +03:00
Ari Lemmetti
fea517c2ae
Misc code cleanup
2014-10-03 17:06:09 +03:00
Ari Lemmetti
85682c3b6a
Removed unused transpose functions.
2014-10-03 11:39:31 +03:00
Ari Koivula
8a80845b91
Add chroma to transform split search.
2014-10-03 11:36:57 +03:00
Ari Koivula
51662e1081
Fix differences between cu_rd_cost_luma and rdo_cost_intra.
2014-10-03 11:36:57 +03:00
Ari Koivula
bc7d7d5cb6
Add cu_info* as parameter to reconstruction functions.
...
- This is required so these functions can be used for searching. When NULL
is given they take the CU from LCU struct as they did previously.
Conflicts:
src/search.c
2014-10-03 11:36:56 +03:00
Ari Koivula
ccc575e2c6
Disable transform tree bits.
2014-10-03 11:36:56 +03:00
Ari Koivula
a0ab469c89
Disable rdo_cost_intra.
2014-10-03 11:36:56 +03:00
Ari Koivula
c164978e21
Add FULL_CU_SPLIT_SEARCH macro for disabling cu split optimization.
2014-10-03 11:36:56 +03:00
Ari Koivula
549ac96438
Change costs to doubles to avoid rounding intermediate results.
...
- Helps with debugging.
2014-10-03 11:36:56 +03:00
Ari Koivula
e591e89ade
Add prediction mode to chroma reconstruction parameters.
...
- Just like in luma.
2014-10-03 11:36:56 +03:00
Ari Koivula
f6272f06fc
Unify signature for transform functions.
...
- Some used block, coeff and some src, dst. Now all signatures are const input
and non-const output.
2014-10-03 11:21:43 +03:00
Ari Koivula
b932cf4b21
Clean up avx2 dct macros.
2014-10-03 11:16:25 +03:00
Ari Koivula
47244a15c3
Merge branch 'dct-optimizations'
...
Conflicts:
src/strategies/avx2/dct-avx2.c
src/strategies/generic/dct-generic.c
2014-10-02 13:45:21 +03:00
Ari Lemmetti
61e1510480
Transform functions in dct-avx2.c are now generated with macros.
2014-10-02 13:24:30 +03:00
Ari Lemmetti
9407610555
Moved DCT / DST matrices to dct-generic.c
2014-10-02 13:24:30 +03:00
Ari Lemmetti
7255112bd8
Added transposed DCT/DST tables. Use them while calculating transforms instead of doing runtime transpose. Added separate functions for DST and IDST.
2014-10-02 13:24:30 +03:00
Ari Lemmetti
e7bcb58846
Added 32x32 IDCT
2014-10-02 13:24:30 +03:00
Ari Lemmetti
eacf173b7e
Added 32x32 DCT for AVX2
2014-10-02 13:24:30 +03:00
Ari Lemmetti
d2856a5d40
Added 32x32 transpose
2014-10-02 13:24:30 +03:00
Ari Lemmetti
7a33f08312
Added 16x16 DCT and IDCT for AVX2
2014-10-02 13:24:30 +03:00
Ari Lemmetti
d2fe2a5391
Added 16x16 transpose
2014-10-02 13:24:30 +03:00
Ari Lemmetti
d6af146a2e
Added part of the functions 16x16 DCT needs
2014-10-02 13:24:30 +03:00
Ari Lemmetti
aba3acdfff
Added AVX2 optimized transforms for 4x4 and 8x8 blocks
2014-10-02 13:24:30 +03:00
Ari Lemmetti
5856f32d81
Fixed incorrect shift values for inverse transforms in generic strategy
2014-10-02 13:24:29 +03:00
Ari Lemmetti
41b032664d
First version of 4x4 forward DCT
2014-10-02 13:24:29 +03:00
Ari Koivula
36232619ab
Fix broken cabac contexts in wpp.
...
- Fixes #84.
- The issue was caused by 241b9d6 naively copying the whole struct, which
contains data other than just the contexts. Rather than reverting the
change, the struct was refactored to have another struct that contained
just the contexts.
2014-09-24 01:02:52 +03:00
Ari Koivula
4e052d3f0f
Wrap contexts of cabac_data inside cabac_data.ctx struct.
2014-09-24 01:02:37 +03:00
Ari Koivula
b339004c4c
Rename cabac_state.ctx to cur_ctx.
2014-09-24 01:02:28 +03:00
Ari Koivula
8b8b53fba5
Merge branch 'sao_cabac'
2014-09-22 10:28:30 +03:00
Ari Koivula
bfa399c8fc
Fix compiler warnings.
...
- Non-parenthesized parameter in a macro.
- Unused variables.
- Wrong const qualifiers.
- Signed/unsigned comparison.
2014-09-22 10:04:57 +03:00
Marko Viitanen
6f65a9cbbd
Improved SAO merge decisions
2014-09-16 10:08:17 +03:00
Marko Viitanen
21df11ba4e
Implemented SAO search for both chroma components
2014-09-15 16:07:31 +03:00
Marko Viitanen
e8d1140a1a
Check SAO band offset for both chroma components and better SAO chroma cabac costs
2014-09-15 16:07:31 +03:00
Marko Viitanen
0c92031e8a
SAO merge checking cleanup
2014-09-15 16:07:31 +03:00
Marko Viitanen
b274e7adcd
Added cabac bit cost calculations to SAO search
2014-09-15 16:07:31 +03:00
Ari Koivula
5f732126c3
Add cabac bit costs float table.
2014-09-15 15:45:43 +03:00
Ari Koivula
0db7d8d20f
test cu split cost
2014-09-15 15:42:03 +03:00
Ari Koivula
35b2e6f755
Add missing cabac context for chroma cbf.
...
- The context was also missing from HM, but has been fixed in HM13.
2014-09-15 15:41:44 +03:00
Ari Koivula
241b9d6adb
Simplify cabac context copying.
...
Conflicts:
src/context.c
2014-09-15 15:41:44 +03:00
darealshinji
61a414bced
reposition colons in usage message to match with the rest
2014-09-15 03:40:18 +02:00
Ari Koivula
3c73892609
Fix transform split search.
...
- Redo the search with the best mode to make sure the tr_depth parameters are
correct.
2014-09-11 10:56:53 +03:00
Ari Koivula
46b6b1243b
Add --rd=3 mode and enable searching of intra depth 0.
...
- intra_build_reference_border was overflowing at depth 0 because it uses
arrays just large enough to accommodate 32x32 transforms, which is the
biggest transform.
- For similar reasons search_intra_rough doesn't work at depth 0.
- The --rd=3 mode tries all modes with transform search. It also works without
rough search so it was used to test depth 0 search. If --rd=3 is not on intra
split at depth 0 is not searched for.
Conflicts:
src/search.c
2014-09-11 10:54:41 +03:00
Ari Koivula
c5fa824347
Rebase transform split search.
2014-09-08 14:13:59 +03:00
Ari Koivula
79b86ce6e1
Add --tr-depth-intra command line option.
...
Conflicts:
src/encoder.c
2014-09-04 13:42:24 +03:00
Marko Viitanen
fe236de807
Fixed sps_max_dec_pic_buffering value to include current picture
2014-09-01 10:31:11 +03:00
Marko Viitanen
dbcc8d65aa
Removed duplicate function from RDOQ
2014-08-28 08:50:01 +03:00
Ari Koivula
931ec7301c
Put slice delta QP to bitstream.
...
- Before slice delta QP was always 0. Now if global->QP is changed before
contexts are set, the delta qp is put to the bitstream, allowing for rough
frame level rate control.
2014-08-25 16:43:23 +03:00
Ari Koivula
4c3bbd4a35
Rewrite the SConstruct.
...
- Works with new /strategy/ structure.
- Change architecture selection to use arch= instead of construction target.
2014-08-25 16:43:23 +03:00
Ari Lemmetti
f88c3b6f37
Removed unnecessary if (both branches did the same thing)
2014-08-20 11:54:35 +03:00
Laurent Fasnacht
f3c311fe1a
Fix commit 8502f3d
2014-08-11 15:17:15 +02:00
Laurent Fasnacht
f9bffe35a5
Log tile id in sad perf log
2014-08-11 11:57:08 +02:00
Laurent Fasnacht
6a937de9b2
Fix search_cu log
2014-08-11 11:57:08 +02:00
Laurent Fasnacht
8502f3d850
Improve logging
2014-08-11 11:57:07 +02:00
Laurent Fasnacht
f1b303a2d2
Fix compilation errors
2014-08-11 09:53:06 +02:00
Ari Lemmetti
47e3bcfb50
Fixed incorrect shift values for inverse transforms in generic strategy
2014-08-07 16:01:30 +03:00
Ari Lemmetti
709520a233
Removed all AVX2 instructions from SATD functions.
...
-Zero extend macro now returns results in 2 xmm registers instead of one ymm
2014-07-31 13:25:28 +03:00
Ari Lemmetti
0beb278f5b
Partial butterfly strategy is now called DCT strategy. Made changes to transform functions in preparation for optimizations.
...
-Moved fast_forward_dst and fast_inverse_dst to DCT strategies
2014-07-31 13:25:28 +03:00
Ari Lemmetti
6bf63bd171
Added AVX2 strategy for partial butterfly (no optimizations yet)
2014-07-31 13:25:28 +03:00
Ari Lemmetti
faccc4f09b
Partial butterfly functions now utilize the strategy selector
2014-07-31 13:25:28 +03:00
Ari Koivula
c2fac805d7
Give HAVE_ALIGNED_STACK to yasm on windows.
...
- Linux gets it through some other means but on windows it needs to be
given explicitly.
- Fixes issue #78.
2014-07-30 16:26:23 +03:00
Ari Koivula
669e99dd7f
Improve intra SAD AVX2 intrinsics.
...
- Moved implementations for different sizes to inline functions that are
defined using each other, reducing the amount of redundant code.
- Performance of sad_8bit_32x32_avx2 improved by about 10% due to unrolling of
the loop.
2014-07-25 15:59:55 +03:00
Ari Koivula
e00102f0ca
Compile asm optimizations only if yasm is present.
2014-07-23 14:57:40 +03:00
Ari Lemmetti
85fb0784e4
Fixed indentation and added some empty lines for readability
2014-07-23 12:32:27 +03:00
Ari Lemmetti
bd6e89c1f0
Updated include directories and file names to Makefile
2014-07-22 15:36:54 +03:00
Ari Lemmetti
4f88ebce5a
Added comments and made Visual Studio not compile x86inc.asm
2014-07-22 15:07:57 +03:00
Ari Koivula
cfd3636e08
Move some repetitive SATD asm into a macro.
...
Conflicts:
src/strategies/x86_avx/picture_x86.asm
2014-07-22 12:46:39 +03:00
Ari Lemmetti
c81639dd09
Removed old unused macro
2014-07-22 11:11:20 +03:00
Ari Lemmetti
cf0797cafd
Reordered and indented assembly code
2014-07-22 11:07:42 +03:00
Ari Lemmetti
fea44c8234
Renaming AVX/asm files
...
-Split SAD and SATD functions in separate files
2014-07-21 18:02:01 +03:00
Ari Lemmetti
a64df6f0d0
Merge branch 'asm'
...
Conflicts:
build/kvazaar_lib/kvazaar_lib.vcxproj.filters
src/Makefile
src/strategies/strategies-picture.c
2014-07-21 16:41:09 +03:00
Ari Lemmetti
1be2c3aae5
Preparing push to master and misc
...
-Removed unnecessary <math.h> headers
-Updated AVX/asm optimizations to match the new file hierarchy
-Makefile only compiles .asm files if KVAZAAR_DISABLE_YASM is not set to 1 and TARGET_CPU_ARCH is x86
2014-07-21 12:39:56 +03:00
Ari Koivula
a8f7103797
Add AVX2 implementations for sad_8bit_ 8x8, 16x16 and 32x32.
2014-07-18 18:27:30 +03:00
Ari Koivula
3daa5dd1f1
Add sse2 implementation for sad_8bit_4x4.
2014-07-18 18:20:34 +03:00
Ari Koivula
f49332c9b8
Add missing includes.
2014-07-18 17:56:15 +03:00
Ari Koivula
291817667f
Tidy up the Makefile.
2014-07-18 17:31:18 +03:00
Ari Koivula
e241866f43
Compile intrinsic functions with appropriate flags in gcc.
...
- Remove -march=native as it's no longer necessary for intrinsics to work.
Closes #77 .
- I couldn't test altivec or sse4.1, but sse4.1 compiles so I expect it
to work.
2014-07-18 17:28:14 +03:00
Ari Koivula
5662621b3c
Free threadqueue jobs when they are not needed.
...
- Also add destroying the mutex when the job is freed.
- This makes Kvazaar no longer acquire thousands of OS handles on Windows.
2014-07-16 16:51:20 +03:00
Ari Lemmetti
1e94262f85
Made AVX asm compatible with the changed system
...
- x86inc.asm is now located in extras
- Removed unused cpu.asm/h
2014-07-14 18:51:17 +03:00
Ari Lemmetti
683eda1183
Merge branch 'master' into asm
...
Conflicts:
build/kvazaar_lib/kvazaar_lib.vcxproj
build/kvazaar_lib/kvazaar_lib.vcxproj.filters
src/Makefile
src/strategies/strategies-picture.c
2014-07-14 16:42:33 +03:00
Ari Lemmetti
7f873e037c
Updated Makefile to compile picture_x86.asm
2014-07-14 15:30:08 +03:00
Ari Lemmetti
2169f9ab8c
Added AVX asm comments and fixes
...
-Added vzeroupper to satd macro to prevent AVX-SSE transition penalties in picture_x86.asm
-Fixed the order of registers in zero extend macro in picture_x86.asm
-Fixed SATD checkers test pattern in satd_tests.c
2014-07-14 14:43:36 +03:00
Ari Koivula
5d0df56c94
Move optimizations to their own compilation units according to target.
...
- This is necessary in order to compile AVX intrinsics correctly in
Visual Studio. Having everything in their own units should also make
compiling normal C code with optimizations on easier.
- For now the makefile still relies on GCC __target__ attribute for compiling
intrinsics.
2014-07-11 17:26:19 +03:00
Ari Koivula
f605d6c35b
Align intra buffers to 32 bytes for 256 bit SIMD instructions.
2014-07-11 17:26:19 +03:00
Ari Koivula
fbd03b706e
Reconfigure VS project.
...
- Moved compilation flag stuff from project file to the abstraction layer.
- Disabled randomized base address as unnecessary.
- Disable stack buffer security check from release.
2014-07-11 17:26:19 +03:00
Laurent Fasnacht
72abc69b3d
Measure time for SAD in _DEBUG mode
2014-07-08 11:42:58 +02:00
Laurent Fasnacht
1a318c714d
log poc with new_frame
2014-07-08 11:42:19 +02:00
Laurent Fasnacht
e64a692780
Add CU type in threadqueue.log
2014-07-08 09:06:31 +02:00
Laurent Fasnacht
abfbb7cad3
Fix duplicate type key in threadqueue.log
2014-07-07 11:36:50 +02:00
Laurent Fasnacht
946e3b9651
Log search_cu to threadqueue.log
2014-07-07 10:50:05 +02:00
Laurent Fasnacht
f62e571c15
Add missing info to threadqueue.log
2014-07-07 10:49:40 +02:00
Ari Lemmetti
048127c7e3
AVX assembly optimizations improved
2014-07-02 16:57:06 +03:00
Ari Koivula
7ecf78bb70
Use sqrt lambda cost for searches not using SSD.
...
- Add encoder_state->global->cur_lambda_cost_sqrt.
- Use sqrt lambda for inter search and rough intra search.
- The effect on inter is around 10-20% bdrate. The effect on intra is smaller
and non-existent when --rd=2 is enabled, as the intra search refinement was
already done with SSD and correct lambda.
2014-06-26 13:56:38 +03:00
Laurent Fasnacht
1112dca933
Fix compilation issue with assertion disabled
2014-06-26 07:31:37 +02:00
Laurent Fasnacht
9ab9defe67
Bitstream length per frame works again
2014-06-19 10:24:03 +02:00
Laurent Fasnacht
45faadb2c9
Fix bug where the wrong number of frames could be encoded (if one frame takes longer than the others)
2014-06-19 10:24:02 +02:00
Ari Koivula
d5a77be4b8
Fix avx detection for gcc.
...
- GCC doesn't support _xgetbv intrinsic so we have to use inline assembler.
2014-06-18 11:50:17 +03:00
Ari Lemmetti
bdef5384ef
Added AVX strategy
2014-06-17 16:52:24 +03:00
Ari Koivula
d7abe6a7c2
Address compilation warning.
...
strategyselector.c:170:10: error: ‘__get_cpuid’ is static but used in inline function ‘get_cpuid’ which is not static [-Werror]
return __get_cpuid(level, eax, ebx, ecx, edx);
2014-06-17 16:26:55 +03:00
Ari Koivula
60ecc6baae
Remove unused stuff.
2014-06-17 16:20:01 +03:00
Ari Koivula
7532b789f8
Add -std=gnu99 for gcc.
...
- std=c99 doesn't work because then struct timespec won't be defined.
2014-06-17 16:15:39 +03:00
Ari Koivula
94bc457b6c
Add option to disable fast intra search.
2014-06-17 15:32:05 +03:00
Ari Koivula
e27fc875c0
Clean up intra search.
2014-06-17 15:09:12 +03:00
Ari Koivula
e4d70ac1ab
Use more starting points for smaller blocks in intra search.
2014-06-17 13:28:27 +03:00
Ari Koivula
9911c7553b
Avoid unnecessary intra dir searching.
2014-06-17 13:11:35 +03:00
Ari Koivula
bd16a55b9b
Always check DC and planar intra modes.
...
- At least one of them is always in predicted modes, but to make sure they
are both included add them explicitly.
2014-06-17 12:51:15 +03:00
Ari Koivula
70740da123
Add smarter rough intra search.
...
- Directional intra mode search is done using halving search from the best
known mode. Starting modes are vertical, horizontal and the 3 diagonal
modes.
Conflicts:
src/search.c
2014-06-17 12:33:10 +03:00
Marko Viitanen
0e2fe9e7ff
Changed intra search to skip some modes speeding it up
2014-06-17 12:32:29 +03:00
Marko Viitanen
a1c3cfe944
Moved intra mode cost calculation to a function
...
Conflicts:
src/search.c
2014-06-17 12:32:29 +03:00
Marko Viitanen
eb7d46f9ef
Modify CU split cost.
2014-06-17 12:30:32 +03:00
Marko Viitanen
bfa37b876b
Conformance fix: set sps_max_dec_pic_buffering to correct value
2014-06-17 12:30:32 +03:00
Ari Koivula
b3c15b8f94
Merge branch 'owf'
2014-06-16 16:07:41 +03:00
Laurent Fasnacht
91de92134f
Constrain the search not to go under the LCU below if OWF is enabled
2014-06-16 14:27:56 +02:00
Laurent Fasnacht
ef9c2258e9
Fix frame counter and stats
2014-06-16 13:21:52 +02:00
Ari Koivula
153b1ee41f
Merge branch 'intra-sad-strategies'
2014-06-16 12:34:37 +03:00
Laurent Fasnacht
84d34c2655
Fix compilation on non-intel
2014-06-16 11:24:02 +02:00
Ari Koivula
3f00592b96
Separate strategyselector debug prints from _DEBUG.
...
- I only want to see the strategy stuff.
2014-06-16 12:15:19 +03:00
Ari Koivula
1c97a10a6d
Move intra SAD and SATD functions under strategies.
2014-06-16 12:13:41 +03:00
Laurent Fasnacht
4b4702819b
Also print encoding FPS
2014-06-16 11:10:11 +02:00
Laurent Fasnacht
2347574a8e
Fix problems revealed by valgrind
2014-06-16 11:10:09 +02:00
Laurent Fasnacht
28c3f22ba1
Fix possible freeze
2014-06-16 11:03:48 +02:00
Laurent Fasnacht
a96c742ad4
Fix depends for wpp+owf
2014-06-16 11:03:47 +02:00
Laurent Fasnacht
f99e41d41f
Improved CPU time statistics
2014-06-16 11:03:46 +02:00
Laurent Fasnacht
8a33c0a688
Fix recon job for wfrow
2014-06-16 10:55:01 +02:00
Laurent Fasnacht
bf6024734a
Fix statistics with OWF
2014-06-16 10:55:00 +02:00
Laurent Fasnacht
0522a3d8e5
--owf option
2014-06-16 10:55:00 +02:00
Laurent Fasnacht
47d1ded7b0
Dependencies between frames
2014-06-16 10:54:59 +02:00
Laurent Fasnacht
003d3c504c
image_list_copy_contents
2014-06-16 10:54:58 +02:00
Laurent Fasnacht
f4187dd10c
cu_array data structure
2014-06-16 10:54:57 +02:00
Laurent Fasnacht
3be3fa8d6e
Use different processing order depending if we have OWF or not
2014-06-16 10:54:56 +02:00
Laurent Fasnacht
c32943f78b
OWF
2014-06-16 10:54:56 +02:00
Laurent Fasnacht
490dd15f3d
Remove flush between frame
2014-06-16 10:51:33 +02:00
Laurent Fasnacht
fddcbabe28
bitstream writing is now a "normal" job in a thread
2014-06-16 10:51:32 +02:00
Laurent Fasnacht
ff7143cc24
Assign thread_queue_jobs and move image_free to a more suitable place
2014-06-16 10:51:32 +02:00
Ari Koivula
87ca828a63
Correct intra sad function labels.
...
- These haven't been 16 bit for a long time.
2014-06-16 10:45:10 +03:00
Ari Koivula
fcce6ae823
Fix printing of AVX2 capability.
2014-06-14 01:24:19 +03:00
Ari Koivula
a49ba2633a
Add OS and CPU detection for AVX2 and AVX.
2014-06-13 16:57:53 +03:00
Ari Koivula
1de102be61
Move strategies to their own compilation units.
...
- Enforces a little bit more hierarchy. Compilation units are in strategies
and whatever inline includes they have are in a folder with the same name
as the strategy.
2014-06-13 15:30:23 +03:00
Ari Koivula
aa3549a717
Change SLEEP(0) to SLEEP(10) on Windows.
...
- This is a workaround for a performance problem on Windows where main thread
is busy looping.
2014-06-13 12:01:03 +03:00
Laurent Fasnacht
4acadccf89
Only signal the required number of threads
2014-06-13 08:34:59 +02:00
Laurent Fasnacht
70ce7cec20
Remove unnecessary locks by adding threadqueue->queue_running counter
2014-06-13 08:34:58 +02:00
Laurent Fasnacht
7ef34ff5a1
Ability to dump mutex_lock, mutex_unlock and cond_wait timing, if compiled with -D_PTHREAD_DUMP
2014-06-13 08:32:14 +02:00
Laurent Fasnacht
68ad323e84
Tentative fix for race condition
2014-06-12 14:01:33 +02:00
Laurent Fasnacht
b194e19708
Tentative fix for deadlock
2014-06-12 12:57:14 +02:00
Laurent Fasnacht
b765eca153
Remove unneeded encoder_state_blit_pixels
2014-06-12 11:47:46 +02:00
Laurent Fasnacht
da07b8b35d
No-copy works (SAO and deblocking enabled)
2014-06-12 11:47:38 +02:00
Laurent Fasnacht
2cc700fab8
No-copy works with --no-sao (deblocking enabled)
2014-06-12 11:47:31 +02:00
Laurent Fasnacht
6b408b5904
No-copy works with --no-sao --no-deblock
2014-06-12 11:47:30 +02:00
Laurent Fasnacht
0dbfa62698
Replace copy of images made for tiles by sub-images (no copy)
...
- replace width by stride where required in the source code
2014-06-12 11:47:30 +02:00
Laurent Fasnacht
b1347efef5
Add checkpoint in sao_reconstruct
2014-06-12 11:47:29 +02:00
Laurent Fasnacht
ae4dc4eb44
Fix uninitialized sao_info structure members, which was creating false positive when checkpointing SAO
2014-06-12 11:47:29 +02:00
Laurent Fasnacht
f371bdafc3
sao_info checkpoints
2014-06-12 11:47:28 +02:00
Laurent Fasnacht
b7fe81c55c
Checkpoint in pixels_blit, and avoid doing undefined behaviour when source and destination is the same.
...
Seems a reasonable point to observe when refactoring, since it's called on most image data.
2014-06-12 11:47:28 +02:00
Laurent Fasnacht
da8559fa34
Fix bug in CHECKPOINTS_FINALIZE() when checkpoints are disabled
2014-06-12 11:47:27 +02:00
Laurent Fasnacht
14df6de0d0
Checkpoint on frame checksum
2014-06-12 11:47:00 +02:00
Laurent Fasnacht
22df7cf98b
Use an assert instead of a dumb assignment
2014-06-12 11:47:00 +02:00
Laurent Fasnacht
cf123e317f
Code to checkpoint cu_info and lcu_t
2014-06-12 11:47:00 +02:00
Ari Koivula
ea830d3dd2
Add warning for VLAs in Makefile.
2014-06-12 09:57:08 +03:00
Ari Koivula
443f2f00aa
Fix compilation for VS.
...
- VS2013 does not support variable length arrays.
2014-06-11 17:51:55 +03:00
Laurent Fasnacht
87ed365053
typo fix
2014-06-11 10:29:05 +02:00
Laurent Fasnacht
6ca30367f9
Fix POC bug
2014-06-11 10:29:05 +02:00
Laurent Fasnacht
8437229885
Fix handling of cu_arrays
2014-06-11 10:29:04 +02:00
Laurent Fasnacht
e1d9cb015a
Basic checkpointing system
2014-06-11 10:29:03 +02:00
Laurent Fasnacht
27a49d287d
Big refactor to use videoframe, image_list, and image instead of picture*
2014-06-10 09:19:06 +02:00
Laurent Fasnacht
530faf3951
Move video frame related stuff to videoframe
2014-06-05 14:08:31 +02:00
Laurent Fasnacht
0fac77f9eb
Image now in separate module
2014-06-05 14:04:12 +02:00
Laurent Fasnacht
2456c65822
Replace accesses to picture->cu_array with picture_get_cu and picture_get_cu_const
2014-06-05 10:41:58 +02:00
Laurent Fasnacht
821b71910b
Move picture_list to its own module
2014-06-05 09:49:24 +02:00
Laurent Fasnacht
7372f9244d
Basic infrastructure for OWF
2014-06-05 09:09:25 +02:00
Laurent Fasnacht
16e3a58359
Performance improvement
2014-06-05 06:57:51 +02:00
Laurent Fasnacht
bad6d45e5f
Performance improvement
2014-06-05 06:57:51 +02:00
Laurent Fasnacht
aad2089fcf
Use -ftree-vectorize
2014-06-05 06:57:50 +02:00
Laurent Fasnacht
ea04bcd6a4
AltiVec support for SAD
2014-06-05 06:57:34 +02:00
Ari Koivula
3a7147baf4
Merge branch 't-20140602'
2014-06-04 18:11:15 +03:00
Ari Koivula
31b1bbc215
Address implicit declaration of warnings.
2014-06-04 18:00:50 +03:00
Ari Koivula
4f5c87fc5e
Remove duplicate function definition.
2014-06-04 17:56:05 +03:00
Ari Koivula
cb7d7f9e15
Update Makefile.
2014-06-04 17:52:28 +03:00
Ari Koivula
bb47534b88
Make encoder_state .c files their own compilation units.
...
- It's good that this module has been chopped to smaller pieces, but let's
avoid including .c files unless we really have to. These make pretty good
submodules on their own so just make them their own compilation units.
- Move some stuff around to avoid having to forward declare them
in encoderstate.c.
2014-06-04 17:45:18 +03:00
Ari Lemmetti
9e649a8f38
Updated usage message
2014-06-04 15:23:27 +03:00
Laurent Fasnacht
b8acdc784a
Fix compilation of encoder.c with -D_DEBUG
2014-06-03 15:02:14 +02:00
Laurent Fasnacht
961da05235
Split encoderstate.c in multiple files
2014-06-03 14:47:49 +02:00
Laurent Fasnacht
3d07f8cc84
encoderstate refactor
2014-06-03 14:25:16 +02:00
Laurent Fasnacht
2e821b79a9
encoder_state is now in encoder_state.[ch]
2014-06-03 13:51:30 +02:00
Laurent Fasnacht
9bdecbe071
Better thread scheduling
2014-06-03 11:39:16 +02:00
Laurent Fasnacht
0811dbcfbe
Remove unneeded cond_broadcast. Limit contention
2014-06-03 09:45:17 +02:00
Laurent Fasnacht
5ee1319c08
Altivec detection
2014-06-03 07:55:39 +02:00
Laurent Fasnacht
58ad3b4d26
Log more performance data, also plot how many threads are running
2014-06-03 07:42:22 +02:00
Laurent Fasnacht
5ed69b063b
Strategy selector for array_checksum, basic implementation using precomputed 256*256 block with larger accesses than byte
2014-06-03 07:42:22 +02:00
Ari Koivula
a483e8cb0f
Move cpuid stuff away from compiler namespace.
...
Conflicts:
src/strategyselector.c
2014-05-30 10:08:14 +03:00
Marko Viitanen
6a72f87028
Merge commit '792a5a5dd1946a327f22b2daba05c6645dfa8037'
2014-05-30 08:47:01 +03:00
Marko Viitanen
792a5a5dd1
Small fix for __get_cpuid()
2014-05-30 08:37:03 +03:00
Laurent Fasnacht
642564b6fb
Remove unused variable
2014-05-28 15:04:45 +02:00
Laurent Fasnacht
4f86919d75
Get rid of assembly cpuid for x86, compilation works for powerpc
2014-05-28 15:04:00 +02:00
Ari Koivula
e585da37e5
Give correct transform depth to RDOQ.
...
Conflicts:
src/search.c
2014-05-28 15:47:49 +03:00
Ari Koivula
dceb3da9b8
Fix bug in search relating to transform with no non-zero coefficients.
...
- Because cost was calculated even though there were no coefficients, these
very good modes were less likely to be selected.
- Added assert to encode_coeff_nxn to avoid these problems in the future.
2014-05-28 15:22:18 +03:00
Ari Koivula
ddc02cc09e
Avoid regenerating reference pixels for every rdo mode.
2014-05-22 13:18:28 +03:00
Ari Koivula
dbe13d0cba
Separate sad intra search from rdo search.
2014-05-22 12:47:45 +03:00
Ari Koivula
19ce21e07c
Split final cost to luma and chroma functions.
2014-05-22 09:45:00 +03:00
Ari Koivula
a6962e2974
Separate intra transform coding to luma and chroma functions.
2014-05-22 09:40:34 +03:00
Laurent Fasnacht
3a30a886fc
FREE_POINTER of job->rdepends was at the wrong place (memory leak)
2014-05-22 07:15:18 +02:00
Laurent Fasnacht
3b38777b71
Fix condition depending on uninitialized value in SAO
2014-05-21 16:33:24 +02:00
Laurent Fasnacht
66e730ba94
Fix encoder_state_init, which was making out of bound reads
2014-05-21 14:23:36 +02:00
Laurent Fasnacht
37c20b8ce5
Add dependency between SAO rows
2014-05-21 13:52:56 +02:00
Laurent Fasnacht
90f46dc56f
Threadqueue has now a start index to the first queue job. It improves the speed a little
2014-05-21 12:02:55 +02:00
Laurent Fasnacht
f4f9093cb5
Parallel SAO
2014-05-21 11:48:29 +02:00
Laurent Fasnacht
a3fcb141ed
lcu_order_element now has pointer to neighbor LCUs
2014-05-21 11:06:53 +02:00
Ari Koivula
de76d0a294
Don't add dependency to the above LCU in wavefront if it's not necessary.
...
- The top-right LCU already has dependency to the top LCU.
2014-05-20 10:48:19 +03:00
Laurent Fasnacht
bdc2d43180
Write bitstream directly after doing the search. This is required since we need the correct entropy status for wpp
2014-05-20 09:29:01 +02:00
Laurent Fasnacht
06532292fc
Wavefronts are in tile coordinates
2014-05-20 09:28:58 +02:00
Ari Koivula
4751a3744b
Fix intra mode search not doing boundary smoothing for DC.
...
- Move the boundary smoothing to the prediction function to make sure it's not
forgotten.
2014-05-19 16:23:17 +03:00
Ari Koivula
f9a603e4ea
Move intra mode search from intra module to search module.
...
- Make the actual intra prediction function global.
- Move the rdo stuff to rdo module.
2014-05-19 16:12:02 +03:00
Ari Koivula
1da94f2085
Stop deblocking from filtering edges not on 8x8 grid.
2014-05-19 15:58:54 +03:00
Ari Koivula
2224e18a46
Make deblocking work with transform splits.
...
- It used to work only with the implicit transform split from LCU size.
2014-05-19 15:58:54 +03:00
Ari Koivula
656b0a321b
Add chroma mode to lcu_set_intra_mode.
...
- This is needed for intra split.
2014-05-19 15:58:54 +03:00
Ari Koivula
921f58b249
Add tr_split to lcu_set_intra_mode.
2014-05-19 15:58:54 +03:00
Ari Koivula
846b608125
Add transform split recursion to intra reconstruction.
2014-05-19 15:58:54 +03:00
Ari Koivula
63f6cad5a0
Include global.h in thread modules.
2014-05-19 15:58:16 +03:00
Ari Koivula
551b087b47
Remove bunch of unnecessary code from encode_transform_unit.
...
- Really, it's useless. Selecting scan order isn't this hard.
- Checked from HM that ctx_idx doesn't have anything to do with contexts.
2014-05-16 17:42:40 +03:00
Ari Koivula
f73bef0941
Remove unused include.
2014-05-16 16:09:59 +03:00
Laurent Fasnacht
6fdb821b14
Fix memory leaks
2014-05-16 12:20:40 +02:00
Laurent Fasnacht
d4a6aed471
Multi-row jobs
2014-05-16 12:20:40 +02:00
Marko Viitanen
94285fbed7
Fixed compiling on visual studio with _DEBUG defined
2014-05-16 12:22:06 +03:00
Marko Viitanen
86155ef1ba
Added windows specific timing macros for thread debugging
2014-05-16 12:16:22 +03:00
Laurent Fasnacht
36945e89ce
Stubs to be able to make a portable version of the profiling
2014-05-16 10:15:05 +02:00
Laurent Fasnacht
53b0835316
Improve handling of jobs when not using threads
2014-05-16 08:50:43 +02:00
Laurent Fasnacht
519750d630
Write bitstream of a wavefront in a parallel way
2014-05-16 08:50:42 +02:00
Laurent Fasnacht
7473ac1bfc
Able to log time in a simple way
2014-05-16 08:50:42 +02:00
Laurent Fasnacht
86e01284b8
Add -lrt
2014-05-16 08:48:54 +02:00
Laurent Fasnacht
4f73a7fc91
Instrument threads in order to be able to do some visualization
2014-05-16 08:44:32 +02:00
Ari Koivula
a7cd31d87b
Update the names of some bins to the current spec.
...
- Helps with debugging.
2014-05-16 05:44:03 +03:00
Ari Koivula
ab4041c8fc
Change cabac debug statements to show information better.
...
- Show the number of bits when encoding multiple bins. I would like just the
bits themselves in string form, but that's too much trouble for this.
- Print them as unsigned and coerce them to unsigned, as they are going
to get coerced to unsigned by the function call anyway.
- Change state to be less verbose.
2014-05-16 05:44:03 +03:00
Ari Koivula
c9a8756fbd
Fix NxN scan mode for lcu_get_final_cost.
...
- Scan mode was always selected according to the first PU mode.
2014-05-15 16:20:35 +03:00
Marko Viitanen
b08047cce9
Fixed intra chroma mode selection
2014-05-15 09:50:05 +03:00
Tapio Katajisto
4d879945b2
Fixed cost calculations in fme
2014-05-15 03:42:42 +00:00
Ari Koivula
f0e990905e
Remove chroma mode "36".
...
- It's an unnecessary chore to handle this special case everywhere (it means
chroma_mode == intra_mode). Better just to use the actual mode.
2014-05-14 19:56:35 +03:00
Ari Koivula
60a0ba4280
Update VS project files to link win32-pthread.
...
- I haven't found a good way of including external dependencies to VS projects
yet. Win32-pthreads is assumed to be found at the same level as kvazaar dir
and has the files x86/pthreadVC2.lib and x64/pthreadVC2.lib.
- Win32-pthreads also requires the pthreadVC2.dll to be in PATH when running
the program. Not sure what to do about that yet. We might need an installer
for windows to handle that.
- Disable openmp as it's no longer used.
- Stop linking Ws2_32.lib as that hasn't been used for ages.
2014-05-14 17:54:34 +03:00
Laurent Fasnacht
8ff9ea0eee
Wavefront works with parallelism + deblock (still no SAO)
2014-05-14 14:01:26 +02:00
Laurent Fasnacht
38444a81a6
Threads should be put in queue in wait state if we want to add dependencies later
2014-05-14 14:01:25 +02:00
Laurent Fasnacht
e72408249b
Add encoder_state pointer to lcu_order_element, new worker_encoder_state_search_lcu function to run the search stuff on one LCU
2014-05-14 14:01:24 +02:00
Laurent Fasnacht
eb62696461
Fix problems when image dimensions are not a multiple of LCU
2014-05-14 13:27:14 +02:00
Laurent Fasnacht
1ba1683c05
search buffer has to be allocated tile-wise to avoid problems with wavefronts
2014-05-14 13:27:13 +02:00
Laurent Fasnacht
bb86f24000
Take advantage of the new buffers to remove unneeded item assignment
2014-05-14 13:27:13 +02:00
Laurent Fasnacht
6607c9f563
Use new buffers for search
2014-05-14 13:27:12 +02:00
Laurent Fasnacht
c257c4b863
Add const for the buffers
2014-05-14 13:27:12 +02:00
Laurent Fasnacht
1680273e80
Store search borders in a buffer for the whole picture
2014-05-14 13:27:11 +02:00
Laurent Fasnacht
0ceb1469a2
Improve decision about when to split into threads
2014-05-14 13:27:11 +02:00
Laurent Fasnacht
d4a303e7e6
Free jobs as soon as possible
2014-05-14 13:27:09 +02:00
Laurent Fasnacht
63adb54a3d
Add --threads <int> command line parameter
2014-05-14 13:27:09 +02:00
Laurent Fasnacht
e772799d5e
encoder_state_encode uses now the threadqueue
2014-05-14 13:27:08 +02:00
Laurent Fasnacht
baede7f6c4
threadqueue
2014-05-14 13:27:08 +02:00
Laurent Fasnacht
8b7774153f
Add SLEEP() define
2014-05-14 13:27:08 +02:00
Laurent Fasnacht
aac7fc55b1
Remove filter_deblock function, which is not used and somewhat dangerous, since it doesn't take into account specific stuff about subencoders.
2014-05-14 13:27:07 +02:00
Laurent Fasnacht
bc3ca90bdf
Fix tiles when SAO or deblock is enabled.
...
Was broken by previous commit.
2014-05-14 13:27:07 +02:00
Laurent Fasnacht
4815a0604b
Entropy coding sync works without parallelism, without SAO and without deblocking
2014-05-14 13:27:06 +02:00
Laurent Fasnacht
2c2a2528f3
Remove openmp stuff
2014-05-14 13:27:06 +02:00
Ari Koivula
aee9bf2875
Re-add rdo control to transformskip decision.
...
- It got left out when rewriting the function.
2014-05-14 12:39:23 +03:00
Ari Koivula
9147b7acbf
Split residual quantization to separate luma and chroma function.
2014-05-14 11:19:48 +03:00
Tapio Katajisto
cc92cfee18
Added few warnings to Makefile
...
Cleaned fme code a bit
2014-05-14 01:49:34 +00:00
Tapio Katajisto
efc43c8b3a
Added fractional pixel motion estimation
...
Added fractional mv support for inter recon
Added 1/8-pel chroma and 1/4-pel luma interpolation
2014-05-14 01:42:02 +00:00
Ari Koivula
e947bd4c0e
Clean up trskip decision code and remove old code.
...
- You can define structs inside functions! This changes everything!!
- Bitstream changes a little bit compared to old trskip decision. Bdrate
change is insignificant though.
2014-05-13 22:00:04 +03:00
Ari Koivula
a3cdee9ec5
Move new trskip decision to a function.
2014-05-13 21:59:00 +03:00
Ari Koivula
2ff713ccb2
Add new implementation for trskip decision.
2014-05-13 21:57:45 +03:00
Ari Koivula
8b8da6f493
Make luma and chroma use the same quantization function.
...
- Only thing not working was transform skip.
2014-05-13 21:57:23 +03:00
Ari Koivula
f0bfcedba2
Clean up coeff reconstruction code.
2014-05-13 21:56:10 +03:00
Ari Koivula
0c65a9b658
Remove abs_sum from coeff quantization.
...
- It's meant for checking if there are any coefficients, but we don't use it
and it's annoying to remember to initialize it and pass it around. The
benefit should be quite small anyway.
2014-05-13 21:54:34 +03:00
Ari Koivula
75042fc65d
Move luma quantization to its own function.
2014-05-13 21:34:06 +03:00
Ari Koivula
ba3aaf3189
Expand chroma functions to parent function.
...
- This was done so that making the function work with luma would be easier.
2014-05-13 21:30:14 +03:00
Ari Koivula
637aceb495
Add TR_MAX_WIDTH.
...
- Max transform size is constrained by but independent of LCU size.
- Luma and chroma now have the same stride for transform arrays.
2014-05-13 21:22:40 +03:00
Ari Koivula
1c38209cab
Add missing include.
2014-05-13 09:33:05 +03:00
Ari Koivula
13577562e5
Revert change to definition of LCU_WIDTH.
2014-05-13 09:28:01 +03:00
Ari Koivula
fb763f7940
Move coefficient generation functions from encoder.c to transform.c.
...
- These functions probably should have been there to begin with.
2014-05-12 11:37:39 +03:00
Ari Koivula
a3478ecd20
Move transform skip decision to its own function.
2014-05-12 11:18:27 +03:00
Ari Koivula
d9b890de6e
Remove redundant variables.
...
- Redefine LCU_WIDTH to be 64. Stuff will break horribly if it's
anything else anyway.
- Add LCU_WIDTH_C for chroma LCU width. It should be more readable than the
constant (LCU_WIDTH >> 1).
2014-05-12 10:58:07 +03:00
Ari Koivula
59e0e98523
Separate luma and chroma coefficient generation variables.
2014-05-12 10:38:24 +03:00
Ari Koivula
0ca65e7606
Move chroma coefficient generation to its own function.
...
- It's time to chop up this monster that is encode_transform_tree.
2014-05-12 10:24:06 +03:00
Ari Koivula
3c3c9a26c6
Move scan order selection to a function.
2014-05-12 08:47:16 +03:00
Ari Koivula
623d9001a8
Reorder chroma coefficient generation.
2014-05-12 08:47:16 +03:00
Ari Koivula
93141c7d2e
Avoid unnecessary copying of predicted pixels when there are no coeffs.
...
- These are probably from a time when reconstruction happened in this
function.
2014-05-09 16:39:58 +03:00
Ari Koivula
27ab882c25
Clean up coefficient generation.
2014-05-09 16:33:10 +03:00
Ari Koivula
ce945ab4ef
Handle coefficient initialization better.
...
- Coefficients are no longer required to be pre-zeroed. The resulting zeroes
are copied in even in the case where we already know they are all zeroes.
- Move cbf clearing code to only happen at the leaves of the recursion.
2014-05-09 16:30:28 +03:00
Laurent Fasnacht
b274558139
Refactor and fix entry_points functions.
...
Seems to be OK with HM now
2014-05-09 12:42:37 +02:00
Laurent Fasnacht
43b5f84c0d
Fix sao_calc_edge_block_dims
...
It was computing wrong dimensions, which was causing out-of-bounds reads in sao_reconstruct.
2014-05-09 10:30:34 +02:00
Laurent Fasnacht
3f975e92cd
Replace line fixing symptoms by assertions, to reveal the cause
2014-05-09 08:24:03 +02:00
Laurent Fasnacht
4dbf7c7a52
Fix blit dimensions in sao_search_best_mode
2014-05-09 08:24:02 +02:00
Ari Koivula
cb5d7e6541
Fix compilation for VS2010.
2014-05-08 17:28:12 +03:00
Laurent Fasnacht
0452806ec4
Entry points
2014-05-08 15:04:56 +02:00
Laurent Fasnacht
da588af2ba
Partial support for wavefront
2014-05-08 15:04:55 +02:00
Laurent Fasnacht
4de5660254
Fix missing offset in LCU range computation for wavefronts
2014-05-08 15:04:55 +02:00
Laurent Fasnacht
dc34a5eac6
LCU borders
2014-05-08 15:04:54 +02:00
Laurent Fasnacht
24f4a8cad1
Wavefront also needs entrypoints
2014-05-08 15:04:53 +02:00
Laurent Fasnacht
d05f8b52aa
Rewrite of encoder_state_write_bitstream_leaf: handle slice + tiles + wavefronts correctly
2014-05-08 15:04:53 +02:00
Laurent Fasnacht
27f694e3e8
Some initial code to support wpp and slices
2014-05-08 15:04:52 +02:00
Laurent Fasnacht
b3d1754cc3
context_copy function
2014-05-08 15:04:51 +02:00
Laurent Fasnacht
163189c3c7
Bitstream for leaves can be computed in parallel
2014-05-08 15:04:51 +02:00
Laurent Fasnacht
be9882f5b2
Leaf bitstream write
2014-05-08 15:04:50 +02:00
Laurent Fasnacht
ae6a7a9c4b
Leaf encoder uses encoder_state->lcu_order
2014-05-08 15:04:49 +02:00
Laurent Fasnacht
b740142325
Add is_leaf to encoder_state
2014-05-08 15:04:48 +02:00
Laurent Fasnacht
8451d5b100
Move some init code to encoder_state_new_frame
2014-05-08 15:04:48 +02:00
Laurent Fasnacht
1cb3f14dfe
lcu_order_count in (leaves) encoder
2014-05-08 15:04:47 +02:00
Laurent Fasnacht
ef6ae3e723
Remove dead code
2014-05-08 15:04:46 +02:00
Ari Koivula
535b42bc9b
Fix compilation for VS2010.
2014-05-07 15:26:44 +03:00
Laurent Fasnacht
05eef82896
Remove extra [ from graphviz dump
2014-05-07 13:40:29 +02:00
Laurent Fasnacht
84e5dbee39
Remove quote from graphviz dump
2014-05-07 13:33:02 +02:00
Laurent Fasnacht
b48a687d3c
Restored parallelism, but it will be done in another way... OpenMP is not very efficient in this kind of dynamic situation
2014-05-07 11:55:56 +02:00
Laurent Fasnacht
0e6f1c99fc
Refactor picture to remove hidden dependency between slice and tiles
...
picture.type -> encoder_state->global->pictype
picture.slicetype -> encoder_state->global->slicetype
picture.slice_sao_luma_flag -> 1 (was constant)
picture.slice_sao_chroma_flag -> 1 (was constant)
This may be changed later. For now it's better to avoid having slice related stuff in picture.
2014-05-07 11:55:48 +02:00
Laurent Fasnacht
39d96e0546
Fix bug with cabac stream pointing to bad data
2014-05-07 11:55:41 +02:00
Laurent Fasnacht
e144f817ef
Works when not using tiles
2014-05-07 11:55:16 +02:00
Laurent Fasnacht
24c2bd70ca
Fix small bugs with compilation
2014-05-07 11:54:35 +02:00
Laurent Fasnacht
a03f0cba19
encoder_control_input_init near the other encoder_control_* functions
2014-05-07 11:53:21 +02:00
Laurent Fasnacht
1e2671ac30
Renamed encoder_clear_refs to encoder_state_clear_refs
2014-05-07 11:53:12 +02:00
Laurent Fasnacht
831b221cf8
Parsing seems to work now
2014-05-07 11:53:01 +02:00
Laurent Fasnacht
8b5cb62237
Debug code to generate a graph
2014-05-07 11:52:04 +02:00
Laurent Fasnacht
cee6bb0e71
Fix iteration on children
2014-05-07 11:49:14 +02:00
Laurent Fasnacht
699669ee35
fixed typo
2014-05-07 11:48:16 +02:00
Laurent Fasnacht
6c6adf18c7
Refactor encoder_state
2014-05-07 11:47:31 +02:00
Laurent Fasnacht
a23edd0339
added parent to encoder_state
2014-05-07 11:42:54 +02:00
Laurent Fasnacht
5ce518a47a
lcu_at_tile_start and lcu_at_tile_end helper functions
2014-05-07 11:42:30 +02:00
Laurent Fasnacht
c2872bd6b0
Slices and WPP in command line and encoder
2014-05-07 11:42:04 +02:00
Laurent Fasnacht
2d6f199246
reorganized encoder_state structure
2014-05-07 11:41:27 +02:00
Laurent Fasnacht
f0b076876f
Moved all the stream related stuff into substream_write_bitstream
2014-05-07 11:40:20 +02:00
Laurent Fasnacht
f30b9c2a11
Fix a buffer overflow in parse_tiles_specification
2014-05-07 11:39:45 +02:00
Ari Koivula
eaf8835bda
Add some comments and const qualifiers.
2014-05-06 19:20:38 +03:00
Ari Koivula
3910b7989a
Clear old cbf data before recursion in encode_transform_tree.
...
- Because encode_transform_tree also maintains the CBF data and assumes that
the CBFs are initially zeroed, calling the function more than once would
result in incorrect CBF data.
2014-05-06 19:03:29 +03:00
Ari Koivula
bdc16d2612
Improve cu_info coded block flag data structure a bit.
...
- It works just like the old structure except that the flags are checked with
bitmasks instead of having the flag value be propagated upwards. There isn't
really any benefit to this because the flags still have to be propagated to
parent CUs.
- Wrapped them inside a struct to make copying them easier. (Just need to copy
the struct instead of making individual copies)
2014-05-06 18:28:04 +03:00
Ari Koivula
d123b98aea
Remove unnecessary ternary expressions from usages of CABAC_BIN.
2014-05-06 17:39:25 +03:00
Ari Koivula
380401b2eb
Have CABAC_BIN accept any >0 as binary 1.
...
It used to treat odd numbers as false.
2014-05-06 17:39:10 +03:00
Marko Viitanen
bf2c2a1330
Small changes to fix compiling on VS
...
- Added threads.h to VS project
- Included Windows.h in threads.h
2014-05-05 11:18:43 +03:00
Laurent Fasnacht
f3d4e6eb09
Move bitstream write to a separate function, and add assertions about the part which should not write to bitstream.
2014-05-05 09:24:57 +02:00
Laurent Fasnacht
0fe080ad0a
bitstream_tell
2014-05-05 08:53:06 +02:00
Laurent Fasnacht
7f6f4fe9c1
Reference count for picture
2014-05-05 08:03:24 +02:00
Laurent Fasnacht
323054d5e2
naming: alloc_yuv_t -> yuv_t_alloc dealloc_yuv_t -> yuv_t_free
2014-05-02 11:45:27 +02:00
Laurent Fasnacht
7d6d1d5536
Remove pic->pred_*
2014-05-02 11:38:07 +02:00
Laurent Fasnacht
92e14cc80d
rename picture_init to picture alloc and picture_destroy to picture_free
2014-05-02 10:58:28 +02:00
Laurent Fasnacht
b76f7377b6
Always initialize tiles data structures (even with only one tile)
2014-05-02 10:00:22 +02:00
Laurent Fasnacht
f97e60a80d
Doc for encoder state
2014-05-02 10:00:12 +02:00
Laurent Fasnacht
161fe38f5e
Remove USE_TILES define
2014-05-01 13:58:13 +02:00
Laurent Fasnacht
a84fd6486d
Add function subencoder_blit_pixels
2014-05-01 11:16:11 +02:00
Laurent Fasnacht
b8b28635ff
Iterable structure for sub-encoders (more flexibility)
2014-05-01 11:16:10 +02:00
Laurent Fasnacht
212d390003
Cleanup of encoder_state_init and encoder_state_finalize
2014-05-01 11:16:10 +02:00
Laurent Fasnacht
161053f86b
Do not allow more tiles than dimension in LCU
2014-05-01 07:11:31 +02:00
Ari Koivula
42295d3cb9
Pass preprocessor defines for supported intrinsics in VS2010 explicitly.
...
- _M_IX86_FP defines whether VS should generate code using SSE or SSE2
instructions. It isn't correct to use it to check whether optional runtime
optimizations should be compiled in. It's also not defined at all in 64-bit
mode.
- So let's just keep it simple and give a list of everything that is supported
as release optimizations. It's not clear from the documentation if all of
these are really supported. It just lists a bunch of intrinsics from these
that are.
2014-04-30 17:41:15 +03:00
Ari Koivula
d1fbc6dc80
Fix a small memory leak.
...
- Malloced pointer returned by alloc_yuv_t was not being freed in
substream_encode.
- Remove use of yuv_t from encode_one_frame, as it's not used there anymore.
2014-04-30 11:15:34 +03:00
Ari Koivula
d808fe3b02
Merge branch 'strategy_selector'
2014-04-29 15:36:48 +03:00
Ari Koivula
bd7e021742
Modify strategyselector to work with VS2010.
...
- VS doesn't have snprintf.
- VS doesn't support GCC attributes.
- Add defines for __SSE__ and __SSE2__ on VS.
2014-04-29 15:29:06 +03:00
Laurent Fasnacht
bf7e755cf7
Strategies and runtime detection/choice of best algorithm
2014-04-29 11:51:41 +02:00
Ari Koivula
27b94d4b45
Address gcc -Wtype-limits errors.
...
- Fixes warnings in #19 and #16.
2014-04-29 09:15:52 +03:00
Ari Koivula
2a17e9a7aa
Merge branch 'sse_intrinsics'
2014-04-28 19:38:08 +03:00
Ari Koivula
cecf4b0b4e
Move __USE_MINGW_ANSI_STDIO to Makefile.
...
- I'm not too clear on how this should be used, but having it in the source
file after mingw stuff was included caused a warning about redefinition of
__USE_MINGW_ANSI_STDIO.
2014-04-28 19:37:37 +03:00
Ari Koivula
4e7e40054f
Move picture-sse2.c to src/inline-optimizations/.
...
- Having it in the src dir even though it's not a module on its own breaks
the scons build script. It's probably better to have these a little bit
separated from the normal code anyway.
2014-04-28 19:36:40 +03:00
Laurent Fasnacht
d66f809734
reg_sad implementation using SSE2/SSE4.1 intrinsics
2014-04-28 15:36:58 +02:00
Ari Koivula
4490e8afd6
Remove depth dimension from picture->cu_array.
...
- It isn't used for anything anymore.
- It was used in the past to hold information during search, but now that
information is held in lcu_t structs.
2014-04-28 10:18:22 +03:00
Laurent Fasnacht
76ec605b72
SAO works with tiles now
2014-04-28 06:29:21 +02:00
Yusuke Nakamura
0214d4ffcc
Makefile: Remove unneeded arguments in CCFLAGS.
...
This fixes compilation on clang.
2014-04-27 00:41:10 +09:00
Yusuke Nakamura
03da39e229
config: Use built-in getopt on non-MSVC environments.
2014-04-27 00:40:52 +09:00
Yusuke Nakamura
c5a4e7b52c
encmain: Remove a warning on MinGW.
2014-04-26 23:56:50 +09:00
Ari Koivula
145816cfb5
Move printing of CLI stuff to stderr.
...
- Printing to stdout corrupts the stream when used with "-o -".
2014-04-26 12:56:39 +03:00
Laurent Fasnacht
5e7945888a
Inter-frame prediction with tiles works.
...
Many thanks to Jean-Hugues Recolin for the insightful comments about shifts!
2014-04-25 09:28:00 +02:00
Laurent Fasnacht
7719837f17
Simple OpenMP parallelization
2014-04-25 09:11:10 +02:00
Laurent Fasnacht
4e34859e66
Fix compilation error with USE_TILES=1 and -Werror=maybe-uninitialized
2014-04-24 08:41:05 +02:00
Laurent Fasnacht
59392c4a62
Fix compilation issue with USE_TILES=0
2014-04-24 08:38:24 +02:00
Laurent Fasnacht
571a373f69
Use tile offset in search
2014-04-24 08:38:24 +02:00
Laurent Fasnacht
2e7d958af3
Picture and reference may have different sizes
2014-04-24 08:38:24 +02:00
Laurent Fasnacht
af9a1c0fbb
Use same reference images for all subencoders
2014-04-24 08:38:23 +02:00
Laurent Fasnacht
73c574fb45
P-frame: first try...
2014-04-24 08:38:22 +02:00
Laurent Fasnacht
03361dcf2c
sao try... still not working
2014-04-24 08:38:22 +02:00
Laurent Fasnacht
3db4c59478
Reconstruct full frame from tiles
2014-04-24 08:38:21 +02:00
Laurent Fasnacht
35d5d22ccc
Fix tile size not to go outside of the original picture
2014-04-24 08:38:20 +02:00
Laurent Fasnacht
985630b8b2
Add a check to fix picture_blit_pixels when width > orig_stride
2014-04-24 08:38:20 +02:00
Laurent Fasnacht
b36e154c38
Some cleanup
2014-04-24 08:38:19 +02:00
Laurent Fasnacht
01580a93c3
Encoding with tiles now more or less works with -p 1 --no-sao --no-deblock
2014-04-24 08:38:19 +02:00
Laurent Fasnacht
fd89b9af76
New functions: bitstream_append and bitstream_clear
2014-04-24 08:38:18 +02:00
Laurent Fasnacht
356c17e0de
Add missing break in bitstream_writebyte
2014-04-24 08:38:18 +02:00
Laurent Fasnacht
5fb4d9c36e
substream_encode function
2014-04-24 08:38:17 +02:00