Enables search for 2NxN and Nx2N partition modes for 8x8 CUs and 2NxnU,
2NxnD, nLx2N and nRx2N partition modes for 16x16 CUs.
Changes the loop for copying reconstructed luma pixels in
kvz_inter_recon_lcu to use 4-byte chunks instead of 8-byte chunks, since
it is now possible to have 4-pixel-wide blocks.
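A minimal sketch of the idea, not the actual kvz_inter_recon_lcu loop.
The helper name copy_luma_block_4b, its parameters and the 8-bit pixel
type are assumptions made for illustration. It copies a block row by row
in 4-byte steps, so a 4-pixel-wide block is handled without touching
bytes past its right edge.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not the real kvz_inter_recon_lcu code.
 * Copies a width x height block of 8-bit luma pixels between two buffers
 * that may have different strides, moving 4 bytes at a time. */
static void copy_luma_block_4b(uint8_t *dst, int dst_stride,
                               const uint8_t *src, int src_stride,
                               int width, int height)
{
  /* Assumption: block widths are multiples of 4 pixels. */
  assert(width % 4 == 0);

  for (int y = 0; y < height; ++y) {
    for (int x = 0; x < width; x += 4) {
      /* memcpy sidesteps alignment issues with 32-bit loads and stores. */
      memcpy(&dst[y * dst_stride + x], &src[y * src_stride + x], 4);
    }
  }
}
```

With 8-byte steps the same loop would overrun a 4-pixel-wide block by
four bytes on every row, which is why the chunk size had to shrink.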
This problem resulted in an illegal bitstream with --gop=lp, because it
uses IDRs. With --gop=8, IDR pictures were not coded even when requested
with -p, which masked the problem.
This fix solves the problem with --gop=lp and also prevents references
across the intra picture in --gop=8. The intra pictures should be set
to IDR in a later fix, or an alternate method of differentiating
between IDR and non-IDR intra pictures should be devised.
The includes should make more sense now and not just happen to compile
due to headers included from other headers.
Used a modified version of IWYU. The modifications attribute int8_t and
similar types to stdint.h instead of sys/types.h, and intrinsics to
immintrin.h instead of more specific headers.
include-what-you-use 0.7 (git:b70df35)
based on clang version 3.9.0 (trunk 264728)
I was a bit unclear about exactly what happens and when regarding SAO
and deblocking when we do frame-parallel WPP parallelism, so I checked
and commented the bits that were unclear to me.
There was an off-by-one error in the dependency setting code, which
resulted in dependencies not being set and caused checksum errors, for
example with ref_neg=1 and owf=1.
Moves SAO search from function encoder_state_worker_encode_lcu in
encoderstate.c to function kvz_sao_search_lcu in sao.c. Makes functions
kvz_init_sao_info, kvz_sao_search_chroma and kvz_sao_search_luma static
since they are no longer used outside sao.c.
CU data was being copied to the wrong place in the reference frame's
cu_array, which led to uninitialized data being used as a starting
point for motion vector search.
Fixes #99.
Add a dependency on the reference frame instead of the previous frame,
in order to allow more frames to be encoded in parallel when temporal
stepping is >1 in LP-gop (such as --gop=lp-g8d4r1t2).
Prevents a conflict between config.h and src/config.h so that the
config.h generated by configure is included in global.h. Fixes problems
with large input files on 32-bit systems.
Add module information to all header files.
Update the documentation of all header files to briefly say what they
are, and to use the javadoc format so the brief actually gets included
in the doxygen documentation.
Remove \file from implementation files, in order to not repeat the info
from the header files.
Add files under strategies and tools to Doxygen and update the Doxygen
settings to be just plain better.
Make README the main page of the Doxygen documentation.
Remove the need to count the coefficients by populating the significant
coefficient group map first and finding the last coefficient from the
last group afterward. The speedup is about 2% on ultrafast.
The previous version of this patch was reverted due to a bug, which
has now been fixed.
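A rough sketch of the two-pass approach described above, under
simplifying assumptions: coefficients are already laid out in scan
order, each coefficient group is 16 consecutive entries, and the names
find_last_sig_coeff and CG_SIZE are made up for this example. The real
encoder works with 2D scan tables, which are left out here.

```c
#include <assert.h>
#include <stdint.h>

#define CG_SIZE 16  /* coefficients per 4x4 coefficient group */

/* Hypothetical sketch. `coeff` holds `num_coeffs` coefficients in scan
 * order and `sig_group_flags` has room for num_coeffs / CG_SIZE entries.
 * Returns the scan index of the last nonzero coefficient, or -1 if the
 * block is all zero. */
static int find_last_sig_coeff(const int16_t *coeff, int num_coeffs,
                               uint8_t *sig_group_flags)
{
  assert(num_coeffs % CG_SIZE == 0);
  int num_groups = num_coeffs / CG_SIZE;
  int last_group = -1;

  /* Pass 1: populate the significant coefficient group map and remember
   * the last group that contains a nonzero coefficient. */
  for (int g = 0; g < num_groups; ++g) {
    sig_group_flags[g] = 0;
    for (int i = 0; i < CG_SIZE; ++i) {
      if (coeff[g * CG_SIZE + i] != 0) {
        sig_group_flags[g] = 1;
        last_group = g;
        break;
      }
    }
  }

  if (last_group < 0) return -1;

  /* Pass 2: search backward inside the last significant group only. */
  for (int i = CG_SIZE - 1; i >= 0; --i) {
    if (coeff[last_group * CG_SIZE + i] != 0) {
      return last_group * CG_SIZE + i;
    }
  }
  return -1;  /* not reached: the group is known to be significant */
}
```

No separate counter of nonzero coefficients is needed, since the group
map already tells where the search for the last coefficient can start.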
This reverts commit 25462124f8.
That commit broke the bitstream. If it's not good enough to push on Friday
night, it's probably not good enough on Monday morning either.
Remove the need to count the coefficients by populating the significant
coefficient group map first and finding the last coefficient from the
last group afterward.
Changes main function to compute frame PSNR by calling
kvz_videoframe_compute_psnr directly with the source and reconstructed
pictures returned from encoder_encode.
The code for building the reference picture lists was duplicated in
functions encoder_state_ref_sort and print_frame_info. This commit moves
it to a new function kvz_encoder_get_ref_lists. Also makes
encoder_ref_insertion_sort static since it is not used outside the
encoderstate module any more.
This bug caused a single tile's worth of lcu_info_t structs to be copied
unnecessarily for every LCU in the frame. This obviously caused huge
memory bandwidth issues when coding large frames without tiles. The
effect was minimized somewhat with a large number of tiles, because
only the current tile was copied.
From context it is clear that this piece of code was supposed to copy
a single tile or frame once the frame was done. Because it was placed
in a function that is called for every LCU, it copied the data for the
current LCU but also a lot of data that did not need copying.
The fix is to copy only the current LCU instead of the whole tile.
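The difference can be sketched as follows. The struct and function names
are placeholders rather than the real lcu_info_t code, and the sketch
assumes that source and destination arrays share the same LCU indexing.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for lcu_info_t; the real struct differs. */
typedef struct {
  int sao_type;
  int offsets[4];
} sketch_lcu_info_t;

/* Meant to be called once per finished LCU. It copies a single entry, so
 * the total work over a frame grows linearly with the number of LCUs.
 * Copying all lcu_count entries on every call would grow quadratically. */
static void sketch_copy_lcu_info(sketch_lcu_info_t *dst,
                                 const sketch_lcu_info_t *src,
                                 int lcu_count, int lcu_index)
{
  assert(lcu_index >= 0 && lcu_index < lcu_count);
  memcpy(&dst[lcu_index], &src[lcu_index], sizeof(dst[0]));
}
```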
A call to kvz_threadqueue_waitfor caused the tqj_bitstream_written field
of the previous encoder state to become a dangling pointer, subsequently
causing an assertion to fail. This would only occur when the encoder
state used for a new frame was not the last finished one.
Fixed by setting tqj_bitstream_written to NULL after the job is done and
removing unnecessary calls to kvz_threadqueue_waitfor.
- Removes all bitstream types.
- Changes encoder_encode to return the encoded data as list of chunks.
- Moves writing of the encoded data to the main function.
- Replaces read_one_frame by encoder_feed_frame.
- Adds field "prepared" to encoderstate_t to indicate that
encoder_next_frame has been called.
- Input frames are read in the main function and passed to
encoder_encode.
Adds function image_copy_ref to image module for getting a new reference
to an image. It can be used instead of image_make_subimage when the
sizes of the original and the subimage are same.
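A simplified sketch of what taking a new reference could look like,
assuming images are reference counted. The sketch_image_t type and the
sketch_ functions are invented for this example and are not the real
image module. The counter is a plain int here, so this version is not
thread safe.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified image type. */
typedef struct {
  int width;
  int height;
  int refcount;
  unsigned char *data;
} sketch_image_t;

/* Allocates a zeroed image that starts with one reference. */
static sketch_image_t *sketch_image_alloc(int width, int height)
{
  sketch_image_t *im = malloc(sizeof(*im));
  if (!im) return NULL;
  im->data = calloc((size_t)width * (size_t)height, 1);
  if (!im->data) {
    free(im);
    return NULL;
  }
  im->width = width;
  im->height = height;
  im->refcount = 1;
  return im;
}

/* Returns the same image with one more reference; no pixels are copied. */
static sketch_image_t *sketch_image_copy_ref(sketch_image_t *im)
{
  assert(im->refcount > 0);
  im->refcount += 1;
  return im;
}

/* Drops one reference and frees the image when none remain.
 * Returns the number of references left. */
static int sketch_image_free(sketch_image_t *im)
{
  assert(im->refcount > 0);
  int remaining = --im->refcount;
  if (remaining == 0) {
    free(im->data);
    free(im);
  }
  return remaining;
}
```

Every holder of a reference calls the free function exactly once, and
the pixel data goes away only with the last call.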
- Use the existing bitstream_t type to give access to the bitstream.
We can extend it later to make it a linked list like I was planning
to do with the payload type.
- The main encoder now also stores the bitstream in memory.
- Move image_t and pixel_t to the kvazaar.h API.
- Try to arrange things such that image_t can be used as input and
output for encoding.
Conflicts:
src/encmain.c
- CLI stuff is moved to either the cli module or the main function.
- OWF stuff is made more explicit by counting the frames instead of
communicating through encoder_state_t.stats_done.
- Every wavefront row was being set to done when the first wavefront
row got done.
- Looks like I didn't understand how the data structure worked when I
"cleaned this up", and it didn't get caught in tests because it
needs OWF to be on to affect anything.
- Attribute state->global->slicetype was used before being initialized.
- The reference frame lists should be updated based on the current
frame, not on the previous frame (or uninitialized data).
- The main state's cu_array can be accessed through state->global->ref,
which allows the use of cu_info data from reference frames.
- This was already used by giving the previous frame's motion vector to
the next frame as a starting point candidate, but that functionality was
broken at some point because the data wasn't being moved from the child
tiles' cu_array to the main cu_array.
- An alternative would be to access the child tiles' arrays directly, but
currently there isn't a mechanism to preserve those arrays for reference
frames.
- Remove the WPP row reconstruction dependency on the row above the
current one in the previous frame. It's obviously unnecessary.
- Remove the WPP row reconstruction dependency on the current row in the
previous frame, unless the current row is the last row.
- It's so widely used that there isn't really a need to emphasize that
it's the encoder's state. Also, it isn't really the encoder's state,
but an encoding job's state.
- Everyone who has contributed code to the project has been asked to license
their contributions under LGPL and they have agreed.
- COPYING file changed to say LGPLv2.1 instead of GPLv2.
- GPL changed to LGPL in the header of every single file that has a header,
and a header added to the few that were missing one.
- Also.. Happy new year!
- There is some stuff from sign hiding left intermingled with rdoq code,
but I don't want to change the code too much before testing that I
didn't break anything.