- Because encode_transform_tree also maintains the CBF data and assumes that
the CBFs are initially zeroed, calling the function more than once would
result in incorrect CBF data.
- It works just like the old structure except that the flags are checked with
bitmasks instead of having the flag value be propagated upwards. There isn't
really any benefit to this because the flags still have to be propagated to
parent CUs.
- Wrapped them inside a struct to make copying them easier. (Just need to copy
the struct instead of making individual copies)
- Malloced pointer returned by alloc_yuv_t was not being freed in
substream_encode.
- Remove use of yuv_t from encode_one_frame, as it's not used there anymore.
- False alarm, but surprisingly difficult to convince clang of that. It
doesn't seem to understand bit shifts very well.
- Only the assert and changing LCU_WIDTH>>depth to width were necessary to
satisfy clang.
- Closes #35.
The selection criteria for transform skip vs. normal transform might need
more work. Currently both are calculated for each 4x4 block and SAD+coeff_SSE
is compared.
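
A minimal sketch of the comparison, not the encoder's actual code. It assumes
that SAD is taken between the original and the reconstructed 4x4 block and
that coeff_SSE is the sum of squared quantized coefficients. The names
block_cost and use_transform_skip are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define BLOCK_SAMPLES 16 /* one 4x4 block */

/* Assumed cost: SAD(orig, recon) + sum of squared coefficients. */
static uint32_t block_cost(const uint8_t *orig, const uint8_t *recon,
                           const int16_t *coeff)
{
  uint32_t cost = 0;
  for (int i = 0; i < BLOCK_SAMPLES; ++i) {
    cost += (uint32_t)abs((int)orig[i] - (int)recon[i]);
    cost += (uint32_t)((int32_t)coeff[i] * (int32_t)coeff[i]);
  }
  return cost;
}

/* Returns 1 when transform skip is strictly cheaper, 0 otherwise. */
static int use_transform_skip(const uint8_t *orig,
                              const uint8_t *recon_normal,
                              const int16_t *coeff_normal,
                              const uint8_t *recon_skip,
                              const int16_t *coeff_skip)
{
  return block_cost(orig, recon_skip, coeff_skip) <
         block_cost(orig, recon_normal, coeff_normal);
}
```

Ties go to the normal transform here. That is an arbitrary choice in the
sketch.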
- Intra works. There is still something wrong in inter.
- Avoid horizontal deblocking of the rightmost 4 pixels in the LCU.
This is because vertical deblocking must be done for all pixels
before horizontal, but vertical deblocking can't be done for those
pixels before the next LCU is finished.
- Add separate deblocking of the rightmost pixels of the last LCU
after the LCU edge has been deblocked.
- This is a pretty ugly hack but will have to do for now.
I didn't take into account that the reference pixel on the top-left of the
LCU gets overwritten if we just replace the top reference pixels for
current LCU with the bottom reference pixels after doing the search.
To handle this I copy the pixel that gets overwritten to the vertical
reference pixels.
- This is necessary because after we add in-loop filters to be done per LCU,
the reconstruction buffer will have the deblocked pixels. We only need the
edge-pixels for intra prediction though so we just save those.
- Right now it only copies the pixels and passes them on to search, where
the copied pixels are asserted to be the same ones we copy from
reconstruction buffer.
- New yuv_t struct added for arrays of dynamic length. We might want to change
other buffers to use it or something like it in the future.
There is a lot of duplicated code due to handling random access and trailing
pictures separately. I merged the code for these two branches so it would be
easier to modify.
According to the spec the end_of_slice_segment_flag is always coded, but in
the code it looked like it was not coded for the last LCU in the picture. This was
due to the end_of_slice_segment_flag being coded inside cabac_flush, like it
is in HM. This is a bit silly so I moved it out of cabac_flush.
The relation between coefficients positions and coefficient group positions
was a bit confusing due to the use of 16x16 diagonal coefficient mappings
also as coefficient group mappings.
- Moved all coefficient group mappings to their own const arrays and added
a new array to select the correct coefficient group mapping. This removes
special cases for 8x8 and 32x32 transform sizes.
- Removed all coefficient group mapping initialization from init_sig_last_scan.
- Removed 128x128 and 64x64 from regular coefficient group array as those
transform sizes don't exist anymore in HEVC.
- Moved NxN search to be done on the same level as other searches, as it's
really not any different from 2Nx2N.
- Produces working bitstream but reconstruction is different.
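
For the coefficient group mapping change, the idea of separate const arrays
plus a selector array can be sketched as follows. This is an illustration
only: all names are hypothetical, only the up-right diagonal scan is shown,
and the 8x8 group grid of a 32x32 transform is left out to keep it short.
Entries are raster indices (y * width + x) of 4x4 coefficient groups.

```c
#include <assert.h>
#include <stdint.h>

/* Diagonal scan order of coefficient groups inside a transform block. */
static const uint8_t cg_scan_1x1[1]  = { 0 };          /* 4x4 transform   */
static const uint8_t cg_scan_2x2[4]  = { 0, 2, 1, 3 }; /* 8x8 transform   */
static const uint8_t cg_scan_4x4[16] = {               /* 16x16 transform */
  0, 4, 1, 8, 5, 2, 12, 9, 6, 3, 13, 10, 7, 14, 11, 15
};

/* Selector indexed by log2(transform width) - 2, so no transform size
 * needs a special case at the call site. */
static const uint8_t *const cg_scan_by_log2_width[] = {
  cg_scan_1x1,
  cg_scan_2x2,
  cg_scan_4x4,
};

static const uint8_t *get_cg_scan(unsigned log2_width)
{
  assert(log2_width >= 2 && log2_width <= 4);
  return cg_scan_by_log2_width[log2_width - 2];
}
```

A caller would use get_cg_scan(3)[i] to find the i-th coefficient group of an
8x8 transform instead of branching on the transform size.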
Moves CABAC context initialization to take place before search. This fixes
an issue with RDOQ returning different coefficients for identical adjacent
frames.
- This actually probably worsens BD-rate a little for all frames except the
first one because we were using the last frame's final CABAC context for every
LCU and now we are using initialized CABAC contexts. The fix is to encode
the LCU before we start compressing the next LCU so we can update CABAC
contexts.
The coeff flags are no longer propagated upwards because encode_transform_tree
is being called from depth > 0. The fix is to initialize the whole coeff flag
array when the coeff flag is set.
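
A rough sketch of that fix, assuming the coeff flags are stored as one entry
per transform depth. The struct, field and function names are hypothetical
and the real layout may differ.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_DEPTH_SKETCH 3 /* assumed maximum transform depth */

typedef struct {
  /* Assumed layout: one coded-coefficient flag per depth. */
  int8_t coeff_top_y[MAX_DEPTH_SKETCH + 1];
} cu_flags_sketch_t;

/* Set the flag for every depth at once, so nothing depends on a caller at
 * depth 0 propagating the value upwards afterwards. */
static void set_coeff_flags(cu_flags_sketch_t *cu)
{
  memset(cu->coeff_top_y, 1, sizeof(cu->coeff_top_y));
}
```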
This does not currently affect the primary search defining the used block
sizes, only the refining second intra search. Gain 1.9% BD-rate on All Intra
600f of BQMall QP 22,27,32,37.
- Working towards issue #11.
- Widened datatypes for cfg struct members that take values from atoi to full
ints so that bounds checking can be done after parsing without overflow.
- Working towards issue #11.
- Removed intra_get_block_mode as unused.
- Removed unused parameters from functions. Many of them were remnants from
earlier data structures and earlier features of HEVC that have been removed.
- Lots of implicit conversions from larger types to smaller ones. I tried to
avoid turning all of them to explicit ones this time and opted for changing
the original data type instead. Had to do it in a few cases though to stop the
changes from propagating too widely.
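
The reason for widening the cfg members can be sketched like this. The struct
and function are hypothetical, and the 0..51 range is only used as an example
bound. With an 8-bit member, a value such as 300 would typically be truncated
at the assignment, before any check could see it. A full int keeps the parsed
value intact until the check has run.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical config struct: the member is a full int so the atoi result
 * is stored without narrowing. */
typedef struct {
  int qp;
} cfg_sketch_t;

/* Returns 1 if the parsed value is within bounds, 0 otherwise. */
static int parse_qp(cfg_sketch_t *cfg, const char *arg)
{
  cfg->qp = atoi(arg);
  return cfg->qp >= 0 && cfg->qp <= 51;
}
```

Note that atoi itself reports no errors, so non-numeric input parses as 0 in
this sketch.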
In NxN mode, chroma predictions were pushed to the buffer when chroma should
not have been used at all, because it is processed only on the first of the
four NxN luma blocks.
The search_buildReferenceBorder was an ugly hack and a place for bugs to hide
that should never have existed. Now it doesn't.
The change reduces PSNR a little, but also reduces the bitrate, when the
expected result was to have no change in either. I'm guessing there was still
some bug in the search_buildReferenceBorder, but the bug could also be in
intra_build_reference_border. Will have to do more testing to be sure, but
having one place to look at will be better than having two.
Enables the output of spec-compliant byte streams, as the specification
notes that an additional zero_byte has to be added under certain
circumstances.
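
A small sketch of what this means for the start code, assuming the Annex B
byte stream format: the three-byte start code prefix 0x000001 is preceded by
a zero_byte (0x00) for parameter set NAL units and for the first NAL unit of
an access unit. The function name and the long_start_code parameter are
hypothetical. The caller is assumed to decide when the longer form is needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Writes a start code to out and returns the number of bytes written:
 * 4 when the extra zero_byte is requested, 3 otherwise. */
static size_t write_start_code(uint8_t *out, int long_start_code)
{
  size_t n = 0;
  if (long_start_code) {
    out[n++] = 0x00; /* zero_byte */
  }
  out[n++] = 0x00;   /* start code prefix 0x000001 */
  out[n++] = 0x00;
  out[n++] = 0x01;
  return n;
}
```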