transform skip vs. normal transform selection criteria might need more work, currently both are calculated for each 4x4 block and SAD+coeff_SSE is compared.
- Intra works. There is still something wrong in inter.
- Avoid horizontal deblocking of the rightmost 4 pixels in the LCU.
This is because vertical deblocking must be done for all pixels
before horizontal, but vertical deblocking can't be done for those
pixels before the next LCU is finished.
- Add separate deblocking of the rightmost pixels of the last LCU
after the LCU edge has been deblocked.
- This is a pretty ugly hack but will have to do for now.
I didn't take into account that the reference pixel on the top-left of the
LCU gets over written if we just replace the top reference pixels for
current LCU with the bottom reference pixels after doing the search.
To handle this I copy the pixel that gets overwritten to the vertical
reference pixels.
- This is necessary because after we add in-loop filters to be done per LCU,
the reconstruction buffer will have the deblocked pixels. We only need the
edge-pixels for intra prediction though so we just save those.
- Right now it only copies the pixels and passes them on to search, where
the copied pixels are asserted to be the same ones we copy from
reconstruction buffer.
- New yuv_t struct added for arrays of dynamic length. We might want to change
other buffers to use it or something like it in the future.
There is a lot of duplicated code due to handling random access and trailing
pictures separately. I merged the code for these two branches so it would be
easier to modify.
According to spec the end_of_slice_segment_flag is always coded, but in the
code it looked like it was not coded for the last LCU in picture. This was
due to the end_of_slice_segment_flag being coded inside cabac_flush, like it
is in HM. This is a bit silly so I moved it out of cabac_flush.
The relation between coefficients positions and coefficient group positions
was a big confusing due to the use of 16x16 diagonal coefficient mappings
also as coefficient group mappings.
- Moved all coefficient group mappings to their own const arrays and added
a new array the select the correct coefficient group mapping. This removes
special cases for 8x8 and 32x32 transform sizes.
- Removed all coefficient group mapping initialization from init_sig_last_scan.
- Removed 128x128 and 64x64 from regular coefficient group array as those
transform sizes don't exist anymore in HEVC.
- Moved NxN search to be done on the same level as other searches, as it's
really not any different from 2Nx2N.
- Produces working bitstream but reconstruction is different.
Moves CABAC context initialization to take place before search. This fixes
an issue with RDOQ returning different coefficients for identical adjacent
frames.
- This actually probably worsens BD-rate a little for all frames except the
first one because we were using last frames final CABAC context for every
LCU and now we are using initialized CABAC contexts. The fix is to encode
the LCU before we start compressing the next LCU so we can update CABAC
contexts.
Moves CABAC context initialization to take place before search. This fixes
an issue with RDOQ returning different coefficients for identical adjacent
frames.
- This actually probably worsens BD-rate a little for all frames except the
first one because we were using last frames final CABAC context for every
LCU and now we are using initialized CABAC contexts. The fix is to encode
the LCU before we start compressing the next LCU so we can update CABAC
contexts.
The coeff flags are no longer propagated upwards because encode_transform_tree
is being called from depth > 0. The fix is to initialize the whole coeff flag
array when the coeff flag is set.
This does not currently affect the search primary search defining the used block sizes, only the refining second intra search. Gain 1.9% BD-rate on All Intra 600f of BQMall QP 22,27,32,37.
This does not currently affect the search primary search defining the used block sizes, only the refining second intra search. Gain 1.9% BD-rate on All Intra 600f of BQMall QP 22,27,32,37.
- Working towards issue #11.
- Widened datatypes for cfg struct members that take values from atoi to full
ints so that bounds checking can be done after parsing without overflow.
- Working towards issue #11.
- Removed intra_get_block_mode as unused.
- Removed unused parameters from functions. Many of them were remnants from
earlier data structures and earlier features of HEVC that have been removed.
- Lots of implicit conversions from larger types to smaller ones. I tried to
avoid turning all of them to explicit ones this time and opted for changing
the original data type instead. Had to do it in few cases though to stop the
changes from propagating too widely.
In NxN mode, chroma predictions were pushed to buffer when chroma should not have been used at all. (Because it is processed only on first of the four NxN luma blocks)
The search_buildReferenceBorder was an ugly hack and a place for bugs to hide
that should never have existed. Now it doesn't.
The change reduces PSNR a little, but also reduces the bitrate, when the
expected result was to have no change in either. I'm guessing there was still
some bug in the search_buildReferenceBorder, but the bug could also be in
intra_build_reference_border. Will have to do more testing to be sure, but
having one place to look at will be better than having two.
Enables the output of spec-compliant byte streams, as the specification
notes that an additional zero_byte has to be added under certain
circuimstances.
Conflicts:
src/encoder.c
- Chroma RDOQ changes conflicted because I had moved the chroma
quantization/dequantization to it's own function.
- Merged to master because I want my code to show up in github. =)
All the old stuff still works, even though NxN doesn't work, so there
is no reason not to merge anyway.
- Re-enable intra search based on reconstructed image.
- This didn't have as much of an effect as I thought it would.
- Re-enable SAO and deblocking.
- Disable NxN searching. (4x4 luma coding is still broken)
Still doesn't work. I have no idea what the problem is. Probably somehow related to the coefficient coding, since the bitstream seems to work, the prediction is correct and the error is not very severe.
- Change scan order selection to be more verbose and based on the correct mode for 4x4. Didn't affect the problem with 4x4 luma in any way although it should have.
- Re-enable residual coding as everything seems to work now besides 4x4 luma.
- Set part-size for Inter.
- Change to Intra Only mode for testing.
- Many small changes here and there. Should have been separate commits probably, but too late.
- Disable SAO and deblocking to be able to see problems with reconstruction better.
- Adjust predictor list to take modes from PUs in addition to 2Nx2N CUs.
- Change intra_get_dir_luma_predictor to take PU index instead of CU index.
- Comment prediction encoding now that I've had to look it up.
- Clean up code and comment.
- Change terminology to match H.265 specification where possible.
- Move transform splitting for depth==0 out of the coding part. It's not
possible to do it here anyway because intra reconstruction is different
if the transform is split.
- Add checking for transform hierarchy depth when coding split flag.
- Fixes bug with cu_data.tr_depth being set. The CU struct was being reused
for inter coded CUs, which did not initialize the tr_depth.
- Add room to cu_data.coeff_top_yuv arrays for the 4x4 PUs data. Will probably have to do the same to other coeff flags. The flags could also probably be combined as they are a bit redundant.
- Doesn't work yet so it's disabled.
- Change encode_transform_coeff to accept PU (Prediction Unit) coordinates
instead of CU coordinates because CUs are 8x8.
Should work exactly the same, but with the prediction mode selection done
separately from the binarization it's easier to see that the implementation
is correct.
One I frame and 99 P frames encoded with SAO off and on.
Processed 100 frames, 6693224 bits AVG PSNR: 30.7248 37.8978 37.8287
Processed 100 frames, 6295072 bits AVG PSNR: 32.2511 38.9373 38.9818
- Move sao-stuff not directly related to encoding to sao-module.
- Calculate sao for all LCUs before encoding any of them. This is in
preparation to doing the reconstruction line at a time instead of
LCU at a time.
SAO needs to be coded before LCU data has been searched. Searching
has already been moved to happen before encoding in the master branch.
Conflicts:
src/encoder.c
src/picture.c
src/picture.h