Ari Koivula
7ecf78bb70
Use sqrt lambda cost for searches not using SSD.
...
- Add encoder_state->global->cur_lambda_cost_sqrt.
- Use sqrt lambda for inter search and rough intra search.
- The effect on inter is around 10-20% bdrate. The effect on intra is smaller
and non-existent when --rd=2 is enabled, as the intra search refinement was
already done with SSD and correct lambda.
2014-06-26 13:56:38 +03:00
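A minimal sketch of why the square root matters, assuming the usual rate-distortion cost of distortion plus a weighted bit count. SSD grows with the square of the pixel error while SAD and SATD grow linearly, so a rate weight tuned for SSD is too heavy for SAD-based searches. The function and parameter names below are hypothetical; only `cur_lambda_cost_sqrt` comes from the commit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration, not kvazaar's actual functions. */

/* Cost for searches that measure distortion as SSD (squared error). */
double rd_cost_ssd(uint32_t ssd, uint32_t bits, double lambda)
{
  assert(lambda >= 0.0);
  return (double)ssd + lambda * (double)bits;
}

/* Cost for searches that measure distortion as SAD or SATD (linear error).
 * lambda_sqrt is expected to hold sqrt(lambda), which is the role the
 * commit gives to cur_lambda_cost_sqrt. */
double rd_cost_sad(uint32_t sad, uint32_t bits, double lambda_sqrt)
{
  assert(lambda_sqrt >= 0.0);
  return (double)sad + lambda_sqrt * (double)bits;
}
```

With lambda = 16, ten bits cost 160 units against an SSD and 40 units against a SAD, which keeps the rate term in proportion to the smaller linear distortion values.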
Laurent Fasnacht
1112dca933
Fix compilation issue with assertions disabled
2014-06-26 07:31:37 +02:00
Laurent Fasnacht
9ab9defe67
Bitstream length per frame works again
2014-06-19 10:24:03 +02:00
Laurent Fasnacht
45faadb2c9
Fix bug where the wrong number of frames could be encoded (if one frame takes longer than the others)
2014-06-19 10:24:02 +02:00
Ari Koivula
d5a77be4b8
Fix avx detection for gcc.
...
- GCC doesn't support _xgetbv intrinsic so we have to use inline assembler.
2014-06-18 11:50:17 +03:00
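A sketch of what AVX detection with inline assembler can look like on GCC, assuming an x86 target. It is not the code from this commit. The function names are hypothetical; `__get_cpuid` is the helper from GCC's `<cpuid.h>`. The CPU must report both OSXSAVE and AVX, and the OS must have enabled the SSE and AVX register state in XCR0 (bits 1 and 2).

```c
#include <stdint.h>

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#include <cpuid.h>

/* Read extended control register 0 with the xgetbv instruction.
 * Only safe to call after CPUID has reported OSXSAVE. */
static uint64_t read_xcr0(void)
{
  uint32_t eax, edx;
  __asm__ volatile ("xgetbv" : "=a"(eax), "=d"(edx) : "c"(0));
  return ((uint64_t)edx << 32) | eax;
}

/* Returns 1 if both the CPU and the OS support AVX, 0 otherwise. */
int os_supports_avx(void)
{
  unsigned int eax, ebx, ecx, edx;
  const unsigned int osxsave_bit = 1u << 27;
  const unsigned int avx_bit = 1u << 28;

  if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx)) return 0;
  if ((ecx & osxsave_bit) == 0 || (ecx & avx_bit) == 0) return 0;

  /* Bits 1 and 2 of XCR0: XMM and YMM state are saved by the OS. */
  return (read_xcr0() & 0x6) == 0x6;
}
#else
/* Non-x86 or non-GCC build: report no AVX. */
int os_supports_avx(void)
{
  return 0;
}
#endif
```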
Ari Lemmetti
bdef5384ef
Added AVX strategy
2014-06-17 16:52:24 +03:00
Ari Koivula
d7abe6a7c2
Address compilation warning.
...
strategyselector.c:170:10: error: ‘__get_cpuid’ is static but used in inline function ‘get_cpuid’ which is not static [-Werror]
return __get_cpuid(level, eax, ebx, ecx, edx);
2014-06-17 16:26:55 +03:00
Ari Koivula
60ecc6baae
Remove unused stuff.
2014-06-17 16:20:01 +03:00
Ari Koivula
7532b789f8
Add -std=gnu99 for gcc.
...
- std=c99 doesn't work because then struct timespec won't be defined.
2014-06-17 16:15:39 +03:00
Ari Koivula
94bc457b6c
Add option to disable fast intra search.
2014-06-17 15:32:05 +03:00
Ari Koivula
e27fc875c0
Clean up intra search.
2014-06-17 15:09:12 +03:00
Ari Koivula
e4d70ac1ab
Use more starting points for smaller blocks in intra search.
2014-06-17 13:28:27 +03:00
Ari Koivula
9911c7553b
Avoid unnecessary intra dir searching.
2014-06-17 13:11:35 +03:00
Ari Koivula
bd16a55b9b
Always check DC and planar intra modes.
...
- At least one of them is always in predicted modes, but to make sure they
are both included add them explicitly.
2014-06-17 12:51:15 +03:00
Ari Koivula
70740da123
Add smarter rough intra search.
...
- Directional intra mode search is done using halving search from the best
known mode. Starting modes are vertical, horizontal and the 3 diagonal
modes.
Conflicts:
src/search.c
2014-06-17 12:33:10 +03:00
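The halving search described above can be sketched as follows. This is an illustration under assumptions, not the code in src/search.c: the callback type, function names, and the example cost function are hypothetical. The mode numbers follow HEVC, where angular modes run from 2 to 34, 10 is horizontal, 26 is vertical, and 2, 18 and 34 are the diagonals.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback: returns the rough cost (e.g. SAD or SATD) of a mode. */
typedef uint32_t (*mode_cost_fn)(int mode, void *ctx);

/* Evaluate the five starting modes, then probe the best mode's neighbours
 * at distances 4, 2 and 1, moving to a neighbour whenever it is cheaper. */
int rough_dir_search(mode_cost_fn cost, void *ctx)
{
  static const int start_modes[5] = { 2, 10, 18, 26, 34 };
  assert(cost != NULL);

  int best_mode = start_modes[0];
  uint32_t best_cost = cost(best_mode, ctx);
  for (int i = 1; i < 5; ++i) {
    uint32_t c = cost(start_modes[i], ctx);
    if (c < best_cost) {
      best_cost = c;
      best_mode = start_modes[i];
    }
  }

  for (int offset = 4; offset >= 1; offset >>= 1) {
    const int center = best_mode;  /* probe both sides of the same centre */
    for (int sign = -1; sign <= 1; sign += 2) {
      int mode = center + sign * offset;
      if (mode < 2 || mode > 34) continue;  /* outside the angular range */
      uint32_t c = cost(mode, ctx);
      if (c < best_cost) {
        best_cost = c;
        best_mode = mode;
      }
    }
  }
  return best_mode;
}

/* Example cost for experimentation: distance from a target mode in ctx. */
uint32_t distance_cost(int mode, void *ctx)
{
  int target = *(const int *)ctx;
  return (uint32_t)(mode > target ? mode - target : target - mode);
}
```

This evaluates at most 11 of the 33 directional modes. It only finds the true optimum when the cost is well behaved around the best starting mode, which is why it is a rough search.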
Marko Viitanen
0e2fe9e7ff
Changed intra search to skip some modes speeding it up
2014-06-17 12:32:29 +03:00
Marko Viitanen
a1c3cfe944
Moved intra mode cost calculation to a function
...
Conflicts:
src/search.c
2014-06-17 12:32:29 +03:00
Marko Viitanen
eb7d46f9ef
Modify CU split cost.
2014-06-17 12:30:32 +03:00
Marko Viitanen
bfa37b876b
Conformance fix: set sps_max_dec_pic_buffering to correct value
2014-06-17 12:30:32 +03:00
Ari Koivula
b3c15b8f94
Merge branch 'owf'
2014-06-16 16:07:41 +03:00
Laurent Fasnacht
91de92134f
Constrain the search not to go under the LCU below if OWF is enabled
2014-06-16 14:27:56 +02:00
Laurent Fasnacht
ef9c2258e9
Fix frame counter and stats
2014-06-16 13:21:52 +02:00
Ari Koivula
153b1ee41f
Merge branch 'intra-sad-strategies'
2014-06-16 12:34:37 +03:00
Laurent Fasnacht
84d34c2655
Fix compilation on non-intel
2014-06-16 11:24:02 +02:00
Ari Koivula
3f00592b96
Separate strategyselector debug prints from _DEBUG.
...
- I only want to see the strategy stuff.
2014-06-16 12:15:19 +03:00
Ari Koivula
1c97a10a6d
Move intra SAD and SATD functions under strategies.
2014-06-16 12:13:41 +03:00
Laurent Fasnacht
4b4702819b
Also print encoding FPS
2014-06-16 11:10:11 +02:00
Laurent Fasnacht
2347574a8e
Fix problems revealed by valgrind
2014-06-16 11:10:09 +02:00
Laurent Fasnacht
28c3f22ba1
Fix possible freeze
2014-06-16 11:03:48 +02:00
Laurent Fasnacht
a96c742ad4
Fix depends for wpp+owf
2014-06-16 11:03:47 +02:00
Laurent Fasnacht
f99e41d41f
Improved CPU time statistics
2014-06-16 11:03:46 +02:00
Laurent Fasnacht
8a33c0a688
Fix recon job for wfrow
2014-06-16 10:55:01 +02:00
Laurent Fasnacht
bf6024734a
Fix statistics with OWF
2014-06-16 10:55:00 +02:00
Laurent Fasnacht
0522a3d8e5
--owf option
2014-06-16 10:55:00 +02:00
Laurent Fasnacht
47d1ded7b0
Dependencies between frames
2014-06-16 10:54:59 +02:00
Laurent Fasnacht
003d3c504c
image_list_copy_contents
2014-06-16 10:54:58 +02:00
Laurent Fasnacht
f4187dd10c
cu_array data structure
2014-06-16 10:54:57 +02:00
Laurent Fasnacht
3be3fa8d6e
Use different processing order depending if we have OWF or not
2014-06-16 10:54:56 +02:00
Laurent Fasnacht
c32943f78b
OWF
2014-06-16 10:54:56 +02:00
Laurent Fasnacht
490dd15f3d
Remove flush between frames
2014-06-16 10:51:33 +02:00
Laurent Fasnacht
fddcbabe28
bitstream writing is now a "normal" job in a thread
2014-06-16 10:51:32 +02:00
Laurent Fasnacht
ff7143cc24
Assign thread_queue_jobs and move image_free to a more suitable place
2014-06-16 10:51:32 +02:00
Ari Koivula
87ca828a63
Correct intra sad function labels.
...
- These haven't been 16 bit for a long time.
2014-06-16 10:45:10 +03:00
Ari Koivula
fcce6ae823
Fix printing of AVX2 capability.
2014-06-14 01:24:19 +03:00
Ari Koivula
a49ba2633a
Add OS and CPU detection for AVX2 and AVX.
2014-06-13 16:57:53 +03:00
Ari Koivula
1de102be61
Move strategies to their own compilation units.
...
- Enforces a little bit more hierarchy. Compilation units are in strategies
and whatever inline includes they have are in a folder with the same name
as the strategy.
2014-06-13 15:30:23 +03:00
Ari Koivula
aa3549a717
Change SLEEP(0) to SLEEP(10) on Windows.
...
- This is a workaround for a performance problem on Windows where main thread
is busy looping.
2014-06-13 12:01:03 +03:00
Laurent Fasnacht
4acadccf89
Only signal the required number of threads
2014-06-13 08:34:59 +02:00
Laurent Fasnacht
70ce7cec20
Remove unnecessary locks by adding threadqueue->queue_running counter
2014-06-13 08:34:58 +02:00
Laurent Fasnacht
7ef34ff5a1
Ability to dump mutex_lock, mutex_unlock and cond_wait timing, if compiled with -D_PTHREAD_DUMP
2014-06-13 08:32:14 +02:00
Laurent Fasnacht
68ad323e84
Tentative fix for race condition
2014-06-12 14:01:33 +02:00
Laurent Fasnacht
b194e19708
Tentative fix for deadlock
2014-06-12 12:57:14 +02:00
Laurent Fasnacht
b765eca153
Remove unneeded encoder_state_blit_pixels
2014-06-12 11:47:46 +02:00
Laurent Fasnacht
da07b8b35d
No-copy works (SAO and deblocking enabled)
2014-06-12 11:47:38 +02:00
Laurent Fasnacht
2cc700fab8
No-copy works with --no-sao (deblocking enabled)
2014-06-12 11:47:31 +02:00
Laurent Fasnacht
6b408b5904
No-copy works with --no-sao --no-deblock
2014-06-12 11:47:30 +02:00
Laurent Fasnacht
0dbfa62698
Replace copy of images made for tiles by sub-images (no copy)
...
- replace width by stride where required in the source code
2014-06-12 11:47:30 +02:00
Laurent Fasnacht
b1347efef5
Add checkpoint in sao_reconstruct
2014-06-12 11:47:29 +02:00
Laurent Fasnacht
ae4dc4eb44
Fix uninitialized sao_info structure members, which were creating false positives when checkpointing SAO
2014-06-12 11:47:29 +02:00
Laurent Fasnacht
f371bdafc3
sao_info checkpoints
2014-06-12 11:47:28 +02:00
Laurent Fasnacht
b7fe81c55c
Checkpoint in pixels_blit, and avoid undefined behaviour when source and destination are the same.
...
Seems a reasonable point to observe when refactoring, since it's called on most image data.
2014-06-12 11:47:28 +02:00
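A sketch of the kind of guard this implies, assuming the blit copies rows with `memcpy`, whose behaviour is undefined when source and destination overlap. The function name and signature are hypothetical, not the actual pixels_blit.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copy a width x height block of pixels between two buffers that may have
 * different strides (distance in pixels between the starts of two rows). */
void blit_pixels_sketch(const uint8_t *src, uint8_t *dst,
                        unsigned width, unsigned height,
                        unsigned src_stride, unsigned dst_stride)
{
  if (src == dst) {
    /* Same buffer: memcpy would be undefined behaviour, and with equal
     * strides the copy is a no-op anyway. */
    assert(src_stride == dst_stride);
    return;
  }

  assert(width <= src_stride && width <= dst_stride);
  for (unsigned y = 0; y < height; ++y) {
    memcpy(&dst[y * dst_stride], &src[y * src_stride], width);
  }
}
```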
Laurent Fasnacht
da8559fa34
Fix bug in CHECKPOINTS_FINALIZE() when checkpoints are disabled
2014-06-12 11:47:27 +02:00
Laurent Fasnacht
14df6de0d0
Checkpoint on frame checksum
2014-06-12 11:47:00 +02:00
Laurent Fasnacht
22df7cf98b
Use an assert instead of a dumb assignment
2014-06-12 11:47:00 +02:00
Laurent Fasnacht
cf123e317f
Code to checkpoint cu_info and lcu_t
2014-06-12 11:47:00 +02:00
Ari Koivula
ea830d3dd2
Add warning for VLAs in Makefile.
2014-06-12 09:57:08 +03:00
Ari Koivula
443f2f00aa
Fix compilation for VS.
...
- VS2013 does not support variable length arrays.
2014-06-11 17:51:55 +03:00
Laurent Fasnacht
87ed365053
typo fix
2014-06-11 10:29:05 +02:00
Laurent Fasnacht
6ca30367f9
Fix POC bug
2014-06-11 10:29:05 +02:00
Laurent Fasnacht
8437229885
Fix handling of cu_arrays
2014-06-11 10:29:04 +02:00
Laurent Fasnacht
e1d9cb015a
Basic checkpointing system
2014-06-11 10:29:03 +02:00
Laurent Fasnacht
27a49d287d
Big refactor to use videoframe, image_list, and image instead of picture*
2014-06-10 09:19:06 +02:00
Laurent Fasnacht
530faf3951
Move video frame related stuff to videoframe
2014-06-05 14:08:31 +02:00
Laurent Fasnacht
0fac77f9eb
Image now in separate module
2014-06-05 14:04:12 +02:00
Laurent Fasnacht
2456c65822
Replace accesses to picture->cu_array with picture_get_cu and picture_get_cu_const
2014-06-05 10:41:58 +02:00
Laurent Fasnacht
821b71910b
Move picture_list to its own module
2014-06-05 09:49:24 +02:00
Laurent Fasnacht
7372f9244d
Basic infrastructure for OWF
2014-06-05 09:09:25 +02:00
Laurent Fasnacht
16e3a58359
Performance improvement
2014-06-05 06:57:51 +02:00
Laurent Fasnacht
bad6d45e5f
Performance improvement
2014-06-05 06:57:51 +02:00
Laurent Fasnacht
aad2089fcf
Use -ftree-vectorize
2014-06-05 06:57:50 +02:00
Laurent Fasnacht
ea04bcd6a4
AltiVec support for SAD
2014-06-05 06:57:34 +02:00
Ari Koivula
3a7147baf4
Merge branch 't-20140602'
2014-06-04 18:11:15 +03:00
Ari Koivula
31b1bbc215
Address implicit declaration warnings.
2014-06-04 18:00:50 +03:00
Ari Koivula
4f5c87fc5e
Remove duplicate function definition.
2014-06-04 17:56:05 +03:00
Ari Koivula
cb7d7f9e15
Update Makefile.
2014-06-04 17:52:28 +03:00
Ari Koivula
bb47534b88
Make encoder_state .c files their own compilation units.
...
- It's good that this module has been chopped to smaller pieces, but let's
avoid including .c files unless we really have to. These make pretty good
submodules on their own so just make them their own compilation units.
- Move some stuff around to avoid having to forward declare them
in encoderstate.c.
2014-06-04 17:45:18 +03:00
Ari Lemmetti
9e649a8f38
Updated usage message
2014-06-04 15:23:27 +03:00
Laurent Fasnacht
b8acdc784a
Fix compilation of encoder.c with -D_DEBUG
2014-06-03 15:02:14 +02:00
Laurent Fasnacht
961da05235
Split encoderstate.c in multiple files
2014-06-03 14:47:49 +02:00
Laurent Fasnacht
3d07f8cc84
encoderstate refactor
2014-06-03 14:25:16 +02:00
Laurent Fasnacht
2e821b79a9
encoder_state is now in encoder_state.[ch]
2014-06-03 13:51:30 +02:00
Laurent Fasnacht
9bdecbe071
Better thread scheduling
2014-06-03 11:39:16 +02:00
Laurent Fasnacht
0811dbcfbe
Remove unneeded cond_broadcast. Limit contention
2014-06-03 09:45:17 +02:00
Laurent Fasnacht
5ee1319c08
Altivec detection
2014-06-03 07:55:39 +02:00
Laurent Fasnacht
58ad3b4d26
Log more performance data, also plot how many threads are running
2014-06-03 07:42:22 +02:00
Laurent Fasnacht
5ed69b063b
Strategy selector for array_checksum, basic implementation using precomputed 256*256 block with larger accesses than byte
2014-06-03 07:42:22 +02:00
Ari Koivula
a483e8cb0f
Move cpuid stuff away from compiler namespace.
...
Conflicts:
src/strategyselector.c
2014-05-30 10:08:14 +03:00
Marko Viitanen
6a72f87028
Merge commit '792a5a5dd1946a327f22b2daba05c6645dfa8037'
2014-05-30 08:47:01 +03:00
Marko Viitanen
792a5a5dd1
Small fix for __get_cpuid()
2014-05-30 08:37:03 +03:00
Laurent Fasnacht
642564b6fb
Remove unused variable
2014-05-28 15:04:45 +02:00
Laurent Fasnacht
4f86919d75
Get rid of assembly cpuid for x86, compilation works for powerpc
2014-05-28 15:04:00 +02:00
Ari Koivula
e585da37e5
Give correct transform depth to RDOQ.
...
Conflicts:
src/search.c
2014-05-28 15:47:49 +03:00
Ari Koivula
dceb3da9b8
Fix bug in search relating to transform with no non-zero coefficients.
...
- Because cost was calculated even though there were no coefficients, these
very good modes were less likely to be selected.
- Added assert to encode_coeff_nxn to avoid these problems in the future.
2014-05-28 15:22:18 +03:00
Ari Koivula
ddc02cc09e
Avoid regenerating reference pixels for every rdo mode.
2014-05-22 13:18:28 +03:00
Ari Koivula
dbe13d0cba
Separate sad intra search from rdo search.
2014-05-22 12:47:45 +03:00
Ari Koivula
19ce21e07c
Split final cost to luma and chroma functions.
2014-05-22 09:45:00 +03:00
Ari Koivula
a6962e2974
Separate intra transform coding to luma and chroma functions.
2014-05-22 09:40:34 +03:00
Laurent Fasnacht
3a30a886fc
FREE_POINTER of job->rdepends was at the wrong place (memory leak)
2014-05-22 07:15:18 +02:00
Laurent Fasnacht
3b38777b71
Fix condition depending on uninitialized value in SAO
2014-05-21 16:33:24 +02:00
Laurent Fasnacht
66e730ba94
Fix encoder_state_init, which was making out of bound reads
2014-05-21 14:23:36 +02:00
Laurent Fasnacht
37c20b8ce5
Add dependency between SAO rows
2014-05-21 13:52:56 +02:00
Laurent Fasnacht
90f46dc56f
Threadqueue now has a start index to the first queue job. It improves the speed a little
2014-05-21 12:02:55 +02:00
Laurent Fasnacht
f4f9093cb5
Parallel SAO
2014-05-21 11:48:29 +02:00
Laurent Fasnacht
a3fcb141ed
lcu_order_element now has pointer to neighbor LCUs
2014-05-21 11:06:53 +02:00
Ari Koivula
de76d0a294
Don't add dependency to the above LCU in wavefront if it's not necessary.
...
- The top-right LCU already has dependency to the top LCU.
2014-05-20 10:48:19 +03:00
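A sketch of the dependency rule this suggests, assuming each LCU waits for its left neighbour and for the top-right LCU of the row above. The top-right LCU itself waits for its own left neighbour, which is the LCU directly above, so that edge is implied. Falling back to the LCU above in the last column is an assumption of this sketch, as are all names.

```c
#include <assert.h>

typedef struct {
  int x;
  int y;
} lcu_pos_t;

/* Fill deps with the LCUs that (x, y) must wait for in a wavefront.
 * Returns the number of dependencies written (0 to 2). */
int wavefront_deps(int x, int y, int width_in_lcu, lcu_pos_t deps[2])
{
  int count = 0;
  assert(x >= 0 && x < width_in_lcu && y >= 0);

  if (x > 0) {
    deps[count++] = (lcu_pos_t){ x - 1, y };      /* left */
  }
  if (y > 0) {
    if (x + 1 < width_in_lcu) {
      deps[count++] = (lcu_pos_t){ x + 1, y - 1 }; /* top-right */
    } else {
      deps[count++] = (lcu_pos_t){ x, y - 1 };     /* last column: top */
    }
  }
  return count;
}
```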
Laurent Fasnacht
bdc2d43180
Write bitstream directly after doing the search. This is required since we need the correct entropy status for wpp
2014-05-20 09:29:01 +02:00
Laurent Fasnacht
06532292fc
Wavefronts are in tile coordinates
2014-05-20 09:28:58 +02:00
Ari Koivula
4751a3744b
Fix intra mode search not doing boundary smoothing for DC.
...
- Move the boundary smoothing to the prediction function to make sure it's not
forgotten.
2014-05-19 16:23:17 +03:00
Ari Koivula
f9a603e4ea
Move intra mode search from intra module to search module.
...
- Make the actual intra prediction function global.
- Move the rdo stuff to rdo module.
2014-05-19 16:12:02 +03:00
Ari Koivula
1da94f2085
Stop deblocking from filtering edges not on 8x8 grid.
2014-05-19 15:58:54 +03:00
Ari Koivula
2224e18a46
Make deblocking work with transform splits.
...
- It used to work only with the implicit transform split from LCU size.
2014-05-19 15:58:54 +03:00
Ari Koivula
656b0a321b
Add chroma mode to lcu_set_intra_mode.
...
- This is needed for intra split.
2014-05-19 15:58:54 +03:00
Ari Koivula
921f58b249
Add tr_split to lcu_set_intra_mode.
2014-05-19 15:58:54 +03:00
Ari Koivula
846b608125
Add transform split recursion to intra reconstruction.
2014-05-19 15:58:54 +03:00
Ari Koivula
63f6cad5a0
Include global.h in thread modules.
2014-05-19 15:58:16 +03:00
Ari Koivula
551b087b47
Remove bunch of unnecessary code from encode_transform_unit.
...
- Really, it's useless. Selecting scan order isn't this hard.
- Checked from HM that ctx_idx doesn't have anything to do with contexts.
2014-05-16 17:42:40 +03:00
Ari Koivula
f73bef0941
Remove unused include.
2014-05-16 16:09:59 +03:00
Laurent Fasnacht
6fdb821b14
Fix memory leaks
2014-05-16 12:20:40 +02:00
Laurent Fasnacht
d4a6aed471
Multi-row jobs
2014-05-16 12:20:40 +02:00
Marko Viitanen
94285fbed7
Fixed compiling on visual studio with _DEBUG defined
2014-05-16 12:22:06 +03:00
Marko Viitanen
86155ef1ba
Added windows specific timing macros for thread debugging
2014-05-16 12:16:22 +03:00
Laurent Fasnacht
36945e89ce
Stubs to be able to make a portable version of the profiling
2014-05-16 10:15:05 +02:00
Laurent Fasnacht
53b0835316
Improve handling of jobs when not using threads
2014-05-16 08:50:43 +02:00
Laurent Fasnacht
519750d630
Write bitstream of a wavefront in a parallel way
2014-05-16 08:50:42 +02:00
Laurent Fasnacht
7473ac1bfc
Able to log time in a simple way
2014-05-16 08:50:42 +02:00
Laurent Fasnacht
86e01284b8
Add -lrt
2014-05-16 08:48:54 +02:00
Laurent Fasnacht
4f73a7fc91
Instrument threads in order to be able to do some visualization
2014-05-16 08:44:32 +02:00
Ari Koivula
a7cd31d87b
Update the names of some bins to the current spec.
...
- Helps with debugging.
2014-05-16 05:44:03 +03:00
Ari Koivula
ab4041c8fc
Change cabac debug statements to show information better.
...
- Show the number of bits when encoding multiple bins. I would like just the
bits themselves in string form, but that's too much trouble for this.
- Print them as unsigned and coerce them to unsigned, as they are going to
get coerced to unsigned by the function call anyway.
- Change state to be less verbose.
2014-05-16 05:44:03 +03:00
Ari Koivula
c9a8756fbd
Fix NxN scan mode for lcu_get_final_cost.
...
- Scan mode was always selected according to the first PU mode.
2014-05-15 16:20:35 +03:00
Marko Viitanen
b08047cce9
Fixed intra chroma mode selection
2014-05-15 09:50:05 +03:00
Tapio Katajisto
4d879945b2
Fixed cost calculations in fme
2014-05-15 03:42:42 +00:00
Ari Koivula
f0e990905e
Remove chroma mode "36".
...
- It's an unnecessary chore to handle this special case everywhere (it means
chroma_mode == intra_mode). Better just to use the actual mode.
2014-05-14 19:56:35 +03:00
Ari Koivula
60a0ba4280
Update VS project files to link win32-pthread.
...
- I haven't found a good way of including external dependencies to VS projects
yet. Win32-pthreads is assumed to be found at the same level as kvazaar dir
and has the files x86/pthreadVC2.lib and x64/pthreadVC2.lib.
- Win32-pthreads also requires the pthreadVC2.dll to be in PATH when running
the program. Not sure what to do about that yet. We might need an installer
for windows to handle that.
- Disable openmp as it's no longer used.
- Stop linking Ws2_32.lib as that hasn't been used for ages.
2014-05-14 17:54:34 +03:00
Laurent Fasnacht
8ff9ea0eee
Wavefront works with parallelism + deblock (still no SAO)
2014-05-14 14:01:26 +02:00
Laurent Fasnacht
38444a81a6
Threads should be put in queue in wait state if we want to add dependencies later
2014-05-14 14:01:25 +02:00
Laurent Fasnacht
e72408249b
Add encoder_state pointer to lcu_order_element, new worker_encoder_state_search_lcu function to run the search stuff on one LCU
2014-05-14 14:01:24 +02:00
Laurent Fasnacht
eb62696461
Fix problems when image dimensions are not a multiple of LCU
2014-05-14 13:27:14 +02:00
Laurent Fasnacht
1ba1683c05
search buffer has to be allocated tile-wise to avoid problems with wavefronts
2014-05-14 13:27:13 +02:00
Laurent Fasnacht
bb86f24000
Take advantage of the new buffers to remove unneeded item assignment
2014-05-14 13:27:13 +02:00
Laurent Fasnacht
6607c9f563
Use new buffers for search
2014-05-14 13:27:12 +02:00
Laurent Fasnacht
c257c4b863
Add const for the buffers
2014-05-14 13:27:12 +02:00
Laurent Fasnacht
1680273e80
Store search borders in a buffer for the whole picture
2014-05-14 13:27:11 +02:00
Laurent Fasnacht
0ceb1469a2
Improve decision about when to split into threads
2014-05-14 13:27:11 +02:00
Laurent Fasnacht
d4a303e7e6
Free jobs as soon as possible
2014-05-14 13:27:09 +02:00
Laurent Fasnacht
63adb54a3d
Add --threads <int> command line parameter
2014-05-14 13:27:09 +02:00
Laurent Fasnacht
e772799d5e
encoder_state_encode now uses the threadqueue
2014-05-14 13:27:08 +02:00
Laurent Fasnacht
baede7f6c4
threadqueue
2014-05-14 13:27:08 +02:00
Laurent Fasnacht
8b7774153f
Add SLEEP() define
2014-05-14 13:27:08 +02:00
Laurent Fasnacht
aac7fc55b1
Remove filter_deblock function, which is not used and somewhat dangerous, since it doesn't take into account specific stuff about subencoders.
2014-05-14 13:27:07 +02:00
Laurent Fasnacht
bc3ca90bdf
Fix tiles when SAO or deblock is enabled.
...
Was broken by previous commit.
2014-05-14 13:27:07 +02:00
Laurent Fasnacht
4815a0604b
Entropy coding sync works without parallelism, without SAO and without deblocking
2014-05-14 13:27:06 +02:00
Laurent Fasnacht
2c2a2528f3
Remove openmp stuff
2014-05-14 13:27:06 +02:00
Ari Koivula
aee9bf2875
Re-add rdo control to transformskip decision.
...
- It got left out when rewriting the function.
2014-05-14 12:39:23 +03:00
Ari Koivula
9147b7acbf
Split residual quantization to separate luma and chroma function.
2014-05-14 11:19:48 +03:00
Tapio Katajisto
cc92cfee18
Added a few warnings to Makefile
...
Cleaned fme code a bit
2014-05-14 01:49:34 +00:00
Tapio Katajisto
efc43c8b3a
Added fractional pixel motion estimation
...
Added fractional mv support for inter recon
Added 1/8-pel chroma and 1/4-pel luma interpolation
2014-05-14 01:42:02 +00:00
Ari Koivula
e947bd4c0e
Clean up trskip decision code and remove old code.
...
- You can define structs inside functions! This changes everything!!
- Bitstream changes a little bit compared to old trskip decision. Bdrate
change is insignificant though.
2014-05-13 22:00:04 +03:00
Ari Koivula
a3cdee9ec5
Move new trskip decision to a function.
2014-05-13 21:59:00 +03:00
Ari Koivula
2ff713ccb2
Add new implementation for trskip decision.
2014-05-13 21:57:45 +03:00
Ari Koivula
8b8da6f493
Make luma and chroma use the same quantization function.
...
- Only thing not working was transform skip.
2014-05-13 21:57:23 +03:00
Ari Koivula
f0bfcedba2
Clean up coeff reconstruction code.
2014-05-13 21:56:10 +03:00
Ari Koivula
0c65a9b658
Remove abs_sum from coeff quantization.
...
- It's meant for checking if there are any coefficients, but we don't use it
and it's annoying to remember to initialize it and pass it around. The
benefit should be quite small anyway.
2014-05-13 21:54:34 +03:00
Ari Koivula
75042fc65d
Move luma quantization to its own function.
2014-05-13 21:34:06 +03:00
Ari Koivula
ba3aaf3189
Expand chroma functions to parent function.
...
- This was done so that making the function work with luma would be easier.
2014-05-13 21:30:14 +03:00
Ari Koivula
637aceb495
Add TR_MAX_WIDTH.
...
- Max transform size is constrained by but independent of LCU size.
- Luma and chroma now have the same stride for transform arrays.
2014-05-13 21:22:40 +03:00
Ari Koivula
1c38209cab
Add missing include.
2014-05-13 09:33:05 +03:00
Ari Koivula
13577562e5
Revert change to definition of LCU_WIDTH.
2014-05-13 09:28:01 +03:00
Ari Koivula
fb763f7940
Move coefficient generation functions from encoder.c to transform.c.
...
- These functions probably should have been there to begin with.
2014-05-12 11:37:39 +03:00
Ari Koivula
a3478ecd20
Move transform skip decision to its own function.
2014-05-12 11:18:27 +03:00
Ari Koivula
d9b890de6e
Remove redundant variables.
...
- Redefine LCU_WIDTH to be 64. Stuff will break horribly if it's
anything else anyway.
- Add LCU_WIDTH_C for chroma LCU width. It should be more readable than the
constant (LCU_WIDTH >> 1).
2014-05-12 10:58:07 +03:00
Ari Koivula
59e0e98523
Separate luma and chroma coefficient generation variables.
2014-05-12 10:38:24 +03:00
Ari Koivula
0ca65e7606
Move chroma coefficient generation to its own function.
...
- It's time to chop up this monster that is encode_transform_tree.
2014-05-12 10:24:06 +03:00
Ari Koivula
3c3c9a26c6
Move scan order selection to a function.
2014-05-12 08:47:16 +03:00
Ari Koivula
623d9001a8
Reorder chroma coefficient generation.
2014-05-12 08:47:16 +03:00
Ari Koivula
93141c7d2e
Avoid unnecessary copying of predicted pixels when there are no coeffs.
...
- These are probably from a time when reconstruction happened in this
function.
2014-05-09 16:39:58 +03:00
Ari Koivula
27ab882c25
Clean up coefficient generation.
2014-05-09 16:33:10 +03:00
Ari Koivula
ce945ab4ef
Handle coefficient initialization better.
...
- Coefficients are no longer required to be pre-zeroed. The resulting zeroes
are copied in even in the case where we already know they are all zeroes.
- Move cbf clearing code to only happen at the leaves of the recursion.
2014-05-09 16:30:28 +03:00
Laurent Fasnacht
b274558139
Refactor and fix entry_points functions.
...
Seems to be OK with HM now
2014-05-09 12:42:37 +02:00
Laurent Fasnacht
43b5f84c0d
Fix sao_calc_edge_block_dims
...
It was computing wrong dimensions, which was causing out-of-bounds reads in sao_reconstruct.
2014-05-09 10:30:34 +02:00
Laurent Fasnacht
3f975e92cd
Replace line fixing symptoms by assertions, to reveal the cause
2014-05-09 08:24:03 +02:00
Laurent Fasnacht
4dbf7c7a52
Fix blit dimensions in sao_search_best_mode
2014-05-09 08:24:02 +02:00
Ari Koivula
cb5d7e6541
Fix compilation for VS2010.
2014-05-08 17:28:12 +03:00
Laurent Fasnacht
0452806ec4
Entry points
2014-05-08 15:04:56 +02:00
Laurent Fasnacht
da588af2ba
Partial support for wavefront
2014-05-08 15:04:55 +02:00
Laurent Fasnacht
4de5660254
Fix missing offset in LCU range computation for wavefronts
2014-05-08 15:04:55 +02:00
Laurent Fasnacht
dc34a5eac6
LCU borders
2014-05-08 15:04:54 +02:00
Laurent Fasnacht
24f4a8cad1
Wavefront also needs entrypoints
2014-05-08 15:04:53 +02:00
Laurent Fasnacht
d05f8b52aa
Rewrite of encoder_state_write_bitstream_leaf: handle slice + tiles + wavefronts correctly
2014-05-08 15:04:53 +02:00
Laurent Fasnacht
27f694e3e8
Some initial code to support wpp and slices
2014-05-08 15:04:52 +02:00
Laurent Fasnacht
b3d1754cc3
context_copy function
2014-05-08 15:04:51 +02:00
Laurent Fasnacht
163189c3c7
Bitstream for leaves can be computed in parallel
2014-05-08 15:04:51 +02:00
Laurent Fasnacht
be9882f5b2
Leaf bitstream write
2014-05-08 15:04:50 +02:00
Laurent Fasnacht
ae6a7a9c4b
Leaf encoder uses encoder_state->lcu_order
2014-05-08 15:04:49 +02:00
Laurent Fasnacht
b740142325
Add is_leaf to encoder_state
2014-05-08 15:04:48 +02:00
Laurent Fasnacht
8451d5b100
Move some init code to encoder_state_new_frame
2014-05-08 15:04:48 +02:00
Laurent Fasnacht
1cb3f14dfe
lcu_order_count in (leaves) encoder
2014-05-08 15:04:47 +02:00
Laurent Fasnacht
ef6ae3e723
Remove dead code
2014-05-08 15:04:46 +02:00
Ari Koivula
535b42bc9b
Fix compilation for VS2010.
2014-05-07 15:26:44 +03:00
Laurent Fasnacht
05eef82896
Remove extra [ from graphviz dump
2014-05-07 13:40:29 +02:00
Laurent Fasnacht
84e5dbee39
Remove quote from graphviz dump
2014-05-07 13:33:02 +02:00
Laurent Fasnacht
b48a687d3c
Restored parallelism, but it will be done in another way... OpenMP is not very efficient in this kind of dynamic situation
2014-05-07 11:55:56 +02:00
Laurent Fasnacht
0e6f1c99fc
Refactor picture to remove hidden dependency between slice and tiles
...
picture.type -> encoder_state->global->pictype
picture.slicetype -> encoder_state->global->slicetype
picture.slice_sao_luma_flag -> 1 (was constant)
picture.slice_sao_chroma_flag -> 1 (was constant)
This may be changed later. For now it's better to avoid having slice related stuff in picture.
2014-05-07 11:55:48 +02:00
Laurent Fasnacht
39d96e0546
Fix bug with cabac stream pointing to bad data
2014-05-07 11:55:41 +02:00
Laurent Fasnacht
e144f817ef
Works when not using tiles
2014-05-07 11:55:16 +02:00
Laurent Fasnacht
24c2bd70ca
Fix small bugs with compilation
2014-05-07 11:54:35 +02:00
Laurent Fasnacht
a03f0cba19
encoder_control_input_init near the other encoder_control_* functions
2014-05-07 11:53:21 +02:00
Laurent Fasnacht
1e2671ac30
Renamed encoder_clear_refs to encoder_state_clear_refs
2014-05-07 11:53:12 +02:00
Laurent Fasnacht
831b221cf8
Parsing seems to work now
2014-05-07 11:53:01 +02:00
Laurent Fasnacht
8b5cb62237
Debug code to generate a graph
2014-05-07 11:52:04 +02:00
Laurent Fasnacht
cee6bb0e71
Fix iteration on children
2014-05-07 11:49:14 +02:00
Laurent Fasnacht
699669ee35
fixed typo
2014-05-07 11:48:16 +02:00
Laurent Fasnacht
6c6adf18c7
Refactor encoder_state
2014-05-07 11:47:31 +02:00
Laurent Fasnacht
a23edd0339
added parent to encoder_state
2014-05-07 11:42:54 +02:00
Laurent Fasnacht
5ce518a47a
lcu_at_tile_start and lcu_at_tile_end helper functions
2014-05-07 11:42:30 +02:00
Laurent Fasnacht
c2872bd6b0
Slices and WPP in command line and encoder
2014-05-07 11:42:04 +02:00
Laurent Fasnacht
2d6f199246
reorganized encoder_state structure
2014-05-07 11:41:27 +02:00
Laurent Fasnacht
f0b076876f
Moved all the stream related stuff into substream_write_bitstream
2014-05-07 11:40:20 +02:00
Laurent Fasnacht
f30b9c2a11
Fix a buffer overflow in parse_tiles_specification
2014-05-07 11:39:45 +02:00
Ari Koivula
eaf8835bda
Add some comments and const qualifiers.
2014-05-06 19:20:38 +03:00
Ari Koivula
3910b7989a
Clear old cbf data before recursion in encode_transform_tree.
...
- Because encode_transform_tree also maintains the CBF data and assumes that
the CBFs are initially zeroed, calling the function more than once would
result in incorrect CBF data.
2014-05-06 19:03:29 +03:00
Ari Koivula
bdc16d2612
Improve cu_info coded block flag data structure a bit.
...
- It works just like the old structure except that the flags are checked with
bitmasks instead of having the flag value be propagated upwards. There isn't
really any benefit to this because the flags still have to be propagated to
parent CUs.
- Wrapped them inside a struct to make copying them easier. (Just need to copy
the struct instead of making individual copies)
2014-05-06 18:28:04 +03:00
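One way flags can be checked with bitmasks, as a sketch: each transform depth gets its own bit, and a check at a given depth uses a mask covering that depth and everything deeper. The bit layout, mask table, and names below are assumptions for illustration; the commit only states that the flags are checked with bitmasks and wrapped in a struct.

```c
#include <assert.h>
#include <stdint.h>

#define CBF_MAX_DEPTH 4

/* One byte of per-depth flags for each colour component. */
typedef struct {
  uint8_t y;
  uint8_t u;
  uint8_t v;
} cbf_flags_t;

/* Mask for depth d covers the bits of depth d and all deeper depths. */
static const uint8_t cbf_masks[CBF_MAX_DEPTH + 1] = {
  0x1f, 0x0f, 0x07, 0x03, 0x01
};

/* Mark that a block at the given depth has non-zero coefficients. */
void cbf_set(uint8_t *cbf, int depth)
{
  assert(depth >= 0 && depth <= CBF_MAX_DEPTH);
  *cbf |= (uint8_t)(1u << (CBF_MAX_DEPTH - depth));
}

/* True if this depth or any deeper depth has its flag set. */
int cbf_is_set(uint8_t cbf, int depth)
{
  assert(depth >= 0 && depth <= CBF_MAX_DEPTH);
  return (cbf & cbf_masks[depth]) != 0;
}
```

Because the three bytes live in one struct, copying all flags of a CU is a single assignment such as `dst_cbf = src_cbf;`.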
Ari Koivula
d123b98aea
Remove unnecessary ternary expressions from usages of CABAC_BIN.
2014-05-06 17:39:25 +03:00
Ari Koivula
380401b2eb
Have CABAC_BIN accept any >0 as binary 1.
...
It used to treat odd numbers as false.
2014-05-06 17:39:10 +03:00
Marko Viitanen
bf2c2a1330
Small changes to fix compiling on VS
...
- Added threads.h to VS project
- Included Windows.h in threads.h
2014-05-05 11:18:43 +03:00
Laurent Fasnacht
f3d4e6eb09
Move bitstream write to a separate function, and add assertions about the part which should not write to bitstream.
2014-05-05 09:24:57 +02:00
Laurent Fasnacht
0fe080ad0a
bitstream_tell
2014-05-05 08:53:06 +02:00
Laurent Fasnacht
7f6f4fe9c1
Reference count for picture
2014-05-05 08:03:24 +02:00
Laurent Fasnacht
323054d5e2
naming: alloc_yuv_t -> yuv_t_alloc, dealloc_yuv_t -> yuv_t_free
2014-05-02 11:45:27 +02:00
Laurent Fasnacht
7d6d1d5536
Remove pic->pred_*
2014-05-02 11:38:07 +02:00
Laurent Fasnacht
92e14cc80d
rename picture_init to picture_alloc and picture_destroy to picture_free
2014-05-02 10:58:28 +02:00
Laurent Fasnacht
b76f7377b6
Always initialize tiles data structures (even with only one tile)
2014-05-02 10:00:22 +02:00
Laurent Fasnacht
f97e60a80d
Doc for encoder state
2014-05-02 10:00:12 +02:00
Laurent Fasnacht
161fe38f5e
Remove USE_TILES define
2014-05-01 13:58:13 +02:00
Laurent Fasnacht
a84fd6486d
Add function subencoder_blit_pixels
2014-05-01 11:16:11 +02:00
Laurent Fasnacht
b8b28635ff
Iterable structure for sub-encoders (more flexibility)
2014-05-01 11:16:10 +02:00
Laurent Fasnacht
212d390003
Cleanup of encoder_state_init and encoder_state_finalize
2014-05-01 11:16:10 +02:00
Laurent Fasnacht
161053f86b
Do not allow more tiles than dimension in LCU
2014-05-01 07:11:31 +02:00
Ari Koivula
42295d3cb9
Pass preprocessor defines for supported intrinsics in VS2010 explicitly.
...
- _M_IX86_FP defines whether VS should generate code using SSE or SSE2
instructions. It isn't correct to use it to check whether optional runtime
optimizations should be compiled in. It's also not defined at all in 64-bit
mode.
- So let's just keep it simple and give a list of everything that is supported
as release optimizations. It's not clear from the documentation if all of
these are really supported. It just lists a bunch of intrinsics from these
that are.
2014-04-30 17:41:15 +03:00
Ari Koivula
d1fbc6dc80
Fix a small memory leak.
...
- Malloced pointer returned by alloc_yuv_t was not being freed in
substream_encode.
- Remove use of yuv_t from encode_one_frame, as it's not used there anymore.
2014-04-30 11:15:34 +03:00
Ari Koivula
d808fe3b02
Merge branch 'strategy_selector'
2014-04-29 15:36:48 +03:00
Ari Koivula
bd7e021742
Modify strategyselector to work with VS2010.
...
- VS doesn't have snprintf.
- VS doesn't support GCC attributes.
- Add defines for __SSE__ and __SSE2__ on VS.
2014-04-29 15:29:06 +03:00
Laurent Fasnacht
bf7e755cf7
Strategies and runtime detection/choice of best algorithm
2014-04-29 11:51:41 +02:00
Ari Koivula
27b94d4b45
Address gcc -Wtype-limits errors.
...
- Fixes warnings in #19 and #16.
2014-04-29 09:15:52 +03:00
Ari Koivula
2a17e9a7aa
Merge branch 'sse_intrinsics'
2014-04-28 19:38:08 +03:00
Ari Koivula
cecf4b0b4e
Move __USE_MINGW_ANSI_STDIO to Makefile.
...
- I'm not too clear on how this should be used, but having it in the source
file after mingw stuff was included caused a warning about redefinition of
__USE_MINGW_ANSI_STDIO.
2014-04-28 19:37:37 +03:00
Ari Koivula
4e7e40054f
Move picture-sse2.c to src/inline-optimizations/.
...
- Having it in the src dir even though it's not a module on its own breaks
the scons build script. It's probably better to have these a little bit
separated from the normal code anyway.
2014-04-28 19:36:40 +03:00
Laurent Fasnacht
d66f809734
reg_sad implementation using SSE2/SSE4.1 intrinsics
2014-04-28 15:36:58 +02:00
Ari Koivula
4490e8afd6
Remove depth dimension from picture->cu_array.
...
- It isn't used for anything anymore.
- It was used in the past to hold information during search, but now that
information is held in lcu_t structs.
2014-04-28 10:18:22 +03:00
Laurent Fasnacht
76ec605b72
SAO works with tiles now
2014-04-28 06:29:21 +02:00
Yusuke Nakamura
0214d4ffcc
Makefile: Remove unneeded arguments in CCFLAGS.
...
This fixes compilation on clang.
2014-04-27 00:41:10 +09:00
Yusuke Nakamura
03da39e229
config: Use built-in getopt on non-MSVC environments.
2014-04-27 00:40:52 +09:00
Yusuke Nakamura
c5a4e7b52c
encmain: Remove a warning on MinGW.
2014-04-26 23:56:50 +09:00
Ari Koivula
145816cfb5
Move printing of CLI stuff to stderr.
...
- Printing to stdout corrupts the stream when used with "-o -".
2014-04-26 12:56:39 +03:00
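The reasoning in the commit above can be illustrated with a minimal sketch: when the output path is "-", the encoded stream goes to stdout, so any status text must go to stderr to keep the stream intact. The helper names `output_is_stdout` and `print_status` are hypothetical and are not taken from the project's source.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 1 when the output path "-" asks for
 * the bitstream to be written to stdout, 0 otherwise. */
int output_is_stdout(const char *path)
{
  assert(path != NULL);
  return strcmp(path, "-") == 0;
}

/* Hypothetical helper: status text always goes to stderr, so it can
 * never be interleaved with bitstream bytes written to stdout.
 * Returns the number of characters printed, as fprintf does. */
int print_status(int frame, long bytes)
{
  return fprintf(stderr, "frame %d: %ld bytes\n", frame, bytes);
}
```

Sending status text to stderr unconditionally, rather than only when the output is stdout, keeps the behaviour the same for every output target.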
Laurent Fasnacht
5e7945888a
Inter-frame prediction with tiles works.
...
Many thanks to Jean-Hugues Recolin for the insightful comments about shifts!
2014-04-25 09:28:00 +02:00
Laurent Fasnacht
7719837f17
Simple OpenMP parallelization
2014-04-25 09:11:10 +02:00
Laurent Fasnacht
4e34859e66
Fix compilation error with USE_TILES=1 and -Werror=maybe-uninitialized
2014-04-24 08:41:05 +02:00
Laurent Fasnacht
59392c4a62
Fix compilation issue with USE_TILES=0
2014-04-24 08:38:24 +02:00
Laurent Fasnacht
571a373f69
Use tile offset in search
2014-04-24 08:38:24 +02:00
Laurent Fasnacht
2e7d958af3
Picture and reference may have different sizes
2014-04-24 08:38:24 +02:00
Laurent Fasnacht
af9a1c0fbb
Use same reference images for all subencoders
2014-04-24 08:38:23 +02:00
Laurent Fasnacht
73c574fb45
P-frame: first try...
2014-04-24 08:38:22 +02:00
Laurent Fasnacht
03361dcf2c
sao try... still not working
2014-04-24 08:38:22 +02:00
Laurent Fasnacht
3db4c59478
Reconstruct full frame from tiles
2014-04-24 08:38:21 +02:00
Laurent Fasnacht
35d5d22ccc
Fix tile size not to go outside of the original picture
2014-04-24 08:38:20 +02:00
Laurent Fasnacht
985630b8b2
Add a check to fix picture_blit_pixels when width > orig_stride
2014-04-24 08:38:20 +02:00
Laurent Fasnacht
b36e154c38
Some cleanup
2014-04-24 08:38:19 +02:00
Laurent Fasnacht
01580a93c3
Encoding with tiles now more or less works with -p 1 --no-sao --no-deblock
2014-04-24 08:38:19 +02:00
Laurent Fasnacht
fd89b9af76
New functions: bitstream_append and bitstream_clear
2014-04-24 08:38:18 +02:00
Laurent Fasnacht
356c17e0de
Add missing break in bitstream_writebyte
2014-04-24 08:38:18 +02:00
Laurent Fasnacht
5fb4d9c36e
substream_encode function
2014-04-24 08:38:17 +02:00
Laurent Fasnacht
e292b2c274
allocate subencoders
2014-04-24 08:38:17 +02:00
Laurent Fasnacht
12e3900fd1
( ) for preprocessor directives...
2014-04-24 08:38:16 +02:00
Laurent Fasnacht
fba4f5432a
Fix debug code
2014-04-24 08:38:16 +02:00
Laurent Fasnacht
b255133460
Debug for tiles
2014-04-24 08:38:15 +02:00
Laurent Fasnacht
066ce6c9f4
Remove unused prototype
2014-04-24 08:38:15 +02:00
Laurent Fasnacht
11629ce811
Use tile scan order in encode_one_frame()
2014-04-24 08:38:14 +02:00
Laurent Fasnacht
0036afa056
Write tile-related information to picture parameter set and slice header
2014-04-24 08:38:14 +02:00
Laurent Fasnacht
1e9c894eba
Coding tree block raster and tile scanning conversion process, according to ITU-T Rec. H.265 (04/2013) 6.5.1
2014-04-24 08:38:13 +02:00
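For readers unfamiliar with the conversion named in the commit above, the following is a rough sketch of a raster-scan to tile-scan address table in the spirit of H.265 section 6.5.1. It assumes uniformly spaced tiles only; the function names and signature are hypothetical and do not reflect the project's actual code.

```c
#include <assert.h>
#include <stdlib.h>

/* Fill bd[0..n] with tile boundaries (in CTBs) for a uniform split of
 * `size` CTBs into `n` tiles. */
static void uniform_boundaries(int size, int n, int *bd)
{
  bd[0] = 0;
  for (int i = 0; i < n; ++i) {
    bd[i + 1] = bd[i] + ((i + 1) * size) / n - (i * size) / n;
  }
}

/* Hypothetical helper: returns a malloc'd table of width_ctbs * height_ctbs
 * entries mapping a CTB raster-scan address to its tile-scan address, or
 * NULL if allocation fails. The caller frees the table. */
int *ctb_addr_rs_to_ts(int width_ctbs, int height_ctbs,
                       int tile_cols, int tile_rows)
{
  assert(tile_cols >= 1 && tile_cols <= width_ctbs);
  assert(tile_rows >= 1 && tile_rows <= height_ctbs);

  int *col_bd = malloc(sizeof(int) * (tile_cols + 1));
  int *row_bd = malloc(sizeof(int) * (tile_rows + 1));
  int *table = malloc(sizeof(int) * width_ctbs * height_ctbs);
  if (!col_bd || !row_bd || !table) {
    free(col_bd); free(row_bd); free(table);
    return NULL;
  }
  uniform_boundaries(width_ctbs, tile_cols, col_bd);
  uniform_boundaries(height_ctbs, tile_rows, row_bd);

  for (int rs = 0; rs < width_ctbs * height_ctbs; ++rs) {
    int tb_x = rs % width_ctbs;
    int tb_y = rs / width_ctbs;
    int tile_x = 0, tile_y = 0;

    /* Find the tile column and row containing this CTB. */
    for (int i = 0; i < tile_cols; ++i) if (tb_x >= col_bd[i]) tile_x = i;
    for (int j = 0; j < tile_rows; ++j) if (tb_y >= row_bd[j]) tile_y = j;

    int tile_h = row_bd[tile_y + 1] - row_bd[tile_y];
    int tile_w = col_bd[tile_x + 1] - col_bd[tile_x];
    int ts = 0;

    /* CTBs in tiles to the left within the same tile row. */
    for (int i = 0; i < tile_x; ++i) ts += tile_h * (col_bd[i + 1] - col_bd[i]);
    /* CTBs in all tile rows above. */
    for (int j = 0; j < tile_y; ++j) ts += width_ctbs * (row_bd[j + 1] - row_bd[j]);
    /* Raster offset inside the current tile. */
    ts += (tb_y - row_bd[tile_y]) * tile_w + (tb_x - col_bd[tile_x]);

    table[rs] = ts;
  }
  free(col_bd);
  free(row_bd);
  return table;
}
```

For a picture of 4x2 CTBs split into two tile columns, the left tile's CTBs are numbered first in tile scan, so raster addresses 4 and 5 map to tile-scan addresses 2 and 3.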
Laurent Fasnacht
7bd6aa2e9c
encoder_control_input_init call moved to encoder_control_init
2014-04-24 08:38:13 +02:00
Laurent Fasnacht
ff318ae0e9
Tiles in encoder_control
2014-04-24 08:38:12 +02:00
Laurent Fasnacht
9353f14792
Parameters for using tiles in command line arguments.
...
--tiles-width-split
--tiles-height-split
2014-04-24 08:38:11 +02:00
Laurent Fasnacht
61c67dc485
Allow -DUSE_TILES=1 to be specified in Makefile; define MAX_TILES_PER_DIM.
2014-04-24 08:38:11 +02:00
Laurent Fasnacht
19b1642aa2
Removed all cabac parameters (cabac is part of encoder_state)
2014-04-22 11:46:53 +02:00
Ari Koivula
a539ae7e08
Address clang-analyzer warning.
...
- The assert needs to be before the initialization.
2014-04-22 11:55:28 +03:00
Laurent Fasnacht
5fea5875a5
Huge refactoring
...
Split some parts of encoder_control into encoder_state
(idea: encoder_control is immutable)
Goal is to allow multiple substreams in the future.
2014-04-22 10:39:12 +02:00
Ari Koivula
88a67a4e49
Fix faulty assert that stops the program from working with inter frames.
...
- The assert would be true after the next if block, but in its current place
it's false.
2014-04-22 10:57:38 +03:00
Ari Koivula
54270f271d
Fix c89 problem to allow compilation with VS2010.
2014-04-17 19:12:39 +03:00
Ari Koivula
1b437a5989
Address clang-analyzer warnings about garbage values.
...
- False alarm, but surprisingly difficult to convince clang of that. It
doesn't seem to understand bit shifts very well.
- Only assert and changing LCU_WIDTH>>depth to width was necessary to satisfy
clang.
- Closes #35.
2014-04-17 18:43:09 +03:00
Ari Koivula
11509c68dc
Address clang-analyzer warnings about unused values.
...
- Related to issue #35.
2014-04-17 18:43:08 +03:00