Laurent Fasnacht
8437229885
Fix handling of cu_arrays
2014-06-11 10:29:04 +02:00
Laurent Fasnacht
e1d9cb015a
Basic checkpointing system
2014-06-11 10:29:03 +02:00
Laurent Fasnacht
27a49d287d
Big refactor to use videoframe, image_list, and image instead of picture*
2014-06-10 09:19:06 +02:00
Laurent Fasnacht
530faf3951
Move video frame related stuff to videoframe
2014-06-05 14:08:31 +02:00
Laurent Fasnacht
0fac77f9eb
Image now in separate module
2014-06-05 14:04:12 +02:00
Laurent Fasnacht
2456c65822
Replace accesses to picture->cu_array with picture_get_cu and picture_get_cu_const
2014-06-05 10:41:58 +02:00
Laurent Fasnacht
821b71910b
Move picture_list to its own module
2014-06-05 09:49:24 +02:00
Laurent Fasnacht
7372f9244d
Basic infrastructure for OWF
2014-06-05 09:09:25 +02:00
Laurent Fasnacht
16e3a58359
Performance improvement
2014-06-05 06:57:51 +02:00
Laurent Fasnacht
bad6d45e5f
Performance improvement
2014-06-05 06:57:51 +02:00
Laurent Fasnacht
aad2089fcf
Use -ftree-vectorize
2014-06-05 06:57:50 +02:00
Laurent Fasnacht
ea04bcd6a4
AltiVec support for SAD
2014-06-05 06:57:34 +02:00
Ari Koivula
3a7147baf4
Merge branch 't-20140602'
2014-06-04 18:11:15 +03:00
Ari Koivula
31b1bbc215
Address implicit declaration of warnings.
2014-06-04 18:00:50 +03:00
Ari Koivula
4f5c87fc5e
Remove duplicate function definition.
2014-06-04 17:56:05 +03:00
Ari Koivula
cb7d7f9e15
Update Makefile.
2014-06-04 17:52:28 +03:00
Ari Koivula
bb47534b88
Make encoder_state .c files their own compilation units.
...
- It's good that this module has been chopped to smaller pieces, but lets
avoid including .c files unless we really have to. These make pretty good
submodules on their own so just make them their own compilation units.
- Move some stuff around to avoid having to forward declare them
in encoderstate.c.
2014-06-04 17:45:18 +03:00
Ari Lemmetti
9e649a8f38
Updated usage message
2014-06-04 15:23:27 +03:00
Laurent Fasnacht
b8acdc784a
Fix compilation of encoder.c with -D_DEBUG
2014-06-03 15:02:14 +02:00
Laurent Fasnacht
961da05235
Split encoderstate.c in multiple files
2014-06-03 14:47:49 +02:00
Laurent Fasnacht
3d07f8cc84
encoderstate refactor
2014-06-03 14:25:16 +02:00
Laurent Fasnacht
2e821b79a9
encoder_state in now in encoder_state.[ch]
2014-06-03 13:51:30 +02:00
Laurent Fasnacht
9bdecbe071
Better thread scheduling
2014-06-03 11:39:16 +02:00
Laurent Fasnacht
0811dbcfbe
Remove unneeded cond_broadcast. Limit contention
2014-06-03 09:45:17 +02:00
Laurent Fasnacht
5ee1319c08
Altivec detection
2014-06-03 07:55:39 +02:00
Laurent Fasnacht
58ad3b4d26
Log more performance data, plot also now many threads are running
2014-06-03 07:42:22 +02:00
Laurent Fasnacht
5ed69b063b
Strategy selector for array_checksum, basic implementation using precomputed 256*256 block with larger accesses than byte
2014-06-03 07:42:22 +02:00
Ari Koivula
a483e8cb0f
Move cpuid stuff away from compiler namespace.
...
Conflicts:
src/strategyselector.c
2014-05-30 10:08:14 +03:00
Marko Viitanen
6a72f87028
Merge commit '792a5a5dd1946a327f22b2daba05c6645dfa8037'
2014-05-30 08:47:01 +03:00
Marko Viitanen
792a5a5dd1
Small fix for __get_cpuid()
2014-05-30 08:37:03 +03:00
Laurent Fasnacht
642564b6fb
Remove unused variable
2014-05-28 15:04:45 +02:00
Laurent Fasnacht
4f86919d75
Get rid of assembly cpuid for x86, compilation works for powerpc
2014-05-28 15:04:00 +02:00
Ari Koivula
e585da37e5
Give correct transform depth to RDOQ.
...
Conflicts:
src/search.c
2014-05-28 15:47:49 +03:00
Ari Koivula
dceb3da9b8
Fix bug in search relating to transform with no non-zero coefficients.
...
- Because cost was calculated even though there were no coefficients, these
very good modes were less likely to be selected.
- Added assert to encode_coeff_nxn to avoid these problems in the future.
2014-05-28 15:22:18 +03:00
Ari Koivula
ddc02cc09e
Avoid regenerating reference pixels for every rdo mode.
2014-05-22 13:18:28 +03:00
Ari Koivula
dbe13d0cba
Separate sad intra search from rdo search.
2014-05-22 12:47:45 +03:00
Ari Koivula
19ce21e07c
Split final cost to luma and chroma functions.
2014-05-22 09:45:00 +03:00
Ari Koivula
a6962e2974
Separate intra transform coding to luma and chroma functions.
2014-05-22 09:40:34 +03:00
Laurent Fasnacht
3a30a886fc
FREE_POINTER of job->rdepends was at the wrong place (memory leak)
2014-05-22 07:15:18 +02:00
Laurent Fasnacht
3b38777b71
Fix condition depending on uninitialized value in SAO
2014-05-21 16:33:24 +02:00
Laurent Fasnacht
66e730ba94
Fix encoder_state_init, which was making out of bound reads
2014-05-21 14:23:36 +02:00
Laurent Fasnacht
37c20b8ce5
Add dependency between SAO rows
2014-05-21 13:52:56 +02:00
Laurent Fasnacht
90f46dc56f
Threadqueue has now a start index to the first queue job. It improves the speed a little
2014-05-21 12:02:55 +02:00
Laurent Fasnacht
f4f9093cb5
Parallel SAO
2014-05-21 11:48:29 +02:00
Laurent Fasnacht
a3fcb141ed
lcu_order_element now has pointer to neighbor LCUs
2014-05-21 11:06:53 +02:00
Ari Koivula
de76d0a294
Don't add dependency to the above LCU in wavefront if it's not necessary.
...
- The top-right LCU already has dependency to the top LCU.
2014-05-20 10:48:19 +03:00
Laurent Fasnacht
bdc2d43180
Write bitstream directly after doing the search. This is required since we need the correct entropy status for wpp
2014-05-20 09:29:01 +02:00
Laurent Fasnacht
06532292fc
Wavefront are in tile coordinates
2014-05-20 09:28:58 +02:00
Ari Koivula
4751a3744b
Fix intra mode search not doing boundary smoothing for DC.
...
- Move the boundary smoothing to the prediction function to make sure it's not
forgotten.
2014-05-19 16:23:17 +03:00
Ari Koivula
f9a603e4ea
Move intra mode search form intra module to search module.
...
- Make the actual intra prediction function global.
- Move the rdo stuff to rdo module.
2014-05-19 16:12:02 +03:00
Ari Koivula
1da94f2085
Stop deblocking from filtering edges not on 8x8 grid.
2014-05-19 15:58:54 +03:00
Ari Koivula
2224e18a46
Make deblocking work with transform splits.
...
- It used to work only with the implicit transform split from LCU size.
2014-05-19 15:58:54 +03:00
Ari Koivula
656b0a321b
Add chroma mode to lcu_set_intra_mode.
...
- This is needed for intra split.
2014-05-19 15:58:54 +03:00
Ari Koivula
921f58b249
Add tr_split to lcu_set_intra_mode.
2014-05-19 15:58:54 +03:00
Ari Koivula
846b608125
Add transform split recursion to intra reconstruction.
2014-05-19 15:58:54 +03:00
Ari Koivula
63f6cad5a0
Include global.h in thread modules.
2014-05-19 15:58:16 +03:00
Ari Koivula
551b087b47
Remove bunch of unnecessary code from encode_transform_unit.
...
- Really, it's useless. Selecting scan order isn't this hard.
- Checked from HM that ctx_idx doesn't have anything to do with contexts.
2014-05-16 17:42:40 +03:00
Ari Koivula
f73bef0941
Remove unused include.
2014-05-16 16:09:59 +03:00
Laurent Fasnacht
6fdb821b14
Fix memory leaks
2014-05-16 12:20:40 +02:00
Laurent Fasnacht
d4a6aed471
Multi-row jobs
2014-05-16 12:20:40 +02:00
Marko Viitanen
94285fbed7
Fixed compiling on visual studio with _DEBUG defined
2014-05-16 12:22:06 +03:00
Marko Viitanen
86155ef1ba
Added windows specific timing macros for thread debugging
2014-05-16 12:16:22 +03:00
Laurent Fasnacht
36945e89ce
Stubs to be able to make a portable version of the profiling
2014-05-16 10:15:05 +02:00
Laurent Fasnacht
53b0835316
Improve handling of jobs when not using threads
2014-05-16 08:50:43 +02:00
Laurent Fasnacht
519750d630
Write bitstream of a wavefront in a parallel way
2014-05-16 08:50:42 +02:00
Laurent Fasnacht
7473ac1bfc
Able to log time in a simple way
2014-05-16 08:50:42 +02:00
Laurent Fasnacht
86e01284b8
Add -lrt
2014-05-16 08:48:54 +02:00
Laurent Fasnacht
4f73a7fc91
Instrument threads in order to be able to do some visualization
2014-05-16 08:44:32 +02:00
Ari Koivula
a7cd31d87b
Update the names of some bins to the current spec.
...
- Helps with debugging.
2014-05-16 05:44:03 +03:00
Ari Koivula
ab4041c8fc
Change cabac debug statements to show information better.
...
- Show the number of bits when encoding multiple bins. I would like just the
bits them selves in string form, but that's too much trouble for this.
- Print then as unsigned and coerce them to unsigned, as they are going
get coerced to unsigned by the function call anyway.
- Change state to be less verbose.
2014-05-16 05:44:03 +03:00
Ari Koivula
c9a8756fbd
Fix NxN scan mode for lcu_get_final_cost.
...
- Scan mode was always selected according to the first PU mode.
2014-05-15 16:20:35 +03:00
Marko Viitanen
b08047cce9
Fixed intra chroma mode selection
2014-05-15 09:50:05 +03:00
Ari Koivula
f0e990905e
Remove chroma mode "36".
...
- It's an unnecessary chore to handle this special case everywhere (it means
chroma_mode == intra_mode). Better just to use the actual mode.
2014-05-14 19:56:35 +03:00
Ari Koivula
60a0ba4280
Update VS project files to link win32-pthread.
...
- I haven't found a good way of including external dependencies to VS projects
yet. Win32-pthreads is assumed to be found at the same level as kvazaar dir
and has the files x86/pthreadVC2.lib and x64/pthreadVC2.lib.
- Win32-pthreads also requires the pthreadVC2.dll to be in PATH when running
the program. Not sure what to do about that yet. We might need an installer
for windows to handle that.
- Disable openmp as it's no longer used.
- Stop linking Ws2_32.lib as that hasn't been used for ages.
2014-05-14 17:54:34 +03:00
Laurent Fasnacht
8ff9ea0eee
Wavefront works with parallelism + deblock (still no SAO)
2014-05-14 14:01:26 +02:00
Laurent Fasnacht
38444a81a6
Threads should be put in queue in wait state if we want to add dependencies later
2014-05-14 14:01:25 +02:00
Laurent Fasnacht
e72408249b
Add encoder_state pointer to lcu_order_element, new worker_encoder_state_search_lcu function to run the search stuff on one LCU
2014-05-14 14:01:24 +02:00
Laurent Fasnacht
eb62696461
Fix problems when image dimensions is not a multiple of LCU
2014-05-14 13:27:14 +02:00
Laurent Fasnacht
1ba1683c05
search buffer has to be allocated tile-wise to avoid problems with wavefronts
2014-05-14 13:27:13 +02:00
Laurent Fasnacht
bb86f24000
Take advantage of the new buffers to remove uneeded item assignment
2014-05-14 13:27:13 +02:00
Laurent Fasnacht
6607c9f563
Use new buffers for search
2014-05-14 13:27:12 +02:00
Laurent Fasnacht
c257c4b863
Add const for the buffers
2014-05-14 13:27:12 +02:00
Laurent Fasnacht
1680273e80
Store search borders in a buffer for the whole picture
2014-05-14 13:27:11 +02:00
Laurent Fasnacht
0ceb1469a2
Improve decision about when to split into threads
2014-05-14 13:27:11 +02:00
Laurent Fasnacht
d4a303e7e6
Free jobs as soon as possible
2014-05-14 13:27:09 +02:00
Laurent Fasnacht
63adb54a3d
Add --threads <int> command line parameter
2014-05-14 13:27:09 +02:00
Laurent Fasnacht
e772799d5e
encoder_state_encode uses now the threadqueue
2014-05-14 13:27:08 +02:00
Laurent Fasnacht
baede7f6c4
threadqueue
2014-05-14 13:27:08 +02:00
Laurent Fasnacht
8b7774153f
Add SLEEP() define
2014-05-14 13:27:08 +02:00
Laurent Fasnacht
aac7fc55b1
Remove filter_deblock function, which is not used and somewhat dangerous, since it doesn't take into account specific stuff about subencoders.
2014-05-14 13:27:07 +02:00
Laurent Fasnacht
bc3ca90bdf
Fix tiles when SAO or deblock is enabled.
...
Was broken by previous commit.
2014-05-14 13:27:07 +02:00
Laurent Fasnacht
4815a0604b
Entropy coding sync works without parallelism, without SAO and without deblocking
2014-05-14 13:27:06 +02:00
Laurent Fasnacht
2c2a2528f3
Remove openmp stuff
2014-05-14 13:27:06 +02:00
Ari Koivula
aee9bf2875
Re-add rdo control to transformskip decision.
...
- It got left out when rewriting the function.
2014-05-14 12:39:23 +03:00
Ari Koivula
9147b7acbf
Split residual quantization to separate luma and chroma function.
2014-05-14 11:19:48 +03:00
Ari Koivula
e947bd4c0e
Clean up trskip decision code and remove old code.
...
- You can define structs inside functions! This changes everything!!
- Bitstream changes a little bit compared to old trskip decision. Bdrate
change is insignificant though.
2014-05-13 22:00:04 +03:00
Ari Koivula
a3cdee9ec5
Move new trskip decision to a function.
2014-05-13 21:59:00 +03:00
Ari Koivula
2ff713ccb2
Add new implementation for trskip decision.
2014-05-13 21:57:45 +03:00
Ari Koivula
8b8da6f493
Make luma and chroma use the same quantization function.
...
- Only thing not working was transform skip.
2014-05-13 21:57:23 +03:00
Ari Koivula
f0bfcedba2
Clean up coeff reconstruction code.
2014-05-13 21:56:10 +03:00