hashirama/uvg266

mirror of https://github.com/ultravideo/uvg266.git synced 2024-12-01 05:04:05 +00:00

Author	SHA1	Message	Date
Ari Koivula	67acead4bc	Fix referring over IDR boundary when using --gop This problem resulted in an illegal bitstream with --gop=lp, because it uses IDR's. The --gop=8 would not code IDR pictures, even when told to with -p, which masked this problem. This fix solves the problem with --gop=lp and also prevents references across the intra picture in --gop=8. The intra pictures should be set to IDR in a later fix, or an alternate method of differentiating between IDR and non-IDR intra should be made.	2016-05-27 13:20:53 +03:00
Ari Koivula	a77dc1610e	Refactor encoder_state_remove_refs I needed to debug this, so I rewrote it to make sense. There is an obvious bug with the IDR handling that I left in place to fix in a separate commit.	2016-05-27 13:20:45 +03:00
Eemeli Kallio	b5c05e58e0	Fixed typo in strategyselector.c	2016-05-24 11:04:29 +03:00
Ari Lemmetti	68c6f0f7b8	Enable deblocking for every preset Deblocking adds very little complexity while giving massive coding performance boost	2016-05-17 18:50:31 +03:00
Ari Lemmetti	6a07761b46	Add smp and amp options to presets	2016-05-17 14:26:58 +03:00
Ari Lemmetti	3107a93eaf	Fix avx2 chroma sampling for amp	2016-05-17 14:09:57 +03:00
Ari Koivula	24d0f9f685	Fix usage message for --hash	2016-05-11 15:03:43 +03:00
Ari Koivula	a1c772b696	Merge pull request #136 from MrAsura/cu-split-termination Cu split termination Closes #133.	2016-05-10 17:22:08 +03:00
Jaakko Laitinen	7010526b1d	Removed tabs.	2016-05-10 15:52:44 +03:00
Jaakko Laitinen	a77eb5c874	Fixed type conversion error when parsing cu split termination.	2016-05-10 14:34:46 +03:00
Jaakko Laitinen	0d361d5bc7	Moved cu split termination from a pre-processor to an input parameter.	2016-05-10 14:15:41 +03:00
Ari Koivula	1dbe4eb852	Merge branch 'mv-full'	2016-05-10 13:28:07 +03:00
Ari Koivula	f6a9d237a3	Merge pull request #134 from miimiz/testink_eemeli Strategyselector prints	2016-05-10 13:27:23 +03:00
Eemeli Kallio	8cfeed852c	Added print about SIMD optimizations available and in use to strategyselector.	2016-05-10 12:59:15 +03:00
Ari Koivula	f51a68b6fa	Add different sizes of search window for full search	2016-04-21 15:11:35 +03:00
Ari Lemmetti	efbdc5dade	Utilize registers more efficiently for 8x8 and larger blocks	2016-04-21 13:26:38 +03:00
Ari Lemmetti	192cee95b2	Vectorize vertical filtering	2016-04-21 13:26:38 +03:00
Ari Lemmetti	0be35f72b8	Filter 4 pixels simultaneously in x direction	2016-04-21 13:26:38 +03:00
Ari Lemmetti	10484bda9f	Make strategies out of fractional pixel sample functions	2016-04-21 13:26:38 +03:00
Ari Koivula	28e7548387	Fix bug in full mv search This optimization led to some points not being searched.	2016-04-21 12:03:57 +03:00
Ari Koivula	2576aeee0b	Use merge candidates in full mv search Perform a full search window around every mv candidate and the 0-vector.	2016-04-20 20:47:11 +03:00
Ari Lemmetti	8247faf8e0	Remove 64-bit only instruction to fix 32-bit compilation.	2016-04-19 18:05:11 +03:00
Ari Lemmetti	eb55d6b6b9	Fix writing over boundary.	2016-04-19 16:03:43 +03:00
Ari Lemmetti	bcabc6fadd	Remove pixel blit from strategies. Use memcpy instead.	2016-04-06 18:44:04 +03:00
Ari Lemmetti	2140197ccc	Tidy up coeff blit function and use memcpy again. Give memcpy constants for fixed sizes to enable copying many bytes simultaneously.	2016-04-06 18:03:00 +03:00
Ari Koivula	08b4480d94	Re-add time.h include Include-what-you-use wants to include sys/time.h instead, or if I override it to include time.h it will remove the include completely.	2016-04-02 19:05:16 +03:00
Ari Koivula	61fc3e87ba	Run include-what-you-use fix_includes.py fix_includes.py The includes should make more sense now and not just happen to compile due to headers included from other headers. Used a modified version of IWYU. Modifications were to attribute int8_t and so on to stdint.h instead of sys/types.h and immintrin.h instead of more specific headers. include-what-you-use 0.7 (git:b70df35) based on clang version 3.9.0 (trunk 264728)	2016-04-01 17:46:55 +03:00
Ari Koivula	016810d982	Move COMPILE_ macro to global.h While these are only used for strategies, it's non-intuitive to have to include strategyselector.h in every file under strategies before including anything else.	2016-04-01 17:46:55 +03:00
Ari Koivula	8908d85d66	Change all relative includes to absolute	2016-04-01 17:46:44 +03:00
Ari Koivula	4876879b82	Add IWYU pragmas	2016-03-31 12:33:34 +03:00
Marko Viitanen	41a5f9bbbe	Fix filetime conversion to timespec	2016-03-24 10:08:11 +02:00
Ari Koivula	9139e169fe	Fix unnecessary waiting in main thread The main thread has to wait for the worker threads to finish. The pthread_cond_timedwait call used to accomplish this was given a relative instead of absolute time, which resulted in the call returning immediately, because the time had already passed. This removes the now unnecessary sleeps and fixes the time given to the pthread_cond_timedwait such that it now waits until a job finishes or 100ms have passed.	2016-03-23 22:23:04 +02:00
Ari Koivula	e23ed231fb	Fix race condition with owf and non-square motion partitions The OWF wpp limit code assumed square blocks, and as such did not work correctly when height != width. This changes the relevant code to consider both height and width.	2016-03-22 16:46:38 +02:00
Arttu Ylä-Outinen	d6a3e02f16	Fix calculating reference CU index in inter search Fixes a possible segfault when SMP or AMP blocks are used.	2016-03-22 12:55:58 +02:00
Ari Lemmetti	f4538ab474	Copy pixels more efficiently in lcu recon.	2016-03-18 20:10:03 +02:00
Ari Koivula	5b66578f71	Add kvz_ prefix to md5 functions The non kvz_ symbols were being exported in the static lib, which got caught by Travis tests.	2016-03-18 13:13:35 +02:00
Ari Koivula	4125218cfa	Add --hash=md5 Add md5 through extras/libmd5 taken from HM with BSD license. It's implemented as a generic strategy using the same interface as checksum, so we can write a SIMD version if it seems necessary.	2016-03-18 05:23:57 +02:00
Ari Koivula	883448b8fb	Add --hash parameter Allows decoded picture hash to be selected among none and checksum.	2016-03-18 05:20:15 +02:00
Ari Lemmetti	6d5f8e3aec	Define KVZ_COMPILE_ASM for the correct files. Enables asm strategies again.	2016-03-17 16:21:31 +02:00
Ari Lemmetti	e502292ba8	Remove old function	2016-03-16 20:18:55 +02:00
Ari Lemmetti	c6cc96f5ec	Optimize sao band ddistortion	2016-03-16 20:16:00 +02:00
Ari Lemmetti	ab577f476f	Optimize sao reconstruct color	2016-03-16 20:15:32 +02:00
Ari Lemmetti	48bfddf4ec	Optimize calc sao edge dir	2016-03-16 20:14:50 +02:00
Ari Lemmetti	ba69992941	Optimize sao edge ddistortion	2016-03-16 20:14:19 +02:00
Ari Lemmetti	941b6b3e27	Optimize calc eo cat	2016-03-16 20:13:30 +02:00
Ari Lemmetti	04fbb48a09	Add strategy for avx2. Copy generic functions there.	2016-03-16 20:13:15 +02:00
Ari Lemmetti	4e30a215d8	Create generic strategy for sao.	2016-03-16 20:11:15 +02:00
Ari Koivula	6f431e510c	Comment and tidy threadqueue_worker Carefully avoided making any changes to the logic.	2016-03-14 20:08:04 +02:00
Ari Koivula	1165ae2e1f	Increase --mv-constraint=frametimemargin margin Increase the margin to be 4 luma pixels to every direction.	2016-03-14 16:02:54 +02:00
Arttu Ylä-Outinen	0eda28ced6	Fix Visual Studio warnings Initialization of a struct with addresses of local variables generated warning C4221 in encmain.	2016-03-14 14:12:21 +02:00
Ari Koivula	e91ca74733	Refactor kvz_encode_last_significant_xy	2016-03-10 18:47:16 +02:00
Ari Koivula	1fc0e8076c	Format kvz_encode_last_significant_xy whitespace	2016-03-10 18:17:45 +02:00
Ari Koivula	df9a958ef2	Merge branch 'log2'	2016-03-10 18:16:41 +02:00
Ari Koivula	4112a4364d	Remove g_to_bits table	2016-03-10 15:59:51 +02:00
Ari Koivula	9fcfba637f	Remove duplicated inline functions	2016-03-10 15:28:31 +02:00
Ari Koivula	e27ec2cc53	Add kvz_math.h for common inline math functions Calling it just math.h would have prevented including system math.h.	2016-03-10 15:26:18 +02:00
Ricardo Constantino	c515796a21	Only use version prefix in kvazaar binary Fixes regression since `54f08f2` causing libkvazaar version checks to not work (i.e. pkg-config)	2016-03-09 16:13:59 +00:00
Arttu Ylä-Outinen	54f08f2bdb	Use output of git describe as version.	2016-03-09 15:04:29 +02:00
Ari Koivula	f8edf28161	Fix const qualifier warning Also set the warning to an error in VS.	2016-03-09 14:16:15 +02:00
Ari Koivula	b0c3ece31e	Fix race condition when deblocking is on but SAO is off Already suspected this yesterday, but didn't want to add the code to handle it before confirming that it's actually a problem. It is.	2016-03-09 14:02:46 +02:00
Ari Koivula	1671725c72	Fix non-determinism issue with OWF WPP margin The previous reasoning used deblocking and fractional motion estimation together to arrive at a margin of 4 pixels. This was wrong, and with either of these off, half pixel chroma interpolation could use pixels outside the intended region. Deblocking does not currently affect the margin needed.	2016-03-08 20:18:38 +02:00
Ari Koivula	674bfa14ce	Comment WPP deblocking and SAO I was a bit unclear about exactly what happens and when regarding SAO and deblocking when we do frame-parallel WPP parallelism, so I checked and commented the bits that were unclear to me.	2016-03-08 19:39:04 +02:00
Ari Koivula	aec152c953	Fix OWF mv restriction limit The check was done in regard to the wrong dimension, allowing the access to unfinished parts of the frame when coding multiple frames at the same time.	2016-03-08 17:12:43 +02:00
Ari Koivula	fda103aa7c	Refactor cfg->tiles_width_count and cfg->tiles_height_count Change code everywhere so these actually mean "width count" and not "width count minus one".	2016-03-07 17:29:15 +02:00
Ari Koivula	a350eb3a1e	Fix --tiles to have the correct number of tiles. The tiles_width_count etc. actually mean "count minus one".	2016-03-07 17:24:31 +02:00
Ari Koivula	49ea2d7b7f	Fix --mv-constraint=frametile Option --mv-constraint=frametilemargin was being used instead of frametile.	2016-03-07 16:41:00 +02:00
Ari Koivula	95b8dd99f6	Add --tiles parameter Add new parameter --tiles that accept only uniform split. I considered supporting the syntax of --tiles-width-split for this, but writing --tiles=u2xu2 is just not as intuitive as --tiles=2x2, and there is hardly ever any reason to use anything but uniform split. The more cumbersome --tiles-width-split and --tiles-height-split parameters are still there to allow finer control.	2016-03-07 16:33:51 +02:00
Ari Koivula	fd34dd9bc6	Fix race condition with OWF There was an off-by-one error in the dependency-setting code, which left dependencies unset and caused checksum errors, for example when ref_neg=1 and owf=1.	2016-03-07 13:38:23 +02:00
Ari Koivula	81b439f4da	Optimize starting point selection in tz Avoid checking zero motion vectors multiple times. The merge candidate list often has only one or two candidates, the other being zeroes.	2016-03-04 16:48:46 +02:00
Ari Koivula	2436702c27	Optimize starting point selection in hexbs Avoid checking zero motion vectors multiple times. The merge candidate list often has only one or two candidates, the other being zeroes.	2016-03-04 16:48:12 +02:00
Ari Koivula	5327b59b45	Remove KVZ_PERF_SEARCHPX It's too invasive and we don't really need it.	2016-03-04 16:48:12 +02:00
Arttu Ylä-Outinen	348ac4888b	Fix calc_mode_bits. The CUs left and above the current one would be set to NULL when there was only one CU between the current one and the left or top edge of the frame.	2016-03-04 14:08:35 +02:00
Ari Koivula	86219aa0fc	Fix non-determinism with tiles Earlier fix that fixed the supply side of the cu_array to take tile coordinates into account should have been accompanied with this one that does the same thing to demand side.	2016-03-03 17:39:20 +02:00
Arttu Ylä-Outinen	626b53ce85	Move sao search from encoderstate to sao. Moves sao search from function encoder_state_worker_encode_lcu in encoderstate.c to function kvz_sao_search_lcu in sao.c. Makes functions kvz_init_sao_info, kvz_sao_search_chroma and kvz_sao_search_luma static since they are no longer used outside sao.c.	2016-03-01 14:56:16 +02:00
Ari Koivula	cfa722e448	Reduce parallelism for tiles There is still some race-condition with encoding tiles from multiple frames, so disable this to keep the bitstream deterministic.	2016-02-29 20:20:21 +02:00
Ari Koivula	3dcc0957f8	Deal with impossible mv constraints If 0,0 vector is illegal, it's possible that no legal movement vector is found, in which case a large cost is returned instead. The cost overflowed and there is all sorts of silliness with converting from double to int, but I'm not going to fix all of it because when we remove the doubles it will all get fixed.	2016-02-29 19:18:14 +02:00
Ari Koivula	b1adf1576a	Add --mv-constraint=frametilemargin Add an even stricter motion vector constraint to prevent motion vectors to fractional pixel positions that would need pixels outside the tile.	2016-02-29 19:18:14 +02:00
Ari Koivula	f808cbf608	Allow increased parallelism for tiles When movement vectors are constrained to tiles, only the same tile in previous frame needs to be depended upon.	2016-02-29 14:33:06 +02:00
Ari Koivula	f4ebff12b0	Combine tile mv constraint with OWF mv constraint This also fixes movement vectors in tiles when OWF is on. The OWF mv constraint assumed WPP, so it didn't work with tiles.	2016-02-29 14:33:06 +02:00
Ari Koivula	7981609cd0	Add --mv-constraint=frametile	2016-02-29 14:33:06 +02:00
Ari Koivula	9dbbb7fdbc	Add --mv-constraint argument	2016-02-29 14:33:06 +02:00
Ari Koivula	1be877faf9	Fix chroma reconstruction with tiles An incorrect frame boundary check caused a checksum error, because the chroma reconstruction of the encoder was wrong. The encoder treated horizontal tile boundaries as frame boundaries when the vertical component of the movement vector was a multiple of 8.	2016-02-29 14:32:51 +02:00
Ari Koivula	c0dc490dd1	Fix inter non-determinism with tiles CU data was being copied to the wrong place in the reference frames cu_array, which led to uninitialized data being used as a starting point for motion vector search. Fixes #99.	2016-02-26 17:05:04 +02:00
Ari Koivula	719d72925b	Add loop-input option This option is useful for testing long encodes, as you don't have to find an actual infinite input.	2016-02-18 20:00:55 +02:00
Ari Koivula	d23a5a15f1	Fix overflow in rate control A 32 bit int overflowed after 2^31 bits (2Gb). It will still overflow eventually, after 500 years of outputting 1Gb/s, but by that time, I reckon we will have fixed this properly and it's time to upgrade.	2016-02-18 16:48:21 +02:00
Ari Koivula	eeafe14946	Clean up search initialization Copy lcu explicitly instead of initializing with the same parameters.	2016-02-17 14:57:31 +02:00
Arttu Ylä-Outinen	e5c84c361c	Eliminate a race condition with input thread. Changes communication between the input thread and main thread in encmain.c so that only one of them uses img_in and retval at a time. Fixes a race condition which would sometimes result in a deadlock.	2016-02-17 12:09:19 +02:00
Ari Koivula	c40ede56ad	Allow more frame parallelism in LP-gop Add dependency to the reference frame instead of the previous frame, in order to allow more frames to be encoded in parallel when temporal stepping >1 in LP-gop (such as --gop=lp-g8d4r1t2).	2016-02-05 17:08:24 +02:00
Arttu Ylä-Outinen	40c7198f7d	Add a script for updating README Adds script tools/update_readme.sh for regenerating the "Using Kvazaar" section of README.md from the output of "kvazaar --help".	2016-02-05 16:21:39 +02:00
Arttu Ylä-Outinen	aac5373095	Fix typos in documentation Fixes a few typos in README and command line help.	2016-02-05 16:21:27 +02:00
Ari Koivula	a4915dc547	Update man and README	2016-02-04 14:16:58 +02:00
Ari Koivula	e941e21cd6	Enable errors about non-existing CLI options Set opterr and optind to their normal default values.	2016-02-04 13:48:58 +02:00
Ari Koivula	7a4bf94a52	Add --version and --help Also don't print help by default, because it's too long. Print a shorter usage message instead.	2016-02-04 13:48:48 +02:00
Ari Lemmetti	99e37ec235	Update old pixel type to the current one	2016-01-30 19:33:09 +02:00
Ari Koivula	c76a0951cf	Change version to 0.8.3	2016-01-28 21:21:02 +02:00
Ari Koivula	cb2121b1aa	Double time scale when field coding is used	2016-01-28 21:04:52 +02:00
Ari Koivula	8ad7d2a714	Move interlacing stuff to libkvazaar API This moves the interlacing from CLI code to api->encoder_encode, in order to make it possible to use field coding through the lib API. The field order is now determined per frame, as FFmpeg gives it per frame and it's signaled per frame. As a side effect, the CLI also now prints info from frames instead of fields. While we might want to extend the API in the future to allow printing of more detailed information about fields, for now it's more important that the CLI uses the real lib API. PSNR calculation for interlaced frames disabled until we have a way to avoid deinterlacing the frame when it's not necessary.	2016-01-27 15:29:45 +02:00
Ari Koivula	6952f0fcc6	Refactor interlaced reading Doesn't change the way it works. Just rearranges things so it's easier to see what is going on.	2016-01-26 13:42:41 +02:00
Ari Koivula	a46351efe1	Fix out of bounds error in interlacing When field height was padded to a multiple of 8, yuv_io_extract_field would read outside the buffer.	2016-01-26 13:41:52 +02:00
Arttu Ylä-Outinen	49677810b5	Rename config module to cfg. Prevents a conflict with config.h and src/config.h so that the config.h generated by configure is included in global.h. Fixes problems with large input files on 32-bit systems.	2016-01-25 12:26:46 +02:00
Marko Viitanen	8e6c12b859	Merge branch 'input_reading_thread'	2016-01-25 12:00:03 +02:00
Marko Viitanen	b4a4ce848c	Use field parity for extracting correct fields from the interlaced picture	2016-01-25 10:58:12 +02:00
Marko Viitanen	441ce7728f	Fix for input_read_thread() in the case when interlaced source-scan-type is used	2016-01-25 10:57:51 +02:00
Marko Viitanen	198204a20a	Fix when using --source-scan-type=bff, offset was used for output lines	2016-01-25 10:13:51 +02:00
Ari Koivula	22b8ed43dc	Remove global.h include from kvazaar.h It shouldn't have been put there as it's the lib interface.	2016-01-22 15:23:34 +02:00
Ari Koivula	249c88011e	Fix problem with >2GB input files on 32bit	2016-01-22 15:15:02 +02:00
Ari Koivula	fa1af14637	Fix includes to include global.h first everywhere	2016-01-22 15:07:49 +02:00
Ari Koivula	3bf278529c	Fix interlacing when using lib interface Some flags used for interlacing were set in CLI interface, which meant that interlacing didn't work correctly when used through libkvazaar.	2016-01-22 14:35:20 +02:00
Marko Viitanen	0128ee26e7	Clear img_in pointer after reading it	2016-01-22 14:29:35 +02:00
Marko Viitanen	b5459c1f23	Fixed performance monitoring by adding KVZ_ prefix to GET_TIME	2016-01-22 11:27:25 +02:00
Marko Viitanen	e36237335e	Fixed memory leaks caused by the input handler thread and cleaned up the code	2016-01-22 11:27:25 +02:00
Marko Viitanen	ad9a1f6539	Input thread implementation - Handle input processing in a separate thread to allow main thread more time with thread handling etc - Significant speedup can be seen when run on ultrafast settings and on a system with great number of cores	2016-01-22 11:27:25 +02:00
Ari Koivula	5e734593c0	Add psnr argument to CLI To disable calculation of PSNR for frames, printing 0.0dB instead.	2016-01-21 15:08:34 +02:00
Ari Koivula	9eba3a83cc	Add compiler flag checking to configure	2016-01-20 16:32:34 +00:00
Arttu Ylä-Outinen	d452709795	Fix compiling AVX2 strategies. Option -mavx2 was omitted when compiling AVX2 strategies. This commit moves strategies to convenience libraries so that their compilation flags can be easily set and adds -mavx2 to CFLAGS of the AVX2 library.	2016-01-20 11:04:12 +02:00
Ari Koivula	8060e2f6ec	Delete kvazaar_version.h It's not used anymore.	2016-01-19 20:40:35 +02:00
Ari Lemmetti	44656aeb19	Remove useless calculation	2016-01-19 16:35:16 +02:00
Marko Viitanen	e822c16659	Removed unneeded cpu flags causing compiling to fail on powerpc, closes #121	2016-01-18 08:55:32 +02:00
Ari Koivula	c8c0b4e8e8	Change version number for v0.8.2	2016-01-15 19:42:07 +02:00
Ari Koivula	e2402c0000	Remove kva_api_get versioning. We have soname versioning now, so we should focus on getting that right instead. This also serves as an example of correctly incrementing the lib-version.	2016-01-15 19:39:24 +02:00
Ari Koivula	caf809f26d	Remove scons build scripts Because we are not going to maintain them.	2016-01-15 17:35:35 +02:00
Ari Koivula	15e1110997	Remove reference to Makefile-old Makefile-old was deleted and this reference breaks make dist.	2016-01-15 17:32:54 +02:00
Ari Lemmetti	a9decd2f40	Bump for yet another release	2016-01-14 23:23:11 +02:00
Ari Koivula	7718ac378f	Add fractional FPS support. Now that we put the timing info into the bitstream, the time base must be precisely known. Represent framerate as a fraction and add timing info only if the old floating point framerate was not used. Deprecate cfg->framerate so it can be removed once we get patches to FFmpeg and libav. Add support for (num)/(denom) format to --input-fps.	2016-01-14 22:16:53 +02:00
Ari Lemmetti	a9bd7b9e63	Bump version numbers for release v0.8.0	2016-01-14 20:38:28 +02:00
Ari Lemmetti	b605e3866e	Bye bye Makefile	2016-01-14 20:38:01 +02:00
Marko Viitanen	242edf98ad	Added calculation and writing of VUI num_units_in_tick and time_scale	2016-01-14 15:32:33 +02:00
Ari Lemmetti	daf39e348f	Add dedicated handling for blitting NxN coeffs when N is 4, 8 or 16	2016-01-13 19:27:45 +02:00
Ari Lemmetti	a2fc9920e6	Merge branch 'alternative-satd'	2016-01-13 15:00:43 +02:00
Ari Lemmetti	1ed34f2df8	Add some planar pred optimization for blocks larger than 8x8	2016-01-13 14:50:17 +02:00
Ari Lemmetti	0df88697ff	Copy generic function to AVX2 strategy	2016-01-12 23:51:18 +02:00
Ari Lemmetti	62799a9fc3	Create generic strategy of planar prediction	2016-01-12 23:50:47 +02:00
Ari Lemmetti	3cb1cebfe5	Add missing inlines	2016-01-12 23:03:31 +02:00
Ari Lemmetti	6a0b13b8b6	Remove unused functions	2016-01-12 22:55:37 +02:00
Ari Lemmetti	61155f0edd	Add 128-bit version of the functions as well	2016-01-12 22:52:00 +02:00
Ari Lemmetti	a6afb8a8f4	Small refactoring	2016-01-12 22:29:33 +02:00
Ari Lemmetti	a756f6133a	Manually unroll vertical Hadamard transform	2016-01-12 21:45:02 +02:00
Ari Lemmetti	66350aa20e	Experiment with alternative implementation of FWHT	2016-01-11 16:25:56 +02:00
Arttu Ylä-Outinen	e14858f41a	Fix build and tests. - Remove non-existent file interface_main.c from library sources. - Add file mv_cand_tests.c to test sources.	2015-12-21 16:03:55 +02:00
Arttu Ylä-Outinen	9abdee7cc3	Merge branch 'autotools'	2015-12-21 15:54:30 +02:00
Arttu Ylä-Outinen	eb6fa3d980	Fix exporting functions in library. Rewrites definition of macro KVZ_PUBLIC in kvazaar.h so that KVZ_STATIC_LIB need not be defined when building a static library.	2015-12-21 14:38:59 +02:00
darealshinji	8427a85d36	Add tests	2015-12-19 08:24:35 +01:00
Ari Koivula	1270da3626	Move files under their modules in Visual Studio Also moves CLI stuff under CLI project, so they are compiled as their own lib just like when the Makefile is used. The file interface_main.c was an artifact from a bygone era and should have been deleted long ago.	2015-12-17 15:39:45 +02:00
Ari Koivula	947bae24f9	Update Doxygen documentation Add module information to all header files. Update all header file documentations to briefly say what they are, and to use the javadoc format so the brief actually gets included into the doxygen documentation. Remove \file from implementation files, in order to not repeat the info from the header files. Add files under strategies and tools to Doxygen and update the Doxygen settings to be just plain better. Make README be the main page of Doxygen documentation.	2015-12-17 14:05:50 +02:00
Ari Koivula	a6ea705e19	Add missing lambda to some bit costs Bits were being added to rate distortion without being multiplied by lambda in a few places. Fixing this bug also finally allows us to remove the magic bits from the Coding Unit split decision. I tried to find new optimum value for CU_COST and it turned out to be 2 for veryslow and 0 for superfast. The difference between 0 and 2 on veryslow was only 0.1% however, so I don't think this parameter is needed any longer. Before this fix the effect of removing CU_COST would have been 0.8%.	2015-12-15 16:32:38 +02:00
Arttu Ylä-Outinen	0e33049d9e	Enable full mv search once again. - Updates function search_mv_full so that it compiles and handles non-square blocks. - Enables compilation of search_mv_full. - Sets full search radius to 32. - Enables selecting full mv search with "--me full".	2015-12-15 12:26:26 +02:00
Arttu Ylä-Outinen	dbb9b0df85	Enable search for AMP blocks.	2015-12-15 11:21:46 +02:00
Arttu Ylä-Outinen	7e4f4538a4	Implement encoding AMP part modes. Also adds parameter --amp for enabling AMP blocks.	2015-12-15 11:21:45 +02:00
Arttu Ylä-Outinen	c3716f7803	Add --smp option for enabling SMP blocks.	2015-12-15 11:21:45 +02:00
Arttu Ylä-Outinen	38b881c36f	Implement search_frac for rectangular blocks. Replaces parameter depth of function search_frac with parameters width and height.	2015-12-15 11:21:45 +02:00
Arttu Ylä-Outinen	864c77f6eb	Use kvz_satd_any_size in inter search. Changes search_frac and kvz_search_cu_iter to use kvz_satd_any_size for computing the SATDs instead of getting the SATD function with kvz_pixels_get_satd_func.	2015-12-15 11:21:45 +02:00
Arttu Ylä-Outinen	056fa09ba5	Add arbitrary-sized SATD functions. Adds strategy satd_any_size for generic and AVX2. The satd_any_size functions are implemented with macro SATD_ANY_SIZE defined in strategies-picture.h.	2015-12-15 11:21:45 +02:00
Arttu Ylä-Outinen	6bdc08b6eb	Drop unused function declaration. Removes declaration of non-existent function satd_8bit_8x8_generic in strategyselector.h.	2015-12-15 11:21:44 +02:00
Arttu Ylä-Outinen	728a6abecc	Extract macro SATD_NxN. Combines definitions of macros SATD_NXN and SATD_NXN_AVX2 to macro SATD_NxN and moves it to strategies-picture.h.	2015-12-15 11:21:44 +02:00
Arttu Ylä-Outinen	1eebfde0c5	Make tz search work with non-square blocks. Replaces parameter depth with parameters width and height.	2015-12-15 11:21:44 +02:00
Arttu Ylä-Outinen	e203883f3d	Refactor kvz_filter_deblock_lcu. Moves code for filtering the rightmost 4 pixels of an LCU to a separate function filter_deblock_lcu_rightmost.	2015-12-15 11:21:44 +02:00
Arttu Ylä-Outinen	21ca74fe86	Replace deblock filter with a simple loop. - Adds function is_pu_boundary. - Moves code for filtering an edge of a single PU or TU to a new function filter_deblock_unit. - Replaces recursive CU tree traversal in filter_deblock_cu with a simple loop and renames it to filter_deblock_lcu_inside.	2015-12-15 11:21:43 +02:00
Arttu Ylä-Outinen	7516fda970	Make fractional recon work with non-square blocks. Adds parameter block_height to functions inter_recon_frac_luma, inter_recon_14bit_frac_luma and inter_recon_14bit_frac_chroma so that they can handle SMP blocks.	2015-12-15 11:21:43 +02:00
Arttu Ylä-Outinen	591a1ce6db	Turn some inter recon functions static. Makes the following functions static since they are not used outside inter.c: - kvz_inter_recon_frac_luma - kvz_inter_recon_14bit_frac_luma - kvz_inter_recon_frac_chroma - kvz_inter_recon_14bit_frac_chroma	2015-12-15 11:21:43 +02:00
Arttu Ylä-Outinen	0f531362bf	Enable Nx2N partitions.	2015-12-15 11:21:43 +02:00
Arttu Ylä-Outinen	4402e251ae	Fix kvz_get_extended_block functions. The buffers allocated in functions kvz_get_extended_block_avx2 and kvz_get_extended_block_generic were too small when the width of the block was less than its height. Fixed to allocate correctly sized buffers.	2015-12-15 11:21:43 +02:00
Arttu Ylä-Outinen	bdd8b1c0aa	Implement 2NxN partitions in inter search. - Try using 2NxN partitions after the usual 2Nx2N. - Adds function kvz_search_cu_smp to search_inter module.	2015-12-15 11:21:42 +02:00
Arttu Ylä-Outinen	410064e880	Split lcu_set_inter into two functions. Moves code for setting the inter modes for a single PU to a new function lcu_set_inter_pu.	2015-12-15 11:21:42 +02:00
Arttu Ylä-Outinen	3236428e4d	Make hexbs search work with non-square blocks. Replaces parameter depth with parameters width and height.	2015-12-15 11:21:42 +02:00
Arttu Ylä-Outinen	31ba8d61c3	Implement fractional chroma recon for SMP blocks. Adds parameter block_height to function kvz_inter_recon_frac_chroma.	2015-12-15 11:21:42 +02:00
Arttu Ylä-Outinen	0b6cef7be5	Remove unused function kvz_inter_set_block.	2015-12-15 11:21:42 +02:00
Arttu Ylä-Outinen	e63486b23f	Make lcu_set_inter work with SMP blocks.	2015-12-15 11:21:41 +02:00
Arttu Ylä-Outinen	7b99eb2970	Call recon functions correctly for SMP blocks. Makes calls to kvz_inter_recon_lcu and kvz_inter_recon_lcu_bipred in function search_cu work correctly when using SMP blocks.	2015-12-15 11:21:41 +02:00
Arttu Ylä-Outinen	dc4525c0e3	Implement inter recon for non-square blocks. Adds parameter height to functions kvz_inter_recon_lcu and kvz_inter_recon_lcu_bipred and makes them work on non-square sizes. Fractional reconstruction functions do not handle non-square blocks yet.	2015-12-15 11:21:41 +02:00
Arttu Ylä-Outinen	f874c8614e	Add part_mode binarization table comment.	2015-12-15 11:21:41 +02:00
Arttu Ylä-Outinen	c77074a7ff	Implement encoding SMP blocks.	2015-12-15 11:21:41 +02:00
Arttu Ylä-Outinen	98707a1288	Move encoding intra CU to a separate function. Moves code for encoding a single intra coding unit from function kvz_encode_coding_tree to a new function encode_intra_coding_unit.	2015-12-15 11:21:40 +02:00
Arttu Ylä-Outinen	c336674da3	Move encoding part mode to a separate function. Moves code for encoding the part mode from function kvz_encode_coding_tree to a new function encode_part_mode.	2015-12-15 11:21:40 +02:00
Arttu Ylä-Outinen	ac952cbb44	Move encoding inter PUs to a separate function. Moves code for encoding a single inter prediction unit from function kvz_encode_coding_tree to function encode_inter_prediction_unit.	2015-12-15 11:21:40 +02:00
Arttu Ylä-Outinen	5ee9f164e8	Add macros for getting PU location and size. - Moves SIZE_* definitions to cu.h. - Adds constant arrays kvz_part_mode_num_parts, kvz_part_mode_offsets and kvz_part_mode_sizes for storing the number of PUs, PU offsets and PU sizes. - Adds macros PU_GET_X, PU_GET_Y, PU_GET_W and PU_GET_H for getting the location and size of a PU.	2015-12-15 11:21:40 +02:00
Arttu Ylä-Outinen	a3df13fb99	Make kvz_inter_get_merge_cand work with SMP blocks. - Replaces parameter depth with parameters width and height. - Adds parameters use_a1 and use_b1 for disabling the use of merge candidates A1 and B1.	2015-12-15 11:21:40 +02:00
Arttu Ylä-Outinen	1cd149fb97	Check merge/mv candidate types earlier. Moves checks for motion vector prediction and merge candidate block types (inter/intra) from functions kvz_inter_get_mv_cand and kvz_inter_get_merge_cand to kvz_inter_get_spatial_merge_candidates.	2015-12-15 11:21:39 +02:00
Arttu Ylä-Outinen	969c91d7c4	Add a test for kvz_inter_get_spatial_merge_candidates.	2015-12-15 11:21:39 +02:00
Arttu Ylä-Outinen	02375bf7e5	Make kvz_inter_get_mv_cand work with SMP blocks. Replaces the depth parameter of kvz_inter_get_mv_cand with parameters width and height.	2015-12-15 11:21:39 +02:00
Ari Koivula	3a80c7de74	Further optimize coefficient coding Remove the need to count the coefficients by populating the significant coefficient group map first and finding the last coefficient from the last group afterward. The speedup is about 2% on ultrafast. The previous version of this patch was reverted due to a bug, which has now been fixed.	2015-12-11 16:47:55 +02:00
Ari Lemmetti	b78460b02c	Optimize another loop	2015-12-11 11:21:43 +02:00
Ari Koivula	b32965925e	Revert "Further optimize coefficient coding" This reverts commit `25462124f8`. That commit broke the bitstream. If it's not good enough to push on Friday night, it's probably not good enough on Monday morning either.	2015-12-07 15:12:04 +02:00
Ari Koivula	865c86fef2	Remove unused variable	2015-12-07 10:32:18 +02:00
Ari Koivula	91631a1c36	Merge branch 'coeff-optimization'	2015-12-07 10:25:46 +02:00
Ari Koivula	25462124f8	Further optimize coefficient coding Remove the need to count the coefficients by populating the significant coefficient group map first and finding the last coefficient from the last group afterward.	2015-12-07 10:23:01 +02:00
Ari Koivula	c94707e6e8	Fix bug with OWF+FME+deblocking Increases the MV safety margin of OWF from 2 to 3 when deblocking is used and 4 when both deblocking and FME are used. Fractional pixel motion estimation can move the vector one more pixel down causing checksum error. This fixes that error by increasing the OWF safety margin and changes the interface, so that different margin can be used when FME or deblocking are not in use.	2015-12-04 15:26:56 +02:00
darealshinji	fe2ff12244	fix building with autotools	2015-12-03 22:41:24 +01:00
darealshinji	b6d3510c2e	pkg-config: move -lm to Libs.private	2015-12-03 22:39:27 +01:00
Ari Lemmetti	6fe223c4dc	Nonzero calculation magic	2015-12-03 18:29:44 +02:00
Ari Lemmetti	f2d8cd4d64	Merge branch 'intra-search-multi'	2015-12-03 17:25:52 +02:00
Ari Lemmetti	c4e1552ef6	Replace original rough intra search	2015-12-03 17:13:11 +02:00
Ari Lemmetti	ee8c2d0218	Add 4x4 dual SATD for AVX2	2015-12-03 17:13:11 +02:00
Ari Lemmetti	00736fa708	Generate larger than 8x8 dual satd functions with macro	2015-12-03 17:13:11 +02:00
Ari Lemmetti	bd3e1922cd	Add AVX2 8x8 dual hadamard transform	2015-12-03 17:13:11 +02:00
Ari Lemmetti	d575b94357	Implement generic functions for dual sad / satd	2015-12-03 17:13:11 +02:00
Ari Lemmetti	183ee53f47	Add alternative version of rough intra search. Calculate two costs simultaneously to exploit larger SIMD registers. Implementation for dual functions missing currently.	2015-12-03 17:12:38 +02:00
darealshinji	8ff28ec974	Make dynamic linking easier	2015-12-01 14:34:08 +01:00
Arttu Ylä-Outinen	21e19067fe	Extract inter search in a single ref frame. Moves code for doing inter search in a single reference frame from function kvz_search_cu_inter to a new function search_cu_inter_ref.	2015-11-18 11:16:27 +02:00
Arttu Ylä-Outinen	f9f3d5929e	Use macros for indexing cu_array in lcu_t. Replaces accesses cu_array with macro calls and adds macros LCU_GET_TOP_RIGHT_CU and CU_GET_CU.	2015-11-18 11:16:27 +02:00
Arttu Ylä-Outinen	8db8f3d523	Use macro SUB_SCU where possible. Replaces expressions like (x & 0x3f) with SUB_SCU(x).	2015-11-18 11:16:26 +02:00
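
Commit 9139e169fe in the log above fixes a `pthread_cond_timedwait` call that was handed a relative time. That function expects an absolute deadline, so a relative value is read as a moment long past and the call returns immediately. The following is a minimal sketch of the corrected pattern, not the project's actual code. It assumes the condition variable uses the default `CLOCK_REALTIME` clock, and the helper names `timespec_add_ms` and `wait_up_to_ms` are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical helper: add a non-negative timeout in milliseconds to a
 * timespec, keeping tv_nsec inside [0, 1e9). */
struct timespec timespec_add_ms(struct timespec t, long ms)
{
  assert(ms >= 0);
  t.tv_sec += ms / 1000;
  t.tv_nsec += (ms % 1000) * 1000000L;
  if (t.tv_nsec >= 1000000000L) {
    t.tv_sec += 1;
    t.tv_nsec -= 1000000000L;
  }
  return t;
}

/* Hypothetical wrapper: wait until the condition is signalled or until
 * `ms` milliseconds have passed. The caller must hold `mutex`.
 * Returns whatever pthread_cond_timedwait returns. */
int wait_up_to_ms(pthread_cond_t *cond, pthread_mutex_t *mutex, long ms)
{
  struct timespec now;
  clock_gettime(CLOCK_REALTIME, &now);

  /* The deadline is "now + timeout", not the bare timeout. */
  struct timespec deadline = timespec_add_ms(now, ms);
  return pthread_cond_timedwait(cond, mutex, &deadline);
}
```

With a helper like this, the 100 ms wait described in the commit message would be written as `wait_up_to_ms(&cond, &mutex, 100)` inside the usual predicate loop.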

1973 commits
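
Commit d23a5a15f1 in the log describes a 32-bit counter in rate control overflowing after 2^31 bits. The sketch below illustrates the general remedy of widening the accumulator to 64 bits. Whether the actual fix used `uint64_t` is an assumption, and `rc_state`, `rc_add_bits` and `rc_seconds_until_overflow` are hypothetical names that do not come from the repository.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical rate-control state holding the running total of bits. */
typedef struct {
  uint64_t total_bits; /* 64-bit, so it does not wrap at 2^31 bits */
} rc_state;

/* Add the size of one coded frame (in bits) to the running total. */
void rc_add_bits(rc_state *rc, uint32_t frame_bits)
{
  assert(rc != NULL);
  rc->total_bits += frame_bits;
}

/* Whole seconds of output at a constant bitrate before a 64-bit
 * counter starting from zero would wrap around. */
uint64_t rc_seconds_until_overflow(uint64_t bits_per_second)
{
  return bits_per_second ? UINT64_MAX / bits_per_second : UINT64_MAX;
}
```

At 1 Gb/s an unsigned 64-bit counter lasts about 1.8e10 seconds, which is several centuries and of the same order as the estimate given in the commit message.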