Commit graph

1838 commits

Author SHA1 Message Date
Marko Viitanen 18d9789fab Cabac context array for inter direction 2015-03-06 16:26:16 +02:00
Ari Koivula 2f79bfebf7 Rename parameter encoder_state to state in all functions.
- It's so widely used that there isn't really a need to emphasize that
  it's the encoder's state. Also, it isn't really the encoder's state,
  but the encoding job's state.
2015-03-04 17:31:07 +02:00
Ari Koivula 14fe1b6648 Rename enum color_index to color_t. 2015-03-04 16:37:35 +02:00
Ari Koivula ded6fd9ee8 Renamed typedef pixel to pixel_t. 2015-03-04 16:35:53 +02:00
Ari Koivula 1f42adb1ea Renamed typedef coefficient to coeff_t. 2015-03-04 16:33:47 +02:00
Ari Koivula fedd05465d Rename struct sao_info to sao_info_t. 2015-03-04 16:32:38 +02:00
Ari Koivula 3d135324da Rename struct threadqueue_queue to threadqueue_queue_t. 2015-03-04 16:30:20 +02:00
Ari Koivula b7fcb800b2 Rename struct threadqueue_job to threadqueue_job_t. 2015-03-04 16:28:56 +02:00
Ari Koivula cf5f240604 Rename struct hardware_flags to hardware_flags_t. 2015-03-04 16:24:59 +02:00
Ari Koivula e7754bb518 Rename struct strategy_to_select to strategy_to_select_t. 2015-03-04 16:24:06 +02:00
Ari Koivula e95b138e62 Rename struct strategy_list to strategy_list_t. 2015-03-04 16:23:04 +02:00
Ari Koivula 95afc5af51 Rename struct strategy to strategy_t. 2015-03-04 16:17:45 +02:00
Ari Koivula db42176a64 Rename struct image_list to image_list_t. 2015-03-04 16:13:57 +02:00
Ari Koivula 7bafd34cfa Remove struct rd_stats. 2015-03-04 14:01:17 +02:00
Ari Koivula fe55961f84 Rename struct image to image_t. 2015-03-04 14:01:17 +02:00
Ari Koivula 5431d0ce19 Rename struct lcu_order_element to lcu_order_element_t. 2015-03-04 14:01:17 +02:00
Ari Koivula 9e64ee3cee Suffix encoder_state_config structs with _t. 2015-03-04 14:01:17 +02:00
Ari Koivula cdb1a25f05 Inline struct me into encoder_control_t. 2015-03-04 14:01:16 +02:00
Ari Koivula e5b18cd536 Inline cu_info_intra and cu_info_inter into cu_info_t. 2015-03-04 14:01:16 +02:00
Ari Koivula a0767a76d2 Rename struct vector2d to vector2d_t. 2015-03-04 14:01:16 +02:00
Ari Koivula 5b12830756 Rename struct config to config_t. 2015-03-04 14:01:16 +02:00
Ari Koivula 1a62fee300 Rename struct cabac_data to cabac_data_t. 2015-03-04 14:01:16 +02:00
Ari Koivula 727fefacc4 Rename struct cabac_ctx to cabac_ctx_t. 2015-03-04 14:01:16 +02:00
Ari Koivula 4bc0308b7e Rename struct bitstream_file to bitstream_file_t. 2015-03-04 14:01:15 +02:00
Ari Koivula d6ec6a618d Rename struct bitstream_mem to bitstream_mem_t. 2015-03-04 14:01:15 +02:00
Ari Koivula 106c9128ad Rename struct bitstream_base to bitstream_base_t. 2015-03-04 14:01:15 +02:00
Ari Koivula 5d8498dc88 Rename struct bit_table to bit_table_t. 2015-03-04 14:01:15 +02:00
Ari Koivula 8cd8240f7a Rename struct bitstream to bitstream_t. 2015-03-04 14:01:15 +02:00
Ari Koivula 7ca688b376 Rename struct videoframe to videoframe_t. 2015-03-04 14:01:15 +02:00
Ari Koivula 63e224574e Rename struct cu_info to cu_info_t. 2015-03-04 14:01:15 +02:00
Ari Koivula f3fab62d33 Rename struct cu_array to cu_array_t. 2015-03-04 14:01:15 +02:00
Ari Koivula 78f0c3a83b Rename struct scaling_list to scaling_list_t. 2015-03-04 14:01:14 +02:00
Ari Koivula f6147b410a Rename struct encoder_control to encoder_control_t.
Conflicts:
	src/encoder_state-geometry.h
	src/encoderstate.h
2015-03-04 14:01:14 +02:00
Ari Koivula b14f89c88f Rename struct encoder_state to encoder_state_t. 2015-03-04 14:00:46 +02:00
Marko Viitanen 890b4c1e20 Modified image handling and QP calculations to support GOP 2015-03-03 12:22:50 +02:00
Marko Viitanen c3d9e0b707 Added testset of data for GOP 2015-03-03 12:22:09 +02:00
Marko Viitanen 34b231378b Modified config and encoder_state structs for GOP 2015-03-03 12:21:45 +02:00
SanteriS b55bfe1729 Merge branch 'master' of https://github.com/ultravideo/kvazaar 2015-02-25 18:15:35 +02:00
SanteriS bef7cae4f8 Merge branch 'master' of https://github.com/ultravideo/kvazaar 2015-02-25 15:29:11 +02:00
SanteriS f478732b4c tz search bugfix 2015-02-25 15:28:45 +02:00
Ari Koivula d7383ccb25 Change license to LGPL.
- Everyone who has contributed code to the project has been asked to license
  their contributions under LGPL and they have agreed.

- COPYING file changed to say LGPLv2.1 instead of GPLv2.

- GPL changed to LGPL in the header of every single file that has a header,
  and a header added to the few that were missing one.

- Also.. Happy new year!
2015-02-25 15:19:05 +02:00
Ari Koivula 3e58e03b56 Select motion compensation search starting point from among merge candidates.
- Greatly reduces bdrate for most sequences.
2015-02-25 12:58:15 +02:00
SanteriS 2f68cf3847 (TZ search) Fixed missing check for owf mode. Added 6 point hexagon search pattern. 2015-02-23 16:59:48 +02:00
Ari Koivula 9865e73b90 Remove NetBSD getopt dependency on unistd.h.
- Remove the $NetBSD header as it wouldn't get updated and is wrong.
2015-02-19 16:26:14 +02:00
Ari Koivula dd54b5ae10 Replace GNU getopt with NetBSD getopt.
- This doesn't compile, but I'm including it to have a version history for
  changes required to make it work.
- We need this to have a getopt implementation on Windows.
- It's necessary to change the implementation to switch from GPL to LGPL.
2015-02-19 16:26:14 +02:00
Ari Koivula c979db7e95 Avoid sorting intra modes unnecessarily. 2015-02-19 16:25:45 +02:00
Ari Koivula 1c2129fdcb Improve sort_modes.
- When encoding with fast enough settings this function can use up to 5%
  of the cpu time, so I tried to optimize it a little bit.
2015-02-19 16:25:38 +02:00
Ari Koivula 5fa6438b25 Clean up calls to memset.
- Replaces all calls to memset with new FILL and FILL_ARRAY macros. The use
  of memset was inconsistent and we never use it for anything complicated.
2015-02-19 16:25:28 +02:00
Arttu Ylä-Outinen b6776a8cee Add --vps-period parameter. 2015-02-18 13:55:27 +02:00
SanteriS 1a4d30d15a fixed step 1 of TZ algorithm 2015-02-11 18:51:21 +02:00
SanteriS ce4c251cd1 Merge branch 'master' of https://github.com/ultravideo/kvazaar 2015-02-09 17:29:49 +02:00
Ari Lemmetti 8aea1a0fa9 Updated version string. Fixed dct strategy registration error message. 2015-02-05 14:07:26 +02:00
Ari Lemmetti 7846cf3093 Merge branch 'faster_interpolation' 2015-02-05 13:29:43 +02:00
Ari Lemmetti 7430622038 Copy ipol-generic strategy as a base for avx2 strategy 2015-02-05 13:28:07 +02:00
Ari Lemmetti 8495870df8 Using BIT_DEPTH macro because it is constant 2015-02-05 13:19:54 +02:00
Ari Lemmetti c82adae0c4 Use four tap functions in octpel chroma interpolation 2015-02-04 18:23:57 +02:00
Ari Lemmetti 2f11caeb73 Added generic four tap functions. Use them in halfpel chroma interpolation. 2015-02-04 17:50:12 +02:00
Ari Lemmetti ff456c120a Enabled link time optimizations. Disabled default rules. 2015-02-04 15:19:47 +02:00
SanteriS 50dd59eb21 Added different search patterns for TZ search. 2015-02-02 19:14:45 +02:00
Ari Lemmetti 041d970ece Apply fast clipping also to chroma filtering. 2015-01-29 16:19:04 +02:00
Ari Koivula ff721bab81 Fix possible non-determinism with owf.
- Triggers when owf is on, sao is off and deblocking is on.
2015-01-26 16:02:31 +02:00
Ari Koivula f01cbbb5ca Add --no-signhide parameter. 2015-01-24 21:29:37 +02:00
Ari Koivula 5f24c6b73d Make normal dequant use runtime sign-hiding configuration. 2015-01-24 21:29:25 +02:00
Ari Koivula 1ccb3bd324 Move sign hiding stuff in rdoq to its own function.
- There is some stuff from sign hiding left intermingled with rdoq code,
  but I don't want to change the code too much before testing that I didn't
  break anything.
2015-01-24 21:27:20 +02:00
Ari Koivula 804a3b648b Clean up quantization sign hiding.
- To allow for later configuration at runtime.
2015-01-23 16:03:59 +02:00
Ari Koivula c940ccb549 Fix gcc error.
encmain.c:433:13: error: format ‘%llu’ expects argument of type ‘long long unsigned int’, but argument 4 has type ‘uint64_t’
2015-01-23 15:50:14 +02:00
Ari Koivula 5d16fa6c4f Add VPS every intra frame.
- Just rdo=0 for now. Later this can be extended to be configured separately.
2015-01-22 13:13:23 +02:00
Ari Koivula d685ee86d6 Record total bitstream length correctly when using stdout.
- If the output is not a file, we can't check the size of the file.
2015-01-22 12:29:06 +02:00
Ari Koivula 1b19afc706 Flush output buffer after every frame. 2015-01-22 12:29:06 +02:00
Ari Lemmetti b4aab06073 Added new files in Makefile. 2015-01-21 18:38:09 +02:00
Ari Lemmetti c21351cc12 Added fast clipping function for clamping values to bit depth. 2015-01-21 17:53:06 +02:00
SanteriS 4b3d77aaf2 Enable tz search. 2015-01-21 12:55:00 +02:00
Ari Koivula f86def8ed8 Remove unused variables. 2015-01-20 17:50:19 +02:00
Ari Koivula 8ac66934c0 Clean up NAL header code.
- Use long start code for RADL NAL units if they are the first NAL in the
  access unit.
- Ffmpeg mpegts was complaining about start codes not being present.
  There wasn't anything wrong that I could find though, besides the
  missing intra long start code.
2015-01-20 17:34:59 +02:00
Ari Koivula 81ad583e08 Use the same coeff cost calculation for all rd modes.
- It's not worth it to have these faster approximations for coefficient cost.
2015-01-20 17:34:59 +02:00
Ari Koivula 870171e6ad Fix --rd=0 to actually work. 2015-01-20 17:34:59 +02:00
Ari Lemmetti f037ed580c Improved data layout 2015-01-15 16:31:18 +02:00
Ari Lemmetti 4382c2f088 Added missing -1 to PIXEL_MAX macro 2015-01-15 16:14:07 +02:00
Ari Lemmetti 465f718eeb Move value clipping away from separate loop 2015-01-15 16:14:00 +02:00
Ari Lemmetti 9d12ce21d5 Cleaned luma interpolation, added functions for 8-tap filtering. 2015-01-15 16:13:12 +02:00
Ari Lemmetti 0e56d13b5d Use smaller bit depth for fractional pixel interpolation 2015-01-15 15:00:09 +02:00
Ari Lemmetti cc061b4c3d Added ipol strategy for interpolation filters.
Added initial files for AVX2 and generic strategies.
2015-01-15 14:59:37 +02:00
Ari Lemmetti 73762062b6 Clarified comments a bit 2015-01-15 11:57:19 +02:00
Ari Koivula ab3364afb4 Add skipping of intra search in inter frames for rd=0. 2015-01-15 11:54:35 +02:00
Ari Lemmetti c9f310a6c2 Use pixel type instead of uint8_t 2015-01-15 11:47:00 +02:00
Ari Lemmetti cad5f14372 Fixed compile errors (-Werror) 2015-01-14 18:27:35 +02:00
SanteriS 126569c737 Added first version of TZ search algorithm. 2015-01-14 14:54:09 +02:00
Ari Koivula 660547098a Merge branch 'intra-fast-lcu' 2015-01-14 12:03:12 +02:00
Ari Koivula 01195aecbb Move cu split model to a function. 2015-01-14 11:16:34 +02:00
Ari Koivula 8c89dcfc50 Move mode bit calculation to a function. 2015-01-14 10:44:52 +02:00
Daniel Eneyev 27d79ffae3 workaround for GET_TIME in Mac OS 2015-01-13 17:06:55 +03:00
Ari Koivula fc79c2103e Generalize the fast intra-mode tryout code to work for any depth. 2015-01-12 11:47:21 +02:00
Ari Koivula f1364d297b Fix bug resulting in incorrect bitstream.
- If 64x64 intra PUs were enabled and --rd was less than 2, no intra mode
  search was performed for depth 0 resulting in incorrect bitstream.
2015-01-12 11:16:33 +02:00
Ari Koivula bbae2e8a27 Update usage and readme. 2015-01-12 10:59:28 +02:00
Ari Koivula f4bd322804 Add command line options for prediction unit depth. 2015-01-12 10:40:34 +02:00
Ari Koivula edf2681ea4 Comment functions in search.c. 2015-01-07 14:56:14 +02:00
Ari Koivula 8c1e0b8a7f Tweak owf=auto.
- Twice the required number is too little.
2014-12-10 11:23:51 +02:00
Ari Koivula 129c8e38e0 Set owf default to auto. 2014-12-09 19:00:11 +02:00
Ari Koivula 51b5692121 Rewrite owf=auto code to be more general.
- Change the definition to be a bit more general. The mapping from resolution
  to owf frames stays mostly the same, but should handle weird
  resolutions better.
- Move everything to config module.
- Fix handling of tiles. It had a bug where owf for tiles was always
  threads * 4/3 - 1. Works as intended now.
2014-12-09 19:00:11 +02:00
Ari Koivula 374012ab26 Merge branch 'intraskip' 2014-12-01 17:30:03 +02:00
Ari Lemmetti 24492adb02 Merge branch 'fme_merge' 2014-11-21 15:08:45 +02:00
Ari Koivula 21d221c075 Add fast 64x64 intra test.
- If intra search is not enabled for a depth, try the result from the
  top left CU of the next depth. This seems to give most of the benefit
  of at least 64x64 intra prediction units without costing very much
  in performance.
2014-11-20 17:20:24 +02:00
Ari Lemmetti 4874f2662f Added --subme commandline parameter for fractional pixel motion estimation: 1 == enable (default), 0 == disable. 2014-11-20 14:59:04 +02:00
Ari Lemmetti d5d2e04995 Merge branch 'fme' 2014-11-19 16:40:22 +02:00
Ari Koivula 3ef88dfda5 Add --owf=auto option.
- The optimal value for Overlapping Wave Front (OWF) depends on a bunch of
  variables. Attempt to set the optimal owf value, at least for all intra.
2014-11-18 02:19:40 +02:00
Ari Lemmetti 5a946f24ea Fixed time output formatting. 2014-11-14 16:46:41 +02:00
Ari Lemmetti 56c537e145 Build fixes for MinGW.
threads.h: use windows.h headers for clock stuff on MinGW
strategyselector.c: assert with strlen for MinGW support
2014-11-14 16:46:41 +02:00
Daniel Eneyev 992a98c5c4 If output name is dash - write to stdout 2014-11-13 12:45:53 +03:00
Ari Lemmetti c46b75a0ca Fixed mingw build error. Modified function declaration in getopt.h.
A macro definition adds * in front of __argc and __argv, causing
build error with mingw. Renamed them to argc and argv to prevent this.
2014-10-31 17:40:18 +02:00
Ari Lemmetti 6a12bc406d Load greatest submodule. Fixed loop that occurred during build process. 2014-10-30 15:17:50 +02:00
Ari Lemmetti a64aae7c53 Makefile now compiles tests. Fixed test files. Removed unused stuff. 2014-10-29 15:32:47 +02:00
Ari Koivula 50643eeaf8 Merge pull request #88 from darealshinji/patch-2
version.h is no longer used
2014-10-27 20:23:15 +02:00
darealshinji 81ecef17d7 version.h is no longer used 2014-10-27 18:17:26 +01:00
darealshinji e230fb2eab make it possible to add custom CFLAGS 2014-10-27 17:19:05 +01:00
Ari Lemmetti e93fa54838 Added -lrt to fix undefined references to clock_gettime on some systems 2014-10-23 14:51:28 +03:00
Ari Lemmetti eb7cecc3dd Added .travis.yml for continuous integration. Added env variable to disable AVX2 for Travis (GCC version doesn't support it yet). 2014-10-23 14:20:07 +03:00
Ari Lemmetti 20967cfafe Allow CC to be defined other than gcc. If not defined, use gcc as default. 2014-10-23 13:25:00 +03:00
Ari Koivula fcb6fa6d4b Fix compilation error on PowerPC.
- Need abs from stdlib.
2014-10-21 18:14:32 +03:00
Ari Koivula f6fead6221 Fix crash on inter frames.
- If the bitcost was 0 it would underflow for skip mode. The bitcost is now
  checked before decrementing.
2014-10-21 18:11:39 +03:00
Ari Koivula dfc67b766a Disable rd1 chroma search.
- The bdrate improvement isn't really worth the time it takes, so enable it
  only for rd3 until it can be made faster or better.
2014-10-16 13:59:20 +03:00
Ari Koivula e9b8d9b889 Fix gcc warnings.
- Remove unused variables.
- Change intra prediction functions to take their inputs as const pointers.
- Change intra_get_pred to take two pointers instead of an array of pointers,
  because the warnings got just too exotic.
2014-10-16 13:17:46 +03:00
Ari Koivula 4bac52d9b6 Merge branch 'intra' 2014-10-16 13:11:23 +03:00
Ari Koivula afb9e8c3f4 Remove extra parameter sets. 2014-10-16 12:21:36 +03:00
Ari Koivula 02ec26fcea Try different number of chroma intra modes for different depths.
- And avoid doing extra work if no extra modes are tested for certain depths.
2014-10-16 12:21:36 +03:00
Ari Koivula 3cf5e422e8 Make fast chroma mode search select modes for slower chroma search. 2014-10-16 12:21:36 +03:00
Ari Koivula d12dbd4aa0 Add fast intra chroma mode search. 2014-10-16 12:21:08 +03:00
Ari Koivula 75a137c1e9 Add --cpuid parameter to disable runtime optimizations. 2014-10-16 12:01:36 +03:00
Ari Koivula 3e6023dfb5 Rename search constants and set sane defaults. 2014-10-16 03:08:11 +03:00
Ari Koivula 8a407b0313 Estimate luma and chroma intra mode bits separately.
- Remove cu_info.intra[].cost and bitcost as unnecessary.
- Add luma_mode_bits to complement chroma_mode_bits and remove
  intra_pred_ratecost as unnecessary. Difference is that intra_pred_ratecost
  was more coarse and included chroma mode with the assumption that it would
  be the same as luma.
2014-10-16 03:08:11 +03:00
Ari Koivula c9e212ba92 Add intra chroma mode search.
- Based on full chroma reconstruction so enabled only for --rd=2.
2014-10-16 03:07:50 +03:00
Ari Koivula b32867be2a Remove -lrt from LDFLAGS.
- This might be required on some embedded system, but from what I can see
  all the functions we use from real time extensions are included in libc
  and the program seems to work fine without it.
- It doesn't exist on MinGW or Mac, so I think it's better to remove it
  completely and add it later on any system that actually requires it.
- Related to #85.
2014-10-14 11:48:57 +03:00
Ari Koivula 6f8a976b12 Give ARCH_X86_64 to yasm on Mac.
- Issue raised in #85.
2014-10-14 09:47:56 +03:00
Ari Koivula 55ab08c213 Fix incorrect const qualifiers.
- Change input pointers to const in dct-generic, like they should have been.
- Fixes compilation error on GCC.
2014-10-13 16:57:15 +03:00
Ari Koivula 8a5b24bcbe Remove usages of GCC __attribute__.
- To allow clang to compile, as it doesn't according to #58.
- The target attributes are not needed anymore due to makefile handling
  targeting now.
- The __attribute__((unused)) was used for debugging. I don't know if clang
  supports this attribute or not but it doesn't seem very important so
  I'm removing it just in case.
2014-10-13 16:46:26 +03:00
Ari Koivula 04613bd5b3 Disable GET_TIME on Mac.
- This should fix the Mac version not compiling in issue #85.
2014-10-13 16:22:11 +03:00
Ari Koivula a469c059a5 Take chroma tr-skip bits into account. 2014-10-13 10:48:39 +03:00
Ari Koivula 7a5cf5d865 Add trskip mode cost to fast trskip mode decision. 2014-10-13 10:45:41 +03:00
Ari Koivula f164a5ba79 Add fast transform skip estimation to rough intra search. 2014-10-13 10:42:24 +03:00
Ari Koivula d893a489d6 Fix mingw compilation issue.
strategies/avx2/dct-avx2.c:334:25: error: pasting "g_dct_16" and "[" does
not give a valid preprocessing token

- The [ is not part of the token so compilation failed on mingw GCC 4.9.1.
- Fixes #86.
2014-10-10 16:32:39 +03:00
Ari Koivula 28d1532578 Make rd=1 use cabac for coeff cost estimation. 2014-10-08 12:50:03 +03:00
Ari Koivula cbb2aa75b7 Add macros for adjusting weight of distortion between luma and chroma.
- Everything needs to have a short name because windows has a maximum path
  length limitation that is breaking my testing framework.
2014-10-08 10:31:54 +03:00
Ari Koivula 49ad845c33 Add cabac bits for part_mode. 2014-10-08 10:31:54 +03:00
Ari Koivula b6710e7893 Add cabac bits for cu split flag. 2014-10-08 10:31:54 +03:00
Ari Koivula 38b224cf69 Change rest of cu split search costs to double. 2014-10-08 10:31:54 +03:00
Ari Koivula 17473624d3 Add transform tree bit costs for cbf_luma. 2014-10-08 10:31:54 +03:00
Ari Koivula 3b04d39db4 Take cabac bits into account on transform tree. 2014-10-08 10:31:54 +03:00
Ari Koivula 296f142d9e Retain coded block flag data during transform split search. 2014-10-08 10:31:54 +03:00
Ari Koivula 85dea10f3f Clean up transform split search.
- Remove unnecessary checks and comment.
2014-10-08 10:31:54 +03:00
Ari Koivula e1b801eb6f Add transform tree chroma cbf bits. 2014-10-08 10:31:23 +03:00
Ari Koivula 3868cc7ff1 Fix crash on inter search when --tr-depth-intra is used.
- Transform splits meant for intra modes were used for inter when inter mode
  was chosen, which caused an assert to be triggered if the split transform
  block didn't have any coefficients.
2014-10-03 19:29:06 +03:00
Ari Lemmetti bcf12567d0 Added some comments. 2014-10-03 17:51:58 +03:00
Ari Lemmetti fea517c2ae Misc code cleanup 2014-10-03 17:06:09 +03:00
Ari Lemmetti 85682c3b6a Removed unused transpose functions. 2014-10-03 11:39:31 +03:00
Ari Koivula 8a80845b91 Add chroma to transform split search. 2014-10-03 11:36:57 +03:00
Ari Koivula 51662e1081 Fix differences between cu_rd_cost_luma and rdo_cost_intra. 2014-10-03 11:36:57 +03:00
Ari Koivula bc7d7d5cb6 Add cu_info* as parameter to reconstruction functions.
- This is required so these functions can be used for searching. When NULL
  is given they take the CU from LCU struct as they did previously.

Conflicts:
	src/search.c
2014-10-03 11:36:56 +03:00
Ari Koivula ccc575e2c6 Disable transform tree bits. 2014-10-03 11:36:56 +03:00
Ari Koivula a0ab469c89 Disable rdo_cost_intra. 2014-10-03 11:36:56 +03:00
Ari Koivula c164978e21 Add FULL_CU_SPLIT_SEARCH macro for disabling cu split optimization. 2014-10-03 11:36:56 +03:00
Ari Koivula 549ac96438 Change costs to doubles to avoid rounding intermediate results.
- Helps with debugging.
2014-10-03 11:36:56 +03:00
Ari Koivula e591e89ade Add prediction mode to chroma reconstruction parameters.
- Just like in luma.
2014-10-03 11:36:56 +03:00
Ari Koivula f6272f06fc Unify signature for transform functions.
- Some used block, coeff and some src, dst. Now all signatures are const input
  and non-const output.
2014-10-03 11:21:43 +03:00
Ari Koivula b932cf4b21 Clean up avx2 dct macros. 2014-10-03 11:16:25 +03:00
Ari Koivula 47244a15c3 Merge branch 'dct-optimizations'
Conflicts:
	src/strategies/avx2/dct-avx2.c
	src/strategies/generic/dct-generic.c
2014-10-02 13:45:21 +03:00
Ari Lemmetti 61e1510480 Transform functions in dct-avx2.c are now generated with macros. 2014-10-02 13:24:30 +03:00
Ari Lemmetti 9407610555 Moved DCT / DST matrices to dct-generic.c 2014-10-02 13:24:30 +03:00
Ari Lemmetti 7255112bd8 Added transposed DCT/DST tables. Use them while calculating transforms instead of doing runtime transpose. Added separate functions for DST and IDST. 2014-10-02 13:24:30 +03:00
Ari Lemmetti e7bcb58846 Added 32x32 IDCT 2014-10-02 13:24:30 +03:00
Ari Lemmetti eacf173b7e Added 32x32 DCT for AVX2 2014-10-02 13:24:30 +03:00
Ari Lemmetti d2856a5d40 Added 32x32 transpose 2014-10-02 13:24:30 +03:00
Ari Lemmetti 7a33f08312 Added 16x16 DCT and IDCT for AVX2 2014-10-02 13:24:30 +03:00
Ari Lemmetti d2fe2a5391 Added 16x16 transpose 2014-10-02 13:24:30 +03:00
Ari Lemmetti d6af146a2e Added part of the functions 16x16 DCT needs 2014-10-02 13:24:30 +03:00
Ari Lemmetti aba3acdfff Added AVX2 optimized transforms for 4x4 and 8x8 blocks 2014-10-02 13:24:30 +03:00
Ari Lemmetti 5856f32d81 Fixed incorrect shift values for inverse transforms in generic strategy 2014-10-02 13:24:29 +03:00
Ari Lemmetti 41b032664d First version of 4x4 forward DCT 2014-10-02 13:24:29 +03:00
Ari Koivula 36232619ab Fix broken cabac contexts in wpp.
- Fixes #84.
- The issue was caused by 241b9d6 naively copying the whole struct, which
  contains data other than just the contexts. Rather than reverting the
  change, the struct was refactored to have another struct that contained
  just the contexts.
2014-09-24 01:02:52 +03:00
Ari Koivula 4e052d3f0f Wrap contexts of cabac_data inside cabac_data.ctx struct. 2014-09-24 01:02:37 +03:00
Ari Koivula b339004c4c Rename cabac_state.ctx to cur_ctx. 2014-09-24 01:02:28 +03:00
Ari Koivula 8b8b53fba5 Merge branch 'sao_cabac' 2014-09-22 10:28:30 +03:00
Ari Koivula bfa399c8fc Fix compiler warnings.
- Non-parenthesized parameter in a macro.
- Unused variables.
- Wrong const qualifiers.
- Signed/unsigned comparison.
2014-09-22 10:04:57 +03:00
Marko Viitanen 6f65a9cbbd Improved SAO merge decisions 2014-09-16 10:08:17 +03:00
Marko Viitanen 21df11ba4e Implemented SAO search for both chroma components 2014-09-15 16:07:31 +03:00
Marko Viitanen e8d1140a1a Check SAO band offset for both chroma components and better SAO chroma cabac costs 2014-09-15 16:07:31 +03:00
Marko Viitanen 0c92031e8a SAO merge checking cleanup 2014-09-15 16:07:31 +03:00
Marko Viitanen b274e7adcd Added cabac bit cost calculations to SAO search 2014-09-15 16:07:31 +03:00
Ari Koivula 5f732126c3 Add cabac bit costs float table. 2014-09-15 15:45:43 +03:00
Ari Koivula 0db7d8d20f test cu split cost 2014-09-15 15:42:03 +03:00
Ari Koivula 35b2e6f755 Add missing cabac context for chroma cbf.
- The context was also missing from HM, but has been fixed in HM13.
2014-09-15 15:41:44 +03:00
Ari Koivula 241b9d6adb Simplify cabac context copying.
Conflicts:
	src/context.c
2014-09-15 15:41:44 +03:00
darealshinji 61a414bced reposition colons in usage message to match with the rest 2014-09-15 03:40:18 +02:00
Ari Koivula 3c73892609 Fix transform split search.
- Redo the search with the best mode to make sure the tr_depth parameters are
  correct.
2014-09-11 10:56:53 +03:00
Ari Koivula 46b6b1243b Add --rd=3 mode and enable searching of intra depth 0.
- intra_build_reference_border was overflowing at depth 0 because it uses
  arrays just large enough to accommodate 32x32 transforms, which is the
  biggest transform.
- For similar reasons search_intra_rough doesn't work at depth 0.
- The --rd=3 mode tries all modes with transform search. It also works without
  rough search so it was used to test depth 0 search. If --rd=3 is not on, intra
  split at depth 0 is not searched for.

Conflicts:
	src/search.c
2014-09-11 10:54:41 +03:00
Ari Koivula c5fa824347 Rebase transform split search. 2014-09-08 14:13:59 +03:00
Ari Koivula 79b86ce6e1 Add --tr-depth-intra command line option.
Conflicts:
	src/encoder.c
2014-09-04 13:42:24 +03:00
Marko Viitanen fe236de807 Fixed sps_max_dec_pic_buffering value to include current picture 2014-09-01 10:31:11 +03:00
Marko Viitanen dbcc8d65aa Removed duplicate function from RDOQ 2014-08-28 08:50:01 +03:00
Ari Koivula 931ec7301c Put slice delta QP to bitstream.
- Previously, slice delta QP was always 0. Now if global->QP is changed before
  contexts are set, the delta qp is put to the bitstream, allowing for rough
  frame level rate control.
2014-08-25 16:43:23 +03:00
Ari Koivula 4c3bbd4a35 Rewrite the SConstruct.
- Works with new /strategy/ structure.
- Change architecture selection to use arch= instead of construction target.
2014-08-25 16:43:23 +03:00
Ari Lemmetti f88c3b6f37 Removed unnecessary if (both branches did the same thing) 2014-08-20 11:54:35 +03:00
Laurent Fasnacht f3c311fe1a Fix commit 8502f3d 2014-08-11 15:17:15 +02:00
Laurent Fasnacht f9bffe35a5 Log tile id in sad perf log 2014-08-11 11:57:08 +02:00
Laurent Fasnacht 6a937de9b2 Fix search_cu log 2014-08-11 11:57:08 +02:00
Laurent Fasnacht 8502f3d850 Improve logging 2014-08-11 11:57:07 +02:00
Laurent Fasnacht f1b303a2d2 Fix compilation errors 2014-08-11 09:53:06 +02:00
Ari Lemmetti 47e3bcfb50 Fixed incorrect shift values for inverse transforms in generic strategy 2014-08-07 16:01:30 +03:00
Ari Lemmetti 709520a233 Removed all AVX2 instructions from SATD functions.
-Zero extend macro now returns results in 2 xmm registers instead of one ymm
2014-07-31 13:25:28 +03:00
Ari Lemmetti 0beb278f5b Partial butterfly strategy is now called DCT strategy. Made changes to transform functions in preparation for optimizations.
-Moved fast_forward_dst and fast_inverse_dst to DCT strategies
2014-07-31 13:25:28 +03:00
Ari Lemmetti 6bf63bd171 Added AVX2 strategy for partial butterfly (no optimizations yet) 2014-07-31 13:25:28 +03:00
Ari Lemmetti faccc4f09b Partial butterfly functions now utilize the strategy selector 2014-07-31 13:25:28 +03:00
Ari Koivula c2fac805d7 Give HAVE_ALIGNED_STACK to yasm on windows.
- Linux gets it through some other means but on windows it needs to be
  given explicitly.

- Fixes issue #78.
2014-07-30 16:26:23 +03:00
Ari Koivula 669e99dd7f Improve intra SAD AVX2 intrinsics.
- Moved implementations for different sizes to inline functions that are
  defined using each other, reducing the amount of redundant code.

- Performance of sad_8bit_32x32_avx2 improved by about 10% due to unrolling of
  the loop.
2014-07-25 15:59:55 +03:00
Ari Koivula e00102f0ca Compile asm optimizations only if yasm is present. 2014-07-23 14:57:40 +03:00
Ari Lemmetti 85fb0784e4 Fixed indentation and added some empty lines for readability 2014-07-23 12:32:27 +03:00
Ari Lemmetti bd6e89c1f0 Updated include directories and file names to Makefile 2014-07-22 15:36:54 +03:00
Ari Lemmetti 4f88ebce5a Added comments and made visual studio not to compile x86inc.asm 2014-07-22 15:07:57 +03:00
Ari Koivula cfd3636e08 Move some repetitive SATD asm into a macro.
Conflicts:
	src/strategies/x86_avx/picture_x86.asm
2014-07-22 12:46:39 +03:00
Ari Lemmetti c81639dd09 Removed old unused macro 2014-07-22 11:11:20 +03:00
Ari Lemmetti cf0797cafd Reordered and indented assembly code 2014-07-22 11:07:42 +03:00
Ari Lemmetti fea44c8234 Renaming AVX/asm files
-Split SAD and SATD functions into separate files
2014-07-21 18:02:01 +03:00
Ari Lemmetti a64df6f0d0 Merge branch 'asm'
Conflicts:
	build/kvazaar_lib/kvazaar_lib.vcxproj.filters
	src/Makefile
	src/strategies/strategies-picture.c
2014-07-21 16:41:09 +03:00
Ari Lemmetti 1be2c3aae5 Preparing push to master and misc
-Removed unnecessary <math.h> headers
-Updated AVX/asm optimizations to match the new file hierarchy
-Makefile only compiles .asm files if KVAZAAR_DISABLE_YASM is not set to 1 and TARGET_CPU_ARCH is x86
2014-07-21 12:39:56 +03:00
Ari Koivula a8f7103797 Add AVX2 implementations for sad_8bit_ 8x8, 16x16 and 32x32. 2014-07-18 18:27:30 +03:00
Ari Koivula 3daa5dd1f1 Add sse2 implementation for sad_8bit_4x4. 2014-07-18 18:20:34 +03:00
Ari Koivula f49332c9b8 Add missing includes. 2014-07-18 17:56:15 +03:00
Ari Koivula 291817667f Tidy up the Makefile. 2014-07-18 17:31:18 +03:00
Ari Koivula e241866f43 Compile intrinsic functions with appropriate flags in gcc.
- Remove -march=native as it's no longer necessary for intrinsics to work.
  Closes #77.

- I couldn't test altivec or sse4.1, but sse4.1 compiles so I expect it
  to work.
2014-07-18 17:28:14 +03:00
Ari Koivula 5662621b3c Free threadqueue jobs when they are not needed.
- Also add destroying the mutex when the job is freed.

- This makes Kvazaar no longer acquire thousands of OS handles on Windows.
2014-07-16 16:51:20 +03:00
Ari Lemmetti 1e94262f85 Made AVX asm compatible with the changed system
- x86inc.asm is now located in extras
- Removed unused cpu.asm/h
2014-07-14 18:51:17 +03:00
Ari Lemmetti 683eda1183 Merge branch 'master' into asm
Conflicts:
	build/kvazaar_lib/kvazaar_lib.vcxproj
	build/kvazaar_lib/kvazaar_lib.vcxproj.filters
	src/Makefile
	src/strategies/strategies-picture.c
2014-07-14 16:42:33 +03:00
Ari Lemmetti 7f873e037c Updated Makefile to compile picture_x86.asm 2014-07-14 15:30:08 +03:00
Ari Lemmetti 2169f9ab8c Added AVX asm comments and fixes
-Added vzeroupper to satd macro to prevent AVX-SSE transition penalties in picture_x86.asm
-Fixed the order of registers in zero extend macro in picture_x86.asm
-Fixed SATD checkers test pattern in satd_tests.c
2014-07-14 14:43:36 +03:00
Ari Koivula 5d0df56c94 Move optimizations to their own compilation units according to target.
- This is necessary in order to compile AVX intrinsics correctly in
  Visual Studio. Having everything in their own units should also make
  compiling normal C code with optimizations on easier.

- For now the makefile still relies on GCC __target__ attribute for compiling
  intrinsics.
2014-07-11 17:26:19 +03:00
Ari Koivula f605d6c35b Align intra buffers to 32 bytes for 256 bit SIMD instructions. 2014-07-11 17:26:19 +03:00
Ari Koivula fbd03b706e Reconfigure VS project.
- Moved compilation flag stuff from project file to the abstraction layer.

- Disabled randomized base address as unnecessary.

- Disable stack buffer security check from release.
2014-07-11 17:26:19 +03:00
Laurent Fasnacht 72abc69b3d Measure time for SAD in _DEBUG mode 2014-07-08 11:42:58 +02:00
Laurent Fasnacht 1a318c714d log poc with new_frame 2014-07-08 11:42:19 +02:00
Laurent Fasnacht e64a692780 Add CU type in threadqueue.log 2014-07-08 09:06:31 +02:00
Laurent Fasnacht abfbb7cad3 Fix duplicate type key in threadqueue.log 2014-07-07 11:36:50 +02:00
Laurent Fasnacht 946e3b9651 Log search_cu to threadqueue.log 2014-07-07 10:50:05 +02:00
Laurent Fasnacht f62e571c15 Add missing info to threadqueue.log 2014-07-07 10:49:40 +02:00
Ari Lemmetti 048127c7e3 AVX assembly optimizations improved 2014-07-02 16:57:06 +03:00
Ari Koivula 7ecf78bb70 Use sqrt lambda cost for searches not using SSD.
- Add encoder_state->global->cur_lambda_cost_sqrt.

- Use sqrt lambda for inter search and rough intra search.

- The effect on inter is around 10-20% bdrate. The effect on intra is smaller
  and non-existent when --rd=2 is enabled, as the intra search refinement was
  already done with SSD and correct lambda.
2014-06-26 13:56:38 +03:00
Laurent Fasnacht 1112dca933 Fix compilation issue with assertion disabled 2014-06-26 07:31:37 +02:00
Laurent Fasnacht 9ab9defe67 Bitstream length per frame works again 2014-06-19 10:24:03 +02:00
Laurent Fasnacht 45faadb2c9 Fix bug where the wrong number of frames could be encoded (if one frame takes longer than the others) 2014-06-19 10:24:02 +02:00
Ari Koivula d5a77be4b8 Fix avx detection for gcc.
- GCC doesn't support _xgetbv intrinsic so we have to use inline assembler.
2014-06-18 11:50:17 +03:00
Ari Lemmetti bdef5384ef Added AVX strategy 2014-06-17 16:52:24 +03:00
Ari Koivula d7abe6a7c2 Address compilation warning.
strategyselector.c:170:10: error: ‘__get_cpuid’ is static but used in inline function ‘get_cpuid’ which is not static [-Werror]
   return __get_cpuid(level, eax, ebx, ecx, edx);
2014-06-17 16:26:55 +03:00
Ari Koivula 60ecc6baae Remove unused stuff. 2014-06-17 16:20:01 +03:00
Ari Koivula 7532b789f8 Add -std=gnu99 for gcc.
- std=c99 doesn't work because then struct timespec won't be defined.
2014-06-17 16:15:39 +03:00
Ari Koivula 94bc457b6c Add option to disable fast intra search. 2014-06-17 15:32:05 +03:00
Ari Koivula e27fc875c0 Clean up intra search. 2014-06-17 15:09:12 +03:00
Ari Koivula e4d70ac1ab Use more starting points for smaller blocks in intra search. 2014-06-17 13:28:27 +03:00
Ari Koivula 9911c7553b Avoid unnecessary intra dir searching. 2014-06-17 13:11:35 +03:00
Ari Koivula bd16a55b9b Always check DC and planar intra modes.
- At least one of them is always in predicted modes, but to make sure they
  are both included add them explicitly.
2014-06-17 12:51:15 +03:00
Ari Koivula 70740da123 Add smarter rough intra search.
- Directional intra mode search is done using halving search from the best
  known mode. Starting modes are vertical, horizontal and the 3 diagonal
  modes.

Conflicts:
	src/search.c
2014-06-17 12:33:10 +03:00
Marko Viitanen 0e2fe9e7ff Changed intra search to skip some modes speeding it up 2014-06-17 12:32:29 +03:00
Marko Viitanen a1c3cfe944 Moved intra mode cost calculation to a function
Conflicts:
	src/search.c
2014-06-17 12:32:29 +03:00
Marko Viitanen eb7d46f9ef Modify CU split cost. 2014-06-17 12:30:32 +03:00
Marko Viitanen bfa37b876b Conformance fix: set sps_max_dec_pic_buffering to correct value 2014-06-17 12:30:32 +03:00
Ari Koivula b3c15b8f94 Merge branch 'owf' 2014-06-16 16:07:41 +03:00
Laurent Fasnacht 91de92134f Constrain the search not to go under the LCU below if OWF is enabled 2014-06-16 14:27:56 +02:00
Laurent Fasnacht ef9c2258e9 Fix frame counter and stats 2014-06-16 13:21:52 +02:00
Ari Koivula 153b1ee41f Merge branch 'intra-sad-strategies' 2014-06-16 12:34:37 +03:00
Laurent Fasnacht 84d34c2655 Fix compilation on non-intel 2014-06-16 11:24:02 +02:00
Ari Koivula 3f00592b96 Separate strategyselector debug prints from _DEBUG.
- I only want to see the strategy stuff.
2014-06-16 12:15:19 +03:00
Ari Koivula 1c97a10a6d Move intra SAD and SATD functions under strategies. 2014-06-16 12:13:41 +03:00
Laurent Fasnacht 4b4702819b Also print encoding FPS 2014-06-16 11:10:11 +02:00
Laurent Fasnacht 2347574a8e Fix problems revealed by valgrind 2014-06-16 11:10:09 +02:00
Laurent Fasnacht 28c3f22ba1 Fix possible freeze 2014-06-16 11:03:48 +02:00
Laurent Fasnacht a96c742ad4 Fix depends for wpp+owf 2014-06-16 11:03:47 +02:00
Laurent Fasnacht f99e41d41f Improved CPU time statistics 2014-06-16 11:03:46 +02:00
Laurent Fasnacht 8a33c0a688 Fix recon job for wfrow 2014-06-16 10:55:01 +02:00
Laurent Fasnacht bf6024734a Fix statistics with OWF 2014-06-16 10:55:00 +02:00
Laurent Fasnacht 0522a3d8e5 --owf option 2014-06-16 10:55:00 +02:00
Laurent Fasnacht 47d1ded7b0 Dependencies between frames 2014-06-16 10:54:59 +02:00
Laurent Fasnacht 003d3c504c image_list_copy_contents 2014-06-16 10:54:58 +02:00
Laurent Fasnacht f4187dd10c cu_array data structure 2014-06-16 10:54:57 +02:00
Laurent Fasnacht 3be3fa8d6e Use different processing order depending if we have OWF or not 2014-06-16 10:54:56 +02:00
Laurent Fasnacht c32943f78b OWF 2014-06-16 10:54:56 +02:00
Laurent Fasnacht 490dd15f3d Remove flush between frame 2014-06-16 10:51:33 +02:00
Laurent Fasnacht fddcbabe28 bitstream writing is now a "normal" job in a thread 2014-06-16 10:51:32 +02:00
Laurent Fasnacht ff7143cc24 Assign thread_queue_jobs and move image_free to a more suitable place 2014-06-16 10:51:32 +02:00
Ari Koivula 87ca828a63 Correct intra sad function labels.
- These haven't been 16 bit for a long time.
2014-06-16 10:45:10 +03:00
Ari Koivula fcce6ae823 Fix printing of AVX2 capability. 2014-06-14 01:24:19 +03:00
Ari Koivula a49ba2633a Add OS and CPU detection for AVX2 and AVX. 2014-06-13 16:57:53 +03:00
Ari Koivula 1de102be61 Move strategies to their own compilation units.
- Enforces a little bit more hierarchy. Compilation units are in strategies
  and whatever inline includes they have are in a folder with the same name
  as the strategy.
2014-06-13 15:30:23 +03:00
Ari Koivula aa3549a717 Change SLEEP(0) to SLEEP(10) on Windows.
- This is a workaround for a performance problem on Windows where main thread
  is busy looping.
2014-06-13 12:01:03 +03:00
Laurent Fasnacht 4acadccf89 Only signal the required number of threads 2014-06-13 08:34:59 +02:00
Laurent Fasnacht 70ce7cec20 Remove unnecessary locks by adding threadqueue->queue_running counter 2014-06-13 08:34:58 +02:00
Laurent Fasnacht 7ef34ff5a1 Ability to dump mutex_lock, mutex_unlock and cond_wait timing, if compiled with -D_PTHREAD_DUMP 2014-06-13 08:32:14 +02:00
Laurent Fasnacht 68ad323e84 Tentative fix for race condition 2014-06-12 14:01:33 +02:00
Laurent Fasnacht b194e19708 Tentative fix for deadlock 2014-06-12 12:57:14 +02:00
Laurent Fasnacht b765eca153 Remove unneeded encoder_state_blit_pixels 2014-06-12 11:47:46 +02:00
Laurent Fasnacht da07b8b35d No-copy works (SAO and deblocking enabled) 2014-06-12 11:47:38 +02:00
Laurent Fasnacht 2cc700fab8 No-copy works with --no-sao (deblocking enabled) 2014-06-12 11:47:31 +02:00
Laurent Fasnacht 6b408b5904 No-copy works with --no-sao --no-deblock 2014-06-12 11:47:30 +02:00
Laurent Fasnacht 0dbfa62698 Replace copy of images made for tiles by sub-images (no copy)
- replace width by stride where required in the source code
2014-06-12 11:47:30 +02:00
Laurent Fasnacht b1347efef5 Add checkpoint in sao_reconstruct 2014-06-12 11:47:29 +02:00
Laurent Fasnacht ae4dc4eb44 Fix uninitialized sao_info structure members, which was creating false positive when checkpointing SAO 2014-06-12 11:47:29 +02:00
Laurent Fasnacht f371bdafc3 sao_info checkpoints 2014-06-12 11:47:28 +02:00
Laurent Fasnacht b7fe81c55c Checkpoint in pixels_blit, and avoid doing undefined behaviour when source and destination is the same.
Seems a reasonable point to observe when refactoring, since it's called on most image data.
2014-06-12 11:47:28 +02:00
Laurent Fasnacht da8559fa34 Fix bug in CHECKPOINTS_FINALIZE() when checkpoints are disabled 2014-06-12 11:47:27 +02:00
Laurent Fasnacht 14df6de0d0 Checkpoint on frame checksum 2014-06-12 11:47:00 +02:00
Laurent Fasnacht 22df7cf98b Use an assert instead of a dumb assignment 2014-06-12 11:47:00 +02:00
Laurent Fasnacht cf123e317f Code to checkpoint cu_info and lcu_t 2014-06-12 11:47:00 +02:00
Ari Koivula ea830d3dd2 Add warning for VLAs in Makefile. 2014-06-12 09:57:08 +03:00
Ari Koivula 443f2f00aa Fix compilation for VS.
- VS2013 does not support variable length arrays.
2014-06-11 17:51:55 +03:00
Laurent Fasnacht 87ed365053 typo fix 2014-06-11 10:29:05 +02:00
Laurent Fasnacht 6ca30367f9 Fix POC bug 2014-06-11 10:29:05 +02:00
Laurent Fasnacht 8437229885 Fix handling of cu_arrays 2014-06-11 10:29:04 +02:00
Laurent Fasnacht e1d9cb015a Basic checkpointing system 2014-06-11 10:29:03 +02:00
Laurent Fasnacht 27a49d287d Big refactor to use videoframe, image_list, and image instead of picture* 2014-06-10 09:19:06 +02:00
Laurent Fasnacht 530faf3951 Move video frame related stuff to videoframe 2014-06-05 14:08:31 +02:00
Laurent Fasnacht 0fac77f9eb Image now in separate module 2014-06-05 14:04:12 +02:00
Laurent Fasnacht 2456c65822 Replace accesses to picture->cu_array with picture_get_cu and picture_get_cu_const 2014-06-05 10:41:58 +02:00
Laurent Fasnacht 821b71910b Move picture_list to its own module 2014-06-05 09:49:24 +02:00
Laurent Fasnacht 7372f9244d Basic infrastructure for OWF 2014-06-05 09:09:25 +02:00
Laurent Fasnacht 16e3a58359 Performance improvement 2014-06-05 06:57:51 +02:00
Laurent Fasnacht bad6d45e5f Performance improvement 2014-06-05 06:57:51 +02:00
Laurent Fasnacht aad2089fcf Use -ftree-vectorize 2014-06-05 06:57:50 +02:00
Laurent Fasnacht ea04bcd6a4 AltiVec support for SAD 2014-06-05 06:57:34 +02:00
Ari Koivula 3a7147baf4 Merge branch 't-20140602' 2014-06-04 18:11:15 +03:00
Ari Koivula 31b1bbc215 Address implicit declaration of warnings. 2014-06-04 18:00:50 +03:00
Ari Koivula 4f5c87fc5e Remove duplicate function definition. 2014-06-04 17:56:05 +03:00
Ari Koivula cb7d7f9e15 Update Makefile. 2014-06-04 17:52:28 +03:00
Ari Koivula bb47534b88 Make encoder_state .c files their own compilation units.
- It's good that this module has been chopped to smaller pieces, but let's
  avoid including .c files unless we really have to. These make pretty good
  submodules on their own so just make them their own compilation units.

- Move some stuff around to avoid having to forward declare them
  in encoderstate.c.
2014-06-04 17:45:18 +03:00
Ari Lemmetti 9e649a8f38 Updated usage message 2014-06-04 15:23:27 +03:00
Laurent Fasnacht b8acdc784a Fix compilation of encoder.c with -D_DEBUG 2014-06-03 15:02:14 +02:00
Laurent Fasnacht 961da05235 Split encoderstate.c in multiple files 2014-06-03 14:47:49 +02:00
Laurent Fasnacht 3d07f8cc84 encoderstate refactor 2014-06-03 14:25:16 +02:00
Laurent Fasnacht 2e821b79a9 encoder_state is now in encoder_state.[ch] 2014-06-03 13:51:30 +02:00
Laurent Fasnacht 9bdecbe071 Better thread scheduling 2014-06-03 11:39:16 +02:00
Laurent Fasnacht 0811dbcfbe Remove unneeded cond_broadcast. Limit contention 2014-06-03 09:45:17 +02:00
Laurent Fasnacht 5ee1319c08 Altivec detection 2014-06-03 07:55:39 +02:00
Laurent Fasnacht 58ad3b4d26 Log more performance data, also plot how many threads are running 2014-06-03 07:42:22 +02:00
Laurent Fasnacht 5ed69b063b Strategy selector for array_checksum, basic implementation using precomputed 256*256 block with larger accesses than byte 2014-06-03 07:42:22 +02:00
Ari Koivula a483e8cb0f Move cpuid stuff away from compiler namespace.
Conflicts:
	src/strategyselector.c
2014-05-30 10:08:14 +03:00
Marko Viitanen 6a72f87028 Merge commit '792a5a5dd1946a327f22b2daba05c6645dfa8037' 2014-05-30 08:47:01 +03:00
Marko Viitanen 792a5a5dd1 Small fix for __get_cpuid() 2014-05-30 08:37:03 +03:00
Laurent Fasnacht 642564b6fb Remove unused variable 2014-05-28 15:04:45 +02:00
Laurent Fasnacht 4f86919d75 Get rid of assembly cpuid for x86, compilation works for powerpc 2014-05-28 15:04:00 +02:00
Ari Koivula e585da37e5 Give correct transform depth to RDOQ.
Conflicts:
	src/search.c
2014-05-28 15:47:49 +03:00
Ari Koivula dceb3da9b8 Fix bug in search relating to transform with no non-zero coefficients.
- Because cost was calculated even though there were no coefficients, these
  very good modes were less likely to be selected.

- Added assert to encode_coeff_nxn to avoid these problems in the future.
2014-05-28 15:22:18 +03:00
Ari Koivula ddc02cc09e Avoid regenerating reference pixels for every rdo mode. 2014-05-22 13:18:28 +03:00
Ari Koivula dbe13d0cba Separate sad intra search from rdo search. 2014-05-22 12:47:45 +03:00
Ari Koivula 19ce21e07c Split final cost to luma and chroma functions. 2014-05-22 09:45:00 +03:00
Ari Koivula a6962e2974 Separate intra transform coding to luma and chroma functions. 2014-05-22 09:40:34 +03:00
Laurent Fasnacht 3a30a886fc FREE_POINTER of job->rdepends was at the wrong place (memory leak) 2014-05-22 07:15:18 +02:00
Laurent Fasnacht 3b38777b71 Fix condition depending on uninitialized value in SAO 2014-05-21 16:33:24 +02:00
Laurent Fasnacht 66e730ba94 Fix encoder_state_init, which was making out of bound reads 2014-05-21 14:23:36 +02:00
Laurent Fasnacht 37c20b8ce5 Add dependency between SAO rows 2014-05-21 13:52:56 +02:00
Laurent Fasnacht 90f46dc56f Threadqueue has now a start index to the first queue job. It improves the speed a little 2014-05-21 12:02:55 +02:00
Laurent Fasnacht f4f9093cb5 Parallel SAO 2014-05-21 11:48:29 +02:00
Laurent Fasnacht a3fcb141ed lcu_order_element now has pointer to neighbor LCUs 2014-05-21 11:06:53 +02:00
Ari Koivula de76d0a294 Don't add dependency to the above LCU in wavefront if it's not necessary.
- The top-right LCU already has dependency to the top LCU.
2014-05-20 10:48:19 +03:00
Laurent Fasnacht bdc2d43180 Write bitstream directly after doing the search. This is required since we need the correct entropy status for wpp 2014-05-20 09:29:01 +02:00
Laurent Fasnacht 06532292fc Wavefront are in tile coordinates 2014-05-20 09:28:58 +02:00
Ari Koivula 4751a3744b Fix intra mode search not doing boundary smoothing for DC.
- Move the boundary smoothing to the prediction function to make sure it's not
  forgotten.
2014-05-19 16:23:17 +03:00
Ari Koivula f9a603e4ea Move intra mode search from intra module to search module.
- Make the actual intra prediction function global.

- Move the rdo stuff to rdo module.
2014-05-19 16:12:02 +03:00
Ari Koivula 1da94f2085 Stop deblocking from filtering edges not on 8x8 grid. 2014-05-19 15:58:54 +03:00
Ari Koivula 2224e18a46 Make deblocking work with transform splits.
- It used to work only with the implicit transform split from LCU size.
2014-05-19 15:58:54 +03:00
Ari Koivula 656b0a321b Add chroma mode to lcu_set_intra_mode.
- This is needed for intra split.
2014-05-19 15:58:54 +03:00
Ari Koivula 921f58b249 Add tr_split to lcu_set_intra_mode. 2014-05-19 15:58:54 +03:00
Ari Koivula 846b608125 Add transform split recursion to intra reconstruction. 2014-05-19 15:58:54 +03:00
Ari Koivula 63f6cad5a0 Include global.h in thread modules. 2014-05-19 15:58:16 +03:00
Ari Koivula 551b087b47 Remove bunch of unnecessary code from encode_transform_unit.
- Really, it's useless. Selecting scan order isn't this hard.

- Checked from HM that ctx_idx doesn't have anything to do with contexts.
2014-05-16 17:42:40 +03:00
Ari Koivula f73bef0941 Remove unused include. 2014-05-16 16:09:59 +03:00
Laurent Fasnacht 6fdb821b14 Fix memory leaks 2014-05-16 12:20:40 +02:00
Laurent Fasnacht d4a6aed471 Multi-row jobs 2014-05-16 12:20:40 +02:00
Marko Viitanen 94285fbed7 Fixed compiling on visual studio with _DEBUG defined 2014-05-16 12:22:06 +03:00
Marko Viitanen 86155ef1ba Added windows specific timing macros for thread debugging 2014-05-16 12:16:22 +03:00
Laurent Fasnacht 36945e89ce Stubs to be able to make a portable version of the profiling 2014-05-16 10:15:05 +02:00
Laurent Fasnacht 53b0835316 Improve handling of jobs when not using threads 2014-05-16 08:50:43 +02:00
Laurent Fasnacht 519750d630 Write bitstream of a wavefront in a parallel way 2014-05-16 08:50:42 +02:00
Laurent Fasnacht 7473ac1bfc Able to log time in a simple way 2014-05-16 08:50:42 +02:00
Laurent Fasnacht 86e01284b8 Add -lrt 2014-05-16 08:48:54 +02:00
Laurent Fasnacht 4f73a7fc91 Instrument threads in order to be able to do some visualization 2014-05-16 08:44:32 +02:00
Ari Koivula a7cd31d87b Update the names of some bins to the current spec.
- Helps with debugging.
2014-05-16 05:44:03 +03:00
Ari Koivula ab4041c8fc Change cabac debug statements to show information better.
- Show the number of bits when encoding multiple bins. I would like just the
  bits themselves in string form, but that's too much trouble for this.

- Print them as unsigned and coerce them to unsigned, as they are going to
  get coerced to unsigned by the function call anyway.

- Change state to be less verbose.
2014-05-16 05:44:03 +03:00
Ari Koivula c9a8756fbd Fix NxN scan mode for lcu_get_final_cost.
- Scan mode was always selected according to the first PU mode.
2014-05-15 16:20:35 +03:00
Marko Viitanen b08047cce9 Fixed intra chroma mode selection 2014-05-15 09:50:05 +03:00
Tapio Katajisto 4d879945b2 Fixed cost calculations in fme 2014-05-15 03:42:42 +00:00
Ari Koivula f0e990905e Remove chroma mode "36".
- It's an unnecessary chore to handle this special case everywhere (it means
  chroma_mode == intra_mode). Better just to use the actual mode.
2014-05-14 19:56:35 +03:00
Ari Koivula 60a0ba4280 Update VS project files to link win32-pthread.
- I haven't found a good way of including external dependencies to VS projects
  yet. Win32-pthreads is assumed to be found at the same level as kvazaar dir
  and has the files x86/pthreadVC2.lib and x64/pthreadVC2.lib.

- Win32-pthreads also requires the pthreadVC2.dll to be in PATH when running
  the program. Not sure what to do about that yet. We might need an installer
  for windows to handle that.

- Disable openmp as it's no longer used.

- Stop linking Ws2_32.lib as that hasn't been used for ages.
2014-05-14 17:54:34 +03:00
Laurent Fasnacht 8ff9ea0eee Wavefront works with parallelism + deblock (still no SAO) 2014-05-14 14:01:26 +02:00
Laurent Fasnacht 38444a81a6 Threads should be put in queue in wait state if we want to add dependencies later 2014-05-14 14:01:25 +02:00
Laurent Fasnacht e72408249b Add encoder_state pointer to lcu_order_element, new worker_encoder_state_search_lcu function to run the search stuff on one LCU 2014-05-14 14:01:24 +02:00
Laurent Fasnacht eb62696461 Fix problems when image dimensions is not a multiple of LCU 2014-05-14 13:27:14 +02:00
Laurent Fasnacht 1ba1683c05 search buffer has to be allocated tile-wise to avoid problems with wavefronts 2014-05-14 13:27:13 +02:00
Laurent Fasnacht bb86f24000 Take advantage of the new buffers to remove unneeded item assignment 2014-05-14 13:27:13 +02:00
Laurent Fasnacht 6607c9f563 Use new buffers for search 2014-05-14 13:27:12 +02:00
Laurent Fasnacht c257c4b863 Add const for the buffers 2014-05-14 13:27:12 +02:00
Laurent Fasnacht 1680273e80 Store search borders in a buffer for the whole picture 2014-05-14 13:27:11 +02:00
Laurent Fasnacht 0ceb1469a2 Improve decision about when to split into threads 2014-05-14 13:27:11 +02:00
Laurent Fasnacht d4a303e7e6 Free jobs as soon as possible 2014-05-14 13:27:09 +02:00
Laurent Fasnacht 63adb54a3d Add --threads <int> command line parameter 2014-05-14 13:27:09 +02:00
Laurent Fasnacht e772799d5e encoder_state_encode uses now the threadqueue 2014-05-14 13:27:08 +02:00
Laurent Fasnacht baede7f6c4 threadqueue 2014-05-14 13:27:08 +02:00
Laurent Fasnacht 8b7774153f Add SLEEP() define 2014-05-14 13:27:08 +02:00
Laurent Fasnacht aac7fc55b1 Remove filter_deblock function, which is not used and somewhat dangerous, since it doesn't take into account specific stuff about subencoders. 2014-05-14 13:27:07 +02:00
Laurent Fasnacht bc3ca90bdf Fix tiles when SAO or deblock is enabled.
Was broken by previous commit.
2014-05-14 13:27:07 +02:00
Laurent Fasnacht 4815a0604b Entropy coding sync works without parallelism, without SAO and without deblocking 2014-05-14 13:27:06 +02:00
Laurent Fasnacht 2c2a2528f3 Remove openmp stuff 2014-05-14 13:27:06 +02:00
Ari Koivula aee9bf2875 Re-add rdo control to transformskip decision.
- It got left out when rewriting the function.
2014-05-14 12:39:23 +03:00
Ari Koivula 9147b7acbf Split residual quantization to separate luma and chroma function. 2014-05-14 11:19:48 +03:00
Tapio Katajisto cc92cfee18 Added few warnings to Makefile
Cleaned fme code a bit
2014-05-14 01:49:34 +00:00
Tapio Katajisto efc43c8b3a Added fractional pixel motion estimation
Added fractional mv support for inter recon

Added 1/8-pel chroma and 1/4-pel luma interpolation
2014-05-14 01:42:02 +00:00
Ari Koivula e947bd4c0e Clean up trskip decision code and remove old code.
- You can define structs inside functions! This changes everything!!

- Bitstream changes a little bit compared to old trskip decision. Bdrate
  change is insignificant though.
2014-05-13 22:00:04 +03:00
Ari Koivula a3cdee9ec5 Move new trskip decision to a function. 2014-05-13 21:59:00 +03:00
Ari Koivula 2ff713ccb2 Add new implementation for trskip decision. 2014-05-13 21:57:45 +03:00
Ari Koivula 8b8da6f493 Make luma and chroma use the same quantization function.
- Only thing not working was transform skip.
2014-05-13 21:57:23 +03:00
Ari Koivula f0bfcedba2 Clean up coeff reconstruction code. 2014-05-13 21:56:10 +03:00
Ari Koivula 0c65a9b658 Remove abs_sum from coeff quantization.
- It's meant for checking if there are any coefficients, but we don't use it
  and it's annoying to remember to initialize it and pass it around. The
  benefit should be quite small anyway.
2014-05-13 21:54:34 +03:00
Ari Koivula 75042fc65d Move luma quantization to its own function. 2014-05-13 21:34:06 +03:00
Ari Koivula ba3aaf3189 Expand chroma functions to parent function.
- This was done so that making the function work with luma would be easier.
2014-05-13 21:30:14 +03:00
Ari Koivula 637aceb495 Add TR_MAX_WIDTH.
- Max transform size is constrained by but independent of LCU size.

- Luma and chroma now have the same stride for transform arrays.
2014-05-13 21:22:40 +03:00
Ari Koivula 1c38209cab Add missing include. 2014-05-13 09:33:05 +03:00
Ari Koivula 13577562e5 Revert change to definition of LCU_WIDTH. 2014-05-13 09:28:01 +03:00
Ari Koivula fb763f7940 Move coefficient generation functions from encoder.c to transform.c.
- These functions probably should have been there to begin with.
2014-05-12 11:37:39 +03:00
Ari Koivula a3478ecd20 Move transform skip decision to its own function. 2014-05-12 11:18:27 +03:00
Ari Koivula d9b890de6e Remove redundant variables.
- Redefine LCU_WIDTH to be 64. Stuff will break horribly if it's
  anything else anyway.

- Add LCU_WIDTH_C for chroma LCU width. It should be more readable than the
  constant (LCU_WIDTH >> 1).
2014-05-12 10:58:07 +03:00
Ari Koivula 59e0e98523 Separate luma and chroma coefficient generation variables. 2014-05-12 10:38:24 +03:00
Ari Koivula 0ca65e7606 Move chroma coefficient generation to its own function.
- It's time to chop up this monster that is encode_transform_tree.
2014-05-12 10:24:06 +03:00
Ari Koivula 3c3c9a26c6 Move scan order selection to a function. 2014-05-12 08:47:16 +03:00
Ari Koivula 623d9001a8 Reorder chroma coefficient generation. 2014-05-12 08:47:16 +03:00
Ari Koivula 93141c7d2e Avoid unnecessary copying of predicted pixels when there are no coeffs.
- These are probably from a time when reconstruction happened in this
  function.
2014-05-09 16:39:58 +03:00
Ari Koivula 27ab882c25 Clean up coefficient generation. 2014-05-09 16:33:10 +03:00
Ari Koivula ce945ab4ef Handle coefficient initialization better.
- Coefficients are no longer required to be pre-zeroed. The resulting zeroes
  are copied in even in the case where we already know they are all zeroes.

- Move cbf clearing code to only happen at the leaves of the recursion.
2014-05-09 16:30:28 +03:00
Laurent Fasnacht b274558139 Refactor and fix entry_points functions.
Seems to be OK with HM now
2014-05-09 12:42:37 +02:00
Laurent Fasnacht 43b5f84c0d Fix sao_calc_edge_block_dims
It was computing wrong dimensions, which was causing out-of-bounds reads in sao_reconstruct.
2014-05-09 10:30:34 +02:00
Laurent Fasnacht 3f975e92cd Replace line fixing symptoms by assertions, to reveal the cause 2014-05-09 08:24:03 +02:00
Laurent Fasnacht 4dbf7c7a52 Fix blit dimensions in sao_search_best_mode 2014-05-09 08:24:02 +02:00
Ari Koivula cb5d7e6541 Fix compilation for VS2010. 2014-05-08 17:28:12 +03:00
Laurent Fasnacht 0452806ec4 Entry points 2014-05-08 15:04:56 +02:00
Laurent Fasnacht da588af2ba Partial support for wavefront 2014-05-08 15:04:55 +02:00
Laurent Fasnacht 4de5660254 Fix missing offset in LCU range computation for wavefronts 2014-05-08 15:04:55 +02:00
Laurent Fasnacht dc34a5eac6 LCU borders 2014-05-08 15:04:54 +02:00
Laurent Fasnacht 24f4a8cad1 Wavefront also needs entrypoints 2014-05-08 15:04:53 +02:00
Laurent Fasnacht d05f8b52aa Rewrite of encoder_state_write_bitstream_leaf: handle slice + tiles + wavefronts correctly 2014-05-08 15:04:53 +02:00
Laurent Fasnacht 27f694e3e8 Some initial code to support wpp and slices 2014-05-08 15:04:52 +02:00
Laurent Fasnacht b3d1754cc3 context_copy function 2014-05-08 15:04:51 +02:00
Laurent Fasnacht 163189c3c7 Bitstream for leaves can be computed in parallel 2014-05-08 15:04:51 +02:00
Laurent Fasnacht be9882f5b2 Leaf bitstream write 2014-05-08 15:04:50 +02:00
Laurent Fasnacht ae6a7a9c4b Leaf encoder uses encoder_state->lcu_order 2014-05-08 15:04:49 +02:00
Laurent Fasnacht b740142325 Add is_leaf to encoder_state 2014-05-08 15:04:48 +02:00
Laurent Fasnacht 8451d5b100 Move some init code to encoder_state_new_frame 2014-05-08 15:04:48 +02:00
Laurent Fasnacht 1cb3f14dfe lcu_order_count in (leaves) encoder 2014-05-08 15:04:47 +02:00
Laurent Fasnacht ef6ae3e723 Remove dead code 2014-05-08 15:04:46 +02:00
Ari Koivula 535b42bc9b Fix compilation for VS2010. 2014-05-07 15:26:44 +03:00
Laurent Fasnacht 05eef82896 Remove extra [ from graphviz dump 2014-05-07 13:40:29 +02:00
Laurent Fasnacht 84e5dbee39 Remove quote from graphviz dump 2014-05-07 13:33:02 +02:00
Laurent Fasnacht b48a687d3c Restored parallelism, but it will be done in another way... OpenMP is not very efficient in this kind of dynamic situation 2014-05-07 11:55:56 +02:00
Laurent Fasnacht 0e6f1c99fc Refactor picture to remove hidden dependency between slice and tiles
picture.type -> encoder_state->global->pictype
picture.slicetype -> encoder_state->global->slicetype
picture.slice_sao_luma_flag -> 1 (was constant)
picture.slice_sao_chroma_flag -> 1 (was constant)

This may be changed later. For now it's better to avoid having slice related stuff in picture.
2014-05-07 11:55:48 +02:00
Laurent Fasnacht 39d96e0546 Fix bug with cabac stream pointing to bad data 2014-05-07 11:55:41 +02:00
Laurent Fasnacht e144f817ef Works when not using tiles 2014-05-07 11:55:16 +02:00
Laurent Fasnacht 24c2bd70ca Fix small bugs with compilation 2014-05-07 11:54:35 +02:00
Laurent Fasnacht a03f0cba19 encoder_control_input_init near the other encoder_control_* functions 2014-05-07 11:53:21 +02:00
Laurent Fasnacht 1e2671ac30 Renamed encoder_clear_refs to encoder_state_clear_refs 2014-05-07 11:53:12 +02:00
Laurent Fasnacht 831b221cf8 Parsing seems to work now 2014-05-07 11:53:01 +02:00
Laurent Fasnacht 8b5cb62237 Debug code to generate a graph 2014-05-07 11:52:04 +02:00
Laurent Fasnacht cee6bb0e71 Fix iteration on children 2014-05-07 11:49:14 +02:00
Laurent Fasnacht 699669ee35 fixed typo 2014-05-07 11:48:16 +02:00
Laurent Fasnacht 6c6adf18c7 Refactor encoder_state 2014-05-07 11:47:31 +02:00
Laurent Fasnacht a23edd0339 added parent to encoder_state 2014-05-07 11:42:54 +02:00
Laurent Fasnacht 5ce518a47a lcu_at_tile_start and lcu_at_tile_end helper functions 2014-05-07 11:42:30 +02:00
Laurent Fasnacht c2872bd6b0 Slices and WPP in command line and encoder 2014-05-07 11:42:04 +02:00
Laurent Fasnacht 2d6f199246 reorganized encoder_state structure 2014-05-07 11:41:27 +02:00
Laurent Fasnacht f0b076876f Moved all the stream related stuff into substream_write_bitstream 2014-05-07 11:40:20 +02:00
Laurent Fasnacht f30b9c2a11 Fix a buffer overflow in parse_tiles_specification 2014-05-07 11:39:45 +02:00
Ari Koivula eaf8835bda Add some comments and const qualifiers. 2014-05-06 19:20:38 +03:00
Ari Koivula 3910b7989a Clear old cbf data before recursion in encode_transform_tree.
- Because encode_transform_tree also maintains the CBF data and assumes that
  the CBFs are initially zeroed, calling the function more than once would
  result in incorrect CBF data.
2014-05-06 19:03:29 +03:00
Ari Koivula bdc16d2612 Improve cu_info coded block flag data structure a bit.
- It works just like the old structure except that the flags are checked with
  bitmasks instead of having the flag value be propagated upwards. There isn't
  really any benefit to this because the flags still have to be propagated to
  parent CUs.

- Wrapped them inside a struct to make copying them easier. (Just need to copy
  the struct instead of making individual copies)
2014-05-06 18:28:04 +03:00
Ari Koivula d123b98aea Remove unnecessary ternary expressions from usages of CABAC_BIN. 2014-05-06 17:39:25 +03:00
Ari Koivula 380401b2eb Have CABAC_BIN accept any >0 as binary 1.
It used to treat odd numbers as false.
2014-05-06 17:39:10 +03:00
Marko Viitanen bf2c2a1330 Small changes to fix compiling on VS
- Added threads.h to VS project
- Included Windows.h in threads.h
2014-05-05 11:18:43 +03:00
Laurent Fasnacht f3d4e6eb09 Move bitstream write to a separate function, and add assertions about the part which should not write to bitstream. 2014-05-05 09:24:57 +02:00
Laurent Fasnacht 0fe080ad0a bitstream_tell 2014-05-05 08:53:06 +02:00
Laurent Fasnacht 7f6f4fe9c1 Reference count for picture 2014-05-05 08:03:24 +02:00
Laurent Fasnacht 323054d5e2 naming: alloc_yuv_t -> yuv_t_alloc dealloc_yuv_t -> yuv_t_free 2014-05-02 11:45:27 +02:00
Laurent Fasnacht 7d6d1d5536 Remove pic->pred_* 2014-05-02 11:38:07 +02:00
Laurent Fasnacht 92e14cc80d rename picture_init to picture alloc and picture_destroy to picture_free 2014-05-02 10:58:28 +02:00
Laurent Fasnacht b76f7377b6 Always initialize tiles data structures (even with only one tile) 2014-05-02 10:00:22 +02:00
Laurent Fasnacht f97e60a80d Doc for encoder state 2014-05-02 10:00:12 +02:00
Laurent Fasnacht 161fe38f5e Remove USE_TILES define 2014-05-01 13:58:13 +02:00
Laurent Fasnacht a84fd6486d Add function subencoder_blit_pixels 2014-05-01 11:16:11 +02:00
Laurent Fasnacht b8b28635ff Iterable structure for sub-encoders (more flexibility) 2014-05-01 11:16:10 +02:00
Laurent Fasnacht 212d390003 Cleanup of encoder_state_init and encoder_state_finalize 2014-05-01 11:16:10 +02:00
Laurent Fasnacht 161053f86b Do not allow more tiles than dimension in LCU 2014-05-01 07:11:31 +02:00
Ari Koivula 42295d3cb9 Pass preprocessor defines for supported intrinsics in VS2010 explicitly.
- _M_IX86_FP defines whether VS should generate code using SSE or SSE2
  instructions. It isn't correct to use it to check whether optional runtime
  optimizations should be compiled in. It's also not defined at all in 64-bit
  mode.

- So let's just keep it simple and give a list of everything that is supported
  as release optimizations. It's not clear from the documentation if all of
  these are really supported. It just lists a bunch of intrinsics from these
  that are.
2014-04-30 17:41:15 +03:00
Ari Koivula d1fbc6dc80 Fix a small memory leak.
- Malloced pointer returned by alloc_yuv_t was not being freed in
  substream_encode.

- Remove use of yuv_t from encode_one_frame, as it's not used there anymore.
2014-04-30 11:15:34 +03:00
Ari Koivula d808fe3b02 Merge branch 'strategy_selector' 2014-04-29 15:36:48 +03:00
Ari Koivula bd7e021742 Modify strategyselector to work with VS2010.
- VS doesn't have snprintf.

- VS doesn't support GCC attributes.

- Add defines for __SSE__ and __SSE2__ on VS.
2014-04-29 15:29:06 +03:00
Laurent Fasnacht bf7e755cf7 Strategies and runtime detection/choice of best algorithm 2014-04-29 11:51:41 +02:00
Ari Koivula 27b94d4b45 Address gcc -Wtype-limits errors.
- Fixes warnings in #19 and #16.
2014-04-29 09:15:52 +03:00
Ari Koivula 2a17e9a7aa Merge branch 'sse_intrinsics' 2014-04-28 19:38:08 +03:00
Ari Koivula cecf4b0b4e Move __USE_MINGW_ANSI_STDIO to Makefile.
- I'm not too clear on how this should be used, but having it in the source
  file after mingw stuff was included caused a warning about redefinition of
  __USE_MINGW_ANSI_STDIO.
2014-04-28 19:37:37 +03:00
Ari Koivula 4e7e40054f Move picture-sse2.c to src/inline-optimizations/.
- Having it in the src dir even though it's not a module on its own breaks
  the scons build script. It's probably better to have these a little bit
  separated from the normal code anyway.
2014-04-28 19:36:40 +03:00
Laurent Fasnacht d66f809734 reg_sad implementation using SSE2/SSE4.1 intrinsics 2014-04-28 15:36:58 +02:00
Ari Koivula 4490e8afd6 Remove depth dimension from picture->cu_array.
- It isn't used for anything anymore.

- It was used in the past to hold information during search, but now that
  information is held in lcu_t structs.
2014-04-28 10:18:22 +03:00
Laurent Fasnacht 76ec605b72 SAO works with tiles now 2014-04-28 06:29:21 +02:00
Yusuke Nakamura 0214d4ffcc Makefile: Remove unneeded arguments in CCFLAGS.
This fixes a compilation on clang.
2014-04-27 00:41:10 +09:00
Yusuke Nakamura 03da39e229 config: Use built-in getopt on non-MSVC environments. 2014-04-27 00:40:52 +09:00
Yusuke Nakamura c5a4e7b52c encmain: Remove a warning on MinGW. 2014-04-26 23:56:50 +09:00
Ari Koivula 145816cfb5 Move printing of CLI stuff to stderr.
- Printing to stdout corrupts the stream when used with "-o -".
2014-04-26 12:56:39 +03:00
Laurent Fasnacht 5e7945888a Inter-frame prediction with tiles works.
Many thanks to Jean-Hugues Recolin for the insightful comments about shifts!
2014-04-25 09:28:00 +02:00
Laurent Fasnacht 7719837f17 Simple OpenMP parallelization 2014-04-25 09:11:10 +02:00
Laurent Fasnacht 4e34859e66 Fix compilation error with USE_TILES=1 and -Werror=maybe-uninitialized 2014-04-24 08:41:05 +02:00
Laurent Fasnacht 59392c4a62 Fix compilation issue with USE_TILES=0 2014-04-24 08:38:24 +02:00
Laurent Fasnacht 571a373f69 Use tile offset in search 2014-04-24 08:38:24 +02:00
Laurent Fasnacht 2e7d958af3 Picture and reference may have different sizes 2014-04-24 08:38:24 +02:00
Laurent Fasnacht af9a1c0fbb Use same reference images for all subencoders 2014-04-24 08:38:23 +02:00
Laurent Fasnacht 73c574fb45 P-frame: first try... 2014-04-24 08:38:22 +02:00
Laurent Fasnacht 03361dcf2c sao try... still not working 2014-04-24 08:38:22 +02:00
Laurent Fasnacht 3db4c59478 Reconstruct full frame from tiles 2014-04-24 08:38:21 +02:00
Laurent Fasnacht 35d5d22ccc Fix tile size not to go outside of the original picture 2014-04-24 08:38:20 +02:00
Laurent Fasnacht 985630b8b2 Add a check to fix picture_blit_pixels when width > orig_stride 2014-04-24 08:38:20 +02:00
Laurent Fasnacht b36e154c38 Some cleanup 2014-04-24 08:38:19 +02:00
Laurent Fasnacht 01580a93c3 Encoding with tiles now more or less works with -p 1 --no-sao --no-deblock 2014-04-24 08:38:19 +02:00
Laurent Fasnacht fd89b9af76 New functions: bitstream_append and bitstream_clear 2014-04-24 08:38:18 +02:00
Laurent Fasnacht 356c17e0de Add missing break in bitstream_writebyte 2014-04-24 08:38:18 +02:00
Laurent Fasnacht 5fb4d9c36e substream_encode function 2014-04-24 08:38:17 +02:00
Laurent Fasnacht e292b2c274 allocate subencoders 2014-04-24 08:38:17 +02:00
Laurent Fasnacht 12e3900fd1 ( ) for preprocessor directives... 2014-04-24 08:38:16 +02:00
Laurent Fasnacht fba4f5432a Fix debug code 2014-04-24 08:38:16 +02:00
Laurent Fasnacht b255133460 Debug for tiles 2014-04-24 08:38:15 +02:00
Laurent Fasnacht 066ce6c9f4 Remove unused prototype 2014-04-24 08:38:15 +02:00
Laurent Fasnacht 11629ce811 Use tile scan order in encode_one_frame() 2014-04-24 08:38:14 +02:00
Laurent Fasnacht 0036afa056 Write tile-related information to picture parameter set and slice header 2014-04-24 08:38:14 +02:00
Laurent Fasnacht 1e9c894eba Coding tree block raster and tile scanning conversion process, according to ITU-T Rec. H.265 (04/2013) 6.5.1 2014-04-24 08:38:13 +02:00
Laurent Fasnacht 7bd6aa2e9c encoder_control_input_init call moved to encoder_control_init 2014-04-24 08:38:13 +02:00
Laurent Fasnacht ff318ae0e9 Tiles in encoder_control 2014-04-24 08:38:12 +02:00
Laurent Fasnacht 9353f14792 Parameters for using tiles in command line arguments.
--tiles-width-split
--tiles-height-split
2014-04-24 08:38:11 +02:00
Laurent Fasnacht 61c67dc485 Allow -DUSE_TILES=1 to be specified in Makefile; define MAX_TILES_PER_DIM. 2014-04-24 08:38:11 +02:00
Laurent Fasnacht 19b1642aa2 Removed all cabac parameters (cabac is part of encoder_state) 2014-04-22 11:46:53 +02:00
Ari Koivula a539ae7e08 Address clang-analyzer warning.
- The assert needs to be before the initialization.
2014-04-22 11:55:28 +03:00
Laurent Fasnacht 5fea5875a5 Huge refactoring
Split some parts of encoder_control into encoder_state
(idea: encoder_control is immutable)

Goal is to allow multiple substreams in the future.
2014-04-22 10:39:12 +02:00
Ari Koivula 88a67a4e49 Fix faulty assert that stops the program from working with inter frames.
- The assert would be true after the next if block, but in its current place
  it's false.
2014-04-22 10:57:38 +03:00
Ari Koivula 54270f271d Fix c89 problem to allow compilation with VS2010. 2014-04-17 19:12:39 +03:00
Ari Koivula 1b437a5989 Address clang-analyzer warnings about garbage values.
- False alarm, but surprisingly difficult to convince clang of that. It
  doesn't seem to understand bit shifts very well.

- Only assert and changing LCU_WIDTH>>depth to width was necessary to satisfy
  clang.

- Closes #35.
2014-04-17 18:43:09 +03:00
Ari Koivula 11509c68dc Address clang-analyzer warnings about unused values.
- Related to issue #35.
2014-04-17 18:43:08 +03:00
Ari Koivula 0704c43836 Address clang-analyzer warning about undefined behavior in intra.
- Related to issue #35.
2014-04-17 18:43:08 +03:00
Ari Koivula 32da12f653 Address a clang-analyzer warning about undefined behavior in filter.
- Analyzer didn't see that code is never called with MAX_DEPTH as it doesn't
  know the properties of width, height, x and y. Following would also
  silence the error:
    assert(cur_pic->width > 0 && cur_pic->height > 0);
    assert(cur_pic->width % 8 == 0 && cur_pic->height % 8 == 0);
    assert(x > 0 && y > 0);
    assert(x % 8 == 0 && y % 8 == 0);

- Related to issue #35.
2014-04-17 18:43:08 +03:00
Laurent Fasnacht 3396264f3c Moved g_cur_lambda_cost into encoder_control.cur_lambda_cost 2014-04-17 12:00:21 +02:00
Laurent Fasnacht 534013be77 Remove g_lambda_cost 2014-04-17 11:49:27 +02:00
Laurent Fasnacht 83360918ba Removed table generation from main code, moved it to tools. 2014-04-17 11:13:15 +02:00
Laurent Fasnacht 4a9c239027 Remove g_bitdepth 2014-04-17 11:13:13 +02:00
Laurent Fasnacht 7a2b883059 Remove encoder_input width, height, height_in_lcu, and width_in_lcu 2014-04-17 11:13:12 +02:00
Laurent Fasnacht d01e3ae67f bitstream is a union, and is statically in encoder_control structure 2014-04-17 11:13:12 +02:00
Laurent Fasnacht 122576fe8b some const in cabac.c 2014-04-17 11:13:11 +02:00
Laurent Fasnacht 94a48fc153 added const in bitstream 2014-04-17 11:13:11 +02:00
Laurent Fasnacht 9ac3b7bf2b encoder->in.cur_pic --> cur_pic 2014-04-17 11:13:10 +02:00
Laurent Fasnacht 21d34613c2 Replace encoder->stream by stream 2014-04-17 11:13:09 +02:00
Laurent Fasnacht 2286175378 nal are now written to a bitstream, not a FILE* 2014-04-17 11:13:09 +02:00
Laurent Fasnacht 677fc2ec7d Fix prototype of create_bitstream in bitstream.h 2014-04-17 11:13:08 +02:00
Ari Koivula 51ba80513b Centralize resource deallocation for encmain.
- CppCheck was complaining about unreleased resources for FILE*. They weren't
  really because they get flushed and closed when program exits normally, but
  let's close them anyway.
2014-04-17 11:58:03 +03:00
Ari Koivula b35f33b3da Address warnings about unused values.
- Related to issue #35.
2014-04-16 18:05:03 +03:00
Ari Koivula 9229e5d11b Fix undefined behavior of EO_IDX.
- Move the whole eo_cat thing to its own function.

- Casting pixel values to int should solve issues with SIGN3. Not casting them
  was detected as undefined behavior by CLANG. Probably because the result of
  two unsigned might be treated as unsigned (but isn't on VS2010).
2014-04-16 14:50:19 +03:00
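The cast described in 9229e5d11b can be illustrated with a minimal sketch. The names `sample_t`, `sign3` and `edge_index` are hypothetical and are not taken from the kvazaar source; the sample type is assumed to be an unsigned integer type. Converting each sample to int before subtracting keeps the difference in signed arithmetic, so a negative difference is not wrapped into a large unsigned value before its sign is taken.

```c
#include <assert.h>

/* Hypothetical sample type, assumed to be unsigned. */
typedef unsigned int sample_t;

/* Three-way sign: returns -1, 0 or 1. */
int sign3(int x)
{
  return (x > 0) - (x < 0);
}

/* Raw edge index of sample c relative to its two neighbours a and b.
 * The result is in the range 0..4; 2 means c equals both neighbours
 * or lies on a monotonic slope between them. */
int edge_index(sample_t a, sample_t c, sample_t b)
{
  /* Cast before subtracting so the difference is computed as int. */
  int idx = 2 + sign3((int)c - (int)a) + sign3((int)c - (int)b);
  assert(idx >= 0 && idx <= 4);
  return idx;
}
```

For example, a local minimum such as `edge_index(10, 5, 10)` gives 0 and a local maximum such as `edge_index(5, 10, 5)` gives 4.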
Laurent Fasnacht ec9d70f70c Moved scalinglist_process into init_encoder_control 2014-04-16 11:45:51 +02:00
Ari Koivula 6e24ba0a5f Merge branch 'sao-mode-cost' 2014-04-16 12:25:05 +03:00
Laurent Fasnacht e06253d437 scalinglist changes missing in previous commit 2014-04-16 11:00:29 +02:00
Laurent Fasnacht 9901c38dd5 scalinglist in independent file 2014-04-16 10:25:16 +02:00
Ari Koivula 051484f8d8 Remove unused old mode cost estimation from SAO. 2014-04-16 11:19:23 +03:00
Laurent Fasnacht 9112cbb58c Generate and use static tables 2014-04-16 09:49:09 +02:00
Ari Koivula a982800e1b Merge remote-tracking branch 'remotes/lfasnacht/const' 2014-04-16 10:28:31 +03:00
Ari Koivula df5af669f9 Merge remote-tracking branch 'remotes/lfasnacht/makefile_deps' 2014-04-16 10:28:04 +03:00
Ari Koivula 33b9594fec Take into account the coding cost of not using SAO. 2014-04-15 21:32:24 +03:00
Ari Koivula 880d09b27a Add sao_type_idx to SAO mode cost estimation. 2014-04-15 21:29:09 +03:00
Ari Koivula 280a946269 Apply bit cost fixes to edge and band sao search.
- Increases bdrate slightly. It's still a cleaner solution though and further
  improvements to bit cost estimation might improve things.
2014-04-15 21:28:16 +03:00
Ari Koivula c968253eb3 Estimate SAO merge coding costs better.
- This doesn't seem to have very much of an effect. I guess the difference
  between 1 and 2 bits isn't that important.
2014-04-15 21:27:16 +03:00
Ari Koivula ef7840c623 Add SAO coding costs for sao_eo_class, band position and offset sign.
- This reduces bdrate a little bit.

- It seems like increasing the bit cost of using SAO in general
  increases bdrate.
2014-04-15 21:24:44 +03:00
Ari Koivula 9ff32c566c Adjust SAO mode coding cost for zero offsets.
- Coding zero with TR only takes one bit.

- Even though offset 0 should be fairly common, this doesn't seem to help very
  much and actually increases bdrate on some sequences.
2014-04-15 21:21:31 +03:00
Ari Koivula 67a3d5542c Add better delta distortion calculation to sao. 2014-04-15 21:17:29 +03:00
Ari Koivula 8c4796e56e Calculate edge and band sao separately. 2014-04-15 21:16:55 +03:00
Ari Koivula 1017c6639c Move band and edge sao to their own functions. 2014-04-15 21:14:59 +03:00
Laurent Fasnacht 960f2cb4b0 g_sig_last_scan -> const uint32_t* 2014-04-15 16:09:52 +02:00
Laurent Fasnacht 288a4537ba const bit_table for exp_golomb 2014-04-15 16:09:52 +02:00
Laurent Fasnacht 763b775d3e encoder_control->cfg is const 2014-04-15 16:09:52 +02:00
Laurent Fasnacht ae2d79c954 Remove encoder_control.cqmfile 2014-04-15 16:09:51 +02:00
Laurent Fasnacht 86c1cf339f Add Makefile to dependencies 2014-04-15 15:47:40 +02:00
Laurent Fasnacht e135a88fb5 Remove encoder_control.cqmfile 2014-04-15 14:21:25 +02:00
Laurent Fasnacht 29c93d8f3f Automatic generation of build dependencies 2014-04-15 13:52:31 +02:00
Laurent Fasnacht 7897f7d5cd Remove counter from debug version of WRITE_*
It's not very useful, and creates unneeded noise when trying to make diffs
2014-04-15 11:37:45 +02:00
Laurent Fasnacht f47e23cd24 Allow FREE_POINTER to free const xxx * ptr without warning 2014-04-15 11:37:44 +02:00
Laurent Fasnacht 52ae027b3a Avoid undefined behavior in memcpy calls
"The memcpy() function shall copy n bytes from the object pointed to by s2 into the object pointed to by s1. If copying takes place between objects that overlap, the behavior is undefined."
2014-04-15 06:30:21 +02:00
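The rule quoted in 52ae027b3a can be illustrated with a small sketch. The commit does not say how the calls were changed, so the use of memmove here is an assumption, and `drop_front` is a hypothetical helper rather than a kvazaar function. memmove is the standard call for copies where source and destination may overlap.

```c
#include <assert.h>
#include <string.h>

/* Discard the first `count` bytes of a buffer holding `len` bytes by
 * sliding the remaining bytes to the front. Source and destination
 * overlap whenever count < len, so memcpy would be undefined here. */
void drop_front(unsigned char *buf, size_t len, size_t count)
{
  assert(buf != NULL);
  assert(count <= len);
  /* memmove behaves as if it copied through a temporary buffer. */
  memmove(buf, buf + count, len - count);
}
```

When the two regions are known never to overlap, memcpy remains the appropriate call.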
Laurent Fasnacht 317a3f87a4 Initialize scaling_list_dc (avoids branching on uninitialized value) 2014-04-15 06:19:12 +02:00
Laurent Fasnacht 9f3aeed6be Fix pixel access in sao 2014-04-14 15:40:06 +02:00
Laurent Fasnacht 486768fc79 scalinglist privatization 2014-04-14 13:39:28 +02:00
Laurent Fasnacht 78c579053a encoder_control should be const in nearly all the code 2014-04-14 10:56:06 +02:00
Marko Viitanen 0e7a5057d1 Merge pull request #26 from lfasnacht/warnings_fix
Fix warnings and compile with -Werror
2014-04-14 11:30:37 +03:00
Marko Viitanen 04f09a2bc8 Merge pull request #25 from lfasnacht/memory_bitstream
Changed bitstream handling to allow in-memory bitstream.
2014-04-14 11:29:25 +03:00
Laurent Fasnacht 13398e011b Fix create_bitstream() 2014-04-14 10:23:09 +02:00
Laurent Fasnacht 64f3f57af3 Compile with -Werror 2014-04-14 09:38:37 +02:00
Laurent Fasnacht 89ef1161c4 Fix warnings 2014-04-14 09:37:39 +02:00
Laurent Fasnacht baba299bb8 Obviously a void function cannot return NULL 2014-04-14 09:11:15 +02:00
Laurent Fasnacht 418e6eae51 Changed bitstream handling to allow in-memory bitstream. 2014-04-14 08:13:00 +02:00
Laurent Fasnacht 520dbdd86d Change return type of free_exp_golomb to be void, and add it to bitstream.h 2014-04-14 06:41:27 +02:00
Ari Koivula 29787efbbc Fix whitespace.
Fix some whitespace issues from a merge.
2014-04-11 17:06:21 +03:00
Ari Koivula 83d5a4753d Move input resolution to the same line as internal resolution.
The \n must have been left there by accident.
2014-04-11 16:55:08 +03:00
Ari Koivula 0b5c357795 Move all output to stderr.
It has to be in stderr to allow piping bitstream from stdout.
2014-04-11 16:50:59 +03:00
Ari Koivula 115872b300 Add total running time to output. 2014-04-11 12:42:37 +03:00
Marko Viitanen de1c0b7e8d Fixed intra RDO to include mode bitcost 2014-04-10 16:28:41 +03:00
Marko Viitanen a657cf84d9 Insert most probable (predicted) intra modes to RDO search 2014-04-10 15:59:36 +03:00
Marko Viitanen 05169d9476 Added more modes to RDO mode selection in intra search
Now 8 best modes for sizes 4x4 and 8x8 are added to RDO checking and 3 for other block sizes as before, only applies when --rd 2
2014-04-10 15:20:49 +03:00
Ari Koivula 5fa5e01e05 Merge branch 'intra-cleanup'
Conflicts:
	src/intra.c
	src/intra.h
	src/search.c
2014-04-10 13:51:14 +03:00
Ari Koivula 40c2fa4d46 Change intra reconstruction to use the same prediction function as search.
- This fixes a bug with intra search. It sometimes used filtered reference
  pixels for 4x4 blocks leading to inaccurate cost estimate.
2014-04-10 12:09:19 +03:00
Ari Koivula d5c3ad7a2b Move intra prediction generation to its own function. 2014-04-10 11:27:15 +03:00
Ari Koivula 088dd9ab96 Clean up intra mode search.
- This changes the bitstream a little bit, because it changes the order in
  which the modes are tried and when two modes have the same cost the first
  one is chosen.

- Dst buffer was removed as it was no longer used.
2014-04-10 10:25:57 +03:00
Marko Viitanen 43ae0a3b9a Implemented RDO cost calculation to Intra modes 2014-04-10 10:25:20 +03:00
Marko Viitanen c38ec1aa10 Added commandline option for RDO (--rd) 2014-04-09 12:29:15 +03:00
Marko Viitanen 6558c92020 Clean up get_coeff_cost()
Since contexts were moved to cabac struct, there's no need to store contexts one by one
2014-04-09 11:50:17 +03:00
Ari Koivula 92ac5025f9 Take intra mode based coeff scan mode into account for coeff bit cost.
- Previously only diagonal scan mode, the most common one, would be used.

- This improved bdrate by 0.1-0.5 % for p0 and 0-0.2 % for p60.
2014-04-09 10:44:44 +03:00
Ari Koivula c5dfcdf3aa Simplify scan mode selection.
- The scan mode selection for chroma was a bit complicated so I checked it
  and it was all unnecessary. The mode selection is the same as for luma.
2014-04-09 10:36:39 +03:00
Ari Koivula 3764688f84 Fix lambda initialization.
- Lambda was initialized before slice type was set in encoder_control.
2014-04-08 16:58:36 +03:00
Ari Koivula 0251bf5a1a Improve calculation of chroma coding cost for 4x4 blocks.
- Adds calculation of chroma coefficient cost for 4x4 blocks.

- Previously there was no cost. Now the cost is added to the first prediction
  block for NxN.

- This fix should improve bdrate by about 1%.
2014-04-08 12:43:26 +03:00
Ari Koivula 3c0977c7f3 Fix buffer overflow on copying of reference pixels.
- Valgrind noticed this.

- Shouldn't affect anything as the buffer overflowed to pixel buffers which
  were initialized later.
2014-04-04 17:28:56 +03:00
Ari Koivula 6e0bc655e2 Resolve unused variable warning.
- This unexpectedly changes bitstream, but as that makes no sense, it must be
  because some part of the program uses uninitialized memory.
2014-04-04 17:28:50 +03:00
Marko Viitanen e15a86268d Clean up tabs and whitespaces 2014-04-04 16:04:44 +03:00
Laurent Fasnacht 816ae13b1d Moved context information inside cabac_data.
This is required in order to be able to work on parallelism.
2014-04-04 14:28:50 +02:00
Laurent Fasnacht 8a14bd3b7b Remove cabac global variable 2014-04-04 14:26:40 +02:00
Laurent Fasnacht 946c815932 init_context directly has a QP parameter, instead of passing an encoder_control*
This makes context less tightly coupled with encoder.
2014-04-04 14:26:39 +02:00
Laurent Fasnacht 1e03cf8ac1 Add a function to free g_exp_table.
Even though g_exp_table has to be global (used in #define), it's better to avoid requiring other module to directly access it.
2014-04-04 14:26:39 +02:00
Marko Viitanen 7484dafd82 Fix for get_coeff_cost() scan mode selection
Small BD-rate improvement with this fix
2014-04-04 15:16:04 +03:00
Marko Viitanen c5ba5eb3c8 Use RDO in final_cost 2014-04-04 14:10:49 +03:00
Marko Viitanen b83559d3f3 Use RDO to check for transform skip mode 2014-04-04 13:09:42 +03:00
Marko Viitanen b09854d964 Implemented RDO function to calculate bits used for coefficient coding 2014-04-04 13:09:42 +03:00
Ari Koivula 61256fc31a Enable -Wall by default. 2014-04-04 13:02:08 +03:00
Ari Koivula 69ac9176a5 Disable warnings for extras/getopt.
- This isn't our code so we don't care about these warnings.
2014-04-04 13:02:08 +03:00
Ari Koivula 7239b59e94 Resolve constant conditional expression warning.
- Working towards issue #11.

- I felt that the macro was a little bit too clever in hiding the if-else
  statements so I removed that aspect, which also has the minor benefit of not
  requiring the starting if (0) statement.
2014-04-04 13:02:07 +03:00
Ari Koivula b19e4f3f2d Resolve possible uninitialized variable warnings.
- Working towards issue #11.

- Neither variable was actually used as uninitialized.
2014-04-04 13:02:06 +03:00
Ari Koivula 61ae195af7 Resolve warnings about assignments within conditions.
- Working towards issue #11.
2014-04-04 13:02:06 +03:00
Ari Koivula d44d1837bb Remove unreferenced parameters.
- Working towards issue #11.
2014-04-04 12:56:24 +03:00
Ari Koivula 46d33d3945 Resolve unsigned/signed mismatch warnings.
- Working towards issue #11.
2014-04-04 12:56:23 +03:00
Ari Koivula c142cbba21 Fix typo.
- Obvious typo. This g_bitdepth - 8 used to be g_bitincrement. Doesn't affect
  anything yet as we don't actually support bitdepth > 8 yet.
2014-04-04 12:56:22 +03:00
Laurent Fasnacht b371a8bb59 Use realloc correctly
Quote from MALLOC(3) manpage:

The realloc() function returns a pointer to the newly allocated memory, which is suitably aligned for  any  kind  of
variable and may be different from ptr, or NULL if the request fails.
2014-04-04 06:37:35 +02:00
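The manpage excerpt quoted in b371a8bb59 implies a common usage pattern, sketched below. The `byte_buffer` type and `byte_buffer_reserve` function are hypothetical and do not reflect kvazaar's actual bitstream code. The point is to store the result of realloc in a temporary, because the block may move, and because on failure realloc returns NULL while the old block stays allocated.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable byte buffer. */
typedef struct {
  unsigned char *data;
  size_t capacity;
} byte_buffer;

/* Ensure buf can hold at least new_capacity bytes.
 * Returns 1 on success and 0 on failure. On failure buf is left
 * untouched, so the caller can still use or free the old block. */
int byte_buffer_reserve(byte_buffer *buf, size_t new_capacity)
{
  unsigned char *tmp;

  assert(buf != NULL);
  if (new_capacity <= buf->capacity) return 1;

  /* Do not assign straight to buf->data: a NULL result would
   * overwrite the only pointer to the old block and leak it. */
  tmp = realloc(buf->data, new_capacity);
  if (tmp == NULL) return 0;

  buf->data = tmp;
  buf->capacity = new_capacity;
  return 1;
}
```

Writing `buf->data = realloc(buf->data, n)` directly is the usual way this goes wrong, since a failed call then leaks the original allocation.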
Ari Koivula 0074cd1a98 Add extra parentheses to suppress compiler warnings. 2014-04-03 15:38:18 +03:00
Ari Koivula 27a3329dfb Remove unreferenced_parameter macro.
- It was a silly hack to selectively silence compiler warnings from VS, but
  there is no point as it causes compiler warnings in GCC.
2014-04-03 15:38:17 +03:00
Ari Koivula f380e7d4b0 Check for malloc failure. 2014-04-03 15:38:17 +03:00
Ari Koivula 313466fdff Remove unused variables.
- Working towards issue #11.

- Either removed or redefined variables to not cause a warning.
2014-04-03 15:37:59 +03:00
Marko Viitanen 0da8071300 Changed final cost (and transform skip) error function from SAD to SSD 2014-04-02 14:51:39 +03:00
Marko Viitanen a14fb14e33 Added new commandline parameter --no-transform-skip 2014-04-02 14:49:48 +03:00
Marko Viitanen 21e02e2d7d Added 4x4 SATD (Hadamard)
Taken from HM 13.0
2014-04-02 11:12:42 +03:00
Marko Viitanen cfb21c0e4c Implemented transform skipping (for 4x4 blocks)
transform skip vs. normal transform selection criteria might need more work, currently both are calculated for each 4x4 block and SAD+coeff_SSE is compared.
2014-04-02 10:54:03 +03:00
Laurent Fasnacht ae5c573843 Global defines are now documented with references to the specification when possible. Removes some redundancy. 2014-04-01 13:48:17 +02:00
Panu Sjövall c8f629495d Remove unnecessary buffer from bitstream.
- Writing encoded data to file is done in bitstream_put one byte at a time and nal_write only writes the packet headers
2014-03-25 11:46:56 +02:00
Ari Koivula 953aef0379 Move rest of LCU encoding inside the LCU loop.
- Move SAO search inside the LCU loop.

- Move CU coding inside the LCU loop.

- Move SAO frame reconstruction loop to sao module.
2014-03-21 12:41:44 +02:00
Ari Koivula 746eaa3671 Move deblocking code to filter module. 2014-03-21 11:57:12 +02:00
Ari Koivula 4d34377c42 Clean up deblocking code a bit.
- Change guards to use the same method of checking for coordinate alignment.

- Move variables to reduce their scope.
2014-03-21 10:50:47 +02:00
Ari Koivula 0f492c7680 Fix deblocking of transform boundaries.
This fixes issues with inter. Deblocking works now.
2014-03-21 10:42:41 +02:00