- Added 64x64 version for completeness.
- With the exception of 16x16, these were all slightly slower than the ASM
versions, as measured by "kvazaar_test -s speed -t intra_sad", but now they
are on par or slightly faster.
- None of these actually use any AVX2 intrinsics, and probably never will,
unless someone adds an interface for doing more than one block at a time,
in which case the non-destructive versions might come in handy.
- Everyone who has contributed code to the project has been asked to license
their contributions under LPGL and they have agreed.
- COPYING file changed to say LGPLv2.1 instead of GPLv2.
- GPL changed to LGPL in the header of every single file that a header and
header added to the few that were missing one.
- Also.. Happy new year!
- Moved implementations for different sizes to inline functions that are
defined using each other, reducing the amount of redundant code.
- Performance of sad_8bit_32x32_avx2 improved by about 10% due to unrolling of
the loop.