mix: Add special-case ARM NEON code for s16 mixing
Note that orig is the time of the special-case C implementation where
available, not the generic matrix remapping implementation.
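
For context, a scalar s16 mixing loop of the kind a NEON version would vectorize might look roughly like the sketch below. It is illustrative only and not the project's actual code: the names clamp_s16 and mix2_s16_sketch, the restriction to two input streams, and the 16.16 fixed-point volume format (0x10000 meaning unity gain) are all assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: saturate a 32-bit intermediate to the s16 range. */
static int16_t clamp_s16(int32_t v) {
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t) v;
}

/*
 * Hypothetical scalar reference: mix two s16 streams into dst.
 * vol_a and vol_b are assumed to be 16.16 fixed-point gains
 * (0x10000 == unity). n_samples counts individual samples, so
 * interleaved mono, stereo and 4-channel data are handled alike.
 */
static void mix2_s16_sketch(int16_t *dst, const int16_t *a, const int16_t *b,
                            int32_t vol_a, int32_t vol_b, size_t n_samples) {
    for (size_t i = 0; i < n_samples; i++) {
        /* Scale in 64 bits, then drop the 16 fractional bits.
         * Division is used instead of >> so that the result for
         * negative samples is well defined (truncation toward zero). */
        int32_t sa = (int32_t) (((int64_t) a[i] * vol_a) / 65536);
        int32_t sb = (int32_t) (((int64_t) b[i] * vol_b) / 65536);

        /* Sum in 32 bits and saturate rather than wrap on overflow. */
        dst[i] = clamp_s16(sa + sb);
    }
}
```

A vectorized variant would process several samples per iteration with widening multiplies and saturating narrowing, which is where a speedup over a per-sample loop would come from.
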
On ARM Cortex-A8 (TI OMAP3 DM3730 @ 1 GHz, Linaro GCC 4.6):
Checking NEON mix (s16, stereo)
func:
2096927 usec (avg: 20969.3, min = 18646, max = 24475, stddev = 1647.36).
orig:
7113956 usec (avg: 71139.6, min = 65705, max = 102601, stddev = 4475.93).
Checking NEON mix (s16, 4-channel)
func:
4093053 usec (avg: 40930.5, min = 39093, max = 48217, stddev = 1862.16).
orig:
15664104 usec (avg: 156641, min = 149781, max = 218598, stddev = 8819.22).
Checking NEON mix (s16, mono)
func:
1139558 usec (avg: 11395.6, min = 9826, max = 25299, stddev = 2495.29).
orig:
3219118 usec (avg: 32191.2, min = 28412, max = 46509, stddev = 2095.34).
Signed-off-by: Peter Meerwald <pmeerw@pmeerw.net>