]> code.delx.au - pulseaudio/commit
sbc: ARM NEON optimizations for input permutation in SBC encoder
authorSiarhei Siamashka <siarhei.siamashka@nokia.com>
Mon, 14 Mar 2011 18:28:31 +0000 (15:28 -0300)
committerLuiz Augusto von Dentz <luiz.dentz-von@nokia.com>
Mon, 14 Mar 2011 18:28:31 +0000 (15:28 -0300)
commit68bdf5526eeafa85308dcce6a1a8c33f307b6faf
treeec5f5f083d35040504b93afed5826f0297a95e5f
parent718fe73cab5d2b9d77ffc94b7141e7be44305968
sbc: ARM NEON optimizations for input permutation in SBC encoder

Using SIMD optimizations for 'sbc_enc_process_input_*' functions provides
a modest, but consistent speedup in all SBC encoding cases.

Benchmarked on ARM Cortex-A8:

== Before: ==

$ time ./sbcenc -b53 -s8 -j test.au > /dev/null

real    0m4.389s
user    0m3.969s
sys     0m0.422s

samples  %        image name               symbol name
26234    29.9625  sbcenc                   sbc_pack_frame
20057    22.9076  sbcenc                   sbc_analyze_4b_8s_neon
14306    16.3393  sbcenc                   sbc_calculate_bits
9866     11.2682  sbcenc                   sbc_enc_process_input_8s_be
8506      9.7149  no-vmlinux               /no-vmlinux
5219      5.9608  sbcenc                   sbc_calc_scalefactors_j_neon
2280      2.6040  sbcenc                   sbc_encode
661       0.7549  libc-2.10.1.so           memcpy

== After: ==

$ time ./sbcenc -b53 -s8 -j test.au > /dev/null

real    0m3.989s
user    0m3.602s
sys     0m0.391s

samples  %        image name               symbol name
26057    32.6128  sbcenc                   sbc_pack_frame
20003    25.0357  sbcenc                   sbc_analyze_4b_8s_neon
14220    17.7977  sbcenc                   sbc_calculate_bits
8498     10.6361  no-vmlinux               /no-vmlinux
5300      6.6335  sbcenc                   sbc_calc_scalefactors_j_neon
3235      4.0489  sbcenc                   sbc_enc_process_input_8s_be_neon
2172      2.7185  sbcenc                   sbc_encode
src/modules/bluetooth/sbc/sbc_primitives_neon.c