Wednesday, June 13, 2007

ALSA automatic sample rate conversion

Recent ALSA releases (I'm not sure how recent; I'm using 1.0.13) enable software mixing by default. This is fantastic, because it means any number of properly written applications can record and play at the same time.
(It's hard to get this right in an application. If the app just connects to the PCM called "default", it'll work. But many apps get a list of physical devices and pick one, and that usually bypasses the software mixer and ruins the whole thing. The Java 1.5.x audio system is one of these.)

One side effect is that anything not already at 48000 Hz needs to be upsampled to 48000 Hz for playback through dmix, and downsampled from 48000 Hz for recording from dsnoop. ALSA's default sample rate converter is fast but sounds terrible, at 8000 Hz at least. Most standard VoIP codecs use 8000 Hz, and the default linear converter adds a lot of distortion and hiss.
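To get a feel for why a linear converter hurts at 8000 Hz, here is a toy model in Python. It is not ALSA's code, just plain straight-line interpolation between input samples, and the function names (linear_upsample, error_db) are made up for this sketch. It upsamples a pure tone from 8000 Hz to 48000 Hz and compares the result with the same tone generated directly at 48000 Hz.

```python
import math

def linear_upsample(samples, factor):
    """Upsample by an integer factor by drawing straight lines between samples."""
    out = []
    for i in range(len(samples) - 1):
        a, b = samples[i], samples[i + 1]
        for k in range(factor):
            out.append(a + (b - a) * k / factor)
    out.append(samples[-1])
    return out

def error_db(tone_hz, in_rate=8000, out_rate=48000, n_in=800):
    """RMS error of the interpolated tone relative to the ideal tone, in dB."""
    factor = out_rate // in_rate
    src = [math.sin(2 * math.pi * tone_hz * n / in_rate) for n in range(n_in)]
    up = linear_upsample(src, factor)
    ideal = [math.sin(2 * math.pi * tone_hz * m / out_rate) for m in range(len(up))]
    err = math.sqrt(sum((u - v) ** 2 for u, v in zip(up, ideal)) / len(up))
    sig = math.sqrt(sum(v * v for v in ideal) / len(ideal))
    return 20 * math.log10(err / sig)

if __name__ == "__main__":
    # Compare a low tone with one near the top of the telephone band.
    for hz in (300, 1000, 3000):
        print(hz, "Hz:", round(error_db(hz), 1), "dB")
```

The error grows quickly as the tone approaches half the input rate, which is where much of the detail in 8000 Hz speech sits. A real converter like libsamplerate uses a proper low-pass filter instead of straight lines.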

There are two ways around this. You can disable software mixing, or install a better sample rate converter.

If you know you don't want software mixing, you can disable it. ALSA won't have to resample at all if your hardware supports multiple sample rates natively. (If it doesn't, you will want to use the second method.) To bypass the software mixer but still get automatic conversions when required, you can tell your applications to use device plughw:0,0. If the application doesn't allow that, change the default device by adding this block to your ~/.asoundrc or /etc/asound.conf:
pcm.!default {
        type plug
        slave.pcm "hw:0,0"
}

Now play a file with aplay -vv somefile.wav and look through the verbose output to make sure the "direct stream mixing" slave is not mentioned. If you ever want software mixing, use the device called cards.pcm.default (e.g. aplay -D cards.pcm.default file.wav).
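If you'd rather not read the verbose dump by eye, a small Python helper can do the search. This is only a sketch: the helper names (run_verbose, mentions) are hypothetical, and it assumes the plugin shows up in the dump under a name containing the phrase you search for, which is how it is described above.

```python
import subprocess

def mentions(verbose_output, phrase):
    """Case-insensitive check for a plugin name in aplay/arecord verbose output."""
    return phrase.lower() in verbose_output.lower()

def run_verbose(cmd):
    """Run a command such as ["aplay", "-vv", "somefile.wav"] and return
    its stdout and stderr merged into one string."""
    proc = subprocess.run(cmd, stdout=subprocess.PIPE,
                          stderr=subprocess.STDOUT, universal_newlines=True)
    return proc.stdout
```

For example, mentions(run_verbose(["aplay", "-vv", "somefile.wav"]), "direct stream mixing") should come back False once the software mixer is bypassed.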

If you want software mixing or your hardware doesn't natively support 8000 Hz [*], you can install libsamplerate and the ALSA plugins package. Then add to your ~/.asoundrc or /etc/asound.conf:

defaults.pcm.rate_converter = "samplerate_medium"

This should make 8000 Hz sound much cleaner. If you still have choppy audio during your voice calls and you haven't bypassed software mixing, try bypassing it as described above. I am not sure why (maybe something to do with buffer sizes), but the OPAL library didn't seem to like software mixing at all.

[*] The verbose output from arecord -v -D plughw:0,0 -r 8000 -c 1 /tmp/file.wav will mention the rate conversion slave near the top if your hardware doesn't support 8000 Hz.
