Comparison of Opus settings

A while ago I made a bunch of test files to determine if there would be any issues going below 96k for streaming Opus. Shaving off bandwidth matters a lot for streaming, so even the 32k of savings going from 96k to 64k would be useful. I observed that 64k was just fine, and left it at that. I recently got into an internet argument over bitrates, with the other side arguing that anything under 96k was unsuitable for music, so I decided it was time for a new version of the tests.

The samples are from Humans Are Such Easy Prey, by Perturbator. It was chosen because of the speech at the beginning, which makes it a good sample of both regular speech and music. The original source was a lossless CD rip. The filenames read as follows:

easy_prey_${bitrate}_${sample_rate}_${channels}.opus
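If you want to pull those fields back out of a filename in a script, plain POSIX parameter expansion is enough. This is only a sketch: parse_sample_name is a made-up helper, not something used to build the samples, and it assumes the name follows the scheme above exactly.

```shell
#!/bin/sh
# Hypothetical helper: split a sample filename into bitrate, sample rate
# and channel count, assuming the easy_prey_<br>_<sr>_<ch>.opus scheme.
parse_sample_name() {
	base=${1##*/}        # drop any leading directory
	base=${base%.opus}   # drop the extension
	ch=${base##*_}       # last field: channel count
	rest=${base%_*}
	sr=${rest##*_}       # next field back: sample rate
	rest=${rest%_*}
	br=${rest##*_}       # next field back: bitrate
	printf '%s %s %s\n' "$br" "$sr" "$ch"
}

parse_sample_name easy_prey_64k_48k_2.opus   # prints: 64k 48k 2
```

Working from the right-hand end means the underscore inside "easy_prey" never gets in the way.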

Test files

Observations

At 64k, any loss requires good headphones and serious listening to notice. IMO, if filesize/bitrate is any issue, going over 64k is a waste.

At 32k, there is noticeable loss in the music, but the speech is fine. The difference between 24 and 48 kHz is subtle, but there. Both achieve Limewire MP3 grade, and are perfectly acceptable if bandwidth is scarce. Some of these effects are noticeable in the 48k samples, but not very clearly. I will opine that 24kHz is probably better, because it eliminates the higher frequencies outright rather than just running out of bits, as can be heard in the 48kHz version. This may or may not be an issue for speech-heavy audio though.

At 16k, the spoken audio is acceptable at every sample rate. The effect of dropping the sample rate is a bit more pronounced, although going to 16kHz makes things worse, not better. The effects of switching to mono are still only barely noticeable, as we still have plenty of bits. This is fine for spoken audio in bandwidth-constrained applications, but music is going to be a wash no matter what you do.

At 8k, 24kHz is noticeably worse. 16kHz improves things, and the music is mud either way. Mono audio improves further, but 12kHz doesn't offer any real improvement. At this level, we've achieved the equivalent of G.711 (phone call), but at 1/8th the bandwidth. This is very useful for voice communication with deep space, or some other location equally isolated from the internet, such as Australia. That said, I can't think of many applications where this would be useful unless you're paying by the kilobit on a satellite, or if you're a telco.
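To put rough numbers on the bandwidth side, here is a back-of-the-envelope sketch. It assumes the nominal bitrate is what actually goes over the wire, which ignores container overhead and the fact that the encoder does not hit its target exactly, so treat the figures as ballpark. bytes_per_minute is a made-up helper name.

```shell
#!/bin/sh
# Nominal payload for one minute of audio at a given bitrate in kbit/s.
# Ballpark only: ignores container overhead and bitrate variation.
bytes_per_minute() {
	echo $(( $1 * 1000 / 8 * 60 ))
}

g711_kbps=64   # G.711 runs at 64 kbit/s

for kbps in 96 64 32 16 8
do
	printf '%sk: %s bytes per minute\n' "$kbps" "$(bytes_per_minute "$kbps")"
done

# The 8k samples against a G.711 phone call:
echo "G.711 uses $(( g711_kbps / 8 ))x the bandwidth of 8k"
```

By this estimate 96k is about 720 kB per minute, 64k about 480 kB, and 8k about 60 kB.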

I did make 4k samples, but they're basically a Rorschach test. Draw your own conclusions here.

Notes

All files were created with ffmpeg using libopus. No settings were changed from the defaults, other than the bitrate, sample rate, and channels (-b:a, -ar, -ac).

make_samples.sh

#!/bin/sh

#Create a full spectrum of listening samples:

source_file="source.flac"
name="easy_prey"
bitrates="128k 96k 64k 48k 32k 24k 16k 12k 8k 4k"
samples="48k 24k 16k 12k 8k"
channels="2 1"

for br in $bitrates
do
	for sr in $samples
	do
		for ch in $channels
		do
			ffmpeg -y -i "$source_file" -c:a libopus -b:a "$br" -ar "$sr" -ac "$ch" "${name}_${br}_${sr}_${ch}.opus"
		done
	done
done

I am fully aware that the low sample rate and low bitrate files can be improved with band filtering (a narrow band-pass that keeps only the midrange). I intend to come back to this some day and see if we can't make the 8k better and the 4k something other than a joke.
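For anyone who wants to try the filtering idea in the meantime, one possible starting point is ffmpeg's highpass and lowpass audio filters chained with -af. This is an untested sketch, not how any of the samples above were made: the 300 Hz and 3400 Hz cutoffs are just the classic telephone passband picked as an assumption, and bandpass_filter and the _bandpass output name are made up for illustration.

```shell
#!/bin/sh
# Sketch only: band-limit the source to a telephone-style passband before
# a low-bitrate encode. The cutoffs are assumptions, not tested values.

source_file="source.flac"
name="easy_prey"

# Hypothetical helper: build an ffmpeg -af chain from two cutoffs in Hz.
bandpass_filter() {
	printf 'highpass=f=%s,lowpass=f=%s\n' "$1" "$2"
}

filter=$(bandpass_filter 300 3400)
echo "filter chain: $filter"

# Only encode when ffmpeg and the source file are actually available.
if command -v ffmpeg >/dev/null 2>&1 && [ -f "$source_file" ]
then
	ffmpeg -y -i "$source_file" -af "$filter" -c:a libopus \
		-b:a 8k -ar 8k -ac 1 "${name}_8k_8k_1_bandpass.opus"
fi
```

Whether this actually beats the unfiltered 8k files is exactly the thing that still needs listening tests.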

Full fileset

Generating the above list

$ ls *.opus | perl -wne '/easy_prey_(\d+)k_(\d+)k_(\d)\.opus/ and chomp and print "<li><a href=\"/assets/media/$_\">${1}k ${2}khz ${3}ch</a></li>\n";'