Recently, I've been
thinking a lot about the nature of audio file formats and online
distribution thereof. There seems to be a growing camp of people
demanding lossless digital download options, but also a camp that
claims that decent lossy compression is good enough for most people.
Although I'm decidedly in the former camp, I would like to more
thoroughly explore what the actual differences are between lossless
and lossy compression.
Plenty of people have
tried to determine if the differences are easy to hear. Generally,
these analyses fall into two camps. The more populist surveys usually
show that the average listener can only barely distinguish a lossless
file from a lossy version. (See, for example, here.)
The more specialized, audiophile studies fare somewhat better,
although the specifics vary widely. Some people claim to be able to
discern the difference with no difficulty, but these people tend to
have high-end hardware and trained ears. (See, for example, here
or here.)
Most people can barely hear the difference, and even that seems to
require more concentration and effort than casual listening usually
affords.
However easy it may or
may not be to hear the difference, I am nonetheless interested in
what exactly that difference is. FLAC (the Free Lossless Audio Codec)
reduces files to about half or two-thirds the size of uncompressed WAV
files, which shows that some amount of lossless compression is
possible simply by eliminating redundant
data. Lossy compression also removes redundancies, but by definition
also removes actual audio content to further reduce file size. The
most obvious elimination is any frequency over 16 kHz, since many
people cannot hear frequencies above that point, or cannot hear them
well. Even I top out somewhere between 17 and 18 kHz.
After that, though,
exactly what gets cut is not necessarily easy to describe.
Fundamentally, information that is considered inessential is removed
by the algorithm. However, some of this information may be detectable
in its absence by careful inspection. To this end, I did some
internet searching and found a few articles and discussions that
address some common trends. Here are some of the conclusions I've
come across:
1. Transients (e.g.
snare hits) suffer. They get blurred, lose their sharpness, and may
even acquire pre-echo. All forms of percussion can lose some of their
natural punch. Such quick bursts of information are often too short
for the codec's processing frame size and they get blurred across the
frame.
2. Vocals lose focus
and clarity. Our ears are particularly sensitive to the human voice
and can detect seemingly subtle changes.
3. Cymbals and applause get distorted and rough. This is because high-entropy (i.e. "random" or rapidly changing) information changes too fast for the codec. This can sometimes also materialize as ringing or warbling.
4. Bass instruments get
muddier. Lower frequencies have longer wavelengths, which can be
longer than the codec's processing frame size, and thus do not get
represented accurately.
5. Stereo separation
and phase become distorted. Some of this is due to M/S (mid/sides)
stereo mode, which instead of storing left and right, tries to reduce
information redundancy by only storing the center (shared) and side
(differences).
6. Some loss of dynamics and
EQ balance is inevitable. Some sounds may get attenuated more than
others, and the others may thus seem louder.
7. Noise (general
murkiness, an underwater feeling, hiss, etc.) sometimes creeps in
where previously there was desirable content.
8. Lossy compression
can simply make things sound different, even if not necessarily
worse. However, any deviation from the intentions of the artists and
producers can reasonably be considered undesirable.
9. Lower-fidelity
source material may actually suffer even worse, as whatever noise and
other flaws exist in the uncompressed original may become
exaggerated.
10. Genre, style, and
the nature of the audio in question matter. Some types of music seem
to compress better than others. Any reasonable audio comparison test
should use a variety of types of music or audio.
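To make the mid/side idea in point 5 concrete, here is a minimal sketch in Python. It assumes the common textbook definition (mid is the average of left and right, side is half their difference); the sample values and the crude rounding step are made up for illustration and are not what any real encoder does. The point it tries to show is that the M/S transform by itself is reversible, and that stereo detail only changes once the side signal is stored less precisely.

```python
def ms_encode(left, right):
    """Convert left/right samples to mid (shared) and side (difference)."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side


def ms_decode(mid, side):
    """Rebuild left/right samples from mid and side."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


# Made-up sample values, chosen so the arithmetic is exact in floating point.
left = [0.5, 0.25, -0.75, 1.0]
right = [0.5, 0.0, -0.5, 0.5]

mid, side = ms_encode(left, right)

# Without any loss, decoding returns exactly what went in.
roundtrip = ms_decode(mid, side)

# Hypothetical stand-in for lossy coding: store the side signal in coarse
# steps of 0.25. Small left/right differences then disappear.
coarse_side = [round(s * 4) / 4 for s in side]
lossy_left, lossy_right = ms_decode(mid, coarse_side)

print("original:", left, right)
print("lossless:", *roundtrip)
print("lossy:   ", lossy_left, lossy_right)
```

In this toy example the second and third samples come back identical in both channels after the coarse step, which is one way a narrowed stereo image could arise.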
There are, of course, a
couple of other factors to consider, such as the differences in
acquiring and storing lossless and lossy audio. Hard drives are
constantly getting cheaper and bigger, so the cost of storing
lossless audio is now a fairly marginal issue. Acquiring the
audio is another matter, although the difference there is also no
longer as vast as it once was. New CDs are still only slightly more
expensive than the same albums from most mp3 stores, and used CDs are almost always
cheaper. (The rip-and-resell approach has detractors but has been
thus far legally unquestioned, at least in the USA.) Lossless online
retailers are generally just about as expensive as mp3 stores, or at
worst slightly more expensive than mp3s but still less than CDs.
Hence, cost of acquisition is hardly a dealbreaker.
The real problem in
acquisition is still that of availability. Lossless online retailers,
while ever increasing in number and in content, still do not
represent anything near all of the world's available music. It can be
a pain to track this stuff down if it doesn't have the right type of
following or industry support. There are a few significant websites
(such as Bandcamp)
and many individual indie labels (Sub Pop, Merge, etc.) and bands
(speaking from experience: Ride,
The Church, Wilco,
and others) that offer lossless downloads, but many artists are still
hard to track down.
The picture is further confused
by the proliferation of HD retailers, which offer even higher sample
rates and bit depths, even though most people do not have the
equipment to take advantage of the additional audio content. This
wouldn't be a problem except that HD files are several times larger
and usually more expensive than any other digital format. (Only vinyl
competes at that price range, and that's yet another story for
another time.)
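As a rough check on how much larger HD files are, the raw data rate of PCM audio scales with sample rate times bit depth. The sketch below compares two formats against CD audio (16-bit, 44.1 kHz). The choice of 24-bit/96 kHz and 24-bit/192 kHz as examples is my assumption about what HD retailers typically sell, and the ratios describe raw PCM only; actual download sizes after lossless compression will differ.

```python
REDBOOK_RATE_HZ = 44_100
REDBOOK_BITS = 16


def raw_ratio_to_redbook(sample_rate_hz, bit_depth):
    """How many times more raw PCM data a format carries than CD audio.

    The channel count cancels out, so it is left out of the calculation.
    """
    return (sample_rate_hz * bit_depth) / (REDBOOK_RATE_HZ * REDBOOK_BITS)


# Example HD formats (assumed, not taken from any particular retailer).
for rate_hz, bits in [(96_000, 24), (192_000, 24)]:
    ratio = raw_ratio_to_redbook(rate_hz, bits)
    print(f"{bits}-bit / {rate_hz / 1000:g} kHz: {ratio:.2f}x the raw data of CD audio")
```

Under these assumptions the two formats come out at roughly three and six and a half times the raw data of CD audio.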
For me, lossless is the
answer. While the quality advantage of lossless music may not be
vast, the matters of file size and cost are less significant to me.
The difficulty of acquiring lossless audio can still be a challenge,
but it seems to be getting easier with time, and I am not opposed to
CDs. If there is one issue that still gets me about most
digital music downloads, it's the lack of album art. This is a big
deal to me, and in fact was one of the first things I ever wrote about on this blog.
Sometimes this can be found on Discogs
or other sites, but finding it in decent resolution is usually tough.
If that hurdle can be crossed, then lossless digital downloads should
clearly be considered the standard.
References:
HardForum discussion on lossless audio
Dan Turkel's signal detection theory approach [Edit 2019.10.29: broken link; archived here]
P.S. For the purposes
of this discussion, I consider "lossless" to mean Red Book
audio CD quality, i.e. 16-bit, 44.1 kHz. HD audio is an entirely different
discussion with its own contentions, such as whether most listeners actually benefit from it,
whether listeners can distinguish it, and
whether the online retail options are any good.
I do not have solid opinions of my own on these matters (yet).
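For a sense of scale, here is the size arithmetic for a single track at CD quality (16-bit, 44.1 kHz, stereo). The 4-minute track length and the 320 kbps lossy bitrate are illustrative assumptions of mine; the half to two-thirds range for FLAC is the rough figure quoted earlier in this post. Treat it as a back-of-the-envelope sketch, since real files also carry headers and tags.

```python
def pcm_bitrate_bps(sample_rate_hz=44_100, bit_depth=16, channels=2):
    """Bit rate of uncompressed PCM audio, in bits per second."""
    return sample_rate_hz * bit_depth * channels


def size_mb(bitrate_bps, seconds):
    """File size in megabytes (10**6 bytes), ignoring headers and tags."""
    return bitrate_bps * seconds / 8 / 1_000_000


track_seconds = 4 * 60          # assumed track length
lossy_bitrate_bps = 320_000     # assumed lossy bitrate

wav_mb = size_mb(pcm_bitrate_bps(), track_seconds)
flac_low_mb, flac_high_mb = wav_mb / 2, wav_mb * 2 / 3
lossy_mb = size_mb(lossy_bitrate_bps, track_seconds)

print(f"uncompressed: {wav_mb:.1f} MB")
print(f"FLAC (rough): {flac_low_mb:.1f} to {flac_high_mb:.1f} MB")
print(f"lossy:        {lossy_mb:.1f} MB")
```

With these numbers the lossless file is a few times the size of the lossy one, which is why storage cost matters less and less as drives grow.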