The Hungry Hacker's Explanation of Everything

Audio Formats Explained and Rated

21 December 2005

The MP3 format is almost two decades old now.

Dramatic improvements in personal computing and rapid advances in the field of audio compression are bringing us closer to the ideal: audio that sounds perfect in a small file.

The quest for a compressed file format that sounds almost as good as the original uncompressed audio goes on. Uncompressed CD audio occupies roughly 10 MB for every minute, so files have to be compressed to be stored practically.
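The 10 MB per minute figure can be checked with a little arithmetic. The sketch below assumes the standard CD audio parameters (44.1 kHz sampling, 16 bits per sample, two channels), which the article does not state, and the function name is only illustrative.

```python
# Standard CD audio parameters (assumed here; not stated in the article).
SAMPLE_RATE_HZ = 44_100
BITS_PER_SAMPLE = 16
CHANNELS = 2


def pcm_bytes_per_minute(sample_rate_hz=SAMPLE_RATE_HZ,
                         bits_per_sample=BITS_PER_SAMPLE,
                         channels=CHANNELS):
    """Return the size in bytes of one minute of uncompressed PCM audio."""
    bytes_per_second = sample_rate_hz * (bits_per_sample // 8) * channels
    return bytes_per_second * 60


if __name__ == "__main__":
    size = pcm_bytes_per_minute()
    print(f"{size} bytes per minute, about {size / 1_000_000:.1f} MB")
```

Under these assumptions one minute works out to 10,584,000 bytes, which matches the rough figure quoted above.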

To get close to perfect compression, lossy codecs in particular use mathematical techniques to model the workings and psychoacoustic responses of the human ear, the final judge of how “good” the music sounds.

Audio compression is either lossless or lossy:

  1. Lossless compression makes perfect copies of the original uncompressed audio (wave) file. When you uncompress the file, all the data is intact, much like Zip file compression.
  2. Lossy compression, on the other hand, discards some of the data, so the file permanently loses some information.
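The difference between the two can be shown with a toy example. This is a minimal sketch, not how any real audio codec works: it uses Python's general-purpose `zlib` as the lossless stand-in (in the spirit of the Zip comparison) and simply throws away the low bits of each sample as a crude stand-in for lossy coding. The test tone and helper names are made up for illustration.

```python
import math
import struct
import zlib


def make_tone(num_samples=4410, freq_hz=440.0, rate_hz=44_100):
    """Return 16-bit mono PCM bytes for a short sine tone (synthetic data)."""
    samples = [int(20_000 * math.sin(2 * math.pi * freq_hz * n / rate_hz))
               for n in range(num_samples)]
    return struct.pack(f"<{num_samples}h", *samples)


def lossless_round_trip(pcm):
    """Compress and decompress with zlib; the result is bit-for-bit identical."""
    return zlib.decompress(zlib.compress(pcm))


def crude_lossy(pcm, dropped_bits=8):
    """Zero the low bits of every sample. A toy stand-in, not a real codec."""
    count = len(pcm) // 2
    samples = struct.unpack(f"<{count}h", pcm)
    coarse = [(s >> dropped_bits) << dropped_bits for s in samples]
    return struct.pack(f"<{count}h", *coarse)


if __name__ == "__main__":
    pcm = make_tone()
    print("lossless identical:", lossless_round_trip(pcm) == pcm)
    print("lossy identical:   ", crude_lossy(pcm) == pcm)
    print("compressed sizes:  ", len(zlib.compress(pcm)),
          len(zlib.compress(crude_lossy(pcm))))
```

The lossless path always returns the original bytes. The lossy path cannot, because the discarded bits are gone for good; in exchange, the coarser data usually compresses further.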

In the frequency domain, the range of human hearing is approximately 20 Hz to 20,000 Hz. The dynamic range of human hearing is approximately 120 decibels. Signals over 90 dB may cause permanent hearing damage. Lossy audio compression works by removing the parts of the uncompressed audio that are inaudible, such as content that lies above or below the thresholds of human hearing.
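To give a feel for what a 120 dB dynamic range means, the sketch below converts decibels into plain ratios. It assumes the usual definitions (20·log10 for amplitude or sound pressure, 10·log10 for power); the function names are illustrative.

```python
def db_to_amplitude_ratio(db):
    """Convert decibels to an amplitude (sound pressure) ratio."""
    return 10 ** (db / 20)


def db_to_power_ratio(db):
    """Convert decibels to a power (intensity) ratio."""
    return 10 ** (db / 10)


if __name__ == "__main__":
    for level in (90, 120):
        print(f"{level} dB: amplitude x{db_to_amplitude_ratio(level):,.0f}, "
              f"power x{db_to_power_ratio(level):,.0f}")
```

On this scale, 120 dB corresponds to a factor of about one million in amplitude between the quietest and loudest sounds the ear can handle.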

Typically, each codec uses complex and unique mathematical algorithms to shrink the size of a pure sound file, with a minimal loss in quality. Thus, an ideal balance between an acceptable drop in quality and small file size is formed.

Many formats have tried to improve upon the MP3 promise of high fidelity audio with smaller file sizes. Quite a few are extinct, their developers having abandoned the standard, but a handful of picky and battle-hardened formats have survived, each using its own algorithm to store more in less.

The two primary reasons that MP3 has achieved a cult status are the easy availability of MP3 files before the music industry drove the bootleg MP3 industry underground and the easy and free availability of MP3 playing software.

For any format to take over from where MP3 left off, it will need these two factors in abundance and then some more. It will need to catch the attention of the developer community that democratized the MP3 revolution by creating shareware and free MP3 players, entire archives of MP3s online and peer to peer file swapping services to boot.

Four formats are poised to take the baton from MP3. Admittedly, they are not riding the popularity wave yet, but they could well emerge as dominant formats.

MP3PRO

Thomson Multimedia developed the proprietary MP3PRO standard in 2001 and shares the patent rights with the Fraunhofer Institute.

While it appears similar to the MP3 standard, it improves on it with a technology called SBR (Spectral Band Replication). Essentially, SBR reproduces the high-frequency components, called the PRO components, that are lost during normal MP3 encoding. By combining a low bit rate MP3 file with SBR data, you get a full-bandwidth audio file with full bass and precise treble. With SBR, MP3PRO can reproduce the quality of a 128 Kbps MP3 file at 64 Kbps, resulting in files half the size of a plain MP3.

SBR is an extra piece of information written into the MP3 file as a separate stream alongside the normal MP3 data stream. When read by a compatible MP3PRO decoder, this extra data allows the decoder to estimate what the high frequencies sound like so they can be added back on the fly. This is an effective way to improve quality because the high frequencies take the brunt of MP3's lossy compression, and handling them separately lets the encoder allot its bits to the more important areas of the song.

This format is backward compatible, so portable players with no MP3PRO decoder can still play MP3PRO files by ignoring the PRO component. Doing so drops the quality to that of the underlying encoding rate, which is comparable to a plain MP3 at the same bit rate. Software support on the decoder end is freely available. Winamp has introduced a plugin, and recent versions of other jukebox players also include support.

Unfortunately, on the encoding front, Thomson's demo version, called the MP3PRO audio player, only allows encoding at up to 64 Kbps. To encode files at 80 or 96 Kbps, you will need to buy software that comes with the MP3PRO codec. One example is Nero Burning ROM, whose demo plugin allows 30 operations at rates ranging from 24 Kbps for mono up to 96 Kbps for stereo.

MP3PRO is aimed at applications involving streaming audio, webcasting and Internet radio. For the desktop user, this format is suitable only if file size is just as important as audio quality. If you enjoy the quality of MP3s at 128 Kbps, consider shifting over to MP3PRO to save desktop or portable real estate.

However, plain MP3 files encoded at 192 Kbps and above have superior sound quality compared to MP3PRO at 96 Kbps, the maximum encoding quality possible with MP3PRO.

Avoid converting (transcoding) your existing MP3s into any format (including MP3PRO). Instead of the superior audio quality you expect, there will be a further drop in quality. Re-rip from your audio CDs and encode into the format of your choice to get clear, high-fidelity audio. Try experimenting with VBR (variable bit rate) for better compression quality.
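The idea behind VBR can be sketched with made-up numbers. In the example below, the per-second bit rates are entirely hypothetical and the helper names are illustrative; it also assumes 1 Kbps means 1,000 bits per second and ignores headers and tags. A constant bit rate stream spends the same number of bits on every second, while a VBR stream spends fewer on simple passages and more on complex ones.

```python
def stream_size_bytes(segment_bitrates_kbps, segment_seconds=1):
    """Total size of a stream given the bit rate used for each segment."""
    total_bits = sum(rate * 1000 * segment_seconds
                     for rate in segment_bitrates_kbps)
    return total_bits // 8


def average_bitrate_kbps(segment_bitrates_kbps):
    """Average bit rate across equal-length segments."""
    return sum(segment_bitrates_kbps) / len(segment_bitrates_kbps)


if __name__ == "__main__":
    # Hypothetical six-second clip: quiet intro, busy middle, quiet ending.
    cbr = [128, 128, 128, 128, 128, 128]
    vbr = [64, 96, 192, 224, 128, 64]
    print("CBR:", stream_size_bytes(cbr), "bytes at",
          average_bitrate_kbps(cbr), "Kbps average")
    print("VBR:", stream_size_bytes(vbr), "bytes at",
          average_bitrate_kbps(vbr), "Kbps average")
```

Both hypothetical streams come out the same size, but the VBR one has put its bits where the music needs them, which is why it tends to sound better for a given file size.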

Windows Media Audio

Arguably the most patented and proprietary format after MP3, Windows Media Audio (WMA) is a close rival to the MP3 standard. Made only for Windows users, this format got better with the recent release of the WMA 9 codec. WMA 9 can capture audio at a very impressive 24-bit/96 kHz resolution in stereo, 5.1 or 7.1 channel surround sound, so you can record your music in discrete digital surround sound (provided you have the hardware). Unlike earlier versions, WMA 9 also has a lossless codec, and it sounds just as good on an old stereo setup. It is widely believed that WMA 9 encoded at a bit rate of just 48 Kbps sounds as good as an MP3 encoded at 128 Kbps. With WMA files encoded at 96 Kbps, you get sound fidelity and clarity achievable only in MP3s encoded at 192 Kbps and above.

WMA supports VBR encoding, which is ideal for squeezing maximum quality into a minimal file size. Of course, there is a caveat: Microsoft’s continued support for DRM (Digital Rights Management) means that distributing copyrighted WMA files is severely restricted. The licensed file is encrypted with its license key, which restricts you from storing or playing multiple copies of the file. Microsoft also tracks the transfer of the license across computers.

Hardware and software support for WMA 9 is just as good as it is for the MP3 standard. The format is backward compatible, and decoder support comes in the form of the old Winamp WMA codec. As far as encoding is concerned, you will need the new WMA 9 system codec along with encoding software, or have Windows Media Player 9 installed.

With Microsoft’s muscle (read acoustic research sometimes) backing WMA 9, it is well on its way to becoming the dominant format in the digital music arena. The format is well featured to suit diverse users, from artists wanting to distribute their music online securely to home users who need to encode a stack of CDs.

Ogg Vorbis

This is a completely open and free audio format that strives to replace all proprietary, patented formats. It achieved enormous popularity almost immediately after the Fraunhofer Institute decided to get tough on the MP3 standard patents, and it enjoys extensive developer support. In fact, the boys at Ogg Vorbis posted an open letter to the Fraunhofer Institute expressing their delight at the decision to extract license fees for MP3 and reporting higher website hits thereafter.

Ogg Vorbis is a lossy codec that compresses music using a technique similar to, but vastly better than, MP3. It supports VBR, which lets you tweak a song to achieve fine fidelity in less space. No encoding quality limit is specified; the encoders can support an amazing 16 to 500 Kbps in stereo and 32 to 256 Kbps in mono mode. Here, quality is not measured in kilobits per second. Instead, an arbitrary 10-point scale is used: quality level 0 is equivalent to 64 Kbps, level 5 is roughly 160 Kbps and level 10 is about 400 Kbps. Near audio CD quality is achieved at levels 3 and 4, which also balance sound quality and file size well.
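As a rough guide to what the quality scale means for disk space, the sketch below turns the three approximate bit rates quoted above into estimated file sizes. Only levels 0, 5 and 10 are included because those are the only figures given here; the four-minute track length, the 1 Kbps = 1,000 bits per second convention and the function name are assumptions, and real VBR files will vary with the music.

```python
# Approximate bit rates for the quality levels quoted in the article.
# Other levels are deliberately left out rather than guessed.
ARTICLE_QUALITY_TO_KBPS = {0: 64, 5: 160, 10: 400}


def estimated_size_mb(quality_level, duration_seconds):
    """Estimate file size in MB (10^6 bytes) for a listed quality level."""
    kbps = ARTICLE_QUALITY_TO_KBPS[quality_level]
    return kbps * 1000 * duration_seconds / 8 / 1_000_000


if __name__ == "__main__":
    track_seconds = 4 * 60  # hypothetical four-minute song
    for level in sorted(ARTICLE_QUALITY_TO_KBPS):
        size = estimated_size_mb(level, track_seconds)
        print(f"quality {level:>2}: about {size:.2f} MB")
```

Asking for a level that is not in the table raises a `KeyError`, which is intentional: the sketch does not try to interpolate between the quoted points.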

Many players have supported Ogg Vorbis through plugins for some time now. Winamp, for instance, supports Ogg Vorbis natively. On the encoding front, software such as dBpowerAMP converts your existing audio files to Ogg Vorbis in a few mouse clicks. Software support is expected to increase significantly over time, as more and more users see Ogg Vorbis as the format that offers the right mix of audio quality and file size.

The music industry is veering sharply away, though: Ogg Vorbis, like MP3 and unlike WMA, has no safeguards against piracy. Hardware support is also wanting, with just a few players supporting the format. Iomega and Rio are two companies all set to make their players compatible with Ogg Vorbis if consumer demand is strong enough to necessitate the change.

Ogg Vorbis is a potential rival to the MP3 format and it will continue to get better because of the flexibility that allows significant tuning and tweaking of the algorithm, even after the format is frozen–and all this for free, with no copyright or patents.

AAC

One of the most promising formats on the horizon is the patented AAC (Advanced Audio Coding) standard. In the late 1990s this format was sidelined by MP3, as it demanded unrealistic computing resources for encoding and decoding. But now, AAC is being promoted as the audio data compression standard of the 21st century. It is now an official standard under ISO-MPEG, with Via Licensing currently in charge of licensing the technology.

AAC has become the base for a number of sophisticated audio codecs, including AT&T’s a2b, and will also be used in the MPEG-4 standard.

AAC can record data both in mono and stereo mode, and up to a maximum of 48 channels of data! Add to that sampling rates of 96 kHz and broadcast quality at a bit rate of 320Kbps for 5.1 channel sound and you have the future of compressed digital audio.

The standard is made up of three different ‘profiles’, which differ from each other in sound quality. Studies conducted by MPEG show that AAC profiles encoded at 128Kbps and 96Kbps bit rates rank way ahead of MP2 at 192Kbps and MP3 at 128Kbps. While this and numerous other tests show that AAC ranks way ahead of its contenders, including WMA, its popularity is severely restricted by the fact that it is very tightly licensed. This may change though–AAC is widely regarded as one of the highest quality formats for distributing music on the Internet and is also targeted towards streaming audio. However, despite the improvements in processor technology, the processor requirements for both encoding and decoding are rather high.

Moreover, software support is quite limited. As far as decoders go, there is a plugin for Winamp. On the encoder side, AACenc is available for non-commercial use only. You can use dBpowerAMP with a plugin to convert your CDs. Hardware support is also a while away: rumors of popular portable players upgrading to this format are in the air, but the format itself is yet to pick up.

MP3 may be aging, but development continues on encoders that are leaner and produce better audio fidelity with fewer artifacts. It still rules the P2P roost as the most traded commodity online. So while the MP3 standard cannot be discounted yet, the future of digital audio clearly belongs to other standards: WMA and Ogg Vorbis aim for top-of-the-line quality, while MP3PRO gives us small yet clear files. The crystal ball also shows the emergence of MPEG-4, bringing full-blown multimedia integration across the spectrum, not just bits and pieces.

Even with complex mathematical algorithms that accurately model the psycho-acoustic responses of that incredible human organ, the ear, and what you may hear about listening room tests, there is only one real testing ground: how good does the music sound when you play it on your computer or portable?

The human ear is the final judge of quality.

Note: This article by Strykar was reprinted from his original: http://www.ilounge.com/index.php/articles/comments/audio-formats-explained-and-rated/.
