sound engineering

Audacity: miscellaneous tips

Setting ranges for Spectrum Analyzer

Aim: reproduce Monty Montgomery's demonstration that 16 bits can capture signals quieter than -96dB, by analysing these files:

- Sample 1: 1kHz tone at 0 dB (16 bit / 48kHz WAV)

- Sample 2: 1kHz tone at -105 dB (16 bit / 48kHz WAV)

Monty's caption for the corresponding spectral plot reads: 'Spectral analysis of a -105dB tone encoded as 16 bit / 48kHz PCM. 16 bit PCM is clearly deeper than 96dB, else a -105dB tone could not be represented, nor would it be audible.'

Firstly, I found the -105dB file entirely inaudible, so one has to rely on spectrum analysis to see that any such quiet signal was captured at all.
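If you prefer to generate comparable tones yourself rather than download Monty's files, here is a rough sketch using the command line ffmpeg covered later on this page. The file names are just examples, and the dither handling is my own guess from the ffmpeg documentation, not Monty's method; without dither a -105dB tone sits below one least significant bit of 16-bit PCM and would simply vanish, which is rather the point of his article:

$ ffmpeg -f lavfi -i "aevalsrc=sin(2*PI*1000*t):s=48000:d=5" -c:a pcm_s16le tone_0dB.wav
$ ffmpeg -f lavfi -i "aevalsrc=sin(2*PI*1000*t):s=48000:d=5" -af "volume=-105dB,aresample=out_sample_fmt=s16:dither_method=triangular" -c:a pcm_s16le tone_minus105dB.wav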

The dB range for the Spectrum Analyzer is not under Preferences > Spectrograms. Here are the defaults (see also http://manual.audacityteam.org/man/Spectrograms_Preferences):

It is instead under Preferences > Interface > Meter/Waveform dB range (see also http://manual.audacityteam.org/man/Interface_Preferences#Display):

I found that one could reset the dB range and simply replot without closing the Spectrum Analyzer.

However, for the max frequency range, I found that one had to close the Spectrum Analyzer and reopen it to catch the change. The highest frequency plotted by the Spectrum Analyzer will not always be exactly the same as the max frequency, it may be much higher.

I was able with the Rectangular option to reproduce the original spectrum quite well:

Visit also: http://manual.audacityteam.org/man/Audacity_Waveform#dB


Spectrum Analysis of high frequencies

I wanted to analyse this high frequency intermodulation test file from Monty Montgomery:

30kHz tone + 33kHz tone (24 bit / 96kHz) [5 second WAV]

Note firstly that on opening a WAV file in Audacity you have the option to copy the original file or work on the file in place (I tried both; the result in this case was the same).

One has to be very careful about the "default" project sample rate and the "active" project sample rate. On my first attempt, even though it showed 96kHz in the top-left near the waveform, I knew something was wrong, because when I tried to set the highest frequency of the spectrum analyzer window to 48kHz it would not go higher than 22 kHz. From http://manual.audacityteam.org/man/Spectrograms_Preferences:

'Maximum Frequency: This value corresponds to the top of the vertical scale. The value can be set to 100 Hz or any higher value. Irrespective of the entered value, the top of the scale will never exceed half the current sample rate of the track (for example, 22050 Hz if the track rate is 44100 Hz) because any given sample rate can only carry frequencies up to half that rate. '

In fact the project sample rate shown in the lower left was only 44.1kHz; I had to switch that to 96kHz as well:
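Independently of what Audacity displays, it is worth checking from the command line what the file itself really contains before importing. On OS X the built-in afinfo tool reports channels, sample rate and bit depth (the file name below is simply whatever you saved Monty's download as):

$ afinfo "30kHz_33kHz_tones_24bit_96kHz.wav"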

When I tried to create a spectrum I could see nothing.

To be extra careful, I then set the default sample rate for projects to 96 kHz (see http://manual.audacityteam.org/man/Quality_Preferences) and then reopened the project, this time permitting copying of the original WAV file. I could then generate a spectrum with up to 48kHz, but it was still empty:

But as you can see from the waveform (noting that one can't hear 30kHz and 33kHz) the frequencies are clearly present and at decent amplitude !

I could not perform spectrum analysis with high frequencies in Audacity (with the Rectangular window).

But from http://wiki.audacityteam.org/wiki/Suggested_Frequency_Analysis_Capabilit...

'Currently there is a hard upper limit of 100KHz for the maximum visible frequency in the spectrogram view. Most users will have audio files with a sample rate of 192KHz or less, and that a 100KHz limit is a reasonable default computationally and memory wise, however a warning might be more appropriate than a hard limit.'

And from: Are Your High Resolution Recordings Really High Resolution? by Teresa Goodwin:

' Setting up Audacity to Confirm Frequency Response of Downloaded Files

Open the Audacity program, click Audacity and select Preferences in the drop-down menu click Tracks, under Display - Default view mode choose Spectrogram click OK. Next click File and select Open in the drop-down menu, click Media and then Music and then pick the music file you want to test, then click Open. The music file will open as a playable spectrogram, ..

Here is a spectrogram of a 24/88.2 music file with authentic ultrasonic frequencies.'


I tried this and it worked fine, clearly showing the constant 30kHz and 33kHz frequencies:

The article above claims that Analyze > Plot Spectrum should also work:

'Next you can make a plot spectrum of about two minutes of the music from the spectrogram, highlight a section with the most high frequency energy, click Analyze and select Plot Spectrum in the drop-down menu.

Here is a plot spectrum of a 24/88.2 music file with authentic ultrasonic frequencies.'

But I tried it and again could see nothing, until I switched from the Rectangular window to the Hanning window, and then it worked !

Visit also: Quick Tips: Audio Frequency Analysis using Audacity


Mac OS X: EBU R128 compliant loudness meters and batch processing

FREE EBU R128 meters

These tools are in addition to the FREE EBU R128 ("ebur128") command line loudness filter demonstrated at: A summary of a review of music levels for broadcasting, personal use, recording and mastering, including the new LOUDNESS measures.
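For reference, a minimal sketch of running that free loudness filter from the command line, assuming an ffmpeg build that includes the ebur128 filter; it prints momentary, short-term and integrated loudness plus loudness range to the console while discarding the decoded audio:

$ ffmpeg -nostats -i input.wav -filter_complex ebur128 -f null -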

FREE from Klangfreund: LUFS Meter:

'EBU R128 compliant loudness measurement

The LUFS Meter plugin enables you to deliver loudness-calibrated content.

Multi-Platform, Multi-Format

Available as VST- and Audio Unit-plugin on Mac. On Windows, the LUFS Meter is available as a VST-Plugin. 32 and 64 bit. Support for Linux and other plugin formats is planned.

http://www.klangfreund.com/lufsmeter/download/'

Please note that for Audacity you must use the 32-bit version, as Audacity does not support 64-bit VST plugins !

I managed to get it to run (preview) a pink noise test file within Audacity:

But it kept crashing Audacity whenever I clicked the Ok button !

Professional paid EBU R128 Meters

Grimm Audio provide a range of professional EBU R128 and ATSC A/85 compatible software and VST, AU, and RTAS plugins.

From LevelView (Price: € 350.00 excl. VAT):

'LevelView is a highly innovative real time loudness meter. Its 'Rainbow meter', based upon the 'Bendy Meter' concept of BBC Research, gives the user continuous insight in recent loudness levels of the program material.

"LevelView is the most innovative real time solution for EBU R128 and ATSC A/85 compliance."

..

LevelView runs as a plugin in most DAW's on both Mac and PC. There's also a standalone application that directly connects to your sound card.

In trial mode the program has a 14 days evaluation period.'

From LevelOne (Price: € 450.00 excl. VAT ):

'LevelOne offers EBU R128 and ATSC A/85 compatible loudness normalization.

..

It performs all your level normalization tasks automatically and accurately. You have the choice of normalizing to sample peak, true (over-sampled) peak, PPM peak or ITU/EBU LUFS loudness target levels.

..

In trial mode the program has a 14 days evaluation period.'

From Waves: WLM Loudness Meter ($400) for Mac or Windows:

'The Waves WLM Loudness Meter plugin provides precision loudness measurement and metering for broadcast, movie trailers, games, packaged media and more. Fully compliant with all current ITU, EBU and ATSC specifications, the WLM offers comprehensive Momentary, Short Term, Long Term, and True Peak readouts ..'

Mac OS X: audio engineering plugins

From Wikipedia: Virtual Studio Technology:

'Virtual Studio Technology (VST) is a software interface that integrates software audio synthesizer and effect plugins with audio editors and hard-disk recording systems. VST and similar technologies use digital signal processing to simulate traditional recording studio hardware in software. Thousands of plugins exist, both commercial and freeware, and a large number of audio applications support VST under license from its creator, Steinberg.'

'VST plugins generally run within a digital audio workstation (DAW), to provide additional functionality. Most VST plugins are either instruments (VSTi) or effects, although other categories exist—for example spectrum analyzers and various meters. VST plugins usually provide a custom graphical user interface that displays controls similar to physical switches and knobs on audio hardware. Some (often older) plugins rely on the host application for their user interface.

VST instruments include software simulation emulations of well-known hardware synthesizers and samplers. These typically emulate the look of the original equipment as well as its sonic characteristics. This lets musicians and recording engineers use virtual versions of devices that otherwise might be difficult and expensive to obtain.

VST instruments receive notes as digital information via MIDI, and output digital audio. Effect plugins receive digital audio and process it through to their outputs. (Some effect plugins also accept MIDI input—for example MIDI sync to modulate the effect in sync with the tempo). MIDI messages can control both instrument and effect plugin parameters. Most host applications can route the audio output from one VST to the audio input of another VST (chaining). For example, output of a VST synthesizer can be sent through a VST reverb effect.'

From Wikipedia: Audio Units:

'Audio Units (AU) are a system-level plug-in architecture provided by Core Audio in Mac OS X developed by Apple Computer. Audio Units are a set of application programming interface services provided by the operating system to generate, process, receive, or otherwise manipulate streams of audio in near-real-time with minimal latency. It may be thought of as Apple's architectural equivalent to another popular plug-in format, Steinberg's VST. Because of the many similarities between Audio Units and VST, several commercial and free wrapping technologies are available (e.g. Symbiosis and FXpansion VST-AU Adapter).'

'Mac OS X comes with Audio Units allowing one to timestretch an audio file, convert its sample rate and stream audio over a Local Area Network. It also comes with a collection of AU plug-ins such as EQ filters, dynamic processors, delay, reverb, and a Soundbank Synthesizer Instrument.

AU are used by Apple applications such as GarageBand, Soundtrack Pro, Logic Express, Logic Pro, Final Cut Pro, MainStage and most 3rd party audio software developed for Mac OS X such as Ardour, Ableton Live, REAPER and Digital Performer.'

From Quick Tip: How to Manage VST and AudioUnits Plugins in Mac OS X (2010):

'VST and AudioUnits (AU) are the two native plugin formats for Mac OS X. Although there are other DAW specific formats for plugins, VST and AudioUnits are more common and compatible across various DAWs like Cubase, Logic, etc. There is an abundance of VST and AU plugins for expanding your DAW and building your collection of effects. However, it can be difficult to know how to get those plugins running on your computer. Especially if they are free and do not come with installers or instructions. I’ll help you get those files in the right places and make them appear in your plugin stacks.'

'The plugin folder is nested in the Macintosh HD Library. There are usually a minimum of two Libraries on your Mac, one in Macintosh HD and another in your user account. You should only place the plugins in the Macintosh HD Library so that it can be accessed by all users on the computer. The usual location of the folder should be:

/Macintosh HD/Library/Audio/Plug-Ins/
$ ls -1 /Library/Audio/Plug-Ins/

Components
HAL
MAS
VST

How to Install VST Plugins

1. Unzip the downloaded file if it is an archive like .zip or .rar. You should only see a file with a .vst extension. This is the actual file required for the plugin.

2. Move the .vst file to the VST folder in your audio plugins folder.

3. If your DAW is running, close it and restart it. When your DAW starts up, it will rescan your plugins folder and detect your recently installed plugin.

How to Install AudioUnits Plugins

1. Unzip the downloaded file if it is an archive like .zip or .rar. You should only see a file with a .component extension. This is the actual file required for the plugin.

2. Move the .component file to the Components folder in your audio plugins folder.

3. If your DAW is running, close it and restart it. When your DAW starts up, it will rescan your plugins folder and detect your recently installed plugin.

Other Plugin Formats

You might come across another folder labelled VST3, this is for VST3 plugins which are not as common as of yet. They can be identified with the .vst3 file extension. MAS is used for MOTU Audio System. HAL is Hardware Abstraction Layer and you should not be needing to change anything there.'
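In shell terms, the two install procedures quoted above amount to copying the plug-in bundle into the matching folder and restarting the DAW. A sketch with made-up plug-in names (note that .vst and .component bundles are actually folders, hence cp -R):

$ sudo cp -R SomeReverb.vst /Library/Audio/Plug-Ins/VST/
$ sudo cp -R SomeEQ.component /Library/Audio/Plug-Ins/Components/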

Plugins for Audacity

Note: Audacity is a 32-bit application so won't see 64-bit versions of VST plug-ins, even on 64-bit operating systems.

From VST Plug-ins:

'In current Audacity (and legacy 1.3.8 and later), VST effects are displayed with full GUI interface (where provided by the plug-in), and without need of the VST Enabler. This has been made possible by use of an open source VST header.

When Audacity is first launched, an "Install VST Effects" dialogue will appear which lists VST plug-ins detected in the Plug-Ins folder inside the Audacity installation folder and in other system locations. Press OK to load the chosen plug-ins. Prior to Audacity 2.0.4, the scan happened automatically with no choice of which effects to load. Your VST effects will appear in the Effect menu, underneath the divider.

When you restart Audacity again it will reload the plug-ins it detected last session, as stored in the plugins.cfg file in the Audacity folder for application data. This avoids slowing down each Audacity launch by scanning for new plug-ins. So if you add more VST plug-ins later, you must go to the Effects tab of Audacity Preferences, check "Rescan VST effects next time Audacity is started ", then restart Audacity. If you subsequently remove any VST plug-ins, they will automatically be removed from the Effect menu after restart, without need for a rescan (as long as you are using 1.3.10 or later).'

From Audio Units:

'This page describes support for Audio Unit effect plug-ins in Audacity. Audio Units is a plug-in architecture developed by Apple and is only supported in Audacity 1.3.1 and later on Mac OS X.

..

Audio Unit support

Audio Unit (AU) support is available in Audacity 1.3.1 and later - Audacity scans for available AU plug-ins each time it launches. AU support is enabled by default, but it can be turned on or off by clicking Audacity > Preferences: Effects then under "Enable Effects", uncheck "Audio Unit". Restart Audacity for changes to take effect.

Audio Unit "MusicEffects" are supported in Audacity 1.3.14 and later. This class of Audio Unit supports audio input like pure "Effect" AU's but has the ability to use MIDI input to set effect parameters. Audacity doesn't yet accept MIDI input, so although MusicEffects should work fine as audio effects, parameters need to be set manually. Examples of MusicEffects are all those from DestroyFX, Ohm Force and SFXmachine, plus FXpansion Snippet, Tobybear MadShifta and u-he MFM2.

Like VST plug-ins in current Audacity, Audio Units display their full GUI interface by default, where one is provided. If interface difficulties arise, Audio Units can be limited to a tabular interface with sliders by unchecking the option "Display Audio Unit effects in graphical mode" at Audacity > Preferences: Effects. Once again, restart Audacity for changes to take effect.

.. You can find a useful list of third-party AU plug-ins (free and demo/paid-for) on Hitsquad.

To add new Audio Units (AU) plug-ins, place them in:

/Library/Audio/Plug-Ins/Components/

OR

~/Library/Audio/Plug-Ins/Components/

and restart Audacity. As always, ~ means your home directory. Audacity will not load Audio Unit plug-ins from the Audacity "Plug-ins" folder. '
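A quick way to check which Audio Units the system itself can see (independently of any host application) is Apple's command line validation tool auval, which ships with OS X; it lists every registered AU by type, subtype and manufacturer code:

$ auval -a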

From Nyquist Plug-ins:

'Audacity supports Nyquist effects on all operating systems, and includes a number of Nyquist plug-ins. You can download additional Nyquist plug-ins, edit their behavior, or even write your own. Nyquist Plug-ins are merely plain text files which can be opened and studied using any simple text editor.

..

We host a large collection of Nyquist plug-ins for use in Audacity

Installation

On Windows and OS X, place new Nyquist plug-ins in the Plug-Ins folder inside your Audacity installation folder and restart Audacity. Your installation folder is usually under C:\Program Files on Windows computers, or under Mac Hard Disk > Applications on OS X.

On Linux, place new Nyquist plug-ins in one of the following locations:

- /usr/share/audacity/plug-ins if Audacity was installed from a repository package

- /usr/local/share/audacity/plug-ins if you compiled Audacity from source code

- ~/.audacity-files/plug-ins which is a per-user directory for which super-user privileges are not required (Note: the .audacity-files folder is not created during installation so must be created manually)

- in a Nyquist directory specified in the AUDACITY_PATH environment variable.

Restart Audacity then new plug-ins will be visible in either the Effect Menu, or sometimes in the Analyze or Generate menus. '
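For the per-user Linux case mentioned above, the folder has to be created by hand before copying the plug-in in; a quick sketch with a made-up plug-in file name:

$ mkdir -p ~/.audacity-files/plug-ins
$ cp some-effect.ny ~/.audacity-files/plug-ins/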

From Wikipedia: Nyquist (programming language):

'Nyquist is a programming language for sound synthesis and analysis based on the Lisp programming language. It is an extension of the XLISP dialect of Lisp.

With Nyquist, the programmer designs musical instruments by combining functions, and can call upon these instruments and generate a sound just by typing a simple expression. The programmer can combine simple expressions into complex ones to create a whole composition, and can also generate various other kinds of musical and non-musical sounds.

The Nyquist interpreter can read and write sound files, MIDI files, and Adagio text-based music score files. On many platforms, it can also produce direct audio output in real time.

The Nyquist programming language can also be used to write plug-in effects for the Audacity digital audio editor.

One notable difference between Nyquist and more traditional MUSIC-N languages is that Nyquist does not segregate synthesis functions (see unit generator) from "scoring" functions. For example Csound is actually two languages, one for creating "orchestras" the other for writing "scores". With Nyquist these two domains are combined.

Nyquist runs under Linux and other Unix environments, Mac OS, and Microsoft Windows.'

From Ladspa Plug-ins:

'LADSPA (Linux Audio Developers Simple Plugin API) is an audio plug-in standard originally developed on Linux, but which can be ported to Windows and Mac too. Audacity has built-in support for LADSPA plug-ins.'

Mac OS X: some audio engineering apps and tools

These are in addition to: FFmpeg: command line and GUI audio/video conversion tool: audio references

Most are known to run on Mac OS X Mountain Lion 10.8.5 as of 2013.

Converters

- FLAC tools: official command line tools for FLAC format.

- X Lossless Decoder (XLD): super little free GUI app for Mac OS X, can handle FLAC and ALAC and some other lossless formats, as well as converting from say FLAC to lossy formats like MP3 or AAC.

- Free Audio Converter (FREAC) GUI app: free audio converter and CD ripper. Features MP3, MP4/M4A, WMA, Ogg Vorbis, FLAC, AAC, and Bonk format support, integrates freedb/CDDB, CDText and ID3v2 tagging.

- Max: CD ripper and encoder that supports FLAC and some other formats.
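As well as the GUI converters above, the command line ffmpeg covered later on this page can handle simple lossless-to-lossless conversions. A sketch for FLAC to ALAC, assuming your ffmpeg build includes its native alac encoder (file names are just examples):

$ ffmpeg -i input.flac -c:a alac output.m4a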

Sound editors

- If I want to do anything exciting involving my own music I use the absolutely awesome Ableton Live for recording, editing, composition, and mastering. (BTW Ableton Live 9 supports multitrack recording up to 32-bit/192 kHz.) I can be engaged for professional audio services: visit Ableton Live (audio).

- Sometimes for post-processing or certain tasks I also use the audio editing in Final Cut, and I likewise offer professional media services for it: visit Final Cut video and audio editing and production.

But sometimes it is nice to be able to load a simpler audio editor for a quick fade-in/out or normalisation job, or just to make a quick recording.

- Audacity is a free, open-source, cross-platform audio editor for Mac, GNU/Linux, Windows etc. It's not the world's best audio editor (especially not for MP3 or AAC, because it imports, processes, then re-exports with a small quality loss rather than, say, editing MP3 directly), but it has lots of FX and plugins and is sufficient for experiments, quick edits, and some post-processing, as well as wave analysis. As of Nov 2013 on OS X Mountain Lion 10.8.5, I find it far more stable than it used to be. Given that it's free, there's an awful lot that you can do with Audacity.

- To use Audacity with MP3 you will also need to install the LAME MP3 encoder; it's easy.

Internally Audacity works on uncompressed audio in 32-bit floating point by default, and offers sample rates up to 96kHz. You may simply import, edit, then export the result (the edit history is not preserved), or save the edited audio in its native AUP multi-file project folder format. In order to play the results in other programs, you must always export to another well-known format, and it supports nearly every format you will ever need.

- There is an unofficial Wave Stats plugin for Audacity that performs excellent wave analysis over regions of about 30s length, which is enough for you to explore the difference between dBFS RMS and max peaks.

To learn how to install Plugins for Audacity (and most other audio editors) on Mac OS X visit: Mac OS X: audio engineering plugins.

- From Rogue Amoeba for $32: Fission:

'Crop and trim audio, paste in or join files, or just rapidly split one long file into many. Fission is streamlined for fast editing. Plus, it works without the quality loss caused by other editors, so you can get perfect quality audio even when editing MP3 and AAC files. If you need to convert formats, Fission can do that too! You can rapidly export or batch convert files to the MP3, AAC, Apple Lossless, FLAC, AIFF, and WAV formats.'

I tried the free demo for file splitting on silences, not bad.

Here are some other editors I have not yet tried, but they might be worth a go:

- TwistedWave is available for Mac ($79.90), iPhone / iPad ($9.99) and online. TwistedWave for Mac is available as a fully functional 30 day demo. Can handle audio at a resolution up to 32-bit and 192 kHz sampling rate. Includes batch processing with silence detection for splitting long recordings into many files. Can perform pitch correction, pitch shift, and time stretch.

- NCH Software offer the Master's edition of WavePad for $59.95 (includes VST plugins and SFX library), however:

'A free version of WavePad audio editing software is available for non-commercial use only. The free version does not expire and includes most of the features of the normal version. If you are using it at home, you can download the free version here. You can always upgrade to the master's edition at a later time, which has additional effects and features for the serious sound engineer.'

Supports sample rates from 6 to 96kHz, stereo or mono, 8, 16, 24 or 32 bits.

There are dozens of other sound editors for Mac, but as far as I can tell, unless you are working on some real original music composition with something truly professional like Ableton, all you need is Audacity (free).

Players

- Well obviously iTunes: plays most formats including WAV, AIFF, MP3 and AAC, likes compressed lossless ALAC, but does not play lossless compressed FLAC directly (yet). But that is not so bad because ...

- Fluke app: small OS X utility for listening to FLAC files within iTunes, without having to convert anything.

- QuickTime Player: although mainly known as a video player, is very useful for playing audio files (with a simple audio player GUI mode), and it also has a nice file info display with bit rates, sample rates etc. QuickTime is especially useful when you don't want to pollute your iTunes library with audio test files. Just right click and "open with .." then choose QuickTime Player instead of iTunes (or even set QuickTime Player as default for that audio file kind). However, as far as I can tell, QuickTime Player 10.2 still does not play FLAC.

- From Mac Software to play and convert FLAC:

The following software will play FLAC files without any requirement for modification - simply download, install and start using the current version.

- Cog: http://cogx.org

- Play: http://sbooth.org/Play/

- VLC: http://www.videolan.org/vlc/index.html

- Songbird: http://getsongbird.com

- Bigasoft Audio Converter: http://www.bigasoft.com/flac-converter-mac.html

- From Audiofile Engineering for $US 19.99, Fidelia: Premium Music Player:

Fidelia is a high-definition audio player for sophisticated music lovers. With support for all contemporary audio file formats and an elegant interface that focuses exclusively on music, it gives users the power and the freedom to organize, customize and savor their digital music collection at the highest possible fidelity in any circumstance. If you've invested in premium audio hardware, you should have the best audio software.

Plays FLAC. Has adjustable real-time dithering.

- From Sbooth for $US 33 comes Decibel:

'Decibel is an audio player tailored to the particular needs of audiophiles. Decibel supports all popular lossless and lossy audio formats including FLAC, Ogg Vorbis, Musepack, WavPack, Monkey's Audio, Speex, True Audio, Apple Lossless, AAC, MP3, WAVE and AIFF. For lossless formats such as FLAC and WAVE, and for Ogg Vorbis and specially tagged MP3 files, Decibel supports gapless playback with seamless transitions between tracks. Decibel processes all audio using 64-bit floating-point precision, providing the highest possible playback quality for files sampled at all bit depths.'

Monitors/Meters

Pro Level is a simple little $US 5 app with various VU-like digital monitors and some nice simple peak and clip hold settings, but you will need SoundFlower to shunt whatever stream you are targeting back through as an audio input source before it will see it (compare with Audio Hijack below, which you can also use to monitor system audio or any application's sound output directly).

Spectre: real-time Studio Multi-analyzer from Audiofile Engineering for $US99:

'Spectre is a multi-instrument real-time audio analyzer for Mac OS X. Designed in Cocoa from the ground up, Spectre proudly takes advantage of Quartz, OpenGL, CoreAudio, and other solid OS X interface features. Flexibility & Precision. Spectre focuses squarely on live audio analysis by offering 17 different multi-channel and multi-trace meters. Each meter can have any number of traces or indicators, and each trace can have it's own number of input channels, gain, mixing, filtering, ballistics and color (including transparency).'

Rerouting/shunting

- A likely "must have" for audio fun on a Mac is SoundFlower:

'Free Inter-application Audio Routing Utility for Mac OS X. Soundflower is a Mac OS X (10.2 and later) system extension that allows applications to pass audio to other applications. Soundflower is easy to use, it simply presents itself as an audio device, allowing any audio application to send and receive audio with no other support needed.

How To Use Soundflower

Soundflower presents itself as one of two audio devices (2ch / 16ch). The 2-channel device is sufficient for most situations. To send the output of one application to another, select Soundflower as the output device in the first application and Soundflower as the input device within the second application. If an application does not allow you to specify audio devices, you can make Soundflower the default input or output device inside the Sound panel in the System Preferences, or with the Audio MIDI Setup utility application. The 16-channel device is provided for more complex routing situations, and can be used with more than two applications simultaneously if the applications support audio routing to any channel, as Max/MSP does.'

But some of the functionality you might achieve with SoundFlower is more easily achieved out-of-the box with a good "hijacker".

Audio stream hijackers

"Exploring" and recording your (Mac) computer system's and applications' music sources (including online radio):

- Audio Hijack Pro (at around $US 32) is an absolutely super bit of software. You can record nearly any source (including any application) on your Mac, or full system audio. You can record Skype, Facetime, or anything you choose to "hijack", such as a particular web browser playing online radio. (Oops, I said it.) It has a very rich set of FX too, including tapping into all available VST and Apple FX, and you can customise nearly everything, including recording format, bit-rates, levels, schedule recordings, split recordings on-the-fly according to silence detection (with adjustable parameters). You can even use it to shunt audio around your system bus. Amazing !

- Also by Rogue Amoeba there is a new mini-version called Piezo, which, unlike Audio Hijack Pro, meets the restrictions for entry into the Apple App Store. It enables you to record audio from any application, but you have to restart the app every time after hijacking before a recording can start.

- To be fair I should also mention SoundTap (Mac and Win) from NCH Software, who also have a super kit of other audio apps. It is not nearly as powerful as Audio Hijack Pro, but it's enough to tap a bit of your computer's sound quickly.

- And also Snowtape:

'Listen to internet radio. Record the music. Schedule radio shows. Edit songs and get album artwork. Export to iTunes.'

Hang on. Record the music ? From internet radio ? Ooh aah, that's naughty !

- And also Fstream for Mac. 'Listen to and record online radio easily'. Also available as an iPhone radio listening app.

No wonder so many online radio streams deliberately keep under 128kbps !

Some other audio apps and tools

- PureData: patch-based real-time audio and video synthesis, from the Max/MSP family. PureData is amazingly powerful and very clever. See also the Puredata synthesis zone for some examples. I am a huge fan of the PureData project; may Miller Puckette and the PureData/GEM community be blessed.

To see how I use PureData to synthesise music and visuals from triaxial accelerometers to make real-time body music (gestural synthesis), see the Drancing project.

- MP3-Info is a very handy little app:

'MP3-Info is a clever companion that helps you organize your music collection. It is essentially a Plug-in for the Finder and iTunes. MP3-Info displays valuable information about audio files, such as their duration, the bitrate, important MP3-Tags, such as the artist, the title of the song, lyrics, cover art, and some more. That saves you a lot of time managing your song collection. It also shows these information for AAC files created by iTunes, and WAV, and AIFF files.'

- I haven't tried it yet, but AudioFinder sounds amazing. It can preview any audio file and give metadata and stats directly in the Mac Finder.

- From TuneSweeper:

'Quickly find and remove all duplicates in your iTunes library. Remove missing iTunes files. Add additional music on your computer into iTunes.'

- FREE from AudioSlicer:

'AudioSlicer is a Cocoa GUI application for Mac OS X that finds all silences in an audio file and allows you to split it into several smaller audio files and to name/tag them properly. For now only MP3 is supported but other audio formats may be added in the future.'

FFmpeg: command line and GUI audio/video conversion tool: audio references

FFmpeg can be used for both video and audio conversion and stream manipulation; this page records some useful links for audio work with FFmpeg.

The reality is that I mostly use other Mac GUI tools like XLD (X Lossless decoder), Freac, Max etc. for converting, but I do find ffmpeg command line useful for tricky situations, and you can use it to make a nice and very quick audio statistics command line tool for checking RMS values and peaks in files (so that you can then decide whether or not they should be normalised, without having to load them in a sound editing app).


Command line ffmpeg

Command line 'ffmpeg' for Mac can be installed using MacPorts. I got it working OK on Mac OS X Mountain Lion, but you should at least be a bit UNIX savvy to try this. You will also need the LAME MP3 Encoder port if you want to deal with MP3.

$ sudo port install ffmpeg

$ sudo port install lame

(As always with MacPorts, don't be scared to use the -f (force) option if you upgraded your OS recently !)


FFmpegX is a rather clunky but nevertheless very powerful GUI for FFmpeg for Mac OS X. For some reason I could not get it to work with FLAC lossless compression files, but the command line version works with FLAC.


Useful FFmpeg links for audio:

- Audio options

- Video and audio file format conversion

- Rodrigo Polo's FFmpeg cheat sheet.

- FFmpeg basics for beginners

- An amazingly comprehensive 6-part series from 2011/2012 by Fabio Sonnati FFmpeg: the swiss army knife of Internet Streaming.


Supported audio types. You can also check these from the command line:

$ ffmpeg -codecs
$ ffmpeg -formats
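The output of both commands is long, so it is worth piping it through grep; for example, to confirm that your build handles FLAC and ALAC (the banner goes to stderr, hence the 2>/dev/null):

$ ffmpeg -codecs 2>/dev/null | grep -iE 'flac|alac'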

Making a simple audio statistics analyzer

Store the following in a file at ~/bin/@ffmpeg-statistics:

#!/bin/bash
# Print level statistics (mean volume, max volume, histogram) for the audio file given as $1,
# by decoding it through ffmpeg's volumedetect filter and discarding the decoded output.
ffmpeg -i "$1" -filter:a "volumedetect" -vn -f null /dev/null

Make sure you make it executable with:

chmod +x ~/bin/@ffmpeg-statistics

And do "quote" your audio file name if it contains spaces:

@ffmpeg-statistics "my audio file.mp3"

Output is like:

Duration: 00:05:32.43, start: 0.000000, bitrate: 128 kb/s
Stream #0:0: Audio: mp3, 44100 Hz, stereo, s16p, 128 kb/s
Output #0, null, to '/dev/null':
Metadata:
..
encoder : Lavf55.12.100
Stream #0:0: Audio: pcm_s16le, 44100 Hz, stereo, s16, 1411 kb/s
Stream mapping:
Stream #0:0 -> #0:0 (mp3 -> pcm_s16le)
..
size=N/A time=00:05:32.43 bitrate=N/A
video:0kB audio:57263kB subtitle:0 global headers:0kB muxing overhead -100.000038%
[Parsed_volumedetect_0 @ 0x7f8fa8412d00] n_samples: 29318494
[Parsed_volumedetect_0 @ 0x7f8fa8412d00] mean_volume: -17.1 dB
[Parsed_volumedetect_0 @ 0x7f8fa8412d00] max_volume: -1.4 dB
[Parsed_volumedetect_0 @ 0x7f8fa8412d00] histogram_1db: 12
[Parsed_volumedetect_0 @ 0x7f8fa8412d00] histogram_2db: 583
[Parsed_volumedetect_0 @ 0x7f8fa8412d00] histogram_3db: 9111
[Parsed_volumedetect_0 @ 0x7f8fa8412d00] histogram_4db: 42170
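Since the script takes a single file argument, a simple shell loop will give you the statistics for a whole folder in one go (a sketch; adjust the glob to your own file types):

$ for f in *.mp3; do echo "=== $f ==="; ~/bin/@ffmpeg-statistics "$f"; done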

There are lots of examples of the use of FFmpeg for command line processing, including the new EBU R128 loudness filter "ebur128", at: A summary of a review of music levels for broadcasting, personal use, recording and mastering, including the new LOUDNESS measures.

Audio engineering test/sample file resources, and online generators and online audio tests

Some handy audio test file and test generator resources

- AudioCheck - Internet's largest collection of Sound Tests, Audio Test Tones, and Tone Generators. Online and Free!

- AudioCheck: Pink noise test tones. Includes excellent description of pink noise principles. Pink noise sample files generated using wavTones' professional grade Pink Noise Generator. Download Pink Noise CD Quality WAV.

- AudioCheck: High Definition Audio test files, though "high definition", it seems, only in the sense of sample rates higher than 44.1kHz. Frequency sweeps, chirp tones, white noise, pink noise. Includes Pink Noise: 96kHz, -3dBFS, 30s, 5.6MB (WAV) BUT ONLY 16-bit.

- AudioCheck: dynamic test tones: a series of pink noise files at full scale and then a given number of dB below, down to -72dB, and then mute:

'Nowadays, much emphasis is placed on 24-bit recordings, with a dynamic range exceeding 140dB. Use these test tones to realize how 16-bit supersedes by far the dynamic range offered by your listening environment.

"At 20 bits, you are on the verge of dynamic range covering fly-farts-at-20-feet to intolerable pain. Really, what more could we need? (a quote from the internet)." '

I downloaded them and performed tests on them as reported here: [node/3247].

- Ten-minute clips of white noise, pink noise and Brownian noise. Recorded in stereo at a 24-bit 48-kHz rate. Synthesized with Sound Forge Software. Available in 24bit FLAC, Ogg Vorbis, 64Kbps MP3, and VBR MP3.

- Sound tests and clips: WAV 48KHz, 16bit stereo. Examples: LRMonoPhase, piano, some organ sounds. Maybe more useful Pink Noise, 48k/32Float, Stereo, 3.7MB.

- A very useful list of Free 24/96 Downloads.

- Label 2l.no where you can download sample FLAC and WAV in 24 Bit / 96 kHz & 24 Bit / 192 kHz files

- LINN Records test files: 16-bit and 24-bit ALAC and FLAC, 320kbps MP3 (sample rates not stated).

- Sound Keeper Recordings Format comparison: includes 16-bit/44kHz, 24-bit/96kHz, 24-bit/192kHz zipped WAV files.

- Steinway and Sons: Three excerpts from Franz Liszt's Mephisto Waltz No. 1, in Original: 24 bit / 96 kHz (103 MB) vs. CD quality: 16 bit / 44.1 kHz (32 MB).

- HIRES FLAC testbench downloads: Mozart, Beethoven etc. FLAC 24-bit/192kHz, FLAC 24-bit/96kHz ..


Listening and format comparison tests (for fun and humility)

- Up to the challenge? Do 320kbps mp3 files really sound better? Take the test!

- AudioCheck: Blind Listening Tests. A heap of truly humbling sound tests that some self-proclaimed audiophiles would be too scared to try. Try them with your cheapest headphones possible to thwart the tests completely, or do it properly and use very expensive audiophile speakers and best quality equipment .. and get almost exactly the same result.


Monty Montgomery's high definition and ultrasonic audio test files

This is a super article, 24/192 Music Downloads ... and why they make no sense, which includes ultrasonic "intermod" tests:

'a 30kHz and a 33kHz tone in a 24/96 WAV file, a longer version in a FLAC, some tri-tone warbles, and a normal song clip shifted up by 24kHz so that it's entirely in the ultrasonic range from 24kHz to 46kHz ..' 'Assuming your system is actually capable of full 96kHz playback, the .. files should be completely silent with no audible noises, tones, whistles, clicks, or other sounds. If you hear anything, your system has a nonlinearity causing audible intermodulation of the ultrasonics.'

Well, I played the normal song clip shifted up by 24kHz [10 sec WAV] on my MacBook Pro 17" early 2008 (and listened with quality headphones) and I could easily hear lots of audible noises, and even the basic rhythm and some melody of the original song ! Therefore, my MacBook Pro system, although claiming to be capable of 24-bit 96kHz (and set up correctly for it), 'has a nonlinearity causing audible intermodulation of the ultrasonics' !

I also heard lots, easily, on these ultrasonic warble files:

26kHz - 48kHz warbling tones (24 bit / 96kHz) [10 second WAV]
26kHz - 96kHz warbling tones (24 bit / 192kHz) [10 second WAV]

Monty also includes some test files that prove that 16-bit can represent arbitrary sounds quieter than the oft quoted -96dB (which is merely an RMS value, not a limit):

'I have linked to two 16 bit audio files here; one contains a 1kHz tone at 0 dB (where 0dB is the loudest possible tone) and the other a 1kHz tone at -105dB.

Sample 1: 1kHz tone at 0 dB (16 bit / 48kHz WAV)

Sample 2: 1kHz tone at -105 dB (16 bit / 48kHz WAV)

Above: Spectral analysis of a -105dB tone encoded as 16 bit / 48kHz PCM. 16 bit PCM is clearly deeper than 96dB, else a -105dB tone could not be represented, nor would it be audible. How is it possible to encode this signal, encode it with no distortion, and encode it well above the noise floor, when its peak amplitude is one third of a bit? Part of this puzzle is solved by proper dither, which renders quantization noise independent of the input signal. ..'

I performed some spectrum analysis in Audacity on some of these Monty test files, visit: Audacity: miscellaneous tips.

On the 16-bit "sound capture below -96dB" proof:

On the 30kHz-33kHz ultrasonic 24-bit 96kHz file:


A review and some remarks on lossless and lossy sound formats, audio codecs, and trends (fashions)

These are some remarks on my recent review of comparisons of bit rates and sample rates, and the endless arguments about whether to use 128kbps, 192kbps, 256kbps, or even 320kbps bit rates with lossy audio compression formats like MP3 at 44.1kHz. It also reviews the debate about whether it matters that consumer recordings use "only" CD quality 16-bit sample bit depth at a 44.1kHz sample rate, rather than higher sample bit depths like 24-bit and higher sample rates like 48kHz, 96kHz or even 192kHz for lossless formats, so-called "high(er) definition" audio.

I also have a guide on "high definition audio" for Mac users here: Mac OS X: HOWTO adjust your system's sound quality, and record or find "high definition" audio sources.

Human beings happily enjoying music without being sad about bit rates and compression formats

WARNING: this page is fairly littered with links to MP3 bit rate and high definition audio comparison tests.
Find also various sample test files via: Audio engineering test/sample file resources, and online generators and online audio tests

Firstly, I'll put myself out there. If a musical composition and recording is basically quite good, and you can't even enjoy it at all in MP3 format on a small personal music playing device at 128kbps or even "as low as" 96kbps, then your view of life is broken, and you should go back in time to the trenches of WWI or perhaps explore a firestorm of WWII and see how much fun that is, until you realise how lucky you are to be alive and to be able to own and carry a cheap personal music playing device smaller than your pocket that holds days, weeks, or even months of music.

I listen to SBS Chill on my nice DAB+ digital radio at "only" 56 kbps/AAC (HE-AAC) and I manage to enjoy it.
I would prefer them to allocate a bit more, but the music is simply amazing anyway.

It is of course quite nice if you can have your personal music collection at higher bit rates, and if you have genuine professional audio needs or very - and I mean very - fancy speakers at home it is "nice" to use 320kbps MP3 or perhaps even a lossless format at 16-bit 44.1kHz, assuming the original source was "only" CD quality.

Or if the original source of the incredibly cleanly recorded music really warrants it, you might convince yourself (despite numerous professionally, scientifically conducted double-blind experiments that prove you won't be able to tell the difference) that you need 24-bit sample depth for your personal music collection, possibly even at one of the higher sample rates like 96kHz, presumably because you have very - and I mean very - sharp hearing better than nearly all other human beings, including "audiophiles", who have participated in those blind tests and all failed to consistently notice any difference.

But I really don't believe your ears and home speakers are so good you ever "need" 24-bit/192kHz lossless in your personal music collection.

Some background on lossy compressed and lossless audio formats

So, enough of the opinions for now, and on to the research and summary. It is useful to know that:

- From Wikipedia: Bit rate:

'the number of bits that are conveyed or processed per unit of time.'

- From Wikipedia: Sample rate:

'defines the number of samples per unit of time (usually seconds) taken from a continuous signal to make a discrete signal.'
..
'The full range of human hearing is between 20 Hz and 20 kHz. The minimum sampling rate that satisfies the [Nyquist-Shannon] sampling theorem for this full bandwidth is 40 kHz. The 44.1 kHz sampling rate used for Compact Disc was chosen for this and other technical reasons.'

Dogs and cats and some other animals can hear higher frequencies it seems, but they don't usually use iPods, although they might listen to music on very expensive sound systems in the loungerooms of audiophiles.

I recommend also this tutorial series by Dave Marshall from 2001: Implications of Sample Rate and Bit Size

Comparing bit rates of lossy compressed samples without considering the sample rate of the encoded source is inconsistent. For example, one might compare MP3 bit rates of different sample files assuming the same sample rate (44.1 kHz for traditional reasons) but it "aint necessarily so". Mostly it is (because mostly an MP3 is encoded from a 44.1 kHz CD quality source). But not always.

The bit rate works together with the sample rate in a subtle way to give what you perceive as sound quality. From Wikipedia:MP3:

'Compression efficiency of encoders is typically defined by the bit rate, because compression ratio depends on the bit depth and sampling rate of the input signal. Nevertheless, compression ratios are often published. They may use the Compact Disc (CD) parameters as references (44.1 kHz, 2 channels at 16 bits per channel or 2×16 bit), or sometimes the Digital Audio Tape (DAT) SP parameters (48 kHz, 2×16 bit). Compression ratios with this latter reference are higher, which demonstrates the problem with use of the term compression ratio for lossy encoders.'
..

'Several bit rates are specified in the MPEG-1 Audio Layer III standard: 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256 and 320 kbit/s, with available sampling frequencies of 32, 44.1 and 48 kHz.'

'A sample rate of 44.1 kHz is almost always used, because this is also used for CD audio, the main source used for creating MP3 files. A greater variety of bit rates are used on the Internet. The rate of 128 kbit/s is commonly used, at a compression ratio of 11:1, offering adequate audio quality in a relatively small space. As Internet bandwidth availability and hard drive sizes have increased, higher bit rates up to 320 kbit/s are widespread.'

'Uncompressed audio as stored on an audio-CD has a bit rate of 1,411.2 kbit/s, so the bitrates 128, 160 and 192 kbit/s represent compression ratios of approximately 11:1, 9:1 and 7:1 respectively.'

The compression ratio for 320kbps MP3 at 44.1kHz is 4.4:1, at which point - if you care about the sound quality so much - you might as well ask yourself why not just use a lossless compression format like Apple Lossless ALAC (which BTW is also (now) supported by all iOS device (iPod, iPad, and iPhone) models) or FLAC with a compression ratio of about 2:1.
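For reference, those ratios follow from simple arithmetic on the uncompressed CD bit rate (2 channels of 16-bit samples at 44.1kHz); a quick check with bc:

$ echo '44100 * 16 * 2' | bc
1411200
$ echo 'scale=1; 1411.2 / 128' | bc
11.0
$ echo 'scale=1; 1411.2 / 320' | bc
4.4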

OK, let's look at bit depths for uncompressed lossless like WAV and AIFF:

- From Wikipedia: Audio bit depth:

'In digital audio using pulse-code modulation (PCM), bit depth is the number of bits of information in each sample, and it directly corresponds to the resolution of each sample. Examples of bit depth include Compact Disc Digital Audio, which uses 16 bits per sample, and DVD-Audio and Blu-ray Disc which can support up to 24 bits per sample.'

..

'Bit depth is only meaningful in reference to a PCM digital signal. Non-PCM formats, such as lossy compression formats like MP3, AAC and Vorbis, do not have associated bit depths. For example, in MP3, quantization is performed on PCM samples that have been transformed into the frequency domain.'

'The bit depth has no impact on the frequency response, which is constrained by the sample rate.'

- From Pulse Code Modulation (PCM):

'a method used to digitally represent sampled analog signals. It is the standard form of digital audio in computers, Compact Discs, digital telephony and other digital audio applications. In a PCM stream, the amplitude of the analog signal is sampled regularly at uniform intervals, and each sample is quantized to the nearest value within a range of digital steps. PCM streams have two basic properties that determine their fidelity to the original analog signal: the sampling rate, the number of times per second that samples are taken; and the bit depth, which determines the number of possible digital values that each sample can take.'

Just quoting bit rates without stating specifics of the encoding method is dangerous (error prone). You can't compare bit rates between say lossy MP3 and AAC with sample/audio bit depths of lossless uncompressed WAV PCM or lossless compressed ALAC or FLAC without specifying exactly what was done and how in the processing and encoding. Also, in the past there was a wide range of quality in MP3 and AAC encoders, although this is less so in 2013.

So here is the crash course in what really counts, unless you are a genuine audio professional wrangling with issues specific to high-end professional audio production (not your iPod's music collection or your home movies):

- Music CDs use 16-bit, DVD-Audio and Blu-ray can support 24 bits per sample, and they can support a range of sample rates higher than the 44.1kHz used for CDs. A lot of people listened to 16-bit CDs for a long time and it didn't kill them, and it's still not dangerous to listen to "only" 16-bit at 44.1kHz if the music is good. See also my summary at: Mac OS X: HOWTO adjust your system's sound quality, and record or find "high definition" audio sources

- From Wikipedia: Advanced Audio Coding (AAC)

'a standardized, lossy compression and encoding scheme for digital audio. Designed to be the successor of the MP3 format, AAC generally achieves better sound quality than MP3 at similar bit rates'

There is increasing support for AAC in consumer devices, but MP3 is still probably more widely supported (still the de facto standard), although the gap is closing fast. Therefore:

MP3 (a lossy compression format) is not suddenly evil just because Advanced Audio Coding (AAC) (another lossy compression format) is usually clearly the better algorithm. There are still a lot of devices that don't support AAC, and if you are preparing music for somebody who has a device that only handles MP3 (and many still do), or if you are unsure, then use MP3. There is no shame in it, and you will not be less cool than somebody who uses AAC; you will merely need a bit more (say 20% to 30%) disc/storage space and more kbps to get about the same sound quality, depending on the type of music (see comparison links at the end of this article).

And even if you use Apple devices (as I do), you are still allowed to use MP3s, in 2013.

- High Efficiency AAC is typically used by broadcasters who can only offer lower bit rates; the algorithm is tuned to work well with less data and with streaming.

- Apple's popular .m4a suffix does not automatically tell you what the audio format is. It is a container format, and could, for example, contain AAC lossy compressed audio or ALAC lossless compressed audio. One needs to open the container and look inside to know.
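A quick way to look inside is sketched here with ffprobe (part of the FFmpeg tools discussed elsewhere on this page) against a made-up file name; it prints e.g. codec_name=aac or codec_name=alac, which settles the question (afinfo on OS X gives similar information):

$ ffprobe -v error -select_streams a:0 -show_entries stream=codec_name -of default=noprint_wrappers=1 song.m4a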

- Some people claim that at lower bit rates the free open source Vorbis lossy compression format performs slightly better than AAC, but at higher bit rates above 128kbps it is likely indistinguishable.

- If you insist on using completely lossless compression, it does not matter much whether you use Apple's Lossless Audio Codec (ALAC) (often stored inside a special MP4 container with the filename extension .m4a) or the Free Lossless Audio Codec (FLAC). Really, it doesn't.

- ALAC 'Testers using a selection of music have found that compressed files are about 40% to 60% the size of the originals depending on the kind of music, which is similar to other lossless formats'

- FLAC 'Digital audio compressed by FLAC's algorithm can typically be reduced to 50–60% of its original size'.

So you can win a bet at a pub with an audiophile by quoting Wikipedia ! It depends a bit on the type of audio tested of course. But not that much.

If you are a musician or sound engineer working with professional sound recording and mixing you will need to stay lossless, and some sound editing systems support working "directly" in FLAC or ALAC - instead of uncompressed WAV or AIFF - and thus save typically around 50% storage space along the way (sometimes at the price of a bit of compression/decompression time).

Otherwise, unless you are being naughty distributing stolen PCM-sourced music via torrent sites and wish to save the torrent pirates some disk space and your torrent "customers" some download time, there is really barely any reason not to simply compress the music using a lossy format like AAC or MP3 (at 160kbps, or perhaps 192kbps, or, if you insist you can hear the difference, even at 320kbps), and you still win massively on storage space.

There is an excellent summary of the (ridiculously) large number of audio formats at: http://en.wikipedia.org/wiki/Comparison_of_audio_codecs. But it does not, it seems, show a comparison of file sizes across formats for "comparable" music styles.

Ok, time for some "expert" assessments of MP3 bitrates vs. CD quality

At the lower end of the scale, there is a "self-test" comparison by Daniel Potts (PC World) from 2002 at 'Audio compression formats compared'. It is interesting to note that he claims that MP3 with constant bit rate (CBR) can achieve 'CD quality' with 128kbps and a filesize of 960kB/min, whereas he claims AAC can achieve 'CD quality' with 80kbps and filesize of about 600kB/minute, a significantly smaller file size.

Really, 128kbps MP3 CBR ? That is probably far lower than most audio professionals would equate to CD quality. Here is another comparison from How Stuff Works 'How MP3 Files Work' (and since they know how stuff works we trust them more):

'Using a bit rate of 128 Kbps usually results in a sound quality equivalent to what you'd hear on the radio. Many music sites and blogs urge people to use a bit rate of 160 Kbps or higher if they want the MP3 file to have the same sound quality as a CD.'

'Some audiophiles - people who seek out the best ways to experience music - look down on the MP3 format. They argue that even at the highest bit rate settings, MP3 files are inferior to CDs and vinyl records. But other people argue that it's impossible for the human ear to detect the difference between an uncompressed CD file and an MP3 encoded with a 320 Kbps bit rate.'

[It is assumed above that they are talking here about Constant Bit Rate (CBR) not Variable Bit Rate (VBR).]

Here's another example of MP3 vs CD audio quality tests from Sam Lin, asserting that CD quality requires a somewhat higher MP3 bit rate, including some frequency analysis and nice graphics to prove it. He not only tested an orchestral piano piece and a pop song, he also tested some pink noise. He did blind listening tests and used a range of different sorts of speakers:

'There has been much debate on the sound quality of MP3's vs the 16-bit linear PCM used in producing audio CD's. Not being able to find much in the way of critical test results, I set out to perform some tests of my own. As a baseline, I chose 192Kbps as the lowest MP3 bitrate, since this seems to be a commonly agreed upon threshold for "near CD quality," and most of the MP3's I've listened to encoded below 192Kbps have sounded too degraded for my tastes.'

Some opinions, from a musician (live performer)

Oops, what was that I read above ? Many experts seem to agree that most/many people can't hear the difference between a 16-bit 44.1 kHz CD and an MP3 at 192kbps (CBR). And I recall well when CDs came out that lots of people seemed to enjoy the music on them ! Maybe it was ... because of good music, with good musicians, with good songs played well .. maybe it wasn't so much because of every last digital bit.

Most posh wine "experts" can't tell the difference between red wine and white wine when their nose is pegged and some can't even tell the difference when just blindfolded; it must be true if The Guardian says it too. And many - if not most - audiophiles bathe in their own self-absorbed obsession with bit rates, bit depths, and sample rates. A similar sentiment is expressed in this humorous MacWorld article 'Listen- (or shut-) up', which includes some nice tests.

Having some of your music in only 128kbps MP3 (instead of 192kbps or 320kbps MP3, or better AAC, or even in a nice compressed lossless format like FLAC or ALAC) is not a good reason to be grumpy or whinge and complain:

If you are still sad, maybe you have chosen the wrong music instead of the wrong bit rate or audio format ? Or perhaps you could learn how to sing a song or play a musical instrument instead. It may even be more fun than fussing about bit rates. And it might even sound better live, too.

You are also not suddenly a better human being than somebody else if your entire personal music collection is all, only, strictly, religiously, in 320kbps. Up to the challenge? Do 320kbps mp3 files really sound better? Take the test!

Live is best, uncompressed !

I am a live musician, a performer, and I know from experience that as long as you are an entertainer, you can bring people enjoyment, if you have the will to do it. And I also know, that nothing, no recording technology, no digital anything, will ever reproduce the sound of a live instrument, ever, anywhere. Ask my bongos (wood and skins/hide); they do some amazing things that no computer will ever do, and no number of bits are ever going to match them. Or stand near a nice brass trumpet played well and listen to it. Doh, computers and most speakers are not made of wood or hide or brass !

But what if I am a big-time DJ playing big music on big speakers to a big crowd at a really big gig ?

You mean like at your girlfriend's 18th birthday party, where all of her (heavily drinking) friends might notice if you only use 192kbps MP3s of Lady Gaga ? Maybe, just to be safe, you should instead use only 32-bit floating point lossless with 192kHz sample rates, so they can hear those really special "bright" sounds (that were never present in the original recordings anyway) above the ... noise ?

Or maybe there is a dog or cat at the party (oops, "gig") with really sharp hearing ?

If you do seriously have the chance to present your audio wares professionally on quality audio equipment, for people who seriously care (or can even notice), then by all means use 320kbps MP3 (or similarly high bit rate AAC) or, if you have the storage space to spare, simply use uncompressed lossless WAV or AIFF, or compressed lossless ALAC or FLAC (assuming your Mac software can play FLAC directly, without converting it first).

Some arguments for consistent use of 320kbps MP3s and even uncompressed lossless by DJs are made in this article A DJ’s Guide to Audio Files and Bitrates by Dan White (Sep 2012) (although the article also makes some terminology mistakes, such as in one place confusing the bit rates of lossy codecs with the sample bit depths and sample rates of lossless ones).

One potentially good argument is that if you are processing the music on the fly, such as tempo shifting for tempo matching, then - if you really have to work with lossy compressed MP3 - 320kbps is more forgiving, but of course it also means your processors have to work a bit harder. In any case, storage is now cheap and compact, and processing power is getting better all the time.

"And I am DJing with MP3s because ..."

If you really are a "big time" DJ, what are you working with MP3s for ? If you are seriously DJing professionally,
you don't have to worry about whether or not all of your music resources will fit on your iPhone.

Audiophiles insisting they can hear better than 16-bit / 44.1kHz Compact Disc quality are (probably) kidding themselves

I recommend that anybody who still seriously doubts this reads this detailed article by Monty, Mar 2012, a humorous and technically rich challenge: 24/192 Music Downloads ...and why they make no sense. Accurate, scientific, fantastic ! It also tells you how your ears work, and provides some fabulous audio test files (including some very quiet ones, and some very high frequency ones, for you to _not_ hear noise). After explaining well why a 192kHz sample rate won't help you (and may even do some harm) he explains how dithering can push the dynamics of a 16-bit system below the usually quoted RMS figure of -96dB down to -120dB, and gives test files to prove it ! And his conclusion:

'16 bits is enough to store all we can hear, and will be enough forever.'

..

When does 24 bit matter?

Professionals use 24 bit samples in recording and production for headroom, noise floor, and convenience reasons.

16 bits is enough to span the real hearing range with room to spare. It does not span the entire possible signal range of audio equipment. The primary reason to use 24 bits when recording is to prevent mistakes; rather than being careful to center 16 bit recording-- risking clipping if you guess too high and adding noise if you guess too low-- 24 bits allows an operator to set an approximate level and not worry too much about it. Missing the optimal gain setting by a few bits has no consequences, and effects that dynamically compress the recorded range have a deep floor to work with.

An engineer also requires more than 16 bits during mixing and mastering. Modern work flows may involve literally thousands of effects and operations. The quantization noise and noise floor of a 16 bit sample may be undetectable during playback, but multiplying that noise by a few thousand times eventually becomes noticeable. 24 bits keeps the accumulated noise at a very low level. Once the music is ready to distribute, there's no reason to keep more than 16 bits.

..

Listening tests

There are numerous controlled tests confirming this, but I'll plug a recent paper, Audibility of a CD-Standard A/D/A Loop Inserted into High-Resolution Audio Playback, done by local folks here at the Boston Audio Society.

..

This paper presented listeners with a choice between high-rate DVD-A/SACD
[DVD-Audio (supports up to 2-channel 24-bit 192 kHz) and Super Audio CD] content, chosen by high-definition audio advocates to show off high-def's superiority, and that same content resampled on the spot down to 16-bit / 44.1kHz Compact Disc rate. The listeners were challenged to identify any difference whatsoever between the two using an ABX methodology. BAS conducted the test using high-end professional equipment in noise-isolated studio listening environments with both amateur and trained professional listeners.

In 554 trials, listeners chose correctly 49.8% of the time. In other words, they were guessing. Not one listener throughout the entire test was able to identify which was 16/44.1 and which was high rate, and the 16-bit signal wasn't even dithered!'
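[ED: as a rough sanity check of that "they were guessing" conclusion - my own back-of-envelope arithmetic, not from the paper - with 554 trials, pure coin-flipping gives an expectation of 277 correct with a standard deviation of roughly 12, so 49.8% correct (about 276) is indistinguishable from chance:]

# Rough binomial check of the BAS listening test result quoted above (not from the paper itself)
awk 'BEGIN {
  n        = 554                     # number of trials
  observed = n * 0.498               # ~276 correct answers reported
  expected = n * 0.5                 # 277 expected from pure guessing
  sd       = sqrt(n * 0.5 * 0.5)     # ~11.8, the binomial standard deviation
  printf "observed=%.0f expected=%.0f sd=%.1f z=%.2f\n",
         observed, expected, sd, (observed - expected) / sd
}'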


Some other useful audio and sound engineering guides and resources concerning formats

- From Paul Sellars of Sound on Sound, an absolutely fabulous "must read" description of MP3s and the MP3 encoding/decoding process: Perceptual Coding: How MP3 Compression Works (May 2000).

- From The Great MP3 Bitrate Test: My Ears Versus Yours:

' .. three songs chosen from vastly different genres, encoded from CD and transcoded into the various popular bitrates available for MP3s (64k, 96, 128, 160, 192, 256, and 320kbps with VBR off) ..'

His conclusion is that in some cases he could hear no improvement above 192kbps, but in others he reckons he could hear improvements at 256kbps and 320kbps.

- From PC Pro: 24-bit audio: the new way to make you pay more for music? By Barry Collins, Feb 2011

- Mac software to play FLAC files

- An Overview of Apple Lossless Compression Results by Kirk McElhearn, May 2011. He notes that the Apple Lossless codec has gone open source. Provides some nice tests on various styles of music demonstrating ALAC:

'The range of compression for these examples is from 36% to 68%, with the majority of the examples clustering around the 50% level.' .. ' (These file sizes are similar for other lossless formats, such as FLAC, SHN and APE.)'

So hopefully there's another argument, ALAC vs FLAC, that we no longer have to have (ever again).


I hope this review was of some interest to you and don't forget:

Entertainment, sentiment, performance, and participation are more important than bit rates and sample rates. Love beats technology, and live is best.

Mac OS X: HOWTO adjust your system's sound quality, and record or find "high definition" audio sources

The following is at least applicable to Mac OS X Mountain Lion 10.8.5 in 2013.

Rule 1: there is no point having "high definition" audio resources if you don't
have your playback device set to handle high definition audio or if you don't
play them back on very good quality speakers or very good quality headphones.

What's the point of all those endless arguments about "best" and "minimal acceptable" sample rates and bit depths if your system is not even set up to let you hear the difference ? It seems most Macs from many years before 2013 have supported at least 2-channel 24-bit integer samples at a 96kHz sample rate, which is certainly higher definition than sleepy old stereo 16-bit 44.1kHz "CD quality", even if there is a lot of empirical evidence and psycho-acoustic science suggesting you may never be able to notice the difference. So I give here some tips and links for checking "high definition" audio settings on your Mac.

But firstly, one of the curious things about the rough term "high definition audio" is that it is, well, not so well defined. You will often find it used loosely on the web to mean 2-channel 24-bit 96kHz, while sometimes it is used for rates as low as 2-channel 24-bit 48kHz or even 44.1kHz, as long as 24 bits per sample (per channel) are used. Some extreme audiophiles insist it is not true high definition unless it is at least 2-channel 24-bit 192kHz (and they probably also listen to ultrasonic "whistle music" at home together with their pet dogs and cats).

There is however a very clear definition of Intel High Definition Audio, from Wikipedia:

'Intel High Definition Audio (also called HD Audio or Azalia) refers to the specification released by Intel in 2004 for delivering high-definition audio ..

Hardware based on Intel HD Audio specifications is capable of delivering 192-kHz 32-bit quality for two channels, and 96-kHz 32-bit for up to eight channels. However, as of 2008, most audio hardware manufacturers do not implement the full high-end specification, especially 32-bit sampling resolution.

.. Mac OS X has full support with its AppleHDA driver. ..'

Well I'm not sure my beloved old MacBook Pro (Early 2008, 17") can handle all that, but it certainly offers up to 2-channel 24-bit integer at 96kHz on the built-in Microphone, built-in Input, and built-in Output. You can check this for your system using the application Utilities > Audio MIDI Setup:

Whichever sample rate you choose in the "Format" section there will also (after a reload of the system info window) be reflected under About this Mac > System Report ...> Hardware > Audio > Devices, but it says nothing there about the sample bit depth:

Here you see I have now increased the sample rate for the output (only) to the maximum available for my system, 96kHz. The System Report is however strangely lacking in detail in the Intel High Definition Audio section (as is Apple's spec page for my MacBook Pro 2008):

I have so far been able to find barely anything concrete online about the audio hardware of any of my Mac machines, and in particular I can't find anything about the credentials of the input ADCs - one good reason, especially when recording live music tracks, to use an external audio interface (say over USB or FireWire) with well-known specifications instead; see the discussion below.
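As an aside, the same audio device listing shown under System Report can also be pulled from the command line (at least on Mountain Lion; I have not checked every OS X release, so treat this as a rough sketch):

# List the audio devices the system knows about (the same data as System Report > Hardware > Audio)
system_profiler SPAudioDataType

As with the System Report window, don't expect it to tell you anything about the sample bit depth; Audio MIDI Setup remains the place to see and set the current format.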

High Definition Audio explained (2008) has a good discussion of high definition audio as it is meant for Blu-Ray and HD DVD:

'The Blu-Ray and HD DVD formats are capable of up to 48Mbps. Around 30Mbps of this transfer speed is reserved for video, leaving a sizeable chunk for (uncompressed) audio.

These audio streams can be sent to an AV receiver/amplifier as bitstream (encoded digital data) or PCM (essentially raw digital data.) Bitstreamed audio from a DVD, Blu-Ray or HD DVD disc needs to be decoded. This can sometimes be done by the player and output as PCM to the amplifier/receiver. More often than not though, the decoding is done by an AV receiver/processor. Regardless of which method you use, there is no difference in quality between PCM and lossless bitstreamed formats like Dolby True HD and DTS HD Master Audio. As a result, many Blu-Ray and HD DVD discs will offer both Dolby True HD/DTS HD Master Audio and (multi-channel) PCM soundtracks for the sheer convenience.

Along with the lossless Dolby True HD and DTS HD Master Audio formats, Blu-Ray and HD DVD offer Dolby Digital Plus and DTS HD High Resolution. While being a “lossy” format, these other two new standards offer benefits that Dolby Digital and DTS from DVD discs can’t such as higher sample rates.'

Of course HD DVD is now abandoned in favour of Blu-ray, for which Wikipedia explains (along with an excellent table of supported formats):

For audio, BD-ROM players are required to support Dolby Digital (AC-3), DTS, and linear PCM. Players may optionally support Dolby Digital Plus and DTS-HD High Resolution Audio as well as lossless formats Dolby TrueHD and DTS-HD Master Audio. BD-ROM titles must use one of the mandatory schemes for the primary soundtrack. A secondary audiotrack, if present, may use any of the mandatory or optional codecs.

Phew, well that makes it easier. Basically, all Blu-ray players must at least support:

- Linear PCM (lossless): Max bitrate: 27.648 Mbit/s; Bits/sample: 16, 20, 24; Sample frequency: 48kHz, 96kHz, 192 kHz.

- Dolby Digital: Max bitrate 640 kbit/s; Bits/sample: 16, 24; Sample frequency: 48kHz; (Max 5.1 channels)

- DTS: Max bitrate: 1.524 Mbit/s; Bits/sample: 16, 20, 24; Sample frequency: 48kHz; (Max 5.1 channels)

In general, the higher sample frequencies are only available when no more than 6 channels are used.

So every BD-ROM player is required to support at least 24-bit at 192kHz (on 6 channels), which is higher than the 24-bit at 96kHz on 2 channels my old 2008 MacBook Pro can handle. It is however not as high as the 32-bit capability of the Intel High Definition Audio specification (see above).
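As a quick sanity check of those quoted figures (my own arithmetic, not from the Blu-ray specification text), the 27.648 Mbit/s maximum for linear PCM is exactly what 6 channels of 24-bit samples at 192kHz work out to:

# 6 channels x 24 bits x 192000 samples/s = 27,648,000 bit/s = 27.648 Mbit/s
awk 'BEGIN { printf "%.3f Mbit/s\n", 6 * 24 * 192000 / 1000000 }'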


Sourcing high definition audio (free)

Ok, now we know what high definition audio is (roughly) and how to ensure the Mac system audio is set to its highest definition capability, so let's get some into our Mac. There are at least the following ways:

1. Record some yourself. Definitely the most fun and instructive, and the main subject for the rest of this article. This way, as long as you do it right, you get to explore the noise level yourself, so you know it is not only a high definition recording but also a high definition source.

2. Get some free professional audio engineering test samples:

- Audio engineering test/sample file resources, and online generators and online audio tests

Especially useful are the generated ones, since you know there is not a lot of noise in them (or you know what kind of noise is in them).

Amongst the ones based on live recorded music, I found the free Steinway and Sons piano ones particularly interesting (as I am a piano player) and I confess I found it very hard in quality headphones to hear any difference between the CD quality 16-bit 44.1kHz WAV files and the 24-bit 96kHz WAV files.

3. Download free legal examples of high definition audio from all over the web (not necessarily professionally prepared audio test files, just any music). There is a good discussion of high definition audio resources here: How to find and play high-resolution audio on the Mac, Jun 2011, by Kirk McElhearn. It also includes an excellent description of the high resolution audio offered for sale on iTunes and other sites, and of some of the lossless compression formats like FLAC (FLAC-HD) and ALAC often used to distribute it:

'Playing high-res files

Macs can natively support up to 24/96, played through iTunes or other software. However, without a couple settings tweaks, audio files with resolution higher than 16-bit/44.1kHz will automatically be downsampled to that resolution. So the first thing you need to do is set your sound output to 24/96. To do so, open Audio MIDI Setup, found in /Applications/Utilities. Select the desired output on the left, and then change the settings in the Format section on the right to 96000.0 kHz and 2ch-24bit.

Once you’ve made this change, you can play files at any resolution up to and including 24/96; lower-resolution files will actually be upsampled to 24/96 (which, unfortunately, won’t make them sound any better.)'

But you will never know for sure whether the actual music/sound source was in fact "high quality" and worth the high resolution audio treatment. The only way to be sure of that is to either record it yourself (very carefully), or get professional audio test files.

4. Steal illegal high resolution audio resources from all over the web (like high definition FLAC popular on torrent sites). I'll ignore this one. Besides, you can't be sure of what you get anyway !

5. Rip illegal high resolution audio resources from say Blu-rays at home. I'll ignore this one too. And you usually still can't be sure what the specs of the source behind the mastered audio put on the Blu-ray were, or the quality of the source, even if a "pro" did it for a major production house and a major entertainment distributor.

CAUTION: just because music samples are distributed using a high definition audio format does not mean the music is in fact "high definition audio". It is in some cases no more than a cynical marketing exercise. This is also sometimes true of so-called "high definition samples" offered to unsuspecting computer music composers who then work in a higher definition mode like 32-bit floating point (nevertheless of benefit from the point of view of internal processing, mixing, and FX application) and 96kHz sample rate in their DAW, but are in fact still working with music of no better than CD quality 16-bit 44.1kHz !

Professionally prepared audio test files are best for exploring high definition audio !

But it's fun trying to create your own, and one can learn a lot by doing it, so let's focus for the rest of this article on creating your own high resolution audio recordings.


Recording high(er) definition audio from live sources

Obviously, if the sound/music source is not clean with low ambient noise, and your equipment is not clean with good specifications, there is little point. I found it instructive, however, to experiment even when some source or equipment noise is clearly present, to explore how the higher resolution recordings handle it.

Some remarks on audio interfaces

If you are recording, it's no use having 24-bit depth at 96kHz on 2 channels if you can't get music into the machine at that quality. Many Mac models have a combined audio port that supports (through different connectors in the same socket) both analog via a standard 3.5mm mini-jack and digital S/PDIF via a mini-TOSLINK optical adapter (which looks almost like a mini-jack plug). From Wikipedia: TOSLINK:

'Also known generically as an "optical audio cable" or just "optical cable", its most common use is in consumer audio equipment (via a "digital optical" socket), where it carries a digital audio stream from components such as CD and DVD players, DAT recorders, computers, and modern video game consoles, to an AV receiver that can decode two channels of uncompressed lossless PCM audio or compressed 5.1/7.1 surround sound such as Dolby Digital Plus or DTS-HD High Resolution Audio. Unlike HDMI, TOSLINK does not have the capacity to carry the lossless versions of Dolby TrueHD and DTS-HD Master Audio.'

And from Wikipedia: S/PDIF:

'S/PDIF can carry two channels of uncompressed PCM audio or compressed 5.1/7.1 surround sound (such as DTS audio codec) with a maximum bandwidth of 3.072 Mbit/s per channel for a total of 6144 kbit/s; it cannot support uncompressed lossless formats (such as Dolby TrueHD and DTS-HD Master Audio) which require greater bandwidth like that available with HDMI or DisplayPort.'

But what's the use of Mac audio ports for recording from analog signals if Apple don't publish the A/D specs of specific machines ?

External audio interfaces

Another way of getting high quality audio sources for recording in - with known specs - is through an external audio interface to USB2/3, or FireWire400/800 (aka IEEE 1394). My MacBook Pro early 2008 has 3 USB2 ports and:

One FireWire 400 port at up to 400 Mbps

One FireWire 800 port at up to 800 Mbps

There have been lots of arguments online about whether internal PCI sound cards are better than external interfaces, but with stable modern USB2/3 or FireWire and decent cables you will be fine (and for most Macs there is no choice, external it is). And there are lots of other nice things you can do with external audio interfaces. For example, some of them have very high quality mic preamps and stable phantom power. Some also have nice rerouting and inline FX capabilities.

I have an old M-Audio Firewire 410 (from about 2006), which also has excellent GUI software support, but the specs look a bit tired compared with modern interfaces:

M-Audio FireWire 410

• Dual low-noise mic/instrument preamps with gain controls, LED metering, phantom power and 66dB of available gain

• Two analog inputs (1/4" and XLR) and eight analog outputs on 1/4” TS jacks

• S/PDIF I/O on TOSLink optical or RCA coaxial connectors

• Supports sampling rates from 32KHz, up to 192KHz

• 2-in/8-out analog I/O at 24 bit resolution, up to 96KHz sampling rate

• 24-bit resolution playback at 192KHz sampling rate on outputs 1 and 2

• Two headphone outputs with assignable source and individual level controls

• Software-assigned rotary encoder for tactile control of monitor levels

• 1 x 1 MIDI I/O with hardware bypass switch for computer-independent operation

• Analog outputs support up to 7.1 surround using your audio software (your software must support surround outputs)

• Frequency response: 20Hz - 40kHz ± 1dB

• Signal-to-noise: -108dB

• Dynamic range: 108dB (A-weighted)

• THD + N: 0.00281% @ 0dBFS

And from the box:

operating level: -10dB (unbalanced)

"Only" 96kHz sampling frequency is not bad (and matches the system audio capability of my old MacBook Pro), but many modern audio interfaces offer 192kHz. However at -108dB the noise level is a bit of a worry, although not the end of the earth, as explained in the summary of levels, loudness and noise at: A summary of a review of music levels for broadcasting, personal use, recording and mastering, including the new LOUDNESS measures.

BTW, quoting SNR in negative (-)dB for such equipment is wrong according to this guide from RANE on audio specifications, and it is often confused with EIN (Equivalent Input Noise, or Input Referred Noise), which can be specified as, for example: EIN = -130 dBu, 22 kHz BW, max gain, Rs = 150 ohms.

However compare 108dB SNR with some more modern audio interfaces/cards like the M-Audio Delta Audiophile 192 with an input SNR of 113dB, or the ASUS Xonar Essence STX with Input SNR of 118dB, and my old M-Audio Firewire 410 interface certainly seems well out of date. The modern cards and interfaces typically also have more generous frequency ranges from 10Hz to 90kHz (presumably for recording very big church organs and coyote howls).

Borrowing the excellent diagram from ZedBee's super article Digital recording rule of thumb, you can see that as long as you record (if using the 24-bit EBU Digital standard) with the RMS around -18dBFS, even 108dB SNR is not too bad (certainly good enough for even "high definition" home recording projects):
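To put rough numbers on that claim (my own back-of-envelope arithmetic, using the nominal EBU -18dBFS alignment level and the 108dB figure quoted above, and ignoring dBA weighting subtleties):

# Headroom and noise margin when recording with RMS around -18dBFS on a converter
# with roughly 108dB of dynamic range (approximate figures)
awk 'BEGIN {
  rms_level   = -18               # nominal recording level, dBFS
  noise_floor = -108              # approximate converter noise floor, dBFS
  printf "headroom above the RMS level : %d dB\n", 0 - rms_level
  printf "RMS level above the noise    : %d dB\n", rms_level - noise_floor
}'

That is around 90dB between the average recorded level and the converter noise, which is far more margin than the ambient noise of most home recording rooms will leave you anyway.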

There is an excellent guide to Choosing A PC Audio Interface by Martin Walker from Sound on Sound mag from Nov 2004. The basic points are still relevant, with one of the major rules broken by my older M-Audio Firewire 410 (namely it uses only -10dBV consumer level voltage, not +4dBu pro level):

"Consumer & Professional

Many musicians are still confused about which interface input sensitivity and output level to use when faced with choices of [-]10dBV (consumer) or +4dBu (professional). It's easy to get bogged down in discussing millivolts and so on, but there are a few simple rules of thumb that should make everything easier to understand.
Always stick to the '+4' option if you can, since this generally results in lower noise levels. If you can't get high enough recording levels with '+4' input sensitivity on your interface, and there's no -10/+4 switch on the source gear, switch to '-10'. Similarly, stick with +4 output levels unless any connected gear can't cope with these higher levels, in which case revert to '-10'."

Aside: note carefully that these are on different dB scales, +4dBu and -10dBV (although product specs often state just +4dB or -10dB); a quick check of the conversion follows the list below. From Understanding dB:

  • +4dBu equals 1.23 Volts RMS. Actually 1.2276 V
  • The reference level of -10dBV (0.316 V) is the equivalent to a level of -7.8dBu.
  • +4dBu and -10dBV systems have a level difference of 11.8 dB and not 14 dB. This is almost a voltage ratio of 4:1
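That quick check (my own arithmetic, using the standard 0.775V reference for dBu and 1V reference for dBV, both also quoted further below):

# +4dBu and -10dBV expressed in volts, and the difference between them in dB
awk 'BEGIN {
  v_pro      = 0.775 * 10^( 4 / 20)    # +4dBu  ~= 1.228 V
  v_consumer = 1.000 * 10^(-10 / 20)   # -10dBV ~= 0.316 V
  printf "+4dBu  = %.3f V\n", v_pro
  printf "-10dBV = %.3f V\n", v_consumer
  printf "difference = %.1f dB (voltage ratio %.1f:1)\n",
         20 * log(v_pro / v_consumer) / log(10), v_pro / v_consumer
}'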

Martin Walker seems to agree that one doesn't need the world's best signal-to-noise and dynamic range to make decent high quality recordings (although the needs of a true audio pro are more demanding):

'The most hotly quoted specification for any audio interface tends to be its dynamic range or signal/noise ratio. There's still a lot of confusion about these two terms, and this is hardly surprising considering each may be measured in a variety of ways. However, the way audio interface manufacturers measure them seems to be reasonably consistent, and using these particular methods the two figures also tend to be very similar with many products, which makes products that quote one or the other easier to compare.

In audio interface terms, Signal/Noise ratio compares the maximum signal level that you can send to the interface (ie. that which makes the input meters just register 0dB) with the background noise level when no signal is present. However, some crafty soundcard manufacturers realised early on that they could achieve amazingly good s/n figures by automatically muting the output in the absence of an input signal, so that its background noise level was significantly lower. The audio interface dynamic range measurement therefore measures the background noise level in the permanent presence of a low-level signal (generally a 1kHz sine wave at -60dBFS), which is subsequently notched out using a filter. Dynamic range is therefore a slightly more reliable real-world test. You may spot some cheap soundcards with significantly worse results for their dynamic range than for their Signal/Noise (S/N) ratio.

..

Both figures are generally measured via an 'A'-weighting network, which rolls off the noise either side of its 3kHz centre frequency, in line with the sensitivity of the human ear. In essence, a 'dBA' rating reflects more closely how annoying we will find the background noise, with low-level hums below 200Hz and whistles above 10kHz being less obvious than hiss between about 1kHz and 6kHz. A dBA rating is generally a few dBs better than a 'flat' measurement.

Despite the fact that most audio recordings still end up on a Red Book Audio CD at 16-bit/44.1kHz, most of us have abandoned 16-bit recording and playback in favour of the wider dynamic range possible with 24 bits. A typical soundcard will provide a maximum dynamic range of 96dBA at 16-bit, but well over 100dBA when using 24-bit, which allows us to worry less about taking our recordings to within a few dB of clipping, because the background noise levels are so much lower.

However, when comparing the dynamic ranges of different audio interfaces, don't lose sight of the signals you'll be recording. If, like me, you still record the outputs of various hardware synths, the chances are that they won't have a dynamic range of more than about 80dB. If you're capturing a live performance via a mic, the background noise level of that mic and its associated preamp may already be higher than that of the audio interface, especially since it's difficult to make recording areas really quiet without extensive soundproofing. After all, as Hugh Robjohns said in SOS September 2004: "In most public venues I find the ambient noise floor is typically about 50-55dB below the peak level of a modest orchestra, organ, or choral group".

So, while buying an interface with the lowest possible background noise is sensible, in the real world many musicians won't be able to hear any difference at normal listening levels between interfaces with a dynamic range of 110dBA and 120dBA. Moreover, I've recently spotted various musicians grumbling about the background noise levels of specific soundcards, when they were actually hearing digital nasties due to the effects of a ground loop. As soon as they modified their wiring or introduced a DI box to deal with the problem, most were amazed at how quiet a background noise level of 100dBA was!'

Yep, that last one happened to me once, too. Buzz buzz, and it was just a bad (dedicated) mic preamp with a ground loop. And Martin Walker takes away some more worry:

'It's also worth pointing out that switching to 32-bit recording and playback in your audio application won't result in an even larger dynamic range — the benefit of the 32-bit float format is massive internal headroom and no possibility of internal clipping when mixing together loads of tracks, but the interface will still have 24-bit converters on the input and output. Unless the world suddenly becomes a much quieter place, 24 bits will remain quite sufficient to digitise it.'

I found the following relevant, because I have an old 1993 Roland RD500 digital piano/organ with synth sounds (although I could not find out any specs regarding noise, or about how it produces its sounds or equivalent sample bit depths if it uses waveform reconstruction):

'Sample Rate Wars

While even budget audio interfaces are now beginning to feature 192kHz sample rates, there are still arguments raging on most audio forums about whether or not it's worth moving from a sample rate of 44.1kHz to 48, 88.2 or 96kHz. Many musicians stick to 24-bit/44.1kHz because they still create their music largely with hardware MIDI synths and soft samplers that themselves use 44.1kHz samples, so they see little point in moving higher, especially as they intend the final mix to end up on a 16-bit/44.1kHz audio CD. However, even those using electronic sources will probably find subsequent compression and peak limiting more accurate at higher sample rates, while EQ tends to sound far more analogue in nature and metering is more accurate. Those using soft synths that calculate or otherwise model their waveforms may also find they sound cleaner.

For live classical and other acoustic recordings I suspect most serious engineers now prefer 24-bit/96kHz, particularly if the final recordings are for DVD release at 48 or 96kHz ..

.. mainstream PC magazines may mark a particular review soundcard down if it doesn't offer a 192kHz sample rate, I personally consider this option a huge red herring in the case of most audio interfaces under £500. If you can hear the improvement, use 192kHz, but bear in mind that the rest of the signal chain needs to be of extremely high quality to really exhibit any benefit over 96kHz.

Remember, also, when choosing a sample rate for your projects, that at 192kHz every plug-in and soft synth you run will consume over four times as much CPU overhead, occupy more than four times the amount of hard disk space, and cut your potential simultaneous track count by more than a factor of four over 44.1kHz.'


Some more references

- Apple Tech Specs (with lookup against serial number).

- Apple: Mac Basics: Ports and connectors

- OS X Lion: Audio ports

- Audio Specifications: a super guide by RANE pro audio on interpreting the various audio specifications for devices, and how they should be measured and stated.

- Wikipedia: dog whistle

- 24/192 Music Downloads are Very Silly Indeed

- 24bit vs 16bit: the myth exploded

- The Emperor's New Sampling Rate, 2008 by Paul D. Lehrman.

A summary of a review of music levels for broadcasting, personal use, recording and mastering, including the new LOUDNESS measures

This page started because I began reading recently about the new(ish) loudness measures and standards, especially those of the European Broadcasting Union (EBU) (I did not examine the slightly lower US SMPTE recommendation in depth). From EBU: Loudness:

'In August 2010, the EBU published its Loudness Recommendation EBU R128. It tells how broadcasters can measure and normalise audio using Loudness meters instead of Peak Meters (PPMs) only, as has been common practice.

..

-23 LUFS

Basically EBU R128 recommends to normalize audio at -23 LUFS +/- 1 LU, measured with a relative gate at -10 LU. The metering approach can be used with virtually all material. To make sure meters from different manufacturers provide the same reading, EBU Tech 3341 specifies the 'EBU Mode', which includes a Momentary (400 ms), Short term (3s) and Integrated (from start to stop) meter. Already more than 60 vendors have reported to support 'EBU Mode' in their products.'

Now I am not a broadcaster, but my review of this matter of loudness sent me on a very interesting trip right back to the fundamentals of analog and digital audio engineering and levels, and I attempt to share that journey here, including some examples of dBFS and LUFS statistics processing with some free tools for Mac OS X.

I include some tips and research links on how these loudness measures relate to metering, recording and mastering levels, and how to react to the broadcasting loudness measures and recommendations pragmatically, ideally in advance:

- The "best" level(s) for digital recording are different from the best levels for digital delivery, and depend on whether you will use your recorded resources to be mixed with other music, or as an end mix (or for simple capture), and they also depend critically on what devices your end mixes and masters will be played on (served via), and to some extent also on the chosen audio format.

- Levels and loudness considerations for mastering are very different from levels for recording live music, and levels appropriate for preparing music collections for playing on personal music devices (as opposed to broadcasting) may be different again.

- There is a consensus that the -23 LUFS European Broadcasting Union (EBU) standard is fine for some media (TV, radio etc.) but not at all appropriate for personal music playing devices such as iPods, mobile/smart phones etc, where pushing it a good deal louder is handy.

- The recently refined EBU (and SMPTE) measures of loudness are beginning to penetrate the world of Digital Audio Workstation (DAW) software, with new loudness meters already included by many audio software vendors.


Some background on audio levels

In order to understand my summary one needs to at least be familiar with the following:

- From Wikipedia: Decibel:

'The decibel (dB) is a logarithmic unit used to express the ratio between two values of a physical quantity (usually measured in units of power or intensity). One of these quantities is often a reference value, and in this case the dB can be used to express the absolute level of the physical quantity.

The number of decibels is ten times the logarithm to base 10 of the ratio of the two power quantities.

A change in power by a factor of 10 is a 10 dB change in level. A change in power by a factor of two is approximately a 3 dB change. A change in voltage by a factor of 10 is equivalent to a change in power by a factor of 100 and is thus a 20 dB change. A change in voltage ratio by a factor of two is approximately a 6 dB change.

..
The decibel unit can also be combined with a suffix to create an absolute unit of electric power. For example, it can be combined with "m" for "milliwatt" to produce the "dBm". Zero dBm is the level corresponding to one milliwatt, and 1 dBm is one decibel greater (about 1.259 mW).

In professional audio, a popular unit is the dBu (see below for all the units). The "u" stands for "unloaded", and was probably chosen to be similar to lowercase "v", as dBv was the older name for the same thing. It was changed to avoid confusion with dBV. This unit (dBu) is an RMS measurement of voltage which uses as its reference approximately 0.775 V RMS. Chosen for historical reasons, the reference value is the voltage level which delivers 1 mW of power in a 600 ohm resistor, which used to be the standard reference impedance in telephone audio circuits.'

..
In professional audio, equipment may be calibrated to indicate a "0" on the VU meters some finite time after a signal has been applied at an amplitude of +4 dBu. Consumer equipment will more often use a much lower "nominal" signal level of -10 dBV. Therefore, many devices offer dual voltage operation (with different gain or "trim" settings) for interoperability reasons. A switch or adjustment that covers at least the range between +4 dBu and -10 dBV is common in professional equipment.

..

dBFS (digital)

dB(full scale) – the amplitude of a signal compared with the maximum which a device can handle before clipping occurs. Full-scale may be defined as the power level of a full-scale sinusoid or alternatively a full-scale square wave. A signal measured with reference to a full-scale sine-wave will appear 3dB weaker when referenced to a full-scale square wave, thus: 0 dBFS(ref=fullscale sine wave) = -3 dBFS(ref=fullscale square wave).

dBTP

dB(true peak) - peak amplitude of a signal compared with the maximum which a device can handle before clipping occurs. In digital systems, 0 dBTP would equal the highest level (number) the processor is capable of representing. Measured values are always negative or zero, since they are less than or equal to full-scale. '

Now before proceeding any further, let's look at one very important point about decibels as applied to dBFS. The formula for calculating dBFS is equivalent to the formula for calculating dB relative to a voltage (not a power), so the formula is:

L(dB) = 10 * log10( V^2 / V0^2 ) = 20 * log10( V / V0 )

where V0 is the reference. That is, the digital amplitude is handled as a "field" value just like electrical voltage, and not like a sound pressure or power ! Here are some typical values rounded for some amplitude ratios:

1.0000 =   0.000 dB
0.5000 =  -6.021 dB
0.2500 = -12.041 dB
0.1250 = -18.062 dB
0.1000 = -20.000 dB
0.0625 = -24.082 dB
0.0100 = -40.000 dB
0.0010 = -60.000 dB

This gives us a golden rule of thumb for digital:

Increasing the number of bits by 1 doubles the number of available quantisations,
and thus corresponds to about 6dB increase in the dynamic range.
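If you want to play with these numbers yourself, here is a tiny sketch (my own, simply applying the voltage-style formula above) that reproduces the ratio table and the roughly-6dB-per-bit rule:

# Amplitude ratio -> dB, and bit depth -> theoretical dynamic range (~6.02 dB per bit)
awk 'BEGIN {
  n = split("1.0 0.5 0.25 0.125 0.1 0.0625 0.01 0.001", ratio, " ")
  for (i = 1; i <= n; i++)
    printf "%.4f = %8.3f dB\n", ratio[i], 20 * log(ratio[i]) / log(10)
  for (bits = 8; bits <= 24; bits += 8)
    printf "%2d bits -> about %.1f dB of range\n", bits, 20 * bits * log(2) / log(10)
}'

The 16-bit figure (about 96dB) and 24-bit figure (about 144dB) are exactly the theoretical ranges quoted in the metering discussion further below.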

- From Wikipedia: dBFS: decibels relative to full scale for digital:

'0 dBFS is assigned to the maximum possible digital level. For example, a signal that reaches 50% of the maximum level at any point would reach -6 dBFS at that point, 6 dB below full scale. Conventions differ for RMS measurements, but all peak measurements will be negative numbers, unless they reach the maximum digital value.'

- From Wikipedia: RMS levels:

'Since a peak measurement is not useful for qualifying the noise performance of a system, or measuring the loudness of an audio recording, for instance, RMS measurements are often used instead.

There is a potential for ambiguity when assigning a level on the dBFS scale to a waveform rather than to a specific amplitude, since some choose the reference level so that RMS and peak measurements of a sine wave produce the same number, while others want the RMS and peak values of a square wave to be equal, as they are in typical analog measurements.'

- From Wikipedia: Dynamic range:

'The measured dynamic range of a digital system is the ratio of the full scale signal level to the RMS noise floor. The theoretical minimum noise floor is caused by quantization noise. This is usually modeled as a uniform random fluctuation between −1/2 LSB and +1/2 LSB. (Only certain signals produce uniform random fluctuations, so this model is typically, but not always, accurate.)'

Some other useful audio and sound engineering guides and resources concerning levels and metering include:

- Understanding & Measuring Digital Audio Levels by Glen Kropuenske, 2006 (PDF): this is an excellent introduction to levels, with some nice comparison graphics and discussion of digital vs analog:

'dB or decibels

Audio signal or sound levels are measured using a decibel (dB) system. The dB system is used to compare two levels or a change in signal voltage or power. One dB is the level change that is just noticeable by most people. A 6 dB change is considered to be about twice the volume.

Sound signal level in dB can be considered either as a power or as a voltage. The level in decibels is 10 times the logarithm of the ratio of two power levels. Where P is the measured power in watts and P Ref. is a reference power in watts.

Sound signal level in dB can be considered as a voltage ratio. The level in decibels is 20 times the logarithm of the ratio of two voltage levels. Where V is the measured voltage and V Ref. is a reference voltage.

The resistance is assumed to be the same so calculations using either the power or voltage formula agree.'

'Units of Sound Level Measurement

Sound signal level is expressed using various dB units of measurement including:

- dBm: decibels or dB referenced to 1 milliwatt (.001 watt)

- dBu or dBv: decibels or dB referenced to 0.775 volt (dBu is more commonly used)

- dBV: decibels or dB referenced to 1 volt'

'VU Meters

The VU (volume unit) meter is another voltage measurement method for analog audio level measurement. The VU meter is a voltmeter with a response time designed to reflect the loudness of live audio as the ear would interpret the loudness. Relating VU measurement units to the other dB units of measurement for audio can only be done with a sine wave test tone. In a professional audio balanced system, 0 VU corresponds to +4 dBu. You may also see 0VU as +4 dBm although this assumes 600 ohm balanced impedance. This is the only impedance in which 4 dBm equals 4 dBu'

'Analog vs. Digital Levels — the dBFS Scale

Digital audio levels are measured differently than analog audio levels. Yes, yet another and different dB system is used. The dB system in digital audio starts at the top and defines the loudest sound level that is to be digitized. This top or full scale view of the audio levels results in a full scale or "FS" system of dB measurement.'

[ED: Warning: the following numbers do not all agree well with some diagrams or statements made by others quoted below.
Also, it does not state whether 16-bit or 24-bit is assumed.]

'A 0 dBFS measurement unit is to be the highest audio level. Assuming this is to be at the highest audio level before clipping occurs, this corresponds to an analog level of 24 dBu. Therefore, 4 dBu (dBu =dBv) is the same as - 20 dBFS or 0 VU.

While this is generally accepted as the range of digital audio, it is not a hard standard. When digital audio values are converted back to analog, some digital audio equipment provides level selections to shift the analog output levels of 0 VU to -18 dBFS or -14 dBFS. Lowering the dBFS relationship increases the audio sound levels output from the D/A converter.''

Some explanations with reference to standards are provided by Hugh Robjohns, technical editor of Sound on Sound, in Q. What are the reference levels in digital audio systems?, from which I borrow diagrams for EBU R68 (top) and SMPTE RP155 (below):


So what does this all mean for recording and mastering ?

Let's start getting into some concrete tips for recording (and compare them with mastering). I figure the Final Cut Pro people would know what they are talking about, and their recommendations agree well with the other tips and diagrams I provide below. From Final Cut Pro 7: User Manual: About Audio Meters:

'There are several common digital levels used to correspond to 0 dB on an analog [VU] meter:

-12 dBFS: This level is often used for 16-bit audio such as DV audio, and for projects with compressed dynamic ranges, such as those for television or radio.

-18 or -20 dBFS: This level is more common on projects with higher dynamic range, such as professional post-production workflows using 20- or 24-bit audio.'

- And similarly from Final Cut Pro: Understanding Audio Meters :

'As a general guideline, if you are working with 16-bit audio, you should set your audio level around -12 dBFS. If you are working with 20- or 24-bit audio, you should set your audio level around -18 or -20 dBFS.'

- From Audio Metering Introduction: Audio Geek Zine:

'VU

Mic preamps, converters, hardware effect processors are all designed to work optimally at 0 VU. They can usually handle more than that before distorting, but 0 VU is where the signal to noise is best. VU stands for Volume Unit and is the oldest analog metering system. VU meters are relatively slow moving with a 300ms response time. This slow response of a VU meter better represents an averaged volume level close to how our ears work. 0VU is equal to +4dBu or professional line level.

dBu

The dBu scale measures the analog voltage level in our equipment with 0dBu calibrated to about 0.775 Volts. The u in dBu stands for ‘unloaded’ which means that the voltage is measured with a zero resistance load. Again, 0VU or +4dBu is the ideal constant voltage of all your analog components in the recording and monitoring chain.

Here’s an example chain – microphone, mic preamp, compressor, audio interface line input, Analog to digital converter, recording software.

The microphone signal gets boosted up to line level by the preamp. Line level goes into and out of the compressor into the audio interface. The analog to digital converter assigns bits representing the voltage coming in and sends the data to your DAW.

Digital Meters

Once it’s in your DAW the level you see will not be 0 on your track meters, it will actually be closer to -18dBfs depending on the calibration. This may seem like a really low level but this is actually the optimal level for all the analog components that come before it.

Once you build up your song with several other tracks, you’ll be happy you have that extra headroom and lower noisefloor.

0VU = +4dBu = -18dBFS: This is the only thing you need to remember
[ED: it is assumed he is talking about EBU Digital 24-bit, the equivalent for 16-bit digital would be -12dBFS, and for SMPTE digital 24-bit it's -20dBFS.]

dBFS

The dBFS meters show Decibels relative to full scale. Instantaneous digital levels below the 0dBFS absolute peak. When 3 consecutive samples pass 0 the clip light will come on.

dB RMS

Now what’s left is RMS metering. Some DAWs have this in addition to Peak metering on the master. Similar to how VU meters work, RMS meters show an average level. The RMS value relates to how loud a sound is perceived.

These days all music is mastered to peak just below 0dBFS, the unwritten standard is -0.3,
but the song with the higher RMS level will appear to be louder.

There isn’t a widespread calibration standard for RMS metering so you’ll have to compare values from a few references to what you’re working on.'

[ED: this mastering recommendation of -0.3dBFS is relatively high compared with some other recommendations.]

There is a fantastic resource at Audio Studio Recording: Mastering and Gain Structure that enables you to compare live dBFS mappings for different calibrations, along with excellent explanations of all aspects of every metering scale:

'dBFS meter description

dBFS meters are either hardware- or software-based digital meters that can run anywhere from - 40dB to - inf (- infinity) on the low end, but invariably end at 0dB on the high end. Color schemes for these meters vary (especially on the software versions) but typically turn red at or near the - 3dB to 0dB range at the top of the scale.

Many dBFS meters include a single LED or other type or illuminated indicator usually labeled either "OVER" or "CLIP".

dBFS meters display digital levels

dBFS meters visually indicate signal levels as defined by the values of the digital samples of an analog signal that has been converted to digital data. The top of the meter (0dBFS) indicates a digital value where all the bits of a digital sample have a value of 1. A digital value of all 1s is, by definition, the highest possible value that can be represented in a binary digital form. There is nothing louder than a digital value of all 1s. Therefore 0dBFS (the top of the meter) represents the maximum possible volume on any digital signal.

Note that this 0dBFS maximum is true regardless of the digital word length (a.k.a. bit depth) used. Whether we are recording at an 8-bit, 16-bit, or 24-bit word length doesn't matter here; as long as every bit in a sample has a value of 1, it will translate to 0dBFS on the meter.

dBFS meter calibration

dBFS meters do not directly represent analog voltages or signal levels, they provide a graphic representation of binary digital values only. As such, any correlation between analog levels and digital values is determined by the calibration of the analog-to-digital converter (ADC) circuitry in the recording signal chain.

Unfortunately there is no definitive standard of conversion in ADCs for converting from dBu to dBFS; it varies from brand to brand, model to model, even country to country. A pro-grade line level of +4dBu can typically equate to anywhere from -12dBFS to -20dBFS on the digital scale, depending on the individual ADC's calibration. Some ADCs even have switches on them offering multiple calibration settings.

There are many quality ADCs that convert +4dBu to -18dBFS as a default. For this reason, this is what many engineers quote as the conversion factor, and it is also the default display setting for our meter on the left. But the number of ADCs that do or can equate +4dBu to a different digital level than that are at least as numerous as those that equate it to - 18dBFS, so we need to check the specs on our ADCs to ensure we are using the right calibration standard for our recording.

[ED: one can use their cool Analog to Digital Conversion Calculator on the left in their page to see live how the dBFS meter levels can change based upon the calibration of a converter.]

More bits means more range

Because 0dBFS is the absolute top of the digital scale regardless of the number of bits used in our digital samples, the number of bits used does instead matter towards the bottom end of the dBFS scale. The more bits we have, the lower of a volume we can digitally represent, and the greater of a dynamic range we have to work with in the digital domain.

This range can be calculated by multiplying the number of bits by 6dB. Therefore 8 bits gives us a maximum range from 0dBFS to - 48dBFS. 16 bits will go from 0dBFS down to - 96dBFS, and 24 bits from 0dBFS down to - 144dBFS.

The "CLIP" or "OVER" indicator

Many dBFS meters include a separate indicator labeled "CLIP" or "OVER". This lights up when the meter "believes" that the incoming analog signal may have been higher than the digital 0dBFS. Because in the digital realm there can be nothing higher than 0dBFS, anything analog coming in higher than that is simply "clipped off" at 0dBFS during the conversion to digital. The "CLIP" or "OVER" indicators warn us when that clipping may be happening.

Because there is nothing above 0dBFS, the only way a meter can determine if clipping is occurring is by looking for consecutive samples of 0dBFS, the assumption being that a flattened waveform with a flat top of more than one sample in a row at maximum value most probably means that the top of a normal waveform has been clipped off.

Unfortunately here again there is no standard. Some clip lights are programmed to light up as soon as a single sample hits 0dBFS. Others wait for three consecutive 0dBFS samples to confirm that a real clip has taken place before lighting up. There are even others that will wait for as long as 8 consecutive samples before lighting up on the theory that shorter clips than that cannot be heard.'

'+4dBu "Pro" Line Level

In order for different pieces of analog audio gear to be able to properly send signals to each other without those signals being too weak or too strong for any given piece of gear, all such gear is designed to operate at a standard "line level".

"Line level" refers to the average signal voltage at which the standard line inputs and outputs of most of our audio gear is designed to operate. For this reason, the average-reading VU meters on most audio processing gear are calibrated so that a reading of 0VU indicates a line level voltage.

"Pro" line level

Most professional-grade and prosumer audio recording gear is designed to operate at a standard line level of +4dBu (~1.23 volts). However, some gear have switches or circuitry on them that let the user select between a "pro" line level of +4dBu and the "consumer" line level of -10dBV (approx. -7.8dBu or ~0.32 volts.)

Un-level playing fields
Because of the huge difference between "pro" and "consumer" line levels - "pro" line level is almost 4 times the voltage as "consumer" line level - It's important to know at which level your gear operates.

If you run a -10dBV "consumer" signal into a +4dBu "pro" input, the signal will be running almost 12dB lower than expected; having to boost the input that extra 12dB will also increase the noise level of the signal by almost 12dB.

Conversely, running a +4dBu signal into a -10dBV input will be inputting a signal almost 12dB hotter than expected, potentially cutting the amount of peak headroom in the device and opening up the possibility of extra signal distortion.'
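Coming back to their dBFS meter calibration point (and the converter calculator mentioned in the note above), the same conversion is easy to sketch yourself. This is my own minimal helper, assuming the common +4dBu = -18dBFS calibration; adjust the offset for your own converter:

#!/bin/bash
# Hypothetical helper: convert an analog level in dBu (first argument, default +4) to the
# expected dBFS reading, assuming a converter calibrated so that +4dBu = -18dBFS (a 22dB offset)
dbu="${1:-4}"
awk -v dbu="$dbu" 'BEGIN {
  offset = 22
  printf "%+.1f dBu -> %+.1f dBFS\n", dbu, dbu - offset
}'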

I am going to offer you also, for your reference during this discussion, this excellent (respectfully borrowed) summary image of analogue and digital levels; don't be overwhelmed, for this summary I will focus on digital. It is from a highly recommended blog article Digital Recording Levels - a rule of thumb from 2009 by ZedBee. [ED: I can't hotlink to the image, and borrowing it here (uploaded) is a hopefully forgiven breach of copyright for educational purposes; do please read the original article too.]


Getting a hold on the new Loudness measures

Ok, so let's examine some of the new human-perception-based loudness measures (as opposed to sound pressure dB measures, voltage dB measures, or digital dB measures). I highly recommend you watch firstly this absolutely brilliantly clear screencast video tutorial with live examples by Ian Shepherd from Production Advice UK. He knows exactly what he is talking about, and shows you in RMS/peak on old VU meters and on the new LUFS loudness meters with a real music project (then do please come back here):

- YouTube: LUFS - the new Loudness Units. What do they mean ?

- LUFS, dBFS, RMS… WTF ?!? How to read the new loudness meters (this includes a Pink Noise download so you can compare with his results).

Alright, so you get that there are different kinds of Root Mean Square (RMS) measures for music/sound, and that one has to be careful comparing them, but basically LUFS is like RMS adjusted for human perception. It is also comparable with relative dB units (but always on the LUFS scale), so a gain adjustment of X dB gives roughly an X LU change in the loudness reading. This means that even if you don't yet have an official LUFS meter, you can get a rough feel using just RMS measures.

Let's explore some RMS and Peak dBFS stats first

My software tips here are all Mac specific (currently Mountain Lion 10.8.5), but some of these tools are also available for other UNIX/Linux machines, and some even run on Windows (how about that).

I recommend that with some of your audio files and with the Ian Shepherd pink noise test WAV file you explore some stats. Visit also: Audio engineering test/sample file resources, and online generators and online audio tests.

Audacity is a free, open-source, cross-platform audio editor for Mac, GNU/Linux, Windows, etc. It's not the world's best audio editor (especially not for MP3 or AAC, because it imports, processes, then re-exports with a tiny quality loss rather than, say, editing the MP3 directly), but it has lots of plugins and FX and is sufficient for experiments. Internally it works in 32-bit floating point LPCM at up to 96kHz.

There is an unofficial Wave Stats plugin for Audacity that performs excellent wave analysis over regions of about 30s length, which is enough for you to explore the difference between dBFS RMS and max peaks, and to get a feel for some different audio files. Typical output from the plugin:

Another useful audio analysis tool is ffmpeg used in command line mode. It is available on Mac using MacPorts. I got it working OK on Mac OS X Mountain Lion, but you should at least be a bit UNIX savvy to try this. You will also need the LAME MP3 Encoder port if you want to deal with MP3.

$ sudo port install ffmpeg

$ sudo port install lame

(As always with MacPorts, don't be scared to use that -f (force) option if you upgraded your OS recently !)

Store the following in a file at ~/bin/@ffmpeg-statistics:

#!/bin/bash
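# Print peak (max_volume) and mean (mean_volume) volume statistics for the audio file
# given as the first argument, using ffmpeg's volumedetect filter (no output file is written)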
ffmpeg -i "$1" -filter:a "volumedetect" -vn -f null /dev/null

Make sure you make it executable with:

$ chmod +x ~/bin/@ffmpeg-statistics

And when running it on a file, do "quote" your audio file name if it contains any spaces:

$ @ffmpeg-statistics "my chill music audio file.mp3"

The statistics output, for a run on a 128kbps MP3 chill music file, is like:

Duration: 00:05:32.43, start: 0.000000, bitrate: 128 kb/s
Stream #0:0: Audio: mp3, 44100 Hz, stereo, s16p, 128 kb/s
Output #0, null, to '/dev/null':
Metadata:
..
encoder : Lavf55.12.100
Stream #0:0: Audio: pcm_s16le, 44100 Hz, stereo, s16, 1411 kb/s
Stream mapping:
Stream #0:0 -> #0:0 (mp3 -> pcm_s16le)
..
size=N/A time=00:05:32.43 bitrate=N/A
video:0kB audio:57263kB subtitle:0 global headers:0kB muxing overhead -100.000038%
[Parsed_volumedetect_0 @ 0x7f8fa8412d00] n_samples: 29318494
[Parsed_volumedetect_0 @ 0x7f8fa8412d00] mean_volume: -17.1 dB
[Parsed_volumedetect_0 @ 0x7f8fa8412d00] max_volume: -1.4 dB
[Parsed_volumedetect_0 @ 0x7f8fa8412d00] histogram_1db: 12
[Parsed_volumedetect_0 @ 0x7f8fa8412d00] histogram_2db: 583
[Parsed_volumedetect_0 @ 0x7f8fa8412d00] histogram_3db: 9111
[Parsed_volumedetect_0 @ 0x7f8fa8412d00] histogram_4db: 42170

There you have it: an indication of whether anything clipped (it didn't, the max is negative, i.e. below 0dBFS) and what your mean volume (RMS) is, without even loading the audio file in an editor.

You might also understand why I wanted you to see the FFmpeg basic RMS and Peak stats processing before we examine its EBU R128 filter (later): the RMS and Peak stats tell us important things that we need to know anyway - like whether we clipped at all.

Here is the result on the Pink Noise WAV file from the Ian Shepherd LUFS tutorial:

Guessed Channel Layout for Input Stream #0.0 : mono
Input #0, wav, from 'Pink_Noise-production-advice-lufs-dbfs-test.wav':
Metadata:
artist : Fred Nachbaur
Duration: 00:00:10.00, bitrate: 705 kb/s
Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz, mono, s16, 705 kb/s
Output #0, null, to '/dev/null':
Metadata:
artist : Fred Nachbaur
encoder : Lavf55.12.100
Stream #0:0: Audio: pcm_s16le, 44100 Hz, mono, s16, 705 kb/s
Stream mapping:
Stream #0:0 -> #0:0 (pcm_s16le -> pcm_s16le)
Press [q] to stop, [?] for help
size=N/A time=00:00:10.00 bitrate=N/A
video:0kB audio:861kB subtitle:0 global headers:0kB muxing overhead -100.002494%
[Parsed_volumedetect_0 @ 0x7ffe53000000] n_samples: 441000
[Parsed_volumedetect_0 @ 0x7ffe53000000] mean_volume: -14.7 dB
[Parsed_volumedetect_0 @ 0x7ffe53000000] max_volume: -2.1 dB
[Parsed_volumedetect_0 @ 0x7ffe53000000] histogram_2db: 13
[Parsed_volumedetect_0 @ 0x7ffe53000000] histogram_3db: 111
[Parsed_volumedetect_0 @ 0x7ffe53000000] histogram_4db: 566

The max_volume and mean_volume values agree well with the Peak and RMS values seen in the tutorial video meters.

Let's compare the FFmpeg stats with the Wave Stats plugin for Audacity applied on the same Pink Noise test:

The peak and RMS stats agree exactly with FFmpeg !

The FFmpeg command line is very handy and fast: it's nice not to always have to load files in an editor, and it is very useful for batch runs over many files with some simple UNIX Bash shell scripting, as sketched below.
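
For example, a minimal batch sketch along the following lines should work (the folder path here is just a placeholder; note that FFmpeg writes the volumedetect report to stderr, hence the redirection):

#!/bin/bash
# Hypothetical batch sketch: print the volumedetect stats for every MP3 in a folder.
for f in ~/Music/chill/*.mp3; do
    echo "=== $f ==="
    ffmpeg -i "$f" -filter:a "volumedetect" -vn -f null /dev/null 2>&1 | grep -E "mean_volume|max_volume"
done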

But I found you have to be a bit careful with it. FFmpeg tries to detect the input format and the bit rate or bit depth, but unless you explicitly give an output format, it will assume 16-bit "pcm_s16le" as the output format (which in the command form above gets thrown away anyway). That makes no difference to the stats calculated on the input file, but if, for example, we are examining a 24-bit pink noise file, then it might be better used like this:

ffmpeg -i PinkNoise-10mins-24bit-48kHz.aiff -acodec pcm_s24le -filter:a "volumedetect" -vn -f null /dev/null

This gives more sensible input and output format identification and mapping:

Guessed Channel Layout for Input Stream #0.0 : stereo
Input #0, aiff, from 'PinkNoise-10mins-24bit-48kHz.aiff':
Duration: 00:10:00.00, start: 0.000000, bitrate: 2304 kb/s
Stream #0:0: Audio: pcm_s24be, 48000 Hz, stereo, s32, 2304 kb/s
Output #0, null, to '/dev/null':
Metadata:
encoder : Lavf55.12.100
Stream #0:0: Audio: pcm_s24le, 48000 Hz, stereo, s32, 2304 kb/s
Stream mapping:
Stream #0:0 -> #0:0 (pcm_s24be -> pcm_s24le)
size=N/A time=00:10:00.00 bitrate=N/A
video:0kB audio:168750kB subtitle:0 global headers:0kB muxing overhead -100.000013%
[Parsed_volumedetect_0 @ 0x7f9fa2c15220] n_samples: 57600000
[Parsed_volumedetect_0 @ 0x7f9fa2c15220] mean_volume: -24.2 dB
[Parsed_volumedetect_0 @ 0x7f9fa2c15220] max_volume: -12.0 dB
[Parsed_volumedetect_0 @ 0x7f9fa2c15220] histogram_11db: 525
[Parsed_volumedetect_0 @ 0x7f9fa2c15220] histogram_12db: 8270
[Parsed_volumedetect_0 @ 0x7f9fa2c15220] histogram_13db: 38442
[Parsed_volumedetect_0 @ 0x7f9fa2c15220] histogram_14db: 131459

It has correctly detected the input file as 24-bit 'pcm_s24be', and it now has a pseudo (discard) output also at 24-bit.

However, I found that FFmpeg failed to detect the bits within the format of a 24-bit FLAC file:

Input #0, flac, from 'PinkNoise-10mins-24bit-48kHz.flac':
Duration: 00:10:00.00, bitrate: 2305 kb/s
Stream #0:0: Audio: flac, 48000 Hz, stereo, s32
Output #0, null, to '/dev/null':
Metadata:
encoder : Lavf55.12.100
Stream #0:0: Audio: pcm_s24le, 48000 Hz, stereo, s32, 2304 kb/s

And I also don't quite understand why in the above examples it mentions 's32' in the stream format (presumably FFmpeg carries 24-bit PCM in 32-bit sample containers internally, since it has no native 24-bit sample format).

But for the sake of this discussion of levels and loudness, the RMS and peak stats make sense and are consistent, so let's move on; this article is not supposed to be a tutorial on FFmpeg. For more on FFmpeg for audio, including a description of the formats, visit also: FFmpeg: command line and GUI audio/video conversion tool: audio references

We simply note for now that we have to be careful when comparing wave statistics between 16-bit and 24-bit sample depths.

Ok, so far we have looked at old RMS and Peak, but what about LUFS loudness stats ?

Unfortunately, as far as I can tell Audacity does not yet support LUFS, but there is apparently already a plan to change/enhance the VU Meter to conform to the EBU Standard R128.

But FFmpeg does now offer an EBU R128 audio filter as of at least version 2.0.2. To see whether it is available for your version use:

$ ffmpeg -filters | grep -i r128

ebur128 A->N EBU R128 scanner.

You can then perform loudness measurement runs like this (for example, on our pink noise WAV sample):

ffmpeg -i Pink_Noise-production-advice-lufs-dbfs-test.wav -filter:a "ebur128" -vn -f null /dev/null

Output (with most scan lines removed) is:

Guessed Channel Layout for Input Stream #0.0 : mono
Input #0, wav, from 'Pink_Noise-production-advice-lufs-dbfs-test.wav':
Metadata:
artist : Fred Nachbaur
Duration: 00:00:10.00, bitrate: 705 kb/s
Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz, mono, s16, 705 kb/s
Output #0, null, to '/dev/null':
Metadata:
artist : Fred Nachbaur
encoder : Lavf55.12.100
Stream #0:0: Audio: pcm_s16le, 48000 Hz, mono, s16, 768 kb/s
Stream mapping:
Stream #0:0 -> #0:0 (pcm_s16le -> pcm_s16le)
Press [q] to stop, [?] for help
[Parsed_ebur128_0 @ 0x7f9452c2d880] t: 0.0999792 M:-120.7 S:-120.7 I: -70.0 LUFS LRA: 0.0 LU
[Parsed_ebur128_0 @ 0x7f9452c2d880] t: 0.199979 M:-120.7 S:-120.7 I: -70.0 LUFS LRA: 0.0 LU
..
[Parsed_ebur128_0 @ 0x7f9452c2d880] t: 10.0003 M: -14.4 S: -14.4 I: -14.4 LUFS LRA: 0.1 LU
size=N/A time=00:00:10.00 bitrate=N/A
video:0kB audio:938kB subtitle:0 global headers:0kB muxing overhead -100.002292%
[Parsed_ebur128_0 @ 0x7f9452c2d880] Summary:

Integrated loudness:
I: -14.4 LUFS
Threshold: -24.4 LUFS

Loudness range:
LRA: 0.1 LU
Threshold: -34.4 LUFS
LRA low: -14.5 LUFS
LRA high: -14.3 LUFS

At -14.4 LUFS integrated loudness, the pink noise is way above the EBU R128 broadcast standard of -23 LUFS.
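
If you only want the integrated loudness figure (say, to tabulate a whole collection), something like this should pull it out of the summary; again, FFmpeg writes the report to stderr:

$ ffmpeg -i Pink_Noise-production-advice-lufs-dbfs-test.wav -filter:a "ebur128" -vn -f null /dev/null 2>&1 | grep -A1 "Integrated loudness" | grep "I:"

I: -14.4 LUFS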

LUFS meters for Audacity

As far as I can tell there is no LUFS meter available specifically for Audacity; however, I found the following free LUFS Meter plugin from Klangfreund:

'EBU R128 compliant loudness measurement

The LUFS Meter plugin enables you to deliver loudness-calibrated content.

Multi-Platform, Multi-Format

Available as VST- and Audio Unit-plugin on Mac. On Windows, the LUFS Meter is available as a VST-Plugin. 32 and 64 bit. Support for Linux and other plugin formats is planned.

http://www.klangfreund.com/lufsmeter/download/'

Please note that for Audacity you should use the 32-bit version, as Audacity does not support 64-bit VST plugins !

I managed to get it to run (preview) on our pink noise test file within Audacity:

But it kept crashing Audacity whenever I clicked the Ok button !

Visit also:

- Mac OS X: EBU R128 compliant loudness meters and batch processing

- Mac OS X: audio engineering plugins

Playing with the loudness: normalization, amplification, attenuation

Audacity has a nice enough Normalize function under Effects, but it only works in terms of the maximum (peaks); it does not let you set RMS values, and certainly offers nothing fancy like the new LUFS loudness measures.

There is also a command-line tool 'normalize' that you can install using MacPorts:

$ sudo port install normalize

It seems to run only on WAV files; I could not get it to see MP3 files.
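
If you do want to use it on an MP3, one possible workaround (a sketch only, untested here) is to round-trip through WAV with FFmpeg and LAME, accepting the small quality loss of the extra MP3 re-encode (the same caveat as with Audacity's MP3 handling noted above):

$ ffmpeg -i "my chill music audio file.mp3" temp.wav
$ normalize temp.wav
$ lame temp.wav "my chill music audio file normalised.mp3"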

Let's investigate an MP3 file with chill music I have already normalised in Audacity to -2dBFS. The FFmpeg RMS stats run gives:

[Parsed_volumedetect_0 @ 0x7faab2000000] n_samples: 20731486
[Parsed_volumedetect_0 @ 0x7faab2000000] mean_volume: -14.4 dB
[Parsed_volumedetect_0 @ 0x7faab2000000] max_volume: -1.5 dB
[Parsed_volumedetect_0 @ 0x7faab2000000] histogram_1db: 23
[Parsed_volumedetect_0 @ 0x7faab2000000] histogram_2db: 741
[Parsed_volumedetect_0 @ 0x7faab2000000] histogram_3db: 9246
[Parsed_volumedetect_0 @ 0x7faab2000000] histogram_4db: 63090

Clearly the peak normalisation in Audacity to -2dBFS was not perfect, as the maximum is -1.5dB, but it is at least in the right ball park.

Compare with the LUFS runs:

Integrated loudness:
I: -12.2 LUFS
Threshold: -22.6 LUFS

Loudness range:
LRA: 3.0 LU
Threshold: -32.5 LUFS
LRA low: -14.1 LUFS
LRA high: -11.1 LUFS

The mean_volume was -14.4 dB, but the integrated loudness was -12.2 LUFS. This is also way above the EBU R128 broadcasting recommendation of -23 LUFS; yet it works just brilliantly on my iPhone used as an iPod !

Changing the loudness in LU units to a target value

As a rule of thumb, the LUFS loudness can be adjusted in LU by making the same dBFS amplitude change in dB.
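
For reference, the linear volume factor corresponding to a dB change is 10^(dB/20); you can check the figures quickly on the command line (ordinary maths, nothing FFmpeg-specific):

$ awk 'BEGIN { printf "%.3f\n", 10^(-6/20) }'     # -6 dB: roughly half the amplitude

0.501

$ awk 'BEGIN { printf "%.3f\n", 10^(-12/20) }'    # -12 dB: roughly a quarter

0.251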

FFmpeg has a simple audio volume filter. See How to change audio volume up-down with FFmpeg:

'To turn the audio volume up or down, you may use FFmpeg's Audio Filter named volume, like in the following example. If we want our volume to be half of the input volume:'

ffmpeg -i input.wav -af 'volume=0.5' output.wav

However, this factor is not in dB, but we recall that halving the amplitude is the same as reducing by 6dB (so a factor of 0.25 is -12dB). If we go back to my chill track with loudness -12.2 LUFS, applying volume=0.25 should reduce the loudness by 12 LU to about -24.2 LUFS. Performing this adjustment and rerunning the FFmpeg EBU R128 filter gives:

Integrated loudness:
I: -24.7 LUFS
Threshold: -35.0 LUFS

Loudness range:
LRA: 3.0 LU
Threshold: -45.0 LUFS
LRA low: -26.6 LUFS
LRA high: -23.6 LUFS

The rule of thumb has worked well enough: the integrated loudness is now -24.7 LUFS, where the prediction based on the -12dB reduction was -24.2 LUFS.
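
Concretely, the adjustment and re-measurement can be done with commands something like these (the file names are just placeholders; recent FFmpeg versions should also accept the gain directly in dB, e.g. 'volume=-12dB'):

$ ffmpeg -i "my chill music audio file.mp3" -af 'volume=0.25' chill-quieter.wav
$ ffmpeg -i chill-quieter.wav -filter:a "ebur128" -vn -f null /dev/null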

Audacity has the ability to easily adjust the volume in dB units: see Amplify and Normalize.

So there it is: we have examined Peak and RMS statistics and LUFS loudness statistics for files, and made reasonably accurate loudness adjustments using completely free Mac tools, including on the command line. I know there are now a range of much fancier LUFS tools available for Mac, but it's nice to know one can at least do it this way for nothing. Right, let's throw away that adjusted file, it's far too quiet for playing on my iPod !

Some recommended loudness levels for different applications

So, time for some basic recommendations.

As already illustrated, I am currently into "chill" music, and I want large collections of chill music that play at roughly the same "loudness" for a long time without having to adjust the volume (say if played through speakers while I am working at my computer, I don't want to have to frequently get up to adjust the volume).

And I want these collections to be usable on systems that do not use ReplayGain or Sound Check (Apple's proprietary system for iTunes and iPod). (Besides, if I get it basically right without them I can always also use the same collections with those technologies as well.)

I have chosen my rule-of-thumb standard for "iPod preparation" of chill stuff in MP3 as -2dBFS peak in cases where the 'mean_volume' RMS (according to FFmpeg) is around -17dBFS to -14dBFS, which is about -15LUFS to -12LUFS loudness according to FFmpeg on this kind of music.

This is clearly much higher than the comparable -23LUFS European broadcasting standard. But remember, this is for playing on home devices, iPod, iPhone, etc.

I am perfectly aware that max peak values do not give a reliable indication of RMS values or LUFS loudness values, but I know in advance that the music I am treating in this case is quite compressed chill. I don't have the facility (yet) for automatically applying an LUFS loudness requirement in batch mode, whereas I can apply max peak normalisation easily, and in any case:

Just applying a high LUFS loudness measure (needed for say iPod) blindly does not ensure there is no clipping !

I find the resulting loudness range when normalising to -2dBFS max peak (for this kind of music) works well on quality headphones on my MacBook Pro, and on my iPhone headphones walking along the street, and just as well when played from an iPod via a mixer through my Opera DB Live powered speakers (yes, I use a musician's PA at home instead of high quality "audiophile" speakers).

My rule-of-thumb, using simple peak normalisation, can be applied safely and reliably for lots of different kinds of chill and other music after performing a stats run (and I always do the FFmpeg stats run). Note that with -2dBFS I still leave a little bit of room to play with at the top end. One could also script the whole thing, as sketched below.
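
Just as an illustration, a single-file version of "stats run, then peak-normalise to -2dBFS" might look like this (a rough sketch only; it assumes your FFmpeg's volume filter accepts a dB suffix, and the output naming is simplistic):

#!/bin/bash
# Hypothetical sketch: measure max_volume, then apply the gain needed to reach -2dBFS peak.
target=-2.0
peak=$(ffmpeg -i "$1" -filter:a "volumedetect" -vn -f null /dev/null 2>&1 | grep "max_volume" | awk '{print $(NF-1)}')
gain=$(awk -v t="$target" -v p="$peak" 'BEGIN { printf "%.1f", t - p }')
echo "max_volume ${peak} dB, applying ${gain} dB of gain"
ffmpeg -i "$1" -af "volume=${gain}dB" "${1%.*}-normalised.wav"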

Of course, many recent mastering standards/recommendations, especially for CDs, push it even higher, much closer to 0dBFS max peak, and far less dynamic range than was used in the past.

If you perform some measurements on a wide range of popular music ripped from CD you will find that there is typically a max peak range from as low as -5 dBFS (mostly from the 1980s) right up to 0dBFS, with much higher compression in recent years. See also this fantastic article The Death of Dynamic Range from Bob Speer of CD Mastering Services, with some measurements and comparisons between decades, and this wise remark:

'You want your music to be loud? You can make it loud yourself [by TURNING UP YOUR STEREO'S VOLUME CONTROL] -- and the full quality and dynamic range of the music is preserved. .. But when all of your CDs are recorded to be loud right on the discs themselves, you don't have this choice anymore; you no longer have a variety of "loud" music and "quiet" music to choose from and to play at a volume level that suits your musical taste. The record companies are not only filling your CDs with distorted, corrupted audio, they are forcing you to listen to your music in a certain manner -- do you really want that?'

Also from Bob Speer in 2001 comes the wonderful What Happened To Dynamic Range?, with this great animation:

Again: my -2dBFS max peak tip (for chill music) is not a recommendation for a professional TV or radio broadcaster or a movie theatre; it is for a personal music collection to be played through a range of devices at home or on the go, for which the -23LUFS European broadcasting standard will likely be way too low.

So what about live music recording levels for multi-track ?

The recommendation above (-2dBFS peak) is clearly also not a suitable level for most digital recording of live music, and especially not for multi-track recording, where you will likely be reusing and altering a track in different contexts, combined with other tracks, and subjected to various FX and compression etc. Here is a recommendation for recording levels (based on some careful analysis of dynamic range) from dBzee: Digital Recording Levels - a rule of thumb:

'The rule of digital thumb

1. Record at 24-bit rather than 16-bit.

2. Aim to get your recording levels on a track averaging about -18dBFS. It doesn't really matter if this average floats down as low as, for example -21dBFS or up to -15dBFS.

3. Avoid any peaks going higher than -6dBFS.

That's it. Your mixes will sound fuller, fatter, more dynamic, and punchier than if you follow the "as loud as possible without clipping" rule.'

Also:

'Most interfaces are calibrated to give around -18dBFS/-20dBFS when you send 0VU from a mixing desk to their line-ins. This is the optimum level!
-18dBFS is the standard European (EBU) reference level for 24-bit audio and it's -20dBFS in the States (SMPTE).'

During my recent online research I have found similar recommendations, based on very precise analysis of noise characteristics and the capabilities of 24-bit digital systems and, above all, of typical analog-to-digital converters.

And another interesting discussion from Sound on Sound (SOS) Technical Editor Hugh Robjohns: Q How much headroom should I leave with 24-bit recording?:

'The basic idea is to treat -18dBFS as the equivalent of the 0VU mark on an analogue system’s meter, and that’s where the average signal level should hover most of the time. Peaks can be way over that, of course ..

If the material you are recording is well controlled and predictable in terms of its peak levels — like hardware synths tend to be, for example — you could legitimately reduce the headroom safety margin if you really want to. But in practice there is little point.

The only advantage to recording with less headroom is to maximise the recording system’s signal-noise ratio, but there’s no point if the source’s signal-noise ratio is significantly worse than the recording system’s, and it will tend to be that way with most analogue synth signals, or any acoustic instrument recorded with a mic in a normal acoustic space. The analogue electronic noise floor or the acoustic ambience will completely swamp the digital recording system’s noise floor anyway.

Recording ‘hot’, therefore, won’t improve the actual noise performance at all, and will just make it harder to mix against other tracks recorded with a more reasonable amount of headroom. One issue that comes up a lot is the confusion between commercially released media (CD, MP3, for example), which have no headroom margin at all (they peak to 0dBFS), and the requirement for a headroom margin when tracking and mixing.

Going back to traditional professional analogue audio systems, the practice evolved of recording signal levels that averaged around 0VU. OK, you could push things a few decibels hotter sometimes for effect with analogue tape, but a level of around 0VU was the norm, and that normally equated to a signal level of about +4dBu (VU meters are averaging meters and don’t show transient peaks at anything like their true level).

Analogue equipment is designed to clip at about +24dBu, so, in other words, the system was engineered to provide around 20dB of headroom above 0VU. It’s just that the metering systems we use with analogue don’t show that headroom margin, so we forget it’s there. Digital meters do show it, but so many people don’t understand what headroom is for, and so feel the need to peak everything to the top of the meter anyway. This makes it really hard to record live performances, makes mixing needlessly challenging and stresses the analogue monitoring chain that was never designed to cope with +20dBu signal levels all the time.

By recording in a digital system with a signal level averaging around -18 or -20 dBFS, you are simply replicating the same headroom margin as was always standard in analogue systems, and that headroom margin was arrived at through 100 years of development for very good practical reasons.

.. working with average levels of around -20dBFS or so is fine and proper, works in exactly the same way as analogue, and will generally make your life easier when it comes to mixing and processing.

The old practice of having to get the end result up to 0dBFS is a mastering issue, not a recording and mixing one. It is perfectly reasonable (after the mix is finished) to remove the (now redundant) headroom margin if that is what the release format demands.
..
A sensible headroom margin is essential when tracking, to avoid the risk of clipping and allow you to concentrate on capturing a great performance without panicking about the risk of ‘overs’. A similar margin is also required when mixing, to avoid overloading the mix bus and plug-ins (yes, I know floating-point maths is supposed to make that irrelevant, but there are compromises involved that can be easily avoided by maintaining some headroom!).

Once the mix is finished, the now redundant headroom can be removed, and that is a standard part of the mastering process for digital media like CD and MP3.'

So this is what I am basically doing when I go for -2dBFS max peak and around -17 to -14dBFS RMS (about -15LUFS to -12LUFS according to FFmpeg) for chill music end mixes. Play it through headphones on your Mac laptop or iPod or iPhone and you'll find out pretty quickly why. Most modern personal devices seem to benefit on playback from way more volume juice than the -23LUFS broadcast standard.

REMEMBER: preparing pre-recorded, pre-mixed music for playback on your personal playback devices (or capturing/stealing from computer audio sources like online radio streams), recording live music tracks, and mastering are completely different exercises !

Some more useful references on digital audio, quantization, and digital vs. analog levels

All About Digital Audio: Pt 2 by Hugh Robjohns: excellent description of digital quantization and digital noise, from 1998, but still very relevant:

'When it comes to quantising the individual samples of an analogue audio signal, it turns out that our ears can easily hear very small errors in the measurements -- even down to tiny errors as small as 90dB or more below the peak level -- so we have to use a very accurate measurement scale. Figure 1 shows a few audio samples being measured against a very crude quantising scale simply to show the principles involved. Each level in the scale is denoted by a unique binary number -- in this case, three bits are used to count eight levels (including the base line at zero).

Some samples will happen to be at exactly the same amplitude as a point on the measurement scale, but others will fall just above or below a division. The quantising process allocates each sample with a value from the scale, so sometimes the quantised value is slightly lower than the true size of the audio sample, and sometimes slightly bigger. These errors in the description of a sample's size are called quantising errors and they are an inherent inaccuracy of the process.

When the digital data representing the quantised amplitude values is used to reconstruct samples for replay, some of those samples will be generated slightly louder or quieter than the original analogue audio signal from which they were derived -- they will not be entirely accurate. However, whether an audio sample falls on, above, or below a quantising level, and by how much a level is missed is essentially random -- and a random signal is noise. Consequently, quantising errors tend to sound like hiss -- white noise -- added to the original audio signal.

The only way to make quantising noise quieter is to reduce the size of the quantising errors, and the only way that can be done is by making the quantising intervals smaller -- in other words, by using a finer, more accurate scale for the measurements -- just like in the carpet example earlier. The errors will still be there, but if you choose small enough quantising intervals, the errors become vanishingly small, as does the hiss. However, finer gradations require more quantising levels, and so more binary digits are needed to count them.

If the number of quantising levels is doubled, the spacing between individual levels must be halved, and so the potential size of quantising errors must be halved as well. A doubling or halving (in terms of dBs) is 6dB; so every time the number of quantising levels is doubled, the hiss caused by quantising errors is reduced by 6dB. In binary counting, each extra bit added to the number allows twice the number of levels to be counted -- three bits can count eight quantising levels, four bits count sixteen, and five bits count 32 levels. This relationship gives us a handy rule of thumb to estimate the potential dynamic range of a digital system: For each extra bit used to count quantising levels, quantising noise is reduced by 6dB.

So, for example, an 8-bit system should have a dynamic range of 48dB, a 16-bit system (such as DAT and CD) should have a range of around 96dB, and a 24-bit system about 144dB.'

From Vincent Kars, 2012, The Well-Tempered Computer: 16 or 24 bits, which explains exactly why it is better to record at 24-bit:

1 bit=6 dB

SNR=6N+1.8 dB (N in bits) to be exact but for convenience sake, let’s use 6.

The loudest possible signal in digital audio (all bits are 1) is the reference, this is called 0 dBFS (dB Full Scale). All other measurements expressed in terms of dBFS will always be less than 0 dB (negative numbers). 16 bits will go down to -96 dBFS and 24 to -144 dBFS. In essence, 24 bits continue where 16 bits stops. It can resolve micro details 16 bits can’t.

Noise floor

The theoretical maximum signal-to-noise ratio in an analogue system is around 130dB. In practice 120 dB is a very good value. You can’t escape thermal noise

A couple of specs:

Benchmark ADC1 (24 bits 192 kHz) A/D THD+N, 1 kHz at -1 dBFS -102 dBFS, -101 dB, 0.00089%
Benchmark DAC1 THD+N: (w/-3 dBFS input) -107 dB, 0.00045%
Prism Orpheus AD (line in) THD+N -111dB (0.00028%, -0.1dBFS)

Yes 24 bit can capture those very soft tiny details 16 bit can’t but pretty soon you end in the noise floor of the equipment.

The big debate

You can find many debates on the internet about 16 vs. 24. In the pro world this debate has been settled; almost everybody is recording with 24 bits today. They have some very good reasons to do so ..

Also useful concerning levels and metering:

- Meter Madness: Understanding meters and what they're telling us..., By Mike Rivers (RecordingMagazine): Excellent reading, includes the history of VU meters, and the move to digital metering.

- Final Cut Pro: Setting Proper Audio Levels.

- The Well-Tempered Computer: Volume control. In general a super site for discussions on audio. This article compares volume control, quantization errors, and signal-to-noise for 16-bit digital, 24-bit digital, and analog. Excellent calculations and comparison tables, and explains why some audiophiles recommend controlling volume if possible with analog rather than digital (even with 24-bit) to keep noise down (unless you are using floating point digital).

Related: ESS Digital vs Analog volume control slides (PDF). Has excellent graphs in frequency domain of progressive volume reduction in a digital system, showing why it encourages noise, and why (as long as you have nice smooth analog volume control) audiophiles generally avoid digital volume control.

- Wikipedia: dBFS has the following to say on comparing dBFS with analog levels (compare with the graph above):

dBFS is not to be used for analog levels, according to AES-6id-2006. There is no single standard for converting between digital and analog levels, mostly due to the differing capabilities of different equipment. The amount of oversampling also affects the conversion with values that are too low having significant error. The conversion level is chosen as the best compromise for the typical headroom and signal-to-noise levels of the equipment in question. Examples:

- EBU R68 is used in most European countries, specifying +18 dBu at 0 dBFS

- In Europe, the EBU recommend that -18 dBFS equates to the Alignment Level

- European & UK calibration for Post & Film is −18 dBFS = 0 VU

- UK broadcasters, Alignment Level is taken as 0 dBu (PPM4 or -4VU)

- US installations use +24 dBu for 0 dBFS

- American and Australian Post: −20 dBFS = 0 VU = +4 dBu

- The American SMPTE standard defines -20 dBFS as the Alignment Level

- In Japan, France and some other countries, converters may be calibrated for +22 dBu at 0 dBFS.

- BBC spec: −18 dBFS = PPM "4" = 0 dBu

- German ARD & studio PPM +6 dBu = −10 (−9) dBFS. +16 (+15)dBu = 0 dBFS. No VU.

- Belgium VRT: 0dB (VRT Ref.) = +6dBu ; -9dBFS = 0dB (VRT Ref.) ; 0dBFS = +15dBu.

[ED: Warning: the above does not specify the digital bits, usually 24-bit applies here.]

The EBU R68 standard summary 2000 (PDF) makes this important statement:

'The EBU recommends that, in digital audio equipment, its Members should use coding levels for digital audio signals which correspond to an alignment level which is 18 dB below the maximum possible coding level of the digital system, irrespective of the total number of bits available.'

Note that this does agree with standard practice in many application domains for 24-bit, but it is not what many people recommend for 16-bit ! Look at the chart above from Zed Brookes again, and notice the Pro Reference Levels:

+4dBu = 0dBVU = 0VU = -12dBFS(16-bit) = -18dBFS(24-bit EBU) = -20dBFS(24-bit SMPTE)

Some more useful references on loudness, and the "new" European standards vs the USA standards

This one from the BBC is excellent, and at only 13 pages with good summaries it is well worth reading from top to bottom: White paper: Jan 2011: Terminology for Loudness and Level dBTP, LU and all that by Senior Research Engineer Andrew Mason, available as PDF download. It points out that:

'For broadcasting, there is one loudness measurement technique that we should know about. This has been relatively recently standardised by the ITU, and is known as Recommendation ITU-R BS.1770'

'The measurement uses a “K” weighting, so we have the subscript “K” for the quantity “L”. The result is expressed in “LUFS” – Loudness Units relative to Full Scale. 1770 still refers to “LKFS”,'

'The 1770 algorithm is defined such that a stereo sine wave at 1kHz, at -18 dBFS, will have a loudness level, LK, of -18 LUFS'

'Target level – the origin of “-23”

For the sake of a simple life, and reduced audience annoyance, EBU R 128 recommends that all programmes be normalised to an average foreground loudness level of -23 LUFS. The figure of -23 LUFS was chosen as the result of a careful study of broadcasting practice, dynamic range tolerance, and the capabilities of different transmission technologies. Note that this value assumes that gating is used in the measurement to prevent long pauses in a programme bringing down the average loudness.'
..

'True Peak

The general shift away from quasi-peak metering towards loudness metering is complemented by a move towards true peak metering as well. There are three “peak” metering terms that it might be useful to clarify:

- quasi-peak – not really peak at all. Historically measured with a mechanical meter with controlled rise and fall times, such as the well-known “PPM”. Now done in software for digital applications using, for example, a 10ms integration time.

- sample peak – digital measurement of the highest sample value in the signal;

- true peak – digital measurement, interpolating between the actual samples in order to take account of over-shoots that would occur later, with, for example, sampling rate conversion. Recommendation ITU-R BS.1770 includes an over-sampling true-peak meter.'

Wikipedia: Peak programme meter.

Wikipedia: Loudness monitoring

From LUFS/LKFS:

'Loudness, K-weighted, relative to Full Scale (or LKFS) is a loudness standard designed to enable normalization of audio levels for delivery of broadcast TV and other video. LKFS is standardized in ITU-R BS.1770. Loudness units relative to Full Scale (or LUFS) is a synonym for LKFS that is used in EBU R128.'

From Wikipedia: ReplayGain:

'ReplayGain is a proposed standard published by David Robinson in 2001 to measure the perceived loudness of audio in computer audio formats such as MP3 and Ogg Vorbis. It allows players to normalize loudness for individual tracks or albums. This avoids the common problem of having manually to adjust volume levels between tracks when playing audio files from albums that have been mastered at different loudness levels. ReplayGain is now supported in a large number of media players and portable media players and digital audio players. Although the standard is now formally known as ReplayGain, it was originally known as Replay Gain and is sometimes abbreviated RG.'

From Poll: Is 3 dB, 6 dB or 10 dB SPL double the sound pressure?, an interesting article that discusses the difference between "volume/amplitude" increase and "loudness" perception increase, with this rule of thumb:

'Doubling of the volume (loudness) should be felt by a level difference of 10 dB − acousticians say.
Doubling the sound pressure (voltage) corresponds to a measured level change of 6 dB.
Doubling of acoustic power (sound intensity) corresponds to a calculated level change of 3 dB.

+3 dB = twice the power (Power respectively intensity − mostly calculated).
+6 dB = twice the amplitude (Voltage respectively sound pressure − mostly measured).
10 dB = twice the perceived volume or twice as loud (Loudness nearly sensed − psychoacoustics).'
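
The 3dB and 6dB figures follow directly from the power (10·log10) and amplitude (20·log10) dB definitions, which you can verify quickly on the command line (the 10dB-for-double-loudness figure is psychoacoustic, not something you can calculate this way):

$ awk 'BEGIN { printf "power: %.2f dB, amplitude: %.2f dB\n", 10*log(2)/log(10), 20*log(2)/log(10) }'

power: 3.01 dB, amplitude: 6.02 dB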

From Wikipedia: The Loudness War: an excellent discussion of most of the issues concerning loudness measures, comparisons over the decades, and remarks on dynamic range advocacy by engineer Ian Shepherd (see resources above).

A fantastic 6-part series by Hugh Robjohns on Sound on Sound from 1998 (but still relevant): start at All About Digital Audio, Part 2, which has links to the other parts of the series. Everything you ever wanted to know about quantization, metering, headroom, and dither.

From Dennis Bohn, Rane Corporation, 2008/2012, on why there is No Such Thing as Peak Volts dBu:

'It is incorrect to state peak voltage levels in dBu. It is common but it is wrong.

It is wrong because the definition of dBu is a voltage reference point equal to 0.775 Vrms (derived from the old power standard of 0 dBm, which equals 1 mW into 600 Ω). Note that by definition it is an rms level, not a peak level.'

From Normalized Audio and 0dBFS+ Exposure (2012) by Greg Ogonowski:

'Because an analog-to-digital converter or sample rate converter sample clock generally has an arbitrary time relationship to a given piece of program material applied to its input, the same audio can be represented in an infinite number of ways if correctly dithered before the quantizer. Many CDs produced today are normalized to 0dBFS in the digital domain by digital signal processing that is not oversampled and is thus unaware of the peak values of the waveform following playback device D/A converters. Following reconstruction into the analog domain, the peak level of the audio waveform can exceed 0dBFS, a phenomenon commonly known as “0dBFS+,” “intersample peak clipping,” or “true peak clipping.” If the digital-to-analog converter in a consumer playback device does not have 3dB of headroom (3dB being the maximum possible increase in peak level if the reconstruction filter is phase-linear), the converter can produce massive clipping and aliasing distortion components on top of any distortion components introduced by the digital signal processing. Add these to the artifacts produced by the MP3/AAC encode/decode process and it is no wonder that much of today's aggressively mastered music sounds so unpleasantly distorted.

What is particularly pernicious is that if mastering engineers monitor their work through converters having the required 3dB of headroom and do not use meters that show intersample peaks, these engineers will be completely unaware of the additional distortion that many consumer playback devices will produce. Mastering engineers who do not use intersample peak meters are therefore likely to process more aggressively than would if they were able to hear the additional distortion introduced by poorly designed playback components.

Over-processed audio simply creates bad sound. Bad sound in, more bad sound out. It is really no wonder at all why it is so difficult to make radio stations and netcasts sound good with modern material, because it is all grossly pre-distorted! We are pleased to note that in the last year or so, the mastering community has finally started to become more aware of the intersample peak problem, but we are still seeing many major-label CDs that produce intersample peaks above 0 dBFS.'

TIP: Audio Studio Recording: Mastering and Gain Structure: interactive graphic with different meter reference levels.

From an excellent series of audio production tutorials from The Tenth Egg: Production Tip 6 : Preparing a mix for Mastering:

'One of the most frequent questions we get from new clients is how best to prepare their mixes for mastering and what format they should supply them in. Whether you intend to use a mastering service like ours (www.tenthegg.co.uk/mastering), tackle it yourself, or just want to keep a copy for archive there are a couple of simply steps to ensure that the mix you have is fit for the job.

1. Bit Depth and Sample Rate

Though a standard Audio CD can reproduce only 16Bit 44.1kHz digital audio it makes sense to work at the best resolution possible throughout the recording, mixing and mastering stages to ensure maximum quality of the end product. Most soundcards, software packages and hardware recorders now support 24Bit 96kHz recording and while there is still some debate about the benefits of higher sample rates most engineers would agree that 24Bit is the way to go. When it comes to mixing down even if you’ve recorded at 16Bit then there are still benefits to bouncing down your mix at 24Bit. The combination of multiple 16Bit elements will most likely have created a signal with a greater dynamic range. At 24Bit the low level detail, which will be brought up during mastering, will also be more faithfully reproduced. Regarding sample rates, your best bet is to mix down at the same resolution as you recorded. There won’t be any benefit from selecting a higher sample rate and the resulting conversion and re-conversion at the mastering stage may affect quality. If you’re not sure then opt for 44.1kHz.

2. File type

There are often lots of options here (wav, aif, SDII etc.) and most mastering houses will be able to work with whichever format you provide. But for maximum compatibility we would recommend a .wav (broadcast wave) file. Certainly you should try to avoid compressed formats such as MP3 or AAC, but if you’re forced to work in one of these then try and use a data rate of at least 256kbps. Often you will also be presented with the option of ‘split’ or ‘interleaved’ stereo files, with ‘interleaved’ being the preferred option.

3. Headroom

A degree of headroom (the gap in level between the maximum possible and that of the audio) is very important. If a signal clips at any point, even if distortion is inaudible on mixdown, it can become evident during mastering and will limit the processing options. Generally all that is needed is to pull the master fader down so that the meters no longer jump into the red at any point.

If you’re mixing down at 24Bit then you can safely leave as much as 3dB headroom. If you’re working at 16Bit then you’re going to want to maximise dynamic range so closer to 0.5dB is recommended.

4. Mix processing

Most engineers like to add a touch of overall compression, EQ and maybe even limiting when they mix down. This essentially goes some way towards creating that mastered sound and can help mixes sound louder and play better across a range of audio systems. However, this kind of processing can again create problems at the mastering stage, especially if they have been overdone. If possible all overall mix processing should be avoided in the copy destined for mastering, as these processes can be better applied using the specialist equipment and experience available to the mastering engineer. Certainly they should be free from limiting, which can have a similar effect to clipping. If overall processing must be applied then it should be done as conservatively as possible, avoiding large EQ cuts or boosts and compression gain reduction of more than 3dB.

5. Burning questions

If you’re mastering the mixes yourself then job done, you’re ready to master. But if you’re passing them on to a mastering house then you’re probably going to need to burn a disc. You can’t go far wrong here, just ensure that you burn a Data CD rather than Audio CD or all your mixes will get converted down to 16Bit 44.1kHz and will need to be re-ripped at the other end. When burning your disc be sure to use one of the write speeds recommended on the disc to avoid data errors. Also try to avoid touching the surface of the disc before and after burning and refrain from using the disc more than once to verify its contents before sending it off.

Summary

Now if you’re relatively new to music production then all that might sound a bit daunting. But don’t worry, even if you aren’t able to meet every criteria in this list that doesn’t mean that mastering can’t make a massive difference to your mixes. Most mastering studios, including ours (www.tenthegg.co.uk/mastering) can help talk you through the best options for your particular project and help you prepare your mixes. What our recommendation represents is an ideal format that will maximise the benefits of the mastering process. i.e. a 24Bit .wav file at whichever sample rate your recorded with around 2dB headroom and free from overall mix processing.'

From Bob Katz on Digital Domain: Keeping Your Digital Audio Pure from First Recording to Final Master: everything you ever wanted to know about dithering, and the cost of cumulative dithering, and the cost of not dithering.

From Sound on Sound: MASTERING MASTERS: CD Mastering On Your PC: Tools & Techniques (2001): including more on why you need to dither when mastering down from 24-bit to 16-bit CD quality.

From Sound on Sound by Paul White, Feb 1999: 20 Tips On Home Mastering

Turn me up ! Bringing dynamics back to music, including Loudness War - The Movie.

A final word on the "loudness war" and why it matters

In Australia the loudness war has clearly been won by the retailer Harvey Norman who has the loudest and most annoying TV ads in the history of the world (not that I watch much commercial TV). They seem to have discovered a magic "penetration and annoyance" factor that is mixed into their also very visually loud ads. (I therefore refuse to shop in their shops ever, because they completely spoil any attempt to enjoy a movie on commercial TV in Australia.)

Audio engineering tips

Just a Webel zone to collect various useful information on audio/sound engineering. These freestyle pages started in late 2013 because I told a friend about some interesting resources on sound engineering I had read recently; he asked me to send him some links, and I figured I might as well share my summaries with the world.

I quote heavily from Wikipedia and I also quote essential passages from hundreds of excellent online audio engineering resources "in-place" for your reading convenience; please do read also the original references written by some top audio engineers and sound industry experts. No matter how much you think you already know about audio engineering, I promise you it is worth reading them.

Some of the information is specific to Mac OS X; I have in most cases tried to separate Mac-specific material into separate pages.

These info pages are dedicated to SBS Chill, my absolute favourite digital and online radio station !

SBS Chill official home, includes current, previous, next playlists and option to buy selected tracks in high(er) quality.

TuneIn radio home info page for SBS Chill

Specific streams:

SBS Chill: 96kbps MP3

SBS Chill: 96kbps HLS

SBS Chill: 128kbps via Flashplayer (can be embedded in most browsers, simple player appears automatically)

SBS Chill on DAB+ usually broadcasts at (only) 56kbps/AAC. To encourage them to prioritise this amazing music with a higher bit rate, please send encouragement to: chill@sbs.com.au
