At last, we get to the heart of web audio -- the various file formats. This section provides an introduction to some of the most common formats for web audio.
The WAV and AIFF audio formats are very similar in performance. The Waveform Audio File Format (.wav) was originally developed as the standard audio format for the Microsoft Windows operating system, but it is now supported on the Macintosh as well. WAV files can support arbitrary sampling rates and bit depths, although 8 kHz and 11.025 kHz at 8 or 16 bits are most common for Web use.
The Audio Interchange File Format (.aif or .aiff) was developed as the standard audio format for the Macintosh platform, but it is now supported by Windows and other platforms. It can support up to six channels and arbitrary sampling rates and bit depths, with 8 kHz and 11.127 kHz at 8 or 16 bits being the most common online.
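To make sampling rate and bit depth concrete, the following Python sketch uses the standard-library `wave` module to write one second of a sine tone as an 11.025 kHz, 8-bit, mono WAV file and then reads the header back. The tone frequency, the in-memory buffer, and the helper name `make_wav_bytes` are assumptions chosen for illustration; nothing in the format requires them.

```python
import io
import math
import wave

def make_wav_bytes(sample_rate=11025, seconds=1.0, freq_hz=440.0):
    """Return a mono, 8-bit WAV file (as bytes) containing a sine tone."""
    n_frames = int(sample_rate * seconds)
    # 8-bit WAV samples are unsigned, so silence sits at 128.
    samples = bytes(
        int(128 + 100 * math.sin(2 * math.pi * freq_hz * i / sample_rate))
        for i in range(n_frames)
    )
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)            # mono
        w.setsampwidth(1)            # 1 byte per sample = 8-bit
        w.setframerate(sample_rate)  # samples per second
        w.writeframes(samples)
    return buf.getvalue()

data = make_wav_bytes()
with wave.open(io.BytesIO(data), "rb") as w:
    # Channels, bit depth, sampling rate, and number of sample frames
    print(w.getnchannels(), w.getsampwidth() * 8, w.getframerate(), w.getnframes())
print(len(data), "bytes in total")
```

Doubling the sampling rate, the bit depth, or the channel count each doubles the amount of sample data, which is why low rates and 8-bit mono were popular for web delivery.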
WAV and AIFF files are less commonly used on the Web than they once were, now that we have audio formats that are better suited for web delivery (MP3) or designed specifically for the Web (streaming formats). WAV and AIFF files are typically used as the source format for audio that then gets compressed into more web-friendly formats, like RealAudio. They sound good when uncompressed, but they suffer drastic loss of quality when compressed to small file sizes. For this reason they are useful for very short, downloadable audio clips, such as short greetings. They are usually added to web pages via a link for download.
The following summarizes the WAV and AIFF formats:
Good for |
Storing high-quality source audio before converting to web formats, delivering short clips where pristine sound quality is not important, reaching the lowest common denominator (since everyone can play them). |
Delivery |
Download. |
Creation tools |
The majority of sound editing tools can save files in WAV and AIFF format. |
Player |
WAVs and AIFFs generally play using the browser's default function for sound handling (such as Windows Media Player or the QuickTime plug-in). |
MP3's explosion in popularity is nothing short of a phenomenon and has changed the way we use and view the Internet. The key to its success is MP3's ability to maintain excellent sound fidelity at very small file sizes. In fact, its compression scheme can reduce an audio source to roughly one-tenth of its original size. For instance, four minutes of high-quality music in WAV format requires 40 MB of disk space; as an MP3, the same file weighs in at just 3.5 MB! With the arrival of MP3, it was suddenly feasible to transfer songs over the Internet without prohibitive download times. The rest is history.
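The arithmetic behind these figures can be checked with a short Python sketch. It assumes that "high-quality" means CD-style audio (44.1 kHz, 16-bit, stereo), which the text does not state, and it reads MB as 2^20 bytes. The function name `pcm_size_bytes` is made up for the example.

```python
def pcm_size_bytes(seconds, sample_rate=44100, bits=16, channels=2):
    """Size of uncompressed PCM audio, ignoring the small file header."""
    return seconds * sample_rate * (bits // 8) * channels

MB = 1024 * 1024  # assumption: "MB" means 2**20 bytes

wav_mb = pcm_size_bytes(4 * 60) / MB   # four minutes of audio
mp3_mb = 3.5                           # figure quoted in the text

print(round(wav_mb, 1))            # 40.4
print(round(wav_mb / mp3_mb, 1))   # 11.5
```

Under those assumptions the uncompressed file comes to about 40 MB, and the MP3 is between one-eleventh and one-twelfth of that size, in line with the "one-tenth" rule of thumb.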
The MP3s that we've grown to love are technically MPEG-1, Layer-III files. MPEG is actually a family of multimedia standards created by the Moving Picture Experts Group. It supports three types of information: video, audio, and streaming (which, in the context of MPEG compression, is synchronized video and audio).
MPEG uses a lossy compression scheme that is based on human auditory perception. Sounds that are not discernible to the human ear are thrown out in the compression process. The resulting file sounds nearly the same, but contains much less data than the original.
There are a number of MPEG standards: MPEG-1 was originally developed for video transfer at VHS quality and is the format used for MP3s; MPEG-2 is a higher-quality standard that was developed for television broadcast; other MPEG specs that address other needs (such as MPEG-4 and -7) are currently in development. MPEGs can be compressed using one of three schemes: Layer-I, -II, or -III (the "3" in MP3 refers to its compression scheme layer). To learn more about MPEG, visit the MPEG web site (http://www.mpeg.org).
Any audio source file (usually a WAV or AIFF file) can be turned into an MP3 using an MP3 encoder such as Xing AudioCatalyst, iTunes (Mac), or MusicMatch Jukebox. For a complete list of MP3 creation tools, see MP3-Converter.com (http://www.mp3-converter.com).
To make an MP3, begin with raw audio saved in WAV or AIFF format. If the audio is coming from a CD, it will need to be "ripped" first (extracted from the CD format and saved in a format a computer can understand). The next step is to encode the raw audio into the MP3 format. Many MP3 tools rip and encode audio tracks in one step.
When encoding, you'll be asked to set the quality level, or bit rate. The standard quality setting for putting music on the Internet is 128 Kbps (which is near-CD quality sound) at 44.1 kHz. For personal use (to play from your computer or portable MP3 player), you can use the next higher levels (160 or 192 Kbps). To keep file sizes extra small, choose 112 Kbps or lower, but expect a loss in audio quality. In order to stream MP3s at rates acceptable for 28.8 Kbps modem users, many MP3 online "radio" stations use 22.05 kHz mono files compressed at a mere 24 Kbps.
You'll also need to decide whether you want to make CBR (constant bit rate) or VBR (variable bit rate) files. Variable bit rate MP3s adjust their bit rate based on the complexity of the current audio passage. Variable bit rate MP3s can provide an enormous increase in quality at similar bit rates, but because VBR is inconsistently supported, the most reliable choice is CBR. Most of the new MP3 players support VBR, so keep an eye out for VBR to gain more support in the coming years.
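A bit rate translates directly into file size and into the connection speed needed for streaming. The sketch below works through the rates mentioned above. It treats 1 Kbps as 1,000 bits per second and compares against a modem's nominal speed, both simplifying assumptions (real throughput is lower); the function names are hypothetical.

```python
def mp3_size_kb(seconds, kbps):
    """Approximate size in kilobytes (1 KB = 1,000 bytes) of a constant bit rate file."""
    return seconds * kbps * 1000 / 8 / 1000

def fits_connection(kbps, connection_kbps):
    """True if the stream's bit rate does not exceed the connection's nominal speed."""
    return kbps <= connection_kbps

for kbps in (24, 112, 128, 160, 192):
    # bit rate, size of one minute of audio in KB, streamable over a 28.8 Kbps modem?
    print(kbps, mp3_size_kb(60, kbps), fits_connection(kbps, 28.8))
```

One minute at 128 Kbps comes to roughly 960 KB, while the 24 Kbps setting is the only one in the list that fits within a 28.8 Kbps modem's nominal capacity.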
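The difference between CBR and VBR can be sketched at the level of individual MP3 frames. An MPEG-1 Layer III frame holds 1,152 samples, and its length in bytes is 144 × bit rate ÷ sampling rate (ignoring the optional padding byte). The per-frame bit rates in the `vbr` list are invented for illustration; a real encoder chooses them by analyzing the audio.

```python
SAMPLES_PER_FRAME = 1152  # MPEG-1 Layer III

def frame_bytes(bitrate_bps, sample_rate=44100):
    """Frame length for MPEG-1 Layer III, ignoring the optional padding byte."""
    return 144 * bitrate_bps // sample_rate

def stream_bytes(frame_bitrates_kbps, sample_rate=44100):
    """Total size of a run of frames, each with its own bit rate in Kbps."""
    return sum(frame_bytes(k * 1000, sample_rate) for k in frame_bitrates_kbps)

cbr = [128] * 8                             # every frame gets the same budget
vbr = [64, 64, 96, 192, 224, 192, 96, 96]   # hypothetical: quiet intro, busy middle

print(stream_bytes(cbr), stream_bytes(vbr))
print(sum(vbr) / len(vbr))  # average VBR bit rate in Kbps
```

Both runs average 128 Kbps and come out nearly the same size, but the VBR run spends its bytes on the frames that need them most, which is where the quality gain at a similar file size comes from.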
MP3s can be served from a traditional FTP or HTTP server. MP3s can also be streamed using server solutions such as SHOUTcast (discussed later in this section) or RealServer 8. Besides the usual advantages of streaming, a streamed MP3 file is not actually downloaded to the user's computer, which provides better copyright protection.
And speaking of copyright, remember that while there is no problem creating MP3s for your own personal use, it is illegal to upload and distribute audio if you do not hold the copyright for it.
One of the most popular software packages used for streaming MP3s is SHOUTcast from Nullsoft. It makes it possible for people to broadcast audio from their PCs with a minimum amount of hardware and knowledge, over any speed Internet connection (although more bandwidth certainly helps). You can broadcast MP3s to individual users or to many users at once by redirecting your stream to a high-bandwidth server. To listen to a SHOUTcast server stream, open Winamp (or any other stream-capable MP3 player) and bring up the Open Location dialog box. Enter the URL of the server you want to listen to and hit Enter. For a list of SHOUTcast servers (and for more information), visit http://www.shoutcast.com. SHOUTcast is free for download for general non-profit use. For commercial use, there is a one-time licensing fee of $299 (as of this writing).
The following summarizes the MP3 format:
Good for |
Distribution and sale of high-quality audio (like music tracks), radio broadcast-style transmissions at lower bit rates. |
Delivery |
Streaming, download. |
Creation tools |
One of dozens of MP3 encoding programs. See http://www.mp3-converter.com for a complete list. |
Player |
One of dozens of free MP3 players, such as Winamp (Windows), MPEG Audio Player (Mac), or iTunes (Mac); browsers may support MPEG audio via the QuickTime plug-in. You can select a program for MP3 playback in the browser's application preferences. |
Although QuickTime is best known as a video technology, it is also possible to create audio-only QuickTime Movies (.mov). QuickTime is a container format, meaning it can contain a wide variety of media. In fact, the QuickTime 5 format can store still images (JPEG, BMP, PICT, PNG, and GIF), a number of movie formats including MPEG-1, 360-degree panoramic images, Flash movies, MP3 audio, and other audio formats. Once you package up media in a QuickTime .mov file, you can take advantage of QuickTime features such as dependable cross-platform performance, excellent compression, and true streaming.
Although the QuickTime system extension is needed to play a .mov file, it is widely distributed and available for both Windows and Macintosh systems. In addition, recent versions of both Netscape Navigator and Internet Explorer come with the QuickTime plug-in, so a QuickTime audio player can be embedded right on the page. It is a reliable format since you can assume most users have the appropriate plug-in or player.
QuickTime is discussed further in Chapter 25, "Video on the Web". For more information on QuickTime, see http://www.apple.com/quicktime/.
The following summarizes the QuickTime format:
Good for |
Continuous-play audio (music, narration). |
Delivery |
True streaming via RTP or RTSP (using QuickTime Server on Mac OS X Server or the open source Darwin Streaming Server on Unix and NT), pseudo-streaming on HTTP servers, download. |
Creation tools |
Most audio and multimedia editors support QuickTime, or use Apple's basic editing tool QuickTime Pro for $29.95. |
Player |
QuickTime plug-in (part of Netscape Navigator and Internet Explorer) for viewing within a web browser or QuickTime Player (standalone utility). |
MIDI (which stands for Musical Instrument Digital Interface) is a different breed of audio file format. It was originally developed as a standard way for electronic musical instruments to communicate with each other.
A MIDI file contains no actual audio information (the digital representation of analog sound), but rather numeric commands that trigger a series of notes (with instructions on each note's length and volume). These notes are played by a MIDI player using the available "instrument" sounds on a computer's sound card. The function is similar to the way a player piano roll creates a song when run through on the player piano.
As a result, MIDI files are incredibly compact and ideal for low-bandwidth delivery. They are capable of packing a minute of music into just 10K, which is 1,000 times smaller than a one-minute WAV file (approximately 10 MB).
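To see why command-based files are so small, the following Python sketch builds a complete Standard MIDI File that plays an eight-note scale. It writes a program change selecting General MIDI program 1 (piano, sent as value 0), then a note-on and note-off pair for each note. The function names, note choices, velocity, and timing resolution are assumptions made for the example.

```python
import struct

def vlq(n):
    """Encode n as a MIDI variable-length quantity (used for delta times)."""
    out = [n & 0x7F]
    n >>= 7
    while n:
        out.append((n & 0x7F) | 0x80)
        n >>= 7
    return bytes(reversed(out))

def midi_scale(notes=(60, 62, 64, 65, 67, 69, 71, 72), ticks_per_beat=480):
    """Build a format-0 Standard MIDI File that plays one note per beat."""
    events = bytearray()
    events += vlq(0) + bytes([0xC0, 0])                          # program change, channel 1
    for note in notes:
        events += vlq(0) + bytes([0x90, note, 100])              # note on, velocity 100
        events += vlq(ticks_per_beat) + bytes([0x80, note, 0])   # note off one beat later
    events += vlq(0) + bytes([0xFF, 0x2F, 0x00])                 # end-of-track meta event
    header = b"MThd" + struct.pack(">IHHH", 6, 0, 1, ticks_per_beat)
    track = b"MTrk" + struct.pack(">I", len(events)) + bytes(events)
    return header + track

data = midi_scale()
print(len(data), "bytes")  # 101 bytes
```

The whole file is 101 bytes, because each note costs only a handful of bytes of instructions. The sound itself comes from whatever instrument set the player supplies.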
QuickTime and most other MIDI file handlers install a General MIDI (GM) soundset with instruments like piano, drums, bass, orchestral strings, and even vocal "oohs" and "aahs" in standardized MIDI locations. While these sounds may vary in quality and timbre from player to player, General MIDI files can depend on getting a piano sound when they send to Program 1, Channel 1 of the GM Player (built into QuickTime, etc.). These sound sets can be surprisingly good, but they still can't compete with recordings created in a studio. In general, MIDI files will always sound "computery."
Despite this limitation, MIDIs are an extremely attractive alternative for adding instrumental music to your web site with very little download time.
The following summarizes the MIDI format:
Good for |
Background music and loops. |
Delivery |
Download. |
Creation tools |
Requires special MIDI sequencer software, such as Vision, Cakewalk, and Digital Performer. Creating and editing MIDI files can be complicated. Consider using an existing MIDI file if you are inexperienced with music composition and digital audio. |
Player |
QuickTime plug-in or Windows Media Player. MIDI sound engines are built into Internet Explorer and Navigator 4.0 and higher. |
RealNetworks (once Progressive Networks) was a pioneer in producing a viable technology for bringing streaming audio to the Web. Despite heavy competition, it continues to lead the pack in terms of widespread use and popularity, and it has grown to be the standard for streaming audio, including live broadcasts.
RealAudio is a server-based streaming audio solution. The RealServer offers advanced features for streaming audio delivery, including bandwidth negotiation (the proper bit rate version is delivered based on the speed of the connection), RTSP transmission for smooth playback, and administrative tools for tracking usage and minimizing server load. Using the SureStream feature, the bandwidth can be adjusted on the fly (while the file is streaming) to accommodate bit rate fluctuations.
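The idea of bandwidth negotiation can be sketched in a few lines of Python: given several encodings of the same clip, pick the highest bit rate that fits the listener's connection. This is only an illustration of the concept, not RealServer's actual algorithm. The function name, the list of encodings, and the 85 percent headroom factor are all assumptions.

```python
def pick_stream(encoded_kbps, connection_kbps, headroom=0.85):
    """Return the highest encoded bit rate that fits the connection.

    Falls back to the lowest available encoding when nothing fits.
    """
    budget = connection_kbps * headroom  # leave room for protocol overhead
    fitting = [k for k in encoded_kbps if k <= budget]
    return max(fitting) if fitting else min(encoded_kbps)

streams = [16, 32, 64, 96]  # hypothetical encodings of one clip, in Kbps

print(pick_stream(streams, 56))    # 32
print(pick_stream(streams, 28.8))  # 16
print(pick_stream(streams, 14.4))  # 16 (nothing fits, so the lowest is used)
```

Adjusting on the fly, as SureStream is described as doing, amounts to repeating this choice during playback as the measured connection speed changes.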
A robust RealServer system can allow thousands of simultaneous listeners. The server software requires a large investment (starting at around $2000 for the basic package), and RealNetworks charges licensing fees for the number of streams. There is, however, a free version that allows 25 simultaneous listeners. For more information, see the RealNetworks site at http://www.realnetworks.com.
If you aren't ready to commit to a RealServer, RealMedia and RealAudio files can be pseudo-streamed from an ordinary HTTP server for sites with a limited amount of traffic.
To listen to RealAudio files, users must have RealPlayer, which is available for Windows, Mac, and Unix systems. The RealPlayer plug-in comes installed with Netscape Navigator and Internet Explorer and makes it possible to embed a RealMedia player right in the web page.
RealNetworks also offers tools for creating RealAudio and RealMedia files. The latest version (as of this writing) is RealSystem Producer Plus, which provides complete tools for converting audio and video to streaming format. Earlier creation tools include RealEncoder, for simple conversions, and RealPublisher, with advanced features such as wizards for creating HTML and FTP support. Audio can be saved in either the current and preferred RealMedia format (.rm) or the RealAudio format (.ra) for support in older versions of RealPlayer (5 and earlier).
The process for adding RealAudio to a web page is covered in detail later in this chapter. For more information, visit the RealNetworks site at http://www.realnetworks.com. For consumer-oriented information and downloads, see http://www.real.com.
The following summarizes the RealAudio format:
Good for |
Continuous-play audio and live broadcasts to large numbers of people. |
Delivery |
Streaming (via RTSP), pseudo-streaming (via HTTP). |
Creation tools |
One of the RealNetworks encoders (such as RealSystem Producer Plus) or a third-party tool such as Cleaner 5 from Terran Interactive. |
Player |
Freely available RealPlayer, Commercial RealPlayer Plus (with added features), RealPlayer plug-in in Netscape Navigator and Internet Explorer. |
Microsoft's Windows Media is a streaming media system similar to RealMedia. Like RealMedia, it comes with the standard components for creating, playing, and serving Windows Media files. Windows Media wraps all media elements into one Advanced Streaming Format file (.asf), Microsoft's proprietary streaming media format. Audio may also be saved in the nonstreaming Windows Media Audio format (.wma). Because Media Player is part of the Windows operating system, it is widely distributed and stable on the Windows platform. A considerably less well supported version of Media Player is available for the Mac as well.
Windows Media Audio files are encoded using the special Windows Media Audio codec (currently in Version 8) which is ideal for all types of audio at bit rates from 16 Kbps to 192 Kbps. Users must have the Version 8 player to hear audio encoded with the Version 8 codec, so use Version 7 if you don't wish to force your users to upgrade. For voice-only audio at low bit rates (8 Kbps), use the alternative ACELP codec.
The Windows Media system has its advantages and disadvantages. On the good side, the server software comes free with Windows NT Server 4.0 and later, and there are no charges for streams as there are with RealMedia. Administration tools make it easy to track performance and bill per view or per minute. The disadvantages of Windows Media are that the server runs only on Windows NT and that, unlike RealMedia, it doesn't support Flash or SMIL (Synchronized Multimedia Integration Language). Also, although there is a Windows Media Player for the Mac, it lags behind the Windows version in terms of features and performance, so Mac users may miss your content.
For more information on Windows Media, see http://www.microsoft.com/windows/windowsmedia/en/default.asp. The FAQ is a good starting point.
The following summarizes the Windows Media format:
Good for |
Continuous-play audio and live broadcasts. |
Delivery |
Streaming, download. |
Creation tools |
Windows Media Encoder for converting to Windows Media format, Windows Media Author for creating synchronized multimedia presentations. See the Windows Media site for a complete list of creation tools at http://www.microsoft.com/Windows/windowsmedia/en/overview/components.asp. |
Player |
Media Player (shipped with Windows OS), available as download for the Mac as well as a variety of handheld devices that support Windows CE. |
Liquid Audio specifically targets the needs of the music industry by "providing labels and artists with software tools and technologies to enable secure online preview and purchase of CD-quality music." Liquid Audio is not just a file format; it's a professional utility for controlling music sales and distribution. It is very effective in what it sets out to do, but it is not an all-purpose web audio solution.
Liquid Audio delivers CD-quality audio (including streaming MP3s) and is the only streaming format that offers Dolby encoding. Audio files can be watermarked with copyright, owner, and purchaser information, discouraging piracy and copyright violation. The Liquid MusicServer offers a suite of integrated proprietary tools for encoding, serving, and playing Liquid Audio files.
Liquid Player can offer views of album graphics, lyrics, credits, and up-to-date promotions or announcements (such as tour dates). The player works with the Liquid MusicServer (which is easily tied into SQL databases) to enable individual tracks or entire CDs to be purchased online.
For more information, see the Liquid Audio web site at http://www.liquidaudio.com.
The following summarizes the Liquid Audio format:
Good for |
Distribution and sales of music. |
Delivery |
Streaming via Liquid Server. |
Creation tools |
Liquifier Pro. |
Player |
Liquid Player. |
If you want to add short interactive sound effects to a page, such as button rollover noises, consider using a Flash movie (.swf). Flash, developed by Macromedia, is an ideal format for adding high-impact interactivity and animation to web sites. Audio (from short clips to long-playing audio) can be embedded in a Flash movie and triggered instantly by user actions. With other file formats (particularly streaming audio), there is an inevitable delay between the request and playback, making them inappropriate for interactive presentations.
Macromedia also offers Shockwave for putting CD-ROM-like interactive media files on web pages. Shockwave takes Director files (which can take advantage of the robust Lingo scripting language for advanced functionality) and compresses them down for web delivery as .dcr files. Shockwave files may contain internal sound effects and streaming audio in the Shockwave Audio (SWA) format. Despite compression, Shockwave files are not well suited for low-bandwidth connections.
Flash and Shockwave are covered in more detail in Chapter 26, "Flash and Shockwave". For more information, see Macromedia's site, http://www.macromedia.com.
The following summarizes the Flash and Shockwave formats:
Good for |
Interactive sound effects, specialized web applications with embedded long-playing sound. |
Delivery |
Streaming (via QuickTime 4 or RealServer 8), pseudo-streaming (via HTTP), download. |
Creation tools |
Macromedia Flash, Adobe LiveMotion. |
Player |
Flash Player or Shockwave browser plug-in (two of the most widely distributed plug-ins). |
Beatnik's Rich Music Format (RMF) is an HTML-based format that uses scripting languages (like JavaScript) to synchronize interactive soundtracks. RMF uses an advanced collection of MIDI sounds (some proprietary) combined with user-configured samples. The result is excellent sound quality in extremely small files that download fast. Beatnik is another option for adding interactive (user-triggered) sound effects to a web page.
One of Beatnik's claims to fame (besides being co-founded by pop legend Thomas Dolby Robertson) is its Mixman eMix remixers. By clicking on different buttons, users can remix popular songs and send their creations to their friends. For more information, go to http://www.mixman.com.
The Beatnik system consists of Beatnik Player (the browser plug-in required for playing .rmf files), Beatnik Audio Engine (a software audio mixer), Beatnik Methodizer (an automated JavaScript generator for adding Beatnik to web pages), and Beatnik Editor (for creating customized digital audio samples). The learning curve is fairly steep, and you must know some JavaScript to get the most out of the system.
Beatnik's disadvantages are its reliance on the Beatnik Player plug-in, which users must download, and the complexity of its authoring environment. It is also not well suited for long-format audio files. But for short, interactive sound effects, Beatnik offers a big bang for a few bytes. For more information, see the Beatnik web site at http://www.beatnik.com.
The following summarizes the Rich Music Format:
Good for |
Interactive sound effects, background sound loops, specialized online apps like remix machines. |
Delivery |
Download. |
Creation tools |
Beatnik Audio Engine, Beatnik Methodizer, and Beatnik Editor. |
Player |
Beatnik Player (currently not part of standard browser download, but Beatnik is still lobbying for greater distribution). |
Copyright © 2002 O'Reilly & Associates. All rights reserved.