Audio Codecs for Deff Lads
Deff is like, deff is life.
Ear Massage for the Sinful
Music is one of the oldest and most prosperous art forms to have ever graced humanity, with a proud tradition of thousands of years and countless pieces of emotional and technical depth. It speaks directly to humans because we have grown up with it ever since we discovered rhythm, and the ability to create a good piece of music is something that everybody should have, as it is an easy way to affect somebody in a good light.
But the music industry said "boners" and filled it with sixty years of pop songs. What a shame! For every Macklemore there are ten Rihannas, and for every will.i.am there's one tenth of Beck (that's mathematically correct, y'all). But disregarding all the hell it went through (ask any old person and they'll tell you their generation was the best. I can't wait to be an old person so I can be wrong, too), there are still some gems. Now how are you going to store them, you ask? Great question! First we'll transcode some bits to this magnetic drum...
Oh, somebody already did that. Yeah, the great thing about technology is that any dumbass with a computer can make miracles happen that wouldn't have been possible literally thirty years ago. Even the PC you're on would have cost a million dollars ten years ago, unless you're on a Thinkpad. /marked/, motherfuckers. So the problem with audio has gone from finding the room to compute and store it on a PC, to actually finding music to put on your PC, because there is no audio problem and we've progressed as a society to stop worrying about such fundamental problems, much like the difference between eating food and deciding what to eat.
So today we're going to look at some audio codecs (or maybe audio coding formats) and compare them so that you can fit Thrift Shop onto a floppy disk or what have you, not that you would have the capability to procure either an adapter or a disk. That reference goes out to all my viewers from earlier this morning. Better tattoo some straps on your shoulders, because I call that loyalty.
What are these terms I hear?
Did you ever want to be an audiophile? Of course not, because that would be worse than being a stripper at a hookerhouse. You don't get paid for showing off, and nobody compliments you on your tits. But much like a stripper's dancing, we can learn much from the flashy terminologies (note to foreigners: all plural words mean "tits") that our pokephiles, audiophiles I mean, have taught us. And by sheer virtue of fundamental audio concepts, they're not bullshit!
Hertz (Hz) means "times per second", and here it refers to the number of samples taken per second from an analog signal when it's converted to a digital signal (the sample rate). Digital signals aren't continuous like analog signals are - they're actually thousands of separate samples clumped together to form a comprehensible signal. Imagine a staircase. This is what a digital signal is like. Imagine a hill, and that's an analog signal. The samples are what the computer can actually understand. The higher the sample rate, the more of the original signal gets captured, and 44,100Hz is the standard for CD audio, which is still considered to be one of the best mediums to distribute sound on.
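If you want to poke at the staircase yourself, here's a rough Python sketch of what sampling looks like. The function name and the 440Hz test tone are my own picks for illustration, not anything a real encoder uses.

```python
import math

def sample_sine(freq_hz, sample_rate_hz, duration_s):
    """Measure a sine wave sample_rate_hz times per second and return the samples."""
    n_samples = int(sample_rate_hz * duration_s)
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(n_samples)]

cd_rate = 44_100
samples = sample_sine(440, cd_rate, 1.0)  # one second of a 440 Hz tone (arbitrary choice)

print(len(samples))  # 44100 steps on the staircase for one second of audio
print(cd_rate / 2)   # 22050.0, the highest frequency this sample rate can represent
```

That last number is why CD audio comfortably covers human hearing, which tops out at around 20,000Hz.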
Kilobits per second (kbps, sometimes written as kbit/s) refers to how many thousands of bits are processed every second during playback. This is known as a bitrate. For instance, a CD has a bitrate of 1,411kbps, and the media would have a bitrate at or somewhat below that if it was compressed in a lossless format. The higher the bitrate, the higher the audio quality, though this depends on the codec used. For instance, the MP3 format has a maximum bitrate of 320kbps, and with this particular format, lower bitrates would create a noticeable difference in quality. With a format like Opus, you could take a 320kbps MP3 file and compress it down to 96kbps, and it would sound the same to most listeners. With bitrate, it all depends on the codec you use and the contents of the audio.
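That 1,411kbps figure isn't magic. It falls out of the CD format itself: 44,100 samples a second, 16 bits per sample, two channels. A quick sketch of the arithmetic in Python (the function name is mine):

```python
def pcm_bitrate_kbps(sample_rate_hz, bit_depth, channels):
    """Bitrate of uncompressed PCM audio, in kilobits per second."""
    return sample_rate_hz * bit_depth * channels / 1000

cd_kbps = pcm_bitrate_kbps(44_100, 16, 2)  # CD audio: 44.1 kHz, 16-bit, stereo
print(cd_kbps)  # 1411.2
```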
Kilobits per second is used as a sort of standard for audio quality, which is the subjective experience of listening to a song, and is most listeners' first step when discerning the quality of a file. It is not foolproof by any means, though. For instance, a song like Steam Machine compressed with Vorbis (.ogg) down to 64kbps would not sound noticeably different from the original MP3 file, but a song like Get Lucky would have noticeable compression artifacts like distorted vocals (which itself sounds no different from Pharrell's usual singing).
Lossless audio refers to audio that decodes to exactly the same samples as the digital source it was encoded from. It's still only very nearly similar to the original performance, because no format can create a perfect reproduction of the analog sound that went into the microphone. Lossless audio is more often than not a placebo unless you have one of the best sound setups in the world, in which case you don't need to be reading this guide (unless I'm Tom Fucking Scott and have such high production values that you're drawn to this green streetlamp like a moth), because a properly encoded 320kbps MP3 file is very near the limits of human hearing, as the format was designed specifically with the human ear in mind. Anything above that will provide diminishing returns for the file size you get, which may harm digital distribution by making files costly to download.
In contrast, lossy audio is any audio which has been compressed down to save space by throwing some of the data away. Technically all data transferred to a digital signal is "lossy" on some level, but let's not get pedantic. The goal of lossy audio is to provide similar quality to lossless audio while reducing the size of it, giving it more applications for streaming, game development, and microcomputers and what have you. Almost every piece of audio you hear online is compressed, and so it is lossy, because if it wasn't, most of the web pages you visit would load far too slowly for your tastes. The difference between a 116kbps Vorbis (.ogg) and a 1411kbps PCM (.wav) is the difference between 500 kilobytes and 5000 - a 90% reduction (that's correctly mathematic, y'all) for a difference that most listeners won't even notice or care about. A 90% decrease in the bandwidth you serve to your users is huge, which is why a good lossy format is essential to the operation of the Web.
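For the doubters, here's a small Python sketch of where numbers like "500 kilobytes versus 5000" come from. The 30-second clip length is my assumption (it's the duration that lands closest to those sizes), and the function names are made up for the example.

```python
def size_bytes(bitrate_kbps, duration_s):
    """Approximate audio payload size: bits per second times seconds, over 8."""
    return bitrate_kbps * 1000 * duration_s / 8

def reduction_percent(original_kbps, compressed_kbps):
    """How much smaller the compressed stream is, as a percentage."""
    return (1 - compressed_kbps / original_kbps) * 100

CLIP_SECONDS = 30  # assumed clip length, not a figure from any real file

print(size_bytes(1411, CLIP_SECONDS))  # 5291250.0 bytes, roughly 5000 kilobytes
print(size_bytes(116, CLIP_SECONDS))   # 435000.0 bytes, roughly 500 kilobytes
print(round(reduction_percent(1411, 116), 1))  # 91.8
```

So "90%" is, if anything, slightly underselling it.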
All of this ties into the concept of transparency, which is when a lossy format becomes indistinguishable from a lossless format. A transparent file partakes in all the quality of lossless audio with the size benefits of lossy audio, meaning that a user gets the best of both worlds. Most audio codecs strive to be transparent at lower bitrates, because that means that they're more likely to be used in applications that need to keep their size overhead low yet still want to provide quality for their users, such as voice chat, streaming, and online gaming. Converting a lossless file to a lossy codec poorly, say compressing a 1600kbps FLAC down to a 16kbps MP3, will result in a painfully opaque file, though compressing it to a 320kbps MP3 will result in a transparent file. However, taking a lossy file and upscaling it to a lossless format, like taking a 96kbps Vorbis (.ogg) and upscaling it to a 960kbps FLAC, will do absolutely nothing (and I mean "absolute" as in 100% certainty that it will do nothing) to improve the quality of the file, and it will increase the size of the file many times over by adding in thousands more bits per second that are redundant because the original music in the 96kbps file didn't have those bits.
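Here's a toy Python demonstration of why upscaling does nothing. Fair warning: this is NOT how MP3 or Vorbis actually work. The "lossy codec" here just snaps samples to a coarse grid, which is my stand-in for "throwing detail away", and all the function names are invented for the example.

```python
import math

def toy_lossy(samples, step):
    """Throw away detail by snapping each sample to a coarse grid (not a real codec)."""
    return [round(s / step) * step for s in samples]

def toy_upscale(samples):
    """'Convert to lossless': store the exact same values, just taking more room."""
    return [float(s) for s in samples]

def max_error(a, b):
    """Largest difference between two equally long lists of samples."""
    return max(abs(x - y) for x, y in zip(a, b))

original = [math.sin(2 * math.pi * n / 50) for n in range(200)]
lossy = toy_lossy(original, step=0.25)
upscaled = toy_upscale(lossy)

print(max_error(original, lossy) > 0)  # True: detail was lost
print(max_error(original, upscaled) == max_error(original, lossy))  # True: none of it came back
```

The lossless copy faithfully preserves the damage, which is all a lossless format ever promised to do.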
Also, a codec is the method used to encode and decode the actual audio data, the container wraps around the encoded data to make it easier for programs to read, and the extension is what the user sees to identify a file. Vorbis is the codec, Ogg is the container, while .ogg is the extension. I'm hammering this in because the distinction between "Ogg" and "Vorbis" is a stupid one: the two names have no obvious connection to each other, and .vorb or .orb would have been suitable, seeing as neither of those is taken. The distinction between these terms is only really used by professors and such, and it flies over the heads of most audio nerds, who are probably busy talking about headphones or some other meatbuy of the week.
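One way to see that the extension is just a label: programs usually work out the container from the first few bytes of the file, not from its name. Below is a minimal Python sketch using the well-known signatures for a handful of containers. The function name is mine, and it's nowhere near a complete detector (an MP3 with no ID3v2 tag, for one, would slip past it).

```python
def sniff_container(header: bytes) -> str:
    """Guess the container from the first 12 bytes of a file."""
    if header[:4] == b"OggS":
        return "Ogg"
    if header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        return "WAV"
    if header[:4] == b"FORM" and header[8:12] in (b"AIFF", b"AIFC"):
        return "AIFF"
    if header[:4] == b"fLaC":
        return "FLAC"
    if header[4:8] == b"ftyp":
        return "MP4 family (.m4a and friends)"
    if header[:3] == b"ID3":
        return "MP3 with an ID3v2 tag"
    return "unknown"

# Made-up headers for illustration; with a real file you would read its first 12 bytes.
print(sniff_container(b"OggS" + bytes(8)))                  # Ogg
print(sniff_container(b"RIFF" + bytes(4) + b"WAVE"))        # WAV
print(sniff_container(b"definitely not audio"))             # unknown
```

Rename a .ogg to .mp3 and the first four bytes will still say OggS.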
Getting into specifics
During the writing of this article, the blog mod (god mod) attempted to wrangle access to a private torrent tracker, which is anything you probably haven't heard of if you're a normie. While my Art of Misdirection attempts failed (please no hackerino), I did gain access to a signup-based site with a small reputation. It had an underwhelming collection, with slower and more restrictive download speeds than what a public tracker can offer. Supposedly it offers higher quality, but I'm the guy listening to music at 32kbps, so what do I care? I'll just stick to The Pirate Bay, if the website would ever get online (let the phoenix burn!).
Actually getting into the tracker was a treat, and though it wasn't What.CD (you need your real IP to use a BitTorrent tracker? fucking serious m8?), they have a nifty guide involving a lot of the fundamentals of audio and ishhh, so if you're curious and want to check out that tracker (and let me know if it's worth going to / how I can spoof my IP address), you can just do it.
Now featuring, the main audio formats that rule your digital life:
PCM (most commonly .wav) files are the closest things we have to uncompressed audio, as they are designed to store as much audio data as possible without compressing anything - not even silence. As a result of this, PCM files, which are almost always designated as WAV files due to their most popular extension .wav, are considered a standard in lossless audio, and are used extensively in applications such as CDs and DVDs. PCM files are also used in many developer applications such as sound design and audio engineering, because of their acceptance by pretty much every workstation on the planet. If you get a genuine PCM file, hold on to it, because it's the closest thing to the original audio source you have, and can be compressed down to other lossless formats with ease. Technically, PCM refers to a method of encoding audio, while .wav is a container that usually holds PCM.
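To show how little is going on inside a WAV, here's a sketch that writes one second of PCM with Python's standard `wave` module and checks that the file is basically the raw samples plus a small header. The 440Hz tone and half-volume amplitude are arbitrary choices of mine, and it writes to memory rather than to disk.

```python
import io
import math
import struct
import wave

RATE = 44_100  # CD sample rate

# One second of a 440 Hz tone as 16-bit little-endian mono samples.
frames = b"".join(
    struct.pack("<h", int(32767 * 0.5 * math.sin(2 * math.pi * 440 * n / RATE)))
    for n in range(RATE)
)

buf = io.BytesIO()
with wave.open(buf, "wb") as wav:
    wav.setnchannels(1)   # mono
    wav.setsampwidth(2)   # 2 bytes = 16 bits per sample
    wav.setframerate(RATE)
    wav.writeframes(frames)

data = buf.getvalue()
print(len(frames))             # 88200 bytes of raw samples
print(len(data) - len(frames)) # the handful of header bytes the container adds
print(data[:4], data[8:12])    # b'RIFF' b'WAVE'
```

No compression anywhere: a second of silence would take exactly as many bytes as a second of noise.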
AIFF (.aiff) is the Audio Interchange File Format, which is an implementation of PCM but never took off because Apple had no industry power when it was created. It was created by Apple in 1988, and was an extension of an earlier format called IFF that was popular on the Amiga, though AIFF was designed to be compatible with a lot of computers. It could be considered a precursor of the WAV file container, providing a lossless solution from before WAV's time, even though no computer on Earth at the time could store much audio on it. It's still a supported format on Bandcamp, for some reason, even though it's been completely overshadowed by WAV and Apple fans would get better filesizes with ALAC.
FLAC (.flac) is the Free Lossless Audio Codec, which is FLOSS and is supported in most media players, except iTunes, because Apple is a monopolistic shitshow that would rather have you use its own ALAC format. FLAC is considered a standard in lossless audio, because of its strong software compatibility and generously small filesizes for a lossless codec, as well as a cultural tradition in scene groups. On a technical side, it's able to be decoded faster than a lot of other codecs, making it popular for high-end mobile music libraries (if such a thing exists).
ALAC (.m4a) is the Apple Lossless Audio Codec. It is unexceptional, except that it lives in the same MP4 container that iTunes wrapped DRM around for years, and that Apple uses it as the exclusive lossless format for iTunes. The codec itself was open-sourced by Apple in 2011. If you're not stuck in the Apple walled garden, you have no reason to use this format.
MPEG-1 or MPEG-2 Audio Layer III, known universally as MP3 (.mp3), is the most popular audio compression format in the world. Developed as part of an effort to make music that's 1/12 the size of a CD with the same perceptible quality, it has been popularised by filesharing groups as the de facto method of music distribution (who said piracy harmed the music industry?) and is now accepted as the lossy audio standard for music everywhere. Every music player in the world can understand MP3 files, and a 320kbps MP3 sounds the same as a lossless audio file, but at a much lower filesize, and a 192kbps MP3 sounds the same to most of the population. The problem with MP3 is that it rapidly degrades in quality once you drop to 128kbps or lower, which is why it's slowly losing popularity in favour of other audio codecs. Interesting to note that, despite its commonness, MP3 is technically patented, and thus shipping an encoder or decoder without a licence from the patent holders is asking for trouble (crippling its adoption back in the 90s), though working out exactly who holds which patent is a headache, and a company trying to start a war against MP3 is like starting a war against the entire Internet (tip: Internet always wins).
AAC (.m4a) is the Advanced Audio Coding format, which was supposed to be developed as a successor to MP3. Needless to say, it hasn't happened, unless you. The only people who use this format are those stuck on iTunes (seriously, what the fuck is wrong with iTunes? does it have Terminal 7?) or hardware manufacturers who see it as a "safe" alternative to MP3 and want to conform to the clusterfuck of standards it brings. It's patented and not freely licensed, meaning that it'll never be used in FLOSS projects, which is why all of them use an alternative codec instead, thus ensuring AAC's inevitable demise in the years to come. Also, it can technically be modified to be lossless, but almost nobody uses it that way.
Advanced Systems Format is Microsoft's container for audio/video storage, though the part we're interested in is the audio codec that goes inside it, Windows Media Audio (.wma). The format is completely unremarkable, as next to nobody uses it unless it's the last resort for the software they're working with. It doesn't excel in any area and its only saving grace is that it works with Windows right out of the box, but then so does MP3. It is also patented, can carry DRM, and is non-free software. What an underwhelming performance from Microsoft. 10/10 at the very least.
Vorbis (.ogg) is a completely free codec, which is most notable for being used on Wikipedia because whoever owns the other formats would rather sue fucking Wikipedia than lose a few pennies. Vorbis is free of all patent and software restrictions, and compresses music down better than almost any format, making even 32kbps recordings sound decent, and only really degrading in quality at its minimum of 16kbps. Because of the great compatibility between Vorbis and applications, its free nature, and its quality-to-size ratio, it's often used in video games as an alternative to MP3. Not often enough, though, because some game developers don't understand how to compress files properly, which is why a 5MB visual novel was 200 times bigger than it needed to be. Ogg is technically a container, while Vorbis is the actual coding format, so there's a lot of different content that can be saved in Ogg.
Opus (.opus) is another free codec developed by the Vorbis creators. The speciality of Opus is that it's able to handle very low bitrates (sub 32kbps) at an acceptable quality with minimum overhead, making it the best format for streaming and broadcasting implementations. Its lack of support in popular players, general obscurity, and the specific software needed to convert the format mean that it still hasn't gained broad acceptance. The actual quality of Opus files depends on the contents of the file - certain songs come out with noticeable compression artifacts, which Vorbis avoids at equivalent bitrates. This makes Vorbis a better choice for music, because it sounds closer to the original source with only a general reduction in quality, as opposed to Opus which prioritises speech over everything else, meaning it will cause certain aspects of a song to simply not exist as part of its compression algorithm, which is especially noticeable with long notes and reverb. Opus does tear ass when it comes to speech, as even its minimum of 6kbps produces recognisable words, turning a 4-minute song into 0.2 megabytes of recognisable lyrics.
Having read all of this information, you are now smarter than the general population when it comes to audio compression. Go forth onto /mu/ with your shitty taste and take photos of MC ride eating a cheeseburger, so that I may save bandwidth and not have Neocities rag me out because I'm costing them money (hasn't happened yet, but I'm future-proofing!). Just kidding. I actually do like having you here, because to be able to speak to people who are like me is a privilege I'd never want to give up. I guess that's part of the reason we make art and talk into our pillows - we all want somebody just like us to be able to talk to. And if you're extra special, you'll find somebody else like you, and perhaps you could stare into their eyes and feel at peace.
But not me, as relationships are a distraction from real work (good enough for Newton, good enough for me), and I can simulate every positive emotion through fictional means. If it's cruel to you, then perhaps you depend on putting your happiness into somebody else's arms and not into your own. That's pitiful, to not be a whole person and have to rely on somebody else to fill in the gaps. But then again, you're reading about audio compression on a Saturday night (sunday morning! k-os kills it), and I'm the dummy who's writing about the topic. So what do I know about relationships other than what I've thought of for myself?
Better put on some (not) chillwave and learn about all of the applications of these codecs, because God knows it's just another distraction until you end up on your deathbed praying that you spent a small portion of your life with somebody who actually understood you. I'm not saying that's what will happen. I'm saying that's a possibility.
Apps and Applications:
PCM .wav is used whenever you want the absolute original quality, and is generally what you would use in high-end music production, seeing as it captures everything, including what not even the human ear can hear. That's great for audiophiles and other such placebo-affected victims, but the big draw of PCM is that, being on top, it's able to be compressed to just about every god damn format in the world and still retain some of the magic (portions which nobody understands) of the original file, which can't be said for compressing from other codecs, as there's always a little bit of magic lost. It's easy being on top, because everybody wants to use you, including in hardware distribution methods and development environments, where converting from an original file to whatever format is needed can be done automatically, meaning that silly developers won't have to worry about that type of stuff.
PCM .aiff is not very popular at all, and in a world where power is determined by popularity, don't expect to see much of AIFF in the future. Unless you're from Apple. If you are, ship up or shape out.
FLAC is the standard format for lossless audio, as its high compatibility and high quality encoding make it very popular for music distribution. It can be read by any media player worth its salt, is free software, and has decent filesizes for a lossless codec. All of this makes it an attractive format for anybody who needs to distribute their lossless audio samples or whatever and doesn't want to have the massive filesizes of PCM. Its continued use by scene groups and on Bandcamp cements its popularity as a method of music distribution.
ALAC is shittier FLAC. The only reason it's used is for iTunes bonus tracks, or if you're Apple, seeing as Apple made the bloody thing. Normally I don't shit on companies unless they deserve it, seeing as I'm not likely to change anybodys mind in two sentences worth of information (aren't we rational little monkeys? we're cute and funny and yet we're all incredibly stupid, too. a bit like a retarded girlfriend), but Apple has been phoning it in today. Jobs is rolling in his grave, probably to scare off Richard Stallman (somebody was paid to call Stallman "strange, misguided", which is as cringe-inducing as all the other Gawker articles at the time of his death).
MP3 is used in everything. It's a staple of the Web and is the most important method of distribution for almost any audio file. It can be read by every media player on the planet, and a 320kbps MP3 is the most transparent file for everybody except for the select group of people with thousands of dollars worth of audio equipment and the ears to use it. It needs no introduction, and it also needs no closure, until it eventually fades away in the public eye to be replaced by some other format, hopefully Vorbis.
AAC is used by hardware vendors who think MP3 is going out of style. As to why they continue conforming to the clusterfuck of industrial standards that AAC brings, I don't know. I can only imagine that they're scared of open formats because then they'd have to admit that their competitors have a point, and we can't have that now, because that might mean losing an inkling of power to a group that will make zero profit whatsoever off your implementation.
.wma (nobody uses the full name) is only used for Windows applications, seeing as it works out of the box and doesn't require a lot of additional coding to implement. It's underwhelming, like I say, and should be seen as the emergency rations of codecs, a bit like the television is to entertainment.
Vorbis is used when you want to compress files down hard and fast, with little difference in quality. It can turn a 4.7 megabyte MP3 into a 1.2 megabyte .ogg while having a minimal reduction in quality, meaning the size-quality ratio is attractive to anybody looking to shrink down their music collection at the cost of making it sound a little bit worse. You must keep in mind though, that once you develop a taste for quality, it will never go away. Learn to be satisfied, instead. It's much easier. It's also free software, so implementations in entertainment like video games and free software projects are popular, while also being easy to decode and not very resource intensive to play. It's a little miracle, in my opinion.
Opus has one speciality, and that's speech. It can compress that same 4.7 megabyte MP3 down to 250 kilobytes, and still have all of the words be recognisable, though it completely wrecks all the music within. It's the perfect codec for online streaming applications, as it's very easy to decode and has a very low latency, meaning that audio lag is unlikely for even the shittiest implementations. Its low size furthers this reputation, making it easy on bandwidth while still providing acceptable quality. Because it focuses on speech above all else, it isn't suitable for general sound effects or music collections, though for voice-related files it's a very attractive option for developers. It's also freeeeeee!
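To put those filesizes in bitrate terms, here's a back-of-the-envelope Python sketch. I'm assuming the track is four minutes long and that a megabyte is 1,000,000 bytes. Neither assumption comes from the file itself, so treat the output as ballpark only.

```python
def average_kbps(size_bytes, duration_s):
    """Average bitrate implied by a file size and a running time."""
    return size_bytes * 8 / duration_s / 1000

DURATION_S = 240  # assumed 4-minute track

for label, size in [("MP3", 4_700_000), ("Vorbis", 1_200_000), ("Opus", 250_000)]:
    print(label, round(average_kbps(size, DURATION_S)), "kbps")
# MP3 157 kbps
# Vorbis 40 kbps
# Opus 8 kbps
```

Under those assumptions the Opus file is running at about a twentieth of the MP3's bitrate, which is why the lyrics survive and the music doesn't.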
Rip an album from a CD using fre:ac, then encode the files and clamp them to their absolute minimum quality (except for lossless, because that defeats the point of lossless). The end results will give you an idea of what I'm talking about, and comparing them can be fun for those of us who enjoy understanding quality as a series of incremental steps.
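If you'd rather script the comparison than click through fre:ac, here's a sketch that builds ffmpeg commands instead (that's me swapping in a different tool, not the method above). The file name and the floor bitrates are placeholders I picked. Whether a given encoder accepts a given bitrate depends on your ffmpeg build and the sample rate of the source, so expect to nudge the numbers.

```python
# Encoder names are ffmpeg's; they only work if your build includes them.
ENCODERS = {
    "mp3": ("libmp3lame", ".mp3"),
    "vorbis": ("libvorbis", ".ogg"),
    "opus": ("libopus", ".opus"),
    "flac": ("flac", ".flac"),
}

def encode_command(src, codec, bitrate=None):
    """Build (but don't run) an ffmpeg command line for one codec."""
    encoder, ext = ENCODERS[codec]
    out = src.rsplit(".", 1)[0] + "_" + codec + ext
    cmd = ["ffmpeg", "-i", src, "-c:a", encoder]
    if bitrate:  # lossless gets no bitrate, because that defeats the point
        cmd += ["-b:a", bitrate]
    return cmd + [out]

# Hypothetical source file and bitrates, purely for illustration.
for codec, bitrate in [("mp3", "32k"), ("vorbis", "32k"), ("opus", "6k"), ("flac", None)]:
    print(" ".join(encode_command("track01.wav", codec, bitrate)))
```

Hand each list to `subprocess.run` once you're happy with it, then line the results up and listen.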
You will find though, that it is far easier to choose a codec that suits your particular fetish and use it for all of your music than it is to sort through each song individually and try to find the best codec for it. If you like small filesizes, use Vorbis, and for don't-give-a-fuck small files, use Opus. If you want the best quality, use PCM, or FLAC if you need more space. Whatever you do, don't convert a lossy file to a lossless one, because it's literally wasting space. If you're a normie who just doesn't give a fuuuuuuuuck, use MP3, as it's accepted by everything, though you will soon grow out of your mainstream ways.
It is strange though that what you use to encode the audio matters far less than the contents of the audio itself. If it takes 50MB or 1MB to make me cry, I guess it did its job, now didn't it? I can't recommend you anything that's certain to affect you emotionally, but sometimes we must chew the tattered string and show some humility by sharing our own experiences. For me, the Seasons EP by a (horse) bird nerd named Jackleapp did me in, and it's the scattered collection of his songs which inspired me to start organising my collection into album swaps, which is reorganising an album's songs to create a new listening experience. I might upload them. For now, I'm happy to have listened to the simple and colourful lyrics and melody that made me blush.
It's precious to me to be able to be a part of something like that. I wonder what the greater feeling is? Is it better to be a part of something you would never be a part of due to sheer virtue of fiction, or is it better to be satisfied in your acceptance of the ability of the world to make you feel as much as fiction?
I wonder if a pop song ever answered that question.
Music for your eyes - Froghand.
Today's page was updated on June 12, 2016!
Show somebody how it feels to feel good.