Listen to what gets lost when an MP3 is made (2015)

(vox.com)

142 points | by teleforce 8 months ago

94 comments

gwbas1c 8 months ago

In 1999 when MP3 was getting attention, I tried to do this. I encoded a file, then inverted it, and mixed it back into the original.

It didn't cancel anything out.

The reason: MP3 dramatically alters phase. Because all the phases are different, it's hard to naively determine how the signal is altered.

Years later, I took the time to write a series of tools to investigate lossy audio: https://andrewrondeau.com/blog/2016/07/deconstructing-lossy-...
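
A rough sketch of why the invert-and-mix test does not cancel, assuming a single 1 kHz tone and a hypothetical codec that keeps the amplitude exactly but shifts the phase by 30 degrees. Real MP3 behaviour is more complicated; the tone, the shift and all names below are illustrative assumptions only.

```python
import math

SAMPLE_RATE = 44100               # Hz, CD sample rate
FREQ = 1000.0                     # Hz, test tone (assumed for illustration)
PHASE_SHIFT = math.radians(30)    # hypothetical phase shift added by the codec

def tone(phase, n=SAMPLE_RATE):
    """One second of a unit-amplitude sine with the given starting phase."""
    return [math.sin(2 * math.pi * FREQ * i / SAMPLE_RATE + phase) for i in range(n)]

def peak(signal):
    """Largest absolute sample value."""
    return max(abs(s) for s in signal)

original = tone(0.0)
decoded = tone(PHASE_SHIFT)       # same amplitude, different phase

# Null test: invert the decoded signal and mix it with the original.
residual = [a - b for a, b in zip(original, decoded)]

# For two equal-amplitude sines the residual peak is 2*sin(shift/2).
expected_peak = 2 * math.sin(PHASE_SHIFT / 2)
print(f"residual peak: {peak(residual):.3f} (theory: {expected_peak:.3f})")
```

Nothing was removed from the tone in this sketch, yet the residual is roughly half the original amplitude, so the leftover says more about the phase change than about what was discarded.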

tecleandor 8 months ago

Oh! Is that an artifact of Joint Stereo encoding?

gwbas1c 8 months ago

> Oh! Is that an artifact of Joint Stereo encoding?

I personally don't understand enough of MP3's internals to explain that.

What I assume is that, because MP3 internally stores Fourier transforms in the frequency domain, instead of the time domain, it uses very few bits to store phase. This will result in phase shifts.

Hopefully, someone else can give a better explanation of this than I can.

Basically, think of it this way:

1: Imagine a frame of 64 16-bit samples. (1024 bits)

2: Then it's converted to 32-bit floats: 64 32-bit samples.

3: Then it's transformed.

4: Now there are: 1 DC bias 32-bit float, where phase is 0, and 32 frequency amplitude floats, and 32 phase floats. (The other 31 frequency/phase pairs are duplicates and don't need to be stored.)

These floats need to be quantized so that there are far fewer bits. Some of it can happen by using very few bits to store phase, and some of it can happen by using fewer bits to store amplitudes. Again, someone else can probably explain this better than I can.
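
A small sketch of the thought experiment above, using a naive DFT on one 64-sample frame and then storing the phase with only 3 bits. This only illustrates the comment's reasoning and is not the actual MP3 bitstream (MP3 uses a filter bank plus an MDCT rather than a plain Fourier transform); the test signal and the 3-bit figure are assumptions picked for the example.

```python
import cmath
import math

N = 64  # frame length from the comment above

def dft(frame):
    """Naive O(N^2) discrete Fourier transform (fine for a 64-sample frame)."""
    n = len(frame)
    return [sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame))
            for k in range(n)]

def inverse_from_half(bins, n):
    """Rebuild a real frame from bins 0..n/2 using conjugate symmetry."""
    full = list(bins) + [bins[n - k].conjugate() for k in range(n // 2 + 1, n)]
    return [sum(c * cmath.exp(2j * math.pi * k * i / n) for k, c in enumerate(full)).real / n
            for i in range(n)]

def quantize_phase(c, bits):
    """Keep the magnitude, snap the phase to one of 2**bits evenly spaced angles."""
    step = 2 * math.pi / (1 << bits)
    return cmath.rect(abs(c), round(cmath.phase(c) / step) * step)

# Test frame: two sines with arbitrary phases (illustrative input only).
frame = [math.sin(2 * math.pi * 3 * i / N + 0.4)
         + 0.5 * math.sin(2 * math.pi * 10 * i / N + 1.3)
         for i in range(N)]

half = dft(frame)[: N // 2 + 1]      # bins 0..32; the rest are mirror images
coarse = [quantize_phase(c, bits=3) for c in half]
rebuilt = inverse_from_half(coarse, N)

max_error = max(abs(a - b) for a, b in zip(frame, rebuilt))
print(len(half), "unique bins; max sample error with 3-bit phase:", round(max_error, 3))
```

The magnitudes are untouched here, so the spectrum looks identical, but the rebuilt waveform no longer matches the input sample for sample. That is the kind of change that defeats a naive subtraction.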

tecleandor 8 months ago

Interesting. In a quick internet search it seems like people find a small phase shift in MP3 encoding, but I haven't found details on why. Haven't had much time to look into it, though...

gwbas1c 7 months ago

Speculation on my part:

Let's assume an MP3 stores 8-sample Fourier transforms, overlapping via windowing.

1: Start with 8 16-bit samples. (For a total of 128 bits from the original CD)

2: Convert to 32 bit float

3: Run a forward Fourier transform

4: Now you're in the frequency domain

5: Keep the amplitude for the 0th frequency bucket. Quantize at 8 bit with companding. (This keeps mid-range frequencies, where the ear is most sensitive, generally lossless.)

6: Apply an exponential function to the amplitudes of the higher frequencies (slots 1, 2, 3, and 4) so that very quiet highs are discarded. Then quantize the highs at 6 bit.

7: We will ignore the phase. Remember that slots 5-7 are just copies of 1-3

8: Now, with slot 0 at 8-bit, and slots 1-4 at 6-bit, the window is a total of 32 bits. (8 + 4 × 6 = 32)

Because of windowing, each Fourier transform is crossfaded. (Basically, the first 8-sample transform is at sample 0, the next is at sample 4, the next is at sample 8, and each is cross-faded.) This means that the ratio of bits is (32 + 32) / 128 = 0.5. Further compression (by storing the difference between windows) can probably get to around 0.25.
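
The bit budget in the steps above can be checked with a few lines of arithmetic. This is only the speculative toy codec from this comment, not MP3 itself; the constant names are made up for the sketch, and slot 0 uses the 8-bit figure from step 5, which is what makes the window total 32 bits.

```python
FRAME_SAMPLES = 8     # samples per transform window
SOURCE_BITS = 16      # bits per CD sample
HOP = 4               # a new window starts every 4 samples (50% overlap)

SLOT0_BITS = 8        # step 5: bucket 0, companded
HIGH_SLOTS = 4        # step 6: slots 1-4
HIGH_BITS = 6         # bits per high slot

window_bits = SLOT0_BITS + HIGH_SLOTS * HIGH_BITS        # 8 + 4*6
source_bits_per_frame = FRAME_SAMPLES * SOURCE_BITS      # 8 * 16
windows_per_frame = FRAME_SAMPLES // HOP                 # overlapping windows per 8 samples

ratio = windows_per_frame * window_bits / source_bits_per_frame
print(f"{window_bits} bits per window, compression ratio {ratio}")
```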

vagab0nd 8 months ago

Why can't you convert the wave into frequency domain and compare that?

a-french-anon 8 months ago

This article could at least have a paragraph explaining (in a dumbed-down way) why various kinds of psychoacoustic masking (temporal, frequency) make what's removed almost inaudible anyway. Reading the linked source (https://www.theghostinthemp3.com/theghostinthemp3.html), he at least used LAME, but at a fixed 128 kbps bitrate, not in VBR mode =(

EDIT: nerds should read about sfb21 (https://wiki.hydrogenaud.io/index.php?title=LAME_Y_switch). AAC, Vorbis and Opus (CELT) aren't just theoretical improvements.

magicalhippo 8 months ago

IIRC then 128kbps and lower means a 16kHz low-pass filter is applied, while it's not for higher bitrates. So in that case not just psychoacoustics.

theandrewbailey 8 months ago

I was under the impression that MP3 can't represent any frequency above 16kHz, no matter the bitrate.

Sonic656 8 months ago

MP3 is only good up to 18kHz at V0/320kbps, yet AAC & Vorbis can handle the full 22kHz at 256kbps VBR. With hotter audio (-15 ~ -5dB) their VBR models can just use 384 ~ 960kbps. Yes, Vorbis can actually reach 1.35Mbit since it has no frame cap.

gwbas1c 8 months ago

Then they would all be 32kHz, instead of 44.1kHz or 48kHz.

cladopa 8 months ago

There is a trick there. A sound can mask another sound. You will not be able to tell the difference with both sounds playing at the same time, but if you subtract them you can hear it because there is no masking.

I always loved to test the ears of my "Audiophile" friends. They will tell you how different MP3s are. You make a bet they can not differentiate them in 20 trials better than chance. I won with most people but some professional musicians that can identify little differences.
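
For a sense of what "better than chance" means over 20 trials, here is a minimal sketch of the one-sided binomial calculation, assuming each trial is an independent 50/50 guess. The function name is made up for this example.

```python
from math import comb

def p_value_at_least(correct, trials, chance=0.5):
    """Probability of getting `correct` or more right out of `trials` by pure guessing."""
    return sum(comb(trials, k) * chance**k * (1 - chance)**(trials - k)
               for k in range(correct, trials + 1))

TRIALS = 20
for correct in (10, 14, 15, 16):
    print(f"{correct}/{TRIALS}: p = {p_value_at_least(correct, TRIALS):.4f}")
```

Under these assumptions, 15 or more correct out of 20 is the first score that falls below the common 0.05 threshold (p ≈ 0.021), while 14 out of 20 (p ≈ 0.058) does not.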

maep 8 months ago

I spent half my professional career doing listening tests (MUSHRA and P.800), specifically on test items like Tom's Diner. 128 kbps MP3 is fairly easy to pick out, especially if you can compare it to the original. Double the bitrate and it's a real challenge.

Modern codecs like Opus are much more efficient. At high bitrates they are fully transparent and anybody who claims to be able to hear a difference is full of shit. Put them in a controlled setting and they fail every time.

anotherhue 8 months ago

LAME @192kb/s VBR was transparent 20 years ago. That said, FLAC is still a good choice because storage is cheap now and you don't want to have to deal with a copy-of-a-copy situation.

Some young folk think that 24bit/192kHz is the one true form and would consider a 16/44 FLAC a lossy encode, and then there are the vinyl folks. (I like vinyl, but not for the fidelity.)

Required reading: https://xiph.org/video/vid2.shtml

pimeys 8 months ago

I've found 128 kbps opus to be the best quality to stream my music when I'm not home. It is very fast to encode on the fly, and outside the house I mostly listen to music with either Bluetooth headphones or sometimes in a car, so playing something like flac would be a waste of bandwidth.

Maybe I'm old, but I do not hear a difference between 128 kbps opus and flac. I mainly use flac because it is an excellent archival format and you can encode it to different formats easily.

maep 8 months ago

Yea same with me. Unless you have a perfect setup this is good enough. Though if you want to do a little experiment, try Fatboy Slim's Kalifornia, the beginning is notorious for destroying transform based codecs.

kiririn 8 months ago

I think these kinds of blind listening tests are fundamentally flawed. For example in the graphics realm (games, video encoding, colour science, etc) all it takes is a momentary black screen between two comparison images to make it vastly more difficult to detect differences. Likewise side by side is also more difficult than swapping between two images instantly. Audio makes it impossible to do an instant swap; at best you’re getting the equivalent of a side-by-side comparison.

maep 8 months ago

If anything those tests make it easier to find subtle differences, which is good if transparency is the goal. I don't think that makes them fundamentally flawed. They are used throughout the industry, making results comparable.

Of course there are other ITU tests that work without hidden references, looping or even A/B comparison. They require a much bigger listener pool, are more expensive and take longer, thus used less often during development.

kiririn 8 months ago

Maybe not fundamentally flawed, but audio ABX testing is focused more towards short-term memory and opinion (especially in unskilled subjects) than I would like. I don't think there is any right answer to audio blind tests.

I'll trust actual validated limits of human perception such as 16/48 audio, 1~3dE colour, etc. And techniques used in video encoding like psnr, ssim, etc are also pretty well grounded in science. Also SINAD

But anything involving a human blindly comparing audio is into audiophile pseudoscience territory, no matter how large a cohort of people or how it is executed

maep 8 months ago

I can assure you that audio codec testing is a thorough science. Tools such as PSNR, PEAQ or POLQA all have limitations and cannot fully replace a human listener. Those familiar with the topic are often vocal critics of audiophile bullshit.

No, this is nowhere near pseudoscience, psychoacoustics is an established field of science.

mrob 8 months ago

Audio does not make it impossible to do an instant swap. Any good ABX tool lets you switch between test/reference samples with zero delay. Hear for yourself:

https://abx.digitalfeed.net/list.html

(you can press A, B, or X on the keyboard for instant switching)

kiririn 8 months ago

Ye that's what I meant when I said side-by-side. You can't use the same pattern matching that your eyes/brain do when an image instantly swaps, as audio is always temporally moving to be heard at all. Instantly swapping between two audio streams is no better than looking at two images side-by-side

Sonic656 8 months ago

I laughed when Reddit suddenly spent months claiming Spotify is garbage despite it using 320kbps Vorbis, and that the other 3 streaming platforms would dethrone it. I doubt any of them could tell AAC or Vorbis at 160 ~ 192kbps from FLAC. Hell, I doubt they could even tell 192kbit/s VBR LAME from FLAC, let alone the modern lossy codecs. lol

archi42 8 months ago

Regarding little differences: Sometimes(!) hits on the crash cymbals sound "wrong". Since compression is pretty good, most of the time it sounds fine though. Plus, in other instances that's just how it was recorded/sampled.

So it's not a good indicator for your test (and I wouldn't dare claim what your audiophile friends claimed). But in a few cases a "wrong"-sounding cymbal can actually be caused by compression.

But I sit down to listen to music as a hobby (and used to play drums, so I listen closely to them). And while I don't have audiophile-grade snake oil, I have both decent headphones and speakers; including the electronics.

mrob 8 months ago

>A sound can mask another sound.

Details:

https://en.wikipedia.org/wiki/Auditory_masking

Bernhard Seeber of the Audio Information Processing Group at the Technical University of Munich has some good demonstration videos on Youtube:

https://www.youtube.com/watch?v=R9UZnMsm9o8

https://www.youtube.com/watch?v=bU0_Kaj7cPk

Beijinger 8 months ago

" I won with most people but some professional musicians that can identify little differences."

Even this seems unlikely. I remember a test from the C't Computer magazine that has a very good reputation. And there were many professionals and as far as I remember, they were not able to tell the difference.

Fun fact: The only person who scored significantly was a person who loved punk music and had ear damage.

[EDIT] https://www-parkrocker-net.translate.goog/threads/komprimier...

JansjoFromIkea 8 months ago

What bitrates are you using? I see practically no difference between v0 and anything above that on most things but sub 192kbps it can be very evident. I feel like a lot of the FLAC people have a hardcoded bias from the Limewire days that's hard to shake off once you've got it, you're basically listening to FLAC for the assurance that you're not missing something (which is a fair reason to a point imo).

smolder 8 months ago

There's a particular song I have in 96/24 .flac that I can easily tell apart from a 192kbps mp3. Much harder with v0. It's specific to the content though, having some really soft background sounds layered on top of an already rich sound. It helps having fancy headphones, I'm sure. With many other recordings or just worse headphones I doubt I could tell.

BurpyDave 8 months ago

Ironically, the 'diff' is compressed anyway, because it's on Vimeo, so that's not the actual diff either!

taneq 8 months ago

This is like those articles on 10-bit colour where they show you "how vibrant and rich" the higher bit depth is, but you're reading it on the same old 8-bit-per-channel monitor.

jonathanstrange 8 months ago

So what. People listened to music on mechanical gramophones and enjoyed it. Too many audio engineers think it's all about the sound, when in the end it's about the music and the feelings it expresses.

lukan 8 months ago

"People listened to music on mechanical gramophones and enjoyed"

Hermann Hesse (or rather Harry Haller in Steppenwolf) used to get enraged by that - distorted garbage and a blasphemy to the godly composers. But then eventually he overcame it because of exactly that reasoning - if people enjoy it and are touched by the music - it works well enough. That is the purpose of music, not arbitrary perfectionism.

a-french-anon 8 months ago

You're naïve if you think that 1) Good enough prevents better from existing for some people, 2) Technological advances had the same consequences for all material. For example, orchestral music massively benefited from CD's increased dynamic range.

8bitsrule 8 months ago

Absolutely. Good music isn't about fidelity.

Lots of people in the 40s and 50s enjoyed listening to Frank and Elvis' songs broadcast by AM (mono) radio (10kHz bandwidth) and reproduced by one or two oval 2x6" speakers of their cars.

The 1939 Carnegie Benny Goodman concert was recorded with mikes and lathes cutting aluminum acetates with no better than 5 kHz fidelity. They were lost until the 50s, then released on an LP that sold over a million copies. It sounds great on Youtube; I was wondering what else sounds that good with a first-order roll-off at 5k.

sevensor 8 months ago

The way I think about recorded music is, the instrument being played is ultimately a set of speakers. That’s a musical experience to be encountered on its own terms. Live music is a wonderful thing, and presenting an illusion of live performance is certainly one thing a recording / reproduction system can do, but that doesn’t mean fidelity is the only thing we value when listening to a recording. Consider the punk music fans who enjoy a cheap stretched out cassette tape; I’m not going to deny them their enjoyment any more than I’ll deny the golden eared audiophile. They enjoy listening to different instruments.

And this brings me to my point: the engineer is no mere technician whose job is only fidelity. The engineer is part of the performance through speakers, an artist no less than the musicians being recorded.

mrob 8 months ago

Even the best stereo speakers have an inherent weakness: comb filtering between the channels.

You can easily test this yourself by generating some mono pink noise in Audacity (or your favorite audio editor) and playing it on stereo speakers. Move your head and you will hear changes in the sound like a flanger effect. This is most obvious if you are close to the speakers and in an acoustically dry environment. Compare with the same sound hard panned to a single speaker.

This is one reason why a real physical center channel improves clarity for movie dialogue.
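
A back-of-the-envelope sketch of that comb filter, assuming both speakers play the same mono signal at the same level and the only difference at the ear is a small extra path length to one of them. Room reflections and head shadowing are ignored, and the function names and the 10 cm figure are just for illustration.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value at room temperature

def comb_gain_db(freq_hz, path_difference_m):
    """Level of (direct + delayed copy) relative to a single speaker, in dB."""
    delay = path_difference_m / SPEED_OF_SOUND
    magnitude = abs(1 + cmath.exp(-2j * math.pi * freq_hz * delay))
    return 20 * math.log10(max(magnitude, 1e-12))  # clamp to avoid log(0)

def notch_frequencies(path_difference_m, limit_hz=20000.0):
    """Frequencies where the two arrivals are exactly out of phase (path difference > 0)."""
    delay = path_difference_m / SPEED_OF_SOUND
    notches = []
    k = 0
    while (2 * k + 1) / (2 * delay) <= limit_hz:
        notches.append((2 * k + 1) / (2 * delay))
        k += 1
    return notches

# Hypothetical listener sitting 10 cm closer to one speaker than the other.
print([round(f) for f in notch_frequencies(0.10)[:4]])
```

Moving your head changes the path difference, which slides every notch to a new frequency at once. That sweep is the flanger-like effect described above.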

Quarondeau 8 months ago

Interesting approach. So are we only able to hear those sounds now because the rest of the music was removed, which would ordinarily mask the missing sounds?

To say that the mp3-encoded version is not "what the artist recorded and wanted for us to hear" would imply that we can hear all sounds in the uncompressed recording.

varjag 8 months ago

Yes exactly this. On a high bitrate stream these losses are imperceptible (yes even by you golden ears out here) due to how auditory filters work in human hearing.

loa_in_ 8 months ago

That's assuming we understand human auditory filters AND human beings have uniform auditory filters all around

mrob 8 months ago

We don't need to understand those things to determine if the differences are audible. We can simply perform listening tests:

https://en.wikipedia.org/wiki/ABX_test

A properly conducted ABX test is the most favorable condition possible for detecting a difference. If you can't ABX it you can't hear it.

There's an ABX testing website with various lossy formats you could try:

https://abx.digitalfeed.net/list.html

However, failing to ABX those specific samples does not guarantee you are unable to tell the difference in all circumstances. There are some sounds that are unusually difficult to encode ("killer samples"). This is an especially big problem for MP3. The LAME project has a collection of killer samples for MP3:

https://lame.sourceforge.io/quality.php

More modern lossy formats are less susceptible to killer samples, but theoretically there could still be problematic cases.

skrebbel 8 months ago

You can test this. Get a bunch of audiophiles in a room, play various recordings from various media (24KHz wav, mp3 at good bitrates and encoding settings, bad mp3, ogg, CD, vinyl etc) without showing them which is which, ask which one they like the most and see if the results correlate with the supposed quality of the source.

I don't have a source ready but this has been a hot topic in audiophile land for decades and tldr is they'll pick out the really bad sources (eg <128kbps mp3) but not the rest. Basically the results look like those from a blind beer tasting test: no correlation between winner and supposed quality, except if the quality is especially bad.

I'm no scientist, but to me "audiophiles who really care about this stuff can't pick out the good MP3 from the uncompressed original" is sufficient proof that MP3 is, actually, based on a sufficiently well-understood model of human hearing.

reliablereason 8 months ago

You can try it yourself:

ffmpeg -i original.wav -codec:a libmp3lame -b:a 192k output.mp3 && \

ffmpeg -i output.mp3 decoded.wav && \

ffmpeg -i original.wav -i decoded.wav -filter_complex "[1:a]aresample=async=1,volume=-1.0[inverted];[0:a][inverted]amix=inputs=2:weights=1 1" difference.wav

schroffl 8 months ago

This produces an error in version 7.1 for me:

[AVFilterGraph @ 0x6000008b59d0] More input link labels specified for filter 'aeval' than it has inputs: 2 > 1

[AVFilterGraph @ 0x6000008b59d0] Error linking filters

Failed to set value '[0:a][1:a]aeval=val(0)-val(1):c=same' for option 'filter_complex': Invalid argument

Error parsing global options: Invalid argument

atoav 8 months ago

The theory behind it is simple: Subtract each audio sample in B from the corresponding audio sample in A.

You can do the same thing in your DAW¹ by putting A (e.g. the original) onto one channel and B (the processed sound) onto another. Then you invert the phase of B and listen to/export the sum.

This trick works also for audio gear that claims it does amazing things to your sound (here you just need to make sure to match the levels if they have been changed). Then you can look how much of the signal has truly been affected by your 1000 bucks silver speaker cable.

¹ Digital Audio Workstation, something as simple as Audacity should do the trick
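
A minimal sketch of that subtraction with the level matching built in, assuming both signals are already time-aligned lists of float samples of equal length. The function names are made up for this example, and the "gear" here is a stand-in that only changes the level.

```python
import math

def best_gain(reference, processed):
    """Least-squares gain g that minimises sum((reference - g*processed)**2)."""
    num = sum(r * p for r, p in zip(reference, processed))
    den = sum(p * p for p in processed)
    return num / den if den else 0.0

def rms_db(signal):
    """RMS level in dB, clamped so that digital silence does not hit log(0)."""
    rms = math.sqrt(sum(s * s for s in signal) / len(signal))
    return 20 * math.log10(max(rms, 1e-12))

def null_test(reference, processed):
    """Return (gain applied to processed, residual level relative to reference in dB)."""
    gain = best_gain(reference, processed)
    residual = [r - gain * p for r, p in zip(reference, processed)]
    return gain, rms_db(residual) - rms_db(reference)

# Illustrative data: the "gear" only turns the level down by about 3 dB.
reference = [math.sin(2 * math.pi * 440 * i / 44100) for i in range(4410)]
processed = [0.708 * s for s in reference]

gain, residual_db = null_test(reference, processed)
print(f"matching gain {gain:.3f}, residual {residual_db:.1f} dB relative to the original")
```

Without the gain step the same pair would leave a clearly audible residual, even though nothing but the volume changed.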

TheRealDunkirk 8 months ago

> Then you can look how much of the signal has truly been affected by your 1000 bucks silver speaker cable.

I have a friend who has spent ridiculous sums of money on audio gear. Like, he's in his 50's, and still lives with his parents (in part) because of it. Over the years, I've learned I will never convince him that he's being fleeced, but I've wanted to make a site to host such A/B comparisons for a very long time, to perhaps get through to others what a waste most of the "audiophile" gear is.

Redoubts 8 months ago

Have you ever checked out the https://hydrogenaud.io/index.php?action=forum forums?

atoav 8 months ago

Oh that sounds incredibly sad.

On your A/B comparison website: I think it is important to make a "blind" test the default. So they can listen to e.g. ten repetitions and vote for one each time, and in the end they get a score showing which one they liked better and by how much.

Because of course they want to hear the difference if there is an expensive price tag.

reliablereason 8 months ago

use this instead: ffmpeg -i original.wav -i decoded.wav -filter_complex "[1:a]aresample=async=1,volume=-1.0[inverted];[0:a][inverted]amix=inputs=2:weights=1 1" difference.wav

But honestly the only thing you get is something that subjectively sounds exactly the same, but lower volume. Probably due to the fact that subjective sound experience is more related to the Fourier transform of the waves than it is to the waves themselves.

gwbas1c 8 months ago

It's because mp3 dramatically changes phase. As a result, merely mixing the inverted original won't leave you with what's filtered out.

That technique will work with simpler compression techniques, like companding. (Companding is basically doing the digital equivalent of the old Dolby NR button from the cassette days.)
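
For reference, a small sketch of companding using the continuous mu-law curve. Real telephony codecs such as G.711 use a segmented approximation, so treat this as an illustration of the idea rather than any standard's exact encoder; the helper names are made up.

```python
import math

MU = 255  # the mu value associated with 8-bit mu-law telephony

def compress(x, mu=MU):
    """Map x in [-1, 1] to [-1, 1], stretching out small amplitudes."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)

def expand(y, mu=MU):
    """Inverse of compress()."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)

def quantize(v, bits):
    """Uniform quantiser on [-1, 1] with 2**bits - 1 levels (zero included)."""
    levels = (1 << (bits - 1)) - 1
    return round(v * levels) / levels

def roundtrip_linear(x, bits=8):
    return quantize(x, bits)

def roundtrip_companded(x, bits=8):
    return expand(quantize(compress(x), bits))

quiet = 0.002  # a sample about 54 dB below full scale
print("linear 8-bit error:   ", abs(roundtrip_linear(quiet) - quiet))
print("companded 8-bit error:", abs(roundtrip_companded(quiet) - quiet))
```

Because each sample is mapped independently and no phase is disturbed, the error stays small and sample-aligned, which is why the invert-and-mix trick still leaves something meaningful for this kind of compression.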

bla3 8 months ago

Note that the original project did more involved processing, as described on https://www.theghostinthemp3.com/theghostinthemp3.html:

"Using the python library headspace, and a reverb model of a small diner, I began to construct a virtual 3-d space. Beginning by fragmenting and scrambling the more transient material, I applied head related transfer functions to simulate the background conversation one might hear in a diner. Tracking the amplitude of the original melody in the verse, I applied a loose amplitude envelope to these signals. Thus, a remnant of the original vocal line comes through in its amplitude contour."

jonnycomputer 8 months ago

"What MaGuire has proved here is that the songs we listen to every single day are not the exact master copy that the artist recorded and wanted for us to hear. Instead, they are slightly stripped versions of their art run through a set of standards created by a bunch of engineers in 1993. For many people, that won’t matter. The songs sound almost the same, but the compression of music into an MP3 format is an important question to weigh when considering artistic intent and analyzing songs that aren’t exactly the original."

I feel like this analysis isn't well grounded in what artists and sound engineers actually do, or how they think.

Optimal_Persona 8 months ago

I honestly didn't know that was a matter of contention at this point...it's very common for pro (or smart amateur) audio engineers to actually listen to the effect of different digital compression algorithms and adapt accordingly. Things like Sonnox Fraunhofer plugin let you do this. https://sonnox.com/products/oxford-fraunhofer-pro-codec

- In this month's TapeOp magazine, Jeff Jones mentions he always monitors through such a codec so he can adjust accordingly for the best sound regardless of final format https://tapeop.com/interviews/163/jeff-jones/

- IMO Rush's Moving Pictures is one of the best (and best-sounding) albums from the '80s despite the fact that it was mastered to a Sony digital device that's renowned as terrible sounding, with only 14 bits of usable resolution. How/Why? The production team monitored through the Sony system while mastering and made tweaks as they went to account for its limitations.

- I've had my own music professionally mastered at high resolution, and discovered that converting hi-res to MP3 (even at highest bitrate) caused digital peaks over 0.0 dB, which I fixed by normalizing to -0.3 dB.
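
A minimal sketch of the peak normalization mentioned in the last bullet, assuming floating-point samples where 1.0 is full scale. It looks at sample peaks only, not inter-sample ("true") peaks, and the function names are made up for this example.

```python
import math

def db_to_linear(db):
    """Convert a level in dB to a linear amplitude factor."""
    return 10 ** (db / 20)

def peak_dbfs(samples):
    """Peak level in dB relative to full scale (1.0)."""
    peak = max(abs(s) for s in samples)
    return 20 * math.log10(peak) if peak > 0 else float("-inf")

def normalize_peak(samples, target_dbfs=-0.3):
    """Scale the whole signal so its largest sample sits at target_dbfs."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return list(samples)
    gain = db_to_linear(target_dbfs) / peak
    return [s * gain for s in samples]

# Illustrative decoded signal whose peaks overshoot full scale by about 0.4 dB.
decoded = [1.047 * math.sin(2 * math.pi * i / 100) for i in range(1000)]
fixed = normalize_peak(decoded)
print(f"before: {peak_dbfs(decoded):+.2f} dBFS, after: {peak_dbfs(fixed):+.2f} dBFS")
```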

TBH if the resulting file is going to be transmitted over Bluetooth it's going to be further degraded. And yet people can still make a strong emotional connection to the underlying music and artist...

chasil 8 months ago

When I've used lame --preset extreme, it uses variable bit rate which was developed some time later.

NoPicklez 8 months ago

Fairly lackluster article.

Not all .mp3's are created equally and can vary in how lossy they are based on the bitrate.

If you care enough to want to hear exactly what the artist wants you to hear, you just listen to the lossless version.

sumtechguy 8 months ago

> Not all .mp3's are created equally

No kidding. There are a decent number of options to pick for quality of mp3. VBR/CBR, bitrate, joint/stereo/mono, and so on. I personally just pick something that sounds fairly close to the original. But that only really matters for me when it is side by side. Give me a few days between picking and I can not really tell anymore.

flanked-evergl 8 months ago

The "lossless" version is in fact not entirely lossless. There is much that gets lost in the process of digitization, and even if there is no digitization happening the analogue equipment has some frequency response range.

vinhcognito 8 months ago

Maybe theoretically but for our purposes doesn't the Nyquist–Shannon sampling theorem mean it's essentially lossless to our ears?

kazinator 8 months ago

> You can hear so many unnecessarily rejected sounds.

That accusation requires evidence based in psychoacoustics. Just because you can hear it in isolation doesn't mean you can hear it if it is added back to the host audio.

For instance when some quiet sound that is masked by immediately preceding loud sound is removed, of course you can hear that quiet sound in isolation! Your hearing has something like 120 decibel dynamic range, or better.

You can hear differences in the compressed audio. Nobody can claim that there's no degradation in quality. Artifacts are obvious. Much more so at lower bit rates, though. MP3 starts to sound quite good around 192 kbps.

The removal of those components is necessary. It is necessary to the algorithm so that it can achieve compression.

Also there's this issue. Suppose we take a signal and apply some modest EQ to it. Say we boost the bass and treble and cut the mids a little bit. Or any other EQ profile. If we then level match the two signals and subtract one from the other, there will be a difference: some aspects of the original material will be recognizably heard. For instance the difference between a slightly treble cut signal and the original will be the treble. But the treble was not completely cut from the original. What you're hearing in the difference is not something that was entirely removed.
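
A toy version of that EQ argument, assuming an idealized "EQ" that simply turns one component down by 3 dB without touching phase (real filters also shift phase). The two-tone signal and all names are illustrative.

```python
import math

SAMPLE_RATE = 44100
N = 4410  # 0.1 s, so both test tones fit a whole number of cycles

def sine(freq, amplitude):
    return [amplitude * math.sin(2 * math.pi * freq * i / SAMPLE_RATE) for i in range(N)]

def amplitude_at(signal, freq):
    """Amplitude of one frequency component, via correlation with sine and cosine."""
    s = sum(x * math.sin(2 * math.pi * freq * i / SAMPLE_RATE) for i, x in enumerate(signal))
    c = sum(x * math.cos(2 * math.pi * freq * i / SAMPLE_RATE) for i, x in enumerate(signal))
    return 2 * math.hypot(s, c) / len(signal)

BASS, TREBLE = 100, 8000          # Hz, two illustrative components
TREBLE_GAIN = 10 ** (-3 / 20)     # a modest 3 dB treble cut

original = [b + t for b, t in zip(sine(BASS, 1.0), sine(TREBLE, 0.5))]
equalized = [b + t for b, t in zip(sine(BASS, 1.0), sine(TREBLE, 0.5 * TREBLE_GAIN))]
difference = [o - e for o, e in zip(original, equalized)]

print("treble in original:  ", round(amplitude_at(original, TREBLE), 3))
print("treble in equalized: ", round(amplitude_at(equalized, TREBLE), 3))
print("treble in difference:", round(amplitude_at(difference, TREBLE), 3))
print("bass in difference:  ", round(amplitude_at(difference, BASS), 3))
```

The difference track is pure treble, yet most of the treble is still present in the equalized signal, so hearing something in a difference track does not mean it was entirely removed.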

CGamesPlay 8 months ago

Found the original author's page about the project (no longer on the internet): https://web.archive.org/web/20211011015410/http://ryanmaguir...

One interesting thing to note: this is a composition, not an analysis. It's not fully documented exactly what modifications to the "raw data" were made.

roelschroeven 8 months ago

I can open the original link, https://www.theghostinthemp3.com/theghostinthemp3.html, just fine. Maybe it's reinstated? Or there was a temporary problem?

Agraillo 8 months ago

Why Tom's Diner? Because it is a cappella. There's a book "How Music Got Free" by Stephen Witt [1] detailing the history of the MP3 format and related events. It is a very good read and there's an explanation:

Increases in processing power spurred progress. Within a year Brandenburg’s algorithm was handling a wide variety of recorded music... But one audio source was proving intractable: what Grill, with his imperfect command of English, called “the lonely voice.” (He meant “lone.”) Human speech could not, in isolation, be psychoacoustically masked. Nor could you use Huffman’s pattern recognition approach—the essence of speech was its dynamic nature, its plosives and sibilants and glottal stops. Brandenburg’s shrinking algorithm could handle symphonies, guitar solos, cannons, even “Oye Mi Canto,” but it still couldn’t handle a newscast. Stuck, Brandenburg isolated samples of “lonely” voices. The first was a recording of a difficult German dialect that had plagued audio engineers for years. The second was a snippet of Suzanne Vega singing the opening bars of “Tom’s Diner,” her 1987 radio hit.

[1] https://en.wikipedia.org/wiki/How_Music_Got_Free

sdk77 8 months ago

Very interesting! The audio of Tom's Diner rejected by the encoding sounds mesmerizing to me. I still find it to be musical - it reminds me of a record I bought a really long time ago. It was called Modulation & Transformation on Mille Plateaux, a collection of songs in the abstract and experimental genre.

pvillano 8 months ago

Two instances where lossy compression failed for me are the movie Koyaanisqatsi and songs by the artist TOBACCO. Koyaanisqatsi has a lot of film grain and TOBACCO uses a lot of distortion. There is noise in there, but it's very deeply mixed into the signal.

0points 8 months ago

This is why we don't encode MP3 at 96kbps or whatever.

moomin 8 months ago

I think the thing that really sticks out is that the breath noises are gone, which is one of the things that gives the track its character. Willing to bet the same kind of thing happens to fret noise as well.

Fade_Dance 8 months ago

My headphone rig was always optimized for a nice guitar sound, and I would second that. The sound of fingers on the string and the "pluckiness" of the guitar is what gets lost.

Beyond that, the specific thing that I noticed gets lost is bass character on some tracks. Ex: Some drum and bass tracks just don't hit at low bitrate. This aspect sometimes feeds into the low guitar strings though, where they might have a bit less body.

Lastly is of course sound staging, but that's something that a headphone setup is very sensitive to.

As for quality differences, I basically fall in line with the consensus on this thread. FLAC and 320 are indistinguishable. 192kbps is almost always indistinguishable and good enough, although there are some situations where it might be slightly noticeable. 128 is pretty easy to tell apart with a good setup.

There is also the rare track with amazing production and or very cool stuff happening somewhere in the spectrum, so I don't entirely write off someone wanting a plus one on the take above. "I absolutely love this jazz album. It's been a large part of my musical journey as a human, I can a/b test this at 320, but for this album, I really want it lossless." I can respect that. I got to know some of my test tracks pretty well (ex Black Sun Empire & Arrakis for bass), and while 192 was fine for other tracks, I wanted 320 or lossless on those.

Optimal_Persona 8 months ago

As a musician recording ideas on a handheld Zoom recorder so I don't forget them, I switched from WAV to 320 MP3 a few years ago to save on file space and don't regret it.

no-such-address 8 months ago

Funny article.

"The exact master copy that the artist recorded and wanted for us to hear" In the digital era, does that even, uniquely, exist?

"a set of standards created by a bunch of engineers in 1993" Nice!

Was hoping the article would mention double-blind studies about the ability to perceive differences in quality between various audio file formats, which are available elsewhere. Those are interesting, though not as overwrought as the reporting in this article.

chasil 8 months ago

In fact, Glenn Gould's final recording of Bach's Goldberg Variations was rereleased from the analog tape recording, as the sound quality was better than digital of the time (1981).

https://en.wikipedia.org/wiki/Bach:_The_Goldberg_Variations_...

HPsquared 8 months ago

This could also be done on visual compression with JPEGs.

Or on video compression, for that matter.

It just shows though that these diffs are invisible to a human - by design.

amelius 8 months ago

Animals and aliens might cringe at these images and sounds, though.

ps, you could do the same thing with watermarked content.

chrsgrrtt 8 months ago

I developed a streaming service many years ago; Dolby wanted us to use their codec for the audio, and they used a track just like this as the primary basis of their sales pitch. Was quite impressive at the time.

ezconnect 8 months ago

When I first experienced CD audio it was too high-pitched compared to the tape versions. MP3 came along and each song sounded different depending on the MP3 compression settings.

jampekka 8 months ago

Some CDs were mastered to have pre-emphasis that boosted high frequencies if the player didn't properly account for it. This caused them to sound "high pitched".

https://wiki.hydrogenaud.io/index.php?title=Pre-emphasis

Timwi 8 months ago

I would have liked a comparison with Ogg and perhaps other formats. I hear a lot about MP3 throwing away a lot more than Ogg but I'd love to see real data on it.

stavros 8 months ago

I don't think the claim is that MP3 throws a lot more away, it's that Vorbis sounds better. The two are very different, Vorbis might be throwing a lot more away, but if it's only the stuff you can't hear, then it'll sound better.

arch1t3cht 8 months ago

ogg is a container format, not a coding format.
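
To make the container/codec distinction concrete, here is a minimal sketch that looks inside the first Ogg page and reports which codec's identification header it carries. The function name `codec_in_ogg` and the lookup table are hypothetical, and the code is deliberately simplified: it only checks the first page of the data and ignores multiplexed streams and checksums.

```python
# Magic bytes that start the first packet of some codecs commonly carried in Ogg.
KNOWN_CODEC_MAGIC = {
    b"\x01vorbis": "Vorbis",
    b"OpusHead": "Opus",
    b"\x7fFLAC": "FLAC",
    b"\x80theora": "Theora",
}

OGG_FIXED_HEADER_SIZE = 27  # bytes before the segment table in an Ogg page


def codec_in_ogg(data: bytes) -> str:
    """Guess the codec carried by the first Ogg page in `data` (simplified)."""
    if len(data) < OGG_FIXED_HEADER_SIZE or data[:4] != b"OggS":
        raise ValueError("not an Ogg stream")
    n_segments = data[26]  # last byte of the fixed page header
    payload_start = OGG_FIXED_HEADER_SIZE + n_segments  # skip the segment table
    payload = data[payload_start:payload_start + 8]
    for magic, name in KNOWN_CODEC_MAGIC.items():
        if payload.startswith(magic):
            return name
    return "unknown"
```

Usage would look like `codec_in_ogg(open("song.ogg", "rb").read(4096))`, where `song.ogg` is a placeholder file name. The same `.ogg` wrapper can report Vorbis for one file and Opus for another.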

wizzwizz4 8 months ago

Vorbis, then.

Traubenfuchs 8 months ago

...I think this person just created a new genre of music. Something like: "What's lost noise."

I immensely enjoyed listening to the "lost material" of Tom's Diner and would like to hear more of this!

Maybe one could diff with a lower quality version, one where more has been cut away, more is lost/left over? There are so many possibilities!

JKCalhoun 8 months ago

Lost-Fi.

Traubenfuchs 8 months ago

That's the Zeitgeist.

ipunchghosts 8 months ago

There are still audio motifs in there that can be further optimized out.

If the remaining audio were noise-like, I would say we reached the compression limit.
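
One rough proxy for "noise-like" is spectral flatness: the geometric mean of the power spectrum divided by its arithmetic mean, which sits near 0 for tonal material and is much higher for white noise. The sketch below uses a naive DFT on short synthetic signals. The function names are hypothetical, and this says nothing about how an MP3 encoder actually decides what to discard.

```python
import cmath
import math
import random


def power_spectrum(signal):
    """Naive O(n^2) DFT power for bins 1 .. n/2 - 1 (DC and Nyquist skipped)."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(signal))) ** 2
        for k in range(1, n // 2)
    ]


def spectral_flatness(signal):
    """Geometric mean / arithmetic mean of the power spectrum, in (0, 1]."""
    power = [p + 1e-12 for p in power_spectrum(signal)]  # floor avoids log(0)
    geometric = math.exp(sum(math.log(p) for p in power) / len(power))
    arithmetic = sum(power) / len(power)
    return geometric / arithmetic


N = 512
tone = [math.sin(2 * math.pi * 16 * t / N) for t in range(N)]  # one pure motif
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(N)]  # white noise

print(f"tone flatness:  {spectral_flatness(tone):.3g}")
print(f"noise flatness: {spectral_flatness(noise):.3g}")
```

Run on a real difference track, a low flatness would suggest there is still structured, motif-like material left in the residual.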

Klaster_1 8 months ago

The article doesn't mention at what bit-rate the difference track was made. Does anyone know? Seems disingenuous and pro-"authentic" otherwise.

Sesse__ 8 months ago

It also doesn't really mention how this “lost” material was identified. If you just subtract the encoded from the original, then any phase difference will make it sound like material “disappeared”, while in reality, it just came very slightly earlier or later.
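
A minimal sketch of that effect, using synthetic sine tones rather than real MP3 output: the "encoded" signal here is just the original delayed by one sample, so nothing is removed, yet naive subtraction still leaves a residual. The helper names (`sine`, `rms`, `residual_db`) and the one-sample delay are assumptions chosen for illustration, not how the article's difference track was made.

```python
import math

SAMPLE_RATE = 44_100  # CD sample rate in Hz


def sine(freq_hz, n_samples, delay_samples=0.0):
    """Unit-amplitude sine, optionally delayed by a (fractional) sample count."""
    return [
        math.sin(2 * math.pi * freq_hz * (n - delay_samples) / SAMPLE_RATE)
        for n in range(n_samples)
    ]


def rms(signal):
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def residual_db(freq_hz, delay_samples, n_samples=SAMPLE_RATE):
    """Level of (original - delayed copy) relative to the original, in dB."""
    original = sine(freq_hz, n_samples)
    shifted = sine(freq_hz, n_samples, delay_samples)
    diff = [a - b for a, b in zip(original, shifted)]
    if rms(diff) == 0.0:
        return float("-inf")  # perfect cancellation
    return 20 * math.log10(rms(diff) / rms(original))


for freq in (100, 1_000, 10_000):
    print(f"{freq:>6} Hz: residual {residual_db(freq, 1.0):6.1f} dB vs. original")
```

With a delay of a single sample, the residual is small at 100 Hz but comes out louder than the tone itself at 10 kHz, so a plain difference track overstates high-frequency "loss" whenever the two signals are not perfectly aligned.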

rob74 8 months ago

I guess it was exactly as you write - but instead of slightly earlier or later, the "lost" sounds are the high frequencies (a lot of hissing, clicking etc.) - the actual sound is mostly still there, but slightly "muffled" because it contains only the lower-frequency components.

etwas 8 months ago

It's 128 kbps. The useful and informative article is mentioned at the end: http://theghostinthemp3.com/theghostinthemp3.html

rob74 8 months ago

If it's really the original MP3 version of Tom's Diner produced by the Fraunhofer engineers, probably not a very high bitrate. Aside from that, I would say that even back in 2015, MP3 was already on its way out and being replaced by better (while still lossy) compression methods?

porbelm 8 months ago

With space not much of an issue anymore, FLAC is pretty much the default nowadays. Opus, AAC and others have better encoding than old MP3, but I guess a 128 kbps MP3 encoded with the original Fraunhofer l3enc (the best back then) and one encoded with LAME will be different, and the LAME version will be "better" because of improvements in psychoacoustics? At least I remember l3enc being MUCH better than anything else at 128 kbps (Xing lol, cymbal washing anyone?) before LAME came along.

klez 8 months ago

> With space not much of an issue anymore, FLAC is pretty much the default nowadays

The default for what? Space is not the only consideration. What about bandwidth?

I'm pretty sure Spotify, Deezer and the others are not transmitting FLACs, especially not at the base quality level.

kazinator 8 months ago

The funny capitalization of moDernisT instantly gives away that it is an anagram of Tom's Diner.

grishka 8 months ago

Now I want this comparison for Opus. It doesn't do that whole psychoacoustics thing, does it? But it also somehow manages to ~double the compression ratio compared to MP3 without any noticeable difference in the sound quality.

bjoli 8 months ago

I haven't gotten into reading about audio and video compression yet, but from where I am standing now it really looks like magic.