This article could at least have a paragraph explaining (in a dumbed-down way) why various kinds of psychoacoustic masking (temporal, frequency) make what's removed almost inaudible anyway. Reading the linked source (https://www.theghostinthemp3.com/theghostinthemp3.html), he at least used LAME, but at a fixed 128 kbps bitrate, not in VBR mode =(
EDIT: nerds should read about sfb21 (https://wiki.hydrogenaud.io/index.php?title=LAME_Y_switch). AAC, Vorbis and Opus (CELT) aren't just theoretical improvements.
IIRC, at 128 kbps and lower a 16 kHz low-pass filter is applied, while at higher bitrates it isn't. So in that case it's not just psychoacoustics.
You can try it yourself:
ffmpeg -i original.wav -codec:a libmp3lame -b:a 192k output.mp3 && \
ffmpeg -i output.mp3 decoded.wav && \
ffmpeg -i original.wav -i decoded.wav -filter_complex "[0:a][1:a]aeval=val(0)-val(1):c=same" difference.wav
This produces an error in version 7.1 for me:
[AVFilterGraph @ 0x6000008b59d0] More input link labels specified for filter 'aeval' than it has inputs: 2 > 1
[AVFilterGraph @ 0x6000008b59d0] Error linking filters
Failed to set value '[0:a][1:a]aeval=val(0)-val(1):c=same' for option 'filter_complex': Invalid argument
Error parsing global options: Invalid argument
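If the filter graph keeps refusing to cooperate, the same subtraction can be done outside ffmpeg. Below is a minimal sketch in plain Python, assuming both files are 16-bit PCM WAV with the same sample rate and channel count. The function names (read_pcm16, write_pcm16, subtract, write_difference) are made up for this example. It also assumes the two files are already sample-aligned; depending on the encoder and decoder, the decoded file can start slightly late, and then the difference will be much louder than what was actually discarded.

```python
import array
import sys
import wave


def read_pcm16(path):
    """Return (channels, sample_rate, samples) for a 16-bit PCM WAV file."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        channels, rate = wav.getnchannels(), wav.getframerate()
        samples = array.array("h")
        samples.frombytes(wav.readframes(wav.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    return channels, rate, samples


def write_pcm16(path, channels, rate, samples):
    """Write int16 samples (interleaved if stereo) as a WAV file."""
    data = array.array("h", samples)
    if sys.byteorder == "big":
        data.byteswap()
    with wave.open(path, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(data.tobytes())


def subtract(original, decoded):
    """Sample-wise original - decoded, clipped to the int16 range."""
    n = min(len(original), len(decoded))  # ignore any trailing padding
    return [max(-32768, min(32767, original[i] - decoded[i])) for i in range(n)]


def write_difference(original_path, decoded_path, out_path):
    """Write original - decoded to out_path (assumes matching formats)."""
    channels, rate, original = read_pcm16(original_path)
    other_channels, other_rate, decoded = read_pcm16(decoded_path)
    if (channels, rate) != (other_channels, other_rate):
        raise ValueError("channel count and sample rate must match")
    write_pcm16(out_path, channels, rate, subtract(original, decoded))
```

With the file names from the command above, the call would be write_difference("original.wav", "decoded.wav", "difference.wav").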
Ironically, the 'diff' is compressed anyway, because it's on Vimeo, so that's not the actual diff either!
Interesting approach. So are we only able to hear those sounds now because the rest of the music was removed, which would ordinarily mask the missing sounds?
To say that the mp3-encoded version is not "what the artist recorded and wanted for us to hear" would imply that we can hear all sounds in the uncompressed recording.
Yes exactly this. On a high bitrate stream these losses are imperceptible (yes even by you golden ears out here) due to how auditory filters work in human hearing.
That's assuming we understand human auditory filters AND that all human beings have the same auditory filters.
Fairly lackluster article.
Not all MP3s are created equal; how lossy they are varies with the bitrate.
If you care enough to want to hear exactly what the artist wants you to hear, you just listen to the lossless version.
So what. People listened to music on mechanical gramophones and enjoyed it. Too many audio engineers think it's all about the sound, when in the end it's about the music and the feelings it expresses.
"People listened to music on mechanical gramophones and enjoyed"
Hermann Hesse (or rather Harry Haller in Steppenwolf) used to get enraged by that - distorted garbage and a blasphemy against the godly composers. But then eventually he overcame it because of exactly that reasoning - if people enjoy it and are touched by the music - it works well enough. That is the purpose of music, not arbitrary perfectionism.
You're naïve if you think that 1) Good enough prevents better from existing for some people, 2) Technological advances had the same consequences for all material. For example, orchestral music massively benefited from CD's increased dynamic range.
Found the original author's page about the project (no longer on the internet): https://web.archive.org/web/20211011015410/http://ryanmaguir...
One interesting thing to note: this is a composition, not an analysis. It's not fully documented exactly what modifications to the "raw data" were made.
I can open the original link, https://www.theghostinthemp3.com/theghostinthemp3.html, just fine. Maybe it's reinstated? Or there was a temporary problem?
I would have liked a comparison with Ogg and perhaps other formats. I hear a lot about MP3 throwing away a lot more than Ogg but I'd love to see real data on it.
This is why we don't encode MP3 at 96 kbps or whatever.
When I first experienced CD audio, it sounded too high-pitched compared to the tape versions. Then MP3 came along and each song sounded different depending on the MP3 compression settings.
Some CDs were mastered with pre-emphasis, which boosts high frequencies. If the player didn't properly account for it by applying de-emphasis, they sounded "high pitched".
https://wiki.hydrogenaud.io/index.php?title=Pre-emphasis
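To put a rough number on that boost: as far as I know, CD pre-emphasis is a first-order shelf with time constants of 50 µs and 15 µs. The sketch below assumes those two constants and computes how much louder a given frequency plays if de-emphasis is skipped. The function name preemphasis_gain_db is just for this example.

```python
import math

# Assumed CD pre-emphasis time constants (seconds).
TAU_BOOST = 50e-6    # where the treble boost starts (about 3.18 kHz)
TAU_ROLLOFF = 15e-6  # where the boost levels off (about 10.6 kHz)


def preemphasis_gain_db(freq_hz):
    """Gain in dB of the first-order pre-emphasis shelf at freq_hz."""
    w = 2 * math.pi * freq_hz
    gain = math.sqrt((1 + (w * TAU_BOOST) ** 2) / (1 + (w * TAU_ROLLOFF) ** 2))
    return 20 * math.log10(gain)


for f in (100, 1000, 5000, 10000, 20000):
    print(f"{f:>6} Hz: {preemphasis_gain_db(f):+.1f} dB")
```

Under these assumptions the boost is negligible in the bass and approaches 10 dB at the top of the audible range, which would fit the "too bright" impression on a player that ignores the flag.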
The article doesn't mention at what bitrate the difference track was made. Does anyone know? Seems disingenuous and pro-"authentic" otherwise.
It also doesn't really mention how this “lost” material was identified. If you just subtract the encoded from the original, then any phase difference will make it sound like material “disappeared”, while in reality, it just came very slightly earlier or later.
I guess it was exactly as you write - but instead of slightly earlier or later, the "lost" sounds are the high frequencies (a lot of hissing, clicking etc.) - the actual sound is mostly still there, but slightly "muffled" because it contains only the lower-frequency components.
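The timing point is easy to demonstrate with a toy signal. The sketch below is only an illustration, not how the project did it: it uses a synthetic 3 kHz tone instead of real music, delays a copy by 5 samples as a stand-in for a decoder offset, and compares the naive difference with the difference after aligning by brute-force cross-correlation. The helper names energy, residual and best_lag are made up for this example.

```python
import math


def energy(samples):
    """Sum of squared sample values."""
    return sum(s * s for s in samples)


def residual(original, decoded, lag):
    """original - decoded, after shifting decoded earlier by `lag` samples."""
    n = min(len(original), len(decoded) - lag)
    return [original[i] - decoded[i + lag] for i in range(n)]


def best_lag(original, decoded, max_lag):
    """Lag in [0, max_lag] that maximizes the cross-correlation."""
    def correlation(lag):
        n = min(len(original), len(decoded) - lag)
        return sum(original[i] * decoded[i + lag] for i in range(n))
    return max(range(max_lag + 1), key=correlation)


# Synthetic stand-in for a recording: a 3 kHz tone sampled at 44.1 kHz.
rate = 44100
tone = [math.sin(2 * math.pi * 3000 * i / rate) for i in range(2000)]

# Hypothetical "decoded" copy: identical content, arriving 5 samples late.
delay = 5
delayed = [0.0] * delay + tone

naive = energy(residual(tone, delayed, 0))
lag = best_lag(tone, delayed, max_lag=10)
aligned = energy(residual(tone, delayed, lag))

print(f"signal energy:           {energy(tone):.1f}")
print(f"naive difference energy: {naive:.1f}")
print(f"estimated lag:           {lag} samples")
print(f"aligned difference:      {aligned:.1f}")
```

Nothing was removed from the delayed copy, yet the naive difference carries more energy than the tone itself, and it vanishes once the copies are lined up. A single lag only fixes a constant delay; phase shifts that differ per frequency would still leave a residual.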
If it's really the original MP3 version of Tom's Diner produced by the Fraunhofer engineers, probably not a very high bitrate. Aside from that, I would say that even back in 2015, MP3 was already on its way out and being replaced by better (while still lossy) compression methods?
With space not much of an issue anymore, FLAC is pretty much the default nowadays. Opus, AAC and others encode better than old MP3, but I guess a 128 kbps MP3 encoded with the original Fraunhofer l3enc (the best back then) and one encoded with LAME will be different, and the LAME version will be "better" because of improvements in psychoacoustics? At least I remember l3enc being MUCH better than anything else at 128 kbps (Xing lol, cymbal washing anyone?) before LAME came along.
> With space not much of an issue anymore, FLAC is pretty much the default nowadays
The default for what? Space is not the only consideration. What about bandwidth?
I'm pretty sure Spotify, Deezer and the others are not transmitting FLACs, especially not at the base quality level.
It's 128 kbps. The useful and informative article is mentioned at the end: http://theghostinthemp3.com/theghostinthemp3.html
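For a sense of scale on the bandwidth side, here is some back-of-the-envelope arithmetic. It compares uncompressed CD audio (44.1 kHz, 16-bit, stereo) with two common MP3 bitrates for a hypothetical four-minute track. FLAC is left out on purpose because its bitrate depends on the material. The function names are made up for this example.

```python
def pcm_bitrate(sample_rate=44100, bits=16, channels=2):
    """Bits per second of uncompressed PCM audio (defaults: CD audio)."""
    return sample_rate * bits * channels


def megabytes(bitrate_bps, seconds):
    """Size in MB (10^6 bytes) of a stream at a constant bitrate."""
    return bitrate_bps * seconds / 8 / 1_000_000


cd = pcm_bitrate()
track_seconds = 4 * 60  # hypothetical track length

for label, bps in (("CD PCM", cd), ("MP3 320k", 320_000), ("MP3 128k", 128_000)):
    size = megabytes(bps, track_seconds)
    print(f"{label:>8}: {bps / 1000:7.1f} kbps, {size:5.1f} MB, {cd / bps:4.1f}x smaller than CD")
```

So a 128 kbps stream is roughly an eleventh of the raw CD data rate, which matters a lot more when multiplied by millions of listeners than it does on a local disk.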
...I think this person just created a new genre of music. Something like: "What's lost noise."
I immensely enjoyed listening to the "lost material" of Tom's Diner and would like to hear more of this!
Maybe one could diff with a lower quality version, one where more has been cut away, more is lost/left over? There are so many possibilities!