In the past few weeks, I’ve had conversations with intelligent, scientifically minded individuals who believe in 24/192 downloads and want to know how anyone could possibly disagree. They asked good questions that deserve detailed answers.
I was also interested in what motivated high-rate digital audio advocacy. Responses indicate that few people understand basic signal theory or the sampling theorem, which is hardly surprising. Misunderstandings of the mathematics, technology, and physiology arose in most of the conversations, often asserted by professionals who otherwise possessed significant audio expertise. Some even argued that the sampling theorem doesn’t really explain how digital audio actually works.
Misinformation and superstition only serve charlatans. So, let’s cover some of the basics of why 24/192 distribution makes no sense before suggesting some improvements that actually do.
The article is from 2012, but it’s still an interesting insight into these highly technical matters that few of us understand properly.
“[…] these highly technical matters few of us understand properly” -> It is far worse: music is one of these highly technical matters that some believe they understand properly! There is so much snake oil and self-deception in the “audiophile” segment that it is boring, even for a jaded, scornful cynic like me.
Sorry, but 24/192 to listen to Nicki Minaj or Justin Bieber *IS* overrated ! 8/64 is enough…
And 1/8 for any modern R&B
The sad part is that the whole thing came from a basic misunderstanding that arose in the late 80s.
You see, in the late 80s there started being this thing called “master tape releases”, which was the studios getting folks to buy the same albums yet again by going back to the original master to make the disc. Considering that the original mixdowns they were using were from the 70s and even the 60s, when the gear was all analog and every playback caused degradation, the master tape series sounded better than the earlier copies, which had been made from inferior analog master mixdowns that had frankly been worn out by that time.
So where did the misunderstanding occur? Simple: people who didn’t understand basic math and the difference between digital and analog (not meant as an insult; it’s not easy to grasp that just because both start and end with audio doesn’t mean they aren’t as different as a sandwich and a rickshaw) thought “if they use it in the studio, then, like the master series, having it in the home would be better.” But with digital that just isn’t the case. You need 24/192 in the studio so that when you start piling on the effects you have plenty of headroom. Nobody listens to 24/192 in the studio for pleasure; it’s strictly about mixing and headroom.
The reason I know this is that I’ve had the pleasure of being in a million-dollar studio with an engineer who trained at Juilliard. Sadly, he threw more audio theory at me than my poor brain could handle, but he did let me hear what a final mix sounds like at both standard CD quality and 24/192, and even on $10k speakers the 192 SOUNDS WORSE. The reason is quite simple: the 192 has headroom above what the human ear can hear, because some instruments like cymbals can have transients that reach above human hearing, and you want to capture those so that when it’s brought down to normal CD audio it sounds more natural. I apologize if I’m describing this wrong or getting a detail wrong here or there because, like I said, he was several skill levels above me, but he said even his $10k studio speakers simply aren’t able to handle those ultra-high and ultra-low notes, so what you end up with is distortion or hum.
But at the end of the day, what matters is that I got to hear original digital masters on gear more expensive than the best audiophile gear and got to ABX 24-bit against 16-bit, and the ones sitting down doing the math are correct: 24-bit makes sense from a recording standpoint but NOT from a listening standpoint. It just doesn’t sound as good.
I think those guys are funny. They’re like gamers who think they can see more than 60FPS on their 60Hz monitors.
I have never seen those comments. Are you sure they don’t mean that when your GPU does 120fps it is almost impossible to dip below 60fps?
The purpose of running games at >60fps is not to see more than 60 frames per second. The goal is to see the most up-to-date frame. It’s about latency, not throughput.
If you limit the game to 60fps, there is a delay of 0-17ms before a frame that has already been rendered is sent to the screen. By running at higher framerates, you get screen tearing, and you waste GPU cycles calculating pixels that are never displayed, but the pixels that do make it to the screen are a few milliseconds more up-to-date.
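The latency point above is easy to put numbers on. A quick back-of-the-envelope sketch (plain Python; the numbers are just frame-interval arithmetic, not measurements from any particular game):

```python
# Worst-case staleness of a displayed frame is roughly one frame interval:
# a frame rendered just after a refresh waits a full interval before display.
for fps in (60, 120, 240):
    frame_ms = 1000.0 / fps
    print(f"{fps} fps: frame interval {frame_ms:.1f} ms, "
          f"worst-case added latency ~{frame_ms:.1f} ms")
```

At 60fps the interval is about 16.7 ms (the "0-17ms" above); at 240fps it drops to about 4.2 ms, even though no extra frames are ever seen.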
I’m obviously not talking about the ones who know the score but about the large number of people who don’t.
He says “16 bit linear PCM has a dynamic range of 96dB”, which is correct. But now think about what happens in a soft passage. To make the music soft you might need to stick 14 or 15 bits of zeros in front of it. Now your music only has 2 bits of resolution and doesn’t sound right.
Say you want to hear these soft passages better, so you turn the volume up. They will sound awful because they are being played with 2 bits of resolution. Any set of ears can hear this; you don’t need golden ones.
This is a known problem with classical music and it is easily observed. 24b is enough to fix this. Some systems use 32 bits.
This also points to a way to get some easy lossless compression in music. Use one symbol to encode the average volume as it changes. Then center the music data around these average-volume symbols and remove the top zero bits. With this simple algorithm, 24b music is the same size as uncompressed 16b PCM.
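A minimal sketch of that block-gain idea, assuming numpy (the `encode_block`/`decode_block` helpers are hypothetical, not any real codec): store one shift value per block and keep a 16-bit mantissa per sample. Note it is only truly lossless when the block is quiet enough that at least 8 top bits are unused, which is exactly the soft-passage case being described:

```python
import numpy as np

def encode_block(x):
    """x: int32 array holding signed 24-bit samples for one block.
    Returns (shift, mantissa16). Lossless only when the block peak
    leaves at least 8 unused top bits; quiet passages usually do."""
    peak = int(np.max(np.abs(x)))
    # Number of top bits we can discard, capped at 8 to keep a 16-bit mantissa.
    shift = min(8, 23 - peak.bit_length()) if peak else 8
    return shift, (x >> (8 - shift)).astype(np.int16)

def decode_block(shift, m):
    # Restore the original magnitude; loud blocks lose their lowest bits.
    return m.astype(np.int32) << (8 - shift)
```

A quiet block (all samples fitting in 16 bits) round-trips exactly; a loud block is stored as its top 16 bits plus a shift, which is where this scheme stops being lossless.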
TI does this technique in hardware to increase amplifier performance – it is called power supply volume control.
I can’t tell whether increasing the sample rate makes any audible difference. But note that 44.1kHz can only capture up to a 22kHz sine wave. Instruments don’t make pure sine waves; that’s why they sound different. If you start throwing out all of the high frequencies you will be losing part of this. Think about using a signal generator to listen to a 2kHz sine wave, square wave and triangle wave.
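It's worth making the square-wave example concrete. A 2kHz square wave has energy only at odd harmonics, so one can count which harmonics fall inside the audible range and which ones a 44.1kHz system keeps (plain Python sketch; 20kHz taken as a generous adult hearing limit):

```python
f0 = 2000.0                                            # 2 kHz square wave
odd_harmonics = [n * f0 for n in range(1, 101, 2)]     # square waves: odd harmonics only

hearing_limit = 20000.0                                # generous hearing limit
nyquist_44k1 = 44100.0 / 2                             # 22050 Hz

audible = [h for h in odd_harmonics if h <= hearing_limit]
kept = [h for h in odd_harmonics if h < nyquist_44k1]

print(audible)   # the harmonics a listener could hear
print(kept)      # the harmonics 44.1 kHz sampling preserves
```

Every audible harmonic (2, 6, 10, 14, 18kHz) survives 44.1kHz sampling; what gets discarded starts at 26kHz, which is beyond both ears and speakers. So the bandlimited square wave differs from the original only in content nobody can hear.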
I selected the most important aspects, in my opinion.
The author also points out that the best investment you can make if you really like audio is to buy good headphones, not necessarily expensive ones.
For my part, I do not give people advice about equipment. A cousin of mine spent about US$ 20,000 on one system and asked what I thought about it; I said it sounded great. To almost everyone, “perception” is more important than reality, and if they buy something you pick instead of what they think is better, it will mostly backfire on your friendship.
It is not rocket science – to make 16b music soft you have to put a lot of zeros into the top bits. If you put in 15 zeros, that leaves you one bit for the music. 1-bit music at 44.1kHz just sounds like noise; you can barely tell it is music.
Of course you can’t tell whether it’s music. You can’t hear it at all.
If you use the volume control on your amp you can make that noise as loud as you want it to be.
Really, you should be a bit more careful about reading it. Hint: audio is a two-dimensional function of frequency and amplitude, and that is how things are encoded. I think you are trying to analyse it looking only at the amplitude; it does not work this way. He explained how loudness affects our perception of less loud tones at about the same frequency, and I selected the sentence. For all I know, and I studied this a long time ago in vibration analysis, he is totally right.
Take something like the 1812 Overture where they use real cannons. Those cannon shots are very brief but very loud. The loud components in them do not persist long enough to damage hearing. If you center the recording to capture those cannon shots you have to push the average volume level down on the rest of the recording a lot. When you do that the very quiet passages lose all of their resolution – they are effectively recorded using 1-3 bits since all of their top bits have to be zero to allow for the cannon shots.
Now if you turn up the volume on those quiet passages (something that he did not allow) you will find that they sound awful.
This is a well-known problem with recordings of classical music. The combination of dynamic range with enough bits left over for accurate resolution isn’t there in 16b audio.
A floating point format would be the best way to fix this.
People also say you can’t hear ultrasonics. But you can if they are modulated in the right way.
http://en.wikipedia.org/wiki/Sound_from_ultrasound
There was a recent Kickstarter featuring a hypersonic system.
Read the article again where he talks about how high frequencies affect and distort sounds we can hear.
Really, the problems you are explaining have nothing to do with reproduction and are all about the processing phase. Either the people who captured it did not use enough bits, or it was damaged in the processing step. He explained this in his arguments, as he also explained why we should care about formats like FLAC if we want to remix or apply effects.
His arguments were directed toward reproduction.
Exactly…in other words, playback! For playback, 16 bits is enough. For mastering etc., more bits is beneficial/useful/you name it.
Edit: there’s a HUGE thread over at Head-Fi detailing all this, successfully debunking all the myths surrounding 24-bit playback etc. Well worth the read, if you have some free time.
http://www.head-fi.org/t/415361/24bit-vs-16bit-the-myth-exploded
16 bit playback is thin and lifeless.
24 bit playback sounds more real.
end of story.
More real in someone’s mind when they know it’s 24, but which no one could point out during blind tests when they were presented samples without knowing which one is which. End of story indeed.
“blind” listening tests are garbage. the ears and our reaction to music doesn’t work that way. a flawed test can show anything you want it to.
stick with your low-def music as long you want to, you can always reduce reduce reduce. but if i pay for music i want the masters, not something reduced for 1978 consumption.
Blind tests are exactly the only way to test it. Otherwise you aren’t testing your ears, but what your mind makes you imagine. It has nothing to do with reality.
What else can you make your ears imagine they are hearing?
Except that nobody has been able to prove this – as soon as people don’t know what they’re listening to, 16 and 24 bit reproductions are completely indistinguishable.
Of course, the 24-bit versions you have might well sound better – since they are mastered for high-end gear and picky listeners, instead of for radio and cheap CD players at parties. That’s not because of the bit depth, though – they’d still sound better in 16 bit.
And he even explicitly makes that distinction in the article and says that for mastering it’s better to use 24 bits or even 32.
This is called “floating point”. However, the range with 16bit PCM is still so huge (and perception is not linear) that virtually no one uses floating point (except for processing).
Well, you need about 12–13 bits to make audio sound good. With 16bit that means you can turn the volume down to 1/8 of max, which is plenty for normal radio music, but it can be an issue with classical, especially if you already have the app volume at half of full.
I would prefer if we just inserted a gain or volume value in every MP3 packet (around 300 samples); that would be enough. You don’t need the extra 8 bits in every sample, but updating the total volume 100 times per second would let us use the 16 bits to the fullest even at low, soft volumes.
I absolutely can notice light coming from some “IR” remotes. Just now I took out three remotes to test with: I can see light from a Roku remote as well as a Samsung DVD player remote; it’s dim but unmistakable. I could not see the light from the cable remote. I’m not sure how to determine what the wavelengths are.
I can clearly see the “IR” floodlights used by night cameras even when I’m not looking for them. I really never thought anything of it, are other people not able to see those?
Edit: I put IR in quotes because maybe they aren’t technically IR. Is there a way to find out?
Most people can’t see near-infrared, but there are people who have a mutation of sorts that does let them see such and some have been able to see such after having a laser eye surgery. It is rare, though.
Everyone can see the common near-infrared LEDs used for night vision. The LEDs bleed into the visible spectrum.
Do you have a source for that, by chance? I don’t doubt you at all, I’d just like to read more about it. It would fit very well into some talks that I give. Tried Google but couldn’t find a reference to it, which probably says more about my skill writing queries.
Thanks!
Actually, the most common circumstance where a rare few people can see infrared or ultraviolet comes from modern cataract surgery – they replace the lens with a plastic lens… which then no longer blocks (as much of) the infrared or ultraviolet that the natural lens does. Not many people are tetrachromatic, or have the chance to see if they are or not.
WOFall,
Yes, that’s what I suspected: some IR LEDs have more visible wavelengths. Anyway, it seems he was exaggerating at least a bit, because I just have to cup my hands over it to see it. When I look into it, it’s “dim”, but “barely visible” seems to imply one has difficulty seeing it, which I don’t. There’s no need to wait for my eyes to become dark-adjusted or anything like that.
Now I’m genuinely curious to what extent different people can register light differently. I wish I were equipped to conduct a proper test. Not that it matters much, but just for fun.
Warning !
Infrared light sources like high power LEDs and lasers can be more dangerous than visible ones, because there is no reflex to close eyes.
By trying to ‘see’ infrared, one can be dazzled and have his or her retina damaged without realizing it.
Beware !
Treza,
Yes, however the emitters in remote controls are low power and are certainly not lasers. I found specs for an IR LED illuminator kit with 36 high-power LEDs rated at 3.6 Watts. For comparison, if we do some handwaving regarding efficiency and focus, that’s still an order of magnitude less energy than, say, a car headlight of ~55 Watts. It’s probably best not to look directly at either from up close, so it’s good you brought that up, but I doubt it’s going to cause acute problems unless it’s unusually intense or sustained. These things are going up everywhere in public areas (i.e. near entrances), so hopefully they’re not that bad for us…
It would be interesting to read up on what levels are considered dangerous.
You’re not taking into consideration how big an effect efficiency has here; “hand waving” doesn’t quite cover it. The car headlamp probably only gives out around 3 or 4 times more light energy than your LED array, making those power consumption figures pretty misleading.
You have a point though, that these things are everywhere. However people don’t tend to stare at them because they don’t offer a distraction for the eye to focus on in the same way a floodlight or headlight does.
daedalus,
Of course, I just wanted to give a rough context. Calculating how much light energy is entering the pupil is left as an exercise for the reader.
I guess it depends if you are able to see the red glow. I’m assuming from leo’s highly upvoted post “Everyone can see the common near-infrared LEDs used for night vision.” that many people can see it. They probably take a brief look to identify the source and then continue going about their business.
I don’t have much opinion on high-bitrate audio, but it reminds me of when Microsoft told me that science prevented me from being able to tell a Retina screen from standard one, ‘because science’
Harmonics, artifacts and subtle field effects can have quite significant effects outside of the directly measurable, and shouldn’t be disregarded.
Except that the only way 24/192 affects harmonics, artifacts, and subtle field effects is by driving many types of speakers outside design tolerances, risking CREATING said distortions.
If you haven’t already seen them, take a look at the two videos he’s done so far which explain the underlying principles.
https://www.xiph.org/video/
Now that we have digital audio processing, it’s possible to make a lowpass filter with a perfect cut-off once you get past the analog-to-digital conversion stage.
That means that a 44.1KHz audio file can represent frequencies all the way up to 22.05KHz. Not only beyond human hearing range… but beyond the rated output range of every model of non-laboratory speaker I’ve ever heard of.
(And the sampling theorem proves that, as long as you apply that lowpass filter, you’ve got a unique solution, which makes the downsampling from a conversion-stage sample rate like 96KHz to 44.1KHz lossless without needing one to be an integer multiple of the other.)
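That uniqueness claim can be checked numerically. The sketch below (assuming numpy; the `fft_resample` helper is mine, not from the article) resamples a bandlimited tone from 96kHz to 44.1kHz by truncating its spectrum, then compares the result against the ideal 44.1kHz samples of the same tone:

```python
import numpy as np

def fft_resample(x, m):
    """Resample a periodic signal x to m samples by spectrum truncation.
    Exact for signals bandlimited below the new Nyquist frequency."""
    n = len(x)
    X = np.fft.rfft(x)
    k = min(len(X), m // 2 + 1)
    Y = np.zeros(m // 2 + 1, dtype=complex)
    Y[:k] = X[:k]                       # keep only bins below the new Nyquist
    return np.fft.irfft(Y, m) * (m / n) # rescale for the new length

fs_hi, fs_lo = 96000, 44100
n_hi = 9600                             # 0.1 s at 96 kHz
n_lo = n_hi * fs_lo // fs_hi            # 4410 samples at 44.1 kHz
f = 10000.0                             # 10 kHz tone, well below 22.05 kHz

x = np.sin(2 * np.pi * f * np.arange(n_hi) / fs_hi)
y = fft_resample(x, n_lo)
ideal = np.sin(2 * np.pi * f * np.arange(n_lo) / fs_lo)
err = np.max(np.abs(y - ideal))         # should be at machine precision
```

Despite 44100/96000 not being an integer ratio, the resampled signal matches the directly sampled one to numerical precision, as the sampling theorem predicts for content below the new Nyquist frequency.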
As for the sample size, what a lot of people don’t think about is how 16-bit linear PCM, while obvious, isn’t used in practice. Human hearing has logarithmic sensitivity, so a logarithmic transform is used to spread the representational resolution out roughly evenly across the entirety of human hearing.
(Plus, as he points out, we’ve got shaped dithering, which lets you ensure a uniform noise floor, rather than letting noise pile up in harmonic peaks. Think of it like a smarter alternative to “snap to nearest grid line” for sampling the audio waveform.)
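Here's a toy illustration of why dither matters (a numpy sketch using flat TPDF dither; the shaped dither mentioned above goes further by pushing that noise into less audible frequencies). A tone smaller than half a 16-bit step vanishes entirely under plain rounding, but survives dithered quantization:

```python
import numpy as np

rng = np.random.default_rng(0)
fs, f = 44100, 1000.0
t = np.arange(fs) / fs                       # one second of samples
x = 1e-5 * np.sin(2 * np.pi * f * t)         # tone below half a 16-bit LSB (~1.5e-5)

def quantize16(sig, dither=False):
    q = 2.0 ** -15                           # 16-bit step for full scale [-1.0, 1.0)
    # TPDF dither: sum of two uniform variables, spanning +/- one step.
    d = (rng.random(len(sig)) - rng.random(len(sig))) * q if dither else 0.0
    return np.round((sig + d) / q) * q

plain = quantize16(x)                        # every sample rounds to exactly zero
dithered = quantize16(x, dither=True)        # noisy, but the tone is still in there

ref = np.sin(2 * np.pi * f * t)
amp_plain = 2 * np.mean(plain * ref)         # recovered fundamental amplitude
amp_dither = 2 * np.mean(dithered * ref)
```

Without dither the quiet tone is deleted outright; with dither it is recovered at its correct amplitude, buried in a benign noise floor rather than correlated distortion.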
Again, it’s room acoustics, hardware, and your audio compression that you want to worry about… none of which 24/192 can help.
Also, as he pointed out, software to administer double-blind ABX tests is freely available and there’s so much bunk, superstition, and snake oil in the world of audiophiles that, without an ABX test, an opinion isn’t even worth the time it takes to read.
xiph.org is full of it.
they are trying to convince you less is more. they are in the DSP industry.
everyone i know hears 24bit as more accurate representation of the real thing.
16bit is thin and lifeless and lacks much of the emotional content of 24bit. which is why it’s easy to type about it not being important. but it’s there, and you can hear it if you listen to well-mastered music, especially music recorded analog (anything before 2000).
24 bit sounds so much better, the second you hear it you will drop all these stupid arguments.
It’s all about the music and how it was produced.
24/192 Jazz at the Pawnshop or Diana Krall sounds just as insipid as it does at lower resolutions. No amount of audiophile weasel-words is going to change that.
I’ve heard 16/44.1 mono versions of Miles Davis albums that sound better than those overpriced HDtracks “remasters”.
i agree with all that. but it doesn’t cancel what i said.
i’ve heard the same exact album on vinyl, on CD, and at 24bit FLAC, and the 24bit FLAC sounds the best, with the best of both worlds.
it comes down to the release. i’m arguing against people who say “there’s nothing above 16/44, it’s all a lie”. it’s most definitely not a lie, but bad recording, mastering or other business decisions can make 16/44 the best available.
So basically, one person once told you something untrue and used science to justify his reasons; therefore anyone else who tells you things you disagree with but uses science could also be wrong?
I think the solution is to become scientifically literate, rather than dismissing scientific arguments out of hand.
Of course not, the point is that, just like our eyes, there are significant things about our hearing that we don’t understand.
The important thing about scientific theory is to recognise that the models are almost never accurate. They’re the best approximation to reality that we know/use. Therefore if you’re using these incomplete models to counter established knowledge/observable behaviour, then you’re making one of the worst mistakes a scientist can make.
As for this article, the ‘right’ thing would be to do the a/b tests across various bit/sample rates and find out what happens in reality. I don’t care enough to do this, but it wouldn’t be hard to do.
Unfortunately there’s one other factor in play here. It’s easy for someone careless or inept to badly encode audio; all the discussion thus far has assumed that the producers do the right thing. If the industry message is always lossless / high bitrate, it makes it harder for the careless to mess up the sound. Sometimes driving technically uncertain best practices can foster an environment that leads to better actual solutions.
Like all of those idiot “climatologists” and “evolutionary biologists” right?
Well, to be frank, yes.
The evolutionary biologists came up with a model. We’ll never /know/ if it’s right or not, because we can never actually see what happened in the past. However, it’s more useful to assume that evolution is correct, because the models we have for it fit observed behaviour and have proven useful in predicting future behaviour (localized mutations etc.). Most people choose to believe in evolution without question; I choose to believe in it only until we come up with a better explanation.
As for climatologists, this is a less clear-cut problem. These guys have a model of the world; this model is wildly incomplete, and as a result the exact outcome of these models keeps changing. We’ve gone from cooling to warming to ‘more extreme weather’. This is to be expected: as the models and our understanding change, the predictions change.
I don’t think we’re at a stage yet where we can rely on the predictions to know what will happen, but I do think it’s sensible to assume that unless we change our global emissions, /something/ bad will happen. It’s a good case of the faith dilemma: it’s always safer to believe in a god, because if you’re wrong, then nothing bad happens, but if you’re right, then you chose the winning side.
i don’t get why the conversation is about 24/192 instead of 24/96, which is the more common “high fidelity” audio.
also, the amount of variables in this makes any conclusion that “it is the same” (or that “it isn’t the same”) a bit moot.
it depends on at least three major factors:
a) the kind of music (type of instruments, number of instruments, whether it’s a smooth passage or a loud passage, etc.)
b) your listening equipment.
c) your ears.
i can hear and feel a significant difference between good and bad headphones, and between a bad and a good bitrate. but it very strongly depends on the kind of music.
a previous girlfriend had perfect pitch and was a musician, and she could hear higher frequencies than i can.
people are different, meaning that this is one of those things that can’t just be ruled by majority. there’s a segment of the population that feels the difference between low/medium bitrate and a higher encoding. also, depending on the music type, this fraction of the population varies wildly.
and then there’s the whole listening equipment difference…
I am OK with storing samples at 24 or 32 bits for intermediate processing, inside a recording/mastering studio, in order to have the least error propagation when processing such signals.
In final audio, it does not make any sense.
unless you want to hear the actual thing as recorded.
then it makes perfect sense.
But it’s been mixed and mastered since it was recorded. And you’re not listening to it on the equipment it was recorded with. You’ll never hear it as it was recorded.
Not to mention that you’re wrong anyway…
oh please, i know i’m right. it’s clear as day for me. you’re the one suffering if you can’t hear it.
return to your HD visuals and your crappy audio.
Pass the blind test, then claim you can hear it.
I had also held the belief that audiophile stuff is pointless. I have since started designing some audio equipment and now have a different opinion: many of the audiophile “concepts” are correct, but often not for the reason they think. Concerning high-res audio, for example, having a frequency cut-off at 22kHz is low, not because you can hear above that, but because the filter at 22kHz needs to be very steep and will strongly affect the impulse response. You can clearly hear the difference between a linear-phase and a minimum-phase filter. Having a higher sampling rate allows one to use a weaker, less noticeable filter. 96kHz, however, is already plenty. Regarding bit depth, with good equipment you can A/B test 16 bits vs 20 bits, even with 16-bit dithering, which already helps.
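The steepness argument is easy to quantify. Assuming the passband must stay flat to 20kHz (a stated hearing limit, not anything from the comment above), this sketch computes how much room the reconstruction filter gets before Nyquist at each rate:

```python
import math

passband_edge = 20000.0                     # keep everything below this intact
for fs in (44100.0, 96000.0):
    nyquist = fs / 2
    transition_hz = nyquist - passband_edge
    octaves = math.log2(nyquist / passband_edge)
    print(f"{fs/1000:g} kHz: {transition_hz:.0f} Hz "
          f"({octaves:.2f} octaves) of transition band")
```

At 44.1kHz the filter must fall from passband to full stopband in about 2kHz (roughly a seventh of an octave); at 96kHz it gets 28kHz, well over an octave, which is why a gentler, better-behaved filter is possible there.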
From Wikipedia:
“Aphakia is the absence of the lens of the eye, due to surgical removal, a perforating wound or ulcer, or congenital anomaly…
Aphakic people are…able to see ultraviolet wavelengths (400–300 nm) that are normally excluded by the lens. They perceive this light as whitish blue or whitish violet. This is probably because all three of the eye’s color receptors, the blue more than the others, are stimulated when such a person sees ultraviolet wavelengths. Aphakia might have had an effect on the colors perceived by artist Claude Monet, who had cataract surgery in 1923.”
In WWII elderly coast watchers with UV vision were used to detect or communicate with off-shore boats or subs using UV light.
… of the argument that says in order to fully appreciate 1080p, I must have a 60″ LCD, because of the distance I sit from the TV, because otherwise my eye can’t resolve individual pixels.
Of course, this argument is usually promoted by the manufacturers of 60″ hi-def TV’s. I suppose for 4K, I’ll need a bigger house.
Breaking out a technical or mathematical explanation for something that is largely subjective seems a bit off the point. I’ll agree that if you’re producing >100 dB noise levels, you’re probably doing it wrong, and you won’t enjoy that music for long.
But that’s as much a matter of the hardware and software being used to turn that encoded stream into sound waves, as it is the stream. I can blow out speakers and make some really lousy noise with 64 KB/s .mp3 files, too.
The linked article has a section labelled “Sampling rate and the audible spectrum”– and then talks about how they determine the audible range for hearing. Sampling is not discussed at all. Playback is not discussed at all.
For that matter, I can’t tell if 24/192 refers to the data stream, or the output channels. Does that mean a stereo recording is 24 bit and 96khz per channel? Or is it 12 bit and 192khz per channel? Or, are they recording multiple tracks at 24/192, and then expecting my speaker to vibrate at 192khz? That will annoy my cat, every dog in the neighborhood, and probably result in my house being attacked by bats. I hope no one’s getting an ultrasound nearby.
Conceptually, I can see a benefit in mastering @ 24/192. I would think the better the source material, the better the end result will be (although for modern music, since the bits are all large, square and loud, I’m not sure it matters– how many bits do you need for “THUD”?).
However, I’m not sure I see a benefit in playback at 24/192. Since I have one of the early SACD-capable PS3’s, I own two or three SACD disks, and feel they produce a (slightly) better sound– but that’s probably as much the result of more channels and better mastering, than anything.
I do think that lossy compression formats are the product of demonic forces, and an example of the downward slide of civilization, so anything that leads to better adoption of lossless compression (Windows 10 has FLAC?!), I’m in favor of, but sadly, that all seems to get lost in the noise of whether we can benefit from the newest advancement.
I have been arguing this everywhere and I have a 24bit pono player in my hands right now —
anyone who says 24bit audio is snake oil is a complete idiot, or worse, they are trying to sell you something less for more.
listen for yourself, and until you have, all comments are ignorant.
24bit sounds so much better than 16bit it’s obvious to anyone who actually just sits and listens to it for more than 20 seconds.
So why can’t anybody detect the difference between them in a proper test?
i’m detecting it right now. i detect it every time.
your data is based on bad testing. a poorly configured test can prove anything you want it to.
play well mastered 24 bit audio through a player that can do it properly, like a pono or fiio and you will hear clear differences.
no argument.
No, anyone who claims that 24 bits are needed for playback can’t actually point to any evidence of such, since they all fail to pass blind tests to differentiate the two.
they use their heads and maths instead of ears.
everyone i know hears 24bit as more realistic than 16bit.
only typing nerds think they can disprove it.
but every single one of them has HD TV and HD monitor and HD phone screen and HD camera.
unbelievable. i really am getting pissed about this, because OSNews is the only place these idiots claiming that less is more haven’t chased me away from.
xiph.org is garbage. anyone who can’t hear 24bit sounding better has never tried it.
grrrrrrrr…… hear it for yourself and stop trying to use faulty logic to tell people what they can and can’t hear.
Link to the tests results please.
I think it is difficult to understand, because the difference between 44.1kHz/48kHz at 16 bits and 192kHz at 24 bits is the noise floor and the frequency range that can be captured.
So, since we cannot hear anything over 20kHz, it does not make sense to use a higher sampling rate for playback, as neither our ears nor our speakers do anything useful with those higher frequencies, and they can actually cause problems with intermodulation and subharmonics, where sounds above 20kHz add distortion below 20kHz.
Since 16bit is used along with dithering, for playback the sound is still reproduced accurately, with a small amount of added noise which can be shaped to appear mostly in frequencies that we are less sensitive to.
Using 8bit for playback can result in the added noise being audible, but 16bit vs 24bit does not make any difference in listening tests, as the noise introduced at 16 bits is already too low for us to hear during playback (when the volume is not high enough to damage our hearing or cause pain). The only time 24 or more bits makes sense is at the recording and mastering stages, where it keeps the noise low even through many stages of editing.
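The noise-floor numbers behind that can be sketched in a couple of lines, about 6.02 dB of signal-to-quantization-noise per bit (ignoring dither shaping and the small sine-wave correction term):

```python
import math

def pcm_dynamic_range_db(bits):
    """Approximate dynamic range of N-bit linear PCM: 20*log10(2**N) ~= 6.02*N dB."""
    return 20 * math.log10(2 ** bits)

print(f"16-bit: {pcm_dynamic_range_db(16):.1f} dB")   # ~96.3 dB
print(f"24-bit: {pcm_dynamic_range_db(24):.1f} dB")   # ~144.5 dB
```

At playback levels, the 16-bit noise floor already sits below audibility unless the peaks are loud enough to cause pain; the extra ~48 dB from 24 bits only pays off when many editing stages each add a little noise.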
I think you could look at why you seem to care so much, as the logic is ‘sound’*. Listening tests like ABX are great as they take away bias introduced by not knowing which stream is playing which is why they are useful in determining whether people can actually hear the difference or not.
*Pun intended.
i can’t argue with your reasoned approach but you are still missing my main point.
24bit sounds more realistic. every instrument. every voice. every part. the mix is wider, the layering is more realistic.
if you play an instrument, or can sit next to one being played — that’s REAL.
16 bit sounds less REAL than 24 bit, no matter what math or incomplete science thinks about it.
it’s not about the extremes, it’s about the accuracy and detail of the real thing.
if computers could determine what was real sounding we’d be fooled by computer voices every time. but we aren’t because we hear various cues in the audio that tells us it’s less than real.
24 bit sounds more real just like 1080p looks more real. it’s very simple. all we are trying to do is record and playback reality, and 24bit is closer to it.
the closest i’ve ever heard. i’ve been in studios and concert halls, i know what REAL sounds like. sit next to an acoustic guitar player singing. that’s real. your ears know it’s real. you hear the room, their fingers, their breath, and you know it’s real.
if you play a recording you know it’s not real, it’s a replication. the 16bit replication is less real, bottom line, and none of the established math or shaky science behind hearing deals with REAL ACCURACY.
Many thanks for following up on my comment. I understand that in a lot of instances what is specified does not work well out in the field; for example, some people thought that having large packet buffers in a network could improve performance, but in reality it can cause extremely long latency.
However, in this specific instance the ‘real accuracy’, as you put it, is tied directly to the scientific reasoning.
Using the sample rate as an example (because it is easier to understand): when listening to a live orchestra playing without any other equipment such as microphones and amplifiers, you would not experience the kind of intermodulation or subharmonics which could be present in playback. So using a sample rate of 44.1 kHz or 48 kHz could actually be more accurate.
The painful part of the discussion seems to be bits per sample, which could be simplified (perhaps oversimplified) as follows: the fewer bits per sample you use, the more quantisation noise is added, because there is a greater distance between the waveform present in the room and what was recorded. In practice, using 16 bits during playback makes no difference to the perceived quality versus using 24 bits, but the lower amount of added noise can be beneficial during recording and editing, since a little noise may be added at each of the many steps before the final result is distributed.
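The point about noise accumulating during editing can be shown with a toy simulation. The assumptions here are mine, not from the thread: twenty editing passes, each a 1% gain tweak followed by a requantization with ±1 LSB TPDF dither.

```python
import math
import random

def requantize(samples, bits):
    """Round every sample to `bits` of precision, with +/-1 LSB TPDF dither."""
    scale = 2.0 ** (bits - 1)
    return [round(s * scale + random.random() + random.random() - 1.0) / scale
            for s in samples]

def noise_after_edits(bits, passes=20, n=20000):
    """Apply `passes` gain tweaks, each followed by a requantize, and return
    the accumulated error power (vs. an ideal float pipeline) in dB
    relative to full scale."""
    clean = [0.4 * math.sin(2 * math.pi * 997 * i / 48000) for i in range(n)]
    work = clean[:]
    for _ in range(passes):
        work = requantize([s * 1.01 for s in work], bits)
        clean = [s * 1.01 for s in clean]  # ideal reference, no rounding
    err = sum((a - b) ** 2 for a, b in zip(work, clean)) / n
    return 10 * math.log10(err)

print(f"16-bit after 20 edits: {noise_after_edits(16):.1f} dB")  # roughly -83 dB
print(f"24-bit after 20 edits: {noise_after_edits(24):.1f} dB")  # roughly -131 dB
```

Each pass raises the floor a little; after twenty passes a 16-bit pipeline has lost more than a dozen dB of its headroom, while the 24-bit pipeline's accumulated noise is still far below audibility, which is exactly why the extra bits matter in production but not in distribution.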
Now your view seems to be that this is not the case, but it seems that you are biased, because you do not fully entertain listening tests whose sole purpose is to ascertain whether or not a difference can be perceived.
i’ve said before numerous times, any type of constructed “test” for our ears should be thrown out if you are trying to get to the heart of how we consume music.
i’ve seen no tests that deal with the accuracy and the emotional content of the musical piece, that accounts for listeners emotional state, the physical state of the listener, and the room acoustics.
if you gave people A & B & X and asked them to live with them for a few months, giving each one a point when it sounds “perfect” or “the best”, maybe you’d get somewhere close to accurate results. switching back and forth between sources of the same thing tires and confuses the ears.
ears are not meant to jump through such hoops because they don’t behave in a linear way. they are tied directly to our emotional state, which is why music is so effective at motivating or emotionally crippling us, far more than any other sensory input. you can play test tones in mono all you want, or have some poor soul switch between different qualities and ask them to pick, but the results mean nothing to me. people are easily confused and most don’t understand what they are listening for anyway.
give the sample files to musicians, or give someone their all-time favorite piece of music at different quality levels, and give them some time (days if not months) to pick which one is best. our ears are not made for hunting for differences, and we can’t play “spot the differences” as easily as we can with our eyes. there is no compare feature in our ears. it’s about our emotional response. as soon as you’ve played one you’ve already forgotten or at least colored your memory of the previous.
most of you confuse recognition with quality listening. big difference. enjoying many symphonies at mp3 are you?
most of you think the ears can be charted and have fixed specifications like a microphone or a speaker. also false. our ears, being tied into our emotional cortex, can swing pretty radically based on all kinds of other reasons.
most of you also ignore the single most important part of listening — the room you are in (with time of day, humidity, and ambient noises being variables that we catalog and adjust for without thinking).
when i go to my main playback spots and play music i love at 16bit then at 24bit i hear a clear improvement. i own 3 albums in both 16 and 24, transferred from analog masters. it’s literally clear as day and all of this keyboard jockeying shows its ridiculousness. typing about sound.
no one outside of musicians and music producers talks about accuracy, soundstage, timbre in voices and instruments, pre-delays, room shapes, and overall emotional impact. the xiph.org people might know programming but they obviously don’t know jack about music and must have some horribly fried ears.
standup bass — way better at 24bit.
vocal harmonies? more realistic and not mashed together at 24bit.
hi hats — so many more sounds made than recorded at 16bit
crash cymbals — like being in the room with it at 24bit. like the ‘digital version’ at 16bit.
voices — like being there.
acoustic guitars — those have strings and frets and when fingers move on strings that makes sounds clear as day when in the room. at 16bit some of that detail goes away.
snare drums – the attack and dynamic range is more direct and fuller at 24bit. you can hear the snares on the bottom drum head.
soundstage? everything has a place at 24bit, nothing competes or cancels. when a new instrument comes in it sits in the mix perfectly. not so at 16bit.
i’m over 35 and have heard analog 2″ tape as well as an expensive vinyl rig, so lately i’ve been taking my 24bit DAP to people younger than me and most can’t believe it’s digital, they have just accepted that everything “digital” means less than the original. That should tell you something right there, clearly 16bit wasn’t doing it, even for people that haven’t heard better.
I should have avoided this whole thing, because, honestly, programmers and math people trip up on audio almost every time. they want a formula to explain everything, and our auditory system is so amazing it’s far beyond what modern science understands. it’s our basic survival. even people with deaf ears still pick up vibrations and prefer higher resolution sound.
you can always flip it — if 16/44 was all we could ever hear, then a CD version of your baby crying would fool you. a CD version of your wife screaming would fool you. but it doesn’t, at least not at 16bit.
again — listen to 24bit version of your favorite piece of music on a player that can actually play it at a decent quality (the fiio x1 is only $99), then come back and tell me you think it’s snake oil. no one does, they just shut up and enjoy their music because it’s better sounding than the 16bit version. and guys like me type a thousand words trying to get the word out, because our society is suffering under all this bad science and attempts to program reality.
call me a naturalist programmer if you want – the second that geeks get into human sensory fields they lose their way quickly. the idea that a programmer is somehow more qualified to talk about sound than a lifelong musician or recording engineer just shows the arrogance and folly of internet commenters. xiph.org plays no instruments and has touched no one emotionally with sounds they’ve produced. it’s a circle jerk of people that can’t hear telling everyone else they can’t hear either.
Edited 2014-12-07 01:36 UTC
An ABX test checks whether you can hear a difference; if you felt emotionally different each time music of a particular quality was played, that would count and the results would show it… but you do not fully entertain them.
If an ABX test seems to confuse the ears, it means that the person doing the test is having to guess.
Our ears are a physical system until the signal reaches the brain, and then emotions can alter our perception. Our auditory system is made to pick out differences, for example having a conversation with someone while music is playing in the background.
Important factors indeed but have nothing to do with playback frequency and bit depth which are the things under question here.
An ABX test would confirm this if it were true… but you do not fully entertain them, because it is difficult to accept that our senses do have limits.
This is a logical fallacy called appeal to authority; the science is the same no matter what profession or group you belong to.
If it is recorded with 2 channels, tests have shown that people do find it hard to tell the difference at high bit rates. But we are not talking about mp3 compression here, that is a different subject.
An ABX test would either back up or invalidate those claims.
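For what it’s worth, scoring an ABX run is just a one-sided binomial test: how likely is this many correct answers if the listener were purely guessing? A short sketch (the trial counts are chosen for illustration):

```python
from math import comb

def abx_p_value(correct, trials):
    """Probability of getting at least `correct` answers right out of
    `trials` ABX trials by pure guessing (one-sided binomial, p = 0.5)."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

# 14 of 16 correct would be strong evidence of an audible difference;
# 9 of 16 is statistically indistinguishable from coin-flipping.
print(f"{abx_p_value(14, 16):.4f}")  # 0.0021
print(f"{abx_p_value(9, 16):.4f}")   # 0.4018
```

Nothing in the test prevents taking days between trials or using your all-time favorite music; it only requires that the listener not know in advance which version is playing.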
This might be the cause of your bias as you have invested a lot of money on equipment.
Science does not have the answer for everything, but at least it tries to understand.
This is just an appeal to emotion and does not really add to the discussion with regards to the differences in perceived sound quality.
There are other factors at play when you compare a $99 player with a £999 player that have nothing to do with playback bit depths. For example, the cheaper player could have low-quality components.
Again this is just an appeal to authority and emotion.
Bought The Cars and Blood Sugar Sex Magik at 24bit yesterday.
WOW.
My wife started crying when I played The Cars “All Mixed Up” at 24bit out of the Pono.
Crying. She couldn’t believe it was digital, it was like perfectly clean vinyl on a high-end player. CRYING. Take that, xiph.org.
But it was a digital FLAC playing on a Pono through $40 speakers.
As far as RHCP — I could hear Frusciante’s fingers on Could Have Lied. Never heard that in 23 years of listening to that at 16/44.
I could hear anthony’s voice cracking even more in Breaking The Girl. Never heard that in 23 years of listening at 16/44.
Flea seemed to be hitting even more notes when he slapped.
24bit is like sitting in the studio. The noise floor is as nice as CD but the actual music itself is much more alive.
Listen yourself, then type. Don’t think. Don’t believe this crap math about less = more.
I thought readers of this site understood data and resolution? 16/44 can hold about 65k datapoints for a second of audio. 24/96 can hold about 65 million per second.
If you care AT ALL about how your music sounds, you will check out 24bit, or you will be sorry.
I suppose you didn’t read the excerpt of the article where such arguments were debunked; I will repeat it again:
I have read many similar tests using good encoding techniques and they all reached the same conclusions: no one was ever able to tell precisely whether 24-bit or 16-bit was playing, or to distinguish high bit rates from low ones.
In all cases that showed some differences, the cause was improper processing/encoding (and distortion caused by intermodulation from high-frequency signals played a role in some cases).
We should always remember that all engineered equipment has trade-offs; we just can’t get the best of everything in one piece, and that strikes loudspeakers particularly hard, as anyone who has been to orchestra auditions can attest.
As I said in my first post, when “in person” I try to avoid such subjects. I found out that the self-appointed “audiophiles” believe in the vibrational nuances of the tips of fairies’ wings.
Edited 2014-12-04 14:40 UTC
The part no one seems to be mentioning, from the AES restricted paper, but mentioned in one of the linked articles discussing the paper:
So, see my previous comment about the benefits of mastering to a higher level (which got downvoted, strangely, I suppose because I’m criticizing what appears to be criticism for the sake of criticism).
shooting down 24 bit audio is the definition of FUD.
only in audio do normally smart people argue against higher resolution and higher specs.
amazing.
listen for yourself before commenting and posting links to other people’s opinion.
music isn’t math, it’s emotion and vibration. more is more.
go back to your HD TV and HD camera and HD phone and leave the audio to the people that actually use their EARS.
Thom, you stepped in the trap.
Audio and hearing is counterintuitive to the techy programmer mind.
Want proof?
Only audio gets smart people fighting for lower standards and actually claiming no one ever needs an upgrade. Ever.
16/44 is a compromise from 1978. Anyone defending it at this point has another motive or is horribly misinformed.
i’ll add if anyone wants to hear 24bit audio there is plenty available online. Not everything, and not that much modern stuff, but if you are really into music you owe it to yourself to check it out.
You can also come over to my house and hear my pono player for yourself. I use cheap speakers and head units cuz I’m just a regular guy, not an audiophile, whatever that means. It also sounds amazing in any car and anywhere else. It’s simply the best, the original.
A FLAC at 24/anything — the accuracy and warmth of perfect vinyl with the low noise floor and pristine highs of the best cd. You can hear cymbal decay and hi-hats again. Voices and guitars and keyboard solos will make you cry again (or for the 1st time if you grew up in mp3 era).
All in a file like an mp3. Perfection almost achieved.
Interesting that you won’t address the scientific arguments made, but just appeal repeatedly to emotion, and basically say “I believe in it so you should too”.
And yet, the authors agree that music mastered for the higher fidelity formats sounds better than the music mastered for 16/44.
I’m curious if people would have been able to tell the difference on “commodity” hardware– not audiophile levels (although this paper is 7+ years old, and DSP has moved on a bit), but the stuff that average people might buy.
The real secret of the article is not whether 24/192 sounds better than 16/44.1, but the fact that the music industry isn’t bothering to make quality music anymore.
As you might or might not be saying, it’s not about “mastered for 44/16”, though – it’s about “mastered for radio and cheap speakers at parties”.
That’s not always the case; there are albums out that are mastered differently. I think Bob Dylan has been very particular about this lately, and I’m sure there are others in other genres. Still, yeah – it’s a shame it isn’t easier to get hifi-friendly versions.
Edited 2014-12-05 14:24 UTC
The article is bullshit. A very long text that can’t explain the main thing: why a 22 kHz tone is always a square wave at a 44 kHz sampling rate.
One word – filtering. You really ought to work on your reading comprehension.
That’s not an explanation. Music should always be filtered; nobody wants anything above or below the human audible range. But sampling rates of 48 kHz, 88 kHz, 96 kHz, etc. always give a more precise waveform.
This is the main point.
And a 22 kHz tone sampled at a 44 kHz sampling rate is a square wave. It can be antialiased and filtered as you wish, but it can never regenerate the original waveform. Despite the wrong claim of this article.
No it is not – in the digital domain there is no wave that actually has a form at all. Maybe you should also look at the D/A and A/D video by the same author as the article – https://www.youtube.com/watch?v=cIQ9IXSUzuM – where he shows this using an oscilloscope.
So a 22 kHz tone (a pure tone, which has a sine wave form) is, after an A/D–D/A conversion (sampling at 44 kHz), again a 22 kHz sine wave.
The raw audio for a 44 kHz recording is typically put through a filter that removes anything above 20 kHz. (Well, it’s not a brick wall – but at 22 kHz, everything is definitely gone.)
Which is fine, 20kHz is on the very upper limits of healthy hearing.
and extremes are just extremes. this doesn’t address the middle of the spectrum, where most of our music lives.
there might be something to the overtones and masking provided by ultrasonic and subsonic frequencies, but i think that’s a false read on this.
music sounds best because of detail, depth, timbre, and soundstage, and none of that has to do with extreme highs or lows.
the real advantage is 24bit being able to measure to about 65 million data points, giving far more precision than 16bit.
trust me – 24/44 sounds noticeably better than 16/44. unless you’ve heard it yourself, with music you love and know well, please refrain from commenting your opinion.
an opinion about audio is useless. play the music yourself and decide, when hearing it it’s so clear people drop these stupid arguments. it’s so obviously more realistic to our ears.
you are trying to prove to me that i don’t need anything more than 540×400 resolution on my tv screen, and no formula or other pictures will prove you right. my ears hear it easily, and so do most other musicians, producers, engineers, and sound people.
Nonsense. Complete and utter nonsense. Provably wrong.
An analog to digital converter must have an anti-aliasing filter, and a digital to analog converter must have a reconstruction filter. The reconstruction filter removes any signals above half of the sampling frequency. A square wave consists of the fundamental frequency plus odd-numbered harmonics. Your 22 kHz square wave has components at 22 kHz, 66 kHz, 110 kHz, and so on; 1,3,5,7… times the fundamental.
Only the 22 kHz fundamental comes out of the DAC. A sine wave, just like the signal that went into the ADC.
Anyone who talks about digital audio and leaves out the role of the input (anti-aliasing) and output (reconstruction or anti-imaging) filters doesn’t know what they are talking about. Those filters are a critical part of the sampling and conversion process.
The output of a DAC never contains square waves or stairsteps. Those artifacts of sampling are made up of frequencies higher than half the sampling frequency, so they are smoothed out and eliminated by the reconstruction filter.
Some of the common confusion may come from the fact that semiconductor companies sell chips that they label as converters. It’s shorthand. Those chips are really building blocks for converters, and require other components to make a complete converter. When someone calls their mini-tower “the CPU”, that’s a clue that they don’t know much about computers. Confusing the chip for the converter is a similar mistake. You could get a 22 kHz square wave out of a primitive, non-oversampling converter chip, but not out of the complete converter.
TL;DR Filtering is a fundamental part of sampling. You should ignore anyone who talks about sampling without talking about filtering – they’re talking about something that they don’t understand.
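This whole exchange can be tested numerically. The sketch below (mine, using a truncated Whittaker–Shannon interpolation kernel as a stand-in for a real reconstruction filter) samples a 20 kHz sine at 44.1 kHz, then evaluates the reconstructed waveform exactly halfway between two samples. The value matches the original sine; the "staircase" never appears.

```python
import math

fs = 44100.0   # sampling rate, Hz
f = 20000.0    # test tone, close to the 22.05 kHz Nyquist limit

def sample(n):
    """The n-th sample of a 20 kHz sine taken at fs."""
    return math.sin(2 * math.pi * f * n / fs)

def reconstruct(t, taps=4000):
    """Whittaker-Shannon interpolation: a sum of sinc kernels centred on
    the samples near time t (truncated to `taps` samples each side)."""
    n0 = int(t * fs)
    total = 0.0
    for n in range(n0 - taps, n0 + taps + 1):
        x = t * fs - n
        total += sample(n) * (math.sin(math.pi * x) / (math.pi * x) if x else 1.0)
    return total

# Evaluate between two sample instants, where a "staircase" would show up.
t = 1234.5 / fs  # exactly halfway between samples 1234 and 1235
error = abs(reconstruct(t) - math.sin(2 * math.pi * f * t))
print(f"reconstruction error at mid-sample: {error:.6f}")  # small
```

The residual error here comes only from truncating the interpolation kernel; a real DAC's reconstruction filter plays the same role in continuous time, which is why the fundamental comes out as a sine and nothing else.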
In fourier analysis… and in our ears… etc…
But a square wave is always a square wave, and a triangle wave is always a triangle wave.
If you digitize two 22 kHz tones, one a sawtooth and the other a triangle wave, at a 44 kHz sampling rate, both are square waves in their digital form, and when converted to analog both become identical (sine?) waves. Only increasing the sampling rate and the A/D and D/A processing can improve the process of restoring the original waveforms.
Talking about sawtooth and triangle wave frequency is meaningless, as ASP and DSP are based around sine waves. When we say 22 kHz we mean a 22 kHz sine wave (or a sum of sine waves where the highest frequency is 22 kHz), not a sawtooth, triangle or any other kind of wave at that frequency. Sound and hearing are the same: when we say a 20 kHz sound, we mean a 20 kHz sine wave frequency, not a square, sawtooth or triangle wave at 20 kHz.
The digital form of the signal is not a square wave, because the digital signal is represented in discrete time, not continuous time like an analogue signal. This means the signal has a certain value at discrete time intervals and the values in between are undefined.
Unfortunately, some record companies purposely crapify the Audio CD master by compressing the dynamic range, so their CDs sound good in “boombox” car stereos and the like. Sometimes they engage in the Loudness War for their Audio CD masters.
And the uncrapified masters are kept for vinyl releases and DVD-A, SACD and 24/192 track downloads.
So, I can’t dismiss audiophiles claiming they hear a difference so easily.