Copyright claimed by Ron Plachno, author, as of the date written, October 16-19, 2013.
Note that this article may later be published in a book and registered with the Library of Congress.
Are Engineers Modeling the Human Ear Wrong?
by Ronald J. Plachno
October 16-19, 2013
As I do home recordings and wonder in my own mind which compression
method for music is best - mp3 versus m4a and other types - a bigger
issue has occurred to me through listening tests. And this is my
question:
THE QUESTION
Does digital audio reduce music quality because it combines multiple
audio sources, which the human ear hears independently, into a single
source that is simply less impressive to our more sophisticated ears?
THE EYE ANALOGY
In my opinion, science and engineers modeled the eye wrong in a similar
manner. While the inventions of movies and TV, which are optical
illusions, were great, they underestimated the eye. The first movies had
24 still frames per second. Since the human eye briefly retains an image,
we saw that as continuous motion. TV, due to interlacing, had 30 frames
per second. For a long time it was believed that was all the eye could
handle. But when progressive scan came out with 60 frames per second, it
looked so good to some humans that some sales people called it "High
Definition" - which should really refer to the number of lines down a
screen, such as 1080, and not how often the screen changes (gets
refreshed). And now I understand some displays refresh at 120 frames per
second - or more. How good is this eye of ours?
MY EAR OPINION
Even though we have only two ears, it seems to me that each ear can
somehow hear many sounds quite independently. If those sounds have not
yet been combined into a single digital sound, those independent sounds
seem richer. Why do I say that? Some items below:
LIVE MUSIC - whether a band, an orchestra, or a combination, live music
always seems far richer to me in audio quality and complexity than a
two-speaker stereo system.
5.1 / 7.1 SURROUND SOUND - If we have only two ears, then why does 5.1
surround sound sound far more amazing to us than the two point sources
of stereo? We seem to be able to hear those 6 sounds independently...
somehow. And we view it as an improvement over two sound sources. Of
course, for 7.1 it is 8 sounds.
NORMAL ROOM SOUNDS - I do believe at times I hear multiple sounds in a
room and can tune into one of them while almost ignoring the others.
This means our ears are not simply combining those sounds; we hear them
independently somehow - as if each ear perhaps has 50 microphones and
not one. For example, we might hear a football game, our spouse, a
refrigerator making noise, people talking in the background, a child's
toy, and a bird singing outside, and somehow I seem to be able to tune
my ear radio to listen to the sound source I wish. I have never been a
dog, but I think a dog does the same thing. They might be hearing 50
sounds, but one of those is of interest, and they perk up their ears and
tune their ear radios for that one sound.
NORMAL OUTDOOR SOUNDS
When I was young, our family lived across the street from railroad
tracks, which at the time even carried steam locomotives. Later, in the
Chicago area, we had a house in the flight pattern of O'Hare Field,
which at the time was the busiest airport in the US. For a time we lived
in England in the path of the supersonic Concorde. At times people would
ask us if the deafening sounds bothered us. We would ask, "What sounds?"
And again it appeared to me that our ears hear sounds separately. If
something is quite normal as a background sound, our brains might even
ignore it. Now if we heard just one sound source per ear, with the
Concorde mixed in with the sounds around us, then it seems clear to me
that I could not ignore it. It must be an independent sound, processed
independently by my ear and brain combination, for me to be able to
ignore it.
AUDIOPHILES WHO LIKE TAPE BETTER - Back when I used to read audiophile
articles, many audiophiles seemed to not trust digital at all. Some of
them talked about the "fat sound" of tape, where they said nothing was
being thrown away. They said that as soon as someone deals in an
algorithm and goes digital, they are combining things and throwing sound
away. Well, my thought is - what if they are right and wrong at the same
time? Perhaps engineers make no errors at all in what they throw away -
but it is the process or method of combining that is bad?
MIDI MUSIC PLAYING THE FIRST TIME - Other than live music, the only time
I recall in the last ten years being "enthralled by music" was some of
the times I played my midi music back and converted it to audio for the
first time. Now my system is limited to 28 separate notes at a time and
no more than 16 instruments for the midi portion. But let us say that 10
instruments playing, with 20 note sounds total among them, are far more
than the two sound sources I normally hear from stereo. Now someone can
argue that even though those sounds are being made fresh by a Roland
Sound Canvas that converts midi to audio, the end result must still come
out of two speakers. I agree with that. But I think that it is an analog
addition of those sounds and not a digital one. I also now believe that
speaker distance enters into all this somehow. But anyway, what I am
saying here is a value judgment that I am only later trying to
understand. But yes, a number of times playing back midi to speakers for
the first time, I felt "wow!! - a lot of sounds there!".
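
As a small illustration of why midi stays "un-mixed" until the very end,
here is a sketch (my own, in Python, using the third-party mido library
and a placeholder file name) that counts the note events a midi file
keeps on each channel. At this stage every instrument is still just its
own list of notes; no combined waveform exists yet.

    # Sketch: a midi file stores each instrument's notes separately, per
    # channel - nothing has been mixed into a waveform at this point.
    # Assumes the third-party "mido" library (pip install mido) and the
    # placeholder file name "song.mid".
    from collections import Counter
    import mido

    notes_per_channel = Counter()
    for track in mido.MidiFile("song.mid").tracks:
        for msg in track:
            # a note_on with velocity > 0 marks the start of one played note
            if msg.type == "note_on" and msg.velocity > 0:
                notes_per_channel[msg.channel] += 1

    # Each channel (instrument) keeps its own note events; only a synthesizer
    # such as a Roland Sound Canvas turns them into actual sound.
    for channel, count in sorted(notes_per_channel.items()):
        print(f"channel {channel:2d}: {count} note events")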
THE EAR ERROR I BELIEVE ENGINEERS MAKE
Having been an engineer and around engineers a lot, I have at times seen
them get the complex thing very right but miss something outside what
they consider their own project, since they are not "looking in that
direction". I think that is what we have here. I believe that engineers
may have looked at an oscilloscope or frequency analyzer to see what the
composite of 10 audio signals in a room looks like and then figured out
how to combine them. The problem is not what they did. The problem is
that they may have assumed our ears combine audio like an oscilloscope
and/or frequency analyzer using a single microphone, whereas perhaps
that is not what our ears do at all. Our ears may have the equivalent of
"many microphones" that can separately deal with many sound sources at
the same time.
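
To make concrete what I mean by that composite, here is a small sketch
(my own illustration, in Python, with made-up frequencies) of what
digital mixing normally does: it adds the sample values of the separate
sources into one waveform - the same single trace an oscilloscope would
show - and after that addition there is only one signal left, not ten.

    # Sketch: digital mixing adds separate sources sample-by-sample into one
    # composite waveform - the single trace an oscilloscope would display.
    # The source signals below are made-up examples.
    import numpy as np

    rate = 44100                    # samples per second (CD rate)
    t = np.arange(rate) / rate      # one second of sample times

    violin = 0.3 * np.sin(2 * np.pi * 440.0 * t)   # stand-ins for separate
    flute  = 0.3 * np.sin(2 * np.pi * 880.0 * t)   # sound sources in a room
    drum   = 0.3 * np.sin(2 * np.pi * 110.0 * t)

    composite = violin + flute + drum   # the mix: one waveform, not three

    # (44100,) - a single signal; the three sources no longer exist as
    # separate things inside it
    print(composite.shape)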
MY POINT
I think our ears are far better than we have given them credit for. I
believe each ear does not hear just one sound but can distinguish many
sounds from many sources. In that manner, as soon as sounds are combined
digitally, much is lost. And my ears tell me that much .... is lost ...
after digital combining. I think that even the very simple example of
why 5.1 surround sound or 7.1 surround sound is more impressive proves
that point all by itself.
FURTHER OBSERVATIONS
I find that listening to a midi song playing through speakers before
mix-down can give results I will never hear after the mix-down to
digital. I seem to hear almost all the sounds distinctly and clearly,
and therefore I sometimes foolishly think that I have adjusted my volume
settings, and my volume changes during the song, correctly. But when I
get to the digital mix-down, some of what I thought I heard just a bit
ago seems missing. Actually, in fairness, it seems to be there but at a
lower sound level. It is as if, being now included in one sound source,
it has to compete far harder in volume level to be heard than when it
was a distinct and separate sound.
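
A possible technical reason for that, sketched below with made-up
numbers (my own illustration, not a claim about any particular mixing
software): when many tracks are summed into one signal, the sum usually
has to be scaled down so it does not clip, so every individual part ends
up at a lower level inside the mix than it had on its own.

    # Sketch: why a part that stood out before mix-down can seem quieter after.
    # When many tracks are summed into one signal, the sum must be scaled
    # down to avoid clipping, so each individual part loses level.
    # The track count and levels are made-up examples.
    import numpy as np

    rate = 44100
    t = np.arange(rate) / rate

    # ten tracks, each at a comfortable peak level of 0.5 on its own
    tracks = [0.5 * np.sin(2 * np.pi * 110.0 * (i + 1) * t) for i in range(10)]

    mix = np.sum(tracks, axis=0)
    peak = np.max(np.abs(mix))
    if peak > 1.0:            # 1.0 is digital full scale
        mix = mix / peak      # normalize the whole mix to avoid clipping

    # each part now sits well below the 0.5 it had by itself
    print("scale factor applied to every part:", round(1.0 / peak, 3))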
Also, headphones used for sound mixing oddly seem to be more along the
lines of the final digital mix-down product than my speaker system
amplifying what are likely analog combined signals. I am not sure why.
Perhaps for our ears to hear distinct audio sounds, the sources must be
at a distance, or at different angles, or something, for us to
distinguish them. I do not have the answer. I can only say that there
are big differences to my ear between analog sounds from speakers,
headphones, and digital mix-downs.
WHAT THIS MEANS TO ME REGARDING COMPRESSION
At the risk of getting everyone mad at me, I think we lose much as soon
as we digitally mix many different sound sources into a single sound
source. And I think the problem is that no matter how great the genius
of the engineer, this process seems to take multiple and distinct sound
sources and combine them into a single, very complex sound source. And
in that process, the distinct nature of the different sounds seems lost,
and what is left now competes far, far harder at any volume setting to
be heard - or is lost in the background forever. Of course some types of
compression are better than others, and I have looked into that as have
others, but digital music seems to have one thing in common - to my ears
it seems a single audio source rather than the joy of many distinct
audio sources.
To me, digital mixing of music is like a photo of the beach. Now,
depending on codec and compression, some photos will be brighter or have
more contrast, but they are still photos. They are not the beach. At the
beach our eyes see things in three dimensions. We hear beach sounds. Our
toes feel the sand. We feel the wind and see the birds fly. A photo
might be the best we have, but it is not the beach. Of course an
engineer can say that is not a fair analogy, since many senses are
involved there. Well, customers are not always reasonable people. But as
engineers we should try to get as close as we can.
MY SUGGESTED FIX
If the above is correct, then the issue is not a technical error made in
the combining - it is the combining approach in the first place. If I am
right in modeling the ear as many microphones, then digitizing would
mean perhaps finding a way to keep 20 or more sounds separate and not
combine them into a single waveform. That is easier to say than to do.
But perhaps one should start with a tape and see why some say it gives a
better output. We can say those audiophiles are lunatics, but what if
they are right and just do not know how to state the issue well?
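
I do not claim to know the right implementation, but as a rough sketch
of the idea (my own illustration in Python, with made-up sources),
"keeping the sounds separate" could simply mean storing the digitized
sources side by side as separate channels, instead of adding them into
one waveform, and leaving any combining to the playback end:

    # Sketch of the idea only: digitize many sources but keep them as separate
    # channels (one row per source) rather than summing them into one waveform.
    # The sources and their count are made-up examples.
    import numpy as np

    rate = 44100
    t = np.arange(rate) / rate

    # twenty separate sources, kept separate - a 2-D array, one row per sound
    sources = np.stack(
        [0.2 * np.sin(2 * np.pi * (100.0 + 37.0 * i) * t) for i in range(20)]
    )
    print(sources.shape)          # (20, 44100): twenty independent signals

    # by contrast, the usual mix-down throws this structure away:
    mono = sources.sum(axis=0)
    stereo_mixdown = np.stack([mono, mono])
    print(stereo_mixdown.shape)   # (2, 44100): only two combined signals remain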
Another approach would be to take a group of listeners that likely will
not begin with a preconceived answer and yet will be quite fair: one
could do an experiment with dogs. Dogs seem quite good at hearing small
sounds, such as the sound of their human's car as it pulls into the
driveway. One could try multiple recording methods, and then watch the
reaction of the dogs to the sounds played back.
Yes, I know, some will say that I am mad for not "going with the flow"
here. But I think there is something here - otherwise, why do some think
5.1 surround sound is different enough from stereo to buy 4 more
speakers?
------------
THE SUBJECTIVE SIDE
The above is only about the technical side of music recording, and it is
almost solely based on doing home audio recordings and listening at
various stages. However, at least for my tastes, musical enjoyment
becomes far more complex based on a number of subjective issues. I
realize that for engineers, this may be where the "good stuff" ends,
since subjective is subjective, and yes, I agree, engineers can do very
little about fixing subjective items they cannot control.
What You Begin With
For my tastes it is still far more important what you begin with than
the recording method itself. I see that so much in home recording. But
first let us take the case that, for me, the "who" - the person or group
being recorded - makes a huge difference. If my granddaughter sent me a
song as an mp3, of course I would listen to it, and right away, and
treasure it. And I would likewise do that for relatives and friends as
well. That would be far more important to me than listening to a higher
quality recording of, say, a group I did not know or like.
There are also some professional groups that may not be available in
high quality recordings. One of those, a group I believe was called Los
Admiradores, long ago came out with a phonograph record called "Bongos,
Bongos, Bongos" that I admired very much. Great orchestration and clever
song arrangement for what appeared to be a 6- or 7-piece band. To my
knowledge this record never made it to what even I agree is the higher
quality digital sound. And so I honestly spent hours recording the
phonograph record and using scratch removal software, effects, and the
like to try to get a modern copy. But even with the problems, I would
far rather listen to their music in low audio quality than to many items
of not-so-good music in a better quality recording method.
Mix Down Quality Matters More
Time and again I find that, unless I did a great job during the music
mix-down from multiple sources into a single digital stereo output, the
quality of that mix-down matters far more than what compression method
is used later. It would be convenient and nice for me to blame the
compression method, but I find too often that it was the person doing
the mix-down who had far more to do with quality than anything done
later. Of course this is generally a well-known engineering principle -
that you have to start with something good in the first place, because
seldom can you make bad sound better. You can often make it worse, or if
you are really good, it can stay the same, but better than the original
is often not a reasonable goal.
And then there is "fix it in the mix"
"Fix it in the Mix" is a recording person's expression for trying to fix
less than perfect sound recordings at the same time we mix down to the
final stereo output. Perhaps a singer coughed; if so, our deft fingers
may shove the volume for them (that single vocal track) down right at
that time, for just a second, if we can avoid cutting into their
singing. Perhaps there was noise at the beginning or end. That is easy,
we kill those sounds. What the singer did great but ended a phrase wrong
in just one place? No problem, we lower the volume right at that point.
And in my case, since I am a one person band, all of these issues come
from me.
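
For anyone curious what those "deft fingers" amount to in digital terms,
here is a small sketch (my own illustration, with a made-up vocal track
and a made-up cough time) of pulling one track's volume down for about a
second before it goes into the mix-down:

    # Sketch: "fix it in the mix" by ducking one track's volume around a flaw.
    # The vocal stand-in, the cough time, and the amounts are made-up examples.
    import numpy as np

    rate = 44100
    seconds = 10
    t = np.arange(rate * seconds) / rate
    vocal = 0.5 * np.sin(2 * np.pi * 220.0 * t)   # stand-in for a vocal track

    gain = np.ones_like(vocal)           # unity gain everywhere...
    cough_start, cough_end = 4.2, 5.2    # ...except around the cough (seconds)
    mask = (t >= cough_start) & (t < cough_end)
    gain[mask] = 0.1                     # pull the fader down for that second

    fixed_vocal = vocal * gain           # this version goes into the mix-down
    print("samples ducked:", int(mask.sum()))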
Well, okay, now let us say someone did a recording and they are not
Frank Sinatra backed up by the real Paul Mauriat Orchestra or Percy
Faith Orchestra. Okay, there are going to be days when a less-than-perfect
group may actually sound better with some nuances missing. That means
that an mp3 of them might in fact sound better, since it eliminates some
of the problem sounds and instead concentrates on the better sounds. And
yes, unfortunately, I have seen that also - and most likely with my own
recordings. Anyone remember the Dolby days when Dolby and DBX were used
to kill noise? Some days they killed some music along with it. Sometimes
that can be a feature and sometimes not.
If one adds the subjective items onto the technical items, the recording
method or compression method becomes a most complex subject - at least
for me.
Ronald J. Plachno
October 16-19, 2013