

How can a speaker cone work properly?

wulbert

pfm Member
Clearly they do, but something about the way that (I understand that) speakers work is unclear to me.

If a speaker cone is, say, reproducing an "A" played on a guitar, vibrating at 440 Hz (IIRC), what then happens when it is required to reproduce a cymbal crash? Does it still vibrate at 440 Hz, but then incorporate the "cymbal" sound into the movement, i.e. vibrating at "A" but with another oscillation overlaid or incorporated?
What happens when another note is added, producing a chord? And another, and another, then a snare drum?
Is there a limit to the complexity of sound that a speaker cone can convey before it all just becomes a mush? There doesn't seem to be any limit, which to me feels counter-intuitive.

We can only sing one note at a time with our vibrating vocal cords (as far as I know), so how come a vibrating paper cone can do so much more?

I realise I may be exposing my own stupidity with this question, but happily I am at an age when I no longer care.
 
I think you already have the gist of it: the higher frequencies always ride on the back of the lower frequencies, and
the electrical AC music waveform should be followed exactly by the driver.

Your question: does it have limits? Yes it does, which on one hand explains the need for multi-way loudspeakers.
In my experience virtually all loudspeakers will play a steady signal... eventually... maybe after 5 or more cycles.
But loudspeaker drivers that respond exactly to the voltage input of a dynamic music signal are in the minority, in my experience, however controversial that may appear.
 
... If a speaker cone is, say, reproducing an "A" played on a guitar, vibrating at 440 Hz (IIRC), what then happens when it is required to reproduce a cymbal crash? Does it still vibrate at 440 Hz, but then incorporate the "cymbal" sound into the movement, i.e. vibrating at "A" but with another oscillation overlaid or incorporated?
What happens when another note is added, producing a chord? And another, and another, then a snare drum?
Is there a limit to the complexity of sound that a speaker cone can convey before it all just becomes a mush? There doesn't seem to be any limit, which to me feels counter-intuitive. ...
It's a good question but quite complex to answer simply.

In HiFi products we mostly demand they have a property called "linearity". That means they do indeed handle the guitar's "A" and the cymbal sound independently. The displacement of a 'speaker's cone/dome/film is indeed just the sum of the displacements that would have happened if you had applied the two sources separately.
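To make "linearity" concrete, here's a little Python sketch. The one-pole low-pass filter is just an invented stand-in for any linear system (it is not a driver model), and the two tones are example values:

```python
import numpy as np

def lowpass(x, alpha=0.1):
    """A one-pole low-pass filter: a stand-in for any linear system."""
    y = np.zeros_like(x)
    for n in range(1, len(x)):
        y[n] = y[n - 1] + alpha * (x[n] - y[n - 1])
    return y

fs = 8000
t = np.arange(fs) / fs                  # one second of samples
a440 = np.sin(2 * np.pi * 440 * t)      # the guitar's "A"
c262 = np.sin(2 * np.pi * 262 * t)      # a second note (middle C)

# Linearity: the response to the mix equals the sum of the
# responses to each source played alone.
mixed = lowpass(a440 + c262)
summed = lowpass(a440) + lowpass(c262)
print(np.max(np.abs(mixed - summed)))   # essentially zero
```

A real driver is only approximately linear, which is exactly where the distortion products discussed below come from.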

However, all real systems have "non-linearity" to some (usually weak) extent. Non-linearity creates small new signals that are not present in the sum of the sources.

With one source, non-linearity causes harmonic distortion: second, third, ... harmonics.

With two sources you get the harmonics from both but also some new "intermodulation" products at non-harmonic difference and sum frequencies. A 440 Hz note and a 262 Hz note together create a weak note at 440 - 262 = 178 Hz, and also a weak note at 440 + 262 = 704 Hz, and some more complex products too.

With multiple sources as in normal music, intermodulation gets very complex and in the worst cases produces mushy sound.

Most engineers strive to make sure these unwanted new signals are as weak as possible. Intermodulation products are usually much more objectionable to human listeners than harmonics.
 
I've often thought about this myself, it is a good question. Maybe things have changed these days but for years my assumption was that it was one of the many things we take advantage of without fully understanding. Like... electricity.
 
I've often thought about this myself, it is a good question. Maybe things have changed these days but for years my assumption was that it was one of the many things we take advantage of without fully understanding. Like... electricity.

Well, *no-one* understands electrons. But they go on being electrons, regardless. :)
 
A useful analogy as a thought process - for me, anyway - how can the complexity of anything at all, not least music, be reduced to strings of 0s and 1s, and still be "reconstructed" accurately?

If any sound was cut into isolated segments of very brief duration, could anyone recognise anything but one note, let alone what instrument was being played?

Add in the effects achieved by the internal software being used for listening....
 
If the speaker driver vibrates, at any given instant, at the sum of all the frequencies applied to it (which is what I understand is happening, in effect as the voltage applied will be the sum of all the voltages in the bit of the signal for that instant) then it is the brain which reassembles the resulting complex waveform into its perceived component parts. How it does that is a bit of a mystery, I think. But it doesn't feel unreasonable to consider that one purpose of hifi is to make the brain's job easier in that regard. And if (controversy alert) that means exploiting the placebo effect to help the brain get to where it needs to be, then why not?
 
Clearly they do, but something about the way that (I understand that) speakers work is unclear to me. ...
Addressing the "how" part of the question: if anyone is really interested in how loudspeakers avoid producing harmonic and intermodulation distortion, there are some really comprehensive papers available from Klippel here, but most people will not want to get into that level of detail.

Loudspeakers produce numerically the highest levels of non-linear distortion in a typical audio system. A complex mix of good mechanical, magnetic and electrical engineering is needed to make them sound clean. Without any criticism of personal preference, I have observed that people sometimes seem to like the specific sonic signature of their 'speakers and find really low distortion 'speakers that might be preferred in a studio to be too clinical and over-detailed. As always, it's a case of YMMV.
 
There's also the point that, actually, one note from the throat is more than one 'tone'. We recognise individuals' voices from the particular tonal qualities, which wouldn't be possible if we all emitted just one tone.
 
If a speaker cone is, say, reproducing an "A" played on a guitar, vibrating at 440 Hz (IIRC),

That guitar A note won't be a single frequency to begin with. In fact, even its fundamental won't remain constant over time.

The notion of 'frequency' is an abstraction, with little direct relevance to any real sounds or signals. A single frequency corresponds to a sine tone extending infinitely in time. That would make for pretty boring music.

Any practical sound occupies a continuum of frequencies. All of these are carried by that vibrating string. By that expanding and contracting volume of air. By that speaker cone.
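You can see that spreading numerically: an endless sine at an exact bin frequency occupies a single FFT bin, while a short burst of the same tone (as any real note is) smears across many. The 100 ms duration and the amplitude threshold below are arbitrary choices for illustration:

```python
import numpy as np

fs = 8000
n = fs                                  # one second -> exact 1 Hz bins
t = np.arange(n) / fs
tone = np.sin(2 * np.pi * 440 * t)      # 'endless' within the analysis window

burst = tone.copy()
burst[n // 10:] = 0.0                   # a 100 ms note, then silence

def occupied_bins(x, floor=1e-3):
    """Count frequency bins with magnitude above a small threshold."""
    s = np.abs(np.fft.rfft(x)) / len(x)
    return int(np.count_nonzero(s > floor))

print(occupied_bins(tone))              # a single bin at 440 Hz
print(occupied_bins(burst))             # many bins: the spectrum has spread
```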
 
Clearly they do, but something about the way that (I understand that) speakers work is unclear to me.

If a speaker cone is, say, reproducing an "A" played on a guitar, vibrating at 440 Hz (IIRC), what then happens when it is required to reproduce a cymbal crash? Does it still vibrate at 440 Hz, but then incorporate the "cymbal" sound into the movement, i.e. vibrating at "A" but with another oscillation overlaid or incorporated?
What happens when another note is added, producing a chord? And another, and another, then a snare drum?
Is there a limit to the complexity of sound that a speaker cone can convey before it all just becomes a mush? There doesn't seem to be any limit, which to me feels counter-intuitive.

We can only sing one note at a time with our vibrating vocal cords (as far as I know), so how come a vibrating paper cone can do so much more?

I realise I may be exposing my own stupidity with this question, but happily I am at an age when I no longer care.

You are definitely not stupid; there are so many layers and facets to your question that it is difficult to know where to start. Speaker cones are based on old technology: wax cylinders with the 'needle' running in a groove and a simple horn. Again, with early 78 shellac records the needle in the groove excited a diaphragm which was amplified by a tin horn. We also had radio with simple speakers. All these sounds are complex patterns which change instant to instant. Mathematically, think of the way calculus works: adding up many small steps reveals the bigger pattern. If you listened to a tiny slice of the music, say half a second, I think it would be difficult to recognise, but add the slices together and you have music. As others have said, what stops it being a mess is the design and quality of the transducer, given that the source and amplification are of good quality.
The other element is how our brains process the information. Although hearing perception is much simpler than vision, it is faster, and I guess that our brains learn to make sense of the noise, i.e. from previous experience.
There are papers and associated research (which would take me too much time to document) giving insights into how our brains process audio. What is certain is that our interpretation varies person to person, which makes what we think we hear a difficult basis for sharing information on the performance of hi-fi systems. Hence the need to reference the engineering parameters as part of the appraisal. As you might guess, I don't understand it well enough to explain fully.
 
If a speaker cone is, say, reproducing an "A" played on a guitar, vibrating at 440 Hz (IIRC), what then happens when it is required to reproduce a cymbal crash? Does it still vibrate at 440 Hz, but then incorporate the "cymbal" sound into the movement, i.e. vibrating at "A" but with another oscillation overlaid or incorporated?
What happens when another note is added, producing a chord? And another, and another, then a snare drum?
Is there a limit to the complexity of sound that a speaker cone can convey before it all just becomes a mush? There doesn't seem to be any limit, which to me feels counter-intuitive.

Logically, consider a loudspeaker to be the other end of the chain to a microphone. They are nearly identical technology, the loudspeaker just having a larger diaphragm/cone. You can actually use a loudspeaker as a microphone, and some headphones (e.g. Sennheiser HD414) have a slightly tweaked mic capsule as a driver.

A mic captures sonic vibrations in the air by moving in sympathy with them and converts that movement to an electrical signal. The loudspeaker does the reverse, it converts that electrical signal back into vibrations in the air. It is reproducing/recreating the movements picked up by the microphone diaphragm and placing them back as energy in the air. That it works as well as it does is amazing, but that’s certainly the concept at its most basic.
 
The simple thing to remember is that a microphone takes the differences in air pressure and converts them to instantaneous varying voltages, and a loudspeaker just does this in reverse.

You can forget about individual instruments and their frequency etc, all that counts are the changes in air pressure that the instruments combine to make and that these are converted to voltage and back with the lowest possible distortion.
 
Clearly they do, but something about the way that (I understand that) speakers work is unclear to me.
...
We can only sing one note at a time with our vibrating vocal cords (as far as I know), so how come a vibrating paper cone can do so much more?

I realise I may be exposing my own stupidity with this question, but happily I am at an age when I no longer care.
It's a fair question, but it is not peculiar to speaker cones. Whether in terms of air pressure (the sound you hear), a voltage (in the amplifier or cable), a displacement of a cone, or a groove in a record, all of the possible sound components add up to a single signal which fluctuates with respect to time. As @John Phillips points out, that addition isn't quite perfect in practice, but in essence that's what it is. You have one voltage per channel in the cable and one displacement of the speaker diaphragm. And this single variable with respect to time represents a continuous spectrum of components over a range of 20 Hz to 20 kHz or more.
This point crops up again and again in different forms.
 
A useful analogy as a thought process - for me, anyway - how can the complexity of anything at all, not least music, be reduced to strings of 0s and 1s, and still be "reconstructed" accurately?

If any sound was cut into isolated segments of very brief duration, could anyone recognise anything but one note, let alone what instrument was being played?

Add in the effects achieved by the internal software being used for listening....

You realise that atoms are 'lumps' and that the sound pressure changes you 'hear' comes from a given number of them banging up against your ears / sec at their average velocity?

'Analogue' in reality is like that. Loads of 'bits' of something. Your senses average over what they do. For good human hearing the average being over about 1/40,000th of a second as your sensing system takes that sort of time to get an average it can notice.

All 'analogue' signals are thus in effect 'modulated noise'. Thus all real-world patterns of audio have to be accompanied by noise. As a result, if a sound is too weak it can't be heard below the noise level, as your ears aren't perfect. And if the variation is too fast you won't notice it, because it is too fast for your hearing to follow.

The physics is more fundamental than that. But the above is the outcome. If you can face seeing equations I can recommend (ahem) a good free pdf book that explains more specifically. :)
 
If the speaker driver vibrates, at any given instant, at the sum of all the frequencies applied to it (which is what I understand is happening, in effect as the voltage applied will be the sum of all the voltages in the bit of the signal for that instant) then it is the brain which reassembles the resulting complex waveform into its perceived component parts. How it does that is a bit of a mystery, I think. ....
The brain does a lot of work, but actually the sound is broken up into frequency components by a physical process in your cochlea. In crude terms, you shake it and the different frequencies wiggle different parts of the cochlea (the ear actually does part of the work by looking at where the cochlea is wiggled, and part by looking at how fast the wiggling is). You'll find lots of resources on the internet which explain this. Of course the brain then interprets the information it gets...
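A crude sketch of that idea in Python: a bank of two-pole resonators standing in for 'places' along the basilar membrane. This is an illustration, not physiology; the centre frequencies and the `r=0.99` pole radius are arbitrary choices:

```python
import numpy as np

def resonator(x, fc, fs, r=0.99):
    """Two-pole resonator tuned to fc: a crude stand-in for one
    'place' on the basilar membrane, responding mainly near fc."""
    w = 2 * np.pi * fc / fs
    b0 = (1 - r) * np.sqrt(1 - 2 * r * np.cos(2 * w) + r * r)  # unity peak gain
    y1 = y2 = 0.0
    out = np.empty_like(x)
    for n in range(len(x)):
        y0 = b0 * x[n] + 2 * r * np.cos(w) * y1 - r * r * y2
        out[n] = y0
        y2, y1 = y1, y0
    return out

fs = 8000
t = np.arange(fs) / fs
mix = np.sin(2 * np.pi * 440 * t) + np.sin(2 * np.pi * 1500 * t)  # two tones at once

# Each 'place' wiggles strongly only for its own band of the shared waveform:
for fc in (440, 1500, 3000):
    rms = np.sqrt(np.mean(resonator(mix, fc, fs) ** 2))
    print(fc, "Hz place, RMS response:", round(float(rms), 2))
```

The 440 Hz and 1500 Hz channels respond strongly; the 3000 Hz channel, with no component in its band, barely moves.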
 
Even more fun than that, the way the brain is informed about what is going on are the cilia, the little hairs in there that ... respond to different frequencies.

Yes, you could read that as human hearing having a digitising component - at least, a bunch of frequency-selective sensors - and it works out what is going on from the samples it receives / which areas are triggered, on a continuous basis. And that process can readily induce as much as 30% THD - but we don't perceive it at all.

The wetware between the ears is amazing.
 
(PS that's why hearing can change so much through over-exposure to loud noise, or the natural decline with age - the shortest cilia / HF lose sensitivity, and some get damaged by abuse - excess energy/loudness and so on - making holes in the pick-up, if you like. And the cruel joke is that it's right in the area of most sensitivity, the 1-3 kHz midrange, where both effects collide - hence presbycusis; it's right where the fricative components in speech are, the consonants, which carry all the information in speech in any language. Hence why hearing what someone said gets harder as you get older... or are over-exposed to loud noises...)
 
The simple thing to remember is that a microphone takes the differences in air pressure and converts them to instantaneous varying voltages, and a loudspeaker just does this in reverse.

You can forget about individual instruments and their frequency etc, all that counts are the changes in air pressure that the instruments combine to make and that these are converted to voltage and back with the lowest possible distortion.

Yes, that's a good point and distinction that you (and Tony L and others) are making. I'd been thinking about the speaker reproducing specific frequencies, but differences in air pressure feels like a much simpler task.
 

