Pondering Images

Via the South Bend Seven comes this interesting post on images at Barbarian Blog.

The total number of pixels [on an HDTV screen] is 1920 horizontally x 1080 vertically = 2,073,600 pixels. There are 256 possible intensities of red, green and blue for each pixel, so that's 256^3 = 16,777,216 possible colors. To figure out how many possible images there are, we need to raise the second number to the power of the first, so 16,777,216^2,073,600 = 1.5 * 10^14,981,180 possible images. That's a pretty big number - it's almost a million and a half digits long. Printing it in 10 point Monaco would take over 2,700 pages of paper. Scientists estimate that there are 10^80 atoms in the observable universe - a tiny number in comparison.

However big it may be, the fact that the number is finite is a surprising thing to realize. It means that every possible image has a unique ID number. So instead of asking me, "did you see that picture of MIA performing pregnant at the Grammys", you might ask, "did you see image number 1,394,239,...,572?" Obviously that is totally impractical and it would make you a huge nerd, but it's interesting that you could.
More in the same vein at the link. I was surprised that the number of states a video screen could be in was so much larger than the molecules in the universe.
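
The arithmetic in the quoted passage is easy to check. The Python sketch below is not from the original post and its variable names are only illustrative. It works with logarithms instead of building the full integer. By this calculation the count comes to roughly 1.5 x 10^14,981,179, a number with about fifteen million digits rather than a million and a half, which fits the 2,700-page figure at around 5,500 digits per page.

```python
import math

WIDTH, HEIGHT = 1920, 1080          # HDTV resolution from the quoted post
LEVELS_PER_CHANNEL = 256            # 8 bits each for red, green and blue

pixels = WIDTH * HEIGHT             # pixels on the screen
colors = LEVELS_PER_CHANNEL ** 3    # colors available to each pixel

# The image count is colors ** pixels. That integer is enormous, so work
# with its base-10 logarithm instead of computing it directly.
log10_images = pixels * math.log10(colors)
exponent = math.floor(log10_images)
mantissa = 10 ** (log10_images - exponent)
digits = exponent + 1

print(f"possible images ~ {mantissa:.1f} x 10^{exponent:,}")
print(f"decimal digits: {digits:,}")
print(f"digits per page over 2,700 pages: {digits / 2700:,.0f}")
```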

24 Comments

  1. epobirs:

    Yes, I am a huge nerd. A long time ago I was thinking about compression methods and had thoughts along the same lines, trying to consider at what point it became more practical to assign everything a reference number rather than storing the image itself in compressed form.

    Of course, that level of color depth is becoming obsolete. A lot of newer gear is capable of a much wider range and this is expected to become standard eventually as the demand for new features and higher quality works its way down the price scale.

    Another experiment I always wanted to do, if I had the programming talent (even if I did, the ADD runs strong in my line), was looking into the potential for what could be done with spell-check as a native OS feature. This inherently gets you a dictionary with every word assigned a minimum number of bytes for lookup. Four bytes per word would provide enough entries for several languages, per http://hypertextbook.com/facts/2001/JohnnyLing.shtml , and storing documents using these tokens could deliver a substantial reduction in data volume.

    Thing is, people don't get concerned much about the data volume of text anymore and the system requires a standardized dictionary that is open source to work, to ensure a file is readable across all OS implementations. So you have a big open source project and some resource requirements before any benefits are in sight, for a compression measured in kilobytes. But as a mental exercise it's fun to look at what could have been done if spell-check had been considered a critical OS level feature back in the early 90s.
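
For readers who want to see the token idea concretely, here is a minimal Python sketch. It is only an illustration: the six-word `DICTIONARY`, the `encode` and `decode` helpers, and the big-endian 4-byte layout are all assumptions made for the example, not part of any real OS feature or file format.

```python
import struct

# Hypothetical shared dictionary. A real one would hold every word of
# several languages and would have to be identical on every system.
DICTIONARY = ["a", "compression", "dictionary", "exercise", "mental", "standardized"]
WORD_TO_ID = {word: index for index, word in enumerate(DICTIONARY)}


def encode(text):
    """Replace each space-separated word with a 4-byte big-endian token."""
    return b"".join(struct.pack(">I", WORD_TO_ID[word]) for word in text.split())


def decode(data):
    """Look each 4-byte token back up in the shared dictionary."""
    return " ".join(DICTIONARY[index] for (index,) in struct.iter_unpack(">I", data))


text = "a standardized compression dictionary"
tokens = encode(text)
print(len(text.encode("ascii")), "bytes as text,", len(tokens), "bytes as tokens")
print(decode(tokens))
```

The saving comes entirely from long words. A one-letter word such as "a" costs four bytes as a token but only two as a letter plus a space, so the gain depends on the average word length of the document.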

  2. roger the shrubber:

    yeah, but for REAL fun with numbers, consider the physics "multiverse" theory. basically, the theory is that every time a choice of any kind is made - by ANYone - a split occurs, and another universe springs into existence. so if you go to starbucks today for coffee instead of 7-11, there's now another universe **just like this one** except in that one, you went to 7-11. now: consider all the people who ever lived, each one making literally thousands of choices and decisions every day. start doing multiplication, and you get into VERY large numbers real real quick. that's a lotta realities. the "germans win ww2" universe. the "nixon gets a spot on mt. rushmore" universe. the "cindy crawford falls for a mere shrubber" universe, which is the one *I'M* looking for. (michael crichton explains all this much better than i can - and in layman's terms - in his time-travel [!] novel "timeline".)(and then the brilliant SOB **PROVES** it, which will tend to get you thinking about things like 'infinity' and 'God': if all things that can happen do happen, [somewhere], then A)does the big guy really exist? if all things are literally possible, why would God be necessary? and B)if all those quadrillions times googolplexes to the quintillionth power universes are out there [dark matter? dark energy?] and God has dominion over all of them, and can keep track of them all, then... - by definition - when we think of God, we think waaaaay too small, no?) (see pp 109-114 in the hardback edition.)

    none of this explains how gigantic pointy-toed clown shoes became fashionable for women, however. or why men's athletic teams would choose "baby blue" as a color to wear while they do battle.

  3. Matt:

    As an amateur photographer, I have to say that this does not show that there is a finite number of possible images. Only that there is a finite number of images that can be stored / displayed in an HDTV format. This will be true for any digital image format.

  4. Michael:

    These are just the observed states. Can you imagine the number of unobserved states?

  5. delurking:

    "I was surprised that the number of states a video screen could be in was so much larger than the molecules in the universe."

    Actually, I am surprised that you are surprised. There are lots of things that have more possible arrangements than the number of atoms in the universe. It takes a remarkably short strand of DNA to pass that threshold, for example.
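
How short is "remarkably short"? A quick Python check, assuming the 10^80 atom estimate quoted in the post and four possible bases per position (the constant names are just for illustration), finds the smallest strand length whose number of possible sequences exceeds that estimate. It comes out at 133 bases.

```python
ATOMS_IN_UNIVERSE = 10 ** 80   # estimate quoted in the post
BASES = 4                      # A, C, G and T

# Find the smallest strand length n with BASES ** n > ATOMS_IN_UNIVERSE.
# Python integers are exact, so no rounding is involved.
length = 0
while BASES ** length <= ATOMS_IN_UNIVERSE:
    length += 1

print("bases needed:", length)
```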

  6. enoriverbend:

    "Only that there is a finite number of images that can be stored / displayed in an HDTV format. This will be true for any digital image format."

    Even for analogue images, there is a limit to the informational states contained in the image. For analogue film, we just call the equivalent of pixels 'grain' instead. But there is still a limit to the resolution and contrast, and thus a limit to the number of recorded images that can be unique. Estimates of 'pixel-equivalence' vary, but one believable estimate I've read has 14 million pixel-equivalents for a 35mm photo on decent color film used in a good camera with a good lens.

    And then when the lab prints the photo, there is additional resolution loss and a much lower limit. At this point there isn't nearly as much difference between decent digital and decent analogue photos when printed at the same quality lab.

  7. Bob Hawkins:

    Now you understand that combinatoric is to exponential as exponential is to polynomial.

  8. LoneSnark:

    I believe your math is wrong. We no longer use 24 bit color (8 bits per color) as we now use 32 bit color, which dramatically increases the options.

    That said, this might be a reason everyone loves their 10+ megapixel cameras: the increased possibilities :-)

  9. John Moore:

    I've done some work with compression (optimizing the computations required for a very early digital music synthesizer, and various things in the software world). All modern compression schemes are based on fundamental ideas from Shannon's Information Theory. Shannon starts with an argument somewhat similar to this blog: hey, if both sides of a communication have a big list of possible messages, then they can send over the message number rather than the message itself. Hence cellular phones are highly optimized for human voice. They use (unless they've gone past it since I was up on this) a scheme called "code-book excited linear predictive coding" - which means they have a "code book" made up of the various forcing sounds that go into the human vocal tract, and coefficients for a simple model of how that tract modifies those sounds. Hence they are great at giving you the sound (think image for context) you want, but don't try to play music through them! Most compression schemes in use today are based on some seriously heavy duty math, combined with a lot of heuristic tricks (example above - math in the LPC, heuristic trick in coupling a code-book to it).

    On top of this, encoding of multimedia data can take advantage of known limitations in human sensing. Hence CD's suppress any frequency above about 21 kHz because most humans can't hear that high a frequency, and by suppressing it, they can sample at 44 kHz or whatever. Old fashioned color TV used much more bandwidth to transmit blue and green than for red, because the human visual system sucks badly at seeing red (which is why green hand-lasers are 100x as bright as red ones of the same power).

    In terms of the blog post, this means that the number of *perceivable* (or differentially perceivable) pictures is much smaller than the combinatoric number, and compression algorithms take advantage of that. Also, the number of *interesting* pictures (or parts of pictures) is much smaller than the number of possible pictures, and compression algorithms take advantage of that also.

    All in all, it's an interesting field, but the underlying theory really does start with the thinking in the blog post.
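
The link between the cutoff frequency and the sampling rate mentioned above comes from the sampling theorem: a sampler running at a given rate can only represent tones up to half that rate, and anything higher folds back into the audible band as a false tone. The Python sketch below illustrates that folding for an ideal sampler. The helper name `aliased_frequency` is made up for this example, and 44,100 Hz is the standard audio CD rate.

```python
def aliased_frequency(tone_hz, sample_rate_hz):
    """Frequency at which a pure tone appears after ideal sampling."""
    folded = tone_hz % sample_rate_hz
    nyquist = sample_rate_hz / 2
    # Tones above half the sample rate are mirrored back below it.
    return folded if folded <= nyquist else sample_rate_hz - folded


CD_SAMPLE_RATE = 44_100  # Hz

for tone in (1_000, 21_000, 25_000, 30_000):
    print(tone, "Hz is stored as", aliased_frequency(tone, CD_SAMPLE_RATE), "Hz")
```

This is why the high frequencies have to be filtered out before sampling: a 25 kHz tone that nobody could hear would otherwise be recorded as a clearly audible one.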

  10. Kevin Jackson:

    LoneSnark: 32 bit colour is still 8 bits per colour (RGB), with an additional alpha channel. I don't think that it contributes to actual differences in colours displayed, but I could be wrong. My experience with 32 bit colour is limited.
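
A small Python sketch of the point about 32-bit colour: the pixel is four 8-bit channels, and only three of them select a colour. The RGBA byte order and the `unpack_rgba` helper are assumptions for illustration, since real formats order the channels in different ways.

```python
def unpack_rgba(pixel):
    """Split a 32-bit pixel into four 8-bit channels (RGBA order assumed)."""
    red = (pixel >> 24) & 0xFF
    green = (pixel >> 16) & 0xFF
    blue = (pixel >> 8) & 0xFF
    alpha = pixel & 0xFF      # transparency, not an extra colour component
    return red, green, blue, alpha


red, green, blue, alpha = unpack_rgba(0xFF8000C0)
distinct_colors = 256 ** 3    # alpha does not add to this count

print("channels:", red, green, blue, alpha)
print("distinct colours:", distinct_colors)
```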

  11. IgotBupkis:

    Warren:
    The relevant factor though behind naming images is that, for the most part, you don't care about exactly which of a thousand or more images you've seen, as most of them carry the gestalt you had in mind behind asking, "did you see 'x'?" The gestalt is what you really wanted to transmit, not the specific picture. So a specific image isn't usually important, a class of related images is what you refer to, and that's why the "did you see 'x'?" question is more accurate.

    Warren and epobirs:
    First off, the most common compression idea behind "zip" (and the other similar mechanisms, rar, tar, and so forth) -- Ziv-Lempel Encoding -- in essence does what you suggest, but on a self-contained level. It trades off a small amount of efficiency for more generality (it doesn't care what the data is) and for the advantage of being self-contained (i.e., you don't need an external dictionary). There is a trade-off, but much less than you'd think. LZW (an offshoot of LZ encoding) is remarkably efficient. They suck for jpg images, largely because jpgs are already pretty efficient in terms of what the LZ algorithm does, and they don't do great for most images in general (because images without lots of "sameness" elements don't lend themselves to LZW encoding).

    The key point is that LZW encoding already gets probably 90-95% of the efficiency of your dictionary approach for most data. So why add another level of complexity to get that remaining 5-10%?
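
The effect of "sameness" is easy to see with Python's standard `zlib` module. One caveat: `zlib` implements DEFLATE (LZ77 plus Huffman coding), the method used by zip, not LZW itself, so this is a sketch of the general Lempel-Ziv behaviour and not of the exact algorithm named above. The random bytes stand in for data that is already compressed, such as a jpg.

```python
import random
import zlib

repetitive = b"did you see that picture? " * 4000   # lots of "sameness"

rng = random.Random(0)                               # fixed seed, repeatable
noisy = bytes(rng.getrandbits(8) for _ in range(len(repetitive)))

for label, data in (("repetitive text", repetitive), ("random bytes", noisy)):
    packed = zlib.compress(data, 9)
    print(f"{label}: {len(data):,} -> {len(packed):,} bytes")
```

The repetitive input shrinks to a tiny fraction of its size, while the random input does not shrink at all.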

    Another reason is "sorta duh" -- our storage capacity has been growing at a phenomenal rate, the amount of storage capacity available on a roughly 5"x9"x2" hard disk drive has doubled approximately every 18 months since before most of the people reading this were born. Anyone here remember the first "personal" hard drives? 5 MEGA bytes, about $1k in 1980 dollars. Floppies at 160 KILObytes -- and costing 5 bucks EACH?? I'll go you one better --
    1978, NASA in Cape Kennedy, touring the computing facilities with the UCF CIS department. Go through a pair of double doors, and you see an aisle flanked by 30 washing machines -- 15 on a side, on raised flooring providing additional cooling. WOWWWWW!!! Each washing machine is a HARD DRIVE. Each washing machine costs about 100k to 150k each, so figure 30 of them is about 3 MILLION dollars of equipment (and note the operating expense, too, with both power consumption AND cooling consumption). Total capacity of all 30 hard drives? 3 GIGAbytes. WOWWWWWWW!. Now, consider -- 3 gigabytes is now so small a capacity that you wouldn't even bother putting it into a Linux machine. Your computer may well have more RAM memory than that. Currently you can purchase a 2 TERAbyte drive that fits into a fraction of the space, with a fraction of the operating expense (and much, much better performance), and costs $190. That's 666x the capacity for 1/15750th of the price. A 10.5 million improvement factor in cost/byte of storage. In 32 years!
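
The improvement factor can be reproduced in a few lines of Python. The figures are the ones given in the comment (decimal gigabytes assumed), and the variable names are illustrative. The last two lines convert the factor into an implied doubling time for comparison with the 18-month figure mentioned earlier.

```python
import math

old_capacity_gb, old_price = 3, 3_000_000    # 30 drives in 1978, per the comment
new_capacity_gb, new_price = 2_000, 190      # one 2 TB drive
years = 32

capacity_ratio = new_capacity_gb / old_capacity_gb
price_ratio = old_price / new_price
cost_per_byte_improvement = capacity_ratio * price_ratio

# How many doublings of bytes-per-dollar does that factor represent?
doublings = math.log2(cost_per_byte_improvement)
months_per_doubling = years * 12 / doublings

print(f"capacity: {capacity_ratio:,.0f}x, price: 1/{price_ratio:,.0f}")
print(f"cost per byte improved {cost_per_byte_improvement:,.0f}x")
print(f"about {months_per_doubling:.1f} months per doubling")
```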

    And to put that into perspective -- I've NEVER thrown away any e-mail, in or out, that isn't flat-out spam, and that's in about 13 years of e-mails. With LZW compression, that entire correspondence would fit on a single blank 4.7 GB DVD, costing about 25 cents. It would probably fit on it without compression, but I haven't tried, offhand (you can start to see why I haven't bothered to do so -- actually sorting through and deciding what to keep or toss would cost more time than it does to buy the storage medium). And this way, I can nominally reference any old e-mail I've ever sent or received (since I've switched from Netscape to Thunderbird in the interim, I'd have to re-install Netscape, but that's another matter).

    So the benefit from actually going for that remaining 5-10% improvement in storage efficiency is really, really trivial. Most people don't even bother to use compressed folders in Windows, for example, even though it's supported them for around a decade. The main usage of Zip these days is more its convenience as a "packaging tool" than because the compression is actually needed.

    John Moore: I have no doubt simplified things quite a bit from your perspective. But it's good enough for the average joe to understand. More in a moment...

  12. IgotBupkis:

    Roger: One issue that we haven't yet figured out is how much "ripple" there is between decisions, and how much the universe seems to actually branch off. "Is the butterfly effect actually applicable on the big picture, as opposed to computationally?" If I decide to snap my fingers, *now*, is the resultant universe actually different in any particular from the universe where I did not do so? How about the one where I *said* I did so, but did not?

    Are those universes -- snap, no snap, claim-to-snap -- different from each other in 100 years? Or do they converge back into the same universe?

    Some decisions are clearly more relevant than others -- the one where I decide to cross the street and get killed by a speeding car is at least different from the other one from my perspective, but even there -- does the "Chinese Relativity Axiom" apply to the universe?

    Is the result of my actions so significant to the universe that it's notably different in any real aspect from the one where I die at 10 from leukemia or at 90 from congestive heart failure?

    And, on the larger scale, some people are clearly going to change things massively with some key decisions. In terms of humanity, it would be a far different place we live in had Khrushchev decided to start tossing nukes back and forth during the Cuban Missile Crisis.

    But now fit THAT into an obvious extension to the Chinese Relativity Axiom: "No matter how great your planet's triumphs, no matter how devastating its failures, approximately 100 billion stars couldn't care less."

  13. IgotBupkis:

    > On top of this, encoding of multimedia data can take advantage of known limitations in human sensing. Hence CD’s suppress any frequency above about 21 kHz because most humans can’t hear that high a frequency, and by suppressing it, they can sample at 44 kHz or whatever.

    Actually, John, though, the problem is when engineers set some of these standards and don't consider anything but the basics.

    A lot of music purists hated CDs because of the above standards, which ignored a key facet of sound applicable to human hearing, and that is that you don't just hear things as isolated tones, but have a much higher discrimination when two tones are played together -- there are harmonic waveforms -- places where the two tones are interacting to produce a much more complex waveform -- which are produced which the human ear can detect, especially when applied to two or more moving tones (as opposed to two fixed tones).

    When those are absent (as they are when some "quality" music, as in that of an orchestra or symphonic grade classical performance), many people familiar with the music can very definitely hear the difference. You hear this complaint as "flat"ness of digital music.

    This is one reason the modern music standard, so-called "MP3" uses a much higher sampling rate and a much higher tonal limit. MP3s can be sampled at much higher bit rates than CDs were, and the frequencies allowed can be much higher (196khz is typical, but I've seen some MP3s that were sampled with 320khz as their max).

    The MP3 standard was extracted from the DVD storage standard for the sound recording of movies. By the time the DVD standard was being devised, the nature of the music purists' complaints had been (believed) understood, and the MP3 standard attempts to address that. There's probably a true limit to the ability of human ears, even "the best" to discriminate, but it's a lot higher than the base level that CDs once defined. I've never heard any legitimate complaints about music stored from scratch using an MP3 algorithm as opposed to the CD algorithms. Not saying they aren't there but I suspect that very few, if any, can reliably hear the difference any more.

    Similarly, in vision -- the 16megacolor limit was initially set and used by engineering types because that is nominally well beyond the maximum color space recognizable by typical human vision (around 10mcolors, IIRC)... that is, until you start to place shades and hues right next to one another, at which point the brain's visual processing facilities CAN discriminate between two otherwise indistinguishable colors.

    That often happens in a picture of a scene that, in the real world, has a "nearly smooth" flow of color from point 'a' to point 'b' (in actuality that's controlled by the limits of your eye's cones and rods, of course).

    In application to the blog post and/or your comments, you no doubt grasp that the jpg algorithm has a "quality level" that can be applied. Different pictures can be compressed more or less than others without noticeable "artifacts" (i.e., visible flaws) in the decompressed image. The most reliable mechanism I've seen for recognizing this is a "blink test" -- flip quickly between decompressed and uncompressed/raw, if there's a visible difference, you'll REALLY see it that way (the human cortex is very good at spotting changes in what it is seeing). But that's for a purist -- in actuality, many pictures will have notable artifacts in a "blink test", but they are subtle enough that you won't spot them short of that unless you're REALLY looking for them -- usually by scanning the edges of something, esp. some visible text in the image.

  14. IgotBupkis:

    P.S., in 2007, there was an article on Tom's Hardware Guide that discussed the storage capacity issue:

    Estimate: One Zettabyte by 2010
    A study on storage trends by IDC shows that we are using billions of gigabytes of storage at an increasing rate. The study predicts that there will be a 57% annual growth rate in storage and that the amount of information created will reach 988 exabytes in 2010.

    Note that very little of that is text.

    I'd include the link but it's dead and gone :(

    Here's another along the same lines:
    Storage Must Prepare For The Zettabyte Universe

  15. Matt:

    What surprises me is that so many people seem to find credibility in the idea that scientists can estimate the number of atoms in the universe. I can maybe see taking a stab at the number of atoms in the *known universe*, but even that is pushing it for me.

  16. epobirs:

    IGotBupkis,

    I know all of that, I assure you. I can still picture that hard drive ad in the first issue of 'BYTE' I picked up back then, not long before I got my first machine, an Atari 800. Years later, when faced with the options for adding memory to a system that shipped with 16 MB for running Win95, I decided I had to spend the extra money to go to 48 MB, rather than the Win95 sweet spot of 32 MB, because it would be 1024 times the memory of my first computer. That made it a moral imperative.

    What made the external dictionary approach interesting to me was the interest I had in putting spell checking everywhere text was entered. I remember when spell check was a separate application from word processors and a document could be examined only after exiting the WP app and loading the spell checker. It was almost more trouble than it was worth for minor documents. When Borland came out with their Thunder app, that made spell check into a Terminate and Stay Resident app, that marked an exciting change in how PCs could be used. You could see a coming era when everything you wanted the computer to do for you was just a command away. Or just there all the time.

  17. epobirs:

    Matt,

    There is nothing odd about it at all. Whenever you've got a set of parameters that you believe to be true, you can start doing math based on those parameters. That doesn't mean the parameters aren't subject to change with the appearance of new data. In fact, the new data may come specifically as part of the process of testing how well the current understanding fits the observable reality. You have to work with what you've got and use that to point out where to look next.

    It's very hard to develop an improved estimate if no previous estimate has been made. Testing an idea requires the idea to be first defined.

  18. IgotBupkis:

    > What made the external dictionary approach interesting to me was the interest I had in putting spell checking everywhere text was entered.

    Well, you can nominally get this with Firefox as far as everyplace you enter text. I don't make spelling errors all that often (a very good natural eye for proofreading), but it does highlight my typos before I spot them normally.

    I think the biggest issue with the idea is that it makes people dumb. They think it can't make mistakes while there are whole classes of errors it can't catch, such as when you're using the wrong homophone -- "bare" vs. "bear" -- or recognizing when you're using the wrong one of "its"/"it's" or "to", "too", and "two".

    Given the abysmal state of public education I see no reason not to make idiots stand out.

    > But as a mental exercise it’s fun to look at what could have been done if spell-check had been considered a critical OS level feature back in the early 90s.

    Considering that Windows is STILL, 15-odd years later, not much more than a BIOS with a GUI front end, that's asking for a lot. I'd be happy if it actually TRACKED resources the way an OS is supposed to. The idea that it still warns you (dunno about Windows 7, haven't used it yet) when you kill the last app that purportedly uses a DLL that "something might still be using it.... are you sure?" before it kills the orphan DLL says they still aren't actually creating a database of who uses what when things are installed -- they're STILL just incrementing a #@$%#$%$# counter associated with it.

    First and foremost, an OS is a database of resources -- what has been installed, which things have permission to access what, and how to access each thing. Not Windows. And I might have considered it reasonable back when W95 came out, but we're well beyond that point. Windows should be nigh uncrashable by now, and pretty close to invulnerable to attack from most sources.

    When Bill Gates goes to hell, he's going to be put to work coding systray apps for Windows 2000 and supporting them personally.

    ================================

    Matt: as epobirs points out -- in your own words, it's an "estimate". Not a concrete number, a guess as to what the correct value is. Capisce?

  19. John Moore:

    IgotBupkis:

    > A lot of music purists hated CDs because of the above standards, which ignored a key facet of sound applicable to human hearing, and that is that you don’t just hear things as isolated tones, but have a much higher discrimination when two tones are played together -- there are harmonic waveforms -- places where the two tones are interacting to produce a much more complex waveform -- which are produced which the human ear can detect, especially when applied to two or more moving tones (as opposed to two fixed tones).

    A lot of music purists are totally ignorant of signal theory, and there are all sorts of wrong explanations of why CD's don't sound right, or why tubes are better than transistors (hint: they absolutely are not).

    BTW, psycho-acoustic testing has shown that humans can hear isolated sine waves much better than more complex waveforms. In fact, we took advantage of this fact to mask aliasing in our system by dithering the clock (your PC does the same thing - something I wrote up in my patent book decades ago but never got around to filing).

    I am well aware of harmonic waveforms (duh, I helped design a music synthesizer). However, it appears that you are confusing harmonics with chords, or perhaps with mixing phenomena. Your description is not coherent from a signal standpoint. However, if there are frequencies present higher than 21 kHz, and there are significant non-linearities present in the system (which, of course audio enthusiasts do their absolute best to REMOVE), then those higher frequencies can mix and produce signals audible in the human frequency range. However, that's a very big if, and I know of no evidence that this effect is used, is part of human hearing, or is consistent.

    On the other hand, a few folks can in fact hear above 21 kHz. If they also just happen to have become music purists while young enough to hear that, then they could hear the loss of high frequency in CD's. CD's were designed to be good enough for almost all humans, not the far outliers in the perceptual space.

    > When those are absent (as they are when some “quality” music, as in that of an orchestra or symphonic grade classical performance), many people familiar with the music can very definitely hear the difference. You hear this complaint as “flat”ness of digital music.

    Actually, in tests, it has turned out that the complaint was due to the music being more accurately rendered than the listeners were used to, so they objected to it.

    > This is one reason the modern music standard, so-called “MP3” uses a much higher sampling rate and a much higher tonal limit. MP3s can be sampled at much higher bit rates than CDs were, and the frequencies allowed can be much higher (196khz is typical, but I’ve seen some MP3s that were sampled with 320khz as their max).

    Higher sampling rates are used because we can afford to do it now, and the higher sampling rates allow lower levels of certain complex aliasing (from amplitude quantization). Also, it shuts up the purists and it satisfies the audio savants with extraordinary hearing.

    > There’s probably a true limit to the ability of human ears, even “the best” to discriminate, but it’s a lot higher than the base level that CDs once defined. I’ve never heard any legitimate complaints about music stored from scratch using an MP3 algorithm as opposed to the CD algorithms. Not saying they aren’t there but I suspect that very few, if any, can reliably hear the difference any more.

    Actually, there is only a very tiny percentage of humans (and only a tiny percentage of music purists) who can discriminate in carefully controlled tests, because the limits of human hearing are very well understood and have been for many decades (the military had a very significant interest in this for a long time). Also, keep in mind that CD's use NO compression (just quantization), while the MP3 algorithm does use compression, throwing away information.

    > Similarly, in vision

    You've got to be real careful in how you interpret these effects. The human eye is very good at distinguishing aliasing. Furthermore, there is a lot of variation in human vision; for example, most humans have 4 primary colors, except for 60% of males who have only three.

    > In application to the blog post and/or your comments, you no doubt grasp that the jpg algorithm has a “quality level” that can be applied.

    Yeah, after a huge amount of thought and studying, I finally "grasped" that.

  20. IgotBupkis:

    John, I bow to your apparently superior expertise. The description I gave is basically derived from how it was explained to me. I lack the underlying engineering practice to argue in favor of it or say that you're right and they are wrong (I have the math background if I wanted to dig into it but don't care enough, since it's not something that is likely to be fruitful, and there's enough on my plate of pure curiosity that I'm not likely to add another thing -- funny thing, that plate. Even though it hasn't shrunk, there never seems to be as much time to feed off it... :D )

    Though you appear to be of a different knowledge, I was told that when two musical instruments interacted, there were more complex waveforms (and that certainly matches my knowledge of the math, when you mix waves) which the ear could hear, even though it could not have heard individual frequencies that short -- The ear could not detect a difference on hearing them independently, but juxtaposed it was capable of detecting a difference. I'll take your disagreement with that assessment under advisement.

    Similarly with regards to color perception. I could show you slide 1, color xxyyzz, and slide 2, color xxyyzy, and you could not tell the difference. Put them both on the same slide, side by side, and you could perceive the difference. You appear to be saying that is not the case, and I'm not factually equipped to dispute you or to concur with you at this time. So I'll take your expression of disagreement into consideration in the future.

    > Yeah, after a huge amount of thought and studying, I finally “grasped” that.

    LOL, it sounds like you know more about it than I do -- not something typically expected in a random blog conversation. Your original verbiage came off as much more dilettante than seems to apply. ;)

  21. John Moore:

    I'm no expert on psychoacoustics, and less on visual perception...

    I hope you'll like the math in this...

    From a signals standpoint, if you have the sounds of two musical instruments present, you have a more complex waveform. However, it depends very strongly on the form of interaction.

    If the signals are added together in a linear system (which is what one normally talks about when discussing music), no new frequency components are added, just the components that are present in each instrument. The time-domain function representing the sound is simply the sum of the time-domain functions from each instrument (or for a polyphonic instrument like a piano, the sum of the time-domain function for each note).

    For most instruments, the timbre consists of a fundamental sine wave and its harmonics. A pure sine wave sounds "flat." A sound with a lot of harmonics sounds "rich" - a bass pipe organ note, for example.

    Various instruments have different harmonic characteristics, and it gets more complex when one considers that those may vary from one note to another. Note also that an octave is the difference between a harmonic and the adjacent harmonic (the frequency of one octave up is 2x the frequency of the fundamental). Music is nicely mathematical, both in timbre, chords, and the time series used.

    Now, in any real world system, there are non-linearities - which engineers refer to as mixing (as distinguished from what audio folks call mixing). If you think in linear algebra terms, the linear system is the addition operator, and the non-linear system is the multiplier (cross product in this case):

    The result of a system-nonlinearity can be represented by a simple polynomial, where the "variable" is the function representing the waveform ( f(t) ). So it looks like:

    mixer(t) = a*f(t) + b*f(t)^2 + c*f(t)^3 + ...

    where f(t) itself is of the form:

    f(t) = a*sin(w*t)+b*cos(w*t)+c*sin(2*w*t)+d*cos(2*w*t)+...

    So, if you have non-linearities, that polynomial produces cross products of the frequency components in f(t) - things like x*a*sin(w1t)*c*cos(2*w2t). (w1 from one instrument, w2 from another)

    Simple trig identities can be used to unwrap those cross products and identify the individual frequency components that will result. In the frequency domain (Fourier transform of the time domain), you end up seeing all the frequency components in the form of sums and differences of the individual frequencies. Hence the example above yields frequencies f1-2f2 and f1+2f2.
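
    The same thing can be checked numerically. This is only a sketch under made-up assumptions: the tones (440 Hz and 660 Hz), the quadratic coefficient 0.5, and the helper name `amplitude_at` are illustrative. It measures the output of a small quadratic non-linearity and of a single cross product of the form sin(w1*t)*cos(2*w2*t).

```python
import math

SAMPLE_RATE = 8000   # samples per second (arbitrary choice for this sketch)
N = SAMPLE_RATE      # one second of samples, so whole-Hz tones fit exactly


def amplitude_at(signal, freq_hz):
    """Amplitude of the freq_hz component, measured with a single-bin DFT."""
    re = sum(s * math.cos(2 * math.pi * freq_hz * n / SAMPLE_RATE)
             for n, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE)
             for n, s in enumerate(signal))
    return 2 * math.hypot(re, im) / len(signal)


f1, f2 = 440, 660  # hypothetical "instrument" frequencies
t = [n / SAMPLE_RATE for n in range(N)]

# Input: two pure tones added linearly.
x = [math.sin(2 * math.pi * f1 * ti) + math.sin(2 * math.pi * f2 * ti)
     for ti in t]

# Non-linear system: a linear term plus a small squared term.
mixed = [xi + 0.5 * xi ** 2 for xi in x]

# A single cross product, sin(w1*t) * cos(2*w2*t).
product = [math.sin(2 * math.pi * f1 * ti) * math.cos(2 * math.pi * 2 * f2 * ti)
           for ti in t]

print("quadratic non-linearity:")
for f in (f1, f2, f2 - f1, f1 + f2, 2 * f1, 2 * f2):
    print(" ", f, "Hz:", round(amplitude_at(mixed, f), 3))

print("cross product:")
for f in (f1, abs(f1 - 2 * f2), f1 + 2 * f2):
    print(" ", f, "Hz:", round(amplitude_at(product, f), 3))
```

    With the squared term present, new components show up at the difference (220 Hz) and sum (1100 Hz) frequencies, plus the doubled frequencies 880 Hz and 1320 Hz. The cross product contains only |f1-2*f2| = 880 Hz and f1+2*f2 = 1760 Hz, each at amplitude 0.5, and nothing at f1 itself.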

    Anyone... I love this stuff so much I can't help spouting it from time to time.

    As to color perception, I don't disagree that putting colors adjacent allows one to see differences that will not be apparent when they are viewed sequentially. The human visual system is better at judging differences than absolute colors. My only argument on color perception is that people often misinterpret the cause of artifacts, and end up believing that the engineers didn't understand the issue. That is usually, but not always, wrong.

    As to the quality of blog comments, the brilliant host (Warren) we have on this blog attracts smart people, and people with varied, interesting expertise.

  22. John Moore:

    oops... anyone <= anyway

    I wanted to add... my description of musical instruments from the signal processing standpoint is still over-simplified. There are many complexities, not all of which I am conversant with. However, the fundamental approach of reducing signals to time-domain functions, applying linear or non-linear operators, and then analyzing the result of that in the frequency domain is quite useful (and more complex than I discussed, since mathematically, you can only get a pure sine wave if it lasts from time -infinity to time +infinity).

  23. IgotBupkis:

    Well, the general claim I encountered was that, much like that eye-detection example, the ear had capabilities of detecting changes represented by harmonics that occurred much faster and at higher frequencies than the ear could consistently and widely detect or recognize in more discrete circumstances, and this was the reason for the complaint about CD-grade audio... purportedly as a result of the adherence to pure nominal limits of the ear's capabilities, rather than the ear's limitations to detect shifts.

    I've never heard the same complaints expressed about MP3 grade audio.

    Not to suggest I've gone hunting for them, mind you. It would be interesting to know if there has been any actual psychoacoustic study to actually prove or disprove this thesis. It does sound as though you've paid more attention to the theoretical basis for it than I have by far, so I'll assume your perception has merit enough to at least make my somewhat non-technical sources (articles from technophile mags, offhand, IIRC) something less than certainly correct.

  24. John Moore:

    Well, I don't know everything about it, and my actual knowledge is dated. However, I did learn that the technophile mags were remarkably fact free when I was involved. They tended to pick up the big words (harmonics are widely misunderstood even though the concept is very simple), and wrap them into fantastic theories - sort of like UFO enthusiasts. I know that people use audio cards or amplifiers with tubes in them because they are "more accurate", which is total caca. I know that people grossly overpay for cables - going far beyond any detectable difference. I know there's a heck of a lot of money to be made by promoting these fantasies and selling into them.

    That being said, I do know that some people have remarkable hearing, at least when young. Shortly after my work in the field, I worked with a fellow who was indeed an audiophile and who was disturbed by ultrasonic motion detectors (one time we went into a store and he complained, and sure enough, the alarm system was on). However, the intersection set between people with extra high frequency hearing and the people making audiophile claims is almost certainly very small.