This is another in a series of posts about why digital pianos often sound bad in live performance, and what can be done about it.
Sometimes when I play a chord on my digital piano (DP) the tone sounds suddenly and unexpectedly thin. Striking the same chord again might sound better, or sometimes worse… or just different. This effect is most audible on sustained chords. I find that it’s much worse with live sound, especially in mono. Stereo usually sounds better, especially in headphones where I hardly ever notice it.
This inconsistency of tone is very distracting. It feels like all the body drains out of my tone at random, making me cringe and back off of certain chords. Such haphazard tonal quality can make my playing timid and uninspired, and has made a lot of gigs not fun at all.
Update: based on my investigations here and elsewhere, I’ve designed a new loudspeaker system to deal with the challenges of turning piano samples into naturalistic live piano sound. Visit taylorsoundlabs.com to purchase or get more info.
This issue has vexed me for years, but I’ve only recently measured it and tracked down what I think is the cause: interference between coincident harmonics when two or more notes are played at the same time. The nature of this interference (constructive vs. destructive) depends on the micro-timing (on the order of milliseconds) between struck notes. Timing at that scale is pretty much impossible for a player to control accurately and consistently; it ends up being somewhat random, which accounts for the inconsistency of the timbral distortion I’ve been hearing.
First, some measurements to confirm what my ears have been telling me. I made 20 different recordings of me playing the notes F3 and C4 together (that’s middle C and the F below) on my DP (a Yamaha CP-50). I captured the signals at the piano’s analog outputs, with the keyboard set to a constant key strike velocity — so that any variation in timbre/tone is due only to changes in timing between notes, not their relative level. For each recording I calculated a spectrum (the mean of several windowed and overlapped FFTs) on the first 1s of audio signal. Here are the spectra of all 20 recordings of this “chord”, overlaid on one graph:
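For anyone who wants to try this on their own instrument, here’s a minimal numpy sketch of the kind of averaged spectrum I’m describing — Hann windows, 50% overlap, mean magnitude. (This is a simplified reconstruction, not my actual analysis script; the window, FFT size, and overlap I used may differ.)

```python
import numpy as np

def mean_spectrum(x, fs, nfft=8192):
    """Mean magnitude spectrum of Hann-windowed FFTs with 50% overlap."""
    win = np.hanning(nfft)
    hop = nfft // 2
    frames = [x[i:i + nfft] * win for i in range(0, len(x) - nfft + 1, hop)]
    mag = np.mean([np.abs(np.fft.rfft(f)) for f in frames], axis=0)
    freqs = np.fft.rfftfreq(nfft, 1.0 / fs)
    return freqs, 20 * np.log10(mag + 1e-12)  # dB, arbitrary reference

# Sanity check on a synthetic 522 Hz tone, 1 s at 44.1 kHz
fs = 44100
t = np.arange(fs) / fs
freqs, spec_db = mean_spectrum(np.sin(2 * np.pi * 522.0 * t), fs)
peak_hz = freqs[np.argmax(spec_db)]  # peak lands on the bin nearest 522 Hz
```

Feed it the first second of a recorded chord instead of the synthetic tone and overlay the resulting curves to reproduce the graph above.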
Most of the tones (harmonics) have identical level across all 20 recordings, but the harmonic at 522Hz has levels that vary dramatically (by 12dB!) from one instance of this chord to the next. Such gross distortion is quite audible when listening in mono (though much less so in stereo) and it clearly corresponds to the fickle tone quality I’ve been hearing in my chords.
What’s going on here is that the 3rd harmonic of F3 and the 2nd harmonic of C4 coincide at 522Hz. That coincidence makes this perfect 5th a consonant interval, but it also sets up conditions for destructive interference between these two 522Hz tones. This interference occurs within the mixer in the DP, before the signal reaches the analog outputs. When these notes are played at the same time, the DP layers the F3 and C4 samples, mixing them down by summing their signals. If the two samples’ time alignment puts their 522Hz tones in phase, they add constructively in the mix, giving this doubled harmonic a high level. But if they are played back out of phase, they sum destructively: the 522Hz tone drops out of the mix.
The relative phase of the two 522Hz tones depends on the timing of the key strikes, which are never exactly simultaneous. The timing difference between in phase and out of phase is half a wave cycle; at 522Hz this is less than 1ms (about 0.96ms), far too short for a player to control consistently in real-life playing.
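The arithmetic here is simple enough to sketch in a few lines. For two equal-amplitude tones at the same frequency, the trig identity sin(ωt) + sin(ω(t+Δt)) = 2·cos(πf·Δt)·sin(ω(t+Δt/2)) gives the summed amplitude directly:

```python
import numpy as np

f = 522.0  # frequency of the coincident harmonic (Hz)

def summed_level(dt):
    """Peak amplitude of two equal unit-amplitude tones at f Hz,
    offset in time by dt seconds: 2*|cos(pi*f*dt)|."""
    return 2.0 * abs(np.cos(np.pi * f * dt))

in_phase = summed_level(0.0)          # 2.0: constructive, +6 dB over one tone
half_cycle = 1.0 / (2.0 * f)          # ~0.96 ms
cancelled = summed_level(half_cycle)  # ~0.0: the tone drops out of the mix
```

A key-strike offset of less than a millisecond is the difference between the doubled harmonic being twice as loud and vanishing entirely.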
This is probably over-simplified, since in reality the doubled harmonics don’t coincide exactly. What I would expect, for a piano tuned to A440 in equal temperament, is for the 3rd harmonic of F3 to be at 523.84Hz, with the 2nd harmonic of C4 slightly lower at 523.25Hz. Together these form a tone at about 523.5Hz whose amplitude beats slowly at a rate of 0.6Hz (hence a period of 1.7s). What’s likely happening is that this tone starts at different parts of its beating cycle depending on the relative timing of the F3 and C4 key strikes.
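These frequencies follow directly from the equal-temperament formula f = 440·2^(n/12), where n is the number of semitones from A4 — a quick check of the numbers above:

```python
# Equal-temperament frequencies from A4 = 440 Hz: f = 440 * 2**(n/12)
F3 = 440.0 * 2 ** (-16 / 12)  # F3 is 16 semitones below A4: ~174.61 Hz
C4 = 440.0 * 2 ** (-9 / 12)   # C4 (middle C) is 9 semitones below: ~261.63 Hz

h3_F3 = 3 * F3                    # ~523.84 Hz
h2_C4 = 2 * C4                    # ~523.25 Hz
beat_rate = h3_F3 - h2_C4         # ~0.59 Hz
beat_period = 1.0 / beat_rate     # ~1.7 s

# With equal amplitudes, the combined tone's envelope is
# 2*|cos(pi * beat_rate * t)|; a key-strike offset shifts where in this
# slow cycle the chord starts, so the first second can catch a peak or a null.
```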
To back up some of this speculation I made individual recordings of the F3 and C4 samples on my DP. To simulate small variations in timing between key strikes, I time-shifted these signals by up to a few milliseconds before layering them and calculating a spectrum on the first 1s of output. The following graph shows how the levels of various tones in the output depend on the amount of time shift:
This lines up exactly with the previous figure. The tones at 172Hz (fundamental of F3), 264Hz (fundamental of C4) and 350Hz (2nd harmonic of F3) all have constant level, independent of key-strike timing. But the level of the 522Hz tone (which is the sum of two separate tones, the 3rd harmonic of F3 and 2nd harmonic of C4) varies a lot with small changes in the relative timing of the struck notes, since this timing alters the relative phase of the two sources of this tone.
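The same simulation can be reproduced with synthetic “samples” instead of my recordings. The sketch below stands in decaying sums of exact harmonics for the DP’s samples, and uses an exactly tuned 3:2 fifth so the doubled harmonics coincide at precisely 522Hz (this is a simplified model, not the actual CP-50 data):

```python
import numpy as np

fs = 44100
t = np.arange(fs) / fs  # 1 s of signal

def fake_sample(f0, dt=0.0, n_harm=4):
    """Stand-in for a DP note sample: decaying exact harmonics of f0,
    starting dt seconds late (crude model of a delayed key strike)."""
    tt = np.clip(t - dt, 0.0, None)
    env = np.exp(-3.0 * tt) * (t >= dt)
    return sum(env * np.sin(2 * np.pi * k * f0 * tt) / k
               for k in range(1, n_harm + 1))

def tone_level(x, f):
    """Level (dB) of the component at f Hz via complex demodulation."""
    amp = 2.0 * abs(np.mean(x * np.exp(-2j * np.pi * f * t)))
    return 20 * np.log10(amp + 1e-12)

# Exactly tuned 3:2 fifth, so 3*F3 and 2*C4 both land on 522 Hz
F3, C4 = 174.0, 261.0
levels = [tone_level(fake_sample(F3) + fake_sample(C4, dt=ms / 1000.0), 522.0)
          for ms in np.arange(0.0, 2.0, 0.1)]
swing_db = max(levels) - min(levels)  # doubled harmonic varies by many dB
```

Sweeping the key-strike delay over just 2ms swings the 522Hz level by well over 10dB, while the fundamentals stay put — the same pattern as the figure above.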
(Incidentally, notice that the 2nd harmonic of F3 is 12dB down in the left channel. This has nothing to do with key-strike timing. It’s an artefact baked into this DP’s samples. Even for single notes, neither channel sounds “right” on its own.)
The interference at 522Hz is different in the right channel than in the left. If the key strike timing causes this tone to drop out in the left channel then it appears at a high level in the right channel, and vice versa. Thus, in stereo reproduction (esp. in headphones) the two channels will compensate for each other. Perceptually, the timbre is something like the average of left and right channels, so the interference might not be audible.
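A toy model makes the compensation concrete. Here I assume — hypothetically, but consistent with what the figure shows — that the two channels encode this harmonic with relative phases roughly π apart, so their interference nulls are complementary:

```python
import numpy as np

f = 522.0

def channel_level(dt, phase_offset):
    """Level (dB) of the summed f-Hz tone in one channel: two unit tones
    whose relative phase is 2*pi*f*dt plus a per-channel phase offset
    baked into the samples (the pi offset below is an assumption)."""
    amp = abs(1 + np.exp(1j * (2 * np.pi * f * dt + phase_offset)))
    return 20 * np.log10(amp + 1e-12)

delays = (0.0, 0.5e-3, 1.0e-3)
lefts = [channel_level(dt, 0.0) for dt in delays]
rights = [channel_level(dt, np.pi) for dt in delays]
# Where one channel dips toward a null, the other sits near its +6 dB
# maximum, so the average timbre across L and R stays roughly constant.
```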
Unfortunately, this compensation doesn’t take place for all intervals. Here is a simulation of the interval C3/G3:
The interference between the 393Hz tones (the 3rd harmonic of C3 and 2nd harmonic of G3) is nearly the same in both channels. For certain delays between key strikes this tone ends up being 9dB down in both channels, so the distortion is audible even in stereo.
There is nothing special here about intervals of a perfect 5th. Doubled harmonics are the basis of harmony: they occur in every consonant interval, so this problem will show up in most chords. For example, the octave E4/E5 is especially bad on my DP:
Here the 2nd harmonic of E4 and fundamental of E5 coincide at 662Hz. Interference between these tones causes the 662Hz tone in the mixed-down output to vary by 18dB, depending on tiny changes in the key strike delay. This sounds obviously wrong.
(And again — though not relevant to the issue at hand — the 328Hz fundamental in the E4 sample is 16dB down in the left channel, which is obvious to my ear. Maybe Yamaha isn’t as meticulous about their sampling as they claim.)
Nothing about this is particular to my Yamaha DP. This problem is going to show up in polyphony on any sampled instrument: if the instrument works by simply layering samples of individual notes, then inevitably there will be interference between doubled harmonics produced by different notes played at the same time. This interference will depend on the micro-timing of the notes played, and so the timbre will change, seemingly randomly, each time a given interval is played — sometimes with bad results. Software pianos might be better, if they can simulate the mutual resonance between multiple strings sounding at the same time.
Real pianos don’t behave this way. Because of mutual resonance between strings, playing C3/G3 together on a real piano is fundamentally different from simply playing back a mix of separate recordings of C3 and G3, each played with all other strings damped. I haven’t made measurements, but I would be very surprised if timing-dependent interference between doubled harmonics occurs in real pianos. I’ve never heard it.
All of this points to the importance of reproducing digital piano samples in stereo (at least). This has nothing to do with traditional “stereo imaging”: it’s about ensuring that the inevitable spectral nulls in any one channel are filled in by the other channel, which encodes different phase relationships between harmonics, at least for most intervals. As the C3/G3 interval on my DP shows, this doesn’t always work: sometimes an important tone will drop out of both channels. To prevent this, maybe pianos should be sampled/reproduced with more than two channels.
Even without this kind of interference, the individual samples on my DP have large spectral aberrations that show up in different channels for different notes. (I don’t know if this occurs in other DPs. I suspect it’s just an artefact of the microphone being near a null of standing waves within the piano cabinet when the samples were recorded.) Consequently, neither channel sounds right on its own. Stereo reproduction can average out these spectral bumps and create a more natural piano-like timbre.
Thus, sampled pianos have at least two kinds of linear distortion: spectral/timbral errors in the individual samples, and errors due to interference between doubled harmonics. Both are corrected (at least somewhat) by reproducing in stereo. The two channels aren’t used to encode spatial cues as in traditional stereophony, but simply to capture two independent pictures of the piano timbre, with different spectral errors. Differences between channels can actually be an advantage here, decreasing inter-aural cross-correlation (IACC) to promote a sense of spaciousness and envelopment more like an acoustic instrument. But as I’ve noted here and here, stereo reproduction has its own pitfalls that can create interference notches that mess up the timbre, so care is needed.