The video was thirty-seven minutes long. I watched it in a hotel room in Salzburg, sitting on the edge of the bed with my laptop balanced on the desk, and by minute six I wanted to turn it off.
Not because the content was bad. The content was fine. The structure was solid. The effects were well-chosen and well-rehearsed. The script I had written was, if I’m being honest, pretty good. I had spent weeks on it, testing every line against the reactions it was supposed to generate, cutting filler, tightening transitions.
But none of that mattered, because the person on screen was delivering all of it in the same voice. The same pitch. The same volume. The same speed. The same everything.
For thirty-seven minutes.
I sounded like someone reading a terms-of-service agreement at a software launch. Not bored, exactly — there was clearly energy and effort behind it. But the energy had nowhere to go. It sat at one level, stayed at one level, and remained at one level from the opening line to the final reveal.
My wife, who had watched from the back of the room during the actual performance, had given me a thumbs up afterward. “That was really good,” she’d said. And she meant it. But my wife is kind, and she was evaluating the content. The video was not kind. The video showed me what was actually happening, and what was actually happening was that I had the vocal range of a dial tone.
The Black-and-White Metaphor
There is a passage in Ken Weber’s Maximum Entertainment that I had read months before this video, filed away mentally as “interesting,” and then ignored because I assumed it didn’t apply to me. Weber describes a flat, unvaried voice as being like black-and-white television. It’s adequate. It conveys information. You can follow what’s happening. But once you’ve experienced color, black-and-white becomes unbearable. You can never go back.
I had read that and thought, sure, that makes sense for people who have monotone problems. I’m animated. I’m an engaging speaker. I do keynotes for a living. This doesn’t apply to me.
The video said otherwise.
The video said I was broadcasting in black and white while believing I was in full Technicolor. And the gap between those two realities — the internal experience of speaking and the external reality of what the audience actually hears — was the most uncomfortable gap I had encountered since I started learning magic.
Why We Don’t Hear Ourselves
Here is what I’ve come to understand about why this happens, especially to people who come from professional speaking backgrounds.
When you’re on stage, your brain is doing a hundred things at once. Managing the script. Executing the technical elements. Reading the audience. Tracking your position on stage. Anticipating what comes next. Recovering from minor mistakes. Projecting confidence even when your heart rate is elevated.
In all of that cognitive load, your voice gets assigned to autopilot. Not consciously. You don’t decide to stop modulating your voice. What happens is that your brain redirects processing power to the tasks it perceives as most urgent — the sleight that needs to happen in two seconds, the audience member who looks confused, the prop that isn’t quite where you expected it to be — and your voice becomes a background process. It runs at its default setting. And your default setting, for most of us, is flat.
In my consulting career, I had developed what I thought was a strong speaking voice. Confident. Clear. Well-projected. But consulting presentations reward consistency. You want to sound steady, reliable, credible. You are not trying to create drama. You are not trying to build suspense. You are trying to convey information in a way that builds trust. And a steady, consistent voice does that beautifully.
The problem is that what works in a boardroom presentation actively undermines a performance. When I carried my consulting voice onto the stage, I brought all the credibility and none of the color. I sounded like someone you would trust with your portfolio but not someone who was about to make the impossible happen right in front of your eyes.
The Four Dimensions of Vocal Color
After watching that video three times — and hating every minute of it, but forcing myself to take notes — I started studying what vocal variation actually means. Not in theoretical terms. In practical, actionable terms.
There are four dimensions you can manipulate, and once you know what they are, you start hearing them everywhere.
Pitch. This is how high or low your voice goes. Not your natural register — you’re not trying to change the fundamental sound of your voice. You’re varying it within your natural range. Going higher when something is exciting or surprising. Going lower when something is serious or dramatic. The range between your natural high and your natural low is wider than you think, and most performers use about ten percent of it.
I discovered this by doing something simple. I took three sentences from my script and recorded myself saying them ten different ways. The first few recordings sounded identical — my default pitch, my default everything. But by the fifth or sixth take, I was forcing myself to find different notes, and something started happening. The sentences came alive. The same words, the same meaning, but completely different energy depending on where my voice sat in my range.
Tone. This is the quality of the sound — warm, sharp, gentle, urgent, conspiratorial, authoritative. Tone is how your voice conveys emotion without changing the words. The sentence “Watch this closely” can sound like a command, an invitation, a warning, or a shared secret, depending entirely on tone. And if you deliver it the same way every time, it stops being any of those things. It just becomes noise.
Volume. Most people think of volume as a binary — loud enough for the audience to hear, or not. But volume is a spectrum, and every point on that spectrum communicates something different. Dropping your volume forces the audience to lean in. Raising it signals importance or excitement. And the shift between levels — the moment when loud becomes quiet or quiet becomes loud — is where the real power lives. The change itself is the signal. More on this in a later post.
Pacing. This is how fast or slow you deliver your words. It caught me most by surprise when I watched the video, because I discovered that I had two speeds: my normal talking speed, and a slightly faster version of my normal talking speed. That was it. No deliberate slowdowns. No pauses for effect. No sudden acceleration to build excitement. Just one gear, with a slightly higher RPM when I was nervous.
The Consultant’s Curse
I’ve started calling this the Consultant’s Curse, because I suspect it affects everyone who comes to performing from a professional speaking background. In consulting, you learn to be consistent. Steady. Measured. Your voice is a tool for conveying credibility, and the most credible-sounding voice is one that doesn’t jump around. No dramatic pauses. No sudden volume shifts. No conspiratorial whispers. Those things would make you seem unhinged in a boardroom.
But consistency, when applied to performance, creates the exact monotone that Weber is talking about. The black-and-white television. Adequate for information transfer, terrible for entertainment.
The fix is not to become someone else. It’s not to adopt an artificial performance voice that sounds nothing like you. The fix is to expand your existing range. To use more of what your voice can already do, but doesn’t do by default because you trained it not to.
Think of it like this. You own a piano with eighty-eight keys, but you’ve been playing everything in the middle octave because that’s where your fingers naturally rest. Nobody is asking you to become a different pianist. They’re asking you to use the rest of the keyboard.
The Exercise That Changed Everything
Here is what I started doing. I’m sharing it because it’s the simplest vocal exercise I’ve ever encountered, and it produced results faster than anything else I tried.
Take one paragraph from your script. Any paragraph. Doesn’t matter which one.
Record yourself reading it exactly the way you normally would. Don’t perform it. Don’t try to make it sound good. Just say it the way you’d say it on stage, on autopilot.
Now record it again. This time, make it absurdly dramatic. Go way over the top. Whisper the first sentence. Shout the second one. Slow down to an agonizing crawl for the third. Speed up the fourth until you’re practically tripping over the words. Make it ridiculous. Make it sound like a parody of a dramatic reading.
Now record it a third time. Find the middle ground. Take some of that absurd variation and dial it back to about thirty percent. Keep the whisper but make it subtle. Keep the volume shift but make it natural. Keep the pace change but smooth it out.
That third recording is closer to what your delivery should sound like than your first recording ever was. Not the absurd version. Not the flat version. The version that lives between the two, the one that has color without being cartoonish.
I did this exercise in hotel rooms for weeks. Paragraph by paragraph, line by line, working through my entire script. Each time I recorded the three versions: flat, absurd, middle ground. And each time I was surprised by how much variation the middle-ground version contained, and by how natural it sounded despite having far more movement than I would ever have produced on autopilot.
What Happens When You Add Color
The difference is not subtle. It’s not a marginal improvement that only a trained ear would notice. It’s the difference between someone talking at you and someone talking to you.
When your voice has variation, the audience’s brain stays engaged. Every shift in pitch, tone, volume, or pacing sends a signal that says “something new is happening.” And as long as the brain is receiving that signal, it keeps processing actively. It stays in the room. It stays with you.
When your voice is flat, the brain stops receiving those signals. It downshifts from active processing to passive reception. The audience hears you but they stop listening. They may look attentive — they may even think they’re paying attention — but the cognitive depth of their engagement has dropped significantly. And when a magic effect happens during passive reception, it doesn’t land with the same force. The audience sees it, registers it, maybe even claps. But the gasp, the lean-forward, the wide-eyed astonishment — those require active engagement. Those require the audience to be fully present. And a flat voice is the fastest way to make them stop being present.
The Ongoing Work
I wish I could tell you that I fixed this once and moved on. I didn’t. Vocal variation is not a skill you acquire and possess permanently, like learning to ride a bicycle. It’s more like physical fitness — something you maintain through ongoing effort, and something that atrophies the moment you stop paying attention.
I still catch myself going flat, especially when I’m tired or nervous or performing material I’ve done many times. The default setting is always there, waiting to reassert itself. The consulting voice is always lurking, ready to take over the moment my cognitive load spikes and my brain starts looking for processes it can put on autopilot.
What’s different now is that I know what to listen for. I record myself regularly. I review the recordings with specific attention to vocal variation. And I ask myself the same question every time: am I broadcasting in color, or have I slipped back into black and white?
The answer is not always what I want to hear. But at least I know to ask.
This is the first post in a series about voice and delivery. Over the next few posts, I’m going to get into specific techniques — upspeak, emphasis patterns, projection versus volume, the counterintuitive power of silence and whispers. Each one addresses a different facet of the same fundamental challenge: using your voice not just as a vehicle for words, but as an instrument that shapes how those words land.
Because here’s what I’ve learned. Your content can be perfect. Your script can be airtight. Your effects can be beautifully crafted and flawlessly executed. But if your voice delivers all of it in the same flat, unvaried, black-and-white signal, you’re asking your audience to do the work of adding color. And audiences don’t do that work. They check out instead.
The color has to come from you.