I was doing something stupid in a hotel room in Innsbruck.
I was trying to remember nine things at once. Not for a magic effect. For work. I was on the phone with a client, and they were rattling off a list of requirements for a strategic initiative we were putting together. No pen. No paper handy. Just me, sitting on the edge of the bed at eleven at night, trying to hold their entire list in my head.
I got to about five items and felt my grip start to slip. By the time they reached the seventh item, the first two were gone. Not fuzzy — gone. Completely evicted from my short-term memory like tenants who failed to pay rent. I could feel it happening. Each new item was physically pushing an older one out, and there was nothing I could do to stop it.
Later, after I had written down what I could remember and apologized for needing the client to repeat the rest, I sat back and thought about what had just happened. Because I had been reading about this exact phenomenon in the context of magic, and I had experienced it firsthand for probably the thousandth time in my life without ever connecting it to performance.
The magic number is seven. Plus or minus two.
George Miller’s Bottleneck
In 1956, cognitive psychologist George Miller published one of the most cited papers in the history of psychology. Its title was brilliantly memorable: “The Magical Number Seven, Plus or Minus Two: Some Limits on Our Capacity for Processing Information.” The paper argued that human working memory, the mental workspace where we hold and manipulate information in real time, has a strict capacity limit. Most people can hold between five and nine discrete items in working memory at any given moment.
Not fifty. Not twenty. Not even twelve. Seven, give or take.
I first encountered this in the context of magic through Gustav Kuhn and Alice Pailhes’s research at Goldsmiths, University of London. They connect Miller’s finding directly to how magic works, or more precisely, to why magic works even when the performer makes no special effort to be deceptive.
The connection is devastatingly simple. If working memory can only hold seven items, and your effect requires the spectator to track eight or nine things, then the earliest items will be displaced. Pushed out. Overwritten. And if one of those early items happens to be the secret action that makes the effect possible, the spectator will not remember it. Not because it was hidden. Not because it was misdirected. But because their brain literally ran out of room.
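To make the displacement idea concrete, here is a minimal sketch that treats working memory as a fixed-size, first-in, first-out buffer. That is a deliberate simplification for illustration, not a claim from Miller or from Kuhn and Pailhes, and the function name `remembered` and the default capacity of seven are assumptions.

```python
from collections import deque


def remembered(events, capacity=7):
    """Return the events still held after every event has been presented.

    Toy model only: working memory as a first-in, first-out buffer.
    """
    buffer = deque(maxlen=capacity)  # a full deque drops its oldest entry
    for event in events:
        buffer.append(event)
    return list(buffer)


events = [f"event {i}" for i in range(1, 10)]  # nine things to track
kept = remembered(events)
forgotten = [e for e in events if e not in kept]
print(forgotten)  # ['event 1', 'event 2']
```

With nine events and room for seven, the two earliest entries are the ones that fall out, which is the slot where an early secret action would sit.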
The Overwhelmed Executive
Let me give you a non-magic example first, because I think the principle is easier to see in everyday life.
I run innovation workshops as part of my consulting work. In one exercise, I present participants with a complex scenario — multiple variables, conflicting constraints, evolving conditions — and ask them to make a decision. The exercise is designed to be slightly too complex for anyone to hold all the variables in their head simultaneously. And every single time, without fail, participants forget critical pieces of information that were presented clearly at the beginning. They do not ignore the information. They do not dismiss it. They simply cannot hold it while also processing everything that comes after.
The interesting part is what happens next. When you point out the forgotten information, participants are genuinely surprised. They often insist it was never mentioned. They will argue about it. They have no memory of hearing it, even though everyone in the room heard the same briefing.
This is not carelessness. This is architecture. The human brain is not designed to hold more than about seven items in working memory. When you exceed that limit, the system does not crash; it prunes. And the pruning happens automatically, silently, and without the person’s awareness.
Why This Matters More Than Misdirection
Here is what struck me when I started applying this to my own performance: working memory overload might be more powerful than traditional misdirection.
With misdirection, you are trying to control where the spectator looks or what they pay attention to. It is an active intervention. You have to do something — make a joke, create a visual distraction, time your gaze — to redirect their attention. And if the misdirection fails, they see the secret action.
Working memory overload is different. It is passive. It does not require you to redirect attention at all. You simply present enough information that the spectator’s memory buffer fills up, and the earliest entries get automatically displaced. The spectator can be staring right at you, paying full attention, tracking everything you do — and they will still forget the first few things because their brain does not have room for them anymore.
I found this idea simultaneously thrilling and unsettling. Thrilling because it meant there was a structural mechanism protecting my effects that operated independently of my skill as a performer. Unsettling because it meant I had been accidentally benefiting from it for years without understanding it, which meant I had also been accidentally undermining it.
The Four-Card Problem
Kuhn and Pailhes describe a beautifully clean demonstration of this principle. When spectators are asked to remember multiple cards during a magic effect — say four cards, shown one at a time — their working memory is occupied with the task of holding those card identities. While they are busy encoding “seven of hearts, queen of spades, three of clubs, ace of diamonds,” their capacity to notice and remember other things drops dramatically.
Actions that happen during this period of cognitive load are functionally invisible. Not because the spectator did not look at them. Not because the spectator was distracted. But because the spectator’s working memory was full, and there was nowhere for the new information to go.
Think about what this means for effect design. If you know that your spectator’s working memory is occupied with holding four card identities, you have a window of psychological protection that does not depend on misdirection, timing, or technique. You could perform your method openly during that window, and the spectator would have no memory of it afterward.
Not because they were fooled. Because they were full.
My Counting Mistake
Here is where I embarrassed myself.
For years, I had been performing a particular mentalism effect where I asked a spectator to think of a number, concentrate on it, and then I would reveal it. The effect worked fine. People were impressed. But I always felt like it was thin — like the method was too exposed, too traceable.
Then I restructured the effect. Instead of asking the spectator to think of one number, I asked them to think of several pieces of information. Their birth month. Their favorite color. A number between one and fifty. The name of someone they admired. A city they had always wanted to visit. And then, at the end, I came back to the number.
The effect was the same. I was still revealing the number. But now the number was buried in a list of six or seven items, and by the time we got to the revelation, the spectator could not clearly remember the sequence of events that led to it. The earlier steps — the ones closest to the method — had been displaced from working memory by the later steps.
I had accidentally overloaded their working memory. The method vanished not into sleight of hand but into cognitive architecture.
The embarrassing part is that I only realized what I had done after reading the research. I had stumbled into the principle through instinct, the way a person stumbles into a doorway in the dark. The science gave me the light to see the room I was standing in.
The Seven-Item Design Principle
Once I understood the mechanism, I started applying it deliberately. Here is the framework I developed for myself.
If your effect requires the spectator to track more than seven discrete events, actions, or pieces of information, then the earliest events will be automatically displaced from their working memory. This is not a possibility. It is a certainty, as reliable as gravity.
So the design question becomes: what do you want to be displaced?
If the method happens early in the sequence and you stack enough subsequent events on top of it, the method gets pushed out. It drops below the threshold of memory. The spectator will still have a vague sense that “some things happened,” but the specific details — the details that would allow them to reconstruct the method — will be gone.
Conversely, if the method happens late in the sequence, it sits at the top of the working memory stack, fresh and vivid and easily recalled. That is the danger zone. Late methods are memorable methods. Early methods, buried under subsequent information, are invisible methods.
This is not about rushing through things. It is not about being confusing. It is about strategic sequencing — putting the information you want them to forget early and the information you want them to remember late.
The Corporate Presentation Parallel
I use a version of this principle in my keynote speaking all the time, and I only realized the connection after studying the memory research.
When I structure a keynote, I front-load the heavy data and analytical content — the charts, the numbers, the complex frameworks. Then I shift to stories, emotional hooks, and the big takeaway. By the end of the talk, the audience remembers the stories and the takeaway vividly. The data? They remember that there was data. They remember it was persuasive. But they cannot recall the specifics.
That is working memory doing its job. The stories and the takeaway were the most recent items. They sit at the top of the memory stack. The data was early. It has been displaced by everything that followed.
In a keynote, this is fine. The stories are doing the persuasion work. In a magic performance, this is gold. The method was early. It has been displaced by everything that followed. And the impossible moment — the revelation, the transformation, the climax — sits at the top of the stack, vivid and unchallenged.
Why Complexity Is Not the Same as Confusion
There is a trap here, and I want to flag it because I fell into it.
When I first started deliberately overloading working memory, I made my effects too complicated. I added unnecessary steps. I asked for too much information. I created procedures that were genuinely confusing, not just cognitively demanding.
The problem with confusion is that it triggers a different psychological response. When someone is confused, they become suspicious. They start looking for the trick. Their analytical mind wakes up. They shift from the passive, trusting state that makes magic work into an active, defensive state that fights against it.
Working memory overload should feel natural, not confusing. The spectator should feel like they are participating in something logical and sequential, even if the total amount of information exceeds their capacity. Each individual step should make sense. The sequence should feel purposeful. The overload happens not because the steps are confusing but because there are simply too many of them for any human brain to hold simultaneously.
The difference is between drowning and swimming in deep water. Both involve being in over your head. But one feels like a crisis and the other feels like an experience.
The Five-to-Nine Spectrum
Miller’s number is seven, but the range is five to nine. This means some people have slightly more working memory capacity than others. When I design effects, I aim for the nine end of the spectrum — I assume my spectators are sharp, attentive, and have above-average working memory. If the method is displaced even for a nine-item person, it is invisible for everyone.
This means I want at least ten discrete events in the sequence, with the method-relevant actions happening in the first three or four. That gives me a buffer of six or seven subsequent events to push the method out of even the most capacious working memory.
Is this overkill? Probably. But I would rather have too much protection than too little. The audience member who remembers more than you expected is the one who figures out how it was done. And as my business partner Adam and I have discussed many times at Vulpine Creations, the goal is not to fool most people. It is to create an experience that feels genuinely impossible to everyone in the room.
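As a rough sanity check on the numbers in this section, the sketch below assumes a strict first-in, first-out buffer in which each new event pushes out exactly one old one. Real memory is messier than that, so treat the output as a lower bound on protection rather than a measurement. The function names `displaced` and `events_needed` are made up for illustration.

```python
def displaced(total_events: int, capacity: int) -> int:
    """How many of the earliest events fall out of a FIFO buffer (toy model)."""
    return max(0, total_events - capacity)


def events_needed(method_steps: int, capacity: int) -> int:
    """Total events required to displace the first `method_steps` events."""
    return method_steps + capacity


for capacity in (5, 7, 9):
    # capacity, early events lost in a ten-event sequence,
    # total events needed to bury a four-step method
    print(capacity, displaced(10, capacity), events_needed(4, capacity))
```

Under this strict reading, a ten-event sequence displaces the first five, three, and one events for capacities of five, seven, and nine respectively, and burying a four-step method for a nine-item spectator would take thirteen events. Ten events is comfortable for the typical spectator, but the sharpest one may call for a longer sequence.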
The Late-Night Realization
I am writing this in another hotel room. Different city this time — Vienna, where I have a strategy session tomorrow morning. The cards are on the nightstand. The laptop is open. It is nearly midnight, and I am thinking about working memory.
The insight that keeps coming back to me is this: the audience does not experience your performance. They experience what they can remember of your performance. And what they can remember is constrained by the same cognitive architecture that made me forget my client’s requirements on that phone call in Innsbruck.
Seven, plus or minus two. That is the bottleneck. That is the gate. Everything you do in a performance has to pass through it, and most of what you do will not survive the journey.
The question is not whether your audience will forget. They will forget. The question is what they will forget and what they will keep. That is not random. It is structural. And it is, to a degree that still surprises me, under your control.
Front-load the method. Stack the sequence deep. Let the cognitive architecture do the heavy lifting. The human brain is the best accomplice a performer ever had — and it does not even know it is helping.