Customer discovery interviews optimize for confidence, not accuracy

Customer discovery interviews are the most widely recommended early-stage research method, and they are valuable. They are also systematically biased toward a specific type of output: what people believe they want, articulated in response to questions they have been asked to reflect on. That output is not useless. It reveals mental models, vocabulary, and self-reported pain points that inform positioning and messaging. What it does not reliably reveal is what people will actually do when they encounter a product, because the conditions under which someone answers an interview question are categorically different from the conditions under which they make a real behavioral choice.

Watching someone use a prototype for fifteen minutes produces a different type of evidence. It reveals where they hesitate, what they click first, what they skip, what they misunderstand, and what they try to do that the interface does not support. These observations do not require the user to know what they want or to articulate it accurately — they require only that the user attempt to accomplish something real. The gap between what a customer said they wanted in an interview and what they actually did with the prototype is where the most valuable product information lives, and founders who only run interviews never reach it.

What interviews are actually measuring

A customer discovery interview is a structured conversation designed to surface how a person currently experiences a problem and what they believe would help. The structural bias of this method is that it asks people to simulate a decision rather than to make one. When an interviewer asks “how do you currently handle X?” and “what would you pay for a better solution?”, the respondent is constructing a narrative about their behavior and preferences in real time. That narrative is shaped by social dynamics, the desire to be helpful, the interviewer’s framing, and the respondent’s own incomplete self-knowledge.

People are genuinely poor at predicting their own behavior in contexts they have not experienced. A respondent who says “I would use that feature daily” is reporting a belief about their future self that may have no relationship to what they will actually do when the feature exists. Research in behavioral economics and consumer behavior consistently shows that stated preference and revealed preference diverge, particularly for novel products where the respondent has no prior behavior to extrapolate from. The interview captures the stated preference. The prototype session captures the revealed one.

Customer discovery interviews are also subject to interviewer effects: the questions asked, the order in which they are asked, and the non-verbal responses of the interviewer all influence what the respondent says. A well-intentioned founder who nods when a respondent mentions a pain point the founder cares about is receiving partially self-generated signal. This does not make interviews worthless — a skilled interviewer can minimize these effects — but it means the method requires discipline to execute neutrally, and most founders interviewing their own potential customers are not neutral observers.

What prototype observation reveals that interviews cannot

A prototype session reveals behavioral evidence: what someone actually does when presented with a working interface. This evidence is more predictive of real-world product behavior than interview responses for two reasons. First, it requires the user to act rather than to imagine acting, which engages a different cognitive process. Second, it surfaces problems that the user would never have reported in an interview because they would not have known to report them — interaction friction, missing context, interface misalignments — that only become visible when a real person tries to do a real thing.

The most productive prototype sessions produce three specific types of observations. The first is task completion: can the user accomplish the core job without assistance, and if not, where do they stop or ask for help? This reveals whether the interface matches the user’s mental model of how the task should be done. The second is interpretation: what does the user think the product is doing at each step? Misinterpretations visible in prototype sessions routinely reveal that the product’s conceptual model does not match the user’s, a problem that interview responses rarely surface because the user has not yet tried to use the product.

The third type is preference revealed through behavior rather than through statement. A user who, when asked to complete a task, immediately tries an approach the founder did not anticipate has revealed something about how they think the product should work. A user who completes the task but then immediately goes back to check something has revealed an information need the interface did not meet. These observations are not available through interviews because they require the user to encounter the product rather than to imagine it.

How to combine interviews and prototype observation effectively

The most effective early-stage research combines both methods in sequence, using each for what it is good at. Interviews are better for exploring problem space and gathering language. Prototype observation is better for testing whether a proposed solution fits the user’s actual behavior. Running them in the wrong order — or running only one — produces incomplete signal.

  1. Run interviews first to map the problem space, not to validate the solution. The goal of early interviews is to understand how users currently experience the problem domain — what they do, what breaks, what they have already tried. Do not show the product or describe the solution in interviews. Ask about current behavior, not hypothetical preferences. The interview output is a map of the problem space, not a validation of your answer to it.

  2. Build the minimum prototype that makes one specific claim testable. Define the one behavioral question you need to answer before the next build decision. “Will users understand what this output means without a label?” or “Will users complete onboarding without asking what to do first?” Build only what is necessary to test that question. A prototype that tests everything at once cannot tell you which specific assumption was wrong.

  3. Observe, do not guide. In prototype sessions, give the user a task and watch. Do not explain the interface. Do not answer questions about how it works. Note where they hesitate, what they try that does not work, and what they say aloud. The observations from a non-guided session are the evidence. Post-task discussion can surface why — “what were you expecting to happen there?” — but the behavioral observation is the primary data.

  4. Run at least five prototype sessions before drawing conclusions. One session reveals one person’s behavior. Five sessions reveal whether a pattern exists. Three misunderstandings of the same interface element across five sessions is a design problem. One misunderstanding in one session may be noise. Do not adjust the product on the basis of a single session’s observations.

  5. Compare interview-stated preferences against prototype-observed behavior explicitly. After running both, create a list of what users said they wanted in interviews and a list of what their behavior in prototype sessions suggests they need. The items on both lists are useful. The discrepancies between the lists are more useful still — they reveal where self-reported preference and actual behavior diverge, which is where the product’s most consequential design decisions live.

  6. Use interview signal for messaging and prototype signal for product decisions. The vocabulary users use in interviews to describe their problem is valuable for positioning, marketing copy, and sales conversations. The behavior users exhibit in prototype sessions is valuable for interface design, feature prioritization, and workflow architecture. Using interview signal for product decisions and prototype signal for messaging is the common reversal — it produces well-positioned products that do not work the way users expect, and poorly positioned products that do.
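Steps 4 and 5 amount to a small bookkeeping exercise: count how often each observation recurs across sessions to separate patterns from noise, then diff the interview-stated list against the prototype-observed list. The sketch below illustrates that bookkeeping; every session note, preference, and threshold in it is invented for the example, not a prescribed taxonomy.

```python
from collections import Counter

# One entry per prototype session: the issues observed in that session.
# All labels here are hypothetical examples.
sessions = [
    {"missed export button", "unclear status label"},
    {"unclear status label"},
    {"missed export button", "unclear status label", "retried upload"},
    {"unclear status label"},
    {"missed export button"},
]

# Step 4: an issue seen in 3+ of 5 sessions is treated as a pattern;
# a one-off observation may be noise.
counts = Counter(issue for session in sessions for issue in session)
patterns = {issue for issue, n in counts.items() if n >= 3}

# Step 5: explicit comparison of stated vs observed signal.
stated = {"more integrations", "bulk export", "faster sync"}   # from interviews
observed = {"bulk export", "clearer status feedback"}          # from sessions

said_but_not_shown = stated - observed   # preferences behavior never confirmed
shown_but_not_said = observed - stated   # needs users never articulated

print("patterns:", sorted(patterns))
print("said but not shown:", sorted(said_but_not_shown))
print("shown but not said:", sorted(shown_but_not_said))
```

The two set differences at the end are the discrepancy lists step 5 asks for: the overlap feeds both messaging and product, while the asymmetric items flag where stated preference and observed behavior diverge.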

What it costs to rely only on interviews

A founder who validates entirely through customer discovery interviews arrives at the build phase with high confidence in a set of user preferences that may or may not translate into actual usage. The confidence is real — interviews with ten or twenty enthusiastic respondents produce strong conviction — but it is conviction about what people said they want, not about what they will do. The cost becomes visible at launch, in the form of low activation rates, unexpected churn in the first thirty days, and usage patterns that do not match the workflows the product was designed around.

These outcomes are not evidence that the research was insufficient in volume — a founder who ran twenty interviews has, by any reasonable standard, done extensive research. They are evidence that the research method was well suited to building confidence and poorly suited to predicting behavior. Customer discovery interviews are a necessary part of early-stage product development. They are not sufficient, and treating them as sufficient is a specific error with a specific cost: products built for the user's self-reported preferences rather than for the user's actual behavior. The fifteen minutes of watching someone use a prototype is not a complement to the interview. For product decisions, it is the more reliable source.
