The Research Behind the Analysis

Why a Car Tool Has Academic Citations

The car industry is one of the most behaviorally sophisticated sales environments in the American economy. Dealership sales training is built around anchoring, payment framing, artificial urgency, and sunk-cost exploitation — techniques drawn directly from behavioral psychology research. The four-square worksheet, the "what can you afford per month?" question, the financing office with its add-ons priced in monthly increments: none of this is accidental. It is designed, and it works.

The Witch was built on the same research base — used in the opposite direction. The same body of work that describes why people make bad financial decisions under pressure informs every design decision in this tool: what information appears first, how uncertainty is communicated, where bias warnings are placed, why the tab order is what it is.

This page names the papers, explains what each one found, and points to exactly where each finding appears in the product. If you want to verify the methodology, the citations are here.

The Witch's standard: every behavioral intervention in the product — every bias callout, every framing choice, every sequencing decision — must be grounded in peer-reviewed research. "It feels right" is not a sufficient basis for telling someone what to do with their money.

Two bodies of research shape the product: decision science (how people process information and resist being misled) and actuarial methodology (how the numbers are calculated). Decision science first.

The Decision Science

These nine papers govern how information is structured, sequenced, and presented throughout the product. Each one answered a specific design question.

Why the Witch speaks first, and why she names the frame before showing the numbers

Kunda, Z. (1990). The case for motivated reasoning. Psychological Bulletin, 108(3), 480–498.

Kunda documented that people don't reason their way to conclusions — they reason from them. When someone has already decided what they want to do (keep the car, buy the new one), they run a biased memory search that retrieves only the evidence supporting that conclusion. They can't reach just any conclusion — they need one they could defend to a skeptic — but the outcome is predetermined and the evidence is assembled afterward.

This is why the Witch's narrative leads every analysis. Raw data shown before an interpretive frame gets processed through whatever frame the user already holds. The verdict must come first, so that the numbers that follow are evaluated against the correct conclusion rather than recruited to support the wrong one.

It is also why the Counsel tab's identity selector asks users to name their own decision framework before they receive analysis. Requiring that declaration activates accuracy goals — motivation to get the right answer — rather than directional goals, which produce motivated reasoning.

Why the bias callouts explain the mechanism, not just the name

Chi, M.T.H. (2008). Three types of conceptual change. In Handbook of research on conceptual change.

Chi identified three conditions of prior knowledge: missing (the user needs new information), incomplete (the user needs gap-filling), and conflicting (the user holds a misconception that actively resists correction). The third is the hardest.

Most users who arrive at this tool hold one specific misconception: that comparing a repair bill to a monthly payment is the correct calculation. This isn't a knowledge gap — it's a wrong mental model that feels right. Simply presenting total cost of ownership data doesn't fix it. The wrong model doesn't disappear; it competes with the correct one, and the user picks whichever supports their preference.

Conceptual change requires the user to actively construct the correct explanation, not just receive it. This is why every bias callout in the product names the wrong mental model, explains specifically why it misleads in the user's actual situation, and provides an explicit replacement frame — not just the correct number, but the correct way of thinking about the comparison.

Why bias callouts appear before the numbers they contextualize

Lewandowsky, S., Ecker, U.K.H., Seifert, C.M., Schwarz, N., & Cook, J. (2012). Misinformation and its correction. Psychological Science in the Public Interest, 13(3), 106–131.

Lewandowsky and colleagues found that corrections to misinformation are more effective when delivered before the misleading stimulus than after. Once someone has already processed a number through a biased frame — "the repair costs more than two monthly payments" — the correction is playing catch-up against a conclusion that has already formed.

They also found that corrections without a replacement explanation backfire: saying "don't think about monthly payments" without providing an alternative leaves a cognitive gap that gets refilled with the original framing. Every correction must provide a specific alternative comparison, not just a retraction.

This is why the Anchoring Bias callout appears before the break-even threshold on every analysis — not after. And why every callout ends with: here is what to compare instead.

Why the tool makes structural decisions on the user's behalf

Larrick, R.P. (2004). Debiasing. In Blackwell Handbook of Judgment and Decision Making.

Larrick's research on debiasing established that individuals cannot reliably correct their own cognitive biases. Feedback loops are too slow, outcomes too ambiguous, and self-serving attribution too persistent. What actually works is external intervention: changing the decision environment so that correct reasoning is the path of least resistance.

This means every layout decision in the product is a debiasing decision. The default tab order (Verdict → Counsel → Prophecy) is not aesthetic — it ensures users pass through perspective-taking before reaching the raw data tables they might otherwise cherry-pick from to contradict a verdict they dislike. The Witch does not present neutral information and let users reason freely. She structures the environment so that accurate reasoning is easier than biased reasoning.

Why uncertainty bands are on the chart, not in the footnote

Padilla, L., Kay, M., & Hullman, J. (2020). Uncertainty visualization. In Handbook of Computational Statistics and Data Science.

Padilla and colleagues documented a consistent failure mode in data communication: precise-looking numbers are treated as facts regardless of what hedging language appears in accompanying footnotes. Users substitute the simpler, deterministic interpretation for the more cognitively demanding probabilistic one — not out of carelessness, but because the visual encoding of the number itself signals certainty.

This is why cost projections in the product carry ±20% uncertainty bands rendered directly on the chart — not deferred to a tooltip or footnote — and why prose cost figures use hedging language ("roughly $4,200/yr") rather than bare numbers. Uncertainty must travel visually with the data it qualifies. A footnote doesn't accomplish this.

Why vehicles are compared side by side, not stacked in rows

Franconeri, S.L., Padilla, L.M., Shah, P., Zacks, J.M., & Hullman, J. (2021). The science of visual data communication. Psychological Science in the Public Interest, 22(3), 110–161.

Franconeri and colleagues established that the visual system can extract broad statistics across a display instantly, but comparing subsets of values (vehicle A vs. vehicle B across multiple categories) requires focused attention, proceeds at only about 2–3 comparisons per second, and taxes working memory. Poor display design doesn't just make comparisons harder; it causes viewers to extract the wrong patterns because the structure leads them to the wrong comparison axis.

Position is the most precise visual channel for comparison. Side-by-side card layouts (used throughout the Summary Metric Cards and the Prophecy Matrix) make position the primary comparison axis. Vehicle-as-column in cost tables serves the same function. The design principle is not aesthetic minimalism — it is minimizing the working memory load required to make the one comparison that matters.

Why there is no neutral way to present a car decision

Weinmann, M., Schneider, C., & vom Brocke, J. (2016). Digital nudging. Business & Information Systems Engineering, 58(6), 433–436.

Weinmann and colleagues established the foundational principle of digital nudging: there is no neutral way to present choices online. Every interface decision — the default tab, the order of options, the visual prominence of a number, the label on a button — influences behavior, whether intentionally or not. The question is not whether the product nudges users, but in which direction.

The Witch nudges explicitly toward accurate self-assessment. The first number a user sees anchors their judgment. The default identity in the Counsel selector shapes which analysis feels most relevant. The position of the safety banner relative to the financial content is a deliberate structural choice. Every one of these decisions is made with awareness of its directional effect — and with the intent that the effect points toward correct reasoning rather than toward any particular outcome.

Why the Steelman exists

Kriplean, T., Morgan, J., Freelon, D., Borning, A., & Bennett, L. (2012). Supporting reflective public thought with ConsiderIt. CSCW 2012.

Kriplean and colleagues studied public deliberation tools and found that exposing users to opposing views is insufficient to change minds. What works is requiring active engagement with the strongest version of the opposing argument before committing to a position — not a summary of what the other side believes, but the best case they could make.

The Steelman feature in the Counsel tab is a direct application of this finding. After the Witch delivers a verdict and an identity-specific narrative, users can ask her to argue the opposite conclusion with full force. If she counseled replacement, she argues for keeping. If she counseled keeping, she argues for replacing. One steelman per session. After it renders, she says: the Witch has spoken both sides. The decision is yours.

Why the identity selector uses outcomes, not labels

Fujita, K., et al. (2022). Consumer identities and vehicle purchase decisions. Journal of Consumer Psychology.

Fujita and colleagues documented a persistent gap between stated preferences and revealed preferences in consumer decisions — people's self-reported priorities reliably differ from the factors that actually drive their choices. Label-based typologies ("Safety Buyer," "Value Seeker") activate social desirability effects: users pick the label that reflects who they want to be, not who they are.

The Counsel identity selector uses outcome-framed questions instead of labels: "Getting everyone home safely" rather than "Safety." "The actual cheapest path forward, full stop" rather than "Optimizer." Outcome framing produces more honest self-recognition because it asks about goals, not identity — and goals are harder to perform for an audience.

The behavioral research governs how the product communicates. The actuarial research governs the numbers themselves.

The Numbers Behind the Numbers

The cost projections in the analysis are built from public data sources combined through a methodology documented internally and summarized here.

Repair and maintenance costs

Annual repair cost estimates use an actuarial aging model calibrated against BLS Consumer Expenditure Survey data. The model separates two independent forces: mechanical aging (the vehicle requires more maintenance as it accumulates wear) and price inflation (parts and labor cost more over time). Conflating these into a single escalation rate produces double-counting and significantly inflated long-term projections. The model applies them independently.
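A minimal sketch of what keeping the two forces separate can look like. The function name and both rates are illustrative placeholders, not the calibrated values from the model; the point is only that aging and inflation are applied as two distinct factors, so the real (today's dollars) and nominal figures can be read separately.

```python
def project_repair_costs(base_annual_cost, years, aging_rate=0.06, inflation_rate=0.03):
    """Project annual repair costs for each of the next `years` years.

    aging_rate and inflation_rate are hypothetical placeholders.
    Returns a list of (year, real_cost, nominal_cost) tuples.
    """
    rows = []
    for t in range(1, years + 1):
        # Mechanical aging: the car needs more work, priced in today's dollars.
        real = base_annual_cost * (1 + aging_rate) ** t
        # Price inflation: the same work costs more in future dollars.
        nominal = real * (1 + inflation_rate) ** t
        rows.append((t, round(real, 2), round(nominal, 2)))
    return rows


for year, real, nominal in project_repair_costs(1000, 3):
    print(year, real, nominal)
```

Double-counting happens when an escalation rate that already reflects rising prices is applied and inflation is then layered on top of it. Holding the two rates apart makes that mistake visible.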

When users provide repair history, the model blends their actual data with the actuarial prior using a credibility weighting system. The more history a user provides, the more weight their actual experience carries relative to the population average. This prevents a single atypical repair year from distorting a multi-year forecast while ensuring users with strong history get estimates that reflect their specific car, not the fleet.
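One common way to express this kind of blend is a Bühlmann-style credibility factor, Z = n / (n + k). The sketch below assumes that form; the source does not state which formula the product uses, and the function name and the constant k are hypothetical.

```python
def credibility_blend(user_annual_costs, prior_annual_cost, k=3.0):
    """Blend a user's own repair history with a population-level prior.

    user_annual_costs: list of the user's actual annual repair costs.
    prior_annual_cost: the actuarial estimate for a comparable vehicle.
    k: hypothetical constant; larger k means history earns weight more slowly.
    """
    n = len(user_annual_costs)
    if n == 0:
        return prior_annual_cost  # no history, so rely on the prior alone
    z = n / (n + k)  # credibility weight, rising toward 1 as history grows
    user_mean = sum(user_annual_costs) / n
    return z * user_mean + (1 - z) * prior_annual_cost


print(credibility_blend([2400], 900))              # one bad year, mostly prior
print(credibility_blend([1200, 1200, 1200], 900))  # three years, evenly split
```

With k = 3, a single $2,400 year against a $900 prior moves the estimate to $1,275 rather than $2,400, which is the damping behavior the paragraph above describes.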

Regional labor rates

Maintenance estimates are adjusted for local mechanic labor costs using BLS Occupational Employment and Wage Statistics data at the metropolitan area level. Labor rates vary materially across geographies — the spread between the highest- and lowest-cost markets is substantial. A user in a high-cost metro and a user in a rural market are not running the same analysis.
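A rough sketch of a regional adjustment, assuming only the labor portion of a maintenance estimate scales with local wages. The function name, the 50% labor share, and the wage figures are all hypothetical; the actual inputs would come from the BLS data named above.

```python
def adjust_for_region(national_estimate, metro_wage, national_wage, labor_share=0.5):
    """Scale the labor portion of a maintenance estimate by a local wage ratio.

    labor_share is an assumed fraction of the bill that is labor, not parts.
    """
    labor = national_estimate * labor_share
    parts = national_estimate - labor  # parts are left unadjusted in this sketch
    return parts + labor * (metro_wage / national_wage)


# A $1,000 national estimate in a metro where mechanics earn 20% above average.
print(adjust_for_region(1000, metro_wage=30.0, national_wage=25.0))
```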

Fuel efficiency decay

Fuel economy projections use an age-based MPG decay model grounded in published vehicle engineering research: Omar, A., Alias, N. K., & Hamzah, A. (2023), "Effects of vehicle age on fuel economy for urban driving cycles," Jurnal Kejuruteraan, SI6(2), 211–217. Fuel prices are refreshed weekly from the EIA's retail gasoline price series and are editable by the user.
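To illustrate how an age-based decay feeds into a fuel cost figure, here is a minimal sketch. The linear decay rate and the cap are placeholders chosen for readability; they are not taken from the cited paper, and the function name is hypothetical.

```python
def annual_fuel_cost(miles_per_year, rated_mpg, vehicle_age_years,
                     price_per_gallon, decay_per_year=0.01, max_loss=0.20):
    """Estimate annual fuel cost with an age-adjusted MPG.

    decay_per_year and max_loss are hypothetical: a 1% loss per year of age,
    capped at 20% in total.
    """
    loss = min(decay_per_year * vehicle_age_years, max_loss)
    effective_mpg = rated_mpg * (1 - loss)
    gallons = miles_per_year / effective_mpg
    return gallons * price_per_gallon


# Same car, same driving, new vs. ten years old.
print(round(annual_fuel_cost(12000, 30, 0, 3.50), 2))
print(round(annual_fuel_cost(12000, 30, 10, 3.50), 2))
```

The price per gallon is a plain argument here, which mirrors the point that the fetched price is only a default the user can override.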

Reliability signals

Reliability is reported as disaggregated signals rather than a single composite score. A car with five NCAP safety stars and an open recall is not "fine on average" — collapsing those into one number masks exactly the information you need. The signals are sourced directly from the NHTSA public API: consumer complaints, recall records, and NCAP crash test ratings.

Complaint volume is normalized by years in production and log-scaled, so a model with ten years of owner history isn't penalized against one that launched last year. Complaint severity is category-weighted: engine, transmission, and brake failures score materially higher than interior or electrical complaints. Crash- and fire-involved complaints carry additional weight. Recall flags and open federal investigations are surfaced separately, with park-it recalls — ones where the manufacturer advises not driving the vehicle — called out explicitly.
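The normalization and weighting described above can be sketched as follows. The weights, the crash/fire bonus, and the record fields (category, crash, fire) are hypothetical stand-ins, not the product's values or the NHTSA API's field names.

```python
import math

# Hypothetical category weights: safety-critical systems score higher.
CATEGORY_WEIGHTS = {
    "engine": 3.0,
    "transmission": 3.0,
    "brakes": 3.0,
    "electrical": 1.0,
    "interior": 1.0,
}
CRASH_FIRE_BONUS = 2.0  # hypothetical extra weight for crash or fire involvement


def complaint_volume_signal(total_complaints, years_in_production):
    """Complaints per production year, log-scaled to compress large counts."""
    per_year = total_complaints / max(years_in_production, 1)
    return math.log1p(per_year)


def complaint_severity_signal(complaints):
    """Average category weight across complaints, with a crash/fire bonus."""
    if not complaints:
        return 0.0
    total = 0.0
    for c in complaints:
        weight = CATEGORY_WEIGHTS.get(c["category"], 1.0)
        if c.get("crash") or c.get("fire"):
            weight += CRASH_FIRE_BONUS
        total += weight
    return total / len(complaints)


# 1,000 complaints over ten years scores the same as 100 in a single year.
print(complaint_volume_signal(1000, 10) == complaint_volume_signal(100, 1))
```

The two functions return separate values on purpose: volume and severity stay disaggregated rather than being folded into one score.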

The ±20% uncertainty band

All cost projections carry ±20% estimation uncertainty. This figure reflects the genuine limits of population-level actuarial data applied to any individual vehicle. The uncertainty band is rendered visually on projection charts and stated in prose with hedging language.
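As a small illustration, the band and the hedged prose figure can both be derived from one point estimate. The function names and the choice to round prose figures to the nearest hundred dollars are assumptions made for this sketch.

```python
def uncertainty_band(estimate, pct=0.20):
    """Return the (lower, upper) bounds of a symmetric +/- pct band."""
    return estimate * (1 - pct), estimate * (1 + pct)


def hedged_cost(estimate):
    """Format an annual cost as hedged prose, rounded to the nearest $100."""
    rounded = round(estimate, -2)  # assumed rounding rule, for illustration
    return f"roughly ${rounded:,.0f}/yr"


low, high = uncertainty_band(4187)
print(hedged_cost(4187), f"(band: ${low:,.0f} to ${high:,.0f})")
```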

The same uncertainty governs the Prophecy Matrix winner designation. When vehicle A costs 20% less than vehicle B, A's estimate sits exactly at B's lower uncertainty bound (B × 0.80). Below that 20% margin, A's estimate falls inside B's uncertainty band and the model cannot reliably rank the options. The Witch withholds the winner badge in that case. A difference the data cannot distinguish is not a decision the math gets to make.
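The badge rule itself reduces to a margin check. A minimal sketch, assuming the margin is measured relative to the more expensive vehicle; the function name and signature are hypothetical.

```python
def winner_badge(name_a, cost_a, name_b, cost_b, min_margin=0.20):
    """Return the cheaper vehicle's name, or None if the gap is under the margin."""
    if cost_a == cost_b:
        return None
    (cheap_name, cheap), (_, dear) = sorted(
        [(name_a, cost_a), (name_b, cost_b)], key=lambda pair: pair[1]
    )
    margin = (dear - cheap) / dear  # how much less the cheaper option costs
    return cheap_name if margin >= min_margin else None


print(winner_badge("Keep", 8000, "Replace", 10000))  # 20% cheaper: badge awarded
print(winner_badge("Keep", 9000, "Replace", 10000))  # 10% cheaper: badge withheld
```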

The methodology exists to produce one output: an honest answer to an honest question.

Run the Analysis

The research is here because it should be auditable. The product is here because the question deserves a real answer. If you're holding a repair quote or wondering whether your car still makes financial sense to keep, the calculator is the place to start.

Cast the Math →