The Terminology Problem

"Reasoning" models don't reason the way humans do; they generate additional internal tokens before responding. "Flash" models aren't dumb; they're optimized for speed and cost. The names are marketing labels, not technical descriptions.

A Better Mental Model

System 1 (fast, intuitive): Quick pattern matching, immediate responses. Good for most tasks where the answer is recognizable.

System 2 (slow, deliberate): Step-by-step processing, explores multiple paths. Better for novel problems requiring multi-step logic.

— Borrowed from Kahneman's "Thinking, Fast and Slow"

The Key Insight

Neither mode is "smarter." They're different processing strategies optimized for different tasks. Using System 2 for everything wastes resources; using System 1 for complex logic produces errors.

Why Choose Direct Response

Most queries don't need extended deliberation. The model has seen similar patterns millions of times—it can recognize the right response without "thinking it through."

Cost: Lower token usage = lower cost

Speed: Faster time-to-response

When: Summaries, translations, standard coding tasks, Q&A, content generation

Why Choose Extended Processing

Some problems benefit from exploring multiple approaches before committing. The model generates internal "scratch work"—testing paths, catching errors, refining answers.

Cost: Higher token usage (thinking tokens count)

Speed: Slower, sometimes significantly

When: Multi-step math, complex logic, novel problems, planning, code architecture
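The two "When" lists above can be read as a simple routing rule. The sketch below is only an illustration of that idea: the function name `choose_mode`, the category strings, and the choice to fall back to the direct mode are all assumptions made for this example, not part of any real API.

```python
# Hypothetical task categories, taken from the two "When" lists above.
DIRECT_TASKS = {
    "summary", "translation", "standard coding", "q&a", "content generation",
}
EXTENDED_TASKS = {
    "multi-step math", "complex logic", "novel problem", "planning",
    "code architecture",
}


def choose_mode(task_type: str, default: str = "direct") -> str:
    """Return "direct" or "extended" for a task category.

    Unknown categories fall back to `default`. Defaulting to the cheaper
    direct mode is an assumption of this sketch.
    """
    key = task_type.strip().lower()
    if key in EXTENDED_TASKS:
        return "extended"
    if key in DIRECT_TASKS:
        return "direct"
    return default


if __name__ == "__main__":
    for task in ("Translation", "planning", "something else"):
        print(task, "->", choose_mode(task))
```

In practice the boundary is fuzzier than a lookup table, so a real router would likely need a classifier or a per-request override; the table only makes the tradeoff explicit.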

"Flash" / "Standard"

Direct Response

Pattern recognition → Output

What Actually Happens

1. Receive input tokens
2. Match patterns from training
3. Generate output tokens

Token Distribution

Input: Your prompt
Output: Response

Best For

Familiar patterns (summarize, translate, explain)
Speed-sensitive applications
Not ideal for: Novel multi-step logic puzzles

Response: Fast
Cost: Lower
Depth: Standard

"Reasoning" / "Thinking"

Extended Processing

Explore → Evaluate → Output

What Actually Happens

1. Receive input tokens
2. Generate internal "thinking" tokens
3. Explore multiple paths, self-check
4. Generate refined output tokens

Token Distribution

Input: Your prompt
Thinking: Internal processing
Output: Response
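Because thinking tokens count toward usage, the same prompt and the same visible answer can cost noticeably more in extended mode. The sketch below works through that arithmetic. The prices are made up for illustration, and billing thinking tokens at the output rate is an assumption; check your provider's pricing before relying on either.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Usage:
    input_tokens: int
    output_tokens: int
    thinking_tokens: int = 0  # zero for a direct response


# Hypothetical prices in dollars per million tokens (not real rates).
PRICE_PER_M_INPUT = 1.00
PRICE_PER_M_OUTPUT = 4.00


def estimate_cost(usage: Usage) -> float:
    """Estimate request cost in dollars.

    Assumption: thinking tokens are billed at the same rate as output tokens.
    """
    billed_output = usage.output_tokens + usage.thinking_tokens
    total = (
        usage.input_tokens * PRICE_PER_M_INPUT
        + billed_output * PRICE_PER_M_OUTPUT
    )
    return total / 1_000_000


# Same prompt and same visible answer; only the thinking tokens differ.
direct = Usage(input_tokens=1_000, output_tokens=500)
extended = Usage(input_tokens=1_000, output_tokens=500, thinking_tokens=4_000)

if __name__ == "__main__":
    print(f"direct:   ${estimate_cost(direct):.4f}")
    print(f"extended: ${estimate_cost(extended):.4f}")
```

Under these made-up numbers the direct request comes to about $0.003 and the extended one to about $0.019, which is why routing routine work to the direct mode matters at scale.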

Best For

Multi-step math & logic problems
Complex planning & architecture
Not ideal for: Simple lookups & standard tasks

Response: Slower
Cost: Higher
Analysis: Deeper

Good Tasks for Direct Response

Summarize this article
Translate to Spanish
Write a product description
Explain this concept
Fix this syntax error
Draft an email

Good Tasks for Extended Processing

Solve this proof
Debug this algorithm
Plan a system architecture
Analyze competing tradeoffs
Multi-step word problem
Find the flaw in this argument