Claude 4 Opus & Sonnet Review

The AI That Actually Gets Work Done

After a month of using Claude 4 daily, I finally understand why developers are abandoning everything else. This isn’t just another model update – it’s the first AI that can actually work autonomously for hours without going off the rails.

Anthropic quietly dropped Claude 4 in May 2025 with two models: Opus 4 (the powerhouse) and Sonnet 4 (the workhorse). While everyone was distracted by GPT-5 rumors and Grok drama, Claude became the tool that actually ships code. Here’s what you need to know.

What is Claude 4?

Claude 4 is Anthropic’s latest model family, featuring hybrid reasoning that can switch between instant responses and extended thinking. Think of it as having two brains: one for quick answers, another for deep work.

The lineup includes:

  • Claude Opus 4: The frontier model for complex, long-running tasks
  • Claude Sonnet 4: The efficient model that replaced Sonnet 3.7

Both models can use tools, maintain memory across tasks, and, most impressively, work autonomously for hours without supervision. This isn’t marketing speak. I’ve watched Claude refactor entire codebases while I slept.

Key Features That Actually Matter

World’s Best Coding Model

Claude Opus 4 leads SWE-bench Verified at 72.5% and Terminal-bench at 43.2%, going by Anthropic’s launch numbers. In human terms, and judging from my own use: it writes better code than most junior developers and some seniors.

Extended Thinking with Tools

The models can alternate between reasoning and using tools (web search, code execution) during extended thinking. It’s like having an assistant who knows when to stop and research before answering.

Actual Memory That Works

When given file access, Claude builds “tacit knowledge” over time by keeping notes in files. It carries context from previous interactions into new tasks. Finally, an AI whose memory outlasts a single conversation.

Parallel Tool Usage

Both models can use multiple tools simultaneously. While one tool call is searching the web, others can be analyzing your codebase and drafting the implementation plan. Multitasking that actually works.
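To give a feel for why parallel tool calls matter, here is a minimal client-side sketch. It assumes three tool calls that each take about 0.2 seconds; `fake_tool` and the tool names are hypothetical stand-ins, not part of any Anthropic API. Run concurrently, the calls finish in roughly the time of the slowest one rather than the sum of all three.

```python
import asyncio
import time


async def fake_tool(name: str, seconds: float) -> str:
    """Stand-in for a slow tool call (hypothetical; a real tool would do I/O)."""
    await asyncio.sleep(seconds)
    return f"{name}: done"


async def run_tools_in_parallel() -> list[str]:
    # gather() runs all three coroutines concurrently and returns
    # their results in the order they were passed in.
    return await asyncio.gather(
        fake_tool("web_search", 0.2),
        fake_tool("codebase_scan", 0.2),
        fake_tool("plan_draft", 0.2),
    )


start = time.perf_counter()
results = asyncio.run(run_tools_in_parallel())
elapsed = time.perf_counter() - start

print(results)
print(f"elapsed: {elapsed:.2f}s")  # close to one call's duration, not three
```

This only illustrates the scheduling idea. How the model decides which tools to call together is up to the model, not this code.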

Real-World Testing Results

I’ve used both models extensively across different scenarios:

Opus 4 Performance

  • 7-hour coding session: Autonomously refactored a 50k line codebase, maintaining context throughout
  • Research tasks: Analyzed 200+ sources for a market report, synthesizing insights I would have missed
  • Creative writing: Produced a 10k word story with consistent characters and plot – genuinely impressive
  • Error rate: Near-zero hallucinations that I could spot in extended sessions

Sonnet 4 Performance

  • Daily coding: Faster than Opus, perfect for quick implementations
  • Bug fixes: Understands codebase context, suggests surgical fixes
  • Documentation: Writes clearer docs than most humans
  • Cost-efficiency: Roughly 80% of Opus quality, by my estimate, at 20% of the price

Pricing: Fair for What You Get

Claude Opus 4:

  • Input: $15 per million tokens
  • Output: $75 per million tokens
  • With caching: Up to 90% cheaper
  • With batch: 50% cheaper

Claude Sonnet 4:

  • Input: $3 per million tokens
  • Output: $15 per million tokens
  • Free tier available on Claude.ai

For context:

  • More expensive than GPT-4o but delivers more value
  • Cheaper than GPT-4.5’s insane pricing
  • Best value: Sonnet 4 for most tasks, Opus 4 for critical work
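To make the per-token prices concrete, here is a small cost estimator built only from the list prices and the batch discount quoted above. The workload (2M input tokens, 500k output tokens) is a made-up example, and the function name is my own. Caching is left out because the “up to 90%” figure depends on how much of each prompt is actually reused.

```python
# Per-million-token list prices quoted in this review (USD).
PRICES = {
    "opus-4": {"input": 15.00, "output": 75.00},
    "sonnet-4": {"input": 3.00, "output": 15.00},
}

BATCH_DISCOUNT = 0.50  # "With batch: 50% cheaper"


def estimate_cost(model: str, input_tokens: int, output_tokens: int,
                  batch: bool = False) -> float:
    """Return the estimated USD cost of one job at list prices."""
    price = PRICES[model]
    cost = (input_tokens / 1_000_000) * price["input"]
    cost += (output_tokens / 1_000_000) * price["output"]
    if batch:
        cost *= 1 - BATCH_DISCOUNT
    return round(cost, 4)


# Hypothetical workload: 2M input tokens, 500k output tokens.
opus = estimate_cost("opus-4", 2_000_000, 500_000)
sonnet = estimate_cost("sonnet-4", 2_000_000, 500_000)
opus_batch = estimate_cost("opus-4", 2_000_000, 500_000, batch=True)

print(f"Opus 4:           ${opus:.2f}")        # $67.50
print(f"Sonnet 4:         ${sonnet:.2f}")      # $13.50
print(f"Opus 4 via batch: ${opus_batch:.2f}")  # $33.75
```

Same job, 5x the bill on Opus. That ratio holds for any mix of input and output tokens, since both prices are 5x Sonnet’s.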

Claude 4 vs. The Competition

Claude 4 vs. GPT-4.5

  • Coding: Claude destroys GPT-4.5
  • Extended tasks: Claude can work for hours, GPT can’t
  • Price: Claude is cheaper than GPT-4.5 and worth every cent
  • Reasoning: Both strong, Claude more reliable

Claude 4 vs. Grok 4

  • Safety: Claude won’t tweet antisemitic content
  • Consistency: Claude more predictable
  • Benchmarks: Grok edges ahead on some tests
  • Real work: Claude wins hands down

Claude 4 vs. Previous Claude

  • Speed: 2-3x faster responses
  • Accuracy: Dramatically reduced errors
  • Capabilities: Night and day difference
  • Price: Same as before (amazing value)

Who Should Use Claude 4?

Perfect For:

  • Software developers who want an actual coding partner
  • Researchers needing deep, accurate analysis
  • Writers who want quality over quantity
  • Teams building AI agents and automation
  • Anyone tired of babysitting their AI

Skip If:

  • You just need basic chat (the free Sonnet 4 tier is enough; skip the paid plans)
  • Budget is extremely tight (try open-source)
  • You want cutting-edge multimodal (wait for updates)
  • You prefer chaos (stick with Grok)

The Game-Changing Features

Claude Code Integration

The Claude Code terminal tool is revolutionary. Point it at your project, give it a task, and watch it work. It understands:

  • Cross-file dependencies
  • Project architecture
  • Your coding style
  • When to ask for clarification

Extended Thinking Mode

This is the killer feature. Claude can “think” for minutes, showing you its reasoning process. It’s like pair programming with someone who never gets tired or frustrated.

Memory That Persists

Give Claude access to a notes file, and it maintains context across sessions. It remembers your preferences, project details, and previous decisions. Game-changing for long-term projects.
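The notes-file pattern is easy to picture in code. Below is a minimal sketch of the idea, not how Claude or Claude Code implements it: a plain text file collects dated notes, and each new session starts by putting those notes in front of the task. The function names and the notes format are my own assumptions.

```python
import tempfile
from datetime import date
from pathlib import Path


def load_notes(path: Path) -> str:
    """Return saved notes, or an empty string if none exist yet."""
    return path.read_text(encoding="utf-8") if path.exists() else ""


def append_note(path: Path, note: str) -> None:
    """Add one dated bullet to the notes file."""
    with path.open("a", encoding="utf-8") as f:
        f.write(f"- [{date.today().isoformat()}] {note}\n")


def build_prompt(path: Path, task: str) -> str:
    """Prefix the task with notes from earlier sessions, if any."""
    notes = load_notes(path)
    header = f"Project notes from earlier sessions:\n{notes}\n" if notes else ""
    return f"{header}Task: {task}"


# Usage with a throwaway directory (hypothetical project notes).
with tempfile.TemporaryDirectory() as tmp:
    notes_file = Path(tmp) / "claude_notes.md"
    append_note(notes_file, "Team prefers pytest over unittest")
    append_note(notes_file, "Database layer lives in db/, do not touch migrations")
    print(build_prompt(notes_file, "Add tests for the billing module"))
```

In practice Claude writes and updates the file itself once it has file access. The point of the sketch is only that the “memory” is ordinary text you can read, edit, and version.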

Real Developer Testimonials

The endorsements aren’t just marketing:

  • Cursor: “State-of-the-art for coding”
  • GitHub: Chose Sonnet 4 to power the new coding agent in GitHub Copilot
  • Replit: “Fundamentally changes how our agent works”
  • Cognition: “Handles critical actions others miss”

These aren’t random startups – these are the tools developers actually use.

Tips for Maximum Value

  1. Use Sonnet 4 by default – It’s fast and handles 90% of tasks
  2. Save Opus 4 for complex work – Long sessions, critical code, deep research
  3. Enable file access – Let it build memory for better results
  4. Use extended thinking – Worth the wait for complex problems
  5. Leverage prompt caching – 90% cost reduction for repeated tasks
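Tips 1 and 2 boil down to a routing rule, which can be sketched in a few lines. The one-hour threshold, the `Task` fields, and the short model labels are my own assumptions for illustration, so tune them to your workload.

```python
from dataclasses import dataclass


@dataclass
class Task:
    description: str
    expected_hours: float = 0.1  # rough guess at how long the job runs
    critical: bool = False       # e.g. production code, high-stakes research


def pick_model(task: Task, long_session_hours: float = 1.0) -> str:
    """Default to Sonnet 4; escalate to Opus 4 for long or critical work."""
    if task.critical or task.expected_hours >= long_session_hours:
        return "opus-4"
    return "sonnet-4"


print(pick_model(Task("Fix typo in README")))                         # sonnet-4
print(pick_model(Task("Refactor payment service", expected_hours=6))) # opus-4
print(pick_model(Task("Patch auth bug", critical=True)))              # opus-4
```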

The ASL-3 Elephant

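Anthropic deployed Opus 4 under its ASL-3 safeguards, a level reserved for models that could “substantially increase” someone’s ability to create chemical, biological, radiological, or nuclear weapons. Anthropic describes the move as precautionary and says it has not determined whether Opus 4 actually crosses that threshold. Either way, they’re not joking about the power here. The safety measures are robust, and this is genuinely frontier capability.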

What’s Missing?

  • Voice mode: Coming but not here yet
  • Image generation: Still can’t create images
  • Video understanding: On the roadmap
  • Real-time collaboration: Would be incredible

The Bottom Line

Claude 4 isn’t just an incremental update – it’s a fundamental shift in what AI can do. For the first time, I have an AI that can take a complex task and actually complete it without constant supervision.

Opus 4 is expensive but delivers genuinely unprecedented capabilities. Sonnet 4 offers 80% of the power at a fraction of the cost. Together, they’ve become indispensable to my workflow.

If you’re serious about using AI for real work – not just demos and toys – Claude 4 is the only choice that makes sense right now.

Frequently Asked Questions

Q: Is Opus 4 really worth 5x the cost of Sonnet 4? A: For extended autonomous work, absolutely. For quick tasks, no. Use Sonnet by default.

Q: How does the memory feature work? A: Give it file access, and it maintains a knowledge base across sessions. Revolutionary for ongoing projects.

Q: Can it really code for 7 hours straight? A: Yes. I’ve seen it. Make sure you have good test coverage first.

Q: Is it better than human developers? A: At specific tasks, yes. At understanding business context and making architectural decisions, not yet.

Q: Will it replace developers? A: It’ll replace developers who can’t work with AI. Learn to use it or get left behind.

Ready to experience actual AI productivity? Try Sonnet 4 free at claude.ai or get API access at anthropic.com.