On Feb. 5, 2026, Anthropic launched Claude Opus 4.6 with a 1M-token context window (beta), describing it as “a first for our Opus-class models.” In the official announcement, Anthropic reiterated: “Opus 4.6 features a 1M token context window in beta,” signaling a major leap for teams that routinely work with sprawling codebases, multi-document investigations, and dense knowledge-work workflows.
Long context isn’t just a bigger buffer for prompts; it changes how people structure work with AI. Instead of chunking documents into dozens of calls or building elaborate retrieval pipelines for every task, a million tokens can let a model track far more of the source material directly, enabling analysis that’s closer to “read everything first, then answer” than “search and stitch.”
What “1 million-token context” really means in practice
A context window is the amount of text the model can consider at once, including your prompt, tool outputs, and prior conversation turns. With Claude Opus 4.6 moving to a one-million-token context window in beta, the ceiling rises dramatically for workloads that previously required splitting inputs into many smaller segments.
To translate the scale into something concrete, earlier long-context coverage around Claude Sonnet 4 often cited a rough equivalence of about “~750,000 words” for one million tokens (a comparison widely repeated when Sonnet 4 reached 1M context in 2025). The exact mapping varies by language and formatting (code vs. prose), but the broader point stands: this is “whole libraries of documents” territory, not “a few PDFs.”
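The “~750,000 words” figure follows from a common back-of-envelope heuristic of roughly 0.75 English words per token. A minimal sketch of that conversion, with the caveat that real tokenizer output varies by language and formatting:

```python
# Rough conversion between tokens and words for English prose.
# The 0.75 words-per-token ratio is a heuristic, not a tokenizer measurement;
# code, non-English text, and dense formatting shift it considerably.

WORDS_PER_TOKEN = 0.75  # heuristic only

def approx_words(tokens: int) -> int:
    """Estimate English word count for a given token budget."""
    return int(tokens * WORDS_PER_TOKEN)

def approx_tokens(words: int) -> int:
    """Estimate tokens needed to hold a given number of English words."""
    return int(words / WORDS_PER_TOKEN)

print(approx_words(1_000_000))  # -> 750000, matching the "~750,000 words" figure
print(approx_tokens(90_000))    # -> 120000: a 90K-word book fits comfortably
```

Useful for quick feasibility checks (“will this document set fit?”) before committing to an expensive long-context request.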
The Verge noted that Opus 4.6 includes a beta one-million-token context window that can enable work across multiple documents. That matters because many real workflows (due diligence, incident response, contract review, design audits) are inherently multi-source, and the friction of constant chunking can be the difference between a tool people try and a tool people adopt.
From Sonnet 4 to Opus 4.6: why this milestone matters
Anthropic introduced one-million-token context for Claude Sonnet 4 earlier (Aug. 12, 2025), framing it as “a 5x increase” and positioning it for processing entire codebases and extensive document sets. That earlier release established the technical precedent and created user expectations for what “long context” could unlock.
Opus 4.6 matters because Opus is Anthropic’s top-tier class where users often expect maximum capability for high-stakes, complex work. Anthropic’s Feb. 5, 2026 messaging emphasizes that the 1M context is “a first for our Opus-class models,” which signals that long-context workloads no longer require stepping down to a different family for certain tasks.
SiliconANGLE also framed it similarly, reporting that Opus 4.6 will support 1 million tokens of context (beta) on the Claude Developer Platform at launch, and highlighting that it’s the first Opus model to get long context. For engineering leaders, that combination (top-tier model plus top-tier context) reduces the need to trade off depth of reasoning against breadth of material.
Access, platforms, and the reality of gating
Not everyone automatically gets the full 1M-token window on day one. Claude Docs on context windows state that 1M-token context availability requires usage tier 4 or custom rate limits, an explicit form of API access gating that reflects the higher infrastructure cost and demand profile of very large prompts.
From a deployment standpoint, Claude Docs also list long-context availability as “currently available on the Claude API, Amazon Bedrock, and Google Cloud’s Vertex AI.” That’s important for enterprises that standardize on a specific cloud procurement route, because “supported by the model” and “available in our platform” are often different milestones.
The practical takeaway: product teams should treat 1M context as a capability that must be verified per environment (direct API vs. Bedrock vs. Vertex), per account tier, and per rate limit configuration. Planning for that early avoids a common rollout trap: building a workflow around 1M tokens and discovering the production account is capped lower.
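The gating rules above can be encoded as a simple preflight guard in deployment tooling. This is a planning-aid sketch that mirrors the conditions the docs describe (tier 4 or custom limits; supported platforms); it is not an entitlement check, and the only authoritative test remains a real request in each environment:

```python
# Preflight sketch: encode the long-context gating rules locally so a
# deployment pipeline can flag misconfigured environments early.
# Platform names and tier values mirror the Claude Docs description;
# always confirm with an actual large-context request before relying on it.

LONG_CONTEXT_PLATFORMS = {"claude_api", "amazon_bedrock", "vertex_ai"}

def long_context_expected(platform: str, usage_tier: int, custom_limits: bool) -> bool:
    """Return True if the 1M-token beta window should be available.

    Per the docs: requires usage tier 4 or custom rate limits, on one of
    the supported platforms. A True result means "worth verifying", not
    "guaranteed enabled".
    """
    if platform not in LONG_CONTEXT_PLATFORMS:
        return False
    return usage_tier >= 4 or custom_limits

print(long_context_expected("claude_api", 4, False))     # True
print(long_context_expected("amazon_bedrock", 2, True))  # True via custom limits
print(long_context_expected("vertex_ai", 3, False))      # False: capped lower
```

Wiring a check like this into CI or environment provisioning catches the “production account is capped lower” trap before users hit it.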
Pricing and performance: the long-context tradeoffs
Opus 4.6 introduces pricing details explicitly tied to long context. Anthropic notes: “Premium pricing applies for prompts exceeding 200k tokens ($10/$37.50 per million input/output tokens).” In other words, the million-token ceiling is real, but the economics change once you cross a threshold.
Claude Docs make the rule even more explicit: requests over 200K tokens are charged at premium rates, with multipliers of 2x on input and 1.5x on output pricing. For architects, this suggests a new optimization mindset: use large context when it buys measurable value (fewer calls, fewer retrieval errors, better synthesis), not simply because it’s available.
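The threshold economics are easy to model. Dividing the quoted premium rates ($10/$37.50 per million input/output tokens) by the stated 2x/1.5x multipliers implies base rates of $5/$25; the sketch below assumes, as the docs describe, that a request crossing the 200K threshold is billed entirely at premium rates:

```python
# Cost model sketch for tiered long-context pricing.
# Premium rates ($10 / $37.50 per MTok) and multipliers (2x input, 1.5x output)
# are from the announcement; the $5 / $25 base rates are implied by dividing
# premium rates by those multipliers. Assumes the whole request is billed at
# the premium rate once the prompt exceeds 200K tokens.

BASE_INPUT, BASE_OUTPUT = 5.00, 25.00          # $ per million tokens (implied)
PREMIUM_INPUT, PREMIUM_OUTPUT = 10.00, 37.50   # $ per million tokens (quoted)
THRESHOLD = 200_000

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request under the tiered long-context pricing."""
    if input_tokens > THRESHOLD:
        in_rate, out_rate = PREMIUM_INPUT, PREMIUM_OUTPUT
    else:
        in_rate, out_rate = BASE_INPUT, BASE_OUTPUT
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

print(round(request_cost(150_000, 8_000), 2))   # 0.95  (standard rates)
print(round(request_cost(800_000, 20_000), 2))  # 8.75  (premium rates)
```

A model like this makes the tradeoff concrete: one 800K-token synthesis pass may still be cheaper, and less error-prone, than dozens of chunked calls plus the glue code to stitch their answers together.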
At the same time, Opus 4.6 supports up to 128K output tokens, which becomes especially useful when paired with long prompts. If you’re asking the model to produce a detailed audit report, a structured migration plan, or a long-form security review that references many sources, output room can be as important as input room; otherwise the answer gets truncated just when it becomes most valuable.
Context compaction: making million-token workflows sustainable
One of the less flashy but highly practical additions in Opus 4.6 is “context compaction.” Anthropic describes it as a way to summarize or replace older context near a threshold to extend task length, essentially managing the conversation’s memory so work can continue without constantly restarting.
This feature addresses a common issue with long-running projects: even with huge windows, iterative collaboration can bloat context with intermediate drafts, repeated instructions, and earlier dead ends. Compaction can preserve the essential decisions and references while removing redundant text, keeping the working set smaller and more relevant.
For teams building tools on top of Claude, compaction also hints at a more ergonomic pattern for long tasks: allow users to keep feeding material, periodically compress state, and continue, rather than forcing “session resets” that break continuity. Combined with 128K output, it supports workflows that look more like a sustained analysis session than a series of disconnected Q&A calls.
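Anthropic’s compaction runs on the platform side, but the pattern it enables is easy to illustrate client-side: when the running history nears a budget, fold older turns into a summary and keep only the recent ones. In this sketch the `summarize` function is a placeholder (a real implementation would be another model call), and the 4-characters-per-token estimate is a crude heuristic:

```python
# Illustrative client-side analogue of context compaction: when conversation
# history nears a token budget, replace older turns with a single summary so
# work continues without a session reset. summarize() is a stub; in practice
# it would ask the model to compress turns while preserving key decisions.

def estimate_tokens(text: str) -> int:
    return max(1, len(text) // 4)  # crude heuristic: ~4 characters per token

def summarize(messages: list[str]) -> str:
    # Placeholder for a real summarization call.
    return "SUMMARY OF %d EARLIER TURNS" % len(messages)

def compact(history: list[str], budget: int, keep_recent: int = 2) -> list[str]:
    """Fold older turns into a summary once the history exceeds the budget."""
    total = sum(estimate_tokens(m) for m in history)
    if total <= budget or len(history) <= keep_recent:
        return history
    older, recent = history[:-keep_recent], history[-keep_recent:]
    return [summarize(older)] + recent

history = ["draft v1 " * 300, "review notes " * 300, "final decision", "next step"]
compacted = compact(history, budget=500)
print(len(compacted))  # 3: one summary plus the two most recent turns
```

The design choice worth copying is that recent turns survive verbatim while only the older tail is compressed, which keeps the model grounded in the current state of the task.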
Why 1M context is a big deal for cybersecurity and large codebases
Axios described Opus 4.6 as unusually strong in cybersecurity work and reported it found “over 500” previously unknown high-severity vulnerabilities. Regardless of the exact testing setup behind that figure, the connection to long context is intuitive: real security reviews often require reading large swaths of a codebase, tracing flows across files, and correlating patterns with documentation and configuration.
With a million tokens, it becomes more feasible to include broad portions of repositories, dependency manifests, build scripts, and security policies in one analytical pass. That can reduce the chance that a critical clue lives “just outside the window,” and it can cut down on the brittle logic of deciding which files to include in each chunk.
Importantly, long context doesn’t eliminate the need for good security process; it augments it. The most effective pattern is to use the large window to keep the model grounded in primary sources (code, logs, standards), then ask for precise, testable outputs: vulnerable functions, exploit scenarios, remediation diffs, and verification steps.
Enterprise knowledge work: multi-document synthesis at scale
Financial Times coverage of the Opus 4.6 announcement framed it in terms of enterprise and knowledge-work positioning, including the ability to process larger amounts of data. That’s where a million tokens can feel less like a technical metric and more like an operational change: fewer handoffs between tools, fewer manual summaries, and a faster path from “inputs” to “decision.”
Consider typical corporate workflows: an M&A review spanning contracts, emails, risk memos, and financial statements; a regulatory response requiring citation across policies and evidence; or a product strategy exercise combining research, internal metrics, and customer feedback. A large context window can support a single, traceable workspace where the model can reference many documents without constantly reloading them.
That said, the best enterprise implementations will still be selective. Premium pricing past 200K tokens and access gating (tier 4 or custom limits) push teams toward disciplined designs: pre-filtering documents, using long context for the “final synthesis” pass, and applying compaction to keep the session lean while retaining auditability.
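The “pre-filter, then synthesize” discipline can be as simple as a cheap local relevance pass that decides which documents earn a seat in the expensive long-context call. The scoring below is naive keyword overlap (real systems might use embeddings or a smaller model), but the two-stage shape is the point:

```python
# Two-stage pipeline sketch: a cheap local pre-filter trims the document set,
# and only the survivors go into the premium-priced long-context synthesis
# pass. Keyword-overlap scoring is a deliberately naive stand-in for
# embeddings or a smaller-model ranking step.

def relevance(doc: str, query_terms: set[str]) -> int:
    """Count how many query terms appear in the document (case-insensitive)."""
    text = doc.lower()
    return sum(1 for term in query_terms if term.lower() in text)

def prefilter(docs: dict[str, str], query_terms: set[str], top_k: int = 3) -> list[str]:
    """Return names of the top_k most relevant documents for the final pass."""
    ranked = sorted(docs, key=lambda name: relevance(docs[name], query_terms),
                    reverse=True)
    return ranked[:top_k]

docs = {
    "contract.txt": "indemnification clause and liability cap terms",
    "memo.txt": "risk memo: liability exposure under the contract",
    "lunch.txt": "cafeteria menu for the week",
}
keep = prefilter(docs, {"liability", "contract"}, top_k=2)
print(keep)  # ['memo.txt', 'contract.txt'] -- the cafeteria menu is filtered out
```

Because the filter runs locally, it costs nothing per document, and keeping an audit log of what was filtered (and why) preserves the traceability that enterprise review processes demand.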
Claude Opus 4.6’s 1M-token context window (beta) marks a notable shift: long-context capability has moved into Anthropic’s Opus class, backed by official statements and echoed by reporting from outlets like The Verge and SiliconANGLE. Combined with up to 128K output tokens, it enables workflows where both the evidence and the deliverable can be large, detailed, and continuous.
The opportunity is real, but so are the constraints. Access may require usage tier 4 or custom rate limits, and the economics change after 200K tokens via premium pricing (2x input, 1.5x output). The teams that get the most value will treat one million tokens as a strategic resource: use it when breadth of context materially improves accuracy, leverage context compaction to stay efficient, and design processes that remain verifiable for security and enterprise decision-making.