CLAUDE March 28, 2026
Extended Thinking Now Streams in Real Time
WHAT CHANGED
Extended thinking mode in Claude now streams thinking tokens in real time. Previously, the full thinking block was buffered and delivered only once it was complete.
WHY IT MATTERS
Users see the model reasoning live instead of staring at a blank screen. Perceived latency drops dramatically for long reasoning tasks.
HOW TO USE IT
Enable both streaming and extended thinking in your API call, then handle the content_block_delta events that belong to thinking-type content blocks.
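The event handling described above might look roughly like the sketch below. It makes no network calls: it runs over a hand-written list of event dictionaries so the routing logic can be read on its own. The delta type names (thinking_delta, text_delta, signature_delta) and the field names are assumptions about the streaming event shape, not something this note states. The helper names split_stream and collect are hypothetical, and SAMPLE_EVENTS is invented for illustration.

```python
from typing import Iterable, Iterator, List, Tuple

# Illustrative events only. The shapes below are an assumption about what the
# streaming API emits; check the official streaming reference before relying on them.
SAMPLE_EVENTS: List[dict] = [
    {"type": "content_block_start", "index": 0,
     "content_block": {"type": "thinking", "thinking": ""}},
    {"type": "content_block_delta", "index": 0,
     "delta": {"type": "thinking_delta", "thinking": "Let me check. "}},
    {"type": "content_block_delta", "index": 0,
     "delta": {"type": "thinking_delta", "thinking": "2+2=4."}},
    {"type": "content_block_delta", "index": 0,
     "delta": {"type": "signature_delta", "signature": "abc123"}},
    {"type": "content_block_stop", "index": 0},
    {"type": "content_block_start", "index": 1,
     "content_block": {"type": "text", "text": ""}},
    {"type": "content_block_delta", "index": 1,
     "delta": {"type": "text_delta", "text": "The answer "}},
    {"type": "content_block_delta", "index": 1,
     "delta": {"type": "text_delta", "text": "is 4."}},
    {"type": "content_block_stop", "index": 1},
    {"type": "message_stop"},
]


def split_stream(events: Iterable[dict]) -> Iterator[Tuple[str, str]]:
    """Yield (kind, fragment) pairs as deltas arrive.

    kind is "thinking" or "text"; every other event or delta type is skipped.
    """
    for event in events:
        if event.get("type") != "content_block_delta":
            continue
        delta = event.get("delta", {})
        if delta.get("type") == "thinking_delta":
            yield ("thinking", delta.get("thinking", ""))
        elif delta.get("type") == "text_delta":
            yield ("text", delta.get("text", ""))


def collect(events: Iterable[dict]) -> Tuple[str, str]:
    """Join the streamed fragments into (full_thinking, full_text)."""
    thinking: List[str] = []
    text: List[str] = []
    for kind, fragment in split_stream(events):
        (thinking if kind == "thinking" else text).append(fragment)
    return "".join(thinking), "".join(text)


if __name__ == "__main__":
    # A UI would render each fragment as soon as it is yielded.
    for kind, fragment in split_stream(SAMPLE_EVENTS):
        print(f"[{kind}] {fragment}")
```

In a real client, the events would come from the streaming response of a request that has both streaming and thinking enabled. The exact request parameters are not given here, so they are left out of the sketch. Yielding fragments one at a time, rather than returning the joined strings, is what lets the interface show reasoning as it arrives.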
thinking, streaming, api
ORIGINAL SOURCE
Streaming extended thinking is a major UX unlock. Instead of waiting 10–30 seconds for a response to appear, users watch the reasoning unfold in real time. This is particularly valuable for complex coding and analysis tasks where the thinking process itself is informative.