INTELLIGENCE FEED

ALL UPDATES

15 UPDATES TRACKED

CLAUDE NEW
APR 10, 2026

Claude Sonnet 4.6: Extended Thinking Streams 2x Faster

WHAT CHANGED

Extended thinking in claude-sonnet-4-6 now streams at 2x the previous throughput. Internal reasoning tokens arrive in real time via the thinking content block, with no additional latency penalty on the first token.

WHY IT MATTERS

For any app using extended thinking — code review, multi-step reasoning, complex planning — the UX dramatically improves. Users see Claude working through problems as they happen, not waiting for a wall of text to appear.

HOW TO USE IT

Pass budget_tokens in the thinking parameter alongside stream: true. The stream emits thinking blocks first, then text blocks. Parse content_block_delta events whose delta type is thinking_delta to render the internal monologue separately.

CLAUDE / TYPESCRIPT
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

async function streamWithThinking(prompt: string) {
  const stream = await client.messages.stream({
    model: "claude-sonnet-4-6",
    max_tokens: 16000,
    thinking: {
      type: "enabled",
      budget_tokens: 10000,
    },
    messages: [{ role: "user", content: prompt }],
  });

  let thinkingText = "";
  let responseText = "";

  for await (const event of stream) {
    if (event.type === "content_block_delta") {
      if (event.delta.type === "thinking_delta") {
        thinkingText += event.delta.thinking;
        process.stdout.write("\x1b[2m"); // dim
        process.stdout.write(event.delta.thinking);
        process.stdout.write("\x1b[0m");
      } else if (event.delta.type === "text_delta") {
        responseText += event.delta.text;
        process.stdout.write(event.delta.text);
      }
    }
  }

  return { thinking: thinkingText, response: responseText };
}

streamWithThinking(
  "Design a database schema for a multi-tenant SaaS app with row-level security."
);
extended-thinking, streaming, performance, api

CHATGPT NEW
APR 08, 2026

GPT-4o Persistent Memory Now On By Default for Plus & Pro

WHAT CHANGED

OpenAI flipped the switch on persistent memory for all Plus and Pro subscribers. ChatGPT now automatically stores facts, preferences, and context across all conversations without users needing to opt in or use custom instructions.

WHY IT MATTERS

This fundamentally changes how power users interact with ChatGPT. No more re-explaining your stack, your preferences, or your projects every session. For indie hackers, this means ChatGPT can hold persistent context about your product, codebase preferences, and writing style.

HOW TO USE IT

Memory is automatic — just start working. Explicitly tell ChatGPT facts you want retained: 'remember that I always use TypeScript strict mode' or 'my main product is a B2B SaaS for HR teams'. Review and manage memories at Settings → Personalization → Memory.

memory, personalization, gpt-4o, plus, pro

GEMINI NEW
APR 07, 2026

Gemini 2.5 Pro: Google Search Grounding Free Up to 1,500 Queries/Day

WHAT CHANGED

Google has made grounding with Google Search free for up to 1,500 queries per day on Gemini 2.5 Pro via the Gemini API. Previously this was a paid add-on. Beyond 1,500 queries, standard grounding rates apply.

WHY IT MATTERS

Grounding means Gemini's responses are anchored to current web results — far fewer hallucinated facts, no stale training data. For research tools, news aggregators, or any app where accuracy matters, 1,500 free grounded queries per day covers most indie hacker workloads completely.

HOW TO USE IT

Add the google_search tool to your request. The response includes grounding metadata with source URLs. For production apps with >1,500 queries/day, standard pricing applies, around $35 per 1,000 grounded queries.

GEMINI / PYTHON
import google.generativeai as genai
import os

genai.configure(api_key=os.environ["GEMINI_API_KEY"])

model = genai.GenerativeModel(
    model_name="gemini-2.5-pro",
    tools=[{"google_search": {}}],
)

response = model.generate_content(
    "What are the latest updates to Claude's API in 2026?",
    generation_config=genai.GenerationConfig(
        temperature=0.1,
    ),
)

# Print grounded response
print(response.text)

# Access grounding metadata (sources)
if response.candidates[0].grounding_metadata:
    for chunk in response.candidates[0].grounding_metadata.grounding_chunks:
        print(f"Source: {chunk.web.uri}")
        print(f"Title: {chunk.web.title}")
grounding, search, free-tier, gemini-2.5-pro, api

CURSOR
APR 05, 2026

Cursor 0.44: Background Agent Runs Tests & Pushes to Branches

WHAT CHANGED

Cursor 0.44 ships Background Agent — a fully autonomous coding agent that runs in a remote sandboxed environment. It can execute your test suite, iterate on failures, commit code, and push to feature branches, all while you do other work.

WHY IT MATTERS

This is the first Cursor feature that genuinely removes you from the edit-run-fix loop. Point it at a failing test or a GitHub issue, go make coffee, and come back to a PR. For solo developers, it multiplies your effective output without requiring you to sit and watch an AI code.

HOW TO USE IT

Open the Command Palette → 'Start Background Agent'. Describe the task in natural language. The agent spins up a fresh environment, clones your repo, and begins. You get async notifications when it finishes or needs clarification. Connect GitHub for auto-PR creation.

background-agent, autonomous, testing, git, cursor-0.44

ELEVENLABS
APR 03, 2026

ElevenLabs: Instant Voice Cloning From 10 Seconds of Audio

WHAT CHANGED

ElevenLabs now supports Instant Voice Cloning from as little as 10 seconds of audio via their API. Upload a clean audio sample, get a voice_id back in under 3 seconds, and immediately use it for text-to-speech generation.

WHY IT MATTERS

The friction of adding a custom voice to an app just collapsed. 10 seconds of audio is easy to collect from any user, any podcast clip, or any spokesperson recording. This unlocks personalized TTS at scale — onboarding flows, AI assistants, content localization.

HOW TO USE IT

POST an audio file to /v1/voices/add with name and files[] parameters. The API returns a voice_id immediately. Pass that voice_id to /v1/text-to-speech/{voice_id} for synthesis. Minimum sample: 10 seconds of clear speech, no background noise.

ELEVENLABS / PYTHON
import requests
import os

ELEVEN_API_KEY = os.environ["ELEVEN_API_KEY"]

# Step 1: Clone a voice from a short audio sample
def clone_voice(name: str, audio_path: str) -> str:
    with open(audio_path, "rb") as f:
        response = requests.post(
            "https://api.elevenlabs.io/v1/voices/add",
            headers={"xi-api-key": ELEVEN_API_KEY},
            data={"name": name},
            files={"files": (audio_path, f, "audio/mpeg")},
        )
    response.raise_for_status()
    return response.json()["voice_id"]

# Step 2: Generate speech with the cloned voice
def speak(voice_id: str, text: str, output_path: str):
    response = requests.post(
        f"https://api.elevenlabs.io/v1/text-to-speech/{voice_id}",
        headers={
            "xi-api-key": ELEVEN_API_KEY,
            "Content-Type": "application/json",
        },
        json={
            "text": text,
            "model_id": "eleven_multilingual_v2",
            "voice_settings": {"stability": 0.5, "similarity_boost": 0.8},
        },
    )
    response.raise_for_status()
    with open(output_path, "wb") as f:
        f.write(response.content)

voice_id = clone_voice("My Narrator", "sample.mp3")
speak(voice_id, "Welcome to the future of voice AI.", "output.mp3")
voice-cloning, instant, api, tts

CLAUDE
APR 01, 2026

Claude Sonnet 4.6 Released

WHAT CHANGED

Anthropic released Claude Sonnet 4.6, the latest in the Claude 4 family, with improved reasoning, faster response times, and better instruction following compared to Sonnet 3.7.

WHY IT MATTERS

Sonnet 4.6 is the sweet spot model — smarter than Haiku, cheaper than Opus. Most production apps should migrate to this as the default.

HOW TO USE IT

Update your model string to claude-sonnet-4-6 in any existing Anthropic API call. No other changes needed.

CLAUDE / TYPESCRIPT
import Anthropic from "@anthropic-ai/sdk";

const anthropic = new Anthropic();

const response = await anthropic.messages.create({
  model: "claude-sonnet-4-6",
  max_tokens: 1024,
  messages: [{ role: "user", content: "Your prompt here" }]
});

console.log(response.content[0].text);
release, api, sonnet

CLAUDE
MAR 28, 2026

Extended Thinking Now Streams in Real-Time

WHAT CHANGED

Extended thinking mode in Claude now streams thinking tokens in real-time. Previously the full thinking block was buffered before delivery.

WHY IT MATTERS

Users see the model reasoning live instead of staring at a blank screen. Perceived latency drops dramatically for long reasoning tasks.

HOW TO USE IT

Enable streaming together with thinking in your API call, then handle content_block_delta events whose delta type is thinking_delta.

CLAUDE / TYPESCRIPT
import Anthropic from "@anthropic-ai/sdk";

const anthropic = new Anthropic();

const prompt = "Design a caching strategy for a read-heavy API.";

const stream = await anthropic.messages.stream({
  model: "claude-sonnet-4-6",
  max_tokens: 16000,
  thinking: { type: "enabled", budget_tokens: 10000 },
  messages: [{ role: "user", content: prompt }]
});

for await (const event of stream) {
  if (event.type === "content_block_delta") {
    if (event.delta.type === "thinking_delta") {
      process.stdout.write(event.delta.thinking);
    }
    if (event.delta.type === "text_delta") {
      process.stdout.write(event.delta.text);
    }
  }
}
thinking, streaming, api

CLAUDE
MAR 20, 2026

Claude Code Is Now Generally Available

WHAT CHANGED

Claude Code, Anthropic's agentic coding tool that runs in the terminal, exited beta and is now generally available. It can edit files, run tests, commit code, and navigate large codebases autonomously.

WHY IT MATTERS

The most capable agentic coding assistant is now stable and production-ready. Indie hackers can automate entire feature builds from a single prompt.

HOW TO USE IT

Install globally with npm, then run claude in any project directory. Works with any language and framework.

CLAUDE / BASH
# Install
npm install -g @anthropic-ai/claude-code

# Navigate to your project and start
cd your-project
claude

# Give it a task
# > Add rate limiting to the /api/auth/login endpoint using Redis
claude-code, agents, coding, terminal

CHATGPT
MAR 15, 2026

GPT-4o Memory Now On By Default for All Plus Users

WHAT CHANGED

OpenAI enabled persistent memory by default for all ChatGPT Plus and Pro users. The model automatically saves facts, preferences, and context from conversations and references them in future sessions.

WHY IT MATTERS

ChatGPT now behaves like a persistent assistant that remembers your stack, goals, and preferences across sessions without any prompt engineering.

HOW TO USE IT

No action needed — memory is automatic. View and edit stored memories in Settings → Personalization → Manage Memory.

memory, ux, plus

GEMINI
MAR 10, 2026

Gemini 2.5 Pro Released with 1M Token Context

WHAT CHANGED

Google released Gemini 2.5 Pro with a 1 million token context window, improved multimodal reasoning, and native code execution.

WHY IT MATTERS

1M token context means you can feed entire codebases, legal documents, or full books in a single prompt. Strong competitive alternative to Claude for long-context tasks.

HOW TO USE IT

Use the gemini-2.5-pro model string in the Google AI SDK. Pass long documents directly in the prompt — no chunking needed.

GEMINI / PYTHON
import google.generativeai as genai
import os

genai.configure(api_key=os.environ["GEMINI_API_KEY"])
model = genai.GenerativeModel("gemini-2.5-pro")

# Pass entire codebase — 1M tokens available
with open("entire_codebase.txt") as f:
    code = f.read()

response = model.generate_content(f"Summarize this codebase:\n{code}")
print(response.text)
release, context-window, multimodal

GEMINI
MAR 05, 2026

Gemini Grounding with Google Search Free Up to 1,500 Queries/Day

WHAT CHANGED

Google made the Search grounding feature free for Gemini API users up to 1,500 queries per day, letting the model cite real-time web sources.

WHY IT MATTERS

Build news trackers, research tools, or any app needing live web data without paying for a separate search API. Major free-tier advantage over competitors.

HOW TO USE IT

Add google_search_retrieval to your tools list in any Gemini API call. No additional billing setup needed under the free quota.

GEMINI / PYTHON
import google.generativeai as genai
import os

genai.configure(api_key=os.environ["GEMINI_API_KEY"])
model = genai.GenerativeModel("gemini-2.5-pro")

response = model.generate_content(
    "What happened in AI this week?",
    tools=[{"google_search_retrieval": {}}]
)
print(response.text)

# Access source citations
for chunk in response.candidates[0].grounding_metadata.grounding_chunks:
    print(f"Source: {chunk.web.uri}")
grounding, search, free-tier, api

CURSOR
FEB 28, 2026

Cursor Background Agent Pushes to Git Branches Autonomously

WHAT CHANGED

Cursor's Background Agent mode now supports running test suites, interpreting failures, self-correcting, and committing to feature branches without user interaction.

WHY IT MATTERS

You can assign a feature to Cursor, walk away, and come back to a tested and committed branch. First genuinely autonomous coding workflow for everyday developers.

HOW TO USE IT

Enable Background Agent in Cursor Settings → Features → Background Agent. Open the agent panel, describe the task, and select a target branch.

agents, autonomous, git, testing

ELEVENLABS
FEB 20, 2026

ElevenLabs Instant Voice Cloning from 10 Seconds of Audio

WHAT CHANGED

ElevenLabs reduced the minimum audio required for instant voice cloning from 1 minute down to 10 seconds while maintaining the same output quality.

WHY IT MATTERS

Creating a custom voice for a product is now nearly frictionless. A short voice memo is enough to clone a voice for any app or content workflow.

HOW TO USE IT

Use the /v1/voices/add endpoint with a 10-second mp3 or wav sample. The voice is ready to use within seconds of upload.

ELEVENLABS / PYTHON
import requests
import os

ELEVEN_API_KEY = os.environ["ELEVEN_API_KEY"]

url = "https://api.elevenlabs.io/v1/voices/add"
headers = {"xi-api-key": ELEVEN_API_KEY}
data = {"name": "MyVoice", "description": "Custom voice"}

# Use a context manager so the sample file is closed after upload
with open("sample.mp3", "rb") as f:
    response = requests.post(url, headers=headers, files={"files": f}, data=data)

response.raise_for_status()
voice_id = response.json()["voice_id"]
print(f"Voice cloned: {voice_id}")
voice-cloning, api, audio

OTHER
FEB 15, 2026

Vercel AI SDK 4.0 — Unified API Across All Major LLMs

WHAT CHANGED

Vercel released AI SDK 4.0 with a unified API that works identically across Claude, GPT-4o, Gemini, Mistral, and Llama. Includes streaming, tool use, and structured output.

WHY IT MATTERS

Write your AI integration once and swap models without changing code. The useChat and useCompletion hooks work across Next.js, SvelteKit, and plain React.

HOW TO USE IT

Install @ai-sdk/anthropic or other provider packages alongside the core ai package. The generateText and streamText functions work identically across all providers.

OTHER / TYPESCRIPT
import { generateText } from "ai";
import { anthropic } from "@ai-sdk/anthropic";

const { text } = await generateText({
  model: anthropic("claude-sonnet-4-6"),
  prompt: "Explain RAG in one paragraph"
});

console.log(text);

// Swap provider in one line — same API
// import { openai } from "@ai-sdk/openai";
// model: openai("gpt-4o")
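The same swap works for streaming: streamText exposes the response as an async iterable of text chunks via result.textStream. The consumption loop can be sketched offline — the generator below is a stand-in for textStream so the pattern is clear without a network call, not part of the SDK:

```typescript
// Offline sketch of the streaming consumption pattern. In real use the
// chunks would come from streamText({ model, prompt }).textStream; here
// a generator stands in so the loop runs without an API key.
async function* fakeTextStream(): AsyncGenerator<string> {
  yield "RAG retrieves relevant documents, ";
  yield "then conditions generation on them.";
}

async function collect(chunks: AsyncIterable<string>): Promise<string> {
  let full = "";
  for await (const chunk of chunks) {
    full += chunk;               // accumulate the complete answer
    process.stdout.write(chunk); // render each chunk as it arrives
  }
  return full;
}

const answer = await collect(fakeTextStream());
```

The same for-await loop works unchanged whichever provider backs the model.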
sdk, vercel, multi-model, typescript

CHATGPT
FEB 10, 2026

OpenAI Realtime API Supports Text and Audio in Same Session

WHAT CHANGED

OpenAI's Realtime API now supports mixing text and audio modalities in the same WebSocket session. Send text, receive audio, or switch modes mid-conversation.

WHY IT MATTERS

Build voice assistants with text fallback, or multimodal apps where users switch between typing and speaking without breaking the session.

HOW TO USE IT

Connect to the Realtime API WebSocket and specify both text and audio in the modalities array of your session config.

CHATGPT / JAVASCRIPT
// Node example using the "ws" package — the browser WebSocket
// constructor does not accept custom headers
import WebSocket from "ws";

const ws = new WebSocket(
  "wss://api.openai.com/v1/realtime?model=gpt-4o-realtime-preview",
  {
    headers: {
      "Authorization": `Bearer ${process.env.OPENAI_API_KEY}`,
      "OpenAI-Beta": "realtime=v1"
    }
  }
);

ws.on("open", () => {
  ws.send(JSON.stringify({
    type: "session.update",
    session: {
      modalities: ["text", "audio"],
      voice: "alloy"
    }
  }));
});

ws.on("message", (data) => {
  const event = JSON.parse(data.toString());
  if (event.type === "response.audio.delta") {
    // Handle audio chunk
  }
  if (event.type === "response.text.delta") {
    process.stdout.write(event.delta);
  }
});
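To send a text turn on the same socket, queue a conversation.item.create event for the user message, then a response.create event to ask the model to reply. A minimal builder for those two payloads (the helper name is ours, not part of the API; the event shapes follow the Realtime API spec):

```typescript
// Build the two JSON events that send a user text turn and request a
// model response on an open Realtime session. Sketch only — the helper
// is hypothetical, but the event shapes match conversation.item.create
// and response.create from the Realtime API.
function textTurnEvents(text: string): string[] {
  const item = {
    type: "conversation.item.create",
    item: {
      type: "message",
      role: "user",
      content: [{ type: "input_text", text }],
    },
  };
  // The model may answer in text, audio, or both, per session modalities
  const respond = { type: "response.create" };
  return [JSON.stringify(item), JSON.stringify(respond)];
}

const events = textTurnEvents("Summarize our last exchange.");
// In practice: events.forEach((e) => ws.send(e));
```

Because both payloads are plain session events, switching a user from voice to typing mid-conversation is just a matter of sending these instead of audio buffers.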
realtime, audio, websocket, voice