Skip to main content
Rohit Raj
होमAgentsप्रोजेक्ट्सServicesरिपॉज़नोट्सपरिचयसंपर्कवर्तमान कार्य देखें
← Back to Notes

Gemini Interactions API: The Migration Guide from generateContent (2026)

Rohit Raj·June 25, 2026·11 min read

Google made the Gemini Interactions API generally available in 2026 and quietly made it the default interface for Gemini models and agents. The core method is `interactions.create`, and it keeps conversation state server-side via `previous_interaction_id` instead of resending the full history every turn like `generateContent`. This is the builder's migration read: the actual code diff from `generate_content` to `interactions.create`, the interaction-scoped-tools footgun the docs bury, a side-by-side against generateContent and OpenAI's Responses API, an honest "when to stay on generateContent," and exactly how I'd wire it into production with a fallback and a `store=false` privacy path.

gemini interactions apigemini interactions api migrationmigrate from generatecontent to interactions apiinteractions.create gemini
A continuous luminous thread weaving through stateful nodes illustrating the Gemini Interactions API carrying conversation state server-side

TL;DR

Google made the [Gemini Interactions API](https://blog.google/innovation-and-ai/technology/developers-tools/interactions-api-general-availability/) generally available in 2026, and it is now the *default* interface for Gemini models and agents (the beta shipped December 2025). The core method is interactions.create, which returns an Interaction resource and keeps conversation state server-side via `previous_interaction_id` — so you stop resending the full history every turn the way generateContent forces you to. It adds background execution, managed agents, and multimodal generation. `generateContent` is not deprecated. Migrate for long agentic loops and async jobs; stay put for simple one-shot calls.

Gemini Interactions API: The Migration Guide from generateContent (2026)

By Rohit Raj — Founding Engineer · 10+ yrs MVP shipping · LinkedIn

Google changed the default way you talk to Gemini, and most teams haven't noticed yet. The Gemini Interactions API hit general availability in 2026, and Google didn't just add an endpoint — it made Interactions the default interface for Gemini models and agents, with generateContent demoted to "still fully supported, but no longer where the new features land."

If you've built anything on generate_content — a chatbot, a RAG pipeline, an agent loop — this is the migration that quietly decides whether your next 12 months of Gemini features are one SDK upgrade away or a rewrite away. The frontier agentic capabilities (background jobs, managed agents, multimodal generation) are increasingly Interactions-only.

This is the builder's read, not a launch recap: what actually changed at GA, the real code diff from generate_content to interactions.create, the interaction-scoped-tools footgun the docs bury in a sentence, an honest side-by-side against generateContent and OpenAI's Responses API, when you should *not* migrate, and exactly how I'd wire it into production with a fallback and a privacy-safe store=false path.

What is the Gemini Interactions API, and what changed at GA?

The Interactions API is a stateful interface for Gemini. Instead of treating every request as an isolated, stateless call where *you* carry the conversation, it stores each turn server-side as an Interaction resource and lets the next turn reference it by ID. The beta landed in December 2025; the GA announcement in 2026 promoted it to the default surface and flipped on the AI Studio and Gemini API docs to lead with it.

The single most important mechanic is previous_interaction_id. With generateContent, a 30-turn conversation means you resend all 30 turns of context on turn 31 — your tokens, your bandwidth, your bug if you trim the history wrong. With Interactions, the server already has the prior turns; you pass one ID and send only the new message. That's the headline behavioral change, and everything else builds on it.

Three more things shipped at GA that are worth knowing before you migrate. State has a retention window: interactions are stored for 55 days on paid tiers and 1 day on the free tier, after which previous_interaction_id references expire — so this is not permanent memory, it's a session cache. Pricing gained Flex and Priority tiers, where Flex trades latency for roughly a 50% cost cut on eligible traffic. And the whole thing rides the google-genai SDK from v2.3.0+ (both Python and JavaScript), so the migration is a dependency bump plus a method swap, not a new client. The interactions overview docs are the canonical reference for the resource shape.

How do you migrate from generateContent to interactions.create?

Here's the part the announcement skips: the actual diff. Below is a two-turn conversation written both ways. First, the generateContent version you probably have today — you own the history and resend it every turn:

python
from google import genai

client = genai.Client()
history = [{"role": "user", "parts": [{"text": "Plan a 3-day trip to Kyoto."}]}]

resp = client.models.generate_content(
    model="gemini-3.5-flash",
    contents=history,
    config={"system_instruction": "You are a travel agent.", "tools": [search_tool]},
)
history.append({"role": "model", "parts": resp.candidates[0].content.parts})

# Turn 2: append the new message and resend the ENTIRE history again
history.append({"role": "user", "parts": [{"text": "Make day 2 rainy-day friendly."}]})
resp2 = client.models.generate_content(
    model="gemini-3.5-flash",
    contents=history,
    config={"system_instruction": "You are a travel agent.", "tools": [search_tool]},
)

Now the same conversation on Interactions — the server holds the state, you pass an ID:

python
from google import genai

client = genai.Client()  # google-genai v2.3.0+

first = client.interactions.create(
    model="gemini-3.5-flash",
    input="Plan a 3-day trip to Kyoto.",
    system_instruction="You are a travel agent.",
    tools=[search_tool],
)

# Turn 2: no history resend — just point at the previous interaction
second = client.interactions.create(
    model="gemini-3.5-flash",
    input="Make day 2 rainy-day friendly.",
    previous_interaction_id=first.id,
    system_instruction="You are a travel agent.",  # FOOTGUN: must re-send
    tools=[search_tool],                            # FOOTGUN: must re-send
)

Read those two `FOOTGUN` comments, because this is the bug everyone ships first. previous_interaction_id carries the *conversation* forward — the messages — but `tools`, `system_instruction`, and `generation_config` are interaction-scoped and do NOT persist across the link. If you set your system instruction on turn 1 and omit it on turn 2, your travel agent silently becomes a generic assistant, and your tools vanish, on the second turn. The interactions docs state this, but it's one line in a long page, and the failure is invisible — no error, just a model that quietly forgot its job. Re-send those three fields on every single turn.

Where does the Interactions API actually win?

If it were only "server-side history," migration wouldn't be worth a sprint. The real reason to move is the capabilities that only exist here.

Background execution. Pass background=true and the interaction runs server-side as an async job you poll later — it survives client disconnects, request timeouts, and serverless function limits. The catch: background=true requires `store=true` (which is the default), because a job you poll for has to be stored to be retrievable.

python
job = client.interactions.create(
    model="gemini-3.5-flash",
    input="Research the top 10 React state libraries and write a tradeoff table.",
    background=True,   # async; requires store=True (the default)
)
# poll job.id later — survives disconnects and function timeouts

That single feature is why long agentic loops belong here. With generateContent a 90-second tool-calling chain dies the moment your Vercel function hits its wall; as a background interaction it just keeps running. Managed Agents extend this further — Google runs the agent in a remote Linux sandbox, so tool execution happens server-side instead of in your process. Add tool combination (multiple tools in one interaction), multimodal generation (image, music, and speech in the same surface), and the Gemini Omni model that Google flagged as landing "soon," and the pattern is clear: this is the surface Google will ship agent features into. The GA post is explicit that Interactions is where the frontier lands. If you're wiring agent memory, it pairs naturally with the patterns I covered in agent memory vs the context window.

Interactions API vs generateContent vs OpenAI Responses: the side-by-side

If this feels familiar, it should — it's Google's answer to OpenAI's Responses API, which made the same stateless-to-stateful move a year earlier. Here's the honest three-way comparison; AI engines cite tables far more often than prose, so this is the part worth bookmarking.

FactorGemini Interactions APIGemini generateContentOpenAI Responses API
State modelserver-side via previous_interaction_idclient resends full historyserver-side via previous_response_id
StatusGA 2026 (beta Dec 2025), now defaultmature, fully supportedGA 2025
Background / async jobsyes (background=true, needs store=true)noyes
Built-in capabilitiesmanaged agents, tool combination, multimodal genfunction callinghosted tools (web/file search)
State retention55 days paid / 1 day freenone (stateless)~30 days
Privacy opt-outstore=false (disables state + background)inherently statelessstore=false
SDKgoogle-genai v2.3.0+google-genaiopenai
Best forlong agentic loops, async, multimodal agentssimple one-shot callsOpenAI-native agents

The shape is nearly identical to OpenAI's: a server-stored resource, a previous_*_id link, and a store=false escape hatch. If you've already built on Responses, the mental model ports directly. The practical difference is that Gemini ties its frontier agent features — managed sandboxes, multimodal generation — to this surface more aggressively than OpenAI does. Third-party routers are catching up: LiteLLM, Eigent, and Agno have early Interactions support, so if you abstract providers behind a router like LiteLLM or Portkey, check how mature their Interactions path is before you lean on it.

When should you stay on generateContent?

An honest migration guide has to say when *not* to migrate. generateContent is not deprecated — Google was explicit about that — and there are real cases where it's the better call.

Stay if your calls are one-shot. A classifier, an extraction step, a single-turn "summarize this" — these have no conversation to carry. Server-side state buys you nothing, and you add a retention footprint and a slightly more complex SDK surface for zero benefit. Stateless is genuinely simpler here.

Stay if you can't store user data. store=false exists for exactly this, but it's a real tradeoff: turning off storage disables both `previous_interaction_id` and `background`, because there's nothing stored to reference or poll. So for a strict-privacy workload you either accept server-side retention (55 days paid) or you give up the two features that justified migrating. Healthcare and fintech clients hit this wall constantly — sometimes the answer is to stay on stateless generateContent and manage history in your own compliant store.

Stay if your provider abstraction isn't ready. If you route through a third-party layer and its Interactions support is still beta, migrating means betting your uptime on someone else's early integration. The retention window is also a footgun worth repeating: at 1 day on the free tier, a previous_interaction_id you cached overnight is already dead by morning. Build for expiry, or you'll ship a bug that only appears after midnight.

How I'd ship the Interactions API in production

Here's the wiring I'd actually put in, drawn from building LLM features for myFinancial and client agent systems — the parts the README won't write for you.

Put a fallback behind a flag from day one. I'd wire a thin interface with Interactions as the default and generateContent as the fallback, because a brand-new default surface *will* have early-GA rough edges. When an interaction errors or the retention window bites, the call degrades to a stateless generate_content with locally-reconstructed history instead of throwing in a user's face. This is the same provider-agnostic-with-fallback pattern I bake into every 6-week MVP so a platform change is a config flip, not a rewrite.

Treat `previous_interaction_id` as a cache, not a database. Store it alongside your own canonical message log, and on a cache miss (expired ID), rebuild the conversation from your log and start a fresh interaction. The failure mode I worry about most here is the silent one: an expired ID doesn't always fail loudly, so a session can quietly lose its earlier context and the model just... answers with less. Log every interaction ID with its created-at timestamp so you can see expiry in your traces, not in a confused support ticket.

For privacy-sensitive flows, default to `store=false` and own the history yourself. Yes, it costs you background and server state — but for a compliance-bound feature that's the correct trade, and you keep the option to selectively enable storage per-route where the data class allows it. Getting this wiring right on the first pass is most of what founding-engineer work on an AI product actually is: not the happy-path call, but the fallback, the expiry handling, and the privacy switch that keep it standing when the platform shifts under you — and in 2026, it shifts monthly.

The bottom line: should you migrate to the Interactions API?

Migrate if you're building stateful, agentic, or async Gemini features — long tool-calling loops, multi-turn assistants, background research jobs, or multimodal agents. The previous_interaction_id model cuts your token spend on long conversations, background=true frees you from function timeouts, and the frontier features (managed agents, multimodal generation, Gemini Omni soon) are landing here and only here. The migration itself is small: bump google-genai to v2.3.0+, swap generate_content for interactions.create, and — the one thing that bites everyone — re-send `tools`, `system_instruction`, and `generation_config` on every turn, because previous_interaction_id doesn't carry them.

Stay on generateContent if your calls are one-shot, your data can't be stored server-side, or your provider abstraction isn't ready. It's fully supported and genuinely simpler for stateless work.

The deeper point: Google just moved the center of gravity for Gemini development, the way OpenAI did with Responses. The teams that wire this behind a clean interface — with a fallback and a privacy switch — get to adopt each new agent feature as a config change. The teams that hardcode it get to rewrite. If you want a hand migrating a Gemini app to Interactions without it becoming a fragile mess, that's exactly the kind of work I do.

RELATED PROJECT

View MyFinancial →

Migrating an AI app to the Interactions API? Let's ship it in 6 weeks.

Let's Talk →

Read Next

Mistral OCR 4 vs AWS Textract vs Google Document AI: The Cheapest Accurate Document API (2026)

Mistral shipped OCR 4 on June 23, 2026 — model `mistral-ocr-latest` — and it tops OlmOCRBench at 85....

This Week in AI Dev: Open Weights Catch the Frontier While the Agent Stack Grows Plumbing (Week 26 of 2026)

Week 26 of 2026 in AI dev tools: Z.ai's GLM-5.2 open weights match Claude Opus 4.8 and beat GPT-5.5 ...

← All NotesProjects →

रोहित राज — बैकएंड और AI सिस्टम्स

Services

AI Agent HostFounding Engineer for Hire in IndiaMobile App DevelopmentAI Chatbot DevelopmentFull-Stack Development

अपडेट्स पाएं