Google officially shut down gemini-3-pro-preview on March 9, 2026. If your application still references that model string, every API call is now returning errors. The designated replacement is gemini-3.1-pro-preview.
This guide covers the full migration path: model string changes, SDK updates, parameter differences, pricing implications, and the gotchas that will break your code if you miss them.
What Changed and Why
Google follows a predictable deprecation cycle for Gemini preview models. Each generation gets a short runway before the next version replaces it. Gemini 3 Pro Preview launched in late 2025, and Gemini 3.1 Pro Preview followed in early 2026 with significant improvements to reasoning, coding benchmarks, and tool use.
The migration deadline was March 9, 2026. After that date, gemini-3-pro-preview stopped accepting requests. Google now auto-routes that model string to gemini-3.1-pro-preview, but relying on implicit redirects in production is a bad idea — explicit is always better.
Step 1: Update the Model String
The minimum viable migration is a single string change. Every place in your codebase that references gemini-3-pro-preview needs to become gemini-3.1-pro-preview.
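Since the next deprecation will force the same exercise, it can help to resolve the model id in one place instead of scattering string literals. A minimal sketch; the `GEMINI_MODEL_ID` environment variable name is an arbitrary choice for this example, not a Google convention:

```python
import os

# Default to the current replacement model; override via env var at deploy
# time so the next migration is a config change rather than a code change.
DEFAULT_MODEL = "gemini-3.1-pro-preview"

def model_id(env=None) -> str:
    """Resolve the Gemini model string from the environment (or a test dict)."""
    env = os.environ if env is None else env
    return env.get("GEMINI_MODEL_ID", DEFAULT_MODEL)
```

Every call site then reads `model=model_id()` and future model swaps touch a single constant.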
Python (New Google GenAI SDK)
Before:
```python
from google import genai

client = genai.Client()

response = client.models.generate_content(
    model="gemini-3-pro-preview",  # DEPRECATED
    contents="Explain transformer attention in plain English.",
)
print(response.text)
```
After:
```python
from google import genai

client = genai.Client()

response = client.models.generate_content(
    model="gemini-3.1-pro-preview",  # Updated
    contents="Explain transformer attention in plain English.",
)
print(response.text)
```
JavaScript (Google GenAI SDK)
Before:
```javascript
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({});

const response = await ai.models.generateContent({
  model: "gemini-3-pro-preview", // DEPRECATED
  contents: "Explain transformer attention in plain English.",
});
console.log(response.text);
```
After:
```javascript
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({});

const response = await ai.models.generateContent({
  model: "gemini-3.1-pro-preview", // Updated
  contents: "Explain transformer attention in plain English.",
});
console.log(response.text);
```
If you are still using the old google-generativeai Python package or @google/generative-ai npm package, those SDKs reached end-of-life on November 30, 2025. You should migrate to google-genai (Python) or @google/genai (JavaScript) as part of this update.
Step 2: Migrate Thinking Parameters
This is the breaking change that catches most developers. Gemini 3 introduced the thinking_level parameter, replacing the numeric thinking_budget used in Gemini 2.5 models. If your code sets thinking_budget, it still works for backward compatibility — but sending both thinking_budget and thinking_level in the same request throws an error.
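A defensive guard can strip the legacy field before a request is built. This illustrative helper operates on a plain dict of thinking-config fields; it is not part of the SDK:

```python
def sanitize_thinking_config(cfg: dict) -> dict:
    """Return a copy safe to send to Gemini 3.x: if both the legacy
    thinking_budget and the new thinking_level are present, keep only
    thinking_level, since sending both causes an API error."""
    cfg = dict(cfg)
    if "thinking_level" in cfg and "thinking_budget" in cfg:
        cfg.pop("thinking_budget")
    return cfg
```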
The Parameter Shift
| Parameter | Gemini 2.5 Models | Gemini 3.x Models |
|---|---|---|
| thinking_budget | Integer (0–24,576 tokens) | Supported (backward compat) |
| thinking_level | Not available | "low", "medium", "high" |
Migration Mapping
If you were using thinking_budget, map your values to the new categorical levels:
- Budget 1–1,024 → `thinking_level: "low"` (light reasoning, fastest responses)
- Budget 1,024–8,192 → `thinking_level: "medium"` (balanced reasoning)
- Budget 8,192+ → `thinking_level: "high"` (deep reasoning, slowest but most thorough)
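The mapping above can be expressed as a small helper. Note that the 1,024 and 8,192 boundaries overlap between adjacent ranges in the list, so this sketch assigns each boundary value to the lower level:

```python
def thinking_level_for_budget(budget: int) -> str:
    """Map a legacy thinking_budget value to a Gemini 3.x thinking_level."""
    if budget <= 1024:
        return "low"
    if budget <= 8192:
        return "medium"
    return "high"
```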
Python Example
Before (Gemini 2.5 style):
```python
from google import genai
from google.genai import types

client = genai.Client()

response = client.models.generate_content(
    model="gemini-3-pro-preview",
    contents="Solve this optimization problem step by step.",
    config=types.GenerateContentConfig(
        thinking_config=types.ThinkingConfig(
            thinking_budget=8192  # Old numeric approach
        )
    ),
)
```
After (Gemini 3.1 style):
```python
from google import genai
from google.genai import types

client = genai.Client()

response = client.models.generate_content(
    model="gemini-3.1-pro-preview",
    contents="Solve this optimization problem step by step.",
    config=types.GenerateContentConfig(
        thinking_config=types.ThinkingConfig(
            thinking_level="medium"  # New categorical approach
        )
    ),
)
```
Step 3: Handle Vertex AI Location Changes
If you access Gemini through Vertex AI rather than the Gemini Developer API, there is an important infrastructure change. Gemini 3.1 Pro Preview on Vertex AI requires the global endpoint. Regional endpoints may not have this model available.
```python
# Vertex AI users: set location to global
from google import genai

client = genai.Client(
    vertexai=True,
    project="your-project-id",
    location="global",  # Required for Gemini 3.1 Pro
)
```
If your existing setup hardcodes a regional location like us-central1, update it to global or verify that your region supports the model.
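One way to handle this without scattering conditionals is a small resolver that falls back to the global endpoint unless a region is explicitly known to serve the model. The supported-regions set is yours to maintain; this is a sketch of application-side logic, not SDK behavior:

```python
def vertex_location(configured, regions_with_model=frozenset()) -> str:
    """Prefer a configured region only if it is known to serve the model;
    otherwise fall back to the global endpoint."""
    if configured and configured in regions_with_model:
        return configured
    return "global"
```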
Step 4: Update Structured Output and Tool Calls
The API interface for structured output and function calling is backward-compatible between Gemini 3 Pro and 3.1 Pro. No schema changes are needed. However, the new model may produce slightly different structured outputs for the same prompts — particularly in edge cases around field ordering and optional field inclusion.
Gemini 3.1 Pro also introduces a separate endpoint for applications that rely heavily on custom tools: `gemini-3.1-pro-preview-customtools`.
This variant prioritizes your registered tools (like view_file, search_code, execute_command) over the model's built-in capabilities. If your application is an agent that chains multiple tool calls, test this variant — it may produce more reliable tool-use sequences.
Gemini 3 Pro vs 3.1 Pro: Capability Comparison
| Feature | Gemini 3 Pro Preview | Gemini 3.1 Pro Preview |
|---|---|---|
| Model String | gemini-3-pro-preview | gemini-3.1-pro-preview |
| Status | Discontinued (March 9, 2026) | Active |
| Context Window | 1M tokens | 1M tokens |
| Max Output Tokens | ~32K tokens | ~65K tokens |
| Input Price (≤200K) | $2.00 / 1M tokens | $2.00 / 1M tokens |
| Output Price (≤200K) | $12.00 / 1M tokens | $12.00 / 1M tokens |
| Input Price (>200K) | $4.00 / 1M tokens | $4.00 / 1M tokens |
| Output Price (>200K) | $18.00 / 1M tokens | $18.00 / 1M tokens |
| Thinking Parameter | thinking_level | thinking_level |
| Google Maps Grounding | Not supported | Supported |
| Custom Tools Endpoint | Not available | Available |
| File Upload Limit | 50MB | 100MB |
| YouTube URL Analysis | Limited | Full support |
| Reasoning Benchmarks | Baseline | ~3x improvement |
| Coding Benchmarks | Strong | Competitive with Claude Opus |
The pricing is identical. Every change from 3 Pro to 3.1 Pro is a capability upgrade at the same cost, which makes the migration straightforward from a budget perspective.
Migration Checklist
Use this checklist to verify your migration is complete:
- [ ] Search codebase for all instances of `gemini-3-pro-preview` and replace with `gemini-3.1-pro-preview`
- [ ] Update SDK versions — Python: `pip install -U google-genai` (requires v1.51.0+); JavaScript: `npm install @google/genai@latest`
- [ ] Migrate from deprecated SDKs if still using `google-generativeai` (Python) or `@google/generative-ai` (JS)
- [ ] Replace `thinking_budget` with `thinking_level` (low/medium/high) — or at minimum, ensure you are not sending both parameters
- [ ] Update Vertex AI location to `global` if applicable
- [ ] Test structured output — verify JSON schemas still parse correctly with the new model
- [ ] Test function calling — verify tool call sequences produce expected results
- [ ] Test custom tools endpoint (`gemini-3.1-pro-preview-customtools`) if your app uses registered tools heavily
- [ ] Verify max output length — the new 65K token limit may affect truncation logic if you had guards based on the old 32K limit
- [ ] Run regression tests in staging before deploying to production
- [ ] Add retry logic with exponential backoff for 503/429 responses during migration period
- [ ] Configure monitoring for error rates and latency changes post-migration
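The retry item in the checklist above can be sketched as a generic wrapper. The `ApiError` class here is a stand-in for whatever exception your SDK raises with a status code; adapt the except clause to your client library:

```python
import random
import time

class ApiError(Exception):
    """Stand-in for an SDK error carrying an HTTP status code."""
    def __init__(self, status: int):
        super().__init__(f"API error {status}")
        self.status = status

def call_with_backoff(fn, max_retries=5, base_delay=1.0,
                      retryable=(429, 503), sleep=time.sleep):
    """Call fn(), retrying retryable statuses with exponential backoff
    plus jitter; re-raise other errors or after max_retries attempts."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except ApiError as err:
            if err.status not in retryable or attempt == max_retries:
                raise
            sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.25))
```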
Breaking Changes and Gotchas
1. Dual Thinking Parameters Cause Errors
If your code sends both thinking_budget and thinking_level in the same request, the API returns an error. This is the most common migration failure. Search your codebase for both parameter names and ensure only one is present.
2. Output Length Doubling
Gemini 3.1 Pro supports up to 65,000 output tokens compared to the ~32,000 limit on Gemini 3 Pro. If your application parses responses with a fixed buffer size or has downstream systems that expect bounded output, this can cause unexpected behavior. Set max_output_tokens explicitly if you need to constrain output length.
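If downstream systems were sized for the old limit, keep constraining output explicitly. A minimal guard for choosing the `max_output_tokens` value to send, using the approximate limits above:

```python
GEMINI_31_MAX_OUTPUT = 65_000   # new model ceiling (approximate)
LEGACY_DOWNSTREAM_CAP = 32_000  # what 3 Pro-era parsers were built for

def effective_max_output_tokens(requested=None,
                                downstream_cap=LEGACY_DOWNSTREAM_CAP) -> int:
    """Pick a max_output_tokens value that never exceeds what downstream
    consumers can handle, regardless of the model's larger ceiling."""
    cap = min(downstream_cap, GEMINI_31_MAX_OUTPUT)
    return cap if requested is None else min(requested, cap)
```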
3. SDK Version Requirements
Gemini 3.x API features require google-genai version 1.51.0 or later for Python. Older SDK versions may not recognize the model string or the thinking_level parameter. Run pip show google-genai to check your current version.
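You can also assert the minimum version at startup rather than discovering it through a confusing runtime error. A sketch using the standard library's `importlib.metadata`; the simple parser ignores any prerelease suffix:

```python
from importlib import metadata

def version_at_least(installed: str, minimum) -> bool:
    """Compare a dotted version string against a minimum version tuple,
    stopping at the first non-numeric (prerelease) component."""
    parts = []
    for piece in installed.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts) >= tuple(minimum)

def check_genai_sdk(minimum=(1, 51, 0)) -> bool:
    """True if google-genai is installed at or above the required version."""
    try:
        return version_at_least(metadata.version("google-genai"), minimum)
    except metadata.PackageNotFoundError:
        return False
```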
4. Vertex AI Regional Availability
Not all Vertex AI regions support Gemini 3.1 Pro Preview. If you get a model-not-found error, switch to the global endpoint. This is a common issue for teams with region-locked infrastructure policies.
5. Cached Context Pricing
Context caching costs $0.20 per million tokens for contexts under 200K, which is unchanged from Gemini 3 Pro. However, verify that your cached contexts are compatible with the new model — cached prompts created for gemini-3-pro-preview may need to be recreated for gemini-3.1-pro-preview.
Alternative: Skip Provider-Specific Versioning Entirely
Every time Google deprecates a Gemini model version, you run the same migration playbook: update strings, test regressions, handle parameter changes, deal with SDK updates. This cycle repeats every few months.
Instead of migrating from one Google model version to another, ModelsLab's unified API abstracts away provider-specific versioning. You call a single OpenAI-compatible endpoint, specify the model you want, and the API handles the rest. No SDK migrations, no parameter format changes, no regional endpoint juggling.
```python
from openai import OpenAI

# Placeholder endpoint and key: substitute your ModelsLab values.
client = OpenAI(
    base_url="https://<modelslab-endpoint>/v1",
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="gemini-3.1-pro",
    messages=[{"role": "user", "content": "Explain transformer attention."}],
)
print(response.choices[0].message.content)
```
The same client code works across Gemini, Llama, Mistral, DeepSeek, and other models — change the model string and the routing happens automatically. When Google ships Gemini 4 and deprecates 3.1, you update one string instead of refactoring your SDK integration.
ModelsLab supports 100,000+ AI models across text, image, video, and audio generation through a single API platform. Explore the full catalog at modelslab.com/models.
FAQ
How long will Gemini 3.1 Pro Preview be available?
Google has not announced a specific deprecation date for Gemini 3.1 Pro Preview yet. Based on previous patterns, preview models typically get 3–6 months before the next version replaces them. Plan for a stable release or a 3.2/4.0 migration sometime in late 2026.
Can I use the old google-generativeai Python package with Gemini 3.1 Pro?
The old google-generativeai package reached end-of-life on November 30, 2025. While it may still function for basic requests, it does not support Gemini 3.x features like thinking_level, the custom tools endpoint, or 100MB file uploads. Migrate to google-genai (pip install google-genai) for full support.
Is the pricing different between Gemini 3 Pro and 3.1 Pro?
No. The pricing is identical: $2.00 per million input tokens and $12.00 per million output tokens for standard contexts (under 200K tokens). Long-context pricing is also unchanged at $4.00/$18.00 per million tokens. The migration is a pure capability upgrade with no cost increase.
What if I need to support both model versions during a transition period?
You can run both models in parallel by using different model strings in your API calls. However, since Gemini 3 Pro Preview is already shut down, there is no transition period — you must use 3.1 Pro Preview. If you need multi-model resilience, consider routing through a provider-agnostic API like ModelsLab that lets you switch between models and providers without code changes.
Does Gemini 3.1 Pro Preview produce different outputs than 3 Pro for the same prompts?
Yes. While the API is backward-compatible, the model itself has improved reasoning and coding capabilities. This means responses may differ in quality, structure, and length. For semantic tasks (summaries, explanations, creative writing), this is generally an improvement. For applications that rely on deterministic or regex-matched outputs, run your test suite against the new model before deploying.
