Introducing GPT-5.5

OpenAI model release focused on agentic coding, computer use, and long-horizon knowledge work.

Key Claim (excerpt)

GPT-5.5 excels at writing and debugging code, researching online, analyzing data, creating documents and spreadsheets, operating software, and moving across tools until a task is finished. The gains are especially clear in agentic coding, computer use, knowledge work, and early scientific research—areas where progress depends on reasoning across context and taking action over time.

Overview

GPT‑5.5 is positioned as OpenAI’s next step toward “a new way of getting work done on a computer”: more autonomous planning and tool use on messy, multi‑part tasks, with serving latency kept similar to GPT‑5.4.

Highlights (from the post)

Stronger at planning + tool use + self-checking over longer tasks
Improved efficiency: often fewer tokens and fewer retries on the same Codex tasks
Safety: described as OpenAI’s strongest safeguards to date, with expanded testing (cybersecurity/biology) and feedback from ~200 early-access partners
Rollout: Plus/Pro/Business/Enterprise users in ChatGPT and Codex; API “very soon” (per post)

Benchmarks (selected, from the post)

Terminal‑Bench 2.0: 82.7% (vs 75.1% for GPT‑5.4)
OSWorld‑Verified: 78.7% (vs 75.0% for GPT‑5.4)
BrowseComp: 84.4% (GPT‑5.5), 90.1% (GPT‑5.5 Pro)
FrontierMath Tier 1–3: 51.7% (vs 47.6% for GPT‑5.4)
CyberGym: 81.8% (vs 79.0% for GPT‑5.4)

(See post for full table including Claude Opus 4.7 and Gemini 3.1 Pro comparisons.)
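The head-to-head figures above are easier to compare as percentage-point gains. The short Python sketch below computes them from the four benchmarks where this note lists a GPT‑5.4 score. The `SCORES` table and the `point_gains` helper are illustrative names introduced here, not anything from the post, and BrowseComp is left out because no GPT‑5.4 figure is listed for it.

```python
# Scores copied from the benchmark list in this note: (GPT-5.5 %, GPT-5.4 %).
# BrowseComp is omitted because the note gives no GPT-5.4 figure for it.
SCORES = {
    "Terminal-Bench 2.0": (82.7, 75.1),
    "OSWorld-Verified": (78.7, 75.0),
    "FrontierMath Tier 1-3": (51.7, 47.6),
    "CyberGym": (81.8, 79.0),
}


def point_gains(scores):
    """Return the percentage-point gain per benchmark, rounded to one decimal."""
    return {name: round(new - old, 1) for name, (new, old) in scores.items()}


if __name__ == "__main__":
    # Print benchmarks from largest to smallest gain.
    gains = point_gains(SCORES)
    for name, gain in sorted(gains.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{name}: +{gain} points")
```

On these numbers the largest listed gain is on Terminal‑Bench 2.0 (+7.6 points) and the smallest on CyberGym (+2.8 points). These are absolute differences in reported scores, not relative improvements.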
