Agent-first runtime: GPT-5.5 as plumbing for OpenAI's super app
The most substantive shift in GPT-5.5 is not a benchmark number but OpenAI's explicit reframing of the model from chat assistant to agent runtime. OpenAI and its evaluators describe GPT-5.5 as 'a new class of intelligence for real work' built around agentic coding, computer use, knowledge work, and scientific research. The headline evals (Terminal-Bench 2.0, SWE-Bench Pro, OSWorld-Verified, GDPval) all emphasize multi-step tool use rather than single-turn reasoning.
That framing aligns with the strategy Greg Brockman and Sam Altman are publicly telegraphing: GPT-5.5 is plumbing for a 'super app' spanning ChatGPT, Codex, and an AI browser. Brockman's own framing, 'This model is a real step forward towards the kind of computing that we expect in the future — but it is one step, and we expect to see many in the future', reads less as a product pitch than as a roadmap disclosure. The 400K-token context window inside Codex and the 1M-token API window are not chat features; they are agent-session budgets, sized for long runs of tool calls and intermediate results rather than for conversation. Viewed through that lens, the 'incremental' critique that dominated Reddit misses the architectural point: OpenAI is not trying to ship a smarter chatbot; it is trying to ship the runtime on which its future products will run.


