Appshots is a primitive, not a screenshot
The headline-friendly framing of Appshots (press Command-Command, send your window to Codex) undersells what actually shipped. The capture combines macOS ScreenCaptureKit for the visual with the Accessibility APIs for structured text extraction, so Codex receives both the pixels of the frontmost window and the underlying accessible text tree, including content scrolled off-screen [1][2]. That detail matters: a Figma board, a long Slack thread, or a Jira ticket below the fold all arrive in the thread without the user having to scroll and stitch.
Two more constraints reveal intent. An Appshot opens a new thread by default, but if the user touched Codex within the last 60 seconds, the capture is appended to that recent thread instead, which supports consecutive captures during an iteration loop without polluting history [1]. And the feature is Mac-only and explicitly unavailable from the CLI, and it requires Screen Recording and Accessibility permissions [1].
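The thread-routing rule is simple enough to write down. What follows is a minimal sketch of that rule only, not OpenAI's implementation: the 60-second window comes from the description above, while the `Appshot` and `Thread` types, the `route_appshot` function, and the choice to treat exactly 60 seconds as still "recent" are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# The 60-second window is taken from the text; every name below is hypothetical.
RECENT_WINDOW_SECONDS = 60.0


@dataclass
class Appshot:
    """Hypothetical payload: window pixels plus the accessible text tree."""
    image_png: bytes       # visual capture of the frontmost window
    accessible_text: str   # structured text, including off-screen content
    captured_at: float     # timestamp in seconds


@dataclass
class Thread:
    thread_id: str
    last_touched_at: float  # timestamp in seconds, same clock as Appshot


def route_appshot(shot: Appshot, recent: Optional[Thread]) -> Optional[str]:
    """Return the id of the thread to append to, or None to open a new thread."""
    if recent is not None:
        elapsed = shot.captured_at - recent.last_touched_at
        # Assumption: the boundary is inclusive, so exactly 60 s still counts.
        if 0 <= elapsed <= RECENT_WINDOW_SECONDS:
            return recent.thread_id
    return None
```

Under these assumptions, a capture taken 30 seconds after the last interaction joins the existing thread, while one taken 90 seconds later starts a fresh thread.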
Curtis Pyke at Kingy AI reads this as a deliberate human-AI handoff pattern: 'not just screenshot to chat,' but a low-friction primitive for visual context that the user fires on demand rather than streams ambiently [2]. The contrast with always-on recall products is the point: OpenAI is betting developers want a discrete capture key, not surveillance.



