The TEE Wedge: Hardware-Enforced Privacy, Not a Retention Policy
The story most outlets are telling about Incognito Chat — 'WhatsApp adds an incognito mode to Meta AI' [3] — undersells what is structurally different here. OpenAI's temporary chats, Google's transient Gemini sessions, and Anthropic's privacy controls are all policy promises wrapped around servers that can physically read your prompt. Meta's pitch is that the servers cannot.
The Private Processing stack uses AMD SEV-SNP confidential virtual machines and NVIDIA H100 GPUs in confidential computing mode, fronted by Oblivious HTTP relays that mask user IPs, with remote attestation so the user's device can verify it is connecting to an unmodified environment before sending anything [8]. Inference happens inside this Trusted Execution Environment, which Meta says is inaccessible even to Meta itself [4]. The Decoder's framing — 'a protected server environment that even Meta itself can't access' [6] — is the part that actually matters: it is a property of the silicon and the attestation chain, not of a privacy policy.
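To make the attestation step concrete, here is a minimal client-side sketch of the verify-before-send pattern the stack relies on. This is not Meta's code or the SEV-SNP wire format: the measurement value, the `vendor_sign` stand-in for the hardware root of trust, and all names are hypothetical, and real attestation uses vendor-signed reports with certificate chains rather than a shared hash. The point is only the control flow: the device checks that the server's measured image matches a known-good value before any prompt leaves the device.

```python
import hashlib
from dataclasses import dataclass

@dataclass
class AttestationReport:
    measurement: str  # hash of the booted TEE image (simulated)
    signature: str    # signature over the measurement (simulated)

# Known-good measurement the client is pinned to (hypothetical value).
EXPECTED_MEASUREMENT = hashlib.sha256(b"approved-tee-image-v1").hexdigest()

def vendor_sign(measurement: str) -> str:
    # Stand-in for the hardware vendor's signing key; real systems
    # verify an X.509 chain rooted in the CPU/GPU vendor, not a hash.
    return hashlib.sha256(("vendor-key:" + measurement).encode()).hexdigest()

def verify(report: AttestationReport) -> bool:
    # 1. Is the report genuinely signed by the hardware root of trust?
    if report.signature != vendor_sign(report.measurement):
        return False
    # 2. Is the measured image exactly the one the client expects?
    return report.measurement == EXPECTED_MEASUREMENT

def send_prompt(report: AttestationReport, prompt: str) -> str:
    # The device refuses to talk to anything that does not attest correctly.
    if not verify(report):
        raise ConnectionRefusedError("attestation failed; prompt not sent")
    return f"sent {len(prompt)} bytes to attested TEE"
```

This is the sense in which the guarantee differs from a retention policy: a modified server image produces a different measurement, verification fails, and the prompt is never transmitted, regardless of what any policy document says.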
Why the distinction matters: a retention window can be quietly lengthened, subpoenaed around, or breached. A TEE either attests correctly or the device refuses to talk to it. Per Meta's framing, Google retains 'temporary' Gemini chats for up to roughly three days and OpenAI retains ChatGPT 'temporary' chats for up to thirty days [5]. Meta is arguing that any number greater than zero is the wrong category of answer, and that the right answer requires re-architecting the inference stack, not editing a settings page.



