Uncensored models and community fine-tunes, served from verified Macs — where the machine's owner can't read your prompts, and you can prove it.
Keep your OpenAI client. Reach the long tail — uncensored models, rare fine-tunes — that no hosted API will touch. Your prompts stay private.
Host the models you choose — pull them with your own Hugging Face key — and earn per token. The work runs where you can't be watched, and neither can the prompts.
P-256 keypair on the chip. Private key never leaves the Enclave, even to RAM.
Apple MDM SecurityInfo cross-checks the SE serial. Bad posture → out.
Full X.509 chain to the Apple Enterprise Attestation Root CA.
Binary SHA-256 pinned. The provider is the audited build, not a fork.
You got the real model — verified at request time, every request.
The long tail incumbents won't host — uncensored and community fine-tunes — served from Macs you can verify, at prices the cloud can't match. Browse the catalog, pick one, point your client at it.
Private-prompt inference isn't a feature we promise. It's a property the hardware has. Every other promise in this market is built on top of that property — and on top of the assumption that you, the buyer, will trust the seller's word for it. We don't think you should have to.
Most Apple Silicon machines sit unused 18 hours a day. Umbra turns that into a verified per-token earn — and gives the long-tail model market the hosts it can't get anywhere else.