1) Runtime Architecture (Browser as Compute Node)
OfflineTranscriber runs Whisper inference in a Web Worker, so heavy processing does not block UI interactions. The worker executes model operators through browser-side runtimes: WebAssembly (WASM), with WebGPU acceleration when available.
In practical terms, your device executes the matrix operations. We do not run a transcription server that receives raw media files for decoding.
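The worker split described above can be sketched as a message protocol between the UI thread and the inference worker. Everything here is illustrative, not the actual product API: the message shapes, state names, and worker file path are assumptions. The lifecycle logic is written as a pure function so it can run outside a browser.

```typescript
// Hypothetical message protocol between the UI thread and the
// inference Web Worker (names are illustrative, not the product API).
type WorkerRequest =
  | { kind: "init"; modelUrl: string }
  | { kind: "transcribe"; pcm: Float32Array; sampleRate: number };

type WorkerState = "idle" | "loading" | "ready" | "busy";

// Pure lifecycle model: which requests are legal in which state.
// The real worker would transition "loading" -> "ready" after the
// model finishes loading, and "busy" -> "ready" after decoding.
function nextState(state: WorkerState, req: WorkerRequest): WorkerState {
  if (req.kind === "init" && state === "idle") return "loading";
  if (req.kind === "transcribe" && state === "ready") return "busy";
  throw new Error(`request "${req.kind}" not valid in state "${state}"`);
}

// On the UI thread the requests are posted to the worker, so decoding
// never blocks rendering (browser-only; will not run under Node):
//   const worker = new Worker("transcriber.worker.js", { type: "module" });
//   worker.postMessage({ kind: "init", modelUrl: "/models/whisper-base.bin" });
```

Because `postMessage` is asynchronous, the UI stays responsive even while the worker is saturating the CPU or GPU with matrix operations.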
2) The "Secret Sauce": WebAssembly and WebGPU
Traditional transcription products send audio to cloud Python services. Our path is different: model execution happens directly in the browser sandbox via WebAssembly, with WebGPU used when supported by your hardware and browser.
If WebGPU is unavailable, the runtime falls back to CPU execution via WASM. This changes speed, not the privacy boundary.
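The backend choice above amounts to a simple feature detection: supporting browsers expose WebGPU as `navigator.gpu`. A minimal sketch, written against a navigator-shaped parameter so the logic is testable outside a browser:

```typescript
type Backend = "webgpu" | "wasm";

// Pick the execution backend. `nav` mirrors the relevant part of the
// browser's `navigator` object; in the page you would pass `navigator`.
function pickBackend(nav: { gpu?: unknown }): Backend {
  // WebGPU is exposed as navigator.gpu in supporting browsers.
  return nav.gpu !== undefined ? "webgpu" : "wasm";
}

// In the browser: const backend = pickBackend(navigator);
```

Either branch keeps execution inside the browser sandbox; the fallback only trades GPU throughput for CPU throughput.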
3) Model Caching and Offline Behavior
The first run downloads model artifacts and stores them in the browser cache. Once that cache is present, repeated transcription with that model can run fully offline.
Clearing site data removes the cached models, so a fresh download is required afterward. This is expected behavior, not data loss on our backend.
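The cache-or-download behavior can be sketched as follows. In the real app the store would be the browser Cache API or IndexedDB; here the store is an injected interface (a Map-backed stand-in works for testing), and all names are illustrative assumptions:

```typescript
// Minimal artifact store abstraction (stand-in for the Cache API).
interface ArtifactStore {
  get(url: string): Promise<ArrayBuffer | undefined>;
  put(url: string, data: ArrayBuffer): Promise<void>;
}

// Return the model from cache when present; otherwise download it
// once and cache it for future offline runs.
async function getModel(
  url: string,
  store: ArtifactStore,
  download: (url: string) => Promise<ArrayBuffer>,
): Promise<ArrayBuffer> {
  const cached = await store.get(url);
  if (cached !== undefined) return cached; // offline path: no network needed
  const data = await download(url);        // first run: network required
  await store.put(url, data);
  return data;
}
```

Clearing site data empties the store, which is why the next run takes the download branch again.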
4) Why We Cannot Read Your Media
We do not expose an endpoint that receives media payloads for transcription. During normal use, network traffic is limited to static assets, model artifact downloads, licensing, and basic product telemetry.
You can verify this yourself: open the Chrome DevTools Network panel and filter by Fetch/XHR while transcribing. You should not see any request posting raw audio bytes to our domain.
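Beyond DevTools, you can audit from the page itself by wrapping `fetch` so every outgoing request URL is recorded. This is a sketch of an independent check, not part of the product; the simplified `FetchLike` type is an assumption that keeps the logic runnable outside a browser:

```typescript
// Simplified fetch signature for the sketch (the real fetch also
// accepts Request and URL inputs).
type FetchLike = (url: string, init?: unknown) => Promise<unknown>;

// Wrap a fetch function so every requested URL lands in `log`.
function auditFetch(baseFetch: FetchLike): { fetch: FetchLike; log: string[] } {
  const log: string[] = [];
  const wrapped: FetchLike = (url, init) => {
    log.push(url);               // record before forwarding
    return baseFetch(url, init); // behavior is otherwise unchanged
  };
  return { fetch: wrapped, log };
}

// In the browser console, before transcribing:
//   const audit = auditFetch(window.fetch.bind(window));
//   window.fetch = audit.fetch;  // then inspect audit.log afterward
```

After a transcription run, the log should contain asset, model, licensing, and telemetry URLs, and nothing carrying audio data.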
5) Limits and Failure Boundaries
Local inference quality and speed still depend on source audio quality, speaker overlap, background noise, and device memory/compute capacity. On low-memory devices, long files should be processed in smaller chunks for stability.
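The chunking advice above can be sketched as a planning function: split a long PCM buffer into fixed-size segments with a small overlap so words at chunk boundaries are not cut off. The sizes and the overlap strategy are illustrative assumptions, not the product's actual scheduler:

```typescript
// Plan sample-index spans for chunked transcription of a long file.
// `size` and `overlap` are in samples (e.g. 30 s and 1 s at 16 kHz
// would be 480_000 and 16_000).
function planChunks(
  total: number,
  size: number,
  overlap: number,
): Array<{ start: number; end: number }> {
  if (overlap >= size) throw new Error("overlap must be smaller than chunk size");
  const spans: Array<{ start: number; end: number }> = [];
  let start = 0;
  while (start < total) {
    const end = Math.min(start + size, total);
    spans.push({ start, end });
    if (end === total) break;
    start = end - overlap; // next chunk re-reads the overlap region
  }
  return spans;
}
```

Smaller chunks bound peak memory at the cost of more overlap re-decoding; merging the overlapping transcripts is a separate step not shown here.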
Offline mode requires one successful model download first. If the model cache is missing or has been cleared, an internet connection is needed to set it up again.
Technical FAQ
- Does this work fully offline?
- Yes, after model artifacts are downloaded once and stored in browser cache. First-time setup still needs network access.
- What requests are still sent over network?
- Static assets, model downloads, licensing flows, and basic telemetry. The media payload for transcription is processed locally and never sent.
- How can I independently verify the privacy boundary?
- Use browser DevTools: record Network traffic while transcribing and inspect request payloads. You should not find raw audio media in API requests to our service.