DOWNLOADING MODEL
Initializing...
0%
0 MB / ~1.5 GB
Calculating speed...
Vision Encoder (SigLIP2): ~400 MB
Language Decoder (Q4): ~1.1 GB
⚡ First load downloads ~1.5 GB from Hugging Face.
🔒 Everything runs 100% in-browser. No data leaves your device.
🛜 Keep this tab open. Do not refresh during the download.