Qwen 3.5 running on an iPhone Pro in airplane mode. A full large language model running on an edge device with no network connectivity.
Qwen 3.5 is now running fully on-device on an iPhone 17 Pro, and that’s a big deal.
Despite its compact size, Qwen 3.5 reportedly outperforms models up to four times larger. It shows strong multimodal capability, meaning it can interpret and reason over images as well as text. It also includes a reasoning toggle, letting users switch between faster responses and deeper step-by-step thinking depending on the task.
The demo uses a 2B-parameter model quantized to 6-bit precision and optimized with MLX for Apple Silicon. That combination allows advanced AI to run locally, without relying on cloud servers.
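To see why 6-bit quantization matters on a phone, here is a rough back-of-the-envelope sketch of the weight storage involved. It is only an estimate under stated assumptions: "2B" is treated as exactly 2 billion parameters, only the weights are counted (no KV cache, activations, or runtime overhead), and the grouped variant assumes a hypothetical layout of one 16-bit scale and one 16-bit offset per group of 64 weights, which may not match what the demo actually used.

```python
def weight_footprint_bytes(n_params, bits_per_weight,
                           group_size=None, overhead_bits_per_group=0):
    """Approximate storage for model weights only, in bytes.

    group_size / overhead_bits_per_group model the extra metadata
    (e.g. scales and offsets) that group-wise quantization stores.
    Both are illustrative assumptions, not figures from the demo.
    """
    effective_bits = bits_per_weight
    if group_size:
        # Spread the per-group metadata cost across the weights in the group.
        effective_bits += overhead_bits_per_group / group_size
    return n_params * effective_bits / 8


N_PARAMS = 2_000_000_000  # "2B" from the post, assumed to be exactly 2e9

fp16 = weight_footprint_bytes(N_PARAMS, 16)
q6 = weight_footprint_bytes(N_PARAMS, 6)
# Assumed layout: 64 weights per group, 32 bits of metadata per group.
q6_grouped = weight_footprint_bytes(
    N_PARAMS, 6, group_size=64, overhead_bits_per_group=32
)

for label, size in [("16-bit", fp16),
                    ("6-bit (ideal)", q6),
                    ("6-bit (grouped, assumed)", q6_grouped)]:
    print(f"{label}: {size / 1e9:.2f} GB")
```

Under these assumptions the weights shrink from about 4 GB at 16-bit to roughly 1.5 to 1.6 GB at 6-bit, which is the kind of reduction that makes a model of this size plausible within a phone's memory budget.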
If this scales, it signals a shift toward powerful, private, on-device AI that doesn’t need a data center to compete.









