Pocket Models for iOS: Explore On-Device AI with GGUF Models, Data Memory, and Journeys
Download Free on the App Store
Why We Built This
Our team has been working on on-device data and AI for over a decade, long before the current wave of small language models made it mainstream. We've always believed that the most meaningful AI experiences will run locally, close to the person, close to their data.
On-device AI is now at an inflection point. Models like Llama, Gemma, Phi, and LiquidAI's LFM Nanos run comfortably on smartphones. The Hugging Face community is producing thousands of GGUF quantisations optimised for local inference. The hardware is ready.
But loading a model onto a phone is just the starting point. The real opportunity is in what you build on top of it: persistent memory, structured journeys, on-device agents that act on your behalf using your real data, all without anything leaving the device. That's the leap from "local inference" to "Device Native AI."
There's still a gap between running a model on a phone and understanding what on-device journeys, agents, and persistent context mean for a real product. Pocket Models is designed to close that gap. It's an education, discovery, and experimentation tool: a free sandbox where developers can test SLMs on real mobile hardware, and brands exploring Device Native AI can see what's possible when AI runs locally with access to real personal context.
What's in the App
Run GGUF Models On-Device
Pocket Models supports GGUF models running entirely on your iPhone. No cloud, no API calls, no data leaving your device.
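To make that concrete, here's a rough Swift sketch of what local GGUF inference looks like from app code. The `LocalGGUFModel` wrapper, its methods, and the model filename are hypothetical stand-ins for an embedded llama.cpp-style runtime, not the Pocket Models or DataSapien API:

```swift
import Foundation

// Hypothetical wrapper around an embedded llama.cpp-style runtime.
// Names and signatures are illustrative only, not an actual SDK API.
struct LocalGGUFModel {
    let modelURL: URL       // e.g. a Q4_K_M quantisation downloaded into app storage
    let contextLength: Int  // phones usually run small models with modest contexts

    func generate(prompt: String, maxTokens: Int = 256) -> String {
        // A real implementation would tokenise the prompt, run decode steps on
        // CPU/GPU, and stream tokens back. Everything stays on the device.
        return "(generated text)"
    }
}

// Usage: load a quantised model from local storage and prompt it, fully offline.
let modelURL = FileManager.default
    .urls(for: .documentDirectory, in: .userDomainMask)[0]
    .appendingPathComponent("llama-3.2-1b-instruct-q4_k_m.gguf")  // example filename

let model = LocalGGUFModel(modelURL: modelURL, contextLength: 4096)
let reply = model.generate(prompt: "Summarise my last three workouts.")
print(reply)  // no network request was made at any point
```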
For this initial release, any GGUF model can be requested and loaded. Before a model joins the curated library, we test it on real devices, evaluating inference speed, memory footprint, and response quality across different iPhone generations, and we're actively expanding the selection based on community feedback.
If there's a specific GGUF model from Hugging Face you'd like to see supported, let us know.
Data Memory
This is where Pocket Models goes beyond a standard local inference app.
The app includes a personal data store powered by the DataSapien SDK. As you interact with models, Pocket Models builds a local data profile: your preferences, context, and conversation history, stored entirely on-device in what we call MeData.
This means your local models draw on persistent personal context via on-device RAG. They don't start from zero every session. They remember what you've discussed, learn your preferences, and deliver increasingly relevant responses, all without any data leaving your phone. The memory persists even when you switch between models.
The data memory layer supports hundreds of structured data types: health and fitness signals, app usage patterns, media preferences, psychographic profiles, device context, and more. All collected, processed, and stored locally. Zero data shared.
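As a rough illustration of the pattern (not the actual MeData implementation), the Swift sketch below writes facts to a local store and recalls them when building the next prompt; all type names are hypothetical:

```swift
import Foundation

// Hypothetical local fact store. The real MeData layer and its schema are part
// of the DataSapien SDK; this only illustrates the persistence pattern.
struct MemoryFact: Codable {
    let topic: String      // e.g. "fitness", "media", "sleep"
    let text: String       // the remembered fact itself
    let updatedAt: Date
}

final class LocalMemoryStore {
    private var facts: [MemoryFact] = []
    private let fileURL = FileManager.default
        .urls(for: .documentDirectory, in: .userDomainMask)[0]
        .appendingPathComponent("memory-sketch.json")

    init() {
        // Reload on every launch, so memory survives across sessions and models.
        if let data = try? Data(contentsOf: fileURL),
           let saved = try? JSONDecoder().decode([MemoryFact].self, from: data) {
            facts = saved
        }
    }

    func remember(topic: String, text: String) {
        facts.append(MemoryFact(topic: topic, text: text, updatedAt: Date()))
        if let data = try? JSONEncoder().encode(facts) {
            try? data.write(to: fileURL)   // persisted locally, never uploaded
        }
    }

    // Naive retrieval: keep facts whose topic appears in the query. A real
    // implementation would use on-device embeddings or structured filters.
    func recall(matching query: String, limit: Int = 5) -> [MemoryFact] {
        Array(facts.filter { query.localizedCaseInsensitiveContains($0.topic) }.suffix(limit))
    }
}

// Remember a fact learned in conversation, then fold it into the next prompt.
let memory = LocalMemoryStore()
memory.remember(topic: "fitness", text: "Runs about 20 km per week, prefers mornings.")
let context = memory.recall(matching: "Plan my fitness week").map(\.text).joined(separator: "\n")
let prompt = "Context:\n\(context)\n\nQuestion: Plan my fitness week."
print(prompt)
```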
For developers, this is a chance to experience what on-device personalisation actually feels like, and to understand why persistent local context changes the quality of small model interactions so dramatically.
For brands, it's a hands-on preview of what your own app could deliver using the same SDK.
Journeys
Pocket Models includes Journeys: guided, structured AI experiences that go beyond open-ended chat.
Journeys combine local model inference with your personal data to deliver specific outcomes: a personalised wellness check-in, a review of your habits, or a decision-making framework grounded in your actual behavioural data.
Think of the difference between a blank prompt and a purpose-built AI experience, except it's running entirely on your device, powered by your real data, and nothing ever touches a server.
Journeys are orchestrated using DataSapien's no-code canvas, which means new Journeys can be pushed to the app without requiring App Store updates. This is the same orchestration layer available to any developer or brand building on the DataSapien platform.
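One way to picture how a Journey ships without an App Store update is to treat it as data the app interprets rather than code it compiles. The structure below is a hypothetical sketch, not the DataSapien canvas format:

```swift
import Foundation

// Hypothetical Journey definition, decoded from config synced by the orchestrator.
// The real no-code canvas format is defined by the DataSapien platform.
struct JourneyStep: Codable {
    let id: String
    let promptTemplate: String   // may reference local MeData fields, e.g. {{sleep.avgHours}}
    let requiredData: [String]   // which on-device signals this step needs
}

struct Journey: Codable {
    let name: String
    let steps: [JourneyStep]
}

let journeyJSON = """
{
  "name": "Weekly wellness check-in",
  "steps": [
    {
      "id": "review-sleep",
      "promptTemplate": "The user slept {{sleep.avgHours}} hours on average this week. Offer one gentle observation.",
      "requiredData": ["sleep.avgHours"]
    }
  ]
}
""".data(using: .utf8)!

// Each step is filled in with local data, run through the on-device model,
// and the results are stitched together into a guided experience.
let journey = try! JSONDecoder().decode(Journey.self, from: journeyJSON)
print(journey.steps.count)
```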
The Stack: DataSapien Device Native AI SDK
Pocket Models is built on the DataSapien Device Native AI SDK, the same SDK available to any iOS, Android, or Flutter developer building on-device AI experiences.
The SDK handles:
- On-device MeData collection: baseline device signals plus native platform data
- On-device inference: run and manage GGUF models locally
- On-device audiences and rules: segment and personalise without a server round-trip
- On-device Journeys: orchestrate structured AI experiences locally
- Orchestrator sync: pull down new logic, models, and Journey definitions without syncing user data
The architecture follows a Zero-Shared Data principle: the Orchestrator syncs logic, not user data. Your personal data store stays on your device. The platform sends down model configurations, Journey definitions, and audience rules. Never the reverse.
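Here's a hedged sketch of what that one-way relationship could look like from the app's side, with hypothetical names and a made-up config shape (the real SDK surface will differ):

```swift
import Foundation

// Hypothetical illustration of the Zero-Shared Data direction of travel:
// configuration flows down to the device, personal data never flows up.
struct RemoteConfig: Codable {
    let modelCatalog: [String]        // which GGUF models the app may offer
    let journeyDefinitions: [String]  // Journey configs, interpreted locally
    let audienceRules: [String]       // evaluated on-device against local MeData
}

func syncFromOrchestrator(endpoint: URL) async throws -> RemoteConfig {
    // Download-only: a plain GET with no request body, no identifiers derived
    // from MeData, and no telemetry about the user's local store.
    let (data, _) = try await URLSession.shared.data(from: endpoint)
    return try JSONDecoder().decode(RemoteConfig.self, from: data)
}
// The personal data store stays on the device; only the decoded config is applied.
```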
The SDK is available on:
- pub.dev/packages/datasapien_sdk (Flutter)
- Native iOS and Android SDKs via dev.datasapien.com
React Native support is coming soon.
For the Hugging Face Community
Pocket Models is built to work with the GGUF models this community is already creating and sharing. As we expand the model library, we're pulling directly from HF-hosted quantisations, and we'd love input on which models to prioritise.
Some questions we're exploring:
- Which sub-3B models give the best conversational quality on iPhone? We're benchmarking across Llama, Gemma, Phi, and others. If you have experience with specific quantisations that punch above their weight on mobile, we want to hear about it.
- How should personal context be structured for on-device RAG with small models? We're working with around 350 structured data types and limited context windows. What retrieval patterns have you found most effective? (One candidate pattern is sketched after this list.)
- What Journeys would you build? If you had a local model with genuine knowledge of a user's health data, media habits, and daily patterns, what experience would you design?
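On the retrieval question, one candidate pattern is budget-aware packing: score each record for relevance, then greedily add records until a small-model context budget is spent. A rough Swift sketch, using a crude characters-per-token estimate in place of the model's real tokeniser:

```swift
import Foundation

// Hypothetical record shape; real MeData types are richer and structured.
struct Record {
    let text: String
    let relevance: Double  // from on-device embeddings, recency, or rule-based scoring
}

// Pack the most relevant records into a fixed context budget for a small model.
// ~4 characters per token is a crude estimate; swap in the model's tokeniser.
func packContext(_ records: [Record], tokenBudget: Int = 1024) -> String {
    var used = 0
    var picked: [String] = []
    for record in records.sorted(by: { $0.relevance > $1.relevance }) {
        let cost = record.text.count / 4 + 1
        guard used + cost <= tokenBudget else { continue }
        used += cost
        picked.append(record.text)
    }
    return picked.joined(separator: "\n")
}

let packed = packContext([
    Record(text: "Average sleep last week: 6.4 hours.", relevance: 0.9),
    Record(text: "Favourite podcast genre: history.", relevance: 0.2)
], tokenBudget: 256)
print(packed)
```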
Get Started
Pocket Models is free and available now on iOS.
Download Free on the App Store
Pocket Models is iOS-only for now. If there's demand for an Android version, we'll build it; if that's something you'd find useful, let us know in the comments or drop us a message. We want to go where the community needs us.
We're actively building the model library and Journey catalogue. If you have a GGUF model you'd like to see supported, a Journey idea, or feedback on the architecture, reach out.
Pocket Models is built by DataSapien, a Device Native AI platform based in London and Türkiye. We're building the infrastructure for a post-cloud era of personalised, on-device AI, where the smartest AI isn't in a data centre; it's in your pocket.

