Cactus v1 is in beta!
After months of development and feedback from our community, we're launching the new Cactus SDK with significant architectural improvements and performance optimizations.
Cactus v0 served as our foundational release, proving that high-performance on-device AI inference was possible on mobile devices. However, as developers began building more complex applications, we identified several areas for improvement in scalability, developer experience, and platform consistency.
v1 represents a complete overhaul of our inference engine with optimized ARM-CPU kernels that deliver substantially better performance across all supported devices. We've rebuilt our SDKs from the ground up to provide consistent APIs across Flutter, Kotlin Multiplatform, and C++, while maintaining backward compatibility where possible.
The new architecture is more energy-efficient and more stable on lower-end devices. It also introduces hybrid completion modes that seamlessly fall back to cloud inference when needed, ensuring reliability in production applications. This addresses one of the most common requests from v0 users who needed guaranteed response times for critical user-facing features.
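The fallback pattern behind hybrid completion can be sketched generically. The snippet below is an illustrative sketch only, not the Cactus API: `completeHybrid`, `onDevice`, `cloud`, and `timeoutMs` are hypothetical names, and the real SDK's interface may differ. The idea is simply to race the on-device model against a deadline and fall back to a cloud endpoint if the device is too slow or errors out.

```typescript
// Hypothetical sketch of hybrid completion with cloud fallback.
// None of these names are real Cactus APIs.
type Completion = { text: string; source: "device" | "cloud" };

async function completeHybrid(
  prompt: string,
  onDevice: (p: string) => Promise<string>,
  cloud: (p: string) => Promise<string>,
  timeoutMs = 2000,
): Promise<Completion> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  // Deadline promise: rejects if the on-device model takes too long.
  const deadline = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error("on-device timeout")), timeoutMs);
  });
  try {
    // Race local inference against the deadline.
    const text = await Promise.race([onDevice(prompt), deadline]);
    return { text, source: "device" };
  } catch {
    // Timeout or on-device failure: fall back to cloud inference.
    return { text: await cloud(prompt), source: "cloud" };
  } finally {
    // Cancel the pending timer so it cannot fire (or reject) later.
    clearTimeout(timer);
  }
}
```

Clearing the timer in `finally` matters: without it, the losing deadline promise would reject after the race is already settled, surfacing as an unhandled rejection.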
We've also completely redesigned our telemetry and monitoring systems to give developers granular insights into their AI model performance, usage patterns, and potential optimization opportunities. This data-driven approach enables teams to make informed decisions about model selection and deployment strategies. Get started with telemetry here.
For existing Cactus v0 users, we recommend migrating to v1 to take advantage of current and upcoming performance improvements. React Native developers can continue using v0 while we finalize the v1 bindings.
|  | v0 React Native | v0 Flutter | v1 React Native | v1 Flutter | v1 Kotlin |
| --- | --- | --- | --- | --- | --- |
| LLM inference | ✓ | ✓ | Soon | ✓ | ✓ |
| Tool calling | ✓ | ✓ | Soon | ✓ | ✓ |
| Embeddings | ✓ | ✓ | Soon | ✓ | ✓ |
| Voice transcription | ✓ | ✓ | Soon | ✓ | Soon |
| Voice synthesis | ✓ | ✓ | Soon | Soon | Soon |
| Image embedding | ✓ | ✓ | Soon | Soon | Soon |
| RAG | ✓ | ✓ | Soon | ✓ | Soon |
| Model format | GGUF | GGUF | Cactus | Cactus | Cactus |
* Production benchmarks using Qwen3 0.6B Q8 running CPU-only inference on an iPhone 16 Pro Max