meetsansara.com/tech
SANSARA Technical Specification
v0.7 · April 2026 · Beta
System Overview
SANSARA is an offline-first AI wellness companion built on React Native 0.85. All inference, storage, and personalization run on-device. The application requests no network permissions, embeds no API keys, and bundles no analytics SDK. The binary is self-contained.
┌─────────────────────────────────────────────────────────────────┐
│ USER INPUT │
│ Text │ Voice (Whisper) │ Face (MLKit) │ Biometrics │
└────┬─────┴──────────┬──────────┴────────┬──────────┴────────┬───┘
└────────────────┴─────────┬─────────┴───────────────────┘
│
┌───────────▼───────────┐
│ SENSORY FUSION │
│ Unified context vec │
└───────────┬───────────┘
│
┌───────────▼───────────┐
│ MATRIX ROUTER │
│ <100M param clf │
│ Selects 2-4 agents │
└───────────┬───────────┘
│
┌───────────▼───────────┐
│ BLACKBOARD │
│ Shared mutable JSON │
└───────────┬───────────┘
│
┌───────────▼───────────┐
│ CONSENSUS LOOP │
│ Propose → Critique │
│ → Refine → Converge │
└───────────┬───────────┘
│
┌───────────▼───────────┐
│ RESPONSE │
└───────────────────────┘
Model Tiers
All tiers run the identical 9-agent architecture, memory system, and personalization pipeline. The only variable is the base language model, determined by device hardware capability.
| Tier | Model | Params | Quant | Disk | Hardware |
|---|---|---|---|---|---|
| Standard | Gemma 4 E2B | 2B | Q4_K_M GGUF | ~3 GB | iPhone 12+, Pixel 6+, 4GB RAM+ |
| Pro | Gemma 4 E4B | 4B | Q4_K_M GGUF | ~5 GB | iPad, Galaxy Tab, S25 Ultra, 6GB RAM+ |
| Desktop | Gemma 4 E4B | 4B | Q4_K_M GGUF | ~5 GB | High-RAM devices, 8GB RAM+ |
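The hardware column in the tier table reduces to a RAM gate. A minimal sketch in TypeScript (the function name and exact thresholds are illustrative; a shipped selector would presumably also check chipset and free memory):

```typescript
type Tier = "Standard" | "Pro" | "Desktop";

// Illustrative thresholds taken from the tier table above.
function chooseTier(ramGB: number): Tier {
  if (ramGB >= 8) return "Desktop"; // high-RAM devices
  if (ramGB >= 6) return "Pro";     // iPad, Galaxy Tab, S25 Ultra class
  return "Standard";                // 4 GB baseline (iPhone 12+, Pixel 6+)
}
```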
Inference Pipeline
iOS Pipeline
Model format: Q4_K_M GGUF
Runtime: llama.rn (React Native wrapper for llama.cpp)
GPU backend: Metal (Apple Silicon)
Model loading: llamaRn.initLlama({ model: path, ... })
Inference: Metal-accelerated matrix ops
Android Pipeline
Model format: .pte (ExecuTorch ahead-of-time compiled)
Runtime: react-native-executorch + ExecuTorch
GPU backend: QNN → Qualcomm AI Engine / Adreno GPU
Fallback: XNNPACK (CPU)
Runtime Footprint
Model on disk (Standard): ~3 GB (Gemma 4 E2B Q4_K_M)
Model on disk (Pro): ~5 GB (Gemma 4 E4B Q4_K_M)
KV cache overhead: ~200-500 MB (context-dependent)
Whisper ASR (tiny.en): ~75 MB
Multi-Agent Architecture
Blackboard Pattern
Agents do not communicate directly. They read from and write to a shared mutable state object (the Blackboard). This decoupled architecture avoids the sequential context degradation of linear agent chains, where each hand-off compounds information loss.
Blackboard = {
raw_input: string,
prosody_vector: float[],
emotion_vector: float[192],
biometric_state: { steps, light, hrv, sleep },
history_embeds: float[][], // HNSW nearest neighbors
agent_proposals: Proposal[],
agent_critiques: Critique[],
consensus_state: "pending" | "converged",
final_response: string | null
}
Matrix Router
A sub-100M-parameter classification model analyzes the fused input vector and selects 2-4 agents. Each selected agent runs as a separate inference pass with a distinct system prompt against the shared base model.
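The selection step can be sketched as a top-k over per-agent relevance scores, clamped to the 2-4 range. The function name, score format, and threshold below are illustrative, not the shipped classifier head:

```typescript
const AGENTS = [
  "ECHO_EMPATH", "PATTERN_ECHO", "BIO_MIRROR",
  "INSIGHT_WEAVER", "TREND_SAGE", "ENV_THINKER",
  "ACTION_NURTURER", "GROWTH_GUIDE", "HABIT_ELEVATOR",
] as const;
type AgentId = (typeof AGENTS)[number];

// scores: per-agent relevance from the router classifier, in [0, 1]
function selectAgents(
  scores: Partial<Record<AgentId, number>>,
  threshold = 0.5,
): AgentId[] {
  const ranked = [...AGENTS].sort((a, b) => (scores[b] ?? 0) - (scores[a] ?? 0));
  const above = ranked.filter((id) => (scores[id] ?? 0) >= threshold).length;
  return ranked.slice(0, Math.min(4, Math.max(2, above))); // clamp to 2-4
}
```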
Consensus Loop
loop:
  1. PROPOSE  — Each active agent writes a response proposal
  2. CRITIQUE — Each agent reads peer proposals, posts critiques
  3. REFINE   — Each agent integrates feedback, rewrites
  4. MEASURE  — Control Shell computes semantic similarity
  5. IF similarity < threshold → GOTO 1
  6. CONVERGE — Emit unified response
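The loop above can be sketched with stubbed agents and a stand-in similarity measure. The real Control Shell presumably compares embeddings; token-overlap Jaccard is used here purely so the sketch is self-contained, and all names are illustrative:

```typescript
interface Agent {
  propose(ctx: string): string;
  refine(own: string, peers: string[]): string;
}

// Stand-in for semantic similarity: mean pairwise token-overlap Jaccard.
function similarity(texts: string[]): number {
  const sets = texts.map((t) => new Set(t.toLowerCase().split(/\s+/)));
  let total = 0, pairs = 0;
  for (let i = 0; i < sets.length; i++)
    for (let j = i + 1; j < sets.length; j++) {
      const inter = [...sets[i]].filter((w) => sets[j].has(w)).length;
      const union = new Set([...sets[i], ...sets[j]]).size;
      total += inter / union;
      pairs++;
    }
  return pairs ? total / pairs : 1;
}

function consensus(agents: Agent[], ctx: string, threshold = 0.8, maxRounds = 3): string {
  let proposals = agents.map((a) => a.propose(ctx)); // PROPOSE
  for (let round = 0; round < maxRounds; round++) {
    if (similarity(proposals) >= threshold) break;   // MEASURE → CONVERGE
    proposals = agents.map((a, i) =>                 // CRITIQUE + REFINE
      a.refine(proposals[i], proposals.filter((_, j) => j !== i))
    );
  }
  return proposals[0]; // unified response
}
```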
The Nine Agents
| ID | Layer | Input Signal | Behavior |
|---|---|---|---|
| ECHO_EMPATH | Emotional Mirroring | Text sentiment + vocal prosody | Validates immediate emotional state |
| PATTERN_ECHO | Emotional Mirroring | Current input + historical memory | Surfaces recurring emotional themes |
| BIO_MIRROR | Emotional Mirroring | Steps, light, HRV, sleep | Reflects body-mood connection |
| INSIGHT_WEAVER | Cognitive Reframing | Immediate text input | Restructures negatives into reflective prompts |
| TREND_SAGE | Cognitive Reframing | Historical vector embeddings | Reveals macro-trends in micro-failures |
| ENV_THINKER | Cognitive Reframing | Ambient light, location, time | Context-aware environmental reframing |
| ACTION_NURTURER | Compassionate Action | Acute stress level | Immediate grounding exercises |
| GROWTH_GUIDE | Compassionate Action | Historical success database | Recommends proven personal strategies |
| HABIT_ELEVATOR | Compassionate Action | Biometric telemetry | Links behavioral change to physical metrics |
All nine agents share a single quantized base model; the Matrix Router time-slices inference passes with distinct system prompts. Nine separate models are never loaded.
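The time-slicing amounts to sequential passes over the same loaded weights, varying only the system prompt. A sketch, where `runBase` stands in for a single llama.rn / ExecuTorch inference call and the prompt texts are invented for illustration:

```typescript
const SYSTEM_PROMPTS: Record<string, string> = {
  ECHO_EMPATH: "Validate the user's immediate emotional state.",
  ACTION_NURTURER: "Offer one immediate grounding exercise.",
  // ... one entry per agent
};

async function runAgents(
  active: string[],
  userTurn: string,
  runBase: (system: string, user: string) => Promise<string>,
): Promise<Record<string, string>> {
  const out: Record<string, string> = {};
  for (const id of active) {
    // Sequential passes reuse the same weights; only the prompt differs.
    out[id] = await runBase(SYSTEM_PROMPTS[id], userTurn);
  }
  return out;
}
```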
Multimodal Fusion Layer
| Modality | Implementation | RAM | Latency | Output |
|---|---|---|---|---|
| Voice | whisper.rn (ggml-tiny.en) | ~75 MB | <2s / chunk | Text + temporal prosody array |
| Face | Google ML Kit Face Detection | ~5 MB | Real-time | Emotion vector (opt-in, per-session) |
| Biometrics | HealthKit / Health Connect | Negligible | Passive | { steps, heart_rate, sleep } |
All modalities are opt-in per session. Raw pixel data is purged from RAM immediately after embedding extraction; only the output vector persists on-device.
On-Device Storage
Journal store: expo-sqlite (SQLite on-device)
Vector index: In-process IVF clustering (JavaScript)
Index cap: 2,000 entries (covers ~2.7 years of daily use)
Encryption: AES-256 at rest (Android); iOS encryption in development
Search: Nearest-neighbor semantic retrieval
Data location: Device only — structurally incapable of leaving
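The retrieval core is cosine top-k over stored journal embeddings. A simplified stand-in for the IVF index (at a 2,000-entry cap, exact brute-force search is also viable; names are illustrative):

```typescript
interface Entry { id: number; embedding: number[]; }

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1);
}

// Exact top-k by cosine similarity to the query embedding.
function nearest(query: number[], entries: Entry[], k = 5): Entry[] {
  return entries
    .map((e) => ({ e, score: cosine(query, e.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((s) => s.e);
}
```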
On-Device Learning
LoRA Fine-Tuning
Method: Low-Rank Adaptation
Target layers: q_proj, v_proj (attention)
Trainable params: ~2-5M (vs billions of frozen base params)
Adapter size: 10-50 MB
Trigger: Device charging + idle (overnight)
Android engine: ExecuTorch native training loop ← available now
iOS engine: In development
Feedback Sessions
Weekly or monthly (user-configured). SANSARA initiates a structured dialogue referencing specific past responses. User corrections are converted to labeled training pairs for the next LoRA cycle. Passive learning draws on response-quality signals: entry length, re-engagement rate, conversation flow depth.
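The correction-to-label conversion can be sketched as a simple mapping, where the user's preferred response becomes the training target. Type and field names are illustrative, not the shipped schema:

```typescript
interface FeedbackItem {
  userEntry: string;      // original journal context
  modelResponse: string;  // what SANSARA said
  userCorrection: string; // how the user wished it had responded
}

interface TrainingPair { prompt: string; completion: string; }

// Each correction becomes a supervised pair for the next LoRA cycle.
function toTrainingPairs(items: FeedbackItem[]): TrainingPair[] {
  return items.map((f) => ({
    prompt: f.userEntry,
    completion: f.userCorrection, // corrected response is the label
  }));
}
```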
Predictive Foresight
Algorithm: Trend analysis on rolling emotional vectors
Input: Historical mood and sentiment scores (local SQLite)
Output: Pattern flags written to Blackboard context
Detection: Recurring dips, weekly stressors, mood cycles
Trigger: Preemptive signal surfaced to active agents
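One plausible form of the dip detection: a least-squares slope over a rolling window of daily mood scores, flagging a sustained decline. Window size, cutoff, and function names are assumptions, not the shipped algorithm:

```typescript
// Least-squares slope of values against their index (days).
function slope(values: number[]): number {
  const n = values.length;
  const meanX = (n - 1) / 2;
  const meanY = values.reduce((s, v) => s + v, 0) / n;
  let num = 0, den = 0;
  for (let i = 0; i < n; i++) {
    num += (i - meanX) * (values[i] - meanY);
    den += (i - meanX) ** 2;
  }
  return den ? num / den : 0;
}

// Flag a dip when the last `window` days trend down past `cutoff`.
function flagRecurringDip(moodScores: number[], window = 7, cutoff = -0.1): boolean {
  if (moodScores.length < window) return false;
  return slope(moodScores.slice(-window)) < cutoff;
}
```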
Privacy Architecture
network_permission: NONE
api_keys_in_binary: NONE
analytics_sdk: NONE
cloud_endpoints: NONE
telemetry_hooks: NONE
data_location: DEVICE ONLY
encryption_android: AES-256 at rest
encryption_ios: In development
bytes_transmitted: 0
accounts_required: 0
# Structurally incapable of transmitting data.
# Not by policy. By architecture.
Accessibility
| Feature | Implementation |
|---|---|
| Interface modes | 8 modes: Minimal → Maximum visual intensity |
| Typography | OpenDyslexic font, adjustable kerning / line-height |
| Sensory controls | Independent toggles: haptics, motion, soundscapes |
| Input | Voice-first journaling with prosody analysis |
| Progress | Stellar Alignments (cumulative) — no streak counters |
| Shame mechanics | None — no streaks, no guilt counters |
| Glassmorphism | Constrained blur (4-6px), enforced text contrast |
Tablet Mode (Pro / Desktop)
| Capability | Implementation |
|---|---|
| Handwriting | On-device OCR → searchable text + preserved strokes (Android ML Kit + iOS Vision) |
| Drawing | Canvas capture → preserved alongside journal entry |
| Stylus | Palm rejection + stylus-only mode; pressure sensitivity in development |
| Devices | Apple Pencil (iPad), Samsung S Pen (Galaxy Tab / S25 Ultra) |
Stack Summary
Application: React Native 0.85 (bare workflow)
UI framework: React Native Skia (SkSL procedural shaders)
Animation: Reanimated + react-native-svg
Inference iOS: llama.rn + GGUF + Metal GPU
Inference Android: react-native-executorch + ExecuTorch
Models: Gemma 4 E2B (Standard) / Gemma 4 E4B (Pro+), Q4_K_M GGUF
Voice: whisper.rn (ggml-tiny.en, ~75 MB)
Face: Google ML Kit Face Detection
Storage: expo-sqlite (SQLite) + in-process vector index
Encryption: AES-256 (Android); iOS in development
Learning: LoRA via ExecuTorch (Android); iOS in development
Forecasting: Trend analysis on local emotional history
State: Zustand + Blackboard JSON
Haptics: Platform LRA APIs (affective patterns)
Testing: Jest + Firebase Test Lab (Android CI)
Payment: Stripe (web) / RevenueCat (in-app)
Web: Next.js 16 on Vercel
Questions: info@sansara.app