meetsansara.com/tech
SANSARA Technical Specification
v0.7 · March 2026 · Pre-release
System Overview
SANSARA is an offline-first AI wellness companion built on React Native 0.84. All inference, storage, and personalization run on-device. The application ships with no network permissions, no API keys, and no analytics SDK. The binary is self-contained.
┌─────────────────────────────────────────────────────────────────┐
│ USER INPUT │
│ Text │ Voice (Whisper) │ Face (MLKit) │ Biometrics │
└────┬─────┴──────────┬──────────┴────────┬──────────┴────────┬───┘
└────────────────┴─────────┬─────────┴───────────────────┘
│
┌───────────▼───────────┐
│ SENSORY FUSION │
│ Unified context vec │
└───────────┬───────────┘
│
┌───────────▼───────────┐
│ MATRIX ROUTER │
│ <100M param clf │
│ Selects 2-4 agents │
└───────────┬───────────┘
│
┌───────────▼───────────┐
│ BLACKBOARD │
│ Shared mutable JSON │
└───────────┬───────────┘
│
┌───────────▼───────────┐
│ CONSENSUS LOOP │
│ Propose → Critique │
│ → Refine → Converge │
└───────────┬───────────┘
│
┌───────────▼───────────┐
│ RESPONSE │
└───────────────────────┘
Model Tiers
All tiers run the identical 9-agent architecture, memory system, and personalization pipeline. The only variable is the base language model, determined by device hardware capability.
| Tier | Model | Params | Quant | RAM | Hardware |
|---|---|---|---|---|---|
| Base | Llama 3.2 1B | 1B | 4-bit QLoRA | ~1.5 GB | iPhone 14+, Pixel 7+, 4GB+ |
| Pro | Llama 3.2 3B | 3B | 4-bit | ~3 GB | iPad, Galaxy Tab, S25 Ultra, 6GB+ |
| Desktop | Llama 3.2 11B | 11B | 4-bit | ~6 GB | M-series Mac, iPad Pro M4, 10GB+ |
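The tier assignment above can be sketched as a simple capability check at first launch. `selectTier` and the RAM thresholds below are illustrative, taken from the table; the shipping detection logic may also consult chipset and NPU availability.

```typescript
// Illustrative tier selection from device RAM (thresholds from the table above).
type Tier = "Base" | "Pro" | "Desktop";

function selectTier(deviceRamGB: number): Tier | null {
  if (deviceRamGB >= 10) return "Desktop"; // Llama 3.2 11B, ~6 GB resident
  if (deviceRamGB >= 6) return "Pro";      // Llama 3.2 3B, ~3 GB resident
  if (deviceRamGB >= 4) return "Base";     // Llama 3.2 1B, ~1.5 GB resident
  return null;                             // below minimum spec
}
```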
Inference Pipeline
Compilation
PyTorch model
  → torch.export()        # Capture computational graph
  → torchao quantization  # 4-bit via SmoothQuant / SpinQuant
  → ExecuTorch lowering   # Hardware backend delegation
  → .pte binary           # Ahead-of-time compiled artifact
Hardware Backends
| Platform | Primary Backend | Fallback |
|---|---|---|
| iOS | CoreML → Apple Neural Engine | XNNPACK (CPU) |
| Android | QNN → Qualcomm AI Engine / Adreno GPU | XNNPACK (CPU) |
| Desktop | CoreML (Apple Silicon) / CPU | XNNPACK |
Speculative Decoding
A lightweight draft model predicts multiple future tokens in parallel; the primary model verifies them in a single batch. Measured speedup: 2.2-3.6x. The processor runs at peak for a fraction of a second, then returns to low-power sleep.
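The draft-then-verify step can be sketched as follows. `draftModel` and `primary` are stand-in next-token functions (real implementations operate on token IDs and logits, and verify all draft positions in one batched forward pass rather than the per-position loop simulated here).

```typescript
// Sketch of one speculative decoding step: draft k tokens cheaply,
// verify with the primary model, keep the longest agreeing prefix.
type Model = (prefix: string[]) => string; // next-token stub

function speculativeStep(draftModel: Model, primary: Model, prefix: string[], k: number): string[] {
  // 1. Draft model proposes k tokens autoregressively (cheap).
  const proposed: string[] = [];
  for (let i = 0; i < k; i++) proposed.push(draftModel([...prefix, ...proposed]));
  // 2. Primary model verifies the proposals (batched in a real runtime).
  const accepted: string[] = [];
  for (let i = 0; i < k; i++) {
    const truth = primary([...prefix, ...accepted]);
    if (truth === proposed[i]) accepted.push(truth); // match: keep the draft token
    else { accepted.push(truth); break; }            // mismatch: take primary's token, stop
  }
  return accepted; // between 1 and k tokens per primary pass
}
```

When the draft model agrees often, each primary pass emits several tokens, which is where the 2.2-3.6x speedup comes from.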
Runtime Footprint
ExecuTorch base runtime: 50 KB
react-native-executorch: JSI bridge, declarative API
Model artifact (.pte): ~500 MB (1B), ~1.5 GB (3B), ~4 GB (11B)
KV cache overhead: ~200-500 MB (context-dependent)
Multi-Agent Architecture
Blackboard Pattern
Agents do not communicate directly. They read from and write to a shared mutable state object (the Blackboard). This decoupled architecture avoids the sequential context degradation of linear agent chains.
Blackboard = {
raw_input: string,
prosody_vector: float[],
emotion_vector: float[192],
biometric_state: { steps, light, hrv, sleep },
history_embeds: float[][], // HNSW nearest neighbors
agent_proposals: Proposal[],
agent_critiques: Critique[],
consensus_state: "pending" | "converged",
final_response: string | null
}
Matrix Router
A sub-100M-parameter classification model analyzes the fused input vector and selects 2-4 agents. Each selected agent's inference pass uses a distinct system prompt against the shared base model.
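The router's selection step can be sketched as score thresholding clamped to the 2-4 agent budget. The `scores` map and the 0.5 threshold are hypothetical; the real classifier's output head and calibration are not specified here.

```typescript
// Hypothetical router output: per-agent activation scores from the <100M
// classifier, clamped to the 2-4 agent budget described above.
const AGENTS = [
  "ECHO_EMPATH", "PATTERN_ECHO", "BIO_MIRROR",
  "INSIGHT_WEAVER", "TREND_SAGE", "ENV_THINKER",
  "ACTION_NURTURER", "GROWTH_GUIDE", "HABIT_ELEVATOR",
] as const;

function routeAgents(scores: Record<string, number>, min = 2, max = 4, threshold = 0.5): string[] {
  // Rank all agents by score, count how many clear the threshold,
  // then clamp that count into the [min, max] budget.
  const ranked = [...AGENTS].sort((a, b) => (scores[b] ?? 0) - (scores[a] ?? 0));
  const above = ranked.filter(a => (scores[a] ?? 0) >= threshold);
  return ranked.slice(0, Math.min(Math.max(above.length, min), max));
}
```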
Consensus Loop
loop:
  1. PROPOSE — Each active agent writes a response proposal
  2. CRITIQUE — Each agent reads peer proposals, posts critiques
  3. REFINE — Each agent integrates feedback, rewrites
  4. MEASURE — Control Shell computes semantic similarity
  5. IF similarity < threshold → GOTO 1
  6. CONVERGE — Emit unified response
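The loop above can be sketched in code. The `Agent` interface, `similarity` function, and the round cap are stand-ins: real agents read and write the Blackboard, and the Control Shell's similarity metric is not specified here.

```typescript
// Minimal consensus loop: propose once, then critique/refine until the
// proposals are semantically close enough (or a round cap is hit).
interface Agent {
  propose(): string;
  critique(peers: string[]): string;
  refine(own: string, critiques: string[]): string;
}

function consensus(
  agents: Agent[],
  similarity: (proposals: string[]) => number, // Control Shell metric stub
  threshold = 0.9,
  maxRounds = 5,
): string[] {
  let proposals = agents.map(a => a.propose());            // PROPOSE
  let rounds = 0;
  while (similarity(proposals) < threshold && rounds < maxRounds) { // MEASURE
    const critiques = agents.map(a => a.critique(proposals));       // CRITIQUE
    proposals = agents.map((a, i) => a.refine(proposals[i], critiques)); // REFINE
    rounds++;
  }
  return proposals; // CONVERGE: merged into the final response downstream
}
```

The round cap is a defensive assumption; without it, disagreeing agents could loop indefinitely.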
The Nine Agents
| ID | Layer | Input Signal | Behavior |
|---|---|---|---|
| ECHO_EMPATH | Emotional Mirroring | Text sentiment + vocal prosody | Validates immediate emotional state |
| PATTERN_ECHO | Emotional Mirroring | Current input + historical memory | Surfaces recurring emotional themes |
| BIO_MIRROR | Emotional Mirroring | Steps, light, HRV, sleep | Reflects body-mood connection |
| INSIGHT_WEAVER | Cognitive Reframing | Immediate text input | Restructures negatives into reflective prompts |
| TREND_SAGE | Cognitive Reframing | Historical vector embeddings | Reveals macro-trends in micro-failures |
| ENV_THINKER | Cognitive Reframing | Ambient light, location, time | Context-aware environmental reframing |
| ACTION_NURTURER | Compassionate Action | Acute stress level | Immediate therapeutic micro-interventions |
| GROWTH_GUIDE | Compassionate Action | Historical success database | Recommends proven personal strategies |
| HABIT_ELEVATOR | Compassionate Action | Biometric telemetry | Links behavioral change to physical metrics |
All nine agents share a single quantized base model. The Matrix Router time-slices inference passes with distinct system prompts; nine separate models are never loaded.
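The time-slicing described above can be sketched as one shared model function invoked once per active agent. `baseModel` and the prompt texts are illustrative stand-ins, not the shipping prompts.

```typescript
// Sketch of time-sliced inference: one shared set of weights, one system
// prompt per agent, invoked sequentially for each router-selected agent.
type BaseModel = (systemPrompt: string, userInput: string) => string;

const SYSTEM_PROMPTS: Record<string, string> = {
  ECHO_EMPATH: "Validate the user's immediate emotional state.",
  INSIGHT_WEAVER: "Restructure negative statements into reflective prompts.",
  // ...one prompt per agent; texts here are illustrative
};

function runAgents(baseModel: BaseModel, activeAgents: string[], input: string): Record<string, string> {
  const proposals: Record<string, string> = {};
  for (const id of activeAgents) {
    // Same weights every pass; only the system prompt changes.
    proposals[id] = baseModel(SYSTEM_PROMPTS[id], input);
  }
  return proposals;
}
```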
Multimodal Fusion Layer
| Modality | Implementation | RAM | Latency | Output |
|---|---|---|---|---|
| Voice | Whisper ASR (ggml-tiny.en) | ~40 MB | <2s / chunk | Text + temporal prosody array |
| Face | ML Kit Face Mesh + MobileNetV3 TFLite | ~5 MB | Real-time | 192-dim emotion vector |
| Biometrics | HealthKit / SensorManager | Negligible | Passive | JSON: { steps, light, hrv, sleep } |
All modalities are opt-in per session. Raw pixel data is purged from RAM immediately after embedding extraction; only the output vector persists on-device.
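The fusion step that produces the unified context vector can be sketched as concatenation of whichever modality outputs the user opted into, zero-padded so the router input stays fixed-width. The 192-dim emotion vector and the four biometric fields come from the table above; the prosody dimension (64 here) is an assumed placeholder.

```typescript
// Sketch of sensory fusion: concatenate per-modality vectors into one
// fixed-width context vector, zero-padding absent (opted-out) modalities.
// The prosody dimension of 64 is an assumption, not a spec value.
function fuse(
  prosody: number[] | null,
  emotion: number[] | null, // 192-dim from the face pipeline
  biometrics: { steps: number; light: number; hrv: number; sleep: number } | null,
  prosodyDim = 64,
): number[] {
  const pad = (v: number[] | null, dim: number) => v ?? new Array(dim).fill(0);
  const bio = biometrics
    ? [biometrics.steps, biometrics.light, biometrics.hrv, biometrics.sleep]
    : [0, 0, 0, 0];
  return [...pad(prosody, prosodyDim), ...pad(emotion, 192), ...bio];
}
```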
On-Device Storage
Database: ObjectBox 4.0
Core: C++ with memory-mapped files (FlatBuffers)
Vector index: HNSW (Hierarchical Navigable Small World)
Binary size: ~3 MB
Encryption: AES-256 at rest
Search latency: <5ms over millions of embeddings
Memory model: 5-year tiered consolidation
├─ Hot: last 30 days (full embeddings)
├─ Warm: 30-365 days (compressed)
└─ Cold: 1-5 years (summary vectors)
On-Device Learning
LoRA Fine-Tuning
Method: Low-Rank Adaptation
Target layers: q_proj, v_proj (attention)
Trainable params: ~2-5M (vs 1B+ frozen base)
Adapter size: 10-50 MB
Trigger: Device charging + idle (overnight)
iOS engine: MLX Swift (unified memory, zero CPU↔GPU transfer)
Android engine: ExecuTorch native training loop
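The ~2-5M trainable-parameter figure can be sanity-checked with a back-of-envelope calculation, assuming standard Llama 3.2 1B shapes (hidden size 2048, 16 layers, grouped-query attention giving v_proj a 512-wide output). The rank values are illustrative.

```typescript
// Back-of-envelope LoRA parameter count for the Base tier.
// A rank-r adapter on a d_in x d_out projection adds r * (d_in + d_out) params.
function loraParams(rank: number, layers = 16, hidden = 2048, vOut = 512): number {
  const qProj = rank * (hidden + hidden); // q_proj: 2048 -> 2048
  const vProj = rank * (hidden + vOut);   // v_proj: 2048 -> 512 (GQA)
  return layers * (qProj + vProj);
}
// loraParams(16) ≈ 1.7M and loraParams(32) ≈ 3.4M — inside the ~2-5M range above.
```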
Feedback Sessions
Weekly or monthly (user-configured). SANSARA initiates structured dialogue referencing specific past responses. User corrections converted to labeled training pairs for the next LoRA cycle. Passive learning from response quality signals: entry length, re-engagement rate, conversation flow depth.
Predictive Foresight
Algorithm: ARIMA (Autoregressive Integrated Moving Average)
Input: Historical emotional vectors from ObjectBox
Output: Mood trajectory forecast (72h window)
Detection: Cyclical dips, seasonal patterns, weekly stressors
Trigger: Preemptive intervention written to Blackboard
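The forecasting idea can be illustrated with a drastically simplified stand-in: a first-order autoregressive (AR(1)) forecast over a one-dimensional mood score. The shipping ARIMA model operates on emotional vectors and adds differencing and moving-average terms; none of that is shown here.

```typescript
// Simplified AR(1) stand-in for the ARIMA forecaster: fit a slope on
// lagged values by least squares, then roll the model forward `horizon` steps.
function ar1Forecast(series: number[], horizon: number): number[] {
  const x = series.slice(0, -1); // lagged values
  const y = series.slice(1);     // next values
  const mx = x.reduce((a, b) => a + b, 0) / x.length;
  const my = y.reduce((a, b) => a + b, 0) / y.length;
  let num = 0, den = 0;
  for (let i = 0; i < x.length; i++) {
    num += (x[i] - mx) * (y[i] - my);
    den += (x[i] - mx) ** 2;
  }
  const phi = den === 0 ? 0 : num / den; // AR coefficient
  const c = my - phi * mx;               // intercept
  const out: number[] = [];
  let prev = series[series.length - 1];
  for (let h = 0; h < horizon; h++) {
    prev = c + phi * prev; // one-step-ahead forecast, fed back in
    out.push(prev);
  }
  return out;
}
```

With daily samples, a 72h window corresponds to `horizon = 3` in this sketch.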
Privacy Architecture
network_permission: NONE
api_keys_in_binary: NONE
analytics_sdk: NONE
cloud_endpoints: NONE
telemetry_hooks: NONE
data_location: DEVICE ONLY
encryption: AES-256
bytes_transmitted: 0
accounts_required: 0
# Structurally incapable of transmitting data.
# Not by policy. By architecture.
Accessibility
| Feature | Implementation |
|---|---|
| Interface modes | 8 modes: Minimal → Maximum visual intensity |
| Typography | OpenDyslexic font, adjustable kerning / line-height |
| Sensory controls | Independent toggles: haptics, motion, soundscapes |
| Input | Voice-first journaling with prosody analysis |
| Focus | ADHD Focus Mode — structured attention |
| Progress | Stellar Alignments (cumulative) — no streak counters |
| Compliance | WCAG AAA on all text over shifting backgrounds |
| Glassmorphism | Constrained blur (4-6px), 1px text shadows |
Tablet Mode (Pro / Desktop)
| Capability | Implementation |
|---|---|
| Handwriting | On-device OCR → searchable text + preserved strokes |
| Drawing | Canvas capture → visual pattern analysis |
| Stylus | Pressure sensitivity, palm rejection, tilt detection |
| Devices | Apple Pencil (iPad), Samsung S Pen (Galaxy Tab / S25 Ultra) |
| Vision (Desktop) | 11B model processes drawings as visual input |
Stack Summary
Application: React Native 0.84 (bare, not Expo managed)
UI framework: React Native Skia (SkSL procedural shaders)
Animation: Reanimated + react-native-svg
Inference: ExecuTorch (Meta) + react-native-executorch
Models: Llama 3.2 (1B / 3B / 11B), 4-bit quantized .pte
Voice: whisper.rn (ggml-tiny.en)
Face: Google ML Kit + MobileNetV3 TFLite
Storage: ObjectBox 4.0 (HNSW vector index)
Encryption: AES-256 on-device
Learning: LoRA via MLX Swift (iOS) / ExecuTorch (Android)
Forecasting: ARIMA on local emotional vectors
State: Zustand + Blackboard JSON
Haptics: Platform LRA APIs (affective patterns)
Build: Fastlane (iOS + Android)
Testing: Jest (TDD for Blackboard + inference)
Payment: Stripe / Paddle (Web2App, external checkout)
Web: Next.js 16 on Vercel
Questions: info@sansara.app