meetsansara.com/tech

SANSARA Technical Specification

v0.7 · April 2026 · Beta

01

System Overview

SANSARA is an offline-first AI wellness companion built on React Native 0.85. All inference, storage, and personalization run on-device. The application ships with no network permissions, no API keys, and no analytics SDK. The binary is self-contained.

┌─────────────────────────────────────────────────────────────────┐
│                        USER INPUT                               │
│   Text   │   Voice (Whisper)   │   Face (MLKit)   │  Biometrics │
└────┬─────┴──────────┬──────────┴────────┬──────────┴────────┬───┘
     └────────────────┴─────────┬─────────┴───────────────────┘
                                │
                    ┌───────────▼───────────┐
                    │    SENSORY FUSION     │
                    │  Unified context vec  │
                    └───────────┬───────────┘
                                │
                    ┌───────────▼───────────┐
                    │    MATRIX ROUTER      │
                    │   <100M param clf     │
                    │   Selects 2-4 agents  │
                    └───────────┬───────────┘
                                │
                    ┌───────────▼───────────┐
                    │     BLACKBOARD        │
                    │  Shared mutable JSON  │
                    └───────────┬───────────┘
                                │
                    ┌───────────▼───────────┐
                    │   CONSENSUS LOOP      │
                    │  Propose → Critique   │
                    │  → Refine → Converge  │
                    └───────────┬───────────┘
                                │
                    ┌───────────▼───────────┐
                    │      RESPONSE         │
                    └───────────────────────┘
02

Model Tiers

All tiers run the same nine-agent architecture, memory system, and personalization pipeline. The only variable is the base language model, selected according to device hardware capability.

Tier      Model        Params  Quant        Disk   Hardware
Standard  Gemma 4 E2B  2B      Q4_K_M GGUF  ~3 GB  iPhone 12+, Pixel 6+, 4GB RAM+
Pro       Gemma 4 E4B  4B      Q4_K_M GGUF  ~5 GB  iPad, Galaxy Tab, S25 Ultra, 6GB RAM+
Desktop   Gemma 4 E4B  4B      Q4_K_M GGUF  ~5 GB  High-RAM devices, 8GB RAM+
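The tier assignment above can be sketched as a simple capability check. This is an illustrative sketch only: the function name, the RAM thresholds as the sole criterion, and the model filenames are assumptions derived from the table, not the shipped logic.

```typescript
// Hypothetical tier selection based on the hardware column of the table above.
type Tier = "standard" | "pro" | "desktop";

function selectTier(ramGb: number): Tier {
  if (ramGb >= 8) return "desktop"; // High-RAM devices → Gemma 4 E4B
  if (ramGb >= 6) return "pro";     // Tablets / S25 Ultra → Gemma 4 E4B
  return "standard";                // iPhone 12+, Pixel 6+ → Gemma 4 E2B
}

// Model assignment per tier, as specified in the table (filenames illustrative).
const MODEL_BY_TIER: Record<Tier, string> = {
  standard: "gemma-4-e2b-q4_k_m.gguf",
  pro: "gemma-4-e4b-q4_k_m.gguf",
  desktop: "gemma-4-e4b-q4_k_m.gguf",
};
```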
03

Inference Pipeline

iOS Pipeline

Model format:     Q4_K_M GGUF
Runtime:          llama.rn (React Native wrapper for llama.cpp)
GPU backend:      Metal (Apple Silicon)
Model loading:    llamaRn.initLlama({ model: path, ... })
Inference:        Metal-accelerated matrix ops

Android Pipeline

Model format:     .pte (ExecuTorch ahead-of-time compiled)
Runtime:          react-native-executorch + ExecuTorch
GPU backend:      QNN → Qualcomm AI Engine / Adreno GPU
Fallback:         XNNPACK (CPU)

Runtime Footprint

Model on disk (Standard):  ~3 GB  (Gemma 4 E2B Q4_K_M)
Model on disk (Pro):       ~5 GB  (Gemma 4 E4B Q4_K_M)
KV cache overhead:         ~200-500 MB (context-dependent)
Whisper ASR (tiny.en):     ~75 MB
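The context-dependent KV cache figure follows from the standard transformer cache formula: two tensors (K and V) per layer, sized by context length, KV-head count, and head dimension. The sketch below shows the arithmetic; every architecture value in the example is an illustrative placeholder, not a published Gemma 4 figure.

```typescript
// KV cache size = 2 (K and V) × layers × context length × KV heads × head dim × bytes/element.
function kvCacheBytes(
  nLayers: number,
  nCtx: number,
  nKvHeads: number,
  headDim: number,
  bytesPerElement: number,
): number {
  return 2 * nLayers * nCtx * nKvHeads * headDim * bytesPerElement;
}

// Hypothetical example: 30 layers, 2048-token context, 8 KV heads,
// head dim 256, fp16 cache (2 bytes) → ~503 MB, in line with the
// 200-500 MB context-dependent range quoted above.
const example = kvCacheBytes(30, 2048, 8, 256, 2);
```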
04

Multi-Agent Architecture

Blackboard Pattern

Agents do not communicate directly. They read from and write to a shared mutable state object (the Blackboard). This decoupled architecture avoids the sequential context degradation of linear agent chains.

Blackboard = {
  raw_input:        string,
  prosody_vector:   float[],
  emotion_vector:   float[192],
  biometric_state:  { steps, light, hrv, sleep },
  history_embeds:   float[][],     // HNSW nearest neighbors
  agent_proposals:  Proposal[],
  agent_critiques:  Critique[],
  consensus_state:  "pending" | "converged",
  final_response:   string | null
}
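The schema above maps directly onto a TypeScript type. This is a minimal sketch: field names come from the spec, while the `Proposal` and `Critique` shapes and the factory function are assumptions for illustration.

```typescript
// TypeScript mirror of the Blackboard schema above.
interface Proposal { agentId: string; text: string }                 // assumed shape
interface Critique { agentId: string; targetAgentId: string; text: string } // assumed shape

interface Blackboard {
  raw_input: string;
  prosody_vector: number[];
  emotion_vector: number[];           // 192 dimensions
  biometric_state: { steps: number; light: number; hrv: number; sleep: number };
  history_embeds: number[][];         // HNSW nearest neighbors
  agent_proposals: Proposal[];
  agent_critiques: Critique[];
  consensus_state: "pending" | "converged";
  final_response: string | null;
}

// Hypothetical factory producing the initial state for one turn.
function createBlackboard(rawInput: string): Blackboard {
  return {
    raw_input: rawInput,
    prosody_vector: [],
    emotion_vector: new Array(192).fill(0),
    biometric_state: { steps: 0, light: 0, hrv: 0, sleep: 0 },
    history_embeds: [],
    agent_proposals: [],
    agent_critiques: [],
    consensus_state: "pending",
    final_response: null,
  };
}
```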

Matrix Router

A sub-100M-parameter classification model analyzes the fused input vector and selects 2-4 agents. Each inference pass runs a distinct system prompt against the shared base model.
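The selection step can be sketched as a top-k cut over per-agent scores, clamped to the 2-4 range. The scoring itself stands in for the real classifier; the confidence floor of 0.5 and the function shape are assumptions.

```typescript
// The nine agent IDs from the spec.
const AGENT_IDS = [
  "ECHO_EMPATH", "PATTERN_ECHO", "BIO_MIRROR",
  "INSIGHT_WEAVER", "TREND_SAGE", "ENV_THINKER",
  "ACTION_NURTURER", "GROWTH_GUIDE", "HABIT_ELEVATOR",
] as const;

// Hypothetical selection: rank by classifier score, keep agents above a
// confidence floor, clamped to the spec's 2-4 active agents.
function selectAgents(scores: Record<string, number>, min = 2, max = 4): string[] {
  const ranked = AGENT_IDS
    .map((id) => ({ id: id as string, score: scores[id] ?? 0 }))
    .sort((a, b) => b.score - a.score);
  const aboveFloor = ranked.filter((a) => a.score >= 0.5).length;
  const k = Math.min(max, Math.max(min, aboveFloor));
  return ranked.slice(0, k).map((a) => a.id);
}
```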

Consensus Loop

loop:
  1. PROPOSE   — Each active agent writes a response proposal
  2. CRITIQUE  — Each agent reads peer proposals, posts critiques
  3. REFINE    — Each agent integrates feedback, rewrites
  4. MEASURE   — Control Shell computes semantic similarity
  5. IF similarity < threshold → GOTO 1
  6. CONVERGE  — Emit unified response
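The loop above can be sketched with pluggable stages. `propose`, `critique`, `refine`, and `similarity` stand in for real agent inference passes and the Control Shell's semantic-similarity measure; the threshold default and the round cap are assumptions.

```typescript
// Minimal consensus-loop sketch mirroring steps 1-6 above.
function runConsensus(
  propose: () => string[],                                   // 1. PROPOSE
  critique: (proposals: string[]) => string[],               // 2. CRITIQUE
  refine: (proposals: string[], critiques: string[]) => string[], // 3. REFINE
  similarity: (proposals: string[]) => number,               // 4. MEASURE
  threshold = 0.9,
  maxRounds = 5,                                             // assumed safety cap
): { response: string; rounds: number } {
  let proposals = propose();
  let rounds = 1;
  // 5. Loop until proposals are semantically close enough.
  while (similarity(proposals) < threshold && rounds < maxRounds) {
    const critiques = critique(proposals);
    proposals = refine(proposals, critiques);
    rounds++;
  }
  return { response: proposals[0], rounds };                 // 6. CONVERGE
}
```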
05

The Nine Agents

ID               Layer                 Input Signal                       Behavior
ECHO_EMPATH      Emotional Mirroring   Text sentiment + vocal prosody     Validates immediate emotional state
PATTERN_ECHO     Emotional Mirroring   Current input + historical memory  Surfaces recurring emotional themes
BIO_MIRROR       Emotional Mirroring   Steps, light, HRV, sleep           Reflects body-mood connection
INSIGHT_WEAVER   Cognitive Reframing   Immediate text input               Restructures negatives into reflective prompts
TREND_SAGE       Cognitive Reframing   Historical vector embeddings       Reveals macro-trends in micro-failures
ENV_THINKER      Cognitive Reframing   Ambient light, location, time      Context-aware environmental reframing
ACTION_NURTURER  Compassionate Action  Acute stress level                 Immediate grounding exercises
GROWTH_GUIDE     Compassionate Action  Historical success database        Recommends proven personal strategies
HABIT_ELEVATOR   Compassionate Action  Biometric telemetry                Links behavioral change to physical metrics

All nine agents share a single quantized base model. The Matrix Router time-slices inference passes with distinct system prompts; nine separate models are never loaded.
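Time-slicing can be sketched as sequential passes through one shared model, where only the system prompt changes per agent. The prompt texts and the `generate` callback are illustrative stand-ins, not the shipped prompts or inference API.

```typescript
// Illustrative per-agent system prompts (a subset; not the shipped prompts).
const SYSTEM_PROMPTS: Record<string, string> = {
  ECHO_EMPATH: "Mirror and validate the user's immediate emotional state.",
  INSIGHT_WEAVER: "Restructure negative statements into reflective prompts.",
  ACTION_NURTURER: "Offer one immediate grounding exercise.",
};

// One pass per selected agent through the same loaded model:
// `generate` stands in for a single inference call on the shared base.
function runAgents(
  agentIds: string[],
  userInput: string,
  generate: (systemPrompt: string, input: string) => string,
): { agentId: string; text: string }[] {
  return agentIds.map((id) => ({
    agentId: id,
    text: generate(SYSTEM_PROMPTS[id] ?? "", userInput),
  }));
}
```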

06

Multimodal Fusion Layer

Modality    Implementation                RAM         Latency      Output
Voice       whisper.rn (ggml-tiny.en)     ~75 MB      <2s / chunk  Text + temporal prosody array
Face        Google ML Kit Face Detection  ~5 MB       Real-time    Emotion vector (opt-in, per-session)
Biometrics  HealthKit / Health Connect    Negligible  Passive      { steps, heart_rate, sleep }

All modalities are opt-in per session. Raw pixel data is purged from RAM immediately after embedding extraction; only the output vector persists on-device.
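The fusion into the router's unified context vector can be sketched as fixed-width concatenation, zero-padding any modality the user has not opted into this session. All dimensions here are illustrative assumptions (only the 192-dim emotion vector appears in the spec).

```typescript
// Illustrative per-modality widths; only `emotion: 192` comes from the spec.
const DIMS = { text: 256, prosody: 32, emotion: 192, biometrics: 4 };

// Zero-fill a slot for a missing (non-opted-in) modality.
function pad(v: number[] | undefined, dim: number): number[] {
  const out = new Array(dim).fill(0);
  (v ?? []).slice(0, dim).forEach((x, i) => (out[i] = x));
  return out;
}

// Fixed-width concatenation into the single context vector.
function fuseModalities(input: {
  text: number[];
  prosody?: number[];
  emotion?: number[];      // absent unless face analysis is enabled this session
  biometrics?: number[];
}): number[] {
  return [
    ...pad(input.text, DIMS.text),
    ...pad(input.prosody, DIMS.prosody),
    ...pad(input.emotion, DIMS.emotion),
    ...pad(input.biometrics, DIMS.biometrics),
  ];
}
```

Fixed-width output keeps the downstream classifier's input shape constant regardless of which modalities were present.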

07

On-Device Storage

Journal store:   expo-sqlite (SQLite on-device)
Vector index:    In-process IVF clustering (JavaScript)
Index cap:       2,000 entries (~5.5 years at one entry per day)
Encryption:      AES-256 at rest (Android); iOS encryption in development
Search:          Nearest-neighbor semantic retrieval
Data location:   Device only — structurally incapable of leaving
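The semantic-retrieval step reduces to a nearest-neighbor search over stored embeddings. The sketch below shows only the distance math with a brute-force scan; the shipped index accelerates this with IVF clustering, and the entry shape is assumed.

```typescript
// Cosine similarity between two embedding vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1);
}

// Brute-force k-nearest-neighbor scan (illustrative; IVF narrows the scan in practice).
function nearest(
  query: number[],
  index: { id: number; embed: number[] }[],
  k = 5,
): { id: number; score: number }[] {
  return index
    .map((e) => ({ id: e.id, score: cosine(query, e.embed) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}
```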
08

On-Device Learning

LoRA Fine-Tuning

Method:          Low-Rank Adaptation
Target layers:   q_proj, v_proj (attention)
Trainable params: ~2-5M (vs billions of frozen base params)
Adapter size:    10-50 MB
Trigger:         Device charging + idle (overnight)
Android engine:  ExecuTorch native training loop  ← available now
iOS engine:      In development
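The ~2-5M trainable-parameter figure follows from LoRA's structure: each adapted weight matrix gains a low-rank pair A (d_in × r) and B (r × d_out). The layer count, hidden size, and rank in the example are illustrative placeholders, not published Gemma 4 figures.

```typescript
// LoRA trainable parameters: per adapted matrix, A contributes d_in × r and
// B contributes r × d_out. Square attention projections assumed (d_in = d_out).
function loraParams(
  nLayers: number,
  hidden: number,
  rank: number,
  matricesPerLayer: number,  // q_proj + v_proj → 2
): number {
  const perMatrix = rank * hidden + rank * hidden; // A + B
  return nLayers * matricesPerLayer * perMatrix;
}

// Hypothetical example: 30 layers, hidden size 2048, rank 16, q_proj + v_proj
// → ~3.9M trainable parameters, inside the stated 2-5M range.
const trainable = loraParams(30, 2048, 16, 2);
```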

Feedback Sessions

Held weekly or monthly (user-configured). SANSARA initiates a structured dialogue referencing specific past responses, and user corrections are converted to labeled training pairs for the next LoRA cycle. Passive learning also draws on response-quality signals: entry length, re-engagement rate, and conversation flow depth.
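The correction-to-training-pair step can be sketched as below. The field names (`prompt`, `rejected`, `chosen`) are a common preference-pair convention, assumed here; they are not the shipped schema.

```typescript
// Hypothetical labeled pair consumed by the next LoRA cycle.
interface TrainingPair {
  prompt: string;    // the original user input
  rejected: string;  // SANSARA's past response the user corrected
  chosen: string;    // the user's preferred phrasing
}

function toTrainingPair(
  pastInput: string,
  pastResponse: string,
  userCorrection: string,
): TrainingPair {
  return { prompt: pastInput, rejected: pastResponse, chosen: userCorrection };
}
```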

Predictive Foresight

Algorithm:       Trend analysis on rolling emotional vectors
Input:           Historical mood and sentiment scores (local SQLite)
Output:          Pattern flags written to Blackboard context
Detection:       Recurring dips, weekly stressors, mood cycles
Trigger:         Preemptive signal surfaced to active agents
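One way to sketch the trend analysis is a rolling-mean comparison: flag a dip when the mean of recent scores falls well below the longer-window baseline. The window sizes and threshold are illustrative assumptions, not the shipped algorithm.

```typescript
function mean(xs: number[]): number {
  return xs.reduce((a, b) => a + b, 0) / (xs.length || 1);
}

// Flag a recurring dip when the short-window mood mean drops more than
// `delta` below the long-window baseline. All parameters are illustrative.
function detectDip(scores: number[], short = 3, long = 14, delta = 0.15): boolean {
  if (scores.length < long) return false;     // not enough history yet
  const recent = mean(scores.slice(-short));
  const baseline = mean(scores.slice(-long));
  return baseline - recent > delta;
}
```

A positive flag would be written to the Blackboard so the active agents can surface it preemptively.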
09

Privacy Architecture

network_permission:     NONE
api_keys_in_binary:     NONE
analytics_sdk:          NONE
cloud_endpoints:        NONE
telemetry_hooks:        NONE

data_location:          DEVICE ONLY
encryption_android:     AES-256 at rest
encryption_ios:         In development
bytes_transmitted:      0
accounts_required:      0

# Structurally incapable of transmitting data.
# Not by policy. By architecture.
10

Accessibility

Feature           Implementation
Interface modes   8 modes: Minimal → Maximum visual intensity
Typography        OpenDyslexic font, adjustable kerning / line-height
Sensory controls  Independent toggles: haptics, motion, soundscapes
Input             Voice-first journaling with prosody analysis
Progress          Stellar Alignments (cumulative); no streak counters
Shame mechanics   None; no streaks, no guilt counters
Glassmorphism     Constrained blur (4-6px), enforced text contrast
11

Tablet Mode (Pro / Desktop)

Capability   Implementation
Handwriting  On-device OCR → searchable text + preserved strokes (Android ML Kit + iOS Vision)
Drawing      Canvas capture → preserved alongside journal entry
Stylus       Palm rejection + stylus-only mode; pressure sensitivity in development
Devices      Apple Pencil (iPad), Samsung S Pen (Galaxy Tab / S25 Ultra)
12

Stack Summary

Application:     React Native 0.85 (bare workflow)
UI framework:    React Native Skia (SkSL procedural shaders)
Animation:       Reanimated + react-native-svg
Inference iOS:   llama.rn + GGUF + Metal GPU
Inference Droid: react-native-executorch + ExecuTorch
Models:          Gemma 4 E2B (Standard) / Gemma 4 E4B (Pro+), Q4_K_M GGUF
Voice:           whisper.rn (ggml-tiny.en, ~75 MB)
Face:            Google ML Kit Face Detection
Storage:         expo-sqlite (SQLite) + in-process vector index
Encryption:      AES-256 (Android); iOS in development
Learning:        LoRA via ExecuTorch (Android); iOS in development
Forecasting:     Trend analysis on local emotional history
State:           Zustand + Blackboard JSON
Haptics:         Platform LRA APIs (affective patterns)
Testing:         Jest + Firebase Test Lab (Android CI)
Payment:         Stripe (web) / RevenueCat (in-app)
Web:             Next.js 16 on Vercel

Questions: info@sansara.app

SANSARA