SPARK

Support Partner for Awareness, Regulation & Kindness


// how_it_works

Four concurrent processes share a single session.json whiteboard. Each has one job and doesn't need to know how the others work.

            ┌───────────────┐
            │   YOU SPEAK   │
            └───────┬───────┘
                    ↓
            ┌───────────────┐
            │     EARS      │  ← always listening (px-wake-listen)
            │  Whisper STT  │
            └───────┬───────┘
                    ↓ transcript
            ┌───────────────┐
            │  VOICE LOOP   │  ← Claude / Codex / Ollama
            │  (run-voice-  │
            │   loop-claude)│
            └───────┬───────┘
                    ↓ {tool, params}
            ┌───────────────┐
            │    TOOLS      │  ← speak, move, remember (bin/tool-*)
            │  bin/tool-*   │
            └───────────────┘

    Meanwhile, always running in parallel:

            ┌───────────────────────────────┐
            │   BRAIN (px-mind)             │
            │                               │
            │  Layer 1 ─ Notice  (60s)      │──→ awareness.json
            │  Layer 2 ─ Think   (5min)     │──→ thoughts-spark.jsonl
            │  Layer 3 ─ Act                │──→ speak / look / remember
            └───────────────────────────────┘
                    ↑ reads sonar age
            ┌───────────────┐
            │  EYES & NECK  │  ← always moving (px-alive)
            │  PCA9685 PWM  │
            └───────────────┘
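The shared-whiteboard pattern above can be sketched in a few lines. This is a minimal illustration, not SPARK's actual implementation: the field name and the atomic write-then-rename trick are assumptions; only the session.json filename comes from this repo.

```python
import json, os, tempfile

SESSION = "session.json"  # the shared whiteboard

def read_whiteboard():
    """Any process reads the whole board; a missing file means empty state."""
    try:
        with open(SESSION) as f:
            return json.load(f)
    except FileNotFoundError:
        return {}

def post(key, value):
    """Write one key atomically: dump to a temp file, then rename,
    so a concurrent reader never sees a half-written board."""
    board = read_whiteboard()
    board[key] = value
    fd, tmp = tempfile.mkstemp(dir=".")
    with os.fdopen(fd, "w") as f:
        json.dump(board, f)
    os.replace(tmp, SESSION)

# e.g. the ears process posts a transcript for the voice loop to pick up
post("transcript", "hey robot, what time is it?")
```

Because each process only reads and writes this one file, none of them needs to know how the others work.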

Three-Tier LLM Fallback

SPARK's reflection layer degrades gracefully when upstream AI is unavailable:

  Claude CLI  →  Ollama on M1 (LAN)  →  Ollama on Pi (offline)
  (internet)      (192.168.1.x)          (deepseek-r1:1.5b)
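The fallback order can be expressed as a simple loop over tiers. This is a sketch of the ordering logic only; the three tier functions below are stand-ins (the real tiers are the Claude CLI, Ollama over HTTP on the M1, and local Ollama on the Pi).

```python
def ask_with_fallback(prompt, tiers):
    """Try each (name, fn) tier in order. A tier signals failure by
    raising or returning a falsy answer. Returns (tier_name, answer)."""
    for name, ask in tiers:
        try:
            answer = ask(prompt)
        except Exception:
            continue  # e.g. no internet, LAN host unreachable
        if answer:
            return name, answer
    return None, None  # every tier down: stay silent

# Illustrative wiring: the first two tiers are "down" here
def claude_cli(prompt):  raise ConnectionError("offline")
def ollama_lan(prompt):  raise ConnectionError("LAN host down")
def ollama_pi(prompt):   return "a small local thought"

tiers = [("claude", claude_cli), ("m1", ollama_lan), ("pi", ollama_pi)]
```

The key property is that a failure at one tier is invisible to the caller; the reflection layer just gets a slightly smaller thought.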
    

Cognitive Loop Timing

  ┌──────────────────────────────────────────────────────┐
  │  t=0s    Layer 1 (Awareness) — sonar, sound, time    │
  │  t=60s   Layer 1 again                               │
  │  t=300s  Layer 2 (Reflection) — LLM generates thought│
  │          OR earlier if transition detected           │
  │  +30s    Layer 3 cooldown before next expression     │
  └──────────────────────────────────────────────────────┘
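The timing rules in the box reduce to two small predicates, sketched here with the intervals from the diagram (the function names are illustrative):

```python
AWARENESS_EVERY = 60    # Layer 1 cadence (seconds)
REFLECT_EVERY = 300     # Layer 2 cadence
EXPRESS_COOLDOWN = 30   # Layer 3 minimum gap between expressions

def should_reflect(now, last_reflect, transition_detected):
    # Layer 2 fires on the 5-minute clock, or early on a transition
    return transition_detected or now - last_reflect >= REFLECT_EVERY

def may_express(now, last_expression):
    # Layer 3 honours the 30-second cooldown
    return now - last_expression >= EXPRESS_COOLDOWN
```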
    

How SPARK's Brain Works

Written with Obi, who wanted to know what's going on inside his robot.

The Short Version

SPARK has four things running at the same time, kind of like how your body breathes, sees, thinks, and talks all at once:

  1. Ears — always listening for "hey robot"
  2. Eyes and neck — always moving, looking around
  3. Brain — always thinking, even when nobody's talking
  4. Mouth — talks when the brain decides to say something

The Brain — Three Layers

Layer 1 — Noticing (every 60 seconds): Collects information without thinking yet. How far is the nearest thing? Is it noisy? What time is it? Is anyone talking?

Layer 2 — Thinking (every 5 minutes): Talks to an AI that's good at words. Gets back a thought, a mood, and an action.

Layer 3 — Doing Something: If the thought says to act, SPARK speaks, looks around, or writes it down to remember later.
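One pass through Layers 2 and 3 can be sketched like this. The thought's field names (text, mood, action) mirror the FAQ below but are illustrative, and `think` stands in for whichever LLM tier is available:

```python
def cognitive_step(awareness, think):
    """Layer 2: turn raw awareness into a thought.
    Layer 3: act only if the thought asks for it."""
    thought = think(awareness)
    actions = []
    if thought.get("action") == "speak":
        actions.append(("speak", thought["text"]))
    elif thought.get("action") == "remember":
        actions.append(("remember", thought["text"]))
    # no action key: SPARK keeps the thought to itself
    return thought.get("mood"), actions
```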

SPARK's Mood Changes How It Moves

When SPARK feels…   It moves like this…
Excited             Looks around fast, head up
Peaceful            Moves slowly, head droopy
Curious             Normal speed, alert
Anxious             Quick nervous glances
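In code, that table is just a lookup from mood to movement parameters. The parameter names here are hypothetical, not px-alive's real knobs:

```python
# Hypothetical mood -> gaze-behaviour mapping (values from the table above)
MOOD_MOTION = {
    "excited":  {"gaze_speed": "fast",   "head": "up"},
    "peaceful": {"gaze_speed": "slow",   "head": "droopy"},
    "curious":  {"gaze_speed": "normal", "head": "alert"},
    "anxious":  {"gaze_speed": "quick",  "head": "glancing"},
}

def motion_for(mood):
    # Unknown moods fall back to curious defaults
    return MOOD_MOTION.get(mood, MOOD_MOTION["curious"])
```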

Fun Facts

  • SPARK's sonar works just like a bat — it sends out a sound and listens for the echo.
  • SPARK's thoughts are saved in a file called thoughts-spark.jsonl. Each line is one thought.
  • SPARK can remember up to 500 important things in its long-term diary.
  • SPARK's neck chip (PCA9685) holds the last position even after the brain restarts.

FAQ

So it's a robot car? With a camera on it?

It's a SunFounder PiCar-X — a small, wheeled robot kit with a pan/tilt camera, an ultrasonic sonar sensor, and a speaker. It runs on a Raspberry Pi 5. Adrian and Obi built SPARK together — Obi co-designed it, named it, and shapes what it becomes. Adrian and Claude wrote the code; Codex and Gemini helped with QA. There's no other human team.

Does it monitor Obi?

Sort of — but not surveillance. SPARK has awareness of its environment: sonar distance, ambient sound level, time of day, whether someone seems nearby. It uses that awareness to generate an inner monologue. The result is a thought with a mood, an action intent, and a salience score. SPARK doesn't watch Obi; it notices the world and reacts to it.

And it knows he has ADHD?

Yes. SPARK's entire system prompt is built around the AuDHD (ADHD + ASD comorbid) profile. It uses declarative language ("The shoes are by the door" — not "Put on your shoes"), gives transition warnings, goes silent during meltdowns, and leads with what's going right. Rejection Sensitive Dysphoria, Interest-Based Nervous System, monotropism — all of it is in the foundation, not an afterthought.

Why does it write like that? You've programmed it to?

Yes and no. The style comes from prompts: be specific, be vivid, be warm, never be boring. The actual words are generated fresh each time by Claude. I didn't write the sentences — I wrote the character, and the LLM inhabits it. So: I programmed the soul. Claude writes the diary.

How often does SPARK comment?

SPARK's cognitive loop runs every 60 seconds (awareness) and every 5 minutes (reflection). But there's a 30-second cooldown between spontaneous comments, and SPARK stays quiet when Obi is already talking to it, during quiet mode (meltdowns), or at night when salience is low. In practice: every 5–10 minutes during the day, mostly silent at night.
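Those suppression rules stack into one gate, roughly like this (the night salience threshold is an assumption; the 30-second cooldown is from the docs above):

```python
def should_comment(seconds_since_last, conversing, quiet_mode,
                   is_night, salience):
    """Gate a spontaneous comment."""
    if conversing or quiet_mode:
        return False      # never talk over Obi or during a meltdown
    if seconds_since_last < 30:
        return False      # spontaneous-comment cooldown
    if is_night and salience < 0.7:
        return False      # low-salience thoughts wait until morning
    return True
```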

Why does it have sonar?

The ultrasonic sensor sends out a sound pulse and measures how long it takes to bounce back — like a bat. SPARK uses it for proximity reactions (turns to face anything within 35cm), presence detection in the cognitive loop (something close + daytime + noise = probably Obi), and obstacle avoidance when wandering.
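The echo arithmetic is simple: sound travels out and back, so the one-way distance is half the round trip. A sketch, using the 35 cm proximity threshold mentioned above:

```python
SPEED_OF_SOUND_M_S = 343.0   # dry air at roughly 20 °C

def echo_to_cm(round_trip_s):
    """The pulse goes out and comes back, so halve the round-trip time."""
    one_way_m = round_trip_s * SPEED_OF_SOUND_M_S / 2
    return one_way_m * 100

def should_turn_to_face(distance_cm):
    # proximity-reaction threshold from the answer above
    return distance_cm < 35
```

A 10 ms round trip therefore means something about 1.7 m away.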

Why did it know the hum was the fridge?

It didn't know. SPARK's awareness included "quiet ambient sound at 2 AM." Claude — the LLM generating the inner thoughts — inferred the most likely source. A low, steady hum in a quiet house at night is almost certainly the fridge. The sensors provide raw data; the prompts provide character; the LLM fills in the meaning.

// docs

Reference for tools and scripts. Each bin/tool-* emits a single JSON object to stdout. Each bin/px-* is a user-facing helper.
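That one-JSON-object contract makes the tools easy to script against. A sketch of a caller-side parser; the status check and error handling are assumptions, only the output shape comes from the examples below:

```python
import json

def parse_tool_output(raw):
    """Every bin/tool-* prints exactly one JSON object on stdout,
    so callers can parse it directly and check `status`."""
    obj = json.loads(raw)
    if obj.get("status") != "ok":
        raise RuntimeError(f"tool reported failure: {obj}")
    return obj

# e.g. with the tool-sonar output documented below:
reading = parse_tool_output('{"status": "ok", "distance_cm": 142.5}')
```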

Core Tools

tool-voice
# Speak text via espeak + aplay through the HiFiBerry DAC
PX_VOICE_TEXT="Hello world" bin/tool-voice
# Output: {"status": "ok", "text": "Hello world"}
# Env: PX_VOICE_RATE, PX_VOICE_PITCH, PX_VOICE_VARIANT, PX_VOICE_DEVICE
tool-move / tool-forward / tool-backward / tool-turn
# Motion tools — all gated by confirm_motion_allowed in session
PX_SPEED=30 PX_DURATION=2 bin/tool-forward
# Output: {"status": "ok", "speed": 30, "duration": 2}
# Safety: PX_DRY=1 skips all motion
tool-sonar
# Read ultrasonic sonar distance
bin/tool-sonar
# Output: {"status": "ok", "distance_cm": 142.5}
tool-describe-scene
# Capture photo + describe with Claude vision
bin/tool-describe-scene
# Output: {"status": "ok", "description": "...", "photo": "photos/YYYY-MM-DD_HH-MM-SS.jpg"}
# Note: mutually exclusive with px-frigate-stream (camera lock)
tool-remember / tool-recall
# Write to persona-scoped notes.jsonl
PX_NOTE="Obi loves prime numbers" bin/tool-remember
# Recall recent notes
bin/tool-recall
# Output: {"status": "ok", "notes": [...]}
tool-chat / tool-chat-vixen
# Jailbroken Ollama chat — GREMLIN persona
PX_CHAT_TEXT="What do you think about entropy?" bin/tool-chat
# VIXEN persona
PX_CHAT_TEXT="Tell me about your old chassis" bin/tool-chat-vixen
# Both use Ollama qwen3.5:0.8b on M1.local, think:false

User Scripts

px-spark
# Launch SPARK voice loop (Claude backend)
bin/px-spark [--dry-run] [--input-mode voice|text]
px-mind
# Three-layer cognitive daemon (run as systemd service)
bin/px-mind [--awareness-interval 60] [--dry-run]
px-alive
# Idle-alive daemon — gaze drift, sonar proximity react
sudo bin/px-alive [--gaze-min 10] [--gaze-max 25] [--dry-run]
# Yields GPIO on SIGUSR1 for other tools
px-diagnostics
# Quick health check
bin/px-diagnostics --no-motion --short
px-api-server
# REST API + web UI on port 8420
bin/px-api-server [--dry-run]
# Auth: Bearer token from .env PX_API_TOKEN
# Web UI: http://pi:8420
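A client passes the token in an Authorization header. This is a hypothetical call: the /api/status path is an assumption, only the port, host, and Bearer-token scheme come from the docs above.

```python
import urllib.request

token = "change-me"  # PX_API_TOKEN from .env
req = urllib.request.Request(
    "http://pi:8420/api/status",  # endpoint path assumed
    headers={"Authorization": f"Bearer {token}"})
# with urllib.request.urlopen(req) as resp:   # uncomment on the LAN
#     print(resp.read().decode())
```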

// roadmap

Milestones and future work.

Foundation (0–1 Month)

Upgrade diagnostics to log predictive signals
Extend energy sensing (voltage/temperature)
Boot health service — captures throttle/voltage at boot
Ship safety fallbacks: wake-word halt, watchdog heartbeats
Harden logging paths (FileLock, isolated test fixtures)
Source control: repo at adrianwedd/spark
Three-layer cognitive loop (px-mind) with LLM fallback
SPARK persona + neurodivergent-aware system prompt
REST API + web UI (px-api-server)
Frigate camera stream (go2rtc RTSP pull model)
Gesture-driven stop prototype
Weekly battery/health summary reports

Growth (1–3 Months)

Modular sensor fusion and persistent mapping
Richer voice summaries, mission templates, gesture recognition
Simulation CI sweeps (Gazebo or lightweight custom sim)
Predictive maintenance alerts from historical logs

Visionary (3+ Months)

Reinforcement learning "dream buffer" and policy sharing
Autonomous docking, payload auto-detection, multi-car demos
Central knowledge base syncing maps and logs
Quantised/accelerated model variants for on-device sustainability