Series 7 — Self-Hosted AI Stack on Raspberry Pi • Part 1 of 6

A Raspberry Pi 5 running Kokoro TTS, Whisper STT, Ollama LLM, ChromaDB, and a custom audio converter handles the entire WhatsApp AI agent stack without a single cloud API call. This article covers the service architecture, port planning, resource management, and startup order.

Services and Ports

┌─────────────────────────────────────────────────────────────────────────────┐
│                    RASPBERRY PI 5 (16GB RAM)                                │
│                                                                             │
│  Service               Port    Language    Process manager                  │
│  ─────────────────────────────────────────────────────────                 │
│  Kokoro TTS            9010    Python      nohup + disown                   │
│  Whisper STT           9011    Python      nohup + disown                   │
│  Audio Converter       9012    Python      nohup + disown                   │
│  Ollama LLM           11434    Go          systemd (built-in)               │
│  ChromaDB              8000    Python      nohup + disown                   │
│  Apache (PHP app)       443    PHP/Apache  systemd                          │
│  Nginx (reverse proxy)  80     Nginx       systemd                          │
└─────────────────────────────────────────────────────────────────────────────┘

Resource Constraints on Pi 5 (16GB)

Service                        RAM (steady)   RAM (peak)   CPU (idle)
─────────────────────────────────────────────────────────────────────
Kokoro TTS (af_heart model)    ~800MB         ~1.2GB       ~2%
Whisper STT (base model)       ~300MB         ~600MB       ~1%
Ollama (llama3.1:8b)           ~5.5GB         ~6.5GB       ~3%
ChromaDB                       ~200MB         ~500MB       ~1%
Audio Converter                ~50MB          ~200MB       ~0%
Apache + PHP                   ~400MB         ~800MB       ~5%

Total steady-state: ~7.25GB. Peak (all services under load simultaneously): ~9.8GB. That leaves about 6.2GB of headroom at peak on a 16GB Pi, which also has to cover the OS and Nginx.
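The totals above can be re-derived from the per-service figures. The following is a minimal sketch that sums the table's numbers; the names `RAM_MB`, `TOTAL_RAM_MB`, and `ram_budget` are illustrative, and treating 16GB as 16,000MB is a simplifying assumption (usable RAM on the board is somewhat lower).

```python
# Per-service RAM in MB as (steady, peak), copied from the table above.
RAM_MB = {
    "Kokoro TTS": (800, 1200),
    "Whisper STT": (300, 600),
    "Ollama": (5500, 6500),
    "ChromaDB": (200, 500),
    "Audio Converter": (50, 200),
    "Apache + PHP": (400, 800),
}

# Assumption: nominal 16GB treated as 16,000MB for a rough budget.
TOTAL_RAM_MB = 16_000


def ram_budget(services, total_mb):
    """Sum steady and peak usage and report what is left at peak."""
    steady = sum(s for s, _ in services.values())
    peak = sum(p for _, p in services.values())
    return {
        "steady_mb": steady,
        "peak_mb": peak,
        "headroom_at_peak_mb": total_mb - peak,
    }


if __name__ == "__main__":
    for key, value in ram_budget(RAM_MB, TOTAL_RAM_MB).items():
        print(f"{key}: {value}")
```

Keeping the figures in one place makes it easy to see what happens to the headroom if a larger model replaces one of the entries.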

Service Startup Order

Most services are independent of each other, but Apache/PHP depends on all of them, so start them in this order:

  1. ChromaDB — no dependencies
  2. Ollama — no dependencies (managed by systemd)
  3. Kokoro TTS — no dependencies
  4. Whisper STT — no dependencies
  5. Audio Converter — depends on FFmpeg being installed
  6. Apache/PHP — depends on all above (waits for health checks)

Create a startup script that checks each service's health endpoint before starting the next. A service that starts before its dependencies produces cryptic errors.
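As a rough sketch of the health-check gate described above, the following polls each backend in startup order and reports the first one that never becomes healthy. It only covers the waiting part, not launching the services. The URL paths are assumptions (the article does not name the health endpoints), and `STARTUP_ORDER`, `wait_until_healthy`, and `first_unhealthy` are hypothetical names; adjust the paths to whatever each service actually exposes.

```python
import http.client
import time
import urllib.request

# Local checks should not be routed through any configured HTTP proxy.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))

# Assumption: health URLs are placeholders, in the startup order listed above.
STARTUP_ORDER = [
    ("ChromaDB", "http://127.0.0.1:8000/api/v2/heartbeat"),
    ("Ollama", "http://127.0.0.1:11434/"),
    ("Kokoro TTS", "http://127.0.0.1:9010/health"),
    ("Whisper STT", "http://127.0.0.1:9011/health"),
    ("Audio Converter", "http://127.0.0.1:9012/health"),
]


def wait_until_healthy(url, timeout_s=60.0, interval_s=1.0):
    """Poll url until it answers with HTTP 2xx, or give up after timeout_s."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            with _OPENER.open(url, timeout=2) as resp:
                if 200 <= resp.status < 300:
                    return True
        except (OSError, http.client.HTTPException):
            pass  # not listening yet, or answering with an error status
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


def first_unhealthy(services, timeout_s=60.0, interval_s=1.0):
    """Return the name of the first service that fails its check, else None."""
    for name, url in services:
        if not wait_until_healthy(url, timeout_s, interval_s):
            return name
    return None
```

A startup script could call `first_unhealthy(STARTUP_ORDER)` just before starting Apache/PHP and abort with the returned name if it is not `None`, which turns a cryptic downstream error into a clear message about which backend is missing.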

What to Watch For

  • SD card failure — A Raspberry Pi running AI services writes to disk frequently (model caches, temp files), which wears out SD cards. Put the OS and services on an SSD attached via USB 3.0, not on the SD card.
  • Temperature throttling — The Pi 5 starts throttling at 80°C. Under sustained LLM load, CPU temperature can reach 70°C without active cooling, which leaves little margin before throttling. Add a heatsink and fan.
  • Port conflict detection — Before starting any service, check whether its port is already bound: ss -tlnp | grep :9010. A stale process from an earlier run that still holds the port will cause a silent startup failure.
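The port check can also be done from a Python pre-flight script. This is a minimal sketch that tries a TCP connection to each planned port on the loopback interface; `PORTS`, `port_in_use`, and `busy_ports` are hypothetical names, and it assumes the services listen on 127.0.0.1 or on all interfaces. Unlike `ss -tlnp`, it does not show which process holds the port.

```python
import socket

# Ports from the Services and Ports table for the nohup-managed services.
PORTS = {
    "Kokoro TTS": 9010,
    "Whisper STT": 9011,
    "Audio Converter": 9012,
    "ChromaDB": 8000,
}


def port_in_use(port, host="127.0.0.1", timeout_s=0.5):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout_s)
        # connect_ex returns 0 on success and an errno value otherwise.
        return sock.connect_ex((host, port)) == 0


def busy_ports(ports):
    """Return the subset of name -> port entries that are already taken."""
    return {name: port for name, port in ports.items() if port_in_use(port)}
```

Running `busy_ports(PORTS)` before launching anything gives a dictionary of conflicts; an empty result means the planned ports are free on the loopback interface.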