In-depth technical writing on AI, ML, distributed systems, and modern engineering.
Downloading voice notes from Meta's media endpoint, local Whisper transcription via HTTP microservice, language hint injection, and graceful…
The WAV-not-MP3 trap, the UTF-8 /u flag corruption bug in prepareText(), audio type classification, and keeping the model warm with a health…
TTS → WAV → OGG/OPUS via FFmpeg → Meta upload → send media_id → monitor delivery status. The silent failure trap: API returns 200 but delive…
Storing wa_message_id + transcript on created workspace items, WebhookContext globals for cross-cutting request state, and the media URL exp…
Kokoro always outputs WAV regardless of requested format. FFmpeg converts WAV → OGG/OPUS at 48kHz mono 48kbps. The exact command, bitrate ch…
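The conversion described above could be sketched in Python as follows. This is a minimal sketch, not the article's exact command: the function names and file paths are hypothetical, and it assumes an `ffmpeg` binary built with `libopus` is on the PATH. Only the target parameters (48 kHz, mono, 48 kbps) come from the summary.

```python
import subprocess


def build_ffmpeg_args(src_wav: str, dst_ogg: str) -> list[str]:
    """Build an ffmpeg argument list for WAV -> OGG/Opus (hypothetical helper)."""
    return [
        "ffmpeg",
        "-y",               # overwrite the output file if it already exists
        "-i", src_wav,      # input: the WAV produced by the TTS step
        "-c:a", "libopus",  # encode audio with Opus
        "-b:a", "48k",      # 48 kbps target bitrate
        "-ar", "48000",     # 48 kHz sample rate
        "-ac", "1",         # mono
        dst_ogg,
    ]


def convert_wav_to_ogg(src_wav: str, dst_ogg: str) -> None:
    """Run the conversion; raises CalledProcessError if ffmpeg exits non-zero."""
    subprocess.run(build_ffmpeg_args(src_wav, dst_ogg), check=True)
```

Keeping the argument list in its own function makes the command easy to log and inspect without invoking ffmpeg.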
The two-step upload → send flow, MIME type requirements (audio/ogg; codecs=opus), delivery status callbacks, and the silent success trap: AP…
WAV → OGG/OPUS: the full annotated command. Bitrate choices for voice (48k VBR Opus). Detecting actual format with the file command vs trust…
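As a rough illustration of checking the real container instead of trusting a file extension or a requested format, the sketch below inspects the leading bytes of a file. It is a simplified stand-in for what the `file` command does, not a replacement for it, and the function name is hypothetical.

```python
def sniff_audio_format(header: bytes) -> str:
    """Guess the audio container from the first bytes of a file (hypothetical helper)."""
    # WAV: RIFF container with a "WAVE" form type at byte offset 8
    if len(header) >= 12 and header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        return "wav"
    # Ogg (the container used for Opus voice notes): capture pattern "OggS"
    if header[:4] == b"OggS":
        return "ogg"
    # MP3: either an ID3v2 tag or an MPEG frame sync (11 set bits)
    if header[:3] == b"ID3":
        return "mp3"
    if len(header) >= 2 and header[0] == 0xFF and (header[1] & 0xE0) == 0xE0:
        return "mp3"
    return "unknown"


def sniff_file(path: str) -> str:
    """Read the first 12 bytes of a file and classify them."""
    with open(path, "rb") as fh:
        return sniff_audio_format(fh.read(12))
```

A check like this, run on the TTS output before conversion, would catch a file that is named `.mp3` but actually contains WAV data.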
Test each step independently: TTS, conversion, upload, send. Meta delivery status as ground truth. Decoding common error codes (131053 and o…
Stripping WhatsApp markdown (/u flag required), expanding legal abbreviations for natural pronunciation, converting bullet lists to spoken c…
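A minimal Python sketch of the markdown-stripping step is shown below. The `/u` flag mentioned above belongs to PCRE-style regexes; Python 3 `str` patterns are Unicode-aware by default, so no equivalent flag is needed here. The function name and the exact patterns are assumptions for illustration, covering only WhatsApp's `*bold*`, `_italic_`, `~strikethrough~`, and triple-backtick monospace markers.

```python
import re

# Each pattern captures the text between a pair of markers. The lookarounds
# keep markers inside words (e.g. snake_case) from being treated as formatting.
_MARKDOWN_PATTERNS = [
    re.compile(r"```(.+?)```", re.DOTALL),             # monospace
    re.compile(r"(?<!\w)\*(\S(?:.*?\S)?)\*(?!\w)"),    # bold
    re.compile(r"(?<!\w)_(\S(?:.*?\S)?)_(?!\w)"),      # italic
    re.compile(r"(?<!\w)~(\S(?:.*?\S)?)~(?!\w)"),      # strikethrough
]


def strip_whatsapp_markdown(text: str) -> str:
    """Remove WhatsApp formatting markers, keeping the enclosed text."""
    for pattern in _MARKDOWN_PATTERNS:
        text = pattern.sub(r"\1", text)
    return text
```

Because the patterns operate on code points instead of raw bytes, accented characters next to a marker pass through unchanged.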