Sending audio on WhatsApp requires five steps: TTS generation, WAV-to-OGG/OPUS conversion, upload to Meta, send the media_id, and monitor the delivery status callback. Each step can fail silently. This article walks the complete pipeline and identifies every failure point.
The Complete Audio Pipeline
function send_voice_response(string $recipientNumber, string $text): bool
{
    $tmpWav = null;
    $tmpOgg = null;
    try {
        // Step 1: Generate WAV with Kokoro TTS
        $wavBytes = generate_tts($text);
        $tmpWav = save_tmp_bytes($wavBytes, '.wav');

        // Step 2: Convert WAV → OGG/OPUS via local converter service
        $tmpOgg = convert_wav_to_ogg($tmpWav);

        // Step 3: Upload to Meta and get media_id
        $mediaId = upload_to_meta($tmpOgg, 'audio/ogg; codecs=opus');

        // Step 4: Send via WhatsApp
        $result = send_whatsapp_audio($recipientNumber, $mediaId);

        // Step 5: The silent failure trap: the API returns 200, but delivery may still fail.
        // Delivery status comes via a separate webhook status callback (not here).
        log_msg('Audio sent. Delivery status pending webhook callback.', [
            'media_id' => $mediaId,
            'recipient' => $recipientNumber,
        ]);
        return true;
    } catch (\Exception $e) {
        // Text fallback: never fail silently
        send_whatsapp_text($recipientNumber, $text);
        log_error('Audio pipeline failed, text fallback sent', ['error' => $e->getMessage()]);
        return false;
    } finally {
        // Always clean up temp files
        foreach ([$tmpWav, $tmpOgg] as $f) {
            if ($f && file_exists($f)) unlink($f);
        }
    }
}
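The article does not show upload_to_meta() or send_whatsapp_audio(). As a minimal sketch of what steps 3 and 4 might put on the wire, assuming the WhatsApp Cloud API media and messages endpoints, the two helpers below only build the request data. The function names are hypothetical and no HTTP call is made here.

```php
<?php
declare(strict_types=1);

/**
 * Multipart form fields for the media upload request (step 3).
 * Hypothetical helper; the article's upload_to_meta() is not shown.
 */
function build_media_upload_fields(
    string $oggPath,
    string $mimeType = 'audio/ogg; codecs=opus'
): array {
    return [
        'messaging_product' => 'whatsapp',
        'type'              => $mimeType,
        // CURLFile's second argument sets the Content-Type of the file part.
        'file'              => new \CURLFile($oggPath, $mimeType, basename($oggPath)),
    ];
}

/**
 * JSON body for the audio message request (step 4).
 * Hypothetical helper; the article's send_whatsapp_audio() is not shown.
 */
function build_audio_message_body(string $recipientNumber, string $mediaId): string
{
    return json_encode([
        'messaging_product' => 'whatsapp',
        'recipient_type'    => 'individual',
        'to'                => $recipientNumber,
        'type'              => 'audio',
        'audio'             => ['id' => $mediaId],
    ], JSON_THROW_ON_ERROR);
}
```

Under these assumptions, the upload fields would go in a multipart POST to the phone number's media endpoint, and the JSON body in a POST to its messages endpoint, both with a bearer token. The message ID returned by the send call is the value worth storing, since it is what the status callbacks refer to.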
WAV → OGG/OPUS Conversion
function convert_wav_to_ogg(string $wavPath): string
{
    $oggPath = sys_get_temp_dir() . '/' . uniqid('wa_ogg_') . '.ogg';

    // Use the local Python converter microservice.
    // The port in this URL must match the port your converter actually listens on.
    $ch = curl_init('http://localhost:9012/convert');
    curl_setopt_array($ch, [
        CURLOPT_POST => true,
        CURLOPT_POSTFIELDS => ['file' => new \CURLFile($wavPath, 'audio/wav', 'audio.wav')],
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_TIMEOUT => 20,
    ]);
    $oggBytes = curl_exec($ch);
    $code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);

    // curl_exec() returns false on transport errors; an empty body is also a failure
    if ($code !== 200 || !$oggBytes) {
        throw new \RuntimeException("OGG conversion failed: HTTP {$code}");
    }

    // A failed write would otherwise surface later as a confusing upload error
    if (file_put_contents($oggPath, $oggBytes) === false) {
        throw new \RuntimeException("Could not write OGG temp file: {$oggPath}");
    }
    return $oggPath;
}
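An HTTP 200 from the converter does not prove the body is Ogg/Opus audio; a misconfigured service could return an error page or a different codec. A cheap sanity check, sketched here as an assumption rather than part of the original pipeline, is to look for the Ogg capture pattern and the Opus identification header before writing the file. The function name is hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Heuristic check that a byte string looks like an Ogg container carrying Opus.
 * Hypothetical helper; it inspects headers only and does not decode audio.
 */
function looks_like_ogg_opus(string $bytes): bool
{
    // Every Ogg page begins with the capture pattern "OggS".
    if (strncmp($bytes, 'OggS', 4) !== 0) {
        return false;
    }
    // In Ogg Opus, the first packet starts with the magic signature "OpusHead",
    // so it should appear shortly after the first page header.
    return strpos(substr($bytes, 0, 128), 'OpusHead') !== false;
}
```

Called on $oggBytes just before file_put_contents(), a false result could throw the same RuntimeException as a non-200 response, which routes the request to the text fallback instead of uploading a file Meta may reject or fail to deliver.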
The Silent Failure Trap
Meta's send API returns 200 with a message ID even when delivery will ultimately fail. The actual delivery status (sent / delivered / read / failed) arrives via a separate status webhook callback with the message ID. Without processing those status callbacks, you have no way to know if the audio reached the user.
Handle status callbacks:
// In your webhook handler, check for statuses alongside messages
$statuses = $payload['entry'][0]['changes'][0]['value']['statuses'] ?? [];
foreach ($statuses as $status) {
    $msgId = $status['id'];
    $state = $status['status']; // 'sent', 'delivered', 'read', 'failed'
    $errors = $status['errors'] ?? [];
    log_delivery_status($msgId, $state, $errors, $pdo);
    if ($state === 'failed') {
        handle_delivery_failure($msgId, $errors, $pdo);
    }
}
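The handler above reads only the first entry and the first change. As a hedged sketch, assuming a single webhook payload may carry several entries or changes, a small helper can flatten every status into one list so that none is skipped. The function name and the returned array shape are illustrative, not part of the original code.

```php
<?php
declare(strict_types=1);

/**
 * Collect every status object from a decoded webhook payload.
 * Hypothetical helper; returns a list of ['id', 'status', 'errors'] arrays.
 */
function extract_statuses(array $payload): array
{
    $all = [];
    foreach ($payload['entry'] ?? [] as $entry) {
        foreach ($entry['changes'] ?? [] as $change) {
            foreach ($change['value']['statuses'] ?? [] as $status) {
                // Skip malformed items instead of raising undefined-index warnings
                if (!isset($status['id'], $status['status'])) {
                    continue;
                }
                $all[] = [
                    'id'     => (string) $status['id'],
                    'status' => (string) $status['status'],
                    'errors' => $status['errors'] ?? [],
                ];
            }
        }
    }
    return $all;
}
```

The loop in the handler could then iterate over extract_statuses($payload) and keep the same log_delivery_status() and handle_delivery_failure() calls. Payloads that contain only inbound messages yield an empty list.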
What to Watch For
- MIME type for upload: Meta requires audio/ogg; codecs=opus exactly. audio/ogg alone may be rejected, and audio/mpeg for OGG files will be rejected.
- Text fallback is not optional: Every audio send path must have a text fallback. A voice response that silently fails leaves the user with no reply at all.
- Temp file cleanup in finally: Even if the OGG conversion throws, the WAV temp file must be cleaned up. Use finally, not cleanup only on success.