WhatsApp, Stripe, GitHub — all webhook providers retry on timeout. Without deduplication, a slow handler gets the same message processed twice. The dedup check must be atomic, or a race condition between two simultaneous retries produces the same double-processing it was designed to prevent.
The Problem: Retry-Induced Duplicates
t=0s: Meta sends webhook (message ID: wamid.abc123)
t=18s: Handler is slow (LLM is generating) — Meta re-sends
t=18s: Two instances of the handler now both have wamid.abc123
t=19s: BOTH complete → message sent twice, lead created twice, reminder set twice
Atomic Deduplication with INSERT IGNORE
function claim_message(string $msgId, PDO $pdo): bool
{
// INSERT IGNORE: succeeds only once for any given msg_id
// If two requests race, exactly one will succeed (returns 1 affected row)
// The other gets 0 affected rows → returns false
$stmt = $pdo->prepare(
'INSERT IGNORE INTO wa_processed_messages (wa_msg_id, processed_at)
VALUES (?, NOW())'
);
$stmt->execute([$msgId]);
return $stmt->rowCount() === 1; // true = this request "won" the race
}
// In the webhook handler:
if (!claim_message($msgId, $pdo)) {
http_response_code(200);
echo json_encode(['status' => 'duplicate']);
exit; // Stop here — the other request is handling it
}
// Only the "winner" reaches this line
process_message($payload);
The key: INSERT IGNORE is atomic at the database level. The unique primary key on wa_msg_id guarantees exactly-once insertion. Two concurrent requests cannot both insert the same key.
TTL Cleanup for the Dedup Table
// Cron: delete dedup records older than 24 hours
// Meta doesn't retry webhooks this old
$pdo->exec('DELETE FROM wa_processed_messages WHERE processed_at < DATE_SUB(NOW(), INTERVAL 24 HOUR)');
// Also: partition by date if table grows large
// ALTER TABLE wa_processed_messages PARTITION BY RANGE (TO_DAYS(processed_at)) ...
Idempotency for Non-Webhook Contexts
The same pattern applies to any operation that must happen exactly once: Celery task retry, payment webhook, Stripe charge event. Use an idempotency key derived from the operation's unique identifier and store it in a processed_keys table with the same INSERT IGNORE + rowCount check.
What to Watch For
- SELECT before INSERT — wrong — A SELECT + INSERT sequence is not atomic. Two requests can both SELECT and find nothing, then both INSERT. Always use INSERT IGNORE + rowCount.
- Unique constraint, not PRIMARY KEY — If you use a UNIQUE INDEX instead of PRIMARY KEY on wa_msg_id, INSERT IGNORE still works atomically. But PRIMARY KEY is preferred for the dedup column.
- Dedup table vs Redis — Redis SET NX (set if not exists) works for deduplication too, and is faster. But it requires Redis to be available. The MySQL INSERT IGNORE approach is simpler and works with infrastructure you already have.