WhatsApp gives you 20 seconds to respond to a webhook. Ollama takes 15-30 seconds to generate a response. The math doesn't work synchronously. Celery + Redis is the solution: acknowledge the webhook immediately, generate the AI response asynchronously, then send it.
The Webhook Timeout Problem
Meta's WhatsApp webhook documentation is clear: if your endpoint does not return a 200 within 20 seconds, the webhook will be retried. Retries without deduplication mean double-processing. The solution is a two-phase architecture:
- Phase 1 (synchronous, <1s): verify the HMAC signature, deduplicate by wa_msg_id, enqueue a Celery task, return 200.
- Phase 2 (async, <60s): the Celery worker builds the system prompt, calls Ollama, and sends the response to WhatsApp.
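Phase 1 can be sketched without committing to a web framework. The version below is a minimal illustration, not Meta's reference code. It assumes the signature arrives in the `X-Hub-Signature-256` header as `sha256=<hex digest>`, computed over the raw request body with your app secret. `handle_webhook`, `seen_ids`, the `enqueue` callback, and the 401 status for a bad signature are hypothetical choices made for the example.

```python
import hashlib
import hmac
from typing import Callable, Set


def verify_signature(app_secret: str, raw_body: bytes, header_value: str) -> bool:
    """Check a 'sha256=<hex>' signature header against the raw request body."""
    prefix = "sha256="
    if not header_value.startswith(prefix):
        return False
    expected = hmac.new(app_secret.encode(), raw_body, hashlib.sha256).hexdigest()
    received = header_value[len(prefix):]
    # compare_digest runs in constant time, so it does not leak where the mismatch is
    return hmac.compare_digest(expected.encode(), received.encode())


def handle_webhook(
    app_secret: str,
    raw_body: bytes,
    signature: str,
    wa_msg_id: str,
    seen_ids: Set[str],               # stand-in for a shared dedup store
    enqueue: Callable[[str], object], # stand-in for a Celery task's .delay
) -> int:
    """Return the HTTP status code the endpoint should answer with."""
    if not verify_signature(app_secret, raw_body, signature):
        return 401
    if wa_msg_id in seen_ids:
        return 200  # duplicate delivery: acknowledge it, but do no work
    seen_ids.add(wa_msg_id)
    enqueue(wa_msg_id)
    return 200
```

An in-memory set only works for a single process. In a real deployment the `seen_ids` check would more likely be an atomic Redis `SET` with `NX` and an expiry, so that every web worker agrees on which messages have been seen.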
Celery Setup with Redis
# celery_app.py
from celery import Celery

celery = Celery(
    "the chatbot platform",
    broker="redis://localhost:6379/0",
    backend="redis://localhost:6379/1",
)

celery.conf.update(
    task_serializer="json",
    result_serializer="json",
    accept_content=["json"],
    task_acks_late=True,  # Ack after the task has run, not when the worker receives it
    task_reject_on_worker_lost=True,  # Requeue the message if the worker process dies mid-task
    worker_prefetch_multiplier=1,  # Reserve only one message per worker process at a time
)
# tasks.py
from celery_app import celery

# build_system_prompt, get_conversation_history, generate_response,
# send_whatsapp_text, store_message, db, and chroma are assumed to be
# defined elsewhere in the project.


@celery.task(bind=True, max_retries=3, default_retry_delay=10)
def process_whatsapp_message(self, client_id: int, wa_contact_id: str, message_text: str, wa_msg_id: str):
    try:
        system_prompt = build_system_prompt(client_id, message_text, db, chroma)
        history = get_conversation_history(client_id, wa_contact_id, db)
        response = generate_response(system_prompt, history + [{"role": "user", "content": message_text}])
        send_whatsapp_text(wa_contact_id, response)
        store_message(client_id, wa_contact_id, message_text, response, db)
    except Exception as exc:
        # Retries the whole task after 10 seconds, up to 3 times. Note that a
        # failure in store_message therefore sends the reply again.
        raise self.retry(exc=exc)
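The task above calls `generate_response` without showing it. One plausible shape, assuming Ollama is running locally on its default port 11434 and using its `/api/chat` endpoint with streaming disabled, is sketched below. The model name `llama3` and the helper `build_chat_payload` are placeholders for illustration, not taken from the original project.

```python
import json
import urllib.request
from typing import Dict, List

OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"  # Ollama's default local address
DEFAULT_MODEL = "llama3"  # assumption: use whichever model you have pulled


def build_chat_payload(system_prompt: str, messages: List[Dict[str, str]],
                       model: str = DEFAULT_MODEL) -> dict:
    """Prepend the system prompt and request a single, non-streamed reply."""
    return {
        "model": model,
        "messages": [{"role": "system", "content": system_prompt}, *messages],
        "stream": False,
    }


def generate_response(system_prompt: str, messages: List[Dict[str, str]],
                      timeout: float = 60.0) -> str:
    """POST the conversation to Ollama and return the assistant's text."""
    body = json.dumps(build_chat_payload(system_prompt, messages)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_CHAT_URL,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    # A timeout or connection error raises here and lets the Celery task retry
    with urllib.request.urlopen(request, timeout=timeout) as response:
        data = json.loads(response.read())
    return data["message"]["content"]
```

The 60-second timeout mirrors the Phase 2 budget. Without an explicit timeout, a stalled Ollama process would hold the worker indefinitely instead of triggering a retry.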
Retry Logic and Dead Letter Queues
celery.conf.update(
    task_routes={
        "tasks.process_whatsapp_message": {"queue": "whatsapp"},
        "tasks.process_widget_message": {"queue": "widget"},
    },
    # Dead letter queue: holds tasks that exhausted their retries, for inspection.
    # Celery does not move tasks here on its own; a failure handler has to
    # publish them to "dead".
    task_queues={
        "whatsapp": {"exchange": "whatsapp", "routing_key": "whatsapp"},
        "dead": {"exchange": "dead", "routing_key": "dead"},
    },
)
Monitor the dead letter queue. A task that lands there means Ollama crashed, Redis was unavailable, or WhatsApp returned a permanent error. Each case needs a different fix — don't just re-queue blindly.
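Telling those cases apart can start with a small policy function that is testable on its own. The sketch below is an assumption about how such a policy might look, not part of Celery: `PermanentError`, `backoff_delay`, `next_action`, and the backoff constants are hypothetical. Inside the task, a `("retry", delay)` result would map to `self.retry(exc=exc, countdown=delay)` with `retries` taken from `self.request.retries`, and a `("dead", None)` result to publishing the payload to the `dead` queue.

```python
from typing import Optional, Tuple


class PermanentError(Exception):
    """Hypothetical marker for failures that retrying cannot fix,
    such as WhatsApp rejecting the recipient."""


def backoff_delay(retries: int, base: int = 10, cap: int = 300) -> int:
    """Exponential backoff in seconds: base, 2*base, 4*base, ... up to cap."""
    return min(cap, base * (2 ** retries))


def next_action(exc: Exception, retries: int,
                max_retries: int = 3) -> Tuple[str, Optional[int]]:
    """Decide whether a failed task should be retried or dead-lettered.

    `retries` is the number of retries already attempted (0 on the first failure).
    """
    if isinstance(exc, PermanentError):
        return ("dead", None)  # retrying would only repeat the same error
    if retries >= max_retries:
        return ("dead", None)  # retry budget exhausted
    return ("retry", backoff_delay(retries))
```

Compared with the fixed `default_retry_delay=10`, a growing delay gives a crashed Ollama or an unavailable Redis more time to recover before the retry budget runs out.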
What to Watch For
- Silent enqueue failure: If the webhook returns 200 but the Celery enqueue fails silently, the message is lost. Wrap task.delay() in a try/except and return a 500 if the enqueue fails, forcing a retry from Meta.
- Redis OOM: Celery results stay in Redis until they expire, which takes a day by default unless you lower result_expires. Set it to 1 hour for debugging, 5 minutes in production.
- Worker count vs model concurrency: Ollama handles only a small number of requests in parallel (often just one, depending on the version and the OLLAMA_NUM_PARALLEL setting). Running more Celery workers than Ollama can serve concurrently causes requests to queue inside Ollama, not in your system where you can observe it.
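The first point can be sketched as a guard around the enqueue call. This is an illustration under assumptions: `enqueue` stands in for something like a bound `process_whatsapp_message.delay`, `seen_ids` for the deduplication store, and `enqueue_or_fail` is a hypothetical helper. It also removes the deduplication mark on failure, because otherwise Meta's retry would be discarded as a duplicate and the message would still be lost.

```python
import logging
from typing import Callable, Set

logger = logging.getLogger(__name__)


def enqueue_or_fail(wa_msg_id: str, seen_ids: Set[str],
                    enqueue: Callable[[], object]) -> int:
    """Return the HTTP status the webhook should answer with."""
    if wa_msg_id in seen_ids:
        return 200  # already queued by an earlier delivery
    seen_ids.add(wa_msg_id)
    try:
        enqueue()
    except Exception:
        # Broker unreachable or similar: log it, undo the dedup mark,
        # and answer 500 so the webhook is delivered again.
        logger.exception("enqueue failed for %s", wa_msg_id)
        seen_ids.discard(wa_msg_id)
        return 500
    return 200
```

For the Redis memory point, `result_expires` is an ordinary Celery setting expressed in seconds, so the production value from the list would be `celery.conf.update(result_expires=300)`.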