
How WhatsApp Delivers Messages to 2 Billion Users — Messaging at Planetary Scale

By Akshay Ghalme · April 16, 2026 · 20 min read

You send a WhatsApp message. It usually reaches the other person's phone in a few hundred milliseconds, even if they are on the other side of the planet. The message is encrypted so thoroughly that WhatsApp's own servers cannot read it. Now multiply that by 100 billion messages per day, across 2 billion monthly active users, in every country on earth. And in the years when WhatsApp grew to hundreds of millions of users, the whole service was run by about 50 engineers.

WhatsApp is the most used messaging application in the world, and its architecture is one of the most efficient pieces of infrastructure ever deployed to consumers. The technology choices — Erlang, the Signal Protocol, a fanatical focus on simplicity — are deeply unusual and deeply deliberate. Each one solves a specific problem at planetary scale, and each one carries a lesson for any engineer building real-time systems.

This post is the first in a three-part series on Meta's infrastructure. Part 1 covers WhatsApp's messaging architecture. Part 2 covers how Meta deploys code to 4 billion users with zero downtime. Part 3 covers how Meta stores trillions of photos, messages, and social graph edges.

The Erlang/BEAM Foundation — Why WhatsApp Runs on a Telecom Language

WhatsApp's server software is written in Erlang, a programming language originally built by Ericsson in the 1980s for telephone switches. This is not a fashionable choice. Erlang is not on any "top 10 languages to learn" list. It has no hype cycle. It has barely any hiring pool. WhatsApp chose it anyway, and that choice is the single most important architectural decision in the entire system.

Here is what the BEAM virtual machine (Erlang's runtime) gives WhatsApp that no other platform does as well:

Millions of Lightweight Processes Per Node

Erlang processes are not OS threads. They are virtual machine constructs that start at roughly 300 machine words of memory each (about 2.5 KB on a 64-bit system) and are scheduled by the BEAM VM across available CPU cores. A single Erlang node can run millions of concurrent processes. In WhatsApp's architecture, each connected user is one Erlang process. A single server handling 2 million concurrent connections means 2 million Erlang processes: routine for BEAM, impractical for a runtime that dedicates an OS thread to each connection.

For comparison, a Java application using one OS thread per connection would need 2 million threads. At roughly 1 MB of stack space per thread, that is 2 TB of memory reserved just for stacks on a single machine. The same 2 million Erlang processes start at about 5 GB in total, before socket buffers and per-connection state are added.
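The comparison can be checked with a few lines of arithmetic. This is a rough sketch: the 1 MB thread stack comes from the text, while the 2,500 bytes per Erlang process is an assumption based on an initial process size of roughly 300 machine words on a 64-bit system. Real figures depend on the runtime version and on what each connection stores.

```python
def total_memory_gb(connections: int, bytes_per_connection: int) -> float:
    """Baseline memory for all connections, in decimal gigabytes."""
    return connections * bytes_per_connection / 1e9


CONNECTIONS = 2_000_000
THREAD_STACK_BYTES = 1_000_000   # ~1 MB stack per OS thread (figure from the text)
ERLANG_PROCESS_BYTES = 2_500     # assumption: ~300 words x 8 bytes, rounded

threads_gb = total_memory_gb(CONNECTIONS, THREAD_STACK_BYTES)
processes_gb = total_memory_gb(CONNECTIONS, ERLANG_PROCESS_BYTES)

print(f"Thread-per-connection: {threads_gb:,.0f} GB")
print(f"Erlang processes:      {processes_gb:,.0f} GB")
```

The gap of several hundred times is the point, not the exact byte counts.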

The "Let It Crash" Philosophy

In most programming languages, when something goes wrong, you write error handling code to catch it, recover, and continue. In Erlang, the philosophy is the opposite: let the process crash and restart it fresh. Each Erlang process is isolated — one crashing process cannot corrupt another. A supervision tree automatically restarts crashed processes within milliseconds.

This is perfect for a messaging server. If a user's connection process encounters a corrupted packet, the process crashes, the supervisor restarts it, and the user reconnects automatically. The 1,999,999 other connections on the same server are completely unaffected. No shared state, no cascading failures, no "one bad request takes down the server."
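The restart-on-crash idea can be sketched in a few lines of Python. This is only an illustration of the pattern, not Erlang's supervisor behaviour: the workers run one after another instead of concurrently, and the names `connection_worker` and `supervise` are hypothetical.

```python
from collections import deque


def connection_worker(packets: deque) -> None:
    """Handle one user's packets. Starts with fresh state on every (re)start."""
    while packets:
        packet = packets.popleft()
        if packet == b"":
            # No defensive recovery code: a bad packet simply crashes the worker.
            raise ValueError("corrupted packet")


def supervise(inboxes: dict, max_restarts: int = 3) -> dict:
    """Run one worker per user and restart it if it crashes.

    Returns the number of restarts each user's worker needed.
    """
    restarts = {}
    for user, packets in inboxes.items():
        count = 0
        while True:
            try:
                connection_worker(packets)
                break
            except ValueError:
                count += 1
                if count > max_restarts:
                    break  # give up on this worker only
        restarts[user] = count
    return restarts


inboxes = {
    "alice": deque([b"hi", b"there"]),
    "bob": deque([b"hi", b"", b"again"]),  # one corrupted packet
}
print(supervise(inboxes))
```

Bob's worker crashes once and is restarted; Alice's worker never notices, because the two share no state.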

Hot Code Swapping

BEAM supports loading new code into a running system without stopping it. WhatsApp can deploy new server code to a node while 2 million users are connected to it, and none of them notice. No reconnections, no dropped messages, no maintenance window. This is the holy grail of zero-downtime deployment, and it comes built into the VM.

The Numbers That Prove the Choice

WhatsApp circa 2014 (pre-Meta acquisition):
  Servers:          ~550 (commodity hardware)
  Engineers:        ~50
  Users:            450 million
  Messages/day:     ~50 billion
  Connections/server: ~2 million

Ratio:  ~9 million users per engineer
        ~820,000 users per server
        ~91 million messages per server per day

No other messaging platform in history has achieved this ratio. The Erlang/BEAM runtime is the primary reason.
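The ratios can be derived directly from the 2014 figures listed above. A short worked calculation, using only those numbers:

```python
USERS = 450_000_000
ENGINEERS = 50
SERVERS = 550
MESSAGES_PER_DAY = 50_000_000_000

users_per_engineer = USERS / ENGINEERS
users_per_server = USERS / SERVERS
messages_per_server_per_day = MESSAGES_PER_DAY / SERVERS

print(f"{users_per_engineer:,.0f} users per engineer")
print(f"{users_per_server:,.0f} users per server")
print(f"{messages_per_server_per_day:,.0f} messages per server per day")
```

That works out to about 9 million users per engineer, about 820,000 users per server, and about 91 million messages per server per day.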

Signal Protocol — End-to-End Encryption at Planetary Scale

Every WhatsApp message — text, photo, video, voice call, video call — is end-to-end encrypted using the Signal Protocol, developed by Moxie Marlinspike and Open Whisper Systems. This is the same protocol used by Signal (the app). It is considered the gold standard of messaging encryption.

End-to-end encryption means the WhatsApp server cannot read your messages. It is not a policy choice ("we promise not to read them"). It is a mathematical impossibility — the server never has the keys. Here is how it works:

Key Exchange — X3DH (Extended Triple Diffie-Hellman)

When two users first communicate, their devices perform an X3DH key exchange. Each user has a long-term identity key, a signed pre-key, and a set of one-time pre-keys. The public halves of these keys are uploaded to the WhatsApp server when the app is installed; the private halves never leave the device. When Alice wants to message Bob for the first time, she downloads Bob's public pre-keys from the server and computes a shared secret, even if Bob is offline. Bob's phone computes the same shared secret the next time it comes online. The server never sees the shared secret.
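The exchange can be sketched with a toy Diffie-Hellman group. This is a minimal illustration, assuming the four-DH combination described in the published X3DH specification. It is not WhatsApp's code: the group below is far too small to be secure, real implementations use X25519 with a proper KDF, and the signature check on the signed pre-key is omitted. All function names are hypothetical.

```python
import hashlib
import secrets

# Toy Diffie-Hellman group for illustration only: far too small to be secure.
P = 2**127 - 1  # a Mersenne prime
G = 3


def keypair() -> tuple[int, int]:
    """Return (private, public) for the toy group."""
    private = secrets.randbelow(P - 2) + 1
    return private, pow(G, private, P)


def dh(private: int, public: int) -> int:
    return pow(public, private, P)


def kdf(*shared: int) -> bytes:
    """Hash the concatenated DH outputs into a 32-byte session secret."""
    return hashlib.sha256(b"".join(s.to_bytes(16, "big") for s in shared)).digest()


def alice_secret(ik_a: int, ek_a: int, bob_bundle: dict) -> bytes:
    """Alice combines her private keys with Bob's published public keys."""
    return kdf(
        dh(ik_a, bob_bundle["signed_prekey"]),
        dh(ek_a, bob_bundle["identity"]),
        dh(ek_a, bob_bundle["signed_prekey"]),
        dh(ek_a, bob_bundle["one_time_prekey"]),
    )


def bob_secret(ik_b: int, spk_b: int, opk_b: int,
               alice_identity_pub: int, alice_ephemeral_pub: int) -> bytes:
    """Bob mirrors the same four DH operations later, when he comes online."""
    return kdf(
        dh(spk_b, alice_identity_pub),
        dh(ik_b, alice_ephemeral_pub),
        dh(spk_b, alice_ephemeral_pub),
        dh(opk_b, alice_ephemeral_pub),
    )
```

Alice needs only Bob's public bundle from the server, and Bob needs only the two public keys that Alice attaches to her first message. The server relays public values and never holds enough to compute the secret.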

The Double Ratchet — A New Key for Every Message

After the initial key exchange, every subsequent message uses the Double Ratchet algorithm. This generates a new encryption key for every single message. If an attacker somehow compromises the key for message #47, that key alone does not decrypt message #46 or message #48. Two properties follow: forward secrecy (a later compromise does not expose earlier messages) and future secrecy, also called post-compromise security (the session heals after a compromise once the two devices exchange fresh Diffie-Hellman keys).

The "double" in Double Ratchet refers to two ratchets running simultaneously: a Diffie-Hellman ratchet that advances when messages go back and forth, and a symmetric key ratchet that advances on every single message. Together they ensure that the cryptographic state continuously evolves and never reuses a key.
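The symmetric-key half of the ratchet is small enough to sketch. The version below assumes HMAC-SHA256 with the constants 0x01 and 0x02, which is the convention suggested in the published Double Ratchet specification; WhatsApp's exact parameters are not stated here. The Diffie-Hellman ratchet is left out, and the function names are illustrative.

```python
import hashlib
import hmac


def ratchet_step(chain_key: bytes) -> tuple[bytes, bytes]:
    """Advance the chain by one message.

    Returns (next_chain_key, message_key). Both are derived one-way from the
    current chain key, so neither reveals the chain key that produced it.
    """
    message_key = hmac.new(chain_key, b"\x01", hashlib.sha256).digest()
    next_chain_key = hmac.new(chain_key, b"\x02", hashlib.sha256).digest()
    return next_chain_key, message_key


def derive_message_keys(initial_chain_key: bytes, count: int) -> list[bytes]:
    """Derive one fresh key per message by stepping the chain `count` times."""
    keys = []
    chain_key = initial_chain_key
    for _ in range(count):
        chain_key, message_key = ratchet_step(chain_key)
        keys.append(message_key)
    return keys
```

Because each step is a one-way function, a leaked message key cannot be run backward to recover earlier keys. Recovering from a leaked chain key is the job of the Diffie-Hellman ratchet, which this sketch does not cover.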

The Message Delivery Pipeline

Here is what actually happens when you send a WhatsApp message:

sequenceDiagram
    participant A as Alice's Phone
    participant S as WhatsApp Server
    participant B as Bob's Phone
    A->>A: Encrypt message with<br/>Signal Protocol
    A->>S: Send encrypted blob +<br/>recipient ID + metadata
    Note over S: Server CANNOT<br/>read the message
    alt Bob is online
        S->>B: Forward encrypted blob
        B->>B: Decrypt with local keys
        B->>S: Delivery receipt ✓✓
        S->>A: Delivery receipt ✓✓
    else Bob is offline
        S->>S: Store in offline queue<br/>(encrypted, max 30 days)
        Note over S: When Bob comes online...
        S->>B: Deliver queued messages
        B->>B: Decrypt with local keys
        B->>S: Delivery receipt ✓✓
        S->>A: Delivery receipt ✓✓
        S->>S: Delete from queue
    end

Online Delivery (the Fast Path)

If the recipient is connected to the WhatsApp server (which most active users are — the app maintains a persistent connection), the message is forwarded immediately. The server matches the recipient ID to the Erlang process handling that user's connection and pushes the encrypted blob. Total server-side processing time: typically under 10ms. The round-trip latency the user sees (send → double checkmark) is dominated by network latency, not server processing.

Offline Queuing (the Store-and-Forward Path)

If the recipient is offline, the server stores the encrypted message in an offline queue. Messages are held for up to 30 days. When the recipient's phone reconnects, the server delivers all queued messages in order. Once the recipient's device acknowledges receipt, the server deletes the messages from its storage. At no point does the server decrypt the messages — it is storing and forwarding opaque encrypted blobs.
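Both paths fit in a small in-memory sketch. The `Router` class below is hypothetical and heavily simplified: a Python list stands in for a live connection, time is passed in explicitly, and queued messages are removed as soon as they are handed over, without waiting for the device acknowledgment described above.

```python
from collections import defaultdict, deque

RETENTION_SECONDS = 30 * 24 * 3600  # 30-day limit mentioned in the text


class Router:
    """Routes opaque ciphertext blobs; it never inspects or decrypts them."""

    def __init__(self) -> None:
        self.connections: dict[str, list[bytes]] = {}  # stand-in for live sockets
        self.offline = defaultdict(deque)              # recipient -> (stored_at, blob)

    def send(self, recipient: str, blob: bytes, now: float) -> str:
        if recipient in self.connections:              # fast path: push immediately
            self.connections[recipient].append(blob)
            return "delivered"
        self.offline[recipient].append((now, blob))    # store-and-forward path
        return "queued"

    def connect(self, user: str, now: float) -> list[bytes]:
        """Register the connection and flush queued messages in arrival order."""
        inbox: list[bytes] = []
        self.connections[user] = inbox
        for stored_at, blob in self.offline.pop(user, ()):
            if now - stored_at <= RETENTION_SECONDS:   # skip expired messages
                inbox.append(blob)
        return inbox

    def disconnect(self, user: str) -> None:
        self.connections.pop(user, None)
```

A production design would keep each queued message until the recipient's device acknowledges it, so that a dropped connection in the middle of delivery does not lose data.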

The Checkmarks — What They Actually Mean

  • One grey checkmark (✓) — message reached the WhatsApp server
  • Two grey checkmarks (✓✓) — message was delivered to the recipient's device
  • Two blue checkmarks (✓✓) — recipient opened the chat (read receipt, can be disabled)

Each checkmark is a separate acknowledgment flowing through the system. The "delivered" checkmark requires an ACK from the recipient's device back to the server, which forwards it to the sender. This is the same acknowledgment pattern used in TCP — and like TCP, it is the only way to reliably know a message arrived.
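One way to model the checkmarks on the sender's side is as a status that only moves forward. This is a hypothetical sketch, not WhatsApp's client code. It assumes acknowledgments can arrive late or more than once, so an older acknowledgment must never downgrade the status.

```python
from enum import IntEnum


class Status(IntEnum):
    PENDING = 0     # not yet acknowledged by the server
    SENT = 1        # one grey checkmark: server has the message
    DELIVERED = 2   # two grey checkmarks: recipient's device has it
    READ = 3        # two blue checkmarks: recipient opened the chat


def apply_ack(current: Status, ack: Status) -> Status:
    """Merge an incoming acknowledgment; the status never moves backward."""
    return max(current, ack)


status = Status.PENDING
for ack in (Status.SENT, Status.READ, Status.DELIVERED):  # DELIVERED arrives late
    status = apply_ack(status, ack)
print(status.name)
```

The late "delivered" acknowledgment is ignored because the message is already marked as read.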

Push Notification Routing

When a user's phone is not actively connected to WhatsApp's server (the app is in the background or the phone is in deep sleep), the server uses push notifications to wake the app:

  • iOS: Apple Push Notification service (APNs) — WhatsApp sends a push payload to Apple's servers, which deliver it to the iPhone. The push wakes the app, which reconnects to WhatsApp's server and downloads the message.
  • Android: Firebase Cloud Messaging (FCM) — same pattern through Google's push infrastructure.
  • Fallback: If neither push service delivers within a timeout (unreliable networks, aggressive battery optimization), the message stays in the offline queue until the app naturally reconnects.

The push notification itself does not contain the message text. It is a "wake up" signal. The actual message is downloaded over the encrypted connection after the app reconnects. This is both a privacy feature (Apple/Google cannot read your messages) and a reliability feature (the encrypted message is always delivered through WhatsApp's own infrastructure, not through a third party).

Multi-Device Architecture — The 2021 Redesign

Until 2021, WhatsApp Web and WhatsApp Desktop were mirrors of your phone — they proxied every message through your phone's connection. If your phone was offline, your desktop client did not work. This was architecturally simple (one device, one identity) but operationally terrible (slow, battery-draining, and broken whenever your phone had no internet).

The 2021 multi-device redesign changed this fundamentally:

graph TD
    ALICE[Alice's Contact] -->|Encrypted for<br/>each device| SERVER[WhatsApp Server]
    SERVER -->|Copy 1| PHONE[Bob's Phone<br/>Identity Key A]
    SERVER -->|Copy 2| WEB[Bob's Web Client<br/>Identity Key B]
    SERVER -->|Copy 3| DESKTOP[Bob's Desktop<br/>Identity Key C]
    PHONE --> DECRYPT1[Decrypt with Key A]
    WEB --> DECRYPT2[Decrypt with Key B]
    DESKTOP --> DECRYPT3[Decrypt with Key C]
    classDef server fill:#6C3CE1,stroke:#00D4AA,stroke-width:2px,color:#fff;
    classDef device fill:#4A1DB5,stroke:#00D4AA,stroke-width:2px,color:#fff;
    classDef good fill:#047857,stroke:#00D4AA,stroke-width:3px,color:#fff;
    class SERVER server;
    class PHONE,WEB,DESKTOP,ALICE device;
    class DECRYPT1,DECRYPT2,DECRYPT3 good;

Each device now has its own identity key pair. When Alice sends Bob a message, her device encrypts separate copies for each of Bob's registered devices. Bob's phone, web client, and desktop client each independently decrypt their copy using their own private keys. The phone no longer needs to be online — each device maintains its own encrypted session with every contact.

This is dramatically more complex than the original architecture. A message to Bob with 3 devices requires 3 separate encryption operations. A group message to 50 people averaging 2 devices each would require 100 encryptions if every copy were encrypted individually. The fan-out cost scales linearly with the number of devices in the conversation, which is one reason WhatsApp limits you to 4 linked devices.

Group Messaging — Solving the Fan-Out Problem

Encrypting a group message individually for every recipient would be catastrophically expensive. A 256-person group (WhatsApp's maximum for years, since raised to 1,024) with 2 devices each would require 512 encryption operations per message. In a busy group, that adds up to thousands of encryptions in quick succession on a single sender's device.

WhatsApp solves this with the Sender Key protocol. Each group member generates a sender key for that group and distributes it (encrypted) to every other member. When sending a message, the sender encrypts it once with their sender key, and the server fans out the single encrypted blob to all members. Each member decrypts it using the sender's key that they already have.

The trade-off: if a member is removed from the group, all sender keys must be regenerated and redistributed (because the removed member knows the old keys). This is why removing someone from a large WhatsApp group can briefly cause a burst of key exchange traffic.
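The trade-off can be put in numbers with a simple cost model. This is an illustrative sketch under stated assumptions: it counts encryption operations performed by one sender, it treats each sender-key distribution as one pairwise encryption per recipient device, and it follows the text in counting all 256 members as recipients. The function names are hypothetical.

```python
def pairwise_cost(messages: int, members: int, devices_each: int) -> int:
    """Encryptions if every message is encrypted separately per device."""
    return messages * members * devices_each


def sender_key_cost(messages: int, members: int, devices_each: int,
                    rekeys: int = 0) -> int:
    """Encryptions with a sender key.

    The key is distributed once per recipient device, and again after each
    rekey (for example, when a member is removed). After that, every message
    is encrypted once.
    """
    distribution = members * devices_each
    return distribution * (1 + rekeys) + messages


MESSAGES, MEMBERS, DEVICES = 100, 256, 2
print(pairwise_cost(MESSAGES, MEMBERS, DEVICES))
print(sender_key_cost(MESSAGES, MEMBERS, DEVICES))
print(sender_key_cost(MESSAGES, MEMBERS, DEVICES, rekeys=1))
```

Under these assumptions, 100 messages cost 51,200 pairwise encryptions but only 612 with a sender key, or 1,124 if one member removal forces a rekey along the way.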

Media Delivery — Photos, Videos, and Voice Notes

Text messages are small (a few KB) and flow directly through the message pipeline. Media is different — a photo can be 5 MB, a video 50 MB. WhatsApp handles media separately:

  1. Client encrypts the media with a random symmetric key.
  2. Client uploads the encrypted blob to WhatsApp's media storage (a CDN-backed blob store).
  3. Client sends a regular text message containing the media URL, the decryption key, and a thumbnail (also encrypted).
  4. Recipient receives the text message, sees the thumbnail, and downloads the full media from the CDN when they tap it.
  5. Recipient decrypts the media locally using the key from the message.

This separation means the media storage layer never has the decryption keys (they travel through the message channel, which uses Signal Protocol). The CDN stores opaque encrypted blobs. Even if the storage infrastructure is compromised, the media is unreadable.
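The five steps above can be sketched end to end. The cipher here is a toy stand-in (SHA-256 used as a keystream generator) chosen only so the example runs with the standard library. It is not secure and is not what WhatsApp uses. The `blob_store` dictionary and both function names are hypothetical, and the thumbnail is omitted.

```python
import hashlib
import secrets

blob_store: dict[str, bytes] = {}  # hypothetical stand-in for the CDN-backed store


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher for illustration only. NOT secure; use a real AEAD."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + (offset // 32).to_bytes(8, "big")).digest()
        out.extend(a ^ b for a, b in zip(data[offset:offset + 32], block))
    return bytes(out)


def send_media(media: bytes) -> dict:
    """Steps 1-3: encrypt with a random key, upload, return the reference."""
    key = secrets.token_bytes(32)
    ciphertext = keystream_xor(key, media)
    blob_id = hashlib.sha256(ciphertext).hexdigest()
    blob_store[blob_id] = ciphertext           # the store only ever sees ciphertext
    return {"blob_id": blob_id, "key": key}    # travels inside the E2E message


def receive_media(reference: dict) -> bytes:
    """Steps 4-5: download the blob, check its hash, decrypt locally."""
    ciphertext = blob_store[reference["blob_id"]]
    if hashlib.sha256(ciphertext).hexdigest() != reference["blob_id"]:
        raise ValueError("blob was modified in storage")
    return keystream_xor(reference["key"], ciphertext)
```

The key appears only in the returned reference, which is what the sender places in the end-to-end encrypted message. The storage layer holds bytes it has no way to decrypt.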

WhatsApp vs Signal vs Telegram vs iMessage

Feature | WhatsApp | Signal | Telegram | iMessage
E2E Encryption | Default, all chats | Default, all chats | Secret Chats only (opt-in) | Default (Apple-to-Apple)
Protocol | Signal Protocol | Signal Protocol | MTProto (custom) | Custom (proprietary)
Server can read messages? | No | No | Yes (regular chats) | No (Apple-to-Apple)
Metadata collection | Extensive (contacts, usage, location) | Minimal (phone number only) | Moderate | Moderate (tied to Apple ID)
Server language | Erlang | Java/Rust | C++ (custom) | Unknown (proprietary)
Open source | Encryption library only | Fully open source | Client only | No
Max group size | 1,024 | 1,000 | 200,000 (groups) | 32
Offline message storage | 30 days on server | Until delivered | Indefinite (cloud) | 30 days on server

The key distinction: WhatsApp and Signal cannot read your messages by design. Telegram can read your regular chats. Telegram's defense is that they choose not to — but "choose not to" is a policy, not a mathematical guarantee. The Signal Protocol makes reading messages a mathematical impossibility for anyone without the recipient's device.

Why 50 Engineers Could Handle Hundreds of Millions of Users

WhatsApp's tiny engineering team is one of the most remarkable facts in software history. The reasons are architectural, not heroic:

  1. Erlang eliminates most operational complexity. Hot code upgrades, fault-isolated processes, built-in distribution — the things other companies need dedicated SRE teams for come free with the runtime.
  2. Feature minimalism. WhatsApp deliberately did not build stories, feeds, ads, games, bots, or payment systems for years. Fewer features = fewer bugs = fewer engineers needed. The Unix philosophy of "do one thing well" applied to a consumer product.
  3. Server-side simplicity. The server's job is simple: authenticate, route encrypted blobs, manage presence. It does not process, analyze, recommend, or render anything. The encryption and decryption happen on the client. The server is a glorified mail carrier.
  4. No ads = no ad infrastructure. WhatsApp charged $1/year for years instead of running ads. No ads meant no ad serving system, no targeting pipeline, no analytics stack, no advertiser dashboard — which would have required 10x the engineering team.

The Infrastructure Numbers

WhatsApp (estimated, 2026):
  Monthly active users:       2+ billion
  Messages per day:           100+ billion
  Peak messages per second:   ~1.2 million
  Voice/video calls per day:  2+ billion minutes
  Countries served:           180+
  Server fleet:               Thousands (Meta data centers)
  Protocol:                   Custom XMPP variant (Erlang)
  Encryption:                 Signal Protocol (E2E, all messages)
  Linked devices per account: Up to 4 companion + 1 phone

Patterns You Can Reuse in Your Own Systems

Most engineers will never build WhatsApp. But these patterns show up in any system that handles real-time state with millions of connections:

  • Actor model for connections. One lightweight unit of concurrency per connection (Erlang processes, Go goroutines, or async tasks on an event loop such as Node.js with WebSockets). Do not share state between connections; let each one own its state.
  • Store-and-forward for reliability. When the recipient is offline, queue the message. When they come back, deliver and delete. This is the same pattern as SQS, Kafka consumer groups, and email SMTP relays. If your system drops messages when the receiver is offline, you need this.
  • Separate media from messages. Upload large blobs to a CDN, send a reference in the small message. This keeps the real-time channel fast and lets the CDN handle bandwidth-heavy delivery. S3 + CloudFront + a message queue is the AWS version of this pattern.
  • End-to-end encryption as an architecture constraint. Designing your system so the server cannot read the data (not just "promises not to") can simplify compliance work (GDPR, HIPAA) and reduces the blast radius of server breaches. The principle of least privilege taken to its logical extreme.
  • Per-device identity for multi-device. If your product supports multiple devices per user, give each device its own credentials and session. Do not proxy through one "primary" device — it creates a single point of failure and drains the primary device's battery.

Frequently Asked Questions

How does WhatsApp handle 2 billion users?

Through a combination of Erlang/BEAM for extreme concurrency (millions of connections per server), Signal Protocol for end-to-end encryption, a store-and-forward message delivery pipeline with offline queuing, and an extremely lean engineering culture. The key choice was Erlang — its lightweight process model lets a single server handle 2+ million concurrent connections, which is why WhatsApp ran with under 50 engineers for years.

Why does WhatsApp use Erlang?

Because the BEAM virtual machine was purpose-built for telecom systems requiring massive concurrency, fault tolerance, and hot code upgrades. A single Erlang node handles millions of lightweight processes (one per connection). The "let it crash" philosophy means individual connection failures never cascade. Hot code swapping lets WhatsApp deploy without disconnecting users. No other runtime matches this combination.

How does WhatsApp encryption work?

WhatsApp uses the Signal Protocol. Initial key exchange uses X3DH (Extended Triple Diffie-Hellman). Subsequent messages use the Double Ratchet algorithm which generates a new encryption key for every single message. The WhatsApp server never has decryption keys — it stores and forwards encrypted blobs it cannot read. Even if the server is compromised, messages remain encrypted.

How does WhatsApp multi-device work?

Since 2021, each device (phone, web, desktop) has its own identity key pair and maintains independent encrypted sessions with every contact. Messages are encrypted separately for each device. The phone no longer needs to be online — each device independently receives and decrypts. Limited to 4 companion devices to manage fan-out costs.

How does WhatsApp handle group messages?

Using the Sender Key protocol. Each group member distributes a sender key to all other members. The sender encrypts once with their sender key, the server fans out the single blob to all members. This avoids encrypting separately for each recipient — a message to a 256-person group requires one encryption instead of 256.

How many messages does WhatsApp handle per day?

Over 100 billion messages per day across 2 billion monthly active users. Peak throughput is approximately 1.2 million messages per second globally. The Erlang server fleet maintains persistent connections with every active client.

What database does WhatsApp use?

Mnesia (Erlang's built-in distributed database) for session and presence data. A custom-modified LMDB for server-side message storage (temporary, until delivered). Messages are deleted from the server after delivery. Long-term backups use Google Drive (Android) or iCloud (iOS) on the client side.

How is WhatsApp different from Telegram?

The biggest difference is encryption. WhatsApp uses end-to-end encryption by default for all messages (Signal Protocol), so the server cannot read them. Telegram uses client-server encryption by default (Telegram can read messages) with optional E2E Secret Chats. WhatsApp prioritizes privacy at the protocol level. Telegram prioritizes features (large groups, channels, bots, cloud storage).


Akshay Ghalme

AWS DevOps Engineer with 3+ years building production cloud infrastructure. AWS Certified Solutions Architect. Currently managing a multi-tenant SaaS platform serving 1000+ customers.
