Why WiFi Does Not Guarantee Your Message Gets Delivered

When Dimitris in Athens sends a WhatsApp message to Sofia in Berlin, the word "delivered" hides an enormous amount of protocol machinery. His phone encodes the message, encrypts it, wraps it in a TLS record, hands it to a TCP segment, places that segment inside an IP packet, and finally pushes the packet into an 802.11 frame that is modulated onto a radio carrier and transmitted by the WiFi chip. Every single one of those layers has its own notion of "success" and "failure," and none of them, individually, guarantees that Sofia will ever see the message.

This is not a bug. It is a design choice that has been baked into internet architecture since the 1980s, formalised in one of the most influential papers in systems design, and validated by decades of deployment experience. The network does not guarantee message delivery. The endpoints do.

This post traces that idea from its theoretical roots through every layer of the stack, and then examines how the major messaging platforms (WhatsApp, Telegram, Signal, Facebook Messenger, Viber) each solve the problem of reliable message delivery on top of a network that promises nothing.

1. The End-to-End Argument: Why the Network Stays Dumb

In 1984, Jerome Saltzer, David Reed, and David Clark published "End-to-End Arguments in System Design." The paper was short, clear, and transformative. Its central claim: if a function can only be correctly implemented with the participation of the endpoints, placing that function inside the network provides at best a performance optimisation, never a complete solution.

Their canonical example was reliable file transfer. A network could implement checksums at every hop, retransmit every lost packet, and verify every byte in transit. But even with all that machinery, the file could still be corrupted by a bug in the sending application's memory, a disk error on the receiving end, or a crash during the final write. The only way to guarantee that the file arrived correctly is for the receiving application to checksum the entire file after it has been written to disk and confirm it back to the sender. An end-to-end check is necessary. Network-level checks are redundant (at best) or wasteful (at worst).
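A minimal sketch of that end-to-end check, assuming the sender communicates a SHA-256 digest alongside the file. The function names and the choice of SHA-256 are illustrative, not taken from the paper. The important detail is that the receiver hashes the bytes it reads back from disk, not the bytes it received from the network.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path: str) -> str:
    """Hash the bytes as they exist on disk, after the write has completed."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def receive_and_verify(payload: bytes, expected_sha256: str, dest_path: str) -> bool:
    """Write the payload, force it to storage, then re-read and compare digests."""
    with open(dest_path, "wb") as f:
        f.write(payload)
        f.flush()
        os.fsync(f.fileno())
    return sha256_of_file(dest_path) == expected_sha256


original = b"hello from Athens"
expected = hashlib.sha256(original).hexdigest()

with tempfile.TemporaryDirectory() as tmp:
    # Intact transfer: digests match, so the receiver can confirm to the sender.
    ok = receive_and_verify(original, expected, os.path.join(tmp, "good.bin"))
    # One byte altered somewhere along the way: the end-to-end check catches it.
    corrupted = receive_and_verify(b"hellp from Athens", expected, os.path.join(tmp, "bad.bin"))

print(ok, corrupted)
```

Only when this check passes does it make sense for the receiver to send its confirmation back to the sender.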

This argument shaped the internet's layered architecture. IP routers do not guarantee delivery. They do not retransmit lost packets. They do not reorder out-of-sequence data. They forward packets toward their destination and drop them when they cannot. TCP, sitting at the transport layer on the endpoints, provides reliable ordered delivery. But TCP only covers the transport; application-level reliability (did the message get stored? did the user see it? did the display render correctly?) requires application-level confirmation.

The end-to-end argument is not a law of physics. It is an engineering principle, and like all engineering principles, it admits exceptions. Sometimes placing a function inside the network does help performance enough to justify the complexity. Link-layer retransmissions (like WiFi's MAC-layer ACKs) are a perfect example: they reduce the loss rate that TCP sees, which improves throughput, even though they cannot eliminate loss entirely. But the principle holds in its strong form: the network alone cannot provide the reliability that applications need.

Every messaging app in existence is a living proof of this argument. WhatsApp, Signal, Telegram, Viber, Messenger: they all build their own delivery confirmation systems because the network beneath them, from WiFi to IP to the cellular baseband, simply does not provide one.

2. What WiFi Actually Guarantees: Less Than You Think

WiFi operates at Layer 2, the data link layer. It moves frames between a station (your phone or laptop) and an access point (your router). The 802.11 standard does include an acknowledgement mechanism at this layer, and understanding what it covers (and what it does not) matters.

MAC-Layer Frame ACKs

When a station transmits a unicast data frame, the receiver must respond with an ACK frame within a strict timing window. In 802.11a/g/n, the ACK follows the data frame after one Short Interframe Space (SIFS), which is 16 microseconds on 5 GHz and 10 microseconds on 2.4 GHz (for OFDM). The ACK is a 14-byte control frame: it contains no payload, only a frame control field, a duration field, the MAC address of the station being acknowledged, and the frame check sequence.

If the sender does not receive the ACK within the SIFS + slot time window, it assumes the frame was lost. The MAC layer will retry the transmission, up to a configurable limit. The 802.11 standard specifies two retry counters:

  • Short Retry Count (for frames shorter than the RTS threshold): default limit is 7.
  • Long Retry Count (for frames equal to or longer than the RTS threshold): default limit is 4.

After exhausting retries, the MAC layer discards the frame and reports a transmission failure to the upper layer (typically the network stack's IP layer). There is no further recovery at Layer 2.
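The effect of these limits can be estimated with a little arithmetic. The sketch below assumes that each attempt fails independently with the same probability and treats the limit as the total number of transmission attempts. Both are simplifying assumptions: real radio loss tends to be bursty, so actual residual loss is usually worse than this model suggests.

```python
SHORT_RETRY_LIMIT = 7  # default for frames below the RTS threshold
LONG_RETRY_LIMIT = 4   # default for frames at or above the RTS threshold


def residual_loss(per_attempt_loss: float, max_attempts: int) -> float:
    """Probability that every attempt fails, assuming independent attempts."""
    if not 0.0 <= per_attempt_loss <= 1.0:
        raise ValueError("per_attempt_loss must be in [0, 1]")
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    return per_attempt_loss ** max_attempts


for p in (0.1, 0.3, 0.5):
    short = residual_loss(p, SHORT_RETRY_LIMIT)
    long_ = residual_loss(p, LONG_RETRY_LIMIT)
    print(f"per-attempt loss {p:.0%}: short frames {short:.6f}, long frames {long_:.6f}")
```

Even under this optimistic model the residual loss never reaches zero, which is why the layers above still need their own recovery.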

What This Guarantees

The WiFi ACK mechanism guarantees one thing: when the sender receives a MAC-layer ACK, the frame was received by the access point's radio and passed its Frame Check Sequence (FCS, a CRC-32 integrity check). That is the entire scope. It means:

  • The frame was not corrupted in transit (CRC passed).
  • The access point's radio hardware received the frame.
  • The access point sent back an ACK.

It does not mean:

  • The frame survived the access point's internal processing.
  • The frame was forwarded to the wired network.
  • The frame reached the destination IP address.
  • The frame reached the destination application.
  • The destination user read the message.

The WiFi ACK is a one-hop, link-layer confirmation. It covers approximately 10 metres of radio path between your phone and your router. The message still has to traverse the router's internal forwarding logic, the ISP's network, multiple autonomous systems, the recipient's ISP, the recipient's router, the recipient's WiFi link, and the recipient's application stack. WiFi says nothing about any of that.

Broadcast and Multicast: No ACKs at All

WiFi broadcast and multicast frames are not acknowledged. The sender transmits once, at the lowest mandatory rate (typically 1 Mbps on 2.4 GHz or 6 Mbps on 5 GHz), and hopes for the best. There are no retries. This is why protocols that rely on broadcast or multicast (ARP, mDNS, SSDP) can be unreliable on WiFi networks; the frames simply vanish if they collide or if the receiver misses them.

Power Save and Frame Buffering

When a WiFi client enters power-save mode (which mobile devices do aggressively to conserve battery), the access point must buffer frames destined for that client. The AP announces buffered frames in the Traffic Indication Map (TIM) field of its beacon frames, which are transmitted every 102.4 milliseconds by default. The client wakes up periodically, listens for a beacon, checks the TIM, and sends a PS-Poll or triggers a service period (in U-APSD/WMM power save) to retrieve its buffered frames.

If the AP's buffer fills up, or if the client does not wake up in time, frames are dropped. This is a common source of packet loss on mobile devices: the radio was literally asleep when the frame arrived, and the buffer overflowed before it woke up.

3. IP Is Deliberately Unreliable

The Internet Protocol (IP, RFC 791 for IPv4, RFC 8200 for IPv6) provides a single service: best-effort delivery of datagrams from a source address to a destination address. "Best-effort" is a precise technical term. It means the network will try to deliver the packet, but it makes no promises about success, ordering, timing, or non-duplication.

What Routers Do

An IP router receives a packet on one interface, examines the destination address, performs a longest-prefix match against its routing table, decrements the Time to Live (TTL) field (or Hop Limit in IPv6), recalculates the IPv4 header checksum (IPv6 has none), and forwards the packet out the appropriate interface. If the outgoing interface's queue is full, the router drops the packet. No notification is sent to the sender. No retry occurs. The packet is gone.
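The forwarding decision can be sketched in a few lines. The routing table, interface names, and the `queue_free_slots` parameter below are hypothetical; a real router does this in hardware with specialised lookup structures rather than a linear scan.

```python
import ipaddress
from typing import Optional, Tuple

# Hypothetical routing table: (prefix, outgoing interface).
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), "eth0"),    # default route
    (ipaddress.ip_network("10.0.0.0/8"), "eth1"),
    (ipaddress.ip_network("10.1.0.0/16"), "eth2"),
]


def lookup(dst: str) -> Optional[str]:
    """Longest-prefix match: of all prefixes containing dst, pick the most specific."""
    addr = ipaddress.ip_address(dst)
    matches = [(net, ifc) for net, ifc in ROUTES if addr in net]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]


def forward(dst: str, ttl: int, queue_free_slots: int) -> Optional[Tuple[str, int]]:
    """Return (interface, new TTL), or None if the packet is dropped."""
    new_ttl = ttl - 1
    if new_ttl <= 0:
        return None  # TTL expired
    interface = lookup(dst)
    if interface is None:
        return None  # no route to destination
    if queue_free_slots <= 0:
        return None  # output queue full: silent drop, no retry
    return interface, new_ttl


print(lookup("10.1.2.3"), lookup("10.2.0.1"), lookup("8.8.8.8"))
print(forward("10.1.2.3", ttl=64, queue_free_slots=5))
print(forward("10.1.2.3", ttl=64, queue_free_slots=0))
```

Every `None` here is a packet that disappears without the sender's transport layer being told directly.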

This is by design. The alternative is a circuit-switched network, like the old telephone system, where resources are reserved end-to-end before communication begins. Circuit switching guarantees capacity but wastes resources when the circuit is idle (which, for bursty data traffic, is most of the time). IP's packet-switched, best-effort model is statistically efficient: it allows many flows to share the same links, at the cost of occasional loss when demand exceeds capacity.

Why Packets Get Dropped

Packets are lost in the IP network for many reasons:

  • Congestion: a router's output queue fills up, and incoming packets are tail-dropped (or randomly dropped under active queue management schemes like Random Early Detection).
  • TTL expiry: a routing loop causes a packet to circulate until its TTL reaches zero. The router discards the packet and may send an ICMP Time Exceeded message back to the source.
  • Link failure: a physical link goes down mid-transmission. Packets in flight on that link are lost.
  • MTU mismatch: a packet is too large for the next hop's Maximum Transmission Unit, the Don't Fragment (DF) bit is set, and the router drops the packet with an ICMP Fragmentation Needed message.
  • Policy filtering: a firewall or access control list drops the packet based on source, destination, port, or protocol.
  • Checksum failure: the IPv4 header checksum does not match (IPv6 has no header checksum; it relies on upper-layer checksums).
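As an illustration of the last item, the IPv4 header checksum is the 16-bit one's-complement of the one's-complement sum of the header's 16-bit words. The sketch below computes it for an example 20-byte header (the addresses and field values are illustrative) and then verifies it the way a receiving router would.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement of the one's-complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    # Fold any carries above 16 bits back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# Example header with the checksum field (bytes 10 and 11) set to zero.
header_zeroed = bytes.fromhex("45000073000040004011" "0000" "c0a80001c0a800c7")
checksum = internet_checksum(header_zeroed)

# Insert the checksum; summing a valid header, checksum included, yields zero.
filled = header_zeroed[:10] + checksum.to_bytes(2, "big") + header_zeroed[12:]
valid = internet_checksum(filled) == 0

print(hex(checksum), valid)  # 0xb861 True
```

A single flipped bit in the header makes the verification sum non-zero, and the router discards the packet.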

In normal operation on a healthy network, the loss rate is low (typically under 0.1% for well-provisioned links). But "low" is not "zero," and the network provides no mechanism to recover from that loss. Recovery is the job of higher layers.

Contrast with the Telephone Network

The traditional Public Switched Telephone Network (PSTN) was circuit-switched. When you dialled a number, the switches established a dedicated 64 kbit/s channel through the network. That channel was reserved for the duration of the call, whether or not anyone was speaking. The channel guaranteed constant bandwidth and constant latency, which is perfect for voice. But it was catastrophically inefficient for data, because data traffic is bursty: a web page load sends a burst of packets, then nothing, then another burst. Reserving a constant circuit for bursty traffic wastes most of the reserved bandwidth.

IP won because statistical multiplexing (sharing links among many flows, each getting capacity only when it has data to send) is vastly more efficient for data traffic. The price is that no individual flow gets a guarantee. The network is best-effort. Reliability lives at the edges.

4. TCP vs UDP vs QUIC: The Transport Layer Tradeoff

The transport layer sits between IP (which is unreliable) and applications (which need reliability, or at least some defined service). The three major transport protocols offer different tradeoffs.

TCP: Reliable, Ordered, Expensive

TCP (RFC 9293, the current consolidated specification) provides a reliable, ordered byte stream. It achieves this through:

  • Sequence numbers: every byte of data is assigned a sequence number. The receiver can detect missing, duplicate, and out-of-order bytes.
  • Acknowledgements: the receiver sends ACKs indicating the next expected byte. Selective ACKs (SACK, RFC 2018) allow the receiver to report non-contiguous blocks of received data.
  • Retransmission: if an ACK is not received within the Retransmission Timeout (RTO), or if three duplicate ACKs trigger a fast retransmit, the sender resends the missing data.
  • Flow control: the receiver advertises a window size indicating how much buffer space is available. The sender will not send more than this.
  • Congestion control: algorithms like CUBIC, BBR, or Reno limit the sending rate to avoid overwhelming the network.
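The receiver's side of the first two mechanisms can be sketched as follows. The function name and the half-open `(start, end)` byte-range representation are assumptions for illustration; real TCP stacks also handle sequence-number wraparound and limit the number of SACK blocks per segment.

```python
from typing import List, Tuple


def ack_state(received: List[Tuple[int, int]], initial_seq: int) -> Tuple[int, List[Tuple[int, int]]]:
    """Return (cumulative ACK, SACK blocks) for the byte ranges received so far.

    Each range is half-open: (start, end) covers bytes start .. end - 1.
    """
    # Merge overlapping or adjacent ranges; duplicates collapse here.
    merged: List[Tuple[int, int]] = []
    for start, end in sorted(received):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))

    ack = initial_seq          # next byte the receiver expects in order
    sack_blocks = []           # data held beyond the first gap
    for start, end in merged:
        if start <= ack:
            ack = max(ack, end)
        else:
            sack_blocks.append((start, end))
    return ack, sack_blocks


# Bytes 1100..1199 were lost; later segments arrived anyway.
segments = [(1000, 1100), (1200, 1300), (1300, 1400)]
print(ack_state(segments, initial_seq=1000))

# After the retransmission fills the gap, the cumulative ACK jumps forward.
segments.append((1100, 1200))
print(ack_state(segments, initial_seq=1000))
```

The SACK block tells the sender exactly which bytes to resend, instead of everything after the gap.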

The cost of all this machinery is latency. TCP requires a three-way handshake (1 RTT) before data can flow. If TLS is layered on top, add another 1 to 2 RTTs. Head-of-line blocking means a single lost segment stalls the entire stream until it is retransmitted. For short-lived connections (a single DNS query, a game state update), the handshake cost alone can dominate total latency.

UDP: Unreliable, Unordered, Cheap

UDP (RFC 768) adds almost nothing to IP. It provides port numbers (for demultiplexing), a length field, and an optional checksum (mandatory in IPv6). No sequencing, no acknowledgement, no retransmission, no flow control, no congestion control. A UDP datagram is sent and either arrives or does not. If the application needs reliability, it must implement it.
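The entire UDP header is four 16-bit fields, which makes it easy to build by hand. This sketch leaves the checksum at zero, which in IPv4 means "no checksum computed"; the port numbers and payload are arbitrary examples.

```python
import struct

UDP_HEADER = struct.Struct("!HHHH")  # source port, destination port, length, checksum


def build_udp_datagram(src_port: int, dst_port: int, payload: bytes) -> bytes:
    """Prefix the payload with an 8-byte UDP header (checksum left at zero)."""
    length = UDP_HEADER.size + len(payload)  # the length field covers header and payload
    return UDP_HEADER.pack(src_port, dst_port, length, 0) + payload


def parse_udp_header(datagram: bytes):
    """Return (source port, destination port, length, checksum)."""
    return UDP_HEADER.unpack(datagram[:UDP_HEADER.size])


datagram = build_udp_datagram(5000, 53, b"hello")
print(len(datagram), parse_udp_header(datagram))  # 13 (5000, 53, 13, 0)
```

Nothing in those eight bytes identifies a sequence position or requests an acknowledgement, so any reliability has to come from the payload's own protocol.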

UDP is the right choice when:

  • Timeliness matters more than completeness (real-time voice, video, gaming).
  • The application has its own reliability protocol (DNS, QUIC, WireGuard).
  • The overhead of TCP connection setup is too high for a single request-response exchange.

QUIC: Reliable Streams over UDP

QUIC (RFC 9000) runs over UDP but provides TCP-like reliability with several improvements:

  • 0-RTT and 1-RTT handshakes: QUIC integrates TLS 1.3 into the transport handshake. A new connection requires 1 RTT; a resumed connection can send data in 0 RTT.
  • Stream multiplexing without head-of-line blocking: QUIC multiplexes multiple independent streams within a single connection. Loss on one stream does not block others.
  • Connection migration: QUIC connections are identified by a Connection ID, not by the four-tuple (source IP, source port, destination IP, destination port). When a phone switches from WiFi to cellular, the connection can survive.
  • Encrypted transport headers: almost all QUIC header fields are encrypted, preventing middlebox ossification.

HTTP/3 uses QUIC as its transport. Google has been running QUIC in production since 2013, and it carries a significant fraction of global web traffic.
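To make the handshake costs concrete, the sketch below counts the round trips that must complete before the first byte of application data can be sent, using the figures quoted in this section. The 50 ms round-trip time is an assumed value for illustration, not a measurement.

```python
# Round trips required before application data can flow.
SETUP_ROUND_TRIPS = {
    "TCP + TLS 1.2": 1 + 2,        # TCP handshake plus a two-round-trip TLS handshake
    "TCP + TLS 1.3": 1 + 1,        # TCP handshake plus a one-round-trip TLS handshake
    "QUIC (new connection)": 1,    # transport and TLS 1.3 handshakes combined
    "QUIC (resumed, 0-RTT)": 0,    # data rides along with the first packet
}


def setup_delay_ms(stack: str, rtt_ms: float) -> float:
    """Time spent on handshakes before the first application byte is sent."""
    return SETUP_ROUND_TRIPS[stack] * rtt_ms


ASSUMED_RTT_MS = 50.0  # hypothetical round-trip time
for stack in SETUP_ROUND_TRIPS:
    print(f"{stack:24s} {setup_delay_ms(stack, ASSUMED_RTT_MS):6.0f} ms")
```

On a mobile link with a higher round-trip time the gap widens proportionally, which is a large part of QUIC's appeal for short-lived connections.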

What Messaging Apps Use

Most messaging apps use TCP or QUIC for their primary message transport, because messages need reliable delivery. Some specifics:

  • WhatsApp: uses Noise Pipes (a Noise Protocol Framework variant) over TCP, with fallback to WebSocket. More recently, WhatsApp has been observed using QUIC on some connections.
  • Telegram: uses its own MTProto protocol over TCP, with UDP as a fallback.
  • Signal: uses WebSocket over TLS/TCP for the persistent connection to Signal's servers.
  • Facebook Messenger: uses MQTT over TLS/TCP for push messaging.
  • Viber: uses a proprietary binary protocol over TCP.

In all cases, the transport provides byte-stream reliability, but message-level delivery confirmation requires application-layer protocol work on top.

5. WhatsApp: Signal Protocol, Noise Pipes, and the Checkmark System

WhatsApp is the most widely used messaging app in Europe, with over 2 billion users globally. Its architecture combines end-to-end encryption, a store-and-forward server infrastructure, and a multi-layered delivery confirmation system.

The Signal Protocol: Double Ratchet and X3DH

WhatsApp uses the Signal Protocol for end-to-end encryption. This protocol has two main components:

X3DH (Extended Triple Diffie-Hellman) handles initial key agreement. When Elena in Amsterdam wants to message Nikos in Athens for the first time, her client needs to establish a shared secret without Nikos being online. X3DH works as follows:

  1. Every WhatsApp client uploads a set of prekeys to the WhatsApp server: an identity key (long-term), a signed prekey (medium-term, rotated periodically), and a batch of one-time prekeys (each used exactly once).
  2. Elena's client fetches Nikos's prekey bundle from the server.
  3. Elena's client performs three (or four, if a one-time prekey is available) Diffie-Hellman computations using her identity key, her ephemeral key, Nikos's identity key, Nikos's signed prekey, and optionally Nikos's one-time prekey.
  4. The results are combined through HKDF to derive a shared secret, which initialises the Double Ratchet.

The beauty of X3DH is asynchronous key agreement: Elena can encrypt a message for Nikos even if Nikos's phone has been off for three days. The server holds his prekeys; his phone does not need to be involved until he comes online to decrypt.
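The four Diffie-Hellman computations can be sketched as follows. To stay within the standard library, this example substitutes toy finite-field Diffie-Hellman for the elliptic-curve operations a real client uses, omits the signature check on the signed prekey, and uses a made-up HKDF `info` label. It shows the shape of the computation only and is not secure.

```python
import hashlib
import hmac
import secrets

# Toy Diffie-Hellman group, for illustration only.
P = 2**127 - 1
G = 3


def keypair():
    private = secrets.randbelow(P - 3) + 2
    return private, pow(G, private, P)


def dh(private: int, peer_public: int) -> bytes:
    return pow(peer_public, private, P).to_bytes(16, "big")


def hkdf_sha256(ikm: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF (RFC 5869) with an all-zero salt."""
    prk = hmac.new(b"\x00" * 32, ikm, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


# Nikos publishes his prekey bundle (public halves only) while he is online.
ik_b, IK_B = keypair()    # identity key
spk_b, SPK_B = keypair()  # signed prekey
opk_b, OPK_B = keypair()  # one-time prekey

# Elena, later, with Nikos offline: identity key plus a fresh ephemeral key.
ik_a, IK_A = keypair()
ek_a, EK_A = keypair()
elena_secret = hkdf_sha256(
    dh(ik_a, SPK_B) + dh(ek_a, IK_B) + dh(ek_a, SPK_B) + dh(ek_a, OPK_B),
    info=b"x3dh-sketch",
)

# Nikos, when he comes online, mirrors each computation with his private keys.
nikos_secret = hkdf_sha256(
    dh(spk_b, IK_A) + dh(ik_b, EK_A) + dh(spk_b, EK_A) + dh(opk_b, EK_A),
    info=b"x3dh-sketch",
)

print(elena_secret == nikos_secret)  # True
```

Elena needs only public values from the server, which is what makes the agreement asynchronous.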

The Double Ratchet provides forward secrecy and future secrecy for every message. It combines:

  • A Diffie-Hellman ratchet: each message exchange includes a new ephemeral DH public key. Both parties compute a new DH shared secret, which ratchets the root key forward.
  • A symmetric ratchet: between DH ratchet steps, a KDF chain derives per-message keys from the current chain key. Each message key is used once and deleted.

This means that compromising a single message key does not reveal past or future messages. Every message is encrypted with a unique key derived from an evolving chain of secrets.
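The symmetric ratchet is the simpler half to illustrate. In this sketch each step feeds the chain key through HMAC-SHA256 with two different constants, one producing the message key and one producing the next chain key. The constants and the starting key are assumptions for the example, and the DH ratchet that periodically replaces the chain key is omitted.

```python
import hashlib
import hmac
from typing import Tuple


def kdf_chain_step(chain_key: bytes) -> Tuple[bytes, bytes]:
    """Derive (next chain key, message key) from the current chain key."""
    message_key = hmac.new(chain_key, b"\x01", hashlib.sha256).digest()
    next_chain_key = hmac.new(chain_key, b"\x02", hashlib.sha256).digest()
    return next_chain_key, message_key


# Stand-in for a chain key produced by the DH ratchet.
chain_key = hashlib.sha256(b"example chain key").digest()

message_keys = []
for _ in range(3):
    # The old chain key is overwritten, so earlier message keys cannot be re-derived.
    chain_key, message_key = kdf_chain_step(chain_key)
    message_keys.append(message_key)

for i, key in enumerate(message_keys, start=1):
    print(f"message {i}: {key.hex()[:16]}...")
```

Because HMAC is one-way, holding the current chain key reveals nothing about the message keys that came before it.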

Noise Pipes Transport

Below the Signal Protocol's encryption layer, WhatsApp uses Noise Pipes, a transport protocol from the Noise Protocol Framework (designed by Trevor Perrin, who also co-designed the Signal Protocol). Noise Pipes provides:

  • Mutual authentication between the client and the WhatsApp server.
  • Encrypted transport (all data after the handshake is encrypted with ChaCha20-Poly1305 or AES-256-GCM).
  • A 1-RTT handshake for new connections, and a 0-RTT handshake for resumed connections (using a cached server static key).

The Noise handshake pattern used is either IK or XX. With IK, the client already knows the server's static key, so it can send encrypted data in its first handshake message. With XX, the client has no cached server key and must complete a full handshake, during which the server's static key is transmitted, before sending data. In practice, after the first connection, subsequent connections use IK for lower latency.

Store-and-Forward

WhatsApp's servers act as a store-and-forward relay. When Dimitris sends a message:

  1. His client encrypts the message using the Signal Protocol (producing ciphertext that only Sofia's device can decrypt).
  2. The client sends the ciphertext to WhatsApp's server over the Noise Pipes connection.
  3. The server stores the ciphertext. WhatsApp cannot decrypt it (it does not have the message keys).
  4. When Sofia's device connects (or is already connected), the server forwards the ciphertext.
  5. Sofia's client decrypts the message using her copy of the Double Ratchet state.

If Sofia is offline, the server queues the message. WhatsApp reportedly stores undelivered messages for up to 30 days before discarding them.

The Checkmark System

WhatsApp's checkmarks map directly to protocol-level events:

| Indicator | Meaning | Protocol event |
| --- | --- | --- |
| Single grey checkmark (✓) | Message sent to server | Server acknowledged receipt of the ciphertext from the sender's device |
| Double grey checkmarks (✓✓) | Message delivered to recipient's device | Recipient's device downloaded the ciphertext and sent a delivery receipt back through the server |
| Double blue checkmarks (✓✓) | Message read by recipient | Recipient's app rendered the message on screen and sent a read receipt back through the server |

Each transition involves a round trip through the network. The delivery receipt is an application-layer message sent by the recipient's WhatsApp client back to the server, which relays it to the sender. This is entirely separate from any TCP ACK or WiFi ACK. A TCP ACK means "my kernel's TCP stack received the bytes." A WhatsApp delivery receipt means "the WhatsApp application on the recipient's phone processed the message."
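On the sender's side, this amounts to a small state machine driven by application-layer events. The event names below are hypothetical labels, not WhatsApp's wire format. The sketch also assumes that receipts can arrive out of order and that the status should never move backwards.

```python
from enum import IntEnum


class Status(IntEnum):
    PENDING = 0    # not yet acknowledged by the server
    SENT = 1       # single grey checkmark
    DELIVERED = 2  # double grey checkmarks
    READ = 3       # double blue checkmarks


# Hypothetical event names mapped to the status they imply.
EVENT_TO_STATUS = {
    "server_ack": Status.SENT,
    "delivery_receipt": Status.DELIVERED,
    "read_receipt": Status.READ,
}


def apply_event(current: Status, event: str) -> Status:
    """Advance the status; a late or duplicate event never downgrades it."""
    return max(current, EVENT_TO_STATUS[event])


status = Status.PENDING
for event in ("server_ack", "read_receipt", "delivery_receipt"):
    status = apply_event(status, event)
    print(f"{event:18s} -> {status.name}")
```

None of these transitions is triggered by a TCP or WiFi acknowledgement; each one requires a message generated by an application.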

Multi-Device

WhatsApp's multi-device support (launched in 2021) complicates this. Before multi-device, companion clients such as desktop and web relayed messages through the phone, which was the single source of truth. With multi-device, each device has its own Signal Protocol identity key pair and its own set of ratchet states. When Dimitris sends a message to Sofia, his client must encrypt it separately for each of Sofia's linked devices (phone, desktop, web). The server stores and forwards each copy independently. Delivery and read receipts are per-device.

6. Telegram: MTProto 2.0 and the Cloud-First Model

Telegram takes a very different architectural approach from WhatsApp and Signal. It is cloud-first: messages are stored on Telegram's servers, encrypted in transit but not end-to-end encrypted by default. This is a deliberate design choice with significant implications for both usability and privacy.

MTProto 2.0

Telegram uses its own custom protocol, MTProto 2.0, instead of relying on established protocols like TLS. MTProto handles authentication, encryption, serialisation, and transport. The protocol has been criticised by some cryptographers for being bespoke rather than using well-audited standards, but it has undergone several independent security analyses.

Key exchange in MTProto uses a Diffie-Hellman exchange between the client and the Telegram server. The shared secret is combined with a server nonce and client nonce through SHA-256 to derive an authorisation key, which is a 2048-bit key that persists across sessions. This is server-client encryption, not end-to-end: the Telegram server possesses the key material needed to decrypt messages.

Message encryption in MTProto 2.0 uses AES-256-IGE (Infinite Garble Extension) mode with a per-message key derived from the authorisation key and the message content. The message key is computed as:

msg_key = SHA-256(substr(auth_key, 88, 32) + plaintext)[8:24]

This is a 128-bit key extracted from the middle of a SHA-256 hash. The AES key and IV are then derived from the auth_key and msg_key through a series of SHA-256 operations.
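The formula above translates directly into code. This sketch follows the simplified form quoted here, using a random stand-in for the 2048-bit authorisation key; the production protocol also mixes in padding and a direction-dependent offset, which are omitted.

```python
import hashlib
import secrets


def msg_key(auth_key: bytes, plaintext: bytes) -> bytes:
    """Bytes 8..23 of SHA-256(auth_key[88:120] + plaintext), as quoted above."""
    if len(auth_key) != 256:
        raise ValueError("auth_key must be 2048 bits (256 bytes)")
    digest = hashlib.sha256(auth_key[88:88 + 32] + plaintext).digest()
    return digest[8:24]


auth_key = secrets.token_bytes(256)  # stand-in for a negotiated authorisation key
key_a = msg_key(auth_key, b"first message")
key_b = msg_key(auth_key, b"second message")

print(len(key_a) * 8, key_a != key_b)
```

Because the key depends on the plaintext, it doubles as an integrity check: the receiver recomputes it after decryption and rejects the message on a mismatch.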

TL serialisation: Telegram uses its own serialisation format called TL (Type Language), not Protocol Buffers or JSON. TL is a strongly-typed binary serialisation scheme with a schema language. Every API call and response is serialised using TL constructors, which are identified by CRC32 hashes of their type signatures. This is unusual; most modern APIs use Protobuf, FlatBuffers, or JSON.

Cloud-First Architecture

Regular Telegram chats (called "Cloud Chats") are stored on Telegram's servers in encrypted form. The server can decrypt them (it has the auth_key). This enables several features that E2E-encrypted apps cannot easily provide:

  • Seamless multi-device: all your devices see the same message history, because the server holds the canonical copy. No need for per-device encryption or complex state synchronisation.
  • Server-side search: you can search your entire message history from any device.
  • Fast device switching: log in on a new phone and your entire history is there.
  • Large group chats: groups of up to 200,000 members work because the server handles fan-out, not the sender's device.

The tradeoff: Telegram can, in principle, read your messages. They state that they do not, and that messages are encrypted at rest with keys distributed across multiple jurisdictions, but this is a policy guarantee, not a cryptographic one.

Secret Chats: The E2E Exception

Telegram does offer end-to-end encryption through "Secret Chats," but these are opt-in, not default. Secret Chats use a Diffie-Hellman key exchange between the two clients (not involving the server's long-term keys) to establish a shared secret, then encrypt messages with AES-256-IGE using keys derived from that shared secret.

Secret Chats are device-specific (they do not sync across devices), do not support group conversations, and do not store messages on the server. They also support self-destructing messages with a timer. In practice, very few Telegram users use Secret Chats. The default cloud chat experience is far more convenient.

Delivery Mechanics

Telegram's delivery model for cloud chats:

  1. The sender's client serialises the message using TL, encrypts it with MTProto, and sends it to a Telegram server over TCP.
  2. The server decrypts the message, stores it in the cloud, and determines which of the recipient's devices are connected.
  3. For connected devices, the server pushes the message immediately over the persistent MTProto connection.
  4. For disconnected devices, the server sends a push notification via FCM (Android) or APNs (iOS) to wake the device. When the device connects, it syncs new messages from the server.

Telegram uses a single checkmark (✓) to indicate that the message reached Telegram's cloud, and a double checkmark (✓✓) to indicate that the recipient has read it. There is no separate "delivered to device" indicator for cloud chats: because an account can be active on many devices at once, delivery to any one of them is not a meaningful event.

7. Signal: Minimal Metadata, Maximum Privacy

Signal is the reference implementation of the Signal Protocol and the app that most security researchers recommend. Its architecture prioritises privacy and minimises the data that the server can access.

Protocol Stack

Signal uses the same Signal Protocol (X3DH + Double Ratchet) as WhatsApp for message encryption. The difference is in everything around it:

  • Transport: Signal clients maintain a persistent WebSocket connection to Signal's server over TLS 1.3/TCP. Messages are sent and received over this WebSocket.
  • Push notifications: when the WebSocket is not connected (phone is asleep), Signal sends a push notification via FCM or APNs. Critically, the push notification contains no message content; it simply wakes the app, which then connects via WebSocket to retrieve encrypted messages.
  • Prekey distribution: like WhatsApp, Signal clients upload prekey bundles to the server for asynchronous key establishment.

Sealed Sender

One of Signal's most interesting privacy features is sealed sender (introduced in 2018). In a normal messaging protocol, the server sees both the sender and recipient of every message, because it needs to know where to route the message. Sealed sender encrypts the sender's identity so that the server only knows the recipient.

The mechanism:

  1. The sender obtains a delivery token for the recipient (derived from the recipient's profile key, which is shared through the Signal Protocol's encrypted channel).
  2. The sender encrypts the message content with the Signal Protocol as usual.
  3. The sender then wraps the encrypted message, together with a certificate attesting to the sender's identity, in an additional encryption layer addressed to the recipient. The outer envelope carries only the recipient's identifier and the delivery token; the sender's identity is sealed inside.
  4. The server validates the delivery token against the recipient and routes the envelope. It cannot decrypt the contents, so it never sees the sender's identity.

The server learns that someone sent a message to a given recipient, but not who. Combined with Signal's policy of retaining minimal metadata (their responses to subpoenas have famously contained only the account creation date and last connection timestamp), this provides strong metadata protection.

Secure Value Recovery (SVR)

Signal's SVR system allows users to set a PIN that protects their profile, settings, and contacts. The PIN-derived key encrypts this data before it is stored on Signal's servers. SVR uses Intel SGX (Software Guard Extensions) enclaves to process PIN verification in a trusted execution environment; the server itself cannot access the unencrypted data even if compromised. Signal has been migrating to SVR2, which uses a rate-limiting scheme instead of SGX, recognising the limitations of hardware-based trusted computing.

Delivery Confirmation

Signal's delivery and read receipts are opt-in. When enabled:

  • A delivered receipt is sent when the recipient's Signal app downloads and decrypts the message.
  • A read receipt is sent when the recipient views the message in the app's UI.

Both receipts are themselves encrypted using the Signal Protocol and sent through the sealed sender mechanism. The server cannot determine whether a receipt is a "delivered" or "read" receipt; it is opaque ciphertext.

8. Facebook Messenger and Instagram DMs: MQTT, Lightspeed, and the Meta Unification

Facebook Messenger and Instagram Direct Messages share a backend infrastructure within Meta. Since late 2023, Messenger has default end-to-end encryption for personal messages, a significant architectural change for a platform with over a billion users.

MQTT for Push Messaging

Messenger has historically used MQTT (Message Queuing Telemetry Transport) as its push messaging protocol. MQTT is a lightweight publish-subscribe protocol originally designed for IoT devices. It runs over TCP and uses a broker model:

  • The client establishes a persistent TCP connection to the MQTT broker.
  • The client subscribes to topics (in Messenger's case, topics correspond to the user's conversations).
  • When a new message arrives, the broker pushes it to the client's connection immediately.
  • MQTT defines three Quality of Service levels:
    • QoS 0: at most once (fire and forget).
    • QoS 1: at least once (broker stores the message until acknowledged by the client).
    • QoS 2: exactly once (two-phase acknowledgement).

Messenger primarily uses QoS 1, which means the MQTT broker will retry delivery until the client acknowledges receipt. This provides transport-level reliability, but again, it only confirms that the client's MQTT layer received the message, not that the user saw it.
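The at-least-once behaviour of QoS 1 can be sketched with an in-memory broker and a client whose acknowledgements sometimes get lost. The class names are hypothetical and nothing here touches a real MQTT library; the point is that redelivery produces duplicates, which the receiver has to tolerate.

```python
class FlakyClient:
    """Receives every message, but its first few PUBACKs are lost in transit."""

    def __init__(self, acks_to_lose: int):
        self.acks_to_lose = acks_to_lose
        self.arrivals = []    # every delivery, duplicates included
        self.processed = []   # deduplicated by packet id
        self._seen_ids = set()

    def receive(self, packet_id: int, payload: str) -> bool:
        """Return True if the PUBACK makes it back to the broker."""
        self.arrivals.append(payload)
        if packet_id not in self._seen_ids:
            self._seen_ids.add(packet_id)
            self.processed.append(payload)
        if self.acks_to_lose > 0:
            self.acks_to_lose -= 1
            return False
        return True


class QoS1Broker:
    """Keeps each message until the client acknowledges it."""

    def __init__(self):
        self._next_id = 1
        self.inflight = {}  # packet id -> payload awaiting PUBACK

    def publish(self, payload: str) -> int:
        packet_id = self._next_id
        self._next_id += 1
        self.inflight[packet_id] = payload
        return packet_id

    def deliver_pending(self, client: FlakyClient) -> None:
        for packet_id, payload in list(self.inflight.items()):
            if client.receive(packet_id, payload):
                del self.inflight[packet_id]  # acknowledged: safe to forget


broker = QoS1Broker()
client = FlakyClient(acks_to_lose=1)
broker.publish("hi")
broker.deliver_pending(client)  # message arrives, acknowledgement is lost
broker.deliver_pending(client)  # broker retries, acknowledgement arrives

print(client.arrivals, client.processed, broker.inflight)
```

The message arrives twice but is processed once, which is the contract that "at least once" implies.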

The Lightspeed Sync Protocol

Meta developed an internal protocol called Lightspeed for synchronising messaging state across devices. Lightspeed replaced the older approach of fetching messages from a server-side store and instead uses a log-based sync model:

  • Each conversation has a monotonically increasing sequence of events (messages, reactions, read receipts, typing indicators).
  • Each client maintains a local database (SQLite on mobile) with its view of the conversation state.
  • When the client connects, it sends its current sync token (a cursor indicating the last event it has seen) to the server.
  • The server sends all events since that cursor.
  • The client applies those events to its local database.

This model handles multi-device synchronisation cleanly: every device independently syncs from the server's event log. It also handles offline scenarios: if a phone is off for two days, it catches up by replaying the events it missed.
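A cursor-based sync of this kind can be sketched in a few lines. The class names and the use of a simple integer as the sync token are assumptions for illustration; Meta's actual token format and storage schema are not described here.

```python
class EventLog:
    """Server side: an append-only, sequence-numbered log for one conversation."""

    def __init__(self):
        self._events = []

    def append(self, event: str) -> int:
        self._events.append(event)
        return len(self._events)  # sequence numbers start at 1

    def since(self, cursor: int):
        """All (sequence number, event) pairs after the given cursor."""
        return list(enumerate(self._events[cursor:], start=cursor + 1))


class Client:
    """Device side: a local copy plus a cursor marking the last event applied."""

    def __init__(self):
        self.cursor = 0
        self.local_db = []

    def sync(self, log: EventLog) -> int:
        new_events = log.since(self.cursor)
        for seq, event in new_events:
            self.local_db.append(event)
            self.cursor = seq
        return len(new_events)


log = EventLog()
phone, laptop = Client(), Client()

log.append("message: hello")
log.append("reaction: thumbs up")
phone.sync(log)                    # phone is online and up to date

log.append("message: are you there?")
log.append("read receipt")         # phone is off while these arrive

caught_up = phone.sync(log)        # phone replays only what it missed
laptop.sync(log)                   # a new device replays from the beginning

print(caught_up, phone.local_db == laptop.local_db)  # 2 True
```

Each device converges on the same state regardless of when it last connected, because the log, not any device, is the source of truth.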

End-to-End Encryption

Messenger's E2E encryption (rolled out as default in late 2023) uses the Labyrinth protocol, which is based on the Signal Protocol but modified for Meta's multi-device requirements. Key differences from Signal's approach:

  • Device groups: rather than encrypting a message N times for N recipient devices, Labyrinth uses a group key distribution mechanism that reduces the per-message overhead.
  • Epoch-based key management: keys are rotated on "epoch" boundaries (when devices are added or removed), rather than on every message exchange as in the Double Ratchet.
  • Server-side encrypted storage: unlike Signal (which stores minimal data server-side), Messenger stores encrypted message history on Meta's servers, encrypted with keys that only the user's devices can derive.

Cross-App Messaging

Meta has unified the backend so that Messenger and Instagram DM users can message each other. From a protocol perspective, this means the Lightspeed sync protocol handles routing across app boundaries, and the E2E encryption spans both apps (using the same Labyrinth key management). The user experience presents them as separate apps, but the underlying infrastructure is shared.

Delivery Indicators

Messenger's delivery indicators:

  • Sent (circle outline): the message has been sent from the client to Meta's server.
  • Delivered (filled circle with checkmark): the message has been delivered to the recipient's device.
  • Read (recipient's avatar): the recipient's app reported that the message was displayed on screen.

Instagram DMs use a similar system with "Seen" replacing the avatar indicator.

9. Viber: Proprietary Protocol with Signal-Inspired Encryption

Viber is particularly popular in Eastern Europe and parts of the Mediterranean. It uses a proprietary binary protocol and has implemented end-to-end encryption since 2016.

Protocol Architecture

Viber's transport layer uses a proprietary binary protocol over TCP. Unlike WhatsApp (Noise Pipes) or Signal (WebSocket), Viber does not use a well-documented public protocol for its transport. The client maintains a persistent TCP connection to Viber's servers for message delivery.

For voice and video calls, Viber uses UDP with its own RTP-like protocol for media transport, falling back to TCP through relay servers when direct UDP connectivity is blocked (common in corporate networks and some European ISPs that aggressively filter UDP).

Encryption

Viber's E2E encryption, introduced in April 2016, draws from Signal Protocol concepts. It uses:

  • Curve25519 for key exchange.
  • Salsa20 for message-body encryption, according to Viber's published encryption overview (note: a stream cipher, whereas Signal and WhatsApp encrypt message bodies with AES-256 in CBC mode).
  • HMAC-SHA256 for message authentication.

Each Viber user has an identity key pair, and encryption is established per-conversation. Viber uses a double-ratchet-like mechanism for forward secrecy, though the exact implementation details are not publicly documented to the same degree as the Signal Protocol.

Group chats in Viber are also E2E encrypted. The group key management uses a sender-keys approach: each participant generates a sender key, distributes it to all other participants encrypted with their pairwise keys, and then uses the sender key for efficient group message encryption.

Delivery and Read Indicators

Viber has a three-tier delivery system:

  • Sent (one grey checkmark): the message was sent to Viber's server.
  • Delivered (two grey checkmarks): the message was delivered to the recipient's device. This is confirmed by the recipient's Viber client sending a delivery acknowledgement back through the server.
  • Seen (two blue checkmarks): the recipient's app reported that the message was viewed.

Server Queue and Offline Delivery

When a recipient is offline, Viber's servers queue messages for delivery. The recipient's push notification (via FCM or APNs) wakes the app, which connects to the server and downloads queued messages. Viber also supports a "Viber Out" feature for calling non-Viber phone numbers, which uses traditional VoIP SIP trunking to bridge to the PSTN, a completely different delivery model from in-app messaging.
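The queue-then-wake pattern described above can be sketched as follows. Viber's server design is not public, so everything here is an assumption for illustration: the `RelayServer` class, its method names, and the `push` callback that stands in for FCM or APNs.

```python
from collections import defaultdict, deque


class RelayServer:
    """Hypothetical store-and-forward relay; not any vendor's real design."""

    def __init__(self, push):
        self.queues = defaultdict(deque)  # recipient -> pending ciphertexts
        self.online = {}                  # recipient -> deliver callback
        self.push = push                  # stand-in for FCM/APNs

    def submit(self, recipient, ciphertext):
        deliver = self.online.get(recipient)
        if deliver is not None:
            deliver(ciphertext)           # live connection: forward immediately
        else:
            self.queues[recipient].append(ciphertext)
            self.push(recipient)          # wake-up hint only; the message stays queued

    def connect(self, recipient, deliver):
        self.online[recipient] = deliver
        queue = self.queues[recipient]
        while queue:                      # drain the backlog in arrival order
            deliver(queue.popleft())

    def disconnect(self, recipient):
        self.online.pop(recipient, None)


pushes, inbox = [], []
server = RelayServer(push=pushes.append)

server.submit("sofia", b"ct-1")           # Sofia is offline: queued, push sent
server.submit("sofia", b"ct-2")
server.connect("sofia", inbox.append)     # the app wakes, connects, drains the queue
server.submit("sofia", b"ct-3")           # now online: delivered directly
```

The push notification is only a hint to connect. If it never arrives, the messages are still in the queue and are delivered the next time the client connects for any reason.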

10. What "Delivered" and "Read" Actually Mean at the Protocol Level

The words "delivered" and "read" are used casually, but they correspond to very specific events in the protocol stack. The gap between a WiFi ACK and a "read" receipt spans at least six distinct confirmation points. Tracing a message from Dimitris in Athens to Sofia in Berlin:

Layer-by-Layer Confirmation

1. WiFi Layer 2 ACK (Dimitris's phone to his router)

When Dimitris's phone transmits the frame containing (part of) his message, his router's WiFi radio replies with a MAC-layer ACK after a short interframe space (SIFS) of roughly 10 to 16 microseconds, depending on the band. This confirms that the frame crossed the radio link and passed its CRC check.

Scope: 802.11 frame received by the access point. Does not confirm forwarding, routing, or delivery.

2. TCP ACK (Dimitris's phone to WhatsApp server)

TCP running on Dimitris's phone receives an acknowledgement from WhatsApp's server, confirming that the bytes were received by the server's TCP stack.

Scope: bytes arrived at the server's kernel buffer. Does not confirm that the application read them or stored them.

3. Application-layer server acknowledgement (WhatsApp server to Dimitris's phone)

WhatsApp's server processes the message (stores the ciphertext, determines routing) and sends an application-layer acknowledgement back to Dimitris's client. This triggers the single grey checkmark.

Scope: the server accepted the message and takes responsibility for delivering it. If Sofia's phone is offline, the server will queue the message.

4. Server pushes to Sofia's device

When Sofia's device is online (WebSocket/persistent TCP connection is active), the server pushes the encrypted message. If she is offline, the server sends a push notification via FCM/APNs to wake her device, which then connects and downloads the message.

Scope: data in transit to the recipient.

5. Delivery receipt (Sofia's device to server to Dimitris's phone)

Sofia's WhatsApp client receives the ciphertext, decrypts it successfully, and stores it locally. The client then sends a delivery receipt message back through the server to Dimitris. This triggers the double grey checkmarks.

Scope: the message has been decrypted and stored on the recipient's device. The user has not necessarily seen it. The app might be running in the background.

6. Read receipt (Sofia's device to server to Dimitris's phone)

Sofia opens the conversation in WhatsApp. The app detects that the message is visible on screen and sends a read receipt. This triggers the double blue checkmarks.

Scope: the message was rendered in the UI and the user's screen displayed it. Whether Sofia actually read and understood the message is, of course, beyond the reach of any protocol.
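From the sender's side, only steps 3, 5, and 6 ever change what the user sees. A minimal sketch of that status tracking, assuming receipts can arrive late or out of order (the `Status` enum and `OutgoingMessage` class are hypothetical names, not WhatsApp's code):

```python
from enum import IntEnum


class Status(IntEnum):
    """Sender-visible states. WiFi and TCP ACKs deliberately do not appear here."""
    SENDING = 0    # no application-layer ack yet
    SENT = 1       # step 3: server acknowledgement
    DELIVERED = 2  # step 5: delivery receipt from the recipient's device
    READ = 3       # step 6: read receipt


class OutgoingMessage:
    def __init__(self, text):
        self.text = text
        self.status = Status.SENDING

    def on_receipt(self, receipt: Status) -> bool:
        """Apply a receipt; return True if the displayed status changed."""
        if receipt > self.status:   # only ever move forward
            self.status = receipt
            return True
        return False                # stale or duplicate receipt: ignore


msg = OutgoingMessage("Are you free for dinner?")
# The delivery receipt is delayed and arrives after the read receipt.
changes = [msg.on_receipt(r) for r in (Status.SENT, Status.READ, Status.DELIVERED)]
```

Treating the status as monotonic means a late delivery receipt cannot downgrade a message that is already marked as read.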

Comparing Across Apps

| Event | WhatsApp | Telegram | Signal | Messenger | Viber |
|---|---|---|---|---|---|
| Server received | 1 grey ✓ | 1 ✓ | Sent indicator | Circle outline | 1 grey ✓ |
| Delivered to device | 2 grey ✓✓ | (not shown) | Delivered indicator | Filled circle + ✓ | 2 grey ✓✓ |
| Read by user | 2 blue ✓✓ | 2 ✓✓ | Read (if enabled) | Recipient avatar | 2 blue ✓✓ |

Note the differences: Telegram has no "delivered to device" state at all. One checkmark means the message reached Telegram's cloud, and two mean the recipient opened it; because an account can be active on many devices at once, Telegram does not report per-device delivery. Signal always shows delivery status but makes read receipts optional for privacy: they appear only when both parties have them enabled. Messenger shows the recipient's avatar photo as the "seen" indicator, giving the most personal confirmation.

The Critical Gap

The distance between a WiFi ACK (step 1) and a read receipt (step 6) can range from milliseconds (both devices online, fast network) to days (recipient's phone off, queued on server, push notification delayed). Every messaging app exists to bridge this gap. The WiFi layer does not know about messages. TCP does not know about messages. The server does not know the message content (in E2E-encrypted apps). Only the endpoints (the sender and recipient apps) can confirm that the message was composed, encrypted, transmitted, stored, forwarded, received, decrypted, displayed, and seen.

This is the end-to-end argument made concrete. The network provides best-effort byte delivery. Everything else is the application's problem.

11. Failure Modes: Where Messages Get Lost

Understanding the protocol stack also means understanding where it breaks down. Messages fail to arrive for reasons that span every layer, and the failure mode determines what the user sees.

Flaky WiFi

The most common source of message delivery delay (not loss, usually, but delay) is unstable WiFi. A phone at the edge of WiFi coverage might experience:

  • High frame error rate: the radio link is marginal, and many frames fail CRC checks. The MAC layer retries, up to 7 times for short frames. If all retries fail, the frame is dropped and the upper layer (TCP) must retransmit, adding hundreds of milliseconds of delay.
  • Beacon loss: if the phone cannot receive beacons from the AP, it may decide the AP is unreachable and disassociate. The phone then reassociates (if it can) or switches to cellular data. During this transition, in-flight packets may be lost.
  • Channel interference: in a dense European apartment block in Athens or Amsterdam, dozens of WiFi networks overlap on the same channels. Co-channel interference causes frame collisions, which cause backoff and retries, which increase latency and reduce throughput.

In all these cases, the WiFi layer is unreliable, but TCP (or QUIC) eventually recovers. The message is delayed, not lost. The user sees the message "stick" in a sending state for a few seconds before the single checkmark (sent to server) finally appears.
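A rough calculation shows why MAC retries hide most radio errors yet stop helping at the edge of coverage. The sketch below assumes each transmission attempt fails independently with the same probability (real channels are burstier) and reads the retry limit as one initial attempt plus seven retries; the function names are made up for the example.

```python
def drop_probability(frame_error_rate: float, retries: int = 7) -> float:
    """Chance that the initial attempt and every retry all fail."""
    attempts = 1 + retries
    return frame_error_rate ** attempts


def expected_attempts(frame_error_rate: float, retries: int = 7) -> float:
    """Mean number of transmissions spent on one frame.

    Attempt k+1 only happens if the first k attempts failed, so the
    expectation is the sum of p**k for k = 0..retries.
    """
    return sum(frame_error_rate ** k for k in range(retries + 1))


for p in (0.1, 0.5, 0.9):
    print(f"error rate {p:.0%}: "
          f"dropped {drop_probability(p):.6f}, "
          f"mean attempts {expected_attempts(p):.2f}")
```

Under these assumptions a link that loses half of all frames still drops only about 1 frame in 256 after retries, while a link that loses 90% drops more than four in ten, and TCP has to pick up the difference.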

WiFi-to-Cellular Handoff

When a user walks out of WiFi range, their phone switches to cellular data. This transition is not seamless at the TCP level:

  1. The phone's WiFi interface drops the connection.
  2. The phone activates cellular data (if enabled).
  3. The phone gets a new IP address from the cellular network.
  4. All TCP connections bound to the old WiFi IP address are now broken. The TCP stack may wait for the RTO to expire before detecting the failure, which can take 1 to 60 seconds depending on the RTO estimate.
  5. The messaging app detects the broken connection and establishes a new one.

During this window, messages sent to the user may be queued on the server. Messages the user is sending will time out and be retransmitted on the new connection.
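One common way to make that retransmission safe is to give every message a client-generated ID and keep it in a local outbox until an application-layer ack arrives. The sketch below is an assumption about how such logic could look, not a description of any particular app; `Client`, `Server`, and `dying_wifi` are hypothetical names.

```python
import uuid


class Server:
    def __init__(self):
        self.stored = {}  # message id -> body

    def receive(self, msg_id, body):
        # Idempotent: a retransmitted id is acked again but never stored twice.
        self.stored.setdefault(msg_id, body)
        return msg_id     # application-layer ack


class Client:
    def __init__(self):
        self.outbox = {}  # message id -> body, kept until acked

    def compose(self, body):
        msg_id = str(uuid.uuid4())
        self.outbox[msg_id] = body
        return msg_id

    def flush(self, send):
        """Send everything still unacked over the current connection."""
        for msg_id, body in list(self.outbox.items()):
            try:
                acked = send(msg_id, body)
            except ConnectionError:
                return False          # connection died; keep the rest for later
            self.outbox.pop(acked, None)
        return True


server, client = Server(), Client()
client.compose("Are you free for dinner?")


def dying_wifi(msg_id, body):
    """Simulated WiFi path: the request arrives but the ack is lost."""
    server.receive(msg_id, body)
    raise ConnectionError("ack lost during handoff")


first_try = client.flush(dying_wifi)       # fails; the message stays in the outbox
second_try = client.flush(server.receive)  # new cellular connection: resend
```

The client cannot know whether the first attempt reached the server, so it resends; the message ID lets the server recognise the duplicate instead of delivering the message twice.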

QUIC handles this better because of connection migration: the QUIC Connection ID is independent of the IP address, so the connection can survive an address change without a new handshake. Apps using QUIC (or their own connection resumption logic over TCP) can make this transition faster.

Some mobile operating systems and chipsets support "WiFi-to-cellular handover" at the system level, keeping both interfaces active during the transition and migrating flows. Android's "seamless connectivity" and Apple's WiFi Assist both attempt this, with varying degrees of success.

Recipient Offline for Days

When Sofia's phone is off (battery dead, aeroplane mode, no SIM card in a new country), messages accumulate on the server:

  • WhatsApp: stores encrypted messages for up to approximately 30 days. After that, they are discarded and the sender sees no notification. The single grey checkmark remains indefinitely.
  • Telegram: cloud chats are stored permanently on Telegram's servers. The message is "delivered" to Telegram's cloud immediately (the sender sees the single checkmark), and when Sofia comes online, her client syncs from the cloud. Secret Chat messages, which are not stored on the server, may be lost if the recipient's device is unreachable for too long.
  • Signal: stores encrypted messages for a limited time (reportedly a few days to a few weeks, depending on server capacity and policy). If the recipient does not come online, the message may expire from the server.
  • Messenger: stores encrypted messages on Meta's servers; these persist long-term (Meta's infrastructure is designed for durable storage). Messages are available whenever the recipient logs in.
  • Viber: queues messages on the server. The retention period is not publicly documented, but messages reportedly persist for at least several days.
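A retention limit like these amounts to a timestamp check when the queue is drained. The sketch below uses the roughly 30-day figure quoted for WhatsApp purely as an example value; the `ExpiringQueue` class and its injectable clock are assumptions for illustration and do not reflect how any of these services store messages.

```python
import time

DAY = 86_400  # seconds


class ExpiringQueue:
    """Hypothetical per-recipient queue that forgets messages older than a TTL."""

    def __init__(self, ttl_seconds, clock=time.time):
        self.ttl = ttl_seconds
        self.clock = clock   # injectable so the example can fake the passage of time
        self.items = []      # (enqueued_at, ciphertext)

    def put(self, ciphertext):
        self.items.append((self.clock(), ciphertext))

    def drain(self):
        """Return unexpired messages in arrival order and empty the queue."""
        cutoff = self.clock() - self.ttl
        live = [c for enqueued_at, c in self.items if enqueued_at >= cutoff]
        self.items.clear()
        return live


now = [0.0]                                   # fake clock, in seconds
queue = ExpiringQueue(ttl_seconds=30 * DAY, clock=lambda: now[0])

queue.put(b"sent on day 0")
now[0] += 29 * DAY
queue.put(b"sent on day 29")
now[0] += 2 * DAY                             # the recipient reconnects on day 31
delivered = queue.drain()
```

The expired message simply disappears. Nothing in this model tells the sender, which matches the behaviour described above: the single checkmark just never changes.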

Server Outages

Messaging servers go down. WhatsApp experienced a global outage in October 2021 caused by a BGP misconfiguration at Facebook (now Meta), which made WhatsApp's servers unreachable for approximately six hours. During that time:

  • No messages could be sent (the app could not reach the server to get even the first checkmark).
  • Messages typed during the outage were queued locally on the sender's device.
  • When service was restored, devices reconnected and flushed their local queues.
  • No messages were lost (because they were queued locally), but delivery was delayed by hours.

Telegram has experienced smaller outages, typically related to ISP-level blocks in specific countries (Russia, Iran) rather than infrastructure failures. Signal suffered an outage lasting roughly a day in January 2021 when a surge of new users (driven by WhatsApp's privacy policy controversy) overwhelmed its servers.

Push Notification Failures

When a messaging app is not in the foreground and does not have an active connection, it relies on the OS push notification service (FCM on Android, APNs on iOS) to wake it up. Push notifications can fail:

  • FCM/APNs server overload: during high-traffic events, push delivery can be delayed by minutes.
  • Doze mode / App Standby (Android): Android's battery optimisation can delay or batch push notifications for apps that are not whitelisted. A messaging app in Doze mode might not receive a push notification until the device's next maintenance window, which can be 15 minutes or more.
  • APNs token invalidation (iOS): if the user reinstalls the app or restores from a backup, the APNs device token may change. Until the app registers the new token with the messaging server, push notifications will fail silently.
  • Network restrictions: some enterprise WiFi networks or cellular carriers block or throttle the connections to FCM/APNs servers. In China, Google services (including FCM) are blocked entirely, which is why Chinese Android phones use manufacturer-specific push services (Huawei Push Kit, Xiaomi Mi Push, etc.).

When push fails, the message sits on the server. The next time the user opens the app manually, it connects and retrieves queued messages. The user may perceive this as "I didn't get a notification for 30 minutes," which is correct; the message was on the server the entire time, but the phone was never told to wake up and fetch it.

DNS and Routing Failures

Less common but more catastrophic: DNS resolution failures can prevent the messaging app from finding the server at all. If the DNS resolver returns an incorrect or stale IP address, the app connects to the wrong server (or fails to connect). This can happen during DNS cache poisoning attacks, misconfigured resolvers, or ISP-level DNS outages.

BGP routing failures can make entire IP prefixes unreachable. The October 2021 Meta outage was caused by a BGP withdrawal that removed Meta's IP address blocks, including those of its authoritative DNS servers, from the global routing table. Every device trying to reach WhatsApp, Facebook, or Instagram first saw its DNS lookups fail (resolvers could no longer reach Meta's name servers), and even a client holding a cached, correct IP address would have seen its TCP connections time out, because no router knew how to reach those IPs.
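From the client's point of view these two failure classes look different, and an app can tell them apart with standard socket calls. The sketch below is a minimal illustration: the `diagnose` function and its return labels are made up, and the `resolve` and `connect` parameters exist only so the logic can be exercised without a network.

```python
import socket


def diagnose(host, port=443, timeout=3.0,
             resolve=socket.getaddrinfo, connect=socket.create_connection):
    """Classify a connection attempt as 'dns-failure', 'unreachable', or 'ok'."""
    try:
        infos = resolve(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "dns-failure"        # the name could not be resolved at all

    address = infos[0][4][:2]       # (ip, port) from the first result
    try:
        connect(address, timeout=timeout).close()
    except OSError:                 # timeout, no route, connection refused, ...
        return "unreachable"        # the name resolved but the server did not answer
    return "ok"


if __name__ == "__main__":
    # "example.com" is just a placeholder host for trying the function by hand.
    print(diagnose("example.com"))
```

A "dns-failure" result points at the resolver or the name servers; "unreachable" with a valid address points at routing, filtering, or the server itself. Neither tells the app anything about whether a previously sent message was stored.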

12. Putting It All Together: Why Reliability Lives at the Endpoints

The architecture of every major messaging platform confirms the end-to-end argument in a way that Saltzer, Reed, and Clark probably never imagined in 1984. Consider the full chain of events when Dimitris sends "Are you free for dinner?" from his phone in Athens to Sofia's phone in Berlin:

  1. WhatsApp encrypts the message using the Double Ratchet, producing ciphertext that only Sofia's device can decrypt.
  2. The ciphertext is sent over a Noise Pipes connection (TLS-like encryption of the transport) to WhatsApp's server over TCP.
  3. TCP ensures the bytes reach the server reliably (retransmitting as needed), but TCP has no idea it is carrying a chat message.
  4. The WiFi layer on Dimitris's phone transmits frames and gets MAC-layer ACKs from his router, but WiFi has no idea what is in the frames.
  5. IP routes packets across autonomous systems between Athens and WhatsApp's server, but IP has no idea where the packets are going at the application level.
  6. WhatsApp's server stores the ciphertext and forwards it to Sofia's device (or queues it if she is offline).
  7. Sofia's phone receives the ciphertext, decrypts it, stores it locally, and sends a delivery receipt.
  8. Sofia opens the chat, sees the message, and a read receipt is sent.

At no point does any network layer, from WiFi to IP to TCP, guarantee that Sofia will see the message. WiFi attempts one-hop frame delivery (with a bounded number of retries). IP guarantees nothing. TCP guarantees byte-stream delivery to the server's kernel. Only the application layer, spanning WhatsApp's client code, its server infrastructure, and the client code on the other end, provides the end-to-end confirmation that the message was composed, encrypted, transmitted, stored, forwarded, received, decrypted, and displayed.

The implications are significant:

For users: the checkmark system in your messaging app is not decoration. It is the only real signal of delivery status. A message with one checkmark has been accepted by the server. A message with two checkmarks has reached the device. Anything less means the message is still somewhere in the pipeline, and you cannot tell where by looking at your WiFi signal strength or cellular bars.

For developers building messaging systems: you must implement application-layer acknowledgement if you want to tell your users whether their message arrived. TCP ACKs are not sufficient. You need the recipient's application to confirm receipt, and ideally, you need a store-and-forward architecture so messages survive recipient unavailability.
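On the receiving side, the key discipline is to acknowledge only after the message is durably stored, and to make storage idempotent so that sender retries are harmless. A minimal sketch, assuming SQLite as the local store; the `Receiver` class, the `inbox` table, and the receipt dictionary are hypothetical and chosen only for the example.

```python
import sqlite3


class Receiver:
    """Hypothetical recipient-side handler: store first, acknowledge second."""

    def __init__(self, db_path=":memory:"):
        self.db = sqlite3.connect(db_path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS inbox (msg_id TEXT PRIMARY KEY, body TEXT)"
        )

    def on_message(self, msg_id, body):
        """Persist the message, then return the receipt to send to the sender."""
        with self.db:  # commits on success, rolls back if an exception is raised
            self.db.execute(
                "INSERT OR IGNORE INTO inbox (msg_id, body) VALUES (?, ?)",
                (msg_id, body),
            )
        # Reached only after the commit: the receipt never claims more than is true.
        return {"type": "delivered", "msg_id": msg_id}

    def count(self):
        return self.db.execute("SELECT COUNT(*) FROM inbox").fetchone()[0]


receiver = Receiver()
first = receiver.on_message("m-1", "Are you free for dinner?")
again = receiver.on_message("m-1", "Are you free for dinner?")  # the sender retried
```

If the write fails, no receipt is produced, so the sender keeps the message in its own queue and tries again. That ordering is what a TCP ACK cannot express: it is sent by the kernel before the application has done anything with the bytes.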

For protocol designers: the end-to-end argument does not mean the network should provide no features. WiFi MAC retries, TCP retransmission, and QUIC's stream multiplexing all improve the performance of the end-to-end path. But performance improvements in the network cannot replace correctness guarantees at the endpoints. The Signal Protocol, Noise Pipes, Lightspeed, MTProto: these all exist because IP and TCP provide a necessary but insufficient foundation.

For privacy: there is a deep tension between reliability and privacy. To provide reliable delivery, someone must store the message while the recipient is offline. In E2E-encrypted systems (WhatsApp, Signal), the server stores opaque ciphertext. In cloud-first systems (Telegram's default), the server stores data it can decrypt. The end-to-end argument has a privacy corollary: the less the network knows about the data it carries, the less it can abuse. Sealed sender, encrypted message stores, and minimal metadata retention are all techniques for keeping the network as dumb as possible while still providing reliable delivery.

The internet was designed to be a best-effort network. Every messaging app you use is a testament to how much work it takes to build reliability, privacy, and usability on top of that foundation. The next time your WhatsApp message hangs on one checkmark, you are watching the end-to-end argument play out in real time: somewhere between your WiFi chip and your friend's phone, the network is doing its best, and your app is doing the rest.