This article is 4,500 words long, taking about 10 minutes for a casual read and 30 minutes for a thorough read. The content includes information about WireGuard.
Recently, during an internal BBL session with my team, I gave a talk on WireGuard. WireGuard (hereafter WG), a representative of the new generation of VPNs, is likely familiar to many tech enthusiasts. Like other VPN technologies, it lets us establish a secure channel between home and company networks and thereby access "intranet" data and applications.
Before diving into WG, let's first abstract the general requirements for a VPN:
- Security: Ensuring the data between two private networks can be securely transmitted over an insecure network such as the Internet
- Authenticity: The visitor is a legitimate user, accessing the correct network
- Efficiency: Enabling the VPN doesn't significantly slow down network access, and the tunnel setup should be fast
- Stealthiness: Third parties should not easily sniff out the presence of the gateway
- Accessibility: Easy to configure, easy to turn on and off
For the ability to transmit data securely over insecure networks, we must thank Martin E. Hellman, Bailey W. Diffie, and Ralph C. Merkle. Their patent, Cryptographic apparatus and method, introduced the widely used DH algorithm for key exchange.
This algorithm relies on properties of congruence (modular arithmetic) and the commutativity of multiplication: each side raises a shared base to its own secret exponent, and because (g^a)^b and (g^b)^a are congruent modulo p, both sides arrive at the same value. The process is simple, and those interested can refer to Wikipedia. WG uses ECDH, a variant of the DH algorithm that uses elliptic curves to improve performance and security.
Through the DH algorithm, both ends of the network can negotiate a key over an insecure network for encrypting the data to be transmitted. Subsequently, the data stream can be encrypted efficiently with this key using symmetric encryption.
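To make the key agreement above concrete, here is a minimal sketch of classic (non-elliptic-curve) DH using only modular exponentiation. The modulus `P`, the base `G`, and the function names are assumptions chosen for illustration; this is a toy and is not how WG (which uses ECDH on Curve25519) actually does it.

```python
import secrets

# Toy public parameters (an assumption for illustration only):
# a Mersenne prime modulus and a small base. Real systems use vetted
# groups or elliptic curves.
P = 2**127 - 1
G = 3


def make_keypair():
    """Return (private, public) where public = G^private mod P."""
    private = secrets.randbelow(P - 3) + 2  # uniformly in [2, P - 2]
    public = pow(G, private, P)
    return private, public


def shared_secret(my_private, their_public):
    """Both sides compute the same value: (G^a)^b = (G^b)^a mod P."""
    return pow(their_public, my_private, P)


# Only the public halves travel over the insecure network.
alice_priv, alice_pub = make_keypair()
bob_priv, bob_pub = make_keypair()

alice_secret = shared_secret(alice_priv, bob_pub)
bob_secret = shared_secret(bob_priv, alice_pub)
```

Both `alice_secret` and `bob_secret` end up equal even though neither private value was ever transmitted; that shared value is what a symmetric key would then be derived from.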
With security addressed, how do we solve the issue of identity authentication at both ends of the network? Currently, there are two general solutions to this issue:
- Pre-shared key
- Certificate
For example, when we access a bank's website, the browser verifies the bank's certificate to ensure that the site we are accessing is indeed the one we intend to visit. When a company's headquarters and branch networks need to communicate, they can pre-configure each other's public keys and authenticate each other through digital signatures. This is a variant of the pre-shared key approach (pure pre-shared keys don't satisfy forward secrecy and should almost never be used in communications).
Once the issues of security and identity authenticity are solved, the most important VPN problems are resolved. Our current VPN solutions, whether IPSec VPN operating at the network layer or SSL/TLS/OpenVPN at the session layer, all utilize the algorithms discussed above for key exchange and authentication. Their complexity largely stems from handling configurations, encryption algorithm negotiations, and various compatibility issues. Meanwhile, WG, though not innovative in algorithms, cleverly organizes requirements and implements alternative approaches, resulting in breathtaking simplicity.
Here's a comparison of code volume:
WG achieves its implementation with just 4k lines of kernel code! It's so ingeniously crafted. It may sound a bit disrespectful, but by comparison OpenVPN or StrongSwan looks like the product of an outsourcing shop that bills by the line of code, whereas WG is the masterpiece of a real programmer! Linus himself was full of praise for WG, writing in an email on August 2, 2018:
Btw, on an unrelated issue: I see that Jason actually made the pull request to have wireguard included in the kernel. Can I just once again state my love for it and hope it gets merged soon? Maybe the code isn't perfect, but I've skimmed it, and compared to the horrors that are OpenVPN and IPSec, it's a work of art. Linus
Notably, Linus's usual style of commenting on code is like this (Mauro is a kernel maintainer):
"It's a bug alright - in the kernel. How long have you been a maintainer? And you still haven't learnt the first rule of kernel maintenance?" "Shut up, Mauro. And I don't ever want to hear that kind of obvious garbage and idiocy from a kernel maintainer again. Seriously."
So getting Linus to "state my love" is as difficult as climbing to the moon. So let's study WG devotedly: the way it approaches product and code is worth learning from in depth!
The Concept of WireGuard Interface
Let's start with the concept.
Many product managers don't bother to clearly explain the new and old concepts within a product, especially when creating new ones. This is very wrong. From the beginning of architecture design, the product should have all its concepts clarified. When existing concepts cannot adequately describe parts of the product, we should have the courage to create new concepts so that the description is complete. Concepts form the basis of communication among engineers and between engineers and the outside world. Communicating through mutually agreed concepts is more precise and efficient. For example, once I have described ECDH as a variant of the DH algorithm that uses elliptic curves, I don't need to re-explain it every time I mention ECDH. Once a new concept is created, we can attach many attributes to it to distinguish it from other concepts.
WG first defines an important concept: the WireGuard Interface (hereafter wgi). Why do we need wgi? Why aren't existing tunnel interfaces suitable? A wgi is a special interface:
- It has its private key (curve25519)
- It has a UDP port for listening to data
- It has a group of peers (peer is another important concept), with each peer's identity confirmed through its public key
By defining this new kind of interface, WG distinguishes wgi from a regular tunnel interface. With such an interface definition, the mapping of other data structures and the sending and receiving of data become clear and straightforward.
Let's look at the WG interface configuration:
    [Interface]
    Address = 10.1.1.1/24
    ListenPort = 12345
    PrivateKey = blablabla

    [Peer]
    PublicKey = IWNVZYx0EacOpmWJq6lE8RfcFBd8EeUliOi+uYKQfG8=
    AllowedIPs = 0.0.0.0/0,::/0
    Endpoint = 1.1.1.1:54321
The initiator and responder of a WG VPN tunnel are symmetrical, so there's no client/server or spoke/hub distinction as in typical VPNs. The configurations are therefore symmetrical as well.
This configuration tells us more about the peer concept: a peer is the counterpart of a WG node, with a statically configured public key, a whitelist of the networks behind the peer (AllowedIPs), and the peer's address and port (Endpoint; not always required, and it may change automatically as the peer roams between networks).
In just 9 lines of configuration, we describe the simplest VPN network. This configuration doesn't include endless certificate setup or long, hard-to-understand content, nor does it require setting up a CA. If you've had the misfortune of configuring IPSec VPN or OpenVPN, you will marvel at how much simplicity drives productivity.
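As a rough illustration of how little there is to parse, the following sketch reads the configuration shown above with Python's standard `configparser` and `ipaddress` modules. The function name `parse_single_peer_config` and the returned dictionary layout are assumptions made for this example, not part of any WG tooling, and the sketch only handles a single `[Peer]` section (a real configuration may repeat `[Peer]`, which `configparser` rejects in its default strict mode).

```python
import configparser
import ipaddress

CONFIG_TEXT = """
[Interface]
Address = 10.1.1.1/24
ListenPort = 12345
PrivateKey = blablabla

[Peer]
PublicKey = IWNVZYx0EacOpmWJq6lE8RfcFBd8EeUliOi+uYKQfG8=
AllowedIPs = 0.0.0.0/0,::/0
Endpoint = 1.1.1.1:54321
"""


def parse_single_peer_config(text):
    """Parse a config with exactly one [Interface] and one [Peer] section."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep option names case-sensitive
    parser.read_string(text)

    iface = parser["Interface"]
    peer = parser["Peer"]

    # Split "host:port" on the last colon.
    host, _, port = peer["Endpoint"].rpartition(":")
    allowed = [
        ipaddress.ip_network(item.strip())
        for item in peer["AllowedIPs"].split(",")
    ]
    return {
        "address": ipaddress.ip_interface(iface["Address"]),
        "listen_port": int(iface["ListenPort"]),
        "peer_public_key": peer["PublicKey"],
        "allowed_ips": allowed,
        "endpoint": (host, int(port)),
    }


config = parse_single_peer_config(CONFIG_TEXT)
```

Everything a node needs to know about its counterpart fits in three values: a public key, a list of networks, and an optional endpoint.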
From a data structure perspective, there's a hash table of peers and a hash table of key_index entries mounted below wgi. Through the key_index carried in a received data packet, we can immediately locate the peer. Each peer stores the state of the endpoint, the handshake state, and keypairs (three sets: the key currently in use, the key used before the last rekey, and the key to be used after the next rekey), with each keypair set holding one key for the receiving direction and one for the sending direction.
When the wgi interface is enabled (wg-quick up wg0), it gets initialized and its related peers are created; conversely (wg-quick down wg0), wgi stops running and the related peers are deleted. The data structure's outline is exceptionally clear.
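The layout described above might be modelled roughly as follows. This is a simplified sketch in Python rather than the kernel's C structures; every class, field, and method name here (`Keypair`, `Peer`, `WgInterface`, `install_keypair`, and so on) is hypothetical and chosen only to mirror the prose.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class Keypair:
    send_key: bytes  # key for the sending direction
    recv_key: bytes  # key for the receiving direction


@dataclass
class Peer:
    public_key: bytes
    endpoint: Optional[Tuple[str, int]] = None  # may change as the peer roams
    handshake_state: str = "idle"
    previous_keypair: Optional[Keypair] = None  # used before the last rekey
    current_keypair: Optional[Keypair] = None   # currently in use
    next_keypair: Optional[Keypair] = None      # to be used after the next rekey

    def install_keypair(self, keypair: Keypair) -> None:
        """Rotate keys: the current pair becomes the previous one."""
        self.previous_keypair = self.current_keypair
        self.current_keypair = keypair


@dataclass
class WgInterface:
    private_key: bytes
    listen_port: int
    peers_by_public_key: Dict[bytes, Peer] = field(default_factory=dict)
    peers_by_key_index: Dict[int, Peer] = field(default_factory=dict)

    def add_peer(self, peer: Peer) -> None:
        self.peers_by_public_key[peer.public_key] = peer

    def register_key_index(self, key_index: int, peer: Peer) -> None:
        self.peers_by_key_index[key_index] = peer

    def peer_for_packet(self, key_index: int) -> Optional[Peer]:
        """Locate the peer directly from the key_index carried in a packet."""
        return self.peers_by_key_index.get(key_index)
```

The point of the second table is that an incoming packet needs only one lookup to reach the right peer and its keys.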
The Process of Channel Negotiation Encryption with WireGuard
WG's simplicity is also reflected in how it negotiates encrypted tunnels. It uses the Noise Protocol Framework to build the negotiation process. The Noise Protocol Framework is an ingeniously designed framework for creating secure protocols; it won't be discussed here but will be introduced in a later article. WG uses Noise_IKpsk2_25519_ChaChaPoly_BLAKE2s. From the protocol name you can probably infer that it selects Curve25519 for ECDH, ChaChaPoly for symmetric encryption, and BLAKE2s for hashing. In IKE/SSL/TLS, these algorithms are usually negotiated between the parties. WG sees no need to negotiate and fixes them in the protocol, which drastically reduces the number of supported algorithms and eliminates the negotiation step. Because both ends are configured with each other's public keys, the tunnel can be established in 1-RTT (a single round trip), that is, 2 messages. Compared with IPSec's IKE protocol, which needs 6 messages in main mode (3-RTT) or at least 3 messages in aggressive mode (about 1.5-RTT), the benefit is clear. From Beijing to Seattle, one RTT is about 175 ms (cloudping.info), so each additional round trip adds a noticeable delay. For any protocol, reducing the number of round trips in tunnel negotiation can greatly improve performance.
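The latency comparison above is simple arithmetic, sketched below. The model is an assumption: each handshake message is counted as one one-way trip (half an RTT), and processing time is ignored, so three messages come out as one and a half round trips. The function name `handshake_delay_ms` is made up for this example.

```python
RTT_MS = 175  # Beijing to Seattle figure quoted in the text


def handshake_delay_ms(messages, rtt_ms=RTT_MS):
    """Time until the last handshake message arrives.

    Assumes strictly alternating messages, each taking half an RTT.
    """
    return messages * rtt_ms / 2


wg_delay = handshake_delay_ms(2)          # WG: 2 messages
ike_main_delay = handshake_delay_ms(6)    # IKE main mode: 6 messages
ike_aggr_delay = handshake_delay_ms(3)    # IKE aggressive mode: 3 messages

print(wg_delay, ike_main_delay, ike_aggr_delay)
```

Under these assumptions the WG handshake costs 175 ms on that link, while IKE main mode costs 525 ms before any user data can flow.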
1-RTT also implies connection-less operation, as there's no extra round of mutual confirmation. Compare connection-oriented TCP (whose three-way handshake is like making eye contact before speaking) with connection-less UDP. Connection-oriented networks have numerous benefits, but connection-less operation shines in its simplicity, like a fish with only a seven-second memory, free from the burdens of past, present, and future.
Connection-oriented protocols generally need a state table to record how far previous communication has progressed. This dynamically generated state table can easily become a target for DoS attacks, as with the SYN-flood problem that has plagued TCP since its inception. Connection-less protocols don't carry this burden: the server doesn't need to treat a client's handshake requests specially, doesn't need to worry about packet loss (with 1-RTT, just handshake again), doesn't need to manage timers for a table of half-open connections (since no such table exists), and so forth.
WG handshake packets encapsulate:
- unencrypted_ephemeral: The sender's temporary public key generated for this handshake (unencrypted, for ECDH)
- encrypted_static: The sender's static public key, encrypted with a temporary key (key1) derived via ECDH from the receiver's public key and the sender's temporary private key
- encrypted_timestamp: The current timestamp, encrypted with key2, which is obtained by mixing the ECDH result of the receiver's public key and the sender's static private key into key1
- mac1: Hash of the peer's public key and the entire message content
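The fields listed above can be laid out as a fixed-size binary message. The sketch below follows the handshake initiation layout in the public WireGuard protocol description, which also carries a message type, a `sender_index`, and a trailing `mac2` that this article does not cover; treat those extra fields, the field sizes, and the helper name `parse_initiation` as assumptions to verify against that description.

```python
import struct

# Little-endian layout (sizes in bytes):
#   message type (1) + reserved (3), sender_index (4),
#   unencrypted_ephemeral (32),
#   encrypted_static (32 + 16-byte AEAD tag),
#   encrypted_timestamp (12 + 16-byte AEAD tag),
#   mac1 (16), mac2 (16)
INITIATION = struct.Struct("<B3xI32s48s28s16s16s")


def parse_initiation(data):
    """Split a raw handshake initiation into named fields (no crypto here)."""
    if len(data) != INITIATION.size:
        raise ValueError("unexpected handshake initiation length")
    (msg_type, sender_index, ephemeral,
     enc_static, enc_timestamp, mac1, mac2) = INITIATION.unpack(data)
    return {
        "type": msg_type,
        "sender_index": sender_index,
        "unencrypted_ephemeral": ephemeral,
        "encrypted_static": enc_static,
        "encrypted_timestamp": enc_timestamp,
        "mac1": mac1,
        "mac2": mac2,
    }


# Build a dummy message with placeholder bytes and parse it back.
raw = INITIATION.pack(1, 42, b"\x01" * 32, b"\x02" * 48,
                      b"\x03" * 28, b"\x04" * 16, b"\x00" * 16)
fields = parse_initiation(raw)
```

Because every field has a fixed size and position, the receiver can slice the message without any length negotiation or parsing state.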
The receiving side first verifies mac1 (simple authentication; most hackers would fail here) and discards the message if it is incorrect. It then verifies encrypted_static (identity confirmation; without the private key, the hacker fails here again) and encrypted_timestamp (replay protection, so replay attacks fail here as well). Once the receiving side has checked that everything is in order, it creates its own temporary key pair. At this point, with the sender's temporary public key in hand, it can already calculate the keys that both sides will use to encrypt data after the handshake. However, it still needs to send a handshake reply message to give its temporary public key to the sender, so that the sender can calculate the same keys:
- unencrypted_ephemeral: The receiver's temporary public key generated for this handshake (unencrypted, for ECDH)
- mac1: Hash of the peer's public key and the entire message content
This way, both ends, each holding the other's temporary public key and its own temporary private key, can run ECDH + HKDF (HKDF is a method of deriving symmetric encryption keys from a DH result) to derive the symmetric keys for data encryption in both directions.
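The HKDF step mentioned above can be sketched with the standard library alone. This follows the generic extract-and-expand construction of RFC 5869 with HMAC-BLAKE2s; the function `derive_transport_keys`, its inputs, and the way a single DH result is turned directly into two keys are simplifying assumptions for illustration, not WG's exact key schedule.

```python
import hashlib
import hmac

HASH = hashlib.blake2s  # 32-byte output


def hkdf_extract(salt, input_key_material):
    """Concentrate the (possibly non-uniform) DH output into a fixed-size key."""
    return hmac.new(salt, input_key_material, digestmod=HASH).digest()


def hkdf_expand(pseudo_random_key, info, length):
    """Stretch the extracted key into `length` bytes of output key material."""
    output = b""
    block = b""
    counter = 1
    while len(output) < length:
        block = hmac.new(pseudo_random_key,
                         block + info + bytes([counter]),
                         digestmod=HASH).digest()
        output += block
        counter += 1
    return output[:length]


def derive_transport_keys(chaining_key, dh_result):
    """Return two independent 32-byte keys, one per direction (hypothetical helper)."""
    prk = hkdf_extract(chaining_key, dh_result)
    material = hkdf_expand(prk, b"", 64)
    return material[:32], material[32:]


# Placeholder inputs; in a real handshake these come from the protocol state.
key_a, key_b = derive_transport_keys(b"\x00" * 32, b"example-dh-output")
```

The initiator would use one of the two keys for sending and the other for receiving, and the responder the reverse, so each direction is encrypted under its own key.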
If there's packet loss, say the receiver didn't get the sender's handshake request or the sender didn't get the handshake reply, the whole process can simply restart. Since it's 1-RTT anyway, little time is wasted.
This process also takes stealthiness into account: the receiving party discards any unauthorized handshake (e.g., from peers it doesn't know, or retransmitted old messages). From the sender's perspective, such handshake packets appear to vanish into a black hole. Unless the hacker has been authorized by having their public key added as a peer on the WG gateway, they have virtually no chance of sniffing out the existence of the receiver. By contrast, other VPN protocols, such as IPSec's IKE or OpenVPN's SSL/TLS, can be sniffed out during the tunnel establishment phase.
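The "black hole" behaviour boils down to returning nothing whenever a check fails. Below is a minimal sketch of the first and last checks (mac1 and the timestamp), assuming the mac1 construction from the public WireGuard protocol description (a keyed BLAKE2s with a 16-byte output, keyed by a hash of a fixed label and the receiver's public key). The function names are hypothetical, and the timestamp is passed in already decrypted to keep the example short.

```python
import hashlib
import hmac

LABEL_MAC1 = b"mac1----"  # label taken from the protocol description


def compute_mac1(receiver_public_key, message_before_macs):
    """Keyed BLAKE2s over the message, keyed by a hash of label + public key."""
    key = hashlib.blake2s(LABEL_MAC1 + receiver_public_key).digest()
    return hashlib.blake2s(message_before_macs, key=key, digest_size=16).digest()


def should_process(receiver_public_key, message_before_macs, mac1,
                   last_timestamp, timestamp):
    """Return False (drop silently, send nothing back) on any failed check."""
    expected = compute_mac1(receiver_public_key, message_before_macs)
    if not hmac.compare_digest(expected, mac1):
        return False  # sender does not even know our public key
    if last_timestamp is not None and timestamp <= last_timestamp:
        return False  # replayed or stale handshake
    return True


# Placeholder values for demonstration.
public_key = b"\x11" * 32
body = b"handshake-bytes"
good_mac = compute_mac1(public_key, body)
accepted = should_process(public_key, body, good_mac, None, 10)
replayed = should_process(public_key, body, good_mac, 10, 10)
```

Since a failed check produces no reply at all, a scanner probing the UDP port cannot tell a WG listener apart from a closed port that drops traffic.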
Sending and Receiving Data Packets
With keys established, user data packets are straightforward to handle. The handling logic is exceptionally simple and clear, covered in just a few lines:
- Sending:
  - User Space: The application sends data packets destined for the VPN peer's network
  - Kernel: Routing determines that the packets should be sent via the wg0 interface, so they are handed over to WG for processing
  - WG: Based on the destination address, it looks up which peer should receive the packet, encrypts the packet with the key previously negotiated with that peer (if no key has been negotiated yet, or the key has expired, it negotiates again), and encapsulates it in a UDP packet, with key_index included, destined for the peer's endpoint
- Receiving:
  - Kernel: If a data packet arrives on a UDP port that WG is listening on, it's handed to WG for processing (WG's recv handles this packet)
  - WG: Using the key_index in the packet, locates the corresponding key in a hash table and decrypts the packet (not directly; it is placed on a decryption queue, a small design trick common in network systems)
  - WG: Identifies the peer again from key_index and checks whether the decrypted original packet is permitted by that peer's allowed IP list. If it is, passes the original packet to the kernel for further processing
  - Kernel: Sends the original packet onward according to the routing table entry for its destination address
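The two AllowedIPs lookups in the steps above (destination address to peer when sending, and source address check when receiving) can be sketched with the standard `ipaddress` module. This is a linear-scan illustration under assumed names (`AllowedIPs`, `lookup`, `accept_incoming`); a real implementation would use a more efficient prefix structure.

```python
import ipaddress


class AllowedIPs:
    """Maps CIDR ranges to peers; the longest matching prefix wins."""

    def __init__(self):
        self._entries = []  # list of (network, peer_name)

    def add(self, cidr, peer_name):
        self._entries.append((ipaddress.ip_network(cidr), peer_name))

    def lookup(self, address):
        addr = ipaddress.ip_address(address)
        best = None
        for network, peer_name in self._entries:
            if addr.version != network.version or addr not in network:
                continue
            if best is None or network.prefixlen > best[0].prefixlen:
                best = (network, peer_name)
        return best[1] if best else None


def accept_incoming(table, peer_from_key_index, source_address):
    """After decryption, the inner source address must belong to the same peer."""
    return table.lookup(source_address) == peer_from_key_index


table = AllowedIPs()
table.add("10.1.1.0/24", "peer_a")
table.add("10.1.0.0/16", "peer_b")

# Sending: pick the peer from the destination address.
outgoing_peer = table.lookup("10.1.1.7")
# Receiving: a packet decrypted with peer_a's key must come from peer_a's ranges.
spoof_allowed = accept_incoming(table, "peer_a", "10.1.2.7")
```

The same table thus serves as a routing table on the way out and as an access control list on the way in.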
Too Dry? Let's Add Some Color!
In the BBL, I conducted a demonstration: establishing a WG VPN between my machine and a DigitalOcean machine, then sending an HTTP GET request, with the server returning a hello world text. Below is a Wireshark packet capture, slightly annotated by me:
[Figure: annotated Wireshark packet capture of the demo]
Additional Reading
- WireGuard Protocol: https://www.wireguard.com/protocol/
- Noise Protocol: https://noiseprotocol.org/
- Authenticated Encryption with Associated Data (AEAD) algorithm: RFC 7539
- HKDF: https://tools.ietf.org/html/rfc5869
- DH Algorithm Patent: https://patents.google.com/patent/US4200770
- WireGuard Source Code: https://github.com/WireGuard/WireGuard
- Linus's Email: https://lists.openwall.net/netdev/2018/08/02/124
Moments of Reflection
As a veteran of network and security protocols, I was profoundly affected by WireGuard. It was like a hammer blow to the head: with rational compromises and fewer bureaucratic processes, something as complex as a VPN protocol can become graceful and refined. A simple, well-thought-out user interface (the configuration) makes for a user-friendly product, and the design embodies great wisdom in its apparent simplicity. That simplicity streamlines everything downstream. Because the interface is simple and clear, nearly all data structures can be created in advance. Because the protocol itself is simple (1-RTT), it's easy to renegotiate. Lost packets during the handshake? Let them go; handshakes are fast and easy. Ultimately, simplicity leads to less code, free of complex twists and turns, so an engineer familiar with C and Linux development can grasp the main flow in an afternoon. That means the code is easier to review, tests take less time to write and reach higher coverage, there are fewer errors, and less time goes into fixing bugs. Engineers get more time to think deeply and perhaps even plan for the future, so there is no need for a 996 culture, and the time saved can be spent joyfully with family, or reading books and attending concerts with friends. It's all worthwhile.