Is your Caddy Reverse Proxy Ready for QUIC?
Linux kernel defaults may be starving services that use the QUIC protocol. I’ve taken a closer look at how this relates to MTU, UDP packets, and QUIC.
Fig. 1. Sarah, fastest cheetah
Source: Adapted from [1]
Not Specifically about Caddy
This doesn’t just apply to web proxies like Caddy[2]. It also applies to any web application that uses this new transport protocol. Targeting the Linux kernel is ultimately where we will go with this.
The Tea
The QUIC[3] protocol is becoming more important and is being implemented in more services as time rolls on. There seems to be general confusion about whether it’s appropriate to change kernel parameters and raise the UDP buffer limits to support a larger window, and disagreement about what to do with larger QUIC transfers over UDP. More aggressive services are already showing up in my error logs suggesting that I need to raise it.
You can see what I mean on this bug report to Debian and how the community responded.
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1111052 🔗
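Before changing anything, it’s worth seeing where your own host stands. A quick check, assuming a typical Debian or Proxmox box; the ~212992-byte figure is the common stock default, but your kernel may report something different.

```bash
# Query the kernel-wide socket buffer ceilings (in bytes).
# On a stock Debian kernel these usually report ~212992 (about 208 KiB).
sysctl net.core.rmem_max net.core.wmem_max
```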
How Do I Quicly Describe this
I needed a new analogy to help myself understand what is going on here. I hope it helps you too. The table below provides some insight.
| Component | Technical Role | Analogy |
|---|---|---|
| UDP | Transport Layer | The Generic Delivery Van: Fast and efficient, but doesn’t wait for signatures. It just drops boxes at the dock. |
| MTU (1500) | Max Packet Size | Standard Box Dimensions: Every box coming off the van is exactly 1,500 units wide. |
| QUIC | Application Protocol | The Smart Courier: A specialized worker inside the van who manages encryption and ensures boxes arrive in the right order. |
| rmem_max | Kernel Rx Buffer | The Loading Dock (Inbound): Floor space where incoming boxes pile up before Caddy’s staff can process them. |
| wmem_max | Kernel Tx Buffer | The Staging Area (Outbound): Space for boxes waiting to be loaded onto vans leaving your server. |
In a high-speed HTTP/3 connection, clients send data in massive bursts.
- The Default Bottleneck: Linux defaults to a rmem_max of ~212 KB. At a standard MTU of 1500, your “dock” only fits about 140 boxes (see the quick math below).
- The Burst: A modern browser might send 1,000+ boxes in a single millisecond.
- The Overflow: Once those 140 spots are full, the kernel has no choice but to throw the remaining 860 boxes in the trash. This results in the “failed to sufficiently increase receive buffer size” error in Caddy logs.
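The box math is easy to sanity-check with shell arithmetic. A rough sketch, assuming the common 212992-byte default and the 7.5 MB value we set below:

```bash
# How many 1500-byte "boxes" fit on the default loading dock?
echo $((212992 / 1500))    # ~142 packets

# And after raising the ceiling to 7.5 MB?
echo $((7500000 / 1500))   # 5000 packets
```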
Nerd Diagram
This is a more technical diagram of what’s happening and how to think about QUIC as a protocol once we make our changes to the kernel. At least until the Linux kernel changes something here.
graph LR
subgraph WAN ["Internet"]
S((Remote Client))
end
subgraph Host ["Proxmox Host (Shared Kernel)"]
direction TB
NIC{Physical NIC}
MTU[/MTU 1500 Limit<br/>'Max Box Size'/]
subgraph Buffers ["Kernel Memory Space"]
RMEM[[net.core.rmem_max: 7.5MB]]
WMEM[[net.core.wmem_max: 7.5MB]]
end
end
subgraph LXC ["Caddy Container (LXC)"]
QUIC[QUIC/Go Stack]
Caddy(Caddy Web Server)
end
%% Inbound Flow
S -- "UDP Packets" --> NIC
NIC -- "Check Size" --> MTU
MTU -- "Queueing" --> RMEM
RMEM -- "Read Socket" --> QUIC
QUIC --> Caddy
%% Outbound Flow
Caddy -- "Write Socket" --> WMEM
WMEM -- "Burst Data" --> NIC
NIC -- "UDP Streams" --> S
Patch Debian Kernel
Create a persistent config file and apply the kernel adjustments to sysctl.
echo "net.core.rmem_max=7500000" >> /etc/sysctl.d/99-caddy-performance.confecho "net.core.wmem_max=7500000" >> /etc/sysctl.d/99-caddy-performance.confApply changes immediately
Should we Move to Jumbo Frames?
In short, no. An MTU of 1500 is still the norm across the Internet and usually what remote peers expect, so we don’t want to switch to jumbo frames for this. Segmentation will work fine; remember, we are just increasing the size of our storage floor for the items coming in to be processed. We don’t want to drop any of these packets, we simply want to queue them in a safer space than inside the transport layer.
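If you want to confirm that your interfaces really are running at the standard 1500, a plain `ip link` listing shows the MTU per interface (the loopback device normally reports a much larger value; that is expected).

```bash
# Each interface line reports its MTU, e.g. "mtu 1500".
ip link
```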
Kernel Safety and Reliability Concerns
Ultimately, adjusting these kernel parameters isn’t about chasing micro-benchmarks; it’s about aligning our infrastructure with the modern web. The QUIC model is intelligent enough to handle its own recovery, but it shouldn’t have to. Every time our kernel drops a packet because the buffer is full, we force a retransmission that wastes cycles, bandwidth, and time. While we can agree that the Linux kernel’s conservative defaults are often the right choice, that doesn’t mean we need to be constrained by dogma in every situation.
I’ll leave you with a quote I like for these situations.
All ships are safe in the harbor, but that isn’t what ships are built for.
References
[1] G. Wilson, "Sarah, fastest cheetah," *Wikimedia Commons*, 2012. [Online]. Available: https://commons.wikimedia.org/wiki/File:Sarah_(cheetah).jpg. Accessed: Dec. 27, 2025.
[2] "Caddy Reverse Proxy," *caddyserver.com*, 2025. [Online]. Available: https://caddyserver.com/. Accessed: Dec. 27, 2025.
[3] J. Iyengar and M. Thomson, "QUIC: A UDP-Based Multiplexed and Secure Transport," RFC 9000, *IETF Datatracker*, May 2021. [Online]. Available: https://datatracker.ietf.org/doc/rfc9000/. Accessed: Dec. 26, 2025.