Is your Caddy Reverse Proxy Ready for QUIC?

Default Linux kernel limits may be starving services that use the QUIC protocol. I’ve taken a closer look at how this relates to MTU, UDP packets, and QUIC.

Fig. 1. Sarah, fastest cheetah
Source: Adapted from [1]

Not Specifically about Caddy

This doesn’t just apply to reverse proxies like Caddy[2]; it also applies to any web application that uses this newer transport protocol. Ultimately, the fix we are heading toward targets the Linux kernel itself.

The Tea

The QUIC[3] protocol is becoming more important and is being implemented in more services as time rolls on. There seems to be general confusion about whether it’s appropriate to change kernel parameters and raise the UDP buffer sizes to support a larger window, and disagreement about what to do with larger QUIC transfers over UDP. Meanwhile, more aggressive services are already showing up in my error logs suggesting that I need to raise it.

You can see what I mean in this bug report to Debian and how the community responded.

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1111052

How Do I Quicly Describe This?

I needed a new analogy to help myself understand what is going on, and I hope it helps you too. The table below lays it out.

| Component | Technical Role | Analogy |
| --- | --- | --- |
| UDP | Transport Layer | The Generic Delivery Van: Fast and efficient, but doesn’t wait for signatures. It just drops boxes at the dock. |
| MTU (1500) | Max Packet Size | Standard Box Dimensions: Every box coming off the van is exactly 1,500 units wide. |
| QUIC | Application Protocol | The Smart Courier: A specialized worker inside the van who manages encryption and ensures boxes arrive in the right order. |
| rmem_max | Kernel Rx Buffer | The Loading Dock (Inbound): Floor space where incoming boxes pile up before Caddy’s staff can process them. |
| wmem_max | Kernel Tx Buffer | The Staging Area (Outbound): Space for boxes waiting to be loaded onto vans leaving your server. |

In a high-speed HTTP/3 connection, clients send data in massive bursts.

  • The Default Bottleneck: Linux defaults to an rmem_max of roughly 212 KB. At a standard MTU of 1500, your “dock” only fits about 140 boxes (you can check your own limits with the commands below).
  • The Burst: A modern browser might send 1,000+ boxes in a single millisecond.
  • The Overflow: Once those 140 spots are full, the kernel has no choice but to throw the remaining 860 boxes in the trash. This results in the “failed to sufficiently increase receive buffer size” error in Caddy logs.
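
Before touching anything, it’s worth seeing where your own kernel stands and whether Caddy is already complaining. The sysctl read works anywhere; the log check assumes Caddy runs as a systemd service named caddy, so adjust the unit name if yours differs.

# Current kernel limits (stock Debian typically reports 212992 bytes, the ~212 KB above)
sysctl net.core.rmem_max net.core.wmem_max

# Has Caddy already hit the ceiling?
journalctl -u caddy | grep "failed to sufficiently increase receive buffer size"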

Nerd Diagram

This is a more technical diagram of what’s happening and how to think about QUIC as a protocol alongside our changes to the kernel, at least until the Linux kernel changes something here.

graph LR
    subgraph WAN ["Internet"]
        S((Remote Client))
    end

    subgraph Host ["Proxmox Host (Shared Kernel)"]
        direction TB
        NIC{Physical NIC}
        MTU[/MTU 1500 Limit<br/>'Max Box Size'/]
        
        subgraph Buffers ["Kernel Memory Space"]
            RMEM[[net.core.rmem_max: 7.5MB]]
            WMEM[[net.core.wmem_max: 7.5MB]]
        end
    end

    subgraph LXC ["Caddy Container (LXC)"]
        QUIC[QUIC/Go Stack]
        Caddy(Caddy Web Server)
    end

    %% Inbound Flow
    S -- "UDP Packets" --> NIC
    NIC -- "Check Size" --> MTU
    MTU -- "Queueing" --> RMEM
    RMEM -- "Read Socket" --> QUIC
    QUIC --> Caddy

    %% Outbound Flow
    Caddy -- "Write Socket" --> WMEM
    WMEM -- "Burst Data" --> NIC
    NIC -- "UDP Streams" --> S
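
If you want hard evidence that the loading dock is overflowing, the kernel keeps per-protocol counters. A climbing RcvbufErrors value means UDP datagrams are being thrown away because the receive buffer was full. The first command needs the net-tools package; the /proc read works on any Linux box.

# Look for "receive buffer errors" in the UDP section
netstat -su | grep -i errors

# Raw counters straight from the kernel (RcvbufErrors / SndbufErrors columns)
cat /proc/net/snmp | grep Udp: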

Tune the Debian Kernel

Create a persistent config file and apply the kernel adjustments with sysctl.

echo "net.core.rmem_max=7500000" >> /etc/sysctl.d/99-caddy-performance.conf
echo "net.core.wmem_max=7500000" >> /etc/sysctl.d/99-caddy-performance.conf

Apply the changes immediately:

sysctl --system
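
To confirm the new limits actually took, read them back, then restart Caddy so its QUIC stack (quic-go) re-requests the bigger buffer; this assumes a systemd unit named caddy. After the restart, the “failed to sufficiently increase receive buffer size” warning should stop appearing in the logs.

# Both values should now report 7500000
sysctl net.core.rmem_max net.core.wmem_max

# Restart so the QUIC stack asks the kernel for the larger buffer on startup
systemctl restart caddy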

Should we Move to Jumbo Frames?

In short, no: a 1500-byte MTU is still the norm across the Internet and usually the largest size a path will reliably carry, so we don’t want to switch to jumbo frames for this. Segmentation will work fine; remember, we are just increasing the size of the storage floor where incoming items wait to be processed. We don’t want to drop any of these packets, we simply want to queue them in a safer space than inside the transport layer.
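
If you want to confirm that a full 1500-byte frame really does make it across your own path, a quick don’t-fragment ping does the job: 1472 bytes of payload plus the 8-byte ICMP header and the 20-byte IP header add up to exactly 1500. Swap example.com for a host you actually care about.

# Fails with "message too long" if the path cannot carry a 1500-byte frame
ping -c 4 -M do -s 1472 example.com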

Kernel Safety and Reliability Concerns

Ultimately, adjusting these kernel parameters isn’t about chasing micro-benchmarks; it’s about aligning our infrastructure with the modern web. The QUIC model is intelligent enough to handle its own recovery, but it shouldn’t have to. Every time our kernel drops a packet because the buffer is full, we force a retransmission that wastes cycles, bandwidth, and time. While we can appreciate that the Linux kernel’s conservative defaults are usually the right choice, that doesn’t mean we need to be constrained by dogma in every situation.

I’ll leave you with a quote I like for these situations.

All ships are safe in the harbor, but that isn’t what ships are built for.

References

[1] G. Wilson, "Sarah, fastest cheetah," *Wikimedia Commons*, 2012. [Online]. Available: https://commons.wikimedia.org/wiki/File:Sarah_(cheetah).jpg. Accessed: Dec. 27, 2025.

[2] "Caddy Reverse Proxy," *website* 2025. [Online]. Available: https://caddyserver.com/ Accessed: Dec. 27, 2025.

[3] J. Iyengar and M. Thomson, "QUIC: A UDP-Based Multiplexed and Secure Transport," RFC 9000, *IETF*, May 2021. [Online]. Available: https://datatracker.ietf.org/doc/rfc9000/. Accessed: Dec. 26, 2025.