Load Balancers Were the Start of the Firefox Outage
Nice summary of a recent Firefox outage. I enjoy reading these postmortems because folks take the time to find out what really happened. In this case, it was a combination of factors, including a change made by their cloud provider.
Firefox has a number of servers and related infrastructure that handle several internal services. These include updates, telemetry, certificate management, crash reporting and other similar functionality. This infrastructure is hosted by different cloud service providers that use load balancers to distribute the load evenly across servers. For those services hosted on Google Cloud Platform (GCP) these load balancers have settings related to the HTTP protocol they should advertise and one of these settings is HTTP/3 support with three states: “Enabled”, “Disabled” or “Automatic (default)”. Our load balancers were set to the “Automatic (default)” setting and on January 13, 2022 at 07:28 UTC, GCP deployed an unannounced change to make HTTP/3 the default. As Firefox uses HTTP/3 when supported, from that point forward, some connections that Firefox makes to the services infrastructure would use HTTP/3 instead of the previously used HTTP/2 protocol.
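To make the "advertise" part concrete: servers commonly announce HTTP/3 support through the Alt-Svc response header (RFC 7838), using the "h3" protocol token. The quoted summary does not say how these particular load balancers did it, so treat the following as a rough sketch under that assumption. The function names are my own, and the parsing is simplified (it assumes no commas inside quoted values).

```python
from __future__ import annotations


def advertised_protocols(alt_svc: str) -> list[str]:
    """Return the protocol IDs listed in an Alt-Svc header value.

    Simplified parser: assumes entries are comma-separated and that
    quoted authorities do not themselves contain commas.
    """
    value = alt_svc.strip()
    # "clear" means the server withdraws all previously advertised alternatives.
    if not value or value.lower() == "clear":
        return []
    protocols = []
    for entry in value.split(","):
        # Each entry looks like: h3=":443"; ma=2592000
        alternative = entry.split(";", 1)[0].strip()
        protocol_id, sep, _authority = alternative.partition("=")
        if sep:
            protocols.append(protocol_id.strip())
    return protocols


def advertises_http3(alt_svc: str) -> bool:
    """True if the header lists "h3" or a draft token such as "h3-29"."""
    return any(
        p == "h3" or p.startswith("h3-") for p in advertised_protocols(alt_svc)
    )
```

Feeding this the Alt-Svc value from a response before and after a provider-side change is one cheap way to notice that a default has flipped, for example as an assertion in a monitoring check.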
Publishing write-ups of events like this helps the community uncover blind spots and suggests new test scenarios to implement.