In the scenario above, a visitor gets an inconsistent view of the website. They load an old version of
index.html from before the deployment and a new version of
style.css from after the deployment. Depending on how the website layout was changed, this could result in a totally broken experience.
I call this the client-side consistency problem. It is a textbook example of a distributed systems problem.
The problem is even worse if the website relies on scripts. A new script might make assumptions that do not hold for the old version of
index.html. If there is more than one script, they might be downloaded in any order, yielding a mix of scripts from before and after the deployment.
Atomic deployments don't solve the problem. In the scenario above, the deployment was atomic! Both files got deployed at exactly the same instant.
It's easy to think that the problem is unimportant because it is low-probability. After all, a visitor has to get unlucky enough to load a page during a deployment for the problem to affect them.
However, that's the wrong way to think about it. It is safer to conservatively assume that every visitor who loads your site during a deployment will get a broken copy of it. Framed in those terms, deployments suddenly seem very dangerous!
To make matters worse, visitors don't have to visit your site during a deployment to be affected. Depending on how their browsers and your server are configured, this sequence of events could be possible:
In this second scenario, a visitor gets a broken version of your site because their browser cached the
style.css file from an old version of your site. After loading the new
index.html from after the deployment, they again have a broken experience even though their page load didn't overlap with a deployment.
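Both scenarios can be reproduced in a toy model. The sketch below is only an illustration, not a description of any real server or browser: the `Server` and `Browser` classes, the file contents, and the `use_cache` flag are all hypothetical, and the second scenario assumes the browser refetches index.html but reuses its cached copy of style.css.

```python
from dataclasses import dataclass, field

# Hypothetical site contents before and after a deployment.
SITE_V1 = {"index.html": "index v1", "style.css": "style v1"}
SITE_V2 = {"index.html": "index v2", "style.css": "style v2"}


@dataclass
class Server:
    files: dict

    def deploy(self, new_files: dict) -> None:
        # Atomic: every path switches to the new version in a single step.
        self.files = dict(new_files)

    def get(self, path: str) -> str:
        return self.files[path]


@dataclass
class Browser:
    cache: dict = field(default_factory=dict)

    def fetch(self, server: Server, path: str, use_cache: bool = False) -> str:
        # Assumption: use_cache stands in for whatever caching rules apply.
        if use_cache and path in self.cache:
            return self.cache[path]
        body = server.get(path)
        self.cache[path] = body
        return body


def scenario_one() -> tuple:
    """The deployment lands between the two requests of one page load."""
    server, browser = Server(dict(SITE_V1)), Browser()
    html = browser.fetch(server, "index.html")
    server.deploy(SITE_V2)
    css = browser.fetch(server, "style.css")
    return html, css


def scenario_two() -> tuple:
    """The page load happens entirely after the deployment."""
    server, browser = Server(dict(SITE_V1)), Browser()
    browser.fetch(server, "index.html")  # earlier visit fills the cache
    browser.fetch(server, "style.css")
    server.deploy(SITE_V2)
    html = browser.fetch(server, "index.html")
    css = browser.fetch(server, "style.css", use_cache=True)
    return html, css
```

In this model `scenario_one()` pairs the old page with the new stylesheet and `scenario_two()` pairs the new page with the old stylesheet, even though `deploy` replaces every file in one step.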
I have some reason to believe that a lot of big companies have independently identified and worked around this problem:
Specific technologies vary, but the general technique works like this:
That rules out the first scenario because the client's old version of
index.html will still point to the correct, old stylesheet (
And it rules out the second scenario because the new version of
index.html will point to
style__v2.css, which the visitor's browser has not yet cached:
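As a rough sketch of the idea, assuming the `__vN` naming scheme used in this post: each deployment publishes its stylesheet under a fresh name, leaves older stylesheets in place, and repoints index.html. The `Server` class, `render_index`, and `stylesheet_for` are hypothetical helpers invented for this illustration, not part of any real tool.

```python
def versioned_name(path: str, version: int) -> str:
    """Insert a version suffix before the file extension."""
    stem, dot, ext = path.rpartition(".")
    if not dot:
        return f"{path}__v{version}"
    return f"{stem}__v{version}.{ext}"


def render_index(version: int) -> str:
    # Hypothetical minimal page; only the stylesheet link matters here.
    href = versioned_name("style.css", version)
    return f'<link rel="stylesheet" href="/{href}">'


class Server:
    def __init__(self) -> None:
        self.files = {}

    def deploy(self, version: int, css: str) -> None:
        # Publish the new stylesheet under a new name and keep the old ones,
        # then repoint index.html at the new name.
        self.files[versioned_name("style.css", version)] = css
        self.files["index.html"] = render_index(version)


def stylesheet_for(server: Server, index_html: str) -> str:
    """Fetch whichever stylesheet a given copy of index.html links to."""
    href = index_html.split('href="/')[1].split('"')[0]
    return server.files[href]


server = Server()
server.deploy(1, "old css")
old_index = server.files["index.html"]  # a client still holding the old page
server.deploy(2, "new css")
new_index = server.files["index.html"]
```

After the second deployment, `stylesheet_for(server, old_index)` still resolves to the old stylesheet and `stylesheet_for(server, new_index)` resolves to the new one. Because the new page asks for a URL the browser has never seen, a cached copy of the old stylesheet cannot be substituted for it.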
Eventually, old assets like
style__v1.css will have to be removed from the server. This is akin to garbage collection, and it is notoriously difficult in distributed systems. Ideally we would somehow "fence out" old clients so their browsers would never use a version of
index.html that references
style__v1.css... but there is no practical way to implement such a fence.
Instead, we have to settle for a carefully chosen retention time. The retention time has to be long enough that we can be virtually certain no client is still using an old version of
index.html that references
style__v1.css. The right retention time depends on many details of how the website is configured, but since disk space is cheap, it can probably be very long: a year, or even several years.
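A retention policy like this could be sketched as follows. The function name, the `superseded_at` bookkeeping (a map from each old asset to the time it stopped being referenced by the current index.html), and the one-year figure are assumptions made for illustration.

```python
from datetime import datetime, timedelta


def assets_to_delete(
    superseded_at: dict[str, datetime],
    now: datetime,
    retention: timedelta,
) -> list[str]:
    """Return old assets whose retention time has fully elapsed.

    Assets still referenced by the current index.html are assumed to be
    absent from superseded_at, so they can never be selected.
    """
    return sorted(
        name
        for name, stopped in superseded_at.items()
        if now - stopped > retention
    )


# Hypothetical history: v1 was replaced long ago, v2 only a month ago.
history = {
    "style__v1.css": datetime(2023, 1, 1),
    "style__v2.css": datetime(2024, 5, 1),
}
expired = assets_to_delete(history, datetime(2024, 6, 1), timedelta(days=365))
```

With a one-year retention time, only `style__v1.css` is eligible for removal here. The clock starts when an asset is superseded rather than when it was first deployed, since that is the last moment a client could have received a page referencing it.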