
How I learned to stop worrying and love the CDN
Creating an HTTP rules bypass to invalidate cached assets at deploy time
A Path Less Traveled
I love the concept of an R blogdown-based static website - fast… simple… - OK, maybe just fast. There have been valid criticisms that static sites possess an inherent “gateway drug” quality - a false sense of simplicity. But the idea really is to focus on the essential complexity of whatever you’re building, and skip past some of the incidental and accidental consequences of running the site on web server infrastructure.
But what happens when the platform/services your site depends on change underneath it? That happened to me, or at least I think it happened that way. This site has existed in this form for more than a year, and I’ve been using the same deployment method the whole time - it had never failed, until recently. A few weeks ago I came back to post some updates, and the deployments kept failing. But before I get too deep in the weeds, a bit of background is in order.
The Symptom, and Furtive Steps to Remedy
So when I returned to the site the behavior was oddly specific. Certain pages and assets were updating while others remained in the cache, even though I was refreshing at the top of the endpoint. And when I made API requests to invalidate specific pages and assets, they would still remain in the CDN. I toyed with the idea of setting a short TTL, but my reflex is to affirmatively set the state of the site during deploy, and check/test for it as part of a post-deploy process. The next step was to replicate all of my CDN settings to a new cache endpoint, and it deployed cleanly! Hooray! … only to be frustrated that on the NEXT deploy the same issue re-surfaced.
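That post-deploy check boils down to comparing what the CDN is serving against what was just deployed. Here’s a minimal sketch of the idea in Python - the structure (hashing each asset and diffing against a fetcher) is my illustration, not the actual deploy script:

```python
import hashlib


def content_hash(data: bytes) -> str:
    """Fingerprint of an asset's bytes."""
    return hashlib.sha256(data).hexdigest()


def find_stale_assets(deployed: dict, fetch) -> list:
    """Return the paths whose served content differs from what was deployed.

    deployed: {path: bytes just pushed to the origin}
    fetch:    callable(path) -> bytes as currently served by the CDN
    """
    stale = []
    for path, data in deployed.items():
        if content_hash(fetch(path)) != content_hash(data):
            stale.append(path)  # CDN is still serving the old version
    return stale
```

In practice `fetch` would be an HTTP GET against the cache endpoint; any path still serving old bytes after a purge is a sign the invalidation was blocked.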
[cue sad trombone]
The Root Cause, and The Cure
It took a considerable amount of investigation to finally trace it back to something in the CDN that was blocking the invalidation of certain cached assets. The net effect was that new versions of certain site assets (my index page, of all things) would not update until the time-to-live had lapsed. This was incredibly frustrating, since the setup had been rock solid up to the point the issue was discovered.
After some online spelunking I came across a discussion thread describing a similar case, and the solution was to prepend a condition to each rule that checks whether to skip processing: if the incoming request is a purge request, skip the rule. This in effect puts a bypass on the HTTP rules whenever a CDN cache invalidation request comes in.
[Figure: the bypass condition at the top of the “Redirect HTTP to HTTPS” rule]
Honestly - in Verizon CDN parlance I’m not sure whether it’s actually removing the files from the cache or simply marking them as invalid - but either way, it works. Just be certain to put this IF statement at the top of every rule in the set you deploy.
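Stripped of the CDN’s own syntax, the bypass reduces to a guard evaluated before the rest of each rule. A sketch of just that logic in Python - note that `x-purge-request` is a hypothetical marker I’ve made up for illustration, and the real Verizon rules engine expresses this through its own match conditions, not code:

```python
def apply_rule(request: dict, rule) -> bool:
    """Run a CDN rule unless the request is a cache purge.

    `x-purge-request` is a hypothetical stand-in for however the CDN
    flags an invalidation request - not a real header name.
    """
    if request.get("x-purge-request"):
        return False  # bypass: let the purge through untouched
    rule(request)     # otherwise, process the rule normally
    return True
```

The point of the guard sitting first is that a purge request never reaches the rule body at all, so none of the HTTP rules can interfere with the invalidation.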
Lessons Learned
So it goes to show that living “at the PaaS level” is not automatically the simpler approach. When breaking changes occur, it can be anguishing to trace back to the cause of the issue. That said, I’m still not exactly sure how cache refreshes ever worked before. For now I’ll set it aside and get back to working directly on this site, where my focus should actually reside.