Businesses are built on the web. Without the web, Twitter couldn’t exist. Facebook couldn’t exist. And not just businesses—Wikipedia couldn’t exist. Your favorite blog couldn’t exist without the web. The web doesn’t favor any one kind of use. It’s been deliberately designed to accommodate many and varied activities.
Just as many wonderful things are built upon the web, the web itself is built upon the internet. Though we often use the terms web and internet interchangeably, the World Wide Web is just one application that uses the internet as its plumbing. Email, for instance, is another.
Like the web, the internet was designed to allow all kinds of services to be built on top of it. The internet is a network of networks, all of them agreeing to use the same protocols to shuttle packets of data around. Those packets are transmitted down fiber-optic cables across the ocean floor, bounced around with Wi-Fi or radio signals, or beamed from satellites in freakin’ space.
As long as these networks are working, the web is working. But sometimes networks go bad. Mobile networks have a tendency to get flaky once you’re on a train or in other situations where you’re, y’know, mobile. Wi-Fi networks work fine until you try to use one in a hotel room (their natural enemy).
When the network fails, the web fails. That’s just the way it is, and there’s nothing we can do about it. Until now.
For as long as I can remember, the World Wide Web has had an inferiority complex. Back in the ’90s, it was outshone by CD-ROMs (ask your parents). They had video, audio, and a richness that the web couldn’t match. But they lacked links—you couldn’t link from something in one CD-ROM to something in another CD-ROM. They faded away. The web grew.
Later, the web technologies of HTML, CSS, and JavaScript were found wanting when compared to the whiz-bang beauty of Flash. Again, Flash movies were much richer than regular web pages. But they were also black boxes. The Flash format seemed superior to the open standards of the web, and yet the very openness of those standards made the web an unstoppable force. Flash—under the control of just one company—faded away. The web grew.
These days it’s native apps that make the web look like an underachiever. Like Flash, they’re under the control of individual companies instead of being a shared resource like the web. Like Flash, they demonstrate all sorts of capabilities that the web lacks, such as access to device APIs and, crucially, the ability to work even when there’s no network connection.
The history of the web starts to sound like an endless retelling of the fable of the tortoise and the hare. CD-ROMs, Flash, and native apps outshine the web in the short term, but the web always seems to win the day somehow.
Each of those technologies proved very useful for the expansion of web standards. In a way, Flash was like the R&D department for HTML, CSS, and JavaScript. Smooth animations, embedded video, and other great features first saw the light of day in Flash. Having shown their usefulness, they later appeared in web standards. The same thing is happening with native apps. Access to device features like the camera and the accelerometer is beginning to show up in web browsers. Most exciting of all, we’re finally getting the ability for a website to continue working even when the network isn’t available.
The technology that makes this bewitching offline sorcery possible is a browser feature called service workers. You might have heard of them. You might have heard that they’re something to do with JavaScript, and technically they are…but conceptually they’re very different from other kinds of scripts.
Usually when you’re writing some JavaScript that’s going to run in a web browser, it’s all related to the document currently being displayed in the browser window. You might want to listen out for events triggered by the user interacting with the document (clicks, swipes, hovers, etc.). You might want to update the contents of the document: add some markup here, remove some text there, manipulate some values somewhere else. The sky’s the limit. And it’s all made possible thanks to the Document Object Model (DOM), a representation of what the browser is rendering. Through the combination of the DOM and JavaScript—DOM scripting, if you will—you can conjure up all sorts of wonderful magic.
Well, a service worker can’t do any of that. It’s still a script, and it’s still written in the same language—JavaScript—but it has no access to the DOM. Without any DOM scripting capabilities, this kind of script might seem useless at first glance. But there’s an advantage to having a script that never needs to interact with the current document. Adding, editing, and deleting parts of the DOM can be hard work for the browser. If you’re not careful, things can get very sluggish very quickly. But if there’s a whole class of script that isn’t allowed access to the DOM, then the browser can happily run that script in parallel to its regular rendering activities, safe in the knowledge that it’s an entirely separate process.
The first kind of script to come with this constraint was called a web worker. In a web worker, you could write some JavaScript to do number-crunching calculations without slowing down whatever else was being displayed in the browser window. Spin up a web worker to generate larger and larger prime numbers, for instance, and it will merrily do so in the background.
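To make that a little more concrete, here's a rough sketch of what the script inside a web worker might look like. The file name and the function names (isPrime, nextPrimeAfter) are made up for illustration, and the message format (a number goes in, the next prime comes out) is my own assumption rather than anything the web worker API prescribes.

```javascript
// primes-worker.js (hypothetical file name)

// Trial division: slow, but fine for a sketch.
function isPrime(n) {
  if (!Number.isInteger(n) || n < 2) return false;
  for (let divisor = 2; divisor * divisor <= n; divisor++) {
    if (n % divisor === 0) return false;
  }
  return true;
}

// Find the first prime larger than the given (finite) number.
function nextPrimeAfter(n) {
  let candidate = Math.floor(n) + 1;
  while (!isPrime(candidate)) candidate++;
  return candidate;
}

// Only wire up messaging when this file is actually running inside a worker.
if (typeof WorkerGlobalScope !== 'undefined' && self instanceof WorkerGlobalScope) {
  self.addEventListener('message', (event) => {
    // event.data is assumed to be a finite number sent by the page.
    self.postMessage(nextPrimeAfter(event.data));
  });
}
```

A page could start this script with new Worker('primes-worker.js'), send it a number with postMessage, and listen for the message event to receive the answer. The page stays responsive while the worker does the number crunching.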
A service worker is like a web worker with extra powers. It still can’t access the DOM, but it does have access to the fundamental inner workings of the browser.
Let’s take a step back and think about how the World Wide Web works. It’s a beautiful ballet of client and server. The client is usually a web browser—or, to use the parlance of web standards, a user agent: a piece of software that acts on behalf of the user.
The user wants to accomplish a task or find some information. The URL is the key technology that will empower the user in their quest. They will either type a URL into their web browser or follow a link to get there. This is the point at which the web browser—or client—makes a request to a web server. Before the request can reach the server, it must traverse the internet of undersea cables, radio towers, and even the occasional satellite (Fig 1.1).
Imagine if you could leave instructions for the web browser that would be executed before the request is even sent. That’s exactly what service workers allow you to do (Fig 1.2).
Usually when we write JavaScript, the code is executed after it’s been downloaded from a server. With service workers, we can write a script that’s executed by the browser before anything else happens. We can tell the browser, “If the user asks you to retrieve a URL for this particular website, run this corresponding bit of JavaScript first.” That explains why service workers don’t have access to the Document Object Model; when the service worker is run, there’s no document yet.
A service worker is like a cookie. Cookies are downloaded from a web server and installed in a browser. You can go to your browser’s preferences and see all the cookies that have been installed by sites you’ve visited. Cookies are very small and very simple little text files. A website can set a cookie, read a cookie, and update a cookie. A service worker script is much more powerful. It contains a set of instructions that the browser will consult before making any requests to the site that originally installed the service worker.
A service worker is like a virus. When you visit a website, a service worker is surreptitiously installed in the background. Afterwards, whenever you make a request to that website, your request will be intercepted by the service worker first. Your computer or phone becomes the home for service workers lurking in wait, ready to perform man-in-the-middle attacks. Don’t panic. A service worker can only handle requests for the site that originally installed that service worker. When you write a service worker, you can only use it to perform man-in-the-middle attacks on your own website.
A service worker is like a toolbox. By itself, a service worker can’t do much. But it allows you to access some very powerful browser features, like the Fetch API, the Cache API, and even notifications. API stands for Application Programming Interface, which sounds very fancy but really just means a tool that you can program however you want. You can write a set of instructions in your service worker to take advantage of these tools. Most of your instructions will be written as “when this happens, reach for this tool.” If, for instance, the network connection fails, you can instruct the service worker to retrieve a backup file using the Cache API.
A service worker is like a duck-billed platypus. The platypus not only lactates, but also lays eggs. It’s the only mammal capable of making its own custard. A service worker can also…Actually, hang on, a service worker is nothing like a duck-billed platypus! Sorry about that. But a service worker is somewhat like a cookie, and somewhat like a virus, and somewhat like a toolbox.
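Here's a minimal sketch of the toolbox idea: "when the network fails, reach for the cache." The function name networkThenCache is hypothetical, and the sketch assumes a copy of the file was stored in the cache at some earlier point, which isn't shown here. The fetch function and the cache are passed in as arguments purely so that the logic can be tried out away from a browser. Treat this as one possible arrangement, not the definitive way to write a service worker.

```javascript
// Try the network first; if that fails, look for a stored copy instead.
// In a service worker, `fetchFn` is the global fetch and `cacheStorage`
// is the global caches object.
async function networkThenCache(request, fetchFn, cacheStorage) {
  try {
    return await fetchFn(request);
  } catch (error) {
    // fetch() rejects when the network request itself fails.
    const cachedResponse = await cacheStorage.match(request);
    if (cachedResponse) return cachedResponse;
    throw error; // Nothing stored either, so pass the failure along.
  }
}

// Only runs when this file has been installed as a service worker.
if (typeof ServiceWorkerGlobalScope !== 'undefined' && self instanceof ServiceWorkerGlobalScope) {
  self.addEventListener('fetch', (event) => {
    // Answer the browser's request with whatever the function returns.
    event.respondWith(networkThenCache(event.request, fetch, caches));
  });
}
```

The fetch event is the "when this happens" part, and the Fetch API and Cache API are the tools being reached for.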
Service workers are powerful. Once a service worker has been installed on your machine, it lies in wait, like a patient spider waiting to feel the vibrations of a particular thread.
Imagine if a malicious ne’er-do-well wanted to wreak havoc by impersonating a website in order to install a service worker. They could write instructions in the service worker to prevent the website ever appearing in that browser again. Or they could write instructions to swap out the content displayed under that site’s domain. That’s why it’s so important to make sure that a service worker really belongs to the site it claims to come from. As the specification for service workers puts it, they “create the opportunity for a bad actor to turn a bad day into a bad eternity” (http://bkaprt.com/go/01-01/).
To prevent this calamity, service workers require you to adhere to two policies:
The same-origin policy means that a website at example.com can only install a service worker script that lives at example.com. That means you can’t put your service worker script on a different domain. You can use a domain like s3.amazonaws.com for hosting your images and other assets, but not your service worker script. That domain wouldn’t match the domain of the site installing the service worker.
The HTTPS-only policy means that https://example.com can install a service worker, but http://example.com can’t. A site running under HTTPS (the S stands for Secure) instead of HTTP is much harder to spoof. Without HTTPS, the communication between a browser and a server could be intercepted and altered. If you’re sitting in a coffee shop with an open Wi-Fi network, there’s no guarantee that anything you’re reading in a browser from http://newswebsite.com hasn’t been tampered with. But if you’re reading something from https://newswebsite.com, you can be pretty sure you’re getting what you asked for.
Enabling HTTPS on your site opens up a whole series of secure-only browser features—like the JavaScript APIs for geolocation, payments, notifications, and service workers. Even if you never plan to add a service worker to your site, it’s still a good idea to switch to HTTPS. A secure connection makes it trickier for snoopers to see who’s visiting which websites. Your website might not contain particularly sensitive information, but when someone visits your site, that’s between you and your visitor. Enabling HTTPS won’t stop unethical surveillance by the NSA, but it makes the surveillance slightly more difficult.
There’s one exception. You can use a service worker on a site being served from localhost, a web server on your own computer, not part of the web. That means you can play around with service workers without having to deploy your code to a live site every time you want to test something.
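The two policies, plus the localhost exception, can be sketched as a simple check. The function name canInstallServiceWorker is made up, and the logic is a simplification of what browsers actually do (their real rules cover more cases than the hostname localhost). In a real page, the browser exposes its own verdict on the HTTPS question through the isSecureContext property.

```javascript
// Hypothetical helper: would a browser let the page at `pageUrl`
// install the service worker script at `scriptUrl`?
function canInstallServiceWorker(pageUrl, scriptUrl) {
  const page = new URL(pageUrl);
  // Resolve relative paths like "/serviceworker.js" against the page.
  const script = new URL(scriptUrl, page);

  // Same-origin policy: scheme, host, and port must all match.
  const sameOrigin = page.origin === script.origin;

  // HTTPS-only policy, with the localhost exception for local testing.
  const isLocalhost = page.hostname === 'localhost';
  const isSecure = page.protocol === 'https:' || isLocalhost;

  return sameOrigin && isSecure;
}
```

With this sketch, a page at https://example.com asking for /serviceworker.js passes the check, as does a page at http://localhost:8000. A page at http://example.com fails, and so does a page at https://example.com pointing at a script hosted on s3.amazonaws.com.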
If you’re using a Mac, you can spin up a local server from the command line. Let’s say your website is in a folder called mysite. Drag that folder to the Terminal app, or open up the Terminal app and navigate to that folder using the cd command to change directory. Then type:
python3 -m http.server 8000
This starts a web server from the mysite folder, listening on port 8000. Now you can visit http://localhost:8000 in a web browser on the same computer, which means you can add a service worker to the website you’ve got inside the mysite folder.
But if you then put the site live at, say, http://mysite.com, the service worker won’t run. You’ll need to serve the site from https://mysite.com instead. To do that, you need a secure certificate for your server.
There was a time when certificates cost money and were difficult to install. Now, thanks to the Let’s Encrypt certificate authority and a tool called Certbot, certificates are free. But I’m not going to lie: it still feels a bit intimidating to install the certificate. There’s something about logging on to a server and typing commands that makes me simultaneously feel like a 1337 hacker, and also like I’m going to break everything. Fortunately, the process of using Certbot is relatively jargon-free (Fig 1.3).
On the Certbot website (http://bkaprt.com/go/01-02/), you choose which kind of web server and operating system your site is running on. From there you’ll be guided step-by-step through the commands you need to type in the command line of your web server’s computer, which means you’ll need to have SSH access to that machine. If you’re on shared hosting, that might not be possible. In that case, check to see if your hosting provider offers secure certificates. If not, please pester them to do so, or switch to a hosting provider that can serve your site over HTTPS.
Another option is to stay with your current hosting provider, but use a service like Cloudflare to act as a “front” for your website. These services can serve your website’s files from data centers around the world, making sure that the physical distance between your site’s visitors and your site’s files is nice and short. And while they’re at it, these services can make sure all of those files are served over HTTPS.
Once you’re set up with HTTPS, you’re ready to write a service worker script. It’s time to open up your favorite text editor. You’re about to turbocharge your website!
Preparing for Offline