Series: S3 Web Host, Part 1 | Category: Techbits | Tags: AWS / S3 / CORS

Series Introduction

What is Amazon S3? Can it be a web host? Great questions, to be sure, but first, an intro to the S3 Web Host Series. Initially, I considered summarizing this proof of concept in one article. However, the process is so involved that it would be more like a Project than what I envisioned as a Techbit when I first introduced them. That brings us to the logical conclusion: a multipart series about some of what I learned configuring S3 as a web host.

What is Amazon S3?

Now that we've finished the introductions, on to the main questions. First, what is Amazon S3? The fine-grained details of what, exactly, S3 is are out of the scope of this series; besides, the S3 User Guide will always be more detailed and up-to-date than this article. For this series, and at its most basic, S3 is a cloud folder, or “bucket.” Is it that simple?

If we go to a website without an index page, we’ll likely get a default page listing the directory contents. The contents would include files and subdirectories, much like a folder on a computer. If web crawlers can access the directory, they can index its contents and make them searchable via a search engine. Incidentally, browsing open directories on the Internet is an educational way to spend one’s youth, but I digress. Before we get to the next question, the Getting Started with S3 guide covers basics like making a bucket and uploading files.
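For anyone who prefers a terminal over the console, the same basics look roughly like this with the AWS CLI (covered in the second tip below). The bucket name, region, and file are placeholders, and the CLI needs to be configured with credentials first:

    # create a bucket in a specific region (bucket names are globally unique)
    aws s3 mb s3://my-techbit-bucket --region us-east-2

    # upload a single file to the new bucket
    aws s3 cp index.html s3://my-techbit-bucket/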

Tip

Shop around for regions when deciding where to host your bucket. Different regions have different prices and features. Try creating the bucket in the region closest to where you expect the most traffic, and if a few candidates are comparable, pick the least expensive one to save a bit!

Tip

Use the AWS CLI to save a lot of time uploading files. Logging in and uploading files through the S3 console can be tedious, and the AWS CLI can streamline much of the process via scripting. However, depending on your IAM users and groups, it can also have access to everything, so configure it cautiously.
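As a sketch, a single command can mirror a local build directory to the bucket. The bucket name and the s3-publisher profile below are made-up examples; a narrowly scoped named profile is one way to limit what the CLI can touch:

    # sync the local output/ directory to the bucket, deleting remote files that no longer exist locally
    aws s3 sync output/ s3://my-techbit-bucket --delete --profile s3-publisher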

Can S3 be a Web Host?

Next question: can S3 be a web host? Well, kind of. It depends on our requirements for a web host. Generally, there are two types of web pages (or websites): static and dynamic. There are many technicalities when defining a website as “static” or “dynamic,” so I defer to Wikipedia for greater detail about web page deployment. For this series, static websites only use client-side scripting, while dynamic websites also use server-side scripting.

Ultimately, that determines whether S3 can act as a web host for our website: S3 can’t run server-side scripts by itself, so it only supports client-side scripting. That said, if you need a dynamic hosting solution, there’s another article about setting up a CentOS 7 AMI Web and Mail Server with DDNS, which covers most of what you’d need. Otherwise, this series covers static websites on S3, much like this one.

Enable Website Hosting

There are a few things we have to do so the S3 bucket stops acting like a directory without an index page. Of course, the Website Hosting Guide will be more up-to-date and detailed than this article (though not as comprehensive as this series, of course), but we’ll go over some highlights.

First, we have to enable static website hosting and specify both the index and error documents. Without these settings, the S3 bucket really is just a cloud folder. We can style these pages any way we want, so long as they are “static.” The custom error page keeps visitors from getting a scary, generic error page that makes it seem like the Internet broke. Save the generated endpoint URL for accessing the bucket once it’s publicly accessible (more on that later).
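For reference, a minimal sketch of this step with the AWS CLI might look like the following (the bucket and document names are placeholders; the Static website hosting settings in the console accomplish the same thing):

    # enable static website hosting with an index document and a custom error document
    aws s3 website s3://my-techbit-bucket/ --index-document index.html --error-document error.html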

Don’t feel like sweating over a hot keyboard writing multiple pages with HTML, CSS, and JavaScript by hand? Not to worry, there are plenty of static site generators out there. The best one depends on our requirements. As mentioned in this website’s footer, this one uses Pelican, a static site generator built on the Python programming language and reStructuredText.
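As a rough sketch of that workflow with Pelican (the settings file and output directory names are the usual defaults for a Pelican project, and the bucket name is a placeholder), building the site and pushing it to S3 can be as simple as:

    # build the static site into output/ using the project's settings file
    pelican content -o output -s pelicanconf.py

    # upload the freshly built site to the bucket
    aws s3 sync output/ s3://my-techbit-bucket --delete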

Next, we have to set permissions for website access because buckets are private by default. As repeatedly mentioned in the documentation, enabling public access to a bucket can be risky depending on what the bucket contains. So, review Blocking Public Access and Secure S3 Resources for more background information. Otherwise, follow the directions to Disable Public Access Blocking.
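From the CLI, one way to do this is to explicitly switch off each public-access-block setting for the bucket (the bucket name is a placeholder again, and this should only ever be done on a bucket that is meant to be public):

    # turn off all four public access blocks for this one bucket
    aws s3api put-public-access-block --bucket my-techbit-bucket \
        --public-access-block-configuration BlockPublicAcls=false,IgnorePublicAcls=false,BlockPublicPolicy=false,RestrictPublicBuckets=false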

Finally, Add a Bucket Policy so that anyone on the Internet can read the files in the bucket. Although it’s possible to carve out private directories in the bucket policy, it’s much better to keep a separate bucket for anything we don’t want accessible on the Internet. While we’re here, add an ACL if multiple IAM users and groups will be adding and editing files in the bucket. Although the CORS settings live in the same Permissions section, they’re not covered in the web host guide. The CORS Guide has greater detail about configuring it, and Mozilla’s MDN CORS docs better explain what it does.
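The policy itself is the standard public-read example from the S3 documentation, shown here with a placeholder bucket name; it allows anonymous s3:GetObject requests on every object in the bucket and nothing else:

    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "PublicReadGetObject",
                "Effect": "Allow",
                "Principal": "*",
                "Action": "s3:GetObject",
                "Resource": "arn:aws:s3:::my-techbit-bucket/*"
            }
        ]
    }

Paste it into the bucket policy editor in the console, or save it as something like policy.json and attach it with aws s3api put-bucket-policy --bucket my-techbit-bucket --policy file://policy.json.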

Essentially, CORS controls which origins can access this bucket with certain kinds of requests, like JavaScript using XHR. That said, CORS is mainly for APIs, so websites generally don’t need to configure CORS policies. However, CORS is useful for S3 because S3 also has a REST API. Enabling it keeps other sites from loading our fonts or letting their scripts pull resources out of the bucket, and those extra requests would otherwise add to our data transfer costs.

Although it would only save a few cents, configuring CORS helps reduce the costs associated with other sites trying to access our files. Generally, we want to configure the CORS policy to allow only requests from our own site, but your specific needs will vary. The best advice is that if you’re unsure whether you need CORS, don’t enable it.
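For anyone who does need it, a minimal sketch of a restrictive configuration might look like the following (the origin and, as usual, the bucket name are placeholders); it allows only GET and HEAD requests from a single origin and caches the preflight response for 50 minutes:

    {
        "CORSRules": [
            {
                "AllowedOrigins": ["https://www.example.com"],
                "AllowedMethods": ["GET", "HEAD"],
                "AllowedHeaders": ["*"],
                "MaxAgeSeconds": 3000
            }
        ]
    }

Saved as something like cors.json, it can be applied with aws s3api put-bucket-cors --bucket my-techbit-bucket --cors-configuration file://cors.json, or pasted into the CORS editor on the bucket’s Permissions tab.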

Tip

There are ways to test whether the CORS policy works. Try using this CORS tester to make sure the policy behaves the way you expect.
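A quick sanity check from the command line is also possible: request a file while sending an Origin header and look for an Access-Control-Allow-Origin header in the response (the endpoint, file, and origin below are placeholders):

    # an allowed origin should get Access-Control-Allow-Origin back in the response headers
    curl -I -H "Origin: https://www.example.com" \
        http://bucket-name.s3-website-Region.amazonaws.com/css/main.css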

Conclusions

Well, that’s technically it: we now have a static website on S3, but it’s missing a lot of things we’d expect from a website. For one, it’s only accessible via a scary-looking endpoint URL, like http://bucket-name.s3-website-Region.amazonaws.com. Not only that, it’s only served over plain HTTP instead of HTTPS.

Depending on the region where we created the bucket, some users may also experience high latency accessing our website. For example, someone near the Asia Pacific (Sydney) region accessing a bucket in US East (Ohio) might wait a few extra seconds for the page to finish rendering because of the distance. And since we’re on the hook for data transfer costs, we’d also be susceptible to DDoS attacks running up the bill.

This series will tackle all of these issues and more! Next up, we’ll make the URL more shareable with a custom domain and Route 53. Later, we’ll cover SSL certificates and caching with AWS Certificate Manager and CloudFront. Finally, we’ll cover adding redirects and security headers like CSP with Lambda@Edge. So be sure to stay tuned!