Kyle Barron

Protect self-hosted assets with firewalls and CORS

I self-host map tiles and other large assets for use by my own projects, but I don't want to host map data for everyone on the internet. By default, a server with map tiles open to the internet lets anyone point to your URL and serve a web map using your map data.

I've created open-source scripts to generate map tiles, so that developers can self-host map assets as easily as possible, but I don't want to pay for them to use my tiles directly.

I host my map assets partially on a Google Cloud f1-micro instance, and partially on AWS S3 directly. For both, I put Cloudflare in front so that I can benefit from their caching and lower my bandwidth costs.

Allow only traffic from Cloudflare

I use Cloudflare to make load times faster for users through their CDN, and to keep my bandwidth costs as low as possible by leveraging their free caching. However, this strategy only works if all your traffic is indeed coming through Cloudflare.

Both AWS S3 and a Google Cloud Compute Engine instance are reachable at default addresses. For S3, if you create the bucket in us-east-1, it's https://s3.amazonaws.com/BUCKET_NAME. For Google Cloud, it's the instance's external IP address. Traffic that goes directly to either of those isn't proxied by Cloudflare, and thus isn't cached. The solution is to allow access only from Cloudflare's IP addresses: when Cloudflare proxies traffic for you, your server sees Cloudflare's IP address as the source instead of the user's.

You'll need to first get the list of all the IP addresses Cloudflare uses for proxy traffic, which will be used in the next steps.
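
Cloudflare publishes its ranges as plain-text lists with one CIDR range per line (at the time of writing, at https://www.cloudflare.com/ips-v4 and https://www.cloudflare.com/ips-v6). As a minimal sketch, assuming you have saved that text locally, the helper below validates each range before you paste it anywhere. The name parse_cidr_list is hypothetical, and the sample input is just a few of the ranges used later in this post.

```python
import ipaddress


def parse_cidr_list(text):
    """Parse newline-separated CIDR ranges, skipping blank lines."""
    networks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # ip_network raises ValueError on malformed input, so bad data
        # fails loudly instead of ending up in a firewall rule.
        networks.append(ipaddress.ip_network(line))
    return networks


# Sample input: a few of the ranges that appear later in this post.
sample = """
173.245.48.0/20
103.21.244.0/22
2400:cb00::/32
"""

networks = parse_cidr_list(sample)
print([str(n) for n in networks])
```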

S3

For S3, you can restrict access to traffic coming from Cloudflare by creating a bucket policy. To do this, go to the Amazon S3 console, choose your bucket, click "Permissions", then click "Bucket Policy". You'll then be greeted by a text editor, where you can add a bucket policy.

The following bucket policy allows the action s3:GetObject (that is, HTTP GET requests) on all objects in your bucket, provided that the source IP address falls within one of the ranges listed in aws:SourceIp. The ranges listed were all of the IP ranges Cloudflare used at the time of writing.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "PublicReadGetObject",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::BUCKET_NAME/*",
      "Condition": {
        "IpAddress": {
          "aws:SourceIp": [
            "2400:cb00::/32",
            "2405:8100::/32",
            "2405:b500::/32",
            "2606:4700::/32",
            "2803:f800::/32",
            "2c0f:f248::/32",
            "2a06:98c0::/29",
            "103.21.244.0/22",
            "103.22.200.0/22",
            "103.31.4.0/22",
            "104.16.0.0/12",
            "108.162.192.0/18",
            "131.0.72.0/22",
            "141.101.64.0/18",
            "162.158.0.0/15",
            "172.64.0.0/13",
            "173.245.48.0/20",
            "188.114.96.0/20",
            "190.93.240.0/20",
            "197.234.240.0/22",
            "198.41.128.0/17"
          ]
        }
      }
    }
  ]
}
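
Because Cloudflare's ranges can change, it may be easier to regenerate this policy than to edit it by hand. The following is a minimal sketch, not a definitive tool: build_bucket_policy is a hypothetical helper that fills the same policy structure shown above from a bucket name and a list of ranges, and prints JSON you can paste into the editor.

```python
import json


def build_bucket_policy(bucket_name, cidr_ranges):
    """Return a policy dict allowing s3:GetObject only from cidr_ranges."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "PublicReadGetObject",
                "Effect": "Allow",
                "Principal": "*",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
                "Condition": {
                    "IpAddress": {"aws:SourceIp": list(cidr_ranges)}
                },
            }
        ],
    }


# Two sample ranges only; pass the full list in practice.
policy = build_bucket_policy(
    "BUCKET_NAME", ["173.245.48.0/20", "2400:cb00::/32"]
)
print(json.dumps(policy, indent=2))
```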

Once you save the bucket policy, all objects will only be accessible through Cloudflare. For example, if I put an index.html at the root of my bucket, and attempt to load it by its default AWS S3 URL (https://s3.amazonaws.com/BUCKET_NAME/index.html), I'm greeted with "AccessDenied":

<Error>
  <Code>AccessDenied</Code>
  <Message>Access Denied</Message>
  <RequestId>EBD7E77186227632</RequestId>
  <HostId>
    JhPjXJKGdWUbP14jkhti8RdkenxGyG8+zm1h9Cew/hOkiYIiXKpQ3uCA1OTJVO5wBcxwAvqZq/s=
  </HostId>
</Error>

However, after pointing Cloudflare to the bucket, I can successfully access that file (https://CLOUDFLARE_URL/index.html):

Hello world!

Google Cloud Compute Engine

You can also restrict your Google Cloud Compute Engine instance to only accept traffic from Cloudflare IP addresses. Go to the Firewall rules page in the Google Cloud Console for your project, then select "Create Firewall Rule". Give it a name and description; I like cloudflare-allow-http-https. Keep "Direction" as "Ingress", since you're restricting inbound traffic. Set "Action on match" to "Allow". "Targets" lets you apply this firewall rule to all instances or just selected ones. Since I also have instances in the same project that shouldn't receive HTTP traffic, I add a target tag to the firewall rule, and add the same tag to the instance I want to allow traffic to.

For "Source filter", keep it as "IP ranges", then add the list of Cloudflare's source IP ranges. Annoyingly, this apparently must be done one at a time. Also, I got errors when trying to add Cloudflare's IPv6 addresses, so I only added its IPv4 addresses and it appears to work fine. The IPv4 addresses are the ones of the form [0-255].[0-255].[0-255].[0-255].

For "Protocols and ports", choose "Specified protocols and ports", and set tcp to 80,443, which are the ports used for HTTP and HTTPS traffic.

Then click "Create" and wait for the rule to be created.

If you added a "Target tag" in the firewall, don't forget to go back to the VM instance, choose "Edit", and add that tag under "Network tags".
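
The console steps above can also be expressed as a single gcloud command, which avoids entering ranges one at a time. This is a sketch under assumptions: it presumes the gcloud CLI is installed and authenticated, the tag name cloudflare-only is made up for illustration, and the helper only builds and prints the command rather than running it. It drops IPv6 ranges to mirror the IPv4-only workaround described above.

```python
import ipaddress
import shlex


def gcloud_firewall_command(rule_name, cidr_ranges, target_tag):
    """Build (but do not run) a gcloud command mirroring the console steps."""
    # Keep only IPv4 ranges, matching the workaround described above.
    ipv4 = [c for c in cidr_ranges if ipaddress.ip_network(c).version == 4]
    return [
        "gcloud", "compute", "firewall-rules", "create", rule_name,
        "--direction=INGRESS",
        "--action=ALLOW",
        "--rules=tcp:80,tcp:443",
        "--source-ranges=" + ",".join(ipv4),
        "--target-tags=" + target_tag,
    ]


cmd = gcloud_firewall_command(
    "cloudflare-allow-http-https",
    ["173.245.48.0/20", "103.21.244.0/22", "2400:cb00::/32"],
    "cloudflare-only",  # hypothetical tag name
)
print(shlex.join(cmd))
```

Review the printed command before running it, and remember that the tag still has to be added to the instance itself.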

You should be able to test that the firewall is working correctly by attempting to navigate to the instance's external IP address in your browser. The request should time out.
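
If you'd rather script this check than use a browser, a plain TCP connection attempt is enough. The sketch below is one way to do it; can_connect is a hypothetical helper name, and the five-second timeout is an arbitrary choice.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers both timeouts and refused connections.
        return False
```

Calling can_connect with your instance's external IP address and port 443 from a machine outside Cloudflare's network should return False once the rule is active.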

CORS

From the Cloudflare documentation, there are three options to make Cloudflare's cache see your updated CORS headers:

  • Change the filename or URL to bypass cache to instruct Cloudflare to retrieve the latest CORS headers.
  • Use the single-file purge API to specify the appropriate CORS headers along with the purge request.
  • Update the resource’s last-modified time at your origin web server. Then, complete a full purge to retrieve the latest version of your assets including updated CORS headers.

Each of these options is difficult. For the first, you need to change all your filenames, which can be a burden on S3. The second means purging every file individually through their API, which could take a while, and it would be easy to miss some files.
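
To give a sense of what the second option involves, here is a rough sketch that groups file URLs into request bodies for Cloudflare's purge endpoint. It rests on assumptions you should check against Cloudflare's current API documentation: that the purge_cache endpoint for a zone accepts a "files" list whose entries can carry a "headers" object, and that 30 files per request is an acceptable batch size. The helper name build_purge_payloads is hypothetical, and the code only builds the payloads; it sends nothing.

```python
def build_purge_payloads(urls, origin, batch_size=30):
    """Group URLs into purge request bodies, one dict per API call."""
    payloads = []
    for start in range(0, len(urls), batch_size):
        batch = urls[start:start + batch_size]
        payloads.append(
            {
                "files": [
                    # The Origin header is assumed to select the cached
                    # variant that was stored for that requesting origin.
                    {"url": u, "headers": {"Origin": origin}}
                    for u in batch
                ]
            }
        )
    return payloads


# Placeholder tile URLs for illustration only.
urls = [f"https://CLOUDFLARE_URL/tiles/{i}.png" for i in range(65)]
payloads = build_purge_payloads(urls, "https://kylebarron.dev")
print(len(payloads), [len(p["files"]) for p in payloads])
```

Even with batching, every cached URL has to be enumerated, which is why this route gets tedious for a large tile set.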

The overall lesson here is to minimize CORS header changes. It's much easier to set up the first time than to change. If you're constantly creating new domain names that you want to access your map data from, you might want to have a more open CORS configuration, so that you don't need to go through one of these steps on every new domain.

AWS S3

By default, a bucket's CORS configuration is empty. This means that browsers will only let pages from the same origin read the response. You can test this with fetch(URL) in the browser console. If you go to a different domain, say, https://google.com, and then try:

fetch("https://CLOUDFLARE_URL/index.html");

you'll see an error message like:

Access to fetch at 'https://CLOUDFLARE_URL/index.html' from origin
'https://www.google.com' has been blocked by CORS policy: No
'Access-Control-Allow-Origin' header is present on the requested resource. If an
opaque response serves your needs, set the request's mode to 'no-cors' to fetch
the resource with CORS disabled.

If you instead go to the same domain, say https://CLOUDFLARE_URL/index.html, and run fetch('https://CLOUDFLARE_URL/index.html') again, it'll go through because the request comes from the same origin.

If you wanted to allow every website to load content from your bucket, you could add a CORS configuration like the following:

<?xml version="1.0" encoding="UTF-8"?>
<CORSConfiguration>
  <CORSRule>
    <AllowedOrigin>*</AllowedOrigin>
    <AllowedMethod>GET</AllowedMethod>
    <AllowedMethod>HEAD</AllowedMethod>
    <MaxAgeSeconds>3000</MaxAgeSeconds>
    <AllowedHeader>*</AllowedHeader>
  </CORSRule>
</CORSConfiguration>

I only want selected websites to be able to load my content, so I can instead use a CORS configuration like the following:

<?xml version="1.0" encoding="UTF-8"?>
<CORSConfiguration>
  <CORSRule>
    <AllowedOrigin>*kylebarron.dev</AllowedOrigin>
    <AllowedMethod>GET</AllowedMethod>
    <AllowedMethod>HEAD</AllowedMethod>
    <MaxAgeSeconds>3000</MaxAgeSeconds>
    <AllowedHeader>*</AllowedHeader>
  </CORSRule>
  <CORSRule>
    <AllowedOrigin>http://www.example2.com</AllowedOrigin>
    <AllowedMethod>PUT</AllowedMethod>
    <AllowedMethod>POST</AllowedMethod>
    <AllowedMethod>DELETE</AllowedMethod>
    <AllowedHeader>*</AllowedHeader>
  </CORSRule>
</CORSConfiguration>
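
It's worth checking what a wildcard origin like *kylebarron.dev actually admits. The sketch below is an illustration, not S3's implementation: it assumes the wildcard is treated as plain string matching with at most one "*", which you should verify against the S3 documentation. The function origin_matches and the sample origins are made up for the example.

```python
def origin_matches(pattern, origin):
    """Match an origin against a pattern containing at most one '*'."""
    if pattern.count("*") > 1:
        raise ValueError("at most one '*' is supported")
    if "*" not in pattern:
        return origin == pattern
    prefix, suffix = pattern.split("*")
    return (
        len(origin) >= len(prefix) + len(suffix)
        and origin.startswith(prefix)
        and origin.endswith(suffix)
    )


for origin in [
    "https://kylebarron.dev",
    "https://maps.kylebarron.dev",
    "https://notkylebarron.dev",
    "https://example.com",
]:
    print(origin, origin_matches("*kylebarron.dev", origin))
```

Under that assumption, the third origin matches too, because the pattern only constrains how the origin ends. A pattern such as https://*.kylebarron.dev, together with a separate AllowedOrigin entry for the bare domain, would be tighter.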

Update Cloudflare's CORS cache

If you've updated your CORS,