Using AWS S3 to Power Your Digital World

As a designer, web developer and techie-geek, I need a versatile, robust data storage solution that I can afford and can use without learning a new language. So far, I’ve found only one service that handles the large majority of my needs. This article covers how I use the Amazon Web Services Simple Storage Service (AWS S3) to do that.

AWS S3

AWS S3 is Amazon’s cloud storage solution. It’s versatile, reliable, fast and scalable enough to fit almost anyone’s needs. You would expect a service that sounds this great to be expensive, but considering the features you get, it’s actually the most affordable storage solution I’ve found on the web.

Amazon Web Services S3

AWS S3 is intended for developers, but thanks to some great tools, it’s easy enough for just about anyone to use. Before I get into how I use AWS S3, I want to mention that it doesn’t use the traditional file structure of folders and files. Instead, AWS S3 uses “buckets” in which you store objects. The tools I use make AWS S3 appear to be a normal file system, with the exception of “buckets”. Think of a bucket as a separate hard drive where you’ll store your files. You might also want to read Amazon S3 at Wikipedia. So let’s get on with how I use AWS S3.
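To make the bucket idea a bit more concrete, here is a minimal Python sketch of how a tool could present a flat set of object names as folders. It is only an illustration and not how Jungle Disk or any other tool actually works internally. The bucket is a plain dictionary, and the object names under “wp-content/” are made up for the example.

```python
# Hypothetical in-memory stand-in for a bucket: a flat mapping of object name -> data.
# There are no real folders here, only names that happen to contain "/".
bucket = {
    "red-rock-panorama.jpg": b"...",
    "wp-content/uploads/logo.png": b"...",      # made-up object name
    "wp-content/uploads/header.jpg": b"...",    # made-up object name
    "wp-content/themes/style.css": b"...",      # made-up object name
}


def list_folder(keys, prefix="", delimiter="/"):
    """Return (subfolders, files) that sit directly under the given prefix."""
    subfolders, files = set(), []
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter in rest:
            # Anything before the next delimiter is shown as a folder.
            subfolders.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            files.append(key)
    return sorted(subfolders), sorted(files)


print(list_folder(bucket))
print(list_folder(bucket, "wp-content/"))
```

The “folders” only exist in the listing: they are derived from the object names each time you look.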

AWS S3 + Jungle Disk

I probably use Jungle Disk most often because it makes it easy to use and manage my AWS S3 buckets, perform automated backups and centralize my data for access anywhere at any time. When you use Jungle Disk with your AWS S3 account, you set up your individual buckets, which Jungle Disk can connect as network drives. Then you have drag and drop access to your AWS S3 files! Jungle Disk also encrypts your files, so they’re safe and secure.

Jungle Disk

Jungle Disk has plenty of options for bucket management, automatic backups, encryption, bandwidth limiting, etc. It also has a monitoring tool to view and manage transfers in progress. It typically runs in the background but it comes in very handy when you would like to take action on something or just watch what’s going on.

Jungle Disk settings

If you’re worried about cross-platform compatibility, don’t be. Jungle Disk has versions of its software for 32- and 64-bit Windows, Linux and Mac. There’s even a version that you can run from a USB flash drive on all three platforms for quick access to your files from anywhere.

Jungle Disk download

Of course, if you forget your flash drive, Jungle Disk also offers web access to your files. If you work with other people who need access to your files, Jungle Disk can do that too: its multi-user options make it very easy for several people to access the same AWS S3 buckets.

Jungle Disk users

OK, so we have cross-platform cloud storage that’s drag and drop easy, that we can access anywhere and that comes with tons of great options. What else do we need?

AWS S3 as a “CDN” or Public File Access

Most of you probably have blogs or websites hosted on a web server you pay for. As we all know, quality web hosting isn’t cheap, especially when it comes to storage space. I don’t want to use my expensive web server storage for images and other file downloads, and I especially don’t want to bog down my web server with file requests from visitors when there’s a better way to do it.

S3Fox for Firefox

S3Fox is a Firefox add-on that lets you manage your AWS S3 buckets and files. Why do we need S3Fox when we could use Jungle Disk? S3Fox does a few things Jungle Disk wasn’t intended for, such as managing CloudFront distributions, which we’ll get into later. I’ve set up a bucket called “files.jremick.com”, which I plan to use to host images and files for my blog, for other websites and for other random purposes.

S3Fox

Then I set up a CNAME on my web server pointing “files” and “www.files” to “files.jremick.com.s3.amazonaws.com.”, which allows me to use the subdomain “http://files.jremick.com” to access files I’ve placed in the “files.jremick.com” bucket for public viewing. The other two entries are used by CloudFront, which we’ll get into later.
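The naming pattern is worth spelling out, because the bucket name and the subdomain have to match for this to work. Here is a small Python sketch of the pattern; the helper names are mine, and it only builds strings (it doesn’t touch DNS or S3).

```python
from urllib.parse import quote

S3_ENDPOINT = "s3.amazonaws.com"


def cname_target(bucket_name):
    """DNS name the CNAME record should point to (trailing dot = fully qualified)."""
    return f"{bucket_name}.{S3_ENDPOINT}."


def public_url(bucket_name, key):
    """URL of a public object once the CNAME is in place.

    Assumes the bucket is named exactly like the hostname it is served from.
    """
    return f"http://{bucket_name}/{quote(key)}"


print(cname_target("files.jremick.com"))
print(public_url("files.jremick.com", "red-rock-panorama.jpg"))
```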

S3 Cname

So now we have an easy way to access files at http://files.jremick.com. We could use it as a sort of “CDN” (even though it wouldn’t be a true CDN) or we could just use it to provide file downloads that won’t bog down our web server. If you’re wondering, yes, you can view and download the panorama image from my S3 account at http://files.jremick.com/red-rock-panorama.jpg and no, I’m not worried about bandwidth because it’s super cheap! :-) Did you notice the “wp-content” directory? Familiar, eh? On to using AWS S3 with WordPress!

S3Fox files

AWS S3 plugin for WordPress

The AWS S3 plugin for WordPress is one of my favorite plugins for WordPress because it lets me use my AWS S3 account to host media for my blog rather than my expensive web server. Of course I could do this manually if I wanted but the plugin integrates this functionality with WordPress so I can upload files without leaving my WordPress control panel.

AWS S3 and WordPress

You might be wondering why this is beneficial. Well, for starters, images and other media loaded from your AWS S3 account will likely load faster simply because you’re using Amazon’s servers rather than your own, possibly puny server. Also, your web server won’t be bogged down serving these media files on top of your regular PHP/HTML files.

Your website will also load faster for most people because most browsers limit the number of parallel downloads from a single domain. If you host your images on your AWS S3 account, they are served from a second domain, so browsers can load more files at the same time. See Maximizing Parallel Downloads in the Carpool Lane for more information.
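Here is a rough back-of-the-envelope sketch of that effect in Python. It assumes a limit of two parallel downloads per host and that every file takes about the same time to download; both are simplifications I picked for illustration, and the hostnames are placeholders.

```python
import math
from collections import Counter
from urllib.parse import urlparse


def download_rounds(urls, per_host_limit=2):
    """Estimate how many sequential 'rounds' of downloads a page needs.

    per_host_limit is an assumed browser limit, not a measured one.
    """
    per_host = Counter(urlparse(url).hostname for url in urls)
    return max(
        (math.ceil(count / per_host_limit) for count in per_host.values()),
        default=0,
    )


# Eight images on one host versus the same eight split across two hosts.
one_host = [f"http://www.example.com/img{i}.jpg" for i in range(8)]
two_hosts = one_host[:4] + [f"http://files.example.com/img{i}.jpg" for i in range(4, 8)]

print(download_rounds(one_host))   # 4
print(download_rounds(two_hosts))  # 2
```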

AWS S3 + CloudFront

Ok, so I’ve covered how I use AWS S3 for networked storage as well as for my websites and reducing the load on my web server. If you run a high traffic website (which I don’t) or you’re just a nerd (like me) and want things to run as fast as possible then you’ll want to check out Amazon CloudFront as well.

Amazon CloudFront

Earlier in the article I put CDN in quotes in the heading “AWS S3 as a “CDN” or Public File Access”. I did that because AWS S3 is NOT a true CDN. A CDN is a Content Delivery Network, which delivers your files from servers distributed around the world. Visitors get your files from the fastest resource available (usually the closest server). AWS S3 has only a few data centers around the world and your data will most likely be in one location, making it far from a CDN.

If you want the best speed for visitors across the globe, you’ll want to use a real CDN like CloudFront. Thankfully Amazon has made it super easy to use these services together. I’ve already signed up for CloudFront and now I just need to configure it using S3Fox.

CloudFront distribution

Simply right click on the bucket you want distributed to Amazon’s CloudFront and click “Manage Distributions”. From here you can configure your CloudFront distribution. You’ll be assigned a unique domain for the distribution. “d1i7xb2p8w9276.cloudfront.net” is what this distribution has been assigned.

I’ve also used “cdn.jremick.com” as the CNAME for this distribution so I can access the files at http://cdn.jremick.com. You’ll see the status as “InProgress” until the distribution has been deployed, at which point the status changes to “Deployed”.

CloudFront distribution

Then I set up the CNAME on my web server.

CloudFront CNAME

Now when I request files at http://cdn.jremick.com, they are served from the CloudFront servers, which pull the files from my AWS S3 account and cache them for all subsequent requests.
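If the “pull once, then cache” behaviour sounds abstract, this toy Python sketch shows the idea. It is a simplified model, not CloudFront’s actual implementation: the class name is made up, the “origin” is just a dictionary standing in for the S3 bucket, and it ignores expiry and multiple edge locations.

```python
class EdgeCache:
    """Toy pull-through cache: fetch from the origin on a miss, then serve from memory."""

    def __init__(self, origin):
        self.origin = origin          # dict standing in for the S3 bucket
        self.cached = {}
        self.origin_requests = 0      # how often we had to go back to the bucket

    def get(self, key):
        if key not in self.cached:
            self.origin_requests += 1
            self.cached[key] = self.origin[key]
        return self.cached[key]


bucket = {"red-rock-panorama.jpg": b"image bytes"}
edge = EdgeCache(bucket)

for _ in range(3):
    edge.get("red-rock-panorama.jpg")

print(edge.origin_requests)  # 1
```

Three visitor requests, one request to the bucket.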

There are some disadvantages to CloudFront (and other true CDNs) though. Once a file has been cached on the CloudFront servers, it won’t be requested from your AWS S3 account again until the cached copy expires. That means you’ll need to version your files (filename_v1.css, filename_v2.css, etc.) so your users actually see your changes. It’s a great service but it really is intended more for high traffic purposes. In most situations for average people with blogs, AWS S3 will do just fine. I will be using CloudFront to host JavaScript, CSS and other static files though, just because I’m a nerd and I want performance! :-)
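The versioning trick can be sketched in a few lines of Python. The helper below is hypothetical and just follows the filename_v1.css pattern mentioned above; the dictionary stands in for whatever the CDN currently has cached.

```python
import os


def versioned_name(filename, version):
    """Turn 'filename.css' into 'filename_v2.css' for version 2."""
    stem, extension = os.path.splitext(filename)
    return f"{stem}_v{version}{extension}"


# Pretend the CDN has already cached version 1.
edge_cache = {"filename_v1.css": "body { color: black; }"}

new_name = versioned_name("filename.css", 2)
print(new_name)                 # filename_v2.css
print(new_name in edge_cache)   # False
```

Because the new name has never been requested before, the CDN has nothing cached for it and has to fetch the fresh file.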

AWS S3 + S3Sync = Automated Offsite Server Backups

I’m a worrywart when it comes to losing data. My web server hosts around 20 accounts for other people and it’s very important to make sure all that data is backed up, safe and secure. That’s where S3Sync comes in. I can use it to automatically back up my web server to a specified AWS S3 bucket.

Here I’ve jumped into Transmit (FTP for Mac with AWS S3 support) and logged into my AWS S3 account. I’m looking at my “servintbackups” bucket which shows the different backup folders. Each night the backups are updated automatically on my AWS S3 account.

servintbackups
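For the curious, here is a rough Python sketch of the general idea behind a nightly sync: only upload what has changed since last time. This is my own simplified illustration and not S3Sync’s actual logic. The function name is made up, the files are in-memory bytes rather than real files, and comparing MD5 digests is just one possible way to detect changes.

```python
import hashlib


def files_to_upload(local_files, remote_digests):
    """Return the paths whose content is new or differs from the last backup.

    local_files:    {path: file content as bytes}
    remote_digests: {path: MD5 hex digest recorded at the last backup}
    """
    changed = []
    for path, data in sorted(local_files.items()):
        digest = hashlib.md5(data).hexdigest()
        if remote_digests.get(path) != digest:
            changed.append(path)
    return changed


local = {"a.txt": b"hello", "b.txt": b"world"}
remote = {"a.txt": hashlib.md5(b"hello").hexdigest()}  # b.txt was never backed up

print(files_to_upload(local, remote))  # ['b.txt']
```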

If you would like to do this as well, check out these tutorials.

Conclusion

Using AWS S3 and a variety of tools I’ve managed to get a lot for a little.

  • Centralized file access in the cloud, anywhere, on any platform.
  • Automated backups for desktop and server computers.
  • Web access to your files.
  • Media hosting outside of your web server to reduce load and speed things up.
  • Easy-to-set-up “CDN” and/or file access for users.
  • Easy-to-set-up true CDN with CloudFront.

As I said earlier, AWS S3 is built for developers, so if I need to use it for even more solutions, the opportunity is there.

As great as AWS S3 is, it may not fit the bill for every problem you have. For instance, AWS S3 servers don’t gzip files, and backing up 200GB of data (like an iTunes library) would cost $30 per month vs $5 or $10 per month on other services. AWS S3 is just one of the tools I use among many. If you’re interested in other solutions such as Google App Engine, iDrive, etc. let us know in the comments below.
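The arithmetic behind that comparison is easy to check. The Python sketch below uses a storage rate of 15 cents per GB per month, which is simply what $30 for 200GB works out to; it is not an official price, and it ignores request and data transfer charges.

```python
# Implied by the figures above: $30 per month / 200 GB = 15 cents per GB per month.
STORAGE_CENTS_PER_GB_MONTH = 15


def monthly_storage_cost(gigabytes, cents_per_gb=STORAGE_CENTS_PER_GB_MONTH):
    """Monthly storage cost in dollars (storage only, no requests or transfer)."""
    return gigabytes * cents_per_gb / 100


def breakeven_gb(flat_dollars, cents_per_gb=STORAGE_CENTS_PER_GB_MONTH):
    """Amount of data at which a flat-rate plan becomes the cheaper option."""
    return flat_dollars * 100 / cents_per_gb


print(monthly_storage_cost(200))       # 30.0
print(round(breakeven_gb(10), 1))      # 66.7
```

Under these assumptions, a $10 flat-rate service starts to win once you store more than roughly 67GB.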



9 Comments
  • Rich says:

    very nice article. i’ve been looking into this a bit, and this is one of the better (more resourceful) articles thus far.

  • Andrew says:

    Very nice! I’ll have to look into this more!

  • Jeff says:

    I must admit I really don’t do enough to find out more about things that are out there to make our lives easier – so thanks!!

  • CloudFiles (http://www.mosso.com/cloudfiles.jsp) is both Amazon CloudFront and AWS S3 in one. :) Plus, when you upload/download files through their control panel, the data transfer is free (meaning you can upload your 200GB iTunes library without paying for the data transfer :) ). Jungle Disk also supports them, by the way.

  • xmdsys says:

    Awesome, thanks for the heads up Bruno! I’m reading up on some of their offerings and here’s what I can see so far.

    1. They don’t offer CNAME support yet and for me, this is a must.

    2. Their service doesn’t offer the easier to use “folders” that you get with S3. S3 doesn’t actually use folders but applications (like Transmit) make it appear that way because S3 allows it. Apparently this service doesn’t (from what I can tell so far).

    3. Their pricing is actually more expensive. They start at .22/GB vs CloudFront which starts at .17/GB for US and EU locations then .21 – .22/GB for locations in Asia. However, Mosso doesn’t charge for other things like certain requests so I have a feeling they’re really close to the same pricing at the end of the month depending on how you use them.

    4. Another advantage Mosso has is that they’re using Limelight’s CDN which, from what I’ve read, outperforms CloudFront in latency, but CloudFront outperforms Limelight in transfer speed. The purpose of using a CDN though is for smaller files and to reduce latency, so Mosso wins at this point. However, keep in mind CloudFront is much newer than Limelight and Amazon is expanding their footprint. Unless you’re running a super high traffic website or you care about an extra couple ms latency, CloudFront will do just what you need.

    5. Another advantage Mosso has is that they have technical support to back up their product, which Amazon doesn’t provide (to the best of my knowledge). For some people this is important too.

    Anyway, I’m definitely going to look into it a little more. I doubt I’ll make the switch but for those people starting out it might be worth looking into Mosso. Thanks again! :-D

  • Jarel, I suppose that at the end of the day, it all depends on how one is planning on using the service. If you REALLY care about the lower latency, and need the technical support, the best choice would probably be Mosso. Nevertheless, like you said, CloudFront is a new service, and I’d say that they’ll probably get even better in due time. Like you said, they’re expanding. Besides, they have CNAME support, and for some (most?) that will indeed be a decisive factor. Either way, both would be a better choice over the traditional method of hosting these kinds of files. :)

  • Andy says:

    I always enjoy learning what other people think about Amazon Web Services and how they use them. If you want to manage Amazon S3 accounts on Windows check out my freeware product CloudBerry Explorer. http://cloudberrylab.com/

  • Justin says:

    Another great article! Thanks! Going to check out that wordpress plugin.

  • Thanks for a great article! Lots of ideas for me to chew on. I understand that Jungle Disk now supports Rackspace as well as Amazon. Rackspace doesn’t charge for bandwidth. Have you looked into the pros and cons of using Rackspace for the purposes you just described?