With the trove of user-generated content uploaded to social networking and media websites, along with the advent of streaming video services, the storage requirements for web services are higher than ever. Taking on additional users during rapid growth means scaling quickly to meet demand. Rather than building out storage in-house, many organizations are turning to Amazon S3 to handle their storage requirements.

This resource about Amazon S3 is a quick introduction to the object storage service, as well as a “living” guide that will be updated periodically as further integrations are released.

Executive summary

  • What is Amazon S3? Amazon S3 (Simple Storage Service) is a data storage service that can be used to store and retrieve data for a variety of use cases.
  • Why does Amazon S3 matter? Amazon S3 significantly reduces the hardware and maintenance costs of data storage.
  • Who does Amazon S3 affect? Any organization from a single user to a large enterprise can benefit from using S3.
  • When is Amazon S3 launching? Amazon S3 launched in 2006, with additional geographic regions and related services added since then.
  • How do I get Amazon S3? You can get started with AWS using the free tier, which allows limited free use for up to one year.

What is Amazon S3?

Amazon S3 is a data storage service that can be used to store and retrieve data for a variety of use cases, such as static data used in a web page or mobile app, redundancy and storage for internal business data, and long-term data backup. Accordingly, Amazon S3 has different storage tiers to accommodate a variety of potential use cases.

  • Standard storage is intended for use cases where high availability is required and frequent access is anticipated.
  • Standard infrequent access is intended for cases where high availability is required but frequent access is not, such as backups or private user-submitted data for dormant accounts.
  • Glacier storage is intended for data archiving for cases where access latency of minutes to hours is acceptable.
  • Intelligent tiering is a storage option that recently gained two new tiers: archive access and deep archive. Intelligent tiering automatically moves data to the appropriate tier based on access patterns, with archive access being the destination after 90 or more days of inactivity and deep archive after 180 or more days. Intelligent tiering can be toggled on or off in the S3 console.

Lifecycle management is available to change the provisioning of data to different storage tiers to reduce storage costs.
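As a rough illustration of what a lifecycle policy looks like, the sketch below builds one rule that steps objects down from standard storage to infrequent access and then to Glacier. The dictionary follows the shape that the boto3 SDK accepts for its put_bucket_lifecycle_configuration call, but nothing here contacts AWS. The "backups/" prefix, the 30-day and 90-day thresholds, and the helper name build_lifecycle_rule are assumptions chosen for the example, not recommendations from Amazon.

```python
import json


def build_lifecycle_rule(prefix, ia_after_days, glacier_after_days):
    """Return one lifecycle rule that steps objects down to cheaper tiers."""
    if not 0 < ia_after_days < glacier_after_days:
        raise ValueError("expected 0 < ia_after_days < glacier_after_days")
    return {
        "ID": f"tier-down-{prefix.strip('/') or 'all'}",
        "Filter": {"Prefix": prefix},  # only objects under this key prefix
        "Status": "Enabled",
        "Transitions": [
            {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
            {"Days": glacier_after_days, "StorageClass": "GLACIER"},
        ],
    }


# Hypothetical policy: infrequent access after 30 days, Glacier after 90.
lifecycle_configuration = {"Rules": [build_lifecycle_rule("backups/", 30, 90)]}

if __name__ == "__main__":
    # With boto3 installed and credentials configured, this dictionary could
    # be passed as the LifecycleConfiguration argument for a bucket you own.
    print(json.dumps(lifecycle_configuration, indent=2))
```

Keeping the rule as plain data makes it easy to review or store in version control before it is applied to a real bucket.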

As with other AWS products, billing follows a pay-per-use model, which avoids the need to pay upfront for capacity buildouts. Charges are assessed monthly per GB of data stored, with additional charges for GET, POST, PUT, LIST, and other ancillary operations.

Pricing differs between the tiers based on the priority of the data. Presently, in the US-East region, the standard storage tier starts at $0.023 per GB per month for the first 50 TB, with discounts thereafter. Infrequent access storage starts at $0.0125 per GB, and Glacier storage starts at $0.004 per GB.
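To make those rates concrete, here is a minimal sketch that estimates the monthly storage bill for a given volume in each tier. It uses only the per-GB figures quoted above, ignores the volume discounts beyond 50 TB, and leaves out request and data transfer charges, so treat the result as a lower bound. The function name monthly_storage_cost and the tier labels are made up for this example.

```python
from decimal import Decimal

# Per-GB monthly rates quoted in this article for US-East; actual rates change.
PRICE_PER_GB = {
    "standard": Decimal("0.023"),
    "infrequent_access": Decimal("0.0125"),
    "glacier": Decimal("0.004"),
}


def monthly_storage_cost(gigabytes, tier):
    """Storage-only monthly cost in USD; request and transfer fees are excluded."""
    return (PRICE_PER_GB[tier] * Decimal(gigabytes)).quantize(Decimal("0.01"))


if __name__ == "__main__":
    for tier in PRICE_PER_GB:
        print(f"{tier}: ${monthly_storage_cost(1000, tier)}")
```

For 1,000 GB, the quoted rates work out to $23.00 in standard storage, $12.50 in infrequent access, and $4.00 in Glacier.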

At re:Invent 2020, Amazon added S3 support to AWS Outposts, its on-premises option. An AWS Outpost is designed to meet data residency needs and increase performance, and its data can be synced to S3 to prevent loss.

Why does Amazon S3 matter?

The task of planning and building a storage solution on-site is a rather laborious one, particularly when maintenance tasks and data integrity risks are taken into consideration. Various factors, such as the assorted ways that hard disks and related equipment can fail, the fluctuations of the spot market for disks, and the time-consuming nature of managing such hardware can make the prospect of such a buildout an unpleasant one. Aside from these technical reasons, on-premises storage in office spaces in urban centers is not a particularly efficient use of physical space, considering the high cost of rent and utilities in many locales around the world.

By moving that process to Amazon S3, the cost of data storage can be reduced significantly, as can the burden of management tasks. Normal maintenance tasks such as backups and zone transfers can be performed using the web-based management interface. Amazon S3 supports encrypted data, as well as HTTPS connections for content served on web pages.

Who does Amazon S3 affect?

Many businesses leverage Amazon S3 (among other AWS services) in their operations. Perhaps the most famous example of an early adopter is the professionally-targeted photo hosting service SmugMug, which began using Amazon S3 in 2006, one month after the service had launched. As SmugMug was founded without startup capital, the cost savings for the company were considerable. By 2010, SmugMug had stored two petabytes of photos on Amazon S3. Similarly, image hosting services imgur, Tumblr, and Pinterest store data on Amazon S3, as does the news aggregation website Reddit.

Netflix has recently moved most of its business operations to Amazon S3, as part of a larger strategy to increase performance and reliability. With the depth of the streaming catalog at Netflix, the rapidly increasing subscriber count, the sheer amount of data stored, as well as data transferred to customers on a daily basis, Netflix’s lead shows that Amazon S3 is capable of handling demanding load.

On February 28, 2017, extended downtime occurred unexpectedly for S3 at the US-East location, causing partial or complete service failures for services such as Twitch and IFTTT, and even Amazon’s service health dashboard, as all three depend on S3 at US-East to function. While this outage was notable for its duration, Forrester’s Dave Bartoletti sought to assuage concerns in an interview with TechRepublic, saying, “This isn’t a normal incident, nor do we see any indication that the public cloud is becoming unreliable.”

When is Amazon S3 happening?

Amazon S3 launched as part of the public beta of AWS in March 2006. Other AWS services, including storage services such as Elastic Block Store, have followed since the launch of Amazon S3. Likewise, additional geographic regions for Amazon S3 (and AWS as a whole) have been added since that time, with over a dozen regions available across North and South America, Europe, Asia, and Australia.

Amazon Glacier, the long-term data archival service, launched in August 2012. In December 2016, Amazon announced the AWS Snowmobile, a truck capable of storing 100 PB of data that can be driven to any location in order to ease migrations to AWS.

In 2020, Amazon added AWS Snowcone, a portable piece of hardware that fits in a backpack for copying smaller amounts of data for AWS migration.

At AWS re:Invent 2020, several new S3 features were launched. Three new security features were added: Bucket owner validation to prevent accidental changes to data, object ownership overwrite to force new bucket objects to be owned by the bucket owner instead of the uploader, and bucket keys, which reduce the need for constant KMS validation by reducing it to one key per bucket.

At re:Invent 2020, Amazon also announced multi-destination replication capabilities, two-way replication, and new data replication metrics and notifications, alongside the recently announced S3 Storage Lens. Storage Lens gives organization-wide visibility into storage use and provides 29 different metrics that tie into recommendations to reduce costs and apply best data storage practices.

Which services compete with Amazon S3?

The two biggest competitors to Amazon S3 (and AWS as a whole) are Google Cloud Platform and Microsoft Azure. Google Cloud Storage multi-region general storage is $0.026 per GB, whereas Azure starts at $0.0184 per GB for locally redundant hot storage.

For a storage-only solution, Backblaze offers an Amazon S3 competitor in its B2 Cloud Storage service, which severely undercuts the storage pricing of Amazon, Google, and Microsoft at $0.005 per GB.
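The per-GB figures in this section and the S3 standard rate quoted earlier can be lined up in a few lines of code. The sketch below ranks the four services by the monthly storage cost of a given volume. It assumes the quoted rates apply flatly to the whole volume and says nothing about redundancy, request fees, or egress, which differ between providers, so it is a starting point rather than a like-for-like comparison. The function name rank_by_monthly_cost is hypothetical.

```python
from decimal import Decimal

# Per-GB monthly rates as quoted in this article; check current price lists.
QUOTED_PRICE_PER_GB = {
    "Amazon S3 standard": Decimal("0.023"),
    "Google Cloud Storage multi-region": Decimal("0.026"),
    "Azure hot storage (locally redundant)": Decimal("0.0184"),
    "Backblaze B2": Decimal("0.005"),
}


def rank_by_monthly_cost(gigabytes):
    """Return (service, monthly cost in USD) pairs, cheapest first."""
    costs = {
        name: (price * gigabytes).quantize(Decimal("0.01"))
        for name, price in QUOTED_PRICE_PER_GB.items()
    }
    return sorted(costs.items(), key=lambda item: item[1])


if __name__ == "__main__":
    for name, cost in rank_by_monthly_cost(10_000):
        print(f"{name}: ${cost}")
```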

How do I get Amazon S3?

Anyone with an Amazon account can sign up for AWS, which includes access to Amazon S3. Developers can get started with AWS using the Free Tier, which is available to anyone without restriction for the first 12 months. It features 5 GB of standard storage in Amazon S3 with 20,000 GET and 2,000 PUT requests, as well as free access to over a dozen other AWS services.
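For developers wondering whether a small project fits inside those limits, a quick check along the following lines may help. It treats the 5 GB, 20,000 GET, and 2,000 PUT figures as monthly allowances, which is an assumption on top of the numbers given above, and the names MonthlyUsage and free_tier_overages are invented for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MonthlyUsage:
    storage_gb: float
    get_requests: int
    put_requests: int


# Free Tier figures from this article, assumed here to apply per month.
FREE_TIER = MonthlyUsage(storage_gb=5, get_requests=20_000, put_requests=2_000)


def free_tier_overages(usage):
    """Return how far each metric exceeds the Free Tier (0 when within it)."""
    return {
        "storage_gb": max(0, usage.storage_gb - FREE_TIER.storage_gb),
        "get_requests": max(0, usage.get_requests - FREE_TIER.get_requests),
        "put_requests": max(0, usage.put_requests - FREE_TIER.put_requests),
    }


if __name__ == "__main__":
    sample = MonthlyUsage(storage_gb=6, get_requests=25_000, put_requests=1_500)
    print(free_tier_overages(sample))
```

Any nonzero value in the result is usage that would be billed at the normal rates.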

For startups, Amazon has two tiers of free access. The Portfolio package offers up to $15,000 of promotional credit for up to two years, whereas the Portfolio Plus package provides the option of that benefit or up to $100,000 of promotional credit for AWS which expires after one year. The Portfolio package provides up to $5,000 of support credit for one year; Portfolio Plus doubles this amount, and extends the offer to two years. Exact amounts and credit validity vary depending on which startup accelerator your organization is aligned with.

Note: This article was updated by Brandon Vigliarolo.
