The application needed to offer users effectively unlimited media storage, so we needed a solution with strong cost-control mechanisms.
The goal was to choose a cost-effective media storage solution that scales with the application. The service had to be reliable, fast, and cheap enough that we could let users upload all their media within reasonable limits.
Challenges / Possible Solutions
While designing the solution, we had to consider the following requirements:
- The solution should scale, working equally well for hundreds or billions of files without any degradation in performance.
- This should not require us to provision capacity beforehand and should auto-scale.
- There should not be any limits on the sizes of files.
- The design should have redundant options for fault tolerance.
- This should have options to control and reduce costs for long-term storage.
- Lastly, it should serve the media securely and at a decent speed.
Among the crucial questions we had to answer were the following:
- Do we need block storage or would object storage suffice?
- Do we use the same storage for hot and cold storage?
- How well is the service integrated with the CDN?
Since AWS was our cloud of choice, we started by exploring its storage options. AWS offers three main services for storing files: Elastic Block Store (EBS) is block storage, Elastic File System (EFS) is file storage, and S3 is the object storage service.
It helps that AWS clearly defines the use cases for these storage services. After careful analysis, it all came down to one fundamental decision: whether object storage would be sufficient for us.
It didn’t take us long to decide in favor of S3. It is the object storage service from AWS, with virtually unlimited scale, and is designed to provide 99.999999999% durability and 99.99% availability of objects over a given year.
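To put those two design targets in perspective, here is a rough back-of-the-envelope sketch. It assumes object losses are independent and that the quoted figures apply per year; the function names and the 10-million-object example are illustrative and not taken from the project.

```python
# Back-of-the-envelope reading of the design targets quoted above.
# Assumption: figures are annual and object losses are independent.
ANNUAL_LOSS_RATE = 1e-11   # 1 - 0.99999999999 (11 nines of durability)
UNAVAILABILITY = 1e-4      # 1 - 0.9999 (99.99% availability)
MINUTES_PER_YEAR = 365 * 24 * 60


def expected_objects_lost_per_year(object_count: int) -> float:
    """Expected number of objects lost in one year for a given object count."""
    return object_count * ANNUAL_LOSS_RATE


def years_per_single_loss(object_count: int) -> float:
    """Average number of years between single-object losses."""
    return 1.0 / expected_objects_lost_per_year(object_count)


def downtime_minutes_per_year() -> float:
    """Unavailability budget implied by the availability target."""
    return UNAVAILABILITY * MINUTES_PER_YEAR


if __name__ == "__main__":
    objects = 10_000_000  # hypothetical library size
    print(f"Expected losses per year: {expected_objects_lost_per_year(objects):.6f}")
    print(f"Years per single loss:    {years_per_single_loss(objects):,.0f}")
    print(f"Downtime budget (min/yr): {downtime_minutes_per_year():.2f}")
```

With ten million stored objects, this works out to about one lost object every 10,000 years on average, and a downtime budget of a little under an hour per year.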
Here’s how the solution worked:
- All types of media, i.e., images, videos, and documents, would be stored on S3.
- The versioning feature would provide accidental deletion protection.
- Since the object storage solution doesn’t provide any processing capability, Lambda would be used to offload any processing needs. S3 works very well with Lambda.
- “Lifecycle rules” is a killer feature for controlling storage costs. We decided to use these rules to move media to lower-cost storage classes as it ages and is accessed less often.
- CloudFront would be used to securely distribute the media from S3. We used CloudFront’s restricted access and signed cookie features to serve private content only to authorized users.
- The cherry on top was the replication feature that S3 comes with. For compliance purposes, we were able to replicate our media to multiple regions effortlessly.
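As an illustration of the lifecycle rules mentioned in the list above, the following is a minimal sketch of what such a configuration could look like when built for boto3. The prefix, the day thresholds, the target storage classes, and the noncurrent-version expiry are all assumptions made for the example; the actual values used in the project are not stated. `build_lifecycle_configuration` and `apply_lifecycle` are hypothetical helper names.

```python
from typing import Any, Dict


def build_lifecycle_configuration(
    prefix: str = "media/",      # assumed key prefix
    ia_days: int = 30,           # assumed age for the move to infrequent access
    glacier_days: int = 90,      # assumed age for the move to archival storage
    noncurrent_days: int = 30,   # assumed retention for old object versions
) -> Dict[str, Any]:
    """Build a lifecycle configuration dict in the shape boto3 expects."""
    # Sanity check only; S3 enforces its own minimum-age constraints.
    if not 0 < ia_days < glacier_days:
        raise ValueError("expected 0 < ia_days < glacier_days")
    return {
        "Rules": [
            {
                "ID": "age-out-media",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                # Move objects to cheaper storage classes as they age.
                "Transitions": [
                    {"Days": ia_days, "StorageClass": "STANDARD_IA"},
                    {"Days": glacier_days, "StorageClass": "GLACIER"},
                ],
                # With versioning enabled, old versions would otherwise
                # accumulate and keep costing money.
                "NoncurrentVersionExpiration": {"NoncurrentDays": noncurrent_days},
            }
        ]
    }


def apply_lifecycle(bucket: str, configuration: Dict[str, Any]) -> None:
    """Send the configuration to S3. Requires boto3 and AWS credentials."""
    import boto3  # third-party; imported lazily so the builder works without it

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket, LifecycleConfiguration=configuration
    )
```

Keeping the builder separate from the API call makes the rule set easy to review and unit test before it touches a real bucket.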
Architecture Diagram of the final solution:
S3 has been in use on the project for 5+ years now, and we have multiple terabytes of data stored in it. We have not had a single service interruption in all these years. Lifecycle rules, tagging, and usage pattern reports all helped us achieve a cost-effective storage system. Lambda has a very generous free tier, so we were able to resize the images uploaded to S3 almost for free.
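The image resizing on upload could be wired up roughly as follows. This is only a sketch of the event-handling half of such a Lambda function: it parses an S3 upload notification, decides which objects to resize, and computes target dimensions. The `thumbnails/` prefix, the extension list, and the function names are assumptions; the actual pixel work would need an imaging library and is not shown.

```python
from typing import Any, Dict, List, Tuple
from urllib.parse import unquote_plus

THUMB_PREFIX = "thumbnails/"                          # assumed output prefix
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif")  # assumed formats


def fit_within(width: int, height: int, max_side: int) -> Tuple[int, int]:
    """Scale (width, height) down so the longer side is at most max_side."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height  # never upscale
    # Integer arithmetic keeps the aspect ratio without float rounding noise.
    return (
        max(1, width * max_side // longest),
        max(1, height * max_side // longest),
    )


def handler(event: Dict[str, Any], context: Any = None) -> List[Dict[str, str]]:
    """Return one planned resize job per uploaded image in the S3 event."""
    jobs = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        if key.startswith(THUMB_PREFIX):
            continue  # skip our own output to avoid a trigger loop
        if not key.lower().endswith(IMAGE_EXTENSIONS):
            continue  # videos and documents are left untouched
        jobs.append(
            {"bucket": bucket, "source_key": key, "target_key": THUMB_PREFIX + key}
        )
    return jobs
```

Skipping keys under the output prefix matters when thumbnails are written back to the same bucket, since each write would otherwise trigger the function again.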