Amazon Simple Storage Service (Amazon S3) is an object storage service that stores data as objects within buckets. An object is a file and any metadata that describes the file. Each object has a key, which is the full path of the object within the bucket. A bucket is a container for objects, similar to a directory. Bucket names must be globally unique. To store data in S3, we create a bucket and specify a bucket name and an AWS Region.
Naming Convention
When we name our bucket, we have to follow these naming rules:
- Consist only of lowercase letters, numbers, dots (.), and hyphens (-)
- Be 3–63 characters long
- Begin and end with a letter or number
- Not be formatted as an IP address (for example, 192.168.5.4)
- Not start with the prefix xn--
- Not end with the suffix -s3alias, which is reserved for access point alias names
- Be unique across all AWS accounts in all the AWS Regions within a partition
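The rules above can be sketched as a quick client-side validator. This is illustrative only; the service enforces additional rules (for example, other reserved prefixes) that are not listed here:

```python
import re

# Name must start/end with a lowercase letter or digit, be 3-63 chars,
# and use only lowercase letters, digits, dots, and hyphens in between.
BUCKET_NAME_RE = re.compile(r"^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$")
IP_RE = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")

def is_valid_bucket_name(name: str) -> bool:
    if not BUCKET_NAME_RE.fullmatch(name):
        return False  # wrong characters, wrong length, or bad start/end
    if IP_RE.fullmatch(name):
        return False  # must not look like an IP address
    if name.startswith("xn--") or name.endswith("-s3alias"):
        return False  # reserved prefix / suffix
    return True
```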
The AWS documentation covers the full set of naming rules.
Max 5TB
The maximum object size is 5 TB. To upload an object larger than 5 GB, we must use a multipart upload.
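As a sketch of the arithmetic involved, a client plans a multipart upload by splitting the object into parts. The limits below (5 MiB–5 GiB per part, at most 10,000 parts) come from the S3 multipart upload documentation; the 100 MiB default part size is an arbitrary choice for illustration:

```python
import math

MIB = 1024 ** 2
GIB = 1024 ** 3

# Multipart limits per the S3 documentation: each part must be between
# 5 MiB and 5 GiB (the last part may be smaller), max 10,000 parts.
MIN_PART = 5 * MIB
MAX_PART = 5 * GIB
MAX_PARTS = 10_000

def plan_parts(object_size: int, part_size: int = 100 * MIB) -> int:
    """Return the number of parts needed, validating the chosen part size."""
    if not MIN_PART <= part_size <= MAX_PART:
        raise ValueError("part size must be between 5 MiB and 5 GiB")
    parts = math.ceil(object_size / part_size)
    if parts > MAX_PARTS:
        raise ValueError("too many parts; choose a larger part size")
    return parts
```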
Not a Global service
S3 is not a global service, even though the console lists buckets globally. We must select a Region for each bucket we create.
S3 Storage classes
Amazon S3 offers a range of storage classes designed for different use cases:
- S3 Standard
- S3 Standard-IA (Infrequent Access)
- S3 One Zone-IA (Infrequent Access)
- S3 Glacier Instant Retrieval
- S3 Glacier Flexible Retrieval
- S3 Glacier Deep Archive
- S3 Intelligent-Tiering
Versioning
We can keep multiple variants of an object in the same bucket by using versioning. By default, S3 Versioning is disabled on buckets; we must explicitly enable it. With versioning enabled, we can easily recover from unintended user actions such as accidental deletion, or roll back objects after an application failure.
We can enable versioning even after a bucket has been created and objects have been uploaded. Any object stored before versioning is enabled gets the version ID "null". Once enabled, versioning can only be suspended, not disabled, and suspending it does not delete the previous versions.
With versioning enabled, deleting an object only adds a delete marker. It is a soft delete: we can still see the older versions and revert to them, for example by removing the delete marker. To delete an object permanently, we must delete the specific version ID (including the delete marker).
S3 Security
Security is a shared responsibility between AWS and us. For Amazon S3, our responsibility includes the following areas:
- Managing our data, including object ownership and encryption
- Classifying our assets
- Managing access to our data using IAM roles and other service configurations to apply the appropriate permissions
- Enabling detective controls such as AWS CloudTrail or Amazon GuardDuty for Amazon S3
The following topics show how to configure Amazon S3 to meet our security and compliance objectives. To read more about security, see the AWS documentation.
Protecting data using encryption
Data protection refers to protecting data while
- in-transit (as it travels to and from Amazon S3) and
- at rest (while it is stored on disks in Amazon S3 data centres).
Below are the options to protect or encrypt objects in S3:
- Encryption in transit using Secure Sockets Layer/Transport Layer Security (SSL/TLS)
- Client-side encryption
- Server-Side Encryption (SSE)
- SSE-S3: Amazon S3 managed encryption keys
- SSE-KMS: KMS keys stored in AWS KMS
- SSE-C: Customer-provided encryption keys
SSE-S3
Amazon S3 encrypts each object with a unique key managed by S3. As an additional safeguard, it encrypts the key itself with a root key that it rotates regularly. It uses 256-bit Advanced Encryption Standard (AES-256). To request SSE-S3 explicitly, we provide the header x-amz-server-side-encryption: AES256 in the request headers or in the form fields (for POST requests).
SSE-KMS
Amazon S3 uses AWS KMS keys to encrypt S3 objects. AWS Key Management Service (AWS KMS) combines secure, highly available hardware and software to provide a key management system scaled for the cloud. KMS gives us control over the keys along with an audit trail of who used which key and when. We must provide the header x-amz-server-side-encryption: aws:kms.
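A minimal sketch of building these request headers for SSE-S3 and SSE-KMS. The optional x-amz-server-side-encryption-aws-kms-key-id header selects a specific KMS key instead of the account's default key:

```python
def sse_headers(mode: str, kms_key_id=None) -> dict:
    """Build the server-side-encryption request headers described above."""
    if mode == "SSE-S3":
        return {"x-amz-server-side-encryption": "AES256"}
    if mode == "SSE-KMS":
        headers = {"x-amz-server-side-encryption": "aws:kms"}
        if kms_key_id:  # optionally pin a specific KMS key
            headers["x-amz-server-side-encryption-aws-kms-key-id"] = kms_key_id
        return headers
    raise ValueError(f"unknown mode: {mode}")
```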
SSE-C
SSE-C allows us to use our own encryption keys. S3 does not store the encryption key; instead, it stores a randomly salted HMAC value of the key to validate future requests. We must use HTTPS and provide the key in the request headers. Amazon S3 encrypts the object with the key we pass before writing it to the bucket, and decrypts it with the same key when we access the object, so we don't need to maintain any code to perform encryption and decryption. The only thing we have to do is manage the encryption keys we provide.
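The SSE-C request headers can be sketched as follows: the 256-bit key is sent base64-encoded, together with a base64 MD5 digest that S3 uses as an integrity check (which is one reason HTTPS is mandatory):

```python
import base64
import hashlib
import os

def sse_c_headers(key: bytes) -> dict:
    """Build the SSE-C headers for a request carrying a customer key."""
    if len(key) != 32:
        raise ValueError("SSE-C requires a 256-bit (32-byte) key")
    return {
        "x-amz-server-side-encryption-customer-algorithm": "AES256",
        "x-amz-server-side-encryption-customer-key": base64.b64encode(key).decode(),
        "x-amz-server-side-encryption-customer-key-MD5": base64.b64encode(
            hashlib.md5(key).digest()
        ).decode(),
    }

headers = sse_c_headers(os.urandom(32))  # the key never leaves our control
```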
Client-side encryption
Client-side encryption is the act of encrypting data before sending it to S3 & decrypting it after retrieving the encrypted data from S3. Amazon S3 will not do any encryption or decryption. It will just receive data as encrypted & store them.
The AWS Encryption SDK is a client-side encryption library. Unlike the Amazon S3 encryption clients in the language-specific AWS SDKs, the AWS Encryption SDK is not tied to Amazon S3 and can be used to encrypt or decrypt data to be stored anywhere.
Identity and Access Management in Amazon S3
By default, all Amazon S3 resources (buckets, objects, and related subresources) are private. Only the resource owner, the AWS account that created it, can access the resource. The resource owner can optionally grant access permissions to others by writing an access policy. Amazon S3 offers access policy options broadly categorized as:
- Resource-based policies
- User-based policies
Resource-based policies
Access policies that we attach to resources (buckets and objects) are referred to as resource-based policies. For example,
- Bucket policies – bucket-wide rules
- Object Access Control Lists (ACLs) – finer-grained control
- Bucket Access Control Lists (ACLs)
User-based policies
We can also attach access policies to the users in our account. These are called user policies or user-based policies. IAM policies, managed from the IAM console, define which API calls are allowed for a specific user.
Note: An IAM principal can access an S3 object if (the principal's IAM permissions allow it OR the resource policy allows it) AND there is no explicit DENY (for example, in the bucket policy).
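The evaluation rule in the note above can be written as a one-liner:

```python
def can_access(iam_allows: bool, resource_policy_allows: bool,
               explicit_deny: bool) -> bool:
    """S3 authorization in a nutshell: a request succeeds if either the
    principal's IAM permissions or the resource policy allows it, and
    nothing explicitly denies it."""
    return (iam_allows or resource_policy_allows) and not explicit_deny
```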
S3 Access Point
From the AWS documentation:
Amazon S3 access points simplify data access for any AWS service or customer application that stores data in S3. Access points are named network endpoints that are attached to buckets that you can use to perform S3 object operations, such as GetObject and PutObject.
We can only use access points to perform operations on objects; we can't use them for other Amazon S3 operations, such as modifying or deleting buckets. Note that creating an access point does not by itself block direct access to the bucket: direct uploads and downloads are blocked only if the bucket policy is written to allow access exclusively through the access point.
S3 bucket policies
A bucket policy is a resource-based policy that we can use to grant access permissions to our bucket and the objects in it. Bucket policies use a JSON-based access policy language and are applied at the bucket level: permissions granted by the policy apply to all of the objects within the bucket. We cannot attach a bucket policy to individual objects.
Using bucket policy, we can
- Grant public access to the bucket
- Force Objects to be encrypted at upload
- Grant access to another account(Cross Account)
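For example, a sketch of a bucket policy that grants public read access (example-bucket is a placeholder name):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "PublicReadGetObject",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::example-bucket/*"
    }
  ]
}
```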
AWS provides a policy generator tool to build policies that we need.
S3 Access logs
Server access logging records all requests made to the S3 bucket. It is not enabled by default; we have to enable it ourselves. When we enable S3 access logs, they are written to another S3 bucket (the target bucket must be different from the source bucket to avoid a logging loop). API calls can also be logged in AWS CloudTrail.
Access Control Lists (ACLs)
ACLs (Access Control Lists) can be applied at both the bucket level and the object level. With ACLs, we define which accounts or groups are granted access and what type of access is granted, such as read, write, delete, or full access. Object ACLs give fine-grained control: we can grant different types of access to different objects within the same bucket, and different permissions to different users and groups.
S3 Block Public Access
From the AWS documentation:
By default, new buckets, access points, and objects don’t allow public access. However, users can modify bucket policies, access point policies, or object permissions to allow public access. S3 Block Public Access settings override these policies and permissions so that you can limit public access to these resources.
With S3 Block Public Access, account administrators and bucket owners can easily set up centralized controls to limit public access to their Amazon S3 resources that are enforced regardless of how the resources are created.
These controls can be applied at three levels:
- Access points
- Bucket settings
- Account settings
Public access is granted to buckets and objects through access control lists (ACLs), access point policies, bucket policies, or a combination of these. To help ensure that all of our Amazon S3 access points, buckets, and objects have their public access blocked, AWS recommends turning on all four block-public-access settings for the account.
- Amazon S3 doesn’t support block public access settings on a per-object basis.
- When we apply block public access settings to an account, the settings apply to all AWS Regions globally.
- The settings might not take effect in all Regions immediately or simultaneously, but they eventually propagate to all Regions.
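The four settings referred to above map to real API field names; shown here as the PublicAccessBlockConfiguration payload the S3 API expects:

```python
# The four Block Public Access settings, using the S3 API field names.
public_access_block = {
    "BlockPublicAcls": True,        # reject requests that set public ACLs
    "IgnorePublicAcls": True,       # ignore any existing public ACLs
    "BlockPublicPolicy": True,      # reject public bucket policies
    "RestrictPublicBuckets": True,  # restrict access granted by public policies
}
```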
MFA delete
When working with S3 Versioning in Amazon S3 buckets, you can optionally add another layer of security by configuring a bucket to enable MFA (multi-factor authentication) delete. MFA delete requires additional authentication for either of the following operations:
- Changing the versioning state of our bucket
- Permanently deleting an object version
Sharing objects using pre-signed URLs
We can create a pre-signed URL to grant permission to access or download an object for a limited time.
S3 Websites
We can host static websites in S3. The website URL will be in the below format,
<bucket-name>.s3-website-<AWS-region>.amazonaws.com
The site may include client-side scripting. We have to make sure the bucket policy allows public read access; otherwise, we get a 403 (Forbidden) error.
Using cross-origin resource sharing (CORS)
Cross-origin resource sharing is a mechanism that allows a browser to make requests from one website (origin) to another, which is normally blocked by the same-origin policy.
A CORS check, or preflight request, is an HTTP OPTIONS request sent to the server to check whether cross-origin access is enabled. If it is enabled, the preflight request succeeds and the response contains the Access-Control-Allow-Origin and Access-Control-Allow-Methods headers.
Note: If a client makes a cross-origin request to our S3 bucket, we need to enable the correct CORS headers. We can allow a specific origin or * (all origins).
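For example, a CORS configuration allowing GET and PUT requests from a single origin might look like this (https://www.example.com is a placeholder):

```json
[
  {
    "AllowedOrigins": ["https://www.example.com"],
    "AllowedMethods": ["GET", "PUT"],
    "AllowedHeaders": ["*"],
    "ExposeHeaders": ["ETag"],
    "MaxAgeSeconds": 3000
  }
]
```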