AWS Elasticsearch Index Snapshots
Amazon ES takes daily automated snapshots of the primary index shards in a domain, as described in Configuring Automatic Snapshots. The service stores up to 14 of these snapshots for no more than 30 days in a preconfigured Amazon S3 bucket at no additional charge to you. You can use these snapshots to restore the domain.
You cannot use automated snapshots to migrate to new domains. Automated snapshots are read-only from within a given domain. For migrations, you must use manual snapshots stored in your own repository (an S3 bucket). Standard S3 charges apply to manual snapshots.
Prerequisites
To create index snapshots manually, you must work with IAM and Amazon S3. Verify that you have met the following prerequisites before you attempt to take a snapshot.
S3 Bucket
Stores manual snapshots for your Amazon ES domain. Make a note of the bucket’s name. You need it in two places:
- Resource statement of the IAM policy that is attached to your IAM role
- Python client that is used to register a snapshot repository
Important: Do not apply a Glacier lifecycle rule to this bucket. Manual snapshots do not support the Glacier storage class.
IAM role
Delegates permissions to Amazon Elasticsearch Service. The rest of this doc refers to this role as "ESSnapshotRole".
The trust relationship for the role must specify Amazon Elasticsearch Service in the Principal statement.
Permissions
You (the IAM user) must be able to pass the IAM role to Amazon ES in order to register the snapshot repository. You also need access to the "es:ESHttpPut" action.
Let’s get started
Create an IAM role named "ESSnapshotRole" and attach the following trust policy so that Amazon ES can assume it.
{
  "Version": "2012-10-17",
  "Statement": [{
    "Sid": "",
    "Effect": "Allow",
    "Principal": {
      "Service": "es.amazonaws.com"
    },
    "Action": "sts:AssumeRole"
  }]
}
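If you prefer to script this step, here is a minimal boto3 sketch. The role name matches the one used in the rest of this doc; the actual IAM call requires credentials with IAM permissions, so it is shown commented out.

```python
import json

# The trust policy from above, as a Python dict.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "",
        "Effect": "Allow",
        "Principal": {"Service": "es.amazonaws.com"},
        "Action": "sts:AssumeRole",
    }],
}

trust_policy_json = json.dumps(trust_policy)

# Creating the role needs IAM permissions, so it is left commented out here:
# import boto3
# iam = boto3.client("iam")
# iam.create_role(RoleName="ESSnapshotRole",
#                 AssumeRolePolicyDocument=trust_policy_json)

print(trust_policy_json)
```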
Add the following policy to the IAM role. It grants the role access to the snapshot bucket.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "s3:ListBucket"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::s3-bucket-name"
      ]
    },
    {
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::s3-bucket-name/*"
      ]
    }
  ]
}
Add the below policy to the IAM user. It allows the user to pass "ESSnapshotRole" to Amazon ES and to make signed HTTP PUT requests to the domain.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "iam:PassRole",
      "Resource": "arn:aws:iam::123456789012:role/ESSnapshotRole"
    },
    {
      "Effect": "Allow",
      "Action": "es:ESHttpPut",
      "Resource": "arn:aws:es:region:123456789012:domain/my-domain/*"
    }
  ]
}
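Because the bucket name appears in two Resource ARNs, it can help to generate the role's S3 policy from a single bucket variable rather than editing it by hand. A small sketch (the helper name is mine, not an AWS API; the attach call requires IAM permissions and is shown commented out):

```python
import json

def s3_snapshot_policy(bucket):
    """Build the role's S3 access policy for the given snapshot bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Action": ["s3:ListBucket"],
                "Effect": "Allow",
                "Resource": ["arn:aws:s3:::%s" % bucket],
            },
            {
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Effect": "Allow",
                "Resource": ["arn:aws:s3:::%s/*" % bucket],
            },
        ],
    }

policy_json = json.dumps(s3_snapshot_policy("s3-bucket-name"), indent=2)
print(policy_json)

# Attaching it needs IAM permissions, so it is left commented out here:
# import boto3
# boto3.client("iam").put_role_policy(
#     RoleName="ESSnapshotRole",
#     PolicyName="es-snapshot-s3-access",
#     PolicyDocument=policy_json)
```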
You must register a snapshot repository with Amazon Elasticsearch Service before you can take manual index snapshots. This one-time operation requires that you sign your AWS request with credentials that are allowed to pass "ESSnapshotRole" to Amazon ES.
You can't use curl to perform this operation because it doesn't support AWS request signing. Instead, use the following Python script (it requires the boto3, requests, and requests_aws4auth packages).
import boto3
import requests
from requests_aws4auth import AWS4Auth

host = ''  # domain endpoint; include https:// and trailing /
region = ''  # e.g. us-west-1
service = 'es'

# Sign requests with the credentials from your default AWS profile
credentials = boto3.Session().get_credentials()
awsauth = AWS4Auth(credentials.access_key, credentials.secret_key, region,
                   service, session_token=credentials.token)

# Register the repository
path = '_snapshot/my-snapshot-repo'  # the Elasticsearch API endpoint
url = host + path
payload = {
    "type": "s3",
    "settings": {
        "bucket": "s3-bucket-name",
        "region": "us-west-1",
        "role_arn": "arn:aws:iam::123456789012:role/ESSnapshotRole"
    }
}
headers = {"Content-Type": "application/json"}

r = requests.put(url, auth=awsauth, json=payload, headers=headers)
print(r.status_code)
print(r.text)
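If your domain's access policy requires signed requests for every call, not just repository registration, the same awsauth object from the script above works for the snapshot operations below. A small sketch of building the _snapshot URLs (the helper function is mine, not part of any library):

```python
def snapshot_url(host, repo, snapshot=None, action=None):
    """Build an Elasticsearch _snapshot API URL; host must end with '/'."""
    parts = ['_snapshot', repo]
    if snapshot:
        parts.append(snapshot)
    if action:
        parts.append(action)
    return host + '/'.join(parts)

# e.g., with the host and awsauth from the registration script:
# requests.put(snapshot_url(host, 'my-snapshot-repo', 'snapshot-name'), auth=awsauth)
# requests.get(snapshot_url(host, 'my-snapshot-repo', '_all'), auth=awsauth)
print(snapshot_url('https://domain-endpoint/', 'my-snapshot-repo', 'snapshot-name'))
```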
You are all set. Now take a snapshot with curl.
curl -XPUT 'elasticsearch-domain-endpoint/_snapshot/my-snapshot-repo/snapshot-name'
You can list all the snapshots in the repository with the following command.
curl -XGET 'elasticsearch-domain-endpoint/_snapshot/my-snapshot-repo/_all?pretty'
Restore a snapshot
curl -XPOST 'elasticsearch-domain-endpoint/_snapshot/my-snapshot-repo/snapshot-name/_restore'
Delete a snapshot
curl -XDELETE 'elasticsearch-domain-endpoint/_snapshot/my-snapshot-repo/snapshot-name'
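A restore fails if it targets indices that already exist and are open on the domain. One common workaround, using standard Elasticsearch restore options (the index names here are placeholders), is to restore a subset of indices under new names by sending a request body with the restore call:

```python
import json

# Standard Elasticsearch restore options; index names are placeholders.
restore_payload = {
    "indices": "my-index-*",                       # restore only matching indices
    "rename_pattern": "my-index-(.+)",             # regex applied to each index name
    "rename_replacement": "restored-my-index-$1",  # new name for each match
}

# Send it as the -d body of the restore curl command, or with requests:
# requests.post(host + '_snapshot/my-snapshot-repo/snapshot-name/_restore',
#               auth=awsauth, json=restore_payload,
#               headers={"Content-Type": "application/json"})
print(json.dumps(restore_payload))
```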