AWS Elasticsearch Index Snapshots
Amazon ES takes daily automated snapshots of the primary index shards in a domain, as described in Configuring Automatic Snapshots. The service stores up to 14 of these snapshots for no more than 30 days in a preconfigured Amazon S3 bucket at no additional charge to you. You can use these snapshots to restore the domain.
You cannot use automated snapshots to migrate to new domains. Automated snapshots are read-only from within a given domain. For migrations, you must use manual snapshots stored in your own repository (an S3 bucket). Standard S3 charges apply to manual snapshots.
Prerequisites
To create index snapshots manually, you must work with IAM and Amazon S3. Verify that you have met the following prerequisites before you attempt to take a snapshot.
S3 Bucket
Stores manual snapshots for your Amazon ES domain. Make a note of the bucket’s name. You need it in two places:
- Resource statement of the IAM policy that is attached to your IAM role
- Python client that is used to register a snapshot repository
Important: Do not apply a Glacier lifecycle rule to this bucket. Manual snapshots do not support the Glacier storage class.
IAM role
Delegates permissions to Amazon Elasticsearch Service. The rest of this doc refers to this role as "ESSnapshotRole".
The trust relationship for the role must specify Amazon Elasticsearch Service in the Principal statement.
Permissions
You (the IAM user) must be able to pass the IAM role to Amazon ES in order to register the snapshot repository. You also need access to the "es:ESHttpPut" action.
Let’s get started
Create an IAM role named "ESSnapshotRole" and attach the following trust policy so that Amazon ES can assume it.
{
  "Version": "2012-10-17",
  "Statement": [{
    "Sid": "",
    "Effect": "Allow",
    "Principal": {
      "Service": "es.amazonaws.com"
    },
    "Action": "sts:AssumeRole"
  }]
}
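If you prefer to script this step, here is a minimal boto3 sketch. The role name matches the one used in the rest of this doc; the actual IAM call requires credentials with IAM permissions, so it is shown commented out.

```python
import json

# The trust policy from above, as a Python dict.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "",
        "Effect": "Allow",
        "Principal": {"Service": "es.amazonaws.com"},
        "Action": "sts:AssumeRole",
    }],
}

trust_policy_json = json.dumps(trust_policy)

# Creating the role needs IAM permissions, so it is left commented out here:
# import boto3
# iam = boto3.client("iam")
# iam.create_role(RoleName="ESSnapshotRole",
#                 AssumeRolePolicyDocument=trust_policy_json)

print(trust_policy_json)
```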
Add the following policy to the IAM role. It grants the role access to the snapshot bucket.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "s3:ListBucket"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::s3-bucket-name"
      ]
    },
    {
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::s3-bucket-name/*"
      ]
    }
  ]
}
Add the below policy to the IAM user. It allows the user to pass "ESSnapshotRole" to Amazon ES and to make signed HTTP PUT requests to the domain.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "iam:PassRole",
      "Resource": "arn:aws:iam::123456789012:role/ESSnapshotRole"
    },
    {
      "Effect": "Allow",
      "Action": "es:ESHttpPut",
      "Resource": "arn:aws:es:region:123456789012:domain/my-domain/*"
    }
  ]
}
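Because the bucket name appears in two Resource ARNs, it can help to generate the role's S3 policy from a single bucket variable rather than editing it by hand. A small sketch (the helper name is mine, not an AWS API; the attach call requires IAM permissions and is shown commented out):

```python
import json

def s3_snapshot_policy(bucket):
    """Build the role's S3 access policy for the given snapshot bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Action": ["s3:ListBucket"],
                "Effect": "Allow",
                "Resource": ["arn:aws:s3:::%s" % bucket],
            },
            {
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Effect": "Allow",
                "Resource": ["arn:aws:s3:::%s/*" % bucket],
            },
        ],
    }

policy_json = json.dumps(s3_snapshot_policy("s3-bucket-name"), indent=2)
print(policy_json)

# Attaching it needs IAM permissions, so it is left commented out here:
# import boto3
# boto3.client("iam").put_role_policy(
#     RoleName="ESSnapshotRole",
#     PolicyName="es-snapshot-s3-access",
#     PolicyDocument=policy_json)
```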
You must register a snapshot repository with Amazon Elasticsearch Service before you can take manual index snapshots. This one-time operation requires that you sign your AWS request with credentials that are allowed to pass "ESSnapshotRole" to Amazon ES.
You can't use curl to perform this operation because it doesn't support AWS request signing. Instead, use the following Python script (it requires the boto3, requests, and requests_aws4auth packages).
import boto3
import requests
from requests_aws4auth import AWS4Auth

host = ''  # domain endpoint; include https:// and trailing /
region = ''  # e.g. us-west-1
service = 'es'

# Sign requests with the credentials from your default AWS profile
credentials = boto3.Session().get_credentials()
awsauth = AWS4Auth(credentials.access_key, credentials.secret_key, region,
                   service, session_token=credentials.token)

# Register the repository
path = '_snapshot/my-snapshot-repo'  # the Elasticsearch API endpoint
url = host + path
payload = {
    "type": "s3",
    "settings": {
        "bucket": "s3-bucket-name",
        "region": "us-west-1",
        "role_arn": "arn:aws:iam::123456789012:role/ESSnapshotRole"
    }
}
headers = {"Content-Type": "application/json"}

r = requests.put(url, auth=awsauth, json=payload, headers=headers)
print(r.status_code)
print(r.text)
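If your domain's access policy requires signed requests for every call, not just repository registration, the same awsauth object from the script above works for the snapshot operations below. A small sketch of building the _snapshot URLs (the helper function is mine, not part of any library):

```python
def snapshot_url(host, repo, snapshot=None, action=None):
    """Build an Elasticsearch _snapshot API URL; host must end with '/'."""
    parts = ['_snapshot', repo]
    if snapshot:
        parts.append(snapshot)
    if action:
        parts.append(action)
    return host + '/'.join(parts)

# e.g., with the host and awsauth from the registration script:
# requests.put(snapshot_url(host, 'my-snapshot-repo', 'snapshot-name'), auth=awsauth)
# requests.get(snapshot_url(host, 'my-snapshot-repo', '_all'), auth=awsauth)
print(snapshot_url('https://domain-endpoint/', 'my-snapshot-repo', 'snapshot-name'))
```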
You are all set. Now take a snapshot with curl.
curl -XPUT 'elasticsearch-domain-endpoint/_snapshot/my-snapshot-repo/snapshot-name'
You can list all the snapshots in the repository with the following command.
curl -XGET 'elasticsearch-domain-endpoint/_snapshot/my-snapshot-repo/_all?pretty'
Restore a snapshot
curl -XPOST 'elasticsearch-domain-endpoint/_snapshot/my-snapshot-repo/snapshot-name/_restore'
Delete a snapshot
curl -XDELETE 'elasticsearch-domain-endpoint/_snapshot/my-snapshot-repo/snapshot-name'
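A restore fails if it targets indices that already exist and are open on the domain. One common workaround, using standard Elasticsearch restore options (the index names here are placeholders), is to restore a subset of indices under new names by sending a request body with the restore call:

```python
import json

# Standard Elasticsearch restore options; index names are placeholders.
restore_payload = {
    "indices": "my-index-*",                       # restore only matching indices
    "rename_pattern": "my-index-(.+)",             # regex applied to each index name
    "rename_replacement": "restored-my-index-$1",  # new name for each match
}

# Send it as the -d body of the restore curl command, or with requests:
# requests.post(host + '_snapshot/my-snapshot-repo/snapshot-name/_restore',
#               auth=awsauth, json=restore_payload,
#               headers={"Content-Type": "application/json"})
print(json.dumps(restore_payload))
```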