Backup and Restore¶

Overview¶

Nixiesearch's S3-native architecture provides significant advantages for backup and restore operations. Unlike traditional search engines that require complex volume snapshots, nixiesearch stores all index data directly in S3, making backups as simple as copying S3 objects.

This tutorial covers:

Backup scenarios: Regular backups vs disaster recovery
Restore procedures: Point-in-time and cross-environment restores
Key concepts: S3 prefix changes and ConfigMap rollouts for zero-downtime operations

Prerequisites¶

Before starting, ensure you have:

A distributed nixiesearch cluster running in Kubernetes (see Distributed Deployment)
S3 bucket with appropriate read/write permissions (see S3 Persistence)
AWS CLI installed and configured
kubectl access to your Kubernetes cluster

Backup¶

Manual Backup¶

Create a point-in-time backup by copying your current S3 prefix:

# Backup current index data
aws s3 sync s3://nixiesearch-indexes/movies s3://nixiesearch-indexes/backup-movies-2024-01-15

# Backup Kubernetes configuration
kubectl get configmap nixiesearch-config -o yaml > nixiesearch-config-backup.yaml
kubectl get deployment nixiesearch-searcher -o yaml > searcher-deployment-backup.yaml
kubectl get statefulset nixiesearch-indexer -o yaml > indexer-statefulset-backup.yaml

Note

Underlying Lucene index consists of multiple immutable segments. Segments cannot be altered, but can be removed during the compaction process. Nixiesearch indexer only deletes actual index segments after a fixed 1h time interval to make S3 index operations atomic.

Automated Backup¶

Set up automated backups using S3 lifecycle policies:

{
  "Rules": [
    {
      "ID": "NixiesearchBackup",
      "Status": "Enabled",
      "Filter": {"Prefix": "movies"},
      "Transitions": [
        {
          "Days": 30,
          "StorageClass": "STANDARD_IA"
        },
        {
          "Days": 90,
          "StorageClass": "GLACIER"
        }
      ]
    }
  ]
}

For cross-region disaster recovery, enable S3 cross-region replication:

aws s3api put-bucket-replication \
  --bucket nixiesearch-indexes \
  --replication-configuration file://replication-config.json

Restore¶

Point-in-Time Restore¶

Restore from a specific backup by updating the S3 prefix in your configuration:

Copy backup data to new prefix:

aws s3 sync s3://nixiesearch-indexes/backup-movies-2024-01-15 s3://nixiesearch-indexes/movies-restored

Update ConfigMap with new prefix:

kubectl patch configmap nixiesearch-config --patch '
data:
  config.yml: |
    schema:
      movies:
        store:
          distributed:
            searcher:
              memory:
            indexer:
              memory:
            remote:
              s3:
                bucket: nixiesearch-indexes
                prefix: movies-restored
                region: us-east-1
        fields:
          # ... rest of your schema configuration
'

Rolling restart to pick up new configuration:

kubectl rollout restart deployment/nixiesearch-searcher
kubectl rollout restart statefulset/nixiesearch-indexer

Verify restore:

# Check pod status
kubectl get pods -l app=nixiesearch

# Test search functionality
kubectl port-forward svc/nixiesearch-searcher 8080:8080
curl "http://localhost:8080/movies/_search?q=test"

Zero-Downtime Restore¶

For production environments, use a blue-green deployment approach:

Create new ConfigMap with restored prefix:

kubectl create configmap nixiesearch-config-restored --from-file=config.yml=restored-config.yml

Deploy new searcher deployment:

# Update searcher deployment to use new ConfigMap
kubectl patch deployment nixiesearch-searcher --patch '
spec:
  template:
    spec:
      volumes:
      - name: config
        configMap:
          name: nixiesearch-config-restored
'

Wait for new pods to be ready:

kubectl rollout status deployment/nixiesearch-searcher

Switch traffic by updating the service selector if needed, or simply let Kubernetes handle the rolling update.

Update indexer once searchers are healthy:

kubectl patch statefulset nixiesearch-indexer --patch '
spec:
  template:
    spec:
      volumes:
      - name: config
        configMap:
          name: nixiesearch-config-restored
'