Backup and Restore¶
Overview¶
Nixiesearch's S3-native architecture provides significant advantages for backup and restore operations. Unlike traditional search engines that require complex volume snapshots, nixiesearch stores all index data directly in S3, making backups as simple as copying S3 objects.
This tutorial covers:
- Backup scenarios: Regular backups vs disaster recovery
- Restore procedures: Point-in-time and cross-environment restores
- Key concepts: S3 prefix changes and ConfigMap rollouts for zero-downtime operations
Prerequisites¶
Before starting, ensure you have:
- A distributed nixiesearch cluster running in Kubernetes (see Distributed Deployment)
- S3 bucket with appropriate read/write permissions (see S3 Persistence)
- AWS CLI installed and configured
kubectl
access to your Kubernetes cluster
Backup¶
Manual Backup¶
Create a point-in-time backup by copying your current S3 prefix:
# Backup current index data
aws s3 sync s3://nixiesearch-indexes/movies s3://nixiesearch-indexes/backup-movies-2024-01-15
# Backup Kubernetes configuration
kubectl get configmap nixiesearch-config -o yaml > nixiesearch-config-backup.yaml
kubectl get deployment nixiesearch-searcher -o yaml > searcher-deployment-backup.yaml
kubectl get statefulset nixiesearch-indexer -o yaml > indexer-statefulset-backup.yaml
Note
Underlying Lucene index consists of multiple immutable segments. Segments cannot be altered, but can be removed during the compaction process. Nixiesearch indexer only deletes actual index segments after a fixed 1h time interval to make S3 index operations atomic.
Automated Backup¶
Set up automated backups using S3 lifecycle policies:
{
"Rules": [
{
"ID": "NixiesearchBackup",
"Status": "Enabled",
"Filter": {"Prefix": "movies"},
"Transitions": [
{
"Days": 30,
"StorageClass": "STANDARD_IA"
},
{
"Days": 90,
"StorageClass": "GLACIER"
}
]
}
]
}
For cross-region disaster recovery, enable S3 cross-region replication:
aws s3api put-bucket-replication \
--bucket nixiesearch-indexes \
--replication-configuration file://replication-config.json
Restore¶
Point-in-Time Restore¶
Restore from a specific backup by updating the S3 prefix in your configuration:
-
Copy backup data to new prefix:
aws s3 sync s3://nixiesearch-indexes/backup-movies-2024-01-15 s3://nixiesearch-indexes/movies-restored
-
Update ConfigMap with new prefix:
kubectl patch configmap nixiesearch-config --patch ' data: config.yml: | schema: movies: store: distributed: searcher: memory: indexer: memory: remote: s3: bucket: nixiesearch-indexes prefix: movies-restored region: us-east-1 fields: # ... rest of your schema configuration '
-
Rolling restart to pick up new configuration:
kubectl rollout restart deployment/nixiesearch-searcher kubectl rollout restart statefulset/nixiesearch-indexer
-
Verify restore:
# Check pod status kubectl get pods -l app=nixiesearch # Test search functionality kubectl port-forward svc/nixiesearch-searcher 8080:8080 curl "http://localhost:8080/movies/_search?q=test"
Zero-Downtime Restore¶
For production environments, use a blue-green deployment approach:
-
Create new ConfigMap with restored prefix:
kubectl create configmap nixiesearch-config-restored --from-file=config.yml=restored-config.yml
-
Deploy new searcher deployment:
# Update searcher deployment to use new ConfigMap kubectl patch deployment nixiesearch-searcher --patch ' spec: template: spec: volumes: - name: config configMap: name: nixiesearch-config-restored '
-
Wait for new pods to be ready:
kubectl rollout status deployment/nixiesearch-searcher
-
Switch traffic by updating the service selector if needed, or simply let Kubernetes handle the rolling update.
-
Update indexer once searchers are healthy:
kubectl patch statefulset nixiesearch-indexer --patch ' spec: template: spec: volumes: - name: config configMap: name: nixiesearch-config-restored '
Further Reading¶
- Distributed Deployment Overview - Architecture and concepts
- S3 Persistence Configuration - Detailed S3 setup
- Schema Migration - Handling schema changes during restore
- Version Upgrades - Combining backups with version updates
- Configuration Reference - Complete configuration options
- Auto-scaling - Scaling considerations for restored clusters
Note
Remember that nixiesearch's stateless architecture means no persistent volumes need backing up - all critical data lives in S3 and can be restored by simply updating configuration.