S3 Persistence
S3 persistence stores search index segments in S3-compatible object storage, enabling distributed deployments where multiple searcher and indexer nodes can share the same index data. This is the recommended approach for production distributed deployments.
For an overview of all storage options, see the persistence overview. For development environments, consider in-memory storage or local disk storage.
When to Use S3 Storage
S3 storage fits production distributed deployments where searcher and indexer components scale independently. Because the index lives in object storage, compute nodes stay stateless and can be added or removed without data loss, which suits auto-scaling setups.
Use S3 storage when you need to share indexes across environments (development, staging, production), or when you want to decouple storage from compute for better reliability and operational flexibility. It is also the practical choice for multi-region deployments that need consistent access to the same indexes from different geographic locations.
Configuration Example
Distributed configuration with in-memory components and S3 persistence:
schema:
  movies:
    store:
      distributed:
        searcher:
          memory:          # Fast in-memory reads
        indexer:
          memory:          # Fast in-memory writes
        remote:
          s3:              # Persistent index segments
            bucket: my-search-indexes
            prefix: movies
            region: us-east-1
    fields:
      title:
        type: text
        search:
          lexical:
            analyze: en
      overview:
        type: text
        search:
          lexical:
            analyze: en
For more field configuration options, see the schema mapping guide. For complete distributed setup, see the distributed deployment overview.
S3 Authentication
AWS IAM Roles (Recommended)
The most secure approach is to use IAM roles for service accounts, so no static credentials appear in the configuration:
# No credentials in config - uses IAM role
schema:
  movies:
    store:
      distributed:
        remote:
          s3:
            bucket: my-search-indexes
            prefix: movies
            region: us-east-1
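On Amazon EKS, the role is typically attached via IAM Roles for Service Accounts (IRSA). The sketch below assumes your cluster already has an OIDC provider configured; the role ARN and service account name are placeholders:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: nixiesearch
  annotations:
    # Placeholder ARN - point this at a role that carries the S3 permissions listed below
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/nixiesearch-s3-access

Pods running under this service account pick up temporary credentials through the AWS SDK's default credential chain, so nothing extra is needed in the Nixiesearch config or environment.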
Environment Variables
Set AWS credentials via environment variables:
export AWS_ACCESS_KEY_ID=your-access-key
export AWS_SECRET_ACCESS_KEY=your-secret-key
export AWS_REGION=us-east-1
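If you run Nixiesearch in Docker, the variables can be forwarded from the host into the container. The image tag, subcommand, and config path below are assumptions; adjust them to match your deployment:

# -e VAR with no value forwards the variable from the host environment
docker run -p 8080:8080 \
  -e AWS_ACCESS_KEY_ID \
  -e AWS_SECRET_ACCESS_KEY \
  -e AWS_REGION \
  -v $(pwd)/config.yml:/config.yml \
  nixiesearch/nixiesearch:latest standalone -c /config.yml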
Kubernetes Secrets
For Kubernetes deployments, use secrets for credentials:
kubectl create secret generic s3-credentials \
--from-literal=access-key-id=YOUR_ACCESS_KEY \
--from-literal=secret-access-key=YOUR_SECRET_KEY
Then reference in your deployment:
env:
  - name: AWS_ACCESS_KEY_ID
    valueFrom:
      secretKeyRef:
        name: s3-credentials
        key: access-key-id
  - name: AWS_SECRET_ACCESS_KEY
    valueFrom:
      secretKeyRef:
        name: s3-credentials
        key: secret-access-key
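An equivalent, slightly shorter pattern is to name the secret keys exactly like the environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY) and inject the whole secret at once:

# Requires secret keys named AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY
envFrom:
  - secretRef:
      name: s3-credentials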
Required IAM Permissions
Your IAM user or role needs these S3 permissions:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::my-search-indexes",
        "arn:aws:s3:::my-search-indexes/*"
      ]
    }
  ]
}
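As a sketch, the policy can be created and attached with the AWS CLI; the policy name, role name, and account ID here are placeholders:

# Create the policy from the JSON above (saved as s3-policy.json)
aws iam create-policy \
  --policy-name nixiesearch-s3-access \
  --policy-document file://s3-policy.json

# Attach it to the role used by your searcher and indexer nodes
aws iam attach-role-policy \
  --role-name nixiesearch-s3-access \
  --policy-arn arn:aws:iam::123456789012:policy/nixiesearch-s3-access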
Alternative S3 Providers
Nixiesearch supports any S3-compatible storage service by specifying a custom endpoint:
MinIO
schema:
  movies:
    store:
      distributed:
        remote:
          s3:
            bucket: nixiesearch-indexes
            prefix: movies
            region: us-east-1
            endpoint: http://minio:9000
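For local testing, a throwaway MinIO instance can be started with Docker; the credentials below are placeholders and double as the AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY values:

# API on port 9000, web console on port 9001
docker run -p 9000:9000 -p 9001:9001 \
  -e MINIO_ROOT_USER=minioadmin \
  -e MINIO_ROOT_PASSWORD=minioadmin \
  minio/minio server /data --console-address ":9001"

Create the nixiesearch-indexes bucket through the console or the mc client before starting the indexer.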
Google Cloud Storage
schema:
  movies:
    store:
      distributed:
        remote:
          s3:
            bucket: my-gcs-bucket
            prefix: movies
            region: us-central1
            endpoint: https://storage.googleapis.com
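Google Cloud Storage exposes the S3 protocol through its XML interoperability API, which authenticates with HMAC keys rather than service account JSON keys. Generate an HMAC key under the Cloud Storage interoperability settings and pass it through the usual variables (values below are placeholders):

export AWS_ACCESS_KEY_ID=GOOGxxxxxxxxxxxxxxxx      # HMAC access ID
export AWS_SECRET_ACCESS_KEY=your-hmac-secret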
DigitalOcean Spaces
schema:
  movies:
    store:
      distributed:
        remote:
          s3:
            bucket: my-spaces-bucket
            prefix: movies
            region: nyc3
            endpoint: https://nyc3.digitaloceanspaces.com
Cloudflare R2
schema:
  movies:
    store:
      distributed:
        remote:
          s3:
            bucket: my-r2-bucket
            prefix: movies
            region: auto
            endpoint: https://your-account-id.r2.cloudflarestorage.com
For authentication with alternative providers, set the provider's access key pair in the same AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY environment variables, or use the Kubernetes Secrets approach shown above.
Further Reading
- Persistence overview - Compare all storage options
- In-memory storage - Fast ephemeral storage
- Local disk storage - Persistent local storage
- Distributed deployment overview - Complete distributed setup
- Configuration reference - All configuration options