# Caching

## In-memory embedding cache
In practice, many documents you ingest into an index share common strings, like category names and colors. A common way to improve indexing throughput is to skip computing embeddings for these repeated strings and serve them from a cache instead.
Nixiesearch has an in-memory LRU cache for common embeddings, which can be configured as follows:
```yaml
schema:
  my-index-name:
    fields:
      # fields here
    cache:
      embedding:
        maxSize: 32768
```
The whole `cache` and `cache.embedding` sections of the config file are optional.
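As an illustration of this optionality, a schema without any `cache` section is also valid and falls back to the defaults (the index name here is a placeholder):

```yaml
schema:
  my-index-name:
    fields:
      # fields here
# no cache section: the embedding LRU cache still runs with maxSize=32768
```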
Where:

- `cache.embedding.maxSize`: integer, optional, default: 32768. Maximum number of entries in the embedding LRU cache.
A ballpark estimate of cache RAM usage:

- single embedding: `<dimensions> * 4 bytes`. Typical dimensions are `384` for MiniLM-L6/E5-small, and `768` for larger models.
- total usage: `<maxSize> * <embedding size>`

For example, a default E5-small embedding model with the default cache size of 32768 will take `384 dims * 4 bytes * 32768 entries = 50MB` of heap RAM.
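The estimation above can be sketched as a short calculation; this is an illustrative script, not part of Nixiesearch itself, and it assumes 4 bytes per float32 dimension as described:

```python
def cache_ram_bytes(max_size: int, dimensions: int) -> int:
    """Ballpark heap usage of the embedding LRU cache:
    maxSize entries * dimensions floats * 4 bytes per float32."""
    return max_size * dimensions * 4

# Defaults from this page: E5-small (384 dims), maxSize=32768
estimate = cache_ram_bytes(max_size=32768, dimensions=384)
print(f"{estimate / 1024 / 1024:.0f} MiB")  # ~48 MiB, i.e. roughly 50 MB
```

If you raise `maxSize` or switch to a 768-dimensional model, rerun the estimate to check that the cache still fits comfortably in your configured JVM heap.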