Elasticsearch
Introduction
- Supported ES versions: 7.x, 8.x
Jaeger automatically retrieves the Elasticsearch version from the root/ping endpoint and, based on this version, uses compatible index mappings and the matching Elasticsearch REST API. The version can also be provided explicitly via the version: config property.
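As a rough illustration of this version detection, the sketch below extracts the major version from the JSON returned by the Elasticsearch root endpoint. The helper name es_major_version is hypothetical, and this is not how Jaeger itself implements the lookup. It only shows how to run the same check by hand.

```shell
# Hypothetical helper: reads the root endpoint JSON on stdin and prints the
# major version taken from the "number" field (e.g. 8 for "8.11.1").
es_major_version() {
  sed -n 's/.*"number"[[:space:]]*:[[:space:]]*"\([0-9][0-9]*\)\..*/\1/p' | head -n 1
}

# Usage (assumes Elasticsearch listens on localhost:9200):
#   curl -s http://localhost:9200 | es_major_version
```

If the printed major version is not one of the supported ones, setting version: explicitly will not help, so this is mainly useful as a quick sanity check before deploying.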
Elasticsearch does not require initialization other than installing and running Elasticsearch. Once it is running, pass the correct configuration values to Jaeger.
Elasticsearch also has the following officially supported resources available from the community and Elastic:
- Docker container from Elastic for getting a single node up quickly
- Helm chart from Elastic
- Kubernetes Operator from RedHat
Configuration
A sample configuration for Jaeger with Elasticsearch backend is available in the Jaeger repository: config-elasticsearch.yaml. In the future the configuration documentation will be auto-generated from the schema. Meanwhile, please refer to config.go as the authoritative source.
Shards and Replicas
Shards and replicas deserve special attention because they are decided at index creation. This article goes into more detail about choosing the number of shards for optimal performance.
Index Management Strategies
Jaeger supports three index management strategies:
| | Time-based indices (default) | Manual rollover | Rollover with ILM (recommended) |
|---|---|---|---|
| How indices are created | Jaeger creates daily or hourly indices (e.g., jaeger-span-2024-06-18) | Operator runs jaeger-es-rollover init to create the first numbered index (e.g., jaeger-span-000001); cron job creates subsequent ones | Operator runs jaeger-es-rollover init to create the first index; Elasticsearch creates subsequent ones |
| Rollover trigger | Automatic (new time period) | jaeger-es-rollover rollover cron job | Elasticsearch ILM policy |
| Retention cleanup | jaeger-es-index-cleaner cron job | jaeger-es-rollover lookback (optional) + jaeger-es-index-cleaner cron jobs | Elasticsearch ILM policy |
| External tooling required | None | jaeger-es-rollover init (one-time) | jaeger-es-rollover init (one-time) + ILM policy |
The relevant configuration options are:
| Config property | Default | Relevant strategy | Description |
|---|---|---|---|
| date_layout | 2006-01-02 | Time-based | Date format for index names (e.g., 2006-01-02-15 for hourly indices) |
| use_aliases | false | Manual rollover, ILM | Use read/write aliases instead of time-based indices (enables rollover mode) |
| use_ilm | false | ILM | Delegate rollover and retention to Elasticsearch ILM (requires use_aliases: true) |
| create_mappings | true | All | Create index templates at Jaeger startup. Must be false when use_ilm: true |
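The constraints between these options can be checked before deployment. The following is a minimal sketch of such a pre-flight check. The function name check_index_options and its messages are illustrative and not part of Jaeger. It only encodes the two rules stated in the table: use_ilm requires use_aliases, and create_mappings must be false when use_ilm is true.

```shell
# Hypothetical pre-flight check. Arguments are the literal values true/false
# for use_aliases, use_ilm and create_mappings, in that order.
# The body runs in a subshell so its variables do not leak into the caller.
check_index_options() (
  use_aliases="$1"; use_ilm="$2"; create_mappings="$3"
  rc=0
  if [ "$use_ilm" = "true" ] && [ "$use_aliases" != "true" ]; then
    echo "use_ilm: true requires use_aliases: true" >&2
    rc=1
  fi
  if [ "$use_ilm" = "true" ] && [ "$create_mappings" != "false" ]; then
    echo "create_mappings must be false when use_ilm: true" >&2
    rc=1
  fi
  exit "$rc"
)
```

For example, check_index_options true true false succeeds, while check_index_options false true false reports the missing use_aliases setting and returns a non-zero status.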
The create_mappings option is orthogonal to the index management strategy. In any mode, you can set it to false if you prefer to manage index templates externally (e.g., via jaeger-es-rollover init or your own automation). When using ILM, it must be false because jaeger-es-rollover init already creates the templates as part of index initialization.
Index Rollover
Elasticsearch rollover is an index management strategy that optimizes use of resources allocated to indices.
For example, indices that do not contain any data still allocate shards, and conversely, a single index might contain significantly more data than the others.
By default, Jaeger stores data in daily indices, which might not utilize resources optimally. The rollover feature can be enabled with the use_aliases: true config property.
Rollover lets you configure when to roll over to a new index based on one or more of the following criteria:
- max_age - the maximum age of the index. It uses time units: d, h, m.
- max_docs - the maximum documents in the index.
- max_size - the maximum estimated size of primary shards (since Elasticsearch 6.x). It uses byte size units: tb, gb, mb.
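These criteria are passed to the rollover job as a JSON object. The sketch below assembles that object from optional values. The helper name rollover_conditions is hypothetical, and the output format simply mirrors the CONDITIONS value used with jaeger-es-rollover later in this section.

```shell
# Hypothetical helper: prints a JSON conditions object.
# Usage: rollover_conditions MAX_AGE MAX_DOCS MAX_SIZE (empty values are skipped).
# The body runs in a subshell so its variables do not leak into the caller.
rollover_conditions() (
  max_age="${1:-}"; max_docs="${2:-}"; max_size="${3:-}"
  body=""
  append() {
    if [ -n "$body" ]; then body="$body, "; fi
    body="$body$1"
  }
  if [ -n "$max_age" ]; then append "\"max_age\": \"$max_age\""; fi
  if [ -n "$max_docs" ]; then append "\"max_docs\": $max_docs"; fi
  if [ -n "$max_size" ]; then append "\"max_size\": \"$max_size\""; fi
  printf '{%s}\n' "$body"
)
```

For instance, -e CONDITIONS="$(rollover_conditions 2d "" 5gb)" would request a rollover once the write index is older than two days or its primary shards exceed 5gb. Elasticsearch rolls over as soon as any one of the supplied conditions is met.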
The rollover index management strategy is more complex than the default daily indices: it requires an initialization job to prepare the storage and cron jobs to manage indices.
To learn more about rollover index management in Jaeger, refer to this article.
For automated rollover, refer to the ILM support section.
The jaeger-es-rollover and jaeger-es-index-cleaner tools below are shown using
Docker invocations, but they are also available as standalone binaries on the
Jaeger GitHub releases page.
Initialize
The following command prepares Elasticsearch for rollover deployment:
docker run -it --rm --net=host \
jaegertracing/jaeger-es-rollover:latest \
init http://localhost:9200 # <1>
<1> If you need to initialize archive storage, add -e ARCHIVE=true.
The initializer performs the following steps for each index type (spans, services, dependencies):
- Creates index templates that define field mappings, shard/replica settings, and index patterns (e.g., jaeger-span-*). All future rollover indices inherit their schema from these templates.
- Creates the first rollover index (e.g., jaeger-span-000001). Subsequent rollovers increment this number.
- Creates read and write aliases (e.g., jaeger-span-read and jaeger-span-write) pointing to the initial index. Jaeger queries via the read alias and writes via the write alias.
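To make the numbering scheme concrete, the sketch below computes the name of the index that follows a given rollover index. Elasticsearch does this on its own during a rollover, incrementing the numeric suffix and zero-padding it to six digits. The helper next_rollover_index is hypothetical and only mirrors that naming rule.

```shell
# Hypothetical helper: prints the index name that follows NAME-NNNNNN.
# The body runs in a subshell so its variables do not leak into the caller.
next_rollover_index() (
  name="$1"
  base="${name%-*}"   # everything before the last dash, e.g. jaeger-span
  seq="${name##*-}"   # the numeric suffix, e.g. 000001
  # Strip leading zeros so the shell does not parse the number as octal.
  seq=$(printf '%s' "$seq" | sed 's/^0*//')
  printf '%s-%06d\n' "$base" "$(( ${seq:-0} + 1 ))"
)
```

Calling next_rollover_index jaeger-span-000001 prints jaeger-span-000002, which is the index the write alias would point to after the first rollover.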
After the initialization, Jaeger can be deployed with use_aliases: true.
Roll over
The next step is to periodically execute the rollover API which rolls the write alias to a new index based on supplied conditions. The command also adds a new index to the read alias to make new data available for search.
docker run -it --rm --net=host \
-e CONDITIONS='{"max_age": "2d"}' \
jaegertracing/jaeger-es-rollover:latest \
rollover http://localhost:9200 # <1>
<1> The command rolls the alias over to a new index if the current write index is older than 2 days. For more conditions see Elasticsearch docs.
The next step is to remove old indices from read aliases, which means that old data will no longer be available for search. This imitates the behavior of the max_span_age: config property used in the default index-per-day deployment. This step is optional: old indices can instead simply be removed by the index cleaner in the next step.
docker run -it --rm --net=host \
-e UNIT=days -e UNIT_COUNT=7 \
jaegertracing/jaeger-es-rollover:latest \
lookback http://localhost:9200 # <1>
<1> Removes indices older than 7 days from read alias.
Remove old data
The historical data can be removed with the jaeger-es-index-cleaner that is also used for daily indices.
docker run -it --rm --net=host \
-e ROLLOVER=true \
jaegertracing/jaeger-es-index-cleaner:latest \
14 http://localhost:9200 # <1>
<1> Remove indices older than 14 days.
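Since the rollover, lookback, and cleanup commands are meant to run periodically, they can be grouped in one script for a cron job. The following is a minimal sketch that reuses the commands and values shown above. The function name rollover_maintenance and the ES_URL variable are assumptions made for illustration.

```shell
# Assumed variable: the Elasticsearch URL, defaulting to the one used above.
ES_URL="${ES_URL:-http://localhost:9200}"

# Hypothetical wrapper for a cron job. The -it flags are dropped because
# cron provides no terminal for Docker to attach to.
rollover_maintenance() {
  # Roll the write alias over if the current write index is older than 2 days.
  docker run --rm --net=host \
    -e CONDITIONS='{"max_age": "2d"}' \
    jaegertracing/jaeger-es-rollover:latest \
    rollover "$ES_URL" || return 1
  # Remove indices older than 7 days from the read alias.
  docker run --rm --net=host \
    -e UNIT=days -e UNIT_COUNT=7 \
    jaegertracing/jaeger-es-rollover:latest \
    lookback "$ES_URL" || return 1
  # Delete rollover indices older than 14 days.
  docker run --rm --net=host \
    -e ROLLOVER=true \
    jaegertracing/jaeger-es-index-cleaner:latest \
    14 "$ES_URL"
}
```

The function stops at the first failing step, so a failed rollover does not go on to shrink the read alias or delete data.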
ILM support
Elasticsearch ILM automatically manages indices according to performance, resiliency, and retention requirements.
ILM support is an alternative to the manual rollover + lookback + index-cleaner workflow described above. When ILM is enabled, Elasticsearch manages rollover and retention automatically according to the configured policy.
For example:
- Rollover to a new index by size (bytes or number of documents) or age, archiving previous indices
- Delete stale indices to enforce data retention standards
To enable ILM support:
Create an ILM policy in Elasticsearch named jaeger-ilm-policy.
For example, the following policy will roll over the “active” index when it is older than 1m and delete indices that are older than 2m.
curl -X PUT \
http://localhost:9200/_ilm/policy/jaeger-ilm-policy \
-H 'Content-Type: application/json; charset=utf-8' \
--data-binary @- << EOF
{
  "policy": {
    "phases": {
      "hot": {
        "min_age": "0ms",
        "actions": {
          "rollover": {
            "max_age": "1m"
          },
          "set_priority": {
            "priority": 100
          }
        }
      },
      "delete": {
        "min_age": "2m",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
EOF
Run rollover initializer with ES_USE_ILM=true:
docker run -it --rm --net=host \
-e ES_USE_ILM=true \
jaegertracing/jaeger-es-rollover:latest \
init http://localhost:9200 # <1>
<1> If you need to initialize archive storage, add -e ARCHIVE=true.
While initializing with ILM support, make sure that an ILM policy named jaeger-ilm-policy is created in Elasticsearch beforehand (see the previous step), otherwise the following error message will be shown:
“ILM policy jaeger-ilm-policy doesn’t exist in Elasticsearch. Please create it and rerun init”
The initializer performs the same steps as described above (creates index templates, seed indices, and aliases), with the following ILM-specific additions:
- Validates that the ILM policy (jaeger-ilm-policy) exists in Elasticsearch.
- Embeds index.lifecycle.name and index.lifecycle.rollover_alias in the index templates, so Elasticsearch automatically applies the ILM policy to every new rollover index.
- Sets is_write_index: true on the write aliases, which is required for Elasticsearch to perform ILM-triggered rollovers.
With ILM enabled, Elasticsearch manages rollovers and retention automatically, so you no longer need the rollover, lookback, or index-cleaner cron jobs described above.
After the initialization, deploy Jaeger with use_ilm: true and use_aliases: true.
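Because the initializer fails when the policy is missing, it can help to check for the policy first. The sketch below is one way to do that, assuming Elasticsearch answers the policy lookup with HTTP 200 when the policy exists and 404 when it does not. The function name ilm_policy_exists and the ES_URL variable are hypothetical.

```shell
# Hypothetical helper: succeeds if the named ILM policy exists.
# $1 is the policy name and defaults to jaeger-ilm-policy.
# ES_URL is an assumed variable and defaults to http://localhost:9200.
ilm_policy_exists() (
  # Ask curl for the HTTP status code only and discard the response body.
  status=$(curl -s -o /dev/null -w '%{http_code}' \
    "${ES_URL:-http://localhost:9200}/_ilm/policy/${1:-jaeger-ilm-policy}")
  [ "$status" = "200" ]
)
```

A script could then guard the init step with something like: ilm_policy_exists || { echo "create jaeger-ilm-policy first" >&2; exit 1; }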
Upgrading
Elasticsearch defines wire and index compatibility versions. The index compatibility version is the oldest version whose indices a node can still read. For example, Elasticsearch 8 can read indices created by Elasticsearch 7, but it cannot read indices created by Elasticsearch 6, even though they use the same index mappings. Therefore, an upgrade from Elasticsearch 7 to 8 does not require any data migration. An upgrade from Elasticsearch 6 to 8, however, has to go through Elasticsearch 7, and the final step to 8 has to wait until the indices created by ES 6.x are removed or explicitly reindexed.
Refer to the Elasticsearch documentation for wire and index compatibility versions. Generally this information can be retrieved from the root/ping REST endpoint.
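As an illustration of reading these values from the root endpoint, the sketch below pulls the two compatibility fields out of its JSON response. The helper name es_compat_versions is hypothetical, and the sketch assumes the response is pretty-printed with one field per line.

```shell
# Hypothetical helper: reads pretty-printed root endpoint JSON on stdin and
# prints the wire and index compatibility versions, one per line.
es_compat_versions() {
  sed -n \
    -e 's/.*"minimum_wire_compatibility_version"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/wire=\1/p' \
    -e 's/.*"minimum_index_compatibility_version"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/index=\1/p'
}

# Usage (assumes Elasticsearch listens on localhost:9200):
#   curl -s 'http://localhost:9200/?pretty' | es_compat_versions
```

Indices created by a version older than the printed index value cannot be read by that node and need to be removed or reindexed before the upgrade.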
Reindex
Manual reindexing can be used when upgrading from Elasticsearch 6 to 8 (through Elasticsearch 7) without waiting until indices created by Elasticsearch 6 are removed.
- Reindex all span indices to new indices with suffix -1:
curl -ivX POST -H "Content-Type: application/json" \
http://localhost:9200/_reindex -d @reindex.json
{
  "source": {
    "index": "jaeger-span-*"
  },
  "dest": {
    "index": "jaeger-span"
  },
  "script": {
    "lang": "painless",
    "source": "ctx._index = 'jaeger-span-' + (ctx._index.substring('jaeger-span-'.length(), ctx._index.length())) + '-1'"
  }
}
- Delete indices with old mapping:
curl -ivX DELETE -H "Content-Type: application/json" \
http://localhost:9200/jaeger-span-\*,-\*-1
- Create indices without -1 suffix:
curl -ivX POST -H "Content-Type: application/json" \
http://localhost:9200/_reindex -d @reindex.json
{
  "source": {
    "index": "jaeger-span-*"
  },
  "dest": {
    "index": "jaeger-span"
  },
  "script": {
    "lang": "painless",
    "source": "ctx._index = 'jaeger-span-' + (ctx._index.substring('jaeger-span-'.length(), ctx._index.length() - 2))"
  }
}
- Remove suffixed indices:
curl -ivX DELETE -H "Content-Type: application/json" \
http://localhost:9200/jaeger-span-\*-1
Run the commands analogously for the other Jaeger indices.
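To avoid editing the request body by hand for each index, the body can be generated from the index prefix. The following is a minimal sketch. The function name reindex_body is hypothetical, and the generated JSON is the same as in the span examples above with the prefix substituted.

```shell
# Hypothetical helper: prints a _reindex request body.
# $1: index prefix, e.g. jaeger-span
# $2: "add" (default) appends the -1 suffix, "remove" strips it again.
# The body runs in a subshell so its variables do not leak into the caller.
reindex_body() (
  prefix="$1"
  if [ "${2:-add}" = "remove" ]; then
    script="ctx._index = '${prefix}-' + (ctx._index.substring('${prefix}-'.length(), ctx._index.length() - 2))"
  else
    script="ctx._index = '${prefix}-' + (ctx._index.substring('${prefix}-'.length(), ctx._index.length())) + '-1'"
  fi
  cat <<EOF
{
  "source": {
    "index": "${prefix}-*"
  },
  "dest": {
    "index": "${prefix}"
  },
  "script": {
    "lang": "painless",
    "source": "${script}"
  }
}
EOF
)
```

For example, reindex_body jaeger-service > reindex.json would prepare the first step for the services index, assuming jaeger-service is that index's prefix in your deployment. The curl commands stay the same apart from the index names in the DELETE URLs.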
A more effective migration procedure might exist. Please share any findings with the community.
