Operator for Kubernetes
Understanding Operators
The Jaeger Operator is an implementation of a Kubernetes Operator . Operators are pieces of software that ease the operational complexity of running another piece of software. More technically, Operators are a method of packaging, deploying, and managing a Kubernetes application.
A Kubernetes application is an application that is both deployed on Kubernetes and managed using the Kubernetes APIs and kubectl
(kubernetes) or oc
(OKD) tooling. To be able to make the most of Kubernetes, you need a set of cohesive APIs to extend in order to service and manage your apps that run on Kubernetes. Think of Operators as the runtime that manages this type of app on Kubernetes.
Installing the Operator
Installing the Operator on Kubernetes
The following instructions will create the observability
namespace and install the Jaeger Operator.
kubectl
command is properly configured to talk to a valid Kubernetes cluster. If you don’t have a cluster, you can create one locally using
minikube
.To install the operator, run:
kubectl create namespace observability # <1>
kubectl create -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/main/deploy/crds/jaegertracing.io_jaegers_crd.yaml # <2>
kubectl create -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/main/deploy/service_account.yaml
kubectl create -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/main/deploy/role.yaml
kubectl create -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/main/deploy/role_binding.yaml
kubectl create -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/main/deploy/operator.yaml
<1> This creates the namespace used by default in the deployment files. If you want to install the Jaeger operator in a different namespace, you must edit the deployment files to change observability
to the desired namespace value.
<2> This installs the “Custom Resource Definition” for the apiVersion: jaegertracing.io/v1
At this point, there should be a jaeger-operator
deployment available. You can view it by running the following command:
$ kubectl get deployment jaeger-operator -n observability
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
jaeger-operator 1 1 1 1 48s
The operator is now ready to create Jaeger instances.
Installing the Operator on OKD/OpenShift
The instructions from the previous section also work for installing the operator on OKD or OpenShift. Make sure you are logged in as a privileged user, when you install the role based access control (RBAC) rules, the custom resource definition, and the operator.
oc login -u <privileged user>
oc new-project observability # <1>
oc create -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/main/deploy/crds/jaegertracing.io_jaegers_crd.yaml # <2>
oc create -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/main/deploy/service_account.yaml
oc create -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/main/deploy/role.yaml
oc create -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/main/deploy/role_binding.yaml
oc create -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/main/deploy/operator.yaml
<1> This creates the namespace used by default in the deployment files. If you want to install the Jaeger operator in a different namespace, you must edit the deployment files to change observability
to the desired namespace value.
<2> This installs the “Custom Resource Definition” for the apiVersion: jaegertracing.io/v1
Once the operator is installed, grant the role jaeger-operator
to users who should be able to install individual Jaeger instances. The following example creates a role binding allowing the user developer
to create Jaeger instances:
oc create \
rolebinding developer-jaeger-operator \
--role=jaeger-operator \
--user=developer
After the role is granted, switch back to a non-privileged user.
Jaeger Agent can be configured to be deployed as a DaemonSet
using a HostPort
to allow Jaeger clients in the same node to discover the agent. In OpenShift, a HostPort
can only be set when a special security context is set. A separate service account can be used by the Jaeger Agent with the permission to bind to HostPort
, as follows:
oc create -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/main/examples/openshift/hostport-scc-daemonset.yaml # <1>
oc new-project myappnamespace
oc create -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/main/examples/openshift/service_account_jaeger-agent-daemonset.yaml # <2>
oc adm policy add-scc-to-user daemonset-with-hostport -z jaeger-agent-daemonset # <3>
oc apply -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/main/examples/openshift/agent-as-daemonset.yaml # <4>
<1> The SecurityContextConstraints
with the allowHostPorts
policy
<2> The ServiceAccount
to be used by the Jaeger Agent
<3> Adds the security policy to the service account
<4> Creates the Jaeger Instance using the serviceAccount
created in the steps above
DaemonSet
to be created: Warning FailedCreate 4s (x14 over 45s) daemonset-controller Error creating: pods "agent-as-daemonset-agent-daemonset-" is forbidden: unable to validate against any security context constraint: [spec.containers[0].securityContext.containers[0].hostPort: Invalid value: 5775: Host ports are not allowed to be used
After a few seconds, the DaemonSet
should be up and running:
$ oc get daemonset agent-as-daemonset-agent-daemonset
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE
agent-as-daemonset-agent-daemonset 1 1 1 1 1
Quick Start - Deploying the AllInOne image
The simplest possible way to create a Jaeger instance is by creating a YAML file like the following example. This will install the default AllInOne strategy, which deploys the “all-in-one” image (agent, collector, query, ingester, Jaeger UI) in a single pod, using in-memory storage by default.
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
name: simplest
The YAML file can then be used with kubectl
:
kubectl apply -f simplest.yaml
In a few seconds, a new in-memory all-in-one instance of Jaeger will be available, suitable for quick demos and development purposes. To check the instances that were created, list the jaeger
objects:
$ kubectl get jaegers
NAME CREATED AT
simplest 28s
To get the pod name, query for the pods belonging to the simplest
Jaeger instance:
$ kubectl get pods -l app.kubernetes.io/instance=simplest
NAME READY STATUS RESTARTS AGE
simplest-6499bb6cdd-kqx75 1/1 Running 0 2m
Similarly, the logs can be queried either from the pod directly using the pod name obtained from the previous example, or from all pods belonging to our instance:
$ kubectl logs -l app.kubernetes.io/instance=simplest
...
{"level":"info","ts":1535385688.0951214,"caller":"healthcheck/handler.go:133","msg":"Health Check state change","status":"ready"}
$ kubectl logs -l app.kubernetes.io/instance=simplest -c jaeger
...
{"level":"info","ts":1535385688.0951214,"caller":"healthcheck/handler.go:133","msg":"Health Check state change","status":"ready"}
Deployment Strategies
When you create a Jaeger instance, it is associated with a strategy. The strategy is defined in the custom resource file, and determines the architecture to be used for the Jaeger backend. The default strategy is allInOne
. The other possible values are production
and streaming
.
The available strategies are described in the following sections.
AllInOne (Default) strategy
This strategy is intended for development, testing, and demo purposes.
The main backend components, agent, collector and query service, are all packaged into a single executable which is configured (by default) to use in-memory storage.
Production strategy
The production
strategy is intended (as the name suggests) for production environments, where long term storage of trace data is important, as well as a more scalable and highly available architecture is required. Each of the backend components is therefore separately deployed.
The agent can be injected as a sidecar on the instrumented application or as a daemonset.
The query and collector services are configured with a supported storage type - currently Cassandra or Elasticsearch. Multiple instances of each of these components can be provisioned as required for performance and resilience purposes.
The main additional requirement is to provide the details of the storage type and options, for example:
storage:
type: elasticsearch
options:
es:
server-urls: http://elasticsearch:9200
Streaming strategy
The streaming
strategy is designed to augment the production
strategy by providing a streaming capability that effectively sits between the collector and the backend storage (Cassandra or Elasticsearch). This provides the benefit of reducing the pressure on the backend storage, under high load situations, and enables other trace post-processing capabilities to tap into the real time span data directly from the streaming platform (Kafka).
The only additional information required is to provide the details for accessing the Kafka platform, which is configured in the collector
component (as producer) and ingester
component (as consumer):
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
name: simple-streaming
spec:
strategy: streaming
collector:
options:
kafka: # <1>
producer:
topic: jaeger-spans
brokers: my-cluster-kafka-brokers.kafka:9092
ingester:
options:
kafka: # <1>
consumer:
topic: jaeger-spans
brokers: my-cluster-kafka-brokers.kafka:9092
ingester:
deadlockInterval: 0 # <2>
storage:
type: elasticsearch
options:
es:
server-urls: http://elasticsearch:9200
<1> Identifies the Kafka configuration used by the collector, to produce the messages, and the ingester to consume the messages.
<2> The deadlock interval can be disabled to avoid the ingester being terminated when no messages arrive within the default 1 minute period
Understanding Custom Resource Definitions
In the Kubernetes API, a resource is an endpoint that stores a collection of API objects of a certain kind. For example, the built-in Pods resource contains a collection of Pod objects. A Custom Resource Definition (CRD) object defines a new, unique object Kind
in the cluster and lets the Kubernetes API server handle its entire lifecycle.
To create Custom Resource (CR) objects, cluster administrators must first create a Custom Resource Definition (CRD). The CRDs allow cluster users to create CRs to add the new resource types into their projects. An Operator watches for custom resource objects to be created, and when it sees a custom resource being created, it creates the application based on the parameters defined in the custom resource object.
For reference, here’s how you can create a more complex all-in-one instance:
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
name: my-jaeger
spec:
strategy: allInOne # <1>
allInOne:
image: jaegertracing/all-in-one:latest # <2>
options: # <3>
log-level: debug # <4>
storage:
type: memory # <5>
options: # <6>
memory: # <7>
max-traces: 100000
ingress:
enabled: false # <8>
agent:
strategy: DaemonSet # <9>
annotations:
scheduler.alpha.kubernetes.io/critical-pod: "" # <10>
<1> The default strategy is allInOne
. The other possible values are production
and streaming
.
<2> The image to use, in a regular Docker syntax.
<3> The (non-storage related) options to be passed verbatim to the underlying binary. Refer to the Jaeger documentation and/or to the --help
option from the related binary for all the available options.
<4> The option is a simple key: value
map. In this case, we want the option --log-level=debug
to be passed to the binary.
<5> The storage type to be used. By default it will be memory
, but can be any other supported storage type (Cassandra, Elasticsearch, Kafka).
<6> All storage related options should be placed here, rather than under the ‘allInOne’ or other component options.
<7> Some options are namespaced and we can alternatively break them into nested objects. We could have specified memory.max-traces: 100000
.
<8> By default, an ingress object is created for the query service. It can be disabled by setting its enabled
option to false
. If deploying on OpenShift, this will be represented by a Route object.
<9> By default, the operator assumes that agents are deployed as sidecars within the target pods. Specifying the strategy as “DaemonSet” changes that and makes the operator deploy the agent as DaemonSet. Note that your tracer client will probably have to override the “JAEGER_AGENT_HOST” environment variable to use the node’s IP.
<10> Define annotations to be applied to all deployments (not services). These can be overridden by annotations defined on the individual components.
You can view example custom resources for different Jaeger configurations on GitHub .
Configuring the Custom Resource
You can use the simplest example (shown above) and create a Jaeger instance using the defaults, or you can create your own custom resource file.
Storage options
Cassandra storage
When the storage type is set to Cassandra, the operator will automatically create a batch job that creates the required schema for Jaeger to run. This batch job will block the Jaeger installation, so that it starts only after the schema is successfully created. The creation of this batch job can be disabled by setting the enabled
property to false
:
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
name: cassandra-without-create-schema
spec:
strategy: allInOne
storage:
type: cassandra
cassandraCreateSchema:
enabled: false # <1>
<1> Defaults to true
Further aspects of the batch job can be configured as well. An example with all the possible options is shown below:
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
name: cassandra-with-create-schema
spec:
strategy: allInOne # <1>
storage:
type: cassandra
options: # <2>
cassandra:
servers: cassandra
keyspace: jaeger_v1_datacenter3
cassandraCreateSchema: # <3>
datacenter: "datacenter3"
mode: "test"
<1> The same works for production
and streaming
.
<2> These options are for the regular Jaeger components, like collector
and query
.
<3> The options for the create-schema
job.
MODE=prod
, which implies a replication factor of 2
, using NetworkTopologyStrategy
as the class, effectively meaning that at least 3 nodes are required in the Cassandra cluster. If a SimpleStrategy
is desired, set the mode to test
, which then sets the replication factor of 1
. Refer to the
create-schema script
for more details.Elasticsearch storage
Under some circumstances, the Jaeger Operator can make use of the Elasticsearch Operator to provision a suitable Elasticsearch cluster.
When there are no es.server-urls
options as part of a Jaeger production
instance and elasticsearch
is set as the storage type, the Jaeger Operator creates an Elasticsearch cluster via the Elasticsearch Operator by creating a Custom Resource based on the configuration provided in storage section. The Elasticsearch cluster is meant to be dedicated for a single Jaeger instance.
The self-provision of an Elasticsearch cluster can be disabled by setting the flag --es-provision
to false
. The default value is auto
, which will make the Jaeger Operator query the Kubernetes cluster for its ability to handle a Elasticsearch
custom resource. This is usually set by the Elasticsearch Operator during its installation process, so, if the Elasticsearch Operator is expected to run after the Jaeger Operator, the flag can be set to true
.
Elasticsearch index cleaner job
When using elasticsearch
storage by default a job is created to clean old traces from it, the options for it are listed below so you can configure it to your use case
storage:
type: elasticsearch
esIndexCleaner:
enabled: false # turn the job deployment on and off
numberOfDays: 7 # number of days to wait before deleting a record
schedule: "55 23 * * *" # cron expression for it to run
image: jaegertracing/jaeger-es-index-cleaner # image of the job
Auto-injecting Jaeger Agent Sidecars
The operator can inject Jaeger Agent sidecars in Deployment
workloads, provided that the deployment has the annotation sidecar.jaegertracing.io/inject
with a suitable value. The values can be either "true"
(as string), or the Jaeger instance name, as returned by kubectl get jaegers
. When "true"
is used, there should be exactly one Jaeger instance for the same namespace as the deployment, otherwise, the operator can’t figure out automatically which Jaeger instance to use.
The following snippet shows a simple application that will get a sidecar injected, with the Jaeger Agent pointing to the single Jaeger instance available in the same namespace:
apiVersion: apps/v1
kind: Deployment
metadata:
name: myapp
annotations:
"sidecar.jaegertracing.io/inject": "true" # <1>
spec:
selector:
matchLabels:
app: myapp
template:
metadata:
labels:
app: myapp
spec:
containers:
- name: myapp
image: acme/myapp:myversion
<1> Either "true"
(as string) or the Jaeger instance name.
A complete sample deployment is available at
deploy/examples/business-application-injected-sidecar.yaml
.
When the sidecar is injected, the Jaeger Agent can then be accessed at its default location on localhost
.
Installing the Agent as DaemonSet
By default, the Operator expects the agents to be deployed as sidecars to the target applications. This is convenient for several purposes, like in a multi-tenant scenario or to have better load balancing, but there are scenarios where you might want to install the agent as a DaemonSet
. In that case, specify the Agent’s strategy to DaemonSet
, as follows:
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
name: my-jaeger
spec:
agent:
strategy: DaemonSet
DaemonSet
as the strategy, only one will end up deploying a DaemonSet
, as the agent is required to bind to well-known ports on the node. Because of that, the second daemon set will fail to bind to those ports.Your tracer client will then most likely need to be told where the agent is located. This is usually done by setting the environment variable JAEGER_AGENT_HOST
to the value of the Kubernetes node’s IP, for example:
apiVersion: apps/v1
kind: Deployment
metadata:
name: myapp
spec:
selector:
matchLabels:
app: myapp
template:
metadata:
labels:
app: myapp
spec:
containers:
- name: myapp
image: acme/myapp:myversion
env:
- name: JAEGER_AGENT_HOST
valueFrom:
fieldRef:
fieldPath: status.hostIP
Secrets Support
The Operator supports passing secrets to the Collector, Query and All-In-One deployments. This can be used for example, to pass credentials (username/password) to access the underlying storage backend (for example: Elasticsearch). The secrets are available as environment variables in the (Collector/Query/All-In-One) nodes.
storage:
type: elasticsearch
options:
es:
server-urls: http://elasticsearch:9200
secretName: jaeger-secrets
The secret itself would be managed outside of the jaeger-operator
custom resource.
Configuring the UI
Information on various configuration options for the UI can be found here , defined in json format.
To apply UI configuration changes within the Custom Resource, the same information can be included in yaml format as shown below:
ui:
options:
dependencies:
menuEnabled: false
tracking:
gaID: UA-000000-2
menu:
- label: "About Jaeger"
items:
- label: "Documentation"
url: "https://www.jaegertracing.io/docs/latest"
linkPatterns:
- type: "logs"
key: "customer_id"
url: /search?limit=20&lookback=1h&service=frontend&tags=%7B%22customer_id%22%3A%22#{customer_id}%22%7D
text: "Search for other traces for customer_id=#{customer_id}"
Defining Sampling Strategies
The operator can be used to define sampling strategies that will be supplied to tracers that have been configured to use a remote sampler:
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
name: with-sampling
spec:
strategy: allInOne
sampling:
options:
default_strategy:
type: probabilistic
param: 0.5
This example defines a default sampling strategy that is probabilistic, with a 50% chance of the trace instances being sampled.
Refer to the Jaeger documentation on Collector Sampling Configuration to see how service and endpoint sampling can be configured. The JSON representation described in that documentation can be used in the operator by converting to YAML.
Finer grained configuration
The custom resource can be used to define finer grained Kubernetes configuration applied to all Jaeger components or at the individual component level.
When a common definition (for all Jaeger components) is required, it is defined under the spec
node. When the definition relates to an individual component, it is placed under the spec/<component>
node.
The types of supported configuration include:
affinity to determine which nodes a pod can be allocated to
resources to limit cpu and memory
tolerations in conjunction with
taints
to enable pods to avoid being repelled from a nodevolumes and volume mounts
serviceAccount to run each component with separate identity
securityContext to define privileges of running components
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
name: simple-prod
spec:
strategy: production
storage:
type: elasticsearch
options:
es:
server-urls: http://elasticsearch:9200
annotations:
key1: value1
labels:
key2: value2
resources:
requests:
memory: "64Mi"
cpu: "250m"
limits:
memory: "128Mi"
cpu: "500m"
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/e2e-az-name
operator: In
values:
- e2e-az1
- e2e-az2
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 1
preference:
matchExpressions:
- key: another-node-label-key
operator: In
values:
- another-node-label-value
tolerations:
- key: "key1"
operator: "Equal"
value: "value1"
effect: "NoSchedule"
- key: "key1"
operator: "Equal"
value: "value1"
effect: "NoExecute"
serviceAccount: nameOfServiceAccount
securityContext:
runAsUser: 1000
volumeMounts:
- name: config-vol
mountPath: /etc/config
volumes:
- name: config-vol
configMap:
name: log-config
items:
- key: log_level
path: log_level
Accessing the Jaeger Console (UI)
Kubernetes
The operator creates a Kubernetes
ingress
route, which is the Kubernetes’ standard for exposing a service to the outside world, but by default it does not come with Ingress providers.
Check the
Kubernetes documentation
for the most appropriate way to achieve an Ingress provider for your platform. The following command enables the Ingress provider on minikube
:
minikube addons enable ingress
Once Ingress is enabled, the address for the Jaeger console can be found by querying the Ingress object:
$ kubectl get ingress
NAME HOSTS ADDRESS PORTS AGE
simplest-query * 192.168.122.34 80 3m
In this example, the Jaeger UI is available at http://192.168.122.34.
OpenShift
When the Operator is running on OpenShift, the Operator will automatically create a Route
object for the query services. Use the following command to check the hostname/port:
oc get routes
https
with the hostname/port you get from the command above, otherwise you’ll see a message like: “Application is not available”.By default, the Jaeger UI is protected with OpenShift’s OAuth service and any valid user is able to login. To disable this feature and leave the Jaeger UI unsecured, set the Ingress property security
to none
in the custom resource file:
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
name: disable-oauth-proxy
spec:
ingress:
security: none
Custom SAR
and Delegate URL
values can be specified as part of the .Spec.Ingress.OpenShift.SAR
and .Spec.Ingress.Openshift.DelegateURLs
, as follows:
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
name: custom-sar-oauth-proxy
spec:
ingress:
openshift:
sar: '{"namespace": "default", "resource": "pods", "verb": "get"}'
delegate-urls: '{"/":{"namespace": "default", "resource": "pods", "verb": "get"}}'
When the delegate-urls
is set, the Jaeger Operator needs to create a new ClusterRoleBinding
between the service account used by the UI Proxy ({InstanceName}-ui-proxy
) and the role system:auth-delegator
, as required by the OpenShift OAuth Proxy. Because of that, the service account used by the operator itself needs to have the same cluster role binding. To accomplish that, a ClusterRoleBinding
such as the following has to be created:
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: jaeger-operator-with-auth-delegator
namespace: observability
subjects:
- kind: ServiceAccount
name: jaeger-operator
namespace: observability
roleRef:
kind: ClusterRole
name: system:auth-delegator
apiGroup: rbac.authorization.k8s.io
Cluster administrators not comfortable in letting users deploy Jaeger instances with this cluster role are free to not add this cluster role to the operator’s service account. In that case, the Operator will auto-detect that the required permissions are missing and will log a message similar to: the requested instance specifies the delegate-urls option for the OAuth Proxy, but this operator cannot assign the proper cluster role to it (system:auth-delegator). Create a cluster role binding between the operator's service account and the cluster role 'system:auth-delegator' in order to allow instances to use 'delegate-urls'
.
Updating a Jaeger instance (experimental)
A Jaeger instance can be updated by changing the CustomResource
, either via kubectl edit jaeger simplest
, where simplest
is the Jaeger’s instance name, or by applying the updated YAML file via kubectl apply -f simplest.yaml
.
Simpler changes such as changing the replica sizes can be applied without much concern, whereas changes to the strategy should be watched closely and might potentially cause an outage for individual components (collector/query/agent).
While changing the backing storage is supported, migration of the data is not.
Removing a Jaeger instance
To remove an instance, use the delete
command with the custom resource file used when you created the instance:
kubectl delete -f simplest.yaml
Alternatively, you can remove a Jaeger instance by running:
kubectl delete jaeger simplest
Monitoring the operator
The Jaeger Operator starts a Prometheus-compatible endpoint on 0.0.0.0:8383/metrics
with internal metrics that can be used to monitor the process.
Uninstalling the operator
To uninstall the operator, run the following commands:
kubectl delete -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/main/deploy/operator.yaml
kubectl delete -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/main/deploy/role_binding.yaml
kubectl delete -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/main/deploy/role.yaml
kubectl delete -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/main/deploy/service_account.yaml
kubectl delete -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/main/deploy/crds/jaegertracing.io_jaegers_crd.yaml