OpenTelemetry: the observability standard your team should adopt now
Week 10 of 52 · Pillar: Observability · Estimated read: 17 min
Last week, this series argued that the gap between telemetry and observability is closed by correlation infrastructure — the ability to move from a metric alert to a relevant trace to the specific log entries in a single workflow, in seconds. That correlation depends on consistent, standards-compliant field naming across every pillar of your observability stack.
Which raises an obvious question. Whose standards?
For most of the last fifteen years, the answer was: your observability vendor’s standards. You instrumented your applications using the client libraries they provided. You used the agent they shipped. You structured your logs according to their conventions. And in exchange for that vendor lock-in, you got a product that worked out of the box — until you decided to switch vendors, at which point the cost of migration came due in full. Every integration rewritten. Every dashboard rebuilt. Every alert recalibrated. Every custom metric re-instrumented. An observability vendor migration was, until recently, one of the most painful infrastructure projects an engineering organisation could undertake.
OpenTelemetry changes that calculus permanently. And it is the most important observability investment your team can make in 2026 — not because of what it does today, but because of what it prevents tomorrow.
What OpenTelemetry actually is
OpenTelemetry — OTel, for short — is a Cloud Native Computing Foundation project that provides a vendor-neutral specification, API, and SDK for generating, processing, and exporting telemetry data. It merged two earlier efforts (OpenTracing and OpenCensus) into a single unified standard, and it has become the de facto industry standard for application instrumentation across every major programming language and every major observability vendor.
The word that matters most in that definition is “vendor-neutral.” OpenTelemetry is not a product. It is not an observability platform. It is not competing with Dynatrace, Splunk, Datadog, Grafana, or New Relic. It is the open specification that all of them now support as an ingestion format. Your application instruments using OTel APIs, emits telemetry in OTel’s protocol (OTLP), and that telemetry can be sent to any OTel-compatible backend — including, critically, more than one backend at once.
The three things OpenTelemetry provides:
API → language-specific interfaces your application code calls to emit spans, metrics, and log records. The API is stable and vendor-independent — code instrumented with OTel APIs does not change when you switch observability backends.
SDK → the implementation that sits behind the API, handling context propagation, sampling, batching, and export. You configure the SDK with which backend to send telemetry to. You do not reinstrument your code to change backends.
Collector → a separately deployed agent (or gateway) that receives telemetry from applications, processes it (filtering, enrichment, sampling), and forwards it to one or more backends. This is the piece that makes multi-backend export possible and that handles the gnarly integration concerns between your applications and your observability platforms.
All three are production-grade and supported by every major observability vendor. Tracing reached general availability first, metrics followed, and logging, the last pillar to stabilise, is now stable at the specification and protocol level, although the maturity of log support still varies between language SDKs, so check the status for the languages you run. The standard is no longer experimental. It is the default.
The one-sentence case for OpenTelemetry: it decouples the cost of instrumenting your applications from the cost of choosing your observability platform, and once your instrumentation is in OTel, switching backends becomes a configuration change rather than an engineering project.
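To make the API and SDK split concrete, here is a minimal sketch in Python of where the backend choice lives. It is not the OTel SDK: `ExportConfig` and `load_export_config` are hypothetical stand-ins for what an SDK does at startup, and the two collector hostnames are invented. The environment variable names are the standard ones defined by the OpenTelemetry specification.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ExportConfig:
    """Hypothetical view of the settings an OTel SDK reads at startup."""
    service_name: str
    endpoint: str
    resource_attributes: dict = field(default_factory=dict)


def load_export_config(env: dict) -> ExportConfig:
    """Build the export settings from environment variables only."""
    # OTEL_RESOURCE_ATTRIBUTES is a comma-separated list of key=value pairs.
    attrs = {}
    for pair in env.get("OTEL_RESOURCE_ATTRIBUTES", "").split(","):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attrs[key.strip()] = value.strip()
    return ExportConfig(
        service_name=env.get("OTEL_SERVICE_NAME", "unknown_service"),
        # Default OTLP/gRPC endpoint when nothing is configured.
        endpoint=env.get("OTEL_EXPORTER_OTLP_ENDPOINT", "http://localhost:4317"),
        resource_attributes=attrs,
    )


# Same application code, two deployments: only the environment differs.
shared = {
    "OTEL_SERVICE_NAME": "payment-svc",
    "OTEL_RESOURCE_ATTRIBUTES": "deployment.environment=prod,service.version=1.4.2",
}
before = load_export_config(
    {**shared, "OTEL_EXPORTER_OTLP_ENDPOINT": "http://collector.vendor-a.example:4317"}
)
after = load_export_config(
    {**shared, "OTEL_EXPORTER_OTLP_ENDPOINT": "http://collector.vendor-b.example:4317"}
)
```

The instrumented code is identical in both deployments. Only the endpoint in the environment changed, which is the property the one-sentence case above relies on.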
Why “now” — the strategic timing case
Teams often react to OpenTelemetry with the same logic they apply to any new standard: “it sounds good, but we will adopt it when it is more mature / when we have more time / when our next observability contract is up for renewal.” This is the wrong framing. There are three reasons why adoption now is strategically different from adoption later.
Reason 1 — your vendor already supports it
Every major observability platform now accepts OTLP as a native ingestion format: Splunk Observability Cloud, Dynatrace, Datadog, Grafana Cloud, New Relic, Honeycomb, Elastic, all of them. This was not true three years ago. The vendor compatibility gap has closed. There is no longer a technical reason to use a vendor-specific agent for new instrumentation when OTel instrumentation works equally well with your current vendor and every plausible replacement.
Reason 2 — the cost of migration compounds over time
Every new service your team ships with vendor-specific instrumentation adds to the migration debt you will eventually pay. A fifty-service environment fully instrumented with vendor-specific code is a six-month migration project. For the same fifty services instrumented with OTel, the instrumentation side of the move is a configuration change. The cost is asymmetric: the team that adopts OTel late pays a migration cost that grows with every service shipped in the interim, while the team that adopts OTel early pays a small upfront cost and never pays to re-instrument again.
Reason 3 — your next vendor decision is already partly made
For teams considering an observability platform migration — the Splunk Observability Cloud to Dynatrace evaluation that many organisations are running right now is a specific example — the OTel adoption decision is effectively part of the migration. Committing to OTel before or during the migration means the instrumentation work is done once and then portable forever. Committing to the new vendor’s native agent during the migration locks you in for another cycle and sets up the same migration pain the next time you evaluate.
The observability vendor migration cost, with and without OTel:
Without OTel (vendor-native instrumentation):
Phase                          Engineering effort
─────────────────────────────  ───────────────────────
Rewrite application            4–8 weeks per language
instrumentation                × number of languages
Rebuild dashboards             1–2 weeks per critical
in new vendor                  service
Recalibrate alert              2–4 weeks for SLO/burn
thresholds in new tool         rate alerts
Re-instrument custom           1–3 weeks per service
metrics
Parallel-run both              2–3 months of duplicate
platforms for validation       vendor cost
Total for 50-service environment: 4–6 months, 2–4 engineers
────────────────────────────────────────────────────────────
With OTel already in place:
Phase                          Engineering effort
─────────────────────────────  ───────────────────────
Change OTLP export             1 day per Collector
endpoint in Collector          instance (or IaC
                               change rolled across)
Validate data parity           1–2 weeks total
Rebuild dashboards             1–2 weeks per critical
(still manual)                 service
Calibrate alerts in            1–2 weeks for SLO/burn
new vendor UI                  rate alerts
Parallel-run both              Data dual-writes to
platforms for validation       both backends from
                               Collector, 2–3 weeks
Total for 50-service environment: 4–6 weeks, 1–2 engineers
The difference is not marginal. On these estimates, a team with OTel already in place completes an observability vendor migration roughly four times faster than a team without it, because the instrumentation work, the part that means rewriting integration code, is done once and kept.
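The "roughly four times" figure can be checked against the two totals directly. The sketch below only redoes the arithmetic on the ranges quoted above; converting months to weeks at 52/12 and comparing range midpoints are assumptions made for the calculation.

```python
WEEKS_PER_MONTH = 52 / 12  # about 4.33


def midpoint(low: float, high: float) -> float:
    """Middle of an estimate range."""
    return (low + high) / 2


# Calendar time in weeks, taken from the two "Total" lines.
without_otel_weeks = (4 * WEEKS_PER_MONTH, 6 * WEEKS_PER_MONTH)
with_otel_weeks = (4, 6)

speedup = midpoint(*without_otel_weeks) / midpoint(*with_otel_weeks)

# Engineer-weeks = calendar weeks x engineers, at the low and high ends.
without_effort = (without_otel_weeks[0] * 2, without_otel_weeks[1] * 4)
with_effort = (with_otel_weeks[0] * 1, with_otel_weeks[1] * 2)

print(f"calendar speedup: {speedup:.1f}x")
print(f"engineer-weeks without OTel: {without_effort[0]:.0f} to {without_effort[1]:.0f}")
print(f"engineer-weeks with OTel:    {with_effort[0]} to {with_effort[1]}")
```

On these ranges the calendar speedup comes out at about 4.3x, and the effort gap is wider still: roughly 35 to 104 engineer-weeks against 4 to 12.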
The OpenTelemetry architecture — what you are actually deploying
OpenTelemetry has three distinct architectural components, and understanding which is which is the prerequisite for any serious adoption discussion. Most confusion about OTel in practice comes from conflating these three layers.
OpenTelemetry architectural layers:
────────────────────────────────────────────────────────────────
Layer 1: APPLICATION INSTRUMENTATION
  What it is       → code changes in your application to emit
                     spans, metrics, and log records
  Where it lives   → inside your application binary
  What it needs    → language-specific OTel SDK (e.g. OTel
                     Java, Python, Go, .NET, Node.js)
  Two flavours:
    Auto-instrum.   → agent or library that instruments common
                      frameworks (HTTP servers, database
                      clients, gRPC, etc.) with no code changes
    Manual instrum. → explicit API calls in your code to
                      create spans and record metrics for
                      business-specific events
────────────────────────────────────────────────────────────────
Layer 2: THE COLLECTOR
  What it is       → a separate service that receives, processes,
                     and exports telemetry
  Where it lives   → deployed alongside your applications,
                     typically as a DaemonSet in Kubernetes
  Why it exists    → decouples application code from backend
                     choice; handles batching, retry, filtering,
                     enrichment; enables multi-backend export
  Two deployment patterns:
    Agent           → one Collector per node (DaemonSet), apps
                      send telemetry to the node-local
                      Collector on port 4317
    Gateway         → a cluster-wide Collector tier that agents
                      forward to; does the heavy processing
────────────────────────────────────────────────────────────────
Layer 3: SEMANTIC CONVENTIONS
  What it is       → a set of standardised attribute names for
                     common telemetry dimensions
  Where it lives   → applied at both instrumentation and
                     Collector layers
  Why it matters   → ensures that http.response.status_code
                     means the same thing in every service;
                     enables automatic correlation between
                     signals
────────────────────────────────────────────────────────────────
Protocol: OTLP (OpenTelemetry Protocol)
  → the wire format used between every layer
  → gRPC-based (preferred) or HTTP-based (fallback)
  → supported as native ingestion by every major backend
The Collector is the piece that matters most
Of the three layers, the OpenTelemetry Collector is the one that does the most work and provides the most strategic flexibility. The Collector is where you implement:
Multi-backend export → send the same telemetry to both Splunk Observability Cloud and Dynatrace simultaneously during a migration, or send metrics to one backend and logs to another permanently
Tail-based sampling → the complex sampling strategy from Week 9 runs in the Collector, not the application — so changes to sampling policy do not require application redeploys
Attribute enrichment → add cluster name, region, environment tags, or Kubernetes metadata automatically to every telemetry item without touching application code
Data filtering and redaction → strip personally identifiable information, filter out health check noise, or drop high-cardinality dimensions before they hit your backend (and your bill)
Protocol translation → receive telemetry in legacy formats (Jaeger, Zipkin, Prometheus) and export to OTLP — a critical capability for migrating instrumentation incrementally
Think of the Collector as the observability equivalent of an Envoy sidecar or an API gateway. It sits at a choke point in your telemetry flow where cross-cutting concerns can be applied consistently, without scattering that logic across hundreds of application codebases. Changes to sampling, enrichment, filtering, or routing happen in one place — usually in GitOps-managed configuration — rather than in every application that emits telemetry.
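The choke-point idea can be sketched in a few lines of Python. This is a toy model of a Collector pipeline, not the Collector itself: records are plain dictionaries, the three processors and the cluster name `prod-eu-1` are invented for illustration, and the two "backends" are just lists.

```python
from typing import Callable, Optional

Record = dict  # one telemetry item, reduced to its attributes
Processor = Callable[[Record], Optional[Record]]


def enrich(record: Record) -> Record:
    """Add cluster-level context the application does not know about."""
    return {**record, "k8s.cluster.name": "prod-eu-1", "deployment.environment": "prod"}


def drop_health_checks(record: Record) -> Optional[Record]:
    """Filter out health check noise before it reaches any backend."""
    return None if record.get("http.route") == "/healthz" else record


def redact_pii(record: Record) -> Record:
    """Strip an attribute that carries personal data."""
    return {k: v for k, v in record.items() if k != "user.email"}


def run_pipeline(records, processors, exporters) -> None:
    """Apply each processor in order, then fan out to every exporter."""
    for record in records:
        for process in processors:
            record = process(record)
            if record is None:  # a processor dropped the item
                break
        else:
            for export in exporters:
                export(record)


backend_a, backend_b = [], []
run_pipeline(
    records=[
        {"service.name": "payment-svc", "http.route": "/api/v1/payments",
         "user.email": "a@example.com"},
        {"service.name": "payment-svc", "http.route": "/healthz"},
    ],
    processors=[enrich, drop_health_checks, redact_pii],
    exporters=[backend_a.append, backend_b.append],
)
```

Both backends receive the same single, enriched, redacted record, and neither application nor backend had to know about the other. Changing the policy means editing the `processors` list in one place.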
Semantic conventions — the layer that makes correlation automatic
Week 9 ended with a list of OpenTelemetry semantic conventions and a strong claim: consistent field naming across your logs, metrics, and traces is what makes automatic correlation possible. That claim is worth unpacking further, because the power of semantic conventions is not obvious until you have seen what their absence costs.
Semantic conventions are a published, versioned set of standard attribute names for common telemetry dimensions. They define that the service name is always service.name, never svc or service or app_name. The HTTP status code is always http.response.status_code, never http_status or status_code or response_code. The W3C trace ID is always trace_id, with the same format and propagation rules in every language’s SDK.
# The OTel semantic convention subset you should adopt immediately:
# these are the highest-leverage attributes for SRE workflows.
# The conventions are versioned and some names change between
# releases (newer releases use deployment.environment.name and
# db.namespace, for example), so pin a version and check these
# names against the published registry.
# Resource attributes (emitted on every signal)
service.name                → logical service identifier
service.version             → deployed version (links to deployments)
service.namespace           → logical grouping (e.g. "payments")
deployment.environment      → prod / staging / dev
k8s.cluster.name            → Kubernetes cluster identifier
k8s.namespace.name          → Kubernetes namespace
k8s.pod.name                → pod name (for per-pod troubleshooting)
k8s.node.name               → node the pod runs on
# HTTP attributes (on HTTP request spans and metrics)
http.request.method         → GET, POST, etc.
http.response.status_code   → 200, 404, 500, etc.
http.route                  → /api/v1/payments (NOT the full URL;
                              full URL would cause cardinality blowup)
url.scheme                  → http or https
# Database attributes (on DB client spans)
db.system                   → postgresql, redis, mysql
db.operation                → SELECT, INSERT, etc.
db.name                     → logical database name
# Messaging attributes (on queue/stream spans)
messaging.system            → kafka, rabbitmq
messaging.destination.name  → topic or queue name
messaging.operation         → publish, receive, process
# Error attributes (on error events and logs)
exception.type              → the exception class name
exception.message           → the exception message
exception.stacktrace        → the full stack trace
When every service in your system emits these consistently, in logs, in metric labels, and in trace span attributes, your observability backend does the correlation work for you. A burn rate alert fires with service.name=payment-svc. The dashboard links to traces filtered by that attribute. The traces link to logs carrying the same trace_id. The logs carry the same service.name, which links to the service catalog entry. The four pillars from Week 9 stitch together automatically, not through manual correlation queries at 3 AM.
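The alert-to-trace-to-log walk described above is, underneath, two joins on shared attribute names. A minimal sketch, assuming telemetry reduced to plain dictionaries; the alert name, trace IDs, and log bodies are made up:

```python
alert = {"alert": "payment-burn-rate", "service.name": "payment-svc"}

traces = [
    {"trace_id": "4bf92f3577b34da6a3ce929d0e0e4736",
     "service.name": "payment-svc", "status": "ERROR"},
    {"trace_id": "0af7651916cd43dd8448eb211c80319c",
     "service.name": "cart-svc", "status": "OK"},
]

logs = [
    {"trace_id": "4bf92f3577b34da6a3ce929d0e0e4736",
     "service.name": "payment-svc", "body": "card processor timeout"},
    {"trace_id": "0af7651916cd43dd8448eb211c80319c",
     "service.name": "cart-svc", "body": "cart updated"},
]


def traces_for_alert(alert: dict, traces: list) -> list:
    """Alert to traces: join on the shared service.name attribute."""
    return [t for t in traces if t["service.name"] == alert["service.name"]]


def logs_for_trace(trace: dict, logs: list) -> list:
    """Trace to logs: join on the shared trace_id."""
    return [entry for entry in logs if entry["trace_id"] == trace["trace_id"]]


suspect = traces_for_alert(alert, traces)[0]
evidence = logs_for_trace(suspect, logs)
```

If one service wrote `svc` instead of `service.name`, the first join would silently return nothing for it. That is the cost of inconsistent naming in miniature.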
Why custom conventions are the wrong answer
Many teams, particularly those with existing instrumentation, react to semantic conventions with some version of: “we already use svc everywhere, it would be easier to keep our own names.” This is wrong for three reasons, all of which compound over time.
1. Your observability backend has built-in dashboards, alerts, and correlation rules that expect the OTel names. Dynatrace, Grafana, and every other modern platform use the OTel conventions as defaults. Using custom names means disabling every out-of-the-box feature and rebuilding equivalent logic manually.
2. Your auto-instrumentation libraries emit the OTel names. Every OTel auto-instrumentation library (the Java agent, the Python instrumentations, the Go instrumentations) emits conventions-compliant attributes. If your manual instrumentation uses custom names, you have two naming schemes in the same service and every query has to handle both.
3. Every new engineer who joins has to learn your custom convention. OTel conventions are documented, searchable, and shared across the industry. Custom conventions are tribal knowledge that burns onboarding time and creates inconsistency every time someone forgets the rule.
Adopt the OTel conventions as they are. If you have existing instrumentation with different names, use the Collector’s attribute processor to map old names to convention-compliant names during migration, so applications can be updated over time without breaking queries immediately.
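The renaming step is simple enough to sketch. The function below shows the kind of mapping you would express in Collector configuration, written in Python for clarity. It is not the Collector's attribute processor, and the legacy names in `LEGACY_TO_CONVENTION` are hypothetical examples of what an existing codebase might use.

```python
# Hypothetical legacy-to-convention mapping for one team's services.
LEGACY_TO_CONVENTION = {
    "svc": "service.name",
    "http_status": "http.response.status_code",
    "env": "deployment.environment",
}


def normalise(attributes: dict, mapping: dict = LEGACY_TO_CONVENTION) -> dict:
    """Rename legacy keys; a convention-compliant key already present wins."""
    result = {k: v for k, v in attributes.items() if k not in mapping}
    for old, new in mapping.items():
        if old in attributes:
            result.setdefault(new, attributes[old])
    return result


legacy_span = {"svc": "payment-svc", "http_status": 500, "region": "eu"}
half_migrated_span = {"svc": "payment", "service.name": "payment-svc"}

clean = normalise(legacy_span)
merged = normalise(half_migrated_span)
```

Letting the convention-compliant value win when both names are present is a design choice for this sketch: it means a service can be migrated in place without its new attributes being overwritten by stale legacy ones.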
OpenTelemetry in a service mesh environment
For teams running Istio with STRICT mTLS — which Week 9 touched on briefly — the relationship between the service mesh and OpenTelemetry deserves specific attention. Three layers of telemetry exist in this environment, and all three need to fit together cleanly.
Telemetry layers in an Istio mTLS environment:
────────────────────────────────────────────────────────────────
Layer A: Envoy sidecar telemetry (network-level)
  What it sees   → request arrival, response departure, status
                   code at the proxy boundary, TLS events,
                   connection pool state
  What emits it  → Envoy, configured via Istio Telemetry API
  Where it goes  → spans can be emitted directly via the
                   OpenTelemetry tracing provider; metrics are
                   exposed in Prometheus format for scraping
  Strength       → zero application-code changes required
  Limitation     → no business context, no app-internal timing,
                   no database or downstream call granularity
Layer B: Application telemetry (business-level)
  What it sees   → everything inside the application:
                   business logic, database calls, cache hits,
                   feature flag evaluations, error context
  What emits it  → OTel SDK in the application
  Where it goes  → OTLP to local Collector → backends
  Strength       → rich business context, full call granularity
  Limitation     → requires explicit trace context propagation
                   on every outbound call (the sidecar cannot
                   link inbound and outbound requests for you,
                   with or without mTLS)
Layer C: Kubernetes and infrastructure telemetry
  What it sees   → pod lifecycle, resource utilisation,
                   scheduling events, node health
  What emits it  → Kubernetes, kubelet, cAdvisor, node exporters
  Where it goes  → scraped by Collector or Prometheus,
                   forwarded to backends
  Strength       → infrastructure saturation and health
  Limitation     → no connection to business context
────────────────────────────────────────────────────────────────
The integration that makes them one system:
  All three layers emit with consistent service.name,
  k8s.pod.name, and k8s.namespace.name attributes.
  Envoy spans become parents of application spans via
  trace context propagation in the request headers.
  Kubernetes events correlate with spans via pod name
  and timestamp, surfaced in the observability UI.
  When done correctly, an engineer investigating a
  slow request can see:
    → Envoy span showing 1.2s inbound latency
    → Application span showing 50ms of app logic
    → Application span showing 1.1s database call
    → Pod metrics showing memory pressure at the same time
    → Kubernetes event showing pod rescheduling
  all automatically correlated by shared attributes.
Istio Telemetry API and OpenTelemetry
Istio’s Telemetry API can be configured to emit OTel-native spans directly, rather than the legacy Zipkin or Jaeger formats, while Envoy metrics stay in Prometheus format for the Collector to scrape. This is the correct configuration for any new Istio deployment and the target for any existing deployment to migrate toward. The tracing provider it references has to be defined as an extension provider in the mesh configuration. The rest lives in a Telemetry CRD that is straightforward to manage via GitOps:
apiVersion: telemetry.istio.io/v1
kind: Telemetry
metadata:
  name: otel-tracing
  namespace: istio-system
spec:
  # Applied to all workloads in the mesh
  tracing:
    - providers:
        # Must match an extension provider defined in the mesh config
        - name: otel
      randomSamplingPercentage: 10.0
      customTags:
        # Ensure resource attributes propagate from Envoy
        environment:
          literal:
            value: production
        cluster:
          environment:
            name: K8S_CLUSTER_NAME
  metrics:
    - providers:
        - name: prometheus
      overrides:
        - match:
            metric: ALL_METRICS
          tagOverrides:
            # Apply semantic conventions to Envoy-emitted metrics.
            # Istio evaluates override values as expressions, so
            # verify this value against your Istio version.
            destination_service:
              operation: UPSERT
              value: "%{DESTINATION_SERVICE_NAME}"
Automation-first principle for mesh telemetry: the Istio Telemetry configuration and the OpenTelemetry Collector configuration should both live in Git and be deployed via Argo CD, not configured by hand through kubectl. Changes to sampling policy, attribute enrichment, or backend routing become pull requests with review and rollback, the same engineering discipline you apply to application code. Mesh telemetry configuration that is edited directly in-cluster is operational toil that will drift between environments and be forgotten during incident postmortems.
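One detail from the layer breakdown earlier in this section is worth making concrete: the application has to carry trace context from the inbound request onto every outbound call, because the sidecar cannot connect the two. OTel SDKs and auto-instrumentation normally do this for you. The sketch below shows the mechanics by hand using the W3C `traceparent` header format. The helper name `child_traceparent` is hypothetical, marking a brand-new trace as sampled is an arbitrary choice, and the sketch skips parts of the W3C rules (for example, rejecting all-zero IDs).

```python
import re
import secrets

# W3C Trace Context: version-traceid-parentid-flags, all lowercase hex.
TRACEPARENT = re.compile(r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def child_traceparent(inbound_headers: dict) -> str:
    """Return the traceparent header value to send on an outbound call.

    Keeps the inbound trace ID so the downstream span joins the same trace,
    and generates a fresh span ID for the outbound call. Starts a new trace
    when the inbound header is missing or malformed.
    """
    match = TRACEPARENT.match(inbound_headers.get("traceparent", ""))
    if match:
        _version, trace_id, _parent_id, flags = match.groups()
    else:
        trace_id, flags = secrets.token_hex(16), "01"
    return f"00-{trace_id}-{secrets.token_hex(8)}-{flags}"


inbound = {"traceparent": "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"}
outbound = child_traceparent(inbound)
```

The trace ID survives the hop and the span ID changes. If a service drops the header instead, the downstream spans start a new trace and the Envoy and application views of the same request no longer line up.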
The migration path — how to adopt OTel without a rewrite
The realistic constraint for most teams is that they have existing instrumentation — often years of it — using vendor-specific agents, Prometheus client libraries, or legacy tracing SDKs. “Rewrite everything to OTel” is not a credible migration plan. The good news is that it is also not necessary. OTel is explicitly designed to allow incremental adoption, and the Collector is the piece that makes that adoption path work.
The four-phase adoption roadmap
Phase 1: Deploy the Collector (no application changes)
Weeks 1–2
  → Deploy OTel Collector as a DaemonSet in each cluster
  → Configure OTLP, Prometheus, and Jaeger/Zipkin receivers
    (accept telemetry in every current format your apps emit)
  → Export to your existing observability backend(s)
  → Validate data parity: metrics in the backend are identical
    to what was sent before the Collector was introduced
  At end of Phase 1: no app changes, Collector is in the path,
                     observability unchanged from the
                     user's perspective
────────────────────────────────────────────────────────────────
Phase 2: Enable OTel auto-instrumentation for new services
Weeks 3–6
  → Update service scaffold templates to include OTel
    auto-instrumentation libraries by default
  → New services emit OTel-native telemetry from day one
  → Collector receives OTLP directly from new services,
    legacy formats from existing services, with both exported
    consistently to backend
  At end of Phase 2: all NEW services are OTel-native;
                     existing services untouched
────────────────────────────────────────────────────────────────
Phase 3: Migrate existing services opportunistically
Months 2–6
  → When a service is actively being modified for any
    reason (feature work, dependency upgrade, refactor),
    migrate its instrumentation to OTel as part of the
    change
  → Priority services for migration: those involved in
    frequent incidents, those with custom instrumentation
    that breaks during upgrades, those with the highest
    operational burden
  → No "instrumentation migration sprint": this is
    opportunistic work that rides on other development
  At end of Phase 3: 60–80% of services are OTel-native,
                     depending on development velocity
────────────────────────────────────────────────────────────────
Phase 4: Complete the migration
Months 6–12
  → Remaining services are migrated as deliberate
    reliability investments, justified by specific
    operational pain their legacy instrumentation causes
  → Legacy instrumentation libraries deprecated
  → Collector receivers for legacy formats removed
  → OTel becomes the only instrumentation standard
  At end of Phase 4: 100% OTel adoption; vendor migration
                     option is now available as a
                     configuration change
This roadmap is aggressive but achievable, and it is specifically designed to avoid the “big bang migration” that kills most instrumentation standardisation efforts. Rewrite-everything observability migrations rarely succeed. Successful OTel adoptions tend to follow some version of the pattern above: deploy the compatibility layer first, default new work to the new standard, migrate existing work opportunistically.
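The Phase 1 exit criterion, data parity, is worth automating rather than eyeballing. Here is a minimal sketch of such a check, assuming you can export per-metric totals for the same time window from before and after the Collector was placed in the path. The metric names, the numbers, and the 1% tolerance are all illustrative.

```python
def parity_report(before: dict, after: dict, tolerance: float = 0.01) -> dict:
    """Compare per-metric totals from two collection paths.

    Returns the metrics that are missing on either side or whose totals
    differ by more than `tolerance` (relative, so 0.01 means 1%).
    """
    problems = {}
    for name in sorted(set(before) | set(after)):
        if name not in after:
            problems[name] = "missing after Collector"
        elif name not in before:
            problems[name] = "new after Collector"
        else:
            baseline = before[name]
            drift = abs(after[name] - baseline) / max(abs(baseline), 1e-9)
            if drift > tolerance:
                problems[name] = f"drift {drift:.1%}"
    return problems


direct_path = {"http.server.requests": 120_000, "db.client.errors": 42, "queue.depth": 17}
via_collector = {"http.server.requests": 119_500, "db.client.errors": 50}

report = parity_report(direct_path, via_collector)
```

An empty report is the signal to move on to Phase 2. Here the request count is within tolerance, while the error counter has drifted and one metric has gone missing, so both would need explaining first.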
The Collector configuration — a production-grade starting point
Most OTel Collector examples online show minimal configurations that are suitable for getting started but insufficient for production use. Here is a closer-to-production Collector configuration for a Kubernetes environment, with the pieces that matter for the SRE framework this series is building.
# otel-collector-config.yaml
# Production-grade Collector configuration for a Kubernetes
# environment with Istio mTLS and SLO-based alerting
receivers:
  # Accept OTLP from applications on standard ports
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
  # Scrape Prometheus metrics from legacy services
  # during the migration period
  prometheus:
    config:
      scrape_configs:
        - job_name: 'kubernetes-pods'
          kubernetes_sd_configs:
            - role: pod
          relabel_configs:
            - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
              action: keep
              regex: "true"
  # Host metrics (node-level CPU, memory, disk, network)
  hostmetrics:
    collection_interval: 30s
    scrapers:
      cpu:
      memory:
      disk:
      network:
processors:
  # Enrich all telemetry with Kubernetes metadata
  k8sattributes:
    auth_type: "serviceAccount"
    extract:
      metadata:
        - k8s.namespace.name
        - k8s.pod.name
        - k8s.pod.uid
        - k8s.node.name
        - k8s.deployment.name
      labels:
        - tag_name: app
          key: app.kubernetes.io/name
          from: pod
  # k8sattributes cannot discover the cluster name, so set it
  # explicitly (assumes K8S_CLUSTER_NAME is set on the Collector pod)
  resource/cluster:
    attributes:
      - key: k8s.cluster.name
        value: ${env:K8S_CLUSTER_NAME}
        action: upsert
  # Batch telemetry for efficient export
  batch:
    send_batch_size: 10000
    timeout: 10s
  # Tail-based sampling for traces (keep all errors and slow
  # traces, sample others at 10%). All spans of a trace must
  # reach the same Collector instance for this to work.
  tail_sampling:
    decision_wait: 10s
    num_traces: 100000
    policies:
      - name: errors-policy
        type: status_code
        status_code:
          status_codes: [ERROR]
      - name: slow-traces-policy
        type: latency
        latency:
          threshold_ms: 1000
      - name: probabilistic-policy
        type: probabilistic
        probabilistic:
          sampling_percentage: 10
  # Drop high-cardinality attributes that would blow
  # up metric cardinality
  attributes/drop-high-cardinality:
    actions:
      - key: user.id
        action: delete
      - key: request.id
        action: delete
  # Memory limit to prevent Collector OOM
  memory_limiter:
    check_interval: 1s
    limit_percentage: 75
    spike_limit_percentage: 25
exporters:
  # Primary backend: current observability platform
  otlp/primary:
    endpoint: "splunk-otel-collector.observability:4317"
    tls:
      insecure: false
  # Secondary backend: parallel-run during migration
  # Comment out outside migration window
  otlp/secondary:
    endpoint: "dynatrace-collector.observability:4317"
    tls:
      insecure: false
  # Prometheus remote-write for services that still
  # query Prometheus directly
  prometheusremotewrite:
    endpoint: "http://prometheus:9090/api/v1/write"
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, k8sattributes, resource/cluster,
                   tail_sampling, batch]
      exporters: [otlp/primary, otlp/secondary]
    metrics:
      receivers: [otlp, prometheus, hostmetrics]
      processors: [memory_limiter, k8sattributes, resource/cluster,
                   attributes/drop-high-cardinality, batch]
      exporters: [otlp/primary, otlp/secondary,
                  prometheusremotewrite]
    logs:
      receivers: [otlp]
      processors: [memory_limiter, k8sattributes, resource/cluster,
                   attributes/drop-high-cardinality, batch]
      exporters: [otlp/primary, otlp/secondary]
Three properties of this configuration deserve specific attention. First, the dual-export pattern (otlp/primary and otlp/secondary) is what makes observability vendor migrations painless: the same telemetry goes to both vendors for the duration of the parallel-run period, and the cutover is a configuration change rather than a rewrite. Second, the tail-based sampling configuration implements exactly the policy Week 9 recommended: keep all error traces and all slow traces unconditionally, sample everything else at 10%. Tail sampling only decides correctly when every span of a trace reaches the same Collector instance, so in a multi-node cluster run this processor on the gateway tier rather than on per-node agents. Third, the attribute enrichment via the k8sattributes processor adds Kubernetes context to every telemetry item automatically, with no application-code changes required.
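The sampling policy is easier to reason about as a decision function. The sketch below is a simplified stand-in for the three policies in the configuration, not the tail_sampling processor's implementation: the span fields (`status`, `start_ms`, `end_ms`) are hypothetical, treating a trace exactly at the threshold as slow is an arbitrary choice, and the hash-based probabilistic step is one common way to get a stable decision per trace ID.

```python
import hashlib


def keep_trace(trace_id: str, spans: list, latency_threshold_ms: int = 1000,
               sampling_percentage: float = 10.0) -> bool:
    """Decide whether to keep a completed trace. Any matching policy keeps it."""
    # Policy 1: keep every trace that contains an error span.
    if any(span["status"] == "ERROR" for span in spans):
        return True
    # Policy 2: keep every trace at or above the latency threshold.
    start = min(span["start_ms"] for span in spans)
    end = max(span["end_ms"] for span in spans)
    if end - start >= latency_threshold_ms:
        return True
    # Policy 3: keep a fixed share of the rest. Hashing the trace ID makes
    # the decision repeatable for a given trace.
    bucket = int(hashlib.sha256(trace_id.encode()).hexdigest(), 16) % 10_000
    return bucket < sampling_percentage * 100


error_trace = [{"status": "ERROR", "start_ms": 0, "end_ms": 5}]
slow_trace = [{"status": "OK", "start_ms": 0, "end_ms": 1500}]
healthy_trace = [{"status": "OK", "start_ms": 0, "end_ms": 40}]

kept_healthy = sum(keep_trace(f"{i:032x}", healthy_trace) for i in range(10_000))
```

The decision needs the whole trace: an error in the last span to arrive changes the outcome for every span before it. That is why the processor buffers spans for `decision_wait` and why all spans of a trace have to land on the same Collector instance.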
OpenTelemetry and the observability migration decision
For teams currently evaluating or executing an observability platform migration — the Splunk Observability Cloud to Dynatrace evaluation is a representative example — OTel adoption is not a separate decision from the vendor migration. It is part of the migration. And the sequencing matters.
The correct sequencing:
1. Adopt OTel first, at least for the Collector layer. Deploy the Collector in your current environment. Point it at your current backend. Validate data parity. This is Phase 1 of the roadmap above and it should happen before the vendor decision is finalised.
2. Evaluate the new vendor using OTel-emitted telemetry. Your proof-of-concept should demonstrate that OTel-native telemetry, the exact same format your production applications will eventually emit, works correctly end-to-end with the new vendor. This is a stronger POC than one that uses the vendor’s native agent, because it validates the production migration path, not just the product’s features.
3. Execute the vendor migration as a Collector configuration change. With OTel in place, the migration from Vendor A to Vendor B becomes: change the Collector’s export endpoint, add the new backend as a secondary export during the parallel-run period, cut over primary export when confident, remove the old backend.
Teams that sequence the OTel adoption after the vendor migration pay the migration cost twice: once for the current migration, once for the next one. Teams that sequence it before — or concurrent with the vendor evaluation — pay it once.
The strategic framing for leadership: OpenTelemetry adoption is not a technical choice; it is a strategic optionality purchase. The value of OTel is not what it does today — it is the cost it prevents in every future observability platform decision. Presented this way, OTel is one of the highest-ROI engineering investments available, because the cost of adoption is small and fixed, while the cost of not adopting it compounds with every service your team ships.
Five concrete starts for this week
1. Deploy the OTel Collector in a non-production cluster. Use the Helm chart or the Kubernetes Operator. Configure it to receive OTLP and to export to your current backend. This is a half-day exercise and it is the foundation everything else in the roadmap builds on.
2. Enable OTel auto-instrumentation for one service. Pick a service that is not critical enough for the exercise to be scary, but large enough that the instrumentation output is meaningful. Deploy the language-specific auto-instrumentation library (Java agent, Python instrumentation, etc.) with zero code changes. Observe what it emits.
3. Audit your existing telemetry against semantic conventions. For your most critical service, catalogue the attribute names currently used in logs, metrics, and traces. Map them to the OpenTelemetry conventions. The gap analysis is your first-quarter instrumentation backlog.
4. Add OTel to your service scaffold template. If your team uses Backstage or a similar internal developer platform, modify the service creation template to include OTel auto-instrumentation by default. From this point forward, every new service is OTel-native without requiring a conscious decision, which is how standards actually get adopted at scale.
5. Configure Istio Telemetry API for OTel output. If you run a service mesh, the mesh-layer telemetry should flow through OTel rather than through legacy Zipkin/Jaeger paths. This is a GitOps configuration change that takes hours, not weeks, and immediately benefits every service in the mesh.
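The audit in the list above is the easiest of the five to script. A minimal sketch, assuming you have already collected the attribute names each signal uses for one service; the `ADOPTED` set is the subset of convention names from this article that a team might start with, and the observed names are invented.

```python
# Convention names this (hypothetical) team has decided to adopt first.
ADOPTED = {
    "service.name", "service.version", "deployment.environment",
    "http.request.method", "http.response.status_code", "http.route",
    "trace_id",
}


def audit(observed: dict) -> dict:
    """Map each signal to the attribute names that are not in ADOPTED."""
    gaps = {signal: sorted(set(names) - ADOPTED) for signal, names in observed.items()}
    return {signal: names for signal, names in gaps.items() if names}


observed_names = {
    "logs": ["svc", "trace_id", "status_code"],
    "metrics": ["service.name", "http.route"],
    "traces": ["service.name", "http_status"],
}

backlog = audit(observed_names)
```

Each entry in `backlog` is a rename to schedule, and a signal that drops out of the result is one that already correlates cleanly with the others.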
Next week: Structured logging done right — from printf to queryable events. We will cover the log-to-metric transition, the field conventions that enable log-trace correlation, and the instrumentation patterns that make your log store a first-class participant in the observability framework rather than a parallel, disconnected data silo.


