About the Traces app
Note
Version 1.0.0 marks the General Availability (GA) of the ITRS Analytics Traces app, transitioning from its Beta phase to a fully supported, production-ready release. The Traces app provides comprehensive, interactive visualizations of data flows in distributed systems, enabling both high-level and detailed operational insights.
Navigate the ITRS Analytics Traces app
From high-level overviews to granular insights into individual operations, the Traces app provides an interactive visualization of data flows in your system. It shows how fragmented data points from different services fit together, so you can better understand the performance of your services.
Tip
Watch this product walkthrough to quickly familiarize yourself with the key UI elements and available actions in the app. This guided overview allows you to explore the app’s capabilities and understand its features without needing to install it first.
Filtering traces
These filters allow you to narrow down and focus on specific services, instances, and operations within your system. Below is a description of each filter field shown in the interface.
Filter field | Description |
---|---|
Service namespace | Represents the logical grouping of services in your environment, such as production, staging, or a specific Kubernetes namespace. |
Service name | Refers to the name of the service emitting the trace or spans (for example, user-service, payment-service). |
Service instance ID | Identifies a specific instance of a service, such as a single pod, container, or virtual machine running that service. |
Operation | Represents a specific operation or endpoint within a service, such as GET /users or POST /checkout. This filter also lets you view only the traces related to a specific route or function. |
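As a rough illustration of how the four filter fields narrow a set of spans, the following Python sketch applies them to a small in-memory list. The `SpanRecord` class, the `matches` helper, and the sample values are hypothetical and are not part of the Traces app. Combining the filters with AND is also an assumption, since the app's exact matching rules are not described here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SpanRecord:
    """Hypothetical record holding the four filterable fields."""
    namespace: str
    service: str
    instance_id: str
    operation: str


def matches(span: SpanRecord, *, namespace: Optional[str] = None,
            service: Optional[str] = None, instance_id: Optional[str] = None,
            operation: Optional[str] = None) -> bool:
    """Return True when the span satisfies every filter that is set.

    A filter left as None is ignored. Combining filters with AND is an
    assumption made for this sketch.
    """
    wanted = {
        "namespace": namespace,
        "service": service,
        "instance_id": instance_id,
        "operation": operation,
    }
    return all(value is None or getattr(span, field) == value
               for field, value in wanted.items())


# Illustrative data only.
spans = [
    SpanRecord("production", "user-service", "pod-1", "GET /users"),
    SpanRecord("production", "payment-service", "pod-7", "POST /checkout"),
    SpanRecord("staging", "payment-service", "pod-2", "POST /checkout"),
]

production_checkout = [
    s for s in spans
    if matches(s, namespace="production", operation="POST /checkout")
]
print(production_checkout)
```

Leaving every filter unset returns all spans, which mirrors an unfiltered view.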
Lookup by trace ID
Search for a specific Trace ID by using the Lookup by trace ID search bar at the top of the screen.
Highlight errors
Toggle the Highlight errors switch to highlight errors in the UI.
The UI will refresh to reflect your selected options.
Viewing traces in the Timeline
The Timeline chart provides a chronological overview of your trace data. Use this section to visualize the flow of events and quickly spot anomalies or performance trends over time.
In the Timeline view, red dots mark errors in operations and services, while blue dots indicate an Unset status, which is treated the same as OK.
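The status rule above can be sketched in a few lines of Python. The function names are hypothetical, and the exact label used for failed spans (`Error` here) is an assumption. Only the folding of Unset into OK comes from the description above.

```python
def effective_status(status: str) -> str:
    """Map a span status to the value used for display decisions.

    Unset is folded into OK, as described above; any other value is kept.
    """
    normalized = status.strip().upper()
    return "OK" if normalized in ("UNSET", "OK") else normalized


def is_error(status: str) -> bool:
    """True only for spans whose status is Error (assumed label)."""
    return effective_status(status) == "ERROR"


statuses = ["Unset", "OK", "Error"]
print([is_error(s) for s in statuses])
```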
In the Timeline section, review the visualization of trace timelines, including duration and latency. Select an option in the dropdown to examine a specific latency metric (such as p50, p95, or p99).
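To make the p50, p95, and p99 labels concrete, here is a minimal Python sketch that computes percentiles with the nearest-rank method. The `percentile` helper and the sample durations are illustrative. The app does not document which percentile method it uses, so its values may differ slightly, for example if it interpolates.

```python
import math
from typing import Sequence


def percentile(values: Sequence[float], p: int) -> float:
    """Nearest-rank percentile: the smallest value with at least p% of data at or below it."""
    if not values:
        raise ValueError("at least one value is required")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range 1..100")
    ordered = sorted(values)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Illustrative trace durations in milliseconds.
durations_ms = [12, 15, 11, 240, 18, 14, 13, 16, 950, 17]

for p in (50, 95, 99):
    print(f"p{p}: {percentile(durations_ms, p)} ms")
```

In this sample the median stays low while p95 and p99 are dominated by the slowest trace, which is why the higher percentiles are useful for spotting tail latency.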
Latency view
Examine the histogram in the Latency section to understand the distribution of trace latencies. This displays latency statistics for all traces that match the applied filters within the selected date range.
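A latency histogram of this kind can be thought of as counting traces per duration bucket. The Python sketch below uses fixed-width buckets. The `latency_histogram` helper, the 100 ms bucket width, and the sample data are assumptions for illustration, and the app may choose its bucket boundaries differently.

```python
from collections import Counter
from typing import Dict, Iterable


def latency_histogram(durations_ms: Iterable[int], bucket_width_ms: int) -> Dict[int, int]:
    """Count durations per bucket, keyed by the lower bound of each bucket."""
    if bucket_width_ms <= 0:
        raise ValueError("bucket_width_ms must be positive")
    counts = Counter((d // bucket_width_ms) * bucket_width_ms for d in durations_ms)
    return dict(sorted(counts.items()))


# Illustrative trace durations in milliseconds.
durations_ms = [12, 15, 11, 240, 18, 14, 13, 16, 950, 17]

histogram = latency_histogram(durations_ms, bucket_width_ms=100)
for lower_bound, count in histogram.items():
    print(f"{lower_bound:>4}-{lower_bound + 99} ms: {'#' * count}")
```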
Review aggregated data
Get a high-level summary of your system’s performance.
The Aggregates section provides a consolidated view of errors and latencies across different services and namespaces. Aggregates represent a high-level summary that groups trace data by root service and operation to help you quickly identify performance characteristics and potential issues across your distributed system.
You can easily review performance by namespace and service, and refer to the Errors and latency columns for detailed metrics.
Fields | Description |
---|---|
Root Service | Originating service that initiated the trace. Helps identify the source of a request. |
Root Operation | Entry-point operation (for example, API endpoint, background task) associated with the root span. |
Errors | Count of trace root spans that ended in an error. Useful for spotting failing operations. |
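The grouping behind the Aggregates section can be approximated as follows. This Python sketch groups traces by root service and root operation and counts root spans that ended in an error. The dictionary keys, the `Error` status label, and the `max_duration_ms` figure are hypothetical choices for this example and do not reflect the app's actual data model or latency columns.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Illustrative traces, each summarized by its root span.
traces = [
    {"root_service": "user-service", "root_operation": "GET /users",
     "status": "Unset", "duration_ms": 42},
    {"root_service": "payment-service", "root_operation": "POST /checkout",
     "status": "Error", "duration_ms": 310},
    {"root_service": "payment-service", "root_operation": "POST /checkout",
     "status": "OK", "duration_ms": 120},
]


def aggregate(items: List[dict]) -> Dict[Tuple[str, str], dict]:
    """Group traces by (root service, root operation) and summarize each group."""
    groups: Dict[Tuple[str, str], dict] = defaultdict(
        lambda: {"traces": 0, "errors": 0, "max_duration_ms": 0}
    )
    for trace in items:
        group = groups[(trace["root_service"], trace["root_operation"])]
        group["traces"] += 1
        if trace["status"] == "Error":  # assumed label for a failed root span
            group["errors"] += 1
        group["max_duration_ms"] = max(group["max_duration_ms"], trace["duration_ms"])
    return dict(groups)


summary = aggregate(traces)
for (service, operation), stats in summary.items():
    print(service, operation, stats)
```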
Working with Sample traces
Sample traces refer to a representative set of complete trace records collected based on the filters and time range applied. Each trace corresponds to a single request or workflow moving through one or more services in a distributed system.
- In the Sample traces section, locate the trace you want to inspect.
- Click View trace for the specific trace.
The UI transitions to a detailed view for the trace, where you can switch between the Timeline and Graph tabs. This information is also available in the Timeline view.
Exploring a trace with the Timeline tab
The Timeline tab provides a chronological breakdown, presenting the trace as a series of sequential operations (spans). Each row represents a distinct span.
- Review the details of each span, which lists its Operation, the Service it belongs to, and its Status (for example, Unset or OK).
- Check the bar in the chart for the corresponding time taken for the specific operation. The overall timeline provides a scale.
- Review the exact duration in milliseconds, which is shown next to the bar.
With this view, you can quickly identify operations that consume the most time within the overall trace.
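The same question, which operations take the most time, can be asked of exported span data. The Python sketch below ranks the spans of one trace by duration. The span dictionaries and the `slowest_spans` helper are hypothetical and only mirror the Operation, Service, Status, and duration columns described above.

```python
from typing import List

# Illustrative spans from a single trace.
spans = [
    {"operation": "POST /checkout", "service": "payment-service",
     "status": "Unset", "duration_ms": 200.0},
    {"operation": "charge-card", "service": "payment-service",
     "status": "OK", "duration_ms": 120.0},
    {"operation": "GET /users", "service": "user-service",
     "status": "Unset", "duration_ms": 50.0},
]


def slowest_spans(items: List[dict], limit: int = 2) -> List[dict]:
    """Return up to `limit` spans, longest duration first."""
    return sorted(items, key=lambda s: s["duration_ms"], reverse=True)[:limit]


for span in slowest_spans(spans):
    print(f'{span["operation"]:<16} {span["service"]:<16} {span["duration_ms"]} ms')
```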
Graph view
The Graph view displays the trace in a topology, illustrating the dependencies and flow between different operations (spans). Each circle or node in the graph represents an operation (span).
- Review parent-child or dependency relationships between operations through the lines that connect the nodes.
- Review the duration and percentage contribution to the total trace duration of each node.
- Hover over a node to quickly preview information about the operation.
This view helps in tracing the path of a request through various services and understanding the order in which they are called.
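The parent-child structure and the per-node percentages can be modeled with a small tree walk. In this Python sketch the span IDs, the field names, and the choice to use the root span's duration as the total trace duration are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Optional

# Illustrative spans; "parent" is None for the root span.
spans = [
    {"id": "a", "parent": None, "operation": "POST /checkout", "duration_ms": 200},
    {"id": "b", "parent": "a", "operation": "charge-card", "duration_ms": 120},
    {"id": "c", "parent": "a", "operation": "GET /users", "duration_ms": 50},
    {"id": "d", "parent": "b", "operation": "db.query", "duration_ms": 30},
]


def children_by_parent(items: List[dict]) -> Dict[Optional[str], List[dict]]:
    """Index spans by the ID of their parent span."""
    index: Dict[Optional[str], List[dict]] = defaultdict(list)
    for span in items:
        index[span["parent"]].append(span)
    return index


def describe(items: List[dict]) -> List[str]:
    """Walk the trace depth-first and report each span's share of the root duration."""
    index = children_by_parent(items)
    root = index[None][0]
    total = root["duration_ms"]  # assumption: the root span covers the whole trace
    lines: List[str] = []

    def visit(span: dict, depth: int) -> None:
        share = span["duration_ms"] / total * 100
        lines.append(f'{"  " * depth}{span["operation"]}: {span["duration_ms"]} ms ({share:.0f}%)')
        for child in index.get(span["id"], []):
            visit(child, depth + 1)

    visit(root, 0)
    return lines


print("\n".join(describe(spans)))
```

Indentation in the output stands in for the connecting lines of the graph, so a child operation always appears beneath the operation that called it.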
Span field details
Clicking a specific operation from either the Timeline or Graph tab opens a panel on the right side of the screen, showing the Span Details and Span Dimensions.
Span Details refer to the core metadata and runtime characteristics of a span. These provide insights into what the span represents, when it occurred, and how it performed.
Span Dimensions are metadata fields that enable categorization, filtering, aggregation, and analysis across many spans.