Metric Analysis

Note: Before going through this article, we recommend you know how to create a metric template.

Metric analysis compares each metric of the baseline (old release) with the same metric of the canary (new release) and assigns a score to every comparison. The individual metric scores are then averaged into an overall score, which indicates whether the canary passes or fails. You can use this overall score to decide whether to promote the application to production.
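The averaging step described above can be sketched as follows. This is a minimal illustration, not the product's implementation: the 0-100 score range, the metric names, the function names, and the pass threshold of 80 are all assumptions made for the example.

```python
from statistics import mean

def overall_score(metric_scores):
    """Average the per-metric scores (assumed to be on a 0-100 scale)."""
    if not metric_scores:
        raise ValueError("no metric scores to average")
    return mean(metric_scores.values())

def verdict(score, pass_threshold=80.0):
    """Map the overall score to pass/fail; the threshold is hypothetical."""
    return "pass" if score >= pass_threshold else "fail"

# Hypothetical per-metric scores from comparing baseline with canary.
scores = {"latency_p95": 90.0, "error_rate": 70.0, "cpu_usage": 80.0}
overall = overall_score(scores)
result = verdict(overall)
```

With these example scores the overall score is the plain mean of the three values, so a single weak metric can be masked by strong ones. That is one reason critical metrics are handled separately.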

From the “Analysis History” page, click “Metric Analysis” to view the score of each metric. Refer to the image below.

To view the comparison between the baseline and the new release for a metric, including any significant change observed in the metric's characteristics, click the drop-down arrow of that metric and then click the text as shown in the image below.

Critical Metrics

User-defined metrics that represent the KPIs of service behavior. The continuous verification test fails if any metric tagged as critical fails in the canary test. If the user does not tag any metrics as critical, the system treats all metrics equally and ranks them algorithmically.
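The critical-metric rule can be sketched as a gate that runs before the overall average is considered. The `MetricResult` type, the field names, and both thresholds are hypothetical and exist only for this example; the system's actual ranking and scoring algorithms are not shown here.

```python
from dataclasses import dataclass

@dataclass
class MetricResult:
    name: str
    score: float          # assumed 0-100 scale
    critical: bool = False

def verification_passes(results, metric_pass_score=80.0, overall_pass_score=80.0):
    """Fail if any critical metric fails; otherwise use the average score."""
    if not results:
        raise ValueError("no metric results to evaluate")
    # A single failing critical metric fails the whole test.
    for r in results:
        if r.critical and r.score < metric_pass_score:
            return False
    overall = sum(r.score for r in results) / len(results)
    return overall >= overall_pass_score

results = [
    MetricResult("error_rate", 40.0, critical=True),
    MetricResult("cpu_usage", 100.0),
    MetricResult("memory_usage", 100.0),
    MetricResult("disk_io", 100.0),
]
passed = verification_passes(results)
```

In this example the average score is 85, which would pass on its own, but the test still fails because the critical `error_rate` metric scored below the per-metric threshold.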

Watchlist Metrics

User-defined metrics that represent intuitive performance measures. These metrics are used for filtering when presenting results with a large number of analyzed metrics.

Metrics Groups

Metrics are grouped by service, and system-level metrics are grouped by network, compute, disk, and memory. This grouping lets users identify high-ranking and low-scoring groups when diagnosing issues.
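One way to picture the grouping is to average the metric scores within each group and list the weakest groups first. The tuple layout, metric names, and scores below are assumptions for illustration only, and averaging is a stand-in for however the system actually scores a group.

```python
from collections import defaultdict

def group_scores(metrics):
    """Average scores per group and return (group, score) pairs, lowest first.

    `metrics` is an iterable of (group, metric_name, score) tuples.
    """
    by_group = defaultdict(list)
    for group, _name, score in metrics:
        by_group[group].append(score)
    averages = {g: sum(s) / len(s) for g, s in by_group.items()}
    return sorted(averages.items(), key=lambda item: item[1])

# Hypothetical metric scores tagged with their group.
metrics = [
    ("network", "rx_bytes", 90.0),
    ("network", "tx_bytes", 70.0),
    ("memory", "rss", 95.0),
    ("disk", "iops", 60.0),
]
ranked = group_scores(metrics)
```

Here `ranked` starts with the `disk` group, which points the investigation at disk-related metrics before the healthier network and memory groups.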

Metric Details

Selecting an individual metric in the list shows its details, including statistics, box plots, and behavior throughout the canary run. Each metric row also has additional details on bucket scores. Each bucket is a timeslice used in the comparison, which lets users identify trends or a subset of load that could be causing a problem with the new build.
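The idea of comparing per timeslice can be sketched by splitting both series into equal-sized buckets and comparing them bucket by bucket. The difference of bucket means used here is a simplified stand-in for the real bucket score, and the sample values and bucket size are made up for the example.

```python
from statistics import mean

def split_into_buckets(samples, bucket_size):
    """Split a list of samples into consecutive timeslices of bucket_size."""
    return [samples[i:i + bucket_size] for i in range(0, len(samples), bucket_size)]

def bucket_deltas(baseline, canary, bucket_size):
    """Return canary mean minus baseline mean for each pair of buckets."""
    pairs = zip(split_into_buckets(baseline, bucket_size),
                split_into_buckets(canary, bucket_size))
    return [mean(c) - mean(b) for b, c in pairs]

# Hypothetical latency samples (ms) collected over the same time window.
baseline = [100, 102, 98, 100, 101, 99]
canary = [101, 99, 100, 150, 148, 152]
deltas = bucket_deltas(baseline, canary, bucket_size=3)
```

In this example the first bucket shows no difference while the second shows a jump of 50 ms, which is the kind of localized regression that a single whole-run average would blur.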