
Advanced functions

Anomaly Detection

The Kloudfuse platform can perform anomaly detection on the underlying data. Anomaly monitoring is effective for spotting deviations in a signal's behavior compared with its own past behavior. For example, is the number of requests unusually different from its past behavior? Kloudfuse provides the following algorithms for anomaly detection:

  • Basic (Rolling-Quantile) is a statistical learning algorithm that detects anomalies based on earlier behavior, measured by the specified quantile. The following parameters are provided to tune the algorithm:

    • Threshold: the number of standard deviations a value has to be away from the mean to be considered anomalous. For example, a value of 1 corresponds to a 0.68 quantile, because about 68% of normally distributed values lie within 1 standard deviation of the mean.

  • RRCF (Robust Random Cut Forest) is a machine learning algorithm for detecting anomalies in large datasets. It uses a tree-based ensemble method to identify outliers based on their relative isolation within the dataset. The algorithm builds a set of binary trees from random subsamples of the data and measures how isolated each point is by how few random cuts are needed to separate it from the rest of the data in each tree. Points that are separated after fewer cuts than the majority of points are more isolated, and therefore more likely to be anomalous. The algorithm is robust to high-dimensional data, skewed distributions, and noisy or irrelevant features, which makes it suitable for a wide range of anomaly detection and outlier analysis applications. The following parameters are provided to tune the algorithm further:

    • global window: the time window used for the rolling dataset (built from the metric query run over this window). At any point in time, the RRCF algorithm captures the signal behavior seen over this time window.

    • local window: the time window used to capture the signal behavior in the recent past.
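To make the Threshold parameter of the Basic algorithm more concrete, the following is a minimal sketch and not the Kloudfuse implementation. It assumes a simple rolling mean and standard deviation over the preceding points; the function names `band_coverage` and `rolling_anomalies`, the `window` argument, and the sample request counts are all hypothetical.

```python
from statistics import NormalDist, mean, stdev


def band_coverage(threshold):
    """Fraction of a normal distribution within +/- `threshold` standard deviations."""
    nd = NormalDist()
    return nd.cdf(threshold) - nd.cdf(-threshold)


def rolling_anomalies(values, window, threshold):
    """Return indices of values that lie more than `threshold` standard
    deviations from the mean of the preceding `window` values.

    Assumption: a plain rolling mean/stdev stands in for the platform's
    rolling-quantile logic, which is not described in detail here.
    """
    flagged = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Flat history: treat any change at all as a deviation.
            if values[i] != mu:
                flagged.append(i)
            continue
        if abs(values[i] - mu) > threshold * sigma:
            flagged.append(i)
    return flagged


# Hypothetical request counts per interval, with one spike.
requests = [100, 102, 98, 101, 99, 103, 100, 180, 101, 99]

print(round(band_coverage(1), 2))                           # 0.68
print(rolling_anomalies(requests, window=5, threshold=3))   # [7]
```

A lower threshold flags more points; a higher one flags only large deviations such as the spike at index 7.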

Outlier Detection

The Kloudfuse platform can perform outlier detection on the underlying data. Outlier monitoring is effective for spotting deviations in behavior compared with other similar entities in the cluster. For example, CPU usage per pod for a service with 3 replicas should be similar across all 3 pods. If one pod uses more or less CPU than the others, it is an outlier. Kloudfuse provides the following algorithms for outlier detection:

  • DBSCAN (density-based spatial clustering of applications with noise) is a popular clustering algorithm. The following parameters are used to tune it further:
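As a rough illustration of how density-based clustering can flag the outlying pod in the 3-replica CPU example, here is a minimal one-dimensional DBSCAN sketch. The names `eps` and `min_samples` are the conventional DBSCAN parameters and are assumptions here, not confirmed Kloudfuse settings; the pod names and CPU values are hypothetical.

```python
def dbscan_1d(values, eps, min_samples):
    """Label each value with a cluster id, or -1 for noise (an outlier).

    eps:         maximum distance for two values to count as neighbors.
    min_samples: minimum neighborhood size (including the point itself)
                 for a point to be a core point.
    """
    n = len(values)
    neighbors = [
        [j for j in range(n) if abs(values[i] - values[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n  # None = not yet visited
    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_samples:
            labels[i] = -1  # provisionally noise
            continue
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_samples:
                queue.extend(neighbors[j])  # expand only through core points
        cluster += 1
    return labels


# Hypothetical CPU usage (in cores) for a service with 3 replicas.
cpu = {"pod-a": 0.52, "pod-b": 0.50, "pod-c": 0.95}
labels = dbscan_1d(list(cpu.values()), eps=0.1, min_samples=2)
outliers = [pod for pod, label in zip(cpu, labels) if label == -1]
print(outliers)  # ['pod-c']
```

Points that do not belong to any dense group are labeled as noise, which is what makes this kind of clustering usable for comparing peers.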

Auto Alerting: Hawkeye

Kloudfuse Analytics provides an auto alerting feature for various entities out of the box. In most cases it requires only simple configuration; internally, auto alerting uses the advanced functions that the Kloudfuse platform provides to monitor your cluster.

Having the required data and the unification of streams is central to the Kloudfuse platform's ability to do auto alerting. The Hawkeye service is designed to intelligently monitor user-controllable entities in the infrastructure for abnormal behavior, in a way that depends on the entity.

Services

Knight automatically discovers peer-to-peer communication between services. The communication is tracked for various protocols. The discovered services and their connections to other services (entities) are shown in the service list UI. The service map is also built from this communication, with each connection as an edge. The service list and the service map each record the RED (rate, errors, duration) metrics for the service or the edge.
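To illustrate what recording RED metrics per edge can look like, here is a minimal sketch that aggregates request records into rate, error ratio, and average duration for each (source, target) pair. The record fields, the `red_by_edge` function, and the service names are hypothetical and do not reflect Knight's actual data model.

```python
from collections import defaultdict


def red_by_edge(requests, window_seconds):
    """Aggregate request records into RED metrics per (source, target) edge."""
    grouped = defaultdict(list)
    for record in requests:
        grouped[(record["source"], record["target"])].append(record)

    metrics = {}
    for edge, records in grouped.items():
        count = len(records)
        errors = sum(1 for r in records if r["error"])
        metrics[edge] = {
            "rate": count / window_seconds,        # requests per second
            "error_ratio": errors / count,         # fraction of failed requests
            "avg_duration_ms": sum(r["duration_ms"] for r in records) / count,
        }
    return metrics


# Hypothetical requests observed during a 60-second window.
sample = [
    {"source": "frontend", "target": "cart", "duration_ms": 20, "error": False},
    {"source": "frontend", "target": "cart", "duration_ms": 40, "error": True},
    {"source": "cart", "target": "db", "duration_ms": 5, "error": False},
]
red = red_by_edge(sample, window_seconds=60)
print(red[("frontend", "cart")])
```

Each key in the result corresponds to one edge of a service map; summing over all edges that share a target would give the per-service view.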

Using this data, Hawkeye looks for anomalies in real time, using statistical learning algorithms or service level objectives, as configured. If anomalous behavior is detected, an alert is raised, which is then evaluated by BullsEye.

Persistent Volumes

This feature is not enabled by default. Contact us for more information. Follow these steps to enable it.

BullsEye: Auto Analysis

The BullsEye service is designed to analyze signals from one stream (upon detection of abnormal behavior by Hawkeye, or on demand by the user) and correlate them with signals within the same stream or across streams. This analysis helps reduce the effort required to debug issues and narrows down problematic areas in minutes.

BullsEye is designed to narrow down to other anomalous areas of your infrastructure, starting from the source captured in an alert. Because the Kloudfuse platform is unified, i.e., all data is present in a single platform, BullsEye can cast a wider net and look into data derived from each of the streams in the system, which makes it more likely to identify problematic behavior within minutes.

Additionally, if instrumentation-less tracing is enabled, BullsEye can sift through this wider net much more efficiently, eliminating noise and presenting only the most relevant information, which can greatly reduce the time to resolution.
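The idea of correlating an alerting signal with other signals can be sketched as follows. This is only an illustration: it assumes Pearson correlation over equal-length, time-aligned series, which is not necessarily the method BullsEye uses, and the function names and signal names are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def rank_related(alert_signal, candidates):
    """Rank candidate signals by absolute correlation with the alerting signal."""
    scores = {name: pearson(alert_signal, series) for name, series in candidates.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical signal that triggered an alert, and candidates from other streams.
alert = [10, 11, 10, 30, 32, 31]
candidates = {
    "db_latency_ms": [5, 5, 6, 20, 22, 21],
    "cache_hit_ratio": [0.9, 0.91, 0.9, 0.89, 0.9, 0.91],
    "queue_depth": [3, 3, 3, 3, 3, 3],
}
ranked = rank_related(alert, candidates)
for name, score in ranked:
    print(f"{name}: {score:.2f}")
```

Signals near the top of the ranking move together with the alerting signal and are the first places to look; signals with a score near zero can be treated as noise for this alert.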
