Algorithms

Anomalies

Overlay a band on the metric, showing the expected behavior of a series based on past values.

Outliers

Highlight outlier series.

Forecast

Forecast future values based on past values.

Anomalies

Overlay a band on the metric, showing the expected behavior of a series based on past values.

Kloudfuse provides the following implementations of anomaly detection:

basic

Implements the Rolling quantile algorithm.

The Basic Anomaly Detection algorithm calculates a predicted range using the 25th and 75th percentiles and the interquartile range (IQR) within a rolling window. This range represents the expected "normal" behavior, and values that fall outside it are flagged as anomalies.

Parameters

  • Window: Defines the size of the rolling window used for quantile computation. A larger window smooths the data but may reduce sensitivity to sudden changes.

  • Band: Sets the sensitivity of anomaly detection. A narrower band makes the algorithm more sensitive to deviations, while a wider band captures more data as "normal."

Rolling window size:

Supported rolling windows are 1m, 2m, 3m, 5m, 10m, 15m, 30m, 1h, and 2h.

Band parameter:

Has the possible values of 1, 2, or 3.
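
The rolling quantile approach described above can be sketched in a few lines of Python. This is a minimal illustration, not the Kloudfuse implementation: the function name rolling_iqr_band is hypothetical, the window is counted in data points rather than as a duration such as 5m, and the way the Band value scales the IQR (lower = Q1 - band * IQR, upper = Q3 + band * IQR) is an assumption.

```python
from statistics import quantiles


def rolling_iqr_band(values, window, band):
    """Return one (lower, upper, is_anomaly) tuple per point.

    Assumption: the expected range is the interquartile range of the
    previous `window` points, widened by band * IQR on each side.
    """
    results = []
    for i, value in enumerate(values):
        history = values[max(0, i - window):i]  # past points only
        if len(history) < window:
            # Not enough history yet, so no band and no anomaly flag.
            results.append((None, None, False))
            continue
        q1, _, q3 = quantiles(history, n=4, method="inclusive")
        iqr = q3 - q1
        lower = q1 - band * iqr
        upper = q3 + band * iqr
        results.append((lower, upper, value < lower or value > upper))
    return results


# Hypothetical error counts with one spike at index 7.
series = [10, 11, 10, 12, 11, 10, 11, 40, 11, 11]
flags = [is_anomaly for _, _, is_anomaly in rolling_iqr_band(series, window=5, band=2)]
```

In this sketch, a smaller band value narrows the range and flags more points, while a larger window makes the band react more slowly to recent changes, which matches the trade-offs listed under Parameters.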

Example

In the example, the time series graph displays the unique count of errors over a period. The gray band represents the expected range based on recent data, while red markers indicate anomalies, that is, data points outside the predicted range. Here, a sudden increase in errors during peak hours is flagged as an anomaly, allowing for quick detection and investigation.

Use Case

Basic Anomaly Detection is ideal for monitoring metrics with frequent, non-seasonal fluctuations, where rapid response to changes is essential. Use it to detect unexpected spikes or drops without needing to account for cyclic patterns or trends.

agile

Implements the SARIMA algorithm.

Numeric parameter

Has the possible values of 1, 2, or 3.

robust

Implements the Seasonal decompose algorithm.

sampling interval

Sampling intervals are 1m, 2m, 3m, 5m, 10m, 15m, 30m, 1h, and 2h.

Numeric parameter

Has the possible values of 1, 2, or 3.

agile-robust

Implements the Prophet algorithm.

sampling interval

Sampling intervals are hourly, daily, or weekly.

Numeric parameter

Has the possible values of 1, 2, or 3.

Outliers

Highlight outlier series.

Kloudfuse provides the DBSCAN implementation of outlier detection.

Configure the Tolerance Level

In DBSCAN, the tolerance level (referred to as eps) determines the radius of the neighborhood around each point for clustering purposes. In this example, eps is set to 0.8, which controls the sensitivity of outlier detection. A lower tolerance detects more subtle outliers, while a higher tolerance detects only the most significant deviations.

Visualization

The chart displays the results of DBSCAN outlier detection applied to the selected metric over time. In the visualization:

  • Solid Lines represent data series flagged as outliers. These indicate instances where the data behavior deviates significantly from the norm based on the defined tolerance.

  • Dotted Lines represent data series identified as non-outliers, meaning they exhibit expected behavior relative to their peers.

In the following examples, a cube root transformation is applied to the data before DBSCAN processing. The choice of eps significantly affects the number of detected outliers:

  • Tolerance = 0.8
    In the first example, eps is set to 0.8, making the detection process highly sensitive to deviations. As a result:

    • Solid lines in the chart represent series marked as outliers, where even small deviations from the normal pattern are detected.

    • Dotted lines indicate non-outliers, showing stable or expected behavior.

  • Tolerance = 5
    In the second example, eps is increased to 5. With this higher tolerance:

    • Only significant deviations are identified as outliers, with most series marked as dotted lines (non-outliers).

    • Solid lines (outliers) appear only for major deviations, and minor deviations are filtered out.

    This setting is appropriate when you only want to capture large deviations and are not concerned with smaller fluctuations in the data.
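
The effect of eps can be illustrated with a small pure-Python sketch. It makes several assumptions that the text above does not confirm: each series is reduced to a single value (its mean) before the cube root transformation, min_samples is 3, and the function name find_outlier_series and the host names are hypothetical. It implements only the DBSCAN noise test (a point is noise if it is neither a core point nor within eps of one) and does not assign cluster IDs.

```python
def find_outlier_series(series_means, eps, min_samples=3):
    """Return the names of series that DBSCAN would label as noise.

    Assumption: each series is summarized by its mean, and the cube root
    transformation is applied to that mean (values must be non-negative).
    """
    names = list(series_means)
    points = [series_means[name] ** (1 / 3) for name in names]  # cube root
    n = len(points)

    # eps-neighborhood of every point; a point is its own neighbor.
    neighbors = [
        [j for j in range(n) if abs(points[i] - points[j]) <= eps]
        for i in range(n)
    ]

    # Core points have at least min_samples points in their neighborhood.
    # Every point inside a core point's neighborhood belongs to a cluster.
    clustered = set()
    for i in range(n):
        if len(neighbors[i]) >= min_samples:
            clustered.update(neighbors[i])

    return [names[i] for i in range(n) if i not in clustered]


# Hypothetical per-series averages of the selected metric.
means = {
    "host-a": 100, "host-b": 110, "host-c": 105,
    "host-d": 98, "host-e": 180, "host-f": 2000,
}
sensitive = find_outlier_series(means, eps=0.8)  # flags host-e and host-f
tolerant = find_outlier_series(means, eps=5)     # flags only host-f
```

With the lower tolerance, the moderately elevated host-e falls outside every dense neighborhood and is flagged along with host-f. With eps set to 5, host-e joins the main cluster and only the extreme series remains an outlier.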

Forecast

Forecasting allows users to predict future values in a time series based on historical data, enabling proactive monitoring and resource planning. By forecasting trends and patterns, users can anticipate potential issues, optimize resource allocation, and make data-driven decisions. Our platform supports two forecasting algorithms tailored to different data characteristics and forecasting needs:

  • Linear Forecast (Linear Regression):
    A straightforward method for forecasting based on a linear trend in the data. This approach is well-suited for time series that exhibit a consistent trend without significant seasonal variations. Linear forecasting can help identify steady growth or decline over time, making it ideal for simple trend prediction.

  • Seasonal Forecast (Prophet):
    Prophet is a sophisticated forecasting model designed to handle time series data with seasonal patterns and holiday effects. This algorithm is especially effective for data that shows recurring patterns (e.g., hourly, daily, weekly) and is capable of capturing seasonality and trends. Seasonal forecasting is suitable for applications with clear cyclical behaviors.
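
As a rough illustration of the Linear Forecast option, the sketch below fits a straight line to past values with ordinary least squares and extends it into the future. The function name linear_forecast and the sample values are hypothetical, and evenly spaced samples are assumed; the platform's actual implementation may differ.

```python
def linear_forecast(values, steps):
    """Fit y = slope * t + intercept by least squares, then extend the line.

    Assumption: samples are evenly spaced, so t is just the sample index.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two points to fit a line")
    t_mean = (n - 1) / 2
    y_mean = sum(values) / n
    covariance = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(values))
    variance = sum((t - t_mean) ** 2 for t in range(n))
    slope = covariance / variance
    intercept = y_mean - slope * t_mean
    # Predict the next `steps` points after the end of the history.
    return [slope * t + intercept for t in range(n, n + steps)]


# Hypothetical metric that grows by 4 units per interval.
history = [100, 104, 108, 112, 116]
forecast = linear_forecast(history, steps=3)
```

Because the fit is a single straight line, this kind of forecast cannot reproduce recurring peaks and troughs, which is why the seasonal option is the better fit for cyclical data.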

Arguments

The Seasonal Forecast function offers two options for seasonality, designed to capture the natural periodicity in log data:

  • Hourly: Captures seasonality with an hourly recurrence. This option is ideal for log metrics that show patterns within a 24-hour cycle. For instance, if your logs reveal traffic spikes at the start of each hour due to scheduled tasks, or if error logs increase during peak hours (e.g., lunchtime or late evening), the hourly setting can help forecast these recurring events and detect deviations from the expected hourly pattern.

  • Daily: Captures seasonality with a daily recurrence. This option is suitable for logs that follow a daily pattern, such as application logs that surge every morning when users start their workday or error logs that peak every evening due to heavy batch processing or data backups. By selecting the daily setting, you can anticipate daily log trends and plan based on these expected patterns.
