Algorithms

Anomalies

Overlay a band on the metric, showing the expected behavior of a series based on past values.

Outliers

Highlight outlier series.

Forecast

Forecast future values based on past values.

Anomalies

Overlay a band on the metric, showing the expected behavior of a series based on past values.

Kloudfuse provides the following implementations of anomaly detection:

basic

Implements the rolling quantile algorithm.

The Basic Anomaly Detection algorithm calculates a predicted range from the 25th and 75th percentiles (the first and third quartiles) and the interquartile range (IQR) within a rolling window. This range represents the expected "normal" behavior; data points that fall outside it are flagged as anomalies.

Parameters

  • Window: Defines the size of the rolling window used for quantile computation. A larger window smooths the data but may reduce sensitivity to sudden changes.

  • Band: Sets the sensitivity of anomaly detection. A narrower band makes the algorithm more sensitive to deviations, while a wider band captures more data as "normal."
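
The rolling quantile idea can be illustrated with a minimal sketch. It is not Kloudfuse's implementation: the function names, the choice to widen the quartile range by `band` times the IQR, and the sample data are all assumptions made for illustration.

```python
from statistics import quantiles

def rolling_iqr_band(values, window=5, band=1.5):
    """Return one (lower, upper) pair per point, built from the previous `window` values.

    Assumption: the band is [Q1 - band * IQR, Q3 + band * IQR].
    """
    bounds = []
    for i in range(len(values)):
        if i < window:
            bounds.append((None, None))  # not enough history yet
            continue
        history = values[i - window:i]
        q1, _, q3 = quantiles(history, n=4, method="inclusive")
        iqr = q3 - q1
        bounds.append((q1 - band * iqr, q3 + band * iqr))
    return bounds

def find_anomalies(values, window=5, band=1.5):
    """Return the indices of points that fall outside their rolling band."""
    bounds = rolling_iqr_band(values, window, band)
    return [
        i
        for i, (value, (low, high)) in enumerate(zip(values, bounds))
        if low is not None and not (low <= value <= high)
    ]

# Hypothetical error counts with one spike at index 6.
errors = [10, 11, 10, 12, 11, 10, 50, 11, 10, 12]
print(find_anomalies(errors, window=5, band=1.5))  # [6]
```

In this sketch, lowering `band` narrows the range and flags more points, while raising `window` makes the quartiles depend on a longer history.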

Example

In the example below, the time series graph displays the unique count of errors over a period. The gray band represents the expected range based on recent data, while red markers indicate anomalies, that is, data points outside the predicted range. Here, a sudden increase in errors during peak hours is flagged as an anomaly, allowing for quick detection and investigation.

Use Case

Basic Anomaly Detection is ideal for monitoring metrics with frequent, non-seasonal fluctuations, where rapid response to changes is essential. Use it to detect unexpected spikes or drops without needing to account for cyclic patterns or trends.

agile

The Agile Anomaly Detection algorithm uses the SARIMA (Seasonal AutoRegressive Integrated Moving Average) model to identify anomalies in time series data. Agile detection adapts quickly to changes in the data.

Key Arguments

  • Seasonality (Hourly, Daily):

    • Hourly: This setting is used for log metrics that display hourly cyclic behavior. For instance, if your log data typically fluctuates each hour based on user activity or background processes, setting the seasonality to Hourly enables the SARIMA model to capture these hourly patterns accurately.

    • Daily: This setting captures daily seasonality, suitable for log metrics with a daily recurring pattern. For example, if log entries spike every evening due to daily system maintenance tasks, setting the seasonality to Daily allows the model to recognize these daily trends.

  • Bands (1, 2, 3):

    • Band 1 (Narrow): Offers high sensitivity by setting a tighter range around predicted values, detecting even minor deviations. This band is useful when you need to capture subtle changes in log volume that might indicate early signs of issues.

    • Band 2 (Moderate): Provides a moderate range, making the algorithm less sensitive to minor fluctuations and ideal for monitoring with fewer false positives.

    • Band 3 (Wide): Defines the widest range, capturing only significant deviations. This setting is suitable for metrics where only large, impactful anomalies are of interest, reducing alert noise for minor variations.

Figure: Agile with Band = 1

robust

The Robust Anomaly Detection algorithm uses a seasonal decomposition technique to identify anomalies in time series data. Seasonal decomposition separates the data into its seasonal, trend, and residual components, allowing for more accurate anomaly detection in metrics with strong seasonal patterns.

Key Arguments

  1. Rolling Window Size:

    • The rolling window size is used to calculate the standard deviation (std) for anomaly detection and to set the band limits around the expected values.

    • A larger window size provides a smoother, more stable standard deviation calculation but may be less responsive to sudden, short-term spikes or drops.

    • A smaller window size is more responsive to recent data points, allowing for a quicker reaction to changes but may lead to more noise.

  2. Bands (1, 2, 3):

    • Band 1 (Narrow)

    • Band 2 (Moderate)

    • Band 3 (Wide)
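
The interaction between seasonal decomposition, the rolling window, and the band can be sketched as follows. This is a simplified illustration, not the Kloudfuse algorithm: it assumes a flat trend (a constant level), uses per-phase medians as the seasonal component, and treats `band` as a multiplier on the rolling standard deviation of the residuals. All names are hypothetical.

```python
from statistics import median, pstdev

def decompose(values, period):
    """Split a series into expected values (level + seasonal) and residuals.

    Assumption: the trend is flat, so the level is the overall median.
    """
    level = median(values)
    seasonal = [median(values[p::period]) - level for p in range(period)]
    expected = [level + seasonal[i % period] for i in range(len(values))]
    residuals = [v - e for v, e in zip(values, expected)]
    return expected, residuals

def robust_anomalies(values, period, window, band):
    """Flag points whose residual exceeds `band` rolling standard deviations."""
    _, residuals = decompose(values, period)
    flagged = []
    for i in range(window, len(values)):
        spread = pstdev(residuals[i - window:i])  # std of the preceding window
        if abs(residuals[i]) > band * spread:
            flagged.append(i)
    return flagged

# Hypothetical metric: a 4-point cycle with small noise and one spike at index 17.
pattern = [10, 20, 30, 20]
series = [pattern[i % 4] + (1 if (i // 4) % 2 == 0 else -1) for i in range(24)]
series[17] = 60
print(robust_anomalies(series, period=4, window=8, band=3))  # [17]
```

Because the regular cycle is removed before the spread is measured, the repeating peaks at every third position are not flagged; only the point that departs from its usual position in the cycle is.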

agile-robust

The Agile Robust Anomaly Detection algorithm applies the Prophet model to detect anomalies in log metrics with recurring patterns and occasional level shifts. This approach is especially useful for identifying irregularities in logs that exhibit seasonal behaviors, such as error spikes, request rates, or event frequencies, which follow daily or hourly patterns.

Key Arguments

  1. Seasonality (Hourly, Daily):

    • Hourly: This setting is used for log metrics that exhibit hourly patterns within a 24-hour cycle. For example, if your error logs tend to spike each hour due to automated checks or periodic background processes, selecting Hourly allows the algorithm to model these regular occurrences and detect deviations that fall outside the norm.

    • Daily: This setting captures daily seasonality, making it useful for log metrics that show daily patterns. For instance, a daily surge in user login errors each morning when users start their workday would be expected. With Daily seasonality, the algorithm anticipates these recurring daily trends, flagging only unusual changes outside the expected pattern.

  2. Bands (1, 2, 3):

    • Band 1 (Narrow)

    • Band 2 (Moderate)

    • Band 3 (Wide)

Outliers

Highlight outlier series.

Kloudfuse provides a DBSCAN implementation of outlier detection.

Configure the Tolerance Level

In DBSCAN, the tolerance level (referred to as eps) determines the radius of the neighborhood around each point for clustering purposes. eps controls the sensitivity of outlier detection. A lower tolerance will detect more subtle outliers, while a higher tolerance will detect only the most significant deviations.
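
The effect of eps can be illustrated with a small sketch that reports only DBSCAN's noise points (series that are neither core points nor within eps of a core point). The source does not say how the distance between two series is computed, so Euclidean distance over aligned, cube-root-transformed points and `min_samples=3` are assumptions here, and the host names and values are made up.

```python
import math

def cube_root(x):
    """Real cube root that also handles negative values."""
    return math.copysign(abs(x) ** (1.0 / 3.0), x)

def distance(a, b):
    """Euclidean distance between two equal-length series (an assumption)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dbscan_outliers(series, eps, min_samples=3):
    """Return the names of series that DBSCAN would label as noise."""
    names = list(series)
    points = {n: [cube_root(v) for v in series[n]] for n in names}
    # Each series counts as its own neighbour, as in standard DBSCAN.
    neighbours = {
        n: [m for m in names if distance(points[n], points[m]) <= eps]
        for n in names
    }
    core = {n for n in names if len(neighbours[n]) >= min_samples}
    return [
        n for n in names
        if n not in core and not any(m in core for m in neighbours[n])
    ]

# Hypothetical log metric grouped by host; host-d runs far above its peers.
by_host = {
    "host-a": [100, 110, 105, 95],
    "host-b": [102, 108, 100, 98],
    "host-c": [98, 112, 103, 97],
    "host-d": [300, 320, 310, 290],
}
print(dbscan_outliers(by_host, eps=0.8))  # ['host-d']
print(dbscan_outliers(by_host, eps=5))    # []
```

With the low tolerance, host-d has no neighbours within reach and is reported as an outlier; with the high tolerance, its neighbourhood grows to include every other series and nothing is flagged.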

Visualization

The chart displays the results of DBSCAN outlier detection applied to the selected log metric over time. In the visualization:

  • Solid Lines represent data series flagged as outliers. These indicate instances where the data behavior deviates significantly from the norm based on the defined tolerance.

  • Dotted Lines represent data series identified as non-outliers, meaning they exhibit expected behavior relative to their peers.

In the following examples, a cube root transformation is applied to the data before DBSCAN processing. The choice of eps significantly affects the number of detected outliers:

  • Tolerance = 0.8
    In fig-1, eps is set to 0.8, making the detection process highly sensitive to deviations. As a result:

    • Solid lines in the chart represent data points marked as outliers, where even small deviations from the normal pattern are detected.

    • Dotted lines indicate non-outliers, showing stable or expected behavior.

  • Tolerance = 5
    In fig-2, eps is increased to 5. With this higher tolerance:

    • Only significant deviations are identified as outliers, with most series marked as dotted lines (non-outliers).

    • Solid lines (outliers) appear only for major deviations; smaller deviations are filtered out.

    This setting is appropriate when you only want to capture large deviations and are not concerned with smaller fluctuations in the data.

Forecast

Forecasting allows users to predict future values in a time series based on historical data, enabling proactive monitoring and resource planning. By forecasting trends and patterns, users can anticipate potential issues, optimize resource allocation, and make data-driven decisions. Our platform supports two forecasting algorithms tailored to different data characteristics and forecasting needs:

  • Linear Forecast (Linear Regression):
    A straightforward method for forecasting based on a linear trend in the data. This approach is well-suited for time series that exhibit a consistent trend without significant seasonal variations. Linear forecasting can help identify steady growth or decline over time, making it ideal for simple trend prediction.

  • Seasonal Forecast (Prophet):
    Prophet is a sophisticated forecasting model designed to handle time series data with seasonal patterns and holiday effects. This algorithm is especially effective for data that shows recurring patterns (e.g., hourly, daily, weekly) and is capable of capturing seasonality and trends. Seasonal forecasting is suitable for applications with clear cyclical behaviors.
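
The linear forecast in the list above amounts to fitting a straight line to past values and extending it. A minimal sketch using ordinary least squares follows; it assumes evenly spaced data points, and the function name and sample values are illustrative only. The seasonal (Prophet) variant is not shown because it depends on an external library.

```python
def linear_forecast(values, steps):
    """Fit y = intercept + slope * x by least squares and extrapolate `steps` points.

    Assumption: observations are evenly spaced, so x is simply the index.
    """
    n = len(values)
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    covariance = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(values))
    variance = sum((x - x_mean) ** 2 for x in range(n))
    slope = covariance / variance
    intercept = y_mean - slope * x_mean
    return [intercept + slope * (n + k) for k in range(steps)]

# Hypothetical metric growing by 2 units per interval.
history = [2, 4, 6, 8, 10]
print(linear_forecast(history, steps=3))  # [12.0, 14.0, 16.0]
```

A straight line cannot reproduce recurring peaks, which is why data with clear hourly or daily cycles is better served by the seasonal option.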

Arguments

The Seasonal Forecast function offers two options for seasonality, designed to capture the natural periodicity in log data:

  • Hourly: Captures seasonality with an hourly recurrence. This option is ideal for log metrics that show patterns within a 24-hour cycle. For instance, if your logs reveal traffic spikes at the start of each hour due to scheduled tasks, or if error logs increase during peak hours (e.g., lunchtime or late evening), the hourly setting can help forecast these recurring events and detect deviations from the expected hourly pattern.

  • Daily: Captures seasonality with a daily recurrence. This option is suitable for logs that follow a daily pattern, such as application logs that surge every morning when users start their workday or error logs that peak every evening due to heavy batch processing or data backups. By selecting the daily setting, you can anticipate daily log trends based on these expected patterns.
