...

To delete a contact point, click “Contact points” to open the page listing existing contact points. Find the contact point to delete, then click its trash icon. Confirm the choice when the confirmation window pops up.

Dealing with multiple contact points

When multiple contact points are used within a single alert, the default one-to-one mapping provided by the Kloudfuse UI may not work, and additional configuration may be required depending on your specific requirements. Please review the steps here to understand how to structure the notification flags.

Example 1. A notification must be sent to more than one contact point for a single alert.

Make sure that the notification policies within a group allow matching of other sibling policies, so that notification is also attempted for the other policies and their related contact points. In this example, a Grafana email contact point and an OpsGenie contact point are enabled for the alert. The notification policy for the Grafana contact point ensures that the rule is further matched against sibling policies. See the following.
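As a rough model of why the first policy has to allow further matching, the sketch below walks a list of sibling policies in order and stops at the first one that does not continue. It is a toy illustration, not Kloudfuse or Grafana code: the function name route_alert, the "contact_point:continue" argument format, and the contact point names are assumptions made for this example, and every policy is assumed to match the alert.

```shell
#!/bin/sh
# Toy model of sibling notification policies (hypothetical, for illustration).
# Each argument is "contact_point:continue"; every policy is assumed to match
# the alert. Evaluation stops after the first policy whose continue flag is
# not "true".
route_alert() {
  for policy in "$@"; do
    contact_point=${policy%%:*}   # text before the first ':'
    continue_flag=${policy##*:}   # text after the last ':'
    echo "notify ${contact_point}"
    if [ "$continue_flag" != "true" ]; then
      break
    fi
  done
}

# The email policy continues to its siblings, so both contact points are notified.
route_alert "grafana-email:true" "opsgenie:false"

# The email policy does not continue, so the OpsGenie policy is never reached.
route_alert "grafana-email:false" "opsgenie:false"
```

In the second call only the email contact point is notified, which is the situation the example above is meant to avoid.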

...

Contact point types

The following contact points are currently supported. This list will grow as support for additional contact points is added.

Manage alerts

The Alerts > Rules page lists the alerts installed in the system. On this page, alerts can be searched by state, labels, title, and so on. Click an alert in the list to investigate it further; its properties, current evaluation graph, and history are displayed.

...

  1. Create an app password in Gmail. The account (grafana_alerts@domain.com) must have 2FA enabled before you can create an app password. See https://support.google.com/mail/answer/185833?hl=en . Note down the app password, as you will need it in step 3.

  2. Make sure you are connected to the cluster where the Kloudfuse stack is installed and that you are in the kfuse namespace.

    Code Block
    # connect to your cluster
    kubectx <cluster-name>
    kubens kfuse
  3. Create a Kubernetes secret with the username and the app password you created in step 1.

    Code Block
    kubectl create secret generic grafana-smtp-user-password --from-literal=user=grafana_alerts@domain.com --from-literal=password=<generated-app-password>
  4. Edit values.yaml to uncomment the SMTP-related settings in the grafana section, so that it looks like the snippet below. Update the following settings:

    1. Update host to your SMTP mail server (hostname and port).

    2. Update from_address to the SMTP user you want to use.

    3. Update from_name if needed.

      Code Block
      grafana:
        grafana:
          # grafana.ini - Grafana server configuration settings
          grafana.ini:
            ...
          # start -- Uncomment the following to enable smtp
            smtp:
              enabled: true
              host: your_smtp_hostname_colon_port
              skip_verify: true
              from_address: your_smtp_user@domain.com
              from_name: AlertsAdmin
          envValueFrom:
            GF_SMTP_USER:
              secretKeyRef:
                name: grafana-smtp-user-password
                key: user
            GF_SMTP_PASSWORD:
              secretKeyRef:
                name: grafana-smtp-user-password
                key: password
          # Uncomment the following to enable smtp -- end
  5. Run again the same helm install command that you used to install the kfuse cluster.

    Code Block
    helm upgrade --create-namespace --install kfuse . -f [gcp|aws].yaml -f custom_values.yaml --set global.orgId=<your-company-name>
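The steps above can be checked with a couple of small helpers. This is a minimal sketch, not part of the Kloudfuse tooling: the function names is_host_port and show_smtp_user and the host smtp.example.com are hypothetical, and the kfuse namespace is taken from step 2. The first helper only checks that a value has the hostname:port shape used for the host setting; the second reads back the user stored in the secret from step 3 and needs cluster access, so it is defined but not called here.

```shell
#!/bin/sh
# Illustrative checks for the SMTP setup (hypothetical helper names).

# Succeeds when the value looks like "hostname:port" with a numeric port.
is_host_port() {
  case "$1" in
    *:*) ;;
    *) return 1 ;;
  esac
  host=${1%:*}    # everything before the last ':'
  port=${1##*:}   # everything after the last ':'
  [ -n "$host" ] || return 1
  case "$port" in
    ''|*[!0-9]*) return 1 ;;
  esac
  return 0
}

# Prints the user stored in the secret created in step 3.
# Requires kubectl access to the cluster, so it only runs when called.
show_smtp_user() {
  kubectl get secret grafana-smtp-user-password -n kfuse \
    -o jsonpath='{.data.user}' | base64 --decode
}

is_host_port "smtp.example.com:587" && echo "host format ok"
is_host_port "smtp.example.com" || echo "missing port"
```

After the helm upgrade completes, calling show_smtp_user from a shell connected to the cluster should print the address you passed as --from-literal=user.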

Please make sure to update the default email address in grafana-default-email otherwise

Setting Notifications to PagerDuty

...

...

  • Create a New Nested Policy for OpsGenie contact point

Please make sure that the label name is the same as the contact point name, and choose the contact point from the dropdown below.

  • After completing the steps, navigate to Alerts → Contact Points in the Kloudfuse UI.

  • Choose Create Contact Point and fill in the required details.

  • You can now use this contact point from the Kloudfuse UI in any alert.

Setting up Google Chat contact point

You can now use this contact point from the Kloudfuse UI in any alert.

...


Kloudfuse-provided out-of-the-box control plane alerts

Kloudfuse provides a number of out-of-the-box alerts for getting the stats for the data plane. The thresholds and other parameters of these alerts can be updated for each deployment. These alerts are part of the kfuse-cp folder in alerts. The default thresholds for these alerts are as follows.

Type                | Check                               | Alert Condition
--------------------+-------------------------------------+-------------------------------------
Kubernetes Pods     | In Failed state                     | For 5 mins
Kubernetes Pods     | Restarting multiple times           | For 5 mins
Kubernetes Pods     | CrashLoopBackOff                    | For 5 mins
Deployments         | Lesser replicas than desired        | For 15 mins
Statefulsets        | Lesser replicas than desired        | For 15 mins
Nodes               | Unschedulable                       | For 10 mins
Nodes               | Not Ready                           | For 5 mins
Nodes               | High CPU Usage                      | > 90% for 5 mins
Nodes               | Disk Usage                          | > 90% for 5 mins
Data Lake (pinot)   | Segments in error condition         | > 0 for 5 mins
Data Lake (pinot)   | Segment creation threshold breached | > 10 mins
Persistent Volumes  | Current Usage                       | > 90%
Persistent Volumes  | Forecast Usage                      | Notify when it will run out of space
Agent/Collector     | Not sending data                    | For 5 mins

These alerts do not have a default contact point associated with them. The contact point for these alerts needs to be set according to the requirements of each deployment.