...
To delete a contact point, click “Contact points” to open the page listing existing contact points. Find the contact point you want to delete and click its trash icon, then confirm the choice in the window that pops up.
Dealing with multiple contact points
When a single alert uses multiple contact points, the default one-to-one mapping provided by the Kloudfuse UI may not work, and additional configuration may be required depending on your requirements. Review these steps to understand how to structure the notification flags.
Example 1. A single alert triggers notifications to more than one contact point.
Make sure that the notification policies for the contact points within a group allow matching to continue with sibling policies, so that notification is also attempted for the other policies and their contact points. In this example, a Grafana email contact point and an OpsGenie contact point are enabled for the alert. The notification policy for the Grafana contact point ensures that the rule is further matched against sibling policies. See the following.
...
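For readers who manage Grafana through provisioning files rather than the UI, the idea in Example 1 can be sketched as a notification policy tree in which the first sibling route sets `continue: true`. This is only an illustration under assumptions: the file location, the contact point names `grafana-email` and `opsgenie`, and the `alertname` matcher are hypothetical, and whether file provisioning is available in a given Kloudfuse deployment is not stated in this guide.

```shell
# Hypothetical output location; Grafana's provisioning directory differs per deployment.
out_dir="${POLICY_DIR:-$(mktemp -d)}"
policy_file="$out_dir/notification-policies.yaml"

# Write a sample policy tree in Grafana's alerting provisioning format.
cat > "$policy_file" <<'EOF'
apiVersion: 1
policies:
  - orgId: 1
    receiver: grafana-default-email
    routes:
      # First sibling: notify the email contact point, then keep matching.
      - receiver: grafana-email        # hypothetical contact point name
        object_matchers:
          - ['alertname', '=', 'ExampleAlert']   # hypothetical matcher
        continue: true
      # Second sibling: evaluated because the route above sets continue.
      - receiver: opsgenie             # hypothetical contact point name
        object_matchers:
          - ['alertname', '=', 'ExampleAlert']
EOF

echo "wrote $policy_file"
```

Without `continue: true` on the first route, matching would stop there and the second contact point would not be notified.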
Contact point types
The following list of contact points is currently supported. This list will be enhanced as support for additional contact points is added.
Email
PagerDuty
Slack
Webhook
Microsoft Teams
OpsGenie
Google Hangouts
Manage alerts
The Alerts > Rules page lists the alerts installed in the system. You can search the list by state, labels, title, and so on. Click an alert in the list to investigate it further; its properties, current evaluation graph, and history are displayed.
...
Create an app password in Gmail. The account (grafana_alerts@domain.com) must have 2FA enabled before you can create an app password. See https://support.google.com/mail/answer/185833?hl=en. Note down the app password, as you will need it in step 3.
Make sure you are connected to the cluster where the Kloudfuse stack is installed and that you are in the kfuse namespace.
Code Block
    # connect to your cluster
    kubectx <cluster-name>
    kubens kfuse
Create a Kubernetes secret with the username and the app password you created in step 1.
Code Block
    kubectl create secret generic grafana-smtp-user-password \
      --from-literal=user=grafana_alerts@domain.com \
      --from-literal=password=<generated-app-password>
Edit values.yaml and uncomment the SMTP-related settings in the grafana section so that it looks like the snippet below. Update the following settings:
Update host to your SMTP mail server, in hostname:port form.
Update from_address to the SMTP user you want to use.
Update from_name if needed.
Code Block
    grafana:
      grafana:
        # grafana.ini - Grafana server configuration settings
        grafana.ini:
          ...
          # start -- Uncomment the following to enable smtp
          smtp:
            enabled: true
            host: your_smtp_hostname_colon_port
            skip_verify: true
            from_address: your_smtp_user@domain.com
            from_name: AlertsAdmin
        envValueFrom:
          GF_SMTP_USER:
            secretKeyRef:
              name: grafana-smtp-user-password
              key: user
          GF_SMTP_PASSWORD:
            secretKeyRef:
              name: grafana-smtp-user-password
              key: password
        # Uncomment the following to enable smtp -- end
Run the same helm command that you used to install the Kloudfuse cluster again.
Code Block
    helm upgrade --create-namespace --install kfuse . -f [gcp|aws].yaml -f custom_values.yaml --set global.orgId=<your-company-name>
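If email notifications do not arrive after the upgrade, one thing worth checking is whether the secret from step 3 holds the values you expect. Kubernetes stores Secret data base64-encoded, so the helpers below read a field and decode it. This is a sketch, not part of the official procedure: the function names `decode_field` and `show_secret_field` are made up for illustration, and it assumes `kubectl` already points at the kfuse namespace.

```shell
# Decode one base64-encoded Secret field read from stdin.
decode_field() {
  base64 --decode
}

# Print a field ("user" or "password") of the SMTP secret created in step 3.
# Assumes the current kubectl context and namespace are the Kloudfuse ones.
show_secret_field() {
  kubectl get secret grafana-smtp-user-password \
    --request-timeout=5s \
    -o "jsonpath={.data.$1}" | decode_field
}
```

Calling `show_secret_field user` should print the address passed to `--from-literal=user=` when the secret was created.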
Getting the integration key for PagerDuty
Getting integration key from an existing service in PagerDuty
...
...
Please make sure to update the default email address in grafana-default-email otherwise
Setting Notifications to PagerDuty
Setting up Kloudfuse (and Grafana) alert notifications to PagerDuty takes two steps:
Obtain the service integration key from PagerDuty.
If you already have an existing service (grafana-incoming-incidents in the following example), use the following steps to get its integration key.
Otherwise, create a new service (test-incoming-notifications) and use that service’s integration key.
Use the integration key obtained above in the Kloudfuse platform by choosing “PagerDuty” as the contact point type.
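Before entering the key in Kloudfuse, it can be useful to confirm that the key itself works by sending a test event straight to PagerDuty. The sketch below assumes the service integration is of the Events API v2 type, which this guide does not state; `PAGERDUTY_KEY` is a hypothetical environment variable, and the summary and source strings are placeholders. Note that a successful call opens a real incident on the service.

```shell
# Build an Events API v2 "trigger" payload.
# $1 = integration (routing) key, $2 = summary text.
# No JSON escaping is done, so keep the summary free of quotes and backslashes.
pagerduty_payload() {
  printf '{"routing_key":"%s","event_action":"trigger","payload":{"summary":"%s","source":"kloudfuse-test","severity":"info"}}' "$1" "$2"
}

# POST the payload to the Events API v2 endpoint.
send_pagerduty_test() {
  curl -sS -X POST 'https://events.pagerduty.com/v2/enqueue' \
    -H 'Content-Type: application/json' \
    --data "$(pagerduty_payload "$1" "$2")"
}

# Send only when PAGERDUTY_KEY (hypothetical variable name) is exported.
if [ -n "${PAGERDUTY_KEY:-}" ]; then
  send_pagerduty_test "$PAGERDUTY_KEY" "Test event before configuring Kloudfuse"
fi
```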
...
Getting the slack webhook URL
Use a Slack webhook to create a contact point for Slack.
Follow the link below to create a Slack webhook and get its URL: https://api.slack.com/messaging/webhooks
Once the webhook is created, enter its URL in the webhook field while creating the contact point.
Use the Optional Slack Settings to mention a specific user or group, or to send alerts to the entire channel.
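A quick way to confirm that the webhook URL is valid before using it in a contact point is to post a plain text message to it, as Slack's webhook guide describes. The following is a minimal sketch; `SLACK_WEBHOOK_URL` is a hypothetical environment variable holding the URL you created, and the message text is a placeholder.

```shell
# Build a minimal incoming-webhook payload. $1 = message text.
# No JSON escaping is done, so keep the text free of quotes and backslashes.
slack_payload() {
  printf '{"text":"%s"}' "$1"
}

# POST the message to the webhook. $1 = webhook URL, $2 = message text.
send_slack_test() {
  curl -sS -X POST -H 'Content-type: application/json' \
    --data "$(slack_payload "$2")" "$1"
}

# Send only when SLACK_WEBHOOK_URL (hypothetical variable name) is exported.
if [ -n "${SLACK_WEBHOOK_URL:-}" ]; then
  send_slack_test "$SLACK_WEBHOOK_URL" "Test message before configuring Kloudfuse"
fi
```

If the message shows up in the channel, the same URL can be pasted into the webhook field of the contact point.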
...
Setting webhook contact point integration
...
First, head to Microsoft Teams and create a channel for alert notifications (this example uses the channel name test-notifications), then create a connection of type “incoming webhook” as shown below. Copy the created “URL” (example: https://kloudfuse.webhook.office.com/webhookb2/257d29a4-xxx).
Go to the Kloudfuse Grafana instance and create a Microsoft Teams contact point. Use the URL from above as the contact point URL.
...
->Alerts->Contact Points->Add New Contact Point. Choose Microsoft Teams from the drop-down menu. Enter the URL from above in the “URL” field and save. (Test will send a test notification.)
...
Setting up OpsGenie contact point
Navigate to the Grafana tab in the Kloudfuse UI.
Create an OpsGenie-Grafana integration by following the steps at https://support.atlassian.com/opsgenie/docs/integrate-opsgenie-with-grafana/
After completing the steps, navigate to Alerts → Contact Points in the Kloudfuse UI.
Choose Create Contact Point and fill in the required details.
You can now assign the contact point to any alert in the Kloudfuse UI.
Setting up Google Chat contact point
Create a new Google Workspace space for alerting (https://support.google.com/a/users/answer/9300611?hl=en), or use an existing space.
Create an incoming webhook for the space: https://developers.google.com/chat/how-tos/webhooks#create_a_webhook
Navigate to the Alerts tab in the Kloudfuse UI. Select Contact Point and click New Contact Point.
Choose Create Contact Point and fill in the required details.
You can now assign the contact point to any alert in the Kloudfuse UI.
Kloudfuse Provided Out of the box control plane alerts
Kloudfuse provides a number of out-of-the-box alerts for getting stats on the data plane. The thresholds and other parameters of these alerts can be updated for each deployment. These alerts are part of the kfuse-cp folder in Alerts. The default thresholds for these alerts are as follows.
| Type | Check | Alert Condition |
|---|---|---|
| Kubernetes Pods | In Failed state | For 5 mins |
| | Restarting multiple times | For 5 mins |
| | CrashLoopBackOff | For 5 mins |
| Deployments | Lesser replicas than desired | For 15 mins |
| Statefulsets | Lesser replicas than desired | For 15 mins |
| Nodes | Unschedulable | For 10 mins |
| | Not Ready | For 5 mins |
| | High CPU Usage | > 90% for 5 mins |
| | Disk Usage | > 90% for 5 mins |
| Data Lake (pinot) | Segments in error condition | > 0 for 5 mins |
| | Segment creation threshold breached | > 10 mins |
| Persistent Volumes | Current Usage | > 90% |
| | Forecast Usage | Notify when it will run out of space |
| Agent/Collector | Not sending data | For 5 mins |
These alerts do not have any default contact point associated with them. The contact point for these alerts needs to be set according to the requirements of each deployment.