If you use Terraform as a PaaS tool and as part of your GitOps/CD pipeline, and choose to manage your dashboards and alerts with it (custom ones as well as those installed via the catalog service), follow the steps below. Using Terraform with Kloudfuse is the same as using it with Grafana for dashboards and alerts (requires the Terraform Grafana provider).
Note: For specific questions or scenarios not covered here, feel free to contact the Kloudfuse team.
Terraform provider definition
Create a terraform resource file kfuse.tf
and add a provider block using the following configuration (make sure to change url and auth as required):
Code Block |
---|
terraform {
  required_providers {
    grafana = {
      source  = "grafana/grafana"
      version = ">= 1.40.1"
    }
  }
}

provider "grafana" {
  url  = "https://<your-kloudfuse-access-url>/grafana"
  auth = "admin:password"
  # use the following format if using an API key
  # auth = "glsa_n1BPVVcnBiDvhu3PmgrvpRJF47vl8MfO_cbb375b2"
} |
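Hardcoding credentials in kfuse.tf is risky if the file is committed to version control. As a sketch (the variable name grafana_auth is illustrative, not part of the Kloudfuse setup), the auth value can instead be supplied through a sensitive Terraform variable:

Code Block |
---|
variable "grafana_auth" {
  type      = string
  sensitive = true
}

provider "grafana" {
  url  = "https://<your-kloudfuse-access-url>/grafana"
  auth = var.grafana_auth
} |

The value can then be passed at plan/apply time, for example via the TF_VAR_grafana_auth environment variable, rather than stored in the repository.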
Add Dashboard Resources to Terraform
Once the provider is initialized (terraform init), the dashboards and alerts have to be imported into this Terraform instance. Importing dashboards is straightforward: simply add the following resource to your kfuse.tf
file. The example below shows how to import all dashboards within a custom folder custom_db_folder
(as described in the documentation here)
...
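The elided example above might look roughly like the following sketch, which loads every JSON dashboard file from a local dashboards/ directory into the custom_db_folder folder (the directory layout and file names are assumptions for illustration, not Kloudfuse requirements):

Code Block |
---|
resource "grafana_folder" "custom_db_folder" {
  title = "custom_db_folder"
}

resource "grafana_dashboard" "custom_dashboards" {
  # One dashboard resource per JSON file in the dashboards/ directory.
  for_each    = fileset("${path.module}/dashboards", "*.json")
  folder      = grafana_folder.custom_db_folder.id
  config_json = file("${path.module}/dashboards/${each.key}")
} |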
Note: Applying the state may fail due to a conflict in the dashboard (this can happen if the dashboard already exists or there is a UID clash). In that case, force the import using the overwrite
flag.
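For example, a dashboard resource can be forced over an existing dashboard as follows (the resource and file names are illustrative, and this assumes the overwrite argument available in the provider version pinned above):

Code Block |
---|
resource "grafana_dashboard" "imported_dashboard" {
  config_json = file("${path.module}/dashboards/imported.json")
  overwrite   = true  # replace an existing dashboard with the same UID
} |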
Add Alerting Resources to Terraform
Configuring alerting-related resources in Terraform involves the following steps. Each related resource has to be configured explicitly; currently, there is no file-based load support (unlike dashboards).
Note: The following instructions will work if there are no conflicting resources. If you have existing/conflicting resources in the Grafana instance, then you must import those individually first, before updating the resource in the Terraform config. Please follow the import instructions for each of them as applicable.
Add Contact Points & Templates to Terraform
For each contact point that needs to be managed as a Terraform resource (and, optionally, its associated message template), execute the following steps.
If conflicting resources exist
Update kfuse.tf to include the name of the contact point that is conflicting (or needs importing). Using tf-test-slack-contact-point as an example contact point:
Code Block |
---|
resource "grafana_contact_point" "tf-test-slack-contact-point" {
# (resource arguments)
name = "tf-test-slack-contact-point"
} |
Run
terraform import grafana_contact_point.tf-test-slack-contact-point tf-test-slack-contact-point
Running the above command will update the terraform.tfstate file. Use the necessary arguments for the contact point from that file and update the kfuse.tf file. Update each required argument (the plugin provider documentation lists the required parameters per contact point type). For example, for tf-test-slack-contact-point, update the kfuse.tf file as follows:
Code Block |
---|
resource "grafana_contact_point" "tf-test-slack-contact-point" {
# (resource arguments)
name = "tf-test-slack-contact-point"
slack {
url = "https://hooks.slack.com/services/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"
}
} |
Run
terraform apply -auto-approve
Add Notification Policy to Terraform
Notification policies are maintained as a single object. Follow these steps to configure terraform to manage notification policies.
If conflicting resources exist
Update kfuse.tf to include a notification policy resource, for example current-notifciation-policy:
Code Block |
---|
resource "grafana_notification_policy" "current-notifciation-policy" {
} |
Run
terraform import grafana_notification_policy.current-notifciation-policy current-notifciation-policy
Running the above command will update the terraform.tfstate file. Use the necessary arguments for the notification policy from that file and update the kfuse.tf file. Update each required argument (the plugin provider documentation lists the required parameters for the notification policy). For example:
Code Block |
---|
resource "grafana_notification_policy" "current-notifciation-policy" {
group_by = ["alertname", "grafana_folder"]
contact_point = grafana_contact_point.grafana-default-email.name
policy {
matcher {
label = "send_to_slack"
match = "="
value = "true"
}
group_by = ["..."]
contact_point = grafana_contact_point.tf-test-slack-contact-point.name
}
} |
Run
terraform apply -auto-approve
Add Datasource config to Terraform
Kfuse datasources are already created; import them so they can be referenced when defining alert rules. Update kfuse.tf
as follows:
Code Block |
---|
resource "grafana_data_source" "KfuseDatasource" {
name = "KfuseDatasource"
type = "prometheus"
is_default = true
url = "http://query-service:8080"
} |
Add Alert Rules to Terraform
To manage alert rules with Terraform, follow these steps for each rule group. Make sure to use grafana_data_source.KfuseDatasource.uid
and the folder kfuse-tf-test-folder
in the alert rules. Here is a working example that creates a test rule group with one rule in it. When the alert fires, a notification is sent to tf-test-slack-contact-point,
because that contact point is associated with the notification policy that matches the alert labels:
Code Block |
---|
resource "grafana_rule_group" "tf-test-alert-rulegroup" {
name = "Test Alert Rules"
folder_uid = grafana_folder.kfuse-tf-test-folder.uid
interval_seconds = 60
org_id = 1
rule {
name = "Test Alert"
condition = "C"
for = "0s"
labels = {
send_to_slack = "true"
}
// Query the datasource.
data {
ref_id = "A"
relative_time_range {
from = 600
to = 0
}
datasource_uid = grafana_data_source.KfuseDatasource.uid
// `model` is a JSON blob that sends datasource-specific data.
// It's different for every datasource. The alert's query is defined here.
model = jsonencode({
"expr": "avg(container_cpu_usage{container_name=~\".+\"})",
"hide": false,
"interval": "60s",
"intervalMs": 15000,
"maxDataPoints": 43200,
"refId": "A"
}
)
}
// The query was configured to obtain data from the last 60 seconds. Let's alert on the average value of that series using a Reduce stage.
data {
datasource_uid = "__expr__"
// You can also create a rule in the UI, then GET that rule to obtain the JSON.
// This can be helpful when using more complex reduce expressions.
model = <<EOT
{"conditions":[{"evaluator":{"params":[0,0],"type":"gt"},"operator":{"type":"and"},"query":{"params":["A"]},"reducer":{"params":[],"type":"last"},"type":"avg"}],"datasource":{"name":"Expression","type":"__expr__","uid":"__expr__"},"expression":"A","hide":false,"intervalMs":1000,"maxDataPoints":43200,"reducer":"last","refId":"B","type":"reduce"}
EOT
ref_id = "B"
relative_time_range {
from = 0
to = 0
}
}
// Now, let's use a math expression as our threshold.
// We want to alert when the value of stage "B" above exceeds 70.
data {
datasource_uid = "__expr__"
ref_id = "C"
relative_time_range {
from = 0
to = 0
}
model = jsonencode({
expression = "$B > 70"
type = "math"
refId = "C"
})
}
}
} |
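Note that the rule group above references grafana_folder.kfuse-tf-test-folder, which must also exist in your configuration. A minimal definition (the title shown is an assumption) would be:

Code Block |
---|
resource "grafana_folder" "kfuse-tf-test-folder" {
  title = "kfuse-tf-test-folder"
} |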