If you choose to manage your dashboards and alerts (custom ones as well as those installed using the catalog service) with Terraform, follow these steps. This requires the Terraform Grafana provider.
Note: For specific questions or scenarios not covered here, feel free to contact the Kloudfuse team.
Terraform Provider Definition
Create a Terraform resource file kfuse.tf and add a provider block using the following configuration (make sure to change url and auth as required):
terraform {
  required_providers {
    grafana = {
      source  = "grafana/grafana"
      version = ">= 1.40.1"
    }
  }
}

provider "grafana" {
  url  = "https://<your-kloudfuse-access-url>/grafana"
  auth = "admin:password"
  # use the following format if using an API key
  # auth = "glsa_n1BPVVcnBiDvhu3PmgrvpRJF47vl8MfO_cbb375b2"
}
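With the provider defined, initialize the working directory so Terraform downloads the Grafana provider plugin:

terraform init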
Add Dashboard Resources to Terraform
Once the provider is initialized, the dashboards and alerts have to be brought under this Terraform instance's management. Importing dashboards is straightforward: add the following resource to your kfuse.tf file. The example below shows how to import all dashboards within a custom folder custom_db_folder (as described in the documentation here):
# Create folder custom_db_folder
resource "grafana_folder" "custom_db_folder" {
  provider = grafana
  title    = "my custom folder"
}

# Add all dashboards
resource "grafana_dashboard" "custom-dbs" {
  provider    = grafana
  for_each    = fileset("${path.module}/<path-to-custom-db-folder>", "*.json")
  config_json = file("${path.module}/<path-to-custom-db-folder>/${each.key}")
  folder      = grafana_folder.custom_db_folder.id
  # uncomment to force apply.
  # overwrite = true
}
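After adding the folder and dashboard resources, review and apply the changes with the standard Terraform workflow:

terraform plan
terraform apply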
Note: Applying the state can fail due to a conflict in the dashboard (this can happen if the dashboard already exists or there is a UID clash). In that case, force the apply using the overwrite flag.
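Alternatively, a conflicting dashboard can be imported into the Terraform state instead of being overwritten. The Grafana provider supports importing a dashboard by its UID; for a for_each resource, the file key is part of the resource address. A sketch, with a placeholder file name and UID:

terraform import 'grafana_dashboard.custom-dbs["<dashboard-file>.json"]' <dashboard-uid>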
Add Alerting Resources to Terraform
Configuring alerting-related resources in Terraform involves the following steps. Each related resource has to be configured explicitly; currently, there is no file-based load support (unlike dashboards).
Note: The following instructions work if there are no conflicting resources. If you have existing/conflicting resources in the Grafana instance, you must first import those individually before updating the resource in the Terraform config. Follow the import instructions for each of them as applicable.
Add Contact Points & Templates to Terraform
For each contact point that needs to be managed as a Terraform resource, and optionally its associated message template (see the template sketch after the example below), execute the following steps.
If conflicting resources exist
Update kfuse.tf to include the contact point name that is conflicting (or needs importing), using tf-test-slack-contact-point as an example contact point:
resource "grafana_contact_point" "tf-test-slack-contact-point" {
  # (resource arguments)
  name = "tf-test-slack-contact-point"
}
Run:

terraform import grafana_contact_point.tf-test-slack-contact-point tf-test-slack-contact-point
Running the above command updates the terraform.tfstate file. Use the necessary arguments for the contact point from that file and update the kfuse.tf file, filling in each required argument (the plugin provider documentation has the information on the required parameters per contact point type). For example, for tf-test-slack-contact-point, update the kfuse.tf file as follows:
resource "grafana_contact_point" "tf-test-slack-contact-point" {
  # (resource arguments)
  name = "tf-test-slack-contact-point"
  slack {
    url = "https://hooks.slack.com/services/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"
  }
}
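To also manage a message template alongside the contact point, the provider offers a grafana_message_template resource. A minimal sketch, assuming a hypothetical template named tf-test-template (both the name and the template body are illustrative):

resource "grafana_message_template" "tf-test-template" {
  name     = "tf-test-template"
  template = "{{ define \"tf-test-template\" }}Alert: {{ .CommonLabels.alertname }}{{ end }}"
}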
Run:

terraform apply -auto-approve
Add Notification Policy to Terraform
Notification policies are maintained as a single object. Follow these steps to configure Terraform to manage notification policies.
If conflicting resources exist
Update kfuse.tf to include a notification policy resource, for example current-notification-policy:
resource "grafana_notification_policy" "current-notification-policy" { }
Run:

terraform import grafana_notification_policy.current-notification-policy policy
Running the above command updates the terraform.tfstate file. Use the necessary arguments for the notification policy from that file and update the kfuse.tf file, filling in each required argument (the plugin provider documentation has the information on the required parameters for the notification policy). For example:
resource "grafana_notification_policy" "current-notification-policy" {
  group_by      = ["alertname", "grafana_folder"]
  contact_point = grafana_contact_point.tf-test-slack-contact-point.name

  policy {
    matcher {
      label = "kube_service"
      match = "="
      value = "advance-function-service"
    }
    group_by      = ["..."]
    contact_point = grafana_contact_point.tf-test-email-contact-point.name
  }
}
Run:

terraform apply -auto-approve
Add Datasource Config to Terraform
The Kfuse datasources are already created in Grafana. Import them; they will be referenced when defining alert rules. Update kfuse.tf as follows:
resource "grafana_data_source" "KfuseDatasource" {
  name       = "KfuseDatasource"
  type       = "prometheus"
  is_default = true
  url        = "http://query-service:8080"
}
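Because the datasource already exists in Grafana, bring it into the Terraform state before applying. The provider supports importing a datasource by its numeric ID or UID; the value below is a placeholder, so check the actual ID in Grafana:

terraform import grafana_data_source.KfuseDatasource <datasource-id>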
Add Alert Rules to Terraform
For managing alert rules with Terraform, follow these steps for each rule group. Make sure to use grafana_data_source.KfuseDatasource.uid and kfuse-tf-test-folder in the alert rules; the folder resource and a working example of creating a test rule group with one rule in it are shown below.
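The rule group references a folder kfuse-tf-test-folder that is not defined elsewhere in this guide; assuming it should be managed alongside the other resources, a minimal sketch (the title value is illustrative) might be:

resource "grafana_folder" "kfuse-tf-test-folder" {
  provider = grafana
  title    = "kfuse tf test folder"
}

With the folder in place, here is the working example of the rule group: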
resource "grafana_rule_group" "tf-test-alert-rulegroup" {
  name             = "Test Alert Rules"
  folder_uid       = grafana_folder.kfuse-tf-test-folder.uid
  interval_seconds = 60
  org_id           = 1

  rule {
    name      = "Test Alert"
    condition = "C"
    for       = "0s"

    // Query the datasource.
    data {
      ref_id = "A"
      relative_time_range {
        from = 600
        to   = 0
      }
      datasource_uid = grafana_data_source.KfuseDatasource.uid
      // `model` is a JSON blob that sends datasource-specific data.
      // It's different for every datasource. The alert's query is defined here.
      model = jsonencode({
        "expr" : "avg(container_cpu_usage{container_name=~\".+\"})",
        "hide" : false,
        "interval" : "60s",
        "intervalMs" : 15000,
        "maxDataPoints" : 43200,
        "refId" : "A"
      })
    }

    // The query was configured to obtain data from the last 60 seconds.
    // Let's alert on the average value of that series using a Reduce stage.
    data {
      datasource_uid = "__expr__"
      // You can also create a rule in the UI, then GET that rule to obtain the JSON.
      // This can be helpful when using more complex reduce expressions.
      model  = <<EOT
{"conditions":[{"evaluator":{"params":[0,0],"type":"gt"},"operator":{"type":"and"},"query":{"params":["A"]},"reducer":{"params":[],"type":"last"},"type":"avg"}],"datasource":{"name":"Expression","type":"__expr__","uid":"__expr__"},"expression":"A","hide":false,"intervalMs":1000,"maxDataPoints":43200,"reducer":"last","refId":"B","type":"reduce"}
EOT
      ref_id = "B"
      relative_time_range {
        from = 0
        to   = 0
      }
    }

    // Now, let's use a math expression as our threshold.
    // We want to alert when the value of stage "B" above exceeds 70.
    data {
      datasource_uid = "__expr__"
      ref_id         = "C"
      relative_time_range {
        from = 0
        to   = 0
      }
      model = jsonencode({
        expression = "$B > 70"
        type       = "math"
        refId      = "C"
      })
    }
  }
}
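If a rule group with the same name already exists in the Grafana instance, import it first, consistent with the conflicting-resources note above. The provider documents importing a rule group by folder UID and group name joined with a semicolon; a sketch with a placeholder folder UID (verify the ID format against your provider version):

terraform import grafana_rule_group.tf-test-alert-rulegroup '<folder-uid>;Test Alert Rules'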