
If you use Terraform as a PaaS tool as part of your GitOps/CD workflow, and you choose to manage your dashboards and alerts (custom ones as well as those installed from the catalog service) with Terraform, follow the steps below. Using Terraform with Kloudfuse is the same as using it with Grafana for dashboards and alerts (requires the Terraform Grafana provider).

Note: For specific questions or scenarios not covered here, feel free to contact the Kloudfuse team.

Terraform provider definition

Create a Terraform resource file kfuse.tf and add a provider block with the following contents (make sure to change url and auth as required):

Code Block
terraform {
    required_providers {
        grafana = {
            source = "grafana/grafana"
            version = ">= 1.40.1"
        }
    }
}

provider "grafana" {
    url = "https://<your-kloudfuse-access-url>/grafana"
    auth = "admin:password"
    # use following format if using api key
    #auth = "glsa_n1BPVVcnBiDvhu3PmgrvpRJF47vl8MfO_cbb375b2"
}

Add Dashboard Resources to Terraform

Once Terraform is initialized (terraform init), the dashboards and alerts have to be imported into this Terraform instance. Importing dashboards is straightforward and can be done by simply adding the following resource to your kfuse.tf file. The example below shows how to import all dashboards within a custom folder custom_db_folder (as described in the documentation here).

...
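The resource elided above follows the standard Grafana provider pattern. As a sketch only (the folder title, directory name, and file layout are illustrative assumptions, not the exact resource from this guide), a file-based import of every JSON dashboard in a local dashboards/ directory could look like:

Code Block
# Sketch: assumes dashboards are exported as JSON files under ./dashboards
resource "grafana_folder" "custom_db_folder" {
  title = "custom_db_folder"
}

resource "grafana_dashboard" "custom" {
  # Create one dashboard resource per JSON file found in the directory
  for_each    = fileset("${path.module}/dashboards", "*.json")
  folder      = grafana_folder.custom_db_folder.id
  config_json = file("${path.module}/dashboards/${each.key}")
}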

Note: Applying the state can fail due to a conflict in the dashboard (this can happen if the dashboard already exists or there’s a UID clash). In that case, force the import using the overwrite flag.
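A minimal sketch of forcing the overwrite on a conflicting dashboard (the resource name and file path are illustrative):

Code Block
resource "grafana_dashboard" "conflicting" {
  config_json = file("dashboard.json")
  # Overwrite an existing dashboard with the same name/UID
  overwrite   = true
}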

Add Alerting Resources to Terraform

Configuring alerting-related resources in Terraform involves the following steps. Each related resource has to be explicitly configured. Currently, there’s no file-based load support (unlike dashboards).

Note: The following instructions work if there are no conflicting resources. If you have existing/conflicting resources in the Grafana instance, you must import each of them individually first, prior to updating the resource in the Terraform config. Please follow the import instructions for each of them as applicable.

Add Contact Points & Templates to Terraform

For each contact point that needs to be managed (and, optionally, its associated message template) as a Terraform resource, execute the following steps.

If conflicting resources exist

  1. Update kfuse.tf to include the contact point name which is conflicting (or needs importing), using tf-test-slack-contact-point as an example contact point:

Code Block
resource "grafana_contact_point" "tf-test-slack-contact-point" {
  # (resource arguments)
  name = "tf-test-slack-contact-point"
}
  2. Run terraform import grafana_contact_point.tf-test-slack-contact-point tf-test-slack-contact-point

  3. Running the above command will update the terraform.tfstate file. Use the necessary arguments for the contact point from that file and update the kfuse.tf file. Update each required argument (the plugin provider documentation lists the required parameters per contact point type). For example, for tf-test-slack-contact-point, update the kfuse.tf file as follows:

Code Block
resource "grafana_contact_point" "tf-test-slack-contact-point" {
  # (resource arguments)
  name = "tf-test-slack-contact-point"
  slack {
    url = "https://hooks.slack.com/services/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"
  }
}
  4. Run terraform apply -auto-approve
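The provider also supports managing the optional message templates mentioned above via the grafana_message_template resource. A minimal sketch (the template name and body are illustrative assumptions):

Code Block
# Sketch: a message template that contact points can reference
resource "grafana_message_template" "tf-test-template" {
  name     = "tf-test-template"
  template = <<-EOT
    {{ define "tf-test-template" }}
    Alert: {{ .CommonLabels.alertname }}
    {{ end }}
  EOT
}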

Add Notification Policy to Terraform

Notification policies are maintained as a single object. Follow these steps to configure terraform to manage notification policies.

If conflicting resources exist

  1. Update kfuse.tf to include a notification policy name, for example current-notification-policy:

Code Block
resource "grafana_notification_policy" "current-notification-policy" {
}
  2. Run terraform import grafana_notification_policy.current-notification-policy policy (the notification policy is a singleton, so the provider uses the fixed import ID policy)

  3. Running the above command will update the terraform.tfstate file. Use the necessary arguments for the notification policy from that file and update the kfuse.tf file. Update each required argument (the plugin provider documentation lists the required parameters for the notification policy). For example:

Code Block
resource "grafana_notification_policy" "current-notification-policy" {
    group_by = ["alertname", "grafana_folder"]
    contact_point = grafana_contact_point.tf-test-slack-contact-point.name
    policy {
        matcher {
            label = "kube_service"
            match = "="
            value = "advance-function-service"
        }
        group_by = ["..."]
        contact_point = grafana_contact_point.tf-test-email-contact-point.name
    }
}
  4. Run terraform apply -auto-approve

Add Datasource config to Terraform

Kfuse datasources are already created; import them so they can be referenced when defining alert rules. Update kfuse.tf as follows:

Code Block
resource "grafana_data_source" "KfuseDatasource" {
    name = "KfuseDatasource"
    type = "prometheus"
    is_default = true
    url = "http://query-service:8080"
}
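The existing datasource can then be imported into state by its ID (the trailing 1 below is an illustrative datasource ID; use the actual ID or UID from your Grafana instance):

Code Block
terraform import grafana_data_source.KfuseDatasource 1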

Add Alert Rules to Terraform

For managing alert rules with Terraform, follow these steps for each rule group. Make sure to reference grafana_data_source.KfuseDatasource.uid and a folder resource (such as grafana_folder.kfuse-tf-test-folder, which must be defined in your kfuse.tf) in the alert rules. Here’s a working example of creating a test rule group with one rule in it:
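Since the rule group below references grafana_folder.kfuse-tf-test-folder, that folder resource must also exist in your configuration. A minimal sketch (the folder title is an assumption):

Code Block
resource "grafana_folder" "kfuse-tf-test-folder" {
  title = "kfuse-tf-test-folder"
}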

Code Block
resource "grafana_rule_group" "tf-test-alert-rulegroup" {
    name = "Test Alert Rules"
    folder_uid = grafana_folder.kfuse-tf-test-folder.uid
    interval_seconds = 60
    org_id = 1

    rule {
        name = "Test Alert"
        condition = "C"
        for = "0s"

        // Query the datasource.
        data {
            ref_id = "A"
            relative_time_range {
                from = 600
                to = 0
            }
            datasource_uid = grafana_data_source.KfuseDatasource.uid
            // `model` is a JSON blob that sends datasource-specific data.
            // It's different for every datasource. The alert's query is defined here.
            model = jsonencode({
                    "expr": "avg(container_cpu_usage{container_name=~\".+\"})",
                    "hide": false,
                    "interval": "60s",
                    "intervalMs": 15000,
                    "maxDataPoints": 43200,
                    "refId": "A"
                }
            )
        }

        // The query was configured to obtain data from the last 60 seconds. Let's alert on the average value of that series using a Reduce stage.
        data {
            datasource_uid = "__expr__"
            // You can also create a rule in the UI, then GET that rule to obtain the JSON.
            // This can be helpful when using more complex reduce expressions.
            model = <<EOT
{"conditions":[{"evaluator":{"params":[0,0],"type":"gt"},"operator":{"type":"and"},"query":{"params":["A"]},"reducer":{"params":[],"type":"last"},"type":"avg"}],"datasource":{"name":"Expression","type":"__expr__","uid":"__expr__"},"expression":"A","hide":false,"intervalMs":1000,"maxDataPoints":43200,"reducer":"last","refId":"B","type":"reduce"}
EOT
            ref_id = "B"
            relative_time_range {
                from = 0
                to = 0
            }
        }

        // Now, let's use a math expression as our threshold.
        // We want to alert when the value of stage "B" above exceeds 70.
        data {
            datasource_uid = "__expr__"
            ref_id = "C"
            relative_time_range {
                from = 0
                to = 0
            }
            model = jsonencode({
                expression = "$B > 70"
                type = "math"
                refId = "C"
            })
        }
    }
}