Provided values files:

  • dd-values-kfuse-same-cluster-on-prem.yaml
  • dd-values-kfuse-diff-cluster-on-prem.yaml

Assumptions

  • If you are reusing an existing Datadog Agent (dd-agent) installation and only pointing it to Kloudfuse instead of Datadog HQ:

    • The dd-agent version needs to be 7.41 or higher.

    • Validate using the following steps:

      • Check the chart version with helm list. It should be 3.1.10 or higher.

        Code Block
        helm list -n <namespace-where-agent-is-installed>
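      • Check the image version of the agent by running describe pod on the dd-agent pod.

        Code Block
        kubectl describe pod -n <namespace-where-agent-is-installed> | grep Image
            Image:         gcr.io/datadoghq/agent:7.36.0

        In this sample output the agent is at 7.36.0, which is below 7.41 and would need to be upgraded.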
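The two version checks above can be scripted. The following is a minimal sketch, not part of the Kloudfuse tooling: the helper names version_ge and extract_agent_tag are hypothetical, and it assumes a sort that supports -V (version sort), as GNU coreutils does.

```shell
# version_ge A B: succeeds when version A is greater than or equal to version B.
# Assumes `sort -V` (version sort) is available, as in GNU coreutils.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# extract_agent_tag: reads `kubectl describe pod` output on stdin and
# prints the tag of the first agent image line it finds.
extract_agent_tag() {
  grep -m1 'Image:.*agent:' | sed 's/.*agent://'
}

# Example usage against a live cluster (namespace is a placeholder):
#   tag="$(kubectl describe pod -n <namespace-where-agent-is-installed> | extract_agent_tag)"
#   version_ge "$tag" 7.41 && echo "agent version OK" || echo "agent needs an upgrade"
```

A plain string comparison would rank 3.1.9 above 3.1.10, which is why the sketch relies on version sort.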

Install DataDog Agent

The steps below apply to an existing or a new agent. Choose the setup scenario that matches your deployment:

  • Kloudfuse stack and target both in the same VPC and in the same K8S cluster (default)
  • Kloudfuse stack and target both in the same VPC, but in different K8S clusters
  • Kloudfuse stack hosted in a different VPC (hosted at “customer.kloudfuse.io”)

Kloudfuse stack and target both in the same VPC and in the same K8S cluster (default)

This is the default scenario. Use the provided dd-values-kfuse-same-cluster-on-prem.yaml with the install command.

Kloudfuse stack and target both in the same VPC, but in different K8S clusters

Search for _url in the provided file. Wherever it is found, comment out the Default Scenario lines and uncomment the Scenario 1 lines. To obtain the IP address needed for these values, follow the steps below.

  • Get ingress internal IP

    Code Block
    kubectl get svc -n kfuse | grep kfuse-ingress-nginx-controller-internal
    kfuse-ingress-nginx-controller-internal   LoadBalancer   10.53.250.80    10.53.232.3   80:32716/TCP,443:30767/TCP   125m
  • Update the provided dd-values-kfuse-diff-cluster-on-prem.yaml with the above IP address in the following fields:
  • replace all settings with _url suffix, for example:

    Code Block
    dd_url: http://10.53.250.80/ingester
    logs_dd_url: "10.53.250.80:80"
  • use updated dd-values-kfuse-diff-cluster-on-prem.yaml with the install command.

  • Required Datadog configurations

    The yaml files described above already contain the Datadog configurations that Kloudfuse requires to ingest the data properly. These required settings are listed below:

    Code Block
    datadog:
      logsEnabled: true
      logs:
        enabled: true
        containerCollectAll: true
        containerCollectUsingFiles: true
        autoMultiLineDetection: true
      kubeStateMetricsEnabled: false
      kubeStateMetricsCore:
        enabled: true
        ignoreLegacyKSMCheck: true
      orchestratorExplorer:
        enabled: true
      processAgent:
        enabled: true
      prometheusScrape:
        enabled: true
        version: 1
        additionalConfigs:
          - configurations:
            - send_monotonic_counter: false
              send_distribution_counts_as_monotonic: false
              send_distribution_sums_as_monotonic: false
              send_histograms_buckets: true
              max_returned_metrics: 999999
              min_collection_interval: 15          
    clusterAgent:
      enabled: true
      datadog_cluster_yaml:
        process_config:
          container_collection:
            enabled: false
        orchestrator_explorer:
          manifest_collection:
            enabled: false
        use_v2_api:
          events: true
          series: true
          service_checks: true
      admissionController:
        enabled: false
    agents:
      image:
        tagSuffix: jmx
      useConfigMap: true
      customAgentConfig:
        skip_ssl_validation: false
        enable_stream_payload_serialization: false
        process_config:
          container_collection:
            enabled: false
        orchestrator_explorer:
          manifest_collection:
            enabled: false
        use_v2_api:
          events: true
          series: true
          service_checks: true
        logs_config:
          use_http: true
          logs_no_ssl: true
          auto_multi_line_detection: true
          use_v2_api: false
        apm_config:
          enabled: true
          apm_non_local_traffic: true
        metadata_providers:
          - name: host
            interval: 300          

    Collect high cardinality tags

    The Datadog Agent is installed with cardinality set to the ‘orchestrator’ level, which allows granular (pod- and container-level) tagging of metrics. The default setting in the Datadog Agent is ‘low’, which allows tagging only at the host level.

    Install command

    If you have not already done so, add the Datadog Helm repo:

    Code Block
    helm repo add datadog https://helm.datadoghq.com
    helm repo update

    Install the Datadog Agent with the updated values. The command below assumes the file is called dd-values-kfuse.yaml; replace that argument with the filename relevant to your scenario.

    Code Block
    # Create separate namespace for the agent to be installed (if required)
    kubectl create namespace kfuse-agent
    helm upgrade --install kfuse-agent -f dd-values-kfuse.yaml datadog/datadog -n kfuse-agent --version 3.6.7

    Kloudfuse stack hosted in a different VPC (hosted at “customer.kloudfuse.io”)

    Search for _url in the values.yaml file provided in the first scenario. Wherever it is found, comment out the Default Scenario lines and uncomment the Scenario 2 lines. Replace customer.kloudfuse.io with your custom DNS name entry.

    • replace all settings with _url suffix, for example:

      Code Block
      dd_url: http://customer.kloudfuse.io/ingester
      logs_dd_url: "customer.kloudfuse.io:443"
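The hostname replacement described above can be done with a stream edit rather than by hand. This is a minimal sketch: the helper name replace_kfuse_host and the hostname kfuse.example.com are hypothetical, and it only swaps the hostname. Commenting out the Default Scenario lines and uncommenting the Scenario 2 lines remains a manual step.

```shell
# replace_kfuse_host NEW_HOST: reads a values file on stdin and writes it to
# stdout with every occurrence of customer.kloudfuse.io replaced by NEW_HOST.
# NEW_HOST is assumed to be a plain DNS name (letters, digits, dots, hyphens).
replace_kfuse_host() {
  new_host="$1"
  sed "s/customer\.kloudfuse\.io/${new_host}/g"
}

# Example usage (file names are placeholders):
#   replace_kfuse_host kfuse.example.com < dd-values-kfuse.yaml > dd-values-kfuse.updated.yaml
```

Writing to a new file keeps the provided values file intact, so the result can be compared against the original before installing.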

    The high cardinality tag setting and the install command are the same as for the previous scenario (see “Collect high cardinality tags” and “Install command” above).

    Adding custom tags

    Custom tags can be added to the agent so that every metric collected by the agent carries the tag. To add custom tags, follow these steps.

    1. Update the dd-values-kfuse.yaml file to include the custom tags as shown below (each tag must be a single string in key:value form, with the key and value separated by ":"):

    Code Block
    datadog:
      tags:
        - custom_tag_name:custom_tag_value
    2. Update the kfuse Helm custom-values.yaml file to whitelist the new custom tag. The new entry must be appended to the existing list; otherwise some default values will be overwritten, which is not the desired behavior. Please contact the Kloudfuse team for assistance:

    Code Block
    ingester:
      config:
        hostTagIncludes:
        - kf
        - kfuse
        - kube_cluster_name
        - kubernetes.io/hostname
        - node.kubernetes.io/instance-type
        - org_id
        - project
        - topology.kubernetes.io/region
        - topology.kubernetes.io/zone
        ...
        - custom_tag_name
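Because the agent values take the full key:value string while hostTagIncludes takes only the key, a small check can catch mismatches before an upgrade. This is a sketch only: the helper names is_valid_tag and tag_key are hypothetical, and the validation is deliberately simple (a non-empty key, a colon, a non-empty value, no whitespace).

```shell
# is_valid_tag TAG: succeeds when TAG looks like key:value with no whitespace.
is_valid_tag() {
  case "$1" in
    *[[:space:]]*) return 1 ;;  # reject tags containing whitespace
    ?*:?*) return 0 ;;          # non-empty key, ":", non-empty value
    *) return 1 ;;
  esac
}

# tag_key TAG: prints the key portion (everything before the first ":"),
# which is the value to append to hostTagIncludes.
tag_key() {
  printf '%s\n' "${1%%:*}"
}

# Example usage:
#   is_valid_tag "custom_tag_name:custom_tag_value" && tag_key "custom_tag_name:custom_tag_value"
```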

    Enabling Pods to be detected by Prometheus Autodiscovery

    In addition to enabling prometheusScrape in the Datadog values yaml, the pods need to have the following annotations. Note that if the application pods are deployed using Helm, the Helm values typically support a podAnnotations section.

    Code Block
      prometheus.io/path: <Prometheus endpoint path, e.g., /metrics>
      prometheus.io/port: <Prometheus endpoint port, e.g., "9090">
      prometheus.io/scrape: "true"
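For workloads whose chart does not expose a podAnnotations section, the same annotations could be applied to a Deployment's pod template with kubectl patch. The sketch below only builds the patch document; the helper name prometheus_annotation_patch is hypothetical, and the deployment name and namespace in the usage comment are placeholders.

```shell
# prometheus_annotation_patch PATH PORT: prints a JSON merge patch that adds
# the three Prometheus annotations to a Deployment's pod template.
prometheus_annotation_patch() {
  path="$1"
  port="$2"
  printf '{"spec":{"template":{"metadata":{"annotations":{"prometheus.io/path":"%s","prometheus.io/port":"%s","prometheus.io/scrape":"true"}}}}}\n' "$path" "$port"
}

# Example usage (changing the pod template triggers a rollout of the pods):
#   kubectl patch deployment <deployment-name> -n <namespace> --type merge \
#     -p "$(prometheus_annotation_patch /metrics 9090)"
```

The port is emitted as a JSON string because Kubernetes annotation values must be strings.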



    Metadata for metrics collected using the openmetrics check

    If using the above configuration, the openmetrics check is enabled in the agent. See the Datadog openmetrics check documentation for more details on what the check does. By default, the openmetrics check does not collect any metadata. The Kloudfuse agent (installed using the provided values file) employs a custom check (kf_openmetrics) to collect metadata (the “Description” and “Type” of metrics) for metrics collected using the openmetrics check. To enable metadata collection for these metrics, the sources must be annotated so that agent auto-discovery finds these pods and executes the custom check.

    Kubernetes environment

    To enable metrics metadata collection from a Kubernetes pod, add the following annotations to the pod (this can also be done through Helm for each Deployment/StatefulSet):

    Code Block
    apiVersion: v1
    kind: Pod
    # (...)
    metadata:
      name: '<POD_NAME>'
      annotations:
        ad.datadoghq.com/<CONTAINER_IDENTIFIER>.check_names: '["kf_openmetrics"]'
        ad.datadoghq.com/<CONTAINER_IDENTIFIER>.init_configs: '[{}]'
        ad.datadoghq.com/<CONTAINER_IDENTIFIER>.instances: '[{"openmetrics_endpoint": "http://%%host%%:%%port%%/metrics"}]'
        # (...)
    spec:
      containers:
        - name: '<CONTAINER_IDENTIFIER>'
    # (...)
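The <CONTAINER_IDENTIFIER> in the annotation keys has to match the container name under spec.containers, which is easy to get wrong when editing by hand. As a sketch, the hypothetical helper below (kf_openmetrics_annotations is not part of any Kloudfuse or Datadog tooling) prints the three annotation lines for a given container name, assuming the metrics are served at /metrics as in the example above.

```shell
# kf_openmetrics_annotations CONTAINER: prints the three auto-discovery
# annotations for the kf_openmetrics check, keyed by the container name.
# %%host%% and %%port%% are left as literal template variables for the agent.
kf_openmetrics_annotations() {
  container="$1"
  cat <<EOF
ad.datadoghq.com/${container}.check_names: '["kf_openmetrics"]'
ad.datadoghq.com/${container}.init_configs: '[{}]'
ad.datadoghq.com/${container}.instances: '[{"openmetrics_endpoint": "http://%%host%%:%%port%%/metrics"}]'
EOF
}

# Example usage: paste the output under metadata.annotations of the pod spec
# (or under podAnnotations in Helm values).
#   kf_openmetrics_annotations my-app
```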

    Advanced monitoring using Knight

    Enable kubernetes_state_metrics

    Knight-based monitoring (Kfuse 1.3 or higher) currently depends on the kubernetes_state_metrics (KSM) check, which is not enabled by default in the newer version of the agent (2.0). Ensure that the agent continues to capture these metrics through KSM by adding or updating the following in the dd-agent values file:

    Code Block
    datadog:
      kubeStateMetricsEnabled: true
      kubeStateMetricsCore:
        enabled: true
        ignoreLegacyKSMCheck: false

    Enable Knight based monitoring in kfuse

    Set knightEnabled to true in custom-values.yaml, then upgrade the cluster.

    Code Block
    ingester:
      ...
      config:
        knightEnabled: true