...
Setup Kfuse-profiler agent to scrape the profiling data
Info |
---|
Prerequisites:
1. Ensure your Golang application exposes pprof endpoints. In pull mode, the collector, Alloy, periodically retrieves profiles from Golang applications by querying these pprof endpoints.
2. If your Go code is not set up to generate profiles, set up Golang profiling as described here (for Go pull mode). For Java, follow the instructions here to set up profiling.
3. Alloy then queries the pprof endpoints of your Golang application, collects the profiles, and forwards them to the Kfuse Profiler server. |
To setup scraping of data:
Enable the profiler in your custom values.yaml file, then configure the Alloy scraper config in a new file, alloy-values.yaml. Download a copy of the default alloy-values.yaml file from here and customize the alloy configMap section following the instructions below.
Code Block |
---|
pyroscope:
  alloy:
    enabled: true |
Configure alloy scraper config:
Code Block |
---|
# -- Overrides the chart's name. Used to change the infix in the resource names.
nameOverride: null

# -- Overrides the chart's computed fullname. Used to change the full prefix of
# resource names.
fullnameOverride: null

## Global properties for image pulling override the values defined under `image.registry` and `configReloader.image.registry`.
## If you want to override only one image registry, use the specific fields but if you want to override them all, use `global.image.registry`
global:
  image:
    # -- Global image registry to use if it needs to be overriden for some specific use cases (e.g local registries, custom images, ...)
    registry: ""

    # -- Optional set of global image pull secrets.
    pullSecrets: []

  # -- Security context to apply to the Grafana Alloy pod.
  podSecurityContext: {}

crds:
  # -- Whether to install CRDs for monitoring.
  create: true

## Various Alloy settings. For backwards compatibility with the grafana-agent
## chart, this field may also be called "agent". Naming this field "agent" is
## deprecated and will be removed in a future release.
alloy:
  configMap:
    # -- Create a new ConfigMap for the config file.
    create: true

    # -- Content to assign to the new ConfigMap. This is passed into `tpl` allowing for templating from values.
    content: |-
      // Write your Alloy config here:
      logging {
        level  = "info"
        format = "logfmt"
      }
      discovery.kubernetes "pyroscope_kubernetes" {
        role = "pod"
      }
      discovery.relabel "kubernetes_pods" {
        targets = concat(discovery.kubernetes.pyroscope_kubernetes.targets)
        rule {
          action        = "drop"
          source_labels = ["__meta_kubernetes_pod_phase"]
          regex         = "Pending|Succeeded|Failed|Completed"
        }
        rule {
          action = "labelmap"
          regex  = "__meta_kubernetes_pod_label_(.+)"
        }
        rule {
          action        = "replace"
          source_labels = ["__meta_kubernetes_namespace"]
          target_label  = "kubernetes_namespace"
        }
        rule {
          action        = "replace"
          source_labels = ["__meta_kubernetes_pod_name"]
          target_label  = "kubernetes_pod_name"
        }
        rule {
          action        = "keep"
          source_labels = ["__meta_kubernetes_pod_annotation_pyroscope_io_scrape"]
          regex         = "true"
        }
        rule {
          action        = "replace"
          source_labels = ["__meta_kubernetes_pod_annotation_pyroscope_io_application_name"]
          target_label  = "service_name"
        }
        rule {
          action        = "replace"
          source_labels = ["__meta_kubernetes_pod_annotation_pyroscope_io_spy_name"]
          target_label  = "__spy_name__"
        }
        rule {
          action        = "replace"
          source_labels = ["__meta_kubernetes_pod_annotation_pyroscope_io_scheme"]
          regex         = "(https?)"
          target_label  = "__scheme__"
        }
        rule {
          action        = "replace"
          source_labels = ["__address__", "__meta_kubernetes_pod_annotation_pyroscope_io_port"]
          regex         = "(.+?)(?::\\d+)?;(\\d+)"
          replacement   = "$1:$2"
          target_label  = "__address__"
        }
        rule {
          action      = "labelmap"
          regex       = "__meta_kubernetes_pod_annotation_pyroscope_io_profile_(.+)"
          replacement = "__profile_$1"
        }
      }
      pyroscope.scrape "pyroscope_scrape" {
        clustering {
          enabled = true
        }
        targets    = concat(discovery.relabel.kubernetes_pods.output)
        forward_to = [pyroscope.write.pyroscope_write.receiver]
        profiling_config {
          profile.memory {
            enabled = true
          }
          profile.process_cpu {
            enabled = true
          }
          profile.goroutine {
            enabled = true
          }
          profile.block {
            enabled = false
          }
          profile.mutex {
            enabled = false
          }
          profile.fgprof {
            enabled = false
          }
        }
      }
      pyroscope.write "pyroscope_write" {
        endpoint {
          url = "https://<KFUSE ENDPOINT/DNS NAME>/profile"
        }
      }

  # -- Name of existing ConfigMap to use. Used when create is false.
  name: null
  # -- Key in ConfigMap to get config from.
  key: null

  clustering:
    # -- Deploy Alloy in a cluster to allow for load distribution.
    enabled: false
    # -- Name for the Alloy cluster. Used for differentiating between clusters.
    name: ""
    # -- Name for the port used for clustering, useful if running inside an Istio Mesh
    portName: http

  # -- Minimum stability level of components and behavior to enable. Must be
  # one of "experimental", "public-preview", or "generally-available".
  stabilityLevel: "generally-available"

  # -- Path to where Grafana Alloy stores data (for example, the Write-Ahead Log).
  # By default, data is lost between reboots.
  storagePath: /tmp/alloy

  # -- Address to listen for traffic on. 0.0.0.0 exposes the UI to other
  # containers.
  listenAddr: 0.0.0.0

  # -- Port to listen for traffic on.
  listenPort: 12345

  # -- Scheme is needed for readiness probes. If enabling tls in your configs, set to "HTTPS"
  listenScheme: HTTP

  # -- Base path where the UI is exposed.
  uiPathPrefix: /

  # -- Enables sending Grafana Labs anonymous usage stats to help improve Grafana
  # Alloy.
  enableReporting: true

  # -- Extra environment variables to pass to the Alloy container.
  extraEnv: []

  # -- Maps all the keys on a ConfigMap or Secret as environment variables. https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.24/#envfromsource-v1-core
  envFrom: []

  # -- Extra args to pass to `alloy run`: https://grafana.com/docs/alloy/latest/reference/cli/run/
  extraArgs: []

  # -- Extra ports to expose on the Alloy container.
  extraPorts: []
  # - name: "faro"
  #   port: 12347
  #   targetPort: 12347
  #   protocol: "TCP"
  #   appProtocol: "h2c"

  mounts:
    # -- Mount /var/log from the host into the container for log collection.
    varlog: false
    # -- Mount /var/lib/docker/containers from the host into the container for log
    # collection.
    dockercontainers: false

    # -- Extra volume mounts to add into the Grafana Alloy container. Does not
    # affect the watch container.
    extra: []

  # -- Security context to apply to the Grafana Alloy container.
  securityContext: {}

  # -- Resource requests and limits to apply to the Grafana Alloy container.
  resources: {}

image:
  # -- Grafana Alloy image registry (defaults to docker.io)
  registry: "docker.io"
  # -- Grafana Alloy image repository.
  repository: grafana/alloy
  # -- (string) Grafana Alloy image tag. When empty, the Chart's appVersion is
  # used.
  tag: null
  # -- Grafana Alloy image's SHA256 digest (either in format "sha256:XYZ" or "XYZ"). When set, will override `image.tag`.
  digest: null
  # -- Grafana Alloy image pull policy.
  pullPolicy: IfNotPresent
  # -- Optional set of image pull secrets.
  pullSecrets: []

rbac:
  # -- Whether to create RBAC resources for Alloy.
  create: true

serviceAccount:
  # -- Whether to create a service account for the Grafana Alloy deployment.
  create: true
  # -- Additional labels to add to the created service account.
  additionalLabels: {}
  # -- Annotations to add to the created service account.
  annotations: {}
  # -- The name of the existing service account to use when
  # serviceAccount.create is false.
  name: null

# Options for the extra controller used for config reloading.
configReloader:
  # -- Enables automatically reloading when the Alloy config changes.
  enabled: true
  image:
    # -- Config reloader image registry (defaults to docker.io)
    registry: "ghcr.io"
    # -- Repository to get config reloader image from.
    repository: jimmidyson/configmap-reload
    # -- Tag of image to use for config reloading.
    tag: v0.12.0
    # -- SHA256 digest of image to use for config reloading (either in format "sha256:XYZ" or "XYZ"). When set, will override `configReloader.image.tag`
    digest: ""
  # -- Override the args passed to the container.
  customArgs: []
  # -- Resource requests and limits to apply to the config reloader container.
  resources:
    requests:
      cpu: "1m"
      memory: "5Mi"
  # -- Security context to apply to the Grafana configReloader container.
  securityContext: {}

controller:
  # -- Type of controller to use for deploying Grafana Alloy in the cluster.
  # Must be one of 'daemonset', 'deployment', or 'statefulset'.
  type: 'deployment'

  # -- Number of pods to deploy. Ignored when controller.type is 'daemonset'.
  replicas: 1

  # -- Annotations to add to controller.
  extraAnnotations: {}

  # -- Whether to deploy pods in parallel. Only used when controller.type is
  # 'statefulset'.
  parallelRollout: true

  # -- Configures Pods to use the host network. When set to true, the ports that will be used must be specified.
  hostNetwork: false

  # -- Configures Pods to use the host PID namespace.
  hostPID: false

  # -- Configures the DNS policy for the pod. https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pod-s-dns-policy
  dnsPolicy: ClusterFirst

  # -- Update strategy for updating deployed Pods.
  updateStrategy: {}

  # -- nodeSelector to apply to Grafana Alloy pods.
  nodeSelector: {}

  # -- Tolerations to apply to Grafana Alloy pods.
  tolerations:
    - key: "ng_pisco"
      operator: "Equal"
      value: "kloudfuse"
      effect: "NoSchedule"

  # -- Topology Spread Constraints to apply to Grafana Alloy pods.
  topologySpreadConstraints: []

  # -- priorityClassName to apply to Grafana Alloy pods.
  priorityClassName: ''

  # -- Extra pod annotations to add.
  podAnnotations: {}

  # -- Extra pod labels to add.
  podLabels: {}

  # -- PodDisruptionBudget configuration.
  podDisruptionBudget:
    # -- Whether to create a PodDisruptionBudget for the controller.
    enabled: false
    # -- Minimum number of pods that must be available during a disruption.
    # Note: Only one of minAvailable or maxUnavailable should be set.
    minAvailable: null
    # -- Maximum number of pods that can be unavailable during a disruption.
    # Note: Only one of minAvailable or maxUnavailable should be set.
    maxUnavailable: null

  # -- Whether to enable automatic deletion of stale PVCs due to a scale down operation, when controller.type is 'statefulset'.
  enableStatefulSetAutoDeletePVC: false

  autoscaling:
    # -- Creates a HorizontalPodAutoscaler for controller type deployment.
    enabled: false
    # -- The lower limit for the number of replicas to which the autoscaler can scale down.
    minReplicas: 1
    # -- The upper limit for the number of replicas to which the autoscaler can scale up.
    maxReplicas: 5
    # -- Average CPU utilization across all relevant pods, a percentage of the requested value of the resource for the pods. Setting `targetCPUUtilizationPercentage` to 0 will disable CPU scaling.
    targetCPUUtilizationPercentage: 0
    # -- Average Memory utilization across all relevant pods, a percentage of the requested value of the resource for the pods. Setting `targetMemoryUtilizationPercentage` to 0 will disable Memory scaling.
    targetMemoryUtilizationPercentage: 80

    scaleDown:
      # -- List of policies to determine the scale-down behavior.
      policies: []
      # - type: Pods
      #   value: 4
      #   periodSeconds: 60
      # -- Determines which of the provided scaling-down policies to apply if multiple are specified.
      selectPolicy: Max
      # -- The duration that the autoscaling mechanism should look back on to make decisions about scaling down.
      stabilizationWindowSeconds: 300

    scaleUp:
      # -- List of policies to determine the scale-up behavior.
      policies: []
      # - type: Pods
      #   value: 4
      #   periodSeconds: 60
      # -- Determines which of the provided scaling-up policies to apply if multiple are specified.
      selectPolicy: Max
      # -- The duration that the autoscaling mechanism should look back on to make decisions about scaling up.
      stabilizationWindowSeconds: 0

  # -- Affinity configuration for pods.
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: ng_label
                operator: In
                values:
                  - pisco

  volumes:
    # -- Extra volumes to add to the Grafana Alloy pod.
    extra: []

  # -- volumeClaimTemplates to add when controller.type is 'statefulset'.
  volumeClaimTemplates: []

  ## -- Additional init containers to run.
  ## ref: https://kubernetes.io/docs/concepts/workloads/pods/init-containers/
  initContainers: []

  # -- Additional containers to run alongside the Alloy container and initContainers.
  extraContainers: []

service:
  # -- Creates a Service for the controller's pods.
  enabled: true
  # -- Service type
  type: ClusterIP
  # -- NodePort port. Only takes effect when `service.type: NodePort`
  nodePort: 31128
  # -- Cluster IP, can be set to None, empty "" or an IP address
  clusterIP: ''
  # -- Value for internal traffic policy. 'Cluster' or 'Local'
  internalTrafficPolicy: Cluster
  annotations: {}
  # cloud.google.com/load-balancer-type: Internal

serviceMonitor:
  enabled: false
  # -- Additional labels for the service monitor.
  additionalLabels: {}
  # -- Scrape interval. If not set, the Prometheus default scrape interval is used.
  interval: ""
  # -- MetricRelabelConfigs to apply to samples after scraping, but before ingestion.
  # ref: https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#relabelconfig
  metricRelabelings: []
  # - action: keep
  #   regex: 'kube_(daemonset|deployment|pod|namespace|node|statefulset).+'
  #   sourceLabels: [__name__]

  # -- Customize tls parameters for the service monitor
  tlsConfig: {}

  # -- RelabelConfigs to apply to samples before scraping
  # ref: https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#relabelconfig
  relabelings: []
  # - sourceLabels: [__meta_kubernetes_pod_node_name]
  #   separator: ;
  #   regex: ^(.*)$
  #   targetLabel: nodename
  #   replacement: $1
  #   action: replace

ingress:
  # -- Enables ingress for Alloy (Faro port)
  enabled: false
  # For Kubernetes >= 1.18 you should specify the ingress-controller via the field ingressClassName
  # See https://kubernetes.io/blog/2020/04/02/improvements-to-the-ingress-api-in-kubernetes-1.18/#specifying-the-class-of-an-ingress
  # ingressClassName: nginx
  # Values can be templated
  annotations: {}
  # kubernetes.io/ingress.class: nginx
  # kubernetes.io/tls-acme: "true"
  labels: {}
  path: /
  faroPort: 12347

  # pathType is only for k8s >= 1.1=
  pathType: Prefix

  hosts:
    - chart-example.local
  ## Extra paths to prepend to every host configuration. This is useful when working with annotation based services.
  extraPaths: []
  # - path: /*
  #   backend:
  #     serviceName: ssl-redirect
  #     servicePort: use-annotation
  ## Or for k8s > 1.19
  # - path: /*
  #   pathType: Prefix
  #   backend:
  #     service:
  #       name: ssl-redirect
  #       port:
  #         name: use-annotation

  tls: []
  # - secretName: chart-example-tls
  #   hosts:
  #     - chart-example.local |
...
Configure these two blocks in the above Alloy configuration file:
1. pyroscope.write
2. pyroscope.scrape
...
Configure the pyroscope.write block
The pyroscope.write
block is used to define the endpoint where profiling data will be sent.
Change url to https://<KFUSE ENDPOINT/DNS NAME>/profile. Change write_job_name to an appropriate name, such as kfuse_profiler_write.
Code Block |
---|
pyroscope.write "write_job_name" {
  endpoint {
    url = "https://<KFUSE ENDPOINT/DNS NAME>/profile"
  }
} |
...
2. Configure pyroscope.scrape
block
The pyroscope.scrape
block is used to define the scraping configuration for profiling data.
Change scrape_job_name to an appropriate name, such as kfuse_profiler_scrape. Use discovery.relabel.kubernetes_pods.output as the target for the pyroscope.scrape block to discover Kubernetes targets. Follow the steps here to set up specific regex rules for discovering Kubernetes targets.
Code Block |
---|
pyroscope.scrape "scrape_job_name" {
  targets    = concat(discovery.relabel.kubernetes_pods.output)
  forward_to = [pyroscope.write.write_job_name.receiver]
  profiling_config {
    profile.process_cpu {
      enabled = true
    }
    profile.godeltaprof_memory {
      enabled = true
    }
    profile.memory {
      // disable memory, use godeltaprof_memory instead
      enabled = false
    }
    profile.godeltaprof_mutex {
      enabled = true
    }
    profile.mutex {
      // disable mutex, use godeltaprof_mutex instead
      enabled = false
    }
    profile.godeltaprof_block {
      enabled = true
    }
    profile.block {
      // disable block, use godeltaprof_block instead
      enabled = false
    }
    profile.goroutine {
      enabled = true
    }
  }
} |
...
Replace "scrape_job_name"
with a unique name for the scrape job.
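For a pod to be picked up by this discovery pipeline at all, it must carry the pyroscope.io annotations that the relabel rules match on. A hypothetical pod-template snippet, assuming the standard pyroscope.io annotation keys (the service name and port values are placeholders):

```yaml
# Pod template metadata on the application's Deployment/StatefulSet.
metadata:
  annotations:
    pyroscope.io/scrape: "true"                    # matched by the "keep" rule
    pyroscope.io/application-name: "my-go-service" # becomes the service_name label
    pyroscope.io/port: "6060"                      # rewritten into __address__
```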
...
Configuration Details
pyroscope.write: Defines where profiling data should be written. The url specifies the endpoint where profiles are sent.
pyroscope.scrape: Specifies the targets to scrape profiling data from. The forward_to field connects the scrape job to the write job. The profiling_config block enables or disables specific profiles:
profile.process_cpu: Enables CPU profiling.
profile.godeltaprof_memory: Enables delta memory profiling.
profile.memory: Disabled to avoid redundancy with godeltaprof_memory.
profile.godeltaprof_mutex: Enables delta mutex profiling.
profile.mutex: Disabled to avoid redundancy with godeltaprof_mutex.
profile.godeltaprof_block: Enables delta block profiling.
profile.block: Disabled to avoid redundancy with godeltaprof_block.
profile.goroutine: Enables goroutine profiling.
...
3. Applying the Configuration
After adding the above blocks to the Alloy configuration file, save the changes.
1. Install alloy in the namespace where you want to scrape the data from by following the steps here.
2. Upgrade alloy using the alloy-values.yaml file set up above. Replace <namespace> with the namespace where you installed alloy in the previous step.
Code Block |
---|
helm upgrade --namespace <namespace> alloy grafana/alloy -f <path/to/alloy-values.yaml> |