Fluentd integration

1 What is Fluentd?
2 Integrating Fluentd with Kloudfuse stack
- 2.1 HTTP plugin integration
- 2.2 Configuration
  - 2.2.1 Kubernetes Labels
  - 2.2.2 Log source
  - 2.2.3 Log message
  - 2.2.4 Agent extracted key-value pairs

What is Fluentd?

Fluentd is "an open source data collector, which lets you unify the data collection and consumption for a better use and understanding of data" (quoted from the Fluentd documentation). For more information, refer to the Fluentd documentation.

Integrating Fluentd with Kloudfuse stack

Refer to the Fluentd documentation for installation instructions. To send data from Fluentd to the Kloudfuse stack, modify the agent’s config, or values.yaml if you install using Helm.

You’ll need to configure the HTTP output plugin to forward data to Kloudfuse.

HTTP plugin integration

Add the following HTTP output configuration to the Fluentd agent’s config.

<match **> # Match everything (** also matches multi-part tags; * matches only a single tag part)
  @type http

  endpoint http://<KFUSE_INGESTER_IP>:80/ingester/v1/fluentd
  open_timeout 2

  <format>
    @type json
  </format>
  <buffer>
    chunk_limit_size 1048576 # 1MB
    flush_interval 10s
  </buffer>
</match>

The snippet above is an example configuration. You can change the buffer settings to suit your application’s needs. However, the endpoint URI path (/ingester/v1/fluentd) must not be changed.

Kloudfuse currently supports json and msgpack as format types for the Fluentd HTTP plugin.

If you have configured external ingress with TLS for Kloudfuse, use https://<KFUSE_INGESTER_IP>:443/ingester/v1/fluentd as the endpoint instead.
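
As a rough sketch, the TLS variant of the output block could look like the following. Only the endpoint differs from the plain HTTP example. The commented-out tls_ca_cert_path line is an assumption for setups where the ingress certificate is signed by a private CA, and its path is a placeholder.

```conf
<match **>
  @type http

  # TLS ingress endpoint; the URI path is the same as in the plain HTTP example
  endpoint https://<KFUSE_INGESTER_IP>:443/ingester/v1/fluentd
  open_timeout 2

  # Assumption: only needed if the ingress certificate is signed by a private CA
  # tls_ca_cert_path /path/to/ca.crt

  <format>
    @type json
  </format>
  <buffer>
    # 1 MB chunks, flushed every 10 seconds
    chunk_limit_size 1048576
    flush_interval 10s
  </buffer>
</match>
```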

Configuration

The Kloudfuse UI lets you filter log events by log labels/tags. You’ll find the label selectors and filters in the left navigation bar of the UI. For a seamless experience with Kloudfuse when using the Fluentd agent, we recommend the following configuration and customizations.

Kubernetes Labels

Fluentd has a filter plugin called kubernetes_metadata (referred to on this page as the kubernetes filter) that enriches each log event with Kubernetes metadata. Refer to the filter’s documentation for details. If you have applications deployed in a Kubernetes environment, we highly recommend enabling this filter for all of them. Here’s an example configuration for this filter:

  <filter **>  # Match everything (** also matches multi-part tags)
    @type kubernetes_metadata
    @id filter_kube_metadata_catalog
    skip_labels false
    skip_container_metadata false
    skip_namespace_metadata false
    skip_master_url true
  </filter>

If you’re going to add any of our recommended filters, ensure that they don’t conflict with the existing filter definitions.

Log source

By default, the Kloudfuse stack uses container_name in the Fluentd payload as the log source. However, this key is populated only if the Fluentd agent is configured with the kubernetes filter. If you want the Kloudfuse stack to use a different key as the log source, include the following section under the logs-parser section of your custom Kloudfuse values.yaml:

kf_parsing_config:
  config: |-
    - remap:
        args:
          kf_source:
            - "$.<KEY_FOR_LOG_SOURCE>" # must be JSONPath
        conditions:
          - matcher: "__kf_agent"
            value: "fluentd"
            op: "=="
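
For example, on hosts that do not run the kubernetes filter, one option is to stamp every record with a key of your own and point kf_source at it. The sketch below uses Fluentd’s record_transformer filter. The key name service and the value my-app are hypothetical; with them, the JSONPath in kf_source would be "$.service".

```conf
# Add a fixed "service" key to every record so it can serve as the log source.
# Both the key name and the value are placeholders for illustration.
<filter **>
  @type record_transformer
  <record>
    service my-app
  </record>
</filter>
```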

Log message

The Fluentd agent puts the log event message in the log key of the payload. However, this can be overridden in the agent configuration. You can customize the key that the Kloudfuse stack reads the log event message from. To do so, include the following section under the logs-parser section of your custom Kloudfuse values.yaml:

kf_parsing_config:
  config: |-
    - remap:
        args:
          kf_msg:
            - "$.<MSG_KEY_FROM_AGENT_CONFIG>" # must be JSONPath
        conditions:
          - matcher: "__kf_agent"
            value: "fluentd"
            op: "=="

The Kloudfuse stack already looks for the log message under these key names in the payload: "log", "LOG", "Log", "message", "msg", "MSG", "Message".
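
As an illustration of when this override is needed, the sketch below shows an agent configuration that stores the message under a key outside the list above. It assumes a tail source with Fluentd’s none parser. The file paths, the tag, and the key name raw_line are hypothetical. With this configuration, the JSONPath in kf_msg would be "$.raw_line".

```conf
<source>
  @type tail
  # Placeholder paths and tag; adjust to your environment
  path /var/log/myapp/app.log
  pos_file /var/log/fluentd-myapp.pos
  tag myapp.log
  <parse>
    # The none parser stores each line, unparsed, under message_key
    @type none
    message_key raw_line
  </parse>
</source>
```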

Agent extracted key-value pairs

Fluentd supports various parsers that extract key-value pairs from an unstructured log. For a full list of parsers, refer to the Fluentd documentation. By default, Kloudfuse adds all of these key-value pairs to log facets, which can be filtered in the UI. Note that Kloudfuse cannot differentiate between these key-value pairs and metadata fields added by any filter other than kubernetes. However, you can customize the Kloudfuse stack by adding a list of key prefixes to treat as labels. To do so, include the following section under the logs-parser section of your custom Kloudfuse values.yaml:

kf_parsing_config:
  config: |-
    - remap:
        args:
          kf_additional_tags:
            - "$.<PREFIX_KEY_FOR_AGENT_KV>" # must be JSONPath
        conditions:
          - matcher: "__kf_agent"
            value: "fluentd"
            op: "=="

kf_additional_tags is a list of key prefixes. Any top-level JSON key that matches one of these prefixes is included as a log label/tag and not as a log facet.

You don’t need to include the keys used for the log source or log message, or the metadata fields added by the kubernetes filter. They’re automatically treated as metadata and not included as log facets.
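
To make the distinction concrete, here is a sketch of an agent configuration that produces both kinds of keys. The first filter parses a JSON string held in the log key, so its fields become facets by default. The second filter adds metadata fields that share the prefix meta_. All key names and values are hypothetical. Assuming prefix matching works as described for kf_additional_tags, listing "$.meta_" there would turn meta_env and meta_team into labels; the exact JSONPath form is an assumption.

```conf
# Extract key-value pairs from the JSON string stored in the "log" key.
# reserve_data keeps the original fields alongside the parsed ones.
<filter **>
  @type parser
  key_name log
  reserve_data true
  <parse>
    @type json
  </parse>
</filter>

# Add metadata fields under a common prefix (hypothetical names and values).
<filter **>
  @type record_transformer
  <record>
    meta_env production
    meta_team payments
  </record>
</filter>
```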