Configuration of Retention Policies
Stream-Level Retention
By default, the Kloudfuse stack is installed with stream-level retention for each stream (Metrics, Events, Logs, Traces). These policies can be set in the global.retentionPolicy section of custom_values.yaml. These are the default settings:
global:
  retentionPolicy:
    logs:
      default:
        retentionTimeValue: 7
        retentionTimeUnit: DAYS
    metrics:
      default:
        retentionTimeValue: 30
        retentionTimeUnit: DAYS
    events:
      default:
        retentionTimeValue: 30
        retentionTimeUnit: DAYS
    traces:
      default:
        retentionTimeValue: 7
        retentionTimeUnit: DAYS
Record-Level Retention
Kloudfuse also supports configuring retention at the record level for each stream. Currently, only the metrics, events, and traces streams are supported.
Note that enabling record-level retention policies has performance implications. It is recommended to add no more than 3 additional policies.
Configuring record-level retention policies requires 4 changes in custom_values.yaml, which are described in the rest of this document.
Configure Additional Retention Policies
In the global section of the helm values, there is a section called retentionPolicy. This section takes the name of each stream and a list of retention policies. As with the stream-level policy, each policy takes retentionTimeValue and retentionTimeUnit fields. Each custom policy also requires a name field.
Always add new custom policies to the end of the list. Do not delete or reorder existing policies.
The example below shows a configuration for metrics and logs, each with a default policy and 2 additional policies.
global:
  retentionPolicy:
    expireRoundUp: 10800000 # 3 hours, in milliseconds. Required.
    metrics:
      default:
        retentionTimeValue: 5
        retentionTimeUnit: DAYS
      custom:
        - retentionTimeValue: 7
          retentionTimeUnit: DAYS
          name: prod
        - retentionTimeValue: 14
          retentionTimeUnit: DAYS
          name: staging
    logs:
      default:
        retentionTimeValue: 14
        retentionTimeUnit: DAYS
      custom:
        - retentionTimeValue: 30
          retentionTimeUnit: DAYS
          name: prod
        - retentionTimeValue: 90
          retentionTimeUnit: DAYS
          name: staging
    events: []
    traces: []
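The rule about only appending custom policies can be checked mechanically before a values file is changed. The following is a minimal Python sketch, not part of Kloudfuse: the function name is_append_only and the policy name dev are hypothetical, and it assumes that the ordered list of policy names is all that matters for the rule.

```python
def is_append_only(old_names, new_names):
    """Return True if new_names keeps old_names as an unchanged prefix.

    Hypothetical helper: it models the documented rule that custom policies
    may only be appended, never deleted or reordered.
    """
    return new_names[:len(old_names)] == old_names


current = ["prod", "staging"]  # custom policy names from the example above

print(is_append_only(current, ["prod", "staging", "dev"]))  # True: appended at the end
print(is_append_only(current, ["staging", "prod"]))         # False: reordered
print(is_append_only(current, ["prod"]))                    # False: a policy was deleted
```

Comparing the proposed list against the deployed one in this way catches an accidental reorder or deletion before the change reaches the cluster.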
Configure Kafka Partitions
The Kafka partitions for each stream must be adjusted to accommodate the new policies. The helm chart checks that the number of configured partitions is divisible by the number of retention policies (default + custom). In the example above, metrics has one default policy and two custom policies, so the Kafka metrics topic requires 3 partitions (or a multiple of 3). This ensures that incoming records are sent to different Kafka partitions.
global:
  kafkaTopics:
    - name: kf_metrics_topic
      partitions: 3
      replicationFactor: 1
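The divisibility requirement can be sanity-checked before running a helm upgrade. The sketch below is an illustration in Python, not the helm chart's actual check: the helper names policy_count and partitions_ok are hypothetical, and the dictionary mirrors the metrics entry from the retention policy example.

```python
def policy_count(stream_cfg):
    """Number of retention policies for a stream: the default plus any custom ones."""
    return 1 + len(stream_cfg.get("custom", []))


def partitions_ok(partitions, stream_cfg):
    """True if the partition count is positive and divisible by the policy count."""
    return partitions > 0 and partitions % policy_count(stream_cfg) == 0


# Mirrors the metrics section of the retention policy example.
metrics_cfg = {
    "default": {"retentionTimeValue": 5, "retentionTimeUnit": "DAYS"},
    "custom": [
        {"name": "prod", "retentionTimeValue": 7, "retentionTimeUnit": "DAYS"},
        {"name": "staging", "retentionTimeValue": 14, "retentionTimeUnit": "DAYS"},
    ],
}

print(policy_count(metrics_cfg))      # 3
print(partitions_ok(3, metrics_cfg))  # True
print(partitions_ok(4, metrics_cfg))  # False
```

With one default and two custom policies, valid partition counts are 3, 6, 9, and so on.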
Configure Relabel Rules
Configuration for Metrics, Traces and Event streams
Kloudfuse uses relabel rules to attach a retention policy to each record. If a record does not match any retention policy specified in global.retentionPolicy.custom, it defaults to the stream-level retention policy. The relabel rules are added in the ingester.config section of custom_values.yaml. Refer to Configuration of Relabel Rules for more details on configuring relabel rules. For retention policies, the relabel rule syntax supports a retention action, with a retention_policy field that specifies which retention policy the rule applies. As with the Prometheus relabel_config syntax, the source_labels, separator, and regex fields are used to match the incoming record.
For example, metrics that have an env label with the value prod can be assigned the prod retention policy, and metrics whose name is prefixed with test_app can be assigned the staging retention policy. Note that the relabel rules are evaluated in order. If a record matches both rules, the last one takes priority.
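The evaluation order described above can be illustrated with a small Python model. This is a sketch of the stated behavior only, not Kloudfuse's implementation or its configuration syntax. The rule dictionaries reuse the field names mentioned in the text, while the function pick_policy, the use of __name__ as the metric-name label, and fully anchored regex matching (the Prometheus convention) are assumptions.

```python
import re

# Hypothetical in-memory form of two retention rules, in configured order.
RULES = [
    {"source_labels": ["env"], "separator": ";",
     "regex": "prod", "retention_policy": "prod"},
    {"source_labels": ["__name__"], "separator": ";",
     "regex": "test_app.*", "retention_policy": "staging"},
]


def pick_policy(labels, rules, default="default"):
    """Evaluate rules in order; the last matching rule wins.

    Falls back to the stream-level default when no rule matches.
    """
    chosen = default
    for rule in rules:
        # Join the selected label values, as relabel_config does.
        value = rule["separator"].join(
            labels.get(name, "") for name in rule["source_labels"]
        )
        if re.fullmatch(rule["regex"], value):
            chosen = rule["retention_policy"]
    return chosen


print(pick_policy({"env": "prod", "__name__": "cpu_usage"}, RULES))      # prod
print(pick_policy({"env": "dev", "__name__": "test_app_reqs"}, RULES))   # staging
print(pick_policy({"env": "prod", "__name__": "test_app_reqs"}, RULES))  # staging
print(pick_policy({"env": "dev", "__name__": "cpu_usage"}, RULES))       # default
```

The third call shows why rule order matters: a record matching both rules ends up with the policy of the rule listed last.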
Configuration for Logs stream
Although the relabel rules for logs are very similar to those for the metrics, traces, and events streams, there are some differences in how the rules are expressed. To configure the retention policy staging for any logs from a source containing the substring -stag-, you can use the following relabel rule:
For more information on how to define and use relabel rules for the logs stream, refer to this documentation.