Version: 96.1

Enterprise

Overview

Kpow's Prometheus egress endpoints follow the OpenMetrics standard.

This allows you to integrate Kpow with your favorite observability tools, such as Prometheus, New Relic, or Grafana, for long-term reporting and alerting.

To get started, see our how-to blog post on alerting and monitoring with Kpow, Prometheus, and AlertManager.

Configuration

To enable Prometheus endpoints set the following environment variable:

PROMETHEUS_EGRESS=true

Once enabled, Kpow will log the available metric endpoints at startup:

* GET /metrics/v1 - all metrics
* GET /metrics/v1/cluster/:cluster-id - metrics for a specific cluster-id
* GET /metrics/v1/connect/:connect-id - metrics for a specific connect-id
* GET /metrics/v1/schema/:schema-id - metrics for a specific schema-id
* GET /metrics/v1/ksqldb/:ksqldb-id - metrics for a specific ksqldb-id
* GET /group-offsets/v1 - all group offset metrics
* GET /offsets/v1 - all topic offset metrics
* GET /streams/v1 - all Kafka streams metrics
* GET /streams/v1/state - all Kafka streams state metrics

The endpoint URLs are available on the same hostname and port that is configured to serve Kpow's user interface.

See Endpoints for detailed documentation on each available metric endpoint.

Authentication

Warning: Prometheus endpoints are not secure by default.

To secure all metric endpoints you can configure basic authentication:

PROMETHEUS_USERNAME=foo
PROMETHEUS_PASSWORD=bar
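
Any client that scrapes the endpoints then has to send these credentials with each request. The following is a minimal Python sketch, assuming standard HTTP Basic authentication; the helper names and the `http://localhost:3000` address in the usage comment are placeholders for illustration, not part of Kpow.

```python
import base64
import urllib.request


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def build_metrics_request(base_url: str, username: str, password: str) -> urllib.request.Request:
    """Prepare (but do not send) a GET request for the /metrics/v1 endpoint."""
    request = urllib.request.Request(base_url.rstrip("/") + "/metrics/v1")
    request.add_header("Authorization", basic_auth_header(username, password))
    return request


def fetch_metrics(base_url: str, username: str, password: str) -> str:
    """Send the request and return the response body as text."""
    request = build_metrics_request(base_url, username, password)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8")


# Usage (placeholder address, example credentials from the configuration above):
#   print(fetch_metrics("http://localhost:3000", "foo", "bar")[:500])
```

Prometheus itself can send the same credentials through the `basic_auth` section of a scrape job.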

Metric names

The Prometheus metric name and label format specifies [a-zA-Z_][a-zA-Z0-9_]* as valid characters. Where Kafka resource names (e.g. groups, topics) contain characters outside of that range Kpow will convert non-matching characters to _.
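
To predict how a resource name will appear in the output, the conversion can be approximated as below. This is a sketch of the rule as described here, not Kpow's actual implementation; the function name is hypothetical, and the handling of a leading digit is an assumption based on the first character class of the pattern.

```python
import re


def sanitize_name(name: str) -> str:
    """Replace every character outside [a-zA-Z0-9_] with an underscore."""
    cleaned = re.sub(r"[^a-zA-Z0-9_]", "_", name)
    # Assumption: a digit is not valid in first position, so it is replaced too.
    if cleaned and cleaned[0].isdigit():
        cleaned = "_" + cleaned[1:]
    return cleaned


print(sanitize_name("orders.v1-retry"))  # orders_v1_retry
```

Note that distinct resource names (for example `orders.v1` and `orders-v1`) map to the same converted name under this rule.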

Metric types

Each metric in the metrics glossary has a corresponding type. Below is an explanation of how to work with each type.

Gauge

A gauge is a metric that represents a single numerical value that can arbitrarily go up and down.

Examples include: broker_bytes_disk and group_count

All metrics in the glossary marked as a meter are also represented as a gauge in the metrics endpoints.

Histogram

A histogram samples observations (usually things like request durations or response sizes) and counts them in configurable buckets. It also provides a sum of all observed values.

Examples include: group_offset_delta, broker_offset_lag and simple_broker_offset_delta

Note: histograms are often used to represent aggregate metrics (such as group_offset_lag) where each topic partition is a bucket. In such cases the histogram values can be used as follows:

group_offset_lag_sum - the actual aggregate lag of the consumer group
group_offset_lag_count - the number of topic partitions used to calculate the sum
group_offset_lag - the percentiles (e.g. quantile="0.95" in the metadata), which show how lag is distributed across topic partitions. In most cases you would use group_offset_lag_sum rather than this value.

Note: the bucket-as-partition pattern applies to all examples listed above. Histograms have been used in this case to reduce the overall cardinality of aggregate metrics, while still providing some useful stats about the individual topic partitions.
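
As an illustration of combining the _sum and _count values, the sketch below reads both from scraped text and derives the mean lag per topic partition. The sample text, the `group` label, and the numbers are made up for the example, and the line parser is deliberately simplified rather than a full exposition-format parser.

```python
import re
from typing import Dict

# Matches sample lines such as: group_offset_lag_sum{group="orders"} 1200.0
SAMPLE_LINE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)


def read_samples(text: str, metric: str) -> Dict[str, float]:
    """Return {label string: value} for every sample of `metric` in `text`."""
    samples: Dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment/metadata lines
        match = SAMPLE_LINE.match(line)
        if match and match.group("name") == metric:
            samples[match.group("labels") or ""] = float(match.group("value"))
    return samples


def mean_lag_per_partition(text: str) -> Dict[str, float]:
    """Divide each group's aggregate lag by its number of topic partitions."""
    sums = read_samples(text, "group_offset_lag_sum")
    counts = read_samples(text, "group_offset_lag_count")
    return {
        labels: total / counts[labels]
        for labels, total in sums.items()
        if counts.get(labels)  # skip missing or zero counts
    }


# Illustrative input only; label names and values are hypothetical.
EXAMPLE = """
group_offset_lag{group="orders",quantile="0.95"} 380.0
group_offset_lag_sum{group="orders"} 1200.0
group_offset_lag_count{group="orders"} 4.0
"""

print(mean_lag_per_partition(EXAMPLE))  # {'group="orders"': 300.0}
```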

Endpoints

Kpow provides Prometheus endpoints for all metrics, topic and group offsets, and streams.

Base metrics

The base /metrics/v1 endpoint, without an added path, returns all metrics found in the metrics glossary for all Kafka clusters and resources.

https://HOSTNAME:PORT/metrics/v1

If you want base metrics for only a specific Kafka cluster or resource, append one of the following to the path:

https://HOSTNAME:PORT/metrics/v1/cluster/CLUSTER_ID

https://HOSTNAME:PORT/metrics/v1/schema/SCHEMA_ID

https://HOSTNAME:PORT/metrics/v1/ksqldb/KSQLDB_ID

https://HOSTNAME:PORT/metrics/v1/connect/CONNECT_ID
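
When generating scrape targets programmatically, the path patterns above can be assembled with a small helper. This is a sketch only: `metrics_url` and the `kpow.example.com` host are hypothetical names, and percent-encoding the resource ID is an assumption about how unusual characters should be handled.

```python
from typing import Optional
from urllib.parse import quote

# Resource kinds that appear in the path patterns above.
RESOURCE_KINDS = ("cluster", "connect", "schema", "ksqldb")


def metrics_url(base_url: str, kind: Optional[str] = None,
                resource_id: Optional[str] = None) -> str:
    """Build the base metrics URL, optionally scoped to one resource."""
    url = base_url.rstrip("/") + "/metrics/v1"
    if kind is None:
        return url  # all metrics for all clusters and resources
    if kind not in RESOURCE_KINDS:
        raise ValueError(f"unknown resource kind: {kind!r}")
    if not resource_id:
        raise ValueError("resource_id is required when kind is given")
    # Assumption: IDs are percent-encoded so they stay a single path segment.
    return f"{url}/{kind}/{quote(resource_id, safe='')}"


print(metrics_url("https://kpow.example.com:3000"))
print(metrics_url("https://kpow.example.com:3000", "cluster", "CLUSTER_ID"))
```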

Topic offset metrics

The /offsets/v1 endpoint returns topic offset information at a topic partition level.

https://HOSTNAME:PORT/offsets/v1

Available metrics (topic partition granularity):

partition_start
partition_end
topic_end_sum

Group offset metrics

The /group-offsets/v1 endpoint returns group offset information for assigned topic partitions.

https://HOSTNAME:PORT/group-offsets/v1

Available metrics (group assignment granularity):

group_assignment_delta
group_assignment_first_observed
group_assignment_last_read
group_assignment_offset

Kafka Streams metrics

Note: these endpoints expose the client metrics collected from all running Kpow Streams agents.

Base metrics

The /streams/v1 endpoint returns all Kafka Streams metrics for all configured Kafka Streams agents.

Note: only metrics allowed by the configured io.factorhouse.kpow.MetricFilter of the agent will appear in the Prometheus endpoint.

https://HOSTNAME:PORT/streams/v1

State metrics

The /streams/v1/state endpoint returns Kafka Streams state information for all configured Kafka Streams agents.

This maps to the KafkaStreams.State enum.

https://HOSTNAME:PORT/streams/v1/state

Sample scraper configuration

Sample Prometheus scraper configuration that we use to test Kpow:

scrape_configs:
  - job_name: 'kpow'
    metrics_path: '/metrics/v1'
    static_configs:
      - targets: ['host.docker.internal:3000']
  - job_name: 'kpow_streams'
    metrics_path: '/streams/v1'
    static_configs:
      - targets: ['host.docker.internal:3000']
  - job_name: 'kpow_offsets'
    metrics_path: '/offsets/v1'
    static_configs:
      - targets: ['host.docker.internal:3000']
