Tutorial
Projection expressions enable you to extract specific fields from a record using a comma-separated list of kJQ expressions.
You can apply projection expressions to both the key and value fields of Kafka records. This is especially useful for large records where you only need a subset of fields for your query.
Projection expression concepts
Object identifier-index
Object identifier-index filters extract values from JSON objects using field names. These are the most common way to access object properties and navigate nested data structures.
- .foo: Extract the value of field "foo" from an object
- .foo.bar: Chain field access (equivalent to .foo | .bar)
- ."field-name": Access fields with special characters or that start with digits
- .["foo"]: Bracket notation for field access (equivalent to .foo)
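To make the access rules concrete, here is a minimal Python sketch that mimics them on a parsed JSON object. It is not kJQ itself: the helper get_path and the sample record are hypothetical and exist only for illustration.

```python
from functools import reduce

def get_path(record, *fields):
    """Follow a chain of field names, like .foo.bar (or .foo | .bar)."""
    return reduce(lambda obj, name: obj[name], fields, record)

# Hypothetical sample object, not taken from a real topic.
record = {"foo": {"bar": 42}, "field-name": "x", "1st": "y"}

foo = get_path(record, "foo")             # .foo  (same as .["foo"])
bar = get_path(record, "foo", "bar")      # .foo.bar
special = get_path(record, "field-name")  # ."field-name"
digit = get_path(record, "1st")           # ."1st"
```

Each step simply looks up one field on the result of the previous step, which is why .foo.bar and .foo | .bar are interchangeable.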
Array index
Array index filters access specific elements within arrays using their position. Arrays are zero-indexed, meaning the first element is at position 0.
- .[0]: Access the first element of an array (zero-indexed)
- .[-1]: Access the last element (negative indexing)
- .[-2]: Access the second-to-last element
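Python lists follow the same zero-based and negative indexing conventions, so the filters above can be illustrated with a short sketch. The events list is a made-up example, and the snippet says nothing about how kJQ treats out-of-range positions.

```python
# Hypothetical array value used only for illustration.
events = ["created", "authorized", "captured", "settled"]

first = events[0]            # .[0]
last = events[-1]            # .[-1]
second_to_last = events[-2]  # .[-2]
```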
Array/String slice
Array slice filters extract subsequences from arrays or substrings from strings. They use the syntax .[start:end], where start is inclusive and end is exclusive, similar to Python slicing.
- .[2:5]: Extract elements from index 2 (inclusive) to 5 (exclusive)
- .[1:]: Extract elements from index 1 to the end
- .[:3]: Extract elements from the beginning to index 3 (exclusive)
- .[2:-1]: Extract elements from index 2 up to, but not including, the last element
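Because the slice bounds behave like Python slicing, the four forms above can be tried directly on a Python list or string. The values below are illustrative only.

```python
# Hypothetical values used only for illustration.
items = [10, 11, 12, 13, 14, 15, 16]
text = "provisional"

middle = items[2:5]    # .[2:5]  -> indexes 2, 3, 4
tail = items[1:]       # .[1:]   -> everything after the first element
head = items[:3]       # .[:3]   -> indexes 0, 1, 2
inner = items[2:-1]    # .[2:-1] -> index 2 up to, not including, the last
prefix = text[0:4]     # .[0:4] applied to a string
```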
Example
Consider the following Kafka record:
{
  "topic": "tx_partner",
  "offset": 44914,
  "partition": 6,
  "key": "e1419afe-8404-48e3-b64b-05052322914d",
  "value": {
    "id": "e1419afe-8404-48e3-b64b-05052322914d",
    "partner": {
      "auth": "CRYPTOGRAM_3DS",
      "id": "Merch A",
      "name": "A Reseller YC",
      "network": "MASTERCARD"
    },
    "trade": {
      "compliance": {
        "audit": false
      },
      "currency": "USD",
      "fraction": 34,
      "price": "1.34",
      "status": "provisional",
      "unit": 1
    },
    "version": 4
  }
}
The projection expression .key, .value.trade.currency, .value.partner.id, .value.partner.auth, .value.version produces the following output:
{
  "topic": "tx_partner",
  "offset": 44914,
  "partition": 6,
  "key": "e1419afe-8404-48e3-b64b-05052322914d",
  "value": {
    "trade": {
      "currency": "USD"
    },
    "partner": {
      "id": "Merch A",
      "auth": "CRYPTOGRAM_3DS"
    },
    "version": 4
  }
}
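The effect of a projection on the value can be approximated in Python as "copy only the listed paths and rebuild their parent objects". The sketch below is an illustration under that assumption, not the actual implementation: the project helper is hypothetical, paths are written as tuples of field names rather than parsed from kJQ, and only the partner and version fields of the example record are used.

```python
def project(record, paths):
    """Copy only the listed paths from record; a path is a tuple of field names."""
    result = {}
    for path in paths:
        src, dst = record, result
        for name in path[:-1]:
            src = src[name]
            dst = dst.setdefault(name, {})  # rebuild parent objects on demand
        dst[path[-1]] = src[path[-1]]
    return result

# Part of the "value" object from the example record.
value = {
    "id": "e1419afe-8404-48e3-b64b-05052322914d",
    "partner": {
        "auth": "CRYPTOGRAM_3DS",
        "id": "Merch A",
        "name": "A Reseller YC",
        "network": "MASTERCARD",
    },
    "version": 4,
}

# Mirrors .value.partner.id, .value.partner.auth, .value.version
projected = project(value, [("partner", "id"), ("partner", "auth"), ("version",)])
```

Fields that are not listed, such as partner.name and partner.network, are absent from projected, while the nesting of the selected fields is preserved.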
Notes on conflicts
When multiple projection expressions are specified, they cannot overlap or create ambiguous selection boundaries.
The following types of conflicts are not allowed:
Parent-child conflicts
A projection expression cannot select both a parent object and its nested properties.
Example: .payload, .payload.userId selects both the entire payload and a specific field within it.
Array element conflicts
A projection expression cannot select both an array and its individual elements, or multiple positions within the same array.
Examples:
- .payload.events, .payload.events[0]: selects both the entire events array and a specific event
- .payload.transactions[0], .payload.transactions[1]: selects multiple elements from the same array level
Nested path conflicts
A projection expression cannot select overlapping nested paths that would create ambiguous results.
Example: .payload.user[0], .payload.user.profile creates uncertainty about whether user is an array or an object.
Resolving conflicts
To resolve any conflicts in your projection expressions:
- Choose either the parent path OR the specific nested paths, but not both
- For arrays, select either the entire array OR specific elements with distinct processing logic. You can also select part of an array with a slice, such as .payment.transactions[0:3].
- Ensure all projection expression paths represent non-overlapping, unambiguous data selections
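The conflict rules above can be approximated with a small pre-check before submitting a query. The following Python sketch is an assumption-laden illustration, not the validator the product uses: segments, conflicts and find_conflicts are hypothetical helpers, only simple paths are handled (no quoted field names containing dots), and because the text does not say how two slices of the same array are treated, the sketch conservatively flags them.

```python
import re
from itertools import combinations

def segments(path):
    """Split '.a.b[0]' into ['a', 'b', '[0]'] (simple paths only)."""
    return re.findall(r"\[[^\]]*\]|[^.\[\]]+", path)

def conflicts(a, b):
    """Return True when two paths overlap under the rules sketched above."""
    for x, y in zip(segments(a), segments(b)):
        if x != y:
            # Diverging at an array position (or array vs. field) is ambiguous;
            # diverging at two different field names is fine.
            return x.startswith("[") or y.startswith("[")
    # No divergence: one path is a prefix of the other (parent-child).
    return True

def find_conflicts(paths):
    """List every conflicting pair among the given paths."""
    return [(a, b) for a, b in combinations(paths, 2) if conflicts(a, b)]

problems = find_conflicts([".payload", ".payload.userId", ".key"])
```

Here problems contains the single pair (".payload", ".payload.userId"), the parent-child conflict from the first example, while sibling fields such as .value.partner.id and .value.partner.auth pass the check.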