Data policies
Kpow supports configurable redaction of Data inspection results with Data policies.
Data policies are defined in a YAML file and configured with an environment variable:
DATA_POLICY_CONFIGURATION_FILE=/path/to/masking/config.yml
Data policies are a declarative way of defining how redactions are applied to query results (both Data inspect and ksqlDB queries).
Kpow supports redaction of the key, value, and header attributes of records, for both scalar types (e.g. strings) and structured data types (e.g. maps, collections).
Structured data redaction currently supports Protobuf, Avro, JSON, Transit, and EDN data formats, as well as Custom Serdes with JSON format.
String serdes are removed from Data inspect when Data policies are configured, as they could be used to circumvent redaction.
Exclusions
Define exclusions: in your Data policies YAML file to exclude specific topics from redaction and allow them to be inspected with String serdes.
exclusions:
  topics: ["tx_meta", "tx_metrics"]
Data policies
The YAML configuration defines policies. Each policy contains:
- name: the unique name of the data policy
- resources: the resources governed by the policy
- category: the category for this policy
- redaction: the redaction function to be applied
- type: the type of data (either scalar or non-scalar)
- fields: the fields to redact for non-scalar data
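The policy attributes listed above can be modelled in a few lines of Python to sanity-check a configuration before deploying it. This is only an illustrative sketch: the `DataPolicy` class and `validate` function are hypothetical names, not part of Kpow, and the rule that a non-scalar policy must list at least one field is an assumption drawn from the description of `fields`.

```python
from dataclasses import dataclass, field

VALID_TYPES = {"scalar", "non-scalar"}


@dataclass
class DataPolicy:
    """Hypothetical in-memory model of one entry under `policies:`."""
    name: str
    category: str
    resources: list
    redaction: str
    type: str
    fields: list = field(default_factory=list)


def validate(policies):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    seen = set()
    for p in policies:
        # Policy names are described as unique.
        if p.name in seen:
            problems.append(f"duplicate policy name: {p.name}")
        seen.add(p.name)
        if p.type not in VALID_TYPES:
            problems.append(f"{p.name}: type must be scalar or non-scalar")
        # Assumption: a non-scalar policy without fields has nothing to redact.
        if p.type == "non-scalar" and not p.fields:
            problems.append(f"{p.name}: non-scalar policies need fields")
    return problems
```

A check like this only covers the shape of the configuration; the Data Policy Sandbox remains the place to confirm how Kpow itself applies a policy.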
Example YAML
Example: A Credit Card policy that shows only the last four digits of specific fields in all topics.
policies:
  - name: Credit Card
    category: PII
    resources:
      - [ 'cluster', '*', 'topic', '*', 'value']
    redaction: ShowLast4
    type: non-scalar
    fields: [ credit_card, creditcard, pan ]
Resource
Resources are defined through a taxonomy that describes the hierarchy of objects in Kpow:
[DOMAIN_TYPE, DOMAIN_ID, OBJECT_TYPE?, OBJECT_ID?, OBJECT_RESOURCE?]
Where:
- DOMAIN_TYPE: always cluster for data policies
- DOMAIN_ID: the ID of the cluster or * for all clusters
- OBJECT_TYPE: always topic for data policies
- OBJECT_ID: the name of the topic or * for all topics
- OBJECT_RESOURCE: (optional) either key, headers or value
Specifying a topic, and a key, headers, or value resource within it, is optional.
Example Resources
Resource | Effect |
---|---|
["cluster", "*"] | All clusters and topics |
["cluster", "N9xnGujkR32eYxHICeaHuQ"] | All topics for a specific cluster |
["cluster", "*", "topic", "MyTopic"] | Specific topic on all clusters (key and value) |
["cluster", "*", "topic", "MyTopic", "key"] | Specific topic on all clusters (key only) |
["cluster", "*", "topic", "*", "value"] | All topics on all clusters (value only) |
["cluster", "*", "topic", "MyTopic", "headers"] | Specific topic on all clusters (headers only) |
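One way to read the table above is as prefix matching with wildcards: a resource pattern governs every record location that begins with it, and `*` stands in for any cluster ID or topic name. The Python sketch below illustrates that reading only. The function name `resource_matches` is hypothetical, and the assumption that a shorter pattern covers everything beneath it is inferred from the table rather than taken from Kpow's implementation.

```python
def resource_matches(pattern, resource):
    """Return True if `pattern` governs the concrete `resource` path.

    `resource` is a full path such as
    ["cluster", "<cluster-id>", "topic", "<topic-name>", "value"].
    """
    # A pattern longer than the path cannot be a prefix of it.
    if len(pattern) > len(resource):
        return False
    for position, (want, got) in enumerate(zip(pattern, resource)):
        # Positions 1 and 3 hold DOMAIN_ID and OBJECT_ID, where "*" is a wildcard.
        if want == "*" and position in (1, 3):
            continue
        if want != got:
            return False
    return True


record = ["cluster", "N9xnGujkR32eYxHICeaHuQ", "topic", "MyTopic", "value"]
print(resource_matches(["cluster", "*", "topic", "*", "value"], record))
print(resource_matches(["cluster", "*", "topic", "MyTopic", "key"], record))
```

Under this reading, the first call matches because both wildcards accept the record's cluster and topic, while the second does not because the pattern targets the key and the record location is the value.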
Redaction Functions
Supported redaction functions include:
Redaction | Description | Example Data | Example Result |
---|---|---|---|
Full | Fully redact the matched value | John Smith | ************ |
SHAHash | Apply a SHA512 hash to the value | John Smith | ed014a19bb67a.. |
ShowEmailHost | Show the email host | [email protected] | *********@corp.org |
ShowEmailPart | Show first character and host | [email protected] | j********@corp.org |
ShowFirst | Show the first character | John Smith | J********* |
ShowFirst2 | Show the first two characters | John Smith | Jo******** |
ShowFirst4 | Show the first four characters | John Smith | John****** |
ShowFirst6 | Show the first six characters | John Smith | John S**** |
ShowLast | Show the last character | John Smith | *********h |
ShowLast2 | Show the last two characters | John Smith | ********th |
ShowLast4 | Show the last four characters | John Smith | ******mith |
ShowLast6 | Show the last six characters | John Smith | **** Smith |
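To make the masking behaviour in the table concrete, the following Python sketch reproduces the ShowFirst, ShowLast, ShowEmailHost and SHAHash rows. It is not Kpow's implementation: the function names are hypothetical, the one-asterisk-per-hidden-character rule is inferred from the ShowFirst and ShowLast example results, UTF-8 encoding for the hash is an assumption, and the handling of inputs shorter than the visible window is not specified by the table. The address `jane.doe@corp.org` is made up for illustration.

```python
import hashlib

MASK = "*"


def show_first(value, n):
    """Keep the first n characters and mask the rest."""
    visible = value[:n]
    return visible + MASK * (len(value) - len(visible))


def show_last(value, n):
    """Keep the last n characters and mask the rest."""
    visible = value[-n:] if n > 0 else ""
    return MASK * (len(value) - len(visible)) + visible


def show_email_host(value):
    """Mask the local part of an email address and keep the host."""
    local, separator, host = value.rpartition("@")
    if not separator:
        # Assumption: a value without "@" is masked entirely.
        return MASK * len(value)
    return MASK * len(local) + "@" + host


def sha_hash(value):
    """Hex-encoded SHA-512 digest of the value (UTF-8 is an assumption)."""
    return hashlib.sha512(value.encode("utf-8")).hexdigest()


print(show_first("John Smith", 4))
print(show_last("John Smith", 4))
print(show_email_host("jane.doe@corp.org"))
```

The variants in the table (ShowFirst2, ShowLast6 and so on) correspond to different values of `n` in this sketch.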
Nested Redaction
Kpow supports redaction of nested data structures.
Example: Applying the example Credit Card policy to a JSON message.
{
  "user_details": {
    "email_address": "[email protected]",
    "payment_options": [
      { "credit_card": "376953644924215" }
    ]
  }
}
The data is masked accordingly when displayed in Data inspect search results:
{
  "user_details": {
    "email_address": "[email protected]",
    "payment_options": [
      { "credit_card": "***4215" }
    ]
  }
}
Kpow is conservative when applying data policies. When the selected redaction function cannot be applied to a field, Kpow falls back to the Full redaction function, e.g.:
{
  "user_details": {
    "email_address": "[email protected]",
    "payment_options": [
      {
        "credit_card": {
          "pan": "376953644924215",
          "expiry": "10/10/2010"
        }
      }
    ]
  }
}
Applying the same Credit Card policy to this data incurs a Full redaction at the credit_card field as Kpow does not know how to apply the configured "ShowLast4" redactor to a structured value (in this case a map with "pan" and "expiry" fields).
The result is effectively truncated:
{
  "user_details": {
    "email_address": "[email protected]",
    "payment_options": [
      { "credit_card": "***" }
    ]
  }
}
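The nested redaction and fallback behaviour described above can be sketched as a recursive walk over the message. This is a minimal illustration, not Kpow's code: `redact`, `show_last4` and `FULL_MASK` are hypothetical names, and the mask strings simply mirror the example output shown above, so the mask length Kpow actually emits may differ.

```python
FULL_MASK = "***"  # mirrors the fully redacted value in the example above


def show_last4(value):
    """Stand-in for the ShowLast4 redactor, shaped like the example output."""
    return "***" + value[-4:]


def redact(data, fields, redactor):
    """Return a redacted copy of nested dicts and lists."""
    if isinstance(data, dict):
        result = {}
        for key, value in data.items():
            if key in fields:
                # Strings get the configured redactor; structured values
                # fall back to full redaction and are not descended into.
                result[key] = redactor(value) if isinstance(value, str) else FULL_MASK
            else:
                result[key] = redact(value, fields, redactor)
        return result
    if isinstance(data, list):
        return [redact(item, fields, redactor) for item in data]
    return data


fields = {"credit_card", "creditcard", "pan"}
flat = {"user_details": {"payment_options": [{"credit_card": "376953644924215"}]}}
nested = {"user_details": {"payment_options": [
    {"credit_card": {"pan": "376953644924215", "expiry": "10/10/2010"}}
]}}
print(redact(flat, fields, show_last4))
print(redact(nested, fields, show_last4))
```

Falling back to the full mask as soon as a matched field holds a structured value is the conservative choice: the sketch never has to guess which parts of an unexpected shape are safe to show.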
Data Policy Sandbox
Kpow comes with a built-in Data Policy Sandbox for experimenting with your currently configured policies or creating and testing new configurations.
To access the Data Policy Sandbox, navigate to Admin -> Data policies.