Apache Kafka | Kafka | Open-source platform for building real-time data pipelines and streaming apps. Highly scalable and fault-tolerant. |
Kafka Broker | Kafka | Manages topic partitions, stores events, and serves producers and consumers. |
Topic | Kafka | Logical channel that organizes messages, divided into partitions for scalability. |
Partition | Kafka | Ordered, append-only log that makes up one unit of a topic; enables parallelism and load distribution across brokers and consumers. |
Consumer Group | Kafka | A set of consumers that share the workload of reading from a topic. |
Schema Registry | Kafka | Centralized service for storing and validating schemas (Avro, JSON, Protobuf). |
Kafka Connect | Kafka | Connects Kafka to external systems using Source and Sink connectors. |
Kafka Streams | Kafka | Java library for building real-time apps on top of Kafka. |
ksqlDB | Kafka | Database for streaming data that lets you query Kafka topics with SQL syntax. |
Apache Flink | Flink | Framework and engine for stateful computations over unbounded and bounded data streams. |
Flink SQL | Flink | Enables querying and transforming data streams and batch datasets using SQL. |
Flink SQL Gateway | Flink | Service that allows submitting SQL queries to Flink from external clients. |
Flink Connectors | Flink | Connect Flink jobs to external systems (Kafka, JDBC, S3, Elasticsearch, etc.). |
Event Streaming | Streaming | Continuous movement of data from source to destination systems. |
Stream Processing | Streaming | Real-time processing and analysis of unbounded data streams. |
Batch Processing | Streaming | Analysis or transformation of finite datasets, often on a schedule. |
Event-Driven Architecture | Streaming | Software design pattern where services communicate via events. |
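Watermark | Flink | Timestamp-based marker that tracks event-time progress by signaling that no events with an earlier timestamp are expected. |
Exactly-Once Semantics (EOS) | Kafka, Flink | Guarantees that the effects of each event are applied once and only once, even after failures and retries. |
Offset | Kafka | Sequential position of a record within a Kafka partition; a consumer’s committed offset marks how far it has read. |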
Checkpoint | Flink | A snapshot of state and position in stream processing for recovery. |
Savepoint | Flink | User-triggered, persistent snapshot for controlled job upgrades and restarts. |
Retention Policy | Kafka | Configuration for how long Kafka retains topic data before deletion or compaction. |
Log Compaction | Kafka | Kafka feature that retains at least the latest value for each key within a topic and discards older values for that key. |
Rebalance | Kafka | Process of redistributing partitions among consumers when group membership changes. |
Producer Acknowledgment (acks) | Kafka | Kafka producer setting controlling when a write is considered successful. |
Replication Factor | Kafka | Number of copies of a Kafka partition maintained across brokers. |
Leader Partition | Kafka | The partition replica, hosted on one broker, that handles writes and (by default) reads for a given topic partition. |
Follower Partition | Kafka | Replica that copies data from the leader partition for redundancy and failover. |
Message Key | Kafka | Optional identifier used to route messages to partitions; with the default partitioner, messages with the same key go to the same partition. |
Backpressure | Streaming | Mechanism by which a system slows data ingestion to avoid overwhelming downstream components. |
Late Event | Streaming | An event that arrives after the watermark has passed its event-time window. |
Windowing | Flink | Technique in stream processing to group records by time intervals (tumbling, sliding, session). |
Side Output | Flink | Allows outputting records to an alternate stream separate from the main pipeline. |
Stateful Processing | Flink | Maintaining and updating state across events for aggregations or joins. |
State Backend | Flink | Component responsible for storing Flink’s processing state (e.g., RocksDB, in-memory). |
Tumbling Window | Flink | Processes events in consecutive, non-overlapping time intervals. |
Sliding Window | Flink | Processes events in intervals that can overlap, enabling more frequent updates. |
Session Window | Flink | Groups events into sessions of activity that are delimited by gaps of inactivity. |
Watermark Strategy | Flink | Configuration determining how and when watermarks are generated. |
Dead Letter Queue (DLQ) | Kafka, Flink | Topic or queue for storing records that failed processing. |
Exactly-Once Sink | Flink | Sink connector ensuring no duplicate or lost records even after failure. |
RBAC | Kpow, Flex | Role-Based Access Control: mechanism to restrict system access to authorized users based on roles. |
SAML | Kpow, Flex | XML-based protocol for exchanging authentication and authorization data between parties. |
OpenID / OAuth 2.0 | Kpow, Flex | OpenID Connect is an authentication protocol built on top of OAuth 2.0, which is an authorization framework. |
Multi-tenancy | Kpow, Flex | Design approach where a single instance serves multiple customers (tenants). |
kJQ | Kpow | A subset of JQ implemented in Kpow to filter and query Kafka messages. |
SerDes | Kpow | Serializers/deserializers: components that convert data to/from byte streams for transmission/storage. |
kREPL | Kpow | Interactive environment for exploring and manipulating Kafka data. |
Audit Log | Kpow, Flex | Record of user actions and system events kept for security and compliance. |
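
The tumbling, sliding, and session window entries above are easier to compare side by side in code. The following is a minimal sketch in plain Python, not Flink's implementation: the function names and the integer timestamps are hypothetical, and the session rule used here (a new session starts once the gap between consecutive events reaches `gap`) is a simplifying assumption.

```python
from collections import defaultdict


def tumbling_windows(timestamps, size):
    """Assign each timestamp to exactly one fixed-size, non-overlapping window."""
    windows = defaultdict(list)
    for ts in timestamps:
        start = ts - (ts % size)  # align to the window boundary at or before ts
        windows[(start, start + size)].append(ts)
    return dict(windows)


def sliding_windows(timestamps, size, slide):
    """Assign each timestamp to every overlapping window [start, start + size) covering it."""
    windows = defaultdict(list)
    for ts in timestamps:
        start = ts - (ts % slide)  # latest window start at or before ts
        while start > ts - size:   # step back while the window still covers ts
            windows[(start, start + size)].append(ts)
            start -= slide
    return dict(windows)


def session_windows(timestamps, gap):
    """Group timestamps into sessions, splitting where inactivity reaches `gap`."""
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] < gap:
            sessions[-1].append(ts)  # still within the current session
        else:
            sessions.append([ts])    # inactivity gap reached: start a new session
    return sessions


events = [1, 2, 6, 7, 15]  # hypothetical event-time timestamps
tumbling = tumbling_windows(events, size=5)
sliding = sliding_windows(events, size=10, slide=5)
sessions = session_windows(events, gap=5)
```

With these inputs, each event lands in one tumbling window, in two sliding windows (`size / slide`), and in one of two sessions, since only the jump from 7 to 15 reaches the gap.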