Glossary

| Term | Applicable To | Description |
| --- | --- | --- |
| Apache Kafka | Kafka | Open-source platform for building real-time data pipelines and streaming apps. Highly scalable and fault-tolerant. |
| Kafka Broker | Kafka | Manages topic partitions, stores events, and serves producers and consumers. |
| Topic | Kafka | Logical channel that organizes messages, divided into partitions for scalability. |
| Partition | Kafka | Ordered, append-only subset of a topic; enables parallelism and load distribution. |
| Consumer Group | Kafka | A set of consumers that share the workload of reading from a topic (see the consumer sketch after this table). |
| Schema Registry | Kafka | Centralized service for storing and validating schemas (Avro, JSON, Protobuf). |
| Kafka Connect | Kafka | Connects Kafka to external systems using Source and Sink connectors. |
| Kafka Streams | Kafka | Java library for building real-time apps on top of Kafka. |
| ksqlDB | Kafka | Streaming database that lets you query Kafka topics with SQL syntax. |
| Apache Flink | Flink | Framework and engine for stateful computations over unbounded and bounded data streams. |
| Flink SQL | Flink | Enables querying and transforming data streams and batch datasets using SQL. |
| Flink SQL Gateway | Flink | Service that allows submitting SQL queries to Flink from external clients. |
| Flink Connectors | Flink | Connect Flink jobs to external systems (Kafka, JDBC, S3, Elasticsearch, etc.). |
| Event Streaming | Streaming | Continuous movement of data from source to destination systems. |
| Stream Processing | Streaming | Real-time processing and analysis of unbounded data streams. |
| Batch Processing | Streaming | Analysis or transformation of finite datasets, often on a schedule. |
| Event-Driven Architecture | Streaming | Software design pattern where services communicate via events. |
| Watermark | Flink | Mechanism in stream processing to track event-time completeness. |
| Exactly-Once Semantics (EOS) | Kafka, Flink | Ensures each event is processed exactly once, even after failures. |
| Offset | Kafka | Marks a consumer's position in a Kafka partition. |
| Checkpoint | Flink | A snapshot of state and position in stream processing for recovery. |
| Savepoint | Flink | User-triggered, persistent snapshot for controlled job upgrades and restarts. |
| Retention Policy | Kafka | Configuration for how long Kafka retains topic data before deletion or compaction. |
| Log Compaction | Kafka | Kafka feature that retains only the latest value for each key within a topic. |
| Rebalance | Kafka | Process of redistributing partitions among consumers when group membership changes. |
| Producer Acknowledgment (acks) | Kafka | Kafka producer setting controlling when a write is considered successful (see the producer sketch after this table). |
| Replication Factor | Kafka | Number of copies of a Kafka partition maintained across brokers. |
| Leader Partition | Kafka | The partition replica that handles reads and writes for a given topic partition. |
| Follower Partition | Kafka | Replica that maintains a copy of the leader partition for redundancy and failover. |
| Message Key | Kafka | Optional identifier used to route messages to specific partitions. |
| Backpressure | Streaming | When a system slows data ingestion to avoid overwhelming downstream components. |
| Late Event | Streaming | An event that arrives after the watermark has passed its event-time window. |
| Windowing | Flink | Technique in stream processing to group records by time intervals (tumbling, sliding, session); see the windowing sketch after this table. |
| Side Output | Flink | Allows outputting records to an alternate stream separate from the main pipeline. |
| Stateful Processing | Flink | Maintaining and updating state across events for aggregations or joins. |
| State Backend | Flink | Component responsible for storing Flink's processing state (e.g., RocksDB, in-memory). |
| Tumbling Window | Flink | Processes events in consecutive, non-overlapping time intervals. |
| Sliding Window | Flink | Processes events in overlapping intervals, enabling more frequent updates. |
| Session Window | Flink | Groups events separated by periods of inactivity. |
| Watermark Strategy | Flink | Configuration determining how and when watermarks are generated. |
| Dead Letter Queue (DLQ) | Kafka, Flink | Topic or queue for storing records that failed processing. |
| Exactly-Once Sink | Flink | Sink connector ensuring no duplicate or lost records, even after failures. |
| RBAC | Kpow, Flex | Role-based access control: restricts system access to authorized users based on their roles. |
| SAML | Kpow, Flex | XML-based protocol for exchanging authentication and authorization data between parties. |
| OpenID / OAuth 2.0 | Kpow, Flex | OpenID Connect is an authentication protocol; OAuth 2.0 is an authorization framework. |
| Multi-tenancy | Kpow, Flex | Design approach where a single instance serves multiple customers (tenants). |
| kJQ | Kpow | A subset of jq implemented in Kpow to filter and query Kafka messages. |
| SerDes | Kpow | Serializers/deserializers: components that convert data to/from byte streams for transmission and storage. |
| kREPL | Kpow | Interactive environment for exploring and manipulating Kafka data. |
| Audit Log | Kpow, Flex | Logs that track user actions and system events for security and compliance. |
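
To make producer acknowledgments and message keys more concrete, here is a minimal Java producer sketch. The broker address (`localhost:9092`), topic name (`orders`), and key (`customer-42`) are illustrative assumptions, not part of any particular deployment.

```java
// Minimal sketch: a keyed write with acks=all (broker, topic, and key are assumed values).
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class AcksAndKeyExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed broker address
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());
        // acks=all: the write is acknowledged only after all in-sync replicas have it.
        props.put("acks", "all");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // The key is hashed to choose a partition, so records with the same key
            // always land on the same partition and keep their relative order.
            producer.send(new ProducerRecord<>("orders", "customer-42", "{\"total\": 99.95}"));
        }
    }
}
```

With `acks=1` the leader alone confirms the write, and with `acks=0` the producer does not wait for any acknowledgment, trading durability for latency.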
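
A consumer group and its offsets can be sketched in the same way; the group id, topic name, and broker address below are assumptions for illustration.

```java
// Minimal sketch: consumers sharing a group id split the topic's partitions between them,
// and committed offsets record the group's position in each partition.
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ConsumerGroupExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed broker address
        props.put("group.id", "orders-processors");       // assumed consumer group id
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());
        props.put("enable.auto.commit", "false");          // commit offsets explicitly below

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("orders")); // assumed topic name
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    // The offset marks this consumer's position within the partition.
                    System.out.printf("partition=%d offset=%d value=%s%n",
                            record.partition(), record.offset(), record.value());
                }
                consumer.commitSync(); // persist the group's progress
            }
        }
    }
}
```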
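
Finally, a sketch of event-time windowing with Flink's DataStream API: a tumbling one-minute window over keyed click counts, with a watermark strategy that tolerates five seconds of out-of-order arrival. The element values and timestamps are made up for illustration; a sliding or session window would only change the window assigner.

```java
// Minimal sketch, assuming (page, count, eventTimeMillis) elements from an in-memory source.
import java.time.Duration;
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.java.tuple.Tuple3;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

public class TumblingWindowExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Stand-in source: (page, count, event-time timestamp in millis).
        DataStream<Tuple3<String, Long, Long>> clicks = env.fromElements(
                Tuple3.of("home", 1L, 1_000L),
                Tuple3.of("home", 1L, 30_000L),
                Tuple3.of("checkout", 1L, 45_000L));

        clicks
            // Watermark strategy: accept events arriving up to 5 seconds out of order.
            .assignTimestampsAndWatermarks(
                WatermarkStrategy.<Tuple3<String, Long, Long>>forBoundedOutOfOrderness(Duration.ofSeconds(5))
                    .withTimestampAssigner((event, previousTimestamp) -> event.f2))
            .keyBy(event -> event.f0)
            // Tumbling window: consecutive, non-overlapping one-minute buckets.
            .window(TumblingEventTimeWindows.of(Time.minutes(1)))
            .sum(1)
            .print();

        env.execute("Tumbling window sketch");
    }
}
```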