DevLearn
Kafka Fundamentals

Understanding Apache Kafka

Foundation concepts every Java developer needs


Lesson 1
15 min

What is Apache Kafka?

An introduction to the world's most popular distributed streaming platform

Apache Kafka is an open-source distributed event streaming platform capable of handling trillions of events a day. Originally developed by LinkedIn and later donated to the Apache Software Foundation, Kafka has become the de facto standard for building real-time data pipelines and streaming applications.

Core Components

Topics

Named categories (feeds) to which records are published. Each topic is split into partitions for parallelism.

Producers

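Publish data to topics and choose a partition for each record: by hashing the key when one is set, otherwise by spreading records across partitions (round-robin in older clients, sticky batching by default since Kafka 2.4).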

Consumers

Subscribe to topics and process records. Organized into consumer groups; within a group, each partition is read by only one consumer.

Brokers

Servers that store data and serve client requests. Form the Kafka cluster.
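
To make the producer and consumer descriptions above more concrete, here is a minimal Java sketch of key-based partition selection and of how partitions might be divided among the consumers in one group. It is a simplified simulation, not Kafka's implementation: the class PartitionSketch and its methods are hypothetical names, String.hashCode() stands in for the murmur2 hash that Kafka's default partitioner applies to the serialized key, and the round-robin assignment is only one of several assignment strategies.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: not Kafka's real partitioner or assignor.
final class PartitionSketch {

    // Map a record key to a partition. Kafka's default partitioner hashes the
    // serialized key bytes with murmur2; String.hashCode() is a stand-in here.
    static int partitionFor(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    // Spread partitions 0..numPartitions-1 across the consumers of one group,
    // round-robin, so each partition has exactly one owner within the group.
    static Map<String, List<Integer>> assign(List<String> consumers, int numPartitions) {
        Map<String, List<Integer>> assignment = new LinkedHashMap<>();
        for (String consumer : consumers) {
            assignment.put(consumer, new ArrayList<>());
        }
        for (int p = 0; p < numPartitions; p++) {
            String owner = consumers.get(p % consumers.size());
            assignment.get(owner).add(p);
        }
        return assignment;
    }

    public static void main(String[] args) {
        int numPartitions = 3;
        // The same key always lands on the same partition.
        for (String key : List.of("customer-1", "customer-2", "customer-1")) {
            System.out.println(key + " -> partition " + partitionFor(key, numPartitions));
        }
        // prints {consumer-a=[0, 2], consumer-b=[1]}
        System.out.println(assign(List.of("consumer-a", "consumer-b"), numPartitions));
    }
}
```

Because a given key always maps to the same partition, and each partition has a single reader per group, records that share a key are processed in order by one consumer.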

Code Example

# Create a Kafka topic (replication factor 1 suits a single local broker only)
kafka-topics.sh --create --topic orders \
  --bootstrap-server localhost:9092 \
  --partitions 3 --replication-factor 1

# List topics
kafka-topics.sh --list --bootstrap-server localhost:9092

# Produce messages (each line typed on stdin is sent as one record)
kafka-console-producer.sh --topic orders \
  --bootstrap-server localhost:9092

# Consume messages
kafka-console-consumer.sh --topic orders \
  --bootstrap-server localhost:9092 --from-beginning

Key Concepts

High Throughput

Handle millions of messages per second with latencies as low as a few milliseconds

Durable Storage

Messages persist on disk with configurable retention policies

Scalable

Scale horizontally by adding more brokers to the cluster

Fault Tolerant

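Partitions are replicated across brokers according to a configurable replication factor; if a partition leader fails, an in-sync replica takes over automatically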

Key Insight

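Kafka is often described as a "distributed commit log" or "distributed streaming platform." Think of it as a durable, scalable, and highly available append-only log: unlike a traditional message queue, it does not delete a record once it has been consumed, so events stay available for as long as the retention policy allows and can be re-read by any consumer group.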
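
The commit-log idea can be sketched in a few lines of Java. The example below is a single-partition, in-memory stand-in, not Kafka's storage engine: CommitLogSketch and its methods are hypothetical names, and retention is expressed as a record count for brevity, whereas Kafka configures retention by time or size. It shows that each record gets an increasing offset, that each consumer group tracks its own position, and that reading does not remove data.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: a single-partition, in-memory stand-in for a Kafka log.
final class CommitLogSketch {
    private final List<String> records = new ArrayList<>();
    private long baseOffset = 0; // offset of records.get(0)
    private final Map<String, Long> committed = new HashMap<>(); // group -> next offset to read

    // Append a record and return the offset it was assigned.
    long append(String value) {
        records.add(value);
        return baseOffset + records.size() - 1;
    }

    // Return every record the group has not read yet, then advance its offset.
    // Reading never removes records, so other groups still see them.
    List<String> poll(String group) {
        long next = Math.max(committed.getOrDefault(group, baseOffset), baseOffset);
        List<String> batch =
                new ArrayList<>(records.subList((int) (next - baseOffset), records.size()));
        committed.put(group, baseOffset + records.size());
        return batch;
    }

    // Retention: drop the oldest records until at most maxRecords remain.
    // (Kafka's retention is configured by time or size; a count keeps this short.)
    void enforceRetention(int maxRecords) {
        while (records.size() > maxRecords) {
            records.remove(0);
            baseOffset++;
        }
    }

    public static void main(String[] args) {
        CommitLogSketch log = new CommitLogSketch();
        log.append("order-created");
        log.append("order-paid");
        System.out.println(log.poll("billing"));   // [order-created, order-paid]
        System.out.println(log.poll("analytics")); // [order-created, order-paid]
        System.out.println(log.poll("billing"));   // []
        log.append("order-shipped");
        log.enforceRetention(1);
        System.out.println(log.poll("billing"));   // [order-shipped]
    }
}
```

Only the retention step discards data; both groups read the same two records independently, which is the main behavioral difference from a queue that hands each message to a single reader and then deletes it.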