Apache Kafka is a popular distributed streaming platform that is widely used for building real-time data pipelines and streaming applications. One of the fundamental concepts in Kafka is a “topic,” which is a logical channel or category to which messages are published and from which messages are consumed. In this tutorial, we will explore how to create and describe topics in Apache Kafka, along with relevant examples.

Table of Contents

  1. Introduction to Kafka Topics
  2. Creating a Kafka Topic
     • Using the Kafka Command Line Tools
     • Using Kafka APIs (Java)
  3. Describing a Kafka Topic
     • Understanding Topic Configuration
     • Describing Topics with Kafka Tools
  4. Example Scenarios
     • Example 1: Creating and Describing a Topic via Command Line
     • Example 2: Creating and Describing a Topic Programmatically
  5. Conclusion

1. Introduction to Kafka Topics

In Kafka, a topic is a logical channel or feed name to which records are sent by producers and from which records are consumed by consumers. Topics allow for data segregation and organization within a Kafka cluster, enabling different applications to process distinct streams of data. Each topic consists of partitions, which are the basic units of parallelism and scalability in Kafka.
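
To make the role of partitions more concrete, here is a minimal sketch of how keyed records could be spread across the partitions of a topic. It is only an illustration: the class name, the method partitionFor, and the use of String.hashCode() are assumptions made for this example. Kafka's own default partitioner hashes the serialized key with a different hash function, so the partition numbers printed here will not match a real cluster.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class PartitionSketch {
    // Simplified stand-in for a partitioner (assumption: not Kafka's real hash function).
    // floorMod keeps the result in the range [0, numPartitions) even for negative hash codes.
    public static int partitionFor(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    public static void main(String[] args) {
        int numPartitions = 3;
        Map<Integer, List<String>> assignment = new TreeMap<>();
        // Records with the same key always map to the same partition
        for (String key : List.of("alice", "bob", "carol", "alice", "bob")) {
            assignment.computeIfAbsent(partitionFor(key, numPartitions), p -> new ArrayList<>()).add(key);
        }
        assignment.forEach((p, keys) -> System.out.println("partition " + p + " <- " + keys));
    }
}
```

The point of the sketch is that each partition receives an independent subset of the records, so separate consumers can process different partitions in parallel while records that share a key stay together.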

2. Creating a Kafka Topic

Using the Kafka Command Line Tools

To create a topic using the Kafka command line tools, you can use the kafka-topics.sh script that ships with the Kafka installation. Here’s the basic syntax:

kafka-topics.sh --create --topic <topic-name> --partitions <num-partitions> --replication-factor <replication-factor> --bootstrap-server <broker-host:port>
  • <topic-name>: The name of the topic you want to create.
  • <num-partitions>: The number of partitions for the topic.
  • <replication-factor>: The number of copies of each partition kept across brokers. It cannot exceed the number of brokers in the cluster.
  • <broker-host:port>: The address of a Kafka broker used to connect to the cluster. Older releases connected through ZooKeeper with --zookeeper <zookeeper-host:port> instead; --bootstrap-server has been available since Kafka 2.2, and the --zookeeper option was removed from kafka-topics.sh in Kafka 3.0.

For example, to create a topic named “user_activity” with 3 partitions and a replication factor of 2 (which requires at least 2 brokers), you can use the following command:

kafka-topics.sh --create --topic user_activity --partitions 3 --replication-factor 2 --bootstrap-server localhost:9092
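
Before running the create command from a script, it can help to check the topic name up front. The sketch below is a hedged illustration, not Kafka's own validation code: the class and method names are made up for this example, and the rules it encodes (ASCII letters, digits, '.', '_' and '-', at most 249 characters, and not "." or "..") reflect the broker's naming rules as commonly documented.

```java
import java.util.regex.Pattern;

public class TopicNameCheck {
    // Assumed naming rules: ASCII alphanumerics plus '.', '_' and '-'
    private static final Pattern LEGAL = Pattern.compile("[a-zA-Z0-9._-]+");
    private static final int MAX_LENGTH = 249;

    public static boolean isValidTopicName(String name) {
        if (name == null || name.isEmpty()) return false;
        if (name.equals(".") || name.equals("..")) return false;
        if (name.length() > MAX_LENGTH) return false;
        return LEGAL.matcher(name).matches();
    }

    public static void main(String[] args) {
        for (String name : new String[] {"user_activity", "website.events", "bad topic", ".."}) {
            System.out.println(name + " -> " + isValidTopicName(name));
        }
    }
}
```

Note that names containing '.' or '_' are legal, but the command line tool may print a warning for them because the two characters can collide in metric names; it is usually simplest to pick one of the two and use it consistently.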

Using Kafka APIs (Java)

Kafka provides APIs for different programming languages to interact with the Kafka cluster programmatically. Here’s an example using the Kafka Java API to create a topic:

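import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;
import java.util.Collections;
import java.util.Properties;

public class CreateTopicExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");

        // AdminClient is AutoCloseable, so try-with-resources closes it for us
        try (AdminClient adminClient = AdminClient.create(props)) {
            // Topic name, number of partitions, replication factor
            NewTopic newTopic = new NewTopic("user_activity", 3, (short) 2);
            // createTopics() is asynchronous; all().get() blocks until the request succeeds or fails
            adminClient.createTopics(Collections.singleton(newTopic)).all().get();
        }
    }
}

In this example, we create an AdminClient and define a NewTopic object with the topic name, number of partitions, and replication factor. Then, we call adminClient.createTopics() to create the topic. Because the call is asynchronous, we wait on the returned future with all().get(); otherwise the client could be closed before the request completes, and errors such as an already existing topic would go unnoticed.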

3. Describing a Kafka Topic

Understanding Topic Configuration

Kafka topics come with a variety of configuration options that control how records are stored, retained, and consumed. Some common configuration parameters include:

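  • retention.ms: How long, in milliseconds, Kafka retains messages in the topic before they become eligible for deletion.
  • cleanup.policy: How old log segments are handled: delete (discard them), compact (keep the latest record per key), or both.
  • compression.type: The compression codec used for the topic’s stored data (for example gzip, snappy, lz4, or zstd), or producer to keep whatever codec the producer used.
  • min.insync.replicas: The minimum number of in-sync replicas that must acknowledge a write for a produce request sent with acks=all to be considered successful.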

These configurations can significantly impact the behavior and performance of your Kafka topics.
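
As a small illustration of the parameters listed above, the sketch below assembles them into a plain map of strings, which is the form that NewTopic.configs(...) accepts. The class name, the helper buildConfigs, and the chosen values are assumptions for this example only; delete and producer are used because they are the usual defaults, not because they are recommendations.

```java
import java.time.Duration;
import java.util.LinkedHashMap;
import java.util.Map;

public class TopicConfigSketch {
    // Builds topic-level settings; all values are illustrative (assumption)
    public static Map<String, String> buildConfigs(Duration retention, int minInsyncReplicas) {
        Map<String, String> configs = new LinkedHashMap<>();
        // Kafka expects the retention period in milliseconds, as a string
        configs.put("retention.ms", String.valueOf(retention.toMillis()));
        configs.put("cleanup.policy", "delete");
        configs.put("compression.type", "producer");
        configs.put("min.insync.replicas", String.valueOf(minInsyncReplicas));
        return configs;
    }

    public static void main(String[] args) {
        // 7 days of retention, at least 2 in-sync replicas for acks=all writes
        buildConfigs(Duration.ofDays(7), 2).forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

Computing retention.ms from a Duration avoids hand-typed millisecond constants, which are easy to get wrong by a factor of ten.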

Describing Topics with Kafka Tools

To view the configuration and details of a Kafka topic, you can use the kafka-topics.sh script with the --describe option:

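kafka-topics.sh --describe --topic <topic-name> --bootstrap-server <broker-host:port>

This command displays the topic’s partition count, replication factor, and any topic-level configuration overrides, followed by one line per partition showing its leader, replicas, and in-sync replicas (ISR).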

4. Example Scenarios

Example 1: Creating and Describing a Topic via Command Line

Let’s walk through an example scenario where we create a topic named “website_events” with 5 partitions and a replication factor of 3 using the command line tools. A replication factor of 3 requires a cluster with at least 3 brokers; on a single-broker development setup, use --replication-factor 1 instead.

  1. Create the topic:
kafka-topics.sh --create --topic website_events --partitions 5 --replication-factor 3 --bootstrap-server localhost:9092
  2. Describe the topic:
kafka-topics.sh --describe --topic website_events --bootstrap-server localhost:9092

The output will display information about the “website_events” topic, including its configuration and partitions.
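
Kafka rejects a create request whose replication factor is larger than the number of available brokers, so a quick pre-flight check can save a failed command. The following is a minimal sketch under stated assumptions: the class name and the check method are hypothetical, and the broker count is supplied by the caller rather than queried from a cluster.

```java
public class TopicSettingsCheck {
    // Returns "ok" or a short description of the first problem found
    public static String check(int partitions, int replicationFactor, int brokerCount) {
        if (partitions < 1) {
            return "partitions must be at least 1";
        }
        if (replicationFactor < 1) {
            return "replication factor must be at least 1";
        }
        if (replicationFactor > brokerCount) {
            return "replication factor " + replicationFactor + " exceeds broker count " + brokerCount;
        }
        return "ok";
    }

    public static void main(String[] args) {
        // website_events settings against a 3-broker cluster and a single-broker setup
        System.out.println(check(5, 3, 3));
        System.out.println(check(5, 3, 1));
    }
}
```

With the settings from this example, the check passes for a 3-broker cluster and reports a problem for a single broker.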

Example 2: Creating and Describing a Topic Programmatically

Now, let’s create the same “website_events” topic programmatically using the Kafka Java API and then describe it:

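import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.Config;
import org.apache.kafka.clients.admin.NewTopic;
import org.apache.kafka.clients.admin.TopicDescription;
import org.apache.kafka.common.config.ConfigResource;
import org.apache.kafka.common.config.TopicConfig;
import java.util.Collections;
import java.util.Properties;

public class CreateAndDescribeTopicExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        String topic = "website_events";

        try (AdminClient adminClient = AdminClient.create(props)) {
            NewTopic newTopic = new NewTopic(topic, 5, (short) 3);
            newTopic.configs(Collections.singletonMap(TopicConfig.RETENTION_MS_CONFIG, "604800000")); // 7 days retention
            // Block until the topic has been created
            adminClient.createTopics(Collections.singleton(newTopic)).all().get();

            // allTopicNames() requires kafka-clients 3.1 or later; older clients use all() instead
            TopicDescription topicDesc = adminClient.describeTopics(Collections.singleton(topic))
                    .allTopicNames().get().get(topic);
            System.out.println("Topic: " + topicDesc.name());
            System.out.println("Partitions: " + topicDesc.partitions().size());
            // TopicDescription has no replication-factor accessor; count the replicas of one partition
            System.out.println("Replication factor: " + topicDesc.partitions().get(0).replicas().size());

            // Topic configuration is fetched separately with describeConfigs()
            ConfigResource resource = new ConfigResource(ConfigResource.Type.TOPIC, topic);
            Config config = adminClient.describeConfigs(Collections.singleton(resource)).all().get().get(resource);
            System.out.println("retention.ms: " + config.get(TopicConfig.RETENTION_MS_CONFIG).value());
        }
    }
}

In this example, we set the retention period to 7 days using the configs method and wait for createTopics to complete. We then call describeTopics to read the partition layout. Because TopicDescription does not expose a replication factor or the topic configuration directly, the replication factor is taken from the replica list of the first partition, and the retention setting is fetched with describeConfigs.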

5. Conclusion

In this tutorial, we’ve explored how to create and describe topics in Apache Kafka using both the command line tools and Kafka APIs. Topics are a foundational concept in Kafka, allowing you to organize and manage streams of data within a Kafka cluster. By creating topics with appropriate configurations, you can tailor Kafka to meet the specific requirements of your streaming applications. Describing topics helps you gain insights into their configurations and characteristics, enabling you to fine-tune and optimize your Kafka deployment for better performance and reliability.
