3 posts tagged with "MSK"

· 4 min read
Javier Montón

When working with Kafka, increasing or decreasing the number of brokers isn't as trivial as it seems. If you add a new broker, it will sit there idle, because Kafka does not move existing partitions onto it automatically. You have to manually reassign partitions of your topics to the new broker. But you don't want to move a few topics entirely to the new broker; you want to spread your partitions and their replicas evenly across all your brokers. You also want the number of partition leaders to be balanced across all your brokers.

Reassign partitions

To reassign partitions to different brokers, you can use the script shipped with the Kafka binaries (bin/kafka-reassign-partitions.sh), but doing it by hand isn't practical if you have to reassign thousands of topics.
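With many topics, the reassignment plan is usually generated rather than written by hand. The following is a minimal sketch, not the method used in the post: it builds a plan in the JSON shape that kafka-reassign-partitions.sh accepts, placing replicas round-robin so that both replicas and preferred leaders (the first replica in each list) end up evenly spread. The function name build_reassignment, the topic names, and the broker ids are assumptions made for illustration.

```python
import json
from typing import Dict, List


def build_reassignment(
    topics: Dict[str, int], brokers: List[int], replication_factor: int
) -> dict:
    """Spread partitions, replicas, and preferred leaders round-robin over brokers.

    topics maps a topic name to its partition count (hypothetical input).
    """
    if replication_factor > len(brokers):
        raise ValueError("replication factor cannot exceed the number of brokers")

    partitions = []
    slot = 0  # running counter so consecutive partitions start on different brokers
    for topic, partition_count in sorted(topics.items()):
        for partition in range(partition_count):
            # The first broker in the list is the preferred leader.
            replicas = [
                brokers[(slot + i) % len(brokers)]
                for i in range(replication_factor)
            ]
            partitions.append(
                {"topic": topic, "partition": partition, "replicas": replicas}
            )
            slot += 1
    return {"version": 1, "partitions": partitions}


# Hypothetical cluster: two topics with three partitions each, three brokers.
plan = build_reassignment(
    {"orders": 3, "payments": 3}, brokers=[1, 2, 3], replication_factor=2
)
print(json.dumps(plan, indent=2))
```

The printed JSON can be saved to a file and passed to the script with --reassignment-json-file. This sketch ignores the current placement, so it may move more data than strictly necessary.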

The binary file has three operations:

· 7 min read
Javier Montón

This post is about Kafka Connect and MirrorMaker 2, how they manage offsets, and how to deal with them.

Kafka Offsets

When a consumer starts consuming messages from Kafka, it will usually belong to a consumer group, and Kafka will store the offset that the group has committed for each partition it reads, which marks where the group resumes after a restart. These offsets are stored in an internal Kafka topic called __consumer_offsets.
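As a toy model of this bookkeeping (not the real binary record format), commits can be pictured as records keyed by consumer group, topic, and partition. Because __consumer_offsets is a compacted topic, only the latest commit per key matters. The names OffsetKey, compact, and the sample groups below are hypothetical.

```python
from typing import Dict, List, Tuple

# (consumer group, topic, partition): simplified stand-in for the record key
OffsetKey = Tuple[str, str, int]


def compact(commit_log: List[Tuple[OffsetKey, int]]) -> Dict[OffsetKey, int]:
    """Keep only the most recent committed offset for each key."""
    latest: Dict[OffsetKey, int] = {}
    for key, offset in commit_log:
        latest[key] = offset  # a later commit overwrites an earlier one
    return latest


# Hypothetical sequence of commits from two consumer groups.
commit_log = [
    (("billing", "orders", 0), 5),
    (("billing", "orders", 1), 2),
    (("billing", "orders", 0), 9),
    (("audit", "orders", 0), 3),
]

committed = compact(commit_log)
for (group, topic, partition), offset in sorted(committed.items()):
    print(f"{group} {topic}-{partition} -> {offset}")
```

Each group keeps its own position on the same partition, which is why "billing" and "audit" can read the "orders" topic independently.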

· 20 min read
Javier Montón

A guide to moving data from Kafka to an AWS RDS database using Kafka Connect and the JDBC Sink Connector with IAM Auth.

Kafka Connect

For these examples, we are using Confluent's Kafka Connect Docker image, as we are going to deploy it in a Kubernetes cluster.

Standalone and distributed modes

Kafka Connect comes with two modes of execution, standalone (single) and distributed. In standalone mode, a single worker process runs all the connectors and tasks and keeps its state locally. In distributed mode, several workers that share the same group.id form a cluster: they store connector configs, offsets, and status in Kafka topics and balance connectors and tasks among themselves. The distributed mode is the recommended one for production environments, as it provides better scalability and fault tolerance. In the case of K8s, it means we will be using more than one pod to run Kafka Connect.
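In distributed mode, the workers find each other through a shared group.id and keep their state in three Kafka topics. With Confluent's Docker image, worker properties are passed as environment variables prefixed with CONNECT_. The sketch below shows that naming convention for simple property names only. The helper to_connect_env and all the values (broker address, group and topic names) are assumptions, and properties containing underscores or dashes need extra handling that is not covered here.

```python
from typing import Dict


def to_connect_env(properties: Dict[str, str]) -> Dict[str, str]:
    """Map worker properties to CONNECT_* environment variable names.

    Only handles dotted names such as "group.id" (hypothetical helper).
    """
    return {
        "CONNECT_" + name.upper().replace(".", "_"): str(value)
        for name, value in properties.items()
    }


# Hypothetical values; every worker (pod) must use the same group.id
# and the same three storage topics to join the same Connect cluster.
worker_properties = {
    "bootstrap.servers": "kafka:9092",
    "group.id": "connect-cluster",
    "config.storage.topic": "connect-configs",
    "offset.storage.topic": "connect-offsets",
    "status.storage.topic": "connect-status",
}

env = to_connect_env(worker_properties)
for name, value in env.items():
    print(f"{name}={value}")
```

In a Kubernetes deployment, these pairs would typically go into the env section of the pod spec, identical for every replica.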