From 6193357d87f4dce4c6832ac4f9a208d2a7a0539c Mon Sep 17 00:00:00 2001
From: Dustin Cote
Date: Tue, 24 May 2016 17:26:54 -0700
Subject: [PATCH] KAFKA-3683; Add file descriptor recommendation to ops guide

Adding sizing recommendations for file descriptors to the ops guide.

Author: Dustin Cote
Author: Dustin Cote
Reviewers: Gwen Shapira

Closes #1353 from cotedm/KAFKA-3683 and squashes the following commits:

8120318 [Dustin Cote] Adding file descriptor sizing recommendations
0908aa9 [Dustin Cote] Merge https://github.com/apache/kafka into trunk
32315e4 [Dustin Cote] Merge branch 'trunk' of https://github.com/cotedm/kafka into trunk
13309ed [Dustin Cote] Update links for new consumer API
4dcffc1 [Dustin Cote] Update links for new consumer API

(cherry picked from commit 0e1c012fb551f32cf27b6b7367749047c374ee97)
Signed-off-by: Gwen Shapira
---
 docs/ops.html | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/docs/ops.html b/docs/ops.html
index faf54535860..74131293455 100644
--- a/docs/ops.html
+++ b/docs/ops.html
@@ -468,13 +468,12 @@ Kafka should run well on any unix system and has been tested on Linux and Solari

We have seen a few issues running on Windows and Windows is not currently a well supported platform though we would be happy to change that.

-You likely don't need to do much OS-level tuning though there are a few things that will help performance.
-
-Two configurations that may be important:
+Kafka is unlikely to require much OS-level tuning, but there are two potentially important OS-level configurations:

-  • We upped the number of file descriptors since we have lots of topics and lots of connections.
-  • We upped the max socket buffer size to enable high-performance data transfer between data centers described here.
+  • File descriptor limits: Kafka uses file descriptors for log segments and open connections. If a broker hosts many partitions, consider that the broker needs at least (number_of_partitions)*(partition_size/segment_size) file descriptors to track all log segments, in addition to one for each connection the broker makes. We recommend at least 100000 allowed file descriptors for the broker processes as a starting point.
+  • Max socket buffer size: can be increased to enable high-performance data transfer between data centers as described here.

+

Disks and Filesystem

We recommend using multiple drives to get good throughput and not sharing the same drives used for Kafka data with application logs or other OS filesystem activity to ensure good latency. You can either RAID these drives together into a single volume or format and mount each drive as its own directory. Since Kafka has replication the redundancy provided by RAID can also be provided at the application level. This choice has several tradeoffs.
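
Reviewer note: the file descriptor estimate added by this patch can be sanity-checked with a few lines of arithmetic. The sketch below is only an illustration of the formula (number_of_partitions)*(partition_size/segment_size) plus connections. The function name and all input values are hypothetical and are not part of Kafka or of this patch. It counts one descriptor per log segment, as the formula does, so treat the result as a lower bound.

```python
def estimate_broker_fds(num_partitions, partition_size_bytes,
                        segment_size_bytes, num_connections):
    """Lower-bound file descriptor estimate for one broker.

    Implements (number_of_partitions) * (partition_size / segment_size)
    plus the number of connections, rounding the segment count up so a
    partially filled segment still counts as one open file.
    """
    # Integer ceiling division avoids floating-point rounding.
    segments_per_partition = -(-partition_size_bytes // segment_size_bytes)
    return num_partitions * segments_per_partition + num_connections


GIB = 1024 ** 3

# Hypothetical broker: 2000 partitions of about 50 GiB each,
# 1 GiB segments, and 5000 open connections.
estimate = estimate_broker_fds(2000, 50 * GIB, 1 * GIB, 5000)
print(estimate)  # 105000
```

With these assumed inputs the estimate lands just above the 100000 starting point recommended in the patch, which suggests that value is a floor to revisit for brokers hosting many large partitions rather than a universal setting.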