mirror of https://github.com/apache/kafka.git
port paragraph from CP docs (#7808)
The AK Streams architecture docs should explain how the maximum parallelism is determined.
Reviewers: Bill Bejeck <bbejeck@gmail.com>
parent 54b26c0aa4
commit 9758e2d7fd
@@ -52,6 +52,14 @@
 these record buffers. As a result stream tasks can be processed independently and in parallel without manual intervention.
 </p>
 
+<p>
+Slightly simplified, the maximum parallelism at which your application may run is bounded by the maximum number of stream tasks, which itself is determined by
+maximum number of partitions of the input topic(s) the application is reading from. For example, if your input topic has 5 partitions, then you can run up to 5
+applications instances. These instances will collaboratively process the topic’s data. If you run a larger number of app instances than partitions of the input
+topic, the “excess” app instances will launch but remain idle; however, if one of the busy instances goes down, one of the idle instances will resume the former’s
+work.
+</p>
+
 <p>
 It is important to understand that Kafka Streams is not a resource manager, but a library that "runs" anywhere its stream processing application runs.
 Multiple instances of the application are executed either on the same machine, or spread across multiple machines and tasks can be distributed automatically
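The rule in the added paragraph can be sketched in a few lines of plain Java. This is only an illustration of the simplified arithmetic described in the commit, not Kafka Streams code: the class ParallelismSketch, its methods, and the topic name "input-topic" are hypothetical, and the sketch ignores details such as multiple sub-topologies or standby tasks.

```java
import java.util.Map;

// Hypothetical helper for illustration only; not part of the Kafka Streams API.
public class ParallelismSketch {

    // Simplified bound from the paragraph: the maximum number of stream tasks
    // is the largest partition count among the input topics.
    static int maxStreamTasks(Map<String, Integer> partitionsPerInputTopic) {
        int max = 0;
        for (int partitions : partitionsPerInputTopic.values()) {
            max = Math.max(max, partitions);
        }
        return max;
    }

    // Instances beyond the task count still launch but receive no work,
    // so they sit idle until a busy instance goes down.
    static int idleInstances(int runningInstances, int maxTasks) {
        return Math.max(0, runningInstances - maxTasks);
    }

    public static void main(String[] args) {
        // The example from the paragraph: one input topic with 5 partitions.
        int tasks = maxStreamTasks(Map.of("input-topic", 5));
        System.out.println("max stream tasks: " + tasks);
        System.out.println("idle instances when 7 are running: " + idleInstances(7, tasks));
    }
}
```

With 5 partitions and 7 running instances, the sketch reports 5 tasks and 2 idle instances, which matches the "excess" instances the paragraph describes.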