From e847f057e31cbd50db854f12f2eafe0d1a865068 Mon Sep 17 00:00:00 2001 From: Bill Bejeck Date: Wed, 4 Nov 2020 08:30:10 -0500 Subject: [PATCH] KAFKA-10679: [Streams] migrate kafka-site updated docs to kafka/docs (#9554) During the AK website upgrade, changes made to kafka-site weren't migrated back to kafka-docs. This PR is an attempt at porting the streams changes to kafka/docs For the most part, the bulk of the changes in the PR are cosmetic. For testing: I reviewed the PR diffs Rendered the changes locally Reviewers: John Roesler --- .../developer-guide/dsl-topology-naming.html | 2 +- docs/streams/architecture.html | 12 +- docs/streams/core-concepts.html | 18 +- .../developer-guide/app-reset-tool.html | 12 +- .../developer-guide/config-streams.html | 412 +++++++++--------- docs/streams/developer-guide/datatypes.html | 18 +- docs/streams/developer-guide/dsl-api.html | 241 ++++------ .../developer-guide/dsl-topology-naming.html | 66 +-- docs/streams/developer-guide/index.html | 4 +- .../developer-guide/interactive-queries.html | 41 +- .../developer-guide/manage-topics.html | 4 +- docs/streams/developer-guide/memory-mgmt.html | 24 +- .../developer-guide/processor-api.html | 34 +- docs/streams/developer-guide/running-app.html | 11 +- docs/streams/developer-guide/security.html | 17 +- docs/streams/developer-guide/testing.html | 96 ++-- .../developer-guide/write-streams.html | 30 +- docs/streams/index.html | 32 +- docs/streams/quickstart.html | 106 ++--- docs/streams/tutorial.html | 174 +++----- docs/streams/upgrade-guide.html | 33 +- 21 files changed, 557 insertions(+), 830 deletions(-) diff --git a/docs/documentation/streams/developer-guide/dsl-topology-naming.html b/docs/documentation/streams/developer-guide/dsl-topology-naming.html index db5eee368df..9f42a04ac0e 100644 --- a/docs/documentation/streams/developer-guide/dsl-topology-naming.html +++ b/docs/documentation/streams/developer-guide/dsl-topology-naming.html @@ -16,4 +16,4 @@ --> - + diff --git a/docs/streams/architecture.html b/docs/streams/architecture.html index 43de9e793b4..67594c23234 100644 --- a/docs/streams/architecture.html +++ b/docs/streams/architecture.html @@ -41,7 +41,7 @@

-

Stream Partitions and Tasks

+

Stream Partitions and Tasks

The messaging layer of Kafka partitions data for storing and transporting it. Kafka Streams partitions data for processing it. @@ -91,7 +91,7 @@
-

Threading Model

+

Threading Model

Kafka Streams allows the user to configure the number of threads that the library can use to parallelize processing within an application instance. @@ -112,7 +112,7 @@
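For illustration, the thread count is just another configuration property. A minimal sketch, assuming a StreamsBuilder named builder and placeholder values for the application ID and bootstrap server:

Properties props = new Properties();
props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-streams-app");          // placeholder ID
props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "kafka-broker-01:9092"); // placeholder broker
// Use four stream threads in this application instance instead of the default of one
props.put(StreamsConfig.NUM_STREAM_THREADS_CONFIG, 4);
KafkaStreams streams = new KafkaStreams(builder.build(), props);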


-

Local State Stores

+

Local State Stores

Kafka Streams provides so-called state stores, which can be used by stream processing applications to store and query data, @@ -131,7 +131,7 @@
-

Fault Tolerance

+

Fault Tolerance

Kafka Streams builds on fault-tolerance capabilities integrated natively within Kafka. Kafka partitions are highly available and replicated; so when stream data is persisted to Kafka it is available @@ -165,10 +165,10 @@ -

+
- +
@@ -581,7 +578,7 @@
  • Whenever data is read from or written to a Kafka topic (e.g., via the StreamsBuilder#stream() and KStream#to() methods).
  • Whenever data is read from or written to a state store.
  • -

    This is discussed in more detail in Data types and serialization.

    +

    This is discussed in more detail in Data types and serialization.

    @@ -629,11 +626,11 @@ -
    -

    Note

    -

    If you enable n standby tasks, you need to provision n+1 KafkaStreams - instances.

    -
    +
    +

    Note

    +

    If you enable n standby tasks, you need to provision n+1 KafkaStreams + instances.
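As a sketch, enabling standby tasks is a single property; running the additional application instances is a deployment concern rather than a configuration setting:

Properties props = new Properties();
// One standby replica per task; with this setting you would run at least two instances
props.put(StreamsConfig.NUM_STANDBY_REPLICAS_CONFIG, 1);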

    +

    num.stream.threads

    @@ -664,22 +661,22 @@

    processing.guarantee

The processing guarantee that should be used.
- Possible values are "at_least_once" (default),
- "exactly_once",
- and "exactly_once_beta".
- Using "exactly_once" requires broker
- version 0.11.0 or newer, while using "exactly_once_beta"
- requires broker version 2.5 or newer.
- Note that if exactly-once processing is enabled, the default for parameter
- commit.interval.ms changes to 100ms.
- Additionally, consumers are configured with isolation.level="read_committed"
- and producers are configured with enable.idempotence=true per default.
- Note that by default exactly-once processing requires a cluster of at least three brokers what is the recommended setting for production.
- For development, you can change this configuration by adjusting broker setting
- transaction.state.log.replication.factor
- and transaction.state.log.min.isr
- to the number of brokers you want to use.
- For more details see Processing Guarantees.
+ Possible values are "at_least_once" (default),
+ "exactly_once",
+ and "exactly_once_beta".
+ Using "exactly_once" requires broker
+ version 0.11.0 or newer, while using "exactly_once_beta"
+ requires broker version 2.5 or newer.
+ Note that if exactly-once processing is enabled, the default for parameter
+ commit.interval.ms changes to 100ms.
+ Additionally, consumers are configured with isolation.level="read_committed"
+ and producers are configured with enable.idempotence=true by default.
+ Note that by default exactly-once processing requires a cluster of at least three brokers, which is the recommended setting for production.
+ For development, you can change this configuration by adjusting the broker settings
+ transaction.state.log.replication.factor
+ and transaction.state.log.min.isr
+ to the number of brokers you want to use.
+ For more details see Processing Guarantees.
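For example, switching an application to exactly-once is a single Streams property; the broker settings mentioned above only need to be lowered on a development cluster with fewer than three brokers. A sketch, assuming a single-broker development setup:

// Streams application configuration
Properties props = new Properties();
props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, StreamsConfig.EXACTLY_ONCE);
// With brokers on version 2.5 or newer you could use "exactly_once_beta" instead

// Broker-side overrides for a single-broker development cluster (server.properties):
// transaction.state.log.replication.factor=1
// transaction.state.log.min.isr=1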
    @@ -729,79 +726,79 @@ Properties streamsSettings = new Properties(); streamsConfig.put(StreamsConfig.ROCKSDB_CONFIG_SETTER_CLASS_CONFIG, CustomRocksDBConfig.class);
    - -
    -
    Notes for example:
    -
      -
    1. BlockBasedTableConfig tableConfig = (BlockBasedTableConfig) options.tableFormatConfig(); Get a reference to the existing table config rather than create a new one, so you don't accidentally overwrite defaults such as the BloomFilter, which is an important optimization. -
    2. tableConfig.setBlockSize(16 * 1024L); Modify the default block size per these instructions from the RocksDB GitHub.
    3. -
    4. tableConfig.setCacheIndexAndFilterBlocks(true); Do not let the index and filter blocks grow unbounded. For more information, see the RocksDB GitHub.
    5. -
    6. options.setMaxWriteBufferNumber(2); See the advanced options in the RocksDB GitHub.
    7. -
    8. cache.close(); To avoid memory leaks, you must close any objects you constructed that extend org.rocksdb.RocksObject. See RocksJava docs for more details.
    9. -
    -
    -
    - - - - -
    -

    state.dir

    -
    -
    The state directory. Kafka Streams persists local states under the state directory. Each application has a subdirectory on its hosting - machine that is located under the state directory. The name of the subdirectory is the application ID. The state stores associated - with the application are created under this subdirectory. When running multiple instances of the same application on a single machine, - this path must be unique for each such instance.
    -
    -
    -
    -

    topology.optimization

    -
    -
    -

    - You can tell Streams to apply topology optimizations by setting this config. The optimizations are currently all or none and disabled by default. - These optimizations include moving/reducing repartition topics and reusing the source topic as the changelog for source KTables. It is recommended to enable this. -

    -

    - Note that as of 2.3, you need to do two things to enable optimizations. In addition to setting this config to StreamsConfig.OPTIMIZE, you'll need to pass in your - configuration properties when building your topology by using the overloaded StreamsBuilder.build(Properties) method. - For example KafkaStreams myStream = new KafkaStreams(streamsBuilder.build(properties), properties). -

    -
    -
    -
    -

    upgrade.from

    -
    -
    - The version you are upgrading from. It is important to set this config when performing a rolling upgrade to certain versions, as described in the upgrade guide. - You should set this config to the appropriate version before bouncing your instances and upgrading them to the newer version. Once everyone is on the - newer version, you should remove this config and do a second rolling bounce. It is only necessary to set this config and follow the two-bounce upgrade path - when upgrading from below version 2.0, or when upgrading to 2.4+ from any version lower than 2.4. -
    -
    +
    +
Notes for the example (a sketch of such a config setter appears after this list):
    +
      +
    1. BlockBasedTableConfig tableConfig = (BlockBasedTableConfig) options.tableFormatConfig(); Get a reference to the existing table config rather than create a new one, so you don't accidentally overwrite defaults such as the BloomFilter, which is an important optimization. +
    2. tableConfig.setBlockSize(16 * 1024L); Modify the default block size per these instructions from the RocksDB GitHub.
    3. +
    4. tableConfig.setCacheIndexAndFilterBlocks(true); Do not let the index and filter blocks grow unbounded. For more information, see the RocksDB GitHub.
    5. +
    6. options.setMaxWriteBufferNumber(2); See the advanced options in the RocksDB GitHub.
    7. +
    8. cache.close(); To avoid memory leaks, you must close any objects you constructed that extend org.rocksdb.RocksObject. See RocksJava docs for more details.
    9. +
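The configurer class itself is not reproduced in this diff; a sketch consistent with the notes above, assuming Kafka Streams 2.3+ where RocksDBConfigSetter also defines close(), might look like:

import java.util.Map;
import org.apache.kafka.streams.state.RocksDBConfigSetter;
import org.rocksdb.BlockBasedTableConfig;
import org.rocksdb.Options;

public class CustomRocksDBConfig implements RocksDBConfigSetter {
    // Keep the cache as a field so it can be released in close() (note 8)
    private final org.rocksdb.Cache cache = new org.rocksdb.LRUCache(16 * 1024L * 1024L);

    @Override
    public void setConfig(final String storeName, final Options options, final Map<String, Object> configs) {
        // Reuse the existing table config rather than creating a new one (note 1)
        final BlockBasedTableConfig tableConfig = (BlockBasedTableConfig) options.tableFormatConfig();
        tableConfig.setBlockCache(cache);
        tableConfig.setBlockSize(16 * 1024L);            // note 2
        tableConfig.setCacheIndexAndFilterBlocks(true);  // note 4
        options.setTableFormatConfig(tableConfig);
        options.setMaxWriteBufferNumber(2);              // note 6
    }

    @Override
    public void close(final String storeName, final Options options) {
        cache.close();                                   // note 8
    }
}

The class is then registered with the streamsSettings.put(StreamsConfig.ROCKSDB_CONFIG_SETTER_CLASS_CONFIG, CustomRocksDBConfig.class) call shown earlier.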
    +
    +
    +
    -
    -

    Kafka consumers, producer and admin client configuration parameters

    -

    You can specify parameters for the Kafka consumers, producers, - and admin client that are used internally. - The consumer, producer and admin client settings are defined by specifying parameters in a StreamsConfig instance.

    -

    In this example, the Kafka consumer session timeout is configured to be 60000 milliseconds in the Streams settings:

    -
    Properties streamsSettings = new Properties();
    +      
    +    
    +
    +

    state.dir

    +
    +
    The state directory. Kafka Streams persists local states under the state directory. Each application has a subdirectory on its hosting + machine that is located under the state directory. The name of the subdirectory is the application ID. The state stores associated + with the application are created under this subdirectory. When running multiple instances of the same application on a single machine, + this path must be unique for each such instance.
    +
    +
    +
    +

    topology.optimization

    +
    +
    +

    + You can tell Streams to apply topology optimizations by setting this config. The optimizations are currently all or none and disabled by default. + These optimizations include moving/reducing repartition topics and reusing the source topic as the changelog for source KTables. It is recommended to enable this. +

    +

    + Note that as of 2.3, you need to do two things to enable optimizations. In addition to setting this config to StreamsConfig.OPTIMIZE, you'll need to pass in your + configuration properties when building your topology by using the overloaded StreamsBuilder.build(Properties) method. + For example KafkaStreams myStream = new KafkaStreams(streamsBuilder.build(properties), properties). +
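Put together, the two steps might look like this sketch (builder is an existing StreamsBuilder; in newer releases the constant is also exposed as TOPOLOGY_OPTIMIZATION_CONFIG):

Properties props = new Properties();
// Step 1: ask Streams to apply topology optimizations
props.put(StreamsConfig.TOPOLOGY_OPTIMIZATION, StreamsConfig.OPTIMIZE);
// Step 2: pass the same properties into build() so the optimizer can act on them
Topology topology = builder.build(props);
KafkaStreams streams = new KafkaStreams(topology, props);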

    +
    +
    +
    +

    upgrade.from

    +
    +
    + The version you are upgrading from. It is important to set this config when performing a rolling upgrade to certain versions, as described in the upgrade guide. + You should set this config to the appropriate version before bouncing your instances and upgrading them to the newer version. Once everyone is on the + newer version, you should remove this config and do a second rolling bounce. It is only necessary to set this config and follow the two-bounce upgrade path + when upgrading from below version 2.0, or when upgrading to 2.4+ from any version lower than 2.4. +
    +
    +
    +
    +
    +

    Kafka consumers, producer and admin client configuration parameters

    +

    You can specify parameters for the Kafka consumers, producers, + and admin client that are used internally. + The consumer, producer and admin client settings are defined by specifying parameters in a StreamsConfig instance.

    +

    In this example, the Kafka consumer session timeout is configured to be 60000 milliseconds in the Streams settings:

    +
    Properties streamsSettings = new Properties();
     // Example of a "normal" setting for Kafka Streams
     streamsSettings.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "kafka-broker-01:9092");
     // Customize the Kafka consumer settings of your Streams application
     streamsSettings.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, 60000);
     
    -
    -
    -

    Naming

    -

    Some consumer, producer and admin client configuration parameters use the same parameter name, and Kafka Streams library itself also uses some parameters that share the same name with its embedded client. For example, send.buffer.bytes and - receive.buffer.bytes are used to configure TCP buffers; request.timeout.ms and retry.backoff.ms control retries for client request; - retries are used to configure how many retries are allowed when handling retriable errors from broker request responses. - You can avoid duplicate names by prefix parameter names with consumer., producer., or admin. (e.g., consumer.send.buffer.bytes and producer.send.buffer.bytes).

    -
    Properties streamsSettings = new Properties();
    +    
    +
    +

    Naming

    +

Some consumer, producer and admin client configuration parameters use the same parameter name, and the Kafka Streams library itself also uses some parameters that share the same name with its embedded client. For example, send.buffer.bytes and
+ receive.buffer.bytes are used to configure TCP buffers; request.timeout.ms and retry.backoff.ms control retries for client requests;
+ the retries parameter configures how many retries are allowed when handling retriable errors from broker request responses.
+ You can avoid duplicate names by prefixing parameter names with consumer., producer., or admin. (e.g., consumer.send.buffer.bytes and producer.send.buffer.bytes).

    +
    Properties streamsSettings = new Properties();
     // same value for consumer, producer, and admin client
     streamsSettings.put("PARAMETER_NAME", "value");
     // different values for consumer and producer
    @@ -813,14 +810,14 @@
     streamsSettings.put(StreamsConfig.producerPrefix("PARAMETER_NAME"), "producer-value");
     streamsSettings.put(StreamsConfig.adminClientPrefix("PARAMETER_NAME"), "admin-value");
     
    -

    You could further separate consumer configuration by adding different prefixes:

    -
      -
    • main.consumer. for main consumer which is the default consumer of stream source.
    • -
    • restore.consumer. for restore consumer which is in charge of state store recovery.
    • -
    • global.consumer. for global consumer which is used in global KTable construction.
    • -
    -

    For example, if you only want to set restore consumer config without touching other consumers' settings, you could simply use restore.consumer. to set the config.

    -
    Properties streamsSettings = new Properties();
    +        

    You could further separate consumer configuration by adding different prefixes:

    +
      +
    • main.consumer. for main consumer which is the default consumer of stream source.
    • +
    • restore.consumer. for restore consumer which is in charge of state store recovery.
    • +
    • global.consumer. for global consumer which is used in global KTable construction.
    • +
    +

    For example, if you only want to set restore consumer config without touching other consumers' settings, you could simply use restore.consumer. to set the config.

    +
    Properties streamsSettings = new Properties();
     // same config value for all consumer types
     streamsSettings.put("consumer.PARAMETER_NAME", "general-consumer-value");
     // set a different restore consumer config. This would make restore consumer take restore-consumer-value,
    @@ -829,103 +826,103 @@
     // alternatively, you can use
     streamsSettings.put(StreamsConfig.restoreConsumerPrefix("PARAMETER_NAME"), "restore-consumer-value");
     
    -
    -

    Same applied to main.consumer. and main.consumer., if you only want to specify one consumer type config.

    -

    Additionally, to configure the internal repartition/changelog topics, you could use the topic. prefix, followed by any of the standard topic configs.

    -
    Properties streamsSettings = new Properties();
    +        
    +

The same applies to main.consumer. and global.consumer., if you only want to specify the configuration for one consumer type.

    +

    Additionally, to configure the internal repartition/changelog topics, you could use the topic. prefix, followed by any of the standard topic configs.

    +
    Properties streamsSettings = new Properties();
     // Override default for both changelog and repartition topics
     streamsSettings.put("topic.PARAMETER_NAME", "topic-value");
     // alternatively, you can use
     streamsSettings.put(StreamsConfig.topicPrefix("PARAMETER_NAME"), "topic-value");
     
    -
    -
    -
    -
    -

    Default Values

    -

    Kafka Streams uses different default values for some of the underlying client configs, which are summarized below. For detailed descriptions - of these configs, see Producer Configs - and Consumer Configs.

    - - - - - - - - - - - - - - - - - - - - - - - - - -
    Parameter NameCorresponding ClientStreams Default
    auto.offset.resetConsumerearliest
    linger.msProducer100
    max.poll.interval.msConsumerInteger.MAX_VALUE
    max.poll.recordsConsumer1000
    -
    -
    -

    Parameters controlled by Kafka Streams

    -

    Kafka Streams assigns the following configuration parameters. If you try to change - allow.auto.create.topics, your value - is ignored and setting it has no effect in a Kafka Streams application. You can set the other parameters. - Kafka Streams sets them to different default values than a plain - KafkaConsumer. -

    Kafka Streams uses the client.id - parameter to compute derived client IDs for internal clients. If you don't set - client.id, Kafka Streams sets it to - <application.id>-<random-UUID>. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
    Parameter NameCorresponding ClientStreams Default
    allow.auto.create.topicsConsumerfalse
    auto.offset.resetConsumerearliest
    linger.msProducer100
    max.poll.interval.msConsumer300000
    max.poll.recordsConsumer1000
    -

    -

    enable.auto.commit

    -
    -
    The consumer auto commit. To guarantee at-least-once processing semantics and turn off auto commits, Kafka Streams overrides this consumer config - value to false. Consumers will only commit explicitly via commitSync calls when the Kafka Streams library or a user decides - to commit the current processing state.
    +
    +
    +
    +

    Default Values

    +

    Kafka Streams uses different default values for some of the underlying client configs, which are summarized below. For detailed descriptions + of these configs, see Producer Configs + and Consumer Configs.

    + + + + + + + + + + + + + + + + + + + + + + + + + +
    Parameter NameCorresponding ClientStreams Default
    auto.offset.resetConsumerearliest
    linger.msProducer100
    max.poll.interval.msConsumerInteger.MAX_VALUE
    max.poll.recordsConsumer1000
    +
    +
    +

    Parameters controlled by Kafka Streams

    +

    Kafka Streams assigns the following configuration parameters. If you try to change + allow.auto.create.topics, your value + is ignored and setting it has no effect in a Kafka Streams application. You can set the other parameters. + Kafka Streams sets them to different default values than a plain + KafkaConsumer. +

    Kafka Streams uses the client.id + parameter to compute derived client IDs for internal clients. If you don't set + client.id, Kafka Streams sets it to + <application.id>-<random-UUID>. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    Parameter NameCorresponding ClientStreams Default
    allow.auto.create.topicsConsumerfalse
    auto.offset.resetConsumerearliest
    linger.msProducer100
    max.poll.interval.msConsumer300000
    max.poll.recordsConsumer1000
    +

    +

    enable.auto.commit

    +
    +
    The consumer auto commit. To guarantee at-least-once processing semantics and turn off auto commits, Kafka Streams overrides this consumer config + value to false. Consumers will only commit explicitly via commitSync calls when the Kafka Streams library or a user decides + to commit the current processing state.
    +
    -
    -
    -
    -
    +
    +
    +
    + @@ -997,10 +993,10 @@ -
    +
    - +
    +settings.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.Long().getClass().getName());
    @@ -78,8 +77,7 @@ // The stream userCountByRegion has type `String` for record keys (for region) // and type `Long` for record values (for user counts). KStream<String, Long> userCountByRegion = ...; -userCountByRegion.to("RegionCountsTopic", Produced.with(stringSerde, longSerde)); -
    +userCountByRegion.to("RegionCountsTopic", Produced.with(stringSerde, longSerde));

    If you want to override serdes selectively, i.e., keep the defaults for some fields, then don’t specify the serde whenever you want to leverage the default settings:

    import org.apache.kafka.common.serialization.Serde;
    @@ -89,8 +87,7 @@
     // but override the default serializer for record values (here: userCount as Long).
     final Serde<Long> longSerde = Serdes.Long();
     KStream<String, Long> userCountByRegion = ...;
    -userCountByRegion.to("RegionCountsTopic", Produced.valueSerde(Serdes.Long()));
    -
    +userCountByRegion.to("RegionCountsTopic", Produced.valueSerde(Serdes.Long()));

If some of your incoming records are corrupted or ill-formatted, they will cause the deserializer class to report an error. Since 1.0.x we have introduced a DeserializationExceptionHandler interface which allows
@@ -104,12 +101,11 @@
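For example, one of the bundled handlers simply logs the problem and skips the corrupted record, which keeps the application running (a sketch; whether skipping records is acceptable depends on your use case):

Properties props = new Properties();
props.put(StreamsConfig.DEFAULT_DESERIALIZATION_EXCEPTION_HANDLER_CLASS_CONFIG,
          LogAndContinueExceptionHandler.class);
// LogAndContinueExceptionHandler lives in org.apache.kafka.streams.errors;
// LogAndFailExceptionHandler (the default) fails the application instead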

    Primitive and basic types

    Apache Kafka includes several built-in serde implementations for Java primitives and basic types such as byte[] in its kafka-clients Maven artifact:

    -
    <dependency>
    +        
    <dependency>
         <groupId>org.apache.kafka</groupId>
         <artifactId>kafka-clients</artifactId>
         <version>{{fullDotVersion}}</version>
    -</dependency>
    -
    +</dependency>

    This artifact provides the following serde implementations under the package org.apache.kafka.common.serialization, which you can leverage when e.g., defining default serializers in your Streams configuration.
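A few of those built-in serdes in use (an illustrative subset, not the complete list):

import org.apache.kafka.common.serialization.Serde;
import org.apache.kafka.common.serialization.Serdes;

final Serde<String> stringSerde = Serdes.String();
final Serde<Long> longSerde = Serdes.Long();
final Serde<byte[]> byteArraySerde = Serdes.ByteArray();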

    @@ -200,10 +196,10 @@ -
    +
    - +
    -

    KStream

    +

    KStream

    Only the Kafka Streams DSL has the notion of a KStream. @@ -133,7 +133,7 @@ which would return 3 for alice.

    -

    KTable

    +

    KTable

    Only the Kafka Streams DSL has the notion of a KTable. @@ -172,7 +172,7 @@ KTable also provides an ability to look up current values of data records by keys. This table-lookup functionality is available through join operations (see also Joining in the Developer Guide) as well as through Interactive Queries.

    -

    GlobalKTable

    +

    GlobalKTable

    Only the Kafka Streams DSL has the notion of a GlobalKTable.

    @@ -242,7 +242,7 @@

    In the case of a KStream, the local KStream instance of every application instance will be populated with data from only a subset of the partitions of the input topic. Collectively, across all application instances, all input topic partitions are read and processed.

    -
    import org.apache.kafka.common.serialization.Serdes;
    +                        
    import org.apache.kafka.common.serialization.Serdes;
     import org.apache.kafka.streams.StreamsBuilder;
     import org.apache.kafka.streams.kstream.KStream;
     
    @@ -253,8 +253,7 @@
         Consumed.with(
           Serdes.String(), /* key serde */
           Serdes.Long()   /* value serde */
    -    );
    -
    + );

    If you do not specify SerDes explicitly, the default SerDes from the configuration are used.

    @@ -304,7 +303,7 @@ state store that backs the table). This is required for supporting interactive queries against the table. When a name is not provided the table will not be queryable and an internal name will be provided for the state store.

    -
    import org.apache.kafka.common.serialization.Serdes;
    +                        
    import org.apache.kafka.common.serialization.Serdes;
     import org.apache.kafka.streams.StreamsBuilder;
     import org.apache.kafka.streams.kstream.GlobalKTable;
     
    @@ -316,8 +315,7 @@
           "word-counts-global-store" /* table/store name */)
           .withKeySerde(Serdes.String()) /* key serde */
           .withValueSerde(Serdes.Long()) /* value serde */
    -    );
    -
    + );

    You must specify SerDes explicitly if the key or value types of the records in the Kafka input topics do not match the configured default SerDes. For information about configuring default SerDes, available @@ -384,8 +382,7 @@ // KStream branches[1] contains all records whose keys start with "B" // KStream branches[2] contains all other records -// Java 7 example: cf. `filter` for how to create `Predicate` instances -

    +// Java 7 example: cf. `filter` for how to create `Predicate` instances
    @@ -411,8 +408,7 @@ publicbooleantest(Stringkey,Longvalue){returnvalue>0;} - }); - + }); @@ -425,7 +421,7 @@ @@ -467,8 +462,7 @@ }); -// Java 7 example: cf. `map` for how to create `KeyValueMapper` instances - +// Java 7 example: cf. `map` for how to create `KeyValueMapper` instances @@ -486,8 +480,7 @@ KStream<byte[],String>sentences=...;KStream<byte[],String>words=sentences.flatMapValues(value->Arrays.asList(value.split("\\s+"))); -// Java 7 example: cf. `mapValues` for how to create `ValueMapper` instances - +// Java 7 example: cf. `mapValues` for how to create `ValueMapper` instances @@ -504,7 +497,7 @@ further processing of the input data (unlike peek, which is not a terminal operation).

    Note on processing guarantees: Any side effects of an action (such as writing to external systems) are not trackable by Kafka, which means they will typically not benefit from Kafka’s processing guarantees.

    -
    KStream<String, Long> stream = ...;
    +                            
    KStream<String, Long> stream = ...;
     
     // Print the contents of the KStream to the local console.
     // Java 8+ example, using lambda expressions
    @@ -517,8 +510,7 @@
           public void apply(String key, Long value) {
             System.out.println(key + " => " + value);
           }
    -    });
    -
    + });
    @@ -559,8 +551,7 @@ Grouped.with(Serdes.ByteArray(),/* key */Serdes.String())/* value */ - ); - + ); @@ -639,8 +630,7 @@ Grouped.with(Serdes.String(),/* key (note: type was modified) */Serdes.Integer())/* value (note: type was modified) */ - ); - + ); @@ -667,8 +657,7 @@ KTable<byte[],String>table=cogroupedStream.aggregate(initializer); -KTable<byte[],String>table2=cogroupedStream.windowedBy(TimeWindows.duration(500ms)).aggregate(initializer); - +KTable<byte[],String>table2=cogroupedStream.windowedBy(TimeWindows.duration(500ms)).aggregate(initializer); @@ -697,8 +686,7 @@ publicKeyValue<String,Integer>apply(byte[]key,Stringvalue){returnnewKeyValue<>(value.toLowerCase(),value.length());} - }); - + }); @@ -726,8 +714,7 @@ publicStringapply(Strings){returns.toUpperCase();} - }); - + }); @@ -743,13 +730,11 @@ from different streams in the merged stream. Relative order is preserved within each input stream though (ie, records within the same input stream are processed in order)

    -
    -KStream<byte[], String> stream1 = ...;
    +                                
    KStream<byte[], String> stream1 = ...;
     
     KStream<byte[], String> stream2 = ...;
     
    -KStream<byte[], String> merged = stream1.merge(stream2);
    -                                
    +KStream<byte[], String> merged = stream1.merge(stream2);
    @@ -780,8 +765,7 @@ publicvoidapply(byte[]key,Stringvalue){System.out.println("key="+key+", value="+value);} - }); - + }); @@ -800,8 +784,7 @@ stream.print();// print to file with a custom label -stream.print(Printed.toFile("streams.out").withLabel("streams")); - +stream.print(Printed.toFile("streams.out").withLabel("streams")); @@ -828,8 +811,7 @@ publicStringapply(byte[]key,Stringvalue){returnvalue.split(" ")[0];} - }); - + }); @@ -844,8 +826,7 @@ // Also, a variant of `toStream` exists that allows you// to select a new key for the resulting stream. -KStream<byte[],String>stream=table.toStream(); - +KStream<byte[],String>stream=table.toStream(); @@ -858,8 +839,7 @@ (details)

    KStream<byte[], String> stream = ...;
     
    -KTable<byte[], String> table = stream.toTable();
    -
    +KTable<byte[], String> table = stream.toTable();
    @@ -878,7 +858,7 @@ repartition() operation always triggers repartitioning of the stream, as a result it can be used with embedded Processor API methods (like transform() et al.) that do not trigger auto repartitioning when key changing operation is performed beforehand.
    KStream<byte[], String> stream = ... ;
    -KStream<byte[], String> repartitionedStream = stream.repartition(Repartitioned.numberOfPartitions(10));
    +KStream<byte[], String> repartitionedStream = stream.repartition(Repartitioned.numberOfPartitions(10));
    @@ -925,8 +905,7 @@ // `KTable<String, Long>` (word -> count)..count()// Convert the `KTable<String, Long>` into a `KStream<String, Long>`. - .toStream(); - + .toStream();

    WordCount example in Java 7:

    // Code below is equivalent to the previous Java 8+ example above.
    @@ -946,8 +925,7 @@
             }
         })
         .count()
    -    .toStream();
    -
    + .toStream();

    Aggregating

    @@ -1046,8 +1024,7 @@ } }, Materialized.as("aggregated-stream-store") - .withValueSerde(Serdes.Long()); -
    + .withValueSerde(Serdes.Long());

    Detailed behavior of KGroupedStream:

    Evaluates a boolean function for each element and drops those for which the function returns true. (KStream details, KTable details)

    -
    KStream<String, Long> stream = ...;
    +                            
    KStream<String, Long> stream = ...;
     
     // An inverse filter that discards any negative numbers or zero
     // Java 8+ example, using lambda expressions
    @@ -438,8 +434,7 @@
           public boolean test(String key, Long value) {
             return value <= 0;
           }
    -    });
    -
    + });
    @@ -1875,8 +1844,7 @@ Serdes.String(),/* key */Serdes.Long(),/* left value */Serdes.Double())/* right value */ - ); - + );

    Detailed behavior:

    @@ -2201,8 +2166,7 @@ publicStringapply(LongleftValue,DoublerightValue){return"left="+leftValue+", right="+rightValue;} - }); - + });

    Detailed behavior:

    @@ -2819,8 +2778,7 @@ },Joined.keySerde(Serdes.String())/* key */.withValueSerde(Serdes.Long())/* left value */ - ); - + );

    Detailed behavior:

    @@ -3070,7 +3026,7 @@

    The GlobalKTable is fully bootstrapped upon (re)start of a KafkaStreams instance, which means the table is fully populated with all the data in the underlying topic that is available at the time of the startup. The actual data processing begins only once the bootstrapping has completed.

    Causes data re-partitioning of the stream if and only if the stream was marked for re-partitioning.

    -
    KStream<String, Long> left = ...;
    +                                    
    KStream<String, Long> left = ...;
     GlobalKTable<Integer, Double> right = ...;
     
     // Java 8+ example, using lambda expressions
    @@ -3092,8 +3048,7 @@
           public String apply(Long leftValue, Double rightValue) {
             return "left=" + leftValue + ", right=" + rightValue;
           }
    -    });
    -
    + });

    Detailed behavior: