Fix the formatting of example RocksDBConfigSetter due to the un-arranged spaces within <pre> tag.
Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
Remove the default close implementation for RocksDBConfigSetter to avoid accidental memory leaks via C++ backed objects which are constructed but not closed by the user
Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
Part of KIP-572: follow up work to PR #9800. It's not save to retry a TX commit after a timeout, because it's unclear if the commit was successful or not, and thus on retry we might get an IllegalStateException. Instead, we will throw a TaskCorruptedException to retry the TX if the commit failed.
Reviewers: A. Sophie Blee-Goldman <sophie@confluent.io>
We recommend users to switch to jemalloc for RocksDB.
Co-authored-by: Jim Galasyn <jim.galasyn@confluent.io>
Reviewers: Jim Galasyn <jim.galasyn@confluent.io>, Rohan Desai <rohan@confluent.io>, A. Sophie Blee-Goldman <sophie@confluent.io>
Implements KIP-418, that deprecated the `branch()` operator in favor of the newly added and type-safe `split()` operator.
Reviewers: Matthias J. Sax <matthias@confluent.io>, John Roesler <john@confluent.io>
Co-authored-by: Bruno Cadonna <bruno@confluent.io>
Reviewers: Jim Galasyn <jim.galasyn@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <guozhang@confluent.io>, Bruno Cadonna <bruno@confluent.io>
During the AK website upgrade, changes made to kafka-site weren't migrated back to kafka-docs.
This PR is an attempt at porting the streams changes to kafka/docs
For the most part, the bulk of the changes in the PR are cosmetic.
For testing:
I reviewed the PR diffs
Rendered the changes locally
Reviewers: John Roesler <john@confluent.io>
Currently, the source reference are all pointing to the 1.0 version codes,
which is obviously wrong. Update to the current dotVersion.
Reviewers: John Roesler <vvcephei@apache.org>
Add necessary documentation for KIP-450, adding sliding window aggregations to KStreams
Reviewers: A. Sophie Blee-Goldman <sophie@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Wildcard import of the old org.apache.kafka.streams.scala.Serdes leads
to a name clash because some of implicits has the same names as types
from the scala's std lib. The new oak.streams.scala.serialization.Serdes is
the same as the old Serdes, but without name clashes.
The old one is marked as deprecated.
Also, add missing serdes for UUID, ByteBuffer and Short types in
the new Serdes.
Implements: KIP-616
Reviewers: John Roesler <vvcephei@apache.org>
- part of KIP-572
- removed the usage of `retries` in `GlobalStateManger`
- instead of retries the new `task.timeout.ms` config is used
Reviewers: John Roesler <john@confluent.io>, Boyang Chen <boyang@confluent.io>, Guozhang Wang <guozhang@confluent.io>
- part of KIP-572
- deprecates producer config `retries` (still in use)
- deprecates admin config `retries` (still in use)
- deprecates Kafka Streams config `retries` (will be ignored)
- adds new Kafka Streams config `task.timeout.ms` (follow up PRs will leverage this new config)
Reviewers: John Roesler <john@confluent.io>, Jason Gustafson <jason@confluent.io>, Randall Hauch <randall@confluent.io>
Add docs for KIP-441 and KIP-613.
Fixed some miscellaneous unrelated issues in the docs:
* Adds some missing configs to the Streams config docs: max.task.idle.ms,topology.optimization, default.windowed.key.serde.inner.class, and default.windowed.value.serde.inner.class
* Defines the previously-undefined default windowed serde class configs, including choosing a default (null) and giving them a doc string, so the yshould nwo show up in the auto-generated general Kafka config docs
* Adds a note to warn users about the rocksDB bug that prevents setting a strict capacity limit and counting write buffer memory against the block cache
Reviewers: Bruno Cadonna <bruno@confluent.io>, John Roesler <vvcephei@apache.org>
Implements KIP-401:
- Add ConnectedStoreProvider interface
- let Processor/[*]Transformer[*]Suppliers extend ConnectedStoreProvider
- allows to add and connect state stores to processors/transformers implicitly
Reviewers: Matthias J. Sax <matthias@confluent.io>, John Roesler <john@confluent.io>
fix broken links
rephrase a sentence
update the version number
Reviewers: Sophie Blee-Goldman <sophie@confluent.io>, Boyang Chen <boyang@confluent.io>, Bill Bejeck <bbejeck@apache.org>
* Minor: Some html fixes Streams DSL documentation:
- duplicate id
- duplicate opening <span> element
- surplus closing </div> tag
* Replaced opening/closing quotation mark codes with " (they caused w3c validation to complain).
* Replaced right arrow that wasn't rendered correctly with →
Reviewers: Mickael Maison <mickael.maison@gmail.com>
Currently, tumbling windows are defined as "a special case of hopping time windows" in the streams docs, but hopping windows are only explained in a subsequent section.
I think it would make sense to switch the order of these paragraphs around. To me this also makes more sense semantically.
Testing
Built the site and checked that everything looks ok and html is valid (or at least didn't contain any new warnings that were caused by this change).
Reviewers: Bill Bejeck <bbejeck@apache.org>
This change updates ConsoleProducer, ConsumerPerformance, VerifiableProducer, and VerifiableConsumer classes to add and prefer the --bootstrap-server flag for defining the connection point of the Kafka cluster. This change is part of KIP-499: https://cwiki.apache.org/confluence/display/KAFKA/KIP-499+-+Unify+connection+name+flag+for+command+line+tool.
Reviewers: Ron Dagostino <rdagostino@confluent.io>, Stanislav Kozlovski <stanislav_kozlovski@outlook.com>, Chia-Ping Tsai <chia7712@gmail.com>, Jason Gustafson <jason@confluent.io>
Added re-direct for new page and added link to Developer Guide page
Reviewers: Matthias J. Sax <mjsax@apache.org>, Sophie Blee-Goldman <sophie@confluent.io>,
* Adjust build and documentation.
* Use lambda syntax for SAM types in `core`, `streams-scala` and
`connect-runtime` modules.
* Remove `runnable` and `newThread` from `CoreUtils` as lambda
syntax for SAM types make them unnecessary.
* Remove stale comment in `FunctionsCompatConversions`,
`KGroupedStream`, `KGroupedTable' and `KStream` about Scala 2.11,
the conversions are needed for Scala 2.12 too.
* Deprecate `org.apache.kafka.streams.scala.kstream.Suppressed`
and use `org.apache.kafka.streams.kstream.Suppressed` instead.
* Use `Admin.create` instead of `AdminClient.create`. Static methods
in Java interfaces can be invoked since Scala 2.12. I noticed that
MirrorMaker 2 uses `AdminClient.create`, but I did not change them
as Connectors have restrictions on newer client APIs.
* Improve efficiency in a few `Gauge` implementations by avoiding
unnecessary intermediate collections.
* Remove pointless `Option.apply` in `ZookeeperClient`
`SessionState` metric.
* Fix unused import/variable and other compiler warnings.
* Reduce visibility of some vals/defs.
Reviewers: Manikumar Reddy <manikumar@confluent.io>, Guozhang Wang <wangguoz@gmail.com>, Gwen Shapira <gwen@confluent.io>
Fixed a small typo on the Processor API page of the Kafka Streams developer guide docs. ("buildeer" changed to "builder")
Reviewers: Bill Bejeck <bbejeck@gmail.com>
Document the upgrade path for the consumer and for Streams (note that they differ significantly).
Needs to be cherry-picked to 2.4
Reviewers: Guozhang Wang <wangguoz@gmail.com>
* log lock acquistion failures on the state store
* Document required uniqueness of state.dir path
* Move bunch of log calls around task state changes to DEBUG
* More readable log messages during partition assignment
Reviewers: Matthias J. Sax <mjsax@apache.org>, A. Sophie Blee-Goldman <sophie@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
As part of #5425 the streams default override for producer retries was removed. The documentation was not updated to reflect that change.
Reviewers: Matthias J. Sax <mjsax@apache.org>, Sophie Blee-Goldman <sophie@confluent.io>, Bill Bejeck <bbejeck@gmail.com>
Users often use the RocksDBConfigSetter to modify parameters such as cache or block size, which must be set through the BlockBasedTableConfig object. Rather than creating a new object in the config setter, however, users should most likely retrieve a reference to the existing one so as to not lose the other defaults (eg the BloomFilter)
There have been notes from the community that it is not obvious this should be done, nor is it immediately clear how to do so. This PR updates the RocksDBConfigSetter docs to hopefully improve things.
I also piggybacked a few minor cleanups in the docs
Reviewers: Kamal Chandraprakash, Jim Galasyn <jim.galasyn@confluent.io>, Bruno Cadonna <bruno@confluent.io>, Bill Bejeck <bbejeck@gmail.com>
The Pipe.java file should exist within the myapps package directory.
Reviewers: Boyang Chen <boyang@confluent.io>, Jason Gustafson <jason@confluent.io>
This PR intendent to address some typos in https://kafka.apache.org/documentation/streams/developer-guide/processor-api.html page.
Invalid configuration option specified in the example. I've replaced with closest constant TopicConfig.MIN_IN_SYNC_REPLICAS_CONFIG, since LogConfig.MinInSyncReplicasProp() requires Scala stuff
Reference to LogConfig seems to be obsolete, I believe I've moved it to correct line
Apostrophe displayed incorrectly
Reviewers: John Roesler <john@confluent.io>, Bill Bejeck <bbejeck@gmail.com>
Include the topic config `segment.bytes`.
Author: Vahid Hashemian <vahidhashemian@us.ibm.com>
Reviewers: Gwen Shapira
Closes#6945 from vahidhashemian/minor/update_streams_quickstart_doc
Now that we can configure RocksDB to bound the total memory we should include docs describing how, as well as touching on some possible options that should be considered when taking advantage of this feature.
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Jim Galasyn <jim.galasyn@confluent.io>, Bill Bejeck <bbejeck@gmail.com>
The old docs here used a now deprecated method to set the block cache size. In switching over to the new one we would now need to construct a Cache object and therefore also need to close it, so this is a good opportunity to demonstrate the RocksDBConfigSetter#close method that will need to be implemented by users.
Reviewers: Guozhang Wang <wangguoz@gmail.com>
* Revert "MINOR: Add unit test for SerDe auto-configuration (#6610)"
This reverts commit 172fbb2dd5.
* Revert "[KAFKA-3729] Auto-configure non-default SerDes passed alongside the topology builder (#6461)"
This reverts commit e56ebbffca.
The two merged PRs introduce a breaking change. Reverting to preserve backward compatibility. Jira ticket reopened.
Reviewers: Ted Yu <yuzhihong@gmail.com>, Guozhang Wang <guozhang@confluent.io>
First pass at an in-memory session store implementation.
Reviewers: Simon Geisler, Damian Guy <damian@confluent.io>, John Roesler <john@confluent.io>, Bill Bejeck <bill@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Removed TOC entry in Streams Developer Guide for Avro, since we have no content for this
PR on kafka-site: apache/kafka-site#195
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Though out the tutorial, the name of the input topic that was created is `streams-plaintext-input`. However, this was mistaken at some point in the tutorial and changed to `streams-wordcount-input`.
This patch is to adjust that. Thanks.
Reviewers: Guozhang Wang <wangguoz@gmail.com>
*More detailed description of your change,
if necessary. The PR title and PR message become
the squashed commit message, so use a separate
comment to ping reviewers.*
*Summary of testing strategy (including rationale)
for the feature or bug fix. Unit and/or integration
tests are expected for any behaviour change and
system tests should be considered for larger changes.*
Author: Matthias J. Sax <matthias@confluent.io>
Reviewers: Bill Bejeck
Closes#6490 from mjsax/minor-streams-docs-rocksdb
Updates ./jenkins.sh to build stream archetype and install it in local maven cache. Afterward, archetype is used to create a new maven project and maven project is compiled for verification.
Reviewers: Guozhang Wang <wangguoz@gmail.com>, John Roesler <john@confluent.io>, Bill Bejeck <bbejeck@gmail.com>
* updated names for deprecated streams constants
* add DEFAULT_TIMESTAMP_EXTRACTOR_CLASS_CONFIG in place of deprecated
Reviewers: Jim Galasyn <jim.galasyn@confluent.io>, Matthias J. Sax <matthias@confluent.io>
Implemented an in-memory window store allowing for range queries. A finite retention period defines how long records will be kept, ie the window of time for fetching, and the grace period defines the window within which late-arriving data may still be written to the store.
Unit tests were written to test the functionality of the window store, including its insert/update/delete and fetch operations. Single-record, all records, and range fetch were tested, for both time ranges and key ranges. The logging and metrics for late-arriving (dropped)records were tested as well as the ability to restore from a changelog.
Reviewers: John Roesler <john@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Bill Bejeck <bill@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
This PR adds a upgrade notes and changes examples to use the bootstrap-server.
Author: Viktor Somogyi-Vass <viktorsomogyi@gmail.com>
Reviewers: Srinivas <srinivas96alluri@gmail.com>, Manikumar Reddy <manikumar.reddy@gmail.com>
Closes#6118 from viktorsomogyi/topiccommand-adminclient-doc
Scala 2.12 has better support for newer Java versions and includes additional
compiler warnings that are helpful during development. In addition, Scala 2.11
hasn't been supported by the Scala community for a long time, the soon to be
released Spark 2.4.0 will finally support Scala 2.12 (this was the main reason
preventing many from upgrading to Scala 2.12) and Scala 2.13 is at the RC stage.
It's time to start recommending the Scala 2.12 build as we prepare support for
Scala 2.13 and start thinking about removing support for Scala 2.11.
In the meantime, Jenkins will continue to build all supported Scala versions (including
Scala 2.11) so the PR and trunk jobs will fail if people accidentally use methods
introduced in Scala 2.12.
Reviewers: Ewen Cheslack-Postava <me@ewencp.org>
While working on the documentation updates I realized the Streams Scala API needs
to get updated for the addition of Grouped
Added a test for Grouped.scala ran all streams-scala tests and streams tests
Reviewers: Matthias J. Sax <matthias@confluent.io>, John Roesler <john@confluent.io>, Guozhang Wang <guozhang@confluent.io>
Documentation changes for adding overloaded StreamsBuilder(java.util.Properties props) method in KIP-312
Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <guozhang@confluent.io>
Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <guozhang@confluent.io>, Bill Bejeck <bill@confluent.io>, Kamal Chandraprakash <kamal.chandraprakash@gmail.com>
Co-authored-by: Mickael Maison <mickael.maison@gmail.com>
Co-authored-by: Simon Clark <simonc6r@gmail.com>
Reviewers: Sriharsha Chintalapani <sriharsha@apache.org>
1. Rename metrics scope of rocksDB window and session stores; also modify the store metrics accordingly with guidance on its correlations to metricsScope.
2. Add the missing total metrics for per-thread, per-task, per-node and per-store sensors.
Reviewers: John Roesler <john@confluent.io>, Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>
Serdes are confusing in the Scala wrapper:
* We have wrappers around Serializer, Deserializer and Serde which are not very useful.
* We have Serdes in 2 places org.apache.kafka.common.serialization.Serde and in DefaultSerdes, instead we should be having only one place where to find all the Serdes.
I wanted to do this PR before the release as this is a breaking change.
This shouldn't add more so the current tests should be enough.
Reviewers: Debasish Ghosh <dghosh@acm.org>, Guozhang Wang <guozhang@confluent.io>
Updated the upgrade doc as well since we do not have an overloaded function without the deprecated parameter before. Also renamed the 1.2 release version to 2.0.
Reviewers: Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>
Several build and documentation updates were required after the merge of KAFKA-6670: Implement a Scala wrapper library for Kafka Streams.
Encode Scala major version into streams-scala artifacts.
To differentiate versions of the kafka-streams-scala artifact across Scala major versions it's required to encode the version into the artifact name before its published to a maven repository. This is accomplished by following a similar release process as kafka core, which encodes the Scala major version and then runs the build for each major version of Scala supported. This is considered standard practice when releasing Scala libraries, but is not handled for us automatically with the basic Scala for Gradle support.
After this change you can generate and install the kafka-streams-scala artifact into the local maven repository:
$ ./gradlew -PscalaVersion=2.11 install
$ ./gradlew -PscalaVersion=2.12 install
Reviewers: Ismael Juma <ismael@juma.me.uk>, Guozhang Wang <wangguoz@gmail.com>
This pull request is for JIRA 6657, for KIP-276.
Added unit tests for new getGlobalConsumerConfigs API and make sure existing restore consumer tests are passing.
Reviewers: Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
This PR implements a Scala wrapper library for Kafka Streams. The library is implemented as a project under streams, namely `:streams:streams-scala`. The PR contains the following:
* the library implementation of the wrapper abstractions
* the test suite
* the changes in `build.gradle` to build the library jar
The library has been tested running the tests as follows:
```
$ ./gradlew -Dtest.single=StreamToTableJoinScalaIntegrationTestImplicitSerdes streams:streams-scala:test
$ ./gradlew -Dtest.single=StreamToTableJoinScalaIntegrationTestImplicitSerdesWithAvro streams:streams-scala:test
$ ./gradlew -Dtest.single=WordCountTest streams:streams-scala:test
```
Author: Debasish Ghosh <ghosh.debasish@gmail.com>
Author: Sean Glover <seglo@randonom.com>
Reviewers: Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Ismael Juma <ismael@juma.me.uk>, John Roesler <john@confluent.io>, Damian Guy <damian@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Closes#4756 from debasishg/scala-streams
This pull request targets https://issues.apache.org/jira/browse/KAFKA-6386
The minor fix to deprecate usage of `StreamsConfig` in favor of `java.util.Properties`.
I created separate public constructors using `Properties` in order to replace the old ones,
and prioritize new functions in the `KafkaStreams.java` file.
Since this is my first time doing open source contribution, I'm very happy to get
any comment or pointer to be more professional and get better next time, thank you Guozhang guozhangwang and Liquan Ishiihara!
testing strategy: existing unit test should be suffice to cover this change.
Author: cs427fa16staff <bchen11@outlook.com>
Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <guozhang@confluent.io>, Bill Bejeck <bill@confluent.io>
Closes#4354 from abbccdda/starter
github comments