Adds documentation for KIP-708: Rack awareness for Kafka Streams
Co-authored-by: Bruno Cadonna <cadonna@apache.org>
Reviewers: Luke Chen <showuon@gmail.com>, Bruno Cadonna <cadonna@apache.org>
In the RocksDB memory management doc, a footnote mentioned a RocksDB bug that prevented the strict_capacity_limit boolean parameter in the LRUCache constructor from being set to true. However, the bug was fixed in 6.11.4, and we're using 6.22 now, so the note can be removed.
Reviewer: Bruno Cadonna <cadonna@apache.org>
Right now, we have scattered uses of `SerDes` throughout the docs. These should be updated to be `Serdes`, as that's what we commonly use now.
Reviewers: Boyang Chen <bchen11@outlook.com>
We removed the default 24-hour grace period in KIP-633 and deprecated some grace methods, but we forgot to update the Streams docs.
Reviewer: Bruno Cadonna <cadonna@apache.org>
Update the docs for task idling, since the semantics have
changed in 3.0.
Reviewers: Jim Galasyn <jim.galasyn@confluent.io>, Luke Chen <showuon@gmail.com>, Boyang Chen <boyang@apache.org>
#10813 changed the default serde from ByteArraySerde to null, as discussed in KIP-741. This adds proper documentation so users know to set a serde through the configs or explicitly pass one in.
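For illustration, a minimal sketch of setting default serdes through the configs, using plain string keys so it runs with only the JDK. The application id and broker address are hypothetical placeholders, and the choice of a String serde is an assumption made for the example.

```java
import java.util.Properties;

class DefaultSerdeConfig {
    // Builds Streams properties that name the default serdes explicitly,
    // so the application does not rely on a built-in default.
    static Properties build() {
        Properties props = new Properties();
        props.put("application.id", "serde-example-app");  // hypothetical application id
        props.put("bootstrap.servers", "localhost:9092");  // hypothetical broker address
        // Class-typed configs can be given as fully qualified class names.
        props.put("default.key.serde",
                "org.apache.kafka.common.serialization.Serdes$StringSerde");
        props.put("default.value.serde",
                "org.apache.kafka.common.serialization.Serdes$StringSerde");
        return props;
    }

    public static void main(String[] args) {
        build().forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

The alternative is to pass a serde explicitly at each operator that needs one, which overrides whatever the configs specify.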
Reviewers: Walker Carlson <wcarlson@confluent.io>, Anna Sophie Blee-Goldman <ableegoldman@apache.org>
Code samples are now unified and correctly formatted.
Samples under Streams consistently use the Prism library.
Reviewers: Bruno Cadonna <cadonna@apache.org>
Introduce List serde for primitive types or custom serdes with a serializer and a deserializer according to KIP-466
Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>, Matthias J. Sax <mjsax@confluent.io>, John Roesler <roesler@confluent.io>, Michael Noll <michael@confluent.io>
Deprecates the following:
1. StreamsConfig.EXACTLY_ONCE
2. StreamsConfig.EXACTLY_ONCE_BETA
3. Producer#sendOffsetsToTransaction(Map offsets, String consumerGroupId)
And introduces a new StreamsConfig.EXACTLY_ONCE_V2 config. Additionally, this PR replaces usages of the term "eos-beta" throughout the code with the term "eos-v2"
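As a sketch of what the new setting looks like from the user's side, the following builds a configuration with plain string keys (runnable with only the JDK). The application id and broker address are hypothetical; "exactly_once_v2" is the string value behind the new constant.

```java
import java.util.Properties;

class EosV2Config {
    // Builds Streams properties that opt in to the "eos-v2" processing guarantee.
    static Properties build() {
        Properties props = new Properties();
        props.put("application.id", "eos-example-app");    // hypothetical application id
        props.put("bootstrap.servers", "localhost:9092");  // hypothetical broker address
        // String form of the new StreamsConfig.EXACTLY_ONCE_V2 setting.
        props.put("processing.guarantee", "exactly_once_v2");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(build().getProperty("processing.guarantee"));
    }
}
```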
Reviewers: Matthias J. Sax <mjsax@confluent.io>
Allow users to specify a subset of internal topics to clean up with the application reset tool
Reviewers: Boyang Chen <boyang@confluent.io>, Anna Sophie Blee-Goldman <ableegoldman@apache.org>, Walker Carlson <wcarlson@confluent.io>
Fix the formatting of the example RocksDBConfigSetter, which was broken by stray spaces within the <pre> tag.
Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
We recommend that users switch to jemalloc for RocksDB.
Co-authored-by: Jim Galasyn <jim.galasyn@confluent.io>
Reviewers: Jim Galasyn <jim.galasyn@confluent.io>, Rohan Desai <rohan@confluent.io>, A. Sophie Blee-Goldman <sophie@confluent.io>
Implements KIP-418, which deprecates the `branch()` operator in favor of the newly added and type-safe `split()` operator.
Reviewers: Matthias J. Sax <matthias@confluent.io>, John Roesler <john@confluent.io>
During the AK website upgrade, changes made to kafka-site weren't migrated back to kafka-docs.
This PR is an attempt at porting the streams changes to kafka/docs.
The bulk of the changes in the PR are cosmetic.
For testing:
I reviewed the PR diffs
Rendered the changes locally
Reviewers: John Roesler <john@confluent.io>
Currently, the source references all point to the 1.0 version of the code,
which is obviously wrong. Update to the current dotVersion.
Reviewers: John Roesler <vvcephei@apache.org>
Add necessary documentation for KIP-450, adding sliding window aggregations to KStreams
Reviewers: A. Sophie Blee-Goldman <sophie@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Wildcard import of the old org.apache.kafka.streams.scala.Serdes leads
to a name clash because some of the implicits have the same names as types
from Scala's standard library. The new org.apache.kafka.streams.scala.serialization.Serdes is
the same as the old Serdes, but without name clashes.
The old one is marked as deprecated.
Also, add missing serdes for UUID, ByteBuffer and Short types in
the new Serdes.
Implements: KIP-616
Reviewers: John Roesler <vvcephei@apache.org>
- part of KIP-572
- removed the usage of `retries` in `GlobalStateManager`
- instead of retries, the new `task.timeout.ms` config is used
Reviewers: John Roesler <john@confluent.io>, Boyang Chen <boyang@confluent.io>, Guozhang Wang <guozhang@confluent.io>
Add docs for KIP-441 and KIP-613.
Fixed some miscellaneous unrelated issues in the docs:
* Adds some missing configs to the Streams config docs: max.task.idle.ms, topology.optimization, default.windowed.key.serde.inner.class, and default.windowed.value.serde.inner.class
* Defines the previously-undefined default windowed serde class configs, including choosing a default (null) and giving them a doc string, so they should now show up in the auto-generated general Kafka config docs
* Adds a note to warn users about the RocksDB bug that prevents setting a strict capacity limit and counting write buffer memory against the block cache
Reviewers: Bruno Cadonna <bruno@confluent.io>, John Roesler <vvcephei@apache.org>
Implements KIP-401:
- Add ConnectedStoreProvider interface
- Let Processor/[*]Transformer[*]Suppliers extend ConnectedStoreProvider
- Allow state stores to be added and connected to processors/transformers implicitly
Reviewers: Matthias J. Sax <matthias@confluent.io>, John Roesler <john@confluent.io>
fix broken links
rephrase a sentence
update the version number
Reviewers: Sophie Blee-Goldman <sophie@confluent.io>, Boyang Chen <boyang@confluent.io>, Bill Bejeck <bbejeck@apache.org>
* Minor: Some HTML fixes in the Streams DSL documentation:
- duplicate id
- duplicate opening <span> element
- surplus closing </div> tag
* Replaced opening/closing quotation mark codes with " (they caused w3c validation to complain).
* Replaced right arrow that wasn't rendered correctly with →
Reviewers: Mickael Maison <mickael.maison@gmail.com>
Currently, tumbling windows are defined as "a special case of hopping time windows" in the streams docs, but hopping windows are only explained in a subsequent section.
I think it would make sense to switch the order of these paragraphs around. To me this also makes more sense semantically.
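To make the "special case" relationship concrete, here is a small JDK-only sketch, not the Streams implementation itself. It assumes windows are aligned to epoch 0 and cover [start, start + size); the class and method names are hypothetical. A tumbling window is simply the case where the advance equals the window size.

```java
import java.util.ArrayList;
import java.util.List;

class WindowStarts {
    // Returns the start times of every window of length sizeMs, advancing by
    // advanceMs, that contains timestampMs. Assumes epoch-aligned windows that
    // include their start and exclude their end.
    static List<Long> windowsFor(long timestampMs, long sizeMs, long advanceMs) {
        List<Long> starts = new ArrayList<>();
        // Earliest aligned window start that can still contain the timestamp.
        long first = Math.max(0, timestampMs - sizeMs + advanceMs) / advanceMs * advanceMs;
        for (long start = first; start <= timestampMs; start += advanceMs) {
            starts.add(start);
        }
        return starts;
    }

    public static void main(String[] args) {
        // Hopping: size 5s, advance 1s. The record falls into five overlapping windows.
        System.out.println(windowsFor(7000L, 5000L, 1000L)); // [3000, 4000, 5000, 6000, 7000]
        // Tumbling: advance equals size. The record falls into exactly one window.
        System.out.println(windowsFor(7000L, 5000L, 5000L)); // [5000]
    }
}
```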
Testing
Built the site and checked that everything looks ok and html is valid (or at least didn't contain any new warnings that were caused by this change).
Reviewers: Bill Bejeck <bbejeck@apache.org>
Added a redirect for the new page and added a link to the Developer Guide page
Reviewers: Matthias J. Sax <mjsax@apache.org>, Sophie Blee-Goldman <sophie@confluent.io>
* Adjust build and documentation.
* Use lambda syntax for SAM types in `core`, `streams-scala` and
`connect-runtime` modules.
* Remove `runnable` and `newThread` from `CoreUtils` as lambda
syntax for SAM types makes them unnecessary.
* Remove stale comment in `FunctionsCompatConversions`,
`KGroupedStream`, `KGroupedTable` and `KStream` about Scala 2.11;
the conversions are needed for Scala 2.12 too.
* Deprecate `org.apache.kafka.streams.scala.kstream.Suppressed`
and use `org.apache.kafka.streams.kstream.Suppressed` instead.
* Use `Admin.create` instead of `AdminClient.create`. Static methods
in Java interfaces can be invoked since Scala 2.12. I noticed that
MirrorMaker 2 uses `AdminClient.create`, but I did not change them
as Connectors have restrictions on newer client APIs.
* Improve efficiency in a few `Gauge` implementations by avoiding
unnecessary intermediate collections.
* Remove pointless `Option.apply` in `ZookeeperClient`
`SessionState` metric.
* Fix unused import/variable and other compiler warnings.
* Reduce visibility of some vals/defs.
Reviewers: Manikumar Reddy <manikumar@confluent.io>, Guozhang Wang <wangguoz@gmail.com>, Gwen Shapira <gwen@confluent.io>
Fixed a small typo on the Processor API page of the Kafka Streams developer guide docs. ("buildeer" changed to "builder")
Reviewers: Bill Bejeck <bbejeck@gmail.com>
* Log lock acquisition failures on the state store
* Document required uniqueness of state.dir path
* Move a bunch of log calls around task state changes to DEBUG
* More readable log messages during partition assignment
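As an illustration of the state.dir uniqueness requirement, the sketch below gives each instance that shares a filesystem its own directory. It uses only the JDK; the base path, application id, and instance names are hypothetical.

```java
import java.util.Properties;

class StateDirConfig {
    // Builds Streams properties whose state.dir is unique per instance,
    // so two instances on the same machine do not compete for the same directory.
    static Properties forInstance(String instanceName) {
        Properties props = new Properties();
        props.put("application.id", "state-dir-example-app");             // hypothetical application id
        props.put("state.dir", "/var/lib/kafka-streams/" + instanceName); // hypothetical base path
        return props;
    }

    public static void main(String[] args) {
        System.out.println(forInstance("instance-1").getProperty("state.dir"));
        System.out.println(forInstance("instance-2").getProperty("state.dir"));
    }
}
```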
Reviewers: Matthias J. Sax <mjsax@apache.org>, A. Sophie Blee-Goldman <sophie@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
As part of #5425 the streams default override for producer retries was removed. The documentation was not updated to reflect that change.
Reviewers: Matthias J. Sax <mjsax@apache.org>, Sophie Blee-Goldman <sophie@confluent.io>, Bill Bejeck <bbejeck@gmail.com>
Users often use the RocksDBConfigSetter to modify parameters such as cache or block size, which must be set through the BlockBasedTableConfig object. Rather than creating a new object in the config setter, however, users should most likely retrieve a reference to the existing one so as to not lose the other defaults (e.g. the BloomFilter).
There have been notes from the community that it is not obvious this should be done, nor is it immediately clear how to do so. This PR updates the RocksDBConfigSetter docs to hopefully improve things.
I also piggybacked a few minor cleanups in the docs.
Reviewers: Kamal Chandraprakash, Jim Galasyn <jim.galasyn@confluent.io>, Bruno Cadonna <bruno@confluent.io>, Bill Bejeck <bbejeck@gmail.com>
This PR is intended to address some typos on the https://kafka.apache.org/documentation/streams/developer-guide/processor-api.html page.
Invalid configuration option specified in the example. I've replaced it with the closest constant, TopicConfig.MIN_IN_SYNC_REPLICAS_CONFIG, since LogConfig.MinInSyncReplicasProp() requires Scala stuff
Reference to LogConfig seems to be obsolete; I believe I've moved it to the correct line
Apostrophe displayed incorrectly
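A minimal sketch of the kind of changelog configuration the example deals with, written against the JDK only. The key is spelled out as the string behind TopicConfig.MIN_IN_SYNC_REPLICAS_CONFIG so no Kafka dependency is needed; the class name and the value "1" are illustrative assumptions.

```java
import java.util.HashMap;
import java.util.Map;

class ChangelogConfigExample {
    // Builds topic-level overrides intended for a state store's changelog topic.
    static Map<String, String> changelogConfig() {
        Map<String, String> config = new HashMap<>();
        // "min.insync.replicas" is the string value of TopicConfig.MIN_IN_SYNC_REPLICAS_CONFIG.
        config.put("min.insync.replicas", "1"); // illustrative value
        return config;
    }

    public static void main(String[] args) {
        System.out.println(changelogConfig());
    }
}
```

In a real application, a map like this would typically be handed to a store builder's `withLoggingEnabled(...)` call.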
Reviewers: John Roesler <john@confluent.io>, Bill Bejeck <bbejeck@gmail.com>
Now that we can configure RocksDB to bound the total memory we should include docs describing how, as well as touching on some possible options that should be considered when taking advantage of this feature.
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Jim Galasyn <jim.galasyn@confluent.io>, Bill Bejeck <bbejeck@gmail.com>
The old docs here used a now deprecated method to set the block cache size. In switching over to the new one we would now need to construct a Cache object and therefore also need to close it, so this is a good opportunity to demonstrate the RocksDBConfigSetter#close method that will need to be implemented by users.
Reviewers: Guozhang Wang <wangguoz@gmail.com>
* Revert "MINOR: Add unit test for SerDe auto-configuration (#6610)"
This reverts commit 172fbb2dd5.
* Revert "[KAFKA-3729] Auto-configure non-default SerDes passed alongside the topology builder (#6461)"
This reverts commit e56ebbffca.
The two merged PRs introduce a breaking change. Reverting to preserve backward compatibility. Jira ticket reopened.
Reviewers: Ted Yu <yuzhihong@gmail.com>, Guozhang Wang <guozhang@confluent.io>
First pass at an in-memory session store implementation.
Reviewers: Simon Geisler, Damian Guy <damian@confluent.io>, John Roesler <john@confluent.io>, Bill Bejeck <bill@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Removed TOC entry in Streams Developer Guide for Avro, since we have no content for this
PR on kafka-site: apache/kafka-site#195
Reviewers: Guozhang Wang <wangguoz@gmail.com>
* updated names for deprecated streams constants
* add DEFAULT_TIMESTAMP_EXTRACTOR_CLASS_CONFIG in place of deprecated
Reviewers: Jim Galasyn <jim.galasyn@confluent.io>, Matthias J. Sax <matthias@confluent.io>
Implemented an in-memory window store allowing for range queries. A finite retention period defines how long records will be kept, i.e. the window of time for fetching, and the grace period defines the window within which late-arriving data may still be written to the store.
Unit tests were written to test the functionality of the window store, including its insert/update/delete and fetch operations. Single-record, all-records, and range fetches were tested, for both time ranges and key ranges. The logging and metrics for late-arriving (dropped) records were tested, as well as the ability to restore from a changelog.
Reviewers: John Roesler <john@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Bill Bejeck <bill@confluent.io>, Guozhang Wang <wangguoz@gmail.com>