The message generator was missing conversion logic for tagged structures. This led to casting errors when either `fromStruct` or `toStruct` were invoked. This patch also adds missing null checks in the serialization of tagged byte arrays, which was found from improved test coverage.
Reviewers: Colin P. McCabe <cmccabe@apache.org>
In Kafka Streams the source-of-truth of a state store is in its changelog, therefore when committing a state store we only need to make sure its changelog records are all flushed and committed, but we do not actually need to make sure that the materialized state have to be flushed and persisted since they can always be restored from changelog when necessary.
On the other hand, flushing a state store too frequently may have side effects, e.g. rocksDB flushing would gets the memtable into an L0 sstable, leaving many small L0 files to be compacted later, which introduces larger overhead.
Therefore this PR decouples flushing from committing, such that we do not always flush the state store upon committing, but only when sufficient data has been written since last time flushed. The checkpoint file would then also be overwritten only along with flushing the state store indicating its current known snapshot. This is okay since: a) if EOS is not enabled, then it is fine if the local persisted state is actually ahead of the checkpoint, b) if EOS is enabled, then we would never write a checkpoint file until close.
Here's a more detailed change list of this PR:
1. Do not always flush state stores when calling pre-commit; move stateMgr.flush into post-commit to couple together with checkpointing.
2. In post-commit, we checkpoint when: a) The state store's snapshot has progressed much further compared to the previous checkpoint, b) When the task is being closed, in which case we enforce checkpointing.
3. There are some tricky obstacles that I'd have to work around in a bit hacky way: for cache / suppression buffer, we still need to flush them in pre-commit to make sure all records sent via producers, while the underlying state store should not be flushed. I've decided to introduce a new API in CachingStateStore to be triggered in pre-commit.
I've also made some minor changes piggy-backed in this PR:
4. Do not delete checkpoint file upon loading it, and as a result simplify the checkpointNeeded logic, initializing the snapshotLastFlush to the loaded offsets.
5. In closing, also follow the commit -> suspend -> close ordering as in revocation / assignment.
6. If enforceCheckpoint == true during RUNNING, still calls maybeCheckpoint even with EOS since that is the case for suspending / closing.
Reviewers: John Roesler <john@confluent.io>, A. Sophie Blee-Goldman <sophie@confluent.io>, Matthias J. Sax <matthias@confluent.io>
This patch removes the PartitionHeader grouping from the Fetch response. With old versions of the protocol, there was no cost for this grouping, but once we add flexible version support, then it adds an extra byte to the schema for tagged fields with little apparent benefit.
Reviewers: Ismael Juma <ismael@juma.me.uk>, David Arthur <mumrah@gmail.com>
Update `CogroupedStreamAggregateBuilder` to have individual builders depending
on the windowed aggregation, or lack thereof. This replaced passing in all options
into the builder, with all but the current type of aggregation set to null and then
checking to see which value was not null.
Reviewers: A. Sophie Blee-Goldman <sophie@confluent.io>, John Roesler <vvcephei@apache.org>
This patch ensures we use a force resolution strategy for the scala-library dependency.
I've tested this locally and saw a difference in the output.
With the change (using 2.4 and the jackson library 2.10.5):
```
./core/build/dependant-libs-2.12.10/scala-java8-compat_2.12-0.9.0.jar
./core/build/dependant-libs-2.12.10/scala-collection-compat_2.12-2.1.2.jar
./core/build/dependant-libs-2.12.10/scala-reflect-2.12.10.jar
./core/build/dependant-libs-2.12.10/scala-logging_2.12-3.9.2.jar
./core/build/dependant-libs-2.12.10/scala-library-2.12.10.jar
```
Without (using 2.4 and the jackson library 2.10.5):
```
find . -name 'scala*.jar'
./core/build/dependant-libs-2.12.10/scala-java8-compat_2.12-0.9.0.jar
./core/build/dependant-libs-2.12.10/scala-collection-compat_2.12-2.1.2.jar
./core/build/dependant-libs-2.12.10/scala-reflect-2.12.10.jar
./core/build/dependant-libs-2.12.10/scala-logging_2.12-3.9.2.jar
./core/build/dependant-libs-2.12.10/scala-library-2.12.12.jar
```
Reviewers: Ismael Juma <ismael@juma.me.uk>
Adds the new Processor and ProcessorContext interfaces
as proposed in KIP-478. To integrate in a staged fashion
with the code base, adapters are included to convert back
and forth between the new and old APIs.
ProcessorNode is converted to the new APIs.
Reviewers: Boyang Chen <boyang@confluent.io>
The patch https://github.com/apache/kafka/pull/8672 introduced a bug leading to crashing the replica fetcher threads. The issue is that https://github.com/apache/kafka/pull/8672 deletes the Partitions prior to stopping the replica fetchers. As the replica fetchers relies access the Partition in the ReplicaManager, they crash with a NotLeaderOrFollowerException that is not handled.
This PR reverts the code to the original ordering to avoid this issue.
The regression was caught and validated by our system test: `kafkatest.tests.core.reassign_partitions_test`.
Reviewers: Vikas Singh <vikas@confluent.io>, Jason Gustafson <jason@confluent.io>
This PR improves the logging for segment deletion to ensure that a reason is logged for segment deletions via all code paths. It also updates the logging so that we log a reason for an entire batch of deletions instead of logging one message per segment in cases when segment-level details are not significant.
Reviewers: Ismael Juma <ismael@juma.me.uk>, Kowshik Prakasam <kprakasam@confluent.io>, Jason Gustafson <jason@confluent.io>
JIRA: https://issues.apache.org/jira/browse/KAFKA-10193
* add `preempt(): Unit` method for all `ControllerEvent` so that all events (and future events) must implement it
* for events that have callbacks, move the preemption from individual methods to `preempt()`
* add preemption for `ApiPartitionReassignment` and `ListPartitionReassignments`
* add integration tests:
1. test whether `preempt()` is called when controller shuts down
2. test whether the events with callbacks have the correct error response (`NOT_CONTROLLER`)
* explicit typing for `ControllerEvent` methods
Author: jeff kim <jeff.kim@confluent.io>
Reviewers: José Armando García Sancio <jsancio@users.noreply.github.com>,Stanislav Kozlovski <stanislav@confluent.io>, David Arthur <mumrah@gmail.com>, Manikumar Reddy <manikumar.reddy@gmail.com>
Closes#9050 from jeffkbkim/KAFKA-10193-controller-events-add-preemption
- part of KIP-572
- replace `retries` in InternalTopicManager with infinite retires plus a new timeout, based on consumer config MAX_POLL_INTERVAL_MS
Reviewers: David Jacot <djacot@confluent.io>, Boyang Chen <boyang@confluent.io>
- implements KIP-648
- Deprecated the existing getters and added new getters without `get` prefix to `KeyQueryMetadata`
Co-authored-by: johnthotekat <Iabon1989*>
Reviewers: Navinder Pal Singh Brar <navinder_brar@yahoo.com>, Matthias J. Sax <matthias@confluent.io>
Based on the discussion in #9072, I have put together an alternative way. This one does the following:
Instead of changing the implementation of the Rate to behave like a Token Bucket, it actually use two different metrics: the regular Rate and a new Token Bucket. The latter is used to enforce the quota.
The Token Bucket algorithm uses the rate of the quota as the refill rate for the credits and compute the burst based on the number of samples and their length (# samples * sample length * quota).
The Token Bucket algorithm used can go under zero in order to handle unlimited burst (e.g. create topic with a number of partitions higher than the burst). Throttling kicks in when the number of credits is under zero.
The throttle time is computed as credits under zero / refill rate (or quota).
Only the controller mutation uses it for now.
The remaining number of credits in the bucket is exposed with the tokens metrics per user/clientId.
Reviewers: Anna Povzner <anna@confluent.io>, Jun Rao <junrao@gmail.com>
- part of KIP-572
- removed the usage of `retries` in `GlobalStateManger`
- instead of retries the new `task.timeout.ms` config is used
Reviewers: John Roesler <john@confluent.io>, Boyang Chen <boyang@confluent.io>, Guozhang Wang <guozhang@confluent.io>
- replace System.exit with Exit.exit in all relevant classes
- forbid use of System.exit in all relevant classes and add exceptions for others
Co-authored-by: John Roesler <vvcephei@apache.org>
Co-authored-by: Matthias J. Sax <matthias@confluent.io>
Reviewers: Lucas Bradstreet <lucas@confluent.io>, Ismael Juma <ismael@confluent.io>
Creating a topic may fail (due to timeout) in running system tests. However, `RoundTripWorker` does not ignore `TopicExistsException` which makes `round_trip_fault_test.py` be a flaky one.
More specifically, a network exception can cause the `CreateTopics` request to reach Kafka but Trogdor retry it
and hit a `TopicAlreadyExists` exception on the retry, failing the test.
Reviewers: Ismael Juma <ismael@juma.me.uk>
Enhance the understandability for constrainedAssign and generalAssign method by getting more detailed meta comments.
Co-authored-by: A. Sophie Blee-Goldman <ableegoldman@gmail.com>
Reviewers: Boyang Chen <boyang@confluent.io>, A. Sophie Blee-Goldman <ableegoldman@gmail.com>
Set session timeout and heartbeat interval to default for RestoreIntegrationTest
Reviewers: Boyang Chen <boyang@confluent.io>, A. Sophie Blee-Goldman <sophie@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Currently, we remove the Log metrics when asynchronous deletion of the log is triggered. However, we attempt to register the metrics immediately upon log creation. If a Log object is re-created for a partition that is pending deletion (because a topic was quickly re-created or because a partition was moved off and back onto a broker), the registration of the new metrics can happen before the asyncrhonous deletion. In this case, the metrics are removed after the second registration, leading to missing Log metrics.
To fix this, this patch changes the log deletion behavior to remove the metrics when the log is first marked for deletion, rather than when the files are deleted. This removes the window in which metrics registration can occur before metrics removal. This is justifiable because the log should be logically deleted when a delete request or partition movement finishes, rather than when the files are actually removed. Tested with unit tests.
Author: Bob Barrett <bob.barrett@confluent.io>
Reviewers: David Jacot, Dhruvil Shah, Vikas Singh, Gwen Shapira
Closes#9054 from bob-barrett/KAFKA-10282
Refactored FetchRequest and FetchResponse to use the generated message classes for serialization and deserialization. This allows us to bypass unnecessary Struct conversion in a few places. A new "records" type was added to the message protocol which uses BaseRecords as the field type. When sending, we can set a FileRecords instance on the message, and when receiving the message class will use MemoryRecords.
Also included a few JMH benchmarks which indicate a small performance improvement for requests with high partition counts or small record sizes.
Reviewers: Jason Gustafson <jason@confluent.io>, Boyang Chen <boyang@confluent.io>, David Jacot <djacot@confluent.io>, Lucas Bradstreet <lucas@confluent.io>, Ismael Juma <ismael@juma.me.uk>, Colin P. McCabe <cmccabe@apache.org>
In PR #8962 we introduced a sentinel UNKNOWN_OFFSET to mark unknown offsets in checkpoint files. The sentinel was set to -2 which is the same value used for the sentinel LATEST_OFFSET that is used in subscriptions to signal that state stores have been used by an active task. Unfortunately, we missed to skip UNKNOWN_OFFSET when we compute the sum of the changelog offsets.
If a task had only one state store and it did not restore anything before the next rebalance, the stream thread wrote -2 (i.e., UNKNOWN_OFFSET) into the subscription as sum of the changelog offsets. During assignment, the leader interpreted the -2 as if the stream run the task as active although it might have run it as standby. This misinterpretation of the sentinel value resulted in unexpected task assignments.
Ports: KAFKA-10287 / #9066
Reviewers: A. Sophie Blee-Goldman <sophie@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>, John Roesler <vvcephei@apache.org>, Matthias J. Sax <mjsax@apache.org>
Reviewers: Mickael Maison <mickael.maison@gmail.com>, David Jacot <djacot@confluent.io>, Lee Dongjin <dongjin@apache.org>, Chia-Ping Tsai <chia7712@gmail.com>
The start() function for global stream thread only checks whether the thread is not running, as it needs to block until it finishes the initialization. This PR fixes this behavior by adding a check whether the thread is already in error state as well.
Reviewers: Guozhang Wang <wangguoz@gmail.com>, John Roesler <vvcephei@apache.org>
Modified KafkaProducer.sendOffsetsToTransaction() to be affected with max.block.ms, and added timeout test for blocking methods
Reviewers: Boyang Chen <boyang@confluent.io>, Guozhang Wang <wangguoz@gmail.com>, Xi Hu <huxi_2b@hotmail.com>
Add a broker to controller channel manager for use cases such as redirection and AlterIsr.
Reviewers: David Arthur <mumrah@gmail.com>, Colin P. McCabe <cmccabe@apache.org>, Ismael Juma <ismael@juma.me.uk>
Co-authored-by: Viktor Somogyi <viktorsomogyi@gmail.com>
Co-authored-by: Boyang Chen <boyang@confluent.io>
Update jersey license from CDDL to EPLv2
Author: Rens Groothuijsen <l.groothuijsen@alumni.maastrichtuniversity.nl>
Reviewer: Randall Hauch <rhauch@gmail.com>
A system test failed with the following error: global name 'self' is not defined
The reason was that `self` was accessed to log a message in a static method. This commit makes the method an instance method.
Reviewer: Matthias J. Sax <matthias@confluent.io>
I left out updates that could be risky. Preliminary testing indicates
we can build (including spotBugs) and run tests with Java 15 with
these changes. I will do more thorough testing once Java 15 reaches
release candidate stage in a few weeks.
Minor updates with mostly bug fixes:
- Scala: 2.12.11 -> 2.12.12 (compiler and collection performance improvements)
- Bouncy castle: 1.64 -> 1.66 (several bug fixes)
- HttpClient: 4.5.11 -> 4.5.12 (small number of bug fixes)
- Mockito: 3.3.3 -> 3.4.4 (several bug fixes and Java 15 support)
- Netty: 4.5.10 -> 4.5.11 (several bug fixes)
- Snappy: 1.1.7.3 -> 1.1.7.6 (small number of bug fixes)
- Zstd: 1.4.5-2 -> 1.4.5-6 (small number of bug fixes)
Gradle plugin and library upgrades:
- Gradle version plugins: 0.28.0 -> 0.29.0 (small number of bug fixes)
- Git: 4.0.1 -> 4.0.2 (small number of bug fixes)
- Scoverage plugin: 4.0.1 -> 4.0.2 (small number of bug fixes)
- Shadow plugin: 5.2.0 -> 6.0.0 (Java 15 support and require Gradle 6.0)
- Test Retry plugin: 1.1.5 -> 1.1.6 (small number of bug fixes)
- Spotless plugin: 4.4.4 -> 5.1.0 (several internal changes that should not matter to us)
- Spotbugs: 4.0.3 -> 4.0.6 (small number of bug fixes)
- Spotbugs plugin: 4.2.4 -> 4.4.4 (small number of bug fixes)
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
While debugging a rebalance scenario I found that inside rejoinNeededOrPending when we trigger rebalance due to metadata or subscription changes it is not logged, and hence it's actually a bit tricky to find out the reason of the triggered rebalance. I'm adding two INFO log4j entries to fill in the gap.
Other requestRejoin() calls are already covered.
Reviewers: Boyang Chen <boyang@confluent.io>
Java 11 has been recommended for a while, but the ops section had not been updated.
Also added `-XX:+ExplicitGCInvokesConcurrent` which has been in `kafka-run-class`
for a while. Finally, tweaked the text slightly to read better.
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
Set `replica.fetch.max.bytes` to `1` and produce multiple record batches to allow
for throttling to take place. This helps avoid a race condition where the
reassignment would complete more quickly than expected causing an
assertion to fail some times.
Reviewers: Lucas Bradstreet <lucas@confluent.io>, Jason Gustafson <jason@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>, Ismael Juma <ismael@juma.me.uk>
AbstractProcessorContext topic() throws NullPointerException when modifying a state store within the DSL from a punctuator. Reorder the check to avoid the NPE.
Co-authored-by: Ashish Roy <v-ashish.r@turvo.com>
Reviewers: Boyang Chen <boyang@confluent.io>
https://issues.apache.org/jira/browse/KAFKA-10305
When `kafka-consumer-perf-test.sh` is executed without required options or no options at all, only the error message is displayed. It's better off showing the usage as well like what we did for kafka-console-producer.sh.