This patch ensures that internal topics are included when searching for hanging transactions with the `--broker-id` argument in `kafka-transactions.sh`.
Reviewers: David Jacot <djacot@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Added asynchronous API support for RemoeLogMetadataManager add/update/put methods.
Implemented the changes on default topic based RemoteLogMetadataManager.
Refactored the respective tests to cover the introduced asynchronous APIs.
Reviewers: Cong Ding <cong@ccding.com>, Jun Rao <junrao@gmail.com>
ssh and rsync access has been removed from home.apache.org.
Removing the commands from release.py and replacing them with a note to make sure they are manually uploaded with an sftp client instead.
Reviewers: David Jacot <djacot@confluent.io>, Ismael Juma <ismael@juma.me.uk>
Avoid using the non-public API KafkaFutureImpl in the Admin client's `*Result` class constructors.
This is particularly problematic for `DescribeConsumerGroupsResult` which currently has a
public constructor. For the other classes the rationale is simply consistency with the majority of
the `*Result` classes.
Reviewers: Ismael Juma <ismael@juma.me.uk, David Jacot <djacot@confluent.io>, Luke Chen <showuon@gmail.com>
KAFKA-13243: KIP-773 Differentiate metric latency measured in ms and ns
Implementation of KIP-773
Deprecates inconsistent metrics bufferpool-wait-time-total,
io-waittime-total, and iotime-total.
Introduces new metrics bufferpool-wait-time-ns-total,
io-wait-time-ns-total, and io-time-ns-total with the same semantics as
before.
Adds metrics (old and new) in ops.html.
Adds upgrade guide for these metrics.
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Tom Bentley <tbentley@redhat.com>
This patch adds the `ActiveBrokerCount` and the `FencedBrokerCount` metrics to the ZK controller. Note that `FencedBrokerCount` is always set to zero in the ZK controller.
Reviewers: Jason Gustafson <jason@confluent.io>
`ReplicationTest.test_replication_with_broker_failure` in KRaft mode sometimes fails with the following error in the log:
```
[2021-08-31 11:31:25,092] ERROR [ReplicaFetcher replicaId=1, leaderId=2, fetcherId=0] Unexpected error occurred while processing data for partition __consumer_offsets-1 at offset 31727 (kafka.server.ReplicaFetcherThread)java.lang.IllegalStateException: Offset mismatch for partition __consumer_offsets-1: fetched offset = 31727, log end offset = 31728. at kafka.server.ReplicaFetcherThread.processPartitionData(ReplicaFetcherThread.scala:194) at kafka.server.AbstractFetcherThread.$anonfun$processFetchRequest$8(AbstractFetcherThread.scala:545) at scala.Option.foreach(Option.scala:437) at kafka.server.AbstractFetcherThread.$anonfun$processFetchRequest$7(AbstractFetcherThread.scala:533) at kafka.server.AbstractFetcherThread.$anonfun$processFetchRequest$7$adapted(AbstractFetcherThread.scala:532) at kafka.utils.Implicits$MapExtensionMethods$.$anonfun$forKeyValue$1(Implicits.scala:62) at scala.collection.convert.JavaCollectionWrappers$JMapWrapperLike.foreachEntry(JavaCollectionWrappers.scala:359) at scala.collection.convert.JavaCollectionWrappers$JMapWrapperLike.foreachEntry$(JavaCollectionWrappers.scala:355) at scala.collection.convert.JavaCollectionWrappers$AbstractJMapWrapper.foreachEntry(JavaCollectionWrappers.scala:309) at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:532) at kafka.server.AbstractFetcherThread.$anonfun$maybeFetch$3(AbstractFetcherThread.scala:216) at kafka.server.AbstractFetcherThread.$anonfun$maybeFetch$3$adapted(AbstractFetcherThread.scala:215) at scala.Option.foreach(Option.scala:437) at kafka.server.AbstractFetcherThread.maybeFetch(AbstractFetcherThread.scala:215) at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:197) at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:99)[2021-08-31 11:31:25,093] WARN [ReplicaFetcher replicaId=1, leaderId=2, fetcherId=0] Partition __consumer_offsets-1 marked as failed (kafka.server.ReplicaFetcherThread)
```
The issue is due to a race condition in `ReplicaManager#applyLocalFollowersDelta`. The `InitialFetchState` is created and populated before the partition is removed from the fetcher threads. This means that the fetch offset of the `InitialFetchState` could be outdated when the fetcher threads are re-started because the fetcher threads could have incremented the log end offset in between.
The patch fixes the issue by removing the partitions from the replica fetcher threads before creating the `InitialFetchState` for them.
Reviewers: Jason Gustafson <jason@confluent.io>
### Committer Checklist (excluded from commit message)
- [ ] Verify design and implementation
- [ ] Verify test coverage and CI build status
- [ ] Verify documentation (including upgrade notes)
Description
Streams disables the write-ahead log (WAL) provided by RocksDB since it replicates the data in changelog topics. Hence, it does not make much sense to set WAL-related configs for RocksDB.
Proposed solution
Ignore any WAL-related configuration and state in the log that we are ignoring them.
Co-authored-by: Tomer Wizman <tomer.wizman@personetics.com>
Co-authored-by: Bruno Cadonna <cadonna@apache.org>
Reviewers: Boyang Chen <boyang@apache.org>, Bruno Cadonna <cadonna@apache.org>
AlterClientQuotas, DescribeProducers and FindCoordinator have issues when building error responses. This can lead to brokers returning responses without errors even when some have happened.
Reviewers: David Jacot <djacot@confluent.io>, Jason Gustafson <jason@confluent.io>, Luke Chen <showuon@gmail.com>
As detailed in KAFKA-12994, unit tests using the old API should be either removed or migrated to the new API.
This PR migrates relevant tests in JoinWindowsTest.java and SessionWindowsTest.java.
Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>
Migrate Suppress as part of the migration of KStream/KTable
operations to the new Processor API (KAFKA-8410)
Reviewers: John Roesler <vvcephei@apache.org>
I added the final via 2f36001987 to catch overriding mistakes
since the implementation was moved from the deprecated and
overloaded `close` with two parameters to the no-arg
`close`.
I didn't realize then that `MockConsumer` is a public
API (seems like a bit of a mistake since we tweak the
implementation and sometimes adds methods without a KIP).
Given that this is a public API, I have also moved the implementation
of `close` to the one arg overload. This makes it easier for a
subclass to have specific overriding behavior depending on the
timeout.
Reviewers: David Jacot <djacot@confluent.io>
We restore the 3.4.x/3.5.x behavior unless the caller has set the property (note that ZKConfig
auto configures itself if certain system properties have been set).
I added a unit test that fails without the change and passes with it.
I also refactored the code to streamline the way we handle parameters passed to
KafkaZkClient and ZooKeeperClient.
See https://github.com/apache/zookeeper/pull/1129 for the details on why the behavior
changed in 3.6.0.
Credit to @rondagostino for finding and reporting this issue.
Reviewers: David Jacot <djacot@confluent.io>
Java 17 is at release candidate stage and it will be a LTS release once
it's out (previous LTS release was Java 11).
Details:
* Replace Java 16 with Java 17 in Jenkins and Readme.
* Replace `--illegal-access=permit` (which was removed from Java 17)
with `--add-opens` for the packages we require internal access to.
Filed KAFKA-13275 for updating the tests not to require `--add-opens`
(where possible).
* Update `release.py` to use JDK8. and JDK 17 (instead of JDK 8 and JDK 15).
* Removed all but one Streams test from `testsToExclude`. The
Connect test exclusion list remains the same.
* Add notable change to upgrade.html
* Upgrade to Gradle 7.2 as it's required for proper Java 17 support.
* Upgrade mockito to 3.12.4 for better Java 17 support.
* Adjusted `KafkaRaftClientTest` and `QuorumStateTest` not to require
private access to `jdk.internal.util.random`.
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
Currently KStreamMap and KStreamFlatMap classes will throw NPE if the call to KeyValueMapper#apply() return null. This commit checks whether the result of KeyValueMapper#apply() is null and throws a more meaningful error message for better debugging.
Two unit tests are also added to check if we successfully captured nulls.
Reviewers: Josep Prat <josep.prat@aiven.io>, Luke Chen <showuon@gmail.com>, Bruno Cadonna <cadonna@apache.org>
This patch adds `REBALANCE_IN_PROGRESS` error as retriable error for `AlterConsumerGroupOffsetsHandler`, and tests for it.
Reviewers: David Jacot <djacot@confluent.io>
The controller can skip sending updateMetadataRequest during the broker failure callback if there are offline partitions and the deleted brokers don't host any partitions.
Reviewers: Jun Rao <junrao@gmail.com>
* URL wasn't urlencoded when forwarded reconfiguration to leader connect worker
* handling previously swallowed errors in connect RestClient
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Viktor Somogyi-Vass <viktorsomogyi@gmail.com>
Co-authored-by: Andras Katona <akatona@cloudera.com>
Co-authored-by: Daniel Urban <durban@cloudera.com>
After a topic is deleted, the topic is marked for deletion, create topic with the same name throw exception topic already exists. It should indicate the topic is marked for deletion.
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Instead of letting all RuntimeExceptions go through and be processed by the uncaught exception handler, IllegalStateException and IllegalArgumentException are not passed through and fail fast. In this PR when setting the uncaught exception handler we check if the exception is in an "exclude list", if so, we terminate the client, otherwise we continue as usual.
Added test checking this new case. Added integration test checking that user defined exception handler is not used when an IllegalStateException is thrown.
Reviewers: Bruno Cadonna <cadonna@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Clearing under-replicated-partitions helps ensure that partitions do not become unavailable longer than necessary as brokers are rolled. This prevents flakiness due to transaction timeouts.
Reviewers: Luke Chen <showuon@gmail.com>, Ismael Juma <ismael@juma.me.uk>
* Add the following producer metrics:
flush-time-total: cumulative sum of time elapsed during in flush.
txn-init-time-total: cumulative sum of time elapsed during in initTransactions.
txn-begin-time-total: cumulative sum of time elapsed during in beginTransaction.
txn-send-offsets-time-total: cumulative sum of time elapsed during in sendOffsetsToTransaction.
txn-commit-time-total: cumulative sum of time elapsed during in commitTransaction.
txn-abort-time-total: cumulative sum of time elapsed during in abortTransaction.
* Add the following consumer metrics:
commited-time-total: cumulative sum of time elapsed during in committed.
commit-sync-time-total: cumulative sum of time elapsed during in commitSync.
* Add a total-blocked-time metric to streams that is the sum of:
consumer’s io-waittime-total
consumer’s iotime-total
consumer’s committed-time-total
consumer’s commit-sync-time-total
restore consumer’s io-waittime-total
restore consumer’s iotime-total
admin client’s io-waittime-total
admin client’s iotime-total
producer’s bufferpool-wait-time-total
producer's flush-time-total
producer's txn-init-time-total
producer's txn-begin-time-total
producer's txn-send-offsets-time-total
producer's txn-commit-time-total
producer's txn-abort-time-total
Reviewers: Bruno Cadonna <cadonna@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
MINOR Refactored the existing CheckpointFile in core module, moved to server-common module.
Refactored CheckpointFile to server-common module as a Java class and it is reused by LeaderCheckpointFile, OffsetCheckpointFile.
This will be used by CommittedOffsetsFile which checkpoints remote log metadata partitions with respective offsets in the default RemoteLogMetadataManager implementation.
Existing tests are available for LeaderCheckpointFile, OffsetCheckpointFile.
Reviewers: Jun Rao <junrao@gmail.com>
Add a new case to the list of possible retriable exceptions for the flaky tests to take care of threads starting up
Reviewers: Leah Thomas <lthomas@confluent.io>, Anna Sophie Blee-Goldman
The original code uses a RemoteLogManagerConfig class to store KIP-405 configs and adds three configs to LogConfig. This makes the code complicated and developers may be confused.
This PR allows us to access RemoteLogManagerConfig from KafkaConfig and do the same for LogConfig. Kafka developers will see the same interface for the KIP-405 configs. After this change, if we want to read remoteStorageEnable we should use LogConfig.tieredLogConfig.remoteStorageEnable instead of LogConfig.remoteStorageEnable. The same for localRetentionMs and localRetentionBytes. If we want to read configs in RemoteLogManagerConfig, we should use KafkaConfig.tieredKafkaConfig.xxx.
Reviewers: Satish Duggana <satishd@apache.org>, Kowshik Prakasam <kprakasam@confluent.io>, Jun Rao <junrao@gmail.com>
This patch has a couple small improvements to `TransactionalMessageCopier` logging:
- Log all fatal exceptions which cause the copier to shutdown unexpectedly
- Log all non-fatal exceptions which cause the copier to abort a transaction
Reviewers: David Jacot <djacot@confluent.io>
The FetchSessionHandler had a small bug in the session build method where we did not consider building a session where no partitions were added and the session previously did not use topic IDs. (ie, it was relying on at least one partition being added to signify whether topic IDs were present)
Due to this, we could send forgotten partitions with the zero UUID. This would always result in an exception and closed session.
This patch fixes the logic to check that any forgotten partitions have topic IDs. There is also a test added for the empty session situation when topic IDs are used and when topic names are used.
Reviewers: David Jacot <djacot@confluent.io>, Jun Rao <junrao@gmail.com>
When debugging issues with partition state, it's very useful to know the zkVersion that was written. This patch adds the zkVersion of LeaderAndIsr in a few more places.
This patch ensures that the transaction message copier is fully started in `start_node`. Without this, it is possible that `stop_node` is called before the process is started which results in not stopping it at all.
Reviewers: Jason Gustafson <jason@confluent.io>
This patch refactors `ReplicaManager#becomeLeaderOrFollower` to avoid having to re-iterate over all the partitions to determine which ones should become leaders and which ones should become followers.
The patch also refactors how partitions are marked as offline when the log can't be created. Before the patch, we were iterating over all the partitions in the request or in the delta to mark them as offline is the log was not present. Now, we mark them as failed directly if the log can not be created.
Reviewers: Luke Chen <showuon@gmail.com>, Jason Gustafson <jason@confluent.io>
The GraphNode#writeToTopology method accepts a Properties input parameter, but never uses it in any of its implementations. We can remove this parameter to clean things up and help make it clear that writing nodes to the topology doesn't involve the app properties.
Reviewers: Bruno Cadonna <cadonna@confluent.io>
Currently in OrderedBytes#upperRange method, we'll check key bytes 1 by 1, to see if there's a byte value >= first timestamp byte value, so that we can skip the following key bytes, because we know compareTo will always return 0 or 1. However, in most cases, the first timestamp byte is always 0, more specifically the upperRange is called for both window store and session store. For former, the suffix is in timestamp, Long.MAX_VALUE and for latter the suffix is in Long.MAX_VALUE, timestamp. For Long.MAX_VALUE the first digit is not 0, for timestamp it could be 0 or not, but as long as it is up to "now" (i.e. Aug. 23rd) then the first byte should be 0 since the value is far smaller than what a long typed value could have. So in practice for window stores, that suffix's first byte has a large chance to be 0, and hence worth optimizing for.
This PR optimizes the not all query cases by not checking the key byte 1 by 1 (because we know the unsigned integer will always be >= 0), instead, put all bytes and timestamp directly. So, we won't have byte array copy in the end either.
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Currently the consumer will reset state after any retriable error during a rebalance. This includes coordinator disconnects as well as coordinator changes. The impact of this is that rebalances get delayed since they will be blocked until the session timeout of the old memberId expires.
The patch here fixes the problem by not resetting the member state after a retriable error.
Reviewers: David Jacot <djacot@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
A few small logging improvements:
- Only print error when it is not NONE
- Full list of remaining partitions are printed only at debug level
- Only backoff and print retry logging if there are remaining retries
Reviewers: Luke Chen <showuon@gmail.com>, David Jacot <djacot@confluent.io>
Small locking improvement to drop the group metadata lock before invoking the response callback in `GroupCoordinator#handleHeartbeat`.
Reviewers: David Jacot <djacot@confluent.io>