Commit Graph

3840 Commits

Author SHA1 Message Date
David Jacot 3072b3d23e
MINOR: Fix AlterPartitionManager topic id handling in response handler (#12317)
f83d95d9a2 introduced topic ids in the AlterPartitionRequest/Response and we just found a bug in the request handling logic. The issue is the following.

When the `AlterPartitionManager` receives the response, it builds the `partitionResponses` map from `TopicIdPartition` to its result. `TopicIdPartition` is built from the response. Therefore, if version < 2 is used, `TopicIdPartition` will have the `ZERO` topic id. Then the `AlterPartitionManager` iterates over the items sent to find their responses. If an item has a topic id in its `TopicIdPartition` and version < 2 was used, the lookup fails because the sent key carries a topic id while the key built from the response does not.

This patch fixes the issue by using `TopicPartition` as a key in the `partitionResponses` map. This ensures that the result can be found regardless of the topic id being set or not.

Note that the case where version 2 is used is handled correctly because we already have logic to get back the topic name from the topic id in order to construct the `TopicPartition`.
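
The key mismatch described above can be reproduced in a few lines. This is a minimal sketch, not the actual `AlterPartitionManager` code: the two key classes are simplified stand-ins for Kafka's `TopicPartition` and `TopicIdPartition`, a `String` replaces the real topic id type, and the method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Simplified, hypothetical stand-ins for Kafka's key classes.
final class PartitionKeyDemo {
    static final String ZERO_ID = "ZERO"; // plays the role of the ZERO topic id

    static final class TopicPartition {
        final String topic;
        final int partition;
        TopicPartition(String topic, int partition) {
            this.topic = topic;
            this.partition = partition;
        }
        @Override public boolean equals(Object o) {
            if (!(o instanceof TopicPartition)) return false;
            TopicPartition that = (TopicPartition) o;
            return topic.equals(that.topic) && partition == that.partition;
        }
        @Override public int hashCode() { return Objects.hash(topic, partition); }
    }

    static final class TopicIdPartition {
        final String topicId;
        final TopicPartition topicPartition;
        TopicIdPartition(String topicId, TopicPartition topicPartition) {
            this.topicId = topicId;
            this.topicPartition = topicPartition;
        }
        // The topic id is part of equality, which is what breaks the lookup.
        @Override public boolean equals(Object o) {
            if (!(o instanceof TopicIdPartition)) return false;
            TopicIdPartition that = (TopicIdPartition) o;
            return topicId.equals(that.topicId) && topicPartition.equals(that.topicPartition);
        }
        @Override public int hashCode() { return Objects.hash(topicId, topicPartition); }
    }

    // Map keyed by TopicIdPartition: the sent item is found only if both ids match.
    static boolean foundByIdKey(String sentId, String responseId) {
        Map<TopicIdPartition, String> responses = new HashMap<>();
        responses.put(new TopicIdPartition(responseId, new TopicPartition("foo", 0)), "NONE");
        return responses.containsKey(new TopicIdPartition(sentId, new TopicPartition("foo", 0)));
    }

    // Map keyed by TopicPartition: the topic id no longer affects the lookup.
    static boolean foundByNameKey(String sentId, String responseId) {
        Map<TopicPartition, String> responses = new HashMap<>();
        TopicIdPartition fromResponse = new TopicIdPartition(responseId, new TopicPartition("foo", 0));
        responses.put(fromResponse.topicPartition, "NONE");
        TopicIdPartition sent = new TopicIdPartition(sentId, new TopicPartition("foo", 0));
        return responses.containsKey(sent.topicPartition);
    }
}
```

With a sent id of `"abc"` and a response id of `ZERO_ID`, `foundByIdKey` returns `false` while `foundByNameKey` returns `true`.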

`testPartialTopicIds` test was supposed to catch this but it didn't due to the ignorable topic id field being present. This patch fixes the test as well.

Reviewers: Kvicii <42023367+Kvicii@users.noreply.github.com>, Jason Gustafson <jason@confluent.io>
2022-06-21 18:24:33 +02:00
Viktor Somogyi-Vass d65d886798
KAFKA-6945: KIP-373, allow users to create delegation token for others (#10738)
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
2022-06-21 12:51:08 +05:30
Niket a126e3a622
KAFKA-13888; Addition of Information in DescribeQuorumResponse about Voter Lag (#12206)
This commit adds an Admin API handler for DescribeQuorum Request and also
adds in two new fields LastFetchTimestamp and LastCaughtUpTimestamp to
the DescribeQuorumResponse as described by KIP-836.

This commit does not implement the newly added fields. Those will be
added in a subsequent commit.

Reviewers: dengziming <dengziming1993@gmail.com>, David Jacot <djacot@confluent.io>, Jason Gustafson <jason@confluent.io>
2022-06-15 09:20:15 -07:00
Ismael Juma f421c008aa
MINOR: Remove ReplicaManagerTest.initializeLogAndTopicId (#12276)
The workaround is not required with mockito.

Reviewers: José Armando García Sancio <jsancio@users.noreply.github.com>, Divij Vaidya <diviv@amazon.com>, Kvicii <42023367+Kvicii@users.noreply.github.com>
2022-06-14 09:10:18 -07:00
Mickael Maison 4fcfd9ddc4
KAFKA-13958: Expose logdirs total/usable space via Kafka API (KIP-827) (#12248)
This implements KIP-827: https://cwiki.apache.org/confluence/display/KAFKA/KIP-827%3A+Expose+logdirs+total+and+usable+space+via+Kafka+API

Add TotalBytes and UsableBytes to DescribeLogDirsResponse
Add matching getters on LogDirDescription

Reviewers: Tom Bentley <tbentley@redhat.com>, Divij Vaidya <diviv@amazon.com>, Viktor Somogyi-Vass <viktorsomogyi@gmail.com>, Igor Soarez <soarez@apple.com>
2022-06-14 14:20:29 +02:00
David Jacot f83d95d9a2
KAFKA-13916; Fenced replicas should not be allowed to join the ISR in KRaft (KIP-841, Part 2) (#12181)
This patch implements [KIP-841](https://cwiki.apache.org/confluence/display/KAFKA/KIP-841%3A+Fenced+replicas+should+not+be+allowed+to+join+the+ISR+in+KRaft). Specifically, it implements the following:
* It introduces INELIGIBLE_REPLICA and NEW_LEADER_ELECTED error codes.
* The KRaft controller validates the new ISR provided in the AlterPartition request and rejects the call if any replica in the new ISR is not eligible to join the ISR, e.g. when fenced or shutting down. The leader reverts to the last committed ISR when its request is rejected for this reason.
* The partition leader also verifies that a replica is eligible before trying to add it back to the ISR. If it is not eligible, the ISR expansion is not triggered at all.
* Updates the AlterPartition API to use topic ids. Updates the AlterPartition manager to handle topic names/ids. Updates the ZK controller and the KRaft controller to handle topic names/ids depending on the version of the request used.
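
The controller-side ISR check in the list above can be sketched as follows. This is an illustrative sketch only: the class, method, and parameter names are hypothetical, and broker state is reduced to two sets of broker ids rather than the controller's real registration records.

```java
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the eligibility check; not the controller's real code.
final class IsrEligibilityDemo {
    enum Error { NONE, INELIGIBLE_REPLICA }

    // Rejects the proposed ISR if any member is fenced or in controlled shutdown.
    static Error validateNewIsr(List<Integer> newIsr,
                                Set<Integer> fencedBrokers,
                                Set<Integer> shuttingDownBrokers) {
        for (int brokerId : newIsr) {
            if (fencedBrokers.contains(brokerId) || shuttingDownBrokers.contains(brokerId)) {
                return Error.INELIGIBLE_REPLICA;
            }
        }
        return Error.NONE;
    }
}
```

On `INELIGIBLE_REPLICA`, the leader in this model would fall back to the last committed ISR instead of retrying with the same proposal.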

Reviewers: Artem Livshits <84364232+artemlivshits@users.noreply.github.com>, José Armando García Sancio <jsancio@users.noreply.github.com>, Jason Gustafson <jason@confluent.io>
2022-06-14 13:12:45 +02:00
David Arthur cc384054c6
KAFKA-13935 Fix static usages of IBP in KRaft mode (#12250)
* Set the minimum supported MetadataVersion to 3.0-IV1
* Remove MetadataVersion.UNINITIALIZED
* Relocate RPC version mapping for fetch protocols into MetadataVersion
* Replace static IBP calls with dynamic calls to MetadataCache

A side effect of removing the UNINITIALIZED metadata version is that the FeatureControlManager and FeatureImage will initialize themselves with the minimum KRaft version (3.0-IV1).

The rationale for setting the minimum version to 3.0-IV1 is so that we can avoid any cases of KRaft mode running with an old log message format (KIP-724 was introduced in 3.0-IV1). As a side-effect of increasing this minimum version, the feature level values decreased by one.

Reviewers: Jason Gustafson <jason@confluent.io>, Jun Rao <junrao@gmail.com>
2022-06-13 14:23:28 -04:00
Divij Vaidya 0a50005408
KAFKA-13929: Replace legacy File.createNewFile() with NIO.2 Files.createFile() (#12197)
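
The practical difference between the two calls is how an already existing file is reported: `File.createNewFile()` returns `false`, while `Files.createFile()` throws `FileAlreadyExistsException`. A minimal sketch of both, where the wrapper class and helper names are hypothetical and checked exceptions are rethrown unchecked only to keep the example short:

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class contrasting the legacy and NIO.2 calls.
final class CreateFileDemo {
    // Legacy API: returns false when the file already exists.
    static boolean legacyCreate(File file) {
        try {
            return file.createNewFile();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // NIO.2 API: an existing file surfaces as FileAlreadyExistsException.
    static boolean nioCreate(Path path) {
        try {
            Files.createFile(path);
            return true;
        } catch (FileAlreadyExistsException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Creates a scratch directory for trying the two methods.
    static Path tempDir() {
        try {
            return Files.createTempDirectory("create-file-demo");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```
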
Reviewers: Mickael Maison <mickael.maison@gmail.com>
2022-06-10 13:28:55 +02:00
Joel Hamill 249cd4461f
MINOR: Fix typo in Kafka config docs (#12268)
Reviewers: Guozhang Wang <wangguoz@gmail.com>
2022-06-08 13:51:41 -07:00
José Armando García Sancio 21490af989
MINOR; Test last committed record offset for Controllers (#12249)
As part of KIP-835, LastCommittedRecordOffset was added to the
KafkaController metric type. Make sure to test that metric.

Reviewers: Jason Gustafson <jason@confluent.io>
2022-06-08 10:45:04 -07:00
Jason Gustafson 4542acdc14
MINOR: Convert `ReassignPartitionsIntegrationTest` to KRaft (#12258)
Updates relevant tests in `ReassignPartitionsIntegrationTest` for KRaft. We skip JBOD tests since it is not supported and we skip `AlterPartition` upgrade tests since they are not relevant.

Reviewers: Kvicii <Karonazaba@gmail.com>, David Arthur <mumrah@gmail.com>
2022-06-07 20:59:24 -07:00
Jason Gustafson 49a0e0d72e
MINOR: Fix kraft timeout in LogOffsetTest (#12262)
Fixes the timeouts we have been seeing in `LogOffsetTest` when KRaft is enabled. The problem is the dependence on `MockTime`. In the KRaft broker, we need a steadily advancing clock for events in `KafkaEventQueue` to get executed. In the case of the timeouts, the broker was stuck with the next heartbeat event in the queue. We depended on the execution of this event in order to send the next heartbeat and complete the `initialCatchUpFuture` and finish broker startup. This caused the test to get stuck during initialization, which is probably why the `@Timeout` wasn't working. The patch fixes the problem by using system time instead because there was not a strong dependence on `MockTime`.
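
The failure mode can be illustrated with a toy deferred event. This is a sketch under stated assumptions, not `KafkaEventQueue` itself: `Clock`, `MockClock`, and `eventRuns` are hypothetical names, and the queue is reduced to a single event with a deadline.

```java
// Hypothetical model of a deadline-based event driven by a pluggable clock.
final class DeferredEventDemo {
    interface Clock {
        long nanoseconds();
    }

    // A mock clock only moves when a test advances it explicitly.
    static final class MockClock implements Clock {
        private long now = 0L;
        @Override public long nanoseconds() { return now; }
        void advance(long nanos) { now += nanos; }
    }

    // Polls until the deadline passes or the poll budget is exhausted.
    // Returns true if the deferred event would have been executed.
    static boolean eventRuns(Clock clock, long delayNs, int maxPolls) {
        long deadline = clock.nanoseconds() + delayNs;
        for (int i = 0; i < maxPolls; i++) {
            if (clock.nanoseconds() >= deadline) {
                return true;
            }
            try {
                Thread.sleep(1);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }
}
```

With a `MockClock` that nobody advances, `eventRuns` never reaches the deadline, which mirrors the stuck heartbeat event. Passing `System::nanoTime` as the clock lets the deadline pass on its own.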

Reviewers: David Arthur <mumrah@gmail.com>
2022-06-07 17:50:45 -07:00
David Arthur 806098ffe1
KAFKA-13410; Add a --release-version flag for storage-tool (#12245)
This patch removes the --metadata-version flag and adds a --release-version flag to the kafka-storage tool. This is not a breaking change because --metadata-version, which was introduced on May 18, has not been released yet.

Reviewers: José Armando García Sancio <jsancio@users.noreply.github.com>, dengziming <dengziming1993@gmail.com>
2022-06-07 11:25:40 -07:00
David Jacot 151ca12a56
KAFKA-13916; Fenced replicas should not be allowed to join the ISR in KRaft (#12240)
This PR implements the first part of KIP-841. Specifically, it implements the following:

1. Adds a new metadata version.
2. Adds the InControlledShutdown field to the BrokerRegistrationRecord and BrokerRegistrationChangeRecord and bump their versions. The newest versions are only used if the new metadata version is enabled.
3. Writes a BrokerRegistrationChangeRecord with InControlledShutdown set when a broker requests a controlled shutdown.
4. Ensures that fenced and in controlled shutdown replicas are not picked as leaders nor included in the ISR.
5. Adds or extends unit tests.

Reviewers: José Armando García Sancio <jsancio@users.noreply.github.com>, dengziming <dengziming1993@gmail.com>, David Arthur <mumrah@gmail.com>
2022-06-07 10:37:20 -07:00
Kvicii 09570f2540
KAFKA-13592; Fix flaky test ControllerIntegrationTest.testTopicIdUpgradeAfterReassigningPartitions (#11687)
Fixes several race conditions in the test case causing the flaky failure.

Reviewers: Divij Vaidya <divijvaidya13@gmail.com>, Jason Gustafson <jason@confluent.io>

Co-authored-by: Kvicii <Karonazaba@gmail.com>
2022-06-06 17:56:48 -07:00
Divij Vaidya 601051354b
MINOR: Correctly mark some tests as integration tests (#12223)
Also fix package name of `ListOffsetsIntegrationTest`.

Reviewers: dengziming <dengziming1993@gmail.com>, Jason Gustafson <jason@confluent.io>
2022-06-06 11:18:24 -07:00
dengziming 0b3ab4687e
KIP-835: metadata.max.idle.interval.ms should be much bigger than broker.heartbeat.interval.ms (#12238)
The active quorum controller periodically appends a NoOpRecord to advance the metadata LEO. However, when a broker starts up, it waits until its metadata LEO catches up with the controller LEO. We generate a NoOpRecord every 500ms but send a heartbeat request every 2000ms.

It's almost impossible for a broker to catch up with the controller LEO if the broker sends a query request every 2000ms but the controller LEO increases every 500ms, so the tests in KRaftClusterTest will fail.
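
The mismatch is simple arithmetic: the controller appends several records in the time between two heartbeats, so the offset a broker reports is stale by the time the next one is sent. A rough sketch, with a hypothetical class and method name and the two intervals taken from the message above:

```java
// Hypothetical arithmetic helper; not part of Kafka.
final class NoOpCadenceDemo {
    // Number of NoOpRecords the controller appends during one heartbeat interval.
    static long recordsPerHeartbeat(long noOpIntervalMs, long heartbeatIntervalMs) {
        return heartbeatIntervalMs / noOpIntervalMs;
    }
}
```

With a 500ms NoOpRecord interval and a 2000ms heartbeat interval, `recordsPerHeartbeat(500, 2000)` is 4. Once the idle interval exceeds the heartbeat interval, the result drops to 0.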

Reviewers: José Armando García Sancio <jsancio@users.noreply.github.com>, showuon <43372967+showuon@users.noreply.github.com>, Colin Patrick McCabe <cmccabe@confluent.io>
2022-06-03 11:39:30 -07:00
Rittika Adhikari 3467036e01
KAFKA-13803: Refactor Leader API Access (#12005)
This PR refactors the leader API access in the follower fetch path.

Added a LeaderEndPoint interface which serves all access to the leader.

Added a LocalLeaderEndPoint and a RemoteLeaderEndPoint, which implement the LeaderEndPoint interface to handle fetches from the leader in local and remote storage respectively.

Reviewers: David Jacot <djacot@confluent.io>, Kowshik Prakasam <kprakasam@confluent.io>, Jun Rao <junrao@gmail.com>
2022-06-03 09:12:06 -07:00
Kvicii 7e71483aed
MINOR: fix doc (#12243)
Reviewers: Luke Chen <showuon@gmail.com>
2022-06-03 15:56:13 +08:00
RivenSun d8d92f0f80
MINOR: Update the kafka-reassign-partitions script command in documentation (#12237)
Reviewers: Luke Chen <showuon@gmail.com>
2022-06-02 21:30:22 +08:00
Luke Chen fa33fb4d3c
KAFKA-13773: catch kafkaStorageException to avoid broker shutdown directly (#12136)
When the LogManager starts up and runs loadLogs, we expect to catch any IOException (e.g. an out-of-space error) and mark the log dir as offline. Later, we handle the offline logDir in ReplicaManager, so that the cleanShutdown file won't be created when all logDirs are offline. The broker shut down with a cleanShutdown file after a full disk because, during loadLogs and log recovery, we write the leader-epoch-checkpoint file. If an IOException is thrown there, we wrap it in a KafkaStorageException and rethrow it. Since we don't catch KafkaStorageException, the exception is caught elsewhere and the broker follows the clean shutdown path.

This PR fixes the issue by catching any KafkaStorageException caused by an IOException during loadLogs and marking the logDir as offline, so that the ReplicaManager can handle the offline logDirs.
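
The shape of the fix can be sketched as below. This is a simplified model rather than the `LogManager` code: `StorageException` stands in for `KafkaStorageException`, and `DirLoader` and `loadLogs` are hypothetical names.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of per-directory log loading with offline marking.
final class LoadLogsDemo {
    // Stand-in for KafkaStorageException (an unchecked wrapper).
    static final class StorageException extends RuntimeException {
        StorageException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    interface DirLoader {
        void load(String dir) throws IOException;
    }

    // Loads every directory and returns the ones that had to be marked offline.
    static List<String> loadLogs(List<String> dirs, DirLoader loader) {
        List<String> offlineDirs = new ArrayList<>();
        for (String dir : dirs) {
            try {
                loader.load(dir);
            } catch (IOException e) {
                offlineDirs.add(dir); // direct I/O failure
            } catch (StorageException e) {
                if (e.getCause() instanceof IOException) {
                    offlineDirs.add(dir); // wrapped I/O failure, same treatment
                } else {
                    throw e; // unrelated failures still propagate
                }
            }
        }
        return offlineDirs;
    }
}
```
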

Reviewers: Jun Rao <jun@confluent.io>, Alok Thatikunta <alok123thatikunta@gmail.com>
2022-06-02 14:15:51 +08:00
Jason Gustafson 0f9f7e6c78
MINOR: Enable kraft support in quota integration tests (#12217)
Enable kraft support in BaseQuotaTest and its extensions.

Reviewers: Kvicii <42023367+Kvicii@users.noreply.github.com>, dengziming <dengziming1993@gmail.com>
2022-06-01 16:56:10 -07:00
Colin Patrick McCabe 0ca9cd4d2d
MINOR: Several fixes and improvements for FeatureControlManager (#12207)
This PR fixes a bug where FeatureControlManager#replay(FeatureLevelRecord) was throwing an
exception if not all controllers in the quorum supported the feature being applied. While we do
want to validate this, it needs to be validated earlier, before the record is committed to the log.
Once the record has been committed to the log it should always be applied if the current controller
supports it.

Fix another bug where removing a feature was not supported once it had been configured. Note that
because we reserve feature level 0 for "feature not enabled", we don't need to use
Optional<VersionRange>; we can just return a range of 0-0 when the feature is not supported.

Allow the metadata version to be downgraded when UpgradeType.UNSAFE_DOWNGRADE has been set.
Previously we were unconditionally denying this even when this was set.

Add a builder for FeatureControlManager, so that we can easily add new parameters to the
constructor in the future. This will also be useful for creating FeatureControlManagers that are
initialized to a specific MetadataVersion.

Get rid of RemoveFeatureLevelRecord, since it's easier to just issue a FeatureLevelRecord with
the level set to 0.
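
The "level 0 means not enabled" convention can be sketched as a small map wrapper. This is an illustrative sketch only; the class and method names are hypothetical and do not mirror the real `FeatureControlManager` API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of finalized feature levels where 0 means "not enabled".
final class FeatureLevelsDemo {
    private final Map<String, Short> finalizedLevels = new HashMap<>();

    // Applies a feature level update; level 0 removes the feature,
    // so no separate "remove" record type is needed.
    void replay(String feature, short level) {
        if (level == 0) {
            finalizedLevels.remove(feature);
        } else {
            finalizedLevels.put(feature, level);
        }
    }

    // Unknown or removed features report level 0 instead of an empty Optional.
    short level(String feature) {
        return finalizedLevels.getOrDefault(feature, (short) 0);
    }
}
```
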

Set metadata.max.idle.interval.ms to 0 in RaftClusterSnapshotTest for more predictability.

Reviewers: David Arthur <mumrah@gmail.com>, dengziming <dengziming1993@gmail.com>
2022-06-01 16:09:38 -07:00
dengziming 1d6e3d6cb3
KAFKA-13845: Add support for reading KRaft snapshots in kafka-dump-log (#12084)
The kafka-dump-log command should accept files with a suffix of ".checkpoint". It should also decode and print using JSON the snapshot header and footer control records.

Reviewers: José Armando García Sancio <jsancio@users.noreply.github.com>
2022-06-01 14:49:00 -07:00
José Armando García Sancio 7d1b0926fa
KAFKA-13883: Implement NoOpRecord and metadata metrics (#12183)
Implement NoOpRecord as described in KIP-835. This is controlled by the new
metadata.max.idle.interval.ms configuration.

The KRaft controller schedules an event to write NoOpRecord to the metadata log if the metadata
version supports this feature. This event is scheduled at the interval defined in
metadata.max.idle.interval.ms. Brokers and controllers were improved to ignore the NoOpRecord when
replaying the metadata log.

This PR also adds four new metrics to the KafkaController metric group, as described in KIP-835.

Finally, there are some small fixes to leader recovery. This PR fixes a bug where metadata version
3.3-IV1 was not marked as changing the metadata. It also changes the ReplicaControlManager to
accept a metadata version supplier to determine if the leader recovery state is supported.

Reviewers: Colin P. McCabe <cmccabe@apache.org>
2022-06-01 10:48:24 -07:00
Clara Fang 31a84dd72e
KAFKA-13946; Add missing parameter to kraft test kit `ControllerNode.setMetadataDirectory()` (#12225)
Added parameter `metadataDirectory` to `setMetadataDirectory()` so that `this.metadataDirectory` would not be set to itself.
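
A setter without a parameter compiles but silently assigns the field to itself. A minimal sketch of the bug and the fix, using a hypothetical class rather than the real test-kit `ControllerNode` builder:

```java
// Hypothetical builder illustrating the self-assignment bug and its fix.
final class ControllerNodeBuilderDemo {
    private String metadataDirectory;

    // Buggy shape: with no parameter in scope, the right-hand side
    // resolves to the field, so the field is assigned to itself.
    ControllerNodeBuilderDemo setMetadataDirectoryBuggy() {
        this.metadataDirectory = metadataDirectory;
        return this;
    }

    // Fixed shape: the parameter shadows the field and is actually stored.
    ControllerNodeBuilderDemo setMetadataDirectory(String metadataDirectory) {
        this.metadataDirectory = metadataDirectory;
        return this;
    }

    String metadataDirectory() {
        return metadataDirectory;
    }
}
```
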

Reviewers: Kvicii <42023367+Kvicii@users.noreply.github.com>, dengziming <dengziming1993@gmail.com>, Jason Gustafson <jason@confluent.io>
2022-05-30 15:55:07 -07:00
Jason Gustafson 645c1ba526
MINOR: Fix buildResponseSend test cases for envelope responses (#12185)
The test cases we have in `RequestChannelTest` for `buildResponseSend` construct the envelope request incorrectly. The request is created using the envelope context, but also a reference to the wrapped envelope request object. This patch fixes `TestUtils.buildEnvelopeRequest` so that the wrapped request is built properly. It also fixes the dependence on this incorrect construction and consolidates the tests in `RequestChannelTest` to avoid duplication.

Reviewers: dengziming <dengziming1993@gmail.com>, David Jacot <djacot@confluent.io>
2022-05-30 11:34:36 -07:00
bozhao12 620ada9888
MINOR: Fix typo in ClusterTestExtensionsTest (#12218)
Reviewers: Kvicii <Karonazaba@gmail.com>, Jason Gustafson <jason@confluent.io>
2022-05-27 13:38:54 -07:00
Colin Patrick McCabe 7143267f71
MINOR: Fix some bugs with UNREGISTER_BROKER
Fix some bugs in the KRaft unregisterBroker API and add a junit test.

1. kafka-cluster-tool.sh unregister should fail if no broker ID is passed.

2. UnregisterBrokerRequest must be marked as a KRaft broker API so 
that KRaft brokers can receive it.

3. KafkaApis.scala must forward UNREGISTER_BROKER to the controller.

Reviewers: Jason Gustafson <jason@confluent.io>, dengziming <dengziming1993@gmail.com>, David Jacot <djacot@confluent.io>
2022-05-26 14:07:29 -07:00
David Arthur 4efdc1a310
MINOR: Consolidate FinalizedFeatureCache into MetadataCache (#12214)
Reviewers: Colin P. McCabe <cmccabe@apache.org>
2022-05-26 16:25:58 -04:00
Jason Gustafson 43160bc476
MINOR: Add timeout to LogOffsetTest (#12213)
Reviewers: Kvicii <Karonazaba@gmail.com>, David Arthur <mumrah@gmail.com>
2022-05-26 16:07:54 -04:00
David Jacot 76477ffd2d
KAFKA-13858; Kraft should not shutdown metadata listener until controller shutdown is finished (#12187)
When the kraft broker begins controlled shutdown, it immediately disables the metadata listener. This means that metadata changes as part of the controlled shutdown do not get sent to the respective components. For partitions that the broker is follower of, that is what we want. It prevents the follower from being able to rejoin the ISR while still shutting down. But for partitions that the broker is leading, it means the leader will remain active until controlled shutdown finishes and the socket server is stopped. That delay can be as much as 5 seconds and probably even worse.

This PR revises the controlled shutdown procedure as follows:
* The broker signals to the replica manager that it is about to start the controlled shutdown.
* The broker requests a controlled shutdown to the controller.
* The controller moves leaders off from the broker, removes the broker from any ISR that it is a member of, and writes those changes to the metadata log.
* When the broker receives a partition metadata change, it checks whether it is in the ISR. If it is, it updates the partition as usual. If it is not, or if there is no leader defined (as would be the case if the broker was the last member of the ISR), it stops the fetcher/replica. This basically stops all the partitions for which the broker was part of their ISR.
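
The last step of the procedure above is a per-partition decision made while the broker is shutting down. A minimal sketch, assuming hypothetical names and a partition state reduced to an optional leader id and an ISR list:

```java
import java.util.List;
import java.util.OptionalInt;

// Hypothetical decision function for a broker in controlled shutdown.
final class ControlledShutdownDemo {
    enum Action { UPDATE_PARTITION, STOP_REPLICA }

    // Keeps the partition only if a leader exists and this broker is still in the ISR.
    static Action onPartitionChange(int localBrokerId, OptionalInt leaderId, List<Integer> isr) {
        if (!leaderId.isPresent() || !isr.contains(localBrokerId)) {
            return Action.STOP_REPLICA;
        }
        return Action.UPDATE_PARTITION;
    }
}
```
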

When the broker is a replica of a partition but it is not in the ISR, the controller does not do anything. The leader epoch is not bumped. In this particular case, the follower will continue to run until the replica manager shuts down. In this time, the replica could become in-sync and the leader could try to bring it back to the ISR. This remaining issue will be addressed separately.

Reviewers: Jason Gustafson <jason@confluent.io>
2022-05-25 16:09:01 -07:00
dengziming 54d60ced86
KAFKA-13833: Remove the min_version_level from the finalized version range written to ZooKeeper (#12062)
Reviewers: David Arthur <mumrah@gmail.com>
2022-05-25 14:02:34 -04:00
Divij Vaidya f6ba10ef9c
MINOR: Fix flaky test TopicCommandIntegrationTest.testDescribeAtMinIsrPartitions(String).quorum=kraft (#12189)
The flaky test failed in CI: https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka-pr/detail/PR-12184/1/tests/

The test fails because it does not wait for metadata to be propagated across brokers before killing a broker which may lead to it getting stale information. Note that a similar test was done in #12104 for a different test.

Reviewers: Kvicii Y, Ziming Deng, Jason Gustafson <jason@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
2022-05-21 10:33:44 -07:00
dengziming 6380652a5a
KAFKA-13863; Prevent null config value when create topic in KRaft mode (#12109)
This patch ensures consistent handling of null-valued topic configs between the zk and kraft controller. Prior to this patch, we returned INVALID_REQUEST in zk mode and it was not an error in kraft. After this patch, we return INVALID_CONFIG consistently for this case.

Reviewers: Jason Gustafson <jason@confluent.io>
2022-05-19 09:46:48 -07:00
Jason Gustafson 8efdbce523
KAFKA-13837; Return an error from Fetch if follower is not a valid replica (#12150)
When a partition leader receives a `Fetch` request from a replica which is not in the current replica set, the behavior today is to return a successful fetch response, but with empty data. This causes the follower to retry until metadata converges without updating any state on the leader side. It is clearer in this case to return an error, so that the metadata inconsistency is visible in logging and so that the follower backs off before retrying. 

In this patch, we use `UNKNOWN_LEADER_EPOCH` when the `Fetch` request includes the current leader epoch. The way we see this is that the leader is validating the (replicaId, leaderEpoch) tuple. When the leader returns `UNKNOWN_LEADER_EPOCH`, it means that the leader does not expect the given leaderEpoch from that replica. If the request does not include a leader epoch, then we use `NOT_LEADER_OR_FOLLOWER`. We can take a similar interpretation for this case: the leader is rejecting the request because it does not think it should be the leader for that replica. But mainly these errors ensure that the follower will retry the request.
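
The error selection described above can be sketched as a small function. The error names come from the message. The class, method, and parameter names are hypothetical.

```java
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch of how the leader picks an error for an unknown follower.
final class FetchValidationDemo {
    enum Error { NONE, UNKNOWN_LEADER_EPOCH, NOT_LEADER_OR_FOLLOWER }

    static Error validateFollower(int replicaId,
                                  Optional<Integer> currentLeaderEpoch,
                                  Set<Integer> assignedReplicas) {
        if (assignedReplicas.contains(replicaId)) {
            return Error.NONE;
        }
        // The (replicaId, leaderEpoch) pair is rejected when an epoch was sent;
        // otherwise the leader falls back to NOT_LEADER_OR_FOLLOWER.
        return currentLeaderEpoch.isPresent()
            ? Error.UNKNOWN_LEADER_EPOCH
            : Error.NOT_LEADER_OR_FOLLOWER;
    }
}
```
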

As a part of this patch, I have refactored the way that the leader updates follower fetch state. Previously, the process was a little convoluted. We sent the fetch from `ReplicaManager` down to `Partition.readRecords`, then we iterated over the results and called `Partition.updateFollowerFetchState`. It is more straightforward to update state directly as a part of `readRecords`. All we need to do is pass through the `FetchParams`. This also prevents an unnecessary copy of the read results.

Reviewers: David Jacot <djacot@confluent.io>
2022-05-18 20:58:20 -07:00
bozhao12 b4f35c9ce0
MINOR: Fix typo in ReplicaManagerTest (#12178)
Reviewer: Luke Chen <showuon@gmail.com>
2022-05-19 10:28:47 +08:00
Jason Gustafson 1802c6dcb5
MINOR: Enable KRaft in `TransactionsTest` (#12176)
Enable support for KRaft in `TransactionsTest`. 

Reviewers: David Arthur <mumrah@gmail.com>
2022-05-18 14:07:59 -07:00
David Arthur 1135f22eaf
KAFKA-13830 MetadataVersion integration for KRaft controller (#12050)
This patch builds on #12072 and adds controller support for metadata.version. The kafka-storage tool now allows a
user to specify a specific metadata.version to bootstrap into the cluster, otherwise the latest version is used.

Upon the first leader election of the KRaft quorum, this initial metadata.version is written into the metadata log. When
writing snapshots, a FeatureLevelRecord for metadata.version will be written out ahead of other records so we can
decode things at the correct version level.

This also includes additional validation in the controller when setting feature levels. It will now check that a given
metadata.version is supportable by the quorum, not just the brokers.

Reviewers: José Armando García Sancio <jsancio@gmail.com>, Colin P. McCabe <cmccabe@apache.org>, dengziming <dengziming1993@gmail.com>, Alyssa Huang <ahuang@confluent.io>
2022-05-18 12:08:36 -07:00
runom cf34a2e4b0
MINOR: Replace string literal with constant in RequestChannel (#12134)
Replace the "RequestsPerSec" literal value with the pre-existing constant `RequestsPerSec`.

Reviewers: Divij Vaidya <divijvaidya13@gmail.com>, Jason Gustafson <jason@confluent.io>
2022-05-18 11:31:15 -07:00
dengziming 67d00e25e9
MINOR: Enable some AdminClient integration tests (#12110)
Enable KRaft in `AdminClientWithPoliciesIntegrationTest` and `PlaintextAdminIntegrationTest`. There are some tests not enabled or not as expected yet:

- testNullConfigs, see KAFKA-13863
- testDescribeCluster and testMetadataRefresh, currently we don't get the real controller in KRaft mode so the test may not run as expected

This patch also changes the exception type raised from invalid `IncrementalAlterConfig` requests with the `SUBTRACT` and `APPEND` operations. When the configuration value type is not a list, we now raise `INVALID_CONFIG` instead of `INVALID_REQUEST`.

Reviewers: Luke Chen <showuon@gmail.com>, Jason Gustafson <jason@confluent.io>
2022-05-18 09:39:26 -07:00
David Jacot a1cd1d1839
MINOR: Followers should not have any remote replica states left over from previous leadership (#12138)
This patch ensures that followers don't have any remote replica states left over from previous leadership.

Reviewers: Jason Gustafson <jason@confluent.io>
2022-05-18 09:32:48 +02:00
bozhao12 f36de0744b
MINOR: Remove redundant metric reset in KafkaController (#12158)
The following variables in `KafkaController` are used for metrics:
```
    offlinePartitionCount 
    preferredReplicaImbalanceCount
    globalTopicCount 
    globalPartitionCount
    topicsToDeleteCount 
    replicasToDeleteCount 
    ineligibleTopicsToDeleteCount 
    ineligibleReplicasToDeleteCount 
```
When the controller goes from active to non-active, these variables will be reset to 0. Currently, this is done explicitly in `KafkaController.onControllerResignation()` and also after every loop iteration in `KafkaController.updateMetrics()`.
The first of these is redundant and can be removed. This patch fixes this and also simplifies `updateMetrics`.
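
Why the explicit reset is redundant can be seen in a reduced model with two of the counters. This is a sketch only: the class is hypothetical, and the real `updateMetrics` derives its values from controller state rather than from arguments.

```java
// Hypothetical two-counter model of the controller metrics.
final class ControllerMetricsDemo {
    boolean active = false;
    int offlinePartitionCount = 0;
    int globalTopicCount = 0;

    // Every update already zeroes the counters when the controller is not active.
    void updateMetrics(int offlinePartitions, int topics) {
        offlinePartitionCount = active ? offlinePartitions : 0;
        globalTopicCount = active ? topics : 0;
    }

    // No explicit counter reset here: the next updateMetrics call handles it.
    void onControllerResignation() {
        active = false;
    }
}
```
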

Reviewers: Jason Gustafson <jason@confluent.io>
2022-05-17 15:40:05 -07:00
dengziming 5f039bae1c
KAFKA-13905: Fix failing ServerShutdownTest.testCleanShutdownAfterFailedStartupDueToCorruptLogs (#12165)
Reviewers: Jason Gustafson <jason@confluent.io>, Luke Chen <showuon@gmail.com>
2022-05-17 16:31:28 +08:00
David Jacot 972b76561a
MINOR: Rename remaining `zkVersion` to `partitionEpoch` in `PartitionTest` (#12147)
Reviewers: Kvicii <42023367+Kvicii@users.noreply.github.com>, dengziming <dengziming1993@gmail.com>, Jason Gustafson <jason@confluent.io>
2022-05-17 08:58:43 +02:00
Jason Gustafson 1103c76d63
KAFKA-13899: Use INVALID_CONFIG error code consistently in AlterConfig APIs (#12162)
In the AlterConfigs/IncrementalAlterConfigs zk handler, we return `INVALID_REQUEST` and `INVALID_CONFIG` inconsistently. The problem is in `LogConfig.validate`. We may either return `ConfigException` or `InvalidConfigException`. When the first of these is thrown, we catch it and convert it to `INVALID_REQUEST`. If the latter is thrown, then we return `INVALID_CONFIG`. It seems more appropriate to return `INVALID_CONFIG` consistently, which is what the KRaft implementation already does. This patch fixes this and converts a few integration tests to KRaft.
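
The consistent mapping can be sketched as a single catch site. The two exception classes below are local stand-ins named after the ones mentioned in the message, not the real Kafka classes, and `validate` is a hypothetical wrapper.

```java
// Hypothetical sketch: both validation failures map to the same error code.
final class ConfigErrorMappingDemo {
    static final class ConfigException extends RuntimeException {
        ConfigException(String message) { super(message); }
    }

    static final class InvalidConfigException extends RuntimeException {
        InvalidConfigException(String message) { super(message); }
    }

    enum Error { NONE, INVALID_CONFIG }

    // Runs a validation step and translates either exception to INVALID_CONFIG.
    static Error validate(Runnable validation) {
        try {
            validation.run();
            return Error.NONE;
        } catch (ConfigException | InvalidConfigException e) {
            return Error.INVALID_CONFIG;
        }
    }
}
```
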

Reviewers: José Armando García Sancio <jsancio@users.noreply.github.com>
2022-05-16 17:41:23 -07:00
Joel Hamill 06051988a2
MINOR: Clarify impact of num.replica.fetchers (#12153)
The documentation for `num.replica.fetchers` should emphasize the fact that the count applies to each source broker individually. Also mention the tradeoff.

Reviewers: Jason Gustafson <jason@confluent.io>
2022-05-16 09:38:06 -07:00
Divij Vaidya 5fae84e4d1
KAFKA-13851: Add integration tests for DeleteRecords API (#12087)
Reviewers: Luke Chen <showuon@gmail.com>, dengziming <dengziming1993@gmail.com>
2022-05-16 15:50:50 +08:00
Colin Patrick McCabe a3e0af94f2
MINOR: convert some tests to KRaft (#12155)
Convert EndToEndClusterIdTest, ConsumerGroupCommandTest,
ListConsumerGroupTest, and LogOffsetTest to test KRaft mode.

Reviewers: Jason Gustafson <jason@confluent.io>, dengziming <dengziming1993@gmail.com>
2022-05-13 17:29:47 -07:00
vamossagar12 f96e381387
KAFKA-13746: Attempt to fix flaky test by waiting on metadata update (#12104)
Reviewers: dengziming <dengziming1993@gmail.com>, Guozhang Wang <wangguoz@gmail.com>
2022-05-13 17:09:47 -07:00