Commit Graph

255 Commits

Author SHA1 Message Date
Colin P. McCabe 5a1aa1a670 MINOR: Standardize controller log4j output for replaying records
Standardize controller log4j output for replaying important records. The log message should include
word "replayed" to make it clear that this is a record replay. Log the replay of records for ACLs,
client quotas, and producer IDs, which were previously not logged. Also fix a case where we weren't
logging changes to broker registrations.

AclControlManager, ClientQuotaControlManager, and ProducerIdControlManager didn't previously have a
log4j logger object, so this PR adds one. It also converts them to using Builder objects. This
makes junit tests more readable because we don't need to specify paramaters where the test can use
the default (like LogContexts).

Throw an exception in replay if we get another TopicRecord for a topic which already exists.
2023-07-13 10:18:34 -07:00
Ron Dagostino edd64fa251
MINOR: more KRaft Metadata Image tests (#13724)
Adds additional testing for the various KRaft *Image classes. For every image that we create we already test that we can get there by applying all the records corresponding to that image written out as a list of records. This patch adds more tests to confirm that we can get to each such final image with intermediate stops at all possible intermediate images along the way.

Reviewers: Colin P. McCabe <cmccabe@apache.org>, David Arthur <mumrah@gmail.com>
2023-07-10 10:01:10 -04:00
David Arthur 726d277c0a
MINOR: Move some things around in KRaftMigrationDriver (#13978)
Reviewers: Colin P. McCabe <cmccabe@apache.org>
2023-07-10 09:05:46 -04:00
andymg3 1223b79973
KAFKA-15149: Fix handling of new partitions in dual-write mode (#13968)
Fixes a bug where we don't send UMR and LISR requests in dual-write mode when new partitions are created. Prior to this patch, KRaftMigrationZkWriter was mutating the internal data-structures of TopicDelta which prevented MigrationPropagator from sending UMR and LISR for the changed partitions.

Reviewers: David Arthur <mumrah@gmail.com>
2023-07-07 10:16:51 -04:00
David Arthur fc7d912e8b
KAFKA-15109 Ensure the leader epoch bump occurs for older MetadataVersions (#13910)
This fixes a regression introduced by the previous KAFKA-15109 commit (d0457f7360 on trunk).

Reviewers: Colin P. McCabe <cmccabe@apache.org>, José Armando García Sancio <jsancio@apache.org>
2023-06-27 11:49:20 -04:00
David Arthur 1bf7039999
KAFKA-15098 Allow authorizers to be configured in ZK migration (#13895)
Reviewers: Ron Dagostino <rdagostino@confluent.io>
2023-06-22 09:34:49 -04:00
David Arthur d0457f7360
KAFKA-15109 Don't skip leader epoch bump while in migration mode (#13890)
While in migration mode, the KRaft controller must always bump the leader epoch when shrinking an ISR. 
This is required to maintain compatibility with the ZK brokers. Without the epoch bump, the ZK brokers
will ignore the partition state change present in the LeaderAndIsrRequest since it would not contain a new
leader epoch.

Reviewers: Colin P. McCabe <cmccabe@apache.org>
2023-06-21 13:09:05 -04:00
minjian.cai ba5e1acdfb
MINOR: fix typos for metadata (#13889)
Reviewers: Divij Vaidya <diviv@amazon.com>, Deqi Hu <deqi.hu@shopee.com>
2023-06-21 15:09:15 +02:00
Colin P. McCabe cd3c0ab1a3 KAFKA-15060: fix the ApiVersionManager interface
This PR expands the scope of ApiVersionManager a bit to include returning the current
MetadataVersion and features that are in effect. This is useful in general because that information
needs to be returned in an ApiVersionsResponse. It also allows us to fix the ApiVersionManager
interface so that all subclasses implement all methods of the interface. Having subclasses that
don't implement some methods is dangerous because they could cause exceptions at runtime in
unexpected scenarios.

On the KRaft controller, we were previously performing a read operation in the QuorumController
thread to get the current metadata version and features. With this PR, we now read a volatile
variable maintained by a separate MetadataVersionContextPublisher object. This will improve
performance and simplify the code. It should not change the guarantees we are providing; in both
the old and new scenarios, we need to be robust against version skew scenarios during updates.

Add a Features class which just has a 3-tuple of metadata version, features, and feature epoch.
Remove MetadataCache.FinalizedFeaturesAndEpoch, since it just duplicates the Features class.
(There are some additional feature-related classes that can be consolidated in in a follow-on PR.)

Create a java class, EndpointReadyFutures, for managing the futures associated with individual
authorizer endpoints. This avoids code duplication between ControllerServer and BrokerServer and
makes this code unit-testable.

Reviewers: David Arthur <mumrah@gmail.com>, dengziming <dengziming1993@gmail.com>, Luke Chen <showuon@gmail.com>
2023-06-19 16:46:44 -07:00
Luke Chen d3e0b27b24
KAFKA-15040: trigger onLeadershipChange under KRaft mode (#13807)
When received LeaderAndIsr request, we'll notify remoteLogManager about this leadership changed to trigger the following workflow. But LeaderAndIsr won't be sent in KRaft mode, instead, the topicDelta will be received.

This PR fixes this issue by getting leader change and follower change from topicDelta, and triggering rlm.onLeadershipChange to notify remote log manager. Adding tests for remote storage enabled cases.

Reviewers: Satish Duggana <satishd@apache.org>
2023-06-09 09:53:46 +08:00
José Armando García Sancio 8ad0ed3e61
KAFKA-15021; Skip leader epoch bump on ISR shrink (#13765)
When the KRaft controller removes a replica from the ISR because of the controlled shutdown there is no need for the leader epoch to be increased by the KRaft controller. This is accurate as long as the topic partition leader doesn't add the removed replica back to the ISR.

This change also fixes a bug when computing the HWM. When computing the HWM, replicas that are not eligible to join the ISR but are caught up should not be included in the computation. Otherwise, the HWM will never increase for replica.lag.time.max.ms because the shutting down replica is not sending FETCH request. Without this additional fix PRODUCE requests would timeout if the request timeout is greater than replica.lag.time.max.ms.

Because of the bug above the KRaft controller needs to check the MV to guarantee that all brokers support this bug fix before skipping the leader epoch bump.

Reviewers: David Mao <47232755+splett2@users.noreply.github.com>, Divij Vaidya <diviv@amazon.com>, David Jacot <djacot@confluent.io>
2023-06-07 07:20:40 -07:00
andymg3 db9d845702
KAFKA-14791; Create a builder for PartitionRegistration (#13788)
This creates a builder for PartitionRegistration. The motivation for the builder is that the constructor of PartitionRegistration has four arguments all of type int[] which makes it easy to make a mistake when using it.

Reviewers: José Armando García Sancio <jsancio@apache.org>
2023-06-06 07:58:23 -07:00
Dimitar Dimitrov 0d5cf4c385
KAFKA-15052 Fix the flaky QuorumControllerTest.testBalancePartitionLeaders (#13804)
In this test broker session timeout is configured aggressively low
(to 1 second) so that fencing can happen without much waiting. Then
in the final portion of the test when brokers should not be fenced
heartbeats are sent roughly 2 times in a session timeout window.
However the first time that's done there's other code between
sending the heartbeat and taking the timestamp, and in local tests
that code can take up to 0.5 seconds (1/2 of the session timeout).
That then can result in all brokers being fenced again which would
fail the test.

This change sends a heartbeat just when a timestamp is taken,
which in local tests results flaky failures from 4 out of 50
to 0 out of 50.

Reviewers: Colin P. McCabe <cmccabe@apache.org>
2023-06-04 11:02:27 -07:00
Colin Patrick McCabe 146a6976ae
KAFKA-15048: Improve handling of unexpected quorum controller errors (#13799)
When the active quorum controller encounters an "unexpected" error, such as a NullPointerException,
it currently resigns its leadership. This PR fixes it so that in addition to doing that, it also
increments the metadata error count metric. This will allow us to better track down these errors.

This PR also fixes a minor bug where performing read operations on a standby controller would
result in an unexpected RuntimeException. The bug happened because the standby controller does not
take in-memory snapshots, and read operations were attempting to read from the epoch of the latest
committed offset. The fix is for the standby controller to simply read the latest value of each
data structure. This is always safe, because standby controllers don't contain uncommitted data.

Also, fix a bug where listPartitionReassignments was reading the latest data, rather than data from
the last committed offset.

Reviewers: dengziming <dengziming1993@gmail.com>, David Arthur <mumrah@gmail.com>
2023-06-02 12:51:15 -07:00
David Arthur f499662923 KAFKA-15003: Fix ZK sync logic for partition assignments (#13735)
Fixed the metadata change events in the Migration component to check correctly for the diff in
existing topic changes and replicate the metadata to the Zookeeper. Also, made the diff check
exhaustive enough to handle the partial writes in Zookeeper when we're try to replicate changes
using a snapshot in the event of Controller failover.

Add migration client and integration tests to verify the change.

Co-authored-by: Akhilesh Chaganti <akhileshchg@users.noreply.github.com>
2023-06-01 15:43:41 -07:00
David Arthur d27ba5bfba
KAFKA-15010 ZK migration failover support (#13758)
This patch adds snapshot reconciliation during ZK to KRaft migration. This reconciliation happens whenever a snapshot is loaded by KRaft, or during a controller failover. Prior to this patch, it was possible to miss metadata updates coming from KRaft when dual-writing to ZK.

Internally this adds a new state SYNC_KRAFT_TO_ZK to the KRaftMigrationDriver state machine. The controller passes through this state after the initial ZK migration and each time a controller becomes active. 

Logging during dual-write was enhanced to include a count of write operations happening.

Reviewers: Colin P. McCabe <cmccabe@apache.org>
2023-06-01 10:25:46 -04:00
Ron Dagostino e74e5e7ac5
KAFKA-15039: Reduce logging level to trace in PartitionChangeBuilder.… (#13780)
…tryElection()

A CPU profile in a large cluster showed PartitionChangeBuilder.tryElection() taking significant CPU due to logging. We adjust the logging statements in that method for clean elections from DEBUG level to TRACE to mitigate the impact of this logging under normal operations.  Unclean elections are now logged at the INFO level rather than DEBUG.

Reviewers: Jason Gustafson <jason@confluent.io>, Colin P. McCabe <cmccabe@apache.org>
2023-05-31 16:26:01 -04:00
Proven Provenzano 731c8c967e
KAFKA-15017 Fix snapshot load in dual write mode for ClientQuotas and SCRAM (#13757)
This patch fixes the case where a ClientQuota or SCRAM credential was added in KRaft, but not written back to ZK. This missed write only occurred when handling a KRaft snapshot. If the changed quota was processed in a metadata delta (which is the typical case), it would be written to ZK.

Reviewers: David Arthur <mumrah@gmail.com>
2023-05-31 15:42:00 -04:00
Colin Patrick McCabe 9b3db6d50a
KAFKA-15019: Improve handling of broker heartbeat timeouts (#13759)
When the active KRaft controller is overloaded, it will not be able to process broker heartbeat
requests. Instead, they will be timed out. When using the default configuration, this will happen
if the time needed to process a broker heartbeat climbs above a second for a sustained period.
This, in turn, could lead to brokers being improperly fenced when they are still alive.

With this PR, timed out heartbeats will still update the lastContactNs and metadataOffset of the
broker in the BrokerHeartbeatManager. While we don't generate any records, this should still be
adequate to prevent spurious fencing. We also log a message at ERROR level so that this condition
will be more obvious.

Other small changes in this PR: fix grammar issue in log4j of BrokerHeartbeatManager. Add JavaDoc
for ClusterControlManager#zkMigrationEnabled field. Add builder for ReplicationControlTestContext
to avoid having tons of constructors. Update ClusterControlManager.DEFAULT_SESSION_TIMEOUT_NS to
match the default in KafkaConfig.

Reviewers: Ismael Juma <ijuma@apache.org>, Ron Dagostino <rdagostino@confluent.io>
2023-05-31 10:49:05 -07:00
David Arthur 7a679af687
KAFKA-15004: Fix configuration dual-write during migration (#13767)
This patch fixes several small bugs with configuration dual-write during migration.

* Topic configs are not written back to ZK while handling snapshot.
* New broker/topic configs in KRaft that did not exist in ZK will not be written to ZK.
* The sensitive configs are not encoded while writing them to Zookeeper.
* Handle topic configs in ConfigMigrationClient and KRaftMigrationZkWriter#handleConfigsSnapshot

Added tests to ensure we no longer have the above mentioned issues.

Co-authored-by: Akhilesh Chaganti <akhileshchg@users.noreply.github.com>
Reviewers: Colin P. McCabe <cmccabe@apache.org>
2023-05-27 17:20:44 -04:00
Colin Patrick McCabe b74204fa0a
KAFKA-14996: Handle overly large user operations on the kcontroller (#13742)
Previously, if a user tried to perform an overly large batch operation on the KRaft controller
(such as creating a million topics), we would create a very large number of records in memory. Our
attempt to write these records to the Raft layer would fail, because there were too many to fit in
an atomic batch. This failure, in turn, would trigger a controller failover.

(Note: I am assuming here that no topic creation policy was in place that would prevent the
creation of a million topics. I am also assuming that the user operation must be done atomically,
which is true for all current user operations, since we have not implemented KIP-868 yet.)

With this PR, we fail immediately when the number of records we have generated exceeds the
threshold that we can apply. This failure does not generate a controller failover. We also now
fail with a PolicyViolationException rather than an UnknownServerException.

In order to implement this in a simple way, this PR adds the BoundedList class, which wraps any
list and adds a maximum length. Attempts to grow the list beyond this length cause an exception to
be thrown.

Reviewers: David Arthur <mumrah@gmail.com>, Ismael Juma <ijuma@apache.org>, Divij Vaidya <diviv@amazon.com>
2023-05-26 13:16:17 -07:00
Manyanda Chitimbo a27c98ca61
MINOR: remove unused variable from QuorumMetaLogListener#handleCommit method (#13611)
The local variable processedRecordsSize as just left over from another commit and can be safely removed.

Reviewers: Divij Vaidya <diviv@amazon.com> , José Armando García Sancio <jsancio@apache.org>
2023-05-26 08:21:40 -07:00
Proven Provenzano 79351ec88e
KAFKA-14970: Fix SCRAM during migration dual-write (#13729)
Fixed a bug during dual write mode where if a user is updating SCRAM records and has no quotas, the SCRAM records will not be written to ZK. Add tests explicitly for this scenario.

Reviewers: David Arthur <mumrah@gmail.com>
2023-05-24 17:01:39 -04:00
Colin P. McCabe 12130cfcec MINOR: Create the MetadataNode classes to introspect MetadataImage
Metadata image classes such as MetadataImage, ClusterImage, FeaturesImage, and so forth contain
numerous sub-images. This PR adds a structured way of traversing those sub-images. This is useful
for the metadata shell, and also for implementing toString functions.

In both cases, the previous solution was suboptimal. The metadata shell was previously implemented
in an ad-hoc way by mutating text-based tree nodes when records were replayed. This was difficult
to keep in sync with changes to the record types (for example, we forgot to do this for SCRAM). It
was also pretty low-level, being done at a level below that of the image classes. For toString, it
was difficult to keep the implementations consistent previously, and also support both redacted and
non-redacted output.

The metadata shell directory was getting crowded since we never had submodules for it. This PR
creates glob/, command/, node/, and state/ directories to keep things better organized.

Reviewers: David Arthur <mumrah@gmail.com>, Ron Dagostino <rdagostino@confluent.io>
2023-05-23 10:11:26 -07:00
Akhilesh C ea6ce3bf82
KAFKA-15009: Handle new ACLs in KRaft snapshot during migration (#13741)
When loading a snapshot during dual-write mode, we were missing the logic to detect new ACLs that 
had been added on the KRaft side. This patch adds support for finding those new ACLs as well as tests
to verify the correct behavior.

Reviewers: David Arthur <mumrah@gmail.com>
2023-05-23 10:43:02 -04:00
Akhilesh C 6b95581867
KAFKA-15007: Use the correct MetadataVersion in MetadataPropagator (#13732)
Use the MetadataVersion from the MetadataImage passed to MetadataPropagator. The ensures the propagator 
sends the right versions of UMR, LISR and StopReplica requests when the migration is in DUAL_WRITE mode.

Reviewers: David Arthur <mumrah@gmail.com>
2023-05-22 14:46:53 -04:00
David Mao d944ef1efb MINOR: Rename handleSnapshot to handleLoadSnapshot (#13727)
Rename handleSnapshot to handleLoadSnapshot to make it explicit that it is handling snapshot load,
not generation.

Reviewers: Colin P. McCabe <cmccabe@apache.org>, Jason Gustafson <jason@confluent.io>
2023-05-17 09:57:24 -07:00
Divij Vaidya bb10ae4273
KAFKA-14962: Trim whitespace from ACL configuration (#13670)
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Christo Lolov <lolovc@amazon.com>
2023-05-12 23:51:00 +05:30
hudeqi 440bed2391
MINOR:code optimization in QuorumController (#13697)
1. add hint in switch item "BROKER_LOGGER" in ConfigResourceExistenceChecker, otherwise, it will be classified as default break and deleted directly. I don’t know if adding hint is better than deleting directly.
2. delete some unused variables and methods.
3. add the "@test" mark to a method in unit test that is forgotten.

Reviewers: dengziming <dengziming1993@gmail.com>
2023-05-12 14:03:17 +08:00
dengziming a7c9842f70
KAFKA-14291: KRaft controller should return right finalized features in ApiVersionResponse (#13679)
The KRaft controller return empty finalized features in `ApiVersionResponse`, the brokers are not infected by this, so this problem doesn't have any impact currently, but it's worth fixing it to avoid unexpected problems.

And there is a bunch of of confusing methods in `ApiVersionResponse` which are only used in test code, I moved them to TestUtils to make the code more clear, and force everyone to pass in the correct parameters instead of the default zero parameters, for example, empty supported features and empty finalized features.

Reviewers: Luke Chen <showuon@gmail.com>
2023-05-12 13:46:06 +08:00
David Arthur 0822ce0ed1
KAFKA-14840: Support for snapshots during ZK migration (#13461)
This patch adds support for handling metadata snapshots while in dual-write mode. Prior to this change, if the active
controller loaded a snapshot, it would get out of sync with the ZK state.

In order to reconcile the snapshot state with ZK, several methods were added to scan through the metadata in ZK to
compute differences with the MetadataImage. Since this introduced a lot of code, I opted to split out a lot of methods
from ZkMigrationClient into their own client interfaces, such as TopicMigrationClient, ConfigMigrationClient, and
AclMigrationClient. Each of these has some iterator method that lets the caller examine the ZK state in a single pass
and without using too much memory.

Reviewers: Colin P. McCabe <cmccabe@apache.org>, Luke Chen <showuon@gmail.com>
2023-05-05 01:35:26 -07:00
Colin P. McCabe 97c36f3f31 HOTFIX: fix file deletions left out of MINOR: improve QuorumController logging #13540 2023-05-04 12:20:33 -07:00
Colin P. McCabe 63f9f23ec0 MINOR: improve QuorumController logging #13540
When creating the QuorumController, log whether ZK migration is enabled.

When applying a feature level record which sets the metadata version, log the metadata version enum
rather than the numeric feature level.

Improve the logging when we replay snapshots in QuorumController. Log both the beginning and the
end of replay.

When TRACE is enabled, log every record that is replayed in QuorumController. Since some records
may contain sensitive information, create RecordRedactor to assist in logging only what is safe to
put in the log4j file.

Add logging to ControllerPurgatory. Successful completions are logged at DEBUG; failures are logged
at INFO, and additions are logged at TRACE.

Remove SnapshotReason.java, SnapshotReasonTest.java, and
QuorumController#generateSnapshotScheduled. They are deadcode now that snapshot generation moved to
org.apache.kafka.image.publisher.SnapshotGenerator.

Reviewers: David Arthur <mumrah@gmail.com>, José Armando García Sancio <jsancio@apache.org>
2023-05-04 11:18:03 -07:00
Proven Provenzano e29942347a
KAFKA-14859: SCRAM ZK to KRaft migration with dual write (#13628)
Handle migrating SCRAM records in ZK when migrating from ZK to KRaft.

This includes handling writing back SCRAM records to ZK while in dual write mode where metadata updates are written to both the KRaft metadata log and to ZK. This allows for rollback of migration to include SCRAM metadata changes.

Reviewers: David Arthur <mumrah@gmail.com>
2023-05-01 09:56:04 -04:00
Luke Chen d796480fe8
KAFKA-14909: check zkMigrationReady tag before migration (#13631)
1. add ZkMigrationReady in apiVersionsResponse
2. check all nodes if ZkMigrationReady are ready before moving to next migration state

Reviewers: David Arthur <mumrah@gmail.com>, dengziming <dengziming1993@gmail.com>
2023-04-28 14:35:12 +08:00
Colin Patrick McCabe c708f7ba5f
MINOR: remove spurious call to fatalFaultHandler (#13651)
Remove a spurious call to fatalFaultHandler accidentally introduced by KAFKA-14805.  We should only
invoke the fatal fault handller if we are unable to generate the activation records. If we are
unable to write the activation records, a controller failover should be sufficient to remedy the
situation.

Co-authored-by: Luke Chen showuon@gmail.com

Reviewers: Luke Chen <showuon@gmail.com>, David Arthur <mumrah@gmail.com>
2023-04-28 10:15:26 +08:00
Colin P. McCabe 7049333617 KAFKA-14943: Fix ClientQuotaControlManager validation
Don't allow setting negative or zero values for quotas. Don't allow SCRAM mechanism names to be
used as client quota names. SCRAM mechanisms are not client quotas. (The confusion arose because of
internal ZK representation details that treated them both as "client configs.")

Add unit tests for ClientQuotaControlManager.isValidIpEntity and
ClientQuotaControlManager.configKeysForEntityType.

This change doesn't affect metadata record application, only input validation. If there are bad
client quotas that are set currently, this change will not alter the current behavior (of throwing
an exception and ignoring the bad quota).
2023-04-27 10:42:32 -07:00
David Arthur c1b5c75d92
KAFKA-14805 KRaft controller supports pre-migration mode (#13407)
This patch adds the concept of pre-migration mode to the KRaft controller. While in this mode, 
the controller will only allow certain write operations. The purpose of this is to disallow metadata 
changes when the controller is waiting for the ZK migration records to be committed.

The following ControllerWriteEvent operations are permitted in pre-migration mode

* completeActivation
* maybeFenceReplicas
* writeNoOpRecord
* processBrokerHeartbeat
* registerBroker (only for migrating ZK brokers)
* unregisterBroker

Raft events and other controller events do not follow the same code path as ControllerWriteEvent, 
so they are not affected by this new behavior.

This patch also add a new metric as defined in KIP-868: kafka.controller:type=KafkaController,name=ZkMigrationState

In order to support upgrades from 3.4.0, this patch also redefines the enum value of value 1 to mean 
MIGRATION rather than PRE_MIGRATION.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Colin P. McCabe <cmccabe@apache.org>
2023-04-26 10:20:30 -04:00
Manyanda Chitimbo dd63d88ac3
MINOR: fix noticed typo in raft and metadata projects (#13612)
Reviewers: Josep Prat <jlprat@apache.org>
2023-04-21 15:02:06 +02:00
David Jacot 2d0b816150
MINOR: Move `ControllerPurgatory` to `server-common` (#13555)
This patch renames from `ControllerPurgatory` to `DeferredEventQueue` and moves it from the `metadata` module to `server-common` module.

Reviewers: Alexandre Dupriez <alexandre.dupriez@gmail.com>, Ziming Deng <dengziming1993@gmail.com>, José Armando García Sancio <jsancio@apache.org>
2023-04-21 11:19:04 +02:00
Purshotam Chauhan df13775254
KAFKA-14828: Remove R/W locks using persistent data structures (#13437)
Currently, StandardAuthorizer uses a R/W lock for maintaining the consistency of data. For the clusters with very high traffic, we will typically see an increase in latencies whenever a write operation comes. The intent of this PR is to get rid of the R/W lock with the help of immutable or persistent collections. Basically, new object references are used to hold the intermediate state of the write operation. After the completion of the operation, the main reference to the cache is changed to point to the new object. Also, for the read operation, the code is changed such that all accesses to the cache for a single read operation are done to a particular cache object only.

In the PR description, you can find the performance of various libraries at the time of both read and write. Read performance is checked with the existing AuthorizerBenchmark. For write performance, a new AuthorizerUpdateBenchmark has been added which evaluates the performance of the addAcl operation.


Reviewers:  Ron Dagostino <rndgstn@gmail.com>, Manikumar Reddy <manikumar.reddy@gmail.com>,  Divij Vaidya <diviv@amazon.com>
2023-04-21 14:08:23 +05:30
Proven Provenzano abca86511e
KAFKA-14881: Rework UserScramCredentialRecord (#13513)
Rework UserScramCredentialRecord to store serverKey and StoredKey rather than saltedPassword. This
is necessary to support migration from ZK, since those are the fields we stored in ZK.  Update
latest MetadataVersion to IBP_3_5_IV2 and make SCRAM support conditional on this version.  Moved
ScramCredentialData.java from org.apache.kafka.image to org.apache.kafka.metadata, which seems more
appropriate.

Reviewers: Colin P. McCabe <cmccabe@apache.org>
2023-04-18 09:41:38 -07:00
Manyanda Chitimbo b36a170aa3
MINOR: fix typos in MigrationClient, StandardAuthorizer, StandardAuthorizerData and KafkaConfigSchema files (#13593)
Reviewers: Luke Chen <showuon@gmail.com>
2023-04-18 19:36:56 +08:00
Ron Dagostino e27926f92b
KAFKA-14735: Improve KRaft metadata image change performance at high … (#13280)
topic counts.

Introduces the use of persistent data structures in the KRaft metadata image to avoid copying the entire TopicsImage upon every change.  Performance that was O(<number of topics in the cluster>) is now O(<number of topics changing>), which has dramatic time and GC improvements for the most common topic-related metadata events.  We abstract away the chosen underlying persistent collection library via ImmutableMap<> and ImmutableSet<> interfaces and static factory methods.

Reviewers: Luke Chen <showuon@gmail.com>, Colin P. McCabe <cmccabe@apache.org>, Ismael Juma <ismael@juma.me.uk>, Purshotam Chauhan <pchauhan@confluent.io>
2023-04-17 17:52:28 -04:00
andymg3 c4ad09e47d
MINOR: Add more KRaft reassignment tests (#13521)
Although KAFKA-14808 did not affect KRaft mode, it is important to ensure that we have regression
tests in KRaft mode to prevent a similar bug from appearing there in the future. This PR adds two
tests. First, it adds a test that makes sure we handle what happens when a reassignment completes
and none of the new replicas can be made leader. It's important that we dont keep an old replica as
leader. Second, it adds a test that makes sure we handle new reassignments that don't include a
previous assignment replica that was leader.

Reviewers: Colin P. McCabe <cmccabe@apache.org>
2023-04-12 12:00:35 -07:00
Colin Patrick McCabe f1f35ef1a8
KAFKA-14894: MetadataLoader must call finishSnapshot after loading a snapshot (#13541)
The MetadataLoader must call finishSnapshot after loading a snapshot. This function removes
whatever was in the old snapshot that is not in the new snapshot that was just loaded. While this
is not significant when the old snapshot was the empty snapshot, it is important to do when we are
loading a snapshot on top of an existing non-empty image.

In initializeNewPublishers, the newly installed publishers should be given a MetadataDelta based on
MetadataImage.EMPTY, reflecting the fact that they are seeing everything for the first time.

Reviewers: David Arthur <mumrah@gmail.com>
2023-04-11 15:02:33 -07:00
José Armando García Sancio 672dd3ab6a
KAFKA-13020; Implement reading Snapshot log append timestamp (#13345)
The SnapshotReader exposes the "last contained log time". This is mainly used during snapshot cleanup. The previous implementation used the append time of the snapshot record. This is not accurate as this is the time when the snapshot was created and not the log append time of the last record included in the snapshot.

The log append time of the last record included in the snapshot is store in the header control record of the snapshot. The header control record is the first record of the snapshot.

To be able to read this record, this change extends the RecordsIterator to decode and expose the control records in the Records type.

Reviewers: Colin Patrick McCabe <cmccabe@apache.org>
2023-04-07 09:25:54 -07:00
Calvin Liu 8c88cdb718
KAFKA-14617: Update AlterPartitionRequest and enable Kraft controller to reject stale request. (#13408)
Second part of the [KIP-903](https://cwiki.apache.org/confluence/display/KAFKA/KIP-903%3A+Replicas+with+stale+broker+epoch+should+not+be+allowed+to+join+the+ISR), it updates the AlterPartitionRequest:
- Deprecate the NewIsr field
- Create a new field BrokerState with BrokerId and BrokerEpoch
- Bump the AlterPartition version to 3

With this change, the Quorum Controller is enabled to reject stale AlterPartition request.

Reviewers: Jun Rao <junrao@gmail.com>, David Jacot <djacot@confluent.io>
2023-03-31 11:27:42 +02:00
andymg3 887d05559f
MINOR: Create only one FeatureControlManager instance in ReplicationControlManagerTest (#13468)
This is a small patch to make it so we only create one FeatureControlManager instance in ReplicationControlManagerTest. Currently we create two, which isn't needed. Its also a bit confusing because the ReplicationControlTestContext objects ends up having a different FeatureControlManager reference that the one its own ReplicationControlManager instance has a reference to.

Reviewers: José Armando García Sancio <jsancio@apache.org>, dengziming <dengziming1993@gmail.com>
2023-03-29 19:10:03 -07:00
Colin Patrick McCabe 09e59bc776
KAFKA-14857: Fix some MetadataLoader bugs (#13462)
The MetadataLoader is not supposed to publish metadata updates until we have loaded up to the high
water mark. Previously, this logic was broken, and we published updates immediately. This PR fixes
that and adds a junit test.

Another issue is that the MetadataLoader previously assumed that we would periodically get
callbacks from the Raft layer even if nothing had happened. We relied on this to install new
publishers in a timely fashion, for example. However, in older MetadataVersions that don't include
NoOpRecord, this is not a safe assumption.

Aside from the above changes, also fix a deadlock in SnapshotGeneratorTest, fix the log prefix for
BrokerLifecycleManager, and remove metadata publishers on brokerserver shutdown (like we do for
controllers).

Reviewers: David Arthur <mumrah@gmail.com>, dengziming <dengziming1993@gmail.com>
2023-03-29 12:30:12 -07:00