Commit Graph

11383 Commits

Author SHA1 Message Date
Colin P. McCabe 5a1aa1a670 MINOR: Standardize controller log4j output for replaying records
Standardize controller log4j output for replaying important records. The log message should include
word "replayed" to make it clear that this is a record replay. Log the replay of records for ACLs,
client quotas, and producer IDs, which were previously not logged. Also fix a case where we weren't
logging changes to broker registrations.

AclControlManager, ClientQuotaControlManager, and ProducerIdControlManager didn't previously have a
log4j logger object, so this PR adds one. It also converts them to using Builder objects. This
makes junit tests more readable because we don't need to specify paramaters where the test can use
the default (like LogContexts).

Throw an exception in replay if we get another TopicRecord for a topic which already exists.
2023-07-13 10:18:34 -07:00
hudeqi 8d24716f27
MINOR: Avoid slow Set.removeAll(List) in MirrorSourceConnector (#13992)
Reviewed-by: Greg Harris <greg.harris@aiven.io>
2023-07-12 12:19:35 -07:00
Sambhav Jain d114d8e29c
KAFKA-14938: Fixing flaky test testConnectorBoundary (#13646)
Reviewers: Sagar Rao <sagarmeansocean@gmail.com>, Yash Mayya <yash.mayya@gmail.com>, 
Sudesh Wasnik <swasnik@confluent.io>, Chris Egerton <chrise@aiven.io>
2023-07-12 11:45:49 -04:00
ezio 170f5f4ed0
KAFKA-15148: Mark tests correctly as integration tests where they running as unit tests (#13973)
Reviewers: Divij Vaidya <diviv@amazon.com>
2023-07-12 13:41:58 +02:00
Manikumar Reddy 4e85bc9f80
MINOR: Fix Jmxtool to honour wait option when MBean is not yet avaibale in MBean server (#13995)
In JmxTool.scala, we will wait till all the object names are available from MBean server. But in the newer version, we only wait for subset of object names. Due to this, we may not enforce wait option and prematurely return the result if the objects are not yet registered in MBean sever.

Reviewers: Luke Chen <showuon@gmail.com>, Federico Valeri <fvaleri@redhat.com>
2023-07-12 17:01:10 +05:30
David Jacot aafbe34443
KAFKA-14462; [22/N] Implement session and revocation timeouts (#13963)
This patch adds the session timeout and the revocation timeout to the new consumer group protocol.

Reviewers: Calvin Liu <caliu@confluent.io>, Jeff Kim <jeff.kim@confluent.io>, Justine Olshan <jolshan@confluent.io>
2023-07-12 11:10:51 +02:00
Clay Johnson 451fff8937
MINOR: Capture build scans on ge.apache.org to benefit from deep build insights (#13676)
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Ismael Juma <ismael@juma.me.uk>, Divij Vaidya <diviv@amazon.com>
2023-07-12 10:47:32 +02:00
Mickael Maison b584e91036
KAFKA-15093: Add 3.4.0 and 3.5.0 to core upgrade and compatibility system tests (#13859)
Reviewers: Luke Chen <showuon@gmail.com>, Christo Lolov  <christololov@gmail.com>
2023-07-12 10:36:57 +02:00
Mickael Maison 354db26b95
MINOR: Add 3.5.0 and 3.4.1 to system tests (#13849)
Reviewers: Luke Chen <showuon@gmail.com>
2023-07-12 10:11:44 +02:00
Jim Galasyn f3ee9ff90f
MINOR: Add Streams API broker compatibility table (#13937)
Reviewers: Matthias J. Sax <matthias@confluent.io>
2023-07-11 12:24:40 -07:00
Satish Duggana 7e2f878713
KAFKA-14522 Rewrite/Move of RemoteIndexCache to storage module. (#13275)
KAFKA-14522 Rewrite and Move of RemoteIndexCache to storage module.
Cleanedup index file suffix usages and other minor cleanups

Reviewers: Jun Rao <junrao@gmail.com>, Ismael Juma <ismael@juma.me.uk>, Luke Chen <showuon@gmail.com>, Divij Vaidya <diviv@amazon.com>, Kamal Chandraprakash<kamal.chandraprakash@gmail.com>, Jorge Esteban Quilcate Otoya <quilcate.jorge@gmail.com>
2023-07-11 23:55:23 +05:30
vamossagar12 adacfea2d6
KAFKA-12525: Ignoring stale status statuses when reading from Connect status topic (#13453)
During fast consecutive rebalances where a task is revoked from one worker and assigned to another one, it has been observed that there is a small time window and thus a race condition during which a RUNNING status record in the new generation is produced and is immediately followed by a delayed UNASSIGNED status record belonging to the same or a previous generation before the worker that sends this message reads the RUNNING status record that corresponds to the latest generation.
Although this doesn't inhibit the actual execution of tasks, it reports an incorrect status for those tasks(i.e UNASSIGNED). If the users have setup some kind of monitoring on tasks status then this could lead to false alarms for example.
This fix addresses this problem by checking if a status message is stale after reading it and updates it's status only when it is safe to. 

Reviewers: Lucent-Wong <manchesterfans@live.cn>, Chris Egerton <chrise@aiven.io>, Yash Mayya <yash.mayya@gmail.com>, Konstantine Karantasis <k.karantasis@gmail.com>
2023-07-11 08:05:10 -07:00
Yi-Sheng Lien b8f3776f24
KAFKA-15155: Follow PEP 8 best practice in Python to check if a container is empty (#13974)
Reviewers: Divij Vaidya <diviv@amazon.com>
2023-07-11 11:01:50 +02:00
Alyssa Huang 5b5f6fcafb
[KAFKA-15137] Do not log entire request payload in KRaftControllerChannelManager (#13988)
Reviewers: David Arthur <mumrah@gmail.com>
2023-07-11 10:48:53 +02:00
ezio 6afcfba9f3
KAFKA-15159: upgrade minor dependencies (#13982)
Reviewers: Divij Vaidya <diviv@amazon.com>

---------

Co-authored-by: Damon Xie <damon.xie@zoom.us>
2023-07-11 10:39:39 +02:00
hudeqi 51bc41031b
KAFKA-15139: Avoid slow Set.removeAll(List) in MirrorCheckpointConnector (#13946)
Reviewed-by: Greg Harris <greg.harris@aiven.io>
2023-07-10 14:35:46 -07:00
Yash Mayya 9ee28d1fe6
KAFKA-15145: Don't re-process records filtered out by SMTs on Kafka client retriable exceptions in AbstractWorkerSourceTask (#13955)
Reviewers: Sagar Rao <sagarmeansocean@gmail.com>, Chris Egerton <chrise@aiven.io>
2023-07-10 13:26:58 -04:00
Hector Geraldino 6368d14a1d
KAFKA-14059 Replace PowerMock with Mockito in WorkerSourceTaskTest (#13383)
Reviewers: Chris Egerton <chrise@aiven.io>
2023-07-10 12:58:54 -04:00
Divij Vaidya d9a3e60dcc
KAFKA-14718: Wait for MirrorMaker to start before executing test (#13284) 2023-07-10 12:53:01 -04:00
Cheryl Simmons e98508747a
Doc fixes: Fix format and other small errors in config documentation (#13661)
Various formatting fixes in the config docs.

Reviewers: Bill Bejeck <bbejeck@apache.org>
2023-07-10 12:48:35 -04:00
hudeqi 8be601d051
MINOR: Move TROGDOR.md to trogdor module (#13979)
Reviewers: Divij Vaidya <diviv@amazon.com>

---------

Co-authored-by: Deqi Hu <deqi.hu@shopee.com>
2023-07-10 18:11:21 +02:00
Hao Li 0e56cc8841
KAFKA-15022: [1/N] initial implementation of rack aware assignor (#13851)
Part of KIP-925. Adds first internal classes to track rack.id client/partition metadata.

Reviewers: Matthias J. Sax <matthias@confluent.io>
2023-07-10 08:41:20 -07:00
Ron Dagostino edd64fa251
MINOR: more KRaft Metadata Image tests (#13724)
Adds additional testing for the various KRaft *Image classes. For every image that we create we already test that we can get there by applying all the records corresponding to that image written out as a list of records. This patch adds more tests to confirm that we can get to each such final image with intermediate stops at all possible intermediate images along the way.

Reviewers: Colin P. McCabe <cmccabe@apache.org>, David Arthur <mumrah@gmail.com>
2023-07-10 10:01:10 -04:00
David Arthur 726d277c0a
MINOR: Move some things around in KRaftMigrationDriver (#13978)
Reviewers: Colin P. McCabe <cmccabe@apache.org>
2023-07-10 09:05:46 -04:00
Divij Vaidya 5926840cab
MINOR: Check for thread leak at the end of @AfterEach, not at beginning (#13976)
Reviewers: David Jacot <djacot@confluent.io>
2023-07-10 14:29:04 +02:00
DL1231 d481163d55
MINOR: Print startup time for RemoteIndexCache (#13970)
Reviewers: Satish Duggana <satishd@apache.org>, Divij Vaidya <diviv@amazon.com>

Co-authored-by: d00791190 <dinglan6@huawei.com>
2023-07-08 12:53:47 +02:00
Divij Vaidya 7bdcb22cf6
MINOR: Refactor & cleanup for RemoteIndexCache (#13936)
- Add new unit tests
- Change the on-disk filename from <offset>_<uuid>_.<indexSuffix> to <offset>_<uuid>.<indexSuffix> i.e. remove trailing underscore after
- Fix a small bug where we were parsing offset as Int when reading the file name from disk. Offset is long.
- Perform input validation in RemoteLogSegmentMetadata.
- Remove an extra loop in cleaner thread. Shutdownable thread already performs looping.

Reviewers: Jorge Esteban Quilcate Otoya <jorge.quilcate@aiven.io>, Satish Duggana <satishd@apache.org>
2023-07-08 12:52:22 +02:00
Colin Patrick McCabe 14a97fafe7
MINOR: some minor shell fixes and improvements (#13940)
Make the output of 'find' and 'ls' sorted alphabetically.

Add GlobComponentTest.java to test globbing.

Add shell/src/test/resources/log4j.properties so that shell JUnit tests show some output

Reviewers: David Arthur <mumrah@gmail.com>
2023-07-07 13:52:47 -07:00
David Jacot 9cde3a7910
KAFKA-14462; [21/N] Add CoordinatorTimer implementation in CoordinatorRuntime (#13961)
This patch adds EventBasedCoordinatorTimer and the CoordinatorTimer interface.

Reviewers: Jeff Kim <jeff.kim@confluent.io>, Justine Olshan <jolshan@confluent.io>
2023-07-07 22:21:30 +02:00
Aneel Kumar fd5b300b57
MINOR: Fix typo in javadoc (#13972)
Reviewers: Divij Vaidya <diviv@amazon.com>
2023-07-07 16:50:35 +02:00
andymg3 1223b79973
KAFKA-15149: Fix handling of new partitions in dual-write mode (#13968)
Fixes a bug where we don't send UMR and LISR requests in dual-write mode when new partitions are created. Prior to this patch, KRaftMigrationZkWriter was mutating the internal data-structures of TopicDelta which prevented MigrationPropagator from sending UMR and LISR for the changed partitions.

Reviewers: David Arthur <mumrah@gmail.com>
2023-07-07 10:16:51 -04:00
hudeqi 1d8b07ed64
KAFKA-15129;[7/N] Remove metrics in TransactionMarkerChannelManager when TransactionCoordinator shutdown (#13962)
Reviewers: Divij Vaidya <diviv@amazon.com>

Co-authored-by: Deqi Hu <deqi.hu@shopee.com>
2023-07-07 10:27:10 +02:00
hudeqi 574f394a3e
MINOR: Fix regression introduced in #13924 (#13958)
Fixes a regression introduced in PR #13924 by moving the map from static to a instance specific variable.
---------

Co-authored-by: Deqi Hu <deqi.hu@shopee.com>
2023-07-07 10:18:38 +02:00
Greg Harris 1b925e9ee7
KAFKA-15069: Refactor plugin scanning logic into ReflectionScanner (#13821)
Reviewers: Chris Egerton <chrise@aiven.io>
2023-07-06 13:22:28 -04:00
DL1231 4149e31cad
KAFKA-15153: Use Python 'is' instead of '==' to compare for None (#13964)
Reviewers: Divij Vaidya <diviv@amazon.com>

Co-authored-by: d00791190 <dinglan6@huawei.com>
2023-07-06 16:59:13 +02:00
Yash Mayya a1a3ec0bcb
MINOR: Update connector status metric description to include 'stopped' as a potential value (#13967)
Reviewers: Mickael Maison <mickael.maison@gmail.com>
2023-07-06 15:28:07 +02:00
David Jacot bd1f02b2be
MINOR: Move MockTimer to server-common (#13954)
This patch rewrites MockTimer in Java and moves it from core to server-common. This continues the work started in https://github.com/apache/kafka/pull/13820.

Reviewers: Divij Vaidya <diviv@amazon.com>
2023-07-06 14:56:05 +02:00
Lianet Magrans 4a61b48d3d
KAFKA-14966; [2/N] Extract OffsetFetcher reusable logic (#13898)
This is a follow up on the initial OffsetFetcher refactoring to extract reusable logic, needed for the new consumer implementation (initial refactoring merged with PR-13815).

Similar to the initial refactoring, this PR brings no changes to the existing logic, just extracting functions or pieces of logic.

There were no individual tests for the extracted functions, so no tests were migrated.

Reviewers: Jun Rao <junrao@gmail.com>
2023-07-05 17:20:49 -07:00
David Jacot 98fbd8afc7
KAFKA-14462; [20/N] Refresh subscription metadata on new metadata image (#13901)
This patch adds (1) the logic to propagate a new MetadataImage to the running coordinators; and (2) the logic to ensure that all the consumer groups subscribed to topics with changes will refresh their subscriptions metadata on the next heartbeat. In the mean time, it ensures that freshly loaded consumer groups also refresh their subscriptions metadata on the next heartbeat.

Reviewers: Jeff Kim <jeff.kim@confluent.io>, Justine Olshan <jolshan@confluent.io>
2023-07-05 18:28:38 +02:00
DL1231 701f924352
KAFKA-15140: Use TestUtils methods and add logs for assertion failure at TopicCommandIntegrationTest (#13950)
This commit utilizes TestUtils methods to create a topic and adds logs when assertions fail.

Reviewers: Divij Vaidya <diviv@amazon.com>

---------

Co-authored-by: d00791190 <dinglan6@huawei.com>
2023-07-04 16:02:39 +02:00
Bruno Cadonna 5c2492bca7
KAFKA-10199: Consider tasks in state updater when computing offset sums (#13925)
With the state updater, the task manager needs also to look into the
tasks owned by the state updater when computing the sum of offsets
of the state. This sum of offsets is used by the high availability
assignor to assign warm-up replicas.
If the task manager does not take into account tasks in the
state updater, a warm-up replica will never report back that
the state for the corresponding task has caught up. Consequently,
the warm-up replica will never be dismissed and probing rebalances
will never end.

Reviewers: Lucas Brutschy <lbrutschy@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Walker Carlson <wcarlson@confluent.io>
2023-07-03 16:35:34 +02:00
hudeqi 48eb8c90ef
KAFKA-15129: [1/N] Remove metrics in LogCleanerManager when LogCleaner shutdown (#13924)
Reviewers: Divij Vaidya <diviv@amazon.com>, Christo Lolov <lolovc@amazon.com>

---------

Co-authored-by: Deqi Hu <deqi.hu@shopee.com>
2023-07-03 16:14:30 +02:00
Jorge Esteban Quilcate Otoya 0ae1d22879
KAFKA-15135: fix(storage): pass endpoint configurations as client common to TBRLMM (#13938)
Pass endpoint properties from RLM to TBRLMM and validate those are not ignored.

Reviewers: Luke Chen <showuon@gmail.com>
2023-07-03 09:16:15 +08:00
Gantigmaa Selenge b2d647904c
KAFKA-8982: Add retry of fetching metadata to Admin.deleteRecords (#13760)
Use AdminApiDriver class to refresh the metadata and retry the request that failed with retriable errors.

Reviewers: Luke Chen <showuon@gmail.com>, Divij Vaidya <diviv@amazon.com>, Mickael Maison <mmaison@redhat.com>, Dimitar Dimitrov <30328539+dimitarndimitrov@users.noreply.github.com>
2023-07-03 09:13:55 +08:00
vamossagar12 96e59d7bfd
[MINOR] Correcting few WARN log lines in DistributedHerder#handleRebalance (#13939)
Reviewers: Chris Egerton <chrise@aiven.io>
2023-06-30 12:39:37 -04:00
Jorge Esteban Quilcate Otoya 43574beb97
KAFKA-15131: Improve RemoteStorageManager exception handling documentation (#13923)
Reviewers: Divij Vaidya <diviv@amazon.com>, Satish Duggana <satishd@apache.org>, Luke Chen <showuon@gmail.com>
2023-06-30 14:37:48 +02:00
Ismael Juma 1f4cbc5d53
MINOR: Add JDK 20 CI build and remove some branch builds (#12948)
It's good for us to add support for Java 20 in preparation for Java 21 - the next LTS.

Given that Scala 2.12 support has been deprecated, a Scala 2.12 variant is not included.

Also remove some branch builds that add load to the CI, but have
low value: JDK 8 & Scala 2.13 (JDK 8 support has been deprecated),
JDK 11 & Scala 2.12 (Scala 2.12 support has been deprecated) and
JDK 17 & Scala 2.12 (Scala 2.12 support has been deprecated).

A newer version of Mockito (4.9.0 -> 4.11.0) is required for Java 20 support, but we
only use it with Scala 2.13+ since it causes compilation errors with Scala 2.12. Similarly,
we upgrade easymock when the Java version is 16 or newer as it's incompatible
with powermock (which doesn't support Java 16 or newer).

Filed KAFKA-15117 for a test that fails with Java 20 (SslTransportLayerTest.testValidEndpointIdentificationCN).

Finally, fixed some lossy conversions that were added after #13582 was submitted.

Reviewers: Ismael Juma <ismael@juma.me.uk>
2023-06-30 01:12:00 -07:00
Yash Mayya 32bcdac6a1
MINOR: Replace synchronization with atomic update in Connect's StateTracker::changeState method (#13934)
Reviewers: Chris Egerton <chrise@aiven.io>
2023-06-29 15:05:06 -04:00
Kirk True a81f35c1c8
KAFKA-14831: Illegal state errors should be fatal in transactional producer (#13591)
Poison the transaction manager if we detect an illegal transition in the Sender thread. A ThreadLocal in is stored in TransactionManager so that the Sender can inform TransactionManager which thread it's using.

Reviewers: Daniel Urban <durban@cloudera.com>, Justine Olshan <jolshan@confluent.io>, Jason Gustafson <jason@confluent.io>
2023-06-29 11:21:15 -07:00
Chris Egerton 1ed8fa2ee0
MINOR: Update anchor link for exactly-once source connectors (#13933)
Reviewers: Josep Prat <josep.prat@aiven.io>
2023-06-29 14:15:32 -04:00