11348 Commits

Author SHA1 Message Date
Yash Mayya a1a3ec0bcb
MINOR: Update connector status metric description to include 'stopped' as a potential value (#13967)
Reviewers: Mickael Maison <mickael.maison@gmail.com>
2023-07-06 15:28:07 +02:00
David Jacot bd1f02b2be
MINOR: Move MockTimer to server-common (#13954)
This patch rewrites MockTimer in Java and moves it from core to server-common. This continues the work started in https://github.com/apache/kafka/pull/13820.

Reviewers: Divij Vaidya <diviv@amazon.com>
2023-07-06 14:56:05 +02:00
Lianet Magrans 4a61b48d3d
KAFKA-14966; [2/N] Extract OffsetFetcher reusable logic (#13898)
This is a follow-up to the initial OffsetFetcher refactoring to extract reusable logic needed for the new consumer implementation (initial refactoring merged with PR-13815).

Similar to the initial refactoring, this PR brings no changes to the existing logic; it only extracts functions or pieces of logic.

There were no individual tests for the extracted functions, so no tests were migrated.

Reviewers: Jun Rao <junrao@gmail.com>
2023-07-05 17:20:49 -07:00
David Jacot 98fbd8afc7
KAFKA-14462; [20/N] Refresh subscription metadata on new metadata image (#13901)
This patch adds (1) the logic to propagate a new MetadataImage to the running coordinators; and (2) the logic to ensure that all the consumer groups subscribed to topics with changes will refresh their subscription metadata on the next heartbeat. In the meantime, it ensures that freshly loaded consumer groups also refresh their subscription metadata on the next heartbeat.

Reviewers: Jeff Kim <jeff.kim@confluent.io>, Justine Olshan <jolshan@confluent.io>
2023-07-05 18:28:38 +02:00
DL1231 701f924352
KAFKA-15140: Use TestUtils methods and add logs for assertion failure at TopicCommandIntegrationTest (#13950)
This commit utilizes TestUtils methods to create a topic and adds logs when assertions fail.

Reviewers: Divij Vaidya <diviv@amazon.com>

---------

Co-authored-by: d00791190 <dinglan6@huawei.com>
2023-07-04 16:02:39 +02:00
Bruno Cadonna 5c2492bca7
KAFKA-10199: Consider tasks in state updater when computing offset sums (#13925)
With the state updater, the task manager also needs to look into the
tasks owned by the state updater when computing the sum of offsets
of the state. This sum of offsets is used by the high availability
assignor to assign warm-up replicas.
If the task manager does not take into account tasks in the
state updater, a warm-up replica will never report back that
the state for the corresponding task has caught up. Consequently,
the warm-up replica will never be dismissed and probing rebalances
will never end.

Reviewers: Lucas Brutschy <lbrutschy@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Walker Carlson <wcarlson@confluent.io>
2023-07-03 16:35:34 +02:00
hudeqi 48eb8c90ef
KAFKA-15129: [1/N] Remove metrics in LogCleanerManager when LogCleaner shutdown (#13924)
Reviewers: Divij Vaidya <diviv@amazon.com>, Christo Lolov <lolovc@amazon.com>

---------

Co-authored-by: Deqi Hu <deqi.hu@shopee.com>
2023-07-03 16:14:30 +02:00
Jorge Esteban Quilcate Otoya 0ae1d22879
KAFKA-15135: fix(storage): pass endpoint configurations as client common to TBRLMM (#13938)
Pass endpoint properties from RLM to TBRLMM and validate that they are not ignored.

Reviewers: Luke Chen <showuon@gmail.com>
2023-07-03 09:16:15 +08:00
Gantigmaa Selenge b2d647904c
KAFKA-8982: Add retry of fetching metadata to Admin.deleteRecords (#13760)
Use the AdminApiDriver class to refresh the metadata and retry requests that failed with retriable errors.

Reviewers: Luke Chen <showuon@gmail.com>, Divij Vaidya <diviv@amazon.com>, Mickael Maison <mmaison@redhat.com>, Dimitar Dimitrov <30328539+dimitarndimitrov@users.noreply.github.com>
2023-07-03 09:13:55 +08:00
vamossagar12 96e59d7bfd
[MINOR] Correcting few WARN log lines in DistributedHerder#handleRebalance (#13939)
Reviewers: Chris Egerton <chrise@aiven.io>
2023-06-30 12:39:37 -04:00
Jorge Esteban Quilcate Otoya 43574beb97
KAFKA-15131: Improve RemoteStorageManager exception handling documentation (#13923)
Reviewers: Divij Vaidya <diviv@amazon.com>, Satish Duggana <satishd@apache.org>, Luke Chen <showuon@gmail.com>
2023-06-30 14:37:48 +02:00
Ismael Juma 1f4cbc5d53
MINOR: Add JDK 20 CI build and remove some branch builds (#12948)
It's good for us to add support for Java 20 in preparation for Java 21 - the next LTS.

Given that Scala 2.12 support has been deprecated, a Scala 2.12 variant is not included.

Also remove some branch builds that add load to the CI, but have
low value: JDK 8 & Scala 2.13 (JDK 8 support has been deprecated),
JDK 11 & Scala 2.12 (Scala 2.12 support has been deprecated) and
JDK 17 & Scala 2.12 (Scala 2.12 support has been deprecated).

A newer version of Mockito (4.9.0 -> 4.11.0) is required for Java 20 support, but we
only use it with Scala 2.13+ since it causes compilation errors with Scala 2.12. Similarly,
we upgrade easymock when the Java version is 16 or newer as it's incompatible
with powermock (which doesn't support Java 16 or newer).

Filed KAFKA-15117 for a test that fails with Java 20 (SslTransportLayerTest.testValidEndpointIdentificationCN).

Finally, fixed some lossy conversions that were added after #13582 was submitted.

Reviewers: Ismael Juma <ismael@juma.me.uk>
2023-06-30 01:12:00 -07:00
Yash Mayya 32bcdac6a1
MINOR: Replace synchronization with atomic update in Connect's StateTracker::changeState method (#13934)
Reviewers: Chris Egerton <chrise@aiven.io>
2023-06-29 15:05:06 -04:00
Kirk True a81f35c1c8
KAFKA-14831: Illegal state errors should be fatal in transactional producer (#13591)
Poison the transaction manager if we detect an illegal transition in the Sender thread. A ThreadLocal is stored in TransactionManager so that the Sender can inform TransactionManager which thread it's using.

Reviewers: Daniel Urban <durban@cloudera.com>, Justine Olshan <jolshan@confluent.io>, Jason Gustafson <jason@confluent.io>
2023-06-29 11:21:15 -07:00
Chris Egerton 1ed8fa2ee0
MINOR: Update anchor link for exactly-once source connectors (#13933)
Reviewers: Josep Prat <josep.prat@aiven.io>
2023-06-29 14:15:32 -04:00
Proven Provenzano 586f89cb1c
KAFKA-15114: Update StorageTool help for creating SCRAM credentials to specify name instead of user. (#13904)
The parameter is name rather than user because the record uses name internally, all
tests using the StorageTool pass name as a parameter, KafkaPrincipals are created with name, and
SCRAM credentials are created with --entity-name.

Reviewers: Colin P. McCabe <cmccabe@apache.org>
2023-06-29 11:11:12 -07:00
Yash Mayya 30b087ead9
KAFKA-14930: Document the new PATCH and DELETE offsets REST APIs for Connect (#13915)
Reviewers: Chris Egerton <chrise@aiven.io>
2023-06-29 11:23:23 -04:00
Walker Carlson 12be344fdd
KAFKA-14936: Add Grace period logic to Stream Table Join (2/N) (#13855)
This PR adds the grace period interface to the Joined object and uses the buffer. Most of the change consists of tests and of moving some of the existing join logic.

Reviewers: Victoria Xia <victoria.xia@confluent.io>, Bruno Cadonna <cadonna@apache.org>
2023-06-29 14:14:04 +02:00
Bo Gao 005416879e
KAFKA-15053: Use case insensitive validator for security.protocol config (#13831)
Fixed a regression described in KAFKA-15053 in which security.protocol only allowed uppercase values such as PLAINTEXT, SSL, SASL_PLAINTEXT, and SASL_SSL. With this fix, both lowercase and uppercase values are supported (e.g. PLAINTEXT, SSL, SASL_PLAINTEXT, SASL_SSL, plaintext, ssl, sasl_plaintext, sasl_ssl).
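
The relaxed check can be pictured with a minimal sketch. It assumes validation amounts to upper-casing the value and comparing it against the known protocol names; `CaseInsensitiveProtocolValidator` and `validate` are hypothetical names used for illustration, not Kafka's actual validator.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for a config validator; not Kafka's actual class.
final class CaseInsensitiveProtocolValidator {
    private static final List<String> VALID =
        Arrays.asList("PLAINTEXT", "SSL", "SASL_PLAINTEXT", "SASL_SSL");

    // Returns the canonical upper-case name, or throws if the value is unknown.
    static String validate(String value) {
        if (value == null) {
            throw new IllegalArgumentException("security.protocol must not be null");
        }
        // Locale.ROOT keeps the conversion independent of the JVM's default locale.
        String canonical = value.toUpperCase(Locale.ROOT);
        if (!VALID.contains(canonical)) {
            throw new IllegalArgumentException(
                "Invalid value " + value + " for security.protocol");
        }
        return canonical;
    }
}
```

Under these assumptions, `validate("sasl_ssl")` returns `SASL_SSL`, while an unknown value such as `tls` is rejected.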

Reviewers: Chris Egerton <chrise@aiven.io>, Divij Vaidya <diviv@amazon.com>
2023-06-29 10:13:21 +02:00
David Jacot 482299c4e2
KAFKA-14462; [19/N] Add CoordinatorLoader implementation (#13880)
This patch adds a coordinator loader implementation.

Reviewers: Jeff Kim <jeff.kim@confluent.io>, Justine Olshan <jolshan@confluent.io>
2023-06-29 08:12:53 +02:00
José Armando García Sancio ee88a3d1b9
MINOR; Failed atomic file move should be logged at WARN (#13917)
When Kafka fails to perform an atomic file move, the error is swallowed. Kafka should log these cases at WARN level or higher.

Reviewers: Ron Dagostino <rndgstn@gmail.com>, Kirk True <kirk@kirktrue.pro>
2023-06-28 15:55:50 -07:00
José Armando García Sancio 3a246b1aba
KAFKA-15078; KRaft leader replies with snapshot for offset 0 (#13845)
If the follower has an empty log and fetches with offset 0, it is more
efficient for the leader to reply with a snapshot id (redirect to
FETCH_SNAPSHOT) than for the follower to continue fetching from the log
segments.
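
A minimal sketch of that decision, assuming the leader tracks the end offset of its latest snapshot, if any. `FetchRedirectSketch` and `snapshotForFetch` are hypothetical names, and the condition is a simplification rather than the actual KRaft code.

```java
import java.util.OptionalLong;

// Hypothetical helper; the real leader logic is more involved.
final class FetchRedirectSketch {
    // Returns the end offset of the snapshot the follower should fetch,
    // or empty if the follower should keep reading from the log segments.
    static OptionalLong snapshotForFetch(long fetchOffset, OptionalLong latestSnapshotEndOffset) {
        // Assumption: a fetch at offset 0 signals an empty follower log.
        if (fetchOffset == 0 && latestSnapshotEndOffset.isPresent()) {
            return latestSnapshotEndOffset;
        }
        return OptionalLong.empty();
    }
}
```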

Reviewers: David Arthur <mumrah@gmail.com>, dengziming <dengziming1993@gmail.com>
2023-06-28 14:21:11 -07:00
Justine Olshan 2f71708955
KAFKA-15028: AddPartitionsToTxnManager metrics (#13798)
Adding the following metrics as per KIP-890:

VerificationTimeMs – number of milliseconds from adding partition info to the manager to the time the response is sent. This will include the round trip to the transaction coordinator if it is called. This will also account for verifications that fail before the coordinator is called.

VerificationFailureRate – rate of verifications that returned in failure either from the AddPartitionsToTxn response or through errors in the manager.

AddPartitionsToTxnVerification metrics – separating the verification request metrics from the typical add partitions ones similar to how fetch replication and fetch consumer metrics are separated.

Reviewers: Divij Vaidya <diviv@amazon.com>
2023-06-28 09:00:37 -07:00
prasanthV 58fc264410
MINOR: Fix ToolsTestUtils by removing incorrect closure of Std Stream (#13922)
Reviewers: Lucas Bradstreet <lucas@confluent.io>, Divij Vaidya <diviv@amazon.com>
2023-06-28 17:46:22 +02:00
minjian.cai e71f68d6c9
MINOR: fix typos for client (#13884)
Reviewers: Luke Chen <showuon@gmail.com>, Divij Vaidya <diviv@amazon.com>, Kirk True <ktrue@confluent.io>
2023-06-28 16:47:42 +02:00
Chia-Ping Tsai 12005484af
MINOR: fix flaky ZkMigrationIntegrationTest.testNewAndChangedTopicsInDualWrite (#13902)
Reviewers: David Arthur <mumrah@gmail.com>
2023-06-28 22:45:26 +08:00
Manyanda Chitimbo f32ebeab17
MINOR: Bump requests (python package) from 2.24.0 to 2.31.0 in /tests (#13903)
Update "requests" lib used in system tests to version "2.31.0" to fix CVE-2023-32681: Unintended leak of Proxy-Authorization header in requests

Reviewers: Divij Vaidya <diviv@amazon.com>
2023-06-27 21:17:22 +02:00
David Arthur fc7d912e8b
KAFKA-15109 Ensure the leader epoch bump occurs for older MetadataVersions (#13910)
This fixes a regression introduced by the previous KAFKA-15109 commit (d0457f7360 on trunk).

Reviewers: Colin P. McCabe <cmccabe@apache.org>, José Armando García Sancio <jsancio@apache.org>
2023-06-27 11:49:20 -04:00
Manyanda Chitimbo c5889fcedd
MINOR: Split ConsumerCoordinator#testCommitOffsetMetadata into two test cases testing commitSync and commitAsync (#13665)
Split ConsumerCoordinator#testCommitOffsetMetadata into two test cases testing commitSync and commitAsync

Reviewers: Luke Chen <showuon@gmail.com>
2023-06-24 12:32:21 +08:00
Greg Harris 3978684126
MINOR: Silence error logs for faulty plugins in integration tests (#13912)
Reviewers: Chris Egerton <chrise@aiven.io>
2023-06-23 14:10:18 -04:00
Yash Mayya 6e72986949
KAFKA-14784: Connect offset reset REST API (#13818)
Reviewers: Chris Egerton <chrise@aiven.io>
2023-06-23 13:27:46 -04:00
Jeff Kim 1dbcb7da9e
KAFKA-14694: RPCProducerIdManager should not wait on new block (#13267)
RPCProducerIdManager initiates an async request to the controller to grab a block of producer IDs and then blocks waiting for a response from the controller.

This is done in the request handler threads while holding a global lock. This means that if many producers are requesting producer IDs and the controller is slow to respond, many threads can get stuck waiting for the lock.

This patch aims to:
* resolve the deadlock scenario mentioned above by not waiting for a new block and returning an error immediately
* remove synchronization usages in RPCProducerIdManager.generateProducerId()
* handle errors returned from generateProducerId() so that KafkaApis does not log unexpected errors
* confirm producers back off before retrying
* introduce backoff if manager fails to process AllocateProducerIdsResponse
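
The first two points can be illustrated with a minimal sketch of a lock-free allocator that never waits for a new block. All names here (`NonBlockingIdAllocator`, `tryGenerate`, `onBlockAllocated`, the `Runnable` that requests a block) are hypothetical and do not mirror the actual RPCProducerIdManager code; an empty result stands in for the error returned to the caller.

```java
import java.util.OptionalLong;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch; not the real producer ID manager.
final class NonBlockingIdAllocator {
    // An immutable ID range with an atomic cursor.
    private static final class Block {
        final long endExclusive;
        final AtomicLong next;

        Block(long first, int size) {
            this.next = new AtomicLong(first);
            this.endExclusive = first + size;
        }
    }

    // Starts with an empty block so the first call triggers a request.
    private final AtomicReference<Block> current = new AtomicReference<>(new Block(0, 0));
    private final AtomicBoolean requestInFlight = new AtomicBoolean(false);
    private final Runnable requestNewBlock; // assumed to send an async request

    NonBlockingIdAllocator(Runnable requestNewBlock) {
        this.requestNewBlock = requestNewBlock;
    }

    // Returns an ID, or empty if the block is exhausted. Never blocks:
    // the caller is expected to back off and retry.
    OptionalLong tryGenerate() {
        Block block = current.get();
        long id = block.next.getAndIncrement();
        if (id < block.endExclusive) {
            return OptionalLong.of(id);
        }
        // Only one thread at a time asks for a replacement block.
        if (current.get() == block && requestInFlight.compareAndSet(false, true)) {
            requestNewBlock.run();
        }
        return OptionalLong.empty();
    }

    // Called from the response handler once a new block arrives.
    void onBlockAllocated(long first, int size) {
        current.set(new Block(first, size));
        requestInFlight.set(false);
    }
}
```

The design choice sketched here is that exhaustion is reported immediately instead of parking a request handler thread, so a slow controller costs callers a retry rather than a held lock.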

Reviewers: Artem Livshits <alivshits@confluent.io>, Jason Gustafson <jason@confluent.io>
2023-06-22 10:19:39 -07:00
Ismael Juma 9c8aaa2c35
MINOR: Fix lossy conversions flagged by Java 20 (#13582)
An example of the warning:
> warning: [lossy-conversions] implicit cast from long to int in compound assignment is possibly lossy

There should be no change in behavior as part of these changes - runtime logic ensured
we didn't run into issues due to the lossy conversions.
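
For readers unfamiliar with the warning, here is a small illustration of the pattern it flags. `LossyConversionDemo` and its methods are hypothetical names for this sketch and are not taken from the Kafka code base.

```java
// Hypothetical demo class illustrating the lossy-conversions warning.
final class LossyConversionDemo {
    // A compound assignment hides a narrowing cast:
    // "total += delta" behaves like "total = (int) (total + delta)".
    static int lossyAdd(int total, long delta) {
        total += delta; // flagged by javac 20 when -Xlint:lossy-conversions is enabled
        return total;
    }

    // One way to make the narrowing explicit and checked.
    static int checkedAdd(int total, long delta) {
        // The sum is computed as a long; toIntExact throws ArithmeticException
        // if the result does not fit in an int.
        return Math.toIntExact(total + delta);
    }
}
```

With a delta larger than the int range, `lossyAdd` silently truncates the result while `checkedAdd` throws.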

Reviewers: Divij Vaidya <diviv@amazon.com>
2023-06-22 08:05:55 -07:00
David Arthur 1bf7039999
KAFKA-15098 Allow authorizers to be configured in ZK migration (#13895)
Reviewers: Ron Dagostino <rdagostino@confluent.io>
2023-06-22 09:34:49 -04:00
David Jacot a81486e4f8
KAFKA-14462; [18/N] Add GroupCoordinatorService (#13812)
This patch introduces the GroupCoordinatorService. This is the new (incomplete) implementation of the group coordinator based on the coordinator runtime introduced in https://github.com/apache/kafka/pull/13795.

Reviewers: Divij Vaidya <diviv@amazon.com>, Justine Olshan <jolshan@confluent.io>
2023-06-22 09:06:10 +02:00
Mickael Maison 3c059133d3
MINOR: Fix generated client ids for Connect (#13896)
Reviewers: Chris Egerton <fearthecellos@gmail.com>
2023-06-21 21:44:14 +02:00
Greg Harris 3b72b0abb1
MINOR: Optimize runtime of MM2 integration tests by batching transactions (#13816)
Reviewers: Chris Egerton <chrise@aiven.io>
2023-06-21 14:51:54 -04:00
David Arthur d0457f7360
KAFKA-15109 Don't skip leader epoch bump while in migration mode (#13890)
While in migration mode, the KRaft controller must always bump the leader epoch when shrinking an ISR. 
This is required to maintain compatibility with the ZK brokers. Without the epoch bump, the ZK brokers
will ignore the partition state change present in the LeaderAndIsrRequest since it would not contain a new
leader epoch.

Reviewers: Colin P. McCabe <cmccabe@apache.org>
2023-06-21 13:09:05 -04:00
Divij Vaidya 88e784f7c6
KAFKA-15084: Remove lock contention from RemoteIndexCache (#13850)
Use the thread-safe Caffeine library to locally cache indexes fetched from the remote tier. This PR removes lock contention that led to higher fetch latencies, as the IO threads spent time unnecessarily waiting on the global cache lock while a single thread fetched the index from the remote tier. See PR #13850 for details and rejected alternatives.
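
The general idea of avoiding a single cache-wide lock can be sketched with the JDK alone. This example deliberately swaps in `ConcurrentHashMap.computeIfAbsent` for Caffeine, so it shows the locking behavior only and has no eviction or size bound; `PerKeyLoadingCache` is a hypothetical name.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch using the JDK instead of Caffeine; no eviction.
final class PerKeyLoadingCache<K, V> {
    private final ConcurrentHashMap<K, V> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // e.g. fetches an index from remote storage

    PerKeyLoadingCache(Function<K, V> loader) {
        this.loader = loader;
    }

    // Loads the value at most once per key. While a load is running,
    // contention is limited to the affected hash bin instead of one
    // lock guarding the whole cache.
    V get(K key) {
        return entries.computeIfAbsent(key, loader);
    }
}
```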

Reviewers: Luke Chen <showuon@gmail.com>, Satish Duggana <satishd@apache.org>
2023-06-21 18:22:49 +02:00
hudeqi d5dafe22fe
MINOR: Fill missing parameter annotations for LogCleaner methods (#13839)
Reviewers: Josep Prat <josep.prat@aiven.io>
---------

Co-authored-by: Deqi Hu <deqi.hu@shopee.com>
2023-06-21 15:54:32 +02:00
David Arthur 16bb8cbb8c
MINOR: Increase Github API operations for stale PR check (#13894)
Reviewers: Josep Prat <josep.prat@aiven.io>
2023-06-21 09:52:49 -04:00
minjian.cai ba5e1acdfb
MINOR: fix typos for metadata (#13889)
Reviewers: Divij Vaidya <diviv@amazon.com>, Deqi Hu <deqi.hu@shopee.com>
2023-06-21 15:09:15 +02:00
hudeqi 9b383a1e9e
MINOR: Fix documentation for ConsumeBench metrics in Trogdor (#13877)
Co-authored-by: Deqi Hu <deqi.hu@shopee.com>

Reviewers: Divij Vaidya <diviv@amazon.com>
2023-06-21 14:49:13 +02:00
Joseph (Ting-Chou) Lin 72503904e8
MINOR: Log lastCaughtUpTime on ISR shrinkage (#13187)
Reviewers: Divij Vaidya <diviv@amazon.com>
2023-06-21 10:15:50 +02:00
minjian.cai 49c1697ab0
MINOR: fix typos for doc (#13883)
Reviewers: Divij Vaidya <diviv@amazon.com>
2023-06-21 09:57:43 +02:00
Divij Vaidya dd25753aa2
MINOR: Close ReplicaManager correctly in ReplicaManagerTest (#13868)
Fixes thread leaks by closing the ReplicaManager using try/finally at the end of each test. The leaks were leading to flaky test failures in ReplicaManagerTest.
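
The pattern is the usual try/finally cleanup. A minimal sketch, using a hypothetical `FakeReplicaManager` stand-in instead of the real ReplicaManager and test code:

```java
// Hypothetical sketch of try/finally cleanup in a test.
final class CloseInFinallySketch {
    // Stand-in for a resource that owns background threads.
    static final class FakeReplicaManager {
        private boolean shutDown = false;

        void shutdown() {
            shutDown = true;
        }

        boolean isShutDown() {
            return shutDown;
        }
    }

    // Runs the test body and always shuts the manager down,
    // even when the body throws (for example on a failed assertion).
    static void runTest(FakeReplicaManager manager, Runnable testBody) {
        try {
            testBody.run();
        } finally {
            manager.shutdown();
        }
    }
}
```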

Reviewers: Justine Olshan <jolshan@confluent.io>, David Jacot <djacot@confluent.io>
2023-06-21 09:55:03 +02:00
minjian.cai 474053d297
MINOR: fix typos for streams (#13888)
Reviewers: Divij Vaidya <diviv@amazon.com>, Manyanda Chitimbo <manyanda.chitimbo@gmail.com>
2023-06-20 23:03:42 +02:00
minjian.cai 39a47c8999
MINOR: fix typos for group coordinator (#13886)
Reviewers: Manyanda Chitimbo <manyanda.chitimbo@gmail.com>, Divij Vaidya <diviv@amazon.com>
2023-06-20 22:57:26 +02:00
minjian.cai af678a563d
MINOR: fix typos for server common (#13887)
Reviewers: Manyanda Chitimbo <manyanda.chitimbo@gmail.com>, Divij Vaidya <diviv@amazon.com>
2023-06-20 22:56:01 +02:00
minjian.cai 3d97743c67
MINOR: Fix some typos for core (#13882)
Reviewers: Divij Vaidya <diviv@amazon.com>
2023-06-20 22:52:39 +02:00