Commit Graph

2744 Commits

Author SHA1 Message Date
Philip Nee 62431dca70
KAFKA-14468: Implement CommitRequestManager to manage the commit and autocommit requests (#13021)
This pull request introduces a CommitRequestManager to efficiently manage commit requests from clients and the autocommit state. The manager utilizes a "staged" commit queue to store commit requests made by clients. A background thread regularly polls the CommitRequestManager, which then checks the queue for any outstanding commit requests. When permitted, the CommitRequestManager generates a PollResult which contains a list of UnsentRequests that are subsequently processed by the NetworkClientDelegate.

In addition, a RequestManagerRegistry has been implemented to hold all request managers, including the new CommitRequestManager and the CoordinatorRequestManager. The registry is regularly polled by a background thread in each event loop, ensuring that all request managers are kept up to date and able to handle incoming requests

Reviewers: Jason Gustafson <jason@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
2023-02-24 15:42:38 -08:00
Chia-Ping Tsai 7626a43079
KAFKA-14295 FetchMessageConversionsPerSec meter not recorded (#13279)
Reviewers: Luke Chen <showuon@gmail.com>
2023-02-24 01:50:49 +08:00
David Jacot c55e5afb83
KAFKA-14744; NPE while converting OffsetFetch from version < 8 to version >= 8 (#13295)
While refactoring the OffsetFetch handling in KafkaApis, we introduced a NullPointerException (NPE). The NPE arises when the FetchOffset API is called with a client using a version older than version 8 and using null for the topics to signal that all topic-partition offsets must be returned. This means that this bug mainly impacts admin tools. The consumer does not use null.

This NPE is here: 24a86423e9 (diff-0f2f19fd03e2fc5aa9618c607b432ea72e5aaa53866f07444269f38cb537f3feR237).

We missed this during the refactor because we had no tests in place to test this mode.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Justine Olshan <jolshan@confluent.io>
2023-02-23 18:32:33 +01:00
bachmanity1 3fe2f8c442
MINOR: after reading BYTES type it's possible to access data beyond its size (#13261)
After reading data of type BYTES, COMPACT_BYTES, NULLABLE_BYTES or COMPACT_NULLABLE_BYTES returned ByteBuffer might have a capacity that is larger than its limit, thus these data types may access data that lies beyond its size by increasing limit of the returned ByteBuffer. I guess this is not very critical but I think it would be good to restrict increasing limit of the returned ByteBuffer by making its capacity strictly equal to its limit. I think someone might unintentionally mishandle these data types and accidentally mess up data in the ByteBuffer from which they were read.

Reviewers: Luke Chen <showuon@gmail.com>
2023-02-23 10:13:18 +08:00
Luke Chen 30d7d3b5ce
MINOR: add size check for tagged fields (#13100)
Add size check for taggedFields of a tag, and add tests.

Reviewers: Mickael Maison <mickael.maison@gmail.com>, Divij Vaidya <diviv@amazon.com>
2023-02-22 11:52:21 +08:00
Kirk True c9a42c85e2
KAFKA-14675: Extract metadata-related tasks from Fetcher into MetadataFetcher 1/4 (#13192)
The `Fetcher` class is used internally by the `KafkaConsumer` to fetch records from the brokers. There is [ongoing work to create a new consumer implementation with a significantly refactored threading model](https://issues.apache.org/jira/browse/KAFKA-14246). The threading refactor work requires a similarly refactored `Fetcher`.

This task covers the work to extract from `Fetcher` the APIs that are related to metadata operations into two new classes: `OffsetFetcher` and `TopicMetadataFetcher`. This will allow the refactoring of `Fetcher` and `MetadataFetcher` for the new consumer.

Reviewers: Philip Nee <pnee@confluent.io>, Guozhang Wang <guozhang@apache.org>, Jason Gustafson <jason@confluent.io>
2023-02-21 09:00:35 -08:00
Terry f3dc3f0dad
Kafka 14565: On failure, close AutoCloseable objects instantiated and configured by AbstractConfig (#13168)
Reviewers: Chris Egerton <chrise@aiven.io>
2023-02-16 12:39:24 -05:00
Christo Lolov ba0c5b0902
MINOR: Simplify JUnit assertions in tests; remove accidental unnecessary code in tests (#13219)
* assertEquals called on array
* Method is identical to its super method
* Simplifiable assertions
* Unused imports

Reviewers: Mickael Maison <mickael.maison@gmail.com>, Divij Vaidya <diviv@amazon.com>
2023-02-16 16:13:31 +01:00
José Armando García Sancio 10164a6d2e
KAFKA-14693; Kafka node should halt instead of exit (#13227)
Extend the implementation of ProcessTerminatingFaultHandler to support calling either Exit.halt or Exit.exit. Change the fault handler used by the Controller thread and the KRaft thread to use a halting fault handler.

Those threads cannot call Exit.exit because Runtime.exit joins on the default shutdown hook thread. The shutdown hook thread joins on the controller and kraft thread terminating. This causes a deadlock.

Reviewers: Colin Patrick McCabe <cmccabe@apache.org>, Jason Gustafson <jason@confluent.io>
2023-02-14 09:53:38 -08:00
Mickael Maison fa43f13044
MINOR: Various cleanups in clients javadoc (#13239)
Reviewers: Luke Chen <showuon@gmail.com>
2023-02-14 10:24:20 +01:00
Manikumar Reddy 1ed61e4090 MINOR: Few cleanups to JaasContext/Utils classes
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
2023-02-14 11:18:24 +05:30
Philip Nee eee2bf9686
KAFKA-6793: Unused configuration logging appears to be noisy and unnecessary (#13225)
Warnings about unused configs are most often spurious. This patch changes the current warning to an info message.

Reviewers: Chris Egerton <chrise@aiven.io>, Jason Gustafson <jason@confluent.io>
2023-02-13 09:27:55 -08:00
Chris Egerton 8cfafba279
KAFKA-14021: Implement new KIP-618 APIs in MirrorSourceConnector (#12366)
Reviewers: Mickael Maison <mickael.maison@gmail.com>
2023-02-13 10:09:14 -05:00
Rajini Sivaram ef9c9486da
KAFKA-14676: Include all SASL configs in login cache key to ensure clients in a JVM can use different OAuth configs (#13211)
We currently cache login managers in static maps for both static JAAS config using system property and for JAAS config specified using Kafka config sasl.jaas.config. In addition to the JAAS config, the login manager callback handler is included in the key, but all other configs are ignored. This implementation is based on the assumption clients that require different logins (e.g. username/password) use different JAAS configs, because login properties are included in the JAAS config rather than as separate top-level configs. The OIDC support added in KIP-768 only allows configuration of token endpoint URL as a top-level config. This results in two clients in a JVM configured with different token endpoint URLs to incorrectly share a login.

This PR includes all SASL configs prefixed with sasl. to be included in the key so that logins are only shared if all the sasl configs are identical. 

Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Kirk True <kirk@mustardgrain.com>
2023-02-12 19:49:12 +00:00
Divij Vaidya e903f2cd96
KAFKA-7109: Close fetch sessions on close of consumer (#12590)
## Problem
When consumer is closed, fetch sessions associated with the consumer should notify the server about it's intention to close using a Fetch call with epoch = -1 (identified by `FINAL_EPOCH` in `FetchMetadata.java`). However, we are not sending this final fetch request in the current flow which leads to unnecessary fetch sessions on the server which are closed only after timeout.

## Changes
1. Change `close()` in `Fetcher` to add a logic to send the final Fetch request notifying close to the server.
2. Change `close()` in `Consumer` to respect the timeout duration passed to it. Prior to this change, the timeout parameter was being ignored.
3. Change tests to close with `Duration.zero` to reduce the execution time of the tests. Otherwise the tests will wait for default timeout to exit (close() in the tests is expected to be unsuccessful because there is no server to send the request to).
4. Distinguish between the case of "close existing session and create new session" and "close existing session" by renaming the `nextCloseExisting` function to `nextCloseExistingAttemptNew`.

## Testing
Added unit test which validates that the correct close request is sent to the server.

Reviewers: Ismael Juma <ismael@juma.me.uk>, Kirk True <kirk@mustardgrain.com>, Philip Nee <philipnee@gmail.com>, Luke Chen <showuon@gmail.com>, David Jacot <djacot@confluent.io>
2023-02-09 14:53:10 +01:00
David Jacot 3be7f7d611
KAFKA-14391; Add ConsumerGroupHeartbeat API (#12972)
This patch does a few things:
1) It introduces a new flag to the request spec: `latestVersionUnstable`. It signifies that the last version of the API is considered unstable (or still in development). As such, the last API version is not exposed by the server unless specified otherwise with the new internal `unstable.api.versions.enable`. This allows us to commit new APIs which are still in development.
3) It adds the ConsumerGroupHeartbeat API, part of KIP-848, and marks it as unreleased for now.
4) It adds the new error codes required by the new ConsumerGroupHeartbeat API.

Reviewers: Justine Olshan <jolshan@confluent.io>, Jeff Kim <jeff.kim@confluent.io>, Jason Gustafson <jason@confluent.io>
2023-02-09 09:13:31 +01:00
Christo Lolov a0a9b6ffea
MINOR: Remove unnecessary code (#13210)
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Divij Vaidya <diviv@amazon.com>
2023-02-07 17:37:45 +01:00
Yash Mayya a3cf8b54e0
KAFKA-14455: Kafka Connect create and update REST APIs should surface failures while writing to the config topic (#12984)
Reviewers: Greg Harris <greg.harris@aiven.io>, Chris Egerton <chrise@aiven.io>
2023-02-02 11:03:38 -05:00
Colin Patrick McCabe eb7d5cbf15
MINOR: add startup timeouts to KRaft integration tests (#13153)
When running junit tests, it is not good to block forever on CompletableFuture objects.  When there
are bugs, this can lead to junit tests hanging forever. Jenkins does not deal with this well -- it
often brings down the whole multi-hour test run.  Therefore, when running integration tests in
JUnit, set some reasonable time limits on broker and controller startup time.

Reviewers: Jason Gustafson <jason@confluent.io>
2023-01-30 11:29:30 -08:00
Kirk True bc1ce9f0f1
KAFKA-14623: OAuth's HttpAccessTokenRetriever potentially leaks secrets in logging (#13119)
Removes logging of the HTTP response directly in all known cases to prevent potentially logging access tokens.

Reviewers:  Sushant Mahajan <sushant.mahajan88@gmail.com>, Rajini Sivaram <rajinisivaram@googlemail.com>
2023-01-24 19:06:24 +00:00
Jason Gustafson b2cb546fba
MINOR: Small cleanups in refactored consumer implementation (#13138)
This patch contains a few cleanups and fixes in the new refactored consumer logic:

- Use `CompletableFuture` instead of `RequestFuture` in `NetworkClientDelegate`. This is a much more extensible API and it avoids tying the new implementation to `ConsumerNetworkClient`.
- Fix call to `isReady` in `NetworkClientDelegate`. We need the call to `ready` to initiate the connection.
- Ensure backoff is enforced even after successful `FindCoordinator` requests. This avoids a tight loop while metadata is converging after a coordinator change.
- `RequestState` was incorrectly using the reconnect backoff as the retry backoff. In fact, we don't currently have a retry backoff max implemented in the consumer, so the use of `ExponentialBackoff` is unnecessary, but I've left it since we may add this with https://cwiki.apache.org/confluence/display/KAFKA/KIP-580%3A+Exponential+Backoff+for+Kafka+Clients. 
- Minor cleanups in test cases to avoid unused classes/fields.

Reviewers: Philip Nee <pnee@confluent.i>, Guozhang Wang <guozhang@apache.org>
2023-01-23 12:45:24 -08:00
Mickael Maison 00e5803cd3
MINOR: Various cleanups in client tests (#13094)
Reviewers: Luke Chen <showuon@gmail.com>, Ismael Juma <mlists@juma.me.uk>, Divij Vaidya <diviv@amazon.com>, Christo Lolov <christololov@gmail.com>
2023-01-23 13:07:44 +01:00
Philip Nee ca80502ebe
Update outdated documentation. (#13139)
The current documentation indicates two positions are tracked, but these positions were removed a few years ago. Now we use a single position to track the last consumed record. Updated the documentation to reflect to the current state.

Reviewers: Guozhang Wang <wangguoz@gmail.com>
2023-01-20 14:53:20 -08:00
A. Sophie Blee-Goldman 79b12cbe65
MINOR: fix flaky StickyAssignorTest.testLargeAssignmentAndGroupWithNonEqualSubscription (#13134)
This test is supposed to be a sanity check that rebalancing with a large number of partitions/consumers won't start to take obscenely long or approach the max.poll.interval.ms -- bumping up the timeout by another 30s still feels very reasonable considering the test is for 1 million partitions

Reviewers: Matthias J. Sax <mjsax@apache.org>
2023-01-20 13:56:38 -08:00
Alex Sorokoumov 4b75265761
KAFKA-14638: Elaborate when transaction.timeout.ms resets (#13129)
Reviewers: Matthias J. Sax <matthias@confluent.io>, Justine Olshan <jolshan@confluent.io>, Jason Gustafson <jason@confluent.io>
2023-01-19 18:55:32 -08:00
David Jacot 700947aa5a
KAFKA-14367; Add `OffsetDelete` to the new `GroupCoordinator` interface (#12902)
This patch adds `OffsetDelete` to the new `GroupCoordinator` interface and updates `KafkaApis` to use it.

Reviewers: Jeff Kim <jeff.kim@confluent.io>, Justine Olshan <jolshan@confluent.io>
2023-01-17 20:39:01 +01:00
Divij Vaidya b2bc72dc79
MINOR: Include the inner exception stack trace when re-throwing an exception (#12229)
While wrapping the caught exception into a custom one, information about the caught
exception is being lost, including information about the stack trace of the exception.

When re-throwing an exception, we either include the original exception or the relevant
information is added to the exception message.

Reviewers: Ismael Juma <ismael@juma.me.uk>, Luke Chen <showuon@gmail.com>, dengziming <dengziming1993@gmail.com>, Matthew de Detrich <mdedetrich@gmail.com>
2023-01-15 15:03:23 -08:00
Cheryl Simmons d2556e02a2
Update ProducerConfig.java (#13115)
This should be greater than 1 to be consistent with behavior described inmax.in.flight.requests.per.connection.

Reviewers: Bill Bejeck <bbejeck@apache.org>
2023-01-13 16:35:49 -05:00
David Jacot a2926edc2f
KAFKA-14367; Add `TxnOffsetCommit` to the new `GroupCoordinator` interface (#12901)
This patch adds `TxnOffsetCommit` to the new `GroupCoordinator` interface and updates `KafkaApis` to use it.

Reviewers: Jeff Kim <jeff.kim@confluent.io>, Justine Olshan <jolshan@confluent.io>
2023-01-13 09:54:54 +01:00
Federico Valeri 111f02cc74
KAFKA-14568: Move FetchDataInfo and related to storage module (#13085)
Part of KAFKA-14470: Move log layer to storage module.

Reviewers: Ismael Juma <ismael@juma.me.uk>

Co-authored-by: Ismael Juma <ismael@juma.me.uk>
2023-01-12 21:32:23 -08:00
David Jacot e6669672ef
KAFKA-14367; Add `OffsetCommit` to the new `GroupCoordinator` interface (#12886)
This patch adds `OffsetCommit` to the new `GroupCoordinator` interface and updates `KafkaApis` to use it.

Reviewers: Omnia G H Ibrahim <o.g.h.ibrahim@gmail.com>, Jeff Kim <jeff.kim@confluent.io>, Justine Olshan <jolshan@confluent.io>
2023-01-12 18:05:49 +01:00
David Arthur 0bb05d8679
KAFKA-14304 Use boolean for ZK migrating brokers in RPC/record (#13103)
With the new broker epoch validation logic introduced in #12998, we no longer need the ZK broker epoch to be sent to the KRaft controller. This patch removes that epoch and replaces it with a boolean.

Another small fix is included in this patch for controlled shutdown in migration mode. Previously, if a ZK broker was in migration mode, it would always try to do controlled shutdown via BrokerLifecycleManager. Since there is no ordering dependency between bringing up ZK brokers and the KRaft quorum during migration, a ZK broker could be running in migration mode, but talking to a ZK controller. A small check was added to see if the current controller is ZK or KRaft before decided which controlled shutdown to attempt.

Reviewers: Colin P. McCabe <cmccabe@apache.org>
2023-01-11 14:36:56 -05:00
David Jacot 24a86423e9
KAFKA-14367; Add `OffsetFetch` to the new `GroupCoordinator` interface (#12870)
This patch adds OffsetFetch to the new GroupCoordinator interface and updates KafkaApis to use it. 

Reviewers: Philip Nee <pnee@confluent.i>, Jeff Kim <jeff.kim@confluent.io>, Justine Olshan <jolshan@confluent.io>
2023-01-10 11:38:31 -08:00
Michael Marshall def8d724c8
KAFKA-14540: Fix DataOutputStreamWritable#writeByteBuffer (#13032)
When writing a ByteBuffer backed by a HeapBuffer to a DataOutputStream, it is necessary to pass in the offset and the position, not just the position. It is also necessary to pass the remain length, not the limit. The current code results in writing the wrong data to DataOutputStream. While the DataOutputStreamWritable is used in the project, I do not see any references that would utilize this code path, so this bug fix is relatively minor.

I added a new test to cover the exact bug. The test fails without this change.
2023-01-10 11:50:07 +01:00
Akhilesh C db49070760
KAFKA-14493: Introduce Zk to KRaft migration state machine STUBs in KRaft controller. (#12998)
This patch introduces a preliminary state machine that can be used by KRaft
controller to drive online migration from Zk to KRaft.

MigrationState -- Defines the states we can have while migration from Zk to
KRaft.

KRaftMigrationDriver -- Defines the state transitions, and events to handle
actions like controller change, metadata change, broker change and have
interfaces through which it claims Zk controllership, performs zk writes and
sends RPCs to ZkBrokers.

MigrationClient -- Interface that defines the functions used to claim and
relinquish Zk controllership, read to and write from Zk.

Co-authored-by: David Arthur <mumrah@gmail.com>
Reviewers: Colin P. McCabe <cmccabe@apache.org>
2023-01-09 10:44:11 -08:00
iamazy e38526e375
KAFKA-14570: Fix parenthesis in verifyFullFetchResponsePartitions output (#13072)
Reviewers: Mickael Maison <mickael.maison@gmail.com>
2023-01-09 11:15:01 +01:00
Satish Duggana 026105d05f
KAFKA-14550: Move SnapshotFile and CorruptSnapshotException to storage module (#13039)
For broader context on this change, see:

* KAFKA-14470: Move log layer to storage module

Reviewers: Ismael Juma <ismael@juma.me.uk>
2023-01-02 07:31:40 -08:00
Ismael Juma 871289c5c4
KAFKA-14476: Move OffsetMap and related to storage module (#13042)
For broader context on this change, please check:

* KAFKA-14470: Move log layer to storage module

Reviewers: dengziming <dengziming1993@gmail.com>, Satish Duggana <satishd@apache.org>, Federico Valeri <fedevaleri@gmail.com>
2022-12-23 08:19:00 -08:00
Luke Chen b47d1950c7
MINOR: increase connectionMaxIdleMs to make test reliable (#13031)
Saw this flaky test in recent builds: #1452, #1454,

org.opentest4j.AssertionFailedError: Unexpected channel state EXPIRED ==> expected: <true> but was: <false>
	at app//org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
	at app//org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)
	at app//org.junit.jupiter.api.AssertTrue.failNotTrue(AssertTrue.java:63)
	at app//org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:36)
	at app//org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:210)
	at app//org.apache.kafka.common.network.SslTransportLayerTest.testIOExceptionsDuringHandshake(SslTransportLayerTest.java:883)
	at app//org.apache.kafka.common.network.SslTransportLayerTest.testUngracefulRemoteCloseDuringHandshakeWrite(SslTransportLayerTest.java:833)

We expected the channel state to be AUTHENTICATE or READY, but got EXPIRED. Checking the test, we're not expecting expired state at all, but set a 5 secs idle timeout in selector, which is not enough when machine is busy. Increasing the connectionMaxIdleMs to 10 secs to make the test reliable.

Reviewers: Ismael Juma <ismael@juma.me.uk>
2022-12-22 10:29:13 +08:00
Ismael Juma e8232edd24
KAFKA-14477: Move LogValidator and related to storage module (#13012)
Also improved `LogValidatorTest` to cover a bug that was originally
only caught by `LogAppendTimeTest`.

For broader context on this change, please check:

* KAFKA-14470: Move log layer to storage module

Reviewers: Jun Rao <junrao@gmail.com>
2022-12-21 16:57:02 -08:00
Akhilesh C 5b521031ed
MINOR: Add zk migration field to the ApiVersionsResponse (#13029)
Add the zk mgiration field which will be used by the KRaft controller to see if the quorum is
ready to handle a zk to kraft migration.

Reviewers: Colin P. McCabe <cmccabe@apache.org>
2022-12-21 15:47:30 -08:00
Lucas Brutschy 6a6c730241
KAFKA-14532: Correctly handle failed fetch when partitions unassigned (#13023)
The failure handling code for fetches could run into an IllegalStateException if a fetch response came back with a failure after the corresponding topic partition has already been removed from the assignment.

Reviewers: David Jacot <djacot@confluent.io>
2022-12-21 09:17:11 +01:00
Philip Nee 4548c272ae
KAFKA-14264; New logic to discover group coordinator (#12862)
[KAFKA-14264](https://issues.apache.org/jira/browse/KAFKA-14264)
In this patch, we refactored the existing FindCoordinator mechanism. In particular, we first centralize all of the network operation (send, poll) in `NetworkClientDelegate`, then we introduced a RequestManager interface that is responsible to handle the timing of different kind of requests, based on the implementation.  In this path, we implemented a `CoordinatorRequestManager` which determines when to create an `UnsentRequest` upon polling the request manager.

Reviewers: Jason Gustafson <jason@confluent.io>
2022-12-19 09:48:52 -08:00
Satish Duggana 7146ac57ba
[KAFKA-13369] Follower fetch protocol changes for tiered storage. (#11390)
This PR implements the follower fetch protocol as mentioned in KIP-405.

Added a new version for ListOffsets protocol to receive local log start offset on the leader replica. This is used by follower replicas to find the local log star offset on the leader.

Added a new version for FetchRequest protocol to receive OffsetMovedToTieredStorageException error. This is part of the enhanced fetch protocol as described in KIP-405.

We introduced a new field locaLogStartOffset to maintain the log start offset in the local logs. Existing logStartOffset will continue to be the log start offset of the effective log that includes the segments in remote storage.

When a follower receives OffsetMovedToTieredStorage, then it tries to build the required state from the leader and remote storage so that it can be ready to move to fetch state.

Introduced RemoteLogManager which is responsible for

initializing RemoteStorageManager and RemoteLogMetadataManager instances.
receives any leader and follower replica events and partition stop events and act on them
also provides APIs to fetch indexes, metadata about remote log segments.
Followup PRs will add more functionality like copying segments to tiered storage, retention checks to clean local and remote log segments. This will change the local log start offset and make sure the follower fetch protocol works fine for several cases.

You can look at the detailed protocol changes in KIP: https://cwiki.apache.org/confluence/display/KAFKA/KIP-405%3A+Kafka+Tiered+Storage#KIP405:KafkaTieredStorage-FollowerReplication

Co-authors: satishd@apache.org, kamal.chandraprakash@gmail.com, yingz@uber.com

Reviewers: Kowshik Prakasam <kprakasam@confluent.io>, Cong Ding <cong@ccding.com>, Tirtha Chatterjee <tirtha.p.chatterjee@gmail.com>, Yaodong Yang <yangyaodong88@gmail.com>, Divij Vaidya <diviv@amazon.com>, Luke Chen <showuon@gmail.com>, Jun Rao <junrao@gmail.com>
2022-12-17 09:36:44 -08:00
Kirk True f247aac96a
KAFKA-14496: Wrong Base64 encoder used by OIDC OAuthBearerLoginCallbackHandler (#13000)
The OAuth code to generate the Authentication header was incorrectly
using the URL-safe base64 encoder. For client IDs and/or secrets with
dashes and/or plus signs would not be encoded correctly, leading to the
OAuth server to reject the credentials.

This change uses the correct base64 encoder, per RFC-7617.

Co-authored-by: Endre Vig <vendre@gmail.com>
2022-12-16 19:44:41 +05:30
Daniel Scanteianu e3585a4cd5
MINOR: Document Offset and Partition 0-indexing, fix typo (#12753)
Add comments to clarify that both offsets and partitions are 0-indexed, and fix a minor typo. Clarify which offset will be retrieved by poll() after seek() is used in various circumstances. Also added integration tests.

Reviewers: Luke Chen <showuon@gmail.com>
2022-12-16 17:12:40 +08:00
Akhilesh C 8b045dcbf6
KAFKA-14446: API forwarding support from zkBrokers to the Controller (#12961)
This PR enables brokers which are upgrading from ZK mode to KRaft mode to forward certain metadata
change requests to the controller instead of applying them directly through ZK. To faciliate this,
we now support EnvelopeRequest on zkBrokers (instead of only on KRaft nodes.)

In BrokerToControllerChannelManager, we can now reinitialize our NetworkClient. This is needed to
handle the case when we transition from forwarding requests to a ZK-based broker over the
inter-broker listener, to forwarding requests to a quorum node over the controller listener.

In MetadataCache.scala, distinguish between KRaft and ZK controller nodes with a new type,
CachedControllerId.

In LeaderAndIsrRequest, StopReplicaRequest, and UpdateMetadataRequest, switch from sending both a
zk and a KRaft controller ID to sending a single controller ID plus a boolean to express whether it
is KRaft. The previous scheme was ambiguous as to whether the system was in KRaft or ZK mode when
both IDs were -1 (although this case is unlikely to come up in practice). The new scheme avoids
this ambiguity and is simpler to understand.

Reviewers: dengziming <dengziming1993@gmail.com>, David Arthur <mumrah@gmail.com>, Colin P. McCabe <cmccabe@apache.org>
2022-12-15 14:16:41 -08:00
Greg Harris dcc02346c5
KAFKA-13881: Add Clients package description javadocs (#12895)
Reviewers: Mickael Maison <mickael.maison@gmail.com>

, Tom Bentley <tbentley@redhat.com>
2022-12-15 10:03:44 +01:00
David Arthur 67c72596af
KAFKA-14448 Let ZK brokers register with KRaft controller (#12965)
Prior to starting a KIP-866 migration, the ZK brokers must register themselves with the active
KRaft controller. The controller waits for all brokers to register in order to verify that all the
brokers can

A) Communicate with the quorum
B) Have the migration config enabled
C) Have the proper IBP set

This patch uses the new isMigratingZkBroker field in BrokerRegistrationRequest and
RegisterBrokerRecord. The type was changed from int8 to bool for BrokerRegistrationRequest (a
mistake from #12860). The ZK brokers use the existing BrokerLifecycleManager class to register and
heartbeat with the controllers.

Reviewers: Mickael Maison <mickael.maison@gmail.com>, Colin P. McCabe <cmccabe@apache.org>
2022-12-13 13:15:21 -08:00
Artem Livshits 43f39c2e60
KAFKA-14379: Consumer should refresh preferred read replica on update metadata (#12956)
The consumer (fetcher) used to refresh the preferred read replica on
three conditions:
    
1. the consumer receives an OFFSET_OUT_OF_RANGE error
2. the follower does not exist in the client's metadata (i.e., offline)
3. after metadata.max.age.ms (5 min default)
    
For other errors, it will continue to reach to the possibly unavailable
follower and only after 5 minutes will it refresh the preferred read
replica and go back to the leader.
    
Another problem is that the client might have stale metadata and not
send fetches to preferred replica, even after the leader redirects to
the preferred replica.
    
A specific example is when a partition is reassigned. the consumer will
get NOT_LEADER_OR_FOLLOWER which triggers a metadata update but the
preferred read replica will not be refreshed as the follower is still
online. it will continue to reach out to the old follower until the
preferred read replica expires.
    
The consumer can instead refresh its preferred read replica whenever it
makes a metadata update request, so when the consumer receives i.e.
NOT_LEADER_OR_FOLLOWER it can find the new preferred read replica without
waiting for the expiration.
    
Generally, we will rely on the leader to choose the correct preferred
read replica and have the consumer fail fast (clear preferred read replica
cache) on errors and reach out to the leader.

Co-authored-by: Jeff Kim <jeff.kim@confluent.io>

Reviewers: David Jacot <djacot@confluent.io>, Jason Gustafson <jason@confluent.io>
2022-12-12 09:55:22 +01:00
Ismael Juma 88725669e7
MINOR: Move MetadataQuorumCommand from `core` to `tools` (#12951)
`core` should only be  used for legacy cli tools and tools that require
access to `core` classes instead of communicating via the kafka protocol
(typically by using the client classes).

Summary of changes:
1. Convert the command implementation and tests to Java and move it to
    the `tools` module.
2. Introduce mechanism to capture stdout and stderr from tests.
3. Change `kafka-metadata-quorum.sh` to point to the new command class.
4. Adjusted the test classpath of the `tools` module so that it supports tests
    that rely on the `@ClusterTests` annotation.
5. Improved error handling when an exception different from `TerseFailure` is
    thrown.
6. Changed `ToolsUtils` to avoid usage of arrays in favor of `List`.

Reviewers: dengziming <dengziming1993@gmail.com>
2022-12-09 09:22:58 -08:00
David Jacot f9a09fdd29
MINOR: Small refactor in DescribeGroupsResponse (#12970)
This patch does a few cleanups:
* It removes `DescribeGroupsResponse.fromError` and pushes its logic to `DescribeGroupsRequest.getErrorResponse` to be consistent with how we implemented the other requests/responses.
* It renames `DescribedGroup.forError` to `DescribedGroup.groupError`.

The patch relies on existing tests.

Reviewers: Mickael Maison <mickael.maison@gmail.com>
2022-12-09 14:15:53 +01:00
David Jacot a0c19c05ef
KAFKA-14425; The Kafka protocol should support nullable structs (#12932)
This patch adds support for nullable structs in the Kafka protocol as described in KIP-893 - https://cwiki.apache.org/confluence/display/KAFKA/KIP-893%3A+The+Kafka+protocol+should+support+nullable+structs.

Reviewers: Colin Patrick McCabe <cmccabe@apache.org>
2022-12-08 20:54:29 +01:00
Rajini Sivaram d23ce20bdf
KAFKA-14352: Rack-aware consumer partition assignment protocol changes (KIP-881) (#12954)
Reviewers: David Jacot <djacot@confluent.io>
2022-12-07 11:41:21 +00:00
Justine Olshan 5aad085a8e
KAFKA-14417: Producer doesn't handle REQUEST_TIMED_OUT for InitProducerIdRequest, treats as fatal error (#12915)
The broker may return the `REQUEST_TIMED_OUT` error in `InitProducerId` responses when allocating the ID using the `AllocateProducerIds` request. The client currently does not handle this. Instead of retrying as we would expect, the client raises a fatal exception to the application. 

In this patch, we address this problem by modifying the producer to handle `REQUEST_TIMED_OUT` and any other retriable errors by re-enqueuing the request. 

Reviewers: Jason Gustafson <jason@confluent.io>
2022-12-06 11:41:31 -08:00
Divij Vaidya 8bb897655f
MINOR: Optimize metric recording when quota check not required (#12933)
In the selector code path, we record some values for some sensors which do not have any metric associated with them that requires quota. Yet, a redundant check for quotas is made which consumes ~0.7% of CPU time in the hot path as demonstrated by the flame graph below. 

![Screenshot 2022-12-01 at 15 46 52](https://user-images.githubusercontent.com/71267/205082727-29c1300d-48fa-475f-b736-45743a2291ba.png)

This PR is a minor optimization which removes the redundant check for quotas in cases where it is not required in the hot path.

Flamegraph after this patch (note that checkQuotas CPU utilization has been removed)

<img width="1789" alt="Screenshot 2022-12-02 at 12 00 31" src="https://user-images.githubusercontent.com/71267/205278204-798fbe96-0acf-4fa4-995b-19b66b6c2cbb.png">

Reviewers: Luke Chen <showuon@gmail.com>, David Jacot <djacot@confluent.io>
2022-12-05 13:58:07 +01:00
David Jacot df29b17fc4
KAFKA-14367; Add `LeaveGroup` to the new `GroupCoordinator` interface (#12850)
This patch adds `leaveGroup` to the new `GroupCoordinator` interface and updates `KafkaApis` to use it.

Reviewers: Justine Olshan <jolshan@confluent.io>, Jeff Kim <jeff.kim@confluent.io>, Jason Gustafson <jason@confluent.io>
2022-12-05 09:28:35 +01:00
David Arthur 7b7e40a536
KAFKA-14304 Add RPC changes, records, and config from KIP-866 (#12928)
Reviewers: Colin Patrick McCabe <cmccabe@apache.org>
2022-12-02 19:59:52 -05:00
Colin Patrick McCabe 5514f372b3
MINOR: extract jointly owned parts of BrokerServer and ControllerServer (#12837)
Extract jointly owned parts of BrokerServer and ControllerServer into SharedServer. Shut down
SharedServer when the last component using it shuts down. But make sure to stop the raft manager
before closing the ControllerServer's sockets.

This PR also fixes a memory leak where ReplicaManager was not removing some topic metric callbacks
during shutdown. Finally, we now release memory from the BatchMemoryPool in KafkaRaftClient#close.
These changes should reduce memory consumption while running junit tests.

Reviewers: Jason Gustafson <jason@confluent.io>, Ismael Juma <ismael@juma.me.uk>
2022-12-02 00:27:22 -08:00
José Armando García Sancio 9b409de1e2
KAFKA-14358; Disallow creation of cluster metadata topic (#12885)
With KRaft the cluster metadata topic (__cluster_metadata) has a different implementation compared to regular topic. The user should not be allowed to create this topic. This can cause issues if the metadata log dir is the same as one of the log dirs.

This change returns an authorization error if the user tries to create the cluster metadata topic.

Reviewers: David Arthur <mumrah@gmail.com>
2022-12-01 18:34:29 -08:00
Jonathan Albrecht b56e71faee
MINOR: Update unit/integration tests to work with the IBM Semeru JDK (#12343)
The IBM Semeru JDK use the OpenJDK security providers instead of the IBM security providers so test for the OpenJDK classes first where possible and test for Semeru in the java.runtime.name system property otherwise.

Reviewers: Mickael Maison <mickael.maison@gmail.com>, Bruno Cadonna <cadonna@apache.org>
2022-12-01 16:22:00 +01:00
Luke Chen 81d98993ad
KAFKA-13715: add generationId field in subscription (#12748)
Implementation for KIP-792, to add generationId field in ConsumerProtocolSubscription message. So when doing assignment, we'll take from subscription generationId fields it is provided in cooperative rebalance protocol. Otherwise, we'll fall back to original solution to use userData.

Reviewers: David Jacot <djacot@confluent.io>
2022-12-01 14:25:51 +08:00
Joel Hamill d9946a7ffc
MINOR: Fix config documentation formatting (#12921)
Reviewers: José Armando García Sancio <jsancio@apache.org>
2022-11-30 08:54:39 -08:00
Divij Vaidya b2d8354e10
KAFKA-14414: Fix request/response header size calculation (#12917)
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Luke Chen <showuon@gmail.com>
2022-11-30 12:17:21 +01:00
David Jacot be032735b3
KAFKA-14422; Consumer rebalance stuck after new static member joins a group with members not supporting static members (#12909)
When a consumer group on a version prior to 2.3 is upgraded to a newer version and static membership is enabled in the meantime, the consumer group remains stuck, iff the leader is still on the old version.

The issue is that setting `GroupInstanceId` in the response to the leader is only supported from JoinGroup version >= 5 and that `GroupInstanceId` is not ignorable nor handled anywhere else. Hence is there is at least one static member in the group, sending the JoinGroup response to the leader fails with a serialization error.

```
org.apache.kafka.common.errors.UnsupportedVersionException: Attempted to write a non-default groupInstanceId at version 2
```

When this happens, the member stays around until the group coordinator is bounced because a member with a non-null `awaitingJoinCallback` is never expired.

This patch fixes the issue by making `GroupInstanceId` ignorable. A unit test has been modified to cover this.

Reviewers: Jason Gustafson <jason@confluent.io>
2022-11-28 20:15:54 +01:00
Divij Vaidya edfa894eb5
KAFKA-14414: Remove unnecessary usage of ObjectSerializationCache (#12890)
Reviewers: Mickael Maison <mickael.maison@gmail.com>
2022-11-28 15:15:34 +01:00
Divij Vaidya 19286449ee
MINOR: Remove possibility of overriding test files clean up (#12889)
Reviewers: Chris Egerton <chrise@aiven.io>
2022-11-22 11:06:33 -05:00
Artem Livshits 36f933fc5f
Minor; fix sticky partitioner doc to say at least batch.size (#12880)
Reviewers: Jun Rao <junrao@gmail.com>
2022-11-21 09:10:04 -08:00
Joel Hamill 95499c409f
MINOR: fix syntax typo (#12868)
Reviewers: Luke Chen <showuon@gmail.com>
2022-11-18 11:13:11 +08:00
Igor Soarez 5bd556a49b
KAFKA-14303 Producer.send without record key and batch.size=0 goes into infinite loop (#12752)
This fixes an bug which causes a call to producer.send(record) with a record without a key and configured with batch.size=0 never to return.

Without specifying a key or a custom partitioner the new BuiltInPartitioner, as described in KIP-749 kicks in.

BuiltInPartitioner seems to have been designed with the reasonable assumption that the batch size will never be lower than one.

However, documentation for producer configuration states batch.size=0 as a valid value, and even recommends its use directly. [1]

[1] clients/src/main/java/org/apache/kafka/clients/producer/ProducerConfig.java:87

Reviewers: Artem Livshits <alivshits@confluent.io>, Luke Chen <showuon@gmail.com>, Ismael Juma <ismael@juma.me.uk>, Jun Rao <junrao@gmail.com>
2022-11-17 14:27:02 -08:00
David Jacot c2fc36f331
MINOR: Handle JoinGroupResponseData.protocolName backward compatibility in JoinGroupResponse (#12864)
This is a small refactor extracted from https://github.com/apache/kafka/pull/12845. It basically moves the logic to handle the backward compatibility of `JoinGroupResponseData.protocolName` from `KafkaApis` to `JoinGroupResponse`.

The patch adds a new unit test for `JoinGroupResponse` and relies on existing tests as well.

Reviewers: Justine Olshan <jolshan@confluent.io>, Jason Gustafson <jason@confluent.io>
2022-11-16 12:43:00 -08:00
Omnia G H Ibrahim 46bee5bcf3
KAFKA-13401: KIP-787 - MM2 manage Kafka resources with custom Admin implementation. (#12577)
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Tom Bentley <tombentley@users.noreply.github.com>

Co-authored-by: Tom Bentley <tombentley@users.noreply.github.com>
Co-authored-by: oibrahim3 <omnia@apple.com>
2022-11-15 11:21:24 +01:00
Ashmeet Lamba a971448f3f
KAFKA-14254: Format timestamps as dates in logs (#12684)
Improves logs withing Streams by replacing timestamps to date instances to improve readability.

Approach - Adds a function within common.utils.Utils to convert a given long timestamp to a date-time string with the same format used by Kafka's logger.

Reviewers: Divij Vaidya <diviv@amazon.com>, Bruno Cadonna <cadonna@apache.org>
2022-11-07 13:42:39 +01:00
Shawn fcab5fb888
Revert "KAFKA-13891: reset generation when syncgroup failed with REBALANCE_IN_PROGRESS (#12140)" (#12794)
This reverts commit c23d60d56c.

Reviewers: Luke Chen <showuon@gmail.com>
2022-11-05 14:05:48 +08:00
Philip Nee dc18dd921c
MINOR: Remove outdated comment about metadata request (#12810)
This is an outdated comment about the metadata request from the old implementation. The name of the RPC was changed from GroupMetadata to FindCoordinator, but the comment was not updated.

Reviewers: Jason Gustafson <jason@confluent.io>
2022-11-01 17:01:19 -07:00
Philip Nee fa10e213bf
KAFKA-14247: add handler impl to the prototype (#12775)
Minor revision for KAFKA-14247. Added how the handler is called and constructed to the prototype code path.

Reviewers: John Roesler <vvcephei@apache.org>, Kirk True <kirk@mustardgrain.com>
2022-10-31 14:01:22 -05:00
Philip Nee 0a045d4ef7
KAFKA-14247: Consumer background thread base implementation (#12672)
Adds skeleton of the background thread.

1-pager: https://cwiki.apache.org/confluence/display/KAFKA/Proposal%3A+Consumer+Threading+Model+Refactor
Continuation of #12663

Reviewers: Guozhang Wang <guozhang@apache.org>, Kirk True <kirk@mustardgrain.com>, John Roesler <vvcephei@apache.org>
2022-10-19 21:02:55 -05:00
Ismael Juma de8c9ea04c
MINOR: Include TLS version in transport layer debug log (#12751)
This was helpful when debugging an issue recently.

Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Divij Vaidya <divijvaidya13@gmail.com>
2022-10-17 08:26:23 -07:00
Philip Nee 7d67ddce22
MINOR: typo in KafkaChannel (#12757)
Reviewers: Luke Chen <showuon@gmail.com>, Divij Vaidya <diviv@amazon.com>
2022-10-16 20:08:53 +08:00
Chris Egerton 18e60cb000
KAFKA-12497: Skip periodic offset commits for failed source tasks (#10528)
Also moves the Streams LogCaptureAppender class into the clients module so that it can be used by both Streams and Connect.

Reviewers: Nigel Liang <nigel@nigelliang.com>, Kalpesh Patel <kpatel@confluent.io>, John Roesler <vvcephei@apache.org>, Tom Bentley <tbentley@redhat.com>
2022-10-13 10:15:42 -04:00
Mickael Maison a7a026cabb
MINOR: Fix closing code tag in producer config docs (#12718)
Reviewers: David Jacot <djacot@confluent.io>, Luke Chen <showuon@gmail.com>
2022-10-06 14:33:28 +02:00
Jason Gustafson c5745d2845
MINOR: Add initial property tests for StandardAuthorizer (#12703)
In https://github.com/apache/kafka/pull/12695, we discovered a gap in our testing of `StandardAuthorizer`. We addressed the specific case that was failing, but I think we need to establish a better methodology for testing which incorporates randomized inputs. This patch is a start in that direction. We implement a few basic property tests using jqwik which focus on prefix searching. It catches the case from https://github.com/apache/kafka/pull/12695 prior to the fix. In the future, we can extend this to cover additional operation types, principal matching, etc.

Reviewers: David Arthur <mumrah@gmail.com>
2022-10-04 16:31:43 -07:00
Philip Nee 997dfa950e
MINOR: Fix typo in selector documentation (#12710)
Reviewers: David Jacot <djacot@confluent.io>
2022-10-04 13:05:26 +02:00
Philip Nee c3690a3b4a
KAFKA-14247; Define event handler interface and events (#12663)
Add some initial interfaces to kick off https://cwiki.apache.org/confluence/display/KAFKA/Proposal%3A+Consumer+Threading+Model+Refactor. We introduce an `EventHandler` interface and a new consumer implementation which demonstrates how it will be used. Subsequent PRs will continue to flesh out the implementation.

Reviewers: Guozhang Wang wangguoz@gmail.com, Jason Gustafson <jason@confluent.io>
2022-10-03 09:36:40 -07:00
LinShunKang 496ae054c2
Fix ByteBufferSerializer#serialize(String, ByteBuffer) not roundtrip input with ByteBufferDeserializer#deserialize(String, byte[]) (#12704)
Reviewers: Guozhang Wang <wangguoz@gmail.com>
2022-09-30 06:45:18 -07:00
LinShunKang 51dbd175b0
KAFKA-4852: Fix ByteBufferSerializer#serialize(String, ByteBuffer) not compatible with offsets (#12683)
Reviewers: Guozhang Wang <wangguoz@gmail.com>
2022-09-29 10:59:47 -07:00
Kirk True 8e43548175
KAFKA-13725: KIP-768 OAuth code mixes public and internal classes in same package (#12039)
* KAFKA-13725: KIP-768 OAuth code mixes public and internal classes in same package

Move classes into a sub-package of "internal" named "secured" that
matches the layout more closely of the "unsecured" package.

Replaces the concrete implementations in the former packages with
sub-classes of the new package layout and marks them as deprecated. If
anyone is already using the newer OAuth code, this should still work.

* Fix checkstyle and spotbugs violations

Co-authored-by: Kirk True <kirk@mustardgrain.com>

Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
2022-09-23 13:15:15 +05:30
Manikumar Reddy 5587c65fd3 MINOR: Add configurable max receive size for SASL authentication requests
This adds a new configuration `sasl.server.max.receive.size` that sets the maximum receive size for requests before and during authentication.

Reviewers: Tom Bentley <tbentley@redhat.com>, Mickael Maison <mickael.maison@gmail.com>

Co-authored-by: Manikumar Reddy <manikumar.reddy@gmail.com>
Co-authored-by: Mickael Maison <mickael.maison@gmail.com>
2022-09-21 20:58:33 +05:30
Colin Patrick McCabe b401fdaefb MINOR: Add more validation during KRPC deserialization
When deserializing KRPC (which is used for RPCs sent to Kafka, Kafka Metadata records, and some
    other things), check that we have at least N bytes remaining before allocating an array of size N.

    Remove DataInputStreamReadable since it was hard to make this class aware of how many bytes were
    remaining. Instead, when reading an individual record in the Raft layer, simply create a
    ByteBufferAccessor with a ByteBuffer containing just the bytes we're interested in.

    Add SimpleArraysMessageTest and ByteBufferAccessorTest. Also add some additional tests in
    RequestResponseTest.

    Reviewers: Tom Bentley <tbentley@redhat.com>, Mickael Maison <mickael.maison@gmail.com>, Colin McCabe <colin@cmccabe.xyz>

    Co-authored-by: Colin McCabe <colin@cmccabe.xyz>
    Co-authored-by: Manikumar Reddy <manikumar.reddy@gmail.com>
    Co-authored-by: Mickael Maison <mickael.maison@gmail.com>
2022-09-21 20:58:23 +05:30
Sushant Mahajan f8e0a6d924
KAFKA-14212: Enhance HttpAccessTokenRetriever to retrieve error message (#12651)
Currently HttpAccessTokenRetriever client side class does not retrieve error response from the token e/p. As a result, seemingly trivial config issues could take a lot of time to diagnose and fix. For example, client could be sending invalid client secret, id or scope.
This PR aims to remedy the situation by retrieving the error response, if present and logging as well as appending to any exceptions thrown.
New unit tests have also been added.

Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
2022-09-20 12:33:19 +05:30
Artem Livshits 2b2039f0ba
KAFKA-14156: Built-in partitioner may create suboptimal batches (#12570)
Now the built-in partitioner defers partition switch (while still
accounting produced bytes) if there is no ready batch to send, thus
avoiding switching partitions and creating fractional batches.

Reviewers: Jun Rao <jun@confluent.io>
2022-09-14 17:39:14 -07:00
Philip Nee 5f01fed206
MINOR: Small cleanups in FetcherTest following KAFKA-14196 (#12629)
Minor cleanups in `FetcherTest` following https://github.com/apache/kafka/pull/12603.

Reviewers: Luke Chen <showuon@gmail.com>, Jason Gustafson <jason@confluent.io>
2022-09-13 11:10:41 -07:00
Jason Gustafson 3d2ac7cdbe
KAFKA-14208; Do not raise wakeup in consumer during asynchronous offset commits (#12626)
Asynchronous offset commits may throw an unexpected WakeupException following #11631 and #12244. This patch fixes the problem by passing through a flag to ensureCoordinatorReady to indicate whether wakeups should be disabled. This is used to disable wakeups in the context of asynchronous offset commits. All other uses leave wakeups enabled.

Note: this patch builds on top of #12611.

Co-Authored-By: Guozhang Wang wangguoz@gmail.com

Reviewers: Luke Chen <showuon@gmail.com>
2022-09-13 15:43:09 +08:00
Philip Nee 536cdf692f
KAFKA-14196; Do not continue fetching partitions awaiting auto-commit prior to revocation (#12603)
When auto-commit is enabled with the "eager" rebalance strategy, the consumer will commit all offsets prior to revocation. Following recent changes, this offset commit is done asynchronously, which means there is an opportunity for fetches to continue returning data to the application. When this happens, the progress is lost following revocation, which results in duplicate consumption. This patch fixes the problem by adding a flag in `SubscriptionState` to ensure that partitions which are awaiting revocation will not continue being fetched.

Reviewers: Luke Chen <showuon@gmail.com>, Jason Gustafson <jason@confluent.io>
2022-09-12 21:02:13 -07:00
Jason Gustafson b9774c0b02
KAFKA-14215; Ensure forwarded requests are applied to broker request quota (#12624)
Currently forwarded requests are not applied to any quotas on either the controller or the broker. The controller-side throttling requires the controller to apply the quota changes from the log to the quota managers, which will be done separately. In this patch, we change the response logic on the broker side to also apply the broker's request quota. The enforced throttle time is the maximum of the throttle returned from the controller (which is 0 until we fix the aforementioned issue) and the broker's request throttle time.

Reviewers: David Arthur <mumrah@gmail.com>
2022-09-12 20:50:33 -07:00
Divij Vaidya d4fc3186b4
MINOR: Replace usage of File.createTempFile() with TestUtils.tempFile() (#12591)
Why
Current set of integration tests leak files in the /tmp directory which makes it cumbersome if you don't restart the machine often.

Fix
Replace the usage of File.createTempFile with existing TestUtils.tempFile method across the test files. TestUtils.tempFile automatically performs a clean up of the temp files generated in /tmp/ folder.

Reviewers: Luke Chen <showuon@gmail.com>, Ismael Juma <mlists@juma.me.uk>
2022-09-13 08:44:21 +08:00
Colin Patrick McCabe 47252f431e
KAFKA-14216: Remove ZK reference from org.apache.kafka.server.quota.ClientQuotaCallback javadoc (#12617)
Reviewers: Luke Chen <showuon@gmail.com>
2022-09-12 08:34:46 -07:00
David Jacot b7f20be809
KAFKA-14201; Consumer should not send group instance ID if committing with empty member ID (#12599)
The consumer group instance ID is used to support a notion of "static" consumer groups. The idea is to be able to identify the same group instance across restarts so that a rebalance is not needed. However, if the user sets `group.instance.id` in the consumer configuration, but uses "simple" assignment with `assign()`, then the instance ID nevertheless is sent in the OffsetCommit request to the coordinator. This may result in a surprising UNKNOWN_MEMBER_ID error.

This PR fixes the issue on the client side by not setting the group instance id if the member id is empty (no generation).

Reviewers: Jason Gustafson <jason@confluent.io>
2022-09-08 15:05:40 -07:00
Ismael Juma faefbb29b6
MINOR: Fix comment in Exit.java to refer to the right method (#12592)
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
2022-09-06 20:12:55 -07:00
Andrew Dean 0cbf74a940
KAFKA-14194: Fix NPE in Cluster.nodeIfOnline (#12584)
When utilizing the rack-aware consumer configuration and rolling updates are being applied to the Kafka brokers the metadata updates can be in a transient state and a given topic-partition can be missing from the metadata. This seems to resolve itself after a bit of time but before it can resolve the `Cluster.nodeIfOnline` method throws an NPE. This patch checks to make sure that a given topic-partition has partition info available before using that partition info.

Reviewers: David Jacot <djacot@confluent.io>
2022-09-05 09:56:23 +02:00