Commit Graph

14081 Commits

Author SHA1 Message Date
David Jacot af5df59d2b
KAFKA-17593; [1/N] Introduce re2j dependency (#17634)
This patch is the first of a series of patches to introduce support for server side regular expression. It introduces the re2j dependency.

Co-authored-by: Lianet Magrans <lmagrans@confluent.io>

Reviewers: Lianet Magrans <lmagrans@confluent.io>
2024-10-30 08:20:11 -07:00
Ken Huang 2a46282b2a
KAFKA-17873: Add description to all packages in the public API (#17605)
Reviewers: Mickael Maison <mickael.maison@gmail.com>
2024-10-30 15:41:10 +01:00
PoAn Yang 7fb6e9ec1c
KAFKA-17840 Move ReplicationQuotaManager, ClientRequestQuotaManager and QuotaFactory to server module (#17609)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-10-30 21:18:28 +08:00
Kuan-Po Tseng f0a3960e3e
KAFKA-17867 Consider using zero-copy for PushTelemetryRequest (#17622)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-10-30 20:49:56 +08:00
wperlichek 49ea947095
MINOR: Fix spelling typo in Docker Compose examples in README (#17631)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-10-30 20:44:24 +08:00
Josep Prat c55d4f08c4
MINOR: Update the upgrade docs to include 3.8.1 version (#17628)
Signed-off-by: Josep Prat <josep.prat@aiven.io>

Reviewers: Mickael Maison <mickael.maison@gmail.com>
2024-10-30 11:41:00 +01:00
Nick Telford 571f50817c
KAFKA-17411: Create local state Standbys on start (#16922)
Instead of waiting until Tasks are assigned to us, we pre-emptively
create a StandbyTask for each non-empty Task directory found on-disk.

We do this before starting any StreamThreads, and on our first
assignment (after joining the consumer group), we recycle any of these
StandbyTasks that were assigned to us, either as an Active or a
Standby.

We can't just use these "initial Standbys" as-is, because they were
constructed outside the context of a StreamThread, so we first have to
update them with the context (log context, ChangelogReader, and source
topics) of the thread that it has been assigned to.
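
The recycling step described above can be pictured with a small in-memory sketch. It is not the Streams implementation: the class, enum, and method names are hypothetical, task IDs are plain strings, and the handling of initial Standbys that are not assigned to this instance is left open because the message does not describe it.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical sketch: decide how each pre-created "initial Standby" is recycled
// once the first assignment arrives. Names here are illustrative assumptions.
final class InitialStandbyRecycling {
    enum Role { ACTIVE, STANDBY, UNASSIGNED }

    static Map<String, Role> classify(Set<String> initialStandbys,
                                      Set<String> assignedActive,
                                      Set<String> assignedStandby) {
        Map<String, Role> roles = new TreeMap<>();
        for (String taskId : initialStandbys) {
            if (assignedActive.contains(taskId)) {
                roles.put(taskId, Role.ACTIVE);      // recycle as an Active task
            } else if (assignedStandby.contains(taskId)) {
                roles.put(taskId, Role.STANDBY);     // recycle as a Standby task
            } else {
                roles.put(taskId, Role.UNASSIGNED);  // not assigned to this instance
            }
        }
        return roles;
    }

    public static void main(String[] args) {
        System.out.println(classify(Set.of("0_0", "0_1", "0_2"), Set.of("0_0"), Set.of("0_1")));
    }
}
```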

The motivation for this is to (in a later commit) read StateStore
offsets for unowned Tasks from the StateStore itself, rather than the
.checkpoint file, which we plan to deprecate and remove.

There are a few additional benefits:

1. Initializing these Tasks on start-up, instead of on-assignment, will
   reduce the time between a member joining the consumer group and beginning
   processing. This is especially important when active tasks are being moved
   over, for example, as part of a rolling restart.

2. If a Task has corrupt data on-disk, it will be discovered on startup and
   wiped under EOS. This is preferable to wiping the state after being
   assigned the Task, because another instance may have non-corrupt data and
   would not need to restore (as much).

There is a potential performance impact: we open all on-disk Task
StateStores, and keep them all open until we have our first assignment.
This could require large amounts of memory, in particular when there are
a large number of local state stores on-disk.

However, since old local state for Tasks we don't own is automatically
cleaned up after a period of time, in practice, we will almost always
only be dealing with the state that was last assigned to the local
instance.

Reviewers: Anna Sophie Blee-Goldman <ableegoldman@apache.org>, Bruno Cadonna <cadonna@apache.org>, Matthias Sax <mjsax@apache.org>
2024-10-29 12:59:25 -07:00
TengYao Chi 7366f0487a
KAFKA-17128 Make node.id immutable after removing zookeeper migration (#17616)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-10-29 23:29:18 +08:00
Josep Prat 4419677e06
Update docker_scan.yml for 3.8.1 (#17623)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-10-29 16:06:01 +01:00
Yung 984777f0b9
KAFKA-17875: Align KRaft controller count recommendations (#17600)
Reviewers: Mickael Maison <mickael.maison@gmail.com>
Co-authored-by: yungh <yungh@nvidia.com>
2024-10-29 11:37:49 +01:00
Alieh Saeedi 4817eb9227
KAFKA-15344: Streams task should cache consumer nextOffsets (#17091)
This PR augments Streams messages with leader epoch. In case of empty buffer queues, the last offset and leader epoch are retrieved from the streams task's cache of nextOffsets.

Co-authored-by: Lucas Brutschy <lbrutschy@confluent.io>
Reviewers: Lucas Brutschy <lbrutschy@confluent.io>, Matthias J. Sax <matthias@confluent.io>
2024-10-29 09:30:11 +01:00
David Arthur 8c071b02e9
MINOR: Fix JDK versions in CI (#17621)
Our "validate" job was running on JDK 21 while the "test" job was running on JDK 11 and 23. This patch updates the validate job to JDK 23 and fixes the test catalog step to run only on JDK 23 (instead of 21).

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-10-29 13:58:38 +08:00
Kirk True 9e424755d4
KAFKA-17439: Make polling for new records an explicit action/event in the new consumer (#17035)
Reviewers: Andrew Schofield <aschofield@confluent.io>, Lianet Magrans <lmagrans@confluent.io>
2024-10-28 15:46:37 -04:00
Sushant Mahajan 5f92f60bff
KAFKA-17329: DefaultStatePersister implementation (#17270)
Adds the DefaultStatePersister and other supporting classes for managing share state.

* Added DefaultStatePersister implementation. This is the entry point for callers who wish to invoke the share state RPC API.
* Added PersisterStateManager which is used by DefaultStatePersister to manage and send the RPCs over the network.
* Added code to BrokerServer and BrokerMetadataPublisher to instantiate the appropriate persister based on the config value for group.share.persister.class.name. If this is not specified, the DefaultStatePersister will be used. To force use of NoOpStatePersister, set the config to empty. This is an internal config, not to be exposed to the end user. It is used to plug in the appropriate persister through a factory.
* Using this persister, the internal __share_group_state topic will come to life and will be used for persistence of share group info.
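
The three cases for group.share.persister.class.name can be summarized in a short sketch. This is an illustration only: the class and method names are hypothetical, "not specified" is modeled as null, and the returned strings are short class names instead of whatever the broker actually instantiates.

```java
// Hypothetical sketch of the selection rule described in the bullets above.
final class PersisterSelection {
    static String resolve(String configuredClassName) {
        if (configuredClassName == null) {
            return "DefaultStatePersister";   // config not specified: use the default
        }
        if (configuredClassName.isEmpty()) {
            return "NoOpStatePersister";      // config set to empty: force the no-op persister
        }
        return configuredClassName;           // otherwise use the configured class
    }

    public static void main(String[] args) {
        System.out.println(resolve(null));
        System.out.println(resolve(""));
    }
}
```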

Reviewers: Andrew Schofield <aschofield@confluent.io>, Jun Rao <junrao@gmail.com>, David Arthur <mumrah@gmail.com>
2024-10-28 14:11:04 -04:00
Colin Patrick McCabe 14a9130f6f
KAFKA-17793: Improve kcontroller robustness against long delays (#17502)
As described in KIP-500, the Kafka controller monitors the liveness of each broker in the cluster. It gathers this information from heartbeats sent from the brokers themselves.

In some rare cases, the main controller thread may get blocked for several seconds at a time. In the current code, this will result in the controller being unable to update the last contact times for the brokers during this time.

This PR changes the controller heartbeat handling to be partially lockless. Specifically, the last contact time for each broker will be updated locklessly prior to the rest of the heartbeat handling. This will ensure that heartbeats always get through.
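
A minimal sketch of the idea, assuming a per-broker atomic timestamp that request threads update before the heartbeat reaches the main controller thread. The class and method names are hypothetical and are not taken from the controller code; the point is only that the timestamp update does not wait on a controller-wide lock.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: last contact times kept outside the main controller lock.
final class LastContactTimes {
    private final ConcurrentHashMap<Integer, AtomicLong> lastContactNs = new ConcurrentHashMap<>();

    // Record a heartbeat; the stored time never moves backwards.
    void touch(int brokerId, long nowNs) {
        lastContactNs.computeIfAbsent(brokerId, id -> new AtomicLong(nowNs))
                     .accumulateAndGet(nowNs, Math::max);
    }

    // A broker is stale if it was never seen or its last contact is older than the timeout.
    boolean isStale(int brokerId, long nowNs, long sessionTimeoutNs) {
        AtomicLong last = lastContactNs.get(brokerId);
        return last == null || nowNs - last.get() > sessionTimeoutNs;
    }

    public static void main(String[] args) {
        LastContactTimes times = new LastContactTimes();
        times.touch(1, 100L);
        System.out.println(times.isStale(1, 150L, 60L));
    }
}
```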

Additionally, this PR adds a PeriodicTaskControlManager to better manage periodic tasks. This should help handle the very common pattern where we want to schedule a background task at some frequency. We also want the background task to be immediately rescheduled if there is too much work to be done in one event.
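
The rescheduling rule can be illustrated with a small simulation on a virtual clock. This is a sketch under stated assumptions, not the PeriodicTaskControlManager API: the class, method, and parameter names are hypothetical, and each event is assumed to process at most a fixed number of work items.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: run again immediately while work remains, else wait one period.
final class PeriodicTaskSketch {
    // Returns the virtual times (ms) at which the task ran.
    static List<Long> simulate(int backlog, int maxPerEvent, long periodMs, int runs) {
        List<Long> runTimes = new ArrayList<>();
        long now = 0L;
        int remaining = backlog;
        for (int i = 0; i < runs; i++) {
            runTimes.add(now);
            remaining -= Math.min(remaining, maxPerEvent);  // do a bounded amount of work
            now += remaining > 0 ? 0L : periodMs;           // immediate reschedule if work is left
        }
        return runTimes;
    }

    public static void main(String[] args) {
        System.out.println(simulate(5, 2, 1000L, 4));
    }
}
```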

Reviewers: Liu Zeyu <zeyu.luke@gmail.com>, David Arthur <mumrah@gmail.com>
2024-10-28 08:36:07 -07:00
Mickael Maison 6e88c10ed5
KAFKA-14483 Move LocalLog to storage module (#17587)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-10-28 20:41:46 +08:00
Dmitry Werner 12a60b8cd9
KAFKA-17878 Move ActionQueue to server module (#17602)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-10-28 20:35:23 +08:00
Yung db25c212ed
KAFKA-17883 Fix jvm error caused by UseParNewGC when running old kafka client in e2e (#17612)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-10-28 20:23:32 +08:00
xijiu 71f4001a21
KAFKA-17874: Fix the KRaft metric names in the documentation (#17599)
Reviewers: Mickael Maison <mickael.maison@gmail.com>
2024-10-28 11:12:18 +01:00
Yung 24689dc6ab
KAFKA-17879 test_performance_services.py should use DEV version to run kafka service (#17606)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-10-28 04:32:06 +08:00
Chia-Chuan Yu 7fe009b2e7
KAFKA-17881 Apply the minJavaVersion to test code (#17610)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-10-27 22:09:54 +08:00
TengYao Chi 5a2d44ff4d
MINOR: Fix the broken javadoc in ConsumerNetworkThread (#17607)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-10-27 05:31:22 +08:00
Apoorv Mittal 397ae598e6
KAFKA-17848: Fixing share purgatory request and locks handling (#17583)
For delayed fetch, tryComplete can be called again after onComplete, because requests are processed by parallel threads. The locks acquired in tryComplete then stay held, because onComplete is never called again once the request is already completed.
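
One way to avoid such a leak is to release the lock in tryComplete whenever completion did not happen in that call. The sketch below illustrates that pattern only; the message does not show the actual fix, the class name is hypothetical, and a single atomic flag stands in for the real partition locks.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch of a delayed operation whose tryComplete may run after completion.
final class DelayedFetchSketch {
    private final AtomicBoolean completed = new AtomicBoolean(false);
    private final AtomicBoolean partitionLock = new AtomicBoolean(false);

    boolean tryComplete() {
        if (!partitionLock.compareAndSet(false, true)) {
            return false;                    // lock is busy, try again later
        }
        boolean completedNow = forceComplete();
        if (!completedNow) {
            // Already completed elsewhere: onComplete will not run again,
            // so the lock has to be released here.
            partitionLock.set(false);
        }
        return completedNow;
    }

    private boolean forceComplete() {
        if (completed.compareAndSet(false, true)) {
            onComplete();
            return true;
        }
        return false;
    }

    private void onComplete() {
        partitionLock.set(false);            // normal path: release after completing
    }

    boolean lockHeld() {
        return partitionLock.get();
    }

    public static void main(String[] args) {
        DelayedFetchSketch op = new DelayedFetchSketch();
        System.out.println(op.tryComplete() + " " + op.tryComplete() + " " + op.lockHeld());
    }
}
```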

Reviewers: Abhinav Dixit <adixit@confluent.io>, Andrew Schofield <aschofield@confluent.io>, Jun Rao <junrao@gmail.com>
2024-10-26 08:00:33 -07:00
Colin Patrick McCabe d7ac865fb0
KAFKA-17868: Do not ignore --feature flag in kafka-storage.sh (#17597)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Justine Olshan <jolshan@confluent.io>
2024-10-25 12:33:22 -07:00
Ken Huang e4935aaf8e
MINOR: clean up TestUtils.scala (#17595)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-10-26 01:17:37 +08:00
Ken Huang 09d76f917c
KAFKA-16564 Apply `Xlint` to java code in core module (#16965)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-10-26 01:07:56 +08:00
Andrew Schofield 33147d1089
KAFKA-17863: share consumer max poll records soft limit (#17592)
Make max.poll.records a soft limit so that record batch boundaries can be respected in records returned by ShareConsumer.poll. This gives a significant performance gain because the broker is much more efficient at handling batches which have not been split.
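
A soft limit of this kind can be sketched as "keep adding whole batches until the limit is reached or passed". The code below is an illustration of that rule, not the ShareConsumer implementation; the class and method names are hypothetical and batches are represented only by their record counts.

```java
import java.util.List;

// Hypothetical sketch: never split a batch, so the result may exceed maxPollRecords.
final class SoftLimitSketch {
    // Returns how many whole batches would be handed back by one poll.
    static int batchesToReturn(List<Integer> batchSizes, int maxPollRecords) {
        int records = 0;
        int batches = 0;
        for (int size : batchSizes) {
            if (records >= maxPollRecords) {
                break;              // limit already reached by earlier batches
            }
            records += size;        // take the whole batch, even if it overshoots
            batches++;
        }
        return batches;
    }

    public static void main(String[] args) {
        System.out.println(batchesToReturn(List.of(3, 4, 5), 5));
    }
}
```

With batch sizes 3, 4, and 5 and a limit of 5, the sketch returns the first two batches (7 records) instead of cutting the second batch after 2 records.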

Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Manikumar Reddy <manikumar.reddy@gmail.com>
2024-10-25 16:37:14 +05:30
Dmitry Werner 1eb7644349
KAFKA-16845 Migrate ReplicationQuotasTestRig to new test infra (#17089)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-10-25 03:33:01 +08:00
ClarkChen 5311839bd5
KAFKA-17847 Avoid the extra bytes copy when compressing telemetry payload (#17578)
Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-10-25 03:05:48 +08:00
Yung fc4b739578
KAFKA-17854 Improve tests for ReadOnlyWindowStoreStub#fetch and #backwardFetch (#17586)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-10-25 02:35:07 +08:00
TengYao Chi 63cdb3602a
MINOR: Correct the wrong behavior of TestUtils#verifyTopicDeletion (#17589)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-10-25 02:25:36 +08:00
Stig Døssing 98b7e4deaf
KAFKA-17574 Allow overriding TestKitNodes baseDirectory (#17225)
This allows shutting down a KafkaClusterTestKit from a JVM shutdown hook without risking error logs because the base directory has already been deleted by the shutdown hook TestUtils.tempDirectory sets up.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-10-25 01:48:25 +08:00
ClarkChen 24c6e8d085
KAFKA-17846 ClientTelemetryReporter does not log trace-level message (#17570)
Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-10-25 01:37:20 +08:00
Kuan-Po Tseng 140d35c545
KAFKA-8779 Fix flaky tests introduced by dynamic log levels (#17382)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-10-24 21:39:35 +08:00
TengYao Chi 553e6b4c6d
KAFKA-17860 Remove log4j-appender module (#17588)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-10-24 18:13:30 +08:00
Said Boudjelda 57053ef47d
MINOR: Remove never thrown exception in ByteUtilsBenchmark (#17532)
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-10-24 11:51:23 +02:00
Andrew Schofield e831dfb409
KAFKA-17785: Add tagged field information to protocol docs (#17498)
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Ken Huang <s7133700@gmail.com>
2024-10-24 11:14:45 +02:00
Joao Pedro Fonseca Dantas 3856644cd6
KAFKA-17233: MirrorCheckpointConnector should use batched listConsumerGroupOffsets (#17038)
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Greg Harris <gharris1727@gmail.com>
2024-10-24 09:38:44 +02:00
Tom Duckering 1d231b35b8
MINOR: Use bools for ignorable in EndTxnResponse.json (#17558)
A tiny fix to a single protocol JSON which was recently updated under KAFKA-14562. It seems to erroneously use strings where bools appear to be the right type.

Reviewers: Andrew Schofield <aschofield@confluent.io>, Justine Olshan <jolshan@confluent.io>
2024-10-23 22:24:25 -07:00
Federico Valeri 363bf3cab4
MINOR: Fix the valid values generated doc of the RLM thread pools (#17575)
This patch fixes the valid values generated doc of remote.log.manager.copier.thread.pool.size and remote.log.manager.expiration.thread.pool.size.

Signed-off-by: Federico Valeri <fedevaleri@gmail.com>

Reviewers: Luke Chen <showuon@gmail.com>, Satish Duggana <satishd@apache.org>
2024-10-24 09:58:39 +08:00
Sanskar Jhajharia 8faeb9390d
MINOR: Code cleanup Kafka Streams (#16050)
Reviewers: Matthias J. Sax <matthias@confluent.io>
2024-10-23 16:54:06 -07:00
Apoorv Mittal 0d44415bac
KAFKA-17774: Adding capability to handle max fetch records in Share Fetch (KIP-932) (#17322)
The PR adds capability to restrict the messages in Share Fetch. The max fetch records will be an additional way to limit the number of records sent from broker to client.

In Share Fetch, with min and max bytes, there exist 3 problems:

1. The max.poll.records client config caps the number of records handed to the application, but the client might have fetched more because of a higher max bytes. The timeout for the records sent to the client has nonetheless already started on the broker.
2. Because the application processes records max.poll.records at a time, each acknowledgement covers only that many records. This breaks the batch, so the cached state has to be tracked per offset.
3. The client has to send the acknowledgement for the partial batch and cannot piggyback it on fetch requests.

To handle the above scenarios, max fetch records has been added. Once this PR is merged and we define the right methodology, the KIP will be updated to have max fetch records in the share fetch RPC rather than as a broker config.

Reviewers: Abhinav Dixit <adixit@confluent.io>, Andrew Schofield <aschofield@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>, Jun Rao <junrao@gmail.com>
2024-10-23 13:21:32 -07:00
TaiJuWu 661bed242e
MINOR: add controller-related tests to metadataQuorumCommandTest (#17486)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-10-24 03:08:26 +08:00
David Jacot a96cc6a24d
MINOR: Fix coordinator logging in system tests (#17585)
Reviewers: Jeff Kim <jeff.kim@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
2024-10-24 02:51:18 +08:00
Ken Huang 2ff13976ab
KAFKA-17568 Rewrite TestPurgatoryPerformance by Java (#17246)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-10-24 02:44:37 +08:00
PoAn Yang 2d896d9130
KAFKA-17614: Remove AclAuthorizer (#17424)
Reviewers: Mickael Maison <mickael.maison@gmail.com>
2024-10-23 17:07:48 +02:00
Alieh Saeedi 14a098b289
KAFKA-17600: Add nextOffsets to the ConsumerRecords (#17414)
This PR implements KIP-1094.

Reviewers: Andrew Schofield <aschofield@confluent.io>, Kirk True <ktrue@confluent.io>, Lucas Brutschy <lbrutschy@confluent.io>
2024-10-23 16:25:50 +02:00
Federico Valeri 9d65ff8077
KAFKA-17852 Add help message to the --ignore-formatted flag of StorageTool (#17577)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-10-23 03:52:08 +08:00
TengYao Chi 9cbb3f0a4f
KAFKA-17526 make ConfigCommandIntegrationTest.java test use correct arguments in testing alias (#17201)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-10-23 03:24:08 +08:00
Apoorv Mittal 25a3590dc2
KAFKA-17813: Moving broker endpoint class and common server connection id (#17519)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Kuan-Po Tseng <brandboat@gmail.com>, Jun Rao <junrao@gmail.com>
2024-10-22 11:58:28 -07:00