Commit Graph

13609 Commits

Author SHA1 Message Date
Alyssa Huang b048798a09 KAFKA-16521: Have Raft endpoints printed as name://host:port (#16830)
Reviewers: Colin P. McCabe <cmccabe@apache.org>
2024-08-08 09:23:03 -07:00
Dmitry Werner 9230a3899f
KAFKA-17242: Do not log spurious timeout message for MirrorCheckpointTask sync store startup (#16773)
Reviewers: Chris Egerton <chrise@aiven.io>
2024-08-08 10:04:00 -04:00
TengYao Chi 0b57b36c8f
KAFKA-17232: Do not generate task configs in MirrorCheckpointConnector if initial consumer group load times out (#16767)
Reviewers: Hongten <hongtenzone@foxmail.com>, Chris Egerton <chrise@aiven.io>
2024-08-08 09:58:34 -04:00
Luke Chen 7fe3cec4eb KAFKA-17236: Handle local log deletion when remote.log.copy.disabled=true (#16765)
Handle local log deletion when remote.log.copy.disabled=true based on the KIP-950.

When tiered storage is disabled or becomes read-only on a topic, the local retention configuration becomes irrelevant, and all data expiration follows the topic-wide retention configuration exclusively.

- added remoteLogEnabledAndRemoteCopyEnabled method to check if this topic enables tiered storage and remote log copy is enabled. We should adopt local.retention.ms/bytes when remote.storage.enable=true,remote.log.copy.disable=false.
- Changed to use retention.bytes/retention.ms when remote copy disabled.
- Added validation to ask users to set local.retention.ms == retention.ms and local.retention.bytes == retention.bytes
- Added tests

Reviewers: Kamal Chandraprakash<kamal.chandraprakash@gmail.com>, Satish Duggana <satishd@apache.org>, Christo Lolov <lolovc@amazon.com>
2024-08-08 19:49:23 +08:00
Ken Huang dd5e7a8291 KAFKA-17276; replicaDirectoryId for Fetch and FetchSnapshot should be ignorable (#16819)
The replicaDirectoryId field for FetchRequest and FetchSnapshotRequest should be ignorable. This allows data objects with the directory id to be serialized to any version of the requests.

Reviewers: José Armando García Sancio <jsancio@apache.org>, Chia-Ping Tsai <chia7712@apache.org>
2024-08-08 00:58:01 +00:00
dujian0068 c736d02b52 KAFKA-16584: Make log processing summary configurable or debug--update upgrade-guide (#16709)
Updates Kafka Streams upgrade-guide for KIP-1049.

Reviewers: Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>
2024-08-06 12:09:56 -07:00
Mickael Maison 4e6508b5e3 KAFKA-17227: Refactor compression code to only load codecs when used (#16782)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Josep Prat <josep.prat@aiven.io>
2024-08-06 11:04:28 +02:00
Kuan-Po Tseng 4537c8af5b KAFKA-17235 system test test_performance_service.py failed (#16789)
related to https://issues.apache.org/jira/browse/KAFKA-17235

The root cause of this issue is a change we introduced in KAFKA-16879, where we modified the PushHttpMetricsReporter constructor to use Time.System [1]. However, Time.System doesn't exist in Kafka versions 0.8.2 and 0.9.

In test_performance_services.py, we have system tests for Kafka versions 0.8.2 and 0.9 [2]. These tests always use the tools JAR from the trunk branch, regardless of the Kafka version being tested [3], while the client JAR aligns with the Kafka version specified in the test suite [4]. This discrepancy is what causes the issue to arise.

To resolve this issue, we have a few options:

1) Add Time.System to Kafka 0.8.2 and 0.9: This isn't practical, as we no longer maintain these versions.
2) Modify the PushHttpMetricsReporter constructor to use new SystemTime() instead of Time.System: This would contradict the intent of KAFKA-16879, which aims to make SystemTime a singleton.
3) Implement Time in PushHttpMetricsReporter use the time to get current time
4) Remove system tests for Kafka 0.8.2 and 0.9 from test_performance_services.py

Given that we no longer maintain Kafka 0.8.2 and 0.9, and altering the constructor goes against the design goals of KAFKA-16879, option 4 appears to be the most feasible solution. However, I'm not sure whether it's acceptable to remove these old version tests. Maybe someone else has a better solution

"We'll proceed with option 3 since support for versions 0.8 and 0.9 is still required, meaning we can't remove those Kafka versions from the system tests."

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-08-06 14:52:17 +08:00
José Armando García Sancio 81edb74c5e KAFKA-16533; Update voter handling
Add support for handling the update voter RPC. The update voter RPC is used to automatically update
the voters supported kraft versions and available endpoints as the operator upgrades and
reconfigures the KRaft controllers.

The add voter RPC is handled as follow:

1. Check that the leader has fenced the previous leader(s) by checking that the HWM is known;
   otherwise, return the REQUEST_TIMED_OUT error.

2. Check that the cluster supports kraft.version 1; otherwise, return the UNSUPPORTED_VERSION error.

3. Check that there are no uncommitted voter changes, otherwise return the REQUEST_TIMED_OUT error.

4. Check that the updated voter still supports the currently finalized kraft.version; otherwise
   return the INVALID_REQUEST error.

5. Check that the updated voter is still listening on the default listener.

6. Append the updated VotersRecord to the log. The KRaft internal listener will read this
   uncommitted record from the log and update the voter in the set of voters.

7. Wait for the VotersRecord to commit using the majority of the voters. Return a REQUEST_TIMED_OUT
   error if it doesn't commit in time.

8. Send the UpdateVoter successful response to the voter.

This change also implements the ability for the leader to update its own entry in the voter
set when it becomes leader for an epoch. This is done by updating the voter set and writing a
control batch as the first batch in a new leader epoch.

Finally, fix a bug in KafkaAdminClient's handling of removeRaftVoterResponse where we tried to cast
the response to the wrong type.

Reviewers: Alyssa Huang <ahuang@confluent.io>, Colin P. McCabe <cmccabe@apache.org>
2024-08-05 13:31:51 -07:00
Colin Patrick McCabe 129e7fb0b8 KAFKA-16518: Implement KIP-853 flags for storage-tool.sh (#16669)
As part of KIP-853, storage-tool.sh now has two new flags: --standalone, and --initial-voters. This PR implements these two flags in storage-tool.sh.

There are currently two valid ways to format a cluster:

The pre-KIP-853 way, where you use a statically configured controller quorum. In this case, neither --standalone nor --initial-voters may be specified, and kraft.version must be set to 0.

The KIP-853 way, where one of --standalone and --initial-voters must be specified with the initial value of the dynamic controller quorum. In this case, kraft.version must be set to 1.

This PR moves the formatting logic out of StorageTool.scala and into Formatter.java. The tool file was never intended to get so huge, or to implement complex logic like generating metadata records. Those things should be done by code in the metadata or raft gradle modules. This is also useful for junit tests, which often need to do formatting. (The 'info' and 'random-uuid' commands remain in StorageTool.scala, for now.)

Reviewers: José Armando García Sancio <jsancio@apache.org>
2024-08-05 13:31:40 -07:00
Josep Prat c7d02127b1
KAFKA-17227: Update zstd-jni lib (#16763)
* KAFKA-17227: Update zstd-jni lib
* Add note in upgrade docs
* Change zstd-jni version in docker native file and add warning in dependencies.gradle file
* Add reference to snappy in upgrade

Reviewers:  Chia-Ping Tsai <chia7712@gmail.com>,  Mickael Maison <mickael.maison@gmail.com>
2024-08-05 09:55:42 +02:00
Kuan-Po Tseng b65644c3e3 KAFKA-16154: Broker returns offset for LATEST_TIERED_TIMESTAMP (#16783)
This pr support EarliestLocalSpec LatestTierSpec in GetOffsetShell, and add integration tests.

Reviewers: Luke Chen <showuon@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>, PoAn Yang <payang@apache.org>
2024-08-05 10:41:56 +08:00
Matthias J. Sax 2ddbfebecb KAFKA-16448: Unify error-callback exception handling (#16745)
Follow up code cleanup for KIP-1033.

This PR unifies the handling of both error cases for exception handlers:
 - handler throws an exception
 - handler returns null

The unification happens for all 5 handler cases:
 - deserialzation
 - production / serialization
 - production / send
 - processing
 - punctuation

Reviewers:  Sebastien Viale <sebastien.viale@michelin.com>, Loic Greffier <loic.greffier@michelin.com>, Bill Bejeck <bill@confluent.io>
2024-08-03 13:08:11 -07:00
Luke Chen b622121c0a KAFKA-16855: remote log disable policy in KRaft (#16653)
Reviewers: Kamal Chandraprakash <kamal.chandraprakash@gmail.com>, Christo Lolov <lolovc@amazon.com>
2024-08-03 20:21:05 +08:00
Luke Chen 38db4c46ff KAFKA-17205: Allow topic config validation in controller level in KRaft mode (#16693)
Reviewers: Kamal Chandraprakash <kamal.chandraprakash@gmail.com>, Christo Lolov <lolovc@amazon.com>
2024-08-03 20:20:19 +08:00
PoAn Yang 66485b04c6 KAFKA-16480: ListOffsets change should have an associated API/IBP version update (#16781)
1. Use oldestAllowedVersion as 9 if using ListOffsetsRequest#EARLIEST_LOCAL_TIMESTAMP or ListOffsetsRequest#LATEST_TIERED_TIMESTAMP.
   2. Add test cases to ListOffsetsRequestTest#testListOffsetsRequestOldestVersion to make sure requireTieredStorageTimestamp return 9 as minVersion.
   3. Add EarliestLocalSpec and LatestTierSpec to OffsetSpec.
   4. Add more cases to KafkaAdminClient#getOffsetFromSpec.
   5. Add testListOffsetsEarliestLocalSpecMinVersion and testListOffsetsLatestTierSpecSpecMinVersion to KafkaAdminClientTest to make sure request builder has oldestAllowedVersion as 9.

Signed-off-by: PoAn Yang <payang@apache.org>

Reviewers: Luke Chen <showuon@gmail.com>
2024-08-03 20:17:58 +08:00
TengYao Chi 4e75c57bbb KAFKA-17245: Revert TopicRecord changes. (#16780)
Revert KAFKA-16257 changes because KIP-950 doesn't need it anymore.

Reviewers: Luke Chen <showuon@gmail.com>
2024-08-03 20:17:25 +08:00
TengYao Chi 6b039ce75b KAFKA-16390: add `group.coordinator.rebalance.protocols=classic,consumer` to broker configs when system tests need the new coordinator (#16715)
Fix an issue that cause system test failing when using AsyncKafkaConsumer.
A configuration option, group.coordinator.rebalance.protocols, was introduced to specify the rebalance protocols used by the group coordinator. By default, the rebalance protocol is set to classic. When the new group coordinator is enabled, the rebalance protocols are set to classic,consumer.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, David Jacot <djacot@confluent.io>, Lianet Magrans <lianetmr@gmail.com>, Kirk True <kirk@kirktrue.pro>, Justine Olshan <jolshan@confluent.io>
2024-08-02 16:19:04 -07:00
Sebastien Viale 4afe5f380a KAFKA-16448: Update documentation (#16776)
Updated docs for KIP-1033.

Reviewers: Matthias J. Sax <matthias@confluent.io>
2024-08-02 09:54:51 -07:00
Ken Huang fbb598ce82 KAFKA-16666 Migrate GroupMetadataMessageFormatter` to tools module (#16748)
we need to migate GroupMetadataMessageFormatter from scala code to java code,and make the message format is json pattern

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-08-02 11:54:43 +08:00
Kondrat Bertalan 60e1478fb9
KAFKA-17192 Fix MirrorMaker2 worker config does not pass config.provi… (#16678)
Reviewers: Chris Egerton <chrise@aiven.io>
2024-08-01 16:13:38 -04:00
Alyssa Huang 25f04804cd KAFKA-16521; kafka-metadata-quorum describe command changes for KIP-853 (#16759)
describe --status now includes directory id and endpoint information for voter and observers.
describe --replication now includes directory id.

Reviewers: Colin P. McCabe <cmccabe@apache.org>, José Armando García Sancio <jsancio@apache.org>
2024-08-01 19:30:56 +00:00
Sebastien Viale 578fef2355 KAFKA-16448: Handle processing exceptions in punctuate (#16300)
This PR is part of KIP-1033 which aims to bring a ProcessingExceptionHandler to Kafka Streams in order to deal with exceptions that occur during processing.

This PR actually catches processing exceptions from punctuate.

Co-authored-by: Dabz <d.gasparina@gmail.com>
Co-authored-by: loicgreffier <loic.greffier@michelin.com>

Reviewers: Bruno Cadonna <bruno@confluent.io>, Matthias J. Sax <matthias@confluent.io>
2024-07-31 16:06:39 -07:00
Matthias J. Sax 2c957a6e5c MINOR: simplify code which calles `Punctuator.punctuate()` (#16725)
Reviewers: Bill Bejeck <bill@confluent.io>
2024-07-31 16:06:25 -07:00
Loïc GREFFIER aaed1bdd89 KAFKA-16448: Unify class cast exception handling for both key and value (#16736)
Part of KIP-1033. Minor code cleanup.

Reviewers: Matthias J. Sax <matthias@confluent.io>
2024-07-31 13:23:03 -07:00
Matthias J. Sax ccb04acb56
Revert "KAFKA-16508: Streams custom handler should handle the timeout exceptions (#16450)" (#16738)
This reverts commit 15a4501bde.

We consider this change backward incompatible and will fix forward for 4.0
release via KIP-1065, but need to revert for 3.9 release.

Reviewers: Josep Prat <josep.prat@aiven.io>, Bill Bejeck <bill@confluent.io>
2024-07-31 10:29:02 -07:00
Ken Huang fbdfd0d596 KAFKA-16666 Migrate OffsetMessageFormatter to tools module (#16689)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-07-31 15:19:28 +08:00
Sebastien Viale c8dc09c265 KAFKA-16448: Handle fatal user exception during processing error (#16675)
This PR is part of KIP-1033 which aims to bring a ProcessingExceptionHandler to Kafka Streams in order to deal with exceptions that occur during processing.

This PR catch the exceptions thrown while handling a processing exception

Co-authored-by: Dabz <d.gasparina@gmail.com>
Co-authored-by: loicgreffier <loic.greffier@michelin.com>

Reviewers: Bruno Cadonna <bruno@confluent.io>, Matthias J. Sax <matthias@confluent.io>
2024-07-30 22:57:31 -07:00
Josep Prat 0370a6464b
MINOR: Add text and link to blog in announcement template email (#16734)
Reviewers: Igor Soarez <soarez@apple.com>
2024-07-30 21:50:31 +02:00
Josep Prat 3d2ea547d8
KAFKA-17214: Add 3.8.0 version to core and client system tests (#16726)
Reviewers: Greg Harris <greg.harris@aiven.io>
2024-07-30 19:42:12 +02:00
Josep Prat b8c54c3f38
KAFKA-17214: Add 3.8.0 version to streams system tests (#16728)
* KAFKA-17214: Add 3.8.0 version to streams system tests

Reviewers: Bill Bejeck <bbejeck@gmail.com>
2024-07-30 19:41:36 +02:00
PaulRMellor 0969789973 KAFKA-15469: Add documentation for configuration providers (#16650)
Reviewers: Mickael Maison <mickael.maison@gmail.com>
2024-07-30 15:35:40 +02:00
Josep Prat bc243ab1e8
MINOR: Add 3.8.0 to system tests (#16714)
Reviewers:  Manikumar Reddy <manikumar.reddy@gmail.com>
2024-07-30 09:20:35 +02:00
Matthias J. Sax b8532070f7 HOTFIX: fix compilation error 2024-07-29 21:08:49 -07:00
Sebastien Viale 10d9f7872d KAFKA-16448: Add ErrorHandlerContext in deserialization exception handler (#16432)
This PR is part of KIP1033 which aims to bring a ProcessingExceptionHandler to Kafka Streams in order to deal with exceptions that occur during processing.

This PR expose the new ErrorHandlerContext as a parameter to the Deserialization exception handlers and deprecate the previous handle signature.

Co-authored-by: Dabz <d.gasparina@gmail.com>
Co-authored-by: loicgreffier <loic.greffier@michelin.com>

Reviewers: Bruno Cadonna <bruno@confluent.io>, Matthias J. Sax <matthias@confluent.io>
2024-07-29 20:35:25 -07:00
Sebastien Viale a4ea9aec73 KAFKA-16448: Add ErrorHandlerContext in production exception handler (#16433)
This PR is part of KIP-1033 which aims to bring a ProcessingExceptionHandler to Kafka Streams in order to deal with exceptions that occur during processing.

This PR expose the new ErrorHandlerContext as a parameter to the Production exception handler and deprecate the previous handle signature.

Co-authored-by: Dabz <d.gasparina@gmail.com>
Co-authored-by: loicgreffier <loic.greffier@michelin.com>

Reviewers: Bruno Cadonna <bruno@confluent.io>, Matthias J. Sax <matthias@confluent.io>
2024-07-29 20:35:17 -07:00
Colin P. McCabe f26f0b6626 tests/kafkatest/version.py: Add 3.9.0 as DEV_VERSION 2024-07-29 15:58:04 -07:00
Abhinav Dixit 6ff51bc388
KAFKA-17210: Broker fixes for smooth concurrent fetches on share partition (#16711)
Identified a couple of reliability issues with broker code for share groups 

1. Broker seems to get stuck at times when using multiple share consumers due to a corner case where the second last fetch request did not contain any topic partition to fetch, because of which the broker could never complete the last request. This results in a share fetch request getting stuck.

2. Since persister would not perform any business logic around sending state batches for a share partition, there could be scenarios where it sends state batches with no AVAILABLE records. This could cause a breach on the limit of in-flight messages we have configured, and hence broker would never be able to complete the share fetch requests.

Reviewers:  Andrew Schofield <aschofield@confluent.io>, Apoorv Mittal <apoorvmittal10@gmail.com>,  Manikumar Reddy <manikumar.reddy@gmail.com>
2024-07-30 01:16:58 +05:30
Matthias J. Sax b6c1cb0eec
MINOR: update CachingPersistentWindowStoreTest (#16701)
Refactor test to move off deprecated `transform()` in favor of
`process()`.

Reviewers: Bill Bejeck <bill@confluent.io>
2024-07-29 12:45:13 -07:00
Alyssa Huang 2cf87bff9b
KAFKA-16953; Properly implement the sending of DescribeQuorumResponse (#16637)
This change allows the KRaft leader to send the DescribeQuorumResponse version based on the schema version used by the client.

Reviewers: José Armando García Sancio <jsancio@apache.org>
2024-07-29 14:36:17 -04:00
TengYao Chi b348b556be
KAFKA-17202 surround consumer with try-resource statement (#16702)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-07-30 01:06:11 +08:00
Chris Egerton 61f61d6240
KAFKA-14569: Migrate Connect's integration test EmbeddedKafkaCluster from ZK to KRaft mode (#16599)
Reviewers: Omnia Ibrahim <o.g.h.ibrahim@gmail.com>, Mickael Maison <mickael.maison@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-07-29 10:43:55 -04:00
Kirk True d260b06180
KAFKA-17060 Rename LegacyKafkaConsumer to ClassicKafkaConsumer (#16683)
Reviewers: Andrew Schofield <aschofield@confluent.io>, Lianet Magrans <lianetmr@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-07-29 20:56:23 +08:00
Dmitry Werner 4e69bc09e6
KAFKA-17194 Don't create cluster for MetadataQuorumCommandTest#testCommandConfig (#16682)
Reviewers: TengYao Chi <kitingiao@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-07-29 18:48:00 +08:00
Chia Chuan Yu fdee225a1b
KAFKA-17177 reviewers.py should grep "authors" to offer more candidates of reviewers information (#16674)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-07-29 18:36:35 +08:00
Alyssa Huang da8fe6355b
KAFKA-16915; LeaderChangeMessage supports directory id (#16668)
Extend LeaderChangeMessage schema to support version 1 of the message. The leader will continue to write version 0 of the schema. This is needed so that in the future the leader can write version 1 of the message and be guaranteed that all of the replicas in the cluster support version 1 of the schema.

Reviewers: José Armando García Sancio <jsancio@apache.org>
2024-07-28 11:12:42 -04:00
José Armando García Sancio da32dcab2c
KAKFA-16537; Implement remove voter RPC (#16670)
Implement the RemoveVoter RPC. The general algorithm is as follow:

1. Check that the leader has fenced the previous leader(s) by checking that the HWM is known;
  otherwise return the REQUEST_TIMED_OUT error.
2. Check that the cluster supports kraft.version 1; otherwise return the UNSUPPORTED_VERSION error.
3. Check that there are no uncommitted voter changes; otherwise return the REQUEST_TIMED_OUT error.
4. Append the updated VotersRecord to the log. The KRaft internal listener will read this uncommitted
  record from the log and add the new voter to the set of voters.
5. Wait for the VotersRecord to commit using the majority of the new set of voters. Return a 
  REQUEST_TIMED_OUT error if it doesn't commit in time.
6. Send the RemoveVoter successful response to the client.
7. Resign the leadership if the leader is not in the new voter set

One thing to note is that now that KRaft supports both the remove voter and add voter RPC. Only one
change can be pending at once. This is achieved in the following ways. The AddVoter RPC checks if
there are pending AddVoter or RemoveVoter RPC. The RemoveVoter RPC checks if there are any
pending AddVoter or RemoveVoter RPC. Both RPCs check that there is no uncommitted VotersRecord.

Reviewers: Colin P. McCabe <cmccabe@apache.org>
2024-07-26 16:25:41 -07:00
Justine Olshan a0f6e6f816
KAFKA-16192: Introduce transaction.version and usage of flexible records to coordinators (#16183)
This change includes adding transaction.version (part of KIP-1022)

New transaction version 1 is introduced to support writing flexible fields in transaction state log messages.

Transaction version 2 is created in anticipation for further KIP-890 changes.

Neither are made production ready. Tests for the new transaction version and new MV are created.

Also include change to not report a feature as supported if the range is 0-0.

Reviewers: Jun Rao <junrao@apache.org>, David Jacot <djacot@confluent.io>, Artem Livshits <alivshits@confluent.io>, Colin P. McCabe <cmccabe@apache.org>
2024-07-26 11:38:44 -07:00
TengYao Chi a07294a732
KAFKA-17204: KafkaStreamsCloseOptionsIntegrationTest.before leaks AdminClient (#16692)
To avoid a resource leak, we need to close the AdminClient after the test.

Reviewers: Lianet Magrans <lianetmr@gmail.com>, Matthias J. Sax <matthias@confluent.io>
2024-07-26 10:32:39 -07:00
PaulRMellor 738d8cc91e
MINOR: Update bootstrap.servers doc string (#16655)
Reviewers: Mickael Maison <mickael.maison@gmail.com>
2024-07-26 15:08:01 +02:00