13609 Commits

Author SHA1 Message Date
Mickael Maison c3187d1d7f MINOR: Bump Netty to 4.1.115.Final (#17860)
Reviewers: Josep Prat <josep.prat@aiven.io>
2024-11-19 17:32:56 +01:00
Nick Telford f9cc9d4a55 KAFKA-17954: Error getting oldest-iterator-open-since-ms from JMX (#17713)
The thread that evaluates the gauge for the oldest-iterator-open-since-ms runs concurrently
with threads that open/close iterators (stream threads and interactive query threads). This PR
fixed a race condition between `openIterators.isEmpty()` and `openIterators.first()`, by catching
a potential exception. Because we expect the race condition to be rare, we prefer catching the
exception over introducing a guard via locking.
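
The catch-instead-of-lock approach described above can be sketched as follows. This is a minimal Python illustration, not the Kafka Streams code: `OpenIterators` and `oldest_open_since_ms` are hypothetical names, and Python's `ValueError` stands in for whatever exception the real collection throws when `first()` is called on an empty set.

```python
class OpenIterators:
    """Hypothetical stand-in for the collection of open iterators (by start time)."""

    def __init__(self):
        self._start_times = []

    def open(self, start_ms):
        self._start_times.append(start_ms)

    def close(self, start_ms):
        self._start_times.remove(start_ms)

    def is_empty(self):
        return not self._start_times

    def first(self):
        # min() raises ValueError on an empty list, mimicking first() on an empty set.
        return min(self._start_times)


def oldest_open_since_ms(iterators, default=None):
    """Evaluate the gauge, tolerating the set being emptied between the two calls."""
    if iterators.is_empty():
        return default
    try:
        return iterators.first()
    except ValueError:
        # Another thread closed the last iterator after the emptiness check.
        return default
```

The trade-off is the one the commit message names: the rare failure path costs one caught exception, while the common path pays no locking overhead.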

Reviewers: Matthias J. Sax <matthias@confluent.io>, Anna Sophie Blee-Goldman <ableegoldman@apache.org>
2024-11-18 17:50:17 -08:00
Ken Huang cbb8cfe520
KAFKA-18029 Fix quorum_reconfiguration_test.py (#17839)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-11-18 00:47:34 +08:00
Matthias J. Sax 2127ae6329 KAFKA-17994 Checked exceptions are not handled (#17817)
Reviewers: Bill Bejeck <bill@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
2024-11-15 20:47:12 +08:00
PoAn Yang 726d0d604d KAFKA-17995: Fix errors in remote segment cleanup when retention.ms is large (#17794)
If a user has configured `retention.ms` to a value greater than the current Unix epoch timestamp, cleanup of a remote log segment fails with an error. This change fixes the bug by handling large `retention.ms` values correctly.
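
A rough sketch of how such a case might be guarded, in Python. The commit message does not spell out the failure mechanism, so this assumes the problem is a negative cutoff timestamp when `retention.ms` exceeds the current time; `is_segment_expired` and its parameters are hypothetical names, not the RemoteLogManager API.

```python
def is_segment_expired(segment_max_timestamp_ms, retention_ms, now_ms):
    """Decide whether a segment is past time-based retention.

    Hypothetical helper: assumes the bug came from `now_ms - retention_ms`
    going negative when retention.ms is larger than the current epoch time.
    """
    if retention_ms < 0:
        # retention.ms = -1 means no time limit in Kafka.
        return False
    cutoff_ms = now_ms - retention_ms
    if cutoff_ms < 0:
        # Retention reaches back before the epoch: no segment can be old enough.
        return False
    return segment_max_timestamp_ms < cutoff_ms
```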

Reviewers: Divij Vaidya <diviv@amazon.com>
2024-11-14 11:34:38 +01:00
Bill Bejeck a6c3576be3 Update javadoc on split to mention first matching (#17799)
Clarify the functionality of split matching on first predicate
Reviewers: Matthias Sax <mjsax@apache.org>
2024-11-13 11:49:08 -05:00
Kamal Chandraprakash 6ea2ed3db0
KAFKA-17801: RemoteLogManager may compute inaccurate upperBoundOffset for aborted txns (#17676) (#17733)
Reviewers: Jun Rao <junrao@gmail.com>
2024-11-10 22:54:26 +05:30
Matthias J. Sax ad3e34d33b HOTFIX: fix broken commit
Fixes broken commit for KAFKA-17872
2024-11-09 17:46:05 -08:00
Matthias J. Sax 587c3cfec5 KAFKA-17872: Update consumed offsets on records with invalid timestamp (#17710)
TimestampExtractor allows dropping records by returning a timestamp of -1. In this case, we still need to update consumed offsets so that we can commit progress.
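
The idea can be illustrated with a small Python sketch. It is not the Kafka Streams implementation: `process_batch`, `DROP_TIMESTAMP`, and the record shape are assumptions made for the example. The point is only that the consumed offset advances before the drop decision.

```python
DROP_TIMESTAMP = -1  # value the commit message says signals "drop this record"


def process_batch(records, extract_timestamp):
    """Process (offset, payload) pairs; return (kept_records, last_consumed_offset)."""
    kept = []
    consumed_offset = None
    for offset, payload in records:
        # Advance first, so a dropped record still counts as progress to commit.
        consumed_offset = offset
        timestamp = extract_timestamp(payload)
        if timestamp == DROP_TIMESTAMP:
            continue
        kept.append((offset, timestamp, payload))
    return kept, consumed_offset
```

If the offset were updated only for kept records, a partition whose trailing records are all dropped would never show progress past the last kept record.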

Reviewers: Bill Bejeck <bill@confluent.io>
2024-11-09 17:28:36 -08:00
Yung 014fdfed44
KAFKA-17969 Fix StorageToolTest for 3.9 (#17723)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-11-09 13:53:45 +08:00
Josep Prat 17bd58a313
MINOR: fix docs upgrade instructions for 3.9 and 3.8 (#17447)
* MINOR: Fix upgrade instructions for 3.8 and 3.9

Signed-off-by: Josep Prat <josep.prat@aiven.io>

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Colin McCabe <colin@cmccabe.xyz>
2024-11-07 08:56:30 +01:00
Luke Chen 0129877ede MINOR: Improve the KIP-853 documentation (#17598)
In docs/ops.html, add a section discussing the difference between static and dynamic quorums. This section also discusses how to find out which quorum type you have. Also discuss the current limitations, such as the inability to transition from static quorums to dynamic.

Add a brief section to docs/upgrade.html discussing controller membership change.

Co-authored-by: Federico Valeri <fedevaleri@gmail.com>, Colin P. McCabe <cmccabe@apache.org>
Reviewers: Justine Olshan <jolshan@confluent.io>
2024-11-07 14:37:43 +08:00
Bill Bejeck 0eae7507d5 KAFKA-17635: Ensure only committed offsets are returned for purging (#17686)
Kafka Streams actively purges records from repartition topics. Prior to this PR, Kafka Streams would retrieve the offset from the consumedOffsets map, but there are a couple of edge cases where consumedOffsets can get ahead of the commitedOffsets map. In these cases, Kafka Streams could purge a repartition record before it is committed.

Updated the current StreamTask test to cover this case
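
A minimal Python sketch of the distinction, using hypothetical names and plain dicts rather than the StreamTask data structures: purge offsets are read only from the committed map, so a consumed offset that has run ahead never reaches the purge request.

```python
def offsets_safe_to_purge(committed_offsets, repartition_partitions):
    """Return purge offsets taken only from the committed map.

    Both arguments are hypothetical: a dict of partition -> committed offset,
    and the set of partitions that belong to repartition topics.
    """
    return {
        partition: offset
        for partition, offset in committed_offsets.items()
        if partition in repartition_partitions
    }


# Between commits, consumed offsets can be ahead of committed ones.
consumed = {"repartition-0": 42, "input-0": 7}
committed = {"repartition-0": 40, "input-0": 7}
purge = offsets_safe_to_purge(committed, {"repartition-0"})
```

Here records 40 and 41 have been consumed but not committed, so they stay in the repartition topic until the next commit.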

Reviewers: Matthias Sax <mjsax@apache.org>
2024-11-06 17:50:06 -05:00
Colin P. McCabe c21ad10bce MINOR: Update 3.9 branch version to 3.9.1-SNAPSHOT 2024-11-06 13:17:58 -08:00
Colin P. McCabe cc53a632ed Bump version to 3.9.0 2024-11-06 13:15:24 -08:00
ShivsundarR 4a562cddcb
Removed Set.of usage (#17683)
Reviewers: Federico Valeri <fedevaleri@gmail.com>, Lianet Magrans <lmagrans@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
2024-11-04 20:04:39 +01:00
Jonah Hooper bcb5d167fd [KAFKA-17870] Fail CreateTopicsRequest if total number of partitions exceeds 10k (#17604)
We fail the entire CreateTopicsRequest action if more than 10k total partitions are being
created across the topics in a single request. The usual pattern for this API is to try to
succeed with some topics. Since the 10k limit applies to all topics in the request, no topic
should be created if together they exceed it.
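
The all-or-nothing check can be sketched in Python as below. `validate_create_topics`, the constant name, and the error string are hypothetical; only the 10k threshold and the fail-everything behaviour come from the commit message.

```python
MAX_PARTITIONS_PER_REQUEST = 10_000  # the "10k" limit from the commit message


def validate_create_topics(requested):
    """Map each topic to None (accepted) or an error string (rejected).

    `requested` is a dict of topic name -> number of partitions to create.
    """
    total = sum(requested.values())
    if total > MAX_PARTITIONS_PER_REQUEST:
        # The limit applies to the request as a whole, so every topic fails.
        error = f"request creates {total} partitions, limit is {MAX_PARTITIONS_PER_REQUEST}"
        return {topic: error for topic in requested}
    return {topic: None for topic in requested}
```

Rejecting every topic, rather than the subset that tips the total over the limit, keeps the outcome independent of the order in which topics appear in the request.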

Reviewers: Colin P. McCabe <cmccabe@apache.org>
2024-10-31 13:55:49 -07:00
Colin Patrick McCabe 398b4c4fa1 KAFKA-17868: Do not ignore --feature flag in kafka-storage.sh (#17597)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Justine Olshan <jolshan@confluent.io>
2024-10-25 17:08:51 -07:00
Colin Patrick McCabe c821449fb7 KAFKA-17794: Add some formatting safeguards for KIP-853 (#17504)
KIP-853 adds support for dynamic KRaft quorums. This means that the quorum topology is
no longer statically determined by the controller.quorum.voters configuration. Instead, it
is contained in the storage directories of each controller and broker.

Users of dynamic quorums must format at least one controller storage directory with either
the --initial-controllers or --standalone flags.  If they fail to do this, no quorum can be
established. This PR changes the storage tool to warn about the case where a KIP-853 flag has
not been supplied to format a KIP-853 controller. (Note that broker storage directories
can continue to be formatted without a KIP-853 flag.)

There are cases where we don't want to specify initial voters when formatting a controller. One
example is where we format a single controller with --standalone, and then dynamically add 4
more controllers with no initial topology. In this case, we want the 4 later controllers to grab
the quorum topology from the initial one. To support this case, this PR adds the
--no-initial-controllers flag.

Reviewers: José Armando García Sancio <jsancio@apache.org>, Federico Valeri <fvaleri@redhat.com>
2024-10-21 10:41:26 -07:00
Federico Valeri 7842e25d32 KAFKA-17031: Make RLM thread pool configurations public and fix default handling (#17499)
According to KIP-950, remote.log.manager.thread.pool.size should be marked as deprecated and replaced by two new configurations: remote.log.manager.copier.thread.pool.size and remote.log.manager.expiration.thread.pool.size. Fix default handling so that -1 works as expected.

Reviewers: Luke Chen <showuon@gmail.com>, Gaurav Narula <gaurav_narula2@apple.com>, Satish Duggana <satishd@apache.org>, Colin P. McCabe <cmccabe@apache.org>
2024-10-21 10:39:53 -07:00
Josep Prat de9a7199df KAFKA-17810 upgrade Jetty because of CVE-2024-8184 (#17517)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-10-21 02:51:37 +08:00
Colin Patrick McCabe abd4bf08ab
KAFKA-17790: Document that control.plane.listener should be removed before ZK migration is finished (#17501)
Reviewers: Luke Chen <showuon@gmail.com>
2024-10-15 14:36:16 -07:00
Colin P. McCabe 796ce2121b KAFKA-17788: During ZK migration, always include control.plane.listener.name in advertisedBrokerListeners
During ZK migration, always include control.plane.listener.name in advertisedBrokerListeners, to be
bug-compatible with earlier Apache Kafka versions that ignored this misconfiguration. (Just as
before, control.plane.listener.name is not supported in KRaft mode itself.)

Reviewers: Luke Chen <showuon@gmail.com>
2024-10-15 14:34:42 -07:00
Ken Huang 51253e2bf4
KAFKA-17520 align the low bound of ducktape version (#17481)
Reviewers: Colin Patrick McCabe <cmccabe@apache.org>, Chia-Ping Tsai <chia7712@gmail.com>
2024-10-15 00:15:59 +08:00
David Arthur 8c3c6c3841
KAFKA-17193: Pin all external GitHub Actions to the specific git hash (#16960) (#17461)
Co-authored-by: Mickael Maison <mimaison@users.noreply.github.com>

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Colin P. McCabe <cmccabe@apache.org>
2024-10-10 13:20:49 -07:00
Mickael Maison 44f15cc22c KAFKA-17749: Fix Throttler metrics name
Reviewers: Colin P. McCabe <cmccabe@apache.org>
2024-10-10 09:20:14 -07:00
PoAn Yang 4878174b77 KAFKA-16972 Move BrokerTopicMetrics to org.apache.kafka.storage.log.metrics (#16387)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-10-10 09:15:21 -07:00
Apoorv Mittal db4c80a455 KAFKA-17731: Removed timed waiting signal for client telemetry close (#17431)
Reviewers: Andrew Schofield <aschofield@confluent.io>, Kirk True <ktrue@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>, Lianet Magrans <lmagrans@confluent.io>
2024-10-10 08:55:43 -07:00
Colin Patrick McCabe bf95a3239c
KAFKA-17753: Update protobuf and commons-io dependencies (#17436)
Reviewers: Josep Prat <jlprat@apache.org>
2024-10-09 16:34:26 -07:00
Gaurav Narula ab6dafaab6 KAFKA-17751; fix pollTimeout calculation in pollFollowerAsVoter (#17434)
KAFKA-16534 introduced a change to send UpdateVoterRequest every "3 * fetchTimeoutMs" if the voter's configured endpoints are different from the endpoints persisted in the KRaft log. It also introduced a regression where, if the voter nodes did not need an update, updateVoterTimer wasn't reset. This resulted in a busy-loop in the KafkaRaftClient#poll method, causing high CPU usage.

This PR modifies the conditions in pollFollowerAsVoter to reset updateVoterTimer appropriately.
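
The shape of the bug and fix can be sketched in Python. This is an illustration under assumptions, not the KafkaRaftClient code: the function, the `state` dict, and its keys are hypothetical, and the exact conditions changed by the PR are not given in the commit message.

```python
def poll_follower_as_voter(now_ms, state, endpoints_differ):
    """Return how long the caller may wait before polling again.

    `state` is a hypothetical dict with 'fetch_timeout_ms',
    'update_voter_deadline_ms', and 'update_requests_sent'.
    """
    if now_ms >= state["update_voter_deadline_ms"]:
        if endpoints_differ:
            # Stands in for sending an UpdateVoterRequest.
            state["update_requests_sent"] += 1
        # Reset the timer whether or not a request was needed. Leaving it
        # expired would make every later call return 0 and spin the poll loop.
        state["update_voter_deadline_ms"] = now_ms + 3 * state["fetch_timeout_ms"]
    return state["update_voter_deadline_ms"] - now_ms
```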

Reviewers: José Armando García Sancio <jsancio@apache.org>
2024-10-09 18:13:10 -04:00
Colin Patrick McCabe 8af063a165
KAFKA-17735: release.py must not use home.apache.org (#17421)
Previously, Apache Kafka was uploading release candidate (RC) artifacts
to users' home directories on home.apache.org. However, since this
resource has been decommissioned, we need to follow the standard
approach of putting release candidate artifacts into the appropriate
subversion directory, at https://dist.apache.org/repos/dist/dev/kafka/.

Reviewers: Justine Olshan <jolshan@confluent.io>
2024-10-08 15:40:27 -07:00
Colin Patrick McCabe 0a70c3a61e
KAFKA-17714 Fix StorageToolTest.scala to compile under Scala 2.12 (#17400)
Reviewers: David Arthur <mumrah@gmail.com>, Justine Olshan <jolshan@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
2024-10-08 11:00:18 +08:00
David Arthur 1d54a7373c KAFKA-17146 Include note to remove migration znode (#16770)
When reverting the ZK migration, we must also remove the /migration ZNode in order to allow the migration to be re-attempted in the future.

Reviewers: Colin P. McCabe <cmccabe@apache.org>, Chia-Ping Tsai <chia7712@gmail.com>
2024-10-07 10:34:22 -07:00
José Armando García Sancio 550bf60460 KAFKA-16927; Handle expanding leader endpoints (#17363)
When a replica restarts in the follower state it is possible for the set of leader endpoints to not match the latest set of leader endpoints. Voters will discover the latest set of leader endpoints through the BEGIN_QUORUM_EPOCH request. This means that KRaft needs to allow for the replica to transition from Follower to Follower when only the set of leader endpoints has changed.

Reviewers: Colin P. McCabe <cmccabe@apache.org>, Alyssa Huang <ahuang@confluent.io>
2024-10-04 14:53:17 +00:00
Alyssa Huang 5c95a5da31 MINOR: Fix kafkatest advertised listeners (#17294)
Followup for #17146

Reviewers: Bill Bejeck <bbejeck@apache.org>
2024-10-01 17:21:28 +00:00
Bill Bejeck edd77c1e25 MINOR: Need to split the controller bootstrap servers on ',' in list comprehension (#17183)
Kafka Streams system tests were failing with this error:

Failed to parse host name from entry 3001@d for the configuration controller.quorum.voters.  Each entry should be in the form `{id}@{host}:{port}`.

The cause is that in kafka.py line 876, we create a delimited string from a list comprehension, but the input is itself a string, so each character is appended instead of each host:port bootstrap server entry. To fix this, this PR adds split(',') to controller_quorum_bootstrap_servers. Note that this only applies when dynamicRaftQuorum=False.
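
The pitfall is easy to reproduce in isolation. The snippet below is a reconstruction for illustration only; the function names and the `first_id` parameter are hypothetical and the real code in kafka.py may be shaped differently.

```python
def build_voters_buggy(bootstrap_servers, first_id=3001):
    # Iterating a str yields single characters, not host:port entries.
    return ",".join(
        f"{first_id + i}@{entry}" for i, entry in enumerate(bootstrap_servers)
    )


def build_voters_fixed(bootstrap_servers, first_id=3001):
    # Split on ',' first so each element is one host:port entry.
    return ",".join(
        f"{first_id + i}@{entry}"
        for i, entry in enumerate(bootstrap_servers.split(","))
    )
```

With an input such as `"ducker05:9093,ducker06:9093"`, the buggy version starts with `3001@d`, which matches the malformed entry reported in the error message above.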

Reviewers: Alyssa Huang <ahuang@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
2024-10-01 17:07:26 +00:00
David Arthur 2cbc5bd3ca KAFKA-17636 Fix missing SCRAM bootstrap records (#17305)
Fixes a regression introduced by #16669 which inadvertently stopped processing SCRAM arguments from kafka-storage.sh

Reviewers: Colin P. McCabe <cmccabe@apache.org>, Federico Valeri <fedevaleri@gmail.com>
2024-09-28 10:04:29 -04:00
Alyssa Huang 89cb632acd KAFKA-17608, KAFKA-17604, KAFKA-16963; KRaft controller crashes when active controller is removed (#17146)
This change fixes a few issues.

KAFKA-17608; KRaft controller crashes when active controller is removed
When a control batch is committed, the quorum controller currently increases the last stable offset but fails to create a snapshot for that offset. This causes an issue if the quorum controller renounces and needs to revert to that offset (which has no snapshot present). Since the control batches are no-ops for the quorum controller, it does not need to update its offsets for control records. We skip handle commit logic for control batches.

KAFKA-17604; Describe quorum output missing added voters endpoints
Describe quorum output will miss endpoints of voters which were added via AddRaftVoter. This is due to a bug in LeaderState's updateVoterAndObserverStates which will pull replica state from observer states map (which does not include endpoints). The fix is to populate endpoints from the lastVoterSet passed into the method.

Reviewers: José Armando García Sancio <jsancio@apache.org>, Colin P. McCabe <cmccabe@apache.org>, Chia-Ping Tsai <chia7712@apache.org>
2024-09-26 18:04:05 +00:00
Alyssa Huang c2c2dd424b KAFKA-16963: Ducktape test for KIP-853 (#17081)
Add a ducktape system test for KIP-853 quorum reconfiguration, including adding and removing voters.

Reviewers: Colin P. McCabe <cmccabe@apache.org>
2024-09-26 18:03:56 +00:00
Colin Patrick McCabe 57b098c397 KAFKA-17584: Fix incorrect synonym handling for dynamic log configurations (#17258)
Several Kafka log configurations have synonyms. For example, log retention can be configured
either by log.retention.ms, or by log.retention.minutes, or by log.retention.hours. There is also
a facility in Kafka to dynamically change broker configurations without restarting the broker. These
dynamically set configurations are stored in the metadata log and override what is in the broker
properties file.

Unfortunately, these two features interacted poorly; there was a bug where the dynamic log
configuration update code ignored synonyms. For example, if you set log.retention.minutes and then
reconfigured something unrelated that triggered the LogConfig update path, the retention value that
you had configured was overwritten.

The reason for this was incorrect handling of synonyms. The code tried to treat the Kafka broker
configuration as a bag of key/value entities rather than extracting the correct retention time (or
other setting with overrides) from the KafkaConfig object.
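
To make the synonym problem concrete, here is a small Python sketch. It is not the KafkaConfig or LogConfig code; the function names are hypothetical. It relies on the documented precedence of the retention settings (log.retention.ms over log.retention.minutes over log.retention.hours) and the documented default of 168 hours.

```python
DEFAULT_RETENTION_HOURS = 168  # documented default for log.retention.hours


def effective_retention_ms(broker_props):
    """Resolve retention by checking synonyms from most to least specific."""
    if "log.retention.ms" in broker_props:
        return int(broker_props["log.retention.ms"])
    if "log.retention.minutes" in broker_props:
        return int(broker_props["log.retention.minutes"]) * 60 * 1000
    hours = int(broker_props.get("log.retention.hours", DEFAULT_RETENTION_HOURS))
    return hours * 60 * 60 * 1000


def naive_retention_ms(broker_props):
    """The buggy pattern: treat the config as a bag of keys and ignore synonyms."""
    default_ms = DEFAULT_RETENTION_HOURS * 60 * 60 * 1000
    return int(broker_props.get("log.retention.ms", default_ms))
```

For a broker configured with only `log.retention.minutes=30`, the naive lookup silently falls back to the seven-day default, which is the kind of overwrite the commit message describes.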

Reviewers: Luke Chen <showuon@gmail.com>, Jun Rao <junrao@gmail.com>, Kamal Chandraprakash <kamal.chandraprakash@gmail.com>, Christo Lolov <lolovc@amazon.com>, Federico Valeri <fedevaleri@gmail.com>, Rajini Sivaram <rajinisivaram@googlemail.com>, amangandhi94 <>
2024-09-26 14:20:33 +08:00
José Armando García Sancio e36c82d71c MINOR: Replace gt and lt char with html encoding (#17235)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-09-24 17:07:16 +00:00
Ken Huang 333483a16e MINOR: add a space for kafka.metrics.polling.interval.secs description (#17256)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-09-24 21:52:16 +08:00
TengYao Chi 7d14cd6b33 KAFKA-17459 Stabilize reassign_partitions_test.py (#17250)
This test expects that each partition can receive the record, so using a non-null key helps distribute the records more randomly.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-09-24 17:37:03 +08:00
Jakub Scholz 83091994a6 KAFKA-17543: Improve and clarify the error message about generated broker IDs in migration (#17210)
This PR improves the error message shown when broker.id is set to -1 and ZK migration is enabled. It is
not necessary to disable the broker.id.generation.enable option. It is sufficient to just not use it (by
not setting broker.id to -1).

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Luke Chen <showuon@gmail.com>
2024-09-18 11:47:01 -07:00
José Armando García Sancio c141acb6bf KAFKA-17048; Update docs for KIP-853 (#17076)
Change the configurations under config/kraft to use controller.quorum.bootstrap.servers instead of controller.quorum.voters. Add comments explaining how to use the older static quorum configuration where appropriate.

In docs/ops.html, remove the reference to "tentative timelines for ZooKeeper removal" and "Tiered storage is considered as an early access feature" since they are no longer up-to-date. Add KIP-853 information.

In docs/quickstart.html, move the ZK instructions to be after the KRaft instructions. Update the KRaft instructions to use KIP-853.

In docs/security.html, add an explanation of --bootstrap-controller and document controller.quorum.bootstrap.servers instead of controller.quorum.voters.

Reviewers: Mickael Maison <mickael.maison@gmail.com>, Alyssa Huang <ahuang@confluent.io>, Colin P. McCabe <cmccabe@apache.org>
2024-09-18 11:34:12 -07:00
Colin Patrick McCabe 389a8d8dec
Revert "KAFKA-16803: Change fork, update ShadowJavaPlugin to 8.1.7 (#16295)" (#17218)
This reverts commit 391778b8d7.

Unfortunately that commit re-introduced bug #15127 which prevented the publishing of kafka-clients
artifacts to remote maven. As that bug says:

    The issue triggers only with publishMavenJavaPublicationToMavenRepository due to signing.
    Generating signed asc files error out for shadowed release artifacts as the module name
    (clients) differs from the artifact name (kafka-clients).

    The fix is basically to explicitly define artifact of shadowJar to signing and publish plugin.
    project.shadow.component(mavenJava) previously outputs the name as client-<version>-all.jar
    though the classifier and archivesBaseName are already defined correctly in :clients and
    shadowJar construction.

Reviewers: David Arthur <mumrah@gmail.com>
2024-09-17 12:05:25 -07:00
Colin Patrick McCabe f324ef461f
MINOR: update documentation link to 3.9 (#17216)
Reviewers: David Arthur <mumrah@gmail.com>
2024-09-17 07:36:08 -07:00
Colin Patrick McCabe a1a4389c35 KAFKA-17543: Enforce that broker.id.generation.enable is not used when migrating to KRaft (#17192)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, David Arthur <mumrah@gmail.com>
2024-09-13 17:25:26 -07:00
Matthias J. Sax b43439482d KAFKA-17527: Fix NPE for null RecordContext (#17169)
Reviewers: Bruno Cadonna <bruno@confluent.io>
2024-09-13 16:32:54 -07:00
Colin Patrick McCabe 7d3ba8a0eb KAFKA-16468: verify that migrating brokers provide their inter.broker.listener (#17159)
When brokers undergoing ZK migration register with the controller, it should verify that they have
provided a way to contact them via their inter.broker.listener. Otherwise the migration will fail
later on with a more confusing error message.

Reviewers: David Arthur <mumrah@gmail.com>
2024-09-13 09:18:43 -07:00