Commit Graph

12337 Commits

Author SHA1 Message Date
Matthias J. Sax 79a8f2b5f4 Bump version to 3.7.2 2024-12-04 10:50:28 -08:00
Mickael Maison 36f5fe83a0 MINOR: Bump Netty to 4.1.115.Final (#17860)
Reviewers: Josep Prat <josep.prat@aiven.io>
2024-12-04 10:28:54 -08:00
Manikumar Reddy 4189beb22a MINOR: Fix error in installing docker-compose on docker-builds workflows 2024-12-04 23:46:44 +05:30
Vedarth Sharma 8b1d9e2138 MINOR: Install docker-compose on docker-build workflows (#18037)
Docker tests rely on docker compose. In recent runs it has been observed that github actions does not provide support for docker compose, so we are installing it explicitly in the workflow.
2024-12-04 22:05:14 +05:30
Matthias J. Sax 70ad6645fc MINOR: update license file 2024-11-25 20:22:47 -08:00
Matthias J. Sax 4f52da1445 MINOR: update upgrade.html for 3.7.2 release 2024-11-25 19:27:35 -08:00
Matthias J. Sax 96d64af3d8 KAFKA-17299: add unit tests for previous fix (#17919)
https://github.com/apache/kafka/pull/17899 fixed the issue, but did not
add any unit tests.

Reviewers: Bill Bejeck <bill@confluent.io>
2024-11-25 12:57:50 -08:00
Laxman Ch f416c01e37 KAFKA-17299: Fix Kafka Streams consumer hang issue (#17899)
When Kafka Streams skips over corrupted messages, it might not resume previously paused partitions
if more than one record is skipped at once and the buffer drops below the max-buffer limit at the same time.

Reviewers: Matthias J. Sax <matthias@confluent.io>
2024-11-25 12:47:25 -08:00
Christo Lolov 1c4b02a6fb
KAFKA-17584: Fix incorrect synonym handling for dynamic log configurations
This is a cherry-pick of #17258 to 3.7.2

This commit differs from the original by using the old (read 3.7) references to the configurations and not changing as many unit tests

Reviewers: Divij Vaidya <diviv@amazon.com>, Colin Patrick McCabe <cmccabe@apache.org>
2024-11-25 10:07:29 -08:00
Bill Bejeck 81ab70a66c Update streams docs with alive stream threads (#17868)
Add alive-stream-threads to Kafka Streams client metrics table
Reviewers: Matthias Sax <mjsax@apache.org>
2024-11-21 11:37:44 -05:00
Bill Bejeck 1d975923d8 Update javadoc on split to mention first matching (#17799)
Clarify the functionality of split matching on first predicate
Reviewers: Matthias Sax <mjsax@apache.org>
2024-11-13 11:58:06 -05:00
Bill Bejeck 9b29f289a8 Backport fix from 3.9 (#17716)
This is a backport of #17686 merged to trunk and cherry-picked to 3.9. Need to do a standalone PR due to merge conflicts.
Reviewers: Matthias Sax <mjsax@apache.org>
2024-11-12 16:08:38 -05:00
Matthias J. Sax 35c17ac0ae KAFKA-17872: Update consumed offsets on records with invalid timestamp (#17710)
TimestampExtractor allows records to be dropped by returning a timestamp of -1. In this case, we still need to update consumed offsets so that we can commit progress.
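
The behavior described above can be sketched roughly as follows. This is a simplified model, not the Kafka Streams code: the class and method names are hypothetical, and only the convention that a negative timestamp drops the record comes from the commit message.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a per-partition record queue; not Kafka Streams code.
final class ConsumedOffsetSketch {
    private long lastConsumedOffset = -1L;
    private final List<Long> bufferedOffsets = new ArrayList<>();

    // `timestamp` stands for the value an (assumed) extractor returned for this record.
    void onRecord(long offset, long timestamp) {
        // Track progress first, so that dropped records still count as consumed.
        lastConsumedOffset = offset;
        if (timestamp < 0) {
            return; // the record is dropped and never buffered
        }
        bufferedOffsets.add(offset);
    }

    long lastConsumedOffset() {
        return lastConsumedOffset;
    }

    int bufferedCount() {
        return bufferedOffsets.size();
    }
}
```

Streams has its own bookkeeping for deciding what is safe to commit; the sketch only shows that a dropped record still advances the consumed position.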

Reviewers: Bill Bejeck <bill@confluent.io>
2024-11-09 23:35:09 -08:00
Rohan fb2d647116 KAFKA-16955: fix synchronization of streams threadState (#16337)
Each KafkaStreams instance maintains a map from threadId to state,
which is used to aggregate to a KafkaStreams app state. The map is updated
on every state change, and when a new thread is created. State change
updates are done in synchronized blocks; however, the update that
happens on thread creation is not, which can raise
ConcurrentModificationException. This patch moves this update
into the listener object and protects it using the object's lock.
It also moves ownership of the state map into the listener so that
it's less likely that future changes access it without locking.
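
A minimal sketch of the locking pattern described above, assuming a listener that owns the map and guards every access with its own monitor. The class, enum, and method names are hypothetical and do not mirror the actual KafkaStreams internals.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical listener that owns the thread-state map; not the KafkaStreams class.
final class ThreadStateListenerSketch {
    enum State { CREATED, RUNNING, DEAD }

    // The map is private to the listener, so callers cannot touch it without the lock.
    private final Map<Long, State> threadStates = new HashMap<>();

    // Thread creation goes through the same lock as state changes.
    synchronized void onThreadCreated(long threadId) {
        threadStates.put(threadId, State.CREATED);
    }

    synchronized void onStateChange(long threadId, State newState) {
        threadStates.put(threadId, newState);
    }

    // Aggregation iterates the map while holding the lock, so a concurrent
    // put cannot cause a ConcurrentModificationException here.
    synchronized boolean allRunning() {
        if (threadStates.isEmpty()) {
            return false;
        }
        for (State state : threadStates.values()) {
            if (state != State.RUNNING) {
                return false;
            }
        }
        return true;
    }
}
```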

Reviewers: Matthias J. Sax <matthias@confluent.io>
2024-10-25 14:15:36 -07:00
Josep Prat 7fdfeb120f KAFKA-17810 upgrade Jetty because of CVE-2024-8184 (#17517)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-10-17 14:33:04 +08:00
Colin Patrick McCabe 3b64e2fea6 KAFKA-17790: Document that control.plane.listener should be removed before ZK migration is finished (#17501)
Reviewers: Luke Chen <showuon@gmail.com>
2024-10-15 14:37:19 -07:00
Ken Huang 50e310b435
KAFKA-17520 Align ducktape version in tests/docker/Dockerfile and tests/setup.py (#17485)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-10-14 01:25:56 +08:00
TengYao Chi 9bdf1fa31f
KAFKA-17768 Update protobuf and commons-io dependencies in 3.7.2 (#17477)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-10-13 00:50:40 +08:00
Andrew Schofield a2e133df5a KAFKA-16759: Handle telemetry push response while terminating (#15957)
When client telemetry is configured in a cluster, Kafka producers and consumers push metrics to the brokers periodically. There is a special push of metrics that occurs when the client is terminating. A state machine in the client telemetry reporter controls its behaviour in different states.

Sometimes, when a client was terminating, it attempted an invalid state transition from TERMINATING_PUSH_IN_PROGRESS to PUSH_NEEDED when it received a response to a PushTelemetry RPC. This was essentially harmless because the state transition did not occur, but it did cause unsightly log lines to be generated. This PR performs a check for the terminating states when receiving the response and simply remains in the current state.

I added a test to validate the state management in this case. Actually, the test passes before the code change in the PR, but with unsightly log lines.
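
The check can be illustrated with a small state machine. This is only a sketch: the two state names mentioned above come from the commit message, while the remaining states, the class, and the method names are assumptions made for the example.

```java
// Hypothetical, simplified state machine; not the client telemetry reporter itself.
final class TelemetryStateSketch {
    enum State { PUSH_NEEDED, PUSH_IN_PROGRESS, TERMINATING_PUSH_IN_PROGRESS, TERMINATED }

    static boolean isTerminating(State state) {
        return state == State.TERMINATING_PUSH_IN_PROGRESS || state == State.TERMINATED;
    }

    // Decide the next state when a PushTelemetry response arrives.
    static State onPushResponse(State current) {
        if (isTerminating(current)) {
            // Stay put instead of attempting an invalid transition to PUSH_NEEDED.
            return current;
        }
        return State.PUSH_NEEDED;
    }
}
```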


Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Apoorv Mittal <amittal@confluent.io>
2024-10-10 10:37:47 -04:00
Apoorv Mittal f4b46a7cec KAFKA-17731: Removed timed waiting signal for client telemetry close (#17431)
Reviewers: Andrew Schofield <aschofield@confluent.io>, Kirk True <ktrue@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>, Lianet Magrans <lmagrans@confluent.io>
2024-10-10 10:16:56 -04:00
Manikumar Reddy 17e3b853fd
MINOR: Update version.py (#17186)
Reviewers: Lianet Magrans <lmagrans@confluent.io>
2024-09-13 12:34:15 +02:00
David Arthur a1766e02b6 KAFKA-17506 KRaftMigrationDriver initialization race (#17147)
There is a race condition between KRaftMigrationDriver running its first poll() and being notified by Raft about a leader change. If onControllerChange is called before RecoverMigrationStateFromZKEvent is run, we will end up getting stuck in the INACTIVE state.

This patch fixes the race by enqueuing a RecoverMigrationStateFromZKEvent from onControllerChange if the driver has not yet initialized. If another RecoverMigrationStateFromZKEvent was already enqueued, the second one to run will just be ignored.
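
A rough single-threaded model of the fix, assuming an event queue that is drained in order. All names here are hypothetical; the real driver is asynchronous and considerably more involved.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical event-queue model; not KRaftMigrationDriver.
final class MigrationDriverSketch {
    private final Queue<Runnable> queue = new ArrayDeque<>();
    private boolean initialized = false;
    private int recoveries = 0;
    private int leaderChangesHandled = 0;

    // The first poll always schedules a recovery event.
    void firstPoll() {
        queue.add(this::recover);
    }

    // A leader change may arrive before the first poll has run.
    void onControllerChange() {
        if (!initialized) {
            queue.add(this::recover); // make sure recovery runs before the change
        }
        queue.add(() -> {
            if (initialized) {
                leaderChangesHandled++;
            }
        });
    }

    // Whichever recovery event runs second finds the driver initialized and does nothing.
    private void recover() {
        if (initialized) {
            return;
        }
        initialized = true;
        recoveries++;
    }

    void drain() {
        Runnable event;
        while ((event = queue.poll()) != null) {
            event.run();
        }
    }

    int recoveries() { return recoveries; }
    int leaderChangesHandled() { return leaderChangesHandled; }
}
```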

Reviewers: Luke Chen <showuon@gmail.com>
2024-09-11 11:11:40 -04:00
Vikas Singh 0f31035c9c MINOR: Few cleanups
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
2024-09-11 15:03:41 +05:30
PoAn Yang 364ddf2d8a
KAFKA-17417 Backport KAFKA-15751 and KAFKA-15752 to 3.8 and Enable KRaft for BaseAdminIntegrationTest and SaslSslAdminIntegrationTest (#15175) (#17102)
(cherry picked from commit 166d9e8)

KAFKA-15751 and KAFKA-15752 enable KRaft mode in SaslSslAdminIntegrationTest, which is useful for writing security-related integration tests (ITs). Without this backport, security fixes could be blocked from being backported because of their ITs (KAFKA-17315, for example).

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-09-09 14:12:09 +08:00
Omnia Ibrahim c5b524f0d7 MINOR: Remove unwanted debug line in LogDirFailureTest (#15371)
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Justine Olshan <jolshan@confluent.io>, Igor Soarez <soarez@apple.com>
2024-09-09 13:57:28 +08:00
Omnia Ibrahim 6f76ed9a54 KAFKA-16225 [1/N]: Set metadata.log.dir to broker in KRAFT mode in integration test
Fix the flakiness of LogDirFailureTest by setting a separate metadata.log.dir for brokers in KRAFT mode.

The test was flaky because calling causeLogDirFailure sometimes impacts the first log.dir, which is also KafkaConfig.metadataLogDir when metadata.log.dir is not set. To fix the flakiness, we need to explicitly set metadata.log.dir to a different log dir than the ones we could potentially fail in the tests.

This is part 1 of the fixes. Delivering them separately as the other issues were not as clear cut.

Reviewers: Gaurav Narula <gaurav_narula2@apple.com>, Justine Olshan <jolshan@confluent.io>, Greg Harris <greg.harris@aiven.io>
2024-09-09 13:57:08 +08:00
David Arthur 619aa47963
KAFKA-17457 Don't allow ZK migration to start without transactions (#17094) (#17120)
This patch raises the minimum MetadataVersion for migrations to 3.6-IV1 (metadata transactions). This is only enforced on the controller during bootstrap (when the log is empty). If the log is not empty on controller startup, as in the case of a software upgrade, we allow the migration to continue where it left off.

The broker will log an ERROR message if migrations are enabled and the IBP is not at least 3.6-IV1.
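
The bootstrap rule above can be summarized in a few lines. This is a sketch under stated assumptions: metadata versions are modeled as increasing integers, and the class and method names are made up for the example.

```java
// Hypothetical model of the controller-side check; not the actual Kafka code.
final class MigrationGateSketch {
    static boolean mayRunMigration(boolean metadataLogEmpty, int metadataVersionLevel, int minimumLevel) {
        if (!metadataLogEmpty) {
            // Software upgrade: the migration continues where it left off.
            return true;
        }
        // Fresh bootstrap: require the minimum version (metadata transactions).
        return metadataVersionLevel >= minimumLevel;
    }
}
```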

Reviewers: Colin P. McCabe <cmccabe@apache.org>
2024-09-07 08:49:30 -07:00
Kirk True c7c3e609c0
Back-port KAFKA-16230 to 3.7 branch (#16951)
Reviewers: Andrew Schofield <aschofield@confluent.io>, Lianet Magrans <lianetmr@gmail.com>
2024-09-03 21:34:35 +02:00
Kuan-Po Tseng 57b6c2ef98
KAFKA-17360 local log retention ms/bytes "-2" is not treated correctly (#16996)
1) When the local.retention.ms/bytes is set to -2, we didn't replace it with the server-side retention.ms/bytes config, so the -2 local retention won't take effect.
2) When setting retention.ms/bytes to -2, we can notice this log message:

```
Deleting segment LogSegment(baseOffset=10045, size=1037087, lastModifiedTime=1724040653922, largestRecordTimestamp=1724040653835) due to local log retention size -2 breach. Local log size after deletion will be 13435280. (kafka.log.UnifiedLog) [kafka-scheduler-6]
```
This is not helpful for users. We should replace -2 with the real retention value when logging.
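
Both points come down to resolving the sentinel before it is used. A minimal sketch, assuming hypothetical helper names and a simplified log message:

```java
// Hypothetical helper; not the UnifiedLog implementation.
final class LocalRetentionSketch {
    // -2 means "use the value of retention.ms / retention.bytes".
    static final long USE_RETENTION_VALUE = -2L;

    static long effectiveLocalRetention(long localRetention, long retention) {
        return localRetention == USE_RETENTION_VALUE ? retention : localRetention;
    }

    // Log the resolved value rather than the raw sentinel.
    static String breachReason(long localRetentionBytes, long retentionBytes) {
        long effective = effectiveLocalRetention(localRetentionBytes, retentionBytes);
        return "local log retention size " + effective + " breach";
    }
}
```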

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-08-26 06:22:35 +08:00
TengYao Chi f596a0dffc KAFKA-17315 Fix the behavior of delegation tokens that expire immediately upon creation in KRaft mode (#16858)
In KRaft mode, expiring a delegation token (`expiryTimePeriodMs` < 0) behaves differently from ZK mode in the following ways:

1. `ExpiryTimestampMs` is set to "expiryTimePeriodMs" [0] rather than "now" [1]
2. It throws an exception directly if the token has already expired [2]. By contrast, ZK mode does not [3].

[0] 49fc14f611/metadata/src/main/java/org/apache/kafka/controller/DelegationTokenControlManager.java (L316)
[1] 49fc14f611/core/src/main/scala/kafka/server/DelegationTokenManagerZk.scala (L292)
[2] 49fc14f611/metadata/src/main/java/org/apache/kafka/controller/DelegationTokenControlManager.java (L305)
[3] 49fc14f611/core/src/main/scala/kafka/server/DelegationTokenManagerZk.scala (L293)

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-08-25 08:09:39 +08:00
Matthias J. Sax 89be50f582 MINOR: fix HTML for topology.optimization config (#16953)
The HTML rendering broke via https://issues.apache.org/jira/browse/KAFKA-14209 in the 3.4 release. The currently shown value is garbage: org.apache.kafka.streams.StreamsConfig$$Lambda$20/0x0000000800c0cf18@b1bc7ed

cf https://kafka.apache.org/documentation/#streamsconfigs_topology.optimization

Verified the fix via running StreamsConfig#main() locally.

Reviewers: Bill Bejeck <bill@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>
2024-08-22 17:49:58 -07:00
Colin P. McCabe 431c00d802 KAFKA-17190: AssignmentsManager gets stuck retrying on deleted topics (#16672)
In MetadataVersion 3.7-IV2 and above, the broker's AssignmentsManager sends an RPC to the
controller informing it about which directory we have chosen to place each new replica on.
Unfortunately, the code does not check to see if the topic still exists in the MetadataImage before
sending the RPC. It will also retry infinitely. Therefore, after a topic is created and deleted in
rapid succession, we can get stuck including the now-defunct replica in our subsequent
AssignReplicasToDirsRequests forever.

Reviewers: Igor Soarez <i@soarez.me>, Ron Dagostino <rndgstn@gmail.com>

Conflicts: the original PR in trunk and 3.9 was large and fixed some other issues, like batching.
In order to avoid too much disruption to this older branch, this cherry-pick is minimal and just
stops retrying if the AssignmentsManager receives Errors.UNKNOWN_TOPIC_ID from the controller.
2024-08-12 12:46:19 -07:00
Josep Prat df96e411fb
KAFKA-17227: Update zstd-jni lib (#16763)
* KAFKA-17227: Update zstd-jni lib
* Add note in upgrade docs
* Change zstd-jni version in docker native file and add warning in dependencies.gradle file
* Add reference to snappy in upgrade

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Mickael Maison <mickael.maison@gmail.com>
2024-08-05 17:00:20 +02:00
Kondrat Bertalan 373f8e3806
KAFKA-17192 Fix MirrorMaker2 worker config does not pass config.provi… (#16678)
Reviewers: Chris Egerton <chrise@aiven.io>
2024-08-01 16:31:45 -04:00
Chris Egerton 3ccd019198
MINOR: Clarify ACL requirements for Connect workers when exactly.once.source.support is set to preparing (#16636)
Reviewers: Mickael Maison <mickael.maison@gmail.com>
2024-07-23 13:17:17 -04:00
Greg Harris abaecdf63e
KAFKA-17148: Remove print MetaPropertiesEnsemble from kafka-storage tool (#16607) (3.7) (#16616)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Greg Harris <greg.harris@aiven.io>, Chris Egerton <chrise@aiven.io>
Co-authored-by: Dmitry Werner <grimekillah@gmail.com>
2024-07-17 17:06:15 -07:00
Igor Soarez 67151c6022
MINOR: Update 3.7 branch version to 3.7.2-SNAPSHOT 2024-06-28 10:51:17 +02:00
Igor Soarez e2494e6ffb
Bump version to 3.7.1 2024-06-18 22:27:22 +01:00
Cheng-Kai, Zhang 1c81d32d8e
KAFKA-16252: Fix the documentation and adjust the format (#15473)
Currently, a few documentation files are generated automatically, for example by the task genConnectMetricsDocs.
However, unwanted log information is also added to them,
and the format is not aligned with the other sections, which have the MBean located in the third column.

I modified the code logic so the format follows the other sections in ops.html.
I also turned off the logging, since everything written to standard output is taken as documentation.

Reviewers: Luke Chen <showuon@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-06-18 15:54:58 +01:00
Luke Chen 4d4a1615df
KAFKA-16988: add 1 more node for test_exactly_once_source system test (#16379)
Reviewers: Igor Soarez <soarez@apple.com>
2024-06-18 13:12:32 +01:00
Igor Soarez 738cb17f89 KAFKA-16969: Log error if config conflicts with MV (#16366)
When broker configuration is incompatible with the current Metadata Version, the Broker should log an error-level message but avoid shutting down.

Reviewers: Luke Chen <showuon@gmail.com>
2024-06-18 11:13:10 +08:00
Justine Olshan b4099736f2
MINOR: fix imports in AdminFenceProducersIntegrationTest (#16342)
* Fix imports as the config files were moved after the 3.7 release.

* MINOR: Add integration tag to AdminFenceProducersIntegrationTest (#16326)
Add @Tag("integration") to AdminFenceProducersIntegrationTest
Co-authored-by: Edoardo Comar <ecomar@uk.ibm.com>

Reviewers: Bill Bejeck <bill@confluent.io>, Mickael Maison <mickael.maison@gmail.com>, Edoardo Comar <ecomar@uk.ibm.com>
2024-06-14 19:04:21 -07:00
Matthias J. Sax 8053830a47 MINOR: update Kafka Streams docs with 3.4 KIP information (#16336)
Reviewers: Jim Galasyn <jim.galasyn@confluent.io>, Bill Bejeck <bill@confluent.io>
2024-06-14 15:02:56 -07:00
Matthias J. Sax 65cbd478e8 MINOR: update Kafka Streams docs with 3.3 KIP information (#16316)
Reviewers: Lucas Brutschy <lbrutschy@confluent.io>, Jim Galasyn <jim.galasyn@confluent.io>
2024-06-13 15:17:59 -07:00
Matthias J. Sax f9001b5a46 MINOR: update Kafka Streams docs with 3.2 KIP information (#16313)
Reviewers: Bruno Cadonna <bruno@confluent.io>, Jim Galasyn <jim.galasyn@confluent.io>
2024-06-13 14:59:02 -07:00
Edoardo Comar 7a38fbabdd
KAFKA-16570 FenceProducers API returns "unexpected error" when succes… (#16229)
KAFKA-16570 FenceProducers API returns "unexpected error" when successful

* Client handling of ConcurrentTransactionsException as retriable
* Unit test
* Integration test

Reviewers: Chris Egerton <chrise@aiven.io>, Justine Olshan <jolshan@confluent.io>
2024-06-13 13:42:56 +03:00
Chris Egerton b38d2eb0ea
KAFKA-9228: Restart tasks on runtime-only connector config changes (#16053)
Reviewers: Greg Harris <greg.harris@aiven.io>
2024-06-10 17:25:01 -04:00
Igor Soarez cf67774d8c
KAFKA-16886: Detect replica demotion in AssignmentsManager (#16232)
JBOD Brokers keep the Controller up to date with replica-to-directory
placement via AssignReplicasToDirsRequest. These requests are queued,
compacted and sent by AssignmentsManager.

The Controller returns the error NOT_LEADER_OR_FOLLOWER when handling
an AssignReplicasToDirsRequest from a broker that is not a replica.

A partition reassignment can take place, removing the Broker
as a replica before the AssignReplicasToDirsRequest successfully
reaches the Controller. AssignmentsManager retries failed
requests, and will continuously try to propagate this assignment,
until the Broker either shuts down, or is added back as a replica.

When encountering a NOT_LEADER_OR_FOLLOWER error, AssignmentsManager
should assume that the broker is no longer a replica, and stop
trying to propagate the directory assignment for that partition.
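
A simplified sketch of that retry rule, assuming a map of pending assignments keyed by partition. The class, the result enum, and the method names are hypothetical; only the NOT_LEADER_OR_FOLLOWER error name comes from the commit message.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of pending directory assignments; not AssignmentsManager.
final class AssignmentRetrySketch {
    enum Result { NONE, NOT_LEADER_OR_FOLLOWER, OTHER_ERROR }

    // partition -> log directory still to be reported to the controller
    private final Map<String, String> pending = new LinkedHashMap<>();

    void enqueue(String partition, String directory) {
        pending.put(partition, directory);
    }

    void onResponse(String partition, Result result) {
        switch (result) {
            case NONE:
                pending.remove(partition); // acknowledged
                break;
            case NOT_LEADER_OR_FOLLOWER:
                pending.remove(partition); // no longer a replica: stop propagating
                break;
            default:
                break; // keep the entry so it is retried
        }
    }

    boolean isPending(String partition) {
        return pending.containsKey(partition);
    }
}
```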

Reviewers: Luke Chen <showuon@gmail.com>
2024-06-08 16:17:01 +03:00
Igor Soarez d6eae90ebe
MINOR: Fix broken ReassignPartitionsCommandTest test (#16251)
KAFKA-16606 (#15834) introduced a change that broke
ReassignPartitionsCommandTest.testReassignmentCompletionDuringPartialUpgrade.

The point was to validate that the MetadataVersion supports JBOD
in KRaft when multiple log directories are configured.
We do that by checking the version used in
kafka-features.sh upgrade --metadata, and the version discovered
via a FeatureRecord for metadata.version in the cluster metadata.

There's no point in checking inter.broker.protocol.version in
KafkaConfig, since in KRaft, that configuration is deprecated
and ignored — always assuming the value of MINIMUM_KRAFT_VERSION.

The test that was broken sets inter.broker.protocol.version in
KRaft mode and configures 3 directories. So alternatively, we
could change the test to not configure this property.
Since the property isn't forbidden in KRaft mode, just ignored,
and operators may forget to remove it, it seems better to remove
the fail condition in KafkaConfig.

Reviewers: Luke Chen <showuon@gmail.com>
2024-06-08 14:08:18 +03:00
Igor Soarez 67ca44fdf3
KAFKA-16606 Gate JBOD configuration on 3.7-IV2 (#15834)
Support for multiple log directories in KRaft exists from
MetadataVersion 3.7-IV2.

When migrating a ZK broker to KRaft, we already check that
the IBP is high enough before allowing the broker to startup.

With KIP-584 and KIP-778, Brokers in KRaft mode do not require
the IBP configuration - the configuration is deprecated.
In KRaft mode inter.broker.protocol.version defaults to
MetadataVersion.MINIMUM_KRAFT_VERSION (IBP_3_0_IV1).

Instead KRaft brokers discover the MetadataVersion by reading
the "metadata.version" FeatureLevelRecord from the cluster metadata.

This change adds a new configuration validation step upon discovering
the "metadata.version" from the cluster metadata.
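
The validation step might look roughly like the following. This is a sketch, not the broker code: the enum is a hand-picked, ordered subset of version names used only for illustration, and the class and method names are assumptions.

```java
import java.util.List;

// Hypothetical validation helper; not the actual broker configuration check.
final class JbodGateSketch {
    // Illustrative subset, declared in increasing order so compareTo reflects version order.
    enum Mv { IBP_3_0_IV1, IBP_3_6_IV1, IBP_3_7_IV2 }

    // Returns true when the configured log directories are compatible with the
    // metadata.version discovered from the cluster metadata.
    static boolean isConfigValid(Mv discovered, List<String> logDirs) {
        boolean multipleLogDirs = logDirs.size() > 1;
        return !multipleLogDirs || discovered.compareTo(Mv.IBP_3_7_IV2) >= 0;
    }
}
```

The check runs against the discovered version rather than inter.broker.protocol.version, since the latter is not required in KRaft mode.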

Reviewers: Mickael Maison <mickael.maison@gmail.com>
2024-06-07 11:33:38 +03:00