KIP-373 added a "token requester" field to the output of kafka-delegation-tokens.sh. The system test was failing since it was not expecting this new field. This patch adds support for this field and improves the error output when parsing fails.
Reviewers: José Armando García Sancio <jsancio@apache.org>, Manikumar Reddy <manikumar.reddy@gmail.com>
This patch adds a section in security.html about listener configuration. This includes the basics of how to define the security mapping of each listener as well as the configurations to control inter-cluster traffic.
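For illustration, a minimal sketch of the kind of broker settings the new section covers. The listener names, hosts, and ports below are placeholders, not values taken from the docs themselves:

```java
import java.util.Properties;

public class ListenerConfigSketch {
    // Illustrative only: listener names, hosts, and ports are placeholders.
    public static Properties brokerProps() {
        Properties props = new Properties();
        // Map each named listener to a security protocol.
        props.put("listener.security.protocol.map", "INTERNAL:PLAINTEXT,EXTERNAL:SASL_SSL");
        // Bind the listeners and advertise them to clients.
        props.put("listeners", "INTERNAL://0.0.0.0:9092,EXTERNAL://0.0.0.0:9093");
        props.put("advertised.listeners", "INTERNAL://broker1.internal:9092,EXTERNAL://broker1.example.com:9093");
        // Route broker-to-broker traffic over a dedicated listener.
        props.put("inter.broker.listener.name", "INTERNAL");
        return props;
    }
}
```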
Reviewers: José Armando García Sancio <jsancio@users.noreply.github.com>, Luke Chen <showuon@gmail.com>
This commit adds documentation for KRaft monitoring-related metrics to the Kafka docs (docs/ops.html).
Reviewers: Jason Gustafson <jason@confluent.io>, Luke Chen <showuon@gmail.com>
Previously, BrokerRegistration#toString would throw an exception, terminating metadata replay,
because the sorted() method was used on an entry set rather than a key set.
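A minimal illustration of the failure mode (a generic example, not the actual BrokerRegistration code):

```java
import java.util.Map;
import java.util.stream.Collectors;

public class SortedSetExample {
    public static void main(String[] args) {
        Map<String, Integer> endpoints = Map.of("PLAINTEXT", 9092, "SSL", 9093);

        // Throws ClassCastException at runtime: Map.Entry is not Comparable,
        // so sorted() has no natural ordering to apply to the entries.
        // String broken = endpoints.entrySet().stream()
        //         .sorted()
        //         .map(Object::toString)
        //         .collect(Collectors.joining(", "));

        // Works: the String keys are Comparable, so sorted() can order them.
        String fixed = endpoints.keySet().stream()
                .sorted()
                .map(k -> k + "=" + endpoints.get(k))
                .collect(Collectors.joining(", "));
        System.out.println(fixed);
    }
}
```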
Reviewers: David Arthur <mumrah@gmail.com>
Fixes an issue with StandardAuthorizer#authorize that allowed inconsistent results. The underlying
concurrent data structure (ConcurrentSkipListMap) had weak consistency guarantees. This meant
that a concurrent update to the authorizer data could result in the authorize function processing
ACL updates out of order.
This patch replaces the concurrent data structures with regular non-thread safe equivalents and uses
a read/write lock for thread safety and strong consistency.
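A rough sketch of the pattern described above (hypothetical names, not the actual StandardAuthorizer code):

```java
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class AclStoreSketch<K, V> {
    // Plain, non-thread-safe sorted map instead of ConcurrentSkipListMap.
    private final NavigableMap<K, V> acls = new TreeMap<>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    // Writers take the exclusive lock, so ACL updates are applied atomically and in order.
    public void addAcl(K key, V value) {
        lock.writeLock().lock();
        try {
            acls.put(key, value);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Readers take the shared lock and always observe a consistent view of the map.
    public V authorize(K key) {
        lock.readLock().lock();
        try {
            return acls.get(key);
        } finally {
            lock.readLock().unlock();
        }
    }
}
```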
Reviewers: David Arthur <mumrah@gmail.com>, Jason Gustafson <jason@confluent.io>, Colin P. McCabe <cmccabe@apache.org>, Luke Chen <showuon@gmail.com>
We should prevent the metadata log from initializing in a known bad state. If the log start offset of the first segment is greater than 0, then there must be a snapshot at an offset greater than or equal to it in order to ensure that the initialized state is complete.
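A sketch of the invariant check (method and parameter names here are hypothetical):

```java
import java.util.OptionalLong;

public class MetadataLogInitCheck {
    /**
     * Reject a known-bad initial state: if the log starts after offset 0, a snapshot
     * must cover everything up to the log start offset, otherwise the state is incomplete.
     */
    public static void validate(long logStartOffset, OptionalLong latestSnapshotEndOffset) {
        if (logStartOffset > 0 && latestSnapshotEndOffset.orElse(0L) < logStartOffset) {
            throw new IllegalStateException(
                "Log start offset " + logStartOffset + " is greater than 0 but there is no " +
                "snapshot with an end offset greater than or equal to it");
        }
    }
}
```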
Reviewers: José Armando García Sancio <jsancio@users.noreply.github.com>
Disable segment deletion based on size and time by setting the KRaft metadata log's `RetentionMsProp` and `RetentionBytesProp` to `-1`. This will cause `UnifiedLog.deleteRetentionMsBreachedSegments` and `UnifiedLog.deleteRetentionSizeBreachedSegments` to short circuit instead of deleting segments.
Without this change the included test would fail. This happens because `deleteRetentionMsBreachedSegments` is able to delete past the `logStartOffset`. Deleting past the `logStartOffset` would violate the invariant that if the `logStartOffset` is greater than 0 then there is a snapshot with an end offset greater than or equal to the log start offset.
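In effect, the metadata log's configuration is built with retention disabled, roughly like this simplified sketch (not the actual config-building code):

```java
import java.util.Properties;

public class MetadataLogConfigSketch {
    // Simplified sketch: disable time- and size-based retention for the metadata log so
    // that the corresponding segment-deletion paths short circuit.
    public static Properties metadataLogProps() {
        Properties props = new Properties();
        props.put("retention.ms", "-1");    // RetentionMsProp: disables deleteRetentionMsBreachedSegments
        props.put("retention.bytes", "-1"); // RetentionBytesProp: disables deleteRetentionSizeBreachedSegments
        return props;
    }
}
```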
Reviewers: Luke Chen <showuon@gmail.com>, Jason Gustafson <jason@confluent.io>
Now the built-in partitioner defers the partition switch (while still
accounting for produced bytes) if there is no ready batch to send, thus
avoiding switching partitions and creating fractional batches.
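A simplified sketch of the decision (names are illustrative, not the actual built-in partitioner code):

```java
public class PartitionSwitchSketch {
    /**
     * Switch to a new partition only once enough bytes have been produced AND the
     * current batch is ready to send; otherwise keep filling the current batch so
     * we don't leave a fractional batch behind.
     */
    public static boolean shouldSwitchPartition(int producedBytes, int batchSize, boolean batchReady) {
        return producedBytes >= batchSize && batchReady;
    }
}
```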
Reviewers: Jun Rao <jun@confluent.io>
Migrates Streams system tests to either use KRaft brokers or to use both KRaft and ZK in a testing matrix.
This skips tests which use various forms of Kafka versioning since those seem to have issues with KRaft at the moment. Running these tests with KRaft will require a followup PR.
Reviewers: Guozhang Wang <guozhang@apache.org>, John Roesler <vvcephei@apache.org>
Because the snapshot writer sets a linger ms of Integer.MAX_VALUE, it is
possible for the memory pool to run out of memory if the snapshot is
greater than 5 * 8MB.
This change allows the BatchMemoryPool to always allocate a buffer when
requested. The memory pool frees the extra allocated buffers when they are
released if the number of pooled buffers is greater than the configured
maximum number of batches.
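A condensed sketch of the allocation/release behaviour described above (simplified, not the actual BatchMemoryPool code):

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;

public class BatchMemoryPoolSketch {
    private final int maxRetainedBatches;
    private final int batchSize;
    private final Deque<ByteBuffer> free = new ArrayDeque<>();

    public BatchMemoryPoolSketch(int maxRetainedBatches, int batchSize) {
        this.maxRetainedBatches = maxRetainedBatches;
        this.batchSize = batchSize;
    }

    // Always hand out a buffer, reusing a pooled one when available.
    public synchronized ByteBuffer tryAllocate() {
        ByteBuffer buffer = free.poll();
        return buffer != null ? buffer : ByteBuffer.allocate(batchSize);
    }

    // Retain at most maxRetainedBatches buffers; any extra buffer allocated for a
    // large snapshot is simply dropped on release so its memory can be reclaimed.
    public synchronized void release(ByteBuffer buffer) {
        buffer.clear();
        if (free.size() < maxRetainedBatches) {
            free.offer(buffer);
        }
    }
}
```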
Reviewers: Jason Gustafson <jason@confluent.io>
Asynchronous offset commits may throw an unexpected WakeupException following #11631 and #12244. This patch fixes the problem by passing through a flag to ensureCoordinatorReady to indicate whether wakeups should be disabled. This is used to disable wakeups in the context of asynchronous offset commits. All other uses leave wakeups enabled.
Note: this patch builds on top of #12611.
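A rough sketch of how the flag is threaded through (the method shape is illustrative, not the exact coordinator signature):

```java
public class CoordinatorSketch {
    // Asynchronous offset commits disable wakeups while waiting for the coordinator,
    // so a pending wakeup() cannot surface as an unexpected WakeupException here.
    public void commitOffsetsAsync() {
        ensureCoordinatorReady(/* disableWakeup = */ true);
        // ... send the OffsetCommit request without blocking the caller ...
    }

    // All other callers keep the existing behaviour and remain interruptible by wakeup().
    public void commitOffsetsSync() {
        ensureCoordinatorReady(/* disableWakeup = */ false);
        // ... send the OffsetCommit request and wait for the response ...
    }

    private void ensureCoordinatorReady(boolean disableWakeup) {
        // ... poll for coordinator discovery, ignoring wakeups when disableWakeup is true ...
    }
}
```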
Co-Authored-By: Guozhang Wang <wangguoz@gmail.com>
Reviewers: Luke Chen <showuon@gmail.com>
When auto-commit is enabled with the "eager" rebalance strategy, the consumer will commit all offsets prior to revocation. Following recent changes, this offset commit is done asynchronously, which means there is an opportunity for fetches to continue returning data to the application. When this happens, the progress is lost following revocation, which results in duplicate consumption. This patch fixes the problem by adding a flag in `SubscriptionState` to ensure that partitions which are awaiting revocation will not continue being fetched.
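A sketch of the idea (hypothetical names, not the actual SubscriptionState code):

```java
import java.util.HashSet;
import java.util.Set;

public class SubscriptionStateSketch<P> {
    private final Set<P> assignedPartitions = new HashSet<>();
    private final Set<P> pendingRevocation = new HashSet<>();

    // Mark partitions that are about to be revoked so no further fetched data is returned
    // for them while the pre-revocation offset commit is still in flight.
    public void markPendingRevocation(Set<P> partitions) {
        pendingRevocation.addAll(partitions);
    }

    // A partition is only fetchable if it is assigned and not awaiting revocation.
    public boolean isFetchable(P partition) {
        return assignedPartitions.contains(partition) && !pendingRevocation.contains(partition);
    }
}
```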
Reviewers: Luke Chen <showuon@gmail.com>, Jason Gustafson <jason@confluent.io>
Currently forwarded requests are not applied to any quotas on either the controller or the broker. The controller-side throttling requires the controller to apply the quota changes from the log to the quota managers, which will be done separately. In this patch, we change the response logic on the broker side to also apply the broker's request quota. The enforced throttle time is the maximum of the throttle returned from the controller (which is 0 until we fix the aforementioned issue) and the broker's request throttle time.
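In effect, the broker computes the throttle for the forwarded response along these lines (illustrative sketch only):

```java
public class ForwardedRequestThrottleSketch {
    // The effective throttle is the larger of the controller-reported throttle
    // (currently always 0, see above) and the broker's own request-quota throttle.
    public static int effectiveThrottleTimeMs(int controllerThrottleMs, int brokerRequestThrottleMs) {
        return Math.max(controllerThrottleMs, brokerRequestThrottleMs);
    }
}
```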
Reviewers: David Arthur <mumrah@gmail.com>
The </html> tag doesn't have a matching <html> tag. Those tags are added by
the server-side include and are not needed in docs/upgrade.html.
Reviewers: Ismael Juma <ismael@juma.me.uk>
Document the process for recovering and formatting the metadata log directory for the KRaft controller.
Reviewers: Colin Patrick McCabe <cmccabe@apache.org>, Jason Gustafson <jason@confluent.io>
Verified that the artifact generated by `releaseTarGz` no longer includes
swagger-jaxrs2 or its dependencies (like snakeyaml).
Reviewers: José Armando García Sancio <jsancio@users.noreply.github.com>, Chris Egerton <fearthecellos@gmail.com>
When the rack-aware consumer configuration is used and rolling updates are being applied to the Kafka brokers, the metadata can be in a transient state and a given topic-partition can be missing from the metadata. This resolves itself after a bit of time, but before it does, the `Cluster.nodeIfOnline` method throws an NPE. This patch checks that partition info is available for a given topic-partition before using it.
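The guard is essentially the following (simplified sketch, not the exact Cluster code):

```java
import java.util.Map;
import java.util.Optional;

public class NodeIfOnlineSketch {
    // During a rolling broker update the metadata can transiently lack a topic-partition.
    // Check for missing partition info instead of dereferencing it (the NPE case).
    public static Optional<String> nodeIfOnline(Map<String, String> partitionToLeader, String topicPartition) {
        String leader = partitionToLeader.get(topicPartition);
        if (leader == null) {
            return Optional.empty(); // partition missing from transient metadata
        }
        return Optional.of(leader);
    }
}
```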
Reviewers: David Jacot <djacot@confluent.io>
This patch removes test_kafka_version.py, which contains two tests at the moment. The first test verifies we can start a 0.8.2 cluster. The second verifies we can start a cluster with one node on 0.8.2 and another on the latest version. These tests are covered in greater depth by upgrade_test.py and downgrade_test.py.
Reviewers: José Armando García Sancio <jsancio@users.noreply.github.com>
Update the quickstart HTML pages for Kafka and Kafka Streams to include how to quickly start and
experiment with a Kafka cluster using KRaft in addition to ZooKeeper.
Reviewers: Colin Patrick McCabe <cmccabe@apache.org>, Chase Thomas <forlack@users.noreply.github.com>, Luke Chen <showuon@gmail.com>
The consumer group instance ID is used to support a notion of "static" consumer groups. The idea is to be able to identify the same group instance across restarts so that a rebalance is not needed. However, if the user sets `group.instance.id` in the consumer configuration, but uses "simple" assignment with `assign()`, then the instance ID nevertheless is sent in the OffsetCommit request to the coordinator. This may result in a surprising UNKNOWN_MEMBER_ID error.
This PR fixes the issue on the client side by not setting the group instance id if the member id is empty (no generation).
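A sketch of the client-side change (hypothetical names, not the exact consumer coordinator code):

```java
import java.util.Optional;

public class OffsetCommitRequestSketch {
    // Only attach group.instance.id when the consumer has a real member id (i.e. it is
    // using group management); a plain assign() consumer commits without it.
    public static Optional<String> groupInstanceIdToSend(Optional<String> configuredInstanceId,
                                                         String memberId) {
        if (memberId.isEmpty()) {
            return Optional.empty(); // no generation/member id: do not send the instance id
        }
        return configuredInstanceId;
    }
}
```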
Reviewers: Jason Gustafson <jason@confluent.io>
The consumer group instance ID is used to support a notion of "static" consumer groups. The idea is to be able to identify the same group instance across restarts so that a rebalance is not needed. However, if the user sets `group.instance.id` in the consumer configuration, but uses "simple" assignment with `assign()`, then the instance ID nevertheless is sent in the OffsetCommit request to the coordinator. This may result in a surprising UNKNOWN_MEMBER_ID error.
This PR attempts to fix this issue for existing consumers by relaxing the validation in this case. One way is to simply ignore the member id and the static id when the generation id is -1, which signals that the request comes from either the admin client or a consumer that does not use group management. This does not apply to transactional offset commits.
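A sketch of the relaxed check on the coordinator side (illustrative only, not the actual group coordinator code):

```java
public class OffsetCommitValidationSketch {
    /**
     * Generation -1 means the commit comes from the admin client or a consumer that does
     * not use group management, so skip member id / static instance id validation.
     * Transactional offset commits keep the strict validation.
     */
    public static boolean requiresMemberValidation(int generationId, boolean transactional) {
        return transactional || generationId >= 0;
    }
}
```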
Reviewers: Jason Gustafson <jason@confluent.io>
Originally, the QuorumController did not try to limit the number of records in a batch that it sent
to the Raft layer. This caused two problems. Firstly, we were not correctly handling the exception
that was thrown by the Raft layer when a batch of records was too large to apply atomically. This
happened because the Raft layer threw an exception which was a subclass of ApiException. Secondly,
by letting the Raft layer split non-atomic batches, we were not able to create snapshots at each of
the splits. This led to O(N) behavior during controller failovers.
This PR fixes both of these issues by limiting the number of records in a batch. Atomic batches
that are too large will fail with a RuntimeException which will cause the active controller to
become inactive and revert to the last committed state. Non-atomic batches will be split into
multiple batches with a fixed number of records in each.
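A condensed sketch of the splitting logic (illustrative, not the exact QuorumController code):

```java
import java.util.ArrayList;
import java.util.List;

public class RecordBatchSplitSketch {
    /**
     * Atomic batches above the limit are rejected outright (a RuntimeException, so the
     * active controller resigns and reverts to the last committed state); non-atomic
     * batches are split into chunks with at most maxRecordsPerBatch records each.
     */
    public static <T> List<List<T>> toBatches(List<T> records, int maxRecordsPerBatch, boolean atomic) {
        if (atomic) {
            if (records.size() > maxRecordsPerBatch) {
                throw new IllegalStateException("Attempted to atomically commit " + records.size() +
                    " records, but the maximum batch size is " + maxRecordsPerBatch);
            }
            return List.of(records);
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < records.size(); start += maxRecordsPerBatch) {
            batches.add(records.subList(start, Math.min(start + maxRecordsPerBatch, records.size())));
        }
        return batches;
    }
}
```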
Reviewers: Luke Chen <showuon@gmail.com>, José Armando García Sancio <jsancio@gmail.com>
Also includes a minor quality-of-life improvement to clarify why some internal REST requests to workers may fail while that worker is still starting up.
Reviewers: Tom Bentley <tbentley@redhat.com>, Luke Chen <showuon@gmail.com>, José Armando García Sancio <jsancio@gmail.com>, Mickael Maison <mickael.maison@gmail.com>
This adds a new configuration `sasl.server.max.receive.size` that sets the maximum receive size for requests before and during authentication.
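Conceptually, the broker enforces the limit while reading a receive before or during SASL authentication, along these lines (a simplified sketch, not the actual authenticator code):

```java
public class SaslReceiveSizeSketch {
    // Value of the new broker config: sasl.server.max.receive.size
    private final int maxReceiveSize;

    public SaslReceiveSizeSketch(int maxReceiveSize) {
        this.maxReceiveSize = maxReceiveSize;
    }

    // Reject oversized requests received before or during SASL authentication.
    public void checkReceiveSize(int receiveSize) {
        if (receiveSize > maxReceiveSize) {
            throw new IllegalArgumentException("SASL receive of " + receiveSize +
                " bytes exceeds sasl.server.max.receive.size of " + maxReceiveSize);
        }
    }
}
```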
Reviewers: Tom Bentley <tbentley@redhat.com>, Mickael Maison <mickael.maison@gmail.com>
Co-authored-by: Manikumar Reddy <manikumar.reddy@gmail.com>
Co-authored-by: Mickael Maison <mickael.maison@gmail.com>
When deserializing KRPC (which is used for RPCs sent to Kafka, Kafka Metadata records, and some
other things), check that we have at least N bytes remaining before allocating an array of size N.
Remove DataInputStreamReadable since it was hard to make this class aware of how many bytes were
remaining. Instead, when reading an individual record in the Raft layer, simply create a
ByteBufferAccessor with a ByteBuffer containing just the bytes we're interested in.
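The core of the check is roughly the following (simplified; not the exact generated-protocol code):

```java
import java.nio.ByteBuffer;

public class BoundedArrayReadSketch {
    /**
     * Before allocating an array of the size claimed by the serialized data, make sure
     * the buffer actually has that many bytes remaining. This prevents a malformed or
     * malicious length field from triggering a huge allocation.
     */
    public static byte[] readByteArray(ByteBuffer buffer, int size) {
        if (size > buffer.remaining()) {
            throw new RuntimeException("Error reading byte array of " + size +
                " byte(s): only " + buffer.remaining() + " byte(s) available");
        }
        byte[] array = new byte[size];
        buffer.get(array);
        return array;
    }
}
```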
Add SimpleArraysMessageTest and ByteBufferAccessorTest. Also add some additional tests in
RequestResponseTest.
Reviewers: Tom Bentley <tbentley@redhat.com>, Mickael Maison <mickael.maison@gmail.com>, Colin McCabe <colin@cmccabe.xyz>
Co-authored-by: Colin McCabe <colin@cmccabe.xyz>
Co-authored-by: Manikumar Reddy <manikumar.reddy@gmail.com>
Co-authored-by: Mickael Maison <mickael.maison@gmail.com>
Originally, we only enabled retries for PR builds to avoid hiding timing
related issues. In practice, however, the results are too noisy without
any retry due to various environmental issues.
Enable 1 retry for all builds and increase the max test retry to 10.
Reviewers: José Armando García Sancio <jsancio@users.noreply.github.com>
kafka-features.sh must exit with a non-zero error code on error. We must do this in order to catch
regressions like KAFKA-13990.
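On the tool side this amounts to something like the following sketch (illustrative only, not the actual FeatureCommand code):

```java
public class FeatureCommandExitSketch {
    public static void main(String[] args) {
        try {
            run(args);
        } catch (Exception e) {
            System.err.println(e.getMessage());
            // Exit with a non-zero status so scripts and system tests can detect the failure.
            System.exit(1);
        }
    }

    private static void run(String[] args) throws Exception {
        // ... parse arguments and execute the requested feature operation ...
    }
}
```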
Reviewers: David Arthur <mumrah@gmail.com>