Standardize controller log4j output for replaying important records. The log message should include
word "replayed" to make it clear that this is a record replay. Log the replay of records for ACLs,
client quotas, and producer IDs, which were previously not logged. Also fix a case where we weren't
logging changes to broker registrations.
AclControlManager, ClientQuotaControlManager, and ProducerIdControlManager didn't previously have a
log4j logger object, so this PR adds one. It also converts them to using Builder objects. This
makes junit tests more readable because we don't need to specify paramaters where the test can use
the default (like LogContexts).
Throw an exception in replay if we get another TopicRecord for a topic which already exists.
In JmxTool.scala, we will wait till all the object names are available from MBean server. But in the newer version, we only wait for subset of object names. Due to this, we may not enforce wait option and prematurely return the result if the objects are not yet registered in MBean sever.
Reviewers: Luke Chen <showuon@gmail.com>, Federico Valeri <fvaleri@redhat.com>
This patch adds the session timeout and the revocation timeout to the new consumer group protocol.
Reviewers: Calvin Liu <caliu@confluent.io>, Jeff Kim <jeff.kim@confluent.io>, Justine Olshan <jolshan@confluent.io>
KAFKA-14522 Rewrite and Move of RemoteIndexCache to storage module.
Cleanedup index file suffix usages and other minor cleanups
Reviewers: Jun Rao <junrao@gmail.com>, Ismael Juma <ismael@juma.me.uk>, Luke Chen <showuon@gmail.com>, Divij Vaidya <diviv@amazon.com>, Kamal Chandraprakash<kamal.chandraprakash@gmail.com>, Jorge Esteban Quilcate Otoya <quilcate.jorge@gmail.com>
During fast consecutive rebalances where a task is revoked from one worker and assigned to another one, it has been observed that there is a small time window and thus a race condition during which a RUNNING status record in the new generation is produced and is immediately followed by a delayed UNASSIGNED status record belonging to the same or a previous generation before the worker that sends this message reads the RUNNING status record that corresponds to the latest generation.
Although this doesn't inhibit the actual execution of tasks, it reports an incorrect status for those tasks(i.e UNASSIGNED). If the users have setup some kind of monitoring on tasks status then this could lead to false alarms for example.
This fix addresses this problem by checking if a status message is stale after reading it and updates it's status only when it is safe to.
Reviewers: Lucent-Wong <manchesterfans@live.cn>, Chris Egerton <chrise@aiven.io>, Yash Mayya <yash.mayya@gmail.com>, Konstantine Karantasis <k.karantasis@gmail.com>
Adds additional testing for the various KRaft *Image classes. For every image that we create we already test that we can get there by applying all the records corresponding to that image written out as a list of records. This patch adds more tests to confirm that we can get to each such final image with intermediate stops at all possible intermediate images along the way.
Reviewers: Colin P. McCabe <cmccabe@apache.org>, David Arthur <mumrah@gmail.com>
- Add new unit tests
- Change the on-disk filename from <offset>_<uuid>_.<indexSuffix> to <offset>_<uuid>.<indexSuffix> i.e. remove trailing underscore after
- Fix a small bug where we were parsing offset as Int when reading the file name from disk. Offset is long.
- Perform input validation in RemoteLogSegmentMetadata.
- Remove an extra loop in cleaner thread. Shutdownable thread already performs looping.
Reviewers: Jorge Esteban Quilcate Otoya <jorge.quilcate@aiven.io>, Satish Duggana <satishd@apache.org>
Make the output of 'find' and 'ls' sorted alphabetically.
Add GlobComponentTest.java to test globbing.
Add shell/src/test/resources/log4j.properties so that shell JUnit tests show some output
Reviewers: David Arthur <mumrah@gmail.com>
This patch adds EventBasedCoordinatorTimer and the CoordinatorTimer interface.
Reviewers: Jeff Kim <jeff.kim@confluent.io>, Justine Olshan <jolshan@confluent.io>
Fixes a bug where we don't send UMR and LISR requests in dual-write mode when new partitions are created. Prior to this patch, KRaftMigrationZkWriter was mutating the internal data-structures of TopicDelta which prevented MigrationPropagator from sending UMR and LISR for the changed partitions.
Reviewers: David Arthur <mumrah@gmail.com>
Fixes a regression introduced in PR #13924 by moving the map from static to a instance specific variable.
---------
Co-authored-by: Deqi Hu <deqi.hu@shopee.com>
This patch rewrites MockTimer in Java and moves it from core to server-common. This continues the work started in https://github.com/apache/kafka/pull/13820.
Reviewers: Divij Vaidya <diviv@amazon.com>
This is a follow up on the initial OffsetFetcher refactoring to extract reusable logic, needed for the new consumer implementation (initial refactoring merged with PR-13815).
Similar to the initial refactoring, this PR brings no changes to the existing logic, just extracting functions or pieces of logic.
There were no individual tests for the extracted functions, so no tests were migrated.
Reviewers: Jun Rao <junrao@gmail.com>
This patch adds (1) the logic to propagate a new MetadataImage to the running coordinators; and (2) the logic to ensure that all the consumer groups subscribed to topics with changes will refresh their subscriptions metadata on the next heartbeat. In the mean time, it ensures that freshly loaded consumer groups also refresh their subscriptions metadata on the next heartbeat.
Reviewers: Jeff Kim <jeff.kim@confluent.io>, Justine Olshan <jolshan@confluent.io>
This commit utilizes TestUtils methods to create a topic and adds logs when assertions fail.
Reviewers: Divij Vaidya <diviv@amazon.com>
---------
Co-authored-by: d00791190 <dinglan6@huawei.com>
With the state updater, the task manager needs also to look into the
tasks owned by the state updater when computing the sum of offsets
of the state. This sum of offsets is used by the high availability
assignor to assign warm-up replicas.
If the task manager does not take into account tasks in the
state updater, a warm-up replica will never report back that
the state for the corresponding task has caught up. Consequently,
the warm-up replica will never be dismissed and probing rebalances
will never end.
Reviewers: Lucas Brutschy <lbrutschy@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Walker Carlson <wcarlson@confluent.io>
Use AdminApiDriver class to refresh the metadata and retry the request that failed with retriable errors.
Reviewers: Luke Chen <showuon@gmail.com>, Divij Vaidya <diviv@amazon.com>, Mickael Maison <mmaison@redhat.com>, Dimitar Dimitrov <30328539+dimitarndimitrov@users.noreply.github.com>
It's good for us to add support for Java 20 in preparation for Java 21 - the next LTS.
Given that Scala 2.12 support has been deprecated, a Scala 2.12 variant is not included.
Also remove some branch builds that add load to the CI, but have
low value: JDK 8 & Scala 2.13 (JDK 8 support has been deprecated),
JDK 11 & Scala 2.12 (Scala 2.12 support has been deprecated) and
JDK 17 & Scala 2.12 (Scala 2.12 support has been deprecated).
A newer version of Mockito (4.9.0 -> 4.11.0) is required for Java 20 support, but we
only use it with Scala 2.13+ since it causes compilation errors with Scala 2.12. Similarly,
we upgrade easymock when the Java version is 16 or newer as it's incompatible
with powermock (which doesn't support Java 16 or newer).
Filed KAFKA-15117 for a test that fails with Java 20 (SslTransportLayerTest.testValidEndpointIdentificationCN).
Finally, fixed some lossy conversions that were added after #13582 was submitted.
Reviewers: Ismael Juma <ismael@juma.me.uk>
Poison the transaction manager if we detect an illegal transition in the Sender thread. A ThreadLocal in is stored in TransactionManager so that the Sender can inform TransactionManager which thread it's using.
Reviewers: Daniel Urban <durban@cloudera.com>, Justine Olshan <jolshan@confluent.io>, Jason Gustafson <jason@confluent.io>