Currently, when kafka-reassign-partitions.sh is run with the --execute option and a partition number specified in the JSON file does not exist, the problem is detected only on the server side, when the reassignments are submitted to alterPartitionReassignments.
We can perform this check in advance before submitting the reassignments to the server side.
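The idea of checking in advance can be illustrated with a minimal sketch. It is not the tool's actual code: the class name, the method name, and the shape of the inputs (a map of requested partitions per topic and a map of known partition counts) are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Kafka: validates requested partitions on the client side.
final class ReassignmentPrecheck {
    /**
     * Returns "topic-partition" labels for every requested partition that does not exist,
     * given the known partition count per topic.
     */
    static List<String> findMissing(Map<String, List<Integer>> requested,
                                    Map<String, Integer> partitionCounts) {
        List<String> missing = new ArrayList<>();
        // TreeMap gives a deterministic, topic-sorted iteration order.
        for (Map.Entry<String, List<Integer>> entry : new TreeMap<>(requested).entrySet()) {
            Integer count = partitionCounts.get(entry.getKey());
            for (int partition : entry.getValue()) {
                // An unknown topic or an out-of-range partition number counts as missing.
                if (count == null || partition < 0 || partition >= count) {
                    missing.add(entry.getKey() + "-" + partition);
                }
            }
        }
        return missing;
    }
}
```

If the returned list is non-empty, a tool built this way could fail fast before submitting anything to the server.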
Reviewers: Luke Chen <showuon@gmail.com>
Profiling the Kafka tests showed that many client-metrics-reaper threads are not cleaned up after BrokerServer shuts down.
The client-metrics-reaper thread comes from ClientMetricsManager#expirationTimer, and BrokerServer#shutdown doesn't close ClientMetricsManager, so the thread keeps running in the background.
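The leak pattern can be sketched as follows. This is only an illustration: a ScheduledExecutorService stands in for the real expiration timer, and the class name ReaperOwner is hypothetical.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a component that owns a background "reaper" thread.
final class ReaperOwner implements AutoCloseable {
    private final ScheduledExecutorService reaper =
        Executors.newSingleThreadScheduledExecutor(runnable -> {
            Thread thread = new Thread(runnable, "client-metrics-reaper");
            thread.setDaemon(true);
            return thread;
        });

    ReaperOwner() {
        // Periodic no-op task; it keeps the thread alive until the executor is shut down.
        reaper.scheduleAtFixedRate(() -> { }, 1, 1, TimeUnit.SECONDS);
    }

    /** The owner's shutdown path must call this, otherwise the thread outlives the owner. */
    @Override
    public void close() {
        reaper.shutdownNow();
        try {
            reaper.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    boolean isTerminated() {
        return reaper.isTerminated();
    }
}
```

Without the call to close(), each owner created during a test run would leave one more thread behind.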
Reviewers: Luke Chen <showuon@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
A subtle difference in the behavior of the two APIs causes the failures with "Invalid negative timestamp".
In this PR, the list offsets response is processed differently depending on the API. For beginningOffsets/endOffsets, the offset response is returned directly.
For offsetsForTimes, an OffsetAndTimestamp object is constructed for each requested TopicPartition before being returned.
The reason: for beginningOffsets and endOffsets we expect a -1 timestamp in the response, which causes the invalid timestamp exception because the original code tries to construct an OffsetAndTimestamp object before returning.
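A minimal sketch of the two processing paths, under stated assumptions: OffsetData and OffsetAndTs are simplified stand-ins written for this example (not the real client classes), and partitions are keyed by plain strings.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of handling one list-offsets result in two different ways.
final class ListOffsetsResults {
    /** Simplified stand-in for one entry of a list-offsets response. */
    static final class OffsetData {
        final long offset;
        final long timestamp;
        OffsetData(long offset, long timestamp) {
            this.offset = offset;
            this.timestamp = timestamp;
        }
    }

    /** Stand-in that rejects negative timestamps, as described in the text above. */
    static final class OffsetAndTs {
        final long offset;
        final long timestamp;
        OffsetAndTs(long offset, long timestamp) {
            if (timestamp < 0) throw new IllegalArgumentException("Invalid negative timestamp");
            this.offset = offset;
            this.timestamp = timestamp;
        }
    }

    /** beginningOffsets/endOffsets path: return the offsets directly, ignoring the timestamp. */
    static Map<String, Long> beginningOrEndOffsets(Map<String, OffsetData> fetched) {
        Map<String, Long> result = new HashMap<>();
        fetched.forEach((partition, data) -> result.put(partition, data.offset));
        return result;
    }

    /** offsetsForTimes path: build an offset-and-timestamp object per partition. */
    static Map<String, OffsetAndTs> offsetsForTimes(Map<String, OffsetData> fetched) {
        Map<String, OffsetAndTs> result = new HashMap<>();
        fetched.forEach((partition, data) ->
            result.put(partition, new OffsetAndTs(data.offset, data.timestamp)));
        return result;
    }
}
```

Feeding an entry with timestamp -1 through the second path throws, while the first path simply returns the offset.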
In this PR, the following missing tasks are added:
Short-circuit beginningOrEndOffsets
Test both APIs (beginningOrEndOffsets, offsetsForTimes)
Seems like we don't have tests for this API: Note it is presented in other IntegrationTests but they are added to test Async consumer
Reviewers: Lucas Brutschy <lbrutschy@confluent.io>, Lianet Magrans <lianetmr@gmail.com>
- Use `Empty` instead of 'none' when referring to `Optional` values.
- `Headers.lastHeader` returns `null` when no header is found.
- Fix minor spelling mistakes.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
Enables the log directory failure system test for all KRaft modes in addition to ZK mode.
Reviewers: Luke Chen <showuon@gmail.com>, Igor Soarez <soarez@apple.com>, Proven Provenzano <pprovenzano@confluent.io>
This PR fixes the bug introduced by #15263, which caused topic partitions to be recreated whenever the original log dir is offline: Log directory failure re-creates partitions in another logdir automatically
Reviewers: Luke Chen <showuon@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>, Igor Soarez <soarez@apple.com>, Gaurav Narula <gaurav_narula2@apple.com>, Proven Provenzano <pprovenzano@confluent.io>
The following test cases don't actually run the KRaft case. The reason is that the test info doesn't contain the parameter name, so TestInfoUtils#isKRaft always returns false.
1) TopicCommandIntegrationTest
2) DeleteConsumerGroupsTest
3) AuthorizerIntegrationTest
4) DeleteOffsetsConsumerGroupCommandIntegrationTest
We can fix it by adding options.compilerArgs << '-parameters' after
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
testUnrecoverableErrors was flaky because the injected error either affected the next block request (prefetching) or just missed it.
First I tried waiting for the background thread to finish before setting the Errors.X. But then the test failed consistently: the generateProducerId call also prefetches, so after a successful producer id generation we set the error and expected the next call to fail with a coordinator-load-in-progress exception, but because the block had already been prefetched, the call returned a proper producer id.
1) calling generateProducerId --> no current block exists, so requesting block --> CoordinatorLoadInProgressException
   asserting exception
2) calling generateProducerId again --> prefetching, requesting the next block --> giving back the producer id from the first block
   asserting received producer id
3) setting error -- waiting for the background callback(s) to be finished first
4) calling generateProducerId, expecting CoordinatorLoadInProgressException, but --> works like 2), just the prefetching callback is failing due to the error we set before
Note: without waiting for the background thread to complete, the error could be set either before or after the callback of step 2); the test was written in a way that expected it to happen before the callback.
This was the point at which I realised that we need a queue to control the responses rather than trying to do it in the middle of the test method.
Errors can be passed in a queue when the mock id manager is created instead of being modified on the fly.
The queue holds Errors values that specify how the background thread (which imitates the controllerChannel) should behave: return an error or a proper response, and call the callback accordingly.
I was able to simplify the mock id manager class as well; there is no need to override maybeRequestNextBlock if the errors are handled this way via a queue.
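The queue-driven approach can be sketched roughly as below. This is a simplified illustration, not the actual test code: the class QueuedErrorMock and the Outcome enum are hypothetical, and the background thread and prefetching are left out so that only the scripted-response idea remains.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Queue;

// Hypothetical mock whose responses are scripted up front instead of mutated mid-test.
final class QueuedErrorMock {
    /** Simplified stand-in for the error codes a simulated controller response can carry. */
    enum Outcome { NONE, COORDINATOR_LOAD_IN_PROGRESS }

    private final Queue<Outcome> scripted;
    private long nextId = 0;

    QueuedErrorMock(Outcome... outcomes) {
        this.scripted = new ArrayDeque<>(Arrays.asList(outcomes));
    }

    /**
     * Each call consumes one scripted outcome. An error outcome fails the call;
     * NONE (or an exhausted queue) yields the next id.
     */
    long generateId() {
        Outcome outcome = scripted.poll();
        if (outcome != null && outcome != Outcome.NONE) {
            throw new IllegalStateException(outcome.name());
        }
        return nextId++;
    }
}
```

Because the order of outcomes is fixed at construction time, the test no longer races with a callback when it wants a particular request to fail.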
Reviewers: Igor Soarez <soarez@apple.com>, Daniel Urban <urb.daniel7@gmail.com>, Viktor Somogyi-Vass <viktorsomogyi@gmail.com>
Invokes `SSLEngine::closeInbound` after we flush the close_notify alert to the socket. This fixes a memory leak in Netty/OpenSSL based SSLEngine implementations, which only free native resources once closeInbound has been invoked.
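The close sequence can be sketched with the standard javax.net.ssl API. This is an assumption-laden outline rather than the actual transport layer code: the class and method names are hypothetical, the engine is assumed to already have its client/server mode set, and the bytes produced by wrap are discarded where a real transport would flush them to the socket.

```java
import java.nio.ByteBuffer;
import javax.net.ssl.SSLEngine;
import javax.net.ssl.SSLEngineResult;
import javax.net.ssl.SSLException;

// Hypothetical helper showing the order: closeOutbound, wrap/flush, then closeInbound.
final class SslCloseSketch {
    static void closeEngine(SSLEngine engine) {
        engine.closeOutbound();
        ByteBuffer empty = ByteBuffer.allocate(0);
        ByteBuffer net = ByteBuffer.allocate(engine.getSession().getPacketBufferSize());
        try {
            // Bounded loop: let the engine emit its close_notify alert, if it has one to send.
            for (int i = 0; i < 8 && !engine.isOutboundDone(); i++) {
                net.clear();
                SSLEngineResult result = engine.wrap(empty, net);
                // A real transport would flush 'net' to the socket at this point.
                if (result.getStatus() == SSLEngineResult.Status.CLOSED
                        && result.bytesProduced() == 0) {
                    break;
                }
            }
        } catch (SSLException e) {
            // Ignored in this sketch; the connection is being torn down anyway.
        } finally {
            try {
                // Signals that no more inbound data will arrive.
                engine.closeInbound();
            } catch (SSLException e) {
                // Can be thrown when the peer's close_notify was never received.
            }
        }
    }
}
```

Calling closeInbound in a finally block means it runs even if producing or flushing the alert fails.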
Reviewers: Omnia Ibrahim <o.g.h.ibrahim@gmail.com>, Luke Chen <showuon@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
If the broker registers with the same broker epoch as the previous session, it is recognized as a clean shutdown. Otherwise, it is an unclean shutdown. This replica will be removed from any ELR.
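The rule above can be written down as a small sketch. The class and method names are hypothetical, and the ELR is modelled as a plain list of broker ids for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the clean/unclean shutdown rule based on broker epochs.
final class ElrCleanShutdownCheck {
    /** Clean shutdown: the broker registers with the same epoch as its previous session. */
    static boolean isCleanShutdown(long previousBrokerEpoch, long registeredBrokerEpoch) {
        return previousBrokerEpoch == registeredBrokerEpoch;
    }

    /** Returns a copy of the ELR, with the broker removed if its shutdown was unclean. */
    static List<Integer> elrAfterRegistration(List<Integer> elr, int brokerId,
                                              long previousBrokerEpoch,
                                              long registeredBrokerEpoch) {
        List<Integer> updated = new ArrayList<>(elr);
        if (!isCleanShutdown(previousBrokerEpoch, registeredBrokerEpoch)) {
            updated.remove(Integer.valueOf(brokerId));
        }
        return updated;
    }
}
```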
Reviewers: Artem Livshits <alivshits@confluent.io>, David Arthur <mumrah@gmail.com>
The issue KAFKA-16359 reported inclusion of kafka-clients runtime dependencies in MANIFEST.MF Class-Path.
The root cause is the issue defined here with the usage of shadow plugin.
The plugin's code and documentation specify that any dependency marked as shadow is treated as follows by the shadow plugin:
1. Adds the dependency as a runtime dependency in the resulting pom.xml
2. Adds the dependency to the Class-Path in MANIFEST.MF as well
Resolution
We do need the runtime dependencies available in the pom (1 above) but not in the manifest (2 above). There is also no clear way to separate the two behaviours, as both tasks above rely on the shadow configuration.
To fix, I have defined another custom configuration named shadowed which is later used to populate the correct pom (the code is similar to what shadow plugin does to populate pom for runtime dependencies).
Though this might seem like a workaround, I think it is the only way to fix the issue. I have checked other SDKs that suffered from the same issue and took a similar route to populate the pom.
Reviewers: Luke Chen <showuon@gmail.com>, Mickael Maison <mickael.maison@gmail.com>, Gaurav Narula <gaurav_narula2@apple.com>
Added a new optional group_protocol parameter to the test methods, then passed that down to the setup_consumer method.
Unfortunately, because the new consumer can only be used with the new coordinator, this required a new @matrix block instead of adding the group_protocol=["classic", "consumer"] to the existing blocks 😢
Reviewers: Walker Carlson <wcarlson@apache.org>
Migrated the following tests for the new consumer:
- test_fencing_static_consumer
- test_static_consumer_bounce
- test_static_consumer_persisted_after_rejoin
Reviewers: Walker Carlson <wcarlson@apache.org>
Added a new optional group_protocol parameter to the test methods, then passed that down to the methods involved.
Unfortunately, because the new consumer can only be used with the new coordinator, this required a new @matrix block instead of adding the group_protocol=["classic", "consumer"] to the existing blocks 😢
Reviewers: Walker Carlson <wcarlson@apache.org>
Currently, when using kafka-reassign-partitions to move the log directory, the output only indicates which replicas' movement has successfully started.
This PR proposes showing more detailed information, helping end users understand that the operation is proceeding as expected.
Reviewers: Luke Chen <showuon@gmail.com>, Andrew Schofield <aschofield@confluent.io>
When moving replicas between directories in the same broker, future replica promotion hinges on acknowledgment from the controller of a change in the directory assignment.
ReplicaAlterLogDirsThread relies on AssignmentsManager for a completion notification of the directory assignment change.
In its current form, under certain assignment scheduling, AssignmentsManager can both miss completion notifications and trigger them prematurely.
Reviewers: Luke Chen <showuon@gmail.com>, Omnia Ibrahim <o.g.h.ibrahim@gmail.com>, Gaurav Narula <gaurav_narula2@apple.com>
This PR changes the handling of authenticationException on a request from the node to the controller.
We disconnect the controller connection and invalidate the cache so that the next run of the thread will establish a connection with the (potentially) updated controller.
Reviewers: Luke Chen <showuon@gmail.com>, Igor Soarez <soarez@apple.com>, Omnia Ibrahim <o.g.h.ibrahim@gmail.com>
The metric (kafka.server:type=DelayedRemoteFetchMetrics,name=ExpiresPerSec) is a singleton object, and it can be removed by other tests running in the same JVM (and it is not recreated). Hence, verifying the metric value is not stable in this test case.
Reviewers: Luke Chen <showuon@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>, Kamal Chandraprakash <kamal.chandraprakash@gmail.com>
The change in https://github.com/apache/kafka/pull/15373/files#r1544335647 updated the broker port from 9093 to 9094. Some test cases check the broker endpoint against the fixed string 9093. I changed the test cases to get the endpoint by broker id, so they will not fail if someone changes the port again in the future.
Reviewers: Luke Chen <showuon@gmail.com>
1) This PR moves kafka.security classes from core to server module.
2) AclAuthorizer is not moved, because it has heavy dependencies on core classes that have not yet been rewritten from Scala.
3) AclAuthorizer will be deleted as part of ZK removal
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
This is a mitigation fix for https://issues.apache.org/jira/browse/KAFKA-16217. Exceptions should not block closing the producers.
This PR reverts part of the change in #13591
Reviewers: Kirk True <ktrue@confluent.io>, Justine Olshan <jolshan@confluent.io>
Signed-off-by: Greg Harris <greg.harris@aiven.io>
Reviewers: Chris Egerton <chrise@aiven.io>, Mickael Maison <mickael.maison@gmail.com>, Omnia G H Ibrahim <o.g.h.ibrahim@gmail.com>, Matthias J. Sax <matthias@confluent.io>
When custom processors are added via StreamsBuilder#addGlobalStore, they will now reprocess all records through the custom transformer instead of loading them directly.
We do this so that users who transform the records will not get improperly formatted records downstream.
Reviewers: Matthias J. Sax <matthias@confluent.io>
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>, Omnia Ibrahim <o.g.h.ibrahim@gmail.com>
Co-authored-by: n.izhikov <n.izhikov@vk.team>
Follow-up to #15535, splitting consumer integration tests defined in the long-running PlainTextConsumerTest. This PR extracts the tests that directly relate to committing offsets. No changes in logic.
Reviewers: Lucas Brutschy <lbrutschy@confluent.io>