Commit Graph

271 Commits

Author SHA1 Message Date
Dmitry Werner 2d4abb85bf
KAFKA-16415 Fix handling of "--version" option in ConsumerGroupCommand (#15592)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-03-26 15:44:23 +08:00
PoAn Yang fa1cf7975e
KAFKA-16409: DeleteRecordsCommand should use standard exception handling (#15586)
DeleteRecordsCommand should use standard exception handling

Reviewers: Luke Chen <showuon@gmail.com>
2024-03-26 08:44:59 +08:00
Kuan-Po (Cooper) Tseng 7b2fc469ad
KAFKA-16410 kafka-leader-election / LeaderElectionCommand doesn't set exit code on error (#15591)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-03-25 12:31:37 +08:00
Dmitry Werner 0434c29e58
KAFKA-16408 kafka-get-offsets / GetOffsetShell doesn't handle --version or --help (#15583)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-03-25 12:12:23 +08:00
Nikolay 0f216b6448
MINOR: Tuple2 replaced with Map.Entry (#15560)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-03-23 11:44:05 +08:00
Nikolay b6183a4134
KAFKA-14589 ConsumerGroupCommand rewritten in java (#14471)
This PR contains changes to rewrite ConsumerGroupCommand in java and transfer it to tools module

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-03-20 15:34:45 +08:00
Kuan-Po (Cooper) Tseng 12a1d85362
KAFKA-12187 replace assertTrue(obj instanceof X) with assertInstanceOf (#15512)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-03-20 10:36:25 +08:00
Luke Chen 834efa6606
KAFKA-16342 fix getOffsetByMaxTimestamp for compressed records (#15474)
Fix getOffsetByMaxTimestamp for compressed records.

This PR adds:

1) For the inPlaceAssignment case, compute the correct offset for maxTimestamp while traversing the batch records and set it in ValidationResult at the end, instead of always using the last offset.

2) For the non-inPlaceAssignment case, set offsetOfMaxTimestamp for the log create time, as in the non-compressed and inPlaceAssignment cases, instead of always using the last offset.

3) Add tests to verify the fix.

Reviewers: Jun Rao <junrao@apache.org>, Chia-Ping Tsai <chia7712@gmail.com>
2024-03-15 06:09:45 +08:00
Nikolay 414365979e
KAFKA-14589 [4/4] Tests of ConsoleGroupCommand rewritten in java (#15465)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-03-09 03:54:39 +08:00
PoAn Yang 5dd382ccbd
MINOR: Use INFO logging for tools tests (#15487)
Reviewers: Luke Chen <showuon@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-03-08 03:02:22 +08:00
Dmitry Werner ba0db81e53
KAFKA-16246: Cleanups in ConsoleConsumer (#15457)
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Omnia Ibrahim <o.g.h.ibrahim@gmail.com>
2024-03-07 09:39:16 +01:00
Nikolay 5f4806fd1c
KAFKA-14589 [2/4] Tests of ConsoleGroupCommand rewritten in java (#15363)
This PR is part of #14471
It contains some of the ConsoleGroupCommand tests rewritten in Java.
The intention of the separate PR is to reduce the size of the changes and simplify review.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-03-07 07:44:17 +08:00
Nikolay f6198bc075
KAFKA-14589 [3/4] Tests of ConsoleGroupCommand rewritten in java (#15365)
It contains some of the ConsoleGroupCommand tests rewritten in Java.
The intention of the separate PR is to reduce the size of the changes and simplify review.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-03-06 17:13:39 +08:00
Ritika Reddy 96c68096a2
KAFKA-15462: Add Group Type Filter for List Group to the Admin Client (#15150)
In KIP-848, we introduce the notion of Group Types based on the protocol type that the members in the consumer group use. As of now we support two types of groups:
* Classic: Members use the classic consumer group protocol (the existing one)
* Consumer: Members use the consumer group protocol introduced in KIP-848.
Currently List Groups allows users to list all the consumer groups available. KIP-518 introduced filtering the consumer groups by the state that they are in. We now want to allow users to filter consumer groups by type.

This patch includes the changes to the admin client and related files. It also includes changes to parameterize the tests to include permutations of the old GC and the new GC with the different protocol types.

Reviewers: David Jacot <djacot@confluent.io>
2024-02-29 00:38:42 -08:00
Yang Yu b4e96913cc
KAFKA-16265: KIP-994 (Part 1) Minor Enhancements to ListTransactionsRequest (#15384)
Introduces a new filter in the ListTransactionsRequest API. This enables callers to filter on transactions that have been running for longer than a certain duration.

This PR includes the following changes:

- bumps version for ListTransactionsRequest API to 1. Set the durationFilter to -1L when communicating with an older broker that does not support version 1.
- bumps version for ListTransactionsResponse to 1 without changing the response structure.
- adds durationFilter option to kafka-transactions.sh --list

Tests:

- Client side test to build request with correct combination of duration filter and API version: testBuildRequestWithDurationFilter
- Server side test to filter transactions based on duration: testListTransactionsFiltering
- Added test case for kafka-transactions.sh change in TransactionsCommandTest
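
The duration filter described above can be sketched as follows. This is only an illustration under stated assumptions: the class, the method, and the representation of a transaction as a start timestamp are hypothetical and not taken from the Kafka code base. The only details taken from the message are that the filter selects transactions running longer than a given duration and that -1L stands for "no filter".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; not the broker's actual filtering code.
final class DurationFilterSketch {
    // Assumption: a negative value (such as -1L) disables the filter.
    static final long NO_FILTER = -1L;

    // Returns the start timestamps of transactions that have been running
    // for longer than durationFilterMs, or all of them if the filter is off.
    static List<Long> filterByDuration(List<Long> startTimesMs, long nowMs, long durationFilterMs) {
        List<Long> result = new ArrayList<>();
        for (long startMs : startTimesMs) {
            if (durationFilterMs < 0 || nowMs - startMs > durationFilterMs) {
                result.add(startMs);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Long> starts = Arrays.asList(0L, 500L, 900L);
        // At time 1000 the transactions have run for 1000, 500 and 100 ms.
        System.out.println(filterByDuration(starts, 1000L, 200L));
        System.out.println(filterByDuration(starts, 1000L, NO_FILTER));
    }
}
```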

Reviewers: Justine Olshan <jolshan@confluent.io>, Raman Verma <rverma@confluent.io>
2024-02-24 06:09:23 -08:00
Owen Leung 71a4e6fc0c
KAFKA-15140: improve TopicCommandIntegrationTest to be less flaky (#14891)
This PR improves TopicCommandIntegrationTest by:
    - using TestUtils.createTopicWithAdmin
    - replacing \n with lineSeparator
    - using waitForAllReassignmentsToComplete
    - adding more logging when an assertion fails

Reviewers: Luke Chen <showuon@gmail.com>, Justine Olshan <jolshan@confluent.io>
2024-02-19 19:37:31 +08:00
David Jacot e247bd03af
MINOR: Improve ListConsumerGroupTest.testListGroupCommand (#15382)
While reviewing https://github.com/apache/kafka/pull/15150, I found that our tests verifying the console output are really hard to read. Here is my proposal to make it better.

Reviewers: Ritika Reddy <rreddy@confluent.io>, Justine Olshan <jolshan@confluent.io>
2024-02-17 00:07:50 -08:00
Mickael Maison 0bf830fc9c
KAFKA-14576: Move ConsoleConsumer to tools (#15274)
Reviewers: Josep Prat <josep.prat@aiven.io>, Omnia Ibrahim <o.g.h.ibrahim@gmail.com>
2024-02-13 19:24:07 +01:00
Nikolay 88c5543ccf
KAFKA-14589: [1/3] Tests of ConsoleGroupCommand rewritten in java (#15256)
This PR is part of #14471
It contains some of the ConsoleGroupCommand tests rewritten in Java.
The intention of the separate PR is to reduce the size of the changes and simplify review.

Reviewers: Luke Chen <showuon@gmail.com>
2024-02-13 11:02:36 +08:00
Nikolay 13c0c5ee97
KAFKA-14589 ConsumerGroupServiceTest rewritten in java (#15248)
This PR is part of #14471
It contains a single test rewritten in Java.
The intention of the separate PR is to reduce the size of the changes and simplify review.

Reviewers: Justine Olshan <jolshan@confluent.io>
2024-01-26 10:32:48 -08:00
Sergio Troiano ae4d308f68
KAFKA-16015: Fix custom timeouts overwritten by defaults in LeaderElectionCommand (#15030)
This commit fixes a bug in LeaderElectionCommand due to which custom timeout configuration was not being respected.
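
A minimal sketch of this class of bug, assuming the tool merges a user-supplied properties file with hard-coded defaults. The class and method names are hypothetical and are not taken from LeaderElectionCommand; `request.timeout.ms` is used only as an example key.

```java
import java.util.Properties;

// Hypothetical sketch of defaults clobbering user configuration.
final class TimeoutDefaultsSketch {
    // Buggy pattern: the default is written unconditionally, so any
    // value the user supplied for the same key is lost.
    static Properties overwriteWithDefaults(Properties userProps, String defaultTimeoutMs) {
        Properties props = new Properties();
        props.putAll(userProps);
        props.setProperty("request.timeout.ms", defaultTimeoutMs);
        return props;
    }

    // Fixed pattern: the default is applied only when the key is absent.
    static Properties respectUserValues(Properties userProps, String defaultTimeoutMs) {
        Properties props = new Properties();
        props.putAll(userProps);
        props.putIfAbsent("request.timeout.ms", defaultTimeoutMs);
        return props;
    }

    public static void main(String[] args) {
        Properties user = new Properties();
        user.setProperty("request.timeout.ms", "90000");
        System.out.println(overwriteWithDefaults(user, "30000").getProperty("request.timeout.ms")); // 30000
        System.out.println(respectUserValues(user, "30000").getProperty("request.timeout.ms"));     // 90000
    }
}
```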

Reviewers: Divij Vaidya <diviv@amazon.com>, Proven Provenzano <pprovenzano@confluent.io>
2023-12-29 10:50:26 +01:00
Proven Provenzano b0e99b5593
KAFKA-15922: Bump MetadataVersion to support JBOD with KRaft (#14984)
Moves ELR from MetadataVersion IBP_3_7_IV3 into the new IBP_3_8_IV0 because the ELR feature was not completed before 3.7 reached feature freeze.  Leaves IBP_3_7_IV3 empty -- it is a no-op and is not reused for anything.  Adds the new MetadataVersion IBP_3_7_IV4 for the FETCH request changes from KIP-951, which were mistakenly never associated with a MetadataVersion.  Updates the LATEST_PRODUCTION MetadataVersion to IBP_3_7_IV4 to declare both KRaft JBOD and the KIP-951 changes ready for production use.

Reviewers: Omnia G H Ibrahim <o.g.h.ibrahim@gmail.com>, Ron Dagostino <rdagostino@confluent.io>, Ismael Juma <ismael@juma.me.uk>, José Armando García Sancio <jsancio@apache.org>, Justine Olshan <jolshan@confluent.io>
2023-12-14 10:08:54 -05:00
Andrew Schofield 46852eea1c
KAFKA-15871: kafka-client-metrics.sh (#14926)
Initial implementation of kafka-client-metrics.sh tools for KIP-714 and KIP-1000.

Reviewers: Igor Soarez <soarez@apple.com>, Jun Rao <junrao@gmail.com>
2023-12-06 10:10:10 -08:00
Nikolay 783698c525
KAFKA-15645: Move ReplicationQuotasTestRig to tools module (#14588)
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Justine Olshan <jolshan@confluent.io>, Taras Ledkov <tledkov@apache.org>
2023-12-05 10:03:33 +01:00
David Jacot 26274afd05
MINOR: Ensure that DisplayName is set in all parameterized tests (#14850)
This is a follow-up to https://github.com/apache/kafka/pull/14687 as we found out that some parameterized tests do not include the test method name in their name. For context, the JUnit XML report does not include the name of the method by default but relies only on the display name provided.

Reviewers: David Arthur <mumrah@gmail.com>
2023-12-04 23:58:48 -08:00
Colin Patrick McCabe a94bc8d6d5
KAFKA-15922: Add a MetadataVersion for JBOD (#14860)
Assign MetadataVersion.IBP_3_7_IV2 to JBOD.

Move KIP-966 support to MetadataVersion.IBP_3_7_IV3.

Create MetadataVersion.LATEST_PRODUCTION as the latest metadata version that can be used when formatting a
new cluster, or upgrading a cluster using kafka-features.sh. This will allow us to clearly distinguish between stable
and unstable metadata versions for the first time.

Reviewers: Igor Soarez <soarez@apple.com>, Ron Dagostino <rndgstn@gmail.com>, Calvin Liu <caliu@confluent.io>, Proven Provenzano <pprovenzano@confluent.io>
2023-11-30 10:35:13 -08:00
Nikolay 76b1b50b64
KAFKA-14595 Move ReassignPartitionsCommand to java (#13247)
This PR contains changes required to move PartitionReassignmentState class to java code.

Reviewers: Mickael Maison <mickael.maison@gmail.com>, Justine Olshan <jolshan@confluent.io>, Federico Valeri <fedevaleri@gmail.com>, Taras Ledkov <tledkov@apache.org>, Kamal Chandraprakash <kamal.chandraprakash@gmail.com>
2023-10-31 17:29:05 -07:00
Calvin Liu af747fbfed
KAFKA-15581: Introduce ELR (#14312)
This patch introduces preliminary changes for Eligible Leader Replicas (KIP-966)

* New MetadataVersion 16 (3.7-IV1)
* New record versions for PartitionRecord and PartitionChangeRecord
* New tagged fields on PartitionRecord and PartitionChangeRecord
* New static config "eligible.leader.replicas.enable" to gate the whole feature

Reviewers: Artem Livshits <alivshits@confluent.io>, David Arthur <mumrah@gmail.com>, Colin P. McCabe <cmccabe@apache.org>
2023-10-19 14:05:15 -04:00
Omnia G.H Ibrahim 9af1e74b5e
KAFKA-14596: Move TopicCommand to tools (#13201)
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Federico Valeri <fedevaleri@gmail.com>
2023-10-17 11:40:15 +02:00
Federico Valeri aec07f76d7
KAFKA-15537: Fix metadata downgrade documentation (#14484)
In KIP-778 we introduced the "unsafe" (lossy) downgrade in case metadata has changes in one of the versions between target and current, as defined in MetadataVersion.

The documentation says it is possible:

"Note that the cluster metadata version cannot be downgraded to a pre-production 3.0.x, 3.1.x, or 3.2.x version once it has been upgraded. However, it is possible to downgrade to production versions such as 3.3-IV0, 3.3-IV1, etc."

The command line tool shows that this doesn't work:

bin/kafka-features.sh --bootstrap-server :9092 downgrade --metadata 3.4 --unsafe
Could not downgrade metadata.version to 8. Invalid metadata.version 8. Unsafe metadata downgrade is not supported in this version.
1 out of 1 operation(s) failed.

In addition to unsafe downgrades, safe metadata downgrades are not supported in practice either. For example, when you upgrade to 3.5, you land on 3.5-IV2 as the metadata version, which has metadata changes and won't let you downgrade. This is true for every other release at the moment.

This change fixes the documentation to reflect that, and improves the error messages.

Signed-off-by: Federico Valeri <fedevaleri@gmail.com>

Reviewers: Luke Chen <showuon@gmail.com>, Jakub Scholz <github@scholzj.com>
2023-10-12 11:12:44 +08:00
Omnia G.H Ibrahim 7553d3f562
KAFKA-14593: Move LeaderElectionCommand to tools (#13204)
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Federico Valeri <fedevaleri@gmail.com>
2023-10-03 11:59:56 +02:00
Nikolay 8f8dbad564
KAFKA-14595 ReassignPartitionsIntegrationTest rewritten in java (#14456)
This PR is part of #13247
It contains ReassignPartitionsIntegrationTest rewritten in java.
The goal of this PR is to reduce the size of the changes in the main PR.

Reviewers: Taras Ledkov <tledkov@apache.org>, Justine Olshan <jolshan@confluent.io>
2023-10-02 13:22:17 -07:00
Colin Patrick McCabe fcac880fd5
KAFKA-15466: Add KIP-919 support for some admin APIs (#14399)
Add support for --bootstrap-controller in the following command-line tools:
    - kafka-cluster.sh
    - kafka-configs.sh
    - kafka-features.sh
    - kafka-metadata-quorum.sh

To implement this, the following AdminClient APIs now support the new bootstrap.controllers
configuration:
    - Admin.alterConfigs
    - Admin.describeCluster
    - Admin.describeConfigs
    - Admin.describeFeatures
    - Admin.describeMetadataQuorum
    - Admin.incrementalAlterConfigs
    - Admin.updateFeatures

Command-line tool changes:
    - Add CommandLineUtils.initializeBootstrapProperties to handle parsing --bootstrap-controller
      in addition to --bootstrap-server.
    - Add --bootstrap-controller to ConfigCommand.scala, ClusterTool.java, FeatureCommand.java, and
      MetadataQuorumCommand.java.
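
As a rough illustration of the parsing step described above, the sketch below maps the two command-line options onto the `bootstrap.servers` and `bootstrap.controllers` client configs. The class and method are hypothetical, and the rule that exactly one of the two options must be given is an assumption of this sketch; it does not reproduce the signature or behavior of CommandLineUtils.initializeBootstrapProperties.

```java
import java.util.Optional;
import java.util.Properties;

// Hypothetical sketch; not the actual CommandLineUtils implementation.
final class BootstrapPropertiesSketch {
    static Properties bootstrapProperties(Optional<String> bootstrapServer,
                                          Optional<String> bootstrapController) {
        // Assumption: the two options are mutually exclusive and one is required.
        if (bootstrapServer.isPresent() == bootstrapController.isPresent()) {
            throw new IllegalArgumentException(
                "Specify exactly one of --bootstrap-server or --bootstrap-controller");
        }
        Properties props = new Properties();
        if (bootstrapServer.isPresent()) {
            props.setProperty("bootstrap.servers", bootstrapServer.get());
        } else {
            props.setProperty("bootstrap.controllers", bootstrapController.get());
        }
        return props;
    }

    public static void main(String[] args) {
        System.out.println(bootstrapProperties(Optional.of("localhost:9092"), Optional.empty()));
        System.out.println(bootstrapProperties(Optional.empty(), Optional.of("localhost:9093")));
    }
}
```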

KafkaAdminClient changes:
    - Add the AdminBootstrapAddresses class to handle extracting bootstrap.servers or
      bootstrap.controllers from the config map for KafkaAdminClient.
    - In AdminMetadataManager, store the new usingBootstrapControllers boolean. Generalize
      authException to encompass the concept of fatal exceptions in general. (For example, the
      fatal exception where we talked to the wrong node type.) Treat
      MismatchedEndpointTypeException and UnsupportedEndpointTypeException as fatal exceptions.
    - Extend NodeProvider to include information about whether bootstrap.controllers is supported.
    - Modify the APIs described above to support bootstrap.controllers.

Server-side changes:
    - Support DescribeConfigsRequest on kcontrollers.
    - Add KRaftMetadataCache to the kcontroller to simplify implementing describeConfigs (and
      probably more APIs in the future). It's mainly a wrapper around MetadataImage, so there is
      essentially no extra resource consumption.
    - Split RuntimeLoggerManager out of ConfigAdminManager to handle the incrementalAlterConfigs
      support for BROKER_LOGGER. This is now supported on kcontrollers as well as brokers.
    - Fix bug in AuthHelper.computeDescribeClusterResponse that resulted in us always sending back
      BROKER as the endpoint type, even on the kcontroller.

Miscellaneous:
    - Fix a few places in exceptions and log messages where we wrote "broker" instead of "node".
      For example, an exception in NodeApiVersions.java, and a log message in NetworkClient.java.
    - Fix the slf4j log prefix used by KafkaRequestHandler logging so that request handlers on a
      controller don't look like they're on a broker.
    - Make the FinalizedVersionRange constructor public for the sake of a junit test.
    - Add unit and integration tests for the above.

Reviewers: David Arthur <mumrah@gmail.com>, Doguscan Namal <namal.doguscan@gmail.com>
2023-09-26 14:43:42 -07:00
Nikolay daf8a0deda
KAFKA-14595 ReassignPartitionsUnitTest rewritten in java (#14355)
This PR is part of #13247
It contains changes to rewrite single test in java.
The intention is to reduce the changes in the parent PR.

Reviewers: Luke Chen <showuon@gmail.com>, Taras Ledkov <tledkov@apache.org>
2023-09-23 09:45:14 +08:00
Ruslan Krivoshein b72d92919f
KAFKA-14581: Moving GetOffsetShell to tools (#13562)
This PR moves GetOffsetShell from core module to tools module with rewriting from Scala to Java.

Reviewers: Federico Valeri <fedevaleri@gmail.com>, Ziming Deng <dengziming1993@gmail.com>, Mickael Maison <mimaison@apache.org>
2023-09-11 10:30:22 +08:00
Colin Patrick McCabe 41b695b6e3
KAFKA-15369: Implement KIP-919: Allow AC to Talk Directly with Controllers (#14306)
Implement KIP-919: Allow AdminClient to Talk Directly with the KRaft Controller Quorum and add
Controller Registration. This KIP adds a new version of DescribeClusterRequest which is supported
by KRaft controllers. It also teaches AdminClient how to use this new DESCRIBE_CLUSTER request to
talk directly with the controller quorum. This is all gated behind a new MetadataVersion,
IBP_3_7_IV0.

In order to share the DESCRIBE_CLUSTER logic between broker and controller, this PR factors it out
into AuthHelper.computeDescribeClusterResponse.

The KIP adds three new error codes: MISMATCHED_ENDPOINT_TYPE, UNSUPPORTED_ENDPOINT_TYPE, and
UNKNOWN_CONTROLLER_ID. The endpoint type errors can be returned from DescribeClusterRequest.

On the controller side, the controllers now try to register themselves with the current active
controller, by sending a CONTROLLER_REGISTRATION request. This, in turn, is converted into a
RegisterControllerRecord by the active controller. ClusterImage, ClusterDelta, and all other
associated classes have been upgraded to propagate the new metadata. In the metadata shell, the
cluster directory now contains both broker and controller subdirectories.

QuorumFeatures previously had a reference to the ApiVersions structure used by the controller's
NetworkClient. Because this PR removes that reference, QuorumFeatures now contains only immutable
data. Specifically, it contains the current node ID, the locally supported features, and the list
of quorum node IDs in the cluster.

Reviewers: David Arthur <mumrah@gmail.com>, Ziming Deng <dengziming1993@gmail.com>, Luke Chen <showuon@gmail.com>
2023-09-07 15:21:52 -07:00
Nikolay 0029bc4897
KAFKA-14595: ReassignPartitionsCommandArgsTest rewritten in java (#14217)
Reviewers: Taras Ledkov <tledkov@apache.org>, Greg Harris <greg.harris@aiven.io>
2023-09-07 10:12:07 -07:00
Ron Dagostino 8394ddc0d2
MINOR: Move delegation token support to Metadata Version 3.6-IV2 (#14270)
#14083 added support for delegation tokens in KRaft and attached that support to the existing
MetadataVersion 3.6-IV1. This patch moves that support into a separate MetadataVersion 3.6-IV2.

Reviewers: Colin P. McCabe <cmccabe@apache.org>
2023-08-22 16:04:53 -07:00
Greg Harris 6bd17419b7
KAFKA-15228: Add sync-manifests command to connect-plugin-path (KIP-898) (#14195)
Reviewers: Chris Egerton <chrise@aiven.io>
2023-08-16 11:37:33 -07:00
Colin Patrick McCabe adc16d0f31
KAFKA-14538: Implement KRaft metadata transactions in QuorumController
Implement the QuorumController side of KRaft metadata transactions.

As specified in KIP-868, this PR creates a new metadata version, IBP_3_6_IV1, which contains the
three new records: AbortTransactionRecord, BeginTransactionRecord, EndTransactionRecord.

In order to make offset management unit-testable, this PR moves it out of QuorumController.java and
into OffsetControlManager.java. The general approach here is to track the "last stable offset," which is
calculated by looking at the latest committed offset and the in-progress transaction (if any). When
a transaction is aborted, we revert back to this last stable offset. We also revert back to it when
the controller is transitioning from active to inactive.
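
One way to read the "last stable offset" rule above is sketched below. This is a guess at the general idea under stated assumptions, not the logic in OffsetControlManager.java: the class and method are hypothetical, and treating the offset just before the transaction start as the upper bound is an assumption of this sketch.

```java
import java.util.OptionalLong;

// Hypothetical sketch; OffsetControlManager may compute this differently.
final class LastStableOffsetSketch {
    // With no transaction in progress, everything committed is stable.
    // With a transaction in progress, nothing at or after its start offset
    // is treated as stable (assumption of this sketch).
    static long lastStableOffset(long lastCommittedOffset, OptionalLong transactionStartOffset) {
        if (transactionStartOffset.isPresent()) {
            return Math.min(lastCommittedOffset, transactionStartOffset.getAsLong() - 1);
        }
        return lastCommittedOffset;
    }

    public static void main(String[] args) {
        System.out.println(lastStableOffset(100L, OptionalLong.empty())); // 100
        System.out.println(lastStableOffset(100L, OptionalLong.of(90L))); // 89
    }
}
```

Under this reading, aborting a transaction or losing the active role means reverting in-memory state to the returned offset.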

In a follow-up PR, we will add support for the transaction records in MetadataLoader. We will also
add support for automatically aborting pending transactions after a controller failover.

Reviewers: David Arthur <mumrah@gmail.com>
2023-08-14 16:58:56 -07:00
Greg Harris f5655d31d3
KAFKA-15030: Add connect-plugin-path command-line tool (#14064)
Reviewers: Chris Egerton <chrise@aiven.io>
2023-08-11 12:05:51 -07:00
Federico Valeri bb677c4959
KAFKA-14583: Move ReplicaVerificationTool to tools (#14059)
Reviewers: Mickael Maison <mickael.maison@gmail.com>
2023-07-26 12:04:34 +02:00
Nikolay 4bba2c8a32
KAFKA-14591: Move DeleteRecordsCommand to tools (#13278)
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Federico Valeri <fedevaleri@gmail.com>
2023-07-21 17:30:28 +02:00
Federico Valeri 334c41d604
KAFKA-14734: Use CommandDefaultOptions in StreamsResetter (#13983)
This PR adds CommandDefaultOptions usage like in the other joptsimple based tools. It also moves the associated unit test class from streams to tools module as discussed in #13127 (comment)

Reviewers: Luke Chen <showuon@gmail.com>, Bruno Cadonna <cadonna@apache.org>, Sagar Rao <sagarmeansocean@gmail.com>
2023-07-20 18:45:05 +08:00
Manikumar Reddy 4e85bc9f80
MINOR: Fix Jmxtool to honour wait option when MBean is not yet available in MBean server (#13995)
In JmxTool.scala, we waited until all the object names were available from the MBean server. In the newer version, we only wait for a subset of the object names. Because of this, we may not enforce the wait option and may prematurely return the result if the objects are not yet registered in the MBean server.

Reviewers: Luke Chen <showuon@gmail.com>, Federico Valeri <fvaleri@redhat.com>
2023-07-12 17:01:10 +05:30
prasanthV 58fc264410
MINOR: Fix ToolsTestUtils by removing incorrect closure of Std Stream (#13922)
Reviewers: Lucas Bradstreet <lucas@confluent.io>, Divij Vaidya <diviv@amazon.com>
2023-06-28 17:46:22 +02:00
José Armando García Sancio 8ad0ed3e61
KAFKA-15021; Skip leader epoch bump on ISR shrink (#13765)
When the KRaft controller removes a replica from the ISR because of the controlled shutdown there is no need for the leader epoch to be increased by the KRaft controller. This is accurate as long as the topic partition leader doesn't add the removed replica back to the ISR.

This change also fixes a bug when computing the HWM. When computing the HWM, replicas that are not eligible to join the ISR but are caught up should not be included in the computation. Otherwise, the HWM will never increase for replica.lag.time.max.ms because the shutting down replica is not sending FETCH requests. Without this additional fix, PRODUCE requests would time out if the request timeout is greater than replica.lag.time.max.ms.

Because of the bug above the KRaft controller needs to check the MV to guarantee that all brokers support this bug fix before skipping the leader epoch bump.

Reviewers: David Mao <47232755+splett2@users.noreply.github.com>, Divij Vaidya <diviv@amazon.com>, David Jacot <djacot@confluent.io>
2023-06-07 07:20:40 -07:00
Federico Valeri 7e9a82c732
MINOR: Fix for MetadataQuorumCommandErrorTest.testRelativeTimeMs (#13784)
Reviewers: Divij Vaidya <diviv@amazon.com>, David Jacot <djacot@confluent.io>
2023-05-31 18:48:26 +02:00
Federico Valeri 45520c1342
KAFKA-14982: Improve the kafka-metadata-quorum output (#13738)
When running kafka-metadata-quorum script to get the quorum replication status, the LastFetchTimestamp and LastCaughtUpTimestamp output is not human-readable.

It would be convenient to add an optional flag (-hr, --human-readable) to enable a human-readable format showing the delay in ms (e.g. 366 ms ago).

This delay is computed as (now - timestamp), where both are represented as Unix time (UTC based).

$ bin/kafka-metadata-quorum.sh --bootstrap-server :9092 describe --replication --human-readable
NodeId	LogEndOffset	Lag	LastFetchTimestamp	LastCaughtUpTimestamp	Status  	
2     	61          	0  	5 ms ago          	5 ms ago             	Leader  	
3     	61          	0  	56 ms ago         	56 ms ago            	Follower	
4     	61          	0  	56 ms ago         	56 ms ago            	Follower
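
The (now - timestamp) computation above can be sketched as follows. The class and method names are hypothetical, chosen for illustration; only the "N ms ago" output format is taken from the sample output.

```java
// Hypothetical sketch of the human-readable delay formatting.
final class RelativeTimeSketch {
    // Both arguments are Unix timestamps in milliseconds.
    static String relativeTimeMs(long timestampMs, long nowMs) {
        return (nowMs - timestampMs) + " ms ago";
    }

    public static void main(String[] args) {
        System.out.println(relativeTimeMs(1000L, 1366L)); // 366 ms ago
        // In a real tool, "now" would come from the system clock.
        long now = System.currentTimeMillis();
        System.out.println(relativeTimeMs(now - 5L, now)); // 5 ms ago
    }
}
```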

Reviewers: Luke Chen <showuon@gmail.com>
2023-05-29 10:04:46 +08:00
Federico Valeri ac9d11b426
KAFKA-14997: Fix JmxToolTest failing on CI server (#13720)
This test was reported as flaky on CI server.

When connecting to a multi-homed machine using RMI, the wrong address may be returned by the RMI registry to the client, causing the connection to the RMI server to time out.

This change explicitly sets the hostname returned to the clients in the remote stub object.

Reviewers: Luke Chen <showuon@gmail.com>, vamossagar12 <sagarmeansocean@gmail.com>, Kamal Chandraprakash <kamal.chandraprakash@gmail.com>, hudeqi <16120374@bjtu.edu.cn>, Christo Lolov <christololov@gmail.com>
2023-05-16 10:31:32 +08:00
Kamal Chandraprakash 54a4067f81
KAFKA-14559: Fix JMX tool to handle the object names with wildcard and optional attributes (#13060)
Reviewers: Federico Valeri <fedevaleri@gmail.com>, Satish Duggana <satishd@apache.org>
2023-05-11 21:49:21 +05:30
Christo Lolov dc7819d7f1
KAFKA-14594: Move LogDirsCommand to tools module (#13122)
Reviewers: Mickael Maison <mickael.maison@gmail.com>
2023-05-04 12:00:33 +02:00
Gantigmaa Selenge ea540fa400
KAFKA-14592: Move FeatureCommand to tools (#13459)
KAFKA-14592: Move FeatureCommand to tools

Reviewers: Luke Chen <showuon@gmail.com>
2023-04-25 20:28:37 +08:00
Robert Young 2b26db0d38
Switch to SplittableRandom in ProducerPerformance utility (#13482)
Why:
Using java.util.Random to generate every byte sent from the ProducerPerformance
appears to be a limiting factor. Throughput of the ProducerPerformance script is
higher with a file of records as compared to randomly generated records.

On my machine a single thread can generate ~100MB/second of uppercase letters using
java.util.Random and ~300MB/sec using java.util.SplittableRandom. This is a limit on
throughput.

Note: you can optimise further by expanding it from 26 letters to 32 letters generated
as it is more efficient to generate a nicely distributed int when the bound is a
power of two.
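
A minimal sketch of the payload generation being discussed, assuming one random uppercase letter per byte. The class and method names are hypothetical and this is not the ProducerPerformance code itself; it shows only the use of java.util.SplittableRandom in place of java.util.Random.

```java
import java.nio.charset.StandardCharsets;
import java.util.SplittableRandom;

// Hypothetical sketch; not the actual ProducerPerformance implementation.
final class RandomPayloadSketch {
    // Fills a payload with random uppercase ASCII letters ('A' to 'Z').
    static byte[] randomUppercase(SplittableRandom random, int size) {
        byte[] payload = new byte[size];
        for (int i = 0; i < size; i++) {
            payload[i] = (byte) (random.nextInt(26) + 'A');
        }
        return payload;
    }

    public static void main(String[] args) {
        byte[] payload = randomUppercase(new SplittableRandom(), 16);
        System.out.println(new String(payload, StandardCharsets.US_ASCII));
    }
}
```

SplittableRandom is not thread-safe, so a sketch like this assumes one generator per thread.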

Reviewers: Luke Chen <showuon@gmail.com>
2023-03-31 14:52:10 +08:00
hudeqi aef004edee
KAFKA-14812: ProducerPerformance still counting successful sending in console when sending failed (#13404)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2023-03-21 16:59:18 +08:00
Chia-Ping Tsai 279c237632
Revert "MINOR: Fixed ProducerPerformance still counting successful sending when sending failed (#13348)" (#13401)
This reverts commit 8e4c0d0b04.

Reviewers: Luke Chen <showuon@gmail.com>
2023-03-16 21:26:01 +08:00
hudeqi 8e4c0d0b04
MINOR: Fixed ProducerPerformance still counting successful sending when sending failed (#13348)
Reviewers: Luke Chen <showuon@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2023-03-15 21:30:51 +08:00
Federico Valeri 07e2f6cd4d
KAFKA-14578: Move ConsumerPerformance to tools (#13215)
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Alexandre Dupriez <alexandre.dupriez@gmail.com>
2023-03-06 18:16:55 +01:00
vamossagar12 bb3111f472
KAFKA-14580: Moving EndToEndLatency from core to tools module (#13095)
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Federico Valeri <fedevaleri@gmail.com>, Ismael Juma <mlists@juma.me.uk>
2023-03-02 12:05:22 +01:00
Gantigmaa Selenge ea30ec4b56
KAFKA-14590: Move DelegationTokenCommand to tools (#13172)
KAFKA-14590: Move DelegationTokenCommand to tools

Reviewers: Luke Chen <showuon@gmail.com>, Christo Lolov <christo_lolov@yahoo.com>, Federico Valeri <fvaleri@redhat.com>
2023-03-02 14:30:07 +08:00
Ron Dagostino 631e6be3a0
KAFKA-14711: kafka-metadata-quorum.sh does not honor --command-config option (#13241)

https://github.com/apache/kafka/pull/12951 accidentally changed the behavior of the `kafka-metadata-quorum.sh` CLI by making it silently ignore a `--command-config <filename>` properties file that exists. This was an undetected regression in the 3.4.0 release. This patch fixes the issue so that any such specified file will be honored.

Reviewers: José Armando García Sancio <jsancio@apache.org>, Ismael Juma <ismael@juma.me.uk>
2023-02-13 18:33:20 -05:00
Federico Valeri 50e0e3c257
KAFKA-14582: Move JmxTool to tools (#13136)
Reviewers: Mickael Maison <mickael.maison@gmail.com>
2023-02-02 11:23:26 +01:00
Mickael Maison 8b44237655
KAFKA-14575: Move ClusterTool to tools module (#13080)
Reviewers: dengziming <dengziming1993@gmail.com>, Federico Valeri <fedevaleri@gmail.com>
2023-01-22 12:50:43 +01:00
Luke Chen 2575362639
KAFKA-14498: reduce the startup nodes to avoid timeout error (#13016)
In MetadataQuorumCommandTest, we sometimes got the error:

java.util.concurrent.ExecutionException: java.lang.RuntimeException: Received a fatal error while waiting for the broker to catch up with the current cluster metadata.

This happens because we try to bring up 3 brokers and 3 controllers at the same time, and the time allowed by initial.broker.registration.timeout.ms (default 1 min) is sometimes not enough for them to start up. The tests don't require so many nodes, so this change reduces the number of nodes to make the tests reliable.

Reviewers: dengziming <dengziming1993@gmail.com>, Ismael Juma <ismael@juma.me.uk>
2022-12-21 11:19:22 +08:00
Ismael Juma c0b28fde66
MINOR: Use INFO logging for tools and trogdor tests (#13006)
`TRACE` is too noisy and makes the build slower.

Reviewers: David Jacot <djacot@confluent.io>
2022-12-17 10:22:40 -08:00
Ismael Juma 88725669e7
MINOR: Move MetadataQuorumCommand from `core` to `tools` (#12951)
`core` should only be used for legacy cli tools and tools that require
access to `core` classes instead of communicating via the kafka protocol
(typically by using the client classes).

Summary of changes:
1. Convert the command implementation and tests to Java and move it to
    the `tools` module.
2. Introduce mechanism to capture stdout and stderr from tests.
3. Change `kafka-metadata-quorum.sh` to point to the new command class.
4. Adjusted the test classpath of the `tools` module so that it supports tests
    that rely on the `@ClusterTests` annotation.
5. Improved error handling when an exception different from `TerseFailure` is
    thrown.
6. Changed `ToolsUtils` to avoid usage of arrays in favor of `List`.
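
Item 2 above, capturing stdout from tests, can be sketched with plain JDK classes as shown below. The class and method names are hypothetical and this is not the helper added by the PR; it only illustrates the general technique of swapping System.out for an in-memory stream and restoring it afterwards.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; not the helper introduced by the PR.
final class CaptureStdoutSketch {
    // Runs the body with System.out redirected to a buffer and returns
    // whatever was printed. The original stream is always restored and
    // is never closed.
    static String captureStandardOut(Runnable body) {
        PrintStream original = System.out;
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (PrintStream capture = new PrintStream(buffer, true, StandardCharsets.UTF_8.name())) {
            System.setOut(capture);
            body.run();
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        } finally {
            System.setOut(original);
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String out = captureStandardOut(() -> System.out.print("hello"));
        System.out.println("captured: " + out); // captured: hello
    }
}
```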

Reviewers: dengziming <dengziming1993@gmail.com>
2022-12-09 09:22:58 -08:00
runom b8754c074a
KAFKA-14355: Fix integer overflow in ProducerPerformance (#12822)
Change types from int to long to avoid overflow
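
To illustrate why widening to long helps, here is a small sketch of the overflow. The class, method names and numbers are hypothetical and are not the fields used in ProducerPerformance.

```java
// Hypothetical sketch of int overflow versus long arithmetic.
final class OverflowSketch {
    // int multiplication wraps around silently once the product exceeds
    // Integer.MAX_VALUE (2147483647).
    static int totalBytesInt(int numRecords, int recordSize) {
        return numRecords * recordSize;
    }

    // long arithmetic has enough range for the same product.
    static long totalBytesLong(long numRecords, long recordSize) {
        return numRecords * recordSize;
    }

    public static void main(String[] args) {
        System.out.println(totalBytesInt(3_000_000, 1_000));  // -1294967296
        System.out.println(totalBytesLong(3_000_000, 1_000)); // 3000000000
    }
}
```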

Reviewers: Luke Chen <showuon@gmail.com>, Igor Soarez <soarez@apple.com>
2022-11-05 20:19:08 +08:00
Jason Gustafson c1c639db77
KAFKA-13288; Include internal topics when searching hanging transactions (#11319)
This patch ensures that internal topics are included when searching for hanging transactions with the `--broker-id` argument in `kafka-transactions.sh`.

Reviewers: David Jacot <djacot@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
2021-09-10 14:33:37 -07:00
Yanwen(Jason) Lin 66a27af2f1
KAFKA-10038: Supports default client.id for ConsoleConsumer, ProducerPerformance, ConsumerPerformance (#11297)
Reviewers: Guozhang Wang <wangguoz@gmail.com>
2021-09-07 13:49:50 -07:00
dengziming 1d22b0d706
KAFKA-10774; Admin API for Describe topic using topic IDs (#9769)
Reviewers: Justine Olshan <jolshan@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>, Satish Duggana <satishd@apache.org>, Rajini Sivaram <rajinisivaram@googlemail.com>
2021-08-28 09:00:36 +01:00
Jason Gustafson f29c43bdbb
KAFKA-12979; Implement command to find hanging transactions (#10974)
This patch implements the `find-hanging` command described in KIP-664: https://cwiki.apache.org/confluence/display/KAFKA/KIP-664%3A+Provide+tooling+to+detect+and+abort+hanging+transactions#KIP664:Providetoolingtodetectandaborthangingtransactions-FindingHangingTransactions.

Reviewers: Luke Chen <showuon@gmail.com>, David Jacot <djacot@confluent.io>
2021-07-06 10:39:59 -07:00
Jason Gustafson fce771579c
KAFKA-12888; Add transaction tool from KIP-664 (#10814)
This patch adds the transaction tool specified in KIP-664: https://cwiki.apache.org/confluence/display/KAFKA/KIP-664%3A+Provide+tooling+to+detect+and+abort+hanging+transactions. This includes all of the logic for describing transactional state and for aborting transactions. The only thing that is left out is the `--find-hanging` implementation, which will be left for a subsequent patch.

Reviewers: Boyang Chen <boyang@apache.org>, David Jacot <djacot@confluent.io>
2021-06-22 09:47:30 -07:00
CHUN-HAO TANG 580c111258
KAFKA-12662: add unit test for ProducerPerformance (#10588)
Reviewers: Luke Chen <showuon@gmail.com>, wenbingshen <oliver.shen999@gmail.com>, dengziming <dengziming1993@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2021-06-17 20:07:12 +08:00
Shay Elkin fc405d792d
Minor: Move trogdor out of tools and into its own gradle module (#10539)
Move Trogdor out of tools and into its own gradle module.  This allows us to minimize
the dependencies of the tools module.  We still keep Trogdor in the CLASSPATH
created by kafka-run-class.sh.

Reviewers: Colin P. McCabe <cmccabe@apache.org>
2021-04-15 11:37:15 -07:00
Scott Hendricks baef516789
Add ConfigurableProducerSpec to Trogdor for improved E2E latency tracking. (#9736)
Reviewer: Colin P. McCabe <cmccabe@apache.org>
2020-12-18 13:03:59 -08:00
Ismael Juma 7d0086e0c3
KAFKA-10447: Migrate tools module to JUnit 5 (#9231)
This change sets the groundwork for migrating other modules incrementally.

Main changes:
- Replace `junit` 4.13 with `junit-jupiter` and `junit-vintage` 5.7.0-RC1.
- All modules except for `tools` depend on `junit-vintage`.
- `tools` depends on `junit-jupiter`.
- Convert `tools` tests to JUnit 5.
- Update `PushHttpMetricsReporterTest` to use `mockito` instead of `powermock` and `easymock`
(powermock doesn't seem to work well with JUnit 5 and we don't need it since mockito can mock
static methods).
- Update `mockito` to 3.5.7.
- Update `TestUtils` to use JUnit 5 assertions since `tools` depends on it.

Unrelated clean-ups:
- Remove `unit` from package names in a few `core` tests.
- Replace `try/catch/fail` with `assertThrows` in a number of places.
- Tag `CoordinatorTest` as integration test.
- Remove unnecessary type parameters when invoking methods and constructors.

Tested with IntelliJ and gradle. Verified that the following commands work as expected:
* ./gradlew tools:unitTest
* ./gradlew tools:integrationTest
* ./gradlew tools:test
* ./gradlew core:unitTest
* ./gradlew core:integrationTest
* ./gradlew clients:test

Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
2020-09-10 16:14:38 -07:00
Karan Kumar c8d97c6d51
KAFKA-9375: Add names to all Connect threads (#7901)
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Ryanne Dolan <ryannedolan@gmail.com>, gcsaba2
2020-01-31 18:21:21 +00:00
jolshan 2c2b30d96b MINOR: Add RandomComponentPayloadGenerator and update Trogdor documentation (#7103)
Add a new RandomComponentPayloadGenerator that gives a payload based on random selection of another PayloadGenerator.  Additionally, add an example that uses a non-default PayloadGenerator configuration to TROGDOR.md.

Reviewers: Colin P. McCabe <cmccabe@apache.org>
2019-07-31 14:00:49 -07:00
jolshan 442d36241b MINOR: add useConfiguredPartitioner and skipFlush options for ProduceBench
Add a "useConfiguredPartitioner" boolean to specify testing with the configured partitioner, rather than overriding the partitioner in the test.

Add a "skipFlush" boolean to specify skipping the flush operation when producing.  This is helpful when testing some scenarios where linger.ms is greater than 0.

Reviewers: Colin P. McCabe <cmccabe@apache.org>
2019-07-03 17:23:36 -07:00
Colin Patrick McCabe 822abe47db
MINOR: WorkerUtils#topicDescriptions must unwrap exceptions properly (#6937)
Reviewers: Ismael Juma <ismael@juma.me.uk>, Stanislav Kozlovski <stanislav_kozlovski@outlook.com>
2019-07-03 16:08:39 -07:00
Stanislav Kozlovski 58aa04f91e MINOR: Improve Trogdor external command worker docs (#6438)
Reviewers: Colin McCabe <cmccabe@apache.org>, Xi Yang <xi@confluent.io>
2019-06-06 10:04:05 -07:00
Stanislav Kozlovski 0d55f0f3ec KAFKA-8102: Add an interval-based Trogdor transaction generator (#6444)
This patch adds a TimeIntervalTransactionsGenerator class which enables the Trogdor ProduceBench worker to commit transactions based on a configurable millisecond time interval.

Also, we now handle 409 create task responses in the coordinator command-line client by printing a more informative message.

Reviewers: Colin P. McCabe <cmccabe@apache.org>
2019-03-25 09:58:11 -07:00
Rajini Sivaram ca6ac9393b
MINOR: Retain public constructors of classes from public API (#6455)
TopicDescription and ConsumerGroupDescription in org.apache.kafka.clients.admin are part of the public API, so we should retain the existing public constructors. Changed the new constructors with authorized operations to be package-private to avoid maintaining more public constructors, since we only expect the admin client to use them.

Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
2019-03-18 08:51:50 +00:00
Manikumar Reddy a42f16f980 KAFKA-7922: Return authorized operations in Metadata request response (KIP-430 Part-2)
-  Use automatic RPC generation in Metadata Request/Response classes
-  https://cwiki.apache.org/confluence/display/KAFKA/KIP-430+-+Return+Authorized+Operations+in+Describe+Responses

Author: Manikumar Reddy <manikumar.reddy@gmail.com>

Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>

Closes #6352 from omkreddy/KIP-430-METADATA
2019-03-10 17:30:16 +05:30
Colin Patrick McCabe 4be68c58da
KAFKA-7828: Add ExternalCommandWorker to Trogdor (#6219)
Allow the Trogdor agent to execute external commands. The agent communicates with the external commands via stdin, stdout, and stderr.

Based on a patch by Xi Yang <xi@confluent.io>

Reviewers: David Arthur <mumrah@gmail.com>
2019-02-06 16:42:02 -08:00
Colin Patrick McCabe a79d6dcdb6
KAFKA-7793: Improve the Trogdor command line. (#6133)
* Allow the Trogdor agent to be started in "exec mode", where it simply
runs a single task and exits after it is complete.

* For AgentClient and CoordinatorClient, allow the user to pass the path
to a file containing JSON, instead of specifying the JSON object in the
command-line text itself.  This means that we can get rid of the bash
scripts whose only function was to load task specs into a bash string
and run a Trogdor command.

* Print dates and times in a human-readable way, rather than as numbers
of milliseconds.

* When listing tasks or workers, output human-readable tables of
information.

* Allow the user to filter on task ID name, task ID pattern, or task
state.

* Support a --json flag to provide raw JSON output if desired.

Reviewed-by: David Arthur <mumrah@gmail.com>, Stanislav Kozlovski <stanislav_kozlovski@outlook.com>
2019-01-24 09:26:51 -08:00
Stanislav Kozlovski 2e53fa08af KAFKA-7792: Add simple /agent/uptime and /coordinator/uptime health check endpoints (#6130)
Reviewed-by: Colin P. McCabe <cmccabe@apache.org>
2019-01-15 11:52:48 -08:00
Stanislav Kozlovski 13f679013a MINOR: Update Trogdor StringExpander regex to handle an epilogue (#6123)
Update the Trogdor StringExpander regex to handle an epilogue.  Previously the regex would use a lazy quantifier at the end, which meant it would not catch anything after the range expression.  Add a unit test.
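
The effect of a lazy trailing quantifier can be sketched as below. This is an illustration only: the class name, both patterns, and the `[start-end]` range syntax are assumptions made for the example, not the actual Trogdor StringExpander code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration; not the actual Trogdor StringExpander.
final class RangeExpanderSketch {
    // Unanchored pattern with a lazy trailing group: with find(), the last
    // group matches as little as possible, so it is always empty.
    static final Pattern LAZY_TAIL =
        Pattern.compile("(.*?)\\[([0-9]+)-([0-9]+)\\](.*?)");

    // Anchored pattern with a greedy trailing group: the text after the
    // range expression (the epilogue) is captured in group 4.
    static final Pattern WITH_EPILOGUE =
        Pattern.compile("^(.*?)\\[([0-9]+)-([0-9]+)\\](.*)$");

    // Returns whatever the pattern captured after the range, or null if no match.
    static String tail(Pattern pattern, String input) {
        Matcher m = pattern.matcher(input);
        return m.find() ? m.group(4) : null;
    }

    // Expands "node[1-3].example" into node1.example, node2.example, node3.example.
    // Input without a range expression is returned unchanged.
    static List<String> expand(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = WITH_EPILOGUE.matcher(input);
        if (!m.matches()) {
            out.add(input);
            return out;
        }
        int start = Integer.parseInt(m.group(2));
        int end = Integer.parseInt(m.group(3));
        for (int i = start; i <= end; i++) {
            out.add(m.group(1) + i + m.group(4));
        }
        return out;
    }
}
```

With the lazy pattern the text after the closing bracket is dropped. The anchored greedy pattern keeps it, so the epilogue is appended to every expanded name.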

Reviewed-by: Colin P. McCabe <cmccabe@apache.org>
2019-01-14 20:49:24 -08:00
Stanislav Kozlovski 625e0d8829 KAFKA-7790: Fix Bugs in Trogdor Task Expiration (#6103)
The Trogdor Coordinator now overwrites a task's startMs to the time it received it if startMs is in the past.

The Trogdor Agent now correctly expires a task after the expiry time (startMs + durationMs) passes. Previously, it would ignore startMs and expire the task durationMs milliseconds after its local start.
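
The two timing rules above can be sketched as follows. The class and method names are hypothetical, and treating the exact expiry instant as already expired is an assumption of this sketch rather than something the commit states.

```java
// Hypothetical sketch of the timing rules described above; names are illustrative.
final class TaskTimingSketch {
    // Coordinator side: a start time in the past is replaced by the time
    // the coordinator received the task.
    static long effectiveStartMs(long requestedStartMs, long receivedAtMs) {
        return Math.max(requestedStartMs, receivedAtMs);
    }

    // Agent side: expiry is anchored to startMs, not to the moment the
    // worker happened to start locally.
    static long expiryMs(long startMs, long durationMs) {
        return startMs + durationMs;
    }

    // Assumption: the task counts as expired once the clock reaches the expiry time.
    static boolean isExpired(long nowMs, long startMs, long durationMs) {
        return nowMs >= expiryMs(startMs, durationMs);
    }
}
```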

Reviewed-by: Colin P. McCabe <cmccabe@apache.org>
2019-01-11 13:38:00 -08:00
Srinivas Reddy 85906d3d2b MINOR: Switch anonymous classes to lambda expressions in tools module
Switch to lambdas wherever possible instead of anonymous classes
in the tools module

Author: Srinivas Reddy <srinivas96alluri@gmail.com>
Author: Srinivas Reddy <mrsrinivas@users.noreply.github.com>

Reviewers: Ryanne Dolan <ryannedolan@gmail.com>, Ismael Juma <ismael@juma.me.uk>, Manikumar Reddy <manikumar.reddy@gmail.com>

Closes #6013 from mrsrinivas/tools-switch-to-java8
2018-12-21 14:20:57 +05:30
Stanislav Kozlovski 9368743b8f KAFKA-7597: Add transaction support to ProduceBenchWorker (#5885)
KAFKA-7597: Add configurable transaction support to ProduceBenchWorker.  In order to get support for serializing Optional<> types to JSON, add a new library: jackson-datatype-jdk8. Once Jackson 3 comes out, this library will not be needed.

Reviewers: Colin McCabe <cmccabe@apache.org>, Ismael Juma <ismael@juma.me.uk>
2018-11-27 12:49:53 -08:00
Stanislav Kozlovski 7fadf0a11d Trogdor: Add Task State filter to /coordinator/tasks endpoint (#5907)
Reviewers: Colin McCabe <cmccabe@apache.org>
2018-11-26 16:46:58 -08:00
Stanislav Kozlovski 8259fda695 KAFKA-7514: Add threads to ConsumeBenchWorker (#5864)
Add threads with separate consumers to ConsumeBenchWorker.  Update the Trogdor test scripts and documentation with the new functionality.

Reviewers: Colin McCabe <cmccabe@apache.org>
2018-11-13 08:38:42 -08:00
Ismael Juma 12f310d50e
KAFKA-7612: Fix javac warnings and enable warnings as errors (#5900)
- Use Xlint:all with 3 exclusions (filed KAFKA-7613 to remove the exclusions)
- Use the same javac options when compiling tests (seems accidental that
we didn't do this before)
- Replaced several deprecated method calls with non-deprecated ones:
  - `KafkaConsumer.poll(long)` and `KafkaConsumer.close(long)`
  - `Class.newInstance` and `new Integer/Long` (deprecated since Java 9)
  - `scala.Console` (deprecated in Scala 2.11)
  - `PartitionData` taking a timestamp (one of them seemingly a bug)
  - `JsonMappingException` single parameter constructor
- Fix unnecessary usage of raw types in several places.
- Add @SuppressWarnings for deprecations, unchecked and switch fallthrough in
several places.
- Scala clean-ups (var -> val, ETA expansion warnings, avoid reflective calls)
- Use lambdas to simplify code in a few places
- Add @SafeVarargs, fix varargs usage and remove unnecessary `Utils.mkList` method

Reviewers: Matthias J. Sax <mjsax@apache.org>, Manikumar Reddy <manikumar.reddy@gmail.com>, Randall Hauch <rhauch@gmail.com>, Bill Bejeck <bill@confluent.io>, Stanislav Kozlovski <stanislav_kozlovski@outlook.com>
2018-11-12 22:18:59 -08:00
Stanislav Kozlovski ecb71cf471 KAFKA-7564: Expose single task details in Trogdor (#5852)
This commit adds a new "/coordinator/tasks/{taskId}" endpoint which fetches details for a single task.
2018-11-09 10:31:04 -08:00
Dong Lin df0faee097 KAFKA-7560; PushHttpMetricsReporter should not convert metric value to double
Currently PushHttpMetricsReporter converts the value from KafkaMetric.metricValue() to double. This does not work for non-numerical metrics such as the version in AppInfoParser, whose value can be a string. This caused failures in PushHttpMetricsReporter, which in turn caused the system test kafkatest.tests.client.quota_test.QuotaTest.test_quota to fail.

Since a metric value may be any object, PushHttpMetricsReporter should also read the metric value as an object and pass it to the HTTP server.

Author: Dong Lin <lindong28@gmail.com>

Reviewers: Manikumar Reddy O <manikumar.reddy@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #5886 from lindong28/KAFKA-7560
2018-11-07 08:04:29 -08:00
Stanislav Kozlovski d28c534819 KAFKA-7515: Trogdor - Add Consumer Group Benchmark Specification (#5810)
The ConsumeBenchWorker now supports using consumer groups.  The groups may be used either to store offsets or as subscriptions.
2018-10-29 10:51:07 -07:00
Colin Patrick McCabe 089d1b154f MINOR: Add topic config to PartitionsSpec (#5523)
Reviewers: Bob Barrett <bob.barrett@outlook.com>, Ismael Juma <ismael@juma.me.uk>
2018-10-05 11:45:50 -07:00
Ismael Juma 7a74ec62d2
MINOR: Avoid FileInputStream/FileOutputStream (#5281)
They rely on finalizers (before Java 11), which create
unnecessary GC load. The alternatives are as easy to
use and don't have this issue.

Also use FileChannel directly instead of retrieving
it from RandomAccessFile whenever possible
since the indirection is unnecessary.

Finally, add a few try/finally blocks.
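
A minimal sketch of the alternatives mentioned above, using `java.nio.file.Files` and `FileChannel.open` from the JDK. The class and helper names are hypothetical, and `InputStream.readAllBytes` assumes Java 9 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper class; only the JDK calls are real APIs.
final class NioFileSketch {
    // Files.newOutputStream replaces new FileOutputStream(file).
    static void write(Path path, String text) throws IOException {
        try (OutputStream out = Files.newOutputStream(path)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }
    }

    // Files.newInputStream replaces new FileInputStream(file).
    static String read(Path path) throws IOException {
        try (InputStream in = Files.newInputStream(path)) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8); // Java 9+
        }
    }

    // FileChannel.open avoids going through RandomAccessFile.getChannel().
    static long size(Path path) throws IOException {
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
            return channel.size();
        }
    }

    // Writes text to a temporary file, reads it back, and always cleans up.
    static String roundTrip(String text) {
        try {
            Path path = Files.createTempFile("nio-sketch", ".txt");
            try {
                write(path, text);
                return read(path);
            } finally {
                Files.deleteIfExists(path);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```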

Reviewers: Colin Patrick McCabe <colin@cmccabe.xyz>, Rajini Sivaram <rajinisivaram@googlemail.com>
2018-06-27 01:00:05 -07:00
Colin Patrick McCabe 8577632b3a MINOR: Fix Trogdor tests, partition assignments (#4892)
2018-04-29 15:54:38 +01:00
Colin Patrick McCabe 93e03414f7 KAFKA-6771. Make specifying partitions more flexible (#4850)
2018-04-16 08:55:13 +01:00
Colin Patrick McCabe 832b096f4f KAFKA-6696 Trogdor should support destroying tasks (#4759)
Implement destroying tasks and workers.  This means erasing all record of them on the Coordinator and the Agent.

Workers should be identified by unique 64-bit worker IDs, rather than by the names of the tasks they are implementing.  This ensures that when a task is destroyed and re-created with the same task ID, the old workers will not be treated as part of the new task instance.

Fix some return results from RPCs.  In some cases RPCs were returning values that were never used.  Attempting to re-create the same task ID with different arguments should fail.  Add RequestConflictException to represent HTTP error code 409 (CONFLICT) for this scenario.

If only one worker in a task stops, don't stop all the other workers for that task, unless the worker that stopped had an error.

Reviewers: Anna Povzner <anna@confluent.io>, Rajini Sivaram <rajinisivaram@googlemail.com>
2018-04-16 08:51:33 +01:00
Colin Patrick McCabe 4223ef6106 MINOR: Add NullPayloadGenerator to Trogdor (#4844)
2018-04-10 20:48:38 +01:00
Anna Povzner 989fe0497e KAFKA-6693: Added consumer workload to Trogdor (#4775)
Added a consumer-only workload to Trogdor. The topics must already be populated. The spec lets the user specify a topic pattern and the range of partitions to assign, [startPartition, endPartition].

Reviewers: Colin P. Mccabe <cmccabe@confluent.io>, Rajini Sivaram <rajinisivaram@googlemail.com>
2018-04-10 09:45:08 +01:00
Colin Patrick McCabe 40183e3156 KAFKA-6688. The Trogdor coordinator should track task statuses (#4737)
Reviewers: Anna Povzner <anna@confluent.io>, Rajini Sivaram <rajinisivaram@googlemail.com>
2018-04-08 09:35:33 +01:00
Anna Povzner da32db9f34 Trogdor: Added commonClientConf and adminClientConf to workload specs (#4757)
Currently, WorkerUtils can create topics only when there is no security. To work with secure Kafka, WorkerUtils.createTopic() needs to be able to take security configs. This PR adds a commonClientConf field to both the producer bench and roundtrip workload specs so that users can specify security and other common configs once for the producer/consumer and adminClient. Also added an adminClientConf field to the workload specs so that users can specify adminClient-specific configs if they want to. For completeness, added consumerConf and producerConf to the roundtrip workload spec.

Reviewers: Ismael Juma <ismael@juma.me.uk>, Colin P. Mccabe <cmccabe@confluent.io>, Rajini Sivaram <rajinisivaram@googlemail.com>
2018-04-06 19:21:41 +01:00
Colin Patrick McCabe 63642d6051 KAFKA-6694: The Trogdor Coordinator should support filtering task responses (#4741)
2018-04-05 13:35:20 +01:00
Anna Povzner 5c24295d44 Trogdor's ProducerBench does not fail if topics exist (#4673)
Added configs to ProducerBenchSpec:
topicPrefix: name of topics will be of format topicPrefix + topic index. If not provided, default is "produceBenchTopic".
partitionsPerTopic: number of partitions per topic. If not provided, default is 1.
replicationFactor: replication factor per topic. If not provided, default is 3.

The behavior of producer bench is changed such that if some or all topics already exist (with topic names = topicPrefix + topic index), and they have the same number of partitions as requested, the worker uses those topics and does not fail. The producer bench fails if one or more existing topics have a number of partitions different from the expected number.
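
The check described above can be sketched as follows. This is a simplified illustration: the class and method names are hypothetical, the existing topics are modeled as a plain map from topic name to partition count instead of an admin client lookup, and starting the topic index at 0 is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not the actual WorkerUtils implementation.
final class TopicCheckSketch {
    // Topic names follow the topicPrefix + topic index scheme (index from 0 assumed).
    static List<String> topicNames(String topicPrefix, int count) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            names.add(topicPrefix + i);
        }
        return names;
    }

    // Returns the topics that still have to be created. Existing topics with the
    // requested partition count are reused; a mismatch fails the whole check.
    static List<String> topicsToCreate(List<String> wanted,
                                       int partitionsPerTopic,
                                       Map<String, Integer> existing) {
        List<String> missing = new ArrayList<>();
        for (String topic : wanted) {
            Integer actual = existing.get(topic);
            if (actual == null) {
                missing.add(topic);
            } else if (actual != partitionsPerTopic) {
                throw new IllegalStateException("Topic " + topic + " has " + actual
                    + " partitions, expected " + partitionsPerTopic);
            }
        }
        return missing;
    }
}
```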

Added unit test for WorkerUtils -- for existing methods and new methods.

Fixed a bug in MockAdminClient, where createTopics() would overwrite an existing topic's replication factor and number of partitions while correctly completing the appropriate futures exceptionally with TopicExistsException.

Reviewers: Colin P. Mccabe <cmccabe@confluent.io>, Rajini Sivaram <rajinisivaram@googlemail.com>
2018-03-20 13:51:45 +00:00
Colin Patrick McCabe a70e4f95d7 KAFKA-6658; Fix RoundTripWorkload and make k/v generation configurable (#4710)
Make PayloadGenerator an interface which can have multiple implementations: constant, uniform random, sequential.

Allow different payload generators to be used for keys and values.

This change fixes RoundTripWorkload.  Previously RoundTripWorkload was unable to get the sequence number of the keys that it produced.
2018-03-16 16:15:49 -07:00
Colin Patrick McCabe 9e0e6e43a7 MINOR: Trogdor should not assume an agent co-located with the controller (#4712)
2018-03-16 17:57:38 +00:00
Colin Patrick McCabe 8c10e06007 MINOR: Avoid nulls when deserializing Trogdor JSON (#4688)
2018-03-15 11:44:27 +00:00
Colin Patrick McCabe bf8a4c2ce7 MINOR: Improve Trogdor client logging. (#4675)
AgentClient and CoordinatorClient should have the option of logging failures to custom log4j objects.  There should also be builders for these objects, to make them easier to extend in the future.

Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
2018-03-14 10:12:15 +00:00
Anna Povzner f1c112c63d MINOR: Add PayloadGenerator to Trogdor (#4640)
It generates the producer payload (key and value) and makes sure that the values are
populated to target a realistic compression rate (0.3 - 0.4) if compression is used.
The generated payload is deterministic and can be replayed from a given position.
For now, all generated values are constant size, and key types can be configured
to be either null or 8 bytes.

Added messageSize parameter to producer spec, that specifies produced
key + message size.
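
The deterministic, replayable property can be sketched as below. Everything here is hypothetical: the class name, the seeding scheme, and the key encoding are assumptions for illustration, and the sketch makes no attempt to hit the compression rate mentioned above.

```java
import java.nio.ByteBuffer;
import java.util.Random;

// Hypothetical sketch of a position-addressable payload generator.
final class PayloadSketch {
    private final long seed;
    private final int valueSize;

    PayloadSketch(long seed, int valueSize) {
        this.seed = seed;
        this.valueSize = valueSize;
    }

    // 8-byte key that encodes the position (a null key is the other option
    // mentioned in the commit and is not modeled here).
    byte[] key(long position) {
        return ByteBuffer.allocate(Long.BYTES).putLong(position).array();
    }

    // Constant-size value derived only from the seed and the position, so any
    // position can be regenerated without replaying the ones before it.
    byte[] value(long position) {
        byte[] value = new byte[valueSize];
        new Random(seed + position).nextBytes(value);
        return value;
    }
}
```

Because the value depends only on (seed, position), a consumer that knows both can recompute the expected bytes for a given record.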
2018-03-09 13:57:04 -08:00
Romain Hardouin a7e49027b2 MINOR: Catch JsonMappingException subclass (#3821)
Handle InvalidTypeIdException as NOT_IMPLEMENTED and add unit tests for all exceptions.

Reviewers: Colin P. Mccabe <cmccabe@confluent.io>, Rajini Sivaram <rajinisivaram@googlemail.com>
2018-01-08 12:14:25 +00:00
Colin P. Mccabe 760d86a970 KAFKA-5849; Add process stop, round trip workload, partitioned test
* Implement process stop faults via SIGSTOP / SIGCONT

* Implement RoundTripWorkload, which both sends messages, and confirms that they are received at least once.

* Allow Trogdor tasks to block until other Trogdor tasks are complete.

* Add CreateTopicsWorker, which can be a building block for a lot of tests.

* Simplify how TaskSpec subclasses in ducktape serialize themselves to JSON.

* Implement some fault injection tests in round_trip_workload_test.py

Author: Colin P. Mccabe <cmccabe@confluent.io>

Reviewers: Ismael Juma <ismael@juma.me.uk>, Rajini Sivaram <rajinisivaram@googlemail.com>

Closes #4323 from cmccabe/KAFKA-5849
2017-12-20 21:35:33 +00:00
Colin P. Mccabe 58877a0dea KAFKA-6255; Add ProduceBench to Trogdor
Author: Colin P. Mccabe <cmccabe@confluent.io>

Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>

Closes #4245 from cmccabe/KAFKA-6255
2017-11-28 22:09:55 +00:00
Colin P. Mccabe d9cbc6b1a2 KAFKA-5811; Add Kibosh integration for Trogdor and Ducktape
For ducktape: add Kibosh to the testing Dockerfile.
Create files_unreadable_fault_spec.py.

For trogdor: create FilesUnreadableFaultSpec.java.
Add a unit test of using the Kibosh service.

Author: Colin P. Mccabe <cmccabe@confluent.io>

Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>

Closes #4195 from cmccabe/KAFKA-5811
2017-11-16 17:59:24 +00:00
Ewen Cheslack-Postava 718dda1144 MINOR: Add HttpMetricsReporter for system tests
Author: Ewen Cheslack-Postava <me@ewencp.org>

Reviewers: Apurva Mehta <apurva@confluent.io>, Ismael Juma <ismael@juma.me.uk>

Closes #4072 from ewencp/http-metrics
2017-11-09 09:42:46 -08:00
Colin P. Mccabe 4fac83ba1f KAFKA-6060; Add workload generation capabilities to Trogdor
Previously, Trogdor only handled "Faults."  Now, Trogdor can handle
"Tasks" which may be either faults, or workloads to execute in the
background.

The Agent and Coordinator have been refactored from a
mutexes-and-condition-variables paradigm into a message passing
paradigm.  No locks are necessary, because only one thread can access
the task state or worker state.  This makes them a lot easier to reason
about.

The MockTime class can now handle mocking deferred message passing
(adding a message to an ExecutorService with a delay).  I added a
MockTimeTest.

MiniTrogdorCluster now starts up Agent and Coordinator classes in
parallel in order to minimize junit test time.

RPC messages now inherit from a common Message.java class.  This class
handles implementing serialization, equals, hashCode, etc.

Remove FaultSet, since it is no longer necessary.

Previously, if CoordinatorClient or AgentClient hit a networking
problem, they would throw an exception.  They now retry several times
before giving up.  Additionally, the REST RPCs to the Coordinator and
Agent have been changed to be idempotent.  If a response is lost, and
the request is resent, no harm will be done.
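
Why retries and idempotent requests go together can be sketched as follows. The names are hypothetical and the task store is an in-memory map, not the Coordinator's REST layer; the number of tries is also an assumption.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical sketch; not the actual CoordinatorClient or AgentClient code.
final class RetrySketch {
    private final Map<String, String> tasks = new HashMap<>();

    // Idempotent create: sending the same request again leaves the first
    // registration in place, so a resent request does no harm.
    void createTask(String id, String spec) {
        tasks.putIfAbsent(id, spec);
    }

    int taskCount() {
        return tasks.size();
    }

    // Calls the request up to maxTries times and rethrows the last failure
    // once the tries are used up.
    static <T> T callWithRetries(int maxTries, Supplier<T> request) {
        if (maxTries < 1) {
            throw new IllegalArgumentException("maxTries must be at least 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxTries; attempt++) {
            try {
                return request.get();
            } catch (RuntimeException e) {
                last = e; // remember the failure and try again
            }
        }
        throw last;
    }
}
```

If a response is lost after the server has already applied the request, the retry sends the same create again. With `putIfAbsent` the second call is a no-op, which is the property the commit relies on.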

Author: Colin P. Mccabe <cmccabe@confluent.io>

Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>, Ismael Juma <ismael@juma.me.uk>

Closes #4073 from cmccabe/KAFKA-6060
2017-11-03 09:37:29 +00:00
Colin P. Mccabe 4065ffb3e1 KAFKA-5777; Add ducktape integration for Trogdor
Author: Colin P. Mccabe <cmccabe@confluent.io>

Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>

Closes #3726 from cmccabe/KAFKA-5777
2017-09-07 13:23:03 +01:00
Colin P. Mccabe 0772fde562 KAFKA-5776; Add the Trogdor fault injection daemon
Author: Colin P. Mccabe <cmccabe@confluent.io>

Reviewers: Ismael Juma <ismael@juma.me.uk>, Rajini Sivaram <rajinisivaram@googlemail.com>

Closes #3699 from cmccabe/trogdor-review
2017-08-25 12:29:40 -07:00