Commit Graph

14480 Commits

Author SHA1 Message Date
Alieh Saeedi 1b02873511
KAFKA-17125 Add integration test for StreamsGroup in Admin API (#18911)
Integration test for both `--list` and `--describe` commands.
2025-02-21 16:27:00 +01:00
Bill Bejeck 31545eb614
Add integration test for IQv2 in KIP-1071 (#18856)
* Merging the refactoring to add standby tasks to the heartbeat response
added test

Add stability check and refactor partition assignments

Introduces a `membersStable` method in `StreamsGroup` to ensure group members are in a stable state before partition assignments. Refactors `EndpointToPartitionsManager` to streamline input topic handling and updates related tests for clarity and correctness. Adds integration test for validating partition assignment logic with multiple configurations.

* Refactor IQv2 metadata handling and endpoint partition mapping.

Streamlined metadata validation logic in IQv2 integration tests and removed redundant code. Simplified `maybeBuildEndpointToPartitions` by eliminating unnecessary stability checks, improving clarity and maintainability of endpoint-to-partition mapping.

* Remove unused `membersStable` method from StreamsGroup

* Refactor and enhance IQv2 test metadata validation.

Replaced redundant variable names and improved metadata handling for active and standby tasks in the IQv2 test. Added validations for state store names using `EXPECTED_STORE_NAME` and organized metadata assertions to enhance clarity and testing coverage. These changes improve maintainability and ensure better validation of stream metadata.

* Add inspection of standby topics
2025-02-18 11:25:13 -05:00
Bill Bejeck 138c2a211a
Refactor heartbeat response to include standbys (#18827)
* Adding standby tasks heartbeat response

* Refactoring to add standby tasks to the heartbeat response
added test

* Addressing comments plus some minor checkstyle side cleanup
2025-02-11 10:32:20 -05:00
Alieh Saeedi f7b303eff3
Implement kafka-streams-groups.sh --describe (#18231)
Implement --describe and its options: (--state, --offset, --members and the combination of them with --verbose)

Reviewers: Lucas Brutschy <lbrutschy@confluent.io>
2025-01-23 10:35:13 +01:00
Lucas Brutschy c64fa8d5cd
Fix handling of reordered records, add replay testing (#18217)
Replaying records from the offset topic has several corner
cases that we need to take care of. In particular, due to
compaction it can happen that we only see the tombstone
record of a member or the group, but not the original record
of the group. Similarly, if the group metadata record is
updated after the last update to the current assignment
record of a member, the records will be replayed out of order,
that is, we first replay the current assignment of the member,
and then create the group.

Therefore, we need to change the code to always create groups
when replaying any kind of records (instead of failing, as before).
Also, tombstones of non-existing groups are ignored.
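
The replay rule described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `ReplaySketch`, `Group`, and both `replay*` methods are hypothetical names, records are reduced to plain strings, and a `null` value stands in for a tombstone. It does not mirror the coordinator's real classes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of order-tolerant replay; all names are illustrative
// and do not mirror the group coordinator's real classes.
class ReplaySketch {

    // Minimal stand-in for the replayed state of one streams group.
    static final class Group {
        String metadata; // stays null until a group metadata record is replayed
        final Map<String, String> currentAssignments = new HashMap<>(); // memberId -> assignment
    }

    final Map<String, Group> groups = new HashMap<>();

    // A null value models a tombstone.
    void replayGroupMetadata(String groupId, String metadata) {
        if (metadata == null) {
            groups.remove(groupId); // a tombstone of a non-existing group is simply ignored
            return;
        }
        groups.computeIfAbsent(groupId, id -> new Group()).metadata = metadata;
    }

    void replayCurrentAssignment(String groupId, String memberId, String assignment) {
        if (assignment == null) {
            Group group = groups.get(groupId);
            if (group != null) {
                group.currentAssignments.remove(memberId);
            }
            return; // tombstone for an unknown group or member: nothing to do
        }
        // Create the group on demand instead of failing when it is not known yet.
        groups.computeIfAbsent(groupId, id -> new Group())
              .currentAssignments.put(memberId, assignment);
    }

    public static void main(String[] args) {
        ReplaySketch state = new ReplaySketch();
        // Out-of-order replay: the member's assignment arrives before the group record.
        state.replayCurrentAssignment("app", "member-1", "tasks=[0_0]");
        state.replayGroupMetadata("app", "epoch=3");
        System.out.println(state.groups.get("app").currentAssignments);
    }
}
```

The point of `computeIfAbsent` here is that either record type may be the first one seen for a group, so neither path may assume the group already exists.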

This change also adds unit tests for all replay methods for streams
records.

We also piggyback some changes to streamsGroupHeartbeat:

 * the structure of the method is cleaned up a bit
 * request validation is extended
 * unit test for request validation is extended
2024-12-18 11:40:35 +01:00
Bruno Cadonna 6ec17cfc6d
Add warm-up tasks to Streams membership manager (#18214)
This commit adds warm-up tasks assignment to the
Streams membership manager.
2024-12-17 13:07:14 +01:00
Alieh Saeedi faf000be73
Implement kafka-streams-groups.sh --list (#18167)
Implement the core of kafka-streams-groups.sh
Implement --list and its options: (only --state)
2024-12-16 11:58:05 +01:00
Bruno Cadonna db1361d065
Add standby task assignment to Streams membership manager (#18077)
Until now the Streams membership manager only processed active tasks. This commit adds the capability to also handle assigned standby tasks.
2024-12-13 16:36:44 +01:00
Alieh Saeedi fb2eb8eb8b
Add unit test for StickyTaskAssignor (#18169)
This PR formulates a unit test that verifies the fix merged with PR #18051.
2024-12-13 15:34:15 +01:00
Lucas Brutschy c5d6f1aab5
Add describeStreamsGroup to Admin API (#18125)
Adds describeStreamsGroup to Admin API.

Also includes some minor clean-ups in the describe RPC.
2024-12-12 15:06:17 +01:00
Bill Bejeck dc545b2b6e MINOR: Added tests and side cleanup (#18069)
Adding a couple of tests and some side cleanup.
Co-authored-by: Bruno Cadonna <cadonna@apache.org>

Reviewers: Bruno Cadonna <cadonna@apache.org>, Lucas Brutschy <lucasbru@apache.org>
2024-12-09 12:48:04 -05:00
Bill f0262a7169 Fixes from trunk rebase 2024-12-06 16:22:17 -05:00
Alieh Saeedi b26b108862 KIP-1071: Fix StickyTaskAssignor (#18051)
Fixing Streams StickyTaskAssignor bug.
2024-12-06 16:22:16 -05:00
Lucas Brutschy 60f7c5af2f MINOR: Fix epoch validation (#18065)
We accidentally broke the epoch validation, throwing
on all epochs > 0. This is just the fix. We should improve
unit test coverage though (there is a ticket for it).
2024-12-06 16:22:16 -05:00
Bill Bejeck 4e1b29d8aa Add KIP-1082 (#18044)
This PR integrates KIP-1082 changes into the KIP-1071. The change centers around clients creating the member id on startup, vs. waiting for the GroupCoordinator to create the member id and return it in the response to initial heartbeat.

The changes were small and included some changes to the GroupMetadataManagerTest streams group related tests.
I'll add comments in-line regarding the updates. Overall, the impact can be best summarized by stating that certain actions will now happen one epoch earlier. For example, task assignment would previously happen on epoch 2 (bumping up from 1), since clients waited for the member id to be returned and the assignment occurred in the subsequent heartbeat request.

For testing, I updated the GroupMetadataManagerTest to get all tests passing.
2024-12-06 16:22:16 -05:00
Bruno Cadonna cf2586103e Use LeaveOnClose event with Streams membership manager (#18054)
The Streams membership manager did not use the LeaveOnClose event
to leave the group faster on close. With the event, the Streams
membership manager leaves the group without invoking any callback
for releasing the active tasks.

With this commit the Streams membership manager uses the LeaveOnClose
event.
2024-12-06 16:22:15 -05:00
Lucas Brutschy 9783c987f4 Implement support for streams groups in kafka-groups.sh (#18043)
Add support for streams groups in kafka-groups.sh, including a small integration test that spins up a Kafka Streams application and lists the group.
2024-12-06 16:22:15 -05:00
Lucas Brutschy dfc69c2301 MINOR: add endpoint to sync RPCs with KIP (#18035) 2024-12-06 16:22:15 -05:00
Lucas Brutschy 4af637c7e1 Add group.protocol config handling for streams (#18033)
- accept new streams group.protocol value
- do not forward to consumer clients
- log a message that the protocol is not for production use
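
As a rough illustration of the first two points, the sketch below sets `group.protocol` on a Streams-style configuration and strips it from the copy handed to an embedded consumer. The value `"streams"`, the class `GroupProtocolSketch`, its helper methods, and the sample application id and bootstrap address are assumptions made for this example. The real handling lives in the Streams configuration code and may look quite different.

```java
import java.util.Properties;

// Hypothetical sketch: accept group.protocol for Streams, but do not forward it to consumers.
class GroupProtocolSketch {
    static final String GROUP_PROTOCOL = "group.protocol";

    // Example application configuration; all values are placeholders.
    static Properties streamsConfig() {
        Properties props = new Properties();
        props.setProperty("application.id", "demo-app");
        props.setProperty("bootstrap.servers", "localhost:9092");
        props.setProperty(GROUP_PROTOCOL, "streams"); // assumed name of the new protocol value
        return props;
    }

    // Copy the configuration for an embedded consumer, dropping the Streams-only setting.
    static Properties consumerConfig(Properties streamsProps) {
        Properties consumerProps = new Properties();
        consumerProps.putAll(streamsProps);
        consumerProps.remove(GROUP_PROTOCOL);
        return consumerProps;
    }

    public static void main(String[] args) {
        Properties streams = streamsConfig();
        System.out.println("streams:  " + streams.getProperty(GROUP_PROTOCOL));
        System.out.println("consumer: " + consumerConfig(streams).getProperty(GROUP_PROTOCOL));
    }
}
```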
2024-12-06 16:22:14 -05:00
Bruno Cadonna ddfadf72ad Follow-ups to the integration of Streams membership manager (#18019) 2024-12-06 16:22:14 -05:00
Bill Bejeck ce9ce51e8e MINOR: Updates from running spotless apply (#18007) 2024-12-06 16:22:14 -05:00
Lucas Brutschy fb4a3e6fd7 Topology epochs and topic validation (#17721)
This PR implements two changes in the KIP, and related logic on client & server side.

We replace topology ID with topology epoch. We only include the basic changes to the schema and basic validation for now (fail if topology is changed without a topology epoch bump) - since topology updating will follow later on.
We add status codes for topic validation, and implement the required logic to validate the internal topics.
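
The basic validation rule (fail if the topology changes without an epoch bump) might look roughly like the sketch below. `TopologyEpochSketch` and `validate` are hypothetical names, and the topology is reduced to a plain string for brevity. The actual schema and error handling in the PR are not reproduced here.

```java
// Hypothetical sketch of the "no topology change without an epoch bump" check.
class TopologyEpochSketch {

    // Throws if the requested topology differs from the known one
    // while the topology epoch was not increased.
    static void validate(int knownEpoch, String knownTopology,
                         int requestedEpoch, String requestedTopology) {
        boolean topologyChanged = !knownTopology.equals(requestedTopology);
        if (topologyChanged && requestedEpoch <= knownEpoch) {
            throw new IllegalArgumentException(
                "Topology changed without a topology epoch bump (epoch " + requestedEpoch + ")");
        }
    }

    public static void main(String[] args) {
        validate(1, "source->sink", 1, "source->sink");         // unchanged topology: accepted
        validate(1, "source->sink", 2, "source->filter->sink"); // changed with a bump: accepted
        try {
            validate(1, "source->sink", 1, "source->filter->sink");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```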
2024-12-06 16:22:13 -05:00
Bruno Cadonna 5b93228834 Remove unused import 2024-12-06 16:22:13 -05:00
Bruno Cadonna 1f1c0cb6df Remove superfluous record version 2024-12-06 16:22:13 -05:00
Bill Bejeck a085a76d49 Resolve conflict from 12/6 trunk rebase - KIP1071 trunk rebase 11_25 add streams membership manager stream thread integration (#17968)
Rebase on trunk
2024-12-06 16:22:12 -05:00
Bill ee307af22a Fixes from rebase 2024-12-06 16:22:12 -05:00
Bill 718fb9b2c4 Correctness updates from rebase 2024-12-06 16:22:11 -05:00
Lucas Brutschy 4a68f95cac Resolve conflict from 12/6 rebase - Resolve conflicts from 11/15 trunk rebase - Initialize topologies as part of heartbeat (#17695)
We merge the initialization of the Streams group into the heartbeat of the Streams group.
As a result the first heartbeat of a member (member epoch is zero) contains the topology
metadata required for initialization.
2024-12-06 16:22:11 -05:00
Bruno Cadonna 6cda02d7e8 Introduce the streams membership manager (#17564)
The streams membership manager manages the state of the member
and is responsible for reconciling the received task assignment
with the current task assignment of the member. Additionally,
it requests the call of the callback that installs the
reconciled task assignment in the Streams client.
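
One simple way to picture the reconciliation step is as two set differences between the current and the received (target) assignment. The sketch below assumes exactly that and nothing more: `ReconcileSketch`, its method names, and the string task IDs are hypothetical, and the real membership manager additionally tracks member state and triggers callbacks.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: reconcile a received task assignment with the current one.
class ReconcileSketch {

    // Tasks the member currently owns but that are absent from the target assignment.
    static Set<String> tasksToRevoke(Set<String> current, Set<String> target) {
        Set<String> toRevoke = new TreeSet<>(current);
        toRevoke.removeAll(target);
        return toRevoke;
    }

    // Tasks in the target assignment that the member does not own yet.
    static Set<String> tasksToAdd(Set<String> current, Set<String> target) {
        Set<String> toAdd = new TreeSet<>(target);
        toAdd.removeAll(current);
        return toAdd;
    }

    public static void main(String[] args) {
        Set<String> current = Set.of("0_0", "0_1");
        Set<String> target = Set.of("0_0", "1_0");
        System.out.println("revoke: " + tasksToRevoke(current, target));
        System.out.println("add:    " + tasksToAdd(current, target));
    }
}
```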
2024-12-06 16:22:11 -05:00
Lucas Brutschy 4ccb85115f Resolve conflicts from 11/25 trunk rebase - Internal topic auto creation (#17433)
* impl

* fixes
2024-12-06 16:22:10 -05:00
Lucas Brutschy d43bdc3450 Resolve conflicts from trunk rebase - Port tools for topic configuration (#17371)
* KSTREAMS-6456: Port tools for topic configuration

Ports several tools for topic configuration from the client side, to the
broker-side. Several things are refactored:

 - Decoupling. On the client side, for example, RepartitionTopics was
   using the CopartitionedTopicEnforcer, the InternalTopicCreator, the
   TopologyMetadata and the Cluster objects. All of those are mocked
   with Mockito during testing. This points to bad coupling. We
   refactored all classes to be mostly self-sufficient, only relying on
   themselves and simple interfaces.
 - Tests only use JUnit5, no Hamcrest matchers and no other streams
   utilities, to not pollute the group coordinator module.
 - All classes only modify the configurations -- the code does not
   actually call into the AdminClient anymore.
 - We map all errors to new errors in the broker, in particular,
   the error for missing topics, inconsistent internal topics, and
   invalid topologies.

We include the internal, mutable data structures, which are set- and
map-based for efficient algorithms. They are distinctly different
from the data represented in `StreamsGroupTopologyValue` and
`StreamsGroupInitializeRequest`, since regular expressions must
be resolved at this point.

Both the topic creation and internal topic validation will be
based on this code, the basics of this are implemented in the
`InternalTopicManager`.

Every time either the broker-side topology or the topic metadata
on the broker changes, we reconfigure the internal topics, check
consistency with the current topics on the broker, and possibly
trigger creation of the missing internal topics. These changes
will be built on top of this change.

* formatting fixes
2024-12-06 16:22:10 -05:00
Bruno Cadonna 8b6d3f921d Resolve conflicts from 11/25 trunk rebase - MINOR: Rebase dev branch on current trunk 2024-12-06 16:22:09 -05:00
Lucas Brutschy 4f1328f5b0 Complete topic metadata for automatic topic creation (#17391)
* KSTREAMS-6456: Complete topic metadata for automatic topic creation

This change updates the current RPCs and schemas to add the following
metadata:

 - copartition groups are added to topology record and initialize RPC
 - multiple regexes are added to topology record and initialize RPC
 - replication factors are added for each internal topic

We also add code to fill this information correctly from the
`InternalTopologyBuilder` object.

The fields `assignmentConfiguration` and `assignor` in the
`StreamsAssignmentInterface` are removed, because they are not
needed anymore.

* fixes
2024-12-06 16:22:09 -05:00
Lucas Brutschy 8d0a6f84e5 MINOR: Fix unit tests (#17374)
* MINOR: Fix unit tests

Some unit tests were failing on the KIP1071 feature branch. This fixes
them.

* spotlessApply
2024-12-06 16:22:09 -05:00
Lucas Brutschy 09a2593457 MINOR: Use subtopologyId and spelling consistently (#17370)
Throughout RPCs, we were using subtopology instead of subtopologyId for
the string ID of a subtopology. We were also using sub-topology
spelling. Cleaning this up before merging the automatic topic creation code.
2024-12-06 16:22:08 -05:00
Alieh Saeedi ccb02aabd0 KAFKA-17125: finalize TaskId (#17300) 2024-12-06 16:22:08 -05:00
Lucas Brutschy c0ae9c5386 Resolve conflict from 11/25 trunk rebase - Rebase on AK trunk 2024-09-25 2024-12-06 16:22:08 -05:00
Matthias J. Sax ece269de75 Resolve conflict from 12/6 rebase - Add broker configs for streams group
Adds broker configs for Streams group for the MVP.
2024-12-06 16:22:07 -05:00
Lucas Brutschy 435c780bf5 Resolve conflict from 11/25 trunk rebase - Streams reconciliation logic
This PR implements the new reconciliation logic for Kafka Streams. In
the POC, we used the reconciliation logic for topic partitions on the
broker side, developed as part of KIP-848. This worked for active tasks,
but does not take standby tasks or warm-up tasks into account.
The new reconciliation logic also takes standby and warm up tasks into
account. Inside the streams group object, we keep track of the process
IDs that own a task, and in which role. This replaces task epochs from
KIP-848. Using the process IDs, we can check the assignment invariants:

* A. A process can own a task only once in any role. For example, if
processX already owns a task as a standby, it cannot own it as an active
as well.

* B. For active tasks, since there can never be two instances owning the
same task in an active role, the invariant is different: An active task
can be assigned only if no other member owns it (as an active task).

The reconciliation logic will make sure that these invariants are
followed. For this, it uses the same states for the member as KIP-848,
just with different state transitions:

A member is in state UNREVOKED whenever we are waiting for the member to
confirm revocation of at least one task. We expect the member to revoke
all tasks that are not in its target assignment before transitioning to
the next epoch. Once the member has revoked all tasks that are not in
its target assignment, it transitions to the new epoch and gets new
tasks assigned. Once all tasks are assigned, we transition to STABLE,
but it may be that some tasks cannot be assigned yet, because of the
above invariants:

* A. For any task, another instance of the same process may own the task
in any role (active/standby/warmup), and we are still awaiting
revocation.

* B. For an active task, an instance of any process may own the same
active task, and we are still awaiting revocation.

As long as tasks cannot be assigned, we remain in state UNRELEASED.
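
The two invariants can be illustrated with a small ownership table keyed by task and process ID. This is a simplified sketch, not the PR's implementation: `OwnershipSketch` and its methods are hypothetical, ownership is tracked per process rather than per member, and the member state machine (UNREVOKED, UNRELEASED, STABLE) is left out entirely.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the assignment invariants A and B, tracked per process ID.
class OwnershipSketch {
    enum Role { ACTIVE, STANDBY, WARMUP }

    // taskId -> (processId -> role in which that process currently owns the task)
    private final Map<String, Map<String, Role>> owners = new HashMap<>();

    boolean canAssign(String taskId, String processId, Role role) {
        Map<String, Role> current = owners.getOrDefault(taskId, Map.of());
        if (current.containsKey(processId)) {
            return false; // invariant A: a process owns a task at most once, in any role
        }
        if (role == Role.ACTIVE && current.containsValue(Role.ACTIVE)) {
            return false; // invariant B: an active task has at most one active owner
        }
        return true;
    }

    // Records ownership if both invariants hold; otherwise the task stays unassigned for now.
    boolean tryAssign(String taskId, String processId, Role role) {
        if (!canAssign(taskId, processId, role)) {
            return false;
        }
        owners.computeIfAbsent(taskId, id -> new HashMap<>()).put(processId, role);
        return true;
    }

    // Called once the owning process has confirmed revocation of the task.
    void revoke(String taskId, String processId) {
        Map<String, Role> current = owners.get(taskId);
        if (current != null) {
            current.remove(processId);
        }
    }

    public static void main(String[] args) {
        OwnershipSketch ownership = new OwnershipSketch();
        System.out.println(ownership.tryAssign("0_0", "processX", Role.STANDBY));
        System.out.println(ownership.tryAssign("0_0", "processX", Role.ACTIVE));
    }
}
```

A failed `tryAssign` corresponds to the situation in which a task cannot be handed out yet because revocation by a previous owner is still pending.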

The testing required some changes, so that I could parameterize the test
on the task role which is causing state transitions. There are some
utility changes / clean-ups to construct assignments.
2024-12-06 16:22:07 -05:00
Bruno Cadonna 9eba835a6f Improve the Streams group initialization handler
This commit:
- schedules a timeout for the initialization call
- requests the initialization from only one member
- silently drops initializations of unknown topologies or already
initialized topologies
2024-12-06 16:22:06 -05:00
Bruno Cadonna 5ca2155e1d Rename streamsHeartbeatX and streamsInitializeX to streamsGroupX 2024-12-06 16:22:06 -05:00
Alieh Saeedi 9c2bc6ed30 Resolve conflict from 12/6 rebase - StickyTaskAssignor
Implement the `StickyTaskAssignor` for KIP-1071.
2024-12-06 16:22:05 -05:00
Lucas Brutschy 50482108dd Minor: Revert test changes 2024-12-06 16:22:05 -05:00
Bruno Cadonna 00a7e85444 Resolve conflict from 11/25 trunk rebase - Rebase on AK trunk 2024-08-15 2024-12-06 16:22:05 -05:00
Lucas Brutschy 50cdd5f2dc Resolve conflicts from 12/6 rebase Resolve conflicts from 11/25 trunk rebase - Implement DescribeStreamsGroup RPC handling
Implement the DescribeStreamsGroup RPC handling for KIP-1071.
2024-12-06 16:22:04 -05:00
Bruno Cadonna 871aa86e88 Resolve conflicts from 11/25 trunk rebase - Update RPCs
Updates RPCs from KIP-1071
2024-12-06 16:22:04 -05:00
Bruno Cadonna ac6a2e4d48 Resolve conflicts from 12/6 rebase Resolve conflicts for 11/25 trunk rebase - Rebased on AK trunk 2024-07-16 2024-12-06 16:22:03 -05:00
Lucas Brutschy a4a5245239 Resolve conflict in 12/6 rebase - Resolve conflicts from 11/25 rebase - Get SmokeTestDriverIntegrationTest working
See https://github.com/lucasbru/kafka/pull/22
2024-12-06 16:22:03 -05:00
Bruno Cadonna b174b9233e Resolve conflict from 11/25 trunk rebase - Replay all streams-related records
See https://github.com/lucasbru/kafka/pull/23
2024-12-06 16:22:03 -05:00
Lucas Brutschy 65897c276f Resolve conflicts from 11/25 rebase - Create assignment-related classes for streams groups
Translate from org.apache.kafka.coordinator.group.consumer package

CurrentAssignmentBuilder

TargetAssignmentBuilder

See https://github.com/lucasbru/kafka/pull/23

copy unit tests that can be preserved easily
2024-12-06 16:22:02 -05:00