267 Commits

Author SHA1 Message Date
Mickael Maison fccd7fec66
MINOR: Various cleanups in clients (#15705)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2024-04-16 15:02:03 +02:00
Hector Geraldino 178761eb36
KAFKA-14683 Cleanup WorkerSinkTaskTest (#15506)
1) Rename WorkerSinkTaskMockitoTest back to WorkerSinkTaskTest
2) Tidy up the code a bit
3) Replace "fail" with "assertThrows"

Reviewers: Omnia Ibrahim <o.g.h.ibrahim@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2024-03-15 03:50:57 +08:00
Daan Gerits b9a5b4a805
KAFKA-10892: Shared Readonly State Stores ( revisited ) (#12742)
Implements KIP-813.

Reviewers: Matthias J. Sax <matthias@confluent.io>, Walker Carlson <wcarlson@confluent.io>
2024-03-08 10:57:56 -08:00
testn 80def43a34
MINOR: Reduce memory allocation in ClientTelemetryReporter (#15402)
Reviewers: Divij Vaidya <diviv@amazon.com>
2024-03-08 17:43:44 +01:00
Hector Geraldino 62998b7264
KAFKA-14683: Migrate WorkerSinkTaskTest to Mockito (3/3) (#15316)
Reviewers: Greg Harris <greg.harris@aiven.io>
2024-03-06 10:31:33 -08:00
Ritika Reddy 96c68096a2
KAFKA-15462: Add Group Type Filter for List Group to the Admin Client (#15150)
In KIP-848, we introduce the notion of Group Types based on the protocol type that the members in the consumer group use. As of now we support two types of groups:
* Classic: Members use the classic consumer group protocol (the existing one).
* Consumer: Members use the consumer group protocol introduced in KIP-848.
Currently List Groups allows users to list all the consumer groups available. KIP-518 introduced filtering the consumer groups by the state that they are in. We now want to allow users to filter consumer groups by type.

This patch includes the changes to the admin client and related files. It also includes changes to parameterize the tests to include permutations of the old GC and the new GC with the different protocol types.

Reviewers: David Jacot <djacot@confluent.io>
2024-02-29 00:38:42 -08:00
David Jacot d24abe0ede
MINOR: Refactor GroupMetadataManagerTest (#15348)
The `GroupMetadataManagerTest` class got a little out of control. We have too many things defined in it. As a first step, this patch extracts all the inner classes. It also extracts all the helper methods. However, the logic is not changed at all.

Reviewers: Omnia Ibrahim <o.g.h.ibrahim@gmail.com>, Justine Olshan <jolshan@confluent.io>
2024-02-13 23:29:29 -08:00
Mickael Maison 0bf830fc9c
KAFKA-14576: Move ConsoleConsumer to tools (#15274)
Reviewers: Josep Prat <josep.prat@aiven.io>, Omnia Ibrahim <o.g.h.ibrahim@gmail.com>
2024-02-13 19:24:07 +01:00
Chris Egerton 4f0a405908
KAFKA-15575: Begin enforcing 'tasks.max' property for connectors (#15180)
Reviewers: Ashwin Pankaj <apankaj@confluent.io>, Greg Harris <greg.harris@aiven.io>
2024-02-01 11:33:04 -05:00
Lianet Magrans 44272eaa77
KAFKA-16032: Fixes for commit/fetch error handling (#15202)
This includes multiple fixes for offsets commit/fetch error handling:
* ensure the right exceptions are thrown for each expected error
* ensure KafkaException is thrown for all unexpected errors
* properly handle disconnection exceptions (added for fetch, fixed for commit)

Reviewers: David Jacot <djacot@confluent.io>
2024-01-26 05:42:54 -08:00
Calvin Liu 7e5ef9b509
KAFKA-15585: Implement DescribeTopicPartitions RPC on broker (#14612)
This patch implements the new DescribeTopicPartitions RPC as defined in KIP-966 (ELR). Additionally, this patch adds a broker config "max.request.partition.size.limit" which limits the number of partitions returned by the new RPC.

Reviewers: Artem Livshits <alivshits@confluent.io>, Jason Gustafson <jason@confluent.io>, David Arthur <mumrah@gmail.com>
2024-01-24 15:16:09 -05:00
Greg Harris d1d6b5096f
KAFKA-16166: Generify RetryWithToleranceOperator, ErrorReporter, and WorkerTask (#15233)
Signed-off-by: Greg Harris <greg.harris@aiven.io>
Reviewers: Chris Egerton <chrise@aiven.io>
2024-01-22 16:56:52 -08:00
Bruno Cadonna 19727f8d51
KAFKA-16017: Checkpoint restored offsets instead of written offsets (#15044)
Kafka Streams checkpoints the wrong offset when a task is closed during
restoration. If under exactly-once processing guarantees a
TaskCorruptedException happens, the affected task is closed dirty, its
state content is wiped out and the task is re-initialized. If during
the following restoration the task is closed cleanly, the task writes
the offsets that it stores in its record collector to the checkpoint
file. Those offsets are the offsets that the task wrote to the changelog
topics. In other words, the task writes the end offsets of its changelog
topics to the checkpoint file. Consequently, when the task is
initialized again on the same Streams client, the checkpoint file is
read and the task assumes it is fully restored although the records
between the last offsets the task restored before closing clean and
the end offset of the changelog topics are missing locally.

The fix is to clear the offsets in the record collector on close.

Reviewer: Lucas Brutschy <lbrutschy@confluent.io>
2023-12-21 10:15:04 +01:00
Kirk True 9dc9040f33
KAFKA-15276: Implement event plumbing for ConsumerRebalanceListener callbacks (#14640)
This patch adds the logic for coordinating `ConsumerRebalanceListener` callback invocations between the background thread (in `MembershipManagerImpl`) and the application thread (`AsyncKafkaConsumer`) and back again. It also allows us to enable more tests from `PlaintextConsumerTest` to exercise this code.

Reviewers: David Jacot <djacot@confluent.io>
2023-12-15 00:42:31 -08:00
Chris Egerton 2a5fbf2882
KAFKA-15563: Provide informative error messages when Connect REST requests time out (#14562)
Reviewers: Greg Harris <greg.harris@aiven.io>
2023-12-11 16:48:16 -05:00
Matthias J. Sax 7dabd27f8d
KAFKA-15662: Add support for clientInstanceIds in Kafka Stream (#14922)
Part of KIP-714.

Adds support to expose main consumer client instance id.

Reviewers: Walker Carlson <wcarlson@confluent.io>, Lucas Brutschy <lbrutschy@confluent.io>
2023-12-07 10:39:39 -08:00
David Jacot 522c2864cd
KAFKA-14505; [2/N] Implement TxnOffsetCommit API (#14845)
This patch implements the TxnOffsetCommit API. When a transactional offset commit is received, it is stored in the pending transactional offsets structure and waits there until the transaction is committed or aborted. Note that the handling of the transaction completion is not implemented in this patch.

Reviewers: Justine Olshan <jolshan@confluent.io>
2023-12-07 02:51:22 -08:00
Alieh Saeedi 9658942366
KAFKA-15347: add support for 'single key multi timestamp' IQs with versioned state stores (KIP-968) (#14626)
Implements KIP-968.

Add new query type MultiVersionedKeyQuery for IQv2 supported by versioned state stores.
2023-12-06 07:56:12 -08:00
Apoorv Mittal 2b99d0e450
KAFKA-15901: Client changes for registering telemetry and API calls (KIP-714) (#14843)
The PR adds changes for the client APIs to register ClientTelemetryReporter, if enabled, and periodically report client metrics. The changes include front-facing API changes, with NetworkClient issuing the telemetry APIs.

The PR build is dependent on: #14620, #14724

Reviewers: Philip Nee <pnee@confluent.io>, Andrew Schofield <aschofield@confluent.io>, Kirk True <ktrue@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Walker Carlson <wcarlson@apache.org>
2023-12-05 11:50:33 -06:00
Andrew Schofield 1750d735cd
KAFKA-15842: Correct handling of KafkaConsumer.committed for new consumer (#14859)
This PR fixes some details of the interface to KafkaConsumer.committed which were different between the existing consumer and the new consumer.

Adds a unit test that validates the behaviour is the same for both consumer implementations.

Reviewers: Kirk True <ktrue@confluent.io>, Bruno Cadonna <cadonna@apache.org>
2023-12-01 14:37:21 +01:00
Hanyu Zheng f1cd11dcc5
KAFKA-15629: Proposal to introduce IQv2 Query Types: TimestampedKeyQuery and TimestampedRangeQuery (#14570)
Implements KIP-992.

Adds TimestampedKeyQuery and TimestampedRangeQuery (IQv2) for ts-kv-store, and changes the semantics of the existing KeyQuery and RangeQuery if issued against a ts-kv-store: they now unwrap value-and-timestamp and return only the plain value.

Reviewers: Matthias J. Sax <matthias@confluent.io>
2023-11-30 12:14:23 -08:00
Apoorv Mittal 38f2faf83f
KAFKA-15681: Add support of client-metrics in kafka-configs.sh (KIP-714) (#14632)
The PR adds support of alter/describe configs for client-metrics as defined in KIP-714

Reviewers: Andrew Schofield <aschofield@confluent.io>, Jun Rao <junrao@gmail.com>
2023-11-28 09:24:25 -08:00
Ritika Reddy 55017a4f68
KAFKA-15484: General Rack Aware Assignor (#14481)
This patch adds the second part of the Uniform Assignor, used when the subscriptions of each member in a consumer group are different.

Reviewers: Jeff Kim <jeff.kim@confluent.io>, David Jacot <djacot@confluent.io>
2023-11-23 01:18:50 -08:00
Colin Patrick McCabe 7060c08d6f
MINOR: Rewrite the meta.properties handling code in Java and fix some issues #14628 (#14628)
meta.properties files are used by Kafka to identify log directories within the filesystem.
Previously, the code for handling them was in BrokerMetadataCheckpoint.scala. This PR rewrites the
code for handling them as Java and moves it to the apache.kafka.metadata.properties namespace. It
also gets rid of the separate types for v0 and v1 meta.properties objects. Having separate types
wasn't so bad back when we had a strict rule that zk clusters used v0 and kraft clusters used v1.
But ZK migration has blurred the lines. Now, a zk cluster may have either v0 or v1, if it is
migrating, and a kraft cluster may have either v0 or v1, at any time.

The new code distinguishes between an individual meta.properties file, which is represented by
MetaProperties, and a collection of meta.properties files, which is represented by
MetaPropertiesEnsemble. It is useful to have this distinction, because in JBOD mode, even if some
log directories are inaccessible, we can still use the ensemble to extract needed information like
the cluster ID. (Of course, even when not in JBOD mode, KRaft servers have always been able to
configure a metadata log directory separate from the main log directory.)

Since we recently added a unique directory.id to each meta.properties file, the previous convention
of passing a "canonical" MetaProperties object for the cluster around to various places in the code
needs to be revisited. After all, we can no longer assume all of the meta.properties files are the
same. This PR fixes these parts of the code. For example, it fixes the constructors of
ControllerApis and RaftManager to just take a cluster ID, rather than a MetaProperties object. It
fixes some other parts of the code, like the constructor of SharedServer, to take a
MetaPropertiesEnsemble object.

Another goal of this PR was to centralize meta.properties validation a bit more and make it
unit-testable. For this purpose, the PR adds MetaPropertiesEnsemble.verify, and a few other
verification methods. These enforce invariants like "the metadata directory must be readable," and
so on.

Reviewers: Igor Soarez <soarez@apple.com>, David Arthur <mumrah@gmail.com>, Divij Vaidya <diviv@amazon.com>, Proven Provenzano <pprovenzano@confluent.io>
2023-11-09 09:32:35 -08:00
Calvin Liu 505e5b3eaa
KAFKA-15584: Leader election with ELR (#14593)
The patch includes the following changes as part of KIP-966

* Allow ISR shrink to empty
* Allow leader election with ELR members
* Allow electing the last known leader

Reviewers: Artem Livshits <alivshits@confluent.io>, David Arthur <mumrah@gmail.com>
2023-11-06 17:21:51 -05:00
Kirk True 2b233bfa5f
KAFKA-14274 [6, 7]: Introduction of fetch request manager (#14406)
Changes:

1. Introduces FetchRequestManager that implements the RequestManager
   API for fetching messages from brokers. Unlike Fetcher, record
   decompression and deserialization is performed on the application
   thread inside CompletedFetch.
2. Restructured the code so that objects owned by the background thread
   are not instantiated until the background thread runs (via Supplier)
   to ensure that there are no references available to the
   application thread.
3. Ensuring resources are properly closed using Closeable, and using
   IdempotentCloser to ensure they're only closed once.
4. Introduces ConsumerTestBuilder to reduce a lot of inconsistency in
   the way the objects were built up for tests.

Reviewers: Philip Nee <pnee@confluent.io>, Lianet Magrans <lianetmr@gmail.com>, Jun Rao <junrao@gmail.com>
2023-10-24 13:03:05 -07:00
Chris Egerton 091eb9b349
KAFKA-15428: Cluster-wide dynamic log adjustments for Connect (#14538)
Reviewers: Greg Harris <greg.harris@aiven.io>, Yang Yang <yayang@uber.com>, Yash Mayya <yash.mayya@gmail.com>
2023-10-20 09:52:37 -04:00
Calvin Liu af747fbfed
KAFKA-15581: Introduce ELR (#14312)
This patch introduces preliminary changes for Eligible Leader Replicas (KIP-966)

* New MetadataVersion 16 (3.7-IV1)
* New record versions for PartitionRecord and PartitionChangeRecord
* New tagged fields on PartitionRecord and PartitionChangeRecord
* New static config "eligible.leader.replicas.enable" to gate the whole feature

Reviewers: Artem Livshits <alivshits@confluent.io>, David Arthur <mumrah@gmail.com>, Colin P. McCabe <cmccabe@apache.org>
2023-10-19 14:05:15 -04:00
Apoorv Mittal 36abc8dcea
KAFKA-15604: Telemetry API request and response schemas and classes (KIP-714) (#14554)
Initial PR for [KIP-714](https://cwiki.apache.org/confluence/display/KAFKA/KIP-714%3A+Client+metrics+and+observability) - [KAFKA-15601](https://issues.apache.org/jira/browse/KAFKA-15601).

This PR defines json request and response schemas for the new Telemetry APIs and implements the corresponding java classes.

Reviewers: Andrew Schofield <andrew_schofield@uk.ibm.com>, Kirk True <ktrue@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Walker Carlson <wcarlson@apache.org>
2023-10-19 10:55:21 -05:00
Jeff Kim abee8f711c
KAFKA-14519; [1/N] Implement coordinator runtime metrics (#14417)
Implements the following metrics:

kafka.server:type=group-coordinator-metrics,name=num-partitions,state=loading
kafka.server:type=group-coordinator-metrics,name=num-partitions,state=active
kafka.server:type=group-coordinator-metrics,name=num-partitions,state=failed
kafka.server:type=group-coordinator-metrics,name=event-queue-size
kafka.server:type=group-coordinator-metrics,name=partition-load-time-max
kafka.server:type=group-coordinator-metrics,name=partition-load-time-avg
kafka.server:type=group-coordinator-metrics,name=thread-idle-ratio-min
kafka.server:type=group-coordinator-metrics,name=thread-idle-ratio-avg
The PR makes these metrics generic so that in the future the transaction coordinator runtime can implement the same metrics in a similar fashion.
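
The metric names listed above follow JMX object-name syntax (`domain:key=value,...`). As a minimal sketch, the standard `javax.management.ObjectName` class can split such a string into its parts. The class `MetricNameSketch` and its `describe` method are hypothetical helpers for illustration; the code only parses the string and does not connect to a broker.

```java
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

/** Hypothetical helper, not part of Kafka: parses a metric name string. */
final class MetricNameSketch {

    /** Returns "domain / type / name" plus the optional state in brackets. */
    static String describe(String mbeanName) {
        try {
            ObjectName name = new ObjectName(mbeanName);
            // getKeyProperty returns null when the key is absent.
            String state = name.getKeyProperty("state");
            return name.getDomain()
                + " / " + name.getKeyProperty("type")
                + " / " + name.getKeyProperty("name")
                + (state == null ? "" : " [" + state + "]");
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException("Not a valid JMX object name: " + mbeanName, e);
        }
    }
}
```

For example, `MetricNameSketch.describe("kafka.server:type=group-coordinator-metrics,name=num-partitions,state=loading")` yields the domain, type, name, and state as one readable string.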

Also, CoordinatorLoaderImpl#load will now return LoadSummary which encapsulates the start time, end time, number of records/bytes.

Co-authored-by: David Jacot <djacot@confluent.io>

Reviewers: Ritika Reddy <rreddy@confluent.io>, Calvin Liu <caliu@confluent.io>, David Jacot <djacot@confluent.io>, Justine Olshan <jolshan@confluent.io>
2023-10-17 16:06:23 -07:00
Lianet Magrans 58dfa1cc81
MINOR - KAFKA-15550: Validation for negative target times in offsetsForTimes (#14503)
The current KafkaConsumer offsetsForTimes fails with IllegalArgumentException if negative target timestamps are provided as arguments. This change adds the same validation and tests to the new consumer implementation, along with some improved comments for updateFetchPositions.
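
The kind of check described above could look roughly like the following sketch. It is not the code from this patch: the class and method names are hypothetical, plain `String` keys stand in for topic partitions, and the exception message is invented for illustration.

```java
import java.util.Map;

/** Hypothetical stand-in for the validation; not the actual consumer code. */
final class TargetTimeValidationSketch {

    /** Rejects any negative target timestamp before a lookup would be issued. */
    static void requireNonNegative(Map<String, Long> timestampsToSearch) {
        for (Map.Entry<String, Long> entry : timestampsToSearch.entrySet()) {
            if (entry.getValue() < 0) {
                // Message text is an assumption, not copied from Kafka.
                throw new IllegalArgumentException(
                    "Target time for " + entry.getKey() + " is " + entry.getValue()
                        + "; it cannot be negative");
            }
        }
    }
}
```

Validating up front keeps both consumer implementations failing the same way, before any request is sent.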

Reviewer: Lucas Brutschy <lbrutschy@confluent.io>
2023-10-13 09:59:57 +02:00
Jeff Kim 7b5d640cc6
KAFKA-14987; Implement Group/Offset expiration in the new coordinator (#14467)
This patch implements the groups and offsets expiration in the new group coordinator.

Reviewers: Ritika Reddy <rreddy@confluent.io>, David Jacot <djacot@confluent.io>
2023-10-11 23:45:13 -07:00
Mayank Shekhar Narula d817b1b590
KAFKA-15415: On producer-batch retry, skip-backoff on a new leader (#14384)
When a producer batch is being retried and the partition's known leader is newer than the leader used in the last attempt, it is worthwhile to retry immediately against the new leader. A partition leader is considered newer if its epoch has advanced.
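
A minimal sketch of that rule, assuming leader epochs are plain integers and that `-1` means "unknown" (both assumptions made for this illustration; the class and method names are hypothetical and do not come from the producer code):

```java
/** Hypothetical illustration of "skip backoff when the leader epoch advanced". */
final class LeaderRetrySketch {

    /** Assumed sentinel for an unknown leader epoch. */
    static final int UNKNOWN_EPOCH = -1;

    /** A leader counts as newer only if both epochs are known and the epoch advanced. */
    static boolean leaderIsNewer(int lastAttemptEpoch, int currentEpoch) {
        return lastAttemptEpoch != UNKNOWN_EPOCH
            && currentEpoch != UNKNOWN_EPOCH
            && currentEpoch > lastAttemptEpoch;
    }

    /** Returns 0 (retry immediately) for a newer leader, otherwise the normal backoff. */
    static long retryDelayMs(int lastAttemptEpoch, int currentEpoch, long retryBackoffMs) {
        return leaderIsNewer(lastAttemptEpoch, currentEpoch) ? 0L : retryBackoffMs;
    }
}
```

Treating an unknown epoch as "not newer" is the conservative choice here: the batch simply waits for the usual backoff.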

Reviewers: Walker Carlson <wcarlson@apache.org>, Kirk True <kirk@kirktrue.pro>, Andrew Schofield <andrew_schofield@uk.ibm.com>
2023-10-05 09:11:47 -05:00
Dongnuo Lyu a12f9f97c9
KAFKA-14506: Implement DeleteGroups API and OffsetDelete API (#14408)
This patch implements DeleteGroups and OffsetDelete API in the new group coordinator.

Reviewers: yangy0000, Ritika Reddy <rreddy@confluent.io>, Jeff Kim <jeff.kim@confluent.io>, David Jacot <djacot@confluent.io>
2023-10-04 02:30:45 -07:00
Lucas Brutschy 6263197a62
KAFKA-15326: [9/N] Start and stop executors and cornercases (#14281)
* Implements start and stop of task executors
* Introduce flush operation to keep consumer operations out of the processing threads
* Fixes corner case: handle requested unassignment during shutdown
* Fixes corner case: handle race between voluntary unassignment and requested unassignment
* Fixes corner case: task locking future completes for the empty set
* Fixes corner case: we should not reassign a task with an uncaught exception to a task executor
* Improved logging
* Number of threads is controlled from outside of the TaskManager

Reviewers: Bruno Cadonna <bruno@confluent.io>
2023-10-02 15:41:21 +02:00
Kirk True e1dc6d9f34
KAFKA-14274 [2-5/7]: Introduction of more infrastructure for forthcoming fetch request manager (#14359)
This continues the work of providing the groundwork for the fetch
refactoring work by introducing some new classes and refactoring the
existing code to use the new classes where applicable.

Changes:

* Minor clean up of the events classes to make data immutable,
  private, and implement toString().
* Added IdempotentCloser which prevents a resource from being closed
  more than once. It's general enough that it could be used elsewhere
  in the project, but it's limited to the consumer internals for now.
* Split core Fetcher code into classes to buffer raw results
  (FetchBuffer) and to collect raw results into ConsumerRecords
  (FetchCollector). These can be tested and changed in isolation from
  the core fetcher logic.
* Added NodeStatusDetector which abstracts methods from
  ConsumerNetworkClient so that it and NetworkClientDelegate can be
  used in AbstractFetch via the interface instead of using
  ConsumerNetworkClient directly.
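
The close-at-most-once idea behind IdempotentCloser can be sketched with an `AtomicBoolean` guard. This is an illustration of the general technique only; `CloseOnceSketch` is a hypothetical class and does not reproduce the API added in this patch.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical illustration: runs the close action at most once. */
final class CloseOnceSketch implements AutoCloseable {

    private final AtomicBoolean closed = new AtomicBoolean(false);
    private final Runnable onClose;

    CloseOnceSketch(Runnable onClose) {
        this.onClose = onClose;
    }

    @Override
    public void close() {
        // compareAndSet succeeds for exactly one caller, even across threads.
        if (closed.compareAndSet(false, true)) {
            onClose.run();
        }
    }

    boolean isClosed() {
        return closed.get();
    }
}
```

Callers can then invoke `close()` from several code paths (for example, normal shutdown and an error handler) without releasing the underlying resource twice.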

Reviewers: Jun Rao <junrao@gmail.com>
2023-09-16 09:15:37 -07:00
zhaohaidao f309299f3c
KAFKA-14503: Implement ListGroups (#14271)
This patch implements the ListGroups API in the new group coordinator.

Reviewers: David Jacot <djacot@confluent.io>
2023-09-14 23:45:03 -07:00
Jeff Kim e9057aab37
KAFKA-14502; Implement LeaveGroup protocol in new GroupCoordinator (#14147)
This patch implements the LeaveGroup API in the new group coordinator.

Reviewers: David Jacot <djacot@confluent.io>
2023-09-13 01:43:37 -07:00
Andrew Schofield b49013b73e
KAFKA-9800: Exponential backoff for Kafka clients - KIP-580 (#14111)
Implementation of KIP-580 to add exponential backoff to situations in which retry.backoff.ms
is used to delay retry attempts. This KIP adds exponential backoff behavior with a maximum
controlled by a new config retry.backoff.max.ms, together with +/- 20% of jitter to spread the
retry attempts of the client fleet.
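
As a rough sketch of the behavior described above: the delay grows with each attempt, is capped at a maximum, and is scaled by a random factor within +/- 20%. The class `BackoffSketch` is hypothetical, and the doubling multiplier and the choice to cap before applying jitter are assumptions for illustration, not details taken from this commit.

```java
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical illustration of capped exponential backoff with jitter. */
final class BackoffSketch {

    private final long initialMs; // plays the role of retry.backoff.ms
    private final long maxMs;     // plays the role of retry.backoff.max.ms
    private final double jitter;  // 0.2 means +/- 20%

    BackoffSketch(long initialMs, long maxMs, double jitter) {
        this.initialMs = initialMs;
        this.maxMs = maxMs;
        this.jitter = jitter;
    }

    /** Delay for a 0-based attempt with an explicit random factor (useful for tests). */
    long backoffMs(int attempt, double randomFactor) {
        // Assumed multiplier of 2 per attempt, capped at maxMs before jitter is applied.
        double exponential = initialMs * Math.pow(2, attempt);
        double capped = Math.min(exponential, (double) maxMs);
        return (long) (capped * randomFactor);
    }

    /** Delay for a 0-based attempt with a random factor in [1 - jitter, 1 + jitter). */
    long backoffMs(int attempt) {
        double factor = ThreadLocalRandom.current().nextDouble(1.0 - jitter, 1.0 + jitter);
        return backoffMs(attempt, factor);
    }
}
```

The jitter matters most after a shared failure: without it, many clients that started retrying at the same moment would keep hitting the brokers in lockstep.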

Reviewers: Mayank Shekhar Narula <mayanks.narula@gmail.com>, Milind Luthra <i.milind.luthra@gmail.com>, Kirk True <kirk@mustardgrain.com>, Jun Rao <junrao@gmail.com>
2023-09-05 11:57:51 -07:00
Satish Duggana d4ab3ae85a
KAFKA-14888: Added remote log segments retention mechanism based on time and size. (#13561)
This change introduces a remote log segment retention cleanup mechanism.

RemoteLogManager runs retention cleanup activity tasks on each leader replica. It assesses factors such as overall size and retention duration, subsequently removing qualified segments from remote storage. This process also involves adjusting the log-start-offset within the UnifiedLog accordingly. It also cleans up the segments which have epochs earlier than the earliest leader epoch in the current leader. 

Co-authored-by: Satish Duggana <satishd@apache.org>
Co-authored-by: Kamal Chandraprakash <kamal.chandraprakash@gmail.com>

Reviewers: Jun Rao <junrao@gmail.com>, Divij Vaidya <diviv@amazon.com>, Luke Chen <showuon@gmail.com>, Kamal Chandraprakash <kamal.chandraprakash@gmail.com>, Christo Lolov <lolovc@amazon.com>, Jorge Esteban Quilcate Otoya <quilcate.jorge@gmail.com>, Alexandre Dupriez <alexandre.dupriez@gmail.com>, Nikhil Ramakrishnan <ramakrishnan.nikhil@gmail.com>
2023-08-25 05:27:59 +05:30
Proven Provenzano c2759df067
KAFKA-15219: KRaft support for DelegationTokens (#14083)
Reviewers: David Arthur <mumrah@gmail.com>, Ron Dagostino <rndgstn@gmail.com>, Manikumar Reddy <manikumar.reddy@gmail.com>, Viktor Somogyi <viktor.somogyi@cloudera.com>
2023-08-19 14:01:08 -04:00
Ivan Yurchenko b3db905b27
KAFKA-15107: Support custom metadata for remote log segment (#13984)
This commit makes the changes discussed in KIP-917. Mainly, it changes the `RemoteStorageManager` interface to return `CustomMetadata`, and then ensures this custom metadata is stored, propagated, and (de-)serialized correctly along with the standard metadata throughout the whole lifecycle. It introduces the `remote.log.metadata.custom.metadata.max.size` config to limit the custom metadata size accepted by the broker, and stops uploading if a piece of metadata exceeds this limit.

On testing:
1. `RemoteLogManagerTest` checks the case when a piece of custom metadata is larger than the configured limit.
2. `RemoteLogSegmentMetadataTest` checks if `createWithUpdates` works correctly, including custom metadata.
3. `RemoteLogSegmentMetadataTransformTest`, `RemoteLogSegmentMetadataSnapshotTransformTest`, and `RemoteLogSegmentMetadataUpdateTransformTest` were added to test the corresponding class (de-)serialization, including custom metadata.
4. `FileBasedRemoteLogMetadataCacheTest` checks if custom metadata are being correctly saved and loaded to a file (indirectly, via `equals`).
5. `RemoteLogManagerConfigTest` checks if the configuration setting is handled correctly.

Reviewers: Luke Chen <showuon@gmail.com>, Satish Duggana <satishd@apache.org>, Divij Vaidya <diviv@amazon.com>
2023-08-04 18:23:25 +05:30
Jeff Kim 19f9e1e6d0
KAFKA-14501: Implement Heartbeat protocol in new GroupCoordinator (#14056)
This patch implements the existing Heartbeat API in the new Group Coordinator.

Reviewers: David Jacot <djacot@confluent.io>
2023-07-28 15:13:27 +02:00
Hao Li ed44bcd71b
KAFKA-15022: [3/N] use graph to compute rack aware assignment for active stateful tasks (#14030)
Part of KIP-925.

Reviewers: Matthias J. Sax <matthias@confluent.io>
2023-07-26 16:02:52 -07:00
Federico Valeri bb677c4959
KAFKA-14583: Move ReplicaVerificationTool to tools (#14059)
Reviewers: Mickael Maison <mickael.maison@gmail.com>
2023-07-26 12:04:34 +02:00
Luke Chen 27ea025e33
KAFKA-15176: add tests for tiered storage metrics (#13999)
Added tests for metrics:
1. RemoteLogReaderTaskQueueSize
2. RemoteLogReaderAvgIdlePercent
3. RemoteLogManagerTasksAvgIdlePercent

Also added tests verifying that OffsetOutOfRangeException is thrown while reading logs.

Reviewers: Christo Lolov <christololov@gmail.com>, Satish Duggana <satishd@apache.org>
2023-07-21 10:30:33 +08:00
Jeff Kim a500c3ecf9
KAFKA-14500; [5/N] Implement JoinGroup protocol in new GroupCoordinator (#13870)
This patch implements the existing JoinGroup protocol within the new group coordinator. 

Some notable differences:
* Methods return a CoordinatorResult to the runtime framework, which includes records to append to the log as well as a future to complete after the append succeeds/fails.
* The coordinator runtime ensures that only a single thread will be processing a group at any given time, therefore there is no more locking on groups.
* Instead of relying on purgatories, we rely on the Timer interface to schedule/cancel delayed operations.

Reviewers: David Jacot <djacot@confluent.io>
2023-07-19 09:15:13 +02:00
Abhijeet Kumar fd3b1137d2
KAFKA-14953: Add tiered storage related metrics (#13944)
* KAFKA-14953: Adding RemoteLogManager metrics
In this PR, I have added the following metrics that are related to tiered storage mentioned in [KIP-405](https://cwiki.apache.org/confluence/display/KAFKA/KIP-405%3A+Kafka+Tiered+Storage).

| Metric                              | Description                                                   |
|-------------------------------------|---------------------------------------------------------------|
| RemoteReadRequestsPerSec            | Number of remote storage read requests per second             |
| RemoteWriteRequestsPerSec           | Number of remote storage write requests per second            |
| RemoteBytesInPerSec                 | Number of bytes read from remote storage per second           |
| RemoteReadErrorsPerSec              | Number of remote storage read errors per second               |
| RemoteBytesOutPerSec                | Number of bytes copied to remote storage per second           |
| RemoteWriteErrorsPerSec             | Number of remote storage write errors per second              |
| RemoteLogReaderTaskQueueSize        | Number of remote storage read tasks pending for execution     |
| RemoteLogReaderAvgIdlePercent       | Average idle percent of the remote storage reader thread pool |
| RemoteLogManagerTasksAvgIdlePercent | Average idle percent of RemoteLogManager thread pool          |

Added unit tests for all the rate metrics.

Reviewers: Luke Chen <showuon@gmail.com>, Divij Vaidya <diviv@amazon.com>, Kamal Chandraprakash <kamal.chandraprakash@gmail.com>, Jorge Esteban Quilcate Otoya <quilcate.jorge@gmail.com>, Staniel Yao <yaolixinylx@gmail.com>, hudeqi <1217150961@qq.com>, Satish Duggana <satishd@apache.org>
2023-07-18 20:16:19 +05:30
Satish Duggana 7e2f878713
KAFKA-14522 Rewrite/Move of RemoteIndexCache to storage module. (#13275)
Cleaned up index file suffix usages and made other minor cleanups.

Reviewers: Jun Rao <junrao@gmail.com>, Ismael Juma <ismael@juma.me.uk>, Luke Chen <showuon@gmail.com>, Divij Vaidya <diviv@amazon.com>, Kamal Chandraprakash <kamal.chandraprakash@gmail.com>, Jorge Esteban Quilcate Otoya <quilcate.jorge@gmail.com>
2023-07-11 23:55:23 +05:30
David Jacot 98fbd8afc7
KAFKA-14462; [20/N] Refresh subscription metadata on new metadata image (#13901)
This patch adds (1) the logic to propagate a new MetadataImage to the running coordinators; and (2) the logic to ensure that all the consumer groups subscribed to topics with changes will refresh their subscriptions metadata on the next heartbeat. In the meantime, it ensures that freshly loaded consumer groups also refresh their subscriptions metadata on the next heartbeat.

Reviewers: Jeff Kim <jeff.kim@confluent.io>, Justine Olshan <jolshan@confluent.io>
2023-07-05 18:28:38 +02:00