`KafkaBasedLog` is a widely used utility class that provides a generic implementation of a shared, compacted log of records in a Kafka topic. It isn't in Connect's public API, but it has been used outside of Connect, and we try to preserve backward compatibility whenever possible. KAFKA-14455 modified the two overloaded void `KafkaBasedLog::send` methods to return a `Future`. While this change is source compatible, it isn't binary compatible. We can restore backward compatibility simply by renaming the new Future-returning send methods and reinstating the older send methods so that they delegate to the newer ones.
This refactoring changes no functionality other than restoring the older methods.
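A minimal sketch of the delegation pattern, assuming illustrative names (`sendWithReceipt` stands in for whatever the renamed method is called) and simplified producer handling:
```java
import java.util.concurrent.Future;

import org.apache.kafka.clients.producer.Callback;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

public class KafkaBasedLog<K, V> {
    private final String topic;
    private final Producer<K, V> producer;

    public KafkaBasedLog(String topic, Producer<K, V> producer) {
        this.topic = topic;
        this.producer = producer;
    }

    // Renamed Future-returning variant introduced by KAFKA-14455
    // (name is an assumption for this sketch).
    public Future<RecordMetadata> sendWithReceipt(K key, V value, Callback callback) {
        return producer.send(new ProducerRecord<>(topic, key, value), callback);
    }

    // Reinstated void method: same signature as before KAFKA-14455, so callers
    // compiled against the old class file keep working (binary compatibility).
    public void send(K key, V value, Callback callback) {
        sendWithReceipt(key, value, callback);
    }
}
```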
Reviewers: Randall Hauch <rhauch@gmail.com>
Unexpected errors caught in the Raft IO thread should cause the process to stop. This is similar to the handling of exceptions in the controller.
Reviewers: Colin P. McCabe <cmccabe@apache.org>
The generated message types are missing a range check for the case when the tagged version range is a subset of
the flexible version range. This causes the tagged field count, which is computed correctly, to conflict with the
number of tags serialized.
Reviewers: Colin P. McCabe <cmccabe@apache.org>
Fix an NPE while merging the deltatable. It is possible for hashTier to be
non-null while deltatable is null (e.g. when removing data), so we should
null-check deltatable while merging, as other places do. Also added tests that
fail without this change.
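A minimal sketch of the guard, with `HashTier` and `DeltaTable` as hypothetical stand-ins for the real types:
```java
interface DeltaTable {
    DeltaTable merge(DeltaTable other);
}

interface HashTier {
    DeltaTable deltatable(); // may be null, e.g. when data was only removed
}

class DeltaTableMerger {
    static DeltaTable mergeFrom(DeltaTable merged, HashTier hashTier) {
        // hashTier can be non-null while its deltatable is null, so both
        // must be checked before merging, as other call sites already do.
        if (hashTier != null && hashTier.deltatable() != null) {
            merged = merged.merge(hashTier.deltatable());
        }
        return merged;
    }
}
```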
Reviewers: Colin P. McCabe <cmccabe@apache.org>
This is a really long story, but the incident started in KAFKA-13419, when we observed a member sending out topic partitions owned in a previous generation after missing a rebalance cycle due to REBALANCE_IN_PROGRESS.
This patch changes the AbstractStickyAssignor.allSubscriptionsEqual method. In short, it should no longer check and validate only the highest generation. Instead, we consider 3 cases (a resolution sketch follows the list):
1. A member will continue to hold on to its partition if there are no other owners.
2. If there are multiple owners of the same partition, the one with the highest generation wins.
3. If two members of the same generation hold on to the same partition, we log an error but remove both from the assignment (same as the current logic).
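A hedged sketch of the resolution rule, with simplified stand-in types rather than the assignor's real internals:
```java
import java.util.List;
import java.util.Optional;

class OwnershipResolver {
    record Claim(String memberId, int generation) {}

    // Returns the winning owner for one partition, or empty when equal-
    // generation claims force us to drop the partition from both members.
    static Optional<String> resolveOwner(List<Claim> claims) {
        if (claims.isEmpty()) {
            return Optional.empty(); // case 1 handled by the sole-claimant path below
        }
        int maxGeneration = claims.stream().mapToInt(Claim::generation).max().getAsInt();
        List<Claim> highest = claims.stream()
                .filter(c -> c.generation() == maxGeneration)
                .toList();
        if (highest.size() > 1) {
            // Case 3: same-generation conflict; log an error and assign to no one.
            System.err.printf("Partition claimed by %d members at generation %d%n",
                    highest.size(), maxGeneration);
            return Optional.empty();
        }
        // Case 1 (sole owner) and case 2 (highest generation wins).
        return Optional.of(highest.get(0).memberId());
    }
}
```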
Here are some important notes that lead to the patch:
- If a member is kicked out of the group, `UNKNOWN_MEMBER_ID` will be thrown.
- It seems to be a common situation that members are late to joinGroup and therefore get the `REBALANCE_IN_PROGRESS` error. This is why we don't want to reset the generation: doing so might cause lots of revocations and be disruptive.
To summarize the current behavior of different errors:
`REBALANCE_IN_PROGRESS`
- heartbeat: requestRejoin if member state is stable
- joinGroup: rejoin immediately
- syncGroup: rejoin immediately
- commit: requestRejoin and fail the commit. Raise this exception if the generation is stale, i.e. another rebalance is already in progress.
`UNKNOWN_MEMBER_ID`
- heartbeat: resetStateAndRejoin if the generation hasn't changed; otherwise, ignore
- joinGroup: resetStateAndRejoin if generation unchanged, otherwise rejoin immediately
- syncGroup: resetStateAndRejoin if generation unchanged, otherwise rejoin immediately
`ILLEGAL_GENERATION`
- heartbeat: resetStateAndRejoin if the generation hasn't changed; otherwise, ignore
- syncGroup: raise the exception if the generation has been reset or the member hasn't completed rebalancing; then resetStateAndRejoin if the generation is unchanged, otherwise rejoin immediately
Reviewers: David Jacot <djacot@confluent.io>
Stream-stream outer join uses a "shared time tracker" to track stream-time progress for the left and right input in a single place. This time tracker is incorrectly shared across tasks.
This PR introduces a supplier to create a "shared time tracker" object per task, to be shared between the left and right join processors.
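A rough sketch of the supplier pattern (names are illustrative, not the actual Streams internals):
```java
import java.util.function.Supplier;

class TimeTracker {
    long streamTime = Long.MIN_VALUE;

    void advance(long timestamp) {
        streamTime = Math.max(streamTime, timestamp);
    }
}

class OuterJoinBuilder {
    // One supplier captured at topology-build time; invoked once per task, so
    // each task gets its own tracker instead of all tasks sharing one.
    private final Supplier<TimeTracker> timeTrackerSupplier = TimeTracker::new;

    void buildProcessorsForTask() {
        TimeTracker sharedPerTask = timeTrackerSupplier.get();
        // pass sharedPerTask to both the left and right join processors
    }
}
```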
Reviewers: Victoria Xia <victoria.xia@confluent.io>, Bruno Cadonna <bruno@confluent.io>, Walker Carlson <wcarlson@confluent.io>
This patch implements the second part of KIP-915. It bumps the versions of the value records used by the group coordinator and the transaction coordinator to make them flexible versions. The new versions are not used when writing to the partitions, but only when reading from them. This allows downgrades from future versions that will include tagged fields.
Reviewers: David Jacot <djacot@confluent.io>
FinalizedFeatureChangeListener shuts the broker down when it encounters an issue trying to process feature change
events. However, it does not distinguish between issues related to feature changes actually failing and other
exceptions like ZooKeeper session expiration. This introduces the possibility that a ZooKeeper session expiration
could cause the broker to shut down, which is not intended. This patch updates the code to distinguish between
these two types of exceptions. In the case of something like a ZK session expiration, it logs a warning and continues.
We shut down the broker only for FeatureCacheUpdateException.
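A sketch of the distinction in Java (the real listener is Scala; the event and shutdown hooks here are hypothetical, while FeatureCacheUpdateException is the exception named by this patch):
```java
class FeatureChangeEventLoop {
    static class FeatureCacheUpdateException extends RuntimeException {
        FeatureCacheUpdateException(String message) { super(message); }
    }

    interface Event {
        void process();
    }

    void handle(Event event) {
        try {
            event.process();
        } catch (FeatureCacheUpdateException fatal) {
            // A genuine feature-change failure: stop the broker.
            System.err.println("Feature cache update failed, shutting down: " + fatal);
            shutdownBroker();
        } catch (Exception transientError) {
            // e.g. a ZooKeeper session expiration: warn and keep processing.
            System.err.println("Transient error in feature change event: " + transientError);
        }
    }

    private void shutdownBroker() { /* initiate controlled shutdown */ }
}
```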
Reviewers: Kamal Chandraprakash <kamal.chandraprakash@gmail.com>, Christo Lolov <christololov@gmail.com>, Colin P. McCabe <cmccabe@apache.org>
While cherry-picking 5115906515 to 3.4, auto-generated classes were still on my disk, so the issue was not caught. This patch fixes the fully qualified names to match the location of the auto-generated records in 3.4.
This patch implements the first part of KIP-915. It updates the group coordinator and the transaction coordinator to ignore unknown record types while loading their respective state from the partitions. This allows downgrades from future versions that will include new record types.
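An illustrative loading loop under assumed names (the key versions and handlers are stand-ins for the coordinators' real record types):
```java
import java.util.List;

class StateLoader {
    static final short OFFSET_COMMIT_KEY = 1;
    static final short GROUP_METADATA_KEY = 2;

    record LoadedRecord(short keyVersion, byte[] value) {}

    void load(List<LoadedRecord> records) {
        for (LoadedRecord record : records) {
            switch (record.keyVersion()) {
                case OFFSET_COMMIT_KEY -> applyOffsetCommit(record);
                case GROUP_METADATA_KEY -> applyGroupMetadata(record);
                // Unknown types are skipped instead of failing the load, so
                // records written by a newer version survive a downgrade.
                default -> { }
            }
        }
    }

    private void applyOffsetCommit(LoadedRecord record) { /* ... */ }
    private void applyGroupMetadata(LoadedRecord record) { /* ... */ }
}
```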
Reviewers: Alexandre Dupriez <alexandre.dupriez@gmail.com>, David Jacot <djacot@confluent.io>
We incorrectly assumed that `consumer.position()` would always be
served from the consumer's locally-known position.
However, within `commitNeeded()` we first check `if (commitNeeded)`
and thus go into the else branch only if we have not processed data
(otherwise, `commitNeeded` would be true). For this reason, we actually
don't know whether the consumer has a valid position or not.
We should just swallow a timeout if the consumer cannot get the position
from the broker, and try the next partition. If any position advances, we
can return true; if we time out for all partitions, we return false.
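A sketch of the intended loop, assuming a plain consumer and a committed-offsets map rather than the actual StreamTask code:
```java
import java.time.Duration;
import java.util.Map;
import java.util.Set;

import org.apache.kafka.clients.consumer.Consumer;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.errors.TimeoutException;

class CommitCheck {
    static boolean commitNeeded(Consumer<?, ?> consumer,
                                Set<TopicPartition> partitions,
                                Map<TopicPartition, Long> committedOffsets) {
        for (TopicPartition partition : partitions) {
            try {
                long position = consumer.position(partition, Duration.ofMillis(100));
                Long committed = committedOffsets.get(partition);
                if (committed == null || position > committed) {
                    return true; // some position advanced past the committed offset
                }
            } catch (TimeoutException swallowed) {
                // No valid position could be fetched in time; try the next
                // partition instead of failing the whole check.
            }
        }
        return false; // all positions unadvanced or timed out
    }
}
```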
Reviewers: Michal Cabak (@miccab), John Roesler <john@confluent.io>, Guozhang Wang <guozhand@confluent.io>
The MetadataLoader must call finishSnapshot after loading a snapshot. This function removes
whatever was in the old snapshot that is not in the new snapshot that was just loaded. While this
is not significant when the old snapshot was the empty snapshot, it is important to do when we are
loading a snapshot on top of an existing non-empty image.
In initializeNewPublishers, the newly installed publishers should be given a MetadataDelta based on
MetadataImage.EMPTY, reflecting the fact that they are seeing everything for the first time.
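A rough sketch of the sequence with hypothetical stand-in types (the real classes live in org.apache.kafka.image, whose exact signatures are not reproduced here):
```java
class Image {
    static final Image EMPTY = new Image(); // analogous to MetadataImage.EMPTY
}

class Delta {
    Delta(Image base) { /* track changes relative to base */ }
    void replay(Object record) { /* apply one snapshot record */ }
    void finishSnapshot() { /* drop state present in base but absent from the snapshot */ }
    Image apply() { return new Image(); }
}

class SnapshotLoadSequence {
    Image loadSnapshot(Image currentImage, Iterable<Object> snapshotRecords) {
        Delta delta = new Delta(currentImage); // base may be non-empty
        for (Object record : snapshotRecords) {
            delta.replay(record);
        }
        // The fix: always reconcile, removing whatever the old image had
        // that the freshly loaded snapshot does not contain.
        delta.finishSnapshot();
        return delta.apply();
    }
}
```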
Reviewers: David Arthur <mumrah@gmail.com>
The MetadataLoader is not supposed to publish metadata updates until we have loaded up to the high
water mark. Previously, this logic was broken, and we published updates immediately. This PR fixes
that and adds a junit test.
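A sketch of the gate (field names are assumptions, not the loader's real internals):
```java
class CatchUpGate {
    volatile long highWaterMark = -1L;   // learned from the Raft layer
    volatile long lastAppliedOffset = -1L;

    // Publish metadata updates only once we have replayed everything up to
    // the high water mark; before that, swallow updates silently.
    boolean stillNeedToCatchUp() {
        return highWaterMark == -1L || lastAppliedOffset < highWaterMark - 1;
    }
}
```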
Another issue is that the MetadataLoader previously assumed that we would periodically get
callbacks from the Raft layer even if nothing had happened. We relied on this to install new
publishers in a timely fashion, for example. However, in older MetadataVersions that don't include
NoOpRecord, this is not a safe assumption.
Aside from the above changes, also fix a deadlock in SnapshotGeneratorTest, fix the log prefix for
BrokerLifecycleManager, and remove metadata publishers on BrokerServer shutdown (like we do for
controllers).
Reviewers: David Arthur <mumrah@gmail.com>, dengziming <dengziming1993@gmail.com>
Conflicts: This patch was cut down to make it easier to cherry-pick to this older branch.
Specifically, I removed the BrokerLifecycleManager.scala logging change and the BrokerServer
installPublishers and removeAndClosePublisher changes.
We have seen the following error in logs:
```
"Mar 22, 2019 @ 21:57:56.655",Error,"kafka-0-0","transaction-log-manager-0","Uncaught exception in scheduled task 'transactionalId-expiration'","java.lang.IllegalArgumentException: Illegal new producer epoch -1
```
Investigations showed that it is actually possible for a transaction metadata object to still have -1 as producer epoch when it transitions to Dead.
When a transaction metadata object is created for the first time (in handleInitProducerId), it has -1 as its producer epoch. Then a producer epoch is attributed, and the transaction coordinator tries to persist the change. If the write fails, for instance because the partition is under min ISR, the transaction metadata keeps -1 as its producer epoch forever, or until InitProducerId is retried.
This means that it is possible for transaction metadata to remain with -1 as its producer epoch until it gets expired. At the moment, this is not allowed, because we enforce a producer epoch greater than or equal to 0 in prepareTransitionTo.
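A sketch of the relaxed validation (names are hypothetical; the real check lives in the coordinator's prepareTransitionTo):
```java
enum TxnState { EMPTY, ONGOING, DEAD /* ... */ }

class TransactionMetadataCheck {
    // When expiring a transactionalId whose InitProducerId write never
    // succeeded, the epoch can legitimately still be -1, so a transition to
    // DEAD must tolerate it while other transitions keep the strict check.
    static void prepareTransitionTo(TxnState target, short newProducerEpoch) {
        if (newProducerEpoch < 0 && target != TxnState.DEAD) {
            throw new IllegalArgumentException(
                "Illegal new producer epoch " + newProducerEpoch);
        }
        // ... perform the transition ...
    }
}
```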
Reviewers: Luke Chen <showuon@gmail.com>, Justine Olshan <jolshan@confluent.io>
This fix is inspired by #12540.
1. Added a clearCache function for CachedStateStore, which is triggered upon recycling a state manager (see the sketch below).
2. Added the integration test inherited from #12540.
3. Improved some log4j entries.
4. Found and fixed a minor issue with a log4j prefix.
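A sketch of the new hook, with simplified interfaces:
```java
interface CachedStateStore {
    // New in this fix: drop cached-but-unflushed entries for this store.
    void clearCache();
}

class StateManagerRecycler {
    // On recycling, clear each store's cache so the task that inherits the
    // state manager does not observe stale cached data.
    static void recycle(Iterable<CachedStateStore> stores) {
        for (CachedStateStore store : stores) {
            store.clearCache();
        }
    }
}
```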
Reviewers: Lucas Brutschy <lbrutschy@confluent.io>, Matthias J. Sax <matthias@confluent.io>
This fixes a regression introduced in #12828, which caused workers to start unconditionally loading (and therefore validating) SSL-related properties when issuing REST requests to other workers. That was fine for the most part, but caused unnecessary failures when workers were configured with invalid SSL-related properties and their REST API used HTTP instead of HTTPS.
Reviewers: Ian McDonald <imcdonald@confluent.io>, Greg Harris <greg.harris@aiven.io>, Yash Mayya <yash.mayya@gmail.com>, Justine Olshan <jolshan@confluent.io>
The name of the command should be kafka-metadata-quorum not
kafka-metatada-quorum.
Reviewers: Ron Dagostino <rdagostino@confluent.io>, Divij Vaidya <diviv@amazon.com>
Currently, the kafka.network:type=RequestMetrics,name=MessageConversionsTimeMs,request=Fetch metric never gets updated, because the request metrics are recorded BEFORE the messageConversions metric value is updated. That means that even if we update the messageConversions value, the request metric never reflects it. This patch fixes the issue by recording the request metric after the callback completes, so that the messageConversions metric value is reflected correctly.
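A sketch of the reordering, with hypothetical names:
```java
class FetchResponseCompletion {
    long messageConversionsTimeMs; // set by the conversion callback

    void complete(Runnable conversionCallback) {
        // Before the fix, request metrics were recorded at this point, so the
        // field above was still zero for Fetch requests.
        conversionCallback.run();       // updates messageConversionsTimeMs
        recordRequestMetrics();         // now observes the updated value
    }

    private void recordRequestMetrics() {
        // record kafka.network RequestMetrics for this request, including
        // MessageConversionsTimeMs
    }
}
```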
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Divij Vaidya <diviv@amazon.com>
KAFKA-12468: Fix negative lag on down consumer groups synced by MirrorMaker 2
KAFKA-13659: Stop syncing consumer groups with stale offsets in MirrorMaker 2
KAFKA-12566: Fix flaky MirrorMaker 2 integration tests
Reviewers: Chris Egerton <chrise@aiven.io>
Kafka Streams is supposed to handle a TimeoutException during internal topic creation gracefully. This PR fixes the exception handling code to avoid crashing on a TimeoutException returned by the admin client.
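A sketch of the handling, assuming the standard Admin createTopics future API (the retry strategy here is illustrative):
```java
import java.util.Set;
import java.util.concurrent.ExecutionException;

import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.NewTopic;
import org.apache.kafka.common.errors.TimeoutException;
import org.apache.kafka.streams.errors.StreamsException;

class InternalTopicCreation {
    static void createOrRetry(Admin admin, Set<NewTopic> topics)
            throws InterruptedException {
        try {
            admin.createTopics(topics).all().get();
        } catch (ExecutionException e) {
            if (e.getCause() instanceof TimeoutException) {
                // Retriable: leave the topics for the next creation attempt
                // instead of crashing the thread.
                return;
            }
            throw new StreamsException("Could not create internal topics", e);
        }
    }
}
```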
Reviewers: Matthias J. Sax <matthias@confluent.io>, Colin Patrick McCabe <cmccabe@apache.org>, Alexandre Dupriez (@Hangleton)
Added the following checks (sketched after the list) -
* In StandardAuthorizerData.authorize(), fail if a `patternType` other than `LITERAL` is passed.
* In AclControlManager.addAcl(), fail if the resource name is null or empty.
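A sketch of the two guards (the surrounding method shapes are assumptions):
```java
import org.apache.kafka.common.errors.InvalidRequestException;
import org.apache.kafka.common.resource.PatternType;

class AuthorizerGuards {
    // For StandardAuthorizerData.authorize(): only literal patterns are valid here.
    static void checkPatternType(PatternType patternType) {
        if (patternType != PatternType.LITERAL) {
            throw new IllegalArgumentException(
                "Only literal resource patterns are supported, got: " + patternType);
        }
    }

    // For AclControlManager.addAcl(): reject ACLs without a resource name.
    static void checkResourceName(String resourceName) {
        if (resourceName == null || resourceName.isEmpty()) {
            throw new InvalidRequestException("Resource name must not be null or empty");
        }
    }
}
```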
Also, `AclAuthorizerTest` includes a lot of tests covering scenarios that are missing in `StandardAuthorizerTest`. This PR changes AclAuthorizerTest to run its tests in both `zk` and `kraft` modes -
* Rename AclAuthorizerTest -> AuthorizerTest
* Parameterize relevant tests to run for both modes
Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>