8191 Commits

Author SHA1 Message Date
Justine Olshan fd71e1355b
MINOR: Fixed comment to refer to UpdateMetadataPartitionState rather than UpdateMetadataTopicState. (#9447)
Reviewers: David Jacot <djacot@confluent.io>
2020-10-19 21:15:30 +02:00
leah 4d14d6a96c
KAFKA-10455: Ensure that probing rebalances always occur (#9383)
Add dummy data to subscriptionUserData to make sure that
it is different each time a member rejoins.

Reviewers: A. Sophie Blee-Goldman <ableegoldman@apache.org>, John Roesler <vvcephei@apache.org>
2020-10-19 13:29:35 -05:00
Matthias J. Sax aef6cd6e99
KAFKA-9274: Add timeout handling for state restore and StandbyTasks (#9368)
* Part of KIP-572
* If a TimeoutException happens during restore of active tasks, or updating standby tasks, we need to trigger task.timeout.ms timeout.

Reviewers: John Roesler <john@confluent.io>
2020-10-19 11:07:56 -07:00
Ismael Juma 2db67db8e1
MINOR: Update jdk and maven names in Jenkinsfile (#9453)
2020-10-19 17:54:47 +01:00
Kowshik Prakasam d99fe49234
KAFKA-10599: Implement basic CLI tool for feature versioning system (#9409)
This PR implements a basic CLI tool for feature versioning system. The KIP-584 write up has been updated to suit this PR. The following is implemented in this PR:

--describe:
Describe supported and finalized features.
Usage: $> ./bin/kafka-features.sh --bootstrap-server host1:port1, host2:port2 --describe [--from-controller] [--command-config <path_to_java_properties_file>]
Optionally, use the --from-controller option to get features from the controller.
--upgrade-all:
Upgrades all features known to the tool to their highest max version levels.
Usage: $> ./bin/kafka-features.sh --bootstrap-server host1:port1, host2:port2 --upgrade-all [--dry-run] [--command-config <path_to_java_properties_file>]
Optionally, use the --dry-run CLI option to preview the feature updates without actually applying them.
--downgrade-all:
Downgrades existing finalized features to the highest max version levels known to this tool.
Usage: $> ./bin/kafka-features.sh --bootstrap-server host1:port1, host2:port2 --downgrade-all [--dry-run] [--command-config <path_to_java_properties_file>]
Optionally, use the --dry-run CLI option to preview the feature updates without actually applying them.

Reviewers: Boyang Chen <boyang@confluent.io>, Jun Rao <junrao@gmail.com>
2020-10-19 09:24:26 -07:00
Mickael Maison 270881cd65
KAFKA-10332: Update MM2 refreshTopicPartitions() logic (#9343)
Trigger task reconfiguration when:
- topic-partitions are created or deleted on source cluster
- topic-partitions are missing on target cluster

Authors: Mickael Maison <mickael.maison@gmail.com>, Edoardo Comar <ecomar@uk.ibm.com>
Reviewer: Randall Hauch <rhauch@gmail.com>
2020-10-19 10:51:44 -05:00
Adem Efe Gencer d71fd8857c
KAFKA-10583: Add documentation on the thread-safety of KafkaAdminClient (#9397)
Other than a Stack Overflow comment (see https://stackoverflow.com/a/61738065) by Colin Patrick McCabe and a proposed design note on KIP-117 wiki, there is no source that verifies the thread-safety of KafkaAdminClient.

This patch updates JavaDoc of KafkaAdminClient to clarify its thread-safety.

Reviewers: Tom Bentley <tbentley@redhat.com>, Chia-Ping Tsai <chia7712@gmail.com>
2020-10-19 15:53:48 +08:00
Matthias Merdes d841b912d1
MINOR: inconsistent naming for the output topic in the stream documentation (#9265)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
2020-10-18 23:19:49 +08:00
Patrick Dignan 7d7b92b3ae
MINOR: Fix comment about AbstractFetcherThread.handlePartitionsWithError (#7205)
Reviewers: Stanislav Kozlovski <stanislav_kozlovski@outlook.com>, William Hammond <william.t.hammond@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2020-10-18 22:16:52 +08:00
Chia-Ping Tsai 50bcb34d8d
MINOR: fix potential NPE in PartitionData.equals (#9391)
The `metadata` field is nullable (see https://github.com/apache/kafka/blob/trunk/clients/src/main/resources/common/message/OffsetFetchResponse.json#L50), so `equals` must compare it null-safely.

Reviewers: David Jacot <david.jacot@gmail.com>
2020-10-18 21:29:13 +08:00
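The null-safe comparison behind the `PartitionData.equals` fix (#9391) above can be sketched as follows. This is a minimal illustration, not the actual Kafka class: `PartitionEntry` and its fields are hypothetical stand-ins, and the only point shown is that `Objects.equals` tolerates a null on either side where a direct `metadata.equals(...)` call would throw.

```java
import java.util.Objects;

// Hypothetical stand-in for a response partition entry; not Kafka's PartitionData.
final class PartitionEntry {
    final long offset;
    final String metadata; // nullable, like the metadata field in the commit

    PartitionEntry(long offset, String metadata) {
        this.offset = offset;
        this.metadata = metadata;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof PartitionEntry)) return false;
        PartitionEntry that = (PartitionEntry) other;
        // Objects.equals returns true for (null, null) and false for (null, non-null)
        // without dereferencing either argument.
        return offset == that.offset && Objects.equals(metadata, that.metadata);
    }

    @Override
    public int hashCode() {
        // Objects.hash also accepts null elements.
        return Objects.hash(offset, metadata);
    }
}
```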
Samuel Cantero cf202cb6ac
MINOR: Fix consumer/producer properties override (#9313)
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Ryanne Dolan <ryannedolan@gmail.com>
2020-10-16 18:17:03 +02:00
Randall Hauch 9e0bf0bd2a
KAFKA-10600: Connect should not add error to connector validation values for properties not in connector’s ConfigDef (#9425)
Connect should not always add an error to configuration values in validation results that don't have a `ConfigKey` defined in the connector's `ConfigDef`, and any errors on such configuration values included by the connector should be counted in the total number of errors. Added more unit tests for `AbstractHerder.generateResult(...)`.

Author: Randall Hauch <rhauch@gmail.com>
Reviewer: Konstantine Karantasis <konstantine@confluent.io>
2020-10-16 09:14:43 -05:00
Cyrus Vafadari 432be58a7c
MINOR: Use debug level logging for noisy log messages in Connect (#8918)
Author: Cyrus Vafadari <cyrus@confluent.io>
Reviewers: Chris Egerton <chrise@confluent.io>, Arjun Satish <arjun@confluent.io>, Randall Hauch <rhauch@gmail.com>
2020-10-16 09:10:41 -05:00
Rajini Sivaram fcc7c2de39
MINOR: Handle lastFetchedEpoch/divergingEpoch in FetchSession and DelayedFetch (#9434)
In 2.7, we added lastFetchedEpoch to fetch requests and divergingEpoch to fetch responses. We are not using these for truncation yet, but in order to use these for truncation with IBP 2.7 onwards in the next release, we should make sure that we handle these in all the supporting classes even in 2.7.

Reviewers: Jason Gustafson <jason@confluent.io>
2020-10-16 09:58:01 +01:00
Matthias J. Sax e8ad80ebe1
MINOR: remove explicit passing of AdminClient into StreamsPartitionAssignor (#9384)
Currently, we pass multiple object references (AdminClient, TaskManager, and a few more) into StreamsPartitionAssignor. Furthermore, we (mis)use TaskManager#mainConsumer() to get access to the main consumer (we need to do this to avoid a cyclic dependency).

This PR unifies how object references are passed into a single ReferenceContainer class to
 - not misuse the TaskManager as a reference container
 - unify how object references are passed

Note: we need to use a reference container to avoid cyclic dependencies, instead of using a config for each passed reference individually.

Reviewers: John Roesler <john@confluent.io>
2020-10-15 16:10:27 -07:00
vamossagar12 a85802faa1
KAFKA-10559: Not letting TimeoutException shutdown the app during internal topic validation (#9432)
Reviewers: A. Sophie Blee-Goldman <sophie@confluent.io>, Matthias J. Sax <matthias@confluent.io>
2020-10-15 15:10:21 -07:00
Tom Bentley 775a08876a
KAFKA-10602: Make RetryWithToleranceOperator thread safe (#9422)
ErrantRecordReporter uses a RetryWithToleranceOperator instance, which is necessarily stateful, having a ProcessingContext of which there's supposed to be one per task. That ProcessingContext is used by both RetryWithToleranceOperator.executeFailed() and execute(), so it's not enough to just synchronize executeFailed().

So make all public methods of RetryWithToleranceOperator synchronized so that RetryWithToleranceOperator is now threadsafe.

Tested with the addition of a multithreaded test case that fails consistently if the methods are not properly synchronized. 

Reviewers: Konstantine Karantasis <k.karantasis@gmail.com>
2020-10-15 11:54:46 -07:00
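The reasoning in KAFKA-10602 above (two public methods share one piece of mutable state, so synchronizing only one of them is not enough) can be illustrated with a small sketch. `TolerantOperator` and its fields are hypothetical and much simpler than `RetryWithToleranceOperator`; the shared counters stand in for the per-task `ProcessingContext`.

```java
// Hypothetical simplified operator; names are illustrative, not Kafka's API.
final class TolerantOperator {
    // Shared mutable state touched by every public method.
    private int attempts = 0;
    private String lastError = null;

    // Runs the operation and records the outcome in the shared state.
    public synchronized void execute(Runnable operation) {
        attempts++;
        try {
            operation.run();
            lastError = null;
        } catch (RuntimeException e) {
            lastError = e.getMessage();
        }
    }

    // Records a failure reported from outside; it uses the same state as execute(),
    // so it must hold the same lock.
    public synchronized void executeFailed(String error) {
        attempts++;
        lastError = error;
    }

    public synchronized int attempts() {
        return attempts;
    }

    public synchronized String lastError() {
        return lastError;
    }
}
```

With every public method synchronized on the same instance, concurrent callers of `execute` and `executeFailed` cannot interleave their updates, which is the property a multithreaded test like the one mentioned in the commit would exercise.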
Luke Chen c217788e69
KAFKA-10340: Improve trace logging under connector based topic creation (#9149)
Reviewers: Konstantine Karantasis <k.karantasis@gmail.com>
2020-10-15 11:14:54 -07:00
Kowshik Prakasam b752097f84
MINOR: Check for active controller in UpdateFeatures request processing logic (#9436)
Reviewers: Jun Rao <junrao@gmail.com>
2020-10-15 10:23:05 -07:00
Guozhang Wang 236daf294d
MINOR: more log4j entry on elect / resignation of coordinators (#9416)
When a coordinator module is elected or resigns, our log entry is usually associated with a background scheduler that loads or unloads entries, so the exact time of the election or resignation is unclear, and we then have to compare with the KafkaAPI's log entry for leaderAndISR / StopReplica to infer the actual time. Adding a couple of new log entries that indicate the exact time when it happens is helpful.

Reviewers: Boyang Chen <boyang@confluent.io>, Lee Dongjin <dongjin@apache.org>, Bruno Cadonna <bruno@confluent.io>
2020-10-15 10:08:55 -07:00
Guozhang Wang c40985049f
KAFKA-10613: Only set leader epoch when list-offset version >= 4 (#9438)
The leader epoch field was added in version 4, and the auto-generated protocol code throws an unsupported version exception if the field is set to a non-default value for version < 4. This would cause clients on older versions to never receive list-offset results.

Reviewers: Boyang Chen <boyang@confluent.io>
2020-10-15 10:01:51 -07:00
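The version check described in KAFKA-10613 above can be sketched roughly as follows. The class names, the `-1` default, and the `build` helper are assumptions made for illustration; only the rule itself (set the leader epoch when the request version is at least 4, otherwise leave the default) comes from the commit message.

```java
import java.util.Optional;

// Hypothetical response entry; not the generated Kafka protocol class.
final class ListOffsetResult {
    static final int NO_LEADER_EPOCH = -1; // assumed default value
    final long offset;
    int leaderEpoch = NO_LEADER_EPOCH;

    ListOffsetResult(long offset) {
        this.offset = offset;
    }
}

final class ListOffsetResponses {
    // Per the commit message, the leader epoch field exists from version 4 onward.
    static final short LEADER_EPOCH_MIN_VERSION = 4;

    static ListOffsetResult build(short version, long offset, Optional<Integer> leaderEpoch) {
        ListOffsetResult result = new ListOffsetResult(offset);
        // Leave the field at its default for older versions so serialization
        // does not reject a field the version does not know about.
        if (version >= LEADER_EPOCH_MIN_VERSION) {
            leaderEpoch.ifPresent(epoch -> result.leaderEpoch = epoch);
        }
        return result;
    }
}
```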
Ron Dagostino 1636481c5f
MINOR: fix error in quota_test.py system tests (#9443)
2020-10-15 17:08:42 +01:00
Ismael Juma ebe6595c3d
MINOR: Upgrade to gradle 6.7 (#9440)
This release includes a key fix:
* Zinc leaks its dependencies to user classpath (https://github.com/gradle/gradle/issues/14168)

Release notes:
https://docs.gradle.org/6.7/release-notes.html

Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
2020-10-15 07:54:43 -07:00
Andrey Bozhko 88862cc848
MINOR: Fix typos in DefaultSslEngineFactory javadoc (#9413)
Fix comment typos.

Reviewers: Boyang Chen <boyang@confluent.io>, Lee Dongjin <dongjin@apache.org>
2020-10-14 23:30:42 -07:00
Benoit Maggi da6871943f
KAFKA-10611: Merge log error to avoid double error (#9407)
When using an error tracking system, two error log messages result in two different alerts.
It's best to group the logs and have one error with all the information.

For example, when using Sentry, these two log.error lines create 2 different Issues. One can merge the issues, but it is simpler to have a single error log line.

Signed-off-by: Benoit Maggi <benoit.maggi@gmail.com>

Reviewers: Ewen Cheslack-Postava <me@ewencp.org>, Konstantine Karantasis <k.karantasis@gmail.com>
2020-10-14 20:16:24 -07:00
Colin Patrick McCabe 7f9beeaaaf
MINOR: fix a bug in removing elements from an ImplicitLinkedHashColle… (#9428)
Fix a bug that was introduced by change 86013dc that resulted in incorrect behavior when
deleting through an iterator.

The bug is that the hash table relies on a denseness invariant... if you remove something,
you might have to move some other things. Calling removeElementAtSlot will do this.
Calling removeFromList is not enough.

Reviewers: Jason Gustafson <jason@confluent.io>
2020-10-14 14:53:30 -07:00
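The denseness invariant mentioned in the fix above (#9428) is a general property of open-addressing hash tables: a lookup stops at the first empty slot, so removing an element must not leave a gap in front of elements that probed past it. The sketch below is a generic linear-probing set, not the `ImplicitLinkedHashCollection` code; `ProbingSet` and all of its methods are hypothetical. It shows why clearing a slot alone is insufficient and how re-placing the rest of the run restores the invariant.

```java
// Generic linear-probing set of ints; a hypothetical illustration only.
final class ProbingSet {
    private final Integer[] slots;
    private int size = 0;

    ProbingSet(int capacity) {
        slots = new Integer[capacity];
    }

    private int home(int value) {
        return Math.floorMod(value, slots.length);
    }

    private int next(int index) {
        return (index + 1) % slots.length;
    }

    boolean add(int value) {
        int i = home(value);
        while (slots[i] != null) {
            if (slots[i] == value) return false; // already present
            i = next(i);
        }
        // Always keep one slot empty so that probe loops terminate.
        if (size == slots.length - 1) throw new IllegalStateException("set is full");
        slots[i] = value;
        size++;
        return true;
    }

    boolean contains(int value) {
        // A lookup stops at the first empty slot: this is where denseness matters.
        for (int i = home(value); slots[i] != null; i = next(i)) {
            if (slots[i] == value) return true;
        }
        return false;
    }

    boolean remove(int value) {
        int i = home(value);
        while (slots[i] != null && slots[i] != value) i = next(i);
        if (slots[i] == null) return false;
        slots[i] = null;
        size--;
        // Re-place every element in the rest of the run; otherwise the new gap
        // would hide elements that originally probed past the removed slot.
        for (int j = next(i); slots[j] != null; j = next(j)) {
            int moved = slots[j];
            slots[j] = null;
            size--;
            add(moved);
        }
        return true;
    }
}
```

For example, with capacity 8 the values 1, 9, and 17 all hash to slot 1 and occupy slots 1 to 3. If `remove(1)` only cleared slot 1, `contains(9)` would stop at the empty slot and report false; re-placing the run moves 9 and 17 back so both stay reachable.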
Lee Dongjin 7e9dec707d
KAFKA-9587: Add omitted configs in KafkaProducer javadoc (#8150)
Simple javadoc fix that aligns the properties with the text. 

Reviewers: Konstantine Karantasis <k.karantasis@gmail.com>
2020-10-13 22:23:43 -07:00
Jason Gustafson 8118b6c9f9
KAFKA-10521; Skip partition watch registration when `AlterIsr` is expected (#9353)
Before `AlterIsr` which was introduced in KIP-497, the controller would register watches in Zookeeper for each reassigning partition so that it could be notified immediately when the ISR was expanded and the reassignment could be completed. This notification is not needed with the latest IBP when `AlterIsr` is enabled because the controller will execute all ISR changes itself.

There is one subtle detail. If we are in the middle of a roll in order to bump the IBP, then it is possible for the controller to be on the latest IBP while some of the brokers are still on the older one. In this case, the brokers on the older IBP will not send `AlterIsr`, but we can still rely on the delayed notification through the `isr_notifications` path to complete reassignments. This seems like a reasonable tradeoff since it should be a short window before the roll is completed.

Reviewers: David Jacot <djacot@confluent.io>, Jun Rao <junrao@gmail.com>
2020-10-13 17:53:47 -07:00
Xavier Léauté eab61cad2c
KAFKA-10573 Update connect transforms configs for KIP-629 (#9403)
Changes the Connect `ReplaceField` SMT's configuration properties, deprecating and replacing `blacklist` with `exclude`, and `whitelist` with `include`. The old configurations are still allowed (ensuring backward compatibility), but warning messages are written to the log to suggest users change to `include` and `exclude`.

This is part of KIP-629.

Author: Xavier Léauté <xvrl@apache.org>
Reviewer: Randall Hauch <rhauch@gmail.com>
2020-10-13 18:13:44 -05:00
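The backward-compatible rename described in KAFKA-10573 above (old property names still accepted, with a warning suggesting the new ones) might look roughly like the sketch below. `DeprecatedConfigTranslator` is a hypothetical helper, not the utility added in Kafka; the `blacklist`/`exclude` and `whitelist`/`include` pairs come from the commit message, while the choice to let the new name win when both are set is an assumption of this sketch.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper that maps deprecated config names to their replacements.
final class DeprecatedConfigTranslator {
    // Deprecated name -> replacement name, as listed in the commit message.
    static final Map<String, String> RENAMES = new LinkedHashMap<>();
    static {
        RENAMES.put("blacklist", "exclude");
        RENAMES.put("whitelist", "include");
    }

    // Returns a copy of the config that uses only the new names and appends
    // one warning per deprecated name that was found.
    static Map<String, String> translate(Map<String, String> original, List<String> warnings) {
        Map<String, String> result = new LinkedHashMap<>(original);
        for (Map.Entry<String, String> rename : RENAMES.entrySet()) {
            String oldName = rename.getKey();
            String newName = rename.getValue();
            if (result.containsKey(oldName)) {
                String value = result.remove(oldName);
                warnings.add("Config '" + oldName + "' is deprecated; use '" + newName + "' instead.");
                // Assumption: an explicitly set new name takes precedence over the old one.
                result.putIfAbsent(newName, value);
            }
        }
        return result;
    }
}
```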
Xavier Léauté 26e9058aa0
MINOR internal KIP-629 changes to methods and variables
cc gwenshap

Author: Xavier Léauté <xvrl@apache.org>

Reviewers: Gwen Shapira

Closes #9405 from xvrl/minor-kip-629-vars
2020-10-13 14:52:04 -07:00
Xavier Léauté 0c044a77cb
MINOR rename kafka.utils.Whitelist to IncludeList
rename internal classes, methods, and related constants for KIP-629

Author: Xavier Léauté <xvrl@apache.org>

Reviewers: Gwen Shapira

Closes #9400 from xvrl/rename-topic-includelist
2020-10-13 12:36:52 -07:00
Xavier Léauté f46d4f4fce
KAFKA-10570; Rename JMXReporter configs for KIP-629
* rename whitelist/blacklist to include/exclude
* add utility methods to translate deprecated configs

Author: Xavier Léauté <xvrl@apache.org>

Reviewers: Gwen Shapira

Closes #9367 from xvrl/kafka-10570
2020-10-13 12:33:05 -07:00
Andy Coates 40ad4fe0ae
KAFKA-10494: Eager handling of sending old values (#9415)
Nodes that are materialized should not forward requests to `enableSendingOldValues` to parent nodes, as they themselves can handle fulfilling this request. However, some instances of `KTableProcessorSupplier` were still forwarding requests to parent nodes, which was causing unnecessary materialization of table sources.

The following instances of `KTableProcessorSupplier` have been updated to not forward `enableSendingOldValues` to parent nodes if they themselves are materialized and can handle sending old values downstream:

 * `KTableFilter`
 * `KTableMapValues`
 * `KTableTransformValues`

Other instances of `KTableProcessorSupplier` have not been modified for reasons given below:
 * `KTableSuppressProcessorSupplier`: though it has a `storeName` field, it didn't seem right for this to handle sending old values itself. Its only job is to suppress output.
 * `KTableKTableAbstractJoin`: doesn't have a store name, i.e. it is never materialized, so can't handle the call itself.
 * `KTableKTableJoinMerger`: table-table joins already have materialized sources, which are sending old values. It would be an unnecessary performance hit to have this class do a lookup to retrieve the old value from its store.
 * `KTableReduce`: is always materialized and already handling the call without forwarding
 * `KTableAggregate`: is always materialized and already handling the call without forwarding

Reviewer: Matthias J. Sax <matthias@confluent.io>
2020-10-13 11:19:05 -07:00
John Roesler 27b0e35e7a
KAFKA-10437: Implement new PAPI support for test-utils (#9396)
Implements KIP-478 for the test-utils module:
* adds mocks of the new ProcessorContext and StateStoreContext
* adds tests that all stores and store builders are usable with the new mock
* adds tests that the new Processor api is usable with the new mock
* updates the demonstration Processor to the new api

Reviewers: Guozhang Wang <guozhang@apache.org>
2020-10-13 11:15:22 -05:00
Jason Gustafson a72f0c1eac
KAFKA-10533; KafkaRaftClient should flush log after appends (#9352)
This patch adds missing flush logic to `KafkaRaftClient`. The initial flushing behavior is simplistic. We guarantee that the leader will not replicate above the last flushed offset and we guarantee that the follower will not fetch data above its own flush point. More sophisticated flush behavior is proposed in KAFKA-10526.

We have also extended the simulation test so that it covers flush behavior. When a node is shut down, all unflushed data is lost. We were able to confirm that the monotonic high watermark invariant fails without the added `flush` calls.

This patch also piggybacks a fix to the `TestRaftServer` implementation. The initial check-in contained a bug which caused `RequestChannel` to fail sending responses because the disabled APIs did not have metrics registered. As a result, it was impossible to elect leaders.

Reviewers: Guozhang Wang <wangguoz@gmail.com>
2020-10-13 08:59:02 -07:00
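The flush guarantees described in KAFKA-10533 above can be pictured with a toy in-memory log. This is a sketch under simplifying assumptions, not `KafkaRaftClient`: `FlushTrackingLog` and its methods are hypothetical, offsets are list indexes, and a crash is modeled by discarding everything above the flush point.

```java
import java.util.ArrayList;
import java.util.List;

// Toy log that tracks which prefix of its records has been flushed.
final class FlushTrackingLog {
    private final List<String> records = new ArrayList<>();
    private int flushedOffset = 0; // exclusive: records below this offset are durable

    // Appends a record and returns the new end offset.
    int append(String record) {
        records.add(record);
        return records.size();
    }

    // Marks everything appended so far as durable.
    void flush() {
        flushedOffset = records.size();
    }

    // Only flushed records are handed out for replication.
    List<String> readForReplication(int fromOffset) {
        int start = Math.max(0, fromOffset);
        if (start >= flushedOffset) return new ArrayList<>();
        return new ArrayList<>(records.subList(start, flushedOffset));
    }

    // Models a shutdown: all unflushed data is lost.
    void simulateCrash() {
        records.subList(flushedOffset, records.size()).clear();
    }

    int endOffset() {
        return records.size();
    }

    int flushedOffset() {
        return flushedOffset;
    }
}
```

Because replication never reads past `flushedOffset`, a crash cannot remove a record that another node has already received, which is the intuition behind keeping the high watermark monotonic.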
huxi 1457cc6525
MINOR: Fix shouldNotResetEpochHistoryHeadIfUndefinedPassed (#9218)
In LeaderEpochFileCacheTest.scala, code is identical for `shouldNotResetEpochHistoryHeadIfUndefinedPassed` and `shouldNotResetEpochHistoryTailIfUndefinedPassed`. Seems `truncateFromStart` should be invoked in `shouldNotResetEpochHistoryHeadIfUndefinedPassed` instead of `truncateFromEnd`.
2020-10-13 10:06:47 +08:00
Jason Gustafson aba9036eb6
KAFKA-10143; Improve test coverage for throttle changes during reassignment (#8891)
In KIP-455, we changed the behavior of the reassignment tool so that the `--additional` flag is required in order to use the command to alter the throttle. This patch improves the documentation to make this clearer and adds some integration tests to validate the behavior.

This patch also contains a few minor code quality improvements:

- Factor out a helper `calculateCurrentMoveMap` from `calculateMoveMap` to compute the current move map, which makes the logic easier to follow
- Rename `calculateMoveMap` to `calculateProposedMoveMap` to make intention clearer
- Split `modifyBrokerThrottles` into two methods `modifyLogDirThrottle` and `modifyInterBrokerThrottle`
- Move logic to compute leader and follower throttles into a new method `modifyReassignmentThrottle`, which takes it out of the execution path when log dir throttles are changed
- Minor stylistic improvements such as replacing `.map.flatten` with `.flatMap`

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Colin P. McCabe <cmccabe@apache.org>, Ismael Juma <ismael@juma.me.uk>
2020-10-12 12:36:46 -07:00
Chris Egerton 0a93d2b1af
KAFKA-10574: Fix infinite loop in Values::parseString (#9375)
Fix infinite loop in Values::parseString

Author: Chris Egerton <chrise@confluent.io>
Reviewers: Konstantine Karantasis <konstantine@confluent.io>, Randall Hauch <rhauch@gmail.com>
2020-10-12 11:42:42 -05:00
John Roesler 1f8ac6e6fe
KAFKA-10598: Improve IQ name and type checks (#9408)
Previously, we would throw a confusing error, "the store has migrated,"
when users ask for a store that is not in the topology at all, or when the
type of the store doesn't match the QueryableStoreType parameter.

Adds an up-front check that the requested store is registered and also
a better error message when the QueryableStoreType parameter
doesn't match the store's type.

Reviewers: Guozhang Wang <guozhang@apache.org>
2020-10-12 09:34:32 -05:00
huxi a73bf5931a
KAFKA-10584: IndexSearchType should use sealed trait instead of Enumeration (#9399)
https://issues.apache.org/jira/browse/KAFKA-10584

In Scala, we prefer sealed traits over Enumeration since the former gives you exhaustiveness checking. With Scala Enumeration, you don't get a warning if you add a new value that is not handled in a given pattern match.
2020-10-10 11:34:45 +08:00
Gardner Vickers 24290de828
KAFKA-9393: DeleteRecords may cause extreme lock contention for large partition directories (#7929)
This PR avoids a performance issue with DeleteRecords when a partition directory contains a large number of files. Previously, DeleteRecords would iterate the partition directory searching for producer state snapshot files. With this change, the iteration is removed in favor of keeping a 1:1 mapping between producer state snapshot file and segment file. A segment file's corresponding producer state snapshot file is now deleted when the segment file is deleted.

Reviewers: Jason Gustafson <jason@confluent.io>, Ismael Juma <ismael@juma.me.uk>, Jun Rao <junrao@gmail.com>
2020-10-09 14:25:01 -07:00
Colin Patrick McCabe 46e48d7f22
MINOR: Implement ApiError#equals and hashCode (#9390)
Reviewers: Mickael Maison <mickael.maison@gmail.com>
2020-10-09 10:25:54 -07:00
Ron Dagostino 147a19036e
MINOR: ACLs for secured cluster system tests (#9378)
This PR adds missing broker ACLs required to create topics and SCRAM credentials when ACLs are enabled for a system test. This PR also adds support for using PLAINTEXT as the inter-broker security protocol when using SCRAM from the client in a system test with a secured cluster; without this, it would always be necessary to set both the inter-broker and client mechanisms to a SCRAM mechanism. Also contains some refactoring to make assumptions clearer.

Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
2020-10-09 15:34:53 +01:00
Xavier Léauté 7947c18b57
MINOR update comments and docs to be gender-neutral
While this is not technically part of KIP-629, I believe this makes our codebase more inclusive as well.

cc gwenshap

Author: Xavier Léauté <xvrl@apache.org>

Reviewers: Gwen Shapira

Closes #9398 from xvrl/neutral-term
2020-10-08 17:05:15 -07:00
Xavier Léauté 4ab72780dd
KAFKA-10571; Replace blackout with backoff for KIP-629
This replaces code and comment occurrences as described in the KIP

Author: Xavier Léauté <xvrl@apache.org>

Reviewers: Gwen Shapira, Mickael Maison

Closes #9366 from xvrl/kafka-10571
2020-10-08 15:54:59 -07:00
voffcheg109 5fc3f73f08
KAFKA-7334: Suggest changing config for state.dir in case of FileNotFoundException (#9380)
Add warning logs and improve existing log messages for `FileNotFoundException` and for the case where /tmp is used as the state directory.

Reviewers: A. Sophie Blee-Goldman <sophie@confluent.io>, Matthias J. Sax <matthias@confluent.io>
2020-10-08 12:20:21 -07:00
Dima Reznik cc54000e72
KAFKA-10271: Performance regression while fetching a key from a single partition (#9020)
StreamThreadStateStoreProvider loops excessively over calls to internalTopologyBuilder.topicGroups(), which is synchronized, causing significant performance degradation for the caller, especially when the store has many partitions.

Reviewers: John Roesler <vvcephei@apache.org>, Guozhang Wang <wangguoz@gmail.com>
2020-10-08 10:12:33 -07:00
Kowshik Prakasam de4183485b
KAFKA-10028: Minor fixes to describeFeatures and updateFeatures apis (#9393)
In this PR, I have addressed the review comments from @chia7712 in #9001 which were provided after #9001 was merged. The changes are made mainly to KafkaAdminClient:

* Improve error message in updateFeatures api when feature name is empty.
* Propagate top-level error message in updateFeatures api.
* Add an empty-parameter variety for describeFeatures api.
* Minor documentation updates to @param and @return to make these resemble other apis.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Jun Rao <junrao@gmail.com>
2020-10-08 10:05:29 -07:00
Chia-Ping Tsai de546ba827
MINOR: correct package of LinuxIoMetricsCollector (#9271)
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Lee Dongjin <dongjin@apache.org>
2020-10-08 16:58:50 +02:00
Chia-Ping Tsai 68401a40b2
MINOR: remove unused scala files from core module (#9296)
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Lee Dongjin <dongjin@apache.org>
2020-10-08 16:10:08 +02:00