kafka

Commit Graph

Author	SHA1	Message	Date
Tom Bentley	6cfed8ad00	KAFKA-9651: Fix ArithmeticException (÷ by 0) in DefaultStreamPartitioner (#8226 ) In Streams `StreamsMetadataState.getMetadataWithKey`, we should use the inferred max topic partitions passed in directly from the caller than relying on cluster to contain its topic-partition information. Reviewers: Guozhang Wang <wangguoz@gmail.com>	2020-03-23 10:07:54 -07:00
Hossein Torabi	afe26e2992	KAFKA-9563: Fix Kafka Connect documentation around consumer and producer overrides (#8124 ) Kafka Connect main doc required a fix to distinguish between worker level producer and consumer overrides and per-connector level producer and consumer overrides, after the latter were introduced with KIP-458. * [KAFKA-9563] Fix Kafka connect consumer and producer override documentation Co-authored-by: Konstantine Karantasis <konstantine@confluent.io> Reviewers: Konstantine Karantasis <konstantine@confluent.io>	2020-03-22 20:07:21 -07:00
Matthias J. Sax	1ad5f346cb	KAFKA-9451: Enable producer per thread for Streams EOS (#8318 ) - KIP-447 - add new configs to enable producer per thread EOS - updates TaskManager to use single shared producer for eos-beta Reviewers: Boyang Chen <boyang@confluent.io>, Guozhang Wang <guozhang@confluent.io>	2020-03-22 13:50:07 -07:00
Tom Bentley	8aa1861816	KAFKA-9634: Add note about thread safety in the ConfigProvider interface (#8205 ) In Kafka Connect, a ConfigProvider instance can be used concurrently (e.g. via a PUT request to the `/connector-plugins/{connectorType}/config/validate` REST endpoint), but there is no mention of concurrent usage in the Javadocs of the ConfigProvider interface. It's worth calling out that implementations need to be thread safe. Reviewers: Konstantine Karantasis <konstantine@confluent.io>	2020-03-22 12:58:21 -07:00
nicolasguyomar	f8173c2df5	MINOR: Update Connect error message to point to the correct config validation REST endpoint (#7991 ) When incorrect connector configuration is detected, the returned exception message suggests to check the connector's configuration against the `{connectorType}/config/validate` endpoint. Changing the error message to refer to the exact REST endpoint which is `/connector-plugins/{connectorType}/config/validate` This aligns the exception message with the documentation at: https://kafka.apache.org/documentation/#connect_rest Reviewers: Konstantine Karantasis <konstantine@confluent.io>	2020-03-22 12:42:45 -07:00
Chia-Ping Tsai	4f6907947a	MINOR: fix linking errors in javadoc (#8198 ) This improvement fixes several linking errors to classes and methods from within javadocs. Related to #8291 Reviewers: Konstantine Karantasis <konstantine@confluent.io>	2020-03-22 10:15:18 -07:00
Alaa Zbair	5ccd3cd46d	KAFKA-8842: : Reading/Writing confused in Connect QuickStart Guide In step 7 of the QuickStart guide, "Writing data from the console and writing it back to the console" should be "Reading data from the console and writing it back to the console". Co-authored-by: Alaa Zbair <alaa.zbair@grenoble-inp.org> Reviewer: Konstantine Karantasis <konstantine@confluent.io>	2020-03-21 22:00:33 -07:00
A. Sophie Blee-Goldman	6cf27c9c77	KAFKA-6145: Pt 2.5 Compute overall task lag per client (#8252 ) Once we have encoded the offset sums per task for each client, we can compute the overall lag during assign by fetching the end offsets for all changelog and subtracting. If the listOffsets request fails, we simply return a "completely sticky" assignment, ie all active tasks are given to previous owners regardless of balance. Builds (but does not yet use) the statefulTasksToRankedCandidates map with the ranking: Rank -1: active running task Rank 0: standby or restoring task whose overall lag is within acceptableRecoveryLag Rank 1: tasks whose lag is unknown (eg during version probing) Rank 1+: all other tasks are ranked according to their actual total lag Implements: KIP-441 Reviewers: Bruno Cadonna <bruno@confluent.io>, John Roesler <vvcephei@apache.org>	2020-03-21 13:40:34 -05:00
Lucas Bradstreet	c16938fd96	MINOR: allow retries for unitTest and integrationTest runs (#8323 ) We will currently retry if you run gradle test, but not unitTest or integrationTest, which are used directly in Jenkins. This means that we have not been achieving the expected retry behavior. Reviewers: Ismael Juma <ismael@juma.me.uk>	2020-03-21 09:09:29 -07:00
Matthias J. Sax	635b5fd47c	KAFKA-9741: Update ConsumerGroupMetadata before calling onPartitionsRevoked() (#8325 ) If partitions are revoked, an application may want to commit the current offsets. Using transactions, committing offsets would be done via the producer passing in the current ConsumerGroupMetadata. If the metadata is not updates before the callback, the call to commitTransaction(...) fails as and old generationId would be used. Reviewers: Guozhang Wang <wangguoz@gmail.com>	2020-03-20 21:34:08 -07:00
Boyang Chen	c75dc5e2e0	KAFKA-9701 (fix): Only check protocol name when generation is valid (#8324 ) This bug was incurred by #7994 with a too-strong consistency check. It is because a reset generation operation could be called in between the joinGroupRequest -> joinGroupResponse -> SyncGroupRequest -> SyncGroupResponse sequence of events, if user calls unsubscribe in the middle of consumer#poll(). Proper fix is to avoid the protocol name check when the generation is invalid. Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>	2020-03-20 21:26:57 -07:00
Konstantine Karantasis	406635bcc9	MINOR: Use Exit.exit instead of System.exit in MM2 (#8321 ) Exit.exit needs to be used in code instead of System.exit. Particularly in integration tests using System.exit is disrupting because it exits the jvm process and does not just fail the test correctly. Integration tests override procedures in Exit to protect against such cases. Reviewers: Ryanne Dolan <ryannedolan@gmail.com>, Ismael Juma <ismael@juma.me.uk>, Randall Hauch <rhauch@gmail.com>	2020-03-20 16:39:17 -07:00
Boyang Chen	c249ea8e5d	KAFKA-9727: cleanup the state store for standby task dirty close and check null for changelogs (#8307 ) This PR fixes three things: * the state should be closed when standby task is restoring as well * the EOS standby task should also wipe out state under dirty close * the changelog reader should check for null as well Reviewers: Guozhang Wang <wangguoz@gmail.com>	2020-03-20 15:26:13 -07:00
Bruno Cadonna	cbdd0d6cf1	KAFKA-6145: Add constrained balanced assignment algorithm (#8262 ) Adds a currently unused component of the new Streams assignment algorithm. Implements: KIP-441 Reviewers: A. Sophie Blee-Goldman <sophie@confluent.io>, John Roesler <vvcephei@apache.org>	2020-03-20 13:51:24 -05:00
John Roesler	960b216290	KAFKA-9734: Fix IllegalState in Streams transit to standby (#8319 ) Consolidate ChangelogReader state management inside of StreamThread to avoid having to reason about all execution paths in both StreamThread and TaskManager. Reviewers: Guozhang Wang <wangguoz@gmail.com>	2020-03-20 10:17:51 -05:00
Agam Brahma	24d05aa601	KAFKA-9553; Improve measurement for loading groups and transactions (#8155 ) This patch modifies the loading time metric to account for time spent waiting for the loading time task to be scheduled. Reviewers: Jason Gustafson <jason@confluent.io>	2020-03-19 23:35:16 -07:00
Colin Patrick McCabe	56051e7639	KAFKA-8820: kafka-reassign-partitions.sh should support the KIP-455 API (#8244 ) Rewrite ReassignPartitionsCommand to use the KIP-455 API when possible, rather than direct communication with ZooKeeper. Direct ZK access is still supported, but deprecated, as described in KIP-455. As specified in KIP-455, the tool has several new flags. --cancel stops an assignment which is in progress. --preserve-throttle causes the --verify and --cancel commands to leave the throttles alone. --additional allows users to execute another partition assignment even if there is already one in progress. Finally, --show displays all of the current partition reassignments. Reorganize the reassignment code and tests somewhat to rely more on unit testing using the MockAdminClient and less on integration testing. Each integration test where we bring up a cluster seems to take about 5 seconds, so it's good when we can get similar coverage from unit tests. To enable this, MockAdminClient now supports incrementalAlterConfigs, alterReplicaLogDirs, describeReplicaLogDirs, and some other APIs. MockAdminClient is also now thread-safe, to match the real AdminClient implementation. In DeleteTopicTest, use the KIP-455 API rather than invoking the reassignment command.	2020-03-19 20:44:34 -07:00
Chia-Ping Tsai	c27f629e95	KAFKA-9654; Update epoch in `ReplicaAlterLogDirsThread` after new LeaderAndIsr (#8223 ) Currently when there is a leader change with a log dir reassignment in progress, we do not update the leader epoch in the partition state maintained by `ReplicaAlterLogDirsThread`. This can lead to a FENCED_LEADER_EPOCH error, which results in the partition being marked as failed, which is a permanent failure until the broker is restarted. This patch fixes the problem by updating the epoch in `ReplicaAlterLogDirsThread` after receiving a new LeaderAndIsr request from the controller. Reviewers: Jun Rao <junrao@gmail.com>, Jason Gustafson <jason@confluent.io>	2020-03-19 16:49:35 -07:00
jiameixie	11e6aedff6	MINOR: Bump RocksDB version from 5.18.3 to 5.18.4 (#8284 ) Reviewers: A. Sophie Blee-Goldman <sophie@confluent.io>, Matthias J. Sax <matthias@confluent.io>	2020-03-19 15:50:23 -07:00
Matthias J. Sax	21cfd0b453	MINOR: Fix generic types in StreamsBuilder and Topology (#8273 ) Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Guozhang Wang <guozhang@confluent.io>, John Roesler <john@confluent.io>	2020-03-19 14:29:15 -07:00
Matthias J. Sax	89cd2f2a0b	KAFKA-9441: Unify committing within TaskManager (#8218 ) - part of KIP-447 - commit all tasks at once using non-eos (and eos-beta in follow up work) - unified commit logic into TaskManager - split existing methods of Task interface in pre/post parts Reviewers: Boyang Chen <boyang@confluent.io>, Guozhang Wang <guozhang@confluent.io>	2020-03-19 11:31:51 -07:00
A. Sophie Blee-Goldman	9ee8277cdd	KAFKA-6145: Add new assignment configs Add 4 new assignor configs in preparation for the new assignment algorithm: 1. acceptable.recovery.lag 2. balance.factor 3. max.warmup.replicas 4. probing.rebalance.interval.ms Implements: KIP-441 Reviewers: Bruno Cadonna <bruno@confluent.io>, John Roesler <vvcephei@apache.org>	2020-03-19 13:19:14 -05:00
Bruno Cadonna	0174c95f4f	MINOR: Fix javadoc warning in StreamsMetric (#8314 ) Reviewers: Matthias J. Sax <mjsax@apache.org>, Bill Bejeck <bbejeck@apache.org>	2020-03-19 13:12:24 -04:00
Tom Bentley	4f1e8331ff	KAFKA-9435: DescribeLogDirs automated protocol (#7972 ) Reviewers: Mickael Maison <mickael.maison@gmail.com>	2020-03-19 15:26:18 +00:00
Lucas Bradstreet	5c0cf02947	MINOR: return unmodifiableMap for PartitionStates.partitionStateMap. (#7637 ) Makes the map returned by partitionStateMap unmodifiable to prevent mutation. Reviewers: Ismael Juma <ismael@juma.me.uk>	2020-03-19 08:19:51 -07:00
Boyang Chen	c7164a3866	KAFKA-8618: Replace Txn marker with automated protocol (#7039 ) Reviewers: Mickael Maison <mickael.maison@gmail.com>	2020-03-19 14:26:41 +00:00
A. Sophie Blee-Goldman	85c96f5230	KAFKA-9568: enforce rebalance if client endpoint has changed (#8299 ) Since the assignment info includes a map with all member's host info, we can just check the received map to make sure our endpoint is contained. If not, we need to force the group to rebalance and get our updated endpoint info. Reviewers: Boyang Chen <boyang@confluent.io>, Guozhang Wang <wangguoz@gmail.com>	2020-03-18 18:37:38 -07:00
Guozhang Wang	b1999ba22d	KAFKA-5604: Remove the redundant TODO marker on the Streams side (#8313 ) The issue itself has been fixed a while ago on the producer side, so we can just remove this TODO marker now (we've removed the isZombie flag already anyways). Reviewers: John Roesler <vvcephei@apache.org>	2020-03-18 14:22:53 -07:00
Boyang Chen	b586283c53	KAFKA-9656; Return COORDINATOR_NOT_AVAILABLE for older producer clients (#8253 ) The `TxnOffsetCommit` API suffers from a bug affecting older client versions which treat `COORDINATOR_LOAD_IN_PROGRESS` errors as fatal. This PR changes the handling on the broker to instead return `COORDINATOR_NOT_AVAILABLE` in this case so that clients won't crash upon doing txn commit. Reviewers: Jason Gustafson <jason@confluent.io>	2020-03-18 12:25:36 -07:00
Sanjana Kaundinya	34a7ba56a1	KAFKA-9047; AdminClient group operations should respect retries and backoff (#8161 ) Previously, `AdminClient` group operations did not respect a `Call`'s number of configured tries and retry backoff. This could lead to tight retry loops that put a lot of pressure on the broker. This PR introduces fixes that ensures for all group operations the `AdminClient` respects the number of tries and the backoff a given `Call` has. Reviewers: Vikas Singh <vikas@confluent.io>, Jason Gustafson <jason@confluent.io>	2020-03-18 12:19:40 -07:00
Guozhang Wang	c0cff61e8c	HOTFIX: do not depend on file modified time in StateDirectoryTest	2020-03-18 11:31:23 -07:00
Ismael Juma	93f082e093	MINOR: Update Scala to 2.12.11 (#8308 ) Highlights: * Performance improvements in the ollections library: algorithmic improvements and changes to avoid unnecessary allocations. * Performance improvements in the compiler. * ASM was upgraded to 7.3.1, allowing the optimizer to run on JDK 13+. Full release notes: https://github.com/scala/scala/releases/tag/v2.12.11 Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>	2020-03-18 09:33:14 -07:00
belugabehr	da7d134640	KAFKA-9404: Use ArrayList instead of LinkedList in Sensor (#7936 ) The former is generally a better option than the latter. Reviewers: Ron Dagostino <rdagostino@confluent.io>, Ismael Juma <ismael@juma.me.uk>	2020-03-18 06:38:36 -07:00
Sanjana Kaundinya	5fc3cd61fc	KAFKA-9625: Fix altering and describing dynamic broker configurations (#8260 ) * Broker throttles were incorrectly marked as sensitive configurations. Fix this, so that their values can be returned via DescribeConfigs as expected. * Previously, changes to broker configs that consisted only of deletions were ignored by the brokers because of faulty delta calculation logic that didn't consider deletions as changes, only alterations as changes. Fix this and add a regression test. Reviewers: Colin P. McCabe <cmccabe@apache.org>	2020-03-17 23:02:33 -07:00
A. Sophie Blee-Goldman	d38e97e319	MINOR: clean up required setup for StreamsPartitionAssignorTest (#8306 ) No logical or behavioral changes, just a bit of cleanup in this class before we have to write and fix a lot of these tests for KIP-441: * Moved creation of streamsMetadata mock to setUp (in exactly one test it will be overwritten with a strict mock) * Tried to clean up the use of helper methods for configuring the assignor. Reviewers: Guozhang Wang <wangguoz@gmail.com>	2020-03-17 22:14:00 -07:00
Chia-Ping Tsai	f08c9c7de6	HOTFIX: fix flaky StateDirectoryTest.shouldReturnEmptyArrayIfListFilesReturnsNull (#8310 ) StateDirectoryTest.shouldReturnEmptyArrayIfListFilesReturnsNull always moves the stage dir to /tmp/state-renamed so it always fails if there is already a folder (for example, the stuff leaved by previous test). Reviewers: Boyang Chen <boyang@confluent.io>, A. Sophie Blee-Goldman <sophie@confluent.io>, Guozhang Wang <wangguoz@gmail.com>	2020-03-17 22:13:13 -07:00
Guozhang Wang	6a88d32b9f	KAFKA-8803: Remove timestamp check in completeTransitionTo (#8278 ) In prepareAddPartitions the txnStartTimestamp could be updated as updateTimestamp, which is assumed to be always larger then the original startTimestamp. However, due to ntp time shift the timer may go backwards and hence the newStartTimestamp be smaller than the original one. Then later in completeTransitionTo the time check would fail with an IllegalStateException, and the txn would not transit to Ongoing. An indirect result of this, is that this txn would NEVER be expired anymore because only Ongoing ones would be checked for expiration. We should do the same as in #3286 to remove this check. Also added test coverage for both KAFKA-5415 and KAFKA-8803. Reviewers: Jason Gustafson<jason@confluent.io>	2020-03-17 14:40:02 -07:00
Lucas Bradstreet	97156256c7	MINOR: double -Xss setting from 2m to 4m in build.gradle (#8264 ) I have seen an increased incidence in StackOverflowError(s) when compiling scala. This change doubles the max stack size to 4m. ``` > Task :core:compileScala FAILED FAILURE: Build failed with an exception. * What went wrong: Execution failed for task ':core:compileScala'. > java.lang.StackOverflowError (no error message) ``` Reviewers: Andrew Choi <a24choi@edu.uwaterloo.ca>, Ismael Juma <ismael@juma.me.uk>	2020-03-17 09:12:40 -07:00
A. Sophie Blee-Goldman	673018504f	MINOR: cleanup and add tests to StateDirectoryTest (#8304 ) Adds tests for edge conditions of listAllTaskDirectories Also includes some minor cleanup of the StateDirectoryTest class Reviewers: Guozhang Wang <wangguoz@gmail.com>	2020-03-16 21:59:02 -07:00
Boyang Chen	1e6d944813	HOTFIX: StateDirectoryTest should use Set instead of List (#8305 ) Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <guozhang@confluent.io>	2020-03-16 21:37:46 -07:00
Matthias J. Sax	dffc7f8c30	MINOR: Fix build and JavaDoc warnings (#8291 ) Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, @SoontaekLim, Bill Bejeck <bill@confluent.io>	2020-03-16 18:23:21 -07:00
Brian Byrne	fc79853c4d	MINOR: Fix kafka.server.RequestQuotaTest missing new ApiKeys. (#8302 ) The test was broken by commit `227a7322b7`. Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Colin P. McCabe <cmccabe@apache.org>	2020-03-16 16:21:45 -07:00
Nigel Liang	569cf994b0	KAFKA-9712: Catch and handle exception thrown by reflections scanner (#8289 ) This commit works around a bug in version v0.9.12 of the upstream `reflections` library by catching and handling the exception thrown. The reflections issue is tracked by: https://github.com/ronmamo/reflections/issues/273 New unit tests were introduced to test the behavior. * KAFKA-9712: Catch and handle exception thrown by reflections scanner * Update connect/runtime/src/main/java/org/apache/kafka/connect/runtime/isolation/DelegatingClassLoader.java Co-Authored-By: Konstantine Karantasis <konstantine@confluent.io> * Move result initialization back to right before it is used * Use `java.io.File` in tests * Fix checkstyle Co-authored-by: Konstantine Karantasis <konstantine@confluent.io> Reviewers: Konstantine Karantasis <konstantine@confluent.io>, Chia-Ping Tsai <chia7712@gmail.com>	2020-03-16 14:43:10 -07:00
Manikumar Reddy	a0e1407820	KAFKA-9670; Reduce allocations in Metadata Response preparation (#8236 ) This PR removes intermediate conversions between `MetadataResponse.TopicMetadata` => `MetadataResponseTopic` and `MetadataResponse.PartitionMetadata` => `MetadataResponsePartition` objects. There is 15-20% reduction in object allocations and 5-10% improvement in metadata request performance. Reviewers: Ismael Juma <ismael@juma.me.uk>, Jason Gustafson<jason@confluent.io>	2020-03-16 09:30:48 -07:00
Chia-Ping Tsai	31659c3ee1	MINOR: fix Scala 2.13 build error introduced in #8083 (#8301 ) Reviewers: Colin P. McCabe <cmccabe@apache.org>, Brian Byrne <bbyrne@confluent.io>	2020-03-16 08:56:47 -07:00
A. Sophie Blee-Goldman	045c6c3c48	MINOR: enforce non-negative invariant for checkpointed offsets (#8297 ) While discussing KIP-441 we realize we don't strictly enforce that all checkpointed offset sums are positive (or 0, though there's not much point to checkingpoint a 0 offset is there)? Rather than awkwardly try handle this within every user/reader of the checkpoint file, we should just make a guarantee that all returned checkpointed offsets are positive. Reviewers: Guozhang Wang <wangguoz@gmail.com>	2020-03-15 09:30:40 -07:00
Dominic Evans	ddd3dfbfae	MINOR: comment apikey types in generated switch (#8201 ) As a developer, it would be convenient if the generated {request,response}HeaderVersion case statements in ApiMessageType.java included a comment to remind me which type each of them is so I don't need to manually cross-reference the newer/rarer ones. Also include commented lines for the two special cases around ApiVersionsResponse and ControllerShutdownRequest which are hardcoded in the ApiMessageTypeGenerator.java and not covered by the message format json files. Before: ```java public short requestHeaderVersion(short _version) { switch (apiKey) { case 0: return (short) 1; case 1: return (short) 1; case 2: return (short) 1; case 3: if (_version >= 9) { return (short) 2; } else { return (short) 1; } // ...etc ``` After: ```java public short requestHeaderVersion(short _version) { switch (apiKey) { case 0: // Produce return (short) 1; case 1: // Fetch return (short) 1; case 2: // ListOffset return (short) 1; case 3: // Metadata if (_version >= 9) { return (short) 2; } else { return (short) 1; } // ...etc ``` Signed-off-by: Dominic Evans <dominic.evans@uk.ibm.com> Reviewers: Mickael Maison <mickael.maison@gmail.com>	2020-03-15 10:46:09 +00:00
Mickael Maison	2e6b15813c	MINOR: Fix typo in CreateTopicsResponse.json (#8300 ) Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Boyang Chen <boyang@confluent.io>	2020-03-15 10:24:37 +00:00
Brian Byrne	227a7322b7	KIP-546: Implement describeClientQuotas and alterClientQuotas. (#8083 ) Reviewers: Colin P. McCabe <cmccabe@apache.org>	2020-03-14 23:03:13 -07:00
Guozhang Wang	605d55dc17	KAFKA-6647: Do note delete the lock file while holding the lock (#8267 ) 1. Inside StateDirectory#cleanRemovedTasks, skip deleting the lock file (and hence the parent directory) until releasing the lock. And after the lock is released only go ahead and delete the parent directory if manualUserCall == true. That is, this is triggered from KafkaStreams#cleanUp and users are responsible to make sure that Streams instance is not started and hence there are no other threads trying to grab that lock. 2. As a result, during scheduled cleanup the corresponding task.dir would not be empty but be left with only the lock file, so effectively we still achieve the goal of releasing disk spaces. For callers of listTaskDirectories like KIP-441 (cc @ableegoldman to take a look) I've introduced a new listNonEmptyTaskDirectories which excludes such dummy task.dirs with only the lock file left. 3. Also fixed KAFKA-8999 along the way to expose the exception while traversing the directory. Reviewers: A. Sophie Blee-Goldman <sophie@confluent.io>, John Roesler <vvcephei@apache.org>	2020-03-14 13:49:08 -07:00

1 2 3 4 5 ...

7371 Commits All Branches Search

7371 Commits

All Branches