Commit Graph

4853 Commits

Author SHA1 Message Date
ro7m ac4374dc24 KAFKA-6732: Fix Streams doc ref link (#4806)
Reviewers: Guozhang Wang <wangguoz@gmail.com>
2018-04-01 18:05:15 -07:00
Guozhang Wang 2e5d4af83f
MINOR: refactor error message of task migration (#4803)
In the stream thread capture of the TaskMigration exception, print the task full information in WARN. In other places only log as INFO, plus additional context information.

Reviewers: Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>
2018-04-01 17:37:00 -07:00
Nick Travers 4106cb1db1 KAFKA-4914: Partition reassignment tool should check types before persisting state in ZooKeeper (#2708)
Prior to this, there have been instances where invalid data was allowed to be persisted in
ZooKeeper, which causes ClassCastExceptions when a broker is restarted and reads this
type-unsafe data.

Adds basic structural and type validation for the reassignment JSON via
introduction of Scala case classes that map to the expected JSON
structure. Also use the Scala case classes to deserialize the JSON
to avoid duplication.

Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Viktor Somogyi <viktor.somogyi@cloudera.com>, Ismael Juma <ismael@juma.me.uk>
2018-03-31 07:57:19 -07:00
gitlw 2ef6ee2338 KAFKA-6630: Speed up the processing of TopicDeletionStopReplicaResponseReceived events on the controller (#4668)
Reviewed by Jun Rao <junrao@gmail.com>
2018-03-29 22:08:28 -07:00
JieFang.He fc0d0021cc KAFKA-6707: The default value for config of Type.LONG should be *L (#4762)
Reviewers: Guozhang Wang <wangguoz@gmail.com>
2018-03-29 16:24:34 -07:00
JieFang.He cb7cf7c5a7 KAFKA-6702: Wrong className in LoggerFactory.getLogger method (#4772)
Reviewers: Manikumar Reddy, Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>, Jason Gustafson <jason@confluent.io>
2018-03-29 15:50:46 -07:00
Bill Bejeck 29838c1042 HOTFIX: ignoring tests using old versions of Streams until KIP-268 is merged (#4766)
Reviewers: Guozhang Wang <wangguoz@gmail.com>
2018-03-28 17:41:08 -07:00
Ismael Juma 77b840b7c4
MINOR: Downgrade to Gradle 4.5.1 (#4791)
There is a regression in 4.6 that causes
`testAll` to fail:

https://github.com/gradle/gradle/pull/4680

Reviewers: Jason Gustafson <jason@confluent.io>
2018-03-28 16:10:26 -07:00
Guozhang Wang 28f1fc2f55
MINOR: Change getMessage to toString (#4790)
Reviewers: Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>
2018-03-28 14:40:21 -07:00
John Roesler 659fbb0b06 MINOR: Depend on streams:test-utils for streams and examples tests (#4760)
Reviewers: Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
2018-03-28 11:00:35 -07:00
Alex D dd7011783f KAFKA-6724; ConsumerPerformance should not always reset to earliest offsets (#4787)
Remove the explicit `seekToBeginning` on startup and instead rely on the consumer's auto offset reset strategy to set the initial position.
2018-03-28 10:33:23 -07:00
huxi 5d5a2ce4bb KAFKA-6716: Should close the `discardChannel` in MockSelector#completeSend (#4783) 2018-03-28 15:03:39 +01:00
Ismael Juma 9baa9bddba MINOR: Update Jackson to 2.9.5 (#4776) 2018-03-27 23:08:07 -07:00
Ismael Juma 281dbfd981
MINOR: LogCleaner.validateReconfiguration fixes (#4770)
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
2018-03-27 21:03:35 -07:00
Boyang Chen 964693e40d KAFKA-6386: Use Properties instead of StreamsConfig in KafkaStreams constructor
This pull request targets https://issues.apache.org/jira/browse/KAFKA-6386
The minor fix to deprecate usage of `StreamsConfig` in favor of `java.util.Properties`.
I created separate public constructors using `Properties` in order to replace the old ones,
and prioritize new functions in the `KafkaStreams.java` file.

Since this is my first time doing open source contribution, I'm very happy to get
any comment or pointer to be more professional and get better next time, thank you Guozhang guozhangwang and Liquan Ishiihara!

testing strategy: existing unit test should be suffice to cover this change.

Author: cs427fa16staff <bchen11@outlook.com>

Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <guozhang@confluent.io>, Bill Bejeck <bill@confluent.io>

Closes #4354 from abbccdda/starter

github comments
2018-03-27 16:09:34 -07:00
John Roesler adbf31ab1d KAFKA-6473: Add MockProcessorContext to public test-utils (#4736)
Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <guozhang@confluent.io>, Bill Bejeck <bill@confluent.io>
2018-03-27 14:03:24 -07:00
Manikumar Reddy O 395c7e0f09 MINOR: Fix ReassignPartitionsClusterTest.testHwAfterPartitionReassignment test (#4781)
Reviewers: Dong Lin <lindong28@gmail.com>, Jason Gustafson <jason@confluent.io>
2018-03-27 13:18:53 -07:00
huxi 9eb32eaad5 KAFKA-6446; KafkaProducer initTransactions() should timeout after max.block.ms (#4563)
Currently the `initTransactions()` API blocks indefinitely if the broker cannot be reached. This patch changes the behavior to raise a `TimeoutException` after waiting for `max.block.ms`. 

Reviewers: Apurva Mehta <apurva@confluent.io>, Jason Gustafson <jason@confluent.io>
2018-03-27 09:21:18 -07:00
Bill Bejeck f4a39f9b83 MINOR: Fix flaky standby task test (#4767)
The standby-task test failed due to standby task distribution not be exactly equal. I think this will be the case from time to time, so I've updated test to make sure the standby task assignment count is not zero.

Reviewers: Guozhang Wang <wangguoz@gmail.com>, John Roesler <john@confluent.io>, Matthias J. Sax <matthias@confluent.io>
2018-03-26 16:22:20 -07:00
Vahid Hashemian 3f611044cc KAFKA-6052; Fix WriteTxnMarkers request retry issue in InterBrokerSendThread (#4705)
This resolves the issue found when running the brokers on Windows which prevents the coordinator from sending WriteTxnMarkers requests to complete a transaction.
2018-03-24 14:04:08 -07:00
Rajini Sivaram f66aebff36
KAFKA-6710: Remove Thread.sleep from LogManager.deleteLogs (#4771)
`Thread.sleep` in `LogManager.deleteLogs` potentially blocks a scheduler thread for up to `log.segment.delete.delay.ms` with a default value of a minute. To avoid this, `deleteLogs` now deletes the logs for which `currentDefaultConfig.fileDeleteDelayMs` has elapsed after the delete was scheduled. Logs for which this interval has not yet elapsed are considered for deletion in the next iteration of `deleteLogs`, which is scheduled sooner if required.

Reviewers: Jun Rao <junrao@gmail.com>, Dong Lin <lindong28@gmail.com>, Ted Yu <yuzhihong@gmail.com>
2018-03-24 20:54:09 +00:00
cburroughs 514936af6f MINOR: Remove redundant initialization of `Stats.index` (#4751) 2018-03-24 12:39:10 -07:00
Attila Sasvari 549a5cec1e MINOR: Fix potential resource leak in FileOffsetBackingStore (#4739)
Reviewers: Sandor Murakozi <smurakozi@gmail.com>, Jason Gustafson <jason@confluent.io>
2018-03-24 12:20:11 -07:00
huxi dd78b9fa26 KAFKA-6637; Avoid divide by zero error with segment.ms set to zero (#4698)
Require a minimum value of 1 for `segment.ms` to avoid division by zero when computing random jitter.

Reviewers: Manikumar Reddy O <manikumar.reddy@gmail.com>, Jason Gustafson <jason@confluent.io>
2018-03-24 12:14:53 -07:00
Lucas Wang 685fd03dda KAFKA-6612: Added logic to prevent increasing partition counts during topic deletion
This patch adds logic in handling the PartitionModifications event, so that if the partition count is increased when a topic deletion is still in progress, the controller will restore the data of the path /brokers/topics/"topic" to remove the added partitions.

Testing done:
Added a new test method to cover the bug

Author: Lucas Wang <luwang@linkedin.com>

Reviewers: Jiangjie (Becket) Qin <becket.qin@gmail.com>

Closes #4666 from gitlw/prevent_increasing_partition_count_during_topic_deletion
2018-03-23 18:18:16 -07:00
Rajini Sivaram 2307314432
MINOR: Fix encoder config to make DynamicBrokerReconfigurationTest stable (#4764)
DynamicBrokerReconfigurationTest currently assumes that passwords encoded with one secret will fail with an exception if decoded with another secret and configures an old.secret in setUp. This could potentially cause test failures if a password was incorrectly decoded with the wrong secret, since the test writes passwords encoded with the new secret directly to ZooKeeper. Since old.secret is only used in one test for verifying secret rotation, this config can be moved to that test to avoid transient failures.

Reviewers: Ismael Juma <ismael@juma.me.uk>
2018-03-23 17:01:32 +00:00
Ismael Juma 395dcad7d9
MINOR: Java 9/10 fixes, gradle and minor deps update (#4725)
* Added dependencies so that Trogdor and Connect work with Java 9 and 10
* Updated Jacoco to 0.8.1 so that it works with Java 10
* Updated Gradle to 4.6
* A few minor version bumps (not related to Java9/10 fixes)

I tested manually that we can run ./gradlew test with Java 10
after these changes. There are test failures as EasyMock
and PowerMock will have to be updated to use a newer
ASM version. But compiling successfully and most tests
passing is progress. :)

I also tested manually that Trogdor can be started with Java 10.
It previously failed with a ClassNotFoundError.

Reviewers: Jason Gustafson <jason@confluent.io>
2018-03-22 22:01:51 -07:00
Jason Gustafson fcf8781602 KAFKA-6683; Ensure producer state not mutated prior to append (#4755)
We were unintentionally mutating the cached queue of batches prior to appending to the log. This could have several bad consequences if the append ultimately failed or was truncated. In the reporter's case, it caused the snapshot to be invalid after a segment roll. The snapshot contained producer state at offsets higher than the snapshot offset. If we ever had to load from that snapshot, the state was left inconsistent, which led to an error that ultimately crashed the replica fetcher.

The fix required some refactoring to avoid sharing the same underlying queue inside ProducerAppendInfo. I have added test cases which reproduce the invalid snapshot state. I have also made an effort to clean up logging since it was not easy to track this problem down.

One final note: I have removed the duplicate check inside ProducerStateManager since it was both redundant and incorrect. The redundancy was in the checking of the cached batches: we already check these in Log.analyzeAndValidateProducerState. The incorrectness was the handling of sequence number overflow: we were only handling one very specific case of overflow, but others would have resulted in an invalid assertion. Instead, we now throw OutOfOrderSequenceException.

Reviewers: Apurva Mehta <apurva@confluent.io>, Jun Rao <junrao@gmail.com>
2018-03-22 21:42:49 -07:00
Guozhang Wang f2fbfaaccc
KAFKA-6611: PART I, Use JMXTool in SimpleBenchmark (#4650)
1. Use JmxMixin for SimpleBenchmark (will remove the self reporting in #4744), only when loading phase is false (i.e. we are in fact starting the streams app).

2. Reported the full jmx reported metrics in log files, and in the returned data only return the max values: this is because we want to skip the warming up and cooling down periods that will have lower rate numbers, while max represents the actual rate at full speed.

3. Incorporates two other improves to JMXTool: #1241 and #2950

Reviewers: John Roesler <john@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Rohan Desai <desai.p.rohan@gmail.com>
2018-03-22 16:46:56 -07:00
Bill Bejeck 286216b56e MINOR: Rolling bounce upgrade fixed broker system test (#4690)
Reviewers: Guozhang Wang <guozhang@confluent.io>, John Roesler <john@confluent.io>, Matthias J. Sax <matthias@confluent.io>
2018-03-22 16:02:16 -07:00
Stuart Perks 9ee00c4b66 KAFKA-6659: Improve error message if state store is not found (#4732)
Reviewers: Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>
2018-03-22 15:37:28 -07:00
Rajini Sivaram 57b1c28d60
MINOR: Fix AdminClient.describeConfigs() of listener configs (#4747)
Don't return config values from `describeConfigs` if the config type cannot be determined. Obtain config types correctly for listener configs for `describeConfigs` and password encryption.

Reviewers: Jason Gustafson <jason@confluent.io>
2018-03-22 20:05:45 +00:00
Matthias J. Sax f0a29a6935
MINOR: remove obsolete warning in StreamsResetter (#4749)
Reviewer: Guozhang Wang <guozhang@confluent.io>
2018-03-21 16:24:59 -07:00
Rajini Sivaram 2f90cb86c1 MINOR: Remove acceptor creation in network thread update code (#4742)
Fix dynamic addition of network threads to only create new Processor threads and not the Acceptor.
2018-03-20 22:40:16 -07:00
Colin Patrick McCabe f5287ccad2 MINOR: Fix flaky TestUtils functions (#4743)
TestUtils#produceMessages should always close the KafkaProducer, even
when there is an exception.  Otherwise, the test will leak threads when
there is an error.

TestUtils#createNewProducer should create a producer with a
requestTimeoutMs of 30 seconds by default, not around 10 seconds.
This should avoid tests that flake when the load on Jenkins climbs.

Fix two cases where a very short timeout of 2 seconds was getting set.

Reviewers: Ismael Juma <ismael@juma.me.uk>
2018-03-20 22:36:41 -07:00
Mickael Maison 36d4b91573 MINOR: Updated SASL Authentication Sequence Docs (#4724)
Reworded the SASL Authentication sequence to update it to >= 1.0.0

Co-authored-by: Edoardo Comar <ecomar@uk.ibm.com>, Mickael Maison <mickael.maison@gmail.com>

Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
2018-03-20 13:56:40 +00:00
Anna Povzner 5c24295d44 Trogdor's ProducerBench does not fail if topics exists (#4673)
Added configs to ProducerBenchSpec:
topicPrefix: name of topics will be of format topicPrefix + topic index. If not provided, default is "produceBenchTopic".
partitionsPerTopic: number of partitions per topic. If not provided, default is 1.
replicationFactor: replication factor per topic. If not provided, default is 3.

The behavior of producer bench is changed such that if some or all topics already exist (with topic names = topicPrefix + topic index), and they have the same number of partitions as requested, the worker uses those topics and does not fail. The producer bench fails if one or more existing topics has number of partitions that is different from expected number of partitions.

Added unit test for WorkerUtils -- for existing methods and new methods.

Fixed bug in MockAdminClient, where createTopics() would over-write existing topic's replication factor and number of partitions while correctly completing the appropriate futures exceptionally with TopicExistsException.

Reviewers: Colin P. Mccabe <cmccabe@confluent.io>, Rajini Sivaram <rajinisivaram@googlemail.com>
2018-03-20 13:51:45 +00:00
Manikumar Reddy O aefe35e493 KAFKA-6680: Fix issues related to Dynamic Broker configs (#4731)
- Fix kafkaConfig initialization if there are no dynamic configs defined in ZK.
- Update DynamicListenerConfig.validateReconfiguration() to check new Listeners must be subset of listener map

Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
2018-03-19 22:22:53 +00:00
Guozhang Wang 0f364cd53a
MINOR: Pass a streams config to replace the single state dir (#4714)
This is a general change and is re-requisite to allow streams benchmark test with different streams tests. For the streams benchmark itself I will have a separate PR for switching configs. Details:

1. Create a "streams.properties" file under PERSISTENT_ROOT before all the streams test. For now it will only contain a single config of state.dir pointing to PERSISTENT_ROOT.

2. For all the system test related code, replace the main function parameter of state.dir with propsFilename, then inside the function load the props from the file and apply overrides if necessary.

3. Minor fixes.

Matthias J. Sax <matthias@confluent.io>, Bill Bejeck <bill@confluent.io>
2018-03-19 14:17:00 -07:00
Colin Patrick McCabe 27bb3ccace MINOR: KafkaFutureImpl#addWaiter should be protected (#4734)
KafkaFutureImpl#addWaiter should be protected, just like KafkaFuture#addWaiter.  As described in KIP-218, whenComplete is the public API, not addWaiter.
2018-03-19 13:30:13 -07:00
Ismael Juma 6fab286da2 MINOR: Fix some compiler warnings (#4726) 2018-03-19 15:03:58 +00:00
Colin Patrick McCabe 0560193706 MINOR: improve trogdor commandline (#4721)
Allow -c as a synonym for --agent.config and --coordinator.config.

Allow -n as a synonym for --node-name.

Add an example trogdor.conf file.
2018-03-19 11:55:29 +00:00
Jason Gustafson 7041e76bd6 MINOR: Some logging improvements for debugging delayed produce status (#4691)
A few small logging improvements which help debugging replication issues.
2018-03-19 11:08:12 +00:00
Ewen Cheslack-Postava f264bfa296 KAFKA-6676: Ensure Kafka chroot exists in system tests and use chroot on one test with security parameterizations (#4729)
Ensures Kafka chroot exists in ZK when starting KafkaService so commands that use ZK and are executed before the first Kafka broker starts do not fail due to the missing chroot.

Also uses chroot with one test that also has security parameterizations so Kafka's test suite exercises these combinations. Previously no tests were exercising chroots.

Changes were validated using sanity_checks which include the chroot-ed test as well as some non-chroot-ed tests.
2018-03-19 10:37:02 +00:00
asutosh936 02a8ef8595 KAFKA-6486: Implemented LinkedHashMap in TimeWindows (#4628)
Reviewers: Guozhang Wang <guozhang@confluent.io>, Matthias J. Sax <matthias@confluent.io>
2018-03-18 20:44:11 -07:00
Dong Lin 4391a4214d MINOR: Use log start offset as high watermark if current value is out of range (#4722)
Reviewers: Jun Rao <junrao@gmail.com>, Jason Gustafson <jason@confluent.io>
2018-03-18 11:21:43 -07:00
Matthias J. Sax 7fe06a8666
MINOR: fix flaky Streams EOS system test (#4728)
Reviewer: Guozhang Wang <guozhang@confluent.io>, Bill Bejeck <bill@confluent.io>
2018-03-17 18:02:20 -07:00
Dhruvil Shah ae31ee63dc KAFKA-6530: Use actual first offset of message set when rolling log segment (#4660)
Use the exact first offset of message set when rolling log segment. This is possible to do for message format V2 and beyond without any performance penalty, because we have the first offset stored in the header. This augments the fix made in KAFKA-4451 to avoid using the heuristic for V2 and beyond messages.

Added unit tests to simulate cases where segment needs to roll because of overflow in index offsets. Verified that the new segment created in these cases uses the first offset, instead of the heuristic in use previously.
2018-03-17 10:29:42 -07:00
Sandor Murakozi 2afac71566 MINOR: Remove unnecessary null checks (#4708)
Remove unnecessary null check in StringDeserializer, MockProducerInterceptor and KStreamImpl.

Reviewers: Vahid Hashemian <vahidhashemian@us.ibm.com>, Jason Gustafson <jason@confluent.io>
2018-03-16 23:34:16 -07:00
Jason Gustafson 9782465d6f
KAFKA-6672; ConfigCommand should create config change parent path if needed (#4727)
Change `KafkaZkClient.createConfigChangeNotification` to ensure creation of the change directory. This fixes failing system tests which depend on setting SCRAM credentials prior to broker startup. Existing test case has been modified for new expected usage.

Reviewers: Ismael Juma <ismael@juma.me.uk>
2018-03-16 22:29:22 -07:00