Commit Graph

5722 Commits

Author SHA1 Message Date
hejiefang ffd6f2a2e8 MINOR: Fix doc format in upgrade notes
Author: hejiefang <he.jiefang@zte.com.cn>

Reviewers: Srinivas <srinivas96alluri@gmail.com>, Manikumar Reddy <manikumar.reddy@gmail.com>

Closes #6076 from hejiefang/modifynotable
2019-01-02 19:30:51 +05:30
Flavien Raynaud 9295444d48 MINOR: Improve exception messages in FileChannelRecordBatch (#6068)
Replace `channel` by `fileRecords` in potentially thrown KafkaException
descriptions when loading/writing `FileChannelRecordBatch`. This makes exception
messages more readable (channel only shows an object hashcode, fileRecords shows
the path of the file being read and start/end positions in the file).

Reviewers: Jason Gustafson <jason@confluent.io>
2018-12-28 13:58:05 -08:00
Renato Mefi 964e2c57c2 KAFKA-3832; Kafka Connect's JSON Converter never outputs a null value (#6027)
When using the Connect `JsonConverter`, it's impossible to produce tombstone messages, thus impacting the compaction of the topic. This patch allows the converter with and without schemas to output a NULL byte value in order to have a proper tombstone message. When it's regarding to get this data into a connect record, the approach is the same as when the payload looks like `"{ "schema": null, "payload": null }"`, this way the sink connectors can maintain their functionality and reduces the BCC.

Reviewers: Gunnar Morling <gunnar.morling@googlemail.com>, Randall Hauch <rhauch@gmail.com>, Jason Gustafson <jason@confluent.io>
2018-12-28 09:39:52 -08:00
Bill Bejeck 4616c0aaff MINOR: Add test demonstrating re-use of KGroupedStream with Optimizations enabled (#6050)
Right now if a repartition is required and users choose to name the repartition topic for an aggregation i.e. kGroupedStream = builder.<String, String>stream("topic").selectKey((k, v) -> k).groupByKey(Grouped.as("grouping")); The resulting KGroupedStream can't be reused
with optimizations are disabled, as Streams will attempt to create two repartiton topics with the same name.

However, if optimizations are enabled then the resulting KGroupedStream can be re-used
For example the following will work if optimizations are enabled.

This PR provides a unit test proving as much.

Reviewers: Matthias J. Sax <mjsax@apache.org>, Guozhang Wang <wangguoz@gmail.com>
2018-12-21 22:56:32 -08:00
Bill Bejeck 53eb8df344 MINOR: Bump admin client retries for creating repartition topics (#6063)
The topology optimization test was getting intermittent failures because of failures to create repartition topics on startup. This PR Increased admin client retries

I kicked off the system test with 25 repeats, all passed http://confluent-kafka-branch-builder-system-test-results.s3-us-west-2.amazonaws.com/2018-12-21--001.1545436859--bbejeck--MINOR_flaky_optimization_test_create_repartition_fails--6cd55e2/report.html

Reviewers: Guozhang Wang <wangguoz@gmail.com>
2018-12-21 22:43:55 -08:00
Bill Bejeck 639427a38f MINOR: Increase throughput for VerifiableProducer in test (#6060)
Previous PR #6043 reduced throughput for VerifiableProducer in base class, but the streams_standby_replica_test needs higher throughput for consumer to complete verification in 60 seconds

Update system test and kicked off branch builder with 25 repeats https://jenkins.confluent.io/job/system-test-kafka-branch-builder/2201/

Reviewers: Guozhang Wang <wangguoz@gmail.com>
2018-12-21 22:32:19 -08:00
Manohar Vanam 55334453a5 KAFKA-7054; Kafka describe command should throw topic doesn't exist exception
**User Interface Improvement :** If topic doesn't exist then Kafka describe command should throw topic doesn't exist exception, like alter and delete commands

Author: Manohar Vanam <manohar.crazy09@gmail.com>

Reviewers: Vahid Hashemian <vahid.hashemian@gmail.com>, Jason Gustafson <jason@confluent.io>, Manikumar Reddy <manikumar.reddy@gmail.com>

Closes #5211 from ManoharVanam/KAFKA-7054
2018-12-21 14:48:22 +05:30
Viktor Somogyi 684184973e MINOR: Hygiene fixes in KafkaFutureImpl (#5098)
Change-Id: Ia44c6c659418bbed5367645b814725365daba820
2018-12-21 14:32:22 +05:30
Srinivas Reddy 85906d3d2b MINOR: Switch anonymous classes to lambda expressions in tools module
Switch to lambda when ever possible instead of old anonymous way
in tools module

Author: Srinivas Reddy <srinivas96alluri@gmail.com>
Author: Srinivas Reddy <mrsrinivas@users.noreply.github.com>

Reviewers: Ryanne Dolan <ryannedolan@gmail.com>, Ismael Juma <ismael@juma.me.uk>, Manikumar Reddy <manikumar.reddy@gmail.com>

Closes #6013 from mrsrinivas/tools-switch-to-java8
2018-12-21 14:20:57 +05:30
Alex Diachenko ee370d3893 KAFKA-7759; Disable WADL output in the Connect REST API (#6051)
This patch disables support for WADL output in the Connect REST API since it was never intended to be exposed. 

Reviewers: Randall Hauch <rhauch@gmail.com>, Jason Gustafson <jason@confluent.io>
2018-12-20 14:24:05 -08:00
Matthias Wessendorf d413117769 KAFKA-7762; Update KafkaConsumer Javadoc examples to use poll(Duration timeout) API
Author: Matthias Wessendorf <mwessend@redhat.com>

Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>

Closes #6052 from matzew/use_new_poll_api
2018-12-20 22:25:25 +05:30
Satish Duggana b23bf41e84 KAFKA-7742; Fixed removing hmac entry for a token being removed from DelegationTokenCache
Author: Satish Duggana <satishd@apache.org>

Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>

Closes #6037 from satishd/KAFKA-7742
2018-12-20 15:55:34 +05:30
Bill Bejeck 40ca7ddeed MINOR: Stabilization fixes broker down test trunk (#6043)
This PR addresses a few issues with this system test flakiness. This PR is a cherry-picked duplicate of #6041 but for the trunk branch, hence I won't repeat the inline comments here.

1. Need to grab the monitor before a given operation to observe logs for signal
2. Relied too much on a timely rebalance and only sent a handful of messages.
I've updated the test and ran it here https://jenkins.confluent.io/job/system-test-kafka-branch-builder/2143/ parameterized for 15 repeats all passed.

Reviewers: Guozhang Wang <wangguoz@gmail.com>
2018-12-19 17:53:09 -08:00
Attila Sasvari 09c4a3e87d MINOR: Update README.md with Gradle 5+ requirement (#6039) 2018-12-17 16:30:14 -08:00
Jason Gustafson 40392266aa
MINOR: Include additional detail in fetch error message (#6036)
This patch adds additional information in the log message after a fetch failure to make debugging easier.

Reviewers: David Arthur <mumrah@gmail.com>
2018-12-17 15:51:01 -08:00
Matthias J. Sax c441528b93
MINOR: improve Streams error message (#5975)
Reviewers: Bill Bejeck <bill@confluent.io>, Guozhang Wang <guozhang@confluent.io>
2018-12-17 13:57:01 +01:00
David Arthur 152292994e KAFKA-2334; Guard against non-monotonic offsets in the client (#5991)
After a recent leader election, the leaders high-water mark might lag behind the offset at the beginning of the new epoch (as well as the previous leader's HW). This can lead to offsets going backwards from a client perspective, which is confusing and leads to strange behavior in some clients.

This change causes Partition#fetchOffsetForTimestamp to throw an exception to indicate the offsets are not yet available from the leader. For new clients, a new OFFSET_NOT_AVAILABLE error is added. For existing clients, a LEADER_NOT_AVAILABLE is thrown.

This is an implementation of [KIP-207](https://cwiki.apache.org/confluence/display/KAFKA/KIP-207%3A+Offsets+returned+by+ListOffsetsResponse+should+be+monotonically+increasing+even+during+a+partition+leader+change).

Reviewers: Colin Patrick McCabe <colin@cmccabe.xyz>, Dhruvil Shah <dhruvil@confluent.io>, Jason Gustafson <jason@confluent.io>
2018-12-14 13:53:03 -08:00
Guozhang Wang 5c549b2a89
MINOR: Replace tbd with the actual link for out-of-ordering data (#6035)
Reviewers: Jason Gustafson <jason@confluent.io>
2018-12-14 09:37:43 -08:00
cwildman f982f61fbe MINOR: Update documentation for internal changelog when using table(). (#6021)
Updating the documentation for table operation because I believe it is incorrect.

In PR #5163 the table operation stopped disabling the changelog topic by default and instead moved that optimization to a configuration that is not enabled by default. This PR updates the documentation to reflect the change in behavior and point to the new configuration for optimization.

Reviewers: Bill Bejeck <bbejeck@gmail.com>, Guozhang Wang <wangguoz@gmail.com>
2018-12-13 20:33:25 -08:00
Bill Bejeck da332f2241 MINOR:Start processor inside verify message (#6029)
This PR fixes a flaky system test.

I ran six runs of branch builder, and each run was parameterized to repeat the test 25 times for a total of 150 runs. All test runs passed.

https://jenkins.confluent.io/job/system-test-kafka-branch-builder/2122/
https://jenkins.confluent.io/job/system-test-kafka-branch-builder/2123/
https://jenkins.confluent.io/job/system-test-kafka-branch-builder/2124/
https://jenkins.confluent.io/job/system-test-kafka-branch-builder/2128/
https://jenkins.confluent.io/job/system-test-kafka-branch-builder/2129/
https://jenkins.confluent.io/job/system-test-kafka-branch-builder/2130/

Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <guozhang@confluent.io>, John Roesler <vvcephei@users.noreply.github.com>
2018-12-13 20:30:04 -08:00
Srinivas Reddy 88443b4a37 Fix the missing ApiUtils tests in streams module. (#6003)
Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <guozhang@confluent.io>, Bill Bejeck <bill@confluent.io>
2018-12-13 19:31:58 -08:00
Pasquale Vazzana dffce6e7ae KAFKA-7655 Metadata spamming requests from Kafka Streams under some circumstances, potential DOS (#5929)
Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <guozhang@confluent.io>
2018-12-13 16:40:39 +01:00
John Roesler fdd33bcab0 KAFKA-7223: document suppression buffer metrics (#6024)
Document the new metrics added in #5795

Reviewers: Bill Bejeck <bill@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
2018-12-12 18:22:54 -08:00
Rajini Sivaram 46e8081f9c KAFKA-7712; Remove channel from Selector before propagating exception (#6023)
Ensure that channel and selection keys are removed from `Selector` collections before propagating connect exceptions. They are currently cleared on the next `poll()`, but we can't ensure that callers (NetworkClient for example) wont try to connect again before the next `poll` and hence we should clear the collections before re-throwing exceptions from `connect()`.

Reviewers: Jason Gustafson <jason@confluent.io>
2018-12-12 10:50:56 -08:00
hackerwin7 975b680bcd KAFKA-7705; Fix and simplify producer config in javadoc example (#6000)
The example in the producer's javadoc contained an inconsistent value for `delivery.timeout.ms`. This patch removes the inconsistent config and several unnecessary overrides in order to simplify the example.

Reviewers: huxi <huxi_2b@hotmail.com>, Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>
2018-12-12 08:37:09 -08:00
Arjun Satish 3ef6b0fce0 MINOR: Add unit for max latency in ProducerPerformance output (#6014)
Reviewers: Jason Gustafson <jason@confluent.io>
2018-12-12 08:28:58 -08:00
Matthias J. Sax 046b0087bd
MINOR: improve Streams checkstyle and code cleanup (#5954)
Reviewers: Guozhang Wang <guozhang@confluent.io>, Nikolay Izhikov <nIzhikov@gmail.com>, Ismael Juma <ismael@confluent.io>, Bill Bejeck <bill@confluent.io>
2018-12-11 01:54:41 -08:00
Nikolay c142809038 KAFKA-6970: All standard state stores guarded with read only wrapper (#6016)
Reviewer: Matthias J. Sax <matthias@confluent.io>, John Roesler <john@confluent.io>
2018-12-11 01:44:18 -08:00
linyli001 a94c8da508 KAFKA-7443: OffsetOutOfRangeException in restoring state store from changelog topic when start offset of local checkpoint is smaller than that of changelog topic (#5946)
Reviewer: Matthias J. Sax <matthias@confluent.io>, John Roesler <john@confluent.io>
2018-12-11 00:40:18 -08:00
Jason Gustafson 20069b3906 KAFKA-7610; Proactively timeout new group members if rebalance is delayed (#5962)
When a consumer first joins a group, it doesn't have an assigned memberId. If the rebalance is delayed for some reason, the client may disconnect after a request timeout and retry. Since the client had not received its memberId, then we do not have a way to detect the retry and expire the previously generated member id. This can lead to unbounded growth in the size of the group until the rebalance has completed.

This patch fixes the problem by proactively completing all JoinGroup requests for new members after a timeout of 5 minutes. If the client is still around, we expect it to retry.

Reviewers: Stanislav Kozlovski <stanislav_kozlovski@outlook.com>, Boyang Chen <bchen11@outlook.com>, Guozhang Wang <wangguoz@gmail.com>
2018-12-10 14:32:29 -08:00
Lee Dongjin 7a3dffb0ca KAFKA-7549; Old ProduceRequest with zstd compression does not return error to client (#5925)
Older versions of the Produce API should return an error if zstd is used. This validation existed, but it was done during request parsing, which means that instead of returning an error code, the broker disconnected. This patch fixes the issue by moving the validation outside of the parsing logic. It also fixes several other record validations which had the same problem.

Reviewers: Jason Gustafson <jason@confluent.io>
2018-12-10 09:45:18 -08:00
Guozhang Wang ab156fded1
KAFKA-6036: Follow-up; cleanup sendOldValues logic ForwardingCacheFlushListener (#6017)
This is a follow-up PR from the previous PR #5779, where KTabeSource always get old values from the store even if sendOldValues. It gets me to make a pass over all the KTable/KStreamXXX processor to push the sendOldValues at the callers in order to avoid unnecessary store reads.

More details: ForwardingCacheFlushListener and TupleForwarder both need sendOldValues as parameters.
a. For ForwardingCacheFlushListener it is not needed at all, since its callers XXXCachedStore already use the sendOldValues values passed from TupleForwarder to avoid getting old values from underlying stores.
b. For TupleForwarder, it actually only need to pass the boolean flag to the cached store; and then it does not need to keep it as its own variable since the cached store already respects the boolean to pass null or the actual value..

The only other minor bug I found from the pass in on KTableJoinMerge, where we always pass old values and ignores sendOldValues.

Reviewers: Matthias J. Sax <mjsax@apache.org>
2018-12-09 15:33:17 -08:00
Guozhang Wang c0353d8ddc
KAFKA-6036: Local Materialization for Source KTable (#5779)
Refactor the materialization for source KTables in the way that:

If Materialized.as(queryableName) is specified, materialize;
If the downstream operator requires to fetch from this KTable via ValueGetters, materialize;
If the downstream operator requires to send old values, materialize.
Otherwise do not materialize the KTable. E.g. builder.table("topic").filter().toStream().to("topic") would not create any state stores.

There's a couple of minor changes along with PR as well:

KTableImpl's queryableStoreName and isQueryable are merged into queryableStoreName only, and if it is null it means not queryable. As long as it is not null, it should be queryable (i.e. internally generated names will not be used any more).
To achieve this, splitted MaterializedInternal.storeName() and MaterializedInternal.queryableName(). The former can be internally generated and will not be exposed to users. QueryableName can be modified to set to the internal store name if we decide to materialize it during the DSL parsing / physical topology generation phase. And only if queryableName is specified the corresponding KTable is determined to be materialized.

Found some overlapping unit tests among KTableImplTest, and KTableXXTest, removed them.

There are a few typing bugs found along the way, fixed them as well.

-----------------------

This PR is an illustration of experimenting a poc towards logical materializations.

Today we've logically materialized the KTable for filter / mapValues / transformValues if queryableName is not specified via Materialized, but whenever users specify queryableName we will still always materialize. My original goal is to also consider logically materialize for queryable stores, but when implementing it via a wrapped store to apply the transformations on the fly I realized it is tougher than I thought, because we not only need to support fetch or get, but also needs to support range queries, approximateNumEntries, and isOpen etc as well, which are not efficient to support. So in the end I'd suggest we still stick with the rule of always materializing if queryableName is specified, and only consider logical materialization otherwise.

Reviewers: Bill Bejeck <bill@confluent.io>, Matthias J. Sax <mjsax@apache.org>
2018-12-08 22:49:48 -08:00
John Roesler b492296757 MINOR: fix checkpoint write failure warning log (#6008)
We saw a log statement in which the cause of the failure to write a checkpoint was not properly logged.
This change logs the exception properly and also verifies the log message.

Reviewers: Bill Bejeck <bill@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
2018-12-08 19:00:57 -08:00
Stanislav Kozlovski edfa681736 MINOR: Catch NoRecordsException in testCommaSeparatedRegex() test (#5944)
This test sometimes fails with

```
kafka.tools.MirrorMaker$NoRecordsException
	at kafka.tools.MirrorMaker$ConsumerWrapper.receive(MirrorMaker.scala:483)
	at kafka.tools.MirrorMakerIntegrationTest$$anonfun$testCommaSeparatedRegex$1.apply$mcZ$sp(MirrorMakerIntegrationTest.scala:92)
	at kafka.utils.TestUtils$.waitUntilTrue(TestUtils.scala:738)
```

The test should catch `NoRecordsException` instead of `TimeoutException`.

Reviewers: Ismael Juma <ismael@juma.me.uk>
2018-12-08 09:43:43 -08:00
Gardner Vickers ac35ef6242 MINOR: Specify character encoding in NetworkTestUtils (#5965)
This attempts to address the flaky test `SaslAuthenticatorTest.testCannotReauthenticateWithDifferentPrincipal()` 

I was not able to reproduce locally even after 150 test runs in a loop, but given the error message:

```
org.junit.ComparisonFailure: expected:
<[6QBJiMZ6o5AqbNAjDTDjWtQSa4alfuUWsYKIy2tt7dz5heDaWZlz21yr8Gl4uEJkQABQXeEL0UebdpufDb5k8SvReSK6wYwQ9huP-9]> but was:<[????����????OAUTHBEARER]>
```

`????����????` seems to mean invalid UTF-8.

We now specify the charset when writing out and reading in bytes.

Reviewers: Ismael Juma <ismael@juma.me.uk>
2018-12-08 09:36:14 -08:00
Stanislav Kozlovski 05dc36d548 MINOR: Fix failing ConsumeBenchTest:test_multiple_consumers_specified_group_partitions_should_raise (#6015)
This is the error message we're after:

"You may not specify an explicit partition assignment when using multiple consumers in the same group."

We apparently changed it midway through #5810 and forgot to update the test.

Reviewers: Dhruvil Shah <dhruvil@confluent.io>, Ismael Juma <ismael@juma.me.uk>
2018-12-08 09:27:59 -08:00
Mark Cho c050503464 KAFKA-7709: Fix ConcurrentModificationException when retrieving expired inflight batches on multiple partitions. (#6005)
Reviewers: Dhruvil Shah <dhruvil@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
2018-12-06 16:55:01 -08:00
huxi 87cc31c4e7 KAFKA-7704: MaxLag.Replica metric is reported incorrectly (#5998)
On the follower side, for the empty `LogAppendInfo` retrieved from the leader, fetcherLagStats set the wrong lag for fetcherLagStats due to `nextOffset` is zero.
2018-12-06 08:46:35 -05:00
Guozhang Wang 38e7d5763f
KAFKA-7673: Upgrade rocksdb to 5.15.10 (#5985)
Reviewers: Matthias J. Sax <mjsax@apache.org>
2018-12-05 16:52:25 -08:00
Anna Povzner 0ffdf8307f KAFKA-6388; Recover from rolling an empty segment that already exists (#5986)
There were several reported incidents where the log is rolled to a new segment with the same base offset as an active segment, causing KafkaException: Trying to roll a new log segment for topic partition X-N with start offset M while it already exists. In the cases we have seen, this happens to an empty log segment where there is long idle time before the next append and somehow we get to a state where offsetIndex.isFull() returns true due to _maxEntries == 0. This PR recovers from this state by deleting and recreating the segment and all of its associated index files.

Reviewers: Jason Gustafson <jason@confluent.io>
2018-12-05 14:49:19 -08:00
Cyrus Vafadari 9f954ac614 MINOR: Safe string conversion to avoid NPEs
Should be ported back to 2.0

Author: Cyrus Vafadari <cyrus@confluent.io>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #6004 from cyrusv/cyrus-npe
2018-12-05 13:23:52 -08:00
Jonathan Santilli b616f913c8 KAFKA-7678: Avoid NPE when closing the RecordCollector (#5993)
Reviewers: Matthias J. Sax <matthias@confluent.io>, Bill Bejeck <bill@confluent.io>
2018-12-05 11:48:39 -08:00
Nikolay ec501f305e KAFKA-7420: Global store surrounded by read only implementation (#5865)
Reviewers: Matthias J. Sax <matthias@confluent.io>, Kamal Chandraprakash (@kamalcph), Bill Bejeck <bill@confluent.io>
2018-12-05 11:25:52 -08:00
Rajini Sivaram 0d4cf64af3
KAFKA-7697: Process DelayedFetch without holding leaderIsrUpdateLock (#5999)
Delayed fetch operations acquire leaderIsrUpdate read lock of one or more Partitions from the fetch request when attempting to complete the fetch operation. While appending new records, complete fetch requests after releasing leaderIsrUpdate of the Partition to which records were appended to avoid deadlocks in request handler threads.

Reviewers: Jason Gustafson <jason@confluent.io>, Jun Rao <junrao@gmail.com>
2018-12-05 09:05:26 +00:00
Matthias J. Sax a1807d4914 MINOR: Improve GlobalKTable docs (#5996)
Reviewers: Jim Galasyn, Michael G. Noll, John Roesler, Bill Bejeck, Guozhang Wang
2018-12-04 16:21:18 -08:00
huxi f65b1c4796 KAFKA-7687; Print batch level information in DumpLogSegments when deep iterating (#5976)
DumpLogSegments should print batch level information when deep-iteration is specified.

Reviewers: Jason Gustafson <jason@confluent.io>
2018-12-04 09:04:39 -08:00
Rajini Sivaram 437cf35373
KAFKA-7702: Fix matching of prefixed ACLs to match single char prefix (#5994)
Reviewers: Jun Rao <junrao@gmail.com>
2018-12-04 09:44:11 +00:00
Srinivas Reddy 7283711c0d KAFKA-7446: Fix the duration and instant validation messages. (#5930)
Changes made as part of this commit.
 - Improved error message for better readability at millis validation utility
 - Corrected java documentation on `AdvanceInterval` check.
 - Added caller specific prefix text to make error message more clear to developers/users.

Reviewers: Matthias J. Sax <matthias@confluent.io>, Bill Bejeck <bill@confluent.io>, Jacek Laskowski <jacek@japila.pl>
2018-12-03 22:59:54 -08:00
Jakub Scholz b4030a0375 MINOR: Update command line options in Authorization and ACLs documentation chapter (#5995) 2018-12-04 09:41:08 +05:30