Commit Graph

5472 Commits

Author SHA1 Message Date
cwildman 5b23eb64ea MINOR: state.cleanup.delay.ms default is 600,000 ms (10 minutes). (#6345)
Reviewers: Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>
2019-02-28 09:26:27 -08:00
Guozhang Wang e71c93af5f MINOR: disable Streams system test for broker upgrade/downgrade (#6341)
Reviewers: Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>
2019-02-28 08:56:26 -08:00
Guozhang Wang 1158c743cd KAFKA-7990: Close streams at the end in KafkaStreamsTest (#6334)
This fix is already in 2.1+ branches, but did not get into older branches.

Should be cherry-picked all the way to 0.10.2.

Reviewers: Matthias J. Sax <mjsax@apache.org>
2019-02-26 22:28:02 -08:00
Arjun Satish d573585795 MINOR: Increase produce timeout to 120 seconds (#6326)
MINOR: Increase produce timeout for EmbeddedKafkaCluster to 120 seconds

Previous value was 500ms. This change gives more room to pass tests on systems with low resources running many parallel tests.

Reviewers: Randall Hauch <randall@confluent.io>
2019-02-25 23:35:29 -06:00
Stanislav Kozlovski 48986f380e KAFKA-7959; Delete leader epoch cache files with old message format versions (#6298)
It is important to clean up any cached epochs that may exist if the log message format does not support it (due to a regression in KAFKA-7415). Otherwise, the broker might make use of them once it upgrades its message format. This can cause unnecessary truncation of data.

Reviewers: Jason Gustafson <jason@confluent.io>
2019-02-22 14:56:08 -08:00
Bill Bejeck d072db1d67 MINOR: Add check all topics created check streams broker bounce test (2.0) (#6241)
The StreamsBrokerBounceTest.test_broker_type_bounce experienced what looked like a transient failure. After looking over this test and the failure, it seems to be vulnerable to a timing error in which Streams starts before the Kafka service has created all topics.

Reviewers: Guozhang Wang <wangguoz@gmail.com>, Matthias J. Sax <mjsax@apache.org>, John Roesler <john@confluent.io>
2019-02-20 16:40:33 -05:00
Alex Diachenko 436bd779a4 KAFKA-7799; Use httpcomponents-client in RestServerTest.
The test `org.apache.kafka.connect.runtime.rest.RestServerTest#testCORSEnabled` assumes the Jersey client can send restricted HTTP headers (`Origin`).

The Jersey client uses `sun.net.www.protocol.http.HttpURLConnection`.
`sun.net.www.protocol.http.HttpURLConnection` drops restricted headers (`Host`, `Keep-Alive`, `Origin`, etc.) based on the static property `allowRestrictedHeaders`.
This property is initialized in a static block by reading the Java system property `sun.net.http.allowRestrictedHeaders`.

So, if the classloader loads `HttpURLConnection` before we set `sun.net.http.allowRestrictedHeaders=true`, then all subsequent changes to this system property have no effect (which happens if `org.apache.kafka.connect.integration.ExampleConnectIntegrationTest` is executed before `RestServerTest`).
To prevent this, we have to either make sure we set `sun.net.http.allowRestrictedHeaders=true` as early as possible or not rely on this system property at all.
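
The class-initialization effect described above can be reproduced with a minimal sketch. The class `RestrictedHeaderPolicy` and the property name `demo.allowRestrictedHeaders` are hypothetical stand-ins chosen for illustration; they are not the JDK's internal code.

```java
// Hypothetical stand-in for a class that caches a system property in a static initializer.
final class RestrictedHeaderPolicy {
    // Evaluated exactly once, when the JVM initializes this class.
    static final boolean ALLOW_RESTRICTED =
            Boolean.parseBoolean(System.getProperty("demo.allowRestrictedHeaders", "false"));

    private RestrictedHeaderPolicy() { }
}

final class StaticPropertyDemo {
    public static void main(String[] args) {
        // The first access triggers class initialization and caches the current value.
        boolean cached = RestrictedHeaderPolicy.ALLOW_RESTRICTED;

        // Changing the property afterwards does not affect the cached field.
        System.setProperty("demo.allowRestrictedHeaders", String.valueOf(!cached));
        System.out.println(cached == RestrictedHeaderPolicy.ALLOW_RESTRICTED); // prints "true"
    }
}
```

Whichever test touches the class first decides the value for the whole JVM, which is why test execution order matters here.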

This PR adds a test dependency on `httpcomponents-client`, which doesn't depend on the `sun.net.http.allowRestrictedHeaders` system property. Thus none of the existing tests should interfere with `RestServerTest`.

Author: Alex Diachenko <sansanichfb@gmail.com>

Reviewers: Gwen Shapira

Closes #6276 from avocader/KAFKA-7799_2.0
2019-02-19 11:52:07 -08:00
Stanislav Kozlovski 0acab27b3b MINOR: Make MockClient#poll() more thread-safe (#5942)
It used to preallocate an array of responses and then complete each response from the original collection sequentially. The problem was that the original collection could have been modified (another thread completing the response) while this was happening.
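
One general way to avoid this kind of race is to take elements off a concurrent queue one at a time instead of sizing an array up front. The sketch below illustrates that pattern only; `ResponseDrain` is a hypothetical helper, not the code from this patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical helper, not the actual MockClient code.
final class ResponseDrain {
    // Removes and returns whatever is queued right now. Each element is taken with an
    // atomic poll(), so a concurrent consumer can never receive the same element twice,
    // and no pre-sized array can get out of step with the queue.
    static <T> List<T> drain(Queue<T> pending) {
        List<T> drained = new ArrayList<>();
        T next;
        while ((next = pending.poll()) != null) {
            drained.add(next);
        }
        return drained;
    }

    public static void main(String[] args) {
        Queue<String> pending = new ConcurrentLinkedQueue<>();
        pending.add("r1");
        pending.add("r2");
        System.out.println(drain(pending));     // prints "[r1, r2]"
        System.out.println(pending.isEmpty());  // prints "true"
    }
}
```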
2019-02-14 10:24:17 -08:00
Jason Gustafson 1d5d461ee2 KAFKA-7897; Do not write epoch start offset for older message format versions (#6253)
When an older message format is in use, we should disable the leader epoch cache so that we resort to truncation by high watermark. Previously we updated the cache for all versions when a broker became leader for a partition. This can cause large and unnecessary truncations after leader changes because we relied on the presence of _any_ cached epoch in order to tell whether to use the improved truncation logic possible with the OffsetsForLeaderEpoch API.

Note that this is a simpler fix than what was merged to trunk in #6232, since the branches have diverged significantly. Rather than removing the epoch cache file, we guard usage of the cache with the record version.

Reviewers: Stanislav Kozlovski <stanislav_kozlovski@outlook.com>, Jun Rao <junrao@gmail.com>
2019-02-12 08:32:31 -08:00
John Roesler a3ce7daf40 KAFKA-7741: Reword Streams dependency workaround docs (#6207)
Avoid mentioning unreleased versions in the docs.

Reviewers: Matthias J. Sax <matthias@confluent.io>, Bill Bejeck <bill@confluent.io>, Guozhang Wang <guozhang@confluent.io>
2019-02-11 10:26:07 -08:00
Konstantine Karantasis d96c7eae0b KAFKA-7834: Extend collected logs in system test services to include heap dumps
* Enable heap dumps on OOM with -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=<file.bin> in the major services in system tests
* Collect the heap dump from the predefined location as part of the result logs for each service
* Change Connect service to delete the whole root directory instead of individual expected files
* Tested by running the full suite of system tests

Author: Konstantine Karantasis <konstantine@confluent.io>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #6158 from kkonstantine/KAFKA-7834

(cherry picked from commit 83c435f3ba)
Signed-off-by: Ewen Cheslack-Postava <me@ewencp.org>
2019-02-04 16:46:58 -08:00
Konstantine Karantasis 47bb1a4668 MINOR: Upgrade ducktape to 0.7.5 (#6197)
Reviewed-by: Colin P. McCabe <cmccabe@apache.org>
2019-02-04 16:04:04 -08:00
Randall Hauch 81fae5c3af KAFKA-7873; Always seek to beginning in KafkaBasedLog (#6203)
Explicitly seek KafkaBasedLog’s consumer to the beginning of the topic partitions, rather than potentially use committed offsets (which would be unexpected) if group.id is set or rely upon `auto.offset.reset=earliest` if the group.id is null.

This should not change existing behavior but should remove some potential issues introduced with KIP-287 if `group.id` is not set in the consumer configurations. Note that even if `group.id` is set, we still always want to consume from the beginning.

Reviewers: Jason Gustafson <jason@confluent.io>
2019-02-01 14:27:47 -08:00
Jarek Rudzinski e31b973122 MINOR: upgrade to jdk8 8u202
Upgrade from 171 to 202. Unpack and install directly from a cached tgz rather than going via the installer deb from webupd8. The installer is still on 8u191 while we want 202.

Testing via kafka branch builder job

https://jenkins.confluent.io/job/system-test-kafka-branch-builder/2305/

Author: Jarek Rudzinski <jarek@confluent.io>
Author: Ewen Cheslack-Postava <me@ewencp.org>

Reviewers: Alex Diachenko <sansanichfb@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #6165 from jarekr/trunk-jdk8-from-tgz

(cherry picked from commit ad3b6dd835)
Signed-off-by: Ewen Cheslack-Postava <me@ewencp.org>
2019-01-24 22:19:55 -08:00
mingaliu 801b07b7a8 KAFKA-7693; Fix SequenceNumber overflow in producer (#5989)
The sequence number is an Int and should wrap around when it reaches Int.MaxValue. The bug is that it does not wrap around; instead it becomes negative and raises an error.
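
The intended wrap-around can be sketched as follows. `SequenceNumbers.incrementSequence` is a hypothetical helper written for illustration, and it assumes both arguments are non-negative; the names are not taken from the patch.

```java
// Hypothetical helper illustrating the wrap-around; names are not taken from the patch.
final class SequenceNumbers {
    // Adds `increment` to `sequence`, wrapping from Integer.MAX_VALUE back to 0
    // instead of overflowing into negative values. Both arguments must be non-negative.
    static int incrementSequence(int sequence, int increment) {
        if (sequence > Integer.MAX_VALUE - increment) {
            return increment - (Integer.MAX_VALUE - sequence) - 1;
        }
        return sequence + increment;
    }

    public static void main(String[] args) {
        System.out.println(incrementSequence(5, 3));                     // prints "8"
        System.out.println(incrementSequence(Integer.MAX_VALUE, 1));     // prints "0"
        System.out.println(incrementSequence(Integer.MAX_VALUE - 1, 3)); // prints "1"
        System.out.println(Integer.MAX_VALUE + 1 < 0);                   // prints "true": plain addition overflows
    }
}
```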

Reviewers: Jason Gustafson <jason@confluent.io>
2019-01-24 16:49:55 -08:00
mingaliu e29114a328 KAFKA-7692; Fix ProducerStateManager SequenceNumber overflow (#5990)
This patch fixes a few overflow issues with wrapping sequence numbers in the broker's producer state tracking. 

Reviewers: Jason Gustafson <jason@confluent.io>
2019-01-24 16:38:44 -08:00
Robert Yokota 8397a0802e KAFKA-6833; Producer should await metadata for unknown partitions (#6073) (#6154)
This patch changes the behavior of KafkaProducer.waitOnMetadata to wait up to max.block.ms when the partition specified in the produce request is out of the range of partitions present in the metadata. This improves the user experience in the case when partitions are added to a topic and a client attempts to produce to one of the new partitions before the metadata has propagated to the brokers. Tested with unit tests.

Reviewers: Arjun Satish <arjun@confluent.io>, Jason Gustafson <jason@confluent.io>
2019-01-23 18:27:09 -08:00
Chris Egerton b199eba296 KAFKA-5117: Stop resolving externalized configs in Connect REST API
[KIP-297](https://cwiki.apache.org/confluence/display/KAFKA/KIP-297%3A+Externalizing+Secrets+for+Connect+Configurations#KIP-297:ExternalizingSecretsforConnectConfigurations-PublicInterfaces) introduced the `ConfigProvider` mechanism, which was primarily intended for externalizing secrets provided in connector configurations. However, when querying the Connect REST API for the configuration of a connector or its tasks, those secrets are still exposed. The changes here prevent the Connect REST API from ever exposing resolved configurations in order to address that. rhauch has given a more thorough writeup of the thinking behind this in [KAFKA-5117](https://issues.apache.org/jira/browse/KAFKA-5117)

Tested and verified manually. If these changes are approved unit tests can be added to prevent a regression.

Author: Chris Egerton <chrise@confluent.io>

Reviewers: Robert Yokota <rayokota@gmail.com>, Randall Hauch <rhauch@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #6129 from C0urante/hide-provided-connect-configs

(cherry picked from commit 743607af5a)
Signed-off-by: Ewen Cheslack-Postava <me@ewencp.org>
2019-01-23 11:11:35 -08:00
Jason Gustafson aaf56930db MINOR: Cleanup handling of mixed transactional/idempotent records (#6172)
Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>, Ismael Juma <ismael@juma.me.uk>, Colin Patrick McCabe <colin@cmccabe.xyz>
2019-01-22 09:00:14 -08:00
Arjun Satish 5d170e12db MINOR: Handle case where connector status endpoints returns 404 (#6176)
Reviewers: Randall Hauch <randall@confluent.io>, Matthias J. Sax <matthias@confluent.io>
2019-01-21 00:00:44 -08:00
Magesh Nandakumar 05e70e6b1c MINOR: Start Connect REST server in standalone mode to match distributed mode (KAFKA-7503 follow-up)
Start the Rest server in the standalone mode similar to how it's done for distributed mode.

Author: Magesh Nandakumar <magesh.n.kumar@gmail.com>

Reviewers: Arjun Satish <arjun@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #6148 from mageshn/KAFKA-7826

(cherry picked from commit dec68c9350)
Signed-off-by: Ewen Cheslack-Postava <me@ewencp.org>
2019-01-16 22:58:53 -08:00
Arjun Satish 6d7f6ddff1 KAFKA-7503: Connect integration test harness
Expose a programmatic way to bring up a Kafka and Zk cluster through Java API to facilitate integration tests for framework level changes in Kafka Connect. The Kafka classes would be similar to KafkaEmbedded in streams. The new classes would reuse the kafka.server.KafkaServer classes from :core, and provide a simple interface to bring up brokers in integration tests.

Signed-off-by: Arjun Satish <arjun@confluent.io>

Author: Arjun Satish <arjun@confluent.io>
Author: Arjun Satish <wicknicks@users.noreply.github.com>

Reviewers: Randall Hauch <rhauch@gmail.com>, Konstantine Karantasis <konstantine@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #5516 from wicknicks/connect-integration-test

(cherry picked from commit 69d8d2ea11)
Signed-off-by: Ewen Cheslack-Postava <me@ewencp.org>
2019-01-14 14:30:03 -08:00
John Roesler c5ed908476 KAFKA-7741: streams-scala - document dependency workaround (#6125)
Reviewers: Guozhang Wang <wangguoz@gmail.com>
2019-01-11 17:16:43 -08:00
John Roesler e066a37677 KAFKA-7741: Streams exclude javax dependency (#6121)
As documented in https://issues.apache.org/jira/browse/KAFKA-7741,
the javax dependency we receive transitively from connect is incompatible
with SBT builds.

Streams doesn't use the portion of Connect that needs the dependency,
so we can fix the builds by simply excluding it.

Reviewers: Bill Bejeck <bill@confluent.io>, Guozhang Wang <guozhang@confluent.io>
2019-01-10 15:51:52 -08:00
Jason Gustafson a6ab1d5094 KAFKA-7799; Fix flaky test RestServerTest.testCORSEnabled (#6106)
The test always fails if testOptionsDoesNotIncludeWadlOutput is executed before testCORSEnabled. It seems the problem is the use of the system property. Perhaps there is some static caching somewhere.

Reviewers: Randall Hauch <rhauch@gmail.com>, Guozhang Wang <wangguoz@gmail.com>
2019-01-09 00:14:42 -08:00
Chia-Ping Tsai dbaa15226b KAFKA-7253; The returned connector type is always null when creating connector (#5470)
The null map returned from the current snapshot causes the null type in the response. The connector class name can be taken from the config in the request instead, since we require that the config contain the connector class name.

Reviewers: Jason Gustafson <jason@confluent.io>
2019-01-08 08:55:47 -08:00
layfe ffdb220862 KAFKA-5503; Idempotent producer ignores shutdown while fetching ProducerId (#5881)
Check `running` in `Sender.maybeWaitForProducerId` to ensure that the producer can be closed while awaiting initialization of the producerId.

Reviewers: Jason Gustafson <jason@confluent.io>
2019-01-02 15:05:19 -08:00
Flavien Raynaud cb288b77ba MINOR: Improve exception messages in FileChannelRecordBatch (#6068)
Replace `channel` by `fileRecords` in potentially thrown KafkaException
descriptions when loading/writing `FileChannelRecordBatch`. This makes exception
messages more readable (channel only shows an object hashcode, fileRecords shows
the path of the file being read and start/end positions in the file).

Reviewers: Jason Gustafson <jason@confluent.io>
2018-12-28 14:05:13 -08:00
Renato Mefi df5336e1aa KAFKA-3832; Kafka Connect's JSON Converter never outputs a null value (#6027)
When using the Connect `JsonConverter`, it's impossible to produce tombstone messages, which affects compaction of the topic. This patch allows the converter, with and without schemas, to output a NULL byte value in order to produce a proper tombstone message. When reading this data back into a Connect record, the approach is the same as when the payload looks like `"{ "schema": null, "payload": null }"`, so the sink connectors maintain their functionality and the BCC is reduced.
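
The null handling described above can be sketched in isolation. `NullAwareJsonSerializer` is a hypothetical, heavily simplified stand-in that works on JSON strings; it is not the `JsonConverter` implementation and ignores schemas entirely.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical, simplified stand-in for a converter; not the JsonConverter code.
final class NullAwareJsonSerializer {
    // A null value is written as null bytes, so the record acts as a tombstone
    // on a compacted topic instead of carrying a serialized "null" payload.
    static byte[] serialize(String jsonValue) {
        return jsonValue == null ? null : jsonValue.getBytes(StandardCharsets.UTF_8);
    }

    // The reverse direction maps null bytes back to a null value.
    static String deserialize(byte[] bytes) {
        return bytes == null ? null : new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(serialize(null) == null);                // prints "true"
        System.out.println(deserialize(null) == null);              // prints "true"
        System.out.println(deserialize(serialize("{\"id\":1}")));   // prints {"id":1}
    }
}
```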

Reviewers: Gunnar Morling <gunnar.morling@googlemail.com>, Randall Hauch <rhauch@gmail.com>, Jason Gustafson <jason@confluent.io>
2018-12-28 09:42:43 -08:00
Bill Bejeck 38fd13d9c0 MINOR: standby task test throughput too low 2.0 (#6062)
Previous PR #6043 reduced throughput for VerifiableProducer in base class, but the streams_standby_replica_test needs higher throughput for consumer to complete verification in 60 seconds. Same update as #6060 and #6061

Reviewers: Guozhang Wang <wangguoz@gmail.com>
2018-12-21 22:41:28 -08:00
Alex Diachenko 9736f97d2f KAFKA-7759; Disable WADL output in the Connect REST API (#6051)
This patch disables support for WADL output in the Connect REST API since it was never intended to be exposed. 

Reviewers: Randall Hauch <rhauch@gmail.com>, Jason Gustafson <jason@confluent.io>
2018-12-20 14:28:57 -08:00
Bill Bejeck ea9cb3ce4c MINOR: Streams broker down flaky test (#6041)
This PR addresses a few issues with this system test flakiness. I'll issue similar PRs for 2.1 and trunk as well.

1. Need to grab the monitor before a given operation to observe logs for signal
2. Relied too much on a timely rebalance and only sent a handful of messages.
I've updated the test and ran it here https://jenkins.confluent.io/job/system-test-kafka-branch-builder/2141/ parameterized for 15 repeats; all passed.

Reviewers: Guozhang Wang <wangguoz@gmail.com>
2018-12-19 17:53:42 -08:00
Stig Rohde Døssing 9328a137df KAFKA-7616; Make MockConsumer only add entries to the partition map returned by poll() if there are any records to return

The MockConsumer behaves unlike the real consumer in that it can return a non-empty ConsumerRecords from poll, that also has a count of 0. This change makes the MockConsumer only add partitions to the ConsumerRecords if there are records to return for those partitions.
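
The behavior change amounts to skipping partitions that have nothing to return. A minimal sketch of that filtering follows, using plain maps and strings; `PolledRecords` and the partition names are hypothetical and do not come from `MockConsumer`.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; plain collections stand in for ConsumerRecords.
final class PolledRecords {
    // Keeps only the partitions that actually have records, so a result with
    // zero records is also an empty map.
    static <K, V> Map<K, List<V>> nonEmptyOnly(Map<K, List<V>> recordsByPartition) {
        Map<K, List<V>> result = new LinkedHashMap<>();
        for (Map.Entry<K, List<V>> entry : recordsByPartition.entrySet()) {
            if (!entry.getValue().isEmpty()) {
                result.put(entry.getKey(), entry.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> polled = new LinkedHashMap<>();
        polled.put("topic-0", Arrays.asList("a", "b"));
        polled.put("topic-1", Collections.<String>emptyList());
        System.out.println(nonEmptyOnly(polled).keySet()); // prints "[topic-0]"

        Map<String, List<String>> nothing = new LinkedHashMap<>();
        nothing.put("topic-1", Collections.<String>emptyList());
        System.out.println(nonEmptyOnly(nothing).isEmpty()); // prints "true"
    }
}
```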

A unit test in MockConsumerTest demonstrates the issue.

Author: Stig Rohde Døssing <stigdoessing@gmail.com>

Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>

Closes #5901 from srdo/KAFKA-7616
2018-12-17 10:40:36 -08:00
Pasquale Vazzana 46a2e50055 KAFKA-7655 Metadata spamming requests from Kafka Streams under some circumstances, potential DOS (#5929)
Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <guozhang@confluent.io>
2018-12-13 17:31:04 +01:00
linyli001 071d6de3ed KAFKA-7443: OffsetOutOfRangeException in restoring state store from changelog topic when start offset of local checkpoint is smaller than that of changelog topic (#5946)
Reviewers: Matthias J. Sax <matthias@confluent.io>, John Roesler <john@confluent.io>
2018-12-11 08:48:35 +00:00
Anna Povzner fc9532d44e KAFKA-6388; Recover from rolling an empty segment that already exists (#5986)
There were several reported incidents where the log is rolled to a new segment with the same base offset as an active segment, causing KafkaException: Trying to roll a new log segment for topic partition X-N with start offset M while it already exists. In the cases we have seen, this happens to an empty log segment where there is long idle time before the next append and somehow we get to a state where offsetIndex.isFull() returns true due to _maxEntries == 0. This PR recovers from this state by deleting and recreating the segment and all of its associated index files.

Reviewers: Jason Gustafson <jason@confluent.io>
2018-12-05 15:02:59 -08:00
Jonathan Santilli 87a37c5a58 KAFKA-7678: Avoid NPE when closing the RecordCollector (#5993)
Reviewers: Matthias J. Sax <matthias@confluent.io>, Bill Bejeck <bill@confluent.io>
2018-12-05 14:09:28 -08:00
Cyrus Vafadari 4baf0afd04 MINOR: Safe string conversion to avoid NPEs
Should be ported back to 2.0

Author: Cyrus Vafadari <cyrus@confluent.io>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #6004 from cyrusv/cyrus-npe

(cherry picked from commit 9f954ac614)
Signed-off-by: Ewen Cheslack-Postava <me@ewencp.org>
2018-12-05 13:24:17 -08:00
John Roesler 985287a274 KAFKA-7660: fix streams and Metrics memory leaks (#5980)
2018-12-04 13:51:18 -08:00
Rajini Sivaram 85ae92cf89 KAFKA-7702: Fix matching of prefixed ACLs to match single char prefix (#5994)
Reviewers: Jun Rao <junrao@gmail.com>
2018-12-04 09:51:45 +00:00
Guozhang Wang fc39e0639f MINOR: improve QueryableStateIntegrationTest (#5987)
2018-12-02 22:17:01 -08:00
Cyrus Vafadari 1f16ae4aa1 MINOR: Add logging to Connect SMTs
Includes Update to ConnectRecord string representation to give
visibility into schemas, useful in SMT tracing

Author: Cyrus Vafadari <cyrus@confluent.io>

Reviewers: Randall Hauch <rhauch@gmail.com>, Konstantine Karantasis <konstantine@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #5860 from cyrusv/cyrus-logging

(cherry picked from commit 4712a36416)
Signed-off-by: Ewen Cheslack-Postava <me@ewencp.org>
2018-11-29 23:01:22 -08:00
Bill Bejeck 1d560edfd3 KAFKA-7671: Stream-Global Table join should not reset repartition flag (#5959)
This PR fixes an issue reported from a user. When we join a KStream with a GlobalKTable we should not reset the repartition flag as the stream may have previously changed its key, and the resulting stream could be used in an aggregation operation or join with another stream which may require a repartition for correct results.

I've added a test which fails without the fix.

Reviewers: John Roesler <john@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
2018-11-28 18:22:06 -08:00
Robert Yokota e7298f4fc5 KAFKA-7620: Fix restart logic for TTLs in WorkerConfigTransformer
The restart logic for TTLs in `WorkerConfigTransformer` was broken when trying to make it toggle-able. Accessing the toggle through the `Herder` causes the same code to be called recursively. This fix just accesses the toggle by simply looking in the properties map that is passed to `WorkerConfigTransformer`.

Author: Robert Yokota <rayokota@gmail.com>

Reviewers: Magesh Nandakumar <magesh.n.kumar@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #5914 from rayokota/KAFKA-7620

(cherry picked from commit a2e87feb8b)
Signed-off-by: Ewen Cheslack-Postava <me@ewencp.org>
2018-11-27 22:02:17 -08:00
John Roesler cc427c23f0 MINOR: increase system test kafka start timeout (#5934)
The Kafka Streams system tests fail with some regularity due to a timeout starting the broker.

The initial start is quite quick, but many of our tests involve stopping and restarting nodes with data already loaded, and also while processing is ongoing.

Under these conditions, it seems to be normal for the broker to take about 25 seconds to start, which makes the 30 second timeout too close for comfort.
I have seen many test failures in which the broker successfully started within a couple of seconds after the tests timed out and already initiated the failure/shut-down sequence.

Reviewers: Guozhang Wang <wangguoz@gmail.com>
2018-11-21 11:50:49 -08:00
Guozhang Wang c6a9abe3fa KAFKA-7536: Initialize TopologyTestDriver with non-null topic (#5923)
In TopologyTestDriver constructor set non-null topic; and in unit test intentionally turn on caching to verify this case.

Reviewers: Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>
2018-11-20 14:47:58 -08:00
Rajini Sivaram ad40bed8d9 KAFKA-7576; Fix shutdown of replica fetcher threads (#5875)
ReplicaFetcherThread.shutdown attempts to close the fetcher's Selector while the thread is running. This is unsafe and can result in `Selector.close()` failing with an exception. The exception is caught and logged at debug level, but this can lead to a socket leak if the shutdown is due to a dynamic config update rather than broker shutdown. This PR changes the shutdown logic to close the Selector after the replica fetcher thread is shut down, with a wakeup() and a flag to terminate blocking sends first.
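
The ordering described above (signal the thread, wake it up, wait for it to exit, and only then close the shared resource) can be sketched with plain threads. Everything in `OrderedShutdown` is hypothetical: `Resource` stands in for a selector, and `interrupt()` plays the role of the wakeup.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch of the shutdown ordering; not the ReplicaFetcherThread code.
final class OrderedShutdown {
    // Stand-in for a resource, such as a selector, that the worker uses while running.
    static final class Resource implements AutoCloseable {
        final AtomicBoolean closed = new AtomicBoolean(false);

        @Override
        public void close() {
            closed.set(true);
        }
    }

    static final class Worker extends Thread {
        private final AtomicBoolean running = new AtomicBoolean(true);
        private final Resource resource;
        // Set if the worker ever observes the resource closed while it is still looping.
        volatile boolean sawClosedResource = false;

        Worker(Resource resource) {
            this.resource = resource;
        }

        @Override
        public void run() {
            while (running.get()) {
                if (resource.closed.get()) {
                    sawClosedResource = true;
                }
                try {
                    Thread.sleep(1); // stands in for a blocking call
                } catch (InterruptedException e) {
                    // Woken up: fall through and re-check the running flag.
                }
            }
        }

        void shutdown() {
            running.set(false); // 1. flag tells the loop to stop
            interrupt();        // 2. wake the thread out of its blocking call
            try {
                join();         // 3. wait until the thread has really exited
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting for worker", e);
            }
            resource.close();   // 4. only now is it safe to close the resource
        }
    }

    public static void main(String[] args) {
        Resource resource = new Resource();
        Worker worker = new Worker(resource);
        worker.start();
        worker.shutdown();
        System.out.println(resource.closed.get());    // prints "true"
        System.out.println(worker.sawClosedResource); // prints "false"
    }
}
```

Because `close()` runs after `join()` returns, the worker can never be using the resource while it is being closed.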

Reviewers: Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>
2018-11-16 10:23:21 +00:00
Matthias J. Sax 40cc817f3b KAFKA-7584: StreamsConfig throws ClassCastException if max.in.flight.requests.per.connection is specified as String (#5874)
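
The failure mode named in the title is a blind cast of a configuration value that was supplied as a String. A minimal sketch of a tolerant conversion follows; `ConfigValues.asInt` is a hypothetical helper and not the change made in this commit.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; not the StreamsConfig code.
final class ConfigValues {
    // Accepts either an Integer or a numeric String instead of casting blindly.
    static int asInt(String name, Object value) {
        if (value instanceof Integer) {
            return (Integer) value;
        }
        if (value instanceof String) {
            try {
                return Integer.parseInt(((String) value).trim());
            } catch (NumberFormatException e) {
                throw new IllegalArgumentException(
                        "Invalid value " + value + " for configuration " + name, e);
            }
        }
        throw new IllegalArgumentException(
                "Invalid value " + value + " for configuration " + name);
    }

    public static void main(String[] args) {
        String name = "max.in.flight.requests.per.connection";
        Map<String, Object> props = new HashMap<>();
        props.put(name, "5");
        // A direct (Integer) props.get(name) would throw ClassCastException here.
        System.out.println(asInt(name, props.get(name)) + 1); // prints "6"
    }
}
```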
Reviewers: Bill Bejeck <bill@confluent.io>, John Roesler <john@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
2018-11-15 15:10:00 -08:00
Randall Hauch c27aacf51d MINOR: Avoid logging connector configuration in Connect framework (#5868)
Some connector configs may be sensitive, so we should avoid logging them.

Reviewers: Alex Diachenko, Dustin Cote <dustin@confluent.io>, Jason Gustafson <jason@confluent.io>
2018-11-13 13:34:41 -08:00
Manikumar Reddy O fa0189de51 MINOR: Update version to 2.0.2-SNAPSHOT (#5892)
2018-11-09 01:03:19 +05:30