Commit Graph

615 Commits

Author SHA1 Message Date
Magesh Nandakumar 586c587b3d KAFKA-8974: Trim whitespaces in topic names in sink connector configs (#7442)
Trim whitespaces in topic names specified in sink connector configs before subscribing to the consumer. Topic names don't allow whitespace characters, so trimming only will eliminate potential problems and will not place additional limits on topics specified in sink connectors.

Author: Magesh Nandakumar <magesh.n.kumar@gmail.com>
Reviewers: Arjun Satish <arjun@confluent.io>, Randall Hauch <rhauch@gmail.com>
2019-10-04 13:12:32 -05:00
Nigel Liang ded1fb8c4d KAFKA-6290: Support casting from logical types in cast transform (#7371)
Adds support for the Connect Cast transforms to cast from Connect logical types, such as DATE, TIME, TIMESTAMP, and DECIMAL. Casting to numeric types will produce the underlying numeric value represented in the desired type. For logical types represented by underlying Java Date class, this means the milliseconds since EPOCH. For Decimal, this means the underlying value. If the value does not fit in the desired target type, it may overflow.

Casting to String from Date, Time, and Timestamp types will produce their ISO 8601 representation. Casting to String from Decimal will result in the value represented as a string. e.g. 1234 -> "1234".

Author: Nigel Liang <nigel@nigelliang.com>
Reviewer: Randall Hauch <rhauch@gmail.com>
2019-10-03 14:55:52 -05:00
Gunnar Morling e76770343c KAFKA-8523 Enabling InsertField transform to be used with tombstone events (#6914)
* KAFKA-8523 Avoiding raw type usage

* KAFKA-8523 Gracefully handling tombstone events in InsertField SMT
2019-10-03 13:51:00 -05:00
Cyrus Vafadari 5b829f2b61 KAFKA-8447: New Metric to Measure Number of Tasks on a Connector (#6843)
Implemented KIP-475 to add new metrics for each worker to expose the number of current tasks per connector and per status.

Author: Cyrus Vafadari <cyrus@confluent.io>
Reviewer: Randall Hauch <rhauch@gmail.com>, Boyang Chen <Boyang Chen <boyang@confluent.io>
2019-10-02 19:42:21 -05:00
Chris Egerton 791d0d61bf KAFKA-8804: Secure internal Connect REST endpoints (#7310)
Implemented KIP-507 to secure the internal Connect REST endpoints that are only for intra-cluster communication. A new V2 of the Connect subprotocol enables this feature, where the leader generates a new session key, shares it with the other workers via the configuration topic, and workers send and validate requests to these internal endpoints using the shared key.

Currently the internal `POST /connectors/<connector>/tasks` endpoint is the only one that is secured.

This change adds unit tests and makes some small alterations to system tests to target the new `sessioned` Connect subprotocol. A new integration test ensures that the endpoint is actually secured (i.e., requests with missing/invalid signatures are rejected with a 400 BAD RESPONSE status).

Author: Chris Egerton <chrise@confluent.io>
Reviewed: Konstantine Karantasis <konstantine@confluent.io>, Randall Hauch <rhauch@gmail.com>
2019-10-02 17:06:57 -05:00
Arjun Satish 1c831c22e1 KAFKA-7772: Dynamically Adjust Log Levels in Connect (#7403)
Implemented KIP-495 to expose a new `admin/loggers` endpoint for the Connect REST API that lists the current log levels and allows the caller to change log levels. 

Author: Arjun Satish <arjun@confluent.io>
Reviewer: Randall Hauch <rhauch@gmail.com>
2019-10-02 17:00:37 -05:00
Almog Gavra bfcc17f211 KAFKA-8595: Support deserialization of JSON decimals encoded in NUMERIC (#7354)
Implemented KIP-481 by adding support for deserializing Connect DECIMAL values encoded in JSON as numbers, in addition to raw byte array (base64) format used previously.

Author: Almog Gavra <almog@confluent.io>
Reviewers: Chris Egerton <chrise@confluent.io>, Konstantine Karantasis <konstantine@confluent.io>, Randall Hauch <rhauch@gmail.com>
2019-10-01 15:33:06 -05:00
Tu V. Tran f6f24c4700 KAFKA-8729, pt 2: Add error_records and error_message to PartitionResponse (#7150)
As noted in the KIP-467, the updated ProduceResponse is

```
Produce Response (Version: 8) => [responses] throttle_time_ms
  responses => topic [partition_responses]
    topic => STRING
    partition_responses => partition error_code base_offset log_append_time log_start_offset
      partition => INT32
      error_code => INT16
      base_offset => INT64
      log_append_time => INT64
      log_start_offset => INT64
      error_records => [INT32]         // new field, encodes the relative offset of the records that caused error
      error_message => STRING          // new field, encodes the error message that client can use to log itself
    throttle_time_ms => INT32
with a new error code:
```

INVALID_RECORD(86, "Some record has failed the validation on broker and hence be rejected.", InvalidRecordException::new);

Reviewers: Jason Gustafson <jason@confluent.io>, Magnus Edenhill <magnus@edenhill.se>, Guozhang Wang <wangguoz@gmail.com>
2019-09-30 19:29:36 -07:00
Yaroslav Tkachenko 70d1bb40d9 KAFKA-7273: Extend Connect Converter to support headers (#6362)
Implemented KIP-440 to allow Connect converters to use record headers when serializing or deserializing keys and values. This change is backward compatible in that the new methods default to calling the older existing methods, so existing Converter implementations need not be changed. This changes the WorkerSinkTask and WorkerSourceTask to use the new converter methods, but Connect's existing Converter implementations and the use of converters for internal topics are intentionally not modified. Added unit tests.

Author: Yaroslav Tkachenko <sapiensy@gmail.com>
Reviewers: Ryanne Dolan <ryannedolan@gmail.com>, Ewen Cheslack-Postava <me@ewencp.org>, Randall Hauch <rhauch@gmail.com>
2019-09-25 11:23:03 -05:00
Mickael Maison ac385c4c3a KAFKA-8474; Use HTML lists for config layout (#6870)
Replace the `<table>` elements by `<ul>` so the full page width can be used for the configuration descriptions instead of only a very narrow column. I moved the other fields (Type, Default Value, etc) below each entry.

Reviewers: Boyang Chen <boyang@confluent.io>, Jason Gustafson <jason@confluent.io>
2019-09-13 15:48:36 -07:00
Konstantine Karantasis 74fc3323a5 MINOR: Add unit test for KAFKA-8676 to guard against unrequired task restarts (#7287)
Added unit test for recent fix of `KafkaConfigBackingStore`.

Author: Konstantine Karantasis <konstantine@confluent.io>
Reviewer: Randall Hauch <rhauch@gmail.com>
2019-09-03 15:48:19 -05:00
LuyingLiu 31a5f92b9f Changed for updatedTasks, avoids stopping and starting of unnecessary tasks (#7097)
Corrected the `KafkaConfigBackingStore` logic to notify of only the changed tasks, rather than all tasks. This was not noticed before because Connect always stopped and restarted all tasks during a rebalanced, but since 2.3 the incremental rebalance logic exposed this bug.

Author: Luying Liu <lyliu@lyliu-mac.freewheelmedia.net>
Reviewers: Konstantine Karantasis <konstantine@confluent.io>, Randall Hauch <rhauch@gmail.com>
2019-09-03 15:47:17 -05:00
Konstantine Karantasis d7f8ec8628 MINOR: Fix the doc of scheduled.rebalance.max.delay.ms config property (#7242)
Author: Konstantine Karantasis <konstantine@confluent.io>
Reviewer: Randall Hauch <rhauch@gmail.com>
2019-08-25 17:35:03 -05:00
Randall Hauch 5b4e749dc6 KAFKA-8391; Temporarily ignore flaky Connect rebalance integration tests
I've spent quite a bit of time on trying to discover the root cause, with no luck so far. I have been able to reproduce it locally by running the following 100 times:
```
./gradlew connect:runtime:clean connect:runtime:test --tests org.apache.kafka.connect.integration.RebalanceSourceConnectorsIntegrationTest
```
The `testReconfigConnector` test failed 28% of the time and the others failed 0%. This issue and KAFKA-8661 suggest that `testDeleteConnector` and `testStartTwoConnectors` are also flaky, though I've not seen those tests fail locally.

Because this flakiness is causing issues for the rest of the project, I'm going to temporarily ignore several of the flaky ITs while I continue to investigate:
* `RebalanceSourceConnectorsIntegrationTest.testReconfigConnector`
* `RebalanceSourceConnectorsIntegrationTest.testDeleteConnector`
* `RebalanceSourceConnectorsIntegrationTest.testStartTwoConnectors`

**This should be backported to the `2.3` branch, which is when these integration tests were first added.**

Author: Randall Hauch <rhauch@gmail.com>

Reviewers: Ismael Juma

Closes #7237 from rhauch/kafka-8391-temporary
2019-08-25 14:30:23 -07:00
Chris Egerton 237e83dea0 KAFKA-8586: Fail source tasks when producers fail to send records (#6993)
Changed Connect's `WorkerSourceTask` to capture non-retriable exceptions from the `producer.send(...)` (e.g., authentication or authorization errors) and to fail the connector task when such an error is encountered. Modified the existing unit tests to verify this functionality.

Note that most producer errors are retriable, and Connect will (by default) set up each producer with 1 max in-flight message and infinite retries. This change only affects non-retriable errors.
2019-08-25 15:54:00 -05:00
Randall Hauch e5657bc910 KAFKA-8391; Improved the Connect integration tests to make them less flaky
Added the ability for the connector handles and task handles, which are used by the monitorable source and sink connectors used to verify the functionality of the Connect framework, to record the number of times the connector and tasks have each been started, and to allow a test to obtain a `RestartLatch` that can be used to block until the connectors and/or tasks have been restarted a specified number of types.

Typically, a test will get the `ConnectorHandle` for a connector, and call the `ConnectorHandle.expectedRestarts(int)` method with the expected number of times that the connector and/or tasks will be restarted, and will hold onto the resulting `RestartLatch`. The test will then change the connector (or otherwise cause the connector to restart) one or more times as desired, and then call `RestartLatch.await(long, TimeUnit)` to block the test up to a specified duration for the connector and all tasks to be started the specified number of times.

This commit also increases several of the maximum wait times used in other integration tests. It doesn’t hurt to potentially wait longer, since most test runs will not need to wait the maximum amount of time anyway. However, in the rare cases that do need that extra time, waiting a bit more is fine if we can reduce the flakiness and minimize test failures that happened to time out too early.

Unit tests were added for the new `RestartLatch` and `StopAndStartCounter` utility classes. This PR only affects the tests and does not affect any runtime code or API.

**This should be merged on `trunk` and backported to the `2.3.x` branch.**

Author: Randall Hauch <rhauch@gmail.com>

Reviewers: Konstantine Karantasis, Arjun Satish

Closes #7019 from rhauch/kafka-8391
2019-08-13 11:14:41 -07:00
Paul e2c8612d01 KAFKA-7941: Catch TimeoutException in KafkaBasedLog worker thread (#6283)
When calling readLogToEnd(), the KafkaBasedLog worker thread should catch TimeoutException and log a warning, which can occur if brokers are unavailable, otherwise the worker thread terminates.

Includes an enhancement to MockConsumer that allows simulating exceptions not just when polling but also when querying for offsets, which is necessary for testing the fix.

Author: Paul Whalen <pgwhalen@gmail.com>
Reviewers: Randall Hauch <rhauch@gmail.com>, Arjun Satish <arjun@confluent.io>, Ryanne Dolan <ryannedolan@gmail.com>
2019-08-13 10:16:54 -05:00
Arjun Satish 794637232c KAFKA-8774: Regex can be found anywhere in config value (#7197)
Corrected the AbstractHerder to correctly identify task configs that contain variables for externalized secrets. The original method incorrectly used `matcher.matches()` instead of `matcher.find()`. The former method expects the entire string to match the regex, whereas the second one can find a pattern anywhere within the input string (which fits this use case more correctly).

Added unit tests to cover various cases of a config with externalized secrets, and updated system tests to cover case where config value contains additional characters besides secret that requires regex pattern to be found anywhere in the string (as opposed to complete match).

Author: Arjun Satish <arjun@confluent.io>
Reviewer: Randall Hauch <rhauch@gmail.com>
2019-08-13 09:40:12 -05:00
John Roesler a8aedc85eb KAFKA-8696: clean up Sum/Count/Total metrics (#7057)
* Clean up one redundant and one misplaced metric
* Clarify the relationship among these metrics to avoid future confusion

Reviewers: Matthias J. Sax <matthias@confluent.io>, Bill Bejeck <bill@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
2019-07-23 16:54:20 -07:00
Andy Coates 2a133ba656 KAFKA-8454; Add Java AdminClient Interface (KIP-476) (#7087)
Adds an `Admin` interface as specified in [KIP-476](https://cwiki.apache.org/confluence/display/KAFKA/KIP-476%3A+Add+Java+AdminClient+Interface).

Reviewers: Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>
2019-07-22 15:47:34 -07:00
Jason Gustafson f65c71cf6e
MINOR: Increase `awaitCommits` timeout in ExampleConnectIntegrationTest (#7061)
The transient failures are usually caused by a timeout in `awaitCommits`. This patch increases the timeout from 15s to 30s.

Reviewers: Konstantine Karantasis <konstantine@confluent.io>, Matthias J. Sax <mjsax@apache.org>
2019-07-16 12:00:49 -07:00
Jason Gustafson 1d873a9de9
MINOR: Use dynamic port in `RestServerTest` (#7079)
We have seen some failures recently in `RestServerTest`. It's the usual problem with reliance on static ports. 
```
Caused by: java.io.IOException: Failed to bind to 0.0.0.0/0.0.0.0:8083
	at org.eclipse.jetty.server.ServerConnector.openAcceptChannel(ServerConnector.java:346)
	at org.eclipse.jetty.server.ServerConnector.open(ServerConnector.java:308)
	at org.eclipse.jetty.server.AbstractNetworkConnector.doStart(AbstractNetworkConnector.java:80)
	at org.eclipse.jetty.server.ServerConnector.doStart(ServerConnector.java:236)
	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
	at org.eclipse.jetty.server.Server.doStart(Server.java:396)
	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
	at org.apache.kafka.connect.runtime.rest.RestServer.initializeServer(RestServer.java:178)
	... 56 more
Caused by: java.net.BindException: Address already in use
```
This patch makes the chosen port dynamic.

Reviewers: Ismael Juma <ismael@juma.me.uk>
2019-07-12 15:29:10 -07:00
Robert Yokota fa042bc491 KAFKA-7157: Fix handling of nulls in TimestampConverter (#7070)
Fix handling of nulls in TimestampConverter.

Authors: Valeria Vasylieva <valeria.vasylieva@gmail.com>, Robert Yokota <rayokota@gmail.com>
Reviewers: Arjun Satish <arjun@confluent.io>, Randall Hauch <rhauch@gmail.com>
2019-07-12 12:12:20 -05:00
Michał Borowiecki fc4fea6761 KAFKA-6605: Fix NPE in Flatten when optional Struct is null (#5705)
Correct the Flatten SMT to properly handle null key or value `Struct` instances.

Author: Michal Borowiecki <michal.borowiecki@openbet.com>
Reviewers: Arjun Satish <arjun@confluent.io>, Robert Yokota <rayokota@gmail.com>, Randall Hauch <rhauch@gmail.com>
2019-07-12 10:27:33 -05:00
Nacho Muñoz Gómez 289ac09292 KAFKA-8591; WorkerConfigTransformer NPE on connector configuration reloading (#6991)
A bug in `WorkerConfigTransformer` prevents the connector configuration reload when the ConfigData TTL expires. 

The issue boils down to the fact that `worker.herder().restartConnector` is receiving a null callback. 

```
[2019-06-17 14:34:12,320] INFO Scheduling a restart of connector workshop-incremental in 60000 ms (org.apache.kafka.connect.runtime.WorkerConfigTransformer:88)
[2019-06-17 14:34:12,321] ERROR Uncaught exception in herder work thread, exiting:  (org.apache.kafka.connect.runtime.distributed.DistributedHerder:227)
java.lang.NullPointerException
        at org.apache.kafka.connect.runtime.distributed.DistributedHerder$19.onCompletion(DistributedHerder.java:1187)
        at org.apache.kafka.connect.runtime.distributed.DistributedHerder$19.onCompletion(DistributedHerder.java:1183)
        at org.apache.kafka.connect.runtime.distributed.DistributedHerder.tick(DistributedHerder.java:273)
        at org.apache.kafka.connect.runtime.distributed.DistributedHerder.run(DistributedHerder.java:219)
```
This patch adds a callback which just logs the error.

Reviewers: Robert Yokota <rayokota@gmail.com>, Jason Gustafson <jason@confluent.io>
2019-07-08 23:07:10 -07:00
Cheng Pan 0b40d27647 MINOR: Remove redundant placeholder in log message (#7016)
Reviewers: Jason Gustafson <jason@confluent.io>
2019-07-03 11:23:23 -07:00
Jason Gustafson b0935e548b
MINOR: Embedded connect cluster should mask exit procedures by default (#7028)
`EmbeddedConnectCluster` has the ability to mask system exits to avoid killing the jvm. It appears that the default was intended to be `true`, but is actually `false`. The `maskExitProcedures` method on `EmbeddedConnectCluster.Builder` documents the parameter as:

```
* @param mask if false, exit and halt procedures remain unchanged; true is the default.
```
Because this is not enabled by default as intended, we are seeing some build failures which exit abruptly:
```
17:29:11 Execution failed for task ':connect:runtime:integrationTest'.
17:29:11 > Process 'Gradle Test Executor 25' finished with non-zero exit value 1
```
The culprit often appears to be `ExampleConnectIntegrationTest`, which indeed does not override the default value of `maskExitProcedures`.

Reviewers: Ewen Cheslack-Postava <me@ewencp.org>
2019-07-02 22:15:22 -07:00
Konstantine Karantasis 11b25a13ee MINOR: Fix DistributedHerderTest after adding reason to maybeLeaveGroup (#6982)
Mocking of WorkerCoordinator was not precise after adding an argument (reason) to AbstractCoordinator#maybeLeaveGroup in KAFKA-8569:

Unit test case for DistributedHerderTest is now precise with respect to the expected argument and succeeds

Reviewers: Boyang Chen <boyang@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
2019-06-21 12:42:34 -07:00
Boyang Chen 03d61ebfb9 KAFKA-8569: integrate warning message under static membership (#6972)
Static members never leave the group, so potentially we could log a flooding number of warning messages in the hb thread. The solution is to only log as warning when we are on dynamic membership.

Reviewers: Guozhang Wang <wangguoz@gmail.com>
2019-06-20 13:56:00 -07:00
Boyang Chen 1b9e107388 KAFKA-7853: Refactor coordinator config (#6854)
An attempt to refactor current coordinator logic.

Reviewers: Stanislav Kozlovski <stanislav_kozlovski@outlook.com>, Konstantine Karantasis <konstantine@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
2019-06-17 10:58:43 -07:00
Boyang Chen cca05cace4 KAFKA-8331: stream static membership system test (#6877)
As title suggested, we boost 3 stream instances stream job with one minute session timeout, and once the group is stable, doing couple of rolling bounces for the entire cluster. Every rejoin based on restart should have no generation bump on the client side.

Reviewers: Guozhang Wang <wangguoz@gmail.com>,  Bill Bejeck <bbejeck@gmail.com>
2019-06-07 16:52:12 -04:00
Hai-Dang Dam 1a3fe9aa52 KAFKA-8404: Add HttpHeader to RestClient HTTP Request and Connector REST API (#6791)
When Connect forwards a REST request from one worker to another, the Authorization header was not forwarded. This commit changes the Connect framework to add include the authorization header when forwarding requests to other workers.

Author: Hai-Dang Dam <damquanghaidang@gmail.com>
Reviewers: Robert Yokota <rayokota@gmail.com>, Randall Hauch <rhauch@gmail.com>
2019-06-03 21:06:00 -05:00
Konstantine Karantasis 3c7c988e39 KAFKA-8449: Restart tasks on reconfiguration under incremental cooperative rebalancing (#6850)
Restart task on reconfiguration under incremental cooperative rebalancing, and keep execution paths separate for config updates between eager and cooperative. Include the group generation in the log message when the worker receives its assignment.

Author: Konstantine Karantasis <konstantine@confluent.io>
Reviewer: Randall Hauch <rhauch@gmail.com>
2019-06-03 11:13:39 -05:00
Konstantine Karantasis 17345b3be5 KAFKA-8463: Fix redundant reassignment of tasks when leader worker leaves (#6859)
Author: Konstantine Karantasis <konstantine@confluent.io>
Reviewer: Randall Hauch <rhauch@gmail.com>
2019-06-02 12:19:19 -05:00
Chris Egerton 5351efe48a KAFKA-8407: Fix validation of class and list configs in connector client overrides (#6789)
Because of how config values are converted into strings in the `AbstractHerder.validateClientOverrides()` method after being validated by the client override policy, an exception is thrown if the value returned by the policy isn't already parsed as the type expected by the client `ConfigDef`. The fix here involves parsing client override properties before passing them to the override policy.

A unit test is added to ensure that several different types of configs are validated properly by the herder.

Author: Chris Egerton <chrise@confluent.io>
Reviewers: Magesh Nandakumar <magesh.n.kumar@gmail.com>, Randall Hauch <rhauch@gmail.com>
2019-05-23 14:21:19 -05:00
Konstantine Karantasis 68a3eb373b KAFKA-8415: Interface ConnectorClientConfigOverridePolicy needs to be excluded from class loading isolation (#6796)
Author: Konstantine Karantasis <konstantine@confluent.io>
Reviewer: Randall Hauch <rhauch@gmail.com>
2019-05-23 14:14:09 -05:00
sdreynolds 89f331eac3 KAFKA-8229; Reset WorkerSinkTask offset commit interval after task commit (#6579)
Prior to this change, the next commit time advances
_each_ time a commit happens -- including when a commit happens
because it was requested by the `Task`. When a `Task` requests a
commit several times, the clock advances far into the future
which prevents expected periodic commits from happening.

This commit changes the behavior, we reset `nextCommit` relative
to the time of the commit.

Reviewers: Jason Gustafson <jason@confluent.io>
2019-05-22 23:50:48 -07:00
Boyang Chen cafdc1e7df KAFKA-8399: bring back internal.leave.group.on.close config for KStream (#6779)
As title states. We plan to merge this to both trunk and 2.3 if it could fix the stream system tests globally.
Reference implementation: #6673

Reviewers: Guozhang Wang <wangguoz@gmail.com>, Matthias J. Sax <mjsax@apache.org>
2019-05-22 11:03:00 -04:00
Magesh Nandakumar 126230dad0 KAFKA-8265: Fix override config name to match KIP-458. (#6776)
Author: Magesh Nandakumar <magesh.n.kumar@gmail.com>
Reviewer: Randall Hauch <rhauch@gmail.com>
2019-05-21 11:28:28 -05:00
Magesh Nandakumar 7d70133b75 KAFKA-8265: Fix config name to match KIP-458. (#6755)
Return a copy of the ConfigDef in Client Configs. Related to KIP-458.

Author: Magesh Nandakumar <magesh.n.kumar@gmail.com
Reviewer: Randall Hauch <rhauch@gmail.com>
2019-05-20 14:02:22 -05:00
Lee Dongjin b43f5446ac KAFKA-8316; Remove deprecated usage of Slf4jRequestLog, SslContextFactory (#6668)
* Remove deprecated class Slf4jRequestLog: use Slf4jRequestLogWriter, CustomRequestLog instread.

1. Remove '@SuppressWarnings("deprecation")' from RestServer#initializeResources, JsonRestServer#start.
2. Remove unused JsonRestServer#httpRequest.

* Fix deprecated class usage: SslContextFactory -> SslContextFactory.[Server, Client]

1. Split SSLUtils#createSslContextFactory into SSLUtils#create[Server, Client]SideSslContextFactory: each method instantiates SslContextFactory.[Server, Client], respectively.
2. SSLUtils#configureSslContextFactoryAuthentication is called from SSLUtils#createServerSideSslContextFactory only.
3. Update SSLUtilsTest following splittion; for client-side SSL Context Factory, SslContextFactory#get[Need, Want]ClientAuth is always false. (SSLUtilsTest#testCreateClientSideSslContextFactory)

Reviewers: Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>
2019-05-20 10:15:15 -07:00
Magesh Nandakumar 2e91a310d7 KAFKA-8265: Initial implementation for ConnectorClientConfigPolicy to enable overrides (KIP-458) (#6624)
Implementation to enable policy for Connector Client config overrides. This is
implemented per the KIP-458.

Reviewers: Randall Hauch <rhauch@gmail.com>
2019-05-17 01:37:32 -07:00
Konstantine Karantasis ce584a01ff KAFKA-5505: Incremental cooperative rebalancing in Connect (KIP-415) (#6363)
Added the incremental cooperative rebalancing in Connect to avoid global rebalances on all connectors and tasks with each new/changed/removed connector. This new protocol is backward compatible and will work with heterogeneous clusters that exist during a rolling upgrade, but once the clusters consist of new workers only some affected connectors and tasks will be rebalanced: connectors and tasks on existing nodes still in the cluster and not added/changed/removed will continue running while the affected connectors and tasks are rebalanced.

This commit attempted to minimize the changes to the existing V0 protocol logic, though that was not entirely possible.

This commit adds extensive unit and integration tests for both the old V0 protocol and the new v1 protocol. Soak testing has been performed multiple times to verify behavior while connectors and added, changed, and removed and while workers are added and removed from the cluster.

Author: Konstantine Karantasis <konstantine@confluent.io>
Reviewers: Randall Hauch <rhauch@gmail.com>, Ewen Cheslack-Postava <me@ewencp.org>, Robert Yokota <rayokota@gmail.com>, David Arthur <mumrah@gmail.com>, Ryanne Dolan <ryannedolan@gmail.com>
2019-05-16 22:46:03 -05:00
dan norwood 5a95c2e1cd Add '?expand' query param for additional info on '/connectors'. (#6658)
Per KIP-465, kept existing behavior of `/connectors` resource in the Connect's REST API, but added the ability to specify `?expand` query parameter to get list of connectors with status details on each connector. Added unit tests, and verified passing existing system tests (which use the older list form).

See https://cwiki.apache.org/confluence/display/KAFKA/KIP-465%3A+Add+Consolidated+Connector+Endpoint+to+Connect+REST+API.

Author: Dan Norwood <norwood@confluent.io>
Reviewer: Randall Hauch <rhauch@gmail.com>
2019-05-16 16:29:29 -05:00
Randall Hauch b395ef4182 KAFKA-3816: Add MDC logging to Connect runtime (#5743)
See https://cwiki.apache.org/confluence/display/KAFKA/KIP-449%3A+Add+connector+contexts+to+Connect+worker+logs

Added LoggingContext as a simple mechanism to set and unset Mapped Diagnostic Contexts (MDC) in the loggers to provide for each thread useful parameters that can be used within the logging configuration. MDC avoids having to modify lots of log statements, since the parameters are available to all log statements issued by the thread, no matter what class makes those calls.

The design intentionally minimizes the number of changes to any existing classes, and does not use Java 8 features so it can be easily backported if desired, although per this KIP it will be applied initially only in AK 2.3 and later and must be enabled via the Log4J configuration.

Reviewers: Jason Gustafson <jason@conflent.io>, Guozhang Wang <wangguoz@gmail.com>
2019-05-16 12:35:01 +01:00
Konstantine Karantasis 2327b35558 MINOR: Enable console logs in Connect tests (#6745)
Author: Konstantine Karantasis <konstantine@confluent.io>
Reviewer: Randall Hauch <rhauch@gmail.com>
2019-05-15 22:05:52 -05:00
Magesh Nandakumar 5928ffd0dc KAFKA-8320 : fix retriable exception package for source connectors (#6675)
WorkerSourceTask is catching the exception from wrong package org.apache.kafka.common.errors. It is not clear from the API standpoint as to which package the connect framework supports - the one from common or connect. The safest thing would be to support both the packages even though it's less desirable.

Author: Magesh Nandakumar <magesh.n.kumar@gmail.com>
Reviewers: Arjun Satish <arjun@confluent.io>, Randall Hauch <rhauch@gmail.com>
2019-05-15 17:20:20 -05:00
Boyang Chen 2208f9966d KAFKA-8354; Replace Sync group request/response with automated protocol (#6729)
Update SyncGroup API to use the generated protocol classes.

Reviewers: Jason Gustafson <jason@confluent.io>
2019-05-15 15:09:00 -07:00
Paul Davidson fbb09952ac KAFKA-5061 - Make default Worker Task client IDs distinct (#6097)
Use the task ID to make the default client IDs used by Worker Tasks distinct and stable. This is avoids name conflicts on JMX MBeans and enables useful monitoring.

This implements https://cwiki.apache.org/confluence/display/KAFKA/KIP-411%3A+Make+default+Kafka+Connect+worker+task+client+IDs+distinct.

See: https://issues.apache.org/jira/browse/KAFKA-5061

Author: Paul Davidson <>
Reviewer: Cyrus Vafadari <cyrusv@alum.mit.edu>, Arjun Satish <arjun@confluent.io>, Randall Hauch <rhauch@gmail.com>
2019-05-15 17:02:35 -05:00
Arabelle Hou 4c85171a1f KAFKA-7633: Allow Kafka Connect to access internal topics without cluster ACLs (#5918)
When Kafka Connect does not have cluster ACLs to create topics,
it fails to even access its internal topics which already exist.
This was originally fixed in KAFKA-6250 by ignoring the cluster
authorization error, but now Kafka 2.0 returns a different response
code that corresponds to a different error. Add a patch to ignore this
new error as well.

Reviewers: Jason Gustafson <jason@confluent.io>
2019-05-11 11:05:37 -07:00
Chris Egerton 7a4618a793 MINOR: Remove header and key/value converter config value logging (#6660)
The debug log lines in the `Plugins` class that log header and key/value converter configurations should be altered as the configurations for these converters may contain secrets that should not be logged in plaintext. Instead, only the keys for these configs are safe to expose.

Author: Chris Egerton <cegerton@oberlin.edu>
Reviewer: Randall Hauch <rhauch@gmail.com>
2019-05-09 20:35:08 -05:00
Chris Egerton 36a5aba4ec KAFKA-8231: Expansion of ConnectClusterState interface (#6584)
Expand ConnectClusterState interface and implementation with methods that provide the immutable cluster details and the connector configuration. This includes unit tests for the new methods.

Author: Chris Egerton <cegerton@oberlin.edu>
Reviews: Arjun Satish <arjun@confluent.io>, Konstantine Karantasis <konstantine@confluent.io>, Randall Hauch <rhauch@gmail.com>
2019-05-09 20:27:59 -05:00
Bob Barrett a97e55b838 KAFKA-8332: Refactor ImplicitLinkedHashSet to avoid losing ordering when converting to Scala
Because of how conversions between Java collections and Scala collections work, ImplicitLinkedHashMultiSet objects were being treated as unordered in some contexts where they shouldn't be.  This broke JOIN_GROUP handling.  

This patch renames ImplicitLinkedHashMultiSet to ImplicitLinkedHashMultCollection.  The order of Collection objects will be preserved when converting to scala.  Adding Set and List "views" to the Collection gives us a more elegant way of accessing that functionality when needed.

Reviewers: Colin P. McCabe <cmccabe@apache.org>
2019-05-09 11:08:22 -07:00
Ismael Juma c09e25fac2
MINOR: Fix bug in Struct.equals and use Objects.equals/Long.hashCode (#6680)
* Fixed bug in Struct.equals where we returned prematurely and added tests
* Update RequestResponseTest to check that `equals` and `hashCode` of
the struct is the same after serialization/deserialization only when possible.
* Use `Objects.equals` and `Long.hashCode` to simplify code
* Removed deprecated usages of `JUnitTestSuite`

Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
2019-05-09 06:21:24 -07:00
Boyang Chen b0e82a68b3 KAFKA-8284: enable static membership on KStream (#6673)
Part of KIP-345 effort. The strategy is to extract user passed in group.instance.id config and pass it in with given thread-id (because consumer is currently per-thread level).

Reviewers: Guozhang Wang <wangguoz@gmail.com>
2019-05-07 17:43:13 -07:00
Chris Egerton cc097e909c KAFKA-8304: Fix registration of Connect REST extensions (#6651)
Fix registration of Connect REST extensions to prevent deadlocks when extensions get the list of connectors before the herder is available. Added integration test to check the behavior.

Author: Chris Egerton <cegerton@oberlin.edu>
Reviewers: Arjun Satish <arjun@confluent.io>, Randall Hauch <rhauch@gmail.com>
2019-05-07 17:20:51 -05:00
Mickael Maison 407bcdf78e KAFKA-8056; Use automatic RPC generation for FindCoordinator (#6408)
Reviewers: Jason Gustafson <jason@confluent.io>
2019-05-06 14:26:22 -07:00
Ismael Juma a37282415e
MINOR: Upgrade dependencies for Kafka 2.3 (#6665)
Many patch and minor updates.

Scalatest and Jetty deprecated classes that we
use. I removed usages for the former and filed KAFKA-8316 for the latter (I
suppressed the relevant deprecation warnings until the JIRA is fixed). As
part of the scalatest fixes, I also removed `TestUtils.fail` since it duplicates
`Assertions.fail`.

I also fixed a few compiler warnings that have crept in since my last sweep.

Updates of note:
- Jetty: 9.4.14 -> 9.4.18
  * https://github.com/eclipse/jetty.project/releases/tag/jetty-9.4.15.v20190215
  * https://github.com/eclipse/jetty.project/releases/tag/jetty-9.4.16.v20190411
  * https://github.com/eclipse/jetty.project/releases/tag/jetty-9.4.17.v20190418
  * https://github.com/eclipse/jetty.project/releases/tag/jetty-9.4.17.v20190418
  * https://github.com/eclipse/jetty.project/releases/tag/jetty-9.4.18.v20190429
- zstd: 1.3.8-1 -> 1.4.0-1
  * https://github.com/facebook/zstd/releases/tag/v1.4.0
  * zstd's fastest strategy, 6-8% faster in most scenarios
- zookeeper: 3.4.13 -> 3.4.14
  * https://zookeeper.apache.org/doc/r3.4.14/releasenotes.html

### Committer Checklist (excluded from commit message)
- [ ] Verify design and implementation 
- [ ] Verify test coverage and CI build status
- [ ] Verify documentation (including upgrade notes)

Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
2019-05-03 10:35:07 -07:00
Zhanxiang (Patrick) Huang 6ca899e56d KAFKA-8066; Always close the sensors in Selector.close() (#6402)
When shutting down the ReplicaFetcher thread, we may fail to unregister sensors in selector.close(). When that happened, we will fail to start up the ReplicaFetcherThread with the same fetch id again because of the IllegalArgumentException in sensor registration. This issue will cause constant URPs in the cluster because the ReplicaFetchterThread is gone.

This patch addresses this issue by introducing a try-finally block in selector.close() so that we will always unregister the sensors in shutting down ReplicaFetcherThreads.

Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>, Jason Gustafson <jason@confluent.io>
2019-05-01 12:40:48 -07:00
Stanislav Kozlovski 191f2faae0 KAFKA-7992: Introduce start-time-ms metric (#6318)
Reviewers: Colin P. McCabe <cmccabe@apache.org>, Ismael Juma <ismael@juma.me.uk>
2019-05-01 08:58:02 -07:00
Boyang Chen 0f995ba6be KAFKA-7862 & KIP-345 part-one: Add static membership logic to JoinGroup protocol (#6177)
This is the first diff for the implementation of JoinGroup logic for static membership. The goal of this diff contains:

* Add group.instance.id to be unique identifier for consumer instances, provided by end user;
Modify group coordinator to accept JoinGroupRequest with/without static membership, refactor the logic for readability and code reusability.
* Add client side support for incorporating static membership changes, including new config for group.instance.id, apply stream thread client id by default, and new join group exception handling.
* Increase max session timeout to 30 min for more user flexibility if they are inclined to tolerate partial unavailability than burdening rebalance.
* Unit tests for each module changes, especially on the group coordinator logic. Crossing the possibilities like:
6.1 Dynamic/Static member
6.2 Known/Unknown member id
6.3 Group stable/unstable
6.4 Leader/Follower

The rest of the 345 change will be broken down to 4 separate diffs:

* Avoid kicking out members through rebalance.timeout, only do the kick out through session timeout.
* Changes around LeaveGroup logic, including version bumping, broker logic, client logic, etc.
* Admin client changes to add ability to batch remove static members
* Deprecate group.initial.rebalance.delay

Reviewers: Liquan Pei <liquanpei@gmail.com>, Stanislav Kozlovski <familyguyuser192@windowslive.com>, Jason Gustafson <jason@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
2019-04-26 11:44:38 -07:00
Randall Hauch 9ff490509f MINOR: Correct RestServerTest formatting 2019-04-22 22:47:30 -05:00
Cyrus Vafadari d801c4fc69 MINOR: Add support for Standalone Connect configs in Rest Server extensions (#6622)
Add support for Standalone Connect configs in Rest Server extensions

A bug was introduced in 7a42750d that was caught in system tests:
The rest extensions fail if a Standalone worker config is passed,
since it does not have a definition for rebalance timeout.
A new method was introduced on WorkerConfig that by default returns
null for the rebalance timeout, and DistributedConfig overloads this
to return the configured value.

Author: Cyrus Vafadari <cyrus@confluent.io>
Reviewers: Arjun Satish <arjunconfluent.io>, Randall Hauch <rhauch@gmail.com>
2019-04-22 22:29:12 -05:00
Sebastián Ortega 3000fda8b9 KAFKA-8277: Fix NPEs in several methods of ConnectHeaders (#6550)
Replace `headers.isEmpty()` by calls to `isEmpty()` as the latter does a null check on heathers (that is lazily created).

Author: Sebastián Ortega <sebastian.ortega@letgo.com>
Reviewers: Konstantine Karantasis <konstantine@confluent.io>, Arjun Satish <arjunconfluent.io>, Randall Hauch <rhauch@gmail.com>
2019-04-22 14:19:58 -07:00
Chris Egerton 71e721f135 KAFKA-8058: Fix ConnectClusterStateImpl.connectors() method (#6384)
Fixed the ConnectClusterStateImpl.connectors() method and throw an exception on timeout. Added unit test.

Author: Chris Egerton <chrise@confluent.io>

Reviewers: Magesh Nandakumar <magesh.n.kumar@gmail.com>, Robert Yokota <rayokota@gmail.com>, Arjun Satish <wicknicks@users.noreply.github.com>, Konstantine Karantasis <konstantine@confluent.io>, Randall Hauch <rhauch@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #6384 from C0urante:kafka-8058
2019-04-07 09:43:09 -05:00
Mickael Maison 825fa3fa09 MINOR: Fixed a few warning in core and connects (#6545)
- var -> val
- unused imports
- Javadoc fix

Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
2019-04-05 23:30:48 +05:30
Doroszlai, Attila d3316bc6a7 KAFKA-8126: Flaky Test org.apache.kafka.connect.runtime.WorkerTest.testAddRemoveTask (#6475)
Changed the WorkerTest to use a mock Executor.

Author: Attila Doroszlai <adoroszlai@apache.org>
Reviewer: Randall Hauch <rhauch@gmail.com>
2019-04-03 15:00:05 -05:00
Konstantine Karantasis e4cad35312 KAFKA-8014: Extend Connect integration tests to add and remove workers dynamically (#6342)
Extend Connect's integration test framework to add or remove workers to EmbeddedConnectCluster, and choosing whether to fail the test on ungraceful service shutdown. Also added more JavaDoc and other minor improvements. 

Author: Konstantine Karantasis <konstantine@confluent.io>

Reviewers: Arjun Satish <arjun@confluent.io>, Randall Hauch <rhauch@gmail.com>

Closes #6342 from kkonstantine/KAFKA-8014
2019-03-25 09:29:33 -05:00
Boyang Chen 8406f3624d KAFKA-7858: Automatically generate JoinGroup request/response
Reviewers: Colin P. McCabe <cmccabe@apache.org>
2019-03-18 13:26:09 -07:00
Jason Gustafson 460e46c3bb
KAFKA-7831; Do not modify subscription state from background thread (#6221)
Metadata may be updated from the background thread, so we need to protect access to SubscriptionState. This patch restructures the metadata handling so that we only check pattern subscriptions in the foreground. Additionally, it improves the following:

1. SubscriptionState is now the source of truth for the topics that will be fetched. We had a lot of messy logic previously to try and keep the the topic set in Metadata consistent with the subscription, so this simplifies the logic.
2. The metadata needs for the producer and consumer are quite different, so it made sense to separate the custom logic into separate extensions of Metadata. For example, only the producer requires topic expiration.
3. We've always had an edge case in which a metadata change with an inflight request may cause us to effectively miss an expected update. This patch implements a separate version inside Metadata which is bumped when the needed topics changes.
4. This patch removes the MetadataListener, which was the cause of https://issues.apache.org/jira/browse/KAFKA-7764. 

Reviewers: David Arthur <mumrah@gmail.com>, Rajini Sivaram <rajinisivaram@googlemail.com>
2019-03-07 16:29:19 -08:00
Lysss 20f701cc50 KAFKA-7880:Naming worker thread by task id (#6275)
KAFKA-7880: Name worker thread to include task id

Change Connect's `WorkerTask` to name the thread using the `task-thread-<connectorTaskId>` pattern.

Reviewers: Randall Hauch <rhauch@gmail.com>
2019-03-04 18:04:34 -06:00
lzh3636 d930e5d3d8 improve some logging statements (#6078)
Reviewers: Matthias J. Sax <matthias@confluent.io>, Randall Hauch <rhauch@gmail.com>
2019-03-04 17:57:45 -06:00
Arjun Satish 2627a1be2c MINOR: Increase produce timeout to 120 seconds (#6326)
MINOR: Increase produce timeout for EmbeddedKafkaCluster to 120 seconds

Previous value was 500ms. This change gives more room to pass tests on systems with low resources running many parallel tests.

Reviewers: Randall Hauch <randall@confluent.io>
2019-02-25 23:26:59 -06:00
Chia-Ping Tsai 35a0de32ee KAFKA-6161 Add default implementation to close() and configure() for Serdes (#5348)
Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
2019-02-21 09:05:13 -08:00
Alex Diachenko ec42e0378e KAFKA-7799; Use httpcomponents-client in RestServerTest.
The test `org.apache.kafka.connect.runtime.rest.RestServerTest#testCORSEnabled` assumes Jersey client can send restricted HTTP headers(`Origin`).

Jersey client uses `sun.net.www.protocol.http.HttpURLConnection`.
`sun.net.www.protocol.http.HttpURLConnection` drops restricted headers(`Host`, `Keep-Alive`, `Origin`, etc) based on static property `allowRestrictedHeaders`.
This property is initialized in a static block by reading Java system property `sun.net.http.allowRestrictedHeaders`.

So, if classloader loads `HttpURLConnection` before we set `sun.net.http.allowRestrictedHeaders=true`, then all subsequent changes of this system property won't take any effect(which happens if `org.apache.kafka.connect.integration.ExampleConnectIntegrationTest` is executed before `RestServerTest`).
To prevent this, we have to either make sure we set `sun.net.http.allowRestrictedHeaders=true` as early as possible or do not rely on this system property at all.

This PR adds test dependency on `httpcomponents-client` which doesn't depend on `sun.net.http.allowRestrictedHeaders` system property. Thus none of existing tests should interfere with `RestServerTest`.

Author: Alex Diachenko <sansanichfb@gmail.com>

Reviewers: Randall Hauch, Konstantine Karantasis, Gwen Shapira

Closes #6236 from avocader/KAFKA-7799
2019-02-12 12:03:08 -08:00
Ismael Juma c7f99bc2bd
MINOR: Update JUnit to 4.13 and annotate log cleaner integration test (#6248)
JUnit 4.13 fixes the issue where `Category` and `Parameterized` annotations
could not be used together. It also deprecates `ExpectedException` and
`assertThat`. Given this, we:

- Replace `ExpectedException` with the newly introduced `assertThrows`.
- Replace `Assert.assertThat` with `MatcherAssert.assertThat`.
- Annotate `AbstractLogCleanerIntegrationTest` with `IntegrationTest` category.

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>, David Arthur <mumrah@gmail.com>
2019-02-11 22:06:14 -08:00
Colin Patrick McCabe e2e8bdbd8c
KAFKA-7832: Use automatic RPC generation in CreateTopics (#5972)
Reviewers: Jun Rao <junrao@gmail.com>, Tom Bentley <tbentley@redhat.com>, Boyang Chen <bchen11@outlook.com>
2019-02-04 10:39:43 -08:00
Randall Hauch 176ea0d0f9 KAFKA-7873; Always seek to beginning in KafkaBasedLog (#6203)
Explicitly seek KafkaBasedLog’s consumer to the beginning of the topic partitions, rather than potentially use committed offsets (which would be unexpected) if group.id is set or rely upon `auto.offset.reset=earliest` if the group.id is null.

This should not change existing behavior but should remove some potential issues introduced with KIP-287 if `group.id` is not set in the consumer configurations. Note that even if `group.id` is set, we still always want to consume from the beginning.

Reviewers: Jason Gustafson <jason@confluent.io>
2019-02-01 14:17:52 -08:00
Chris Egerton 743607af5a KAFKA-5117: Stop resolving externalized configs in Connect REST API
[KIP-297](https://cwiki.apache.org/confluence/display/KAFKA/KIP-297%3A+Externalizing+Secrets+for+Connect+Configurations#KIP-297:ExternalizingSecretsforConnectConfigurations-PublicInterfaces) introduced the `ConfigProvider` mechanism, which was primarily intended for externalizing secrets provided in connector configurations. However, when querying the Connect REST API for the configuration of a connector or its tasks, those secrets are still exposed. The changes here prevent the Connect REST API from ever exposing resolved configurations in order to address that. rhauch has given a more thorough writeup of the thinking behind this in [KAFKA-5117](https://issues.apache.org/jira/browse/KAFKA-5117)

Tested and verified manually. If these changes are approved unit tests can be added to prevent a regression.

Author: Chris Egerton <chrise@confluent.io>

Reviewers: Robert Yokota <rayokota@gmail.com>, Randall Hauch <rhauch@gmail.com, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #6129 from C0urante/hide-provided-connect-configs
2019-01-23 11:00:23 -08:00
Arjun Satish dc935c4beb MINOR: Handle case where connector status endpoints returns 404 (#6176)
Reviewers: Randall Hauch <randall@confluent.io>, Matthias J. Sax <matthias@confluent.io>
2019-01-20 19:31:20 -08:00
Magesh Nandakumar dec68c9350 MINOR: Start Connect REST server in standalone mode to match distributed mode (KAFKA-7503 follow-up)
Start the Rest server in the standalone mode similar to how it's done for distributed mode.

Author: Magesh Nandakumar <magesh.n.kumar@gmail.com>

Reviewers: Arjun Satish <arjun@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #6148 from mageshn/KAFKA-7826
2019-01-16 22:58:30 -08:00
Chia-Ping Tsai af634a4a98 KAFKA-7391; Introduce close(Duration) to Producer and AdminClient instead of close(long, TimeUnit) (#5667)
See KIP-367: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=89070496.

Reviewers: Viktor Somogyi <viktorsomogyi@gmail.com>, Manikumar Reddy <manikumar.reddy@gmail.com>, Jason Gustafson <jason@confluent.io>
2019-01-15 08:48:32 -08:00
Andrew Schofield aca52b6d2c KAFKA-7461: Add tests for logical types
Added testing of logical types for Kafka Connect in support of KIP-145 features.
Added tests for Boolean, Time, Date and Timestamp, including the valid conversions.

The area of ISO8601 strings is a bit of a mess because the tokenizer is not compatible with
that format, and a subsequent JIRA will be needed to fix that.

A few small fixes as well as creating test cases, but they're clearly just corrections such as
using 0 to mean January (java.util.Calendar uses zero-based month numbers).

Author: Andrew Schofield <andrew_schofield@uk.ibm.com>

Reviewers: Mickael Maison <mimaison@users.noreply.github.com>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #6077 from AndrewJSchofield/KAFKA-7461-ConverterValuesLogicalTypesTest
2019-01-14 15:41:23 -08:00
Arjun Satish 69d8d2ea11 KAFKA-7503: Connect integration test harness
Expose a programmatic way to bring up a Kafka and Zk cluster through Java API to facilitate integration tests for framework level changes in Kafka Connect. The Kafka classes would be similar to KafkaEmbedded in streams. The new classes would reuse the kafka.server.KafkaServer classes from :core, and provide a simple interface to bring up brokers in integration tests.

Signed-off-by: Arjun Satish <arjunconfluent.io>

Author: Arjun Satish <arjun@confluent.io>
Author: Arjun Satish <wicknicks@users.noreply.github.com>

Reviewers: Randall Hauch <rhauch@gmail.com>, Konstantine Karantasis <konstantine@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #5516 from wicknicks/connect-integration-test
2019-01-14 13:50:23 -08:00
Jason Gustafson 73a9fcaa4d
KAFKA-7799; Fix flaky test RestServerTest.testCORSEnabled (#6106)
The test always fails if testOptionsDoesNotIncludeWadlOutput is executed before testCORSEnabled. It seems the problem is the use of the system property. Perhaps there is some static caching somewhere.

Reviewers: Randall Hauch <rhauch@gmail.com>, Guozhang Wang <wangguoz@gmail.com>
2019-01-08 23:47:28 -08:00
Chia-Ping Tsai 9601315420 KAFKA-7253; The returned connector type is always null when creating connector (#5470)
The null map returned from the current snapshot causes the null type in response. The connector class name can be taken from the config of request instead since we require the config should contain the connector class name.

Reviewers: Jason Gustafson <jason@confluent.io>
2019-01-08 08:52:42 -08:00
Chia-Ping Tsai b6d1450012 MINOR: Show the specified value in logging the deprecation warnings of internal converter (#5212)
Reviewers: Jason Gustafson <jason@confluent.io>
2019-01-05 02:56:15 -08:00
Renato Mefi 964e2c57c2 KAFKA-3832; Kafka Connect's JSON Converter never outputs a null value (#6027)
When using the Connect `JsonConverter`, it's impossible to produce tombstone messages, thus impacting the compaction of the topic. This patch allows the converter with and without schemas to output a NULL byte value in order to have a proper tombstone message. When it's regarding to get this data into a connect record, the approach is the same as when the payload looks like `"{ "schema": null, "payload": null }"`, this way the sink connectors can maintain their functionality and reduces the BCC.

Reviewers: Gunnar Morling <gunnar.morling@googlemail.com>, Randall Hauch <rhauch@gmail.com>, Jason Gustafson <jason@confluent.io>
2018-12-28 09:39:52 -08:00
Alex Diachenko ee370d3893 KAFKA-7759; Disable WADL output in the Connect REST API (#6051)
This patch disables support for WADL output in the Connect REST API since it was never intended to be exposed. 

Reviewers: Randall Hauch <rhauch@gmail.com>, Jason Gustafson <jason@confluent.io>
2018-12-20 14:24:05 -08:00
Cyrus Vafadari 9f954ac614 MINOR: Safe string conversion to avoid NPEs
Should be ported back to 2.0

Author: Cyrus Vafadari <cyrus@confluent.io>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #6004 from cyrusv/cyrus-npe
2018-12-05 13:23:52 -08:00
Magesh Nandakumar ace4dd0056 KAFKA-7551: Refactor to create producer & consumer in the worker
This is minor refactoring that brings in the creation of producer and consumer to the Worker. Currently, the consumer is created in the WorkerSinkTask. This should not affect any functionality and it just makes the code structure easier to understand.

Author: Magesh Nandakumar <magesh.n.kumar@gmail.com>

Reviewers: Ryanne Dolan <ryannedolan@gmail.com>, Randall Hauch <rhauch@gmail.com>, Robert Yokota <rayokota@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #5842 from mageshn/KAFKA-7551
2018-11-29 23:38:50 -08:00
Cyrus Vafadari 4712a36416 MINOR: Add logging to Connect SMTs
Includes Update to ConnectRecord string representation to give
visibility into schemas, useful in SMT tracing

Author: Cyrus Vafadari <cyrus@confluent.io>

Reviewers: Randall Hauch <rhauch@gmail.com>, Konstantine Karantasis <konstantine@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #5860 from cyrusv/cyrus-logging
2018-11-29 22:29:50 -08:00
Robert Yokota a2e87feb8b KAFKA-7620: Fix restart logic for TTLs in WorkerConfigTransformer
The restart logic for TTLs in `WorkerConfigTransformer` was broken when trying to make it toggle-able.   Accessing the toggle through the `Herder` causes the same code to be called recursively.  This fix just accesses the toggle by simply looking in the properties map that is passed to `WorkerConfigTransformer`.

Author: Robert Yokota <rayokota@gmail.com>

Reviewers: Magesh Nandakumar <magesh.n.kumar@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #5914 from rayokota/KAFKA-7620
2018-11-27 22:01:21 -08:00
Mickael Maison 4e90af34c6 MINOR: Various javadoc improvement in clients and connect (#5878)
Fixed formatting issues and added links in a few classes
2018-11-23 10:50:30 +05:30
Srinivas Reddy ad26914de6 KAFKA-7418: Add the missing '--help' option to Kafka commands (KIP-374)
Changes made as part of this [KIP-374](https://cwiki.apache.org/confluence/x/FgSQBQ) and [KAFKA-7418](https://issues.apache.org/jira/browse/KAFKA-7418)
 - Checking for empty args or help option in command file to print Usage
 - Added new class to enforce help option to all commands
 - Refactored few lines (ex `PreferredReplicaLeaderElectionCommand`) to
   make use of `CommandDefaultOptions` attributes.
 - Made the changes in help text wordings

Run the unit tests in local(Windows) few Linux friendly tests are failing but
not any functionality, verified `--help` and no option response by running
Scala classes, since those all are having `main` method.

Author: Srinivas Reddy <srinivas96alluri@gmail.com>
Author: Srinivas Reddy <mrsrinivas@users.noreply.github.com>
Author: Srinivas <srinivas96alluri@gmail.com>

Reviewers: Colin Patrick McCabe <colin@cmccabe.xyz>, Manikumar Reddy <manikumar.reddy@gmail.com>, Guozhang Wang <wangguoz@gmail.com>, Mickael Maison <mickael.maison@gmail.com>

Closes #5910 from mrsrinivas/KIP-374
2018-11-22 17:12:34 +05:30
Stanislav Kozlovski 068ab9cefa KAFKA-7528: Standardize on Min/Avg/Max Kafka metrics' default value - NaN (#5908)
While metrics like Min, Avg and Max make sense to respective use Double.MAX_VALUE, 0.0 and Double.MIN_VALUE as default values to ease computation logic, exposing those values makes reading them a bit misleading. For instance, how would you differentiate whether your -avg metric has a value of 0 because it was given samples of 0 or no samples were fed to it?

It makes sense to standardize on the output of these metrics with something that clearly denotes that no values have been recorded.

Reviewers: Bill Bejeck <bill@confluent.io>, John Roesler <john@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
2018-11-20 15:54:24 -08:00
Yishun Guan 9646602d68 KAFKA-7402: Implement KIP-376 AutoCloseable additions 2018-11-16 15:58:47 -08:00
Benedict Jin b21e933592 * MINOR: Catching null pointer exception for empty leader URL when assignment is null (#4798)
Catch null pointer exception for empty leader URL when assignment is null.

Reviewers: Randall Hauch <rhauch@gmail.com>, Konstantine Karantasis <konstantine@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
2018-11-16 15:28:47 -08:00
Randall Hauch 545d40d83c MINOR: Avoid logging connector configuration in Connect framework (#5868)
Some connector configs may be sensitive, so we should avoid logging them.

Reviewers: Alex Diachenko, Dustin Cote <dustin@confluent.io>, Jason Gustafson <jason@confluent.io>
2018-11-13 13:30:34 -08:00
Ismael Juma 12f310d50e
KAFKA-7612: Fix javac warnings and enable warnings as errors (#5900)
- Use Xlint:all with 3 exclusions (filed KAFKA-7613 to remove the exclusions)
- Use the same javac options when compiling tests (seems accidental that
we didn't do this before)
- Replaced several deprecated method calls with non-deprecated ones:
  - `KafkaConsumer.poll(long)` and `KafkaConsumer.close(long)`
  - `Class.newInstance` and `new Integer/Long` (deprecated since Java 9)
  - `scala.Console` (deprecated in Scala 2.11)
  - `PartitionData` taking a timestamp (one of them seemingly a bug)
  - `JsonMappingException` single parameter constructor
- Fix unnecessary usage of raw types in several places.
- Add @SuppressWarnings for deprecations, unchecked and switch fallthrough in
several places.
- Scala clean-ups (var -> val, ETA expansion warnings, avoid reflective calls)
- Use lambdas to simplify code in a few places
- Add @SafeVarargs, fix varargs usage and remove unnecessary `Utils.mkList` method

Reviewers: Matthias J. Sax <mjsax@apache.org>, Manikumar Reddy <manikumar.reddy@gmail.com>, Randall Hauch <rhauch@gmail.com>, Bill Bejeck <bill@confluent.io>, Stanislav Kozlovski <stanislav_kozlovski@outlook.com>
2018-11-12 22:18:59 -08:00
Srinivas Reddy 8abbf33b59 KAFKA-7431: Clean up connect unit tests
[KAFKA-7431](https://issues.apache.org/jira/browse/KAFKA-7431)

Changes made to improve the code readability:
 - Removed `throws Exception` from the place where there won't be an
 exception
 - Removed type arguments where those can be inferred explicitly by compiler
 - Rewritten Anonymous classes to Java 8 with lambdas

Author: Srinivas Reddy <mrsrinivas@users.noreply.github.com>
Author: Srinivas Reddy <srinivas96alluri@gmail.com>

Reviewers: Randall Hauch <rhauch@gmail.com>, Ismael Juma <ismael@juma.me.uk>, Ryanne Dolan <ryannedolan@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #5681 from mrsrinivas/cleanup-connect-uts
2018-11-07 08:23:19 -08:00
Jason Gustafson d71cb54672
KAFKA-7567; Clean up internal metadata usage for consistency and extensibility (#5813)
This patch makes two improvements to internal metadata handling logic and testing:

1. It reduce dependence on the public object `Cluster` for internal metadata propagation since it is not easy to evolve. As an example, we need to propagate leader epochs from the metadata response to `Metadata`, but it is not straightforward to do this without exposing it in `PartitionInfo` since that is what `Cluster` uses internally. By doing this change, we are able to remove some redundant `Cluster` building logic. 
2. We want to make the metadata handling in `MockClient` simpler and more consistent. Currently we have mix of metadata update mechanisms which are internally inconsistent with each other and do not match the implementation in `NetworkClient`.

Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
2018-10-30 13:20:13 -07:00
Ron Dagostino e8a3bc7425 KAFKA-7352; Allow SASL Connections to Periodically Re-Authenticate (KIP-368) (#5582)
KIP-368 implementation to enable periodic re-authentication of SASL clients. Also adds a broker configuration option to terminate client connections that do not re-authenticate within the configured interval.
2018-10-26 23:18:15 +01:00
jonathanskrzypek a947fe8da8 KAFKA-6195: Resolve DNS aliases in bootstrap.server (KIP-235) (#4485)
Adds `client.dns.lookup=resolve_canonical_bootstrap_servers_only` option to perform full dns resolution of bootstrap addresses

Reviewers: Colin Patrick McCabe <colin@cmccabe.xyz>, Sriharsha Chintalapani <sriharsha@apache.org>, Edoardo Comar <ecomar@uk.ibm.com>, Mickael Maison <mickael.maison@gmail.com>, Manikumar Reddy <manikumar.reddy@gmail.com>, Rajini Sivaram <rajinisivaram@googlemail.com>
2018-10-13 21:39:35 +01:00
Edoardo Comar f393b2f7dd KAFKA-6863 Kafka clients should try to use multiple DNS resolved IP (#4987)
Implementation of KIP-302: Based on the new client configuration `client.dns.lookup`, a NetworkClient can use InetAddress.getAllByName to find all IPs and iterate over them when they fail to connect. Only uses either IPv4 or IPv6 addresses similar to the default mode.

Co-authored-by: Edoardo Comar <ecomar@uk.ibm.com>
Co-authored-by: Mickael Maison <mickael.maison@gmail.com>

Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
2018-10-11 18:14:17 +01:00
Ismael Juma adb3a950ee
MINOR: Fix remaining core, connect and clients tests to pass with Java 11 (#5771)
- SslFactoryTest should use SslFactory to create SSLEngine
- Use Mockito instead of EasyMock in `ConsoleConsumerTest` as one of
the tests mocks a standard library class and the latest released EasyMock
version can't do that when Java 11 is used.
- Avoid mocking `ConcurrentMap` in `SourceTaskOffsetCommitterTest`
for similar reasons. As it happens, mocking is not actually needed here.

Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
2018-10-10 13:31:06 -07:00
Konstantine Karantasis 6a03d3b42c KAFKA-6914; Set parent classloader of DelegatingClassLoader same as the worker's (#5720)
The parent classloader of the DelegatingClassLoader and therefore the classloading scheme used by Connect does not have to be fixed to the System classloader.

Setting it the same as the one that was used to load the DelegatingClassLoader class itself is more flexible and, while in most cases will result in the System classloader to be used, it will also work in othr managed environments that control classloading differently (OSGi, and others).

The fix is minimal and the mainstream use is tested via system tests.

Reviewers: Randall Hauch <rhauch@gmail.com>, Jason Gustafson <jason@confluent.io>
2018-10-05 12:46:13 -07:00
Robert Yokota 3edd8e7333 KAFKA-7476: Fix Date-based types in SchemaProjector
Various converters (AvroConverter and JsonConverter) produce a
SchemaAndValue consisting of a logical schema type and a java.util.Date.
This is a fix for SchemaProjector to properly handle the Date.

Author: Robert Yokota <rayokota@gmail.com>

Reviewers: Konstantine Karantasis <konstantine@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #5736 from rayokota/KAFKA-7476
2018-10-04 20:34:50 -07:00
Kevin Lu a5335e7cbd KAFKA-6123: Give client MetricsReporter auto-generated client.id (#5383) 2018-10-03 09:56:22 -07:00
Amit Sela fd44dc7fb2 KAFKA-6684: Support casting Connect values with bytes schema to string
Allow to cast LogicalType to string by calling the serialized (Java) object's toString().

Added tests for `BigDecimal` and `Date` as whole record and as fields.

Author: Amit Sela <amitsela33@gmail.com>

Reviewers: Randall Hauch <rhauch@gmail.com>, Robert Yokota <rayokota@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #4820 from amitsela/cast-transform-bytes
2018-09-30 22:24:09 -07:00
Amit Sela c1457be995 KAFKA-7460: Fix Connect Values converter date format pattern
Switches to normal year format instead of week date years and day of month instead of day of year.

This is directly from #4820, but separated into a different JIRA/PR to keep the fixes independent. Original authorship should be maintained in the commit.

Author: Amit Sela <amitsela33@gmail.com>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #5718 from ewencp/fix-header-converter-date-format
2018-09-30 19:51:59 -07:00
Michał Borowiecki 22f1724123 KAFKA-7434: Fix NPE in DeadLetterQueueReporter
*More detailed description of your change,
if necessary. The PR title and PR message become
the squashed commit message, so use a separate
comment to ping reviewers.*

*Summary of testing strategy (including rationale)
for the feature or bug fix. Unit and/or integration
tests are expected for any behaviour change and
system tests should be considered for larger changes.*

Author: Michał Borowiecki <mbor81@gmail.com>

Reviewers: Arjun Satish <arjun@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #5700 from mihbor/KAFKA-7434
2018-09-29 10:19:10 -07:00
Randall Hauch f0282498e7 KAFKA-6926: Simplified some logic to eliminate some suppressions of NPath complexity checks (#5051)
Modified several classes' `equals` methods and simplified a complex method to
reduce the NPath complexity so they could be removed from the checkstyle
suppressions that were required with the recent move to Java 8 and upgrade
of Checkstyle: https://github.com/apache/kafka/pull/5046.

Reviewers: Robert Yokota <rayokota@gmail.com>, Arjun Satish <arjun@confluent.io>, Ismael Juma <ismael@juma.me.uk>
2018-09-12 21:21:46 -07:00
Ismael Juma f123d2f18c KAFKA-5887; Replace findBugs with spotBugs and upgrade to Gradle 4.10
findBugs is abandoned, it doesn't work with Java 9 and the Gradle plugin will be deprecated in
Gradle 5.0: https://github.com/gradle/gradle/pull/6664

spotBugs is actively maintained and it supports Java 8, 9 and 10. Java 11 is not supported yet,
but it's likely to happen soon.

Also fixed a file leak in Connect identified by spotbugs.

Manually tested spotBugsMain, jarAll and importing kafka in IntelliJ and running
a build in the IDE.

Author: Ismael Juma <ismael@juma.me.uk>

Reviewers: Colin Patrick McCabe <colin@cmccabe.xyz>, Dong Lin <lindong28@gmail.com>

Closes #5625 from ijuma/kafka-5887-spotbugs
2018-09-10 13:14:00 -07:00
Kevin Lafferty 847780e5a5 KAFKA-7353: Connect logs 'this' for anonymous inner classes
Replace 'this' reference in anonymous inner class logs to out class's 'this'

Author: Kevin Lafferty <kevin.lafferty@gmail.com>

Reviewers: Randall Hauch <rhauch@gmail.com>, Arjun Satish <arjun@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #5583 from kevin-laff/connect_logging
2018-09-05 20:15:25 -07:00
Robert Yokota fd5acd73e6 KAFKA-7242: Reverse xform configs before saving (KIP-297)
During actions such as a reconfiguration, the task configs are obtained
via `Worker.connectorTaskConfigs` and then subsequently saved into an
instance of `ClusterConfigState`.  The values of the properties that are saved
are post-transformation (of variable references) when they should be
pre-transformation.  This is to avoid secrets appearing in plaintext in
the `connect-configs` topic, for example.

The fix is to change the 2 clients of `Worker.connectorTaskConfigs` to
perform a reverse transformation (values converted back into variable
references) before saving them into an instance of `ClusterConfigState`.
The 2 places where the save is performed are
`DistributedHerder.reconfigureConnector` and
`StandaloneHerder.updateConnectorTasks`.

The way that the reverse transformation works is by using the
"raw" connector config (with variable references still intact) from
`ClusterConfigState` to convert config values back into variable
references for those keys that are common between the task config
and the connector config.

There are 2 additional small changes that only affect `StandaloneHerder`:

1) `ClusterConfigState.allTasksConfigs` has been changed to perform a
transformation (resolution) on all variable references.  This is
necessary because the result of this method is compared directly to
`Worker.connectorTaskConfigs`, which also has variable references
resolved.

2) `StandaloneHerder.startConnector` has been changed to match
`DistributedHerder.startConnector`.  This is to fix an issue where
during `StandaloneHerder.restartConnector`, the post-transformed
connector config would be saved back into `ClusterConfigState`.

I also performed an analysis of all other code paths where configs are
saved back into `ClusterConfigState` and did not find any other
issues.

Author: Robert Yokota <rayokota@gmail.com>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #5475 from rayokota/KAFKA-7242-reverse-xform-props
2018-08-28 12:59:08 -07:00
Robert Yokota d31762a0c2 MINOR: Additional testing of logical type handling in Cast transform
Reviewers: Randall Hauch <rhauch@gmail.com>, Jason Gustafson <jason@confluent.io>
2018-08-21 08:21:46 -07:00
Maciej Bryński 504824b7b3 KAFKA-5891; Proper handling of LogicalTypes in Cast (#4633)
Currently logical types are dropped during Cast Transformation.
This patch fixes this behaviour.

Reviewers: Randall Hauch <rhauch@gmail.com>, Jason Gustafson <jason@confluent.io>
2018-08-20 16:43:11 -07:00
Koen De Groote 9e0e29ac5c MINOR: Use statically compiled regular expressions for efficiency (#5168)
Reviewers: Andras Beni <andrasbeni@cloudera.com>, Sriharsha Chintalapani <sriharsha@apache.org>, Jason Gustafson <jason@confluent.io>
2018-08-13 17:31:29 -07:00
Viktor Somogyi 8a78d76466 KAFKA-7140; Remove deprecated poll usages (#5319)
Reviewers: Matthias J. Sax <mjsax@apache.org>, Jason Gustafson <jason@confluent.io>
2018-08-10 22:51:17 -07:00
Arjun Satish e876c921b0 MINOR: Add connector configs to site-docs
In AK's documentation, the config props for connectors are not listed (https://kafka.apache.org/documentation/#connectconfigs). This PR adds these sink and source connector configs to the html site-docs.

Signed-off-by: Arjun Satish <arjunconfluent.io>

Author: Arjun Satish <arjun@confluent.io>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #5469 from wicknicks/add-connector-configs-to-docs
2018-08-07 14:34:27 -07:00
Robert Yokota 36a8fec0ab KAFKA-7225: Pretransform validated props
If a property requires validation, it should be pretransformed if it is a variable reference, in order to have a value that will properly pass the validation.

Author: Robert Yokota <rayokota@gmail.com>

Reviewers: Randall Hauch <rhauch@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #5445 from rayokota/KAFKA-7225-pretransform-validated-props
2018-08-07 13:18:16 -07:00
Jason Gustafson fc5f6b0e46
MINOR: Add Timer to simplify timeout bookkeeping and use it in the consumer (#5087)
We currently do a lot of bookkeeping for timeouts which is both error-prone and distracting. This patch adds a new `Timer` class to simplify this logic and control unnecessary calls to system time. In particular, this helps with nested timeout operations. The consumer has been updated to use the new class.

Reviewers: Ismael Juma <ismael@juma.me.uk>, Guozhang Wang <wangguoz@gmail.com>
2018-08-03 17:25:07 -07:00
Arjun Satish 70d882861e KAFKA-7228: Set errorHandlingMetrics for dead letter queue
DLQ reporter does not get a `errorHandlingMetrics` object when created by the worker. This results in an NPE.

Signed-off-by: Arjun Satish <arjunconfluent.io>

*More detailed description of your change,
if necessary. The PR title and PR message become
the squashed commit message, so use a separate
comment to ping reviewers.*

*Summary of testing strategy (including rationale)
for the feature or bug fix. Unit and/or integration
tests are expected for any behaviour change and
system tests should be considered for larger changes.*

Author: Arjun Satish <arjun@confluent.io>

Reviewers: Konstantine Karantasis <konstantine@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #5440 from wicknicks/KAFKA-7228
2018-08-02 14:36:02 -07:00
Jason Gustafson c3e7c0bcb2
MINOR: Producers should set delivery timeout instead of retries (#5425)
Use delivery timeout instead of retries when possible and remove various TODOs associated with completion of KIP-91.

Reviewers: Ismael Juma <ismael@juma.me.uk>, Guozhang Wang <wangguoz@gmail.com>
2018-08-01 11:04:17 -07:00
Yu Yang 7fc7136ffd KAFKA-5886; Introduce delivery.timeout.ms producer config (KIP-91) (#5270)
Co-authored-by: Sumant Tambe <sutambe@yahoo.com>
Co-authored-by: Yu Yang <yuyang@pinterest.com>

Reviewers: Ted Yu <yuzhihong@gmail.com>, Apurva Mehta <apurva@confluent.io>, Jason Gustafson <jason@confluent.io>
2018-07-26 09:13:50 -07:00
Konstantine Karantasis 5d2bf6328e MINOR: FileStreamSinkTask should create file if it doesn't exist (#5406)
A recent change from `new FileOutputStream` to `Files.newOutputStream` missed the `CREATE` flag (which is necessary in addition to `APPEND`).

Reviewers: Ismael Juma <ismael@juma.me.uk>
2018-07-20 21:01:10 -07:00
Ismael Juma 7a74ec62d2
MINOR: Avoid FileInputStream/FileOutputStream (#5281)
They rely on finalizers (before Java 11), which create
unnecessary GC load. The alternatives are as easy to
use and don't have this issue.

Also use FileChannel directly instead of retrieving
it from RandomAccessFile whenever possible
since the indirection is unnecessary.

Finally, add a few try/finally blocks.

Reviewers: Colin Patrick McCabe <colin@cmccabe.xyz>, Rajini Sivaram <rajinisivaram@googlemail.com>
2018-06-27 01:00:05 -07:00
Chia-Ping Tsai 6810617179 KAFKA-7048 NPE when creating connector (#5202)
Reviewers: Robert Yokota <rayokota@gmail.com>, Konstantine Karantasis <konstantine@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
2018-06-17 16:49:01 -07:00
Gunnar Morling be846d833c KAFKA-7058: Comparing schema default values using Objects#deepEquals()
https://issues.apache.org/jira/browse/KAFKA-7058
* Summary of testing strategy: Added new unit test

Author: Gunnar Morling <gunnar.morling@googlemail.com>

Reviewers: Randall Hauch <rhauch@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #5225 from gunnarmorling/KAFKA-7058
2018-06-16 23:04:31 -07:00
Randall Hauch f0282cb3de KAFKA-7047: Added SimpleHeaderConverter to plugin isolation whitelist
This was originally missed when headers were added as part of KIP-145 in AK 1.1. An additional unit test was added in line with the StringConverter.

This should be backported to the AK `1.1` branch so that it is included in the next bugfix release. The `SimpleHeaderConverter` class that we're referencing was first added in the 1.1.0 release, so there's no reason to backport earlier.

Author: Randall Hauch <rhauch@gmail.com>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #5204 from rhauch/kafka-7047
2018-06-16 22:23:20 -07:00
Magesh Nandakumar 239dd0fb9b KAFKA-7039: Create an instance of the plugin only it's a Versioned Plugin
Create an instance of the plugin only it's a Versioned Plugin. Prior to KIP-285, this was done for only for Connector and this PR will continue to have the same behavior.

Author: Magesh Nandakumar <magesh.n.kumar@gmail.com>

Reviewers: Randall Hauch <rhauch@gmail.com>, Konstantine Karantasis <konstantine@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #5191 from mageshn/KAFKA-7039
2018-06-16 22:18:20 -07:00
Randall Hauch fab8b7e676 KAFKA-7056: Moved Connect’s new numeric converters to runtime (KIP-305)
KIP-305 added numeric converters to Connect, but these were added in Connect’s API module in the same package as the `StringConverter`. This commit moves them into the Runtime module and into the `converters` package where the `ByteArrayConverter` already lives. These numeric converters have not yet been included in a release, and so they can be moved without concern.

All of Connect’s converters must be referenced in worker / connector configurations and are therefore part of the API, but otherwise do not need to be in the “api” module as they do not need to be instantiated or directly used by extensions. This change makes them more similar to and aligned with the `ByteArrayConverter`.

It also gives us the opportunity to move them into the “api” module in the future (keeping the same package name), should we ever want or need to do so. However, if we were to start out with them in the “api” module, we would never be able to move them out into the “runtime” module, even if we kept the same package name. Therefore, moving them to “runtime” now gives us a bit more flexibility.

This PR moves the unit tests for the numeric converters accordingly, and updates the `PluginsUtil` and `PluginUtilsTest` as well.

Author: Randall Hauch <rhauch@gmail.com>

Reviewers: Konstantine Karantasis <konstantine@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #5222 from rhauch/kafka-7056
2018-06-15 14:52:28 -07:00
Randall Hauch 22356d55ef KAFKA-7043: Modified plugin isolation whitelist with recently added converters (KIP-305)
Several recently-added converters are included in the plugin isolation whitelist, similarly to the `StringConverter`. This is a change in the implementation, and does not affect the approved KIP. Several unit tests were added to verify they are being loaded in isolation, again similarly to `StringConverter`.

These changes should be applied only to `trunk` and `2.0`, since these converters were added as part of KIP-305 for AK 2.0.

Author: Randall Hauch <rhauch@gmail.com>

Reviewers: Magesh Nandakumar <magesh.n.kumar@gmail.com>, Konstantine Karantasis <konstantine@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #5198 from rhauch/kafka-7043
2018-06-12 21:35:45 -07:00
Chia-Ping Tsai 70e13f032e MINOR: Remove the unused field in DelegatingClassLoader
After [3173](https://github.com/apache/kafka/commit/e0150a25e8), the field "activePaths" is not used anymore.

Author: Chia-Ping Tsai <chia7712@gmail.com>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #4984 from chia7712/remove_unused_field_from_DelegatingClassLoader
2018-06-12 21:03:19 -07:00
Robert Yokota 16190e9bfd MINOR: Move FileConfigProvider to provider subpackage (#5194)
This moves FileConfigProvider to the org.apache.common.config.provider package to more easily isolate provider implementations going forward.

Reviewers: Konstantine Karantasis <konstantine@confluent.io>, Randall Hauch <rhauch@gmail.com>, Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>
2018-06-12 12:52:51 -07:00
Arjun Satish 3face7fce2 KAFKA-7003: Set error context in message headers (KIP-298)
If the property `errors.deadletterqueue.context.headers.enable` is set to true, add a set of headers to the message describing the context under which the error took place.

A unit test is added to check the correctness of header creation.

Signed-off-by: Arjun Satish <arjunconfluent.io>

Author: Arjun Satish <arjun@confluent.io>

Reviewers: Konstantine Karantasis <konstantine@confluent.io>, Randall Hauch <rhauch@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #5159 from wicknicks/KAFKA-7003
2018-06-11 15:16:46 -07:00
Arjun Satish 22612be46d KAFKA-7002: Add a config property for DLQ topic's replication factor (KIP-298)
Currently, the replication factor is hardcoded to a value of 3. This means that we cannot use a DLQ in any cluster setup with less than three brokers. It is better to have the user specify this value if the default value does meet the requirements.

Testing: A unit test is added.

Signed-off-by: Arjun Satish <arjunconfluent.io>

Author: Arjun Satish <arjun@confluent.io>

Reviewers: Randall Hauch <rhauch@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #5145 from wicknicks/KAFKA-7002
2018-06-07 15:49:57 -07:00
Arjun Satish aa014b2709 KAFKA-7001: Rename errors.allowed.max property in Connect to errors.tolerance (KIP-298)
Signed-off-by: Arjun Satish <arjunconfluent.io>

Author: Arjun Satish <arjun@confluent.io>

Reviewers: Randall Hauch <rhauch@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #5146 from wicknicks/KAFKA-7001
2018-06-07 15:27:13 -07:00
Magesh Nandakumar 642a09168c KAFKA-6991: Fix ServiceLoader issue with PluginClassLoader (KIP-285)
Fix ServiceLoader issue with PluginClassLoader and add basic-auth-extension packaging & classpath

*More detailed description of your change,
if necessary. The PR title and PR message become
the squashed commit message, so use a separate
comment to ping reviewers.*

*Summary of testing strategy (including rationale)
for the feature or bug fix. Unit and/or integration
tests are expected for any behaviour change and
system tests should be considered for larger changes.*

Author: Magesh Nandakumar <magesh.n.kumar@gmail.com>

Reviewers: Konstantine Karantasis <konstantine@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #5135 from mageshn/KAFKA-6991
2018-06-06 21:09:16 -07:00
Robert Yokota 8264492dee MINOR: Use service loader for ConfigProvider impls (KIP-297)
This is a small change to use the Java ServiceLoader to load ConfigProvider plugins.  It uses code added by mageshn for Connect Rest Extensions.

Author: Robert Yokota <rayokota@gmail.com>

Reviewers: Magesh Nandakumar <magesh.n.kumar@gmail.com>, Randall Hauch <rhauch@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #5141 from rayokota/service-loader-for-config-plugins
2018-06-06 08:55:57 -07:00
Arjun Satish faa15b8b75 KAFKA-6981: Move the error handling configuration properties into the ConnectorConfig and SinkConnectorConfig classes (KIP-298)
Move the error handling configuration properties into the ConnectorConfig and SinkConnectorConfig classes, and refactor the tests and classes to use these new properties.

Testing: Unit tests and running the connect-standalone script with a file sink connector.

Author: Arjun Satish <arjun@confluent.io>
Author: Randall Hauch <rhauch@gmail.com>

Reviewers: Konstantine Karantasis <konstantine@confluent.io>, Magesh Nandakumar <magesh.n.kumar@gmail.com>, Robert Yokota <rayokota@gmail.com>, Randall Hauch <rhauch@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #5125 from wicknicks/KAFKA-6981
2018-06-05 13:59:15 -07:00
Robert Yokota 08e8facdc9 KAFKA-6886: Externalize secrets from Connect configs (KIP-297)
This commit allows secrets in Connect configs to be externalized and replaced with variable references of the form `${provider:[path:]key}`, where the "path" is optional.

There are 2 main additions to `org.apache.kafka.common.config`: a `ConfigProvider` and a `ConfigTransformer`.  The `ConfigProvider` is an interface that allows key-value pairs to be provided by an external source for a given "path".  An a TTL can be associated with the key-value pairs returned from the path.  The `ConfigTransformer` will use instances of `ConfigProvider` to replace variable references in a set of configuration values.

In the Connect framework, `ConfigProvider` classes can be specified in the worker config, and then variable references can be used in the connector config.  In addition, the herder can be configured to restart connectors (or not) based on the TTL returned from a `ConfigProvider`.  The main class that performs restarts and transformations is `WorkerConfigTransformer`.

Finally, a `configs()` method has been added to both `SourceTaskContext` and `SinkTaskContext`.  This allows connectors to get configs with variables replaced by the latest values from instances of `ConfigProvider`.

Most of the other changes in the Connect framework are threading various objects through classes to enable the above functionality.

Author: Robert Yokota <rayokota@gmail.com>
Author: Ewen Cheslack-Postava <me@ewencp.org>

Reviewers: Randall Hauch <rhauch@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #5068 from rayokota/KAFKA-6886-connect-secrets
2018-05-30 14:43:11 -07:00
Arjun Satish f8dfbb067c KAFKA-6738: Implement error handling for source and sink tasks (KIP-298)
This PR implements the features described in this KIP: https://cwiki.apache.org/confluence/display/KAFKA/KIP-298%3A+Error+Handling+in+Connect

This PR changes the Connect framework to allow it to automatically deal with errors encountered while processing records in a Connector. The following behavior changes are introduced here:

**Retry on Failure**: Retry the failed operation a configurable number of times, with backoff between each retry.
**Task Tolerance Limits**: Tolerate a configurable number of failures in a task.

We also add the following ways to report errors, along with sufficient context to simplify the debugging process:

**Log Error Context**: The error information along with processing context is logged along with standard application logs.
**Dead Letter Queue**: Produce the original message into a Kafka topic (applicable only to sink connectors).

New **metrics** which will monitor the number of failures, and the behavior of the response handler are added.

The changes proposed here **are backward compatible**. The current behavior in Connect is to kill the task on the first error in any stage. This will remain the default behavior if the connector does not override any of the new configurations which are provided as part of this feature.

Testing: added multiple unit tests to test the retry and tolerance logic.

Author: Arjun Satish <arjun@confluent.io>
Author: Ewen Cheslack-Postava <me@ewencp.org>

Reviewers: Magesh Nandakumar <magesh.n.kumar@gmail.com>, Randall Hauch <rhauch@gmail.com>, Konstantine Karantasis <konstantine@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #5065 from wicknicks/KAFKA-6378
2018-05-30 11:39:45 -07:00
Magesh Nandakumar 98094954a2 KAFKA-6776: ConnectRestExtension Interfaces & Rest integration (KIP-285)
This PR provides the implementation for KIP-285 and also a reference implementation for authenticating BasicAuth credentials using JAAS LoginModule

Author: Magesh Nandakumar <magesh.n.kumar@gmail.com>

Reviewers: Randall Hauch <rhauch@gmail.com>, Arjun Satish <wicknicks@users.noreply.github.com>, Konstantine Karantasis <konstantine@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #4931 from mageshn/KIP-285
2018-05-29 21:35:22 -07:00
Randall Hauch fffb9c5b5c KAFKA-6913: Add Connect converters and header converters for short, integer, long, float, and double (KIP-305)
*[KIP-305](https://cwiki.apache.org/confluence/display/KAFKA/KIP-305%3A+Add+Connect+primitive+number+converters) has been approved.*

Added converters and header converters for the primitive number types for which Kafka already had serializers and deserializers. All extend a common base class, `NumberConverter`, that encapsulates most of the shared functionality. Unit tests were added to check the basic functionality.

These classes are not used by any other Connect code, and must be explicitly used in Connect workers and connectors.

Author: Randall Hauch <rhauch@gmail.com>

Reviewers: Arjun Satish <wicknicks@users.noreply.github.com>, Magesh Nandakumar <magesh.n.kumar@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #5034 from rhauch/kafka-6913
2018-05-29 16:37:33 -07:00
Chris Egerton a64ab91238 KAFKA-5540: Deprecate internal converter configs (KIP-174)
Implementation of [KIP-174](https://cwiki.apache.org/confluence/display/KAFKA/KIP-174+-+Deprecate+and+remove+internal+converter+configs+in+WorkerConfig)

Configuration properties 'internal.key.converter' and 'internal.value.converter'
are deprecated, and default to org.apache.kafka.connect.json.JsonConverter.

Warnings are logged if values are specified for either, or if properties that
appear to configure instances of internal converters (i.e., ones prefixed with
either 'internal.key.converter.' or 'internal.value.converter.') are given.

The property 'schemas.enable' is also defaulted to false for internal
JsonConverter instances (both for keys and values) if it isn't specified.

Documentation and code have also been updated with deprecation notices and
annotations, respectively.

Unit tests have been updated in `PluginsTest` to account for the new defaults for `schemas.enable` for internal key/value converters, and to ensure that (for the time being), internal key/value converters are still configurable despite being deprecated.

Author: Chris Egerton <chrise@confluent.io>
Author: Ewen Cheslack-Postava <me@ewencp.org>

Reviewers: Randall Hauch <rhauch@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #4693 from C0urante/kafka-5540
2018-05-29 16:22:47 -07:00
John Roesler c470ff70d3 KAFKA-5697; Implement new consumer poll API from KIP-266 (#4855)
Add the new stricter-timeout version of `poll` proposed in KIP-266.

The pre-existing variant `poll(long timeout)` would block indefinitely for metadata
updates if they were needed, then it would issue a fetch and poll for `timeout` ms 
for new records. The initial indefinite metadata block caused applications to become
stuck when the brokers became unavailable. The existence of the timeout parameter
made the indefinite block especially unintuitive.

This PR adds `poll(Duration timeout)` with the semantics:
1. iff a metadata update is needed:
    1. send (asynchronous) metadata requests
    2. poll for metadata responses (counts against timeout)
        - if no response within timeout, **return an empty collection immediately**
2. if there is fetch data available, **return it immediately**
3. if there is no fetch request in flight, send fetch requests
4. poll for fetch responses (counts against timeout)
    - if no response within timeout, **return an empty collection** (leaving async fetch request for the next poll)
    - if we get a response, **return the response**

The old method, `poll(long timeout)` is deprecated, but we do not change its semantics, so it remains:
1. iff a metadata update is needed:
    1. send (asynchronous) metadata requests
    2. poll for metadata responses *indefinitely until we get it*
2. if there is fetch data available, **return it immediately**
3. if there is no fetch request in flight, send fetch requests
4. poll for fetch responses (counts against timeout)
    - if no response within timeout, **return an empty collection** (leaving async fetch request for the next poll)
    - if we get a response, **return the response**

One notable usage is prohibited by the new `poll`: previously, you could call `poll(0)` to block for metadata updates, for example to initialize the client, supposedly without fetching records. Note, though, that this behavior is not according to any contract, and there is no guarantee that `poll(0)` won't return records the first time it's called. Therefore, it has always been unsafe to ignore the response.
2018-05-26 11:50:51 -07:00
Jeremy Custenborder 7ecfbab92f KAFKA-5807 - Check Connector.config() and Transformation.config() returns a valid ConfigDef
Little back story on this. Was helping a user over email. This could be much easier to debug if we assume that the connector developer might not return valid configs. For example Intellij will generate a stub that returns a null. This was the case that inspired this JIRA.

Author: Jeremy Custenborder <jcustenborder@gmail.com>

Reviewers: Jason Gustafson <jason@confluent.io>, Randall Hauch <rhauch@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #3762 from jcustenborder/KAFKA-5807
2018-05-22 13:53:15 -07:00
Ismael Juma a30ecc6755 MINOR: Remove o.a.kafka.common.utils.Base64 and IS_JAVA8_COMPATIBLE
We no longer need them since we now require Java 8.

Author: Ismael Juma <ismael@juma.me.uk>

Reviewers: Andras Beni <andrasbeni@cloudera.com>, Manikumar Reddy O <manikumar.reddy@gmail.com>, Dong Lin <lindong28@gmail.com>

Closes #5049 from ijuma/remove-base64
2018-05-22 09:57:39 -07:00
Ismael Juma e70a191d30
KAFKA-4423: Drop support for Java 7 (KIP-118) and update deps (#5046)
* Set --source, --target and --release to 1.8.
* Build Scala 2.12 by default.
* Remove some conditionals in the build file now that Java 8
is the minimum version.
* Bump the version of Jetty, Jersey and Checkstyle (the newer
versions require Java 8).
* Fixed issues uncovered by the new version if Checkstyle.
* A couple of minor updates to handle an incompatible source
change in the new version of Jetty.
* Add dependency to jersey-hk2 to fix failing tests caused
by Jersey upgrade.
* Update release script to use Java 8 and to take into account
that Scala 2.12 is now built by default.
* While we're at it, bump the version of Gradle, Gradle plugins,
ScalaLogging, JMH and apache directory api.
* Minor documentation updates including the readme and upgrade
notes. A number of Streams Java 7 examples can be removed
subsequently.
2018-05-21 23:17:42 -07:00
Jagadesh Adireddi 95b46a12e5 KAFKA-6685: Added Exception to distinguish message Key from Value during deserializing.
https://issues.apache.org/jira/browse/KAFKA-6685

Added Exception message in `WorkerSinkTask.convertMessages` to distinguish message Key from Value during deserialization to Kafka connect format.

*More detailed description of your change,
if necessary. The PR title and PR message become
the squashed commit message, so use a separate
comment to ping reviewers.*

*Summary of testing strategy (including rationale)
for the feature or bug fix. Unit and/or integration
tests are expected for any behaviour change and
system tests should be considered for larger changes.*

Author: Jagadesh Adireddi <adireddijagadesh@gmail.com>

Reviewers: Randall Hauch <rhauch@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #4765 from jadireddi/KAFKA-6685---log-message-should-distinguish-key-from-value
2018-05-21 21:10:17 -07:00
Robert Yokota ee8abb2f70 KAFKA-6566: Improve Connect Resource Cleanup
This is a change to improve resource cleanup for sink tasks and source tasks.  Now `Task.stop()` is called from both `WorkerSinkTask.close()` and `WorkerSourceTask.close()`.

It is called from `WorkerXXXTask.close()` since this method is called in the `finally` block of `WorkerTask.run()`, and Connect developers use `stop()` to clean up resources.

Author: Robert Yokota <rayokota@gmail.com>

Reviewers: Randall Hauch <rhauch@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #5020 from rayokota/K6566-improve-connect-resource-cleanup
2018-05-18 10:39:34 -07:00
Colin Patrick McCabe abbd53da4a KAFKA-6299; Fix AdminClient error handling when metadata changes (#4295)
When AdminClient gets a NOT_CONTROLLER error, it should refresh its metadata and retry the request, rather than making the end-user deal with NotControllerException.

Move AdminClient's metadata management outside of NetworkClient and into AdminMetadataManager. This will make it easier to do more sophisticated metadata management in the future, such as implementing a NodeProvider which fetches the leaders for topics.

Rather than manipulating newCalls directly, the AdminClient service thread now drains it directly into pendingCalls. This minimizes the amount of locking we have to do, since pendingCalls is only accessed from the service thread.
2018-05-09 14:27:28 -07:00
John Roesler cc43e77bbb MINOR: make Sensor#add idempotent (#4853)
This change makes adding a metric to a sensor idempotent.
That is, if the metric is already added to the sensor, the method
returns with success.

The current behavior is that any attempt to register a second metric
with the same name is an error.

Testing strategy: There is a new unit test covering this behavior

Reviewers: Guozhang Wang <guozhang@confluent.io>, Bill Bejeck <bill@confluent.io>
2018-04-11 20:50:10 -07:00
Benedict Jin 37efc79eb8 MINOR: Remove magic number and extract Pattern instance from method as class field (#4799)
* Remove magic number
* Extract Pattern instance from method as class field
* Add @Override declare

Reviewers: Randall Hauch <rhauch@gmail.com>
2018-04-08 11:54:22 -07:00
Randall Hauch d9369de8f2 KAFKA-6728: Corrected the worker’s instantiation of the HeaderConverter
## Summary of the problem
When the `header.converter` is not specified in the worker config or the connector config, a bug in the `Plugins` test causes it to never instantiate the `HeaderConverter` instance, even though there is a default value.

This is a problem as soon as the connector deals with headers, either in records created by a source connector or in messages on the Kafka topics consumed by a sink connector. As soon as that happens, a NPE occurs.

A workaround is to explicitly set the `header.converter` configuration property, but this was added in AK 1.1 and thus means that upgrading to AK 1.1 will not be backward compatible and will require this configuration change.

## The Changes

The `Plugins.newHeaderConverter` methods were always returning null if the `header.converter` configuration value was not specified in the supplied connector or worker configuration. Thus, even though the `header.converter` property has a default, it was never being used.

The fix was to only check whether a `header.converter` property was specified when the connector configuration was being used, and if no such property exists in the connector configuration to return null. Then, when the worker configuration is being used, the method simply gets the `header.converter` value (or the default if no value was explicitly set).

Also, the ConnectorConfig had the same default value for the `header.converter` property as the WorkerConfig, but this resulted in very confusing log messages that implied the default header converter should be used even when the worker config specified the `header.converter` value. By removing the default, the log messages now make sense, and the Worker still properly instantiates the correct header converter.

Finally, updated comments and added log messages to make it more clear which converters are being used and how they are being converted.

## Testing

Several new unit tests for `Plugins.newHeaderConverter` were added to check the various behavior. Additionally, the runtime JAR with these changes was built and inserted into an AK 1.1 installation, and a source connector was manually tested with 8 different combinations of settings for the `header.converter` configuration:

1. default value
1. worker configuration has `header.converter` explicitly set to the default
1. worker configuration has `header.converter` set to a custom `HeaderConverter` implementation in the same plugin
1. worker configuration has `header.converter` set to a custom `HeaderConverter` implementation in a _different_ plugin
1. connector configuration has `header.converter` explicitly set to the default
1. connector configuration has `header.converter` set to a custom `HeaderConverter` implementation in the same plugin
1. connector configuration has `header.converter` set to a custom `HeaderConverter` implementation in a _different_ plugin
1. worker configuration has `header.converter` explicitly set to the default, and the connector configuration has `header.converter` set to a custom `HeaderConverter` implementation in a _different_ plugin

The worker created the correct `HeaderConverter` implementation with the correct configuration in all of these tests.

Finally, the default configuration was used with the aforementioned custom source connector that generated records with headers, and an S3 connector that consumes the records with headers (but didn't do anything with them). This test also passed.

Author: Randall Hauch <rhauch@gmail.com>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #4815 from rhauch/kafka-6728
2018-04-03 08:48:05 -07:00
Attila Sasvari 549a5cec1e MINOR: Fix potential resource leak in FileOffsetBackingStore (#4739)
Reviewers: Sandor Murakozi <smurakozi@gmail.com>, Jason Gustafson <jason@confluent.io>
2018-03-24 12:20:11 -07:00
Randall Hauch e7ef719a5b KAFKA-6661: Ensure sink connectors don’t resume consumer when task is paused
Changed WorkerSinkTaskContext to only resume the consumer topic partitions when the connector/task is not in the paused state.

The context tracks the set of topic partitions that are explicitly paused/resumed by the connector, and when the WorkerSinkTask resumes the tasks it currently resumes all topic partitions *except* those that are still explicitly paused in the context. Therefore, the change above should result in the desired behavior.

Several debug statements were added to record when the context is called by the connector.

This can be backported to older releases, since this bug goes back to 0.10 or 0.9.

Author: Randall Hauch <rhauch@gmail.com>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #4716 from rhauch/kafka-6661
2018-03-15 15:52:53 -07:00
Randall Hauch 0e22fd6f8d KAFKA-6578: Changed the Connect distributed and standalone main method to log all exceptions (#4609)
Any exception thrown by calls within a `main()` method are not logged unless explicitly done so. This change simply adds a try-catch block around most of the content of the distributed and standalone `main()` methods.
2018-02-22 22:29:49 -08:00
Randall Hauch fc19c3e6f2 KAFKA-6577: Fix Connect system tests and add debug messages
**NOTE: This should be backported to the `1.1` branch, and is currently a blocker for 1.1.**

The `connect_test.py::ConnectStandaloneFileTest.test_file_source_and_sink` system test is failing with the SASL configuration without a sufficient explanation. During the test, the Connect worker fails to start, but the Connect log contains no useful information. There are actual several things compounding to cause the failure and make it difficult to understand the problem.

First, the `tests/kafkatest/tests/connect/templates/connect_standalone.properties` is only adding in the broker's security configuration with the `producer.` and `consumer.` prefixes, but is not adding them with no prefix. The worker uses the AdminClient to connect to the broker to get the Kafka cluster ID and to manage the three internal topics, and the AdminClient is configured via top-level properties. Because the SASL test requires the clients all connect using SASL, the lack of broker security configs means the AdminClient was attempting and failing to connect to the broker. This is corrected by adding the broker's security configuration to the Connect worker configuration file at the top-level. (This was already being done in the `connect_distributed.properties` file.)

Second, the default `request.timeout.ms` for the AdminClient (and the other clients) is 120 seconds, so the AdminClient was retrying for 120 seconds before it would give up and thrown an error. However, the test was only waiting for 60 seconds before determining that the service failed to start. This can be corrected by setting `request.timeout.ms=10000` in the Connect distributed and standalone worker configurations.

Third, the Connect workers were recently changed to lookup the Kafka cluster ID before it started the herder. This is unlike the older uses of the AdminClient to find and manage the internal topics, where failure to connect was not necessarily logged correctly but nevertheless still skipped over, relying upon broker auto-topic creation to create the internal topics. (This may be why the test did not fail prior to the recent change to always require a successful AdminClient connection.) Although the worker never got this far in its startup process, the fact that we missed such an error since the prior releases means that failure to connect with the AdminClient was not being properly reported.

The `ConnectStandaloneFileTest.test_file_source_and_sink` system tests were run locally prior to this fix, and they failed as with the nightlies. Once these fixes were made, the locally run system tests passed.

Author: Randall Hauch <rhauch@gmail.com>

Reviewers: Konstantine Karantasis <konstantine@confluent.io>, Ewen Cheslack-Postava <me@ewencp.org>

Closes #4610 from rhauch/kafka-6577-trunk
2018-02-22 09:39:59 +00:00
Konstantine Karantasis f3c4240e6c MINOR: Restore scanning for super types of Connect plugins (#4584)
Enabling scans for super types in reflections is required in order to discover Connect plugins.

Reviewers: Randall Hauch <rhauch@gmail.com>, Jason Gustafson <jason@confluent.io>
2018-02-20 13:39:45 -08:00
Konstantine Karantasis b79e11bb51 MINOR: Redirect response code in Connect's RestClient to logs instead of stdout
Sending the response code of an http request issued via `RestClient` in Connect to stdout seems like a unconventional choice.

This PR redirects the responds code with a message in the logs at DEBUG level (usually the same level as the one that the caller of `RestClient.httpRequest` uses.

This fix will also fix system tests that broke by outputting this response code to stdout.

Author: Konstantine Karantasis <konstantine@confluent.io>

Reviewers: Randall Hauch <rhauch@gmail.com>, Damian Guy <damian.guy@gmail.com>

Closes #4591 from kkonstantine/MINOR-Redirect-response-code-in-Connect-RestClient-to-logs-instead-of-stdout
2018-02-20 17:15:31 +00:00
Konstantine Karantasis ee352be9c8 MINOR: Fix bug introduced by adding batch.size without default in FileStreamSourceConnector (#4579)
https://github.com/apache/kafka/pull/4356 added `batch.size` config property to `FileStreamSourceConnector` but the property was added as required without a default in config definition (`ConfigDef`). This results in validation error during connector startup. 

Unit tests were added for both `FileStreamSourceConnector` and `FileStreamSinkConnector` to avoid such issues in the future.

Reviewers: Randall Hauch <rhauch@gmail.com>, Jason Gustafson <jason@confluent.io>
2018-02-16 18:30:12 -08:00
Robert Yokota 3af13967db KAFKA-6503: Parallelize plugin scanning
This is a small change to parallelize plugin scanning.  This may help in some environments where otherwise plugin scanning is slow.

Author: Robert Yokota <rayokota@gmail.com>

Reviewers: Konstantine Karantasis <konstantine@confluent.io>, Randall Hauch <rhauch@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #4561 from rayokota/K6503-improve-plugin-scanning
2018-02-14 16:24:05 -08:00
ConcurrencyPractitioner d902b6b4b6 KAFKA-5996; JsonConverter generates Mismatching schema DataException (#4523)
JsonConverter should use object equality rather than reference equality in `convertToJson`.

Reviewers: Bartlomiej Tartanus <bartektartanus@gmail.com>, Randall Hauch <rhauch@gmail.com>, Jason Gustafson <jason@confluent.io>
2018-02-14 08:46:22 -08:00
Benedict Jin 2693e9be74 MINOR: Misc improvements on runtime / storage / metrics / config parts (#4525)
Reviewers: Guozhang Wang <wangguoz@gmail.com>
2018-02-13 16:39:21 -08:00
Randall Hauch a68a8b6b6c KAFKA-6511; Corrected connect list/map parsing logic (#4516)
Corrected the parsing of invalid list values. A list can only be parsed if it contains elements that have a common type, and a map can only be parsed if it contains keys with a common type and values with a common type.

Reviewers: Arjun Satish <arjun@confluent.io>, Magesh Nandakumar <magesh.n.kumar@gmail.com>, Konstantine Karantasis <konstantine@confluent.io>, Jason Gustafson <jason@confluent.io>
2018-02-13 11:44:12 -08:00
Jeremy Custenborder 9f4fe7679e KAFKA-5550; Connect Struct.put() should include the field name if validation fails (#3507)
Changed call to use the overload of ConnectSchema.validateValue() method with the field name passed in. Ensure that field in put call is not null.

Reviewers: Randall Hauch <rhauch@gmail.com>, Jason Gustafson <jason@confluent.io>
2018-02-12 09:16:25 -08:00
Randall Hauch 976a3b0cc8 KAFKA-6513: Corrected how Converters and HeaderConverters are instantiated and configured
The commits for KIP-145 (KAFKA-5142) changed how the Connect workers instantiate and configure the Converters, and also added the ability to do the same for the new HeaderConverters. However, the last few commits removed the default value for the `converter.type` property for Converters and HeaderConverters, and this broke how the internal converters were being created.

This change corrects the behavior so that the `converter.type` property is always set by the worker (or by the Plugins class), which means the existing Converter implementations will not have to do this. The built-in JsonConverter, ByteArrayConverter, and StringConverter also implement HeaderConverter which implements Configurable, but the Worker and Plugins methods do not yet use the `Configurable.configure(Map)` method and instead still use the `Converter.configure(Map,boolean)`.

Several tests were modified, and a new PluginsTest was added to verify the new behavior in Plugins for instantiating and configuring the Converter and HeaderConverter instances.

Author: Randall Hauch <rhauch@gmail.com>

Reviewers: Konstantine Karantasis <konstantine@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #4512 from rhauch/kafka-6513
2018-02-09 15:47:44 -08:00
Wladimir Schmidt c32860b224 MINOR: exchange redundant Collections.addAll with parameterized constructor (#4521)
* Exchange manual copy to collection with Collections.addAll call
* Exchange redundant Collections.addAll with parameterized constructor call

Reviewers: Guozhang Wang <wangguoz@gmail.com>
2018-02-06 11:33:49 -08:00
Robert Yokota 7d70a427b2 KAFKA-6504: Ensure uniqueness of connect task metric sensor creation (#4514)
This change ensures that when sensors are created, they are unique to the metric group associated with the task that created them. Previously the sensors were being shared between task metric groups, causing incorrect metrics.

Reviewers: Randall Hauch <rhauch@gmail.com>, Jason Gustafson <jason@confluent.io>
2018-02-06 09:18:37 -08:00
Jeff Klukas eb3fef760e KAFKA-6253: Improve sink connector topic regex validation
KAFKA-3073 added topic regex support for sink connectors. The addition requires that you only specify one of topics or topics.regex settings. This is being validated in one place, but not during submission of connectors. This PR adds validation at `AbstractHerder.validateConnectorConfig` and `WorkerConnector.initialize`.

This adds a test of the new behavior to `AbstractHerderTest`.

Author: Jeff Klukas <jeff@klukas.net>

Reviewers: Randall Hauch <rhauch@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #4251 from jklukas/connect-topics-validation
2018-02-05 09:46:07 -08:00
Konstantine Karantasis 17aaff3606 KAFKA-6288: Broken symlink interrupts scanning of the plugin path
Submitting a fail safe fix for rare IOExceptions on symbolic links.

The fix is submitted without a test case since it does seem easy to reproduce such type of failures (just having a broken symbolic link does not reproduce the issue) and it's considered pretty low risk.

If accepted, needs to be ported at least to 1.0, if not 0.11

Author: Konstantine Karantasis <konstantine@confluent.io>

Reviewers: Randall Hauch <rhauch@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #4481 from kkonstantine/KAFKA-6288-Broken-symlink-interrupts-scanning-the-plugin-path
2018-02-04 14:57:25 -08:00
Gunnar Morling f9b56d680b KAFKA-6515; Adding toString() method to o.a.k.connect.data.Field (#4509) 2018-02-03 13:27:02 -08:00
Randall Hauch 4c48942f9d KAFKA-5142: Add Connect support for message headers (KIP-145)
**[KIP-145](https://cwiki.apache.org/confluence/display/KAFKA/KIP-145+-+Expose+Record+Headers+in+Kafka+Connect) has been accepted, and this PR implements KIP-145 except without the SMTs.**

Changed the Connect API and runtime to support message headers as described in [KIP-145](https://cwiki.apache.org/confluence/display/KAFKA/KIP-145+-+Expose+Record+Headers+in+Kafka+Connect).

The new `Header` interface defines an immutable representation of a Kafka header (key-value pair) with support for the Connect value types and schemas. This interface provides methods for easily converting between many of the built-in primitive, structured, and logical data types.

The new `Headers` interface defines an ordered collection of headers and is used to track all headers associated with a `ConnectRecord` (and thus `SourceRecord` and `SinkRecord`). This does allow multiple headers with the same key. The `Headers` contains methods for adding, removing, finding, and modifying headers. Convenience methods allow connectors and transforms to easily use and modify the headers for a record.

A new `HeaderConverter` interface is also defined to enable the Connect runtime framework to be able to serialize and deserialize headers between the in-memory representation and Kafka’s byte[] representation. A new `SimpleHeaderConverter` implementation has been added, and this serializes to strings and deserializes by inferring the schemas (`Struct` header values are serialized without the schemas, so they can only be deserialized as `Map` instances without a schema.) The `StringConverter`, `JsonConverter`, and `ByteArrayConverter` have all been extended to also be `HeaderConverter` implementations. Each connector can be configured with a different header converter, although by default the `SimpleHeaderConverter` is used to serialize header values as strings without schemas.

Unit and integration tests are added for `ConnectHeader` and `ConnectHeaders`, the two implementation classes for headers. Additional test methods are added for the methods added to the `Converter` implementations. Finally, the `ConnectRecord` object is already used heavily, so only limited tests need to be added while quite a few of the existing tests already cover the changes.

Author: Randall Hauch <rhauch@gmail.com>

Reviewers: Arjun Satish <arjun@confluent.io>, Ted Yu <yuzhihong@gmail.com>, Magesh Nandakumar <magesh.n.kumar@gmail.com>, Konstantine Karantasis <konstantine@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #4319 from rhauch/kafka-5142-b
2018-01-31 10:40:24 -08:00
Soenke Liebau 530bc59de2 KAFKA-4930: Enforce set of legal characters for connector names (KIP-212)
…to check for empty connector name and illegal characters in connector name. This also fixes  KAFKA-4938 by removing the check for slashes in connector name from ConnectorsResource.

Author: Ewen Cheslack-Postava <me@ewencp.org>
Author: Soenke Liebau <soenke.liebau@opencore.com>

Reviewers: Gwen Shapira <cshapi@gmail.com>, Viktor Somogyi <viktor.somogyi@cloudera.com>, Randall Hauch <rhauch@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #2755 from soenkeliebau/KAFKA-4930
2018-01-31 08:49:23 -08:00
Jakub Scholz 3431be2aeb KAFKA-4029: SSL support for Connect REST API (KIP-208)
This PR implements the JIRA issue [KAFKA-4029: SSL support for Connect REST API](https://issues.apache.org/jira/browse/KAFKA-4029) / [KIP-208](https://cwiki.apache.org/confluence/display/KAFKA/KIP-208%3A+Add+SSL+support+to+Kafka+Connect+REST+interface).

Summary of the main changes:
- Jetty `HttpClient` is used as HTTP client instead of the one shipped with Java. That allows to keep the SSL configuration for Server and Client be in single place (both use the Jetty `SslContextFactory`). It also has much richer configuration than the JDK client (it is easier to configure things such as supported cipher suites etc.).
- The `RestServer` class has been broker into 3 parts. `RestServer` contains the server it self. `RestClient` contains the HTTP client used for forwarding requests etc. and `SSLUtils` contain some helper classes for configuring SSL. One of the reasons for this was Findbugs complaining about the class complexity.
- A new method `valuesWithPrefixAllOrNothing` has been added to `AbstractConfig` to make it easier to handle the situation that we want to use either only the prefixed SSL options or only the non-prefixed. But not mixed them.

Author: Jakub Scholz <www@scholzj.com>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #4429 from scholzj/kip-208
2018-01-30 15:09:40 -08:00
Konstantine Karantasis b4165522b3 KAFKA-6148; ClassCastException in connectors that include kafka-clients packages (#4457)
Exclusion for packages that need not be loaded in isolation needs to be extended to all the `org.apache.kafka` packages (that do not belong to transforms and the other whitelisted packages). Most notably, this refers to any classes in `kafka-clients` package. 

Reviewers: Randall Hauch <rhauch@gmail.com>, Jason Gustafson <jason@confluent.io>
2018-01-30 08:45:07 -08:00
Scott d706c87cb2 MINOR: a single doc typo (#2969) 2018-01-26 11:14:57 -08:00
Konstantine Karantasis 993d3c727e KAFKA-6467; Enforce layout of dependencies within a connect plugin to be deterministic (#4459)
Adds alphanumeric ordering of dependencies as they added to a Connect plugin's class loader path. 

This makes the layout of the dependencies consistent across systems and deployments. Dependencies should still, in principle, not include conflicts and ideally order should not matter. 

Reviewers: Randall Hauch <rhauch@gmail.com>, Jason Gustafson <jason@confluent.io>
2018-01-26 08:39:57 -08:00
Charly Molter aa42a11dfd KAFKA-6180; Add a Validator for NonNull configurations and remove redundant null checks on lists (#4188) 2018-01-24 21:06:44 -08:00
Konstantine Karantasis 0fa52644de KAFKA-6277: Ensure loadClass for plugin class loaders is thread-safe.
`loadClass` needs to be synchronized to protect subsequent calls to `defineClass`.

Details in the javadoc of this PR as well as here too: https://docs.oracle.com/javase/7/docs/technotes/guides/lang/cl-mt.html

/cc ewencp rhauch

Author: Konstantine Karantasis <konstantine@confluent.io>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #4428 from kkonstantine/KAFKA-6277-Make-loadClass-thread-safe-for-class-loaders-of-Connect-plugins
2018-01-21 19:41:20 -08:00
Gunnar Morling 8cd563d478 KAFKA-6456; JavaDoc clarification for Connect SourceTask#poll() (#4432)
Making clear that implementations of poll() shouldn't block indefinitely in order to allow the task instance to transition to PAUSED state.

Reviewers:  Randall Hauch <rhauch@gmail.com>, Jason Gustafson <jason@confluent.io>
2018-01-18 09:56:22 -08:00
Gavrie Philipson 936e81afcb KAFKA-6250: Use existing Kafka Connect internal topics without requiring ACL (#4247)
When using Kafka Connect with a cluster that doesn't allow the user to
create topics (due to ACL configuration), Connect fails when trying to
create its internal topics even if these topics already exist. This is
incorrect behavior according to the documentation, which mentions that
R/W access should be enough.

This happens specifically when using Aiven Kafka, which does not permit
creation of topics via the Kafka Admin Client API.

The patch ignores the returned error, similar to the behavior for older
brokers that don't support the API.
2018-01-11 15:52:50 -08:00
Filipe Agapito 697a4af35a KAFKA-6363: Use MockAdminClient for any unit tests that depend on AdminClient (#4371)
* Implement MockAdminClient.deleteTopics
* Use MockAdminClient instead of MockKafkaAdminClientEnv in StreamsResetterTest
* Rename MockKafkaAdminClientEnv to AdminClientUnitTestEnv
* Use MockAdminClient instead of MockKafkaAdminClientEnv in TopicAdminTest
* Rename KafkaAdminClient to AdminClientUnitTestEnv in KafkaAdminClientTest.java
* Migrate StreamThreadTest to MockAdminClient
* Fix style errors
* Address review comments
* Fix MockAdminClient call

Reviewers: Matthias J. Sax <matthias@confluent.io>, Konstantine Karantasis <konstantine@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
2018-01-08 11:58:56 -08:00
Arjun Satish 3b03c9d348 KAFKA-6252; Close the metric group to clean up any existing metrics
We are closing the metricGroups created in a Worker, Source task and Sink task before populating them with new metrics. This helps in cases where an Exception is thrown when previously created groups were not cleaned up correctly.

Signed-off-by: Arjun Satish <arjunconfluent.io>

Author: Arjun Satish <arjun@confluent.io>

Reviewers: Randall Hauch <rhauch@gmail.com>, Konstantine Karantasis <konstantine@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #4397 from wicknicks/KAFKA-6252
2018-01-05 21:24:39 -08:00
Study ecf0ab42c1 KAFKA-4335: Add batch.size to FileStreamSource connector to prevent OOM
When the source file of `FileStreamSource` is a large file, `FileStreamSourceTask.poll()` will result in OOM. This pull request added `batch.size` parameter which can restrict the poll size.

*More detailed description of your change,
if necessary. The PR title and PR message become
the squashed commit message, so use a separate
comment to ping reviewers.*

*Summary of testing strategy (including rationale)
for the feature or bug fix. Unit and/or integration
tests are expected for any behaviour change and
system tests should be considered for larger changes.*

Author: Study <ph.study@gmail.com>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #4356 from phstudy/KAFKA-4335
2018-01-05 18:31:53 -08:00
Ewen Cheslack-Postava bfb272c5cd KAFKA-6311: Expose Kafka cluster ID in Connect REST API (KIP-238) (#4314)
Reviewers: Konstantine Karantasis <konstantine@confluent.io>, Randall Hauch <rhauch@gmail.com>, Jason Gustafson <jason@confluent.io>
2018-01-05 07:52:50 -08:00
Koen De Groote 96df93522f Replace Arrays.asList with Collections.singletonList where possible (#4368)
Reviewers: Ismael Juma <ismael@juma.me.uk>
2018-01-01 13:07:32 +00:00
Koen De Groote 2bc780b959 MINOR: Use EnumMap/EnumSet if possible (#3919)
They are more efficient than HashMap/HashSet.

Reviewers: Ismael Juma <ismael@juma.me.uk>
2017-12-29 12:39:18 +00:00
Tobias Gies 68712dcdec KAFKA-6308; Connect Struct should use deepEquals/deepHashCode
This changes the Struct's equals and hashCode method to use Arrays#deepEquals and Arrays#deepHashCode, respectively. This resolves a problem where two structs with values of type byte[] would not be considered equal even though the byte arrays' contents are equal. By using deepEquals, the byte arrays' contents are compared instead of ther identity.

Since this changes the behavior of the equals method for byte array values, the behavior of hashCode must change alongside it to ensure the methods still fulfill the general contract of "equal objects must have equal hashCodes".

Test rationale:
All existing unit tests for equals were untouched and continue to work. A new test method was added to verify the behavior of equals and hashCode for Struct instances that contain a byte array value. I verify the reflixivity and transitivity of equals as well as the fact that equal Structs have equal hashCodes
and not-equal structs do not have equal hashCodes.

Author: Tobias Gies <tobias.gies@trivago.com>
Author: Tobias Gies <tobias@tobiasgies.de>

Reviewers: Randall Hauch <rhauch@gmail.com>, Jason Gustafson <jason@confluent.io>

Closes #4293 from tobiasgies/feature/kafka-6308-deepequals
2017-12-14 15:26:20 -08:00
Colin P. Mccabe 616321bcb6 KAFKA-6102; Consolidate MockTime implementations between connect and clients
Author: Colin P. Mccabe <cmccabe@confluent.io>

Reviewers: Randall Hauch <rhauch@gmail.com>, Jason Gustafson <jason@confluent.io>

Closes #4105 from cmccabe/KAFKA-6102
2017-12-14 14:50:34 -08:00
Soenke Liebau 5a2960f811 KAFKA-5563: Standardize validation and substitution of connector names in REST API connector configs
…from config to own function and added check to create connector call.

Author: Soenke Liebau <soenke.liebau@opencore.com>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #4230 from soenkeliebau/KAFKA-5563
2017-11-25 17:50:17 -08:00
Arjun Satish c5f31fe384 KAFKA-4827: Correctly encode special chars while creating URI objects
Signed-off-by: Arjun Satish <arjunconfluent.io>

Author: Arjun Satish <arjun@confluent.io>

Reviewers: Randall Hauch <rhauch@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #4205 from wicknicks/KAFKA-4827
2017-11-22 09:54:06 -08:00
Jeff Klukas 049342e440 KAFKA-3073: Add topic regex support for Connect sinks
There are more methods that had to be touched than I anticipated when writing [the KIP](https://cwiki.apache.org/confluence/display/KAFKA/KIP-215%3A+Add+topic+regex+support+for+Connect+sinks).

The implementation here is now complete and includes a test that verifies that there's a call to `consumer.subscribe(Pattern, RebalanceHandler)` when `topics.regex` is provided.

Author: Jeff Klukas <jeff@klukas.net>

Reviewers: Randall Hauch <rhauch@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #4151 from jklukas/connect-topics.regex
2017-11-21 16:01:16 -08:00
tedyu fd8eb268d6 KAFKA-6168: Connect Schema comparison is slow for large schemas
Re-arrange order of comparisons in equals() to evaluate non-composite fields first
Cache hash code

Author: tedyu <yuzhihong@gmail.com>

Reviewers: Randall Hauch <rhauch@gmail.com>, Konstantine Karantasis <konstantine@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #4176 from tedyu/trunk
2017-11-21 14:26:32 -08:00
Ewen Cheslack-Postava f0276f5ca2 MINOR: Log unexpected exceptions in Connect REST calls that generate 500s at a higher log level
The ConnectExceptionMapper was originally intended to handle ConnectException errors for some expected cases where we just want to always convert them to a certain response and the ExceptionMapper was the easiest way to do that uniformly across the API. However, in the case that it's not an expected subclass, we should log the information at the error level so the user can track down the cause of the error.

This is only an initial improvement. We should probably also add a more general ExceptionMapper to handle other exceptions we may not have caught and converted to ConnectException.

Author: Ewen Cheslack-Postava <me@ewencp.org>

Reviewers: Randall Hauch <rhauch@gmail.com>, Jason Gustafson <jason@confluent.io>

Closes #4227 from ewencp/better-connect-error-logging
2017-11-20 20:29:33 -08:00
sachinbhalekar 0026121543 KAFKA-6218: Optimize condition in if statement to reduce the number of comparisons
Changed the condition in **if** statement
**(schema.name() == null || !(schema.name().equals(LOGICAL_NAME)))** which
requires two comparisons in worst case with
**(!LOGICAL_NAME.equals(schema.name()))**  which requires single comparison
in all cases and _avoids null pointer exception.
![kafka_optimize_if](https://user-images.githubusercontent.com/32234013/32872271-afe0b954-ca3a-11e7-838d-6a3bc416b807.JPG)
_

Author: sachinbhalekar <sachinbansibhalekar@gmail.com>
Author: sachinbhalekar <32234013+sachinbhalekar@users.noreply.github.com>

Reviewers: Randall Hauch <rhauch@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #4225 from sachinbhalekar/trunk
2017-11-16 15:04:27 -08:00
Adem Efe Gencer 86062e9a78 KAFKA-6157; Fix repeated words words in JavaDoc and comments.
Author: Adem Efe Gencer <agencer@linkedin.com>

Reviewers: Jiangjie Qin <becket.qin@gmail.com>

Closes #4170 from efeg/bug/typoFix
2017-11-05 18:00:43 -08:00
Richard Yu 7fe88e8bd9 KAFKA-5212; Consumer ListOffsets request can starve group heartbeats
Author: Richard Yu <richardyu@Richards-Air.attlocal.net>

Reviewers: Jason Gustafson <jason@confluent.io>

Closes #4110 from ConcurrencyPractitioner/trunk
2017-10-30 11:31:36 -07:00
Konstantine Karantasis 5ec6765bdb KAFKA-6087: Scanning plugin.path needs to support relative symlinks.
Author: Konstantine Karantasis <konstantine@confluent.io>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #4092 from kkonstantine/KAFKA-6087-Scanning-plugin.path-needs-to-support-relative-symlinks
2017-10-19 14:24:57 -07:00
Ewen Cheslack-Postava 7d6ca52a27 MINOR: Push JMX metric name mangling into the JmxReporter (KIP-190 follow up)
Author: Ewen Cheslack-Postava <me@ewencp.org>

Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>, Ismael Juma <ismael@juma.me.uk>

Closes #3980 from ewencp/dont-mangle-names
2017-10-11 17:32:40 -04:00
Konstantine Karantasis 974d6fec93 KAFKA-5953: Register all jdbc drivers available in plugin and class paths
Author: Konstantine Karantasis <konstantine@confluent.io>

Reviewers: Randall Hauch <rhauch@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #4030 from kkonstantine/KAFKA-5953-Connect-classloader-isolation-may-be-broken-for-JDBC-drivers
2017-10-06 11:42:04 -07:00
Randall Hauch a47bfbcae0 KAFKA-5903: Added Connect metrics to the worker and distributed herder (KIP-196)
Added metrics to the Connect worker and rebalancing metrics to the distributed herder.

This is built on top of #3987, and I can rebase this PR once that is merged.

Author: Randall Hauch <rhauch@gmail.com>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #4011 from rhauch/kafka-5903
2017-10-05 11:23:11 -07:00
Randall Hauch 11afff0990 KAFKA-5990: Enable generation of metrics docs for Connect (KIP-196)
A new mechanism was added recently to the Metrics framework to make it easier to generate the documentation. It uses a registry with a MetricsNameTemplate for each metric, and then those templates are used when creating the actual metrics. The metrics framework provides utilities that can generate the HTML documentation from the registry of templates.

This change moves the recently-added Connect metrics over to use these templates and to then generate the metric documentation for Connect.

This PR is based upon #3975 and can be rebased once that has been merged.

Author: Randall Hauch <rhauch@gmail.com>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #3987 from rhauch/kafka-5990
2017-10-04 11:05:50 -07:00
Ismael Juma 2a1b39ef1b MINOR: Simplify log cleaner and fix compiler warnings
- Simplify LogCleaner.cleanSegments and add comment regarding thread
unsafe usage of `LogSegment.append`. This was a result of investigating
KAFKA-4972.
- Fix compiler warnings (in some cases use the fully qualified name as a
workaround for deprecation warnings in import statements).

Author: Ismael Juma <ismael@juma.me.uk>

Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>

Closes #4016 from ijuma/simplify-log-cleaner-and-fix-warnings
2017-10-04 18:55:46 +01:00
Randall Hauch 05357b7030 KAFKA-5902: Added sink task metrics (KIP-196)
Added Connect metrics specific to source tasks, and builds upon #3864 and #3911 that have already been merged into `trunk`, and #3959 that has yet to be merged.

I'll rebase this PR when the latter is merged.

Author: Randall Hauch <rhauch@gmail.com>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #3975 from rhauch/kafka-5902
2017-10-03 11:52:14 -07:00
Rajini Sivaram 021d8a8e96 KAFKA-5746; Add new metrics to support health checks (KIP-188)
Adds new metrics to support health checks:
1. Error rates for each request type, per-error code
2. Request size and temporary memory size
3. Message conversion rate and time
4. Successful and failed authentication rates
5. ZooKeeper latency and status
6. Client version

Author: Rajini Sivaram <rajinisivaram@googlemail.com>

Reviewers: Ismael Juma <ismael@juma.me.uk>

Closes #3705 from rajinisivaram/KAFKA-5746-new-metrics
2017-09-28 21:58:59 +01:00
Randall Hauch 89ba0c1525 KAFKA-5901: Added Connect metrics specific to source tasks (KIP-196)
Added Connect metrics specific to source tasks, and builds upon #3864 and #3911 that have already been merged into `trunk`.

Author: Randall Hauch <rhauch@gmail.com>

Reviewers: tedyu <yuzhihong@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #3959 from rhauch/kafka-5901
2017-09-28 09:53:19 -07:00
Konstantine Karantasis 1444b7b594 KAFKA-5867: Log Kafka Connect worker info during startup
Author: Konstantine Karantasis <konstantine@confluent.io>

Reviewers: Randall Hauch <rhauch@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #3932 from kkonstantine/KAFKA-5867-Kafka-Connect-applications-should-log-info-message-when-starting-up
2017-09-27 22:07:37 -07:00
Randall Hauch 73cc416664 KAFKA-5900: Add task metrics common to both sink and source tasks
Added metrics that are common to both sink and source tasks.

Marked as "**WIP**" since this PR is built upon #3864, and will need to be rebased once that has been merged into `trunk`. However, I would still appreciate initial reviews since this PR is largely additive.

Author: Randall Hauch <rhauch@gmail.com>

Reviewers: Konstantine Karantasis <konstantine@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #3911 from rhauch/kafka-5900
2017-09-26 22:23:37 -07:00
Randall Hauch f4a1ca347b KAFKA-5899: Added Connect metrics for connectors (KIP-196)
This PR is the first of several subtasks for [KAFKA-2376](https://issues.apache.org/jira/browse/KAFKA-2376) to add metrics to Connect worker processes. See that issue and [KIP-196 for details](https://cwiki.apache.org/confluence/display/KAFKA/KIP-196%3A+Add+metrics+to+Kafka+Connect+framework).

This PR adds metrics for each connector using Kafka’s existing `Metrics` framework. This is the first of several changes to add several groups of metrics, this change starts by adding a very simple `ConnectMetrics` object that is owned by each worker and that makes it easy to define multiple groups of metrics, called `ConnectMetricGroup` objects. Each metric group maps to a JMX MBean, and each metric within the group maps to an MBean attribute.

Future PRs will build upon this simple pattern to add metrics for source and sink tasks, workers, and worker rebalances.

Author: Randall Hauch <rhauch@gmail.com>

Reviewers: Konstantine Karantasis <konstantine@confluent.io>, Ewen Cheslack-Postava <ewencp@confluent.io>

Closes #3864 from rhauch/kafka-5899
2017-09-22 13:17:28 -07:00
Thibaud Chardonnens 7c988a3c8b KAFKA-5330: Use per-task converters in Connect
Instead of sharing the same converter instance within the worker, use a converter per task.

More details:
- https://github.com/confluentinc/schema-registry/issues/514
- https://issues.apache.org/jira/browse/KAFKA-5330

Author: Thibaud Chardonnens <Thibaud.Chardonnens@swisscom.com>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #3196 from tbcdns/KAFKA-5330
2017-09-21 20:12:08 -07:00
tedyu 552b170787 KAFKA-5657: Connect REST API should include the connector type when describing a connector (KIP-151)
Embed the type of connector in ConnectorInfo

Author: tedyu <yuzhihong@gmail.com>

Reviewers: Randall Hauch <rhauch@gmail.com>, Jason Gustafson <jason@confluent.io>, Konstantine Karantasis <konstantine@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #3812 from tedyu/trunk
2017-09-20 14:01:43 -07:00
Ismael Juma a3f068e22d MINOR: Update powermock and enable its tests when running with Java 9
Also:
1. Fix WorkerTest to use the correct `Mock` annotations. `org.easymock.Mock`
is not supported by PowerMock 2.x.
2. Rename `powermock` to `powermockJunit4` in `dependencies.gradle` for
clarity.

Author: Ismael Juma <ismael@juma.me.uk>

Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>

Closes #3881 from ijuma/kafka-5884-powermock-java
2017-09-18 10:49:12 +01:00
Ismael Juma ffd8f18a12 KAFKA-4501; Fix EasyMock and disable PowerMock tests under Java 9
- EasyMock 3.5 supports Java 9.

- Fixed issues in `testFailedSendRetryLogic` and
`testCreateConnectorAlreadyExists` exposed by new EasyMock
version. The former was passing `anyObject` to
`andReturn`, which doesn't make sense. This was leaving
behind a global `any` matcher, which caused a few issues in
the new version. Fixing this meant that the correlation ids had
to be updated to actually match. The latter was missing a
couple of expectations that the previous version of EasyMock
didn't catch.

- Removed unnecessary PowerMock dependency from 3 tests.

- Disabled remaining PowerMock tests when running with Java 9
until https://github.com/powermock/powermock/issues/783 is
in a release.

- Once we merge this PR, we can enable tests in the Java 9 builds
in Jenkins.

Author: Ismael Juma <ismael@juma.me.uk>

Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>

Closes #3845 from ijuma/kafka-4501-easymock-powermock-java-9
2017-09-13 18:18:54 +01:00
Andrey Dyachkov 9bb06c3eef KAFKA-5763; Use LogContext in NetworkClient, Selector and broker
Author: Andrey Dyachkov <andrey.dyachkov@gmail.com>

Reviewers: Jason Gustafson <jason@confluent.io>, Manikumar Reddy <manikumar.reddy@gmail.com>, Ismael Juma <ismael@juma.me.uk>

Closes #3761 from adyach/kafka-5763
2017-09-11 15:37:46 +01:00
oleg 5102576460 KAFKA-5756; Connect WorkerSourceTask synchronization issue on flush
Author: oleg <oleg@nexla.com>

Reviewers: Randall Hauch <rhauch@gmail.com>, Jason Gustafson <jason@confluent.io>

Closes #3702 from oleg-smith/KAFKA-5756
2017-09-06 11:12:23 -07:00
Jason Gustafson 6896f1ddb7 MINOR: Ensure consumer logging has clientId/groupId context
This patch ensures that the consumer groupId and clientId are available in all log messages which makes debugging much easier when a single application has multiple consumer instances. To make this easier, I've added a new `LogContext` object which builds a log prefix similar to the broker-side `kafka.utils.Logging` mixin. Additionally this patch changes the log level for a couple minor cases:

- Consumer wakeup events are now logged at DEBUG instead of TRACE
- Heartbeat enabling/disabling is now logged at DEBUG instead of TRACE

Author: Jason Gustafson <jason@confluent.io>

Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Ismael Juma <ismael@juma.me.uk>

Closes #3676 from hachikuji/log-consumer-wakeups
2017-08-19 11:17:02 -07:00
Konstantine Karantasis 72eacbea5b KAFKA-5567: Connect sink worker should commit offsets of original topic partitions
Author: Konstantine Karantasis <konstantine@confluent.io>

Reviewers: Randall Hauch <rhauch@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #3499 from kkonstantine/KAFKA-5567-With-transformations-that-mutate-the-topic-partition-committing-offsets-should-to-refer-to-the-original-topic-partition
2017-08-16 14:43:29 -07:00
Randall Hauch 3b1cea60e9 KAFKA-5731; Corrected how the sink task worker updates the last committed offsets
Prior to this change, it was possible for the synchronous consumer commit request to be handled before previously-submitted asynchronous commit requests. If that happened, the out-of-order handlers improperly set the last committed offsets, which then became inconsistent with the offsets the connector task is working with.

This change ensures that the last committed offsets are updated only for the most recent commit request, even if the consumer reorders the calls to the callbacks.

Author: Randall Hauch <rhauch@gmail.com>

Reviewers: Jason Gustafson <jason@confluent.io>

Closes #3662 from rhauch/kafka-5731
2017-08-15 14:16:00 -07:00
Ewen Cheslack-Postava a593db6a2b MINOR: Standardize logging of Worker-level messages from Tasks and Connectors
This ensures all logs have the connector/task ID, whether tasks are source or sink, and formats them consistently.

Author: Ewen Cheslack-Postava <me@ewencp.org>

Reviewers: Randall Hauch <rhauch@gmail.com>, Konstantine Karantasis <konstantine@confluent.io>, Jason Gustafson <jason@confluent.io>

Closes #3639 from ewencp/standardize-connector-task-logging
2017-08-09 09:07:27 -07:00
Randall Hauch 1a653c813c KAFKA-5704: Corrected Connect distributed startup behavior to allow older brokers to auto-create topics
When a Connect distributed worker starts up talking with broker versions 0.10.1.0 and later, it will use the AdminClient to look for the internal topics and attempt to create them if they are missing. Although the AdminClient was added in 0.11.0.0, the AdminClient uses APIs to create topics that existed in 0.10.1.0 and later. This feature works as expected when Connect uses a broker version 0.10.1.0 or later.

However, when a Connect distributed worker starts up using a broker older than 0.10.1.0, the AdminClient is not able to find the required APIs and thus will throw an UnsupportedVersionException. Unfortunately, this exception is not caught and instead causes the Connect worker to fail even when the topics already exist.

This change handles the UnsupportedVersionException by logging a debug message and doing nothing. The existing producer logic will get information about the topics, which will cause the broker to create them if they don’t exist and broker auto-creation of topics is enabled. This is the same behavior that existed prior to 0.11.0.0, and so this change restores that behavior for brokers older than 0.10.1.0.

This change also adds a system test that verifies Connect works with a variety of brokers and is able to run source and sink connectors. The test verifies that Connect can read from the internal topics when the connectors are restarted.

Author: Randall Hauch <rhauch@gmail.com>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #3641 from rhauch/kafka-5704
2017-08-08 20:20:41 -07:00
Ewen Cheslack-Postava 22611aca9b KAFKA-5535: Handle null values in ExtractField
Author: Ewen Cheslack-Postava <me@ewencp.org>

Reviewers: Randall Hauch <rhauch@gmail.com>, Konstantine Karantasis <konstantine@confluent.io>, Jason Gustafson <jason@confluent.io>

Closes #3559 from ewencp/kafka-5535-extract-field-null
2017-08-04 10:25:21 -07:00
Hooman Broujerdi da42977a00 MINOR: Added safe deserialization implementation
Author: Hooman Broujerdi <hoomanb@hoomanb.usersys.redhat.com>

Reviewers: Randall Hauch <rhauch@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #3563 from rhauch/deserialization-validation
2017-07-21 16:02:36 -07:00
Stephane Maarek ad6c53d89e KAFKA-5468: WorkerSourceTask offset commit loglevel changes
changed log level for source connector worker task when committing offsets

Author: Stephane Maarek <stephane@simplemachines.com.au>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #3367 from simplesteph/KAFKA-5468
2017-07-20 21:55:50 -07:00
Vahid Hashemian f87d58b796 MINOR: Code Cleanup
Clean up includes:

- Switching try-catch-finally blocks to try-with-resources when possible
- Removing some seemingly unnecessary `SuppressWarnings` annotations
- Resolving some Java warnings
- Closing unclosed Closable objects
- Removing unused code

Author: Vahid Hashemian <vahidhashemian@us.ibm.com>

Reviewers: Balint Molnar <balintmolnar91@gmail.com>, Guozhang Wang <wangguoz@gmail.com>, Matthias J. Sax <matthias@confluent.io>, Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>

Closes #3222 from vahidhashemian/minor/code_cleanup_1706
2017-07-19 10:51:28 -07:00
Jeremy Custenborder 6d7a81b478 KAFKA-5579: check for null
Author: Jeremy Custenborder <jcustenborder@gmail.com>

Reviewers: Randall Hauch <rhauch@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #3517 from jcustenborder/KAFKA-5579
2017-07-17 10:45:31 -07:00
Evgeny Veretennikov 0547a08254 MINOR: typo in variable name "unkownInfo"
Author: Evgeny Veretennikov <evg.veretennikov@gmail.com>

Reviewers: Apurva Mehta <apurva@confluent.io>, Ismael Juma <ismael@juma.me.uk>

Closes #3433 from evis/unkown_info_typo
2017-07-12 16:24:58 -07:00
Jeremy Custenborder ab8e9d1755 KAFKA-5548: Extended validation for SchemaBuilder methods.
More input validation for SchemaBuilder methods.

Author: Jeremy Custenborder <jcustenborder@gmail.com>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #3474 from jcustenborder/KAFKA-5548
2017-07-01 02:46:55 -07:00
Ewen Cheslack-Postava 96587f4b1f KAFKA-5475: Connector config validation should include fields for defined transformation aliases
Author: Ewen Cheslack-Postava <me@ewencp.org>

Reviewers: Konstantine Karantasis <konstantine@confluent.io>, Jason Gustafson <jason@confluent.io>

Closes #3399 from ewencp/kafka-5475-validation-transformations
2017-06-21 14:20:48 -07:00
Randall Hauch de982ba3fb KAFKA-5472: Eliminated duplicate group names when validating connector results
Kafka Connect was adding duplicate group names in the response from the REST API's validation of connector configurations. This fixes the duplicates and maintains the order of the `ConfigDef` objects so that the `ConfigValue` results are in the same order.

This is a blocker and should be merged to 0.11.0.

Author: Randall Hauch <rhauch@gmail.com>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #3379 from rhauch/KAFKA-5472
2017-06-20 17:48:32 -07:00
ppatierno 198a43d846 KAFKA-5412: Using connect-console-sink/source.properties raises an exception related to "file" property not found
Author: ppatierno <ppatierno@live.com>

Reviewers: Randall Hauch <rhauch@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #3279 from ppatierno/kafka-5412
2017-06-19 09:22:19 -07:00
Ismael Juma 0f60617fab KAFKA-5275; AdminClient API consistency
Author: Ismael Juma <ismael@juma.me.uk>

Reviewers: Colin P. Mccabe <cmccabe@confluent.io>, Jason Gustafson <jason@confluent.io>

Closes #3339 from ijuma/kafka-5275-admin-client-api-consistency
2017-06-15 02:05:41 +01:00
Ewen Cheslack-Postava b760615f3d KAFKA-5448: Change TimestampConverter configuration name to avoid conflicting with reserved 'type' configuration used by all Transformations
Author: Ewen Cheslack-Postava <me@ewencp.org>

Reviewers: David Tucker, Gwen Shapira

Closes #3342 from ewencp/kafka-5448-change-timestamp-converter-config-name
2017-06-14 15:24:32 -07:00
Ewen Cheslack-Postava 179d574a39 MINOR: Verify mocks in all WorkerTest tests and don't unnecessarily mockStatic the Plugins class
Author: Ewen Cheslack-Postava <me@ewencp.org>

Reviewers: Konstantine Karantasis <konstantine@confluent.io>, Randall Hauch <rhauch@gmail.com>, Ismael Juma <ismael@juma.me.uk>

Closes #3319 from ewencp/minor-worker-test-cleanup
2017-06-14 02:34:18 +01:00
Konstantine Karantasis 004dde9e7a MINOR: Add unit tests for PluginDesc in Connect.
Related to https://github.com/apache/kafka/pull/3321

Author: Konstantine Karantasis <konstantine@confluent.io>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #3326 from kkonstantine/MINOR-Add-tests-for-PluginDesc
2017-06-13 16:55:22 -07:00
Konstantine Karantasis 4b4102884c HOTFIX: Handle Connector version returning 'null' during plugin loading.
Author: Konstantine Karantasis <konstantine@confluent.io>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #3321 from kkonstantine/HOTFIX-Handle-null-version-returned-from-Connector-interface-during-plugin-loading
2017-06-13 14:40:07 -07:00
Nick Pillitteri d655d806ee KAFKA-4942: Fix commitTimeoutMs being set before the commit actually started
This fixes KAFKA-4942

This supersededs #2730

/cc simplesteph gwenshap ewencp

Author: Nick Pillitteri <nickp@smartertravelmedia.com>
Author: simplesteph <stephane.maarek@gmail.com>

Reviewers: simplesteph <stephane.maarek@gmail.com>, Konstantine Karantasis <konstantine@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #2912 from 56quarters/fix-connect-offset-commit
2017-06-06 14:22:39 -07:00
Randall Hauch af85e05b98 KAFKA-5164: Ensure SetSchemaMetadata updates key or value when Schema changes
When the `SetSchemaMetadata` SMT is used to change the name and/or version of the key or value’s schema, any references to the old schema in the key or value must be changed to reference the new schema. Only keys or values that are `Struct` have such references, and so currently only these are adjusted.

This is based on `trunk` since the fix is expected to be targeted to the 0.11.1 release.

Author: Randall Hauch <rhauch@gmail.com>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #3198 from rhauch/kafka-5164
2017-06-02 10:02:40 -07:00
Colin P. Mccabe b3036c5861 KAFKA-5293; Do not apply exponential backoff if users have overridden…
… reconnect.backoff.ms

Author: Colin P. Mccabe <cmccabe@confluent.io>

Reviewers: Ismael Juma <ismael@juma.me.uk>

Closes #3174 from cmccabe/KAFKA-5293
2017-06-01 21:21:58 +01:00
Konstantine Karantasis e0150a25e8 MINOR: Traverse plugin path recursively in Connect (KIP-146)
Author: Konstantine Karantasis <konstantine@confluent.io>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #3173 from kkonstantine/MINOR-Traverse-plugin-path-recursively-in-Connect
2017-06-01 02:07:53 -07:00
Ismael Juma 7311dcbc53 KAFKA-5291; AdminClient should not trigger auto creation of topics
- Added a boolean `allow_auto_topic_creation` to MetadataRequest and
bumped the protocol version to V4.

- When connecting to brokers older than 0.11.0.0, the `allow_auto_topic_creation`
field won't be considered, so we send a metadata request for all topics
to keep the behavior consistent.

- Set `allow_auto_topic_creation` to false in the new AdminClient and
StreamsKafkaClient (which exists for the purpose of creating topics
manually); set it to true everywhere else for now. Other clients will eventually
rely on client-side auto topic creation, but that’s not there yet.

- Add `allowAutoTopicCreation` field to `Metadata`, which is used by
`DefaultMetadataUpdater`. This is not strictly needed for the new
`AdminClient`, but it avoids surprises if it ever adds a topic to `Metadata`
via `setTopics` or `addTopic`.

Author: Ismael Juma <ismael@juma.me.uk>

Reviewers: Jun Rao <junrao@gmail.com>

Closes #3098 from ijuma/kafka-5291-admin-client-no-auto-topic-creation
2017-06-01 01:00:11 +01:00
Ewen Cheslack-Postava 61bab2d875 KAFKA-4714; TimestampConverter transformation (KIP-66)
Author: Ewen Cheslack-Postava <me@ewencp.org>

Reviewers: Konstantine Karantasis <konstantine@confluent.io>, Jason Gustafson <jason@confluent.io>

Closes #3065 from ewencp/kafka-3209-timestamp-converter
2017-05-19 11:26:59 -07:00
Dana Powers abe699176b KAFKA-3878; Support exponential backoff policy via reconnect.backoff.max (KIP-144)
Summary:
- add `reconnect.backoff.max.ms` common client configuration parameter
- if `reconnect.backoff.max.ms` > `reconnect.backoff.ms`, apply an exponential backoff policy
- apply +/- 20% random jitter to smooth cluster reconnects

Author: Dana Powers <dana.powers@gmail.com>

Reviewers: Ewen Cheslack-Postava <me@ewencp.org>, Roger Hoover <roger.hoover@gmail.com>, Ismael Juma <ismael@juma.me.uk>

Closes #1523 from dpkp/exp_backoff
2017-05-19 14:06:40 +01:00
Randall Hauch 56623efd73 KAFKA-4667: Connect uses AdminClient to create internal topics when needed (KIP-154)
The backing store for offsets, status, and configs now attempts to use the new AdminClient to look up the internal topics and create them if they don’t yet exist. If the necessary APIs are not available in the connected broker, the stores fall back to the old behavior of relying upon auto-created topics. Kafka Connect requires a minimum of Apache Kafka 0.10.0.1-cp1, and the AdminClient can work with all versions since 0.10.0.0.

All three of Connect’s internal topics are created as compacted topics, and new distributed worker configuration properties control the replication factor for all three topics and the number of partitions for the offsets and status topics; the config topic requires a single partition and does not allow it to be set via configuration. All of these new configuration properties have sensible defaults, meaning users can upgrade without having to change any of the existing configurations. In most situations, existing Connect deployments will have already created the storage topics prior to upgrading.

The replication factor defaults to 3, so anyone running Kafka clusters with fewer nodes than 3 will receive an error unless they explicitly set the replication factor for the three internal topics. This is actually desired behavior, since it signals the users that they should be aware they are not using sufficient replication for production use.

The integration tests use a cluster with a single broker, so they were changed to explicitly specify a replication factor of 1 and a single partition.

The `KafkaAdminClientTest` was refactored to extract a utility for setting up a `KafkaAdminClient` with a `MockClient` for unit tests.

Author: Randall Hauch <rhauch@gmail.com>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #2984 from rhauch/kafka-4667
2017-05-18 16:02:29 -07:00
Konstantine Karantasis 45f2261763 KAFKA-3487: Support classloading isolation in Connect (KIP-146)
Author: Konstantine Karantasis <konstantine@confluent.io>

Reviewers: Randall Hauch <rhauch@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #3028 from kkonstantine/KAFKA-3487-Support-classloading-isolation-in-Connect
2017-05-18 10:39:15 -07:00
Ewen Cheslack-Postava 1cea4d8f5a KAFKA-4714; Flatten and Cast single message transforms (KIP-66)
Author: Ewen Cheslack-Postava <me@ewencp.org>

Reviewers: Konstantine Karantasis <konstantine@confluent.io>, Shikhar Bhushan <shikhar@confluent.io>, Jason Gustafson <jason@confluent.io>

Closes #2458 from ewencp/kafka-3209-even-more-transforms
2017-05-16 23:05:35 -07:00
Ewen Cheslack-Postava e3892c29c3 MINOR: Handle nulls in NonEmptyListValidator
Author: Ewen Cheslack-Postava <me@ewencp.org>

Reviewers: Ismael Juma <ismael@juma.me.uk>

Closes #3045 from ewencp/minor-non-empty-list-validator-nulls
2017-05-14 18:33:05 +01:00