Commit Graph

147 Commits

Author SHA1 Message Date
Jeremy Custenborder 9f4fe7679e KAFKA-5550; Connect Struct.put() should include the field name if validation fails (#3507)
Changed call to use the overload of ConnectSchema.validateValue() method with the field name passed in. Ensure that field in put call is not null.

Reviewers: Randall Hauch <rhauch@gmail.com>, Jason Gustafson <jason@confluent.io>
2018-02-12 09:16:25 -08:00
Randall Hauch 976a3b0cc8 KAFKA-6513: Corrected how Converters and HeaderConverters are instantiated and configured
The commits for KIP-145 (KAFKA-5142) changed how the Connect workers instantiate and configure the Converters, and also added the ability to do the same for the new HeaderConverters. However, the last few commits removed the default value for the `converter.type` property for Converters and HeaderConverters, and this broke how the internal converters were being created.

This change corrects the behavior so that the `converter.type` property is always set by the worker (or by the Plugins class), which means the existing Converter implementations will not have to do this. The built-in JsonConverter, ByteArrayConverter, and StringConverter also implement HeaderConverter which implements Configurable, but the Worker and Plugins methods do not yet use the `Configurable.configure(Map)` method and instead still use the `Converter.configure(Map,boolean)`.

Several tests were modified, and a new PluginsTest was added to verify the new behavior in Plugins for instantiating and configuring the Converter and HeaderConverter instances.

Author: Randall Hauch <rhauch@gmail.com>

Reviewers: Konstantine Karantasis <konstantine@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #4512 from rhauch/kafka-6513
2018-02-09 15:47:44 -08:00
Gunnar Morling f9b56d680b KAFKA-6515; Adding toString() method to o.a.k.connect.data.Field (#4509) 2018-02-03 13:27:02 -08:00
Randall Hauch 4c48942f9d KAFKA-5142: Add Connect support for message headers (KIP-145)
**[KIP-145](https://cwiki.apache.org/confluence/display/KAFKA/KIP-145+-+Expose+Record+Headers+in+Kafka+Connect) has been accepted, and this PR implements KIP-145 except without the SMTs.**

Changed the Connect API and runtime to support message headers as described in [KIP-145](https://cwiki.apache.org/confluence/display/KAFKA/KIP-145+-+Expose+Record+Headers+in+Kafka+Connect).

The new `Header` interface defines an immutable representation of a Kafka header (key-value pair) with support for the Connect value types and schemas. This interface provides methods for easily converting between many of the built-in primitive, structured, and logical data types.

The new `Headers` interface defines an ordered collection of headers and is used to track all headers associated with a `ConnectRecord` (and thus `SourceRecord` and `SinkRecord`). This does allow multiple headers with the same key. The `Headers` contains methods for adding, removing, finding, and modifying headers. Convenience methods allow connectors and transforms to easily use and modify the headers for a record.

A new `HeaderConverter` interface is also defined to enable the Connect runtime framework to be able to serialize and deserialize headers between the in-memory representation and Kafka’s byte[] representation. A new `SimpleHeaderConverter` implementation has been added, and this serializes to strings and deserializes by inferring the schemas (`Struct` header values are serialized without the schemas, so they can only be deserialized as `Map` instances without a schema.) The `StringConverter`, `JsonConverter`, and `ByteArrayConverter` have all been extended to also be `HeaderConverter` implementations. Each connector can be configured with a different header converter, although by default the `SimpleHeaderConverter` is used to serialize header values as strings without schemas.

Unit and integration tests are added for `ConnectHeader` and `ConnectHeaders`, the two implementation classes for headers. Additional test methods are added for the methods added to the `Converter` implementations. Finally, the `ConnectRecord` object is already used heavily, so only limited tests need to be added while quite a few of the existing tests already cover the changes.

Author: Randall Hauch <rhauch@gmail.com>

Reviewers: Arjun Satish <arjun@confluent.io>, Ted Yu <yuzhihong@gmail.com>, Magesh Nandakumar <magesh.n.kumar@gmail.com>, Konstantine Karantasis <konstantine@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #4319 from rhauch/kafka-5142-b
2018-01-31 10:40:24 -08:00
Scott d706c87cb2 MINOR: a single doc typo (#2969) 2018-01-26 11:14:57 -08:00
Gunnar Morling 8cd563d478 KAFKA-6456; JavaDoc clarification for Connect SourceTask#poll() (#4432)
Making clear that implementations of poll() shouldn't block indefinitely in order to allow the task instance to transition to PAUSED state.

Reviewers:  Randall Hauch <rhauch@gmail.com>, Jason Gustafson <jason@confluent.io>
2018-01-18 09:56:22 -08:00
Koen De Groote 96df93522f Replace Arrays.asList with Collections.singletonList where possible (#4368)
Reviewers: Ismael Juma <ismael@juma.me.uk>
2018-01-01 13:07:32 +00:00
Koen De Groote 2bc780b959 MINOR: Use EnumMap/EnumSet if possible (#3919)
They are more efficient than HashMap/HashSet.

Reviewers: Ismael Juma <ismael@juma.me.uk>
2017-12-29 12:39:18 +00:00
Tobias Gies 68712dcdec KAFKA-6308; Connect Struct should use deepEquals/deepHashCode
This changes the Struct's equals and hashCode method to use Arrays#deepEquals and Arrays#deepHashCode, respectively. This resolves a problem where two structs with values of type byte[] would not be considered equal even though the byte arrays' contents are equal. By using deepEquals, the byte arrays' contents are compared instead of ther identity.

Since this changes the behavior of the equals method for byte array values, the behavior of hashCode must change alongside it to ensure the methods still fulfill the general contract of "equal objects must have equal hashCodes".

Test rationale:
All existing unit tests for equals were untouched and continue to work. A new test method was added to verify the behavior of equals and hashCode for Struct instances that contain a byte array value. I verify the reflixivity and transitivity of equals as well as the fact that equal Structs have equal hashCodes
and not-equal structs do not have equal hashCodes.

Author: Tobias Gies <tobias.gies@trivago.com>
Author: Tobias Gies <tobias@tobiasgies.de>

Reviewers: Randall Hauch <rhauch@gmail.com>, Jason Gustafson <jason@confluent.io>

Closes #4293 from tobiasgies/feature/kafka-6308-deepequals
2017-12-14 15:26:20 -08:00
Jeff Klukas 049342e440 KAFKA-3073: Add topic regex support for Connect sinks
There are more methods that had to be touched than I anticipated when writing [the KIP](https://cwiki.apache.org/confluence/display/KAFKA/KIP-215%3A+Add+topic+regex+support+for+Connect+sinks).

The implementation here is now complete and includes a test that verifies that there's a call to `consumer.subscribe(Pattern, RebalanceHandler)` when `topics.regex` is provided.

Author: Jeff Klukas <jeff@klukas.net>

Reviewers: Randall Hauch <rhauch@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #4151 from jklukas/connect-topics.regex
2017-11-21 16:01:16 -08:00
tedyu fd8eb268d6 KAFKA-6168: Connect Schema comparison is slow for large schemas
Re-arrange order of comparisons in equals() to evaluate non-composite fields first
Cache hash code

Author: tedyu <yuzhihong@gmail.com>

Reviewers: Randall Hauch <rhauch@gmail.com>, Konstantine Karantasis <konstantine@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #4176 from tedyu/trunk
2017-11-21 14:26:32 -08:00
sachinbhalekar 0026121543 KAFKA-6218: Optimize condition in if statement to reduce the number of comparisons
Changed the condition in **if** statement
**(schema.name() == null || !(schema.name().equals(LOGICAL_NAME)))** which
requires two comparisons in worst case with
**(!LOGICAL_NAME.equals(schema.name()))**  which requires single comparison
in all cases and _avoids null pointer exception.
![kafka_optimize_if](https://user-images.githubusercontent.com/32234013/32872271-afe0b954-ca3a-11e7-838d-6a3bc416b807.JPG)
_

Author: sachinbhalekar <sachinbansibhalekar@gmail.com>
Author: sachinbhalekar <32234013+sachinbhalekar@users.noreply.github.com>

Reviewers: Randall Hauch <rhauch@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #4225 from sachinbhalekar/trunk
2017-11-16 15:04:27 -08:00
Jeremy Custenborder 6d7a81b478 KAFKA-5579: check for null
Author: Jeremy Custenborder <jcustenborder@gmail.com>

Reviewers: Randall Hauch <rhauch@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #3517 from jcustenborder/KAFKA-5579
2017-07-17 10:45:31 -07:00
Jeremy Custenborder ab8e9d1755 KAFKA-5548: Extended validation for SchemaBuilder methods.
More input validation for SchemaBuilder methods.

Author: Jeremy Custenborder <jcustenborder@gmail.com>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #3474 from jcustenborder/KAFKA-5548
2017-07-01 02:46:55 -07:00
Ewen Cheslack-Postava 1cea4d8f5a KAFKA-4714; Flatten and Cast single message transforms (KIP-66)
Author: Ewen Cheslack-Postava <me@ewencp.org>

Reviewers: Konstantine Karantasis <konstantine@confluent.io>, Shikhar Bhushan <shikhar@confluent.io>, Jason Gustafson <jason@confluent.io>

Closes #2458 from ewencp/kafka-3209-even-more-transforms
2017-05-16 23:05:35 -07:00
Vitaly Pushkar 54bf2fb5ff KAFKA-4810: Make Kafka Connect SchemaBuilder more lax about checking that fields are unset
https://issues.apache.org/jira/browse/KAFKA-4810

> Currently SchemaBuilder is strict when checking that certain fields have not been set yet (e.g. version, name, doc). It just checks that the field is null. This is intended to protect the user from buggy code that overwrites a field with different values, but it's a bit too strict currently. In generic code for converting schemas (e.g. Converters) you will sometimes initialize a builder with these values (e.g. because you get a SchemaBuilder for a logical type, which sets name & version), but then have generic code for setting name & version from the source schema.

Changed the validation method to not only check if a field is null but also to check if the new value that is being set is the same as the current value of the field.
ewencp

Author: Vitaly Pushkar <vitaly.pushkar@gmail.com>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #2806 from vitaly-pushkar/KAFKA-4810-schema-builder-default-fields-validation
2017-04-04 14:58:45 -07:00
Balint Molnar 75e213e550 KAFKA-4855: Struct SchemaBuilder should not allow duplicate fields
ewencp can you please review.

Author: Balint Molnar <balintmolnar91@gmail.com>

Reviewers: Gwen Shapira <cshapi@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #2732 from baluchicken/KAFKA-4855
2017-04-03 20:07:47 -07:00
Colin P. Mccabe a7e3679d22 KAFKA-4924: Fix Kafka Connect API findbugs warnings
Author: Colin P. Mccabe <cmccabe@confluent.io>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #2715 from cmccabe/KAFKA-4924
2017-03-21 16:48:44 -07:00
Sachin Mittal 197a5d5a6d KAFKA-4848: Fix retryWithBackoff deadlock issue
Fixes related to handling of MAX_POLL_INTERVAL_MS_CONFIG during deadlock and CommitFailedException on partition revoked.

Author: Sachin Mittal <sjmittal@gmail.com>

Reviewers: Matthias J. Sax, Damian Guy, Guozhang Wang

Closes #2642 from sjmittal/trunk
2017-03-20 21:56:15 -07:00
Matthias J. Sax d0e436c471 MINOR: improve license header check by providing head file instead of (prefix) header regex
Author: Matthias J. Sax <matthias@confluent.io>

Reviewers: Jason Gustafson <jason@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #2303 from mjsax/licenseHeader
2017-02-28 12:35:04 -08:00
Aegeaner 6c839395b7 KAFKA-4709:Error message from Struct.validate() should include the name of the offending field.
https://issues.apache.org/jira/browse/KAFKA-4709

Author: Aegeaner <xihuke@gmail.com>

Reviewers: Dong Lin, Guozhang Wang

Closes #2521 from Aegeaner/KAFKA-4709
2017-02-16 13:44:08 -08:00
Balint Molnar 84323eea23 KAFKA-4679: Remove unstable markers from Connect APIs
ewencp ignore this PR if you are already started to work on this ticket.

Author: Balint Molnar <balintmolnar91@gmail.com>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #2423 from baluchicken/KAFKA-4679

(cherry picked from commit 1434b61d5d)
Signed-off-by: Ewen Cheslack-Postava <me@ewencp.org>
2017-01-27 19:58:22 -08:00
Shikhar Bhushan 2f90488323 KAFKA-3209: KIP-66: single message transforms
Besides API and runtime changes, this PR also includes 2 data transformations (`InsertField`, `HoistToStruct`) and 1 routing transformation (`TimestampRouter`).

There is some gnarliness in `ConnectorConfig` / `ConfigDef` around creating, parsing and validating a dynamic `ConfigDef`.

Author: Shikhar Bhushan <shikhar@confluent.io>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #2299 from shikhar/smt-2017
2017-01-12 16:14:53 -08:00
Ewen Cheslack-Postava a565a77b1f KAFKA-4404; Add javadocs to document core Connect types, especially that integer types are signed
Author: Ewen Cheslack-Postava <me@ewencp.org>

Reviewers: Konstantine Karantasis <konstantine@confluent.io>, Jason Gustafson <jason@confluent.io>

Closes #2296 from ewencp/kafka-4404-document-connect-signed-integer-types
2017-01-03 11:02:20 -08:00
Shikhar Bhushan b45a67ede9 KAFKA-4161: KIP-89: Allow sink connectors to decouple flush and offset commit
Author: Shikhar Bhushan <shikhar@confluent.io>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #2139 from shikhar/kafka-4161-deux
2016-12-01 15:01:09 -08:00
Ismael Juma d092673838 MINOR: A bunch of clean-ups related to usage of unused variables
There should be only one cases where these clean-ups have a functional impact: replaced repeated identical logs with a single log for the stale controller epoch case.

The rest should just make the code easier to read and make it a bit less wasteful. I did this exercise because unused variables sometimes mask bugs.

Author: Ismael Juma <ismael@juma.me.uk>

Reviewers: Jason Gustafson <jason@confluent.io>

Closes #1985 from ijuma/remove-unused
2016-10-25 02:55:55 +01:00
Andrew Stevenson ed50769234 MINOR: Check for null timestamp rather than value in hashcode
Author: Andrew Stevenson <andrew@datamountaineer.com>

Reviewers: Shikhar Bhushan <shikhar@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #2055 from andrewstevenson/kafka-4334
2016-10-23 22:32:03 -07:00
Shikhar Bhushan d7bffebca0 KAFKA-4173; SchemaProjector should successfully project missing Struct field when target field is optional
Author: Shikhar Bhushan <shikhar@confluent.io>

Reviewers: Konstantine Karantasis <konstantine@confluent.io>, Jason Gustafson <jason@confluent.io>

Closes #1865 from shikhar/kafka-4173
2016-09-16 15:54:33 -07:00
Shikhar Bhushan b91eeac943 KAFKA-4100: Ensure 'fields' and 'fieldsByName' are not null for Struct schemas
Author: Shikhar Bhushan <shikhar@confluent.io>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #1800 from shikhar/kafka-4100
2016-08-29 19:08:52 -07:00
Shikhar Bhushan 08ad2be0da KAFKA-4070: implement Connect Struct.toString()
Author: Shikhar Bhushan <shikhar@confluent.io>

Reviewers: Gwen Shapira

Closes #1790 from shikhar/add-struct-tostring
2016-08-25 19:25:36 -07:00
Shikhar Bhushan 44ad7b574e KAFKA-3846: KIP-65: include timestamp in Connect record types
https://cwiki.apache.org/confluence/display/KAFKA/KIP-65%3A+Expose+timestamps+to+Connect

Author: Shikhar Bhushan <shikhar@confluent.io>

Reviewers: Jason Gustafson <jason@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #1537 from shikhar/kafka-3846
2016-06-30 13:59:31 -07:00
Rollulus 4544ee4487 KAFKA-3864: make field.get return field's default value when needed
And not the containing struct's default value.

The contribution is my original work and that I license the work to the project under the project's open source license.

ewencp

Author: Rollulus <roelboel@xs4all.nl>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #1528 from rollulus/kafka-3864
2016-06-20 12:30:27 -07:00
Liquan Pei bd8681cdd5 KAFKA-3690: Avoid to pass null to UnmodifiableMap
Author: Liquan Pei <liquanpei@gmail.com>

Reviewers: Jason Gustafson <jason@confluent.io>, Ismael Juma <ismael@juma.me.uk>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #1360 from Ishiihara/avoid-to-pass-null
2016-05-11 13:06:20 -07:00
Rajini Sivaram 9d71489ff0 KAFKA-3548: Use root locale for case transformation of constant strings
For enums and other constant strings, use locale independent case conversions to enable comparisons to work regardless of the default locale.

Author: Rajini Sivaram <rajinisivaram@googlemail.com>

Reviewers: Manikumar Reddy, Ismael Juma, Guozhang Wang, Gwen Shapira

Closes #1220 from rajinisivaram/KAFKA-3548
2016-04-20 18:54:30 -07:00
Ishita Mandhan 0bf61039c8 MINOR: Fix typos in code comments
This patch fixes all occurances of two consecutive 'the's in the code comments.

Author: Ishita Mandhan (imandhaus.ibm.com)

Author: Ishita Mandhan <imandha@us.ibm.com>

Reviewers: Guozhang Wang <wangguoz@gmail.com>

Closes #1240 from imandhan/typofixes
2016-04-19 17:39:04 -07:00
Liquan Pei c07d017227 KAFKA-3315: Add REST and Connector API to expose connector configuration
Author: Liquan Pei <liquanpei@gmail.com>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #964 from Ishiihara/expose-connector-config
2016-03-17 13:26:02 -07:00
Jeremy Custenborder 6eacc0de30 KAFKA-3260 - Added SourceTask.commitRecord
Added commitRecord(SourceRecord record) to SourceTask. This method is called during the callback from producer.send() when the message has been sent successfully. Added commitTaskRecord(SourceRecord record) to WorkerSourceTask to handle calling commitRecord on the SourceTask. Updated tests for calls to commitRecord.

Author: Jeremy Custenborder <jcustenborder@gmail.com>

Reviewers: Jason Gustafson <jason@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #950 from jcustenborder/KAFKA-3260
2016-03-15 14:32:22 -07:00
Ismael Juma 241c3ebb28 KAFKA-3375; Suppress deprecated warnings where reasonable and tweak compiler settings
* Fix and suppress number of unchecked warnings (except for Kafka Streams)
* Add `SafeVarargs` annotation to fix warnings
* Suppress unfixable deprecation warnings
* Replace deprecated by non-deprecated usage where possible
* Avoid reflective calls via structural types in Scala
* Tweak compiler settings for scalac and javac

Once we drop Java 7 and Scala 2.10, we can tweak the compiler settings further so that they warn us about more things.

Author: Ismael Juma <ismael@juma.me.uk>

Reviewers: Grant Henke, Gwen Shapira, Guozhang Wang

Closes #1042 from ijuma/kafka-3375-suppress-depreccated-tweak-compiler
2016-03-14 19:14:36 -07:00
Jason Gustafson f7d019ed40 KAFKA-3093: Add Connect status tracking API
Author: Jason Gustafson <jason@confluent.io>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #920 from hachikuji/KAFKA-3093
2016-02-23 22:47:31 -08:00
Jason Gustafson 1d80f563bc KAFKA-3092: Replace SinkTask onPartitionsAssigned/onPartitionsRevoked with open/close
Author: Jason Gustafson <jason@confluent.io>

Reviewers: Liquan Pei <liquanpei@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #815 from hachikuji/KAFKA-3092
2016-02-03 11:28:58 -08:00
Jason Gustafson 6dc9743125 KAFKA-2886: Handle sink task rebalance failures by stopping worker task
Author: Jason Gustafson <jason@confluent.io>

Reviewers: Liquan Pei <liquanpei@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>

Closes #767 from hachikuji/KAFKA-2886
2016-01-15 09:28:43 -08:00
manasvigupta a0d21407cb KAFKA-3009; Disallow star imports
Summary of code changes
------------------------------------
1) Added a new Checkstyle rule to flag any code using star imports
2) Fixed ALL existing code violations using star imports

Testing
-----------
Local build was successful
ALL JUnits ran successfully on local.

ewencp - Request you to please review changes. Thank you !

I state that the contribution is my original work and I license the work to the project under the project's open source license.

Author: manasvigupta <manasvigupta@yahoo.co.in>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #700 from manasvigupta/KAFKA-3009
2015-12-21 13:30:59 -08:00
Edward Ribeiro 6e5bd2497a KAFKA-2974; `==` is used incorrectly in a few places in Java code
A few issues found via static analysis.

Author: Edward Ribeiro <edward.ribeiro@gmail.com>
Author: Ismael Juma <ismael@juma.me.uk>

Reviewers: Gwen Shapira, Sriharsha Chintalapani, Guozhang Wang

Closes #652 from ijuma/use-equals-instead-of-==
2015-12-09 20:34:09 -08:00
Ewen Cheslack-Postava 75c7abd826 KAFKA-2906: Fix Connect javadocs, restrict only to api subproject, and clean up javadoc warnings.
Author: Ewen Cheslack-Postava <me@ewencp.org>

Reviewers: Gwen Shapira

Closes #599 from ewencp/kafka-2906-connect-javadocs
2015-11-30 05:26:32 +08:00
Ewen Cheslack-Postava 8db55618d5 KAFKA-2752: Add VerifiableSource/Sink connectors and rolling bounce Copycat system tests.
Author: Ewen Cheslack-Postava <me@ewencp.org>

Reviewers: Ben Stopford, Geoff Anderson, Guozhang Wang

Closes #432 from ewencp/kafka-2752-copycat-clean-bounce-test
2015-11-10 14:54:15 -08:00
Ewen Cheslack-Postava bc76e6704e KAFKA-2775: Move exceptions into API package for Kafka Connect.
Author: Ewen Cheslack-Postava <me@ewencp.org>

Reviewers: Gwen Shapira

Closes #457 from ewencp/kafka-2775-exceptions-in-api-package
2015-11-09 10:27:18 -08:00
Ewen Cheslack-Postava f2031d4063 KAFKA-2774: Rename Copycat to Kafka Connect
Author: Ewen Cheslack-Postava <me@ewencp.org>

Reviewers: Gwen Shapira

Closes #456 from ewencp/kafka-2774-rename-copycat
2015-11-08 22:11:03 -08:00