Autogenerate docs for the Consumer Fetcher's metrics. This is a smaller subset of the original PR https://github.com/apache/kafka/pull/1202.
CC ijuma benstopford hachikuji
Author: James Cheng <jylcheng@yahoo.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Guozhang Wang <wangguoz@gmail.com>
Closes #2993 from wushujames/fetcher_metrics_docs
**JIRA ticket:** [KAFKA-5081 two versions of jackson-annotations-xxx.jar in distribution tgz](https://issues.apache.org/jira/browse/KAFKA-5081)
**Solutions:**
1. accept this merge request **_OR_**
2. upgrade jackson libraries to version **_2.9.x_** (currently available as a pre-release only)
**Related jackson issue:** [Add explicit `jackson-annotations` dependency version for `jackson-databind`](https://github.com/FasterXML/jackson-databind/issues/1545)
**Note:** the previous (equivalent) merge request #2900 got buried under a swarm of messages caused by a flaky test, so I opted to close it and open this one.
ijuma: FYI
Author: dejan2609 <dejan2609@gmail.com>
Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>
Closes #3116 from dejan2609/KAFKA-5081
As per KIP-82
Adding record headers api to ProducerRecord, ConsumerRecord
Added support for converting from protocol to API in Kafka Producer and Kafka Fetcher (Consumer)
Updated MirrorMaker, ConsoleConsumer and scala BaseConsumer
Add RecordHeaders and RecordHeader implementation of the interfaces Headers and Header
Some bits are reverted to being Java 7 compatible for the moment, until KIP-118 is implemented.
Author: Michael Andre Pearce <Michael.Andre.Pearce@me.com>
Reviewers: Radai Rosenblatt <radai.rosenblatt@gmail.com>, Jiangjie Qin <becket.qin@gmail.com>, Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>
Closes #2772 from michaelandrepearce/KIP-82
Worth special mention:
1. Update Scala to 2.11.11 and 2.12.2
2. Update Gradle to 3.5
3. Update ZooKeeper to 3.4.10
4. Update reflections to 0.9.11, which:
* Switches to jsr305 annotations with a provided scope
* Updates Guava from 18 to 20
* Updates javassist from 3.18 to 3.21
There’s a separate PR for updating RocksDb, so
I didn’t include that here.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Jason Gustafson <jason@confluent.io>
Closes #2872 from ijuma/update-deps-for-0.11
This uses JUnit Categories to identify integration tests. Adds 2 new build targets:
`integrationTest` and `unitTest`.
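A rough sketch of how two such targets can be wired up in Gradle, assuming the tests are tagged with a JUnit 4 category. The category class name `org.example.IntegrationTest` is a placeholder and is not taken from this change. The snippet also assumes a Gradle version of this era, where custom `Test` tasks inherit the default test classes and classpath from the `java` plugin.

```groovy
apply plugin: 'java'

// Hypothetical marker interface; this change does not name the category class here.
def integrationCategory = 'org.example.IntegrationTest'

// Runs only the tests annotated with the integration category.
task integrationTest(type: Test) {
    useJUnit {
        includeCategories integrationCategory
    }
}

// Runs everything except the tests annotated with the integration category.
task unitTest(type: Test) {
    useJUnit {
        excludeCategories integrationCategory
    }
}
```

With something like this in place, `./gradlew unitTest` and `./gradlew integrationTest` select the two groups separately.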
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Eno Thereska <eno@confluent.io>, Ewen Cheslack-Postava <ewen@confluent.io>, Ismael Juma <ismael@juma.me.uk>
Closes #2695 from dguy/junit-categories
Author: Damian Guy <damian.guy@gmail.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Ewen Cheslack-Postava <me@ewencp.org>, Jason Gustafson <jason@confluent.io>, Ismael Juma <ismael@juma.me.uk>
Closes #2594 from dguy/checkstyle
Generate core project with correct source folders. In addition
set output folders same as command line build. Don't generate
unnecessary projects.
Author: Dhwani Katagade <dhwani_katagade@persistent.com>
Reviewers: Edoardo Comar <ecomar@uk.ibm.com>, Guozhang Wang <wangguoz@gmail.com>, Ismael Juma <ismael@juma.me.uk>
Closes #2382 from dhwanikatagade/gradle_eclipse_plugin_path_fix
Renames `HoistToStruct` SMT to `HoistField`.
Adds the following SMTs:
`ExtractField`
`MaskField`
`RegexRouter`
`ReplaceField`
`SetSchemaMetadata`
`ValueToKey`
Adds HTML doc generation and updates to `connect.html`.
Author: Shikhar Bhushan <shikhar@confluent.io>
Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>
Closes #2374 from shikhar/more-smt
Besides API and runtime changes, this PR also includes 2 data transformations (`InsertField`, `HoistToStruct`) and 1 routing transformation (`TimestampRouter`).
There is some gnarliness in `ConnectorConfig` / `ConfigDef` around creating, parsing and validating a dynamic `ConfigDef`.
Author: Shikhar Bhushan <shikhar@confluent.io>
Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>
Closes #2299 from shikhar/smt-2017
dguy guozhangwang This is a new PR for KAFKA-4060.
Author: Hojjat Jafarpour <hojjat@Hojjat-Jafarpours-MBP.local>
Author: Hojjat Jafarpour <hojjat@HojjatJpoursMBP.attlocal.net>
Reviewers: Damian Guy, Matthias J. Sax, Ismael Juma, Guozhang Wang
Closes #1884 from hjafarpour/KAFKA-4060-Remove-ZkClient-dependency-in-Kafka-Streams-new
There were a couple of important issues fixed in Gradle 3.2.1:
* [GRADLE-3582] - Gradle wrapper fails to escape arguments with nested quotes
* [GRADLE-3583] - Newlines in JAVA_OPTS breaks application plugin shell script in Gradle 3.2
And a lot of important issues fixed in Scala 2.12.1:
* http://www.scala-lang.org/news/2.12.1
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Ewen Cheslack-Postava <me@ewencp.org>
Closes #2216 from ijuma/gradle-3.2.1-and-scala-2.12.1
We suspect that the test suite hangs we have been seeing are
due to PermGen exhaustion. It is a common reason for
hard JVM lock-ups.
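For illustration only, one way to give the forked test JVMs more PermGen in Gradle. The sizes below are placeholders rather than the values this change uses, and `-XX:MaxPermSize` only has an effect on Java 7 and earlier.

```groovy
apply plugin: 'java'

test {
    // Placeholder sizes, not the values from this change.
    maxHeapSize = '2g'
    // Raise the PermGen ceiling for the forked test JVM (ignored on Java 8+).
    jvmArgs '-XX:MaxPermSize=512m'
}
```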
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Jason Gustafson <jason@confluent.io>
Closes #1926 from ijuma/test-jvm-params
Also upgrade scoverage (required for compatibility) and remove usage of
`useAnt` which doesn't exist in Gradle 3.0
It turns out that one cannot even run `gradle` to download the project Gradle version if `useAnt` is used in the build. This is unfortunate (the SBT launcher has much saner behaviour).
Release notes: https://docs.gradle.org/3.0/release-notes
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Grant Henke <granthenke@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>, Sriharsha Chintalapani <harsha@hortonworks.com>
Closes #1774 from ijuma/kafka-4082-support-gradle-3.0
ijuma said that it would make sense to split out this work from KAFKA-3234, since KAFKA-3234 had both a mechanical change (generating docs) as well as a change requiring discussion (deprecating/renaming config options).
jjkoshy, I hope you don't mind that I took over this work. It's been 3 months since the last activity on KAFKA-3234, so I thought it would be okay to take over.
This work is essentially the first 5-6 commits from Joel's https://github.com/apache/kafka/pull/907. However, since I'm not very experienced with git, I didn't do a direct merge/rebase, but instead largely hand-merged it. I did some minor cleanup. All credit goes to Joel, all blame goes to me. :)
For reference, I attached the auto-generated configuration.html file (as a PDF, because github won't let me attach html).
[configuration.pdf](https://github.com/apache/kafka/files/323901/configuration.pdf)
This is my first time writing Scala, so let me know if there are any changes needed.
I don't know who is the right person to review this. ijuma, can you help me redirect this to the appropriate person? Thanks.
Author: James Cheng <jylcheng@yahoo.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Joel Koshy <jjkoshy@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes #1527 from wushujames/generate_topic_docs
This is a regression caused by 0bb1d3ae.
After that commit, Streams no longer has a direct dependency on slf4j-log4j12, but zkclient
has a dependency on an older version of slf4j-log4j12, so we get a transitive dependency on
the older version.
The fix is to simply exclude the undesired dependencies from the zkclient dependency.
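A minimal sketch of what such an exclusion looks like in Gradle. The zkclient coordinates and version are illustrative, and the `compile` configuration reflects Gradle versions of this era (newer builds would use `implementation`).

```groovy
apply plugin: 'java'

dependencies {
    // Coordinates and version are illustrative.
    compile('com.101tec:zkclient:0.8') {
        // Drop the older logging binding that zkclient would otherwise pull in transitively.
        exclude module: 'slf4j-log4j12'
    }
}
```

Running `./gradlew dependencies --configuration runtime` before and after is a quick way to confirm the transitive entry is gone.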
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes #1704 from ijuma/kafka-4018-streams-duplicate-slf4j-log4j
moved streams application reset tool from tools to core
Author: Matthias J. Sax <matthias@confluent.io>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Damian Guy <damian.guy@gmail.com>, Guozhang Wang <wangguoz@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes #1685 from mjsax/moveResetTool
(cherry picked from commit f2405a73ea)
Signed-off-by: Ewen Cheslack-Postava <me@ewencp.org>
Better performance is always welcome:
"The Gradle build itself has seen a 50% reduction in configuration time. You'll see the biggest impact on multi-project builds"
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>
Closes #1644 from ijuma/update-gradle
As kafka-streams is intended to be used by applications that may or may not wish to use log4j, kafka-streams itself should not have a dependency on a concrete log framework. This change adapts the dependencies to be API-only for compile, and framework-specific for the test runtime only.
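Roughly, the dependency split described above could look like the following in Gradle. The version numbers are placeholders, and `compile` / `testRuntime` are the configuration names from Gradle versions of this era.

```groovy
apply plugin: 'java'

dependencies {
    // Only the logging API is visible to code that depends on this module.
    compile 'org.slf4j:slf4j-api:1.7.21'
    // A concrete binding is present only when running this module's own tests.
    testRuntime 'org.slf4j:slf4j-log4j12:1.7.21'
}
```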
I read through the [Contributing Code Guidelines](https://cwiki.apache.org/confluence/display/KAFKA/Contributing+Code+Changes) and interpreted this as a trivial change that doesn't require a Jira ticket. Please let me know if I've interpreted that wrongly.
This contribution is my original work and I license the work to the project under the project's open source license.
Author: Mathieu Fenniak <mathieu@encouragemarketing.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes #1639 from mfenniak/fix-slf4j-dependency-for-streams
Fix timing window in producer by holding onto cluster object while processing send requests so that changes to cluster during metadata refresh don't cause NPE if a topic is deleted.
Author: Rajini Sivaram <rajinisivaram@googlemail.com>
Reviewers: Sriharsha Chintalapani <harsha@hortonworks.com>, Ewen Cheslack-Postava <ewen@confluent.io>, Ismael Juma <ismael@juma.me.uk>
Closes #1478 from rajinisivaram/KAFKA-3562
Currently javadoc doesn't specify charset.
This pull request will set this to UTF-8.
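As a sketch of the kind of setting involved (not necessarily the exact lines in this change), Gradle's `javadoc` task exposes the relevant encoding options directly:

```groovy
apply plugin: 'java'

javadoc {
    // Charset advertised in the generated HTML pages.
    options.charSet = 'UTF-8'
    // Encoding used when writing the generated files.
    options.docEncoding = 'UTF-8'
    // Encoding assumed when reading the Java sources.
    options.encoding = 'UTF-8'
}
```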
Author: Sasaki Toru <sasakitoa@nttdata.co.jp>
Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>
Closes #1413 from sasakitoa/javadoc_garbled
The task is called `aggregatedJavadoc` and the generated html will be under `<project.dir>/build/docs/javadoc/`.
I also disabled javadoc for `tools` and `log4j-appender` as they are not public API.
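A sketch of how an aggregating task of this shape can be put together in a root `build.gradle`. It assumes every subproject applies the `java` plugin and that a `:tools` subproject exists; the details may differ from what this change actually commits.

```groovy
// Root build.gradle sketch; assumes all subprojects are Java projects.
subprojects {
    apply plugin: 'java'
}

// Switch javadoc off for a module that is not public API.
project(':tools') {
    javadoc.enabled = false
}

task aggregatedJavadoc(type: Javadoc) {
    // Only aggregate the subprojects that still have javadoc enabled.
    def documented = subprojects.findAll { it.javadoc.enabled }
    source = documented.collect { it.sourceSets.main.allJava }
    classpath = files(documented.collect { it.sourceSets.main.compileClasspath })
    destinationDir = file("${buildDir}/docs/javadoc")
}
```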
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes #1398 from ijuma/kafka-3717-aggregate-javadoc
These dependencies are unnecessary and they are acquired
transitively via zkclient (jline, netty) and reflections (annotations).
Ewen did the hard work in figuring out why we have unexpected
additional dependencies since 0.9.x.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Ewen Cheslack-Postava, Guozhang Wang, Gwen Shapira
Closes #1396 from ijuma/exclude-jline-netty-deps-in-streams and squashes the following commits:
3aa366f [Ismael Juma] Exclude findbugs annotations due to LGPL license
2d3d714 [Ismael Juma] Use local exclusion for `jline` and `netty`
482b6c0 [Ismael Juma] Exclude `jline` and `netty` dependencies in the `streams` project
_copyDependantTestLibs_ was added temporarily as a dependency of the _jar_ task to enable SASL system tests to be run with MiniKdc without changing the automated system test runs which run _gradlew clean jar_. Since the build target _systemTestLibs_ is already in Kafka build.gradle, the Confluent automated test runs can now run _gradlew clean systemTestLibs_ instead. This PR provides the final change to remove _copyDependantTestLibs_ from the _jar_ task. This should be committed only after the Confluent automated system test build script is updated, to avoid breaking any builds.
Author: Rajini Sivaram <rajinisivaram@googlemail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>
Closes #430 from rajinisivaram/minor-systemtestlibs
Author: Eno Thereska <eno.thereska@gmail.com>
Reviewers: Ismael Juma, Damian Guy, Michael G. Noll, Guozhang Wang
Closes #1260 from enothereska/KAFKA-3612-integration-tests
There are a few improvements in 2.12 and 2.13. I am particularly interested in the performance improvements:
* 2.12: "This release brings support for compile only dependencies, improved build script compilation speed and even better IDE support."
* 2.13: "We've achieved performance improvements during Gradle's configuration and execution phase, where we have measured up to 25% improvements to build time in our performance tests. No changes to your build script are necessary to start taking advantage of these improvements."
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Gwen Shapira
Closes #1271 from ijuma/gradle-2.13
This also fixes KAFKA-3453 and KAFKA-2866.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Gwen Shapira
Closes #1155 from ijuma/kafka-3475-introduce-our-minikdc
…ibraries
This ensures duplicates are not copied in the distribution without rewriting all of the tar'ing logic. A larger improvement could be made to the packaging code, but that should be tracked by another jira.
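Gradle's archive tasks have a built-in switch for this kind of de-duplication. The following is a sketch under assumptions: the task name `exampleTarGz` and the `from` clauses are illustrative, and I am not claiming this is the exact line this change adds.

```groovy
apply plugin: 'java'

task exampleTarGz(type: Tar) {
    compression = Compression.GZIP
    // Both sources can contribute the same jar to libs/.
    from(configurations.runtime) { into('libs/') }
    from(configurations.archives.artifacts.files) { into('libs/') }
    // Skip any file whose destination path has already been added to the archive.
    duplicatesStrategy = DuplicatesStrategy.EXCLUDE
}
```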
Author: Grant Henke <granthenke@gmail.com>
Reviewers: Gwen Shapira, Ismael Juma
Closes #1075 from granthenke/libs-duplicates
* Fix and suppress number of unchecked warnings (except for Kafka Streams)
* Add `SafeVarargs` annotation to fix warnings
* Suppress unfixable deprecation warnings
* Replace deprecated by non-deprecated usage where possible
* Avoid reflective calls via structural types in Scala
* Tweak compiler settings for scalac and javac
Once we drop Java 7 and Scala 2.10, we can tweak the compiler settings further so that they warn us about more things.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Grant Henke, Gwen Shapira, Guozhang Wang
Closes #1042 from ijuma/kafka-3375-suppress-depreccated-tweak-compiler
- Moves all generated docs under /docs/generated
- Generates docs for Protocol, Errors, and ApiKeys
- Adds new protocol.html page
Author: Grant Henke <granthenke@gmail.com>
Reviewers: Gwen Shapira
Closes #970 from granthenke/protocol-doc-wip
Adds a gradle task to generate a report of outdated release dependencies:
`gradle dependencyUpdates`
Updates a few minor versions.
Author: Grant Henke <granthenke@gmail.com>
Reviewers: Ismael Juma, Gwen Shapira
Closes #973 from granthenke/outdated-deps
Without this change `./gradlew releaseTarGz` (and its variants) will not include the RocksDB jar, which is required for Kafka Streams, in Kafka's `libs/` folder. The impact is that any Streams job will fail when it runs against a broker that was installed via a release tarball.
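To make the mechanism concrete, here is a sketch of how a tarball task can pull in a subproject's runtime dependencies. It assumes a `:streams` subproject with the `java` plugin applied is configured earlier in the same root build file; the task name `exampleReleaseTarGz` is hypothetical.

```groovy
// Sketch for a root build.gradle; `:streams` must already be configured as a Java project.
task exampleReleaseTarGz(type: Tar) {
    compression = Compression.GZIP
    // The Streams jar itself.
    from(project(':streams').jar) { into('libs/') }
    // Its runtime dependencies, which is where the RocksDB jar would come from.
    from(project(':streams').configurations.runtime) { into('libs/') }
}
```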
guozhangwang junrao : please review.
Author: Michael G. Noll <michael@confluent.io>
Reviewers: Jun Rao <junrao@gmail.com>
Closes #1007 from miguno/trunk-rocksdb-fixes
Also remove some unused imports.
Author: Guozhang Wang <wangguoz@gmail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes #992 from guozhangwang/KSExamples
Observation: when doing "gradlew releaseTarGz" the streams jar was not included in the tarball. Adding a line to include it. ijuma guozhangwang could you please review. Thanks.
Author: Eno Thereska <eno.thereska@gmail.com>
Reviewers: Guozhang Wang <wangguoz@gmail.com>
Closes #984 from enothereska/trunk
Patch version bumps for bouncy castle, minikdc, snappy, slf4j, scalatest and powermock. Notable fixes:
* Snappy: fixes a resource leak
* Bouncy castle: security fixes
Also update Gradle to 2.11 (where the notable change is improved IDE integration) and the grgit build dependency.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Grant Henke <granthenke@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes #903 from ijuma/kafka-3227-conservative-update-of-kafka-deps
This is handy when debugging certain kinds of Jenkins failures.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Guozhang Wang <wangguoz@gmail.com>, Ewen Cheslack-Postava <ewen@confluent.io>
Closes #739 from ijuma/gradle-show-standard-streams
guozhangwang added .git/refs/heads/ file existence check.
Author: Manikumar reddy O <manikumar.reddy@gmail.com>
Reviewers: Ismael Juma, Guozhang Wang
Closes #209 from omkreddy/KAFKA-1901
Some of the Improvements Include:
- The Checkstyle task now produces a human friendly HTML report
- Potential performance improvements
- Bug Fixes
Author: Grant Henke <granthenke@gmail.com>
Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>
Closes #715 from granthenke/gradle
- Adds CheckStyle to core and examples modules
- Fixes any existing CheckStyle issues
Author: Grant Henke <granthenke@gmail.com>
Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>
Closes #703 from granthenke/checkstyle-core
A few notes on the added test:
* I verified this test fails when changing between snappy 1.1.1.2 and 1.1.1.7 (per KAFKA-2189)
* The hard coded numbers are passing before and after lzo change
Author: Grant Henke <granthenke@gmail.com>
Reviewers: Ismael Juma, Guozhang Wang
Closes #552 from granthenke/lz4
This patch fixes some relative paths so that building from a subproject directory (like $rootDir/core) will not fail
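The usual shape of this kind of fix is to anchor file references at `rootDir` instead of relying on the current working directory. A small sketch, where the checkstyle config path is an assumed example and not necessarily one of the paths this patch touches:

```groovy
apply plugin: 'java'
apply plugin: 'checkstyle'

checkstyle {
    // Resolved against the root project, so it works from any subproject directory.
    configFile = new File(rootDir, 'checkstyle/checkstyle.xml')
}
```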
Author: Grant Henke <granthenke@gmail.com>
Reviewers: Ewen Cheslack-Postava
Closes #509 from granthenke/minor
We can take advantage of the fact that major Scala versions are binary compatible (since 2.10) to make the build a little more user-friendly.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Ewen Cheslack-Postava, Grant Henke
Closes #574 from ijuma/install-all-and-major-instead-of-full-version
More performance improvements:
"In many cases, Gradle 2.9 is much faster than Gradle 2.8 when performing incremental builds.
Very large builds (many thousands of source files) could see incremental build speeds up to 80% faster than 2.7 and up to 40% faster than 2.8.
Gradle now uses a more efficient mechanism to scan the filesystem, making up-to-date checks significantly faster. This improvement is only available when running Gradle with Java 7 or newer.
Other improvements have been made to speed-up include and exclude pattern evaluation; these improvements apply to all supported Java versions.
Gradle now uses much less memory than previous releases when performing incremental builds. By de-duplicating Strings used as file paths in internal caches, and by reducing the overhead of listing classes under test for Java projects, some builds use 30-70% less memory that Gradle 2.8."
https://docs.gradle.org/current/release-notes
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Grant Henke, Guozhang Wang
Closes #549 from ijuma/gradle-2.9
Gradle does not handle subprojects with the same name (top-level tools vs
connect/tools) properly, making the dependency impossible to express correctly
since we need to move the ThroughputThrottler class into the top level tools
project. Moving the current set of tools into the runtime jar works fine since
they are only used for system tests at the moment.
Author: Ewen Cheslack-Postava <me@ewencp.org>
Reviewers: Gwen Shapira
Closes #512 from ewencp/kafka-2807-redux
Author: Ewen Cheslack-Postava <me@ewencp.org>
Reviewers: Ben Stopford, Geoff Anderson, Guozhang Wang
Closes #432 from ewencp/kafka-2752-copycat-clean-bounce-test
Run sanity check, replication tests and benchmarks with SASL/Kerberos using MiniKdc.
Author: Rajini Sivaram <rajinisivaram@googlemail.com>
Reviewers: Geoff Anderson <geoff@confluent.io>, Jun Rao <junrao@gmail.com>
Closes #358 from rajinisivaram/KAFKA-2644
Updated kafka-producer-perf-test.sh to use org.apache.kafka.clients.tools.ProducerPerformance.
Updated build.gradle to add kafka-tools-0.9.0.0-SNAPSHOT.jar to kafka/libs folder.
Author: Manikumar reddy O <manikumar.reddy@gmail.com>
Reviewers: Gwen Shapira, Ismael Juma
Closes #242 from omkreddy/KAFKA-2562
ewencp Nothing too complicated here
Author: Geoff Anderson <geoff@confluent.io>
Reviewers: Ewen Cheslack-Postava, Gwen Shapira
Closes #392 from granders/minor-remove-system-test
KAFKA-2644 adds MiniKdc for system tests and hence needs a target to collect all MiniKdc jars. At the moment, system tests run `gradlew jar`. Replacing that with `gradlew systemTestLibs` will enable kafka jars and test dependency jars to be built and copied into appropriate locations. Submitting this as a separate PR so that the new target can be added to the build scripts that run system tests before KAFKA-2644 is committed. A separate target for system test artifacts will allow dependency changes to be made in future without breaking test runs.
Author: Rajini Sivaram <rajinisivaram@googlemail.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Jun Rao <junrao@gmail.com>
Closes #361 from rajinisivaram/kafka-systemTestLibs
This adds coordination between DistributedHerders using the generalized consumer
support, allowing automatic balancing of connectors and tasks across workers. A
few pieces that require interaction between workers (resolving config
inconsistencies, forwarding of configuration changes to the leader worker) are
incomplete because they require REST API support to implement properly.
Author: Ewen Cheslack-Postava <me@ewencp.org>
Reviewers: Jason Gustafson, Gwen Shapira
Closes #321 from ewencp/kafka-2371-distributed-herder
This PR implements SASL/Kerberos which was originally submitted by harshach as https://github.com/apache/kafka/pull/191.
I've been submitting PRs to Harsha's branch with fixes and improvements, and he has integrated all but the most recent one. I'm creating this PR so that Jenkins can run the tests on the branch (they pass locally).
Author: Ismael Juma <ismael@juma.me.uk>
Author: Sriharsha Chintalapani <harsha@hortonworks.com>
Author: Harsha <harshach@users.noreply.github.com>
Reviewers: Ismael Juma <ismael@juma.me.uk>, Rajini Sivaram <rajinisivaram@googlemail.com>, Parth Brahmbhatt <brahmbhatt.parth@gmail.com>, Jun Rao <junrao@gmail.com>
Closes #334 from ijuma/KAFKA-1686-V1
This patch is different than the one attached to the JIRA - I'm applying the new javadoc rules to all subprojects while the one in the JIRA applies only to "clients". We need this since Copycat has the same issues.
Author: Gwen Shapira <cshapi@gmail.com>
Reviewers: Ismael Juma, Guozhang Wang
Closes #147 from gwenshap/KAFKA-2203
This PR copies the latest kafka docs to kafka repo docs directory. Here I have copied 0.8.3/ directory contents from svn website repo to kafka/docs repository.
Some questions: This PR contains generated javadocs also. Do we need to copy javadocs here?
Author: Manikumar reddy O <manikumar.reddy@gmail.com>
Reviewers: Gwen Shapira, Ismael Juma
Closes #171 from omkreddy/KAFKA-2425-MOVE-DOCS-TO-KAFKA-REPO
This work has been contributed by Jesse Anderson, Randall Hauch, Yasuhiro Matsuda and Guozhang Wang. The detailed design can be found in https://cwiki.apache.org/confluence/display/KAFKA/KIP-28+-+Add+a+processor+client.
Author: Guozhang Wang <wangguoz@gmail.com>
Author: Yasuhiro Matsuda <yasuhiro.matsuda@gmail.com>
Author: Yasuhiro Matsuda <yasuhiro@confluent.io>
Author: ymatsuda <yasuhiro.matsuda@gmail.com>
Author: Randall Hauch <rhauch@gmail.com>
Author: Jesse Anderson <jesse@smokinghand.com>
Author: Ismael Juma <ismael@juma.me.uk>
Author: Jesse Anderson <eljefe6a@gmail.com>
Reviewers: Ismael Juma, Randall Hauch, Edward Ribeiro, Gwen Shapira, Jun Rao, Jay Kreps, Yasuhiro Matsuda, Guozhang Wang
Closes #130 from guozhangwang/streaming
The default is typically `1m` for 64-bit machines and the Scala compiler sometimes needs more than this.
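A sketch of how the forked Scala compiler can be given a larger thread stack in Gradle, assuming the limit in question is the JVM `-Xss` setting (which the StackOverflowError in the branch name suggests). The sizes are illustrative.

```groovy
apply plugin: 'scala'

tasks.withType(ScalaCompile) {
    configure(scalaCompileOptions.forkOptions) {
        // Illustrative values: a larger heap and a 2 MB thread stack for the compiler JVM.
        memoryMaximumSize = '1g'
        jvmArgs = ['-Xss2m']
    }
}
```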
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Manikumar Reddy, Gwen Shapira
Closes #157 from ijuma/kafka-2457-stackoverflowerror-during-builds
This is an initial patch implementing the basics of Copycat for KIP-26.
The intent here is to start a review of the key pieces of the core API and get a reasonably functional, baseline, non-distributed implementation of Copycat in place to get things rolling. The current patch has a number of known issues that need to be addressed before a final version:
* Some build-related issues. Specifically, requires some locally-installed dependencies (see below), ignores checkstyle for the runtime data library because it's lifted from Avro currently and likely won't last in its current form, and some Gradle task dependencies aren't quite right because I haven't gotten rid of the dependency on `core` (which should now be an easy patch since new consumer groups are in a much better state).
* This patch currently depends on some Confluent trunk code because I prototyped with our Avro serializers w/ schema-registry support. We need to figure out what we want to provide as an example built-in set of serializers. Unlike core Kafka where we could ignore the issue, providing only ByteArray or String serializers, this is pretty central to how Copycat works.
* This patch uses a hacked up version of Avro as its runtime data format. Not sure if we want to go through the entire API discussion just to get some basic code committed, so I filed KAFKA-2367 to handle that separately. The core connector APIs and the runtime data APIs are entirely orthogonal.
* This patch needs some updates to get aligned with recent new consumer changes (specifically, I'm aware of the ConcurrentModificationException issue on exit). More generally, the new consumer is in flux but Copycat depends on it, so there are likely to be some negative interactions.
* The layout feels a bit awkward to me right now because I ported it from a Maven layout. We don't have nearly the same level of granularity in Kafka currently (core and clients, plus the mostly ignored examples, log4j-appender, and a couple of contribs). We might want to reorganize, although keeping data+api separate from runtime and connector plugins is useful for minimizing dependencies.
* There are a variety of other things (e.g., I'm not happy with the exception hierarchy/how they are currently handled, TopicPartition doesn't really need to be duplicated unless we want Copycat entirely isolated from the Kafka APIs, etc), but I expect we'll cover those in the review.
Before commenting on the patch, it's probably worth reviewing https://issues.apache.org/jira/browse/KAFKA-2365 and https://issues.apache.org/jira/browse/KAFKA-2366 to get an idea of what I had in mind for a) what we ultimately want with all the Copycat patches and b) what we aim to cover in this initial patch. My hope is that we can use a WIP patch (after the current obvious deficiencies are addressed) while recognizing that we want to make iterative progress with a bunch of subsequent PRs.
Author: Ewen Cheslack-Postava <me@ewencp.org>
Reviewers: Ismael Juma, Gwen Shapira
Closes #99 from ewencp/copycat and squashes the following commits:
a3a47a6 [Ewen Cheslack-Postava] Simplify Copycat exceptions, make them a subclass of KafkaException.
8c108b0 [Ewen Cheslack-Postava] Rename Coordinator to Herder to avoid confusion with the consumer coordinator.
7bf8075 [Ewen Cheslack-Postava] Make Copycat CLI specific to standalone mode, clean up some config and get rid of config storage in standalone mode.
656a003 [Ewen Cheslack-Postava] Clarify and expand the explanation of the Copycat Coordinator interface.
c0e5fdc [Ewen Cheslack-Postava] Merge remote-tracking branch 'origin/trunk' into copycat
0fa7a36 [Ewen Cheslack-Postava] Mark Copycat classes as unstable and reduce visibility of some classes where possible.
d55d31e [Ewen Cheslack-Postava] Reorganize Copycat code to put it all under one top-level directory.
b29cb2c [Ewen Cheslack-Postava] Merge remote-tracking branch 'origin/trunk' into copycat
d713a21 [Ewen Cheslack-Postava] Address Gwen's review comments.
6787a85 [Ewen Cheslack-Postava] Make Converter generic to match serializers since some serialization formats do not require a base class of Object; update many other classes to have generic key and value class type parameters to match this change.
b194c73 [Ewen Cheslack-Postava] Split Copycat converter option into two options for key and value.
0b5a1a0 [Ewen Cheslack-Postava] Normalize naming to use partition for both source and Kafka, adjusting naming in CopycatRecord classes to clearly differentiate.
e345142 [Ewen Cheslack-Postava] Remove Copycat reflection utils, use existing Utils and ConfigDef functionality from clients package.
be5c387 [Ewen Cheslack-Postava] Minor cleanup
122423e [Ewen Cheslack-Postava] Style cleanup
6ba87de [Ewen Cheslack-Postava] Remove most of the Avro-based mock runtime data API, only preserving enough schema functionality to support basic primitive types for an initial patch.
4674d13 [Ewen Cheslack-Postava] Address review comments, clean up some code styling.
25b5739 [Ewen Cheslack-Postava] Fix sink task offset commit concurrency issue by moving it to the worker thread and waking up the consumer to ensure it exits promptly.
0aefe21 [Ewen Cheslack-Postava] Add log4j settings for Copycat.
220e42d [Ewen Cheslack-Postava] Replace Avro serializer with JSON serializer.
1243a7c [Ewen Cheslack-Postava] Merge remote-tracking branch 'origin/trunk' into copycat
5a618c6 [Ewen Cheslack-Postava] Remove offset serializers, instead reusing the existing serializers and removing schema projection support.
e849e10 [Ewen Cheslack-Postava] Remove duplicated TopicPartition implementation.
dec1379 [Ewen Cheslack-Postava] Switch to using new consumer coordinator instead of manually assigning partitions. Remove dependency of copycat-runtime on core.
4a9b4f3 [Ewen Cheslack-Postava] Add some helpful Copycat-specific build and test targets that cover all Copycat packages.
31cd1ca [Ewen Cheslack-Postava] Add CLI tools for Copycat.
e14942c [Ewen Cheslack-Postava] Add Copycat file connector.
0233456 [Ewen Cheslack-Postava] Add copycat-avro and copycat-runtime
11981d2 [Ewen Cheslack-Postava] Add copycat-data and copycat-api
Initial patch for KIP-25
Note that to install ducktape, do *not* use pip to install ducktape. Instead:
```
$ git clone git@github.com:confluentinc/ducktape.git
$ cd ducktape
$ python setup.py install
```
Author: Geoff Anderson <geoff@confluent.io>
Author: Geoff <granders@gmail.com>
Author: Liquan Pei <liquanpei@gmail.com>
Reviewers: Ewen, Gwen, Jun, Guozhang
Closes #70 from granders/KAFKA-2276 and squashes the following commits:
a62fb6c [Geoff Anderson] fixed checkstyle errors
a70f0f8 [Geoff Anderson] Merged in upstream trunk.
8b62019 [Geoff Anderson] Merged in upstream trunk.
47b7b64 [Geoff Anderson] Created separate tools jar so that the clients package does not pull in dependencies on the Jackson JSON tools or argparse4j.
a9e6a14 [Geoff Anderson] Merged in upstream changes
d18db7b [Geoff Anderson] fixed :rat errors (needed to add licenses)
321fdf8 [Geoff Anderson] Ignore tests/ and vagrant/ directories when running rat build task
795fc75 [Geoff Anderson] Merged in changes from upstream trunk.
1d93f06 [Geoff Anderson] Updated provisioning to use java 7 in light of KAFKA-2316
2ea4e29 [Geoff Anderson] Tweaked README, changed default log collection behavior on VerifiableProducer
0eb6fdc [Geoff Anderson] Merged in system-tests
69dd7be [Geoff Anderson] Merged in trunk
4034dd6 [Geoff Anderson] Merged in upstream trunk
ede6450 [Geoff] Merge pull request #4 from confluentinc/move_muckrake
7751545 [Geoff Anderson] Corrected license headers
e6d532f [Geoff Anderson] java 7 -> java 6
8c61e2d [Geoff Anderson] Reverted jdk back to 6
f14c507 [Geoff Anderson] Removed mode = "test" from Vagrantfile and Vagrantfile.local examples. Updated testing README to clarify aws setup.
98b7253 [Geoff Anderson] Updated consumer tests to pre-populate kafka logs
e6a41f1 [Geoff Anderson] removed stray println
b15b24f [Geoff Anderson] leftover KafkaBenchmark in super call
0f75187 [Geoff Anderson] Removed stray allow_fail. kafka_benchmark_test -> benchmark_test
f469f84 [Geoff Anderson] Tweaked readme, added example Vagrantfile.local
3d73857 [Geoff Anderson] Merged downstream changes
42dcdb1 [Geoff Anderson] Tweaked behavior of stop_node, clean_node to generally fail fast
7f7c3e0 [Geoff Anderson] Updated setup.py for kafkatest
c60125c [Geoff Anderson] TestEndToEndLatency -> EndToEndLatency
4f476fe [Geoff Anderson] Moved aws scripts to vagrant directory
5af88fc [Geoff Anderson] Updated README to include aws quickstart
e5edf03 [Geoff Anderson] Updated example aws Vagrantfile.local
96533c3 [Geoff] Update aws-access-keys-commands
25a413d [Geoff] Update aws-example-Vagrantfile.local
884b20e [Geoff Anderson] Moved a bunch of files to kafkatest directory
fc7c81c [Geoff Anderson] added setup.py
632be12 [Geoff] Merge pull request #3 from confluentinc/verbose-client
51a94fd [Geoff Anderson] Use argparse4j instead of joptsimple. ThroughputThrottler now has more intuitive behavior when targetThroughput is 0.
a80a428 [Geoff Anderson] Added shell program for VerifiableProducer.
d586fb0 [Geoff Anderson] Updated comments to reflect that throttler is not message-specific
6842ed1 [Geoff Anderson] left out a file from last commit
1228eef [Geoff Anderson] Renamed throttler
9100417 [Geoff Anderson] Updated command-line options for VerifiableProducer. Extracted throughput logic to make it reusable.
0a5de8e [Geoff Anderson] Fixed checkstyle errors. Changed name to VerifiableProducer. Added synchronization for thread safety on println statements.
475423b [Geoff Anderson] Convert class to string before adding to json object.
bc009f2 [Geoff Anderson] Got rid of VerboseProducer in core (moved to clients)
c0526fe [Geoff Anderson] Updates per review comments.
8b4b1f2 [Geoff Anderson] Minor updates to VerboseProducer
2777712 [Geoff Anderson] Added some metadata to producer output.
da94b8c [Geoff Anderson] Added number of messages option.
07cd1c6 [Geoff Anderson] Added simple producer which prints status of produced messages to stdout.
a278988 [Geoff Anderson] fixed typos
f1914c3 [Liquan Pei] Merge pull request #2 from confluentinc/system_tests
81e4156 [Liquan Pei] Bootstrap Kafka system tests
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Guozhang
Closes #97 from ijuma/kafka-2321 and squashes the following commits:
4834464 [Ismael Juma] KAFKA-2321; Introduce CONTRIBUTING.md
`testAll` passed locally.
Author: Ismael Juma <ismael@juma.me.uk>
Reviewers: Harsha, Ewen, Guozhang
Closes #87 from ijuma/kafka-2348-drop-support-for-scala-2.9 and squashes the following commits:
cf9796a [Ismael Juma] KAFKA-2348; Drop support for Scala 2.9
Author: Ismael Juma <ismael@juma.me.uk>
Closes #82 from ijuma/kafka-2324 and squashes the following commits:
d71bf5c [Ismael Juma] KAFKA-2324; Update to Scala 2.11.7