Commit Graph

88 Commits

Author SHA1 Message Date
John Roesler ccd3af1566 Minor resolve streams scala warnings (#6369)
Resolves the compiler warnings when building streams-scala.

Reviewers: A. Sophie Blee-Goldman <ableegoldman@gmail.com>, Ismael Juma <ismael@juma.me.uk>, Guozhang Wang <wangguoz@gmail.com>
2019-03-07 15:05:35 -08:00
Matthias J. Sax 240d7589d6 MINOR: improve JavaDocs about global state stores (#6359)
Improve JavaDocs about global state stores.

Reviewers: Guozhang Wang <wangguoz@gmail.com>, Sophie Blee-Goldman <sophie@confluent.io>,  Bill Bejeck <bbejeck@gmail.com>
2019-03-06 21:19:47 -05:00
Ismael Juma 12f310d50e
KAFKA-7612: Fix javac warnings and enable warnings as errors (#5900)
- Use Xlint:all with 3 exclusions (filed KAFKA-7613 to remove the exclusions)
- Use the same javac options when compiling tests (seems accidental that
we didn't do this before)
- Replaced several deprecated method calls with non-deprecated ones:
  - `KafkaConsumer.poll(long)` and `KafkaConsumer.close(long)`
  - `Class.newInstance` and `new Integer/Long` (deprecated since Java 9)
  - `scala.Console` (deprecated in Scala 2.11)
  - `PartitionData` taking a timestamp (one of them seemingly a bug)
  - `JsonMappingException` single parameter constructor
- Fix unnecessary usage of raw types in several places.
- Add @SuppressWarnings for deprecations, unchecked and switch fallthrough in
several places.
- Scala clean-ups (var -> val, ETA expansion warnings, avoid reflective calls)
- Use lambdas to simplify code in a few places
- Add @SafeVarargs, fix varargs usage and remove unnecessary `Utils.mkList` method

Reviewers: Matthias J. Sax <mjsax@apache.org>, Manikumar Reddy <manikumar.reddy@gmail.com>, Randall Hauch <rhauch@gmail.com>, Bill Bejeck <bill@confluent.io>, Stanislav Kozlovski <stanislav_kozlovski@outlook.com>
2018-11-12 22:18:59 -08:00
Bill Bejeck 86b1150e18 MINOR: Update Streams Scala API for addition of Grouped (#5793)
While working on the documentation updates I realized the Streams Scala API needs
to get updated for the addition of Grouped

Added a test for Grouped.scala ran all streams-scala tests and streams tests

Reviewers: Matthias J. Sax <matthias@confluent.io>, John Roesler <john@confluent.io>, Guozhang Wang <guozhang@confluent.io>
2018-10-16 07:24:50 -07:00
John Roesler f047e1c178 MINOR: fix non-deterministic streams-scala tests (#5792)
Stop using current system time by default, as it introduces non-determinism.

Reviewers: Matthias J. Sax <matthias@confluent.io>, Bill Bejeck <bill@confluent.io>, Guozhang Wang <guozhang@confluent.io>
2018-10-15 17:47:20 -07:00
Nikolay ca641b3e2e KAFKA-7277: Migrate Streams API to Duration instead of longMs times (#5682)
Reviewers: Johne Roesler <john@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Bill Bejeck <bill@confluent.io>, Guozhang Wang <guozhang@confluent.io>
2018-10-04 13:51:39 -07:00
Joan Goyeau af80dccc7d KAFKA-7399: KIP-366, Make FunctionConversions deprecated (#5562)
Reviewers: John Roesler <vvcephei@users.noreply.github.com>, Guozhang Wang <wangguoz@gmail.com>
2018-09-19 09:10:15 -07:00
John Roesler b18c8dda52 MINOR: fix scala serde tests (#5644)
Reviewers: Guozhang Wang <wangguoz@gmail.com>
2018-09-13 08:29:07 -07:00
John Roesler 9dac615d22 KAFKA-7386: streams-scala should not cache serdes (#5622)
Currently, scala.Serdes.String, for example, invokes Serdes.String() once and caches the result.

However, the implementation of the String serde has a non-empty configure method that is variant in whether it's used as a key or value serde. So we won't get correct execution if we create one serde and use it for both keys and values.

Reviewers: Bill Bejeck <bill@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
2018-09-11 16:17:47 -07:00
Joan Goyeau acd3858ea6 KAFKA-7396: Materialized, Serialized, Joined, Consumed and Produced with implicit Serdes (#5551)
We want to make sure that we always have a serde for all Materialized, Serialized, Joined, Consumed and Produced.
For that we can make use of the implicit parameters in Scala.

KIP: https://cwiki.apache.org/confluence/display/KAFKA/KIP-365%3A+Materialized%2C+Serialized%2C+Joined%2C+Consumed+and+Produced+with+implicit+Serde

Reviewers: John Roesler <vvcephei@users.noreply.github.com>, Matthias J. Sax <matthias@confluent.io>, Bill Bejeck <bbejeck@gmail.com>, Guozhang Wang <guozhang@confluent.io>, Ted Yu <yuzhihong@gmail.com>
2018-09-11 14:08:42 -07:00
Ewen Cheslack-Postava 0d535b167a MINOR: Mark new Scala streams tests as integration tests (KIP-270 follow-up) (#5631)
Reviewers: Eno Thereska <eno.thereska@gmail.com>, John Roesler <vvcephei@users.noreply.github.com>, Guozhang Wang <wangguoz@gmail.com>
2018-09-10 09:53:02 -07:00
tedyu 34d9ae6628 MINOR: Fix streams Scala peek recursive call (#5566)
This PR fixes the previously recursive call of Streams Scala peek

Reviewers: Joan Goyeau <joan@goyeau.com>, Guozhang Wang <guozhang@confluent.io>, John Roesler <john@confluent.io>
2018-08-29 09:34:01 -07:00
Joan Goyeau 7d2b95895a MINOR: Correct folder for package object scala (#5573)
Reviewers: Guozhang Wang <guozhang@confluent.io>, John Roesler <john@confluent.io>
2018-08-27 18:44:12 -07:00
Joan Goyeau b8559de23d MINOR: Fix streams Scala foreach recursive call (#5539)
Reviewers: Guozhang Wang <guozhang@confluent.io>, John Roesler <john@confluent.io>
2018-08-23 17:07:39 -07:00
Joan Goyeau d8f9f278a2 KAFKA-7316: Fix Streams Scala filter recursive call #5538
Due to lack of conversion to kstream Predicate, existing filter method in KTable.scala would result in StackOverflowError.

This PR fixes the bug and adds testing for it.

Reviewers: Guozhang Wang <guozhang@confluent.io>, John Roesler <john@confluent.io>
2018-08-23 09:15:27 -07:00
Joan Goyeau dae9c41838 KAFKA-7301: Fix streams Scala join ambiguous overload (#5502)
Join in the Scala streams API is currently unusable in 2.0.0 as reported by @mowczare:
#5019 (comment)

This due to an overload of it with the same signature in the first curried parameter.
See compiler issue that didn't catch it: https://issues.scala-lang.org/browse/SI-2628

Reviewers: Debasish Ghosh <dghosh@acm.org>, Guozhang Wang <guozhang@confluent.io>, John Roesler <john@confluent.io>
2018-08-21 15:41:36 -07:00
John Roesler b1539ff62d KAFKA-7250: switch scala transform to TransformSupplier (#5481)
#5468 introduced a breaking API change that was actually avoidable. This PR re-introduces the old API as deprecated and alters the API introduced by #5468 to be consistent with the other methods

also, fixed misc syntax problems
2018-08-09 10:11:48 -07:00
Manikumar Reddy O a9d7f8a1fd MINOR: Fix Streams scala format violations (#5472)
Reviewers: Guozhang Wang <wangguoz@gmail.com>
2018-08-07 12:50:42 -07:00
Michal Dziemianko ed13d7eebb KAFKA-7250: fix transform function in scala DSL to accept TranformerSupplier (#5468)
Reviewers: Guozhang Wang <wangguoz@gmail.com>
2018-08-07 08:00:22 -07:00
Guozhang Wang afe00effe2
KAFKA-3514: Part II, Choose tasks with data on all partitions to process (#5398)
1. In each iteration, decide if a task is processable if all of its partitions contains data, so it can decide which record to process next.

1.a Add one exception that, if the task indeed have data on some but not all of its partitions, we only consider as not processable for some finite round of iterations.
1.b Add a task-level metric to record whenever we are forced to process a task that is only "partially data available", since it may leads to non-determinism.

2. Break the main loop on put-raw-data and process-them. Since now not all data put into the queue would be processed completely within a single iteration.

3. NOTE that within an iteration, if a task has exhausted one of its queue it will still be processed, since we only update processable list once in each iteration, I'm improving on this on the follow-up part III PR.

4. Found and fixed a bug in metrics recording: the taskName and sensorName parameters were exchanged.

5. Optimized task stream time computation again since our current partition stream time reasoning has been simplified.

6. Added unit tests.

Reviewers: Matthias J. Sax <matthias@confluent.io>, John Roesler <vvcephei@users.noreply.github.com>, Bill Bejeck <bbejeck@gmail.com>
2018-08-02 18:34:53 -07:00
Guozhang Wang 75825caee4
KAFKA-5037 Follow-up: move Scala test to Java (#5399)
Reviewers: Ted Yu <yuzhihong@gmail.com>, Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>
2018-07-20 10:37:56 -07:00
Manikumar Reddy O 9089fb2d82 MINOR: Fix format violations streams scala tests (#5402)
@guozhangwang @mjsax hot fix for streams scala test format violations

Reviewers: Guozhang Wang <wangguoz@gmail.com>
2018-07-20 08:20:36 -07:00
Ted Yu 82f124ae30 KAFKA-5037: Fix infinite loop if all input topics are unknown at startup
1. At the beginning of assign, we first check that all the non-repartition source topics are included in the metadata. If not, we log an error at the leader and set an error in the Assignment userData bytes, indicating that leader cannot complete assignment and the error code would indicate the root cause of it.

2. Upon receiving the assignment, if the error is not NONE the streams will shutdown itself with a log entry re-stating the root cause interpreted from the error code.

Author: tedyu <yuzhihong@gmail.com>

Reviewers: Matthias J. Sax <mjsax@apache.org>, Guozhang Wang <wangguoz@gmail.com>

Closes #5322 from tedyu/trunk
2018-07-19 15:22:53 -07:00
Joan Goyeau 05c5854d1f MINOR: Add Scalafmt to Streams Scala API (#4965)
Reviewers: Guozhang Wang <wangguoz@gmail.com>
2018-07-09 16:48:34 -07:00
Joan Goyeau ad56f04af9 KAFKA-6936: Implicit materialized for aggregate, count and reduce (#5066)
In #4919 we propagate the SerDes for each of these aggregation operators.

As @guozhangwang mentioned in that PR:

```
reduce: inherit the key and value serdes from the parent XXImpl class.
count: inherit the key serdes, enforce setting the Serdes.Long() for value serdes.
aggregate: inherit the key serdes, do not set for value serdes internally.
```

Although it's all good for reduce and count, it is quiet unsafe to have aggregate without Materialized given. In fact I don't see why we would not give a Materialized for the aggregate since the result type will always be different (otherwise use reduce) and also the value Serde is simply not propagated.

This has been discussed previously in a broader PR before but I believe for aggregate we could pass implicitly a Materialized the same way we pass a Joined, just to avoid the stupid case. Then if the user wants to specialize, he can give his own Materialized.

Reviewers: Debasish Ghosh <dghosh@acm.org>, Guozhang Wang <guozhang@confluent.io>
2018-05-31 17:19:37 -07:00
Guozhang Wang f33e9a346e KAFKA-4936: Add dynamic routing in Streams (#5018)
implements KIP-303

Reviewers: Bill Bejeck <bill@confluent.io>, John Roesler <john@confluent.io>, Matthias J. Sax <matthias@confluent.io>
2018-05-30 11:54:53 -07:00
Bill Bejeck 4943c3f2f7 MINOR: reduce commit time on test (#5095)
Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <guozhang@confluent.io>
2018-05-29 16:33:00 -07:00
Ismael Juma 7132a85fc3 KAFKA-6921; Remove old Scala producer and related code
* Removed Scala producers, request classes, kafka.tools.ProducerPerformance, encoders,
tests.
* Updated ConsoleProducer to remove Scala producer support (removed `BaseProducer`
and several options that are not used by the Java producer).
* Updated a few Scala consumer tests to use the new producer (including a minor
refactor of `produceMessages` methods in `TestUtils`).
* Updated `ClientUtils.fetchTopicMetadata` to use `SimpleConsumer` instead of
`SyncProducer`.
* Removed `TestKafkaAppender` as it looks useless and it defined an `Encoder`.
* Minor import clean-ups

No new tests added since behaviour should remain the same after these changes.

Author: Ismael Juma <ismael@juma.me.uk>

Reviewers: Manikumar Reddy O <manikumar.reddy@gmail.com>, Dong Lin <lindong28@gmail.com>

Closes #5045 from ijuma/kafka-6921-remove-old-producer
2018-05-24 17:32:49 -07:00
Joan Goyeau 96cda0e07a MINOR: Fix type inference on joins and aggregates (#5019)
The type inference doesn't currently work for the join functions in Scala as it doesn't know yet the types of the given KStream[K, V] or KTable[K, V].

The fix here is to curry the joiner function. I personally prefer this notation but this also means it differs more from the Java API.
I believe the diff with the Java API is worth in this case as it's not only solving the type inference but also fits better the Scala way of coding (ex: fold).

Moreover any Scala dev will bug and spend little time on these functions trying to understand why the type inference is not working and then get frustrated to be obliged to be explicit here where it's not harmful to be inferred.

Reviewers: Debasish Ghosh <dghosh@acm.org>, Guozhang Wang <guozhang@confluent.io>, Ismael Juma <ismael@juma.me.uk>
2018-05-20 16:25:16 -07:00
Andy Coates 4e1c8ffd0d KAFKA-6849: add transformValues methods to KTable. (#4959)
See the KIP: https://cwiki.apache.org/confluence/display/KAFKA/KIP-292%3A+Add+transformValues%28%29+method+to+KTable

This PR adds the transformValues method to the KTable interface. The semantics of the call are the same as the methods of the same name on the KStream interface.

Fixes KAFKA-6849

Reviewers: Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
2018-05-18 16:06:50 -07:00
Guozhang Wang 6b8e79b137
HOTFIX: move Conusmed to o.a.k.streams.kstream (#5033)
Reviewers: Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>
2018-05-17 11:56:07 -07:00
Joan Goyeau ac9de822b2 MINOR: Use Set instead of List for multiple topics (#5024)
Debasish Ghosh <dghosh@acm.org>, Guozhang Wang <guozhang@confluent.io>
2018-05-17 08:44:50 -07:00
Joan Goyeau 40d191b563 MINOR: Count fix and Type alias refactor in Streams Scala API (#4966)
Reviewers: Debasish Ghosh <dghosh@acm.org>, Guozhang Wang <guozhang@confluent.io>
2018-05-11 10:15:48 -07:00
Joan Goyeau b88d70b532 MINOR: Make Serdes less confusing in Scala (#4963)
Serdes are confusing in the Scala wrapper:

* We have wrappers around Serializer, Deserializer and Serde which are not very useful.
* We have Serdes in 2 places org.apache.kafka.common.serialization.Serde and in DefaultSerdes, instead we should be having only one place where to find all the Serdes.

I wanted to do this PR before the release as this is a breaking change.
This shouldn't add more so the current tests should be enough.

Reviewers: Debasish Ghosh <dghosh@acm.org>, Guozhang Wang <guozhang@confluent.io>
2018-05-08 09:15:31 -07:00
Michael G. Noll 00d1137570 KAFKA-6871: KStreams Scala API: incorrect Javadocs and misleading parameter name (#4971)
Reviewer: Matthias J. Sax <matthias@confluent.io>, Debasish Ghosh <dghosh@acm.org>, Guozhang Wang <guozhang@confluent.io>, Bill Bejeck <bill@confluent.io>
2018-05-07 09:27:45 -07:00
Sean Glover 893e044515 MINOR: Build and code sample updates for Kafka Streams DSL for Scala (#4949)
Several build and documentation updates were required after the merge of KAFKA-6670: Implement a Scala wrapper library for Kafka Streams.

Encode Scala major version into streams-scala artifacts.
To differentiate versions of the kafka-streams-scala artifact across Scala major versions it's required to encode the version into the artifact name before its published to a maven repository. This is accomplished by following a similar release process as kafka core, which encodes the Scala major version and then runs the build for each major version of Scala supported. This is considered standard practice when releasing Scala libraries, but is not handled for us automatically with the basic Scala for Gradle support.

After this change you can generate and install the kafka-streams-scala artifact into the local maven repository:

$ ./gradlew -PscalaVersion=2.11 install
$ ./gradlew -PscalaVersion=2.12 install

Reviewers: Ismael Juma <ismael@juma.me.uk>, Guozhang Wang <wangguoz@gmail.com>
2018-05-06 20:55:12 -07:00
Guozhang Wang af983267be
MINOR: Removed deprecated schedule function (#4908)
While working on this, I also refactored the MockProcessor out of the MockProcessorSupplier to cleanup the unit test paths.

Reviewers: John Roesler <john@confluent.io>, Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>
2018-05-04 08:42:01 -07:00
Debasish Ghosh b2e4db01b6 KAFKA-6670: Implement a Scala wrapper library for Kafka Streams
This PR implements a Scala wrapper library for Kafka Streams. The library is implemented as a project under streams, namely `:streams:streams-scala`. The PR contains the following:

* the library implementation of the wrapper abstractions
* the test suite
* the changes in `build.gradle` to build the library jar

The library has been tested running the tests as follows:

```
$ ./gradlew -Dtest.single=StreamToTableJoinScalaIntegrationTestImplicitSerdes streams:streams-scala:test
$ ./gradlew -Dtest.single=StreamToTableJoinScalaIntegrationTestImplicitSerdesWithAvro streams:streams-scala:test
$ ./gradlew -Dtest.single=WordCountTest streams:streams-scala:test
```

Author: Debasish Ghosh <ghosh.debasish@gmail.com>
Author: Sean Glover <seglo@randonom.com>

Reviewers: Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Ismael Juma <ismael@juma.me.uk>, John Roesler <john@confluent.io>, Damian Guy <damian@confluent.io>, Guozhang Wang <wangguoz@gmail.com>

Closes #4756 from debasishg/scala-streams
2018-04-23 13:33:35 -07:00