Commit Graph

63 Commits

Author SHA1 Message Date
John Roesler 4d1c4d9086 KAFKA-12593: Fix Apache License headers (#10452)
* Standardize license headers in scala, python, and gradle files.
* Relocate copyright attribution to the NOTICE.
* Add a license header check to `spotless` for scala files.

Reviewers: Ewen Cheslack-Postava <ewencp@apache.org>, Matthias J. Sax <mjsax@apache.org>, A. Sophie Blee-Goldman <ableegoldman@apache.org
2021-04-01 10:49:28 -05:00
Matthias J. Sax 27824baa21
KAFKA-10003: Mark KStream.through() as deprecated and update Scala API (#8679)
- part of KIP-221

Co-authored-by: John Roesler <john@confluent.io>
2020-05-22 08:41:28 -07:00
John Roesler fd095aaafd
KAFKA-8410: Revert Part 1: processor context bounds (#8414) (#8595)
This reverts commit 29e08fd2c2.
There turned out to be more than expected problems with adding the generic parameters.

Reviewers: Matthias J. Sax <matthias@confluent.io>
2020-05-01 14:26:36 -05:00
John Roesler 29e08fd2c2
KAFKA-8410: Part 1: processor context bounds (#8414)
Add type bounds to the ProcessorContext, which bounds the types that can be forwarded to child nodes.

Reviewers: Matthias J. Sax <matthias@confluent.io>
2020-04-07 13:11:02 -05:00
Ismael Juma 90bbeedf52
MINOR: Fix Scala 2.13 compiler warnings (#8390)
Once Scala 2.13.2 is officially released, I will submit a follow up PR
that enables `-Xfatal-warnings` with the necessary warning
exclusions. Compiler warning exclusions were only introduced in 2.13.2
and hence why we have to wait for that. I used a snapshot build to
test it in the meantime.

Changes:
* Remove Deprecated annotation from internal request classes
* Class.newInstance is deprecated in favor of
Class.getConstructor().newInstance
* Replace deprecated JavaConversions with CollectionConverters
* Remove unused kafka.cluster.Cluster
* Don't use Map and Set methods deprecated in 2.13:
    - collection.Map +, ++, -, --, mapValues, filterKeys, retain
    - collection.Set +, ++, -, --
* Add scala-collection-compat dependency to streams-scala and
update version to 2.1.4.
* Replace usages of deprecated Either.get and Either.right
* Replace usage of deprecated Integer(String) constructor
* `import scala.language.implicitConversions` is not needed in Scala 2.13
* Replace usage of deprecated `toIterator`, `Traversable`, `seq`,
`reverseMap`, `hasDefiniteSize`
* Replace usage of deprecated alterConfigs with incrementalAlterConfigs
where possible
* Fix implicit widening conversions from Long/Int to Double/Float
* Avoid implicit conversions to String
* Eliminate usage of deprecated procedure syntax
* Remove `println`in `LogValidatorTest` instead of fixing the compiler
warning since tests should not `println`.
* Eliminate implicit conversion from Array to Seq
* Remove unnecessary usage of 3 argument assertEquals
* Replace `toStream` with `iterator`
* Do not use deprecated SaslConfigs.DEFAULT_SASL_ENABLED_MECHANISMS
* Replace StringBuilder.newBuilder with new StringBuilder
* Rename AclBuffers to AclSeqs and remove usage of `filterKeys`
* More consistent usage of Set/Map in Controller classes: this also fixes
deprecated warnings with Scala 2.13
* Add spotBugs exclusion for inliner artifact in KafkaApis with Scala 2.12.

Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
2020-04-01 06:20:48 -07:00
Matthias J. Sax 21cfd0b453
MINOR: Fix generic types in StreamsBuilder and Topology (#8273)
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Guozhang Wang <guozhang@confluent.io>, John Roesler <john@confluent.io>
2020-03-19 14:29:15 -07:00
high.lee dc89c86d43
KAFKA-9483: Add Scala KStream#toTable to the Streams DSL (#8024)
Part of KIP-523

Reviewers: Matthias J. Sax <matthias@confluent.io>, Bill Bejeck <bill@confluent.io>, John Roesler <john@confluent.io>
2020-02-11 14:15:43 -08:00
Matthias J. Sax 1ccca5c6a9
KAFKA-6049: extend Kafka Streams Scala API for cogroup (KIP-150) (#7847)
Reviewers: John Roesler <john@confluent.io>, Bill Bejeck <bill@confluent.io>
2020-01-08 15:28:51 -08:00
Ismael Juma 6dc6f6a60d
KAFKA-9324: Drop support for Scala 2.11 (KIP-531) (#7859)
* Adjust build and documentation.
* Use lambda syntax for SAM types in `core`, `streams-scala` and
`connect-runtime` modules.
* Remove `runnable` and `newThread` from `CoreUtils` as lambda
syntax for SAM types make them unnecessary.
* Remove stale comment in `FunctionsCompatConversions`,
`KGroupedStream`, `KGroupedTable' and `KStream` about Scala 2.11,
the conversions are needed for Scala 2.12 too.
* Deprecate `org.apache.kafka.streams.scala.kstream.Suppressed`
and use `org.apache.kafka.streams.kstream.Suppressed` instead.
* Use `Admin.create` instead of `AdminClient.create`. Static methods
in Java interfaces can be invoked since Scala 2.12. I noticed that
MirrorMaker 2 uses `AdminClient.create`, but I did not change them
as Connectors have restrictions on newer client APIs.
* Improve efficiency in a few `Gauge` implementations by avoiding
unnecessary intermediate collections.
* Remove pointless `Option.apply` in `ZookeeperClient`
`SessionState` metric.
* Fix unused import/variable and other compiler warnings.
* Reduce visibility of some vals/defs.

Reviewers: Manikumar Reddy <manikumar@confluent.io>, Guozhang Wang <wangguoz@gmail.com>, Gwen Shapira <gwen@confluent.io>
2020-01-06 19:51:01 +01:00
Matthias J. Sax 5a65da5fe9
MINOR: Kafka Streams Scala API cleanup (#7852)
Reviewers: Bill Bejeck <bill@confluent.io>
2019-12-20 13:18:16 -08:00
Alex Kokachev 7f5c380c34 KAFKA-9011: Removed multiple calls to supplier.get() in order to avoid multiple transformer instances being created. (#7685)
This is a followup PR for #7520 to address issue of multiple calls to get() as it was pointed out by @bbejeck in #7520 (comment)

Reviewers: Bill Bejeck <bbejeck@gmail.com>
2019-11-14 16:12:08 -05:00
Alex Kokachev 9a125a72a2 KAFKA-9011: Scala bindings for flatTransform and flatTransformValues in KStream (#7520)
Reviewers: John Roesler <john@confluent.io>, Bill Bejeck <bbejeck@gmail.com>
2019-11-12 10:23:24 -05:00
Matthias J. Sax 2421a69556 MINOR: Fix Kafka Streams JavaDocs with regard to new StreamJoined class (#7627)
Reviewers: Bruno Cadonna <bruno@confluent.io>, Bill Bejeck <bbejeck@gmail.com>
2019-11-01 17:14:57 -04:00
huxi 7c4b029df9 KAFKA-8944: Fixed KTable compiler warning. (#7393)
https://issues.apache.org/jira/browse/KAFKA-8944

Reviewers: Bill Bejeck <bbejeck@gmail.com>
2019-10-08 23:40:21 -04:00
Jukka Karvanen a5a6938c69 KAFKA-8233: TopologyTestDriver test input and output usability improvements (#7378)
Implements KIP-470

Reviewers: Bill Bejeck <bill@confluent.io>, John Roesler <john@confluent.io>, Matthias J. Sax <matthias@confluent.io>
2019-10-07 01:01:58 -07:00
Adam Bellemare c87fe9402c KAFKA-3705 Added a foreignKeyJoin implementation for KTable. (#5527)
https://issues.apache.org/jira/browse/KAFKA-3705

Allows for a KTable to map its value to a given foreign key and join on another KTable keyed on that foreign key. Applies the joiner, then returns the tuples keyed on the original key. This supports updates from both sides of the join.

Reviewers: Guozhang Wang <wangguoz@gmail.com>, Matthias J. Sax <mjsax@apache.org>,  John Roesler <john@confluent.io>, Boyang Chen <boyang@confluent.io>, Christopher Pettitt <cpettitt@confluent.io>, Bill Bejeck <bbejeck@gmail.com>, Jan Filipiak <Jan.Filipiak@trivago.com>, pgwhalen, Alexei Daniline
2019-10-03 18:59:31 -04:00
Bill Bejeck 6925775e63 KAFKA-8558: Add StreamJoined config object to join (#7285)
Reviewer: John Roesler <john@confluent.io>, Matthias J. Sax <matthias@confluent.io>
2019-10-02 20:32:18 -07:00
Michał Siatkowski 45c800ff01 KAFKA-8911: Using proper WindowSerdes constructors in their implicit definitions (#7352)
Detailed info is available in the ticket: https://issues.apache.org/jira/browse/KAFKA-8911

Briefly, implicit defs are calling empty constructors, which exists only for reflection object creation.
Therefore, while using the implicit definitons, a NPE occurs when Serde is called.

Reviewers: John Roesler <john@confluent.io>, Bill Bejeck <bbejeck@gmail.com>
2019-09-30 18:53:25 -04:00
Ismael Juma 6dd4ebcea7
MINOR: Make the build compile with Scala 2.13 (#6989)
Scala 2.13 support was added to build via #5454. This PR adjusts the code so that
it compiles with 2.11, 2.12 and 2.13.

Changes:
* Add `scala-collection-compat` dependency.
* Import `scala.collection.Seq` in a number of places for consistent behavior between
Scala 2.11, 2.12 and 2.13.
* Remove wildcard imports that were causing the Java classes to have priority over the
Scala ones, related Scala issue: https://github.com/scala/scala/pull/6589.
* Replace parallel collection usage with `Future`. The former is no longer included by
default in the standard library.
* Replace val _: Unit workaround with one that is more concise and works with Scala 2.13
* Replace `filterKeys` with `filter` when we expect a `Map`. `filterKeys` returns a view
that doesn't implement the `Map` trait in Scala 2.13.
* Replace `mapValues` with `map` or add a `toMap` as an additional transformation
when we expect a `Map`. `mapValues` returns a view that doesn't implement the
`Map` trait in Scala 2.13.
* Replace `breakOut` with `iterator` and `to`, `breakOut` was removed in Scala
2.13.
* Replace to() with toMap, toIndexedSeq and toSet
* Replace `mutable.Buffer.--` with `filterNot`.
* ControlException is an abstract class in Scala 2.13.
* Variable arguments can only receive arrays or immutable.Seq in Scala 2.13.
* Use `Factory` instead of `CanBuildFrom` in DecodeJson. `CanBuildFrom` behaves
a bit differently in Scala 2.13 and it's been deprecated. `Factory` has the behavior
we need and it's available via the compat library.
* Fix failing tests due to behavior change in Scala 2.13,
"Map.values.map is not strict in Scala 2.13" (https://github.com/scala/bug/issues/11589).
* Use Java collections instead of Scala ones in StreamResetter (a Java class).
* Adjust CheckpointFile.write to take an `Iterable` instead of `Seq` to avoid
unnecessary collection copies.
* Fix DelayedElectLeader to use a Map instead of Set and avoid `to` call that
doesn't work in Scala 2.13.
* Use unordered map for mapping in SimpleAclAuthorizer, mapping of ordered
maps require an `Ordering` in Scala 2.13 for safety reasons.
* Adapt `ConsumerGroupCommand` to compile with Scala 2.13.
* CoreUtils.min takes an `Iterable` instead of `TraversableOnce`, the latter does
not exist in Scala 2.13.
* Replace `Unit` with `()` in a couple places. Scala 2.13 is stricter when it expects
a value instead of a type.
* Fix bug in CustomQuotaCallbackTest where we did not necessarily set `partitionRatio`
correctly, `forall` can terminate early.
* Add a couple of spotbugs exclusions that are needed by code generated by Scala 2.13
* Remove unused variables, simplify some code and remove procedure syntax in a few
places.
* Remove unused `CoreUtils.JSONEscapeString`.

Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>, José Armando García Sancio <jsancio@users.noreply.github.com>
2019-07-02 06:29:39 -07:00
Matthias J. Sax 8a237f599a KAFKA-6455: Session Aggregation should use window-end-time as record timestamp (#6645)
For session-windows, the result record should have the window-end timestamp as record timestamp.

Rebased to resolve merge conflicts. Removed unused classes TupleForwarder and ForwardingCacheFlushListener (replace with TimestampedTupleForwarder, SessionTupleForwarder, TimestampedCacheFlushListerner, and SessionCacheFlushListener)

Reviewers: John Roesler <john@confluent.io>, Bruno Cadonna <bruno@confluent.io>, Boyang Chen <boyang@confluent.io>, Bill Bejeck <bill@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
2019-05-12 15:31:44 -07:00
Ismael Juma a37282415e
MINOR: Upgrade dependencies for Kafka 2.3 (#6665)
Many patch and minor updates.

Scalatest and Jetty deprecated classes that we
use. I removed usages for the former and filed KAFKA-8316 for the latter (I
suppressed the relevant deprecation warnings until the JIRA is fixed). As
part of the scalatest fixes, I also removed `TestUtils.fail` since it duplicates
`Assertions.fail`.

I also fixed a few compiler warnings that have crept in since my last sweep.

Updates of note:
- Jetty: 9.4.14 -> 9.4.18
  * https://github.com/eclipse/jetty.project/releases/tag/jetty-9.4.15.v20190215
  * https://github.com/eclipse/jetty.project/releases/tag/jetty-9.4.16.v20190411
  * https://github.com/eclipse/jetty.project/releases/tag/jetty-9.4.17.v20190418
  * https://github.com/eclipse/jetty.project/releases/tag/jetty-9.4.17.v20190418
  * https://github.com/eclipse/jetty.project/releases/tag/jetty-9.4.18.v20190429
- zstd: 1.3.8-1 -> 1.4.0-1
  * https://github.com/facebook/zstd/releases/tag/v1.4.0
  * zstd's fastest strategy, 6-8% faster in most scenarios
- zookeeper: 3.4.13 -> 3.4.14
  * https://zookeeper.apache.org/doc/r3.4.14/releasenotes.html

### Committer Checklist (excluded from commit message)
- [ ] Verify design and implementation 
- [ ] Verify test coverage and CI build status
- [ ] Verify documentation (including upgrade notes)

Reviewers: Manikumar Reddy <manikumar.reddy@gmail.com>
2019-05-03 10:35:07 -07:00
Casey Green b7d7f7590d KAFKA-7778: Add KTable.suppress to Scala API (#6314)
Detailed description

* Adds KTable.suppress to the Scala API.
* Fixed count in KGroupedStream, SessionWindowedKStream, and TimeWindowedKStream so that the value serde gets passed down to the KTable returned by the internal mapValues method.
* Suppress API support for Java 1.8 + Scala 2.11

Testing strategy

I added unit tests covering:

* Windowed KTable.count.suppress w/ Suppressed.untilTimeLimit
* Windowed KTable.count.suppress w/ Suppressed.untilWindowCloses
* Non-windowed KTable.count.suppress w/ Suppressed.untilTimeLimit
* Session-windowed KTable.count.suppress w/ Suppressed.untilWindowCloses

Reviewers: John Roesler <john@confluent.io>, Guozhang Wang <guozhang@confluent.io>
2019-04-15 16:27:19 -07:00
Bill Bejeck c74acb24eb
MINOR: Remove line for testing repartition topic name (#6488)
With KIP-307 joined.name() is deprecated plus we don't need to test for repartition topic names.
Reviewers: Matthias J. Sax <mjsax@apache.org>
2019-03-24 12:47:58 -04:00
Massimo Siani 853f24a4a1 KAFKA-7027: Add an overload build method in scala (#6373)
The Java API can pass a Properties object to StreamsBuilder#build, to allow, e.g., topology optimization, while the Scala API does not yet. The latter only delegates the work to the underlying Java implementation.

Reviewers: John Roesler <john@confluent.io>,  Bill Bejeck <bbejeck@gmail.com>
2019-03-15 10:56:48 -04:00
Matthias J. Sax 0ef2b4ccce MINOR: fix Scala compiler warning (#6417)
Reviewers: Guozhang Wang <wangguoz@gmail.com>,  Bill Bejeck <bbejeck@gmail.com>
2019-03-09 08:09:07 -05:00
John Roesler ccd3af1566 Minor resolve streams scala warnings (#6369)
Resolves the compiler warnings when building streams-scala.

Reviewers: A. Sophie Blee-Goldman <ableegoldman@gmail.com>, Ismael Juma <ismael@juma.me.uk>, Guozhang Wang <wangguoz@gmail.com>
2019-03-07 15:05:35 -08:00
Matthias J. Sax 240d7589d6 MINOR: improve JavaDocs about global state stores (#6359)
Improve JavaDocs about global state stores.

Reviewers: Guozhang Wang <wangguoz@gmail.com>, Sophie Blee-Goldman <sophie@confluent.io>,  Bill Bejeck <bbejeck@gmail.com>
2019-03-06 21:19:47 -05:00
Ismael Juma 12f310d50e
KAFKA-7612: Fix javac warnings and enable warnings as errors (#5900)
- Use Xlint:all with 3 exclusions (filed KAFKA-7613 to remove the exclusions)
- Use the same javac options when compiling tests (seems accidental that
we didn't do this before)
- Replaced several deprecated method calls with non-deprecated ones:
  - `KafkaConsumer.poll(long)` and `KafkaConsumer.close(long)`
  - `Class.newInstance` and `new Integer/Long` (deprecated since Java 9)
  - `scala.Console` (deprecated in Scala 2.11)
  - `PartitionData` taking a timestamp (one of them seemingly a bug)
  - `JsonMappingException` single parameter constructor
- Fix unnecessary usage of raw types in several places.
- Add @SuppressWarnings for deprecations, unchecked and switch fallthrough in
several places.
- Scala clean-ups (var -> val, ETA expansion warnings, avoid reflective calls)
- Use lambdas to simplify code in a few places
- Add @SafeVarargs, fix varargs usage and remove unnecessary `Utils.mkList` method

Reviewers: Matthias J. Sax <mjsax@apache.org>, Manikumar Reddy <manikumar.reddy@gmail.com>, Randall Hauch <rhauch@gmail.com>, Bill Bejeck <bill@confluent.io>, Stanislav Kozlovski <stanislav_kozlovski@outlook.com>
2018-11-12 22:18:59 -08:00
Bill Bejeck 86b1150e18 MINOR: Update Streams Scala API for addition of Grouped (#5793)
While working on the documentation updates I realized the Streams Scala API needs
to get updated for the addition of Grouped

Added a test for Grouped.scala ran all streams-scala tests and streams tests

Reviewers: Matthias J. Sax <matthias@confluent.io>, John Roesler <john@confluent.io>, Guozhang Wang <guozhang@confluent.io>
2018-10-16 07:24:50 -07:00
John Roesler f047e1c178 MINOR: fix non-deterministic streams-scala tests (#5792)
Stop using current system time by default, as it introduces non-determinism.

Reviewers: Matthias J. Sax <matthias@confluent.io>, Bill Bejeck <bill@confluent.io>, Guozhang Wang <guozhang@confluent.io>
2018-10-15 17:47:20 -07:00
Nikolay ca641b3e2e KAFKA-7277: Migrate Streams API to Duration instead of longMs times (#5682)
Reviewers: Johne Roesler <john@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Bill Bejeck <bill@confluent.io>, Guozhang Wang <guozhang@confluent.io>
2018-10-04 13:51:39 -07:00
Joan Goyeau af80dccc7d KAFKA-7399: KIP-366, Make FunctionConversions deprecated (#5562)
Reviewers: John Roesler <vvcephei@users.noreply.github.com>, Guozhang Wang <wangguoz@gmail.com>
2018-09-19 09:10:15 -07:00
John Roesler b18c8dda52 MINOR: fix scala serde tests (#5644)
Reviewers: Guozhang Wang <wangguoz@gmail.com>
2018-09-13 08:29:07 -07:00
John Roesler 9dac615d22 KAFKA-7386: streams-scala should not cache serdes (#5622)
Currently, scala.Serdes.String, for example, invokes Serdes.String() once and caches the result.

However, the implementation of the String serde has a non-empty configure method that is variant in whether it's used as a key or value serde. So we won't get correct execution if we create one serde and use it for both keys and values.

Reviewers: Bill Bejeck <bill@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
2018-09-11 16:17:47 -07:00
Joan Goyeau acd3858ea6 KAFKA-7396: Materialized, Serialized, Joined, Consumed and Produced with implicit Serdes (#5551)
We want to make sure that we always have a serde for all Materialized, Serialized, Joined, Consumed and Produced.
For that we can make use of the implicit parameters in Scala.

KIP: https://cwiki.apache.org/confluence/display/KAFKA/KIP-365%3A+Materialized%2C+Serialized%2C+Joined%2C+Consumed+and+Produced+with+implicit+Serde

Reviewers: John Roesler <vvcephei@users.noreply.github.com>, Matthias J. Sax <matthias@confluent.io>, Bill Bejeck <bbejeck@gmail.com>, Guozhang Wang <guozhang@confluent.io>, Ted Yu <yuzhihong@gmail.com>
2018-09-11 14:08:42 -07:00
Ewen Cheslack-Postava 0d535b167a MINOR: Mark new Scala streams tests as integration tests (KIP-270 follow-up) (#5631)
Reviewers: Eno Thereska <eno.thereska@gmail.com>, John Roesler <vvcephei@users.noreply.github.com>, Guozhang Wang <wangguoz@gmail.com>
2018-09-10 09:53:02 -07:00
tedyu 34d9ae6628 MINOR: Fix streams Scala peek recursive call (#5566)
This PR fixes the previously recursive call of Streams Scala peek

Reviewers: Joan Goyeau <joan@goyeau.com>, Guozhang Wang <guozhang@confluent.io>, John Roesler <john@confluent.io>
2018-08-29 09:34:01 -07:00
Joan Goyeau 7d2b95895a MINOR: Correct folder for package object scala (#5573)
Reviewers: Guozhang Wang <guozhang@confluent.io>, John Roesler <john@confluent.io>
2018-08-27 18:44:12 -07:00
Joan Goyeau b8559de23d MINOR: Fix streams Scala foreach recursive call (#5539)
Reviewers: Guozhang Wang <guozhang@confluent.io>, John Roesler <john@confluent.io>
2018-08-23 17:07:39 -07:00
Joan Goyeau d8f9f278a2 KAFKA-7316: Fix Streams Scala filter recursive call #5538
Due to lack of conversion to kstream Predicate, existing filter method in KTable.scala would result in StackOverflowError.

This PR fixes the bug and adds testing for it.

Reviewers: Guozhang Wang <guozhang@confluent.io>, John Roesler <john@confluent.io>
2018-08-23 09:15:27 -07:00
Joan Goyeau dae9c41838 KAFKA-7301: Fix streams Scala join ambiguous overload (#5502)
Join in the Scala streams API is currently unusable in 2.0.0 as reported by @mowczare:
#5019 (comment)

This due to an overload of it with the same signature in the first curried parameter.
See compiler issue that didn't catch it: https://issues.scala-lang.org/browse/SI-2628

Reviewers: Debasish Ghosh <dghosh@acm.org>, Guozhang Wang <guozhang@confluent.io>, John Roesler <john@confluent.io>
2018-08-21 15:41:36 -07:00
John Roesler b1539ff62d KAFKA-7250: switch scala transform to TransformSupplier (#5481)
#5468 introduced a breaking API change that was actually avoidable. This PR re-introduces the old API as deprecated and alters the API introduced by #5468 to be consistent with the other methods

also, fixed misc syntax problems
2018-08-09 10:11:48 -07:00
Manikumar Reddy O a9d7f8a1fd MINOR: Fix Streams scala format violations (#5472)
Reviewers: Guozhang Wang <wangguoz@gmail.com>
2018-08-07 12:50:42 -07:00
Michal Dziemianko ed13d7eebb KAFKA-7250: fix transform function in scala DSL to accept TranformerSupplier (#5468)
Reviewers: Guozhang Wang <wangguoz@gmail.com>
2018-08-07 08:00:22 -07:00
Guozhang Wang afe00effe2
KAFKA-3514: Part II, Choose tasks with data on all partitions to process (#5398)
1. In each iteration, decide if a task is processable if all of its partitions contains data, so it can decide which record to process next.

1.a Add one exception that, if the task indeed have data on some but not all of its partitions, we only consider as not processable for some finite round of iterations.
1.b Add a task-level metric to record whenever we are forced to process a task that is only "partially data available", since it may leads to non-determinism.

2. Break the main loop on put-raw-data and process-them. Since now not all data put into the queue would be processed completely within a single iteration.

3. NOTE that within an iteration, if a task has exhausted one of its queue it will still be processed, since we only update processable list once in each iteration, I'm improving on this on the follow-up part III PR.

4. Found and fixed a bug in metrics recording: the taskName and sensorName parameters were exchanged.

5. Optimized task stream time computation again since our current partition stream time reasoning has been simplified.

6. Added unit tests.

Reviewers: Matthias J. Sax <matthias@confluent.io>, John Roesler <vvcephei@users.noreply.github.com>, Bill Bejeck <bbejeck@gmail.com>
2018-08-02 18:34:53 -07:00
Guozhang Wang 75825caee4
KAFKA-5037 Follow-up: move Scala test to Java (#5399)
Reviewers: Ted Yu <yuzhihong@gmail.com>, Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>
2018-07-20 10:37:56 -07:00
Manikumar Reddy O 9089fb2d82 MINOR: Fix format violations streams scala tests (#5402)
@guozhangwang @mjsax hot fix for streams scala test format violations

Reviewers: Guozhang Wang <wangguoz@gmail.com>
2018-07-20 08:20:36 -07:00
Ted Yu 82f124ae30 KAFKA-5037: Fix infinite loop if all input topics are unknown at startup
1. At the beginning of assign, we first check that all the non-repartition source topics are included in the metadata. If not, we log an error at the leader and set an error in the Assignment userData bytes, indicating that leader cannot complete assignment and the error code would indicate the root cause of it.

2. Upon receiving the assignment, if the error is not NONE the streams will shutdown itself with a log entry re-stating the root cause interpreted from the error code.

Author: tedyu <yuzhihong@gmail.com>

Reviewers: Matthias J. Sax <mjsax@apache.org>, Guozhang Wang <wangguoz@gmail.com>

Closes #5322 from tedyu/trunk
2018-07-19 15:22:53 -07:00
Joan Goyeau 05c5854d1f MINOR: Add Scalafmt to Streams Scala API (#4965)
Reviewers: Guozhang Wang <wangguoz@gmail.com>
2018-07-09 16:48:34 -07:00
Joan Goyeau ad56f04af9 KAFKA-6936: Implicit materialized for aggregate, count and reduce (#5066)
In #4919 we propagate the SerDes for each of these aggregation operators.

As @guozhangwang mentioned in that PR:

```
reduce: inherit the key and value serdes from the parent XXImpl class.
count: inherit the key serdes, enforce setting the Serdes.Long() for value serdes.
aggregate: inherit the key serdes, do not set for value serdes internally.
```

Although it's all good for reduce and count, it is quiet unsafe to have aggregate without Materialized given. In fact I don't see why we would not give a Materialized for the aggregate since the result type will always be different (otherwise use reduce) and also the value Serde is simply not propagated.

This has been discussed previously in a broader PR before but I believe for aggregate we could pass implicitly a Materialized the same way we pass a Joined, just to avoid the stupid case. Then if the user wants to specialize, he can give his own Materialized.

Reviewers: Debasish Ghosh <dghosh@acm.org>, Guozhang Wang <guozhang@confluent.io>
2018-05-31 17:19:37 -07:00