639 Commits

Author SHA1 Message Date
Matthias J. Sax 6a7652fe0f
MINOR: remove idempotent statement (#5659)
Reviewers: John Roesler <john@confluent.io>, Bill Bejeck <bill@confluent.io>, Guozhang Wang <guozhang@confluent.io>
2018-09-19 17:17:28 -07:00
Matthias J. Sax 4c9d49bd3b
KAFKA-7192: Wipe out state store if EOS is turned on and checkpoint file does not exist (#5641)
Reviewers: Guozhang Wang <guozhang@confluent.io>, Bill Bejeck <bill@confluent.io>, John Roesler <john@confluent.io>
2018-09-14 12:05:24 -07:00
Bill Bejeck 3792111347 MINOR: Update streams upgrade system tests 0.11.0.3 (#5613)
This is a port of #5605 for the 11.3 branch

Reviewers: John Roesler <john@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
2018-09-06 13:06:35 -07:00
John Roesler 42f0784991 KAFKA-7284: streams should unwrap fenced exception (#5520)
Reviewers: Matthias J. Sax <matthias@confluent.io>
2018-08-21 10:07:30 -07:00
Matthias J. Sax 4425c8ffbb MINOR: Caching layer should forward record timestamp (#5423) (#5426)
Reviewer: Guozhang Wang <guozhang@confluent.io>
2018-08-01 13:36:34 -07:00
Matthias J. Sax 9383567d1e Bump version to 0.11.0.4-SNAPSHOT 2018-07-02 10:01:59 -07:00
Matthias J. Sax 26ddb9e319 Bump version to 0.11.0.3 2018-06-22 12:12:57 -07:00
Guozhang Wang 88529006b4
KAFKA-7021: checkpoint offsets from committed (#5232)
This is a cherry-pick PR from #5207

1. add the committed offsets to checkpointable offset map.

2. add the restoration integration test for the source KTable case.
2018-06-14 22:21:49 -07:00
Damian Guy 42a82ac4e9 KAFKA-6360: Clear RocksDB Segments when store is closed
Now that we support re-initializing state stores, we need to clear the segments when the store is closed so that they can be re-opened.

Author: Damian Guy <damian.guy@gmail.com>

Reviewers: Bill Bejeck <bbejeck@gmail.com>, Guozhang Wang <wangguoz@gmail.com>, Ted Yu <yuzhihong@gmail.com>

Closes #4324 from dguy/kafka-6360
2018-06-14 22:07:39 -07:00
Matthias J. Sax a8e48b3f95 KAFKA-6711: GlobalStateManagerImpl should not write offsets of in-memory stores in checkpoint file (#5219) 2018-06-14 14:16:46 -07:00
Jagadesh Adireddi 51f585dee6 KAFKA-6906: Fixed to commit transactions if data is produced via wall clock punctuation (#5105)
Reviewers: Matthias J. Sax <matthias@confluent.io>, Bill Bejeck <bill@confluent.io>, Guozhang Wang <guozhang@confluent.io>
2018-06-12 19:10:38 -07:00
Guozhang Wang 341fd7b533 KAFKA-6634: Delay starting new transaction in task.initializeTopology (#4684)
As titled, do not start a new transaction before restoration: during restoration the producer would have no activity, which may cause the transaction to expire. Also delay starting a new txn when resuming until the topology is initialized.

Reviewers: Matthias J. Sax <mjsax@apache.org>, Bill Bejeck <bill@confluent.io>
2018-06-12 17:54:48 -07:00
Bill Bejeck fac2fa0f46 KAFKA-6205: initialize topology after state stores restoration completed
Initialize topology after state store restoration.
Although IMHO updating some of the existing tests demonstrates the correct order of operations, I'll probably add an integration test, but I wanted to get this PR in for feedback on the approach.

Author: Bill Bejeck <bill@confluent.io>

Reviewers: Guozhang Wang <wangguoz@gmail.com>, Matthias J. Sax <mjsax@apache.org>

Closes #4415 from bbejeck/KAFKA-6205_restore_state_stores_before_initializing_topology

minor log4j edits
2018-06-12 17:24:01 -07:00
Gitomain 9bf277bc1a KAFKA-6782: solved the bug of restoration of aborted messages for GlobalStateStore and KGlobalTable (#4900)
Reviewers: Matthias J. Sax <matthias@confluent.io>, Bill Bejeck <bill@confluent.io>, Guozhang Wang <guozhang@confluent.io>
2018-06-12 13:31:56 -07:00
tedyu 9c9657b102 KAFKA-6747 Check whether there is in-flight transaction before aborting transaction (#4826)
As Frederic reported on mailing list under the subject "kafka-streams Invalid transition attempted from state READY to state ABORTING_TRANSACTION", producer#abortTransaction should only be called when transactionInFlight is true.
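
The guard described above can be sketched as follows. This is a minimal illustration, not the Streams code: `TxProducer`, `CountingProducer`, and `GuardedTransaction` are hypothetical names, and only the `transactionInFlight` flag comes from the commit message.

```java
// Hypothetical stand-in for a transactional producer; this is not the Kafka client API.
interface TxProducer {
    void beginTransaction();
    void abortTransaction();
}

// Illustrative test double that counts how often abort is called.
class CountingProducer implements TxProducer {
    int aborts = 0;

    @Override
    public void beginTransaction() { }

    @Override
    public void abortTransaction() { aborts++; }
}

// Tracks whether a transaction is open and only aborts when one is.
class GuardedTransaction {
    private final TxProducer producer;
    private boolean transactionInFlight = false;

    GuardedTransaction(TxProducer producer) {
        this.producer = producer;
    }

    void begin() {
        producer.beginTransaction();
        transactionInFlight = true;
    }

    // Returns true if an abort was actually issued, false if there was nothing to abort.
    boolean abortIfInFlight() {
        if (!transactionInFlight) {
            return false;
        }
        producer.abortTransaction();
        transactionInFlight = false;
        return true;
    }
}
```

Calling `abortIfInFlight()` before `begin()` is then a no-op, so the producer is never asked to abort from a state with no open transaction.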

Reviewers: Guozhang Wang <wangguoz@gmail.com>, Matthias J. Sax <matthias@confluent.io>
2018-06-12 10:44:52 -07:00
John Roesler b5b795b302 KAFKA-6925: fix parentSensors memory leak (#5108) (#5120)
Previously, we failed to remove sensors from the parentSensors map, effectively a memory leak.

Add a test to verify that removed sensors get removed from the underlying registry as well as the parentSensors map.
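
A simplified sketch of the kind of leak described above, assuming sensors are identified by plain strings. `SensorRegistrySketch` and its methods are hypothetical and do not mirror the real metrics classes; only the `parentSensors` map name comes from the commit message.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical, string-keyed model of a sensor registry with a child -> parent map.
class SensorRegistrySketch {
    private final Set<String> registry = new HashSet<>();
    private final Map<String, String> parentSensors = new HashMap<>();

    void addSensor(String child, String parent) {
        registry.add(parent);
        registry.add(child);
        parentSensors.put(child, parent);
    }

    // Removes the child from the registry AND from parentSensors.
    // Skipping the parentSensors.remove(...) call is the leak: the map would grow forever.
    void removeSensor(String child) {
        registry.remove(child);
        String parent = parentSensors.remove(child);
        if (parent != null) {
            registry.remove(parent);
        }
    }

    int registrySize() {
        return registry.size();
    }

    int parentMapSize() {
        return parentSensors.size();
    }
}
```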

Reviewers: Guozhang Wang <wangguoz@gmail.com>
2018-06-03 10:58:34 -07:00
Matthias J. Sax 88da81b945
MINOR: StreamsConfig `upgrade.from` should not be in list of deprecated configs (#4780)
Reviewers: Bill Bejeck <bill@confluent.io>, John Roesler <john@confluent.io>, Guozhang Wang <guozhang@confluent.io>
2018-03-27 13:46:08 -07:00
Matthias J. Sax 13dbcad9bb
KAFKA-6054: Fix upgrade path from Kafka Streams v0.10.0 (#4761)
Introduces new config parameter `upgrade.from`.

Reviewers: Guozhang Wang <guozhang@confluent.io>, Bill Bejeck <bill@confluent.io>
2018-03-26 20:32:20 -07:00
Yaswanth Kumar e37c2ddf2e KAFKA-6536: Adding versions for japicmp-maven-plugin and maven-shade-plugin in quickstart (#4569)
Author: Yaswanth Kumar <yash27422@gmail.com>

Reviewers: Guozhang Wang <guozhang@confluent.io>, Matthias J. Sax <matthias@confluent.io>
2018-02-16 13:27:25 -08:00
Matthias J. Sax cc15da1d77 MINOR: Improve EOS related config docs
Author: Matthias J. Sax <matthias@confluent.io>

Reviewers: Jason Gustafson <jason@confluent.io>, Guozhang Wang <wangguoz@gmail.com>

Closes #4284 from mjsax/minor-improve-eos-docs
2017-12-01 14:45:55 -08:00
Matthias J. Sax c89c6b8736 MINOR: improve flaky Streams tests
Use TestUtil test directory for state directory instead of default /tmp/kafka-streams

Author: Matthias J. Sax <matthias@confluent.io>

Reviewers: Damian Guy <damian.guy@gmail.com>

Closes #4246 from mjsax/improve-flaky-streams-tests
2017-11-22 10:55:42 +00:00
Matthias J. Sax 4b5c82a8ee MINOR: improve StateStore JavaDocs
Clarify that state directory must use `storeName`

Author: Matthias J. Sax <matthias@confluent.io>

Reviewers: Bill Bejeck <bill@confluent.io>, Guozhang Wang <wangguoz@gmail.com>

Closes #4228 from mjsax/minor-state-store-javadoc

(cherry picked from commit b604540fbd)
Signed-off-by: Guozhang Wang <wangguoz@gmail.com>
2017-11-17 11:49:28 -08:00
Rajini Sivaram 70d7d180b0 MINOR: Update version numbers to 0.11.0.3-SNAPSHOT 2017-11-16 10:51:22 +00:00
Rajini Sivaram 73be1e1168 Bump version to 0.11.0.2 2017-11-10 23:47:23 +00:00
Alex Good 1321d89484 KAFKA-6190: Use consumer.position() instead of record.offset() to advance in GlobalKTable restoration to avoid transactional control messages
Calculate offset using consumer.position() in GlobalStateManagerImp#restoreState
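
Why the two offsets can differ is easier to see with a toy model. The sketch below does not use the Kafka client: `FakeLogConsumer` is a hypothetical class that models a log where some offsets hold transactional control markers, which are never returned to the application but still advance the position.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a consumer over a log; index = offset, true = control marker.
class FakeLogConsumer {
    private final boolean[] isControlMarker;
    private int position = 0;

    FakeLogConsumer(boolean[] isControlMarker) {
        this.isControlMarker = isControlMarker;
    }

    // Returns the offsets of data records only; control markers are skipped
    // but the position still moves past them.
    List<Integer> poll() {
        List<Integer> dataOffsets = new ArrayList<>();
        while (position < isControlMarker.length) {
            if (!isControlMarker[position]) {
                dataOffsets.add(position);
            }
            position++;
        }
        return dataOffsets;
    }

    // Next offset that would be fetched.
    int position() {
        return position;
    }

    // One past the last offset in the log.
    int endOffset() {
        return isControlMarker.length;
    }
}
```

With a log of two data records followed by one marker, the last record's offset plus one is 2 while `position()` and `endOffset()` are both 3, so a restore loop that compares the record offset against the end offset would never see itself as caught up.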

Author: Alex Good <alexjsgood@gmail.com>

Reviewers: Matthias J. Sax <matthias@confluent.io>, Damian Guy <damian.guy@gmail.com>, Guozhang Wang <wangguoz@gmail.com>

Closes #4197 from alexjg/0.11.0
2017-11-09 15:30:53 -08:00
Guozhang Wang 1a0c006983 KAFKA-6179: Clear min timestamp tracker upon partition queue cleanup
Author: Guozhang Wang <wangguoz@gmail.com>

Reviewers: Matthias J. Sax <matthias@confluent.io>, Bill Bejeck <bill@confluent.io>, Damian Guy <damian.guy@gmail.com>

Closes #4186 from guozhangwang/K6179-cleanup-timestamp-tracker-on-clear

(cherry picked from commit ee1aaa091f)
Signed-off-by: Guozhang Wang <wangguoz@gmail.com>
2017-11-08 15:07:35 -08:00
Guozhang Wang 176ff0d692 HOTFIX: Remove sysout logging
Author: Guozhang Wang <wangguoz@gmail.com>

Reviewers: Damian Guy <damian.guy@gmail.com>, Xavier Léauté <xavier@confluent.io>

Closes #4130 from guozhangwang/KHotfix-0110-remove-logging
2017-10-25 08:31:46 -07:00
Guozhang Wang 472c8974f2 HOTFIX: poll with zero millis during restoration
Mirror of #4096 for 0.11.01.

During the restoration phase, while the thread state is still PARTITIONS_ASSIGNED and not yet RUNNING, call poll() on the normal consumer with a 0 millisecond timeout to unblock the restoration of other tasks as soon as possible.

Author: Guozhang Wang <wangguoz@gmail.com>

Reviewers: Bill Bejeck <bill@confluent.io>, Damian Guy <damian.guy@gmail.com>, Matthias J. Sax <matthias@confluent.io>, Xavier Léauté <xavier@confluent.io>

Closes #4085 from guozhangwang/KHotfix-0110-restore-only
2017-10-23 22:52:35 -07:00
Guozhang Wang 33640106b4 KAFKA-5140: Fix reset integration test
A couple of root causes of this flaky test are fixed:

1. The MockTime was incorrectly used across multiple test methods within the class, as a class rule. Instead we set it on each test case; also remove the Scala MockTime dependency.

2. List topics may not contain the deleted topics while their ZK paths are yet to be deleted; so the delete-check-recreate pattern may fail to successfully recreate the topic at all. Change the checking to read from zk path directly instead.

Another minor fix is to remove the misleading wait condition error message as the accumData is always empty.

Author: Guozhang Wang <wangguoz@gmail.com>

Reviewers: Bill Bejeck <bill@confluent.io>, Damian Guy <damian.guy@gmail.com>, Matthias J. Sax <matthias@confluent.io>

Closes #4095 from guozhangwang/KMinor-reset-integration-test

(cherry picked from commit d3f24798f9)
Signed-off-by: Guozhang Wang <wangguoz@gmail.com>
2017-10-23 12:41:35 -07:00
siva santhalingam 51ea8e76ba KAFKA-5967; Ineffective check of negative value in CompositeReadOnlyKeyValueStore#approximateNumEntries()
package name: org.apache.kafka.streams.state.internals
Minor change to approximateNumEntries() method in CompositeReadOnlyKeyValueStore class.

long total = 0;
for (ReadOnlyKeyValueStore<K, V> store : stores) {
    total += store.approximateNumEntries();
}

return total < 0 ? Long.MAX_VALUE : total;

The check for negative value seems to account for wrapping. However, wrapping can happen within the for loop. So the check should be performed inside the loop.
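
A minimal sketch of the in-loop check suggested above, using a plain `long[]` of per-store counts instead of the store interface. `ApproximateCountSketch` is a hypothetical name, and the sketch assumes every individual count is non-negative.

```java
// Hypothetical helper: sums non-negative counts and saturates at Long.MAX_VALUE on overflow.
class ApproximateCountSketch {
    static long total(long[] perStoreCounts) {
        long total = 0;
        for (long count : perStoreCounts) {
            total += count;
            // Check after every addition: a later addition could wrap a negative
            // intermediate sum back to a positive value and hide the overflow.
            if (total < 0) {
                return Long.MAX_VALUE;
            }
        }
        return total;
    }
}
```

For example, summing three counts of `Long.MAX_VALUE` wraps twice and ends on a positive number, so a single check after the loop would miss the overflow, whereas the in-loop check returns `Long.MAX_VALUE` after the second addition.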

Author: siva santhalingam <ssanthalingam@netskope.com>

Reviewers: Matthias J. Sax <matthias@confluent.io>, Damian Guy <damian.guy@gmail.com>

Closes #3988 from shivsantham/trunk

(cherry picked from commit 5afeddaa99)
Signed-off-by: Damian Guy <damian.guy@gmail.com>
2017-10-04 10:15:02 -07:00
Damian Guy fae2d23868 KAFKA-5986; Streams State Restoration never completes when logging is disabled
When logging is disabled and there are state stores the task never transitions from restoring to running. This is because we only ever check if the task has state stores and return false on initialization if it does. The check should be if we have changelog partitions, i.e., we need to restore.

Author: Damian Guy <damian.guy@gmail.com>

Reviewers: Matthias J. Sax <matthias@confluent.io>, Bill Bejeck <bill@confluent.io>, tedyu <yuzhihong@gmail.com>, Ismael Juma <ismael@juma.me.uk>

Closes #3983 from dguy/restore-test

(cherry picked from commit 3107a6c5c8)
Signed-off-by: Damian Guy <damian.guy@gmail.com>
2017-09-29 15:24:32 +01:00
Damian Guy b95a6bf61a MINOR: Bump version in streams quickstart archetype pom.xml
Author: Damian Guy <damian.guy@gmail.com>

Reviewers: Ismael Juma <ismael@juma.me.uk>

Closes #3857 from dguy/fix-archetype-version
2017-09-14 12:01:13 +01:00
Damian Guy c3c7674afb MINOR: Update version numbers to 0.11.0.2-SNAPSHOT 2017-09-12 14:07:42 +01:00
Guozhang Wang 5f0c060e0c KAFKA-5797: Handle metadata not available in store registration
This is the backport of #3748 for trunk

Author: Guozhang Wang <wangguoz@gmail.com>

Reviewers: Damian Guy <damian.guy@gmail.com>

Closes #3806 from guozhangwang/K5797-handle-metadata-available-0110
2017-09-08 09:07:00 -07:00
Guozhang Wang fc3eeb0047 HOTFIX: remove unused imports 2017-09-05 16:09:10 -07:00
Guozhang Wang 426057094c MINOR: logging improvements on StreamThread
This is a manual cherry-pick of https://github.com/apache/kafka/pull/3769 for 0.11.0

Author: Guozhang Wang <wangguoz@gmail.com>

Reviewers: Damian Guy <damian.guy@gmail.com>, Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>

Closes #3771 from guozhangwang/KMinor-logging-improvements
2017-09-05 16:07:13 -07:00
Damian Guy 90b6d978e8 MINOR: add mvn-pgp-plugin to sign streams quickstart jars
Author: Damian Guy <damian.guy@gmail.com>

Reviewers: Guozhang Wang <wangguoz@gmail.com>

Closes #3793 from dguy/sign-mvn-jars

(cherry picked from commit d78eb03fad)
Signed-off-by: Guozhang Wang <wangguoz@gmail.com>
2017-09-05 10:51:16 -07:00
Matthias J. Sax 7b6e5f96ce KAFKA-5818: KafkaStreams state transitions not correct
Author: Matthias J. Sax <matthias@confluent.io>

Reviewers: Ted Yu <yuzhihong@gmail.com>, Guozhang Wang <wangguoz@gmail.com>

Closes #3779 from mjsax/kafka-5818-kafkaStreams-state-transition-01101
2017-09-03 15:25:26 -07:00
Damian Guy abc66c98ca KAFKA-5787; StoreChangelogReader needs to restore partitions that were added post initialization
If a task fails during initialization due to a LockException, its changelog partitions are not immediately added to the StoreChangelogReader, as the thread doesn't hold the lock. However, StoreChangelogReader#restore will still be called, and it sets the initialized flag. On a subsequent successful call to initialize the new tasks, the partitions are added to the StoreChangelogReader; however, as it is already initialized, these new partitions will never be restored. So the task would remain in a non-running state forever.

Author: Damian Guy <damian.guy@gmail.com>

Reviewers: Guozhang Wang <wangguoz@gmail.com>

Closes #3747 from dguy/kafka-5787-0.11
2017-08-29 09:43:29 +01:00
radzish 1425758915 KAFKA-5771; org.apache.kafka.streams.state.internals.Segments#segments method returns incorrect results when segments were added out of order
Suggested fix for the bug

Author: radzish <radzish@gmail.com>

Reviewers: Damian Guy <damian.guy@gmail.com>

Closes #3737 from radzish/KAFKA-5771

(cherry picked from commit 05e3850b2e)
Signed-off-by: Damian Guy <damian.guy@gmail.com>
2017-08-25 13:59:25 +01:00
Matthias J. Sax b872abf69b KAFKA-5603; Don't abort TX for zombie tasks
Author: Matthias J. Sax <matthias@confluent.io>

Reviewers: Bill Bejeck <bill@confluent.io>, Damian Guy <damian.guy@gmail.com>

Closes #3722 from mjsax/kafka-5603-dont-abort-tx-for-zombie-tasks-01101
2017-08-24 10:47:50 +01:00
Damian Guy b268322ed7 KAFKA-5152; move state restoration out of rebalance and into poll loop
In `onPartitionsAssigned`:
1. release all locks for non-assigned suspended tasks.
2. resume any suspended tasks.
3. Create new tasks, but don't attempt to take the state lock.
4. Pause partitions for any new tasks.
5. set the state to `PARTITIONS_ASSIGNED`

In `StreamThread#runLoop`
1. poll
2. if state is `PARTITIONS_ASSIGNED`
 2.1 attempt to initialize any new tasks, i.e., take out the state locks and init state stores
 2.2 restore some data for changelogs, i.e., poll once on the restore consumer and return the partitions that have been fully restored
 2.3 update tasks with restored partitions and move any that have completed restoration to running
 2.4 resume consumption for any tasks where all partitions have been restored.
 2.5 if all active tasks are running, transition to `RUNNING` and assign standby partitions to the restoreConsumer.
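
The run-loop steps above can be sketched as a small state machine. This is a heavily simplified illustration under stated assumptions: `RestoreLoopSketch`, `Task`, and the per-iteration `batch` size are hypothetical, locks and consumers are left out, and only the `PARTITIONS_ASSIGNED` and `RUNNING` state names come from the commit message.

```java
import java.util.List;

// Hypothetical model: each loop iteration restores a bounded chunk per task
// instead of blocking inside the rebalance callback until everything is restored.
class RestoreLoopSketch {
    enum State { PARTITIONS_ASSIGNED, RUNNING }

    static final class Task {
        final String id;
        long remaining;   // changelog records still to restore
        boolean running;  // true once restoration has completed

        Task(String id, long remaining) {
            this.id = id;
            this.remaining = remaining;
            this.running = remaining == 0;
        }
    }

    private final List<Task> tasks;
    private State state = State.PARTITIONS_ASSIGNED;

    RestoreLoopSketch(List<Task> tasks) {
        this.tasks = tasks;
    }

    // One iteration of the loop while in PARTITIONS_ASSIGNED.
    void runOnce(long batch) {
        if (state != State.PARTITIONS_ASSIGNED) {
            return;
        }
        boolean allRunning = true;
        for (Task task : tasks) {
            if (!task.running) {
                task.remaining = Math.max(0, task.remaining - batch);
                if (task.remaining == 0) {
                    task.running = true; // fully restored: task may resume consumption
                }
            }
            allRunning &= task.running;
        }
        if (allRunning) {
            state = State.RUNNING;
        }
    }

    State state() {
        return state;
    }
}
```

The point of the design is that a task with a small changelog starts running after the first iteration, while the thread as a whole only moves to `RUNNING` once every task has finished restoring.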

Author: Damian Guy <damian.guy@gmail.com>

Reviewers: Guozhang Wang <wangguoz@gmail.com>, Bill Bejeck <bill@confluent.io>

Closes #3653 from dguy/0.11.0-restore-on-poll
2017-08-16 11:14:00 +01:00
Guozhang Wang a1a1186064 KAFKA-5727: Archetype project for Streams quickstart and tutorial web docs
Author: Guozhang Wang <wangguoz@gmail.com>

Reviewers: Damian Guy <damian.guy@gmail.com>

Closes #3660 from guozhangwang/K5727-archetype-project-0110
2017-08-15 11:17:23 -07:00
Eno Thereska 77b81c02b9 HOTFIX: state transition cherry picking
Cherry picked from https://github.com/apache/kafka/pull/3432

Author: Eno Thereska <eno.thereska@gmail.com>

Reviewers: Damian Guy <damian.guy@gmail.com>, Guozhang Wang <wangguoz@gmail.com>

Closes #3622 from enothereska/KAFKA-5571-0.11
2017-08-15 15:35:43 +01:00
Damian Guy 2a4eeb1c6f KAFKA-5562; Do streams state directory cleanup on a single thread
Backported from trunk: https://github.com/apache/kafka/pull/3516

Author: Damian Guy <damian.guy@gmail.com>

Reviewers: Guozhang Wang <wangguoz@gmail.com>, Eno Thereska <eno.thereska@gmail.com>

Closes #3654 from dguy/cherry-pick-stream-thread-cleanup
2017-08-11 16:40:28 +01:00
Damian Guy 58125ced76 MINOR: change log level in ThreadCache to trace
Cache eviction logging at debug level is too high volume. This was already done on trunk but didn't make it into 0.11.

Author: Damian Guy <damian.guy@gmail.com>

Reviewers: Ismael Juma <ismael@juma.me.uk>

Closes #3652 from dguy/minor-cache-log-level
2017-08-10 11:42:56 +01:00
Damian Guy a75027429b KAFKA-5717; InMemoryKeyValueStore should delete keys with null values during restore
Fixed a bug in the InMemoryKeyValueStore restoration where a key with a `null` value is written into the map rather than being deleted.
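
A minimal sketch of the intended restore behaviour, assuming a `null` value in the changelog marks a deletion (a tombstone). `InMemoryRestoreSketch` is a hypothetical class using string keys and values, not the store's actual code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory store that replays changelog entries on restore.
class InMemoryRestoreSketch {
    private final Map<String, String> map = new HashMap<>();

    // Applies one changelog entry: a null value deletes the key instead of storing null.
    void restore(String key, String value) {
        if (value == null) {
            map.remove(key);
        } else {
            map.put(key, value);
        }
    }

    String get(String key) {
        return map.get(key);
    }

    int size() {
        return map.size();
    }
}
```

Storing the `null` instead would leave the key present in the map, so the entry count and any iteration over keys would disagree with the state before the restart.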

Author: Damian Guy <damian.guy@gmail.com>

Reviewers: Bill Bejeck <bbejeck@gmail.com>, Guozhang Wang <wangguoz@gmail.com>

Closes #3650 from dguy/kafka-5717

(cherry picked from commit c35c479813)
Signed-off-by: Damian Guy <damian.guy@gmail.com>
2017-08-09 20:03:40 +01:00
Eno Thereska 3eaa325692 HOTFIX: Fixes to metric names of Streams
A couple of fixes to metric names to match the KIP
- Removed extra strings in the metric names that are already in the tags
- add a separate metric for "all"

Author: Eno Thereska <eno.thereska@gmail.com>

Reviewers: Guozhang Wang <wangguoz@gmail.com>

Closes #3491 from enothereska/hotfix-metric-names

(cherry picked from commit 6bee1e9e57)
Signed-off-by: Guozhang Wang <wangguoz@gmail.com>
2017-08-03 14:18:35 -07:00
Guozhang Wang 452e2eedb8 HOTFIX: handle commit failed exception on stream thread's suspend task
1. Capture `CommitFailedException` in `StreamThread#suspendTasksAndState`.

2. Remove `Cache` from AbstractTask as it is no longer needed; remove unused cleanup-related variables from StreamThread (cc dguy to double check).

3. Also fix log4j outputs for error and warn, such that for WARN we do not print the stack trace, and for ERROR we remove the dangling colon since the exception stack trace will start on a new line.

4. Update one log4j entry to always print as WARN for errors closing a zombie task (cc mjsax).

Author: Guozhang Wang <wangguoz@gmail.com>

Reviewers: Matthias J. Sax <matthias@confluent.io>, Damian Guy <damian.guy@gmail.com>

Closes #3574 from guozhangwang/KHotfix-handle-commit-failed-exception-in-suspend

(cherry picked from commit 228a4fdb6d)
Signed-off-by: Guozhang Wang <wangguoz@gmail.com>

fix unit test
2017-08-01 14:51:52 -07:00
Damian Guy 55c2b73650 HOTFIX: fix threading issue in MeteredKeyValueStore
`MeteredKeyValueStore` wasn't thread safe. Interleaving operations could modify the state, i.e., the `key` and/or `value`, which could result in incorrect behaviour.

Author: Damian Guy <damian.guy@gmail.com>

Reviewers: Guozhang Wang <wangguoz@gmail.com>

Closes #3588 from dguy/hotfix-metered-kv-store

(cherry picked from commit 4059fa5763)
Signed-off-by: Damian Guy <damian.guy@gmail.com>
2017-07-28 09:05:41 +01:00