Commit Graph

4791 Commits

Author SHA1 Message Date
Guozhang Wang 925acc9f47
KAFKA-4831: add documentation for KIP-265 (#4686)
Reviewers: Damian Guy <damian.guy@gmail.com>, Bill Bejeck <bill@confluent.io>
2018-03-14 17:33:54 -07:00
Edoardo Comar 35c08189cb MINOR: fixing streams test-util compilation errors in Eclipse (#4631)
Author: Edoardo Comar <ecomar@uk.ibm.com>

Reviewer: Matthias J. Sax <matthias@confluent.io>
2018-03-14 15:19:51 -07:00
Dong Lin d935699486 KAFKA-6640; Improve efficiency of KafkaAdminClient.describeTopics() (#4694)
Currently in KafkaAdminClient.describeTopics(), for each topic in the request, a complete map of cluster and errors will be constructed for every topic and partition. This unnecessarily increases the complexity of describeTopics() to O(n^2). This patch improves the complexity to O(n).

Reviewers: Ismael Juma <ismael@juma.me.uk>, Colin Patrick McCabe <colin@cmccabe.xyz>, Jason Gustafson <jason@confluent.io>
2018-03-14 14:58:24 -07:00
Chia-Ping Tsai d2bbcbed44 MINOR: Correct "exeption" to "exception" in connect docs (#4709) 2018-03-14 08:46:07 -07:00
Colin Patrick McCabe bf8a4c2ce7 MINOR: Improve Trogdor client logging. (#4675)
AgentClient and CoordinatorClient should have the option of logging failures to custom log4j objects.  There should also be builders for these objects, to make them easier to extend in the future.

Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
2018-03-14 10:12:15 +00:00
Dong Lin 6b08905dfb KAFKA-3978; Ensure high watermark is always positive (#4695)
Partition high watermark may become -1 if the initial value is out of range. This situation can occur during partition reassignment, for example. The bug was fixed and validated with unit test in this patch.

Reviewers: Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>
2018-03-13 22:52:59 -07:00
hmcl aa14ec08b7 MINOR: Remove kafka-consumer-offset-checker.bat for KAFKA-3356 (#4703) 2018-03-13 16:41:54 -07:00
Manikumar Reddy O ad355298c6 MINOR: Remove unused server exceptions (#4701) 2018-03-13 13:59:13 -07:00
Guozhang Wang 95ad03540f
KAFKA-6634: Delay starting new transaction in task.initializeTopology (#4684)
As titled, not starting new transaction since during restoration producer would have not activity and hence may cause txn expiration. Also delay starting new txn in resuming until initializing topology.

Reviewers: Matthias J. Sax <mjsax@apache.org>, Bill Bejeck <bill@confluent.io>
2018-03-13 08:43:58 -07:00
Colin Patrick McCabe ec9e8110e3 MINOR: add DEFAULT_PORT for Trogdor Agent and Coordinator (#4674)
Add a DEFAULT_PORT constant for the Trogdor Agent and Coordinator.

Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
2018-03-13 13:15:44 +00:00
Matthias J. Sax 0c00aa0983
MINOR: Streams doc example should not close store (#4667)
Author: Matthias J. Sax <matthias@confluent.io>

Reviewers: Bill Bejeck <bill@confluent.io>, Guozhang Wang <guozhang@confluent.io>
2018-03-13 00:42:40 -07:00
Rajini Sivaram 2bf06890b9 MINOR: Use large batches in metrics test for conversion time >= 1ms (#4681) 2018-03-12 23:23:00 -07:00
Siva Santhalingam 0bb8e66184 KAFKA-6024; Move arg validation in KafkaConsumer ahead of `acquireAndEnsureOpen` (#4617) 2018-03-12 23:03:32 -07:00
Dong Lin 1ea07b993d KAFKA-6624; Prevent concurrent log flush and log deletion (#4663)
KAFKA-6624; Prevent concurrent log flush and log deletion

Reviewers: Ted Yu <yuzhihong@gmail.com>, Jun Rao <junrao@gmail.com>
2018-03-12 22:20:44 -07:00
Detharon 724032bd06 MINOR: Fix incorrect JavaDoc (type mismatch) (#4632)
Reviewers: Guozhang Wang <wangguoz@gmail.com>
2018-03-12 16:32:19 -07:00
Andras Beni 29d2e8cf17 KAFKA-3368; Add documentation for old message format (#3425) 2018-03-12 15:13:34 -07:00
Jimin Hsieh 596b604862 MINOR: Fix wrong message in `bin/kafka-run-class.sh` (#4682)
To build jar you need to specify `scalaVersion` instead of `scala_version`.

Reviewers: Manikumar Reddy O <manikumar.reddy@gmail.com>, Jason Gustafson <jason@confluent.io>
2018-03-12 14:34:59 -07:00
Guozhang Wang 8e84961661
KAFKA-6560: Add docs for KIP-261 (#4685)
Reviewers: Matthias J. Sax <mjsax@apache.org>, Bill Bejeck <bill@confluent.io>
2018-03-12 13:11:29 -07:00
Ismael Juma 825bfe5ade
MINOR: Revert to ZooKeeper 3.4.10 due to ZOOKEEPER-2960 (#4678)
It's a critical bug that only affects the server, but we
don't have an easy way to use 3.4.11 for the client
only.

Reviewers: Jun Rao <junrao@gmail.com>, Damian Guy <damian.guy@gmail.com>
2018-03-12 04:40:59 -07:00
Jacek Laskowski c28c556e92 MINOR: Remove code duplication + excessive space (#4683)
Reviewers: Guozhang Wang <wangguoz@gmail.com>
2018-03-11 11:38:03 -07:00
Rajini Sivaram 8df96a4119
MINOR: Reduce ZK reads and ensure ZK watch is set for listener update (#4670)
Ensures that ZK watch is set for each live broker for listener update notifications in the controller. Also avoids reading all brokers from ZooKeeper when a broker metadata is modified by passing in brokerId to BrokerModifications and reading only the updated broker.

The existing listener update test verifies both these changes. Earlier, the test did not detect missing watch for the last broker since metadata of all brokers were read from ZK (adding a watch for all) when any broker was updated.

Reviewers: Jun Rao <junrao@gmail.com>
2018-03-10 15:57:05 +00:00
Rajini Sivaram 8d4d5f8c9f MINOR: Fix deadlock in ZooKeeperClient.close() on session expiry (#4672)
Reviewers: Jun Rao <junrao@gmail.com>
2018-03-09 15:18:23 -08:00
Max Zheng fd015b401d MINOR: Tag AWS instances with Jenkins build url (#4657)
This will allow us to trace leaked instances back to the job,
so that we can figure out what happened and fix the leak.

Reviewers: Ismael Juma <ismael@juma.me.uk>
2018-03-09 14:54:43 -08:00
Radai Rosenblatt 5760da7d0b KAFKA-6622; Fix performance issue loading consumer offsets (#4661)
`batch.baseOffset` is an expensive operation (even says so in its javadoc), and yet was called for every single record in a batch when loading offsets. This means that for N records in a gzipped batch, the entire batch will be unzipped N times. The fix is to compute and cache the base offset once as we decompress and process the batch.

Reviewers: Dong Lin <lindong28@gmail.com>, Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>
2018-03-09 14:21:06 -08:00
Anna Povzner f1c112c63d MINOR: Add PayloadGenerator to Trogdor (#4640)
It generates the producer payload (key and value) and makes sure that the values are
populated to target a realistic compression rate (0.3 - 0.4) if compression is used.
The generated payload is deterministic and can be replayed from a given position.
For now, all generated values are constant size, and key types can be configured
to be either null or 8 bytes.

Added messageSize parameter to producer spec, that specifies produced
key + message size.
2018-03-09 13:57:04 -08:00
Vitaly Pushkar b1aa1912f0 KAFKA-4831: Extract WindowedSerde to public APIs (#3307)
Now that we have augmented WindowSerde with non-arg parameters, extract it out as part of the public APIs so that users who want to I/O windowed streams can use it. This is originally introduced by @vitaly-pushkar

This PR grows out to be a much larger one, as I found a few tech debts and bugs while working on it. Here is a summary of the PR:

Public API changes (I will propose a KIP after a first round of reviews):
Add TimeWindowedSerializer, TimeWindowedDeserializer, SessionWindowedSerializer, SessionWindowedDeserializer into o.a.k.streams.kstream. The serializers would implemented an internal WindowedSerializer interface for the serializeBaseKey function used in 3) below.

Add WindowedSerdes into o.a.k.streams.kstream. The reason to now add them into o.a.k.clients's Serdes is that it then needs dependency of streams.

Add "default.windowed.key.serde.inner" and "default.windowed.value.serde.inner" into StreamsConfig, used when "default.key.serde" is specified to use time or session windowed serde. Note this requires the serde class, not the type class.

Consolidated serde format from multiple classes, including SessionKeySerde.java for session, and WindowStoreUtils for time window, into SessionKeySchema and WindowKeySchema.

Bug fix: WindowedStreamPartitioner needs to consider both time window and session window serdes.

Removed RocksDBWindowBytesStore etc optimization since after KIP-182 all the serde know happens on metered store, hence this optimization is not worth.

Bug fix: for time window, the serdes used for store and the serdes used for piping (source and sink node) are different: the former needs to append sequence number but not for the later.

Other minor cleanups: remove unnecessary throws, etc.

Authors: Guozhang Wang <wangguoz@gmail.com>, Vitaly Pushkar <vitaly.pushkar@gmail.com>

Reviewers: Matthias J. Sax <mjsax@apache.org>, Bill Bejeck <bill@confluent.io>, Xi Hu
2018-03-09 11:08:08 -08:00
Rajini Sivaram 3ef2fb843e
MINOR: Fix record conversion time in metrics (#4671)
Reviewers: Ismael Juma <ismael@juma.me.uk>
2018-03-09 16:50:34 +00:00
Jimin Hsieh db619c6da2 MINOR: Remove unused local variable in SocketServer (#4669)
Reviewers: Ismael Juma <ismael@juma.me.uk>
2018-03-09 07:44:57 -08:00
John Roesler 6a383d8bc4 MINOR: clean stateDirectory in TopologyTestDriver (#4655)
Author: John Roeler <john@confluent.io>

Reviewers: Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>, Guozhang Wang <guozhang@confluent.io>
2018-03-08 15:43:45 -08:00
Damian Guy 989088f697 KAFKA-6560: [FOLLOW-UP] don't deserialize null byte array in window store fetch (#4665)
If the result of a fetch from a Window Store results in a null byte array we should return null rather than passing it to the serde to deserialize.

Reviewers: Guozhang Wang <wangguoz@gmail.com>
2018-03-08 13:24:22 -08:00
wushujames c5ba0da993 MINOR: Fix incorrect references to the max transaction timeout config (#4664) 2018-03-08 10:28:20 -08:00
Jason Gustafson 925d6a2ef3 MINOR: Skip sending fetches/offset lookups when awaiting the reconnect backoff (#4644)
Logging can get spammy during the reconnect blackout period because any requests we send to ConsumerNetworkClient will immediately be failed when poll() returns. This patch checks for connection failures prior to sending fetches and offset lookups and skips sending to any failed nodes. Test cases added for both.

Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>
2018-03-08 11:00:20 +00:00
nafshartous cf092aeecc KAFKA-5660 Don't throw TopologyBuilderException during runtime (#4645)
Reviewers: Matthias J. Sax <mjsax@apache.org>, Bill Bejeck <bill@confluent.io>, Guozhang Wang <guozhang@confluent.io>
2018-03-06 17:49:33 -08:00
Jason Gustafson 23c1c52c85
KAFKA-6615; Add scripts for DumpLogSegments (#4653)
Reviewers: Ismael Juma <ismael@juma.me.uk>
2018-03-06 15:37:50 -08:00
Guozhang Wang e5d6c9a79a
MINOR: Do not start processor for bounce-at-start (#4639)
Only start it after the broker has been shutdown.
2018-03-06 11:19:38 -08:00
Ewen Cheslack-Postava d13cbd0cae KAFKA-3806: Increase offsets retention default to 7 days (KIP-186) (#4648)
Reviewers: Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>
2018-03-05 22:21:53 -08:00
Bill Bejeck a1b01f48e9 KAFKA-6309: Improve task assignor load balance (#4624)
Sorts TaskIds on first assignment evenly distributing tasks by topicGroupId should help with evening the load of work across topologies. This PR is an initial "strawman" approach which will be followed up (at a later date YTBD) by scoring or assigning weight to processing nodes to ensure even processing distribution.

Added a new test to existing unit test.
2018-03-05 21:24:22 -08:00
Jason Gustafson 8f2c087166
MINOR: Complete inflight requests in order on disconnect (#4642)
NetworkClient should use FIFO order when completing inflight requests following a disconnect.

I've added new unit tests for `InFlightRequests` and `NetworkClient` which verify completion order.

Reviewers: Jun Rao <junrao@gmail.com>
2018-03-05 16:48:05 -08:00
Bill Bejeck 8a7d7e7955 MINOR: Add System test for standby task-rebalancing (#4554)
Author: Bill Bejeck <bill@confluent.io>

Reviewers: Damian Guy <damian@confluent.io>, Guozhang Wang <guozhang@confluent.io>, Matthias J. Sax <matthias@confluent.io>
2018-03-05 11:06:32 -08:00
Matthias J. Sax 2a4ba75e13
KAFKA-6054: Code cleanup to prepare the actual fix for an upgrade path (#4630)
Author: Matthias J. Sax <matthias@confluent.io>

Reviewers: Bill Bejeck <bill@confluent.io>, John Roesler <john@confluent.io>, Guozhang Wang <guozhang@confluent.io>
2018-03-05 10:56:42 -08:00
Jason Gustafson 604b93cfde
KAFKA-6606; Ensure consumer awaits auto-commit interval after sending… (#4641)
We need to reset the auto-commit deadline after sending the offset commit request so that we do not resend it while the request is still inflight. 

Added unit tests ensuring this behavior and proper backoff in the case of a failure.

Reviewers: Guozhang Wang <wangguoz@gmail.com>
2018-03-03 13:27:29 -08:00
Sandor Murakozi 9868976747 KAFKA-6111: Improve test coverage of KafkaZkClient, fix bugs found by new tests; 2018-03-02 11:25:43 -08:00
Jason Gustafson 6cfcc9d553
KAFKA-6593; Fix livelock with consumer heartbeat thread in commitSync (#4625)
Contention for the lock in ConsumerNetworkClient can lead to a livelock situation in which an active commitSync is unable to make progress because its completion is blocked in the heartbeat thread. The fix is twofold:

1) We change ConsumerNetworkClient to use a fair lock to reduce the chance of each thread getting starved.
2) We eliminate the dependence on the lock in ConsumerNetworkClient for callback completion so that callbacks will not be blocked by an active poll().

Reviewers: Guozhang Wang <wangguoz@gmail.com>
2018-03-01 11:04:11 -08:00
Guozhang Wang eb449fe7c5
KAFKA-6560: Replace range query with newly added single point query in Windowed Aggregation (#4578)
* Add a new fetch(K key, long window-start-timestamp) API into ReadOnlyWindowStore.
* Use the new API to replace the range fetch API in KStreamWindowedAggregate and KStreamWindowedReduce.
* Added corresponding unit tests.
* Also removed some redundant byte serdes in byte stores.
2018-03-01 09:27:11 -08:00
Ewen Cheslack-Postava f582a7515c MINOR: Extend release.py with a subcommand for staging docs into the kafka-site repo
Author: Ewen Cheslack-Postava <me@ewencp.org>

Reviewers: Ismael Juma <ismael@juma.me.uk>, Damian Guy <damian.guy@gmail.com>

Closes #3917 from ewencp/stage-docs
2018-02-28 10:28:55 -08:00
Viktor Somogyi 9166ae4ec4 MINOR: Remove unnecessary semicolon in ResourceType (#4626) 2018-02-27 19:06:07 -08:00
Thomas Leplus 031f522a2d MINOR: Fix javadoc typo in Headers (#4627) 2018-02-27 19:04:29 -08:00
Guozhang Wang 97ad549d56
KAFKA-6534: Enforce a rebalance in the next poll call when encounter task migration (#4544)
The fix is in two folds:

For tasks that's closed in closeZombieTask, their corresponding partitions are still in runningByPartition so those closed tasks may still be returned in activeTasks and standbyTasks. Adding guards on the returned tasks and if they are closed notify the thread to trigger rebalance immediately.

When triggering a rebalance, un-subscribe and re-subscribe immediately to make sure we are not dependent on the background heartbeat thread timing.

Some minor changes on log4j. More specifically, I moved the log entry of closeZombieTask to its callers with more context information and the action going to take.

I can re-produce the issue with EosIntegrationTest may hand-code the heartbeat thread to GC, and confirmed this patch fixed the issue. Unfortunately this test cannot be added to AK since currently we do not have ways to manipulate the heartbeat thread in unit tests.

Reviewers: Jason Gustafson <jason@confluent.io>, Bill Bejeck <bill@confluent.io>, Matthias J. Sax <matthias@confluent.io>
2018-02-27 00:29:25 -08:00
Guozhang Wang f26fbb9adc
MINOR: Rename stream partition assignor to streams partition assignor (#4621)
This is a straight-forward change that make the name of the partition assignor to be aligned with Streams.

Reviewers: Matthias J. Sax <mjsax@apache.org>
2018-02-26 14:39:47 -08:00
Blake Miller 7c5d0c459f MINOR:Fix typo in the impl source (#4587)
The static method KStreamImpl.createReparitionedSource() is missing a t.

This PR globally fixes the typo and keeps the code indentation consistent.
2018-02-26 12:25:53 -08:00