3867 Commits

Author SHA1 Message Date
Ismael Juma 7311dcbc53 KAFKA-5291; AdminClient should not trigger auto creation of topics
- Added a boolean `allow_auto_topic_creation` to MetadataRequest and
bumped the protocol version to V4.

- When connecting to brokers older than 0.11.0.0, the `allow_auto_topic_creation`
field won't be considered, so we send a metadata request for all topics
to keep the behavior consistent.

- Set `allow_auto_topic_creation` to false in the new AdminClient and
StreamsKafkaClient (which exists for the purpose of creating topics
manually); set it to true everywhere else for now. Other clients will eventually
rely on client-side auto topic creation, but that’s not there yet.

- Add `allowAutoTopicCreation` field to `Metadata`, which is used by
`DefaultMetadataUpdater`. This is not strictly needed for the new
`AdminClient`, but it avoids surprises if it ever adds a topic to `Metadata`
via `setTopics` or `addTopic`.
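
The version-dependent behavior described in the bullets above can be sketched roughly as follows. This is an illustrative plain-Java model, not the actual client code: `MetadataRequestSketch`, its fields, and the `build` method are hypothetical names, and representing "all topics" as `null` is an assumption of the sketch. Only the `allow_auto_topic_creation` field and the V4 version bump come from the commit message.

```java
import java.util.List;

// Hypothetical model of a metadata request; not Kafka's MetadataRequest class.
final class MetadataRequestSketch {
    final List<String> topics;            // null stands for "all topics" in this sketch
    final boolean allowAutoTopicCreation; // only meaningful from version 4 onwards
    final short version;

    private MetadataRequestSketch(List<String> topics, boolean allowAutoTopicCreation, short version) {
        this.topics = topics;
        this.allowAutoTopicCreation = allowAutoTopicCreation;
        this.version = version;
    }

    // Brokers that only support versions below 4 do not consider the flag. A caller that
    // must not trigger auto-creation therefore asks for all topics instead of naming
    // specific ones (the fallback described in the commit message).
    static MetadataRequestSketch build(List<String> topics, boolean allowAutoTopicCreation,
                                       short brokerMaxVersion) {
        short version = (short) Math.min(brokerMaxVersion, 4);
        if (version < 4 && !allowAutoTopicCreation) {
            return new MetadataRequestSketch(null, allowAutoTopicCreation, version);
        }
        return new MetadataRequestSketch(topics, allowAutoTopicCreation, version);
    }
}
```

With a broker that supports V4, `build(topics, false, (short) 4)` keeps the topic list and carries the flag; with an older broker the same call falls back to the all-topics form.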

Author: Ismael Juma <ismael@juma.me.uk>

Reviewers: Jun Rao <junrao@gmail.com>

Closes #3098 from ijuma/kafka-5291-admin-client-no-auto-topic-creation
2017-06-01 01:00:11 +01:00
Ismael Juma 647afeff6a KAFKA-5353; baseTimestamp should always have a create timestamp
This makes the case where we build the records from scratch consistent
with the case where we update the batch header "in place". Thanks to
edenhill who found the issue while testing librdkafka.

The reason our tests don’t catch this is that we rely on the maxTimestamp
to compute the record level timestamps if log append time is used.

Author: Ismael Juma <ismael@juma.me.uk>

Reviewers: Jason Gustafson <jason@confluent.io>

Closes #3177 from ijuma/set-base-sequence-for-log-append-time
2017-06-01 00:16:55 +01:00
Ismael Juma eeb8f67810 MINOR: Use `waitUntil` to fix transient failures of ControllerFailoverTest
Without it, it's possible that the assertion is checked before the exception
is thrown in the callback.
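
The polling pattern this fix relies on can be sketched as below. This is a minimal plain-Java illustration, not Kafka's test utility: the class `WaitUntilSketch`, the method signature, and the 10 ms poll interval are assumptions made for the example.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

// Hypothetical helper: polls a condition until it holds or a timeout expires.
final class WaitUntilSketch {
    static void waitUntil(BooleanSupplier condition, long timeoutMs, String message) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (!condition.getAsBoolean()) {
            if (System.currentTimeMillis() > deadline) {
                throw new AssertionError(message);
            }
            try {
                Thread.sleep(10); // assumed poll interval
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new AssertionError("Interrupted while waiting: " + message);
            }
        }
    }

    public static void main(String[] args) {
        AtomicBoolean exceptionSeen = new AtomicBoolean(false);
        // Simulates a callback that records the exception after the test thread has moved on.
        Thread callback = new Thread(() -> {
            try { Thread.sleep(50); } catch (InterruptedException ignored) { }
            exceptionSeen.set(true);
        });
        callback.start();
        // Asserting exceptionSeen.get() right here could fail; waiting removes the race.
        waitUntil(exceptionSeen::get, 5000, "IllegalStateException was not thrown");
        System.out.println("condition observed");
    }
}
```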

Author: Ismael Juma <ismael@juma.me.uk>

Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>

Closes #3182 from ijuma/fix-controller-failover-flakiness
2017-05-31 23:47:11 +01:00
Mario Molina dc5bf4bd45 KAFKA-5218; New Short serializer, deserializer, serde
Author: Mario Molina <mmolimar@gmail.com>

Reviewers: Matthias J. Sax <matthias@confluent.io>, Damian Guy <damian.guy@gmail.com>, Michael G. Noll <michael@confluent.io>, Guozhang Wang <wangguoz@gmail.com>

Closes #3017 from mmolimar/KAFKA-5218
2017-05-31 15:09:58 -07:00
Matthias J. Sax d9479275fd MINOR: Improve streams config parameters
Adjust "importance level" and add explanation to the docs.

Author: Matthias J. Sax <matthias@confluent.io>

Reviewers: Eno Thereska <eno.thereska@gmail.com>, Guozhang Wang <wangguoz@gmail.com>

Closes #2855 from mjsax/minor-improve-streams-config-parameters
2017-05-31 14:53:28 -07:00
Jason Gustafson 81f0c1e8f2 KAFKA-5093; Avoid loading full batch data when possible when iterating FileRecords
Author: Jason Gustafson <jason@confluent.io>

Reviewers: Ismael Juma <ismael@juma.me.uk>

Closes #3160 from hachikuji/KAFKA-5093
2017-05-31 14:11:47 -07:00
Colin P. Mccabe da9a171c99 KAFKA-5265; Move ACLs, Config, Topic classes into org.apache.kafka.common
Also introduce TopicConfig.

Author: Colin P. Mccabe <cmccabe@confluent.io>

Reviewers: Ismael Juma <ismael@juma.me.uk>

Closes #3120 from cmccabe/KAFKA-5265
2017-05-31 17:35:31 +01:00
Ismael Juma 9323a75335 MINOR: Use new consumer in ProducerCompressionTest
This should be less flaky as it has a higher timeout. I also increased the timeout
in a couple of other tests that had very low (100 ms) timeouts.

The failure would manifest itself as:

```text
java.net.SocketTimeoutException
	at sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:229)
	at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
	at java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:385)
	at org.apache.kafka.common.network.NetworkReceive.readFromReadableChannel(NetworkReceive.java:85)
	at kafka.network.BlockingChannel.readCompletely(BlockingChannel.scala:129)
	at kafka.network.BlockingChannel.receive(BlockingChannel.scala:120)
	at kafka.consumer.SimpleConsumer.liftedTree1$1(SimpleConsumer.scala:100)
	at kafka.consumer.SimpleConsumer.kafka$consumer$SimpleConsumer$$sendRequest(SimpleConsumer.scala:84)
	at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(SimpleConsumer.scala:133)
	at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(SimpleConsumer.scala:133)
	at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(SimpleConsumer.scala:133)
	at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:31)
	at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply$mcV$sp(SimpleConsumer.scala:132)
	at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:132)
	at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:132)
	at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:31)
	at kafka.consumer.SimpleConsumer.fetch(SimpleConsumer.scala:131)
	at kafka.api.test.ProducerCompressionTest.testCompression(ProducerCompressionTest.scala:97)
```

Author: Ismael Juma <ismael@juma.me.uk>

Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>

Closes #3178 from ijuma/producer-compression-test-flaky
2017-05-31 16:13:30 +01:00
Jason Gustafson aebba89a2b KAFKA-5349; Fix illegal state error in consumer's ListOffset handler
Author: Jason Gustafson <jason@confluent.io>

Reviewers: Apurva Mehta <apurva@confluent.io>, Ismael Juma <ismael@juma.me.uk>

Closes #3175 from hachikuji/KAFKA-5349
2017-05-31 10:27:16 +01:00
Ismael Juma 31cc8885e4 MINOR: Improve assert in ControllerFailoverTest
It sometimes fails in Jenkins like:

```text
java.lang.AssertionError: IllegalStateException was not thrown
	at org.junit.Assert.fail(Assert.java:88)
	at org.junit.Assert.assertTrue(Assert.java:41)
	at kafka.controller.ControllerFailoverTest.testHandleIllegalStateException(ControllerFailoverTest.scala:86)
```

I ran it locally 100 times with no failure.

Author: Ismael Juma <ismael@juma.me.uk>

Reviewers: Rajini Sivaram <rajinisivaram@googlemail.com>

Closes #3176 from ijuma/improve-controller-failover-assert
2017-05-31 10:22:17 +01:00
Jorge Quilcate Otoya ef9551297c KAFKA-5266; Follow-up improvements for consumer offset reset tool (KIP-122)
Implement improvements defined here: https://issues.apache.org/jira/browse/KAFKA-5266

Author: Jorge Quilcate Otoya <quilcate.jorge@gmail.com>

Reviewers: Jason Gustafson <jason@confluent.io>

Closes #3102 from jeqo/feature/KAFKA-5266
2017-05-31 00:50:48 -07:00
Damian Guy 2cc8f48ae5 KAFKA-5308; TC should handle UNSUPPORTED_FOR_MESSAGE_FORMAT in WriteTxnMarker response
Return UNSUPPORTED_FOR_MESSAGE_FORMAT in handleWriteTxnMarkers when a topic does not use the correct message format.
Remove any TopicPartitions that have the same error from those waiting for markers.

Author: Damian Guy <damian.guy@gmail.com>

Reviewers: Jason Gustafson <jason@confluent.io>, Guozhang Wang <wangguoz@gmail.com>

Closes #3152 from dguy/kafka-5308
2017-05-30 23:38:50 -07:00
Tommy Becker dd8cdb79d3 KAFKA-5334: Allow rocksdb.config.setter to be specified as a String or Class instance
Handle `rocksdb.config.setter` being set as a class name or class
instance.
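
A rough sketch of accepting a config value either as a class name or as a `Class` instance is shown below. It is plain Java and only illustrates the idea: `SetterSketch`, `DefaultSetterSketch`, and `ConfigSetterResolver` are hypothetical names, not the Streams classes touched by this commit.

```java
// Hypothetical stand-in for a config-setter interface.
interface SetterSketch {
    String name();
}

final class DefaultSetterSketch implements SetterSketch {
    public String name() { return "default"; }
}

final class ConfigSetterResolver {
    // Accepts a Class instance or a fully qualified class name and instantiates it
    // through its no-argument constructor.
    static SetterSketch resolve(Object value) {
        try {
            Class<?> clazz;
            if (value instanceof Class) {
                clazz = (Class<?>) value;
            } else if (value instanceof String) {
                clazz = Class.forName((String) value);
            } else {
                throw new IllegalArgumentException("Expected a class name or Class, got: " + value);
            }
            return clazz.asSubclass(SetterSketch.class).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot instantiate " + value, e);
        }
    }
}
```

Both `resolve(DefaultSetterSketch.class)` and `resolve(DefaultSetterSketch.class.getName())` yield an instance; any other value type is rejected with an `IllegalArgumentException`.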

Author: Tommy Becker <tobecker@tivo.com>
Author: Tommy Becker <twbecker@gmail.com>

Reviewers: Matthias J. Sax, Damian Guy, Guozhang Wang

Closes #3155 from twbecker/KAFKA-5334
2017-05-30 23:21:12 -07:00
Jiangjie Qin d082563907 KAFKA-5211; Do not skip a corrupted record in consumer
Author: Jiangjie Qin <becket.qin@gmail.com>

Reviewers: Jason Gustafson <jason@confluent.io>

Closes #3114 from becketqin/KAFKA-5211
2017-05-30 22:41:21 -07:00
Jason Gustafson d41cf1b778 KAFKA-5251; Producer should cancel unsent AddPartitions and Produce requests on abort
Author: Jason Gustafson <jason@confluent.io>

Reviewers: Apurva Mehta <apurva@confluent.io>, Guozhang Wang <wangguoz@gmail.com>

Closes #3161 from hachikuji/KAFKA-5251
2017-05-30 20:32:51 -07:00
Colin P. Mccabe 3250cc767e KAFKA-5324; AdminClient: add close with timeout, fix some timeout bugs
Author: Colin P. Mccabe <cmccabe@confluent.io>

Reviewers: Ismael Juma <ismael@juma.me.uk>

Closes #3141 from cmccabe/KAFKA-5324
2017-05-31 03:02:52 +01:00
Xavier Léauté c060c48285 KAFKA-5150; Reduce LZ4 decompression overhead
- reuse decompression buffers in consumer Fetcher
- switch lz4 input stream to operate directly on ByteBuffers
- avoids performance impact of catching exceptions when reaching the end of legacy record batches
- more tests with both compressible / incompressible data, multiple
  blocks, and various other combinations to increase code coverage
- fixes bug that would cause exception instead of invalid block size
  for invalid incompressible blocks
- fixes bug if incompressible flag is set on end frame block size

Overall this improves LZ4 decompression performance by up to 40x for small batches.
Most improvements are seen for batches of size 1 with messages on the order of ~100B.
We see at least 2x improvements for batch sizes of < 10 messages, containing messages < 10kB.

This patch also yields 2-4x improvements on v1 small single message batches for other compression types.

Full benchmark results can be found here
https://gist.github.com/xvrl/05132e0643513df4adf842288be86efd

Author: Xavier Léauté <xavier@confluent.io>
Author: Ismael Juma <ismael@juma.me.uk>

Reviewers: Jason Gustafson <jason@confluent.io>, Ismael Juma <ismael@juma.me.uk>

Closes #2967 from xvrl/kafka-5150
2017-05-31 02:22:07 +01:00
Ismael Juma 6021618f9d MINOR: onControllerResignation should be invoked if triggerControllerMove is called
Also update the test to be simpler since we can use a mock event to simulate the issue
more easily (thanks Jun for the suggestion). This should fix two issues:

1. A transient test failure due to an NPE in ControllerFailoverTest.testMetadataUpdate:

```text
Caused by: java.lang.NullPointerException
	at kafka.controller.ControllerBrokerRequestBatch.addUpdateMetadataRequestForBrokers(ControllerChannelManager.scala:338)
	at kafka.controller.KafkaController.sendUpdateMetadataRequest(KafkaController.scala:975)
	at kafka.controller.ControllerFailoverTest.testMetadataUpdate(ControllerFailoverTest.scala:141)
```

The test was creating an additional thread and it does not seem like it was doing the
appropriate synchronization (perhaps this became more of an issue after we changed
the Controller to be single-threaded and changed the locking)

2. Setting `activeControllerId.set(-1)` in `triggerControllerMove` causes `Reelect` not to invoke `onControllerResignation`. Among other things, this causes an `IllegalStateException` to be thrown when `KafkaScheduler.startup` is invoked for the second time without the corresponding `shutdown`. We now simply call `onControllerResignation` as part of `triggerControllerMove`.

Finally, I included a few clean-ups:

1. No longer update the broker state in `onControllerFailover`. This is no longer needed
since we removed the `RunningAsController` state (KAFKA-3761).
2. Trivial clean-ups in KafkaController
3. Removed unused parameter in `ZkUtils.getPartitionLeaderAndIsrForTopics`

Author: Ismael Juma <ismael@juma.me.uk>

Reviewers: Jun Rao <junrao@gmail.com>

Closes #2935 from ijuma/on-controller-resignation-if-trigger-controller-move
2017-05-30 16:59:33 -07:00
Rajini Sivaram b38b74bb77 MINOR: Fix doc for producer throttle time metrics
Author: Rajini Sivaram <rajinisivaram@googlemail.com>

Reviewers: Ismael Juma <ismael@juma.me.uk>, Jun Rao <junrao@gmail.com>

Closes #3169 from rajinisivaram/MINOR-producer-metrics
2017-05-30 16:35:25 -07:00
Guozhang Wang 80223b14ee KAFKA-5202: Handle topic deletion while trying to send txn markers
Here is the sketch of this proposal:

1. When it is time to send the txn markers, look up the leader node of the partition only once instead of retrying. If that information is not available, the partition has most likely been removed, since it was in the cache before. In this case, we remove the partition from the metadata object and skip putting it into the corresponding queue; if the leader brokers of all partitions are unavailable, we complete this delayed operation and proceed to write the complete txn log entry.

2. If the leader id is known from the cache but the corresponding node object with the listener name is not available, the leader is likely unavailable right now. Put it into a separate queue and let the sender thread retry fetching its metadata each time it drains the queue.

One caveat of this approach is the delete-and-recreate case, and the argument is that since all the messages are deleted anyways when deleting the topic-partition, it does not matter whether the markers are on the log partitions or not.

Author: Guozhang Wang <wangguoz@gmail.com>

Reviewers: Apurva Mehta <apurva@confluent.io>, Damian Guy <damian.guy@gmail.com>, Jason Gustafson <jason@confluent.io>

Closes #3130 from guozhangwang/K5202-handle-topic-deletion
2017-05-30 14:35:51 -07:00
xinlihua f0745cd514 KAFKA-4603: Disallow abbreviations in OptionParser constructor
KAFKA-4603: command-line parsing error.
Using "new OptionParser" might result in a parse error.

Change all the OptionParser constructors in Kafka to "new OptionParser(false)".

Author: xinlihua <xin.lihua1@zte.com.cn>
Author: unknown <00067310@A23338408.zte.intra>
Author: auroraxlh <xin.lihua1@zte.com.cn>
Author: xin <xin.lihua1@zte.com.cn>

Reviewers: Damian Guy, Guozhang Wang

Closes #2349 from auroraxlh/fix_OptionParser_bug
2017-05-30 13:53:32 -07:00
Ismael Juma b3788d8dcb KAFKA-5316; Follow-up with ByteBufferOutputStream and other misc improvements
ByteBufferOutputStream improvements:
* Document pitfalls
* Improve efficiency when dealing with direct byte buffers
* Improve handling of buffer expansion
* Be consistent about using `limit` instead of `capacity`
* Add constructors that allocate the internal buffer

Other minor changes:
* Fix log warning to specify correct Kafka version
* Clean-ups
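
The `limit` versus `capacity` distinction mentioned above can be illustrated with a plain `java.nio.ByteBuffer`. This sketch does not reproduce `ByteBufferOutputStream`; `LimitVsCapacitySketch` and `writableBytes` are hypothetical names used only for the example.

```java
import java.nio.ByteBuffer;

final class LimitVsCapacitySketch {
    // Space a writer may actually use: bounded by limit, not by capacity.
    static int writableBytes(ByteBuffer buffer) {
        return buffer.limit() - buffer.position();
    }

    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.allocate(16);
        buffer.limit(8);       // the caller restricts the usable region
        buffer.put((byte) 1);  // position advances to 1
        System.out.println(buffer.capacity());     // prints 16
        System.out.println(writableBytes(buffer)); // prints 7
    }
}
```

Code that sized its writes by `capacity()` here would try to write past the limit and hit a `BufferOverflowException`.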

Author: Ismael Juma <ismael@juma.me.uk>

Reviewers: Jason Gustafson <jason@confluent.io>

Closes #3166 from ijuma/minor-kafka-5316-follow-ups
2017-05-30 12:53:32 -07:00
Jiangjie Qin 6b03497915 KAFKA-5344; set message.timestamp.difference.max.ms back to Long.MaxValue
Author: Jiangjie Qin <becket.qin@gmail.com>

Reviewers: Ismael Juma <ismael@juma.me.uk>

Closes #3163 from becketqin/KAFKA-5344
2017-05-30 15:44:34 +01:00
amethystic 6f5930d631 KAFKA-5278: ConsoleConsumer should honor `--value-deserializer`
In the original implementation, console-consumer fails to honor `--value-deserializer` config.

Author: amethystic <huxi_2b@hotmail.com>

Reviewers: Guozhang Wang <wangguoz@gmail.com>

Closes #3100 from amethystic/KAFKA-5278
2017-05-30 00:12:54 -07:00
Damian Guy a8794d8a5d KAFKA-5260; Producer should not send AbortTxn unless transaction has actually begun
Keep track of when a transaction has begun by setting a flag, `transactionStarted`, when a successful `AddPartitionsToTxnResponse` or `AddOffsetsToTxnResponse` has been received. If an `AbortTxnRequest` is about to be sent and `transactionStarted` is false, don't send the request and transition the state to `READY`.
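
A simplified model of this flag is sketched below. It is not the producer's `TransactionManager`: the class `TxnStateSketch`, its method names, and the counter standing in for a sent `AbortTxnRequest` are assumptions made for illustration, and the abort completes immediately instead of waiting for a response.

```java
// Hypothetical model of the producer-side state handling.
final class TxnStateSketch {
    enum State { READY, IN_TRANSACTION, ABORTING }

    private State state = State.READY;
    private boolean transactionStarted = false;
    private int abortRequestsSent = 0;

    void beginTransaction() {
        state = State.IN_TRANSACTION;
        transactionStarted = false;
    }

    // Invoked when a successful AddPartitionsToTxn or AddOffsetsToTxn response arrives.
    void onAddToTxnSuccess() {
        transactionStarted = true;
    }

    void beginAbort() {
        state = State.ABORTING;
        if (transactionStarted) {
            abortRequestsSent++; // stands in for sending an AbortTxnRequest
        }
        // In this simplified sketch the abort finishes immediately in both branches.
        transactionStarted = false;
        state = State.READY;
    }

    State state() { return state; }

    int abortRequestsSent() { return abortRequestsSent; }
}
```

Aborting right after `beginTransaction()` sends nothing and returns to `READY`; aborting after `onAddToTxnSuccess()` counts one request.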

Author: Damian Guy <damian.guy@gmail.com>

Reviewers: Apurva Mehta <apurva@confluent.io>, Guozhang Wang <wangguoz@gmail.com>, Jason Gustafson <jason@confluent.io>

Closes #3126 from dguy/kafka-5260
2017-05-29 22:52:59 -07:00
hejiefang dc520d275a KAFKA-5338: fix a misspelling in ResetIntegrationTest
There is a misspelling in the annotations of ResetIntegrationTest.

Author: hejiefang <he.jiefang@zte.com.cn>

Reviewers: Matthias J. Sax, Guozhang Wang

Closes #3159 from hejiefang/KAFKA-5338
2017-05-29 16:01:54 -07:00
Jason Gustafson dfa3c8a92d KAFKA-5316; LogCleaner should account for larger record sets after cleaning
Author: Jason Gustafson <jason@confluent.io>

Reviewers: Ismael Juma <ismael@juma.me.uk>, Jun Rao <junrao@gmail.com>

Closes #3142 from hachikuji/KAFKA-5316
2017-05-28 09:57:59 -07:00
Vahid Hashemian b50387eb7c MINOR: Remove unused method parameter in `SimpleAclAuthorizer`
Author: Vahid Hashemian <vahidhashemian@us.ibm.com>

Reviewers: Ismael Juma <ismael@juma.me.uk>

Closes #3147 from vahidhashemian/minor/remove_unsed_method_parameter_simpleaclauthorizer
2017-05-27 10:19:15 +01:00
Rajini Sivaram eb3aae7a05 MINOR: Cleanup in tests to avoid threads being left behind
Author: Rajini Sivaram <rajinisivaram@googlemail.com>

Reviewers: Ismael Juma <ismael@juma.me.uk>

Closes #3146 from rajinisivaram/MINOR-test-cleanup
2017-05-27 10:17:57 +01:00
James Cheng 0bc4f75eed KAFKA-5191: Autogenerate Consumer Fetcher metrics
Autogenerate docs for the Consumer Fetcher's metrics. This is a smaller subset of the original PR https://github.com/apache/kafka/pull/1202.

CC ijuma benstopford hachikuji

Author: James Cheng <jylcheng@yahoo.com>

Reviewers: Ismael Juma <ismael@juma.me.uk>, Guozhang Wang <wangguoz@gmail.com>

Closes #2993 from wushujames/fetcher_metrics_docs
2017-05-26 15:34:20 -07:00
Umesh Chaudhary ca8915d2ef KAFKA-4660; Improve test coverage KafkaStreams
dguy, mjsax: please review the PR and let me know your comments.

Author: umesh chaudhary <umesh9794@gmail.com>

Reviewers: Bill Bejeck, Matthias J. Sax, Guozhang Wang

Closes #3099 from umesh9794/mylocal
2017-05-26 15:31:58 -07:00
Ismael Juma 68eed84f24 KAFKA-5333; Remove Broker ACL resource type
Author: Ismael Juma <ismael@juma.me.uk>

Reviewers: Jun Rao <junrao@gmail.com>

Closes #3154 from ijuma/kafka-5333-remove-broker-acl-resource-type
2017-05-26 15:25:02 -07:00
Damian Guy 7892b4e6c7 KAFKA-5128; Check inter broker version in transactional methods
Add check in `KafkaApis` that the inter broker protocol version is at least `KAFKA_0_11_0_IV0`, i.e., supporting transactions

Author: Damian Guy <damian.guy@gmail.com>

Reviewers: Ismael Juma <ismael@juma.me.uk>, Jason Gustafson <jason@confluent.io>

Closes #3103 from dguy/kafka-5128
2017-05-26 09:52:47 -07:00
Matthias J. Sax faa1803aa3 KAFKA-5309: Stores not queryable after one thread died
- introduces a new thread state DEAD
- ignores DEAD threads when querying

Author: Matthias J. Sax <matthias@confluent.io>

Reviewers: Damian Guy, Eno Thereska, Guozhang Wang

Closes #3140 from mjsax/kafka-5309-stores-not-queryable
2017-05-26 09:42:02 -07:00
Jason Gustafson 3743363827 MINOR: Preserve the base offset of the original record batch in V2
The previous code did not handle this correctly if a batch was
compacted more than once.

Also add test case for duplicate check after log cleaning and
improve various comments.

Author: Jason Gustafson <jason@confluent.io>

Reviewers: Ismael Juma <ismael@juma.me.uk>

Closes #3145 from hachikuji/minor-improve-base-sequence-docs
2017-05-26 09:46:09 +01:00
Apurva Mehta 02c0c3b017 KAFKA-5147; Add missing synchronization to TransactionManager
The basic idea is that exactly three collections, i.e., `pendingRequests`, `newPartitionsToBeAddedToTransaction`, and `partitionsInTransaction`, are accessed from the context of application threads. The first two are modified from the application threads, and the last is read from those threads.

So to make the `TransactionManager` truly thread safe, we have to ensure that all accesses to these three members are done in a synchronized block. I inspected the code, and I believe this patch puts the synchronization in all the correct places.
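
The approach can be sketched in plain Java as follows. The three field names mirror the commit message; the class `SynchronizedTxnStateSketch`, its methods, and the use of strings for partitions and requests are hypothetical simplifications, not the actual `TransactionManager` code.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;

// Hypothetical model: every access to the three shared collections is synchronized.
final class SynchronizedTxnStateSketch {
    private final Queue<String> pendingRequests = new ArrayDeque<>();
    private final Set<String> newPartitionsToBeAddedToTransaction = new HashSet<>();
    private final Set<String> partitionsInTransaction = new HashSet<>();

    // Application thread: modifies the first two collections, reads the third.
    synchronized void maybeAddPartition(String partition) {
        if (!partitionsInTransaction.contains(partition)
                && newPartitionsToBeAddedToTransaction.add(partition)) {
            pendingRequests.add("AddPartitions:" + partition);
        }
    }

    // Sender thread: takes one request, or returns null when nothing is pending.
    synchronized String nextRequest() {
        return pendingRequests.poll();
    }

    // Sender thread: promotes partitions once the broker has acknowledged them.
    synchronized void onPartitionsAdded() {
        partitionsInTransaction.addAll(newPartitionsToBeAddedToTransaction);
        newPartitionsToBeAddedToTransaction.clear();
    }

    synchronized int pendingCount() {
        return pendingRequests.size();
    }

    synchronized boolean isInTransaction(String partition) {
        return partitionsInTransaction.contains(partition);
    }
}
```

Because the check and the update in `maybeAddPartition` happen under one monitor, several application threads adding the same partition enqueue only one request for it.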

Author: Apurva Mehta <apurva@confluent.io>

Reviewers: Jason Gustafson <jason@confluent.io>

Closes #3132 from apurvam/KAFKA-5147-transaction-manager-synchronization-fixes
2017-05-25 16:21:04 -07:00
Rajini Sivaram 73ca0d215e KAFKA-5320: Include all request throttling in client throttle metrics
Author: Rajini Sivaram <rajinisivaram@googlemail.com>

Reviewers: Jun Rao <junrao@gmail.com>, Ismael Juma <ismael@juma.me.uk>

Closes #3137 from rajinisivaram/KAFKA-5320
2017-05-25 20:28:18 +01:00
Damian Guy 20e2008785 KAFKA-5279: TransactionCoordinator must expire transactionalIds
Remove transactions that have not been updated for at least `transactional.id.expiration.ms`.
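
The expiration rule can be sketched as below. This is a deliberately small model: `TransactionalIdExpirySketch` and its methods are hypothetical, and it considers only the last update time, whereas the real coordinator may also take transaction state into account.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical model: tracks the last update time per transactional id.
final class TransactionalIdExpirySketch {
    private final Map<String, Long> lastUpdateMs = new HashMap<>();

    void update(String transactionalId, long nowMs) {
        lastUpdateMs.put(transactionalId, nowMs);
    }

    // Removes ids not updated for at least expirationMs and returns them.
    List<String> removeExpired(long nowMs, long expirationMs) {
        List<String> removed = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = lastUpdateMs.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (nowMs - entry.getValue() >= expirationMs) {
                removed.add(entry.getKey());
                it.remove();
            }
        }
        return removed;
    }

    boolean contains(String transactionalId) {
        return lastUpdateMs.containsKey(transactionalId);
    }
}
```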

Author: Damian Guy <damian.guy@gmail.com>

Reviewers: Apurva Mehta, Guozhang Wang

Closes #3101 from dguy/kafka-5279
2017-05-25 11:01:10 -07:00
Rajini Sivaram 64fc1a7cae KAFKA-5263: Avoid tight polling loop in consumer with no ready nodes
For consumers with manual partition assignment, await metadata when there are no ready nodes to avoid busy polling.

Author: Rajini Sivaram <rajinisivaram@googlemail.com>

Reviewers: Jason Gustafson <jason@confluent.io>, Ismael Juma <ismael@juma.me.uk>

Closes #3124 from rajinisivaram/KAFKA-5263
2017-05-25 11:23:18 +01:00
Colin P. Mccabe a10990f44b MINOR: fix flakiness in testDeleteAcls
This call to isCompletedExceptionally introduced a race condition
because the future might not have been completed.  assertFutureError
checks that the exception is present and of the correct type in any
case, so the call was not necessary.
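
The race can be reproduced with the JDK's `CompletableFuture`, used here as a stand-in for the future type in the test. `FutureErrorSketch` and this version of `assertFutureError` (including its 5 second timeout) are assumptions for illustration, not the helper from the Kafka test suite.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class FutureErrorSketch {
    // Waits for completion, then checks that the future failed with the expected type.
    static void assertFutureError(CompletableFuture<?> future, Class<? extends Throwable> expected) {
        try {
            future.get(5, TimeUnit.SECONDS);
        } catch (ExecutionException e) {
            if (!expected.isInstance(e.getCause())) {
                throw new AssertionError("Unexpected cause: " + e.getCause());
            }
            return;
        } catch (InterruptedException | TimeoutException e) {
            throw new AssertionError("Future did not complete: " + e);
        }
        throw new AssertionError("Expected the future to fail");
    }

    public static void main(String[] args) {
        CompletableFuture<Void> future = new CompletableFuture<>();
        // Racy check: this is false whenever the future has not been completed yet.
        System.out.println(future.isCompletedExceptionally()); // prints false
        new Thread(() -> future.completeExceptionally(new IllegalStateException("denied"))).start();
        // Blocking on the result removes the race.
        assertFutureError(future, IllegalStateException.class);
        System.out.println("failure observed");
    }
}
```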

Author: Colin P. Mccabe <cmccabe@confluent.io>

Reviewers: Ismael Juma <ismael@juma.me.uk>

Closes #3139 from cmccabe/fix-test-deleteacls
2017-05-25 11:21:00 +01:00
Jason Gustafson cea319a4ad KAFKA-4935; Deprecate client checksum API and compute lazy partial checksum for magic v2
Author: Jason Gustafson <jason@confluent.io>

Reviewers: Apurva Mehta <apurva@confluent.io>, Ismael Juma <ismael@juma.me.uk>

Closes #3123 from hachikuji/KAFKA-4935
2017-05-25 08:21:01 +01:00
Jason Gustafson fdcee8b8b3 MINOR: GroupCoordinator can append with group lock
Author: Jason Gustafson <jason@confluent.io>

Reviewers: Guozhang Wang <wangguoz@gmail.com>

Closes #3133 from hachikuji/minor-replica-manager-append-refactor
2017-05-24 21:00:44 -07:00
Jeyhun Karimov c5d44af774 KAFKA-4144 Follow-up: add one missing overload function to maintain backward compatibility
A follow-up PR to fix [issue](2cd0b87bc8 (commitcomment-22200864))

Author: Jeyhun Karimov <je.karimov@gmail.com>

Reviewers: Matthias J. Sax, Eno Thereska, Bill Bejeck, Guozhang Wang

Closes #3109 from jeyhunkarimov/KAFKA-4144-follow-up
2017-05-24 19:00:37 -07:00
Jason Gustafson 38f6cae9e8 KAFKA-5259; TransactionalId auth implies ProducerId auth
Author: Jason Gustafson <jason@confluent.io>

Reviewers: Apurva Mehta <apurva@confluent.io>, Jun Rao <junrao@gmail.com>

Closes #3075 from hachikuji/KAFKA-5259-FIXED
2017-05-24 15:26:46 -07:00
Vahid Hashemian 88200938f0 MINOR: Improve the help doc of consumer group command
Clarify the consumer group command help message around `zookeeper`, `bootstrap-server`, and `new-consumer` options.

Author: Vahid Hashemian <vahidhashemian@us.ibm.com>

Reviewers: Jason Gustafson <jason@confluent.io>

Closes #2046 from vahidhashemian/minor/improve_consumer_group_command_doc
2017-05-24 09:14:10 -07:00
Apurva Mehta 4d89db9682 KAFKA-5273: Make KafkaConsumer.committed query the server for all partitions
Before this patch the consumer would return the cached offsets for partitions in its current assignment. This worked when all the offset commits went through the consumer.

With KIP-98, offsets can be committed transactionally through the producer. This means that relying on cached positions in the consumer returns incorrect information: since commits go through the producer, the cache is never updated.

Hence we need to update the `KafkaConsumer.committed` method to always query the server for the last committed offset to ensure it gets the correct information every time.
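
The staleness problem can be modelled in a few lines. Everything in this sketch is hypothetical (`CommittedOffsetSketch`, the nested `Server`, string partition names); it only shows why a consumer-side cache misses commits that arrive through another path.

```java
import java.util.HashMap;
import java.util.Map;

final class CommittedOffsetSketch {
    // Hypothetical stand-in for the server-side offset store.
    static final class Server {
        private final Map<String, Long> offsets = new HashMap<>();

        synchronized void commit(String partition, long offset) {
            offsets.put(partition, offset);
        }

        synchronized Long fetch(String partition) {
            return offsets.get(partition);
        }
    }

    private final Server server;
    private final Map<String, Long> cache = new HashMap<>();

    CommittedOffsetSketch(Server server) {
        this.server = server;
    }

    // Commit through the consumer: updates both the server and the local cache.
    void commit(String partition, long offset) {
        server.commit(partition, offset);
        cache.put(partition, offset);
    }

    // Behavior before the patch, as modelled here: answer from the cache.
    Long committedFromCache(String partition) {
        return cache.get(partition);
    }

    // Behavior after the patch, as modelled here: always ask the server.
    Long committed(String partition) {
        return server.fetch(partition);
    }
}
```

If the consumer commits offset 5 and a producer then commits offset 9 directly on the server, `committedFromCache` still returns 5 while `committed` returns 9.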

Author: Apurva Mehta <apurva@confluent.io>

Reviewers: Jason Gustafson, Guozhang Wang

Closes #3119 from apurvam/KAFKA-5273-kafkaconsumer-committed-should-always-hit-server
2017-05-23 23:08:57 -07:00
dejan2609 90dc2d16fb KAFKA-5081: Force jackson-annotations to a single version matching expected Jackson version
**JIRA ticket:** [KAFKA-5081 two versions of jackson-annotations-xxx.jar in distribution tgz](https://issues.apache.org/jira/browse/KAFKA-5081)

**Solutions:**
1. accept this merge request **_OR_**
2. upgrade jackson libraries to version **_2.9.x_** (currently available as a pre-release only)

**Related jackson issue:** [Add explicit \`jackson-annotations\` dependency version for \`jackson-databind\`](https://github.com/FasterXML/jackson-databind/issues/1545)

**Note:** the previous (equivalent) merge request #2900 got buried under a swarm of messages caused by a flaky test, so I opted to close it and open this one.

ijuma: FYI

Author: dejan2609 <dejan2609@gmail.com>

Reviewers: Ewen Cheslack-Postava <ewen@confluent.io>

Closes #3116 from dejan2609/KAFKA-5081
2017-05-23 22:07:17 -07:00
Ismael Juma 516d8457d8 KAFKA-5135; Controller Health Metrics (KIP-143)
Author: Ismael Juma <ismael@juma.me.uk>

Reviewers: Jun Rao <junrao@gmail.com>, Onur Karaman <okaraman@linkedin.com>

Closes #2983 from ijuma/kafka-5135-controller-health-metrics-kip-143
2017-05-24 00:37:06 +01:00
Onur Karaman beeddc25d6 KAFKA-5310; reset ControllerContext during resignation
This ticket is all about ControllerContext initialization and teardown. The key points are:
1. We should tear down ControllerContext during resignation instead of waiting on election to fix it up. A heap dump shows that the former controller keeps pretty much all of its ControllerContext state lying around.
2. We don't properly tear down/reset ControllerContext.partitionsBeingReassigned. This can cause problems when the former controller is re-elected as controller at a later point in time.

Suppose a partition assignment is initially R0. Now suppose a reassignment R1 gets stuck during controller C0 and an admin tries to "undo" R1 (by deleting /admin/reassign_partitions, deleting /controller, and submitting another reassignment specifying R0). The new controller C1 may succeed with R0. If the controller moves back to C0, it will then reattempt R1 even though that partition reassignment has been cleared from zookeeper prior to shifting the controller back to C0. This results in the actual partition reassignment in zookeeper being unexpectedly changed back to R1.
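
The scenario above can be modelled with a tiny sketch. `ControllerContextSketch` and its methods are hypothetical and reduce the context to a single set; the point is only that state cached across a resignation survives unless it is cleared explicitly.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of per-controller cached state.
final class ControllerContextSketch {
    private final Set<String> partitionsBeingReassigned = new HashSet<>();
    private boolean active = false;

    // On election the controller picks up whatever reassignments are currently registered.
    void onElection(Set<String> registeredReassignments) {
        active = true;
        partitionsBeingReassigned.addAll(registeredReassignments);
    }

    // Teardown on resignation: drop cached state instead of relying on the next election.
    void onResignation() {
        active = false;
        partitionsBeingReassigned.clear();
    }

    boolean isActive() {
        return active;
    }

    Set<String> partitionsBeingReassigned() {
        return new HashSet<>(partitionsBeingReassigned);
    }
}
```

Without the `clear()` call, a controller re-elected after the reassignment was removed would still see the old entry and retry it.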

Author: Onur Karaman <okaraman@linkedin.com>

Reviewers: Jun Rao <junrao@gmail.com>

Closes #3122 from onurkaraman/KAFKA-5310
2017-05-23 16:29:55 -07:00
Guozhang Wang 1bf6483316 KAFKA-5280: Protect txn metadata map with read-write lock
Two major changes plus one minor change:

0. Change stateLock to a read-write lock.

1. Put the check of "isCoordinator" and "coordinatorLoading" together with the return of the metadata under one read lock block; otherwise we can get incorrect behavior if the metadata cache changes after the check but before the metadata is accessed.

2. Grab the read lock right before trying to append to the local txn log and hold it until the local append returns; this avoids the scenario in which the epoch changes while we are appending to the local log (e.g. emigration followed by immigration).

3. Only watch on txnId instead of txnId and txnPartitionId in the txn marker purgatory, and disable the reaper thread, as we can now safely clear all the delayed operations by traversing the marker queues.
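
Points 0 and 1 can be sketched with the JDK's `ReentrantReadWriteLock`. The class `TxnMetadataCacheSketch`, its map layout, and the method names are hypothetical; the sketch shows only the locking pattern, not the coordinator's actual state manager.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.locks.ReentrantReadWriteLock;

final class TxnMetadataCacheSketch {
    private final ReentrantReadWriteLock stateLock = new ReentrantReadWriteLock();
    // Partitions this (hypothetical) coordinator currently owns, with their txn metadata.
    private final Map<Integer, Map<String, String>> owned = new HashMap<>();

    // The ownership check and the metadata read share one read lock, so the partition
    // cannot be unloaded between the two steps.
    Optional<String> getState(int partition, String txnId) {
        stateLock.readLock().lock();
        try {
            Map<String, String> metadata = owned.get(partition);
            if (metadata == null) {
                return Optional.empty(); // not the coordinator for this partition
            }
            return Optional.ofNullable(metadata.get(txnId));
        } finally {
            stateLock.readLock().unlock();
        }
    }

    // Returns false when the partition is not owned.
    boolean putState(int partition, String txnId, String state) {
        stateLock.writeLock().lock();
        try {
            Map<String, String> metadata = owned.get(partition);
            if (metadata == null) {
                return false;
            }
            metadata.put(txnId, state);
            return true;
        } finally {
            stateLock.writeLock().unlock();
        }
    }

    // Immigration: start owning a partition.
    void loadPartition(int partition) {
        stateLock.writeLock().lock();
        try {
            owned.putIfAbsent(partition, new HashMap<>());
        } finally {
            stateLock.writeLock().unlock();
        }
    }

    // Emigration: stop owning a partition and drop its metadata.
    void unloadPartition(int partition) {
        stateLock.writeLock().lock();
        try {
            owned.remove(partition);
        } finally {
            stateLock.writeLock().unlock();
        }
    }
}
```

Readers can proceed concurrently, while loading or unloading a partition takes the write lock and therefore excludes any in-flight check-then-read.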

Author: Guozhang Wang <wangguoz@gmail.com>

Reviewers: Jason Gustafson, Jun Rao

Closes #3082 from guozhangwang/K5231-read-write-lock
2017-05-23 13:34:43 -07:00