Commit Graph

170 Commits

Author SHA1 Message Date
Gyeongwon, Do abde0e0878
MINOR: fix typo and comment (#14650)
Reviewers: hudeqi <1217150961@qq.com>, Ziming Deng <dengziming1993@gmail.com>.
2023-10-28 12:10:53 +08:00
Ismael Juma 69e591db3a
MINOR: Rewrite/Move KafkaNetworkChannel to the `raft` module (#14559)
This is now possible since `InterBrokerSend` was moved from `core` to `server-common`.
Also rewrite/move `KafkaNetworkChannelTest`.

The scala version of `KafkaNetworkChannelTest` passed with the changes here (before I
deleted it).

Reviewers: Justine Olshan <jolshan@confluent.io>, José Armando García Sancio <jsancio@users.noreply.github.com>
2023-10-16 20:10:31 -07:00
Ismael Juma 4cf86c5d2f
KAFKA-15492: Upgrade and enable spotbugs when building with Java 21 (#14533)
Spotbugs was temporarily disabled as part of KAFKA-15485 to support Kafka build with JDK 21. This PR upgrades the spotbugs version to 4.8.0 which adds support for JDK 21 and enables it's usage on build again.

Reviewers: Divij Vaidya <diviv@amazon.com>
2023-10-12 14:09:10 +02:00
Ismael Juma 98febb989a
KAFKA-15485: Fix "this-escape" compiler warnings introduced by JDK 21 (1/N) (#14427)
This is one of the steps required for kafka to compile with Java 21.

For each case, one of the following fixes were applied:
1. Suppress warning if fixing would potentially result in an incompatible change (for public classes)
2. Add final to one or more methods so that the escape is not possible
3. Replace method calls with direct field access.

In addition, we also fix a couple of compiler warnings related to deprecated references in the `core` module.

See the following for more details regarding the new lint warning:
https://www.oracle.com/java/technologies/javase/21-relnote-issues.html#JDK-8015831

Reviewers: Divij Vaidya <diviv@amazon.com>, Satish Duggana <satishd@apache.org>, Chris Egerton <chrise@aiven.io>
2023-09-24 05:59:29 -07:00
José Armando García Sancio 7b669e8806
KAFKA-14273; Close file before atomic move (#14354)
In the Windows OS atomic move are not allowed if the file has another open handle. E.g

__cluster_metadata-0\quorum-state: The process cannot access the file because it is being used by another process
        at java.base/sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:92)
        at java.base/sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:103)
        at java.base/sun.nio.fs.WindowsFileCopy.move(WindowsFileCopy.java:403)
        at java.base/sun.nio.fs.WindowsFileSystemProvider.move(WindowsFileSystemProvider.java:293)
        at java.base/java.nio.file.Files.move(Files.java:1430)
        at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:949)
        at org.apache.kafka.common.utils.Utils.atomicMoveWithFallback(Utils.java:932)
        at org.apache.kafka.raft.FileBasedStateStore.writeElectionStateToFile(FileBasedStateStore.java:152)

This is fixed by first closing the temporary quorum-state file before attempting to move it.

Reviewers: Colin Patrick McCabe <cmccabe@apache.org>
Co-Authored-By: Renaldo Baur Filho <renaldobf@gmail.com>
2023-09-07 16:17:03 -07:00
mannoopj 2e3ff21c2e
KAFKA-15412: Reading an unknown version of quorum-state-file should trigger an error (#14302)
Reading an unknown version of quorum-state-file should trigger an error. Currently the only known version is 0. Reading any other version should cause an error.

Reviewers: Justine Olshan <jolshan@confluent.io>, Luke Chen <showuon@gmail.com>
2023-08-30 15:03:41 +08:00
Phuc-Hong-Tran 8d12c1175c
KAFKA-15152: Fix incorrect format specifiers when formatting string (#14026)
Reviewers: Divij Vaidya <diviv@amazon.com>

Co-authored-by: phuchong.tran <phuchong.tran@servicenow.com>
2023-08-24 19:38:45 +02:00
José Armando García Sancio 3f4816dd3e
KAFKA-15345; KRaft leader notifies leadership when listener reaches epoch start (#14213)
In a non-empty log the KRaft leader only notifies the listener of leadership when it has read to the leader's epoch start offset. This guarantees that the leader epoch has been committed and that the listener has read all committed offsets/records.

Unfortunately, the KRaft leader doesn't do this when the log is empty. When the log is empty the listener is notified immediately when it has become leader. This makes the API inconsistent and harder to program against.

This change fixes that by having the KRaft leader wait for the listener's nextOffset to be greater than the leader's epochStartOffset before calling handleLeaderChange.

The RecordsBatchReader implementation is also changed to include control records. This makes it possible for the state machine learn about committed control records. This additional information can be used to compute the committed offset or for counting those bytes when determining when to snapshot the partition.

Reviewers: Colin P. McCabe <cmccabe@apache.org>, Jason Gustafson <jason@confluent.io>
2023-08-17 18:40:17 -07:00
José Armando García Sancio dafe51b658
KAFKA-15100; KRaft data race with the expiration service (#14141)
The KRaft client uses an expiration service to complete FETCH requests that have timed out. This expiration service uses a different thread from the KRaft polling thread. This means that it is unsafe for the expiration service thread to call tryCompleteFetchRequest. tryCompleteFetchRequest reads and updates a lot of states that is assumed to be only be read and updated from the polling thread.

The KRaft client now does not call tryCompleteFetchRequest when the FETCH request has expired. It instead will send the FETCH response that was computed when the FETCH request was first handled.

This change also fixes a bug where the KRaft client was not sending the FETCH response immediately, if the response contained a diverging epoch or snapshot id.

Reviewers: Jason Gustafson <jason@confluent.io>
2023-08-09 07:12:08 -07:00
José Armando García Sancio e0727063f7
KAFKA-15312; Force channel before atomic file move (#14162)
On ext4 file systems we have seen snapshots with zero-length files. This is possible if
the file is closed and moved before forcing the channel to write to disk.

Reviewers: Ron Dagostino <rndgstn@gmail.com>, Alok Thatikunta <athatikunta@confluent.io>
2023-08-08 14:31:42 -07:00
Colin Patrick McCabe 10bcd4fc7f
KAFKA-15213: provide the exact offset to QuorumController.replay (#13643)
Provide the exact record offset to QuorumController.replay() in all cases. There are several situations
where this is useful, such as logging, implementing metadata transactions, or handling broker
registration records.

In the case where the QC is inactive, and simply replaying records, it is easy to compute the exact
record offset from the batch base offset and the record index.

The active QC case is more difficult. Technically, when we submit records to the Raft layer, it can
choose a batch base offset later than the one we expect, if someone else is also adding records.
While the QC is the only entity submitting data records, control records may be added at any time.
In the current implementation, these are really only used for leadership elections. However, this
could change with the addition of quorum reconfiguration or similar features.

Therefore, this PR allows the QC to tell the Raft layer that a record append should fail if it
would have resulted in a batch base offset other than what was expected. This in turn will trigger a
controller failover. In the future, if automatically added control records become more common, we
may wish to have a more sophisticated system than this simple optimistic concurrency mechanism. But
for now, this will allow us to rely on the offset as correct.

In order that the active QC can learn what offset to start writing at, the PR also adds a new
RaftClient#endOffset function.

At the Raft level, this PR adds a new exception, UnexpectedBaseOffsetException. This gets thrown
when we request a base offset that doesn't match the one the Raft layer would have given us.
Although this exception should cause a failover, it should not be considered a fault. This
complicated the exception handling a bit and motivated splitting more of it out into the new
EventHandlerExceptionInfo class. This will also let us unit test things like slf4j log messages a
bit better.

Reviewers: David Arthur <mumrah@gmail.com>, José Armando García Sancio <jsancio@apache.org>
2023-07-27 17:01:55 -07:00
Colin Patrick McCabe c7de30f38b
KAFKA-15183: Add more controller, loader, snapshot emitter metrics (#14010)
Implement some of the metrics from KIP-938: Add more metrics for
measuring KRaft performance.

Add these metrics to QuorumControllerMetrics:
    kafka.controller:type=KafkaController,name=TimedOutBrokerHeartbeatCount
    kafka.controller:type=KafkaController,name=EventQueueOperationsStartedCount
    kafka.controller:type=KafkaController,name=EventQueueOperationsTimedOutCount
    kafka.controller:type=KafkaController,name=NewActiveControllersCount

Create LoaderMetrics with these new metrics:
    kafka.server:type=MetadataLoader,name=CurrentMetadataVersion
    kafka.server:type=MetadataLoader,name=HandleLoadSnapshotCount

Create SnapshotEmitterMetrics with these new metrics:
    kafka.server:type=SnapshotEmitter,name=LatestSnapshotGeneratedBytes
    kafka.server:type=SnapshotEmitter,name=LatestSnapshotGeneratedAgeMs

Reviewers: Ron Dagostino <rndgstn@gmail.com>
2023-07-24 21:13:58 -07:00
Cheryl Simmons e98508747a
Doc fixes: Fix format and other small errors in config documentation (#13661)
Various formatting fixes in the config docs.

Reviewers: Bill Bejeck <bbejeck@apache.org>
2023-07-10 12:48:35 -04:00
José Armando García Sancio 3a246b1aba
KAFKA-15078; KRaft leader replys with snapshot for offset 0 (#13845)
If the follower has an empty log, fetches with offset 0, it is more
efficient for the leader to reply with a snapshot id (redirect to
FETCH_SNAPSHOT) than for the follower to continue fetching from the log
segments.

Reviewers: David Arthur <mumrah@gmail.com>, dengziming <dengziming1993@gmail.com>
2023-06-28 14:21:11 -07:00
José Armando García Sancio b7a6a8fd5f
KAFKA-15076; KRaft should prefer latest snapshot (#13834)
If the KRaft listener is at offset 0, the start of the log, and KRaft has generated a snapshot, it should prefer the latest snapshot instead of having the listener read from the start of the log.

This is implemented by having KafkaRaftClient send a Listener.handleLoadSnapshot event, if the Listener is at offset 0 and the KRaft partition has generated a snapshot.

Reviewers: Jason Gustafson <jason@confluent.io>, David Arthur <mumrah@gmail.com>
2023-06-12 07:25:42 -07:00
Alok Thatikunta 3d349ae0d6
MINOR; Add helper util Snapshots.lastContainedLogTimestamp (#13772)
This change refactors the lastContainedLogTimestamp to the Snapshots class, for re-usability. Introduces IdentitySerde based on ByteBuffer, required for using RecordsSnapshotReader. This change also removes the "recordSerde: RecordSerde[_]" argument from the KafkaMetadataLog constructor.

Reviewers: José Armando García Sancio <jsancio@apache.org>
2023-06-06 08:29:15 -07:00
mojh7 04f2f6a26a
MINOR: Typo and unused method removal (#13739)
clean up unused private method and removed typos

Reviewers:  Divij Vaidya <diviv@amazon.com>,  Manyanda Chitimbo <manyanda.chitimbo@gmail.com>,  Daniel Scanteianu, Josep Prat <josep.prat@aiven.io>
2023-06-06 10:50:56 +02:00
Divij Vaidya fe6a827e20
KAFKA-14633: Reduce data copy & buffer allocation during decompression (#13135)
After this change,

    For broker side decompression: JMH benchmark RecordBatchIterationBenchmark demonstrates 20-70% improvement in throughput (see results for RecordBatchIterationBenchmark.measureSkipIteratorForVariableBatchSize).
    For consumer side decompression: JMH benchmark RecordBatchIterationBenchmark a mix bag of single digit regression for some compression type to 10-50% improvement for Zstd (see results for RecordBatchIterationBenchmark.measureStreamingIteratorForVariableBatchSize).

Reviewers: Luke Chen <showuon@gmail.com>, Manyanda Chitimbo <manyanda.chitimbo@gmail.com>, Ismael Juma <mail@ismaeljuma.com>
2023-06-05 15:04:49 +08:00
David Mao d944ef1efb MINOR: Rename handleSnapshot to handleLoadSnapshot (#13727)
Rename handleSnapshot to handleLoadSnapshot to make it explicit that it is handling snapshot load,
not generation.

Reviewers: Colin P. McCabe <cmccabe@apache.org>, Jason Gustafson <jason@confluent.io>
2023-05-17 09:57:24 -07:00
Chia-Ping Tsai 3c8665025a
MINOR: move ControlRecordTest to correct directory (#13718)
Reviewers: Kamal Chandraprakash<kamal.chandraprakash@gmail.com>, hudeqi<1217150961@qq.com>, Satish Duggana<satishd@apache.org>
2023-05-17 18:56:02 +05:30
Luke Chen 625ef176ee
MINOR: remove kraft readme link (#13691)
The config/kraft/README.md is already removed. We should also remove the link.

Reviewers: dengziming <dengziming1993@gmail.com>
2023-05-10 16:40:20 +08:00
Manyanda Chitimbo dd63d88ac3
MINOR: fix noticed typo in raft and metadata projects (#13612)
Reviewers: Josep Prat <jlprat@apache.org>
2023-04-21 15:02:06 +02:00
José Armando García Sancio 1f1900b380
MINOR: Improve raft log4j messages a bit (#13553)
Reviewers: Colin P. McCabe <cmccabe@apache.org>
2023-04-14 10:05:22 -07:00
Paolo Patierno 571841fed3
KAFKA-14883: Expose `observer` state in KRaft metrics (#13525)
Currently, the current-state KRaft related metric reports follower state for a broker while technically it should be reported as an observer as the kafka-metadata-quorum tool does.

Reviewers: Luke Chen <showuon@gmail.com>, dengziming <dengziming1993@gmail.com>
2023-04-13 12:55:57 +08:00
José Armando García Sancio 672dd3ab6a
KAFKA-13020; Implement reading Snapshot log append timestamp (#13345)
The SnapshotReader exposes the "last contained log time". This is mainly used during snapshot cleanup. The previous implementation used the append time of the snapshot record. This is not accurate as this is the time when the snapshot was created and not the log append time of the last record included in the snapshot.

The log append time of the last record included in the snapshot is store in the header control record of the snapshot. The header control record is the first record of the snapshot.

To be able to read this record, this change extends the RecordsIterator to decode and expose the control records in the Records type.

Reviewers: Colin Patrick McCabe <cmccabe@apache.org>
2023-04-07 09:25:54 -07:00
José Armando García Sancio d604534cc3
MINOR; Increase log level of some rare events (#13430)
To help debug KRaft's behavior this change increases the log level of
some rare messages to INFO level.

Reviewers: Jason Gustafson <jason@confluent.io>
2023-03-21 17:02:38 -07:00
Calvin Liu 79b5f7f1ce
KAFKA-14617: Add ReplicaState to FetchRequest (KIP-903) (#13323)
This patch is the first part of KIP-903. It updates the FetchRequest to include the new tagged ReplicaState field which replaces the now deprecated ReplicaId field. The FetchRequest version is bumped to version 15 and the MetadataVersion to 3.5-IV1.

Reviewers: David Jacot <djacot@confluent.io>
2023-03-16 14:04:34 +01:00
José Armando García Sancio 44e613c4cd
KAFKA-13884; Only voters flush on Fetch response (#13396)
The leader only requires that voters have flushed their log up to the fetch offset before sending a fetch request.

This change only flushes the log when handling the fetch response, if the follower is a voter. This should improve the disk performance on observers (brokers).

Reviewers: Jason Gustafson <jason@confluent.io>
2023-03-15 12:06:41 -07:00
José Armando García Sancio c13b49f2d1
Revert "KAFKA-14371: Remove unused clusterId field from quorum-state file (#13102)" (#13355)
This reverts commit 0927049a61.

Reviewers: Luke Chen <showuon@gmail.com>
2023-03-06 18:09:21 -08:00
Christo Lolov 5b295293c0
MINOR: Remove unnecessary toString(); fix comment references (#13212)
Reviewers: Mickael Maison <mickael.maison@gmail.com>, Divij Vaidya <diviv@amazon.com>, Lucas Brutschy <lbrutschy@confluent.io>
2023-03-06 18:39:04 +01:00
Gantigmaa Selenge 0927049a61
KAFKA-14371: Remove unused clusterId field from quorum-state file (#13102)
Remove clusterId field from the KRaft controller's quorum-state file $LOG_DIR/__cluster_metadata-0/quorum-state

Reviewers: Luke Chen <showuon@gmail.com>, dengziming <dengziming1993@gmail.com>, Christo Lolov <christololov@gmail.com>
2023-03-01 10:13:38 +08:00
Jason Gustafson 35142d43e6
KAFKA-14664; Fix inaccurate raft idle ratio metric (#13207)
The raft idle ratio is currently computed as the average of all recorded poll durations. This tends to underestimate the actual idle ratio since it treats all measurements equally regardless how much time was spent. For example, say we poll twice with the following durations:

Poll 1: 2s
Poll 2: 0s

Assume that the busy time is negligible, so 2s passes overall.

In the first measurement, 2s is spent waiting, so we compute and record a ratio of 1.0. In the second measurement, no time passes, and we record 0.0. The idle ratio is then computed as the average of these two values (1.0 + 0.0 / 2 = 0.5), which suggests that the process was busy for 1s, which overestimates the true busy time.

In this patch, we create a new `TimeRatio` class which tracks the total duration of a periodic event over a full interval of time measurement.

Reviewers: José Armando García Sancio <jsancio@apache.org>
2023-02-15 14:40:00 -08:00
José Armando García Sancio f9e0d03274
MINOR; Make granting voter immutable (#13154)
Make LeaderState's grantingVoters field explicitly immutable. The set of voters that granted their voter to the current leader was already immutable. This change makes that explicit.

Reviewers: Jason Gustafson <jason@confluent.io>, Mathew Hogan <mathewdhogan@@users.noreply.github.com>
2023-01-25 15:52:01 -08:00
José Armando García Sancio 058d8d530b
KAFKA-14618; Fix off by one error in snapshot id (#13108)
The KRaft client expects the offset of the snapshot id to be an end offset. End offsets are
exclusive. The MetadataProvenance type was createing a snapshot id using the last contained offset
which is inclusive. This change fixes that and renames some of the fields to make this difference
more obvious.

Reviewers: Colin P. McCabe <cmccabe@apache.org>
2023-01-13 10:06:38 -08:00
Jason Gustafson 26a4d42072
MINOR: Pass snapshot ID directly in `RaftClient.createSnapshot` (#12981)
Let `RaftClient.createSnapshot` take the snapshotId directly instead of the committed offset/epoch (which may not exist). 

Reviewers: José Armando García Sancio <jsancio@apache.org>
2022-12-13 10:44:56 -08:00
José Armando García Sancio 3541d5ab18
MINOR; Improve high watermark log messages (#12975)
While debugging KRaft and the metadata state machines it is helpful to always log the first time the replica discovers the high watermark. All other updates to the high watermark are logged at trace because they are more frequent and less useful.

Reviewers: Luke Chen <showuon@gmail.com>
2022-12-12 16:32:16 -08:00
Colin Patrick McCabe 5514f372b3
MINOR: extract jointly owned parts of BrokerServer and ControllerServer (#12837)
Extract jointly owned parts of BrokerServer and ControllerServer into SharedServer. Shut down
SharedServer when the last component using it shuts down. But make sure to stop the raft manager
before closing the ControllerServer's sockets.

This PR also fixes a memory leak where ReplicaManager was not removing some topic metric callbacks
during shutdown. Finally, we now release memory from the BatchMemoryPool in KafkaRaftClient#close.
These changes should reduce memory consumption while running junit tests.

Reviewers: Jason Gustafson <jason@confluent.io>, Ismael Juma <ismael@juma.me.uk>
2022-12-02 00:27:22 -08:00
José Armando García Sancio 72b535acaf
KAFKA-14307; Controller time-based snapshots (#12761)
Implement time based snapshot for the controller. The general strategy for this feature is that the controller will use the record-batch's append time to determine if a snapshot should be generated. If the oldest record that has been committed but is not included in the latest snapshot is older than `metadata.log.max.snapshot.interval.ms`, the controller will trigger a snapshot immediately. This is useful in case the controller was offline for more that `metadata.log.max.snapshot.interval.ms` milliseconds.

If the oldest record that has been committed but is not included in the latest snapshot is NOT older than `metadata.log.max.snapshot.interval.ms`, the controller will schedule a `maybeGenerateSnapshot` deferred task.

It is possible that when the controller wants to generate a new snapshot, either because of time or number of bytes, the controller is currently generating a snapshot. In this case the `SnapshotGeneratorManager` was changed so that it checks and potentially triggers another snapshot when the currently in-progress snapshot finishes.

To better support this feature the following additional changes were made:
1. The configuration `metadata.log.max.snapshot.interval.ms` was added to `KafkaConfig` with a default value of one hour.
2. `RaftClient` was extended to return the latest snapshot id. This snapshot id is used to determine if a given record is included in a snapshot.
3. Improve the `SnapshotReason` type to support the inclusion of values in the message.

Reviewers: Jason Gustafson <jason@confluent.io>, Niket Goel <niket-goel@users.noreply.github.com>
2022-11-21 17:30:50 -08:00
Jason Gustafson c710ecd071
MINOR: Reduce tries in RecordsIteratorTest to improve build time (#12798)
`RecordsIteratorTest` takes the longest times in recent builds (even including integration tests). The default of 1000 tries from jqwik is probably overkill and causes the test to take 10 minutes locally. Decreasing to 50 tries reduces that to less than 30s. 

Reviewers: David Jacot <djacot@confluent.io>
2022-10-31 09:29:19 -07:00
Orsák Maroš a0e37b79aa
MINOR: Add test cases to the Raft module (#12692)
Reviewers: Mickael Maison <mickael.maison@gmail.com>
, Divij Vaidya <diviv@amazon.com>, Ismael Juma <mlists@juma.me.uk>
2022-10-28 17:54:34 +02:00
José Armando García Sancio d0ff869718
MINOR; Add accessor methods to OffsetAndEpoch (#12770)
Accessor are preferred over fields because they compose better with Java's
lambda syntax.

Reviewers: Jason Gustafson <jason@confluent.io>
2022-10-19 12:07:07 -07:00
Niket eb8f0bd5e4
MINOR: Adding KRaft Monitoring Related Metrics to docs/ops.html (#12679)
This commit adds KRaft monitoring related metrics to the Kafka docs (docs/ops.html).

Reviewers: Jason Gustafson <jason@confluent.io>, Luke Chen <showuon@gmail.com>
2022-09-26 14:25:36 +08:00
Luke Chen bf7ddf73af
MINOR: use addExact to avoid overflow and some cleanup (#12660)
What changes in this PR:
1. Use addExact to avoid overflow in BatchAccumulator#bytesNeeded. We did use addExact in bytesNeededForRecords method, but forgot that when returning the result.
2. javadoc improvement

Reviewers: Jason Gustafson <jason@confluent.io>
2022-09-22 09:22:58 +08:00
Colin Patrick McCabe b401fdaefb MINOR: Add more validation during KRPC deserialization
When deserializing KRPC (which is used for RPCs sent to Kafka, Kafka Metadata records, and some
    other things), check that we have at least N bytes remaining before allocating an array of size N.

    Remove DataInputStreamReadable since it was hard to make this class aware of how many bytes were
    remaining. Instead, when reading an individual record in the Raft layer, simply create a
    ByteBufferAccessor with a ByteBuffer containing just the bytes we're interested in.

    Add SimpleArraysMessageTest and ByteBufferAccessorTest. Also add some additional tests in
    RequestResponseTest.

    Reviewers: Tom Bentley <tbentley@redhat.com>, Mickael Maison <mickael.maison@gmail.com>, Colin McCabe <colin@cmccabe.xyz>

    Co-authored-by: Colin McCabe <colin@cmccabe.xyz>
    Co-authored-by: Manikumar Reddy <manikumar.reddy@gmail.com>
    Co-authored-by: Mickael Maison <mickael.maison@gmail.com>
2022-09-21 20:58:23 +05:30
Jason Gustafson 8c8b5366a6
KAFKA-14240; Validate kraft snapshot state on startup (#12653)
We should prevent the metadata log from initializing in a known bad state. If the log start offset of the first segment is greater than 0, then must be a snapshot an offset greater than or equal to it order to ensure that the initialized state is complete.

Reviewers: José Armando García Sancio <jsancio@users.noreply.github.com>
2022-09-19 11:52:48 -07:00
Ashmeet Lamba 86645cb40a
KAFKA-14073; Log the reason for snapshot (#12414)
When a snapshot is taken it is due to either of the following reasons -

    Max bytes were applied
    Metadata version was changed

Once the snapshot process is started, it will log the reason that initiated the process.

Updated existing tests to include code changes required to log the reason. I was not able to check the logs when running tests - could someone guide me on how to enable logs when running a specific test case.

Reviewers: dengziming <dengziming1993@gmail.com>, José Armando García Sancio <jsancio@apache.org>
2022-09-13 10:03:47 -07:00
José Armando García Sancio c5954175a4
KAFKA-14222; KRaft's memory pool should always allocate a buffer (#12625)
Because the snapshot writer sets a linger ms of Integer.MAX_VALUE it is
possible for the memory pool to run out of memory if the snapshot is
greater than 5 * 8MB.

This change allows the BatchMemoryPool to always allocate a buffer when
requested. The memory pool frees the extra allocated buffer when released if
the number of pooled buffers is greater than the configured maximum
batches.

Reviewers: Jason Gustafson <jason@confluent.io>
2022-09-13 08:04:40 -07:00
José Armando García Sancio f83c6f2da4
KAFKA-14183; Cluster metadata bootstrap file should use header/footer (#12565)
The boostrap.checkpoint files should include a control record batch for
the SnapshotHeaderRecord at the start of the file. It should also
include a control record batch for the SnapshotFooterRecord at the end
of the file.

The snapshot header record is important because it versions the rest of
the bootstrap file.

Reviewers: David Arthur <mumrah@gmail.com>
2022-08-27 19:11:06 -07:00
Jason Gustafson 5c52c61a46
MINOR: A few cleanups for DescribeQuorum APIs (#12548)
A few small cleanups in the `DescribeQuorum` API and handling logic:

- Change field types in `QuorumInfo`:
  - `leaderId`: `Integer` -> `int`
  - `leaderEpoch`: `Integer` -> `long` (to allow for type expansion in the future)
  - `highWatermark`: `Long` -> `long`
- Use field names `lastFetchTimestamp` and `lastCaughtUpTimestamp` consistently
- Move construction of `DescribeQuorumResponseData.PartitionData` into `LeaderState`
- Consolidate fetch time/offset update logic into `LeaderState.ReplicaState.updateFollowerState`

Reviewers: Luke Chen <showuon@gmail.com>, José Armando García Sancio <jsancio@users.noreply.github.com>
2022-08-24 13:12:14 -07:00
Niket c7f051914e
KAFKA-13888; Implement `LastFetchTimestamp` and in `LastCaughtUpTimestamp` for DescribeQuorumResponse [KIP-836] (#12508)
This commit implements the newly added fields `LastFetchTimestamp` and `LastCaughtUpTimestamp` for KIP-836: https://cwiki.apache.org/confluence/display/KAFKA/KIP-836:+Addition+of+Information+in+DescribeQuorumResponse+about+Voter+Lag.

Reviewers: Jason Gustafson <jason@confluent.io>
2022-08-19 15:09:09 -07:00