This PR is part of KIP-1033 which aims to bring a ProcessingExceptionHandler to Kafka Streams in order to deal with exceptions that occur during processing.
This PR catches the exceptions thrown while handling a processing exception.
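For context, a minimal sketch of a custom handler implementing the KIP-1033 interface (the interface and enum names follow the KIP; the logging behavior itself is illustrative):

```java
import java.util.Map;
import org.apache.kafka.streams.errors.ErrorHandlerContext;
import org.apache.kafka.streams.errors.ProcessingExceptionHandler;
import org.apache.kafka.streams.processor.api.Record;

// Illustrative handler that logs the failure and skips the record.
public class LogAndContinueProcessingExceptionHandler implements ProcessingExceptionHandler {
    @Override
    public ProcessingHandlerResponse handle(final ErrorHandlerContext context,
                                            final Record<?, ?> record,
                                            final Exception exception) {
        // If this method itself throws, the catch-and-wrap path added by
        // this PR is what keeps the failure from escaping unhandled.
        System.err.printf("Skipping record for task %s due to: %s%n",
                context.taskId(), exception);
        return ProcessingHandlerResponse.CONTINUE;
    }

    @Override
    public void configure(final Map<String, ?> configs) { }
}
```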
Co-authored-by: Dabz <d.gasparina@gmail.com>
Co-authored-by: loicgreffier <loic.greffier@michelin.com>
Reviewers: Bruno Cadonna <bruno@confluent.io>, Matthias J. Sax <matthias@confluent.io>
This PR is part of KIP-1033 which aims to bring a ProcessingExceptionHandler to Kafka Streams in order to deal with exceptions that occur during processing.
This PR exposes the new ErrorHandlerContext as a parameter to the deserialization exception handlers and deprecates the previous handle signature.
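A minimal sketch of a handler overriding the new context-based signature (assuming the method shape described in KIP-1033; the log-and-skip behavior is illustrative):

```java
import java.util.Map;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.streams.errors.DeserializationExceptionHandler;
import org.apache.kafka.streams.errors.ErrorHandlerContext;

// Illustrative handler that skips records which cannot be deserialized.
public class LogAndSkipDeserializationHandler implements DeserializationExceptionHandler {
    @Override
    public DeserializationHandlerResponse handle(final ErrorHandlerContext context,
                                                 final ConsumerRecord<byte[], byte[]> record,
                                                 final Exception exception) {
        // ErrorHandlerContext replaces the deprecated ProcessorContext parameter.
        System.err.printf("Bad record at %s-%d offset %d: %s%n",
                context.topic(), context.partition(), context.offset(), exception);
        return DeserializationHandlerResponse.CONTINUE;
    }

    @Override
    public void configure(final Map<String, ?> configs) { }
}
```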
Co-authored-by: Dabz <d.gasparina@gmail.com>
Co-authored-by: loicgreffier <loic.greffier@michelin.com>
Reviewers: Bruno Cadonna <bruno@confluent.io>, Matthias J. Sax <matthias@confluent.io>
This PR is part of KIP-1033 which aims to bring a ProcessingExceptionHandler to Kafka Streams in order to deal with exceptions that occur during processing.
This PR exposes the new ErrorHandlerContext as a parameter to the production exception handler and deprecates the previous handle signature.
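A minimal sketch under the same assumption for the production side (the drop-oversized-records policy is illustrative):

```java
import java.util.Map;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.errors.RecordTooLargeException;
import org.apache.kafka.streams.errors.ErrorHandlerContext;
import org.apache.kafka.streams.errors.ProductionExceptionHandler;

// Illustrative handler: drop oversized records, fail on everything else.
public class DropOversizedRecordsHandler implements ProductionExceptionHandler {
    @Override
    public ProductionExceptionHandlerResponse handle(final ErrorHandlerContext context,
                                                     final ProducerRecord<byte[], byte[]> record,
                                                     final Exception exception) {
        return exception instanceof RecordTooLargeException
                ? ProductionExceptionHandlerResponse.CONTINUE
                : ProductionExceptionHandlerResponse.FAIL;
    }

    @Override
    public void configure(final Map<String, ?> configs) { }
}
```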
Co-authored-by: Dabz <d.gasparina@gmail.com>
Co-authored-by: loicgreffier <loic.greffier@michelin.com>
Reviewers: Bruno Cadonna <bruno@confluent.io>, Matthias J. Sax <matthias@confluent.io>
Identified a couple of reliability issues in the broker code for share groups:
1. The broker can get stuck when multiple share consumers are in use, due to a corner case where the second-to-last fetch request contains no topic partitions to fetch; because of this, the broker can never complete the last request, and a share fetch request gets stuck.
2. Since the persister does not perform any business logic around sending state batches for a share partition, there can be scenarios where it sends state batches with no AVAILABLE records. This can breach the configured limit on in-flight messages, and hence the broker can never complete the share fetch requests.
Reviewers: Andrew Schofield <aschofield@confluent.io>, Apoorv Mittal <apoorvmittal10@gmail.com>, Manikumar Reddy <manikumar.reddy@gmail.com>
This change allows the KRaft leader to send the DescribeQuorumResponse using the schema version requested by the client.
Reviewers: José Armando García Sancio <jsancio@apache.org>
Extend LeaderChangeMessage schema to support version 1 of the message. The leader will continue to write version 0 of the schema. This is needed so that in the future the leader can write version 1 of the message and be guaranteed that all of the replicas in the cluster support version 1 of the schema.
Reviewers: José Armando García Sancio <jsancio@apache.org>
Implement the RemoveVoter RPC. The general algorithm is as follows:
1. Check that the leader has fenced the previous leader(s) by checking that the HWM is known;
otherwise return the REQUEST_TIMED_OUT error.
2. Check that the cluster supports kraft.version 1; otherwise return the UNSUPPORTED_VERSION error.
3. Check that there are no uncommitted voter changes; otherwise return the REQUEST_TIMED_OUT error.
4. Append the updated VotersRecord to the log. The KRaft internal listener will read this uncommitted
record from the log and remove the voter from the set of voters.
5. Wait for the VotersRecord to commit using the majority of the new set of voters. Return a
REQUEST_TIMED_OUT error if it doesn't commit in time.
6. Send the RemoveVoter successful response to the client.
7. Resign the leadership if the leader is not in the new voter set.
Note that now that KRaft supports both the RemoveVoter and AddVoter RPCs, only one voter change
can be pending at a time. This is enforced as follows: the AddVoter RPC checks whether there is a
pending AddVoter or RemoveVoter RPC, the RemoveVoter RPC performs the same check, and both RPCs
check that there is no uncommitted VotersRecord. A condensed sketch of the flow appears below.
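Hypothetical condensed sketch of the RemoveVoter flow (helper names such as leaderState, voters, and pendingResponse are illustrative, not the actual KafkaRaftClient API):

```java
// Hypothetical sketch; not the actual KafkaRaftClient code.
RemoveVoterResponseData handleRemoveVoter(RemoveVoterRequestData request) {
    if (!highWatermark().isPresent())
        return errorResponse(Errors.REQUEST_TIMED_OUT);    // step 1: previous leaders not yet fenced
    if (kraftVersion() < 1)
        return errorResponse(Errors.UNSUPPORTED_VERSION);  // step 2
    if (leaderState.hasPendingVoterChange() || hasUncommittedVotersRecord())
        return errorResponse(Errors.REQUEST_TIMED_OUT);    // step 3: one change at a time
    VoterSet updated = voters.without(request.voterId(), request.voterDirectoryId());
    long offset = appendVotersRecord(updated);             // step 4: listener reads this uncommitted
    // Steps 5-7 complete asynchronously: the response is sent once the record
    // commits with a majority of the *new* voter set (or REQUEST_TIMED_OUT),
    // and the leader resigns if it removed itself from the voter set.
    leaderState.trackPendingVoterChange(offset, request);
    return pendingResponse();
}
```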
Reviewers: Colin P. McCabe <cmccabe@apache.org>
This change adds transaction.version (part of KIP-1022).
New transaction version 1 is introduced to support writing flexible fields in transaction state log messages.
Transaction version 2 is created in anticipation of further KIP-890 changes.
Neither is made production-ready yet. Tests for the new transaction versions and the new MV are added.
This change also stops reporting a feature as supported if its version range is 0-0, as illustrated below.
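A hypothetical helper illustrating that check:

```java
// Hypothetical helper mirroring the described behavior: a supported
// version range of 0-0 means the feature is effectively absent, so it
// must not be advertised as supported.
static boolean shouldAdvertiseFeature(short minVersion, short maxVersion) {
    return !(minVersion == 0 && maxVersion == 0);
}
```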
Reviewers: Jun Rao <junrao@apache.org>, David Jacot <djacot@confluent.io>, Artem Livshits <alivshits@confluent.io>, Colin P. McCabe <cmccabe@apache.org>
To avoid a resource leak, we need to close the AdminClient after the test.
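A minimal sketch of the pattern (the listed operation and bootstrap address are illustrative):

```java
import java.util.Properties;

import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.junit.jupiter.api.Test;

class AdminClientLeakTest {
    @Test
    void shouldNotLeakAdminClient() throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        // try-with-resources closes the client even if the body throws.
        try (Admin admin = Admin.create(props)) {
            admin.listTopics().names().get();
        }
    }
}
```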
Reviewers: Lianet Magrans <lianetmr@gmail.com>, Matthias J. Sax <matthias@confluent.io>
Co-authored-by: Dabz <d.gasparina@gmail.com>
Co-authored-by: loicgreffier <loic.greffier@michelin.com>
Minor code improvements across different classes, related to the `ProcessingExceptionHandler` implementation (KIP-1033).
Reviewers: Bruno Cadonna <bruno@confluent.io>, Matthias J. Sax <matthias@confluent.io>
Adding the generated code path to the test scope of sourceSets makes Gradle assume there are tests in the module, which results in a failing test task since the module does not define a test engine.
Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
This PR completely migrates HeartbeatRequestManagerTest away from ConsumerTestBuilder, removing all spy objects and using mocks wherever possible (see the sketch below). All 31 tests pass. It also removes one test that was duplicated in another test file.
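For illustration, the mock-over-spy pattern looks roughly like this (the stubbed dependency and its method are assumptions, not an excerpt from the test):

```java
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.when;

import java.util.Optional;

import org.apache.kafka.clients.consumer.internals.CoordinatorRequestManager;
import org.apache.kafka.common.Node;
import org.junit.jupiter.api.Test;

class HeartbeatRequestManagerMockingSketch {
    @Test
    void shouldStubCoordinatorWithMockNotSpy() {
        // A pure mock constructs no real object; only stubbed calls matter,
        // which keeps the test independent of the real class's internals.
        Node coordinatorNode = new Node(1, "localhost", 9092);
        CoordinatorRequestManager coordinatorManager = mock(CoordinatorRequestManager.class);
        when(coordinatorManager.coordinator()).thenReturn(Optional.of(coordinatorNode));
        // ... pass coordinatorManager into the HeartbeatRequestManager under test
    }
}
```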
Reviewers: Lianet Magrans <lianetmr@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>
This change implements the AddVoter RPC. The high-level algorithm is as follows:
1. Check that the leader has fenced the previous leader(s) by checking that the HWM is known;
otherwise return the REQUEST_TIMED_OUT error.
2. Check that the cluster supports kraft.version 1; otherwise return the UNSUPPORTED_VERSION error.
3. Check that there are no uncommitted voter changes; otherwise return the REQUEST_TIMED_OUT error.
4. Check that the new voter's id is not part of the existing voter set; otherwise return the
DUPLICATE_VOTER error.
5. Send an API_VERSIONS RPC to the first (default) listener to discover the supported kraft.version
of the new voter.
6. Check that the new voter supports the current kraft.version; otherwise return the
INVALID_REQUEST error.
7. Check that the new voter is caught up to the log end offset of the leader; otherwise return a
REQUEST_TIMED_OUT error.
8. Append the updated VotersRecord to the log. The KRaft internal listener will read this
uncommitted record from the log and add the new voter to the set of voters.
9. Wait for the VotersRecord to commit using the majority of the new set of voters. Return a
REQUEST_TIMED_OUT error if it doesn't commit in time.
10. Send the AddVoter successful response to the client.
The leader implements the above algorithm by tracking three events: the ADD_VOTER request is received,
the API_VERSIONS response is received, and finally the HWM is updated.
The state of the ADD_VOTER operation is tracked by LeaderState using the AddVoterHandlerState, and the
algorithm is implemented by the AddVoterHandler type, as sketched below.
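A hypothetical sketch of how the three events might drive that state (method and field names are illustrative, not the actual AddVoterHandler code):

```java
// Hypothetical sketch; not the actual AddVoterHandler implementation.
void onAddVoterRequest(AddVoterRequestData request) {           // event 1
    // Steps 1-4 are validated here before anything is sent.
    state = AddVoterState.awaitingApiVersions(request);
    channel.sendApiVersions(request.listeners().first());       // step 5
}

void onApiVersionsResponse(ApiVersionsResponseData response) {  // event 2
    if (!supportsCurrentKraftVersion(response)) {
        state.complete(Errors.INVALID_REQUEST);                 // step 6
        return;
    }
    // Step 7 (new voter caught up to the leader's log end offset) is also
    // verified before appending; omitted here for brevity.
    long offset = appendVotersRecord(state.newVoter());         // step 8
    state = state.awaitingCommit(offset);
}

void onHighWatermarkUpdate(long highWatermark) {                // event 3
    if (highWatermark >= state.votersRecordOffset())
        state.complete(Errors.NONE);                            // steps 9-10
}
```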
This change also fixes a small issue introduced by the bootstrap checkpoint (0-0.checkpoint). The
internal partition listener (KRaftControlRecordStateMachine) and the external partition listener
(KafkaRaftClient.ListenerContext) were using "nextOffset = 0" as the initial state of the reading
cursor. This was causing the bootstrap checkpoint to keep getting reloaded until the leader wrote a
record to the log. Changing the initial offset (nextOffset) to -1 allows the listeners to
distinguish between the initial state (nextOffset == -1) and the state where the bootstrap checkpoint
has been loaded but the log segment is empty (nextOffset == 0), as illustrated below.
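Illustratively, the fixed listeners can now branch on the sentinel (hypothetical code, not the actual listener):

```java
// Hypothetical sketch of the reading cursor after the fix.
if (nextOffset == -1) {
    loadBootstrapCheckpoint();  // initial state: load 0-0.checkpoint exactly once
    nextOffset = 0;             // checkpoint consumed; the log itself is still empty
} else {
    readLogFrom(nextOffset);    // normal path: the checkpoint is never reloaded
}
```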
Reviewers: Colin P. McCabe <cmccabe@apache.org>