The JSON files in this directory define the Apache Kafka message protocol.
This protocol describes what information clients and servers send to each
other, and how it is serialized. Note that this version of JSON supports
comments. Comments begin with a double forward slash.
When Kafka is compiled, these specification files are translated into Java code
to read and write messages. Any change to these JSON files will trigger a
recompilation of this generated code.
These specification files replace an older system where hand-written
serialization code was used. Over time, we will migrate all messages to
automatically generated serialization and deserialization code.
Requests and Responses
----------------------
The Kafka protocol features requests and responses. Requests are sent to a
server in order to get a response. Each request type is uniquely identified by
a 16-bit integer called the "API key". The API key of the response will always
match that of the request.
Each message has a unique 16-bit version number. The schema might be different
for each version of the message. Sometimes, the version is incremented even
though the schema has not changed. This may indicate that the server should
behave differently in some way. The version of a response must always match
the version of the corresponding request.
Each request or response has a top-level field named "validVersions." This
specifies the versions of the protocol that our code understands. For example,
specifying "0-2" indicates that we understand versions 0, 1, and 2. You must
always specify the highest message version which is supported.
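As a rough illustration of how a closed range such as "0-2" could be interpreted, here is a minimal sketch. The class `ValidVersionsSketch` and its methods are hypothetical names invented for this example; they are not the classes Kafka's generator actually uses.

```java
// Hypothetical helper for illustration only; this is not a class from the Kafka code base.
final class ValidVersionsSketch {
    final short lowest;
    final short highest;

    private ValidVersionsSketch(short lowest, short highest) {
        this.lowest = lowest;
        this.highest = highest;
    }

    // Parses a closed range such as "0-2", or a single version such as "1".
    static ValidVersionsSketch parse(String spec) {
        String s = spec.trim();
        int dash = s.indexOf('-');
        short low;
        short high;
        if (dash < 0) {
            low = Short.parseShort(s);
            high = low;
        } else {
            low = Short.parseShort(s.substring(0, dash));
            high = Short.parseShort(s.substring(dash + 1));
        }
        if (low < 0 || high < low) {
            throw new IllegalArgumentException("Invalid version range: " + spec);
        }
        return new ValidVersionsSketch(low, high);
    }

    // True if a message with this version number falls inside the range.
    boolean supports(short version) {
        return version >= lowest && version <= highest;
    }
}
```

With this sketch, `ValidVersionsSketch.parse("0-2").supports((short) 1)` is true, while version 3 is rejected.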
The only old message versions that are no longer supported are version 0 of
MetadataRequest and MetadataResponse. In general, since we adopted KIP-97,
dropping support for old message versions is no longer allowed without a KIP.
Therefore, please be careful not to increase the lower end of the version
support interval for any message.
MessageData Objects
-------------------
Using the JSON files in this directory, we generate Java code for MessageData
objects. These objects store request and response data for Kafka. MessageData
objects do not contain a version number. Instead, a single MessageData object
represents every possible version of a Message. This makes working with
messages more convenient, because the same code path can be used for every
version of a message.
Fields
------
Each message contains an array of fields. Fields specify the data that should
be sent with the message. In general, fields have a name, a type, and version
information associated with them.
The order in which fields appear in a message is important. Fields which come
first in the message definition will be sent first over the network. Changing
the order of the fields in a message is an incompatible change.
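To make the role of field order and per-field version information concrete, here is a small sketch. It assumes each field carries a first and last version in which it is present; `OrderedFieldSketch` and the field names used with it are hypothetical and exist only for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a field definition; it is not taken from the Kafka code base.
final class OrderedFieldSketch {
    final String name;
    final short firstVersion;
    final short lastVersion;

    OrderedFieldSketch(String name, int firstVersion, int lastVersion) {
        this.name = name;
        this.firstVersion = (short) firstVersion;
        this.lastVersion = (short) lastVersion;
    }

    // Returns the names of the fields sent for the given version.
    // The result follows declaration order, which is also the order on the wire.
    static List<String> wireOrder(List<OrderedFieldSketch> schema, short version) {
        List<String> names = new ArrayList<>();
        for (OrderedFieldSketch field : schema) {
            if (version >= field.firstVersion && version <= field.lastVersion) {
                names.add(field.name);
            }
        }
        return names;
    }
}
```

Because the result follows declaration order, swapping two entries in the schema list changes the byte layout for every version that contains both fields. That is why reordering is an incompatible change.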
In each new message version, we may add or remove fields. For example, if we
are creating a new version 3 of a message, we can add a new field with the
version spec "3+". This specifies that the field only appears in version 3 and
later. If a field is being removed, we should change its version from "0+" to
"0-2" to indicate that it will not appear in version 3 and later.
Field Types
-----------
There are several primitive field types available.
* "boolean": either true or false. This takes up 1 byte on the wire.
* "int8": an 8-bit integer. This also takes up 1 byte on the wire.
* "int16": a 16-bit integer. This takes up 2 bytes on the wire.
* "int32": a 32-bit integer. This takes up 4 bytes on the wire.
* "int64": a 64-bit integer. This takes up 8 bytes on the wire.
* "string": a string. This must be less than 64kb in size when serialized as UTF-8. This takes up 2 bytes on the wire, plus the length of the string when serialized to UTF-8.
* "bytes": binary data. This takes up 4 bytes on the wire, plus the length of the bytes.
In addition to these primitive field types, there is also an array type. Array
types start with a "[]" and end with the name of the element type. For
example, []Foo declares an array of "Foo" objects. Array fields have their own
array of fields, which specifies what is in the contained objects.
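The wire sizes listed for the primitive types can be checked with a short sketch built on `java.nio.ByteBuffer`. The class `PrimitiveWireSketch` is hypothetical. The sketch also assumes that the 2-byte string length is a signed 16-bit value, which is stricter than the size limit stated above.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Illustrative encoder for a few primitive types; all names here are hypothetical.
final class PrimitiveWireSketch {
    // A boolean takes up 1 byte.
    static void writeBoolean(ByteBuffer buf, boolean value) {
        buf.put((byte) (value ? 1 : 0));
    }

    // A string is written as a 2-byte length followed by its UTF-8 bytes.
    static void writeString(ByteBuffer buf, String value) {
        byte[] utf8 = value.getBytes(StandardCharsets.UTF_8);
        // Assumption: the length prefix is a signed 16-bit integer.
        if (utf8.length > Short.MAX_VALUE) {
            throw new IllegalArgumentException("String is too long: " + utf8.length + " bytes");
        }
        buf.putShort((short) utf8.length);
        buf.put(utf8);
    }

    // Binary data is written as a 4-byte length followed by the raw bytes.
    static void writeBytes(ByteBuffer buf, byte[] value) {
        buf.putInt(value.length);
        buf.put(value);
    }
}
```

Writing `true`, the string "abc", and a 2-byte array in sequence advances the buffer by 1, then 2 + 3, then 4 + 2 bytes.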
Nullable Fields
---------------
Booleans and ints can never be null. However, fields that are strings, bytes,
or arrays may optionally be "nullable." When a field is "nullable," that
simply means that we are prepared to serialize and deserialize null entries for
that field.
If you want to declare a field as nullable, you set "nullableVersions" for that
field. Nullability is implemented as a version range in order to accommodate a
very common pattern in Kafka where a field that was originally not nullable
becomes nullable in a later version.
If a field is declared as non-nullable, and it is present in the message
version you are using, you should set it to a non-null value before serializing
the message. Otherwise, you will get a runtime error.
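The interaction between nullability and versions can be sketched as follows. `NullableStringSketch` is a hypothetical class, and `nullableSince` models a "nullableVersions" spec of the form "N+". The sketch assumes that a null string is marked on the wire by a length of -1, and that the runtime error for a non-nullable field is a `NullPointerException`; neither detail is stated in this document.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical writer for a string field that becomes nullable in a later version.
final class NullableStringSketch {
    // nullableSince is the first version in which null is allowed ("N+").
    static void write(ByteBuffer buf, String value, short version, short nullableSince) {
        if (value == null) {
            if (version < nullableSince) {
                // Assumption: the runtime error surfaces as a NullPointerException.
                throw new NullPointerException("Field is not nullable in version " + version);
            }
            // Assumption: a length of -1 marks a null string.
            buf.putShort((short) -1);
            return;
        }
        byte[] utf8 = value.getBytes(StandardCharsets.UTF_8);
        if (utf8.length > Short.MAX_VALUE) {
            throw new IllegalArgumentException("String is too long: " + utf8.length + " bytes");
        }
        buf.putShort((short) utf8.length);
        buf.put(utf8);
    }
}
```

Writing null at version 1 with `nullableSince` set to 1 succeeds, while the same call at version 0 throws before anything is written.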
Serializing Messages
--------------------
The Message#write method writes out a message to a buffer. The fields that are
written out will depend on the version number that you supply to write(). When
you write out a message using an older version, fields that are too new to be
present in that version's schema will be omitted.
When working with older message versions, please verify that the older message
schema includes all the data that needs to be sent. For example, it is probably
OK to skip sending a timeout field. However, a field which radically alters the
meaning of the request, such as a "validateOnly" boolean, should not be ignored.
It's often useful to know how much space a message will take up before writing
it out to a buffer. You can find this out by calling the Message#size method.
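The size-then-write pattern might look like the sketch below. `ExampleMessageSketch` is a hypothetical hand-written class that only mimics generated code, and its two fields are invented for the example. The real generated classes may use different method signatures.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical message with two fields; it mimics, but is not, generated Kafka code.
//   name:      string, versions "0+"
//   timeoutMs: int32,  versions "1+"
final class ExampleMessageSketch {
    String name = "";
    int timeoutMs = 0;

    // Number of bytes that write() produces for the given version.
    int size(short version) {
        int size = 2 + name.getBytes(StandardCharsets.UTF_8).length;
        if (version >= 1) {
            size += 4;
        }
        return size;
    }

    // Writes only the fields that exist in the given version, in declaration order.
    void write(ByteBuffer buf, short version) {
        byte[] utf8 = name.getBytes(StandardCharsets.UTF_8);
        if (utf8.length > Short.MAX_VALUE) {
            throw new IllegalArgumentException("name is too long: " + utf8.length + " bytes");
        }
        buf.putShort((short) utf8.length);
        buf.put(utf8);
        if (version >= 1) {
            buf.putInt(timeoutMs);
        }
    }

    // Allocates a buffer of exactly the right size, then writes the message into it.
    static ByteBuffer serialize(ExampleMessageSketch message, short version) {
        ByteBuffer buf = ByteBuffer.allocate(message.size(version));
        message.write(buf, version);
        buf.flip();
        return buf;
    }
}
```

Asking for the size first lets the caller allocate a buffer with no slack. The same object serializes to fewer bytes at version 0, where `timeoutMs` is omitted.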
You can also convert a message to a Struct by calling Message#toStruct. This
allows you to use the functions that serialize Structs to buffers.
Deserializing Messages
----------------------
Message objects may be deserialized using the Message#read method. This method
overwrites all the data currently in the message object with new data.
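A minimal sketch of this overwrite behavior follows. `ReadableMessageSketch` is a hypothetical class with invented fields, and resetting an absent field to 0 is this sketch's choice of default value.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical message with two fields; it mimics, but is not, generated Kafka code.
//   name:      string, versions "0+"
//   timeoutMs: int32,  versions "1+"
final class ReadableMessageSketch {
    String name = "";
    int timeoutMs = 0;

    // Replaces all state in this object with data read from the buffer.
    void read(ByteBuffer buf, short version) {
        short length = buf.getShort();
        if (length < 0) {
            throw new IllegalStateException("name is not nullable, but length was " + length);
        }
        byte[] utf8 = new byte[length];
        buf.get(utf8);
        this.name = new String(utf8, StandardCharsets.UTF_8);
        if (version >= 1) {
            this.timeoutMs = buf.getInt();
        } else {
            // The field is not on the wire in version 0, so stale data is cleared.
            this.timeoutMs = 0;
        }
    }
}
```

Reading a version 0 payload into an object that previously held other values leaves no trace of the old contents.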
You can also deserialize a message from a Struct by calling Message#fromStruct.
The Struct will not be modified.
Any fields in the message object that are not present in the version that you
are deserializing will be reset to default values. Unless a custom default has