Ignore shardId updates from replica nodes (#13877)

Close https://github.com/redis/redis/issues/13868

This bug was introduced by https://github.com/redis/redis/pull/13468

To maintain compatibility with older versions that do not support
shardid, when a replica passes a shardid, we also update the master’s
shardid accordingly.

However, when both the master and replica support shardid, an issue
arises: in one moment, the master may pass a shardid, causing us to
update both the master and all its replicas to match the master’s
shardid. But if the replica later passes a different shardid, we would
then update the master’s shardid again, leading to continuous changes in
shardid.

Regardless of the situation, we always ensure that the replica’s shardid
remains consistent with the master’s shardid.
This commit is contained in:
Jason 2025-03-30 03:15:04 -04:00 committed by YaacovHazan
parent a26774cee1
commit 779a20058b
1 changed files with 11 additions and 17 deletions

View File

@ -964,30 +964,24 @@ clusterNode *clusterNodeGetSlave(clusterNode *node, int slave_idx) {
static void updateShardId(clusterNode *node, const char *shard_id) { static void updateShardId(clusterNode *node, const char *shard_id) {
if (shard_id && memcmp(node->shard_id, shard_id, CLUSTER_NAMELEN) != 0) { if (shard_id && memcmp(node->shard_id, shard_id, CLUSTER_NAMELEN) != 0) {
assignShardIdToNode(node, shard_id, CLUSTER_TODO_SAVE_CONFIG); /* We always make our best effort to keep the shard-id consistent
* between the master and its replicas:
/* If the replica or master does not support shard-id (old version),
* we still need to make our best effort to keep their shard-id consistent.
* *
* 1. Master supports but the replica does not. * 1. When updating the master's shard-id, we simultaneously update the
* We might first update the replica's shard-id to the master's randomly * shard-id of all its replicas to ensure consistency.
* generated shard-id. Then, when the master's shard-id arrives, we must * 2. When updating replica's shard-id, if it differs from its master's shard-id,
* also update all its replicas. * we discard this replica's shard-id and continue using master's shard-id.
* 2. If the master does not support but the replica does. * This applies even if the master does not support shard-id, in which
* We also need to synchronize the master's shard-id with the replica. * case we rely on the master's randomly generated shard-id. */
* 3. If neither of master and replica supports it.
* The master will have a randomly generated shard-id and will update
* the replica to match the master's shard-id. */
if (node->slaveof == NULL) { if (node->slaveof == NULL) {
assignShardIdToNode(node, shard_id, CLUSTER_TODO_SAVE_CONFIG);
for (int i = 0; i < clusterNodeNumSlaves(node); i++) { for (int i = 0; i < clusterNodeNumSlaves(node); i++) {
clusterNode *slavenode = clusterNodeGetSlave(node, i); clusterNode *slavenode = clusterNodeGetSlave(node, i);
if (memcmp(slavenode->shard_id, shard_id, CLUSTER_NAMELEN) != 0) if (memcmp(slavenode->shard_id, shard_id, CLUSTER_NAMELEN) != 0)
assignShardIdToNode(slavenode, shard_id, CLUSTER_TODO_SAVE_CONFIG|CLUSTER_TODO_FSYNC_CONFIG); assignShardIdToNode(slavenode, shard_id, CLUSTER_TODO_SAVE_CONFIG|CLUSTER_TODO_FSYNC_CONFIG);
} }
} else { } else if (memcmp(node->slaveof->shard_id, shard_id, CLUSTER_NAMELEN) == 0) {
clusterNode *masternode = node->slaveof; assignShardIdToNode(node, shard_id, CLUSTER_TODO_SAVE_CONFIG);
if (memcmp(masternode->shard_id, shard_id, CLUSTER_NAMELEN) != 0)
assignShardIdToNode(masternode, shard_id, CLUSTER_TODO_SAVE_CONFIG|CLUSTER_TODO_FSYNC_CONFIG);
} }
} }
} }