Commit Graph

59394 Commits

Author SHA1 Message Date
Michael Klishin 1b2946701a
Merge pull request #14138 from rabbitmq/mergify/bp/v4.1.x/pr-14127
QQ/Streams: Ensure open file handles are closed when a queue is deleted. (backport #14127)
2025-06-26 16:23:17 +04:00
Karl Nilsson 8b40a8e09e QQ/Streams: Ensure open file handles are closed when a queue is deleted.
If a stream or quorum queue has opened a file to read a consumer message
and the queue is deleted the file handle reference is lost and kept
open until the end of the channel lifetime.

(cherry picked from commit c688169f08)
2025-06-26 09:44:08 +00:00
Michael Klishin 5a41a13f7b
Merge pull request #14135 from rabbitmq/mergify/bp/v4.1.x/pr-14134
Follow up to #14132 (backport #14134)
2025-06-25 23:32:35 +04:00
Luke Bakken 5ebf279e97 Follow up to #14132
#14132 introduced a small bug in the JSON output that was caught by CI.

(cherry picked from commit 33cb21ee92)
2025-06-25 19:31:20 +00:00
Michael Klishin 2eb2100398
Merge pull request #14133 from rabbitmq/mergify/bp/v4.1.x/pr-14132
Follow-up to 14101 (backport #14132)
2025-06-25 19:21:32 +04:00
Michael Klishin 3c599aa8a2
Merge pull request #14128 from rabbitmq/mergify/bp/v4.1.x/pr-14123
By @tomyouyou: Avoid a scary log exception when a closing connection runs into an exception during a command writer flush operation (backport #14123)
2025-06-25 19:21:16 +04:00
Michael Klishin 30f12e2664
Merge pull request #14131 from rabbitmq/mergify/bp/v4.1.x/pr-14108
CQ: Retry opening file when flushing buffers to avoid "DELETE PENDING" issues on Windows (backport #14108)
2025-06-25 19:15:37 +04:00
Luke Bakken db6e795fe5 Follow-up to 14101
Improvement in the code that @the-mikedavis noticed just before #14118 was merged.

(cherry picked from commit 00528cb1e8)
2025-06-25 15:14:37 +00:00
Michael Klishin a4e1bc7a3a
Merge pull request #14129 from rabbitmq/mergify/bp/v4.1.x/pr-14125
Re-submit #14087 by @SimonUnge: introduce an opinionated, opt-in way to prevent a node from booting if it's been reset in the past (backport #14125)
2025-06-25 18:08:55 +04:00
Loïc Hoguin 5d2c09c96b CQ: Retry opening write file when flushing buffers
On Windows the file may be in "DELETE PENDING" state following
its deletion (when the last message was acked). A subsequent
message leads us to writing to that file again but we can't
and get an {error,eacces}. In that case we wait 10ms and retry
up to 3 times.
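
The retry described above can be sketched as follows. This is an illustration in Python, not the Erlang code from the commit: `open_with_retry` and its parameters are hypothetical names, `PermissionError` stands in for `{error,eacces}`, and reading "retry up to 3 times" as one initial attempt plus three retries is an assumption.

```python
import time


def open_with_retry(path, mode="ab", retries=3, delay_s=0.010, opener=open):
    """Open `path`, retrying when the OS reports an access error.

    Makes one initial attempt, then up to `retries` further attempts,
    sleeping `delay_s` seconds before each retry. The last error is
    re-raised if every attempt fails.
    """
    attempt = 0
    while True:
        try:
            return opener(path, mode)
        except PermissionError:
            # Hypothetical stand-in for {error,eacces} on a file that is
            # still in the "DELETE PENDING" state.
            if attempt >= retries:
                raise
            attempt += 1
            time.sleep(delay_s)
```

The `opener` argument exists only so the sketch can be exercised without a real Windows file system.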

(cherry picked from commit ff8ecf1cf7)
2025-06-25 13:50:28 +00:00
Michael Klishin c3e04724ef Wording
(cherry picked from commit 6c27536777)
2025-06-25 13:45:29 +00:00
Michael Klishin efcf998416 Update ct.test.spec
(cherry picked from commit 7876b2df58)
2025-06-25 13:45:28 +00:00
Michael Klishin 921ec24c81 Don't list a test suite twice in parallel CT suite groups #14087 #14125
(cherry picked from commit 74c4ec83df)
2025-06-25 13:45:28 +00:00
Michael Klishin c727c88416 More renaming #14087, add new test suite to a parallel CT group
(cherry picked from commit 5f1ab1409ff33f51fde535c5ffc22b43b2347a1c)
(cherry picked from commit 7810b4e018)
2025-06-25 13:45:28 +00:00
Simon Unge a8541cfe2f Rename
(cherry picked from commit 77cec4930e)
(cherry picked from commit 8ab2bda4eb)
2025-06-25 13:45:27 +00:00
Simon Unge 171df35d9e Add opt in initial check run
(cherry picked from commit 2d2c70cc7c)
(cherry picked from commit 1e04b72f6d)
2025-06-25 13:45:27 +00:00
Michael Klishin 3d80af6a79 Make dialyzer happy
(cherry picked from commit b4a11e61ab)
2025-06-25 13:39:17 +00:00
Michael Klishin f3fa4cc6ab Simplify #13121 by @tomyouyou, log it at debug level
(cherry picked from commit 9bd0731a5a)
2025-06-25 13:39:16 +00:00
tomyouyou 5a6da14339 When the client disconnects, the 'channel' process may generate a large number of exception logs.
When the client disconnects, flushing the writer during termination may produce a large number of exceptions because the writer is already closed.
The exceptions are as follows:

2025-06-24 17:56:06.661 [error] <0.1381.0> ** Generic server <0.1381.0> terminating, ** Last message in was {'$gen_cast',terminate}, ** When Server state == {ch, {conf,running,rabbit_framing_amqp_0_9_1,1, <0.1371.0>,<0.1379.0>,<0.1371.0>, <<"10.225.80.5:50760 -> 10.225.80.6:5673">>, {user,<<"rabbit_inside_user">>,[], [{rabbit_auth_backend_internal, #Fun<rabbit_auth_backend_internal.3.16580688>}]}, <<"/">>, <<"lzz.localdomain_rc.py_reply_89a60f0ef2114da2b3f150ca359ecf46">>, <0.1373.0>, [{<<"authentication_failure_close">>,bool,true}, {<<"connection.blocked">>,bool,true}, {<<"consumer_cancel_notify">>,bool,true}, {<<"need_notify_server_info_with_heartbeat">>,bool, true}], none,5,1800000,#{},infinity,1000000000}, {lstate,<0.1380.0>,false}, none,3, {1, [{pending_ack,2,<<"1">>,-576460618632, {resource,<<"/">>,queue, <<"lzz.localdomain_rc.py_reply_89a60f0ef2114da2b3f150ca359ecf46">>}, 1}], []}, undefined, #{<<"1">> =>, {{amqqueue, {resource,<<"/">>,queue, <<"lzz.localdomain_rc.py_reply_89a60f0ef2114da2b3f150ca359ecf46">>}, false,false,none, [{<<"x-expires">>,signedint,1800000}, {<<"x-queue-type">>,longstr,<<"classic">>}], <0.1385.0>,[],[],[],undefined,undefined,[],[], live,0,[],<<"/">>, #{user => <<"rabbit_inside_user">>, system_creation => 1750758840399767062, recover_on_declare => false, creator =>, {1750758936,"10.225.80.5",50760,"rc.py"}}, rabbit_classic_queue,#{}}, {false,5,false, [{zclient,tuple, {1750758936,"10.225.80.5",50760,"rc.py"}}]}}}, #{{resource,<<"/">>,queue, <<"lzz.localdomain_rc.py_reply_89a60f0ef2114da2b3f150ca359ecf46">>} =>, {1,{<<"1">>,nil,nil}}}, {state,none,30000,undefined}, false,1, {rabbit_confirms,undefined,#{}}, [],[],none,flow,[], {rabbit_queue_type, #{{resource,<<"/">>,queue, <<"lzz.localdomain_rc.py_reply_89a60f0ef2114da2b3f150ca359ecf46">>} =>, {ctx,rabbit_classic_queue, {rabbit_classic_queue,<0.1385.0>,#{}, #{<0.1385.0> => ok}, false}}}}, #Ref<0.2472179985.4173070337.136448>,false, {erlang,#Ref<0.2472179985.4173070337.136063>}, 
"rc.py",true,0,false,undefined,undefined,undefined, false}, ** Reason for termination == , ** {{shutdown,{writer,send_failed,closed}}, {gen_server,call,[<0.1379.0>,flush,infinity]}},
2025-06-24 17:56:06.665 [error] <0.1381.0>   crasher:, initial call: rabbit_channel:init/1, pid: <0.1381.0>, registered_name: [], exception exit: {{shutdown,{writer,send_failed,closed}}, {gen_server,call,[<0.1379.0>,flush,infinity]}}, in function  gen_server2:terminate/3 (gen_server2.erl, line 1172), ancestors: [<0.1378.0>,<0.1376.0>,<0.1369.0>,<0.1368.0>,<0.1169.0>, <0.1168.0>,<0.1167.0>,<0.1165.0>,<0.1164.0>,rabbit_sup, <0.249.0>], message_queue_len: 1, messages: [{'EXIT',<0.1378.0>,shutdown}], links: [<0.1378.0>], dictionary: [{msg_io_dt_cfg,{1750758936,2}}, {zext_options_dt_cfg,{1750758966,[]}}, {zlog_consumer_dt_cfg,{1750758936,false}}, {channel_operation_timeout,15000}, {rbt_trace_enable,true}, {process_name, {rabbit_channel, {<<"10.225.80.5:50760 -> 10.225.80.6:5673">>,1}}}, {counter_publish_size_dt_cfg,{1750758936,undefined}}, {peer_info, {"10.225.80.5",50760, "10.225.80.5:50760 -> 10.225.80.6:5673 - rc.py:3382128:dfe6ba8d-a42f-4ece-93df-11bff0410814", "rc.py",0}}, {peer_host_port_compname,{"10.225.80.5",50760,"rc.py"}}, {permission_cache_can_expire,false}, {debug_openv_dt_cfg,{1750758936,[]}}, {z_qref_type_dic, [{{resource,<<"/">>,queue, <<"lzz.localdomain_rc.py_reply_89a60f0ef2114da2b3f150ca359ecf46">>}, rabbit_classic_queue}]}, {zconsumer_num,1}, {virtual_host,<<"/">>}, {msg_size_for_gc,458}, {rand_seed, {#{max => 288230376151711743,type => exsplus, next => #Fun<rand.5.65977474>, jump => #Fun<rand.3.65977474>}, [20053568771696737|52030598835932017]}}, {top_queue_msg_dt_cfg, {1750758936, {0,0,0,undefined,false,false,undefined,undefined}}}], trap_exit: true, status: running, heap_size: 4185, stack_size: 28, reductions: 50613, neighbours:,

(cherry picked from commit 9e14040456)
2025-06-25 13:39:16 +00:00
Michael Klishin f5bf2405d5
Merge pull request #14126 from rabbitmq/mergify/bp/v4.1.x/pr-14118
Fix JSON output for `rabbitmqctl environment` (backport #14118)
2025-06-25 17:10:38 +04:00
Luke Bakken 598d854a3f Fix JSON output for `rabbitmqctl environment`
Fixes #14101

(cherry picked from commit 75cd74a2f2)
2025-06-25 13:06:00 +00:00
Michael Klishin 5ca894138d
Merge pull request #14122 from rabbitmq/mergify/bp/v4.1.x/pr-14115
Add log in test (backport #14115)
2025-06-25 14:29:17 +04:00
Arnaud Cogoluègnes f0776c8b97 Add log statements stream network partitions
The test creates network partitions and checks how the stream SAC
coordinator deals with them. It can be flaky on CI, the log statements
should help diagnose the flakiness.

(cherry picked from commit 066145763f)
2025-06-25 08:12:04 +00:00
Michael Klishin af9b0d00ba
4.1.2 release notes update
(cherry picked from commit 754352375c)
2025-06-24 19:39:07 +04:00
Michael Klishin 231467ba85
Merge pull request #14117 from rabbitmq/mergify/bp/v4.1.x/pr-14116
Ra 2.16.11 (backport #14116)
2025-06-24 18:57:26 +04:00
Michael Klishin 8161e6e126 Ra 2.16.11
to include rabbitmq/ra#546.

(cherry picked from commit 4691a16af6)
2025-06-24 13:36:04 +00:00
Michael Klishin 96cb9a7854
Correct a 4.1.2 release notes formatting issue
(cherry picked from commit e019a4e41d)
2025-06-24 01:16:25 +04:00
Michael Klishin 149ee4ceb6
Initial 4.1.2 release notes
(cherry picked from commit e26fde9086)
2025-06-24 01:15:23 +04:00
Michael Klishin 382bc57f5e
Merge pull request #14112 from rabbitmq/mergify/bp/v4.1.x/pr-14109
Use module machine version for stream coordinator status (backport #14109)
2025-06-23 22:30:47 +04:00
Michael Klishin 26ffdc3e2e
Merge pull request #14113 from rabbitmq/mergify/bp/v4.1.x/pr-14111
Bump ActiveMQ to v6.1.7 (backport #14111)
2025-06-23 21:29:33 +04:00
David Ansari 822a38930c Bump ActiveMQ to v6.1.7
We've experienced lots of failures in CI:
```
GEN    test/system_SUITE_data/apache-activemq-5.18.3-bin.tar.gz
make: *** [Makefile:65: test/system_SUITE_data/apache-activemq-5.18.3-bin.tar.gz] Error 28
make: Leaving directory '/home/runner/work/rabbitmq-server/rabbitmq-server/deps/amqp10_client'
Error: Process completed with exit code 2.
```

Bumping to the latest ActiveMQ Classic version may or may not help with
these failures.

Either way, we want to test against the latest ActiveMQ version. Version
5.18.3 reached end-of-life and is no longer maintained.

(cherry picked from commit 033a87523d)
2025-06-23 16:12:43 +00:00
Arnaud Cogoluègnes 59bee252f8 Use module machine version for stream coordinator status
The wrong module was used.

(cherry picked from commit 5042d8eefe)
2025-06-23 15:56:00 +00:00
Arnaud Cogoluègnes 7c8ccdecd5
Support cross-version overview in stream SAC coordinator
When the state comes from V4 and the current module is V5.

References #14106

(cherry picked from commit 4e7e0f0f1d)
2025-06-23 17:30:43 +02:00
Arnaud Cogoluègnes 61322e52e9
Add log message to help diagnose flaky test
(cherry picked from commit 0ca128b80f)
2025-06-23 17:30:42 +02:00
Michael Klishin b894e11798
Merge pull request #14107 from rabbitmq/mergify/bp/v4.1.x/pr-14106
Miscellaneous minor improvements in stream SAC coordinator (backport #14106)
2025-06-23 16:49:45 +04:00
Arnaud Cogoluègnes ae647a650e Miscellaneous minor improvements in stream SAC coordinator
This commit handles edge cases in the stream SAC coordinator to make
sure it does not crash during execution. Most of these edge cases
consist of an inconsistent state, so they are very unlikely to happen.

This commit also makes sure there is no duplicate in the consumer list
of a group. Consumers are also now identified only by their connection
PID and their subscription ID, as now the timestamp they contain in
their state does not allow a field-by-field comparison.
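
A minimal sketch of that identity rule, assuming a consumer record that carries a connection PID, a subscription ID, and a timestamp. It is written in Python for illustration and is not the coordinator's Erlang code; `Consumer`, `same_consumer`, and `add_consumer` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Consumer:
    connection_pid: str
    subscription_id: int
    timestamp: int  # varies between updates, so it is not part of the identity


def same_consumer(a, b):
    """Compare two consumers by connection PID and subscription ID only."""
    return (a.connection_pid, a.subscription_id) == (
        b.connection_pid,
        b.subscription_id,
    )


def add_consumer(group, consumer):
    """Return the group with `consumer` appended, unless it is already present."""
    if any(same_consumer(existing, consumer) for existing in group):
        return group
    return group + [consumer]
```

Comparing whole records would treat the same consumer with a newer timestamp as a different entry, which is how a duplicate could slip into the list.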

(cherry picked from commit b4f7d46842)
2025-06-23 11:50:27 +00:00
Michael Klishin 73d49cbe3d
Merge pull request #14099 from rabbitmq/mergify/bp/v4.1.x/pr-14097
Federation: update makefile to avoid dialyzer compilation errors (backport #14097)
2025-06-23 13:04:47 +04:00
Arnaud Cogoluègnes 0e7e0caf79
Merge pull request #14103 from rabbitmq/dependabot/maven/deps/rabbitmq_auth_backend_http/examples/rabbitmq_auth_backend_spring_boot/v4.1.x/prod-deps-75c31b2fef
[skip ci] Bump the prod-deps group across 2 directories with 1 update
2025-06-23 07:05:56 +00:00
dependabot[bot] 42f23ec654
[skip ci] Bump the prod-deps group across 2 directories with 1 update
Bumps the prod-deps group with 1 update in the /deps/rabbitmq_auth_backend_http/examples/rabbitmq_auth_backend_spring_boot directory: [org.springframework.boot:spring-boot-starter-parent](https://github.com/spring-projects/spring-boot).
Bumps the prod-deps group with 1 update in the /deps/rabbitmq_auth_backend_http/examples/rabbitmq_auth_backend_spring_boot_kotlin directory: [org.springframework.boot:spring-boot-starter-parent](https://github.com/spring-projects/spring-boot).


Updates `org.springframework.boot:spring-boot-starter-parent` from 3.5.0 to 3.5.3
- [Release notes](https://github.com/spring-projects/spring-boot/releases)
- [Commits](https://github.com/spring-projects/spring-boot/compare/v3.5.0...v3.5.3)

Updates `org.springframework.boot:spring-boot-starter-parent` from 3.5.0 to 3.5.3
- [Release notes](https://github.com/spring-projects/spring-boot/releases)
- [Commits](https://github.com/spring-projects/spring-boot/compare/v3.5.0...v3.5.3)

---
updated-dependencies:
- dependency-name: org.springframework.boot:spring-boot-starter-parent
  dependency-version: 3.5.3
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: prod-deps
- dependency-name: org.springframework.boot:spring-boot-starter-parent
  dependency-version: 3.5.3
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: prod-deps
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-06-21 18:29:38 +00:00
Michael Klishin defcf18976
Resolve a conflict #14097 14099
2025-06-20 19:42:35 +04:00
Diana Parra Corbacho 0c259133a8 Federation: update makefile to avoid dialyzer compilation errors
They only happen with a combination of OTP 27.3 and Elixir 1.17.

(cherry picked from commit 0801e68c14)

# Conflicts:
#	deps/rabbitmq_federation/Makefile
2025-06-20 08:53:43 +00:00
Arnaud Cogoluègnes d9c0c6eee1
Mention socket is from stream reader in log message
(cherry picked from commit 72df6270b2)
2025-06-19 15:52:39 +02:00
Michael Klishin d6cc15a923
Merge commit from fork
Management UI: escape virtual host names in virtual host restart forms

(cherry picked from commit 60be7d8046)
2025-06-17 21:50:03 +04:00
Jean-Sébastien Pédron e3e610cde7
Merge pull request #14093 from rabbitmq/mergify/bp/v4.1.x/pr-14081
Delete symlinks to `erlang.mk` and `rabbitmq-components.mk` (backport #14081)
2025-06-17 16:32:35 +02:00
Arnaud Cogoluègnes 789e156313
Merge pull request #14091 from rabbitmq/mergify/bp/v4.1.x/pr-13672
Prevent blocked groups in stream SAC with fine-grained status (backport #13672)
2025-06-17 14:28:34 +00:00
Jean-Sébastien Pédron 98477f95ea Delete symlinks to `erlang.mk` and `rabbitmq-components.mk`
[Why]
They make it more difficult to compile RabbitMQ on Windows. They were
probably useful at the time of the switch to a monorepository but I
don't see their need anymore.

(cherry picked from commit 63f7da23c7)
2025-06-17 14:04:30 +00:00
Michael Klishin 5c8703100f
Merge pull request #14089 from rabbitmq/mergify/bp/v4.1.x/pr-14088
Avoid list allocation (backport #14088)
2025-06-17 15:45:13 +04:00
Arnaud Cogoluègnes d2b9a7c87c Add activate_stream_consumer command
New CLI command to trigger a rebalancing in a SAC group and activate a
consumer. This is a last resort solution if all consumers in a group
accidentally end up in {connected, waiting} state.

The command re-uses an existing function, which only picks the consumer
that should be active. This means it does not try to "fix" the state
(e.g. removing a disconnected consumer because its node is definitely
gone from the cluster).

Fixes #14055

(cherry picked from commit 41acc117bd)
2025-06-17 11:25:54 +00:00
Arnaud Cogoluègnes cf4d66a9e1 Close stream connection in case of unexpected error from SAC coordinator
Calls to the stream SAC coordinator can fail for various reasons
(e.g. a timeout because of a network partition). The stream reader does not
take into account what the SAC coordinator returns and moves on even
in case of errors. This can lead to inconsistent state for SAC groups.

This commit changes this behavior by handling unexpected errors from the
SAC coordinator and closing the connection. The client is expected to
reconnect. This is safer than risking inconsistent state.
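
The behavior change can be sketched roughly as below. This is a Python illustration under stated assumptions, not the stream reader's Erlang code: `Connection`, `handle_coordinator_reply`, and the `("ok", payload)` reply shape are all hypothetical.

```python
class Connection:
    """Hypothetical stand-in for a stream connection."""

    def __init__(self):
        self.open = True
        self.close_reason = None

    def close(self, reason):
        self.open = False
        self.close_reason = reason


def handle_coordinator_reply(connection, reply):
    """Return the payload of an ("ok", payload) reply.

    Any other reply (for example a timeout during a network partition)
    closes the connection instead of being ignored, so the client has to
    reconnect and the SAC group state is rebuilt from scratch.
    """
    if isinstance(reply, tuple) and len(reply) == 2 and reply[0] == "ok":
        return reply[1]
    connection.close(reason=f"unexpected SAC coordinator reply: {reply!r}")
    return None
```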

Fixes #14040

(cherry picked from commit 58f4e83c22)
2025-06-17 11:25:54 +00:00
Arnaud Cogoluègnes 7cd1f06533 Remove only stream subscriptions affected by down stream member
The clean-up of a stream connection state when a stream member goes down can
remove subscriptions not affected by the member. The subscription state is
removed from the connection, but the subscription is not removed from
the SAC state (if the subscription is a SAC), because the subscription member
PID does not match the down member PID.

When the actual member of the subscription goes down, the subscription is no
longer part of the state, so the clean-up does not find the subscription
and does not remove it from the SAC state. This leaves a ghost consumer in
the corresponding SAC group.

This commit makes sure only the affected subscriptions are removed from
the state when a stream member goes down.
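
The intended clean-up can be sketched as a filter on the member PID. The Python below is only an illustration, assuming subscriptions are kept in a map keyed by subscription ID with a `member_pid` field; that layout and the function name are hypothetical, not the connection state used in RabbitMQ.

```python
def remove_subscriptions_for_member(subscriptions, down_member_pid):
    """Split subscriptions into (kept, removed) by the member PID they read from.

    Only subscriptions attached to the member that went down are removed.
    The removed ones are returned so the caller can also unregister any
    single active consumer among them from the SAC state.
    """
    kept, removed = {}, {}
    for sub_id, sub in subscriptions.items():
        target = removed if sub["member_pid"] == down_member_pid else kept
        target[sub_id] = sub
    return kept, removed
```

Returning the removed subscriptions keeps the connection state and the SAC state in step, which is the mismatch that produced the ghost consumer.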

Fixes #13961

(cherry picked from commit a9cf049030)
2025-06-17 11:25:53 +00:00