)]}'
{"/COMMIT_MSG":[{"author":{"_account_id":28619,"name":"Dmitriy Rabotyagov","email":"noonedeadpunk@gmail.com","username":"noonedeadpunk"},"change_message_id":"7f16545941f70e97f362a0b43bebefb26bbb7d23","unresolved":true,"context_lines":[{"line_number":7,"context_line":"Auto-delete the failed quorum rabbit queues"},{"line_number":8,"context_line":""},{"line_number":9,"context_line":"When rabbit is failing for a specific quorum queue, the only thing to"},{"line_number":10,"context_line":"do is to delete the queue (as per rabbit doc, see [1])."},{"line_number":11,"context_line":""},{"line_number":12,"context_line":"So, to avoid the RPC service to be broken until an operator eventually"},{"line_number":13,"context_line":"do a manual fix on it, catch any INTERNAL ERROR (code 541) and trigger"}],"source_content_type":"text/x-gerrit-commit-message","patch_set":4,"id":"5c46039d_e623a330","line":10,"range":{"start_line":10,"start_character":26,"end_line":10,"end_character":53},"updated":"2023-08-16 10:50:24.000000000","message":"I don\u0027t either see this recommendation there or I\u0027m not understanding where it \"hides\".\n\nAccording to doc, rabbit should self-heal once new leader is elected after quorum being lost.\n\nAnd based on the doc, until there is a new leader elected (and quorum is established) it should be pretty much pointless to re-create quorum queues.\n\nAlso, approach looks quite scary, especially without usage of QManager that you\u0027ve introduced later on, but with enabled quorum for transient queues, as according to [1] recreating a lot of queues with arbitrary names is risky in long terms for cluster stability.\n\nBut again, I\u0027m not very strong expert in RabbitMQ, so correct me if I\u0027m wrong about these.\n\n[1] https://www.rabbitmq.com/quorum-queues.html#atom-use","commit_id":"b4b6f55de222716b0029d45adc33a60f8b0e68e0"},{"author":{"_account_id":11583,"name":"Arnaud 
Morin","email":"arnaud.morin@gmail.com","username":"arnaudmorin"},"change_message_id":"05020a4afaf31bb0b5f7b1f001f2e45a070224ba","unresolved":false,"context_lines":[{"line_number":7,"context_line":"Auto-delete the failed quorum rabbit queues"},{"line_number":8,"context_line":""},{"line_number":9,"context_line":"When rabbit is failing for a specific quorum queue, the only thing to"},{"line_number":10,"context_line":"do is to delete the queue (as per rabbit doc, see [1])."},{"line_number":11,"context_line":""},{"line_number":12,"context_line":"So, to avoid the RPC service to be broken until an operator eventually"},{"line_number":13,"context_line":"do a manual fix on it, catch any INTERNAL ERROR (code 541) and trigger"}],"source_content_type":"text/x-gerrit-commit-message","patch_set":4,"id":"1d13a966_3c76806d","line":10,"range":{"start_line":10,"start_character":26,"end_line":10,"end_character":53},"in_reply_to":"5b28e895_9ecedc56","updated":"2023-09-06 13:53:50.000000000","message":"Done","commit_id":"b4b6f55de222716b0029d45adc33a60f8b0e68e0"},{"author":{"_account_id":11583,"name":"Arnaud Morin","email":"arnaud.morin@gmail.com","username":"arnaudmorin"},"change_message_id":"553142249990492a70615ba549475029db4f95ca","unresolved":true,"context_lines":[{"line_number":7,"context_line":"Auto-delete the failed quorum rabbit queues"},{"line_number":8,"context_line":""},{"line_number":9,"context_line":"When rabbit is failing for a specific quorum queue, the only thing to"},{"line_number":10,"context_line":"do is to delete the queue (as per rabbit doc, see [1])."},{"line_number":11,"context_line":""},{"line_number":12,"context_line":"So, to avoid the RPC service to be broken until an operator eventually"},{"line_number":13,"context_line":"do a manual fix on it, catch any INTERNAL ERROR (code 541) and 
trigger"}],"source_content_type":"text/x-gerrit-commit-message","patch_set":4,"id":"d789ec56_f2991a41","line":10,"range":{"start_line":10,"start_character":26,"end_line":10,"end_character":53},"in_reply_to":"5c46039d_e623a330","updated":"2023-08-16 12:49:22.000000000","message":"In the doc: \"If a quorum of nodes cannot be recovered (say if 2 out of 3 RabbitMQ nodes are permanently lost) the queue is permanently unavailable and will need to be force deleted and recreated.\"\n\nRabbit is self-healing most of the time, but when it can\u0027t, the only solution on our side was to delete the queue.\n\nAbout the \"scary\" situation, using Qmanager does not change the behavior.\nLet me try to explain:\nneutron, nova, etc, are creating Listener or Sender connections.\nThose connections are trying to create a random queue on rabbit side.\nThe randomness of the queue name will never change until the service is restarted.\nWithout the patch, oslo_messaging is trying over and over to declare the same queue, but will never achieve it because the queue is broken on rabbit side.\nWith the patch, if we receive an InternalError, we delete the queue before trying to declare it again.\n\nI must agree that this patch is maybe not the best.\nWe wrote it before moving our cluster to stream and QManager.\nWith QManager and streams, we are not seeing such errors anymore (the rabbit cluster is much more healthy).\n\nIf you think this patch should not exist, I can move it outside the chain and eventually talk about it separately.","commit_id":"b4b6f55de222716b0029d45adc33a60f8b0e68e0"},{"author":{"_account_id":28619,"name":"Dmitriy Rabotyagov","email":"noonedeadpunk@gmail.com","username":"noonedeadpunk"},"change_message_id":"11d862d78f0c11ffe7d2f3da47c5fc6bf01afbf0","unresolved":true,"context_lines":[{"line_number":7,"context_line":"Auto-delete the failed quorum rabbit queues"},{"line_number":8,"context_line":""},{"line_number":9,"context_line":"When rabbit is failing for a specific quorum 
queue, the only thing to"},{"line_number":10,"context_line":"do is to delete the queue (as per rabbit doc, see [1])."},{"line_number":11,"context_line":""},{"line_number":12,"context_line":"So, to avoid the RPC service to be broken until an operator eventually"},{"line_number":13,"context_line":"do a manual fix on it, catch any INTERNAL ERROR (code 541) and trigger"}],"source_content_type":"text/x-gerrit-commit-message","patch_set":4,"id":"5b28e895_9ecedc56","line":10,"range":{"start_line":10,"start_character":26,"end_line":10,"end_character":53},"in_reply_to":"d789ec56_f2991a41","updated":"2023-08-16 13:49:21.000000000","message":"Ugh, I read that sentence like 3 times and each one was missing `permanently`. Sorry for that.\n\nI was thinking at least move this patch somewhere upwards the chain and maybe add some more conditions to when attempt such cleanups.\n\nBut given your explanation (and my honest blindness) there does not seem to be much choice anyway.","commit_id":"b4b6f55de222716b0029d45adc33a60f8b0e68e0"}],"/PATCHSET_LEVEL":[{"author":{"_account_id":28522,"name":"Hervé Beraud","email":"herveberaud.pro@gmail.com","username":"hberaud"},"change_message_id":"7009612157d40517b84c510ae035b7f668b399d6","unresolved":false,"context_lines":[],"source_content_type":"","patch_set":4,"id":"10489fd1_f2e3cea5","updated":"2023-08-31 09:26:53.000000000","message":"I agree with Stephen a release note would be really appreciated here. 
It could help operators to identify this change.\n\nElse as with the parent patch, we are in feature freeze so I\u0027d suggest to wait a couple of week that bobcat will be branched.","commit_id":"b4b6f55de222716b0029d45adc33a60f8b0e68e0"},{"author":{"_account_id":15334,"name":"Stephen Finucane","display_name":"stephenfin","email":"stephenfin@redhat.com","username":"sfinucan"},"change_message_id":"b57f6499e57e881e2869ab3fba36224327f9682d","unresolved":false,"context_lines":[],"source_content_type":"","patch_set":4,"id":"66f2323c_5aaeb9b7","updated":"2023-08-29 09:18:28.000000000","message":"This could probably do with a release note btw. Not urgent but if you respin please add one","commit_id":"b4b6f55de222716b0029d45adc33a60f8b0e68e0"},{"author":{"_account_id":28619,"name":"Dmitriy Rabotyagov","email":"noonedeadpunk@gmail.com","username":"noonedeadpunk"},"change_message_id":"11d862d78f0c11ffe7d2f3da47c5fc6bf01afbf0","unresolved":false,"context_lines":[],"source_content_type":"","patch_set":4,"id":"937db300_b19b09eb","updated":"2023-08-16 13:49:21.000000000","message":"recheck - test_hotplug_nic failing with SSH time out.","commit_id":"b4b6f55de222716b0029d45adc33a60f8b0e68e0"},{"author":{"_account_id":11583,"name":"Arnaud Morin","email":"arnaud.morin@gmail.com","username":"arnaudmorin"},"change_message_id":"05020a4afaf31bb0b5f7b1f001f2e45a070224ba","unresolved":false,"context_lines":[],"source_content_type":"","patch_set":4,"id":"730a9367_4273b089","in_reply_to":"10489fd1_f2e3cea5","updated":"2023-09-06 13:53:50.000000000","message":"done","commit_id":"b4b6f55de222716b0029d45adc33a60f8b0e68e0"},{"author":{"_account_id":28522,"name":"Hervé Beraud","email":"herveberaud.pro@gmail.com","username":"hberaud"},"change_message_id":"3800ccddabd2ac03015eb26cf15278b0dbc5bf1a","unresolved":false,"context_lines":[],"source_content_type":"","patch_set":7,"id":"5fbe0a77_ee3f5d04","updated":"2023-11-10 09:37:17.000000000","message":"I think we should backport this patch on stable 
branches. Quorum queues were added during stable/zed, so we should backport at least as far back as zed.","commit_id":"33791d8662b0a44b976d8aa16b85ddd64e0b4123"}]}
