)]}'
{"/PATCHSET_LEVEL":[{"author":{"_account_id":37306,"name":"Piotr Milewski","display_name":"Piotr Milewski","email":"vurmil@gmail.com","username":"vurmil"},"change_message_id":"09f1da4b8609c52c0efd15792d9ad1a41be27516","unresolved":false,"context_lines":[],"source_content_type":"","patch_set":1,"id":"74d2e4b9_3f9791bf","updated":"2026-03-25 18:19:48.000000000","message":"what do you think about this solution?","commit_id":"4cd61fd805d91891f6539ade45cf7badfbf8da4e"},{"author":{"_account_id":22629,"name":"Michal Nasiadka","email":"mnasiadka@gmail.com","username":"mnasiadka"},"change_message_id":"d5697d7ea93919d8199f8eb11d650e460a86b896","unresolved":true,"context_lines":[],"source_content_type":"","patch_set":5,"id":"4a315856_e94304a1","updated":"2026-03-30 10:51:53.000000000","message":"Should we by default have a script that notifies alert manager?","commit_id":"994bc3d6cb655ebb3164ffa1fccbfea0d20e994b"},{"author":{"_account_id":17669,"name":"Doug Szumski","email":"doug@stackhpc.com","username":"DougSzumski"},"change_message_id":"bf850255fa1593fdd4f9fc726451f2302cb9471c","unresolved":false,"context_lines":[],"source_content_type":"","patch_set":5,"id":"f2b03041_4de465eb","updated":"2026-03-30 12:37:47.000000000","message":"Thanks Piotr, it looks useful.\n\nWhat do you think about exposing the events as metrics? Something like:\n\n1) Use a script like your example no.2 to write out logs of interest\n2) Slurp the logs via Prometheus textfile collector\n3) Create an alert using the new metrics (out of scope of KA)\n\nI was thinking that this could provide a more general solution which could be used for other services. 
It\u0027s also nice to have the time series data in addition to the alerting.","commit_id":"994bc3d6cb655ebb3164ffa1fccbfea0d20e994b"},{"author":{"_account_id":37306,"name":"Piotr Milewski","display_name":"Piotr Milewski","email":"vurmil@gmail.com","username":"vurmil"},"change_message_id":"a0ddfd22a131687c96be3f5cf2533639918d0dc3","unresolved":true,"context_lines":[],"source_content_type":"","patch_set":5,"id":"ccf982f3_a27b681f","in_reply_to":"4a315856_e94304a1","updated":"2026-03-30 11:11:16.000000000","message":"https://review.opendev.org/c/openstack/kolla-ansible/+/982142/4\n\nI added this in Patchset 4, but I figured you probably wouldn’t want to maintain it, so I included two examples in the documentation instead. I can revert it if you’d like.","commit_id":"994bc3d6cb655ebb3164ffa1fccbfea0d20e994b"},{"author":{"_account_id":37306,"name":"Piotr Milewski","display_name":"Piotr Milewski","email":"vurmil@gmail.com","username":"vurmil"},"change_message_id":"c0167a2e04dc8b6252f636a9057b00c20248deb9","unresolved":false,"context_lines":[],"source_content_type":"","patch_set":5,"id":"5f382921_3b6b59d2","in_reply_to":"59ff3104_b2292cb0","updated":"2026-03-30 13:44:24.000000000","message":"Hi Doug, thanks for the feedback!\n\nRegarding metrics: Actually, most Galera state information is already exposed as metrics via the standard prometheus-mysqld-exporter (e.g., mysql_global_status_wsrep_local_state, etc.), which Kolla already supports.\n\nI believe the main value of this patch lies in event-driven notifications rather than just metrics. While Prometheus metrics are great for monitoring trends and current state via scraping, wsrep_notify_cmd allows for:\n\n1. Immediate action: Execution of custom logic exactly when the state change occurs (zero-delay alerting).\n\n2. Integration with non-Prometheus systems: Like sending a direct webhook to a legacy orchestration system or triggering a local recovery script.\n\n3. 
Contextual logging: Capturing the exact transition events which are sometimes missed by the scraper if the interval is too long.\n\n4. Fault isolation: During a major cluster failure, the central monitoring stack (Prometheus/Alertmanager) or the network might be degraded. Since this script runs locally on the database node, it can trigger immediate out-of-band alerts (e.g., sending a direct mail via a local relay, a Slack notification, etc.).\n\n5. Unlimited custom logic: Metrics are just numbers, but this script allows for complex, conditional logic. One could use it to automatically update a secondary load balancer, or clean up stale lock files. The operator\u0027s imagination is the only limit here.\n\nImplementing the \u0027textfile collector\u0027 approach via this script is definitely possible with this patch (as shown in Example 2 in the docs), but I see it more as a \u0027user-choice\u0027 rather than a default requirement for this feature. This patch provides the plumbing for both: direct alerting OR feeding the textfile collector if the operator chooses to do so.","commit_id":"994bc3d6cb655ebb3164ffa1fccbfea0d20e994b"},{"author":{"_account_id":17669,"name":"Doug Szumski","email":"doug@stackhpc.com","username":"DougSzumski"},"change_message_id":"c568fd06b9d3b62085bb8f40b483ddcae4809d85","unresolved":false,"context_lines":[],"source_content_type":"","patch_set":5,"id":"007e2324_5dce8729","in_reply_to":"5f382921_3b6b59d2","updated":"2026-03-30 14:06:42.000000000","message":"Thanks Piotr, these are some very good points.","commit_id":"994bc3d6cb655ebb3164ffa1fccbfea0d20e994b"},{"author":{"_account_id":22629,"name":"Michal Nasiadka","email":"mnasiadka@gmail.com","username":"mnasiadka"},"change_message_id":"20fc0c9f340f13535409aa4106038a9f44a2caea","unresolved":false,"context_lines":[],"source_content_type":"","patch_set":5,"id":"524c6f78_7aced5a2","in_reply_to":"ccf982f3_a27b681f","updated":"2026-03-30 
16:25:16.000000000","message":"Acknowledged","commit_id":"994bc3d6cb655ebb3164ffa1fccbfea0d20e994b"},{"author":{"_account_id":17669,"name":"Doug Szumski","email":"doug@stackhpc.com","username":"DougSzumski"},"change_message_id":"142670087d00ff56e257adba3acf1a6154f3293b","unresolved":false,"context_lines":[],"source_content_type":"","patch_set":5,"id":"59ff3104_b2292cb0","in_reply_to":"f2b03041_4de465eb","updated":"2026-03-30 12:39:19.000000000","message":"See: https://github.com/prometheus/node_exporter?tab\u003dreadme-ov-file#textfile-collector","commit_id":"994bc3d6cb655ebb3164ffa1fccbfea0d20e994b"},{"author":{"_account_id":17669,"name":"Doug Szumski","email":"doug@stackhpc.com","username":"DougSzumski"},"change_message_id":"c568fd06b9d3b62085bb8f40b483ddcae4809d85","unresolved":false,"context_lines":[],"source_content_type":"","patch_set":6,"id":"a2461169_aeda446d","updated":"2026-03-30 14:06:42.000000000","message":"I think this is neat idea.","commit_id":"88d1bef67c74f690033746da7694357fecd845ea"},{"author":{"_account_id":17669,"name":"Doug Szumski","email":"doug@stackhpc.com","username":"DougSzumski"},"change_message_id":"27b02b9d952f6b7cf6941cddfb64c071c08d49f3","unresolved":false,"context_lines":[],"source_content_type":"","patch_set":6,"id":"37d4e056_10495ef7","in_reply_to":"a2461169_aeda446d","updated":"2026-03-30 14:08:02.000000000","message":"*a neat idea","commit_id":"88d1bef67c74f690033746da7694357fecd845ea"},{"author":{"_account_id":17669,"name":"Doug Szumski","email":"doug@stackhpc.com","username":"DougSzumski"},"change_message_id":"ee397b0813bc3960a76c19dcf7a4191f1e379449","unresolved":false,"context_lines":[],"source_content_type":"","patch_set":7,"id":"e2086e4c_3f03252a","updated":"2026-03-30 15:04:49.000000000","message":"thanks Piotr!","commit_id":"04059bab4f7a7527fd84edd8a0c55972e5dced78"}],"ansible/roles/mariadb/tasks/config.yml":[{"author":{"_account_id":17669,"name":"Doug 
Szumski","email":"doug@stackhpc.com","username":"DougSzumski"},"change_message_id":"bf850255fa1593fdd4f9fc726451f2302cb9471c","unresolved":true,"context_lines":[{"line_number":47,"context_line":"  vars:"},{"line_number":48,"context_line":"    service_name: \"mariadb\""},{"line_number":49,"context_line":"    service: \"{{ mariadb_services[service_name] }}\""},{"line_number":50,"context_line":"  ansible.builtin.copy:"},{"line_number":51,"context_line":"    src: \"{{ node_custom_config }}/mariadb/wsrep-notify.sh\""},{"line_number":52,"context_line":"    dest: \"{{ node_config_directory }}/{{ service_name }}/wsrep-notify.sh\""},{"line_number":53,"context_line":"    mode: \"0770\""}],"source_content_type":"text/x-yaml","patch_set":4,"id":"9dcdbc1a_0eaf6e44","line":50,"updated":"2026-03-30 12:37:47.000000000","message":"Should we consider using `template` here for the script? I see the example script is a template","commit_id":"b88d5753e5561b655f9331cd26f6d4f266229803"},{"author":{"_account_id":37306,"name":"Piotr Milewski","display_name":"Piotr Milewski","email":"vurmil@gmail.com","username":"vurmil"},"change_message_id":"c0167a2e04dc8b6252f636a9057b00c20248deb9","unresolved":false,"context_lines":[{"line_number":47,"context_line":"  vars:"},{"line_number":48,"context_line":"    service_name: \"mariadb\""},{"line_number":49,"context_line":"    service: \"{{ mariadb_services[service_name] }}\""},{"line_number":50,"context_line":"  ansible.builtin.copy:"},{"line_number":51,"context_line":"    src: \"{{ node_custom_config }}/mariadb/wsrep-notify.sh\""},{"line_number":52,"context_line":"    dest: \"{{ node_config_directory }}/{{ service_name }}/wsrep-notify.sh\""},{"line_number":53,"context_line":"    mode: \"0770\""}],"source_content_type":"text/x-yaml","patch_set":4,"id":"b54276b4_0df83b8e","line":50,"in_reply_to":"9dcdbc1a_0eaf6e44","updated":"2026-03-30 13:44:24.000000000","message":"Good point, Doug! 
You\u0027re right","commit_id":"b88d5753e5561b655f9331cd26f6d4f266229803"}],"doc/source/reference/databases/mariadb-guide.rst":[{"author":{"_account_id":17669,"name":"Doug Szumski","email":"doug@stackhpc.com","username":"DougSzumski"},"change_message_id":"89231631b119efd126c4c32fda4a71c2cca5f149","unresolved":true,"context_lines":[{"line_number":98,"context_line":"   #!/bin/bash"},{"line_number":99,"context_line":""},{"line_number":100,"context_line":"   # Alertmanager API endpoint and credentials"},{"line_number":101,"context_line":"   # These variables are available within the Kolla environment"},{"line_number":102,"context_line":"   URL\u003d\"{{ internal_protocol }}://{{ kolla_internal_vip_address }}:{{ prometheus_alertmanager_port }}/api/v2/alerts\""},{"line_number":103,"context_line":"   AUTH\u003d\"{{ prometheus_alertmanager_user }}:{{ prometheus_alertmanager_password }}\""},{"line_number":104,"context_line":""}],"source_content_type":"text/x-rst","patch_set":6,"id":"2ddf531b_71fe5b87","line":101,"updated":"2026-03-30 14:07:24.000000000","message":"nit: The linter is grumbling about line length","commit_id":"88d1bef67c74f690033746da7694357fecd845ea"},{"author":{"_account_id":37306,"name":"Piotr Milewski","display_name":"Piotr Milewski","email":"vurmil@gmail.com","username":"vurmil"},"change_message_id":"64b03043ea86856a79cf330934d3b10c9d7b73aa","unresolved":false,"context_lines":[{"line_number":98,"context_line":"   #!/bin/bash"},{"line_number":99,"context_line":""},{"line_number":100,"context_line":"   # Alertmanager API endpoint and credentials"},{"line_number":101,"context_line":"   # These variables are available within the Kolla environment"},{"line_number":102,"context_line":"   URL\u003d\"{{ internal_protocol }}://{{ kolla_internal_vip_address }}:{{ prometheus_alertmanager_port }}/api/v2/alerts\""},{"line_number":103,"context_line":"   AUTH\u003d\"{{ prometheus_alertmanager_user }}:{{ prometheus_alertmanager_password 
}}\""},{"line_number":104,"context_line":""}],"source_content_type":"text/x-rst","patch_set":6,"id":"e0ce8278_a31bd8c5","line":101,"in_reply_to":"2ddf531b_71fe5b87","updated":"2026-03-30 14:52:56.000000000","message":"Acknowledged","commit_id":"88d1bef67c74f690033746da7694357fecd845ea"},{"author":{"_account_id":17669,"name":"Doug Szumski","email":"doug@stackhpc.com","username":"DougSzumski"},"change_message_id":"c568fd06b9d3b62085bb8f40b483ddcae4809d85","unresolved":true,"context_lines":[{"line_number":99,"context_line":""},{"line_number":100,"context_line":"   # Alertmanager API endpoint and credentials"},{"line_number":101,"context_line":"   # These variables are available within the Kolla environment"},{"line_number":102,"context_line":"   URL\u003d\"{{ internal_protocol }}://{{ kolla_internal_vip_address }}:{{ prometheus_alertmanager_port }}/api/v2/alerts\""},{"line_number":103,"context_line":"   AUTH\u003d\"{{ prometheus_alertmanager_user }}:{{ prometheus_alertmanager_password }}\""},{"line_number":104,"context_line":""},{"line_number":105,"context_line":"   curl -X POST \"$URL\" \\"}],"source_content_type":"text/x-rst","patch_set":6,"id":"b7dc4b97_3928c0be","line":102,"updated":"2026-03-30 14:06:42.000000000","message":"I know this is just an example, but it would be neat to avoid the VIP here. Certainly, issues with the VIP are not mutually exclusive with Galera melting down.\n\nCould you simply fire off the alert to *all* of the Alertmanager instances and it will handle the deduplication for you? 
If not, we could cycle through them, until a POST succeeds.","commit_id":"88d1bef67c74f690033746da7694357fecd845ea"},{"author":{"_account_id":37306,"name":"Piotr Milewski","display_name":"Piotr Milewski","email":"vurmil@gmail.com","username":"vurmil"},"change_message_id":"64b03043ea86856a79cf330934d3b10c9d7b73aa","unresolved":false,"context_lines":[{"line_number":99,"context_line":""},{"line_number":100,"context_line":"   # Alertmanager API endpoint and credentials"},{"line_number":101,"context_line":"   # These variables are available within the Kolla environment"},{"line_number":102,"context_line":"   URL\u003d\"{{ internal_protocol }}://{{ kolla_internal_vip_address }}:{{ prometheus_alertmanager_port }}/api/v2/alerts\""},{"line_number":103,"context_line":"   AUTH\u003d\"{{ prometheus_alertmanager_user }}:{{ prometheus_alertmanager_password }}\""},{"line_number":104,"context_line":""},{"line_number":105,"context_line":"   curl -X POST \"$URL\" \\"}],"source_content_type":"text/x-rst","patch_set":6,"id":"32ebcf82_8273bdfb","line":102,"in_reply_to":"b7dc4b97_3928c0be","updated":"2026-03-30 14:52:56.000000000","message":"Acknowledged","commit_id":"88d1bef67c74f690033746da7694357fecd845ea"}]}
