)]}'
{"/PATCHSET_LEVEL":[{"author":{"_account_id":15554,"name":"Bence Romsics","email":"bence.romsics@gmail.com","username":"ebenrom","status":"working for Ericsson, UTC+1 (+DST)"},"change_message_id":"3ef1c233d957739090444517c6bd7988f14a6fe0","unresolved":true,"context_lines":[],"source_content_type":"","patch_set":2,"id":"1fbbc8bc_af9a67bb","updated":"2025-05-13 11:55:32.000000000","message":"Hi,\n\nThanks for this patch! I believe this reduces the frequency of the bug, however it does not solve it properly. The reason being that we have a lock to avoid concurrent trunk bridge creation and deletion:\n\nhttps://opendev.org/openstack/neutron/src/commit/d5ee0e603b7b86461f4350b7365f49660799ad30/neutron/services/trunk/drivers/openvswitch/agent/ovsdb_handler.py#L152-L158\nhttps://opendev.org/openstack/neutron/src/commit/d5ee0e603b7b86461f4350b7365f49660799ad30/neutron/services/trunk/drivers/openvswitch/agent/ovsdb_handler.py#L200\n\nThis lock is local to ovs-agent. From the locked code we call into the server (we have multiple rpc calls in there). In the server the trunk may already be deleted when the agent is still processing the create. Clearly a distributed lock is not the solution here. However we could entertain the following ideas for a proper fix:\n\n1) We could add the missing metadata as early as possible. ps1 of this change adds the missing metadata earlier than before. But not as early as possible. Ideally this would happen in the same ovs transaction in which we create the port that later has the missing keys in external_ids. That transaction is in os-vif:\n\nhttps://opendev.org/openstack/os-vif/src/branch/master/vif_plug_ovs/ovsdb/ovsdb_lib.py#L200\n\nHere we have two examples of the external_ids field on the tap interface of the trunk parent port:\n\na) bridge_name present, no leftover tpi\n{\n\u0027attached-mac\u0027: \u0027fa:16:3e:12:18:a8\u0027,\n\u0027bridge_name\u0027: \u0027tbr-8ab5a42a-d\u0027,\n\u0027iface-id\u0027: \u0027a440adc8-ba05-43ca-897c-ac7cde888083\u0027,\n\u0027iface-status\u0027: \u0027active\u0027,\n\u0027subport_ids\u0027: \u0027[\"fdbf42a9-0d9b-40d2-9745-a0ecc34a75fd\", \"8c552e67-3d44-44f0-801b-4c8281ca0927\", \"0a3aad77-ecb0-4388-aaba-a1c423c0f94f\", \"a6acd854-3473-46d3-9ec4-8000de9eb5ff\", \"65ce7b1d-fea8-44e8-ba75-800b06757f05\", \"b95bf7fb-89db-4f10-9908-1b94a00910ed\", \"9414a720-8b80-43ff-8a4e-6fbb8a1115d6\", \"a79928ab-ff51-46db-9024-3f4ebc6f4476\", \"3641f2c7-d668-41e2-9315-1808d67e8d18\", \"6c87ca7f-b689-484b-851c-d82ed689b181\"]\u0027,\n\u0027trunk_id\u0027: \u00278ab5a42a-d7c3-4779-827e-0c89cbd8728d\u0027,\n\u0027vm-uuid\u0027: \u002709dc5bdb-3d27-4f45-b557-b9b197d83d78\u0027,\n}\n\nb) bridge_name missing, we leak a tpi\n{\n\u0027attached-mac\u0027: \u0027fa:16:3e:8f:dd:70\u0027,\n\u0027iface-id\u0027: \u00273ad03104-c90b-43d6-b7e7-79467b8a837e\u0027,\n\u0027iface-status\u0027: \u0027active\u0027,\n\u0027vm-uuid\u0027: \u0027a7cf1637-50f9-4f78-9bea-b0df57c3093c\u0027,\n}\n\nc) missing key:value pairs in (b)\n\u0027bridge_name\u0027: \u0027tbr-8ab5a42a-d\u0027,\n\u0027subport_ids\u0027: \u0027[\"fdbf42a9-0d9b-40d2-9745-a0ecc34a75fd\", ...]\u0027,\n\u0027trunk_id\u0027: \u00278ab5a42a-d7c3-4779-827e-0c89cbd8728d\u0027,\n\nI believe we could at least add the bridge_name from os-vif:\n\u0027bridge_name\u0027: \u0027tbr-8ab5a42a-d\u0027,\n\n2) We could refactor the agent-side trunk handling code like this:\n\na) rpc get: get all trunk related data from server and pass it down\nb) no rpc, locked locally: call a modified handle_trunk_add/handle_trunk_remove that no longer calls into the server but is still decorated with the trunk name based lock, return the final trunk status desired\nc) rpc put: update the trunk status after all this from a single place\n\nThis sounds quite a big refactor possibly introducing many new bugs...\n\n3) We could clean all unmatched patch cable ends (tpi with no matching tpt and spi with no matching spt). But I don\u0027t see how we could this at the right time...\n\nAltogether option (1) seems the most realistic to me. What do you think?","commit_id":"725bc7e601b9f1939a656bc828886aef91307c61"},{"author":{"_account_id":8313,"name":"Lajos Katona","display_name":"lajoskatona","email":"katonalala@gmail.com","username":"elajkat","status":"Ericsson Software Technology"},"change_message_id":"ebfe6a4c5d2d5d2cacf01f57d8281d4db1ee3a31","unresolved":true,"context_lines":[],"source_content_type":"","patch_set":2,"id":"aae0edb3_2ed9c031","in_reply_to":"1fbbc8bc_af9a67bb","updated":"2025-05-14 12:45:05.000000000","message":"thanks, I checked adding the external_ids field (bridge_name) in os-vif: https://review.opendev.org/c/openstack/os-vif/+/949736\n\nLet\u0027s see","commit_id":"725bc7e601b9f1939a656bc828886aef91307c61"},{"author":{"_account_id":15554,"name":"Bence Romsics","email":"bence.romsics@gmail.com","username":"ebenrom","status":"working for Ericsson, UTC+1 (+DST)"},"change_message_id":"5dde22ef14e20fa3156f5298f197777220d7fe15","unresolved":false,"context_lines":[],"source_content_type":"","patch_set":3,"id":"76d9d268_f6399ebc","updated":"2025-05-19 08:58:53.000000000","message":"Thanks!","commit_id":"01f8faf861af12347bcce609eb7448ff598b4e71"},{"author":{"_account_id":8313,"name":"Lajos Katona","display_name":"lajoskatona","email":"katonalala@gmail.com","username":"elajkat","status":"Ericsson Software Technology"},"change_message_id":"9b4d00033c7537bede7d798e4c0896cdb8bc0ce9","unresolved":false,"context_lines":[],"source_content_type":"","patch_set":5,"id":"ef01ec3f_1d67ecd7","updated":"2025-06-30 09:38:28.000000000","message":"https://review.opendev.org/c/openstack/neutron/+/952958 fixed for https://bugs.launchpad.net/neutron/+bug/2115026 functional failure","commit_id":"4c50416d6a34d65c3b5d4287b13106d8b171b7a8"}],"neutron/services/trunk/drivers/openvswitch/agent/ovsdb_handler.py":[{"author":{"_account_id":1131,"name":"Brian Haley","email":"haleyb.dev@gmail.com","username":"brian-haley"},"change_message_id":"20ba571c0646a9e9827b72f31c115182bbec210a","unresolved":true,"context_lines":[{"line_number":46,"context_line":""},{"line_number":47,"context_line":"# Note: this lock is local only to the ovs-agent, but not for"},{"line_number":48,"context_line":"# the RPC calls to the neutron-server! This can mean that the agent"},{"line_number":49,"context_line":"# is still working on trunk wiring but in the server the trunk already"},{"line_number":50,"context_line":"# deleted."},{"line_number":51,"context_line":"def lock_on_bridge_name(required_parameter):"},{"line_number":52,"context_line":"    def func_decor(f):"}],"source_content_type":"text/x-python","patch_set":3,"id":"887cf4cb_49bcea2b","line":49,"range":{"start_line":49,"start_character":57,"end_line":49,"end_character":70},"updated":"2025-06-27 14:20:27.000000000","message":"s/trunk is already","commit_id":"01f8faf861af12347bcce609eb7448ff598b4e71"},{"author":{"_account_id":8313,"name":"Lajos Katona","display_name":"lajoskatona","email":"katonalala@gmail.com","username":"elajkat","status":"Ericsson Software Technology"},"change_message_id":"08bb416b0540d34b3fe5e69f65877a61c07a078b","unresolved":false,"context_lines":[{"line_number":46,"context_line":""},{"line_number":47,"context_line":"# Note: this lock is local only to the ovs-agent, but not for"},{"line_number":48,"context_line":"# the RPC calls to the neutron-server! This can mean that the agent"},{"line_number":49,"context_line":"# is still working on trunk wiring but in the server the trunk already"},{"line_number":50,"context_line":"# deleted."},{"line_number":51,"context_line":"def lock_on_bridge_name(required_parameter):"},{"line_number":52,"context_line":"    def func_decor(f):"}],"source_content_type":"text/x-python","patch_set":3,"id":"7437f442_5e4b1886","line":49,"range":{"start_line":49,"start_character":57,"end_line":49,"end_character":70},"in_reply_to":"887cf4cb_49bcea2b","updated":"2025-06-27 15:40:57.000000000","message":"Done","commit_id":"01f8faf861af12347bcce609eb7448ff598b4e71"},{"author":{"_account_id":1131,"name":"Brian Haley","email":"haleyb.dev@gmail.com","username":"brian-haley"},"change_message_id":"20ba571c0646a9e9827b72f31c115182bbec210a","unresolved":true,"context_lines":[{"line_number":279,"context_line":"            else:"},{"line_number":280,"context_line":"                subport_ids.append(subport.port_id)"},{"line_number":281,"context_line":""},{"line_number":282,"context_line":"        # Note(lajoskatona): update_trunk_metadata tries to add the following"},{"line_number":283,"context_line":"        # fields to external_ids: bridge_name, trunk_id and subport_ids."},{"line_number":284,"context_line":"        # It can happen that this metadata is not filled in time and"},{"line_number":285,"context_line":"        # the cleanup fails."}],"source_content_type":"text/x-python","patch_set":3,"id":"bf48a68a_0fc0e39e","line":282,"range":{"start_line":282,"start_character":15,"end_line":282,"end_character":26},"updated":"2025-06-27 14:20:27.000000000","message":"nit: don\u0027t know if you need your nick here, didn\u0027t put above","commit_id":"01f8faf861af12347bcce609eb7448ff598b4e71"},{"author":{"_account_id":8313,"name":"Lajos Katona","display_name":"lajoskatona","email":"katonalala@gmail.com","username":"elajkat","status":"Ericsson Software Technology"},"change_message_id":"08bb416b0540d34b3fe5e69f65877a61c07a078b","unresolved":false,"context_lines":[{"line_number":279,"context_line":"            else:"},{"line_number":280,"context_line":"                subport_ids.append(subport.port_id)"},{"line_number":281,"context_line":""},{"line_number":282,"context_line":"        # Note(lajoskatona): update_trunk_metadata tries to add the following"},{"line_number":283,"context_line":"        # fields to external_ids: bridge_name, trunk_id and subport_ids."},{"line_number":284,"context_line":"        # It can happen that this metadata is not filled in time and"},{"line_number":285,"context_line":"        # the cleanup fails."}],"source_content_type":"text/x-python","patch_set":3,"id":"7b9a082c_79d2e03a","line":282,"range":{"start_line":282,"start_character":15,"end_line":282,"end_character":26},"in_reply_to":"bf48a68a_0fc0e39e","updated":"2025-06-27 15:40:57.000000000","message":"yeah, it was perhaps originally a base of TODO in my local repo and left it, thanks","commit_id":"01f8faf861af12347bcce609eb7448ff598b4e71"},{"author":{"_account_id":1131,"name":"Brian Haley","email":"haleyb.dev@gmail.com","username":"brian-haley"},"change_message_id":"20ba571c0646a9e9827b72f31c115182bbec210a","unresolved":true,"context_lines":[{"line_number":282,"context_line":"        # Note(lajoskatona): update_trunk_metadata tries to add the following"},{"line_number":283,"context_line":"        # fields to external_ids: bridge_name, trunk_id and subport_ids."},{"line_number":284,"context_line":"        # It can happen that this metadata is not filled in time and"},{"line_number":285,"context_line":"        # the cleanup fails."},{"line_number":286,"context_line":"        # As os-vif is responsible for creating these ports and at the time"},{"line_number":287,"context_line":"        # of creation the metadata partially available, os-vif can add at"},{"line_number":288,"context_line":"        # least bridge_name to external_ids."}],"source_content_type":"text/x-python","patch_set":3,"id":"cf87e2d6_b74426cb","line":285,"range":{"start_line":285,"start_character":14,"end_line":285,"end_character":21},"updated":"2025-06-27 14:20:27.000000000","message":"create fails? Maybe I\u0027m missing the context","commit_id":"01f8faf861af12347bcce609eb7448ff598b4e71"},{"author":{"_account_id":8313,"name":"Lajos Katona","display_name":"lajoskatona","email":"katonalala@gmail.com","username":"elajkat","status":"Ericsson Software Technology"},"change_message_id":"08bb416b0540d34b3fe5e69f65877a61c07a078b","unresolved":false,"context_lines":[{"line_number":282,"context_line":"        # Note(lajoskatona): update_trunk_metadata tries to add the following"},{"line_number":283,"context_line":"        # fields to external_ids: bridge_name, trunk_id and subport_ids."},{"line_number":284,"context_line":"        # It can happen that this metadata is not filled in time and"},{"line_number":285,"context_line":"        # the cleanup fails."},{"line_number":286,"context_line":"        # As os-vif is responsible for creating these ports and at the time"},{"line_number":287,"context_line":"        # of creation the metadata partially available, os-vif can add at"},{"line_number":288,"context_line":"        # least bridge_name to external_ids."}],"source_content_type":"text/x-python","patch_set":3,"id":"7fb17e6f_d3be058e","line":285,"range":{"start_line":285,"start_character":14,"end_line":285,"end_character":21},"in_reply_to":"cf87e2d6_b74426cb","updated":"2025-06-27 15:40:57.000000000","message":"create is fine the issue happens if the delete arrives before neutron-agent finished all the creation related things (easy to reproduce if you have lots of subports....) and just can\u0027t reach the point where all the matadata is attached to the ovs devices, and when the delete comes cant find what to delete, but adding bridge_name at create time in os-vif can solve this, but to have everybody warned that part of this logic is seriously in os-vf/nova I think it is good to have a few sentences here.","commit_id":"01f8faf861af12347bcce609eb7448ff598b4e71"},{"author":{"_account_id":1131,"name":"Brian Haley","email":"haleyb.dev@gmail.com","username":"brian-haley"},"change_message_id":"20ba571c0646a9e9827b72f31c115182bbec210a","unresolved":true,"context_lines":[{"line_number":284,"context_line":"        # It can happen that this metadata is not filled in time and"},{"line_number":285,"context_line":"        # the cleanup fails."},{"line_number":286,"context_line":"        # As os-vif is responsible for creating these ports and at the time"},{"line_number":287,"context_line":"        # of creation the metadata partially available, os-vif can add at"},{"line_number":288,"context_line":"        # least bridge_name to external_ids."},{"line_number":289,"context_line":"        try:"},{"line_number":290,"context_line":"            self._update_trunk_metadata("}],"source_content_type":"text/x-python","patch_set":3,"id":"1084c021_3b11852f","line":287,"range":{"start_line":287,"start_character":35,"end_line":287,"end_character":44},"updated":"2025-06-27 14:20:27.000000000","message":"s/is partially","commit_id":"01f8faf861af12347bcce609eb7448ff598b4e71"}]}
