{"nova/compute/manager.py":[{"author":{"_account_id":12356,"name":"Vladyslav Drok","email":"vdrok@mirantis.com","username":"vdrok"},"change_message_id":"b017240ddba3f269bde5f47790a82c491743e692","unresolved":false,"context_lines":[{"line_number":9588,"context_line":"                         {\u0027id\u0027: cn.id, \u0027hh\u0027: cn.hypervisor_hostname,"},{"line_number":9589,"context_line":"                          \u0027nodes\u0027: nodenames})"},{"line_number":9590,"context_line":"                try:"},{"line_number":9591,"context_line":"                    cn.destroy()"},{"line_number":9592,"context_line":"                except exception.ComputeHostNotFound:"},{"line_number":9593,"context_line":"                    # NOTE(mgoddard): it\u0027s possible that another compute"},{"line_number":9594,"context_line":"                    # service took ownership of this compute node since we"}],"source_content_type":"text/x-python","patch_set":5,"id":"df33271e_a1923e9e","line":9591,"updated":"2020-04-03 12:44:22.000000000","message":"Functional test mentions that this bug affects rocky and further, but it also affects queens, though in a more involved scenario, when some conductor(s) networking flaps (while others may be OK), and some nova-computes cannot report that they are in up state for some time, this reshuffle of nodes between hosts also happens, causing the same problems.\n\nBut also maybe there is an easier way to fix the issue? Maybe here before destroy, if driver rebalances nodes, check that this compute node was already reassigned to some other compute host, and delete it only if it was not? Then the rebalance check will just update the host of the ComputeNode record again. But check if node was reassigned ideally means that it needs to be a part of driver API I think.","commit_id":"4b21e2721c22e0fdd46ec472c2e78c76a0ee8312"},{"author":{"_account_id":14826,"name":"Mark Goddard","email":"markgoddard86@gmail.com","username":"mgoddard"},"change_message_id":"3f6331797b1491d8e592609d718f1b29e8856507","unresolved":false,"context_lines":[{"line_number":9588,"context_line":"                         {\u0027id\u0027: cn.id, \u0027hh\u0027: cn.hypervisor_hostname,"},{"line_number":9589,"context_line":"                          \u0027nodes\u0027: nodenames})"},{"line_number":9590,"context_line":"                try:"},{"line_number":9591,"context_line":"                    cn.destroy()"},{"line_number":9592,"context_line":"                except exception.ComputeHostNotFound:"},{"line_number":9593,"context_line":"                    # NOTE(mgoddard): it\u0027s possible that another compute"},{"line_number":9594,"context_line":"                    # service took ownership of this compute node since we"}],"source_content_type":"text/x-python","patch_set":5,"id":"df33271e_bbcb8748","line":9591,"in_reply_to":"df33271e_8720ea9d","updated":"2020-04-03 16:21:38.000000000","message":"It hurts my head every time I look at it :)","commit_id":"4b21e2721c22e0fdd46ec472c2e78c76a0ee8312"},{"author":{"_account_id":12356,"name":"Vladyslav Drok","email":"vdrok@mirantis.com","username":"vdrok"},"change_message_id":"22898fe7a6fc7342dbcf6165a40089a3f0b2c90c","unresolved":false,"context_lines":[{"line_number":9588,"context_line":"                         {\u0027id\u0027: cn.id, \u0027hh\u0027: cn.hypervisor_hostname,"},{"line_number":9589,"context_line":"                          \u0027nodes\u0027: nodenames})"},{"line_number":9590,"context_line":"                try:"},{"line_number":9591,"context_line":"         
           cn.destroy()"},{"line_number":9592,"context_line":"                except exception.ComputeHostNotFound:"},{"line_number":9593,"context_line":"                    # NOTE(mgoddard): it\u0027s possible that another compute"},{"line_number":9594,"context_line":"                    # service took ownership of this compute node since we"}],"source_content_type":"text/x-python","patch_set":5,"id":"df33271e_8720ea9d","line":9591,"in_reply_to":"df33271e_a1923e9e","updated":"2020-04-03 14:00:33.000000000","message":"Right, we can\u0027t do the hash ring check as we don\u0027t know if any other node actually picked it up or that it was really deleted, so all of this with removal from cache and such addition to dbapi to consider both host and node seems to be the best way to go.","commit_id":"4b21e2721c22e0fdd46ec472c2e78c76a0ee8312"},{"author":{"_account_id":12356,"name":"Vladyslav Drok","email":"vdrok@mirantis.com","username":"vdrok"},"change_message_id":"4925e2647c7b7ee5e361f6972dd32a8eb7eec8f7","unresolved":false,"context_lines":[{"line_number":9588,"context_line":"                         {\u0027id\u0027: cn.id, \u0027hh\u0027: cn.hypervisor_hostname,"},{"line_number":9589,"context_line":"                          \u0027nodes\u0027: nodenames})"},{"line_number":9590,"context_line":"                try:"},{"line_number":9591,"context_line":"                    cn.destroy()"},{"line_number":9592,"context_line":"                except exception.ComputeHostNotFound:"},{"line_number":9593,"context_line":"                    # NOTE(mgoddard): it\u0027s possible that another compute"},{"line_number":9594,"context_line":"                    # service took ownership of this compute node since we"}],"source_content_type":"text/x-python","patch_set":5,"id":"df33271e_9b9de30c","line":9591,"in_reply_to":"df33271e_bbcb8748","updated":"2020-04-03 16:32:35.000000000","message":"I feel the same headache :)\n\nI was initially trying to fix it on queens and came up with this little ugly fix which I guess might work as well\n\n  for cn in compute_nodes_in_db:\n    if cn.hypervisor_hostname not in nodenames:\n      if self.driver.rebalances_nodes:\n        if (getattr(self.driver, \u0027hash_ring\u0027) and\n            CONF.host not in self.driver.hash_ring.get_nodes(\n              cn.hypervisor_hostname.encode(\u0027utf-8\u0027))):\n          # If a node in the hash ring changed its host, we\n          # consider that the service that should be managing it\n          # now is up, because _refresh_hash_ring checks it\n          LOG.info(\"ComputeNode %(hh)s moving from %(h)s \"\n                   \"compute host to a new one, remove it from \"\n                   \"the cache of managed nodes.\",\n                   {\"hh\": cn.hypervisor_hostname, \"h\": cn.host})\n          # Only remove it from RT cache, the new compute will\n          # take care of the rest\n          rt.remove_node(cn.hypervisor_hostname)\n          continue\n      \u003cremaining orphan deletion flow here\u003e\n\nBut it does not feel right to do such checks here, without providing virt driver api for that...","commit_id":"4b21e2721c22e0fdd46ec472c2e78c76a0ee8312"},{"author":{"_account_id":14826,"name":"Mark Goddard","email":"markgoddard86@gmail.com","username":"mgoddard"},"change_message_id":"a38f6f28099c2988c0ad7bfcc81abc9c11cd67ff","unresolved":false,"context_lines":[{"line_number":9588,"context_line":"                         {\u0027id\u0027: cn.id, \u0027hh\u0027: cn.hypervisor_hostname,"},{"line_number":9589,"context_line":" 
Patch set 9 (commit 8036677fa6888e3382160ef466f83c35e1097666), comment on
line 9955 (unresolved):

    9952                         {'id': cn.id, 'hh': cn.hypervisor_hostname,
    9953                          'nodes': nodenames})
    9954                try:
    9955                    cn.destroy()
    9956                except exception.ComputeHostNotFound:
    9957                    # NOTE(mgoddard): it's possible that another compute
    9958                    # service took ownership of this compute node since we

Balazs Gibizer (gibi), 2021-04-28 09:38:

    Please note that this destroy is a conditional one. It only destroys the
    ComputeNode object if its host field has not changed in the DB since we
    last queried it.
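(Spelled out, the pattern gibi is annotating looks like the following; the
log message is illustrative.)

    try:
        # Conditional delete: only succeeds if the row's host column still
        # matches cn.host, i.e. no other compute service claimed the node.
        cn.destroy()
    except exception.ComputeHostNotFound:
        # Another compute service took ownership of this compute node
        # between our DB read and the delete (e.g. after an ironic hash
        # ring rebalance); leave the record for its new owner.
        LOG.info("Skipping deletion of compute node %s: it is now owned "
                 "by another host.", cn.hypervisor_hostname)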
File: nova/objects/compute_node.py
Patch set 9 (commit 8036677fa6888e3382160ef466f83c35e1097666), comment thread
on line 363 (unresolved):

    360
    361        @base.remotable
    362        def destroy(self):
    363            db.compute_node_delete(self._context, self.id, self.host)
    364
    365        def update_from_virt_driver(self, resources):
    366            # NOTE(pmurray): the virt driver provides a dict of values that

Balazs Gibizer (gibi), 2021-04-28 09:38:

    Can we make this extra, unusual behaviour of the object delete explicit?
    I mean, there is now logic by which nova only deletes the ComputeNode
    object from the DB if the value of the ComputeNode.host field in the
    object matches the value of the corresponding column in the DB.

Stephen Finucane (sfinucan), 2021-04-28 14:50:

    How could we do this? There's no reason to introduce a new remotable
    method, and doing so would prevent our ability to backport this. Perhaps
    just comments?
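(Following sfinucan's suggestion, making the behaviour explicit could be as
small as a comment at the call site; the wording below is a sketch, not the
note that was eventually merged.)

    @base.remotable
    def destroy(self):
        # NOTE: compute_node_delete() filters on both id and host, so this
        # is a conditional delete: if another compute service has taken
        # ownership of the node (e.g. after an ironic hash ring rebalance),
        # ComputeHostNotFound is raised and the row is left in place.
        db.compute_node_delete(self._context, self.id, self.host)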
melanie witt (melwitt), 2021-04-29 21:41:

    Hm, I think this is a fair point. If I'm understanding correctly, I
    believe we have some prior art here with the Instance object [1]:

        if not self.obj_attr_is_set('host') or not self.host:
            # NOTE(danms): If our host is not set, avoid a race
            constraint = db.constraint(host=db.equal_any(None))
        else:
            constraint = None

        try:
            db_inst = db.instance_destroy(self._context, self.uuid,
                                          constraint=constraint,
                                          hard_delete=hard_delete)
            self._from_db_object(self._context, self, db_inst)
        except exception.ConstraintNotMet:
            raise exception.ObjectActionError(action='destroy',
                                              reason='host changed')

    where we add a host field constraint to the delete query. Unless I'm
    missing something, we could do the same thing for ComputeNode and raise
    ObjectActionError if we get ConstraintNotMet from the DB layer, then
    handle that in the compute manager. Doing this, we would add a
    constraint=None kwarg to the compute_node_delete method instead of
    adding a host arg.

    What do you think?

    [1] https://github.com/openstack/nova/blob/dab4ec1a534d4774e61a24cdace0f3491788e4e7/nova/objects/instance.py#L646

Balazs Gibizer (gibi), 2021-05-20 13:03:

    As the race we are fixing here happens during a periodic task, we cannot
    make the non-deletion explicit to the user; we can only log it. However,
    this is a generic destroy() method that might later be used not just
    from the periodic task but from some user-facing action, where I would
    expect an error response if the deletion failed due to a host mismatch.
    So I think what melwitt suggests is a good direction: raise from the db
    method if there is an unmet constraint, and handle that in the periodic
    task for now by ignoring/logging it. Then, later, whoever starts using
    this destroy() method elsewhere will need to deal with the exception
    explicitly.
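(Put together, the direction gibi endorses, modelled on the Instance prior
art melwitt quotes, could look roughly like this. It assumes db.constraint()
and db.equal_any() can be applied to compute_node_delete the same way they
are to instance_destroy.)

    @base.remotable
    def destroy(self):
        if self.obj_attr_is_set('host') and self.host:
            # Only delete the row if it still belongs to our host, to
            # avoid racing another compute service that claimed the node.
            constraint = db.constraint(host=db.equal_any(self.host))
        else:
            constraint = None

        try:
            db.compute_node_delete(self._context, self.id,
                                   constraint=constraint)
        except exception.ConstraintNotMet:
            # The periodic task can log and swallow this; other callers
            # get an explicit failure.
            raise exception.ObjectActionError(action='destroy',
                                              reason='host changed')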
