)]}'
{"specs/backlog/approved/handling-nova-compute-restarts-during-live-migration.rst":[{"author":{"_account_id":11604,"name":"sean mooney","email":"smooney@redhat.com","username":"sean-k-mooney"},"change_message_id":"2f17cf396c1cccdf5f4a329458beece042ba351e","unresolved":true,"context_lines":[{"line_number":39,"context_line":"marks the migration as failed and resets the state of the instance. An"},{"line_number":40,"context_line":"operator can then reboot the instance on the source host, even though it"},{"line_number":41,"context_line":"is already running on the destination host. nova-compute will recreate"},{"line_number":42,"context_line":"the libvirt domain and start the instance on the source host."},{"line_number":43,"context_line":""},{"line_number":44,"context_line":"Preventing an instance to boot from two nodes at the same will prevent"},{"line_number":45,"context_line":"these two instances writing to the volumes that are attached to them at"}],"source_content_type":"text/x-rst","patch_set":1,"id":"ad359eb3_8872ee4f","line":42,"updated":"2025-07-24 17:30:54.000000000","message":"The problem is that the compute agent can't check whether the instance is on the destination.\n\nIt also can't currently abort the migration if it's still running.\n\nnova could be modified to potentially find the relevant migration job and try to abort it, or a new RPC could be added to try to somehow resume the migration by calling the dest and seeing if the instance exists there.\n\nBut it's a non-trivial problem to address this known issue.\n\nIf we had an RPC similar to pre_live_migration that checked whether the instance was running on the dest, we could perhaps call post live migration to complete it, but there is inherently a race happening between both agents that makes recovery hard.","commit_id":"77e85bd1113f3b4ebc05a192fd3c6e286af5e393"},{"author":{"_account_id":11604,"name":"sean 
mooney","email":"smooney@redhat.com","username":"sean-k-mooney"},"change_message_id":"2f17cf396c1cccdf5f4a329458beece042ba351e","unresolved":true,"context_lines":[{"line_number":53,"context_line":"Proposed change"},{"line_number":54,"context_line":"\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d"},{"line_number":55,"context_line":""},{"line_number":56,"context_line":"None yet, the rest of the spec depends on what solution is chosen."},{"line_number":57,"context_line":""},{"line_number":58,"context_line":"Alternatives"},{"line_number":59,"context_line":"------------"}],"source_content_type":"text/x-rst","patch_set":1,"id":"7441ca7c_fb394f85","line":56,"updated":"2025-07-24 17:30:54.000000000","message":"So the problem stated there has been known for a very long time.\n\nThis is why we mark the migration as failed and, I believe, put the instance in error.\n\nOne mitigation for this that we have considered in the past is that, in the graceful shutdown case, we would abort the migration first.\n\nI think that is a partial solution that likely should be done, but we also need better error handling on startup because we can't assume a graceful shutdown.\n\nOne possible resolution might be to lock the instance in the error state as an admin, so that an admin has to manually review the current state and either unlock it or correct the issue.\n\nThe better solution would be a dedicated API to allow resuming an interrupted live migration, or to try to self-heal.\n\nPresumably the latter is what you want. 
You want nova to self-heal and ensure the host points to the correct location, and that we either mark the migration as succeeded or, if we mark it as failed, make sure the instance can't be running on the destination host.","commit_id":"77e85bd1113f3b4ebc05a192fd3c6e286af5e393"},{"author":{"_account_id":11604,"name":"sean mooney","email":"smooney@redhat.com","username":"sean-k-mooney"},"change_message_id":"2f17cf396c1cccdf5f4a329458beece042ba351e","unresolved":true,"context_lines":[{"line_number":65,"context_line":"difficult for an operator to start the VM. For example, require an"},{"line_number":66,"context_line":"explicit clearing of this new state and display a message describing the"},{"line_number":67,"context_line":"number of things that can go wrong, and/or possible checks an operator"},{"line_number":68,"context_line":"can do before resetting."},{"line_number":69,"context_line":""},{"line_number":70,"context_line":"Handle the failed migration in nova-compute automatically"},{"line_number":71,"context_line":"^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^"}],"source_content_type":"text/x-rst","patch_set":1,"id":"6e818dc8_a4332ca5","line":68,"updated":"2025-07-24 17:30:54.000000000","message":"Well, the way to do that would be to lock the VM as an admin (i.e. nova should lock it so that only an admin can unlock it) and leave a meaningful message in the lock reason field.","commit_id":"77e85bd1113f3b4ebc05a192fd3c6e286af5e393"},{"author":{"_account_id":11604,"name":"sean mooney","email":"smooney@redhat.com","username":"sean-k-mooney"},"change_message_id":"2f17cf396c1cccdf5f4a329458beece042ba351e","unresolved":true,"context_lines":[{"line_number":108,"context_line":""},{"line_number":109,"context_line":"The proposal aims to implement a core change that will be enabled once"},{"line_number":110,"context_line":"nova-compute is upgraded. 
It does not make sense to make this either an"},{"line_number":111,"context_line":"opt-in or opt-out option."},{"line_number":112,"context_line":""},{"line_number":113,"context_line":"Developer impact"},{"line_number":114,"context_line":"----------------"}],"source_content_type":"text/x-rst","patch_set":1,"id":"974afd1c_833f4101","line":111,"updated":"2025-07-24 17:30:54.000000000","message":"So this would affect all virt drivers, not just libvirt, as it's a change to the compute manager, not the virt driver logic.\n\nWe could guard this with a minimum compute service version check, but any change we make needs to work with mixed upgraded and non-upgraded hosts on either side of the migration.\n\nThe simplest way to do that is to disable the new logic until all nodes are upgraded, which I think is what you are suggesting here.","commit_id":"77e85bd1113f3b4ebc05a192fd3c6e286af5e393"}]}
