)]}' {"/PATCHSET_LEVEL":[{"author":{"_account_id":9708,"name":"Balazs Gibizer","display_name":"gibi","email":"gibizer@gmail.com","username":"gibi"},"change_message_id":"c238244046e72807d5f46a48753ffe42d9ee7ec2","unresolved":false,"context_lines":[],"source_content_type":"","patch_set":1,"id":"b96641cf_cd9f5322","updated":"2022-07-19 10:36:32.000000000","message":"Honestly I\u0027m not sure this problem is meaningfully solvable without re-architecting nova or ironic or both.","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"},{"author":{"_account_id":782,"name":"John Garbutt","email":"john@johngarbutt.com","username":"johngarbutt"},"change_message_id":"b9237f66f38e27de4c135acc596c6043d6c50fce","unresolved":false,"context_lines":[],"source_content_type":"","patch_set":1,"id":"214ae10e_3ad74f4a","updated":"2022-10-14 15:19:31.000000000","message":"I recognise the problems here, thank you for bringing this together. However, I don\u0027t agree with the proposed fixes. I think we should remove multiple nova-computes in front of a single conductor group.","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"},{"author":{"_account_id":4393,"name":"Dan Smith","email":"dms@danplanet.com","username":"danms"},"change_message_id":"4a4869f908e3a9cb922fc9ee7ab426cb1150ae7d","unresolved":false,"context_lines":[],"source_content_type":"","patch_set":1,"id":"5a7523ec_c860f2a0","updated":"2022-07-05 16:21:24.000000000","message":"I\u0027m not sure why there aren\u0027t any -1s on this, so I\u0027ll add one. 
Lots of discussion here and I\u0027ve added my comments, but I don\u0027t see anything merge-able or even action-able at this point.","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"},{"author":{"_account_id":4690,"name":"melanie witt","display_name":"melwitt","email":"melwittt@gmail.com","username":"melwitt"},"change_message_id":"2240ce3040078accedca8f75dd2d8bb495ee0527","unresolved":false,"context_lines":[],"source_content_type":"","patch_set":1,"id":"22b079dd_cd8eb6d7","updated":"2022-05-25 04:54:05.000000000","message":"Posting what I\u0027ve gotten through so far.","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"},{"author":{"_account_id":11604,"name":"sean mooney","email":"smooney@redhat.com","username":"sean-k-mooney"},"change_message_id":"0d6dde7dea42352144e1ef4f85f829235dab3918","unresolved":false,"context_lines":[],"source_content_type":"","patch_set":1,"id":"0df9db38_02d05a59","updated":"2022-05-17 19:16:54.000000000","message":"Thanks for drafing this\n\nthere si a lot to digest \n\nthe two most imporant comments i have made are related to the cell/az concern\n\nhttps://review.opendev.org/c/openstack/nova-specs/+/842015/1/specs/zed/approved/ironic-fix-management-issues.rst#326\n\nand perhaps \n\nhttps://review.opendev.org/c/openstack/nova-specs/+/842015/1/specs/zed/approved/ironic-fix-management-issues.rst#474\n\nthere are still some conceptual difference that i tried to clarify in the spec in other comments.\n","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"},{"author":{"_account_id":4393,"name":"Dan Smith","email":"dms@danplanet.com","username":"danms"},"change_message_id":"1c9381a0f7157252c875f673fcce5c29d70b7de4","unresolved":false,"context_lines":[],"source_content_type":"","patch_set":1,"id":"fbff9a78_8605f4ab","in_reply_to":"214ae10e_3ad74f4a","updated":"2022-10-14 15:26:35.000000000","message":"+1 to everything John said in his review just now 
:D","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"}],"specs/zed/approved/ironic-fix-management-issues.rst":[{"author":{"_account_id":782,"name":"John Garbutt","email":"john@johngarbutt.com","username":"johngarbutt"},"change_message_id":"b9237f66f38e27de4c135acc596c6043d6c50fce","unresolved":true,"context_lines":[{"line_number":30,"context_line":"of backporting. Because of the complexity and future direction, the desire"},{"line_number":31,"context_line":"was to treat this more as a specification document to build consensus and"},{"line_number":32,"context_line":"attempt to fully understand the mechanics, and hopefully come up with some"},{"line_number":33,"context_line":"number of solutions."},{"line_number":34,"context_line":""},{"line_number":35,"context_line":"Terminology \u0026 Meaning"},{"line_number":36,"context_line":"\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d"}],"source_content_type":"text/x-rst","patch_set":1,"id":"7002f2e6_8b433066","line":33,"updated":"2022-10-14 15:19:31.000000000","message":"tl;dr\n\nI think we should drop supporting multiple nova-compute hosts managing a single ironic conductor partition.\n\nWe should instead tell users to shard into conductor groups of around 1k nodes, with one nova-compute for each conductor group, using the already supported partition key.\n\nTo help with availability, the deployment tooling can use pacemaker, or whatever, to just run one of them on one of your nodes, with CONF.host \u003d \u003cconductor_group_name\u003e. 
The only real cost of moving is the startup time, which used to be horrendous, but has got better.","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"},{"author":{"_account_id":11604,"name":"sean mooney","email":"smooney@redhat.com","username":"sean-k-mooney"},"change_message_id":"40550583ab2b1d14819a9bd4c6a5ae02eaf3f022","unresolved":true,"context_lines":[{"line_number":30,"context_line":"of backporting. Because of the complexity and future direction, the desire"},{"line_number":31,"context_line":"was to treat this more as a specification document to build consensus and"},{"line_number":32,"context_line":"attempt to fully understand the mechanics, and hopefully come up with some"},{"line_number":33,"context_line":"number of solutions."},{"line_number":34,"context_line":""},{"line_number":35,"context_line":"Terminology \u0026 Meaning"},{"line_number":36,"context_line":"\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d"}],"source_content_type":"text/x-rst","patch_set":1,"id":"1a5de490_c5c0306f","line":33,"in_reply_to":"7002f2e6_8b433066","updated":"2022-10-14 16:37:31.000000000","message":"ya i think that also makes sense with the assumption that ironic servers will not move between conductor groups while they are actively provisioned.\n\ni vaguely think of conductor groups in ironic as an equivalent to our cells.\nthey are a sharding mechanism; servers should remain in the conductor group they are associated with, so then the instance.host won\u0027t need to change, right?","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"},{"author":{"_account_id":11604,"name":"sean mooney","email":"smooney@redhat.com","username":"sean-k-mooney"},"change_message_id":"0d6dde7dea42352144e1ef4f85f829235dab3918","unresolved":true,"context_lines":[{"line_number":74,"context_line":"| | | the supplied input |"},{"line_number":75,"context_line":"| | | to 
|"},{"line_number":76,"context_line":"+--------------------------+-----------------------+--------------------+"},{"line_number":77,"context_line":""},{"line_number":78,"context_line":""},{"line_number":79,"context_line":"Problem description"},{"line_number":80,"context_line":"\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d"}],"source_content_type":"text/x-rst","patch_set":1,"id":"9f6d5059_54149c30","line":77,"updated":"2022-05-17 19:16:54.000000000","message":"^ didnt actully clarify anything when i read it.\nit actully jsut made the terms more ambiguous.","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"},{"author":{"_account_id":4690,"name":"melanie witt","display_name":"melwitt","email":"melwittt@gmail.com","username":"melwitt"},"change_message_id":"2240ce3040078accedca8f75dd2d8bb495ee0527","unresolved":true,"context_lines":[{"line_number":74,"context_line":"| | | the supplied input |"},{"line_number":75,"context_line":"| | | to |"},{"line_number":76,"context_line":"+--------------------------+-----------------------+--------------------+"},{"line_number":77,"context_line":""},{"line_number":78,"context_line":""},{"line_number":79,"context_line":"Problem description"},{"line_number":80,"context_line":"\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d"}],"source_content_type":"text/x-rst","patch_set":1,"id":"a5e941b0_9f7bf36e","line":77,"in_reply_to":"9f6d5059_54149c30","updated":"2022-05-25 04:54:05.000000000","message":"Agree this is a bit confusing. I will attempt to summarize what I think it means.\n\nRow 1: The nova-compute service is responsible for modifying the hash ring in the ironic driver. \"Nodes\" are responsible for modifying the hash ring in tooz? (I don\u0027t really understand the last column)\n\nRow 2: The hash ring in the ironic driver in nova contains nova-compute Service objects. 
A hash ring in tooz contains arbitrary objects or data.\n\nRow 3: Baremetal nodes [in ironic] are called \"nodes\" in the ironic driver and are represented by ComputeNode objects. I don\u0027t understand the last column.\n\nRow 4: In the ironic driver HashRing.get_nodes() returns nova-compute Service objects. I don\u0027t understand the last column.","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"},{"author":{"_account_id":4393,"name":"Dan Smith","email":"dms@danplanet.com","username":"danms"},"change_message_id":"4a4869f908e3a9cb922fc9ee7ab426cb1150ae7d","unresolved":true,"context_lines":[{"line_number":74,"context_line":"| | | the supplied input |"},{"line_number":75,"context_line":"| | | to |"},{"line_number":76,"context_line":"+--------------------------+-----------------------+--------------------+"},{"line_number":77,"context_line":""},{"line_number":78,"context_line":""},{"line_number":79,"context_line":"Problem description"},{"line_number":80,"context_line":"\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d"}],"source_content_type":"text/x-rst","patch_set":1,"id":"9d01c434_ecdb4a56","line":77,"in_reply_to":"a5e941b0_9f7bf36e","updated":"2022-07-05 16:21:24.000000000","message":"If melwitt\u0027s explanation of this table is correct, then I\u0027m more confused than I was by the table to begin with (which I didn\u0027t think was possible). So, yeah, agree this is ... 
not helpful at least.","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"},{"author":{"_account_id":782,"name":"John Garbutt","email":"john@johngarbutt.com","username":"johngarbutt"},"change_message_id":"b9237f66f38e27de4c135acc596c6043d6c50fce","unresolved":true,"context_lines":[{"line_number":100,"context_line":"to run multiple nova-compute services talking to the same Ironic API endpoint."},{"line_number":101,"context_line":"This effort, while successful, is where the root of our issues originate."},{"line_number":102,"context_line":""},{"line_number":103,"context_line":"To make this happen, a consistent hash ring is generated using Tooz which"},{"line_number":104,"context_line":"is populated utilizing the list of \"online\" nova-compute services. This is"},{"line_number":105,"context_line":"ideally re-evaluated every time an internal cache of nodes, in this specific"},{"line_number":106,"context_line":"case, baremetal nodes, is updated from Ironic."},{"line_number":107,"context_line":""},{"line_number":108,"context_line":"Where this entire thing begins to loose traction is the original code never"},{"line_number":109,"context_line":"expected the operational environment to ever change, which is simply not"}],"source_content_type":"text/x-rst","patch_set":1,"id":"25243ac3_e85baa00","line":106,"range":{"start_line":103,"start_character":0,"end_line":106,"end_character":46},"updated":"2022-10-14 15:19:31.000000000","message":"So we have since added conductor group based scaling of individual nova-compute services. This allows a scale out as your cloud expands. It\u0027s a better solution for the problem you mention here.\n\nOriginally, I assumed it was more about high availability. 
But it turns out it is really bad at that (as mentioned else where).","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"},{"author":{"_account_id":11604,"name":"sean mooney","email":"smooney@redhat.com","username":"sean-k-mooney"},"change_message_id":"0d6dde7dea42352144e1ef4f85f829235dab3918","unresolved":true,"context_lines":[{"line_number":106,"context_line":"case, baremetal nodes, is updated from Ironic."},{"line_number":107,"context_line":""},{"line_number":108,"context_line":"Where this entire thing begins to loose traction is the original code never"},{"line_number":109,"context_line":"expected the operational environment to ever change, which is simply not"},{"line_number":110,"context_line":"true of any infrastructure operator. Inventory changes. Infrastructure hosting"},{"line_number":111,"context_line":"services and components change, and ultimately need to be changed as an"},{"line_number":112,"context_line":"environment grows and needs to address performance issues or prevent"}],"source_content_type":"text/x-rst","patch_set":1,"id":"90959d15_e140b29a","line":109,"range":{"start_line":109,"start_character":10,"end_line":109,"end_character":51},"updated":"2022-05-17 19:16:54.000000000","message":"that partly true.\nnova never expecte the comptue node to compute service mapping to chagne for any compute service that was hosting a openstack instance.\n\nwe care less about compute-nodes/ironic baremetail servers that are in the aviable state and not assinged ot any instance moving with some caveats.\n\n\nreblaciange without consideration for host aggrates or cells might beak some assumtions the operator made when they deploy nova orginally so that cant be entirely ignored.","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"},{"author":{"_account_id":11604,"name":"sean 
mooney","email":"smooney@redhat.com","username":"sean-k-mooney"},"change_message_id":"0d6dde7dea42352144e1ef4f85f829235dab3918","unresolved":true,"context_lines":[{"line_number":114,"context_line":""},{"line_number":115,"context_line":"The practical net result of the static expectation, is that the code expected"},{"line_number":116,"context_line":"the host the nova-compute service ran on, to remain forever, and be forever"},{"line_number":117,"context_line":"useable and addressable. In a cloud with a high rate of churn, this might"},{"line_number":118,"context_line":"have been slightly acceptable, but experience in the community is physical"},{"line_number":119,"context_line":"nodes tend to be reused and reallocated quickly, *or* exist for quite a long"},{"line_number":120,"context_line":"time as individual instances in Nova. Combine with normal cycling of services,"}],"source_content_type":"text/x-rst","patch_set":1,"id":"b7d6f290_e213f8cd","line":117,"updated":"2022-05-17 19:16:54.000000000","message":"yes. the compute services model the RPC endpoint of the control plane.\nin the case of ironic, the nova-compute service that runs the ironic virt driver is typically deployed on your controllers, and controllers are not expected to be added or removed under normal operation.\n\n\ncompute hosts/ironic nodes are expected to change in normal operations.","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"},{"author":{"_account_id":4393,"name":"Dan Smith","email":"dms@danplanet.com","username":"danms"},"change_message_id":"4a4869f908e3a9cb922fc9ee7ab426cb1150ae7d","unresolved":true,"context_lines":[{"line_number":121,"context_line":"through even basic operations like software updates, a churn was created"},{"line_number":122,"context_line":"which also introduced yet another class of bugs. 
Granted, the static"},{"line_number":123,"context_line":"expectation makes all the sense in the world if the host running the"},{"line_number":124,"context_line":"nova-compute service is a hypervisor with virtual machines."},{"line_number":125,"context_line":""},{"line_number":126,"context_line":"The crux of this issue, is ultimately that the hash ring is used to"},{"line_number":127,"context_line":"determine the distribution of nodes for the internal cache of the driver,"}],"source_content_type":"text/x-rst","patch_set":1,"id":"bd450bb7_094eb7cc","line":124,"updated":"2022-07-05 16:21:24.000000000","message":"I know this is written from the perspective of ironic as normal and nova as weird, but I think it\u0027s a little unfair. Nova is not overly static in this regard, it just looks like that when you decouple the lifecycle of the things being managed from the thing doing the management. nova-compute is intended to run with the instances it manages, and if the host those are all on dies or needs to be recycled, the workloads and the manager of those workloads have their fate intertwined. Driving an external service from nova-compute where the fates are necessarily *not* closely linked is not a failure in the design or implementation of Nova, it\u0027s just using the tool for something other than intended.","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"},{"author":{"_account_id":782,"name":"John Garbutt","email":"john@johngarbutt.com","username":"johngarbutt"},"change_message_id":"b9237f66f38e27de4c135acc596c6043d6c50fce","unresolved":true,"context_lines":[{"line_number":121,"context_line":"through even basic operations like software updates, a churn was created"},{"line_number":122,"context_line":"which also introduced yet another class of bugs. 
Granted, the static"},{"line_number":123,"context_line":"expectation makes all the sense in the world if the host running the"},{"line_number":124,"context_line":"nova-compute service is a hypervisor with virtual machines."},{"line_number":125,"context_line":""},{"line_number":126,"context_line":"The crux of this issue, is ultimately that the hash ring is used to"},{"line_number":127,"context_line":"determine the distribution of nodes for the internal cache of the driver,"}],"source_content_type":"text/x-rst","patch_set":1,"id":"f9fba251_993d3442","line":124,"in_reply_to":"bd450bb7_094eb7cc","updated":"2022-10-14 15:19:31.000000000","message":"A further note on this topic that may help...\n\nThe \"hostname\" here is a configuration option. I am thinking about kolla-ansible/kayobe setting this to the conductor group name (maybe with an index), so I don\u0027t care what host nova-compute runs on.\n\nCONF.host is really just a \"shard\" name, used for the process doing the locking for that subset of resources, and the Rabbitmq topic name for incoming requests for objects owned by that process.\n\n(this is somewhat similar to sean\u0027s point below).","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"},{"author":{"_account_id":11604,"name":"sean mooney","email":"smooney@redhat.com","username":"sean-k-mooney"},"change_message_id":"40550583ab2b1d14819a9bd4c6a5ae02eaf3f022","unresolved":true,"context_lines":[{"line_number":121,"context_line":"through even basic operations like software updates, a churn was created"},{"line_number":122,"context_line":"which also introduced yet another class of bugs. 
Granted, the static"},{"line_number":123,"context_line":"expectation makes all the sense in the world if the host running the"},{"line_number":124,"context_line":"nova-compute service is a hypervisor with virtual machines."},{"line_number":125,"context_line":""},{"line_number":126,"context_line":"The crux of this issue, is ultimately that the hash ring is used to"},{"line_number":127,"context_line":"determine the distribution of nodes for the internal cache of the driver,"}],"source_content_type":"text/x-rst","patch_set":1,"id":"bcd45aad_957b016b","line":124,"in_reply_to":"f9fba251_993d3442","updated":"2022-10-14 16:37:31.000000000","message":"yep","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"},{"author":{"_account_id":4393,"name":"Dan Smith","email":"dms@danplanet.com","username":"danms"},"change_message_id":"4a4869f908e3a9cb922fc9ee7ab426cb1150ae7d","unresolved":true,"context_lines":[{"line_number":125,"context_line":""},{"line_number":126,"context_line":"The crux of this issue, is ultimately that the hash ring is used to"},{"line_number":127,"context_line":"determine the distribution of nodes for the internal cache of the driver,"},{"line_number":128,"context_line":"which also drives scheduling updates, which can result in Compute Nodes,"},{"line_number":129,"context_line":"being deleted in some cases. And when a failure has occured, or something"},{"line_number":130,"context_line":"as mundane as a hostname accidently chagned on the host, the instances"},{"line_number":131,"context_line":"which were deployed at that time are no longer managable."}],"source_content_type":"text/x-rst","patch_set":1,"id":"3f591138_9e59e42d","line":128,"range":{"start_line":128,"start_character":0,"end_line":128,"end_character":36},"updated":"2022-07-05 16:21:24.000000000","message":"I\u0027m not sure what this is referring to. By scheduling updates do you mean updating information that is used for scheduling of new instances? 
Or scheduling of some maintenance calls to ironic?","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"},{"author":{"_account_id":11604,"name":"sean mooney","email":"smooney@redhat.com","username":"sean-k-mooney"},"change_message_id":"0d6dde7dea42352144e1ef4f85f829235dab3918","unresolved":true,"context_lines":[{"line_number":126,"context_line":"The crux of this issue, is ultimately that the hash ring is used to"},{"line_number":127,"context_line":"determine the distribution of nodes for the internal cache of the driver,"},{"line_number":128,"context_line":"which also drives scheduling updates, which can result in Compute Nodes,"},{"line_number":129,"context_line":"being deleted in some cases. And when a failure has occured, or something"},{"line_number":130,"context_line":"as mundane as a hostname accidently chagned on the host, the instances"},{"line_number":131,"context_line":"which were deployed at that time are no longer managable."},{"line_number":132,"context_line":""},{"line_number":133,"context_line":"Again, it cannot be stressed enough, this is a complex set of issues, and"}],"source_content_type":"text/x-rst","patch_set":1,"id":"b161b581_2bf76b9f","line":130,"range":{"start_line":129,"start_character":64,"end_line":130,"end_character":43},"updated":"2022-05-17 19:16:54.000000000","message":"whiel this happens accidentally some thiem we do not support this upstream or downstream in any way.\n\nits one of the cardenel rules of deploying nova that you shoudl not change ever change the host name of a comptute service that is managing an instance.\n\nif you do then your all bets are off and you are left with a broken cloud.\n\nwith the ironic driver you are actully isolated form this more then you are with libvirt since the hypervior_hostnames of the compute_nodes is the ironic baremetal node uuid and the host value of the comptue servce can be set in the nova.conf\n\nso realsitically in say a ooo environment where ooo hardcodes 
[DEFAULT]host\u003d$FQDN\nchanging the host name of a contoler hosting the nova-compute service for ironic will not change teh host value of a compute service unless ooo also updated the nova.conf\n\nfor anyone relaying on the default which uses socket.gethostname() then changing /etc/hostname will break the ironic nova-compute service just as it breaks libvirt\nbut instead of 100s of vms being unmanageble the scale can be much larger with ironic.\n\nthe fix in this case woudl be to revert the hostname change but that is not always possible.","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"},{"author":{"_account_id":11604,"name":"sean mooney","email":"smooney@redhat.com","username":"sean-k-mooney"},"change_message_id":"0d6dde7dea42352144e1ef4f85f829235dab3918","unresolved":true,"context_lines":[{"line_number":146,"context_line":""},{"line_number":147,"context_line":"This one of two fundimental bugs which exist, where in operational terms"},{"line_number":148,"context_line":"one nova-compute service does not take over responsibility for the deployed"},{"line_number":149,"context_line":"baremetal instances of a nova-compute service which has gone down."},{"line_number":150,"context_line":""},{"line_number":151,"context_line":"From an operational perspectie, it is reasonable to expect the service to"},{"line_number":152,"context_line":"do so where it is just a command proxy."}],"source_content_type":"text/x-rst","patch_set":1,"id":"ac7ad948_07d154ca","line":149,"updated":"2022-05-17 19:16:54.000000000","message":"so from a nova perspetive that is not a resonable expection in the general case.\n\nwe would expect the operator to redpeloy the ocmpute service either on the same host if it was fixed or on a different host with the old hostname set in the nova.conf so that the redeploy service could resume manaing the isntance that were mapped to it.\n\nthis is because nova\u0027s expectation is that an isntace instahce.host and instance.hypervior host shall not 
change out side of an api request form a user such as move operation.\n\nthe only move operation we supprot on an instnace who\u0027s compute service is down is evacuate which ironic does not supprot\nhttps://github.com/openstack/nova/blob/master/nova/virt/ironic/driver.py#L160\u003d","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"},{"author":{"_account_id":4393,"name":"Dan Smith","email":"dms@danplanet.com","username":"danms"},"change_message_id":"4a4869f908e3a9cb922fc9ee7ab426cb1150ae7d","unresolved":true,"context_lines":[{"line_number":146,"context_line":""},{"line_number":147,"context_line":"This one of two fundimental bugs which exist, where in operational terms"},{"line_number":148,"context_line":"one nova-compute service does not take over responsibility for the deployed"},{"line_number":149,"context_line":"baremetal instances of a nova-compute service which has gone down."},{"line_number":150,"context_line":""},{"line_number":151,"context_line":"From an operational perspectie, it is reasonable to expect the service to"},{"line_number":152,"context_line":"do so where it is just a command proxy."}],"source_content_type":"text/x-rst","patch_set":1,"id":"11205aef_76555c27","line":149,"in_reply_to":"ac7ad948_07d154ca","updated":"2022-07-05 16:21:24.000000000","message":"Agree with Sean: we expect instances to not move between hosts/nodes unless we do it. That\u0027s because there\u0027s a lot of accounting in various places that has to be updated when that happens.\n\nPerhaps this is something we should put as a top-level \"do we want to change this\" question. Do we ever want nova-computes to silently \"adopt\" instances (and thus the nodes) from other computes? If yes, then there\u0027s a big list of things we\u0027ll need to make sure are handled, and if no, we can rule out this bug as a situation we want to handle. 
Since it would be only useful for ironic, and very complicated, and violate our \"nova doesn\u0027t orchestrate\" tenet, I\u0027d think \"no\".","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"},{"author":{"_account_id":11604,"name":"sean mooney","email":"smooney@redhat.com","username":"sean-k-mooney"},"change_message_id":"0d6dde7dea42352144e1ef4f85f829235dab3918","unresolved":true,"context_lines":[{"line_number":148,"context_line":"one nova-compute service does not take over responsibility for the deployed"},{"line_number":149,"context_line":"baremetal instances of a nova-compute service which has gone down."},{"line_number":150,"context_line":""},{"line_number":151,"context_line":"From an operational perspectie, it is reasonable to expect the service to"},{"line_number":152,"context_line":"do so where it is just a command proxy."},{"line_number":153,"context_line":""},{"line_number":154,"context_line":"It does this because only baremetal nodes which are **not deployed** as"}],"source_content_type":"text/x-rst","patch_set":1,"id":"e7a4e019_2ef505fd","line":151,"range":{"start_line":151,"start_character":8,"end_line":151,"end_character":19},"updated":"2022-05-17 19:16:54.000000000","message":"i would say form a technial perspecitve but operationall that would violate the expections of how instance are managed in a normal nova deployemet so operationaly that would not be a valid expecation in a non ironic deployment.","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"},{"author":{"_account_id":4393,"name":"Dan Smith","email":"dms@danplanet.com","username":"danms"},"change_message_id":"4a4869f908e3a9cb922fc9ee7ab426cb1150ae7d","unresolved":true,"context_lines":[{"line_number":148,"context_line":"one nova-compute service does not take over responsibility for the deployed"},{"line_number":149,"context_line":"baremetal instances of a nova-compute service which has gone down."},{"line_number":150,"context_line":""},{"line_number":151,"context_line":"From an 
operational perspectie, it is reasonable to expect the service to"},{"line_number":152,"context_line":"do so where it is just a command proxy."},{"line_number":153,"context_line":""},{"line_number":154,"context_line":"It does this because only baremetal nodes which are **not deployed** as"}],"source_content_type":"text/x-rst","patch_set":1,"id":"9e91cfcf_97c7b73f","line":151,"range":{"start_line":151,"start_character":8,"end_line":151,"end_character":19},"in_reply_to":"e7a4e019_2ef505fd","updated":"2022-07-05 16:21:24.000000000","message":"Right.","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"},{"author":{"_account_id":11604,"name":"sean mooney","email":"smooney@redhat.com","username":"sean-k-mooney"},"change_message_id":"0d6dde7dea42352144e1ef4f85f829235dab3918","unresolved":true,"context_lines":[{"line_number":159,"context_line":"Effective result is, we *both* loose the deployed instance in record updates"},{"line_number":160,"context_line":"and tracking. The ComputeNode record can then also be deleted, and there is"},{"line_number":161,"context_line":"sadness. Granted, the ComputeNode record is now recreated."},{"line_number":162,"context_line":""},{"line_number":163,"context_line":"Launchpad Bug #1825876"},{"line_number":164,"context_line":"~~~~~~~~~~~~~~~~~~~~~~"},{"line_number":165,"context_line":""}],"source_content_type":"text/x-rst","patch_set":1,"id":"c0e2f471_eb7da7f3","line":162,"updated":"2022-05-17 19:16:54.000000000","message":"from a nova perspecitve you cant recreate a compute service once its deleted\n\neven if it has the same hostname it will have a differnt service uuid and that woudl normally result in a differnt placement resouce provider form the old one.\n\nso form a nova prepecive once the comptue service is delete its gone forever an a new service that just happes to have the same name has been created.\n\n\nit more a kin to inheriting a house after a relitive has passed away. 
there is a new person(service uuid) liveing there but the adddres(hostname) of the house has not changed.\n\nthis is part of the impedence mismatch between ironic an nova.","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"},{"author":{"_account_id":782,"name":"John Garbutt","email":"john@johngarbutt.com","username":"johngarbutt"},"change_message_id":"b9237f66f38e27de4c135acc596c6043d6c50fce","unresolved":true,"context_lines":[{"line_number":159,"context_line":"Effective result is, we *both* loose the deployed instance in record updates"},{"line_number":160,"context_line":"and tracking. The ComputeNode record can then also be deleted, and there is"},{"line_number":161,"context_line":"sadness. Granted, the ComputeNode record is now recreated."},{"line_number":162,"context_line":""},{"line_number":163,"context_line":"Launchpad Bug #1825876"},{"line_number":164,"context_line":"~~~~~~~~~~~~~~~~~~~~~~"},{"line_number":165,"context_line":""}],"source_content_type":"text/x-rst","patch_set":1,"id":"d1ed107e_456121a7","line":162,"in_reply_to":"c0e2f471_eb7da7f3","updated":"2022-10-14 15:19:31.000000000","message":"As a proof point, If you don\u0027t use multiple nova-compute\u0027s for a single ironic partition, you don\u0027t see any of this.","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"},{"author":{"_account_id":11604,"name":"sean mooney","email":"smooney@redhat.com","username":"sean-k-mooney"},"change_message_id":"40550583ab2b1d14819a9bd4c6a5ae02eaf3f022","unresolved":true,"context_lines":[{"line_number":159,"context_line":"Effective result is, we *both* loose the deployed instance in record updates"},{"line_number":160,"context_line":"and tracking. The ComputeNode record can then also be deleted, and there is"},{"line_number":161,"context_line":"sadness. 
Granted, the ComputeNode record is now recreated."},{"line_number":162,"context_line":""},{"line_number":163,"context_line":"Launchpad Bug #1825876"},{"line_number":164,"context_line":"~~~~~~~~~~~~~~~~~~~~~~"},{"line_number":165,"context_line":""}],"source_content_type":"text/x-rst","patch_set":1,"id":"51245de6_d5942f6c","line":162,"in_reply_to":"d1ed107e_456121a7","updated":"2022-10-14 16:37:31.000000000","message":"yep, it sounds like a 1:1 mapping of computes to conductor groups is the way to go.","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"},{"author":{"_account_id":11604,"name":"sean mooney","email":"smooney@redhat.com","username":"sean-k-mooney"},"change_message_id":"0d6dde7dea42352144e1ef4f85f829235dab3918","unresolved":true,"context_lines":[{"line_number":172,"context_line":""},{"line_number":173,"context_line":"This is in part because the hypervisors (Compute Nodes) are pruned out due to"},{"line_number":174,"context_line":"logic in the Compute Manager which evaluates the ComputeNode ``host``, and"},{"line_number":175,"context_line":"if it doesn\u0027t match a working hypervisor, then the ComputeNode object is"},{"line_number":176,"context_line":"deleted from the database. 
Operationally this is a case for an"},{"line_number":177,"context_line":"``available`` baremetal node in Ironic *and* the internal driver cache is"},{"line_number":178,"context_line":"updated as a result of the rebalance operation of a nova-compute service"},{"line_number":179,"context_line":"going down due to timing out it\u0027s heartbeat to the database."}],"source_content_type":"text/x-rst","patch_set":1,"id":"fa030eb9_277712e8","line":176,"range":{"start_line":175,"start_character":42,"end_line":176,"end_character":27},"updated":"2022-05-17 19:16:54.000000000","message":"eek, really? that seems logically wrong to me.\n\nthe only time a compute node should be deleted in the nova database would be if the corresponding ironic baremetal node is deleted from ironic.\n\nin no other case is it valid to delete the compute node, at least that is my initial reaction.\n\n\nthat is based on two rationales.\nfirst, the compute nodes logically map to the ironic baremetal node so that is the source of truth for them existing\nsecond, the other compute service that is deleting them does not \"own\" the compute node currently so it should not do any operation on it unless it has \"adopted\" it via a rebalance. 
so if the physical server still exists in ironic then it should not be deleted.\n\n\nby the way, correct me if i\u0027m wrong, but ironic allows you to specify a uuid when you create a baremetal node in ironic, right?\n\nwithout passing a uuid i would consider deleting a baremetal node and re-adding it to be creating a new baremetal host even if it\u0027s the same physical server.\nthat should result in the old compute node being removed and a new one being created but that should also only be valid if there is no instance associated with it.\n\nif it had an instance and you forced it on the ironic side then really nova should delete the instance too.\n\nif you deleted it and re-added it but used the same uuid i would also expect the compute node to be deleted and recreated with the old uuid, however it would be a new record in the db table.\n\nwe could perhaps special case that but i would not expect us to undelete the row in the db unless we have already implemented driver dependent code for this, which i would find surprising, so again this is like the house analogy:\nfrom a nova point of view that is still a new compute node even if you reused the same uuid on the ironic side.","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"},{"author":{"_account_id":4393,"name":"Dan Smith","email":"dms@danplanet.com","username":"danms"},"change_message_id":"4a4869f908e3a9cb922fc9ee7ab426cb1150ae7d","unresolved":true,"context_lines":[{"line_number":172,"context_line":""},{"line_number":173,"context_line":"This is in part because the hypervisors (Compute Nodes) are pruned out due to"},{"line_number":174,"context_line":"logic in the Compute Manager which evaluates the ComputeNode ``host``, and"},{"line_number":175,"context_line":"if it doesn\u0027t match a working hypervisor, then the ComputeNode object is"},{"line_number":176,"context_line":"deleted from the database. 
Operationally this is a case for an"},{"line_number":177,"context_line":"``available`` baremetal node in Ironic *and* the internal driver cache is"},{"line_number":178,"context_line":"updated as a result of the rebalance operation of a nova-compute service"},{"line_number":179,"context_line":"going down due to timing out it\u0027s heartbeat to the database."}],"source_content_type":"text/x-rst","patch_set":1,"id":"1798f4d7_0c8f9a38","line":176,"range":{"start_line":175,"start_character":42,"end_line":176,"end_character":27},"in_reply_to":"c1a12f43_b0adcb0d","updated":"2022-07-05 16:21:24.000000000","message":"Right, so I\u0027m confused about what is being asserted in the spec here. Is Julia asserting that we are deleting ComputeNode objects when we shouldn\u0027t be? Or asserting that we *should* be but *aren\u0027t*? The referenced bug doesn\u0027t really tell me much.","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"},{"author":{"_account_id":4690,"name":"melanie witt","display_name":"melwitt","email":"melwittt@gmail.com","username":"melwitt"},"change_message_id":"2240ce3040078accedca8f75dd2d8bb495ee0527","unresolved":true,"context_lines":[{"line_number":172,"context_line":""},{"line_number":173,"context_line":"This is in part because the hypervisors (Compute Nodes) are pruned out due to"},{"line_number":174,"context_line":"logic in the Compute Manager which evaluates the ComputeNode ``host``, and"},{"line_number":175,"context_line":"if it doesn\u0027t match a working hypervisor, then the ComputeNode object is"},{"line_number":176,"context_line":"deleted from the database. 
Operationally this is a case for an"},{"line_number":177,"context_line":"``available`` baremetal node in Ironic *and* the internal driver cache is"},{"line_number":178,"context_line":"updated as a result of the rebalance operation of a nova-compute service"},{"line_number":179,"context_line":"going down due to timing out it\u0027s heartbeat to the database."}],"source_content_type":"text/x-rst","patch_set":1,"id":"c1a12f43_b0adcb0d","line":176,"range":{"start_line":175,"start_character":42,"end_line":176,"end_character":27},"in_reply_to":"fa030eb9_277712e8","updated":"2022-05-25 04:54:05.000000000","message":"Yeah, there are only two ways a ComputeNode objects (and corresponding database row) can be deleted in nova.\n\n1. If the DELETE /os-services/{service_id} API is called by a user (admin). This means the nova-compute Service object will be deleted [1] and the matching ComputeNode object will be deleted [2]. Note that in this case, the ComputeNode does not map to an ironic node -- it is the ComputeNode representing the nova-compute service being deleted.\n\n2. If an ironic node is deleted. 
Nova will detect that its list of ComputeNode objects matching the ironic node list has a mismatch and the ironic driver will delete the ComputeNode that has no ironic node anymore and say, \"Deleting orphan compute node\".\n\n[1] https://github.com/openstack/nova/blob/61b161eeaaed9e228b735f7c6793e7fc5bf1a830/nova/api/openstack/compute/services.py#L340\n[2] https://github.com/openstack/nova/blob/61b161eeaaed9e228b735f7c6793e7fc5bf1a830/nova/db/main/api.py#L390","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"},{"author":{"_account_id":11604,"name":"sean mooney","email":"smooney@redhat.com","username":"sean-k-mooney"},"change_message_id":"0d6dde7dea42352144e1ef4f85f829235dab3918","unresolved":true,"context_lines":[{"line_number":202,"context_line":"The old hostname no longer exists, and rightfully so, would only create"},{"line_number":203,"context_line":"confusion if someone were to find it and go look for it in a Configuration"},{"line_number":204,"context_line":"Management Data Base or Hardware Inventory."},{"line_number":205,"context_line":""},{"line_number":206,"context_line":"This ultimately is both an operator and user experience headache, and a"},{"line_number":207,"context_line":"maintenance risk as right now, the only way to recover the ability to manage"},{"line_number":208,"context_line":"the deployed instances, is to update the host field on the records."}],"source_content_type":"text/x-rst","patch_set":1,"id":"d7597663_aa44527d","line":205,"updated":"2022-05-17 19:16:54.000000000","message":"right, so if you replace a failed libvirt host we would expect you to evacuate all the instances first.\n\nif they are on shared storage nova will just spawn the vm on a different host with the same ports, volumes, et cetera.\n\nif it\u0027s not on shared storage we will rebuild the vm from its current image, which will lose data, but if the node is dead that\u0027s the best we can do.\n\nironic does not support evacuate so you cannot perform an instance action on the instance 
to evacuate it, but that is also semantically wrong\n\nsince really the instance and compute node are still present but the compute service is down, so you want to make it manageable via one of the other compute services.","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"},{"author":{"_account_id":782,"name":"John Garbutt","email":"john@johngarbutt.com","username":"johngarbutt"},"change_message_id":"b9237f66f38e27de4c135acc596c6043d6c50fce","unresolved":true,"context_lines":[{"line_number":202,"context_line":"The old hostname no longer exists, and rightfully so, would only create"},{"line_number":203,"context_line":"confusion if someone were to find it and go look for it in a Configuration"},{"line_number":204,"context_line":"Management Data Base or Hardware Inventory."},{"line_number":205,"context_line":""},{"line_number":206,"context_line":"This ultimately is both an operator and user experience headache, and a"},{"line_number":207,"context_line":"maintenance risk as right now, the only way to recover the ability to manage"},{"line_number":208,"context_line":"the deployed instances, is to update the host field on the records."}],"source_content_type":"text/x-rst","patch_set":1,"id":"735a5d8b_340a9e49","line":205,"in_reply_to":"acb0fb3f_3dfec0d5","updated":"2022-10-14 15:19:31.000000000","message":"+1 for tweaking conf.host.\n\nI would recommend using the ironic conductor group name, i.e. 
the partition key.","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"},{"author":{"_account_id":4393,"name":"Dan Smith","email":"dms@danplanet.com","username":"danms"},"change_message_id":"4a4869f908e3a9cb922fc9ee7ab426cb1150ae7d","unresolved":true,"context_lines":[{"line_number":202,"context_line":"The old hostname no longer exists, and rightfully so, would only create"},{"line_number":203,"context_line":"confusion if someone were to find it and go look for it in a Configuration"},{"line_number":204,"context_line":"Management Data Base or Hardware Inventory."},{"line_number":205,"context_line":""},{"line_number":206,"context_line":"This ultimately is both an operator and user experience headache, and a"},{"line_number":207,"context_line":"maintenance risk as right now, the only way to recover the ability to manage"},{"line_number":208,"context_line":"the deployed instances, is to update the host field on the records."}],"source_content_type":"text/x-rst","patch_set":1,"id":"acb0fb3f_3dfec0d5","line":205,"in_reply_to":"d7597663_aa44527d","updated":"2022-07-05 16:21:24.000000000","message":"I think the problem there is that the controller replacement should be making sure that the new controller has the same hostname as the one being replaced. That\u0027s not normally a problem, but if you\u0027re running nova-compute on your controllers for ironic, then the hostname has to be the same. The fact and reason that it\u0027s not happening in this case is really tripleo-specific and not necessarily a nova problem, it\u0027s just a rule that needs to be respected.\n\nOne way to solve that generically would be to just advise people as a best-practice to set their CONF.hostname for ironic computes to some token (i.e. \"ironic-1\", \"ironic-3\", etc). That way if you need to re-deploy an ironic compute somewhere else, the identity (and thus the instances for which it is responsible) are restored. 
I would say \"database surgery\" is not the proper solution to that problem either.","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"},{"author":{"_account_id":11604,"name":"sean mooney","email":"smooney@redhat.com","username":"sean-k-mooney"},"change_message_id":"0d6dde7dea42352144e1ef4f85f829235dab3918","unresolved":true,"context_lines":[{"line_number":237,"context_line":" manual intervention after the fact."},{"line_number":238,"context_line":"* As an operator, I want to be able to scale up/down my Ironic nova-compute"},{"line_number":239,"context_line":" services and replace a failed Ironic nova-compute service in operation"},{"line_number":240,"context_line":" without manual intervention."},{"line_number":241,"context_line":"* As an end-user, I expect to be able to continue to manage my Bare Metal"},{"line_number":242,"context_line":" instances and remain unaware of any outage within my provider\u0027s cloud."},{"line_number":243,"context_line":"* As an operator, I don\u0027t want to have to deal with extra fallout from a"}],"source_content_type":"text/x-rst","patch_set":1,"id":"df7a715c_fa59d16d","line":240,"updated":"2022-05-17 19:16:54.000000000","message":"well thats kind fo contradicotry \n\nreplace a failed Ironic nova-compute service in operation without manual intervention\n\nthe replacment of a server is a manual intervention.","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"},{"author":{"_account_id":4393,"name":"Dan Smith","email":"dms@danplanet.com","username":"danms"},"change_message_id":"4a4869f908e3a9cb922fc9ee7ab426cb1150ae7d","unresolved":true,"context_lines":[{"line_number":237,"context_line":" manual intervention after the fact."},{"line_number":238,"context_line":"* As an operator, I want to be able to scale up/down my Ironic nova-compute"},{"line_number":239,"context_line":" services and replace a failed Ironic nova-compute service in operation"},{"line_number":240,"context_line":" without manual 
intervention."},{"line_number":241,"context_line":"* As an end-user, I expect to be able to continue to manage my Bare Metal"},{"line_number":242,"context_line":" instances and remain unaware of any outage within my provider\u0027s cloud."},{"line_number":243,"context_line":"* As an operator, I don\u0027t want to have to deal with extra fallout from a"}],"source_content_type":"text/x-rst","patch_set":1,"id":"d50558f4_6aa009a6","line":240,"in_reply_to":"df7a715c_fa59d16d","updated":"2022-07-05 16:21:24.000000000","message":"Yeah, I\u0027m not sure I get that. I assume it means \"replace a nova-compute without doing a bunch of other stuff I don\u0027t think I should have to do\". That is, of course, quite doable by replacing the compute with one of the same hostname (or identifier, per above).","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"},{"author":{"_account_id":11604,"name":"sean mooney","email":"smooney@redhat.com","username":"sean-k-mooney"},"change_message_id":"0d6dde7dea42352144e1ef4f85f829235dab3918","unresolved":true,"context_lines":[{"line_number":249,"context_line":""},{"line_number":250,"context_line":"Or perhaps, multiple proposed possibilities."},{"line_number":251,"context_line":""},{"line_number":252,"context_line":"Just fix the fields as we \"rebalanance* as a result of the hash ring changing"},{"line_number":253,"context_line":"-----------------------------------------------------------------------------"},{"line_number":254,"context_line":""},{"line_number":255,"context_line":"In 2019, this was"}],"source_content_type":"text/x-rst","patch_set":1,"id":"dbc15207_95747b3d","line":252,"range":{"start_line":252,"start_character":0,"end_line":252,"end_character":77},"updated":"2022-05-17 19:16:54.000000000","message":"by the way the idea that the hash ring is the single source of truth is also an impedence mismatch.\n\nnoting tat the driver level is ment to alter the behavior of the comptue manager.\nthis currently works because the comptue 
manager gets the set of compute nodes from the driver in the form of the list of hypervisors.\n\nso the driver can modify the behavior of the compute manager by altering the set of hypervisors that it returns as well as modifying things like the resource provider trees.\n\nbut the nova database is conceptually the single source of truth for that mapping between compute node and service.\n\nwith the ironic driver that is inverted and ironic, via the hash ring, is actually defining the single source of truth, which breaks nova\u0027s mental model to a degree.\n\n\nit was something we elected to do a long time ago but it\u0027s not how nova in general expects this to work.\n\nwe do not expect the service to compute node mapping to be a product of an in-memory dynamic hash ring implementation. we expect the compute service to only change as a result of an operator deploying a new service or calling the api.","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"},{"author":{"_account_id":9708,"name":"Balazs Gibizer","display_name":"gibi","email":"gibizer@gmail.com","username":"gibi"},"change_message_id":"c74549169a43f43bec32845be3e25996d0126d00","unresolved":true,"context_lines":[{"line_number":249,"context_line":""},{"line_number":250,"context_line":"Or perhaps, multiple proposed possibilities."},{"line_number":251,"context_line":""},{"line_number":252,"context_line":"Just fix the fields as we \"rebalanance* as a result of the hash ring changing"},{"line_number":253,"context_line":"-----------------------------------------------------------------------------"},{"line_number":254,"context_line":""},{"line_number":255,"context_line":"In 2019, this was"}],"source_content_type":"text/x-rst","patch_set":1,"id":"d99f5d79_c5be885e","line":252,"range":{"start_line":252,"start_character":0,"end_line":252,"end_character":77},"in_reply_to":"dbc15207_95747b3d","updated":"2022-07-05 14:33:01.000000000","message":"So we have two sources of truth, i) nova DB, ii) hashring in ironic-driver. 
During rebalance ii) is changed but i) is not. So the system becomes inconsistent.\n\nSo either:\na) ii) should not change in a rebalance\nOr \nb) i) also needs to change during a rebalance to match with ii)\n\nProbably a) is not possible, as the rebalance is a must if we lose a nova-compute service.\n\nSo we need to make b) work. To do that we need to move the instances and compute node objects to a different nova-compute service:\n\n* As Sean noted above the only way to move an instance away from a dead nova-compute service is to evacuate it. But evacuate also means moving the instance to a different hypervisor / compute node. In case of ironic rebalance only the nova-compute service needs to change, not the compute node.\n\n* Today there is no way in nova to move a compute node to a different nova-compute service.","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"},{"author":{"_account_id":11604,"name":"sean mooney","email":"smooney@redhat.com","username":"sean-k-mooney"},"change_message_id":"0d6dde7dea42352144e1ef4f85f829235dab3918","unresolved":true,"context_lines":[{"line_number":274,"context_line":" the structure actually changes and only process what we \"own\""},{"line_number":275,"context_line":"2) Correct the caching issues that kept nodes sticky to the cache and would"},{"line_number":276,"context_line":" result in them moving after a structural change to the environment has"},{"line_number":277,"context_line":" occured. i.e. 
Instance deployed while compute was down, cache is influenced"},{"line_number":278,"context_line":" by that \"deployed while down\" relationship, and as soon as the Bare Metal"},{"line_number":279,"context_line":" Node is in ``available`` state to be reallocated, it would stay in one"},{"line_number":280,"context_line":" ``nova-compute`` service\u0027s node\u0027s cache until the next rebalance/cache"}],"source_content_type":"text/x-rst","patch_set":1,"id":"3d36f457_fbd389e0","line":277,"range":{"start_line":277,"start_character":17,"end_line":277,"end_character":58},"updated":"2022-05-17 19:16:54.000000000","message":"so that should not be possible today\nsince the scheduler would not select the compute service if it was down.\n\nunless this is specifically a case where the instance request was sent to ironic and then the compute service failed while ironic was provisioning it, but that is not a case of the instance being deployed while the compute was down.\n\nthe operation started when it was up.","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"},{"author":{"_account_id":11604,"name":"sean mooney","email":"smooney@redhat.com","username":"sean-k-mooney"},"change_message_id":"0d6dde7dea42352144e1ef4f85f829235dab3918","unresolved":true,"context_lines":[{"line_number":323,"context_line":" .. NOTE::"},{"line_number":324,"context_line":" This was raised in high bandwith calls to discuss concerns but was"},{"line_number":325,"context_line":" not captured. 
If a nova contributor could help clarify, it would be"},{"line_number":326,"context_line":" greatly appreciated."},{"line_number":327,"context_line":""},{"line_number":328,"context_line":"* Automatic reconcilation seems to generally go against Nova maintainers"},{"line_number":329,"context_line":" operational use model where for any such action, which is that an \"instance\""}],"source_content_type":"text/x-rst","patch_set":1,"id":"91a1a9bb_d328142e","line":326,"updated":"2022-05-17 19:16:54.000000000","message":"there are at least two related concerns around aggregates and cells.\n\nin nova\u0027s api db, compute services and, transitively, all compute nodes \nmanaged by that compute service are mapped to a cell.\nan instance is also mapped to a cell based on the compute node it is assigned to.\n\nthe hash ring may not move compute nodes or instances between nova cells.\ncurrently, the driver does not know what cell it is in nor does the hash ring.\n\nif the driver/hash ring naively rebalances the compute nodes without considering the cell boundaries then the node will still be unmanageable as the request will be sent to the wrong message bus.\n\n\nas a more user-facing concern, all compute services and hosts beneath them are mapped to an Availability Zone.\n\nAvailability zones are also defined as metadata on host aggregates in the API DB, not the cell DB.\n\ncompute services are not allowed to upcall to the API DB and individual compute services/drivers do not know what AZ they may or may not be part of.\n\nhost aggregates are mapped to compute services.\na compute service may have many host aggregates associated with it for things like the tenant isolation filter or many other scheduling reasons.\n\nsince the hash-ring/driver knows nothing about the relationship between the compute service and the aggregates it\u0027s a member of, it cannot safely rebalance as it might move the compute to a different az or host aggregate that would violate the tenant 
isolation or other requirements that were first enforced when scheduling.","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"},{"author":{"_account_id":4393,"name":"Dan Smith","email":"dms@danplanet.com","username":"danms"},"change_message_id":"4a4869f908e3a9cb922fc9ee7ab426cb1150ae7d","unresolved":true,"context_lines":[{"line_number":323,"context_line":" .. NOTE::"},{"line_number":324,"context_line":" This was raised in high bandwith calls to discuss concerns but was"},{"line_number":325,"context_line":" not captured. If a nova contributor could help clarify, it would be"},{"line_number":326,"context_line":" greatly appreciated."},{"line_number":327,"context_line":""},{"line_number":328,"context_line":"* Automatic reconcilation seems to generally go against Nova maintainers"},{"line_number":329,"context_line":" operational use model where for any such action, which is that an \"instance\""}],"source_content_type":"text/x-rst","patch_set":1,"id":"073177af_a233278c","line":326,"in_reply_to":"91a1a9bb_d328142e","updated":"2022-07-05 16:21:24.000000000","message":"Right, there\u0027s a whole host of \"what goes where\" information that is maintained at the top level and enforced by scheduling and/or the APIs. That stuff may also include some things like disabled cells and other scheduling policy, in addition to the things Sean noted here.\n\nComputes are already too complicated for what they should be doing, and pushing a top-level view of the whole deployment\u0027s architecture and isolation policy to them is not reasonable (nor feasible because of cell and api-db separations). 
This is one of my primary concerns around the computes either unilaterally adopting things they think they need to manage, or attempting to collude to make that decision, not knowing if there\u0027s another cell of computes that might be having a parallel election.","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"},{"author":{"_account_id":11604,"name":"sean mooney","email":"smooney@redhat.com","username":"sean-k-mooney"},"change_message_id":"0d6dde7dea42352144e1ef4f85f829235dab3918","unresolved":true,"context_lines":[{"line_number":334,"context_line":" ``nova-compute`` process workload and responsibilities would require"},{"line_number":335,"context_line":" substantial manual intervention to have an even distribution of"},{"line_number":336,"context_line":" baremetal hardware nodes associated to ``nova-compute`` processes."},{"line_number":337,"context_line":""},{"line_number":338,"context_line":" * Additionally, formally encoding logic to treat an ``Instance.host`` field"},{"line_number":339,"context_line":" to influence node caching, ultimately means we are not actually fixing"},{"line_number":340,"context_line":" the issues related to cache handling and assignment."}],"source_content_type":"text/x-rst","patch_set":1,"id":"9a0d4c6d_edbbeb63","line":337,"updated":"2022-05-17 19:16:54.000000000","message":"this is not about instances or failed compute nodes. the compute nodes (physical ironic-managed servers) are still operational as long as they are manageable by ironic. this is about the compute services and compute node management.\nso this can\u0027t be expressed as an instance action without creating a new instance action that is special to ironic.\n\n\ni.e. we can\u0027t overload evacuate since evacuate would be something that could be supported for ironic independently, where we reprovision a new server if the motherboard of the current one explodes. 
either preserving data in the boot-from-iscsi case or rebuilding from the original image as we do with vms when not on shared storage.\n\n\nthere is no management api for compute service to compute node operations today in nova.","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"},{"author":{"_account_id":9708,"name":"Balazs Gibizer","display_name":"gibi","email":"gibizer@gmail.com","username":"gibi"},"change_message_id":"3fac79211aa7a77717f56e7660f7827d28052eb8","unresolved":true,"context_lines":[{"line_number":334,"context_line":" ``nova-compute`` process workload and responsibilities would require"},{"line_number":335,"context_line":" substantial manual intervention to have an even distribution of"},{"line_number":336,"context_line":" baremetal hardware nodes associated to ``nova-compute`` processes."},{"line_number":337,"context_line":""},{"line_number":338,"context_line":" * Additionally, formally encoding logic to treat an ``Instance.host`` field"},{"line_number":339,"context_line":" to influence node caching, ultimately means we are not actually fixing"},{"line_number":340,"context_line":" the issues related to cache handling and assignment."}],"source_content_type":"text/x-rst","patch_set":1,"id":"8b6ae958_62fbf828","line":337,"in_reply_to":"055e770f_f6664505","updated":"2022-07-07 07:46:26.000000000","message":"\u003e \u003e \u003e there is not management api for comptue service to compute node operations today in nova.\n\u003e \u003e \n\u003e \u003e I think this is the missing piece. What we would need is way to ask nova to move a compute node to another nova-compute service (and move the instances on that compute node along).\n\u003e \n\u003e So you want to add an API operation that is effectively a \"special ironic thing\"? I\u0027m not sure I can get behind that ... :)\n\nI totally get it. 
I don\u0027t like it either, but also I don\u0027t see other way to trigger a move in a way that the actual update can be triggered from the compute but the API level constraints are also enforced.\n\n\u003e \n\u003e I think we\u0027d also have to disallow that based on virt type from the API so as not to let people be arbitrarily re-assigning their libvirt compute nodes to other services.\n\u003e \n\nYep. I assume we have a lot of API requests that is not supported for ironic, now there will be one that is only supported for ironic virt driver.\n\n\u003e \u003e So I think we need to add a new POST /os-hypervisors/\u003ccompute-node-uuid\u003e API that allows changing the [\"service\"][\"host\"] field of the hypervisor (i.e. compute node) to trigger the move. Then such action can be handled by the nova-api or nova-superconductor service on the API level to check preconditions (e.g. aggregates, cells, AZs constraints). Then forward the request to the nova cell conductor in the cell where the move happens (as it cannot happen between cells). Then the nova-conductor in the cell can do the DB update under a lock, to prevent race conditions between competing nova-compute services claiming ownership over the same compute node.\n\u003e \n\u003e What lock? The two computes involved won\u0027t be collaborating on a lock while running their resource tracker loops, and the conductor can\u0027t really lock anything about the compute nodes from other conductors. It can make an atomic reassignment of the host in the DB, but that\u0027s about it. Is that what you mean?\n\u003e \n\nGood point. We can only do atomic reassignment, not a real lock. 
\n\n\u003e If so, I still think there\u0027s a weird \"eventual consistency\" situation that happens when you do that in the middle of resource manager updates by either of the two related computes.\n\u003e \n\nCan we reject updates that are coming from a different nova-compute service to a compute node already assigned to another nova-compute? It would not totally remove the eventual consistency situation but shrink the window. \n\n\u003e \u003e I know that this means a lot more work as it needs a new REST API and RPC between the API and the Cell layer. Also this might cause scaling issues after a cold start when a lot of moves needed.\n\u003e \n\u003e There\u0027s a lot involved here, I think, especially if you assume this API would be called en masse after some event to reassign a bunch of things. You\u0027d have many conductors reassigning compute node pairs between compute hosts in parallel. You\u0027d certainly have to disable any of the hash ring stuff during that time so that computes aren\u0027t spewing results or fighting you. Or are you expecting to remove the hash ring entirely in this scenario?\n\nI don\u0027t think we can remove the hashring, as that is the source of the actual reassignment. That has the logic to assign a compute node to different compute service. If we remove the ring then there won\u0027t be any reassignment ever. I think if we want to avoid overload then we should limit the number of nova-compute services brought up at a given time. 
This will not decrease the total amount of reassignment probably, but it will spread them over time.","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"},{"author":{"_account_id":4393,"name":"Dan Smith","email":"dms@danplanet.com","username":"danms"},"change_message_id":"4a4869f908e3a9cb922fc9ee7ab426cb1150ae7d","unresolved":true,"context_lines":[{"line_number":334,"context_line":" ``nova-compute`` process workload and responsibilities would require"},{"line_number":335,"context_line":" substantial manual intervention to have an even distribution of"},{"line_number":336,"context_line":" baremetal hardware nodes associated to ``nova-compute`` processes."},{"line_number":337,"context_line":""},{"line_number":338,"context_line":" * Additionally, formally encoding logic to treat an ``Instance.host`` field"},{"line_number":339,"context_line":" to influence node caching, ultimately means we are not actually fixing"},{"line_number":340,"context_line":" the issues related to cache handling and assignment."}],"source_content_type":"text/x-rst","patch_set":1,"id":"055e770f_f6664505","line":337,"in_reply_to":"5c175453_48d74a9d","updated":"2022-07-05 16:21:24.000000000","message":"\u003e \u003e there is not management api for comptue service to compute node operations today in nova.\n\u003e \n\u003e I think this is the missing piece. What we would need is way to ask nova to move a compute node to another nova-compute service (and move the instances on that compute node along).\n\nSo you want to add an API operation that is effectively a \"special ironic thing\"? I\u0027m not sure I can get behind that ... :)\n\nI think we\u0027d also have to disallow that based on virt type from the API so as not to let people be arbitrarily re-assigning their libvirt compute nodes to other services.\n\n\u003e So I think we need to add a new POST /os-hypervisors/\u003ccompute-node-uuid\u003e API that allows changing the [\"service\"][\"host\"] field of the hypervisor (i.e. 
compute node) to trigger the move. Then such action can be handled by the nova-api or nova-superconductor service on the API level to check preconditions (e.g. aggregates, cells, AZs constraints). Then forward the request to the nova cell conductor in the cell where the move happens (as it cannot happen between cells). Then the nova-conductor in the cell can do the DB update under a lock, to prevent race conditions between competing nova-compute services claiming ownership over the same compute node.\n\nWhat lock? The two computes involved won\u0027t be collaborating on a lock while running their resource tracker loops, and the conductor can\u0027t really lock anything about the compute nodes from other conductors. It can make an atomic reassignment of the host in the DB, but that\u0027s about it. Is that what you mean?\n\nIf so, I still think there\u0027s a weird \"eventual consistency\" situation that happens when you do that in the middle of resource manager updates by either of the two related computes.\n\n\u003e I know that this means a lot more work as it needs a new REST API and RPC between the API and the Cell layer. Also this might cause scaling issues after a cold start when a lot of moves needed.\n\nThere\u0027s a lot involved here, I think, especially if you assume this API would be called en masse after some event to reassign a bunch of things. You\u0027d have many conductors reassigning compute node pairs between compute hosts in parallel. You\u0027d certainly have to disable any of the hash ring stuff during that time so that computes aren\u0027t spewing results or fighting you. 
Or are you expecting to remove the hash ring entirely in this scenario?","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"},{"author":{"_account_id":9708,"name":"Balazs Gibizer","display_name":"gibi","email":"gibizer@gmail.com","username":"gibi"},"change_message_id":"c238244046e72807d5f46a48753ffe42d9ee7ec2","unresolved":true,"context_lines":[{"line_number":334,"context_line":" ``nova-compute`` process workload and responsibilities would require"},{"line_number":335,"context_line":" substantial manual intervention to have an even distribution of"},{"line_number":336,"context_line":" baremetal hardware nodes associated to ``nova-compute`` processes."},{"line_number":337,"context_line":""},{"line_number":338,"context_line":" * Additionally, formally encoding logic to treat an ``Instance.host`` field"},{"line_number":339,"context_line":" to influence node caching, ultimately means we are not actually fixing"},{"line_number":340,"context_line":" the issues related to cache handling and assignment."}],"source_content_type":"text/x-rst","patch_set":1,"id":"290124c7_2a5d68b7","line":337,"in_reply_to":"88df4bdb_6045f4e8","updated":"2022-07-19 10:36:32.000000000","message":"\u003e \u003e Can we reject updates that are coming from a different nova-compute service to a compute node already assigned to another nova-compute? It would not totally remove the eventual consistency situation but shrink the window.\n\u003e \n\u003e Reject from where? In conductor or placement? Placement doesn\u0027t know about the assignment (right?) and conductor would have to effectively look up the assignment before each write or whatever. Seems sketchy to me...\n\nI thought about rejecting it from the conductor perspective. But you have a point that the compute also updates Placement not just the conductor.\n\n\u003e \n\u003e \u003e I don\u0027t think we can remove the hashring, as that is the source of the actual reassignment. 
That has the logic to assign a compute node to different compute service. If we remove the ring then there won\u0027t be any reassignment ever. I think if we want to avoid overload then we should limit the number of nova-compute services brought up at a given time. This will not decrease the total amount of reassignment probably, but it will spread them over time.\n\u003e \n\u003e What I meant was remove the management of the hash ring from the computes, in terms of responsibility. By \"remove the hash ring\" I meant an arrangement where computes just manage nodes that are assigned to their service and something else decides to change those assignments. Still concerned about who does that and how much ironic information they need to have.\n\nAhh I see now. Yes, that could move the problem of reassignment to a place where it is easier to implement. I.e. if we move the hashring to the nova API layer (I know we don\u0027t want to as that would be ironic specific code) then the reassignment code will have the information about cells without any upcalls.","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"},{"author":{"_account_id":4393,"name":"Dan Smith","email":"dms@danplanet.com","username":"danms"},"change_message_id":"8b592cfb3389cb9303bbde2981ab9e1ac92256f3","unresolved":true,"context_lines":[{"line_number":334,"context_line":" ``nova-compute`` process workload and responsibilities would require"},{"line_number":335,"context_line":" substantial manual intervention to have an even distribution of"},{"line_number":336,"context_line":" baremetal hardware nodes associated to ``nova-compute`` processes."},{"line_number":337,"context_line":""},{"line_number":338,"context_line":" * Additionally, formally encoding logic to treat an ``Instance.host`` field"},{"line_number":339,"context_line":" to influence node caching, ultimately means we are not actually fixing"},{"line_number":340,"context_line":" the issues related to cache handling and 
assignment."}],"source_content_type":"text/x-rst","patch_set":1,"id":"88df4bdb_6045f4e8","line":337,"in_reply_to":"8b6ae958_62fbf828","updated":"2022-07-12 17:10:33.000000000","message":"\u003e Yep. I assume we have a lot of API requests that is not supported for ironic, now there will be one that is only supported for ironic virt driver.\n\nYeah, but an array of features that are supported by some virt drivers but not all is different than a completely bespoke thing that will only ever be supported by one. When we\u0027re looking to plumb things through, even if we\u0027re maybe only ever going to support it for one, I think it\u0027s important to look at what other drivers might need if they wanted to be able to support the feature in the future. \n\n\u003e Can we reject updates that are coming from a different nova-compute service to a compute node already assigned to another nova-compute? It would not totally remove the eventual consistency situation but shrink the window.\n\nReject from where? In conductor or placement? Placement doesn\u0027t know about the assignment (right?) and conductor would have to effectively look up the assignment before each write or whatever. Seems sketchy to me...\n\n\u003e I don\u0027t think we can remove the hashring, as that is the source of the actual reassignment. That has the logic to assign a compute node to different compute service. If we remove the ring then there won\u0027t be any reassignment ever. I think if we want to avoid overload then we should limit the number of nova-compute services brought up at a given time. This will not decrease the total amount of reassignment probably, but it will spread them over time.\n\nWhat I meant was remove the management of the hash ring from the computes, in terms of responsibility. By \"remove the hash ring\" I meant an arrangement where computes just manage nodes that are assigned to their service and something else decides to change those assignments. 
Still concerned about who does that and how much ironic information they need to have.\n\nI know this is all focused on changes to Nova to make it work better for Ironic, but have we considered changes to Ironic to make it more hypervisor-y? If ironic exposed nodes in groups and each compute just managed whatever its assigned group is, then maybe ironic could help by offlining a node, removing it from a group, reassigning, etc. Some combination of nova changes and ironic changes seem like they might be more palatable than just hacking in not-a-hypervisor changes into all layers of nova.","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"},{"author":{"_account_id":9708,"name":"Balazs Gibizer","display_name":"gibi","email":"gibizer@gmail.com","username":"gibi"},"change_message_id":"c74549169a43f43bec32845be3e25996d0126d00","unresolved":true,"context_lines":[{"line_number":334,"context_line":" ``nova-compute`` process workload and responsibilities would require"},{"line_number":335,"context_line":" substantial manual intervention to have an even distribution of"},{"line_number":336,"context_line":" baremetal hardware nodes associated to ``nova-compute`` processes."},{"line_number":337,"context_line":""},{"line_number":338,"context_line":" * Additionally, formally encoding logic to treat an ``Instance.host`` field"},{"line_number":339,"context_line":" to influence node caching, ultimately means we are not actually fixing"},{"line_number":340,"context_line":" the issues related to cache handling and assignment."}],"source_content_type":"text/x-rst","patch_set":1,"id":"5c175453_48d74a9d","line":337,"in_reply_to":"9a0d4c6d_edbbeb63","updated":"2022-07-05 14:33:01.000000000","message":"\u003e this is not about instances. or failed compute nodes. 
the compute nodes (physical ironic-managed servers) are still operational as long as they are manageable by ironic. this is about the compute services and compute node management.\n\u003e so this can\u0027t be expressed as an instance action without creating a new instance action that is special to ironic.\n\u003e \n\n+1\n\n\u003e \n\u003e i.e. we can\u0027t overload evacuate since evacuate would be something that could be supported for ironic independently, where we reprovision a new server if the motherboard of the current one explodes, either preserving data in the boot from iscsi case or rebuilding from the original image as we do with vms when not on shared storage.\n\u003e \n\u003e \n\u003e there is no management api for compute service to compute node operations today in nova.\n\nI think this is the missing piece. What we would need is a way to ask nova to move a compute node to another nova-compute service (and move the instances on that compute node along). As this potentially depends on nova API level information (host aggregates, AZs, cells) such move operation should be checked on the nova API DB level (nova-api or nova-superconductor level). This also means that the actual move needs to be triggered from above (i.e. from REST API) as from below (i.e. nova-compute) we have no way to talk to the API level (as we don\u0027t want to add more up-calls). \n\n(This is probably similar to Option: \"Make the nova conductor handle assignments and assign nodes\")\n\nSo I think we need to add a new POST /os-hypervisors/\u003ccompute-node-uuid\u003e API that allows changing the [\"service\"][\"host\"] field of the hypervisor (i.e. compute node) to trigger the move. Then such action can be handled by the nova-api or nova-superconductor service on the API level to check preconditions (e.g. aggregates, cells, AZs constraints). Then forward the request to the nova cell conductor in the cell where the move happens (as it cannot happen between cells). 
Then the nova-conductor in the cell can do the DB update under a lock, to prevent race conditions between competing nova-compute services claiming ownership over the same compute node.\n\nI know that this means a lot more work as it needs a new REST API and RPC between the API and the Cell layer. Also this might cause scaling issues after a cold start when a lot of moves are needed.","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"},{"author":{"_account_id":11604,"name":"sean mooney","email":"smooney@redhat.com","username":"sean-k-mooney"},"change_message_id":"0d6dde7dea42352144e1ef4f85f829235dab3918","unresolved":true,"context_lines":[{"line_number":342,"context_line":" .. NOTE:: Additional concern, when discussed, it was mentioned that a new"},{"line_number":343,"context_line":" instnace is created. Is the instance_id static? If not, evacuate"},{"line_number":344,"context_line":" would be a hard break because instance_id is static on ironic side"},{"line_number":345,"context_line":" while the node is provisioned."},{"line_number":346,"context_line":""},{"line_number":347,"context_line":"* Database locking has been raised as a general concern. As updates would be"},{"line_number":348,"context_line":" row level changes in the tables, and row level locking applies in that case,"}],"source_content_type":"text/x-rst","patch_set":1,"id":"8da14b7f_945de989","line":345,"updated":"2022-05-17 19:16:54.000000000","message":"yes the instance id is static for the lifetime of the instance.","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"},{"author":{"_account_id":11604,"name":"sean mooney","email":"smooney@redhat.com","username":"sean-k-mooney"},"change_message_id":"0d6dde7dea42352144e1ef4f85f829235dab3918","unresolved":true,"context_lines":[{"line_number":353,"context_line":" calls of get_available_nodes from the cache *while* we\u0027re processing"},{"line_number":354,"context_line":" the hash ring update, which would block anintermediate state. 
We could"},{"line_number":355,"context_line":" just have a flag on the driver and raise the ?VirtDriverNotReady?"},{"line_number":356,"context_line":" exception?"},{"line_number":357,"context_line":""},{"line_number":358,"context_line":"* A worry about attempting to take over the management of the same node has"},{"line_number":359,"context_line":" been raised, which mainly exists in cold-start situations when all"}],"source_content_type":"text/x-rst","patch_set":1,"id":"843810b8_afd99493","line":356,"updated":"2022-05-17 19:16:54.000000000","message":"from an architecture point of view the compute manager is not meant to have driver-dependent code and the virt driver is not meant to make db queries.\n\nso that creates other issues with this also.\n\nwe could possibly modify the compute manager and virt driver interface to allow the compute manager to ask the driver if it\u0027s \"ready\". exceptions should not be used as control flow, so i don\u0027t really like the \"let\u0027s raise an exception if we are rebalancing the hash ring\" approach.","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"},{"author":{"_account_id":4393,"name":"Dan Smith","email":"dms@danplanet.com","username":"danms"},"change_message_id":"4a4869f908e3a9cb922fc9ee7ab426cb1150ae7d","unresolved":true,"context_lines":[{"line_number":353,"context_line":" calls of get_available_nodes from the cache *while* we\u0027re processing"},{"line_number":354,"context_line":" the hash ring update, which would block anintermediate state. 
We could"},{"line_number":355,"context_line":" just have a flag on the driver and raise the ?VirtDriverNotReady?"},{"line_number":356,"context_line":" exception?"},{"line_number":357,"context_line":""},{"line_number":358,"context_line":"* A worry about attempting to take over the management of the same node has"},{"line_number":359,"context_line":" been raised, which mainly exists in cold-start situations when all"}],"source_content_type":"text/x-rst","patch_set":1,"id":"81e2b9e1_fccf2900","line":356,"in_reply_to":"843810b8_afd99493","updated":"2022-07-05 16:21:24.000000000","message":"Yeah, this seems like a bad idea to me.. Essentially letting the driver (and any bug in the driver) stall resource updates indefinitely.","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"},{"author":{"_account_id":11604,"name":"sean mooney","email":"smooney@redhat.com","username":"sean-k-mooney"},"change_message_id":"0d6dde7dea42352144e1ef4f85f829235dab3918","unresolved":true,"context_lines":[{"line_number":404,"context_line":""},{"line_number":405,"context_line":" .. 
NOTE:: Can we get a written rational for why we can\u0027t just guard off"},{"line_number":406,"context_line":" with a config option?"},{"line_number":407,"context_line":""},{"line_number":408,"context_line":"Only rebalance a node\u0027s assets if it is Compute is DOWN and Disabled"},{"line_number":409,"context_line":"--------------------------------------------------------------------"},{"line_number":410,"context_line":""}],"source_content_type":"text/x-rst","patch_set":1,"id":"aa4e05a6_90b04889","line":407,"updated":"2022-05-17 19:16:54.000000000","message":"one that was raised is that we don\u0027t want to have one way of running nova when you use ironic and another when you don\u0027t.\n\nwe want to support deploying a single nova that has some libvirt hosts and some ironic hosts and have that be a reasonable thing to do.","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"},{"author":{"_account_id":4393,"name":"Dan Smith","email":"dms@danplanet.com","username":"danms"},"change_message_id":"4a4869f908e3a9cb922fc9ee7ab426cb1150ae7d","unresolved":true,"context_lines":[{"line_number":404,"context_line":""},{"line_number":405,"context_line":" .. NOTE:: Can we get a written rational for why we can\u0027t just guard off"},{"line_number":406,"context_line":" with a config option?"},{"line_number":407,"context_line":""},{"line_number":408,"context_line":"Only rebalance a node\u0027s assets if it is Compute is DOWN and Disabled"},{"line_number":409,"context_line":"--------------------------------------------------------------------"},{"line_number":410,"context_line":""}],"source_content_type":"text/x-rst","patch_set":1,"id":"618f4bac_714f570a","line":407,"in_reply_to":"aa4e05a6_90b04889","updated":"2022-07-05 16:21:24.000000000","message":"Yes, this. Just like we don\u0027t want to have APIs behave differently based on config, we don\u0027t want to have nova behave differently based on the virt driver in use (to the extent this is possible). 
Further, multi-arch Nova deployments are a thing, of course.\n\nNova is an abstraction layer, and as such, it is not able to surface every feature and quirk of every underlying technology it wraps. That\u0027s a good thing.","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"},{"author":{"_account_id":11604,"name":"sean mooney","email":"smooney@redhat.com","username":"sean-k-mooney"},"change_message_id":"0d6dde7dea42352144e1ef4f85f829235dab3918","unresolved":true,"context_lines":[{"line_number":423,"context_line":" \"baremetal node disappears\", and \"new node looking like that other"},{"line_number":424,"context_line":" node just appeared, potentially incurring an additional workload within"},{"line_number":425,"context_line":" other areas of Nova."},{"line_number":426,"context_line":""},{"line_number":427,"context_line":"Locking the Instance/Compute node records"},{"line_number":428,"context_line":"------------------------------------------"},{"line_number":429,"context_line":""}],"source_content_type":"text/x-rst","patch_set":1,"id":"e19550a3_4522e246","line":426,"updated":"2022-05-17 19:16:54.000000000","message":"note that this does not address the issue of the hash ring ensuring compute nodes are not moved between cells or AZs.\n\nrealistically the only entity in nova that has visibility of the cell/az mapping and can run periodic tasks that could do the rebalance is the super conductor.\n\nthe super conductor is not aware of ironic so we don\u0027t have an extension point to move this there, but it\u0027s the only process that has access to the api db and cell dbs.","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"},{"author":{"_account_id":11604,"name":"sean 
mooney","email":"smooney@redhat.com","username":"sean-k-mooney"},"change_message_id":"0d6dde7dea42352144e1ef4f85f829235dab3918","unresolved":true,"context_lines":[{"line_number":437,"context_line":"Concerns"},{"line_number":438,"context_line":"~~~~~~~~"},{"line_number":439,"context_line":""},{"line_number":440,"context_line":"* This may not be backportable."},{"line_number":441,"context_line":""},{"line_number":442,"context_line":"* This may not be able to play nicely in a live environment undergoing"},{"line_number":443,"context_line":" an upgrade, at least without substantial testing."}],"source_content_type":"text/x-rst","patch_set":1,"id":"f679d6ab_90035b3c","line":440,"updated":"2022-05-17 19:16:54.000000000","message":"it would not be. we cannot backport rpc changes, service version changes, object changes, api changes or most db changes.\nso we cannot add a new rpc method or similar to add a mutex/distributed lock.\n\nin general compute services don\u0027t have access to memcache or the db, so unless we acquired a lock via an ironic api call i\u0027m not sure how we could add a DLM implementation in a backportable way.","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"},{"author":{"_account_id":4393,"name":"Dan Smith","email":"dms@danplanet.com","username":"danms"},"change_message_id":"4a4869f908e3a9cb922fc9ee7ab426cb1150ae7d","unresolved":true,"context_lines":[{"line_number":437,"context_line":"Concerns"},{"line_number":438,"context_line":"~~~~~~~~"},{"line_number":439,"context_line":""},{"line_number":440,"context_line":"* This may not be backportable."},{"line_number":441,"context_line":""},{"line_number":442,"context_line":"* This may not be able to play nicely in a live environment undergoing"},{"line_number":443,"context_line":" an upgrade, at least without substantial testing."}],"source_content_type":"text/x-rst","patch_set":1,"id":"2fd203bc_97c8353b","line":440,"in_reply_to":"f679d6ab_90035b3c","updated":"2022-07-05 16:21:24.000000000","message":"I think that 
expecting any real solution (or even incremental improvement) to this being backportable is pointless. AFAICT, anything discussed so far would have substantial impacts to deployments if the new code wasn\u0027t rolled out atomically to all computes during a maintenance window, which rules out backports, IMHO.","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"},{"author":{"_account_id":4393,"name":"Dan Smith","email":"dms@danplanet.com","username":"danms"},"change_message_id":"4a4869f908e3a9cb922fc9ee7ab426cb1150ae7d","unresolved":true,"context_lines":[{"line_number":452,"context_line":"locking/concurrency concerns on db records by allowing a singular lock to be"},{"line_number":453,"context_line":"used per node or for the overall process. "},{"line_number":454,"context_line":""},{"line_number":455,"context_line":"Concerns"},{"line_number":456,"context_line":"~~~~~~~~"},{"line_number":457,"context_line":""},{"line_number":458,"context_line":"* This surely would not be backportable, in other words forward only."}],"source_content_type":"text/x-rst","patch_set":1,"id":"59a69b7b_94a68af4","line":455,"updated":"2022-07-05 16:21:24.000000000","message":"Which conductor? The conductors are headless, nameless, anonymous, and have no persistence. They also have no concept of the virt driver being used on the computes. 
Unless you kicked this off with an API command and let whichever one happens to handle the request do a full reshuffle, you\u0027d need a big lock somewhere and all the conductors would have churning periodics fighting for that lock all the time.\n\nThe above does not sound good to me, nor does giving something that is highly virt-dependent to conductor, which has no concept of such things today.","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"},{"author":{"_account_id":11604,"name":"sean mooney","email":"smooney@redhat.com","username":"sean-k-mooney"},"change_message_id":"0d6dde7dea42352144e1ef4f85f829235dab3918","unresolved":true,"context_lines":[{"line_number":471,"context_line":"Align configuration and use a different shard key to break apart the hash ring"},{"line_number":472,"context_line":"------------------------------------------------------------------------------"},{"line_number":473,"context_line":""},{"line_number":474,"context_line":".. NOTE:: Under initial discussion"},{"line_number":475,"context_line":""},{"line_number":476,"context_line":"Provide a ?nova-manage? 
command to fix the db, kind of cleanup the cache code"},{"line_number":477,"context_line":"-----------------------------------------------------------------------------"}],"source_content_type":"text/x-rst","patch_set":1,"id":"97c4308d_7a6e98a3","line":474,"updated":"2022-05-17 19:16:54.000000000","message":"I should probably expand on this but I need to think about how to express this clearly...\n\n\ntl;dr is as follows.\n\n\nadd a new shard_key option to the ironic compute section of the nova.conf\ndefault this to socket.gethostname()\ncreate one nova compute service record per cell/az pair where ironic is deployed\nin each such region set the [DEFAULT]host\u003dironic_\u003ccell\u003e_\u003caz\u003e\nrun 1-n nova-compute agent processes using the same conf.Host value\n\neach nova-compute agent process started in this way would share the same message queue names and would all be able to respond to RPC requests to this compute service endpoint.\n\nthis would allow an active/active compute service deployment where it\u0027s horizontally scalable by just adding new compute agent processes.\n\nthe shard key would be used as an input to the hash ring so that each compute service would only process periodics for the subset of compute nodes that is assigned to them by the hash ring.\n\nthis would eliminate the need for rebalance entirely at the nova compute manager level. 
since we would not need to update the compute node to compute service mapping or the instance.host and computenode.host values.\n\nthis would mean that compute nodes/instances would never be at risk of moving between cells or AZs and violating the scheduling constraints expressed via aggregate or placement metadata, since neither compute nodes nor instances are \"moving\"\n\n\nwe are instead leveraging the message bus to support a multiple producer (api), multiple consumer (compute agent process) model, and then internally in the driver sharding the hosts that a specific instance of nova-compute actually runs periodic tasks for.\n\nthere are some open questions with this, such as how does the hashring know which agents are alive, and how does the resource tracker work in this case.\n\nto facilitate that we would likely need some way for the driver to share state, such as via memcache or the message bus or even ironic.\n\n\nbut i think something along this direction could work.","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"},{"author":{"_account_id":4393,"name":"Dan Smith","email":"dms@danplanet.com","username":"danms"},"change_message_id":"4a4869f908e3a9cb922fc9ee7ab426cb1150ae7d","unresolved":true,"context_lines":[{"line_number":471,"context_line":"Align configuration and use a different shard key to break apart the hash ring"},{"line_number":472,"context_line":"------------------------------------------------------------------------------"},{"line_number":473,"context_line":""},{"line_number":474,"context_line":".. NOTE:: Under initial discussion"},{"line_number":475,"context_line":""},{"line_number":476,"context_line":"Provide a ?nova-manage? 
command to fix the db, kind of cleanup the cache code"},{"line_number":477,"context_line":"-----------------------------------------------------------------------------"}],"source_content_type":"text/x-rst","patch_set":1,"id":"8a6ea110_571789b7","line":474,"in_reply_to":"97c4308d_7a6e98a3","updated":"2022-07-05 16:21:24.000000000","message":"I understand the benefits of this, and probably something along the lines of this are what we\u0027ll need to do. However, as you note, it\u0027s not clear how the computes would decide who would do the RT updates. I also worry that it would cause ironic-specific bits to bleed upwards into the compute manager and resource tracker, and if not, we\u0027d end up with some churn as things flip flop back and forth as computes slowly digest environmental changes.\n\nIt would break our current notion of seeing services via the API, as it would look like you have one ironic compute when you have N. Anything we need to do that affects the compute service would need to be a different type of message over RPC in this configuration. For example, enabling/disabling a host (which you would no longer be able to do really anyway) would send the set_host_enabled() RPC call to just one of those computes. Things like get_host_uptime() would report a constantly-changing value as we\u0027d poll a different one each time. Neither of these is perhaps all that important, but either we need to document in the API that things\u0027d be broken for ironic, or work out some way to make them work for ironic at the higher layers.","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"},{"author":{"_account_id":11604,"name":"sean mooney","email":"smooney@redhat.com","username":"sean-k-mooney"},"change_message_id":"0d6dde7dea42352144e1ef4f85f829235dab3918","unresolved":true,"context_lines":[{"line_number":488,"context_line":".. 
NOTE:: It occurs to me that we could just always say the first ironic"},{"line_number":489,"context_line":" compute is the leader and can do these things, and keep all of the other"},{"line_number":490,"context_line":" computes as purely read/only. We could still kind of fix the cache in"},{"line_number":491,"context_line":" the next pass the remote compute node."},{"line_number":492,"context_line":""},{"line_number":493,"context_line":"Concerns"},{"line_number":494,"context_line":"~~~~~~~~"}],"source_content_type":"text/x-rst","patch_set":1,"id":"d42f2f62_08248a13","line":491,"updated":"2022-05-17 19:16:54.000000000","message":"you would need to do some sort of leader election or support a raft-like protocol to make that robust, but yes.","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"},{"author":{"_account_id":4393,"name":"Dan Smith","email":"dms@danplanet.com","username":"danms"},"change_message_id":"4a4869f908e3a9cb922fc9ee7ab426cb1150ae7d","unresolved":true,"context_lines":[{"line_number":488,"context_line":".. NOTE:: It occurs to me that we could just always say the first ironic"},{"line_number":489,"context_line":" compute is the leader and can do these things, and keep all of the other"},{"line_number":490,"context_line":" computes as purely read/only. We could still kind of fix the cache in"},{"line_number":491,"context_line":" the next pass the remote compute node."},{"line_number":492,"context_line":""},{"line_number":493,"context_line":"Concerns"},{"line_number":494,"context_line":"~~~~~~~~"}],"source_content_type":"text/x-rst","patch_set":1,"id":"beb1a78a_819666c3","line":491,"in_reply_to":"d42f2f62_08248a13","updated":"2022-07-05 16:21:24.000000000","message":"They all look up the service.id of each (relevant) service and decide they\u0027re the leader if self.service.id\u003d\u003dmin(service_ids). 
That would keep it stable over time instead of each election potentially being random or time-based.\n\nEither way, this involves some potential breakage if it just happens automatically, right? If we rebalance instances and nodes in realtime, a request that comes in while that is happening may go to the wrong compute service. However, if the point is to purely re-home orphaned instances, then maybe that\u0027s less of a problem.\n\nHowever, it still ends up being ironic-specific and not something the ironic driver can really do periodically in the background today. I\u0027m not sure I like the ironic driver occasionally re-writing things it doesn\u0027t really own in the database that would change what is going on in the compute manager that \"owns\" it. It would need to share the RT lock I would think to avoid doing that in the middle of a RT update.","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"},{"author":{"_account_id":11604,"name":"sean mooney","email":"smooney@redhat.com","username":"sean-k-mooney"},"change_message_id":"0d6dde7dea42352144e1ef4f85f829235dab3918","unresolved":true,"context_lines":[{"line_number":508,"context_line":" discussed items. Distinct possibility this could work with just the"},{"line_number":509,"context_line":" local information already present in the DB/Configuration. Then again"},{"line_number":510,"context_line":" it may only work with just ironic deployments. 
Research required."},{"line_number":511,"context_line":""},{"line_number":512,"context_line":""},{"line_number":513,"context_line":""},{"line_number":514,"context_line":"Alternatives"}],"source_content_type":"text/x-rst","patch_set":1,"id":"92f65b34_b04d5a7a","line":511,"updated":"2022-05-17 19:16:54.000000000","message":"the issue is the compute agent really does not have all the info it needs to compute the rebalance, since basically it needs to run all the instances through the scheduler again to select which hosts are valid based on the cell and host aggregate constraints.","commit_id":"d7665380e0791a0ec00a11e196d3f70ec79ce37c"}]}