)]}' {"/PATCHSET_LEVEL":[{"author":{"_account_id":11604,"name":"sean mooney","email":"smooney@redhat.com","username":"sean-k-mooney"},"change_message_id":"7e1ab778ec7f0d00a9fd264903518f66175faf84","unresolved":false,"context_lines":[],"source_content_type":"","patch_set":2,"id":"1e104d00_390ea06e","updated":"2025-01-06 10:30:07.000000000","message":"+1 for starting the discussion.\n\nThis is a tricky topic and I agree that perhaps, instead of adding workaround after workaround, we need to take a step back and reflect on whether the current API design of placement, or our usage of it, is really fit for what we want to achieve.","commit_id":"e794d561dfb4b5e34b7f6719f4328a80f8529f9a"},{"author":{"_account_id":9708,"name":"Balazs Gibizer","display_name":"gibi","email":"gibizer@gmail.com","username":"gibi"},"change_message_id":"813ca81ce70cc76cba51cf9735e525da564e362e","unresolved":false,"context_lines":[],"source_content_type":"","patch_set":2,"id":"016a19f7_107415da","updated":"2025-01-09 12:02:01.000000000","message":"W-1 just to signal that I have no intention of landing this backlog spec right now.","commit_id":"e794d561dfb4b5e34b7f6719f4328a80f8529f9a"}],"specs/backlog/approved/placement-child-rp-usage-in-allocation-candidates.rst":[{"author":{"_account_id":9708,"name":"Balazs Gibizer","display_name":"gibi","email":"gibizer@gmail.com","username":"gibi"},"change_message_id":"8485eb61fbf38220a713de4d67aa18bda0eb3107","unresolved":true,"context_lines":[{"line_number":77,"context_line":"would grow to hundreds of megabytes making it very hard to transfer. 
Also"},{"line_number":78,"context_line":"nova would also need to iterate and filter these candidates which would lead to"},{"line_number":79,"context_line":"excessive select_destination RPC runtime and memory usage."},{"line_number":80,"context_line":""},{"line_number":81,"context_line":"Use Cases"},{"line_number":82,"context_line":"---------"},{"line_number":83,"context_line":""}],"source_content_type":"text/x-rst","patch_set":2,"id":"f353a29e_2cc9e006","line":80,"updated":"2025-10-03 10:19:22.000000000","message":"We have new learnings based on a new bug (https://review.opendev.org/c/openstack/nova-specs/+/938070). We can limit the number of valid candidates, but that is not enough: the number of invalid candidates grows even more rapidly, and placement needs to go through those invalid ones to find the valid ones.","commit_id":"e794d561dfb4b5e34b7f6719f4328a80f8529f9a"},{"author":{"_account_id":9708,"name":"Balazs Gibizer","display_name":"gibi","email":"gibizer@gmail.com","username":"gibi"},"change_message_id":"6752c1e655b98052f53d287e1115ba4480956966","unresolved":false,"context_lines":[{"line_number":77,"context_line":"would grow to hundreds of megabytes making it very hard to transfer. 
Also"},{"line_number":78,"context_line":"nova would also need to iterate and filter these candidates which would lead to"},{"line_number":79,"context_line":"excessive select_destination RPC runtime and memory usage."},{"line_number":80,"context_line":""},{"line_number":81,"context_line":"Use Cases"},{"line_number":82,"context_line":"---------"},{"line_number":83,"context_line":""}],"source_content_type":"text/x-rst","patch_set":2,"id":"55f8529c_bd27a226","line":80,"in_reply_to":"f353a29e_2cc9e006","updated":"2025-10-07 13:54:41.000000000","message":"Done","commit_id":"e794d561dfb4b5e34b7f6719f4328a80f8529f9a"},{"author":{"_account_id":11604,"name":"sean mooney","email":"smooney@redhat.com","username":"sean-k-mooney"},"change_message_id":"7e1ab778ec7f0d00a9fd264903518f66175faf84","unresolved":true,"context_lines":[{"line_number":86,"context_line":" these resources to a Nova managed VM."},{"line_number":87,"context_line":""},{"line_number":88,"context_line":"* As a deployer I want to keep the memory consumption of Placement and Nova"},{"line_number":89,"context_line":" withing reasonable limits."},{"line_number":90,"context_line":""},{"line_number":91,"context_line":"* As a deployer I want to keep the VM scheduling time within reasonable limits"},{"line_number":92,"context_line":""}],"source_content_type":"text/x-rst","patch_set":2,"id":"981acdac_f1a7058c","line":89,"updated":"2025-01-06 10:30:07.000000000","message":"you mean ram is not cheap, ya that checks out :)","commit_id":"e794d561dfb4b5e34b7f6719f4328a80f8529f9a"},{"author":{"_account_id":11604,"name":"sean mooney","email":"smooney@redhat.com","username":"sean-k-mooney"},"change_message_id":"d1fd0d4a08fbe98ddef4792daa6a85c98104dea3","unresolved":false,"context_lines":[{"line_number":86,"context_line":" these resources to a Nova managed VM."},{"line_number":87,"context_line":""},{"line_number":88,"context_line":"* As a deployer I want to keep the memory consumption of Placement and 
Nova"},{"line_number":89,"context_line":" withing reasonable limits."},{"line_number":90,"context_line":""},{"line_number":91,"context_line":"* As a deployer I want to keep the VM scheduling time within reasonable limits"},{"line_number":92,"context_line":""}],"source_content_type":"text/x-rst","patch_set":2,"id":"0523f143_711fa388","line":89,"in_reply_to":"8f79e2bf_d37cf1ef","updated":"2025-01-09 11:15:28.000000000","message":"Acknowledged","commit_id":"e794d561dfb4b5e34b7f6719f4328a80f8529f9a"},{"author":{"_account_id":9708,"name":"Balazs Gibizer","display_name":"gibi","email":"gibizer@gmail.com","username":"gibi"},"change_message_id":"f9f456b94ae8c038ab8d8f0d92ecad1f9bd7d693","unresolved":true,"context_lines":[{"line_number":86,"context_line":" these resources to a Nova managed VM."},{"line_number":87,"context_line":""},{"line_number":88,"context_line":"* As a deployer I want to keep the memory consumption of Placement and Nova"},{"line_number":89,"context_line":" withing reasonable limits."},{"line_number":90,"context_line":""},{"line_number":91,"context_line":"* As a deployer I want to keep the VM scheduling time within reasonable limits"},{"line_number":92,"context_line":""}],"source_content_type":"text/x-rst","patch_set":2,"id":"8f79e2bf_d37cf1ef","line":89,"in_reply_to":"981acdac_f1a7058c","updated":"2025-01-09 07:59:38.000000000","message":"Unfortunately the exponential nature of the problem does not just affect the cost of the RAM used; with current CPU architectures it can actually be prohibitive to provide such an amount of RAM in a single machine ;) I am pretty sure that 16 PFs with 16 VFs each and a request for 16 VFs will blow the memory usage up to unsatisfiable levels.","commit_id":"e794d561dfb4b5e34b7f6719f4328a80f8529f9a"},{"author":{"_account_id":11604,"name":"sean 
mooney","email":"smooney@redhat.com","username":"sean-k-mooney"},"change_message_id":"7e1ab778ec7f0d00a9fd264903518f66175faf84","unresolved":true,"context_lines":[{"line_number":88,"context_line":"* As a deployer I want to keep the memory consumption of Placement and Nova"},{"line_number":89,"context_line":" withing reasonable limits."},{"line_number":90,"context_line":""},{"line_number":91,"context_line":"* As a deployer I want to keep the VM scheduling time within reasonable limits"},{"line_number":92,"context_line":""},{"line_number":93,"context_line":"Proposed change"},{"line_number":94,"context_line":"\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d"}],"source_content_type":"text/x-rst","patch_set":2,"id":"a3930f05_1c166913","line":91,"updated":"2025-01-06 10:30:07.000000000","message":"Such high expectations, but I agree with these use cases.","commit_id":"e794d561dfb4b5e34b7f6719f4328a80f8529f9a"},{"author":{"_account_id":9708,"name":"Balazs Gibizer","display_name":"gibi","email":"gibizer@gmail.com","username":"gibi"},"change_message_id":"f9f456b94ae8c038ab8d8f0d92ecad1f9bd7d693","unresolved":true,"context_lines":[{"line_number":100,"context_line":" ``[placement]max_allocation_candidates``"},{"line_number":101,"context_line":"* The order of the candidate generation can be tuned via"},{"line_number":102,"context_line":" ``[placement]allocation_candidates_generation_strategy``"},{"line_number":103,"context_line":""},{"line_number":104,"context_line":".. __: https://review.opendev.org/q/topic:%22bug/2070257%22"},{"line_number":105,"context_line":""},{"line_number":106,"context_line":"However these workarounds are resulting in lost candidates that can lead to"}],"source_content_type":"text/x-rst","patch_set":2,"id":"5012ad3a_26dd85c2","line":103,"updated":"2025-01-09 07:59:38.000000000","message":"We probably want a separate discussion about what to do with the defaults of these two workaround-like config options. 
The current defaults are for backwards compatibility. But if we are confident that we can have safer defaults without too much risk, then we should probably change the defaults independently of the direction and timeline of the changes proposed by this spec.","commit_id":"e794d561dfb4b5e34b7f6719f4328a80f8529f9a"},{"author":{"_account_id":11604,"name":"sean mooney","email":"smooney@redhat.com","username":"sean-k-mooney"},"change_message_id":"7e1ab778ec7f0d00a9fd264903518f66175faf84","unresolved":true,"context_lines":[{"line_number":125,"context_line":""},{"line_number":126,"context_line":"Placement cloud return the pieces, the RPs satisfying a given group from the"},{"line_number":127,"context_line":"request, a candidate is generated from, and let the client (i.e. nova) generate"},{"line_number":128,"context_line":"all the candidates from the pieces on demand."},{"line_number":129,"context_line":""},{"line_number":130,"context_line":"Placement should avoid generating symmetric candidates"},{"line_number":131,"context_line":"~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~"}],"source_content_type":"text/x-rst","patch_set":2,"id":"6dbb20b6_9a880a7e","line":128,"updated":"2025-01-06 10:30:07.000000000","message":"Hmm, that's a big change, although if we had a placement-lib that could contain the common code then we could support placement and nova sharing the same code to do this. We could move the placement fixture/API definition to that too, like neutron-lib, if we wanted to. That would simplify how our existing functional tests work and avoid needing to install all of placement just for the fixture.\n\nSo I'm not saying we should not consider this, but if we did, I think we would want to consider factoring out common code rather than just directly porting it to nova.\n\nWe would also likely want a new API endpoint in placement, i.e. 
instead of\n/allocation_candidates we would want something like /ResourceProviderCandidates,\ngiven it would not be returning allocation candidates at all, just the RP summaries.\n\n-----\nLater\n\n\nI actually think this might be the best path forward, looking at the other alternatives.\n\nAdd a new ResourceProviderSummary API endpoint that does not create allocation candidates but just returns the RP summaries, and make nova generate the candidates lazily in the scheduler, in the first filter that actually needs them (only the PCI and NUMA filters would need them). We would need to ensure that, before the scheduler returns to the conductor, we have generated an allocation candidate for the selected host and the N alternate hosts, but that should always be N+1 max. The filters also only ever need to generate 1 valid candidate, although in the NUMA case they may check several invalid ones internally until they find one that works.\n\nWe might want to reverse how that works today. Today we check if the requested NUMA topology can fit the host and any of the set of allocation candidates, but instead we could have the NUMA filter validate whether an instance/host NUMA topology mapping can be created and then create an allocation candidate from that topology.\n\nThere are a lot of details to work out, but that would work.\n.... The only problem I see with this is that it calls into question why placement is a separate service. With that said, having a placement lib like this might also be useful for things like watcher.\n\nIf we exposed the effective query to /ResourceProviderCandidates that was used by nova in the server show output, or in a new API /server/\u003cuuid\u003e/ResourceProviderCandidates (this could return the result from placement, i.e. 
the provider summaries), watcher could internally issue the same request to placement when considering which host to move an instance to.\n\nWatcher would still not have visibility into NUMA constraints, but it would be better than today.\n\nThe other thing that comes to mind is that if we were to have a new API endpoint\n/ResourceProviderCandidates, I would be inclined not to encode the request in the query string. I think using a JSON request body, and likely an HTTP POST as a result of that, would be better.\n\nWe have had issues with member_of and in query arguments reaching the max URL length, so moving to a JSON-body-based request for this new API would allow us\nto resolve that problem too.","commit_id":"e794d561dfb4b5e34b7f6719f4328a80f8529f9a"},{"author":{"_account_id":11604,"name":"sean mooney","email":"smooney@redhat.com","username":"sean-k-mooney"},"change_message_id":"7e1ab778ec7f0d00a9fd264903518f66175faf84","unresolved":true,"context_lines":[{"line_number":136,"context_line":"it can drop truly equivalent candidates."},{"line_number":137,"context_line":""},{"line_number":138,"context_line":"TODO: try to find examples of equivalent candidates that placement can safely"},{"line_number":139,"context_line":"drop today regardless of the information only nova aware of like NUMA affinity"},{"line_number":140,"context_line":""},{"line_number":141,"context_line":"Change how Nova models physical resources"},{"line_number":142,"context_line":"~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~"}],"source_content_type":"text/x-rst","patch_set":2,"id":"84a7648d_ec834ea2","line":139,"updated":"2025-01-06 10:30:07.000000000","message":"I can't prove this in words, but my gut feeling is that any two allocation candidates that contain the same number of allocations from the same RPs are duplicates, so one of them can be dropped.\n\nI wanted to say that any two allocation candidates that only contain the same RPs could be merged, but that is not true,\n\ni.e. because of NUMA info `2 from RP1 and 1 from RP2` may not be the same as 
`1 from RP1 and 2 from RP2`.\n\nBut if we had two allocation candidates that were `1 from rp1 and 1 from rp2` and `1 from rp2 and 1 from rp1`, then they are equivalent, as they only differ in order, which does not matter.\n\nSo if we do something like generate a string for each RP/RC, f\u0027{rp.uuid}:{ RC:RC_COUNT for RC, RC_COUNT in allocation_from_rp }\u0027, e.g. `46e16894-3e5c-47ae-b710-588b4b4e1f8a:{VGPU:1}`,\n\nprovided we generate that in a deterministic way (sort the allocations by uuid/RC),\nwe can hash or compare the resultant strings to see if the allocation candidates are equivalent.\n\nHaving said that, I believe we are actually already doing this in placement.\n\nI believe we have defined a __hash__ function on the allocation candidate class that does produce a stable hash over the resource class, and we insert the AC into a set to do the dedupe. However, I'm not sure whether that hash removes all order-independent duplicates as I describe above.","commit_id":"e794d561dfb4b5e34b7f6719f4328a80f8529f9a"},{"author":{"_account_id":11604,"name":"sean mooney","email":"smooney@redhat.com","username":"sean-k-mooney"},"change_message_id":"7e1ab778ec7f0d00a9fd264903518f66175faf84","unresolved":true,"context_lines":[{"line_number":143,"context_line":""},{"line_number":144,"context_line":"In the above example Nova could decide that the two PFs provides resources of"},{"line_number":145,"context_line":"the same type and therefore does not represent them as two separate children"},{"line_number":146,"context_line":"providers, but a single provider instead with 12 resources."},{"line_number":147,"context_line":""},{"line_number":148,"context_line":"Change how Nova requests resources from Placement"},{"line_number":149,"context_line":"~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~"}],"source_content_type":"text/x-rst","patch_set":2,"id":"815a6b92_922b82f3","line":146,"updated":"2025-01-06 10:30:07.000000000","message":"This is valid if and only if each physical device is interchangeable with 
respect to the observable characteristics of the device.\n\nBy that I mean they are both on the same NUMA node, have the same physnet (if they are used for neutron), the same GPU type in the vGPU case, and the same traits.\n\nNote that this is somewhat problematic if we wanted to support something like PF anti-affinity for redundancy, if we ever support bond ports in neutron. But we don't support bonding today, so I'm inclined to say that's fine, and we can probably support PF anti-affinity on the nova side in the PCI passthrough filter.\n\nWhere it breaks down on the networking side is where they have non-mergeable sub-resources, specifically bandwidth/packet-per-second inventories.\n\nAgain, I think that's OK for now since we don't model PCI devices with a neutron physnet in placement today in nova, outside of the min bandwidth/pps cases where they are already 1:1 between PF netdev name and RP, i.e. the RP name is {hostname}_{pf_netdev_name}.\nWe obviously can't change that.\n\nHowever, if we were to limit this to only accelerators, i.e. not networking devices used with neutron, we could adjust the modeling to merge similar devices into a single RP.\n\nThis does mean a more complex mapping of the physical device to RP:\nwe can no longer just do `{hostname}_{pci address}`.\n\nIf we were to go down this route, I would consider adding a new mapping to placement.\n\nBasically, add an external IDs list to the RP. This would be a list of opaque strings that are client-specific identifiers that make sense to the service that owns the RP.\n\nThat way nova can generate one RP for a set of host devices and then add the PCI address of each device in that combined set to the RP as a list of external IDs.\n\nThat way we can use that info when querying the pci_devices in the nova DB to select the actual device to assign. 
Effectively this replaces the 1:1 mapping with a 1:* mapping while allowing us to model the host identifiers in the placement API for easy lookup via the provider summaries.","commit_id":"e794d561dfb4b5e34b7f6719f4328a80f8529f9a"},{"author":{"_account_id":11604,"name":"sean mooney","email":"smooney@redhat.com","username":"sean-k-mooney"},"change_message_id":"7e1ab778ec7f0d00a9fd264903518f66175faf84","unresolved":true,"context_lines":[{"line_number":150,"context_line":""},{"line_number":151,"context_line":"In the above example instead of asking for two groups of 1 unit of resource"},{"line_number":152,"context_line":"Nova could request 2 units of resources in a single group."},{"line_number":153,"context_line":""},{"line_number":154,"context_line":""},{"line_number":155,"context_line":"Data model impact"},{"line_number":156,"context_line":"-----------------"}],"source_content_type":"text/x-rst","patch_set":2,"id":"1233a696_06bd5e88","line":153,"updated":"2025-01-06 10:30:07.000000000","message":"That would only be OK in some cases.\n\nE.g. for neutron port requests the neutron ports would have to have the same constraints, such as being on the same physnet or IP segment, but it might be OK.\n\nFor flavor-based requests I'm undecided whether this is something we could model in the PCI alias. I really don't like that we can have request groups in the flavor directly. 
I strongly think that was a mistake to add, and I also don't particularly like having even resources: in the flavor.\n\nSo if we were to change this, I would want us to have a level of indirection rather than making the admin express the grouping in the existing flavor syntax. I'm just not sure how to do that or where; the PCI alias seems possible, but that is not generic to other types of nested resources.","commit_id":"e794d561dfb4b5e34b7f6719f4328a80f8529f9a"},{"author":{"_account_id":11604,"name":"sean mooney","email":"smooney@redhat.com","username":"sean-k-mooney"},"change_message_id":"7e1ab778ec7f0d00a9fd264903518f66175faf84","unresolved":true,"context_lines":[{"line_number":151,"context_line":"In the above example instead of asking for two groups of 1 unit of resource"},{"line_number":152,"context_line":"Nova could request 2 units of resources in a single group."},{"line_number":153,"context_line":""},{"line_number":154,"context_line":""},{"line_number":155,"context_line":"Data model impact"},{"line_number":156,"context_line":"-----------------"},{"line_number":157,"context_line":""}],"source_content_type":"text/x-rst","patch_set":2,"id":"5698f707_52fd2505","line":154,"updated":"2025-01-06 10:30:07.000000000","message":"Another alternative for a placement-only change is to change placement to generate candidates lazily.\n\nBasically, the default placement limit is 1000.\n\nIf we modified placement to use generators such that we looped over all potential hosts, taking 1 allocation candidate at a time per root RP until we hit that global limit,\nin a deployment with \u003e1000 valid hosts we would get 1000 allocation candidates for 1000 hosts.\n\nIn a deployment with 10 valid hosts we would get 100 allocation candidates per host, assuming this is the 8^8 or a similar case where there are 100+ allocation candidates per host.","commit_id":"e794d561dfb4b5e34b7f6719f4328a80f8529f9a"}]}