)]}'
{"specs/queens/approved/count-quota-usage-from-placement.rst":[{"author":{"_account_id":6873,"name":"Matt Riedemann","email":"mriedem.os@gmail.com","username":"mriedem"},"change_message_id":"b539409d44012564b026f080627d042ea345bd3d","unresolved":false,"context_lines":[{"line_number":1,"context_line":".."},{"line_number":2,"context_line":" This work is licensed under a Creative Commons Attribution 3.0 Unported"},{"line_number":3,"context_line":" License."},{"line_number":4,"context_line":""}],"source_content_type":"text/x-rst","patch_set":1,"id":"bf659307_105cc323","line":1,"updated":"2018-03-26 19:22:20.000000000","message":"This has to be moved to the rocky directory now.","commit_id":"c2218b6d1b34b285fa4183f7a6126700ca9299b3"},{"author":{"_account_id":4690,"name":"melanie witt","display_name":"melwitt","email":"melwittt@gmail.com","username":"melwitt"},"change_message_id":"ec624c0a9f99113a8b2660e57ff0e9c5db2338a8","unresolved":false,"context_lines":[{"line_number":29,"context_line":"Problem description"},{"line_number":30,"context_line":"\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d"},{"line_number":31,"context_line":""},{"line_number":32,"context_line":"A detailed description of the problem. What problem is this blueprint"},{"line_number":33,"context_line":"addressing?"},{"line_number":34,"context_line":""},{"line_number":35,"context_line":"When we count quota resource usage for CPU and RAM, we do so by reading"},{"line_number":36,"context_line":"separate cell databases and aggregating the results. 
CPU and RAM amounts per"}],"source_content_type":"text/x-rst","patch_set":1,"id":"7f515b1d_a34a6da9","line":33,"range":{"start_line":32,"start_character":0,"end_line":33,"end_character":11},"updated":"2017-10-03 20:58:38.000000000","message":"Whoops, forgot to remove this.","commit_id":"c2218b6d1b34b285fa4183f7a6126700ca9299b3"},{"author":{"_account_id":5754,"name":"Alex Xu","email":"hejie.xu@intel.com","username":"xuhj"},"change_message_id":"c78cb7993ab3809e8451490ca14b31b3a1729f55","unresolved":false,"context_lines":[{"line_number":71,"context_line":"Proposed change"},{"line_number":72,"context_line":"\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d"},{"line_number":73,"context_line":""},{"line_number":74,"context_line":"We will replace the existing multi-cell database query with two calls to"},{"line_number":75,"context_line":"placement to get resource usage for CPU, RAM, and instances. We can get"},{"line_number":76,"context_line":"CPU and RAM usage for a project and user by querying the ``/usages`` resource::"},{"line_number":77,"context_line":""},{"line_number":78,"context_line":"    GET /usages?project_id\u003d\u003cproject id\u003e\u0026user_id\u003d\u003cuser id\u003e"}],"source_content_type":"text/x-rst","patch_set":1,"id":"bf659307_b43ed588","line":75,"range":{"start_line":74,"start_character":0,"end_line":75,"end_character":60},"updated":"2018-03-29 07:33:57.000000000","message":"Is there any solution for the race between the limit check and the resource claim with placement? I guess you guys already discussed that. 
It is worth documenting in the spec.","commit_id":"c2218b6d1b34b285fa4183f7a6126700ca9299b3"},{"author":{"_account_id":6873,"name":"Matt Riedemann","email":"mriedem.os@gmail.com","username":"mriedem"},"change_message_id":"bea2389b4d58ea1bcc0507bd636aef159473add1","unresolved":false,"context_lines":[{"line_number":77,"context_line":""},{"line_number":78,"context_line":"    GET /usages?project_id\u003d\u003cproject id\u003e\u0026user_id\u003d\u003cuser id\u003e"},{"line_number":79,"context_line":""},{"line_number":80,"context_line":"We can get the number of instances for a project and user if we add a new"},{"line_number":81,"context_line":"REST resource ``/allocations/count`` with a GET method that accepts query"},{"line_number":82,"context_line":"strings for ``project_id`` and ``user_id``, for example::"},{"line_number":83,"context_line":""}],"source_content_type":"text/x-rst","patch_set":1,"id":"5f7c97a3_bf561b06","line":80,"updated":"2018-05-04 14:59:57.000000000","message":"We have allocations against a migration record during server move operations: the allocations are swapped from the instance to the migration record for the source host, and the instance allocation goes against the destination host.\n\nI thought we agreed in Dublin (although I can\u0027t find it in the etherpad) that we\u0027d add a type column to the consumers table in placement for being able to distinguish migrations from instances?","commit_id":"c2218b6d1b34b285fa4183f7a6126700ca9299b3"},{"author":{"_account_id":11564,"name":"Chris Dent","email":"cdent@anticdent.org","username":"chdent"},"change_message_id":"d9ff39e1f8ccb852f822e36f90e5ed98ab1b3b6e","unresolved":false,"context_lines":[{"line_number":81,"context_line":"REST resource ``/allocations/count`` with a GET method that accepts query"},{"line_number":82,"context_line":"strings for ``project_id`` and ``user_id``, for example::"},{"line_number":83,"context_line":""},{"line_number":84,"context_line":"    GET 
/allocations/count?project_id\u003d\u003cuuid\u003e\u0026user_id\u003d\u003cuuid\u003e"},{"line_number":85,"context_line":""},{"line_number":86,"context_line":"where the response would be, for example::"},{"line_number":87,"context_line":""}],"source_content_type":"text/x-rst","patch_set":1,"id":"bf659307_f2949e46","line":84,"updated":"2018-03-27 13:02:26.000000000","message":"Acknowledging what you said below about being able to guarantee consumers were instances if a URI like the above did come back into scope, it would be best if it were not /allocations/count because that\u0027s in the /allocations/{consumer_uuid} slot (and we don\u0027t validate the slot).\n\n/allocations_count or /allocations?count\u003dtrue or some such might be better.","commit_id":"c2218b6d1b34b285fa4183f7a6126700ca9299b3"},{"author":{"_account_id":4690,"name":"melanie witt","display_name":"melwitt","email":"melwittt@gmail.com","username":"melwitt"},"change_message_id":"1c70dbbe1bad263140d2bf8c9da1ff3722af55b7","unresolved":false,"context_lines":[{"line_number":93,"context_line":"    }"},{"line_number":94,"context_line":""},{"line_number":95,"context_line":"This is assuming that there is one allocation record per consumer UUID, where"},{"line_number":96,"context_line":"each consumer UUID represents one instance consumer."},{"line_number":97,"context_line":""},{"line_number":98,"context_line":"Alternatives"},{"line_number":99,"context_line":"------------"}],"source_content_type":"text/x-rst","patch_set":1,"id":"7f515b1d_e6d7534f","line":96,"updated":"2017-10-03 21:17:22.000000000","message":"Apparently we can\u0027t assume that consumers are instances, so we need a different approach for counting instances. One idea is to add a \"deleted\" column to the instance_mappings table in the API database. 
We could set that inside the Instance.destroy method to ensure it\u0027s only set when a delete succeeds.","commit_id":"c2218b6d1b34b285fa4183f7a6126700ca9299b3"},{"author":{"_account_id":5754,"name":"Alex Xu","email":"hejie.xu@intel.com","username":"xuhj"},"change_message_id":"c78cb7993ab3809e8451490ca14b31b3a1729f55","unresolved":false,"context_lines":[{"line_number":180,"context_line":"  query a count of allocations by ``project_id`` and ``user_id`` to serve as"},{"line_number":181,"context_line":"  a count of the number of instances"},{"line_number":182,"context_line":"* Replace the parallel query of separate cell databases with queries to the"},{"line_number":183,"context_line":"  placement API for quota counting of instances, CPU, and RAM"},{"line_number":184,"context_line":""},{"line_number":185,"context_line":"Dependencies"},{"line_number":186,"context_line":"\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d"}],"source_content_type":"text/x-rst","patch_set":1,"id":"bf659307_d4110906","line":183,"updated":"2018-03-29 07:33:57.000000000","message":"We also need to deprecate the config option \u0027CONF.quota.recheck_quota\u0027, I guess. 
It is useless after we migrate to the placement.","commit_id":"c2218b6d1b34b285fa4183f7a6126700ca9299b3"},{"author":{"_account_id":4690,"name":"melanie witt","display_name":"melwitt","email":"melwittt@gmail.com","username":"melwitt"},"change_message_id":"2579b0c0ce273ba415b8805838b39e2b3e9515a5","unresolved":false,"context_lines":[{"line_number":180,"context_line":"  query a count of allocations by ``project_id`` and ``user_id`` to serve as"},{"line_number":181,"context_line":"  a count of the number of instances"},{"line_number":182,"context_line":"* Replace the parallel query of separate cell databases with queries to the"},{"line_number":183,"context_line":"  placement API for quota counting of instances, CPU, and RAM"},{"line_number":184,"context_line":""},{"line_number":185,"context_line":"Dependencies"},{"line_number":186,"context_line":"\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d"}],"source_content_type":"text/x-rst","patch_set":1,"id":"5f7c97a3_064e24c0","line":183,"in_reply_to":"bf659307_d4110906","updated":"2018-05-15 18:19:27.000000000","message":"We\u0027ll still need CONF.quota.recheck_quota because of the fact that parallel instance creates can race. Example: several instance create requests come in and we check quota with placement for CPU and RAM, the project has quota available, so we go ahead and create an allocation. 
After one instance create allocates CPU and RAM in placement, it is possible that other parallel instance creates will now be over quota, so we have to ask placement again if CPU and RAM requests are under quota, if CONF.quota.recheck_quota\u003dTrue.","commit_id":"c2218b6d1b34b285fa4183f7a6126700ca9299b3"}],"specs/rocky/approved/count-quota-usage-from-placement.rst":[{"author":{"_account_id":4690,"name":"melanie witt","display_name":"melwitt","email":"melwittt@gmail.com","username":"melwitt"},"change_message_id":"c36f3874da154b368b94236547611137a28cfd98","unresolved":false,"context_lines":[{"line_number":69,"context_line":"Proposed change"},{"line_number":70,"context_line":"\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d"},{"line_number":71,"context_line":""},{"line_number":72,"context_line":"We will replace the existing multi-cell database query with two calls to"},{"line_number":73,"context_line":"placement to get resource usage for CPU and RAM. We can get CPU and RAM usage"},{"line_number":74,"context_line":"for a project and user by querying the ``/usages`` resource::"},{"line_number":75,"context_line":""}],"source_content_type":"text/x-rst","patch_set":2,"id":"5f7c97a3_9a6288ad","line":72,"range":{"start_line":72,"start_character":60,"end_line":72,"end_character":69},"updated":"2018-05-15 18:47:51.000000000","message":"This should be \"one call\". Gonna update this to show that we\u0027ll have one call to placement /usages and one query of nova_api.instance_mappings.","commit_id":"864170d03096d9170fa4135b8e655fc85bf15e99"},{"author":{"_account_id":7,"name":"Jay Pipes","email":"jaypipes@gmail.com","username":"jaypipes"},"change_message_id":"728df427d712ec0ae2cecfa80b31034f7bf622c9","unresolved":false,"context_lines":[{"line_number":14,"context_line":"instead of using reservations and tracking quota usages in a separate database"},{"line_number":15,"context_line":"table. 
We\u0027re counting resources like instances, CPU, and RAM by querying each"},{"line_number":16,"context_line":"cell database and aggregating the results per project and per user. This"},{"line_number":17,"context_line":"approach has a couple of downsides: it is not too efficient and it is"},{"line_number":18,"context_line":"susceptible to undesirable behavior if a cell becomes unavailable. If a cell"},{"line_number":19,"context_line":"becomes unavailable, resources in its database cannot be counted and will not"},{"line_number":20,"context_line":"be included in resource usage until the cell returns. Cells could become"}],"source_content_type":"text/x-rst","patch_set":3,"id":"5f7c97a3_c7e0e1fd","line":17,"range":{"start_line":17,"start_character":46,"end_line":17,"end_character":50},"updated":"2018-05-18 15:47:51.000000000","message":"s/too// :)","commit_id":"36c1d1f151b5a618ed3ae3b8155dd6f40a9c2fb6"},{"author":{"_account_id":7,"name":"Jay Pipes","email":"jaypipes@gmail.com","username":"jaypipes"},"change_message_id":"f18f8397fbf150f93f233d1139ac6802fa092128","unresolved":false,"context_lines":[{"line_number":14,"context_line":"instead of using reservations and tracking quota usages in a separate database"},{"line_number":15,"context_line":"table. We\u0027re counting resources like instances, CPU, and RAM by querying each"},{"line_number":16,"context_line":"cell database and aggregating the results per project and per user. This"},{"line_number":17,"context_line":"approach has a couple of downsides: it is not too efficient and it is"},{"line_number":18,"context_line":"susceptible to undesirable behavior if a cell becomes unavailable. If a cell"},{"line_number":19,"context_line":"becomes unavailable, resources in its database cannot be counted and will not"},{"line_number":20,"context_line":"be included in resource usage until the cell returns. 
Cells could become"}],"source_content_type":"text/x-rst","patch_set":3,"id":"5f7c97a3_bd1cc777","line":17,"range":{"start_line":17,"start_character":46,"end_line":17,"end_character":50},"in_reply_to":"5f7c97a3_9a49d233","updated":"2018-05-20 18:33:23.000000000","message":"I believe the general point is that making a single REST API query or a single DB query is always going to be more efficient than executing multiple queries, one for each cell.","commit_id":"36c1d1f151b5a618ed3ae3b8155dd6f40a9c2fb6"},{"author":{"_account_id":26936,"name":"Surya Seetharaman","email":"suryaseetharaman.9@gmail.com","username":"tssurya"},"change_message_id":"5c55f23619cd0a6fe5c6f00d792783506b4e6686","unresolved":false,"context_lines":[{"line_number":14,"context_line":"instead of using reservations and tracking quota usages in a separate database"},{"line_number":15,"context_line":"table. We\u0027re counting resources like instances, CPU, and RAM by querying each"},{"line_number":16,"context_line":"cell database and aggregating the results per project and per user. This"},{"line_number":17,"context_line":"approach has a couple of downsides: it is not too efficient and it is"},{"line_number":18,"context_line":"susceptible to undesirable behavior if a cell becomes unavailable. If a cell"},{"line_number":19,"context_line":"becomes unavailable, resources in its database cannot be counted and will not"},{"line_number":20,"context_line":"be included in resource usage until the cell returns. 
Cells could become"}],"source_content_type":"text/x-rst","patch_set":3,"id":"5f7c97a3_4e1e6e0c","line":17,"range":{"start_line":17,"start_character":46,"end_line":17,"end_character":50},"in_reply_to":"5f7c97a3_bd1cc777","updated":"2018-06-04 07:59:06.000000000","message":"\u003e I believe the general point is that making a single REST API query\n \u003e or a single DB query is always going to be more efficient than\n \u003e executing multiple queries, one for each cell.\n\n++, specially if one cell DB is slower than others even in the case of parallel querying.","commit_id":"36c1d1f151b5a618ed3ae3b8155dd6f40a9c2fb6"},{"author":{"_account_id":4393,"name":"Dan Smith","email":"dms@danplanet.com","username":"danms"},"change_message_id":"780ed13cb33ee72d04b876be7e13875e81a616b6","unresolved":false,"context_lines":[{"line_number":14,"context_line":"instead of using reservations and tracking quota usages in a separate database"},{"line_number":15,"context_line":"table. We\u0027re counting resources like instances, CPU, and RAM by querying each"},{"line_number":16,"context_line":"cell database and aggregating the results per project and per user. This"},{"line_number":17,"context_line":"approach has a couple of downsides: it is not too efficient and it is"},{"line_number":18,"context_line":"susceptible to undesirable behavior if a cell becomes unavailable. If a cell"},{"line_number":19,"context_line":"becomes unavailable, resources in its database cannot be counted and will not"},{"line_number":20,"context_line":"be included in resource usage until the cell returns. Cells could become"}],"source_content_type":"text/x-rst","patch_set":3,"id":"5f7c97a3_9a49d233","line":17,"range":{"start_line":17,"start_character":46,"end_line":17,"end_character":50},"in_reply_to":"5f7c97a3_c7e0e1fd","updated":"2018-05-20 01:25:58.000000000","message":"I must be missing something about the efficiency point here. 
More on that below, but without that clarification this spec is entirely about trying to make quotas continue to work with a cell down, and not really anything about performance.","commit_id":"36c1d1f151b5a618ed3ae3b8155dd6f40a9c2fb6"},{"author":{"_account_id":7,"name":"Jay Pipes","email":"jaypipes@gmail.com","username":"jaypipes"},"change_message_id":"728df427d712ec0ae2cecfa80b31034f7bf622c9","unresolved":false,"context_lines":[{"line_number":23,"context_line":""},{"line_number":24,"context_line":"We can make resource usage counting for quotas much more efficient and"},{"line_number":25,"context_line":"resilient to temporary cell outages by querying placement and the API database"},{"line_number":26,"context_line":"for resource usage instead of reading separate cell databases."},{"line_number":27,"context_line":""},{"line_number":28,"context_line":""},{"line_number":29,"context_line":"Problem description"}],"source_content_type":"text/x-rst","patch_set":3,"id":"5f7c97a3_e7babddf","line":26,"updated":"2018-05-18 15:47:51.000000000","message":"++","commit_id":"36c1d1f151b5a618ed3ae3b8155dd6f40a9c2fb6"},{"author":{"_account_id":4393,"name":"Dan Smith","email":"dms@danplanet.com","username":"danms"},"change_message_id":"780ed13cb33ee72d04b876be7e13875e81a616b6","unresolved":false,"context_lines":[{"line_number":37,"context_line":"aggregate them to calculate the resource usage."},{"line_number":38,"context_line":""},{"line_number":39,"context_line":"This is not ideal because even though we query cell databases in parallel, we"},{"line_number":40,"context_line":"are querying multiple databases and without true threading the parallel"},{"line_number":41,"context_line":"querying occurs in one process. The approach is also sensitive to temporary"},{"line_number":42,"context_line":"cell outages which may occur during operator maintenance or if a cell database"},{"line_number":43,"context_line":"is experiencing issues and we cannot connect to it. 
While a cell is"},{"line_number":44,"context_line":"unavailable, we cannot count resource usage residing in that cell database and"}],"source_content_type":"text/x-rst","patch_set":3,"id":"5f7c97a3_9a96b294","line":41,"range":{"start_line":40,"start_character":36,"end_line":41,"end_character":30},"updated":"2018-05-20 01:25:58.000000000","message":"I\u0027m not sure what this has to do with anything. The querying does happen in parallel, but the aggregation has to happen in a single thread regardless of using \"true threading\" or not. The only extra bit we could parallelize with real threads would be the processing of the DB results, but that would be rather tiny.\n\nIf there isn\u0027t something behind this, I\u0027d rather remove this from the spec here as I think it would imply some larger downside to a layperson or user reading this spec.","commit_id":"36c1d1f151b5a618ed3ae3b8155dd6f40a9c2fb6"},{"author":{"_account_id":4393,"name":"Dan Smith","email":"dms@danplanet.com","username":"danms"},"change_message_id":"e0f756783f48f7b5eecb67070e90673e4db93f8c","unresolved":false,"context_lines":[{"line_number":37,"context_line":"aggregate them to calculate the resource usage."},{"line_number":38,"context_line":""},{"line_number":39,"context_line":"This is not ideal because even though we query cell databases in parallel, we"},{"line_number":40,"context_line":"are querying multiple databases and without true threading the parallel"},{"line_number":41,"context_line":"querying occurs in one process. The approach is also sensitive to temporary"},{"line_number":42,"context_line":"cell outages which may occur during operator maintenance or if a cell database"},{"line_number":43,"context_line":"is experiencing issues and we cannot connect to it. 
While a cell is"},{"line_number":44,"context_line":"unavailable, we cannot count resource usage residing in that cell database and"}],"source_content_type":"text/x-rst","patch_set":3,"id":"5f7c97a3_439d9255","line":41,"range":{"start_line":40,"start_character":36,"end_line":41,"end_character":30},"in_reply_to":"5f7c97a3_1d0dbb3d","updated":"2018-05-20 19:18:10.000000000","message":"Yeah I don\u0027t get why the big query is better _at_all_. However, let\u0027s just leave the efficiency thing out of the spec, since the resiliency aspect is why we\u0027re doing this anyway and then we\u0027re good. Fewer words anyway.","commit_id":"36c1d1f151b5a618ed3ae3b8155dd6f40a9c2fb6"},{"author":{"_account_id":7,"name":"Jay Pipes","email":"jaypipes@gmail.com","username":"jaypipes"},"change_message_id":"f18f8397fbf150f93f233d1139ac6802fa092128","unresolved":false,"context_lines":[{"line_number":37,"context_line":"aggregate them to calculate the resource usage."},{"line_number":38,"context_line":""},{"line_number":39,"context_line":"This is not ideal because even though we query cell databases in parallel, we"},{"line_number":40,"context_line":"are querying multiple databases and without true threading the parallel"},{"line_number":41,"context_line":"querying occurs in one process. The approach is also sensitive to temporary"},{"line_number":42,"context_line":"cell outages which may occur during operator maintenance or if a cell database"},{"line_number":43,"context_line":"is experiencing issues and we cannot connect to it. 
While a cell is"},{"line_number":44,"context_line":"unavailable, we cannot count resource usage residing in that cell database and"}],"source_content_type":"text/x-rst","patch_set":3,"id":"5f7c97a3_1d0dbb3d","line":41,"range":{"start_line":40,"start_character":36,"end_line":41,"end_character":30},"in_reply_to":"5f7c97a3_9a96b294","updated":"2018-05-20 18:33:23.000000000","message":"Yeah, I don\u0027t think the section about true threading is helpful either, for the reasons you stated. Like I mention above, I think the efficiency argument can be made as simple as saying executing a single query is always going to be more efficient than having to execute multiple queries, one per cell.\n\nOr just leave the efficiency argument out entirely; I\u0027d be fine with that as well.","commit_id":"36c1d1f151b5a618ed3ae3b8155dd6f40a9c2fb6"},{"author":{"_account_id":4393,"name":"Dan Smith","email":"dms@danplanet.com","username":"danms"},"change_message_id":"780ed13cb33ee72d04b876be7e13875e81a616b6","unresolved":false,"context_lines":[{"line_number":57,"context_line":"Use Cases"},{"line_number":58,"context_line":"---------"},{"line_number":59,"context_line":""},{"line_number":60,"context_line":"Counting quota resource usage from placement would make instance create"},{"line_number":61,"context_line":"requests more efficient for End Users and make quota behavior consistent in the"},{"line_number":62,"context_line":"event of temporary cell database disruptions. It would be easier for Operators"},{"line_number":63,"context_line":"to take cells offline if needed for maintenance without concern about the"},{"line_number":64,"context_line":"possibility of quota limits being exceeded during the maintenance. It could"}],"source_content_type":"text/x-rst","patch_set":3,"id":"5f7c97a3_9adb127c","line":61,"range":{"start_line":60,"start_character":56,"end_line":61,"end_character":23},"updated":"2018-05-20 01:25:58.000000000","message":"How are they more efficient? 
Is there a quota calculation that we can\u0027t do in SQL in the cell DB schema that placement can do? Perhaps we\u0027ve talked about that in the past and I\u0027m forgetting it. If so, it should probably go in the problem description to call it out specifically.","commit_id":"36c1d1f151b5a618ed3ae3b8155dd6f40a9c2fb6"},{"author":{"_account_id":5754,"name":"Alex Xu","email":"hejie.xu@intel.com","username":"xuhj"},"change_message_id":"a3109b3fcfb10e716057b26074785b8b5e8d4387","unresolved":false,"context_lines":[{"line_number":64,"context_line":"possibility of quota limits being exceeded during the maintenance. It could"},{"line_number":65,"context_line":"spare Operators the trouble of potentially having to fix cases where quota has"},{"line_number":66,"context_line":"been exceeded during maintenance or if a cell database connection could not be"},{"line_number":67,"context_line":"established."},{"line_number":68,"context_line":""},{"line_number":69,"context_line":"Proposed change"},{"line_number":70,"context_line":"\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d"}],"source_content_type":"text/x-rst","patch_set":3,"id":"5f7c97a3_29df6c69","line":67,"updated":"2018-05-15 23:49:23.000000000","message":"I have one more usecase. since we have placement now, we have more and more resources added to the nova and placement. But the nova quota only counts few built-in resources. The user also wants to count the quota for the GPU/FPGA and the consumer resource class.","commit_id":"36c1d1f151b5a618ed3ae3b8155dd6f40a9c2fb6"},{"author":{"_account_id":7,"name":"Jay Pipes","email":"jaypipes@gmail.com","username":"jaypipes"},"change_message_id":"728df427d712ec0ae2cecfa80b31034f7bf622c9","unresolved":false,"context_lines":[{"line_number":64,"context_line":"possibility of quota limits being exceeded during the maintenance. 
It could"},{"line_number":65,"context_line":"spare Operators the trouble of potentially having to fix cases where quota has"},{"line_number":66,"context_line":"been exceeded during maintenance or if a cell database connection could not be"},{"line_number":67,"context_line":"established."},{"line_number":68,"context_line":""},{"line_number":69,"context_line":"Proposed change"},{"line_number":70,"context_line":"\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d"}],"source_content_type":"text/x-rst","patch_set":3,"id":"5f7c97a3_c79c615f","line":67,"in_reply_to":"5f7c97a3_29df6c69","updated":"2018-05-18 15:47:51.000000000","message":"FYI, Alex covers this use case in this related spec: https://review.openstack.org/#/c/569011/","commit_id":"36c1d1f151b5a618ed3ae3b8155dd6f40a9c2fb6"},{"author":{"_account_id":6873,"name":"Matt Riedemann","email":"mriedem.os@gmail.com","username":"mriedem"},"change_message_id":"4854f1f3210634555a731afb366dd2bd6108b56b","unresolved":false,"context_lines":[{"line_number":64,"context_line":"possibility of quota limits being exceeded during the maintenance. 
It could"},{"line_number":65,"context_line":"spare Operators the trouble of potentially having to fix cases where quota has"},{"line_number":66,"context_line":"been exceeded during maintenance or if a cell database connection could not be"},{"line_number":67,"context_line":"established."},{"line_number":68,"context_line":""},{"line_number":69,"context_line":"Proposed change"},{"line_number":70,"context_line":"\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d"}],"source_content_type":"text/x-rst","patch_set":3,"id":"5f7c97a3_fa0c863c","line":67,"in_reply_to":"5f7c97a3_c79c615f","updated":"2018-05-19 23:25:23.000000000","message":"I\u0027m not sure yet how I feel about nova being responsible for counting resource usage and enforcing quota for non-compute (nova) resource classes, like will we at some point expect to extend this to port and volume quota as well (not that those are part of nova flavors...so maybe that\u0027s the distinction). Anyway, that\u0027s for Alex\u0027s spec, not this one.","commit_id":"36c1d1f151b5a618ed3ae3b8155dd6f40a9c2fb6"},{"author":{"_account_id":7,"name":"Jay Pipes","email":"jaypipes@gmail.com","username":"jaypipes"},"change_message_id":"f18f8397fbf150f93f233d1139ac6802fa092128","unresolved":false,"context_lines":[{"line_number":64,"context_line":"possibility of quota limits being exceeded during the maintenance. 
It could"},{"line_number":65,"context_line":"spare Operators the trouble of potentially having to fix cases where quota has"},{"line_number":66,"context_line":"been exceeded during maintenance or if a cell database connection could not be"},{"line_number":67,"context_line":"established."},{"line_number":68,"context_line":""},{"line_number":69,"context_line":"Proposed change"},{"line_number":70,"context_line":"\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d"}],"source_content_type":"text/x-rst","patch_set":3,"id":"5f7c97a3_60ff6467","line":67,"in_reply_to":"5f7c97a3_fa0c863c","updated":"2018-05-20 18:33:23.000000000","message":"\u003e I\u0027m not sure yet how I feel about nova being responsible for\n \u003e counting resource usage and enforcing quota for non-compute (nova)\n \u003e resource classes, like will we at some point expect to extend this\n \u003e to port and volume quota as well (not that those are part of nova\n \u003e flavors...so maybe that\u0027s the distinction). Anyway, that\u0027s for\n \u003e Alex\u0027s spec, not this one.\n\nThe answer to this, for me at least, boils down to when the resource is actually consumed/created.\n\nIf those resources are consumed/created by Nova in the process of launching a VM, then I believe Nova should be responsible for the quota usage checks.\n\nIf the user is pre-creating some resources via another endpoint -- Cinder, Neutron, Glance, whatever -- then those resources will need to be quota-checked by the other services, since the resources will have already been consumed by the user/project and Nova is simply doing the needful with regards to wiring up or attaching those pre-created resources.\n\nBut if Nova is doing the creation of the resources -- for example creating a network port or IP for a get-me-a-network type request -- then Nova should be the thing that checks the usage of those resources for quota purposes. 
That\u0027s the situation that is going to get tricky.\n\nIt would be a whole lot easier if OpenStack had a single API endpoint and was able to do a single check for all resources that are involved in a request.\n\nMatt, you\u0027d asked in IRC on Saturday what the quota situation is like in Kubernetes. Well, they have a single API service, not dozens of API services like we have in OpenStack. Inside that service is a single controller that handles quota usage checks.\n\nYou can see the commit that added \"generic\" resource quota counting back in October 2017 to k8s here:\n\nhttps://github.com/kubernetes/kubernetes/pull/54320\n\nthere\u0027s a lot of stuff that I don\u0027t care for in the k8s quota implementation, notably that it\u0027s pretty inefficient because for every resource involved in the request it needs to do a query against the API for that type of object and get a count of that object. Those count queries essentially boil down to queries against the underlying etcd storage to get the number of those objects that a \"user\" owns (in Kubernetes, there is a thing called a namespace that can be used to provide multi-tenancy, so it\u0027s not really a user..). 
So, instead of doing a single query to get resource usages, Kubernetes always has to do N etcd requests.\n\nNote that because k8s doesn\u0027t use a relational DB for storage, that pattern of always having to do N requests against etcd and essentially performing joins and aggregate queries in memory in their controllers is all over the place in Kubernetes.\n\nThat said, having a single API endpoint (and a single controller) handling all HTTP requests allows k8s to address all quota resource checks for a single request all at once, which makes reasoning about the quota code a bit simpler...","commit_id":"36c1d1f151b5a618ed3ae3b8155dd6f40a9c2fb6"},{"author":{"_account_id":5754,"name":"Alex Xu","email":"hejie.xu@intel.com","username":"xuhj"},"change_message_id":"1e38a46b16c415f31563e99b3de775c7ae725a5e","unresolved":false,"context_lines":[{"line_number":64,"context_line":"possibility of quota limits being exceeded during the maintenance. It could"},{"line_number":65,"context_line":"spare Operators the trouble of potentially having to fix cases where quota has"},{"line_number":66,"context_line":"been exceeded during maintenance or if a cell database connection could not be"},{"line_number":67,"context_line":"established."},{"line_number":68,"context_line":""},{"line_number":69,"context_line":"Proposed change"},{"line_number":70,"context_line":"\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d"}],"source_content_type":"text/x-rst","patch_set":3,"id":"5f7c97a3_3d08b72b","line":67,"in_reply_to":"5f7c97a3_fa0c863c","updated":"2018-05-20 17:08:47.000000000","message":"Just to clarify, in case people misunderstand, my spec isn\u0027t about counting the non-compute resource classes. 
It is also for resources like VGPU, which is managed by nova but which we currently have no way to apply a quota limit to.","commit_id":"36c1d1f151b5a618ed3ae3b8155dd6f40a9c2fb6"},{"author":{"_account_id":4690,"name":"melanie witt","display_name":"melwitt","email":"melwittt@gmail.com","username":"melwitt"},"change_message_id":"a8aa74e222ecde20627b50e583e086942a4e5638","unresolved":false,"context_lines":[{"line_number":74,"context_line":"* One call to placement to get resource usage for CPU and RAM. We can get CPU"},{"line_number":75,"context_line":"  and RAM usage for a project and user by querying the ``/usages`` resource::"},{"line_number":76,"context_line":""},{"line_number":77,"context_line":"    GET /usages?project_id\u003d\u003cproject id\u003e\u0026user_id\u003d\u003cuser id\u003e"},{"line_number":78,"context_line":""},{"line_number":79,"context_line":"* One query to the API database to get resource usage for instances. We can get"},{"line_number":80,"context_line":"  the number of instances for a project and user if we add a new column"}],"source_content_type":"text/x-rst","patch_set":3,"id":"5f7c97a3_c96598b6","line":77,"updated":"2018-05-15 23:59:26.000000000","message":"I finally remembered this is subject to the same issue we were talking about in #openstack-placement today. We’re assuming that the resource usages returned from placement are VCPU and RAM related to instances. But, In The Future they may not be guaranteed to be nova instance VCPU and RAM, because we don’t have a way of identifying the “type” of a consumer.","commit_id":"36c1d1f151b5a618ed3ae3b8155dd6f40a9c2fb6"},{"author":{"_account_id":4393,"name":"Dan Smith","email":"dms@danplanet.com","username":"danms"},"change_message_id":"e0f756783f48f7b5eecb67070e90673e4db93f8c","unresolved":false,"context_lines":[{"line_number":74,"context_line":"* One call to placement to get resource usage for CPU and RAM. 
We can get CPU"},{"line_number":75,"context_line":"  and RAM usage for a project and user by querying the ``/usages`` resource::"},{"line_number":76,"context_line":""},{"line_number":77,"context_line":"    GET /usages?project_id\u003d\u003cproject id\u003e\u0026user_id\u003d\u003cuser id\u003e"},{"line_number":78,"context_line":""},{"line_number":79,"context_line":"* One query to the API database to get resource usage for instances. We can get"},{"line_number":80,"context_line":"  the number of instances for a project and user if we add a new column"}],"source_content_type":"text/x-rst","patch_set":3,"id":"5f7c97a3_83b40ad3","line":77,"in_reply_to":"5f7c97a3_2312d659","updated":"2018-05-20 19:18:10.000000000","message":"Well, I know you can take a hard line on what quota means and say that you\u0027re not quota\u0027ing off any actual resources by providing such a quantity of instances. However, they\u0027re described as \"limits\" and I just think that \"a limit on a number of instances\" is a very human thing to reason about.\n\nAlso, the min_unit and step_size thing is a fair point, but it\u0027s not scoped by tenant, which means you would have to make those restrictions apply to every tenant on the system (or group of computes), and I think that\u0027s never likely to work at any real scale with heterogeneous tenants.","commit_id":"36c1d1f151b5a618ed3ae3b8155dd6f40a9c2fb6"},{"author":{"_account_id":4393,"name":"Dan Smith","email":"dms@danplanet.com","username":"danms"},"change_message_id":"8a6bb70ccb4368b540fd24b16261a3e984fc66b7","unresolved":false,"context_lines":[{"line_number":74,"context_line":"* One call to placement to get resource usage for CPU and RAM. 
We can get CPU"},{"line_number":75,"context_line":"  and RAM usage for a project and user by querying the ``/usages`` resource::"},{"line_number":76,"context_line":""},{"line_number":77,"context_line":"    GET /usages?project_id\u003d\u003cproject id\u003e\u0026user_id\u003d\u003cuser id\u003e"},{"line_number":78,"context_line":""},{"line_number":79,"context_line":"* One query to the API database to get resource usage for instances. We can get"},{"line_number":80,"context_line":"  the number of instances for a project and user if we add a new column"}],"source_content_type":"text/x-rst","patch_set":3,"id":"5f7c97a3_97f524f0","line":77,"in_reply_to":"5f7c97a3_236c1635","updated":"2018-05-21 04:51:15.000000000","message":"No, of course nobody has asked because they currently have a quota on instances. I\u0027m just poking holes in the assertion that min_unit and step_size across the deployment would be enough. We should ask them, I\u0027m just saying I\u0027m expecting those will be the answers.","commit_id":"36c1d1f151b5a618ed3ae3b8155dd6f40a9c2fb6"},{"author":{"_account_id":26936,"name":"Surya Seetharaman","email":"suryaseetharaman.9@gmail.com","username":"tssurya"},"change_message_id":"5c55f23619cd0a6fe5c6f00d792783506b4e6686","unresolved":false,"context_lines":[{"line_number":74,"context_line":"* One call to placement to get resource usage for CPU and RAM. We can get CPU"},{"line_number":75,"context_line":"  and RAM usage for a project and user by querying the ``/usages`` resource::"},{"line_number":76,"context_line":""},{"line_number":77,"context_line":"    GET /usages?project_id\u003d\u003cproject id\u003e\u0026user_id\u003d\u003cuser id\u003e"},{"line_number":78,"context_line":""},{"line_number":79,"context_line":"* One query to the API database to get resource usage for instances. 
We can get"},{"line_number":80,"context_line":"  the number of instances for a project and user if we add a new column"}],"source_content_type":"text/x-rst","patch_set":3,"id":"5f7c97a3_5e592433","line":77,"in_reply_to":"5f7c97a3_67cf8d57","updated":"2018-06-04 07:59:06.000000000","message":"\u003e @melwitt: yes, this is true. that said, I\u0027ve always maintained that\n \u003e the \"instances\" quota in Nova is, well, silly. An \"instance\" isn\u0027t\n \u003e really a quantifiable thing like memory or CPU resources are\n \u003e quantifiable. I mean, if the user launches 10 32-VCPU, 64GB RAM\n \u003e instances and another user launches 10 1-VCPU, 512MB RAM instances,\n \u003e both of those users are \"using\" the same amount of the \"instances\"\n \u003e quota. Which, at least in my mind, brings the usefulness of the\n \u003e \"instances\" quota into question.\n \u003e \n \u003e Long term, I think it\u0027s best to just get rid of the \"instances\"\n \u003e quota class entirely and only rely on limits of the underlying\n \u003e quantifiable resources.\n\nIMHO, me personally would be ok with this (and same in our deployment meaning we will not be affected), but I guess having a quota on instances is more of a comfort for deployments using only 1 flavor type where they don\u0027t want to bother with setting the quantifiable quota limits like CPU/Memory and just want to put a cap on number of instances.","commit_id":"36c1d1f151b5a618ed3ae3b8155dd6f40a9c2fb6"},{"author":{"_account_id":4393,"name":"Dan Smith","email":"dms@danplanet.com","username":"danms"},"change_message_id":"780ed13cb33ee72d04b876be7e13875e81a616b6","unresolved":false,"context_lines":[{"line_number":74,"context_line":"* One call to placement to get resource usage for CPU and RAM. 
We can get CPU"},{"line_number":75,"context_line":"  and RAM usage for a project and user by querying the ``/usages`` resource::"},{"line_number":76,"context_line":""},{"line_number":77,"context_line":"    GET /usages?project_id\u003d\u003cproject id\u003e\u0026user_id\u003d\u003cuser id\u003e"},{"line_number":78,"context_line":""},{"line_number":79,"context_line":"* One query to the API database to get resource usage for instances. We can get"},{"line_number":80,"context_line":"  the number of instances for a project and user if we add a new column"}],"source_content_type":"text/x-rst","patch_set":3,"id":"5f7c97a3_da3f4a85","line":77,"in_reply_to":"5f7c97a3_67cf8d57","updated":"2018-05-20 01:25:58.000000000","message":"\u003e Long term, I think it\u0027s best to just get rid of the \"instances\"\n \u003e quota class entirely and only rely on limits of the underlying\n \u003e quantifiable resources.\n\nLast time I remember talking about this, an operator said that they may give 100GB of RAM as quota, but they definitely do not want a user creating 100 1GB instances, and expect them to use a minimum granularity to use that up. Granted, maybe they could achieve the same thing with also a quota on IPs or something like that, but probably not for network-heavy things where they expect you to have multiple IPs per instance.\n\nAnyway, I\u0027m just saying, we should definitely not assume that instance quota is dumb for everyone without asking.","commit_id":"36c1d1f151b5a618ed3ae3b8155dd6f40a9c2fb6"},{"author":{"_account_id":6873,"name":"Matt Riedemann","email":"mriedem.os@gmail.com","username":"mriedem"},"change_message_id":"4854f1f3210634555a731afb366dd2bd6108b56b","unresolved":false,"context_lines":[{"line_number":74,"context_line":"* One call to placement to get resource usage for CPU and RAM. 
We can get CPU"},{"line_number":75,"context_line":"  and RAM usage for a project and user by querying the ``/usages`` resource::"},{"line_number":76,"context_line":""},{"line_number":77,"context_line":"    GET /usages?project_id\u003d\u003cproject id\u003e\u0026user_id\u003d\u003cuser id\u003e"},{"line_number":78,"context_line":""},{"line_number":79,"context_line":"* One query to the API database to get resource usage for instances. We can get"},{"line_number":80,"context_line":"  the number of instances for a project and user if we add a new column"}],"source_content_type":"text/x-rst","patch_set":3,"id":"5f7c97a3_9a11d256","line":77,"in_reply_to":"5f7c97a3_67cf8d57","updated":"2018-05-19 23:25:23.000000000","message":"\u003e Long term, I think it\u0027s best to just get rid of the \"instances\" quota class entirely and only rely on limits of the underlying quantifiable resources.\n\nNot a bad idea, has this ever been floated by the operators community?","commit_id":"36c1d1f151b5a618ed3ae3b8155dd6f40a9c2fb6"},{"author":{"_account_id":7,"name":"Jay Pipes","email":"jaypipes@gmail.com","username":"jaypipes"},"change_message_id":"138558919a1e872328a67f6e82fd71d146df7308","unresolved":false,"context_lines":[{"line_number":74,"context_line":"* One call to placement to get resource usage for CPU and RAM. We can get CPU"},{"line_number":75,"context_line":"  and RAM usage for a project and user by querying the ``/usages`` resource::"},{"line_number":76,"context_line":""},{"line_number":77,"context_line":"    GET /usages?project_id\u003d\u003cproject id\u003e\u0026user_id\u003d\u003cuser id\u003e"},{"line_number":78,"context_line":""},{"line_number":79,"context_line":"* One query to the API database to get resource usage for instances. 
We can get"},{"line_number":80,"context_line":"  the number of instances for a project and user if we add a new column"}],"source_content_type":"text/x-rst","patch_set":3,"id":"5f7c97a3_236c1635","line":77,"in_reply_to":"5f7c97a3_83b40ad3","updated":"2018-05-20 19:56:05.000000000","message":"\u003e Well, I know you can take a hard line on what quota means and say\n \u003e that you\u0027re not quota\u0027ing off any actual resources by providing\n \u003e such a quantity of instances. However, they\u0027re described as\n \u003e \"limits\" and I just think that \"a limit on a number of instances\"\n \u003e is a very human thing to reason about.\n\nSure, it might be a very human thing to reason about, but at the end of the day, unless you\u0027re talking about instances that are all the same size/flavor, then it\u0027s not a particularly useful thing to apply a limit to, IMHO.\n\n \u003e Also, the min_unit and step_size thing is a fair point, but it\u0027s\n \u003e not scoped by tenant, which means you would have to make those\n \u003e restrictions apply to every tenant on the system (or group of\n \u003e computes), and I think that\u0027s never likely to work at any real\n \u003e scale with heterogeneous tenants.\n\nI\u0027m struggling to think of a situation where an operator would want to control step_size or min_unit per tenant.\n\nControlling the granularity of resource provisioning by tenant isn\u0027t really something I\u0027ve seen much demand for, but I trust you that it is. 
Perhaps you can point me at a deployer that you\u0027ve heard this request from and I can ask some questions of them and get a better idea of the need?","commit_id":"36c1d1f151b5a618ed3ae3b8155dd6f40a9c2fb6"},{"author":{"_account_id":4393,"name":"Dan Smith","email":"dms@danplanet.com","username":"danms"},"change_message_id":"780ed13cb33ee72d04b876be7e13875e81a616b6","unresolved":false,"context_lines":[{"line_number":74,"context_line":"* One call to placement to get resource usage for CPU and RAM. We can get CPU"},{"line_number":75,"context_line":"  and RAM usage for a project and user by querying the ``/usages`` resource::"},{"line_number":76,"context_line":""},{"line_number":77,"context_line":"    GET /usages?project_id\u003d\u003cproject id\u003e\u0026user_id\u003d\u003cuser id\u003e"},{"line_number":78,"context_line":""},{"line_number":79,"context_line":"* One query to the API database to get resource usage for instances. We can get"},{"line_number":80,"context_line":"  the number of instances for a project and user if we add a new column"}],"source_content_type":"text/x-rst","patch_set":3,"id":"5f7c97a3_7abb960d","line":77,"in_reply_to":"5f7c97a3_c96598b6","updated":"2018-05-20 01:25:58.000000000","message":"\u003e I finally remembered this is subject to the same issue we were\n \u003e talking about in #openstack-placement today. We’re assuming that\n \u003e the resource usages returned from placement are VCPU and RAM\n \u003e related to instances. 
But, In The Future they may not be guaranteed\n \u003e to be nova instance VCPU and RAM, because we don’t have a way of\n \u003e identifying the “type” of a consumer.\n\nRight, so we need a revision to this spec to explain how we\u0027re going to scope these queries right?","commit_id":"36c1d1f151b5a618ed3ae3b8155dd6f40a9c2fb6"},{"author":{"_account_id":7,"name":"Jay Pipes","email":"jaypipes@gmail.com","username":"jaypipes"},"change_message_id":"728df427d712ec0ae2cecfa80b31034f7bf622c9","unresolved":false,"context_lines":[{"line_number":74,"context_line":"* One call to placement to get resource usage for CPU and RAM. We can get CPU"},{"line_number":75,"context_line":"  and RAM usage for a project and user by querying the ``/usages`` resource::"},{"line_number":76,"context_line":""},{"line_number":77,"context_line":"    GET /usages?project_id\u003d\u003cproject id\u003e\u0026user_id\u003d\u003cuser id\u003e"},{"line_number":78,"context_line":""},{"line_number":79,"context_line":"* One query to the API database to get resource usage for instances. We can get"},{"line_number":80,"context_line":"  the number of instances for a project and user if we add a new column"}],"source_content_type":"text/x-rst","patch_set":3,"id":"5f7c97a3_67cf8d57","line":77,"in_reply_to":"5f7c97a3_c96598b6","updated":"2018-05-18 15:47:51.000000000","message":"@melwitt: yes, this is true. that said, I\u0027ve always maintained that the \"instances\" quota in Nova is, well, silly. An \"instance\" isn\u0027t really a quantifiable thing like memory or CPU resources are quantifiable. I mean, if the user launches 10 32-VCPU, 64GB RAM instances and another user launches 10 1-VCPU, 512MB RAM instances, both of those users are \"using\" the same amount of the \"instances\" quota. 
Which, at least in my mind, brings the usefulness of the \"instances\" quota into question.\n\nLong term, I think it\u0027s best to just get rid of the \"instances\" quota class entirely and only rely on limits of the underlying quantifiable resources.","commit_id":"36c1d1f151b5a618ed3ae3b8155dd6f40a9c2fb6"},{"author":{"_account_id":5754,"name":"Alex Xu","email":"hejie.xu@intel.com","username":"xuhj"},"change_message_id":"27e9a6e65e9607f32055d0f7d69169560bfa48be","unresolved":false,"context_lines":[{"line_number":74,"context_line":"* One call to placement to get resource usage for CPU and RAM. We can get CPU"},{"line_number":75,"context_line":"  and RAM usage for a project and user by querying the ``/usages`` resource::"},{"line_number":76,"context_line":""},{"line_number":77,"context_line":"    GET /usages?project_id\u003d\u003cproject id\u003e\u0026user_id\u003d\u003cuser id\u003e"},{"line_number":78,"context_line":""},{"line_number":79,"context_line":"* One query to the API database to get resource usage for instances. We can get"},{"line_number":80,"context_line":"  the number of instances for a project and user if we add a new column"}],"source_content_type":"text/x-rst","patch_set":3,"id":"5f7c97a3_1f43c243","line":77,"in_reply_to":"5f7c97a3_c96598b6","updated":"2018-05-16 03:03:49.000000000","message":"One way is to follow the method described below: we can get the instance UUIDs from the instance_mappings table, then query the usage for that specific set of instance UUIDs. But... yes, that is ugly, limited by the length of the URL, and I\u0027m not sure about the performance.","commit_id":"36c1d1f151b5a618ed3ae3b8155dd6f40a9c2fb6"},{"author":{"_account_id":7,"name":"Jay Pipes","email":"jaypipes@gmail.com","username":"jaypipes"},"change_message_id":"f18f8397fbf150f93f233d1139ac6802fa092128","unresolved":false,"context_lines":[{"line_number":74,"context_line":"* One call to placement to get resource usage for CPU and RAM. 
We can get CPU"},{"line_number":75,"context_line":"  and RAM usage for a project and user by querying the ``/usages`` resource::"},{"line_number":76,"context_line":""},{"line_number":77,"context_line":"    GET /usages?project_id\u003d\u003cproject id\u003e\u0026user_id\u003d\u003cuser id\u003e"},{"line_number":78,"context_line":""},{"line_number":79,"context_line":"* One query to the API database to get resource usage for instances. We can get"},{"line_number":80,"context_line":"  the number of instances for a project and user if we add a new column"}],"source_content_type":"text/x-rst","patch_set":3,"id":"5f7c97a3_2312d659","line":77,"in_reply_to":"5f7c97a3_da3f4a85","updated":"2018-05-20 18:33:23.000000000","message":"\u003e \u003e Long term, I think it\u0027s best to just get rid of the \"instances\"\n \u003e \u003e quota class entirely and only rely on limits of the underlying\n \u003e \u003e quantifiable resources.\n \u003e \n \u003e Last time I remember talking about this, an operator said that they\n \u003e may give 100GB of RAM as quota, but they definitely do not want a\n \u003e user creating 100 1GB instances, and expect them to use a minimum\n \u003e granularity to use that up. Granted, maybe they could achieve the\n \u003e same thing with also a quota on IPs or something like that, but\n \u003e probably not for network-heavy things where they expect you to have\n \u003e multiple IPs per instance.\n\nThat\u0027s just abusing the quota system to address a different problem, IMHO.\n\nSuch operators could use the step_size and min_unit information for inventory records in Placement to prevent the same thing from happening if their concern was about too small slices of disk being handed out.\n\n \u003e Anyway, I\u0027m just saying, we should definitely not assume that\n \u003e instance quota is dumb for everyone without asking.\n\nSure, no disagreement from me. 
But let\u0027s make sure operators aren\u0027t misusing quota limits to solve a completely different problem than what quotas are intended to solve.","commit_id":"36c1d1f151b5a618ed3ae3b8155dd6f40a9c2fb6"},{"author":{"_account_id":6873,"name":"Matt Riedemann","email":"mriedem.os@gmail.com","username":"mriedem"},"change_message_id":"4854f1f3210634555a731afb366dd2bd6108b56b","unresolved":false,"context_lines":[{"line_number":75,"context_line":"  and RAM usage for a project and user by querying the ``/usages`` resource::"},{"line_number":76,"context_line":""},{"line_number":77,"context_line":"    GET /usages?project_id\u003d\u003cproject id\u003e\u0026user_id\u003d\u003cuser id\u003e"},{"line_number":78,"context_line":""},{"line_number":79,"context_line":"* One query to the API database to get resource usage for instances. We can get"},{"line_number":80,"context_line":"  the number of instances for a project and user if we add a new column"},{"line_number":81,"context_line":"  ``user_id`` to the ``nova_api.instance_mappings`` table. We already have a"}],"source_content_type":"text/x-rst","patch_set":3,"id":"5f7c97a3_9a3a32cb","line":78,"updated":"2018-05-19 23:25:23.000000000","message":"So, we\u0027ve got an issue with using GET /usages in that any allocations created before placement microversion 1.8 won\u0027t have a consumer record in the placement database, and usages are based on the consumer records because it\u0027s the consumer record that stores the project_id and user_id values (not the allocations table). This is discussed a bit in the review comments for the consumer generation spec amendment:\n\nhttps://review.openstack.org/#/c/565565/\n\nDan\u0027s idea for this is any allocations created with placement microversion \u003c 1.8 will create a consumer record with a config-driven project/user, which could either be a sentinel just for knowing that usages by that project/user are for this case, or they could be a (the) admin user for a deployment. 
If we did that, counting quota gets a bit weird here since any allocations created before 1.8 for *my* instances won\u0027t count against my quota usage, but instead against whatever is in the config.\n\nWe could certainly heal these older allocation records in nova using the \"nova-manage placement heal_allocations\" CLI [1] but I\u0027m not really sure how yet, since I\u0027d think the \u0027heal\u0027 routine would be looking for allocations without a matching consumer for a given instance and then create the consumer record (or re-create the allocation record). But if we use the config value to create the consumer record, then the actual allocation for the instance won\u0027t be counted against the same project/user as the instance, which is the problem above. Anyway, we\u0027ve got an upgrade/data migration issue that needs to be solved.\n\n[1] https://review.openstack.org/#/c/565886/","commit_id":"36c1d1f151b5a618ed3ae3b8155dd6f40a9c2fb6"},{"author":{"_account_id":4393,"name":"Dan Smith","email":"dms@danplanet.com","username":"danms"},"change_message_id":"780ed13cb33ee72d04b876be7e13875e81a616b6","unresolved":false,"context_lines":[{"line_number":75,"context_line":"  and RAM usage for a project and user by querying the ``/usages`` resource::"},{"line_number":76,"context_line":""},{"line_number":77,"context_line":"    GET /usages?project_id\u003d\u003cproject id\u003e\u0026user_id\u003d\u003cuser id\u003e"},{"line_number":78,"context_line":""},{"line_number":79,"context_line":"* One query to the API database to get resource usage for instances. We can get"},{"line_number":80,"context_line":"  the number of instances for a project and user if we add a new column"},{"line_number":81,"context_line":"  ``user_id`` to the ``nova_api.instance_mappings`` table. 
We already have a"}],"source_content_type":"text/x-rst","patch_set":3,"id":"5f7c97a3_7ae2b631","line":78,"in_reply_to":"5f7c97a3_9a3a32cb","updated":"2018-05-20 01:25:58.000000000","message":"Yeah, for nova we\u0027ll just have to make sure we\u0027ve healed the older allocations, which we could have the compute nodes do (with all the races we know come from that approach) or have something dedicated for it. It\u0027s going to be ugly and heavy, but I think it\u0027s got to happen eventually even without this proposal.","commit_id":"36c1d1f151b5a618ed3ae3b8155dd6f40a9c2fb6"},{"author":{"_account_id":6873,"name":"Matt Riedemann","email":"mriedem.os@gmail.com","username":"mriedem"},"change_message_id":"4854f1f3210634555a731afb366dd2bd6108b56b","unresolved":false,"context_lines":[{"line_number":80,"context_line":"  the number of instances for a project and user if we add a new column"},{"line_number":81,"context_line":"  ``user_id`` to the ``nova_api.instance_mappings`` table. We already have a"},{"line_number":82,"context_line":"  ``project_id`` column on the table. This will allow us to count instance"},{"line_number":83,"context_line":"  mappings for a project and a user to represent the instance count. We will"},{"line_number":84,"context_line":"  also rely on the new column ``queued_for_delete`` from the `spec proposal for"},{"line_number":85,"context_line":"  handling a \"down\" cell. \u003cReferences_\u003e`_"},{"line_number":86,"context_line":""},{"line_number":87,"context_line":"Alternatives"},{"line_number":88,"context_line":"------------"}],"source_content_type":"text/x-rst","patch_set":3,"id":"5f7c97a3_1a9aa2c4","line":85,"range":{"start_line":83,"start_character":69,"end_line":85,"end_character":41},"updated":"2018-05-19 23:25:23.000000000","message":"Does that spec go into detail about how this column is related to counting quotas? 
If not, we should at least give some brief explanation of why that is relevant here (soft delete, but let\u0027s mention it).","commit_id":"36c1d1f151b5a618ed3ae3b8155dd6f40a9c2fb6"},{"author":{"_account_id":26936,"name":"Surya Seetharaman","email":"suryaseetharaman.9@gmail.com","username":"tssurya"},"change_message_id":"5c55f23619cd0a6fe5c6f00d792783506b4e6686","unresolved":false,"context_lines":[{"line_number":80,"context_line":"  the number of instances for a project and user if we add a new column"},{"line_number":81,"context_line":"  ``user_id`` to the ``nova_api.instance_mappings`` table. We already have a"},{"line_number":82,"context_line":"  ``project_id`` column on the table. This will allow us to count instance"},{"line_number":83,"context_line":"  mappings for a project and a user to represent the instance count. We will"},{"line_number":84,"context_line":"  also rely on the new column ``queued_for_delete`` from the `spec proposal for"},{"line_number":85,"context_line":"  handling a \"down\" cell. \u003cReferences_\u003e`_"},{"line_number":86,"context_line":""},{"line_number":87,"context_line":"Alternatives"},{"line_number":88,"context_line":"------------"}],"source_content_type":"text/x-rst","patch_set":3,"id":"5f7c97a3_6a021bbd","line":85,"range":{"start_line":83,"start_character":69,"end_line":85,"end_character":41},"in_reply_to":"5f7c97a3_1a9aa2c4","updated":"2018-06-04 07:59:06.000000000","message":"Not yet, because that spec for now states that booting should not be allowed if any cell having that tenant\u0027s instances is down until an all-cell-iteration independent solution for creating quotas like this one is proposed. 
However I will update that spec to include more details along this direction since this spec is likely to get merged.","commit_id":"36c1d1f151b5a618ed3ae3b8155dd6f40a9c2fb6"},{"author":{"_account_id":14070,"name":"Eric Fried","email":"openstack@fried.cc","username":"efried"},"change_message_id":"69e87d071361feddfb8a12ef45af68ff3ea053d2","unresolved":false,"context_lines":[{"line_number":87,"context_line":"Alternatives"},{"line_number":88,"context_line":"------------"},{"line_number":89,"context_line":""},{"line_number":90,"context_line":"There is not really an alternative other than not changing how we count usage"},{"line_number":91,"context_line":"for instances, CPU, and RAM."},{"line_number":92,"context_line":""},{"line_number":93,"context_line":"Data model impact"},{"line_number":94,"context_line":"-----------------"}],"source_content_type":"text/x-rst","patch_set":3,"id":"5f7c97a3_7a9a0c62","line":91,"range":{"start_line":90,"start_character":0,"end_line":91,"end_character":28},"updated":"2018-05-15 19:00:49.000000000","message":"This could talk about the alternative of adding a placement API endpoint allowing us to count consumers. Which would then explain that we can\u0027t do that because consumers aren\u0027t always instances.  (Which I don\u0027t fully understand - can I get a counterexample?)\n\nBecause the simplest thing would be to add a field (e.g. consumer_count) to the response in the GET /usages API.  
Then you could do this thing in one call.","commit_id":"36c1d1f151b5a618ed3ae3b8155dd6f40a9c2fb6"},{"author":{"_account_id":5754,"name":"Alex Xu","email":"hejie.xu@intel.com","username":"xuhj"},"change_message_id":"430653681420262ddb89fa7085db3654d989ef81","unresolved":false,"context_lines":[{"line_number":87,"context_line":"Alternatives"},{"line_number":88,"context_line":"------------"},{"line_number":89,"context_line":""},{"line_number":90,"context_line":"There is not really an alternative other than not changing how we count usage"},{"line_number":91,"context_line":"for instances, CPU, and RAM."},{"line_number":92,"context_line":""},{"line_number":93,"context_line":"Data model impact"},{"line_number":94,"context_line":"-----------------"}],"source_content_type":"text/x-rst","patch_set":3,"id":"5f7c97a3_ff01cab0","line":91,"range":{"start_line":90,"start_character":0,"end_line":91,"end_character":28},"in_reply_to":"5f7c97a3_034f9a34","updated":"2018-05-21 12:16:24.000000000","message":"The tricky thing is that live migration is an admin-triggered action, which is something that should be hidden from the end user. 
When there is some extra resource consumption and the end user can\u0027t create more resources, there is no way for the end user to understand what is happening in the background. (I think all the APIs for querying migration records are admin-only.)","commit_id":"36c1d1f151b5a618ed3ae3b8155dd6f40a9c2fb6"},{"author":{"_account_id":6873,"name":"Matt Riedemann","email":"mriedem.os@gmail.com","username":"mriedem"},"change_message_id":"4854f1f3210634555a731afb366dd2bd6108b56b","unresolved":false,"context_lines":[{"line_number":87,"context_line":"Alternatives"},{"line_number":88,"context_line":"------------"},{"line_number":89,"context_line":""},{"line_number":90,"context_line":"There is not really an alternative other than not changing how we count usage"},{"line_number":91,"context_line":"for instances, CPU, and RAM."},{"line_number":92,"context_line":""},{"line_number":93,"context_line":"Data model impact"},{"line_number":94,"context_line":"-----------------"}],"source_content_type":"text/x-rst","patch_set":3,"id":"5f7c97a3_9a3f92b4","line":91,"range":{"start_line":90,"start_character":0,"end_line":91,"end_character":28},"in_reply_to":"5f7c97a3_071819d1","updated":"2018-05-19 23:25:23.000000000","message":"\u003e Migrations are also consumers, and they aren\u0027t instances.\n\nTrue, and this makes me wonder, what do we use for the project/user when swapping the allocations during a migration, because the actual context for the migration is going to have the admin project/user, not the instance\u0027s project/user.\n\nThis is the code in question:\n\nhttps://github.com/openstack/nova/blob/20b99f6998c088650b0c0cb066cc6aac3e5f9312/nova/conductor/tasks/migrate.py#L59\n\nBased on that, it looks like the migration consumer temporary allocation is using the instance\u0027s project and user so we\u0027re good. 
/whew","commit_id":"36c1d1f151b5a618ed3ae3b8155dd6f40a9c2fb6"},{"author":{"_account_id":7,"name":"Jay Pipes","email":"jaypipes@gmail.com","username":"jaypipes"},"change_message_id":"86385168e8edc2df72fa14cdf2e9d5f90c272bad","unresolved":false,"context_lines":[{"line_number":87,"context_line":"Alternatives"},{"line_number":88,"context_line":"------------"},{"line_number":89,"context_line":""},{"line_number":90,"context_line":"There is not really an alternative other than not changing how we count usage"},{"line_number":91,"context_line":"for instances, CPU, and RAM."},{"line_number":92,"context_line":""},{"line_number":93,"context_line":"Data model impact"},{"line_number":94,"context_line":"-----------------"}],"source_content_type":"text/x-rst","patch_set":3,"id":"5f7c97a3_034f9a34","line":91,"range":{"start_line":90,"start_character":0,"end_line":91,"end_character":28},"in_reply_to":"5f7c97a3_4036e8e9","updated":"2018-05-20 18:46:17.000000000","message":"\u003e I feel when you guys say the migration, the only concern is about\n \u003e we can\u0027t distinguish the migration and the instance when counting\n \u003e the instance quota. But it isn\u0027t only the concern for the instance\n \u003e quota, it is also a concern for the cpu, ram, disk all other\n \u003e quotas. 
We will have extra usage when there is any migration\n \u003e consumption.\n\nYep, that\u0027s correct.\n\nBut, at the end of the day, a migration really does consume more resources from the system, and those resources are indeed being consumed on behalf of the user/tenant.\n\nSo, while I don\u0027t believe we do quota usage *checking* for live migration or non-resize migrations, I actually think it\u0027s OK to have a user\u0027s usage counts reflect additional usage during a migration -- and if that means that during a migration the user isn\u0027t able to spin up as many resources, I\u0027m totally fine with that.","commit_id":"36c1d1f151b5a618ed3ae3b8155dd6f40a9c2fb6"},{"author":{"_account_id":14070,"name":"Eric Fried","email":"openstack@fried.cc","username":"efried"},"change_message_id":"6e2145077fb654650d431b46ce89875e8d1d885c","unresolved":false,"context_lines":[{"line_number":87,"context_line":"Alternatives"},{"line_number":88,"context_line":"------------"},{"line_number":89,"context_line":""},{"line_number":90,"context_line":"There is not really an alternative other than not changing how we count usage"},{"line_number":91,"context_line":"for instances, CPU, and RAM."},{"line_number":92,"context_line":""},{"line_number":93,"context_line":"Data model impact"},{"line_number":94,"context_line":"-----------------"}],"source_content_type":"text/x-rst","patch_set":3,"id":"5f7c97a3_9a65a820","line":91,"range":{"start_line":90,"start_character":0,"end_line":91,"end_character":28},"in_reply_to":"5f7c97a3_7a9a0c62","updated":"2018-05-15 19:16:46.000000000","message":"[Later] cdent explained to me about accounting for deleted (well, not-quite-yet-deleted) instances.  
This may be more than we want to bite off in this bp, but have we considered deleting the allocations from placement when we set queued_for_delete rather than having to wait for the cell to come back?\n\n(If there are other reasons we can\u0027t assume consumer \u003d\u003d instance, the above still wouldn\u0027t get us to the point of being able to get rid of the nova DB side of this equation.  But it\u0027s food for thought.)","commit_id":"36c1d1f151b5a618ed3ae3b8155dd6f40a9c2fb6"},{"author":{"_account_id":5754,"name":"Alex Xu","email":"hejie.xu@intel.com","username":"xuhj"},"change_message_id":"1e38a46b16c415f31563e99b3de775c7ae725a5e","unresolved":false,"context_lines":[{"line_number":87,"context_line":"Alternatives"},{"line_number":88,"context_line":"------------"},{"line_number":89,"context_line":""},{"line_number":90,"context_line":"There is not really an alternative other than not changing how we count usage"},{"line_number":91,"context_line":"for instances, CPU, and RAM."},{"line_number":92,"context_line":""},{"line_number":93,"context_line":"Data model impact"},{"line_number":94,"context_line":"-----------------"}],"source_content_type":"text/x-rst","patch_set":3,"id":"5f7c97a3_4036e8e9","line":91,"range":{"start_line":90,"start_character":0,"end_line":91,"end_character":28},"in_reply_to":"5f7c97a3_9a3f92b4","updated":"2018-05-20 17:08:47.000000000","message":"I feel when you guys say the migration, the only concern is about we can\u0027t distinguish the migration and the instance when counting the instance quota. But it isn\u0027t only the concern for the instance quota, it is also a concern for the cpu, ram, disk all other quotas. 
We will have extra usage when there is any migration consumption.","commit_id":"36c1d1f151b5a618ed3ae3b8155dd6f40a9c2fb6"},{"author":{"_account_id":7,"name":"Jay Pipes","email":"jaypipes@gmail.com","username":"jaypipes"},"change_message_id":"728df427d712ec0ae2cecfa80b31034f7bf622c9","unresolved":false,"context_lines":[{"line_number":87,"context_line":"Alternatives"},{"line_number":88,"context_line":"------------"},{"line_number":89,"context_line":""},{"line_number":90,"context_line":"There is not really an alternative other than not changing how we count usage"},{"line_number":91,"context_line":"for instances, CPU, and RAM."},{"line_number":92,"context_line":""},{"line_number":93,"context_line":"Data model impact"},{"line_number":94,"context_line":"-----------------"}],"source_content_type":"text/x-rst","patch_set":3,"id":"5f7c97a3_071819d1","line":91,"range":{"start_line":90,"start_character":0,"end_line":91,"end_character":28},"in_reply_to":"5f7c97a3_9a65a820","updated":"2018-05-18 15:47:51.000000000","message":"Migrations are also consumers, and they aren\u0027t instances.\n\nEither way, the long-term solution IMO is to stop using quota classes that don\u0027t refer to quantifiable resources. The \"instances\" quota class is a perfect example of something that isn\u0027t quantifiable. 
See my response to @melwitt above for why \"instances\" isn\u0027t quantifiable.","commit_id":"36c1d1f151b5a618ed3ae3b8155dd6f40a9c2fb6"},{"author":{"_account_id":7,"name":"Jay Pipes","email":"jaypipes@gmail.com","username":"jaypipes"},"change_message_id":"5c238ff57b917e952c007c3cff92459f1e14ad58","unresolved":false,"context_lines":[{"line_number":87,"context_line":"Alternatives"},{"line_number":88,"context_line":"------------"},{"line_number":89,"context_line":""},{"line_number":90,"context_line":"There is not really an alternative other than not changing how we count usage"},{"line_number":91,"context_line":"for instances, CPU, and RAM."},{"line_number":92,"context_line":""},{"line_number":93,"context_line":"Data model impact"},{"line_number":94,"context_line":"-----------------"}],"source_content_type":"text/x-rst","patch_set":3,"id":"5f7c97a3_bfb7d2d8","line":91,"range":{"start_line":90,"start_character":0,"end_line":91,"end_character":28},"in_reply_to":"5f7c97a3_ff01cab0","updated":"2018-05-21 12:23:53.000000000","message":"ack, that\u0027s a problem for sure. 
I\u0027m not sure it\u0027s a *huge* problem, but it\u0027s a problem.","commit_id":"36c1d1f151b5a618ed3ae3b8155dd6f40a9c2fb6"},{"author":{"_account_id":4393,"name":"Dan Smith","email":"dms@danplanet.com","username":"danms"},"change_message_id":"780ed13cb33ee72d04b876be7e13875e81a616b6","unresolved":false,"context_lines":[{"line_number":122,"context_line":""},{"line_number":123,"context_line":"Performance of quota resource counting should be more efficient when querying"},{"line_number":124,"context_line":"placement instead of querying separate cell databases and aggregating the"},{"line_number":125,"context_line":"results."},{"line_number":126,"context_line":""},{"line_number":127,"context_line":"Other deployer impact"},{"line_number":128,"context_line":"---------------------"}],"source_content_type":"text/x-rst","patch_set":3,"id":"5f7c97a3_3ad67e51","line":125,"updated":"2018-05-20 01:25:58.000000000","message":"As above, I\u0027m not sure why this would be. Querying multiple smaller databases in parallel should always be faster than querying the same data from a single large one, AFAIK, and that is the result I found while profiling the parallel instance list stuff.","commit_id":"36c1d1f151b5a618ed3ae3b8155dd6f40a9c2fb6"},{"author":{"_account_id":6873,"name":"Matt Riedemann","email":"mriedem.os@gmail.com","username":"mriedem"},"change_message_id":"4854f1f3210634555a731afb366dd2bd6108b56b","unresolved":false,"context_lines":[{"line_number":137,"context_line":"Upgrade impact"},{"line_number":138,"context_line":"--------------"},{"line_number":139,"context_line":""},{"line_number":140,"context_line":"None."},{"line_number":141,"context_line":""},{"line_number":142,"context_line":"Implementation"},{"line_number":143,"context_line":"\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d"}],"source_content_type":"text/x-rst","patch_set":3,"id":"5f7c97a3_ba3ccebf","line":140,"updated":"2018-05-19 23:25:23.000000000","message":"See above, 
we need to do something about older allocation records that don\u0027t have a consumer.\n\nLooks like that would be any instances created before Pike:\n\nhttps://review.openstack.org/#/c/469634/\n\nSo it\u0027s definitely possible to have instances from newton or ocata with allocation records that don\u0027t have matching consumer records.\n\nI also looked to see when we started claiming resources during scheduling in pike and that code also does a PUT /allocations but it uses 1.10 so project/user are there:\n\nhttps://github.com/openstack/nova/blob/9d1db10f842286fb59b9316987e075fd8758107d/nova/scheduler/client/report.py#L997\n\nOne other thing that makes me uneasy is this:\n\nhttps://github.com/openstack/nova/blob/23c4eb34380bdf3eece11abbe0f6ccb68c060f47/nova/scheduler/filter_scheduler.py#L273\n\nCould the allocations have the wrong user_id based on that? I don\u0027t think so - the project in the request spec should match the project in the user context during server create, however, later on an admin could move the instance, which is going to claim resources against a new destination host, and at that point, the allocations are going to have the project_id of the original instance user (because they come from the request spec) but the allocation user_id is going to come from the current request context, which is going to be an admin for something like cold migrate, live migrate or evacuate. See:\n\nhttps://review.openstack.org/#/c/568917/\n\nALSO, we don\u0027t claim resources in the scheduler if using the CachingScheduler. 
We don\u0027t use placement at all if using the CachingScheduler, and we stopped creating allocations for instances in the ResourceTracker (in nova-compute) starting in pike once all of your computes are running at least pike:\n\nIa93168b1560267178059284186fb2b7096c7e81f\n\nSo if \u003e\u003dPike and CachingScheduler, we don\u0027t even have allocations, which means those instances don\u0027t have consumer records and therefore they don\u0027t have usages, so switching to GET /usages for counting quota means those instances are reported as not using any quota. :{\n\nSo we have quite a few problem scenarios here...\n\nThe \"nova-manage placement heal_allocations\" CLI is meant to be part of the solution to migrate deployments off the CachingScheduler, but I don\u0027t think we can just switch over to using GET /usages until (1) we know CachingScheduler deployments have migrated (could we add a blocker db migration to enforce this?) or (2) we\u0027d have to add a backdoor workaround option to continue counting quotas the current way until we can drop the CachingScheduler.","commit_id":"36c1d1f151b5a618ed3ae3b8155dd6f40a9c2fb6"},{"author":{"_account_id":6873,"name":"Matt Riedemann","email":"mriedem.os@gmail.com","username":"mriedem"},"change_message_id":"4854f1f3210634555a731afb366dd2bd6108b56b","unresolved":false,"context_lines":[{"line_number":175,"context_line":"Documentation Impact"},{"line_number":176,"context_line":"\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d"},{"line_number":177,"context_line":""},{"line_number":178,"context_line":"The documentation_ of Cells v2 caveats will be updated to remove the paragraph"},{"line_number":179,"context_line":"about the inability to correctly calculate quota usage when one or more cells"},{"line_number":180,"context_line":"are 
unreachable."},{"line_number":181,"context_line":""}],"source_content_type":"text/x-rst","patch_set":3,"id":"5f7c97a3_7a5f160e","line":178,"range":{"start_line":178,"start_character":58,"end_line":178,"end_character":64},"updated":"2018-05-19 23:25:23.000000000","message":"We shouldn\u0027t remove it, but amend it to mention when it\u0027s no longer a problem, since it will be a problem in older deployments and people might be reading those docs when upgrading from mitaka to like pike or queens.","commit_id":"36c1d1f151b5a618ed3ae3b8155dd6f40a9c2fb6"}],"specs/stein/approved/count-quota-usage-from-placement.rst":[{"author":{"_account_id":4393,"name":"Dan Smith","email":"dms@danplanet.com","username":"danms"},"change_message_id":"b81e2dd6ac9880b831cb09b5ee47ae461c8eb96a","unresolved":false,"context_lines":[{"line_number":51,"context_line":"RAM quotas. By querying placement and the API database, we can avoid reading"},{"line_number":52,"context_line":"separate cell databases for resource usage."},{"line_number":53,"context_line":""},{"line_number":54,"context_line":"Note: because there is not yet ability to partition allocations in placement,"},{"line_number":55,"context_line":"in order to support edge deployments where multiple Nova instances share the"},{"line_number":56,"context_line":"same placement service, we can add a"},{"line_number":57,"context_line":"``[workarounds]disable_quota_usage_from_placement`` which defaults to False."}],"source_content_type":"text/x-rst","patch_set":4,"id":"3f79a3b5_9988c625","line":54,"range":{"start_line":54,"start_character":52,"end_line":54,"end_character":63},"updated":"2018-11-20 22:08:19.000000000","message":"I think the partitioning we need is on RPs not allocations right? 
I think we would represent a RP as belonging to one partition at create time, and then the allocation work doesn\u0027t need to worry about it after that.","commit_id":"0d52ed297c1e73b32c590d41e260da5f8e1dbf37"},{"author":{"_account_id":4393,"name":"Dan Smith","email":"dms@danplanet.com","username":"danms"},"change_message_id":"b81e2dd6ac9880b831cb09b5ee47ae461c8eb96a","unresolved":false,"context_lines":[{"line_number":51,"context_line":"RAM quotas. By querying placement and the API database, we can avoid reading"},{"line_number":52,"context_line":"separate cell databases for resource usage."},{"line_number":53,"context_line":""},{"line_number":54,"context_line":"Note: because there is not yet ability to partition allocations in placement,"},{"line_number":55,"context_line":"in order to support edge deployments where multiple Nova instances share the"},{"line_number":56,"context_line":"same placement service, we can add a"},{"line_number":57,"context_line":"``[workarounds]disable_quota_usage_from_placement`` which defaults to False."}],"source_content_type":"text/x-rst","patch_set":4,"id":"3f79a3b5_b9a342b0","line":54,"range":{"start_line":54,"start_character":31,"end_line":54,"end_character":38},"updated":"2018-11-20 22:08:19.000000000","message":"an ability?\n\nthe ability?","commit_id":"0d52ed297c1e73b32c590d41e260da5f8e1dbf37"},{"author":{"_account_id":4690,"name":"melanie witt","display_name":"melwitt","email":"melwittt@gmail.com","username":"melwitt"},"change_message_id":"1b14af3e902e63299b349e6301a90194d2af7ce0","unresolved":false,"context_lines":[{"line_number":51,"context_line":"RAM quotas. 
By querying placement and the API database, we can avoid reading"},{"line_number":52,"context_line":"separate cell databases for resource usage."},{"line_number":53,"context_line":""},{"line_number":54,"context_line":"Note: because there is not yet ability to partition allocations in placement,"},{"line_number":55,"context_line":"in order to support edge deployments where multiple Nova instances share the"},{"line_number":56,"context_line":"same placement service, we can add a"},{"line_number":57,"context_line":"``[workarounds]disable_quota_usage_from_placement`` which defaults to False."}],"source_content_type":"text/x-rst","patch_set":4,"id":"3f79a3b5_7a77c63d","line":54,"range":{"start_line":54,"start_character":52,"end_line":54,"end_character":63},"in_reply_to":"3f79a3b5_9988c625","updated":"2018-11-21 20:56:36.000000000","message":"Probably, yes. I was thinking about it at the more fine-grained level, I think. That is, when we ask placement for /usages, it is counting allocations, and those allocations need to be partitioned.\n\nIf we think about partitioning for a particular nova instance, indeed I think we could partition RPs and any allocation granted from them belongs to them. 
So if /usages counting derives the partition key from its allocations\u0027 RPs, then that works too.\n\nI can change this to reflect that.","commit_id":"0d52ed297c1e73b32c590d41e260da5f8e1dbf37"},{"author":{"_account_id":4393,"name":"Dan Smith","email":"dms@danplanet.com","username":"danms"},"change_message_id":"b81e2dd6ac9880b831cb09b5ee47ae461c8eb96a","unresolved":false,"context_lines":[{"line_number":52,"context_line":"separate cell databases for resource usage."},{"line_number":53,"context_line":""},{"line_number":54,"context_line":"Note: because there is not yet ability to partition allocations in placement,"},{"line_number":55,"context_line":"in order to support edge deployments where multiple Nova instances share the"},{"line_number":56,"context_line":"same placement service, we can add a"},{"line_number":57,"context_line":"``[workarounds]disable_quota_usage_from_placement`` which defaults to False."},{"line_number":58,"context_line":"If True, we use the legacy quota counting method for instances, cores, and"}],"source_content_type":"text/x-rst","patch_set":4,"id":"3f79a3b5_f9821a02","line":55,"range":{"start_line":55,"start_character":20,"end_line":55,"end_character":24},"updated":"2018-11-20 22:08:19.000000000","message":"Just remove this. 
Multiple novas, or a nova+standalone ironic could be a thing without any edge-y-ness.","commit_id":"0d52ed297c1e73b32c590d41e260da5f8e1dbf37"},{"author":{"_account_id":4690,"name":"melanie witt","display_name":"melwitt","email":"melwittt@gmail.com","username":"melwitt"},"change_message_id":"1b14af3e902e63299b349e6301a90194d2af7ce0","unresolved":false,"context_lines":[{"line_number":52,"context_line":"separate cell databases for resource usage."},{"line_number":53,"context_line":""},{"line_number":54,"context_line":"Note: because there is not yet ability to partition allocations in placement,"},{"line_number":55,"context_line":"in order to support edge deployments where multiple Nova instances share the"},{"line_number":56,"context_line":"same placement service, we can add a"},{"line_number":57,"context_line":"``[workarounds]disable_quota_usage_from_placement`` which defaults to False."},{"line_number":58,"context_line":"If True, we use the legacy quota counting method for instances, cores, and"}],"source_content_type":"text/x-rst","patch_set":4,"id":"3f79a3b5_9a742239","line":55,"range":{"start_line":55,"start_character":20,"end_line":55,"end_character":24},"in_reply_to":"3f79a3b5_f9821a02","updated":"2018-11-21 20:56:36.000000000","message":"Can do.","commit_id":"0d52ed297c1e73b32c590d41e260da5f8e1dbf37"},{"author":{"_account_id":4393,"name":"Dan Smith","email":"dms@danplanet.com","username":"danms"},"change_message_id":"b81e2dd6ac9880b831cb09b5ee47ae461c8eb96a","unresolved":false,"context_lines":[{"line_number":54,"context_line":"Note: because there is not yet ability to partition allocations in placement,"},{"line_number":55,"context_line":"in order to support edge deployments where multiple Nova instances share the"},{"line_number":56,"context_line":"same placement service, we can add a"},{"line_number":57,"context_line":"``[workarounds]disable_quota_usage_from_placement`` which defaults to False."},{"line_number":58,"context_line":"If True, we use the legacy 
quota counting method for instances, cores, and"},{"line_number":59,"context_line":"ram. If False, we use a quota counting method that calls placement. This is a"},{"line_number":60,"context_line":"minimal way to keep \"legacy\" quota counting available for the scenario of"}],"source_content_type":"text/x-rst","patch_set":4,"id":"3f79a3b5_b9910248","line":57,"updated":"2018-11-20 22:08:19.000000000","message":"I\u0027m sad we have to do this, and keep two kinds of quota routines around for however long. I\u0027m glad it\u0027s in the workarounds section because I hope it indicates that we\u0027re not planning to keep it around for long. However, see below.","commit_id":"0d52ed297c1e73b32c590d41e260da5f8e1dbf37"},{"author":{"_account_id":4690,"name":"melanie witt","display_name":"melwitt","email":"melwittt@gmail.com","username":"melwitt"},"change_message_id":"1b14af3e902e63299b349e6301a90194d2af7ce0","unresolved":false,"context_lines":[{"line_number":54,"context_line":"Note: because there is not yet ability to partition allocations in placement,"},{"line_number":55,"context_line":"in order to support edge deployments where multiple Nova instances share the"},{"line_number":56,"context_line":"same placement service, we can add a"},{"line_number":57,"context_line":"``[workarounds]disable_quota_usage_from_placement`` which defaults to False."},{"line_number":58,"context_line":"If True, we use the legacy quota counting method for instances, cores, and"},{"line_number":59,"context_line":"ram. If False, we use a quota counting method that calls placement. This is a"},{"line_number":60,"context_line":"minimal way to keep \"legacy\" quota counting available for the scenario of"}],"source_content_type":"text/x-rst","patch_set":4,"id":"3f79a3b5_fab196e4","line":57,"in_reply_to":"3f79a3b5_b9910248","updated":"2018-11-21 20:56:36.000000000","message":"It is, but I feel the tradeoff is worthwhile. The debt we carry is one counting method. 
And the gain is letting our multi-cell operators have quota counting insulated from down or poor performing cells, not having to choose between two extremes of 1) fail boot requests if a project owns instances in \"down\" cells or 2) allow quota limits to be potentially exceeded.","commit_id":"0d52ed297c1e73b32c590d41e260da5f8e1dbf37"},{"author":{"_account_id":4393,"name":"Dan Smith","email":"dms@danplanet.com","username":"danms"},"change_message_id":"b81e2dd6ac9880b831cb09b5ee47ae461c8eb96a","unresolved":false,"context_lines":[{"line_number":122,"context_line":"implement the policy-driven behavior where an operator has to choose between"},{"line_number":123,"context_line":"failing server create requests when a project has instances in a down cell or"},{"line_number":124,"context_line":"allowing server create requests to potentionally exceed quota limits."},{"line_number":125,"context_line":""},{"line_number":126,"context_line":"Data model impact"},{"line_number":127,"context_line":"-----------------"},{"line_number":128,"context_line":""}],"source_content_type":"text/x-rst","patch_set":4,"id":"3f79a3b5_59154e94","line":125,"updated":"2018-11-20 22:08:19.000000000","message":"Another alternative we\u0027ve discussed is to use aggregates to surround each entire nova deployment, which doesn\u0027t require any new placement work (although it might require a trivial aggregate\u003d parameter to usages), but does imply some work either by nova or the operator to keep the aggregate updated.","commit_id":"0d52ed297c1e73b32c590d41e260da5f8e1dbf37"},{"author":{"_account_id":4690,"name":"melanie witt","display_name":"melwitt","email":"melwittt@gmail.com","username":"melwitt"},"change_message_id":"1b14af3e902e63299b349e6301a90194d2af7ce0","unresolved":false,"context_lines":[{"line_number":122,"context_line":"implement the policy-driven behavior where an operator has to choose between"},{"line_number":123,"context_line":"failing server create requests when a project has 
instances in a down cell or"},{"line_number":124,"context_line":"allowing server create requests to potentionally exceed quota limits."},{"line_number":125,"context_line":""},{"line_number":126,"context_line":"Data model impact"},{"line_number":127,"context_line":"-----------------"},{"line_number":128,"context_line":""}],"source_content_type":"text/x-rst","patch_set":4,"id":"3f79a3b5_baa79ea4","line":125,"in_reply_to":"3f79a3b5_59154e94","updated":"2018-11-21 20:56:36.000000000","message":"Thanks for reminding me about that. I shall add it to this section.","commit_id":"0d52ed297c1e73b32c590d41e260da5f8e1dbf37"},{"author":{"_account_id":4393,"name":"Dan Smith","email":"dms@danplanet.com","username":"danms"},"change_message_id":"b81e2dd6ac9880b831cb09b5ee47ae461c8eb96a","unresolved":false,"context_lines":[{"line_number":127,"context_line":"-----------------"},{"line_number":128,"context_line":""},{"line_number":129,"context_line":"A nova_api database schema change will be required for adding the ``user_id``"},{"line_number":130,"context_line":"column of type String to the ``nova_api.instance_mappings`` table."},{"line_number":131,"context_line":""},{"line_number":132,"context_line":"REST API impact"},{"line_number":133,"context_line":"---------------"}],"source_content_type":"text/x-rst","patch_set":4,"id":"3f79a3b5_59a0ee8e","line":130,"updated":"2018-11-20 22:08:19.000000000","message":"There will be a data migration to patch this up, which is worth mentioning, IMHO.\n\nPresumably it should be easy to batch this, looking for mappings with user_id\u003dNone and only processing those from the appropriate cell to do the update.","commit_id":"0d52ed297c1e73b32c590d41e260da5f8e1dbf37"},{"author":{"_account_id":4690,"name":"melanie 
witt","display_name":"melwitt","email":"melwittt@gmail.com","username":"melwitt"},"change_message_id":"1b14af3e902e63299b349e6301a90194d2af7ce0","unresolved":false,"context_lines":[{"line_number":127,"context_line":"-----------------"},{"line_number":128,"context_line":""},{"line_number":129,"context_line":"A nova_api database schema change will be required for adding the ``user_id``"},{"line_number":130,"context_line":"column of type String to the ``nova_api.instance_mappings`` table."},{"line_number":131,"context_line":""},{"line_number":132,"context_line":"REST API impact"},{"line_number":133,"context_line":"---------------"}],"source_content_type":"text/x-rst","patch_set":4,"id":"3f79a3b5_9ac2e234","line":130,"in_reply_to":"3f79a3b5_59a0ee8e","updated":"2018-11-21 20:56:36.000000000","message":"Good point, I failed to describe this in my haste. Will add.","commit_id":"0d52ed297c1e73b32c590d41e260da5f8e1dbf37"},{"author":{"_account_id":4393,"name":"Dan Smith","email":"dms@danplanet.com","username":"danms"},"change_message_id":"b81e2dd6ac9880b831cb09b5ee47ae461c8eb96a","unresolved":false,"context_lines":[{"line_number":158,"context_line":"Other deployer impact"},{"line_number":159,"context_line":"---------------------"},{"line_number":160,"context_line":""},{"line_number":161,"context_line":"None."},{"line_number":162,"context_line":""},{"line_number":163,"context_line":"Developer impact"},{"line_number":164,"context_line":"----------------"}],"source_content_type":"text/x-rst","patch_set":4,"id":"3f79a3b5_19d91626","line":161,"updated":"2018-11-20 22:08:19.000000000","message":"Deployers will have to migrate the data before this can be enabled, right? 
If we just do what you have in this spec, then as soon as they roll new code quotas will be all fubar for instances unless they have the workaround enabled, correct?","commit_id":"0d52ed297c1e73b32c590d41e260da5f8e1dbf37"},{"author":{"_account_id":4393,"name":"Dan Smith","email":"dms@danplanet.com","username":"danms"},"change_message_id":"b81e2dd6ac9880b831cb09b5ee47ae461c8eb96a","unresolved":false,"context_lines":[{"line_number":168,"context_line":"Upgrade impact"},{"line_number":169,"context_line":"--------------"},{"line_number":170,"context_line":""},{"line_number":171,"context_line":"None."},{"line_number":172,"context_line":""},{"line_number":173,"context_line":"Implementation"},{"line_number":174,"context_line":"\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d"}],"source_content_type":"text/x-rst","patch_set":4,"id":"3f79a3b5_79c92a50","line":171,"updated":"2018-11-20 22:08:19.000000000","message":"Er, maybe the above belongs here.","commit_id":"0d52ed297c1e73b32c590d41e260da5f8e1dbf37"},{"author":{"_account_id":4690,"name":"melanie witt","display_name":"melwitt","email":"melwittt@gmail.com","username":"melwitt"},"change_message_id":"1b14af3e902e63299b349e6301a90194d2af7ce0","unresolved":false,"context_lines":[{"line_number":168,"context_line":"Upgrade impact"},{"line_number":169,"context_line":"--------------"},{"line_number":170,"context_line":""},{"line_number":171,"context_line":"None."},{"line_number":172,"context_line":""},{"line_number":173,"context_line":"Implementation"},{"line_number":174,"context_line":"\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d"}],"source_content_type":"text/x-rst","patch_set":4,"id":"3f79a3b5_fad676f8","line":171,"in_reply_to":"3f79a3b5_79c92a50","updated":"2018-11-21 20:56:36.000000000","message":"Yes, my oversight here, thanks.","commit_id":"0d52ed297c1e73b32c590d41e260da5f8e1dbf37"},{"author":{"_account_id":4690,"name":"melanie 
witt","display_name":"melwitt","email":"melwittt@gmail.com","username":"melwitt"},"change_message_id":"17c8998c8c85821073e380d36da53ff8f6268578","unresolved":false,"context_lines":[{"line_number":185,"context_line":"Work Items"},{"line_number":186,"context_line":"----------"},{"line_number":187,"context_line":""},{"line_number":188,"context_line":"* Add a new column ``user_id`` to the ``nova_api.instance_mappings`` table."},{"line_number":189,"context_line":"* Add a config option ``[workarounds]disable_quota_usage_from_placement`` that"},{"line_number":190,"context_line":"  defaults to False."},{"line_number":191,"context_line":"* Add a new method to count instances with a count of"}],"source_content_type":"text/x-rst","patch_set":4,"id":"3f79a3b5_8989a22a","line":188,"updated":"2018-11-15 09:54:15.000000000","message":"Note: we will also need to use this to change from counting server group members by user_id in cell databases to counting server group members in the API database.","commit_id":"0d52ed297c1e73b32c590d41e260da5f8e1dbf37"},{"author":{"_account_id":4393,"name":"Dan Smith","email":"dms@danplanet.com","username":"danms"},"change_message_id":"b81e2dd6ac9880b831cb09b5ee47ae461c8eb96a","unresolved":false,"context_lines":[{"line_number":200,"context_line":"\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d"},{"line_number":201,"context_line":""},{"line_number":202,"context_line":"This work depends on the new column ``queued_for_delete`` from the `spec"},{"line_number":203,"context_line":"proposal for handling a \"down\" cell. 
\u003cReferences_\u003e`_"},{"line_number":204,"context_line":""},{"line_number":205,"context_line":"Testing"},{"line_number":206,"context_line":"\u003d\u003d\u003d\u003d\u003d\u003d\u003d"}],"source_content_type":"text/x-rst","patch_set":4,"id":"3f79a3b5_b9f8227e","line":203,"updated":"2018-11-20 22:08:19.000000000","message":"Isn\u0027t this merged now?","commit_id":"0d52ed297c1e73b32c590d41e260da5f8e1dbf37"},{"author":{"_account_id":4690,"name":"melanie witt","display_name":"melwitt","email":"melwittt@gmail.com","username":"melwitt"},"change_message_id":"1b14af3e902e63299b349e6301a90194d2af7ce0","unresolved":false,"context_lines":[{"line_number":200,"context_line":"\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d"},{"line_number":201,"context_line":""},{"line_number":202,"context_line":"This work depends on the new column ``queued_for_delete`` from the `spec"},{"line_number":203,"context_line":"proposal for handling a \"down\" cell. \u003cReferences_\u003e`_"},{"line_number":204,"context_line":""},{"line_number":205,"context_line":"Testing"},{"line_number":206,"context_line":"\u003d\u003d\u003d\u003d\u003d\u003d\u003d"}],"source_content_type":"text/x-rst","patch_set":4,"id":"3f79a3b5_7aca6647","line":203,"in_reply_to":"3f79a3b5_b9f8227e","updated":"2018-11-21 20:56:36.000000000","message":"It is. 
I suppose I need not mention it anymore then.","commit_id":"0d52ed297c1e73b32c590d41e260da5f8e1dbf37"},{"author":{"_account_id":4393,"name":"Dan Smith","email":"dms@danplanet.com","username":"danms"},"change_message_id":"b81e2dd6ac9880b831cb09b5ee47ae461c8eb96a","unresolved":false,"context_lines":[{"line_number":226,"context_line":""},{"line_number":227,"context_line":"This depends upon the spec proposal for handling a \"down\" cell, namely, the"},{"line_number":228,"context_line":"addition of the ``queued_for_delete`` column to the"},{"line_number":229,"context_line":"``nova_api.instance_mappings`` table schema."},{"line_number":230,"context_line":""},{"line_number":231,"context_line":"* https://blueprints.launchpad.net/nova/+spec/handling-down-cell"},{"line_number":232,"context_line":""}],"source_content_type":"text/x-rst","patch_set":4,"id":"3f79a3b5_d9fb9e77","line":229,"updated":"2018-11-20 22:08:19.000000000","message":"Here too.","commit_id":"0d52ed297c1e73b32c590d41e260da5f8e1dbf37"},{"author":{"_account_id":4393,"name":"Dan Smith","email":"dms@danplanet.com","username":"danms"},"change_message_id":"3c5fa8221d41e8bfbb7cdbcbdb13195f200c6f05","unresolved":false,"context_lines":[{"line_number":165,"context_line":"Other deployer impact"},{"line_number":166,"context_line":"---------------------"},{"line_number":167,"context_line":""},{"line_number":168,"context_line":"None."},{"line_number":169,"context_line":""},{"line_number":170,"context_line":"Developer impact"},{"line_number":171,"context_line":"----------------"}],"source_content_type":"text/x-rst","patch_set":5,"id":"3f79a3b5_af1acd7c","line":168,"updated":"2018-11-26 15:09:06.000000000","message":"Can you address my comment in this section from the previous version? 
Making operators have the workaround enabled before/until they run the data migration seems pretty terrible to me.","commit_id":"c433c7dd37de3fc31c169af3873a46fc8a818de8"},{"author":{"_account_id":4690,"name":"melanie witt","display_name":"melwitt","email":"melwittt@gmail.com","username":"melwitt"},"change_message_id":"4a122bb8f9185630ffa9b312612260138f6c148a","unresolved":false,"context_lines":[{"line_number":165,"context_line":"Other deployer impact"},{"line_number":166,"context_line":"---------------------"},{"line_number":167,"context_line":""},{"line_number":168,"context_line":"None."},{"line_number":169,"context_line":""},{"line_number":170,"context_line":"Developer impact"},{"line_number":171,"context_line":"----------------"}],"source_content_type":"text/x-rst","patch_set":5,"id":"3f79a3b5_0699b87e","line":168,"in_reply_to":"3f79a3b5_af1acd7c","updated":"2018-12-05 00:25:57.000000000","message":"Sorry, I didn\u0027t realize I didn\u0027t understand your point here until this second comment.\n\nTo gracefully handle a rolling upgrade, if an instance mapping doesn\u0027t have user_id populated, then the code would need to fall back on counting instances the old way via cell databases for that mapping. We\u0027d need a way to detect whether the migration has been run yet -- maybe an all or nothing check, if any instance mapping has a NULL user_id (if COUNT where user_id\u003dNone \u003e 0), then fall back on the legacy count method.\n\nI\u0027ll add this to the upgrade impact section.","commit_id":"c433c7dd37de3fc31c169af3873a46fc8a818de8"},{"author":{"_account_id":15888,"name":"Zhenyu Zheng","email":"zheng.zhenyu@outlook.com","username":"Kevin_Zheng"},"change_message_id":"354526995be25f5f5cd998a8d16a64c208f0d824","unresolved":false,"context_lines":[{"line_number":48,"context_line":"We could take a different approach by querying the placement API and the API"},{"line_number":49,"context_line":"database to get resource usage counts. 
Since placement is managing resource"},{"line_number":50,"context_line":"allocations, it has the information we need to count resource usage for CPU and"},{"line_number":51,"context_line":"RAM quotas. By querying placement and the API database, we can avoid reading"},{"line_number":52,"context_line":"separate cell databases for resource usage."},{"line_number":53,"context_line":""},{"line_number":54,"context_line":"Note: because there is not yet an ability to partition allocations (or perhaps,"}],"source_content_type":"text/x-rst","patch_set":6,"id":"3f79a3b5_8787adb4","line":51,"updated":"2018-12-21 14:42:37.000000000","message":"So, how do we know the consumer is a nova instance? I remember you guys have talked about adding a field in placement to identify the consumer type?","commit_id":"517ebefedad5a0c6dacb140931035828ad9b625b"},{"author":{"_account_id":6873,"name":"Matt Riedemann","email":"mriedem.os@gmail.com","username":"mriedem"},"change_message_id":"fe37e465cc0c877f2d430004716605dc21e0175e","unresolved":false,"context_lines":[{"line_number":48,"context_line":"We could take a different approach by querying the placement API and the API"},{"line_number":49,"context_line":"database to get resource usage counts. Since placement is managing resource"},{"line_number":50,"context_line":"allocations, it has the information we need to count resource usage for CPU and"},{"line_number":51,"context_line":"RAM quotas. 
By querying placement and the API database, we can avoid reading"},{"line_number":52,"context_line":"separate cell databases for resource usage."},{"line_number":53,"context_line":""},{"line_number":54,"context_line":"Note: because there is not yet an ability to partition allocations (or perhaps,"}],"source_content_type":"text/x-rst","patch_set":6,"id":"1f769fc5_c8733b4d","line":51,"in_reply_to":"1f769fc5_5627d2b7","updated":"2018-12-23 21:18:29.000000000","message":"\u003e Yeah, that is what I\u0027m trying to say, if we have type we can save\n \u003e one DB call. But yeah, if we only count VCPU and MEMORY_MB, then I\n \u003e guess it is not important.\n\nWe count VCPU, MEMORY_MB and number of instances during server create and resize. Even if we count the number of instances from placement using consumer types, we\u0027re not saving a DB call (and it\u0027s another REST API call to placement). Anyway, we don\u0027t need it right now so it\u0027s better to not add a new dependency.","commit_id":"517ebefedad5a0c6dacb140931035828ad9b625b"},{"author":{"_account_id":4690,"name":"melanie witt","display_name":"melwitt","email":"melwittt@gmail.com","username":"melwitt"},"change_message_id":"0654d78562441d309953136ef643f5eddcd1c0c2","unresolved":false,"context_lines":[{"line_number":48,"context_line":"We could take a different approach by querying the placement API and the API"},{"line_number":49,"context_line":"database to get resource usage counts. Since placement is managing resource"},{"line_number":50,"context_line":"allocations, it has the information we need to count resource usage for CPU and"},{"line_number":51,"context_line":"RAM quotas. 
By querying placement and the API database, we can avoid reading"},{"line_number":52,"context_line":"separate cell databases for resource usage."},{"line_number":53,"context_line":""},{"line_number":54,"context_line":"Note: because there is not yet an ability to partition allocations (or perhaps,"}],"source_content_type":"text/x-rst","patch_set":6,"id":"ffd0ebdf_1a240fc6","line":51,"in_reply_to":"3f79a3b5_8787adb4","updated":"2019-01-04 00:26:47.000000000","message":"@Kevin, it\u0027s true in the past we had some discussions about consumer type. Those discussions have since evolved to realize that the potential problem is not so much about consumer types but is more about multiple Nova deployments sharing one placement service. In such an environment, if \"Nova A\" asks for VCPU usage, it would get a rollup of \"Nova A\" + \"Nova B\" + \"Nova C\" VCPU usage instead of only its own.\n\nSo, that led to the \"placement partitioning\" discussion (see earlier discussion in this review comments). And until we have partitioning in placement, we will use a [workarounds] option (see the mention of the option later in this paragraph) to address this situation by allowing deployers to control the behavior.","commit_id":"517ebefedad5a0c6dacb140931035828ad9b625b"},{"author":{"_account_id":6873,"name":"Matt Riedemann","email":"mriedem.os@gmail.com","username":"mriedem"},"change_message_id":"25faa9a05b25da09ba81b0b4f2910361595f6b2d","unresolved":false,"context_lines":[{"line_number":48,"context_line":"We could take a different approach by querying the placement API and the API"},{"line_number":49,"context_line":"database to get resource usage counts. Since placement is managing resource"},{"line_number":50,"context_line":"allocations, it has the information we need to count resource usage for CPU and"},{"line_number":51,"context_line":"RAM quotas. 
By querying placement and the API database, we can avoid reading"},{"line_number":52,"context_line":"separate cell databases for resource usage."},{"line_number":53,"context_line":""},{"line_number":54,"context_line":"Note: because there is not yet an ability to partition allocations (or perhaps,"}],"source_content_type":"text/x-rst","patch_set":6,"id":"3f79a3b5_a25df7e5","line":51,"in_reply_to":"3f79a3b5_8787adb4","updated":"2018-12-21 15:54:25.000000000","message":"I\u0027m not entirely sure it matters. The proposed change is to call the placement API to get usages for the given project and user, and then count based on the VCPU/MEMORY_MB usage, which should really only be a nova instance - what other resources in openstack are consuming those resource classes? If something were to later and we needed to deal with it, then I think that\u0027s when we\u0027d likely have to take into account consumer types.\n\n--\n\nAs an aside, if placement did have consumer type, and there was a /consumers endpoint (we have consumers table data in placement), then we could count consumers from placement as well by filtering on type, but we don\u0027t need that right now b/c we can do the same with the instance_mappings table in the nova_api DB (as described below).","commit_id":"517ebefedad5a0c6dacb140931035828ad9b625b"},{"author":{"_account_id":15888,"name":"Zhenyu Zheng","email":"zheng.zhenyu@outlook.com","username":"Kevin_Zheng"},"change_message_id":"ba0409be2824dd325ebdd0bb72886b4711c71326","unresolved":false,"context_lines":[{"line_number":48,"context_line":"We could take a different approach by querying the placement API and the API"},{"line_number":49,"context_line":"database to get resource usage counts. Since placement is managing resource"},{"line_number":50,"context_line":"allocations, it has the information we need to count resource usage for CPU and"},{"line_number":51,"context_line":"RAM quotas. 
By querying placement and the API database, we can avoid reading"},{"line_number":52,"context_line":"separate cell databases for resource usage."},{"line_number":53,"context_line":""},{"line_number":54,"context_line":"Note: because there is not yet an ability to partition allocations (or perhaps,"}],"source_content_type":"text/x-rst","patch_set":6,"id":"1f769fc5_5627d2b7","line":51,"in_reply_to":"3f79a3b5_a25df7e5","updated":"2018-12-22 01:54:06.000000000","message":"Yeah, that is what I\u0027m trying to say, if we have type we can save one DB call. But yeah, if we only count VCPU and MEMORY_MB, then I guess it is not important.","commit_id":"517ebefedad5a0c6dacb140931035828ad9b625b"},{"author":{"_account_id":6873,"name":"Matt Riedemann","email":"mriedem.os@gmail.com","username":"mriedem"},"change_message_id":"25faa9a05b25da09ba81b0b4f2910361595f6b2d","unresolved":false,"context_lines":[{"line_number":51,"context_line":"RAM quotas. By querying placement and the API database, we can avoid reading"},{"line_number":52,"context_line":"separate cell databases for resource usage."},{"line_number":53,"context_line":""},{"line_number":54,"context_line":"Note: because there is not yet an ability to partition allocations (or perhaps,"},{"line_number":55,"context_line":"resource providers from which allocations could derive a partition) in"},{"line_number":56,"context_line":"placement, in order to support deployments where multiple Nova instances share"},{"line_number":57,"context_line":"the same placement service, we can add a"},{"line_number":58,"context_line":"``[workarounds]disable_quota_usage_from_placement`` which defaults to False."},{"line_number":59,"context_line":"If True, we use the legacy quota counting method for instances, cores, and"},{"line_number":60,"context_line":"ram. If False, we use a quota counting method that calls placement. 
This is a"}],"source_content_type":"text/x-rst","patch_set":6,"id":"3f79a3b5_22a327b1","line":57,"range":{"start_line":54,"start_character":0,"end_line":57,"end_character":26},"updated":"2018-12-21 15:54:25.000000000","message":"nit: this is by no means a traditional way people are deploying openstack, correct? As far as I\u0027ve been following, this is about reference architectures for edge deployments, which is still pretty new. As such, it might be good to just mention something along those lines for context, even if it\u0027s just something like \"where multiple Nova deployments share the same placement service, like in an Edge scenario\". I would use the word \"deployments\" rather than \"instances\" to avoid confusion with servers (VMs) as \"instances\".\n\nYou could even link to https://wiki.openstack.org/wiki/Edge_Computing_Group/Edge_Reference_Architectures#Deployment_Scenarios if you wanted, though it might not be very helpful since looking at that wiki, placement is 1:1 with each nova deployment...","commit_id":"517ebefedad5a0c6dacb140931035828ad9b625b"},{"author":{"_account_id":4690,"name":"melanie witt","display_name":"melwitt","email":"melwittt@gmail.com","username":"melwitt"},"change_message_id":"0654d78562441d309953136ef643f5eddcd1c0c2","unresolved":false,"context_lines":[{"line_number":51,"context_line":"RAM quotas. 
By querying placement and the API database, we can avoid reading"},{"line_number":52,"context_line":"separate cell databases for resource usage."},{"line_number":53,"context_line":""},{"line_number":54,"context_line":"Note: because there is not yet an ability to partition allocations (or perhaps,"},{"line_number":55,"context_line":"resource providers from which allocations could derive a partition) in"},{"line_number":56,"context_line":"placement, in order to support deployments where multiple Nova instances share"},{"line_number":57,"context_line":"the same placement service, we can add a"},{"line_number":58,"context_line":"``[workarounds]disable_quota_usage_from_placement`` which defaults to False."},{"line_number":59,"context_line":"If True, we use the legacy quota counting method for instances, cores, and"},{"line_number":60,"context_line":"ram. If False, we use a quota counting method that calls placement. This is a"}],"source_content_type":"text/x-rst","patch_set":6,"id":"ffd0ebdf_fa034b0e","line":57,"range":{"start_line":54,"start_character":0,"end_line":57,"end_character":26},"in_reply_to":"3f79a3b5_22a327b1","updated":"2019-01-04 00:26:47.000000000","message":"OK, I think that\u0027s a good point. 
Will update this.","commit_id":"517ebefedad5a0c6dacb140931035828ad9b625b"},{"author":{"_account_id":6873,"name":"Matt Riedemann","email":"mriedem.os@gmail.com","username":"mriedem"},"change_message_id":"25faa9a05b25da09ba81b0b4f2910361595f6b2d","unresolved":false,"context_lines":[{"line_number":55,"context_line":"resource providers from which allocations could derive a partition) in"},{"line_number":56,"context_line":"placement, in order to support deployments where multiple Nova instances share"},{"line_number":57,"context_line":"the same placement service, we can add a"},{"line_number":58,"context_line":"``[workarounds]disable_quota_usage_from_placement`` which defaults to False."},{"line_number":59,"context_line":"If True, we use the legacy quota counting method for instances, cores, and"},{"line_number":60,"context_line":"ram. If False, we use a quota counting method that calls placement. This is a"},{"line_number":61,"context_line":"minimal way to keep \"legacy\" quota counting available for the scenario of"}],"source_content_type":"text/x-rst","patch_set":6,"id":"3f79a3b5_bd90b898","line":58,"updated":"2018-12-21 15:54:25.000000000","message":"nit: This isn\u0027t really problem description content, this is really proposed change content for a workaround to a problem (which is the multiple nova deployments using the same placement).","commit_id":"517ebefedad5a0c6dacb140931035828ad9b625b"},{"author":{"_account_id":4690,"name":"melanie witt","display_name":"melwitt","email":"melwittt@gmail.com","username":"melwitt"},"change_message_id":"0654d78562441d309953136ef643f5eddcd1c0c2","unresolved":false,"context_lines":[{"line_number":55,"context_line":"resource providers from which allocations could derive a partition) in"},{"line_number":56,"context_line":"placement, in order to support deployments where multiple Nova instances share"},{"line_number":57,"context_line":"the same placement service, we can add 
a"},{"line_number":58,"context_line":"``[workarounds]disable_quota_usage_from_placement`` which defaults to False."},{"line_number":59,"context_line":"If True, we use the legacy quota counting method for instances, cores, and"},{"line_number":60,"context_line":"ram. If False, we use a quota counting method that calls placement. This is a"},{"line_number":61,"context_line":"minimal way to keep \"legacy\" quota counting available for the scenario of"}],"source_content_type":"text/x-rst","patch_set":6,"id":"ffd0ebdf_353804d2","line":58,"in_reply_to":"3f79a3b5_bd90b898","updated":"2019-01-04 00:26:47.000000000","message":"OK, will remove this part and/or move it to the proposed change section.","commit_id":"517ebefedad5a0c6dacb140931035828ad9b625b"},{"author":{"_account_id":6873,"name":"Matt Riedemann","email":"mriedem.os@gmail.com","username":"mriedem"},"change_message_id":"25faa9a05b25da09ba81b0b4f2910361595f6b2d","unresolved":false,"context_lines":[{"line_number":56,"context_line":"placement, in order to support deployments where multiple Nova instances share"},{"line_number":57,"context_line":"the same placement service, we can add a"},{"line_number":58,"context_line":"``[workarounds]disable_quota_usage_from_placement`` which defaults to False."},{"line_number":59,"context_line":"If True, we use the legacy quota counting method for instances, cores, and"},{"line_number":60,"context_line":"ram. If False, we use a quota counting method that calls placement. This is a"},{"line_number":61,"context_line":"minimal way to keep \"legacy\" quota counting available for the scenario of"},{"line_number":62,"context_line":"multiple Nova instances sharing one placement service. 
The config option will"},{"line_number":63,"context_line":"simply control which counting method will be called by the pluggable quota"}],"source_content_type":"text/x-rst","patch_set":6,"id":"3f79a3b5_227c871f","line":60,"range":{"start_line":59,"start_character":0,"end_line":60,"end_character":4},"updated":"2018-12-21 15:54:25.000000000","message":"And in this case, if a cell is down, you still have the possible over-commit issue? Or would we mix in the policy check for this to see what should happen (fail server create/resize if checking quota and the project has any resources in a down cell)?","commit_id":"517ebefedad5a0c6dacb140931035828ad9b625b"},{"author":{"_account_id":4690,"name":"melanie witt","display_name":"melwitt","email":"melwittt@gmail.com","username":"melwitt"},"change_message_id":"0654d78562441d309953136ef643f5eddcd1c0c2","unresolved":false,"context_lines":[{"line_number":56,"context_line":"placement, in order to support deployments where multiple Nova instances share"},{"line_number":57,"context_line":"the same placement service, we can add a"},{"line_number":58,"context_line":"``[workarounds]disable_quota_usage_from_placement`` which defaults to False."},{"line_number":59,"context_line":"If True, we use the legacy quota counting method for instances, cores, and"},{"line_number":60,"context_line":"ram. If False, we use a quota counting method that calls placement. This is a"},{"line_number":61,"context_line":"minimal way to keep \"legacy\" quota counting available for the scenario of"},{"line_number":62,"context_line":"multiple Nova instances sharing one placement service. 
The config option will"},{"line_number":63,"context_line":"simply control which counting method will be called by the pluggable quota"}],"source_content_type":"text/x-rst","patch_set":6,"id":"ffd0ebdf_9583f056","line":60,"range":{"start_line":59,"start_character":0,"end_line":60,"end_character":4},"in_reply_to":"3f79a3b5_227c871f","updated":"2019-01-04 00:26:47.000000000","message":"Yes, in case of legacy behavior you\u0027d have the over-commit issue. I was not considering mixing in the policy check for the behavior (fail create/resize) in the face of down cells. I was thinking that today, we have the over-commit issue and being in \"legacy\" mode would be as-is with what we have today. I\u0027m open to the idea of mixing in the policy check for \"legacy\" mode but wonder whether it\u0027s worth the complexity.","commit_id":"517ebefedad5a0c6dacb140931035828ad9b625b"},{"author":{"_account_id":6873,"name":"Matt Riedemann","email":"mriedem.os@gmail.com","username":"mriedem"},"change_message_id":"ce9e7a4cc58f7eec546d0dea8755c72d28fcfb3f","unresolved":false,"context_lines":[{"line_number":56,"context_line":"placement, in order to support deployments where multiple Nova instances share"},{"line_number":57,"context_line":"the same placement service, we can add a"},{"line_number":58,"context_line":"``[workarounds]disable_quota_usage_from_placement`` which defaults to False."},{"line_number":59,"context_line":"If True, we use the legacy quota counting method for instances, cores, and"},{"line_number":60,"context_line":"ram. If False, we use a quota counting method that calls placement. This is a"},{"line_number":61,"context_line":"minimal way to keep \"legacy\" quota counting available for the scenario of"},{"line_number":62,"context_line":"multiple Nova instances sharing one placement service. 
The config option will"},{"line_number":63,"context_line":"simply control which counting method will be called by the pluggable quota"}],"source_content_type":"text/x-rst","patch_set":6,"id":"ffd0ebdf_c8454f72","line":60,"range":{"start_line":59,"start_character":0,"end_line":60,"end_character":4},"in_reply_to":"ffd0ebdf_9583f056","updated":"2019-01-04 17:52:22.000000000","message":"I\u0027m cool with not trying to mix in the policy check and leaving things work as-is unless someone comes along and says they need the workaround but also the policy check to prevent the over-commit (which is really more like a bug fix IMO).","commit_id":"517ebefedad5a0c6dacb140931035828ad9b625b"},{"author":{"_account_id":6873,"name":"Matt Riedemann","email":"mriedem.os@gmail.com","username":"mriedem"},"change_message_id":"25faa9a05b25da09ba81b0b4f2910361595f6b2d","unresolved":false,"context_lines":[{"line_number":95,"context_line":"  the number of instances for a project and user if we add a new column"},{"line_number":96,"context_line":"  ``user_id`` to the ``nova_api.instance_mappings`` table. We already have a"},{"line_number":97,"context_line":"  ``project_id`` column on the table. This will allow us to count instance"},{"line_number":98,"context_line":"  mappings for a project and a user to represent the instance count. We will"},{"line_number":99,"context_line":"  also rely on the new column ``queued_for_delete`` from the `spec proposal for"},{"line_number":100,"context_line":"  handling a \"down\" cell. 
\u003cReferences_\u003e`_"},{"line_number":101,"context_line":""}],"source_content_type":"text/x-rst","patch_set":6,"id":"3f79a3b5_82ba3bb8","line":98,"updated":"2018-12-21 15:54:25.000000000","message":"This reminds me, this may also inadvertently fix bug 1716706.","commit_id":"517ebefedad5a0c6dacb140931035828ad9b625b"},{"author":{"_account_id":4690,"name":"melanie witt","display_name":"melwitt","email":"melwittt@gmail.com","username":"melwitt"},"change_message_id":"0654d78562441d309953136ef643f5eddcd1c0c2","unresolved":false,"context_lines":[{"line_number":95,"context_line":"  the number of instances for a project and user if we add a new column"},{"line_number":96,"context_line":"  ``user_id`` to the ``nova_api.instance_mappings`` table. We already have a"},{"line_number":97,"context_line":"  ``project_id`` column on the table. This will allow us to count instance"},{"line_number":98,"context_line":"  mappings for a project and a user to represent the instance count. We will"},{"line_number":99,"context_line":"  also rely on the new column ``queued_for_delete`` from the `spec proposal for"},{"line_number":100,"context_line":"  handling a \"down\" cell. \u003cReferences_\u003e`_"},{"line_number":101,"context_line":""}],"source_content_type":"text/x-rst","patch_set":6,"id":"ffd0ebdf_15970093","line":98,"in_reply_to":"3f79a3b5_82ba3bb8","updated":"2019-01-04 00:26:47.000000000","message":"I don\u0027t think it will because build requests exist before instance mappings do, so they still represent different counts.\n\n(later)\n\nOK, so refreshing myself on the code in nova/compute/api.py, I see that instance mapping is created right after build request. So, they _mostly_ exist at the same time (there\u0027s a tiny window during which you could have a build request and not yet have an instance mapping. 
Which reminds me, we really need to pick mnaser\u0027s single transaction for BR/IM/RS patch up again).\n\nSo yeah, practically speaking, this would eliminate the need to count build requests.","commit_id":"517ebefedad5a0c6dacb140931035828ad9b625b"},{"author":{"_account_id":6873,"name":"Matt Riedemann","email":"mriedem.os@gmail.com","username":"mriedem"},"change_message_id":"ce9e7a4cc58f7eec546d0dea8755c72d28fcfb3f","unresolved":false,"context_lines":[{"line_number":95,"context_line":"  the number of instances for a project and user if we add a new column"},{"line_number":96,"context_line":"  ``user_id`` to the ``nova_api.instance_mappings`` table. We already have a"},{"line_number":97,"context_line":"  ``project_id`` column on the table. This will allow us to count instance"},{"line_number":98,"context_line":"  mappings for a project and a user to represent the instance count. We will"},{"line_number":99,"context_line":"  also rely on the new column ``queued_for_delete`` from the `spec proposal for"},{"line_number":100,"context_line":"  handling a \"down\" cell. \u003cReferences_\u003e`_"},{"line_number":101,"context_line":""}],"source_content_type":"text/x-rst","patch_set":6,"id":"ffd0ebdf_c83a2fe6","line":98,"in_reply_to":"ffd0ebdf_15970093","updated":"2019-01-04 17:52:22.000000000","message":"For all intents and purposes I think we can just assume, for the sake of the quota check, that instance mappings and build requests are 1:1. If it really mattered, we could create the instance mapping before the build request. I think we just focus on build request because that\u0027s what\u0027s returned from the API code before the instance is created in a cell but that doesn\u0027t matter for counting, i.e. 
the instance mapping will suffice.","commit_id":"517ebefedad5a0c6dacb140931035828ad9b625b"},{"author":{"_account_id":6873,"name":"Matt Riedemann","email":"mriedem.os@gmail.com","username":"mriedem"},"change_message_id":"25faa9a05b25da09ba81b0b4f2910361595f6b2d","unresolved":false,"context_lines":[{"line_number":120,"context_line":"One alternative is to hold off on counting any quota usage from placement"},{"line_number":121,"context_line":"until placement has allocation partitioning support. The problem with that is"},{"line_number":122,"context_line":"in the meantime, the only solution we have for handling of down cells is to"},{"line_number":123,"context_line":"implement the policy-driven behavior where an operator has to choose between"},{"line_number":124,"context_line":"failing server create requests when a project has instances in a down cell or"},{"line_number":125,"context_line":"allowing server create requests to potentionally exceed quota limits."},{"line_number":126,"context_line":""}],"source_content_type":"text/x-rst","patch_set":6,"id":"3f79a3b5_e21befc8","line":123,"range":{"start_line":123,"start_character":0,"end_line":123,"end_character":36},"updated":"2018-12-21 15:54:25.000000000","message":"Should link to the patch for this:\n\nhttps://review.openstack.org/#/c/614783/","commit_id":"517ebefedad5a0c6dacb140931035828ad9b625b"},{"author":{"_account_id":4690,"name":"melanie witt","display_name":"melwitt","email":"melwittt@gmail.com","username":"melwitt"},"change_message_id":"0654d78562441d309953136ef643f5eddcd1c0c2","unresolved":false,"context_lines":[{"line_number":120,"context_line":"One alternative is to hold off on counting any quota usage from placement"},{"line_number":121,"context_line":"until placement has allocation partitioning support. 
The problem with that is"},{"line_number":122,"context_line":"in the meantime, the only solution we have for handling of down cells is to"},{"line_number":123,"context_line":"implement the policy-driven behavior where an operator has to choose between"},{"line_number":124,"context_line":"failing server create requests when a project has instances in a down cell or"},{"line_number":125,"context_line":"allowing server create requests to potentionally exceed quota limits."},{"line_number":126,"context_line":""}],"source_content_type":"text/x-rst","patch_set":6,"id":"ffd0ebdf_557d886c","line":123,"range":{"start_line":123,"start_character":0,"end_line":123,"end_character":36},"in_reply_to":"3f79a3b5_e21befc8","updated":"2019-01-04 00:26:47.000000000","message":"Will do","commit_id":"517ebefedad5a0c6dacb140931035828ad9b625b"},{"author":{"_account_id":6873,"name":"Matt Riedemann","email":"mriedem.os@gmail.com","username":"mriedem"},"change_message_id":"25faa9a05b25da09ba81b0b4f2910361595f6b2d","unresolved":false,"context_lines":[{"line_number":121,"context_line":"until placement has allocation partitioning support. 
The problem with that is"},{"line_number":122,"context_line":"in the meantime, the only solution we have for handling of down cells is to"},{"line_number":123,"context_line":"implement the policy-driven behavior where an operator has to choose between"},{"line_number":124,"context_line":"failing server create requests when a project has instances in a down cell or"},{"line_number":125,"context_line":"allowing server create requests to potentionally exceed quota limits."},{"line_number":126,"context_line":""},{"line_number":127,"context_line":"Another alternative which has been discussed is, to use placement aggregates to"}],"source_content_type":"text/x-rst","patch_set":6,"id":"3f79a3b5_62759ffe","line":124,"range":{"start_line":124,"start_character":15,"end_line":124,"end_character":21},"updated":"2018-12-21 15:54:25.000000000","message":"nit: it\u0027s also resize, but that\u0027s a very minor detail which you don\u0027t really need to mention in here since it might confuse things.","commit_id":"517ebefedad5a0c6dacb140931035828ad9b625b"},{"author":{"_account_id":11564,"name":"Chris Dent","email":"cdent@anticdent.org","username":"chdent"},"change_message_id":"a1217f994b3d53f8fc5d7249d835e9eb922dcece","unresolved":false,"context_lines":[{"line_number":128,"context_line":"surround each entire Nova deployment and use that as a means to partition"},{"line_number":129,"context_line":"placement usages. We would need to add a ``aggregate\u003d`` query parameter to the"},{"line_number":130,"context_line":"placement /usages API in this case. 
This approach would also require some work"},{"line_number":131,"context_line":"by either Nova or the operator to keep the placement aggregate updated."},{"line_number":132,"context_line":""},{"line_number":133,"context_line":"Data model impact"},{"line_number":134,"context_line":"-----------------"}],"source_content_type":"text/x-rst","patch_set":6,"id":"3f79a3b5_6f21f8fc","line":131,"updated":"2018-12-10 15:15:54.000000000","message":"Another potential downside of this option is that whenever you fetch the aggregates of a resource provider it will always have at least one aggregate that it is a member of, which is meh. Unless we wanted to add a slew of code to obscure the special aggregates. Which may also be meh.\n\nI was thinking of this because aggregates are the natural way of partitioning, and potentially, under the covers, might be a nice way to accomplish the desired functionality without yet-another-table. If we were to do that, I continue to think that a header rather than a query parameter that says \"I am in this partition\" is the way to go.\n\nBut I would hope we have a bit more discussion about what partitions are and are for before we get too far down the road of figuring out how to do them. 
It seems likely that however it is done is something that will need to be mirrored in a variety of services.\n\nAll of which is a very long way of saying ✔ to this paragraph.","commit_id":"517ebefedad5a0c6dacb140931035828ad9b625b"},{"author":{"_account_id":6873,"name":"Matt Riedemann","email":"mriedem.os@gmail.com","username":"mriedem"},"change_message_id":"25faa9a05b25da09ba81b0b4f2910361595f6b2d","unresolved":false,"context_lines":[{"line_number":160,"context_line":"Performance Impact"},{"line_number":161,"context_line":"------------------"},{"line_number":162,"context_line":""},{"line_number":163,"context_line":"None."},{"line_number":164,"context_line":""},{"line_number":165,"context_line":"Other deployer impact"},{"line_number":166,"context_line":"---------------------"}],"source_content_type":"text/x-rst","patch_set":6,"id":"3f79a3b5_625eff78","line":163,"updated":"2018-12-21 15:54:25.000000000","message":"Arguable since we\u0027ll be making external REST API calls, but honestly that might be faster than iterating cells to count quota especially if at least one of the cell DB is slow/far away and takes awhile to return so we can aggregate the results.","commit_id":"517ebefedad5a0c6dacb140931035828ad9b625b"},{"author":{"_account_id":4690,"name":"melanie witt","display_name":"melwitt","email":"melwittt@gmail.com","username":"melwitt"},"change_message_id":"0654d78562441d309953136ef643f5eddcd1c0c2","unresolved":false,"context_lines":[{"line_number":160,"context_line":"Performance Impact"},{"line_number":161,"context_line":"------------------"},{"line_number":162,"context_line":""},{"line_number":163,"context_line":"None."},{"line_number":164,"context_line":""},{"line_number":165,"context_line":"Other deployer impact"},{"line_number":166,"context_line":"---------------------"}],"source_content_type":"text/x-rst","patch_set":6,"id":"ffd0ebdf_d5909898","line":163,"in_reply_to":"3f79a3b5_625eff78","updated":"2019-01-04 00:26:47.000000000","message":"Yeah, true. 
I can mention the possibility of impact here.","commit_id":"517ebefedad5a0c6dacb140931035828ad9b625b"},{"author":{"_account_id":6873,"name":"Matt Riedemann","email":"mriedem.os@gmail.com","username":"mriedem"},"change_message_id":"ce9e7a4cc58f7eec546d0dea8755c72d28fcfb3f","unresolved":false,"context_lines":[{"line_number":160,"context_line":"Performance Impact"},{"line_number":161,"context_line":"------------------"},{"line_number":162,"context_line":""},{"line_number":163,"context_line":"None."},{"line_number":164,"context_line":""},{"line_number":165,"context_line":"Other deployer impact"},{"line_number":166,"context_line":"---------------------"}],"source_content_type":"text/x-rst","patch_set":6,"id":"ffd0ebdf_081a777c","line":163,"in_reply_to":"ffd0ebdf_d5909898","updated":"2019-01-04 17:52:22.000000000","message":"I guess a known performance impact would be checking if data needs to be online migrated at the time of the quota check.","commit_id":"517ebefedad5a0c6dacb140931035828ad9b625b"},{"author":{"_account_id":6873,"name":"Matt Riedemann","email":"mriedem.os@gmail.com","username":"mriedem"},"change_message_id":"25faa9a05b25da09ba81b0b4f2910361595f6b2d","unresolved":false,"context_lines":[{"line_number":179,"context_line":"table will require a data migration of all existing instance mappings to"},{"line_number":180,"context_line":"populate the ``user_id`` field. The migration routine would look for mappings"},{"line_number":181,"context_line":"where ``user_id`` is None and query cells by corresponding ``project_id`` in"},{"line_number":182,"context_line":"the mapping. 
The query could filter on instance UUIDS, finding the ``user_id``"},{"line_number":183,"context_line":"values to populate in the mappings."},{"line_number":184,"context_line":""},{"line_number":185,"context_line":"In order to handle a rolling, mixed N and N+1 version upgrade, we will need to"}],"source_content_type":"text/x-rst","patch_set":6,"id":"3f79a3b5_42c0a3f9","line":182,"range":{"start_line":182,"start_character":48,"end_line":182,"end_character":53},"updated":"2018-12-21 15:54:25.000000000","message":"UUIDs","commit_id":"517ebefedad5a0c6dacb140931035828ad9b625b"},{"author":{"_account_id":6873,"name":"Matt Riedemann","email":"mriedem.os@gmail.com","username":"mriedem"},"change_message_id":"25faa9a05b25da09ba81b0b4f2910361595f6b2d","unresolved":false,"context_lines":[{"line_number":180,"context_line":"populate the ``user_id`` field. The migration routine would look for mappings"},{"line_number":181,"context_line":"where ``user_id`` is None and query cells by corresponding ``project_id`` in"},{"line_number":182,"context_line":"the mapping. 
The query could filter on instance UUIDS, finding the ``user_id``"},{"line_number":183,"context_line":"values to populate in the mappings."},{"line_number":184,"context_line":""},{"line_number":185,"context_line":"In order to handle a rolling, mixed N and N+1 version upgrade, we will need to"},{"line_number":186,"context_line":"be able to fall back on the legacy counting method for instances, cores, and"}],"source_content_type":"text/x-rst","patch_set":6,"id":"3f79a3b5_62911ff6","line":183,"updated":"2018-12-21 15:54:25.000000000","message":"So if I\u0027m following, we\u0027d first query the API DB for instance_mappings where user_id is None, and then from that list, scatter/gather cells querying for instances by UUID, then map those instances records from the cells to the instance_mappings records keyed off UUID, and use the user_id from each instances record to update the related instance_mappings record?\n\nThat would be the \u0027nova-manage db online_data_migration\u0027 batched query way of doing the online data migration right?\n\nCould we also heal instance_mappings when getting an instance in the API here:\n\nhttps://github.com/openstack/nova/blob/905e25a63d3ba25cfbdf492891ac8864fed609ab/nova/compute/api.py#L2450\n\nAt that point we have the instance mapping and the instance and if the mapping doesn\u0027t have user_id set, we can set it and save it there during the GET /servers/{server_id} call.","commit_id":"517ebefedad5a0c6dacb140931035828ad9b625b"},{"author":{"_account_id":4690,"name":"melanie witt","display_name":"melwitt","email":"melwittt@gmail.com","username":"melwitt"},"change_message_id":"0654d78562441d309953136ef643f5eddcd1c0c2","unresolved":false,"context_lines":[{"line_number":180,"context_line":"populate the ``user_id`` field. The migration routine would look for mappings"},{"line_number":181,"context_line":"where ``user_id`` is None and query cells by corresponding ``project_id`` in"},{"line_number":182,"context_line":"the mapping. 
The query could filter on instance UUIDS, finding the ``user_id``"},{"line_number":183,"context_line":"values to populate in the mappings."},{"line_number":184,"context_line":""},{"line_number":185,"context_line":"In order to handle a rolling, mixed N and N+1 version upgrade, we will need to"},{"line_number":186,"context_line":"be able to fall back on the legacy counting method for instances, cores, and"}],"source_content_type":"text/x-rst","patch_set":6,"id":"ffd0ebdf_55cb48a4","line":183,"in_reply_to":"3f79a3b5_62911ff6","updated":"2019-01-04 00:26:47.000000000","message":"Correct, that would be the batched nova-manage way of doing the migration. Yes, we could also heal instance_mappings on-the-fly when we got a GET. I keep forgetting that \u0027nova-manage db online_data_migration\u0027 can be run at any time (not necessarily immediately) and we should heal instances on-the-fly when we can. I\u0027ll add that as a work item.","commit_id":"517ebefedad5a0c6dacb140931035828ad9b625b"},{"author":{"_account_id":6873,"name":"Matt Riedemann","email":"mriedem.os@gmail.com","username":"mriedem"},"change_message_id":"25faa9a05b25da09ba81b0b4f2910361595f6b2d","unresolved":false,"context_lines":[{"line_number":182,"context_line":"the mapping. The query could filter on instance UUIDS, finding the ``user_id``"},{"line_number":183,"context_line":"values to populate in the mappings."},{"line_number":184,"context_line":""},{"line_number":185,"context_line":"In order to handle a rolling, mixed N and N+1 version upgrade, we will need to"},{"line_number":186,"context_line":"be able to fall back on the legacy counting method for instances, cores, and"},{"line_number":187,"context_line":"ram if ``nova_api.instance_mappings`` don\u0027t yet have ``user_id`` populated"},{"line_number":188,"context_line":"(if the operator has not yet run the data migration). 
We will need a way to"}],"source_content_type":"text/x-rst","patch_set":6,"id":"3f79a3b5_02adcb32","line":185,"range":{"start_line":185,"start_character":29,"end_line":185,"end_character":60},"updated":"2018-12-21 15:54:25.000000000","message":"I\u0027m not really sure what this means in this context - this is normally about backlevel computes - but what is backlevel in this case besides having unmigrated instance mappings records?","commit_id":"517ebefedad5a0c6dacb140931035828ad9b625b"},{"author":{"_account_id":4690,"name":"melanie witt","display_name":"melwitt","email":"melwittt@gmail.com","username":"melwitt"},"change_message_id":"0654d78562441d309953136ef643f5eddcd1c0c2","unresolved":false,"context_lines":[{"line_number":182,"context_line":"the mapping. The query could filter on instance UUIDS, finding the ``user_id``"},{"line_number":183,"context_line":"values to populate in the mappings."},{"line_number":184,"context_line":""},{"line_number":185,"context_line":"In order to handle a rolling, mixed N and N+1 version upgrade, we will need to"},{"line_number":186,"context_line":"be able to fall back on the legacy counting method for instances, cores, and"},{"line_number":187,"context_line":"ram if ``nova_api.instance_mappings`` don\u0027t yet have ``user_id`` populated"},{"line_number":188,"context_line":"(if the operator has not yet run the data migration). We will need a way to"}],"source_content_type":"text/x-rst","patch_set":6,"id":"ffd0ebdf_f517dcec","line":185,"range":{"start_line":185,"start_character":29,"end_line":185,"end_character":60},"in_reply_to":"3f79a3b5_02adcb32","updated":"2019-01-04 00:26:47.000000000","message":"It just means having unmigrated instance mappings records, that is, an upgrade in-progress that is not yet complete. 
Maybe I shouldn\u0027t mention N and N+1 upgrade and say instead \"running, in-progress upgrade\"?","commit_id":"517ebefedad5a0c6dacb140931035828ad9b625b"},{"author":{"_account_id":6873,"name":"Matt Riedemann","email":"mriedem.os@gmail.com","username":"mriedem"},"change_message_id":"ce9e7a4cc58f7eec546d0dea8755c72d28fcfb3f","unresolved":false,"context_lines":[{"line_number":182,"context_line":"the mapping. The query could filter on instance UUIDS, finding the ``user_id``"},{"line_number":183,"context_line":"values to populate in the mappings."},{"line_number":184,"context_line":""},{"line_number":185,"context_line":"In order to handle a rolling, mixed N and N+1 version upgrade, we will need to"},{"line_number":186,"context_line":"be able to fall back on the legacy counting method for instances, cores, and"},{"line_number":187,"context_line":"ram if ``nova_api.instance_mappings`` don\u0027t yet have ``user_id`` populated"},{"line_number":188,"context_line":"(if the operator has not yet run the data migration). We will need a way to"}],"source_content_type":"text/x-rst","patch_set":6,"id":"ffd0ebdf_e30dd437","line":185,"range":{"start_line":185,"start_character":29,"end_line":185,"end_character":60},"in_reply_to":"ffd0ebdf_f517dcec","updated":"2019-01-04 17:52:22.000000000","message":"Yeah I think that would make more sense since N and N+1 mean backlevel computes to me, not data.","commit_id":"517ebefedad5a0c6dacb140931035828ad9b625b"},{"author":{"_account_id":6873,"name":"Matt Riedemann","email":"mriedem.os@gmail.com","username":"mriedem"},"change_message_id":"25faa9a05b25da09ba81b0b4f2910361595f6b2d","unresolved":false,"context_lines":[{"line_number":187,"context_line":"ram if ``nova_api.instance_mappings`` don\u0027t yet have ``user_id`` populated"},{"line_number":188,"context_line":"(if the operator has not yet run the data migration). 
We will need a way to"},{"line_number":189,"context_line":"detect that the migration has not yet been run in order to fall back and to do"},{"line_number":190,"context_line":"that, we can have a check such as ``if COUNT(InstanceMapping.id) where"},{"line_number":191,"context_line":"user_id\u003dNone \u003e 0`` then fall back on the legacy counting method to query cell"},{"line_number":192,"context_line":"databases."},{"line_number":193,"context_line":""}],"source_content_type":"text/x-rst","patch_set":6,"id":"3f79a3b5_dda254f9","line":190,"updated":"2018-12-21 15:54:25.000000000","message":"Wouldn\u0027t this want the project_id in the filter?","commit_id":"517ebefedad5a0c6dacb140931035828ad9b625b"},{"author":{"_account_id":4690,"name":"melanie witt","display_name":"melwitt","email":"melwittt@gmail.com","username":"melwitt"},"change_message_id":"0654d78562441d309953136ef643f5eddcd1c0c2","unresolved":false,"context_lines":[{"line_number":187,"context_line":"ram if ``nova_api.instance_mappings`` don\u0027t yet have ``user_id`` populated"},{"line_number":188,"context_line":"(if the operator has not yet run the data migration). We will need a way to"},{"line_number":189,"context_line":"detect that the migration has not yet been run in order to fall back and to do"},{"line_number":190,"context_line":"that, we can have a check such as ``if COUNT(InstanceMapping.id) where"},{"line_number":191,"context_line":"user_id\u003dNone \u003e 0`` then fall back on the legacy counting method to query cell"},{"line_number":192,"context_line":"databases."},{"line_number":193,"context_line":""}],"source_content_type":"text/x-rst","patch_set":6,"id":"ffd0ebdf_75e1acfe","line":190,"in_reply_to":"3f79a3b5_dda254f9","updated":"2019-01-04 00:26:47.000000000","message":"Hm, I had been thinking if any instance mapping in the deployment, regardless of project, doesn\u0027t have user_id populated, fall back on the legacy counting method. 
But it does seem like it would be OK to use the new counting method for a project if all of the instance mappings for that project have user_id populated.","commit_id":"517ebefedad5a0c6dacb140931035828ad9b625b"},{"author":{"_account_id":6873,"name":"Matt Riedemann","email":"mriedem.os@gmail.com","username":"mriedem"},"change_message_id":"ce9e7a4cc58f7eec546d0dea8755c72d28fcfb3f","unresolved":false,"context_lines":[{"line_number":187,"context_line":"ram if ``nova_api.instance_mappings`` don\u0027t yet have ``user_id`` populated"},{"line_number":188,"context_line":"(if the operator has not yet run the data migration). We will need a way to"},{"line_number":189,"context_line":"detect that the migration has not yet been run in order to fall back and to do"},{"line_number":190,"context_line":"that, we can have a check such as ``if COUNT(InstanceMapping.id) where"},{"line_number":191,"context_line":"user_id\u003dNone \u003e 0`` then fall back on the legacy counting method to query cell"},{"line_number":192,"context_line":"databases."},{"line_number":193,"context_line":""}],"source_content_type":"text/x-rst","patch_set":6,"id":"ffd0ebdf_23a0bcf4","line":190,"in_reply_to":"ffd0ebdf_75e1acfe","updated":"2019-01-04 17:52:22.000000000","message":"Right. If the cloud has 1000 projects and 500 of them are migrated (all the instance mappings for half the projects have the user_id populated), if I\u0027m creating a new server and my project is all migrated, I wouldn\u0027t want to be held back somehow because of the other projects that don\u0027t yet have their instance mappings migrated. 
It would also reduce the scope of that query performed on each quota check.","commit_id":"517ebefedad5a0c6dacb140931035828ad9b625b"},{"author":{"_account_id":6873,"name":"Matt Riedemann","email":"mriedem.os@gmail.com","username":"mriedem"},"change_message_id":"25faa9a05b25da09ba81b0b4f2910361595f6b2d","unresolved":false,"context_lines":[{"line_number":188,"context_line":"(if the operator has not yet run the data migration). We will need a way to"},{"line_number":189,"context_line":"detect that the migration has not yet been run in order to fall back and to do"},{"line_number":190,"context_line":"that, we can have a check such as ``if COUNT(InstanceMapping.id) where"},{"line_number":191,"context_line":"user_id\u003dNone \u003e 0`` then fall back on the legacy counting method to query cell"},{"line_number":192,"context_line":"databases."},{"line_number":193,"context_line":""},{"line_number":194,"context_line":"Implementation"}],"source_content_type":"text/x-rst","patch_set":6,"id":"3f79a3b5_2265671b","line":191,"updated":"2018-12-21 15:54:25.000000000","message":"Rather than fallback to the legacy counting method, why not just do the data migration right then and there? Similar to what I\u0027m saying above with the _get_instance method.","commit_id":"517ebefedad5a0c6dacb140931035828ad9b625b"},{"author":{"_account_id":4690,"name":"melanie witt","display_name":"melwitt","email":"melwittt@gmail.com","username":"melwitt"},"change_message_id":"0654d78562441d309953136ef643f5eddcd1c0c2","unresolved":false,"context_lines":[{"line_number":188,"context_line":"(if the operator has not yet run the data migration). 
We will need a way to"},{"line_number":189,"context_line":"detect that the migration has not yet been run in order to fall back and to do"},{"line_number":190,"context_line":"that, we can have a check such as ``if COUNT(InstanceMapping.id) where"},{"line_number":191,"context_line":"user_id\u003dNone \u003e 0`` then fall back on the legacy counting method to query cell"},{"line_number":192,"context_line":"databases."},{"line_number":193,"context_line":""},{"line_number":194,"context_line":"Implementation"}],"source_content_type":"text/x-rst","patch_set":6,"id":"ffd0ebdf_95bbf0e3","line":191,"in_reply_to":"3f79a3b5_2265671b","updated":"2019-01-04 00:26:47.000000000","message":"It could potentially be expensive to run a batch in the middle of a quota check? In order to migrate on-the-fly and proceed with the new counting method, we would have to batch migrate all instance mappings for that project at quota check time. Do you think we should do that or?","commit_id":"517ebefedad5a0c6dacb140931035828ad9b625b"},{"author":{"_account_id":6873,"name":"Matt Riedemann","email":"mriedem.os@gmail.com","username":"mriedem"},"change_message_id":"ce9e7a4cc58f7eec546d0dea8755c72d28fcfb3f","unresolved":false,"context_lines":[{"line_number":188,"context_line":"(if the operator has not yet run the data migration). 
We will need a way to"},{"line_number":189,"context_line":"detect that the migration has not yet been run in order to fall back and to do"},{"line_number":190,"context_line":"that, we can have a check such as ``if COUNT(InstanceMapping.id) where"},{"line_number":191,"context_line":"user_id\u003dNone \u003e 0`` then fall back on the legacy counting method to query cell"},{"line_number":192,"context_line":"databases."},{"line_number":193,"context_line":""},{"line_number":194,"context_line":"Implementation"}],"source_content_type":"text/x-rst","patch_set":6,"id":"ffd0ebdf_c3f21007","line":191,"in_reply_to":"ffd0ebdf_95bbf0e3","updated":"2019-01-04 17:52:22.000000000","message":"If we filter on project_id like I said above, then it would be a one time hit for that project, right? Assuming default quotas, that\u0027s at most 10 instance mappings to update.\n\nMaybe get other opinions here, since I know some clouds have projects with much larger quota. I guess it depends on how eager a cloud wants to get to the counting via placement behavior for all projects (which we know CERN definitely does, but others with a single cell might not care). So if CERN is just going to run the online data migrations as soon as they get to Stein so they are covered, maybe it\u0027s not worth it for everyone else.\n\nI guess the worst-case scenario is you\u0027re no worse off than you are today.","commit_id":"517ebefedad5a0c6dacb140931035828ad9b625b"},{"author":{"_account_id":4690,"name":"melanie witt","display_name":"melwitt","email":"melwittt@gmail.com","username":"melwitt"},"change_message_id":"5ca3b98b1fbd89b205b9cd7431c50762b19d6ed1","unresolved":false,"context_lines":[{"line_number":188,"context_line":"(if the operator has not yet run the data migration). 
We will need a way to"},{"line_number":189,"context_line":"detect that the migration has not yet been run in order to fall back and to do"},{"line_number":190,"context_line":"that, we can have a check such as ``if COUNT(InstanceMapping.id) where"},{"line_number":191,"context_line":"user_id\u003dNone \u003e 0`` then fall back on the legacy counting method to query cell"},{"line_number":192,"context_line":"databases."},{"line_number":193,"context_line":""},{"line_number":194,"context_line":"Implementation"}],"source_content_type":"text/x-rst","patch_set":6,"id":"dfd5e7cf_31603033","line":191,"in_reply_to":"ffd0ebdf_c3f21007","updated":"2019-01-04 23:07:49.000000000","message":"Yeah, given a filter on project_id, it would be a one time hit per project. But, yes I was imagining the possibility of a much higher quota for a project (Oath does things like, a tenant creates 100 new instances to scale up, implying that the quota is quite high). So in a case like that, if they didn\u0027t run online data migrations yet and we did a project batch on-the-fly, we\u0027d update 100 instance mappings during the first quota check. That seems like a lot to do during a quota check, but it is only a one time hit like you said. It\u0027s hard to say if that\u0027s so bad or not. 
I could post to the ML about it to get ops input if this gets approved.","commit_id":"517ebefedad5a0c6dacb140931035828ad9b625b"},{"author":{"_account_id":6873,"name":"Matt Riedemann","email":"mriedem.os@gmail.com","username":"mriedem"},"change_message_id":"25faa9a05b25da09ba81b0b4f2910361595f6b2d","unresolved":false,"context_lines":[{"line_number":190,"context_line":"that, we can have a check such as ``if COUNT(InstanceMapping.id) where"},{"line_number":191,"context_line":"user_id\u003dNone \u003e 0`` then fall back on the legacy counting method to query cell"},{"line_number":192,"context_line":"databases."},{"line_number":193,"context_line":""},{"line_number":194,"context_line":"Implementation"},{"line_number":195,"context_line":"\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d"},{"line_number":196,"context_line":""}],"source_content_type":"text/x-rst","patch_set":6,"id":"3f79a3b5_a27097d9","line":193,"updated":"2018-12-21 15:54:25.000000000","message":"Will we migrate instance mappings that are queued_for_delete? I\u0027m not sure if it matters. 
Seems fine to migrate those as well to be safe.\n\n(later)\n\nActually I guess the answer must be yes because we\u0027ll need to filter on queued_for_delete\u003dFalse when counting quota, like here:\n\nhttps://github.com/openstack/nova/blob/905e25a63d3ba25cfbdf492891ac8864fed609ab/nova/quota.py#L1113\n\nCorrect?","commit_id":"517ebefedad5a0c6dacb140931035828ad9b625b"},{"author":{"_account_id":4690,"name":"melanie witt","display_name":"melwitt","email":"melwittt@gmail.com","username":"melwitt"},"change_message_id":"0654d78562441d309953136ef643f5eddcd1c0c2","unresolved":false,"context_lines":[{"line_number":190,"context_line":"that, we can have a check such as ``if COUNT(InstanceMapping.id) where"},{"line_number":191,"context_line":"user_id\u003dNone \u003e 0`` then fall back on the legacy counting method to query cell"},{"line_number":192,"context_line":"databases."},{"line_number":193,"context_line":""},{"line_number":194,"context_line":"Implementation"},{"line_number":195,"context_line":"\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d"},{"line_number":196,"context_line":""}],"source_content_type":"text/x-rst","patch_set":6,"id":"ffd0ebdf_55b588ed","line":193,"in_reply_to":"3f79a3b5_a27097d9","updated":"2019-01-04 00:26:47.000000000","message":"Right, we\u0027ll need to filter out queued_for_delete\u003dTrue mappings that we find for the project/user when doing the instance count. 
So yes, we will need to migrate mappings that are queued_for_delete as well.","commit_id":"517ebefedad5a0c6dacb140931035828ad9b625b"},{"author":{"_account_id":6873,"name":"Matt Riedemann","email":"mriedem.os@gmail.com","username":"mriedem"},"change_message_id":"25faa9a05b25da09ba81b0b4f2910361595f6b2d","unresolved":false,"context_lines":[{"line_number":208,"context_line":""},{"line_number":209,"context_line":"* Add a new column ``user_id`` to the ``nova_api.instance_mappings`` table."},{"line_number":210,"context_line":"* Implement an online data migration to populate the ``user_id`` field."},{"line_number":211,"context_line":"* Update the ``_server_group_count_members_by_user`` quota counting method to"},{"line_number":212,"context_line":"  use only the ``nova_api.instance_mappings`` table instead of querying cell"},{"line_number":213,"context_line":"  databases."},{"line_number":214,"context_line":"* Add a config option ``[workarounds]disable_quota_usage_from_placement`` that"},{"line_number":215,"context_line":"  defaults to False."},{"line_number":216,"context_line":"* Add a new method to count instances with a count of"}],"source_content_type":"text/x-rst","patch_set":6,"id":"3f79a3b5_dd3494d9","line":213,"range":{"start_line":211,"start_character":0,"end_line":213,"end_character":12},"updated":"2018-12-21 15:54:25.000000000","message":"Coincidentally this should also remove our need to include build requests in that method:\n\nhttps://github.com/openstack/nova/blob/905e25a63d3ba25cfbdf492891ac8864fed609ab/nova/quota.py#L1123","commit_id":"517ebefedad5a0c6dacb140931035828ad9b625b"},{"author":{"_account_id":4690,"name":"melanie witt","display_name":"melwitt","email":"melwittt@gmail.com","username":"melwitt"},"change_message_id":"0654d78562441d309953136ef643f5eddcd1c0c2","unresolved":false,"context_lines":[{"line_number":208,"context_line":""},{"line_number":209,"context_line":"* Add a new column ``user_id`` to the ``nova_api.instance_mappings`` 
table."},{"line_number":210,"context_line":"* Implement an online data migration to populate the ``user_id`` field."},{"line_number":211,"context_line":"* Update the ``_server_group_count_members_by_user`` quota counting method to"},{"line_number":212,"context_line":"  use only the ``nova_api.instance_mappings`` table instead of querying cell"},{"line_number":213,"context_line":"  databases."},{"line_number":214,"context_line":"* Add a config option ``[workarounds]disable_quota_usage_from_placement`` that"},{"line_number":215,"context_line":"  defaults to False."},{"line_number":216,"context_line":"* Add a new method to count instances with a count of"}],"source_content_type":"text/x-rst","patch_set":6,"id":"ffd0ebdf_f5c59c5c","line":213,"range":{"start_line":211,"start_character":0,"end_line":213,"end_character":12},"in_reply_to":"3f79a3b5_dd3494d9","updated":"2019-01-04 00:26:47.000000000","message":"As I mentioned in an earlier comment, pretty much. There\u0027s technically a tiny window where we can have a build request and no instance mapping yet.","commit_id":"517ebefedad5a0c6dacb140931035828ad9b625b"},{"author":{"_account_id":6873,"name":"Matt Riedemann","email":"mriedem.os@gmail.com","username":"mriedem"},"change_message_id":"ce9e7a4cc58f7eec546d0dea8755c72d28fcfb3f","unresolved":false,"context_lines":[{"line_number":208,"context_line":""},{"line_number":209,"context_line":"* Add a new column ``user_id`` to the ``nova_api.instance_mappings`` table."},{"line_number":210,"context_line":"* Implement an online data migration to populate the ``user_id`` field."},{"line_number":211,"context_line":"* Update the ``_server_group_count_members_by_user`` quota counting method to"},{"line_number":212,"context_line":"  use only the ``nova_api.instance_mappings`` table instead of querying cell"},{"line_number":213,"context_line":"  databases."},{"line_number":214,"context_line":"* Add a config option ``[workarounds]disable_quota_usage_from_placement`` 
that"},{"line_number":215,"context_line":"  defaults to False."},{"line_number":216,"context_line":"* Add a new method to count instances with a count of"}],"source_content_type":"text/x-rst","patch_set":6,"id":"ffd0ebdf_a347cc1d","line":213,"range":{"start_line":211,"start_character":0,"end_line":213,"end_character":12},"in_reply_to":"ffd0ebdf_f5c59c5c","updated":"2019-01-04 17:52:22.000000000","message":"Can we just move the instance mapping creation to before the build request creation? I guess we could fail here in that case:\n\nhttps://github.com/openstack/nova/blob/8ef3d253a086e4f8575f5221d4515cda421abea2/nova/compute/api.py#L2452","commit_id":"517ebefedad5a0c6dacb140931035828ad9b625b"},{"author":{"_account_id":6873,"name":"Matt Riedemann","email":"mriedem.os@gmail.com","username":"mriedem"},"change_message_id":"25faa9a05b25da09ba81b0b4f2910361595f6b2d","unresolved":false,"context_lines":[{"line_number":211,"context_line":"* Update the ``_server_group_count_members_by_user`` quota counting method to"},{"line_number":212,"context_line":"  use only the ``nova_api.instance_mappings`` table instead of querying cell"},{"line_number":213,"context_line":"  databases."},{"line_number":214,"context_line":"* Add a config option ``[workarounds]disable_quota_usage_from_placement`` that"},{"line_number":215,"context_line":"  defaults to False."},{"line_number":216,"context_line":"* Add a new method to count instances with a count of"},{"line_number":217,"context_line":"  ``nova_api.instance_mappings`` filtering by ``project_id\u003d\u003cproject_id\u003e`` and"}],"source_content_type":"text/x-rst","patch_set":6,"id":"3f79a3b5_9d813c49","line":214,"updated":"2018-12-21 15:54:25.000000000","message":"Any thoughts on when this could be deprecated? I guess when we have a partitioning scheme built into placement. 
Just kind of sucks supporting two code paths for counting for a long time.","commit_id":"517ebefedad5a0c6dacb140931035828ad9b625b"},{"author":{"_account_id":4690,"name":"melanie witt","display_name":"melwitt","email":"melwittt@gmail.com","username":"melwitt"},"change_message_id":"0654d78562441d309953136ef643f5eddcd1c0c2","unresolved":false,"context_lines":[{"line_number":211,"context_line":"* Update the ``_server_group_count_members_by_user`` quota counting method to"},{"line_number":212,"context_line":"  use only the ``nova_api.instance_mappings`` table instead of querying cell"},{"line_number":213,"context_line":"  databases."},{"line_number":214,"context_line":"* Add a config option ``[workarounds]disable_quota_usage_from_placement`` that"},{"line_number":215,"context_line":"  defaults to False."},{"line_number":216,"context_line":"* Add a new method to count instances with a count of"},{"line_number":217,"context_line":"  ``nova_api.instance_mappings`` filtering by ``project_id\u003d\u003cproject_id\u003e`` and"}],"source_content_type":"text/x-rst","patch_set":6,"id":"ffd0ebdf_958e9025","line":214,"in_reply_to":"3f79a3b5_9d813c49","updated":"2019-01-04 00:26:47.000000000","message":"Yeah. I think it would be a long time in the future, depending on how important the idea of multiple Nova deployments sharing a single placement service becomes in Edge land.\n\nI agree, it kind of sucks to support two code paths but I felt like this is a fairly minimal case of it, if it\u0027s any consolation, where we have two self-contained counting methods in one file (nova/quota.py) and we use one or the other depending. 
(As opposed to two paths repeated throughout various code files).","commit_id":"517ebefedad5a0c6dacb140931035828ad9b625b"},{"author":{"_account_id":6873,"name":"Matt Riedemann","email":"mriedem.os@gmail.com","username":"mriedem"},"change_message_id":"ce9e7a4cc58f7eec546d0dea8755c72d28fcfb3f","unresolved":false,"context_lines":[{"line_number":211,"context_line":"* Update the ``_server_group_count_members_by_user`` quota counting method to"},{"line_number":212,"context_line":"  use only the ``nova_api.instance_mappings`` table instead of querying cell"},{"line_number":213,"context_line":"  databases."},{"line_number":214,"context_line":"* Add a config option ``[workarounds]disable_quota_usage_from_placement`` that"},{"line_number":215,"context_line":"  defaults to False."},{"line_number":216,"context_line":"* Add a new method to count instances with a count of"},{"line_number":217,"context_line":"  ``nova_api.instance_mappings`` filtering by ``project_id\u003d\u003cproject_id\u003e`` and"}],"source_content_type":"text/x-rst","patch_set":6,"id":"ffd0ebdf_6318041d","line":214,"in_reply_to":"ffd0ebdf_958e9025","updated":"2019-01-04 17:52:22.000000000","message":"There is also the performance impact of doing the instance mapping query check, but that should be temporary and we could add an api_db blocker migration in Train to make sure people did the data migration and then drop that query from the count path.","commit_id":"517ebefedad5a0c6dacb140931035828ad9b625b"},{"author":{"_account_id":6873,"name":"Matt Riedemann","email":"mriedem.os@gmail.com","username":"mriedem"},"change_message_id":"25faa9a05b25da09ba81b0b4f2910361595f6b2d","unresolved":false,"context_lines":[{"line_number":218,"context_line":"  ``user_id\u003d\u003cuser_id\u003e`` and ``queued_for_delete\u003dFalse``."},{"line_number":219,"context_line":"* Add a new count method that queries the placement API for CPU and RAM usage."},{"line_number":220,"context_line":"  In the new count method, add a check for 
whether the online data migration"},{"line_number":221,"context_line":"  has been run yet and if not, fall back on the legacy count method."},{"line_number":222,"context_line":"* Rename the ``_instances_cores_ram_count`` method to ``_cores_ram_count`` and"},{"line_number":223,"context_line":"  let it count only cores and ram in the legacy way, for use if"},{"line_number":224,"context_line":"  ``[workarounds]disable_quota_usage_from_placement`` is set to True."}],"source_content_type":"text/x-rst","patch_set":6,"id":"3f79a3b5_bda7580a","line":221,"updated":"2018-12-21 15:54:25.000000000","message":"Again I would argue we should just do the data migration online here so we don\u0027t have to fallback to the legacy method if we can help it - do as much of this online as possible. The sticking thing is I\u0027m not sure how you plan on determining if the database migration has been run or not, because it seems the only way you can determine that is by checking for instance mappings with no user_id for the given project at the time of counting, which would be wasteful in a fresh install where everything is already migrated.","commit_id":"517ebefedad5a0c6dacb140931035828ad9b625b"},{"author":{"_account_id":6873,"name":"Matt Riedemann","email":"mriedem.os@gmail.com","username":"mriedem"},"change_message_id":"a5a99c917e64f3296e46520b2d17cc902b534936","unresolved":false,"context_lines":[{"line_number":218,"context_line":"  ``user_id\u003d\u003cuser_id\u003e`` and ``queued_for_delete\u003dFalse``."},{"line_number":219,"context_line":"* Add a new count method that queries the placement API for CPU and RAM usage."},{"line_number":220,"context_line":"  In the new count method, add a check for whether the online data migration"},{"line_number":221,"context_line":"  has been run yet and if not, fall back on the legacy count method."},{"line_number":222,"context_line":"* Rename the ``_instances_cores_ram_count`` method to ``_cores_ram_count`` 
and"},{"line_number":223,"context_line":"  let it count only cores and ram in the legacy way, for use if"},{"line_number":224,"context_line":"  ``[workarounds]disable_quota_usage_from_placement`` is set to True."}],"source_content_type":"text/x-rst","patch_set":6,"id":"3f79a3b5_fdd1502e","line":221,"in_reply_to":"3f79a3b5_bda7580a","updated":"2018-12-21 15:58:42.000000000","message":"If we do have to do the check for data migration before counting each time, I\u0027m hoping we can make that temporary for Stein and drop it in T with a blocker migration, i.e. you can\u0027t pass \u0027nova-manage api_db sync\u0027 if there are any instance mappings with user_id\u003dNone to force the batched migration using the CLI.\n\nMaybe in addition to that, could we also cache the results of that check so we don\u0027t run it more than once per API worker per project_id?","commit_id":"517ebefedad5a0c6dacb140931035828ad9b625b"},{"author":{"_account_id":4690,"name":"melanie witt","display_name":"melwitt","email":"melwittt@gmail.com","username":"melwitt"},"change_message_id":"0654d78562441d309953136ef643f5eddcd1c0c2","unresolved":false,"context_lines":[{"line_number":218,"context_line":"  ``user_id\u003d\u003cuser_id\u003e`` and ``queued_for_delete\u003dFalse``."},{"line_number":219,"context_line":"* Add a new count method that queries the placement API for CPU and RAM usage."},{"line_number":220,"context_line":"  In the new count method, add a check for whether the online data migration"},{"line_number":221,"context_line":"  has been run yet and if not, fall back on the legacy count method."},{"line_number":222,"context_line":"* Rename the ``_instances_cores_ram_count`` method to ``_cores_ram_count`` and"},{"line_number":223,"context_line":"  let it count only cores and ram in the legacy way, for use if"},{"line_number":224,"context_line":"  ``[workarounds]disable_quota_usage_from_placement`` is set to 
True."}],"source_content_type":"text/x-rst","patch_set":6,"id":"ffd0ebdf_75b68cdc","line":221,"in_reply_to":"3f79a3b5_fdd1502e","updated":"2019-01-04 00:26:47.000000000","message":"I agree that I think the only way to check whether the data migration has been done is to check if any instance mapping with user_id\u003dNone exists for a given project. And I like the idea of making that temporary for Stein and drop it in T with a blocker migration. And I like the idea of caching the results of the check so we don\u0027t run it over and over again.\n\nI\u0027ll add text about all of this.","commit_id":"517ebefedad5a0c6dacb140931035828ad9b625b"},{"author":{"_account_id":6873,"name":"Matt Riedemann","email":"mriedem.os@gmail.com","username":"mriedem"},"change_message_id":"ce9e7a4cc58f7eec546d0dea8755c72d28fcfb3f","unresolved":false,"context_lines":[{"line_number":218,"context_line":"  ``user_id\u003d\u003cuser_id\u003e`` and ``queued_for_delete\u003dFalse``."},{"line_number":219,"context_line":"* Add a new count method that queries the placement API for CPU and RAM usage."},{"line_number":220,"context_line":"  In the new count method, add a check for whether the online data migration"},{"line_number":221,"context_line":"  has been run yet and if not, fall back on the legacy count method."},{"line_number":222,"context_line":"* Rename the ``_instances_cores_ram_count`` method to ``_cores_ram_count`` and"},{"line_number":223,"context_line":"  let it count only cores and ram in the legacy way, for use if"},{"line_number":224,"context_line":"  ``[workarounds]disable_quota_usage_from_placement`` is set to True."}],"source_content_type":"text/x-rst","patch_set":6,"id":"ffd0ebdf_63ed64ee","line":221,"in_reply_to":"ffd0ebdf_75b68cdc","updated":"2019-01-04 17:52:22.000000000","message":"\u003e I agree that I think the only way to check whether the data\n \u003e migration has been done is to check if any instance mapping with\n \u003e user_id\u003dNone exists for a given project. 
And I like the idea of\n \u003e making that temporary for Stein and drop it in T with a blocker\n \u003e migration. And I like the idea of caching the results of the check\n \u003e so we don\u0027t run it over and over again.\n \u003e \n \u003e I\u0027ll add text about all of this.\n\nSounds good yeah - that alleviates most of my concern over just doing the data migration at the time of the count.","commit_id":"517ebefedad5a0c6dacb140931035828ad9b625b"},{"author":{"_account_id":4690,"name":"melanie witt","display_name":"melwitt","email":"melwittt@gmail.com","username":"melwitt"},"change_message_id":"6aefd6fb6420d6113a72bcddbfa27c89771c2a5a","unresolved":false,"context_lines":[{"line_number":218,"context_line":"  ``user_id\u003d\u003cuser_id\u003e`` and ``queued_for_delete\u003dFalse``."},{"line_number":219,"context_line":"* Add a new count method that queries the placement API for CPU and RAM usage."},{"line_number":220,"context_line":"  In the new count method, add a check for whether the online data migration"},{"line_number":221,"context_line":"  has been run yet and if not, fall back on the legacy count method."},{"line_number":222,"context_line":"* Rename the ``_instances_cores_ram_count`` method to ``_cores_ram_count`` and"},{"line_number":223,"context_line":"  let it count only cores and ram in the legacy way, for use if"},{"line_number":224,"context_line":"  ``[workarounds]disable_quota_usage_from_placement`` is set to True."}],"source_content_type":"text/x-rst","patch_set":6,"id":"ffd0ebdf_d635fbf8","line":221,"in_reply_to":"ffd0ebdf_75b68cdc","updated":"2019-01-04 10:05:50.000000000","message":"Another thing I forgot to mention is that either way, we will need the nova-manage batch data migration to handle fast-forward-upgrades, IIUC.","commit_id":"517ebefedad5a0c6dacb140931035828ad9b625b"},{"author":{"_account_id":6873,"name":"Matt 
Riedemann","email":"mriedem.os@gmail.com","username":"mriedem"},"change_message_id":"ce9e7a4cc58f7eec546d0dea8755c72d28fcfb3f","unresolved":false,"context_lines":[{"line_number":218,"context_line":"  ``user_id\u003d\u003cuser_id\u003e`` and ``queued_for_delete\u003dFalse``."},{"line_number":219,"context_line":"* Add a new count method that queries the placement API for CPU and RAM usage."},{"line_number":220,"context_line":"  In the new count method, add a check for whether the online data migration"},{"line_number":221,"context_line":"  has been run yet and if not, fall back on the legacy count method."},{"line_number":222,"context_line":"* Rename the ``_instances_cores_ram_count`` method to ``_cores_ram_count`` and"},{"line_number":223,"context_line":"  let it count only cores and ram in the legacy way, for use if"},{"line_number":224,"context_line":"  ``[workarounds]disable_quota_usage_from_placement`` is set to True."}],"source_content_type":"text/x-rst","patch_set":6,"id":"ffd0ebdf_a3e34c05","line":221,"in_reply_to":"ffd0ebdf_d635fbf8","updated":"2019-01-04 17:52:22.000000000","message":"\u003e Another thing I forgot to mention is that either way, we will need\n \u003e the nova-manage batch data migration to handle fast-forward-upgrades,\n \u003e IIUC.\n\nYup definitely.","commit_id":"517ebefedad5a0c6dacb140931035828ad9b625b"},{"author":{"_account_id":6873,"name":"Matt Riedemann","email":"mriedem.os@gmail.com","username":"mriedem"},"change_message_id":"25faa9a05b25da09ba81b0b4f2910361595f6b2d","unresolved":false,"context_lines":[{"line_number":231,"context_line":"Testing"},{"line_number":232,"context_line":"\u003d\u003d\u003d\u003d\u003d\u003d\u003d"},{"line_number":233,"context_line":""},{"line_number":234,"context_line":"Unit tests and functional tests will be included to test the new functionality."},{"line_number":235,"context_line":""},{"line_number":236,"context_line":"Documentation 
Impact"},{"line_number":237,"context_line":"\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d"}],"source_content_type":"text/x-rst","patch_set":6,"id":"3f79a3b5_1dc44c92","line":234,"updated":"2018-12-21 15:54:25.000000000","message":"I would argue we should also have at least one CI job running with disable_quota_usage_from_placement\u003dTrue so we make sure to have integration test coverage of that path, like maybe nova-next or the nova-live-migration job.","commit_id":"517ebefedad5a0c6dacb140931035828ad9b625b"},{"author":{"_account_id":4690,"name":"melanie witt","display_name":"melwitt","email":"melwittt@gmail.com","username":"melwitt"},"change_message_id":"0654d78562441d309953136ef643f5eddcd1c0c2","unresolved":false,"context_lines":[{"line_number":231,"context_line":"Testing"},{"line_number":232,"context_line":"\u003d\u003d\u003d\u003d\u003d\u003d\u003d"},{"line_number":233,"context_line":""},{"line_number":234,"context_line":"Unit tests and functional tests will be included to test the new functionality."},{"line_number":235,"context_line":""},{"line_number":236,"context_line":"Documentation Impact"},{"line_number":237,"context_line":"\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d"}],"source_content_type":"text/x-rst","patch_set":6,"id":"ffd0ebdf_15cb8053","line":234,"in_reply_to":"3f79a3b5_1dc44c92","updated":"2019-01-04 00:26:47.000000000","message":"Yeah, good idea. 
Will add.","commit_id":"517ebefedad5a0c6dacb140931035828ad9b625b"},{"author":{"_account_id":6873,"name":"Matt Riedemann","email":"mriedem.os@gmail.com","username":"mriedem"},"change_message_id":"25faa9a05b25da09ba81b0b4f2910361595f6b2d","unresolved":false,"context_lines":[{"line_number":236,"context_line":"Documentation Impact"},{"line_number":237,"context_line":"\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d"},{"line_number":238,"context_line":""},{"line_number":239,"context_line":"The documentation_ of Cells v2 caveats will be updated to remove the paragraph"},{"line_number":240,"context_line":"about the inability to correctly calculate quota usage when one or more cells"},{"line_number":241,"context_line":"are unreachable."},{"line_number":242,"context_line":""}],"source_content_type":"text/x-rst","patch_set":6,"id":"3f79a3b5_fd35d0a6","line":239,"range":{"start_line":239,"start_character":58,"end_line":239,"end_character":64},"updated":"2018-12-21 15:54:25.000000000","message":"nit: rather than remove the doc, I\u0027d update it to mention that starting in Stein there are new deployment options, similar to what we\u0027ve done for other things we\u0027ve fixed:\n\nhttps://docs.openstack.org/nova/latest/user/cellsv2-layout.html#operations-requiring-upcalls\n\nBecause people also read the latest docs when figuring out issues with older deployments (e.g. 
we don\u0027t have docs published for pike).","commit_id":"517ebefedad5a0c6dacb140931035828ad9b625b"},{"author":{"_account_id":4690,"name":"melanie witt","display_name":"melwitt","email":"melwittt@gmail.com","username":"melwitt"},"change_message_id":"0654d78562441d309953136ef643f5eddcd1c0c2","unresolved":false,"context_lines":[{"line_number":236,"context_line":"Documentation Impact"},{"line_number":237,"context_line":"\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d"},{"line_number":238,"context_line":""},{"line_number":239,"context_line":"The documentation_ of Cells v2 caveats will be updated to remove the paragraph"},{"line_number":240,"context_line":"about the inability to correctly calculate quota usage when one or more cells"},{"line_number":241,"context_line":"are unreachable."},{"line_number":242,"context_line":""}],"source_content_type":"text/x-rst","patch_set":6,"id":"ffd0ebdf_35c80454","line":239,"range":{"start_line":239,"start_character":58,"end_line":239,"end_character":64},"in_reply_to":"3f79a3b5_fd35d0a6","updated":"2019-01-04 00:26:47.000000000","message":"Ah, good point. Will update.","commit_id":"517ebefedad5a0c6dacb140931035828ad9b625b"},{"author":{"_account_id":7,"name":"Jay Pipes","email":"jaypipes@gmail.com","username":"jaypipes"},"change_message_id":"b92c974902a24dbce47e4aacbd8ce89c3924bcf1","unresolved":false,"context_lines":[{"line_number":71,"context_line":""},{"line_number":72,"context_line":"The new method will contain:"},{"line_number":73,"context_line":""},{"line_number":74,"context_line":"* One query to the API database to get resource usage for instances. We can get"},{"line_number":75,"context_line":"  the number of instances for a project and user if we add a new column"},{"line_number":76,"context_line":"  ``user_id`` to the ``nova_api.instance_mappings`` table. We already have a"},{"line_number":77,"context_line":"  ``project_id`` column on the table. 
This will allow us to count instance"},{"line_number":78,"context_line":"  mappings for a project and a user to represent the instance count."},{"line_number":79,"context_line":""},{"line_number":80,"context_line":"We will rename the ``_instances_cores_ram_count`` method to"},{"line_number":81,"context_line":"``_cores_ram_count`` that counts cores and ram from the cell databases and"}],"source_content_type":"text/x-rst","patch_set":7,"id":"dfd5e7cf_99a00da8","line":78,"range":{"start_line":74,"start_character":0,"end_line":78,"end_character":68},"updated":"2019-01-07 15:22:11.000000000","message":"or, alternately, we could deprecate the instances quota entirely since it\u0027s mostly useless compared to quotas on actual resources that are consumed, like CPU, memory, IP addresses, etc...","commit_id":"9187328f1d419680565141ecf795530afddebf2c"},{"author":{"_account_id":7,"name":"Jay Pipes","email":"jaypipes@gmail.com","username":"jaypipes"},"change_message_id":"b92c974902a24dbce47e4aacbd8ce89c3924bcf1","unresolved":false,"context_lines":[{"line_number":103,"context_line":""},{"line_number":104,"context_line":"We will add a new method for counting cores and ram from placement that is used"},{"line_number":105,"context_line":"when ``[workarounds]disable_quota_usage_from_placement`` is False. 
This"},{"line_number":106,"context_line":"method could be called ``_cores_ram_count_placement``."},{"line_number":107,"context_line":""},{"line_number":108,"context_line":"The new method will contain:"},{"line_number":109,"context_line":""}],"source_content_type":"text/x-rst","patch_set":7,"id":"dfd5e7cf_d976953f","line":106,"range":{"start_line":106,"start_character":26,"end_line":106,"end_character":51},"updated":"2019-01-07 15:22:11.000000000","message":"or just _placement_resource_count, since we can count any resource class not just VCPU and MEMORY_MB","commit_id":"9187328f1d419680565141ecf795530afddebf2c"},{"author":{"_account_id":7,"name":"Jay Pipes","email":"jaypipes@gmail.com","username":"jaypipes"},"change_message_id":"b92c974902a24dbce47e4aacbd8ce89c3924bcf1","unresolved":false,"context_lines":[{"line_number":110,"context_line":"* One call to placement to get resource usage for CPU and RAM. We can get CPU"},{"line_number":111,"context_line":"  and RAM usage for a project and user by querying the ``/usages`` resource::"},{"line_number":112,"context_line":""},{"line_number":113,"context_line":"    GET /usages?project_id\u003d\u003cproject id\u003e\u0026user_id\u003d\u003cuser id\u003e"},{"line_number":114,"context_line":""},{"line_number":115,"context_line":"Alternatives"},{"line_number":116,"context_line":"------------"}],"source_content_type":"text/x-rst","patch_set":7,"id":"dfd5e7cf_79b689dc","line":113,"updated":"2019-01-07 15:22:11.000000000","message":"Recommend only project_id\u003d{project} and not doing user-specific quotas, which I\u0027m pretty sure we decided in Denver we want to deprecate.","commit_id":"9187328f1d419680565141ecf795530afddebf2c"},{"author":{"_account_id":6873,"name":"Matt Riedemann","email":"mriedem.os@gmail.com","username":"mriedem"},"change_message_id":"40a84eac8a456a9df72b4157b38b404b823e3a6b","unresolved":false,"context_lines":[{"line_number":157,"context_line":"End users will see consistent quota behavior even when cell 
databases are"},{"line_number":158,"context_line":"unavailable."},{"line_number":159,"context_line":""},{"line_number":160,"context_line":"Performance Impact"},{"line_number":161,"context_line":"------------------"},{"line_number":162,"context_line":""},{"line_number":163,"context_line":"The change involves making external REST API calls to placement instead of"}],"source_content_type":"text/x-rst","patch_set":7,"id":"ffd0ebdf_43c5c03b","line":160,"updated":"2019-01-04 18:04:02.000000000","message":"nit: another known performance impact would be checking if data needs to be migrated at the time of the quota check. Could be added in a follow up.","commit_id":"9187328f1d419680565141ecf795530afddebf2c"},{"author":{"_account_id":6873,"name":"Matt Riedemann","email":"mriedem.os@gmail.com","username":"mriedem"},"change_message_id":"40a84eac8a456a9df72b4157b38b404b823e3a6b","unresolved":false,"context_lines":[{"line_number":184,"context_line":"where ``user_id`` is None and query cells by corresponding ``project_id`` in"},{"line_number":185,"context_line":"the mapping. The query could filter on instance UUIDs, finding the ``user_id``"},{"line_number":186,"context_line":"values to populate in the mappings. This would implement the batched"},{"line_number":187,"context_line":"``nova-manage db online_data_migration`` way of doing the migration."},{"line_number":188,"context_line":""},{"line_number":189,"context_line":"We will also heal/populate an instance mapping on-the-fly when it is accessed"},{"line_number":190,"context_line":"during a server GET request. 
This would provide some data migration in the"}],"source_content_type":"text/x-rst","patch_set":7,"id":"ffd0ebdf_e3eb94ad","line":187,"range":{"start_line":187,"start_character":17,"end_line":187,"end_character":38},"updated":"2019-01-04 18:04:02.000000000","message":"nit: online_data_migrations (here and below)","commit_id":"9187328f1d419680565141ecf795530afddebf2c"},{"author":{"_account_id":6873,"name":"Matt Riedemann","email":"mriedem.os@gmail.com","username":"mriedem"},"change_message_id":"40a84eac8a456a9df72b4157b38b404b823e3a6b","unresolved":false,"context_lines":[{"line_number":198,"context_line":"the migration has not yet been run in order to fall back on the legacy counting"},{"line_number":199,"context_line":"method. We could have a check such as ``if count(InstanceMapping.id) where"},{"line_number":200,"context_line":"project_id\u003d\u003cproject id\u003e and user_id\u003dNone \u003e 0``, then fall back on the legacy"},{"line_number":201,"context_line":"counting method to query cell databases. 
We should cache the results of the"},{"line_number":202,"context_line":"each migration done success check by ``project_id`` so we avoid needlessly"},{"line_number":203,"context_line":"checking a ``project_id`` that has already been migrated every time quota is"},{"line_number":204,"context_line":"checked."},{"line_number":205,"context_line":""}],"source_content_type":"text/x-rst","patch_set":7,"id":"ffd0ebdf_63d7a4ef","line":202,"range":{"start_line":201,"start_character":69,"end_line":202,"end_character":33},"updated":"2019-01-04 18:04:02.000000000","message":"nit: this reads a bit confusingly - I know what you\u0027re trying to say though.","commit_id":"9187328f1d419680565141ecf795530afddebf2c"},{"author":{"_account_id":6873,"name":"Matt Riedemann","email":"mriedem.os@gmail.com","username":"mriedem"},"change_message_id":"40a84eac8a456a9df72b4157b38b404b823e3a6b","unresolved":false,"context_lines":[{"line_number":271,"context_line":""},{"line_number":272,"context_line":".. _documentation: https://docs.openstack.org/nova/latest/user/cellsv2-layout.html#quota-related-quirks"},{"line_number":273,"context_line":""},{"line_number":274,"context_line":"References"},{"line_number":275,"context_line":"\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d\u003d"},{"line_number":276,"context_line":""},{"line_number":277,"context_line":"This builds upon the work done in Pike to re-architect quotas to count"}],"source_content_type":"text/x-rst","patch_set":7,"id":"ffd0ebdf_634564b3","line":274,"updated":"2019-01-04 18:04:02.000000000","message":"Up to you if you want to mention it, but I think this still might inadvertently fix this bug:\n\nhttps://bugs.launchpad.net/nova/+bug/1716706\n\nI know Huawei public cloud had a hard time because of that one anyway (before pike, going over-quota would fail in the API because of the reservations table even if there are concurrent requests being made, but with pike if you\u0027re making concurrent requests the over-quota isn\u0027t 
caught until we check again in conductor, and then we put all of the instances in ERROR status which was a behavior change and caused some issues for our ops team, i.e. SLAs).","commit_id":"9187328f1d419680565141ecf795530afddebf2c"}]}
