)]}'
{"/COMMIT_MSG":[{"author":{"_account_id":4393,"name":"Dan Smith","email":"dms@danplanet.com","username":"danms"},"change_message_id":"bfcb51416029c11f1436c8f8d9c16d63563629b7","unresolved":true,"context_lines":[{"line_number":26,"context_line":"[1] https://review.opendev.org/c/openstack/tempest/+/949595"},{"line_number":27,"context_line":"[2] https://github.com/openstack/glance/blob/1d8bc7d17ccf16bc0b66b14c54e60fd1b152d471/glance/async_/flows/location_import.py#L318"},{"line_number":28,"context_line":""},{"line_number":29,"context_line":"Closes-Bug: #949595"},{"line_number":30,"context_line":"Change-Id: I21c04d73f58f45f74202da64ec0244fa0ced518b"}],"source_content_type":"text/x-gerrit-commit-message","patch_set":2,"id":"2dd6e689_60cd24ca","line":29,"updated":"2025-05-19 16:00:03.000000000","message":"FYI, this is not the right bug","commit_id":"fb55edcfc77b5423f9b2a1925d727ed2c015617b"}],"/PATCHSET_LEVEL":[{"author":{"_account_id":4393,"name":"Dan Smith","email":"dms@danplanet.com","username":"danms"},"change_message_id":"86f6866c7f928e2b564adc68863e1fff0a2cd4af","unresolved":false,"context_lines":[],"source_content_type":"","patch_set":2,"id":"2b9aaffe_19461416","updated":"2025-05-16 16:46:03.000000000","message":"Disabling this by default seems like a really big and unfortunate hammer to me.\n\nI assume the options are:\n\n1. Disable delete while the hashing is ongoing\n2. Allow delete and let the rbd resource be lazily removed later\n3. Try to abort the hash in some way\n\nFor 2-3, we could have the hash loop refresh the Image resource from the database every N seconds and abort if the status has gone to \"deleted\" (does glance have \"deleting\"?)","commit_id":"fb55edcfc77b5423f9b2a1925d727ed2c015617b"},{"author":{"_account_id":5314,"name":"Brian Rosmaita","email":"rosmaita.fossdev@gmail.com","username":"brian-rosmaita"},"change_message_id":"ba3af1ae1c35504b03bb99957e5a2fe1bb1324bc","unresolved":false,"context_lines":[],"source_content_type":"","patch_set":2,"id":"5877b2f4_866e9ec2","updated":"2025-05-19 15:51:57.000000000","message":"I think instead of this patch, why don\u0027t we revert the nova change that uses the location API until we have figured out the implications of this issue?","commit_id":"fb55edcfc77b5423f9b2a1925d727ed2c015617b"},{"author":{"_account_id":5314,"name":"Brian Rosmaita","email":"rosmaita.fossdev@gmail.com","username":"brian-rosmaita"},"change_message_id":"ac4269dad8f8521dad1704ca173ed6fc4aa9c3e3","unresolved":false,"context_lines":[],"source_content_type":"","patch_set":2,"id":"4255f42d_4bc372e8","updated":"2025-05-20 15:34:18.000000000","message":"One of the things we need to sort out are the API semantics.  What I mean is, when an image is \u0027active\u0027, then it\u0027s ready to be consumed.  Before the do_secure_hash option, the nova direct-to-ceph snapshots were considered immediately available even though they didn\u0027t have (and would never have) a checksum/hash.  But with do_secure_hash\u003dtrue, I think that we should not set the image \u0027active\u0027 until the hash computation has completed and the os_hash_value is available on the image record.\n\nI don\u0027t think we can tell people to poll for os_hash_value to be populated because legacy images won\u0027t have it.  In other words, there\u0027s no way for a consumer to tell if an \u0027active\u0027 image will have an os_hash_value in a few minutes, or if it will never have it.  I think we need to keep the semantics of \u0027active\u0027 to mean \u0027as ready for consumption as it will ever be\u0027.\n\nAll this is a long-winded way to say that in addition to figuring out the deletion business, we need to reconsider when the image should go \u0027active\u0027.","commit_id":"fb55edcfc77b5423f9b2a1925d727ed2c015617b"},{"author":{"_account_id":9303,"name":"Abhishek Kekane","email":"akekane@redhat.com","username":"abhishekkekane"},"change_message_id":"d74f1ef44a8a9cc5b1c22f7d974913e79466b70e","unresolved":false,"context_lines":[],"source_content_type":"","patch_set":2,"id":"c5096c49_6c4b8135","updated":"2025-05-14 07:35:34.000000000","message":"You need to change the documentation as well.","commit_id":"fb55edcfc77b5423f9b2a1925d727ed2c015617b"},{"author":{"_account_id":9303,"name":"Abhishek Kekane","email":"akekane@redhat.com","username":"abhishekkekane"},"change_message_id":"7ab42ee8be52aa292f02e6ee451085ad6f680947","unresolved":false,"context_lines":[],"source_content_type":"","patch_set":2,"id":"cddc3ff0_9b2b08f1","updated":"2025-05-14 07:37:04.000000000","message":"sorry, I missed your other patch!!","commit_id":"fb55edcfc77b5423f9b2a1925d727ed2c015617b"},{"author":{"_account_id":9303,"name":"Abhishek Kekane","email":"akekane@redhat.com","username":"abhishekkekane"},"change_message_id":"9035543c618f4e57a1c662f7d151ce670e9af894","unresolved":false,"context_lines":[],"source_content_type":"","patch_set":2,"id":"8e005ad4_cf955c4a","in_reply_to":"2b9aaffe_19461416","updated":"2025-05-16 17:03:28.000000000","message":"Option 2 is not possible due to already reported ceph bug [1]\nIt does not allows image to delete as rbd throws InUseByStore exception and actually it deletes the data from the backend.\n\nOption will be backward incompatible and need fix in tempest/or consumers that if hash calculation is in progress then delete will raise conflict rather than success.\n\nFor option 3 we don\u0027t have deleting state in glance, so aborting hash is not possible (?) as image state never changes in this case to deleted and it remains in active state as explained in [1] \n\n[1] https://bugs.launchpad.net/glance/+bug/2045769","commit_id":"fb55edcfc77b5423f9b2a1925d727ed2c015617b"},{"author":{"_account_id":9303,"name":"Abhishek Kekane","email":"akekane@redhat.com","username":"abhishekkekane"},"change_message_id":"3e5b6db7bb5078b29f8c699bc110bf6647a5bcdb","unresolved":false,"context_lines":[],"source_content_type":"","patch_set":2,"id":"cd481af5_f9a8deeb","in_reply_to":"4255f42d_4bc372e8","updated":"2025-05-20 15:41:02.000000000","message":"IF do_secure_hash is True we do not set image to active unless hash calculation is completed. The problem here is to delete the image we don\u0027t need image to be in active state.","commit_id":"fb55edcfc77b5423f9b2a1925d727ed2c015617b"},{"author":{"_account_id":9303,"name":"Abhishek Kekane","email":"akekane@redhat.com","username":"abhishekkekane"},"change_message_id":"37d46b6c880b78624233f3b99a61a6fea97e64f1","unresolved":false,"context_lines":[],"source_content_type":"","patch_set":2,"id":"2c61d279_b5e2d3e6","in_reply_to":"487b7d5b_c5019263","updated":"2025-05-20 15:31:34.000000000","message":"Either way we need a change in glance_store and glance side as well, I think rather than adding complex of passing snapshot data we can easily move the image to trash if hash calculation is in progress. This is my opinion though, I am open to follow any suggestion!!!","commit_id":"fb55edcfc77b5423f9b2a1925d727ed2c015617b"},{"author":{"_account_id":27615,"name":"Rajat Dhasmana","email":"rajatdhasmana@gmail.com","username":"whoami-rajat"},"change_message_id":"af2d86b714460fbdaf69cdc94ee1624b04ffbc34","unresolved":false,"context_lines":[],"source_content_type":"","patch_set":2,"id":"908ef0c7_88f186c6","in_reply_to":"487b7d5b_c5019263","updated":"2025-05-20 15:36:10.000000000","message":"I might have the most unpopular opinion here but we are missing the actual use case of the new location API feature, which is to resolve OSSN-0090 and OSSN-0065.\nThe hash calculation was a ADD ON feature, which is good to have to secure images, but was not the primary focus of all the work we did.\nI understand that we will need a redesign and there are a lot of good suggestions, but practically, it might take a cycle or two to discuss/code/merge those changes.\n\nI feel we should move ahead with the current design by disabling hashing which preserves the backward compatibility of making the images accessible instantly without worrying about any operation failing due to an RBD handle actively reading from the image but that\u0027s just me i guess.","commit_id":"fb55edcfc77b5423f9b2a1925d727ed2c015617b"},{"author":{"_account_id":9303,"name":"Abhishek Kekane","email":"akekane@redhat.com","username":"abhishekkekane"},"change_message_id":"46357e66320f98ffeedd804bfcea1a556f9b89aa","unresolved":false,"context_lines":[],"source_content_type":"","patch_set":2,"id":"81cdfb3e_4e6e821d","in_reply_to":"5877b2f4_866e9ec2","updated":"2025-05-19 16:01:45.000000000","message":"+1 on reverting nova patch!\n\nI think I am not able to explain what actual issue is since pranali was the one who working on it. Let me describe it by drafting actual request flow:\n\n1. Nova snapshot with new location API triggers hash calculation process\n2. While hash calculation is in process, cleanup call for deleting the snapshot throws InUseByStore exception preventing glance to mark image as deleted BUT data is deleted from the RBD backend (So we cannot use RBD trash here as there is no data present in RBD)\n3. This leads glance to raise 409 conflict causing the tempest tests to fail.\n\nWhat we can do is in RBD store before exception is raised [1], we can check whether image is still present or not and return response accordingly?\n\n[1] https://github.com/openstack/glance_store/blob/master/glance_store/_drivers/rbd.py#L487","commit_id":"fb55edcfc77b5423f9b2a1925d727ed2c015617b"},{"author":{"_account_id":4393,"name":"Dan Smith","email":"dms@danplanet.com","username":"danms"},"change_message_id":"bfcb51416029c11f1436c8f8d9c16d63563629b7","unresolved":false,"context_lines":[],"source_content_type":"","patch_set":2,"id":"4859db44_99de83fa","in_reply_to":"5877b2f4_866e9ec2","updated":"2025-05-19 16:00:03.000000000","message":"Yep, agreed:\n\nhttps://review.opendev.org/c/openstack/nova/+/950336","commit_id":"fb55edcfc77b5423f9b2a1925d727ed2c015617b"},{"author":{"_account_id":9303,"name":"Abhishek Kekane","email":"akekane@redhat.com","username":"abhishekkekane"},"change_message_id":"b511c7a1d25626efae9817cc760cb62baa7935b9","unresolved":false,"context_lines":[],"source_content_type":"","patch_set":2,"id":"783ca1e3_c01059ce","in_reply_to":"5884f847_5285f1a1","updated":"2025-05-20 15:10:05.000000000","message":"One more thing we can do is, use lazy deleting of the image at the rbd side. In glance we set property os_hash_algo to the image before actual hash calculation starts. At store side in delete image call we can check that is os_hash_value is set and os_hash_value is None (which means hash calculation is in progress) then we can move that image to trash instead of deleting it, this will allow us to mark image as deleted in glance.\n\nAny suggestions?","commit_id":"fb55edcfc77b5423f9b2a1925d727ed2c015617b"},{"author":{"_account_id":9303,"name":"Abhishek Kekane","email":"akekane@redhat.com","username":"abhishekkekane"},"change_message_id":"93b3c882489c2e9f4216ac26fdae0859f38919cd","unresolved":false,"context_lines":[],"source_content_type":"","patch_set":2,"id":"fc672d33_b48593bb","in_reply_to":"783ca1e3_c01059ce","updated":"2025-05-20 15:15:07.000000000","message":"We need to add new RBD method at store side and do this check at glance side as we don\u0027t pass image object to store so it is not possible to check os_hash_value and os_hash_algo at store side.","commit_id":"fb55edcfc77b5423f9b2a1925d727ed2c015617b"},{"author":{"_account_id":4393,"name":"Dan Smith","email":"dms@danplanet.com","username":"danms"},"change_message_id":"0cddaebe2e6f84f8792e3a614c7cfb686b9a5152","unresolved":false,"context_lines":[],"source_content_type":"","patch_set":2,"id":"5884f847_5285f1a1","in_reply_to":"81cdfb3e_4e6e821d","updated":"2025-05-19 16:07:37.000000000","message":"\u003e I think I am not able to explain what actual issue is since pranali was the one who working on it. Let me describe it by drafting actual request flow:\n\u003e \n\u003e 1. Nova snapshot with new location API triggers hash calculation process\n\u003e 2. While hash calculation is in process, cleanup call for deleting the snapshot throws InUseByStore exception preventing glance to mark image as deleted BUT data is deleted from the RBD backend (So we cannot use RBD trash here as there is no data present in RBD)\n\u003e 3. This leads glance to raise 409 conflict causing the tempest tests to fail.\n\nThis is already my understanding of the problem, yes.\n\n\u003e What we can do is in RBD store before exception is raised [1], we can check whether image is still present or not and return response accordingly?\n\nI don\u0027t understand what you mean by this suggestion.","commit_id":"fb55edcfc77b5423f9b2a1925d727ed2c015617b"},{"author":{"_account_id":4393,"name":"Dan Smith","email":"dms@danplanet.com","username":"danms"},"change_message_id":"1e4e1128299003546ce18cb09f56d7381950dc9c","unresolved":false,"context_lines":[],"source_content_type":"","patch_set":2,"id":"1f270d42_b4d37985","in_reply_to":"8e005ad4_cf955c4a","updated":"2025-05-19 15:22:02.000000000","message":"\u003e Option 2 is not possible due to already reported ceph bug [1]\n\u003e It does not allows image to delete as rbd throws InUseByStore exception and actually it deletes the data from the backend.\n\nI don\u0027t see why.. We don\u0027t have to delete it in the ceph backend to mark it as deleted in glance if we\u0027re waiting.\n\nAlso, isn\u0027t there some lazy delete option? I was sure Eric had pointed out something like that for the usual case of deleting a base image before any clones were done using it.\n\n\u003e Option will be backward incompatible and need fix in tempest/or consumers that if hash calculation is in progress then delete will raise conflict rather than success.\n\nAssume you meant option 1? Sure, that\u0027s true, but it\u0027s also maybe more correct than what we have now. Since glance doesn\u0027t support microversions it\u0027s hard to iterate the API behavior at all.\n\n\u003e For option 3 we don\u0027t have deleting state in glance, so aborting hash is not possible (?) as image state never changes in this case to deleted and it remains in active state as explained in [1] \n\nAs I mentioned, just marking it as deleted in glance so that the hash loop will notice, stop, and clean up is one way to handle it. We also have the ability to call to the worker doing the hashing (if we record, or start recording) which one is doing it.\n\nEither way, just disabling hashing by default does not seem like a reasonable solution to this problem, especially without a plan for what we\u0027re going to do to get back to being able to have it enabled. So I\u0027m -1 until that point. If other cores feel strongly that this is the right course of action, then feel free to merge with my -1 in place.\n\nAlso, sounds like we should revert the nova patch to use the new location API for the time being so nova doesn\u0027t trigger this for people?","commit_id":"fb55edcfc77b5423f9b2a1925d727ed2c015617b"},{"author":{"_account_id":9303,"name":"Abhishek Kekane","email":"akekane@redhat.com","username":"abhishekkekane"},"change_message_id":"decc581f12daa5bd01389729872a609c3c06a62d","unresolved":false,"context_lines":[],"source_content_type":"","patch_set":2,"id":"bfa729a8_14659c87","in_reply_to":"908ef0c7_88f186c6","updated":"2025-05-20 15:42:50.000000000","message":"\u003e I might have the most unpopular opinion here but we are missing the actual use case of the new location API feature, which is to resolve OSSN-0090 and OSSN-0065.\n\u003e The hash calculation was a ADD ON feature, which is good to have to secure images, but was not the primary focus of all the work we did.\n\u003e I understand that we will need a redesign and there are a lot of good suggestions, but practically, it might take a cycle or two to discuss/code/merge those changes.\n\u003e \n\u003e I feel we should move ahead with the current design by disabling hashing which preserves the backward compatibility of making the images accessible instantly without worrying about any operation failing due to an RBD handle actively reading from the image but that\u0027s just me i guess.\n\nWhat is wrong in setting it to False in job definition rather than flipping the default value of option? Or a devstack change to configure do_secure_hash with disabled by default?","commit_id":"fb55edcfc77b5423f9b2a1925d727ed2c015617b"},{"author":{"_account_id":4393,"name":"Dan Smith","email":"dms@danplanet.com","username":"danms"},"change_message_id":"c24a3258bbc228733b1a8ab9f0ccd459b5bb8a7f","unresolved":false,"context_lines":[],"source_content_type":"","patch_set":2,"id":"c6f715e1_2b0c8291","in_reply_to":"bfa729a8_14659c87","updated":"2025-05-20 15:54:27.000000000","message":"The thing wrong is that we\u0027ll be testing a config that isn\u0027t the default and isn\u0027t how we expect people to run. If we had done that ahead of time, we wouldn\u0027t have realized this was fatally flawed.\n\nAnd no, I don\u0027t think we should proceed with the defaults change just to make this go away. The hashing was an important part of getting the feature settled and I think its sort of crazy to be even considering not having stable hashes, image sizes, etc type things for things as important as OS base images in 2025.","commit_id":"fb55edcfc77b5423f9b2a1925d727ed2c015617b"},{"author":{"_account_id":27615,"name":"Rajat Dhasmana","email":"rajatdhasmana@gmail.com","username":"whoami-rajat"},"change_message_id":"09e7caaca5642412f32018693a2927e51e99edb9","unresolved":false,"context_lines":[],"source_content_type":"","patch_set":2,"id":"3f9d0466_28ed5510","in_reply_to":"c6f715e1_2b0c8291","updated":"2025-05-20 16:11:27.000000000","message":"\u003e The thing wrong is that we\u0027ll be testing a config that isn\u0027t the default and isn\u0027t how we expect people to run. If we had done that ahead of time, we wouldn\u0027t have realized this was fatally flawed.\n\nI understand that for the past 2 releases we have the default for hashing as True but no consumer is using it yet and we can document this with a warning. It shouldn\u0027t be too hard for deployment tools to disable it (at least for the ones I\u0027m aware of).\n\n\u003e And no, I don\u0027t think we should proceed with the defaults change just to make this go away. The hashing was an important part of getting the feature settled and I think its sort of crazy to be even considering not having stable hashes, image sizes, etc type things for things as important as OS base images in 2025.\n\nTo me, the root cause of the problems is when we wanted to club too many features,  while preserving the backward compatibility, into an API whose intent was just to provide a secure way to add a location in glance. Hashing has been a trade off for the optimizations we offer and end users still enable them for the performance benefit they get.\nI\u0027m also one of the contributors to blame for the poor design but when i started working on it, I had a different problem to solve in my mind than what we are discussing here.\nBut i will go with the consensus that team wants to handle this in a better way and will abandon the patch.","commit_id":"fb55edcfc77b5423f9b2a1925d727ed2c015617b"},{"author":{"_account_id":5314,"name":"Brian Rosmaita","email":"rosmaita.fossdev@gmail.com","username":"brian-rosmaita"},"change_message_id":"1cf9df951ac1f8fd8bff9749d7ad6e1a92ec3504","unresolved":false,"context_lines":[],"source_content_type":"","patch_set":2,"id":"487b7d5b_c5019263","in_reply_to":"fc672d33_b48593bb","updated":"2025-05-20 15:24:12.000000000","message":"In this scenario, the image data is already in Ceph and glance is using it to compute the hash.  Instead of handing glance the Ceph image to use, could we give glance a (Ceph) snapshot of the data?  The advantage would be that our current rbd code with the clone-v2 stuff would allow the glance image to be deleted even while the (Ceph) snapshot was in use, plus, when glance is done with the computation, it could just delete the snapshot unconditionally (so we don\u0027t have to remember to clear stuff up).","commit_id":"fb55edcfc77b5423f9b2a1925d727ed2c015617b"}]}
