)]}'
{"/COMMIT_MSG":[{"author":{"_account_id":15554,"name":"Bence Romsics","email":"bence.romsics@gmail.com","username":"ebenrom","status":"working for Ericsson, UTC+1 (+DST)"},"change_message_id":"d21bda2c1e2b90e5f03acb617d8ad2d8b335c30b","unresolved":true,"context_lines":[{"line_number":21,"context_line":"Each state file read/write is protected by a corresponding lock file."},{"line_number":22,"context_line":""},{"line_number":23,"context_line":"TODO review error handling"},{"line_number":24,"context_line":"TODO generic detection of multi-process g-api"},{"line_number":25,"context_line":"TODO delete all state and lock files at g-api startup"},{"line_number":26,"context_line":"TODO tests"},{"line_number":27,"context_line":""}],"source_content_type":"text/x-gerrit-commit-message","patch_set":3,"id":"139300cd_bcea2f24","line":24,"range":{"start_line":24,"start_character":0,"end_line":24,"end_character":2},"updated":"2024-04-15 14:53:38.000000000","message":"Please see my longer TODO comment in glance_store/common/attachment_state_manager.py","commit_id":"a4f2db84affe03df523d731811f08d04d75808a6"},{"author":{"_account_id":15554,"name":"Bence Romsics","email":"bence.romsics@gmail.com","username":"ebenrom","status":"working for Ericsson, UTC+1 (+DST)"},"change_message_id":"d21bda2c1e2b90e5f03acb617d8ad2d8b335c30b","unresolved":true,"context_lines":[{"line_number":22,"context_line":""},{"line_number":23,"context_line":"TODO review error handling"},{"line_number":24,"context_line":"TODO generic detection of multi-process g-api"},{"line_number":25,"context_line":"TODO delete all state and lock files at g-api startup"},{"line_number":26,"context_line":"TODO tests"},{"line_number":27,"context_line":""},{"line_number":28,"context_line":"Change-Id: I566412eb4d8d0268d73025cbc45c64a9c41bb762"}],"source_content_type":"text/x-gerrit-commit-message","patch_set":3,"id":"7e40f7b1_bf9d9e0f","line":25,"range":{"start_line":25,"start_character":0,"end_line":25,"end_character":2},"updated":"2024-04-15 14:53:38.000000000","message":"It is a problem if we leave out-of-date state files around after a crash. I believe it is safe to and we must delete all state files at g-api startup. But we cannot do that from glance_store. Where should we call the state file cleanup in glance? Here?\nhttps://opendev.org/openstack/glance/src/commit/0bcd6cd71c09917c6734421374fd598d73e8d0cc/glance/cmd/api.py#L101\n\nIs this enough? Or should we call the cleanup from another place too for when glance runs under wsgi? Where?\n\nI believe it\u0027s not a problem if we leave lock files around since all our locking works on open file descriptors. So if glance-api dies the locks are gone. But anyway, if I understand where to delete the state files from, deleting the lock files as well will be trivial.","commit_id":"a4f2db84affe03df523d731811f08d04d75808a6"},{"author":{"_account_id":9303,"name":"Abhishek Kekane","email":"akekane@redhat.com","username":"abhishekkekane"},"change_message_id":"bdb7639aa24ac399187dbb576755dca32c73eb42","unresolved":true,"context_lines":[{"line_number":22,"context_line":""},{"line_number":23,"context_line":"TODO review error handling"},{"line_number":24,"context_line":"TODO generic detection of multi-process g-api"},{"line_number":25,"context_line":"TODO delete all state and lock files at g-api startup"},{"line_number":26,"context_line":"TODO tests"},{"line_number":27,"context_line":""},{"line_number":28,"context_line":"Change-Id: I566412eb4d8d0268d73025cbc45c64a9c41bb762"}],"source_content_type":"text/x-gerrit-commit-message","patch_set":3,"id":"e1dd1f2b_6f99ebdb","line":25,"range":{"start_line":25,"start_character":0,"end_line":25,"end_character":2},"in_reply_to":"7e40f7b1_bf9d9e0f","updated":"2024-06-13 14:06:54.000000000","message":"You can refer, \n\nhttps://github.com/openstack/glance/blob/master/glance/common/wsgi.py#L473\n\nhttps://github.com/openstack/glance/blob/master/glance/common/wsgi_app.py#L157","commit_id":"a4f2db84affe03df523d731811f08d04d75808a6"}],"/PATCHSET_LEVEL":[{"author":{"_account_id":15554,"name":"Bence Romsics","email":"bence.romsics@gmail.com","username":"ebenrom","status":"working for Ericsson, UTC+1 (+DST)"},"change_message_id":"d21bda2c1e2b90e5f03acb617d8ad2d8b335c30b","unresolved":false,"context_lines":[],"source_content_type":"","patch_set":3,"id":"2bfc590e_5c0c2897","updated":"2024-04-15 14:53:38.000000000","message":"Notes:\n\n* There is an earlier attempt at fixing this bug, however that did not work for me in its current state and there\u0027s no progress on it for a long time:\nhttps://review.opendev.org/c/openstack/glance_store/+/850417\n\n* The two refactoring changes at the beginning of this change series are not necessary, however they hopefully clean this locking code base a tiny bit.","commit_id":"a4f2db84affe03df523d731811f08d04d75808a6"}],"glance_store/common/attachment_state_manager.py":[{"author":{"_account_id":4393,"name":"Dan Smith","email":"dms@danplanet.com","username":"danms"},"change_message_id":"11690d9cd1ed5ab8856f8c75a6bf49b4352c6666","unresolved":true,"context_lines":[{"line_number":138,"context_line":"    lock_path \u003d lockutils.get_lock_path(CONF) or \u0027/tmp\u0027"},{"line_number":139,"context_line":"    prefix \u003d \u0027volume-attachments-\u0027"},{"line_number":140,"context_line":""},{"line_number":141,"context_line":"    class _Attachment(object):"},{"line_number":142,"context_line":"        # A single multiattach volume, and the set of attachments in use"},{"line_number":143,"context_line":"        # on it."},{"line_number":144,"context_line":"        def __init__(self, volume):"}],"source_content_type":"text/x-python","patch_set":4,"id":"8bc18e5d_b91b58a1","line":141,"updated":"2024-06-13 16:20:30.000000000","message":"So I think maybe this legacy design is the wrong way to do this. I can see that you\u0027re trying to evolve this model that doesn\u0027t work without shared memory into one that does, but ... can we just rethink the way this works entirely?\n\nCan we make the algorithm something more like this:\n1. Acquire external lock\n2. Try to attach, if not already\n3. Drop lock\n\nBasically, avoid keeping our own state (which has to be synchronized across workers) and instead rely on the *actual* state. Step 2 above could be \"try to attach and ignore if the attach fails because it\u0027s already attached\" or \"check to see if it\u0027s already attached in some way\".\n\nThe state file you\u0027re using here is, I think, begging to get out of sync with the state of the local system, the set of attachments that cinder knows about, etc. I assume that\u0027s why you\u0027re asking about deleting them all on glance startup, but I think that sort of demonstrates why it\u0027s problematic.\n\nCan you think of a way we can implement this without keeping the state at all? Can we list attachments on the volume, check to see if one matches us, and if not, create one for our step 2 above?","commit_id":"11d9005f7dba1f9e705106c315e974522575c0f5"},{"author":{"_account_id":15554,"name":"Bence Romsics","email":"bence.romsics@gmail.com","username":"ebenrom","status":"working for Ericsson, UTC+1 (+DST)"},"change_message_id":"fa46aee7f2d3316ee335d11e03f300996a707d64","unresolved":true,"context_lines":[{"line_number":138,"context_line":"    lock_path \u003d lockutils.get_lock_path(CONF) or \u0027/tmp\u0027"},{"line_number":139,"context_line":"    prefix \u003d \u0027volume-attachments-\u0027"},{"line_number":140,"context_line":""},{"line_number":141,"context_line":"    class _Attachment(object):"},{"line_number":142,"context_line":"        # A single multiattach volume, and the set of attachments in use"},{"line_number":143,"context_line":"        # on it."},{"line_number":144,"context_line":"        def __init__(self, volume):"}],"source_content_type":"text/x-python","patch_set":4,"id":"c48460c4_b109de6c","line":141,"in_reply_to":"8bc18e5d_b91b58a1","updated":"2024-09-25 13:57:22.000000000","message":"Sorry for taking so long to get back to this.\n\nI don\u0027t see a way to avoid keeping some kind of state information. Here\u0027s why. In general (for attaching and detaching) we need two pieces of information:\n\n1) Is a particular volume attached or not?\n2) Is there another glance-api process that uses this volume attachment?\n\nThe first one is perfectly answerable without keeping state files around. The real problem presents itself when detaching a volume. To answer the second we must have all glance-api processes collaboratively manage this information, most likely at a shared place, like a common file system or db.\n\nThree options come to my mind where the who-needs-which-attachment information could be stored:\n\na) a shared file system (since we are talking about glance-api processes of the same host, this is just the host file system)\nb) the glance db\nc) the cinder db (via the cinder api)\n\nSince the cinder api already tracks some information about who needs what, it\u0027s tempting to pick that. But when it comes to handling the case when the state information gets out of sync with reality, I believe the shared file system clearly wins because that\u0027s the closest to the glance-api processes actually using the attachments. The further we go, the more failure cases can lead to out-of-sync state information. And the db is clearly further away than the host file system.\n\nI considered temporarily a fourth option, the iscsi layer. However I found that iscsiadm is not willing to keep track of multiple attachments. Plus we need a solution that is not backend specific, so even if iscsi could do this, another backend could be different.\n\nAll together I believe we must tackle the case that the who-needs-what information must be stored. I believe the host filesystem to be the best option for this. This information can get out of sync, therefore it will get out of sync and we must handle that case by re-syncing.\n\nI see two options for re-sync:\n\n1) At glance-api startup all state information can be discarded, because glance-api is the only user for the volume attachments and a freshly started glance-api cannot use any attachments yet.\n\nI believe the first re-sync option in itself to be sufficient.\n\n2) However if desired we could also refresh the who-needs-what information by storing the PIDs of the glance-api processes for each attached volume, e.g.:\n\n/var/lib/glance/multiattach/VOLUME-ID/PID\n\n(After thinking about this topic more I find this state format better than the json blob dump currently present in this patch.)\n\nAnd then actively checking for process liveness every time when the outermost lock is acquired.\n\nWhat do you think?","commit_id":"11d9005f7dba1f9e705106c315e974522575c0f5"}]}
