Implement rmdir method for HNS buckets#728
Conversation
|
Question: since HNS buckets have a true folder structure, it should be possible to ammend the directory listings cache rather than simply invalidate it when deleting, no? That could avoid unnecessary listings later on. |
|
/gcbrun |
63e15da to
10c8424
Compare
@martindurant Invalidating the cache for a specific path was also invalidating the cache for all parent directories in the hierarchy. I have updated the logic to remove the cache entry only for the deleted directory and to update its immediate parent to remove the deleted directory's entry. |
8a74f38 to
0e9718c
Compare
|
/gcbrun |
2 similar comments
|
/gcbrun |
|
/gcbrun |
@martindurant Can you please review the latest changes following the cache update logic? |
eb102df to
b1f396c
Compare
| for i, entry in enumerate(self.dircache[parent]): | ||
| if entry.get("name") == path: | ||
| self.dircache[parent].pop(i) | ||
| break |
There was a problem hiding this comment.
This ought to be correct, but if some some reason the entry is duplicated in the parent, you would end up removing the wrong thing. I think a comprehension would be clearer:
| for i, entry in enumerate(self.dircache[parent]): | |
| if entry.get("name") == path: | |
| self.dircache[parent].pop(i) | |
| break | |
| self.dircache[parent] = [ent for ent in self.dircache[parent] if ent["name"] != path] |
There was a problem hiding this comment.
Thanks for suggestion, Initially I planned something similar to your suggestion but wanted to skip iterating over entire list if we have already removed the path from parent cache to optimize the code(might not give much performance difference though).
you would end up removing the wrong thing
Just to make sure I understand this correctly, were you referring that we will not be removing duplicate entry if it exists?
There was a problem hiding this comment.
No: beause you are mutating the same thing you are iterating over, in the case of there being more than one item to remove, the index i will not be correct for the second one.
There was a problem hiding this comment.
in the case of there being more than one item to remove, the index i will not be correct for the second one.
In the original implementation, the break statement is used which exits the loop immediately after the first matching item is found and removed with .pop(i). So we won't be removing any unintended entry from the cache but this implementation won't remove any duplicates if they exist.
Changing it to list comprehension is better to make the code more robust to correctly handle potential duplicates, updated the code.
There was a problem hiding this comment.
Sorry, you are right: pop() would not be called moe than once. Still, this is better.
|
@martindurant @ankitaluthra1 Can we add /gcbrun to run the tests against real GCS and merge this PR if changes look good? |
|
/gcbrun |
|
(I don't know if I can actually activate it! I don't seem to have access to the run logs) |
You have access, tests were triggered after your /gcbrun comment. We have granted access to run the CI pipeline only to Ankita and you (specifically, anyone with either the Owner or Collaborator role). You will be able to see the logs in the checks tab. |
7a247dc to
81b072f
Compare
|
/gcbrun |
rmdir method provides an implementation for deletion of empty directories for Hierarchical Namespace (HNS) enabled buckets.