Skip to content

Conversation

@ijackson
Copy link
Contributor

@ijackson ijackson commented Jul 24, 2020

Hi.

Apropos of rust-lang/rustup#2433 @rbtcollins suggested putting the remove_dir_contents function here in remove_dir_all.

I think I have done the right thing in the Windows code in fs.rs but I have not been able to test, or even compile, the Windows code at all because I have no Windows installs. Edit: the CI tests seem to pass so I think it is probably right...

I did provide a test case for my new code.

I hope this meets with your approval. Let me know what you think.

Regards,
Ian.

@ijackson ijackson force-pushed the remove-dir-contents branch 2 times, most recently from 3090e62 to 8cbb545 Compare July 24, 2020 23:07
@ijackson
Copy link
Contributor Author

Looks like I am going to debug this via the CI. I hope this is not too annoying to you.

@ijackson
Copy link
Contributor Author

Second time lucky I guess. Out of interest are any of those CI tests on Unix? Perhaps not, but if you accept this MR then maybe there should be since you will have nontrivial Unix and allegedly-portable code.

Anyway, thanks for your attention and I look forward to your review.

@XAMPPRocky
Copy link
Owner

Thank you for your PR! This PR looks good to me. Currently the CI only tests windows, and I think I'd want to have some testing on unix before merging this. If you'd like I'd accept adding a new job to the CI that tests unix in this PR? It should be fairly straightforward, otherwise I can write the CI config in a bit.

@ijackson
Copy link
Contributor Author

ijackson commented Jul 25, 2020 via email

@XAMPPRocky
Copy link
Owner

XAMPPRocky commented Jul 26, 2020

FWIW GitHub actions is pretty easy to use especially if you use the GitHub online editor which has auto-completion (See 775feb1). I've added the unix CI to master so you should just need to rebase the changes.

@ijackson ijackson force-pushed the remove-dir-contents branch from 8cbb545 to 30791c3 Compare August 2, 2020 14:15
@ijackson
Copy link
Contributor Author

ijackson commented Aug 2, 2020 via email

Copy link
Collaborator

@rbtcollins rbtcollins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for doing this, and I'm sorry I'm so late getting to a review for you.

src/portable.rs Outdated
/// If `dir_path` is a symlink to a directory, deletes the contents
/// of that directory. If `dir_path` does not exist, simply returns
/// (without deleting anything).
pub fn remove_dir_contents(dir_path: &Path) -> Result<(), io::Error> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see why you've written this new function, but I'd like to suggest a different implementation - bearing in mind some future optimisations I have planned. In particular, I want to avoid this serial loop over the read_dir contents at the top layer.

My suggestion is that rather than having allow_nondirectory / directories_only, your new flag to the core workhorse is delete_root; and that the second entry point to it be remove_dir_contents; then the only change needed is to move the remove_item(path, ctx) call from remove_dir_all_recursive to the caller of it, which will allow it to be skipped for the root parent case.

e.g.

 ctx.readonly = child.metadata()?.permissions().readonly();
        if child_type.is_dir() {
            remove_dir_all_recursive(&child.path(), ctx)?;
+          remove_item(&child.path(), ctx)
        } else {
            remove_item(&child.path().as_ref(), ctx)?;
        }
    }
    ctx.readonly = dir_readonly;
-    remove_item(path, ctx)

And call it from the root case as well.

This of course leaves the posix case needing its own implementation, but I think it will be very shallow at that point.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked at master, and yes, a different implementation strategy was sensible. I found it better to split the function rather than using a boolean flag - I prefer to encode things in control flow rather than flag booleans, as that makes analysis (eg type checks) easier. I hope you like what I have done.

@ijackson
Copy link
Contributor Author

ijackson commented Aug 8, 2020 via email

@rbtcollins
Copy link
Collaborator

rbtcollins commented Aug 11, 2020

Unfortunately doing that will make it impossible to distinguish "the top level directory did not exist" from "during the innards of attempting to recursively delete something we unexpectedly got ENOENT". I expect the current code just bubbles that unexpected ENOENT up to its caller (although I haven't checked). I have a counterproposal: leave this entrypoint as I had it, but also provide:

The current code also doesn't discriminate between ENOENT of the top level not existing and an ENOENT of a file that is being deleted - and actually thats a bug; it should handle ENOENT of a file being deleted by ignoring that error - it simply indicates that the job it is tasked with has been made easier, after all. There is some handling of related cases with hard links in fact,

So while I agree with the defect today, I think we should fix it, which would remove the issue.

/// Ensures that dir_path is an empty directory, or a symlink to
/// one. If it already exists, its contents are deleted (as if
/// with remove_dir_contents. If it does not, it is created
/// (as if with std::fs:create_dir.) fn ensure_empty_dir(...) This is surely what many callers
/// (including rustup) actually want. I think the obvious implementation is mkdir, followed
/// if necessary by remove_dir_contents.

I think that that would be nice to offer as well.

In src/portable.rs:
...
I can see why you've written this new function, but I'd like to suggest a different implementation - bearing in mind some future optimisations I have planned. In particular, I want to avoid this serial loop over the read_dir contents at the top layer. My suggestion is that rather than having allow_nondirectory / directories_only, your new flag to the core workhorse is delete_root; and that the second entry point to it be remove_dir_contents; then the only change needed is to move the remove_item(path, ctx) call from remove_dir_all_recursive to the caller of it, which will allow it to be skipped for the root parent case.
I'll have look at the code.
This of course leaves the posix case needing its own implementation, but I think it will be very shallow at that point.
The POSIX case can just do the loop, I think ? Ian.

Yes; the IO latency behaviour of POSIX platforms is generally much less roundabout than Windows.

-Rob

@ijackson
Copy link
Contributor Author

I'm afraid I am going to have to iterate via the CI since I do not have access to any Windows machines. Please bear with me.

@ijackson ijackson force-pushed the remove-dir-contents branch 2 times, most recently from d80d1db to 3ee4a7d Compare September 21, 2020 17:33
@rbtcollins
Copy link
Collaborator

rbtcollins commented Sep 21, 2020 via email

@ijackson
Copy link
Contributor Author

People complain about rustc being slow but they should try the compiler error whack-a-mole this way!

@ijackson
Copy link
Contributor Author

I don't think I want to think about tempdir and Windows and multithreading....

Anyway, looks like this might be going to pass its checks now. Thanks for the review etc.

@rbtcollins
Copy link
Collaborator

rbtcollins commented Sep 21, 2020 via email

@ijackson
Copy link
Contributor Author

I'm hoping this will be a rare problem for me :-).

@ijackson
Copy link
Contributor Author

Hi. I hope this is ready to merge now?

@rbtcollins
Copy link
Collaborator

I haven't had time to review sorry; moving across the world consumed all my recent bandwidth. Just coming out of that now.

README.md Outdated
Also provides `remove_dir_contents` for both Windows and Unix. This
removes just the contents of the directory, if it exists.

And provides `ensure_e pty_dir` for both Windows and Unix. This
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a typo here.

}

/// Makes `path` an empty directory: if it does not exist, it is
/// created it as an empty directory (as if with
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's some grammar issue here - "it is created it as an"

src/portable.rs Outdated
/// `std::fs::create_dir`); if it does exist, its contents are
/// deleted (as if with `remove_dir_contents`).
///
/// It is an error of `path` exists but is not a directory (or
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

of instead of if?

@rbtcollins
Copy link
Collaborator

It's certainly a lot cleaner now. Which of the new APIs do you anticipate rustup using? Do we actually need the other one? ensure_empty_dir kindof implies the existence of an ensure_empty_dir_all call as well perhaps, though if people are deleting all of ~/.rustup and expecting rustup to be ok with that .. well, thats a different discussion.

@ijackson
Copy link
Contributor Author

Thanks for the review. I'll fix the typos and grammar.

It's certainly a lot cleaner now. Which of the new APIs do you anticipate rustup using?

I was imaginging replacing the deletion call at the end of rustup, with remove_dir_contents, and not touching rustup's directory creation code. Correct me if I'm wrong, but presumably rustup already does not mind finding leftover stuff from a previous run in its temporary directories. Otherwise it would sometimes get into a stuck state.

Do we actually need the other one? ensure_empty_dir kind of implies the existence of an ensure_empty_dir_all

I think something like ensure_empty_dir or ensure_empty_dir_all would be extremely useful for tests, which often want to make working files, but want to leave the test artefacts lying about for diagnosis. (There is a can of worms here for rust tests because of cargo's failure to provide a formal place to put such things, but let's leave that for anoyther day.)

If you prefer not to take the new ensure_empty_dir call now then that's fine by me. But it seems that if someone wanted such a thing in their own project, I might well think this crate a good place for it to live. The same applies to ensure_empty_dir_all; so, alternatively, if you want me to write that one too I'm happy to do that.

@ijackson
Copy link
Contributor Author

Hi. Did you have an opinion about ensure_empty_dir etc. ?

@ijackson
Copy link
Contributor Author

@rbtcollins Hi, what would you like to do here? Thanks, Ian.

@rbtcollins
Copy link
Collaborator

rbtcollins commented Oct 29, 2020 via email

This splits remove_dir_all into two: one part is the thin wrapper for
_delete_dir_contents, and the other part is responsible for deleting
the top-level directory.

We are going to expose a function for only deleting the contents, and
this will be its primary building block.

I provided two entrypoints, one of which wraps the other, rather than
a boolean and a conditional, because IMO it mekes the code clearer.
(It would also have been possible to duplicate the canonicalize call,
but I felt that was worse.)

There is no functional change yet.

Signed-off-by: Ian Jackson <[email protected]>
The implementation is relatively straightforward.  It's a shame that
we need to reach for libc to get the error handling right!

The return type is different to that of Windows's fs:_remove_dir_all.
Unix does not need to canonicalize the path, so doesn't return
a canonicalised path.

This type difference is OK provided both versions typecheck at the
call site, which will appear in a moment.

Right now this code is not used - the caller will appear in the next
commit - so we must temporarily add `#[allow]`.

As discussed, and differently to my v1,
  https://github.com/XAMPPRocky/remove_dir_all/pull/22/files/30791c382ca9535d4fd9758e1b070393bfa6ef3e#diff-cff001b3dd9b58fd8acadc50ffc32f83
I have made remove_dir_contents intolerant of ENOENT.

Signed-off-by: Ian Jackson <[email protected]>
We introduce a new module `portable` to contain it, rather than just
putting it striaght in `lib.rs`.  (Whether to do this seems a matter
of taste.)

This is the caller for unix::_remove_file_dir_contents so we can get rid of the
temporary `#[allow]` annotation.

Signed-off-by: Ian Jackson <[email protected]>
We check a few error behaviours too.

Signed-off-by: Ian Jackson <[email protected]>
We are going to want to add another test that reuses this.

No functional change.

Signed-off-by: Ian Jackson <[email protected]>
Signed-off-by: Ian Jackson <[email protected]>
@ijackson ijackson force-pushed the remove-dir-contents branch from 3ee4a7d to ce727bd Compare February 16, 2021 19:01
@ijackson
Copy link
Contributor Author

@J-Vin thanks for the review :-). I fixed that typo and rebased onto current main.

@rbtcollins
Copy link
Collaborator

Sorry about how long this took me to get to - hazards of moving across the world.

@rbtcollins rbtcollins merged commit d046874 into XAMPPRocky:master Mar 5, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants