Skip to content

Refactor excludes library #329

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

derrickstolee
Copy link

@derrickstolee derrickstolee commented Sep 3, 2019

The exclude library defined in dir.h was originally written for the .gitignore feature, but has since been used for .gitattributes and sparse-checkout. In the later applications, these patterns are used for inclusion rather than exclusion, so the name is confusing. This gets particularly bad when looking at how the sparse-checkout feature uses is_excluded_from_list() to really mean "should be included in the working directory".

This series performs several renames of structs and methods to generalize the exclude library to be a "pattern matching" library. Instead of a list of excludes, we have a list of path patterns. It is up to the consumer to decide what to do with a match. The .gitignore logic will still treat the patterns as a list of exclusions, the sparse-checkout logic treats the patterns as a list of inclusions.

For this reason, some methods and structs in dir.h retain "exclude" in their name. These are limited to things consumed only by the .gitignore feature (as far as I can tell).

Most of these changes are mechanical find-and-replaces, with the exception of some variable names and the last patch.

The last patch, "unpack-trees: rename 'is_excluded_from_list()'", performs a more meaningful refactor. The method is_excluded_from_list() was only used by sparse-checkout (inside the clear_ce_flags() methods) to see if a path should be included in the working directory. The return value of "1 for excluded" was confusing. Instead, use a new enum value. This required changing several method prototypes inside unpack-trees.c.

This refactor was inspired by Elijah Newren's feedback [1] on my sparse-checkout builtin RFC. I am working on a few other orthogonal changes to make to the existing sparse-checkout behavior before I resubmit that RFC.

I had started working on v2.23.0, but found adjacent-diff conflicts with md/list-objects-filter-combo [2] and js/partial-clone-sparse-blob [3]. Those branches are independent, but the conflicts with md/list-objects-filter-combo were more severe (and that branch seems closer to merging) so this is based on md/list-object-filter-combo. Hopefully the conflicts with js/partial-clone-sparse-blob are clear enough to resolve easily.

Thanks,
-Stolee

[1] https://public-inbox.org/git/CABPp-BFMtO=7UGVZPbqh3tthSetvz5F=W3S=RsryPSuchmZeZw@mail.gmail.com/
Re: [PATCH 8/9] sparse-checkout: use hashmaps for cone patterns

[2] https://public-inbox.org/git/[email protected]/
[RFC PATCH 0/3] implement composite filters

[3] https://public-inbox.org/git/[email protected]/
[PATCH 0/2] partial-clone: fix two issues with sparse filter handling

Cc: [email protected], [email protected], [email protected], [email protected], [email protected]

@derrickstolee derrickstolee changed the base branch from master to js/partial-clone-sparse-blob September 3, 2019 17:39
The first consumer of pattern-matching filenames was the
.gitignore feature. In that context, storing a list of patterns
as a list of 'struct exclude' items makes sense. However, the
sparse-checkout feature then adopted these structures and methods,
but with the opposite meaning: these patterns match the files
that should be included!

It would be clearer to rename this entire library as a "pattern
matching" library, and the callers apply exclusion/inclusion
logic accordingly based on their needs.

This commit renames 'struct exclude' to 'struct path_pattern'
and renames several variable names to match. 'struct pattern'
was already taken by attr.c, and this more completely describes
that the patterns are specific to file paths.

Signed-off-by: Derrick Stolee <[email protected]>
The first consumer of pattern-matching filenames was the
.gitignore feature. In that context, storing a list of patterns
as a 'struct exclude_list'  makes sense. However, the
sparse-checkout feature then adopted these structures and methods,
but with the opposite meaning: these patterns match the files
that should be included!

It would be clearer to rename this entire library as a "pattern
matching" library, and the callers apply exclusion/inclusion
logic accordingly based on their needs.

This commit renames 'struct exclude_list' to 'struct pattern_list'
and renames several variables called 'el' to 'pl'.

Signed-off-by: Derrick Stolee <[email protected]>
The first consumer of pattern-matching filenames was the
.gitignore feature. In that context, storing a list of patterns
as a 'struct exclude_list'  makes sense. However, the
sparse-checkout feature then adopted these structures and methods,
but with the opposite meaning: these patterns match the files
that should be included!

It would be clearer to rename this entire library as a "pattern
matching" library, and the callers apply exclusion/inclusion
logic accordingly based on their needs.

This commit replaces 'EXCL_FLAG_' to 'PATTERN_FLAG_' in the
names of the flags used on 'struct path_pattern'.

Signed-off-by: Derrick Stolee <[email protected]>
The first consumer of pattern-matching filenames was the
.gitignore feature. In that context, storing a list of patterns
as a 'struct exclude_list'  makes sense. However, the
sparse-checkout feature then adopted these structures and methods,
but with the opposite meaning: these patterns match the files
that should be included!

It would be clearer to rename this entire library as a "pattern
matching" library, and the callers apply exclusion/inclusion
logic accordingly based on their needs.

This commit renames several methods defined in dir.h to make
more sense with the renamed 'struct exclude_list' to 'struct
pattern_list' and 'struct exclude' to 'struct path_pattern':

 * last_exclude_matching() -> last_matching_pattern()
 * parse_exclude() -> parse_path_pattern()

In addition, the word 'exclude' was replaced with 'pattern'
in the methods below:

 * add_exclude_list()
 * add_excludes_from_file_to_list()
 * add_excludes_from_file()
 * add_excludes_from_blob_to_list()
 * add_exclude()
 * clear_exclude_list()

A few methods with the word "exclude" remain. These will
be handled seperately. In particular, the method
"is_excluded()" is concretely about the .gitignore file
relative to a specific directory. This is the important
boundary between library and consumer: is_excluded() cares
about .gitignore, but is_excluded() calls
last_matching_pattern() to make that decision.

Signed-off-by: Derrick Stolee <[email protected]>
The first consumer of pattern-matching filenames was the
.gitignore feature. In that context, storing a list of patterns
as a 'struct exclude_list'  makes sense. However, the
sparse-checkout feature then adopted these structures and methods,
but with the opposite meaning: these patterns match the files
that should be included!

Now that this library is renamed to use 'struct pattern_list'
and 'struct pattern', we can now rename the method used by
the sparse-checkout feature to determine which paths should
appear in the working directory.

The method is_excluded_from_list() is only used by the
sparse-checkout logic in unpack-trees and list-objects-filter.
The confusing part is that it returned 1 for "excluded" (i.e.
it matches the list of exclusions) but that really manes that
the path matched the list of patterns for _inclusion_ in the
working directory.

Rename the method to be path_matches_pattern_list() and have
it return an explicit 'enum pattern_match_result'. Here, the
values MATCHED = 1, UNMATCHED = 0, and UNDECIDED = -1 agree
with the previous integer values. This shift allows future
consumers to better understand what the retur values mean,
and provides more type checking for handling those values.

Signed-off-by: Derrick Stolee <[email protected]>
@derrickstolee derrickstolee changed the base branch from js/partial-clone-sparse-blob to md/list-objects-filter-combo September 3, 2019 17:50
@derrickstolee
Copy link
Author

/submit

@gitgitgadget
Copy link

gitgitgadget bot commented Sep 3, 2019

Submitted as [email protected]

@@ -561,7 +561,7 @@ int no_wildcard(const char *string)
return string[simple_length(string)] == '\0';
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Elijah Newren wrote (reply to this):

On Tue, Sep 3, 2019 at 11:04 AM Derrick Stolee via GitGitGadget
<[email protected]> wrote:
>
> From: Derrick Stolee <[email protected]>
>
> The first consumer of pattern-matching filenames was the
> .gitignore feature. In that context, storing a list of patterns
> as a 'struct exclude_list'  makes sense. However, the
> sparse-checkout feature then adopted these structures and methods,
> but with the opposite meaning: these patterns match the files
> that should be included!
>
> Now that this library is renamed to use 'struct pattern_list'
> and 'struct pattern', we can now rename the method used by
> the sparse-checkout feature to determine which paths should
> appear in the working directory.
>
> The method is_excluded_from_list() is only used by the
> sparse-checkout logic in unpack-trees and list-objects-filter.
> The confusing part is that it returned 1 for "excluded" (i.e.
> it matches the list of exclusions) but that really manes that

s/manes/means/

> the path matched the list of patterns for _inclusion_ in the
> working directory.
>
> Rename the method to be path_matches_pattern_list() and have
> it return an explicit 'enum pattern_match_result'. Here, the
> values MATCHED = 1, UNMATCHED = 0, and UNDECIDED = -1 agree
> with the previous integer values. This shift allows future
> consumers to better understand what the retur values mean,

s/retur/return/

> and provides more type checking for handling those values.
>
> Signed-off-by: Derrick Stolee <[email protected]>
> ---
>  dir.c                 | 25 +++++++++++++++++--------
>  dir.h                 | 21 +++++++++++++++++----
>  list-objects-filter.c | 29 +++++++++++++++--------------
>  unpack-trees.c        | 39 +++++++++++++++++++++++----------------
>  4 files changed, 72 insertions(+), 42 deletions(-)
>
> diff --git a/dir.c b/dir.c
> index b057bd3d95..34972abdaf 100644
> --- a/dir.c
> +++ b/dir.c
> @@ -1072,19 +1072,28 @@ static struct path_pattern *last_matching_pattern_from_list(const char *pathname
>  }
>
>  /*
> - * Scan the list and let the last match determine the fate.
> - * Return 1 for exclude, 0 for include and -1 for undecided.
> + * Scan the list of patterns to determine if the ordered list
> + * of patterns matches on 'pathname'.
> + *
> + * Return 1 for a match, 0 for not matched and -1 for undecided.

Maybe drop this last sentence since it's misleading, and since the
function signature already specifies that it returns an enum and the
enums have obvious meanings already?

>   */
> -int is_excluded_from_list(const char *pathname,
> -                         int pathlen, const char *basename, int *dtype,
> -                         struct pattern_list *pl, struct index_state *istate)
> +enum pattern_match_result path_matches_pattern_list(
> +                               const char *pathname, int pathlen,
> +                               const char *basename, int *dtype,
> +                               struct pattern_list *pl,
> +                               struct index_state *istate)
>  {
>         struct path_pattern *pattern;
>         pattern = last_matching_pattern_from_list(pathname, pathlen, basename,
>                                                   dtype, pl, istate);
> -       if (pattern)
> -               return pattern->flags & PATTERN_FLAG_NEGATIVE ? 0 : 1;
> -       return -1; /* undecided */
> +       if (pattern) {
> +               if (pattern->flags & PATTERN_FLAG_NEGATIVE)
> +                       return NOT_MATCHED;
> +               else
> +                       return MATCHED;
> +       }
> +
> +       return UNDECIDED;
>  }
>
>  static struct path_pattern *last_matching_pattern_from_lists(
> diff --git a/dir.h b/dir.h
> index 7babf31d88..608696c958 100644
> --- a/dir.h
> +++ b/dir.h
> @@ -230,10 +230,23 @@ int read_directory(struct dir_struct *, struct index_state *istate,
>                    const char *path, int len,
>                    const struct pathspec *pathspec);
>
> -int is_excluded_from_list(const char *pathname, int pathlen,
> -                         const char *basename, int *dtype,
> -                         struct pattern_list *pl,
> -                         struct index_state *istate);
> +enum pattern_match_result {
> +       UNDECIDED = -1,
> +       NOT_MATCHED = 0,
> +       MATCHED = 1,

Are we worried about preventing e.g. grep from using these names or
clashing with another lib (block-sha1?) that might choose to define
these?

> +};
> +
> +/*
> + * Scan the list of patterns to determine if the ordered list
> + * of patterns matches on 'pathname'.
> + *
> + * Return 1 for a match, 0 for not matched and -1 for undecided.

Again, I'd drop the last sentence.

> + */
> +enum pattern_match_result path_matches_pattern_list(const char *pathname,
> +                               int pathlen,
> +                               const char *basename, int *dtype,
> +                               struct pattern_list *pl,
> +                               struct index_state *istate);
>  struct dir_entry *dir_add_ignored(struct dir_struct *dir,
>                                   struct index_state *istate,
>                                   const char *pathname, int len);
> diff --git a/list-objects-filter.c b/list-objects-filter.c
> index ccd58e61c3..d624f1c898 100644
> --- a/list-objects-filter.c
> +++ b/list-objects-filter.c
> @@ -328,12 +328,12 @@ static void filter_blobs_limit__init(
>   */
>  struct frame {
>         /*
> -        * defval is the usual default include/exclude value that
> +        * default_match is the usual default include/exclude value that
>          * should be inherited as we recurse into directories based
>          * upon pattern matching of the directory itself or of a
>          * containing directory.
>          */
> -       int defval;
> +       enum pattern_match_result default_match;
>
>         /*
>          * 1 if the directory (recursively) contains any provisionally
> @@ -363,8 +363,9 @@ static enum list_objects_filter_result filter_sparse(
>         void *filter_data_)
>  {
>         struct filter_sparse_data *filter_data = filter_data_;
> -       int val, dtype;
> +       int dtype;
>         struct frame *frame;
> +       enum pattern_match_result match;
>
>         switch (filter_situation) {
>         default:
> @@ -373,15 +374,15 @@ static enum list_objects_filter_result filter_sparse(
>         case LOFS_BEGIN_TREE:
>                 assert(obj->type == OBJ_TREE);
>                 dtype = DT_DIR;
> -               val = is_excluded_from_list(pathname, strlen(pathname),
> -                                           filename, &dtype, &filter_data->pl,
> -                                           r->index);
> -               if (val < 0)
> -                       val = filter_data->array_frame[filter_data->nr - 1].defval;
> +               match = path_matches_pattern_list(pathname, strlen(pathname),
> +                                                 filename, &dtype, &filter_data->pl,
> +                                                 r->index);
> +               if (match == UNDECIDED)
> +                       match = filter_data->array_frame[filter_data->nr - 1].default_match;
>
>                 ALLOC_GROW(filter_data->array_frame, filter_data->nr + 1,
>                            filter_data->alloc);
> -               filter_data->array_frame[filter_data->nr].defval = val;
> +               filter_data->array_frame[filter_data->nr].default_match = match;
>                 filter_data->array_frame[filter_data->nr].child_prov_omit = 0;
>                 filter_data->nr++;
>
> @@ -435,12 +436,12 @@ static enum list_objects_filter_result filter_sparse(
>                 frame = &filter_data->array_frame[filter_data->nr - 1];
>
>                 dtype = DT_REG;
> -               val = is_excluded_from_list(pathname, strlen(pathname),
> +               match = path_matches_pattern_list(pathname, strlen(pathname),
>                                             filename, &dtype, &filter_data->pl,
>                                             r->index);
> -               if (val < 0)
> -                       val = frame->defval;
> -               if (val > 0) {
> +               if (match == UNDECIDED)
> +                       match = frame->default_match;
> +               if (match == MATCHED) {
>                         if (omits)
>                                 oidset_remove(omits, &obj->oid);
>                         return LOFR_MARK_SEEN | LOFR_DO_SHOW;
> @@ -487,7 +488,7 @@ static void filter_sparse_oid__init(
>                 die("could not load filter specification");
>
>         ALLOC_GROW(d->array_frame, d->nr + 1, d->alloc);
> -       d->array_frame[d->nr].defval = 0; /* default to include */
> +       d->array_frame[d->nr].default_match = 0; /* default to include */

s/0/MATCHED/ ?

>         d->array_frame[d->nr].child_prov_omit = 0;
>         d->nr++;
>
> diff --git a/unpack-trees.c b/unpack-trees.c
> index 902a799aeb..cd548f4fa2 100644
> --- a/unpack-trees.c
> +++ b/unpack-trees.c
> @@ -1265,7 +1265,8 @@ static int clear_ce_flags_1(struct index_state *istate,
>                             struct cache_entry **cache, int nr,
>                             struct strbuf *prefix,
>                             int select_mask, int clear_mask,
> -                           struct pattern_list *pl, int defval);
> +                           struct pattern_list *pl,
> +                           enum pattern_match_result default_match);
>
>  /* Whole directory matching */
>  static int clear_ce_flags_dir(struct index_state *istate,
> @@ -1273,19 +1274,21 @@ static int clear_ce_flags_dir(struct index_state *istate,
>                               struct strbuf *prefix,
>                               char *basename,
>                               int select_mask, int clear_mask,
> -                             struct pattern_list *pl, int defval)
> +                             struct pattern_list *pl,
> +                             enum pattern_match_result default_match)
>  {
>         struct cache_entry **cache_end;
>         int dtype = DT_DIR;
> -       int ret = is_excluded_from_list(prefix->buf, prefix->len,
> -                                       basename, &dtype, pl, istate);
>         int rc;
> +       enum pattern_match_result ret;
> +       ret = path_matches_pattern_list(prefix->buf, prefix->len,
> +                                       basename, &dtype, pl, istate);
>
>         strbuf_addch(prefix, '/');
>
>         /* If undecided, use matching result of parent dir in defval */
> -       if (ret < 0)
> -               ret = defval;
> +       if (ret == UNDECIDED)
> +               ret = default_match;
>
>         for (cache_end = cache; cache_end != cache + nr; cache_end++) {
>                 struct cache_entry *ce = *cache_end;
> @@ -1298,7 +1301,7 @@ static int clear_ce_flags_dir(struct index_state *istate,
>          * with ret (iow, we know in advance the incl/excl
>          * decision for the entire directory), clear flag here without
>          * calling clear_ce_flags_1(). That function will call
> -        * the expensive is_excluded_from_list() on every entry.
> +        * the expensive path_matches_pattern_list() on every entry.
>          */
>         rc = clear_ce_flags_1(istate, cache, cache_end - cache,
>                               prefix,
> @@ -1327,7 +1330,8 @@ static int clear_ce_flags_1(struct index_state *istate,
>                             struct cache_entry **cache, int nr,
>                             struct strbuf *prefix,
>                             int select_mask, int clear_mask,
> -                           struct pattern_list *pl, int defval)
> +                           struct pattern_list *pl,
> +                           enum pattern_match_result default_match)
>  {
>         struct cache_entry **cache_end = cache + nr;
>
> @@ -1338,7 +1342,8 @@ static int clear_ce_flags_1(struct index_state *istate,
>         while(cache != cache_end) {
>                 struct cache_entry *ce = *cache;
>                 const char *name, *slash;
> -               int len, dtype, ret;
> +               int len, dtype;
> +               enum pattern_match_result ret;
>
>                 if (select_mask && !(ce->ce_flags & select_mask)) {
>                         cache++;
> @@ -1362,7 +1367,7 @@ static int clear_ce_flags_1(struct index_state *istate,
>                                                        prefix,
>                                                        prefix->buf + prefix->len - len,
>                                                        select_mask, clear_mask,
> -                                                      pl, defval);
> +                                                      pl, default_match);
>
>                         /* clear_c_f_dir eats a whole dir already? */
>                         if (processed) {
> @@ -1374,18 +1379,20 @@ static int clear_ce_flags_1(struct index_state *istate,
>                         strbuf_addch(prefix, '/');
>                         cache += clear_ce_flags_1(istate, cache, cache_end - cache,
>                                                   prefix,
> -                                                 select_mask, clear_mask, pl, defval);
> +                                                 select_mask, clear_mask, pl,
> +                                                 default_match);
>                         strbuf_setlen(prefix, prefix->len - len - 1);
>                         continue;
>                 }
>
>                 /* Non-directory */
>                 dtype = ce_to_dtype(ce);
> -               ret = is_excluded_from_list(ce->name, ce_namelen(ce),
> -                                           name, &dtype, pl, istate);
> -               if (ret < 0)
> -                       ret = defval;
> -               if (ret > 0)
> +               ret = path_matches_pattern_list(ce->name,
> +                                               ce_namelen(ce),
> +                                               name, &dtype, pl, istate);
> +               if (ret == UNDECIDED)
> +                       ret = default_match;
> +               if (ret == MATCHED)
>                         ce->ce_flags &= ~clear_mask;
>                 cache++;
>         }
> --
> gitgitgadget

The rest looks good.

@gitgitgadget
Copy link

gitgitgadget bot commented Sep 4, 2019

On the Git mailing list, Elijah Newren wrote (reply to this):

On Tue, Sep 3, 2019 at 11:04 AM Derrick Stolee via GitGitGadget
<[email protected]> wrote:
>
> The exclude library defined in dir.h was originally written for the
> .gitignore feature, but has since been used for .gitattributes and
> sparse-checkout. In the later applications, these patterns are used for
> inclusion rather than exclusion, so the name is confusing. This gets
> particularly bad when looking at how the sparse-checkout feature uses
> is_excluded_from_list() to really mean "should be included in the working
> directory".
>
> This series performs several renames of structs and methods to generalize
> the exclude library to be a "pattern matching" library. Instead of a list of
> excludes, we have a list of path patterns. It is up to the consumer to
> decide what to do with a match. The .gitignore logic will still treat the
> patterns as a list of exclusions, the sparse-checkout logic treats the
> patterns as a list of inclusions.

Wahoo!  Thanks for going through and cleaning these up.  The
possibility of negated patterns (or negated logic or both) on top of
using "exclude" when "include" was _sometimes_ meant was just too much
for me to keep track of.  This series will make it much easier.

> For this reason, some methods and structs in dir.h retain "exclude" in their
> name. These are limited to things consumed only by the .gitignore feature
> (as far as I can tell).
>
> Most of these changes are mechanical find-and-replaces, with the exception
> of some variable names and the last patch.

I looked through the first four patches with --color-words=. to spot
check them; as you say, looks like straightforward mechanical changes.

I did notice that the first two paragraphs were shared between the
commit messages; it does make sense to help provide the context, but
it seemed a little duplicative.  I'm not sure what to suggest; it may
be fine as it is.  Just thought I'd flag it.

> The last patch, "unpack-trees: rename 'is_excluded_from_list()'", performs a
> more meaningful refactor. The method is_excluded_from_list() was only used
> by sparse-checkout (inside the clear_ce_flags() methods) to see if a path
> should be included in the working directory. The return value of "1 for
> excluded" was confusing. Instead, use a new enum value. This required
> changing several method prototypes inside unpack-trees.c.
>
> This refactor was inspired by Elijah Newren's feedback [1] on my
> sparse-checkout builtin RFC. I am working on a few other orthogonal changes
> to make to the existing sparse-checkout behavior before I resubmit that RFC.
>
> I had started working on v2.23.0, but found adjacent-diff conflicts with
> md/list-objects-filter-combo [2] and js/partial-clone-sparse-blob [3]. Those
> branches are independent, but the conflicts with
> md/list-objects-filter-combo were more severe (and that branch seems closer
> to merging) so this is based on md/list-object-filter-combo. Hopefully the
> conflicts with js/partial-clone-sparse-blob are clear enough to resolve
> easily.

I posted some comments on patch 5; otherwise, the series looks good to me.

@@ -32,19 +32,19 @@ static const struct option check_ignore_options[] = {
OPT_END()
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Jeff King wrote (reply to this):

On Tue, Sep 03, 2019 at 11:04:55AM -0700, Derrick Stolee via GitGitGadget wrote:

> From: Derrick Stolee <[email protected]>
> 
> The first consumer of pattern-matching filenames was the
> .gitignore feature. In that context, storing a list of patterns
> as a list of 'struct exclude' items makes sense. However, the
> sparse-checkout feature then adopted these structures and methods,
> but with the opposite meaning: these patterns match the files
> that should be included!
> 
> It would be clearer to rename this entire library as a "pattern
> matching" library, and the callers apply exclusion/inclusion
> logic accordingly based on their needs.
> 
> This commit renames 'struct exclude' to 'struct path_pattern'
> and renames several variable names to match. 'struct pattern'
> was already taken by attr.c, and this more completely describes
> that the patterns are specific to file paths.

I agree that the current name is overly restrictive, and this is a step
in the right direction. However, when I see path_pattern that makes me
think of our command-line pathspecs.

I wonder if there's a name that could more clearly distinguish the two.
Or if it's sufficient to just become Git jargon that "pathspec" is the
command-line one and "path_pattern" is the file-based one (we're at
least pretty consistent about the former already).

I think one could also make an argument that the name collision is a
sign that these two things should actually share both syntax and
implementation, since we're exposing too similar-but-not-quite versions
of the same idea to users. But given the compatibility issues, it's
probably not worth changing the user facing parts at this point (and I
also haven't thought too hard about it; there may be reasons why the two
_should_ differ).

-Peff

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

Jeff King <[email protected]> writes:

> I wonder if there's a name that could more clearly distinguish the two.
> Or if it's sufficient to just become Git jargon that "pathspec" is the
> command-line one and "path_pattern" is the file-based one (we're at
> least pretty consistent about the former already).
>
> I think one could also make an argument that the name collision is a
> sign that these two things should actually share both syntax and
> implementation, since we're exposing too similar-but-not-quite versions
> of the same idea to users. But given the compatibility issues, it's
> probably not worth changing the user facing parts at this point (and I
> also haven't thought too hard about it; there may be reasons why the two
> _should_ differ).

Hmph.  I did not realize there are so many differences X-<.  

A pathspec is relative to $CWD, and there is a syntax, i.e.
prefixing with ":(top)", to make it relative to the root level.  An
entry in a .gitignore file will never affect paths outside the
directory the file appears in.  And there should never be such a
mechanism to allow it.

An entry without slash in .gitignore is a basename match, and there
is a syntax i.e. prefixing with "/", to anchor it to a single
directory.  A pathspec without slash also can be a basename match
(e.g. "*.c" matches "a/b.c" as well as "d.c").  A pathspec with a
slash can be made to tail-match (e.g. "**/*.c" matches "a/b.c",
"a/b/c.c", etc.) but I do not think of a way to make an entry with a
slash in a .gitignore file a tail-match the same way.  I do not think
this is intended but merely a missing feature.

So, yes, eventually we may want to make them more similar, but I
suspect that there are some things that should be in one but never
be in the other.

@gitgitgadget
Copy link

gitgitgadget bot commented Sep 5, 2019

This branch is now known as ds/include-exclude.

@gitgitgadget
Copy link

gitgitgadget bot commented Sep 5, 2019

This patch series was integrated into pu via git@60a992a.

@gitgitgadget gitgitgadget bot added the pu label Sep 5, 2019
@gitgitgadget
Copy link

gitgitgadget bot commented Sep 6, 2019

This patch series was integrated into pu via git@f6aef11.

@gitgitgadget
Copy link

gitgitgadget bot commented Sep 6, 2019

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Derrick Stolee via GitGitGadget" <[email protected]> writes:

> I had started working on v2.23.0, but found adjacent-diff conflicts with
> md/list-objects-filter-combo [2] and js/partial-clone-sparse-blob [3]. Those
> branches are independent, but the conflicts with
> md/list-objects-filter-combo were more severe (and that branch seems closer
> to merging) so this is based on md/list-object-filter-combo. Hopefully the
> conflicts with js/partial-clone-sparse-blob are clear enough to resolve
> easily.

Thanks.

I did screw the resolution up in previous integration, but I think
what we have today is in a good shape.

@gitgitgadget
Copy link

gitgitgadget bot commented Sep 6, 2019

This patch series was integrated into pu via git@fb08f03.

@gitgitgadget
Copy link

gitgitgadget bot commented Sep 7, 2019

This patch series was integrated into pu via git@afbd9c9.

@gitgitgadget
Copy link

gitgitgadget bot commented Sep 9, 2019

This patch series was integrated into pu via git@0ea490b.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant