-
Notifications
You must be signed in to change notification settings - Fork 140
fsmonitor: start using an opaque token for last update #510
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…token Some file system monitors might not use or take a timestamp for processing and in the case of watchman could have race conditions with using a timestamp. Watchman uses something called a clockid that is used for race free queries to it. The clockid for watchman is simply a string. Change the fsmonitor_last_update from being a uint64_t to a char pointer so that any arbitrary data can be stored in it and passed back to the fsmonitor. Signed-off-by: Kevin Willford <[email protected]>
3ef0288
to
a12bbc2
Compare
Some file monitors like watchman will use something other than a timestamp to keep better track of what changes happen in between calls to query the fsmonitor. The clockid in watchman is a string. Now that the index is storing an opaque token for the last update the code needs to be updated to pass that opaque token to a verion 2 of the fsmonitor hook. Because there are repos that already have version 1 of the hook and we want them to continue to work when git is updated, we need to handle both version 1 and version 2 of the hook. In order to do that a config value is being added core.fsmonitorHookVersion to force what version of the hook should be used. When this is not set it will default to -1 and then the code will attempt to call version 2 of the hook first. If that fails it will fallback to trying version 1. Signed-off-by: Kevin Willford <[email protected]>
a12bbc2
to
8d381b7
Compare
/submit |
Submitted as [email protected] |
This branch is now known as |
This patch series was integrated into pu via git@f93c640. |
This patch series was integrated into pu via git@0d875df. |
This patch series was integrated into pu via git@da075a8. |
This patch series was integrated into pu via git@86e79c5. |
This patch series was integrated into pu via git@c10cd17. |
This patch series was integrated into pu via git@9277609. |
This patch series was integrated into pu via git@145a106. |
This patch series was integrated into pu via git@2d2d8b9. |
This patch series was integrated into pu via git@ba3eb76. |
This patch series was integrated into pu via git@49b9202. |
This patch series was integrated into pu via git@d53ae78. |
This patch series was integrated into pu via git@c175555. |
Resolves #291. Mostly a code move from `CloneVerb` to `ConfigStep`. We lose some output during clone, like "Watchman configured!" as we don't have access to `Output` during the `ConfigStep`. The `CloneVerb` runs the `ConfigStep` directly, but so does the `RegisterStep` and background maintenance. There is a benefit, though: we will try configuring Watchman on every upgrade and once a day. If a user installs Watchman after cloning, then we will configure it automatically. One remaining thing to think about: we are currently _copying_ the existing hook template from the hooks directory. As this changes (see microsoft/git#225 and gitgitgadget/git#510), we will need a different way to populate the hook. Perhaps we should store the hook contents in the Scalar codebase to avoid using a stale (or malicious) hook. Cc @kewillford for thoughts on this.
This patch series was integrated into pu via git@80f169c. |
Version 2 of the fsmonitor hooks is passed the version and an update token and must pass back a last update token to use for subsequent calls to the hook. Signed-off-by: Kevin Willford <[email protected]>
A new config value for core.fsmonitorHookVersion was added to be able to force the version of the fsmonitor hook. Possible values are 1 or 2. When this is not set the code will use a value of -1 and attempt to use version 2 of the hook first and if that fails will attempt version 1. Signed-off-by: Kevin Willford <[email protected]>
8d381b7
to
1db2a69
Compare
/submit |
Submitted as [email protected] |
This patch series was integrated into pu via git@886c35f. |
This patch series was integrated into pu via git@8f5f596. |
This patch series was integrated into pu via git@49db93f. |
@@ -324,7 +324,7 @@ struct index_state { | |||
struct hashmap dir_hash; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Johannes Schindelin wrote (reply to this):
Hi Kevin,
On Thu, 23 Jan 2020, Kevin Willford via GitGitGadget wrote:
> diff --git a/fsmonitor.c b/fsmonitor.c
> index 868cca01e2..9860587225 100644
> --- a/fsmonitor.c
> +++ b/fsmonitor.c
> @@ -89,11 +98,12 @@ void write_fsmonitor_extension(struct strbuf *sb, struct index_state *istate)
> BUG("fsmonitor_dirty has more entries than the index (%"PRIuMAX" > %u)",
> (uintmax_t)istate->fsmonitor_dirty->bit_size, istate->cache_nr);
>
> - put_be32(&hdr_version, INDEX_EXTENSION_VERSION);
> + put_be32(&hdr_version, INDEX_EXTENSION_VERSION2);
> strbuf_add(sb, &hdr_version, sizeof(uint32_t));
>
> - put_be64(&tm, istate->fsmonitor_last_update);
> - strbuf_add(sb, &tm, sizeof(uint64_t));
> + strbuf_addstr(sb, istate->fsmonitor_last_update);
> + strbuf_addch(sb, 0); /* Want to keep a NUL */
I have a slight preference to use '\0' here, which my brain somehow reads
as `NUL`.
> +
> fixup = sb->len;
> strbuf_add(sb, &ewah_size, sizeof(uint32_t)); /* we'll fix this up later */
>
> @@ -110,9 +120,9 @@ void write_fsmonitor_extension(struct strbuf *sb, struct index_state *istate)
> }
>
> /*
> - * Call the query-fsmonitor hook passing the time of the last saved results.
> + * Call the query-fsmonitor hook passing the last update token of the saved results.
> */
> -static int query_fsmonitor(int version, uint64_t last_update, struct strbuf *query_result)
> +static int query_fsmonitor(int version, const char *last_update, struct strbuf *query_result)
> {
> struct child_process cp = CHILD_PROCESS_INIT;
>
> @@ -121,7 +131,7 @@ static int query_fsmonitor(int version, uint64_t last_update, struct strbuf *que
>
> argv_array_push(&cp.args, core_fsmonitor);
> argv_array_pushf(&cp.args, "%d", version);
> - argv_array_pushf(&cp.args, "%" PRIuMAX, (uintmax_t)last_update);
> + argv_array_pushf(&cp.args, "%s", last_update);
Maybe `argv_array_push(&cp.args, last_update)`?
> cp.use_shell = 1;
> cp.dir = get_git_work_tree();
>
> @@ -151,6 +161,7 @@ void refresh_fsmonitor(struct index_state *istate)
> int query_success = 0;
> size_t bol; /* beginning of line */
> uint64_t last_update;
> + struct strbuf last_update_token = STRBUF_INIT;
> char *buf;
> unsigned int i;
>
> @@ -164,6 +175,7 @@ void refresh_fsmonitor(struct index_state *istate)
> * should be inclusive to ensure we don't miss potential changes.
> */
> last_update = getnanotime();
> + strbuf_addf(&last_update_token, "%"PRIu64"", last_update);
>
> /*
> * If we have a last update time, call query_fsmonitor for the set of
> @@ -217,18 +229,21 @@ void refresh_fsmonitor(struct index_state *istate)
> }
> strbuf_release(&query_result);
>
> - /* Now that we've updated istate, save the last_update time */
> - istate->fsmonitor_last_update = last_update;
> + /* Now that we've updated istate, save the last_update_token */
> + FREE_AND_NULL(istate->fsmonitor_last_update);
> + istate->fsmonitor_last_update = strbuf_detach(&last_update_token, NULL);
I see quite a few `strbuf_detach()` calls in this patch, and could imagine
that this is a good indicator that the `fsmonitor_last_update` attribute
of `struct index_state` could be a `struct strbuf` instead, that is
`strbuf_reset()`ed and `strbuf_addf()`ed to, rather than having the
strbufs as local variables.
Other than that, this patch looks very good to me.
Thanks!
Dscho
> }
>
> void add_fsmonitor(struct index_state *istate)
> {
> unsigned int i;
> + struct strbuf last_update = STRBUF_INIT;
>
> if (!istate->fsmonitor_last_update) {
> trace_printf_key(&trace_fsmonitor, "add fsmonitor");
> istate->cache_changed |= FSMONITOR_CHANGED;
> - istate->fsmonitor_last_update = getnanotime();
> + strbuf_addf(&last_update, "%"PRIu64"", getnanotime());
> + istate->fsmonitor_last_update = strbuf_detach(&last_update, NULL);
>
> /* reset the fsmonitor state */
> for (i = 0; i < istate->cache_nr; i++)
> @@ -250,7 +265,7 @@ void remove_fsmonitor(struct index_state *istate)
> if (istate->fsmonitor_last_update) {
> trace_printf_key(&trace_fsmonitor, "remove fsmonitor");
> istate->cache_changed |= FSMONITOR_CHANGED;
> - istate->fsmonitor_last_update = 0;
> + FREE_AND_NULL(istate->fsmonitor_last_update);
> }
> }
>
> diff --git a/t/helper/test-dump-fsmonitor.c b/t/helper/test-dump-fsmonitor.c
> index 2786f47088..975f0ac890 100644
> --- a/t/helper/test-dump-fsmonitor.c
> +++ b/t/helper/test-dump-fsmonitor.c
> @@ -13,7 +13,7 @@ int cmd__dump_fsmonitor(int ac, const char **av)
> printf("no fsmonitor\n");
> return 0;
> }
> - printf("fsmonitor last update %"PRIuMAX"\n", (uintmax_t)istate->fsmonitor_last_update);
> + printf("fsmonitor last update %s\n", istate->fsmonitor_last_update);
>
> for (i = 0; i < istate->cache_nr; i++)
> printf((istate->cache[i]->ce_flags & CE_FSMONITOR_VALID) ? "+" : "-");
> --
> gitgitgadget
>
>
@@ -6,8 +6,10 @@ | |||
#include "run-command.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Johannes Schindelin wrote (reply to this):
Hi Kevin,
On Thu, 23 Jan 2020, Kevin Willford via GitGitGadget wrote:
> diff --git a/fsmonitor.c b/fsmonitor.c
> index 9860587225..932bd9012d 100644
> --- a/fsmonitor.c
> +++ b/fsmonitor.c
> @@ -175,27 +195,60 @@ void refresh_fsmonitor(struct index_state *istate)
> * should be inclusive to ensure we don't miss potential changes.
> */
> last_update = getnanotime();
> - strbuf_addf(&last_update_token, "%"PRIu64"", last_update);
> + if (hook_version == HOOK_INTERFACE_VERSION1)
> + strbuf_addf(&last_update_token, "%"PRIu64"", last_update);
>
> /*
> - * If we have a last update time, call query_fsmonitor for the set of
> - * changes since that time, else assume everything is possibly dirty
> + * If we have a last update token, call query_fsmonitor for the set of
> + * changes since that token, else assume everything is possibly dirty
> * and check it all.
> */
> if (istate->fsmonitor_last_update) {
> - query_success = !query_fsmonitor(HOOK_INTERFACE_VERSION,
> - istate->fsmonitor_last_update, &query_result);
> + if (hook_version == -1 || hook_version == HOOK_INTERFACE_VERSION2) {
> + query_success = !query_fsmonitor(HOOK_INTERFACE_VERSION2,
> + istate->fsmonitor_last_update, &query_result);
> +
> + if (query_success) {
> + if (hook_version < 0)
> + hook_version = HOOK_INTERFACE_VERSION2;
> +
> + /*
> + * First entry will be the last update token
> + * Need to use a char * variable because static
> + * analysis was suggesting to use strbuf_addbuf
> + * but we don't want to copy the entire strbuf
> + * only the the chars up to the first NUL
> + */
> + buf = query_result.buf;
> + strbuf_addstr(&last_update_token, buf);
> + if (!last_update_token.len) {
> + warning("Empty last update token.");
> + query_success = 0;
> + } else {
> + bol = last_update_token.len + 1;
> + }
> + } else if (hook_version < 0) {
> + hook_version = HOOK_INTERFACE_VERSION1;
> + if (!last_update_token.len)
> + strbuf_addf(&last_update_token, "%"PRIu64"", last_update);
> + }
> + }
> +
> + if (hook_version == HOOK_INTERFACE_VERSION1) {
> + query_success = !query_fsmonitor(HOOK_INTERFACE_VERSION1,
> + istate->fsmonitor_last_update, &query_result);
I suspect that `istate->fsmonitor_last_update` might be incorrect here and
that you need `last_update_token.buf` instead. Besides...
> + }
I could imagine that this would be easier to read if you initialized
int interface_version = HOOK_INTERFACE_VERSION2;
const char *token = istate->fsmonitor_last_update;
and then, at the beginning of this hunk, where you already added an `if`,
extend it to
if (hook_version == HOOK_INTERFACE_VERSION1) {
interface_version = HOOK_INTERFACE_VERSION1;
strbuf_addf(&last_update_token, "%"PRIu64"", last_update);
token = last_update_token.buf;
}
Now, you can call
query_success = !query_fsmonitor(interface_version,
token, &query_result);
if (!query_success && hook_version < 0) {
hook_version = HOOK_INTERFACE_VERSION1;
strbuf_reset(&last_update_token);
strbuf_addf(&last_update_token, "%"PRIu64"", last_update);
token = last_update_token.buf;
query_success = !query_fsmonitor(hook_version,
token, &query_result);
}
if (query_success && interface_version == HOOK_INTERFACE_VERSION2)
bol = last_update_token.len + 1;
Technically, you could force `hook_version` to non-negative, via:
if (hook_version < 0) {
if (query_success)
hook_version = HOOK_INTERFACE_VERSION2;
else {
hook_version = HOOK_INTERFACE_VERSION1;
strbuf_reset(&last_update_token);
strbuf_addf(&last_update_token, "%"PRIu64"", last_update);
token = last_update_token.buf;
query_success = !query_fsmonitor(hook_version,
token, &query_result);
}
}
but that would not make anything quicker, and would make the code more
convoluted.
It is good that you keep the `fsmonitor-all` and `fsmonitor-watchman`
versions at 1, to verify that this fall-back mechanism works. I wonder
whether we should add one explicit test for that, so that those two hooks
can be upgraded to the interface v2.
Thanks,
Dscho
> +
> trace_performance_since(last_update, "fsmonitor process '%s'", core_fsmonitor);
> trace_printf_key(&trace_fsmonitor, "fsmonitor process '%s' returned %s",
> core_fsmonitor, query_success ? "success" : "failure");
> }
>
> /* a fsmonitor process can return '/' to indicate all entries are invalid */
> - if (query_success && query_result.buf[0] != '/') {
> + if (query_success && query_result.buf[bol] != '/') {
> /* Mark all entries returned by the monitor as dirty */
> buf = query_result.buf;
> - bol = 0;
> - for (i = 0; i < query_result.len; i++) {
> + for (i = bol; i < query_result.len; i++) {
> if (buf[i] != '\0')
> continue;
> fsmonitor_refresh_callback(istate, buf + bol);
> diff --git a/t/t7519-status-fsmonitor.sh b/t/t7519-status-fsmonitor.sh
> index cf0fda2d5a..fbfdcca000 100755
> --- a/t/t7519-status-fsmonitor.sh
> +++ b/t/t7519-status-fsmonitor.sh
> @@ -32,11 +32,12 @@ write_integration_script () {
> echo "$0: exactly 2 arguments expected"
> exit 2
> fi
> - if test "$1" != 1
> + if test "$1" != 2
> then
> echo "Unsupported core.fsmonitor hook version." >&2
> exit 1
> fi
> + printf "last_update_token\0"
> printf "untracked\0"
> printf "dir1/untracked\0"
> printf "dir2/untracked\0"
> @@ -107,6 +108,7 @@ EOF
> # test that "update-index --fsmonitor-valid" sets the fsmonitor valid bit
> test_expect_success 'update-index --fsmonitor-valid" sets the fsmonitor valid bit' '
> write_script .git/hooks/fsmonitor-test<<-\EOF &&
> + printf "last_update_token\0"
> EOF
> git update-index --fsmonitor &&
> git update-index --fsmonitor-valid dir1/modified &&
> @@ -167,6 +169,7 @@ EOF
> # test that newly added files are marked valid
> test_expect_success 'newly added files are marked valid' '
> write_script .git/hooks/fsmonitor-test<<-\EOF &&
> + printf "last_update_token\0"
> EOF
> git add new &&
> git add dir1/new &&
> @@ -207,6 +210,7 @@ EOF
> # test that *only* files returned by the integration script get flagged as invalid
> test_expect_success '*only* files returned by the integration script get flagged as invalid' '
> write_script .git/hooks/fsmonitor-test<<-\EOF &&
> + printf "last_update_token\0"
> printf "dir1/modified\0"
> EOF
> clean_repo &&
> @@ -276,6 +280,7 @@ do
> # (if enabled) files unless it is told about them.
> test_expect_success "status doesn't detect unreported modifications" '
> write_script .git/hooks/fsmonitor-test<<-\EOF &&
> + printf "last_update_token\0"
> :>marker
> EOF
> clean_repo &&
> diff --git a/t/t7519/fsmonitor-all b/t/t7519/fsmonitor-all
> index 691bc94dc2..94ab66bd3d 100755
> --- a/t/t7519/fsmonitor-all
> +++ b/t/t7519/fsmonitor-all
> @@ -17,7 +17,6 @@ fi
>
> if test "$1" != 1
> then
> - echo "Unsupported core.fsmonitor hook version." >&2
> exit 1
> fi
>
> diff --git a/t/t7519/fsmonitor-watchman b/t/t7519/fsmonitor-watchman
> index d8e7a1e5ba..264b9daf83 100755
> --- a/t/t7519/fsmonitor-watchman
> +++ b/t/t7519/fsmonitor-watchman
> @@ -26,8 +26,7 @@ if ($version == 1) {
> # subtract one second to make sure watchman will return all changes
> $time = int ($time / 1000000000) - 1;
> } else {
> - die "Unsupported query-fsmonitor hook version '$version'.\n" .
> - "Falling back to scanning...\n";
> + exit 1;
> }
>
> my $git_work_tree;
> --
> gitgitgadget
>
>
On the Git mailing list, Johannes Schindelin wrote (reply to this):
|
This patch series was integrated into pu via git@03d594d. |
This patch series was integrated into pu via git@45de354. |
This patch series was integrated into next via git@3c42195. |
This patch series was integrated into pu via git@8096fd0. |
This patch series was integrated into pu via git@37bd9e9. |
This patch series was integrated into pu via git@c9a33e5. |
This patch series was integrated into next via git@c9a33e5. |
This patch series was integrated into master via git@c9a33e5. |
Closed via c9a33e5. |
Only change for this version is to sync the watchman script that is in templates with the one in t/t7519 since it had some commented out debug code that could be useful.
Also haven't received any feedback on this patch series which would be good to make sure I'm not making any obvious mistakes.
Thanks!