-
Notifications
You must be signed in to change notification settings - Fork 141
create git-bugreport utility #561
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Starting in 3ac68a9, help.o began to depend on builtin/branch.o, builtin/clean.o, and builtin/config.o. This meant that help.o was unusable outside of the context of the main Git executable. To make help.o usable by other commands again, move list_config_help() into builtin/help.c (where it makes sense to assume other builtin libraries are present). When command-list.h is included but a member is not used, we start to hear a compiler warning. Since the config list is generated in a fairly different way than the command list, and since commands and config options are semantically different, move the config list into its own header and move the generator into its own script and build rule. Signed-off-by: Emily Shaffer <[email protected]>
It may be useful to know which shell Git was built to try to point to, in the event that shell-based Git commands are failing. $SHELL_PATH is set during the build and used to launch the manpage viewer, as well as by git-compat-util.h, and it's used during tests. 'git version --build-options' is encouraged for use in bug reports, so it makes sense to include this information there. Signed-off-by: Emily Shaffer <[email protected]>
There is an issue in commit fddc0d4: |
There is an issue in commit ed10318: |
There is an issue in commit eb9e472: |
There is an issue in commit 415dc3e: |
There is an issue in commit 0124e13: |
There is an issue in commit 1c72635: |
There is an issue in commit 688d337: |
There is an issue in commit 2cc27f2: |
Teach Git how to prompt the user for a good bug report: reproduction steps, expected behavior, and actual behavior. Later, Git can learn how to collect some diagnostic information from the repository. If users can send us a well-written bug report which contains diagnostic information we would otherwise need to ask the user for, we can reduce the number of question-and-answer round trips between the reporter and the Git contributor. Users may also wish to send a report like this to their local "Git expert" if they have put their repository into a state they are confused by. Signed-off-by: Emily Shaffer <[email protected]>
Knowing which version of Git a user has and how it was built allows us to more precisely pin down the circumstances when a certain issue occurs, so teach bugreport how to tell us the same output as 'git version --build-options'. It's not ideal to directly call 'git version --build-options' because that output goes to stdout. Instead, wrap the version string in a helper within help.[ch] library, and call that helper from within the bugreport library. Signed-off-by: Emily Shaffer <[email protected]>
2cc27f2
to
ea25130
Compare
There is an issue in commit af3d547337375f3d29ef1e192096b0c84af2cd1c: |
There is an issue in commit 28fec635b66eb866bd5362db1c66d2be391aadc6: |
There is an issue in commit 72f61175e3e608d7c343bd14ac5111e18ff5254a: |
There is an issue in commit ea25130cc8a55e302974bda65249124a9e989325: |
The contents of uname() can give us some insight into what sort of system the user is running on, and help us replicate their setup if need be. The domainname field is not guaranteed to be available, so don't collect it. Signed-off-by: Emily Shaffer <[email protected]>
To help pinpoint the source of a regression, it is useful to know some info about the compiler which the user's Git client was built with. By adding a generic get_compiler_info() in 'compat/' we can choose which relevant information to share per compiler; to get started, let's demonstrate the version of glibc if the user built with 'gcc'. Signed-off-by: Emily Shaffer <[email protected]>
It's possible for git-remote-curl to be built separately from git; in that case we want to know what version of cURL is used by git-remote-curl, not necessarily which version was present at git-bugreport's build time. So instead, ask git-remote-curl for the version information it knows about. Today, "git-remote-http" and "git-remote-https" are aliased to "git-remote-curl"; but in case we rely on a different library than cURL in the future, let's not explicitly reference cURL from bugreport. For longevity purposes, invoke the alias "git-remote-https" instead of "git-remote-http". Since it could have been built at a different time, also report the version and built-from commit of git-remote-curl alongside the cURL info. Signed-off-by: Emily Shaffer <[email protected]>
It's possible a user may complain about the way that Git interacts with their interactive shell, e.g. autocompletion or shell prompt. In that case, it's useful for us to know which shell they're using interactively. Signed-off-by: Emily Shaffer <[email protected]>
Add a new step to the build to generate a safelist of git-config variables which are appropriate to include in the output of git-bugreport. New variables can be added to the safelist by annotating their documentation in Documentation/config with the "annotate" macro, which is a no-op in AsciiDoc and AsciiDoctor. Some configs are private in nature, and can contain remote URLs, passwords, or other sensitive information. In the event that a user doesn't notice their information while reviewing a bugreport, that user may leak their credentials to other individuals, mailing lists, or bug tracking tools inadvertently. Heuristic blocklisting of configuration keys is imperfect and prone to false negatives; given the nature of the information which can be leaked, a safelist is more reliable. However, it's possible that in some situations, an organization may be less concerned with privacy of things like remote URLs and branch names, and more concerned with ease of diagnosis for their support staff. In those cases, it may make more sense for that organization to modify the code to use a blocklist. To that end, we should try to mark configs which are definitely safe, and configs which are definitely unsafe, and leave blank configs which are somewhere in between. To mark a config as safe, add "annotate:bugreport[include]" to the corresponding line in the config documentation; to mark it as unsafe, add "annotate:bugreport[exclude]" instead. Generating bugreport-config-safelist.h at build time by grepping the documentation for this new macro helps us prevent staleness. The macro itself is a no-op and should not alter the appearance of the documentation in either AsciiDoc or AsciiDoctor, confirmable by running: cd Documentation ./doc-diff --asciidoctor HEAD^ HEAD ./doc-diff --asciidoc HEAD^ HEAD Diffing the rendered HTML shows that only inline comments were added, which shouldn't be a problem. Additionally, add annotations to the sendemail config documentation in order to demonstrate a proof of concept. Helped-by: Martin Ågren <[email protected]> Helped-by: Johannes Schindelin <[email protected]> Signed-off-by: Emily Shaffer <[email protected]>
Teach bugreport to gather the values of config options which are present in 'bugreport-config-safelist.h', and show their origin scope. Many config options are sensitive, and many Git add-ons use config options which git-core does not know about; it is better only to gather config options which we know to be safe, rather than excluding options which we know to be unsafe. Rather than including the path to someone's config, which can reveal filesystem layout and project names, just name the scope (e.g. system, global, local) of the config source. Signed-off-by: Emily Shaffer <[email protected]>
Occasionally a failure a user is seeing may be related to a specific hook which is being run, perhaps without the user realizing. While the contents of hooks can be sensitive - containing user data or process information specific to the user's organization - simply knowing that a hook is being run at a certain stage can help us to understand whether something is going wrong. Without a definitive list of hook names within the code, we compile our own list from the documentation. This is likely prone to bitrot. To reduce the amount of code humans need to read, we turn the list into a string_list and iterate over it (as we are calling the same find_hook operation on each string). However, since bugreport should primarily be called by the user, the performance loss from massaging the string seems acceptable. Signed-off-by: Emily Shaffer <[email protected]>
The number of unpacked objects in a user's repository may help us understand the root of the problem they're seeing, especially if a command is running unusually slowly. Helped-by: Johannes Schindelin <[email protected]> Signed-off-by: Emily Shaffer <[email protected]>
Alongside the loose object counts, it can be useful to show the number of packs and packed objects. This way we can check whether the repo has an appropriate ratio of packed to loose objects to help determine whether it's behaving correctly. Add a utility to easily traverse all packfiles in a given repository. Use it in packfile.c and remove a redundant call to prepare_packed_git(), which is already called in get_all_packs(). Signed-off-by: Emily Shaffer <[email protected]>
Miscellaneous information used about the object store can end up in .git/objects/info; this can help us understand what may be going on with the object store when the user is reporting a bug. Otherwise, it could be difficult to track down what is going wrong with an object which isn't kept locally to .git/objects/ or .git/objects/pack. Having some understanding of where the user's objects may be kept can save us some hops during the bug reporting process. Signed-off-by: Emily Shaffer <[email protected]>
In some cases, it could be that the user is having a problem with an object which isn't present in their normal object directory. We can get a hint that that might be the case by examining the list of alternates where their object may be stored instead. Since paths to alternates may be sensitive, we'll instead count how many alternates have been specified and note how many of them exist or are broken. While object-cache.h describes a function "foreach_alt_odb()", this function does not provide information on broken alternates, which are skipped over in "link_alt_odb_entry()". Since the goal is to identify missing alternates, we can gather the contents of .git/objects/info/alternates manually. Signed-off-by: Emily Shaffer <[email protected]>
ea25130
to
c301beb
Compare
Created by quimby