Skip to content

Replace more instances of the_index and the_repository #379

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
dscho opened this issue Oct 8, 2019 · 14 comments
Open

Replace more instances of the_index and the_repository #379

dscho opened this issue Oct 8, 2019 · 14 comments
Labels
good first issue Good for newcomers

Comments

@dscho
Copy link
Member

dscho commented Oct 8, 2019

Once upon a time, Git was intended to be just like all the other simplistic Unix utilities, a set of commands to be cobbled together by shell scripts. In those times, it was (deemed) okay to cut corners, using global variables all over the place, and leaving it to exit() to "clean up" allocated memory.

In the meantime, Git has matured a lot, and most of its code is in proper C, and quite some of that code has been prepared for more versatile use already, e.g. for in-process submodule handling, by avoiding global variables.

Two of those global variables are the_index and the_repository. The former represents "the" Git index (AKA staging area) while the latter represents the .git directory.

Ideally, we would have no code in libgit.a (i.e. the internal API functions) that access the_index or the_repository directly. Rather, they should be passed through as function parameters, much like object-oriented languages implicitly pass a this or self parameter.

The convention is to give the repository parameter the short-and-sweet name r, and the index parameter the pretty intuitive name index. A good example to follow is this commit: 90d3405

Note that it is not necessary to pass through both the index and the repository, at least not in general, as the_repository->index is a thing. Only in rare circumstances will a function want to work specifically on a temporary, or potentially temporary, index. Meaning: when adding a struct the_repository *r parameter to a function's signature that already has a struct index_state *index, the latter parameter can be removed.

@dscho dscho added the good first issue Good for newcomers label Oct 8, 2019
@aviskarkc10
Copy link

I don't have a lot of experience working in C but can I claim this?

@dscho
Copy link
Member Author

dscho commented Oct 10, 2019

I don't have a lot of experience working in C

Excellent, "good first issue" is meant for exactly this case.

but can I claim this?

Sure, why don't you start git grep the_index or git grep the_repository and pick one file (as long as there are any matches other than builtin/*.c, you probably want to focus on those other matches)? You don't have to do all of them, that would be way too much.

I could even imagine that one file would also be too much, as there might be multiple affected functions, and you could start with extending just one function's signature so that the index (or the repository) parameter is passed in. To make this build, you will of course need to adjust all callers (Personally, I would use git grep <function-name> to find those, maybe you want to do that, too, or you can use Visual Studio Code or any other application that helps you find callers).

@aviskarkc10
Copy link

Sure. Thanks a lot.

@periperidip
Copy link

periperidip commented Jun 1, 2020

In the meantime, Git has matured a lot, and most of its code is in proper C, and quite some of that code has been prepared for more versatile use already, e.g. for in-process submodule handling, by avoiding global variables.

Hello Dscho! Could you explain this part in just a bit more detail please? Would be of great help! I am working on submodules as well and there are some things I wanna clear out! :)

E: From what I have understood so far is that with the introduction of submodules, a concept of repo inside a repo has come into being. This would mean that a global the_repository won't suffice and rather, we need presence of local variables denoting each repo here right? Hence we declare a reincarnated version of global the_index and global the_repository which are local r and local index?

The convention is to give the repository parameter the short-and-sweet name r, and the index parameter the pretty intuitive name index. A good example to follow is this commit: 90d3405

And this serves the purpose of simplifying the variable name right?

Ideally, we would have no code in libgit.a (i.e. the internal API functions) that access the_index or the_repository directly. Rather, they should be passed through as function parameters, much like object-oriented languages implicitly pass a this or self parameter.

Could you explain this please?

So, if there were a hypothetical world without submodules, then there wouldn't have been any immediate need to make this switch right?

Thanks,
periperidip

@dscho
Copy link
Member Author

dscho commented Jun 1, 2020

Well, those instances wouldn't be local variables forever, but rather function parameters. Have you followed the link I provided as an example?

@periperidip
Copy link

Well, those instances wouldn't be local variables forever, but rather function parameters. Have you followed the link I provided as an example?

I did just now. I see they are function parameters just like you said.

So, if there were a hypothetical world without submodules, then there wouldn't have been any immediate need to make this switch right?

In this case, we would not have really needed to do this right?

@dscho
Copy link
Member Author

dscho commented Jun 2, 2020

There are many advantages to using sort of an object-oriented design in this area. Having the option to support submodules in-process (including transparently recursive diffs) is only one of them.

@chinmoy12c
Copy link

Sure, why don't you start git grep the_index or git grep the_repository and pick one file (as long as there are any matches other than builtin/*.c, you probably want to focus on those other matches)? You don't have to do all of them, that would be way too much.

Hey, I'm planning to work on this, but I cannot find more references to the_index or the_repository other than those in builtin/*.c. Do the files in builtin need these changes as well or not?

@dscho
Copy link
Member Author

dscho commented Mar 21, 2021

https://github.com/git/git/search?q=the_repository&type=code shows at least one match outside of builtin/: shallow.c...

I think that you could find these locations via git grep the_repository -- '*.c' ':(exclude)builtin/*' or some variation thereof (sorry, I can't test this right now).

@chinmoy12c
Copy link

chinmoy12c commented Mar 24, 2021

Hey, @dscho, I'm trying to replace the_repository in cache-tree.c. I have changed cache_tree_update for which I have to make a few changes to function calls in builtin/*.c files. Should I go ahead with making the changes?

@dscho
Copy link
Member Author

dscho commented Mar 24, 2021

Hey, @dscho, I'm trying to replace the_repository in cache-tree.c. I have changed cache_tree_update for which I have to make a few changes to function calls in builtin/*.c files. Should I go ahead with making the changes?

Sure. Did you define struct repository *r = the_repository; in builtin/'s functions, and then pass r as first parameter?

@chinmoy12c
Copy link

Hey, @dscho, I'm trying to replace the_repository in cache-tree.c. I have changed cache_tree_update for which I have to make a few changes to function calls in builtin/*.c files. Should I go ahead with making the changes?

Sure. Did you define struct repository *r = the_repository; in builtin/'s functions, and then pass r as first parameter?

Yes, I'll link a PR today for review. 😄 I guess passing the_repository directly in tests is okay?

@dscho
Copy link
Member Author

dscho commented Mar 24, 2021

Hey, @dscho, I'm trying to replace the_repository in cache-tree.c. I have changed cache_tree_update for which I have to make a few changes to function calls in builtin/*.c files. Should I go ahead with making the changes?

Sure. Did you define struct repository *r = the_repository; in builtin/'s functions, and then pass r as first parameter?

Yes, I'll link a PR today for review. 😄 I guess passing the_repository directly in tests is okay?

The tests are implemented in shell. Unless you're referring to test helpers, where it is totally okay to use the_repository.

@chinmoy12c
Copy link

@dscho , please have a look at the PR and suggest necessary changes. :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
good first issue Good for newcomers
Projects
None yet
Development

No branches or pull requests

4 participants