
Feature request: Determine version from Git service API #1122

Closed
strich opened this issue Dec 18, 2016 · 22 comments

@strich

strich commented Dec 18, 2016

I've written a little web service that automates building projects and other things. I'd like to be able to determine the next version tag without having to download the Git repo. The Git repo history can be retrieved via the service API for GitHub, BitBucket, etc. Would it be possible to write an interface to support that?

@asbjornu
Member

@strich: Could you please provide a bit more detail on how this would work? Without the Git history, GitVersion is basically clueless and can't figure out a version number.

@strich
Author

strich commented Dec 19, 2016

It is possible to retrieve the Git history and tags via API calls to GitHub, BitBucket, etc.
For example, here is the JSON output for the Git commits for GitVersion: https://api.github.com/repos/GitTools/GitVersion/commits

We should be able to transform that data into whatever format GitVersion requires to make a decision, no?
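
A minimal sketch of that kind of transformation, written in Python for brevity (GitVersion itself is C#). The JSON field names (`sha`, `commit.message`, `commit.author.date`, `parents[].sha`) follow the shape of the commits endpoint linked above. The `Commit` type, the `commits_from_github` helper and the sample payload are made up for illustration and are not part of GitVersion.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any


@dataclass(frozen=True)
class Commit:
    """Hypothetical cut-down commit record; not a GitVersion type."""
    sha: str
    message: str
    when: datetime
    parents: tuple[str, ...]


def commits_from_github(payload: list[dict[str, Any]]) -> list[Commit]:
    """Map the JSON array from the commits endpoint onto Commit records."""
    commits = []
    for item in payload:
        # Timestamps are ISO 8601 in UTC with a trailing "Z".
        stamp = item["commit"]["author"]["date"].replace("Z", "+00:00")
        commits.append(
            Commit(
                sha=item["sha"],
                message=item["commit"]["message"],
                when=datetime.fromisoformat(stamp),
                parents=tuple(p["sha"] for p in item["parents"]),
            )
        )
    return commits


# Trimmed, made-up sample in the shape the endpoint returns.
sample = [
    {
        "sha": "c3",
        "commit": {
            "message": "Merge feature",
            "author": {"date": "2016-12-19T10:00:00Z"},
        },
        "parents": [{"sha": "c2"}, {"sha": "f1"}],
    }
]

commits = commits_from_github(sample)
```

The commits listing alone carries no tag or branch information, so a real provider would presumably need further API calls for those.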

@asbjornu
Member

Ah, I see. Well, the format GitVersion expects now is that which is provided by LibGit2Sharp. GitVersion has its own internal domain model today, but it's not a full abstraction on the data provided by LibGit2Sharp.

To get data from different sources, we would probably need to create a full internal abstraction inside GitVersion, which I believe would be a pretty difficult thing to implement. But if you think this is worthwhile, I wouldn't oppose a pull request implementing such a thing.

@strich
Author

strich commented Dec 19, 2016

> but it's not a full abstraction on the data provided by LibGit2Sharp.

What does this mean exactly? The data is abstracted, but the way it is modeled is coupled with how the data looks coming out of LibGit2Sharp?

I haven't yet explored the codebase at all - Do you think it is decoupled from LibGit2Sharp well enough that implementing such a feature in its most basic form would be easy?

@asbjornu
Member

To be honest, I don't know. The current abstraction is not in place to make GitVersion independent from LibGit2Sharp, so if the abstraction is complete enough to allow that data to come from another source, that's a coincidence rather than a design decision, I think. That being said, it might take little work to complete the abstraction. Please have a look at the source code and report your findings. 😄

@strich
Author

strich commented Dec 29, 2016

I had a bit of a look at this and it seems to me like I would need to rewrite/refactor...a lot. Everything uses the core Branch/Commit classes provided by LibGit2Sharp, which has a ton of stuff running under the hood to collect and parse everything into the many fields in those classes.

It does not appear that I'll be able to take this route unfortunately.

@asbjornu
Member

That was my suspicion. Sorry it didn't prove easier to do, but thanks for investigating nonetheless!

@JakeGinnivan
Contributor

This is a direction I have wanted to go in for a while, mainly for performance reasons. If we can build up a domain model of the information we actually need and then query that model, we only ever have to interrogate the git repository once.

A first step would be to introduce our own cut-down Branch/Commit types and start using them. Then just take it step by step.
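
As a rough illustration of the "build the model once, then query it" idea, here is a sketch in Python (GitVersion itself is C#). `Commit`, `Branch` and `RepositorySnapshot` are hypothetical names chosen for this example, not existing GitVersion or LibGit2Sharp types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    sha: str
    parents: tuple[str, ...] = ()


@dataclass(frozen=True)
class Branch:
    name: str
    tip: str  # sha of the commit the branch points at


class RepositorySnapshot:
    """Built once from the real repository, then queried purely in memory."""

    def __init__(self, commits, branches):
        self._commits = {c.sha: c for c in commits}
        self._branches = tuple(branches)

    def ancestors(self, sha):
        """Return sha plus every known commit reachable through parents."""
        seen, stack = set(), [sha]
        while stack:
            current = stack.pop()
            if current in seen or current not in self._commits:
                continue
            seen.add(current)
            stack.extend(self._commits[current].parents)
        return seen

    def branches_containing(self, sha):
        """Names of branches whose history includes the given commit."""
        return [b.name for b in self._branches if sha in self.ancestors(b.tip)]


# History: a <- b <- c (main), with d branching off b (feature).
snapshot = RepositorySnapshot(
    commits=[Commit("a"), Commit("b", ("a",)), Commit("c", ("b",)), Commit("d", ("b",))],
    branches=[Branch("main", "c"), Branch("feature", "d")],
)
```

Because the types are frozen and hold only plain values, nothing in the version calculation would need to reach back into the repository after the snapshot is built.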

@strich
Author

strich commented Dec 30, 2016

I didn't go so far as to profile exactly how much of LibGit2Sharp GitVersion uses (i.e. any actions that aren't read-only, like tagging and pushing said tags), but on the surface it didn't look too bad. I think there are a number of convenience functions in use, like the commit filtering and whatnot, that would have to be rewritten without LibGit2Sharp though.

@strich
Author

strich commented Dec 30, 2016

Despite myself I went ahead and practically removed LibGit2Sharp from the GitVersion.Core project just to see what the damage would be - Not too bad! There are a few loose ends that have deeper rabbit holes behind them, like the CommitFilter stuff, but the use of it in GitVersion appears to be fairly basic and could be refactored to use more traditional transform logic.

I'm not yet sure what I plan to do here - It is a big leap from hacking GitVersion apart into a workable state without LibGit2Sharp to actually doing it properly with whatever design paradigms the maintainers prefer, etc.

@asbjornu
Member

@strich: Whatever you find worth doing, do it! We're thankful for pull requests whatever they contain and can discuss the direction of it once we have something more concrete to discuss. Anything that abstracts LibGit2Sharp more than the complete lack of abstraction we have today would be great, so I'm not picky on how we do it, tbh. Immutability sounds like a good idea, though.

@JakeGinnivan
Contributor

If I was doing this I would switch out the GitTools.Core NuGet reference for a project reference to start. Then I would start at https://github.com/GitTools/GitTools.Core/blob/master/src/GitTools.Core/GitTools.Core.Shared/Git/RepositoryLoader.cs#L7

This would return a GitRepositoryMetadata type. Then remove LibGit2Sharp temporarily, which would allow you to use that GitRepositoryMetadata class to provide all the info GitVersion needs. https://github.com/GitTools/GitTools.Core/blob/master/src/GitTools.Core/GitTools.Core.Shared/Git/Extensions/LibGitExtensions.cs would be a decent start, but you may be able to remove those primitive queries and move to a higher-level query which can be optimised more easily for the different targets.

The way GitVersion works is it just queries Git for a bunch of info, like the merge bases for the current commit, what branches the commit is on, etc., then uses all that info to work out the version. In theory this would be achievable.
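
To make one of those queries concrete, here is a sketch of a merge-base lookup over an in-memory parent map, in Python rather than GitVersion's C#. The `parents` dictionary and both helper names are assumptions for this example. It is also a simplification: Git's own merge-base logic selects the best common ancestors and can report several for criss-cross merges, whereas this returns the first common ancestor found breadth-first from the second commit.

```python
from collections import deque


def ancestors(parents, sha):
    """Return sha plus every commit reachable by following parent links."""
    seen, stack = set(), [sha]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(parents.get(current, ()))
    return seen


def merge_base(parents, a, b):
    """Return a common ancestor of a and b, or None if the histories are unrelated."""
    reachable_from_a = ancestors(parents, a)
    queue, seen = deque([b]), set()
    while queue:
        current = queue.popleft()
        if current in reachable_from_a:
            return current
        if current in seen:
            continue
        seen.add(current)
        queue.extend(parents.get(current, ()))
    return None


# History: a <- b <- c, with d branching off b; x is unrelated.
parents = {"a": (), "b": ("a",), "c": ("b",), "d": ("b",), "x": ()}
base = merge_base(parents, "c", "d")
```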

The other consideration is the tests. GitVersion has great test coverage: each test creates a temporary git repository, creates commits in it, then runs GitVersion against that repo. The first step would be to get the abstraction in place and ensure all the tests still pass. Then we can land that. The next step would be to create a GitHub provider which implements the abstraction. I don't think we could create hundreds of temporary GitHub repos for each test, so we will probably have to create a mock API which mimics the real GitHub API by querying the underlying git repo the tests create. Then you would run each test for each abstraction provider.

This would be a great opportunity to kill the Git repo normalisation which is done on the build server because we can fetch pull request refs in the metadata provider and store them for later.

Hope that is useful. Any other questions, just ask.

@strich
Author

strich commented Dec 31, 2016

I spent most of yesterday on a little hack'n'slash expedition just to try to get an abstraction layer working and learn a little more about this project - I got it compiling and working. I didn't pay much mind to what might be the right way to do it; I just wanted to get a first iteration sorted. The way it currently works is that it has its own internal model of Repository, Branch, Commit, Tag, etc., which is populated once by the GitPreparer class. That class will eventually become abstract and have an implementation for each provider (Libgit, GitHub, Bitbucket, etc). The intention is that this is the only place where provider code is used, so as to encapsulate it a bit.
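
The shape described above could look roughly like the following, sketched in Python rather than GitVersion's C#. `RepositoryModel`, `RepositoryProvider`, `InMemoryProvider` and `tags_at` are hypothetical names for illustration; the branch itself uses GitPreparer for this role, and its actual types are not shown in this thread.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class RepositoryModel:
    """Provider-independent data that the version calculation would read."""
    commits: dict   # sha -> tuple of parent shas
    branches: dict  # branch name -> tip sha
    tags: dict      # tag name -> sha


class RepositoryProvider(ABC):
    """The only place where provider-specific code would live."""

    @abstractmethod
    def load(self) -> RepositoryModel:
        """Populate the internal model once from the underlying source."""


class InMemoryProvider(RepositoryProvider):
    """Stand-in for a LibGit2Sharp-, GitHub- or Bitbucket-backed provider."""

    def __init__(self, commits, branches, tags):
        self._data = (commits, branches, tags)

    def load(self) -> RepositoryModel:
        commits, branches, tags = self._data
        return RepositoryModel(dict(commits), dict(branches), dict(tags))


def tags_at(model: RepositoryModel, sha: str) -> list:
    """Example consumer: it sees only the model, never the provider."""
    return sorted(name for name, target in model.tags.items() if target == sha)


provider = InMemoryProvider(
    commits={"a": (), "b": ("a",)},
    branches={"main": "b"},
    tags={"v1.0.0": "a", "v1.1.0": "b", "latest": "b"},
)
model = provider.load()
```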

The main worry and holdup for me at the moment is the more complex commit tree sorting and filtering from libgit - I suppose I'll need to reimplement that in some form or another for the internal model as it is used heavily.

@JakeGinnivan thanks for the info. Though I'll be honest I'm not entirely sure I follow - RepositoryLoader doesn't appear to be used in GitVersionCore at all? I think my plan for the moment will be to follow my current implementation as above, and once I'm happy that it satisfies most tests we can review and discuss further.

With regards to the tests, I'd be curious to know thoughts on whether the scope now changes - If we move GitVersion to using only its own internal model of the repository, then there may be no need for test coverage of anything outside the internal model? I.e. testing API call returns and local git repos may now be out of scope?

@JakeGinnivan
Contributor

@strich did you want to push your code up somewhere? I would be interested in looking at it.

In terms of tests, our internal abstraction should be able to be built off a git repo on the file system, so the tests should not have to change at all to support this change. It's an implementation detail.

We would then need other tests to simply validate our internal model can be populated correctly from multiple sources.

Sorry I took so long to respond. Been a crazy few months.

@strich
Author

strich commented Feb 26, 2017

Hey @JakeGinnivan. Yeah early Jan I got so far as getting it "working" in some form or another, which you can review in my forked branch here.

I got a bit held up by some GitHub API traffic restrictions, and then more so by potentially having to rewrite the tree sorting methods from LibGit used in the core version determination algorithms of this project.

Furthermore, I decided to use the OctoKit project for the GitHub integration, which requires .NET 4.5 and so my branch will only compile on that at the moment.

I'm still keen to get this done, but it is a bit of an uphill battle for me as I don't think I fully understand the core GitVersion code. I think I could get a bit more motivation if I had some help, that's for sure. :)

@strich
Author

strich commented Feb 26, 2017

I had to go back over my commits to see where I got to:

The GitHub integration works fine but will need:

  1. Authorized app mode - Based on the number of potential API calls required it would be beneficial to create a GitVersion app with GitHub to increase the number of API calls allowed per hour.
  2. Authorized user - So one can access private repos.
  3. All the error case handling around the above.
  4. Optimization pass - API calls are currently the least-effort ones. They can be much more efficiently handled with async, paging, etc to reduce calls.
  5. Generalized so any service provider (BitBucket, GitLab, etc) will hopefully work.
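
For the paging part of item 4, a sketch of following GitHub's `Link` response header until no `rel="next"` entry remains, in Python rather than the branch's C#. `next_page_url`, `fetch_all` and the `fetch` callable are hypothetical; `fetch` stands in for whatever HTTP client is used and is assumed to return the decoded items plus the raw `Link` header value.

```python
import re

# Matches entries such as: <https://api.github.com/...?page=2>; rel="next"
_LINK_PATTERN = re.compile(r'<([^>]+)>;\s*rel="([^"]+)"')


def next_page_url(link_header):
    """Return the URL tagged rel="next" in a Link header, or None."""
    if not link_header:
        return None
    for url, rel in _LINK_PATTERN.findall(link_header):
        if rel == "next":
            return url
    return None


def fetch_all(first_url, fetch):
    """Collect items from every page; fetch(url) returns (items, link_header)."""
    items, url = [], first_url
    while url:
        page, link_header = fetch(url)
        items.extend(page)
        url = next_page_url(link_header)
    return items


# Fake two-page API used in place of real HTTP calls.
_pages = {
    "https://example.invalid/commits?page=1": (
        ["c3", "c2"],
        '<https://example.invalid/commits?page=2>; rel="next", '
        '<https://example.invalid/commits?page=2>; rel="last"',
    ),
    "https://example.invalid/commits?page=2": (["c1"], None),
}
all_commits = fetch_all("https://example.invalid/commits?page=1", _pages.__getitem__)
```

Requesting larger pages where the API allows it would reduce the number of calls counted against the hourly limit.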

The main task left however is stripping out all of the remaining LibGit2Sharp methods within the GitVersion algorithms - There are key things in there like tree sorting and commit/HEAD/branch finding that need to be rewritten to work with the internal domain model. This is the thing I struggle with the most, and I did make a start on it, but I'm unsure if it is actually correct.

@asbjornu
Member

@strich: Sorry about the silence. I'm glad you've made progress and spent so much time investigating this. It does sound like trying to create a full abstraction of LibGit2Sharp might be the wrong path, though?

@strich
Author

strich commented Jun 22, 2017

@asbjornu I'm not sure how else one could do it though - Trying to force feed data into LibGit2Sharp's own internal structures seems like ultimately the more difficult thing to do. The state I left this task in isn't too bad - There are just a couple of places where you're asking for some specific tree sorting to be done that needs to be reimplemented outside of LibGit2Sharp's systems.

@asbjornu
Member

If those algorithms are simple enough and stable, I agree that reimplementation might be a valid course. But if they are complex and volatile, I'm not sure it's worth it.

@strich
Author

strich commented Jun 23, 2017

They looked fairly simple to me - Just doing tree sorting forwards/backwards within branches, etc. It was just a bit beyond me at the time - I didn't really know how git commit parent/child relationships worked and whatnot.
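
For reference, a sketch of how such an ordering can be derived from parent links alone, using Kahn's algorithm in Python. This is not the LibGit2Sharp implementation, and `topological_order` and the `parents` map are names made up for this example. Each commit lists its parents; a commit becomes eligible for output once all of its known parents have been emitted.

```python
from collections import deque


def topological_order(parents):
    """Order shas so that every commit appears after all of its parents."""
    children = {sha: [] for sha in parents}
    pending = {}
    for sha, parent_shas in parents.items():
        # Ignore parents outside the map (e.g. truncated history).
        known = [p for p in parent_shas if p in parents]
        pending[sha] = len(known)
        for parent in known:
            children[parent].append(sha)

    # Start from root commits, i.e. those with no known parents.
    ready = deque(sorted(sha for sha, count in pending.items() if count == 0))
    order = []
    while ready:
        sha = ready.popleft()
        order.append(sha)
        for child in children[sha]:
            pending[child] -= 1
            if pending[child] == 0:
                ready.append(child)
    return order


# History: a <- b, then c and d both branch from b, and m merges c and d.
parents = {"a": (), "b": ("a",), "c": ("b",), "d": ("b",), "m": ("c", "d")}
oldest_first = topological_order(parents)
newest_first = list(reversed(oldest_first))
```

Reversing the result gives a newest-first walk in which no parent is visited before its children.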

@asbjornu
Member

@strich: Right. Seems like @JakeGinnivan has gone ahead and started this work with #1243, so please take a look at that and contribute if you can!

@stale

stale bot commented Jun 29, 2019

This issue has been automatically marked as stale because it has not had recent activity. After 30 days from now, it will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Jun 29, 2019
@stale stale bot closed this as completed Jul 29, 2019