Closed
Description
The current solution has quite a few problems, the most important being that it is hard to use. It is written in Python, which is inconvenient for developers who have no Python experience.
Problems with the current solution
- Hard to install: it requires Python.
- The output is too large: JSON takes too much memory and CPU to process, and GRPCGateway and filechanges use a lot of memory.
- It is slow, so we have to skip large repos.
Requirements:
- It must have the same features as the current solution
- Binary executable (Windows, Linux, OSX)
- The output must be parsable line by line to avoid loading everything into the memory
- It must be at least 10 times faster. It must be able to process these in less than 10 minutes:
- https://github.com/aosp-mirror/platform_frameworks_base
- https://github.com/eXpandFramework/eXpand
- https://github.com/expand/eXpand
- https://github.com/eXpandFramework/eXpand.lab
- https://github.com/eXpandFramework/lab
- https://github.com/laravel-enso/enso
- https://github.com/fellipegpbotelho/odonto-uni
- Import the existing trained Python model into Go to detect similar emails. For example, if I have commits with [email protected], [email protected], and [email protected], it has to recognize that they come from the same user.
Nice to have
- Merge with multi_repo_info_extractor, so that multiple repos can be extracted by passing tokens or credentials.
- Serverless compatibility (Go is available in Google Cloud)
- Parse the code instead of using regex, to improve the accuracy of the import detection
- Minimize disk IO: don't check out the code, do it in memory
- Support multiple outputs. Easily extendable by the community.
- Recognize squashed commits, not just merges
- GUI
If you have any suggestions or have run into problems with the current implementation, please share.