A Python script that gathers metadata for all repositories in a GitHub organization and automatically exports the data into a desired Google Sheet (using a Google Cloud Console Service Account) for easy viewing and analysis.
- Fetches all repositories in an organization
- Collects key details:
- Repo visibility, name and description
- Date created and last updated
- Creator and top 4 contributors (an `N/A` creator means the repository was either transferred or forked, and `None (<GitHub Username>)` means no full name is attached to that GitHub account)
- Number of stars and number of branches
- Presence of README, license, `.gitignore`, package requirements (`requirements.txt`, `environment.yaml`, etc.), `CITATION.cff`, `.zenodo.json`, and contributor files
- Primary programming language
- Presence of a website reference, dataset, model, paper association, and DOI for the GitHub repo
- Exports everything to a given Google Sheet. The service account must be listed as an Editor in the sheet's sharing settings.
- Highlights “No” fields with red cell colors
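
As a rough illustration of the presence checks and the red "No" highlighting listed above, here is a minimal sketch. It is not the code in `export_repos.py`: the function names, the `PRESENCE_CHECKS` mapping, the accepted file names, and the "Yes"/"No" cell values are all assumptions made for this example. The highlight step only builds `repeatCell` request bodies in the shape accepted by the Google Sheets API `spreadsheets.batchUpdate` method; it does not send them.

```python
from typing import Dict, Iterable, List

# Hypothetical mapping from sheet column to accepted root-level file names.
# The real script's matching rules may differ.
PRESENCE_CHECKS = {
    "README": ("readme", "readme.md", "readme.rst", "readme.txt"),
    "License": ("license", "license.md", "license.txt"),
    ".gitignore": (".gitignore",),
    "Package requirements": ("requirements.txt", "environment.yaml"),
    "CITATION.cff": ("citation.cff",),
    ".zenodo.json": (".zenodo.json",),
}

# Sheets API colors are floats in the range 0..1.
RED = {"red": 1.0, "green": 0.0, "blue": 0.0}


def presence_flags(root_files: Iterable[str]) -> Dict[str, str]:
    """Return "Yes" or "No" per column, based on a repo's root file names."""
    names = {name.lower() for name in root_files}
    return {
        column: "Yes" if any(c in names for c in candidates) else "No"
        for column, candidates in PRESENCE_CHECKS.items()
    }


def no_cell_requests(rows: List[List[str]], sheet_id: int = 0,
                     first_row: int = 1) -> List[dict]:
    """Build one repeatCell request per cell whose value is exactly "No".

    first_row is the 0-based sheet row of the first data row; the default
    of 1 assumes a single header row above the data.
    """
    requests = []
    for r, row in enumerate(rows):
        for c, value in enumerate(row):
            if value != "No":
                continue
            requests.append({
                "repeatCell": {
                    "range": {
                        "sheetId": sheet_id,
                        "startRowIndex": first_row + r,
                        "endRowIndex": first_row + r + 1,
                        "startColumnIndex": c,
                        "endColumnIndex": c + 1,
                    },
                    "cell": {"userEnteredFormat": {"backgroundColor": RED}},
                    # Only touch the background color, leave other formatting.
                    "fields": "userEnteredFormat.backgroundColor",
                }
            })
    return requests
```

Under these assumptions, calling `presence_flags` on a repository's root file listing gives the Yes/No columns for one row, and the list returned by `no_cell_requests` could be sent as the `requests` field of a single `batchUpdate` call.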
- Clone this repository: `git clone https://github.com/Imageomics/repo-exporter.git`, then `cd repo-exporter`
- Install Python dependencies: `pip install -r requirements.txt`
- Run the script: `python export_repos.py`
- Enter your GitHub Personal Access Token
To create one with permissions for both private and public repositories (read-only access to public repositories is enabled by default, without administrator approval):
- Go to github.com/settings/personal-access-tokens
- Click Generate new token → Fine-grained token
- Under Resource owner, select the organization you want to access.
- Under Repository access, choose All repositories.
- Under Permissions select Repositories and set:
- Metadata -> Read-only
- Contents -> Read-only
- Administration -> Read-only
- Click Generate token and copy it (make sure to store it somewhere safe for future use). Note: The organization administrator must approve the token before it can access private repositories.
- Create a Google Cloud Console Service Account and give it access to both the repository and the Google Sheet
- Go to https://console.cloud.google.com/
- Create a new project and name it anything
- Go to https://console.cloud.google.com/iam-admin/serviceaccounts. If you have multiple projects, select the project you just created if it is not already selected
- Create a service account and enter a name, a service account ID (can be anything), and a description for it
- Click on the service account email -> Keys -> Add key -> Create new key, select JSON, then click Create
- Go to https://github.com/Imageomics/repo-exporter/settings/secrets/actions, click New repository secret, name it GOOGLE_SERVICE_ACCOUNT_JSON, paste the entire contents of the JSON file into the Secret field, and click Add secret
- Go to https://console.cloud.google.com/apis/library/sheets.googleapis.com and enable the Google Sheets API for the project you made
- Open your chosen Google Spreadsheet, go to Share settings, add the new service account email, and set it as an Editor
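
To sanity-check the two credentials from the steps above before a full run, something like the following sketch may help. It is not taken from `export_repos.py`. The helper names are hypothetical, and it assumes the service account key is exposed to the process as an environment variable named `GOOGLE_SERVICE_ACCOUNT_JSON` (matching the secret name above). The GitHub part only prepares the request headers and the paginated "list organization repositories" URL for the REST API; it makes no network calls.

```python
import json
import os
from urllib.parse import urlencode


def load_service_account_info(env_var="GOOGLE_SERVICE_ACCOUNT_JSON", environ=None):
    """Parse the service account key JSON from an environment variable.

    Assumption: the secret is exposed under the same name as an env var.
    """
    environ = os.environ if environ is None else environ
    raw = environ.get(env_var)
    if not raw:
        raise RuntimeError(f"{env_var} is not set")
    info = json.loads(raw)
    # Service account key files contain these fields, among others.
    missing = [key for key in ("client_email", "private_key") if key not in info]
    if missing:
        raise ValueError(f"service account JSON is missing: {', '.join(missing)}")
    return info


def github_headers(token):
    """Headers for authenticated GitHub REST API requests."""
    return {
        "Authorization": f"Bearer {token}",
        "Accept": "application/vnd.github+json",
        "X-GitHub-Api-Version": "2022-11-28",
    }


def org_repos_url(org, page=1, per_page=100):
    """URL for one page of an organization's repositories."""
    query = urlencode({"type": "all", "per_page": per_page, "page": page})
    return f"https://api.github.com/orgs/{org}/repos?{query}"
```

The `client_email` value returned by `load_service_account_info` is the address to add as an Editor in the sheet's Share settings. A client could request `org_repos_url(org, page)` with `github_headers(token)` for increasing `page` values until an empty list comes back. If private repositories are missing from the results, the token may still be awaiting administrator approval.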