A streamlined Knowledge Organization Infrastructure (KOI) network node that processes GitHub events from a GitHub Sensor node. It extracts and stores repository and event metadata without performing Git operations, providing a lightweight solution for tracking GitHub activity within a KOI-net ecosystem.
- Lightweight Processing: Stores only metadata without cloning repositories or accessing file contents
- Low Resource Usage: Minimal CPU and disk space requirements with no Git operations
- Fast Event Processing: Quick event handling without waiting for Git operations
- KOI-net Integration: Fully compatible with the KOI-net protocol for distributed knowledge sharing
- Installation
- Quick Start
- Usage
- Configuration
- API Reference
- Architecture
- Examples
- Contributing
- Testing
- CI/CD & Deployment
- Versioning & Changelog
- License
- Contact & Support
# Create a virtual environment
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
# Install the package
pip install koi-net-github-processor
# Pull the Docker image
docker pull blockscience/koi-net-github-processor:latest
# Run using Docker
docker run -p 8004:8004 -v $(pwd)/config.yaml:/app/config.yaml blockscience/koi-net-github-processor
# Clone the repository
git clone https://github.com/BlockScience/koi-net-github-processor.git
cd koi-net-github-processor
# Create a virtual environment
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
# Install dependencies
pip install -e .
# Run tests
pytest
- Create a configuration file:
# Create a basic config.yaml
cat > config.yaml << EOF
server:
  host: 127.0.0.1
  port: 8004
  path: /koi-net
koi_net:
  node_name: processor_github
  node_rid: orn:koi-net.node:processor_github+0bf78f28-9f56-4d31-8377-a33f49a0828e
  node_profile:
    base_url: http://127.0.0.1:8004/koi-net
    node_type: FULL
    provides:
      event: []
      state: []
  cache_directory_path: .koi/processor-github/cache
  event_queues_path: .koi/processor-github/queues.json
  first_contact: http://127.0.0.1:8000/koi-net
index_db_path: .koi/processor-github/index.db
env:
  github_token: GITHUB_TOKEN
EOF
- Set environment variables:
export GITHUB_TOKEN=your_github_personal_access_token
- Start the processor:
# Using Python module
python -m processor_github_node
# Alternative using make (if available)
make processor-gh
The GitHub Processor comes with a CLI tool for exploring stored events:
# List all tracked repositories
python cli.py list-repos
# Show events for a specific repository
python cli.py show-events sayertindall/koi-net
# Show detailed information about a specific event
python cli.py event-details orn:github.event:blockscience/koi-net:event123
# Add a repository to track
python cli.py add-repo blockscience/koi-net
# Show a summary of all events in the database
python cli.py summarize-events
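The CLI commands above take RIDs such as `orn:github.event:blockscience/koi-net:event123`. The following is a minimal sketch of splitting such a string into its parts, assuming the layout `orn:<namespace>:<owner>/<repo>[:<event id>]` that the examples suggest; the layout is inferred from this README rather than from a specification, and `GitHubRid` and `parse_github_rid` are hypothetical names, not part of the package.

```python
from typing import NamedTuple, Optional


class GitHubRid(NamedTuple):
    namespace: str            # e.g. "github.event" or "github.repo"
    owner: str
    repo: str
    event_id: Optional[str]   # None for repository RIDs


def parse_github_rid(rid: str) -> GitHubRid:
    """Split an RID of the assumed form orn:<namespace>:<owner>/<repo>[:<event id>]."""
    parts = rid.split(":", 3)
    if len(parts) < 3 or parts[0] != "orn":
        raise ValueError(f"not an ORN-style RID: {rid!r}")
    namespace, reference = parts[1], parts[2]
    owner, sep, repo = reference.partition("/")
    if not sep or not owner or not repo:
        raise ValueError(f"expected <owner>/<repo> in RID: {rid!r}")
    event_id = parts[3] if len(parts) == 4 else None
    return GitHubRid(namespace, owner, repo, event_id)
```

A repository RID yields `event_id=None`, which makes it easy to tell the two kinds of RID apart before passing them to `show-events` or `event-details`.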
import requests
# List repositories
response = requests.get("http://localhost:8004/api/processor/github/repositories")
repositories = response.json()
print(f"Tracked repositories: {len(repositories)}")
# Get events for a repository
repo_rid = "orn:github.repo:blockscience/koi-net"
response = requests.get(
    f"http://localhost:8004/api/processor/github/repositories/{repo_rid}/events",
    params={"limit": 10, "offset": 0}
)
events = response.json()
print(f"Found {len(events)} events for {repo_rid}")
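The repository RID contains `:` and `/`, both of which have special meaning inside a URL path. Whether the processor's routes expect the RID percent-encoded is not stated here, so the sketch below is an assumption: it shows how the path could be built with the RID encoded as a single segment. `events_url` is a hypothetical helper.

```python
from urllib.parse import quote

BASE_URL = "http://localhost:8004/api/processor/github"


def events_url(repo_rid: str, base_url: str = BASE_URL) -> str:
    """Build the events URL with the RID percent-encoded as one path segment."""
    # safe="" also encodes "/" so the RID cannot be mistaken for extra path segments.
    encoded = quote(repo_rid, safe="")
    return f"{base_url}/repositories/{encoded}/events"
```

If the server accepts the raw RID, as the example above implies, this step is unnecessary; it is worth trying when a request with an unencoded RID returns 404.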
The processor is configured using a YAML file with the following options:
Option | Default | Description | Required |
---|---|---|---|
server.host | 127.0.0.1 | Host address to bind the server to | Yes |
server.port | 8004 | Port to listen on | Yes |
server.path | /koi-net | Base path for KOI-net API endpoints | Yes |
koi_net.node_name | processor_github | Name of this node | Yes |
koi_net.node_rid | Generated | Unique RID for this node | No |
koi_net.node_profile.base_url | Based on server config | Base URL for this node's API | No |
koi_net.node_profile.node_type | FULL | Node type (FULL or PARTIAL) | Yes |
koi_net.node_profile.provides | Empty lists | RID types provided by this node | Yes |
koi_net.cache_directory_path | .koi/processor-github/cache | Path to cache directory | Yes |
koi_net.event_queues_path | .koi/processor-github/queues.json | Path to event queues file | Yes |
koi_net.first_contact | None | URL of first node to contact | No |
index_db_path | .koi/processor-github/index.db | Path to SQLite database | Yes |
env.github_token | GITHUB_TOKEN | Environment variable name for GitHub token | Yes |
server:
  host: 127.0.0.1
  port: 8004
  path: /koi-net
koi_net:
  node_name: processor_github
  node_rid: orn:koi-net.node:processor_github+0bf78f28-9f56-4d31-8377-a33f49a0828e
  node_profile:
    base_url: http://127.0.0.1:8004/koi-net
    node_type: FULL
    provides:
      event: []
      state: []
  cache_directory_path: .koi/processor-github/cache
  event_queues_path: .koi/processor-github/queues.json
  first_contact: http://127.0.0.1:8000/koi-net
index_db_path: .koi/processor-github/index.db
env:
  github_token: GITHUB_TOKEN
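A quick way to catch a misconfigured file before starting the node is to check the parsed configuration against the options marked as required in the table. The sketch below assumes the YAML has already been loaded into a nested dict (for example with PyYAML's `yaml.safe_load`); the names `REQUIRED_OPTIONS`, `lookup`, `missing_options`, and `resolve_github_token` are hypothetical and do not reflect how the processor itself validates its configuration.

```python
import os

# Options marked "Yes" in the Required column of the configuration table.
REQUIRED_OPTIONS = (
    "server.host",
    "server.port",
    "server.path",
    "koi_net.node_name",
    "koi_net.node_profile.node_type",
    "koi_net.node_profile.provides",
    "koi_net.cache_directory_path",
    "koi_net.event_queues_path",
    "index_db_path",
    "env.github_token",
)

_MISSING = object()


def lookup(config: dict, dotted_key: str):
    """Walk a nested dict with a dotted key; return _MISSING if any part is absent."""
    node = config
    for part in dotted_key.split("."):
        if not isinstance(node, dict) or part not in node:
            return _MISSING
        node = node[part]
    return node


def missing_options(config: dict) -> list:
    """Return the required dotted keys that are not present in the config."""
    return [key for key in REQUIRED_OPTIONS if lookup(config, key) is _MISSING]


def resolve_github_token(config: dict, environ=os.environ):
    """env.github_token holds the NAME of an environment variable, not the token."""
    variable_name = lookup(config, "env.github_token")
    if variable_name is _MISSING:
        return None
    return environ.get(variable_name)
```

Note that `env.github_token` is an indirection: the file names the variable (`GITHUB_TOKEN`), and the secret itself stays in the environment rather than in the YAML.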
Receives events broadcast from other nodes.
Request Body:
{
"events": [
{
"rid": "orn:github.event:owner/repo:event123",
"event_type": "NEW",
"manifest": {
"rid": "orn:github.event:owner/repo:event123",
"timestamp": "2023-01-01T12:00:00Z",
"sha256_hash": "hash123"
},
"contents": {}
}
]
}
Response: No content (204)
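To make the shape of the broadcast body concrete, here is a minimal sketch that assembles one from plain Python values. The field names come from the request body shown above; how `sha256_hash` is actually derived is not documented here, so hashing a canonical JSON dump of `contents` is an assumption, and `make_event` and `make_broadcast_body` are hypothetical helpers.

```python
import hashlib
import json
from datetime import datetime, timezone


def make_event(rid: str, contents: dict, event_type: str = "NEW", timestamp: str = None) -> dict:
    """Build one event with a manifest, following the request body layout above."""
    if timestamp is None:
        timestamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    # Assumption: the hash covers a key-sorted, compact JSON dump of the contents.
    canonical = json.dumps(contents, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return {
        "rid": rid,
        "event_type": event_type,
        "manifest": {"rid": rid, "timestamp": timestamp, "sha256_hash": digest},
        "contents": contents,
    }


def make_broadcast_body(events) -> dict:
    """Wrap events in the top-level object expected by the broadcast endpoint."""
    return {"events": list(events)}
```

The resulting dict can be passed as the `json=` argument of `requests.post`; a successful broadcast returns 204 with no body, so there is nothing to parse in the response.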
Allows partial nodes to poll for events.
Request Body:
{
"rid": "orn:koi-net.node:some-node+uuid",
"limit": 50
}
Response:
{
"events": [
{
"rid": "orn:github.event:owner/repo:event123",
"event_type": "NEW",
"manifest": {
"rid": "orn:github.event:owner/repo:event123",
"timestamp": "2023-01-01T12:00:00Z",
"sha256_hash": "hash123"
},
"contents": {}
}
]
}
Retrieves RIDs of a specific type.
Request Body:
{
"rid_types": ["orn:github.event"]
}
Response:
{
"rids": [
"orn:github.event:owner/repo:event123",
"orn:github.event:owner/repo:event456"
]
}
Retrieves manifests for specific RIDs.
Request Body:
{
"rids": ["orn:github.event:owner/repo:event123"]
}
Response:
{
"manifests": [
{
"rid": "orn:github.event:owner/repo:event123",
"timestamp": "2023-01-01T12:00:00Z",
"sha256_hash": "hash123"
}
],
"not_found": []
}
Retrieves full bundles for specific RIDs.
Request Body:
{
"rids": ["orn:github.event:owner/repo:event123"]
}
Response:
{
"bundles": [
{
"manifest": {
"rid": "orn:github.event:owner/repo:event123",
"timestamp": "2023-01-01T12:00:00Z",
"sha256_hash": "hash123"
},
"contents": {
"event_source_type": "push",
"repository": {
"name": "repo",
"owner": {
"login": "owner"
}
}
}
}
],
"not_found": [],
"deferred": []
}
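A bundle pairs a manifest with the stored event contents, so a client usually wants to flatten the response into something easier to scan. The sketch below does that for the response shape shown above; `summarize_bundles` is a hypothetical helper, and it only reads fields that appear in the example, treating anything else as optional.

```python
def summarize_bundles(response: dict) -> list:
    """Reduce a bundles response to one small dict per bundle."""
    summaries = []
    for bundle in response.get("bundles", []):
        contents = bundle.get("contents", {})
        repository = contents.get("repository", {})
        owner = repository.get("owner", {}).get("login")
        name = repository.get("name")
        summaries.append({
            "rid": bundle["manifest"]["rid"],
            "event_source_type": contents.get("event_source_type"),
            # Only build "owner/name" when both halves are present.
            "repository": f"{owner}/{name}" if owner and name else None,
        })
    return summaries
```

RIDs listed under `not_found` or `deferred` never appear in `bundles`, so a caller that needs completeness should inspect those lists separately.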
Get the current status of the GitHub processor.
Response:
{
"status": "active",
"message": "GitHub processor node is running",
"details": {
"node_name": "processor_github",
"node_type": "FULL",
"db_path": ".koi/processor-github/index.db"
}
}
List all tracked repositories.
Response:
[
{
"repo_rid": "orn:github.repo:owner/repo",
"repo_url": "https://github.com/owner/repo.git",
"first_indexed": "2023-01-01T12:00:00Z",
"last_updated": "2023-01-02T12:00:00Z"
}
]
Get events for a specific repository.
Query Parameters:
- limit (optional): Maximum number of events to return (default: 50)
- offset (optional): Pagination offset (default: 0)
Response:
[
{
"event_rid": "orn:github.event:owner/repo:event123",
"event_type": "push",
"timestamp": "2023-01-01T12:00:00Z",
"commit_sha": "abcdef123456",
"summary": "Push to owner/repo: abcdef1",
"bundle_rid": "orn:github.event:owner/repo:event123"
}
]
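Because the endpoint pages with `limit` and `offset`, reading a full event history means requesting pages in a loop until one comes back short. The following sketch shows that loop with the HTTP call injected as a function, so it can be exercised without a running node. `iter_events` and `fetch_page` are hypothetical names, and the stopping rule assumes the server returns fewer than `limit` items only on the last page.

```python
from typing import Callable, Iterator, List


def iter_events(fetch_page: Callable[[int, int], List[dict]], page_size: int = 50) -> Iterator[dict]:
    """Yield every event by walking limit/offset pages until a short page arrives."""
    offset = 0
    while True:
        page = fetch_page(page_size, offset)
        yield from page
        if len(page) < page_size:
            # A short (or empty) page means there is nothing left to fetch.
            return
        offset += len(page)
```

With `requests`, `fetch_page` could be a small function that calls `requests.get(url, params={"limit": limit, "offset": offset}).json()` for the events URL of one repository.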
The GitHub Processor consists of several key components that work together to process GitHub events and provide access to the stored data:
┌─────────────────┐     ┌────────────────┐     ┌────────────────┐
│  GitHub Sensor  │────>│  KOI-net Node  │────>│ Other KOI-net  │
│    (events)     │     │   Interface    │     │     Nodes      │
└─────────────────┘     └────────┬───────┘     └────────────────┘
                                 │
                                 ▼
                ┌─────────────────────────────────┐
                │       Processor Interface       │
                │                                 │
                │  ┌─────────────┐ ┌───────────┐  │
                │  │    Event    │ │  Network  │  │
                │  │  Handlers   │ │ Handlers  │  │
                │  └─────────────┘ └───────────┘  │
                └────────────────┬────────────────┘
                                 │
                                 ▼
         ┌───────────────────────────────────────────────┐
         │              Repository Service               │
         └───────────────────────┬───────────────────────┘
                                 │
                                 ▼
         ┌───────────────────────────────────────────────┐
         │                Index Database                 │
         │  ┌────────────┐  ┌─────────┐  ┌───────────┐   │
         │  │Repositories│  │ Events  │  │ Metadata  │   │
         │  └────────────┘  └─────────┘  └───────────┘   │
         └───────────────────────────────────────────────┘
                                 │
                                 ▼
         ┌───────────────────────────────────────────────┐
         │                REST API / CLI                 │
         └───────────────────────────────────────────────┘
- KOI-net Node Interface: Handles communication with other nodes in the KOI-net network.
- Processor Interface: Processes incoming GitHub events through a pipeline of handlers.
- Event Handlers: Extract and normalize data from GitHub events.
- Network Handlers: Manage communication with other nodes, including edge negotiation.
- Repository Service: Core service managing GitHub repository data and events.
- Index Database: SQLite database storing metadata about repositories and GitHub events.
- REST API: FastAPI-based API for querying repositories and events.
- CLI: Command-line interface for interacting with the stored data.
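To illustrate how the Repository Service and Index Database might fit together, here is a minimal sketch of a metadata-only index in SQLite. The table and column names mirror the fields returned by the REST API in this README, but the schema itself is an assumption, not the processor's actual database layout; `open_index` and `record_event` are hypothetical helpers.

```python
import sqlite3

# Hypothetical schema: one row per tracked repository, one row per event.
SCHEMA = """
CREATE TABLE IF NOT EXISTS repositories (
    repo_rid      TEXT PRIMARY KEY,
    repo_url      TEXT NOT NULL,
    first_indexed TEXT NOT NULL,
    last_updated  TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS events (
    event_rid  TEXT PRIMARY KEY,
    repo_rid   TEXT NOT NULL REFERENCES repositories(repo_rid),
    event_type TEXT NOT NULL,
    timestamp  TEXT NOT NULL,
    commit_sha TEXT,
    summary    TEXT
);
"""


def open_index(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the index database and make sure the tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def record_event(conn: sqlite3.Connection, repo_rid: str, repo_url: str, event: dict) -> None:
    """Store event metadata and keep the repository's last_updated current."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR IGNORE INTO repositories VALUES (?, ?, ?, ?)",
            (repo_rid, repo_url, event["timestamp"], event["timestamp"]),
        )
        conn.execute(
            "UPDATE repositories SET last_updated = ? WHERE repo_rid = ?",
            (event["timestamp"], repo_rid),
        )
        # Re-delivered events are ignored, which keeps the insert idempotent.
        conn.execute(
            "INSERT OR IGNORE INTO events VALUES (?, ?, ?, ?, ?, ?)",
            (event["event_rid"], repo_rid, event["event_type"],
             event["timestamp"], event.get("commit_sha"), event.get("summary")),
        )
```

Keeping only these columns is what makes the node lightweight: nothing here requires cloning a repository or reading file contents.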
import requests
import time
# 1. Add a repository to track
repo = "blockscience/koi-net"
requests.post(
    "http://localhost:8004/api/processor/github/repositories",
    json={"repo_url": f"https://github.com/{repo}.git"}
)
# 2. Monitor events for the repository
repo_rid = f"orn:github.repo:{repo}"
while True:
    response = requests.get(
        f"http://localhost:8004/api/processor/github/repositories/{repo_rid}/events"
    )
    events = response.json()
    print(f"Found {len(events)} events for {repo}")
    for event in events:
        print(f"  {event['event_type']} - {event['timestamp']} - {event['summary']}")
    time.sleep(30)  # Check every 30 seconds
#!/bin/bash
# This script demonstrates using the CLI to explore GitHub events
# List all repositories
echo "Listing all tracked repositories:"
python cli.py list-repos
# Select the first repository and show its events
REPO_RID=$(python cli.py list-repos | grep 'orn:github.repo' | head -1 | awk '{print $1}')
echo "Showing events for repository: $REPO_RID"
python cli.py show-events "$REPO_RID"
# Show details for the first event
EVENT_RID=$(python cli.py show-events "$REPO_RID" | grep 'orn:github.event' | head -1 | awk '{print $NF}')
echo "Showing details for event: $EVENT_RID"
python cli.py event-details "$EVENT_RID"
# Show overall summary
echo "Showing event summary:"
python cli.py summarize-events
Contributions to the GitHub Processor are welcome! Please follow these steps:
- Fork the Repository
  - Create a fork of the repository on GitHub.
- Clone Your Fork
  git clone https://github.com/YOUR-USERNAME/koi-net-github-processor.git
  cd koi-net-github-processor
- Create a Feature Branch
  git checkout -b feature/your-feature-name
- Make Changes
  - Implement your changes
  - Add tests for new functionality
- Run Tests
  pytest
- Commit Changes
  git commit -am "Add your detailed commit message"
- Push to GitHub
  git push origin feature/your-feature-name
- Create a Pull Request
  - Go to your fork on GitHub and create a pull request to the main repository.
Please adhere to the project's code style and include appropriate tests with your contributions.
Run the test suite with:
# Run all tests
pytest
# Run with coverage report
pytest --cov=processor_github_node
# Generate HTML coverage report
pytest --cov=processor_github_node --cov-report=html
The project uses GitHub Actions for continuous integration:
name: GitHub Processor CI
on:
  push:
    branches: [ main ]
  pull_request:
    branches: [ main ]
jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: [3.8, 3.9, "3.10"]
    steps:
    - uses: actions/checkout@v2
    - name: Set up Python ${{ matrix.python-version }}
      uses: actions/setup-python@v2
      with:
        python-version: ${{ matrix.python-version }}
    - name: Install dependencies
      run: |
        python -m pip install --upgrade pip
        pip install -e ".[dev]"
    - name: Lint with flake8
      run: flake8 processor_github_node
    - name: Test with pytest
      run: pytest
    - name: Build package
      run: python -m build
    - name: Upload artifacts
      uses: actions/upload-artifact@v2
      with:
        name: dist
        path: dist/
This project follows Semantic Versioning. For a complete list of changes, see the CHANGELOG.md file.
- Major version: Incompatible API changes
- Minor version: New functionality in a backward-compatible manner
- Patch version: Backward-compatible bug fixes
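The three rules above map directly onto how a version number changes. As a small illustration, the sketch below bumps a plain `MAJOR.MINOR.PATCH` string; `bump` is a hypothetical helper, not part of the project's release tooling, and it ignores pre-release and build suffixes.

```python
def bump(version: str, change: str) -> str:
    """Return the next version for a 'major', 'minor', or 'patch' change."""
    major, minor, patch = (int(part) for part in version.split("."))
    if change == "major":
        return f"{major + 1}.0.0"          # incompatible API change
    if change == "minor":
        return f"{major}.{minor + 1}.0"    # backward-compatible feature
    if change == "patch":
        return f"{major}.{minor}.{patch + 1}"  # backward-compatible fix
    raise ValueError(f"unknown change type: {change!r}")
```

For example, adding a new optional query parameter to the REST API would be a minor bump, while renaming a field in a response would be a major one.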
This project is licensed under the MIT License - see the LICENSE file for details.
- BlockScience Team - [email protected]
- Issue Tracker: GitHub Issues
- Discussion: GitHub Discussions
- KOI-net Community Forum: community.koi-net.org