High-performance implementations of foundational system design algorithms, built with PyArrow and modern Python.
Distributed Systems Algorithms:
- Consistent Hashing
- Merkle Trees for synchronization
- Raft Consensus Protocol (TODO)
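
As a rough illustration of the first item above, here is a minimal pure-Python sketch of a consistent-hash ring with virtual nodes. The `HashRing` class, its method names, and the choice of SHA-256 are assumptions made for this example; they are not taken from this repository's API, and the sketch does not use PyArrow.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a 64-bit integer position on the ring."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Hypothetical illustration class, not the repository's implementation."""

    def __init__(self, nodes=(), vnodes: int = 100):
        self.vnodes = vnodes      # virtual nodes per physical node
        self._positions = []      # sorted ring positions
        self._owners = {}         # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        # Each node is placed at `vnodes` pseudo-random positions on the ring.
        for i in range(self.vnodes):
            pos = _hash(f"{node}#{i}")
            if pos not in self._owners:
                bisect.insort(self._positions, pos)
            self._owners[pos] = node

    def remove_node(self, node: str) -> None:
        # Drop only the positions owned by `node`; all others stay in place.
        self._positions = [p for p in self._positions if self._owners[p] != node]
        self._owners = {p: n for p, n in self._owners.items() if n != node}

    def get_node(self, key: str) -> str:
        if not self._positions:
            raise LookupError("ring is empty")
        # The owner is the first position clockwise from the key's hash.
        idx = bisect.bisect_right(self._positions, _hash(key)) % len(self._positions)
        return self._owners[self._positions[idx]]
```

With this layout, removing a node reassigns only the keys that node owned; every other key keeps its owner, which is the property that makes the technique useful for sharding and caching.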
Data Structures:
- Bloom Filters
- HyperLogLog
- QuadTrees
- Leaky Bucket rate limiter
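
To make the first structure above concrete, the following sketch shows how a Bloom filter might derive its bit-array size and hash count from a target capacity and error rate. `SimpleBloomFilter` is a hypothetical pure-Python stand-in written for this example; the repository's own `BloomFilter` may be built differently. The sizing uses the standard formulas m = -n ln(p) / (ln 2)^2 and k = (m / n) ln 2, and the indexes come from double hashing over a SHA-256 digest, which is an assumption of this sketch.

```python
import hashlib
import math


class SimpleBloomFilter:
    """Hypothetical pure-Python sketch, not the repository's BloomFilter."""

    def __init__(self, capacity: int, error_rate: float):
        if capacity <= 0 or not 0.0 < error_rate < 1.0:
            raise ValueError("capacity must be > 0 and 0 < error_rate < 1")
        # Standard sizing: m = -n ln(p) / (ln 2)^2 bits, k = (m / n) ln 2 hashes.
        self.num_bits = max(
            1, math.ceil(-capacity * math.log(error_rate) / math.log(2) ** 2)
        )
        self.num_hashes = max(1, round(self.num_bits / capacity * math.log(2)))
        self._bits = bytearray((self.num_bits + 7) // 8)

    def _indexes(self, item: str):
        # Double hashing: derive k indexes from two 64-bit halves of one digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force odd so the step is never 0
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        for idx in self._indexes(item):
            self._bits[idx // 8] |= 1 << (idx % 8)

    def __contains__(self, item: str) -> bool:
        # True means "possibly present"; False means "definitely absent".
        return all(
            self._bits[idx // 8] & (1 << (idx % 8)) for idx in self._indexes(item)
        )
```

For a capacity of 100,000 items and a 1% error rate, these formulas give roughly 958,500 bits (about 117 KiB) and 7 hash functions.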
Efficient Computation:
- Rsync Algorithm
- Ray Casting
- Operational Transformation
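
Assuming "Ray Casting" above refers to the point-in-polygon test (the usual meaning alongside geospatial structures such as QuadTrees), a minimal sketch looks like this. The function name and the polygon representation, a list of `(x, y)` vertex tuples, are assumptions made for this example.

```python
def point_in_polygon(x: float, y: float, polygon) -> bool:
    """Return True if (x, y) lies inside `polygon` (even-odd rule).

    Casts a horizontal ray from the point toward +x and counts how many
    polygon edges it crosses; an odd count means the point is inside.
    Points exactly on an edge may be classified either way.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the ray's y coordinate can be crossed.
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge intersects the horizontal line at y.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside
```

The straddle check also guarantees `y2 != y1`, so horizontal edges never cause a division by zero.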
# Create virtual environment
python -m venv venv
source venv/bin/activate
# Install with PyArrow
pip install pyarrow==8.0.0 -r requirements.txt

from pyarrow_algorithms import BloomFilter
bf = BloomFilter(capacity=100000, error_rate=0.01)
bf.add("important_item")
print("item exists:", "important_item" in bf)

Run the full test suite with property-based testing:
pytest tests/ --hypothesis-show-statistics --cov=src

Roadmap:
- Implement Raft Consensus Protocol
- Add distributed implementations of key algorithms
- Build out a more robust testing and simulation suite with Hypothesis and Redis
- Add GitHub Actions workflows for CI/CD
- Profile memory usage and develop benchmarks against vanilla Python and NumPy implementations
Contributing:
- Fork the repository
- Create a feature branch
- Add tests for new algorithms
- Submit a pull request
MIT License - See LICENSE for details.