High-performance foundational system design algorithm implementations using PyArrow and modern Python.

codeamt/Pyarrow-Algorithms


PyArrow Algorithms Toolkit

CI Status | Code Coverage | License: MIT

High-performance foundational system design algorithm implementations using PyArrow and modern Python.

Features

Distributed Systems Algorithms

  • Consistent Hashing 
  • Merkle Trees for synchronization
  • Raft Consensus Protocol (TODO) 

Data Structures

  • Bloom Filters 
  • HyperLogLog 
  • QuadTrees 
  • Leaky Bucket rate limiter

Efficient Computation

  • Rsync Algorithm 
  • Ray Casting 
  • Operational Transformation

Installation

# Create virtual environment
python -m venv venv
source venv/bin/activate
# Install with PyArrow
pip install pyarrow==8.0.0 -r requirements.txt

Usage

from pyarrow_algorithms import BloomFilter
bf = BloomFilter(capacity=100000, error_rate=0.01)
bf.add("important_item")
print("item exists:", "important_item" in bf)
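
For context on what `capacity` and `error_rate` usually control, here is a minimal standard-library sketch of textbook Bloom filter sizing. It is an illustration only: `bloom_parameters` and `SketchBloomFilter` are hypothetical names, and the sketch does not claim to match how `pyarrow_algorithms.BloomFilter` sizes or stores its bits.

```python
import hashlib
import math


def bloom_parameters(capacity: int, error_rate: float) -> tuple[int, int]:
    """Return (num_bits, num_hashes) from the textbook sizing formulas."""
    # m = -n * ln(p) / (ln 2)^2 bits, k = (m / n) * ln 2 hash functions
    num_bits = math.ceil(-capacity * math.log(error_rate) / (math.log(2) ** 2))
    num_hashes = max(1, round(num_bits / capacity * math.log(2)))
    return num_bits, num_hashes


class SketchBloomFilter:
    """Hypothetical stand-in, not the pyarrow_algorithms implementation."""

    def __init__(self, capacity: int, error_rate: float) -> None:
        self.num_bits, self.num_hashes = bloom_parameters(capacity, error_rate)
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, item: str):
        # Double hashing: derive all k positions from two halves of one digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force odd so the step is never zero
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        # True may be a false positive; False is always correct.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )
```

With `capacity=100000` and `error_rate=0.01`, these formulas give about 9.6 bits per item and 7 hash functions. A membership test can return a false positive at roughly the configured rate, but never a false negative for an item that was added.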

Testing

Run the full test suite with property-based testing:

pytest tests/ --hypothesis-show-statistics --cov=src

TODOs:

  • Implement Raft Consensus Protocol
  • Add distributed implementations of key algorithms
  • Build out a more robust testing/simulation suite with Hypothesis and Redis
  • Add GitHub Workflows for CI/CD
  • Profile memory usage and develop benchmarks against vanilla Python/NumPy implementations

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Add tests for new algorithms
  4. Submit a pull request

License

MIT License - See LICENSE for details.
