| 1 | +# DB-ESDK Performance Benchmark - Python |
| 2 | + |
| 3 | +This directory contains the Python implementation of the AWS Database Encryption SDK (DB-ESDK) performance benchmark suite. |
| 4 | + |
| 5 | +## Overview |
| 6 | + |
| 7 | +The Python benchmark provides comprehensive performance testing for the DB-ESDK Python runtime, measuring: |
| 8 | + |
| 9 | +- **Throughput**: Operations per second and bytes per second using ItemEncryptor operations |
| 10 | +- **Latency**: Encrypt, decrypt, and end-to-end timing for encrypted operations |
| 11 | +- **Memory Usage**: Peak memory consumption and efficiency |
| 12 | +- **Concurrency**: Multi-threaded performance scaling |
| 13 | +- **Statistical Analysis**: P50, P95, P99 latency percentiles |
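To illustrate how percentile figures such as P50, P95, and P99 can be derived from raw timings, here is a minimal sketch using only the standard library. The helper names `percentile` and `summarize_latencies` and the nearest-rank method are assumptions made for illustration. The benchmark lists numpy as a dependency and may compute percentiles differently, for example with interpolation.

```python
import math


def percentile(sorted_samples, pct):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    # Rank is the smallest 1-based position covering pct percent of samples.
    rank = math.ceil(pct * len(sorted_samples) / 100)
    return sorted_samples[max(rank, 1) - 1]


def summarize_latencies(latencies_ms):
    """Return P50/P95/P99 for a list of latency samples in milliseconds."""
    ordered = sorted(latencies_ms)
    return {
        "p50_latency": percentile(ordered, 50),
        "p95_latency": percentile(ordered, 95),
        "p99_latency": percentile(ordered, 99),
    }


# Hypothetical samples: 100 evenly spaced latencies from 1.0 to 100.0 ms.
samples = [float(i) for i in range(1, 101)]
print(summarize_latencies(samples))
```

Nearest-rank always returns a value that was actually observed, which keeps tail percentiles easy to interpret when the sample count is small.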
| 14 | + |
| 15 | +## Prerequisites |
| 16 | + |
| 17 | +- Python 3.11 or higher |
| 18 | +- Poetry package manager |
| 19 | + |
| 20 | +## Setup |
| 21 | + |
| 22 | +### Install Poetry |
| 23 | + |
| 24 | +```bash |
| 25 | +# Install Poetry (if not already installed) |
| 26 | +curl -sSL https://install.python-poetry.org | python3 - |
| 27 | + |
| 28 | +# Or using pip |
| 29 | +pip install poetry |
| 30 | +``` |
| 31 | + |
| 32 | +### Install Dependencies |
| 33 | + |
| 34 | +```bash |
| 35 | +# Install all dependencies including dev dependencies |
| 36 | +poetry install |
| 37 | + |
| 37 | +# Install only production dependencies (Poetry 1.2+; replaces the deprecated --no-dev) |
| 38 | +poetry install --without dev |
| 40 | +``` |
| 41 | + |
| 42 | +## Building |
| 43 | + |
| 44 | +```bash |
| 45 | +# Build distribution packages |
| 46 | +poetry build |
| 47 | + |
| 48 | +# Install in development mode (automatic with poetry install) |
| 49 | +poetry install |
| 50 | + |
| 51 | +# Run tests using tox |
| 52 | +tox -e py311 |
| 53 | + |
| 54 | +# Run all tox environments |
| 55 | +tox |
| 56 | +``` |
| 57 | + |
| 58 | +## Running Benchmarks |
| 59 | + |
| 60 | +### Quick Test |
| 61 | + |
| 62 | +```bash |
| 63 | +# Using Poetry |
| 64 | +poetry run esdk-benchmark --quick |
| 65 | + |
| 66 | +# Using tox (recommended for isolated environment) |
| 67 | +tox -e benchmark |
| 68 | + |
| 69 | +# Using module execution |
| 70 | +poetry run python -m esdk_benchmark --quick |
| 71 | + |
| 72 | +# Direct script execution |
| 73 | +poetry run python src/esdk_benchmark/program.py --quick |
| 74 | +``` |
| 75 | + |
| 76 | +### Full Benchmark Suite |
| 77 | + |
| 78 | +```bash |
| 79 | +# Using Poetry |
| 80 | +poetry run esdk-benchmark |
| 81 | + |
| 82 | +# Using tox (recommended for isolated environment) |
| 83 | +tox -e benchmark-full |
| 84 | + |
| 85 | +# Using module execution |
| 86 | +poetry run python -m esdk_benchmark |
| 87 | + |
| 88 | +# Direct script execution |
| 89 | +poetry run python src/esdk_benchmark/program.py |
| 90 | +``` |
| 91 | + |
| 92 | +### Custom Configuration |
| 93 | + |
| 94 | +```bash |
| 95 | +# Specify custom config and output paths |
| 96 | +poetry run esdk-benchmark \ |
| 97 | +  --config /path/to/config.yaml \ |
| 98 | +  --output /path/to/results.json |
| 99 | +``` |
| 100 | + |
| 101 | +## Command Line Options |
| 102 | + |
| 103 | +- `--config, -c`: Path to test configuration file (default: `../../../config/test-scenarios.yaml`) |
| 104 | +- `--output, -o`: Path to output results file (default: `../../../results/raw-data/python_results.json`) |
| 105 | +- `--quick, -q`: Run quick test with reduced iterations |
| 106 | +- `--help, -h`: Show help message |
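The options above map naturally onto an `argparse` parser. The following is only a sketch of how such a CLI could be declared, not the contents of `program.py`. The function name `build_parser` and the help strings are assumptions; the flags and defaults are taken from the list above.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(
        prog="esdk-benchmark",
        description="DB-ESDK performance benchmark (sketch)",
    )
    parser.add_argument(
        "--config", "-c",
        default="../../../config/test-scenarios.yaml",
        help="Path to test configuration file",
    )
    parser.add_argument(
        "--output", "-o",
        default="../../../results/raw-data/python_results.json",
        help="Path to output results file",
    )
    parser.add_argument(
        "--quick", "-q",
        action="store_true",
        help="Run quick test with reduced iterations",
    )
    return parser


# Parse an explicit argument list so the example does not depend on sys.argv.
args = build_parser().parse_args(["--quick", "-o", "results.json"])
print(args.config, args.output, args.quick)
```

`--help, -h` needs no explicit declaration because `argparse` adds it automatically.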
| 107 | + |
| 108 | +## Configuration |
| 109 | + |
| 110 | +The benchmark uses a YAML configuration file to define test parameters: |
| 111 | + |
| 112 | +```yaml |
| 112 | +data_sizes: |
| 113 | +  small: [1024, 5120, 10240] |
| 114 | +  medium: [102400, 512000, 1048576] |
| 115 | +  large: [10485760, 52428800, 104857600] |
| 116 | + |
| 117 | +iterations: |
| 118 | +  warmup: 5 |
| 119 | +  measurement: 10 |
| 121 | + |
| 122 | +concurrency_levels: [1, 2, 4, 8] |
| 123 | +``` |
| 124 | + |
| 125 | +## Output Format |
| 126 | + |
| 127 | +Results are saved in JSON format with the following structure: |
| 128 | + |
| 129 | +```json |
| 130 | +{ |
| 131 | +  "metadata": { |
| 132 | +    "language": "python", |
| 133 | +    "timestamp": "2025-09-05T15:30:00Z", |
| 134 | +    "python_version": "3.11.5", |
| 135 | +    "platform": "Darwin-23.1.0-arm64-arm-64bit", |
| 136 | +    "cpu_count": 8, |
| 137 | +    "total_memory_gb": 16.0, |
| 138 | +    "total_tests": 45 |
| 139 | +  }, |
| 140 | +  "results": [ |
| 141 | +    { |
| 142 | +      "test_name": "throughput", |
| 143 | +      "language": "python", |
| 144 | +      "data_size": 1024, |
| 145 | +      "concurrency": 1, |
| 146 | +      "put_latency_ms": 0.85, |
| 147 | +      "get_latency_ms": 0.72, |
| 148 | +      "end_to_end_latency_ms": 1.57, |
| 149 | +      "ops_per_second": 636.94, |
| 150 | +      "bytes_per_second": 652224.0, |
| 151 | +      "peak_memory_mb": 0.0, |
| 152 | +      "memory_efficiency_ratio": 0.0, |
| 153 | +      "p50_latency": 1.55, |
| 154 | +      "p95_latency": 1.89, |
| 155 | +      "p99_latency": 2.12, |
| 156 | +      "timestamp": "2025-09-05T15:30:15Z", |
| 157 | +      "python_version": "3.11.5", |
| 158 | +      "cpu_count": 8, |
| 159 | +      "total_memory_gb": 16.0 |
| 160 | +    } |
| 161 | +  ] |
| 162 | +} |
| 163 | +``` |
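In the sample record above, `end_to_end_latency_ms` equals `put_latency_ms + get_latency_ms`, and `ops_per_second` matches `1000 / end_to_end_latency_ms`. These relationships are inferred from the sample values only and are not confirmed as the benchmark's actual formulas. The sketch below checks them on an embedded copy of the relevant fields; the helper name `derived_ops_per_second` is hypothetical.

```python
import json
import math

# Subset of the sample result shown above, embedded to keep this runnable.
sample = """
{
  "results": [
    {
      "test_name": "throughput",
      "put_latency_ms": 0.85,
      "get_latency_ms": 0.72,
      "end_to_end_latency_ms": 1.57,
      "ops_per_second": 636.94
    }
  ]
}
"""


def derived_ops_per_second(record):
    """Assumed relationship: one operation per end-to-end latency interval."""
    return 1000.0 / record["end_to_end_latency_ms"]


for record in json.loads(sample)["results"]:
    total = record["put_latency_ms"] + record["get_latency_ms"]
    # Compare with a tolerance because the values are decimal floats.
    sums_match = math.isclose(total, record["end_to_end_latency_ms"], abs_tol=1e-9)
    derived = derived_ops_per_second(record)
    print(f"{record['test_name']}: latencies add up={sums_match}, "
          f"reported={record['ops_per_second']}, derived={derived:.2f}")
```

A check of this kind can serve as a quick sanity test when comparing result files across language implementations.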
| 164 | + |
| 165 | +## Key Features |
| 166 | + |
| 167 | +### DB-ESDK Integration |
| 168 | +- Uses AWS Database Encryption SDK for DynamoDB with transparent encryption |
| 169 | +- Configures attribute actions (`ENCRYPT_AND_SIGN`, `SIGN_ONLY`, `DO_NOTHING`) |
| 170 | +- Tests ItemEncryptor operations with client-side encryption |
| 171 | +- Uses Raw AES keyring for consistent performance testing |
| 172 | + |
| 173 | +### ItemEncryptor Operations |
| 174 | +- Performs `encrypt_python_item` operations using Python dict format |
| 175 | +- Measures the corresponding `decrypt_python_item` operations for consistency |
| 176 | +- Tests realistic workloads with encryption overhead |
| 177 | +- Supports multiple data formats (Python dict, DynamoDB JSON, DBESDK shapes) |
| 178 | + |
| 179 | +### Performance Metrics |
| 180 | +- **Throughput Tests**: Measures ops/sec and bytes/sec for ItemEncryptor operations |
| 181 | +- **Memory Tests**: Tracks peak memory usage during encrypted operations using psutil |
| 182 | +- **Concurrency Tests**: Evaluates multi-threaded performance scaling with ThreadPoolExecutor |
| 183 | +- **Latency Analysis**: P50, P95, P99 percentiles for operation timing |
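The concurrency measurement described above can be sketched with `ThreadPoolExecutor` from the standard library. This is an illustration under stated assumptions, not the benchmark's implementation: `fake_encrypt` is a hypothetical stand-in for an ItemEncryptor call, and `run_concurrent` and its result keys are invented for this example.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_encrypt(payload: bytes) -> bytes:
    """Hypothetical stand-in for an ItemEncryptor call (XOR, not encryption)."""
    return bytes(b ^ 0x5A for b in payload)


def timed_call(payload: bytes) -> float:
    """Run one operation and return its latency in milliseconds."""
    start = time.perf_counter()
    fake_encrypt(payload)
    return (time.perf_counter() - start) * 1000.0


def run_concurrent(payload: bytes, workers: int, calls_per_worker: int) -> dict:
    """Run workers * calls_per_worker operations on a thread pool."""
    total_calls = workers * calls_per_worker
    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        latencies = list(pool.map(timed_call, [payload] * total_calls))
    elapsed = time.perf_counter() - wall_start
    return {
        "concurrency": workers,
        "operations": total_calls,
        "ops_per_second": total_calls / elapsed,
        "mean_latency_ms": sum(latencies) / len(latencies),
    }


# Concurrency levels taken from the configuration example in this README.
for level in (1, 2, 4, 8):
    print(run_concurrent(b"x" * 1024, level, 10))
```

In standard CPython builds the global interpreter lock serializes pure-Python work like this stand-in, so throughput here will not grow with the worker count. How the real benchmark scales depends on how much time the encryption path spends outside the lock.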
| 184 | + |
| 185 | +## Project Structure |
| 186 | + |
| 187 | +``` |
| 188 | +python/ |
| 189 | +├── README.md                 # This file |
| 190 | +├── pyproject.toml            # Poetry configuration and dependencies |
| 191 | +├── tox.ini                   # Tox configuration for testing |
| 192 | +├── src/ |
| 193 | +│   └── esdk_benchmark/ |
| 194 | +│       ├── __init__.py       # Package initialization |
| 195 | +│       ├── __main__.py       # Module execution entry point |
| 196 | +│       ├── program.py        # Main program and CLI |
| 197 | +│       ├── benchmark.py      # Core benchmark implementation |
| 198 | +│       ├── models.py         # Data models and configuration |
| 199 | +│       └── tests.py          # Individual test implementations |
| 200 | +├── tests/                    # Test suite |
| 201 | +│   ├── __init__.py |
| 202 | +│   └── test_benchmark.py |
| 203 | +└── run_benchmark.py          # Convenience runner script |
| 204 | +``` |
| 205 | + |
| 206 | +## Dependencies |
| 207 | + |
| 208 | +Key dependencies used in this benchmark: |
| 209 | + |
| 210 | +- **aws-dbesdk-dynamodb**: Core encryption functionality for DynamoDB (with legacy-ddbec extras) |
| 211 | +- **boto3**: AWS SDK for Python (DynamoDB client operations) |
| 212 | +- **PyYAML**: YAML configuration file processing |
| 213 | +- **pydantic**: Data validation and settings management |
| 214 | +- **tqdm**: Progress bars for visual feedback |
| 215 | +- **psutil**: System and process utilities for memory monitoring |
| 216 | +- **numpy**: Numerical operations and statistics |
| 217 | + |
| 218 | +### Development Dependencies |
| 219 | +- **pytest**: Testing framework |
| 220 | +- **pytest-cov**: Coverage reporting |
| 221 | +- **black**: Code formatting |
| 222 | +- **flake8**: Linting |
| 223 | +- **mypy**: Type checking |
| 224 | +- **tox**: Testing automation |
| 225 | +- **memory-profiler**: Memory profiling utilities |
| 226 | + |
| 227 | +## Development |
| 228 | + |
| 229 | +### Code Style |
| 230 | + |
| 231 | +The project follows Python best practices with automated tooling: |
| 232 | + |
| 233 | +```bash |
| 234 | +# Format code |
| 235 | +tox -e format |
| 236 | + |
| 237 | +# Check formatting |
| 238 | +tox -e format-check |
| 239 | + |
| 240 | +# Lint code |
| 241 | +tox -e lint |
| 242 | + |
| 243 | +# Type checking |
| 244 | +tox -e type |
| 245 | + |
| 246 | +# Run all quality checks |
| 247 | +tox -e lint,type,format-check |
| 248 | +``` |
| 249 | + |
| 250 | +### Running Tests |
| 251 | + |
| 252 | +```bash |
| 253 | +# Run all tests |
| 254 | +tox -e py311 |
| 255 | + |
| 256 | +# Run tests with Poetry |
| 257 | +poetry run pytest |
| 258 | + |
| 259 | +# Run with coverage |
| 260 | +poetry run pytest --cov=esdk_benchmark |
| 261 | + |
| 262 | +# Run specific test file |
| 263 | +poetry run pytest tests/test_benchmark.py |
| 264 | + |
| 265 | +# Run all tox environments |
| 266 | +tox |
| 267 | +``` |
| 268 | + |
| 269 | +### Memory Profiling |
| 270 | + |
| 271 | +For detailed memory analysis: |
| 272 | + |
| 273 | +```bash |
| 274 | +# Memory profiler is included in dev dependencies |
| 275 | +poetry run python -m memory_profiler src/esdk_benchmark/benchmark.py |
| 276 | + |
| 277 | +# Or using tox |
| 278 | +tox -e benchmark # Includes memory profiler |
| 279 | +``` |
| 280 | + |
| 281 | +### Tox Environments |
| 282 | + |
| 283 | +Available tox environments: |
| 284 | + |
| 285 | +- `py311`: Run tests under Python 3.11 |
| 286 | +- `lint`: Run linting checks |
| 287 | +- `type`: Run type checking |
| 288 | +- `format`: Apply code formatting |
| 289 | +- `format-check`: Check code formatting |
| 290 | +- `benchmark`: Run quick benchmark |
| 291 | +- `benchmark-full`: Run full benchmark suite |
| 292 | +- `verify`: Verify setup and dependencies |
| 293 | +- `clean`: Clean up build artifacts |
| 294 | + |
| 295 | +## Troubleshooting |
| 296 | + |
| 297 | +### Common Issues |
| 298 | + |
| 299 | +1. **Import Errors**: Ensure the Poetry environment is set up correctly |
| 300 | +   ```bash |
| 301 | +   poetry install |
| 302 | +   poetry run python -c "import esdk_benchmark; print('✓ OK')" |
| 303 | +   ``` |
| 304 | + |
| 305 | +2. **Configuration Not Found**: Check that the config file path is correct relative to the execution directory |
| 306 | +   ```bash |
| 307 | +   ls ../../../config/test-scenarios.yaml |
| 308 | +   ``` |
| 309 | + |
| 310 | +3. **Memory Issues**: For large data sizes, ensure sufficient system memory is available |
| 311 | + |
| 312 | +4. **Permission Errors**: Ensure the output directory exists and is writable |
| 313 | +   ```bash |
| 314 | +   mkdir -p ../../../results/raw-data/ |
| 315 | +   ``` |
| 316 | + |
| 317 | +5. **Poetry Issues**: If the Poetry environment is corrupted, remove it and reinstall |
| 318 | +   ```bash |
| 319 | +   poetry env remove python |
| 320 | +   poetry install |
| 321 | +   ``` |
| 322 | + |
| 323 | +### Debug Mode |
| 324 | + |
| 325 | +Enable verbose logging for troubleshooting: |
| 326 | + |
| 327 | +```python |
| 328 | +import logging |
| 329 | +logging.basicConfig(level=logging.DEBUG) |
| 330 | +``` |
| 331 | + |
| 332 | +## Performance Comparison |
| 333 | + |
| 334 | +This Python implementation mirrors the Java benchmark structure, enabling: |
| 335 | + |
| 336 | +- Cross-language performance comparisons |
| 337 | +- Consistent test scenarios and data sizes |
| 338 | +- Standardized output format for analysis |
| 339 | +- Similar statistical analysis and reporting |
| 340 | + |
| 341 | +## License |
| 342 | + |
| 343 | +This benchmark suite is part of the AWS Database Encryption SDK project and follows the same Apache-2.0 licensing terms. |