PE-Packer

An educational PE (Portable Executable) laboratory built in Rust with Python bindings for research and training on PE formats, safe unpacking workflows, and defensive ML evaluation.

⚠️ Educational Purpose Only: This tool is designed for security research, malware analysis training, and building robust ML detection systems. Use only with legitimate, authorized benign samples.

Quick links: Security Policy · License Appendix (Educational Use)

Features

Multiple Encryption Algorithms: XOR, AES, or no encryption
Anti-Debugging Techniques: Optional anti-debug code injection
Section Randomization: Randomize PE section names to vary packed output
Benign Metadata Injection: Add legitimate-looking metadata for evasion research
Stub Variation: Generate diverse decryption stubs for dataset diversity
Python CLI: Modern command-line interface with Typer
Training Dataset Generation: Batch generate variants of PE files with configurable parameters
Metadata Tracking: Complete metadata for each packed sample for ML training
Cross-platform: Runs on Linux, macOS, and Windows

Installation

Prerequisites

Rust 1.56+ (for building from source)
Python 3.8+
pip or poetry

From PyPI (educational stub)

pip install pe-packer-educational

This PyPI package is a non-functional educational stub that prints guidance and links to the source. It does not ship packing code or native modules.

From Source

# Clone the repository
git clone https://github.com/codeamt/rust-python-pe-packer
cd rust-python-pe-packer

# Using uv (recommended)
uv sync --group dev --group training
uv run maturin develop --features python

# Or using pip
pip install -e .[dev,training]
maturin develop

Quick Start

Pack a Single File

# Safety gate: actual packing requires BOTH
#   PE_PACKER_ALLOW_PACKING=1  and  --force
# Otherwise, the CLI runs in dry-run mode and prints analysis only.

# Dry-run (no env, no --force)
pe-packer pack malware.exe packed.exe

# Actual packing (requires both env and --force)
PE_PACKER_ALLOW_PACKING=1 pe-packer pack malware.exe packed.exe --force

# Pack with AES encryption and anti-debugging (dry-run: no env, no --force)
pe-packer pack malware.exe packed.exe --encryption aes --anti-debug

# Randomize sections and add benign metadata (with gates)
PE_PACKER_ALLOW_PACKING=1 pe-packer pack malware.exe packed.exe --force \
  --encryption aes \
  --anti-debug \
  --random-sections \
  --benign-metadata
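The two-part safety gate described in the comments above can be sketched in a few lines of Python. This is a minimal illustration of the documented rule only; the helper name `packing_allowed` is hypothetical and is not taken from the project's source.

```python
import os
from typing import Mapping, Optional


def packing_allowed(force: bool, env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True only when BOTH documented gates are set.

    Hypothetical helper: it mirrors the rule stated in this README,
    not the project's actual implementation.
    """
    env = os.environ if env is None else env
    return force and env.get("PE_PACKER_ALLOW_PACKING") == "1"


# Either gate on its own leaves the CLI in dry-run mode.
print(packing_allowed(force=True, env={}))                                 # False
print(packing_allowed(force=False, env={"PE_PACKER_ALLOW_PACKING": "1"}))  # False
print(packing_allowed(force=True, env={"PE_PACKER_ALLOW_PACKING": "1"}))   # True
```

Requiring an environment variable and a flag together means a copied command line alone cannot trigger real packing.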

Generate Training Dataset

# Generate 20 variants per file with all configurations
pe-packer generate-training-data \
  ./benign_samples \
  ./training_data \
  --variants 20

# Generate with specific encryption methods
pe-packer generate-training-data \
  ./benign_samples \
  ./training_data \
  --variants 15 \
  --encryption xor,aes

Testing Dataset Generation (Checklist)

Ensure you have a local folder with benign PE files, for example: ./benign_samples/
Run a small test to validate the pipeline end-to-end:

# Minimal smoke test: 1 variant, analysis-focused
pe-packer generate-training-data ./benign_samples ./training_data --variants 1

# Analyze the produced metadata
pe-packer analyze-dataset ./training_data/dataset_metadata.json

If you intend to generate packed binaries rather than analysis output only, enable the safety gates deliberately when invoking direct packing commands (see the Pack a Single File section). Dataset generation itself focuses on safe educational workflows and metadata.

Analyze Generated Dataset

pe-packer analyze-dataset training_data/dataset_metadata.json

Validate PE Files

pe-packer validate suspicious.exe

Commands

`pack`

Pack a single PE file with specified packing options.

pe-packer pack INPUT OUTPUT [OPTIONS]

Options:

--encryption, -e: Encryption algorithm (xor, aes, none) [default: xor]
--key, -k: Encryption key as hex string (auto-generated if not provided)
--anti-debug: Enable anti-debugging techniques
--random-sections: Randomize section names
--benign-metadata: Add benign-looking metadata
--stub-variation, -s: Stub variation identifier (1-32) [default: 1]
--verbose, -v: Enable verbose logging
--force: Required along with PE_PACKER_ALLOW_PACKING=1 to perform actual packing

Example:

PE_PACKER_ALLOW_PACKING=1 pe-packer pack sample.exe packed.exe --force \
  --encryption aes \
  --key deadbeef \
  --anti-debug \
  --stub-variation 3
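The `--key` option takes a hex string and falls back to an auto-generated key. A rough sketch of that behaviour, using only the standard library, might look like the following. The function name `resolve_key` and the 16-byte default length are assumptions for illustration; how the CLI checks key length per algorithm is not documented here, so the sketch does not check it either.

```python
import secrets
from typing import Optional


def resolve_key(hex_key: Optional[str], length: int = 16) -> bytes:
    """Decode a hex key, or generate a random one when none is given.

    Hypothetical helper; the 16-byte default is an assumption.
    """
    if hex_key is None:
        return secrets.token_bytes(length)
    # bytes.fromhex raises ValueError on malformed or odd-length input.
    return bytes.fromhex(hex_key)


print(resolve_key("deadbeef").hex())  # deadbeef
print(len(resolve_key(None)))         # 16
```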

`generate-training-data`

Generate a training dataset of packed samples.

pe-packer generate-training-data INPUT_DIR OUTPUT_DIR [OPTIONS]

Options:

--variants, -n: Variants per file (1-100) [default: 10]
--encryption, -e: Comma-separated encryption methods [default: xor,aes,none]
--anti-debug / --no-anti-debug: Enable anti-debugging [default: enabled]
--random-sections / --no-random-sections: Randomize sections [default: enabled]
--benign-metadata / --no-benign-metadata: Add benign metadata [default: enabled]
--stub-variations, -s: Number of stub variations [default: 5]
--verbose, -v: Enable verbose logging

Example:

pe-packer generate-training-data \
  ./benign_samples \
  ./training_data \
  --variants 50 \
  --encryption xor,aes \
  --stub-variations 10

`analyze-dataset`

Analyze a generated training dataset.

pe-packer analyze-dataset METADATA_FILE [OPTIONS]

Options:

--verbose, -v: Enable verbose logging

Example:

pe-packer analyze-dataset training_data/dataset_metadata.json

`validate`

Validate that a file is a well-formed PE.

pe-packer validate FILE_PATH [OPTIONS]

Options:

--verbose, -v: Enable verbose logging

Example:

pe-packer validate sample.exe
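For orientation, the most basic structural check on a PE file is the DOS `MZ` magic plus the `PE\0\0` signature located via the `e_lfanew` field at offset 0x3C. The sketch below shows only that check. The name `looks_like_pe` is hypothetical, and which checks `validate` actually performs is not specified in this README.

```python
import struct


def looks_like_pe(data: bytes) -> bool:
    """Check the DOS 'MZ' magic and the PE signature it points to.

    Hypothetical helper; a full validator would also inspect the COFF
    and optional headers and the section table.
    """
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    # e_lfanew: little-endian 32-bit file offset of the PE header, at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    return data[pe_offset:pe_offset + 4] == b"PE\x00\x00"


# Synthetic buffer: a zeroed DOS header followed directly by the signature.
header = bytearray(0x44)
header[:2] = b"MZ"
struct.pack_into("<I", header, 0x3C, 0x40)
header[0x40:0x44] = b"PE\x00\x00"

print(looks_like_pe(bytes(header)))  # True
print(looks_like_pe(b"not a pe"))    # False
```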

Python API

Use PE-Packer programmatically in your Python code.

Basic Usage

from pe_packer import PEPacker, PackerConfig, EncryptionAlgorithm

# Create a configuration
config = PackerConfig(
    encryption=EncryptionAlgorithm.AES,
    add_anti_debug=True,
    randomize_sections=True,
    add_benign_metadata=True,
)

# Create a packer instance
packer = PEPacker(config)

# Pack a file
packer.pack_file("input.exe", "output.exe")

# Or pack from bytes
with open("input.exe", "rb") as f:
    data = f.read()
packed_data = packer.pack_bytes(data)

Training Data Generation

from pe_packer.training import DatasetGenerator, TrainingConfig
from pathlib import Path

# Configure dataset generation
config = TrainingConfig(
    variants_per_file=20,
    encryption_methods=["xor", "aes", "none"],
    enable_anti_debug=[True, False],
    stub_variations=5,
)

# Generate dataset
generator = DatasetGenerator(
    input_dir=Path("./benign_samples"),
    output_dir=Path("./training_data"),
    config=config,
)

dataset_info = generator.generate()
print(f"Generated {dataset_info['total_samples']} samples")

Dataset Analysis

from pe_packer.training import MetadataManager
from pathlib import Path

# Load and analyze dataset
manager = MetadataManager(Path("training_data/dataset_metadata.json"))

# Get statistics
stats = manager.get_statistics()
print(f"Total samples: {stats['total_variants']}")
print(f"Encryption distribution: {stats['encryption_distribution']}")
print(f"Anti-debug coverage: {stats['anti_debug_percentage']:.1f}%")

# Query specific samples
aes_samples = manager.get_samples_by_encryption("aes")
anti_debug_samples = manager.get_samples_with_anti_debug()

Architecture

Rust Backend (`src/`)

packer/: Core packing logic with encryption and stub generation
pe/: PE file parsing, building, and structure handling
python/: PyO3 bindings for Python integration
utils/: Error handling, logging, and utilities

Python Layer (`python/`)

core.py: High-level Python API for packing
cli.py: Modern Typer CLI interface
training/: Dataset generation and metadata management
utils/: File validation, entropy calculation, and helpers
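The entropy calculation mentioned for `utils/` is commonly the Shannon entropy of the byte distribution, a standard signal for spotting encrypted or compressed sections. A minimal sketch follows, assuming that definition; the function name `shannon_entropy` is illustrative and may not match the project's helper.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte, from 0.0 up to 8.0."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    # Sum p * log2(1/p) over every byte value that occurs.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


print(shannon_entropy(b"\x00" * 1024))     # 0.0 (a single repeated byte)
print(shannon_entropy(bytes(range(256))))  # 8.0 (every byte value once)
```

Values close to 8.0 for a section are typical of encrypted or compressed content, which is why entropy is a common feature in packer detection.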

Configuration

Default Configuration

# config/default.toml
encryption = "xor"
add_anti_debug = false
randomize_sections = false
add_benign_metadata = false
stub_variation = 1

Training Configuration

# config/training.toml
variants_per_file = 10
encryption_methods = ["xor", "aes", "none"]
enable_anti_debug = [true, false]
enable_random_sections = [true, false]
enable_benign_metadata = [true, false]
stub_variations = 5
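The list-valued settings above define a grid of option combinations. One way to enumerate that grid is a Cartesian product, sketched below. Whether the generator walks the full grid or samples from it to reach `variants_per_file` is not stated in this README, so treat the function `config_grid` as a hypothetical illustration.

```python
from itertools import product

# Values copied from config/training.toml above.
training = {
    "encryption_methods": ["xor", "aes", "none"],
    "enable_anti_debug": [True, False],
    "enable_random_sections": [True, False],
    "enable_benign_metadata": [True, False],
}


def config_grid(options: dict) -> list:
    """Return every combination of the list-valued options as a dict."""
    keys = list(options)
    return [dict(zip(keys, values)) for values in product(*(options[k] for k in keys))]


grid = config_grid(training)
print(len(grid))  # 24, i.e. 3 * 2 * 2 * 2
print(grid[0])
```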

Production Configuration

# config/production.toml
encryption = "aes"
add_anti_debug = true
randomize_sections = true
add_benign_metadata = true
stub_variation = 32

Output Format

Packed PE File

The output is a valid PE executable with:

Modified section headers and names
Encrypted original code sections
Decryption stub as entry point
Proper PE alignment and structure

Dataset Metadata

Generated as dataset_metadata.json:

{
  "total_samples": 100,
  "input_dir": "/path/to/samples",
  "output_dir": "/path/to/output",
  "config": {
    "variants_per_file": 10,
    "encryption_methods": ["xor", "aes", "none"],
    "enable_anti_debug": [true, false]
  },
  "files": [
    {
      "original_file": "sample.exe",
      "original_path": "/path/to/sample.exe",
      "variants": [
        {
          "variant_id": 0,
          "output_file": "sample_packed_000.exe",
          "config": {
            "encryption": "xor",
            "add_anti_debug": false
          },
          "file_size": 45056
        }
      ]
    }
  ]
}
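Because the metadata is plain JSON, it can be inspected without the `pe_packer` package. The sketch below counts variants per encryption method using only the fields shown above. The embedded sample is trimmed and invented for illustration, and the helper name `encryption_counts` is hypothetical.

```python
import json
from collections import Counter

# Trimmed sample in the documented layout; real files hold more entries.
SAMPLE = """
{
  "total_samples": 2,
  "files": [
    {
      "original_file": "sample.exe",
      "variants": [
        {"variant_id": 0, "config": {"encryption": "xor", "add_anti_debug": false}},
        {"variant_id": 1, "config": {"encryption": "aes", "add_anti_debug": true}}
      ]
    }
  ]
}
"""


def encryption_counts(metadata: dict) -> Counter:
    """Count variants per encryption method across all input files."""
    return Counter(
        variant["config"]["encryption"]
        for entry in metadata["files"]
        for variant in entry["variants"]
    )


metadata = json.loads(SAMPLE)
print(dict(encryption_counts(metadata)))  # {'xor': 1, 'aes': 1}
```

To run it against a real dataset, replace `json.loads(SAMPLE)` with `json.load` on an open handle to `dataset_metadata.json`.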

Testing

Run Rust Tests

cargo test --verbose

Run Python Tests

pip install pytest pytest-benchmark
pytest python/tests/ -v

Benchmark

# Rust benchmarks
cargo bench

# Python performance tests
python benchmarks/python_performance.py

Building Wheels

Build distributable Python wheels:

pip install maturin
maturin build --release

# Or for local development
maturin develop

Contributing

Contributions are welcome! Please ensure:

All tests pass: cargo test && pytest python/tests/
Code is formatted: cargo fmt && black python/
Linting passes: cargo clippy && ruff check python/
Documentation is updated

License

This project is licensed under Apache 2.0, with an additional License Appendix (Educational Use). See the LICENSE and LICENSE-APPENDIX.md files for details.

Security Disclaimer

This tool is provided for educational and authorized security research purposes only. Users are responsible for ensuring they have proper authorization before using this tool on any files or systems. Unauthorized modification or distribution of executables may violate laws in your jurisdiction.

Citation

If you use PE-Packer in your research, please cite:

@software{pe_packer,
  title={PE-Packer: Educational PE Packer for Malware Detection Training},
  author={AnnMargaret Tutu},
  year={2025},
  url={https://github.com/codeamt/rust-python-pe-packer}
}

Support

For issues, questions, or suggestions:

Open an issue on GitHub
Check existing documentation in docs/
Review examples in examples/

Acknowledgments

Built with Rust, goblin, and PyO3
CLI built with Typer
Inspired by educational packing techniques and malware analysis research

Last Updated: 2025
Version: 0.1.0
