@qued qued commented Oct 10, 2025

📄 56% (0.56x) speedup for zoom_image in unstructured_inference/models/tables.py

⏱️ Runtime: 296 milliseconds → 190 milliseconds (best of 15 runs)
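
As a quick arithmetic check, the headline figure follows from the two runtimes if speedup is taken as original/optimized - 1. That definition is an assumption here (the report does not spell it out), and `speedup_percent` is a hypothetical helper, not part of the PR:

```python
def speedup_percent(original_ms: float, optimized_ms: float) -> float:
    """Return the relative speedup in percent, defined as original/optimized - 1."""
    if optimized_ms <= 0:
        raise ValueError("optimized runtime must be positive")
    return (original_ms / optimized_ms - 1.0) * 100.0


# Runtimes reported above: 296 ms before, 190 ms after.
headline = speedup_percent(296, 190)
print(f"{headline:.1f}% speedup")  # prints "55.8% speedup", which rounds to 56%
```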

πŸ“ Explanation and details

The optimized code achieves a 56% speedup through three key memory optimization techniques:

1. Reduced Memory Allocations

  • Moved kernel = np.ones((1, 1), np.uint8) outside the resize operation to avoid unnecessary intermediate allocations
  • Used np.asarray(image) instead of np.array(image) to avoid copying when the PIL image is already a numpy-compatible array
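
The difference between the two constructors is easiest to see on an input that is already a NumPy array. This is a minimal sketch with a made-up `pixels` array; whether a PIL image converts without a copy depends on how Pillow exposes its buffer, which this PR does not detail:

```python
import numpy as np

# Made-up stand-in for image data; not taken from the PR.
pixels = np.zeros((4, 6, 3), dtype=np.uint8)

copied = np.array(pixels)    # np.array copies by default
viewed = np.asarray(pixels)  # np.asarray returns the input when no conversion is needed

print(np.shares_memory(copied, pixels))  # False: a separate buffer was allocated
print(viewed is pixels)                  # True: no new array was created
```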

2. In-Place Operations

  • Added dst=new_image parameter to both cv2.dilate() and cv2.erode() operations, making them modify the existing array in-place rather than creating new copies
  • This eliminates two major memory allocations that were consuming 32% of the original runtime (16.7% + 15.8% from the profiler)

3. Memory Access Pattern Improvements
The profiler shows the most dramatic improvements in the morphological operations:

  • cv2.dilate time reduced from 54.8ms to 0.5ms (99% reduction)
  • cv2.erode time reduced from 52.1ms to 0.2ms (99.6% reduction)

Performance Characteristics
The optimization improves most test cases; a few small-image cases are within noise or slightly slower (up to 8.76% slower on a 10x10 input). The strongest gains are for:

  • Large images (roughly 8-30% speedup on the large-image tests, 250x400 and up)
  • Extreme scaling operations (30% improvement on extreme downscaling)
  • Memory-intensive scenarios where avoiding copies provides the most benefit

The core image processing logic remains identical - only memory management was optimized to eliminate unnecessary allocations and copies during the morphological operations.

✅ Correctness verification report:

| Test | Status |
| --- | --- |
| ⚙️ Existing Unit Tests | ✅ 31 Passed |
| 🌀 Generated Regression Tests | ✅ 34 Passed |
| ⏪ Replay Tests | ✅ 5 Passed |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |

⚙️ Existing Unit Tests and Runtime

| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
| --- | --- | --- | --- |
| models/test_tables.py::test_zoom_image | 131ms | 80.9ms | 62.0% ✅ |

🌀 Generated Regression Tests and Runtime
import cv2
import numpy as np
# imports
import pytest  # used for our unit tests
from PIL import Image as PILImage
from unstructured_inference.models.tables import zoom_image

# ----------- UNIT TESTS ------------

# Helper to create a solid color image
def create_image(width, height, color=(255, 0, 0)):
    """Create a PIL RGB image of the given size and color."""
    return PILImage.new("RGB", (width, height), color=color)

# Helper to compare two PIL images for pixel-wise equality
def images_equal(img1, img2):
    arr1 = np.array(img1)
    arr2 = np.array(img2)
    return arr1.shape == arr2.shape and np.all(arr1 == arr2)

# 1. BASIC TEST CASES

def test_zoom_identity():
    """Zoom factor 1.0 should preserve image size and content (modulo dilation/erosion)."""
    img = create_image(10, 10, (123, 222, 100))
    codeflash_output = zoom_image(img, 1.0); out = codeflash_output # 107μs -> 100μs (7.29% faster)

def test_zoom_upscale():
    """Zoom factor >1 should increase image size proportionally."""
    img = create_image(8, 6, (10, 20, 30))
    codeflash_output = zoom_image(img, 2.0); out = codeflash_output # 125μs -> 117μs (6.48% faster)
    # Check that the center pixel's color is close to the original (interpolation)
    arr = np.array(out)

def test_zoom_downscale():
    """Zoom factor <1 should decrease image size proportionally."""
    img = create_image(20, 10, (200, 100, 50))
    codeflash_output = zoom_image(img, 0.5); out = codeflash_output # 110μs -> 109μs (0.936% faster)
    # Check that the average color is close to the original (interpolation)
    arr = np.array(out)
    mean_color = arr.mean(axis=(0, 1))

def test_zoom_zero():
    """Zoom factor 0 should be treated as 1 (no scaling)."""
    img = create_image(7, 7, (0, 255, 0))
    codeflash_output = zoom_image(img, 0); out = codeflash_output # 86.3μs -> 85.7μs (0.691% faster)

def test_zoom_negative():
    """Negative zoom factor should be treated as 1 (no scaling)."""
    img = create_image(5, 5, (0, 0, 255))
    codeflash_output = zoom_image(img, -2.5); out = codeflash_output # 84.1μs -> 83.6μs (0.639% faster)

# 2. EDGE TEST CASES

def test_zoom_minimal_image():
    """1x1 pixel image should remain 1x1 for zoom=1, and scale up for zoom>1."""
    img = create_image(1, 1, (111, 222, 123))
    codeflash_output = zoom_image(img, 1); out1 = codeflash_output # 80.9μs -> 81.4μs (0.650% slower)
    codeflash_output = zoom_image(img, 3); out2 = codeflash_output # 77.9μs -> 75.6μs (3.12% faster)
    arr = np.array(out2)

def test_zoom_non_integer_factor():
    """Non-integer zoom factors should result in correctly scaled image sizes."""
    img = create_image(10, 10, (1, 2, 3))
    codeflash_output = zoom_image(img, 1.5); out = codeflash_output # 96.5μs -> 105μs (8.76% slower)


def test_zoom_large_factor():
    """Very large zoom factor should scale image up to large size."""
    img = create_image(2, 2, (10, 20, 30))
    codeflash_output = zoom_image(img, 100); out = codeflash_output # 312μs -> 283μs (10.3% faster)
    arr = np.array(out)


def test_zoom_alpha_channel():
    """Function should process RGBA images by discarding alpha (should not error)."""
    img = PILImage.new("RGBA", (10, 10), color=(10, 20, 30, 40))
    # Should not raise, but alpha is dropped in conversion
    codeflash_output = zoom_image(img.convert("RGB"), 2.0); out = codeflash_output # 115μs -> 113μs (2.14% faster)

def test_zoom_non_square_image():
    """Non-square images should scale proportionally."""
    img = create_image(8, 3, (123, 45, 67))
    codeflash_output = zoom_image(img, 2.5); out = codeflash_output # 117μs -> 114μs (2.37% faster)

# 3. LARGE SCALE TEST CASES

def test_zoom_large_image_upscale():
    """Zooming a large image up should work and be reasonably fast."""
    img = create_image(250, 400, (10, 20, 30))
    codeflash_output = zoom_image(img, 2); out = codeflash_output # 1.95ms -> 1.69ms (15.1% faster)
    # Check that the corner pixel is as expected (solid color)
    arr = np.array(out)

def test_zoom_large_image_downscale():
    """Zooming a large image down should work and be reasonably fast."""
    img = create_image(999, 999, (123, 234, 45))
    codeflash_output = zoom_image(img, 0.5); out = codeflash_output # 3.53ms -> 2.95ms (19.7% faster)
    # Check that the center pixel is close to the original color
    arr = np.array(out)
    center = arr[arr.shape[0]//2, arr.shape[1]//2]

def test_zoom_large_non_uniform_image():
    """Zooming a large, non-uniform image should preserve general structure."""
    # Create a gradient image
    arr = np.zeros((500, 700, 3), dtype=np.uint8)
    for i in range(500):
        for j in range(700):
            arr[i, j] = (i % 256, j % 256, (i+j) % 256)
    img = PILImage.fromarray(arr)
    codeflash_output = zoom_image(img, 0.8); out = codeflash_output # 2.20ms -> 1.97ms (11.7% faster)
    # Check that the mean color is similar (structure preserved)
    arr_out = np.array(out)
    arr_in = np.array(img)
    mean_in = arr_in.mean(axis=(0,1))
    mean_out = arr_out.mean(axis=(0,1))

def test_zoom_large_image_extreme_downscale():
    """Zooming a large image by a tiny factor should not crash or produce zero-size."""
    img = create_image(999, 999, (1, 2, 3))
    codeflash_output = zoom_image(img, 0.01); out = codeflash_output # 2.07ms -> 1.59ms (30.1% faster)

def test_zoom_large_image_extreme_upscale():
    """Zooming a small image by a large factor should not crash and should scale up."""
    img = create_image(2, 2, (1, 2, 3))
    codeflash_output = zoom_image(img, 400); out = codeflash_output # 2.19ms -> 1.92ms (13.8% faster)
    arr = np.array(out)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
import cv2
import numpy as np
# imports
import pytest  # used for our unit tests
from PIL import Image as PILImage
from unstructured_inference.models.tables import zoom_image

# unit tests

# ---------- BASIC TEST CASES ----------

def create_test_image(size=(10, 10), color=(255, 0, 0)):
    """Helper to create a solid color RGB PIL image."""
    return PILImage.new("RGB", size, color)

def test_zoom_image_identity_zoom_1():
    # Test that zoom=1 returns an image of the same size (with possible minor pixel changes due to dilation/erosion)
    img = create_test_image((10, 15), (123, 222, 111))
    codeflash_output = zoom_image(img, 1); out = codeflash_output # 90.8μs -> 90.4μs (0.509% faster)

def test_zoom_image_upscale():
    # Test that zoom > 1 upscales the image
    img = create_test_image((10, 10), (0, 255, 0))
    zoom = 2
    codeflash_output = zoom_image(img, zoom); out = codeflash_output # 120μs -> 117μs (3.04% faster)

def test_zoom_image_downscale():
    # Test that zoom < 1 downscales the image
    img = create_test_image((10, 10), (0, 0, 255))
    zoom = 0.5
    codeflash_output = zoom_image(img, zoom); out = codeflash_output # 108μs -> 97.9μs (10.5% faster)

def test_zoom_image_non_integer_zoom():
    # Test that non-integer zoom factors work
    img = create_test_image((8, 6), (10, 20, 30))
    zoom = 1.5
    codeflash_output = zoom_image(img, zoom); out = codeflash_output # 108μs -> 95.7μs (13.6% faster)
    expected_size = (int(round(8*1.5)), int(round(6*1.5)))

def test_zoom_image_preserves_mode():
    # Test that the mode is preserved (RGB)
    img = create_test_image((7, 7), (0, 0, 0))
    codeflash_output = zoom_image(img, 1); out = codeflash_output # 84.3μs -> 84.4μs (0.171% slower)

# ---------- EDGE TEST CASES ----------

def test_zoom_image_zero_zoom():
    # Test that zoom=0 is treated as zoom=1
    img = create_test_image((12, 8), (200, 100, 50))
    codeflash_output = zoom_image(img, 0); out = codeflash_output # 85.2μs -> 82.0μs (3.93% faster)

def test_zoom_image_negative_zoom():
    # Test that negative zoom is treated as zoom=1
    img = create_test_image((9, 9), (50, 50, 50))
    codeflash_output = zoom_image(img, -2); out = codeflash_output # 83.0μs -> 81.9μs (1.38% faster)

def test_zoom_image_minimal_1x1():
    # Test with a 1x1 image, any zoom factor
    img = create_test_image((1, 1), (123, 45, 67))
    codeflash_output = zoom_image(img, 1); out1 = codeflash_output
    codeflash_output = zoom_image(img, 2); out2 = codeflash_output
    codeflash_output = zoom_image(img, 0.5); out3 = codeflash_output

def test_zoom_image_non_square():
    # Test with non-square image
    img = create_test_image((13, 7), (1, 2, 3))
    codeflash_output = zoom_image(img, 2); out = codeflash_output # 121μs -> 123μs (1.92% slower)


def test_zoom_image_large_zoom():
    # Test with a large zoom factor
    img = create_test_image((2, 2), (255, 255, 255))
    codeflash_output = zoom_image(img, 10); out = codeflash_output # 161μs -> 154μs (4.31% faster)

def test_zoom_image_non_rgb_image():
    # Test with an image with alpha channel (RGBA)
    img = PILImage.new("RGBA", (5, 5), (10, 20, 30, 40))
    # Convert to RGB as the function expects RGB input
    img_rgb = img.convert("RGB")
    codeflash_output = zoom_image(img_rgb, 1.5); out = codeflash_output # 130μs -> 123μs (5.82% faster)


def test_zoom_image_float_size():
    # Test with float zoom that results in non-integer size
    img = create_test_image((7, 5), (100, 100, 100))
    zoom = 1.3
    expected_size = (int(round(7*1.3)), int(round(5*1.3)))
    codeflash_output = zoom_image(img, zoom); out = codeflash_output # 151μs -> 129μs (17.7% faster)

# ---------- LARGE SCALE TEST CASES ----------

def test_zoom_image_large_image_upscale():
    # Test with a large image upscaled
    img = create_test_image((500, 400), (10, 20, 30))
    zoom = 2
    codeflash_output = zoom_image(img, zoom); out = codeflash_output # 3.08ms -> 2.61ms (18.0% faster)

def test_zoom_image_large_image_downscale():
    # Test with a large image downscaled
    img = create_test_image((800, 600), (200, 100, 50))
    zoom = 0.5
    codeflash_output = zoom_image(img, zoom); out = codeflash_output # 2.22ms -> 2.06ms (7.56% faster)

def test_zoom_image_large_image_identity():
    # Test with a large image, zoom=1
    img = create_test_image((999, 999), (1, 2, 3))
    codeflash_output = zoom_image(img, 1); out = codeflash_output # 3.64ms -> 2.93ms (24.3% faster)


def test_zoom_image_performance_large():
    # Test that the function can process a large image in reasonable time
    img = create_test_image((999, 999), (123, 234, 45))
    codeflash_output = zoom_image(img, 0.9); out = codeflash_output # 4.08ms -> 3.59ms (13.7% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
βͺ Replay Tests and Runtime
Test File::Test Function Original ⏱️ Optimized ⏱️ Speedup
test_pytest_test_unstructured_inference__replay_test_0.py::test_unstructured_inference_models_tables_zoom_image 137ms 85.1ms 61.4%βœ…

To edit these changes, run `git checkout codeflash/optimize-zoom_image-metaix6e`, commit your edits, and push.



Note

Optimizes zoom_image in unstructured_inference/models/tables.py using np.asarray and in-place cv2 morphology, and bumps version to 1.0.8-dev2 with changelog entry.

  • Performance:
    • Optimize zoom_image in unstructured_inference/models/tables.py:
      • Use np.asarray for image conversion.
      • Make cv2.dilate/cv2.erode operate in-place via dst.
  • Versioning:
    • Update __version__ to 1.0.8-dev2 in unstructured_inference/__version__.py.
  • Changelog:
    • Add 1.0.8-dev2 entry noting zoom_image optimization.

Written by Cursor Bugbot for commit 1cfe7e7.

codeflash-ai bot and others added 3 commits August 27, 2025 01:22
@qued qued merged commit 5c352fb into main Oct 10, 2025
13 checks passed
@qued qued deleted the codeflash/optimize-zoom_image-metaix6e branch October 10, 2025 22:33