Skip to content

Conversation

@gargsaumya
Copy link
Contributor

@gargsaumya gargsaumya commented Sep 2, 2025

Work Item / Issue Reference

AB#38383

GitHub Issue: #<ISSUE_NUMBER>


Summary

This pull request refactors the logic for inferring SQL types from Python parameters in the mssql_python/cursor.py module, especially for batch operations using executemany. The main improvements are more accurate type detection for integer columns by considering the minimum and maximum values in the data, and a cleaner separation of concerns in the codebase.

Improvements to type inference and parameter handling:

  • Refactored the type mapping logic in _map_sql_type to use min_val and max_val for integer columns, allowing for more accurate type selection based on the actual range of values in the data.
  • Updated _create_parameter_types_list to accept and forward min_val and max_val, supporting the improved type inference for batch operations.
  • Replaced the static method _select_best_sample_value with a new _compute_column_type method, which determines a representative sample value and computes min/max for integer columns, enhancing how types are inferred for each parameter column in executemany. [1] [2]
  • Modified the executemany method to use _compute_column_type for each parameter column, passing the computed min/max values to _create_parameter_types_list for better type assignment.

Code cleanup:

  • Removed the now-unnecessary _select_best_sample_value static method, consolidating logic and reducing code duplication.

Copilot AI review requested due to automatic review settings September 2, 2025 03:50
@github-actions github-actions bot added the pr-size: medium Moderate update size label Sep 2, 2025
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This pull request improves parameter type inference and handling in the executemany method by replacing the static _select_best_sample_value method with a more comprehensive _compute_column_type method. The changes enhance accuracy for batch operations with varied data types, particularly for integer range detection and Data At Execution (DAE) handling.

  • Replaced static sample value selection with comprehensive column analysis that computes representative values, DAE flags, and min/max ranges
  • Enhanced integer type mapping to use min/max values for more accurate SQL type selection
  • Updated parameter creation pipeline to pass computed range information through the type mapping process

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

@github-actions github-actions bot added pr-size: medium Moderate update size and removed pr-size: medium Moderate update size labels Sep 2, 2025
@github-actions github-actions bot added pr-size: small Minimal code update and removed pr-size: medium Moderate update size pr-size: small Minimal code update labels Sep 2, 2025
Copy link
Contributor

@sumitmsft sumitmsft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got this from Copilot:

Missing test coverage for:

Batch decimal precision/scale aggregation
Large string threshold (NVARCHAR vs NVARCHAR(MAX)) including astral (UTF‑16 surrogate pair) edge cases
executemany sampling when a later row crosses the 4000 UTF‑16 unit boundary
Mixed Unicode/non-Unicode large strings (type chosen, DAE flag)
All‑NULL column inference (no crash, consistent default type)
Mixed bool and int values (BIT vs widen to INT)
Mixed numeric & string column fallback (string coercion, no silent truncation)
Large negative integers (e.g. −2^63+1) and 64‑bit boundary extremes (±2^63−1, ±2^31, etc.)
Mixed bool + very large int (widening logic)
Astral-heavy string where len(s) < 4000 but UTF‑16 units > 4000 (correct MAX path)
Large binary (>8000) bytes vs bytearray parity (same SQL type + isDAE)
Binary exactly at 8000 and just over (boundary behavior)
All parameters None in a column (executemany robustness)

Some tests may not be worth, so take your call and select the relevant ones.

Copy link
Collaborator

@bewithgaurav bewithgaurav left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need to merge to latest main

@github-actions github-actions bot added pr-size: small Minimal code update and removed pr-size: small Minimal code update labels Sep 11, 2025
@gargsaumya gargsaumya force-pushed the saumya/executemany-sqltypefix branch from d528403 to 5bcb1bc Compare September 11, 2025 12:57
sumitmsft
sumitmsft previously approved these changes Sep 12, 2025
bewithgaurav
bewithgaurav previously approved these changes Sep 15, 2025
@gargsaumya gargsaumya dismissed stale reviews from bewithgaurav and sumitmsft via f983c82 September 17, 2025 05:47
@github-actions github-actions bot added pr-size: small Minimal code update and removed pr-size: small Minimal code update labels Sep 17, 2025
bewithgaurav
bewithgaurav previously approved these changes Sep 17, 2025
sumitmsft
sumitmsft previously approved these changes Sep 17, 2025
@gargsaumya gargsaumya dismissed stale reviews from sumitmsft and bewithgaurav via a9f633c September 17, 2025 07:37
@gargsaumya gargsaumya force-pushed the saumya/executemany-sqltypefix branch from f983c82 to a9f633c Compare September 17, 2025 07:37
@github-actions github-actions bot added pr-size: small Minimal code update and removed pr-size: small Minimal code update labels Sep 17, 2025
@gargsaumya gargsaumya merged commit 82119f7 into main Sep 17, 2025
18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-size: small Minimal code update

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants