Labels: documentation, enhancement, plugins
Feature Request: Implement Argument Normalizer Plugin (Native)
Implement a native (in‑process) plugin named ArgumentNormalizer that stabilizes prompt/tool inputs by normalizing:
- Unicode (NFC/NFD/NFKC/NFKD) and control characters
- Whitespace (trim, collapse internal spaces, normalize CR/LF, optional blank-line collapse)
- Optional casing (none/lower/upper/title)
- Numeric dates to ISO 8601 (`YYYY-MM-DD`) with `day_first`/`year_first` ambiguity handling
- Numbers to a canonical form (strip thousands separators; `.` as decimal)
Default mode should be non-blocking (permissive), returning modified payloads when changes occur. Target hooks: prompt_pre_fetch, tool_pre_invoke.
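The Unicode and whitespace steps listed above could look roughly like the following. This is a minimal sketch built on the standard library only; the function name `normalize_text` and its parameters are illustrative (they mirror the proposed config keys) and are not the plugin's actual API.

```python
import re
import unicodedata


def normalize_text(
    text: str,
    unicode_form: str = "NFC",
    remove_control_chars: bool = True,
    trim: bool = True,
    collapse_internal: bool = True,
    normalize_newlines: bool = True,
) -> str:
    """Illustrative string normalizer (hypothetical helper, not the plugin's code)."""
    # Unicode normalization: NFC, NFD, NFKC, or NFKD.
    text = unicodedata.normalize(unicode_form, text)
    if normalize_newlines:
        # Convert CRLF and lone CR to LF.
        text = text.replace("\r\n", "\n").replace("\r", "\n")
    if remove_control_chars:
        # Drop characters in category Cc, keeping newline and tab.
        text = "".join(
            ch for ch in text
            if ch in "\n\t" or unicodedata.category(ch) != "Cc"
        )
    if collapse_internal:
        # Collapse runs of spaces and tabs; line breaks are left intact.
        text = re.sub(r"[ \t]+", " ", text)
    if trim:
        text = text.strip()
    return text


# "e" + combining acute accent becomes the single code point U+00E9 under NFC.
print(normalize_text("  Cafe\u0301 \t au  lait\r\n"))  # prints "Café au lait"
```

Running normalization before newline handling and whitespace collapsing is one possible ordering; the issue does not prescribe one.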
Problem Statement
Incoming args vary widely in normalization (Unicode forms, whitespace, casing), and frequently include ambiguous numeric dates or locale-specific number formatting. This leads to:
- Prompt template mismatches and brittle tool args
- Lower PII/regex filtering accuracy due to noisy inputs
- Unnecessary retries and inconsistent behavior across environments
Proposal
Add plugins/argument_normalizer/argument_normalizer.py implementing a Plugin subclass with:
- Config (`ArgumentNormalizerConfig`) for toggles and strategies (Unicode, whitespace, casing, dates, numbers)
- Per-field regex-based overrides (`field_overrides`) to tune normalization by key path (e.g., `user.name`, `items[0].title`)
- Safe recursive normalization for dict/list structures
- Non-blocking results with `modified_payload` and minimal metadata; no violations by default
Recommended ordering: run before PII filtering so detectors see stabilized inputs.
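To make the recursive walk and per-field overrides concrete, here is a rough sketch. It assumes overrides are `(regex, normalizer)` pairs matched against the full key path, with the first match winning; the issue does not fix those semantics, and `normalize_args` is a hypothetical helper name.

```python
import re
from typing import Any, Callable, List, Tuple

Normalizer = Callable[[str], str]


def normalize_args(
    value: Any,
    default: Normalizer,
    overrides: List[Tuple[str, Normalizer]],
    path: str = "",
) -> Any:
    """Return a normalized copy of nested dict/list args (illustrative only)."""
    if isinstance(value, dict):
        return {
            key: normalize_args(
                item, default, overrides, f"{path}.{key}" if path else str(key)
            )
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [
            normalize_args(item, default, overrides, f"{path}[{index}]")
            for index, item in enumerate(value)
        ]
    if isinstance(value, str):
        # Assumption: the first override whose regex matches the whole path wins.
        for pattern, normalizer in overrides:
            if re.fullmatch(pattern, path):
                return normalizer(value)
        return default(value)
    # Non-string scalars (int, float, bool, None) pass through unchanged.
    return value


args = {"user": {"name": "  Ada  "}, "items": [{"title": "  Hello   World "}]}
overrides = [(r"user\.name", lambda s: s)]  # leave names untouched
result = normalize_args(args, lambda s: " ".join(s.split()), overrides)
changed = result != args  # a plugin would return modified_payload only if True
```

Building a new structure instead of mutating the input keeps the original payload available for the "did anything change" comparison.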
Scope
In scope:
- `prompt_pre_fetch` and `tool_pre_invoke` hooks
- Unicode normalization, whitespace cleanup, optional casing, numeric date and number normalization
- Per-field overrides and conditions to target specific prompts/tools
Out of scope (for this issue):
- Resource hooks; advanced locale-aware date parsing libraries; schema validation; rate limiting
Configuration (Example)
- name: "ArgumentNormalizer"
  kind: "plugins.argument_normalizer.argument_normalizer.ArgumentNormalizerPlugin"
  description: "Normalizes Unicode, whitespace, casing, dates, and numbers in args"
  version: "0.1.0"
  author: "Mihai Criveti"
  hooks: ["prompt_pre_fetch", "tool_pre_invoke"]
  mode: "permissive"
  priority: 40
  conditions: []
  config:
    enable_unicode: true
    unicode_form: "NFC"
    remove_control_chars: true
    enable_whitespace: true
    trim: true
    collapse_internal: true
    normalize_newlines: true
    collapse_blank_lines: false
    enable_casing: false
    case_strategy: "none"  # none|lower|upper|title
    enable_dates: true
    day_first: false
    year_first: false
    enable_numbers: true
    decimal_detection: "auto"  # auto|comma|dot
    field_overrides: []
Ordering Guidance
- Argument Normalizer should precede PII Filter. Suggested priorities:
  - `ArgumentNormalizer`: 40
  - `PIIFilterPlugin`: 50
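The effect of this ordering can be illustrated with a toy pipeline. Everything here is a stand-in: `run_pipeline`, `normalize`, and `mask_ssn` are hypothetical, the sketch assumes lower priority values run first (consistent with 40 preceding 50), and it uses NFKC instead of the default NFC because only the compatibility forms fold fullwidth digits to ASCII.

```python
import re
import unicodedata
from typing import Callable, List, Tuple

Stage = Tuple[str, int, Callable[[str], str]]


def normalize(text: str) -> str:
    # Stand-in normalizer: NFKC folds fullwidth digits to ASCII digits.
    return unicodedata.normalize("NFKC", text)


def mask_ssn(text: str) -> str:
    # Stand-in PII detector: matches ASCII-digit SSN-like patterns only.
    return re.sub(r"\b[0-9]{3}-[0-9]{2}-[0-9]{4}\b", "[REDACTED]", text)


def run_pipeline(text: str, stages: List[Stage]) -> str:
    # Assumption: stages run in ascending priority order.
    for _name, _priority, transform in sorted(stages, key=lambda s: s[1]):
        text = transform(text)
    return text


stages = [
    ("PIIFilterPlugin", 50, mask_ssn),
    ("ArgumentNormalizer", 40, normalize),
]
raw = "id \uff11\uff12\uff13-45-6789"  # first three digits are fullwidth

print(run_pipeline(raw, stages))  # prints "id [REDACTED]"
print(mask_ssn(raw) == raw)       # prints "True": the detector alone misses it
```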
Acceptance Criteria
- Native plugin class `ArgumentNormalizerPlugin` implements `prompt_pre_fetch` and `tool_pre_invoke`
- Config supports Unicode, whitespace, casing, dates, numbers, and `field_overrides`
- Non-blocking behavior by default; returns `modified_payload` when changes occur
- Recursive normalization works for nested dict/list args
- Unit tests cover unicode/whitespace, numbers (comma/dot locales), dates (`day_first`), casing, and nested structures
- README under `plugins/argument_normalizer/` documents behavior, config, overrides, and examples
- Docs mention in `llms/plugins-llms.md` and ordering note
- Example config entry included in `plugins/config.yaml` (commented or sample)
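As a reference point for the date-related test cases, the `day_first` behavior might be sketched as below. This is one conservative interpretation, not the plugin's implementation: `to_iso_date` is a hypothetical helper, it handles four-digit years only, it leaves two-digit years (and therefore `year_first`) out of scope, and it falls back to the swapped day/month order only when the preferred order is not a valid date.

```python
import re
from datetime import date
from typing import Optional

_NUMERIC_DATE = re.compile(r"(\d{1,4})[/.\-](\d{1,2})[/.\-](\d{1,4})")


def _safe_iso(year: int, month: int, day: int) -> Optional[str]:
    # Returns None for impossible dates such as month 25 or February 31.
    try:
        return date(year, month, day).isoformat()
    except ValueError:
        return None


def to_iso_date(token: str, day_first: bool = False) -> Optional[str]:
    """Convert a numeric date token to YYYY-MM-DD, or None if unsure."""
    match = _NUMERIC_DATE.fullmatch(token.strip())
    if match is None:
        return None
    first, second, third = match.groups()
    if len(first) == 4:
        # Year-leading input such as 2024/3/5 is read as Y-M-D.
        return _safe_iso(int(first), int(second), int(third))
    if len(third) != 4:
        # Two-digit years are left alone in this sketch.
        return None
    a, b, year = int(first), int(second), int(third)
    day, month = (a, b) if day_first else (b, a)
    # Preferred order first; the swapped order is used only if the first fails.
    return _safe_iso(year, month, day) or _safe_iso(year, day, month)


print(to_iso_date("03/04/2024"))                  # prints "2024-03-04"
print(to_iso_date("03/04/2024", day_first=True))  # prints "2024-04-03"
print(to_iso_date("25/12/2024"))                  # prints "2024-12-25"
```

Returning `None` for anything uncertain lets a caller leave the original text untouched, which fits the non-blocking default.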
Tasks
- Implement plugin in `plugins/argument_normalizer/argument_normalizer.py`
- Add unit tests in `tests/unit/.../argument_normalizer/test_argument_normalizer.py`
- Write plugin README with examples and tuning tips
- Update docs: add to `llms/plugins-llms.md` (Built‑in Plugins) and note ordering
- Add example configuration to `plugins/config.yaml`
- Validate with `make doctest test` and run targeted pytest selection
Risks & Mitigations
- Over-normalization (e.g., changing intended casing): mitigate via `field_overrides` and disabled `enable_casing` by default
- Ambiguous dates: controlled by `day_first`/`year_first`; default conservative transformations
- Locale edge cases for numbers: `decimal_detection` with explicit `comma|dot` override when needed
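The number-locale risk is easiest to see with a small example. The sketch below shows one possible `auto` heuristic alongside the explicit `comma` and `dot` modes; the heuristic (a single comma followed by exactly three digits is treated as a thousands separator) and the helper name `normalize_number` are assumptions, not something the issue specifies.

```python
import re
from typing import Optional

_NUMBER = re.compile(r"[+-]?[0-9]+(?:[.,][0-9]+)*")


def _detect_decimal(s: str) -> Optional[str]:
    commas, dots = s.count(","), s.count(".")
    if commas and dots:
        # Both present: whichever separator appears last is the decimal mark.
        return "," if s.rfind(",") > s.rfind(".") else "."
    if commas == 1 and not re.search(r",[0-9]{3}$", s):
        return ","
    if dots == 1:
        return "."
    return None  # any remaining separators are treated as thousands groupings


def normalize_number(token: str, decimal_detection: str = "auto") -> Optional[str]:
    """Return the number with '.' as decimal mark, or None if not numeric."""
    if decimal_detection not in {"auto", "comma", "dot"}:
        raise ValueError(f"unknown decimal_detection: {decimal_detection!r}")
    s = token.strip()
    if not _NUMBER.fullmatch(s):
        return None
    if decimal_detection == "comma":
        decimal: Optional[str] = ","
    elif decimal_detection == "dot":
        decimal = "."
    else:
        decimal = _detect_decimal(s)
    if decimal is None:
        return s.replace(",", "").replace(".", "")
    thousands = "." if decimal == "," else ","
    s = s.replace(thousands, "")
    if s.count(decimal) > 1:
        return None  # more than one decimal mark: not a well-formed number
    return s.replace(decimal, ".")


print(normalize_number("1,234.56"))        # prints "1234.56"
print(normalize_number("1.234,56"))        # prints "1234.56"
print(normalize_number("1,234"))           # prints "1234" (auto: thousands)
print(normalize_number("1,234", "comma"))  # prints "1.234" (explicit override)
```

The last two calls show why an explicit override matters: `1,234` is genuinely ambiguous, and no heuristic can resolve it without knowing the locale.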