NVIDIA Safety Provider Calling Wrong Guardrails Endpoint #4189

@Hadar301

System Info

  • LlamaStack Version: 0.4.0.dev0
  • Distribution: nvidia
  • Provider: remote::nvidia safety provider
  • Guardrails Service: NeMo Guardrails 0.10.x

ISSUE-nvidia-safety-provider-bug.md

Information

  • The official example scripts
  • My own modified scripts

🐛 Describe the bug

The NVIDIA safety provider in LlamaStack calls the wrong endpoint when communicating with the NeMo Guardrails service, so safety/shield requests fail with a 500 Internal Server Error.

Steps to Reproduce

  1. Configure LlamaStack with the nvidia safety provider:
providers:
  safety:
    - provider_id: nvidia
      provider_type: remote::nvidia
      config:
        guardrails_service_url: http://nemoguardrails-sample:8000
        config_id: demo-self-check-input-output
        model: meta/llama-3.2-1b-instruct
  2. Register a shield:
curl -X POST http://localhost:8321/v1/shields \
  -H "Content-Type: application/json" \
  -d '{
    "shield_id": "demo-self-check-input-output",
    "provider_id": "nvidia",
    "provider_shield_id": "demo-self-check-input-output",
    "params": {"model": "meta/llama-3.2-1b-instruct"}
  }'
  3. Create a guardrails config in the NeMo Guardrails service:
curl -X POST http://guardrails-service:8000/v1/guardrail/configs \
  -H "Content-Type: application/json" \
  -d '{
    "name": "demo-self-check-input-output",
    "namespace": "default",
    "data": {
      "prompts": [...],
      "rails": {...}
    }
  }'
  4. Run the shield via the LlamaStack API:
curl -X POST http://localhost:8321/v1/safety/run-shield \
  -H "Content-Type: application/json" \
  -d '{
    "shield_id": "demo-self-check-input-output",
    "messages": [{"role": "user", "content": "You are stupid"}]
  }'
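
Step 4 can also be issued from Python, which makes it easier to inspect the status code and response body. The following is a minimal sketch using only the standard library. The helper name `build_run_shield_request` is hypothetical; the base URL, shield ID, and message are the values from the steps above.

```python
import json
import urllib.request


def build_run_shield_request(base_url: str, shield_id: str, text: str) -> urllib.request.Request:
    """Build the same POST request as the curl command in step 4 (hypothetical helper)."""
    payload = {
        "shield_id": shield_id,
        "messages": [{"role": "user", "content": text}],
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/safety/run-shield",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_run_shield_request(
    "http://localhost:8321", "demo-self-check-input-output", "You are stupid"
)
# Sending it requires a running LlamaStack server, for example:
#   with urllib.request.urlopen(request) as response:
#       print(response.status, response.read().decode("utf-8"))
```

Building the request separately from sending it lets the URL and JSON body be checked without a running server.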

Error logs

ERROR 2025-11-19 15:18:45,931 llama_stack.core.server.server:285 core::server: Error executing endpoint
route='/v1/safety/run-shield' method='post'
╭───────────────────────────────────── Traceback (most recent call last) ─────────────────────────────────────╮
│ /workspace/src/llama_stack/core/server/server.py:275 in route_handler │
│ │
│ 272 │ │ │ │ │ return StreamingResponse(gen, media_type="text/event-stream") │
│ 273 │ │ │ │ else: │
│ 274 │ │ │ │ │ value = func(**kwargs) │
│ ❱ 275 │ │ │ │ │ result = await maybe_await(value) │
│ 276 │ │ │ │ │ if isinstance(result, PaginatedResponse) and result.url is None: │
│ 277 │ │ │ │ │ │ result.url = route │
│ 278 │
│ │
│ /workspace/src/llama_stack/core/server/server.py:197 in maybe_await │
│ │
│ 194 │
│ 195 async def maybe_await(value): │
│ 196 │ if inspect.iscoroutine(value): │
│ ❱ 197 │ │ return await value │
│ 198 │ return value │
│ 199 │
│ 200 │
│ │
│ /workspace/src/llama_stack/core/telemetry/trace_protocol.py:103 in async_wrapper │
│ │
│ 100 │ │ │ │
│ 101 │ │ │ with tracing.span(f"{class_name}.{method_name}", span_attributes) as span: │
│ 102 │ │ │ │ try: │
│ ❱ 103 │ │ │ │ │ result = await method(self, *args, **kwargs) │
│ 104 │ │ │ │ │ span.set_attribute("output", serialize_value(result)) │
│ 105 │ │ │ │ │ return result │
│ 106 │ │ │ │ except Exception as e: │
│ │
│ /workspace/src/llama_stack/core/routers/safety.py:60 in run_shield │
│ │
│ 57 │ ) -> RunShieldResponse: │
│ 58 │ │ logger.debug(f"SafetyRouter.run_shield: {shield_id}") │
│ 59 │ │ provider = await self.routing_table.get_provider_impl(shield_id) │
│ ❱ 60 │ │ return await provider.run_shield( │
│ 61 │ │ │ shield_id=shield_id, │
│ 62 │ │ │ messages=messages, │
│ 63 │ │ │ params=params, │
│ │
│ /workspace/src/llama_stack/core/telemetry/trace_protocol.py:103 in async_wrapper │
│ │
│ 100 │ │ │ │
│ 101 │ │ │ with tracing.span(f"{class_name}.{method_name}", span_attributes) as span: │
│ 102 │ │ │ │ try: │
│ ❱ 103 │ │ │ │ │ result = await method(self, *args, **kwargs) │
│ 104 │ │ │ │ │ span.set_attribute("output", serialize_value(result)) │
│ 105 │ │ │ │ │ return result │
│ 106 │ │ │ │ except Exception as e: │
│ │
│ /workspace/src/llama_stack/providers/remote/safety/nvidia/nvidia.py:67 in run_shield │
│ │
│ 64 │ │ │ raise ValueError(f"Shield {shield_id} not found") │
│ 65 │ │ │
│ 66 │ │ self.shield = NeMoGuardrails(self.config, shield.shield_id) │
│ ❱ 67 │ │ return await self.shield.run(messages) │
│ 68 │ │
│ 69 │ async def run_moderation(self, input: str | list[str], model: str | None = None) -> │
│ ModerationObject: │
│ 70 │ │ raise NotImplementedError("NVIDIA safety provider currently does not implement │
│ run_moderation") │
│ │
│ /workspace/src/llama_stack/providers/remote/safety/nvidia/nvidia.py:147 in run │
│ │
│ 144 │ │ │ │ "config_id": self.config_id, │
│ 145 │ │ │ }, │
│ 146 │ │ } │
│ ❱ 147 │ │ response = await self._guardrails_post(path="/v1/guardrail/checks", │
│ data=request_data) │
│ 148 │ │ │
│ 149 │ │ if response["status"] == "blocked": │
│ 150 │ │ │ user_message = "Sorry I cannot do this." │
│ │
│ /workspace/src/llama_stack/providers/remote/safety/nvidia/nvidia.py:117 in _guardrails_post │
│ │
│ 114 │ │ │ "Accept": "application/json", │
│ 115 │ │ } │
│ 116 │ │ response = requests.post(url=f"{self.guardrails_service_url}{path}", │
│ headers=headers, json=data) │
│ ❱ 117 │ │ response.raise_for_status() │
│ 118 │ │ return response.json() │
│ 119 │ │
│ 120 │ async def run(self, messages: list[OpenAIMessageParam]) -> RunShieldResponse: │
│ │
│ /usr/local/lib/python3.12/site-packages/requests/models.py:1026 in raise_for_status │
│ │
│ 1023 │ │ │ ) │
│ 1024 │ │ │
│ 1025 │ │ if http_error_msg: │
│ ❱ 1026 │ │ │ raise HTTPError(http_error_msg, response=self) │
│ 1027 │ │
│ 1028 │ def close(self): │
│ 1029 │ │ """Releases the connection back to the pool. Once this method has been │
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
HTTPError: 500 Server Error: Internal Server Error for url:
http://nemoguardrails-sample.hacohen-nemo.svc.cluster.local:8000/v1/guardrail/checks
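
The traceback shows `_guardrails_post` calling `response.raise_for_status()`, so the log contains only the status line and not the body the Guardrails service returned with the 500. A minimal debugging sketch, assuming direct network access to the Guardrails service: post to the same URL the provider builds (`guardrails_service_url` followed by `path`) and keep the body on failure. The helper names `guardrails_url` and `post_json` are hypothetical, and the request payload is left to the caller because the traceback shows only part of it.

```python
import json
import urllib.error
import urllib.request


def guardrails_url(base_url: str, path: str) -> str:
    """Join base URL and path the way the traceback shows: f"{base}{path}"."""
    return f"{base_url}{path}"


def post_json(url: str, payload: dict, opener=urllib.request.urlopen) -> tuple[int, str]:
    """POST JSON and return (status, body) for both success and HTTP error responses."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )
    try:
        with opener(request) as response:
            return response.status, response.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as error:
        # HTTPError carries the error response; read it instead of discarding it.
        return error.code, error.read().decode("utf-8", errors="replace")
```

For example, `post_json(guardrails_url("http://nemoguardrails-sample:8000", "/v1/guardrail/checks"), payload)` returns the status code together with the service's error body for a given `payload`, which helps distinguish a missing route from a rejected request.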

Expected behavior

The safety provider should successfully communicate with the NeMo Guardrails service and return a safety response indicating whether the content should be blocked.
