Adds docker model runner #473
Conversation
          
Walkthrough

Adds Docker Model Runner as a first-class local provider: backend connection/discovery, OpenAPI endpoint, adapter support, config and enum entries, frontend connect UI and types, expanded model list entries, and extensive unit tests for connection and discovery.
Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Frontend
    participant Backend
    participant DMR as DockerModelRunner
    User->>Frontend: Open connect UI / choose Docker Model Runner
    Frontend->>Backend: GET /api/provider/docker_model_runner/connect?docker_model_runner_custom_url=...
    Backend->>DMR: GET /v1/models
    DMR-->>Backend: return models list or error
    Backend-->>Frontend: DockerModelRunnerConnection (supported/untested) or HTTP error
    Frontend-->>User: Display connection status and models
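For orientation, here is a minimal sketch of the discovery step in that flow. The names `DockerModelRunnerConnection` and `get_docker_model_runner_connection` come from this PR, but the body below is an illustrative assumption (the real implementation reads the base URL from Kiln's config and compares against the built-in model list), shown here as an httpx call to the OpenAI-compatible `/v1/models` endpoint:

```python
# Minimal, illustrative sketch only; names come from this PR, the body is an assumption.
import httpx
from pydantic import BaseModel


class DockerModelRunnerConnection(BaseModel):
    message: str
    supported_models: list[str]
    untested_models: list[str]


async def get_docker_model_runner_connection(
    base_url: str = "http://localhost:12434/engines/llama.cpp",
    known_model_ids: frozenset[str] = frozenset(),
) -> DockerModelRunnerConnection | None:
    # DMR exposes an OpenAI-compatible API; list the models pulled locally.
    try:
        async with httpx.AsyncClient() as client:
            resp = await client.get(f"{base_url}/v1/models")
            resp.raise_for_status()
    except httpx.HTTPError:
        return None  # service unreachable or returned an error
    model_ids = [m["id"] for m in resp.json().get("data", [])]
    # Models Kiln already knows about are "supported"; everything else is "untested".
    return DockerModelRunnerConnection(
        message="Docker Model Runner connected",
        supported_models=[m for m in model_ids if m in known_model_ids],
        untested_models=[m for m in model_ids if m not in known_model_ids],
    )
```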
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~35 minutes
    
Actionable comments posted: 5
🔭 Outside diff range comments (1)
libs/core/kiln_ai/adapters/test_docker_model_runner_tools.py (1)
111-125: Fix incorrect exception mocking in failure test

The test is mocking `httpx.RequestError` as a generic `Exception`, but the implementation specifically catches `httpx.RequestError`. This doesn't properly test the exception handling.

```diff
 @pytest.mark.asyncio
 async def test_docker_model_runner_online_failure():
     """Test that docker_model_runner_online returns False when service is unavailable."""
     with patch("kiln_ai.adapters.docker_model_runner_tools.httpx") as mock_httpx:
-        mock_httpx.get.side_effect = Exception("Connection error")
-        mock_httpx.RequestError = Exception
+        import httpx
+        mock_httpx.RequestError = httpx.RequestError
+        mock_httpx.get.side_effect = httpx.RequestError("Connection error")

         from kiln_ai.adapters.docker_model_runner_tools import (
             docker_model_runner_online,
         )

         result = await docker_model_runner_online()
         assert result is False
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
- `app/web_ui/static/images/docker.svg` is excluded by `!**/*.svg`
📒 Files selected for processing (12)
- `app/desktop/studio_server/provider_api.py` (7 hunks)
- `app/web_ui/src/lib/api_schema.d.ts` (4 hunks)
- `app/web_ui/src/lib/ui/provider_image.ts` (1 hunks)
- `app/web_ui/src/routes/(fullscreen)/setup/(setup)/connect_providers/connect_providers.svelte` (5 hunks)
- `libs/core/kiln_ai/adapters/adapter_registry.py` (1 hunks)
- `libs/core/kiln_ai/adapters/docker_model_runner_tools.py` (1 hunks)
- `libs/core/kiln_ai/adapters/ml_model_list.py` (1 hunks)
- `libs/core/kiln_ai/adapters/model_adapters/litellm_adapter.py` (2 hunks)
- `libs/core/kiln_ai/adapters/provider_tools.py` (3 hunks)
- `libs/core/kiln_ai/adapters/test_docker_model_runner_tools.py` (1 hunks)
- `libs/core/kiln_ai/datamodel/datamodel_enums.py` (1 hunks)
- `libs/core/kiln_ai/utils/config.py` (1 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
app/web_ui/src/**/*.{js,jsx,ts,tsx,svelte}
📄 CodeRabbit Inference Engine (.cursor/rules/project.mdc)
Our web UI component framework is DaisyUI v4 (based on Tailwind). Be sure to use v4 docs and features, not v5
Files:
- `app/web_ui/src/lib/ui/provider_image.ts`
- `app/web_ui/src/routes/(fullscreen)/setup/(setup)/connect_providers/connect_providers.svelte`
- `app/web_ui/src/lib/api_schema.d.ts`
**/*.py
📄 CodeRabbit Inference Engine (.cursor/rules/project.mdc)
**/*.py: Always assume pydantic 2 (not pydantic 1)
The project supports Python 3.10 and above
Files:
- `libs/core/kiln_ai/datamodel/datamodel_enums.py`
- `libs/core/kiln_ai/utils/config.py`
- `libs/core/kiln_ai/adapters/adapter_registry.py`
- `libs/core/kiln_ai/adapters/ml_model_list.py`
- `libs/core/kiln_ai/adapters/model_adapters/litellm_adapter.py`
- `libs/core/kiln_ai/adapters/test_docker_model_runner_tools.py`
- `libs/core/kiln_ai/adapters/provider_tools.py`
- `app/desktop/studio_server/provider_api.py`
- `libs/core/kiln_ai/adapters/docker_model_runner_tools.py`
**/test_*.py
📄 CodeRabbit Inference Engine (.cursor/rules/project.mdc)
**/test_*.py: Always use pytest for tests in Python code
Test brevity is important. Use approaches for re-use and brevity including using fixtures for repeated code, and using pytest parameterize for similar tests
Files:
libs/core/kiln_ai/adapters/test_docker_model_runner_tools.py
🧠 Learnings (2)
📚 Learning: the `glm_z1_rumination_32b_0414` model was intentionally removed from the built_in_models list due t...
Learnt from: leonardmq
PR: Kiln-AI/Kiln#418
File: libs/core/kiln_ai/adapters/ml_model_list.py:0-0
Timestamp: 2025-07-16T09:37:39.816Z
Learning: The `glm_z1_rumination_32b_0414` model was intentionally removed from the built_in_models list due to output formatting issues: output was duplicated in both `output` and `reasoning` fields, and contained random internal JSON in the output. This model should not be re-added without addressing these formatting problems.
Applied to files:
libs/core/kiln_ai/adapters/ml_model_list.py
📚 Learning: in the kiln web ui's project import functionality (edit_project.svelte), when the file selector api ...
Learnt from: scosman
PR: Kiln-AI/Kiln#464
File: app/web_ui/src/routes/(fullscreen)/setup/(setup)/create_project/edit_project.svelte:116-138
Timestamp: 2025-07-30T02:50:29.383Z
Learning: In the Kiln web UI's project import functionality (edit_project.svelte), when the file selector API fails, the preferred approach is to use alert() as a brief notification and set select_file_unavailable = true to gracefully fall back to manual path entry, rather than using the form's error handling system. This maintains a smooth user experience with automatic fallback rather than showing an error state.
Applied to files:
app/web_ui/src/routes/(fullscreen)/setup/(setup)/connect_providers/connect_providers.svelte
🧬 Code Graph Analysis (4)
libs/core/kiln_ai/adapters/ml_model_list.py (2)
libs/core/kiln_ai/datamodel/datamodel_enums.py (2)
- `ModelProviderName` (80-100)
- `StructuredOutputMode` (23-45)

app/web_ui/src/lib/types.ts (2)
- `ModelProviderName` (41-41)
- `StructuredOutputMode` (40-40)
libs/core/kiln_ai/adapters/model_adapters/litellm_adapter.py (1)
libs/core/kiln_ai/datamodel/datamodel_enums.py (1)
- `ModelProviderName` (80-100)
libs/core/kiln_ai/adapters/test_docker_model_runner_tools.py (1)
libs/core/kiln_ai/adapters/docker_model_runner_tools.py (5)
- `DockerModelRunnerConnection` (41-48)
- `docker_model_runner_base_url` (11-22)
- `parse_docker_model_runner_models` (52-88)
- `all_models` (47-48)
- `docker_model_runner_online` (25-38)
libs/core/kiln_ai/adapters/provider_tools.py (2)
libs/core/kiln_ai/adapters/docker_model_runner_tools.py (1)
- `get_docker_model_runner_connection` (91-110)

libs/core/kiln_ai/datamodel/datamodel_enums.py (1)
- `ModelProviderName` (80-100)
🔇 Additional comments (36)
app/web_ui/src/lib/ui/provider_image.ts (1)
9-9: LGTM! The addition of the `docker_model_runner` provider image mapping follows the established pattern and uses a consistent file path structure.

libs/core/kiln_ai/datamodel/datamodel_enums.py (1)
100-100: LGTM! The enum addition follows the established pattern with the string value matching the enum name, maintaining consistency across the codebase.
libs/core/kiln_ai/utils/config.py (1)
57-60: LGTM! The configuration property addition follows the established pattern for base URL configurations and uses appropriate environment variable naming conventions.
libs/core/kiln_ai/adapters/model_adapters/litellm_adapter.py (2)
184-186: LGTM! The Docker Model Runner integration correctly follows the established pattern used for Ollama, treating it as an OpenAI-compatible provider with JSON schema support. The implementation and comment are both accurate.

322-325: LGTM! The implementation correctly treats Docker Model Runner as a custom provider (similar to Ollama) to enable direct control over API requests. This approach is appropriate for supporting advanced features like `response_format=json_schema`.

libs/core/kiln_ai/adapters/ml_model_list.py (1)

1228-1233: LGTM! Docker Model Runner provider properly configured.

The new provider entry follows established patterns and is well-configured:
- Uses appropriate structured output mode (`json_schema`) for OpenAI-compatible endpoints
- Correctly disables data generation for a 3B model
- Model ID follows Docker naming conventions
- Aligns with the PR objective to add Docker Model Runner support
 libs/core/kiln_ai/adapters/adapter_registry.py (1)
109-126: LGTM! Docker Model Runner adapter follows established patterns.

The implementation correctly mirrors the Ollama pattern:
- Uses `LiteLlmAdapter` with OpenAI-compatible configuration
- Constructs base URL from config with sensible default fallback
- Adds "/v1" suffix for OpenAI-compatible endpoint
- Uses placeholder API key "DMR" to satisfy LiteLLM requirements (similar to Ollama's "NA")
- Default URL points to appropriate llamacpp endpoint on distinct port (12434)
 This integration properly extends the adapter registry to support the new Docker Model Runner provider.
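To make the base-URL handling concrete, here is a small, hedged sketch. `dmr_openai_compatible_base_url` is a hypothetical helper name; only the default URL, the fallback behavior, and the "/v1" suffix are taken from the review above:

```python
# Hypothetical helper name; only the default URL, fallback, and "/v1" suffix are from the review.
DEFAULT_DMR_URL = "http://localhost:12434/engines/llama.cpp"


def dmr_openai_compatible_base_url(configured_url: str | None) -> str:
    """Build the OpenAI-compatible base URL for Docker Model Runner.

    Falls back to the default llama.cpp engine endpoint when no custom URL is configured,
    then appends the "/v1" suffix expected by OpenAI-compatible clients.
    """
    base = (configured_url or DEFAULT_DMR_URL).rstrip("/")
    return f"{base}/v1"


# A configured URL wins; None falls back to the default. Trailing slashes are normalized.
assert dmr_openai_compatible_base_url(None) == "http://localhost:12434/engines/llama.cpp/v1"
assert (
    dmr_openai_compatible_base_url("http://custom:8080/engines/llama.cpp/")
    == "http://custom:8080/engines/llama.cpp/v1"
)
```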
libs/core/kiln_ai/adapters/provider_tools.py (3)
5-7: LGTM: Import statement follows established pattern

The import correctly brings in the Docker Model Runner connection function, following the same pattern as the Ollama integration above it.

43-50: LGTM: Provider enablement logic is consistent

The Docker Model Runner enablement check follows the exact same pattern as the Ollama provider, checking for connection availability and the presence of supported or untested models. The implementation is correct and consistent.

399-400: LGTM: Human-readable name mapping added correctly

The provider name mapping correctly returns "Docker Model Runner" for the docker_model_runner provider ID, maintaining consistency with other provider mappings in the match statement.
app/web_ui/src/lib/api_schema.d.ts (4)
423-439: LGTM! Endpoint definition follows established patterns. The new Docker Model Runner connection endpoint follows the exact same structure as the existing Ollama provider endpoint, ensuring consistency in the API schema.

1541-1551: LGTM! Schema definition maintains consistency. The `DockerModelRunnerConnection` schema correctly mirrors the structure of the existing `OllamaConnection` schema, ensuring consistent typing across different provider connection responses.

2180-2180: LGTM! Enum extension follows naming conventions. The addition of `"docker_model_runner"` to the `ModelProviderName` enum follows the established snake_case naming convention and is consistent with the provider identifier used throughout the API.

3913-3943: LGTM! Operation definition maintains API consistency. The operation definition correctly follows the established pattern from the Ollama provider, including the optional custom URL parameter and appropriate response types. The use of the new `DockerModelRunnerConnection` schema ensures type safety.

libs/core/kiln_ai/adapters/test_docker_model_runner_tools.py (6)

1-10: LGTM! Import statements are appropriate for the test module.

12-18: LGTM! Test correctly verifies the default base URL behavior.

20-28: LGTM! Test correctly verifies custom base URL configuration.

60-70: LGTM! Test correctly handles the case of no available models.

72-81: LGTM! Test correctly handles invalid response format.

83-93: LGTM! Test correctly verifies the `all_models()` method combines both model lists.
libs/core/kiln_ai/adapters/docker_model_runner_tools.py (5)
1-9: LGTM! Import statements are appropriate for the module's functionality.

11-23: LGTM! Function correctly handles base URL configuration with appropriate fallback.

41-49: LGTM! Well-structured Pydantic model with appropriate fields and helper method.

51-89: LGTM! Function correctly parses model responses and categorizes them appropriately.

113-118: LGTM! Simple utility function correctly checks model availability.
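A rough sketch of the categorization these functions perform, using hypothetical stand-in names (`ParsedModels`, `categorize_models`); the supported/untested split mirrors what the review describes for `parse_docker_model_runner_models`:

```python
# Stand-in names (ParsedModels, categorize_models) for illustration; the real function is
# parse_docker_model_runner_models in docker_model_runner_tools.py.
from dataclasses import dataclass, field


@dataclass
class ParsedModels:
    supported_models: list[str] = field(default_factory=list)
    untested_models: list[str] = field(default_factory=list)

    def all_models(self) -> list[str]:
        return self.supported_models + self.untested_models


def categorize_models(dmr_model_ids: list[str], known_dmr_ids: set[str]) -> ParsedModels:
    # A model is "supported" when its id is already registered for the docker_model_runner
    # provider in Kiln's built-in model list; anything else is surfaced as "untested".
    parsed = ParsedModels()
    for model_id in dmr_model_ids:
        if model_id in known_dmr_ids:
            parsed.supported_models.append(model_id)
        else:
            parsed.untested_models.append(model_id)
    return parsed


result = categorize_models(["ai/llama3.2:3B-Q4_K_M", "ai/some-new-model"], {"ai/llama3.2:3B-Q4_K_M"})
assert result.supported_models == ["ai/llama3.2:3B-Q4_K_M"]
assert result.untested_models == ["ai/some-new-model"]
```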
app/web_ui/src/routes/(fullscreen)/setup/(setup)/connect_providers/connect_providers.svelte (4)
61-66: LGTM! Docker Model Runner provider definition is consistent with other providers.

225-230: LGTM! Status initialization for Docker Model Runner is consistent with other providers.

373-375: LGTM! Provider connection routing is properly integrated.

680-684: LGTM! Docker Model Runner initialization on mount follows the same pattern as Ollama.
app/desktop/studio_server/provider_api.py (7)
11-14: LGTM! Import statements for Docker Model Runner tools are appropriate.

84-146: LGTM! The `connect_docker_model_runner` function is well-implemented with proper validation and error handling.

268-273: LGTM! API endpoint follows the established pattern for provider connections.

241-245: LGTM! Docker Model Runner integration into available models list is properly positioned.

384-385: LGTM! Docker Model Runner correctly added to providers that don't support API key connections.

440-441: LGTM! Docker Model Runner correctly added to providers that don't support API key disconnections.

1038-1058: LGTM! The `models_from_docker_model_runner_id` function correctly maps Docker model IDs to Kiln models.
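For readers new to the codebase, a hedged sketch of what that ID-to-model mapping amounts to. The stand-in dataclasses below are illustrative; the real types are `KilnModel` and `KilnModelProvider` from `ml_model_list.py`:

```python
# Illustrative stand-ins; the real types are KilnModel / KilnModelProvider from ml_model_list.py.
from dataclasses import dataclass


@dataclass
class ProviderEntry:
    provider_name: str  # e.g. "docker_model_runner"
    model_id: str       # e.g. "ai/llama3.2:3B-Q4_K_M"


@dataclass
class ModelEntry:
    name: str
    providers: list[ProviderEntry]


def model_from_docker_model_runner_id(
    dmr_model_id: str, built_in_models: list[ModelEntry]
) -> tuple[ModelEntry, ProviderEntry] | None:
    # Scan the built-in model list for a docker_model_runner provider entry whose
    # model_id matches the id reported by Docker Model Runner.
    for model in built_in_models:
        for provider in model.providers:
            if provider.provider_name == "docker_model_runner" and provider.model_id == dmr_model_id:
                return model, provider
    return None  # unknown id; treated as an untested model by the caller
```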
```ts
const connect_docker_model_runner = async (user_initiated: boolean = true) => {
  status.docker_model_runner.connected = false
  status.docker_model_runner.connecting = user_initiated
  let data: any | null = null
  try {
    const { data: req_data, error: req_error } = await client.GET(
      "/api/provider/docker_model_runner/connect",
      {
        params: {
          query: {
            custom_docker_url: custom_docker_url || undefined,
          },
        },
      },
    )
    if (req_error) {
      throw req_error
    }
    data = req_data
  } catch (e) {
    if (
      e &&
      typeof e === "object" &&
      "message" in e &&
      typeof e.message === "string"
    ) {
      status.docker_model_runner.error = e.message
    } else {
      status.docker_model_runner.error = "Failed to connect. Ensure Docker Model Runner is running."
    }
    status.docker_model_runner.connected = false
    return
  } finally {
    status.docker_model_runner.connecting = false
  }
  if (
    data.supported_models.length === 0 &&
    (!data.untested_models || data.untested_models.length === 0)
  ) {
    status.docker_model_runner.error =
      "Docker Model Runner running, but no models available. Pull some models first: https://hub.docker.com/u/ai"
    return
  }
  status.docker_model_runner.error = null
  status.docker_model_runner.connected = true
  const supported_models_str =
    data.supported_models.length > 0
      ? "The following supported models are available: " +
        data.supported_models.join(", ") +
        ". "
      : "No supported models are loaded. "
  const untested_models_str =
    data.untested_models && data.untested_models.length > 0
      ? "The following untested models are loaded: " +
        data.untested_models.join(", ") +
        ". "
      : ""
  const custom_url_str =
    custom_docker_url && custom_docker_url !== "http://localhost:12434/engines/llama.cpp"
      ? "Custom Docker Model Runner URL: " + custom_docker_url
      : ""
  status.docker_model_runner.custom_description =
    "Docker Model Runner connected. " +
    supported_models_str +
    untested_models_str +
    custom_url_str
}
```
Fix parameter name typo and use constant for default URL
- Parameter `user_initiated` has a typo when used as `user_initated`
- Default URL is hardcoded in the comparison
 
```diff
-const connect_docker_model_runner = async (user_initiated: boolean = true) => {
+const connect_docker_model_runner = async (user_initiated: boolean = true) => {
   status.docker_model_runner.connected = false
-  status.docker_model_runner.connecting = user_initiated
+  status.docker_model_runner.connecting = user_initiated
   // ... rest of the function ...
   const custom_url_str =
-    custom_docker_url && custom_docker_url !== "http://localhost:12434/engines/llama.cpp"
+    custom_docker_url && custom_docker_url !== DEFAULT_DOCKER_URL
       ? "Custom Docker Model Runner URL: " + custom_docker_url
       : ""
```

Add this constant at the top of the script section:

```js
const DEFAULT_DOCKER_URL = "http://localhost:12434/engines/llama.cpp"
```

🤖 Prompt for AI Agents
In app/web_ui/src/routes/(fullscreen)/setup/(setup)/connect_providers/connect_providers.svelte between lines 472 and 541, fix the typo in the parameter name by replacing all instances of "user_initated" with the correct "user_initiated". Also, define a constant DEFAULT_DOCKER_URL at the top of the script section with the value "http://localhost:12434/engines/llama.cpp" and replace the hardcoded default URL string in the comparison with this constant to improve maintainability.
@ilopezluna very cool! Our CR bot code rabbit found some small issues (typos, etc). Want to tackle those (or just mark as resolved if it's wrong about something) and let me know when it's ready?
    
…xception handling instead of generic catch-all
Actionable comments posted: 0
♻️ Duplicate comments (1)
app/desktop/studio_server/provider_api.py (1)
133-137: Use specific exception handling instead of generic Exception.

This is the same issue identified in the past review. The function should catch specific exceptions rather than the generic `Exception` to avoid hiding unexpected errors.

```diff
- except Exception as e:
+ except (openai.APIError, httpx.RequestError) as e:
      raise HTTPException(
          status_code=500,
          detail=f"Failed to connect to Docker Model Runner: {e}",
      )
```
🧹 Nitpick comments (1)
app/desktop/studio_server/provider_api.py (1)
100-102: Move imports to the top of the file.

The imports inside the function should be moved to the top of the file for better organization and consistency with Python conventions.

```diff
+from kiln_ai.adapters.docker_model_runner_tools import (
+    docker_model_runner_base_url,
+    parse_docker_model_runner_models,
+)
```

Then remove the local imports from within the function.
Also applies to: 116-118
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
- `app/desktop/studio_server/provider_api.py` (7 hunks)
- `app/web_ui/src/lib/types.ts` (1 hunks)
- `app/web_ui/src/routes/(fullscreen)/setup/(setup)/connect_providers/connect_providers.svelte` (7 hunks)
- `libs/core/kiln_ai/adapters/docker_model_runner_tools.py` (1 hunks)
- `libs/core/kiln_ai/adapters/test_docker_model_runner_tools.py` (1 hunks)
✅ Files skipped from review due to trivial changes (1)
- app/web_ui/src/lib/types.ts
 
🚧 Files skipped from review as they are similar to previous changes (3)
- libs/core/kiln_ai/adapters/test_docker_model_runner_tools.py
 - libs/core/kiln_ai/adapters/docker_model_runner_tools.py
 - app/web_ui/src/routes/(fullscreen)/setup/(setup)/connect_providers/connect_providers.svelte
 
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit Inference Engine (.cursor/rules/project.mdc)
**/*.py: Always assume pydantic 2 (not pydantic 1)
The project supports Python 3.10 and above
Files:
app/desktop/studio_server/provider_api.py
🧬 Code Graph Analysis (1)
app/desktop/studio_server/provider_api.py (5)
libs/core/kiln_ai/adapters/docker_model_runner_tools.py (4)
- `DockerModelRunnerConnection` (43-50)
- `get_docker_model_runner_connection` (93-112)
- `docker_model_runner_base_url` (11-22)
- `parse_docker_model_runner_models` (54-90)

libs/core/kiln_ai/utils/config.py (3)
- `Config` (28-252)
- `shared` (149-152)
- `save_setting` (237-238)

libs/core/kiln_ai/adapters/provider_tools.py (1)
- `provider_name_from_id` (353-405)

app/web_ui/src/lib/stores.ts (1)
- `provider_name_from_id` (316-324)

libs/core/kiln_ai/adapters/ml_model_list.py (2)
- `KilnModel` (207-222)
- `KilnModelProvider` (161-204)
🔇 Additional comments (6)
app/desktop/studio_server/provider_api.py (6)
6-6: LGTM - Imports are properly used. The new imports are appropriately added to support Docker Model Runner functionality and are actually used in the code.
Also applies to: 12-15
269-273: LGTM - API endpoint follows established patterns. The new endpoint is consistent with the existing Ollama connection endpoint and properly structured.

242-245: LGTM - Integration follows established patterns. The Docker Model Runner integration in the available models list is consistent with the Ollama pattern and includes proper positioning logic.

385-385: LGTM - Correctly excludes Docker Model Runner from API key operations. Docker Model Runner, like Ollama, doesn't require API keys, so correctly excluding it from these operations follows the established pattern.
Also applies to: 441-441
979-1037: LGTM - Function follows established patterns with proper exception handling. This function correctly mirrors the Ollama pattern and properly handles specific exceptions as requested in previous feedback. The implementation is consistent and well-structured.

1039-1059: LGTM - Helper function follows established patterns. The function correctly mirrors the `models_from_ollama_tag` pattern and provides consistent model mapping functionality for Docker Model Runner.
          
Thank you @scosman! CodeRabbit looks really cool, I hadn't heard of it before 👏
    
This is great! Easy to use like Ollama. Nice work!
Code generally looks great and matches the project style. I left a few comments while looking around, nothing major. I can take a deeper look next week.
Random feedback as new DMR user
Great UX. Opened desktop, clicked run, worked.
One thing I would change: the 'latest' tag has been nothing but headaches for me with Ollama. Usability wise, it's a bit hard to figure out which model size it is. Over time they can release new models in same family, and then 'latest' is either not actually latest, or a different size than it used to be. When we store that a response was generated by a specific model 'latest' can burn us; we want the model name to be explicit and permanent. I'd suggest keeping model tags tied to the size.
        
          
        
```py
            ),
            KilnModelProvider(
                name=ModelProviderName.docker_model_runner,
                model_id="ai/llama3.2",
```
We should set the size explicitly, so if the default ever changes, our data is still correct. `llama3.2:3B-Q4_K_M`?
We should also add the rest of the models. I know the list is big and ugly... but it's actually really good for usability. Details: https://getkiln.ai/blog/i_wrote_2000_llm_test_cases_so_you_dont_have_to
Adding the models Docker supports in there will do a few nice things (a sketch of one explicit entry follows the list below):
- Nice names in the UI
 - Add them to the test suite
- Once tested, they will appear higher in the dropdown, as we'll trust them to perform certain tasks better (JSON mode, complex requests like synthetic data, etc.).
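A hedged sketch of what one explicit-tag entry could look like, using the import paths and `KilnModelProvider` fields referenced elsewhere in this thread (the variable name is illustrative):

```python
# Sketch only; import paths follow the module layout referenced in this review.
from kiln_ai.adapters.ml_model_list import KilnModelProvider
from kiln_ai.datamodel.datamodel_enums import ModelProviderName, StructuredOutputMode

llama_3_2_3b_on_dmr = KilnModelProvider(
    name=ModelProviderName.docker_model_runner,
    # Pin the weight count and quantization so a stored "model X on DMR" reference stays
    # unambiguous even if Docker's default/"latest" tag later points at a different build.
    model_id="ai/llama3.2:3B-Q4_K_M",
    structured_output_mode=StructuredOutputMode.json_schema,
    supports_data_gen=False,
)
```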
 
I have updated the model references to include a more explicit tag: `<weight>-<quantization>`.
I considered adding all supported models as well, but I thought it might be better to include them in a follow-up PR that focuses solely on listing the supported models.
That said, if you’d prefer to include them now, I’m happy to do so.
Oh, small nit: the Finetune badge doesn't need a star. Stars are to differentiate "suggested" but we don't have that concept for fine tunes.
    
          
Yeah, I remember we had this same discussion in the DMR team :)
    
          
Sorry @scosman, I didn't quite get that. Could you please clarify what you mean?
    
          
That was a comment on another PR. No idea how I managed to drop it here. Please disregard!
    
some small changes/cleanup, but nothing major.
Great work navigating the codebase. Providers touch a lot of areas and were some of the earliest code!
        
          
        
```py
                model_id="ai/llama3.2:3B-Q4_K_M",
                structured_output_mode=StructuredOutputMode.json_schema,
                supports_data_gen=False,
            ),
```
It would be good to add the others in this PR. The usability of DMR goes way up with them.
Added in 6fc2711
BTW, I've also added the models with the `latest` and default tags. Most DMR users will use those, although, as we discussed, it can still be confusing.
Because these tags point to the same model, I was wondering whether it would be nice to have a way to allow multiple IDs or aliases per model.
Anyway, let me know if you prefer to remove these model references from this PR.
Ah, so we can only have 1 entry per provider in there. I'll add a test to check that. If we add multiple, "Llama 8B on DMR" becomes ambiguous, which isn't good for reproducibility.
Let's just use the Q4_K_M versions (or similar)? The user can still use the others; they will automatically appear in the dropdown, but with their own more specific IDs.
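A minimal sketch of such a uniqueness test, assuming each entry in `built_in_models` exposes `name` and `providers` (the test name and structure are illustrative, not the final implementation):

```python
# Hypothetical test sketch: at most one docker_model_runner entry per built-in model.
# Assumes each KilnModel in built_in_models exposes `name` and `providers`.
from collections import Counter

from kiln_ai.adapters.ml_model_list import built_in_models
from kiln_ai.datamodel.datamodel_enums import ModelProviderName


def test_at_most_one_docker_model_runner_provider_per_model():
    for model in built_in_models:
        provider_counts = Counter(provider.name for provider in model.providers)
        # More than one DMR entry would make "model X on DMR" ambiguous for reproducibility.
        assert provider_counts[ModelProviderName.docker_model_runner] <= 1, model.name
```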
Nice! I see DMR also has some (or one?) embedding model that we might be able to add support for, for our upcoming RAG tooling, after this is merged. This is the first external PR since we added a test coverage CI. I tried to run it (results are here), but it looks like it is not posting the nicely formatted report back as a comment into the PR if it is from a fork 😄
    
Replace Any parameter with List[openai.types.Model] and eliminate brittle dict parsing logic
# Conflicts:
#   app/web_ui/src/lib/api_schema.d.ts
#   libs/core/kiln_ai/adapters/provider_tools.py
#   libs/core/kiln_ai/datamodel/datamodel_enums.py
Actionable comments posted: 2
♻️ Duplicate comments (2)
app/desktop/studio_server/provider_api.py (2)
1040-1098: Populate model flags from provider metadata and reflect DMR’s JSON-schema support.
- Use provider's `uncensored` and `suggested_for_uncensored_data_gen` instead of hardcoding.
- Set `supports_structured_output=True` for Docker MR (DMR supports json_schema-constrained decoding natively). Also set it true for untested models to avoid misleading UI warnings.

```diff
- supports_structured_output=docker_provider.supports_structured_output,
+ supports_structured_output=True,
@@
- uncensored=False,
- suggested_for_uncensored_data_gen=False,
+ uncensored=docker_provider.uncensored,
+ suggested_for_uncensored_data_gen=docker_provider.suggested_for_uncensored_data_gen,
@@
- supports_structured_output=False,
+ supports_structured_output=True,
```

To double-check that the provider metadata includes `uncensored` and `suggested_for_uncensored_data_gen` for Docker MR, run:

```bash
#!/bin/bash
# Inspect provider definitions for docker_model_runner
fd -t f ml_model_list.py | xargs rg -n "docker_model_runner" -A 8 -B 8
```
221-225: Fix insertion index to keep ordering stable (Ollama first, then Docker if both present).

Using `insert(-1, ...)` is surprising; it inserts before the last item and may shuffle relative position against key-based providers. Deterministically place Docker right after Ollama, or at the start if Ollama is absent.

```diff
- if docker_models:
-     models.insert(-1 if ollama_models else 0, docker_models)
+ if docker_models:
+     insert_idx = 1 if ollama_models else 0
+     models.insert(insert_idx, docker_models)
```
🧹 Nitpick comments (1)
app/desktop/studio_server/provider_api.py (1)
248-254: Minor: simplify by removing the temporary variable. Keeps parity with the Ollama connect endpoint and reduces noise.

```diff
- async def connect_docker_model_runner_api(
-     docker_model_runner_custom_url: str | None = None,
- ) -> DockerModelRunnerConnection:
-     chosen_url = docker_model_runner_custom_url
-     return await connect_docker_model_runner(chosen_url)
+ async def connect_docker_model_runner_api(
+     docker_model_runner_custom_url: str | None = None,
+ ) -> DockerModelRunnerConnection:
+     return await connect_docker_model_runner(docker_model_runner_custom_url)
```

If you apply the earlier param-rename, calling with a keyword is also fine.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (13)
- `.cursor/rules/project.mdc` (1 hunks)
- `app/desktop/studio_server/provider_api.py` (7 hunks)
- `app/web_ui/src/lib/api_schema.d.ts` (4 hunks)
- `app/web_ui/src/lib/ui/provider_image.ts` (1 hunks)
- `app/web_ui/src/routes/(fullscreen)/setup/(setup)/connect_providers/connect_providers.svelte` (10 hunks)
- `libs/core/kiln_ai/adapters/adapter_registry.py` (1 hunks)
- `libs/core/kiln_ai/adapters/docker_model_runner_tools.py` (1 hunks)
- `libs/core/kiln_ai/adapters/ml_model_list.py` (1 hunks)
- `libs/core/kiln_ai/adapters/model_adapters/litellm_adapter.py` (2 hunks)
- `libs/core/kiln_ai/adapters/provider_tools.py` (3 hunks)
- `libs/core/kiln_ai/adapters/test_docker_model_runner_tools.py` (1 hunks)
- `libs/core/kiln_ai/datamodel/datamodel_enums.py` (1 hunks)
- `libs/core/kiln_ai/utils/config.py` (1 hunks)
✅ Files skipped from review due to trivial changes (3)
- .cursor/rules/project.mdc
 - app/web_ui/src/lib/ui/provider_image.ts
 - libs/core/kiln_ai/adapters/provider_tools.py
 
🚧 Files skipped from review as they are similar to previous changes (9)
- libs/core/kiln_ai/datamodel/datamodel_enums.py
 - libs/core/kiln_ai/utils/config.py
 - libs/core/kiln_ai/adapters/adapter_registry.py
 - libs/core/kiln_ai/adapters/model_adapters/litellm_adapter.py
 - libs/core/kiln_ai/adapters/ml_model_list.py
 - libs/core/kiln_ai/adapters/test_docker_model_runner_tools.py
 - libs/core/kiln_ai/adapters/docker_model_runner_tools.py
 - app/web_ui/src/lib/api_schema.d.ts
 - app/web_ui/src/routes/(fullscreen)/setup/(setup)/connect_providers/connect_providers.svelte
 
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit Inference Engine (.cursor/rules/project.mdc)
**/*.py: Always assume pydantic 2 (not pydantic 1)
The project supports Python 3.10 and above
Files:
app/desktop/studio_server/provider_api.py
🧠 Learnings (1)
📚 Learning: 2025-08-08T15:50:45.306Z
Learnt from: leonardmq
PR: Kiln-AI/Kiln#418
File: libs/core/kiln_ai/adapters/ml_model_list.py:221-228
Timestamp: 2025-08-08T15:50:45.306Z
Learning: In libs/core/kiln_ai/adapters/ml_model_list.KilnModelProvider, naming policy: keep cross-provider feature flags unprefixed (e.g., reasoning_optional_for_structured_output) to allow reuse across providers; prefix provider-specific toggles with the provider name (e.g., siliconflow_enable_thinking) when the implementation is specific and potentially unsafe to generalize.
Applied to files:
app/desktop/studio_server/provider_api.py
🔇 Additional comments (4)
app/desktop/studio_server/provider_api.py (4)
6-6: Imports look good and are placed at top. `httpx` is used for specific exception handling. Docker Model Runner tools are properly imported.
Also applies to: 12-15
103-107: Good, explicit connection error message. Clear guidance about enabling TCP and checking docs is user-friendly right where it matters.

369-370: Correctly mark Docker Model Runner as not supporting API keys. Consistent with provider capabilities; prevents unsupported connect/disconnect attempts.
Also applies to: 429-430
1100-1120: Confirm whether DMR model IDs can have aliases (e.g., ":latest") and handle them if so. Ollama has alias/tag handling; Docker MR may not, but if aliases exist, map them similarly to avoid missing known models.
If aliases exist, we can extend this function to match alias sets. Would you like a patch once confirmed?
…o expected client errors. This prevents masking unexpected bugs and aligns with earlier feedback on specific exception handling.
- When only Ollama is available → Ollama appears first
- When both are available → Ollama appears first, Docker second
…chema constrained decoding natively
          
 We recently added: https://hub.docker.com/r/ai/granite-embedding-multilingual and https://hub.docker.com/r/ai/nomic-embed-text-v1.5 as well 👍 
Thanks for pointing it out @leonardmq! I'm not familiar with the coverage conventions of the project. Could you advise me on whether I should add more tests?
    
Actionable comments posted: 1
♻️ Duplicate comments (1)
app/web_ui/src/routes/(fullscreen)/setup/(setup)/connect_providers/connect_providers.svelte (1)
576-580: Hardcoded default URL comparison. The comparison uses a hardcoded URL string instead of a constant, which reduces maintainability.
Consider defining a constant at the top of the script section for better maintainability:
```diff
+const DEFAULT_DOCKER_URL = "http://localhost:12434/engines/llama.cpp"
+
 const connect_docker_model_runner = async (
   user_initiated: boolean = true,
 ) => {
   // ... rest of function ...
   const custom_url_str =
-    docker_model_runner_custom_url &&
-    docker_model_runner_custom_url !==
-      "http://localhost:12434/engines/llama.cpp"
+    docker_model_runner_custom_url && docker_model_runner_custom_url !== DEFAULT_DOCKER_URL
       ? "Custom Docker Model Runner URL: " + docker_model_runner_custom_url
       : ""
```
🧹 Nitpick comments (2)
app/desktop/studio_server/provider_api.py (1)
219-223: Docker Model Runner models are inserted at index 0. While the implementation is correct, inserting Docker Model Runner models at index 0 means they will appear before Ollama models in the UI, which may not be the intended ordering given that Ollama was traditionally the primary local provider.
Consider whether Docker Model Runner should be ordered after Ollama to maintain consistency with existing user expectations, or if this new ordering is intentional.
app/web_ui/src/routes/(fullscreen)/setup/(setup)/connect_providers/connect_providers.svelte (1)
1082-1082: Inconsistent placeholder URL.

The placeholder shows `http://localhost:12434/engines` but the description mentions `http://localhost:12434/engines/llama.cpp`, and the hardcoded comparison in the code uses the latter.

Consider making the placeholder consistent with the actual default URL:

```diff
- placeholder="http://localhost:12434/engines"
+ placeholder="http://localhost:12434/engines/llama.cpp"
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- `app/desktop/studio_server/provider_api.py` (7 hunks)
- `app/web_ui/src/routes/(fullscreen)/setup/(setup)/connect_providers/connect_providers.svelte` (10 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit Inference Engine (.cursor/rules/project.mdc)
**/*.py: Always assume pydantic 2 (not pydantic 1)
The project supports Python 3.10 and above
Files:
app/desktop/studio_server/provider_api.py
app/web_ui/src/**/*.{js,jsx,ts,tsx,svelte}
📄 CodeRabbit Inference Engine (.cursor/rules/project.mdc)
Our web UI component framework is DaisyUI v4 (based on Tailwind). Be sure to use v4 docs and features, not v5
Files:
app/web_ui/src/routes/(fullscreen)/setup/(setup)/connect_providers/connect_providers.svelte
🧠 Learnings (1)
📚 Learning: 2025-08-08T15:50:45.306Z
Learnt from: leonardmq
PR: Kiln-AI/Kiln#418
File: libs/core/kiln_ai/adapters/ml_model_list.py:221-228
Timestamp: 2025-08-08T15:50:45.306Z
Learning: In libs/core/kiln_ai/adapters/ml_model_list.KilnModelProvider, naming policy: keep cross-provider feature flags unprefixed (e.g., reasoning_optional_for_structured_output) to allow reuse across providers; prefix provider-specific toggles with the provider name (e.g., siliconflow_enable_thinking) when the implementation is specific and potentially unsafe to generalize.
Applied to files:
app/desktop/studio_server/provider_api.py
🧬 Code Graph Analysis (1)
app/desktop/studio_server/provider_api.py (5)
libs/core/kiln_ai/adapters/docker_model_runner_tools.py (3)
- `DockerModelRunnerConnection` (43-50)
- `get_docker_model_runner_connection` (90-112)
- `docker_model_runner_base_url` (11-22)

libs/core/kiln_ai/utils/config.py (2)
- `Config` (28-262)
- `shared` (159-162)

libs/core/kiln_ai/datamodel/datamodel_enums.py (2)
- `ModelProviderName` (80-102)
- `StructuredOutputMode` (23-45)

libs/core/kiln_ai/adapters/provider_tools.py (1)
- `provider_name_from_id` (342-398)

libs/core/kiln_ai/adapters/ml_model_list.py (2)
- `KilnModel` (244-259)
- `KilnModelProvider` (189-241)
🔇 Additional comments (19)
app/desktop/studio_server/provider_api.py (8)
6-6: LGTM! The httpx import is correctly placed at the top level and is needed for the Docker Model Runner exception handling.

12-15: LGTM! The imports from Docker Model Runner tools are appropriately organized and follow the project's import structure.

85-128: LGTM! The `connect_docker_model_runner` function is well-implemented with proper error handling. The URL validation, exception handling with pass-through for HTTPExceptions, and config saving logic all follow established patterns from the Ollama implementation.

251-256: LGTM! The API endpoint follows the established pattern and correctly delegates to the connection function with the custom URL parameter.

372-372: LGTM! Correctly marks Docker Model Runner as unsupported for API key connections, consistent with other local providers like Ollama.

432-432: LGTM! Correctly marks Docker Model Runner as unsupported for API key disconnections, consistent with the connect_api_key implementation.

1043-1101: LGTM! The `available_docker_model_runner_models` function follows the same pattern as the Ollama equivalent. The error handling correctly catches the specific exceptions mentioned in past reviews and the model detail mapping looks appropriate.

1103-1123: LGTM! The `models_from_docker_model_runner_id` function follows the exact same pattern as `models_from_ollama_tag` and correctly maps Docker Model Runner model IDs to KilnModel/KilnModelProvider pairs.

app/web_ui/src/routes/(fullscreen)/setup/(setup)/connect_providers/connect_providers.svelte (11)
4-7: LGTM! The import correctly includes the DockerModelRunnerConnection type needed for the new functionality.

64-69: LGTM! The Docker Model Runner provider entry is well-structured and follows the same pattern as other local providers like Ollama.

253-258: LGTM! The Docker Model Runner status object follows the same structure as other providers.

413-415: LGTM! The provider connection logic correctly calls the Docker Model Runner connect function.

427-429: LGTM! The parameter name change from the typo to `user_initiated` is correct and maintains consistency.

512-586: LGTM! The `connect_docker_model_runner` function is well-implemented and follows the same pattern as `connect_ollama`. The error handling, status management, and descriptive message construction are all appropriate.

686-688: LGTM! The Docker Model Runner base URL is correctly loaded from the settings during provider initialization.

735-739: LGTM! The automatic connection check on mount follows the same pattern as Ollama, appropriately clearing errors for non-user-initiated connections.

747-750: LGTM! The dialog function follows the same pattern as the Ollama equivalent.

961-968: LGTM! The error state UI correctly shows the custom URL button when Docker Model Runner has connection issues, following the same pattern as Ollama.

1062-1104: LGTM! The Docker Model Runner dialog is well-implemented and follows the same structure and styling as the Ollama dialog. The placeholder and description text are appropriate for Docker Model Runner usage.
          
@ilopezluna - we do not have a hard number for coverage requirements; current overall backend coverage is at 92%. We try to go for 100% coverage, but some modules are harder to test. Latest diff coverage: https://github.com/Kiln-AI/Kiln/actions/runs/16907251293/artifacts/3746939147

My advice would be to look at the files with low coverage in this diff coverage, see if the untested lines are worth testing, and see if we already have similar tests for.

Precision: this is only about the backend; we have few tests on the frontend at the moment, so no need to worry about that part.
    
Actionable comments posted: 0
♻️ Duplicate comments (1)
app/desktop/studio_server/provider_api.py (1)
1081-1081: Consider conservative default for untested model structured output.

The code sets `supports_structured_output=True` for untested Docker Model Runner models. While Docker Model Runner does support JSON schema, being more conservative with `supports_structured_output=False` for truly unknown models might be safer, following the pattern used for Ollama untested models.

From the past review comments, this was flagged as potentially too optimistic.
Consider changing to:
```diff
- supports_structured_output=True,
+ supports_structured_output=False,
```
🧹 Nitpick comments (1)
libs/core/kiln_ai/adapters/ml_model_list.py (1)
1198-1217: Consider consolidating multiple model_id variants.

Multiple entries for the same base model with different tags (e.g., `ai/llama3.1`, `ai/llama3.1:latest`, `ai/llama3.1:8B-Q4_K_M`, `ai/llama3.1:8B-F16`) create UI clutter. Consider if these represent genuinely different models or if they could be consolidated with a canonical model_id.
From the past review comments, it seems there was discussion about using explicit tags like "3B-Q4_K_M" instead of "latest" for clarity.
Also applies to: 1556-1575, 2757-2791
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
 - Jira integration is disabled by default for public repositories
 - Linear integration is disabled by default for public repositories
 
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
- `app/desktop/studio_server/provider_api.py` (7 hunks)
- `libs/core/kiln_ai/adapters/ml_model_list.py` (19 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit Inference Engine (.cursor/rules/project.mdc)
**/*.py: Always assume pydantic 2 (not pydantic 1)
The project supports Python 3.10 and above
Files:
- `app/desktop/studio_server/provider_api.py`
- `libs/core/kiln_ai/adapters/ml_model_list.py`
🧠 Learnings (5)
📚 Learning: 2025-08-08T15:50:45.334Z
Learnt from: leonardmq
PR: Kiln-AI/Kiln#418
File: libs/core/kiln_ai/adapters/ml_model_list.py:221-228
Timestamp: 2025-08-08T15:50:45.334Z
Learning: In libs/core/kiln_ai/adapters/ml_model_list.KilnModelProvider, naming policy: keep cross-provider feature flags unprefixed (e.g., reasoning_optional_for_structured_output) to allow reuse across providers; prefix provider-specific toggles with the provider name (e.g., siliconflow_enable_thinking) when the implementation is specific and potentially unsafe to generalize.
Applied to files:
- `app/desktop/studio_server/provider_api.py`
- `libs/core/kiln_ai/adapters/ml_model_list.py`
📚 Learning: 2025-08-08T16:13:26.526Z
Learnt from: leonardmq
PR: Kiln-AI/Kiln#418
File: libs/core/kiln_ai/adapters/ml_model_list.py:1875-1890
Timestamp: 2025-08-08T16:13:26.526Z
Learning: In libs/core/kiln_ai/adapters/ml_model_list.py (Python), do not blanket-add r1_thinking/optional_r1_thinking parsers for R1-style models. Parser usage is provider-specific and must be based on observed responses in tests. For PR Kiln-AI/Kiln#418, deepseek_r1_0528_distill_qwen3_8b providers were validated without needing a parser.
Applied to files:
libs/core/kiln_ai/adapters/ml_model_list.py
📚 Learning: 2025-08-08T16:19:20.074Z
Learnt from: leonardmq
PR: Kiln-AI/Kiln#418
File: libs/core/kiln_ai/adapters/ml_model_list.py:1736-1741
Timestamp: 2025-08-08T16:19:20.074Z
Learning: In libs/core/kiln_ai/adapters/ml_model_list.py (Python), for ModelName.qwq_32b with ModelProviderName.siliconflow_cn (model_id "Qwen/QwQ-32B"), do not set a parser (r1_thinking/optional_r1_thinking). Verified via tests/responses that SiliconFlow returns outputs that parse correctly without an R1 parser. Parser usage remains provider-specific and should only be added when validated.
Applied to files:
libs/core/kiln_ai/adapters/ml_model_list.py
📚 Learning: 2025-08-08T16:15:20.796Z
Learnt from: leonardmq
PR: Kiln-AI/Kiln#418
File: libs/core/kiln_ai/adapters/ml_model_list.py:2434-2439
Timestamp: 2025-08-08T16:15:20.796Z
Learning: In libs/core/kiln_ai/adapters/ml_model_list.py (Python), for ModelName.qwen_3_8b_no_thinking with ModelProviderName.siliconflow_cn (model_id "Qwen/Qwen3-8B"), the qwen3_style_no_think formatter is not required: SiliconFlow’s non‑thinking Qwen3‑8B returns clean output without <answer> wrappers. Formatter/parser usage is provider-specific and should be added only when verified by tests/responses.
Applied to files:
libs/core/kiln_ai/adapters/ml_model_list.py
📚 Learning: 2025-08-08T16:14:54.346Z
Learnt from: leonardmq
PR: Kiln-AI/Kiln#418
File: libs/core/kiln_ai/adapters/ml_model_list.py:1979-1983
Timestamp: 2025-08-08T16:14:54.346Z
Learning: In libs/core/kiln_ai/adapters/ml_model_list.py, for ModelName.deepseek_r1_distill_qwen_32b on provider siliconflow_cn (model_id "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B"), do not set a parser (r1_thinking/optional_r1_thinking). Verified via testing that SiliconFlow returns final outputs that parse without an R1 parser, and reasoning may be omitted under structured output; rely on reasoning_optional_for_structured_output=True.
Applied to files:
libs/core/kiln_ai/adapters/ml_model_list.py
🧬 Code Graph Analysis (2)
app/desktop/studio_server/provider_api.py (5)
libs/core/kiln_ai/adapters/docker_model_runner_tools.py (3)
- `DockerModelRunnerConnection` (43-50)
- `get_docker_model_runner_connection` (90-112)
- `docker_model_runner_base_url` (11-22)

libs/core/kiln_ai/utils/config.py (3)
- `Config` (28-262)
- `shared` (159-162)
- `save_setting` (247-248)

libs/core/kiln_ai/datamodel/datamodel_enums.py (2)
- `ModelProviderName` (80-102)
- `StructuredOutputMode` (23-45)

libs/core/kiln_ai/adapters/provider_tools.py (1)
- `provider_name_from_id` (342-398)

libs/core/kiln_ai/adapters/ml_model_list.py (2)
- `KilnModel` (245-260)
- `KilnModelProvider` (190-242)
libs/core/kiln_ai/adapters/ml_model_list.py (3)
libs/core/kiln_ai/datamodel/datamodel_enums.py (2)
- `ModelProviderName` (80-102)
- `StructuredOutputMode` (23-45)

app/web_ui/src/lib/types.ts (2)
- `ModelProviderName` (43-43)
- `StructuredOutputMode` (42-42)

libs/core/kiln_ai/adapters/parsers/test_r1_parser.py (1)
- `parser` (8-9)
🔇 Additional comments (11)
libs/core/kiln_ai/adapters/ml_model_list.py (4)
96-96: LGTM: New model name follows naming conventions. The addition of `gemma_3_0p27b` follows the established pattern for model naming with the 270M size encoded as "0p27b".

1725-1738: LGTM: Gemma 3 270M model correctly configured. The new Gemma 3 270M model is appropriately configured with:
- Disabled structured output and data generation (appropriate for a 270M parameter model)
- Docker Model Runner provider only
- Correct model_id format `ai/gemma3:270M-F16`
1334-1353: Ensure parser configuration consistency across Docker Model Runner providers.

The Docker Model Runner provider entries show extensive model coverage which is excellent. However, I notice some models (like QwQ, DeepSeek R1 variants) have reasoning capabilities and parsers configured correctly, while others don't. Please verify that all models requiring R1 thinking parsers have them properly configured.
According to the retrieved learnings, parser usage should be provider-specific and validated through tests. The configurations appear consistent with existing patterns for reasoning models.
Also applies to: 1436-1459, 1556-1575, 1618-1642, 1751-1756, 1775-1794, 1875-1898, 1961-1995, 2013-2032, 2289-2304, 2377-2426, 2618-2624, 2757-2791, 2856-2862, 2908-2919
1198-1217: Verify Docker Model Runner provider model configurations

The ripgrep command you ran produced no matches, so I can't confirm whether these entries are distinct model images/tags. Please verify.
- File: libs/core/kiln_ai/adapters/ml_model_list.py — lines 1198–1217
```py
KilnModelProvider(
    name=ModelProviderName.docker_model_runner,
    structured_output_mode=StructuredOutputMode.json_schema,
    model_id="ai/llama3.1",
),
KilnModelProvider(
    name=ModelProviderName.docker_model_runner,
    structured_output_mode=StructuredOutputMode.json_schema,
    model_id="ai/llama3.1:latest",
),
KilnModelProvider(
    name=ModelProviderName.docker_model_runner,
    structured_output_mode=StructuredOutputMode.json_schema,
    model_id="ai/llama3.1:8B-Q4_K_M",
),
KilnModelProvider(
    name=ModelProviderName.docker_model_runner,
    structured_output_mode=StructuredOutputMode.json_schema,
    model_id="ai/llama3.1:8B-F16",
),
```

Requested verification steps:
- Check the Docker/image registry or provider docs to confirm whether these tags point to distinct images.
 - If they are duplicates, collapse to a single canonical tag or add an inline comment explaining why multiple tags are listed.
 - If they are distinct, add brief comments or more descriptive identifiers to avoid confusion.
 app/desktop/studio_server/provider_api.py (7)
12-15: LGTM: Clean imports for Docker Model Runner integration. The imports are properly organized and include the necessary types and functions for Docker Model Runner support.

85-129: LGTM: Well-structured connection function with proper error handling. The `connect_docker_model_runner` function follows the established pattern from the Ollama integration with:
- Proper URL validation
 - Specific exception handling (addresses previous review feedback)
 - HTTPException passthrough to preserve status codes
 - Configuration persistence for custom URLs
 The implementation addresses previous feedback about exception handling and parameter naming.
251-256: LGTM: Clean API endpoint implementation. The Docker Model Runner connection endpoint follows the same pattern as the Ollama endpoint with appropriate parameter passing.

219-222: Verify provider ordering logic.

The Docker Model Runner models are inserted at position 0, which means they'll appear first in the UI. However, the comment and subsequent Ollama insertion at position 0 suggests Ollama should maintain the top spot when both are present. The current logic would place Docker Model Runner first, then Ollama.

Should the insertion logic be:

```py
models.insert(1, docker_models)  # Insert at position 1 if Ollama will be at 0
```

Or is Docker Model Runner intended to have priority over Ollama?
372-372: LGTM: Proper exclusion from API key workflows. Docker Model Runner is correctly excluded from both connect and disconnect API key workflows since it doesn't require API keys, following the same pattern as Ollama.
Also applies to: 432-432
1043-1101: LGTM: Well-structured model discovery implementation. The `available_docker_model_runner_models` function follows the established pattern from Ollama with:
- Proper error handling with specific exceptions (addresses previous feedback)
 - Correct mapping from Docker model IDs to Kiln models
 - Appropriate handling of both supported and untested models
 
1103-1122: LGTM: Clean model matching implementation. The `models_from_docker_model_runner_id` function properly matches Docker model IDs to Kiln models and providers, following the same pattern as the Ollama equivalent.
Actionable comments posted: 1
🧹 Nitpick comments (9)
libs/core/kiln_ai/adapters/test_provider_tools.py (1)
937-989: Good coverage for DMR provider_enabled; add an "untested-only" case

You cover success, empty models, and exception paths. Since provider_enabled returns True when untested_models > 0 (even if supported_models == 0), add a test asserting that behavior.
Apply this diff to add the missing case:
```diff
@@
 @pytest.mark.asyncio
 async def test_provider_enabled_docker_model_runner_connection_error():
@@
     result = await provider_enabled(ModelProviderName.docker_model_runner)
     assert result is False
+
+
+@pytest.mark.asyncio
+async def test_provider_enabled_docker_model_runner_untested_only():
+    """Test provider_enabled for Docker Model Runner when only untested models are present"""
+    with patch(
+        "kiln_ai.adapters.provider_tools.get_docker_model_runner_connection",
+        new_callable=AsyncMock,
+    ) as mock_get_docker:
+        mock_get_docker.return_value = DockerModelRunnerConnection(
+            message="Connected with untested",
+            supported_models=[],
+            untested_models=["some-new-model"],
+        )
+
+        result = await provider_enabled(ModelProviderName.docker_model_runner)
+        assert result is True
+
+
 def test_provider_name_from_id_docker_model_runner():
```

libs/core/kiln_ai/adapters/test_adapter_registry.py (2)
11-11: Prefer consistent enum usage for structured_output_mode across tests

You import StructuredOutputMode and use it in the new DMR tests. Consider switching the earlier tests that pass "json_schema" as a string to the enum for consistency and static checking.
269-341: Strong adapter tests for DMR; consider a trailing-slash case and reduce duplication
- You validated default, custom, and None base URL behavior and the “/v1” suffix. Consider also testing when docker_model_runner_base_url ends with a trailing slash to ensure no double slashes in constructed base_url.
 - The three DMR tests share a lot of assertions; you could parametrize them to reduce duplication.
 Add a trailing-slash test (optional):
```diff
@@
 def test_docker_model_runner_adapter_creation_with_none_url(mock_config, basic_task):
@@
     assert adapter.config.default_headers is None
+
+
+def test_docker_model_runner_adapter_creation_with_trailing_slash(mock_config, basic_task):
+    """Test DMR adapter creation when config URL ends with a trailing slash (no double slashes in /v1)."""
+    mock_config.shared.return_value.docker_model_runner_base_url = (
+        "http://custom:8080/engines/llama.cpp/"
+    )
+
+    adapter = adapter_for_task(
+        kiln_task=basic_task,
+        run_config_properties=RunConfigProperties(
+            model_name="llama_3_2_3b",
+            model_provider_name=ModelProviderName.docker_model_runner,
+            prompt_id="simple_prompt_builder",
+            structured_output_mode=StructuredOutputMode.json_schema,
+        ),
+    )
+
+    assert isinstance(adapter, LiteLlmAdapter)
+    assert adapter.config.base_url == "http://custom:8080/engines/llama.cpp/v1"
```

If this test fails, we should normalize the base URL in adapter_registry by trimming a trailing slash before appending "/v1".
libs/core/kiln_ai/adapters/test_docker_model_runner_tools.py (3)
34-65: Model parsing test looks good; using the enum avoids string drift
You correctly patched built_in_models and used ModelProviderName.docker_model_runner. Consider asserting that other known IDs land in untested_models to fully validate the split, but not required.
221-275: Minor: error-path coverage overlaps; consider adding openai.APIError case
You already cover APIConnectionError and RequestError. Adding an openai.APIError case would exercise the remaining except branch.
Example (optional):
@@ async def test_get_docker_model_runner_connection_http_error():
@@
     assert result is None
+
+
+@pytest.mark.asyncio
+async def test_get_docker_model_runner_connection_openai_api_error():
+    """Test get_docker_model_runner_connection with OpenAI API error."""
+    from kiln_ai.adapters.docker_model_runner_tools import get_docker_model_runner_connection
+    with patch("kiln_ai.adapters.docker_model_runner_tools.openai.OpenAI") as mock_openai:
+        mock_client = Mock()
+        mock_client.models.list.side_effect = openai.APIError(message="boom", request=Mock(), response=Mock())
+        mock_openai.return_value = mock_client
+        result = await get_docker_model_runner_connection()
+        assert result is None
277-310: Two installation tests could be parametrized to reduce duplication
Not required, but consider parametrizing over presence/absence to keep the file concise.
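A minimal sketch of what that parametrization could look like. The helper below is a stand-in; the real signature of docker_model_runner_model_installed and its connection object are assumptions based on this review, not the exact project API.

```python
# Hedged sketch: collapsing presence/absence installation checks into one parametrized test.
import pytest


def model_installed(installed_models: list[str], model_id: str) -> bool:
    # Stand-in for the real installation check, which consults the DMR connection.
    return model_id in installed_models


@pytest.mark.parametrize(
    "installed_models, expected",
    [
        (["ai/llama3.2:3B-Q4_K_M"], True),  # model present
        ([], False),                        # model absent
    ],
)
def test_model_installed(installed_models, expected):
    assert model_installed(installed_models, "ai/llama3.2:3B-Q4_K_M") is expected
```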
app/desktop/studio_server/test_provider_api.py (3)
5-5: Remove redundant import of httpx
Line 392 imports httpx again when it's already imported on line 5. Remove the duplicate import.
@@ -389,8 +389,6 @@
 @pytest.mark.asyncio
 async def test_connect_docker_model_runner_http_error():
     """Test connect_docker_model_runner with HTTP errors"""
-    import httpx
-
     with patch(
         "app.desktop.studio_server.provider_api.get_docker_model_runner_connection"
     ) as mock_get_connection:
Also applies to: 7-7
583-583: Consider using a constant for expected model count
The assertion assert len(result.models) == 3 on line 583 depends on the specific test setup (2 supported + 1 untested). Consider using a more explicit assertion or comment to clarify the expected count.
-    assert len(result.models) == 3  # 2 supported + 1 untested
+    expected_model_count = len(mock_connection.supported_models) + len(mock_connection.untested_models)
+    assert len(result.models) == expected_model_count  # 2 supported + 1 untested
Also applies to: 610-610
592-593: Consider asserting on untested_model field explicitly
For completeness, consider asserting on the untested_model field for supported models to ensure it's explicitly False (not just falsy).
-    assert result.models[0].untested_model is False
+    assert result.models[0].untested_model is False  # Supported models should not be marked as untested
     # Check second supported model
     assert result.models[1].id == "supported2"
     assert result.models[1].name == "Supported 2"
     assert result.models[1].supports_structured_output is False
     assert (
         result.models[1].structured_output_mode == StructuredOutputMode.json_schema
     )
-    assert result.models[1].untested_model is False
+    assert result.models[1].untested_model is False  # Supported models should not be marked as untested
Also applies to: 600-601
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
 - Jira integration is disabled by default for public repositories
 - Linear integration is disabled by default for public repositories
 
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (4)
- app/desktop/studio_server/test_provider_api.py (4 hunks)
- libs/core/kiln_ai/adapters/test_adapter_registry.py (3 hunks)
- libs/core/kiln_ai/adapters/test_docker_model_runner_tools.py (1 hunks)
- libs/core/kiln_ai/adapters/test_provider_tools.py (2 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit Inference Engine (.cursor/rules/project.mdc)
**/*.py: Always assume pydantic 2 (not pydantic 1)
The project supports Python 3.10 and above
Files:
- libs/core/kiln_ai/adapters/test_adapter_registry.py
- libs/core/kiln_ai/adapters/test_docker_model_runner_tools.py
- libs/core/kiln_ai/adapters/test_provider_tools.py
- app/desktop/studio_server/test_provider_api.py
**/test_*.py
📄 CodeRabbit Inference Engine (.cursor/rules/project.mdc)
**/test_*.py: Always use pytest for tests in Python code
Test brevity is important. Use approaches for re-use and brevity including using fixtures for repeated code, and using pytest parameterize for similar tests
Files:
- libs/core/kiln_ai/adapters/test_adapter_registry.py
- libs/core/kiln_ai/adapters/test_docker_model_runner_tools.py
- libs/core/kiln_ai/adapters/test_provider_tools.py
- app/desktop/studio_server/test_provider_api.py
🧠 Learnings (1)
📚 Learning: 2025-08-08T15:50:45.334Z
Learnt from: leonardmq
PR: Kiln-AI/Kiln#418
File: libs/core/kiln_ai/adapters/ml_model_list.py:221-228
Timestamp: 2025-08-08T15:50:45.334Z
Learning: In libs/core/kiln_ai/adapters/ml_model_list.KilnModelProvider, naming policy: keep cross-provider feature flags unprefixed (e.g., reasoning_optional_for_structured_output) to allow reuse across providers; prefix provider-specific toggles with the provider name (e.g., siliconflow_enable_thinking) when the implementation is specific and potentially unsafe to generalize.
Applied to files:
app/desktop/studio_server/test_provider_api.py
🧬 Code Graph Analysis (4)
libs/core/kiln_ai/adapters/test_adapter_registry.py (5)
- libs/core/kiln_ai/datamodel/datamodel_enums.py (2): StructuredOutputMode (23-45), ModelProviderName (80-102)
- libs/core/kiln_ai/adapters/docker_model_runner_tools.py (1): docker_model_runner_base_url (11-22)
- libs/core/kiln_ai/adapters/test_provider_tools.py (1): mock_config (42-44)
- libs/core/kiln_ai/datamodel/task.py (1): RunConfigProperties (48-82)
- libs/core/kiln_ai/adapters/model_adapters/litellm_adapter.py (1): LiteLlmAdapter (27-441)
libs/core/kiln_ai/adapters/test_docker_model_runner_tools.py (1)
- libs/core/kiln_ai/adapters/docker_model_runner_tools.py (7): DockerModelRunnerConnection (43-50), docker_model_runner_base_url (11-22), parse_docker_model_runner_models (54-87), all_models (49-50), docker_model_runner_online (25-40), get_docker_model_runner_connection (90-112), docker_model_runner_model_installed (115-119)
libs/core/kiln_ai/adapters/test_provider_tools.py (2)
- libs/core/kiln_ai/adapters/provider_tools.py (2): provider_enabled (28-53), provider_name_from_id (342-398)
- libs/core/kiln_ai/datamodel/datamodel_enums.py (1): ModelProviderName (80-102)
app/desktop/studio_server/test_provider_api.py (5)
- libs/core/kiln_ai/datamodel/datamodel_enums.py (2): StructuredOutputMode (23-45), ModelProviderName (80-102)
- app/desktop/studio_server/provider_api.py (3): connect_docker_model_runner (85-128), available_docker_model_runner_models (1043-1100), models_from_docker_model_runner_id (1103-1122)
- libs/core/kiln_ai/adapters/docker_model_runner_tools.py (1): docker_model_runner_base_url (11-22)
- libs/core/kiln_ai/utils/config.py (1): save_setting (247-248)
- libs/core/kiln_ai/adapters/ml_model_list.py (2): KilnModel (245-260), KilnModelProvider (190-242)
🔇 Additional comments (9)
libs/core/kiln_ai/adapters/test_provider_tools.py (1)
5-5: Import looks correct and scoped to the tests that need it
DockerModelRunnerConnection is used only in the new DMR tests. Good, tight import.
libs/core/kiln_ai/adapters/test_adapter_registry.py (1)
21-23: Nice: default DMR base URL pinned in the mock Config
This makes the DMR tests deterministic and aligned with the implementation default.
libs/core/kiln_ai/adapters/test_docker_model_runner_tools.py (2)
90-111: Service “online” checks are well-isolated with AsyncClient patching
Good use of AsyncMock and context manager patching for httpx.AsyncClient.
131-176: Connection success test thoroughly verifies OpenAI client init and parse call
Asserting base_url composition and parse invocation is spot on.
app/desktop/studio_server/test_provider_api.py (5)
349-479: Well-structured test coverage for Docker Model Runner connection logic
The connection test functions comprehensively cover all failure scenarios: invalid URL format, connection failures, API errors (OpenAI APIConnectionError), HTTP errors (httpx.RequestError), and HTTPException preservation. The URL persistence logic is properly tested for both custom and unchanged URLs.
481-523: LGTM! API endpoint tests are well-structured
The API endpoint tests properly verify that the /api/provider/docker_model_runner/connect endpoint correctly invokes connect_docker_model_runner with both default (None) and custom URL parameters.
525-611: Comprehensive test coverage for available Docker Model Runner models
The test suite thoroughly covers Docker Model Runner model discovery, including:
- Successful connection with both supported and untested models
 - Correct mapping of model attributes (id, name, supports_structured_output, structured_output_mode)
 - Proper handling of untested models with the untested_model flag
 
613-707: Excellent error handling test coverage
The test suite properly covers all error scenarios for available_docker_model_runner_models:
- No connection available
 - Empty model lists
 - API errors (openai.APIConnectionError)
 - HTTP errors (httpx.RequestError)
 - HTTPException handling
 All error cases correctly return None as expected.
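The error-handling shape those tests exercise is roughly the following. This is a sketch only, using the exception types named in the review; the real function is async and also builds the connection object and model lists.

```python
# Sketch: discovery swallows connection/API errors and returns None instead of raising.
import httpx
import openai


def list_model_ids_or_none(client: openai.OpenAI) -> list[str] | None:
    try:
        # The OpenAI client's models.list() is iterable; each entry exposes an .id field.
        return [model.id for model in client.models.list()]
    except (openai.APIConnectionError, openai.APIError, httpx.RequestError):
        # Any connectivity or API failure is treated as "provider unavailable".
        return None
```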
709-784: Thorough test for models_from_docker_model_runner_id function
The test comprehensively validates:
- Matching Docker Model Runner IDs return correct (KilnModel, KilnModelProvider) tuples
 - Multi-provider models correctly filter to Docker Model Runner provider only
 - Non-matching IDs return empty lists
 - Models without Docker Model Runner provider are excluded
 
Looking good! Few tiny changes then good to go:
- We only support one provider entry per provider for each model in ml_model_list (see comment). I should have had a test.
- One small import fix in the tests.
 
| model_id="ai/llama3.2:3B-Q4_K_M", | ||
| structured_output_mode=StructuredOutputMode.json_schema, | ||
| supports_data_gen=False, | ||
| ), | 
Ah, so we can only have one entry per provider in there. I'll add a test to check that. If we add multiple, "Llama 8B on DMR" becomes ambiguous, which isn't good for reproducibility.
Let's just use the Q4_K_M versions (or similar)? The user can still use the others; they will automatically appear in the dropdown, but with their own more specific IDs.
Actionable comments posted: 0
🔭 Outside diff range comments (1)
app/desktop/studio_server/test_generate_docs.py (1)
44-45: Bug: Comparing enum value (str) to enum instance always returns False
This condition will never match because provider.name.value is a string and ModelProviderName.fireworks_ai is an enum. As a result, fireworks finetune support is never detected via the API list. Fix by comparing enums (or values) consistently.
Apply this diff:
-        if provider.name.value == ModelProviderName.fireworks_ai:
+        if provider.name == ModelProviderName.fireworks_ai:
🧹 Nitpick comments (5)
app/desktop/studio_server/test_generate_docs.py (1)
55-60: Async test should be marked for pytest
async def test_generate_model_table() will error if run without an event loop; it's currently only marked as paid. Add @pytest.mark.asyncio so it runs correctly when explicitly selected.
Apply this diff:
 @pytest.mark.paid(reason="Marking as paid so it isn't run by default")
+@pytest.mark.asyncio
 async def test_generate_model_table():
libs/core/kiln_ai/adapters/ml_model_list.py (2)
2410-2415: R1-style “thinking” models on DMR: verify parser requirements
Some DMR reasoning-capable entries don't set an R1 parser, while analogous providers (e.g., OpenRouter/Fireworks/Together) do:
- Qwen3 0.6B (DMR) — reasoning_capable=True, no parser
 - Qwen3 8B (DMR) — reasoning_capable=True, no parser
 - Qwen3 14B (DMR) — reasoning_capable=True, no parser
 - Qwen3 30B A3B (DMR) — reasoning_capable=True, no parser
 Conversely:
- QwQ 32B (DMR) sets parser=r1_thinking (consistent with other providers)
- DeepSeek R1 Distill Llama 70B/8B (DMR) set parser=r1_thinking
Given past learnings: parser usage for R1-style output is provider-specific and must be backed by observed responses. Please confirm DMR's OpenAI-compatible responses for these Qwen3 variants:
- If DMR returns … (or similar) interleaved content, set parser=r1_thinking.
 Optionally add minimal tests that feed sample DMR responses for these models into the parser layer to lock correctness.
Also applies to: 2549-2554, 2620-2625, 2672-2676
3277-3283: Avoid mutable default argument in utility function
disallowed_modes=[]is a mutable default. UseNoneand initialize inside the function.Apply this diff:
def default_structured_output_mode_for_model_provider( model_name: str, provider: ModelProviderName, default: StructuredOutputMode = StructuredOutputMode.default, - disallowed_modes: List[StructuredOutputMode] = [], + disallowed_modes: List[StructuredOutputMode] | None = None, ) -> StructuredOutputMode: @@ - try: + if disallowed_modes is None: + disallowed_modes = [] + + try: # Convert string to ModelName enum model_name_enum = ModelName(model_name)Also applies to: 3286-3294
libs/core/kiln_ai/adapters/test_ml_model_list.py (2)
193-227: Enforce “one provider entry per model” — solid guardrail; minor refinement
Great addition and aligns with earlier discussion about avoiding ambiguity. Prefer pytest.fail over assert False for clarity.
Apply this diff:
-                assert False, (
-                    f"Model {model.name} has duplicate providers:\n"
-                    f"Expected: 1 entry per provider\n"
-                    f"Found: {len(provider_names)} total entries, {len(unique_provider_names)} unique providers\n"
-                    f"Duplicates: {', '.join(duplicate_details)}\n"
-                    f"This suggests either:\n"
-                    f"1. A bug where the same provider is accidentally duplicated, or\n"
-                    f"2. Intentional design where the same provider offers different model variants\n"
-                    f"If this is intentional, the test should be updated to allow multiple entries per provider."
-                )
+                pytest.fail(
+                    "Model {name} has duplicate providers:\n"
+                    "Expected: 1 entry per provider\n"
+                    "Found: {total} total entries, {unique} unique providers\n"
+                    "Duplicates: {dups}\n"
+                    "This suggests either:\n"
+                    "1. A bug where the same provider is accidentally duplicated, or\n"
+                    "2. Intentional design where the same provider offers different model variants\n"
+                    "If this is intentional, the test should be updated to allow multiple entries per provider."
+                    .format(
+                        name=model.name,
+                        total=len(provider_names),
+                        unique=len(unique_provider_names),
+                        dups=", ".join(duplicate_details),
+                    )
+                )
150-156: Docstring mismatch with implementation
The test says “model with single provider” but uses gpt_4_1_nano, which has multiple providers. The logic is correct (it just grabs the first provider), but the docstring should be updated to avoid confusion.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
 - Jira integration is disabled by default for public repositories
 - Linear integration is disabled by default for public repositories
 
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (4)
- app/desktop/studio_server/test_generate_docs.py (1 hunks)
- libs/core/kiln_ai/adapters/ml_model_list.py (19 hunks)
- libs/core/kiln_ai/adapters/test_docker_model_runner_tools.py (1 hunks)
- libs/core/kiln_ai/adapters/test_ml_model_list.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- libs/core/kiln_ai/adapters/test_docker_model_runner_tools.py
 
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit Inference Engine (.cursor/rules/project.mdc)
**/*.py: Always assume pydantic 2 (not pydantic 1)
The project supports Python 3.10 and above
Files:
- app/desktop/studio_server/test_generate_docs.py
- libs/core/kiln_ai/adapters/test_ml_model_list.py
- libs/core/kiln_ai/adapters/ml_model_list.py
**/test_*.py
📄 CodeRabbit Inference Engine (.cursor/rules/project.mdc)
**/test_*.py: Always use pytest for tests in Python code
Test brevity is important. Use approaches for re-use and brevity including using fixtures for repeated code, and using pytest parameterize for similar tests
Files:
- app/desktop/studio_server/test_generate_docs.py
- libs/core/kiln_ai/adapters/test_ml_model_list.py
🧠 Learnings (5)
📚 Learning: 2025-08-08T16:13:26.526Z
Learnt from: leonardmq
PR: Kiln-AI/Kiln#418
File: libs/core/kiln_ai/adapters/ml_model_list.py:1875-1890
Timestamp: 2025-08-08T16:13:26.526Z
Learning: In libs/core/kiln_ai/adapters/ml_model_list.py (Python), do not blanket-add r1_thinking/optional_r1_thinking parsers for R1-style models. Parser usage is provider-specific and must be based on observed responses in tests. For PR Kiln-AI/Kiln#418, deepseek_r1_0528_distill_qwen3_8b providers were validated without needing a parser.
Applied to files:
- app/desktop/studio_server/test_generate_docs.py
- libs/core/kiln_ai/adapters/test_ml_model_list.py
- libs/core/kiln_ai/adapters/ml_model_list.py
📚 Learning: 2025-08-08T15:50:45.334Z
Learnt from: leonardmq
PR: Kiln-AI/Kiln#418
File: libs/core/kiln_ai/adapters/ml_model_list.py:221-228
Timestamp: 2025-08-08T15:50:45.334Z
Learning: In libs/core/kiln_ai/adapters/ml_model_list.KilnModelProvider, naming policy: keep cross-provider feature flags unprefixed (e.g., reasoning_optional_for_structured_output) to allow reuse across providers; prefix provider-specific toggles with the provider name (e.g., siliconflow_enable_thinking) when the implementation is specific and potentially unsafe to generalize.
Applied to files:
libs/core/kiln_ai/adapters/ml_model_list.py
📚 Learning: 2025-08-08T16:19:20.074Z
Learnt from: leonardmq
PR: Kiln-AI/Kiln#418
File: libs/core/kiln_ai/adapters/ml_model_list.py:1736-1741
Timestamp: 2025-08-08T16:19:20.074Z
Learning: In libs/core/kiln_ai/adapters/ml_model_list.py (Python), for ModelName.qwq_32b with ModelProviderName.siliconflow_cn (model_id "Qwen/QwQ-32B"), do not set a parser (r1_thinking/optional_r1_thinking). Verified via tests/responses that SiliconFlow returns outputs that parse correctly without an R1 parser. Parser usage remains provider-specific and should only be added when validated.
Applied to files:
libs/core/kiln_ai/adapters/ml_model_list.py
📚 Learning: 2025-08-08T16:14:54.346Z
Learnt from: leonardmq
PR: Kiln-AI/Kiln#418
File: libs/core/kiln_ai/adapters/ml_model_list.py:1979-1983
Timestamp: 2025-08-08T16:14:54.346Z
Learning: In libs/core/kiln_ai/adapters/ml_model_list.py, for ModelName.deepseek_r1_distill_qwen_32b on provider siliconflow_cn (model_id "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B"), do not set a parser (r1_thinking/optional_r1_thinking). Verified via testing that SiliconFlow returns final outputs that parse without an R1 parser, and reasoning may be omitted under structured output; rely on reasoning_optional_for_structured_output=True.
Applied to files:
libs/core/kiln_ai/adapters/ml_model_list.py
📚 Learning: 2025-08-08T16:15:20.796Z
Learnt from: leonardmq
PR: Kiln-AI/Kiln#418
File: libs/core/kiln_ai/adapters/ml_model_list.py:2434-2439
Timestamp: 2025-08-08T16:15:20.796Z
Learning: In libs/core/kiln_ai/adapters/ml_model_list.py (Python), for ModelName.qwen_3_8b_no_thinking with ModelProviderName.siliconflow_cn (model_id "Qwen/Qwen3-8B"), the qwen3_style_no_think formatter is not required: SiliconFlow’s non‑thinking Qwen3‑8B returns clean output without <answer> wrappers. Formatter/parser usage is provider-specific and should be added only when verified by tests/responses.
Applied to files:
libs/core/kiln_ai/adapters/ml_model_list.py
🧬 Code Graph Analysis (2)
app/desktop/studio_server/test_generate_docs.py (1)
libs/core/kiln_ai/adapters/ml_model_list.py (1)
KilnModelProvider(190-242)
libs/core/kiln_ai/adapters/ml_model_list.py (3)
- libs/core/kiln_ai/datamodel/datamodel_enums.py (2): ModelProviderName (80-102), StructuredOutputMode (23-45)
- app/web_ui/src/lib/types.ts (2): ModelProviderName (43-43), StructuredOutputMode (42-42)
- libs/core/kiln_ai/adapters/parsers/test_r1_parser.py (1): parser (8-9)
🔇 Additional comments (5)
app/desktop/studio_server/test_generate_docs.py (1)
5-10: Import path consolidation LGTM
Switching to kiln_ai.adapters.* and provider_tools simplifies imports and matches the new adapter layout.
libs/core/kiln_ai/adapters/ml_model_list.py (3)
96-96: New ModelName entry: Gemma 3 270M — LGTM
Consistent naming with existing convention (0p27b) and used below in the model list.
1198-1203: DMR provider additions with explicit model IDs — good coverage and explicitness
- Clear, explicit tags (weights/quantization) — avoids “latest” ambiguity and supports reproducibility.
 - Broad coverage across Llama, Mistral, Gemma, Phi, Qwen, DeepSeek — aligned with the PR objective.
 No functional concerns on these entries as-is.
Also applies to: 1319-1324, 1374-1380, 1406-1411, 1556-1560, 1642-1655, 1669-1673, 1692-1695, 1776-1781, 1844-1850, 1869-1871, 2129-2135, 2209-2217, 2410-2415, 2549-2554, 2620-2625, 2672-2676
1406-1411: Confirm structured output claims for Llama 3.2 3B on DMR
The OpenRouter entry for 3B marks structured output as unreliable; DMR sets structured_output_mode=json_schema and omits supports_structured_output=False. If DMR reliably enforces JSON schema for this model, great — if not, consider aligning capabilities (e.g., supports_structured_output=False or json_instruction_and_object) until validated.
libs/core/kiln_ai/adapters/test_ml_model_list.py (1)
1-1: Import for Counter — LGTM
Needed for the new duplicate-provider test.
Great work! We can merge as soon as CI passes.
KilnModel(
    family=ModelFamily.gemma,
    name=ModelName.gemma_3_0p27b,
    friendly_name="Gemma 3 270M",
That was fast!
)


def test_unique_providers_per_model():
Thanks!
What does this PR do?
Add support for Docker Model Runner. The implementation follows the same pattern as the Ollama integration but uses OpenAI-compatible endpoints.
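As a quick way to see the OpenAI-compatible surface this relies on, here is a hedged sketch of listing models from a locally running Docker Model Runner. The port and path are the commonly documented local defaults and may differ from what Kiln actually configures.

```python
# Sketch: query Docker Model Runner's OpenAI-compatible /v1/models endpoint.
# The base URL is an assumption (common local default), not necessarily what Kiln uses.
import httpx


def list_local_models(base_url: str = "http://localhost:12434/engines/llama.cpp") -> list[str]:
    response = httpx.get(f"{base_url.rstrip('/')}/v1/models", timeout=5.0)
    response.raise_for_status()
    # OpenAI-compatible APIs return {"data": [{"id": ...}, ...]}
    return [entry["id"] for entry in response.json().get("data", [])]


if __name__ == "__main__":
    try:
        print(list_local_models())
    except httpx.RequestError:
        print("Docker Model Runner does not appear to be running locally.")
```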
Related Issues
#472
Contributor License Agreement
I, @ilopezluna, confirm that I have read and agree to the Contributors License Agreement.
Checklists
Summary by CodeRabbit