Commit fcf2306

Author: Andrej Simurka

    Azure inference supported

1 parent 23948fd · commit fcf2306

File tree

6 files changed: +292 −3 lines changed


.github/workflows/e2e_tests.yaml

Lines changed: 30 additions & 1 deletion

```diff
@@ -8,9 +8,12 @@ jobs:
     runs-on: ubuntu-latest
     strategy:
       matrix:
-        environment: [ "ci"]
+        environment: [ "ci", "azure"]
     env:
       OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
+      CLIENT_SECRET: ${{ secrets.CLIENT_SECRET }}
+      CLIENT_ID: ${{ secrets.CLIENT_ID }}
+      TENANT_ID: ${{ secrets.TENANT_ID }}
 
     steps:
       - uses: actions/checkout@v4
@@ -72,6 +75,32 @@ jobs:
 
           authentication:
             module: "noop"
+
+      - name: Get Azure API key (access token)
+        if: matrix.environment == 'azure'
+        id: azure_token
+        env:
+          CLIENT_ID: ${{ secrets.CLIENT_ID }}
+          CLIENT_SECRET: ${{ secrets.CLIENT_SECRET }}
+          TENANT_ID: ${{ secrets.TENANT_ID }}
+        run: |
+          echo "Requesting Azure API token..."
+          RESPONSE=$(curl -s -X POST \
+            -H "Content-Type: application/x-www-form-urlencoded" \
+            -d "client_id=$CLIENT_ID&scope=https://cognitiveservices.azure.com/.default&client_secret=$CLIENT_SECRET&grant_type=client_credentials" \
+            "https://login.microsoftonline.com/$TENANT_ID/oauth2/v2.0/token")
+
+          echo "Response received. Extracting access_token..."
+          ACCESS_TOKEN=$(echo "$RESPONSE" | jq -r '.access_token')
+
+          if [ -z "$ACCESS_TOKEN" ] || [ "$ACCESS_TOKEN" == "null" ]; then
+            echo "❌ Failed to obtain Azure access token. Response:"
+            echo "$RESPONSE"
+            exit 1
+          fi
+
+          echo "✅ Successfully obtained Azure access token."
+          echo "AZURE_API_KEY=$ACCESS_TOKEN" >> $GITHUB_ENV
 
       - name: Select and configure run.yaml
         env:
```

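The workflow's token step can also be reproduced outside CI. The sketch below mirrors the same OAuth2 client-credentials exchange in Python: the token endpoint, scope, and null-check come from the curl/jq calls in the diff, while the function names (`build_token_request`, `extract_token`) are hypothetical helpers for illustration.

```python
"""Sketch of the workflow's Azure token acquisition in Python.

Endpoint and scope are taken from the workflow diff; the helper
functions themselves are illustrative, not part of the repository.
"""
import json
import urllib.parse

SCOPE = "https://cognitiveservices.azure.com/.default"


def build_token_request(tenant_id: str, client_id: str, client_secret: str):
    """Return (url, form_body) for the OAuth2 client-credentials call."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = urllib.parse.urlencode({
        "client_id": client_id,
        "scope": SCOPE,
        "client_secret": client_secret,
        "grant_type": "client_credentials",
    })
    return url, body


def extract_token(response_text: str) -> str:
    """Mirror the workflow's `jq -r '.access_token'` plus null check."""
    token = json.loads(response_text).get("access_token")
    if not token or token == "null":
        raise RuntimeError(f"Failed to obtain Azure access token: {response_text}")
    return token
```

An actual request would POST `body` to `url` (for example with `urllib.request` or `requests`) and feed the response text to `extract_token`, then export the result as `AZURE_API_KEY`.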
README.md

Lines changed: 2 additions & 0 deletions

```diff
@@ -125,6 +125,8 @@ Lightspeed Core Stack (LCS) supports the large language models from the provider
 | OpenAI | gpt-5, gpt-4o, gpt4-turbo, gpt-4.1, o1, o3, o4 | Yes | remote::openai | [1](examples/openai-faiss-run.yaml) [2](examples/openai-pgvector-run.yaml) |
 | OpenAI | gpt-3.5-turbo, gpt-4 | No | remote::openai | |
 | RHAIIS (vLLM)| meta-llama/Llama-3.1-8B-Instruct | Yes | remote::vllm | [1](tests/e2e/configs/run-rhaiis.yaml) |
+| Azure | gpt-5, gpt-5-mini, gpt-5-nano, gpt-5-chat, gpt-4.1, gpt-4.1-mini, gpt-4.1-nano, o3-mini, o4-mini | Yes | remote::azure | [1](examples/azure-run.yaml) |
+| Azure | o1, o1-mini | No | remote::azure | |
 
 The "provider_type" is used in the llama stack configuration file when refering to the provider.
```

docker-compose.yaml

Lines changed: 2 additions & 0 deletions

```diff
@@ -12,6 +12,7 @@ services:
       - ./run.yaml:/opt/app-root/run.yaml:Z
     environment:
       - OPENAI_API_KEY=${OPENAI_API_KEY}
+      - AZURE_API_KEY=${AZURE_API_KEY}
       - BRAVE_SEARCH_API_KEY=${BRAVE_SEARCH_API_KEY:-}
       - TAVILY_SEARCH_API_KEY=${TAVILY_SEARCH_API_KEY:-}
       - RHAIIS_URL=${RHAIIS_URL}
@@ -36,6 +37,7 @@ services:
       - ./lightspeed-stack.yaml:/app-root/lightspeed-stack.yaml:Z
     environment:
       - OPENAI_API_KEY=${OPENAI_API_KEY}
+      - AZURE_API_KEY=${AZURE_API_KEY}
     depends_on:
       llama-stack:
         condition: service_healthy
```

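Compose only forwards `AZURE_API_KEY` (like the other keys) from the caller's shell; if it is unset there, the container sees an empty value. A hypothetical preflight helper, not part of the repository, could fail fast before `docker compose up`:

```python
"""Preflight check for environment variables the compose file forwards.

The variable names come from the diff; the helper itself is a
hypothetical convenience, not part of the repository.
"""
import os

REQUIRED = ["OPENAI_API_KEY", "AZURE_API_KEY"]


def missing_vars(required, environ=os.environ):
    """Return the subset of `required` that is unset or empty."""
    return [name for name in required if not environ.get(name)]


if __name__ == "__main__":
    missing = missing_vars(REQUIRED)
    if missing:
        raise SystemExit(
            f"Set these before `docker compose up`: {', '.join(missing)}"
        )
```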
docs/providers.md

Lines changed: 2 additions & 2 deletions

```diff
@@ -36,7 +36,7 @@ The tables below summarize each provider category, containing the following atri
 | meta-reference | inline | `accelerate`, `fairscale`, `torch`, `torchvision`, `transformers`, `zmq`, `lm-format-enforcer`, `sentence-transformers`, `torchao==0.8.0`, `fbgemm-gpu-genai==1.1.2` ||
 | sentence-transformers | inline | `torch torchvision torchao>=0.12.0 --extra-index-url https://download.pytorch.org/whl/cpu`, `sentence-transformers --no-deps` ||
 | anthropic | remote | `litellm` ||
-| azure | remote | `itellm` | |
+| azure | remote | | |
 | bedrock | remote | `boto3` ||
 | cerebras | remote | `cerebras_cloud_sdk` ||
 | databricks | remote |||
@@ -287,4 +287,4 @@ Red Hat providers:
 
 ---
 
-For a deeper understanding, see the [official llama-stack configuration documentation](https://llama-stack.readthedocs.io/en/latest/distributions/configuration.html).
+For a deeper understanding, see the [official llama-stack providers documentation](https://llamastack.github.io/docs/providers).
```

examples/azure-run.yaml

Lines changed: 128 additions & 0 deletions (new file)

```yaml
version: '2'
image_name: minimal-viable-llama-stack-configuration

apis:
  - agents
  - datasetio
  - eval
  - files
  - inference
  - post_training
  - safety
  - scoring
  - telemetry
  - tool_runtime
  - vector_io
benchmarks: []
container_image: null
datasets: []
external_providers_dir: null
inference_store:
  db_path: .llama/distributions/ollama/inference_store.db
  type: sqlite
logging: null
metadata_store:
  db_path: .llama/distributions/ollama/registry.db
  namespace: null
  type: sqlite
providers:
  files:
    - provider_id: localfs
      provider_type: inline::localfs
      config:
        storage_dir: /tmp/llama-stack-files
        metadata_store:
          type: sqlite
          db_path: .llama/distributions/ollama/files_metadata.db
  agents:
    - provider_id: meta-reference
      provider_type: inline::meta-reference
      config:
        persistence_store:
          db_path: .llama/distributions/ollama/agents_store.db
          namespace: null
          type: sqlite
        responses_store:
          db_path: .llama/distributions/ollama/responses_store.db
          type: sqlite
  datasetio:
    - provider_id: huggingface
      provider_type: remote::huggingface
      config:
        kvstore:
          db_path: .llama/distributions/ollama/huggingface_datasetio.db
          namespace: null
          type: sqlite
    - provider_id: localfs
      provider_type: inline::localfs
      config:
        kvstore:
          db_path: .llama/distributions/ollama/localfs_datasetio.db
          namespace: null
          type: sqlite
  eval:
    - provider_id: meta-reference
      provider_type: inline::meta-reference
      config:
        kvstore:
          db_path: .llama/distributions/ollama/meta_reference_eval.db
          namespace: null
          type: sqlite
  inference:
    - provider_id: azure
      provider_type: remote::azure
      config:
        api_key: ${env.AZURE_API_KEY}
        api_base: https://ols-test.openai.azure.com/
        api_version: 2024-02-15-preview
        api_type: ${env.AZURE_API_TYPE:=}
  post_training:
    - provider_id: huggingface
      provider_type: inline::huggingface-gpu
      config:
        checkpoint_format: huggingface
        device: cpu
        distributed_backend: null
        dpo_output_dir: "."
  safety:
    - provider_id: llama-guard
      provider_type: inline::llama-guard
      config:
        excluded_categories: []
  scoring:
    - provider_id: basic
      provider_type: inline::basic
      config: {}
    - provider_id: llm-as-judge
      provider_type: inline::llm-as-judge
      config: {}
    - provider_id: braintrust
      provider_type: inline::braintrust
      config:
        openai_api_key: '********'
  telemetry:
    - provider_id: meta-reference
      provider_type: inline::meta-reference
      config:
        service_name: 'lightspeed-stack-telemetry'
        sinks: sqlite
        sqlite_db_path: .llama/distributions/ollama/trace_store.db
  tool_runtime:
    - provider_id: model-context-protocol
      provider_type: remote::model-context-protocol
      config: {}
scoring_fns: []
server:
  auth: null
  host: null
  port: 8321
  quota: null
  tls_cafile: null
  tls_certfile: null
  tls_keyfile: null
shields: []
models:
  - model_id: gpt-4o-mini
    model_type: llm
    provider_id: azure
    provider_model_id: gpt-4o-mini
```

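Both run configurations rely on llama-stack's `${env.NAME}` substitution syntax, with `${env.AZURE_API_TYPE:=}` falling back to an empty default. As an illustration of those semantics only (a rough approximation, not llama-stack's actual resolver), the substitution can be sketched as:

```python
"""Illustrative re-implementation of ${env.NAME} / ${env.NAME:=default}
substitution as used in the run.yaml files above. This approximates the
observed syntax; it is not llama-stack's actual resolver.
"""
import os
import re

_ENV_REF = re.compile(r"\$\{env\.([A-Za-z_][A-Za-z0-9_]*)(?::=([^}]*))?\}")


def resolve_env_refs(text: str, environ=os.environ) -> str:
    """Replace ${env.NAME} and ${env.NAME:=default} references in text.

    A missing variable with no default raises, mirroring the idea that
    required settings (like AZURE_API_KEY) must be provided.
    """
    def _sub(match: re.Match) -> str:
        name, default = match.group(1), match.group(2)
        if name in environ:
            return environ[name]
        if default is not None:  # ":=default" present, possibly empty
            return default
        raise KeyError(f"environment variable {name} is not set")

    return _ENV_REF.sub(_sub, text)
```

Under this reading, `api_key: ${env.AZURE_API_KEY}` fails loudly when the token step has not populated `AZURE_API_KEY`, while `api_type: ${env.AZURE_API_TYPE:=}` silently resolves to an empty string.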
tests/e2e/configs/run-azure.yaml

Lines changed: 128 additions & 0 deletions (new file)

```yaml
version: '2'
image_name: minimal-viable-llama-stack-configuration

apis:
  - agents
  - datasetio
  - eval
  - files
  - inference
  - post_training
  - safety
  - scoring
  - telemetry
  - tool_runtime
  - vector_io
benchmarks: []
container_image: null
datasets: []
external_providers_dir: null
inference_store:
  db_path: .llama/distributions/ollama/inference_store.db
  type: sqlite
logging: null
metadata_store:
  db_path: .llama/distributions/ollama/registry.db
  namespace: null
  type: sqlite
providers:
  files:
    - provider_id: localfs
      provider_type: inline::localfs
      config:
        storage_dir: /tmp/llama-stack-files
        metadata_store:
          type: sqlite
          db_path: .llama/distributions/ollama/files_metadata.db
  agents:
    - provider_id: meta-reference
      provider_type: inline::meta-reference
      config:
        persistence_store:
          db_path: .llama/distributions/ollama/agents_store.db
          namespace: null
          type: sqlite
        responses_store:
          db_path: .llama/distributions/ollama/responses_store.db
          type: sqlite
  datasetio:
    - provider_id: huggingface
      provider_type: remote::huggingface
      config:
        kvstore:
          db_path: .llama/distributions/ollama/huggingface_datasetio.db
          namespace: null
          type: sqlite
    - provider_id: localfs
      provider_type: inline::localfs
      config:
        kvstore:
          db_path: .llama/distributions/ollama/localfs_datasetio.db
          namespace: null
          type: sqlite
  eval:
    - provider_id: meta-reference
      provider_type: inline::meta-reference
      config:
        kvstore:
          db_path: .llama/distributions/ollama/meta_reference_eval.db
          namespace: null
          type: sqlite
  inference:
    - provider_id: azure
      provider_type: remote::azure
      config:
        api_key: ${env.AZURE_API_KEY}
        api_base: https://ols-test.openai.azure.com/
        api_version: 2024-02-15-preview
        api_type: ${env.AZURE_API_TYPE:=}
  post_training:
    - provider_id: huggingface
      provider_type: inline::huggingface-gpu
      config:
        checkpoint_format: huggingface
        device: cpu
        distributed_backend: null
        dpo_output_dir: "."
  safety:
    - provider_id: llama-guard
      provider_type: inline::llama-guard
      config:
        excluded_categories: []
  scoring:
    - provider_id: basic
      provider_type: inline::basic
      config: {}
    - provider_id: llm-as-judge
      provider_type: inline::llm-as-judge
      config: {}
    - provider_id: braintrust
      provider_type: inline::braintrust
      config:
        openai_api_key: '********'
  telemetry:
    - provider_id: meta-reference
      provider_type: inline::meta-reference
      config:
        service_name: 'lightspeed-stack-telemetry'
        sinks: sqlite
        sqlite_db_path: .llama/distributions/ollama/trace_store.db
  tool_runtime:
    - provider_id: model-context-protocol
      provider_type: remote::model-context-protocol
      config: {}
scoring_fns: []
server:
  auth: null
  host: null
  port: 8321
  quota: null
  tls_cafile: null
  tls_certfile: null
  tls_keyfile: null
shields: []
models:
  - model_id: gpt-4o-mini
    model_type: llm
    provider_id: azure
    provider_model_id: gpt-4o-mini
```
