Commit 3dd7d90

Merge pull request #741 from aws-samples/feature/ai-quality-assurance
Added AI quality assurance examples
2 parents 5ff8add + 58a62f7 commit 3dd7d90

38 files changed: +2825 -631 lines changed

Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
+HELP.md
+target/
+.mvn/wrapper/maven-wrapper.jar
+!**/src/main/**/target/
+!**/src/test/**/target/
+
+### STS ###
+.apt_generated
+.classpath
+.factorypath
+.project
+.settings
+.springBeans
+.sts4-cache
+
+### IntelliJ IDEA ###
+.idea
+*.iws
+*.iml
+*.ipr
+
+### NetBeans ###
+/nbproject/private/
+/nbbuild/
+/dist/
+/nbdist/
+/.nb-gradle/
+build/
+!**/src/main/**/build/
+!**/src/test/**/build/
+
+### VS Code ###
+.vscode/
Lines changed: 133 additions & 0 deletions
@@ -0,0 +1,133 @@
+# Spring AI Agent
+
+An AI-powered agent built with the Spring AI framework, featuring weather forecasting and secure OAuth 2.0 integration.
+
+## Related Documentation
+
+This project is part of a larger microservices ecosystem:
+
+- [Weather Service Documentation](../weather/README.md) - Weather forecast service with global coverage
+
+## Project Overview
+
+### Description
+
+The Spring AI Agent demonstrates how to build AI-powered applications with the Spring AI framework. It provides:
+
+- Weather forecasts for any city worldwide
+- Integration with external weather APIs
+- Model Context Protocol (MCP) client for connecting to weather services
+- Secure OAuth authentication and authorization
+
+The application is the central component in a microservices architecture and connects to the Weather service through the Model Context Protocol (MCP).
+
+### Purpose
+
+This application serves as:
+
+1. A reference implementation for Spring AI integration with weather services
+2. A demonstration of secure AI application patterns with OAuth
+3. A practical example of building weather assistants with Spring Boot
+4. A showcase for integrating with Amazon Bedrock and weather APIs
+
+### Technology Stack
+
+- **Java 21**: LTS release with modern language features
+- **Spring Boot 3.5.7**: Core framework for building the application
+- **Spring AI 1.0.3**: AI integration framework
+- **Spring Security**: OAuth 2.0 authentication and authorization
+- **Amazon Bedrock**: AI model provider (Claude Sonnet 4)
+- **Docker**: Containerization for the application
+
+## Security
+
+### OAuth 2.0 Integration
+
+The application uses OAuth 2.0 for authentication and authorization:
+
+- **Authorization Server**: Integration with the OAuth 2.0 authorization server (port 9000)
+- **Resource Protection**: API endpoints secured with JWT tokens
+- **Token Validation**: Automatic JWT token validation and user context
+
+## Getting Started
+
+### Prerequisites
+
+- Java 21 or higher
+- Maven 3.8 or higher
+- AWS account with Amazon Bedrock access
+
+### Prerequisites for Full Functionality
+
+Before starting the AI agent, make sure the services it depends on are running:
+
+1. **Start Authorization Server** (port 9000):
+   ```bash
+   cd ../authorization-server/
+   mvn spring-boot:run
+   ```
+
+2. **Start Weather Service** (port 8083):
+   ```bash
+   cd ../weather/
+   mvn spring-boot:run
+   ```
+
+These services provide the OAuth authentication and the weather forecasting tools that the AI agent uses.
+
+#### Running the AI Agent
+
+```bash
+cd ai-agent/
+mvn spring-boot:run
+```
+
+This will:
+- Configure secure endpoints for weather data access
+- Connect to the weather service via MCP for authenticated users only
+- Connect to the authorization server for OAuth authentication
+- Start the application on port 8080
+
+#### Access Points
+
+Once all applications are running, you can access:
+
+- **Main Application**: `http://localhost:8080/`
+
+### AWS Configuration
+
+1. Configure AWS credentials:
+   ```bash
+   aws configure
+   ```
+
+2. Ensure you have access to Amazon Bedrock and the required models (Claude Sonnet 4).
+
+### Building and Running the Application
+
+1. **Standard Build and Run:**
+   ```bash
+   cd ai-agent/
+   mvn clean package
+   mvn spring-boot:run
+   ```
+
+2. The application will be available at:
+   ```
+   http://localhost:8080/
+   ```
+
+### Authentication Flow
+
+1. Navigate to `http://localhost:9000/` (authorization server)
+2. Authenticate with your credentials
+3. Exchange the authorization code for an access token
+4. Call the weather endpoints, passing the access token as a Bearer token
+
+## Contributing
+
+Contributions are welcome! Please feel free to submit a Pull Request.
+
+## License
+
+This project is licensed under the MIT License - see the LICENSE file for details.
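Steps 3 and 4 of the Authentication Flow in the README above can be sketched as two HTTP requests. This is only an illustration under stated assumptions: the token path `/oauth2/token`, the client ID and secret, and the redirect URI are not given anywhere in this commit and are hypothetical placeholders. Only the two base URLs (ports 9000 and 8080) come from the README. The request shape follows the standard OAuth 2.0 authorization code grant with HTTP Basic client authentication.

```python
import base64
import urllib.parse
import urllib.request

AUTH_SERVER = "http://localhost:9000"  # from the README
AGENT = "http://localhost:8080"        # from the README
TOKEN_PATH = "/oauth2/token"           # assumption: not stated in this commit


def build_token_request(code, redirect_uri, client_id, client_secret):
    """Build the POST that exchanges an authorization code for an access token.

    client_id, client_secret and redirect_uri are hypothetical values that
    must match whatever client is registered with the authorization server.
    """
    form = urllib.parse.urlencode({
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": redirect_uri,
    }).encode("ascii")
    basic = base64.b64encode(
        f"{client_id}:{client_secret}".encode("utf-8")
    ).decode("ascii")
    return urllib.request.Request(
        url=AUTH_SERVER + TOKEN_PATH,
        data=form,
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


def build_bearer_request(path, access_token):
    """Build a GET against the agent that carries the token as a Bearer header."""
    return urllib.request.Request(
        url=AGENT + path,
        headers={"Authorization": f"Bearer {access_token}"},
        method="GET",
    )
```

Either request can be sent with `urllib.request.urlopen(req)`. The token response is expected to be JSON containing an `access_token` field, which is then passed to `build_bearer_request`.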
Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
+FROM python:3.11-slim
+
+WORKDIR /app
+
+RUN pip install flask
+
+COPY deepeval_service.py .
+
+EXPOSE 8080
+
+CMD ["python", "deepeval_service.py"]
Lines changed: 43 additions & 0 deletions
@@ -0,0 +1,43 @@
+# DeepEval Service
+
+This folder contains the evaluation service used to test AI responses for relevancy.
+
+## Files
+- `Dockerfile` - Docker image definition for the DeepEval service
+- `deepeval_service.py` - Flask REST API service that computes a keyword-based relevancy score
+- `README.md` - This file
+
+## Build and Run
+
+```bash
+# Build the Docker image
+cd deep-eval
+docker build -t deepeval-service:latest .
+
+# Run manually (optional)
+docker run -p 8080:8080 deepeval-service:latest
+```
+
+## API Endpoints
+
+- `GET /health` - Health check
+- `POST /evaluate` - Evaluate response relevancy
+
+### Evaluate Request
+```json
+{
+  "question": "What is AI?",
+  "response": "AI is artificial intelligence...",
+  "threshold": 0.3
+}
+```
+
+### Evaluate Response
+```json
+{
+  "score": 0.8,
+  "success": true,
+  "threshold": 0.3,
+  "reason": "Exact matches: 1/2, Partial matches: 0"
+}
+```
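A minimal client for the request and response shapes documented in the README above might look like the following sketch. It uses only the Python standard library. The base URL assumes the container was started with the `docker run -p 8080:8080` command shown there. The function names are illustrative and not part of this commit.

```python
import json
import urllib.request

# Assumption: the service is reachable on localhost:8080, matching the
# `docker run -p 8080:8080` mapping in the README.
BASE_URL = "http://localhost:8080"


def build_evaluate_request(question, response, threshold=0.3, base_url=BASE_URL):
    """Build the POST /evaluate request with a JSON body."""
    payload = {"question": question, "response": response, "threshold": threshold}
    return urllib.request.Request(
        url=f"{base_url}/evaluate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def evaluate(question, response, threshold=0.3, base_url=BASE_URL, timeout=10):
    """Send the request and return the decoded JSON result as a dict.

    urlopen raises urllib.error.HTTPError for 4xx/5xx replies, for example
    the 400 the service returns when question or response is empty.
    """
    req = build_evaluate_request(question, response, threshold, base_url)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the service running, `evaluate("What is AI?", "AI is artificial intelligence...")["success"]` returns the pass/fail flag. The service code in this commit also returns `metric_type` and `model` fields alongside the four shown in the README. If the AI agent is already bound to port 8080 on the same host, the container needs a different host port, for example `-p 8081:8080`, with `base_url` adjusted to match.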
Lines changed: 83 additions & 0 deletions
@@ -0,0 +1,83 @@
+from flask import Flask, request, jsonify
+import logging
+import json
+import re
+
+app = Flask(__name__)
+logging.basicConfig(level=logging.INFO)
+
+@app.route('/health', methods=['GET'])
+def health():
+    return jsonify({"status": "healthy"})
+
+@app.route('/evaluate', methods=['POST'])
+def evaluate():
+    try:
+        # Let Flask parse the body when the Content-Type is JSON
+        if request.is_json:
+            data = request.get_json()
+        else:
+            # Fallback: strip control characters, then parse manually
+            raw_data = request.get_data(as_text=True)
+            sanitized_data = re.sub(r'[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]', '', raw_data)
+            data = json.loads(sanitized_data)
+
+        question = str(data.get('question', '')).strip()
+        response = str(data.get('response', '')).strip()
+        threshold = float(data.get('threshold', 0.3))
+
+        # Drop control characters below 0x20, keeping tab, newline and carriage return
+        question = ''.join(char for char in question if ord(char) >= 32 or char in '\t\n\r')
+        response = ''.join(char for char in response if ord(char) >= 32 or char in '\t\n\r')
+
+        app.logger.info(f"Evaluating - Question: {question[:50]}..., Response: {response[:50]}...")
+
+        if not question or not response:
+            return jsonify({"error": "question and response are required"}), 400
+
+        # Keyword-overlap relevancy scoring on lowercased word sets
+        question_words = set(re.findall(r'\w+', question.lower()))
+        response_words = set(re.findall(r'\w+', response.lower()))
+
+        # Remove common stop words
+        stop_words = {'the', 'a', 'an', 'and', 'or', 'but', 'in', 'on', 'at', 'to', 'for', 'of', 'with', 'by', 'is', 'are', 'was', 'were'}
+        question_words -= stop_words
+        response_words -= stop_words
+
+        if not question_words:
+            score = 0.5
+            reason = "No meaningful words in question"
+        else:
+            # Count question words found in the response, exactly or as substrings
+            exact_matches = len(question_words.intersection(response_words))
+            partial_matches = sum(1 for qw in question_words
+                                  if any(qw in rw or rw in qw for rw in response_words))
+
+            # Score = exact ratio + 0.3 * partial-only ratio + length bonus (max 0.2) + 0.1 base
+            exact_score = exact_matches / len(question_words)
+            partial_score = (partial_matches - exact_matches) / len(question_words) * 0.3
+            length_bonus = min(len(response.split()) / 20, 0.2)
+
+            score = min(exact_score + partial_score + length_bonus + 0.1, 1.0)
+            reason = f"Exact matches: {exact_matches}/{len(question_words)}, Partial matches: {partial_matches - exact_matches}"
+
+        success = score >= threshold
+
+        result = {
+            "score": round(score, 2),
+            "success": success,
+            "threshold": threshold,
+            "reason": reason,
+            "metric_type": "Enhanced Keyword Analysis",
+            "model": "keyword-based-evaluator"
+        }
+
+        app.logger.info(f"Evaluation result: score={result['score']}, success={result['success']}")
+        return jsonify(result)
+
+    except Exception as e:
+        app.logger.error(f"Evaluation error: {e}")
+        return jsonify({"error": f"Evaluation failed: {e}"}), 500
+
+if __name__ == '__main__':
+    app.run(host='0.0.0.0', port=8080, debug=False)  # debug mode is unsafe when bound to 0.0.0.0
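The scoring arithmetic inside `evaluate()` is easier to inspect when pulled out of the request handler. The sketch below restates it as a pure function so it can be exercised without Flask. The function name `keyword_relevancy` and the module-level `STOP_WORDS` constant are illustrative and not part of this commit. The stop-word list and the weights are copied from the service code above.

```python
import re

# Stop-word list copied from deepeval_service.py.
STOP_WORDS = {'the', 'a', 'an', 'and', 'or', 'but', 'in', 'on', 'at', 'to',
              'for', 'of', 'with', 'by', 'is', 'are', 'was', 'were'}


def keyword_relevancy(question: str, response: str) -> tuple[float, str]:
    """Return (rounded score, reason) using the same arithmetic as evaluate()."""
    question_words = set(re.findall(r'\w+', question.lower())) - STOP_WORDS
    response_words = set(re.findall(r'\w+', response.lower())) - STOP_WORDS

    if not question_words:
        return 0.5, "No meaningful words in question"

    # Question words that appear verbatim in the response.
    exact = len(question_words & response_words)
    # Question words that contain, or are contained in, some response word.
    partial = sum(1 for qw in question_words
                  if any(qw in rw or rw in qw for rw in response_words))

    exact_score = exact / len(question_words)
    partial_score = (partial - exact) / len(question_words) * 0.3
    length_bonus = min(len(response.split()) / 20, 0.2)

    score = min(exact_score + partial_score + length_bonus + 0.1, 1.0)
    reason = (f"Exact matches: {exact}/{len(question_words)}, "
              f"Partial matches: {partial - exact}")
    return round(score, 2), reason


if __name__ == "__main__":
    print(keyword_relevancy("What is AI?", "AI is artificial intelligence..."))
    print(keyword_relevancy("weather Paris", "green bananas grow slowly"))
```

The second call illustrates a property of the weights: the 0.1 base plus the full 0.2 length bonus means a response of four or more words reaches 0.3 even with no word overlap at all, which equals the service's default threshold. Callers that rely on the `success` flag may therefore want to pass a higher `threshold`. Note that the service compares the unrounded score against the threshold, while this sketch returns the rounded value.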
