Commit 41e45e2

Merge pull request #93 from Rat-OS/feature/comprehensive-error-logging-system
feat: add comprehensive error logging system with unified architecture
2 parents 1ef7016 + 3a7c1b6 commit 41e45e2

188 files changed: +3825 -236 lines changed

.augment-guidelines

Lines changed: 28 additions & 1 deletion
@@ -1,2 +1,29 @@
1+
# General
2+
3+
- When starting a new terminal, run `nix-shell -p pnpm bun nodejs` and wait until you've entered the nix shell.
14
- Use `pnpx` instead of `npx`
2-
- Use `pnpm run test` in the `src` folder to run the tests once, without watching for changes.
5+
- Always run the tests before committing or pushing.
6+
7+
# Linting
8+
9+
- Use `pnpm run lint` in the `src` folder to run the linter once, without watching for changes.
10+
- Use `pnpm run lint:fix` in the `src` folder to run the linter once, without watching for changes, and output the results in a format that can be consumed by CI tools.
11+
12+
# Testing
13+
14+
- DO NOT EVER mock zod schemas.
15+
- Use `pnpm run test` in the `src` folder to run the tests once, without watching for changes.
16+
- Use `pnpm run test:nopp` for general use, **unless** you intentionally want to run the slow post-processor tests.
17+
18+
# Bash
19+
20+
- Always run ShellCheck linting before committing or pushing.
21+
22+
# Typescript
23+
24+
- Always run `pnpm run typecheck` and `pnpm run lint:fix` before committing or pushing.
25+
26+
# Styling
27+
28+
- Use tailwind classes.
29+
- IMPORTANT: the design always runs in dark mode, so text colors should be bright, not dark.

.augment/env/setup.sh

Lines changed: 2 additions & 2 deletions
@@ -1,4 +1,4 @@
1-
#!/bin/bash
1+
#!/usr/bin/env bash
22

33
KLIPPER_DIR=/mnt/persist/klipper
44
KLIPPER_ENV_DIR=/mnt/persist/klipper-env
@@ -47,7 +47,7 @@ RATOS_CONFIGURATION_PATH="$CONFIGURATOR_ROOT_DIR/configuration"
4747
KLIPPER_CONFIG_PATH="$PRINTER_DATA_DIR/config"
4848
RATOS_SCRIPT_DIR="$CONFIGURATOR_ROOT_DIR/src/scripts"
4949
KLIPPER_DIR="$KLIPPER_DIR"
50-
KLIPPER_ENV_DIR="$KLIPPER_ENV_DIR"
50+
KLIPPER_ENV="$KLIPPER_ENV_DIR"
5151
MOONRAKER_DIR="$MOONRAKER_DIR"
5252
LOG_FILE="$PRINTER_DATA_DIR/logs/ratos-configurator.log"
5353
RATOS_DATA_DIR="$PRINTER_DATA_DIR/ratos-data"

LOGGING_SYSTEM.md

Lines changed: 308 additions & 0 deletions
@@ -0,0 +1,308 @@
1+
# RatOS Unified Logging System
2+
3+
This document describes the comprehensive unified logging system implemented for the RatOS-configurator project. The system consolidates all RatOS logs into a single main log file while providing specialized tools for viewing and analyzing logs from different sources, including update scripts and other system operations.
4+
5+
## Overview
6+
7+
The unified logging system consists of four main components:
8+
9+
1. **Structured Bash Logging Library** - Captures errors from shell scripts in JSON format, writing to the main RatOS log
10+
2. **CLI Log Management Commands** - Command-line tools for viewing and analyzing logs with source filtering
11+
3. **Web UI Integration** - Browser-based log viewer with filtering and analysis capabilities
12+
4. **Debug Integration** - Automatic inclusion of logs in debug packages
13+
14+
## Architecture
15+
16+
### 1. Bash Logging Library (`configuration/scripts/ratos-logging.sh`)
17+
18+
The bash logging library provides structured logging capabilities for shell scripts, outputting logs in JSON format compatible with the pino logging system used throughout the application. **All logs are written to the main RatOS log file** (`/var/log/ratos-configurator.log`) with a `source: "ratos-update"` field for filtering.
19+
20+
#### Features:
21+
- **JSON-formatted logs** compatible with pino
22+
- **Multiple log levels**: trace, debug, info, warn, error, fatal
23+
- **Unified log file** - writes to main RatOS log instead of separate files
24+
- **Source identification** - all entries tagged with `source: "ratos-update"`
25+
- **Error trapping** with stack trace capture
26+
- **Command execution logging** with automatic error handling
27+
- **Timestamped entries** with process information
28+
29+
#### Usage Example:
30+
```bash
#!/bin/bash
source "$(dirname "$0")/ratos-logging.sh"

# Set up error trapping
setup_error_trap "my-script"

# Log script start
log_script_start "my-script.sh" "1.0.0"

# Log various levels
log_info "Starting operation" "main"
log_warn "This is a warning" "main" "WARN_CODE"
log_error "This is an error" "main" "ERROR_CODE"

# Execute commands with logging
execute_with_logging "apt-get update" "package_update" "APT_UPDATE_FAILED"

# Log script completion
log_script_complete "my-script.sh" $?
```
51+
52+
#### Configuration:
53+
- `RATOS_LOG_LEVEL`: Set minimum log level (default: info)
54+
- `RATOS_LOG_FILE`: Log file path (default: uses `${LOG_FILE}` from environment, typically `/var/log/ratos-configurator.log`)
55+
- `RATOS_LOG_MAX_SIZE`: Maximum log file size before rotation (default: 0 = disabled when using main log)
56+
- `RATOS_LOG_BACKUP_COUNT`: Number of backup files to keep (default: 0 = disabled when using main log)
57+
58+
**Note**: When using the unified logging system, log rotation is handled by the main RatOS log configuration, not by individual scripts.
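A minimal sketch of how a `RATOS_LOG_LEVEL` threshold could be applied, assuming the level names map onto the pino numeric levels listed under Log Format. The helper names `level_to_number` and `should_log` are hypothetical and are not taken from `ratos-logging.sh`; the fallback to `info` for unknown names is also an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; not the library's actual functions.

# Map a level name to its pino numeric value.
level_to_number() {
  case "$1" in
    trace) echo 10 ;;
    debug) echo 20 ;;
    info)  echo 30 ;;
    warn)  echo 40 ;;
    error) echo 50 ;;
    fatal) echo 60 ;;
    *)     echo 30 ;;  # assumption: unknown names fall back to info
  esac
}

# Succeed (exit status 0) when a message at level $1 meets the configured minimum.
should_log() {
  local min_level="${RATOS_LOG_LEVEL:-info}"
  [ "$(level_to_number "$1")" -ge "$(level_to_number "$min_level")" ]
}
```

With `RATOS_LOG_LEVEL=warn`, `should_log error` succeeds while `should_log info` fails, so a caller could guard each write with `should_log "$level" || return 0`.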
59+
60+
### 2. CLI Log Management (`src/cli/commands/update-logs.tsx`)
61+
62+
The CLI provides several commands for viewing and analyzing update logs. **Update logs are now a subcommand of the main `logs` command** and automatically filter the main log file to show only entries with `source: "ratos-update"`.
63+
64+
#### Commands:
65+
66+
**`ratos logs update-logs summary`**
67+
- Shows a summary of the most recent update attempt from the main log
68+
- Displays success/failure status, error counts, and timing information
69+
- Automatically filters by `source: "ratos-update"`
70+
71+
**`ratos logs update-logs show`**
72+
- Shows detailed log entries with filtering options from the main log
73+
- Options:
74+
- `-n, --lines <number>`: Number of recent lines to show (default: 50)
75+
- `-l, --level <level>`: Minimum log level (trace, debug, info, warn, error, fatal)
76+
- `-c, --context <context>`: Filter by context
77+
- `-d, --details`: Show detailed information
78+
79+
**`ratos logs update-logs errors`**
80+
- Shows only errors and warnings from the most recent update
81+
- Options:
82+
- `-d, --details`: Show detailed information
83+
84+
#### Usage Examples:
85+
```bash
# Show update summary (note the new command structure)
ratos logs update-logs summary

# Show last 100 log entries at debug level
ratos logs update-logs show -n 100 -l debug

# Show only errors with details
ratos logs update-logs errors -d

# Show logs from specific context
ratos logs update-logs show -c "update_symlinks" -d

# Other log commands remain available:
ratos logs tail # Tail the main log file
ratos logs rotate # Force log rotation
```
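For illustration, the source and level filtering that `update-logs show` and `update-logs errors` perform can be approximated in plain bash. This is a sketch under assumptions: `filter_update_entries` is a hypothetical helper, it matches strings rather than parsing JSON, and the actual CLI is implemented in TypeScript (`src/cli/commands/update-logs.tsx`).

```shell
#!/usr/bin/env bash
# Hypothetical sketch; approximates the CLI's filtering with regex matching.

# filter_update_entries <min_level> < logfile
# Prints lines tagged with source "ratos-update" whose numeric level is >= min_level.
filter_update_entries() {
  local min_level="$1" line level
  local source_re='"source":[[:space:]]*"ratos-update"'
  local level_re='"level":[[:space:]]*([0-9]+)'
  while IFS= read -r line || [ -n "$line" ]; do
    # Keep only entries tagged with the update source.
    [[ "$line" =~ $source_re ]] || continue
    # Extract the numeric pino level; skip lines that do not carry one.
    [[ "$line" =~ $level_re ]] || continue
    level="${BASH_REMATCH[1]}"
    if [ "$level" -ge "$min_level" ]; then
      printf '%s\n' "$line"
    fi
  done
}
```

For example, `filter_update_entries 40 < /var/log/ratos-configurator.log` would print only warnings and errors from update scripts, which is roughly what the `errors` subcommand is described as showing.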
102+
103+
### 3. Web UI Integration
104+
105+
The web interface provides a comprehensive log viewer accessible at `/configure/update-logs`.
106+
107+
#### Features:
108+
- **Log Summary Dashboard**: Overview of recent update attempts
109+
- **Interactive Log Viewer**: Browse and filter log entries
110+
- **Real-time Filtering**: Filter by log level, context, and search terms
111+
- **Error Highlighting**: Visual distinction for different log levels
112+
- **Download Capability**: Download raw log files
113+
- **Auto-refresh**: Automatic updates when new logs are available
114+
115+
#### Components:
116+
- `UpdateLogsViewer`: Main component for displaying logs
117+
- `UpdateLogsErrorBoundary`: Error boundary for graceful error handling
118+
- `LogSummaryCard`: Summary statistics and controls
119+
- `LogEntryComponent`: Individual log entry display
120+
121+
### 4. API Endpoints
122+
123+
#### TRPC Endpoints (`src/server/routers/update-logs.ts`):
124+
- `update-logs.summary`: Get log summary statistics (filtered by `source: "ratos-update"`)
125+
- `update-logs.entries`: Get filtered log entries (filtered by `source: "ratos-update"`)
126+
- `update-logs.errors`: Get only errors and warnings (filtered by `source: "ratos-update"`)
127+
- `update-logs.contexts`: Get available log contexts (filtered by `source: "ratos-update"`)
128+
- `update-logs.clear`: **Disabled** - Cannot clear main log file (use log rotation instead)
129+
- `update-logs.download`: Download main log file (contains all sources)
130+
131+
#### REST Endpoints:
132+
- `GET /api/update-logs/download`: Download log file as attachment
133+
134+
### 5. Debug Integration
135+
136+
Update logs are automatically included in debug packages as part of the main log file:
137+
- Main log file (`/var/log/ratos-configurator.log`) is added to debug packages
138+
- Rotated log files (`.1`, `.2`, etc.) are included
139+
- All log sources (including update logs) are included in a single file
140+
- Logs are categorized appropriately in the debug package
141+
142+
## Log Format
143+
144+
All logs follow a consistent JSON format:
145+
146+
```json
{
  "level": 30,
  "time": "2024-01-01T10:00:00.000Z",
  "msg": "Log message",
  "source": "ratos-update",
  "context": "update_symlinks",
  "errorCode": "SYMLINK_CREATE_FAILED",
  "pid": 1234,
  "hostname": "ratos-pi"
}
```
158+
159+
### Fields:
160+
- `level`: Numeric log level (10=trace, 20=debug, 30=info, 40=warn, 50=error, 60=fatal)
161+
- `time`: ISO 8601 timestamp
162+
- `msg`: Human-readable log message
163+
- `source`: Source component (e.g., "ratos-update")
164+
- `context`: Function or operation context (optional)
165+
- `errorCode`: Standardized error code (optional)
166+
- `pid`: Process ID
167+
- `hostname`: System hostname
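A minimal sketch of how a shell function could emit one entry in this format, assuming one compact JSON object per line. The names `json_escape` and `log_json` are hypothetical, the timestamp uses second precision with the milliseconds fixed at `000`, and the hostname is taken from bash's `HOSTNAME` variable; none of this is confirmed to match `ratos-logging.sh`.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; function names are illustrative.

# Escape backslashes, double quotes, and common control characters for JSON strings.
json_escape() {
  local s=$1
  s=${s//\\/\\\\}
  s=${s//\"/\\\"}
  s=${s//$'\n'/\\n}
  s=${s//$'\t'/\\t}
  s=${s//$'\r'/\\r}
  printf '%s' "$s"
}

# log_json <numeric level> <msg> [context] [errorCode]
# Prints a single JSON line; optional fields are omitted when empty.
log_json() {
  local level="$1" msg="$2" context="${3:-}" error_code="${4:-}"
  local time line
  time="$(date -u +%Y-%m-%dT%H:%M:%S.000Z)"
  line="{\"level\":${level},\"time\":\"${time}\",\"msg\":\"$(json_escape "$msg")\""
  line+=",\"source\":\"ratos-update\""
  if [ -n "$context" ]; then
    line+=",\"context\":\"$(json_escape "$context")\""
  fi
  if [ -n "$error_code" ]; then
    line+=",\"errorCode\":\"$(json_escape "$error_code")\""
  fi
  line+=",\"pid\":$$,\"hostname\":\"$(json_escape "${HOSTNAME:-localhost}")\"}"
  printf '%s\n' "$line"
}
```

A caller would append the output to the log, for example `log_json 50 "Failed to create symlink" "update_symlinks" "SYMLINK_CREATE_FAILED" >> "$RATOS_LOG_FILE"`.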
168+
169+
## Error Codes
170+
171+
Standardized error codes help identify common issues:
172+
173+
### Update Script Error Codes:
174+
- `SCRIPT_ERROR`: General script failure
175+
- `SCRIPT_SUCCESS`: Script completed successfully
176+
- `SYMLINK_CREATE_FAILED`: Failed to create symbolic link
177+
- `SYMLINK_REMOVE_FAILED`: Failed to remove symbolic link
178+
- `NODE_INSTALL_FAILED`: Node.js installation failed
179+
- `APT_UPDATE_FAILED`: Package list update failed
180+
- `EXTENSION_SYMLINK_FAILED`: Extension symlinking failed
181+
- `OWNERSHIP_CHANGE_FAILED`: File ownership change failed
182+
183+
### System Error Codes:
184+
- `FILE_NOT_FOUND`: Required file not found
185+
- `PERMISSION_DENIED`: Insufficient permissions
186+
- `NETWORK_ERROR`: Network connectivity issue
187+
- `DISK_FULL`: Insufficient disk space
188+
189+
## Error Handling and Retry Logic
190+
191+
### Bash Scripts:
192+
- Automatic error trapping with `set -eE`
193+
- Stack trace capture on script failure
194+
- Graceful error reporting with context
195+
- Exit codes indicate success/failure status
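The trapping behaviour listed above can be sketched as follows. This is an illustration of the general `set -eE` plus `trap ... ERR` technique in bash, not the library's actual `setup_error_trap` implementation; the handler name `on_error` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of ERR-trap based error reporting.
set -eE  # -e: exit on error; -E: functions and subshells inherit the ERR trap

# Report the failing command's exit status and the call stack to stderr.
on_error() {
  local exit_code="$1" i
  echo "ERROR: command exited with status ${exit_code}" >&2
  # FUNCNAME, BASH_SOURCE and BASH_LINENO describe the call stack;
  # index 0 is on_error itself, so the walk starts at 1.
  for ((i = 1; i < ${#FUNCNAME[@]}; i++)); do
    echo "  at ${FUNCNAME[$i]} (${BASH_SOURCE[$i]}:${BASH_LINENO[$((i - 1))]})" >&2
  done
}

trap 'on_error "$?"' ERR
```

Because of `-E`, a failure inside a function triggers the trap with that function still on the stack, which is what makes the stack trace useful.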
196+
197+
### Web UI:
198+
- Error boundaries prevent UI crashes
199+
- Automatic retry with exponential backoff
200+
- Graceful degradation when logs unavailable
201+
- User-friendly error messages
202+
203+
### CLI:
204+
- Robust error handling for missing files
205+
- Clear error messages with suggested actions
206+
- Non-zero exit codes for scripting
207+
208+
## Monitoring and Alerting
209+
210+
### Log Rotation:
211+
- Automatic rotation when files exceed 10MB
212+
- Keeps 5 backup files by default
213+
- Configurable via environment variables
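A minimal sketch of size-based rotation with a fixed number of backups, as described in the list above. The function `rotate_if_needed` and its parameters are hypothetical; when scripts write to the unified main log, rotation is handled by the main RatOS log configuration rather than by code like this.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of size-based log rotation; names are illustrative.

# rotate_if_needed <file> <max_bytes> <backup_count>
# Once <file> exceeds <max_bytes>, shifts file.1 -> file.2 and so on,
# moves <file> to file.1, and starts a fresh empty log.
rotate_if_needed() {
  local file="$1" max_bytes="$2" backups="$3" size i
  [ -f "$file" ] || return 0
  size=$(( $(wc -c < "$file") ))
  [ "$size" -gt "$max_bytes" ] || return 0
  if [ "$backups" -lt 1 ]; then
    : > "$file"  # no backups requested: just truncate
    return 0
  fi
  # Drop the oldest backup, then shift the remaining ones up by one.
  rm -f "${file}.${backups}"
  for ((i = backups - 1; i >= 1; i--)); do
    if [ -f "${file}.${i}" ]; then
      mv "${file}.${i}" "${file}.$((i + 1))"
    fi
  done
  mv "$file" "${file}.1"
  : > "$file"
}
```

With the defaults named above, a call would look like `rotate_if_needed "$RATOS_LOG_FILE" $((10 * 1024 * 1024)) 5`.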
214+
215+
### Performance:
216+
- Efficient JSON parsing with error recovery
217+
- Indexed log entries for fast filtering
218+
- Lazy loading for large log files
219+
220+
## Troubleshooting
221+
222+
### Common Issues:
223+
224+
**Log file not found:**
225+
- Ensure update scripts have been run at least once
226+
- Check the `RATOS_LOG_FILE` and `LOG_FILE` environment variables
227+
- Verify directory permissions
228+
229+
**Permission errors:**
230+
- Ensure log directory is writable by the RatOS user
231+
- Check file ownership and permissions
232+
- Run scripts with appropriate privileges
233+
234+
**Large log files:**
235+
- Log rotation should handle this automatically
236+
- Clearing the main log is disabled; force a rotation with `ratos logs rotate` instead
237+
- Adjust `RATOS_LOG_MAX_SIZE` if needed
238+
239+
**Missing log entries:**
240+
- Check `RATOS_LOG_LEVEL` setting
241+
- Ensure scripts are using the logging library correctly
242+
- Verify JSON format of log entries
243+
244+
### Debug Commands:
245+
```bash
# Check main log file location and size
ls -la /var/log/ratos-configurator.log*

# View raw log file (all sources)
cat /var/log/ratos-configurator.log

# View only update logs
grep '"source":"ratos-update"' /var/log/ratos-configurator.log

# Test log parsing
ratos logs update-logs summary

# Force log rotation (instead of clearing)
ratos logs rotate

# Validate bash scripts with ShellCheck
shellcheck -ax -s bash configuration/scripts/ratos-logging.sh
shellcheck -ax -s bash configuration/scripts/ratos-update.sh
```
265+
266+
## Development
267+
268+
### Adding New Log Sources:
269+
1. Source the logging library: `source "$(dirname "$0")/ratos-logging.sh"`
270+
2. Set up error trapping: `setup_error_trap "script-name"`
271+
3. Use logging functions: `log_info`, `log_error`, etc.
272+
4. Add appropriate error codes to documentation
273+
274+
### Code Quality Standards:
275+
- **ShellCheck Compliance**: All bash scripts must pass ShellCheck validation
276+
- **Error Handling**: Use proper error trapping with selective `set +e`/`set -e`
277+
- **Variable Quoting**: Always quote variables and use `read -r` for input
278+
- **Exit Codes**: Use proper exit code handling and propagation
279+
280+
### Testing:
281+
- Unit tests in `src/__tests__/update-logs.test.ts`
282+
- Integration tests for CLI commands
283+
- End-to-end tests for web UI
284+
- ShellCheck validation in CI/CD pipeline
285+
286+
### Contributing:
287+
- Follow existing log format and error code conventions
288+
- Run ShellCheck on all bash scripts before committing
289+
- Add tests for new functionality
290+
- Update documentation for new features
291+
- Ensure backward compatibility
292+
293+
## Security Considerations
294+
295+
- Log files may contain sensitive information
296+
- Automatic inclusion in debug packages with user consent
297+
- No credentials or secrets should be logged
298+
- File permissions restrict access to RatOS user
299+
- Log rotation prevents unbounded disk usage
300+
301+
## Future Enhancements
302+
303+
- Real-time log streaming via WebSocket
304+
- Log aggregation from multiple sources
305+
- Advanced filtering and search capabilities
306+
- Integration with external monitoring systems
307+
- Automated error pattern detection
308+
- Performance metrics and trending

configuration/boards/ay-caramba/compile.sh

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
1-
#!/bin/bash
1+
#!/usr/bin/env bash
22
if [ "$EUID" -ne 0 ]
33
then echo "ERROR: Please run as root"
44
exit

configuration/boards/ay-caramba/flash.sh

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
1-
#!/bin/bash
1+
#!/usr/bin/env bash
22
MCU=/dev/ay-caramba
33
if [ "$EUID" -ne 0 ]
44
then echo "ERROR: Please run as root"

configuration/boards/ay-caramba/make-and-flash-mcu.sh

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
1-
#!/bin/bash
1+
#!/usr/bin/env bash
22

33
if [ "$EUID" -ne 0 ]
44
then echo "ERROR: Please run as root"

configuration/boards/btt-ebb36-10/compile.sh

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
1-
#!/bin/bash
1+
#!/usr/bin/env bash
22
if [ "$EUID" -ne 0 ]
33
then echo "ERROR: Please run as root"
44
exit

configuration/boards/btt-ebb36-10/flash.sh

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
1-
#!/bin/bash
1+
#!/usr/bin/env bash
22
MCU=/dev/btt-ebb36-10
33
if [ "$EUID" -ne 0 ]
44
then echo "ERROR: Please run as root"

configuration/boards/btt-ebb36-10/make-and-flash-mcu.sh

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
1-
#!/bin/bash
1+
#!/usr/bin/env bash
22

33
if [ "$EUID" -ne 0 ]
44
then echo "ERROR: Please run as root"

configuration/boards/btt-ebb36-11/compile.sh

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
1-
#!/bin/bash
1+
#!/usr/bin/env bash
22
if [ "$EUID" -ne 0 ]
33
then echo "ERROR: Please run as root"
44
exit
