
OpenAF mini-a

Mini-A is a minimalist autonomous agent that uses LLMs, shell commands, and/or MCP (stdio or HTTP(S)) servers to achieve user-defined goals. It is designed to be simple, flexible, and easy to use. It can be used as a library, a command-line tool, or an embedded interface in other applications.

[Web UI screenshot: .github/mini-a-web-screenshot1.jpg]

Quick Start

Two steps to get started:

  1. Set OAF_MODEL environment variable to the model you want to use.
  2. Run the agent through one of the provided entry points:
    • Shell wrapper: ./mini-a.sh goal="your goal" (convenient default that executes mini-a.yaml)
    • oJob invocation: ojob mini-a.yaml goal="your goal" (explicit oJob execution)
    • Library usage: loadLib('mini-a.js'); (new MiniA()).start({ goal: '...' })

These entry points share the same options, so you can switch between them without changing configuration flags.
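
For example, a minimal first run using one of the model configurations from the table below (the goal and provider choice are illustrative; substitute your own):

export OAF_MODEL="(type: openai, model: gpt-5-mini, key: ..., timeout: 900000, temperature: 1)"
./mini-a.sh goal="summarize the files in the current directory" useshell=true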

Shell access is disabled by default for safety; add useshell=true when you explicitly want the agent to run commands.

If you prefer the browser UI, start ./mini-a-web.sh onport=8888 after exporting the model settings and open http://localhost:8888.

Need to share supporting text? Launch the web server with useattach=true to reveal the paperclip button beside the prompt box. You can attach multiple text-based files (up to 512 KB each) before submitting, review them as removable chips, and open the full contents from the conversation stream.

Common web toggles:

  • showexecs=true to surface executed commands in the transcript
  • logpromptheaders=origin,referer to emit selected HTTP headers for debugging
  • usediagrams=false / usecharts=false to disable Mermaid diagrams or Chart.js rendering when the runtime lacks those assets
  • useattach=true to enable the paperclip for uploading supporting text snippets
  • mode=shell (or any preset below) to reuse a curated bundle of flags from mini-a-modes.yaml
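
For example, an illustrative web session that exposes attachments and executed-command transcripts using the toggles above:

./mini-a-web.sh onport=8888 useattach=true showexecs=true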

Mode presets

Mini-A ships with reusable argument bundles so you can switch behaviors without remembering every flag. Pass mode=<name> with mini-a.sh, mini-a.yaml, or mini-a-main.yaml and the runtime will merge the corresponding preset from mini-a-modes.yaml before applying any explicit flags you provide on the command line.

Built-in presets:

  • shell – Read-only shell access (useshell=true).
  • shellrw – Shell with write access enabled (useshell=true readwrite=true).
  • shellutils – Shell plus the Mini File Tool MCP utilities (useutils=true usetools=true).
  • chatbot – Lightweight conversational mode (chatbotmode=true).
  • web – Browser UI with tool registration (usetools=true).
  • webfull – Web UI with history, attachments, diagrams, charts, and planning enabled (usetools=true usediagrams=true usecharts=true usehistory=true useattach=true historykeep=true useplanning=true).

Create your own presets by copying the file and adding new entries or overriding them locally; the agent loads the YAML on each run, so custom additions are immediately available.
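
For example, combining a preset with an explicit override (explicit flags are applied after the preset merge, so they win; the goal is illustrative):

./mini-a.sh mode=shellrw goal="tidy up the build directory" checkall=true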

Documentation

Basic Usage

Setting the model

Examples:

| Provider | Model | OAF_MODEL value | Observations |
|---|---|---|---|
| OpenAI | gpt-5-mini | (type: openai, model: gpt-5-mini, key: ..., timeout: 900000, temperature: 1) | |
| Google | gemini-2.5-flash-lite | (type: gemini, model: gemini-2.5-flash-lite, key: ..., timeout: 900000, temperature: 0) | |
| GitHub | gpt-5-nano | (type: openai, url: 'https://models.github.ai/inference', model: openai/gpt-5-nano, key: $(gh auth token), timeout: 900000, temperature: 1, apiVersion: '') | |
| AWS Bedrock | nova-pro | (type: bedrock, timeout: 900000, options: (model: 'amazon.nova-pro-v1:0', temperature: 0)) | After installing OpenAF's oPack "AWS", add libs="aws.js" to mini-a calls |
| AWS Bedrock | claude-sonnet-4.5 | (type: bedrock, timeout: 900000, options: (model: 'eu.anthropic.claude-sonnet-4-5-20250929-v1:0', region: eu-west-1, temperature: 0, params: (max_tokens: 200000))) | After installing OpenAF's oPack "AWS", add libs="aws.js" to mini-a calls |
| Groq | gpt-oss-20b | (type: openai, model: 'openai/gpt-oss-20b', key: '...', url: 'https://api.groq.com/openai', timeout: 900000, temperature: 0) | |
| Ollama | gemma3 | (type: ollama, model: 'gemma3', url: 'http://ollama.local', timeout: 900000) | |
| Ollama | mistral | (type: ollama, model: 'mistral', url: 'http://ollama.local', timeout: 900000) | |

Note: pass the chosen value through the environment with export OAF_MODEL="...".
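
For example, to point Mini-A at a local Ollama instance (value taken directly from the table above):

export OAF_MODEL="(type: ollama, model: 'gemma3', url: 'http://ollama.local', timeout: 900000)"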

Recommended model tiers

  • All uses (best): Claude Sonnet 4.5, OpenAI GPT-5, Google Gemini 2.5, OpenAI OSS 120B
  • Low cost (best): OpenAI GPT-5 mini, Amazon Nova Pro/Mini, OpenAI OSS 20B
  • Simple agent shell tool: Gemma 3, Phi 4
  • Chatbot: Mistral 7B, Llama 3.2 8B

Dual-Model Configuration (Cost Optimization)

Mini-A supports a dual-model configuration for cost optimization. Set OAF_LC_MODEL to use a cheaper model for routine operations, while keeping a more capable model for complex scenarios.

# Main model (high-capability, used for complex reasoning and initial planning)
export OAF_MODEL="(type: openai, model: gpt-4, key: 'your-api-key')"

# Low-cost model (used for routine operations like summarization and simple tasks)
export OAF_LC_MODEL="(type: openai, model: gpt-3.5-turbo, key: 'your-api-key')"

How it works:

  • Step 0: Always uses the main model for initial planning
  • Subsequent steps: Uses the low-cost model by default
  • Smart escalation: Automatically switches to the main model when:
    • Multiple consecutive errors occur
    • The agent gets stuck in thinking loops
    • Complex reasoning is needed

Cost savings: This can significantly reduce API costs by using cheaper models for routine tasks while ensuring quality for complex operations.

Tool orchestration enhancements

Recent updates focus on performance and resiliency when working with MCP tools:

  • Parallel tool execution – When the model responds with multiple independent tool calls in the same step, Mini-A executes them concurrently, reducing overall latency for long-running MCP operations.
  • Dynamic tool selection – Pair usetools=true with mcpdynamic=true to let Mini-A narrow the registered tool set via keyword heuristics, then the low-cost LLM, and finally the primary model, falling back to the full catalog if none match.
  • Smart context caching – System prompts and tool schema summaries are cached across sessions, minimizing repeated token overhead and keeping instructions consistent even as the tool roster grows.
  • Deterministic tool result caching – Tools marked with annotations.readOnlyHint, annotations.idempotentHint, or explicit caching metadata reuse previous results for the same parameters. Configure the default cache window with toolcachettl=<ms> or override it per tool via metadata.
  • Lazy MCP initialization – Pass mcplazy=true to defer establishing MCP connections until a tool is actually needed. This shortens startup times when many optional integrations are configured.

These improvements work out of the box and can be tuned per environment; see USAGE.md for option details.
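
For instance, a hedged combination of these flags on a multi-MCP run (the goal and cache TTL are illustrative; the MCP definitions reuse examples from later in this README):

mini-a.sh goal="cross-check dockerhub tags against wikipedia releases" usetools=true mcpdynamic=true mcplazy=true toolcachettl=60000 mcp="[(cmd: 'docker run --rm -i mcp/dockerhub', timeout: 5000), (cmd: 'docker run --rm -i mcp/wikipedia-mcp', timeout: 5000)]" rpm=20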

Reliability & recovery upgrades

Mini-A now bounces back from flaky infrastructure faster and with richer diagnostics:

  • Exponential backoff on LLM and MCP calls automatically spaces out retries to absorb transient rate limits, timeouts, and network hiccups before escalating failures.
  • Automatic checkpoints and restoration capture the agent state at the end of each healthy step, allowing seamless recovery after transient errors without losing context or progress.
  • Error categorization distinguishes between transient and permanent faults so the agent can retry, escalate, or halt with clear messaging instead of blindly reissuing requests.
  • Circuit breakers for MCP connections temporarily pause repeatedly failing integrations, protecting the session from hammering unhealthy backends.
  • Preserved error context across summaries keeps the latest recovery notes at the top of the conversation even when the working memory is compressed.

Advanced planning upgrades

Enable useplanning=true to activate a richer planning workflow that now adapts to task complexity:

  • Goal-aware strategy selection inspects the goal upfront and disables planning for trivial requests, keeps a short linear task list for moderate work, and creates a nested plan tree for complex missions.
  • Automatic decomposition & checkpoints seeds state.plan with structured steps, intermediate checkpoints, and progress percentages so the LLM can track execution without handcrafting the scaffold from scratch.
  • Feasibility validation pre-checks each step against available shell access and registered MCP tools, blocking impossible tasks and surfacing actionable warnings in the log.
  • Dynamic replanning hooks mark the active step as blocked whenever the runtime raises an error, flagging state.plan.meta.needsReplan=true so the model knows to adjust its strategy.
  • Progress metrics & logging record overall completion, checkpoint counts, and new counters (plans_generated, plans_validated, plans_replanned, etc.) that show up in getMetrics().

The new planning helpers live entirely in state.plan, so existing prompts and transcripts remain compatible while gaining richer telemetry.
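
The exact layout is internal, but based on the fields described above a populated plan might look roughly like this (an illustrative sketch, not the authoritative schema; the items key and progress value are assumptions):

{
  "plan": {
    "items": [
      { "title": "Inventory docker images", "status": "done" },
      { "title": "Filter images older than 1 year", "status": "in_progress" },
      { "title": "Remove the filtered images", "status": "pending" }
    ],
    "meta": { "needsReplan": false, "progress": 33 }
  }
}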

Running the mini agent

Single MCP connection

mini-a.sh goal="list all nmaguiar/imgutils image tags" mcp="(cmd: 'docker run --rm -i mcp/dockerhub', timeout: 5000)" rpm=20
mini-a.sh goal="..." mcp="(cmd: 'docker run --rm -i mcp/wikipedia-mcp', timeout: 5000)" rpm=20 __format=md

rpm caps model requests per minute; add tpm=<tokens> to limit combined prompt and completion tokens when needed.

Multiple MCP connections

mini-a.sh goal="get the latest top 20 tags used by the library/ubuntu, cross-check those tag names with the list of Ubuntu releases in Wikipedia, and produce a table with ubuntu release, tag name and latest push date" mcp="[(cmd: 'docker run --rm -i mcp/dockerhub', timeout: 5000), (cmd: 'docker run --rm -i mcp/wikipedia-mcp', timeout: 5000)]" rpm=20 tpm=80000 __format=md

Local MCP servers

Using built-in MCP servers:

# Database operations (preload optional helpers via libs=... if needed)
mini-a.sh goal="create a test table with European countries" \
  mcp="(cmd: 'ojob mcps/mcp-db.yaml jdbc=jdbc:h2:./data user=sa pass=sa libs=db/helpers.js', timeout: 5000)" rpm=20

# Network utilities
mini-a.sh goal="check if port 80 is open on google.com" mcp="(cmd: 'ojob mcps/mcp-net.yaml', timeout: 5000)" rpm=20

# Time and timezone utilities
mini-a.sh goal="what time is it in Sydney right now?" mcp="(cmd: 'ojob mcps/mcp-time.yaml', timeout: 5000)" rpm=20

# SSH execution (mcp-ssh)
mini-a.sh goal="run 'uptime' on remote host via SSH MCP" mcp="(cmd: 'ojob mcps/mcp-ssh.yaml ssh=ssh://user:pass@host:22/ident readwrite=false', timeout: 5000)" rpm=20

# Email operations
mini-a.sh goal="send a test email" mcp="(cmd: 'ojob mcps/mcp-email.yaml smtpserver=smtp.example.com [email protected]', timeout: 5000)" rpm=20

# S3 inventory (read-only by default; add readwrite=true to enable writes)
mini-a.sh goal="list the latest invoices in our S3 bucket" \
  mcp="(cmd: 'ojob mcps/mcp-s3.yaml bucket=finance-archive prefix=invoices/', timeout: 5000)" rpm=20

# RSS monitoring
mini-a.sh goal="summarize the last five posts from the OpenAI blog" \
  mcp="(cmd: 'ojob mcps/mcp-rss.yaml', timeout: 5000)" knowledge="- prefer bullet lists" rpm=20

# Market data lookups
mini-a.sh goal="compare AAPL and MSFT revenue trends" \
  mcp="(cmd: 'ojob mcps/mcp-fin.yaml', timeout: 5000)" rpm=20

# Local shell MCP (inherits the command allow/deny list)
mini-a.sh goal="collect disk usage stats" \
  mcp="(cmd: 'ojob mcps/mcp-shell.yaml timeout=3000 shellallow=df,du', timeout: 5000)" rpm=20

Shell operations

Remove docker images older than 1 year:

mini-a.sh goal="help me remove docker images that are older than 1 year" rpm=20 knowledge="give a final answer with a summary of changes in markdown" useshell=true

Analyze project structure:

mini-a.sh goal="analyze the current directory structure and provide insights" useshell=true rpm=15 __format=md

Chatbot-style conversations

When you just need an assistant-style exchange without the agent workflow, pass chatbotmode=true. Mini-A will swap in a lighter system prompt geared for natural dialogue, skip tool descriptions in the instructions, and stream replies just like a regular chat bot. You can combine this with the web UI or CLI entry points:

./mini-a.sh goal="help me plan a vacation in Lisbon" chatbotmode=true

From code you can opt into the same behavior:

var mini = new MiniA()
mini.start({ goal: "Summarize today's standup", chatbotmode: true })

All other flags (MCP connections, attachments, shell access, etc.) continue to work; you simply choose between the agent-style prompt or a conversational one based on the task.

Task planning updates (agent mode, opt-in)

Set useplanning=true (and keep chatbotmode=false) to have the agent maintain a lightweight task plan inside the state (plan array). Each item includes a short title and a status (pending, in_progress, done, or blocked). Leave useplanning unset/false and Mini-A will skip the planning instructions entirely.

  • CLI / oJob output: Planning updates appear with the 🗺️ icon, alongside thought (💭) messages.
  • Web UI: When an active plan exists the transcript keeps the 🗺️ entries and the interface surfaces an expandable progress card that summarizes completed vs. total steps and renders the plan as a numbered checklist with completed items struck through.
  • Custom integrations: The current plan continues to flow through the state payload passed back on each step, enabling downstream automation.

The agent revises the plan whenever progress changes, so the summary always reflects the latest approach. When no plan is active the web UI hides 🗺️ updates and the progress card stays collapsed.

Project Components

Mini-A ships with four complementary components:

  • mini-a.yaml – Core oJob definition that implements the agent workflow.
  • mini-a.sh – Shell wrapper that locates the repository directory and runs mini-a.yaml with all provided arguments.
  • mini-a.js – Reusable library so you can embed the agent in other OpenAF jobs or automation scripts.
  • mini-a-web.sh / mini-a-web.yaml – Lightweight HTTP server that serves the browser UI found in public/.

Features

  • Multi-Model Support: Works with OpenAI, Google Gemini, GitHub Models, AWS Bedrock, Ollama, and more
  • Dual-Model Cost Optimization: Use a low-cost model for routine steps via OAF_LC_MODEL with smart escalation to the main model when needed (see Dual-Model Configuration above)
  • MCP Integration: Seamless integration with Model Context Protocol servers (both local and external)
  • STDIO or HTTP MCPs: Use MCPs over STDIO or start them as remote HTTP servers with onport (see MCP docs)
  • Shell Access: Optional shell command execution with safety controls
  • Flexible Configuration: Extensive configuration options for different use cases
  • Dynamic Planning View: Opt into useplanning=true to keep a live plan (🗺️) of the current task, complete with web UI progress tracking
  • Dynamic MCP Tool Selection: Combine usetools=true with mcpdynamic=true to have the agent register only the tools it considers relevant to the current goal, falling back gracefully when needed
  • Built-in MCPs: Includes database, file system, network, time/timezone, email, data channel, RSS, S3, Yahoo Finance, SSH execution, and local shell MCP servers
  • Multiple MCP Connections: Connect to multiple MCPs at once and orchestrate across them
  • Simple Web UI: Lightweight embedded chat interface for interactive use (screenshot above)
  • Text Attachments in the Web UI: When started with useattach=true, upload and review text files alongside your prompt with collapsible previews in the conversation log
  • Utility MCP Helpers: Add useutils=true to bundle the Mini File Tool utilities as an on-demand MCP you can call from the agent
  • Chatbot Mode: Toggle chatbotmode=true to strip agent-style instructions and chat with the model in a lightweight assistant mode
  • Safety Features: Command filtering, confirmation prompts, and read-only modes
  • Conversation Persistence: Save and resume conversations across sessions
  • Automatic Context Summarization: Keeps context within limits with auto-summarize when it grows
  • Rate Limiting: Built-in rate limiting for API usage control
  • Metrics & Observability: Built-in counters surfaced via MiniA.getMetrics() and OpenAF's ow.metrics registry for dashboards.

Installation

Mini-A is built on the OpenAF platform. To get started:

  1. Install OpenAF - Download from openaf.io
  2. Install oPack:
    opack install mini-a
  3. Set your model configuration (see model examples below)
  4. Start using Mini-A!
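
Putting the steps together (the model value is taken from the examples below; this sketch assumes you run from the directory where the oPack was installed):

opack install mini-a
export OAF_MODEL="(type: openai, url: 'https://models.github.ai/inference', model: openai/gpt-5-nano, key: $(gh auth token), timeout: 900000, temperature: 1, apiVersion: '')"
ojob mini-a.yaml goal="introduce yourself"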

Configuration

Environment Variables

  • OAF_MODEL (required): LLM model configuration
  • OAF_LC_MODEL (optional): Low-cost LLM model configuration for cost optimization
  • OAF_FLAGS (optional): Extra OpenAF flags, e.g. OAF_FLAGS="(MD_DARKMODE: 'auto')" to force dark mode or leave it automatic
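
For example (model values copied from the tables above; the flags entry is verbatim from this list):

export OAF_MODEL="(type: gemini, model: gemini-2.5-flash-lite, key: ..., timeout: 900000, temperature: 0)"
export OAF_LC_MODEL="(type: ollama, model: 'mistral', url: 'http://ollama.local', timeout: 900000)"
export OAF_FLAGS="(MD_DARKMODE: 'auto')"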

Command Line Options

All Mini-A options can be passed as command line arguments:

  • goal – Objective the agent should achieve (required for MiniA.start / mini-a.yaml)
  • mcp – MCP server configuration (single object or array, in JSON/SLON)
  • usetools – Register MCP tools directly with the model instead of expanding the prompt with tool schemas
  • mcpdynamic – When combined with usetools=true, analyze the goal and register only the MCP tools that look relevant
  • useutils – Mount the Mini File Tool helpers as an auxiliary MCP connection (default false)
  • useshell – Allow shell command execution (default false)
  • shell – Prefix every shell command (requires useshell=true; ideal for sandboxing with sandbox-exec, container exec, docker exec, etc.)
  • readwrite – Allow file system modifications without confirmation prompts (default false)
  • checkall – Prompt before running every shell command (default false)
  • shellallow, shellbanextra – Override the default banned command lists
  • shellallowpipes – Permit pipes/redirection/control operators when executing shell commands (default false)
  • shellbatch – Skip interactive confirmations when checkall is active (default false)
  • knowledge – Extra instructions or the path to a text file to append to the system prompt
  • rules – Additional numbered rules (JSON/SLON array) injected into the system prompt
  • state – Initial agent state payload (JSON/SLON string or object) preserved between steps
  • mode – Apply a preset from mini-a-modes.yaml to toggle a bundle of arguments at once
  • conversation – Path to a conversation JSON file to load/save chat history
  • libs – Comma-separated list of extra OpenAF libraries to load before starting
  • maxsteps – Maximum consecutive steps without a successful action before forcing a final response (default 15)
  • maxcontext – Approximate token budget for context before auto-summarization kicks in (default disabled)
  • rpm – Maximum LLM requests per minute (waits automatically)
  • tpm – Maximum combined prompt/completion tokens per minute
  • verbose, debug – Enable progressively richer logging
  • raw – Return the final response exactly as produced instead of formatted output
  • outfile – Path to write the final answer (implies JSON output unless __format is provided)
  • __format – Output format (e.g. md, json)
  • chatbotmode – Skip the agent workflow and respond like a regular chat assistant (default false)
  • useplanning – Keep a live task plan in agent mode; Mini-A disables it automatically for trivial goals
  • auditch – Persist every interaction event to an OpenAF channel (JSSLON definition)
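
A worked example combining several of these options (the goal, knowledge text, limits, and output file are illustrative):

mini-a.sh goal="audit the current directory for oversized log files" useshell=true checkall=true knowledge="answer with a markdown summary" maxsteps=20 rpm=15 outfile=audit.md __format=md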

For a complete list of options, see the Usage Guide.

Security

Mini-A includes several security features:

  • Command Filtering: Dangerous commands are blocked by default
  • Customizable Shell Controls: Use shellallow, shellallowpipes, and shellbanextra to fine-tune shell access
  • Interactive Confirmation: Use checkall=true for command approval
  • Read-Only Mode: File system protection enabled by default
  • Shell Isolation: Shell access disabled by default

Shell Prefix Examples

Combine useshell=true with the new shell=... option to route every command through an OS sandbox or container runtime. The prefix is prepended to the command before execution (the safety filters still inspect the original command text).

  • macOS (sandbox-exec) – Constrain the agent with a sandbox profile:
    ./mini-a.sh goal="catalog ~/Projects" useshell=true \
      shell="sandbox-exec -f /usr/share/sandbox/default.sb"
  • macOS Sequoia (container) – Use Apple's container CLI after starting an instance:
    container run --detach --name mini-a --image docker.io/library/ubuntu:24.04 sleep infinity
    ./mini-a.sh goal="inspect /work" useshell=true shell="container exec mini-a"
  • Linux / macOS / Windows WSL (Docker) – Exec into an existing sandbox container:
    docker run -d --rm --name mini-a-sandbox -v "$PWD":/work -w /work ubuntu:24.04 sleep infinity
    ./mini-a.sh goal="summarize git status" useshell=true shell="docker exec mini-a-sandbox"
  • Linux / macOS / Windows WSL (Podman) – Same pattern with Podman:
    podman run -d --rm --name mini-a-sandbox -v "$PWD":/work -w /work docker.io/library/fedora:latest sleep infinity
    ./mini-a.sh goal="list source files" useshell=true shell="podman exec mini-a-sandbox"

See the Usage Guide for trade-offs and when to choose shell prefixes versus the built-in restriction flags.

See the Usage Guide for detailed security information.

Monitoring & Metrics

Mini-A tracks detailed runtime metrics (LLM calls, shell approvals, escalation counters, summarization activity, and more). You can access them in two ways:

  • From code, call MiniA.getMetrics() to obtain a snapshot of counters for the current process.
  • Through OpenAF's metrics registry (ow.metrics.add('mini-a', ...)), which exposes the same information to external scrapers or dashboards.

These metrics are useful for tracking costs, diagnosing stuck runs, and creating operational dashboards for long-lived agents.

Metric categories returned by getMetrics()

| Category | Keys | What they represent |
|---|---|---|
| llm_calls | normal, low_cost, total, fallback_to_main | How many requests were routed to the primary or low-cost model and the number of times Mini-A escalated back to the main model. |
| goals | achieved, failed, stopped | Goal-level outcomes for the current run. |
| actions | thoughts_made, thinks_made, finals_made, mcp_actions_executed, mcp_actions_failed, shell_commands_executed, shell_commands_blocked, shell_commands_approved, shell_commands_denied, unknown_actions | Volume and success of high-level agent actions, including MCP calls and shell approvals. |
| planning | disabled_simple_goal, plans_generated, plans_validated, plans_validation_failed, plans_replanned | Planning workflow activity: when planning was skipped, generated, validated, or rebuilt during a session. |
| performance | steps_taken, total_session_time_ms, avg_step_time_ms, max_context_tokens, llm_estimated_tokens, llm_actual_tokens, llm_normal_tokens, llm_lc_tokens | Execution pacing, wall-clock timings, and token consumption. |
| behavior_patterns | escalations, retries, consecutive_errors, consecutive_thoughts, json_parse_failures, action_loops_detected, thinking_loops_detected, similar_thoughts_detected | Signals that help detect unhealthy loops or parser issues. |
| summarization | summaries_made, summaries_skipped, summaries_forced, context_summarizations, summaries_tokens_reduced, summaries_original_tokens, summaries_final_tokens | Automatic summarization decisions and token savings. |

Use these counters to plot dashboards, set alerts (for example, when consecutive_errors keeps climbing), or estimate LLM spend based on token metrics.
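
From code, a minimal sketch following the library usage shown earlier (the goal and alert threshold are illustrative, and it assumes the categories are exposed as nested objects keyed as in the table above):

loadLib('mini-a.js')

// Run a goal first so the counters have something to report
var mini = new MiniA()
mini.start({ goal: 'collect disk usage stats', useshell: true })

// Snapshot the counters for the current process (API name as given in this README)
var metrics = MiniA.getMetrics()
print('LLM calls (total): ' + metrics.llm_calls.total)

// Example alert condition built on the behavior_patterns category
if (metrics.behavior_patterns.consecutive_errors > 3) {
  logWarn('Mini-A keeps hitting errors; consider halting or escalating')
}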

Contributing

We welcome contributions! Please see our Contributing Guide for details on:

  • Code contribution process
  • Development setup
  • Pull request guidelines
  • Community standards

Running tests

Run the test suite from the repository root using oJob:

ojob tests/autoTestAll.yaml

Be sure to execute this in the main repo folder so relative paths used by the tests resolve correctly. You need OpenAF installed so the ojob command is available.

The run generates an autoTestAll.results.json file with detailed results; inspect it locally and delete it before your final commit.

Community

Please read our Code of Conduct before participating.

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
