152 changes: 152 additions & 0 deletions CHEATSHEET.md
@@ -0,0 +1,152 @@
# mcp-pandoc Quick Reference Cheatsheet

_Last Updated: June 27, 2025_

## 🚀 Prerequisites (One-Time Setup)

| Component | macOS | Ubuntu/Debian | Windows |
| ----------------------- | ---------------------- | ------------------------------------ | --------------------------------------------------------------------- |
| **Pandoc** | `brew install pandoc` | `sudo apt-get install pandoc` | [Download installer](https://pandoc.org/installing.html) |
| **UV** | `brew install uv` | `pip install uv` | `pip install uv` |
| **TeX Live** (PDF only) | `brew install texlive` | `sudo apt-get install texlive-xetex` | [MiKTeX](https://miktex.org/) or [TeX Live](https://tug.org/texlive/) |

## 📊 Supported Formats & Conversions

### Bidirectional Conversion Matrix

| From\To | MD | HTML | TXT | DOCX | PDF | RST | LaTeX | EPUB |
| ------------ | --- | ---- | --- | ---- | --- | --- | ----- | ---- |
| **Markdown** | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| **HTML** | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| **TXT** | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| **DOCX** | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| **PDF** | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
| **RST** | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| **LaTeX** | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| **EPUB** | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |

### Format Categories

| Category | Formats | Requirements |
| ------------ | --------------------------- | ------------------------------- |
| **Basic** | MD, HTML, TXT | None |
| **Advanced** | DOCX, PDF, RST, LaTeX, EPUB | Must specify `output_file` path |
| **Styled** | DOCX with reference doc | Custom template support ⭐ |

## ⚡ Quick Examples

### Content-to-Format Conversions

```bash
# Markdown to HTML (displayed)
"Convert this to HTML: # Hello World"

# Markdown to DOCX (saved)
"Convert this to DOCX and save as /tmp/doc.docx: # My Document"

# Markdown to PDF (saved)
"Convert this to PDF and save as /tmp/doc.pdf: # My Document"
```

### File-to-File Conversions

```bash
# DOCX to PDF
"Convert /path/input.docx to PDF and save as /path/output.pdf"

# Markdown to DOCX
"Convert /path/input.md to DOCX and save as /path/output.docx"

# HTML to Markdown
"Convert /path/input.html to Markdown and save as /path/output.md"
```

### Reference Document Styling (⭐ NEW Feature)

```bash
# Step 1: Create reference document
pandoc -o /tmp/reference.docx --print-default-data-file reference.docx

# Step 2: Use reference for styled conversion
"Convert this to DOCX using /tmp/reference.docx as reference and save as /tmp/styled.docx:
# Professional Report
This will be styled according to the reference document."
```
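
For readers curious what the styled conversion amounts to at the pandoc level, here is a minimal sketch. `pandoc_command` is a hypothetical helper written for this cheatsheet, not part of mcp-pandoc; the only real pandoc options it relies on are `-o` and `--reference-doc`.

```python
from typing import Optional


def pandoc_command(input_file: str, output_file: str,
                   reference_doc: Optional[str] = None) -> list[str]:
    """Build a pandoc command line (hypothetical helper, for illustration only)."""
    command = ["pandoc", input_file, "-o", output_file]
    if reference_doc is not None:
        # --reference-doc tells pandoc to copy styles from an existing DOCX file.
        command += ["--reference-doc", reference_doc]
    return command


styled = pandoc_command("/tmp/report.md", "/tmp/styled.docx", "/tmp/reference.docx")
print(" ".join(styled))
```

The resulting list could be passed to `subprocess.run` on a machine where pandoc is installed; the server itself passes the same flag through its `extra_args` list.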

## 🔄 Common Workflows

### Publishing Pipeline

| Step | Command | Output |
| ---- | -------------------------------------------------------- | ----------------- |
| 1 | `"Convert manuscript.md to DOCX and save as draft.docx"` | Draft for review |
| 2 | `"Convert draft.docx to PDF and save as final.pdf"` | Publication ready |

### Documentation Workflow

| Step | Command | Purpose |
| ---- | --------------------------------------------------------- | ----------------- |
| 1 | `"Convert README.md to HTML and save as docs/index.html"` | Web documentation |
| 2 | `"Convert README.md to PDF and save as docs/manual.pdf"` | Printable manual |

### Professional Reports

| Step | Command | Result |
| ---- | -------------------------------------------------------------------------------------- | ------------------ |
| 1 | Create template: `pandoc -o template.docx --print-default-data-file reference.docx` | Custom styling |
| 2 | `"Convert report.md to DOCX using template.docx as reference and save as report.docx"` | Branded document |
| 3 | `"Convert report.docx to PDF and save as report.pdf"` | Final distribution |
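
Steps 2 and 3 of the table above can be pictured as two sets of tool arguments. This is only a sketch: the parameter names come from the Parameter Quick Reference, but how a given client translates the prompt into arguments is an assumption, and `report_steps` is a hypothetical name.

```python
# Hypothetical argument payloads for steps 2 and 3 of the Professional Reports workflow.
report_steps = [
    {
        "input_file": "report.md",
        "output_format": "docx",
        "output_file": "report.docx",
        "reference_doc": "template.docx",  # only valid with docx output
    },
    {
        "input_file": "report.docx",
        # Assumption: set explicitly, because the server reads input_format
        # with a default of "markdown".
        "input_format": "docx",
        "output_format": "pdf",
        "output_file": "report.pdf",
    },
]

# Each step consumes the file that the previous step produced.
for previous, current in zip(report_steps, report_steps[1:]):
    assert current["input_file"] == previous["output_file"]
```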

## 💡 Pro Tips

### File Paths

| ✅ Correct | ❌ Incorrect |
| ------------------------ | ---------------------- |
| `/tmp/document.pdf` | `/tmp/document` |
| `C:\Documents\file.docx` | `C:\Documents\` |
| `./output/report.html` | `just convert to HTML` |
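
The rule behind this table is that an output path must end in a filename with an extension. A rough heuristic for that check is sketched below; `looks_like_complete_path` is a hypothetical name, and the server's own validation may differ.

```python
from pathlib import PureWindowsPath


def looks_like_complete_path(path: str) -> bool:
    """Heuristic: True when the path ends in a filename that has an extension."""
    if path.endswith(("/", "\\")):
        return False  # a trailing separator means a directory, not a file
    # PureWindowsPath accepts both "/" and "\" as separators,
    # so one parser covers the POSIX and Windows rows of the table.
    return PureWindowsPath(path).suffix != ""


for candidate in ["/tmp/document.pdf", "/tmp/document", "C:\\Documents\\"]:
    print(candidate, looks_like_complete_path(candidate))
```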

### Format-Specific Notes

| Format | Requirements | Notes |
| --------- | ---------------------- | ----------------------- |
| **PDF** | TeX Live installed | Uses XeLaTeX engine |
| **DOCX** | Optional reference doc | Supports custom styling |
| **EPUB** | Output file required | Good for e-books |
| **LaTeX** | Output file required | Academic documents |

### Reference Documents

| Use Case | Command |
| ---------------------- | ------------------------------------------------------------- |
| **Create default** | `pandoc -o ref.docx --print-default-data-file reference.docx` |
| **Corporate branding** | Customize ref.docx in Word/LibreOffice → Save |
| **Apply styling** | Add `reference_doc: "/path/to/ref.docx"` parameter |

### Error Troubleshooting

| Error | Solution |
| --------------------------------------- | ------------------------------------------- |
| "xelatex not found" | Install TeX Live |
| "Reference document not found" | Check file path exists |
| "output_file path is required" | Add complete file path for advanced formats |
| "only supported for docx output format" | Reference docs only work with DOCX |
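
The "xelatex not found" case can be caught before a conversion starts by checking that the executables are on `PATH`. The sketch below assumes the executable names `pandoc` and `xelatex`; `missing_tools` is a hypothetical helper, not something mcp-pandoc provides.

```python
import shutil
from typing import Callable, Optional


def missing_tools(output_format: str,
                  which: Callable[[str], Optional[str]] = shutil.which) -> list[str]:
    """Return the required executables that cannot be found on PATH."""
    required = ["pandoc"]
    if output_format.lower() == "pdf":
        required.append("xelatex")  # PDF output goes through the XeLaTeX engine
    return [tool for tool in required if which(tool) is None]


print(missing_tools("pdf"))
```

The `which` parameter exists only so the lookup can be replaced in tests; by default it uses `shutil.which` from the standard library.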

## 🎯 Parameter Quick Reference

| Parameter | Type | Required | Description | Example |
| --------------- | ------ | -------- | ----------------------------- | --------------------------- |
| `contents` | string | ✅\* | Text to convert | `"# Hello World"` |
| `input_file` | string | ✅\* | File to convert | `"/path/input.md"` |
| `output_format` | string | ✅ | Target format | `"docx"`, `"pdf"`, `"html"` |
| `output_file` | string | ⚠️\*\* | Save location | `"/path/output.docx"` |
| `input_format` | string | ❌ | Source format (defaults to `markdown`) | `"markdown"` |
| `reference_doc` | string | ❌ | DOCX template | `"/path/template.docx"` |

\*Either `contents` OR `input_file` required
\*\*Required for: PDF, DOCX, RST, LaTeX, EPUB
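
The rules in the table and footnotes can be condensed into a small validation sketch. `validate_arguments` and `ADVANCED_FORMATS` are hypothetical names, and the error messages are paraphrased; the server's actual checks may be ordered or worded differently.

```python
ADVANCED_FORMATS = {"pdf", "docx", "rst", "latex", "epub"}


def validate_arguments(arguments: dict) -> None:
    """Raise ValueError when the arguments break one of the rules above."""
    if not arguments.get("contents") and not arguments.get("input_file"):
        raise ValueError("Either 'contents' or 'input_file' must be provided")

    output_format = arguments.get("output_format", "markdown").lower()
    if output_format in ADVANCED_FORMATS and not arguments.get("output_file"):
        raise ValueError(f"output_file path is required for {output_format}")

    if arguments.get("reference_doc") and output_format != "docx":
        raise ValueError("reference_doc is only supported for docx output")


# A valid request: inline contents, basic output format, no output_file needed.
validate_arguments({"contents": "# Hello World", "output_format": "html"})
```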

---

_This cheatsheet covers mcp-pandoc v0.3.4+ with reference document support_
68 changes: 51 additions & 17 deletions README.md
@@ -15,6 +15,16 @@ Please note that mcp-pandoc is currently in early development. PDF support is un

Credit: This project uses the [Pandoc Python package](https://pypi.org/project/pandoc/) for document conversion, forming the foundation for this project.

## 📋 Quick Reference

**New to mcp-pandoc?** Check out our **[📖 CHEATSHEET.md](CHEATSHEET.md)** for:
- ⚡ Copy-paste examples for all formats
- 🔄 Bidirectional conversion matrix
- 🎯 Common workflows and pro tips
- 🌟 Reference document styling guide

*Perfect for quick lookups and getting started fast!*

## Demo

[![mcp-pandoc - v1: Seamless Document Format Conversion for Claude using MCP server](https://img.youtube.com/vi/vN3VOb0rygM/maxresdefault.jpg)](https://youtu.be/vN3VOb0rygM)
@@ -43,6 +53,7 @@ More to come...
- `input_format` (string): Source format of the content (defaults to markdown)
- `output_format` (string): Target format (defaults to markdown)
- `output_file` (string): Complete path for output file (required for pdf, docx, rst, latex, epub formats)
- `reference_doc` (string): Path to a reference document to use for styling (supported for docx output format)
- Supported input/output formats:
- markdown
- html
@@ -54,23 +65,31 @@ More to come...
- txt
- Note: For advanced formats (pdf, docx, rst, latex, epub), an output_file path is required

### Supported Formats

Currently supported formats:

Basic formats (direct conversion):

- Plain text (.txt)
- Markdown (.md)
- HTML (.html)

Advanced formats (requires complete file paths):

- PDF (.pdf) - requires TeX Live installation
- DOCX (.docx)
- RST (.rst)
- LaTeX (.tex)
- EPUB (.epub)
## 📊 Supported Formats & Conversions

### Bidirectional Conversion Matrix
| From\To | MD | HTML | TXT | DOCX | PDF | RST | LaTeX | EPUB |
|---------|----|----|-----|------|-----|-----|-------|------|
| **Markdown** | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| **HTML** | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| **TXT** | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| **DOCX** | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| **PDF** | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
| **RST** | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| **LaTeX** | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| **EPUB** | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |

### Format Categories
| Category | Formats | Requirements |
|----------|---------|--------------|
| **Basic** | MD, HTML, TXT | None |
| **Advanced** | DOCX, PDF, RST, LaTeX, EPUB | Must specify `output_file` path |
| **Styled** | DOCX with reference doc | Custom template support ⭐ |

### Requirements by Format
- **PDF (.pdf)** - requires TeX Live installation
- **DOCX (.docx)** - supports custom styling via reference documents
- **All others** - no additional requirements

Note: For advanced formats:

@@ -97,6 +116,8 @@ To use the published one
}
```

**💡 Quick Start**: See **[CHEATSHEET.md](CHEATSHEET.md)** for copy-paste examples and common workflows.

### ⚠️ Important Notes

#### Critical Requirements
@@ -157,6 +178,13 @@ To use the published one

# Converting between file formats
"Convert /path/to/input.md to PDF and save as /path/to/output.pdf"

# Converting to DOCX with a reference document template
"Convert input.md to DOCX using template.docx as reference and save as output.docx"

# Step-by-step reference document workflow
"First create a reference document (skip this step if you already have one): pandoc -o custom-reference.docx --print-default-data-file reference.docx"
"Then convert with custom styling: Convert this text to DOCX using /path/to/custom-reference.docx as reference and save as /path/to/styled-output.docx"
```

❌ Incorrect Usage:
@@ -189,6 +217,12 @@ To use the published one
- Basic: txt, html, markdown
- Advanced: pdf, docx, rst, latex, epub

4. **Reference Document Issues**
- Error: "Reference document not found"
- Solution: Ensure the reference document path exists and is accessible
- Note: Reference documents only work with DOCX output format
- How to create: `pandoc -o reference.docx --print-default-data-file reference.docx`

## Quickstart

### Install
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -1,6 +1,6 @@
[project]
name = "mcp-pandoc"
version = "0.3.3"
version = "0.3.4"
description = "MCP to interface with pandoc to convert files to different formats. Eg: Converting markdown to pdf."
readme = "README.md"
requires-python = ">=3.11"
20 changes: 15 additions & 5 deletions smithery.yaml
@@ -1,12 +1,22 @@
# Smithery configuration file: https://smithery.ai/docs/config#smitheryyaml
name: mcp-pandoc
version: "1.0.0"

# specify how Smithery launches the MCP server
startCommand:
type: stdio
configSchema:
# JSON Schema defining the configuration options for the MCP.
type: object
properties: {}
commandFunction:
# A function that produces the CLI command to start the MCP on stdio.
|-
() => ({command: 'uv', args: ['run', 'mcp-pandoc']})
# using uv run mcp-pandoc as your startup command
commandFunction: |-
() => ({ command: "uv", args: ["run", "mcp-pandoc"] })

# dependencies to be installed by Smithery before running your tool
dependencies:
system: # for system-level OS packages
- pandoc
- texlive-xetex # full TeX Live for PDF support
python: # pip-installable Python packages
- uv # the UV runner package
- mcp-pandoc # ensure your own package is installed in the environment
28 changes: 28 additions & 0 deletions src/mcp_pandoc/server.py
@@ -60,6 +60,16 @@ async def handle_list_tools() -> list[types.Tool]:
"2. The desired output format\n"
"3. For advanced formats: complete output path + filename + extension\n"
"Example: 'Convert this markdown to PDF and save as /path/to/output.pdf'\n\n"
"🎨 DOCX STYLING (NEW FEATURE):\n"
"4. Custom DOCX Styling with Reference Documents:\n"
" * Use reference_doc parameter to apply professional styling to DOCX output\n"
" * Create custom templates with your branding, fonts, and formatting\n"
" * Perfect for corporate reports, academic papers, and professional documents\n"
" * Example: 'Convert this report to DOCX using /templates/corporate-style.docx as reference and save as /reports/Q4-report.docx'\n\n"
"📋 Creating Reference Documents:\n"
" * Generate template: pandoc -o template.docx --print-default-data-file reference.docx\n"
" * Customize in Word/LibreOffice: fonts, colors, headers, margins\n"
" * Use for consistent branding across all documents\n\n"
"Note: After conversion, always check the success message for the exact file location."
),
inputSchema={
@@ -88,6 +98,10 @@ async def handle_list_tools() -> list[types.Tool]:
"output_file": {
"type": "string",
"description": "Complete path where to save the output including filename and extension (required for pdf, docx, rst, latex, epub formats)"
},
"reference_doc": {
"type": "string",
"description": "Path to a reference document to use for styling (supported for docx output format)"
}
},
"oneOf": [
@@ -134,11 +148,19 @@ async def handle_call_tool(
output_file = arguments.get("output_file")
output_format = arguments.get("output_format", "markdown").lower()
input_format = arguments.get("input_format", "markdown").lower()
reference_doc = arguments.get("reference_doc")

# Validate input parameters
if not contents and not input_file:
raise ValueError("Either 'contents' or 'input_file' must be provided")

# Validate reference_doc if provided
if reference_doc:
if output_format != "docx":
raise ValueError("reference_doc parameter is only supported for docx output format")
if not os.path.exists(reference_doc):
raise ValueError(f"Reference document not found: {reference_doc}")

# Define supported formats
SUPPORTED_FORMATS = {'html', 'markdown', 'pdf', 'docx', 'rst', 'latex', 'epub', 'txt'}
if output_format not in SUPPORTED_FORMATS:
@@ -160,6 +182,12 @@ async def handle_call_tool(
"-V", "geometry:margin=1in"
])

# Handle reference doc for docx format
if reference_doc and output_format == "docx":
extra_args.extend([
"--reference-doc", reference_doc
])

# Convert content using pypandoc
if input_file:
if not os.path.exists(input_file):