This repository contains an MCP (Model Context Protocol) server (`server.py`) built using the `fastmcp` library. It interacts with the Kaggle API to provide tools for searching and downloading datasets, as well as a prompt for generating EDA notebooks.
- `server.py`: The FastMCP server application. It defines the resources, tools, and prompts for interacting with Kaggle (see the structural sketch just below this list).
- `.env.example`: An example file for environment variables (Kaggle API credentials). Rename it to `.env` and fill in your details.
- `requirements.txt`: Lists the necessary Python packages.
- `pyproject.toml` & `uv.lock`: Project metadata and locked dependencies for the `uv` package manager.
- `datasets/`: Default directory where downloaded Kaggle datasets are stored.
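For orientation, a `fastmcp` server of this kind typically follows the structure sketched below. This is a minimal, illustrative skeleton rather than the exact contents of `server.py`; the real tool and prompt behaviour is described in the capability list further down.

```python
# Minimal FastMCP server skeleton (illustrative; not the repo's exact code).
from fastmcp import FastMCP

mcp = FastMCP("kaggle-mcp")

@mcp.tool()
def search_kaggle_datasets(query: str) -> str:
    """Search Kaggle for datasets matching the query."""
    ...  # call the Kaggle API and return JSON (see the capability list below)

@mcp.prompt()
def generate_eda_notebook(dataset_ref: str) -> str:
    """Build an EDA-notebook prompt for the given dataset reference."""
    ...  # return a prompt string for a code-generating model

if __name__ == "__main__":
    mcp.run()
```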
To set up the project locally:

- Clone the repository:

  ```bash
  git clone <repository-url>
  cd <repository-directory>
  ```
- Create a virtual environment (recommended):

  ```bash
  python -m venv venv
  source venv/bin/activate  # On Windows use `venv\Scripts\activate`
  # Or use uv:
  uv venv
  ```
- Install dependencies, using pip:

  ```bash
  pip install -r requirements.txt
  ```

  or using uv:

  ```bash
  uv sync
  ```
- Set up Kaggle API credentials:
  - Method 1 (Recommended): Environment variables
    - Create a `.env` file in the project root.
    - Open the `.env` file and add your Kaggle username and API key:

      ```
      KAGGLE_USERNAME=your_kaggle_username
      KAGGLE_KEY=your_kaggle_api_key
      ```

    - You can obtain your API key from your Kaggle account page (`Account` > `API` > `Create New API Token`). This downloads a `kaggle.json` file containing your username and key.
  - Method 2: `kaggle.json` file
    - Download your `kaggle.json` file from your Kaggle account.
    - Place the `kaggle.json` file in the expected location (usually `~/.kaggle/kaggle.json` on Linux/macOS or `C:\Users\<Your User Name>\.kaggle\kaggle.json` on Windows). The `kaggle` library will automatically detect this file if the environment variables are not set.
  - Either way, the `kaggle` client picks the credentials up automatically; a short sketch after this setup list shows one way the loading can work.
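Both methods rely on the official `kaggle` client, which reads `KAGGLE_USERNAME`/`KAGGLE_KEY` from the environment and falls back to `~/.kaggle/kaggle.json`. Below is a hedged sketch of how a server like this can load the `.env` file before authenticating; it assumes `python-dotenv` is installed, and the actual loading code in `server.py` may differ.

```python
# Illustrative credential bootstrap; server.py may do this differently.
import os
from dotenv import load_dotenv   # python-dotenv (assumed installed)

# Load .env before importing the kaggle client: many versions of the kaggle
# package already look for credentials at import time.
load_dotenv()

from kaggle.api.kaggle_api_extended import KaggleApi  # noqa: E402

api = KaggleApi()
api.authenticate()   # uses KAGGLE_USERNAME/KAGGLE_KEY if set, else ~/.kaggle/kaggle.json
print("Authenticated as:", os.environ.get("KAGGLE_USERNAME", "<from kaggle.json>"))
```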
- Ensure your virtual environment is active.
- Run the MCP server:

  ```bash
  uv run kaggle-mcp
  ```

  The server will start and register its resources, tools, and prompts. You can then interact with it using an MCP client or compatible tools.
This project requires Kaggle API credentials to access Kaggle datasets.

- Go to https://www.kaggle.com/settings and click "Create New API Token" to download your `kaggle.json` file.
- Open the `kaggle.json` file and copy your username and key into a new `.env` file in the project root:

  ```
  KAGGLE_USERNAME=your_username
  KAGGLE_KEY=your_key
  ```
Then build and run the Docker image:

```bash
docker build -t kaggle-mcp-test .
docker run --rm -it --env-file .env kaggle-mcp-test
```

This will automatically load your Kaggle credentials as environment variables inside the container.
The server exposes the following capabilities through the Model Context Protocol (a hedged implementation sketch follows this list):

- `search_kaggle_datasets(query: str)`:
  - Searches for datasets on Kaggle matching the provided query string.
  - Returns a JSON list of the top 10 matching datasets with details such as reference, title, download count, and last-updated date.
- `download_kaggle_dataset(dataset_ref: str, download_path: str | None = None)`:
  - Downloads and unzips the files for a specific Kaggle dataset.
  - `dataset_ref`: The dataset identifier in the format `username/dataset-slug` (e.g., `kaggle/titanic`).
  - `download_path` (optional): Where to download the dataset. If omitted, it defaults to `./datasets/<dataset_slug>/` relative to the server script's location.
- `generate_eda_notebook(dataset_ref: str)`:
  - Generates a prompt message suitable for an AI model (such as Gemini) to create a basic Exploratory Data Analysis (EDA) notebook for the specified Kaggle dataset reference.
  - The prompt asks for Python code covering data loading, missing-value checks, visualizations, and basic statistics.
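To make the capability list concrete, here is a hedged sketch of how these tools and the prompt could be implemented on top of the official `kaggle` client. The `dataset_list` and `dataset_download_files` calls are real `kaggle` API methods, but the bodies, result field names, and defaults are illustrative and not copied from `server.py`.

```python
# Illustrative tool/prompt bodies (not the repo's exact code).
import json
from pathlib import Path

from fastmcp import FastMCP
from kaggle.api.kaggle_api_extended import KaggleApi

mcp = FastMCP("kaggle-mcp")
api = KaggleApi()
api.authenticate()  # assumes credentials are already configured (see setup above)

@mcp.tool()
def search_kaggle_datasets(query: str) -> str:
    """Return a JSON list of the top 10 Kaggle datasets matching the query."""
    results = api.dataset_list(search=query)[:10]
    return json.dumps([
        {
            "ref": str(d.ref),
            "title": d.title,
            # Attribute names vary across kaggle versions, hence the getattr guards.
            "downloadCount": getattr(d, "downloadCount", None),
            "lastUpdated": str(getattr(d, "lastUpdated", "")),
        }
        for d in results
    ])

@mcp.tool()
def download_kaggle_dataset(dataset_ref: str, download_path: str | None = None) -> str:
    """Download and unzip a dataset such as 'username/dataset-slug'."""
    if download_path is None:
        # Default to ./datasets/<dataset_slug>/ next to the server script.
        slug = dataset_ref.split("/")[-1]
        download_path = str(Path(__file__).parent / "datasets" / slug)
    api.dataset_download_files(dataset_ref, path=download_path, unzip=True)
    return f"Dataset {dataset_ref} downloaded to {download_path}"

@mcp.prompt()
def generate_eda_notebook(dataset_ref: str) -> str:
    """Ask a code-generating model for a basic EDA notebook."""
    return (
        f"Write Python code for an exploratory data analysis notebook on the "
        f"Kaggle dataset '{dataset_ref}'. Cover data loading, missing-value "
        f"checks, visualizations, and basic statistics."
    )

if __name__ == "__main__":
    mcp.run()
```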
Go to Claude > Settings > Developer > Edit Config > `claude_desktop_config.json` and include the following:

```json
{
  "mcpServers": {
    "kaggle-mcp": {
      "command": "kaggle-mcp",
      "cwd": "<path-to-their-cloned-repo>/kaggle-mcp"
    }
  }
}
```
An AI agent or MCP client could interact with this server like this (a scripted client example follows the list):

- Agent: "Search Kaggle for datasets about 'heart disease'"
  - Server executes `search_kaggle_datasets(query='heart disease')`
- Agent: "Download the dataset 'user/heart-disease-dataset'"
  - Server executes `download_kaggle_dataset(dataset_ref='user/heart-disease-dataset')`
- Agent: "Generate an EDA notebook prompt for 'user/heart-disease-dataset'"
  - Server executes `generate_eda_notebook(dataset_ref='user/heart-disease-dataset')` and returns a structured prompt message.
- Agent: (Sends the prompt to a code-generating model) -> Receives EDA Python code.
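If you want to script this flow instead of going through an agent, `fastmcp` also ships a Python client that can call the same tools directly. The following is a rough sketch, assuming a fastmcp 2.x-style `Client` and that the server is reachable as a local script; adjust the target to match how you actually run it.

```python
# Illustrative MCP client session (fastmcp 2.x style; not verified against this repo).
import asyncio
from fastmcp import Client

async def main():
    # Point the client at the server script; a URL or installed command also works.
    async with Client("server.py") as client:
        tools = await client.list_tools()
        print("Available tools:", [t.name for t in tools])

        # Mirror the agent conversation above.
        search_result = await client.call_tool(
            "search_kaggle_datasets", {"query": "heart disease"}
        )
        print(search_result)

        eda_prompt = await client.get_prompt(
            "generate_eda_notebook", {"dataset_ref": "user/heart-disease-dataset"}
        )
        print(eda_prompt)

asyncio.run(main())
```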