Scrapers for collecting structured data from AI chat platforms like ChatGPT, Perplexity, Gemini, Claude, Copilot, Groq, Google AI Mode, and more.

scrapelesshq/LLM-chat-scraper

LLM-Chat-Scraper

Scrapers for collecting structured data from AI chat platforms
ChatGPT, Perplexity, Gemini, Claude, Copilot, Groq, Google AI Mode and more — built for researchers, integrators and automation pipelines.


First, install the SDK

# Install the official Scrapeless SDK
npm install @scrapeless-ai/sdk

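Next, obtain an API key from Scrapeless (scrapeless.com). Each scraper reads it from the SCRAPELESS_API_KEY variable in its .env file.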


ChatGPT Scraper

You can use the ChatGPT scraper contained in this repository to collect structured conversation data. Clone or browse the scraper at: https://github.com/scrapelesshq/LLM-chat-scraper/tree/main/chatgpt_scraper and follow these quick steps to run it locally:

git clone https://github.com/scrapelesshq/LLM-chat-scraper.git
cd LLM-chat-scraper/chatgpt_scraper
npm install
cp .env.example .env
# edit .env and add at least SCRAPELESS_API_KEY=your_api_key_here
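
The same `.env` step applies to every scraper in this repository. As a minimal sketch of how a script could fail fast when the key is missing or still set to the placeholder, the helper below checks an environment object for `SCRAPELESS_API_KEY`. The function name `requireApiKey` is hypothetical; it is not part of this repository or the Scrapeless SDK, and how the scrapers actually load `.env` is not shown here.

```typescript
// Hypothetical helper: not part of the repository or the Scrapeless SDK.
type Env = Record<string, string | undefined>;

// Returns the trimmed API key, or throws if it is absent or is the
// placeholder value copied from .env.example.
function requireApiKey(env: Env): string {
  const key = env["SCRAPELESS_API_KEY"]?.trim();
  if (!key || key === "your_api_key_here") {
    throw new Error(
      "SCRAPELESS_API_KEY is missing or still set to the placeholder; edit .env first."
    );
  }
  return key;
}

// Usage in Node.js, once .env has been loaded into the process environment:
// const apiKey = requireApiKey(process.env);
```

Passing the environment in as an argument keeps the check easy to test without touching the real process environment.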

Important: after configuring .env, you need to edit src/chatgpt.ts to replace placeholder values:

| Field | Description |
| --- | --- |
| task_id | A unique identifier for this scraping task |
| proxy_url | Your proxy URL |
| prompt | The message or query you want ChatGPT to respond to |
| webhook | Optional webhook URL to send results to |
| web_search | Enable web search functionality (true/false) |
| timeout | Maximum wait time for responses, in milliseconds |
| session_name | Optional name for the browser session |
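
To make the placeholder values easier to check before a run, the fields above can be described as a TypeScript type with a small validator. This is only a sketch: the field names come from the table, but the types, the `ChatGptTaskInput` and `validateChatGptInput` names, and the validation rules are assumptions, not the actual contents of src/chatgpt.ts.

```typescript
// Hypothetical type: field names follow the table above; the types are
// inferred from the descriptions and may differ from src/chatgpt.ts.
interface ChatGptTaskInput {
  task_id: string;
  proxy_url: string;
  prompt: string;
  webhook?: string; // optional
  web_search: boolean;
  timeout: number; // milliseconds
  session_name?: string; // optional
}

// Returns a list of human-readable problems; an empty list means the
// input passed these (assumed) basic checks.
function validateChatGptInput(input: ChatGptTaskInput): string[] {
  const problems: string[] = [];
  if (input.task_id.trim() === "") problems.push("task_id must not be empty");
  if (input.proxy_url.trim() === "") problems.push("proxy_url must not be empty");
  if (input.prompt.trim() === "") problems.push("prompt must not be empty");
  if (!Number.isFinite(input.timeout) || input.timeout <= 0) {
    problems.push("timeout must be a positive number of milliseconds");
  }
  if (input.webhook !== undefined && !/^https?:\/\//i.test(input.webhook)) {
    problems.push("webhook must be an http(s) URL when provided");
  }
  return problems;
}

// Example values are placeholders for illustration only.
const exampleInput: ChatGptTaskInput = {
  task_id: "demo-task-1",
  proxy_url: "http://proxy.example:8080",
  prompt: "Summarize the latest TypeScript release notes",
  web_search: true,
  timeout: 120000,
};
const exampleProblems = validateChatGptInput(exampleInput);
```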

Perplexity Scraper

You can use the Perplexity scraper contained in this repository to collect structured conversation data. Clone or browse the scraper at: https://github.com/scrapelesshq/LLM-chat-scraper/tree/main/perplexity_scraper and follow these quick steps to run it locally:

git clone https://github.com/scrapelesshq/LLM-chat-scraper.git
cd LLM-chat-scraper/perplexity_scraper
npm install
cp .env.example .env
# edit .env and add at least SCRAPELESS_API_KEY=your_api_key_here

Important: after configuring .env, you need to edit src/perplexity.ts to replace placeholder values:

| Field | Description |
| --- | --- |
| proxyCountry | Country for proxy routing (e.g., "ANY", "US", "BR") |
| sessionName | Name of the browser session (e.g., "perplexity-scraper") |
| prompt | Your Perplexity query or instruction |
| timeout | Maximum wait time for the response, in milliseconds |
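
One way to keep these four values consistent across runs is to merge per-run overrides onto a set of defaults. The sketch below assumes the field types from the table; the `ScraperOptions` and `buildOptions` names and the 60000 ms default timeout are hypothetical and are not taken from src/perplexity.ts.

```typescript
// Hypothetical shape based on the four fields in the table above.
interface ScraperOptions {
  proxyCountry: string;
  sessionName: string;
  prompt: string;
  timeout: number; // milliseconds
}

// Assumed defaults: "ANY" and "perplexity-scraper" appear as examples in
// the table; the timeout value is an arbitrary illustration.
const DEFAULT_OPTIONS: Omit<ScraperOptions, "prompt"> = {
  proxyCountry: "ANY",
  sessionName: "perplexity-scraper",
  timeout: 60000,
};

// Merges overrides onto the defaults and applies basic sanity checks.
function buildOptions(
  prompt: string,
  overrides: Partial<Omit<ScraperOptions, "prompt">> = {}
): ScraperOptions {
  const options: ScraperOptions = { ...DEFAULT_OPTIONS, ...overrides, prompt };
  options.proxyCountry = options.proxyCountry.trim().toUpperCase();
  if (options.prompt.trim() === "") {
    throw new Error("prompt must not be empty");
  }
  if (!(options.timeout > 0)) {
    throw new Error("timeout must be a positive number of milliseconds");
  }
  return options;
}

// Usage: only the prompt and the proxy country differ from the defaults.
const perplexityOptions = buildOptions("What is web scraping?", { proxyCountry: "us" });
```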

Gemini Scraper

You can use the Gemini scraper contained in this repository to collect structured conversation data. Clone or browse the scraper at: https://github.com/scrapelesshq/LLM-chat-scraper/tree/main/gemini_scraper and follow these quick steps to run it locally:

git clone https://github.com/scrapelesshq/LLM-chat-scraper.git
cd LLM-chat-scraper/gemini_scraper
npm install
cp .env.example .env
# edit .env and add at least SCRAPELESS_API_KEY=your_api_key_here

Important: after configuring .env, you need to edit src/gemini.ts to replace placeholder values:

| Field | Description |
| --- | --- |
| proxyCountry | Country for proxy routing (e.g., "ANY", "US", "BR") |
| sessionName | Name of the browser session (e.g., "gemini-scraper") |
| prompt | Your Gemini query or instruction |
| timeout | Maximum wait time for the response, in milliseconds |

Google AI Overview Scraper

You can use the Google AI Overview Scraper contained in this repository to collect structured conversation data. Clone or browse the scraper at: https://github.com/scrapelesshq/LLM-chat-scraper/tree/main/google_ai_overview_scraper and follow these quick steps to run it locally:

git clone https://github.com/scrapelesshq/LLM-chat-scraper.git
cd LLM-chat-scraper/google_ai_overview_scraper
npm install
cp .env.example .env
# edit .env and add at least SCRAPELESS_API_KEY=your_api_key_here

Important: after configuring .env, you need to edit src/ai_overview.ts to replace placeholder values:

| Field | Description |
| --- | --- |
| proxyCountry | Country for proxy routing (e.g., "ANY", "US", "BR") |
| sessionName | Name of the browser session (e.g., "google-ai-overview-scraper") |
| prompt | Your Google AI Overview query or instruction |
| timeout | Maximum wait time for the response, in milliseconds |

Google AI Mode Scraper

You can use the Google AI Mode Scraper contained in this repository to collect structured conversation data. Clone or browse the scraper at: https://github.com/scrapelesshq/LLM-chat-scraper/tree/main/google_ai_mode_scraper and follow these quick steps to run it locally:

git clone https://github.com/scrapelesshq/LLM-chat-scraper.git
cd LLM-chat-scraper/google_ai_mode_scraper
npm install
cp .env.example .env
# edit .env and add at least SCRAPELESS_API_KEY=your_api_key_here

Important: after configuring .env, you need to edit src/ai_mode.ts to replace placeholder values:

| Field | Description |
| --- | --- |
| proxyCountry | Country for proxy routing (e.g., "ANY", "US", "BR") |
| sessionName | Name of the browser session (e.g., "google-ai-mode-scraper") |
| prompt | Your Google AI Mode query or instruction |
| timeout | Maximum wait time for the response, in milliseconds |
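
All five scrapers follow the same layout: one directory per platform and one entry file to edit. As a quick reference, that layout can be written as a lookup table. The directory and file names below are taken from the sections above; the `SCRAPERS` constant, the `ScraperName` keys, and the `entryPath` helper are hypothetical names used only for illustration.

```typescript
// Keys are illustrative labels, not identifiers used by the repository.
type ScraperName =
  | "chatgpt"
  | "perplexity"
  | "gemini"
  | "google-ai-overview"
  | "google-ai-mode";

interface ScraperEntry {
  directory: string; // folder inside the cloned repository
  entryFile: string; // file whose placeholder values need editing
}

// Directory and entry-file names as listed in the sections above.
const SCRAPERS: Record<ScraperName, ScraperEntry> = {
  chatgpt: { directory: "chatgpt_scraper", entryFile: "src/chatgpt.ts" },
  perplexity: { directory: "perplexity_scraper", entryFile: "src/perplexity.ts" },
  gemini: { directory: "gemini_scraper", entryFile: "src/gemini.ts" },
  "google-ai-overview": {
    directory: "google_ai_overview_scraper",
    entryFile: "src/ai_overview.ts",
  },
  "google-ai-mode": { directory: "google_ai_mode_scraper", entryFile: "src/ai_mode.ts" },
};

// Builds the path to a scraper's entry file relative to the parent of the clone.
function entryPath(name: ScraperName): string {
  const { directory, entryFile } = SCRAPERS[name];
  return `LLM-chat-scraper/${directory}/${entryFile}`;
}

// Usage: entryPath("gemini") yields "LLM-chat-scraper/gemini_scraper/src/gemini.ts".
```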

📄 License

This project is licensed under the Apache License - see the LICENSE file for details.

📞 Support

🏢 About Scrapeless

Scrapeless is a powerful web scraping and browser automation platform that helps businesses extract data from any website at scale. Our platform provides:

  • High-performance web scraping infrastructure
  • Global proxy network
  • Browser automation capabilities
  • Enterprise-grade reliability and support

Visit scrapeless.com to learn more and get started.
