Description
Create a new application inspired by the Instant Data Scraper Chrome extension, but with additional functionalities:
- Automatically detect and extract structured data (tables, lists, grids) from arbitrary web pages, using heuristics or AI analysis.
- Allow the user to manually select lists, tables, or grid elements on the page for data extraction, in cases where automatic detection is insufficient or inaccurate.
- Enable users to save their manual selections as named "work profiles" that can be loaded and reused on similar or recurring web pages.
- Include export options for CSV, Excel, JSON, and integration hooks (e.g., Google Sheets, Airtable, Zapier).
- Ensure user privacy by processing data locally in the browser whenever possible.
- Optionally support scraping of paginated or dynamically loaded content (AJAX/infinite scroll).
This feature will improve usability for users dealing with complex or inconsistent web pages and streamline repetitive scraping tasks.
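As an illustration of the "work profiles" idea above, here is a minimal sketch of how a saved profile might be matched against the current page. The profile shape (`name`, `urlPattern`, `rowSelector`, `columns`), the `*` wildcard syntax, and the function names are all assumptions made for this example; the request itself does not specify them.

```javascript
// Hypothetical work-profile shape: a name, a URL pattern with "*" wildcards,
// and the CSS selectors the user picked manually.
function makeProfile(name, urlPattern, rowSelector, columns) {
  return { name, urlPattern, rowSelector, columns };
}

// Convert a wildcard pattern such as "https://shop.example/*/products*" to a RegExp.
function patternToRegExp(pattern) {
  // Escape every regex metacharacter except "*", which becomes ".*" below.
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped.replace(/\*/g, ".*") + "$");
}

// Return every saved profile whose pattern matches the given page URL.
function findProfiles(profiles, url) {
  return profiles.filter((p) => patternToRegExp(p.urlPattern).test(url));
}

const saved = [
  makeProfile("Product list", "https://shop.example/*/products*", "li.product", {
    title: "h2",
    price: ".price",
  }),
  makeProfile("Job board", "https://jobs.example/search*", "div.job-card", {
    role: ".role",
  }),
];

const matches = findProfiles(saved, "https://shop.example/en/products?page=2");
console.log(matches.map((p) => p.name)); // [ 'Product list' ]
```

In an extension, the `saved` array could presumably be persisted with `chrome.storage.local`, which would keep profiles on the user's machine in line with the privacy goal.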
How it works:
The extension uses AI-based heuristics to analyze the HTML structure of web pages and identify sections containing structured or tabular data.
It requires no custom scripts or site-specific modules; instead, it scans each page automatically to find tables or lists.
Data identification and selection:
Automatic scanning:
Analyzes HTML table elements (<table>)
Identifies lists (<ul>, <ol>)
Detects grids built with
Recognizes repeating data blocks
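One possible way to recognize repeating data blocks is to look for a container whose direct children share the same tag and class names. The sketch below is an assumption about how that could work, not a description of the Instant Data Scraper algorithm. It only reads `tagName`, `className`, and `children`, so it runs on real DOM elements as well as on the plain objects used here for illustration; `signature`, `findRepeatingBlock`, and the `minRepeats` threshold are hypothetical names.

```javascript
// Structural signature of a node: lowercase tag name plus sorted class names.
function signature(node) {
  const raw = typeof node.className === "string" ? node.className : "";
  const classes = raw.split(/\s+/).filter(Boolean).sort();
  return node.tagName.toLowerCase() + (classes.length ? "." + classes.join(".") : "");
}

// Walk the tree and return the container with the most same-signature children.
function findRepeatingBlock(root, minRepeats = 3) {
  let best = null;
  const stack = [root];
  while (stack.length > 0) {
    const node = stack.pop();
    const counts = new Map();
    for (const child of Array.from(node.children || [])) {
      const sig = signature(child);
      counts.set(sig, (counts.get(sig) || 0) + 1);
      stack.push(child);
    }
    for (const [sig, count] of counts) {
      if (count >= minRepeats && (best === null || count > best.count)) {
        best = { container: node, signature: sig, count };
      }
    }
  }
  return best;
}

// Tiny stand-in for a DOM tree (in a content script, pass document.body instead).
const el = (tagName, className = "", children = []) => ({ tagName, className, children });
const page = el("BODY", "", [
  el("NAV", "", [el("A"), el("A")]),
  el("DIV", "results", [
    el("DIV", "card item"),
    el("DIV", "item card"),
    el("DIV", "card item"),
    el("DIV", "ad"),
  ]),
]);

const found = findRepeatingBlock(page);
console.log(found.signature, found.count); // div.card.item 3
```

Sorting the class names makes `card item` and `item card` count as the same block type, which matters on pages that emit classes in inconsistent order.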
Smart selection:
Uses AI to determine which tables or sections are most likely to contain useful information
Allows preview of detected data
Offers the option for manual adjustment if the automatic prediction is not accurate
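The request does not say how "most likely to contain useful information" would be decided. As a placeholder for the AI step, the sketch below ranks candidates with a plain heuristic: more rows, a consistent column count, and fewer empty cells score higher. The scoring formula and the names `scoreCandidate` and `rankCandidates` are assumptions for illustration only.

```javascript
// Score a candidate given as an array of rows, each row an array of cell strings.
function scoreCandidate(rows) {
  if (rows.length === 0) return 0;
  // How many rows share the most common column count?
  const widths = new Map();
  for (const row of rows) widths.set(row.length, (widths.get(row.length) || 0) + 1);
  const consistency = Math.max(...widths.values()) / rows.length;
  // What fraction of cells contain non-whitespace text?
  const cells = rows.flat();
  const filled = cells.filter((c) => c.trim() !== "").length;
  const fillRatio = cells.length > 0 ? filled / cells.length : 0;
  return rows.length * consistency * fillRatio;
}

// Attach a score to each candidate and sort best-first.
function rankCandidates(candidates) {
  return candidates
    .map((c) => ({ ...c, score: scoreCandidate(c.rows) }))
    .sort((a, b) => b.score - a.score);
}

const ranked = rankCandidates([
  { id: "nav", rows: [["Home"], ["About"]] },
  {
    id: "products",
    rows: [
      ["Widget", "9.99", "In stock"],
      ["Gadget", "24.50", "In stock"],
      ["Doohickey", "3.00", "Sold out"],
      ["Gizmo", "12.00", "In stock"],
    ],
  },
  { id: "ragged", rows: [["a", "b"], ["c"], ["", "d"]] },
]);

console.log(ranked[0].id); // products
```

The ranked list could feed the preview step, with the top candidate preselected and the rest offered as alternatives when the first guess is wrong.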
Technical features:
Supports dynamic content (infinite scroll or AJAX tables)
Can extract data from multiple pages (pagination)
Collects not only text but also links and image sources
Runs entirely in the browser using:
JavaScript for DOM manipulation
Chrome extension APIs for data access
Does not send data externally for processing
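To make the "text, links and image sources" point concrete, here is a rough sketch of pulling all three from one cell or block using only standard DOM calls (`textContent`, `querySelectorAll`). The function name `extractCell` and the returned shape are assumptions; the stub object below only imitates the two DOM members the function touches so the example can run outside a browser.

```javascript
// Collect normalized text, link targets, and image sources from one element.
function extractCell(element) {
  const links = Array.from(element.querySelectorAll("a[href]"), (a) => a.href);
  const images = Array.from(element.querySelectorAll("img[src]"), (img) => img.src);
  // Collapse runs of whitespace so multi-line markup yields a single clean string.
  const text = (element.textContent || "").replace(/\s+/g, " ").trim();
  return { text, links, images };
}

// Stand-in for a DOM element; in a content script, pass a real Element.
const fakeCell = {
  textContent: "  Blue   widget\n $9.99 ",
  querySelectorAll(selector) {
    if (selector === "a[href]") return [{ href: "https://shop.example/widget" }];
    if (selector === "img[src]") return [{ src: "https://shop.example/widget.png" }];
    return [];
  },
};

const cell = extractCell(fakeCell);
console.log(cell.text);   // Blue widget $9.99
console.log(cell.links);  // [ 'https://shop.example/widget' ]
console.log(cell.images); // [ 'https://shop.example/widget.png' ]
```

Everything here runs inside the page, so nothing needs to leave the browser for this step.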
Export and processing:
Export formats include:
CSV
Excel
JSON
Direct export to Google Sheets
Integration with Airtable and Zapier
Allows renaming and filtering of columns before export
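Renaming and filtering columns before export could be handled by a single column list that both selects keys and supplies output labels. The sketch below shows this for CSV, using RFC 4180 style quoting; `toCsv`, `csvEscape`, and the `{ key, label }` column shape are hypothetical names chosen for this example.

```javascript
// Quote a value only when it contains a comma, quote, or line break.
function csvEscape(value) {
  const s = value === null || value === undefined ? "" : String(value);
  return /[",\r\n]/.test(s) ? '"' + s.replace(/"/g, '""') + '"' : s;
}

// columns: ordered list of { key, label }. Keys not listed are dropped (filtering),
// and label replaces the original key in the header row (renaming).
function toCsv(rows, columns) {
  const header = columns.map((c) => csvEscape(c.label)).join(",");
  const lines = rows.map((row) => columns.map((c) => csvEscape(row[c.key])).join(","));
  return [header, ...lines].join("\r\n");
}

const rows = [
  { title: 'Widget, "blue"', price: 9.99, internalId: 17 },
  { title: "Gadget", price: 24.5, internalId: 18 },
];

const csv = toCsv(rows, [
  { key: "title", label: "Product" },
  { key: "price", label: "Price (USD)" },
]);

console.log(csv);
// Product,Price (USD)
// "Widget, ""blue""",9.99
// Gadget,24.5
```

The same column list could drive the JSON and spreadsheet exports, so the user's renaming and filtering choices would apply uniformly across formats.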