Feedback

Gave this a try :-)

Feedback:

- If this library works as advertised it'd be huge!
- `mlscraper.html` is missing from the PyPI package.
- When no scraper can be found, the error message could be more helpful:
  `mlscraper.training.NoScraperFoundException: did not find scraper`
  It would be nice if the error message gave some guidance as to which fields
  couldn't be found in the HTML.
  Even at DEBUG log level the output isn't really helpful.
- See more notes in my script below.
- Training the scraper was really slow (I gave up after 15 min).

```py
import requests
from mlscraper.html import Page
from mlscraper.samples import Sample, TrainingSet
from mlscraper.training import train_scraper

jonas_url = "https://github.com/jonashaag"
resp = requests.get(jonas_url)
resp.raise_for_status()

page = Page(resp.content)
sample = Sample(
    page,
    {
        "name": "Jonas Haag",
        "followers": "329",  # Does not work if 329 is passed as an int.
        # "company": "@QuantCo",  # Does not work.
        "twitter": "@_jonashaag",  # Does not work without the "@".
        "username": "jonashaag",
        "nrepos": "282",
    },
)

training_set = TrainingSet()
training_set.add_sample(sample)

scraper = train_scraper(training_set)

# Apply the trained scraper to a second profile page.
resp = requests.get("https://github.com/lorey")
resp.raise_for_status()
result = scraper.get(Page(resp.content))
print(result)
```
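
To illustrate the kind of guidance I mean for the error message, here is a rough sketch of a pre-flight check that reports which sample values never show up in the page text. It is plain standard-library Python and does not use any mlscraper internals; `find_missing_values` and `_TextCollector` are hypothetical names I made up for this example, and the HTML snippet is invented. Values are coerced to `str` first, which would also sidestep the int problem noted in the script.

```py
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collect every text node of an HTML document (script/style content included)."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def find_missing_values(html, expected):
    """Return the entries of `expected` whose value does not occur in the page text.

    Hypothetical helper, not part of mlscraper. Values are converted to str,
    and whitespace is collapsed on both sides before the substring check.
    """
    collector = _TextCollector()
    collector.feed(html)
    collector.close()
    text = " ".join("".join(collector.chunks).split())
    return {
        field: value
        for field, value in expected.items()
        if " ".join(str(value).split()) not in text
    }


# Invented example page; only "twitter" is absent from its text.
html = "<html><body><h1>Jonas Haag</h1><span class='followers'>329</span></body></html>"
missing = find_missing_values(
    html,
    {"name": "Jonas Haag", "followers": 329, "twitter": "@_jonashaag"},
)
print(missing)  # {'twitter': '@_jonashaag'}
```

This only looks at text nodes, so a value that lives in an attribute (for example an `href`) would be reported as missing. Still, listing the offending fields like this would already make `NoScraperFoundException` much easier to act on.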

Feedback #19
