Unsupervised

Explore the Data Using Pandas-
typo: "interpretation. <3 your data"

Why not apply some of the preprocessing techniques from the last lesson here on the music reviews data?

Creating the DTM using scikit-learn-
Explanation needed for why it's necessary to remove numbers.

Topic Modeling-
typo: "what the ext is about" -> "text"
The paragraph on the "theory" behind LDA is very dense and difficult to parse.

It is unnecessary to fit-transform both the TF-IDF vectorizer and CountVectorizer here; one or the other is fine.

Fitting the LDA model raises an error because the n_topics parameter has been renamed to n_components in scikit-learn:
"LatentDirichletAllocation(n_topics=10...)" -> "LatentDirichletAllocation(n_components=10...)"

It might be nice to include an interpretation of the 10 topics identified by the model.

The cosine similarity example at the end of the notebook raises an error.
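
The exact error is not recorded here, so the following is only a guess at one common cause: `sklearn.metrics.pairwise.cosine_similarity` expects 2D inputs (one row per sample), and passing 1D vectors fails. A minimal sketch with made-up vectors, assuming that is the problem:

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# Made-up document vectors; the notebook would use rows of its own DTM.
a = np.array([1.0, 0.0, 1.0])
b = np.array([1.0, 1.0, 0.0])

# Reshape each 1D vector to a single-row 2D array before comparing.
sim_matrix = cosine_similarity(a.reshape(1, -1), b.reshape(1, -1))
sim = sim_matrix[0, 0]

print(round(float(sim), 3))  # approximately 0.5
```

If the notebook's error turns out to have a different cause, the traceback should be added to this issue so the fix can be targeted.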

Further resources-
The link for the blog post is broken. Remove it?

Unsupervised #9
