Improve multinomial Naive Bayes classifier

Ideas:
- Feature selection: keep only a fixed proportion of the vocabulary, choosing the words with the highest information gain
- Length normalization: rescale each feature vector to the average vector length observed in the data
- Investigate locally weighted learning
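
A rough sketch of the first two ideas, in plain Python so it does not depend on the project's code. Everything in it is an assumption made for illustration: the function names are hypothetical, documents are lists of tokens, information gain is computed on word presence or absence, and "vector length" is taken to mean the Euclidean (L2) norm. Locally weighted learning is not covered.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy in bits of a distribution given as raw counts."""
    counts = [c for c in counts if c > 0]
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts)


def information_gain(docs, labels):
    """Information gain of each word's presence/absence w.r.t. the class label.

    docs is a list of token lists, labels the matching list of class labels.
    """
    n = len(docs)
    class_counts = Counter(labels)
    base = entropy(class_counts.values())

    # For each word, count the documents of each class that contain it.
    present = {}
    for tokens, label in zip(docs, labels):
        for word in set(tokens):
            present.setdefault(word, Counter())[label] += 1

    gains = {}
    for word, with_word in present.items():
        n_with = sum(with_word.values())
        n_without = n - n_with
        without = [class_counts[c] - with_word[c] for c in class_counts]
        conditional = (n_with / n) * entropy(with_word.values()) \
            + (n_without / n) * entropy(without)
        gains[word] = base - conditional
    return gains


def select_top_words(gains, proportion):
    """Keep the given proportion of words, ranked by gain (ties broken alphabetically)."""
    keep = max(1, round(len(gains) * proportion))
    ranked = sorted(gains, key=lambda w: (-gains[w], w))
    return set(ranked[:keep])


def normalize_to_average_length(vectors):
    """Rescale each {word: weight} vector so its L2 norm equals the average norm."""
    norms = [math.sqrt(sum(v * v for v in vec.values())) for vec in vectors]
    if not norms:
        return []
    average = sum(norms) / len(norms)
    # All-zero vectors cannot be rescaled and are returned unchanged.
    return [
        {w: v * average / norm for w, v in vec.items()} if norm > 0 else dict(vec)
        for vec, norm in zip(vectors, norms)
    ]
```

In a scikit-learn pipeline the same steps would probably be expressed with the library's own feature-selection and vectorizer classes; the version above is only meant to make the two ideas concrete enough to discuss.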

Resources:
- [A Comparative Study on Feature Selection in Text Categorization](http://courses.ischool.berkeley.edu/i256/f06/papers/yang97comparative.pdf)
- [Multinomial Naive Bayes for Text Categorization Revisited](http://www.cs.waikato.ac.nz/ml/publications/2004/kibriya_et_al_cr.pdf)
- [scikit-learn TfidfVectorizer](http://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.TfidfVectorizer.html)
