Skip to content

reddavis/Feature-Selection

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

26 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Feature Selection

Feature Selection is a library of feature selection algorithms.

en.wikipedia.org/wiki/Feature_selection

Install

gem sources -a http://gemcutter.org
sudo gem install feature_selection

How To Use

There are currently 3 implemented feature collections: Chi-Squared, Mutual Information and Frequency Based.

Each return a hash that looks similar to:

{klass => {term => score, term => score}, klass => {term => score}}

Example:

data = {
        :spam => [['this', 'is', 'some', 'information'], ['this', 'is', 'something', 'that', 'is', 'information']],
        :ham => [['this', 'test', 'some', 'more', 'information'], ['there', 'are', 'some', 'things']],
        }

a = FeatureSelection::ChiSquared.new(data)

# You can also use...
# FeatureSelection::MutualInformation
# FeatureSelection::FrequencyBased

a.rank_features
  #=> {:spam => {term => score, term => score}, :ham => {term => score}}

Logging

There are two ways to log the activity:

# Provide a path of somewhere to log to
log = File.expand_path(File.dirname(__FILE__) + '/log.txt')
FeatureSelection::MutualInformation.new(data, :log_to => log)

# Provide an existing Logger object
log = Logger.new('log.txt')
FeatureSelection::MutualInformation.new(data, :log_to => log)

Copyright © 2009 reddavis. See LICENSE for details.

About

A library of feature selection algorithms.

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Languages