I’m very proud to announce the first release of Ruby-Stemmer, a Ruby implementation of the Snowball stemming algorithms built on the API of libstemmer_c.
It is not a pure Ruby implementation, but the external library is included with the download, so you don’t need to install anything else. Just run:
gem install ruby-stemmer
The usage is very simple:
require 'lingua/stemmer'
s = Lingua::Stemmer.new
s.stem "installation" # ==> install
You can change the language or the encoding by passing them to the Stemmer constructor. For example, to run the Romanian algorithm with ISO-8859-2 as the encoding, use:
s = Lingua::Stemmer.new(:language => 'ro', :encoding => 'ISO_8859_2')
s.stem "găinațul" # ==> găinaț
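Since the stemmer works on one word at a time, stemming a whole sentence means splitting it first. Here is a minimal sketch of that; the helper name stem_sentence is my own and not part of the gem, and I'm assuming that 'en' selects the English algorithm in the same way 'ro' selects Romanian above. The helper accepts any object that responds to stem, so the rest of the script still runs if the gem is missing.

```ruby
# Hypothetical helper (not part of ruby-stemmer): stems every
# whitespace-separated word of a sentence, lowercasing each word first.
# It works with any object that responds to #stem.
def stem_sentence(stemmer, sentence)
  sentence.split.map { |word| stemmer.stem(word.downcase) }
end

begin
  require 'lingua/stemmer'
  # Assumption: 'en' selects the English Snowball algorithm.
  stemmer = Lingua::Stemmer.new(:language => 'en')
  p stem_sentence(stemmer, "Installing several installations")
rescue LoadError
  warn "ruby-stemmer is not installed; run: gem install ruby-stemmer"
end
```

Lowercasing before stemming is a choice I made for the sketch, not something the library requires; drop it if you want to keep the original case.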
You can read the complete list of algorithms in modules.txt.
The code, released under the terms of MIT-LICENSE, is also available on GitHub:
git clone git://github.com/aurelian/ruby-stemmer.git
Please use the infrastructure provided by GitHub to report issues.