locknet.ro

[ANN] Ruby-Stemmer

I’m very proud to announce the first release of Ruby-Stemmer, an implementation of stemming algorithms for Ruby that uses the Snowball API from libstemmer_c.

That’s it. It is not a pure Ruby implementation, but the external library is included with the download, so you don’t need to install anything else. Just run: gem install ruby-stemmer.

The usage is very simple:

require 'lingua/stemmer'
s = Lingua::Stemmer.new
s.stem "installation" # ==> install
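
A common use for a stemmer is to collapse word variants onto a shared stem. The sketch below is only an illustration: group_by_stem is a hypothetical helper, not part of Ruby-Stemmer, and it assumes nothing more than an object that responds to stem, as the Lingua::Stemmer instance above does.

```ruby
# Load the gem if it is available; the helper itself does not depend on it.
begin
  require 'lingua/stemmer'
rescue LoadError
  warn 'ruby-stemmer is not installed; run: gem install ruby-stemmer'
end

# Hypothetical helper (not part of the gem): groups words by their stem.
# `stemmer` can be any object that responds to #stem.
def group_by_stem(stemmer, words)
  words.group_by { |word| stemmer.stem(word) }
end

# Only runs when the gem was loaded successfully.
if defined?(Lingua::Stemmer)
  s = Lingua::Stemmer.new
  p group_by_stem(s, %w[installation installed installs])
end
```

Passing the stemmer in as an argument keeps the helper independent of the language and encoding chosen when the stemmer is built.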

You can change the language or the encoding by passing them to the Stemmer constructor. For example, to run the Romanian algorithm with ISO-8859-2 encoding, use:

s = Lingua::Stemmer.new(:language => 'ro', :encoding => 'ISO_8859_2')
s.stem "găinațul" #==> găinaț

The complete list of supported algorithms is in modules.txt.

The code, released under the terms of the MIT license, is also available on GitHub:

git clone git://github.com/aurelian/ruby-stemmer.git

Note

Please use the infrastructure provided by GitHub to report issues.