Stemming algorithms written in Python 3.

Snowball is a small string processing language for creating stemming
algorithms for use in Information Retrieval, plus a collection of
stemming algorithms implemented using it.

Snowball was originally designed and built by Martin Porter. Martin
retired from development in 2014 and Snowball is now maintained as a
community project. Martin originally chose the name Snowball as a
tribute to SNOBOL, the excellent string handling language from the
1960s. It now also serves as a metaphor for how the project grows by
gathering contributions over time.

Algorithms are available for the following languages:

- Arabic
- Armenian
- Basque
- Catalan
- Danish
- Dutch
- English (Standard, Porter)
- Finnish
- French
- German
- Greek
- Hindi
- Hungarian
- Indonesian
- Irish
- Italian
- Lithuanian
- Nepali
- Norwegian
- Portuguese
- Romanian
- Russian
- Serbian
- Spanish
- Swedish
- Tamil
- Turkish
- Yiddish
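
As a rough illustration of how the list above might be used, the sketch below maps a human-readable language name to a lowercase algorithm identifier and, if a stemmer package is installed, stems a few words. It assumes this README belongs to the `snowballstemmer` package on PyPI and that algorithms are selected by lowercase name (with `porter` for the original Porter variant of English). The helper names `algorithm_names` and `pick_algorithm` are hypothetical and not part of any library.

```python
# Language names copied from the list above. Treating the lowercase form as
# the algorithm identifier is an assumption made for this sketch.
LANGUAGES = [
    "Arabic", "Armenian", "Basque", "Catalan", "Danish", "Dutch", "English",
    "Finnish", "French", "German", "Greek", "Hindi", "Hungarian",
    "Indonesian", "Irish", "Italian", "Lithuanian", "Nepali", "Norwegian",
    "Portuguese", "Romanian", "Russian", "Serbian", "Spanish", "Swedish",
    "Tamil", "Turkish", "Yiddish",
]


def algorithm_names():
    """Return the assumed lowercase identifiers, sorted alphabetically."""
    names = [language.lower() for language in LANGUAGES]
    names.append("porter")  # English is listed with a second, Porter variant
    return sorted(names)


def pick_algorithm(language):
    """Normalise a language name, rejecting anything not in the list."""
    name = language.strip().lower()
    if name not in algorithm_names():
        raise ValueError(f"no stemming algorithm listed for {language!r}")
    return name


if __name__ == "__main__":
    try:
        # Assumption: the package described here is importable under this name.
        import snowballstemmer
    except ImportError:
        print("snowballstemmer is not installed; skipping the stemming demo")
    else:
        stemmer = snowballstemmer.stemmer(pick_algorithm("English"))
        print(stemmer.stemWords("We are the world".split()))
```

Validating the name up front keeps a typo such as `"Englsh"` from reaching the stemmer factory, so the error message can point at the list of supported languages instead.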