Reading the rules from an external file
```python
from bulstem.stem import BulStemmer

# Pre-defined names of rule sets
PRE_DEFINED_RULES = ['stem-context-1',
                     'stem-context-2',
                     'stem-context-3']

# Expected output:
# 1 втор
# 2 втори
# 3 вторият
for i, rules_name in enumerate(PRE_DEFINED_RULES, start=1):
    stemmer = BulStemmer.from_file(rules_name, min_freq=2, left_context=i)
    print(i, stemmer.stem('вторият'))

# Load the rules from an external file instead of a pre-defined set
stemmer = BulStemmer.from_file('stem_rules_context_2_utf8.txt', min_freq=2, left_context=2)
stemmer.stem('вторият')   # Expected output: 'втори'
stemmer.stem('вероятен')  # Expected output: 'вероят'
```
`BulStemmer.from_file` parameters:
1. `path` - Path to the rules file, or the name of a pre-defined rule set. The file is formatted as follows: `word ==> stem ==> freq`.
2. `min_freq` - The minimum frequency a rule must have to be used when stemming.
3. `left_context` - Size of the prefix which will not be stemmed.
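
To make the `word ==> stem ==> freq` format and the `min_freq` threshold more concrete, here is a minimal sketch of how such a file could be parsed. It is not the library's actual loader: `load_rules` is a hypothetical helper, the sample frequencies are made up for illustration, and treating `min_freq` as an inclusive threshold is an assumption.

```python
import io


def load_rules(lines, min_freq=1):
    """Parse 'word ==> stem ==> freq' lines into a {word: stem} dict.

    Hypothetical helper for illustration only, not part of bulstem.
    """
    rules = {}
    for line in lines:
        parts = [part.strip() for part in line.split('==>')]
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        word, stem, freq = parts
        if not freq.isdigit():
            continue  # skip lines whose frequency is not a number
        # Assumption: a rule is kept when its frequency is at least min_freq.
        if int(freq) >= min_freq:
            rules[word] = stem
    return rules


# Made-up sample data in the documented format (frequencies are invented).
sample = io.StringIO(
    "вторият ==> втори ==> 5\n"
    "вероятен ==> вероят ==> 1\n"
)

rules = load_rules(sample, min_freq=2)
print(rules)  # only the rule with frequency 5 passes the threshold
```

With `min_freq=2`, the second sample rule (frequency 1) is dropped, so raising `min_freq` trades coverage for rules that are better supported by the data.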