abydos.stemmer package¶
abydos.stemmer.
The stemmer package collects stemmer classes for a number of languages including:
English stemmers:
German stemmers:
- Caumanns' (Caumanns)
- CLEF German (CLEFGerman)
- CLEF German Plus (CLEFGermanPlus)
- Snowball German (SnowballGerman)
Swedish stemmers:
- CLEF Swedish (CLEFSwedish)
- Snowball Swedish (SnowballSwedish)
Latin stemmer:
- Schinke (Schinke)
Danish stemmer:
- Snowball Danish (SnowballDanish)
Dutch stemmer:
- Snowball Dutch (SnowballDutch)
Norwegian stemmer:
- Snowball Norwegian (SnowballNorwegian)
Each stemmer has a stem method, which takes a word and returns its stemmed form:
>>> stmr = Porter()
>>> stmr.stem('democracy')
'democraci'
>>> stmr.stem('trusted')
'trust'
-
class
abydos.stemmer.
_Snowball
[source]¶ Bases:
abydos.stemmer._stemmer._Stemmer
Snowball stemmer base class.
New in version 0.3.6.
-
_codanonvowels
= {"'", 'b', 'c', 'd', 'f', 'g', 'h', 'j', 'k', 'l', 'm', 'n', 'p', 'q', 'r', 's', 't', 'v', 'z'}¶
-
_sb_ends_in_short_syllable
(term)[source]¶ Return True iff term ends in a short syllable.
(...according to the Porter2 specification.)
NB: This is akin to the CVC test from the Porter stemmer. The description in the Porter2 specification is unfortunately ambiguous.
- Parameters
term (str) -- The term to examine
- Returns
True iff term ends in a short syllable
- Return type
bool
New in version 0.1.0.
Changed in version 0.3.6: Encapsulated in class
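As a rough illustration of the test described above, the following sketch checks whether a term ends in a short syllable. It is one minimal reading of the Porter2 definition, not the library's code: the function name ends_in_short_syllable and the module-level constants are assumptions, the two sets mirror the _vowels and _codanonvowels attributes listed for this class, and Porter2's special treatment of consonantal Y is ignored.

```python
# Sets mirroring the _vowels and _codanonvowels class attributes.
VOWELS = frozenset("aeiouy")
CODA_NON_VOWELS = frozenset("'bcdfghjklmnpqrstvz")  # non-vowels other than w, x, y


def ends_in_short_syllable(term):
    """Return True if term ends in a short syllable (simplified Porter2 reading)."""
    if len(term) == 2:
        # Case (b): a vowel at the start of the word followed by a non-vowel.
        return term[0] in VOWELS and term[1] not in VOWELS
    if len(term) >= 3:
        # Case (a): non-vowel, vowel, then a non-vowel other than w, x, y.
        return (
            term[-3] not in VOWELS
            and term[-2] in VOWELS
            and term[-1] in CODA_NON_VOWELS
        )
    return False


print(ends_in_short_syllable("bed"))   # consonant-vowel-consonant ending
print(ends_in_short_syllable("bead"))  # the vowel is preceded by another vowel
```

Under this reading, 'bed' and 'rap' end in a short syllable, while 'bead' and 'box' do not.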
-
_sb_has_vowel
(term)[source]¶ Return True iff a vowel exists in the term.
- Parameters
term (str) -- The term to examine
- Returns
True iff a vowel exists in the term (as defined in the Porter stemmer definition)
- Return type
bool
New in version 0.1.0.
Changed in version 0.3.6: Encapsulated in class
-
_sb_r1
(term, r1_prefixes=None)[source]¶ Return the R1 region, as defined in the Porter2 specification.
- Parameters
term (str) -- The term to examine
r1_prefixes (set) -- Prefixes to consider
- Returns
Length of the R1 region
- Return type
int
New in version 0.1.0.
Changed in version 0.3.6: Encapsulated in class
-
_sb_r2
(term, r1_prefixes=None)[source]¶ Return the R2 region, as defined in the Porter2 specification.
- Parameters
term (str) -- The term to examine
r1_prefixes (set) -- Prefixes to consider
- Returns
Length of the R2 region
- Return type
int
New in version 0.1.0.
Changed in version 0.3.6: Encapsulated in class
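The R1 and R2 regions can be sketched as follows. In the Porter2 specification, R1 is the region after the first non-vowel that follows a vowel, and R2 is the same construction applied inside R1. This is an illustrative sketch only: the function names are assumptions, the functions return the index at which each region starts (which may differ from what the methods above return), and the handling of r1_prefixes is a guess at how such prefixes would be used.

```python
VOWELS = frozenset("aeiouy")


def region_start(term, start=0):
    """Return the index just past the first non-vowel that follows a vowel."""
    for i in range(start + 1, len(term)):
        if term[i] not in VOWELS and term[i - 1] in VOWELS:
            return i + 1
    # No such non-vowel: the region is empty and starts at the end of the word.
    return len(term)


def r1_start(term, r1_prefixes=None):
    """Return the start of R1, honouring optional exceptional prefixes (assumed)."""
    for prefix in r1_prefixes or ():
        if term.startswith(prefix):
            return len(prefix)
    return region_start(term)


def r2_start(term, r1_prefixes=None):
    """Return the start of R2, i.e. the R1 construction applied within R1."""
    return region_start(term, r1_start(term, r1_prefixes))


word = "beautiful"
print(word[r1_start(word):])  # iful
print(word[r2_start(word):])  # ul
```

For 'beautiful', R1 is 'iful' and R2 is 'ul', which matches the worked example in the Snowball documentation.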
-
_sb_short_word
(term, r1_prefixes=None)[source]¶ Return True iff term is a short word.
(...according to the Porter2 specification.)
- Parameters
term (str) -- The term to examine
r1_prefixes (set) -- Prefixes to consider
- Returns
True iff term is a short word
- Return type
bool
New in version 0.1.0.
Changed in version 0.3.6: Encapsulated in class
-
_vowels
= {'a', 'e', 'i', 'o', 'u', 'y'}¶
-
-
class
abydos.stemmer.
Lovins
[source]¶ Bases:
abydos.stemmer._stemmer._Stemmer
Lovins stemmer.
The Lovins stemmer is described in Julie Beth Lovins's article [Lov68].
New in version 0.3.6.
Initialize the stemmer.
New in version 0.3.6.
-
_cond_aa
(word, suffix_len)[source]¶ Return Lovins' condition AA.
- Parameters
word (str) -- Word to check
suffix_len (int) -- Suffix length
- Returns
True if condition is met
- Return type
bool
New in version 0.2.0.
Changed in version 0.3.6: Encapsulated in class
-
_cond_b
(word, suffix_len)[source]¶ Return Lovins' condition B.
- Parameters
word (str) -- Word to check
suffix_len (int) -- Suffix length
- Returns
True if condition is met
- Return type
bool
New in version 0.2.0.
Changed in version 0.3.6: Encapsulated in class
-
_cond_bb
(word, suffix_len)[source]¶ Return Lovins' condition BB.
- Parameters
word (str) -- Word to check
suffix_len (int) -- Suffix length
- Returns
True if condition is met
- Return type
bool
New in version 0.2.0.
Changed in version 0.3.6: Encapsulated in class
-
_cond_c
(word, suffix_len)[source]¶ Return Lovins' condition C.
- Parameters
word (str) -- Word to check
suffix_len (int) -- Suffix length
- Returns
True if condition is met
- Return type
bool
New in version 0.2.0.
Changed in version 0.3.6: Encapsulated in class
-
_cond_cc
(word, suffix_len)[source]¶ Return Lovins' condition CC.
- Parameters
word (str) -- Word to check
suffix_len (int) -- Suffix length
- Returns
True if condition is met
- Return type
bool
New in version 0.2.0.
Changed in version 0.3.6: Encapsulated in class
-
_cond_d
(word, suffix_len)[source]¶ Return Lovins' condition D.
- Parameters
word (str) -- Word to check
suffix_len (int) -- Suffix length
- Returns
True if condition is met
- Return type
bool
New in version 0.2.0.
Changed in version 0.3.6: Encapsulated in class
-
_cond_e
(word, suffix_len)[source]¶ Return Lovins' condition E.
- Parameters
word (str) -- Word to check
suffix_len (int) -- Suffix length
- Returns
True if condition is met
- Return type
bool
New in version 0.2.0.
Changed in version 0.3.6: Encapsulated in class
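Each condition takes the word and the length of the candidate suffix and reports whether that suffix may be removed. The sketch below illustrates the idea with two conditions from Lovins' article: B (minimum stem length of 3) and E (do not remove the ending after 'e'). The function names, the SUFFIXES mapping, and the suffix-to-condition pairings are illustrative assumptions and are not copied from the library or from Lovins' suffix list.

```python
def cond_b(word, suffix_len):
    """Condition B: the remaining stem must have at least 3 letters."""
    return len(word) - suffix_len >= 3


def cond_e(word, suffix_len):
    """Condition E: do not remove the ending if the stem ends in 'e'."""
    return word[-suffix_len - 1] != "e"


# Hypothetical pairings, chosen only to exercise the two conditions.
SUFFIXES = {"ly": cond_b, "ed": cond_e}


def strip_suffix(word):
    """Remove the longest listed suffix whose condition is met."""
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        stem_len = len(word) - len(suffix)
        # Lovins requires a stem of at least 2 letters in every case.
        if (
            word.endswith(suffix)
            and stem_len >= 2
            and SUFFIXES[suffix](word, len(suffix))
        ):
            return word[:stem_len]
    return word


print(strip_suffix("quickly"))  # quick
print(strip_suffix("agreed"))   # agreed (condition E blocks removal)
```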
-
_cond_f
(word, suffix_len)[source]¶ Return Lovins' condition F.
- Parameters
word (str) -- Word to check
suffix_len (int) -- Suffix length
- Returns
True if condition is met
- Return type
bool
New in version 0.2.0.
Changed in version 0.3.6: Encapsulated in class
-
_cond_g
(word, suffix_len)[source]¶ Return Lovins' condition G.
- Parameters
word (str) -- Word to check
suffix_len (int) -- Suffix length
- Returns
True if condition is met
- Return type
bool
New in version 0.2.0.
Changed in version 0.3.6: Encapsulated in class
-
_cond_h
(word, suffix_len)[source]¶ Return Lovins' condition H.
- Parameters
word (str) -- Word to check
suffix_len (int) -- Suffix length
- Returns
True if condition is met
- Return type
bool
New in version 0.2.0.
Changed in version 0.3.6: Encapsulated in class
-
_cond_i
(word, suffix_len)[source]¶ Return Lovins' condition I.
- Parameters
word (str) -- Word to check
suffix_len (int) -- Suffix length
- Returns
True if condition is met
- Return type
bool
New in version 0.2.0.
Changed in version 0.3.6: Encapsulated in class
-
_cond_j
(word, suffix_len)[source]¶ Return Lovins' condition J.
- Parameters
word (str) -- Word to check
suffix_len (int) -- Suffix length
- Returns
True if condition is met
- Return type
bool
New in version 0.2.0.
Changed in version 0.3.6: Encapsulated in class
-
_cond_k
(word, suffix_len)[source]¶ Return Lovins' condition K.
- Parameters
word (str) -- Word to check
suffix_len (int) -- Suffix length
- Returns
True if condition is met
- Return type
bool
New in version 0.2.0.
Changed in version 0.3.6: Encapsulated in class
-
_cond_l
(word, suffix_len)[source]¶ Return Lovins' condition L.
- Parameters
word (str) -- Word to check
suffix_len (int) -- Suffix length
- Returns
True if condition is met
- Return type
bool
New in version 0.2.0.
Changed in version 0.3.6: Encapsulated in class
-
_cond_m
(word, suffix_len)[source]¶ Return Lovins' condition M.
- Parameters
word (str) -- Word to check
suffix_len (int) -- Suffix length
- Returns
True if condition is met
- Return type
bool
New in version 0.2.0.
Changed in version 0.3.6: Encapsulated in class
-
_cond_n
(word, suffix_len)[source]¶ Return Lovins' condition N.
- Parameters
word (str) -- Word to check
suffix_len (int) -- Suffix length
- Returns
True if condition is met
- Return type
bool
New in version 0.2.0.
Changed in version 0.3.6: Encapsulated in class
-
_cond_o
(word, suffix_len)[source]¶ Return Lovins' condition O.
- Parameters
word (str) -- Word to check
suffix_len (int) -- Suffix length
- Returns
True if condition is met
- Return type
bool
New in version 0.2.0.
Changed in version 0.3.6: Encapsulated in class
-
_cond_p
(word, suffix_len)[source]¶ Return Lovins' condition P.
- Parameters
word (str) -- Word to check
suffix_len (int) -- Suffix length
- Returns
True if condition is met
- Return type
bool
New in version 0.2.0.
Changed in version 0.3.6: Encapsulated in class
-
_cond_q
(word, suffix_len)[source]¶ Return Lovins' condition Q.
- Parameters
word (str) -- Word to check
suffix_len (int) -- Suffix length
- Returns
True if condition is met
- Return type
bool
New in version 0.2.0.
Changed in version 0.3.6: Encapsulated in class
-
_cond_r
(word, suffix_len)[source]¶ Return Lovins' condition R.
- Parameters
word (str) -- Word to check
suffix_len (int) -- Suffix length
- Returns
True if condition is met
- Return type
bool
New in version 0.2.0.
Changed in version 0.3.6: Encapsulated in class
-
_cond_s
(word, suffix_len)[source]¶ Return Lovins' condition S.
- Parameters
word (str) -- Word to check
suffix_len (int) -- Suffix length
- Returns
True if condition is met
- Return type
bool
New in version 0.2.0.
Changed in version 0.3.6: Encapsulated in class
-
_cond_t
(word, suffix_len)[source]¶ Return Lovins' condition T.
- Parameters
word (str) -- Word to check
suffix_len (int) -- Suffix length
- Returns
True if condition is met
- Return type
bool
New in version 0.2.0.
Changed in version 0.3.6: Encapsulated in class
-
_cond_u
(word, suffix_len)[source]¶ Return Lovins' condition U.
- Parameters
word (str) -- Word to check
suffix_len (int) -- Suffix length
- Returns
True if condition is met
- Return type
bool
New in version 0.2.0.
Changed in version 0.3.6: Encapsulated in class
-
_cond_v
(word, suffix_len)[source]¶ Return Lovins' condition V.
- Parameters
word (str) -- Word to check
suffix_len (int) -- Suffix length
- Returns
True if condition is met
- Return type
bool
New in version 0.2.0.
Changed in version 0.3.6: Encapsulated in class
-
_cond_w
(word, suffix_len)[source]¶ Return Lovins' condition W.
- Parameters
word (str) -- Word to check
suffix_len (int) -- Suffix length
- Returns
True if condition is met
- Return type
bool
New in version 0.2.0.
Changed in version 0.3.6: Encapsulated in class
-
_cond_x
(word, suffix_len)[source]¶ Return Lovins' condition X.
- Parameters
word (str) -- Word to check
suffix_len (int) -- Suffix length
- Returns
True if condition is met
- Return type
bool
New in version 0.2.0.
Changed in version 0.3.6: Encapsulated in class
-
_cond_y
(word, suffix_len)[source]¶ Return Lovins' condition Y.
- Parameters
word (str) -- Word to check
suffix_len (int) -- Suffix length
- Returns
True if condition is met
- Return type
bool
New in version 0.2.0.
Changed in version 0.3.6: Encapsulated in class
-
_cond_z
(word, suffix_len)[source]¶ Return Lovins' condition Z.
- Parameters
word (str) -- Word to check
suffix_len (int) -- Suffix length
- Returns
True if condition is met
- Return type
bool
New in version 0.2.0.
Changed in version 0.3.6: Encapsulated in class
-
_recode
= ()¶
-
_recode24
(stem)[source]¶ Return Lovins' conditional recode rule 24.
- Parameters
stem (str) -- Word to stem
- Returns
Word stripped of suffix
- Return type
str
New in version 0.2.0.
Changed in version 0.3.6: Encapsulated in class
-
_recode28
(stem)[source]¶ Return Lovins' conditional recode rule 28.
- Parameters
stem (str) -- Word to stem
- Returns
Word stripped of suffix
- Return type
str
New in version 0.2.0.
Changed in version 0.3.6: Encapsulated in class
-
_recode30
(stem)[source]¶ Return Lovins' conditional recode rule 30.
- Parameters
stem (str) -- Word to stem
- Returns
Word stripped of suffix
- Return type
str
New in version 0.2.0.
Changed in version 0.3.6: Encapsulated in class
-
_recode32
(stem)[source]¶ Return Lovins' conditional recode rule 32.
- Parameters
stem (str) -- Word to stem
- Returns
Word stripped of suffix
- Return type
str
New in version 0.2.0.
Changed in version 0.3.6: Encapsulated in class
-
_recode9
(stem)[source]¶ Return Lovins' conditional recode rule 9.
- Parameters
stem (str) -- Word to stem
- Returns
Word stripped of suffix
- Return type
str
New in version 0.2.0.
Changed in version 0.3.6: Encapsulated in class
-
_suffix
= {}¶
-
stem
(word)[source]¶ Return Lovins stem.
- Parameters
word (str) -- The word to stem
- Returns
Word stem
- Return type
str
Examples
>>> stmr = Lovins()
>>> stmr.stem('reading')
'read'
>>> stmr.stem('suspension')
'suspens'
>>> stmr.stem('elusiveness')
'elus'
New in version 0.2.0.
Changed in version 0.3.6: Encapsulated in class
-
-
abydos.stemmer.
lovins
(word)[source]¶ Return Lovins stem.
This is a wrapper for Lovins.stem().
- Parameters
word (str) -- The word to stem
- Returns
Word stem
- Return type
str
Examples
>>> lovins('reading')
'read'
>>> lovins('suspension')
'suspens'
>>> lovins('elusiveness')
'elus'
New in version 0.2.0.
Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the Lovins.stem method instead.
-
class
abydos.stemmer.
PaiceHusk
[source]¶ Bases:
abydos.stemmer._stemmer._Stemmer
Paice-Husk stemmer.
Implementation of the Paice-Husk Stemmer, also known as the Lancaster Stemmer, developed by Chris Paice, with the assistance of Gareth Husk.
This is based on the algorithm's description in [Pai90].
New in version 0.3.6.
-
_rule_table
= {1: {'a': (True, 1, None, True), 'e': (False, 1, None, False), 'i': ((True, 1, None, True), (False, 1, 'y', False)), 'j': (False, 1, 's', True), 's': ((True, 1, None, False), (False, 0, None, True))}, 2: {'ag': (False, 2, None, False), 'al': (False, 2, None, False), 'an': (False, 2, None, False), 'ar': (False, 2, None, True), 'at': (False, 2, None, False), 'bb': (False, 1, None, True), 'cl': (False, 1, None, True), 'dd': (False, 1, None, True), 'ed': (False, 2, None, False), 'en': (False, 2, None, False), 'er': (False, 2, None, False), 'gg': (False, 1, None, True), 'ia': (True, 2, None, True), 'ic': (False, 2, None, False), 'if': (False, 2, None, False), 'ij': (False, 1, 'd', True), 'is': (False, 2, None, False), 'iv': (False, 2, None, False), 'iz': (False, 2, None, False), 'll': (False, 1, None, True), 'ly': (False, 2, None, False), 'mm': (False, 1, None, True), 'nc': (False, 1, 't', False), 'nj': (False, 1, 'd', True), 'nn': (False, 1, None, True), 'oj': (False, 1, 'd', True), 'or': (False, 2, None, False), 'pp': (False, 1, None, True), 'rr': (False, 1, None, True), 'ss': (False, 0, None, True), 'th': (True, 2, None, True), 'tr': (False, 1, None, False), 'tt': (False, 1, None, True), 'uj': (False, 1, 'd', True), 'ul': (False, 2, None, True), 'um': (True, 2, None, True), 'ur': (False, 2, None, False), 'us': (True, 2, None, True), 'yz': (False, 1, 's', True)}, 3: {'abl': (False, 3, None, False), 'acy': (False, 3, None, False), 'ant': (False, 3, None, False), 'ary': (False, 3, None, False), 'bil': (False, 2, 'l', False), 'bly': (False, 1, None, False), 'ear': (False, 0, None, True), 'eed': (False, 1, None, True), 'een': (False, 0, None, True), 'eiv': (False, 0, None, True), 'ent': (False, 3, None, False), 'ety': (False, 3, None, False), 'fuj': (False, 1, 's', True), 'ful': (False, 3, None, False), 'hej': (False, 1, 'r', True), 'iag': (False, 3, 'y', True), 'ial': (False, 3, None, False), 'ian': (False, 3, None, False), 'ibl': (False, 3, None, True), 'ied': (False, 
3, 'y', False), 'ier': (False, 3, 'y', False), 'ies': (False, 3, 'y', False), 'ify': (False, 3, None, True), 'ily': (False, 3, 'y', False), 'ing': (False, 3, None, False), 'ion': (False, 3, None, False), 'iqu': (False, 3, None, True), 'ish': (False, 3, None, False), 'ism': (False, 3, None, False), 'ist': (False, 3, None, False), 'ity': (False, 3, None, False), 'ium': (False, 3, None, True), 'lty': (False, 2, None, True), 'ncy': (False, 2, 't', False), 'ogu': (False, 1, None, True), 'ogy': (False, 1, None, True), 'omy': (False, 1, None, True), 'opy': (False, 1, None, True), 'ory': (False, 3, None, False), 'ous': (False, 3, None, False), 'phy': (False, 1, None, True), 'ply': (False, 0, None, True), 'sis': (False, 2, None, True), 'siv': (False, 3, 'j', False), 'ual': (False, 3, None, False)}, 4: {'ceed': (False, 2, 'ss', True), 'cept': (False, 2, 'iv', True), 'duct': (False, 1, None, True), 'hood': (False, 4, None, False), 'iabl': (False, 4, 'y', True), 'iful': (False, 4, 'y', True), 'lief': (False, 1, 'v', True), 'ment': (False, 4, None, False), 'misj': (False, 2, 't', True), 'ness': (False, 4, None, False), 'olut': (False, 2, 'v', True), 'orpt': (False, 2, 'b', True), 'ript': (False, 2, 'b', True), 'ship': (False, 4, None, False), 'sion': (False, 4, 'j', False), 'sist': (False, 0, None, True), 'verj': (False, 1, 't', True), 'xion': (False, 4, 'ct', True), 'ytic': (False, 3, 's', True)}, 5: {'guish': (False, 5, 'ct', True), 'istry': (False, 5, None, True), 'sumpt': (False, 2, None, True)}, 6: {'ifiabl': (False, 6, None, True), 'plicat': (False, 4, 'y', True)}}¶
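The rule table above is keyed first by ending length and then by ending. The following sketch shows how such a table might drive the stemming loop. It assumes, without confirmation from the source, that each tuple reads as (intact_only, chars_to_remove, text_to_append, terminate), and it uses the acceptability test from Paice's description. The RULES subset is copied from the table above, while the names acceptable and sketch_stem are hypothetical.

```python
# A small subset of the rule table shown above.
RULES = {
    4: {"iful": (False, 4, "y", True), "ment": (False, 4, None, False)},
    3: {
        "ies": (False, 3, "y", False),
        "ncy": (False, 2, "t", False),
        "ant": (False, 3, None, False),
    },
    2: {"or": (False, 2, None, False)},
}
VOWELS = frozenset("aeiou")


def acceptable(stem):
    """Paice-Husk acceptability: a minimum length and, after a consonant, a vowel or y."""
    if not stem:
        return False
    if stem[0] in VOWELS:
        return len(stem) >= 2
    return len(stem) >= 3 and any(c in VOWELS or c == "y" for c in stem)


def sketch_stem(word):
    """Apply rules repeatedly, longest ending first, until none applies."""
    intact = True  # True until the first rule has modified the word
    while True:
        for n in sorted(RULES, reverse=True):
            rule = RULES[n].get(word[-n:])
            if rule is None:
                continue
            intact_only, remove, append, terminate = rule
            if intact_only and not intact:
                continue
            candidate = word[: len(word) - remove] + (append or "")
            if not acceptable(candidate):
                continue
            word, intact = candidate, False
            if terminate:
                return word
            break  # restart matching on the shortened word
        else:
            return word  # no rule applied in this pass


print(sketch_stem("fancies"))   # fant
print(sketch_stem("fanciful"))  # fancy
```

With this subset, the sketch reproduces three of the documented results: 'fancies' becomes 'fant', 'fanciful' becomes 'fancy', and 'torment' becomes 'tor'.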
-
stem
(word)[source]¶ Return Paice-Husk stem.
- Parameters
word (str) -- The word to stem
- Returns
Word stem
- Return type
str
Examples
>>> stmr = PaiceHusk()
>>> stmr.stem('assumption')
'assum'
>>> stmr.stem('verifiable')
'ver'
>>> stmr.stem('fancies')
'fant'
>>> stmr.stem('fanciful')
'fancy'
>>> stmr.stem('torment')
'tor'
New in version 0.3.0.
Changed in version 0.3.6: Encapsulated in class
-
-
abydos.stemmer.
paice_husk
(word)[source]¶ Return Paice-Husk stem.
This is a wrapper for PaiceHusk.stem().
- Parameters
word (str) -- The word to stem
- Returns
Word stem
- Return type
str
Examples
>>> paice_husk('assumption')
'assum'
>>> paice_husk('verifiable')
'ver'
>>> paice_husk('fancies')
'fant'
>>> paice_husk('fanciful')
'fancy'
>>> paice_husk('torment')
'tor'
New in version 0.3.0.
Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the PaiceHusk.stem method instead.
-
class
abydos.stemmer.
UEALite
(max_word_length=20, max_acro_length=8, return_rule_no=False, var='standard')[source]¶ Bases:
abydos.stemmer._stemmer._Stemmer
UEA-Lite stemmer.
The UEA-Lite stemmer is discussed in [JS05].
This is chiefly based on the Java implementation of the algorithm, with variants based on the Perl implementation and Jason Adams' Ruby port.
Java version: [Chu]
Perl version: [JS05]
Ruby version: [Ada17]
New in version 0.3.6.
Initialize UEALite instance.
- Parameters
max_word_length (int) -- The maximum word length allowed
max_acro_length (int) -- The maximum acronym length allowed
return_rule_no (bool) -- If True, returns the stem along with rule number
var (str) -- Variant rules to use:
  Adams to use Jason Adams' rules
  Perl to use the original Perl rules
New in version 0.4.0.
-
_adams_rule_table
= {2: {'cs': (3, 0, None), 'es': (63, 2, None), 'is': (64, 2, 'e'), 'ss': (6, 0, None), 'us': (67, 0, None)}, 3: {'bed': (36, 2, None), 'ced': (18, 1, None), 'ces': (2, 1, None), 'des': (63.1, 1, None), 'eed': (7, 0, None), 'ees': (10, 1, None), 'ged': (43, 1, None), 'ges': (63.1, 1, None), 'ied': (56, 3, 'y'), 'ies': (59, 3, 'y'), 'kes': (63.1, 1, None), 'led': (12, 2, None), 'les': (50, 1, None), 'mes': (63.7, 1, None), 'ned': (13, 1, None), 'oed': (31.3, 1, None), 'oes': (31.2, 1, None), 'ous': (65, 0, None), 'pes': (63.8, 1, None), 'red': (20, 1, None), 'res': (63.9, 1, None), 'sed': (29, 1, None), 'ses': (11, 1, None), 'sis': (4, 0, None), 'ted': (22, 2, None), 'tes': (51, 1, None), 'tis': (5, 0, None), 'ued': (8, 1, None), 'ues': (9, 1, None), 'ums': (66, 0, None), 'ved': (17, 1, None), 'ves': (60, 1, None), 'zed': (52, 1, None)}, 4: {'aked': (31.1, 1, None), 'amed': (31, 1, None), 'aped': (61.3, 1, None), 'ated': (22.1, 1, None), 'beds': (36, 3, None), 'bled': (12.3, 1, None), 'ding': (40, 3, 'e'), 'does': (31.2, 2, None), 'eeds': (7, 1, None), 'eled': (12.2, 2, None), 'ened': (13.7, 2, None), 'ered': (20.1, 2, None), 'eses': (11.1, 2, 'is'), 'gged': (43.1, 3, None), 'ging': (45, 3, 'e'), 'gned': (13.1, 2, None), 'ides': (63.2, 1, None), 'iked': (31.1, 1, None), 'imed': (31, 1, None), 'ines': (63.3, 1, None), 'ited': (22.7, 2, None), 'izes': (63.5, 1, None), 'ling': (42, 3, 'e'), 'lled': (12.1, 2, None), 'lves': (60.1, 3, 'f'), 'ming': (44, 3, 'e'), 'nged': (43.2, 1, None), 'ning': (46, 3, 'e'), 'nned': (13.3, 3, None), 'oded': (61.1, 1, None), 'oked': (31.1, 1, None), 'oned': (13.2, 2, None), 'ones': (63.6, 1, None), 'pled': (12.4, 1, None), 'reds': (20, 2, None), 'rned': (13.4, 2, None), 'sing': (54, 3, 'e'), 'ssed': (28, 2, None), 'sses': (11.2, 2, None), 'ting': (48, 3, 'e'), 'tled': (12.5, 1, None), 'tted': (21, 3, None), 'uded': (61.2, 1, None), 'uked': (31.1, 1, None), 'umed': (31, 1, None), 'ures': (63.4, 1, None), 'uses': (11.3, 1, None), 'uted': 
(22.2, 1, None), 'ving': (39, 3, 'e'), 'zing': (54.1, 3, 'e')}, 5: {'ained': (13.6, 2, None), 'anges': (23, 1, None), 'aning': (46.6, 3, None), 'arred': (19.1, 3, None), 'ating': (57, 3, 'e'), 'cting': (48.1, 3, None), 'dding': (40.4, 4, None), 'dings': (40, 4, 'e'), 'dying': (58.2, 4, 'ie'), 'eared': (20.3, 2, None), 'ected': (15, 2, None), 'eding': (40.5, 3, None), 'eling': (42.1, 3, None), 'ening': (46.5, 3, None), 'erned': (13.5, 2, None), 'erred': (19, 3, None), 'eting': (48.4, 3, None), 'gging': (45.1, 4, None), 'gings': (45, 4, 'e'), 'gning': (46.4, 3, None), 'iases': (11.4, 2, None), 'ifted': (14, 2, None), 'iring': (54.4, 3, 'e'), 'lding': (40.3, 3, None), 'leted': (22.3, 1, None), 'lings': (42, 4, 'e'), 'lling': (41, 4, None), 'lming': (44.1, 3, None), 'lored': (20.4, 2, None), 'lying': (58.2, 4, 'ie'), 'mided': (22.1, 1, None), 'mings': (44, 4, 'e'), 'mited': (22.5, 1, None), 'mming': (44.3, 4, None), 'ncing': (54.2, 3, 'e'), 'nding': (40.1, 3, None), 'nging': (45.2, 3, None), 'nning': (46.3, 4, None), 'noted': (22.4, 1, None), 'nting': (48.2, 3, None), 'oling': (42.3, 3, None), 'oning': (46.2, 3, None), 'pting': (48.3, 3, None), 'rabed': (36.1, 1, None), 'rding': (40.2, 3, None), 'rebed': (36.1, 1, None), 'ribed': (36.1, 1, None), 'rming': (44.2, 3, None), 'rning': (46.1, 3, None), 'robed': (36.1, 1, None), 'rubed': (36.1, 1, None), 'sings': (54, 4, 'e'), 'ssing': (37, 3, None), 'sting': (47, 3, None), 'thing': (58.1, 0, None), 'tings': (48, 4, 'e'), 'tored': (20.2, 1, None), 'tting': (26, 4, None), 'tying': (58.2, 4, 'ie'), 'ulted': (32, 2, None), 'uming': (33, 3, 'e'), 'uring': (54.3, 3, 'e'), 'urred': (20.5, 3, None), 'vided': (22.9, 1, None), 'vings': (39, 4, 'e'), 'vited': (22.6, 1, None)}, 6: {'aceous': (1, 6, None), 'acting': (25, 3, None), 'ailing': (42.2, 3, None), 'aining': (24, 3, None), 'chited': (22.8, 1, None), 'ddings': (40.4, 5, None), 'eading': (40.7, 3, None), 'ealing': (42.4, 3, None), 'edings': (40.5, 4, None), 'elings': (42.1, 4, 
None), 'etings': (48.4, 4, None), 'ggings': (45.1, 5, None), 'irings': (54.4, 4, 'e'), 'ldings': (40.3, 4, None), 'llings': (41, 5, None), 'mmings': (44.3, 5, None), 'ncings': (54.2, 4, 'e'), 'ndings': (40.1, 4, None), 'ngings': (45.2, 4, None), 'ntings': (48.2, 4, None), 'oading': (40.6, 3, None), 'olings': (42.3, 4, None), 'rdings': (40.2, 4, None), 'ssings': (37, 4, None), 'stings': (47, 4, None), 'things': (58.1, 1, None), 'ttings': (26, 5, None), 'ulting': (38, 3, None), 'urings': (54.3, 4, 'e'), 'viding': (27, 3, 'e')}, 7: {'ailings': (42.2, 4, None), 'eadings': (40.7, 4, None), 'ealings': (42.4, 4, None), 'fulness': (34, 4, None), 'oadings': (40.6, 4, None), 'ousness': (35, 4, None), 'titudes': (30, 1, None)}}¶
-
_perl_rule_table
= {2: {'cs': (3, 0, None), 'es': (63, 2, None), 'is': (64, 2, 'e'), 'ss': (6, 0, None), 'us': (67, 0, None)}, 3: {'bed': (36, 2, None), 'ced': (18, 1, None), 'ces': (2, 1, None), 'eed': (7, 0, None), 'ees': (10, 1, None), 'ged': (43, 1, None), 'ges': (63.1, 1, None), 'ied': (56, 3, 'y'), 'ies': (59, 3, 'y'), 'led': (12, 2, None), 'les': (50, 1, None), 'mes': (63.7, 1, None), 'ned': (13, 1, None), 'ous': (65, 0, None), 'pes': (63.8, 1, None), 'red': (20, 1, None), 'sed': (29, 1, None), 'ses': (11, 1, None), 'sis': (4, 0, None), 'ted': (22, 2, None), 'tes': (51, 1, None), 'tis': (5, 0, None), 'ued': (8, 1, None), 'ues': (9, 1, None), 'ums': (66, 0, None), 'ved': (17, 1, None), 'ves': (60, 1, None), 'zed': (52, 1, None)}, 4: {'aped': (61.3, 1, None), 'ated': (22.1, 1, None), 'bled': (12.3, 1, None), 'ding': (40, 3, 'e'), 'eled': (12.2, 2, None), 'ened': (13.7, 2, None), 'ered': (20.1, 2, None), 'eses': (11.1, 2, 'is'), 'gged': (43.1, 3, None), 'ging': (45, 3, 'e'), 'gned': (13.1, 2, None), 'ides': (63.2, 1, None), 'ines': (63.3, 1, None), 'izes': (63.5, 1, None), 'ling': (42, 3, 'e'), 'lled': (12.1, 2, None), 'lves': (60.1, 3, 'f'), 'ming': (44, 3, 'e'), 'nged': (43.2, 1, None), 'ning': (46, 3, 'e'), 'nned': (13.3, 3, None), 'oded': (61.1, 1, None), 'oned': (13.2, 2, None), 'ones': (63.6, 1, None), 'pled': (12.4, 1, None), 'rned': (13.4, 2, None), 'sing': (54, 3, 'e'), 'ssed': (28, 2, None), 'sses': (11.2, 2, None), 'ting': (48, 3, 'e'), 'tled': (12.5, 1, None), 'tted': (21, 3, None), 'uded': (61.2, 1, None), 'umed': (31, 1, None), 'ures': (63.4, 1, None), 'uses': (11.3, 1, None), 'uted': (22.2, 1, None), 'ving': (39, 3, 'e'), 'zing': (54.1, 3, 'e')}, 5: {'ained': (13.6, 2, None), 'anges': (23, 1, None), 'aning': (46.6, 3, None), 'ating': (57, 3, 'e'), 'cting': (48.1, 3, None), 'dding': (40.4, 4, None), 'eared': (20.3, 2, None), 'ected': (15, 2, None), 'eding': (40.5, 3, None), 'eling': (42.1, 3, None), 'ening': (46.5, 3, None), 'erned': (13.5, 2, None), 'erred': (19, 
3, None), 'eting': (48.4, 3, None), 'gging': (45.1, 4, None), 'gning': (46.4, 3, None), 'iases': (11.4, 2, None), 'ifted': (14, 2, None), 'iring': (54.4, 3, 'e'), 'lding': (40.3, 3, None), 'leted': (22.3, 1, None), 'lling': (41, 4, None), 'lming': (44.1, 3, None), 'lored': (20.4, 2, None), 'mming': (44.3, 4, None), 'ncing': (54.2, 3, 'e'), 'nding': (40.1, 3, None), 'nging': (45.2, 3, None), 'nning': (46.3, 4, None), 'noted': (22.4, 1, None), 'nting': (48.2, 3, None), 'oling': (42.3, 3, None), 'oning': (46.2, 3, None), 'pting': (48.3, 3, None), 'rabed': (36.1, 1, None), 'rding': (40.2, 3, None), 'rebed': (36.1, 1, None), 'ribed': (36.1, 1, None), 'rming': (44.2, 3, None), 'rning': (46.1, 3, None), 'robed': (36.1, 1, None), 'rubed': (36.1, 1, None), 'ssing': (37, 3, None), 'sting': (47, 3, None), 'thing': (58.1, 0, None), 'tored': (20.2, 1, None), 'tting': (26, 4, None), 'ulted': (32, 2, None), 'uming': (33, 3, 'e'), 'uring': (54.3, 3, 'e'), 'urred': (20.5, 3, None), 'vided': (16, 1, None)}, 6: {'aceous': (1, 6, None), 'acting': (25, 3, None), 'ailing': (42.2, 3, None), 'aining': (24, 3, None), 'eading': (40.7, 3, None), 'ealing': (42.4, 3, None), 'oading': (40.6, 3, None), 'ulting': (38, 3, None), 'viding': (27, 3, 'e')}, 7: {'fulness': (34, 4, None), 'ousness': (35, 4, None), 'titudes': (30, 1, None)}}¶
-
_problem_words
= {'as', 'during', 'has', 'is', 'this', 'was'}¶
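The rule tables and the problem-word set above suggest a longest-ending-first lookup. The sketch below illustrates that idea under stated assumptions: each tuple is read as (rule_number, chars_to_remove, text_to_append), problem words are returned unchanged, and None stands in for the rule number when no table rule fires. The RULES entries are copied from the tables above, while apply_rule is a hypothetical name. The real stemmer performs further checks (word length, acronyms, and so on) that are omitted here.

```python
# A small subset of the rule tables shown above.
RULES = {
    4: {"ding": (40, 3, "e"), "sses": (11.2, 2, None)},
    3: {"ies": (59, 3, "y")},
    2: {"es": (63, 2, None), "ss": (6, 0, None)},
}
PROBLEM_WORDS = {"as", "during", "has", "is", "this", "was"}


def apply_rule(word):
    """Return (stem, rule_number), trying the longest endings first."""
    if word in PROBLEM_WORDS:
        return word, None  # placeholder: no table rule is applied
    for n in sorted(RULES, reverse=True):
        rule = RULES[n].get(word[-n:])
        if rule is not None:
            rule_no, remove, append = rule
            return word[: len(word) - remove] + (append or ""), rule_no
    return word, None


print(apply_rule("ponies"))   # ('pony', 59)
print(apply_rule("glasses"))  # ('glass', 11.2)
```

Returning the rule number alongside the stem corresponds to what the return_rule_no parameter is described as enabling.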
-
_rules
= {'Adams': {2: {'cs': (3, 0, None), 'es': (63, 2, None), 'is': (64, 2, 'e'), 'ss': (6, 0, None), 'us': (67, 0, None)}, 3: {'bed': (36, 2, None), 'ced': (18, 1, None), 'ces': (2, 1, None), 'des': (63.1, 1, None), 'eed': (7, 0, None), 'ees': (10, 1, None), 'ged': (43, 1, None), 'ges': (63.1, 1, None), 'ied': (56, 3, 'y'), 'ies': (59, 3, 'y'), 'kes': (63.1, 1, None), 'led': (12, 2, None), 'les': (50, 1, None), 'mes': (63.7, 1, None), 'ned': (13, 1, None), 'oed': (31.3, 1, None), 'oes': (31.2, 1, None), 'ous': (65, 0, None), 'pes': (63.8, 1, None), 'red': (20, 1, None), 'res': (63.9, 1, None), 'sed': (29, 1, None), 'ses': (11, 1, None), 'sis': (4, 0, None), 'ted': (22, 2, None), 'tes': (51, 1, None), 'tis': (5, 0, None), 'ued': (8, 1, None), 'ues': (9, 1, None), 'ums': (66, 0, None), 'ved': (17, 1, None), 'ves': (60, 1, None), 'zed': (52, 1, None)}, 4: {'aked': (31.1, 1, None), 'amed': (31, 1, None), 'aped': (61.3, 1, None), 'ated': (22.1, 1, None), 'beds': (36, 3, None), 'bled': (12.3, 1, None), 'ding': (40, 3, 'e'), 'does': (31.2, 2, None), 'eeds': (7, 1, None), 'eled': (12.2, 2, None), 'ened': (13.7, 2, None), 'ered': (20.1, 2, None), 'eses': (11.1, 2, 'is'), 'gged': (43.1, 3, None), 'ging': (45, 3, 'e'), 'gned': (13.1, 2, None), 'ides': (63.2, 1, None), 'iked': (31.1, 1, None), 'imed': (31, 1, None), 'ines': (63.3, 1, None), 'ited': (22.7, 2, None), 'izes': (63.5, 1, None), 'ling': (42, 3, 'e'), 'lled': (12.1, 2, None), 'lves': (60.1, 3, 'f'), 'ming': (44, 3, 'e'), 'nged': (43.2, 1, None), 'ning': (46, 3, 'e'), 'nned': (13.3, 3, None), 'oded': (61.1, 1, None), 'oked': (31.1, 1, None), 'oned': (13.2, 2, None), 'ones': (63.6, 1, None), 'pled': (12.4, 1, None), 'reds': (20, 2, None), 'rned': (13.4, 2, None), 'sing': (54, 3, 'e'), 'ssed': (28, 2, None), 'sses': (11.2, 2, None), 'ting': (48, 3, 'e'), 'tled': (12.5, 1, None), 'tted': (21, 3, None), 'uded': (61.2, 1, None), 'uked': (31.1, 1, None), 'umed': (31, 1, None), 'ures': (63.4, 1, None), 'uses': (11.3, 1, None), 
'uted': (22.2, 1, None), 'ving': (39, 3, 'e'), 'zing': (54.1, 3, 'e')}, 5: {'ained': (13.6, 2, None), 'anges': (23, 1, None), 'aning': (46.6, 3, None), 'arred': (19.1, 3, None), 'ating': (57, 3, 'e'), 'cting': (48.1, 3, None), 'dding': (40.4, 4, None), 'dings': (40, 4, 'e'), 'dying': (58.2, 4, 'ie'), 'eared': (20.3, 2, None), 'ected': (15, 2, None), 'eding': (40.5, 3, None), 'eling': (42.1, 3, None), 'ening': (46.5, 3, None), 'erned': (13.5, 2, None), 'erred': (19, 3, None), 'eting': (48.4, 3, None), 'gging': (45.1, 4, None), 'gings': (45, 4, 'e'), 'gning': (46.4, 3, None), 'iases': (11.4, 2, None), 'ifted': (14, 2, None), 'iring': (54.4, 3, 'e'), 'lding': (40.3, 3, None), 'leted': (22.3, 1, None), 'lings': (42, 4, 'e'), 'lling': (41, 4, None), 'lming': (44.1, 3, None), 'lored': (20.4, 2, None), 'lying': (58.2, 4, 'ie'), 'mided': (22.1, 1, None), 'mings': (44, 4, 'e'), 'mited': (22.5, 1, None), 'mming': (44.3, 4, None), 'ncing': (54.2, 3, 'e'), 'nding': (40.1, 3, None), 'nging': (45.2, 3, None), 'nning': (46.3, 4, None), 'noted': (22.4, 1, None), 'nting': (48.2, 3, None), 'oling': (42.3, 3, None), 'oning': (46.2, 3, None), 'pting': (48.3, 3, None), 'rabed': (36.1, 1, None), 'rding': (40.2, 3, None), 'rebed': (36.1, 1, None), 'ribed': (36.1, 1, None), 'rming': (44.2, 3, None), 'rning': (46.1, 3, None), 'robed': (36.1, 1, None), 'rubed': (36.1, 1, None), 'sings': (54, 4, 'e'), 'ssing': (37, 3, None), 'sting': (47, 3, None), 'thing': (58.1, 0, None), 'tings': (48, 4, 'e'), 'tored': (20.2, 1, None), 'tting': (26, 4, None), 'tying': (58.2, 4, 'ie'), 'ulted': (32, 2, None), 'uming': (33, 3, 'e'), 'uring': (54.3, 3, 'e'), 'urred': (20.5, 3, None), 'vided': (22.9, 1, None), 'vings': (39, 4, 'e'), 'vited': (22.6, 1, None)}, 6: {'aceous': (1, 6, None), 'acting': (25, 3, None), 'ailing': (42.2, 3, None), 'aining': (24, 3, None), 'chited': (22.8, 1, None), 'ddings': (40.4, 5, None), 'eading': (40.7, 3, None), 'ealing': (42.4, 3, None), 'edings': (40.5, 4, None), 'elings': 
(42.1, 4, None), 'etings': (48.4, 4, None), 'ggings': (45.1, 5, None), 'irings': (54.4, 4, 'e'), 'ldings': (40.3, 4, None), 'llings': (41, 5, None), 'mmings': (44.3, 5, None), 'ncings': (54.2, 4, 'e'), 'ndings': (40.1, 4, None), 'ngings': (45.2, 4, None), 'ntings': (48.2, 4, None), 'oading': (40.6, 3, None), 'olings': (42.3, 4, None), 'rdings': (40.2, 4, None), 'ssings': (37, 4, None), 'stings': (47, 4, None), 'things': (58.1, 1, None), 'ttings': (26, 5, None), 'ulting': (38, 3, None), 'urings': (54.3, 4, 'e'), 'viding': (27, 3, 'e')}, 7: {'ailings': (42.2, 4, None), 'eadings': (40.7, 4, None), 'ealings': (42.4, 4, None), 'fulness': (34, 4, None), 'oadings': (40.6, 4, None), 'ousness': (35, 4, None), 'titudes': (30, 1, None)}}, 'Perl': {2: {'cs': (3, 0, None), 'es': (63, 2, None), 'is': (64, 2, 'e'), 'ss': (6, 0, None), 'us': (67, 0, None)}, 3: {'bed': (36, 2, None), 'ced': (18, 1, None), 'ces': (2, 1, None), 'eed': (7, 0, None), 'ees': (10, 1, None), 'ged': (43, 1, None), 'ges': (63.1, 1, None), 'ied': (56, 3, 'y'), 'ies': (59, 3, 'y'), 'led': (12, 2, None), 'les': (50, 1, None), 'mes': (63.7, 1, None), 'ned': (13, 1, None), 'ous': (65, 0, None), 'pes': (63.8, 1, None), 'red': (20, 1, None), 'sed': (29, 1, None), 'ses': (11, 1, None), 'sis': (4, 0, None), 'ted': (22, 2, None), 'tes': (51, 1, None), 'tis': (5, 0, None), 'ued': (8, 1, None), 'ues': (9, 1, None), 'ums': (66, 0, None), 'ved': (17, 1, None), 'ves': (60, 1, None), 'zed': (52, 1, None)}, 4: {'aped': (61.3, 1, None), 'ated': (22.1, 1, None), 'bled': (12.3, 1, None), 'ding': (40, 3, 'e'), 'eled': (12.2, 2, None), 'ened': (13.7, 2, None), 'ered': (20.1, 2, None), 'eses': (11.1, 2, 'is'), 'gged': (43.1, 3, None), 'ging': (45, 3, 'e'), 'gned': (13.1, 2, None), 'ides': (63.2, 1, None), 'ines': (63.3, 1, None), 'izes': (63.5, 1, None), 'ling': (42, 3, 'e'), 'lled': (12.1, 2, None), 'lves': (60.1, 3, 'f'), 'ming': (44, 3, 'e'), 'nged': (43.2, 1, None), 'ning': (46, 3, 'e'), 'nned': (13.3, 3, None), 'oded': 
(61.1, 1, None), 'oned': (13.2, 2, None), 'ones': (63.6, 1, None), 'pled': (12.4, 1, None), 'rned': (13.4, 2, None), 'sing': (54, 3, 'e'), 'ssed': (28, 2, None), 'sses': (11.2, 2, None), 'ting': (48, 3, 'e'), 'tled': (12.5, 1, None), 'tted': (21, 3, None), 'uded': (61.2, 1, None), 'umed': (31, 1, None), 'ures': (63.4, 1, None), 'uses': (11.3, 1, None), 'uted': (22.2, 1, None), 'ving': (39, 3, 'e'), 'zing': (54.1, 3, 'e')}, 5: {'ained': (13.6, 2, None), 'anges': (23, 1, None), 'aning': (46.6, 3, None), 'ating': (57, 3, 'e'), 'cting': (48.1, 3, None), 'dding': (40.4, 4, None), 'eared': (20.3, 2, None), 'ected': (15, 2, None), 'eding': (40.5, 3, None), 'eling': (42.1, 3, None), 'ening': (46.5, 3, None), 'erned': (13.5, 2, None), 'erred': (19, 3, None), 'eting': (48.4, 3, None), 'gging': (45.1, 4, None), 'gning': (46.4, 3, None), 'iases': (11.4, 2, None), 'ifted': (14, 2, None), 'iring': (54.4, 3, 'e'), 'lding': (40.3, 3, None), 'leted': (22.3, 1, None), 'lling': (41, 4, None), 'lming': (44.1, 3, None), 'lored': (20.4, 2, None), 'mming': (44.3, 4, None), 'ncing': (54.2, 3, 'e'), 'nding': (40.1, 3, None), 'nging': (45.2, 3, None), 'nning': (46.3, 4, None), 'noted': (22.4, 1, None), 'nting': (48.2, 3, None), 'oling': (42.3, 3, None), 'oning': (46.2, 3, None), 'pting': (48.3, 3, None), 'rabed': (36.1, 1, None), 'rding': (40.2, 3, None), 'rebed': (36.1, 1, None), 'ribed': (36.1, 1, None), 'rming': (44.2, 3, None), 'rning': (46.1, 3, None), 'robed': (36.1, 1, None), 'rubed': (36.1, 1, None), 'ssing': (37, 3, None), 'sting': (47, 3, None), 'thing': (58.1, 0, None), 'tored': (20.2, 1, None), 'tting': (26, 4, None), 'ulted': (32, 2, None), 'uming': (33, 3, 'e'), 'uring': (54.3, 3, 'e'), 'urred': (20.5, 3, None), 'vided': (16, 1, None)}, 6: {'aceous': (1, 6, None), 'acting': (25, 3, None), 'ailing': (42.2, 3, None), 'aining': (24, 3, None), 'eading': (40.7, 3, None), 'ealing': (42.4, 3, None), 'oading': (40.6, 3, None), 'ulting': (38, 3, None), 'viding': (27, 3, 'e')}, 7: 
{'fulness': (34, 4, None), 'ousness': (35, 4, None), 'titudes': (30, 1, None)}}, 'standard': {2: {'cs': (3, 0, None), 'es': (63, 2, None), 'is': (64, 2, 'e'), 'ss': (6, 0, None), 'us': (67, 0, None)}, 3: {'bed': (36, 2, None), 'ced': (18, 1, None), 'ces': (2, 1, None), 'eed': (7, 0, None), 'ees': (10, 1, None), 'ged': (43, 1, None), 'ges': (63.1, 1, None), 'ied': (56, 3, 'y'), 'ies': (59, 3, 'y'), 'led': (12, 2, None), 'les': (50, 1, None), 'mes': (63.7, 1, None), 'ned': (13, 1, None), 'ous': (65, 0, None), 'pes': (63.8, 1, None), 'red': (20, 1, None), 'sed': (29, 1, None), 'ses': (11, 1, None), 'sis': (4, 0, None), 'ted': (22, 2, None), 'tes': (51, 1, None), 'tis': (5, 0, None), 'ued': (8, 1, None), 'ues': (9, 1, None), 'ums': (66, 0, None), 'ved': (17, 1, None), 'ves': (60, 1, None), 'zed': (52, 1, None)}, 4: {'aped': (61.3, 1, None), 'ated': (22.1, 1, None), 'beds': (36, 3, None), 'bled': (12.3, 1, None), 'ding': (40, 3, 'e'), 'eeds': (7, 1, None), 'eled': (12.2, 2, None), 'ened': (13.7, 2, None), 'ered': (20.1, 2, None), 'eses': (11.1, 2, 'is'), 'gged': (43.1, 3, None), 'ging': (45, 3, 'e'), 'gned': (13.1, 2, None), 'ides': (63.2, 1, None), 'ines': (63.3, 1, None), 'izes': (63.5, 1, None), 'ling': (42, 3, 'e'), 'lled': (12.1, 2, None), 'lves': (60.1, 3, 'f'), 'ming': (44, 3, 'e'), 'nged': (43.2, 1, None), 'ning': (46, 3, 'e'), 'nned': (13.3, 3, None), 'oded': (61.1, 1, None), 'oned': (13.2, 2, None), 'ones': (63.6, 1, None), 'pled': (12.4, 1, None), 'reds': (20, 2, None), 'rned': (13.4, 2, None), 'sing': (54, 3, 'e'), 'ssed': (28, 2, None), 'sses': (11.2, 2, None), 'ting': (48, 3, 'e'), 'tled': (12.5, 1, None), 'tted': (21, 3, None), 'uded': (61.2, 1, None), 'umed': (31, 1, None), 'ures': (63.4, 1, None), 'uses': (11.3, 1, None), 'uted': (22.2, 1, None), 'ving': (39, 3, 'e'), 'zing': (54.1, 3, 'e')}, 5: {'ained': (13.6, 2, None), 'anges': (23, 1, None), 'aning': (46.6, 3, None), 'ating': (57, 3, 'e'), 'cting': (48.1, 3, None), 'dding': (40.4, 4, None), 'dings': 
(40, 4, 'e'), 'eared': (20.3, 2, None), 'ected': (15, 2, None), 'eding': (40.5, 3, None), 'eling': (42.1, 3, None), 'ening': (46.5, 3, None), 'erned': (13.5, 2, None), 'erred': (19, 3, None), 'eting': (48.4, 3, None), 'gging': (45.1, 4, None), 'gings': (45, 4, 'e'), 'gning': (46.4, 3, None), 'iases': (11.4, 2, None), 'ifted': (14, 2, None), 'iring': (54.4, 3, 'e'), 'lding': (40.3, 3, None), 'leted': (22.3, 1, None), 'lings': (42, 4, 'e'), 'lling': (41, 4, None), 'lming': (44.1, 3, None), 'lored': (20.4, 2, None), 'mings': (44, 4, 'e'), 'mming': (44.3, 4, None), 'ncing': (54.2, 3, 'e'), 'nding': (40.1, 3, None), 'nging': (45.2, 3, None), 'nning': (46.3, 4, None), 'noted': (22.4, 1, None), 'nting': (48.2, 3, None), 'oling': (42.3, 3, None), 'oning': (46.2, 3, None), 'pting': (48.3, 3, None), 'rabed': (36.1, 1, None), 'rding': (40.2, 3, None), 'rebed': (36.1, 1, None), 'ribed': (36.1, 1, None), 'rming': (44.2, 3, None), 'rning': (46.1, 3, None), 'robed': (36.1, 1, None), 'rubed': (36.1, 1, None), 'sings': (54, 4, 'e'), 'ssing': (37, 3, None), 'sting': (47, 3, None), 'thing': (58.1, 0, None), 'tings': (48, 4, 'e'), 'tored': (20.2, 1, None), 'tting': (26, 4, None), 'ulted': (32, 2, None), 'uming': (33, 3, 'e'), 'uring': (54.3, 3, 'e'), 'urred': (20.5, 3, None), 'vided': (16, 1, None), 'vings': (39, 4, 'e')}, 6: {'aceous': (1, 6, None), 'acting': (25, 3, None), 'ailing': (42.2, 3, None), 'aining': (24, 3, None), 'ddings': (40.4, 5, None), 'eading': (40.7, 3, None), 'ealing': (42.4, 3, None), 'edings': (40.5, 4, None), 'elings': (42.1, 4, None), 'etings': (48.4, 4, None), 'ggings': (45.1, 5, None), 'irings': (54.4, 4, 'e'), 'ldings': (40.3, 4, None), 'llings': (41, 5, None), 'mmings': (44.3, 5, None), 'ncings': (54.2, 4, 'e'), 'ndings': (40.1, 4, None), 'ngings': (45.2, 4, None), 'ntings': (48.2, 4, None), 'oading': (40.6, 3, None), 'olings': (42.3, 4, None), 'rdings': (40.2, 4, None), 'ssings': (37, 4, None), 'stings': (47, 4, None), 'things': (58.1, 1, None), 'ttings': 
(26, 5, None), 'ulting': (38, 3, None), 'urings': (54.3, 4, 'e'), 'viding': (27, 3, 'e')}, 7: {'ailings': (42.2, 4, None), 'eadings': (40.7, 4, None), 'ealings': (42.4, 4, None), 'fulness': (34, 4, None), 'oadings': (40.6, 4, None), 'ousness': (35, 4, None), 'titudes': (30, 1, None)}}}¶
-
_standard_rule_table
= {2: {'cs': (3, 0, None), 'es': (63, 2, None), 'is': (64, 2, 'e'), 'ss': (6, 0, None), 'us': (67, 0, None)}, 3: {'bed': (36, 2, None), 'ced': (18, 1, None), 'ces': (2, 1, None), 'eed': (7, 0, None), 'ees': (10, 1, None), 'ged': (43, 1, None), 'ges': (63.1, 1, None), 'ied': (56, 3, 'y'), 'ies': (59, 3, 'y'), 'led': (12, 2, None), 'les': (50, 1, None), 'mes': (63.7, 1, None), 'ned': (13, 1, None), 'ous': (65, 0, None), 'pes': (63.8, 1, None), 'red': (20, 1, None), 'sed': (29, 1, None), 'ses': (11, 1, None), 'sis': (4, 0, None), 'ted': (22, 2, None), 'tes': (51, 1, None), 'tis': (5, 0, None), 'ued': (8, 1, None), 'ues': (9, 1, None), 'ums': (66, 0, None), 'ved': (17, 1, None), 'ves': (60, 1, None), 'zed': (52, 1, None)}, 4: {'aped': (61.3, 1, None), 'ated': (22.1, 1, None), 'beds': (36, 3, None), 'bled': (12.3, 1, None), 'ding': (40, 3, 'e'), 'eeds': (7, 1, None), 'eled': (12.2, 2, None), 'ened': (13.7, 2, None), 'ered': (20.1, 2, None), 'eses': (11.1, 2, 'is'), 'gged': (43.1, 3, None), 'ging': (45, 3, 'e'), 'gned': (13.1, 2, None), 'ides': (63.2, 1, None), 'ines': (63.3, 1, None), 'izes': (63.5, 1, None), 'ling': (42, 3, 'e'), 'lled': (12.1, 2, None), 'lves': (60.1, 3, 'f'), 'ming': (44, 3, 'e'), 'nged': (43.2, 1, None), 'ning': (46, 3, 'e'), 'nned': (13.3, 3, None), 'oded': (61.1, 1, None), 'oned': (13.2, 2, None), 'ones': (63.6, 1, None), 'pled': (12.4, 1, None), 'reds': (20, 2, None), 'rned': (13.4, 2, None), 'sing': (54, 3, 'e'), 'ssed': (28, 2, None), 'sses': (11.2, 2, None), 'ting': (48, 3, 'e'), 'tled': (12.5, 1, None), 'tted': (21, 3, None), 'uded': (61.2, 1, None), 'umed': (31, 1, None), 'ures': (63.4, 1, None), 'uses': (11.3, 1, None), 'uted': (22.2, 1, None), 'ving': (39, 3, 'e'), 'zing': (54.1, 3, 'e')}, 5: {'ained': (13.6, 2, None), 'anges': (23, 1, None), 'aning': (46.6, 3, None), 'ating': (57, 3, 'e'), 'cting': (48.1, 3, None), 'dding': (40.4, 4, None), 'dings': (40, 4, 'e'), 'eared': (20.3, 2, None), 'ected': (15, 2, None), 'eding': (40.5, 3, None), 
'eling': (42.1, 3, None), 'ening': (46.5, 3, None), 'erned': (13.5, 2, None), 'erred': (19, 3, None), 'eting': (48.4, 3, None), 'gging': (45.1, 4, None), 'gings': (45, 4, 'e'), 'gning': (46.4, 3, None), 'iases': (11.4, 2, None), 'ifted': (14, 2, None), 'iring': (54.4, 3, 'e'), 'lding': (40.3, 3, None), 'leted': (22.3, 1, None), 'lings': (42, 4, 'e'), 'lling': (41, 4, None), 'lming': (44.1, 3, None), 'lored': (20.4, 2, None), 'mings': (44, 4, 'e'), 'mming': (44.3, 4, None), 'ncing': (54.2, 3, 'e'), 'nding': (40.1, 3, None), 'nging': (45.2, 3, None), 'nning': (46.3, 4, None), 'noted': (22.4, 1, None), 'nting': (48.2, 3, None), 'oling': (42.3, 3, None), 'oning': (46.2, 3, None), 'pting': (48.3, 3, None), 'rabed': (36.1, 1, None), 'rding': (40.2, 3, None), 'rebed': (36.1, 1, None), 'ribed': (36.1, 1, None), 'rming': (44.2, 3, None), 'rning': (46.1, 3, None), 'robed': (36.1, 1, None), 'rubed': (36.1, 1, None), 'sings': (54, 4, 'e'), 'ssing': (37, 3, None), 'sting': (47, 3, None), 'thing': (58.1, 0, None), 'tings': (48, 4, 'e'), 'tored': (20.2, 1, None), 'tting': (26, 4, None), 'ulted': (32, 2, None), 'uming': (33, 3, 'e'), 'uring': (54.3, 3, 'e'), 'urred': (20.5, 3, None), 'vided': (16, 1, None), 'vings': (39, 4, 'e')}, 6: {'aceous': (1, 6, None), 'acting': (25, 3, None), 'ailing': (42.2, 3, None), 'aining': (24, 3, None), 'ddings': (40.4, 5, None), 'eading': (40.7, 3, None), 'ealing': (42.4, 3, None), 'edings': (40.5, 4, None), 'elings': (42.1, 4, None), 'etings': (48.4, 4, None), 'ggings': (45.1, 5, None), 'irings': (54.4, 4, 'e'), 'ldings': (40.3, 4, None), 'llings': (41, 5, None), 'mmings': (44.3, 5, None), 'ncings': (54.2, 4, 'e'), 'ndings': (40.1, 4, None), 'ngings': (45.2, 4, None), 'ntings': (48.2, 4, None), 'oading': (40.6, 3, None), 'olings': (42.3, 4, None), 'rdings': (40.2, 4, None), 'ssings': (37, 4, None), 'stings': (47, 4, None), 'things': (58.1, 1, None), 'ttings': (26, 5, None), 'ulting': (38, 3, None), 'urings': (54.3, 4, 'e'), 'viding': (27, 3, 'e')}, 
7: {'ailings': (42.2, 4, None), 'eadings': (40.7, 4, None), 'ealings': (42.4, 4, None), 'fulness': (34, 4, None), 'oadings': (40.6, 4, None), 'ousness': (35, 4, None), 'titudes': (30, 1, None)}}¶
-
stem
(word)[source]¶ Return UEA-Lite stem.
- Parameters
word (str) -- The word to stem
- Returns
Word stem
- Return type
str or (str, int)
Examples
>>> uealite('readings')
'read'
>>> uealite('insulted')
'insult'
>>> uealite('cussed')
'cuss'
>>> uealite('fancies')
'fancy'
>>> uealite('eroded')
'erode'
New in version 0.1.0.
Changed in version 0.3.6: Encapsulated in class
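Each entry in the rule tables above appears to map a suffix to a tuple of (rule number, number of trailing characters to strip, string to append or None), grouped by suffix length. The following is a minimal sketch of a longest-suffix-first lookup over a table of that shape. `RULES` is a small excerpt of `_standard_rule_table`; `apply_rule_table` and the fallback rule number 0 are hypothetical and not part of abydos. The real `stem` method may perform further checks (for example the word and acronym length limits) that this sketch omits.

```python
# Excerpt of the standard rule table shown above:
# suffix length -> suffix -> (rule number, characters to strip, text to append)
RULES = {
    3: {'ies': (59, 3, 'y')},
    4: {'oded': (61.1, 1, None), 'ssed': (28, 2, None)},
    5: {'ulted': (32, 2, None)},
    7: {'eadings': (40.7, 4, None)},
}


def apply_rule_table(word, rules=RULES):
    """Return (stem, rule_no) for the longest suffix found in the table.

    Hypothetical helper: the (word, 0) fallback is an assumption made
    for this sketch, not documented abydos behaviour.
    """
    for length in sorted(rules, reverse=True):
        if length > len(word):
            continue
        entry = rules[length].get(word[-length:])
        if entry is not None:
            rule_no, strip, append = entry
            # Drop `strip` trailing characters, then add the replacement.
            return word[:len(word) - strip] + (append or ''), rule_no
    return word, 0


for w in ('readings', 'insulted', 'cussed', 'fancies', 'eroded'):
    print(w, apply_rule_table(w))
```

Trying longer suffixes first matters: 'readings' must match 'eadings' (strip 4) before any shorter suffix gets a chance.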
-
abydos.stemmer.
uealite
(word, max_word_length=20, max_acro_length=8, return_rule_no=False, var='standard')[source]¶ Return UEA-Lite stem.
This is a wrapper for UEALite.stem().
- Parameters
word (str) -- The word to stem
max_word_length (int) -- The maximum word length allowed
max_acro_length (int) -- The maximum acronym length allowed
return_rule_no (bool) -- If True, returns the stem along with rule number
var (str) -- Variant rules to use:
Adams to use Jason Adams' rules
Perl to use the original Perl rules
- Returns
Word stem
- Return type
str or (str, int)
Examples
>>> uealite('readings')
'read'
>>> uealite('insulted')
'insult'
>>> uealite('cussed')
'cuss'
>>> uealite('fancies')
'fancy'
>>> uealite('eroded')
'erode'
New in version 0.1.0.
Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the UEALite.stem method instead.
-
class
abydos.stemmer.
SStemmer
[source]¶ Bases:
abydos.stemmer._stemmer._Stemmer
S-stemmer.
The S stemmer is defined in [Har91].
New in version 0.3.6.
-
stem
(word)[source]¶ Return the S-stemmed form of a word.
- Parameters
word (str) -- The word to stem
- Returns
Word stem
- Return type
str
Examples
>>> stmr = SStemmer()
>>> stmr.stem('summaries')
'summary'
>>> stmr.stem('summary')
'summary'
>>> stmr.stem('towers')
'tower'
>>> stmr.stem('reading')
'reading'
>>> stmr.stem('census')
'census'
New in version 0.3.0.
Changed in version 0.3.6: Encapsulated in class
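The S-stemmer only conflates simple English plurals. As a rough illustration, the three rules as they are commonly stated for this stemmer can be sketched as follows. `s_stem_sketch` is a hypothetical name, and the exact rule set and fall-through behaviour of `SStemmer.stem` are assumptions here, not taken from the abydos source.

```python
def s_stem_sketch(word):
    """Try three plural rules in order and apply the first that holds."""
    lowered = word.lower()
    # -ies -> -y, unless preceded by 'e' or 'a'
    if lowered.endswith('ies') and not lowered.endswith(('eies', 'aies')):
        return word[:-3] + 'y'
    # -es -> -e, unless preceded by 'a', 'e', or 'o'
    if lowered.endswith('es') and not lowered.endswith(('aes', 'ees', 'oes')):
        return word[:-1]
    # -s -> removed, unless preceded by 'u' or 's'
    if lowered.endswith('s') and not lowered.endswith(('us', 'ss')):
        return word[:-1]
    return word


for w in ('summaries', 'summary', 'towers', 'reading', 'census'):
    print(w, '->', s_stem_sketch(w))
```

The exceptions explain the 'census' example above: the final -s follows a 'u', so no rule applies.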
-
-
abydos.stemmer.
s_stemmer
(word)[source]¶ Return the S-stemmed form of a word.
This is a wrapper for SStemmer.stem().
- Parameters
word (str) -- The word to stem
- Returns
Word stem
- Return type
str
Examples
>>> s_stemmer('summaries')
'summary'
>>> s_stemmer('summary')
'summary'
>>> s_stemmer('towers')
'tower'
>>> s_stemmer('reading')
'reading'
>>> s_stemmer('census')
'census'
New in version 0.3.0.
Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the SStemmer.stem method instead.
-
class
abydos.stemmer.
Caumanns
[source]¶ Bases:
abydos.stemmer._stemmer._Stemmer
Caumanns stemmer.
Jörg Caumanns' stemmer is described in his article [Cau99].
This implementation is based on the GermanStemFilter described at [Lan13].
New in version 0.3.6.
-
_umlauts
= {228: 'a', 246: 'o', 252: 'u'}¶
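The keys of this mapping are the Unicode code points of ä, ö, and ü, which is the shape that Python's built-in `str.translate` accepts. A minimal sketch of using such a table to strip umlauts follows; `UMLAUTS` and `strip_umlauts` are illustrative names, and where exactly the stemmer applies the table is an assumption.

```python
# Same shape as the _umlauts attribute: code point -> replacement string.
UMLAUTS = {228: 'a', 246: 'o', 252: 'u'}  # ord('ä'), ord('ö'), ord('ü')


def strip_umlauts(word):
    """Replace ä/ö/ü with their base vowels using str.translate."""
    return word.translate(UMLAUTS)


print(strip_umlauts('häuser'))
print(strip_umlauts('grün'))
```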
-
stem
(word)[source]¶ Return Caumanns German stem.
- Parameters
word (str) -- The word to stem
- Returns
Word stem
- Return type
str
Examples
>>> stmr = Caumanns()
>>> stmr.stem('lesen')
'les'
>>> stmr.stem('graues')
'grau'
>>> stmr.stem('buchstabieren')
'buchstabier'
New in version 0.2.0.
Changed in version 0.3.6: Encapsulated in class
-
-
abydos.stemmer.
caumanns
(word)[source]¶ Return Caumanns German stem.
This is a wrapper for Caumanns.stem().
- Parameters
word (str) -- The word to stem
- Returns
Word stem
- Return type
str
Examples
>>> caumanns('lesen')
'les'
>>> caumanns('graues')
'grau'
>>> caumanns('buchstabieren')
'buchstabier'
New in version 0.2.0.
Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the Caumanns.stem method instead.
-
class
abydos.stemmer.
Schinke
[source]¶ Bases:
abydos.stemmer._stemmer._Stemmer
Schinke stemmer.
This is defined in [SGRW96].
New in version 0.3.6.
-
_keep_que
= {'abs', 'abus', 'adae', 'adus', 'aps', 'at', 'attor', 'co', 'conco', 'contor', 'cui', 'cuius', 'de', 'deco', 'deni', 'detor', 'exco', 'extor', 'inco', 'intor', 'ita', 'ne', 'obli', 'obtor', 'optor', 'perae', 'plenis', 'praetor', 'qua', 'quae', 'quam', 'quando', 'quarum', 'quas', 'quem', 'qui', 'quibus', 'quis', 'quo', 'quorum', 'quos', 'quotusquis', 'quous', 'reco', 'retor', 'sus', 'tor', 'ubi', 'undi', 'us', 'uter', 'uti', 'utribi', 'utro'}¶
-
_n_endings
= {1: {'a', 'e', 'i', 'o', 'u'}, 2: {'ae', 'am', 'as', 'em', 'es', 'ia', 'is', 'nt', 'os', 'ud', 'um', 'us'}, 3: {'ius'}, 4: {'ibus'}}¶
-
_v_endings_alter
= {1: {}, 2: {'bo'}, 3: {'bor', 'ero', 'unt'}, 4: {'iunt'}, 5: {'beris', 'erunt', 'untur'}, 6: {'iuntur'}}¶
-
_v_endings_strip
= {1: {'m', 'r', 's', 't'}, 2: {'ns', 'nt', 'ri'}, 3: {'mur', 'mus', 'ris', 'sti', 'tis', 'tur'}, 4: {'mini', 'ntur', 'stis'}, 5: {}, 6: {}}¶
-
stem
(word)[source]¶ Return the stem of a word according to the Schinke stemmer.
- Parameters
word (str) -- The word to stem
- Returns
Noun ('n') and verb ('v') stems of the word
- Return type
dict
Examples
>>> stmr = Schinke()
>>> stmr.stem('atque')
{'n': 'atque', 'v': 'atque'}
>>> stmr.stem('census')
{'n': 'cens', 'v': 'censu'}
>>> stmr.stem('virum')
{'n': 'uir', 'v': 'uiru'}
>>> stmr.stem('populusque')
{'n': 'popul', 'v': 'populu'}
>>> stmr.stem('senatus')
{'n': 'senat', 'v': 'senatu'}
New in version 0.3.0.
Changed in version 0.3.6: Encapsulated in class
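The noun half of the examples above can be reproduced with a short sketch: normalize j/v to i/u, handle the enclitic -que using the keep list, then strip the longest matching noun ending if at least two characters remain. `KEEP_QUE` is only an excerpt of `_keep_que`, `schinke_noun_stem` is a hypothetical helper, and the exact control flow of `Schinke.stem` (including all verb handling, which is omitted) is an assumption.

```python
KEEP_QUE = {'at', 'ita', 'ne', 'quo', 'ubi'}  # excerpt of _keep_que
N_ENDINGS = {
    1: {'a', 'e', 'i', 'o', 'u'},
    2: {'ae', 'am', 'as', 'em', 'es', 'ia', 'is', 'nt', 'os', 'ud', 'um', 'us'},
    3: {'ius'},
    4: {'ibus'},
}


def schinke_noun_stem(word):
    """Return a noun stem only (sketch; verb stemming is not covered)."""
    word = word.lower().replace('j', 'i').replace('v', 'u')
    if word.endswith('que'):
        if word[:-3] in KEEP_QUE:
            return word  # -que is part of the word, not an enclitic
        word = word[:-3]
    # Check the longest endings first.
    for length in (4, 3, 2, 1):
        if word[-length:] in N_ENDINGS[length]:
            # Only strip if the remaining stem has at least two characters.
            return word[:-length] if len(word) - length >= 2 else word
    return word


for w in ('atque', 'census', 'virum', 'populusque', 'senatus'):
    print(w, '->', schinke_noun_stem(w))
```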
-
-
abydos.stemmer.
schinke
(word)[source]¶ Return the stem of a word according to the Schinke stemmer.
This is a wrapper for Schinke.stem().
- Parameters
word (str) -- The word to stem
- Returns
Noun ('n') and verb ('v') stems of the word
- Return type
dict
Examples
>>> schinke('atque')
{'n': 'atque', 'v': 'atque'}
>>> schinke('census')
{'n': 'cens', 'v': 'censu'}
>>> schinke('virum')
{'n': 'uir', 'v': 'uiru'}
>>> schinke('populusque')
{'n': 'popul', 'v': 'populu'}
>>> schinke('senatus')
{'n': 'senat', 'v': 'senatu'}
New in version 0.3.0.
Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the Schinke.stem method instead.
-
class
abydos.stemmer.
Porter
(early_english=False)[source]¶ Bases:
abydos.stemmer._stemmer._Stemmer
Porter stemmer.
The Porter stemmer is described in [Por80].
New in version 0.3.6.
Initialize Porter instance.
- Parameters
early_english (bool) -- Set to True in order to remove -est & -eth (2nd & 3rd person singular verbal agreement suffixes)
New in version 0.4.0.
-
_ends_in_cvc
(term)[source]¶ Return Porter helper function _ends_in_cvc value.
- Parameters
term (str) -- The word to scan for cvc
- Returns
True iff the stem ends in cvc (as defined in the Porter stemmer definition)
- Return type
bool
New in version 0.1.0.
Changed in version 0.3.6: Encapsulated in class
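In Porter's definition, the cvc test holds when a stem ends consonant-vowel-consonant and the final consonant is not w, x, or y. A minimal sketch follows, assuming the flat vowel set listed in `_vowels` (so 'y' always counts as a vowel, which is a simplification of Porter's context-dependent treatment of y). `ends_in_cvc` is an illustrative stand-in, not the class method itself.

```python
VOWELS = {'a', 'e', 'i', 'o', 'u', 'y'}  # mirrors the _vowels attribute


def ends_in_cvc(term):
    """Return True if term ends consonant-vowel-consonant (last not w/x)."""
    return (
        len(term) >= 3
        and term[-3] not in VOWELS
        and term[-2] in VOWELS
        and term[-1] not in VOWELS
        and term[-1] not in 'wx'  # final 'y' is already excluded as a vowel
    )


print(ends_in_cvc('hop'), ends_in_cvc('snow'), ends_in_cvc('fail'))
```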
-
_ends_in_doubled_cons
(term)[source]¶ Return Porter helper function _ends_in_doubled_cons value.
- Parameters
term (str) -- The word to check for a final doubled consonant
- Returns
True iff the stem ends in a doubled consonant (as defined in the Porter stemmer definition)
- Return type
bool
New in version 0.1.0.
Changed in version 0.3.6: Encapsulated in class
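The doubled-consonant test can be sketched in a few lines: the last two characters must be identical and must not be vowels. This again assumes the flat `_vowels` set, and `ends_in_doubled_cons` is an illustrative stand-in rather than the method's actual body.

```python
VOWELS = {'a', 'e', 'i', 'o', 'u', 'y'}  # mirrors the _vowels attribute


def ends_in_doubled_cons(term):
    """Return True if term ends in two identical consonants."""
    return len(term) >= 2 and term[-1] == term[-2] and term[-1] not in VOWELS


print(ends_in_doubled_cons('hopp'), ends_in_doubled_cons('tree'))
```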
-
_has_vowel
(term)[source]¶ Return Porter helper function _has_vowel value.
- Parameters
term (str) -- The word to scan for vowels
- Returns
True iff a vowel exists in the term (as defined in the Porter stemmer definition)
- Return type
bool
New in version 0.1.0.
Changed in version 0.3.6: Encapsulated in class
-
_m_degree
(term)[source]¶ Return Porter helper function _m_degree value.
The m-degree is equal to the number of vowel-to-consonant (V to C) transitions in the term.
- Parameters
term (str) -- The word for which to calculate the m-degree
- Returns
The m-degree as defined in the Porter stemmer definition
- Return type
int
New in version 0.1.0.
Changed in version 0.3.6: Encapsulated in class
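Counting V to C transitions is a single pass over the term. The sketch below assumes the flat `_vowels` set (treating 'y' as a vowel everywhere, a simplification) and uses an illustrative function name. The test words are the familiar ones from Porter's paper.

```python
VOWELS = {'a', 'e', 'i', 'o', 'u', 'y'}  # mirrors the _vowels attribute


def m_degree(term):
    """Count the places where a vowel is immediately followed by a consonant."""
    count = 0
    prev_was_vowel = False
    for ch in term:
        is_vowel = ch in VOWELS
        if prev_was_vowel and not is_vowel:
            count += 1
        prev_was_vowel = is_vowel
    return count


for w in ('tree', 'trouble', 'oats', 'troubles', 'private'):
    print(w, m_degree(w))
```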
-
_vowels
= {'a', 'e', 'i', 'o', 'u', 'y'}¶
-
stem
(word)[source]¶ Return Porter stem.
- Parameters
word (str) -- The word to stem
- Returns
Word stem
- Return type
str
Examples
>>> stmr = Porter()
>>> stmr.stem('reading')
'read'
>>> stmr.stem('suspension')
'suspens'
>>> stmr.stem('elusiveness')
'elus'
>>> stmr = Porter(early_english=True)
>>> stmr.stem('eateth')
'eat'
New in version 0.1.0.
Changed in version 0.3.6: Encapsulated in class
-
abydos.stemmer.
porter
(word, early_english=False)[source]¶ Return Porter stem.
This is a wrapper for Porter.stem().
- Parameters
word (str) -- The word to stem
early_english (bool) -- Set to True in order to remove -est & -eth (2nd & 3rd person singular verbal agreement suffixes)
- Returns
Word stem
- Return type
str
Examples
>>> porter('reading')
'read'
>>> porter('suspension')
'suspens'
>>> porter('elusiveness')
'elus'
>>> porter('eateth', early_english=True)
'eat'
New in version 0.1.0.
Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the Porter.stem method instead.
-
class
abydos.stemmer.
Porter2
(early_english=False)[source]¶ Bases:
abydos.stemmer._snowball._Snowball
Porter2 (Snowball English) stemmer.
The Porter2 (Snowball English) stemmer is defined in [Por02].
New in version 0.3.6.
Initialize Porter2 instance.
- Parameters
early_english (bool) -- Set to True in order to remove -est & -eth (2nd & 3rd person singular verbal agreement suffixes)
New in version 0.4.0.
-
_doubles
= {'bb', 'dd', 'ff', 'gg', 'mm', 'nn', 'pp', 'rr', 'tt'}¶
-
_exception1dict
= {'dying': 'die', 'early': 'earli', 'gently': 'gentl', 'idly': 'idl', 'lying': 'lie', 'only': 'onli', 'singly': 'singl', 'skies': 'sky', 'skis': 'ski', 'tying': 'tie', 'ugly': 'ugli'}¶
-
_exception1set
= {'andes', 'atlas', 'bias', 'cosmos', 'howe', 'news', 'sky'}¶
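The Porter2 specification handles a small set of exceptional forms before the main algorithm runs: some words map to a fixed stem and others are left unchanged. The two attributes above have exactly that shape. The sketch below shows one plausible way to consult them; `check_exceptions` is a hypothetical helper, and how `Porter2.stem` actually uses these attributes is an assumption.

```python
EXCEPTION1_DICT = {
    'dying': 'die', 'early': 'earli', 'gently': 'gentl', 'idly': 'idl',
    'lying': 'lie', 'only': 'onli', 'singly': 'singl', 'skies': 'sky',
    'skis': 'ski', 'tying': 'tie', 'ugly': 'ugli',
}
EXCEPTION1_SET = {'andes', 'atlas', 'bias', 'cosmos', 'howe', 'news', 'sky'}


def check_exceptions(word):
    """Return a fixed stem for an exceptional word, or None if not listed."""
    if word in EXCEPTION1_DICT:
        return EXCEPTION1_DICT[word]  # special-cased stem
    if word in EXCEPTION1_SET:
        return word  # invariant form: returned unchanged
    return None  # caller would continue with the regular algorithm


print(check_exceptions('skies'), check_exceptions('news'), check_exceptions('reading'))
```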
-
_exception2set
= {'canning', 'earring', 'exceed', 'herring', 'inning', 'outing', 'proceed', 'succeed'}¶
-
_li
= {'c', 'd', 'e', 'g', 'h', 'k', 'm', 'n', 'r', 't'}¶
-
_r1_prefixes
= ('commun', 'gener', 'arsen')¶
-
stem
(word)[source]¶ Return the Porter2 (Snowball English) stem.
- Parameters
word (str) -- The word to stem
- Returns
Word stem
- Return type
str
Examples
>>> stmr = Porter2()
>>> stmr.stem('reading')
'read'
>>> stmr.stem('suspension')
'suspens'
>>> stmr.stem('elusiveness')
'elus'
>>> stmr = Porter2(early_english=True)
>>> stmr.stem('eateth')
'eat'
New in version 0.1.0.
Changed in version 0.3.6: Encapsulated in class
-
abydos.stemmer.
porter2
(word, early_english=False)[source]¶ Return the Porter2 (Snowball English) stem.
This is a wrapper for Porter2.stem().
- Parameters
word (str) -- The word to stem
early_english (bool) -- Set to True in order to remove -est & -eth (2nd & 3rd person singular verbal agreement suffixes)
- Returns
Word stem
- Return type
str
Examples
>>> porter2('reading')
'read'
>>> porter2('suspension')
'suspens'
>>> porter2('elusiveness')
'elus'
>>> porter2('eateth', early_english=True)
'eat'
New in version 0.1.0.
Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the Porter2.stem method instead.
-
class
abydos.stemmer.
SnowballDanish
[source]¶ Bases:
abydos.stemmer._snowball._Snowball
Snowball Danish stemmer.
The Snowball Danish stemmer is defined at: http://snowball.tartarus.org/algorithms/danish/stemmer.html
New in version 0.3.6.
-
_s_endings
= {'a', 'b', 'c', 'd', 'f', 'g', 'h', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'r', 't', 'v', 'y', 'z', 'å'}¶
-
_vowels
= {'a', 'e', 'i', 'o', 'u', 'y', 'å', 'æ', 'ø'}¶
-
stem
(word)[source]¶ Return Snowball Danish stem.
- Parameters
word (str) -- The word to stem
- Returns
Word stem
- Return type
str
Examples
>>> stmr = SnowballDanish()
>>> stmr.stem('underviser')
'undervis'
>>> stmr.stem('suspension')
'suspension'
>>> stmr.stem('sikkerhed')
'sikker'
New in version 0.1.0.
Changed in version 0.3.6: Encapsulated in class
-
-
abydos.stemmer.
sb_danish
(word)[source]¶ Return Snowball Danish stem.
This is a wrapper for SnowballDanish.stem().
- Parameters
word (str) -- The word to stem
- Returns
Word stem
- Return type
str
Examples
>>> sb_danish('underviser')
'undervis'
>>> sb_danish('suspension')
'suspension'
>>> sb_danish('sikkerhed')
'sikker'
New in version 0.1.0.
Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the SnowballDanish.stem method instead.
-
class
abydos.stemmer.
SnowballDutch
[source]¶ Bases:
abydos.stemmer._snowball._Snowball
Snowball Dutch stemmer.
The Snowball Dutch stemmer is defined at: http://snowball.tartarus.org/algorithms/dutch/stemmer.html
New in version 0.3.6.
-
_accented
= {225: 'a', 228: 'a', 233: 'e', 235: 'e', 237: 'i', 239: 'i', 243: 'o', 246: 'o', 250: 'u', 252: 'u'}¶
-
_not_s_endings
= {'a', 'e', 'i', 'j', 'o', 'u', 'y', 'è'}¶
-
_undouble
(word)[source]¶ Undouble endings -kk, -dd, and -tt.
- Parameters
word (str) -- The word to stem
- Returns
The word with doubled endings undoubled
- Return type
str
New in version 0.1.0.
Changed in version 0.3.6: Encapsulated in class
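The undoubling step described above amounts to dropping one character when the word ends in one of the three doubled consonants. A minimal sketch, using an illustrative function name and made-up intermediate forms as inputs:

```python
def undouble(word):
    """Drop the final letter if word ends in -kk, -dd, or -tt."""
    if word.endswith(('kk', 'dd', 'tt')):
        return word[:-1]
    return word


print(undouble('bakk'), undouble('redd'), undouble('zitt'), undouble('lez'))
```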
-
_vowels
= {'a', 'e', 'i', 'o', 'u', 'y', 'è'}¶
-
stem
(word)[source]¶ Return Snowball Dutch stem.
- Parameters
word (str) -- The word to stem
- Returns
Word stem
- Return type
str
Examples
>>> stmr = SnowballDutch()
>>> stmr.stem('lezen')
'lez'
>>> stmr.stem('opschorting')
'opschort'
>>> stmr.stem('ongrijpbaarheid')
'ongrijp'
New in version 0.1.0.
Changed in version 0.3.6: Encapsulated in class
-
-
abydos.stemmer.
sb_dutch
(word)[source]¶ Return Snowball Dutch stem.
This is a wrapper for SnowballDutch.stem().
- Parameters
word (str) -- The word to stem
- Returns
Word stem
- Return type
str
Examples
>>> sb_dutch('lezen')
'lez'
>>> sb_dutch('opschorting')
'opschort'
>>> sb_dutch('ongrijpbaarheid')
'ongrijp'
New in version 0.1.0.
Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the SnowballDutch.stem method instead.
-
class
abydos.stemmer.
SnowballGerman
(alternate_vowels=False)[source]¶ Bases:
abydos.stemmer._snowball._Snowball
Snowball German stemmer.
The Snowball German stemmer is defined at: http://snowball.tartarus.org/algorithms/german/stemmer.html
New in version 0.3.6.
Initialize SnowballGerman instance.
- Parameters
alternate_vowels (bool) -- Composes ae as ä, oe as ö, and ue as ü before running the algorithm
New in version 0.4.0.
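The composition that `alternate_vowels` describes can be sketched as a pre-processing pass over the word. `compose_umlauts` is a hypothetical helper; the guard that leaves 'ue' alone after 'q' follows the variant described on the Snowball site and is an assumption about this implementation.

```python
import re


def compose_umlauts(word):
    """Rewrite the digraphs ae/oe/ue as ä/ö/ü (ue is kept after q)."""
    word = word.replace('ae', 'ä').replace('oe', 'ö')
    # Negative lookbehind: skip 'ue' when it is part of 'que'.
    return re.sub(r'(?<!q)ue', 'ü', word)


for w in ('maedchen', 'schoen', 'fuer', 'quelle'):
    print(w, '->', compose_umlauts(w))
```

A purely textual substitution like this also rewrites words in which the digraph is not an umlaut spelling (for example 'aktuell'), which is presumably why the option is off by default.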
-
_s_endings
= {'b', 'd', 'f', 'g', 'h', 'k', 'l', 'm', 'n', 'r', 't'}¶
-
_st_endings
= {'b', 'd', 'f', 'g', 'h', 'k', 'l', 'm', 'n', 't'}¶
-
_vowels
= {'a', 'e', 'i', 'o', 'u', 'y', 'ä', 'ö', 'ü'}¶
-
stem
(word)[source]¶ Return Snowball German stem.
- Parameters
word (str) -- The word to stem
- Returns
Word stem
- Return type
str
Examples
>>> stmr = SnowballGerman()
>>> stmr.stem('lesen')
'les'
>>> stmr.stem('graues')
'grau'
>>> stmr.stem('buchstabieren')
'buchstabi'
New in version 0.1.0.
Changed in version 0.3.6: Encapsulated in class
-
abydos.stemmer.
sb_german
(word, alternate_vowels=False)[source]¶ Return Snowball German stem.
This is a wrapper for SnowballGerman.stem().
- Parameters
word (str) -- The word to stem
alternate_vowels (bool) -- Composes ae as ä, oe as ö, and ue as ü before running the algorithm
- Returns
Word stem
- Return type
str
Examples
>>> sb_german('lesen')
'les'
>>> sb_german('graues')
'grau'
>>> sb_german('buchstabieren')
'buchstabi'
New in version 0.1.0.
Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the SnowballGerman.stem method instead.
-
class
abydos.stemmer.
SnowballNorwegian
[source]¶ Bases:
abydos.stemmer._snowball._Snowball
Snowball Norwegian stemmer.
The Snowball Norwegian stemmer is defined at: http://snowball.tartarus.org/algorithms/norwegian/stemmer.html
New in version 0.3.6.
-
_s_endings
= {'b', 'c', 'd', 'f', 'g', 'h', 'j', 'l', 'm', 'n', 'o', 'p', 'r', 't', 'v', 'y', 'z'}¶
-
_vowels
= {'a', 'e', 'i', 'o', 'u', 'y', 'å', 'æ', 'ø'}¶
-
stem
(word)[source]¶ Return Snowball Norwegian stem.
- Parameters
word (str) -- The word to stem
- Returns
Word stem
- Return type
str
Examples
>>> stmr = SnowballNorwegian()
>>> stmr.stem('lese')
'les'
>>> stmr.stem('suspensjon')
'suspensjon'
>>> stmr.stem('sikkerhet')
'sikker'
New in version 0.1.0.
Changed in version 0.3.6: Encapsulated in class
-
-
abydos.stemmer.
sb_norwegian
(word)[source]¶ Return Snowball Norwegian stem.
This is a wrapper for SnowballNorwegian.stem().
- Parameters
word (str) -- The word to stem
- Returns
Word stem
- Return type
str
Examples
>>> sb_norwegian('lese')
'les'
>>> sb_norwegian('suspensjon')
'suspensjon'
>>> sb_norwegian('sikkerhet')
'sikker'
New in version 0.1.0.
Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the SnowballNorwegian.stem method instead.
-
class
abydos.stemmer.
SnowballSwedish
[source]¶ Bases:
abydos.stemmer._snowball._Snowball
Snowball Swedish stemmer.
The Snowball Swedish stemmer is defined at: http://snowball.tartarus.org/algorithms/swedish/stemmer.html
New in version 0.3.6.
-
_s_endings
= {'b', 'c', 'd', 'f', 'g', 'h', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'r', 't', 'v', 'y'}¶
-
_vowels
= {'a', 'e', 'i', 'o', 'u', 'y', 'ä', 'å', 'ö'}¶
-
stem
(word)[source]¶ Return Snowball Swedish stem.
- Parameters
word (str) -- The word to stem
- Returns
Word stem
- Return type
str
Examples
>>> stmr = SnowballSwedish()
>>> stmr.stem('undervisa')
'undervis'
>>> stmr.stem('suspension')
'suspension'
>>> stmr.stem('visshet')
'viss'
New in version 0.1.0.
Changed in version 0.3.6: Encapsulated in class
-
-
abydos.stemmer.
sb_swedish
(word)[source]¶ Return Snowball Swedish stem.
This is a wrapper for SnowballSwedish.stem().
- Parameters
word (str) -- The word to stem
- Returns
Word stem
- Return type
str
Examples
>>> sb_swedish('undervisa')
'undervis'
>>> sb_swedish('suspension')
'suspension'
>>> sb_swedish('visshet')
'viss'
New in version 0.1.0.
Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the SnowballSwedish.stem method instead.
-
class
abydos.stemmer.
CLEFGerman
[source]¶ Bases:
abydos.stemmer._stemmer._Stemmer
CLEF German stemmer.
The CLEF German stemmer is defined at [Sav05].
New in version 0.3.6.
-
_umlauts
= {228: 'a', 246: 'o', 252: 'u'}¶
-
stem
(word)[source]¶ Return CLEF German stem.
- Parameters
word (str) -- The word to stem
- Returns
Word stem
- Return type
str
Examples
>>> stmr = CLEFGerman()
>>> stmr.stem('lesen')
'lese'
>>> stmr.stem('graues')
'grau'
>>> stmr.stem('buchstabieren')
'buchstabier'
New in version 0.1.0.
Changed in version 0.3.6: Encapsulated in class
-
-
abydos.stemmer.
clef_german
(word)[source]¶ Return CLEF German stem.
This is a wrapper for CLEFGerman.stem().
- Parameters
word (str) -- The word to stem
- Returns
Word stem
- Return type
str
Examples
>>> clef_german('lesen')
'lese'
>>> clef_german('graues')
'grau'
>>> clef_german('buchstabieren')
'buchstabier'
New in version 0.1.0.
Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the CLEFGerman.stem method instead.
-
class
abydos.stemmer.
CLEFGermanPlus
[source]¶ Bases:
abydos.stemmer._stemmer._Stemmer
CLEF German stemmer plus.
The CLEF German stemmer plus is defined at [Sav05].
New in version 0.3.6.
-
_accents
= {224: 'a', 225: 'a', 226: 'a', 228: 'a', 236: 'i', 237: 'i', 238: 'i', 239: 'i', 242: 'o', 243: 'o', 244: 'o', 246: 'o', 249: 'u', 250: 'u', 251: 'u', 252: 'u'}¶
-
_st_ending
= {'b', 'd', 'f', 'g', 'h', 'k', 'l', 'm', 'n', 't'}¶
-
stem
(word)[source]¶ Return 'CLEF German stemmer plus' stem.
- Parameters
word (str) -- The word to stem
- Returns
Word stem
- Return type
str
Examples
>>> stmr = CLEFGermanPlus()
>>> stmr.stem('lesen')
'les'
>>> stmr.stem('graues')
'grau'
>>> stmr.stem('buchstabieren')
'buchstabi'
New in version 0.1.0.
Changed in version 0.3.6: Encapsulated in class
-
-
abydos.stemmer.
clef_german_plus
(word)[source]¶ Return 'CLEF German stemmer plus' stem.
This is a wrapper for CLEFGermanPlus.stem().
- Parameters
word (str) -- The word to stem
- Returns
Word stem
- Return type
str
Examples
>>> clef_german_plus('lesen')
'les'
>>> clef_german_plus('graues')
'grau'
>>> clef_german_plus('buchstabieren')
'buchstabi'
New in version 0.1.0.
Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the CLEFGermanPlus.stem method instead.
-
class
abydos.stemmer.
CLEFSwedish
[source]¶ Bases:
abydos.stemmer._stemmer._Stemmer
CLEF Swedish stemmer.
The CLEF Swedish stemmer is defined at [Sav05].
New in version 0.3.6.
-
stem
(word)[source]¶ Return CLEF Swedish stem.
- Parameters
word (str) -- The word to stem
- Returns
Word stem
- Return type
str
Examples
>>> stmr = CLEFSwedish()
>>> stmr.stem('undervisa')
'undervis'
>>> stmr.stem('suspension')
'suspensio'
>>> stmr.stem('visshet')
'viss'
New in version 0.1.0.
Changed in version 0.3.6: Encapsulated in class
-
-
abydos.stemmer.
clef_swedish
(word)[source]¶ Return CLEF Swedish stem.
This is a wrapper for CLEFSwedish.stem().
- Parameters
word (str) -- The word to stem
- Returns
Word stem
- Return type
str
Examples
>>> clef_swedish('undervisa')
'undervis'
>>> clef_swedish('suspension')
'suspensio'
>>> clef_swedish('visshet')
'viss'
New in version 0.1.0.
Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the CLEFSwedish.stem method instead.