abydos.stemmer package

abydos.stemmer.

The stemmer package collects stemmer classes for a number of languages.

Each stemmer has a stem method, which takes a word and returns its stemmed form:

>>> from abydos.stemmer import Porter
>>> stmr = Porter()
>>> stmr.stem('democracy')
'democraci'
>>> stmr.stem('trusted')
'trust'

class abydos.stemmer._Stemmer[source]

Bases: object

Abstract Stemmer class.

New in version 0.3.6.

stem(word)[source]

Return stem.

Parameters
  • word (str) -- The word to stem

  • *args -- Variable length argument list

  • **kwargs -- Arbitrary keyword arguments

Returns

Word stem

Return type

str

New in version 0.3.6.

class abydos.stemmer._Snowball[source]

Bases: abydos.stemmer._stemmer._Stemmer

Snowball stemmer base class.

New in version 0.3.6.

_codanonvowels = {"'", 'b', 'c', 'd', 'f', 'g', 'h', 'j', 'k', 'l', 'm', 'n', 'p', 'q', 'r', 's', 't', 'v', 'z'}
_sb_ends_in_short_syllable(term)[source]

Return True iff term ends in a short syllable.

(...according to the Porter2 specification.)

NB: This is akin to the CVC test from the Porter stemmer. The specification's description of this test is unfortunately ambiguous.

Parameters

term (str) -- The term to examine

Returns

True iff term ends in a short syllable

Return type

bool

New in version 0.1.0.

Changed in version 0.3.6: Encapsulated in class
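The short-syllable test described above can be sketched as a standalone function. This is a minimal sketch, assuming the two character sets listed for this class (_vowels and _codanonvowels) and the short-syllable definition given in the Porter2 specification. The function name and the module-level constants are hypothetical and are not part of abydos.

```python
# Character sets copied from the _Snowball class attributes documented here.
VOWELS = {'a', 'e', 'i', 'o', 'u', 'y'}
CODA_NONVOWELS = {"'", 'b', 'c', 'd', 'f', 'g', 'h', 'j', 'k', 'l', 'm',
                  'n', 'p', 'q', 'r', 's', 't', 'v', 'z'}


def ends_in_short_syllable(term):
    """Return True iff term ends in a short syllable (Porter2 sense).

    Hypothetical helper. A short syllable is taken to be either
    (a) non-vowel + vowel + non-vowel other than w, x, or y, at the end
        of a term of three or more letters, or
    (b) a two-letter term made of a vowel followed by a non-vowel.
    """
    if len(term) == 2:
        return term[0] in VOWELS and term[1] not in VOWELS
    if len(term) >= 3:
        return (
            term[-3] not in VOWELS
            and term[-2] in VOWELS
            and term[-1] in CODA_NONVOWELS
        )
    return False
```

Under these assumptions, 'rap', 'trap', and 'ow' end in a short syllable, while 'bestow', 'uproot', and 'disturb' do not.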

_sb_has_vowel(term)[source]

Return True iff the term contains a vowel.

Parameters

term (str) -- The term to examine

Returns

True iff a vowel exists in the term (as defined in the Porter stemmer definition)

Return type

bool

New in version 0.1.0.

Changed in version 0.3.6: Encapsulated in class

_sb_r1(term, r1_prefixes=None)[source]

Return the R1 region, as defined in the Porter2 specification.

Parameters
  • term (str) -- The term to examine

  • r1_prefixes (set) -- Prefixes to consider

Returns

Length of the R1 region

Return type

int

New in version 0.1.0.

Changed in version 0.3.6: Encapsulated in class

_sb_r2(term, r1_prefixes=None)[source]

Return the R2 region, as defined in the Porter2 specification.

Parameters
  • term (str) -- The term to examine

  • r1_prefixes (set) -- Prefixes to consider

Returns

Length of the R2 region

Return type

int

New in version 0.1.0.

Changed in version 0.3.6: Encapsulated in class
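The R1 and R2 regions can be illustrated with a short standalone sketch. It assumes the Porter2 definitions: R1 is the region after the first non-vowel that follows a vowel, and R2 is the same rule applied within R1. The function names and the start parameter are hypothetical, and the sketch returns the index at which each region begins, which may differ from how the documented methods report their result.

```python
# Vowel set copied from the _vowels attribute documented for this class.
VOWELS = {'a', 'e', 'i', 'o', 'u', 'y'}


def r1_start(term, r1_prefixes=None, start=0):
    """Return the index at which R1 begins (len(term) if R1 is empty)."""
    # Exceptional prefixes, if given, fix the start of R1 directly.
    if start == 0 and r1_prefixes:
        for prefix in r1_prefixes:
            if term.startswith(prefix):
                return len(prefix)
    vowel_found = False
    for i in range(start, len(term)):
        if term[i] in VOWELS:
            vowel_found = True
        elif vowel_found:
            # First non-vowel after a vowel: the region starts just past it.
            return i + 1
    return len(term)


def r2_start(term, r1_prefixes=None):
    """Return the index at which R2 begins: the R1 rule applied inside R1."""
    return r1_start(term, start=r1_start(term, r1_prefixes))
```

For 'beautiful', this gives an R1 of 'iful' (index 5) and an R2 of 'ul' (index 7). For 'beau', both regions are empty.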

_sb_short_word(term, r1_prefixes=None)[source]

Return True iff term is a short word.

(...according to the Porter2 specification.)

Parameters
  • term (str) -- The term to examine

  • r1_prefixes (set) -- Prefixes to consider

Returns

True iff term is a short word

Return type

bool

New in version 0.1.0.

Changed in version 0.3.6: Encapsulated in class
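A short word, in the Porter2 sense, is one that ends in a short syllable and has an empty R1 region. The following is a minimal self-contained sketch of that test under those definitions. The function name is hypothetical, the character sets are copied from the class attributes listed here, and the r1_prefixes handling of the documented method is omitted for brevity.

```python
VOWELS = set('aeiouy')
CODA_NONVOWELS = set("'bcdfghjklmnpqrstvz")


def is_short_word(term):
    """Return True iff term ends in a short syllable and R1 is empty."""
    # Locate R1: it starts after the first non-vowel that follows a vowel.
    r1 = len(term)
    for i in range(1, len(term)):
        if term[i] not in VOWELS and term[i - 1] in VOWELS:
            r1 = i + 1
            break
    if r1 != len(term):
        return False  # R1 is not empty, so the word cannot be short

    # Short-syllable test: vowel + non-vowel for two-letter terms,
    # otherwise non-vowel + vowel + non-vowel other than w, x, or y.
    if len(term) == 2:
        return term[0] in VOWELS and term[1] not in VOWELS
    return (
        len(term) >= 3
        and term[-3] not in VOWELS
        and term[-2] in VOWELS
        and term[-1] in CODA_NONVOWELS
    )
```

With this sketch, 'bed', 'shed', and 'shred' are short words, while 'bead', 'embed', and 'beds' are not.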

_vowels = {'a', 'e', 'i', 'o', 'u', 'y'}
class abydos.stemmer.Lovins[source]

Bases: abydos.stemmer._stemmer._Stemmer

Lovins stemmer.

The Lovins stemmer is described in Julie Beth Lovins's article [Lov68].

New in version 0.3.6.

Initialize the stemmer.

New in version 0.3.6.

_cond_aa(word, suffix_len)[source]

Return Lovins' condition AA.

Parameters
  • word (str) -- Word to check

  • suffix_len (int) -- Suffix length

Returns

True if condition is met

Return type

bool

New in version 0.2.0.

Changed in version 0.3.6: Encapsulated in class

_cond_b(word, suffix_len)[source]

Return Lovins' condition B.

Parameters
  • word (str) -- Word to check

  • suffix_len (int) -- Suffix length

Returns

True if condition is met

Return type

bool

New in version 0.2.0.

Changed in version 0.3.6: Encapsulated in class

_cond_bb(word, suffix_len)[source]

Return Lovins' condition BB.

Parameters
  • word (str) -- Word to check

  • suffix_len (int) -- Suffix length

Returns

True if condition is met

Return type

bool

New in version 0.2.0.

Changed in version 0.3.6: Encapsulated in class

_cond_c(word, suffix_len)[source]

Return Lovins' condition C.

Parameters
  • word (str) -- Word to check

  • suffix_len (int) -- Suffix length

Returns

True if condition is met

Return type

bool

New in version 0.2.0.

Changed in version 0.3.6: Encapsulated in class

_cond_cc(word, suffix_len)[source]

Return Lovins' condition CC.

Parameters
  • word (str) -- Word to check

  • suffix_len (int) -- Suffix length

Returns

True if condition is met

Return type

bool

New in version 0.2.0.

Changed in version 0.3.6: Encapsulated in class

_cond_d(word, suffix_len)[source]

Return Lovins' condition D.

Parameters
  • word (str) -- Word to check

  • suffix_len (int) -- Suffix length

Returns

True if condition is met

Return type

bool

New in version 0.2.0.

Changed in version 0.3.6: Encapsulated in class

_cond_e(word, suffix_len)[source]

Return Lovins' condition E.

Parameters
  • word (str) -- Word to check

  • suffix_len (int) -- Suffix length

Returns

True if condition is met

Return type

bool

New in version 0.2.0.

Changed in version 0.3.6: Encapsulated in class

_cond_f(word, suffix_len)[source]

Return Lovins' condition F.

Parameters
  • word (str) -- Word to check

  • suffix_len (int) -- Suffix length

Returns

True if condition is met

Return type

bool

New in version 0.2.0.

Changed in version 0.3.6: Encapsulated in class
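Conditions B through F above share one shape: given a word and the length of a candidate suffix, decide whether the suffix may be removed. The sketch below assumes the definitions from Lovins's article (B, C, and D require a minimum stem length of 3, 4, and 5; E forbids removal after 'e'; F combines B and E). The function names are hypothetical stand-ins, not the abydos methods themselves.

```python
def _stem_of(word, suffix_len):
    """Return the part of word left after dropping suffix_len characters."""
    return word[:len(word) - suffix_len]


def cond_b(word, suffix_len):
    """Condition B: minimum stem length of 3."""
    return len(_stem_of(word, suffix_len)) >= 3


def cond_c(word, suffix_len):
    """Condition C: minimum stem length of 4."""
    return len(_stem_of(word, suffix_len)) >= 4


def cond_d(word, suffix_len):
    """Condition D: minimum stem length of 5."""
    return len(_stem_of(word, suffix_len)) >= 5


def cond_e(word, suffix_len):
    """Condition E: do not remove the ending after 'e'."""
    return not _stem_of(word, suffix_len).endswith('e')


def cond_f(word, suffix_len):
    """Condition F: minimum stem length of 3 and not after 'e'."""
    return cond_b(word, suffix_len) and cond_e(word, suffix_len)
```

For example, removing 'ing' (suffix_len=3) from 'reading' leaves the four-letter stem 'read', so conditions B and C hold and condition D does not.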

_cond_g(word, suffix_len)[source]

Return Lovins' condition G.

Parameters
  • word (str) -- Word to check

  • suffix_len (int) -- Suffix length

Returns

True if condition is met

Return type

bool

New in version 0.2.0.

Changed in version 0.3.6: Encapsulated in class

_cond_h(word, suffix_len)[source]

Return Lovins' condition H.

Parameters
  • word (str) -- Word to check

  • suffix_len (int) -- Suffix length

Returns

True if condition is met

Return type

bool

New in version 0.2.0.

Changed in version 0.3.6: Encapsulated in class

_cond_i(word, suffix_len)[source]

Return Lovins' condition I.

Parameters
  • word (str) -- Word to check

  • suffix_len (int) -- Suffix length

Returns

True if condition is met

Return type

bool

New in version 0.2.0.

Changed in version 0.3.6: Encapsulated in class

_cond_j(word, suffix_len)[source]

Return Lovins' condition J.

Parameters
  • word (str) -- Word to check

  • suffix_len (int) -- Suffix length

Returns

True if condition is met

Return type

bool

New in version 0.2.0.

Changed in version 0.3.6: Encapsulated in class

_cond_k(word, suffix_len)[source]

Return Lovins' condition K.

Parameters
  • word (str) -- Word to check

  • suffix_len (int) -- Suffix length

Returns

True if condition is met

Return type

bool

New in version 0.2.0.

Changed in version 0.3.6: Encapsulated in class

_cond_l(word, suffix_len)[source]

Return Lovins' condition L.

Parameters
  • word (str) -- Word to check

  • suffix_len (int) -- Suffix length

Returns

True if condition is met

Return type

bool

New in version 0.2.0.

Changed in version 0.3.6: Encapsulated in class

_cond_m(word, suffix_len)[source]

Return Lovins' condition M.

Parameters
  • word (str) -- Word to check

  • suffix_len (int) -- Suffix length

Returns

True if condition is met

Return type

bool

New in version 0.2.0.

Changed in version 0.3.6: Encapsulated in class

_cond_n(word, suffix_len)[source]

Return Lovins' condition N.

Parameters
  • word (str) -- Word to check

  • suffix_len (int) -- Suffix length

Returns

True if condition is met

Return type

bool

New in version 0.2.0.

Changed in version 0.3.6: Encapsulated in class

_cond_o(word, suffix_len)[source]

Return Lovins' condition O.

Parameters
  • word (str) -- Word to check

  • suffix_len (int) -- Suffix length

Returns

True if condition is met

Return type

bool

New in version 0.2.0.

Changed in version 0.3.6: Encapsulated in class

_cond_p(word, suffix_len)[source]

Return Lovins' condition P.

Parameters
  • word (str) -- Word to check

  • suffix_len (int) -- Suffix length

Returns

True if condition is met

Return type

bool

New in version 0.2.0.

Changed in version 0.3.6: Encapsulated in class

_cond_q(word, suffix_len)[source]

Return Lovins' condition Q.

Parameters
  • word (str) -- Word to check

  • suffix_len (int) -- Suffix length

Returns

True if condition is met

Return type

bool

New in version 0.2.0.

Changed in version 0.3.6: Encapsulated in class

_cond_r(word, suffix_len)[source]

Return Lovins' condition R.

Parameters
  • word (str) -- Word to check

  • suffix_len (int) -- Suffix length

Returns

True if condition is met

Return type

bool

New in version 0.2.0.

Changed in version 0.3.6: Encapsulated in class

_cond_s(word, suffix_len)[source]

Return Lovins' condition S.

Parameters
  • word (str) -- Word to check

  • suffix_len (int) -- Suffix length

Returns

True if condition is met

Return type

bool

New in version 0.2.0.

Changed in version 0.3.6: Encapsulated in class

_cond_t(word, suffix_len)[source]

Return Lovins' condition T.

Parameters
  • word (str) -- Word to check

  • suffix_len (int) -- Suffix length

Returns

True if condition is met

Return type

bool

New in version 0.2.0.

Changed in version 0.3.6: Encapsulated in class

_cond_u(word, suffix_len)[source]

Return Lovins' condition U.

Parameters
  • word (str) -- Word to check

  • suffix_len (int) -- Suffix length

Returns

True if condition is met

Return type

bool

New in version 0.2.0.

Changed in version 0.3.6: Encapsulated in class

_cond_v(word, suffix_len)[source]

Return Lovins' condition V.

Parameters
  • word (str) -- Word to check

  • suffix_len (int) -- Suffix length

Returns

True if condition is met

Return type

bool

New in version 0.2.0.

Changed in version 0.3.6: Encapsulated in class

_cond_w(word, suffix_len)[source]

Return Lovins' condition W.

Parameters
  • word (str) -- Word to check

  • suffix_len (int) -- Suffix length

Returns

True if condition is met

Return type

bool

New in version 0.2.0.

Changed in version 0.3.6: Encapsulated in class

_cond_x(word, suffix_len)[source]

Return Lovins' condition X.

Parameters
  • word (str) -- Word to check

  • suffix_len (int) -- Suffix length

Returns

True if condition is met

Return type

bool

New in version 0.2.0.

Changed in version 0.3.6: Encapsulated in class

_cond_y(word, suffix_len)[source]

Return Lovins' condition Y.

Parameters
  • word (str) -- Word to check

  • suffix_len (int) -- Suffix length

Returns

True if condition is met

Return type

bool

New in version 0.2.0.

Changed in version 0.3.6: Encapsulated in class

_cond_z(word, suffix_len)[source]

Return Lovins' condition Z.

Parameters
  • word (str) -- Word to check

  • suffix_len (int) -- Suffix length

Returns

True if condition is met

Return type

bool

New in version 0.2.0.

Changed in version 0.3.6: Encapsulated in class

_recode = ()
_recode24(stem)[source]

Return Lovins' conditional recode rule 24.

Parameters

stem (str) -- Word to stem

Returns

Word stripped of suffix

Return type

str

New in version 0.2.0.

Changed in version 0.3.6: Encapsulated in class

_recode28(stem)[source]

Return Lovins' conditional recode rule 28.

Parameters

stem (str) -- Word to stem

Returns

Word stripped of suffix

Return type

str

New in version 0.2.0.

Changed in version 0.3.6: Encapsulated in class

_recode30(stem)[source]

Return Lovins' conditional recode rule 30.

Parameters

stem (str) -- Word to stem

Returns

Word stripped of suffix

Return type

str

New in version 0.2.0.

Changed in version 0.3.6: Encapsulated in class

_recode32(stem)[source]

Return Lovins' conditional recode rule 32.

Parameters

stem (str) -- Word to stem

Returns

Word stripped of suffix

Return type

str

New in version 0.2.0.

Changed in version 0.3.6: Encapsulated in class

_recode9(stem)[source]

Return Lovins' conditional recode rule 9.

Parameters

stem (str) -- Word to stem

Returns

Word stripped of suffix

Return type

str

New in version 0.2.0.

Changed in version 0.3.6: Encapsulated in class

_suffix = {}
stem(word)[source]

Return Lovins stem.

Parameters

word (str) -- The word to stem

Returns

Word stem

Return type

str

Examples

>>> from abydos.stemmer import Lovins
>>> stmr = Lovins()
>>> stmr.stem('reading')
'read'
>>> stmr.stem('suspension')
'suspens'
>>> stmr.stem('elusiveness')
'elus'

New in version 0.2.0.

Changed in version 0.3.6: Encapsulated in class

abydos.stemmer.lovins(word)[source]

Return Lovins stem.

This is a wrapper for Lovins.stem().

Parameters

word (str) -- The word to stem

Returns

Word stem

Return type

str

Examples

>>> from abydos.stemmer import lovins
>>> lovins('reading')
'read'
>>> lovins('suspension')
'suspens'
>>> lovins('elusiveness')
'elus'

New in version 0.2.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the Lovins.stem method instead.

class abydos.stemmer.PaiceHusk[source]

Bases: abydos.stemmer._stemmer._Stemmer

Paice-Husk stemmer.

Implementation of the Paice-Husk Stemmer, also known as the Lancaster Stemmer, developed by Chris Paice, with the assistance of Gareth Husk.

This is based on the algorithm's description in [Pai90].

New in version 0.3.6.

_acceptable(word)[source]
_apply_rule(word, rule, intact, terminate)[source]
_has_vowel(word)[source]
_rule_table = {1: {'a': (True, 1, None, True), 'e': (False, 1, None, False), 'i': ((True, 1, None, True), (False, 1, 'y', False)), 'j': (False, 1, 's', True), 's': ((True, 1, None, False), (False, 0, None, True))}, 2: {'ag': (False, 2, None, False), 'al': (False, 2, None, False), 'an': (False, 2, None, False), 'ar': (False, 2, None, True), 'at': (False, 2, None, False), 'bb': (False, 1, None, True), 'cl': (False, 1, None, True), 'dd': (False, 1, None, True), 'ed': (False, 2, None, False), 'en': (False, 2, None, False), 'er': (False, 2, None, False), 'gg': (False, 1, None, True), 'ia': (True, 2, None, True), 'ic': (False, 2, None, False), 'if': (False, 2, None, False), 'ij': (False, 1, 'd', True), 'is': (False, 2, None, False), 'iv': (False, 2, None, False), 'iz': (False, 2, None, False), 'll': (False, 1, None, True), 'ly': (False, 2, None, False), 'mm': (False, 1, None, True), 'nc': (False, 1, 't', False), 'nj': (False, 1, 'd', True), 'nn': (False, 1, None, True), 'oj': (False, 1, 'd', True), 'or': (False, 2, None, False), 'pp': (False, 1, None, True), 'rr': (False, 1, None, True), 'ss': (False, 0, None, True), 'th': (True, 2, None, True), 'tr': (False, 1, None, False), 'tt': (False, 1, None, True), 'uj': (False, 1, 'd', True), 'ul': (False, 2, None, True), 'um': (True, 2, None, True), 'ur': (False, 2, None, False), 'us': (True, 2, None, True), 'yz': (False, 1, 's', True)}, 3: {'abl': (False, 3, None, False), 'acy': (False, 3, None, False), 'ant': (False, 3, None, False), 'ary': (False, 3, None, False), 'bil': (False, 2, 'l', False), 'bly': (False, 1, None, False), 'ear': (False, 0, None, True), 'eed': (False, 1, None, True), 'een': (False, 0, None, True), 'eiv': (False, 0, None, True), 'ent': (False, 3, None, False), 'ety': (False, 3, None, False), 'fuj': (False, 1, 's', True), 'ful': (False, 3, None, False), 'hej': (False, 1, 'r', True), 'iag': (False, 3, 'y', True), 'ial': (False, 3, None, False), 'ian': (False, 3, None, False), 'ibl': (False, 3, None, True), 
'ied': (False, 3, 'y', False), 'ier': (False, 3, 'y', False), 'ies': (False, 3, 'y', False), 'ify': (False, 3, None, True), 'ily': (False, 3, 'y', False), 'ing': (False, 3, None, False), 'ion': (False, 3, None, False), 'iqu': (False, 3, None, True), 'ish': (False, 3, None, False), 'ism': (False, 3, None, False), 'ist': (False, 3, None, False), 'ity': (False, 3, None, False), 'ium': (False, 3, None, True), 'lty': (False, 2, None, True), 'ncy': (False, 2, 't', False), 'ogu': (False, 1, None, True), 'ogy': (False, 1, None, True), 'omy': (False, 1, None, True), 'opy': (False, 1, None, True), 'ory': (False, 3, None, False), 'ous': (False, 3, None, False), 'phy': (False, 1, None, True), 'ply': (False, 0, None, True), 'sis': (False, 2, None, True), 'siv': (False, 3, 'j', False), 'ual': (False, 3, None, False)}, 4: {'ceed': (False, 2, 'ss', True), 'cept': (False, 2, 'iv', True), 'duct': (False, 1, None, True), 'hood': (False, 4, None, False), 'iabl': (False, 4, 'y', True), 'iful': (False, 4, 'y', True), 'lief': (False, 1, 'v', True), 'ment': (False, 4, None, False), 'misj': (False, 2, 't', True), 'ness': (False, 4, None, False), 'olut': (False, 2, 'v', True), 'orpt': (False, 2, 'b', True), 'ript': (False, 2, 'b', True), 'ship': (False, 4, None, False), 'sion': (False, 4, 'j', False), 'sist': (False, 0, None, True), 'verj': (False, 1, 't', True), 'xion': (False, 4, 'ct', True), 'ytic': (False, 3, 's', True)}, 5: {'guish': (False, 5, 'ct', True), 'istry': (False, 5, None, True), 'sumpt': (False, 2, None, True)}, 6: {'ifiabl': (False, 6, None, True), 'plicat': (False, 4, 'y', True)}}
stem(word)[source]

Return Paice-Husk stem.

Parameters

word (str) -- The word to stem

Returns

Word stem

Return type

str

Examples

>>> from abydos.stemmer import PaiceHusk
>>> stmr = PaiceHusk()
>>> stmr.stem('assumption')
'assum'
>>> stmr.stem('verifiable')
'ver'
>>> stmr.stem('fancies')
'fant'
>>> stmr.stem('fanciful')
'fancy'
>>> stmr.stem('torment')
'tor'

New in version 0.3.0.

Changed in version 0.3.6: Encapsulated in class
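Each entry in _rule_table above is a 4-tuple. One plausible reading, inferred from the table's shape and the Lancaster rule notation rather than stated in this documentation, is (apply only to intact words, number of characters to delete, string to append, terminate afterwards). The sketch below applies a single rule under that assumption. The function names, the return shape, and the acceptability check are hypothetical simplifications, not the abydos implementation.

```python
VOWELS = set('aeiouy')


def acceptable(stem):
    """Simplified acceptability check for a candidate stem (assumption).

    A stem starting with a vowel needs at least two letters; any other
    stem needs at least three letters, one of which is a vowel or 'y'.
    """
    if stem and stem[0] in 'aeiou':
        return len(stem) >= 2
    return len(stem) >= 3 and any(c in VOWELS for c in stem)


def apply_rule(word, rule, intact=True):
    """Apply one (intact_only, del_len, add_str, terminate) rule.

    Returns (new_word, applied, terminate).
    """
    intact_only, del_len, add_str, terminate = rule
    if intact_only and not intact:
        return word, False, False  # rule is reserved for untouched words
    candidate = word[:len(word) - del_len] + (add_str or '')
    if not acceptable(candidate):
        return word, False, False  # reject stems that are too short
    return candidate, True, terminate


# The 'iful' entry from the table: delete 4 characters, append 'y', stop.
print(apply_rule('fanciful', (False, 4, 'y', True)))
```

Applying the 'iful' rule to 'fanciful' yields 'fancy' with the terminate flag set, which agrees with the documented result of stem('fanciful').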

abydos.stemmer.paice_husk(word)[source]

Return Paice-Husk stem.

This is a wrapper for PaiceHusk.stem().

Parameters

word (str) -- The word to stem

Returns

Word stem

Return type

str

Examples

>>> from abydos.stemmer import paice_husk
>>> paice_husk('assumption')
'assum'
>>> paice_husk('verifiable')
'ver'
>>> paice_husk('fancies')
'fant'
>>> paice_husk('fanciful')
'fancy'
>>> paice_husk('torment')
'tor'

New in version 0.3.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the PaiceHusk.stem method instead.

class abydos.stemmer.UEALite(max_word_length=20, max_acro_length=8, return_rule_no=False, var='standard')[source]

Bases: abydos.stemmer._stemmer._Stemmer

UEA-Lite stemmer.

The UEA-Lite stemmer is discussed in [JS05].

This is chiefly based on the Java implementation of the algorithm, with variants based on the Perl implementation and Jason Adams' Ruby port.

  • Java version: [Chu]

  • Perl version: [JS05]

  • Ruby version: [Ada17]

New in version 0.3.6.

Initialize UEALite instance.

Parameters
  • max_word_length (int) -- The maximum word length allowed

  • max_acro_length (int) -- The maximum acronym length allowed

  • return_rule_no (bool) -- If True, returns the stem along with rule number

  • var (str) --

    Variant rules to use:

    • Adams to use Jason Adams' rules

    • Perl to use the original Perl rules

New in version 0.4.0.
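The rule tables listed for this class map suffixes, grouped by length, to 3-tuples. One plausible reading, which is an assumption rather than something this documentation states, is (rule number, number of characters to delete, string to append). The sketch below uses a handful of entries copied from those tables and tries the longest suffix first. The function name and the None rule number for unmatched words are hypothetical.

```python
# Small excerpt of the documented rule tables, keyed by suffix length.
RULES = {
    2: {'ss': (6, 0, None)},
    3: {'ies': (59, 3, 'y')},
    4: {'sses': (11.2, 2, None)},
    5: {'tting': (26, 4, None)},
    7: {'fulness': (34, 4, None)},
}


def apply_longest_suffix_rule(word, rules=RULES, return_rule_no=False):
    """Apply the rule for the longest matching suffix, if any."""
    # Longest suffixes first, so that 'sses' is tried before 'ss'.
    for n in sorted(rules, reverse=True):
        entry = rules[n].get(word[-n:])
        if entry is not None:
            rule_no, del_len, add_str = entry
            stem = word[:len(word) - del_len] + (add_str or '')
            return (stem, rule_no) if return_rule_no else stem
    # No rule matched: the word is returned unchanged.
    return (word, None) if return_rule_no else word
```

With this excerpt, 'parties' becomes 'party' and 'classes' becomes 'class'. Passing return_rule_no=True returns a (stem, rule number) pair, mirroring the constructor option of the same name.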

_adams_rule_table = {2: {'cs': (3, 0, None), 'es': (63, 2, None), 'is': (64, 2, 'e'), 'ss': (6, 0, None), 'us': (67, 0, None)}, 3: {'bed': (36, 2, None), 'ced': (18, 1, None), 'ces': (2, 1, None), 'des': (63.1, 1, None), 'eed': (7, 0, None), 'ees': (10, 1, None), 'ged': (43, 1, None), 'ges': (63.1, 1, None), 'ied': (56, 3, 'y'), 'ies': (59, 3, 'y'), 'kes': (63.1, 1, None), 'led': (12, 2, None), 'les': (50, 1, None), 'mes': (63.7, 1, None), 'ned': (13, 1, None), 'oed': (31.3, 1, None), 'oes': (31.2, 1, None), 'ous': (65, 0, None), 'pes': (63.8, 1, None), 'red': (20, 1, None), 'res': (63.9, 1, None), 'sed': (29, 1, None), 'ses': (11, 1, None), 'sis': (4, 0, None), 'ted': (22, 2, None), 'tes': (51, 1, None), 'tis': (5, 0, None), 'ued': (8, 1, None), 'ues': (9, 1, None), 'ums': (66, 0, None), 'ved': (17, 1, None), 'ves': (60, 1, None), 'zed': (52, 1, None)}, 4: {'aked': (31.1, 1, None), 'amed': (31, 1, None), 'aped': (61.3, 1, None), 'ated': (22.1, 1, None), 'beds': (36, 3, None), 'bled': (12.3, 1, None), 'ding': (40, 3, 'e'), 'does': (31.2, 2, None), 'eeds': (7, 1, None), 'eled': (12.2, 2, None), 'ened': (13.7, 2, None), 'ered': (20.1, 2, None), 'eses': (11.1, 2, 'is'), 'gged': (43.1, 3, None), 'ging': (45, 3, 'e'), 'gned': (13.1, 2, None), 'ides': (63.2, 1, None), 'iked': (31.1, 1, None), 'imed': (31, 1, None), 'ines': (63.3, 1, None), 'ited': (22.7, 2, None), 'izes': (63.5, 1, None), 'ling': (42, 3, 'e'), 'lled': (12.1, 2, None), 'lves': (60.1, 3, 'f'), 'ming': (44, 3, 'e'), 'nged': (43.2, 1, None), 'ning': (46, 3, 'e'), 'nned': (13.3, 3, None), 'oded': (61.1, 1, None), 'oked': (31.1, 1, None), 'oned': (13.2, 2, None), 'ones': (63.6, 1, None), 'pled': (12.4, 1, None), 'reds': (20, 2, None), 'rned': (13.4, 2, None), 'sing': (54, 3, 'e'), 'ssed': (28, 2, None), 'sses': (11.2, 2, None), 'ting': (48, 3, 'e'), 'tled': (12.5, 1, None), 'tted': (21, 3, None), 'uded': (61.2, 1, None), 'uked': (31.1, 1, None), 'umed': (31, 1, None), 'ures': (63.4, 1, None), 'uses': (11.3, 1, 
None), 'uted': (22.2, 1, None), 'ving': (39, 3, 'e'), 'zing': (54.1, 3, 'e')}, 5: {'ained': (13.6, 2, None), 'anges': (23, 1, None), 'aning': (46.6, 3, None), 'arred': (19.1, 3, None), 'ating': (57, 3, 'e'), 'cting': (48.1, 3, None), 'dding': (40.4, 4, None), 'dings': (40, 4, 'e'), 'dying': (58.2, 4, 'ie'), 'eared': (20.3, 2, None), 'ected': (15, 2, None), 'eding': (40.5, 3, None), 'eling': (42.1, 3, None), 'ening': (46.5, 3, None), 'erned': (13.5, 2, None), 'erred': (19, 3, None), 'eting': (48.4, 3, None), 'gging': (45.1, 4, None), 'gings': (45, 4, 'e'), 'gning': (46.4, 3, None), 'iases': (11.4, 2, None), 'ifted': (14, 2, None), 'iring': (54.4, 3, 'e'), 'lding': (40.3, 3, None), 'leted': (22.3, 1, None), 'lings': (42, 4, 'e'), 'lling': (41, 4, None), 'lming': (44.1, 3, None), 'lored': (20.4, 2, None), 'lying': (58.2, 4, 'ie'), 'mided': (22.1, 1, None), 'mings': (44, 4, 'e'), 'mited': (22.5, 1, None), 'mming': (44.3, 4, None), 'ncing': (54.2, 3, 'e'), 'nding': (40.1, 3, None), 'nging': (45.2, 3, None), 'nning': (46.3, 4, None), 'noted': (22.4, 1, None), 'nting': (48.2, 3, None), 'oling': (42.3, 3, None), 'oning': (46.2, 3, None), 'pting': (48.3, 3, None), 'rabed': (36.1, 1, None), 'rding': (40.2, 3, None), 'rebed': (36.1, 1, None), 'ribed': (36.1, 1, None), 'rming': (44.2, 3, None), 'rning': (46.1, 3, None), 'robed': (36.1, 1, None), 'rubed': (36.1, 1, None), 'sings': (54, 4, 'e'), 'ssing': (37, 3, None), 'sting': (47, 3, None), 'thing': (58.1, 0, None), 'tings': (48, 4, 'e'), 'tored': (20.2, 1, None), 'tting': (26, 4, None), 'tying': (58.2, 4, 'ie'), 'ulted': (32, 2, None), 'uming': (33, 3, 'e'), 'uring': (54.3, 3, 'e'), 'urred': (20.5, 3, None), 'vided': (22.9, 1, None), 'vings': (39, 4, 'e'), 'vited': (22.6, 1, None)}, 6: {'aceous': (1, 6, None), 'acting': (25, 3, None), 'ailing': (42.2, 3, None), 'aining': (24, 3, None), 'chited': (22.8, 1, None), 'ddings': (40.4, 5, None), 'eading': (40.7, 3, None), 'ealing': (42.4, 3, None), 'edings': (40.5, 4, None), 
'elings': (42.1, 4, None), 'etings': (48.4, 4, None), 'ggings': (45.1, 5, None), 'irings': (54.4, 4, 'e'), 'ldings': (40.3, 4, None), 'llings': (41, 5, None), 'mmings': (44.3, 5, None), 'ncings': (54.2, 4, 'e'), 'ndings': (40.1, 4, None), 'ngings': (45.2, 4, None), 'ntings': (48.2, 4, None), 'oading': (40.6, 3, None), 'olings': (42.3, 4, None), 'rdings': (40.2, 4, None), 'ssings': (37, 4, None), 'stings': (47, 4, None), 'things': (58.1, 1, None), 'ttings': (26, 5, None), 'ulting': (38, 3, None), 'urings': (54.3, 4, 'e'), 'viding': (27, 3, 'e')}, 7: {'ailings': (42.2, 4, None), 'eadings': (40.7, 4, None), 'ealings': (42.4, 4, None), 'fulness': (34, 4, None), 'oadings': (40.6, 4, None), 'ousness': (35, 4, None), 'titudes': (30, 1, None)}}
_perl_rule_table = {2: {'cs': (3, 0, None), 'es': (63, 2, None), 'is': (64, 2, 'e'), 'ss': (6, 0, None), 'us': (67, 0, None)}, 3: {'bed': (36, 2, None), 'ced': (18, 1, None), 'ces': (2, 1, None), 'eed': (7, 0, None), 'ees': (10, 1, None), 'ged': (43, 1, None), 'ges': (63.1, 1, None), 'ied': (56, 3, 'y'), 'ies': (59, 3, 'y'), 'led': (12, 2, None), 'les': (50, 1, None), 'mes': (63.7, 1, None), 'ned': (13, 1, None), 'ous': (65, 0, None), 'pes': (63.8, 1, None), 'red': (20, 1, None), 'sed': (29, 1, None), 'ses': (11, 1, None), 'sis': (4, 0, None), 'ted': (22, 2, None), 'tes': (51, 1, None), 'tis': (5, 0, None), 'ued': (8, 1, None), 'ues': (9, 1, None), 'ums': (66, 0, None), 'ved': (17, 1, None), 'ves': (60, 1, None), 'zed': (52, 1, None)}, 4: {'aped': (61.3, 1, None), 'ated': (22.1, 1, None), 'bled': (12.3, 1, None), 'ding': (40, 3, 'e'), 'eled': (12.2, 2, None), 'ened': (13.7, 2, None), 'ered': (20.1, 2, None), 'eses': (11.1, 2, 'is'), 'gged': (43.1, 3, None), 'ging': (45, 3, 'e'), 'gned': (13.1, 2, None), 'ides': (63.2, 1, None), 'ines': (63.3, 1, None), 'izes': (63.5, 1, None), 'ling': (42, 3, 'e'), 'lled': (12.1, 2, None), 'lves': (60.1, 3, 'f'), 'ming': (44, 3, 'e'), 'nged': (43.2, 1, None), 'ning': (46, 3, 'e'), 'nned': (13.3, 3, None), 'oded': (61.1, 1, None), 'oned': (13.2, 2, None), 'ones': (63.6, 1, None), 'pled': (12.4, 1, None), 'rned': (13.4, 2, None), 'sing': (54, 3, 'e'), 'ssed': (28, 2, None), 'sses': (11.2, 2, None), 'ting': (48, 3, 'e'), 'tled': (12.5, 1, None), 'tted': (21, 3, None), 'uded': (61.2, 1, None), 'umed': (31, 1, None), 'ures': (63.4, 1, None), 'uses': (11.3, 1, None), 'uted': (22.2, 1, None), 'ving': (39, 3, 'e'), 'zing': (54.1, 3, 'e')}, 5: {'ained': (13.6, 2, None), 'anges': (23, 1, None), 'aning': (46.6, 3, None), 'ating': (57, 3, 'e'), 'cting': (48.1, 3, None), 'dding': (40.4, 4, None), 'eared': (20.3, 2, None), 'ected': (15, 2, None), 'eding': (40.5, 3, None), 'eling': (42.1, 3, None), 'ening': (46.5, 3, None), 'erned': (13.5, 2, 
None), 'erred': (19, 3, None), 'eting': (48.4, 3, None), 'gging': (45.1, 4, None), 'gning': (46.4, 3, None), 'iases': (11.4, 2, None), 'ifted': (14, 2, None), 'iring': (54.4, 3, 'e'), 'lding': (40.3, 3, None), 'leted': (22.3, 1, None), 'lling': (41, 4, None), 'lming': (44.1, 3, None), 'lored': (20.4, 2, None), 'mming': (44.3, 4, None), 'ncing': (54.2, 3, 'e'), 'nding': (40.1, 3, None), 'nging': (45.2, 3, None), 'nning': (46.3, 4, None), 'noted': (22.4, 1, None), 'nting': (48.2, 3, None), 'oling': (42.3, 3, None), 'oning': (46.2, 3, None), 'pting': (48.3, 3, None), 'rabed': (36.1, 1, None), 'rding': (40.2, 3, None), 'rebed': (36.1, 1, None), 'ribed': (36.1, 1, None), 'rming': (44.2, 3, None), 'rning': (46.1, 3, None), 'robed': (36.1, 1, None), 'rubed': (36.1, 1, None), 'ssing': (37, 3, None), 'sting': (47, 3, None), 'thing': (58.1, 0, None), 'tored': (20.2, 1, None), 'tting': (26, 4, None), 'ulted': (32, 2, None), 'uming': (33, 3, 'e'), 'uring': (54.3, 3, 'e'), 'urred': (20.5, 3, None), 'vided': (16, 1, None)}, 6: {'aceous': (1, 6, None), 'acting': (25, 3, None), 'ailing': (42.2, 3, None), 'aining': (24, 3, None), 'eading': (40.7, 3, None), 'ealing': (42.4, 3, None), 'oading': (40.6, 3, None), 'ulting': (38, 3, None), 'viding': (27, 3, 'e')}, 7: {'fulness': (34, 4, None), 'ousness': (35, 4, None), 'titudes': (30, 1, None)}}
_problem_words = {'as', 'during', 'has', 'is', 'this', 'was'}
_rules = {'Adams': {2: {'cs': (3, 0, None), 'es': (63, 2, None), 'is': (64, 2, 'e'), 'ss': (6, 0, None), 'us': (67, 0, None)}, 3: {'bed': (36, 2, None), 'ced': (18, 1, None), 'ces': (2, 1, None), 'des': (63.1, 1, None), 'eed': (7, 0, None), 'ees': (10, 1, None), 'ged': (43, 1, None), 'ges': (63.1, 1, None), 'ied': (56, 3, 'y'), 'ies': (59, 3, 'y'), 'kes': (63.1, 1, None), 'led': (12, 2, None), 'les': (50, 1, None), 'mes': (63.7, 1, None), 'ned': (13, 1, None), 'oed': (31.3, 1, None), 'oes': (31.2, 1, None), 'ous': (65, 0, None), 'pes': (63.8, 1, None), 'red': (20, 1, None), 'res': (63.9, 1, None), 'sed': (29, 1, None), 'ses': (11, 1, None), 'sis': (4, 0, None), 'ted': (22, 2, None), 'tes': (51, 1, None), 'tis': (5, 0, None), 'ued': (8, 1, None), 'ues': (9, 1, None), 'ums': (66, 0, None), 'ved': (17, 1, None), 'ves': (60, 1, None), 'zed': (52, 1, None)}, 4: {'aked': (31.1, 1, None), 'amed': (31, 1, None), 'aped': (61.3, 1, None), 'ated': (22.1, 1, None), 'beds': (36, 3, None), 'bled': (12.3, 1, None), 'ding': (40, 3, 'e'), 'does': (31.2, 2, None), 'eeds': (7, 1, None), 'eled': (12.2, 2, None), 'ened': (13.7, 2, None), 'ered': (20.1, 2, None), 'eses': (11.1, 2, 'is'), 'gged': (43.1, 3, None), 'ging': (45, 3, 'e'), 'gned': (13.1, 2, None), 'ides': (63.2, 1, None), 'iked': (31.1, 1, None), 'imed': (31, 1, None), 'ines': (63.3, 1, None), 'ited': (22.7, 2, None), 'izes': (63.5, 1, None), 'ling': (42, 3, 'e'), 'lled': (12.1, 2, None), 'lves': (60.1, 3, 'f'), 'ming': (44, 3, 'e'), 'nged': (43.2, 1, None), 'ning': (46, 3, 'e'), 'nned': (13.3, 3, None), 'oded': (61.1, 1, None), 'oked': (31.1, 1, None), 'oned': (13.2, 2, None), 'ones': (63.6, 1, None), 'pled': (12.4, 1, None), 'reds': (20, 2, None), 'rned': (13.4, 2, None), 'sing': (54, 3, 'e'), 'ssed': (28, 2, None), 'sses': (11.2, 2, None), 'ting': (48, 3, 'e'), 'tled': (12.5, 1, None), 'tted': (21, 3, None), 'uded': (61.2, 1, None), 'uked': (31.1, 1, None), 'umed': (31, 1, None), 'ures': (63.4, 1, None), 'uses': (11.3, 1, 
None), 'uted': (22.2, 1, None), 'ving': (39, 3, 'e'), 'zing': (54.1, 3, 'e')}, 5: {'ained': (13.6, 2, None), 'anges': (23, 1, None), 'aning': (46.6, 3, None), 'arred': (19.1, 3, None), 'ating': (57, 3, 'e'), 'cting': (48.1, 3, None), 'dding': (40.4, 4, None), 'dings': (40, 4, 'e'), 'dying': (58.2, 4, 'ie'), 'eared': (20.3, 2, None), 'ected': (15, 2, None), 'eding': (40.5, 3, None), 'eling': (42.1, 3, None), 'ening': (46.5, 3, None), 'erned': (13.5, 2, None), 'erred': (19, 3, None), 'eting': (48.4, 3, None), 'gging': (45.1, 4, None), 'gings': (45, 4, 'e'), 'gning': (46.4, 3, None), 'iases': (11.4, 2, None), 'ifted': (14, 2, None), 'iring': (54.4, 3, 'e'), 'lding': (40.3, 3, None), 'leted': (22.3, 1, None), 'lings': (42, 4, 'e'), 'lling': (41, 4, None), 'lming': (44.1, 3, None), 'lored': (20.4, 2, None), 'lying': (58.2, 4, 'ie'), 'mided': (22.1, 1, None), 'mings': (44, 4, 'e'), 'mited': (22.5, 1, None), 'mming': (44.3, 4, None), 'ncing': (54.2, 3, 'e'), 'nding': (40.1, 3, None), 'nging': (45.2, 3, None), 'nning': (46.3, 4, None), 'noted': (22.4, 1, None), 'nting': (48.2, 3, None), 'oling': (42.3, 3, None), 'oning': (46.2, 3, None), 'pting': (48.3, 3, None), 'rabed': (36.1, 1, None), 'rding': (40.2, 3, None), 'rebed': (36.1, 1, None), 'ribed': (36.1, 1, None), 'rming': (44.2, 3, None), 'rning': (46.1, 3, None), 'robed': (36.1, 1, None), 'rubed': (36.1, 1, None), 'sings': (54, 4, 'e'), 'ssing': (37, 3, None), 'sting': (47, 3, None), 'thing': (58.1, 0, None), 'tings': (48, 4, 'e'), 'tored': (20.2, 1, None), 'tting': (26, 4, None), 'tying': (58.2, 4, 'ie'), 'ulted': (32, 2, None), 'uming': (33, 3, 'e'), 'uring': (54.3, 3, 'e'), 'urred': (20.5, 3, None), 'vided': (22.9, 1, None), 'vings': (39, 4, 'e'), 'vited': (22.6, 1, None)}, 6: {'aceous': (1, 6, None), 'acting': (25, 3, None), 'ailing': (42.2, 3, None), 'aining': (24, 3, None), 'chited': (22.8, 1, None), 'ddings': (40.4, 5, None), 'eading': (40.7, 3, None), 'ealing': (42.4, 3, None), 'edings': (40.5, 4, None), 
'elings': (42.1, 4, None), 'etings': (48.4, 4, None), 'ggings': (45.1, 5, None), 'irings': (54.4, 4, 'e'), 'ldings': (40.3, 4, None), 'llings': (41, 5, None), 'mmings': (44.3, 5, None), 'ncings': (54.2, 4, 'e'), 'ndings': (40.1, 4, None), 'ngings': (45.2, 4, None), 'ntings': (48.2, 4, None), 'oading': (40.6, 3, None), 'olings': (42.3, 4, None), 'rdings': (40.2, 4, None), 'ssings': (37, 4, None), 'stings': (47, 4, None), 'things': (58.1, 1, None), 'ttings': (26, 5, None), 'ulting': (38, 3, None), 'urings': (54.3, 4, 'e'), 'viding': (27, 3, 'e')}, 7: {'ailings': (42.2, 4, None), 'eadings': (40.7, 4, None), 'ealings': (42.4, 4, None), 'fulness': (34, 4, None), 'oadings': (40.6, 4, None), 'ousness': (35, 4, None), 'titudes': (30, 1, None)}}, 'Perl': {2: {'cs': (3, 0, None), 'es': (63, 2, None), 'is': (64, 2, 'e'), 'ss': (6, 0, None), 'us': (67, 0, None)}, 3: {'bed': (36, 2, None), 'ced': (18, 1, None), 'ces': (2, 1, None), 'eed': (7, 0, None), 'ees': (10, 1, None), 'ged': (43, 1, None), 'ges': (63.1, 1, None), 'ied': (56, 3, 'y'), 'ies': (59, 3, 'y'), 'led': (12, 2, None), 'les': (50, 1, None), 'mes': (63.7, 1, None), 'ned': (13, 1, None), 'ous': (65, 0, None), 'pes': (63.8, 1, None), 'red': (20, 1, None), 'sed': (29, 1, None), 'ses': (11, 1, None), 'sis': (4, 0, None), 'ted': (22, 2, None), 'tes': (51, 1, None), 'tis': (5, 0, None), 'ued': (8, 1, None), 'ues': (9, 1, None), 'ums': (66, 0, None), 'ved': (17, 1, None), 'ves': (60, 1, None), 'zed': (52, 1, None)}, 4: {'aped': (61.3, 1, None), 'ated': (22.1, 1, None), 'bled': (12.3, 1, None), 'ding': (40, 3, 'e'), 'eled': (12.2, 2, None), 'ened': (13.7, 2, None), 'ered': (20.1, 2, None), 'eses': (11.1, 2, 'is'), 'gged': (43.1, 3, None), 'ging': (45, 3, 'e'), 'gned': (13.1, 2, None), 'ides': (63.2, 1, None), 'ines': (63.3, 1, None), 'izes': (63.5, 1, None), 'ling': (42, 3, 'e'), 'lled': (12.1, 2, None), 'lves': (60.1, 3, 'f'), 'ming': (44, 3, 'e'), 'nged': (43.2, 1, None), 'ning': (46, 3, 'e'), 'nned': (13.3, 3, None), 
'oded': (61.1, 1, None), 'oned': (13.2, 2, None), 'ones': (63.6, 1, None), 'pled': (12.4, 1, None), 'rned': (13.4, 2, None), 'sing': (54, 3, 'e'), 'ssed': (28, 2, None), 'sses': (11.2, 2, None), 'ting': (48, 3, 'e'), 'tled': (12.5, 1, None), 'tted': (21, 3, None), 'uded': (61.2, 1, None), 'umed': (31, 1, None), 'ures': (63.4, 1, None), 'uses': (11.3, 1, None), 'uted': (22.2, 1, None), 'ving': (39, 3, 'e'), 'zing': (54.1, 3, 'e')}, 5: {'ained': (13.6, 2, None), 'anges': (23, 1, None), 'aning': (46.6, 3, None), 'ating': (57, 3, 'e'), 'cting': (48.1, 3, None), 'dding': (40.4, 4, None), 'eared': (20.3, 2, None), 'ected': (15, 2, None), 'eding': (40.5, 3, None), 'eling': (42.1, 3, None), 'ening': (46.5, 3, None), 'erned': (13.5, 2, None), 'erred': (19, 3, None), 'eting': (48.4, 3, None), 'gging': (45.1, 4, None), 'gning': (46.4, 3, None), 'iases': (11.4, 2, None), 'ifted': (14, 2, None), 'iring': (54.4, 3, 'e'), 'lding': (40.3, 3, None), 'leted': (22.3, 1, None), 'lling': (41, 4, None), 'lming': (44.1, 3, None), 'lored': (20.4, 2, None), 'mming': (44.3, 4, None), 'ncing': (54.2, 3, 'e'), 'nding': (40.1, 3, None), 'nging': (45.2, 3, None), 'nning': (46.3, 4, None), 'noted': (22.4, 1, None), 'nting': (48.2, 3, None), 'oling': (42.3, 3, None), 'oning': (46.2, 3, None), 'pting': (48.3, 3, None), 'rabed': (36.1, 1, None), 'rding': (40.2, 3, None), 'rebed': (36.1, 1, None), 'ribed': (36.1, 1, None), 'rming': (44.2, 3, None), 'rning': (46.1, 3, None), 'robed': (36.1, 1, None), 'rubed': (36.1, 1, None), 'ssing': (37, 3, None), 'sting': (47, 3, None), 'thing': (58.1, 0, None), 'tored': (20.2, 1, None), 'tting': (26, 4, None), 'ulted': (32, 2, None), 'uming': (33, 3, 'e'), 'uring': (54.3, 3, 'e'), 'urred': (20.5, 3, None), 'vided': (16, 1, None)}, 6: {'aceous': (1, 6, None), 'acting': (25, 3, None), 'ailing': (42.2, 3, None), 'aining': (24, 3, None), 'eading': (40.7, 3, None), 'ealing': (42.4, 3, None), 'oading': (40.6, 3, None), 'ulting': (38, 3, None), 'viding': (27, 3, 'e')}, 
7: {'fulness': (34, 4, None), 'ousness': (35, 4, None), 'titudes': (30, 1, None)}}, 'standard': {2: {'cs': (3, 0, None), 'es': (63, 2, None), 'is': (64, 2, 'e'), 'ss': (6, 0, None), 'us': (67, 0, None)}, 3: {'bed': (36, 2, None), 'ced': (18, 1, None), 'ces': (2, 1, None), 'eed': (7, 0, None), 'ees': (10, 1, None), 'ged': (43, 1, None), 'ges': (63.1, 1, None), 'ied': (56, 3, 'y'), 'ies': (59, 3, 'y'), 'led': (12, 2, None), 'les': (50, 1, None), 'mes': (63.7, 1, None), 'ned': (13, 1, None), 'ous': (65, 0, None), 'pes': (63.8, 1, None), 'red': (20, 1, None), 'sed': (29, 1, None), 'ses': (11, 1, None), 'sis': (4, 0, None), 'ted': (22, 2, None), 'tes': (51, 1, None), 'tis': (5, 0, None), 'ued': (8, 1, None), 'ues': (9, 1, None), 'ums': (66, 0, None), 'ved': (17, 1, None), 'ves': (60, 1, None), 'zed': (52, 1, None)}, 4: {'aped': (61.3, 1, None), 'ated': (22.1, 1, None), 'beds': (36, 3, None), 'bled': (12.3, 1, None), 'ding': (40, 3, 'e'), 'eeds': (7, 1, None), 'eled': (12.2, 2, None), 'ened': (13.7, 2, None), 'ered': (20.1, 2, None), 'eses': (11.1, 2, 'is'), 'gged': (43.1, 3, None), 'ging': (45, 3, 'e'), 'gned': (13.1, 2, None), 'ides': (63.2, 1, None), 'ines': (63.3, 1, None), 'izes': (63.5, 1, None), 'ling': (42, 3, 'e'), 'lled': (12.1, 2, None), 'lves': (60.1, 3, 'f'), 'ming': (44, 3, 'e'), 'nged': (43.2, 1, None), 'ning': (46, 3, 'e'), 'nned': (13.3, 3, None), 'oded': (61.1, 1, None), 'oned': (13.2, 2, None), 'ones': (63.6, 1, None), 'pled': (12.4, 1, None), 'reds': (20, 2, None), 'rned': (13.4, 2, None), 'sing': (54, 3, 'e'), 'ssed': (28, 2, None), 'sses': (11.2, 2, None), 'ting': (48, 3, 'e'), 'tled': (12.5, 1, None), 'tted': (21, 3, None), 'uded': (61.2, 1, None), 'umed': (31, 1, None), 'ures': (63.4, 1, None), 'uses': (11.3, 1, None), 'uted': (22.2, 1, None), 'ving': (39, 3, 'e'), 'zing': (54.1, 3, 'e')}, 5: {'ained': (13.6, 2, None), 'anges': (23, 1, None), 'aning': (46.6, 3, None), 'ating': (57, 3, 'e'), 'cting': (48.1, 3, None), 'dding': (40.4, 4, None), 
'dings': (40, 4, 'e'), 'eared': (20.3, 2, None), 'ected': (15, 2, None), 'eding': (40.5, 3, None), 'eling': (42.1, 3, None), 'ening': (46.5, 3, None), 'erned': (13.5, 2, None), 'erred': (19, 3, None), 'eting': (48.4, 3, None), 'gging': (45.1, 4, None), 'gings': (45, 4, 'e'), 'gning': (46.4, 3, None), 'iases': (11.4, 2, None), 'ifted': (14, 2, None), 'iring': (54.4, 3, 'e'), 'lding': (40.3, 3, None), 'leted': (22.3, 1, None), 'lings': (42, 4, 'e'), 'lling': (41, 4, None), 'lming': (44.1, 3, None), 'lored': (20.4, 2, None), 'mings': (44, 4, 'e'), 'mming': (44.3, 4, None), 'ncing': (54.2, 3, 'e'), 'nding': (40.1, 3, None), 'nging': (45.2, 3, None), 'nning': (46.3, 4, None), 'noted': (22.4, 1, None), 'nting': (48.2, 3, None), 'oling': (42.3, 3, None), 'oning': (46.2, 3, None), 'pting': (48.3, 3, None), 'rabed': (36.1, 1, None), 'rding': (40.2, 3, None), 'rebed': (36.1, 1, None), 'ribed': (36.1, 1, None), 'rming': (44.2, 3, None), 'rning': (46.1, 3, None), 'robed': (36.1, 1, None), 'rubed': (36.1, 1, None), 'sings': (54, 4, 'e'), 'ssing': (37, 3, None), 'sting': (47, 3, None), 'thing': (58.1, 0, None), 'tings': (48, 4, 'e'), 'tored': (20.2, 1, None), 'tting': (26, 4, None), 'ulted': (32, 2, None), 'uming': (33, 3, 'e'), 'uring': (54.3, 3, 'e'), 'urred': (20.5, 3, None), 'vided': (16, 1, None), 'vings': (39, 4, 'e')}, 6: {'aceous': (1, 6, None), 'acting': (25, 3, None), 'ailing': (42.2, 3, None), 'aining': (24, 3, None), 'ddings': (40.4, 5, None), 'eading': (40.7, 3, None), 'ealing': (42.4, 3, None), 'edings': (40.5, 4, None), 'elings': (42.1, 4, None), 'etings': (48.4, 4, None), 'ggings': (45.1, 5, None), 'irings': (54.4, 4, 'e'), 'ldings': (40.3, 4, None), 'llings': (41, 5, None), 'mmings': (44.3, 5, None), 'ncings': (54.2, 4, 'e'), 'ndings': (40.1, 4, None), 'ngings': (45.2, 4, None), 'ntings': (48.2, 4, None), 'oading': (40.6, 3, None), 'olings': (42.3, 4, None), 'rdings': (40.2, 4, None), 'ssings': (37, 4, None), 'stings': (47, 4, None), 'things': (58.1, 1, None), 
'ttings': (26, 5, None), 'ulting': (38, 3, None), 'urings': (54.3, 4, 'e'), 'viding': (27, 3, 'e')}, 7: {'ailings': (42.2, 4, None), 'eadings': (40.7, 4, None), 'ealings': (42.4, 4, None), 'fulness': (34, 4, None), 'oadings': (40.6, 4, None), 'ousness': (35, 4, None), 'titudes': (30, 1, None)}}}
_standard_rule_table = {2: {'cs': (3, 0, None), 'es': (63, 2, None), 'is': (64, 2, 'e'), 'ss': (6, 0, None), 'us': (67, 0, None)}, 3: {'bed': (36, 2, None), 'ced': (18, 1, None), 'ces': (2, 1, None), 'eed': (7, 0, None), 'ees': (10, 1, None), 'ged': (43, 1, None), 'ges': (63.1, 1, None), 'ied': (56, 3, 'y'), 'ies': (59, 3, 'y'), 'led': (12, 2, None), 'les': (50, 1, None), 'mes': (63.7, 1, None), 'ned': (13, 1, None), 'ous': (65, 0, None), 'pes': (63.8, 1, None), 'red': (20, 1, None), 'sed': (29, 1, None), 'ses': (11, 1, None), 'sis': (4, 0, None), 'ted': (22, 2, None), 'tes': (51, 1, None), 'tis': (5, 0, None), 'ued': (8, 1, None), 'ues': (9, 1, None), 'ums': (66, 0, None), 'ved': (17, 1, None), 'ves': (60, 1, None), 'zed': (52, 1, None)}, 4: {'aped': (61.3, 1, None), 'ated': (22.1, 1, None), 'beds': (36, 3, None), 'bled': (12.3, 1, None), 'ding': (40, 3, 'e'), 'eeds': (7, 1, None), 'eled': (12.2, 2, None), 'ened': (13.7, 2, None), 'ered': (20.1, 2, None), 'eses': (11.1, 2, 'is'), 'gged': (43.1, 3, None), 'ging': (45, 3, 'e'), 'gned': (13.1, 2, None), 'ides': (63.2, 1, None), 'ines': (63.3, 1, None), 'izes': (63.5, 1, None), 'ling': (42, 3, 'e'), 'lled': (12.1, 2, None), 'lves': (60.1, 3, 'f'), 'ming': (44, 3, 'e'), 'nged': (43.2, 1, None), 'ning': (46, 3, 'e'), 'nned': (13.3, 3, None), 'oded': (61.1, 1, None), 'oned': (13.2, 2, None), 'ones': (63.6, 1, None), 'pled': (12.4, 1, None), 'reds': (20, 2, None), 'rned': (13.4, 2, None), 'sing': (54, 3, 'e'), 'ssed': (28, 2, None), 'sses': (11.2, 2, None), 'ting': (48, 3, 'e'), 'tled': (12.5, 1, None), 'tted': (21, 3, None), 'uded': (61.2, 1, None), 'umed': (31, 1, None), 'ures': (63.4, 1, None), 'uses': (11.3, 1, None), 'uted': (22.2, 1, None), 'ving': (39, 3, 'e'), 'zing': (54.1, 3, 'e')}, 5: {'ained': (13.6, 2, None), 'anges': (23, 1, None), 'aning': (46.6, 3, None), 'ating': (57, 3, 'e'), 'cting': (48.1, 3, None), 'dding': (40.4, 4, None), 'dings': (40, 4, 'e'), 'eared': (20.3, 2, None), 'ected': (15, 2, None), 
'eding': (40.5, 3, None), 'eling': (42.1, 3, None), 'ening': (46.5, 3, None), 'erned': (13.5, 2, None), 'erred': (19, 3, None), 'eting': (48.4, 3, None), 'gging': (45.1, 4, None), 'gings': (45, 4, 'e'), 'gning': (46.4, 3, None), 'iases': (11.4, 2, None), 'ifted': (14, 2, None), 'iring': (54.4, 3, 'e'), 'lding': (40.3, 3, None), 'leted': (22.3, 1, None), 'lings': (42, 4, 'e'), 'lling': (41, 4, None), 'lming': (44.1, 3, None), 'lored': (20.4, 2, None), 'mings': (44, 4, 'e'), 'mming': (44.3, 4, None), 'ncing': (54.2, 3, 'e'), 'nding': (40.1, 3, None), 'nging': (45.2, 3, None), 'nning': (46.3, 4, None), 'noted': (22.4, 1, None), 'nting': (48.2, 3, None), 'oling': (42.3, 3, None), 'oning': (46.2, 3, None), 'pting': (48.3, 3, None), 'rabed': (36.1, 1, None), 'rding': (40.2, 3, None), 'rebed': (36.1, 1, None), 'ribed': (36.1, 1, None), 'rming': (44.2, 3, None), 'rning': (46.1, 3, None), 'robed': (36.1, 1, None), 'rubed': (36.1, 1, None), 'sings': (54, 4, 'e'), 'ssing': (37, 3, None), 'sting': (47, 3, None), 'thing': (58.1, 0, None), 'tings': (48, 4, 'e'), 'tored': (20.2, 1, None), 'tting': (26, 4, None), 'ulted': (32, 2, None), 'uming': (33, 3, 'e'), 'uring': (54.3, 3, 'e'), 'urred': (20.5, 3, None), 'vided': (16, 1, None), 'vings': (39, 4, 'e')}, 6: {'aceous': (1, 6, None), 'acting': (25, 3, None), 'ailing': (42.2, 3, None), 'aining': (24, 3, None), 'ddings': (40.4, 5, None), 'eading': (40.7, 3, None), 'ealing': (42.4, 3, None), 'edings': (40.5, 4, None), 'elings': (42.1, 4, None), 'etings': (48.4, 4, None), 'ggings': (45.1, 5, None), 'irings': (54.4, 4, 'e'), 'ldings': (40.3, 4, None), 'llings': (41, 5, None), 'mmings': (44.3, 5, None), 'ncings': (54.2, 4, 'e'), 'ndings': (40.1, 4, None), 'ngings': (45.2, 4, None), 'ntings': (48.2, 4, None), 'oading': (40.6, 3, None), 'olings': (42.3, 4, None), 'rdings': (40.2, 4, None), 'ssings': (37, 4, None), 'stings': (47, 4, None), 'things': (58.1, 1, None), 'ttings': (26, 5, None), 'ulting': (38, 3, None), 'urings': (54.3, 4, 
'e'), 'viding': (27, 3, 'e')}, 7: {'ailings': (42.2, 4, None), 'eadings': (40.7, 4, None), 'ealings': (42.4, 4, None), 'fulness': (34, 4, None), 'oadings': (40.6, 4, None), 'ousness': (35, 4, None), 'titudes': (30, 1, None)}}
stem(word)[source]

Return UEA-Lite stem.

Parameters

word (str) -- The word to stem

Returns

Word stem

Return type

str or (str, int)

Examples

>>> stmr = UEALite()
>>> stmr.stem('readings')
'read'
>>> stmr.stem('insulted')
'insult'
>>> stmr.stem('cussed')
'cuss'
>>> stmr.stem('fancies')
'fancy'
>>> stmr.stem('eroded')
'erode'

New in version 0.1.0.

Changed in version 0.3.6: Encapsulated in class
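
The rule tables listed for this class are nested lookups: suffix length, then suffix, then a 3-tuple. The sketch below assumes the tuple means (rule number, number of characters to delete, string to append). That reading is inferred from the table entries and the example outputs; it is not stated on this page. RULES is a five-entry excerpt of the 'standard' table and apply_rule is a hypothetical helper. The real stem method may do more than this suffix lookup.

```python
# Excerpt of the 'standard' rule table: {suffix length: {suffix: rule}}.
# Each rule is assumed to mean (rule number, chars to delete, text to append).
RULES = {
    3: {'ies': (59, 3, 'y')},
    4: {'oded': (61.1, 1, None), 'ssed': (28, 2, None)},
    5: {'ulted': (32, 2, None)},
    7: {'eadings': (40.7, 4, None)},
}


def apply_rule(word, rules=RULES):
    """Apply the rule for the longest matching suffix (hypothetical helper)."""
    # Try the longest suffixes first so 'eadings' wins over shorter matches.
    for length in sorted(rules, reverse=True):
        rule = rules[length].get(word[-length:])
        if rule is not None:
            rule_no, del_len, add_str = rule
            stem = word[:-del_len] if del_len else word
            return stem + (add_str or ''), rule_no
    # No suffix matched: return the word unchanged with no rule number.
    return word, None
```

With this excerpt, apply_rule('fancies') deletes three characters and appends 'y', which matches the 'fancy' result in the examples.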

abydos.stemmer.uealite(word, max_word_length=20, max_acro_length=8, return_rule_no=False, var='standard')[source]

Return UEA-Lite stem.

This is a wrapper for UEALite.stem().

Parameters
  • word (str) -- The word to stem

  • max_word_length (int) -- The maximum word length allowed

  • max_acro_length (int) -- The maximum acronym length allowed

  • return_rule_no (bool) -- If True, returns the stem along with rule number

  • var (str) --

    Variant rules to use:

    • standard to use the standard rules (the default)

    • Adams to use Jason Adams' rules

    • Perl to use the original Perl rules

Returns

Word stem

Return type

str or (str, int)

Examples

>>> uealite('readings')
'read'
>>> uealite('insulted')
'insult'
>>> uealite('cussed')
'cuss'
>>> uealite('fancies')
'fancy'
>>> uealite('eroded')
'erode'

New in version 0.1.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the UEALite.stem method instead.

class abydos.stemmer.SStemmer[source]

Bases: abydos.stemmer._stemmer._Stemmer

S-stemmer.

The S stemmer is defined in [Har91].

New in version 0.3.6.

stem(word)[source]

Return the S-stemmed form of a word.

Parameters

word (str) -- The word to stem

Returns

Word stem

Return type

str

Examples

>>> stmr = SStemmer()
>>> stmr.stem('summaries')
'summary'
>>> stmr.stem('summary')
'summary'
>>> stmr.stem('towers')
'tower'
>>> stmr.stem('reading')
'reading'
>>> stmr.stem('census')
'census'

New in version 0.3.0.

Changed in version 0.3.6: Encapsulated in class
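
As an illustration of what the S-stemmer does, here is a minimal sketch of the three ordered plural rules from [Har91]. It is a reading of that description, not a copy of this class's source. The function name s_stem is hypothetical, and letting a word that one rule excludes fall through to the next rule is an assumption.

```python
def s_stem(word):
    """Strip a plural ending using three ordered rules (illustrative sketch)."""
    lowered = word.lower()
    # Rule 1: -ies becomes -y, unless the word ends in -eies or -aies.
    if lowered.endswith('ies') and not lowered.endswith(('eies', 'aies')):
        return word[:-3] + 'y'
    # Rule 2: -es becomes -e, unless the word ends in -aes, -ees or -oes.
    if lowered.endswith('es') and not lowered.endswith(('aes', 'ees', 'oes')):
        return word[:-1]
    # Rule 3: drop a final -s, unless the word ends in -us or -ss.
    if lowered.endswith('s') and not lowered.endswith(('us', 'ss')):
        return word[:-1]
    return word
```

The exclusions explain the last two examples above: 'census' ends in -us, so no rule fires, and 'reading' has no plural ending at all.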

abydos.stemmer.s_stemmer(word)[source]

Return the S-stemmed form of a word.

This is a wrapper for SStemmer.stem().

Parameters

word (str) -- The word to stem

Returns

Word stem

Return type

str

Examples

>>> s_stemmer('summaries')
'summary'
>>> s_stemmer('summary')
'summary'
>>> s_stemmer('towers')
'tower'
>>> s_stemmer('reading')
'reading'
>>> s_stemmer('census')
'census'

New in version 0.3.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the SStemmer.stem method instead.

class abydos.stemmer.Caumanns[source]

Bases: abydos.stemmer._stemmer._Stemmer

Caumanns stemmer.

Jörg Caumanns' stemmer is described in his article [Cau99].

This implementation is based on the GermanStemFilter described at [Lan13].

New in version 0.3.6.

_umlauts = {228: 'a', 246: 'o', 252: 'u'}
stem(word)[source]

Return Caumanns German stem.

Parameters

word (str) -- The word to stem

Returns

Word stem

Return type

str

Examples

>>> stmr = Caumanns()
>>> stmr.stem('lesen')
'les'
>>> stmr.stem('graues')
'grau'
>>> stmr.stem('buchstabieren')
'buchstabier'

New in version 0.2.0.

Changed in version 0.3.6: Encapsulated in class
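
The _umlauts mapping of this class is keyed by integer code points, which is the form that str.translate accepts. The sketch below shows how such a table can be applied. The helper name strip_umlauts is hypothetical, and whether the class applies the table in exactly this way is an assumption.

```python
# Keys are Unicode code points: 228 is a-umlaut, 246 is o-umlaut,
# 252 is u-umlaut. The values are the plain replacement letters.
UMLAUTS = {228: 'a', 246: 'o', 252: 'u'}


def strip_umlauts(word):
    """Replace umlauted vowels with their base letters (hypothetical helper)."""
    return word.translate(UMLAUTS)
```

Characters that are not keys of the table pass through str.translate unchanged.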

abydos.stemmer.caumanns(word)[source]

Return Caumanns German stem.

This is a wrapper for Caumanns.stem().

Parameters

word (str) -- The word to stem

Returns

Word stem

Return type

str

Examples

>>> caumanns('lesen')
'les'
>>> caumanns('graues')
'grau'
>>> caumanns('buchstabieren')
'buchstabier'

New in version 0.2.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the Caumanns.stem method instead.

class abydos.stemmer.Schinke[source]

Bases: abydos.stemmer._stemmer._Stemmer

Schinke stemmer.

This is defined in [SGRW96].

New in version 0.3.6.

_keep_que = {'abs', 'abus', 'adae', 'adus', 'aps', 'at', 'attor', 'co', 'conco', 'contor', 'cui', 'cuius', 'de', 'deco', 'deni', 'detor', 'exco', 'extor', 'inco', 'intor', 'ita', 'ne', 'obli', 'obtor', 'optor', 'perae', 'plenis', 'praetor', 'qua', 'quae', 'quam', 'quando', 'quarum', 'quas', 'quem', 'qui', 'quibus', 'quis', 'quo', 'quorum', 'quos', 'quotusquis', 'quous', 'reco', 'retor', 'sus', 'tor', 'ubi', 'undi', 'us', 'uter', 'uti', 'utribi', 'utro'}
_n_endings = {1: {'a', 'e', 'i', 'o', 'u'}, 2: {'ae', 'am', 'as', 'em', 'es', 'ia', 'is', 'nt', 'os', 'ud', 'um', 'us'}, 3: {'ius'}, 4: {'ibus'}}
_v_endings_alter = {1: {}, 2: {'bo'}, 3: {'bor', 'ero', 'unt'}, 4: {'iunt'}, 5: {'beris', 'erunt', 'untur'}, 6: {'iuntur'}}
_v_endings_strip = {1: {'m', 'r', 's', 't'}, 2: {'ns', 'nt', 'ri'}, 3: {'mur', 'mus', 'ris', 'sti', 'tis', 'tur'}, 4: {'mini', 'ntur', 'stis'}, 5: {}, 6: {}}
stem(word)[source]

Return the stem of a word according to the Schinke stemmer.

Parameters

word (str) -- The word to stem

Returns

Noun and verb stems of the word, keyed by 'n' and 'v'

Return type

dict

Examples

>>> stmr = Schinke()
>>> stmr.stem('atque')
{'n': 'atque', 'v': 'atque'}
>>> stmr.stem('census')
{'n': 'cens', 'v': 'censu'}
>>> stmr.stem('virum')
{'n': 'uir', 'v': 'uiru'}
>>> stmr.stem('populusque')
{'n': 'popul', 'v': 'populu'}
>>> stmr.stem('senatus')
{'n': 'senat', 'v': 'senatu'}

New in version 0.3.0.

Changed in version 0.3.6: Encapsulated in class
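
The noun half of the examples above can be sketched as follows. Several points are assumptions: the j to i and v to u normalization is inferred from the 'virum' output, the requirement that at least two characters remain after stripping is a reading of [SGRW96], KEEP_QUE is only a small subset of _keep_que, and noun_stem is a hypothetical name. The verb stem is omitted.

```python
# Noun endings copied from _n_endings, keyed by ending length.
N_ENDINGS = {
    1: {'a', 'e', 'i', 'o', 'u'},
    2: {'ae', 'am', 'as', 'em', 'es', 'ia', 'is', 'nt', 'os', 'ud', 'um', 'us'},
    3: {'ius'},
    4: {'ibus'},
}
# Small subset of _keep_que, for illustration only.
KEEP_QUE = {'at', 'quo', 'ne', 'ita'}


def noun_stem(word):
    """Return a noun stem in the style of the Schinke stemmer (sketch)."""
    word = word.lower().replace('j', 'i').replace('v', 'u')
    if word.endswith('que'):
        if word[:-3] in KEEP_QUE:
            return word  # -que belongs to the word itself, so keep it
        word = word[:-3]  # otherwise treat -que as an enclitic and drop it
    # Try the longest endings first.
    for length in range(4, 0, -1):
        if word[-length:] in N_ENDINGS[length]:
            # Strip only if at least two characters would remain.
            if len(word) - length >= 2:
                return word[:-length]
            return word
    return word
```

Here 'atque' is returned whole because 'at' is in the keep list, while 'populusque' loses -que and then -us.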

abydos.stemmer.schinke(word)[source]

Return the stem of a word according to the Schinke stemmer.

This is a wrapper for Schinke.stem().

Parameters

word (str) -- The word to stem

Returns

Noun and verb stems of the word, keyed by 'n' and 'v'

Return type

dict

Examples

>>> schinke('atque')
{'n': 'atque', 'v': 'atque'}
>>> schinke('census')
{'n': 'cens', 'v': 'censu'}
>>> schinke('virum')
{'n': 'uir', 'v': 'uiru'}
>>> schinke('populusque')
{'n': 'popul', 'v': 'populu'}
>>> schinke('senatus')
{'n': 'senat', 'v': 'senatu'}

New in version 0.3.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the Schinke.stem method instead.

class abydos.stemmer.Porter(early_english=False)[source]

Bases: abydos.stemmer._stemmer._Stemmer

Porter stemmer.

The Porter stemmer is described in [Por80].

New in version 0.3.6.

Initialize Porter instance.

Parameters

early_english (bool) -- Set to True in order to remove -est & -eth (2nd & 3rd person singular verbal agreement suffixes)

New in version 0.4.0.

_ends_in_cvc(term)[source]

Return True iff the term ends in cvc (consonant-vowel-consonant).

Parameters

term (str) -- The word to scan for cvc

Returns

True iff the stem ends in cvc (as defined in the Porter stemmer definition)

Return type

bool

New in version 0.1.0.

Changed in version 0.3.6: Encapsulated in class
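
A minimal sketch of the cvc test, assuming the condition from [Por80] that the final consonant must not be w, x or y. VOWELS repeats the letters of this class's _vowels attribute, so y is already excluded as a final consonant. The standalone function name is hypothetical, and the method's handling of y may be more refined than this.

```python
VOWELS = {'a', 'e', 'i', 'o', 'u', 'y'}  # same letters as _vowels


def ends_in_cvc(term):
    """Return True iff term ends consonant-vowel-consonant, last not w or x."""
    if len(term) < 3:
        return False
    first, middle, last = term[-3:]
    return (
        first not in VOWELS
        and middle in VOWELS
        and last not in VOWELS
        and last not in {'w', 'x'}
    )
```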

_ends_in_doubled_cons(term)[source]

Return True iff the term ends in a doubled consonant.

Parameters

term (str) -- The word to check for a final doubled consonant

Returns

True iff the stem ends in a doubled consonant (as defined in the Porter stemmer definition)

Return type

bool

New in version 0.1.0.

Changed in version 0.3.6: Encapsulated in class
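
The doubled-consonant test can be sketched in one expression. This is an illustration under the assumption that every letter outside the set below counts as a consonant. The standalone function name is hypothetical.

```python
VOWELS = {'a', 'e', 'i', 'o', 'u', 'y'}  # same letters as _vowels


def ends_in_doubled_cons(term):
    """Return True iff the last two letters are the same consonant."""
    return len(term) > 1 and term[-1] == term[-2] and term[-1] not in VOWELS
```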

_has_vowel(term)[source]

Return True iff the term contains a vowel.

Parameters

term (str) -- The word to scan for vowels

Returns

True iff a vowel exists in the term (as defined in the Porter stemmer definition)

Return type

bool

New in version 0.1.0.

Changed in version 0.3.6: Encapsulated in class

_m_degree(term)[source]

Return the m-degree of a term.

The m-degree is equal to the number of V to C (vowel-to-consonant) transitions in the term.

Parameters

term (str) -- The word for which to calculate the m-degree

Returns

The m-degree as defined in the Porter stemmer definition

Return type

int

New in version 0.1.0.

Changed in version 0.3.6: Encapsulated in class
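
Counting V to C transitions can be sketched as a single pass over the term. This simplified version treats every letter in the set below as a vowel, including every y, so it can differ from the full Porter definition for words in which y follows a vowel. The standalone function name is hypothetical.

```python
VOWELS = {'a', 'e', 'i', 'o', 'u', 'y'}  # same letters as _vowels


def m_degree(term):
    """Count the vowel-to-consonant transitions in term (simplified sketch)."""
    count = 0
    prev_was_vowel = False
    for letter in term:
        is_vowel = letter in VOWELS
        # A consonant directly after a vowel closes one VC sequence.
        if prev_was_vowel and not is_vowel:
            count += 1
        prev_was_vowel = is_vowel
    return count
```

For example, 'tree' has no vowel followed by a consonant and gives 0, 'trouble' gives 1, and 'private' gives 2.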

_vowels = {'a', 'e', 'i', 'o', 'u', 'y'}
stem(word)[source]

Return Porter stem.

Parameters

word (str) -- The word to stem

Returns

Word stem

Return type

str

Examples

>>> stmr = Porter()
>>> stmr.stem('reading')
'read'
>>> stmr.stem('suspension')
'suspens'
>>> stmr.stem('elusiveness')
'elus'
>>> stmr = Porter(early_english=True)
>>> stmr.stem('eateth')
'eat'

New in version 0.1.0.

Changed in version 0.3.6: Encapsulated in class

abydos.stemmer.porter(word, early_english=False)[source]

Return Porter stem.

This is a wrapper for Porter.stem().

Parameters
  • word (str) -- The word to stem

  • early_english (bool) -- Set to True in order to remove -est & -eth (2nd & 3rd person singular verbal agreement suffixes)

Returns

Word stem

Return type

str

Examples

>>> porter('reading')
'read'
>>> porter('suspension')
'suspens'
>>> porter('elusiveness')
'elus'
>>> porter('eateth', early_english=True)
'eat'

New in version 0.1.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the Porter.stem method instead.

class abydos.stemmer.Porter2(early_english=False)[source]

Bases: abydos.stemmer._snowball._Snowball

Porter2 (Snowball English) stemmer.

The Porter2 (Snowball English) stemmer is defined in [Por02].

New in version 0.3.6.

Initialize Porter2 instance.

Parameters

early_english (bool) -- Set to True in order to remove -est & -eth (2nd & 3rd person singular verbal agreement suffixes)

New in version 0.4.0.

_doubles = {'bb', 'dd', 'ff', 'gg', 'mm', 'nn', 'pp', 'rr', 'tt'}
_exception1dict = {'dying': 'die', 'early': 'earli', 'gently': 'gentl', 'idly': 'idl', 'lying': 'lie', 'only': 'onli', 'singly': 'singl', 'skies': 'sky', 'skis': 'ski', 'tying': 'tie', 'ugly': 'ugli'}
_exception1set = {'andes', 'atlas', 'bias', 'cosmos', 'howe', 'news', 'sky'}
_exception2set = {'canning', 'earring', 'exceed', 'herring', 'inning', 'outing', 'proceed', 'succeed'}
_li = {'c', 'd', 'e', 'g', 'h', 'k', 'm', 'n', 'r', 't'}
_r1_prefixes = ('commun', 'gener', 'arsen')
stem(word)[source]

Return the Porter2 (Snowball English) stem.

Parameters

word (str) -- The word to stem

Returns

Word stem

Return type

str

Examples

>>> stmr = Porter2()
>>> stmr.stem('reading')
'read'
>>> stmr.stem('suspension')
'suspens'
>>> stmr.stem('elusiveness')
'elus'
>>> stmr = Porter2(early_english=True)
>>> stmr.stem('eateth')
'eat'

New in version 0.1.0.

Changed in version 0.3.6: Encapsulated in class

abydos.stemmer.porter2(word, early_english=False)[source]

Return the Porter2 (Snowball English) stem.

This is a wrapper for Porter2.stem().

Parameters
  • word (str) -- The word to stem

  • early_english (bool) -- Set to True in order to remove -est & -eth (2nd & 3rd person singular verbal agreement suffixes)

Returns

Word stem

Return type

str

Examples

>>> porter2('reading')
'read'
>>> porter2('suspension')
'suspens'
>>> porter2('elusiveness')
'elus'
>>> porter2('eateth', early_english=True)
'eat'

New in version 0.1.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the Porter2.stem method instead.

class abydos.stemmer.SnowballDanish[source]

Bases: abydos.stemmer._snowball._Snowball

Snowball Danish stemmer.

The Snowball Danish stemmer is defined at: http://snowball.tartarus.org/algorithms/danish/stemmer.html

New in version 0.3.6.

_s_endings = {'a', 'b', 'c', 'd', 'f', 'g', 'h', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'r', 't', 'v', 'y', 'z', 'å'}
_vowels = {'a', 'e', 'i', 'o', 'u', 'y', 'å', 'æ', 'ø'}
stem(word)[source]

Return Snowball Danish stem.

Parameters

word (str) -- The word to stem

Returns

Word stem

Return type

str

Examples

>>> stmr = SnowballDanish()
>>> stmr.stem('underviser')
'undervis'
>>> stmr.stem('suspension')
'suspension'
>>> stmr.stem('sikkerhed')
'sikker'

New in version 0.1.0.

Changed in version 0.3.6: Encapsulated in class

abydos.stemmer.sb_danish(word)[source]

Return Snowball Danish stem.

This is a wrapper for SnowballDanish.stem().

Parameters

word (str) -- The word to stem

Returns

Word stem

Return type

str

Examples

>>> sb_danish('underviser')
'undervis'
>>> sb_danish('suspension')
'suspension'
>>> sb_danish('sikkerhed')
'sikker'

New in version 0.1.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the SnowballDanish.stem method instead.

class abydos.stemmer.SnowballDutch[source]

Bases: abydos.stemmer._snowball._Snowball

Snowball Dutch stemmer.

The Snowball Dutch stemmer is defined at: http://snowball.tartarus.org/algorithms/dutch/stemmer.html

New in version 0.3.6.

_accented = {225: 'a', 228: 'a', 233: 'e', 235: 'e', 237: 'i', 239: 'i', 243: 'o', 246: 'o', 250: 'u', 252: 'u'}
_not_s_endings = {'a', 'e', 'i', 'j', 'o', 'u', 'y', 'è'}
_undouble(word)[source]

Undouble endings -kk, -dd, and -tt.

Parameters

word (str) -- The word to stem

Returns

The word with doubled endings undoubled

Return type

str

New in version 0.1.0.

Changed in version 0.3.6: Encapsulated in class
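
The undoubling step described above can be sketched as follows. The standalone function name is hypothetical, and the sketch covers only the three endings named in the description.

```python
def undouble(word):
    """Drop the last letter if word ends in -kk, -dd or -tt."""
    if word.endswith(('kk', 'dd', 'tt')):
        return word[:-1]
    return word
```

Other doubled endings such as -ll are left untouched.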

_vowels = {'a', 'e', 'i', 'o', 'u', 'y', 'è'}
stem(word)[source]

Return Snowball Dutch stem.

Parameters

word (str) -- The word to stem

Returns

Word stem

Return type

str

Examples

>>> stmr = SnowballDutch()
>>> stmr.stem('lezen')
'lez'
>>> stmr.stem('opschorting')
'opschort'
>>> stmr.stem('ongrijpbaarheid')
'ongrijp'

New in version 0.1.0.

Changed in version 0.3.6: Encapsulated in class

abydos.stemmer.sb_dutch(word)[source]

Return Snowball Dutch stem.

This is a wrapper for SnowballDutch.stem().

Parameters

word (str) -- The word to stem

Returns

Word stem

Return type

str

Examples

>>> sb_dutch('lezen')
'lez'
>>> sb_dutch('opschorting')
'opschort'
>>> sb_dutch('ongrijpbaarheid')
'ongrijp'

New in version 0.1.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the SnowballDutch.stem method instead.

class abydos.stemmer.SnowballGerman(alternate_vowels=False)[source]

Bases: abydos.stemmer._snowball._Snowball

Snowball German stemmer.

The Snowball German stemmer is defined at: http://snowball.tartarus.org/algorithms/german/stemmer.html

New in version 0.3.6.

Initialize SnowballGerman instance.

Parameters

alternate_vowels (bool) -- Composes ae as ä, oe as ö, and ue as ü before running the algorithm

New in version 0.4.0.
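
The composition that alternate_vowels performs can be sketched as below. The helper name compose_umlauts is hypothetical. Leaving 'ue' alone after 'q' is an assumption taken from the Snowball description of the German variant, not from this page. The class may order or guard the replacements differently.

```python
import re


def compose_umlauts(word):
    """Rewrite ae, oe and ue as umlauted vowels (illustrative sketch)."""
    word = word.replace('ae', 'ä').replace('oe', 'ö')
    # Leave 'ue' alone after 'q' (as in 'quelle'); this guard is an assumption.
    return re.sub(r'(?<!q)ue', 'ü', word)
```

A plain textual replacement like this also rewrites letter pairs that were never umlauts in the first place, which is one reason the option is off by default in the signature above.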

_s_endings = {'b', 'd', 'f', 'g', 'h', 'k', 'l', 'm', 'n', 'r', 't'}
_st_endings = {'b', 'd', 'f', 'g', 'h', 'k', 'l', 'm', 'n', 't'}
_vowels = {'a', 'e', 'i', 'o', 'u', 'y', 'ä', 'ö', 'ü'}
stem(word)[source]

Return Snowball German stem.

Parameters

word (str) -- The word to stem

Returns

Word stem

Return type

str

Examples

>>> stmr = SnowballGerman()
>>> stmr.stem('lesen')
'les'
>>> stmr.stem('graues')
'grau'
>>> stmr.stem('buchstabieren')
'buchstabi'

New in version 0.1.0.

Changed in version 0.3.6: Encapsulated in class

abydos.stemmer.sb_german(word, alternate_vowels=False)[source]

Return Snowball German stem.

This is a wrapper for SnowballGerman.stem().

Parameters
  • word (str) -- The word to stem

  • alternate_vowels (bool) -- Composes ae as ä, oe as ö, and ue as ü before running the algorithm

Returns

Word stem

Return type

str

Examples

>>> sb_german('lesen')
'les'
>>> sb_german('graues')
'grau'
>>> sb_german('buchstabieren')
'buchstabi'

New in version 0.1.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the SnowballGerman.stem method instead.

class abydos.stemmer.SnowballNorwegian[source]

Bases: abydos.stemmer._snowball._Snowball

Snowball Norwegian stemmer.

The Snowball Norwegian stemmer is defined at: http://snowball.tartarus.org/algorithms/norwegian/stemmer.html

New in version 0.3.6.

_s_endings = {'b', 'c', 'd', 'f', 'g', 'h', 'j', 'l', 'm', 'n', 'o', 'p', 'r', 't', 'v', 'y', 'z'}
_vowels = {'a', 'e', 'i', 'o', 'u', 'y', 'å', 'æ', 'ø'}
stem(word)[source]

Return Snowball Norwegian stem.

Parameters

word (str) -- The word to stem

Returns

Word stem

Return type

str

Examples

>>> stmr = SnowballNorwegian()
>>> stmr.stem('lese')
'les'
>>> stmr.stem('suspensjon')
'suspensjon'
>>> stmr.stem('sikkerhet')
'sikker'

New in version 0.1.0.

Changed in version 0.3.6: Encapsulated in class

abydos.stemmer.sb_norwegian(word)[source]

Return Snowball Norwegian stem.

This is a wrapper for SnowballNorwegian.stem().

Parameters

word (str) -- The word to stem

Returns

Word stem

Return type

str

Examples

>>> sb_norwegian('lese')
'les'
>>> sb_norwegian('suspensjon')
'suspensjon'
>>> sb_norwegian('sikkerhet')
'sikker'

New in version 0.1.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the SnowballNorwegian.stem method instead.

class abydos.stemmer.SnowballSwedish[source]

Bases: abydos.stemmer._snowball._Snowball

Snowball Swedish stemmer.

The Snowball Swedish stemmer is defined at: http://snowball.tartarus.org/algorithms/swedish/stemmer.html

New in version 0.3.6.

_s_endings = {'b', 'c', 'd', 'f', 'g', 'h', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'r', 't', 'v', 'y'}
_vowels = {'a', 'e', 'i', 'o', 'u', 'y', 'ä', 'å', 'ö'}
stem(word)[source]

Return Snowball Swedish stem.

Parameters

word (str) -- The word to stem

Returns

Word stem

Return type

str

Examples

>>> stmr = SnowballSwedish()
>>> stmr.stem('undervisa')
'undervis'
>>> stmr.stem('suspension')
'suspension'
>>> stmr.stem('visshet')
'viss'

New in version 0.1.0.

Changed in version 0.3.6: Encapsulated in class

abydos.stemmer.sb_swedish(word)[source]

Return Snowball Swedish stem.

This is a wrapper for SnowballSwedish.stem().

Parameters

word (str) -- The word to stem

Returns

Word stem

Return type

str

Examples

>>> sb_swedish('undervisa')
'undervis'
>>> sb_swedish('suspension')
'suspension'
>>> sb_swedish('visshet')
'viss'

New in version 0.1.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the SnowballSwedish.stem method instead.

class abydos.stemmer.CLEFGerman[source]

Bases: abydos.stemmer._stemmer._Stemmer

CLEF German stemmer.

The CLEF German stemmer is defined in [Sav05].

New in version 0.3.6.

_umlauts = {228: 'a', 246: 'o', 252: 'u'}
stem(word)[source]

Return CLEF German stem.

Parameters

word (str) -- The word to stem

Returns

Word stem

Return type

str

Examples

>>> stmr = CLEFGerman()
>>> stmr.stem('lesen')
'lese'
>>> stmr.stem('graues')
'grau'
>>> stmr.stem('buchstabieren')
'buchstabier'

New in version 0.1.0.

Changed in version 0.3.6: Encapsulated in class

abydos.stemmer.clef_german(word)[source]

Return CLEF German stem.

This is a wrapper for CLEFGerman.stem().

Parameters

word (str) -- The word to stem

Returns

Word stem

Return type

str

Examples

>>> clef_german('lesen')
'lese'
>>> clef_german('graues')
'grau'
>>> clef_german('buchstabieren')
'buchstabier'

New in version 0.1.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the CLEFGerman.stem method instead.

class abydos.stemmer.CLEFGermanPlus[source]

Bases: abydos.stemmer._stemmer._Stemmer

CLEF German stemmer plus.

The CLEF German stemmer plus is defined in [Sav05].

New in version 0.3.6.

_accents = {224: 'a', 225: 'a', 226: 'a', 228: 'a', 236: 'i', 237: 'i', 238: 'i', 239: 'i', 242: 'o', 243: 'o', 244: 'o', 246: 'o', 249: 'u', 250: 'u', 251: 'u', 252: 'u'}
_st_ending = {'b', 'd', 'f', 'g', 'h', 'k', 'l', 'm', 'n', 't'}
stem(word)[source]

Return 'CLEF German stemmer plus' stem.

Parameters

word (str) -- The word to stem

Returns

Word stem

Return type

str

Examples

>>> stmr = CLEFGermanPlus()
>>> stmr.stem('lesen')
'les'
>>> stmr.stem('graues')
'grau'
>>> stmr.stem('buchstabieren')
'buchstabi'

New in version 0.1.0.

Changed in version 0.3.6: Encapsulated in class

abydos.stemmer.clef_german_plus(word)[source]

Return 'CLEF German stemmer plus' stem.

This is a wrapper for CLEFGermanPlus.stem().

Parameters

word (str) -- The word to stem

Returns

Word stem

Return type

str

Examples

>>> clef_german_plus('lesen')
'les'
>>> clef_german_plus('graues')
'grau'
>>> clef_german_plus('buchstabieren')
'buchstabi'

New in version 0.1.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the CLEFGermanPlus.stem method instead.

class abydos.stemmer.CLEFSwedish[source]

Bases: abydos.stemmer._stemmer._Stemmer

CLEF Swedish stemmer.

The CLEF Swedish stemmer is defined in [Sav05].

New in version 0.3.6.

stem(word)[source]

Return CLEF Swedish stem.

Parameters

word (str) -- The word to stem

Returns

Word stem

Return type

str

Examples

>>> stmr = CLEFSwedish()
>>> stmr.stem('undervisa')
'undervis'
>>> stmr.stem('suspension')
'suspensio'
>>> stmr.stem('visshet')
'viss'

New in version 0.1.0.

Changed in version 0.3.6: Encapsulated in class

abydos.stemmer.clef_swedish(word)[source]

Return CLEF Swedish stem.

This is a wrapper for CLEFSwedish.stem().

Parameters

word (str) -- The word to stem

Returns

Word stem

Return type

str

Examples

>>> clef_swedish('undervisa')
'undervis'
>>> clef_swedish('suspension')
'suspensio'
>>> clef_swedish('visshet')
'viss'

New in version 0.1.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the CLEFSwedish.stem method instead.