abydos.phonetic package

The phonetic package includes classes for phonetic algorithms. There are also language-specific phonetic algorithms for German, French, Spanish, Swedish, Norwegian, and Brazilian Portuguese.

And there are some hybrid phonetic algorithms that employ multiple underlying phonetic algorithms:

  • Oxford Name Compression Algorithm (ONCA)

  • MetaSoundex

Each class has an encode method to return the phonetically encoded string. Classes for which encode returns a numeric value generally have an encode_alpha method that returns an alphabetic version of the phonetic encoding, as demonstrated below:

>>> rus = RussellIndex()
>>> rus.encode('Abramson')
128637
>>> rus.encode_alpha('Abramson')
'ABRMCN'

class abydos.phonetic.RussellIndex[source]

Bases: abydos.phonetic._phonetic._Phonetic

Russell Index.

This follows Robert C. Russell's Index algorithm, as described in [Rus18].

New in version 0.3.6.

encode(word)[source]

Return the Russell Index (integer output) of a word.

Parameters

word (str) -- The word to transform

Returns

The Russell Index value

Return type

int

Examples

>>> pe = RussellIndex()
>>> pe.encode('Christopher')
3813428
>>> pe.encode('Niall')
715
>>> pe.encode('Smith')
3614
>>> pe.encode('Schmidt')
3614

New in version 0.1.0.

Changed in version 0.3.6: Encapsulated in class

encode_alpha(word)[source]

Return the Russell Index (alphabetic output) for the word.

This follows Robert C. Russell's Index algorithm, as described in [Rus18].

Parameters

word (str) -- The word to transform

Returns

The Russell Index value as an alphabetic string

Return type

str

Examples

>>> pe = RussellIndex()
>>> pe.encode_alpha('Christopher')
'CRACDBR'
>>> pe.encode_alpha('Niall')
'NAL'
>>> pe.encode_alpha('Smith')
'CMAD'
>>> pe.encode_alpha('Schmidt')
'CMAD'

New in version 0.1.0.

Changed in version 0.3.6: Encapsulated in class

abydos.phonetic.russell_index(word)[source]

Return the Russell Index (integer output) of a word.

This is a wrapper for RussellIndex.encode().

Parameters

word (str) -- The word to transform

Returns

The Russell Index value

Return type

int

Examples

>>> russell_index('Christopher')
3813428
>>> russell_index('Niall')
715
>>> russell_index('Smith')
3614
>>> russell_index('Schmidt')
3614

New in version 0.1.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the RussellIndex.encode method instead.

abydos.phonetic.russell_index_num_to_alpha(num)[source]

Convert the Russell Index integer to an alphabetic string.

This is a wrapper for RussellIndex._to_alpha().

Parameters

num (int) -- A Russell Index integer value

Returns

The Russell Index as an alphabetic string

Return type

str

Examples

>>> russell_index_num_to_alpha(3813428)
'CRACDBR'
>>> russell_index_num_to_alpha(715)
'NAL'
>>> russell_index_num_to_alpha(3614)
'CMAD'

New in version 0.1.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the RussellIndex._to_alpha method instead.
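The relationship between the integer and alphabetic forms can be sketched as a digit-by-digit lookup. The table below is inferred from the examples in this section (an assumption, not copied from the abydos source), and russell_num_to_alpha_sketch is a hypothetical name used only for illustration.

```python
# Digit-to-letter table inferred from the documented examples
# (1->A, 2->B, 3->C, 4->D, 5->L, 6->M, 7->N, 8->R). This is an
# assumption for illustration, not the library's own table.
_DIGIT_TO_LETTER = dict(zip("12345678", "ABCDLMNR"))


def russell_num_to_alpha_sketch(num):
    """Map each digit of a Russell Index integer to its letter."""
    # Digits outside 1-8 are silently dropped in this sketch.
    return "".join(_DIGIT_TO_LETTER.get(digit, "") for digit in str(num))


print(russell_num_to_alpha_sketch(3813428))  # CRACDBR
print(russell_num_to_alpha_sketch(715))      # NAL
print(russell_num_to_alpha_sketch(3614))     # CMAD
```

Because the mapping is one letter per digit, two words share an alphabetic code exactly when they share an integer code, as 'Smith' and 'Schmidt' do above.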

abydos.phonetic.russell_index_alpha(word)[source]

Return the Russell Index (alphabetic output) for the word.

This is a wrapper for RussellIndex.encode_alpha().

Parameters

word (str) -- The word to transform

Returns

The Russell Index value as an alphabetic string

Return type

str

Examples

>>> russell_index_alpha('Christopher')
'CRACDBR'
>>> russell_index_alpha('Niall')
'NAL'
>>> russell_index_alpha('Smith')
'CMAD'
>>> russell_index_alpha('Schmidt')
'CMAD'

New in version 0.1.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the RussellIndex.encode_alpha method instead.

class abydos.phonetic.Soundex(max_length=4, var='American', reverse=False, zero_pad=True)[source]

Bases: abydos.phonetic._phonetic._Phonetic

Soundex.

Three variants of Soundex are implemented:

  • 'American' follows the American Soundex algorithm, as described at [Sta07] and in [Knu98]; this is also called Miracode

  • 'special' follows the rules from the 1880-1910 US Census retrospective re-analysis, in which h & w are not treated as blocking consonants but as vowels. Cf. [Rep13].

  • 'Census' follows the rules laid out in GIL 55 [Sta97] by the US Census, including coding prefixed and unprefixed versions of some names

New in version 0.3.6.

Initialize Soundex instance.

Parameters
  • max_length (int) -- The length of the code returned (defaults to 4)

  • var (str) --

    The variant of the algorithm to employ (defaults to American):

    • American follows the American Soundex algorithm, as described at [Sta07] and in [Knu98]; this is also called Miracode

    • special follows the rules from the 1880-1910 US Census retrospective re-analysis, in which h & w are not treated as blocking consonants but as vowels. Cf. [Rep13].

    • Census follows the rules laid out in GIL 55 [Sta97] by the US Census, including coding prefixed and unprefixed versions of some names

  • reverse (bool) -- Reverse the word before computing the selected Soundex (defaults to False). This results in "Reverse Soundex", which is useful for blocking in cases where the initial elements may be in error.

  • zero_pad (bool) -- Pad the end of the return value with 0s to achieve a max_length string

New in version 0.4.0.

encode(word)[source]

Return the Soundex code for a word.

Parameters

word (str) -- The word to transform

Returns

The Soundex value

Return type

str

Examples

>>> pe = Soundex()
>>> pe.encode("Christopher")
'C623'
>>> pe.encode("Niall")
'N400'
>>> pe.encode('Smith')
'S530'
>>> pe.encode('Schmidt')
'S530'
>>> Soundex(max_length=-1).encode('Christopher')
'C623160000000000000000000000000000000000000000000000000000000000'
>>> Soundex(max_length=-1, zero_pad=False).encode('Christopher')
'C62316'
>>> Soundex(reverse=True).encode('Christopher')
'R132'
>>> pe.encode('Ashcroft')
'A261'
>>> pe.encode('Asicroft')
'A226'
>>> pe_special = Soundex(var='special')
>>> pe_special.encode('Ashcroft')
'A226'
>>> pe_special.encode('Asicroft')
'A226'

New in version 0.1.0.

Changed in version 0.3.6: Encapsulated in class
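The difference between the 'American' and 'special' variants shows up in names such as 'Ashcroft'. The following is a minimal sketch of classic Soundex written for illustration, not the abydos implementation: soundex_sketch and its hw_as_vowels flag are hypothetical names, the 'Census' variant is not modelled, and max_length=-1 is not supported.

```python
# Consonant classes of classic Soundex; vowels, H, W, and Y carry no digit.
_CODES = {}
for _letters, _digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex_sketch(word, max_length=4, zero_pad=True,
                   reverse=False, hw_as_vowels=False):
    """Illustrative Soundex; hw_as_vowels=True mimics the 'special' rule."""
    letters = [ch for ch in word.upper() if "A" <= ch <= "Z"]
    if reverse:
        letters.reverse()
    if not letters:
        return ""
    code = letters[0]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW" and not hw_as_vowels:
            # American rule: H and W do not separate equal codes.
            continue
        digit = _CODES.get(ch, "")
        if digit and digit != prev:
            code += digit
        # A vowel (or H/W in the 'special' rule) resets prev to "".
        prev = digit
    code = code[:max_length]
    return code.ljust(max_length, "0") if zero_pad else code


print(soundex_sketch("Ashcroft"))                     # A261
print(soundex_sketch("Ashcroft", hw_as_vowels=True))  # A226
print(soundex_sketch("Christopher", reverse=True))    # R132
```

In 'Ashcroft', the American rule skips the h, so s and c collapse into a single 2. Treating h as a vowel keeps them apart and yields two 2s.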

encode_alpha(word)[source]

Return the alphabetic Soundex code for a word.

Parameters

word (str) -- The word to transform

Returns

The alphabetic Soundex value

Return type

str

Examples

>>> pe = Soundex()
>>> pe.encode_alpha("Christopher")
'CRKT'
>>> pe.encode_alpha("Niall")
'NL'
>>> pe.encode_alpha('Smith')
'SNT'
>>> pe.encode_alpha('Schmidt')
'SNT'

New in version 0.4.0.

abydos.phonetic.soundex(word, max_length=4, var='American', reverse=False, zero_pad=True)[source]

Return the Soundex code for a word.

This is a wrapper for Soundex.encode().

Parameters
  • word (str) -- The word to transform

  • max_length (int) -- The length of the code returned (defaults to 4)

  • var (str) --

    The variant of the algorithm to employ (defaults to American):

    • American follows the American Soundex algorithm, as described at [Sta07] and in [Knu98]; this is also called Miracode

    • special follows the rules from the 1880-1910 US Census retrospective re-analysis, in which h & w are not treated as blocking consonants but as vowels. Cf. [Rep13].

    • Census follows the rules laid out in GIL 55 [Sta97] by the US Census, including coding prefixed and unprefixed versions of some names

  • reverse (bool) -- Reverse the word before computing the selected Soundex (defaults to False). This results in "Reverse Soundex", which is useful for blocking in cases where the initial elements may be in error.

  • zero_pad (bool) -- Pad the end of the return value with 0s to achieve a max_length string

Returns

The Soundex value

Return type

str

Examples

>>> soundex("Christopher")
'C623'
>>> soundex("Niall")
'N400'
>>> soundex('Smith')
'S530'
>>> soundex('Schmidt')
'S530'
>>> soundex('Christopher', max_length=-1)
'C623160000000000000000000000000000000000000000000000000000000000'
>>> soundex('Christopher', max_length=-1, zero_pad=False)
'C62316'
>>> soundex('Christopher', reverse=True)
'R132'
>>> soundex('Ashcroft')
'A261'
>>> soundex('Asicroft')
'A226'
>>> soundex('Ashcroft', var='special')
'A226'
>>> soundex('Asicroft', var='special')
'A226'

New in version 0.1.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the Soundex.encode method instead.
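Blocking, mentioned in the description of the reverse parameter, means grouping records by code so that only names within the same group are compared. A minimal sketch follows; block_by_code is a hypothetical helper, and the codes are copied from the examples above rather than computed, so the snippet runs without abydos installed.

```python
from collections import defaultdict

# Codes copied from the Soundex examples above; in real use they would
# come from an encoder such as Soundex().encode.
EXAMPLE_CODES = {
    "Christopher": "C623",
    "Niall": "N400",
    "Smith": "S530",
    "Schmidt": "S530",
}


def block_by_code(names, encode):
    """Group names whose phonetic codes are identical."""
    blocks = defaultdict(list)
    for name in names:
        blocks[encode(name)].append(name)
    return dict(blocks)


blocks = block_by_code(EXAMPLE_CODES, EXAMPLE_CODES.get)
print(blocks["S530"])  # ['Smith', 'Schmidt']
```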

class abydos.phonetic.RefinedSoundex(max_length=-1, zero_pad=False, retain_vowels=False)[source]

Bases: abydos.phonetic._phonetic._Phonetic

Refined Soundex.

This is Soundex, but with more character classes. It was defined at [Boy98].

New in version 0.3.6.

Initialize RefinedSoundex instance.

Parameters
  • max_length (int) -- The length of the code returned (defaults to unlimited)

  • zero_pad (bool) -- Pad the end of the return value with 0s to achieve a max_length string

  • retain_vowels (bool) -- Retain vowels (as 0) in the resulting code

New in version 0.4.0.

encode(word)[source]

Return the Refined Soundex code for a word.

Parameters

word (str) -- The word to transform

Returns

The Refined Soundex value

Return type

str

Examples

>>> pe = RefinedSoundex()
>>> pe.encode('Christopher')
'C93619'
>>> pe.encode('Niall')
'N7'
>>> pe.encode('Smith')
'S86'
>>> pe.encode('Schmidt')
'S386'

New in version 0.3.0.

Changed in version 0.3.6: Encapsulated in class

encode_alpha(word)[source]

Return the alphabetic Refined Soundex code for a word.

Parameters

word (str) -- The word to transform

Returns

The alphabetic Refined Soundex value

Return type

str

Examples

>>> pe = RefinedSoundex()
>>> pe.encode_alpha('Christopher')
'CRKTPR'
>>> pe.encode_alpha('Niall')
'NL'
>>> pe.encode_alpha('Smith')
'SNT'
>>> pe.encode_alpha('Schmidt')
'SKNT'

New in version 0.4.0.

abydos.phonetic.refined_soundex(word, max_length=-1, zero_pad=False, retain_vowels=False)[source]

Return the Refined Soundex code for a word.

This is a wrapper for RefinedSoundex.encode().

Parameters
  • word (str) -- The word to transform

  • max_length (int) -- The length of the code returned (defaults to unlimited)

  • zero_pad (bool) -- Pad the end of the return value with 0s to achieve a max_length string

  • retain_vowels (bool) -- Retain vowels (as 0) in the resulting code

Returns

The Refined Soundex value

Return type

str

Examples

>>> refined_soundex('Christopher')
'C93619'
>>> refined_soundex('Niall')
'N7'
>>> refined_soundex('Smith')
'S86'
>>> refined_soundex('Schmidt')
'S386'

New in version 0.3.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the RefinedSoundex.encode method instead.

class abydos.phonetic.DaitchMokotoff(max_length=6, zero_pad=True)[source]

Bases: abydos.phonetic._phonetic._Phonetic

Daitch-Mokotoff Soundex.

Based on Daitch-Mokotoff Soundex [Mok97], this returns values of a word as a set. A collection is necessary since there can be multiple values for a single word.

New in version 0.3.6.

Initialize DaitchMokotoff instance.

Parameters
  • max_length (int) -- The length of the code returned (defaults to 6; must be between 6 and 64)

  • zero_pad (bool) -- Pad the end of the return value with 0s to achieve a max_length string

New in version 0.4.0.

encode(word)[source]

Return the Daitch-Mokotoff Soundex code for a word.

Parameters

word (str) -- The word to transform

Returns

The Daitch-Mokotoff Soundex values

Return type

set

Examples

>>> pe = DaitchMokotoff()
>>> sorted(pe.encode('Christopher'))
['494379', '594379']
>>> pe.encode('Niall')
{'680000'}
>>> pe.encode('Smith')
{'463000'}
>>> pe.encode('Schmidt')
{'463000'}
>>> sorted(DaitchMokotoff(max_length=20,
... zero_pad=False).encode('The quick brown fox'))
['35457976754', '3557976754']

New in version 0.1.0.

Changed in version 0.3.6: Encapsulated in class
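Since a word can have several Daitch-Mokotoff codes, two words are usually treated as a match when their code sets overlap. A minimal sketch of that comparison follows; share_code is a hypothetical helper, and the sets are copied from the examples above.

```python
def share_code(codes_a, codes_b):
    """Return True if the two collections have at least one code in common."""
    return not set(codes_a).isdisjoint(codes_b)


# Code sets copied from the documented examples.
christopher = {"494379", "594379"}
smith = {"463000"}
schmidt = {"463000"}

print(share_code(smith, schmidt))      # True
print(share_code(smith, christopher))  # False
```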

encode_alpha(word)[source]

Return the alphabetic Daitch-Mokotoff Soundex code for a word.

Parameters

word (str) -- The word to transform

Returns

The alphabetic Daitch-Mokotoff Soundex values

Return type

set

Examples

>>> pe = DaitchMokotoff()
>>> sorted(pe.encode_alpha('Christopher'))
['KRSTPR', 'SRSTPR']
>>> pe.encode_alpha('Niall')
{'NL'}
>>> pe.encode_alpha('Smith')
{'SNT'}
>>> pe.encode_alpha('Schmidt')
{'SNT'}
>>> sorted(DaitchMokotoff(max_length=20,
... zero_pad=False).encode_alpha('The quick brown fox'))
['TKKPRPNPKS', 'TKSKPRPNPKS']

New in version 0.4.0.

abydos.phonetic.dm_soundex(word, max_length=6, zero_pad=True)[source]

Return the Daitch-Mokotoff Soundex code for a word.

This is a wrapper for DaitchMokotoff.encode().

Parameters
  • word (str) -- The word to transform

  • max_length (int) -- The length of the code returned (defaults to 6; must be between 6 and 64)

  • zero_pad (bool) -- Pad the end of the return value with 0s to achieve a max_length string

Returns

The Daitch-Mokotoff Soundex values

Return type

set

Examples

>>> sorted(dm_soundex('Christopher'))
['494379', '594379']
>>> dm_soundex('Niall')
{'680000'}
>>> dm_soundex('Smith')
{'463000'}
>>> dm_soundex('Schmidt')
{'463000'}
>>> sorted(dm_soundex('The quick brown fox', max_length=20,
... zero_pad=False))
['35457976754', '3557976754']

New in version 0.1.0.

Changed in version 0.3.6: Encapsulated in class

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the DaitchMokotoff.encode method instead.

class abydos.phonetic.FuzzySoundex(max_length=5, zero_pad=True)[source]

Bases: abydos.phonetic._phonetic._Phonetic

Fuzzy Soundex.

Fuzzy Soundex is an algorithm derived from Soundex, defined in [HM02].

New in version 0.3.6.

Initialize FuzzySoundex instance.

Parameters
  • max_length (int) -- The length of the code returned (defaults to 5)

  • zero_pad (bool) -- Pad the end of the return value with 0s to achieve a max_length string

New in version 0.4.0.
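The combined effect of max_length and zero_pad can be sketched as a final truncate-and-pad step. This is an illustration only: finalize_code is a hypothetical helper, and the unpadded inputs 'S53' and 'N4' are assumed intermediate values, not output taken from the library.

```python
def finalize_code(code, max_length=5, zero_pad=True):
    """Truncate a code to max_length and optionally right-pad it with 0s."""
    code = code[:max_length]
    if zero_pad:
        code = code.ljust(max_length, "0")
    return code


print(finalize_code("S53"))                  # S5300
print(finalize_code("S53", zero_pad=False))  # S53
```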

encode(word)[source]

Return the Fuzzy Soundex code for a word.

Parameters

word (str) -- The word to transform

Returns

The Fuzzy Soundex value

Return type

str

Examples

>>> pe = FuzzySoundex()
>>> pe.encode('Christopher')
'K6931'
>>> pe.encode('Niall')
'N4000'
>>> pe.encode('Smith')
'S5300'

New in version 0.1.0.

Changed in version 0.3.6: Encapsulated in class

encode_alpha(word)[source]

Return the alphabetic Fuzzy Soundex code for a word.

Parameters

word (str) -- The word to transform

Returns

The alphabetic Fuzzy Soundex value

Return type

str

Examples

>>> pe = FuzzySoundex()
>>> pe.encode_alpha('Christopher')
'KRSTP'
>>> pe.encode_alpha('Niall')
'NL'
>>> pe.encode_alpha('Smith')
'SNT'
>>> pe.encode_alpha('Schmidt')
'SNT'

New in version 0.4.0.

abydos.phonetic.fuzzy_soundex(word, max_length=5, zero_pad=True)[source]

Return the Fuzzy Soundex code for a word.

This is a wrapper for FuzzySoundex.encode().

Parameters
  • word (str) -- The word to transform

  • max_length (int) -- The length of the code returned (defaults to 5)

  • zero_pad (bool) -- Pad the end of the return value with 0s to achieve a max_length string

Returns

The Fuzzy Soundex value

Return type

str

Examples

>>> fuzzy_soundex('Christopher')
'K6931'
>>> fuzzy_soundex('Niall')
'N4000'
>>> fuzzy_soundex('Smith')
'S5300'

New in version 0.1.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the FuzzySoundex.encode method instead.

class abydos.phonetic.LEIN(max_length=4, zero_pad=True)[source]

Bases: abydos.phonetic._phonetic._Phonetic

LEIN code.

This is Michigan LEIN (Law Enforcement Information Network) name coding, described in [MKTM77].

New in version 0.3.6.

Initialize LEIN instance.

Parameters
  • max_length (int) -- The length of the code returned (defaults to 4)

  • zero_pad (bool) -- Pad the end of the return value with 0s to achieve a max_length string

New in version 0.4.0.

encode(word)[source]

Return the LEIN code for a word.

Parameters

word (str) -- The word to transform

Returns

The LEIN code

Return type

str

Examples

>>> pe = LEIN()
>>> pe.encode('Christopher')
'C351'
>>> pe.encode('Niall')
'N300'
>>> pe.encode('Smith')
'S210'
>>> pe.encode('Schmidt')
'S521'

New in version 0.3.0.

Changed in version 0.3.6: Encapsulated in class

encode_alpha(word)[source]

Return the alphabetic LEIN code for a word.

Parameters

word (str) -- The word to transform

Returns

The alphabetic LEIN code

Return type

str

Examples

>>> pe = LEIN()
>>> pe.encode_alpha('Christopher')
'CLKT'
>>> pe.encode_alpha('Niall')
'NL'
>>> pe.encode_alpha('Smith')
'SNT'
>>> pe.encode_alpha('Schmidt')
'SKNT'

New in version 0.4.0.

abydos.phonetic.lein(word, max_length=4, zero_pad=True)[source]

Return the LEIN code for a word.

This is a wrapper for LEIN.encode().

Parameters
  • word (str) -- The word to transform

  • max_length (int) -- The length of the code returned (defaults to 4)

  • zero_pad (bool) -- Pad the end of the return value with 0s to achieve a max_length string

Returns

The LEIN code

Return type

str

Examples

>>> lein('Christopher')
'C351'
>>> lein('Niall')
'N300'
>>> lein('Smith')
'S210'
>>> lein('Schmidt')
'S521'

New in version 0.3.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the LEIN.encode method instead.

class abydos.phonetic.Phonex(max_length=4, zero_pad=True)[source]

Bases: abydos.phonetic._phonetic._Phonetic

Phonex code.

Phonex is an algorithm derived from Soundex, defined in [LR96].

New in version 0.3.6.

Initialize Phonex instance.

Parameters
  • max_length (int) -- The length of the code returned (defaults to 4)

  • zero_pad (bool) -- Pad the end of the return value with 0s to achieve a max_length string

New in version 0.4.0.

encode(word)[source]

Return the Phonex code for a word.

Parameters

word (str) -- The word to transform

Returns

The Phonex value

Return type

str

Examples

>>> pe = Phonex()
>>> pe.encode('Christopher')
'C623'
>>> pe.encode('Niall')
'N400'
>>> pe.encode('Schmidt')
'S253'
>>> pe.encode('Smith')
'S530'

New in version 0.1.0.

Changed in version 0.3.6: Encapsulated in class

encode_alpha(word)[source]

Return the alphabetic Phonex code for a word.

Parameters

word (str) -- The word to transform

Returns

The alphabetic Phonex value

Return type

str

Examples

>>> pe = Phonex()
>>> pe.encode_alpha('Christopher')
'CRST'
>>> pe.encode_alpha('Niall')
'NL'
>>> pe.encode_alpha('Smith')
'SNT'
>>> pe.encode_alpha('Schmidt')
'SSNT'

New in version 0.4.0.

abydos.phonetic.phonex(word, max_length=4, zero_pad=True)[source]

Return the Phonex code for a word.

This is a wrapper for Phonex.encode().

Parameters
  • word (str) -- The word to transform

  • max_length (int) -- The length of the code returned (defaults to 4)

  • zero_pad (bool) -- Pad the end of the return value with 0s to achieve a max_length string

Returns

The Phonex value

Return type

str

Examples

>>> phonex('Christopher')
'C623'
>>> phonex('Niall')
'N400'
>>> phonex('Schmidt')
'S253'
>>> phonex('Smith')
'S530'

New in version 0.1.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the Phonex.encode method instead.

class abydos.phonetic.PHONIC(max_length=5, zero_pad=True, extended=False)[source]

Bases: abydos.phonetic._phonetic._Phonetic

PHONIC code.

PHONIC is a Soundex-like algorithm defined in [Taf70].

New in version 0.4.1.

Initialize PHONIC instance.

Parameters
  • max_length (int) -- The length of the code returned (defaults to 5)

  • zero_pad (bool) -- Pad the end of the return value with 0s to achieve a max_length string

  • extended (bool) -- If True, this uses Taft's 'Extended PHONIC coding' mode, which simply omits the first character of the code.

New in version 0.4.1.

encode(word)[source]

Return the PHONIC code for a word.

Parameters

word (str) -- The word to transform

Returns

The PHONIC code

Return type

str

Examples

>>> pe = PHONIC()
>>> pe.encode('Christopher')
'C6401'
>>> pe.encode('Niall')
'N2500'
>>> pe.encode('Smith')
'S0310'
>>> pe.encode('Schmidt')
'S0631'

New in version 0.4.1.

encode_alpha(word)[source]

Return the alphabetic PHONIC code for a word.

Parameters

word (str) -- The word to transform

Returns

The alphabetic PHONIC value

Return type

str

Examples

>>> pe = PHONIC()
>>> pe.encode_alpha('Christopher')
'JRSTF'
>>> pe.encode_alpha('Niall')
'NL'
>>> pe.encode_alpha('Smith')
'SMT'
>>> pe.encode_alpha('Schmidt')
'SJMT'

New in version 0.4.1.

class abydos.phonetic.Phonix(max_length=4, zero_pad=True)[source]

Bases: abydos.phonetic._phonetic._Phonetic

Phonix code.

Phonix is a Soundex-like algorithm defined in [Gad90].

This implementation is based on [Pfe00], [Chr11], and [Kollar].

New in version 0.3.6.

Initialize Phonix instance.

Parameters
  • max_length (int) -- The length of the code returned (defaults to 4)

  • zero_pad (bool) -- Pad the end of the return value with 0s to achieve a max_length string

New in version 0.3.6.

encode(word)[source]

Return the Phonix code for a word.

Parameters

word (str) -- The word to transform

Returns

The Phonix value

Return type

str

Examples

>>> pe = Phonix()
>>> pe.encode('Christopher')
'K683'
>>> pe.encode('Niall')
'N400'
>>> pe.encode('Smith')
'S530'
>>> pe.encode('Schmidt')
'S530'

New in version 0.1.0.

Changed in version 0.3.6: Encapsulated in class

encode_alpha(word)[source]

Return the alphabetic Phonix code for a word.

Parameters

word (str) -- The word to transform

Returns

The alphabetic Phonix value

Return type

str

Examples

>>> pe = Phonix()
>>> pe.encode_alpha('Christopher')
'KRST'
>>> pe.encode_alpha('Niall')
'NL'
>>> pe.encode_alpha('Smith')
'SNT'
>>> pe.encode_alpha('Schmidt')
'SNT'

New in version 0.4.0.

abydos.phonetic.phonix(word, max_length=4, zero_pad=True)[source]

Return the Phonix code for a word.

This is a wrapper for Phonix.encode().

Parameters
  • word (str) -- The word to transform

  • max_length (int) -- The length of the code returned (defaults to 4)

  • zero_pad (bool) -- Pad the end of the return value with 0s to achieve a max_length string

Returns

The Phonix value

Return type

str

Examples

>>> phonix('Christopher')
'K683'
>>> phonix('Niall')
'N400'
>>> phonix('Smith')
'S530'
>>> phonix('Schmidt')
'S530'

New in version 0.1.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the Phonix.encode method instead.

class abydos.phonetic.PSHPSoundexFirst(max_length=4, german=False)[source]

Bases: abydos.phonetic._phonetic._Phonetic

PSHP Soundex/Viewex Coding of a first name.

This coding is based on [HBD76].

Reference was also made to the German version of the same: [HBD79].

A separate class, PSHPSoundexLast, is used for last names.

New in version 0.3.6.

Initialize PSHPSoundexFirst instance.

Parameters
  • max_length (int) -- The length of the code returned (defaults to 4)

  • german (bool) -- Set to True if the name is German (different rules apply)

New in version 0.4.0.

encode(fname)[source]

Calculate the PSHP Soundex/Viewex Coding of a first name.

Parameters

fname (str) -- The first name to encode

Returns

The PSHP Soundex/Viewex Coding

Return type

str

Examples

>>> pe = PSHPSoundexFirst()
>>> pe.encode('Smith')
'S530'
>>> pe.encode('Waters')
'W352'
>>> pe.encode('James')
'J700'
>>> pe.encode('Schmidt')
'S500'
>>> pe.encode('Ashcroft')
'A220'
>>> pe.encode('John')
'J500'
>>> pe.encode('Colin')
'K400'
>>> pe.encode('Niall')
'N400'
>>> pe.encode('Sally')
'S400'
>>> pe.encode('Jane')
'J500'

New in version 0.3.0.

Changed in version 0.3.6: Encapsulated in class

encode_alpha(fname)[source]

Calculate the alphabetic PSHP Soundex/Viewex Coding of a first name.

Parameters

fname (str) -- The first name to encode

Returns

The alphabetic PSHP Soundex/Viewex Coding

Return type

str

Examples

>>> pe = PSHPSoundexFirst()
>>> pe.encode_alpha('Smith')
'SNT'
>>> pe.encode_alpha('Waters')
'WTNK'
>>> pe.encode_alpha('James')
'JN'
>>> pe.encode_alpha('Schmidt')
'SN'
>>> pe.encode_alpha('Ashcroft')
'AKK'
>>> pe.encode_alpha('John')
'JN'
>>> pe.encode_alpha('Colin')
'KL'
>>> pe.encode_alpha('Niall')
'NL'
>>> pe.encode_alpha('Sally')
'SL'
>>> pe.encode_alpha('Jane')
'JN'

New in version 0.4.0.

abydos.phonetic.pshp_soundex_first(fname, max_length=4, german=False)[source]

Calculate the PSHP Soundex/Viewex Coding of a first name.

This is a wrapper for PSHPSoundexFirst.encode().

Parameters
  • fname (str) -- The first name to encode

  • max_length (int) -- The length of the code returned (defaults to 4)

  • german (bool) -- Set to True if the name is German (different rules apply)

Returns

The PSHP Soundex/Viewex Coding

Return type

str

Examples

>>> pshp_soundex_first('Smith')
'S530'
>>> pshp_soundex_first('Waters')
'W352'
>>> pshp_soundex_first('James')
'J700'
>>> pshp_soundex_first('Schmidt')
'S500'
>>> pshp_soundex_first('Ashcroft')
'A220'
>>> pshp_soundex_first('John')
'J500'
>>> pshp_soundex_first('Colin')
'K400'
>>> pshp_soundex_first('Niall')
'N400'
>>> pshp_soundex_first('Sally')
'S400'
>>> pshp_soundex_first('Jane')
'J500'

New in version 0.3.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the PSHPSoundexFirst.encode method instead.

class abydos.phonetic.PSHPSoundexLast(max_length=4, german=False)[source]

Bases: abydos.phonetic._phonetic._Phonetic

PSHP Soundex/Viewex Coding of a last name.

This coding is based on [HBD76].

Reference was also made to the German version of the same: [HBD79].

A separate class, PSHPSoundexFirst, is used for first names.

New in version 0.3.6.

Initialize PSHPSoundexLast instance.

Parameters
  • max_length (int) -- The length of the code returned (defaults to 4)

  • german (bool) -- Set to True if the name is German (different rules apply)

New in version 0.4.0.

encode(lname)[source]

Calculate the PSHP Soundex/Viewex Coding of a last name.

Parameters

lname (str) -- The last name to encode

Returns

The PSHP Soundex/Viewex Coding

Return type

str

Examples

>>> pe = PSHPSoundexLast()
>>> pe.encode('Smith')
'S530'
>>> pe.encode('Waters')
'W350'
>>> pe.encode('James')
'J500'
>>> pe.encode('Schmidt')
'S530'
>>> pe.encode('Ashcroft')
'A225'

New in version 0.3.0.

Changed in version 0.3.6: Encapsulated in class

encode_alpha(lname)[source]

Calculate the alphabetic PSHP Soundex/Viewex Coding of a last name.

Parameters

lname (str) -- The last name to encode

Returns

The alphabetic PSHP Soundex/Viewex Coding

Return type

str

Examples

>>> pe = PSHPSoundexLast()
>>> pe.encode_alpha('Smith')
'SNT'
>>> pe.encode_alpha('Waters')
'WTN'
>>> pe.encode_alpha('James')
'JN'
>>> pe.encode_alpha('Schmidt')
'SNT'
>>> pe.encode_alpha('Ashcroft')
'AKKN'

New in version 0.4.0.

abydos.phonetic.pshp_soundex_last(lname, max_length=4, german=False)[source]

Calculate the PSHP Soundex/Viewex Coding of a last name.

This is a wrapper for PSHPSoundexLast.encode().

Parameters
  • lname (str) -- The last name to encode

  • max_length (int) -- The length of the code returned (defaults to 4)

  • german (bool) -- Set to True if the name is German (different rules apply)

Returns

The PSHP Soundex/Viewex Coding

Return type

str

Examples

>>> pshp_soundex_last('Smith')
'S530'
>>> pshp_soundex_last('Waters')
'W350'
>>> pshp_soundex_last('James')
'J500'
>>> pshp_soundex_last('Schmidt')
'S530'
>>> pshp_soundex_last('Ashcroft')
'A225'

New in version 0.3.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the PSHPSoundexLast.encode method instead.

class abydos.phonetic.NYSIIS(max_length=6, modified=False)[source]

Bases: abydos.phonetic._phonetic._Phonetic

NYSIIS Code.

The New York State Identification and Intelligence System algorithm is defined in [Taf70].

The modified version of this algorithm is described in Appendix B of [LA77].

New in version 0.3.6.

Initialize NYSIIS instance.

Parameters
  • max_length (int) -- The maximum length (default 6) of the code to return

  • modified (bool) -- Indicates whether to use USDA modified NYSIIS

New in version 0.4.0.

encode(word)[source]

Return the NYSIIS code for a word.

Parameters

word (str) -- The word to transform

Returns

The NYSIIS value

Return type

str

Examples

>>> pe = NYSIIS()
>>> pe.encode('Christopher')
'CRASTA'
>>> pe.encode('Niall')
'NAL'
>>> pe.encode('Smith')
'SNAT'
>>> pe.encode('Schmidt')
'SNAD'
>>> NYSIIS(max_length=-1).encode('Christopher')
'CRASTAFAR'
>>> pe_8m = NYSIIS(max_length=8, modified=True)
>>> pe_8m.encode('Christopher')
'CRASTAFA'
>>> pe_8m.encode('Niall')
'NAL'
>>> pe_8m.encode('Smith')
'SNAT'
>>> pe_8m.encode('Schmidt')
'SNAD'

New in version 0.1.0.

Changed in version 0.3.6: Encapsulated in class

abydos.phonetic.nysiis(word, max_length=6, modified=False)[source]

Return the NYSIIS code for a word.

This is a wrapper for NYSIIS.encode().

Parameters
  • word (str) -- The word to transform

  • max_length (int) -- The maximum length (default 6) of the code to return

  • modified (bool) -- Indicates whether to use USDA modified NYSIIS

Returns

The NYSIIS value

Return type

str

Examples

>>> nysiis('Christopher')
'CRASTA'
>>> nysiis('Niall')
'NAL'
>>> nysiis('Smith')
'SNAT'
>>> nysiis('Schmidt')
'SNAD'
>>> nysiis('Christopher', max_length=-1)
'CRASTAFAR'
>>> nysiis('Christopher', max_length=8, modified=True)
'CRASTAFA'
>>> nysiis('Niall', max_length=8, modified=True)
'NAL'
>>> nysiis('Smith', max_length=8, modified=True)
'SNAT'
>>> nysiis('Schmidt', max_length=8, modified=True)
'SNAD'

New in version 0.1.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the NYSIIS.encode method instead.

class abydos.phonetic.MRA[source]

Bases: abydos.phonetic._phonetic._Phonetic

Western Airlines Surname Match Rating Algorithm.

A description of the Western Airlines Surname Match Rating Algorithm can be found on page 18 of [MKTM77].

New in version 0.3.6.

encode(word)[source]

Return the MRA personal numeric identifier (PNI) for a word.

Parameters

word (str) -- The word to transform

Returns

The MRA PNI

Return type

str

Examples

>>> pe = MRA()
>>> pe.encode('Christopher')
'CHRPHR'
>>> pe.encode('Niall')
'NL'
>>> pe.encode('Smith')
'SMTH'
>>> pe.encode('Schmidt')
'SCHMDT'

New in version 0.1.0.

Changed in version 0.3.6: Encapsulated in class
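The encoding step of the Match Rating Algorithm is short enough to sketch in full. The version below follows the commonly described rules (drop non-initial vowels, collapse repeated letters, keep the first and last three letters of long codes) and reproduces the examples above. It is an illustration under those assumptions, and mra_sketch is a hypothetical name rather than part of abydos.

```python
from itertools import groupby

_VOWELS = set("AEIOU")


def mra_sketch(word):
    """Illustrative Match Rating Algorithm encoding of a name."""
    word = word.upper()
    # Keep the first letter as-is; drop vowels everywhere else.
    word = word[:1] + "".join(ch for ch in word[1:] if ch not in _VOWELS)
    # Collapse runs of the same letter into a single letter.
    word = "".join(ch for ch, _ in groupby(word))
    # Codes longer than six letters keep only the first and last three.
    if len(word) > 6:
        word = word[:3] + word[-3:]
    return word


print(mra_sketch("Christopher"))  # CHRPHR
print(mra_sketch("Niall"))        # NL
```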

abydos.phonetic.mra(word)[source]

Return the MRA personal numeric identifier (PNI) for a word.

This is a wrapper for MRA.encode().

Parameters

word (str) -- The word to transform

Returns

The MRA PNI

Return type

str

Examples

>>> mra('Christopher')
'CHRPHR'
>>> mra('Niall')
'NL'
>>> mra('Smith')
'SMTH'
>>> mra('Schmidt')
'SCHMDT'

New in version 0.1.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the MRA.encode method instead.

class abydos.phonetic.Caverphone(version=2)[source]

Bases: abydos.phonetic._phonetic._Phonetic

Caverphone.

A description of version 1 of the algorithm can be found in [Hoo02].

A description of version 2 of the algorithm can be found in [Hoo04].

New in version 0.3.6.

Initialize Caverphone instance.

Parameters

version (int) -- The version of Caverphone to employ for encoding (defaults to 2)

New in version 0.4.0.

encode(word)[source]

Return the Caverphone code for a word.

Parameters

word (str) -- The word to transform

Returns

The Caverphone value

Return type

str

Examples

>>> pe = Caverphone()
>>> pe.encode('Christopher')
'KRSTFA1111'
>>> pe.encode('Niall')
'NA11111111'
>>> pe.encode('Smith')
'SMT1111111'
>>> pe.encode('Schmidt')
'SKMT111111'
>>> pe_1 = Caverphone(version=1)
>>> pe_1.encode('Christopher')
'KRSTF1'
>>> pe_1.encode('Niall')
'N11111'
>>> pe_1.encode('Smith')
'SMT111'
>>> pe_1.encode('Schmidt')
'SKMT11'

New in version 0.1.0.

Changed in version 0.3.6: Encapsulated in class

encode_alpha(word)[source]

Return the alphabetic Caverphone code for a word.

Parameters

word (str) -- The word to transform

Returns

The alphabetic Caverphone value

Return type

str

Examples

>>> pe = Caverphone()
>>> pe.encode_alpha('Christopher')
'KRSTFA'
>>> pe.encode_alpha('Niall')
'NA'
>>> pe.encode_alpha('Smith')
'SMT'
>>> pe.encode_alpha('Schmidt')
'SKMT'
>>> pe_1 = Caverphone(version=1)
>>> pe_1.encode_alpha('Christopher')
'KRSTF'
>>> pe_1.encode_alpha('Niall')
'N'
>>> pe_1.encode_alpha('Smith')
'SMT'
>>> pe_1.encode_alpha('Schmidt')
'SKMT'

New in version 0.4.0.
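
Judging from the examples above, encode returns a fixed-width code padded with the character 1 (10 characters for version 2, 6 for version 1), while encode_alpha returns the same code without the padding. The sketch below illustrates only that padding relationship, under the assumption that the core code has already been computed; pad_caverphone and strip_caverphone are hypothetical helper names, not part of the library.

```python
def pad_caverphone(core, version=2):
    """Pad (or truncate) a core code with '1' to the fixed width."""
    width = 10 if version == 2 else 6  # widths inferred from the examples
    return (core + '1' * width)[:width]


def strip_caverphone(code):
    """Drop the trailing '1' padding to recover the alphabetic part."""
    return code.rstrip('1')
```

For instance, padding the alphabetic value documented for 'Schmidt' and stripping it again round-trips to the same string.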

abydos.phonetic.caverphone(word, version=2)[source]

Return the Caverphone code for a word.

This is a wrapper for Caverphone.encode().

Parameters
  • word (str) -- The word to transform

  • version (int) -- The version of Caverphone to employ for encoding (defaults to 2)

Returns

The Caverphone value

Return type

str

Examples

>>> caverphone('Christopher')
'KRSTFA1111'
>>> caverphone('Niall')
'NA11111111'
>>> caverphone('Smith')
'SMT1111111'
>>> caverphone('Schmidt')
'SKMT111111'
>>> caverphone('Christopher', 1)
'KRSTF1'
>>> caverphone('Niall', 1)
'N11111'
>>> caverphone('Smith', 1)
'SMT111'
>>> caverphone('Schmidt', 1)
'SKMT11'

New in version 0.1.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the Caverphone.encode method instead.

class abydos.phonetic.AlphaSIS(max_length=14)[source]

Bases: abydos.phonetic._phonetic._Phonetic

Alpha-SIS.

The Alpha Search Inquiry System code is defined in [Cor73]. This implementation is based on the description in [MKTM77].

New in version 0.3.6.

Initialize AlphaSIS instance.

Parameters

max_length (int) -- The length of the code returned (defaults to 14)

New in version 0.4.0.

encode(word)[source]

Return the IBM Alpha Search Inquiry System code for a word.

The return type is an ordered collection (a tuple) because a single word can have multiple codes. The order matters: the first value is the primary coding.

Parameters

word (str) -- The word to transform

Returns

The Alpha-SIS value

Return type

tuple

Examples

>>> pe = AlphaSIS()
>>> pe.encode('Christopher')
('06401840000000', '07040184000000', '04018400000000')
>>> pe.encode('Niall')
('02500000000000',)
>>> pe.encode('Smith')
('03100000000000',)
>>> pe.encode('Schmidt')
('06310000000000',)

New in version 0.1.0.

Changed in version 0.3.6: Encapsulated in class
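
Because the returned tuple is ordered with the primary coding first, a caller may want to distinguish a match on the primary codes from a match on any alternate. The following is one possible sketch of such a comparison; match_level is a hypothetical helper, and the tuples are copied from the documented examples rather than computed.

```python
def match_level(codes_a, codes_b):
    """Classify how two ordered code tuples relate.

    Returns 'primary' if the first (primary) codes agree, 'alternate' if
    any other pair of codes agrees, and None otherwise.
    """
    if codes_a and codes_b and codes_a[0] == codes_b[0]:
        return 'primary'
    if set(codes_a) & set(codes_b):
        return 'alternate'
    return None


# Values copied from the documented examples.
christopher = ('06401840000000', '07040184000000', '04018400000000')
niall = ('02500000000000',)
```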

encode_alpha(word)[source]

Return the alphabetic Alpha-SIS code for a word.

Parameters

word (str) -- The word to transform

Returns

The alphabetic Alpha-SIS value

Return type

tuple

Examples

>>> pe = AlphaSIS()
>>> pe.encode_alpha('Christopher')
('JRSTFR', 'KSRSTFR', 'RSTFR')
>>> pe.encode_alpha('Niall')
('NL',)
>>> pe.encode_alpha('Smith')
('MT',)
>>> pe.encode_alpha('Schmidt')
('JMT',)

New in version 0.4.0.

abydos.phonetic.alpha_sis(word, max_length=14)[source]

Return the IBM Alpha Search Inquiry System code for a word.

This is a wrapper for AlphaSIS.encode().

Parameters
  • word (str) -- The word to transform

  • max_length (int) -- The length of the code returned (defaults to 14)

Returns

The Alpha-SIS value

Return type

tuple

Examples

>>> alpha_sis('Christopher')
('06401840000000', '07040184000000', '04018400000000')
>>> alpha_sis('Niall')
('02500000000000',)
>>> alpha_sis('Smith')
('03100000000000',)
>>> alpha_sis('Schmidt')
('06310000000000',)

New in version 0.1.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the AlphaSIS.encode method instead.

class abydos.phonetic.Davidson(omit_fname=False)[source]

Bases: abydos.phonetic._phonetic._Phonetic

Davidson Consonant Code.

This is based on the name compression system described in [Dav62].

[Dol70] identifies this as having been the name compression algorithm used by SABRE.

New in version 0.3.6.

Initialize Davidson instance.

Parameters

omit_fname (bool) -- Set to True to completely omit the first character of the first name

New in version 0.4.0.

encode(lname, fname='.')[source]

Return Davidson's Consonant Code.

Parameters
  • lname (str) -- Last name (or word) to be encoded

  • fname (str) -- First name (optional), of which the first character is included in the code.

Returns

Davidson's Consonant Code

Return type

str

Examples

>>> pe = Davidson()
>>> pe.encode('Gough')
'G   .'
>>> pe.encode('pneuma')
'PNM .'
>>> pe.encode('knight')
'KNGT.'
>>> pe.encode('trice')
'TRC .'
>>> pe.encode('judge')
'JDG .'
>>> pe.encode('Smith', 'James')
'SMT J'
>>> pe.encode('Wasserman', 'Tabitha')
'WSRMT'

New in version 0.3.0.

Changed in version 0.3.6: Encapsulated in class
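
The examples above suggest a fixed layout: a consonant field padded with spaces to four characters, followed by the first character of the first name (or '.' when none is given). The sketch below illustrates only that layout, assuming the consonant code has already been derived; assemble_davidson is a hypothetical helper, and its handling of omit_fname is an assumption rather than the library's confirmed behavior.

```python
def assemble_davidson(consonants, fname='.', omit_fname=False):
    """Lay out a 4-character consonant field plus an optional initial.

    `consonants` is assumed to be an already-computed consonant code.
    """
    field = consonants[:4].ljust(4)  # pad with spaces to 4 characters
    if omit_fname:
        return field  # assumption: only the 4-character field remains
    return field + fname[:1].upper()
```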

abydos.phonetic.davidson(lname, fname='.', omit_fname=False)[source]

Return Davidson's Consonant Code.

This is a wrapper for Davidson.encode().

Parameters
  • lname (str) -- Last name (or word) to be encoded

  • fname (str) -- First name (optional), of which the first character is included in the code.

  • omit_fname (bool) -- Set to True to completely omit the first character of the first name

Returns

Davidson's Consonant Code

Return type

str

Examples

>>> davidson('Gough')
'G   .'
>>> davidson('pneuma')
'PNM .'
>>> davidson('knight')
'KNGT.'
>>> davidson('trice')
'TRC .'
>>> davidson('judge')
'JDG .'
>>> davidson('Smith', 'James')
'SMT J'
>>> davidson('Wasserman', 'Tabitha')
'WSRMT'

New in version 0.3.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the Davidson.encode method instead.

class abydos.phonetic.Dolby(max_length=-1, keep_vowels=False, vowel_char='*')[source]

Bases: abydos.phonetic._phonetic._Phonetic

Dolby Code.

This follows "A Spelling Equivalent Abbreviation Algorithm For Personal Names" from [Dol70] and [C+69].

New in version 0.3.6.

Initialize Dolby instance.

Parameters
  • max_length (int) -- Maximum length of the returned Dolby code -- this also activates the fixed-length code mode if it is greater than 0

  • keep_vowels (bool) -- If True, retains all vowel markers

  • vowel_char (str) -- The vowel marker character (defaults to *)

New in version 0.4.0.

encode(word)[source]

Return the Dolby Code of a name.

Parameters

word (str) -- The word to transform

Returns

The Dolby Code

Return type

str

Examples

>>> pe = Dolby()
>>> pe.encode('Hansen')
'H*NSN'
>>> pe.encode('Larsen')
'L*RSN'
>>> pe.encode('Aagaard')
'*GR'
>>> pe.encode('Braaten')
'BR*DN'
>>> pe.encode('Sandvik')
'S*NVK'
>>> pe_6 = Dolby(max_length=6)
>>> pe_6.encode('Hansen')
'H*NS*N'
>>> pe_6.encode('Larsen')
'L*RS*N'
>>> pe_6.encode('Aagaard')
'*G*R  '
>>> pe_6.encode('Braaten')
'BR*D*N'
>>> pe_6.encode('Sandvik')
'S*NF*K'
>>> pe.encode('Smith')
'SM*D'
>>> pe.encode('Waters')
'W*DRS'
>>> pe.encode('James')
'J*MS'
>>> pe.encode('Schmidt')
'SM*D'
>>> pe.encode('Ashcroft')
'*SKRFD'
>>> pe_6.encode('Smith')
'SM*D  '
>>> pe_6.encode('Waters')
'W*D*RS'
>>> pe_6.encode('James')
'J*M*S '
>>> pe_6.encode('Schmidt')
'SM*D  '
>>> pe_6.encode('Ashcroft')
'*SKRFD'

New in version 0.3.0.

Changed in version 0.3.6: Encapsulated in class

encode_alpha(word)[source]

Return the alphabetic Dolby Code of a name.

Parameters

word (str) -- The word to transform

Returns

The alphabetic Dolby Code

Return type

str

Examples

>>> pe = Dolby()
>>> pe.encode_alpha('Hansen')
'HANSN'
>>> pe.encode_alpha('Larsen')
'LARSN'
>>> pe.encode_alpha('Aagaard')
'AGR'
>>> pe.encode_alpha('Braaten')
'BRADN'
>>> pe.encode_alpha('Sandvik')
'SANVK'

New in version 0.4.0.

abydos.phonetic.dolby(word, max_length=-1, keep_vowels=False, vowel_char='*')[source]

Return the Dolby Code of a name.

This is a wrapper for Dolby.encode().

Parameters
  • word (str) -- The word to transform

  • max_length (int) -- Maximum length of the returned Dolby code -- this also activates the fixed-length code mode if it is greater than 0

  • keep_vowels (bool) -- If True, retains all vowel markers

  • vowel_char (str) -- The vowel marker character (defaults to *)

Returns

The Dolby Code

Return type

str

Examples

>>> dolby('Hansen')
'H*NSN'
>>> dolby('Larsen')
'L*RSN'
>>> dolby('Aagaard')
'*GR'
>>> dolby('Braaten')
'BR*DN'
>>> dolby('Sandvik')
'S*NVK'
>>> dolby('Hansen', max_length=6)
'H*NS*N'
>>> dolby('Larsen', max_length=6)
'L*RS*N'
>>> dolby('Aagaard', max_length=6)
'*G*R  '
>>> dolby('Braaten', max_length=6)
'BR*D*N'
>>> dolby('Sandvik', max_length=6)
'S*NF*K'
>>> dolby('Smith')
'SM*D'
>>> dolby('Waters')
'W*DRS'
>>> dolby('James')
'J*MS'
>>> dolby('Schmidt')
'SM*D'
>>> dolby('Ashcroft')
'*SKRFD'
>>> dolby('Smith', max_length=6)
'SM*D  '
>>> dolby('Waters', max_length=6)
'W*D*RS'
>>> dolby('James', max_length=6)
'J*M*S '
>>> dolby('Schmidt', max_length=6)
'SM*D  '
>>> dolby('Ashcroft', max_length=6)
'*SKRFD'

New in version 0.3.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the Dolby.encode method instead.

class abydos.phonetic.SPFC[source]

Bases: abydos.phonetic._phonetic._Phonetic

Standardized Phonetic Frequency Code (SPFC).

Standardized Phonetic Frequency Code is roughly Soundex-like. This implementation is based on pages 19-21 of [MKTM77].

New in version 0.3.6.

encode(word)[source]

Return the Standardized Phonetic Frequency Code (SPFC) of a word.

Parameters

word (str) -- The word to transform

Returns

The SPFC value

Return type

str

Raises

AttributeError -- Word attribute must be a string with a space or period dividing the first and last names or a tuple/list consisting of the first and last names

Examples

>>> pe = SPFC()
>>> pe.encode('Christopher Smith')
'01160'
>>> pe.encode('Christopher Schmidt')
'01160'
>>> pe.encode('Niall Smith')
'01660'
>>> pe.encode('Niall Schmidt')
'01660'
>>> pe.encode('L.Smith')
'01960'
>>> pe.encode('R.Miller')
'65490'
>>> pe.encode(('L', 'Smith'))
'01960'
>>> pe.encode(('R', 'Miller'))
'65490'

New in version 0.1.0.

Changed in version 0.3.6: Encapsulated in class
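
The accepted input forms (a string with a space or period between the names, or a tuple/list of first and last name) can be illustrated with a small normalization step. This is a sketch under stated assumptions: split_name is a hypothetical helper, and the library's own parsing may differ in details such as how it treats strings containing both separators.

```python
def split_name(word):
    """Return (first, last) from 'First Last', 'F.Last', or a 2-sequence."""
    if isinstance(word, str):
        # Try a period first, then fall back to a space.
        first, sep, last = word.partition('.')
        if not sep:
            first, sep, last = word.partition(' ')
        if not sep:
            raise AttributeError('expected a space or period between names')
        return first, last
    if isinstance(word, (tuple, list)) and len(word) == 2:
        return word[0], word[1]
    raise AttributeError('expected a string or a (first, last) pair')
```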

encode_alpha(word)[source]

Return the alphabetic SPFC of a word.

Parameters

word (str) -- The word to transform

Returns

The alphabetic SPFC value

Return type

str

Examples

>>> pe = SPFC()
>>> pe.encode_alpha('Christopher Smith')
'SDCMS'
>>> pe.encode_alpha('Christopher Schmidt')
'SDCMS'
>>> pe.encode_alpha('Niall Smith')
'SDMMS'
>>> pe.encode_alpha('Niall Schmidt')
'SDMMS'
>>> pe.encode_alpha('L.Smith')
'SDEMS'
>>> pe.encode_alpha('R.Miller')
'EROES'
>>> pe.encode_alpha(('L', 'Smith'))
'SDEMS'
>>> pe.encode_alpha(('R', 'Miller'))
'EROES'

New in version 0.4.0.

abydos.phonetic.spfc(word)[source]

Return the Standardized Phonetic Frequency Code (SPFC) of a word.

This is a wrapper for SPFC.encode().

Parameters

word (str) -- The word to transform

Returns

The SPFC value

Return type

str

Examples

>>> spfc('Christopher Smith')
'01160'
>>> spfc('Christopher Schmidt')
'01160'
>>> spfc('Niall Smith')
'01660'
>>> spfc('Niall Schmidt')
'01660'
>>> spfc('L.Smith')
'01960'
>>> spfc('R.Miller')
'65490'
>>> spfc(('L', 'Smith'))
'01960'
>>> spfc(('R', 'Miller'))
'65490'

New in version 0.1.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the SPFC.encode method instead.

class abydos.phonetic.RogerRoot(max_length=5, zero_pad=True)[source]

Bases: abydos.phonetic._phonetic._Phonetic

Roger Root code.

This is Roger Root name coding, described in [MKTM77].

New in version 0.3.6.

Initialize RogerRoot instance.

Parameters
  • max_length (int) -- The maximum length (default 5) of the code to return

  • zero_pad (bool) -- Pad the end of the return value with 0s to achieve a max_length string

New in version 0.4.0.

encode(word)[source]

Return the Roger Root code for a word.

Parameters

word (str) -- The word to transform

Returns

The Roger Root code

Return type

str

Examples

>>> pe = RogerRoot()
>>> pe.encode('Christopher')
'06401'
>>> pe.encode('Niall')
'02500'
>>> pe.encode('Smith')
'00310'
>>> pe.encode('Schmidt')
'06310'

New in version 0.3.0.

Changed in version 0.3.6: Encapsulated in class
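
The max_length and zero_pad parameters describe a final truncate-and-pad step. A minimal sketch of that step, assuming a digit string has already been produced, might look like the following; finalize_code is a hypothetical helper, and the unpadded input '025' is an illustrative value, not one taken from the library.

```python
def finalize_code(digits, max_length=5, zero_pad=True):
    """Truncate to max_length and optionally right-pad with zeros."""
    code = digits[:max_length]
    if zero_pad:
        code = code.ljust(max_length, '0')
    return code
```

With zero_pad disabled, short codes are returned at their natural length instead of being extended to max_length.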

encode_alpha(word)[source]

Return the alphabetic Roger Root code for a word.

Parameters

word (str) -- The word to transform

Returns

The alphabetic Roger Root code

Return type

str

Examples

>>> pe = RogerRoot()
>>> pe.encode_alpha('Christopher')
'JRST'
>>> pe.encode_alpha('Niall')
'NL'
>>> pe.encode_alpha('Smith')
'SMT'
>>> pe.encode_alpha('Schmidt')
'JMT'

New in version 0.4.0.

abydos.phonetic.roger_root(word, max_length=5, zero_pad=True)[source]

Return the Roger Root code for a word.

This is a wrapper for RogerRoot.encode().

Parameters
  • word (str) -- The word to transform

  • max_length (int) -- The maximum length (default 5) of the code to return

  • zero_pad (bool) -- Pad the end of the return value with 0s to achieve a max_length string

Returns

The Roger Root code

Return type

str

Examples

>>> roger_root('Christopher')
'06401'
>>> roger_root('Niall')
'02500'
>>> roger_root('Smith')
'00310'
>>> roger_root('Schmidt')
'06310'

New in version 0.3.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the RogerRoot.encode method instead.

class abydos.phonetic.StatisticsCanada(max_length=4)[source]

Bases: abydos.phonetic._phonetic._Phonetic

Statistics Canada code.

The original description of this algorithm could not be located, and may only have been specified in an unpublished technical report. The coding does not appear to be in use by Statistics Canada any longer. In its place, this is an implementation of the "Census modified Statistics Canada name coding procedure".

The modified version of this algorithm is described in Appendix B of [MKTM77].

New in version 0.3.6.

Initialize StatisticsCanada instance.

Parameters

max_length (int) -- The maximum length (default 4) of the code to return

New in version 0.4.0.

encode(word)[source]

Return the Statistics Canada code for a word.

Parameters

word (str) -- The word to transform

Returns

The Statistics Canada name code value

Return type

str

Examples

>>> pe = StatisticsCanada()
>>> pe.encode('Christopher')
'CHRS'
>>> pe.encode('Niall')
'NL'
>>> pe.encode('Smith')
'SMTH'
>>> pe.encode('Schmidt')
'SCHM'

New in version 0.3.0.

Changed in version 0.3.6: Encapsulated in class

abydos.phonetic.statistics_canada(word, max_length=4)[source]

Return the Statistics Canada code for a word.

This is a wrapper for StatisticsCanada.encode().

Parameters
  • word (str) -- The word to transform

  • max_length (int) -- The maximum length (default 4) of the code to return

Returns

The Statistics Canada name code value

Return type

str

Examples

>>> statistics_canada('Christopher')
'CHRS'
>>> statistics_canada('Niall')
'NL'
>>> statistics_canada('Smith')
'SMTH'
>>> statistics_canada('Schmidt')
'SCHM'

New in version 0.3.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the StatisticsCanada.encode method instead.

class abydos.phonetic.SoundD(max_length=4)[source]

Bases: abydos.phonetic._phonetic._Phonetic

SoundD code.

SoundD is defined in [VB12].

New in version 0.3.6.

Initialize SoundD instance.

Parameters

max_length (int) -- The length of the code returned (defaults to 4)

New in version 0.4.0.

encode(word)[source]

Return the SoundD code.

Parameters

word (str) -- The word to transform

Returns

The SoundD code

Return type

str

Examples

>>> pe = SoundD()
>>> pe.encode('Gough')
'2000'
>>> pe.encode('pneuma')
'5500'
>>> pe.encode('knight')
'5300'
>>> pe.encode('trice')
'3620'
>>> pe.encode('judge')
'2200'

New in version 0.3.0.

Changed in version 0.3.6: Encapsulated in class

encode_alpha(word)[source]

Return the alphabetic SoundD code.

Parameters

word (str) -- The word to transform

Returns

The alphabetic SoundD code

Return type

str

Examples

>>> pe = SoundD()
>>> pe.encode_alpha('Gough')
'K'
>>> pe.encode_alpha('pneuma')
'NN'
>>> pe.encode_alpha('knight')
'NT'
>>> pe.encode_alpha('trice')
'TRK'
>>> pe.encode_alpha('judge')
'KK'

New in version 0.4.0.

abydos.phonetic.sound_d(word, max_length=4)[source]

Return the SoundD code.

This is a wrapper for SoundD.encode().

Parameters
  • word (str) -- The word to transform

  • max_length (int) -- The length of the code returned (defaults to 4)

Returns

The SoundD code

Return type

str

Examples

>>> sound_d('Gough')
'2000'
>>> sound_d('pneuma')
'5500'
>>> sound_d('knight')
'5300'
>>> sound_d('trice')
'3620'
>>> sound_d('judge')
'2200'

New in version 0.3.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the SoundD.encode method instead.

class abydos.phonetic.ParmarKumbharana[source]

Bases: abydos.phonetic._phonetic._Phonetic

Parmar-Kumbharana code.

This is based on the phonetic algorithm proposed in [PK14].

New in version 0.3.6.

encode(word)[source]

Return the Parmar-Kumbharana encoding of a word.

Parameters

word (str) -- The word to transform

Returns

The Parmar-Kumbharana encoding

Return type

str

Examples

>>> pe = ParmarKumbharana()
>>> pe.encode('Gough')
'GF'
>>> pe.encode('pneuma')
'NM'
>>> pe.encode('knight')
'NT'
>>> pe.encode('trice')
'TRS'
>>> pe.encode('judge')
'JJ'

New in version 0.3.0.

Changed in version 0.3.6: Encapsulated in class

abydos.phonetic.parmar_kumbharana(word)[source]

Return the Parmar-Kumbharana encoding of a word.

This is a wrapper for ParmarKumbharana.encode().

Parameters

word (str) -- The word to transform

Returns

The Parmar-Kumbharana encoding

Return type

str

Examples

>>> parmar_kumbharana('Gough')
'GF'
>>> parmar_kumbharana('pneuma')
'NM'
>>> parmar_kumbharana('knight')
'NT'
>>> parmar_kumbharana('trice')
'TRS'
>>> parmar_kumbharana('judge')
'JJ'

New in version 0.3.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the ParmarKumbharana.encode method instead.

class abydos.phonetic.Metaphone(max_length=-1)[source]

Bases: abydos.phonetic._phonetic._Phonetic

Metaphone.

Based on Lawrence Philips' Pick BASIC code from 1990 [Phi90b], as described in [Phi90a]. This incorporates some corrections to the above code, particularly some of those suggested by Michael Kuhn in [Kuh95].

New in version 0.3.6.

Initialize Metaphone instance.

Parameters

max_length (int) -- The maximum length of the returned Metaphone code (defaults to 64, but in Philips' original implementation this was 4)

New in version 0.4.0.

encode(word)[source]

Return the Metaphone code for a word.

Based on Lawrence Philips' Pick BASIC code from 1990 [Phi90b], as described in [Phi90a]. This incorporates some corrections to the above code, particularly some of those suggested by Michael Kuhn in [Kuh95].

Parameters

word (str) -- The word to transform

Returns

The Metaphone value

Return type

str

Examples

>>> pe = Metaphone()
>>> pe.encode('Christopher')
'KRSTFR'
>>> pe.encode('Niall')
'NL'
>>> pe.encode('Smith')
'SM0'
>>> pe.encode('Schmidt')
'SKMTT'

New in version 0.1.0.

Changed in version 0.3.6: Encapsulated in class

abydos.phonetic.metaphone(word, max_length=-1)[source]

Return the Metaphone code for a word.

This is a wrapper for Metaphone.encode().

Parameters
  • word (str) -- The word to transform

  • max_length (int) -- The maximum length of the returned Metaphone code (defaults to 64, but in Philips' original implementation this was 4)

Returns

The Metaphone value

Return type

str

Examples

>>> metaphone('Christopher')
'KRSTFR'
>>> metaphone('Niall')
'NL'
>>> metaphone('Smith')
'SM0'
>>> metaphone('Schmidt')
'SKMTT'

New in version 0.1.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the Metaphone.encode method instead.

class abydos.phonetic.DoubleMetaphone(max_length=-1)[source]

Bases: abydos.phonetic._phonetic._Phonetic

Double Metaphone.

Based on Lawrence Philips' (Visual) C++ code from 1999 [Phi00].

New in version 0.3.6.

Initialize DoubleMetaphone instance.

Parameters

max_length (int) -- The maximum length of the returned Double Metaphone codes (defaults to unlimited, but in Philips' original implementation this was 4)

New in version 0.4.0.

encode(word)[source]

Return the Double Metaphone code for a word.

Parameters

word (str) -- The word to transform

Returns

The Double Metaphone value(s)

Return type

tuple

Examples

>>> pe = DoubleMetaphone()
>>> pe.encode('Christopher')
('KRSTFR', '')
>>> pe.encode('Niall')
('NL', '')
>>> pe.encode('Smith')
('SM0', 'XMT')
>>> pe.encode('Schmidt')
('XMT', 'SMT')

New in version 0.1.0.

Changed in version 0.3.6: Encapsulated in class

encode_alpha(word)[source]

Return the alphabetic Double Metaphone code for a word.

Parameters

word (str) -- The word to transform

Returns

The alphabetic Double Metaphone value(s)

Return type

tuple

Examples

>>> pe = DoubleMetaphone()
>>> pe.encode_alpha('Christopher')
('KRSTFR', '')
>>> pe.encode_alpha('Niall')
('NL', '')
>>> pe.encode_alpha('Smith')
('SMÞ', 'XMT')
>>> pe.encode_alpha('Schmidt')
('XMT', 'SMT')

New in version 0.4.0.
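
Comparing the encode and encode_alpha examples, the only visible difference is that the digit 0 (the Metaphone placeholder for the "th" sound) appears as the letter Þ in the alphabetic output. The sketch below reproduces just that substitution; to_alpha is a hypothetical helper, and the library may apply further rules not visible in these examples.

```python
def to_alpha(code):
    """Replace the '0' placeholder for 'th' with the letter thorn."""
    return code.replace('0', 'Þ')


# Tuple copied from the documented encode('Smith') example.
smith = ('SM0', 'XMT')
smith_alpha = tuple(to_alpha(code) for code in smith)
```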

abydos.phonetic.double_metaphone(word, max_length=-1)[source]

Return the Double Metaphone code for a word.

This is a wrapper for DoubleMetaphone.encode().

Parameters
  • word (str) -- The word to transform

  • max_length (int) -- The maximum length of the returned Double Metaphone codes (defaults to unlimited, but in Philips' original implementation this was 4)

Returns

The Double Metaphone value(s)

Return type

tuple

Examples

>>> double_metaphone('Christopher')
('KRSTFR', '')
>>> double_metaphone('Niall')
('NL', '')
>>> double_metaphone('Smith')
('SM0', 'XMT')
>>> double_metaphone('Schmidt')
('XMT', 'SMT')

New in version 0.1.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the DoubleMetaphone.encode method instead.

class abydos.phonetic.Eudex(max_length=8)[source]

Bases: abydos.phonetic._phonetic._Phonetic

Eudex hash.

This implementation of eudex phonetic hashing is based on the specification (not the reference implementation) at [Tic].

Further details can be found at [Tic16].

New in version 0.3.6.

Initialize Eudex instance.

Parameters

max_length (int) -- The length in bytes of the code returned (default 8, i.e. a 64-bit hash)

New in version 0.4.0.

encode(word)[source]

Return the eudex phonetic hash of a word.

Parameters

word (str) -- The word to transform

Returns

The eudex hash

Return type

int

Examples

>>> pe = Eudex()
>>> pe.encode('Colin')
432345564238053650
>>> pe.encode('Christopher')
433648490138894409
>>> pe.encode('Niall')
648518346341351840
>>> pe.encode('Smith')
720575940412906756
>>> pe.encode('Schmidt')
720589151732307997

New in version 0.3.0.

Changed in version 0.3.6: Encapsulated in class
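
Eudex hashes are integers intended to be compared bitwise, so that similar-sounding words differ in fewer bits. The following sketch uses a plain, unweighted Hamming distance on the hash values documented above; hamming is a hypothetical helper, and an unweighted bit count is a simplification rather than the distance defined by the eudex specification.

```python
def hamming(hash_a, hash_b):
    """Count the bit positions at which two integer hashes differ."""
    return bin(hash_a ^ hash_b).count('1')


# Hash values copied from the documented examples.
SMITH = 720575940412906756
SCHMIDT = 720589151732307997
NIALL = 648518346341351840

# True when 'Smith' is closer to 'Schmidt' than to 'Niall'.
closer = hamming(SMITH, SCHMIDT) < hamming(SMITH, NIALL)
```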

abydos.phonetic.eudex(word, max_length=8)[source]

Return the eudex phonetic hash of a word.

This is a wrapper for Eudex.encode().

Parameters
  • word (str) -- The word to transform

  • max_length (int) -- The length in bytes of the code returned (default 8, i.e. a 64-bit hash)

Returns

The eudex hash

Return type

int

Examples

>>> eudex('Colin')
432345564238053650
>>> eudex('Christopher')
433648490138894409
>>> eudex('Niall')
648518346341351840
>>> eudex('Smith')
720575940412906756
>>> eudex('Schmidt')
720589151732307997

New in version 0.3.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the Eudex.encode method instead.

class abydos.phonetic.BeiderMorse(language_arg=0, name_mode='gen', match_mode='approx', concat=False, filter_langs=False)[source]

Bases: abydos.phonetic._phonetic._Phonetic

Beider-Morse Phonetic Matching.

The Beider-Morse Phonetic Matching algorithm is described in [BM08]. The reference implementation is licensed under GPLv3.

New in version 0.3.6.

Initialize BeiderMorse instance.

Parameters
  • language_arg (str or int) --

    The language of the term; supported values include:

    • any

    • arabic

    • cyrillic

    • czech

    • dutch

    • english

    • french

    • german

    • greek

    • greeklatin

    • hebrew

    • hungarian

    • italian

    • latvian

    • polish

    • portuguese

    • romanian

    • russian

    • spanish

    • turkish

  • name_mode (str) --

    The name mode of the algorithm:

    • gen -- general (default)

    • ash -- Ashkenazi

    • sep -- Sephardic

  • match_mode (str) -- Matching mode: approx or exact

  • concat (bool) -- Concatenation mode

  • filter_langs (bool) -- Filter out incompatible languages

New in version 0.4.0.

encode(word)[source]

Return the Beider-Morse Phonetic Matching encoding(s) of a term.

Parameters

word (str) -- The word to transform

Returns

The Beider-Morse phonetic value(s)

Return type

str

Raises

ValueError -- Unknown language

Examples

>>> pe = BeiderMorse()
>>> pe.encode('Christopher')
'xrQstopir xrQstYpir xristopir xristYpir xrQstofir xrQstYfir
xristofir xristYfir xristopi xritopir xritopi xristofi xritofir
xritofi tzristopir tzristofir zristopir zristopi zritopir zritopi
zristofir zristofi zritofir zritofi'
>>> pe.encode('Niall')
'nial niol'
>>> pe.encode('Smith')
'zmit'
>>> pe.encode('Schmidt')
'zmit stzmit'
>>> BeiderMorse(language_arg='German').encode('Christopher')
'xrQstopir xrQstYpir xristopir xristYpir xrQstofir xrQstYfir
xristofir xristYfir'
>>> BeiderMorse(language_arg='English').encode('Christopher')
'tzristofir tzrQstofir tzristafir tzrQstafir xristofir xrQstofir
xristafir xrQstafir'
>>> BeiderMorse(language_arg='German',
... name_mode='ash').encode('Christopher')
'xrQstopir xrQstYpir xristopir xristYpir xrQstofir xrQstYfir
xristofir xristYfir'
>>> BeiderMorse(language_arg='German',
... match_mode='exact').encode('Christopher')
'xriStopher xriStofer xristopher xristofer'

New in version 0.1.0.

Changed in version 0.3.6: Encapsulated in class
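
In the examples above, the result is a single string of space-separated alternative encodings. One way to compare two names is therefore to split both results and look for shared alternatives. The sketch below assumes that string format; shared_encodings is a hypothetical helper, and the inputs are copied from the documented outputs for 'Smith' and 'Schmidt'.

```python
def shared_encodings(result_a, result_b):
    """Return the encodings two space-separated results have in common."""
    return set(result_a.split()) & set(result_b.split())


# Results copied from the documented examples.
smith = 'zmit'
schmidt = 'zmit stzmit'
common = shared_encodings(smith, schmidt)
```

A non-empty intersection indicates that at least one candidate pronunciation is shared between the two names.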

abydos.phonetic.bmpm(word, language_arg=0, name_mode='gen', match_mode='approx', concat=False, filter_langs=False)[source]

Return the Beider-Morse Phonetic Matching encoding(s) of a term.

This is a wrapper for BeiderMorse.encode().

Parameters
  • word (str) -- The word to transform

  • language_arg (str or int) --

    The language of the term; supported values include:

    • any

    • arabic

    • cyrillic

    • czech

    • dutch

    • english

    • french

    • german

    • greek

    • greeklatin

    • hebrew

    • hungarian

    • italian

    • latvian

    • polish

    • portuguese

    • romanian

    • russian

    • spanish

    • turkish

  • name_mode (str) --

    The name mode of the algorithm:

    • gen -- general (default)

    • ash -- Ashkenazi

    • sep -- Sephardic

  • match_mode (str) -- Matching mode: approx or exact

  • concat (bool) -- Concatenation mode

  • filter_langs (bool) -- Filter out incompatible languages

Returns

The Beider-Morse phonetic value(s)

Return type

str

Examples

>>> bmpm('Christopher')
'xrQstopir xrQstYpir xristopir xristYpir xrQstofir xrQstYfir xristofir
xristYfir xristopi xritopir xritopi xristofi xritofir xritofi
tzristopir tzristofir zristopir zristopi zritopir zritopi zristofir
zristofi zritofir zritofi'
>>> bmpm('Niall')
'nial niol'
>>> bmpm('Smith')
'zmit'
>>> bmpm('Schmidt')
'zmit stzmit'
>>> bmpm('Christopher', language_arg='German')
'xrQstopir xrQstYpir xristopir xristYpir xrQstofir xrQstYfir xristofir
xristYfir'
>>> bmpm('Christopher', language_arg='English')
'tzristofir tzrQstofir tzristafir tzrQstafir xristofir xrQstofir
xristafir xrQstafir'
>>> bmpm('Christopher', language_arg='German', name_mode='ash')
'xrQstopir xrQstYpir xristopir xristYpir xrQstofir xrQstYfir xristofir
xristYfir'
>>> bmpm('Christopher', language_arg='German', match_mode='exact')
'xriStopher xriStofer xristopher xristofer'

New in version 0.1.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the BeiderMorse.encode method instead.

class abydos.phonetic.NRL[source]

Bases: abydos.phonetic._phonetic._Phonetic

Naval Research Laboratory English-to-phoneme encoder.

This is defined by [EJMS76].

New in version 0.3.6.

encode(word)[source]

Return the Naval Research Laboratory phonetic encoding of a word.

Parameters

word (str) -- The word to transform

Returns

The NRL phonetic encoding

Return type

str

Examples

>>> pe = NRL()
>>> pe.encode('the')
'DHAX'
>>> pe.encode('round')
'rAWnd'
>>> pe.encode('quick')
'kwIHk'
>>> pe.encode('eaten')
'IYtEHn'
>>> pe.encode('Smith')
'smIHTH'
>>> pe.encode('Larsen')
'lAArsEHn'

New in version 0.3.0.

Changed in version 0.3.6: Encapsulated in class

abydos.phonetic.nrl(word)[source]

Return the Naval Research Laboratory phonetic encoding of a word.

This is a wrapper for NRL.encode().

Parameters

word (str) -- The word to transform

Returns

The NRL phonetic encoding

Return type

str

Examples

>>> nrl('the')
'DHAX'
>>> nrl('round')
'rAWnd'
>>> nrl('quick')
'kwIHk'
>>> nrl('eaten')
'IYtEHn'
>>> nrl('Smith')
'smIHTH'
>>> nrl('Larsen')
'lAArsEHn'

New in version 0.3.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the NRL.encode method instead.

class abydos.phonetic.MetaSoundex(lang='en')[source]

Bases: abydos.phonetic._phonetic._Phonetic

MetaSoundex.

This is based on [KV17]. Only English ('en') and Spanish ('es') languages are supported, as in the original.

New in version 0.3.6.

Initialize MetaSoundex instance.

Parameters

lang (str) -- Either en for English or es for Spanish

New in version 0.4.0.

encode(word)[source]

Return the MetaSoundex code for a word.

Parameters

word (str) -- The word to transform

Returns

The MetaSoundex code

Return type

str

Examples

>>> pe = MetaSoundex()
>>> pe.encode('Smith')
'4500'
>>> pe.encode('Waters')
'7362'
>>> pe.encode('James')
'1520'
>>> pe.encode('Schmidt')
'4530'
>>> pe.encode('Ashcroft')
'0261'
>>> pe = MetaSoundex(lang='es')
>>> pe.encode('Perez')
'094'
>>> pe.encode('Martinez')
'69364'
>>> pe.encode('Gutierrez')
'83994'
>>> pe.encode('Santiago')
'4638'
>>> pe.encode('Nicolás')
'6754'

New in version 0.3.0.

Changed in version 0.3.6: Encapsulated in class

encode_alpha(word)[source]

Return the alphabetic MetaSoundex code for a word.

Parameters

word (str) -- The word to transform

Returns

The alphabetic MetaSoundex code

Return type

str

Examples

>>> pe = MetaSoundex()
>>> pe.encode_alpha('Smith')
'SN'
>>> pe.encode_alpha('Waters')
'WTRK'
>>> pe.encode_alpha('James')
'JNK'
>>> pe.encode_alpha('Schmidt')
'SNT'
>>> pe.encode_alpha('Ashcroft')
'AKRP'
>>> pe = MetaSoundex(lang='es')
>>> pe.encode_alpha('Perez')
'PRS'
>>> pe.encode_alpha('Martinez')
'NRTNS'
>>> pe.encode_alpha('Gutierrez')
'GTRRS'
>>> pe.encode_alpha('Santiago')
'SNTG'
>>> pe.encode_alpha('Nicolás')
'NKLS'

New in version 0.4.0.

abydos.phonetic.metasoundex(word, lang='en')[source]

Return the MetaSoundex code for a word.

This is a wrapper for MetaSoundex.encode().

Parameters
  • word (str) -- The word to transform

  • lang (str) -- Either en for English or es for Spanish

Returns

The MetaSoundex code

Return type

str

Examples

>>> metasoundex('Smith')
'4500'
>>> metasoundex('Waters')
'7362'
>>> metasoundex('James')
'1520'
>>> metasoundex('Schmidt')
'4530'
>>> metasoundex('Ashcroft')
'0261'
>>> metasoundex('Perez', lang='es')
'094'
>>> metasoundex('Martinez', lang='es')
'69364'
>>> metasoundex('Gutierrez', lang='es')
'83994'
>>> metasoundex('Santiago', lang='es')
'4638'
>>> metasoundex('Nicolás', lang='es')
'6754'

New in version 0.3.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the MetaSoundex.encode method instead.

class abydos.phonetic.ONCA(max_length=4, zero_pad=True)[source]

Bases: abydos.phonetic._phonetic._Phonetic

Oxford Name Compression Algorithm (ONCA).

This is the Oxford Name Compression Algorithm, based on [Gil97].

I can find no complete description of the "anglicised version of the NYSIIS method" identified as the first step in this algorithm, so this is likely not a precisely correct implementation, in that it employs the standard NYSIIS algorithm.

New in version 0.3.6.

Initialize ONCA instance.

Parameters
  • max_length (int) -- The maximum length (default 4) of the code to return

  • zero_pad (bool) -- Pad the end of the return value with 0s to achieve a max_length string

New in version 0.4.0.

encode(word)[source]

Return the Oxford Name Compression Algorithm (ONCA) code for a word.

Parameters

word (str) -- The word to transform

Returns

The ONCA code

Return type

str

Examples

>>> pe = ONCA()
>>> pe.encode('Christopher')
'C623'
>>> pe.encode('Niall')
'N400'
>>> pe.encode('Smith')
'S530'
>>> pe.encode('Schmidt')
'S530'

New in version 0.3.0.

Changed in version 0.3.6: Encapsulated in class
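
Since spelling variants such as 'Smith' and 'Schmidt' receive the same code in the examples above, a typical use of such codes is to bucket names by code before doing more expensive comparisons. The following sketch shows that grouping step with the standard library only; group_by_code is a hypothetical helper, and the codes are copied from the documented examples rather than computed.

```python
from collections import defaultdict

# Name/code pairs copied from the documented examples.
documented = {
    'Christopher': 'C623',
    'Niall': 'N400',
    'Smith': 'S530',
    'Schmidt': 'S530',
}


def group_by_code(name_to_code):
    """Bucket names that share the same phonetic code."""
    buckets = defaultdict(list)
    for name, code in name_to_code.items():
        buckets[code].append(name)
    return dict(buckets)


groups = group_by_code(documented)
```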

encode_alpha(word)[source]

Return the alphabetic ONCA code for a word.

Parameters

word (str) -- The word to transform

Returns

The alphabetic ONCA code

Return type

str

Examples

>>> pe = ONCA()
>>> pe.encode_alpha('Christopher')
'CRKT'
>>> pe.encode_alpha('Niall')
'NL'
>>> pe.encode_alpha('Smith')
'SNT'
>>> pe.encode_alpha('Schmidt')
'SNT'

New in version 0.4.0.

abydos.phonetic.onca(word, max_length=4, zero_pad=True)[source]

Return the Oxford Name Compression Algorithm (ONCA) code for a word.

This is a wrapper for ONCA.encode().

Parameters
  • word (str) -- The word to transform

  • max_length (int) -- The maximum length (default 4) of the code to return

  • zero_pad (bool) -- Pad the end of the return value with 0s to achieve a max_length string

Returns

The ONCA code

Return type

str

Examples

>>> onca('Christopher')
'C623'
>>> onca('Niall')
'N400'
>>> onca('Smith')
'S530'
>>> onca('Schmidt')
'S530'

New in version 0.3.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the ONCA.encode method instead.
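The function-style wrappers in this package all follow the same pattern: build an instance of the class, call its encode method, and flag the call as deprecated. A minimal sketch of that pattern follows. ToyEncoder and toy_encode are hypothetical stand-ins, and whether abydos itself emits a DeprecationWarning in exactly this way is an assumption.

```python
import warnings


class ToyEncoder:
    """Hypothetical stand-in for a phonetic class; not part of abydos."""

    def __init__(self, max_length=4):
        self.max_length = max_length

    def encode(self, word):
        # Placeholder transformation: uppercase and truncate.
        return word.upper()[: self.max_length]


def toy_encode(word, max_length=4):
    """Deprecated function-style wrapper around ToyEncoder.encode()."""
    warnings.warn(
        'toy_encode is deprecated; use ToyEncoder.encode instead',
        DeprecationWarning,
        stacklevel=2,
    )
    return ToyEncoder(max_length).encode(word)


# Capture the warning so it can be inspected rather than printed.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    result = toy_encode('Smith')

print(result)                       # SMIT
print(caught[0].category.__name__)  # DeprecationWarning
```

When migrating, constructor arguments such as max_length move from the function call to the class constructor, and the word is passed to encode.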

class abydos.phonetic.FONEM[source]

Bases: abydos.phonetic._phonetic._Phonetic

FONEM.

FONEM is a phonetic algorithm designed for French (particularly surnames in Saguenay, Canada), defined in [BBL81].

Guillaume Plique's JavaScript implementation [Pli18] at https://github.com/Yomguithereal/talisman/blob/master/src/phonetics/french/fonem.js was also consulted for this implementation.

New in version 0.3.6.

encode(word)[source]

Return the FONEM code of a word.

Parameters

word (str) -- The word to transform

Returns

The FONEM code

Return type

str

Examples

>>> pe = FONEM()
>>> pe.encode('Marchand')
'MARCHEN'
>>> pe.encode('Beaulieu')
'BOLIEU'
>>> pe.encode('Beaumont')
'BOMON'
>>> pe.encode('Legrand')
'LEGREN'
>>> pe.encode('Pelletier')
'PELETIER'

New in version 0.3.0.

Changed in version 0.3.6: Encapsulated in class

abydos.phonetic.fonem(word)[source]

Return the FONEM code of a word.

This is a wrapper for FONEM.encode().

Parameters

word (str) -- The word to transform

Returns

The FONEM code

Return type

str

Examples

>>> fonem('Marchand')
'MARCHEN'
>>> fonem('Beaulieu')
'BOLIEU'
>>> fonem('Beaumont')
'BOMON'
>>> fonem('Legrand')
'LEGREN'
>>> fonem('Pelletier')
'PELETIER'

New in version 0.3.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the FONEM.encode method instead.

class abydos.phonetic.HenryEarly(max_length=3)[source]

Bases: abydos.phonetic._phonetic._Phonetic

Henry code, early version.

The early version of Henry coding is given in [LegareLC72]. This is different from the later version defined in [Hen76].

New in version 0.3.6.

Initialize HenryEarly instance.

Parameters

max_length (int) -- The length of the code returned (defaults to 3)

New in version 0.4.0.

encode(word)[source]

Calculate the early version of the Henry code for a word.

Parameters

word (str) -- The word to transform

Returns

The early Henry code

Return type

str

Examples

>>> pe = HenryEarly()
>>> pe.encode('Marchand')
'MRC'
>>> pe.encode('Beaulieu')
'BL'
>>> pe.encode('Beaumont')
'BM'
>>> pe.encode('Legrand')
'LGR'
>>> pe.encode('Pelletier')
'PLT'

New in version 0.3.0.

Changed in version 0.3.6: Encapsulated in class

abydos.phonetic.henry_early(word, max_length=3)[source]

Calculate the early version of the Henry code for a word.

This is a wrapper for HenryEarly.encode().

Parameters
  • word (str) -- The word to transform

  • max_length (int) -- The length of the code returned (defaults to 3)

Returns

The early Henry code

Return type

str

Examples

>>> henry_early('Marchand')
'MRC'
>>> henry_early('Beaulieu')
'BL'
>>> henry_early('Beaumont')
'BM'
>>> henry_early('Legrand')
'LGR'
>>> henry_early('Pelletier')
'PLT'

New in version 0.3.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the HenryEarly.encode method instead.

class abydos.phonetic.Koelner[source]

Bases: abydos.phonetic._phonetic._Phonetic

Kölner Phonetik.

Based on the algorithm defined by [Pos69].

New in version 0.3.6.

encode(word)[source]

Return the Kölner Phonetik (numeric output) code for a word.

While the output code is numeric, it is still a str because 0s can lead the code.

Parameters

word (str) -- The word to transform

Returns

The Kölner Phonetik value as a numeric string

Return type

str

Examples

>>> pe = Koelner()
>>> pe.encode('Christopher')
'478237'
>>> pe.encode('Niall')
'65'
>>> pe.encode('Smith')
'862'
>>> pe.encode('Schmidt')
'862'
>>> pe.encode('Müller')
'657'
>>> pe.encode('Zimmermann')
'86766'

New in version 0.1.0.

Changed in version 0.3.6: Encapsulated in class
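The note above that the numeric code stays a str is easy to verify: converting a code with a leading 0 to an int silently drops that digit. The code value below is made up for illustration and is not the output of any particular word.

```python
code = '065'  # hypothetical numeric code with a leading zero

as_int = int(code)
print(as_int)                       # 65
print(str(as_int) == code)          # False
print(len(code), len(str(as_int)))  # 3 2
```

Keeping the value as a str means two codes compare equal only when every digit, including a leading 0, matches.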

encode_alpha(word)[source]

Return the Kölner Phonetik (alphabetic output) code for a word.

Parameters

word (str) -- The word to transform

Returns

The Kölner Phonetik value as an alphabetic string

Return type

str

Examples

>>> pe = Koelner()
>>> pe.encode_alpha('Smith')
'SNT'
>>> pe.encode_alpha('Schmidt')
'SNT'
>>> pe.encode_alpha('Müller')
'NLR'
>>> pe.encode_alpha('Zimmermann')
'SNRNN'

New in version 0.1.0.

Changed in version 0.3.6: Encapsulated in class

abydos.phonetic.koelner_phonetik(word)[source]

Return the Kölner Phonetik (numeric output) code for a word.

This is a wrapper for Koelner.encode().

Parameters

word (str) -- The word to transform

Returns

The Kölner Phonetik value as a numeric string

Return type

str

Examples

>>> koelner_phonetik('Christopher')
'478237'
>>> koelner_phonetik('Niall')
'65'
>>> koelner_phonetik('Smith')
'862'
>>> koelner_phonetik('Schmidt')
'862'
>>> koelner_phonetik('Müller')
'657'
>>> koelner_phonetik('Zimmermann')
'86766'

New in version 0.1.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the Koelner.encode method instead.

abydos.phonetic.koelner_phonetik_num_to_alpha(num)[source]

Convert a Kölner Phonetik code from numeric to alphabetic.

This is a wrapper for Koelner._to_alpha().

Parameters

num (str or int) -- A numeric Kölner Phonetik representation

Returns

An alphabetic representation of the same word

Return type

str

Examples

>>> koelner_phonetik_num_to_alpha('862')
'SNT'
>>> koelner_phonetik_num_to_alpha('657')
'NLR'
>>> koelner_phonetik_num_to_alpha('86766')
'SNRNN'

New in version 0.1.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the Koelner._to_alpha method instead.
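The numeric-to-alphabetic conversion can be sketched as a per-digit lookup. In the table below, the letters for 2, 5, 6, 7 and 8 are read directly off the examples above; the letters for 0, 1, 3 and 4 are assumptions based on the usual Kölner Phonetik digit classes and may not match what Koelner._to_alpha actually produces. The name num_to_alpha is hypothetical.

```python
# Digit-to-letter table. The pairs for 2, 5, 6, 7 and 8 follow the documented
# examples; the pairs for 0, 1, 3 and 4 are assumptions.
DIGIT_TO_LETTER = str.maketrans('012345678', 'APTFKLNRS')


def num_to_alpha(num):
    """Convert a numeric code (str or int) to letters, digit by digit."""
    return str(num).translate(DIGIT_TO_LETTER)


print(num_to_alpha('862'))    # SNT
print(num_to_alpha(657))      # NLR
print(num_to_alpha('86766'))  # SNRNN
```

Passing the code as a str is the safer choice, since an int cannot carry a leading 0.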

abydos.phonetic.koelner_phonetik_alpha(word)[source]

Return the Kölner Phonetik (alphabetic output) code for a word.

This is a wrapper for Koelner.encode_alpha().

Parameters

word (str) -- The word to transform

Returns

The Kölner Phonetik value as an alphabetic string

Return type

str

Examples

>>> koelner_phonetik_alpha('Smith')
'SNT'
>>> koelner_phonetik_alpha('Schmidt')
'SNT'
>>> koelner_phonetik_alpha('Müller')
'NLR'
>>> koelner_phonetik_alpha('Zimmermann')
'SNRNN'

New in version 0.1.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the Koelner.encode_alpha method instead.

class abydos.phonetic.Haase(primary_only=False)[source]

Bases: abydos.phonetic._phonetic._Phonetic

Haase Phonetik.

Based on the algorithm described at [Pra15].

Based on the original [HH00].

New in version 0.3.6.

Initialize Haase instance.

Parameters

primary_only (bool) -- If True, only the primary code is returned

New in version 0.4.0.

encode(word)[source]

Return the Haase Phonetik (numeric output) code for a word.

While each output code is numeric, it is nevertheless a str; the codes are returned together in a tuple.

Parameters

word (str) -- The word to transform

Returns

The Haase Phonetik value(s), as a tuple of numeric strings

Return type

tuple

Examples

>>> pe = Haase()
>>> pe.encode('Joachim')
('9496',)
>>> pe.encode('Christoph')
('4798293', '8798293')
>>> pe.encode('Jörg')
('974',)
>>> pe.encode('Smith')
('8692',)
>>> pe.encode('Schmidt')
('8692', '4692')

New in version 0.3.0.

Changed in version 0.3.6: Encapsulated in class
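Because encode returns a tuple that may hold more than one code, comparing two names usually means checking whether their tuples share any code. A minimal sketch, using tuples copied from the examples above; codes_overlap is a hypothetical helper, not an abydos function.

```python
def codes_overlap(codes_a, codes_b):
    """Return True if two tuples of codes share at least one code."""
    return not set(codes_a).isdisjoint(codes_b)


# Tuples copied from the documented examples.
smith = ('8692',)
schmidt = ('8692', '4692')
joerg = ('974',)

print(codes_overlap(smith, schmidt))  # True
print(codes_overlap(smith, joerg))    # False
```

If only the first code of each tuple should count, constructing the instance with primary_only=True avoids the need for a set comparison.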

encode_alpha(word)[source]

Return the alphabetic Haase Phonetik code for a word.

Parameters

word (str) -- The word to transform

Returns

The alphabetic Haase Phonetik value

Return type

tuple

Examples

>>> pe = Haase()
>>> pe.encode_alpha('Joachim')
('AKAN',)
>>> pe.encode_alpha('Christoph')
('KRASTAF', 'SRASTAF')
>>> pe.encode_alpha('Jörg')
('ARK',)
>>> pe.encode_alpha('Smith')
('SNAT',)
>>> pe.encode_alpha('Schmidt')
('SNAT', 'KNAT')

New in version 0.4.0.

abydos.phonetic.haase_phonetik(word, primary_only=False)[source]

Return the Haase Phonetik code for a word.

This is a wrapper for Haase.encode().

Parameters
  • word (str) -- The word to transform

  • primary_only (bool) -- If True, only the primary code is returned

Returns

The Haase Phonetik value(s), as a tuple of numeric strings

Return type

tuple

Examples

>>> haase_phonetik('Joachim')
('9496',)
>>> haase_phonetik('Christoph')
('4798293', '8798293')
>>> haase_phonetik('Jörg')
('974',)
>>> haase_phonetik('Smith')
('8692',)
>>> haase_phonetik('Schmidt')
('8692', '4692')

New in version 0.3.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the Haase.encode method instead.

class abydos.phonetic.RethSchek[source]

Bases: abydos.phonetic._phonetic._Phonetic

Reth-Schek Phonetik.

This algorithm is proposed in [vonRethS77].

Since I couldn't secure a copy of that document (maybe I'll look for it next time I'm in Germany), this implementation is based on what I could glean from the implementations published by the German Record Linkage Center (www.record-linkage.de):

  • Privacy-preserving Record Linkage (PPRL) (in R) [Ruk18]

  • Merge ToolBox (in Java) [SBB04]

Rules that are unclear:

  • Should 'C' become 'G' or 'Z'? (PPRL has both, 'Z' rule blocked)

  • Should 'CC' become 'G'? (PPRL has 'CK' blocked, which may be a typo)

  • Should 'TUI' -> 'ZUI' rule exist? (PPRL has rule, but I can't think of a German word with '-tui-' in it.)

  • Should we really change 'SCH' -> 'CH' and then 'CH' -> 'SCH'?

New in version 0.3.6.

encode(word)[source]

Return Reth-Schek Phonetik code for a word.

Parameters

word (str) -- The word to transform

Returns

The Reth-Schek Phonetik code

Return type

str

Examples

>>> pe = RethSchek()
>>> pe.encode('Joachim')
'JOAGHIM'
>>> pe.encode('Christoph')
'GHRISDOF'
>>> pe.encode('Jörg')
'JOERG'
>>> pe.encode('Smith')
'SMID'
>>> pe.encode('Schmidt')
'SCHMID'

New in version 0.3.0.

Changed in version 0.3.6: Encapsulated in class

abydos.phonetic.reth_schek_phonetik(word)[source]

Return Reth-Schek Phonetik code for a word.

This is a wrapper for RethSchek.encode().

Parameters

word (str) -- The word to transform

Returns

The Reth-Schek Phonetik code

Return type

str

Examples

>>> reth_schek_phonetik('Joachim')
'JOAGHIM'
>>> reth_schek_phonetik('Christoph')
'GHRISDOF'
>>> reth_schek_phonetik('Jörg')
'JOERG'
>>> reth_schek_phonetik('Smith')
'SMID'
>>> reth_schek_phonetik('Schmidt')
'SCHMID'

New in version 0.3.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the RethSchek.encode method instead.

class abydos.phonetic.Phonem[source]

Bases: abydos.phonetic._phonetic._Phonetic

Phonem.

Phonem is defined in [GM88].

This version is based on the Perl implementation documented at [Wil05]. It includes some enhancements presented in the Java port at [dcm4che].

Phonem is intended chiefly for German names/words.

New in version 0.3.6.

encode(word)[source]

Return the Phonem code for a word.

Parameters

word (str) -- The word to transform

Returns

The Phonem value

Return type

str

Examples

>>> pe = Phonem()
>>> pe.encode('Christopher')
'CRYSDOVR'
>>> pe.encode('Niall')
'NYAL'
>>> pe.encode('Smith')
'SMYD'
>>> pe.encode('Schmidt')
'CMYD'

New in version 0.1.0.

Changed in version 0.3.6: Encapsulated in class

abydos.phonetic.phonem(word)[source]

Return the Phonem code for a word.

This is a wrapper for Phonem.encode().

Parameters

word (str) -- The word to transform

Returns

The Phonem value

Return type

str

Examples

>>> phonem('Christopher')
'CRYSDOVR'
>>> phonem('Niall')
'NYAL'
>>> phonem('Smith')
'SMYD'
>>> phonem('Schmidt')
'CMYD'

New in version 0.1.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the Phonem.encode method instead.

class abydos.phonetic.Phonet(mode=1, lang='de')[source]

Bases: abydos.phonetic._phonetic._Phonetic

Phonet code.

phonet ("Hannoveraner Phonetik") was developed by Jörg Michael and documented in [Mic99].

This is a port of Jesper Zedlitz's code, which is licensed LGPL [Zed15].

That is, in turn, based on Michael's C code, which is also licensed LGPL [Mic07].

New in version 0.3.6.

Initialize Phonet instance.

Parameters
  • mode (int) -- The phonet variant to employ (1 or 2)

  • lang (str) -- de (default) for German, none for no language

New in version 0.4.0.

encode(word)[source]

Return the phonet code for a word.

Parameters

word (str) -- The word to transform

Returns

The phonet value

Return type

str

Examples

>>> pe = Phonet()
>>> pe.encode('Christopher')
'KRISTOFA'
>>> pe.encode('Niall')
'NIAL'
>>> pe.encode('Smith')
'SMIT'
>>> pe.encode('Schmidt')
'SHMIT'
>>> pe2 = Phonet(mode=2)
>>> pe2.encode('Christopher')
'KRIZTUFA'
>>> pe2.encode('Niall')
'NIAL'
>>> pe2.encode('Smith')
'ZNIT'
>>> pe2.encode('Schmidt')
'ZNIT'
>>> pe_none = Phonet(lang='none')
>>> pe_none.encode('Christopher')
'CHRISTOPHER'
>>> pe_none.encode('Niall')
'NIAL'
>>> pe_none.encode('Smith')
'SMITH'
>>> pe_none.encode('Schmidt')
'SCHMIDT'

New in version 0.1.0.

Changed in version 0.3.6: Encapsulated in class

abydos.phonetic.phonet(word, mode=1, lang='de')[source]

Return the phonet code for a word.

This is a wrapper for Phonet.encode().

Parameters
  • word (str) -- The word to transform

  • mode (int) -- The phonet variant to employ (1 or 2)

  • lang (str) -- de (default) for German, none for no language

Returns

The phonet value

Return type

str

Examples

>>> phonet('Christopher')
'KRISTOFA'
>>> phonet('Niall')
'NIAL'
>>> phonet('Smith')
'SMIT'
>>> phonet('Schmidt')
'SHMIT'
>>> phonet('Christopher', mode=2)
'KRIZTUFA'
>>> phonet('Niall', mode=2)
'NIAL'
>>> phonet('Smith', mode=2)
'ZNIT'
>>> phonet('Schmidt', mode=2)
'ZNIT'
>>> phonet('Christopher', lang='none')
'CHRISTOPHER'
>>> phonet('Niall', lang='none')
'NIAL'
>>> phonet('Smith', lang='none')
'SMITH'
>>> phonet('Schmidt', lang='none')
'SCHMIDT'

New in version 0.1.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the Phonet.encode method instead.

class abydos.phonetic.SoundexBR(max_length=4, zero_pad=True)[source]

Bases: abydos.phonetic._phonetic._Phonetic

SoundexBR.

This is based on [Mar15].

New in version 0.3.6.

Initialize SoundexBR instance.

Parameters
  • max_length (int) -- The length of the code returned (defaults to 4)

  • zero_pad (bool) -- Pad the end of the return value with 0s to achieve a max_length string

New in version 0.4.0.

encode(word)[source]

Return the SoundexBR encoding of a word.

Parameters

word (str) -- The word to transform

Returns

The SoundexBR code

Return type

str

Examples

>>> pe = SoundexBR()
>>> pe.encode('Oliveira')
'O416'
>>> pe.encode('Almeida')
'A453'
>>> pe.encode('Barbosa')
'B612'
>>> pe.encode('Araújo')
'A620'
>>> pe.encode('Gonçalves')
'G524'
>>> pe.encode('Goncalves')
'G524'

New in version 0.3.0.

Changed in version 0.3.6: Encapsulated in class

encode_alpha(word)[source]

Return the alphabetic SoundexBR encoding of a word.

Parameters

word (str) -- The word to transform

Returns

The alphabetic SoundexBR code

Return type

str

Examples

>>> pe = SoundexBR()
>>> pe.encode_alpha('Oliveira')
'OLPR'
>>> pe.encode_alpha('Almeida')
'ALNT'
>>> pe.encode_alpha('Barbosa')
'BRPK'
>>> pe.encode_alpha('Araújo')
'ARK'
>>> pe.encode_alpha('Gonçalves')
'GNKL'
>>> pe.encode_alpha('Goncalves')
'GNKL'

New in version 0.4.0.

abydos.phonetic.soundex_br(word, max_length=4, zero_pad=True)[source]

Return the SoundexBR encoding of a word.

This is a wrapper for SoundexBR.encode().

Parameters
  • word (str) -- The word to transform

  • max_length (int) -- The length of the code returned (defaults to 4)

  • zero_pad (bool) -- Pad the end of the return value with 0s to achieve a max_length string

Returns

The SoundexBR code

Return type

str

Examples

>>> soundex_br('Oliveira')
'O416'
>>> soundex_br('Almeida')
'A453'
>>> soundex_br('Barbosa')
'B612'
>>> soundex_br('Araújo')
'A620'
>>> soundex_br('Gonçalves')
'G524'
>>> soundex_br('Goncalves')
'G524'

New in version 0.3.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the SoundexBR.encode method instead.

class abydos.phonetic.PhoneticSpanish(max_length=-1)[source]

Bases: abydos.phonetic._phonetic._Phonetic

PhoneticSpanish.

This follows the coding described in [AmonME12] and [delPAngelesEGGM15].

New in version 0.3.6.

Initialize PhoneticSpanish instance.

Parameters

max_length (int) -- The length of the code returned (defaults to unlimited)

New in version 0.4.0.
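A max_length of -1 acts as a sentinel for "no limit". The sketch below shows one way such a sentinel could be honoured; limit_code is a hypothetical helper, and treating only -1 as unlimited is an assumption about the library's behaviour.

```python
def limit_code(code, max_length=-1):
    """Truncate code to max_length characters; -1 means no limit.

    Hypothetical helper for illustration; not the library's own code.
    """
    if max_length == -1:
        return code
    return code[:max_length]


print(limit_code('69364'))     # 69364
print(limit_code('69364', 3))  # 693
```

The explicit check matters because slicing with code[:-1] would drop the last character rather than leave the code untouched.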

encode(word)[source]

Return the PhoneticSpanish coding of word.

Parameters

word (str) -- The word to transform

Returns

The PhoneticSpanish code

Return type

str

Examples

>>> pe = PhoneticSpanish()
>>> pe.encode('Perez')
'094'
>>> pe.encode('Martinez')
'69364'
>>> pe.encode('Gutierrez')
'83994'
>>> pe.encode('Santiago')
'4638'
>>> pe.encode('Nicolás')
'6454'

New in version 0.3.0.

Changed in version 0.3.6: Encapsulated in class

encode_alpha(word)[source]

Return the alphabetic PhoneticSpanish coding of word.

Parameters

word (str) -- The word to transform

Returns

The alphabetic PhoneticSpanish code

Return type

str

Examples

>>> pe = PhoneticSpanish()
>>> pe.encode_alpha('Perez')
'PRS'
>>> pe.encode_alpha('Martinez')
'NRTNS'
>>> pe.encode_alpha('Gutierrez')
'GTRRS'
>>> pe.encode_alpha('Santiago')
'SNTG'
>>> pe.encode_alpha('Nicolás')
'NSLS'

New in version 0.4.0.

abydos.phonetic.phonetic_spanish(word, max_length=-1)[source]

Return the PhoneticSpanish coding of word.

This is a wrapper for PhoneticSpanish.encode().

Parameters
  • word (str) -- The word to transform

  • max_length (int) -- The length of the code returned (defaults to unlimited)

Returns

The PhoneticSpanish code

Return type

str

Examples

>>> phonetic_spanish('Perez')
'094'
>>> phonetic_spanish('Martinez')
'69364'
>>> phonetic_spanish('Gutierrez')
'83994'
>>> phonetic_spanish('Santiago')
'4638'
>>> phonetic_spanish('Nicolás')
'6454'

New in version 0.3.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the PhoneticSpanish.encode method instead.

class abydos.phonetic.SpanishMetaphone(max_length=6, modified=False)[source]

Bases: abydos.phonetic._phonetic._Phonetic

Spanish Metaphone.

This is a quick rewrite of the Spanish Metaphone Algorithm, as presented at https://github.com/amsqr/Spanish-Metaphone and discussed in [MLM12].

Modified version based on [delPAngelesBailonM16].

New in version 0.3.6.

Initialize SpanishMetaphone instance.

Parameters
  • max_length (int) -- The length of the code returned (defaults to 6)

  • modified (bool) -- Set to True to use del Pilar Angeles & Bailón-Miguel's modified version of the algorithm

New in version 0.4.0.

encode(word)[source]

Return the Spanish Metaphone of a word.

Parameters

word (str) -- The word to transform

Returns

The Spanish Metaphone code

Return type

str

Examples

>>> pe = SpanishMetaphone()
>>> pe.encode('Perez')
'PRZ'
>>> pe.encode('Martinez')
'MRTNZ'
>>> pe.encode('Gutierrez')
'GTRRZ'
>>> pe.encode('Santiago')
'SNTG'
>>> pe.encode('Nicolás')
'NKLS'

New in version 0.3.0.

Changed in version 0.3.6: Encapsulated in class

abydos.phonetic.spanish_metaphone(word, max_length=6, modified=False)[source]

Return the Spanish Metaphone of a word.

This is a wrapper for SpanishMetaphone.encode().

Parameters
  • word (str) -- The word to transform

  • max_length (int) -- The length of the code returned (defaults to 6)

  • modified (bool) -- Set to True to use del Pilar Angeles & Bailón-Miguel's modified version of the algorithm

Returns

The Spanish Metaphone code

Return type

str

Examples

>>> spanish_metaphone('Perez')
'PRZ'
>>> spanish_metaphone('Martinez')
'MRTNZ'
>>> spanish_metaphone('Gutierrez')
'GTRRZ'
>>> spanish_metaphone('Santiago')
'SNTG'
>>> spanish_metaphone('Nicolás')
'NKLS'

New in version 0.3.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the SpanishMetaphone.encode method instead.

class abydos.phonetic.SfinxBis(max_length=-1)[source]

Bases: abydos.phonetic._phonetic._Phonetic

SfinxBis code.

SfinxBis is a Soundex-like algorithm defined in [Axe09].

This implementation follows the reference implementation: [Sjoo09].

SfinxBis is intended chiefly for Swedish names.

New in version 0.3.6.

Initialize SfinxBis instance.

Parameters

max_length (int) -- The length of the code returned (defaults to unlimited)

New in version 0.4.0.

encode(word)[source]

Return the SfinxBis code for a word.

Parameters

word (str) -- The word to transform

Returns

The SfinxBis value

Return type

tuple

Examples

>>> pe = SfinxBis()
>>> pe.encode('Christopher')
('K68376',)
>>> pe.encode('Niall')
('N4',)
>>> pe.encode('Smith')
('S53',)
>>> pe.encode('Schmidt')
('S53',)
>>> pe.encode('Johansson')
('J585',)
>>> pe.encode('Sjöberg')
('#162',)

New in version 0.1.0.

Changed in version 0.3.6: Encapsulated in class

encode_alpha(word)[source]

Return the alphabetic SfinxBis code for a word.

Parameters

word (str) -- The word to transform

Returns

The alphabetic SfinxBis value

Return type

tuple

Examples

>>> pe = SfinxBis()
>>> pe.encode_alpha('Christopher')
('KRSTFR',)
>>> pe.encode_alpha('Niall')
('NL',)
>>> pe.encode_alpha('Smith')
('SNT',)
>>> pe.encode_alpha('Schmidt')
('SNT',)
>>> pe.encode_alpha('Johansson')
('JNSN',)
>>> pe.encode_alpha('Sjöberg')
('ŠPRK',)

New in version 0.4.0.

abydos.phonetic.sfinxbis(word, max_length=-1)[source]

Return the SfinxBis code for a word.

This is a wrapper for SfinxBis.encode().

Parameters
  • word (str) -- The word to transform

  • max_length (int) -- The length of the code returned (defaults to unlimited)

Returns

The SfinxBis value

Return type

tuple

Examples

>>> sfinxbis('Christopher')
('K68376',)
>>> sfinxbis('Niall')
('N4',)
>>> sfinxbis('Smith')
('S53',)
>>> sfinxbis('Schmidt')
('S53',)
>>> sfinxbis('Johansson')
('J585',)
>>> sfinxbis('Sjöberg')
('#162',)

New in version 0.1.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the SfinxBis.encode method instead.

class abydos.phonetic.Waahlin(encoder=None)[source]

Bases: abydos.phonetic._phonetic._Phonetic

Wåhlin code.

Wåhlin's first-letter coding is based on the description in [Eri97].

New in version 0.3.6.

Initialize Waahlin instance.

Parameters

encoder (_Phonetic) -- An initialized phonetic algorithm object

New in version 0.4.0.

encode(word, alphabetic=False)[source]

Return the Wåhlin code for a word.

Parameters
  • word (str) -- The word to transform

  • alphabetic (bool) -- If True, the encoder will apply its alphabetic form (.encode_alpha rather than .encode)

Returns

The Wåhlin code value

Return type

str

Examples

>>> pe = Waahlin()
>>> pe.encode('Christopher')
'KRISTOFER'
>>> pe.encode('Niall')
'NJALL'
>>> pe.encode('Smith')
'SMITH'
>>> pe.encode('Schmidt')
'*MIDT'

New in version 0.4.0.
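The alphabetic flag selects which method of the supplied encoder is called. The dispatch can be sketched as follows. StubEncoder and apply_encoder are hypothetical names, the stub's transformations are placeholders, and returning the text unchanged when no encoder is given is an assumption rather than documented behaviour.

```python
class StubEncoder:
    """Hypothetical encoder exposing both output forms; not part of abydos."""

    def encode(self, word):
        return str(len(word))  # numeric-style placeholder

    def encode_alpha(self, word):
        return word.upper()    # alphabetic-style placeholder


def apply_encoder(encoder, text, alphabetic=False):
    """Run text through encoder, choosing the method by the flag."""
    if encoder is None:
        return text            # assumption: no encoder, no change
    method = encoder.encode_alpha if alphabetic else encoder.encode
    return method(text)


enc = StubEncoder()
print(apply_encoder(enc, 'midt'))                   # 4
print(apply_encoder(enc, 'midt', alphabetic=True))  # MIDT
print(apply_encoder(None, 'midt'))                  # midt
```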

encode_alpha(word)[source]

Return the alphabetic Wåhlin code for a word.

Parameters

word (str) -- The word to transform

Returns

The alphabetic Wåhlin code value

Return type

str

Examples

>>> pe = Waahlin()
>>> pe.encode_alpha('Christopher')
'KRISTOFER'
>>> pe.encode_alpha('Niall')
'NJALL'
>>> pe.encode_alpha('Smith')
'SMITH'
>>> pe.encode_alpha('Schmidt')
'ŠMIDT'

New in version 0.4.0.

class abydos.phonetic.Norphone[source]

Bases: abydos.phonetic._phonetic._Phonetic

Norphone.

The reference implementation by Lars Marius Garshol is available in [Gar15].

Norphone was designed for Norwegian, but this implementation has been extended to support Swedish vowels as well. It also incorporates the "not implemented" rules from the rule set in the reference implementation.

New in version 0.3.6.

encode(word)[source]

Return the Norphone code.

Parameters

word (str) -- The word to transform

Returns

The Norphone code

Return type

str

Examples

>>> pe = Norphone()
>>> pe.encode('Hansen')
'HNSN'
>>> pe.encode('Larsen')
'LRSN'
>>> pe.encode('Aagaard')
'ÅKRT'
>>> pe.encode('Braaten')
'BRTN'
>>> pe.encode('Sandvik')
'SNVK'

New in version 0.3.0.

Changed in version 0.3.6: Encapsulated in class

abydos.phonetic.norphone(word)[source]

Return the Norphone code.

This is a wrapper for Norphone.encode().

Parameters

word (str) -- The word to transform

Returns

The Norphone code

Return type

str

Examples

>>> norphone('Hansen')
'HNSN'
>>> norphone('Larsen')
'LRSN'
>>> norphone('Aagaard')
'ÅKRT'
>>> norphone('Braaten')
'BRTN'
>>> norphone('Sandvik')
'SNVK'

New in version 0.3.0.

Deprecated since version 0.4.0: This will be removed in 0.6.0. Use the Norphone.encode method instead.

class abydos.phonetic.Ainsworth[source]

Bases: abydos.phonetic._phonetic._Phonetic

Ainsworth's grapheme to phoneme converter.

Based on the ruleset listed in [Ain73].

New in version 0.4.1.

encode(word)[source]

Return the phonemic representation of a word.

Parameters

word (str) -- The word to transform

Returns

The phonemic representation in IPA

Return type

str

Examples

>>> pe = Ainsworth()
>>> pe.encode('Christopher')
'tʃrɪstofɜ'
>>> pe.encode('Niall')
'nɪɔl'
>>> pe.encode('Smith')
'smɪð'
>>> pe.encode('Schmidt')
'skmɪdt'

New in version 0.4.1.