abydos.phones package

abydos.phones.

The phones module implements phonetic feature coding, decoding, and comparison functions. It has four functions:

  • ipa_to_features() takes a string of IPA symbols and returns a list of integers, one per phone, each encoding the bundle of phonetic features of that phone.

  • ipa_to_feature_dicts() takes a string of IPA symbols and returns a list of human-readable dicts, one per phone, each describing the bundle of phonetic features of that phone.

  • get_feature() takes a list of feature bundles produced by ipa_to_features() and a feature name and returns a list representing whether that feature is present in each component of the list.

  • cmp_features() takes two phonetic feature bundles, such as the components of the lists returned by ipa_to_features(), and returns a measure of their similarity.

An example using these functions on two different pronunciations of the word 'international':

>>> int1 = 'ɪntənæʃənəɫ'
>>> int2 = 'ɪnɾənæʃɨnəɫ'
>>> feat1 = ipa_to_features(int1)
>>> feat1
[1826957413067434410,
 2711173160463936106,
 2783230754502126250,
 1828083331160779178,
 2711173160463936106,
 1826957425885227434,
 2783231556184615322,
 1828083331160779178,
 2711173160463936106,
 1828083331160779178,
 2693158721554917798]
>>> feat2 = ipa_to_features(int2)
>>> feat2
[1826957413067434410,
 2711173160463936106,
 2711173160463935914,
 1828083331160779178,
 2711173160463936106,
 1826957425885227434,
 2783231556184615322,
 1826957414069873066,
 2711173160463936106,
 1828083331160779178,
 2693158721554917798]
>>> ipa_to_feature_dicts('ʤɪn')
[{'syllabic': '-',
  'consonantal': '+',
  'sonorant': '-',
  'approximant': '-',
  'labial': '-',
  'round': '0',
  'protruded': '0',
  'compressed': '0',
  'labiodental': '0',
  'coronal': '+',
  'anterior': '-',
  'distributed': '+',
  'dorsal': '+',
  'high': '-',
  'low': '-',
  'front': '-',
  'back': '-',
  'tense': '-',
  'pharyngeal': '-',
  'atr': '0',
  'rtr': '0',
  'voice': '+',
  'spread_glottis': '-',
  'constricted_glottis': '-',
  'glottalic_suction': '-',
  'velaric_suction': '-',
  'continuant': '+/-',
  'nasal': '-',
  'strident': '+',
  'lateral': '-',
  'delayed_release': '+'},
 {'syllabic': '+',
  'consonantal': '-',
  'sonorant': '+',
  'approximant': '+',
  'labial': '+',
  'round': '-',
  'protruded': '-',
  'compressed': '-',
  'labiodental': '-',
  'coronal': '-',
  'anterior': '0',
  'distributed': '0',
  'dorsal': '+',
  'high': '+',
  'low': '-',
  'front': '+',
  'back': '-',
  'tense': '-',
  'pharyngeal': '+',
  'atr': '-',
  'rtr': '-',
  'voice': '+',
  'spread_glottis': '-',
  'constricted_glottis': '-',
  'glottalic_suction': '-',
  'velaric_suction': '-',
  'continuant': '+',
  'nasal': '-',
  'strident': '-',
  'lateral': '-',
  'delayed_release': '-'},
 {'syllabic': '-',
  'consonantal': '+',
  'sonorant': '+',
  'approximant': '-',
  'labial': '-',
  'round': '0',
  'protruded': '0',
  'compressed': '0',
  'labiodental': '0',
  'coronal': '+',
  'anterior': '+',
  'distributed': '-',
  'dorsal': '-',
  'high': '0',
  'low': '0',
  'front': '0',
  'back': '0',
  'tense': '0',
  'pharyngeal': '-',
  'atr': '0',
  'rtr': '0',
  'voice': '+',
  'spread_glottis': '-',
  'constricted_glottis': '-',
  'glottalic_suction': '-',
  'velaric_suction': '-',
  'continuant': '-',
  'nasal': '+',
  'strident': '-',
  'lateral': '-',
  'delayed_release': '-'}]
>>> get_feature(feat1, 'consonantal')
[-1, 1, 1, -1, 1, -1, 1, -1, 1, -1, 1]
>>> get_feature(feat1, 'nasal')
[-1, 1, -1, -1, 1, -1, -1, -1, 1, -1, -1]
>>> [cmp_features(f1, f2) for f1, f2 in zip(feat1, feat2)]
[1.0,
 1.0,
 0.9032258064516129,
 1.0,
 1.0,
 1.0,
 1.0,
 0.9193548387096774,
 1.0,
 1.0,
 1.0]
>>> sum(cmp_features(f1, f2) for f1, f2 in zip(feat1, feat2))/len(feat1)
0.9838709677419355
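
To see which phones pull the average below 1.0, the per-phone similarities can be paired back with the symbols. The following is a minimal sketch that hardcodes the strings and scores from the session above instead of calling abydos. The helper name differing_phones is hypothetical, and the sketch assumes one code point per phone, which holds for these two strings but not for IPA that uses diacritics or multi-character symbols.

```python
# Strings and per-phone scores copied from the example session above.
int1 = 'ɪntənæʃənəɫ'
int2 = 'ɪnɾənæʃɨnəɫ'
scores = [1.0, 1.0, 0.9032258064516129, 1.0, 1.0, 1.0,
          1.0, 0.9193548387096774, 1.0, 1.0, 1.0]


def differing_phones(word1, word2, similarities):
    """Return (index, phone1, phone2, score) for each non-identical pair.

    Hypothetical helper: assumes one code point per phone, so that
    string positions line up with positions in the score list.
    """
    if not len(word1) == len(word2) == len(similarities):
        raise ValueError('inputs must all have the same length')
    return [
        (i, p1, p2, score)
        for i, (p1, p2, score) in enumerate(zip(word1, word2, similarities))
        if score < 1.0
    ]


diffs = differing_phones(int1, int2, scores)
for i, p1, p2, score in diffs:
    print(i, p1, p2, round(score, 3))
```

Only two positions are reported: the t/ɾ pair and the ə/ɨ pair, which are the two places where the transcriptions differ.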

abydos.phones.ipa_to_features(ipa)

Convert IPA to features.

This translates an IPA string of one or more phones to a list of ints representing the features of the string.

Parameters

ipa (str) -- The IPA representation of a phone or series of phones

Returns

A representation of the features of the input string

Return type

list of ints

Examples

>>> ipa_to_features('mut')
[2709662981243185770, 1825831513894594986, 2783230754502126250]
>>> ipa_to_features('fon')
[2781702983095331242, 1825831531074464170, 2711173160463936106]
>>> ipa_to_features('telz')
[2783230754502126250, 1826957430176000426, 2693158761954453926,
2783230754501863834]

New in version 0.1.0.

abydos.phones.ipa_to_feature_dicts(ipa)

Convert IPA to a feature dict list.

This translates an IPA string of one or more phones to a list of dicts representing the features of the string.

Parameters

ipa (str) -- The IPA representation of a phone or series of phones

Returns

A representation of the features of the input string

Return type

list of dicts

Examples

>>> ipa_to_feature_dicts('mut')
[{'syllabic': '-',
  'consonantal': '+',
  'sonorant': '+',
  'approximant': '-',
  'labial': '+',
  'round': '-',
  'protruded': '-',
  'compressed': '-',
  'labiodental': '-',
  'coronal': '-',
  'anterior': '0',
  'distributed': '0',
  'dorsal': '-',
  'high': '0',
  'low': '0',
  'front': '0',
  'back': '0',
  'tense': '0',
  'pharyngeal': '-',
  'atr': '0',
  'rtr': '0',
  'voice': '+',
  'spread_glottis': '-',
  'constricted_glottis': '-',
  'glottalic_suction': '-',
  'velaric_suction': '-',
  'continuant': '-',
  'nasal': '+',
  'strident': '-',
  'lateral': '-',
  'delayed_release': '-'},
 {'syllabic': '+',
  'consonantal': '-',
  'sonorant': '+',
  'approximant': '+',
  'labial': '+',
  'round': '+',
  'protruded': '-',
  'compressed': '-',
  'labiodental': '-',
  'coronal': '-',
  'anterior': '0',
  'distributed': '0',
  'dorsal': '+',
  'high': '+',
  'low': '-',
  'front': '-',
  'back': '+',
  'tense': '+',
  'pharyngeal': '+',
  'atr': '+',
  'rtr': '-',
  'voice': '+',
  'spread_glottis': '-',
  'constricted_glottis': '-',
  'glottalic_suction': '-',
  'velaric_suction': '-',
  'continuant': '+',
  'nasal': '-',
  'strident': '-',
  'lateral': '-',
  'delayed_release': '-'},
 {'syllabic': '-',
  'consonantal': '+',
  'sonorant': '-',
  'approximant': '-',
  'labial': '-',
  'round': '0',
  'protruded': '0',
  'compressed': '0',
  'labiodental': '0',
  'coronal': '+',
  'anterior': '+',
  'distributed': '-',
  'dorsal': '-',
  'high': '0',
  'low': '0',
  'front': '0',
  'back': '0',
  'tense': '0',
  'pharyngeal': '-',
  'atr': '0',
  'rtr': '0',
  'voice': '-',
  'spread_glottis': '-',
  'constricted_glottis': '-',
  'glottalic_suction': '-',
  'velaric_suction': '-',
  'continuant': '-',
  'nasal': '-',
  'strident': '-',
  'lateral': '-',
  'delayed_release': '-'}]
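
The integer and dict representations appear to carry the same information. The sketch below decodes one integer bundle into a dict under an assumed layout: two bits per feature, features packed in the order the dicts list them with syllabic in the highest bits, and the codes 01 for '+', 10 for '-', 00 for '0', and 11 for '+/-'. That layout is inferred from the example values in this documentation, not taken from the library's source, and bundle_to_dict is a hypothetical name.

```python
# Feature names in the order they appear in the dicts above.
FEATURES = (
    'syllabic', 'consonantal', 'sonorant', 'approximant', 'labial',
    'round', 'protruded', 'compressed', 'labiodental', 'coronal',
    'anterior', 'distributed', 'dorsal', 'high', 'low', 'front', 'back',
    'tense', 'pharyngeal', 'atr', 'rtr', 'voice', 'spread_glottis',
    'constricted_glottis', 'glottalic_suction', 'velaric_suction',
    'continuant', 'nasal', 'strident', 'lateral', 'delayed_release',
)

# Assumed two-bit codes, inferred from the documented example values.
SYMBOLS = {0b00: '0', 0b01: '+', 0b10: '-', 0b11: '+/-'}


def bundle_to_dict(bundle):
    """Decode one integer bundle into a {feature: symbol} dict (sketch)."""
    last = len(FEATURES) - 1
    return {
        # The first feature sits in the highest two bits, the last in
        # the lowest two, so the shift shrinks as the index grows.
        name: SYMBOLS[(bundle >> (2 * (last - i))) & 0b11]
        for i, name in enumerate(FEATURES)
    }


# 't', the last element of ipa_to_features('mut') in the examples above.
t = bundle_to_dict(2783230754502126250)
print(t['consonantal'], t['voice'], t['round'])
```

Under these assumptions, decoding the integer for 't' reproduces the third dict shown above.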

New in version 0.4.1.

abydos.phones.get_feature(vector, feature)

Get a feature vector.

This returns a list of ints, equal in length to the vector input, representing presence/absence/neutrality with respect to a particular phonetic feature.

Parameters
  • vector (list) -- A tuple or list of ints representing the phonetic features of a phone or series of phones (such as is returned by the ipa_to_features function)

  • feature (str) --

    A feature name from the set:

    • syllabic

    • consonantal

    • sonorant

    • approximant

    • labial

    • round

    • protruded

    • compressed

    • labiodental

    • coronal

    • anterior

    • distributed

    • dorsal

    • high

    • low

    • front

    • back

    • tense

    • pharyngeal

    • atr

    • rtr

    • voice

    • spread_glottis

    • constricted_glottis

    • glottalic_suction

    • velaric_suction

    • continuant

    • nasal

    • strident

    • lateral

    • delayed_release

Returns

A list indicating presence/absence/neutrality with respect to the feature

Return type

list of ints

Raises

AttributeError -- feature must be one of ...

Examples

>>> tails = ipa_to_features('telz')
>>> get_feature(tails, 'consonantal')
[1, -1, 1, 1]
>>> get_feature(tails, 'sonorant')
[-1, 1, 1, -1]
>>> get_feature(tails, 'nasal')
[-1, -1, -1, -1]
>>> get_feature(tails, 'coronal')
[1, -1, 1, 1]
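
One simple use of these vectors is a rough consonant/vowel skeleton. The sketch below works on the list that get_feature() returns for 'consonantal', hardcoded here from the example above so that it runs without abydos. The name cv_skeleton is hypothetical, and labelling every [-consonantal] phone as 'V' is a simplification that lumps glides in with vowels.

```python
def cv_skeleton(consonantal_values):
    """Map get_feature(..., 'consonantal') output to a C/V string.

    Hypothetical helper: 1 becomes 'C', -1 becomes 'V', and any other
    value (such as 0 for a neutral feature) becomes '?'.
    """
    labels = {1: 'C', -1: 'V'}
    return ''.join(labels.get(value, '?') for value in consonantal_values)


# Output of get_feature(tails, 'consonantal') from the examples above.
print(cv_skeleton([1, -1, 1, 1]))  # CVCC
```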

New in version 0.1.0.

abydos.phones.cmp_features(feat1, feat2, weights=None)

Compare features.

This returns a number in the range [0, 1] representing the similarity of two feature bundles, with one exception:

If either bundle is negative (an unknown value), -1 is returned.

If the bundles are identical, 1 is returned.

If they are inverses of one another, 0 is returned.

Otherwise, a float representing their similarity is returned.

Parameters
  • feat1 (int) -- A feature bundle

  • feat2 (int) -- A feature bundle

  • weights (None or list or tuple or dict) -- If None, all features are of equal significance and a similarity based on a simple normalized Hamming distance between the features is calculated. If a list or tuple of numeric values is supplied, the values are interpreted as the weights for each feature, in the order of the features listed in _FEATURE_MASK. If a dict is supplied, its keys should match keys in _FEATURE_MASK, and each value is the weight assigned to that feature. In all cases, features with no weight supplied are assigned a weight of 0 and are omitted from the comparison.

Returns

A comparison of the feature bundles

Return type

float

Examples

>>> cmp_features(ipa_to_features('l')[0], ipa_to_features('l')[0])
1.0
>>> cmp_features(ipa_to_features('l')[0], ipa_to_features('n')[0])
0.8709677419354839
>>> cmp_features(ipa_to_features('l')[0], ipa_to_features('z')[0])
0.8709677419354839
>>> cmp_features(ipa_to_features('l')[0], ipa_to_features('i')[0])
0.564516129032258
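
The unweighted case described above can be sketched as a bit count over the two integers. This is an illustration under stated assumptions, not the library's implementation: it assumes each of the 31 features occupies two bits of the bundle, ignores the weights parameter entirely, and uses the hypothetical name cmp_features_sketch.

```python
N_FEATURES = 31  # number of feature names listed for get_feature()


def cmp_features_sketch(feat1, feat2):
    """Unweighted similarity of two integer feature bundles (sketch)."""
    if feat1 < 0 or feat2 < 0:
        return -1.0  # unknown value, as documented above
    if feat1 == feat2:
        return 1.0
    # Each feature is assumed to occupy two bits, so count the bits
    # that differ and normalize by the total number of bits.
    differing_bits = bin(feat1 ^ feat2).count('1')
    return 1.0 - differing_bits / (2 * N_FEATURES)


# 't' and 'ɾ', the third phones of the two 'international'
# transcriptions in the package-level example.
print(cmp_features_sketch(2783230754502126250, 2711173160463935914))
```

For this pair the two integers differ in 6 of 62 bits, which gives roughly 0.903 and agrees with the third value in the package-level example.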

New in version 0.1.0.

Changed in version 0.4.1: Added weights parameter for modifiable feature weighting