Metaphone

By | September 18, 2017

Metaphone is a more modern phonetic algorithm for matching words based on their English pronunciation, and is a significant advance on Soundex.

There have since been several iterations of the Metaphone algorithm, including Double Metaphone and Metaphone 3.

As with the Soundex example, I will focus on real examples of using the Metaphone algorithm (see the links below for background information), looking at the practical results it achieves across some of the more popular types of data.

Metaphone and First Names

Here are some examples of the Metaphone algorithm used with first names.

Metaphone(‘Steven’) and Metaphone(‘Stephen’) produce the same phonetic key, ‘STFN’, which indicates a valid match.

Metaphone(‘Matthew’) and Metaphone(‘Matt’) produce different phonetic keys, ‘M0’ and ‘MT’ respectively (the ‘0’ stands for the ‘th’ sound).

Metaphone(‘Shaun’) and Metaphone(‘Sean’) produce different phonetic keys, ‘XN’ and ‘SN’ respectively (the ‘X’ stands for the ‘sh’ sound).

You will see that in the examples above the more sophisticated Metaphone algorithm does not treat ‘Matt’ and ‘Matthew’ as the same name, so it does not assign them the same phonetic key.

The same goes for ‘Sean’ and ‘Shaun’.

Interestingly, for these examples the less sophisticated Soundex function does assign the same phonetic key to each pair, so they would be considered potential matches.
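To make the Soundex side of this comparison concrete, here is a minimal sketch of classic American Soundex in Python. The function name soundex is my own, and the keys it returns are zero-padded to four characters, so they may be formatted differently from the output of the Soundex tool used for this article.

```python
def soundex(name: str) -> str:
    """Classic American Soundex: first letter plus three digits, zero-padded."""
    # Build the letter -> digit table; vowels, H, W and Y have no digit.
    codes = {}
    for group, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
        for letter in group:
            codes[letter] = digit

    letters = [ch for ch in name.upper() if ch.isalpha()]
    if not letters:
        return ""

    key = letters[0]
    previous = codes.get(letters[0], "")
    for ch in letters[1:]:
        digit = codes.get(ch, "")
        # Only record a digit when it differs from the one just before it.
        if digit and digit != previous:
            key += digit
        # H and W do not separate letters with the same code; vowels do.
        if ch not in "HW":
            previous = digit
    # Pad with zeros and cut to four characters.
    return (key + "000")[:4]


print(soundex("Matt"), soundex("Matthew"))  # M300 M300
print(soundex("Sean"), soundex("Shaun"))    # S500 S500
```

Because the vowels, the ‘h’ and the ‘w’ carry no digit, ‘Matthew’ collapses to the same key as ‘Matt’, which is exactly the looser behaviour described above.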

Metaphone and Surnames

Let's have a look at some common surnames using the Metaphone function and compare the results with those we got from Soundex.

Surname      Metaphone Key   Soundex Key
Jones        JN              J52
Johnson      JNSN            J52
Peters       PTR             P382
Peterson     PTRSN           P382
Wood         WT              W3
Woods        WT              W32
McDonald     MKTNLT          M235
MacDonald    MKTNLT          M235
Merton       MRTN            M835
Martin       MRTN            M835

 

In practice you can see that the Metaphone algorithm is stricter in creating phonetic keys, and it looks at the entire word rather than only the leading part that Soundex uses for its key (classic Soundex keeps the first letter plus at most three digits). Even so, both algorithms still suggest some potential matches incorrectly, such as ‘Merton’ and ‘Martin’.

The Metaphone algorithm generates fewer potential matches, as it correctly sees that ‘Jones’ and ‘Johnson’ are different and that ‘Peters’ and ‘Peterson’ are different, whilst accepting that ‘Wood’ and ‘Woods’ could be a potential match.

So it would appear that the Metaphone algorithm works better than Soundex when used for matching Surnames.
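The comparison above can be reproduced by grouping the surnames on each key. The sketch below does not compute any keys itself; it simply reuses the values from the table, and the names KEYS and candidate_groups are my own for illustration.

```python
from collections import defaultdict

# Keys copied from the table above: surname -> (Metaphone key, Soundex key).
KEYS = {
    "Jones": ("JN", "J52"),
    "Johnson": ("JNSN", "J52"),
    "Peters": ("PTR", "P382"),
    "Peterson": ("PTRSN", "P382"),
    "Wood": ("WT", "W3"),
    "Woods": ("WT", "W32"),
    "McDonald": ("MKTNLT", "M235"),
    "MacDonald": ("MKTNLT", "M235"),
    "Merton": ("MRTN", "M835"),
    "Martin": ("MRTN", "M835"),
}


def candidate_groups(column: int) -> list:
    """Return the groups of surnames that share a key.

    column 0 uses the Metaphone key, column 1 the Soundex key.
    """
    groups = defaultdict(list)
    for surname, keys in KEYS.items():
        groups[keys[column]].append(surname)
    # A key only suggests a match when more than one surname has it.
    return [names for names in groups.values() if len(names) > 1]


print(candidate_groups(0))
# [['Wood', 'Woods'], ['McDonald', 'MacDonald'], ['Merton', 'Martin']]
print(candidate_groups(1))
# [['Jones', 'Johnson'], ['Peters', 'Peterson'], ['McDonald', 'MacDonald'], ['Merton', 'Martin']]
```

Every group is only a list of candidates that still has to be validated, which is why the ‘Merton’ and ‘Martin’ group shows up for both algorithms.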

Metaphone and Company Names

This is a lot more complex. The problem is that a company name is usually made up of multiple words.

As with the Soundex algorithm, the Metaphone algorithm was designed to work with words, not phrases, so using it to match company names is far from perfect.

Here are a few examples to demonstrate this:

Metaphone(‘Ford Motor Company’) = ‘FRTMTRKMPN’

Metaphone(‘Ford Motors’) = ‘FRTMTR’

Metaphone(‘Walmart Stores’) = ‘WLMRTSTR’

Metaphone(‘Wal-mart’) = ‘WLMRT’

Metaphone(‘Royal British Legion’) = ‘RYLBRTXLJN’

Metaphone(‘British Royal Legion’) = ‘BRTXRYLJN’

To the human eye each pair obviously refers to the same organisation, but the Metaphone function produces different keys for them, so it cannot identify them as valid matches.
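One common workaround for the word-order problem is to encode each word separately and compare the sorted list of word keys. The sketch below illustrates the idea only: stand_in_key is a deliberately crude placeholder of my own, not Metaphone, and in practice you would pass your real Metaphone function as key_func.

```python
import re


def stand_in_key(word: str) -> str:
    """Placeholder phonetic key (NOT Metaphone): first letter plus later consonants."""
    word = word.upper()
    return word[:1] + re.sub(r"[AEIOU]", "", word[1:])


def company_key(name: str, key_func=stand_in_key) -> tuple:
    """Encode each word on its own, then sort the keys so word order is ignored."""
    words = re.findall(r"[A-Za-z]+", name)
    return tuple(sorted(key_func(word) for word in words))


# Same words in a different order now give the same key.
print(company_key("Royal British Legion") == company_key("British Royal Legion"))  # True

# Extra or missing words still produce different keys.
print(company_key("Ford Motor Company") == company_key("Ford Motors"))  # False
```

This only addresses word order. Names that differ by extra words, such as ‘Ford Motor Company’ and ‘Ford Motors’, still need a different approach.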

Does Metaphone help find Typing Errors?

This is a commonly asked question, and the answer is: sometimes. It really depends on where the error falls. Metaphone largely ignores vowels, so in most cases, if the consonants are still in the same sequence, it will provide some assistance. It does not always help, though: for instance, Metaphone(‘Graphic’) = ‘KRFK’ whereas Metaphone(‘Grpahic’) = ‘KRPHK’.

Edit Distance Algorithms are typically better at handling this kind of typo.
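As an illustration, here is a minimal sketch of one edit distance algorithm, the Levenshtein distance, which counts the single-character insertions, deletions and substitutions needed to turn one string into another. The function name edit_distance is my own.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b, computed one row at a time."""
    # previous[j] is the distance between the first i-1 characters of a
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


print(edit_distance("Graphic", "Grpahic"))  # 2
```

The swapped ‘a’ and ‘p’ cost two edits here, a small distance for a seven-letter word, so the pair would still be flagged as a likely typo even though the Metaphone keys differ.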

Summary for using the Metaphone Function

Metaphone, as with most phonetic algorithms, can be very useful when matching single words, but struggles to add value when a whole phrase needs to be matched.

But as with all fuzzy matching, there is always a trade-off between the quantity of potential matches generated and the work required to validate them. You can see from the Soundex examples that Soundex will suggest many more potential matches than Metaphone, and in some cases it suggests valid matches that are overlooked by the more sophisticated Metaphone algorithm, but the cost is the extra workload required to validate these match candidates.

Useful Links

Metaphone in Wikipedia

 
