From the NannyMUD documentation
2001-09-07
NAME
soundex - Compute a soundex index.

LOCATION
/obj/soundex.c

AVAILABILITY
All objects can use this functionality.

SYNTAX
string soundex(string word)

DESCRIPTION
The function returns a kind of soundex coding of 'word'. The original soundex algorithm is described in D. Knuth's "The Art of Computer Programming, volume 3". It is a hash function for American surnames that gives good hash table behaviour.

The soundex coding is a way of using phonetic similarities to group words. It can be used when searching, to find candidates when a direct hit doesn't occur. In NannyMUD, a modified soundex algorithm is used to find possible topics in the xdoc system when there is no direct hit. This mostly occurs when the user makes a typo. It is quite possible that the soundex method isn't the best; feel free to suggest a better one.

The original soundex algorithm compresses the word to a code consisting of an initial letter followed by three digits. The first letter of the word is used as the first character of the code. The rest of the word is coded according to the following rules:

+ A E I O U Y H W    Not coded
+ B F P V            --> 1
+ C G J K Q S X Z    --> 2
+ D T                --> 3
+ L                  --> 4
+ M N                --> 5
+ R                  --> 6
+ Double consonants are treated as one.
+ Adjacent consonants from the same code group are treated as one.
+ Abbreviated prefixes are spelt in full (ST -> saint).
+ Apostrophes and hyphens are ignored.

The algorithm in NannyMUD has the following ad hoc modifications:

+ Before coding, the word is reduced to a sorted list of unique letters.
+ The characters 'åäöü_' are not coded.
+ All letters, including the first, are coded.
+ Punctuation characters such as ;,:. are coded as "7".
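
The rules above can be sketched in a few lines. The example below is written in Python rather than LPC and is only an illustration, not the code in /obj/soundex.c. The names classic_soundex and modified_soundex are made up for this sketch. Several details are assumptions that the text above does not settle: short codes are padded with zeros, H and W do not separate two letters of the same group while vowels do, prefix expansion (ST -> saint) is left out, and the modified variant merges equal neighbouring digits and has no length limit.

```python
# Digit groups from the table above; letters not listed here are not coded.
GROUPS = {"BFPV": "1", "CGJKQSXZ": "2", "DT": "3", "L": "4", "MN": "5", "R": "6"}
CODE = {letter: digit for letters, digit in GROUPS.items() for letter in letters}


def classic_soundex(word):
    """Initial letter plus three digits, following the original rules."""
    # Keeping only letters drops apostrophes and hyphens.
    letters = [c for c in word.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]
    previous = CODE.get(letters[0], "")
    for letter in letters[1:]:
        digit = CODE.get(letter, "")
        if digit and digit != previous:
            result += digit
        # Assumption: H and W do not separate equal codes, vowels do.
        if letter not in "HW":
            previous = digit
    # Assumption: pad with zeros, then cut to letter + three digits.
    return (result + "000")[:4]


def modified_soundex(word):
    """Rough sketch of the NannyMUD modifications; details are guesses."""
    digits = ""
    # Reduce the word to a sorted list of unique characters.
    for ch in sorted(set(word.upper())):
        if ch in "ÅÄÖÜ_":
            continue  # these characters are not coded
        if ch in ";,:.":
            digit = "7"  # punctuation is coded as 7
        else:
            digit = CODE.get(ch, "")  # every letter is coded, also the first
        # Assumption: equal neighbouring digits are merged.
        if digit and not digits.endswith(digit):
            digits += digit
    return digits
```

Under these assumptions, classic_soundex("Robert") and classic_soundex("Rupert") both give "R163". Because the modified variant sorts the letters first, a transposition typo such as "suondex" gets the same code as "soundex", which is the kind of near miss the xdoc lookup is meant to catch.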