From the NannyMUD documentation

LAST CHANGE

2001-09-07

FUNCTION


NAME

        soundex - Compute a soundex index.

LOCATION

        /obj/soundex.c

AVAILABILITY

	All objects can use this functionality.

SYNTAX

        string soundex(string word)

DESCRIPTION

	The function returns a kind of soundex coding of 'word'. The
	original soundex algorithm is described in D. Knuth's "The Art
	of Computer Programming", volume 3. It is a hash function for
	American surnames that gives good hash table behaviour.

	The soundex coding is a way of using phonetic similarities to
	group words. When a search yields no direct hit, words with
	the same code can be offered as candidates.

	In NannyMUD, a modified soundex algorithm is used to find
	possible topics in the xdoc system when there is no direct
	hit. This will mostly occur when the user makes a typo. It is
	quite possible that the soundex method isn't the best; feel
	free to suggest a better one.
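
	The fallback lookup described above can be sketched as
	follows. This is an illustration in Python, not the xdoc
	code. The names build_index(), find_topics() and toy_code()
	are hypothetical, the topic list is made up, and toy_code()
	is only a stand-in for soundex().

```python
from collections import defaultdict


def build_index(topics, code_of):
    """Group topic names by the code that code_of() assigns them."""
    index = defaultdict(list)
    for topic in topics:
        index[code_of(topic)].append(topic)
    return index


def find_topics(query, topics, index, code_of):
    """Return the exact topic if present, otherwise same-code candidates."""
    if query in topics:
        return [query]
    return list(index.get(code_of(query), []))


def toy_code(word):
    # Stand-in for soundex(): the sorted unique consonants of the word.
    # This is NOT the real coding, only something to group words by.
    return "".join(sorted(set(word.lower()) - set("aeiouyhw")))
```

	With the made-up topics "soundex", "sprintf" and
	"move_object", a query for "sundex" has no direct hit but
	shares a code with "soundex", so that topic is returned as a
	candidate.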

	The original soundex algorithm compresses the word to a code
	with an initial letter followed by three digits. The first
	letter of the word is used for the first character of the
	code. The rest of the word is coded according to the following
	rules: 
	+ A E I O U Y H W     Not coded
	+ B F P V         --> 1
	+ C G J K Q S X Z --> 2
	+ D T             --> 3
	+ L               --> 4
	+ M N             --> 5
	+ R               --> 6
	+ Double consonants are treated as one.
	+ Adjacent consonants from the same code group are treated as
	  one. 
	+ Abbreviated prefixes are spelt in full (ST -> saint).
	+ Apostrophes and hyphens are ignored.
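
	The rules above can be sketched as follows. This is a minimal
	illustration in Python, not the code in /obj/soundex.c. The
	name classic_soundex() is hypothetical. Two assumptions are
	made: short codes are padded with zeros to three digits, and
	adjacency is taken literally, so any uncoded letter between
	two consonants separates them. Prefix expansion (ST -> saint)
	is left out.

```python
# Letter groups and their digits, as listed in the rules.
CODES = {}
for letters, digit in [("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")]:
    for ch in letters:
        CODES[ch] = digit


def classic_soundex(word):
    # Keep only A-Z; this drops apostrophes and hyphens.
    letters = [c for c in word.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    result = letters[0]                    # first letter is kept as is
    prev = CODES.get(letters[0], "")       # its group still counts for adjacency
    for c in letters[1:]:
        digit = CODES.get(c, "")           # "" for the uncoded letters
        if digit and digit != prev:        # skip doubles and same-group neighbours
            result += digit
        prev = digit
    # Assumption: pad with zeros, then cut to one letter plus three digits.
    return (result + "000")[:4]
```

	Under these assumptions "Robert" and "Rupert" both give
	"R163", and "Lloyd" gives "L300".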

	The algorithm in NannyMUD has the following, ad hoc,
	modifications:
	+ Before coding, the word is reduced to a sorted list of
	  unique letters. 
	+ We don't code the characters 'åäöü_'.
	+ All letters, including the first, are coded.
	+ Punctuation characters such as ;,:. are coded as "7".
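
	One possible reading of these modifications is sketched
	below in Python. It is not the code in /obj/soundex.c, and
	the name modified_soundex_sketch() is hypothetical. Several
	details are assumptions: the word is lowercased first, every
	character without a code (vowels, 'åäöü_', anything else) is
	simply skipped, neighbours with the same code are not
	merged, and the result is not cut to a fixed length.

```python
# Code groups from the original rules, plus "7" for punctuation.
GROUPS = [("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
          ("l", "4"), ("mn", "5"), ("r", "6"), (";,:.", "7")]
CODES = {ch: digit for chars, digit in GROUPS for ch in chars}


def modified_soundex_sketch(word):
    # Reduce the word to a sorted list of unique characters.
    unique_sorted = sorted(set(word.lower()))
    # Code every character, including the first; characters with no
    # code contribute nothing (assumption).
    return "".join(CODES.get(ch, "") for ch in unique_sorted)
```

	Because the letters are sorted and made unique before
	coding, transposed or doubled letters do not change the
	result: under these assumptions "help", "hepl" and "hellp"
	all give "41".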