05 Mar, 2010, ATT_Turan wrote in the 41st comment:
Votes: 0
Unless I'm really missing something, lines 10 and 17 are incorrect; it should be "an honourable" and "an AAA". The first is because it was spelled in the British style and your exception list only has it in the American no-U spelling, which is an easy fix. I have no idea why "AAA" is in the exception list, unless you're presuming that the letters are not being pronounced but rather being said "triple A" or something.
06 Mar, 2010, Tyche wrote in the 42nd comment:
Votes: 0
ATT_Turan said:
Unless I'm really missing something, lines 10 and 17 are incorrect; it should be "an honourable" and "an AAA". The first is because it was spelled in the British style and your exception list only has it in the American no-U spelling, which is an easy fix. I have no idea why "AAA" is in the exception list, unless you're presuming that the letters are not being pronounced but rather being said "triple A" or something.


Correct. The dictionary doesn't contain British spellings and AAA is pronounced triple-A. I don't remember whether I included them in the test to illustrate that or not.
06 Mar, 2010, Tyche wrote in the 43rd comment:
Votes: 0
flumpy said:
nice work, tyche! useful :)


Thanks. I give credit to Eiz over at mudlab for pointing me to the dictionary and planting the idea. :-)
06 Mar, 2010, Idealiad wrote in the 44th comment:
Votes: 0
flumpy said:
nice work, tyche! useful :)


Agreed, great post Tyche.
06 Mar, 2010, flumpy wrote in the 45th comment:
Votes: 0
Tbh you would say 'a triple-A battery' in most cases, so, cool. I plugged the regex, logic and files into Groovy mud, hope you don't mind, Tyche?

I'll put your name somewhere in there to give credit if you like :)

M
06 Mar, 2010, ATT_Turan wrote in the 46th comment:
Votes: 0
True, and I suppose I even have mental precedent, as typing something like "an R.S.V.P." looks correct to me; I suppose I just didn't see "AAA" and think "triple-A battery" without any context.

That's a very useful and concise little snippet - thanks, Tyche.
06 Mar, 2010, Tyche wrote in the 47th comment:
Votes: 0
flumpy said:
Tbh you would say 'a triple-A battery' in most cases, so, cool. I plugged the regex, logic and files into Groovy mud, hope you don't mind, Tyche?


Not at all. That's why I posted.

Acronyms are always problematic….
Both a AAA battery and a AAA member (American Automobile Association) are pronounced 'triple-A'.
OTOH, a AA battery and an AA member (Alcoholics Anonymous) are pronounced differently: 'double-A' and 'ay ay'.

I would guess if your theme is something like "global bureaucracy" and incorporates many acronyms you're going to have more problems. ;-)

Edit: on second thought it could be "a AA battery" or "an AA battery" depending on what sort of battery, electrical or anti-aircraft.
06 Mar, 2010, Cratylus wrote in the 48th comment:
Votes: 0
Tyche said:
I would guess if your theme is something like "global bureaucracy" and incorporates many acronyms you're going to have more problems. ;-)


Dibs on "ObaMUD"
06 Mar, 2010, Tyche wrote in the 49th comment:
Votes: 0
Cratylus said:
Dibs on "ObaMUD"


I'm not sure that permapoverty will go over well with the player base. It's a dark theme. I think players want escapism.
30 Apr, 2010, Runter wrote in the 50th comment:
Votes: 0
@Tyche Little bit of a necro, but I plan on using the data you mined, and wanted to say thanks for this post. It's going to be useful for me.
30 Apr, 2010, Runter wrote in the 51st comment:
Votes: 0
Also this:

http://deveiate.org/projects/Linguistics...

Fantastic Ruby library that does this and other things for you. Credit goes to Twisol for this find.
30 Apr, 2010, flumpy wrote in the 52nd comment:
Votes: 0
Hmmnh, Tyche's solution doesn't seem to work for "sheep". It comes out as "an sheep".

That ruby lib looks good, but I can't find a Java equivalent :(
30 Apr, 2010, Tyche wrote in the 53rd comment:
Votes: 0
flumpy said:
Hmmnh, Tyche's solution doesn't seem to work for "sheep". It comes out as "an sheep".


Hmm, there's an entry in the consonants exception table for SH, ehs ech, which should be removed or changed to SH$. Good catch.
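
To illustrate why anchoring that entry matters, here is a minimal sketch. The two patterns and the helper article_for are hypothetical stand-ins, not the actual exception table, and they assume each entry is applied as a regex anchored at the start of the word. The sketch uses \z rather than $ because in Ruby $ matches at the end of any line, not only at the end of the string.

```ruby
# Hypothetical stand-ins for one exception-table entry; not Tyche's actual data.
UNANCHORED_SH = /\ASH/i    # fires on any word that merely starts with "sh"
ANCHORED_SH   = /\ASH\z/i  # fires only when the whole word is "SH"

# Returns "an" when the exception pattern matches, otherwise "a".
def article_for(word, exception)
  word.match?(exception) ? "an" : "a"
end

puts article_for("sheep", UNANCHORED_SH) # the misfire reported above
puts article_for("sheep", ANCHORED_SH)   # anchored entry leaves "sheep" alone
puts article_for("SH", ANCHORED_SH)      # the bare abbreviation still matches
```

With the unanchored form, every ordinary word beginning with "sh" is treated as a spelled-out abbreviation, which is how "an sheep" can come about.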
01 May, 2010, flumpy wrote in the 54th comment:
Votes: 0
bah ignore me, seems to be ok
03 May, 2010, Runter wrote in the 55th comment:
Votes: 0
Taken from Ruby linguistics library.

# This pattern matches strings of capitals starting with a "vowel-sound"
# consonant followed by another consonant, and which are not likely
# to be real words (oh, all right then, it's just magic!)
A_abbrev = %{
(?! FJO | [HLMNS]Y. | RY[EO] | SQU
| ( F[LR]? | [HL] | MN? | N | RH? | S[CHKLMNPTVW]? | X(YL)?) [AEIOU])
[FHLMNRSX][A-Z]
}

# This pattern codes the beginnings of all English words beginning with a
# 'y' followed by a consonant. Any other y-consonant prefix therefore
# implies an abbreviation.
A_y_cons = 'y(b[lor]|cl[ea]|fere|gg|p[ios]|rou|tt)'

# Exceptions to exceptions
A_explicit_an = matchgroup( "euler", "hour(?!i)", "heir", "honest", "hono" )


Meanwhile, back at the Hall of Justice.

case word

  # Handle special cases
  when /^(#{A_explicit_an})/i
    return "an #{word}"

  # Handle abbreviations
  when /^(#{A_abbrev})/x
    return "an #{word}"
  when /^[aefhilmnorsx][.-]/i
    return "an #{word}"
  when /^[a-z][.-]/i
    return "a #{word}"

  # Handle consonants
  when /^[^aeiouy]/i
    return "a #{word}"

  # Handle special vowel-forms
  when /^e[uw]/i
    return "a #{word}"
  when /^onc?e\b/i
    return "a #{word}"
  when /^uni([^nmd]|mo)/i
    return "a #{word}"
  when /^u[bcfhjkqrst][aeiou]/i
    return "a #{word}"

  # Handle vowels
  when /^[aeiou]/i
    return "an #{word}"

  # Handle y… (before certain consonants implies (unnaturalized) "i.." sound)
  when /^(#{A_y_cons})/i
    return "an #{word}"

  # Otherwise, guess "a"
  else
    return "a #{word}"
end
03 May, 2010, Runter wrote in the 56th comment:
Votes: 0
For completeness.

### Wrap one or more parts in a non-capturing alternation Regexp
def self::matchgroup( *parts )
  re = parts.flatten.join("|")
  "(?:#{re})"
end
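
For anyone who wants to try the fragments above outside the library, here is a rough, self-contained sketch that stitches them into a single method. The method name indef_article, the upper-case constant names and the top-level matchgroup are my own assumptions for illustration; this is not the library's public API, and the patterns are simply copied from the excerpts quoted in this thread.

```ruby
# Sketch assembled from the fragments quoted in this thread.
# indef_article and the constant names are hypothetical, not the library's API.

# Wrap one or more parts in a non-capturing alternation.
def matchgroup(*parts)
  "(?:#{parts.flatten.join('|')})"
end

# Capital-letter strings that start with a "vowel-sound" consonant and are
# unlikely to be real words (whitespace is ignored because of the /x flag).
A_ABBREV = %q{
  (?! FJO | [HLMNS]Y. | RY[EO] | SQU
    | ( F[LR]? | [HL] | MN? | N | RH? | S[CHKLMNPTVW]? | X(YL)?) [AEIOU])
  [FHLMNRSX][A-Z]
}

# Beginnings of English words that start with 'y' followed by a consonant.
A_Y_CONS = 'y(b[lor]|cl[ea]|fere|gg|p[ios]|rou|tt)'

# Words with a silent or vowel-sounding first letter.
A_EXPLICIT_AN = matchgroup("euler", "hour(?!i)", "heir", "honest", "hono")

def indef_article(word)
  case word
  when /^(#{A_EXPLICIT_AN})/i   then "an #{word}"  # special cases
  when /^(#{A_ABBREV})/x        then "an #{word}"  # upper-case abbreviations
  when /^[aefhilmnorsx][.-]/i   then "an #{word}"  # dotted abbreviations
  when /^[a-z][.-]/i            then "a #{word}"
  when /^[^aeiouy]/i            then "a #{word}"   # consonants
  when /^e[uw]/i                then "a #{word}"   # special vowel forms
  when /^onc?e\b/i              then "a #{word}"
  when /^uni([^nmd]|mo)/i       then "a #{word}"
  when /^u[bcfhjkqrst][aeiou]/i then "a #{word}"
  when /^[aeiou]/i              then "an #{word}"  # vowels
  when /^(#{A_Y_CONS})/i        then "an #{word}"  # y before certain consonants
  else "a #{word}"                                 # otherwise, guess "a"
  end
end

%w[sheep hour honourable unicorn FBI MUD R.S.V.P. AAA].each do |w|
  puts indef_article(w)
end
```

Two things stand out when running it against the words discussed earlier: the "hono" prefix also covers the British spelling "honourable", and since nothing in these rules special-cases repeated letters, "AAA" falls through to the plain vowel rule and gets "an".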