03 May, 2009, Iota wrote in the 1st comment:
Votes: 0
I might need a way to programmatically turn an English noun's singular form into its plural. Does anyone know a way to do this?
Right now I'm thinking of having a function that compares a given word to a table of known exceptions, then applies a specific rule if the word matches one and a general rule if it doesn't.
03 May, 2009, Grimble wrote in the 2nd comment:
Votes: 0
I think you'll find there are too many exceptions to effectively make it rule based. Consider the plural of mouse (mice), goose (geese), moose (moose), wolf (wolves), man (men), etc.

It may be easier to manually build a mapping table from singular to plural (it looks like you already know the noun's you want to map, or have a way of identifying them), and periodically update it as new noun's are added to your game text. The default behavior can be to throw an 's' on the end, or an apostrophe if it ends in 's' already, or replace an ending of 'y' with 'ies'.
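
A minimal C sketch of the table-plus-default idea described above. The table contents, the names `irregular_plurals` and `plural_lookup`, and the caller-supplied output buffer are assumptions made for illustration; the fallback branch follows the three defaults from the post literally.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical hand-maintained table; extend it as new nouns show up. */
struct plural_entry {
    const char *singular;
    const char *plural;
};

static const struct plural_entry irregular_plurals[] = {
    { "mouse", "mice" }, { "goose", "geese" }, { "moose", "moose" },
    { "wolf", "wolves" }, { "man", "men" },
};

/* Writes the plural of `word` into `out`, which holds `cap` bytes. */
void plural_lookup(const char *word, char *out, size_t cap)
{
    size_t i, len;

    assert(word != NULL && out != NULL && cap > 0);
    len = strlen(word);

    /* 1. Exact match in the exception table wins. */
    for (i = 0; i < sizeof irregular_plurals / sizeof irregular_plurals[0]; i++) {
        if (strcmp(word, irregular_plurals[i].singular) == 0) {
            snprintf(out, cap, "%s", irregular_plurals[i].plural);
            return;
        }
    }

    /* 2. Defaults as listed in the post. The post suggests an apostrophe
     *    after a trailing 's'; many such nouns take "es" instead. */
    if (len > 0 && word[len - 1] == 's')
        snprintf(out, cap, "%s'", word);
    else if (len > 0 && word[len - 1] == 'y')
        snprintf(out, cap, "%.*sies", (int)(len - 1), word);
    else
        snprintf(out, cap, "%ss", word);
}
```

Calling `plural_lookup("goose", buf, sizeof buf)` hits the table, while a word like "sword" falls through to the default branch.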
03 May, 2009, flumpy wrote in the 3rd comment:
Votes: 0
The rule is quite easy, except for (as Grimble says) the exceptions.

http://www.usingenglish.com/weblog/archi...

This is my function for doing it:

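public String pluralize(String word) {
    // Assume anything already ending in "s" is plural (or is handled as an exception elsewhere).
    if (word.endsWith("s")) {
        return word;
    }
    // Consonant + "y" becomes "ies" (canary -> canaries); vowel + "y" falls through (day -> days).
    if (word.matches(".*[^aeiou]y")) {
        return word.replaceAll("y$", "ies");
    }
    // String.matches() has to match the whole word, so the pattern needs a leading ".*".
    if (word.matches(".*fe?")) {
        return word.replaceAll("fe?$", "ves");
    }
    if (word.matches(".*(z|ch|sh|x)")) {
        return word + "es";
    }

    return word + "s";
}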


The exceptions could be given on the object you are trying to get the plural for. So for the MOB "a man", the MudObject could provide the alternative plural "men".

That way you only have to store them when you need them.
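
A rough C sketch of the per-object override idea. The `mud_object` struct and `object_plural` function are hypothetical stand-ins, not taken from any real codebase, and the fallback of appending "s" is only a placeholder for whatever rule function you actually use.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical game object: the plural is stored only when it is needed. */
struct mud_object {
    const char *name;        /* singular form, e.g. "man" */
    const char *plural_name; /* optional override; NULL when the rules suffice */
};

/* Writes the object's plural into `out`, which holds `cap` bytes. */
void object_plural(const struct mud_object *obj, char *out, size_t cap)
{
    assert(obj != NULL && obj->name != NULL && out != NULL && cap > 0);

    if (obj->plural_name != NULL)
        snprintf(out, cap, "%s", obj->plural_name); /* explicit exception */
    else
        snprintf(out, cap, "%ss", obj->name);       /* placeholder general rule */
}
```

An object declared as `{ "man", "men" }` yields "men", while `{ "sword", NULL }` falls back to the rule.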
03 May, 2009, David Haley wrote in the 4th comment:
Votes: 0
Basically Iota, you are correct that you want a list of exceptions that don't follow rules, but then you don't want just one general rule; there are actually several classes of rules for plurals. Flumpy listed a few, probably the big ones. I don't agree that there's no use in having rules because really a huge number of words do follow one rule or another. I wouldn't store the exceptions on the mob because that makes sharing harder; rather, I would have the mob point to some shared word in a dictionary that can be easily recycled. You still only need to store exceptions that you actually need.
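
To make "several classes of rules" concrete, here is a small table-driven sketch in C. The rule list, the `suffix_rule` struct and the `pluralize_by_rule` name are illustrative assumptions; the list is far from complete, and words that follow no rule would still go through an exception dictionary before it.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One rule: if the word ends in `suffix`, drop `strip` chars and append `add`. */
struct suffix_rule {
    const char *suffix;
    size_t strip;
    const char *add;
};

/* Checked in order; the first matching suffix wins. */
static const struct suffix_rule suffix_rules[] = {
    { "ch", 0, "es" }, { "sh", 0, "es" },
    { "s",  0, "es" }, { "x",  0, "es" }, { "z",  0, "es" },
    { "ay", 0, "s"  }, { "ey", 0, "s"  }, { "oy", 0, "s"  }, { "uy", 0, "s" },
    { "y",  1, "ies" },  /* consonant + y, since vowel + y matched above */
    { "fe", 2, "ves" },
    { "",   0, "s"  },   /* catch-all: the empty suffix matches every word */
};

/* Writes the rule-based plural of `word` into `out`, which holds `cap` bytes. */
void pluralize_by_rule(const char *word, char *out, size_t cap)
{
    size_t i, wl;

    assert(word != NULL && out != NULL && cap > 0);
    wl = strlen(word);

    for (i = 0; i < sizeof suffix_rules / sizeof suffix_rules[0]; i++) {
        const struct suffix_rule *r = &suffix_rules[i];
        size_t sl = strlen(r->suffix);

        /* Compare only the tail of the word against the suffix. */
        if (wl >= sl && memcmp(word + wl - sl, r->suffix, sl) == 0) {
            snprintf(out, cap, "%.*s%s", (int)(wl - r->strip), word, r->add);
            return;
        }
    }
}
```

Keeping the rules as data means a new class of plural is one more table row rather than another `else if` branch.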

This whole language business is one of the things that got me interested in starting a codebase from scratch. I'm pretty interested in a markup language for correctly generating text in many languages. In French, for instance, the article you use is a function of the kind of article (definite vs. indefinite), the number and gender of the noun, and the first letter of said noun. For instance, "le chien", "la porte", "l'avion", "les avions". Unfortunately, I started spending more time on the language than the actual codebase. :lol:
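
As a sketch of how those inputs combine, the C function below picks the French definite article only. The `gender` enum and `french_definite` name are assumptions for illustration, the caller passes the noun already in the right number, and elision is decided from plain ASCII vowels alone, so mute "h" and accented initials are not handled.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

enum gender { MASCULINE, FEMININE };

/* Writes definite article + noun into `out`, which holds `cap` bytes.
 * `plural` is nonzero when `noun` is already in its plural form. */
void french_definite(const char *noun, enum gender g, int plural,
                     char *out, size_t cap)
{
    assert(noun != NULL && noun[0] != '\0' && out != NULL && cap > 0);

    if (plural)
        snprintf(out, cap, "les %s", noun);   /* number overrides gender */
    else if (strchr("aeiou", tolower((unsigned char)noun[0])) != NULL)
        snprintf(out, cap, "l'%s", noun);     /* elision before a vowel */
    else
        snprintf(out, cap, "%s %s", g == MASCULINE ? "le" : "la", noun);
}
```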

Quote
it looks like you already know the noun's you want to map

I have to admit that I find some amusement in this in a thread about correctly pluralizing nouns… :wink:
04 May, 2009, Tyche wrote in the 5th comment:
Votes: 0
$ cat lingtest.rb 
#!/bin/ruby
require 'linguistics'
Linguistics::use( :en )

inventory = %w{mouse sword canary sword canary mouse mace
lantern sack amulet arrow arrow shield arrow arrow arrow}
puts "Bubba is carrying " + inventory.en.conjunction + "."

$ ./lingtest.rb
Bubba is carrying five arrows, two mice, two swords, two canaries, a mace, a lantern, a sack, an amulet, and a shield.


Conjunction junction, what's your function?
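
For anyone not on Ruby, the list-joining half of that output is easy to approximate. This is a hand-rolled C sketch, not how the Linguistics gem does it: `join_with_and` is a made-up name, and it expects items that are already counted and pluralized (for example "two mice").

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Joins items as "a", "a and b", or "a, b, and c" into `out` (`cap` bytes).
 * Returns 0 on success, -1 if the result does not fit. */
int join_with_and(const char *const *items, size_t n, char *out, size_t cap)
{
    size_t i, used = 0;

    assert(out != NULL && cap > 0);
    out[0] = '\0';

    for (i = 0; i < n; i++) {
        const char *sep = "";
        int w;

        if (i > 0)
            sep = (n == 2) ? " and " : (i == n - 1) ? ", and " : ", ";

        w = snprintf(out + used, cap - used, "%s%s", sep, items[i]);
        if (w < 0 || (size_t)w >= cap - used)
            return -1; /* truncated or failed */
        used += (size_t)w;
    }
    return 0;
}
```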
04 May, 2009, KaVir wrote in the 6th comment:
Votes: 0
If you're generating plural object names, you'll also need to define which word should be pluralised, so that "a bottle of water" doesn't become "two bottle of waters".
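
One simple way to pick the right word, sketched in C: treat everything before the first " of " as the head noun and pluralize only that. `pluralize_phrase` is a hypothetical name, and appending a bare "s" stands in for a real pluralizer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Writes the plural of `phrase` into `out`, which holds `cap` bytes.
 * Only the part before the first " of " is pluralized. */
void pluralize_phrase(const char *phrase, char *out, size_t cap)
{
    const char *of;
    size_t head;

    assert(phrase != NULL && out != NULL && cap > 0);

    of = strstr(phrase, " of ");
    head = of ? (size_t)(of - phrase) : strlen(phrase);

    /* head noun + "s" (placeholder rule) + the untouched remainder */
    snprintf(out, cap, "%.*ss%s", (int)head, phrase, phrase + head);
}
```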

Personally I store both singular and plural names for objects. I suppose I could generate a default, but it hardly seems worth the hassle - I've always supported the plural form, so every object already defines one.
04 May, 2009, flumpy wrote in the 7th comment:
Votes: 0
that ruby thing is cool

got me thinking there must be a java equivalent, and here it is in case anyone cares:

https://inflector.dev.java.net/


ok i'm going to get rid of my crap code :)

EDIT:

The java and ruby packages are based on the same paper. If the code does the same thing, there are a couple of whoopsies:


import static org.jvnet.inflector.Noun.pluralOf;


assertEquals(pluralOf("knife"), "knives"); //yay
assertEquals(pluralOf("day"), "days"); //yay
assertEquals(pluralOf("calf"), "calves"); // yay
assertEquals(pluralOf("branch"), "branches"); // yay
assertEquals(pluralOf("fox"), "foxes"); // yay
assertEquals(pluralOf("loaf"), "loaves"); //yay
assertEquals(pluralOf("thief"), "thieves"); // << FAILS
assertEquals(pluralOf("man"), "men"); // yay
assertEquals(pluralOf("matrix"), "matrices"); // << FAILS
assertEquals(pluralOf("woman"), "women"); // yay
assertEquals(pluralOf("bottle of water"), "bottles of water"); // yay!


other than that, loving it.
04 May, 2009, elanthis wrote in the 8th comment:
Votes: 0
I don't do anything fancy for Source MUD. Each item/object gets a singular and plural name. I also attach articles to names, so very natural English sentences are generated. If you just create an object blueprint called "longsword", it will guess that the proper indefinite article is "a" and that the plural is "longswords." If you need to deal with something more complex, just type it in. It makes it easy to deal with naturally plural objects too, like glasses or pants. If you use a name with "of" in it, it figures out that the pluralization goes on the word before "of" (bottles of water, pairs of glasses, rolls of paper), or you can just specify it yourself.
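
The article guess can be sketched in a few lines of C. This is not Source MUD's code: `article_for` is a hypothetical helper, and it guesses from spelling alone, so names like "hour" or "unicorn" would need the typed-in override.

```c
#include <assert.h>
#include <string.h>

/* Returns the builder-supplied article if there is one, otherwise guesses
 * "a" or "an" from the first letter of the name. */
const char *article_for(const char *name, const char *override)
{
    assert(name != NULL);

    if (override != NULL)
        return override; /* builder typed it in */
    if (name[0] != '\0' && strchr("aeiouAEIOU", name[0]) != NULL)
        return "an";
    return "a";
}
```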

If you need to pluralize any word typed in by a user, that won't work. If you're trying to pluralize builder-supplied data, though, you either need to find a library that already includes a full table (which will be huge) or use a simple rule-based engine and allow builder-supplied overrides.
04 May, 2009, David Haley wrote in the 9th comment:
Votes: 0
There is a middle ground between a full table and a simple rule engine only allowing builder overrides. I would avoid per-item overrides, actually: I would just have a rule-based engine on top of which you add global overrides for certain words. There's no reason for every builder to have to manually specify goose/geese when you can just stick it into the global handler as an exception.
04 May, 2009, elanthis wrote in the 10th comment:
Votes: 0
Each builder wouldn't specify goose/geese, because they'd just be reusing the single goose blueprint. Recreating the same base item over and over and over again is for idiots using Diku-based MUDs.

There is also a MUD out there intended to help teach Latin that has all kinds of work put into it for dealing with the conjugations, declensions, modes, persons, pluralization, and genders of nouns, verbs, adjectives, adverbs, and so on. Dunno if the authors of that MUD would have anything useful to contribute if you emailed and asked them. (Latin was the Romans' greatest joke, played on the rest of the world for the next 2000+ years… and we all fell for it.)
04 May, 2009, David Haley wrote in the 11th comment:
Votes: 0
It was an example – I was hoping the general problem would be pretty clear and that we wouldn't pick the simplest possible scenario to make the problem go away. :rolleyes: To develop a little bit more, picture some word appearing in many variations, for instance with adjectives decorating it, and with different underlying characteristics so that the same blueprint can't be recycled.
04 May, 2009, David Haley wrote in the 12th comment:
Votes: 0
And by the way, one hardly needs to go to Latin to find a language with difficult grammar. It's not as if the problem is simple for today's languages.
17 May, 2009, Iota wrote in the 13th comment:
Votes: 0
I forgot all about this, but I ended up writing a function that seems to work reasonably well.

/*
Makeplural function by Jessica Chen goodbitster_at_gmail_com.

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.

*/



char *makeplural(char *oldstr)
{
char newword[512/4]={"\0"};
strcpy(newword,oldstr);
char* p=newword, *c;
char input[512/4]={"\0"};
int found=0;
if((c=strstr(newword, "man"))!=NULL && strcmp(oldstr, "human"))
{
sprintf(input, "men");
found=1;
}
else if((c=strstr(newword,"tooth"))!=NULL)
{
sprintf(input, "teeth");
found=1;
}
else if((c=strstr(newword, "fe"))!=NULL || (c=strstr(newword, "ff"))!=NULL)
{
sprintf(input,"ves");
found=1;
}
else if((c=strstr(newword,"foot"))!=NULL)
{
sprintf(input, "feet");
found=1;
}
else if((c=strstr(newword,"ium"))!=NULL)
{
sprintf(input, "ia");
found=1;
}
else /* letter endings */
{
p=newword;
while(*p!='\0')
p++;
p--;
c=p-1;


if (strchr("zxs",*p))
sprintf(++p, "es");
else if(*p=='y' && !strchr("aeiouAEIOU", *(p-1)))
sprintf(p, "ies");
else if((p=strstr(c,"ch"))!=NULL)
sprintf(p+2, "es");
else if((p=strstr(c,"sh"))!=NULL)
sprintf(p+2, "es");
else
sprintf(newword, "%ss",oldstr);
return strdup(newword);


}


if(found)
{
int len=strlen(input);
int len2=strlen(c);
int i=0;
char* tc=c;
for(i=0;i<len2;i++)
tc++;
if (!isalnum(*tc))
{

sprintf(c,"%s",input);
return strdup(newword);

}

}


return strdup(oldstr);

}


It's used like this:
char* tp = makeplural(objname); //if objname is the singular form, the plural form will be strduped into tp.
/* <Use tp here> */
free(tp);
17 May, 2009, elanthis wrote in the 14th comment:
Votes: 0
Why are you using strstr()? That is going to find those substrings even when they're not at the end of the word, so your function would apparently transform "toothbrush" into "teethbrush." Or is that what that funky loop/increment thing at the bottom (why is it a loop and not just tc += len2 ?) is supposed to fix? Not to be a dick, but that code is just very hard to read and poorly structured. :/

Also, as a side note, the GPL is a horrible license to use for anything MUD related, as like 95% of MUDs are GPL-incompatible due to the Diku/ROM/blah licenses.
17 May, 2009, Iota wrote in the 15th comment:
Votes: 0
elanthis said:
Why are you using strstr()? That is going to find those substrings even when they're not at the end of the word, so your function would apparently transform "toothbrush" into "teethbrush." Or is that what that funky loop/increment thing at the bottom (why is it a loop and not just tc += len2 ?) is supposed to fix? Not to be a dick, but that code is just very hard to read and poorly structured. :/

Also, as a side note, the GPL is a horrible license to use for anything MUD related, as like 95% of MUDs are GPL-incompatible due to the Diku/ROM/blah licenses.


Patch welcome.

[Edit: Here is a fixed version of the function that turns "toothbrush" into "toothbrushes" and "footsoldier" into "footsoldiers" instead of "teeth" and "feet", respectively. Thanks for the heads up.]
char *makeplural(char *oldstr)
{
char newword[512/4]={"\0"};
strcpy(newword,oldstr);
char* p=newword, *c;
char input[512/4]={"\0"};
int found=0;
if((c=strstr(newword, "man"))!=NULL && strcmp(oldstr, "human") && !isalnum(*(c+3)))
{
sprintf(input, "men");
found=1;
}
else if((c=strstr(newword,"tooth"))!=NULL && !isalnum(*(c+5)))
{
sprintf(input, "teeth");
found=1;
}
else if((c=strstr(newword, "fe"))!=NULL || (c=strstr(newword, "ff"))!=NULL)
{
sprintf(input,"ves");
found=1;
}
else if((c=strstr(newword,"foot"))!=NULL && !isalnum(*(c+4)))
{
sprintf(input, "feet");
found=1;
}
else if((c=strstr(newword,"ium"))!=NULL)
{
sprintf(input, "ia");
found=1;
}
else /* letter endings */
{
p=newword;
while(*p!='\0')
p++;
p--;
c=p-1;


if (strchr("zxs",*p))
sprintf(++p, "es");
else if(*p=='y' && !strchr("aeiouAEIOU", *(p-1)))
sprintf(p, "ies");
else if((p=strstr(c,"ch"))!=NULL)
sprintf(p+2, "es");
else if((p=strstr(c,"sh"))!=NULL)
sprintf(p+2, "es");
else
sprintf(newword, "%ss",oldstr);
return strdup(newword);


}


if(found)
{
int len=strlen(input);
int len2=strlen(c);
int i=0;
char* tc=c;
tc+=len2;

if (!isalnum(*(tc)))
{

sprintf(c,"%s",input);
return strdup(newword);

}

}


return strdup(oldstr);

}
18 May, 2009, David Haley wrote in the 16th comment:
Votes: 0
strchr is really not what you want to be using. Consider what your "ff" or "fe" rule will do for things like "effect"…

You could make things a lot simpler if the assumptions on function entry were clear. It looks like sometimes you care about checking what happens after the string you matched (the isalnum business) but then other times you don't seem to care. If you knew you were pluralizing one word at a time, it would be easier to write the function; you would just check if the string ends with a given pattern.
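
A minimal C sketch of that "ends with" check, assuming the function is handed exactly one word. `ends_with` is a hypothetical helper name, not something from the code posted earlier in the thread.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if `word` ends with `suffix`, 0 otherwise. */
int ends_with(const char *word, const char *suffix)
{
    size_t wl, sl;

    assert(word != NULL && suffix != NULL);
    wl = strlen(word);
    sl = strlen(suffix);

    /* Compare only the last `sl` characters of the word. */
    return wl >= sl && memcmp(word + wl - sl, suffix, sl) == 0;
}
```

Unlike `strstr("effect", "ff")`, which finds a match in the middle of the word, `ends_with("effect", "ff")` returns 0, and `ends_with("toothbrush", "tooth")` does too.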
18 May, 2009, Iota wrote in the 17th comment:
Votes: 0
Hey, thanks for pointing that out. I fixed the function for those oversights shortly after my last post and also gave it the ability to deal with object names which have "pair of" or "set of" in them.

What should I be using instead of "strchr("aeiouAEIOU", *(p-1))" to check for vowels, though?
18 May, 2009, David Haley wrote in the 18th comment:
Votes: 0
Sorry, I meant "strstr" in my post when I was talking about "ff" and "fe".

The usage of strchr looks ok, although I'd probably have a separate function 'is_vowel' or something like that instead of relying on manually listing the vowels every time you need to check for them. I'd note though that the indenting is a little funny; it makes it difficult to follow what exactly your function is doing. (For instance, you don't indent your 'if' blocks, or even your loop block.)
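
Something along these lines would do for the helper; a sketch only, with `is_vowel` and `ends_in_consonant_y` as illustrative names. The explicit check for '\0' matters because strchr() also matches the string's terminating null character.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Returns 1 for a, e, i, o, u in either case, 0 for anything else. */
int is_vowel(char ch)
{
    return ch != '\0' && strchr("aeiou", tolower((unsigned char)ch)) != NULL;
}

/* Returns 1 when the word ends in a non-vowel followed by 'y',
 * which is the case where "y" becomes "ies". */
int ends_in_consonant_y(const char *word)
{
    size_t len;

    assert(word != NULL);
    len = strlen(word);
    return len >= 2 && word[len - 1] == 'y' && !is_vowel(word[len - 2]);
}
```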