04 Apr, 2011, Chris Bailey wrote in the 1st comment:
Votes: 0
Does anyone know of an app, dsl, or library that can do the following -

twelve hundred and sixty-two plus nine hundred and eighty-four
-> two thousand two hundred forty-six


I've been able to implement most of it myself but I'm having problems with numbers like:

four hundred and eighty-six thousand six hundred forty-two

and would rather not fuss with it anymore.
04 Apr, 2011, Runter wrote in the 2nd comment:
Votes: 0
Heh, well, this is a natural language parsing question. These questions are almost like "Can a machine understand English?" There are some natural language parsers out there. This specific problem isn't that hard, but your examples show an inconsistency that is gonna be a problem. "Twelve hundred" isn't structured regularly, and it's as bad as "a thousand million." It's just more common. So if you want to write something that understands special cases and such, it'll be much harder. If you want to write something that strictly understands numbers written in the regular form ("one thousand, two hundred" rather than "twelve hundred"), it'll be much easier to do.

I'm aware of libraries that can convert numbers to English, but that's quite a bit easier than what you're asking for.
04 Apr, 2011, Chris Bailey wrote in the 3rd comment:
Votes: 0
Yeah, I'm trying to take input variations into account; that is what is complicating it.

With what I have currently…

Twelve hundred and sixty two plus/times/minus/multiplied by/divided by one thousand two hundred sixty two

works just fine. I only run into problems when I try "NUMBER(Four) PLACE(Hundred) NUMBER(Two) PLACE(Thousand) NUMBER(Six)"

My "parser" seems to work on any pattern but that one. But I don't see another way to get the rest to work. So I guess I should just enforce input. =)
04 Apr, 2011, Runter wrote in the 4th comment:
Votes: 0
For that case it should read it as nested if the next place is tiered higher than the previous one.

Thousand is bigger than hundred, so it should retroactively multiply the previous place by thousand and add the terms.
If our number is 402,006:

The first node is number(4) place(100), and the next is number(2) place(1000). Since 1000 is bigger, multiply 1000 by 100 to get 100,000, then add the number fields (with the 2 scaled down by the first place, so 2/100 = 0.02) and merge into a single node: number(4.02) place(100,000). That node is worth 402,000, and the trailing six is added on afterwards.

I think that would probably solve that, but I'm not sure if it will work in all cases.
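
A minimal Ruby sketch of that merge rule, for illustration only. The Node struct and the merge helper are hypothetical names, not something from an existing library, and Rational is used instead of a float so that 4.02 times 100,000 stays exact.

```ruby
# Hypothetical node type: a number attached to a place value.
Node = Struct.new(:number, :place) do
  def value
    number * place
  end
end

# Merge rule from the post: if the next place is higher than the previous
# one, nest the previous node inside it. Returns nil when the rule does
# not apply.
def merge(prev, nxt)
  return nil unless nxt.place > prev.place

  # Scale the second number down by the first place (2 becomes 2/100),
  # add it to the first number, and multiply the two places together.
  Node.new(prev.number + Rational(nxt.number, prev.place),
           prev.place * nxt.place)
end

merged = merge(Node.new(4, 100), Node.new(2, 1000))
puts merged.number.to_f  # 4.02
puts merged.place        # 100000
puts merged.value.to_i   # 402000
```

The merged value is the same as (4 * 100 + 2) * 1000, so an integer-only version could skip the fractional number field entirely.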
04 Apr, 2011, quixadhal wrote in the 5th comment:
Votes: 0
Program in COBOL??? :devil:

I know you all hate perl, but… THIS might do what you want.
04 Apr, 2011, Runter wrote in the 6th comment:
Votes: 0
I dunno if Ruby Linguistics does it or not, but it's very similar to (and probably inspired by) that Perl library.
04 Apr, 2011, Chris Bailey wrote in the 7th comment:
Votes: 0
That perl lib seems to do what I'm looking for. Ruby linguistics was a big help for creating my numeral list.

require 'linguistics'
Linguistics::use(:en)

cardinal_numerals = Hash.new

for i in (0..99)
  cardinal_numerals[i.en.numwords] = i
end


cardinal_numerals['forty-three'] => 43, heheh =)
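
For anyone without the linguistics gem installed, the same 0..99 lookup table can be built by hand. This is only a sketch: ONES, TENS and build_cardinals are made-up names, and it assumes the hyphenated spelling ("forty-three") used above.

```ruby
ONES = %w[zero one two three four five six seven eight nine ten eleven twelve
          thirteen fourteen fifteen sixteen seventeen eighteen nineteen].freeze
TENS = %w[twenty thirty forty fifty sixty seventy eighty ninety].freeze

# Builds a hash mapping "zero".."ninety-nine" to 0..99.
def build_cardinals
  table = {}
  ONES.each_with_index { |word, i| table[word] = i }
  TENS.each_with_index do |tens_word, t|
    base = (t + 2) * 10          # "twenty" is index 0, so 20, 30, ...
    table[tens_word] = base
    ONES[1..9].each_with_index do |ones_word, o|
      table["#{tens_word}-#{ones_word}"] = base + o + 1
    end
  end
  table
end

cardinal_numerals = build_cardinals
puts cardinal_numerals['forty-three']  # 43
puts cardinal_numerals.size            # 100
```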
04 Apr, 2011, Chris Bailey wrote in the 8th comment:
Votes: 0
Well, Perl tried to eat my face so I just went ahead and rolled my own.

Please enter your equation in long form:
=>
four hundred and sixty-two thousand, seven hundred and eighty-two plus forty-two thousand, six hundred and ninety-nine
[DEBUG] Eval String = (4*100+62)*1000+(7*100+82)+42*1000+(6*100+99).
505481
Please enter your equation in long form:
=>

Yay!
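
The post doesn't show how the eval string gets evaluated, so the following is just one guess at that last step: check that the string contains nothing but arithmetic characters, then hand it to Ruby's eval. The safe_eval name is hypothetical.

```ruby
# Hypothetical helper: evaluate an arithmetic string only if it contains
# nothing but digits, whitespace, dots, parentheses and + - * / operators.
def safe_eval(expr)
  unless expr.match?(%r{\A[\d\s()+\-*/.]+\z})
    raise ArgumentError, "unexpected characters in #{expr.inspect}"
  end

  eval(expr)
end

puts safe_eval('(4*100+62)*1000+(7*100+82)+42*1000+(6*100+99)')  # 505481
```

That matches the 505481 in the debug output: 462,000 + 782 + 42,000 + 699.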
04 Apr, 2011, Runter wrote in the 9th comment:
Votes: 0
Perl always eats faces.
04 Apr, 2011, quixadhal wrote in the 10th comment:
Votes: 0
Noooo, perl eats brains. The face is just usually in the way…
04 Apr, 2011, Chris Bailey wrote in the 11th comment:
Votes: 0
This seems to do the trick, but it hasn't been thoroughly tested. Posting it just in case someone else needs something similar.

CARDINALS = {"zero"=>0.0, "one"=>1.0, "two"=>2.0, "three"=>3.0, "four"=>4.0, "five"=>5.0, "six"=>6.0, "seven"=>7.0, "eight"=>8.0, "nine"=>9.0,
"ten"=>10.0, "eleven"=>11.0, "twelve"=>12.0, "thirteen"=>13.0, "fourteen"=>14.0, "fifteen"=>15.0, "sixteen"=>16.0, "seventeen"=>17.0,
"eighteen"=>18.0, "nineteen"=>19.0, "twenty"=>20.0, "twenty-one"=>21.0, "twenty-two"=>22.0, "twenty-three"=>23.0, "twenty-four"=>24.0,
"twenty-five"=>25.0,"twenty-six"=>26.0, "twenty-seven"=>27.0, "twenty-eight"=>28.0, "twenty-nine"=>29.0, "thirty"=>30.0, "thirty-one"=>31.0,
"thirty-two"=>32.0, "thirty-three"=>33.0, "thirty-four"=>34.0, "thirty-five"=>35.0, "thirty-six"=>36.0, "thirty-seven"=>37.0, "thirty-eight"=>38.0,
"thirty-nine"=>39.0, "forty"=>40.0, "forty-one"=>41.0, "forty-two"=>42.0, "forty-three"=>43.0, "forty-four"=>44.0, "forty-five"=>45.0,
"forty-six"=>46.0, "forty-seven"=>47.0, "forty-eight"=>48.0, "forty-nine"=>49.0, "fifty"=>50.0, "fifty-one"=>51.0, "fifty-two"=>52.0,
"fifty-three"=>53.0, "fifty-four"=>54.0, "fifty-five"=>55.0, "fifty-six"=>56.0, "fifty-seven"=>57.0, "fifty-eight"=>58.0, "fifty-nine"=>59.0,
"sixty"=>60.0, "sixty-one"=>61.0, "sixty-two"=>62.0, "sixty-three"=>63.0, "sixty-four"=>64.0, "sixty-five"=>65.0, "sixty-six"=>66.0,
"sixty-seven"=>67.0, "sixty-eight"=>68.0, "sixty-nine"=>69.0, "seventy"=>70.0, "seventy-one"=>71.0, "seventy-two"=>72.0, "seventy-three"=>73.0,
"seventy-four"=>74.0, "seventy-five"=>75.0, "seventy-six"=>76.0, "seventy-seven"=>77.0, "seventy-eight"=>78.0, "seventy-nine"=>79.0,
"eighty"=>80.0, "eighty-one"=>81.0, "eighty-two"=>82.0, "eighty-three"=>83.0, "eighty-four"=>84.0, "eighty-five"=>85.0, "eighty-six"=>86.0,
"eighty-seven"=>87.0, "eighty-eight"=>88.0, "eighty-nine"=>89.0, "ninety"=>90.0, "ninety-one"=>91.0, "ninety-two"=>92.0, "ninety-three"=>93.0,
"ninety-four"=>94.0, "ninety-five"=>95.0, "ninety-six"=>96.0, "ninety-seven"=>97.0, "ninety-eight"=>98.0, "ninety-nine"=>99.0,}

PLACES = {"hundred"=> 100,
"thousand"=> 1000,
"million"=> 1000000,
"billion"=> 1000000000,
"trillion"=> 1000000000000,}

KEYWORDS = {"plus" => '+',
"minus" => '-',
"times" => '*',
"divided" => '/',
"multiplied" => '*',}

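# Hacky singleton: turns a spelled-out equation into an arithmetic string.
class Parser
  @@p, @@c, @@k = PLACES, CARDINALS, KEYWORDS

  # Returns a string containing the "equation" to be evaluated.
  def self.parse(s)
    eq = []  # tokens of the equation built so far
    lw = ''  # last word seen
    # Strip commas without modifying the caller's string.
    s.delete(',').split(' ').each do |w|
      if (m = @@c[w])
        # A cardinal right after a place word gets added on.
        eq << '+' if @@p[lw]
        eq << m
        if lw.eql? 'and'
          # "X hundred and Y" becomes (X*100+Y) so that a following
          # place word multiplies the whole group.
          eq.insert(eq.size - 4, '(')
          eq.insert(eq.size - 1, '+')
          eq << ')'
        end
      elsif (m = @@p[w])
        eq << '*' << m
      elsif (m = @@k[w])
        # An operator closes the left operand and opens the right one.
        eq << ')' << m << '('
      end
      lw = w
    end
    eq.insert(0, '(')
    eq << ')'
    eq.join
  end
end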
05 Apr, 2011, David Haley wrote in the 12th comment:
Votes: 0
You can also solve this pretty straightforwardly with a state machine, which makes it easier to allow words like "forty one" rather than forcing one-word numbers like "forty-one".
05 Apr, 2011, Chris Bailey wrote in the 13th comment:
Votes: 0
That would be great, my code is a mess. How would you do it?


EDIT: I was thinking about just porting that Perl lib for Ruby Linguistics but I seriously can't read Perl. =P
05 Apr, 2011, Runter wrote in the 14th comment:
Votes: 0
Chris Bailey said:
That would be great, my code is a mess. How would you do it?


EDIT: I was thinking about just porting that Perl lib for Ruby Linguistics but I seriously can't read Perl. =P


Nobody can read perl.

Read up on state machines a little, then check out the state machine libraries for Ruby like this. Obviously you don't need a library to implement a state machine, but there are some interesting ones out there.
06 Apr, 2011, David Haley wrote in the 15th comment:
Votes: 0
Well, the state machine is pretty basic: some words can be modifiers, some are to be modified. For example, "four" can stand on its own or it can be a modifier. The phrase "four" just means 4; "four hundred" means 400.

You always know that if a smaller number is followed by a larger number, the smaller one modifies the larger one. Conversely if a larger number is followed by a smaller number, you know that you're done with the previous number phrase.

So, the state machine is to apply this rule until you end a phrase, and then push that number onto some total, and eventually collapse that total.

"One million nine hundred thousand seven hundred and two"

1- Push 'one'
2- 'million' is a bigger number, so it must be attached to 'one'
3- 'nine' is a smaller number, so it must be a new phrase. Pop off 1,000,000 and go back to initial state with 'nine'.
4- 'hundred' is a bigger number, so it's attached
5- 'thousand' is a bigger number, so it's attached
6- 'seven' is a smaller number, so we're starting a new phrase. Add 900,000 to total.
7- etc.

Basically the states are:
- initial state until we find a number
- building up a number going from smaller to larger number
- leaving the number-building state by finding a smaller number; move back to initial state
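
Here is a rough Ruby sketch of the smaller-modifies-larger rule described above. It is one possible reading of the "push onto a total and collapse" step, not necessarily the exact machine intended: the total is kept as a stack, and a place word collapses every smaller entry below it before multiplying. SMALL, MULTIPLIERS and words_to_number are made-up names.

```ruby
ones = %w[zero one two three four five six seven eight nine ten eleven twelve
          thirteen fourteen fifteen sixteen seventeen eighteen nineteen]
tens = %w[twenty thirty forty fifty sixty seventy eighty ninety]

# Words that can start a phrase or be added to one.
SMALL = ones.each_with_index.to_h.merge(
  tens.each_with_index.map { |word, i| [word, (i + 2) * 10] }.to_h
).freeze

# Words that multiply whatever smaller amount came before them.
MULTIPLIERS = { 'hundred' => 100, 'thousand' => 1_000, 'million' => 1_000_000,
                'billion' => 1_000_000_000,
                'trillion' => 1_000_000_000_000 }.freeze

def words_to_number(text)
  stack = []
  # Splitting on hyphens lets "forty-one" and "forty one" both work.
  text.downcase.delete(',').tr('-', ' ').split.each do |word|
    next if word == 'and'

    if (mult = MULTIPLIERS[word])
      # Collapse every smaller entry into one group, then scale it.
      group = 0
      group += stack.pop while stack.any? && stack.last < mult
      group = 1 if group.zero?  # a bare "hundred" counts as 100
      stack.push(group * mult)
    elsif (small = SMALL[word])
      stack.push(small)
    else
      raise ArgumentError, "unknown number word: #{word}"
    end
  end
  stack.sum
end

puts words_to_number('One million nine hundred thousand seven hundred and two')
puts words_to_number('four hundred two thousand six')
puts words_to_number('twelve hundred and sixty two')
```

These print 1900702, 402006 and 1262. The collapse loop is what handles "four hundred two thousand six": when "thousand" arrives, both 400 and 2 are smaller, so they are summed to 402 before multiplying.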