--------------------------------------------------------------
HOW for Interlude
"how the hell do i program Coda?"
--------------------------------------------------------------

Coda is a simplified language based loosely around functional languages like Scheme and Dylan. in the Coda universe, everything is just a "value" ... to Coda, there really isn't that much of a difference between the distance to the sun, the text of the constitution, or the instructions for calculating your height in centimeters ... all of these are just values, stored somewhere on some computer.

All of the following are examples of values:

42 -- this is a NUMBER ... no decimal.

3.141592 -- this is called a REAL ... it can be much larger and smaller than NUMBERs since it has decimal parts, but it's less accurate for precise 'whole number' calculations.

"spam" -- this is a STRING ... any number of letters, spaces, special characters, etc... the "" just mark the beginning and the end, they're not part of the string.

abc -- this is a SYMBOL ... symbols are most interesting in that you can associate some other value with them, then later use the symbol as shorthand to retrieve that value. Some symbols already have values associated with them by the Coda environment ... these are the built-ins listed below. symbols can also be transferred around without associating values to them.

#17 -- this is an OBJECT ... the database is made up of objects, which in turn are made up of three things: a collection of SYMBOLs, a collection of parents for that object, and a collection of regular expressions. The parents and regular expressions will be discussed later.

!15 -- this is an ERROR value ... this allows the program to analyze errors and compare them to others in a flexible way. errors are discussed at length below.

'(1 3.27 5) -- this is a LIST of values, kept in order. This list is made up of the values 1, 3.27, and 5 ... in that order. the apostrophe (') will be explained later.
A list can hold any other type of value, even other lists. List manipulation really makes up the backbone of Coda.

There are a few other types (e.g. ERROR, METHOD, USER) which will be discussed shortly.

values are manipulated and converted, thereby changing the database, using "expressions" that all come in one easy form:

    (<function> [<value>]* )

every expression is just a value (a "function") followed by a list of values that are called the "arguments" for that function. For example, the expression:

    (+ 1 5)

has a function '+' and arguments 1 and 5. Similarly, the expression:

    (+ 2 3 1)

has a function '+' and arguments 2, 3, and 1.

when Coda comes across an expression, first it figures out what the "values" of the function and arguments are. Then, using these, it figures out what the whole expression is 'worth' ... what it "evaluates to." If we think of + as 'add', we could read the expressions above as

    (+ 1 5)      --> add one and five
    (+ 2 3 1)    --> add two and three and one

Both of the expressions above would evaluate to the value 6. Similarly, some other simple expressions:

    (* 8 7 3)    --> multiply eight and seven and three ... evaluates to 168
    (- 7 3)      --> subtract three from seven ... evaluates to 4
    (+ 1 1 1 1)  --> add one and one and one and one ... evaluates to 4

There's just one other twist to make Coda really useful: since Coda figures out the value of arguments before figuring out what an expression evaluates to, you can insert expressions as arguments for other expressions. For example, we know that (- 7 3) evaluates to 4, so (+ 1 (- 7 3)) is the same as saying 'add one to the value of (- 7 3)', so (+ 1 (- 7 3)) would evaluate to 5.
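The "arguments first, then the function" rule can be sketched outside of Coda. What follows is a small Python model, not the server's code: expressions are written as Python tuples, and the BUILTINS table and the evaluate function are names made up for this illustration.

```python
import math

# Hypothetical stand-ins for four Coda built-ins (an assumption for this
# sketch, not how the server implements them).
BUILTINS = {
    "+": lambda *args: sum(args),
    "*": lambda *args: math.prod(args),
    "-": lambda a, b: a - b,
    "/": lambda a, b: a // b,  # whole-number division rounds down
}


def evaluate(expr):
    """Evaluate a nested prefix expression written as a Python tuple."""
    if not isinstance(expr, tuple):
        return expr  # a plain value evaluates to itself
    function = BUILTINS[expr[0]]
    # every argument is evaluated before the function is applied
    arguments = [evaluate(arg) for arg in expr[1:]]
    return function(*arguments)


print(evaluate(("+", 1, ("-", 7, 3))))  # models (+ 1 (- 7 3))
```

The inner tuple is reduced to 4 before '+' ever sees it, which is the same order the text describes.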
if you play around with this, you can do a lot of very powerful things by nesting expressions within other expressions:

    (* (+ 3 4 5) (/ (+ 7 8) 3))

would simplify to:

    (* (+ 3 4 5) (/ 15 3))      since (+ 7 8) --> 15
    (* 12 (/ 15 3))             since (+ 3 4 5) --> 12
    (* 12 5)                    since (/ 15 3) --> 5
    60

just remember, anywhere you can give a value in a Coda program, you could just give an expression that evaluates to the same value.

if we look back up at lists like '("spam" 1 22.32) ... we can see that the apostrophe that we stuck at the beginning is necessary to keep lists from being confused with expressions. it looks kind of funny, but that's how LISP has always done it, so that's what everyone's used to.

so ... now you've seen the functions *, +, -, /. There are over five dozen different built-in functions in the Coda server, plus hundreds more that people have written. i'll describe all of the built-in ones now.

------------------
BUILT-IN FUNCTIONS
------------------

ARITHMETIC
==========

+
    syntax: ( + <arg> <arg>... )
    this will evaluate to the summed values of all of the arguments. if the arguments are strings or lists, this will evaluate to the concatenated value, so (+ "ab" "cd") --> "abcd"
    arguments must be NUMBER, REAL, STRING, or LIST, and must be of the same type as each other.

-
    syntax: ( - <arg> <arg> )
    this will evaluate to the first argument minus the second.
    arguments must be NUMBER or REAL

*
    syntax: ( * <arg> <arg>... )
    this will evaluate to the product of all of the arguments.
    arguments must be NUMBER or REAL

/
    syntax: ( / <arg> <arg> )
    this will evaluate to the first argument divided by the second. if the arguments are NUMBERs, this will round down fractions.
    arguments must be NUMBER or REAL

%
    syntax: ( % <arg> <arg> )
    this will evaluate to the first argument modulo the second. This means that the first is divided by the second, and the whole number remainder is returned.
    arguments must be NUMBER

square-root
    syntax: ( square-root <arg> )
    operates on a NUMBER or a REAL ... always returns a REAL.

BOOLEAN
=======

'boolean' is a computer term for "true or false." boolean values are fairly important in Coda ... when you want to specify the conditions under which things should happen, you use a boolean expression. Each type of value has rules for which possible entries mean "true" and which mean "false." By convention, the NUMBER 0 is false, and all others are true. any STRING of length one or more is considered true, while "" is false. any LIST of length one or more is true, while '() is false. all ERROR values are false. if an OBJECT exists, it is true, otherwise false.

=
    syntax: ( = <arg> <arg> )
    if the first argument has the same value as the second, this expression will evaluate to 1, otherwise, it will evaluate to 0.
    arguments can be of any type, but must be of the same type.

<
    syntax: ( < <arg> <arg> )
    if the first argument is less than the second, this expression will evaluate to 1, otherwise, 0. A string is less than another if the first is shorter than the second. A list is less than another if the first is an initial subset of the second.
    arguments can be of any type, but must be of the same type.

>
    syntax: ( > <arg> <arg> )
    if the first argument is greater than the second, this expression will evaluate to 1, otherwise, 0. A string is greater than another if the first is longer than the second. A list is greater than another if the second is an initial subset of the first.
    arguments can be of any type, but must be of the same type.

<=
    syntax: ( <= <arg> <arg> )
    evaluates to true if arg1 is less than or equal to arg2

>=
    syntax: ( >= <arg> <arg> )
    evaluates to true if arg1 is greater than or equal to arg2

not
    syntax: ( not <arg> )
    if the argument has a boolean value of true, this evaluates to '(), otherwise, it evaluates to 1.
    arguments can be of type NUMBER, REAL, STRING, LIST, ERROR.

or
    syntax: ( or <arg>...
    )
    this evaluates each argument, one at a time. if it finds one that is boolean true [see 'not' for a discussion of boolean], it evaluates to that value, otherwise, if it makes it all the way through the arguments without finding a true value, it evaluates to false.

and
    syntax: ( and <arg>... )
    this is the reverse of 'or' ... it evaluates each argument, returning the first false argument. if none are false, it evaluates to true.

if
    syntax: ( if <arg> <arg> <arg> )
    'if' evaluates the first argument. if that is a true value, it evaluates to the value of the second argument, otherwise it evaluates to the third. if the third is omitted, it evaluates to '() -- a useful value for recursive functions over lists.

CONVERSION
==========

tonum
    syntax: ( tonum <arg> )
    this converts the argument into an ordinal NUMBER and evaluates to the result. decimal values are truncated... if you want to round off, add 0.5 first: (tonum (+ 1.6 0.5)) --> 2
    arguments can be STRING, REAL, NUMBER, ERROR, OBJ

toreal
    syntax: ( toreal <arg> )
    this converts the argument to a floating-point REAL number.
    the argument must be REAL, STRING, or NUMBER

toobj
    syntax: ( toobj <arg> )
    this evaluates to the object whose number is indicated by the NUMBER in the argument.

tolist
    syntax: ( tolist <arg>... )
    this creates a list from the values of the arguments.

typeof
    syntax: ( typeof <arg> )
    this takes the argument and returns a SYMBOL corresponding to the type of that argument... so

        (typeof 15)     --> NUMBER
        (typeof "spam") --> STRING
        .... etc.

    therefore, the best way to check if a value is of a certain type would be to compare to that symbol:

        (= (typeof 15) (quote NUMBER))

    but it also works to just compare to a known value:

        (= (typeof 15) (typeof 1))

    to find out if a word is a reserved builtin, you can check if (typeof ...) is equal to 'RESERVED
    if the argument is a user-defined data type (see below) then this evaluates to the OBJ type of the data.
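Looking back at the BOOLEAN rules, the way 'or' and 'and' hand back one of their arguments (rather than a plain yes/no) is easy to miss. Here is a rough Python model of those rules. CodaError, is_true, coda_or and coda_and are made-up names for this sketch. The text only says 'or' falls through to "false" and 'and' to "true", so returning the empty list and 1 there is an assumption borrowed from what 'not' returns, and treating a REAL zero as false is also a guess.

```python
class CodaError:
    """Hypothetical stand-in for an ERROR value such as !15."""

    def __init__(self, code):
        self.code = code


def is_true(value):
    """Apply the truth rules for NUMBER, STRING, LIST and ERROR values."""
    if isinstance(value, CodaError):
        return False  # all ERROR values are false
    if isinstance(value, (str, list)):
        return len(value) > 0  # "" and '() are false
    return value != 0  # the NUMBER 0 is false (0.0 too: an assumption)


def coda_or(*values):
    """Return the first true argument, or a false value if there is none."""
    for value in values:
        if is_true(value):
            return value
    return []  # assumed false value, modelling '()


def coda_and(*values):
    """Return the first false argument, or a true value if there is none."""
    for value in values:
        if not is_true(value):
            return value
    return 1  # assumed true value


print(coda_or(0, "", "spam", 5))  # the first true argument comes back
print(coda_and(1, "a", 0, 2))     # the first false argument comes back
```

The sketch takes already-computed values. It does not model evaluating the arguments one at a time, and OBJECT truth (whether the object exists) is left out.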

tostr
    syntax: ( tostr <arg> )
    this converts the argument to a STRING. the argument can be of any type.

format
    syntax: ( format <arg> <arg2> )
    this is identical to 'tostr', but if the argument is of type METHOD, the string result will have newlines and spaces inserted to make the code look pretty. argument 2 must be a NUMBER indicating the width (in characters) to use as the maximum line length. values less than 10 will be ignored and a default will be used.

num2char
    syntax: ( num2char <arg> )
    this takes a NUMBER and evaluates to a STRING of length 1 that has one character equal to the ASCII value of the argument.
        (num2char 101) --> "e"

char2num
    syntax: ( char2num <arg> )
    this takes in a STRING and returns the ASCII ordinal value of the first character.
        (char2num "e") --> 101

ERRORS
======

i decided to keep errors as numerical data, so you could create new error codes "on the fly" by assigning new error codes to new problems you want to report. this increase in generality comes at a slight cost in readability, so until i get this resolved, some error codes will be reported to the coder by number. Appendix A lists some of the most common error codes you might encounter.

when a value of type ERROR is generated in Coda, if an enclosing 'handle' statement covers that error code, control returns to that point. otherwise, if the player's object defines a variable 'handler' containing a METHOD value, this method is called with the error's number as the first argument, the error string as the second. otherwise, the error is just reported to the user via 'tell'

any error handling method must have two formal parameters ... the first gets the numerical value of the error and the second gets the string error report. for example, this is a valid handler:

    (method (n s) (+ (tostr n) s))

raise
    syntax: ( raise <arg> [<arg2>] )
    this gets the numerical value of the first argument and evaluates to the error with that code.
    the NUMBER must be positive, or else you will get error code -1, since all other negative codes are reserved for internal use. in a program, you can signify an error value by putting an exclamation mark before a number. this means that (= !15 (raise 15)) is true.
    if a STRING is provided as a second argument, this message is passed to the programmer if the error is not ignored.

ignore
    syntax: ( ignore <arg> <arg> )
    this gets the numerical value of the first argument, then evaluates to the second argument, ignoring all occurrences of the error corresponding to argument 1. This is said to have 'scope' over the second argument, because once this expression is finished being evaluated, the error code will no longer be ignored ... it only applies to the one expression given as argument 2.

handle
    syntax: ( handle <arg1> <arg2> <arg3> )
    arg1 must be a NUMBER and arg2 must be a METHOD. if the wrong type of arguments are given, no error will be raised, so check this carefully. as argument 3 is being evaluated, if an error with the number from arg1 is raised, control returns to this point and the METHOD error handler given in arg2 is evaluated. for example:

        (handle 321 (method (errnum errstr) (* errnum 2)) (raise 321)) --> 642

LIST/STRING PROCESSING
======================

index
    syntax: ( index <arg> <arg> )
    this takes a NUMBER as the first argument and a LIST as the second and evaluates to the <arg 1>th element of <arg 2>

range
    syntax: ( range <arg1> <arg2> <arg3> )
    this takes NUMBERs as arguments 1 and 2, and STRING or LIST as argument 3.
    if arg3 is a STRING, it evaluates to a string corresponding to characters from arg1 to arg2 out of arg3.
        (range 3 5 "abcdefg") --> "cde"
    if arg3 is a LIST, it evaluates to a list corresponding to elements from arg1 to arg2 of arg3.
        (range 3 5 '(2 4 6 8 10 12)) --> '(6 8 10)

length
    syntax: ( length <arg> )
    evaluates to the number of characters in the argument if it's a STRING, or to the number of elements if it's a LIST.
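The 'range' examples above show that positions count from 1 and that both ends are included. If you're used to a zero-based language, that is worth pinning down. A minimal Python sketch of the same arithmetic follows. coda_index and coda_range are made-up names. The text doesn't say outright that 'index' also counts from 1, so that is an assumption inferred from the 'range' examples, and out-of-range positions are not modelled.

```python
def coda_index(position, items):
    """Model (index n list): the n-th element, counting from 1 (assumed)."""
    return items[position - 1]


def coda_range(first, last, sequence):
    """Model (range a b seq): elements a through b inclusive, counting from 1."""
    # Python slices are zero-based and exclude the end, hence the shift
    return sequence[first - 1:last]


print(coda_range(3, 5, "abcdefg"))             # models (range 3 5 "abcdefg")
print(coda_range(3, 5, [2, 4, 6, 8, 10, 12]))  # models (range 3 5 '(2 4 6 8 10 12))
```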

insert
    syntax: ( insert <arg1> <arg2> <arg3> )
    evaluates to a value corresponding to arg2 with arg1 inserted before position arg3 ...
        (insert 5 '(a b c) 2) --> '(a 5 b c)
        (insert "m" "lard" 3) --> "lamrd"

search
    syntax: ( search <arg1> <arg2> )
    search allows you to scan a LIST in arg2 for the value in arg1. search then evaluates to the position of arg1 in arg2, or 0 if it's not present. if arg2 is a STRING, then arg1 must also be a STRING, and search evaluates to the position in the string of the first occurrence of arg1. for example:
        (search #2 '(#17 #8 #2 #3)) --> 3
        (search "bc" "abcde")       --> 2
        (search #2 '(#17 #8 #3))    --> 0

explode
    syntax: ( explode <arg1> <arg2> )
    explode takes two STRING arguments and evaluates to a list made up of all of the sub-strings of arg2 that are separated by the string in arg1. For example:
        (explode " " "now is the time") --> '("now" "is" "the" "time")

implode
    syntax: ( implode <arg1> <arg2> )
    implode is the complete opposite of explode ... it takes a separator and a list of strings and evaluates to a long string made up of the elements of the list with the separator inserted:
        (implode "**" '("ab" "cd" "ef"))              --> "ab**cd**ef"
        (implode "." (explode " " "now is the time")) --> "now.is.the.time"

reg-split
    syntax: ( reg-split <arg1> <arg2> )
    this is one of the most complex and powerful built-in functions in the language. it takes two strings, similar to explode, and produces a list composed of the portion of arg2 before the occurrence of arg1, then the portion matching arg1, and ending with the portion after arg1. reg-split looks for the first occurrence of arg1 in arg2, and if it doesn't find one, it returns the empty list. Most critical, however, is that arg1 is interpreted as being a regular expression, and this pattern is searched for in argument 2.
    for example,
        (reg-split "ab*a" "cccabbbac")                 --> '("ccc" "abbba" "c")
        (reg-split ".* ((to)|(at)) *" "give to tim")   --> '("" "give to " "tim")
    reg-split looks for the first match it can find, but it rejects trivial matches where the regular expression would match an empty string...
        (reg-split "b*" "acdc") --> '()
    since the empty list is boolean FALSE, this function can be easily used to see if a regular expression matches anywhere in a string, but more complex parsing uses can be found by someone who gets used to the funky regular expression syntax. for a quick discussion of regular expressions, check out Appendix D

after
    syntax: ( after <arg1> <arg2> )
    after takes a STRING or a LIST as arg2. if arg2 is a LIST, then arg1 must be a number and the expression evaluates to the remainder of the list after element number arg1. if arg2 is a STRING, then arg1 can be a number or a string ... if it is a number, then this evaluates to all parts of the string after the arg1-th character. if arg1 is a string, then after searches for arg1 in arg2 and evaluates to the remainder of the string after the location of arg1. this function is case sensitive with strings.

before
    syntax: ( before <arg1> <arg2> )
    before is the opposite of after ... it takes the same argument types, but returns the portion of the string/list that is BEFORE arg1. this function is case sensitive with strings.

car
    syntax: ( car <arg> )
    car will take a list and return the first element from the argument. don't ask why it's called 'car' ... it's just convention.
    the argument must be a LIST

cdr
    syntax: ( cdr <arg> )
    cdr will take the remainder of a list after the first element. the argument must be a LIST. once again, don't ask why it's called 'cdr.' cdr is equivalent to (after 1 <arg>)
    for example: (cdr '(5 7 9)) --> '(7 9)

cons
    syntax: ( cons <arg1> <arg2> )
    cons will take any value as arg1 and a LIST as arg2, then form a list made up of arg2 with arg1 appended at the beginning.
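Of everything in this section, 'reg-split' is the one most worth trying on paper before using it in anger. The three-part result and the "no trivial matches" rule can be approximated in Python. This is a sketch, not the server's matcher: reg_split is a made-up name, and it assumes Coda's regular expression syntax behaves like Python's 're' module for these simple patterns, which the text does not confirm (Appendix D is the real reference).

```python
import re


def reg_split(pattern, text):
    """Return [before, match, after] for the first non-empty match, else []."""
    for match in re.finditer(pattern, text):
        if match.group(0):  # skip trivial matches of the empty string
            return [text[:match.start()], match.group(0), text[match.end():]]
    return []  # models '(), which is boolean false


print(reg_split("ab*a", "cccabbbac"))
print(reg_split(".* ((to)|(at)) *", "give to tim"))
print(reg_split("b*", "acdc"))
```

Because the no-match result is an empty list, the same call doubles as a yes/no test, just as the text describes.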

LITTLE THINGS
=============

time
    syntax: ( time )
    returns a NUMBER corresponding to the number of seconds since Jan 1, 1971.

random
    syntax: ( random )
    returns a random NUMBER of some really big size. to get a number from 1 to 6 out of this, i could say (+ (% (random) 6) 1)

crypt
    syntax: ( crypt <arg> )
    the argument must be a string. this evaluates to a unix-encrypted string derived from the argument. Effectively a one-way encryption.

sleep
    syntax: ( sleep <arg> )
    the argument must be NUMBER. the process will not continue for this many seconds.

filetext
    syntax: ( filetext <arg> )
    on the machine where the Coda server is running, there is a subdirectory where short text files can be stored by the host administrator. if you say (filetext "spam") and there is a file called 'spam' in that directory, then the function evaluates to a big string containing the contents of that file. if the file doesn't exist, then you will get (error E_BAD_FILE)

SYMBOL MANIPULATION
===================

earlier, we mentioned a SYMBOL value can be shorthand for another value. in computer-ese, the SYMBOL is called a 'variable' if it is 'storing' some other value. the SYMBOL just gives you a way to put something into the computer's memory and be able to get it out later. when you put the name of a symbol in a program, and that program is evaluated, the symbol is checked and its stored value is used. For example, let's just say that the symbol 'pi' contains the value 3.1415. if we were to evaluate the expression (* 2 pi), it would evaluate to 6.283 since 'pi' evaluates to 3.1415.

set
    syntax: ( set <symbol> <arg2> <arg3>...)
    this is the function that allows you to change the value associated with a symbol. it takes the symbol given as argument one and associates the value of argument 2 with it.
    for example, if we took the symbol 'pi' with the value above and decided to reassign it, we could give the expression:
        (set pi 4)
    from now on, the symbol 'pi' will evaluate to the NUMBER 4 instead of the REAL 3.1415
    any arguments after arg2 are evaluated with the symbol bound to the new value. the expression evaluates to the value of the last argument.

sets
    syntax: ( sets <arg1> <arg2> ... )
    this is similar to 'set' except that it evaluates 'arg1' first, even if arg1 is a symbol. that means that you can put a symbol as the first argument that is storing another symbol. for example: (assume 'pi' contains 3.1415 and 'const' contains the symbol 'pi')
        (sets const 3)
    from this point on, 'pi' will contain 3. similarly:
        (sets (tosym "pi") 3)
    will have the same effect.

var
    syntax: ( var <symbol> <arg> )
    this function will create a new symbol that can be used anywhere within the expression <arg>. This symbol will not be set initially, so you will have to use 'set' to put a value into it.

occasionally, you may want to deal with symbols for themselves instead of just looking up their stored value ... for example, i've found times when i've wanted to tell an object which variable it should look up. to do that, i pass a symbol. manipulating symbols is a little tricky, but once you get used to the syntax, it can be a useful tool.

tosym
    syntax: ( tosym <arg> )
    arg must evaluate to a STRING ... tosym will evaluate to the SYMBOL that looks the same as the string, then this will be looked up. for example:
        (tosym "spam") --> spam

quote
    syntax: ( quote <arg> )
    to indicate that you want to use a symbol without evaluating it, you can use quote ... for example:
        (quote abc) --> abc
    the standard shorthand for this is to put an apostrophe before the symbol name, indicating that you don't want to evaluate it:
        'abc --> (quote abc)

eval
    syntax: ( eval <arg> )
    this takes in a value and then the whole expression evaluates as the argument.
    for most values, that just means that <arg> is returned ...
        (eval 15)     --> 15
        (eval "spam") --> "spam"
    but if the argument is a list or a symbol, these arguments are evaluated:
        (eval '(+ 1 2))                   --> 3
        (eval '((method (n) (* n n)) 3))  --> 9
        (eval (cons * '(3 4)))            --> 12
        (eval 'pi)                        --> 3.1415 (assuming 'pi' stores 3.1415)

collect
    syntax: ( collect <symbol> )
    this function searches for <symbol> on 'this' and all of its ancestors, making a list of all results. This allows you to bypass inheritance on a symbol to find out how the ancestors define it. for example, if #9 descends from #5 descends from #2 descends from #1 and all of them define their own version of the symbol 'pi' as follows: #9 defines it as 3.1415, #5 as 3, #2 as 314, #1 as "three" then if #9 called
        (collect 'pi) --> '("three" 314 3 3.1415)

semaphore
    syntax: ( semaphore <symbol> <arg2> <arg3>...)
    this is identical to 'set' for most purposes, but if <symbol> is already defined on 'this' as <arg2> then the remaining arguments are not evaluated and an error (E_SEMAPHORE ... 32) is raised. most people can just ignore this function, but it was needed to provide an atomic check-and-set mechanism since the server multitasks.

FLOW CONTROL
============

these functions are used if you want to evaluate multiple expressions, or you want to evaluate the same expression several times.

begin
    syntax: ( begin <arg>... )
    this just evaluates all of the arguments that follow it, in order, and evaluates to the last one.

while
    syntax: ( while <arg1> <arg2> )
    this evaluates the first argument. if that has a boolean 'true' value, it evaluates the second one, then it resets both of them and repeats by evaluating the first argument again, etc. it keeps doing this until the first argument evaluates to a boolean 'false' value.

dotimes
    syntax: ( dotimes ( <symbol> <arg1> ) <arg2> )
    if arg1 evaluates to a NUMBER type, this evaluates arg2 that many times.
    it also creates a new local symbol (indicated by <symbol> above) that can be accessed from within arg2. this symbol starts by containing 1, and each time <arg2> is evaluated, this value is increased by one. for example:
        (dotimes (abc 5) (* 2 abc))
    --> this will evaluate (* 2 abc) five times, evaluating to 2, 4, 6, 8, then 10 (in that order) since 'abc' contains 1, 2, 3, 4, then 5.

foreach
    syntax: ( foreach ( <symbol> <arg1> ) <arg2> )
    foreach is similar to dotimes, in that it allows you to evaluate something several times, but foreach takes a LIST value for arg1, then assigns the <symbol> to each of the elements in that list, one at a time, while evaluating arg2 each time. for example:
        (foreach (abc '(2 3 7 11)) (+ abc abc))
    --> this will evaluate (+ abc abc) four times, once for each element in '(2 3 7 11). the first time, abc will contain the value 2, so (+ abc abc) will evaluate to 4. the second time, abc will contain 3, so (+ abc abc) will evaluate to 6.

METHOD STUFF
============

the last type of value that Coda uses is by far the most powerful. This type of value, METHOD, is how you give Coda a list of instructions. if a METHOD value is at the beginning of an expression, i.e. if it comes right after the opening parenthesis, then it can act just like a built-in function. a method can even take arguments, just like a built-in function. by creating and changing methods, you can change what things do in the environment.

let's say i had a METHOD value stored in the variable 'double' that takes one argument and evaluates to that argument doubled. in an expression, you could write (double 5) and this would evaluate to 10. I'll go over how you could create this method.

method
    syntax: ( method ( <symbol>... ) <arg> )
    this is the reserved word that always evaluates to a value of type METHOD. following 'method' comes a (possibly empty) list of symbols that are created when the method is used, thereby taking on the values of the arguments passed to the method.
    so if we had an expression:
        (method (a b) ... )
    this would be a method that would take two arguments, a and b. similarly, if the symbol 'double' contained a method like:
        (method (num) ... )
    then double would take one argument and place this value in 'num' ...
    after assigning the arguments to the symbols (these symbols are called 'formal parameters' by CS-geek types), the expression in <arg> is evaluated.
    so ... if we wanted to write the method that is stored in 'double' above, we could write:
        (method (num) (+ num num) )
    when 'double' is called, its argument is stored in 'num' and then (+ num num) is evaluated. the method as a whole evaluates to the result of this expression. we could also have written this method as:
        (method (num) (* 2 num) )
    ... either of these will take one argument and evaluate to that argument doubled.
    so, the expression
        (set double (method (num) (+ num num)))
    would create a new METHOD value and then associate this value with the symbol 'double'.

compile
    syntax: ( compile <arg> )
    the argument must be a STRING ... compile parses the string and returns the value that corresponds to that string ... whether that value is a number, a list, a method, etc...

return
    syntax: ( return <arg> )
    occasionally, you may be deep within a method when you find that you just want that method to exit immediately. the 'return' function allows you to do this ... if this is evaluated, the method it is contained in stops evaluating and just returns the value evaluated by <arg>.

fork
    syntax: ( fork <symbol> <arg1>... )
    fork is a tool that you should only have to use on very rare occasions. it enables a method to start another method running that should go at the same time. to do this, you give the symbol that the method is stored under followed by the values to be passed to that method. for example, if you wanted to fork a new process that is running the 'house-keep' method with object #17 as an argument, you could write: (fork house-keep #17).
    the fork expression evaluates to a NUMBER indicating the unique process ID of the forked method.

kill
    syntax: ( kill )
    immediately terminate execution of this process.

OBJECT-ORIENTED FUNCTIONS
=========================

when a method is evaluated, three symbols are set by the server:

    'this' contains the object that the method is evaluating on.
    'player' contains the object that started off the whole process chain.
    'caller' contains the object that told this object to execute this method.

all of the functions below access or change parts of the object 'this':

set
    syntax: ( set <symbol> <arg1> <arg2>...)
    'set' was discussed above, but it needs a little more explanation in the context of Objects. Every object has a list of symbols stored on it. It also has a group of parents with symbols stored on them. If you use a symbol in a program, it first checks to see if a locally scoped variable was declared using 'var', or through 'dotimes', 'foreach', or the formal parameters of 'method'. if the symbol isn't located in the local variables, it checks on 'this' to see if it exists there. if it doesn't, it checks the parents of 'this', and all of their parents, etc...
    if you use the 'set' function to set the value of a symbol, and that symbol doesn't already exist as a local variable and it doesn't exist on 'this', the symbol and its associated value are added to 'this.' to keep your code (and objects) clean, if you don't intend a value to be stored on the object, you should use the 'var' function to denote a local (non-object-oriented) symbol.
    one other quirk/feature of 'set' ... if a parent of an object defines a variable starting with the '~' character, then the object cannot save to that variable. (i.e. you cannot override any variable starting with ~) this is intended to allow 'safe' methods in a database whose behavior is guaranteed for all children.

rmvar
    syntax: ( rmvar <symbol> )
    rmvar is used to remove a symbol from 'this' if it's no longer needed.
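The lookup order described under 'set' (locals first, then 'this', then the parents and their parents) plus the '~' rule can be modelled in a few lines of Python. This is only a sketch: CodaObject, lookup and set_symbol are made-up names, the depth-first search through multiple parents is an assumption (the text doesn't give an order), and so is reading "a parent" as "any ancestor" for the '~' rule.

```python
class CodaObject:
    """Hypothetical model of an object: its own symbols plus a parent list."""

    def __init__(self, parents=()):
        self.parents = list(parents)
        self.symbols = {}


def find_on_object(obj, name):
    """Search obj, then its parents, then their parents, and so on."""
    if name in obj.symbols:
        return True, obj.symbols[name]
    for parent in obj.parents:  # depth-first order is an assumption
        found, value = find_on_object(parent, name)
        if found:
            return True, value
    return False, None


def lookup(name, local_vars, this):
    """Locals win; otherwise fall back to 'this' and its ancestors."""
    if name in local_vars:
        return local_vars[name]
    found, value = find_on_object(this, name)
    if not found:
        raise KeyError(name)
    return value


def defined_by_ancestor(obj, name):
    return any(name in parent.symbols or defined_by_ancestor(parent, name)
               for parent in obj.parents)


def set_symbol(name, value, local_vars, this):
    """Update a local if one exists, otherwise store the symbol on 'this'."""
    if name in local_vars:
        local_vars[name] = value
        return
    if name.startswith("~") and defined_by_ancestor(this, name):
        raise PermissionError("cannot override " + name)
    this.symbols[name] = value


root = CodaObject()
root.symbols["pi"] = 3
child = CodaObject([root])
print(lookup("pi", {}, child))       # inherited from the parent
set_symbol("pi", 3.1415, {}, child)  # lands on 'this', parent untouched
print(lookup("pi", {}, child), root.symbols["pi"])
```

The last two lines show why stray 'set' calls clutter objects: without a local declared through 'var', the value ends up stored on 'this'.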

vars
    syntax: ( vars )
    this evaluates to a list of all symbols stored on 'this'

clone
    syntax: ( clone )
    this function creates a new object with 'this' as its only parent, then it evaluates to this object.

parents
    syntax: ( parents )
    this evaluates to a LIST of all parents of this object.

disconnect
    syntax: ( disconnect )
    this breaks the network connection to this object.

echo
    syntax: ( echo <arg> )
    the argument must be a string, then this function sends that string down the network connection associated with 'this.' just to be a complete loser, i used the Unix/C convention to indicate a newline/carriage return... if the compiler finds the sequence \n in a string literal, it treats this as a return character. this means that the string "abc\ndef\n" will echo as:
        abc
        def

address
    syntax: ( address )
    if this object is connected (either as an incoming connection redirected by #0 or as an outbound connection ... see below) then this returns the IP address of the other end of the connection. if it is not connected, this raises E_NOT_CONNECTED

commands
    syntax: ( commands )
    as much as i wanted to be a language-purist, potential users convinced me that i'd be killed if i didn't provide a _FAST_ way to figure out what to do when someone types in a line to the environment. to do this, i made a very fast mapping on every object that would take in a string and spit out a list of symbols which could hold the right method to call. these are paired with a list of terms derived from the input string. the (commands) function will return the list of all of these command/symbol pairs that are stored on this object. for a full explanation of parsing, see appendix C.

addcmd
    syntax: ( addcmd <arg1> <arg2> )
    this allows you to add a new pair to the list of commands recognized by 'this.' ... the first argument, a LIST, must contain the 'sentence' to be matched/parsed and the second argument must be a SYMBOL. Once this is executed, a new mapping will exist on 'this' ...
    so if someone evaluates (matchcmd <string>) where <string> matches <arg1>, then '(arg2 <tokenlist>) will be in the list of symbols returned by matchcmd. for a full explanation of parsing, see appendix C.

matchcmd
    syntax: ( matchcmd <arg> )
    the argument must be a string, then this returns a list. the list is made up of pairs that match the string. each pair is a method name to call followed by a list of string segments that were derived. for a full explanation of parsing, see appendix C.

rmcmd
    syntax: ( rmcmd <arg1> )
    arg1 must be a STRING that will match against the sentence that you wish to remove from the list. so if one of the command-pairs is '(("say" REST) say-command) then you'd have to use (rmcmd "say abc") [or anything else that will match '("say" REST)] to remove it. this will remove the first matching sentence it discovers. for a full explanation of parsing, see appendix C.

purge-cmds
    syntax: ( purge-cmds )
    this function will remove all command mappings from 'this' ... use with caution, needless to say.

match-one
    syntax: ( match-one <list> <str> )
    this is used primarily for trying out the parsing mechanism on a non-object-oriented scale. allows you to give one sentence list and one string to see if (and how) that sentence matches the given string. returns '() if no match can be made.
        (match-one '("spam" (or "on" "off")) "spam on")  --> '("spam" "on")
        (match-one '("spam" (or "on" "off")) "spam ack") --> '()

call
    syntax: ( call <arg1> <symbol> <arg3>... )
    this function allows a method to tell another object to evaluate one of its methods. arg1 must be an OBJECT (or a user-defined data type ... see below) and there can be any number of arguments after the SYMBOL. 'call' evaluates each of these arguments and then evaluates the method named <symbol> on object <arg1> by passing it the remainder of the arguments as parameters. call evaluates to the result of the method's execution.
to make people happy, i've added a quick shorthand for the 'call' method:
(#2 : tell "spam") is the same as (call #2 tell "spam")
(caller:tell "spam") is the same as (call caller tell "spam")
(remember, 'caller' is a symbol that starts out storing the object that told this object to execute this method. this can only be done with the 'call' method, so the call-ee has its 'caller' symbol set to the object that is evaluating the 'call' expression)

as an extra bonus, if you try to 'call' a method on an object that doesn't define a method of that name (i.e. if you tried (call #1 fnord) but #1 doesn't have a 'fnord'), then Coda checks to see if the object defines a method named 'default' ... if it does, then 'default' is called instead. default gets two arguments: the first is the name of the method which you were trying to call, the second is a list made up of the arguments which were being passed. so if you tried:
(call #1 fnord "hey")
and fnord wasn't defined on #1, then 'default' would be called instead, as if the programmer had typed
(call #1 default fnord '("hey"))

pass
syntax: ( pass <arg>... )
occasionally, you may find that the parents of an object define a symbol to be a method that isn't quite what you needed, and you want to add some details onto its performance without having to re-type everything it does. the 'pass' function allows you to evaluate an inherited version of the currently running method. for example, if your parents define the symbol 'standardize' to contain a method which performs some calculations on one number, evaluating to another, and you want to redefine this method to always add one to the result of the calculation, you could say ...
(+ 1 ( pass some-number ))
... where 'some-number' is passed to the parent's 'standardize' method.

pass-to
syntax: ( pass-to <arg>... <object> )
once in a while, you may have different parents who define a method and you want to specify which of them should be inherited from.
in this case, you can use the 'pass-to' function to give all of the arguments to be passed, then give the object number of the parent to get the method from.

USER-DEFINED DATA TYPING
========================
for a 'real' object oriented language, the user should be able to group data with a group of actions that this data will respond to ... this 'encapsulation' is lauded by CS professors and dreaded by intro students, but if used right, it can be very powerful. the most intuitive way i could think of to bundle a group of methods together was to place them on an object ... this has the built-in advantage of allowing inheritance and overriding through the normal Object-Oriented hierarchy.

So ... i've introduced user-defined data types to the language ... each user-defined value is made up of a value and an associated object where its methods are grouped. the shorthand for this uses the {} brackets to indicate a user-defined value ... {#1 "spam"} indicates the string "spam" associated with object number 1. for non-constant values, you should always use 'class':

class
syntax: ( class <arg1> <arg2> )
if argument 1 evaluates to an OBJECT, then this expression will evaluate to the user-defined data type composed of that object associated with the value in argument 2.

notice that even though {#1 "spam"} has a core value of type STRING, it is no longer a string ... rather, it is of type #1 ... so normal string operations on this value will yield type errors. user-defined values can be set to variables and accessed normally, but many operations no longer apply to the data. in fact, the only operations that will still work on these user-defined values are 'tostr', 'typeof' and (most importantly) 'call'.

call
syntax: ( call <arg1> <symbol> <arg3>... )
when 'call' is used with a user-defined value in the first argument position, it calls method <symbol> on the object portion of arg1 and hands the value portion of arg1 as the first argument to that method.
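to make that dispatch rule concrete, here is a minimal model of it in python (python is not part of Coda; it is only used here for illustration). CodaObject, UserValue, find and the 'negate' method on a made-up object #5 are all names invented for this sketch, and the order in which parents are searched is an assumption, not something the server documentation promises.

```python
from dataclasses import dataclass, field


@dataclass
class CodaObject:
    """Hypothetical stand-in for a Coda object such as #5."""
    number: int
    methods: dict = field(default_factory=dict)   # symbol name -> Python callable
    parents: list = field(default_factory=list)

    def find(self, name):
        # Look on the object itself first, then on its parents
        # (depth-first order is an assumption of this sketch).
        if name in self.methods:
            return self.methods[name]
        for parent in self.parents:
            found = parent.find(name)
            if found is not None:
                return found
        return None


@dataclass
class UserValue:
    """Models {#N value}: a core value tied to the object holding its methods."""
    obj: CodaObject
    value: object


def call(target, name, *args):
    # With a user-defined value, dispatch on its object part and hand the
    # value part to the method as the first argument.
    if isinstance(target, UserValue):
        obj, args = target.obj, (target.value, *args)
    else:
        obj = target
    method = obj.find(name)
    if method is None:
        raise LookupError(f"no method {name!r} on #{obj.number}")
    return method(obj, *args)   # the first parameter plays the role of 'this'


# Hypothetical type object #5 whose method returns a value of its own type.
negatable = CodaObject(5)
negatable.methods["negate"] = lambda this, num: UserValue(this, -num)

result = call(UserValue(negatable, 11), "negate")   # like (call (class #5 11) negate)
print(result.value)
```

the point of the model is only that the caller never passes the core value explicitly: 'call' unpacks it from the user-defined value and puts it in front of the other arguments.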
for example: if you had method 'double' on object #17 defined as (method (num) (* 2 num)), then (call (class #17 11) double) would evaluate to 22. if you're more clever, the methods on the object will themselves return data of the same type:
(method (num) (class this (* 2 num)))
similarly, if 'concat' were defined on #19 as
(method (thisval suffix) (class this (+ thisval suffix)))
then (call (class #19 "abc") concat "lmn") would evaluate to {#19 "abclmn"}

SYSTEM-ONLY FUNCTIONS
=====================
object #0 is the 'system' object and it is allowed to do some things that don't really fit in the object-oriented paradigm at all.

reconnect
syntax: ( reconnect <arg1> <arg2> )
when a connection is first made to the MU*, the connection is tied to object #0 and the 'connect' method on #0 is called. if the system object wishes to move a network connection from one object to another, the reconnect function allows it to do that. arg1 and arg2 must both be object numbers. the first indicates the object to move the connection from, the second is the object to move the connection to.

shutdown
syntax: ( shutdown )
this makes the server end its process, after making sure everything is saved onto disk.

setparents
syntax: ( setparents <arg1> <arg2> )
arg1 must be an OBJECT and arg2 must be a LIST of objects that arg1 should have as its parents.

log
syntax: ( log <string> )
outputs the string to the standard output of the server process. if the output is being piped to a file, this will therefore be saved to the log.

run-script
syntax: ( run-script <arg1> )
this is a potentially VERY powerful tool for an ambitious administrator ... in the main Coda directory, there is a subdirectory called 'script'. when you evaluate (run-script "some string"), it will try to execute the program "script/some string" and evaluate to the string result returned by the script. this disallows any arguments containing '..'
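as a rough picture of what run-script does, here is a small python sketch (python is not part of Coda, and this is not the server's code). run_script and SCRIPT_DIR are names invented for the sketch. it assumes the whole argument names the program to run, since the text above does not say whether a script can be given arguments, and it raises a python exception where the server would report an error value.

```python
import subprocess
from pathlib import Path

# Assumed location: the 'script' subdirectory of the main Coda directory.
SCRIPT_DIR = Path("script")


def run_script(name, script_dir=SCRIPT_DIR):
    """Run script_dir/name and return whatever it wrote to standard output."""
    # Refuse anything containing '..' so the caller cannot climb out of
    # the script directory.
    if ".." in name:
        raise ValueError("arguments containing '..' are not allowed")
    program = Path(script_dir) / name
    # subprocess.run blocks until the program exits, so the caller waits
    # for the complete output before it can do anything else.
    completed = subprocess.run(
        [str(program)], capture_output=True, text=True, check=False
    )
    return completed.stdout
```

a call such as run_script("uptime-report") would then run the (hypothetical) program script/uptime-report and hand back its output as a string.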
the 'scripts' in this directory can be anything executable on your system, so shell-scripts will work just as well as precompiled programs. one warning, however ... the whole Coda server has to sit and wait for the output of the script, so using this tool for trivial operations would cause noticeable lag.

connect-out
syntax: ( connect-out <arg1> <arg2> )
the first argument must be an object, the second is a string that must be in a very specific format: "#.#.#.# ####" ... it must be a _numerical_ IP address, then exactly one space, then the port number. if that is satisfied, it tries to connect the object in arg1 to that IP address so that 'echo' will send to that address and incoming information will go to 'parse' on that object. (the object MUST have 'parse', 'tell' and 'quit' defined, or inherit them.)

------------------------------------------------------------------------------

APPENDIX A: Error codes
========================
as noted earlier, errors will just be reported by number since this provides greater generality. built-in methods that return error codes are slowly being fitted with string equivalents, but it's kind of in a half-finished state now, so debugging will sometimes require reference to the following table to see what an error value means. in the next release, i'll provide an in-DB wrapper for this to make things a little more readable, but until then, the common in-server error numbers i use are listed here:

-1  Trying to get a negative (error ...) value.
10  Invalid regular expression used: unbalanced parentheses, etc...
11  Trying to save to a variable starting with ~ that is defined on a parent (i.e. invalid override)
12  Invalid value type passed to a built-in function.
13  Trying to compare incompatible types.
14  Trying to access past the end of a list or string, or before the beginning.
15  A symbol was used that isn't defined locally or on 'this' or its parents.
16  Too many arguments passed to method or function.
17  Too few arguments passed to method or function.
18  Invalid function for expression ... an expression must start with a built-in function or a method. if you start it with a symbol, that symbol must contain a method or built-in. this means that expressions like (15 "spam") don't make any sense since 15 isn't a function.
19  Trying to divide by zero, or some other arithmetic error.
20  Only #0 can execute the function that this is trying (see the system-only functions listed above)
21  Trying to access an object that isn't valid.
22  Unable to parse string.
23  Invalid symbol/value pair given to 'dotimes' or 'foreach' ... after these reserved words, you must put a pair (<symbol> <expression>) of the right types.
24  Trying to use a variable that hasn't been initialized.
25  Trying to remove a command that doesn't exist: couldn't match the string passed in.
26  That object does not have a network connection to it.
27  Trying to 'explode' on a bad separator.
28  Could not 'pass' properly.
29  Permission ... i never use this in the server, but it is common in the database for someone to check if you're OK to read/change some object ... if you fail, you frequently get this error.
30  Bad file ... using the 'filetext' or 'run-script' built-ins on files that are not legitimate.
31  This is used in the database to signify that no match was found.

if you EVER see a negative value other than -1, let me know as soon as possible (try th@summit-rs6000.stanford.edu) ... tell me what you were doing, what the input to the method was, and send me the text of the method where the error was raised, if you can. (i use negative values in a couple places where i could theoretically screw up)

APPENDIX B: Little gimmicks
============================
i read about some LISP dialect that allows you to use the ] character to mean 'close off all remaining parentheses.' therefore, (+ 1 (* 2 3 ] is syntactically correct in this dialect.
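the ] rule is easy to picture as a text rewrite: count how many parentheses are still open, and replace the ] with that many closing ones. the python sketch below does exactly that (python is not part of Coda, and close_brackets is a name made up here). it is not the Coda compiler's code, and two things in it are assumptions: a ] inside a string literal is left alone, and escaped quotes inside strings are not handled.

```python
def close_brackets(source):
    """Expand each ']' into as many ')' as there are parentheses still open."""
    out = []
    depth = 0            # number of '(' not yet closed
    in_string = False    # True while inside a "..." literal
    for ch in source:
        if in_string:
            out.append(ch)
            if ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
            out.append(ch)
        elif ch == "(":
            depth += 1
            out.append(ch)
        elif ch == ")":
            depth -= 1
            out.append(ch)
        elif ch == "]":
            # Drop the spaces just before the ']' so the result reads
            # like hand-closed code.
            while out and out[-1].isspace():
                out.pop()
            out.append(")" * depth)
            depth = 0
        else:
            out.append(ch)
    return "".join(out)


print(close_brackets("(+ 1 (* 2 3 ]"))   # prints (+ 1 (* 2 3))
```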
this is kind of inelegant to me, but i added it, mostly for people who are coding without the advantage of emacs's wonderful paren-matching feature, which highlights the open paren when you type a close paren. therefore,
(method () (+ 1 (* 2 (% 4 (/ 6 2 ]
will now parse into
(method () (+ 1 (* 2 (% 4 (/ 6 2)))))

APPENDIX C: Parsing
====================
in its purest form, parsing is left completely up to the object ... the 'parse' method receives a string of what the user typed up to the point where they hit return. realistically, this is a little too open-ended without some fast DB tools to help choose a method to call and the arguments to give it. to allow you to keep the generality while still having a lot of this work done behind the scenes, i've provided the built-in function 'matchcmd' to give an object a list of methods that the user probably intended when typing that line, paired up with the input string divided as the user probably intended it. the 'parse' method then only needs to try these until it finds a match-pair that is correct, then returns. a sample parse method from a minimal DB could look like this:

(method (instring)
  (foreach (match (matchcmd instring))
    (if ((index 1 match) (index 2 match))
        (return "OK"))))

this uses (matchcmd instring) to get a list of all match-pairs that this object can find, then it tries each one: (index 1 match) returns the first element of the match-pair (the method name to try), and that method is passed the second half of the match-pair, (index 2 match), which contains the input string chopped up for the method into a list of substrings.

to illustrate this in a non-technical way, say you want to have method 'whisper-method' called whenever someone types:
whisper <something> to <something2>
you also want to be able to know what <something> and <something2> are without doing any extra work.
in this case, (matchcmd "whisper hey to bob") would give you back a list that included this pair:
'(WHISPER-METHOD '("whisper" "hey" "to" "bob"))
and if you took ((index 1 match) (index 2 match)) of this, you'd actually be calling
(whisper-method '("whisper" "hey" "to" "bob"))
this means that whisper-method only has to take the 4th element of the list it is passed to get the name of the player you are whispering to: "bob"
it only needs to take the 2nd element of the list to figure out what the message is: "hey"

ok ... now to the technical. how do you specify the 'whisper <something>...' stuff to the system? this is done through a list (i'll call it the 'sentence') of tokens. each token will match up with part of the input string if the sentence forms a match. the resulting '("whisper" "hey"...) list will correspond 1-to-1 with the matching sentence.

the simplest sentence is made up of a sequence of string tokens:
'("push" "button")
each string (case sensitive) will match up with part of the input line if it can. if the user tries to match this sentence with "push button", then this will give you '("push" "button") as the matched-list.

string tokens are simple, but not quite versatile enough for many uses. a lot of times you want to match a string made up of a certain sequence of characters: digits from 0-9 perhaps, or maybe only a single sequence of non-space characters. Coda supports 'Character Classes' to allow you to give certain symbols in place of a string as a token. for example, the character class 'NUMBER' will match any sequence of digits.
so you could specify a sentence like
'("deposit" NUMBER "coins")
when this matches the string "deposit 1234 coins", it will give you a parsed list
'("deposit" "1234" "coins")
other common character classes are REST (matches the rest of the line) and SOME (matches the rest of the line as long as there is at least one character)

the third type of token can be built using an expression with 'or':
'("push" (or "up" "down" NUMBER))
in this case, if you try to match "push up" you will get '("push" "up")
if you match "push 321" you will get '("push" "321")

the fourth (and final) type of token is the most powerful by far. there are a lot of times when you want to match all of the text BEFORE some divider, like a preposition. an expression using 'before' allows this:
'("whisper" (before "to") "to" REST)
this will match the original example above ... "whisper hey to bob" and give you '("whisper" "hey" "to" "bob")
similarly, it would also match "whisper how's it going to bob" as '("whisper" "how's it going" "to" "bob")

a slight problem arises if the clause before the divider contains the divider itself ... e.g. if you want to match "whisper go to hell to bob" and have it work correctly. this is resolved by making 'before' ignore any text within "" marks:
whisper "go to hell" to bob
will correctly match to '("whisper" "go to hell" "to" "bob")

you may have noticed that the parsing snarfs up extra spaces. these aren't actually necessary for divisions, however, so you could set up an alternate whisper command matching "whisper bob=go to hell" (TinyMUD syntax):
'("whisper" (before "=") "=" REST)
you could even get REALLY fancy like i do in the default DB and match both correctly:
'("whisper" (or (before "=") (before "to")) (or "=" "to") REST)
the method that receives the match only has to check the third index of the list to find out which syntax was used, since this will contain either "=" or "to".
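the four token types described above (string, character class, 'or' and 'before') can be modelled in well under a page of python. python is not part of Coda, and this is a sketch of the behaviour described in this appendix, not the server's algorithm. the names (candidates, match_tokens, the NUMBER/REST/SOME stand-ins and the tuple form used for 'or' and 'before') are invented for the sketch. assumptions: string tokens are matched by plain prefix comparison, 'before' tries the earliest occurrence of the divider first and backs up if the rest of the sentence fails, and quoting is only recognized when the clause starts with a " mark.

```python
import re

# Stand-ins for Coda's character-class symbols (hypothetical Python names).
NUMBER, REST, SOME = object(), object(), object()


def candidates(token, text):
    """Yield (matched_piece, remaining_text) pairs for one token."""
    if isinstance(token, str):                  # plain string token, case sensitive
        if text.startswith(token):
            yield token, text[len(token):]
    elif token is NUMBER:                       # any sequence of digits
        m = re.match(r"[0-9]+", text)
        if m:
            yield m.group(), text[m.end():]
    elif token is REST:                         # rest of the line, may be empty
        yield text, ""
    elif token is SOME:                         # rest of the line, at least one char
        if text:
            yield text, ""
    elif token[0] == "or":                      # ("or", alt1, alt2, ...)
        for alt in token[1:]:
            yield from candidates(alt, text)
    elif token[0] == "before":                  # ("before", divider)
        divider = token[1]
        if text.startswith('"'):                # quoted clause: dividers inside are ignored
            end = text.find('"', 1)
            if end != -1:
                yield text[1:end], text[end + 1:]
            return
        i = text.find(divider, 1)
        while i != -1:                          # try each occurrence, earliest first
            yield text[:i].rstrip(), text[i:]
            i = text.find(divider, i + 1)


def match_tokens(tokens, text):
    text = text.lstrip()                        # the parser snarfs up extra spaces
    if not tokens:
        return [] if text == "" else None
    for piece, remaining in candidates(tokens[0], text):
        tail = match_tokens(tokens[1:], remaining)
        if tail is not None:
            return [piece] + tail
    return None


def match_one(sentence, text):
    """Return the matched list, or [] when the sentence does not match."""
    result = match_tokens(list(sentence), text.strip())
    return result if result is not None else []


whisper = ["whisper", ("or", ("before", "="), ("before", "to")), ("or", "=", "to"), REST]
print(match_one(whisper, "whisper hey to bob"))      # ['whisper', 'hey', 'to', 'bob']
print(match_one(whisper, "whisper bob=go to hell"))  # ['whisper', 'bob', '=', 'go to hell']
```

one thing the sketch makes visible: because the divider is found by plain substring search, a "to" hiding inside a longer word would also count as a divider here. the text above does not say how the real parser treats that case.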
to tell the system that you want an object (and its children) to call method X when sentence Y is matched, the 'addcmd' function is used: (addcmd Y X)
so if i wanted to have it call the 'whisper-method' symbol when the above sentence was matched, i could specify:
(addcmd '("whisper" (before "=") "=" REST) 'whisper-method)
from this point on, whenever (matchcmd "whisper something = else") is called on this object, one of the pairs returned by it will be
'(WHISPER-METHOD '("whisper" "something" "=" "else"))

this sentence-symbol parse structure is best learned by example ... look at the commands on object #9 and #11 to see how the line is tokenized, then see how the corresponding methods handle the resultant list of strings.

administrator note: to add new character classes to the system, add to the bottom of the COMMAND_CONSTANTS file in the main directory and restart the server.

APPENDIX D: Regular Expressions
================================
there are whole chapters in CS textbooks on the wonder and power of regular expressions, but i'll just give you a little glossing of this particular syntax used by gnu & Coda. to really get the hang of these, you need to try them out a bit since precedence, binding, etc... are all a little odd at first.

most characters just represent themselves in a regular expression ... so 'a' always just matches with 'a' normally. a few characters have special meanings in matching, though:

.  the period will match ANY character, so you can use it to fill any space: "ac.c" will match "acdc" "accc" ... but NOT "acc" since the period must match SOME character.

*  this indicates 'any number of occurrences of the preceding object' ... so the string "ab*c" will match "ac" "abc" "abbbc", etc...

+  this indicates 'one or more of the preceding object' ... so the string "ab+c" will match "abc" and "abbc" but NOT "ac"

?  this indicates 'at most one of the preceding object' ...
so the string "ab?c" will match "ac" and "abc" but NOT "abbbc"

[]  brackets indicate that any one character within them is meant to match here ... so "a[bc]d" will match "abd" or "acd" ... but it will NOT match "ad" or "abcd" since the brackets indicate just one character. to match those two strings, we could add a *: "a[bc]*d" that says 'one a followed by any number of b's or c's, then one d'. if you start the brackets with a ^ character, you indicate that it is supposed to match everything _EXCEPT_ the characters in the brackets ... so "[^ ]+" means 'any sequence of non-spaces' and it will match "abc" but not "ab c". you can also put a hyphen in brackets to indicate a range of possible characters. for example, "[0-9]" will match "5" but not "m". similarly, "[0-9a-f]+" will match "deadbeef666" but not "deadhead"

()  these are used to group characters into a block ... for other operations to act on. "(ab)" will match "ab", but you can use this group more powerfully, since "(ab)*" will match "abab" ... the tokens between the parentheses become one solid unit.

\  this escape character is used to say 'treat the character after me as normal' ... this allows you to match things like * and [ in a semi-civilized way: "a\[b\]c\." matches "a[b]c." unfortunately, when you are entering a string literal in the compiler, it also uses backslashes as the escape character ... so when you compile an expression containing "\a", the Coda compiler converts that into just "a" ... so you might have to jump through a lot of ugly hoops to get what you want when you start working with the \ character ... don't be surprised if you're typing in up to six \\\\\\ in a row, since "\\" will match "\"

|  this character indicates that either the string before or after it is to match ... so "to|for|on" will match "to" "for" or "on" ... but it will not match "tofor" or "foron". since the | operator tends to grab a lot more than you want, you can limit the scope by using parentheses ...
"give (to|for) me" will match "give to me" or "give for me", but "give to|for me" will only match "give to" or "for me"

^  this character signifies the beginning of the string (when it isn't within []) ... therefore, "^abc" will match "abc" ... but it wouldn't match the middle section of "dabcd": since the abc in the regular expression comes after the ^, it will only match strings where the abc starts the string.

$  the complement of ^ ... this character signifies the end of the string, so a regular expression ending in $ will only match where the string ends ... so "abc$" will match "abc" but it would not match the middle portion of "dabcd"

(if anyone can figure out how to get character classes working in regex.h, let me know ... then i can add those to the regular expression matching.)

mail me documentation suggestions/comments/revisions and you'll get a big, fat, juicy kiss. promise.

snarflrd@leland.stanford.edu