Message-ID: <94Mar16.211449pst.58375@mu.parc.xerox.com>
Date: Wed, 16 Mar 1994 21:14:38 -0800
From: Pavel Curtis <Pavel@parc.xerox.com>
To: MOO-Cows@parc.xerox.com
Subject: The 1.8.x planned feature list
OK, at long last, here's my list of everything I am consciously planning to do
in 1.8.x. Note that `disk-basing' and `multiple inheritance' are not on this
list; the former is, as I've said, in the queue for 1.9.0, and the latter is
not in the plan at all.
There are no release times in this message; I have no estimates available, so
don't bother asking. Sorry.
The order of this list is almost entirely random and most certainly does not
reflect my priorities, which are somewhat poorly worked out right now. The
only firm decision in that regard is that the first 1.8.x release will contain
the first three tasks listed (cleaned-up internal DB interface,
exception-handling, and improvements to hostname lookup); this is because my
current sources already contain partial implementations of all of these and I
pretty much have to finish implementing them in order to get back to a
compilable set of sources...
Before anyone else can say it, I'm well aware of just how mind-bogglingly long
this list is; fortunately, most of the tasks are pretty small in scope,
implementable in no more than a few days of my time. I expect, therefore, to
have each succeeding 1.8.x release contain just one or two of the more major
pieces and a selection of several of the smaller ones. To help y'all
understand which changes I think fall in which categories, I've labelled each
change with one of the following three marks:
-- a small change involving less than a day of my time
++ a substantial change, but still implementable within a week
** a relatively major change, probably requiring more than a week
of my time
In all cases, these are only my best guesses and they are intended to include a
fair amount of slack, since I rarely get to spend even a majority of any week
doing hacking anymore.
Getting this list out to y'all is my part of a little bargain I'm hoping to
strike. Your part is to let me know, either publicly or privately, which ten
or so of these changes are most important to you. I'll take that feedback into
consideration as I decide what to work on next at any point. You can also help
by reminding me of things I've promised at some point but have not included in
the list. You can also, of course, try to convince me to add other things to
the list, but since it's so long already, it may be a tough sell.
Finally, let me reassure you that I will try to be good about posting a
proposed design for each of the more significant changes *before* I implement
it, so that y'all can have a voice in the process. This is particularly true
for the following changes:
exception-handling
in-DB command parsing
floating-point numbers
I hope to send out my current (more-or-less complete) design for the first one
of these in the very near future. One advantage of doing so is that I'll then
be able to erase a good portion of my whiteboard again...
Enough prologue; here's the list.
Pavel
** Clean up internal DB interface
Currently, the majority of the server code makes a lot of assumptions
about the data structures and in-memory nature of the database; one
important step on the way to disk-basing is to eliminate those
assumptions. I am about 2/3 of the way through converting the server
over to a strictly procedural interface to the DB that (hopefully)
hides completely the fact that it's still kept in memory; if I do this
well, the eventual switch to disk-basing will be much more localized,
just covering the specifically DB-oriented modules of the code, a
relatively small fraction of the server.
** Exception-handling and -signaling
The current `d' bit on verbs is clearly an extremely clumsy,
error-prone, and insufficient mechanism for handling the errors raised
by various built-in server operations. The new mechanism, to be
described fully in an upcoming message, makes it easy for programmers
to catch just the errors they are prepared to handle, to raise and
catch new kinds of errors specific to their applications, to cope
reasonably cleanly with other people's code running out of ticks and/or
seconds, and to capture the error traceback in a usable form before it
can be displayed to an unwitting user. It may also brew your coffee
for you in the morning.
++ multiple-process workaround for hostname lookup problems
A lot of people are having a variety of problems with the consequences
of the server's current policy of abruptly interrupting the C library
functions gethostbyaddr() and gethostbyname() when they take `too long'
to translate a numeric IP address to/from a human-readable host name.
I have finally worked out a portable mechanism for avoiding having to
do that violence, involving the use of multiple extra UNIX processes to
handle the lookups. It's pretty damned gross, in my opinion, but it
should prove to be a great deal more robust.
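A minimal sketch of the general idea, in Python rather than the server's C and not the mechanism actually planned: the blocking lookup runs in a separate process, so the parent can give up after a timeout and discard the helper instead of interrupting a library call in its own process. The names `lookup_name` and `child_source`, and the choice of one helper per lookup, are assumptions made purely for illustration.

```python
import subprocess
import sys

# Program run by the helper process: resolve one address and print the name.
_CHILD = (
    "import socket, sys\n"
    "print(socket.gethostbyaddr(sys.argv[1])[0])\n"
)

def lookup_name(ip, timeout=5.0, child_source=_CHILD):
    """Return a host name for ip, or ip itself if the helper fails or is too slow."""
    try:
        done = subprocess.run(
            [sys.executable, "-c", child_source, ip],
            capture_output=True, text=True, timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run has already killed the helper; fall back to the number.
        return ip
    name = done.stdout.strip()
    return name if done.returncode == 0 and name else ip
```

A slow or hung resolver then costs only the helper process; the caller always gets an answer within roughly `timeout` seconds.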
--------------- Tasks above this line are already in progress. ---------------
++ MOO-code performance profiling
This will make it possible for MOO programmers to find out where all of
the ticks/seconds are going in their applications, with a breakdown by
MOO object/verb and built-in functions. Finally you'll be able to tell
whether or not it's $list_utils:assoc that's really slowing you down.
It will also be possible for wizards to use this facility to discover
what code is actually spamming everyone when the lag is high.
++ MOO-code network servers
It will be possible for wizard-blessed code to listen for outside
network connections on other TCP ports than the one the server itself
is using, making in-MOO Gopher, SMTP, WWW, NNTP, etc. servers easy to
write.
++ suspend()/resume(task-id [, value])
It will be possible for a task to suspend `forever' and for the owner
of any suspended/forked task to cut short their associated waiting
times, making them immediately eligible to run.
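As a rough illustration of the intended semantics only, here is a toy model in Python. The class `TaskQueue`, its method names, and the explicit `now` argument are all hypothetical; the ownership check on resume() is left out.

```python
import math

class TaskQueue:
    """Toy model of suspended tasks; times are plain numbers supplied by the caller."""

    def __init__(self):
        self.wake_at = {}     # task_id -> time at which the task may run again
        self.wake_value = {}  # task_id -> value handed to the task by resume()

    def suspend(self, task_id, now, seconds=None):
        # No delay given means "forever": only resume() can wake the task.
        self.wake_at[task_id] = math.inf if seconds is None else now + seconds

    def resume(self, task_id, now, value=0):
        if task_id not in self.wake_at:
            raise KeyError(f"task {task_id} is not suspended")
        self.wake_at[task_id] = now  # immediately eligible to run
        self.wake_value[task_id] = value

    def runnable(self, now):
        return sorted(t for t, when in self.wake_at.items() if when <= now)
```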
-- Unprogrammed verbs equivalent to ones with empty programs
Right now, unprogrammed verbs are, in most respects, utterly invisible
(verb-call can't find them, for example) while verbs with empty
programs (i.e., with the empty list of statements) are equivalent to
ones that simply return zero. I want to change the treatment of
unprogrammed verbs to make them just like empty ones, for the sake of
consistency and simplicity of the programmer's model.
-- callers([task-id [, include-line-numbers]])
The first argument would allow a wizard or the owner of a task to get
ahold of the stack for a suspended task. The second would allow you to
get the current line number of all of the frames on the stack, for use
perhaps in generating more useful traceback descriptions.
++ New, much faster regexp package
The current implementation of the match() built-in function is quite
slow and poorly-written. The GNU project has a new implementation of
roughly the same functionality that is much faster and more robust.
++ New built-in function `set_connection_user(conn, user)'
This would allow for more flexibility in how player objects are
associated with network connections. For example, this would allow
systems to use read() to prompt for information during the login
process, or the construction of a command to allow a user to
temporarily act as some other user (such as an administrator
temporarily acting as their associated wizard, like in the UNIX `su'
command). The design of this facility is not yet set.
-- open_network_connection() suspends the calling task
This would allow the server to continue doing useful work while waiting
for an outbound connection attempt to succeed or fail without, as it
currently does, imposing a set time limit on how long that takes.
++ Setting system parameters during server operation
This would allow you to change the ticks/seconds limits and other
system parameters from MOO code, while the server is running. Some of
the new facilities listed below, like making certain functions
wiz-only, would also be controlled by this mechanism.
++ Limit on the number of background tasks per user
I'd like to make this configurable on a per-user basis so that some
folks could be allowed more tasks than others. This would make it
easier to keep from losing when somebody blows it in their use of
`fork' and tries to create huge gobs of tasks.
-- Maybe some facilities for binary I/O, at least on outbound connections
This shouldn't be hard, but I haven't decided on a form for the
facility to take, or on the permissions checks that should apply.
Basically, I just want to enable the use of non-ASCII or
non-line-oriented services from inside the MOO.
++ Statements analogous to `break' and `continue' from C
I haven't completely decided on the form this will take, but it will
probably allow exit/continuation of any surrounding loop, not just the
innermost one as in C.
++ Optionally making many of the built-in functions wiz-only
This would cover all functions that aren't pure data operations, like
length(), index(), etc. Probably it will be possible to make only
certain selected functions wiz-only.
-- Optionally making many of the built-in properties wiz-only
Ditto. Some folks, for example, have expressed interest in keeping
.location/.contents secret.
++ Unambiguous (1-based) numeric names for verbs
This would finally allow reliable, unambiguous access to any verb on an
object, regardless of naming conflicts. Rog posted the idea that I
want to implement on this mailing list sometime (late?) last year.
-- Optionally disabling numeric strings as special indexical verb names
This would allow a complete switchover from the old, bad verb-indexing
method to the new, good one. Without disabling this, it could be
impossible to refer by name to certain verbs.
++ `$' notation in subscripting brackets
I put forward this idea on MOO-Cows last year; the idea in short is
that the subscripting brackets `[ ... ]' would allow within them the
use of the special expression `$', meaning the length of the value
being subscripted. This allows, for example, expressions like
`x[2..$]' to get the `rest' of a list after the first element or
`x[random($)]' to get a random element of a list.
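The meaning of `$' can be sketched in Python, using the string "$" as a stand-in for the special expression and translating MOO's 1-based subscripts to Python's 0-based ones. The helper names `moo_index` and `moo_range` are hypothetical, and the range checking is only a guess at what the server would do.

```python
def _resolve(index, value):
    # "$" stands for the length of the value being subscripted.
    return len(value) if index == "$" else index

def moo_index(value, i):
    """1-based x[i], where i may be "$"."""
    i = _resolve(i, value)
    if not 1 <= i <= len(value):
        raise IndexError("Range error")
    return value[i - 1]

def moo_range(value, lo, hi):
    """1-based, inclusive x[lo..hi], where either bound may be "$"."""
    lo, hi = _resolve(lo, value), _resolve(hi, value)
    if lo > hi:
        return value[:0]  # empty list or string
    if lo < 1 or hi > len(value):
        raise IndexError("Range error")
    return value[lo - 1:hi]
```

With these, `moo_range(x, 2, "$")` plays the role of `x[2..$]'.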
** Support for in-db parsing
This will include built-in function support for efficient emulation of
the current built-in parser and close variants. The built-in parser
will no longer exist outside of those new built-ins. The design of
this feature is not at all complete yet.
-- Adding OUTPUTPREFIX and OUTPUTSUFFIX as synonyms for PREFIX and SUFFIX
I'm told that this would make the MOO compatible with some other
servers, allowing the same somewhat fancy clients to be used with both.
** Support for floating-point numbers
This will include simple arithmetic, some number of useful built-in
functions (including the most common transcendentals), and formatted
conversion of numbers to strings.
-- Maybe moving checkpoint scheduling entirely inside the DB
This would make the scheduling of checkpoints much more flexible and
much easier for the wizards to change as demand warrants. Basically,
this would involve simply removing code from the server that currently
uses $dump_interval to decide when to make a checkpoint; instead, only
the use of the existing dump_database() built-in function would cause a
checkpoint to be taken. For those who might guess otherwise, this
remains useful even in the eventual disk-based server, which still has
something like checkpointing to do in order to make sure that there's a
consistent copy of the DB on disk for backup mechanisms to save.
-- Prevent multiple verbs on a single object with the same name and args
There's no good reason to allow this sort of thing; preventing it would
avoid a certain amount of confusion and would mean that name/args is
a unique identifier for a verb on an object.
-- Built-in functions computing the size of an object/value in bytes
This is obviously useful for a number of MOOs that are moving to
byte-based quota mechanisms; with the speedup possible by coding these
primitives in C, it should be possible to keep the quotas much more
current. I have considered trying to get the server to do all of the
work of byte-based quota management, but it's pretty hard/inefficient
to do it much better than what's already done in MOO code.
-- queued_tasks([value counts_only [, obj player]])
If COUNTS_ONLY is true, then for normal players only a count of tasks
is returned and for wizards an a-list {{player, count}, ...} is
returned, giving the counts for every player who has any queued tasks
at all. If PLAYER is specified (only a wizard can specify a different
player), then only the data for the specified player is returned. This
will make it easier for wizards to quickly determine the state of the
background task queues in the system.
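One possible reading of these rules, sketched in Python. The queue representation (a list of `(task_id, owner)` pairs), the `caller` and `is_wizard` parameters, and the behavior when both COUNTS_ONLY and PLAYER are given are assumptions, since the description above leaves those details open.

```python
from collections import Counter

def queued_tasks(queue, caller, is_wizard, counts_only=False, player=None):
    """queue is a list of (task_id, owner) pairs; caller and owner are object numbers."""
    if player is not None and player != caller and not is_wizard:
        raise PermissionError("E_PERM")  # only wizards may ask about other players
    if player is not None:
        visible = [t for t in queue if t[1] == player]
    elif is_wizard:
        visible = list(queue)
    else:
        visible = [t for t in queue if t[1] == caller]
    if not counts_only:
        return visible
    if is_wizard:
        # a-list {{player, count}, ...} covering every player with queued tasks
        counts = Counter(owner for _, owner in visible)
        return [[owner, n] for owner, n in sorted(counts.items())]
    return len(visible)
```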
++ Persistent `handle' on a verb/property across all kinds of changes
This is a new kind of MOO value, a `verbdef', which refers to the thing
that `add_verb()' creates and that `set_verb_code()',
`set_verb_args()', and `set_verb_info()' modify. This is quite useful
for a MOO-code browser/editor, which could be guaranteed to keep its
hand on the verb that the user specified at the beginning of a session
in spite of concurrent changes to its name, args, code, etc. Ditto for
properties once the property-renaming change is introduced (see below).
++ New MOO value type: tables
Tables are like lists except that they map from values to values,
rather than just from a small range of numbers to values; this is
useful in a wide range of applications where programmers are trying to
associate values with arbitrary `keys'. Tables will be implemented
using hashing and some tricky collaboration with the reference counter,
so that many common styles of use will be performed efficiently using
side-effects without ever showing the MOO programmer anything that
violates the current model that all values are immutable.
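The collaboration with the reference counter might work roughly as follows. This is a Python sketch with an explicit, manually maintained `refs` field standing in for the server's own reference counts; the names `Table` and `table_set` are hypothetical. An update is done in place only when nobody else can observe it, and on a private copy otherwise.

```python
class Table:
    """Toy table value with an explicit reference count."""

    def __init__(self, items=None):
        self.items = dict(items or {})
        self.refs = 1

def table_set(table, key, value):
    """Return a table equal to `table` plus key -> value; other holders see no change."""
    if table.refs == 1:
        # Sole reference: a side-effect here is invisible to everyone else.
        table.items[key] = value
        return table
    # Shared: the caller gives up its reference and gets a private copy.
    table.refs -= 1
    copy = Table(table.items)
    copy.items[key] = value
    return copy
```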
-- hash(value)
A primitive consistently mapping arbitrary MOO values into quite random
integers (or maybe lists of integers, to get enough bits), enabling a
variety of applications (like inter-MOO value transfer) where a quick
check is needed to determine if two values are the same without having
to actually transfer the value first. For the cognoscenti, this will
probably involve the use of either MD5 or Snefru, two available
cryptographically-secure hash functions.
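A sketch of how such a primitive could behave, in Python with the standard `hashlib` MD5 implementation. The canonical text form produced by `canonical` and the name `value_hash` are assumptions, and only integers, strings, and lists are handled here.

```python
import hashlib

def canonical(value):
    """Render a value as an unambiguous literal so that equal values give equal bytes."""
    if isinstance(value, list):
        return "{" + ", ".join(canonical(v) for v in value) + "}"
    if isinstance(value, str):
        return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'
    if isinstance(value, int):
        return str(value)
    raise TypeError(f"unsupported value type: {type(value).__name__}")

def value_hash(value):
    """Map a value to a 128-bit integer derived from its MD5 digest."""
    digest = hashlib.md5(canonical(value).encode("utf-8")).digest()
    return int.from_bytes(digest, "big")
```

Two MOOs could then compare `value_hash` results before deciding whether a value needs to be transferred at all.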
-- toascii(char)/tochar(num)
Simple functions for converting between characters and numbers, making
it easier (for one thing) to cope with binary I/O.
++ foo:bar, call_proc(proc, @args)
This makes available both parts of the expression `foo:bar(@args)' as
separate pieces: (1) the verb lookup, including the permissions checks,
and (2) the actual verb invocation, including the passing of the
arguments. One thing this enables is the handing out of the ability to
call a particular `!x' verb only to certain trusted parties without
requiring hairy permissions checks in the verb itself. Also, and this
is perhaps of some use for in-DB command parsing, a wizard could look
up a verb with the wizardly permissions and then change to some user's
permissions before actually invoking the verb. In conjunction with the
`x' bit change below, one could finally make all of the private verbs
on an object `!x' and thus avoid the necessity for many in-verb
permissions checks, since only trusted folks would be able to call the
verb in the first place.
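The split between lookup and invocation can be illustrated with a small Python sketch. The `Verb` class, the `lookup` function, and the dictionary of verbs are hypothetical stand-ins, and the permission rule used is the regularized `x' bit behavior proposed in this list (owner or wizard may always call), not the current one.

```python
class Verb:
    def __init__(self, owner, executable, code):
        self.owner = owner            # who owns the verb
        self.executable = executable  # the `x' bit
        self.code = code              # a Python callable standing in for MOO code

def lookup(verbs, name, caller, is_wizard=False):
    """Step 1: find the verb and check permissions, returning a handle to it."""
    verb = verbs[name]
    if not (verb.executable or is_wizard or caller == verb.owner):
        raise PermissionError("E_PERM")
    return verb

def call_proc(proc, *args):
    """Step 2: invoke an already looked-up verb; no further permission check."""
    return proc.code(*args)
```

The owner of a `!x' verb can look it up once and hand the result to a trusted party, who can then call it without being able to look it up directly.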
-- Regularizing the `x' bit on verbs
The `x' bit currently acts quite differently than the `r' or `w' bits
on verbs; if `x' is off, then *nobody* can call it from MOO-code, not
even the owner or a wizard. I want to change it to work like the other
bits, whereby the owner or a wizard could call any verb and everybody
else would get E_PERM if `x' is not set. I know that this would break
some existing code that counts on the current behavior where a `!x'
verb is simply invisible to verb-call's lookup procedure, but I think
the number of such cases is pretty small. Feel free to object if you
think I'm wrong about that.
++ Binary, compact DB file format
Right now, the DB file is written out entirely in printable ASCII, with
the code for all verbs written out in source form. This slows down the
checkpointing code, makes the file substantially bigger than it might
be, and really slows down the DB loader on server startup, since it has
to do a lot of relatively expensive parsing. I want to move to a
compact binary format, in particular storing verb code in its compiled
state. This should help both in speed and space, perhaps reducing both
by as much as a third or more. Also, most of the new code for this
will be useful in the disk-based system as well, since the DB will be
in a binary format then anyway (probably GDBM, for those who care).
++ Server reboot without losing connections
This is a clever/gross hack whereby the server (but not the server
machine) could be rebooted without kicking off all of the users. This
will be more useful in the disk-based server but I think will be
sufficiently so even now. The idea is that the server would stop
processing commands for a bit while it makes a special checkpoint
(which includes *all* of the state of the server, including the pending
input/output for all connections and the information about who's
attached to what connection), uses the UNIX `exec()' function to start
running the new copy of the server without losing all of the network
connections, and loads in all of the data stored by that special
checkpoint. Since that data includes information about which file
descriptors are for which connections, it could re-establish all of the
state from the previous server. The upshot is that the users would
just see a rather long pause in server response (much shorter for the
eventual disk-based server) instead of being booted and having to
reconnect later. For many MOOs, this could be done at night or early
in the morning without any users noticing.
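The shape of this trick, sketched in Python under stated assumptions: the state file format (JSON), the function names, and the layout of the `connections` dictionary are all made up for illustration, and only the connection table is saved here rather than the whole server state.

```python
import json
import os
import sys

def save_state(path, connections):
    """connections maps file descriptor -> {"user": ..., "pending_output": [...]}."""
    with open(path, "w") as f:
        json.dump({str(fd): info for fd, info in connections.items()}, f)

def load_state(path):
    """Rebuild the descriptor -> connection-info table written by save_state()."""
    with open(path) as f:
        return {int(fd): info for fd, info in json.load(f).items()}

def reexec(path, connections, argv):
    """Checkpoint, keep the sockets open across exec, and start the new server image."""
    save_state(path, connections)
    for fd in connections:
        os.set_inheritable(fd, True)  # Python closes descriptors on exec by default
    os.execv(sys.executable, [sys.executable] + argv)  # does not return on success
```

After the exec, the new process would call `load_state()` and find the same descriptor numbers still attached to the same network connections.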
++ `player' -> `user'
I want to remove all uses of the word `player' from the programming
language, replacing it in all cases with `user'. This affects a number
of built-in function names and the built-in variable `player'. This
will be done in such a way as to allow an old DB to be automatically
converted during loading. It may also be possible to allow either name
to work even after loading, so that code could still be relatively
easily transferred from an older MOO to a newer one.
-- $vname(...)
This is a new notation equivalent to `#0:vname(...)', by analogy to the
current `$' notation for properties on #0. In systems that have made
many built-in functions wiz-only, it might be useful to define any
publicly-callable versions on #0 so that they could still be called
with a concise name.
++ Archwizard's emergency back door
This is a facility allowing the archwizard (the one with access to the
actual machine on which the server is running) to start up the server
in a very special single-user mode where arbitrary expressions and/or
commands could be executed as a wizard without necessarily having to go
through any verbs in the database. This makes it possible to
more-or-less easily recover from nasty mistakes made even in such
critical places as #0:do_login_command(), etc.
-- Non-suspending read()
Currently, the server always suspends a task that calls read(), even if
there's already data available for reading on the connection. This
makes certain applications that read a bunch of data from a network
connection, like MOO-Gopher, pretty painfully slow. In the new server,
it will be possible to get the old behavior but also to have such calls
to read() return immediately with the read data. For MOO-Gopher and
other such applications, this could improve performance by orders of
magnitude.
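A toy Python model of the difference, with a sentinel standing in for "the task must suspend". The `Connection` class and the `SUSPEND` sentinel are hypothetical names, and how the new behavior would be selected from MOO code is not decided above.

```python
from collections import deque

SUSPEND = object()  # sentinel: nothing buffered, so the caller has to suspend

class Connection:
    def __init__(self):
        self.input_lines = deque()  # lines received but not yet read

    def receive(self, line):
        self.input_lines.append(line)

    def read(self):
        """Return a buffered line immediately if there is one, else ask to suspend."""
        if self.input_lines:
            return self.input_lines.popleft()
        return SUSPEND
```

A task draining a long reply would then loop on `read()` without giving up its turn until the buffer is actually empty.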
++ Fix 16 v 32 v 64 bit problems?
I could go to the effort of making sure that the server always gets the
sizes of integers it needs even on machines where C's `int' type is not
32 bits long. It's not entirely clear how important such a change
would be, though, since almost all machines do have a 32-bit `int' type
and I think all of the others have that as an option. Feel free to let
me know if I'm wrong about this *and* it's important to you to have the
server run on such a machine.
-- Allow renaming of properties
Currently, you cannot rename a property the way you can a verb. I
can't see any reason for this inconsistency and it certainly has been a
pain for a number of people at various times.
++ Verb argument names, checking
It has always been a personal embarrassment for me, and a pain for most
MOO programmers, that verbs can only get their arguments as one list
that they must destructure (and check the length of) explicitly. I
have a design for a simple argument name/number specification construct
that also allows for optional arguments (with default values) and a
list of the `rest' of the arguments; with this, you'd also get
automatic checking for the correct number of arguments to a verb and
raising of a more evocative error message than `Range error'.
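The actual design is not given here, but the kind of binding described (required names, optional names with defaults, and a `rest' list) can be sketched in Python. The function `bind_args` and the shape of its specification arguments are assumptions for illustration only.

```python
def bind_args(required, optional, rest_name, args):
    """Bind a verb's argument list to names.

    required: list of names; optional: list of (name, default) pairs;
    rest_name: name for any leftover arguments, or None to forbid extras.
    """
    most = len(required) + len(optional)
    if len(args) < len(required) or (rest_name is None and len(args) > most):
        raise TypeError(f"wrong number of arguments: got {len(args)}")
    bound = dict(zip(required, args))
    extra = args[len(required):]
    for i, (name, default) in enumerate(optional):
        bound[name] = extra[i] if i < len(extra) else default
    if rest_name is not None:
        bound[rest_name] = list(extra[len(optional):])
    return bound
```

The explicit length check is what lets the error say something more useful than `Range error'.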
?? General server performance improvements
It has been a long time since I did any performance profiling of the
server and tried to make it substantially faster. I am guessing that
there's at least a factor of two more-or-less readily available and
perhaps as much as a factor of ten. Obviously, such an improvement
would be welcome in a number of MOOs. It's hard to guess just how much
effort is involved here, since I don't know how many relatively simple
changes there are that could yield big improvements.
---------------------- End of monstrous list of features ----------------------