03 Mar, 2008, David Haley wrote in the 1st comment:
Votes: 0
FYI: Nick Gammon wrote an example tool to do analysis on area files.

I like the approach of using external tools for analysis rather than doing it all in the codebase. Of course, if you change the area format, you need to change two programs: the tool and the MUD… (This is one reason why I thought it would be a good idea to have the areas stored in a Lua format from the start. It's basically key-value, it just happens to be in a form that Lua can immediately understand.)

Anyhow, this is pretty interesting, and it might be a good way to spot potential problems in areas.
03 Mar, 2008, quixadhal wrote in the 2nd comment:
Votes: 0
If we get the entire format to be proper key/value (as in one key -> one value), storing it in a database becomes pretty easy as well. We need to get rid of things like:
Actflags   npc sentinel immortal~
Affected detect_invis~
Stats1 500 20 0 -2 10000 40000

and replace them with:
IsNPC true
IsSentinel true
IsImmortal true
HasDetectInvis true
Alignment 500
Level 20
MobTHAC0 0
AC -2
Gold 10000
Exp 40000

before it really becomes useful.
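
To make the conversion concrete, here is a minimal C++ sketch that expands a packed Stats1 line into one key per value. The key names and their order come from the example above; the function name `expand_stats1` and the choice to stop early on a short line are assumptions made for illustration, not how any existing codebase does it.

```cpp
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Field order as given in the Stats1 example above.
static const char* const kStats1Keys[] = {
    "Alignment", "Level", "MobTHAC0", "AC", "Gold", "Exp"
};

// Turn "500 20 0 -2 10000 40000" into (key, value) pairs, one per number.
// Values are kept as strings so they can be written straight back out.
std::vector<std::pair<std::string, std::string> >
expand_stats1(const std::string& values) {
    std::vector<std::pair<std::string, std::string> > out;
    std::istringstream in(values);
    std::string token;
    for (const char* key : kStats1Keys) {
        if (!(in >> token)) break;  // fewer numbers than keys: stop early
        out.emplace_back(key, token);
    }
    return out;
}
```

Each resulting pair can then be written as its own line, or stored as its own database column, without the reader needing to know the positional order.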

Of course, I'd probably actually use a junction table for the affects, rather than a field for each one, but you get the idea.

For my old mud, I wrote a converter program to change it from its custom Diku style into a few other formats. Later, the idea of moving it verbatim to a new codebase faded, but the converter itself did a lot of extra checking that the codebase never bothered to do (such as pointing out mismatched exits, isolated rooms, mobs or objects whose numbers didn't match the zone they belonged to, etc).
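
As a rough illustration of two of those checks (mismatched exits and isolated rooms), here is a minimal C++ sketch. The `Room` struct, the `RoomTable` typedef, the four-direction numbering and the function names are all hypothetical; a real Diku-style area has more directions and more data per exit.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical room record: exits map a direction (0=N, 1=E, 2=S, 3=W)
// to the vnum of the destination room.
struct Room {
    std::map<int, int> exits;
};

typedef std::map<int, Room> RoomTable;  // keyed by room vnum

// Opposite of a compass direction in this sketch's numbering.
int reverse_dir(int dir) { return (dir + 2) % 4; }

// Report exits that lead to a missing room or have no matching return exit.
std::vector<std::string> find_mismatched_exits(const RoomTable& rooms) {
    std::vector<std::string> problems;
    for (const auto& entry : rooms) {
        const int vnum = entry.first;
        for (const auto& link : entry.second.exits) {
            const int dir = link.first;
            const int dest = link.second;
            const auto target = rooms.find(dest);
            if (target == rooms.end()) {
                problems.push_back("room " + std::to_string(vnum) +
                                   ": exit to missing room " + std::to_string(dest));
                continue;
            }
            const auto back = target->second.exits.find(reverse_dir(dir));
            if (back == target->second.exits.end() || back->second != vnum) {
                problems.push_back("room " + std::to_string(vnum) + ": exit to " +
                                   std::to_string(dest) + " has no matching return exit");
            }
        }
    }
    return problems;
}

// Rooms that have no exits of their own and that no other room leads to.
std::vector<int> find_isolated_rooms(const RoomTable& rooms) {
    std::set<int> destinations;
    for (const auto& entry : rooms)
        for (const auto& link : entry.second.exits)
            destinations.insert(link.second);

    std::vector<int> isolated;
    for (const auto& entry : rooms)
        if (entry.second.exits.empty() && destinations.count(entry.first) == 0)
            isolated.push_back(entry.first);
    return isolated;
}
```

One-way exits are sometimes intentional, so a tool like this would report them as warnings for a builder to review rather than as hard errors.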
03 Mar, 2008, Guest wrote in the 3rd comment:
Votes: 0
If you're going to look toward storing the area data in a database, then you can just use bitvectors and store them that way in the DB. The only reason strings are used the way they are with the current format is that it makes it easier for humans to read the data and make changes to it offline. You obviously don't get that luxury with a database.

Storing things in a text file with IsSentinel, IsImmortal, etc. is wasteful IMO. I think it would be almost as wasteful to store the fields in the DB in the same manner. Maintenance nightmare etc. Especially if you have dozens of flags in several tables to deal with.
03 Mar, 2008, quixadhal wrote in the 4th comment:
Votes: 0
It is slightly wasteful in a text file format, I agree. However, it does mean your load/store routines can use identical logic no matter what source they're pulling from.

In a database, a boolean field should be just as efficient as a bit vector, but you don't have to bother picking them apart. If you use junction tables, no storage is used for false values (the row is present if true, or absent), however you do incur the id penalty which means more space used overall (unless there are lots of bits that are usually false). The advantage though, is you don't have to alter your table schema if you decide to add or remove a bit field.
03 Mar, 2008, Noplex wrote in the 5th comment:
Votes: 0
quixadhal said:
If we get the entire format to be proper key/value (as in one key -> one value), storing it in a database becomes pretty easy as well. We need to get rid of things like:
Actflags   npc sentinel immortal~
Affected detect_invis~
Stats1 500 20 0 -2 10000 40000

and replace them with:
IsNPC true
IsSentinel true
IsImmortal true
HasDetectInvis true
Alignment 500
Level 20
MobTHAC0 0
AC -2
Gold 10000
Exp 40000

What, exactly, is your angle for serializing the whole file? I think it might be a waste of space to make it explicit that it's a boolean value. The person parsing the file would implement the data structure in the code the way they want to (either using bits, or boolean values).
04 Mar, 2008, quixadhal wrote in the 6th comment:
Votes: 0
My angle is to minimize future work as different flavours of data storage formats come and go.

The amount of extra space used is trivial. Last time I looked, you could buy a 500 Gig(!) hard drive for less than $100. Speed isn't a factor, unless you're running your mud off an old 5-1/4 inch floppy or something.

On the other hand, let's say one wants to move from text files to XML? It would be nice if you had single key/value pairs so the various XML utilities out there would just work with it, rather than having to further break apart things like "Attribs". Off the top of your head, what's the fourth number in that string assigned to? I don't know… maybe intelligence? Maybe wisdom? Which one IS wisdom? If we had key/value pairs, you'd just look at the file and see "Wisdom 13".

What I'm getting at is, right now the file parser has to be hand-coded with special cases depending on the key name. Stats1 uses different code than Stats2, they have to know how many numbers to expect. If everything in the file were key/value pairs, you could parse the entire thing with a while loop and a regex. Not that you want to, perhaps, but simple is usually good. It makes writing external utilities easier. It also makes the format self-documenting, which is also good.
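
The "while loop and a regex" parser could look something like the following C++ sketch. It assumes a format of one `Key value` pair per line, with the key being a single word and the value being the rest of the line; the function name `parse_key_values` and the decision to silently skip non-matching lines are assumptions for illustration only.

```cpp
#include <istream>
#include <map>
#include <regex>
#include <sstream>
#include <string>

// Read "Key value" lines into a map. Blank or malformed lines are skipped.
// The value is everything after the key, with surrounding whitespace trimmed.
std::map<std::string, std::string> parse_key_values(std::istream& in) {
    static const std::regex line_re(R"(^\s*(\w+)\s+(.*?)\s*$)");
    std::map<std::string, std::string> fields;
    std::string line;
    std::smatch m;
    while (std::getline(in, line)) {
        if (std::regex_match(line, m, line_re)) {
            fields[m[1].str()] = m[2].str();
        }
    }
    return fields;
}

// Convenience overload for text that is already in memory.
std::map<std::string, std::string> parse_key_values(const std::string& text) {
    std::istringstream in(text);
    return parse_key_values(in);
}
```

The loop never looks at the key name, so adding a new field to the file needs no parser change; only the code that consumes the map has to know what "Wisdom" means.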

As for booleans… you could just put the key in the file without a value (or, more properly, a NULL value), and assume it is false if not present. You could also use 1 or 0, t or f, or the words. You could leave them as they are too. I just think it makes more sense to keep separate values separate. For that matter, I would personally do away with bitvectors and just use booleans in the structures – even though you'd inflate the memory footprint a bit when using C (from 4 bytes to 32 per set of flags).
04 Mar, 2008, David Haley wrote in the 7th comment:
Votes: 0
Quixadhal is right. There is clear, established and most importantly, well-tested theory about good practice in database design, and grouping multiple entries like a series of character attributes where each has their own clear key into a single entry is not a good thing to do. It is confusing (like he said, which entry is which again?) and makes it more difficult to directly talk to the DB: instead you have to extract the right sub-field from the field, and when you update, you have to reconstruct the whole string.

quixadhal said:
For that matter, I would personally do away with bitvectors and just use booleans in the structures – even though you'd inflate the memory footprint a bit when using C (from 4 bytes to 32 per set of flags).

I wouldn't do that if you have a lot of bit vectors. It's one thing to make the storage format clear and extensible, but it's another thing to make your memory structures reflect that structure exactly. Increasing storage by a factor of 8 when it's so easy to avoid is not necessarily a great thing to do. It is so easy to write a class in C++ that handles the nitty-gritty details of bitvector math for you, while keeping storage efficient (i.e. one bit per flag number).
04 Mar, 2008, Davion wrote in the 8th comment:
Votes: 0
DavidHaley said:
Increasing storage by a factor of 8 when it's so easy to avoid is not necessarily a great thing to do. It is so easy to write a class in C++ that handles the nitty-gritty details of bitvector math for you, while keeping storage efficient (i.e. one bit per flag number).


Couldn't you just use std::bitset and avoid even having to write a class?
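
For reference, a minimal sketch of what that might look like with std::bitset. The flag names, the `ActFlag` enum and the `Mob` struct are made up for illustration and do not come from any particular codebase; only the `std::bitset` calls (`test`, `set`, `count`) are standard.

```cpp
#include <bitset>
#include <cstddef>

// Hypothetical flag numbers; ACT_MAX is just the count of flags.
enum ActFlag : std::size_t { ACT_NPC, ACT_SENTINEL, ACT_IMMORTAL, ACT_MAX };

struct Mob {
    std::bitset<ACT_MAX> act;  // one bit per flag, packed by the library

    bool has(ActFlag f) const { return act.test(f); }
    void set(ActFlag f, bool on = true) { act.set(f, on); }
};
```

Calling code then reads as `mob.has(ACT_SENTINEL)` instead of masking integers by hand, and adding a flag only means adding an enumerator before `ACT_MAX`.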
04 Mar, 2008, Noplex wrote in the 9th comment:
Votes: 0
Davion said:
DavidHaley said:
Increasing storage by a factor of 8 when it's so easy to avoid is not necessarily a great thing to do. It is so easy to write a class in C++ that handles the nitty-gritty details of bitvector math for you, while keeping storage efficient (i.e. one bit per flag number).


Couldn't you just use std::bitset and avoid even having to write a class?

I don't think David is talking about writing a class, he was merely saying that it can easily be done (e.g. the concept is an easy one to understand - it's elementary binary mathematics).
04 Mar, 2008, Guest wrote in the 10th comment:
Votes: 0
Maybe, but elementary binary mathematics are a foreign concept to a lot of people, some programmers included. :)
04 Mar, 2008, David Haley wrote in the 11th comment:
Votes: 0
The point though is that it is easy to do, or easy to find someone who can do it, and even easier if you know about the STL and use std::bitset like Davion suggested.

Incidentally I happen to have just such a class that I might release. The main reason I haven't is that it's probably useless to most people because they use C. I also haven't benchmarked it against std::bitset, so it might not be advantageous.
07 Mar, 2008, quixadhal wrote in the 12th comment:
Votes: 0
Yes, in C++, you have several transparent means to use 1 bit per boolean value without anything outside the class needing to know. You just have to move away from the if(object.flags & BV27) style of coding and use if(object.property). IMHO, that looks cleaner anyways.

In C, you *can* use bitfields, but they're pretty ugly and to be honest, I'd need to have a really BIG world before it became an issue to me. YMMV.
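
For comparison, a small sketch of the bit-field approach. The struct part is valid in both C and C++; the flag names and the `should_wander` helper are hypothetical. How the compiler packs the fields, and therefore `sizeof(MobFlags)`, is implementation-defined, so this is about the access style rather than a guaranteed layout.

```cpp
// Hypothetical flags, each declared as a 1-bit bit-field.
struct MobFlags {
    unsigned is_npc : 1;
    unsigned is_sentinel : 1;
    unsigned is_immortal : 1;
    unsigned has_detect_invis : 1;
};

// Property-style test: no masks or BV constants at the call site.
bool should_wander(const MobFlags& f) {
    return f.is_npc && !f.is_sentinel;
}
```

The trade-off is that bit-fields cannot be indexed or looped over by flag number, which is where a bitset-style container is more convenient.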

My old Diku currently takes up 9.5M of RAM. If I expanded the structures of all the mobs/objects/rooms so the flags went from 32 bits to 32 bytes, that's an inflation of… probably about 2.5M across the 9000 or so structures. Significant, to be sure, but my 5 year old server has 512M of RAM.

Obviously, the bigger your world, and the more flag-type structures, the more it matters. For me, I wouldn't even notice unless the process got up around 64M. Actually, linux is usually good enough about swapping that *I* probably wouldn't notice until it got up to 750M, but that's another issue… *grin*
07 Mar, 2008, David Haley wrote in the 13th comment:
Votes: 0
I agree with all of your points for self-hosting people, or people who have reasonable hosting plans with respect to memory usage. 2.5M can be a fair bit though when you have a quota of say 50M, and it is so easy to avoid (in C++ at least). Well, hey, even in C, it's not so bad if you encapsulate it correctly. The current EXT_BV implementation isn't so bad in this respect, IIRC. (It's been a while since I looked at it.)
07 Mar, 2008, Guest wrote in the 14th comment:
Votes: 0
I tend to object in general to anyone who advocates the "resources exist, therefore we should use them all" argument. Just because one has 500GB of drive space and 2GB of RAM doesn't mean it's a great idea to waste it all on stuff like this. Look only as far as Vista for an example of what that attitude has done to OS development over the years.
07 Mar, 2008, drrck wrote in the 15th comment:
Votes: 0
Samson said:
I tend to object in general to anyone who advocates the "resources exist, therefore we should use them all" argument. Just because one has 500GB of drive space and 2GB of RAM doesn't mean it's a great idea to waste it all on stuff like this. Look only as far as Vista for an example of what that attitude has done to OS development over the years.


That's a really good analogy (unfortunately for consumers).
08 Mar, 2008, David Haley wrote in the 16th comment:
Votes: 0
It's a trade-off. Resources exist, therefore we should not mind using them if it makes our development time go down. It's not that resources should be used just for the sake of using resources (and I certainly don't think that quixadhal was advocating that): it's that we shouldn't be as concerned as we might otherwise be (which is a very reasonable position).