06 Aug, 2009, Davion wrote in the 21st comment:
Votes: 0
KaVir said:
Kintar said:
Of course, I actually wrote a MySQL-based persistence mechanism for my codebase, then yanked the entire thing out and wrote a file-based mechanism. It's just simpler for the average mud admin to deal with text files than a relational database, and the implementation challenges are a pretty even trade off IMO.

I went for flat files as well. I was considering using MySQL originally, but it would have increased my hosting costs, and seemed more complex than flat files, so I never really got around to it. However I've been thinking about it again since integrating the mud with the website, particularly as I fancy the idea of having a browser game that shares the same data as the mud.


No MySQL doesn't absolutely mean no web integration! Instead of changing the entire way your mud saves and loads data, why don't you just give your website the ability to load from flatfiles? You don't even have to stick to PHP (unless your website is already in PHP). The only reasons to actually use a DB for web integration are simplicity and relational mapping. If the task requires you to either rewrite your game or give the features to the website, simplicity is already out the window :).
06 Aug, 2009, KaVir wrote in the 22nd comment:
Votes: 0
Davion said:
No MySQL doesn't absolutely mean no web integration!

Well I realise that (I already have web integration), I was just thinking it might be an easier way to share data between the mud and a browser game, particularly if the latter needs to cache a lot of data.

I fancy creating a vaguely Travian-style browser game, but have it share the same world as the mud players, so that the two can interact (at least indirectly). I reckon it'd be a pretty good way to draw in new mudders, too, as browser games are fairly popular. But such a game needs to keep a lot of data in memory, and I suspect I'd need to do a fair bit of caching.
06 Aug, 2009, Davion wrote in the 23rd comment:
Votes: 0
You should consider embedding an http server into your game. There are some out there (like EasyHTTPD) that will make maintaining data integrity easy.
06 Aug, 2009, KaVir wrote in the 24th comment:
Votes: 0
Davion said:
You should consider embedding an http server into your game.

I've already got one - anything going to http://www.godwars2.org/mwi is passed to the mud (using FastCGI), which generates the webpage on the fly.

I'm just not sure I fancy writing my own mechanism for caching large amounts of data, that's all. I guess I can worry about that later on, if it becomes a problem.
06 Aug, 2009, aidil wrote in the 25th comment:
Votes: 0
KaVir said:
Davion said:
You should consider embedding an http server into your game.

I've already got one - anything going to http://www.godwars2.org/mwi is passed to the mud (using FastCGI), which generates the webpage on the fly.

I'm just not sure I fancy writing my own mechanism for caching large amounts of data, that's all. I guess I can worry about that later on, if it becomes a problem.


I integrated a webserver into my game, but use an Apache server in reverse proxy mode to do the caching. All my game has to do is provide the proper headers to tell the proxy not to cache specific content.
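
A minimal sketch of the header side of that arrangement, assuming the embedded server assembles its responses by hand. `CacheHeaders` and its methods are hypothetical names, not part of any codebase mentioned in this thread; `Cache-Control` is the standard HTTP header that shared caches such as a reverse proxy consult.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: chooses caching headers for a page generated by the game.
final class CacheHeaders {
    private CacheHeaders() {}

    // Live data (who list, combat state): ask caches not to store the response.
    static Map<String, String> uncached() {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Cache-Control", "no-store");
        return headers;
    }

    // Slow-changing data (help files, maps): let the proxy reuse it for maxAgeSeconds.
    static Map<String, String> cacheable(int maxAgeSeconds) {
        if (maxAgeSeconds < 0) {
            throw new IllegalArgumentException("maxAgeSeconds must be >= 0");
        }
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Cache-Control", "public, max-age=" + maxAgeSeconds);
        return headers;
    }

    // Renders the status line and headers of an HTTP/1.1 response, ending with the blank line.
    static String render(int status, String reason, Map<String, String> headers) {
        StringBuilder sb = new StringBuilder("HTTP/1.1 " + status + " " + reason + "\r\n");
        headers.forEach((k, v) -> sb.append(k).append(": ").append(v).append("\r\n"));
        return sb.append("\r\n").toString();
    }
}
```

The game would then send `render(200, "OK", CacheHeaders.uncached())` ahead of any page that must always be fresh, and leave everything else to the proxy.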
06 Aug, 2009, Kintar wrote in the 26th comment:
Votes: 0
flumpy said:
I use XStream which serializes java objects to xml format. Or JSON format. Or any other kind of format you can write a wossname for. The xml it produces can be re-serialized and is pretty readable and concise. You can alias certain things so they don't look so bloaty.


I've used XStream in projects in the past, and it does work pretty well. This just comes back to my distaste for using XML to persist objects in a MUD. I really think a simpler layout is in order. Compare:

<?xml version="1.0"?>
<entity id="1">
  <component name="MudObject">
    <name>Knife</name>
    <aliases>
      <alias>blade</alias>
      <alias>dagger</alias>
    </aliases>
    <location><entity refid="2"/></location>
  </component>
  <component name="Weapon">
    <damage>1d10</damage>
    <durabilityMax>50</durabilityMax>
    <durabilityCurrent>50</durabilityCurrent>
  </component>
</entity>
<entity id="2">
  <component name="Location">
    <!-- blah, blah, blah -->
  </component>
</entity>


With:

entity 1:
  MudObject:
    name: Knife
    aliases: blade, dagger
    location: #2
  Weapon:
    damage: 1d10
    durabilityMax: 50
    durabilityCurrent: 50

entity 2:
  Location:
    # blah, blah, blah


XML is perfectly serviceable, but in a case like this, all the tag open/closes just clutter the data unnecessarily.

flumpy said:
Kintar said:
When it comes to plaintext storage formats, avoid the jihads. :wink: Just use whatever format you find that is clean and concise for your usage, and has a well-supported library for your programming language of choice.


I agree entirely but the following seems to contradict that:

Kintar said:
I've actually come to the point lately where I just define the bare minimum of what will be needed for my code, then use YACC or ANTLR to create a parser for it. It's a little more work, but it's typically a LOT cleaner than using YAML/JSON/XML/whathaveyou for files that will never need to be consumed by anything other than my applications. (Remember: XML, YAML and their ilk were created for data interchange. If you don't need other applications to read your data, you're just taking on extra cruft for no reason.)


Well… to be honest.. when you persist data you are "interchanging" it with the hard drive and back off it again into your application. So I don't agree with you entirely on that statement but I get what you are trying to get at.


Looks like part of our discussion comes from the fact that we're using different definitions for the same terms. To me, "serialization" means moving an object from memory to a persistent storage medium. Interchange, on the other hand, denotes a transfer from an in-memory representation on one system or application to an in-memory representation on another system or application. This means it has to be serialized from memory in one location, transferred to another location, then deserialized, possibly into a new data structure.

That last bit is important. If you're just going MUD->disk->MUD, then you're not moving from one memory structure to a potentially different one, you're just offloading from RAM, then loading it back. XML, YAML and the like are EXCELLENT when you need a flexible, structured form that can represent any type of data, and thus can be translated from one memory structure to another, but if you're just offloading the data with the intent of bringing it back to the same program at a later time, why bother with all the extra information that defines the structure of the data? You already know what that structure is!

flumpy said:
The trouble with proprietary formats is they are, well, proprietary, and once written even if you wanted to interchange them with other systems you then couldn't unless you wrote something extra in that system. XML is pretty human readable, JSON is a bit strange but still readable, and even a database can be human-read (once it's been processed by an intermediary program of some sort of course). Each has semi-standard access mechanisms that (almost) everyone can use. Would your proprietary format be the same?


You're exactly correct about the difference between a proprietary format and XML, and that's the precise reason I'm recommending using a proprietary format for your MUD. You don't have to worry about all the overhead of the standard formatting of XML, etc. If you DO end up needing to transfer the data to another application, writing a transform template from your proprietary structure to XML/YAML/JSON is very simple, and might (I would say probably) carry less of an investment in programming, code bloat and complexity than simply using the open format for everything.

flumpy said:
Then again if you are using something like ANTLR you may end up reinventing the wheel. Why not write your stuff out in "builder" format (e.g. a groovy builder?) or in a format your scripting language can easily and readily parse? Well I expect that actually might be too complex, as I have often pondered over this myself.


It's actually extremely simple. The examples I've given are basically the very thing you're promoting here. ANTLR is nothing except a tool that helps speed the development of a piece of code that can read a flat file; in other words, a script that reads a storage format. And the ANTLR runtime library plus the code it generates is going to be 1/1000th the size and many times the speed of the Groovy runtime library and its associated builder. And since reading a proprietary data format from a script or Groovy requires an investment in custom code anyway, why not use the leaner and no less simple solution?

flumpy said:
One thing I would like to say about this thing would be to keep "encapsulation" in mind. If you abstract a layer over the top of this kind of persistence it will mean that you should be able to swap your file system for a database without even blinking an eye, and that is what you should be aiming for.


Very true. The discussion to this point has been entirely about the implementation of that data access layer, though. For the following code…
DataAccessObject dao = DAOFactory.getCurrentDAOImplementation();
dao.persist(entities);


…we're talking about what happens behind the scenes in MyDataAccessImplementation.persist(Set<Entity> es), and why one format would be chosen over another. My advice is still: Roll your own flat-file persistence. It's quick, it's typically easier to read and edit than XML, and if you end up needing the ability to exchange data with another system, writing a translator from your proprietary format to XML isn't going to take long, and will keep the extra memory usage and performance hit of highly structured formats like XML out of your lag-sensitive codebase.
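
A rough sketch of what such an implementation could look like, assuming a flat layout similar to the `entity 1:` example earlier in this post (the two-space indentation is an assumption). `Entity`, `DataAccessObject` and `FlatFileDao` are hypothetical stand-ins for illustration, not the actual classes from my codebase.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical entity: an id plus named components, each a bag of key/value fields.
final class Entity {
    final int id;
    final Map<String, Map<String, String>> components = new LinkedHashMap<>();

    Entity(int id) { this.id = id; }

    Entity with(String component, String key, String value) {
        components.computeIfAbsent(component, c -> new LinkedHashMap<>()).put(key, value);
        return this;
    }
}

// The abstraction the rest of the MUD sees; the storage format stays hidden behind it.
interface DataAccessObject {
    void persist(Set<Entity> entities);
}

// One possible implementation: writes the flat format to any Writer (a file, a socket, a string).
final class FlatFileDao implements DataAccessObject {
    private final Writer out;

    FlatFileDao(Writer out) { this.out = out; }

    @Override
    public void persist(Set<Entity> entities) {
        try {
            for (Entity e : entities) {
                out.write("entity " + e.id + ":\n");
                for (Map.Entry<String, Map<String, String>> c : e.components.entrySet()) {
                    out.write("  " + c.getKey() + ":\n");
                    for (Map.Entry<String, String> f : c.getValue().entrySet()) {
                        out.write("    " + f.getKey() + ": " + f.getValue() + "\n");
                    }
                }
                out.write("\n"); // blank line between entities
            }
            out.flush();
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }
}
```

Swapping in a database later would mean writing another `DataAccessObject` implementation; callers of `persist` would not change.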
06 Aug, 2009, David Haley wrote in the 27th comment:
Votes: 0
Writing a parser for a file is really only a relatively small part of the battle. The complicated parts come from things like versioning, dealing with missing fields (use a default? error message?), unexpected fields, and so forth. These are policy decisions. Writing a parser is dead simple even without tools like bison/ANTLR/etc. once you know what you're doing.
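
To make the policy side concrete, here is a small sketch that assumes the parser has already turned one component into key/value pairs. `WeaponLoader`, the field names and the default values are all hypothetical; the point is only that "use a default" and "ignore but report" are decisions that live outside the parser.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical loader for one "Weapon" component, already parsed into key/value pairs.
final class WeaponLoader {
    // Fields the current code understands, with the default used when one is missing.
    private static final Map<String, String> DEFAULTS = new LinkedHashMap<>();
    static {
        DEFAULTS.put("damage", "1d4");
        DEFAULTS.put("durabilityMax", "50");
        DEFAULTS.put("durabilityCurrent", "50");
    }

    final Map<String, String> fields = new LinkedHashMap<>();
    final List<String> warnings = new ArrayList<>();

    WeaponLoader(Map<String, String> parsed) {
        // Policy 1: a missing field falls back to its default and is reported.
        for (Map.Entry<String, String> d : DEFAULTS.entrySet()) {
            String value = parsed.get(d.getKey());
            if (value == null) {
                value = d.getValue();
                warnings.add("missing field '" + d.getKey() + "', using default " + value);
            }
            fields.put(d.getKey(), value);
        }
        // Policy 2: an unexpected field is ignored, but never silently.
        for (String key : parsed.keySet()) {
            if (!DEFAULTS.containsKey(key)) {
                warnings.add("unexpected field '" + key + "' ignored");
            }
        }
    }
}
```

A misspelled key in a hand-edited file would then show up twice in `warnings`: once as an unexpected field and once as the missing field it was meant to be.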
06 Aug, 2009, Kintar wrote in the 28th comment:
Votes: 0
David Haley said:
Writing a parser for a file is really only a relatively small part of the battle. The complicated parts come from things like versioning, dealing with missing fields (use a default? error message?), unexpected fields, and so forth. These are policy decisions. Writing a parser is dead simple even without tools like bison/ANTLR/etc. once you know what you're doing.


Agreed, and those are considerations no matter if you're sticking the data in a rdbms, XML file, flat file, or having your friend Joe (you know Joe, with the photographic memory and way too much spare time?) remember your objects. ;)
06 Aug, 2009, Tyche wrote in the 29th comment:
Votes: 0
Your second example looks almost like YAML, Kintar, which is one reason why some would prefer it to XML, regardless of its interchange capability.
06 Aug, 2009, Erok wrote in the 30th comment:
Votes: 0
David Haley said:
Writing a parser for a file is really only a relatively small part of the battle. The complicated parts come from things like versioning, dealing with missing fields (use a default? error message?), unexpected fields, and so forth. These are policy decisions. Writing a parser is dead simple even without tools like bison/ANTLR/etc. once you know what you're doing.

David expands a bit on what I meant earlier by validation.

This is what drove me towards XML because I could use a third-party validating parser like Apache Xerces that ensures the data is sane in terms of version, structure, value, etc. before mapping it into an object. An XML-aware editor makes hand-edits of the files a breeze as well (e.g., colorizing tags vs. data, validating on the fly, etc).

I think I've seen some schema projects for other formats, like YAML, but figured an Apache project was more likely to have the long-term support I would want.
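
For anyone who hasn't seen schema validation in practice, a minimal sketch follows. It uses the JDK's standard `javax.xml.validation` API rather than calling Xerces directly, and the `<weapon>` schema is a made-up example, not my actual format. `WeaponValidator` is a hypothetical name.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

final class WeaponValidator {
    // Hypothetical schema: a <weapon> with a required version attribute and two typed fields.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='weapon'><xs:complexType>"
      + "<xs:sequence>"
      + "<xs:element name='damage' type='xs:string'/>"
      + "<xs:element name='durabilityMax' type='xs:positiveInteger'/>"
      + "</xs:sequence>"
      + "<xs:attribute name='version' type='xs:int' use='required'/>"
      + "</xs:complexType></xs:element>"
      + "</xs:schema>";

    // Returns true only if the document matches the schema; nothing is mapped into objects here.
    static boolean isValid(String xml) {
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
            // With no custom error handler set, validate() throws on the first schema violation.
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException | IOException e) {
            return false;
        }
    }
}
```

A file with a negative `durabilityMax` or a missing `version` attribute is rejected before any game object is built from it, which is the property I was after.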
06 Aug, 2009, elanthis wrote in the 31st comment:
Votes: 0
Unless you actually need to index and search your MUD data in a relational fashion, don't use any SQL database for storing data. SQL is a relational query language, not a generic store-anything magic box.

Understanding your data, how you will use it, and the best way to store it is a fundamental aspect of program design. If you don't have a clear idea of what you need, don't write any code for a new MUD server. Either you just haven't thought about your design enough to be ready to write code and you need to spend more time "at the drawing board" before cracking at the code, or you haven't the necessary experience and knowledge to be able to come up with a proper design and you need to spend more time working on and studying existing well-designed projects before starting a new one from scratch.
07 Aug, 2009, flumpy wrote in the 32nd comment:
Votes: 0
Kintar said:
David Haley said:
Writing a parser for a file is really only a relatively small part of the battle. The complicated parts come from things like versioning, dealing with missing fields (use a default? error message?), unexpected fields, and so forth. These are policy decisions. Writing a parser is dead simple even without tools like bison/ANTLR/etc. once you know what you're doing.


Agreed, and those are considerations no matter if you're sticking the data in a rdbms, XML file, flat file, or having your friend Joe (you know Joe, with the photographic memory and way too much spare time?) remember your objects. ;)


Well not to be picky but the interface on your friend Joe isn't all that friendly, and he can't write very fast :P

Seriously though, David's correct when he says it's a small part of the battle to just write a parser. I would rather have someone else do the hard graft for me, like the nice people at codehaus, so I can get on and think of cool things to do in my mudbase.

I disagree somewhat with the verbosity argument because tbh it really doesn't matter how verbose your format is. No one is going to complain about it if it's readable and you can use some nice IDE tools or standardized editors to help manipulate that data. YAML, SQL, XML, SQL, Kintar++ blah blah blah use whatever you feel comfortable with fits what you are trying to do.

One thing I would hold up on is trying to use Apache Xerces to do anything with XML. It is overly complex and probably overkill, and you can do what you want with a third of the problems with something like XStream, GORM or even plain old Java serialization without trying to get your melon around their API.
07 Aug, 2009, Kintar wrote in the 33rd comment:
Votes: 0
flumpy said:
Seriously though, David's correct when he says it's a small part of the battle to just write a parser. I would rather have someone else do the hard graft for me, like the nice people at codehaus, so I can get on and think of cool things to do in my mudbase.


I think you're missing the point of his statement and, honestly, the reason this thread has generated so much traffic. When persisting data for a MUD, it's important that it be easy for anyone to edit. Corrupted data files are one of the biggest problems a non-coding administrator is likely to face, and having them in a format that has the maximum information-to-formatting ratio is hugely important. When comparing that requirement to the simplicity of writing a parser, the decision of whether to roll your own persistence or use a pre-made API that's easier to code for but produces output that's harder to edit is almost always going to fall on the side of roll-your-own.

flumpy said:
I disagree somewhat with the verbosity argument because tbh it really doesn't matter how verbose your format is. No one is going to complain about it if it's readable and you can use some nice IDE tools or standardized editors to help manipulate that data.


Nobody who has good control over their hosting environment, that is. How many MUDs out there are running on a server that gives the bare minimum support? What about the administrators who have to SSH in to their host and use old-school VI (not VIM with spiffy syntax highlighting) to try and recover their data? Remember that the persistence portion of a MUD should be accessible to anyone trying to host the MUD, not just programmers and people with experience and an array of tools at their disposal.

flumpy said:
YAML, SQL, XML, SQL, Kintar++ blah blah blah use whatever you feel comfortable with fits what you are trying to do. (Emphasis added by quoter)


The bolded portion of that quote is the point I'm trying to make, and I apologize if that point has taken this thread a little off-track from its original question. Unless your codebase is for your use only, and is never going to be in the hands of people who don't know the tools and APIs you're using to build it, then your storage format is going to be one of the largest parts of how your code is viewed in terms of usability. If, on the other hand, you're never releasing your source and are the only person who's ever going to have to read/edit/recover your files, then by all means use whatever API you're most comfortable programming to.
07 Aug, 2009, flumpy wrote in the 34th comment:
Votes: 0
Kintar said:
The bolded portion of that quote is the point I'm trying to make, and I apologize if that point has taken this thread a little off-track from its original question. Unless your codebase is for your use only, and is never going to be in the hands of people who don't know the tools and APIs you're using to build it, then your storage format is going to be one of the largest parts of how your code is viewed in terms of usability.


…and generally that is whichever format would be most used "out in the community" and it's YAML, XML or the like. We had this discussion a couple of months ago, and there it is argued quite well that the format of the data or the way code is written in your code base is almost entirely irrelevant when compared to how usable the tools are to manipulate that data. There are more tools for those formats out there than there will ever be for your proprietary code, and on many more platforms than you can probably write tools for. More people can grok YAML, or whatever, and that's my point.

Kintar said:
If, on the other hand, you're never releasing your source and are the only person who's ever going to have to read/edit/recover your files, then by all means use whatever API you're most comfortable programming to.


Disagree. I think you are flipping things around in your thinking. Proprietary formats are ok if you are the only person who will have to deal with them, edit them and have to maintain them*.

* note this comes from a history of not only being a developer but being in an application support position for quite a while now. If I have a system passed to me that is written with some xml config files and uses sql and a database (even if it's HSQL) I know it is gonna take me a third of the time to grok what's going on with it than if that system has a proprietary flat file system I have to learn. Believe me, it's no fun trying to work out why you changed a semicolon here or there and now nothing works… :(
07 Aug, 2009, flumpy wrote in the 35th comment:
Votes: 0
07 Aug, 2009, Kintar wrote in the 36th comment:
Votes: 0
flumpy said:
…and generally that is whichever format would be most used "out in the community" and it's YAML, XML or the like. We had this discussion a couple of months ago, and there it is argued quite well that the format of the data or the way code is written in your code base is almost entirely irrelevant when compared to how usable the tools are to manipulate that data.


I think you're conflating points again. I just finished reading the thread you linked, and it's about the tools shipping with the MUD to make changes to the MUD, not about file-format specific editors. Yes, you definitely need tools to allow your users to modify their game world. This does not mean that you hand them an XML file and a schema, and say, "Just go grab an XML editor".

I'm actually going to bow out of this discussion, because I think we're turning it more into "Kintar and flumpy disagree on terminology and approach a discussion from opposite standpoints" than a talk about storing world data. :)
07 Aug, 2009, David Haley wrote in the 37th comment:
Votes: 0
It's not just tools. It's what you do with the data you've read in. Imagine you have some file format, and later you add in new fields, rename some, maybe even delete some. You still need to be able to handle this gracefully as you load the data – the format it's in is completely irrelevant. YAML/XML/BlaBlaBlaML won't help you here at all – they provide a format that can change, but they don't help at all in dealing with the policy decisions of data that isn't what you expect. Validation can help to some extent, but note that how you validate a file can change depending not only on its version, but also on the version you have now looking back at some previous version!

Tools to read XML are fine and dandy but most of the time this is stuff managed completely by the MUD server. Elanthis wasn't talking about tools to read and write files, at least not from what I remember and from what I gather from a quick re-read. Builders should basically never be reading your persistence files directly: they should have tools on top of it that completely abstract away the persistence format. The user shouldn't care if you use YAML, XML, a home-grown format, SQL, or what-have-you to store files.

EDIT: this was in reply to flumpy, Kintar hadn't posted when I started writing this.
07 Aug, 2009, flumpy wrote in the 38th comment:
Votes: 0
Funnily enough I think we all actually basically agree, when all's said and done :D
07 Aug, 2009, David Haley wrote in the 39th comment:
Votes: 0
The reason I insist on the policy question a little bit is because some people (not necessarily those here) put too much emphasis on the format itself, and not enough on what one actually does with it. The format needs to be extensible etc., but an extensible format is not enough to solve all the problems. Some people consider XML/YAML/EtcML to be the final solution to storage problems, but really they're not, when you are in circumstances such as this one.
07 Aug, 2009, Erok wrote in the 40th comment:
Votes: 0
David Haley said:
Some people consider XML/YAML/EtcML to be the final solution to storage problems, but really they're not, when you are in circumstances such as this one.

Agreed.

Something I was trying to emphasize earlier as well is that a solution that supports schema-based validation makes things easier on you by ensuring the data is well-formed (i.e., complies with the version of the format it says it's in) *before* extracting the data. Normally the parser will support an upgrade path from an older format to a newer format, by converting/defaulting/ignoring fields as it goes (i.e., the policies I think David refers to).
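
As a sketch of what such an upgrade path might look like once a record has been read into key/value pairs: each step handles exactly one version bump, so an old file walks forward until it reaches the current format. `Upgrader`, the version numbers and the field history (a rename, an added field, a dropped field) are invented for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical migrations for a record stored as key/value pairs plus a "version" field.
final class Upgrader {
    static final int CURRENT_VERSION = 3;

    // Applies one migration step at a time until the record reaches CURRENT_VERSION.
    static Map<String, String> upgrade(Map<String, String> stored) {
        Map<String, String> rec = new LinkedHashMap<>(stored);
        // Assumption: records written before versioning existed count as version 1.
        int version = Integer.parseInt(rec.getOrDefault("version", "1"));
        if (version > CURRENT_VERSION) {
            throw new IllegalArgumentException("record is newer than this server: v" + version);
        }
        while (version < CURRENT_VERSION) {
            switch (version) {
                case 1: { // v1 -> v2: "durability" was renamed to "durabilityMax" (convert)
                    String old = rec.remove("durability");
                    if (old != null) {
                        rec.put("durabilityMax", old);
                    }
                    break;
                }
                case 2: { // v2 -> v3: "durabilityCurrent" added (default), "weight" dropped (ignore)
                    rec.putIfAbsent("durabilityCurrent", rec.getOrDefault("durabilityMax", "50"));
                    rec.remove("weight");
                    break;
                }
                default:
                    throw new IllegalStateException("no migration from v" + version);
            }
            version++;
            rec.put("version", Integer.toString(version));
        }
        return rec;
    }
}
```

Keeping each step separate means that adding a version 4 later only requires one new `case`, and files of every older version still load.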