23 Dec, 2006, Omega wrote in the 1st comment:
Votes: 0
Hey all,

I am currently trying to find ways of lowering memory and CPU use in my base. I have done a lot of work to keep it low, and I was wondering if there are any methods or tricks that can help lower the overall usage of the mud, whether it be ways to lower the executable file size or methods of handling loops that will keep their cost down.

I've gone about making it so that specific files only get access to the system header files they need, so as not to load headers that aren't used, and this helped cut my executable size down. However, my CPU and memory usage stay static and higher than I'd like to see.

So if anyone has ideas for keeping memory usage and CPU usage down, please let me know.

My mud is C++, if that is any help :)
23 Dec, 2006, Justice wrote in the 2nd comment:
Votes: 0
There's a lot you can do to optimize, but without any knowledge of your mud I couldn't say where to start. Have you done any profiling to see where the resources are being used?
23 Dec, 2006, Omega wrote in the 3rd comment:
Votes: 0
Yeah, I'm currently trying to optimize my structures so that they use less and less memory, but that is only part of the problem.

I've also been optimizing loops so that they only update necessary data.

But these things do not show a definite change in the mud's overall memory and CPU usage.
23 Dec, 2006, Guest wrote in the 4th comment:
Votes: 0
General advice which may help:

Use the most appropriate variable types for structure members. If you don't need "long int" then don't use it. Use "short int" if that will cover the value range that member should have.

Don't know how much this helps, but once you've changed the types around, group all of the char* types together in the structure definition. Do the same with int, short, bool, etc. Start from the top with the larger ones and work down until you get to the bools. Like so: http://www.mudbytes.net/index.php?a=past...

Whenever possible, use shared string management. Muds use up most of their memory with string data so anything you can do to handle reference counting or other similar methods will save you the most.

Don't place too much importance on lowering the binary file size. That doesn't seem to have much correlation to the amount of system memory the code ends up using. I went from a low point of just over 3mb in C to around 16mb now as C++ and my memory usage figures didn't change all that much throughout that time. Optimizing your includes will have the most impact on binary size. The less you can get away with needing, the better. And it also helps speed up compiling time.
23 Dec, 2006, Omega wrote in the 5th comment:
Votes: 0
So grouping data types doesn't do much by itself; just put them in descending order from largest to smallest.
23 Dec, 2006, Guest wrote in the 6th comment:
Votes: 0
Well like I said, I'm not sure if grouping the datatypes helps or not. I have no idea how to test for that. It does make the code look a bit neater though if you care about that :)
24 Dec, 2006, Omega wrote in the 7th comment:
Votes: 0
about grouping data-types:

Members of a structure get aligned to memory address boundaries equal to their size,
so if you have: int, char, int,

your structure is not 9 bytes, it's 12,
because the memory in bytes looks like iiiicpppiiii:
that's 4 bytes for the first integer, 1 byte for the char, 3 bytes of padding, and 4 bytes for your next integer.

But then, to reduce CPU usage,
you want your more commonly accessed members toward the front of the structure.
Packing can save you space in structures where you have lots of members or lots of instances…

Now… for padding… it is not necessarily true that there is always padding: int, short, short, int won't be padded.

Just remember (on a typical 32-bit system): long long = 8, double = 8, int = 4, pointer = 4, float = 4, short = 2, char = 1, bool = 1.

You can look at your structures and add up the member sizes from top to bottom; if you encounter a member whose size is not a divisor of the running total, there will be padding.

It might be minimal. I mean, say your Object structure ends up with 10 bytes of padding in total:

you'd need 10,000 objects to have 100KB of wasted space.

Sometimes in cases of varying size data structures

ABOUT VSZ:

VSZ includes things like DLLs and loaded code modules.
It's also a 'shared' deal, so you might have, say, 20MB of DLLs,
but they're also used by other apps.

VSZ is how much virtual memory the process has mapped; RSS is how much physical memory is actually being used.

In other words, VSZ is largely out of your control: use a shared library file, and your VSZ will rise based on its active use, according to documentation and conversations with a few coder friends.

About executable size: okay, I no longer want to shrink my executable, as I have recently found out why it's so high.

When you enable -g or -ggdb, you enable debugging symbols, which can make up anywhere between 50% and 90% of your executable size.

So to lower executable size, you remove -g and -ggdb from your makefile, and your exe will drop significantly in size (mine went from 36MB to 6MB). But by doing so, you lose the ability to debug your mud: core dumps become nearly useless once the debug symbols are gone.

Now, if your mud is stable, has been years without a crash, and no code has changed, then one might say go for it, remove the -g and save the space. Otherwise, don't bother; the ability to debug is too important to lose.

And back onto VSZ: since it's out of your control, and truly, if your mud uses outside libs (e.g. -lm), then your VSZ will of course be an odd sight.

This is what I have found out in my recent searches and conversations with coders, and from peering into documentation.


So to answer Samson: grouping of data in structures/classes does in fact help your memory and CPU usage. I personally went through my main structs/classes and reorganized them, and cut my CPU/memory usage by doing so. It's not nearly enough for what I wanted, but it's a start on the path :)
25 Dec, 2006, Garil wrote in the 8th comment:
Votes: 0
As you've mentioned, reducing structures by a few bytes does little to reduce overall size. Micro-optimizing a MUD is usually a futile effort, though fun if you're into that kind of thing. One of the best techniques for reducing memory is to only load areas into memory as they are needed, and to unload them when they're not. However, as usage on the MUD increases, so will memory usage. This also has the effect of reducing the number of mobs/etc that need to be updated every tick, thus reducing CPU usage.

To reduce CPU usage, you're going to want to tackle what consumes 99% of CPU on most muds: the tick update functions. If you were to run the mob/room/obj tick update functions half as often, you'd save about half the CPU time. The tradeoff is interactivity: mobs will move twice as slowly, etc. You can play around with the various timers to see how this affects your game.

Look for ways to shortcut large loops early, like those that go through the entire char list. Put cheaper checks first, like simple equality checks on variables, before checks that call other functions. Don't worry too much about smaller loops or individual functions unless they have some really bad algorithms (use a profiler to find these, gprof for instance).

Code optimization is really not the final answer, what it boils down to in the end is complexity of your environment. The less complex, the fewer resources you will use. You have to decide what you're willing to sacrifice in the game to save that extra bit of memory and CPU.
25 Dec, 2006, kiasyn wrote in the 9th comment:
Votes: 0
you can often sacrifice memory for cpu or cpu for memory.
25 Dec, 2006, Tyche wrote in the 10th comment:
Votes: 0
Darien said:
about grouping data-types:
your structure is not 9 bytes, its 12
because the memory in bytes looks like iiiicpppiiii
thats 4 bytes for the first integer, 1 byte for the char, 3 bytes for padding, and 4 bytes for your next integer


Right, but rather than mucking with the source, one can just compile with alignment set to 1. However, the default is 4 because on 32-bit x86 it's more efficient. This level of twiddling is a waste of one's time. If one is hitting host memory limits/quotas, it's better to implement an object caching system.

Edit: It looks like gcc will group like types.

$ cat struct.c 
#include <stdio.h>

struct {
short w1;
short w2;
int x;
char y1;
char y2;
int z;
} t;

int main() {
printf("struct t size is %d\n", (int)sizeof(t));
return 0;
}

$ gcc struct.c ;./a
struct t size is 16

$ gcc struct.c -fpack-struct ;./a
struct t size is 14


The above option is dangerous, as it changes the layout of structs you might be including to use other libraries.
26 Dec, 2006, Omega wrote in the 11th comment:
Votes: 0
Structure optimization may not save much, only a few bytes, but when you're dealing with, say, the room structure and you can save 5 bytes per room, it adds up when your mud has 20,000+ rooms in it.

Any/all memory savings are good.

Now, as for optimizing the updating loops: I've found that making different lists to handle different updates tends to save the mud ever so slightly.

Case in point: water-floating objects eventually sink in my mud. So instead of parsing the object_list to find which objects are in fact floating, I made it so that obj_to_room checks whether the room is a water room, then checks whether the object can float or will just sink, and adds it appropriately to the obj_float_list. That, in turn, dramatically cut down my CPU usage. Parsing smaller lists, or lists that can often be empty, makes life easier on the CPU.

So yeah, optimizing your loops is often a really good method of dealing with CPU usage on the whole. For myself, most of my updating happens on events, each character-specific (in the case of characters). This both raises and lowers the CPU usage, depending on when everyone's events happen. And of course, different effects space out how long it takes for certain events to fire, so when they are all crammed together you experience a little more of an issue, but when they are spaced out, the CPU drops.
26 Dec, 2006, Justice wrote in the 12th comment:
Votes: 0
I agree, while I haven't done it with my current mud project… using small caches is an easy way to speed up many loops. Generally speaking, this improvement is directly related to the proportion of updated objects vs non-updated ones. The greater the difference, the greater the improvement.

As for the choppiness of your event handler… for less time-critical events, try to use a limited time-slice. You can decrease the time effect by speeding up the game pulse. Using 1/2 or 1/3rd for the event handler and an offset for the rest is an easy way to avoid affecting the timing of other systems while doing this, but it will make the game more susceptible to over-running the time-slice.

Another option is to use a timer and do a true time-slice, but this increases CPU usage although it distributes it more evenly.
27 Dec, 2006, Omega wrote in the 13th comment:
Votes: 0
Actually, the event system is quite well done; the mud doesn't even feel the events anymore, though it used to.

It puts each event into 'buckets' and, based on the pulse, it parses the appropriate bucket, then parses through the events in that bucket for ones at or below the current pulse count.

The mud houses a global pulse counter, and every time the game loop cycles, it increases. When you queue a new event, it queues it as current pulse + (some number value you want).

It works quite nicely :) and is fast :) My old system just put things into a giant list and parsed the entire list checking pulses.
27 Dec, 2006, Justice wrote in the 14th comment:
Votes: 0
Sounds like a simple hashtable to me. What I use is pretty similar, based on what Trax used for events in his mud back in, hrmm, '02? Basically, each bucket is the head of a list. You calculate which bucket to use with (pulse % hash). New events are added to the list using an insertion sort. After determining which list to use, you simply pull off the top until you find one with a pulse greater than the current time.
27 Dec, 2006, Omega wrote in the 15th comment:
Votes: 0
what you just described sounds like dg_events :)
27 Dec, 2006, Guest wrote in the 16th comment:
Votes: 0
How significant is the difference between a hashed event table and a straight list where the events are just checked against the current time and activated until the next one in line is later than "now"?

And also, has anyone ever done one that uses the C++ STL for queues? Not that I know much about those, but would they be any better or worse for this kind of thing?
27 Dec, 2006, kiasyn wrote in the 17th comment:
Votes: 0
Depends on the size of the table/list.
27 Dec, 2006, Justice wrote in the 18th comment:
Votes: 0
Because what I'm using is an ordered list, the hash has no effect on executing events. However, it does reduce insertion time, on average to between half and a tenth, but primarily it stabilizes the insertion time by reducing the number of values that must be examined.

Compared to an unordered list, the effect is quite dramatic, easily improving performance by a factor of 10 or greater.

I'm using std::list for mine. Although I could get a slight increase in performance from a custom linked list, this was easier.
27 Dec, 2006, Justice wrote in the 19th comment:
Votes: 0
On a side note, the size of the hash can have a major effect on the distribution of values. For example, with a hash of 10, when most values are divisible by 10, those values all land in the same bucket, negating any performance benefit.
27 Dec, 2006, Justice wrote in the 20th comment:
Votes: 0
So I got bored… and wrote a simplified version of what I use to upload here. Mine included a lot of handler code that's specific to my codebase; this is generic C++ without any outside references. It could probably use a bit of improvement, but eh, it works. I uploaded it as "pulse timer" since it's not event code: it's a delayed timer that uses an internal value to keep track of how many pulses it has encountered.