04 Sep, 2009, David Haley wrote in the 21st comment:

Votes: 0

Remcon said:

Isn't that more or less the same as freeing it and creating a new pointer later?

Functionally, yes; that is, in terms of what it does without taking into account efficiency. But what the system actually does can make a difference w.r.t. fragmentation of the memory pool, having to traverse it, etc.

But you wouldn't be wasting memory as long as you're properly allocating and freeing things.

Silenus said:

Thanks David. I guess for this sort of thing it sounds like I dont have much of a choice. I did find some materials of how to measure changes in the performance of a given system (probably old hat to you) and different techniques for doing so(some paper on ACM had a list of references). I guess I will start with something simple and then try something more ambitious maybe to pair with one of the lpmud servers.

I think part of the problem is that you are trying to do expert-level, highly complex systems work and optimization, and are feeling frustrated that you don't understand it. You shouldn't get discouraged that, as you are just getting started, you don't have such expertise yet; just take things slowly and work toward it.

04 Sep, 2009, Remcon wrote in the 22nd comment:

Votes: 0

ah ok thanks for taking the time to answer David :).

05 Sep, 2009, elanthis wrote in the 23rd comment:

Votes: 0

Sounds like you're interested in a compacting, moving collector, which is a separate topic from a generational collector. Garbage collection is a pretty interesting field. :)

The idea behind a generational collector is that most newly allocated objects in real programs are often very short lived and so it is more efficient and productive to focus on collecting new objects rather than old objects.

The problem with allocating and freeing objects frequently is that it fragments the heap, which a compacting/moving collector deals with. Say you allocate four objects: the first is 20 bytes, the second is 12 bytes, the third is 20 bytes, and the fourth is 8 bytes. You have used 60 bytes of memory. You now free the second and fourth objects, which frees up 20 bytes of memory. You now try to allocate another 20-byte object. However, the memory allocator has to request more memory from the OS, because that 12 byte hole that the second object left is not big enough to store the 20-byte object. After a lot of allocations and deallocations you can quickly end up with a ton of small objects in the heap that large objects can't be allocated in, potentially hundreds of megabytes worth of memory. A compacting collector, as the name implies, shuffles the allocated objects to put them all towards the "front" of the heap, reducing fragmentation, saving memory, and actually making the allocator faster since it has less work to do to find a suitable chunk of contiguous memory for new objects.

The actual optimization of a garbage collector is pretty hardcore stuff. It's not something you should worry about if you don't have a good math background and a firm understanding of low-level hardware architecture, including how the CPU works, the purpose and performance characters of the CPU cache, the memory bus, and even how RAM works. And that's the easy stuff. :)

As a taste of the kinds of topics you should study if you're really interested in optimizing memory management (which I encourage, the field needs more good minds!) checkout this paper by Ulrich Drepper on how modern hardware works with memory and its effects on applications: http://people.redhat.com/drepper/cpumemo...

05 Sep, 2009, David Haley wrote in the 24th comment:

Votes: 0

Another interesting hardware thing to be cognizant of is pointer alignment; some systems enforce it, and others, while not enforcing it, incur severe performance penalties for leaving alignment. So this is another thing that garbage collectors need to track, and it can make the fragmentation problem that much more thorny.

05 Sep, 2009, Silenus wrote in the 25th comment:

Votes: 0

Thanks Elanthis. I have just started studying this stuff so I have been trying to come up with relevant projects that hone in and test my understanding of what I do know. My knowledge of low level hardware still is limited to textbook stuff mainly covered in the first volume of Patterson and Hennessy (Computer Organization havent gotten to Computer Architecture yet or the sister book on parallel computer architecture) and some books on OS implementation. I will read over this linked document it seems pretty interesting.

I am not sure if a moving compacting collector is the same as a copy collector. My understanding is that modern gc systems use complex hybrids of various different algorithms doing a combination of different things. I find systems level stuff quite fascinating though I doubt I will do anything of this stuff ever professionally I think I could get quite a bit of enjoyment out of coding and understanding systems level code.

06 Sep, 2009, elanthis wrote in the 26th comment:

Votes: 0

Yeah, a lot of GCes are pretty complicated beasts. I'm not sure there is a difference between copying and compacting, but I could be wrong. I'm by no means an expert in garbage collectors. All of mine have been relatively simple beasts

06 Sep, 2009, David Haley wrote in the 27th comment:

Votes: 0

It's probably worth mentioning that relatively few people have worked on the truly remarkable/complex garbage collectors. It is an extremely complicated field, and expectations should be set realistically when it comes to implementing much less designing a garbage collector.

06 Sep, 2009, elanthis wrote in the 28th comment:

Votes: 0

For my own edification, I looked up the specifics of the meaning of "copying" in garbage collectors, and Silenus is right. A copying collector is one that uses memory copies as a form of tracking reachable objects, where as a moving/compacting collector may do the memory moves separate from the mark phase. Basically, all copying collectors are compacting collectors, but not all compacting collectors are copying collectors.

A copying collector would incur a performance hit for every GC run due to needing to always copy objects during mark phase, even if the heap is not fragmented. A better design would avoid copying objects needlessly, which is what most modern moving/compacting collectors attempt to do. The advantage of a copying collector is that allocation is _very_ fast (so long as there is enough free heap space) because there is no free block list, but simply a contiguous chunk of unused memory.

Given that you can use different methods for different generations of objects in a generational GC, it is quite possible and potential advantageous to use a copying approach for the youngest generation of objects, making the usual pattern of frequent-allocation of small objects take advantage of the fast allocation speed of a copying collector. The copy portion of the copying collector would essentially copy the objects into the portion of the heap reserved for older generations of objects.

One particular trick (unrelated to copying/moving) to keep in mind for generational collectors is that some objects deserve to start life as an older generation. The idea behind generational collectors is that most languages create a lot of temporary objects. However, for most applications there are a number of types that are usually long-lived. In a game, for example, the objects representing game actors usually last for at least a few seconds, if not for hours, so there's little point in putting them into the youngest generation. Many generic collectors implementations actually will do this, albeit they do it based on size, since those short-lived objects are also usually small. An application-specific GC may use allocation flags to indicate that certain objects are expected to be long-lived, and others are expected to be short-lived. In a scripting language, this can even go so far as to differentiate between strings created explicitly and strings created as temporaries during string concatenation operators (e.g., echo str + " foo", compared with var foo = read_file). Multi-generational collectors exist as well, and you can differentiate between objects that you expect to live a long time (game actors), objects you expect to live a short time (string concatenation temporaries), and objects that may or may not exist a short time (most everything else).

Unfortunately, none of the popular generic GCes out there (not that there's a lot of them) include an API that lets the user specify the expected lifetime of an object during allocation. Something I should suggest to the Boehm folks, perhaps.

Random Picks

Realm of Magic

DUMII

The Darkening Sun