20 Dec, 2008, David Haley wrote in the 41st comment:
Votes: 0
If you're a little careful, yes. You might have to copy the list before iteration, for example. Lists get finicky if you remove elements during iteration.

EDIT: you could avoid the copying if you use the trick I mentioned earlier about listing not pointers but identifiers, and using those to tell if the object is still valid.
20 Dec, 2008, quixadhal wrote in the 42nd comment:
Votes: 0
Seeing all the MANY problems that Diku's seem to have with this particular issue (having something extracted in the middle of a loop), perhaps the real solution is to change the system to flag a mob/player for death/extraction, and only actually DO that at the bottom of the update loop?

Most player output doesn't get sent directly to the socket, it gets added to a buffer that is flushed at the top (or bottom) of the loop anyways, so maybe anything that affects list structures should be done as an update at the end of the loop, rather than at the moment it happens.

As long as spells and such check for the dead flag, so you don't have someone reduced to -10 hit points and then a cure light goes off and heals them for 12 points, despite the fact that they should be dead….
20 Dec, 2008, Scandum wrote in the 43rd comment:
Votes: 0
quixadhal said:
Seeing all the MANY problems that Diku's seem to have with this particular issue (having something extracted in the middle of a loop), perhaps the real solution is to change the system to flag a mob/player for death/extraction, and only actually DO that at the bottom of the update loop?

My mud moves all mobs and objects to a junk room which is emptied at the end of each pulse. It prevents crashes, and all you really need to check for is that ch and vch_next are still in the same room with mass attacks.
21 Dec, 2008, Guest wrote in the 44th comment:
Votes: 0
quixadhal said:
Seeing all the MANY problems that Diku's seem to have with this particular issue (having something extracted in the middle of a loop), perhaps the real solution is to change the system to flag a mob/player for death/extraction, and only actually DO that at the bottom of the update loop?


Smaug already has this though. There are extraction queues for objects and mobs, and an extract_char function which usually deals with sending them there. It could be in this particular case that there's a stray statement somewhere that's changing the room pointer for the mob without going through the proper channels. That's not easy to find.
21 Dec, 2008, David Haley wrote in the 45th comment:
Votes: 0
In the end of the day, the real problem is the lack of coherent API for these kinds of changes. That's why the stray statements that Samson mentioned exist: the code doesn't have clear points of access, and instead everybody kind of pokes everything else using helper functions most of the time. It is those other times that cause the trouble.

I don't mean to sound too pessimistic but I often get somewhat frustrated with the architecture. It's turned into a lump of spaghetti that is too hard to work with. I'm too often tempted to just bite the bullet and truly start from scratch, but that raises so many other issues that it is an extremely daunting prospect.
21 Dec, 2008, Banner wrote in the 46th comment:
Votes: 0
So the problem cannot be solved?
21 Dec, 2008, Kayle wrote in the 47th comment:
Votes: 0
DavidHaley said:
In the end of the day, the real problem is the lack of coherent API for these kinds of changes.

For now. ;) ::snickers with DavidHaley about some shared secret::

DavidHaley said:
It's turned into a lump of spaghetti that is too hard to work with.

Don't have to tell me this. ::mutters::

Banner said:
So the problem cannot be solved?

It can be solved, but it's going to take a monumental amount of effort. If I'm understanding what's been discussed, and SWR actually happened after Smaug implemented the extracted_char_queue, you're going to have to figure out where something is being killed off or removed, without being added to the queue as it should be. If SWR was derived before the queue was implemented, you'll likely need to go into SmaugFUSS and implement the queue and see if that helps things any.
21 Dec, 2008, Scandum wrote in the 48th comment:
Votes: 0
Banner said:
So the problem cannot be solved?

Crashes of this nature should be pretty rare, there's probably something in your code that's contributing to it. I also wouldn't exclude the possibility that the bug isn't vch_next related.

A queue isn't necessary either, you can dump mobs and objects into a room in Limbo and purge everything in it at the end of each pulse, shouldn't be too hard to do.

If you really want to find out where things go wrong you'll need to get rid of the custom memory and string allocation routines and use Valgrind.
21 Dec, 2008, Sharmair wrote in the 49th comment:
Votes: 0
Davion said:
… Also, maybe step through all those frames checking mob's (aka, 0x14e16b0) in_room to see if it's NULL to begin with, or if it happens somewhere after the prog is triggered.

This is an invalid debugging method and is bound to give misleading data. The core is like a snapshot
of the state of the game at the time of the crash (actually, it is a memory dump of the game at the time
of the crash). Even though the different frames have different pointers they use, they all point to the
same patch of memory (on the heap in this case) and will show the exact same thing, the state of the
pointed to memory at the time of the crash, even though in_room may have been valid at the time
the prior frames used the data.
21 Dec, 2008, David Haley wrote in the 50th comment:
Votes: 0
Banner said:
So the problem cannot be solved?

What Kayle said. It's solvable, it's just that the architecture makes it rather difficult. It's unclear to me if it's better to solve it in the current codebase, or invest time in a new codebase. Neither seems hugely attractive to me…
Scandum said:
A queue isn't necessary either, you can dump mobs and objects into a room in Limbo and purge everything in it at the end of each pulse, shouldn't be too hard to do.

Dumping mobs into some sort of limbo isn't really correct either: sure, you won't crash, but you'll be breaking the iteration improperly when you detect that you're in limbo instead of wherever you wanted to be.
21 Dec, 2008, Scandum wrote in the 51st comment:
Votes: 0
DavidHaley said:
Dumping mobs into some sort of limbo isn't really correct either: sure, you won't crash, but you'll be breaking the iteration improperly when you detect that you're in limbo instead of wherever you wanted to be.

You've got the exact same problem with a separate queue.

The only real work around I can think of is to create a linked list of mobs in the room, next parse that linked list and hit everything that is still in the same room. It's not a whole lot of overhead, and if you create a special function to generate the list it should be fairly easy to add it wherever needed.
21 Dec, 2008, David Haley wrote in the 52nd comment:
Votes: 0
Well, actually, it depends on how the separate queue is implemented. If it's implemented using its own list structure, you can test if a character has died this round by testing for presence in the queue, but you can still use the next_in_room pointer to go to the next character in that room.
21 Dec, 2008, ghasatta wrote in the 53rd comment:
Votes: 0
I think there is a fundamental question that needs to be decided:

As you iterate through the list, can each iteration mutate (e.g., remove) the list element that is being iterated over, or can each iteration mutate 1 or more list elements?

If the answer is that each iteration can mutate an indeterminate number of list members (e.g., a player gets killed, so the player and his pet are both removed from the room simultaneously) then I say go exponential. This would be (relatively) cpu intensive but I don't think it will matter unless you have a list of several hundred (thousand?) things. I think this is basically what David Haley is suggesting.

You would build your own local list of all items in the room at the start of the iteration. Then, for each item in your local list, you search the room list for the presence of that item. When found, operate on it. If not found, move on happily.

OTOH, if each iteration can modify at best the one element that is currently being iterated over, I think you merely need to store the 'next' pointer before you execute any logic, as has been suggested earlier in this thread.

My guess is that there really aren't any limits on how many elements can be removed from the list in a given iteration. Maybe a mobprog decides to move half the contents of the room to another location. Sounds scary.
21 Dec, 2008, David Haley wrote in the 54th comment:
Votes: 0
I'm not sure what exactly you mean by "going exponential", but, yes, the problem is that the list can be changed in more or less arbitrary ways during iteration as things act upon the room's contents. It's not limited to just the current element.
21 Dec, 2008, ghasatta wrote in the 55th comment:
Votes: 0
I meant that your time to complete iteration of your 'local' list would be exponential relative to the size of the list - as your list increased, you are potentially increasing the number of iterations required exponentially. The best case would be if each list remains ordered in the exact same way, but even then I think you are looking at what a mathematician might call a triangular number. If you have 10 elements in a room, you would iterate ~55 times to cover everything. If you had 100 elements, you would iterate ~5000 times.

So, all I was trying to say was that if a mud would tend to have many items in a given room, this would possibly be costly.
21 Dec, 2008, Sharmair wrote in the 56th comment:
Votes: 0
Scandum said:
One solution is using a global vch_next which you update to ->next or next_in_room if it's found in the extract_char routine, but that's only really useful in char_update, violence_update, and other main loops.

This is probably a very bad idea. At least with local holder variables you have a chance that each loop
will work with its own holder. But with a global, if you have nested loops, the loops are bound to get
confused (let alone the code is made more messy).
21 Dec, 2008, Sharmair wrote in the 57th comment:
Votes: 0
Banner said:
So the problem cannot be solved?

What problem? Your crash issue, or the general problem of lists in MUDs (which I doubt is your
problem, at least not the main problem)?

Though, the answer to both is that they can be solved.

As far as the list issue, I would think that it can be handled much easier then alot have implied.
In fact, SMAUG is really quite a bit more stable then most Dikus, even with the higher complexity
with all the added features interconnecting more then on the more primitive codebases. I am running
some tests now and will comment more on this later.

As far as your crash, You have not given enough info to for sure say what the real problem is. From
what I have seen so far, a double kill is the most likely. Though, from some of the code you have
shown so far, I can tell your code is based on an older SMAUG so you might not have all of the stability
of the newer code.
Anyway, if you can get a core on that crash in the mprog you showed at the start, it would be helpful
if I could see the objects in the room at the time of the crash, and the code for extract_char,
char_from_room and the parts of raw_kill after the calls to the triggers at least until it is handling
the player only case.
To get the objects you can probably use actor to get the room at the mprog_driver frame and print:
print *actor->in_room->first_content
print *actor->in_room->first_content->next_content
print *actor->in_room->first_content->next_content->next_content
etc. until you get the "Cannot access memory at address 0x0" message.
There are ways of doing the list with less typing using the $# from the last print and adding just the
next_content. But if you don't understand this, just use the brute force long names. Also, if you notice
the next_content being NULL, you don't have to try to print it.
21 Dec, 2008, Scandum wrote in the 58th comment:
Votes: 0
Sharmair said:
Scandum said:
One solution is using a global vch_next which you update to ->next or next_in_room if it's found in the extract_char routine, but that's only really useful in char_update, violence_update, and other main loops.

This is probably a very bad idea. At least with local holder variables you have a chance that each loop
will work with its own holder. But with a global, if you have nested loops, the loops are bound to get
confused (let alone the code is made more messy).

I don't think there are many muds with nested main loops in the update handler. All you really need is one global vch_next for char_update, violence_update, aggr_update, etc. It's far from messy, just an alien concept to most.

ghasatta said:
You would build your own local list of all items in the room at the start of the iteration. Then, for each item in your local list, you search the room list for the presence of that item. When found, operate on it. If not found, move on happily.

What's wrong with what I suggested initially, to simply check if ch and vch have the same in_room pointer? Combined with delayed extraction that'd do the trick.
21 Dec, 2008, David Haley wrote in the 59th comment:
Votes: 0
ghasatta said:
I meant that your time to complete iteration of your 'local' list would be exponential relative to the size of the list - as your list increased, you are potentially increasing the number of iterations required exponentially. The best case would be if each list remains ordered in the exact same way, but even then I think you are looking at what a mathematician might call a triangular number. If you have 10 elements in a room, you would iterate ~55 times to cover everything. If you had 100 elements, you would iterate ~5000 times.

Sorry, I should have made my question clearer. I know what exponential complexity is, I just am not sure what you mean in terms of implementation. I'm not sure why you're iterating this many times, or why you need to. If you have a local copy of the list, just iterate over the local copy of the list. If you need to check that the element is still in the room, iterate once over the main list. At worse, for each element in the local list, you will re-check every element in the main list. That's O(n^2), certainly not O(2^n) or other exponential. (Actually, it would be better in practice, since the lists start out ordered the same way, and elements are only removed from the middle or added to the end of the main list. So you don't need to restart the iteration from scratch. It would suffice to maintain two parallel traversals, for example.)

Scandum said:
What's wrong with what I suggested initially, to simply check if ch and vch have the same in_room pointer? Combined with delayed extraction that'd do the trick.

Because you won't finish iterating over the elements in the room once you detect that the rooms have changed. As I said a few posts ago, this will avoid crashes, but is still not correct.
21 Dec, 2008, Scandum wrote in the 60th comment:
Votes: 0
DavidHaley said:
Scandum said:
What's wrong with what I suggested initially, to simply check if ch and vch have the same in_room pointer? Combined with delayed extraction that'd do the trick.

Because you won't finish iterating over the elements in the room once you detect that the rooms have changed. As I said a few posts ago, this will avoid crashes, but is still not correct.

No, not if you build a duplicate list of the mobs in the rooms and parse that list instead.
40.0/67