16 Dec, 2008, tphegley wrote in the 1st comment:
Votes: 0
I have found a memory leak in my code and I'm not sure when it started. If I medit a mob and then save it, any other player on the mud who tries to enter a command gets 'Huh?', and after a minute or so the mud crashes. Here's what gdb said:

Program received signal SIGSEGV, Segmentation fault.
0x400ca18d in mempcpy () from /lib/libc.so.6
(gdb) bt
#0 0x400ca18d in mempcpy () from /lib/libc.so.6
#1 0x400c0b32 in _IO_default_xsputn_internal () from /lib/libc.so.6
#2 0x4009e4c5 in vfprintf () from /lib/libc.so.6
#3 0x400b728b in vsprintf () from /lib/libc.so.6
#4 0x400a4dfd in sprintf () from /lib/libc.so.6
#5 0x0804a537 in talk_channel (ch=0x73726568,
argument=0x6f766120 <Address 0x6f766120 out of bounds>, channel=1948279913,
verb=0x73206568 <Address 0x73206568 out of bounds>) at act_comm.c:566
#6 0x746f0a20 in ?? ()
#7 0x73726568 in ?? ()
#8 0x6f766120 in ?? ()


Then it spit out those numbers up through 1200 before I just quit the backtrace. I have not messed with the talk channel and have been working on classes. Is there anything in that code that you can discern or what kind of information can I give?
16 Dec, 2008, tphegley wrote in the 2nd comment:
Votes: 0
Here it is in Valgrind:

==8892== Invalid write of size 1
==8892== at 0x401E414: mempcpy (mc_replace_strmem.c:677)
==8892== by 0x40C9B31: _IO_default_xsputn (in /lib/libc-2.3.5.so)
==8892== by 0x40A74C4: vfprintf (in /lib/libc-2.3.5.so)
==8892== by 0x40C028A: vsprintf (in /lib/libc-2.3.5.so)
==8892== by 0x40ADDFC: sprintf (in /lib/libc-2.3.5.so)
==8892== by 0x804A536: talk_channel (act_comm.c:566)
==8892== by 0x804AC45: do_yell (act_comm.c:913)
==8892== by 0x807F7AC: violence_update (fight.c:419)
==8892== by 0x809F3D7: update_handler (update.c:2411)
==8892== by 0x8070849: game_loop_unix (comm.c:933)
==8892== by 0x8070DE3: main (comm.c:520)
==8892== Address 0xbf00c7c2 is not stack'd, malloc'd or (recently) free'd
==8892==
==8892== Process terminating with default action of signal 11 (SIGSEGV)
==8892== Access not within mapped region at address 0xBF00C7C2
==8892== at 0x401E414: mempcpy (mc_replace_strmem.c:677)
==8892== by 0x40C9B31: _IO_default_xsputn (in /lib/libc-2.3.5.so)
==8892== by 0x40A74C4: vfprintf (in /lib/libc-2.3.5.so)
==8892== by 0x40C028A: vsprintf (in /lib/libc-2.3.5.so)
==8892== by 0x40ADDFC: sprintf (in /lib/libc-2.3.5.so)
==8892== by 0x804A536: talk_channel (act_comm.c:566)
==8892== by 0x804AC45: do_yell (act_comm.c:913)
==8892== by 0x807F7AC: violence_update (fight.c:419)
==8892== by 0x809F3D7: update_handler (update.c:2411)
==8892== by 0x8070849: game_loop_unix (comm.c:933)
==8892== by 0x8070DE3: main (comm.c:520)
==8892==
==8892== ERROR SUMMARY: 64 errors from 3 contexts (suppressed: 15 from 2)
==8892== malloc/free: in use at exit: 18,257,900 bytes in 65 blocks.
==8892== malloc/free: 610 allocs, 545 frees, 18,444,304 bytes allocated.
==8892== For counts of detected errors, rerun with: -v
==8892== searching for pointers to 65 not-freed blocks.
==8892== checked 18,741,276 bytes.
==8892==
==8892== LEAK SUMMARY:
==8892== definitely lost: 0 bytes in 0 blocks.
==8892== possibly lost: 0 bytes in 0 blocks.
==8892== still reachable: 18,257,900 bytes in 65 blocks.
==8892== suppressed: 0 bytes in 0 blocks.
==8892== Rerun with --leak-check=full to see details of leaked memory.
16 Dec, 2008, tphegley wrote in the 3rd comment:
Votes: 0
Here it is with valgrind --leak-check=full --show-reachable=yes ../src/envy

==9048== Jump to the invalid address stated on the next line
==9048== at 0x20706C65: ???
==9048== Address 0x20706c65 is not stack'd, malloc'd or (recently) free'd
==9048==
==9048== Process terminating with default action of signal 11 (SIGSEGV)
==9048== Access not within mapped region at address 0x20706C65
==9048== at 0x20706C65: ???
==9048==
==9048== ERROR SUMMARY: 21 errors from 2 contexts (suppressed: 15 from 2)
==9048== malloc/free: in use at exit: 18,520,044 bytes in 67 blocks.
==9048== malloc/free: 610 allocs, 543 frees, 18,705,720 bytes allocated.
==9048== For counts of detected errors, rerun with: -v
==9048== searching for pointers to 67 not-freed blocks.
==9048== checked 19,001,752 bytes.
==9048==
==9048==
==9048== 364 bytes in 1 blocks are still reachable in loss record 1 of 3
==9048== at 0x401C909: malloc (vg_replace_malloc.c:207)
==9048== by 0x40BE16E: __fopen_internal (in /lib/libc-2.3.5.so)
==9048== by 0x40BE22E: fopen@@GLIBC_2.1 (in /lib/libc-2.3.5.so)
==9048== by 0x80998A6: save_char_obj (save.c:262)
==9048== by 0x809F328: char_update (update.c:1564)
==9048== by 0x809F47A: update_handler (update.c:2442)
==9048== by 0x8070849: game_loop_unix (comm.c:933)
==9048== by 0x8070DE3: main (comm.c:520)
==9048==
==9048==
==9048== 8,519,680 bytes in 65 blocks are still reachable in loss record 2 of 3
==9048== at 0x401BA29: calloc (vg_replace_malloc.c:397)
==9048== by 0x80723F6: alloc_perm (db.c:3228)
==9048== by 0x8073E5A: load_helps (db.c:793)
==9048== by 0x807620E: boot_db (db.c:553)
==9048== by 0x8070DBC: main (comm.c:517)
==9048==
==9048==
==9048== 10,000,000 bytes in 1 blocks are still reachable in loss record 3 of 3
==9048== at 0x401BA29: calloc (vg_replace_malloc.c:397)
==9048== by 0x8075F03: boot_db (db.c:433)
==9048== by 0x8070DBC: main (comm.c:517)
==9048==
==9048== LEAK SUMMARY:
==9048== definitely lost: 0 bytes in 0 blocks.
==9048== possibly lost: 0 bytes in 0 blocks.
==9048== still reachable: 18,520,044 bytes in 67 blocks.
==9048== suppressed: 0 bytes in 0 blocks.
Segmentation fault
16 Dec, 2008, tphegley wrote in the 4th comment:
Votes: 0
Is this a case where sprintf is not being freed somewhere and it is overrunning into something else or am I way off base?
16 Dec, 2008, David Haley wrote in the 5th comment:
Votes: 0
tphegley said:
Then it spit out those numbers up through 1200 before I just quit the backtrace.

Which numbers are you referring to?
tphegley said:
# ==8892== Invalid write of size 1
# ==8892== at 0x401E414: mempcpy (mc_replace_strmem.c:677)
# ==8892== by 0x40C9B31: _IO_default_xsputn (in /lib/libc-2.3.5.so)
# ==8892== by 0x40A74C4: vfprintf (in /lib/libc-2.3.5.so)
# ==8892== by 0x40C028A: vsprintf (in /lib/libc-2.3.5.so)
# ==8892== by 0x40ADDFC: sprintf (in /lib/libc-2.3.5.so)
# ==8892== by 0x804A536: talk_channel (act_comm.c:566)
# ==8892== by 0x804AC45: do_yell (act_comm.c:913)
# ==8892== by 0x807F7AC: violence_update (fight.c:419)
# ==8892== by 0x809F3D7: update_handler (update.c:2411)
# ==8892== by 0x8070849: game_loop_unix (comm.c:933)
# ==8892== by 0x8070DE3: main (comm.c:520)
# ==8892== Address 0xbf00c7c2 is not stack'd, malloc'd or (recently) free'd

This means that the call to sprintf in talk_channel is trying to write to memory that doesn't exist. The first thing to do is see what that line of code is (act_comm.c:566) just to make sure it looks ok. I'm not sure what the link with medit would be, but it's our starting place at least…

sprintf doesn't allocate memory, but it does clearly write into memory. The message above is saying that it wrote one byte into unallocated memory. Usually, that's indicative of an off-by-one error, for example a buffer was allocated to be large enough to contain the string text but not the trailing \0.
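
To make the point about sprintf concrete: it writes into whatever buffer it is handed and never checks the size. Below is a minimal sketch of a bounded alternative, assuming a hypothetical helper named format_channel_line (it is not part of Envy, and the format string is only an illustration). It relies on the standard snprintf, which never writes more than the given size and returns the length the full string would have needed.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of Envy: format a channel line into buf
 * without ever writing past buf[size - 1].
 * Returns 1 if the whole message fit, 0 if it had to be truncated. */
static int format_channel_line(char *buf, size_t size,
                               const char *verb, const char *argument)
{
    int needed;

    assert(buf != NULL && size > 0);

    /* snprintf writes at most size bytes, including the trailing '\0',
     * and returns the length the untruncated string would have had. */
    needed = snprintf(buf, size, "You %s '%s'\n\r", verb, argument);

    return needed >= 0 && (size_t)needed < size;
}
```

A caller could log a warning whenever this returns 0, which would point at the oversized input instead of crashing some time later.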
16 Dec, 2008, tphegley wrote in the 6th comment:
Votes: 0
Here is act_comm.c:566:

switch (channel)
{
default:
    sprintf (buf, "You %s '%s'\n\r", verb, argument);   /* <-- line 566 */
    send_to_char (buf, ch);
    sprintf (buf, "$n %ss '$t'", verb);
    break;
case CHANNEL_SHOUT:
    sprintf (buf, "^c$n^c shouts across the land '^B$t^c'^w");
    position = ch->position;
    ch->position = POS_STANDING;
    act (buf, ch, argument, NULL, TO_CHAR);
    ch->position = position;
    break;
16 Dec, 2008, tphegley wrote in the 7th comment:
Votes: 0
DavidHaley said:
Which numbers are you referring to?


These numbers in bold:


#6 0x746f0a20 in ?? ()
#7 0x73726568 in ?? ()
#8 0x6f766120 in ?? ()
16 Dec, 2008, tphegley wrote in the 8th comment:
Votes: 0
This is on my 'beta' port. I haven't messed with the talk function at all. I've just been creating new skills/affects for classes.

On the regular player port, I don't have this issue and the code in act_comm.c:566 is exactly the same.
16 Dec, 2008, tphegley wrote in the 9th comment:
Votes: 0
Here are all the lines of code:

act_comm.c:913
void
do_yell (CHAR_DATA * ch, char *argument)
{
    talk_channel (ch, color_strip(argument), CHANNEL_YELL, "yell");   /* <-- line 913 */
    return;
}


fight.c:419
if (IS_NPC (target))
{
    sprintf (buf, "A call to arms! I've spotted %s at %s!",
             target->short_descr,
             newname);
    do_yell (ch, buf);   /* <-- line 419 */
}
else
{
    if (is_outlaw (ch, target))
    {
        sprintf (buf, "A call to arms! I've spotted the outlaw %s at %s!",
                 target->name,
                 newname);
        do_yell (ch, buf);
    }


update.c:2411
if (--pulse_area <= 0)
{
    pulse_area = number_range (PULSE_AREA / 2, 3 * PULSE_AREA / 2);
    if (!sleepmode)
        area_update ();
    if (!firsttime)
    {
        if (count_players() == 0)
        {
            standby();
        }
        else
        {
            wakeup();
        }
    }
    else
    {
        firsttime = FALSE;
    }
}
/* <-- this blank line is line 2411 */
if (--pulse_violence <= 0)
{
    pulse_violence = PULSE_VIOLENCE;
    if (!sleepmode)
        violence_update ();
}


comm.c:920
/*
 * Autonomous game motion.
 */
update_handler ();   /* <-- line 933 */



comm.c:520
#if defined( unix ) || defined ( WIN32 )
control = init_socket (port);
// init hash table for ip traffic controling

for (i = 0 ; i < IP_HASH_SIZE; i++)
{
    iptable[i] = NULL;
}

boot_db ();
sprintf (log_buf, "-AD- is ready to rock on port %d.", port);
log_string (log_buf);
game_loop_unix (control);   /* <-- line 520 */
16 Dec, 2008, Davion wrote in the 10th comment:
Votes: 0
Can you show us all of talk_channel? Slim chance it might be in there. Also, does valgrind spit anything out when you medit/save something? Or is that the only thing you get from valgrind?
16 Dec, 2008, Scandum wrote in the 11th comment:
Votes: 0
No offense meant, but have you tried a clean compile?
16 Dec, 2008, tphegley wrote in the 12th comment:
Votes: 0
Davion said:
Can you show us all of talk_channel?


How would it be in there if I haven't changed anything in the function? Can functions overlap with memory issues?

Davion said:
Slim chance it might be in there. Also, does valgrind spit anything out when you medit/save something? Or is that the only thing you get from valgrind?


It didn't show anything when I did the medit. When
16 Dec, 2008, tphegley wrote in the 13th comment:
Votes: 0
Scandum said:
No offense meant, but have you tried a clean compile?


Yes, I've done make clean and then make again. I usually make clean after every code change.
16 Dec, 2008, Davion wrote in the 14th comment:
Votes: 0
tphegley said:
How would it be in there if I haven't changed anything in the function? Can functions overlap with memory issues?

It might not be at all! It's just that you're showing those lines of code, and there isn't enough there to tell whether it actually -isn't- there. Have you modified anything else since this started happening? These things are really tricky to pin down. It looks, though, like it might be deeper. GDB is showing that the argument sent to talk_channel has already gone out of bounds. And if players are getting a 'Huh?' when entering a command, it might be right down to the descriptor input getting chopped up. Try doing a save after a redit, or any of the other OLCs, and see if it happens. If it still does it, it might be somewhere in the save. You might also have to look at the save functions to make sure they don't snipe out what's been modified and save only that (usually though, if one thing in an area is changed, the entire area is saved).
16 Dec, 2008, Tyche wrote in the 15th comment:
Votes: 0
Quote
# #5 0x0804a537 in talk_channel (ch=0x73726568,
# argument=0x6f766120 <Address 0x6f766120 out of bounds>, channel=1948279913,
# verb=0x73206568 <Address 0x73206568 out of bounds>) at act_comm.c:566
# #6 0x746f0a20 in ?? ()
# #7 0x73726568 in ?? ()
# #8 0x6f766120 in ?? ()


You have overwritten an array on the stack sometime before you entered talk_channel().
All the addresses above are invalid and look to be ASCII text. The easiest way to debug this is to set a breakpoint in gdb and step through it.
16 Dec, 2008, Scandum wrote in the 16th comment:
Votes: 0
tphegley said:
Yes, I've done make clean and then make again. I usually make clean after every code change.

It sounds to me like a classic memory corruption caused by either a buffer overflow or an underflow.

If your mud uses a custom memory manager get rid of it asap since they mess up valgrind, use more memory, and are slower. If you have memory to spare the same goes for the hashed string memory managers which use roughly 33% less memory but are definitely slower.

From what you've told us, I'd say something in the medit routine overwrites memory that deals with the command interpreter. Can you narrow it down some more? For example, does it happen if you enter and leave the medit menu without changing anything?
16 Dec, 2008, tphegley wrote in the 17th comment:
Votes: 0
Davion said:
Try doing a save after a redit, or any of the other OLC's and see if it happens. If it still does it, it might be somewhere in the save.


It does it with oedit, redit, and medit whenever I change something and then save.

Tyche said:
You have overwritten an array on the stack sometime before you entered talk_channel().
All the addresses above are invalid and look to be ASCII text. The easiest way to debug this is to set a breakpoint in gdb and step through it.


Ok. I'll try and figure out the break points from Nick Gammon's site. See what I can come up with.


Scandum said:
It sounds to me like a classic memory corruption caused by either a buffer overflow or an underflow.

If your mud uses a custom memory manager get rid of it asap since they mess up valgrind, use more memory, and are slower. If you have memory to spare the same goes for the hashed string memory managers which use roughly 33% less memory but are definitely slower.

From what you've told us, I'd say something in the medit routine overwrites memory that deals with the command interpreter. Can you narrow it down some more? For example, does it happen if you enter and leave the medit menu without changing anything?


It is Envy code and I believe it uses malloc just as regular envy did. When it comes to memory stuff, I haven't ever really worked with it.

It seems my 'asave changed' (the command I have always used to save the areas) is what corrupts the memory. I haven't changed how saving works or modified the save code in any way.

I'll continue to try and see what's up.
16 Dec, 2008, David Haley wrote in the 18th comment:
Votes: 0
Given that the corruption occurs in a sprintf call, I believe that your problem is that you're writing too much into that buffer. Since the sprintf statement itself looks innocuous enough, I would try looking at the things you're passing to it.

Namely:
sprintf (buf, "A call to arms! I've spotted %s at %s!",
         target->short_descr,
         newname);
do_yell (ch, buf);   /* <-- line 419 */

I would look at anything you've done recently to short_descr, as well as figure out where "newname" is coming from.

Also, since the problem seems to come from using OLC followed by a save, I would make sure that you haven't changed anything in the save command. Finally, also make sure you haven't changed anything in the OLC commands.

I would also try using a breakpoint to examine the contents of target->short_descr and newname at the code point I mentioned above. Setting a breakpoint is actually pretty easy, Nick's guide goes over it well if I remember correctly.
16 Dec, 2008, Scandum wrote in the 19th comment:
Votes: 0
tphegley said:
It is Envy code and I believe it uses malloc just as regular envy did. When it comes to memory stuff, I haven't ever really worked with it.

Looks like envy is using the alloc_perm, alloc_mem, and free_mem routines in db.c and a string hasher in smm.c.

Nowadays, and especially if you use Valgrind, you'd want to call calloc, malloc, and free directly. Given the way envy deals with strings you're probably better off using strdup with free instead of str_dup and free_str.

If you change this Valgrind should detect and report memory corruptions correctly.
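
To show why a pooled allocator can hide this kind of bug from Valgrind, here is a minimal sketch. It assumes a hypothetical bump allocator (Pool, pool_init, pool_alloc and pool_free_all are made-up names; the real alloc_perm is not shown in this thread) that carves small requests out of one large calloc'd block. An overrun of one small block lands in its neighbour, but it is still inside memory that calloc handed out, so a tool that only tracks malloc/calloc boundaries has nothing to complain about.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical bump allocator: all requests come out of one big block. */
typedef struct {
    unsigned char *base;
    size_t size;
    size_t used;
} Pool;

static int pool_init(Pool *p, size_t size)
{
    p->base = calloc(1, size);
    p->size = size;
    p->used = 0;
    return p->base != NULL;
}

static void *pool_alloc(Pool *p, size_t n)
{
    void *r;

    if (n > p->size - p->used)
        return NULL;
    r = p->base + p->used;
    p->used += n;
    return r;
}

static void pool_free_all(Pool *p)
{
    free(p->base);
    p->base = NULL;
    p->size = p->used = 0;
}

/* Returns 1 if overrunning the first block silently changed the second,
 * 0 if it did not, -1 if the pool could not be created. */
static int overrun_corrupts_neighbour(void)
{
    Pool pool;
    char *name, *desc;
    int corrupted;

    if (!pool_init(&pool, 64))
        return -1;

    name = pool_alloc(&pool, 8);
    desc = pool_alloc(&pool, 8);
    assert(name != NULL && desc != NULL);

    strcpy(desc, "guard");
    /* 12 bytes into an 8-byte slot: past the end of name, but still
     * inside the single calloc'd block, so no heap boundary is crossed. */
    memcpy(name, "0123456789A", 12);

    corrupted = strcmp(desc, "guard") != 0;
    pool_free_all(&pool);
    return corrupted;
}
```

With one malloc per object instead, the same 12-byte write would run past the end of an 8-byte heap block, which is the kind of invalid write Valgrind reports at the point where it happens.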
16 Dec, 2008, tphegley wrote in the 20th comment:
Votes: 0
DavidHaley said:
Given that the corruption occurs in a sprintf call, I believe that your problem is that you're writing too much into that buffer. Since the sprintf statement itself looks innocuous enough, I would try looking at the things you're passing to it.

Namely:
sprintf (buf, "A call to arms! I've spotted %s at %s!",
         target->short_descr,
         newname);
do_yell (ch, buf);   /* <-- line 419 */

I would look at anything you've done recently to short_descr, as well as figure out where "newname" is coming from.

Also, since the problem seems to come from using OLC followed by a save, I would make sure that you haven't changed anything in the save command. Finally, also make sure you haven't changed anything in the OLC commands.

I would also try using a breakpoint to examine the contents of target->short_descr and newname at the code point I mentioned above. Setting a breakpoint is actually pretty easy, Nick's guide goes over it well if I remember correctly.


I did a diff between my player port and my code port, and nothing has changed around short_descr or newname. newname is just the result of stripping the color codes from a string. It hasn't been touched.