21 Jan, 2009, Kayle wrote in the 1st comment:
Votes: 0
So. I was rather bored earlier today, so I decided to be a smartass and try to embed Kiasyn's old Newsbot (available in the repository here) into my codebase. Everything went well: I got everything moved over, and even got it all updated for const correctness (I'm assuming). Everything compiled fine. And then I tried to run it. Well, that was a rather large mistake. So then I used some cheap workarounds and unconst-fixed it, and just generally did a hack job, and got it to boot. Well, as it turns out, that wasn't any better, because now it's just showing the feed as blank lines: the [{MWForums}] tag followed by 3 spaces. Not exactly explanatory.


*** glibc detected *** /home/kayle/coding/src/elysium: free(): invalid pointer: 0x0000000000916040 ***
======= Backtrace: =========
/lib64/libc.so.6[0x349ea77ec8]
/lib64/libc.so.6(cfree+0x76)[0x349ea7a486]
/home/kayle/coding/src/elysium(_Z9loadFeedsv+0x1ed)[0x5dfbb3]
/home/kayle/coding/src/elysium(_Z14update_handlerv+0x167)[0x61c91e]
/home/kayle/coding/src/elysium(_Z9game_loopv+0x5d0)[0x526c92]
/home/kayle/coding/src/elysium(main+0x3bf)[0x528a2c]
/lib64/libc.so.6(__libc_start_main+0xe6)[0x349ea1e576]
/home/kayle/coding/src/elysium[0x430d49]
======= Memory map: ========
00110000-0011c000 r-xp 00000000 08:05 810619 /lib64/libnss_files-2.9.so
0011c000-0031b000 ---p 0000c000 08:05 810619 /lib64/libnss_files-2.9.so
0031b000-0031c000 r--p 0000b000 08:05 810619 /lib64/libnss_files-2.9.so
0031c000-0031d000 rw-p 0000c000 08:05 810619 /lib64/libnss_files-2.9.so
00400000-006c9000 r-xp 00000000 08:07 53870807 /home/kayle/coding/src/elysium
008c9000-008cf000 rw-p 002c9000 08:07 53870807 /home/kayle/coding/src/elysium
008cf000-00c97000 rw-p 008cf000 00:00 0 [heap]
349e600000-349e620000 r-xp 00000000 08:05 810676 /lib64/ld-2.9.so
349e81f000-349e820000 r--p 0001f000 08:05 810676 /lib64/ld-2.9.so
349e820000-349e821000 rw-p 00020000 08:05 810676 /lib64/ld-2.9.so
349ea00000-349eb68000 r-xp 00000000 08:05 810704 /lib64/libc-2.9.so
349eb68000-349ed68000 ---p 00168000 08:05 810704 /lib64/libc-2.9.so
349ed68000-349ed6c000 r--p 00168000 08:05 810704 /lib64/libc-2.9.so
349ed6c000-349ed6d000 rw-p 0016c000 08:05 810704 /lib64/libc-2.9.so
349ed6d000-349ed72000 rw-p 349ed6d000 00:00 0
349ee00000-349ee02000 r-xp 00000000 08:05 810705 /lib64/libdl-2.9.so
349ee02000-349f002000 ---p 00002000 08:05 810705 /lib64/libdl-2.9.so
349f002000-349f003000 r--p 00002000 08:05 810705 /lib64/libdl-2.9.so
349f003000-349f004000 rw-p 00003000 08:05 810705 /lib64/libdl-2.9.so
349f200000-349f284000 r-xp 00000000 08:05 810727 /lib64/libm-2.9.so
349f284000-349f483000 ---p 00084000 08:05 810727 /lib64/libm-2.9.so
349f483000-349f484000 r--p 00083000 08:05 810727 /lib64/libm-2.9.so
349f484000-349f485000 rw-p 00084000 08:05 810727 /lib64/libm-2.9.so
349f600000-349f617000 r-xp 00000000 08:05 810712 /lib64/libpthread-2.9.so
349f617000-349f816000 ---p 00017000 08:05 810712 /lib64/libpthread-2.9.so
349f816000-349f817000 r--p 00016000 08:05 810712
Program received signal SIGABRT, Aborted.
0x000000349ea32f05 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install e2fsprogs-libs-1.41.3-2.fc10.x86_64
glibc-2.9-3.x86_64 keyutils-libs-1.2-3.fc9.x86_64 krb5-libs-1.6.3-16.fc10.x86_64 libgcc-4.3.2-7.x86_64
libselinux-2.0.73-1.fc10.x86_64 libstdc++-4.3.2-7.x86_64 mysql-libs-5.0.67-2.fc10.x86_64
openssl-0.9.8g-12.fc10.x86_64 zlib-1.2.3-18.fc9.x86_64
(gdb)

That's what I see in GDB. Talk about a screenful. It's a bit daunting, honestly.

So then I:
(gdb) bt
#0 0x000000349ea32f05 in raise () from /lib64/libc.so.6
#1 0x000000349ea34a73 in abort () from /lib64/libc.so.6
#2 0x000000349ea72438 in __libc_message () from /lib64/libc.so.6
#3 0x000000349ea77ec8 in malloc_printerr () from /lib64/libc.so.6
#4 0x000000349ea7a486 in free () from /lib64/libc.so.6
#5 0x00000000005dfbb3 in loadFeeds () at rss.cpp:221
#6 0x000000000061c91e in update_handler () at update.cpp:1678
#7 0x0000000000526c92 in game_loop () at comm.cpp:851
#8 0x0000000000528a2c in main (argc=2, argv=0x7fffffffe758) at comm.cpp:524
(gdb) frame 5
#5 0x00000000005dfbb3 in loadFeeds () at rss.cpp:221
221 DISPOSE( name );
(gdb) list
216 rss->format = format;
217 rss->lastupdate = 0;
218 rss->nextpost = 0;
219 rss->nextupdate = 0;
220 rss->interval = interval;
221 DISPOSE( name );
222 DISPOSE( url );
223 DISPOSE( input );
224 DISPOSE( output );
225 DISPOSE( format );
(gdb) print name
$1 = 0x916040 "[*%*category*%*] *%*title*%* [*%*qsfauthor2*%*] (*%*url*%*)\n"
(gdb)


Now, that's not entirely right. For this particular entry, name should be MWForums:
#FEED
MWForums
http://www.malevolentwhispers.org/index....
mwforums
mwforums
[*%*category*%*] *%*title*%* [*%*qsfauthor2*%*] (*%*url*%*)
15
#END


Of course this further piques my curiosity.
(gdb) print name
$1 = 0x916040 "[*%*category*%*] *%*title*%* [*%*qsfauthor2*%*] (*%*url*%*)\n"
(gdb) print format
$2 = 0x916040 "[*%*category*%*] *%*title*%* [*%*qsfauthor2*%*] (*%*url*%*)\n"
(gdb) print url
$3 = 0x916040 "[*%*category*%*] *%*title*%* [*%*qsfauthor2*%*] (*%*url*%*)\n"
(gdb)


Huh. That's odd. All three pointers have the same address and the same contents. Wth is up with that? I dunno. And that's where all of you come in. Maybe you'll see something I'm not. Maybe Kiasyn will happen along and just know what went wrong. I put all the code for the bot in here. I'm using SmaugFUSS 1.9 as a base for my code. And yeah… I haven't a fucking clue here guys..

Oh, my DISPOSE macro:
#define DISPOSE(point) \
do \
{ \
   if( (point) ) \
   { \
      if( typeid((point)) == typeid(char*) || typeid((point)) == typeid(const char*) ) \
      { \
         if( in_hash_table( (char*)(point) ) ) \
         { \
            log_printf( "&RDISPOSE called on STRALLOC pointer: %s, line %d\n", __FILE__, __LINE__ ); \
            log_string( "Attempting to correct." ); \
            if( str_free( (char*)(point) ) == -1 ) \
               log_printf( "&RSTRFREEing bad pointer: %s, line %d\n", __FILE__, __LINE__ ); \
         } \
         else \
            free( (void*) (point) ); \
      } \
      else \
         free( (void*) (point) ); \
      (point) = NULL; \
   } \
   else \
      (point) = NULL; \
} while(0)


Hmm. That should be everything…
21 Jan, 2009, Kayle wrote in the 2nd comment:
Votes: 0
Yeah. So. Rewrote how it works and got it to work.
21 Jan, 2009, The_Fury wrote in the 3rd comment:
Votes: 0
The_Fury claps at your performance.
21 Jan, 2009, Kayle wrote in the 4th comment:
Votes: 0
The short answer to what I did: I switched the fread_line calls to fread_string for the posts, and I changed the loadFeeds function to load just one feed, the MWForums feed. And yeah. Works like a charm.
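
One reading of the gdb output in the first comment (name, url and format all at 0x916040, all holding the last line read) is that the line reader hands back a pointer to a single shared buffer, so every variable ends up aliasing it and free() is then called on memory that was never malloc'd. That is an assumption about fread_line, not something confirmed in this thread. A rough Python simulation of that failure mode, where LineReader, load_aliased, load_copied and the feed URL are all made-up stand-ins:

```python
class LineReader:
    """Stand-in for a C reader that returns a pointer to one shared buffer."""

    def __init__(self, lines):
        self._lines = iter(lines)
        self._buffer = bytearray()  # plays the role of a static char array

    def read_line(self):
        # Overwrite the shared buffer in place and return the same object.
        self._buffer[:] = next(self._lines).encode("ascii")
        return self._buffer


def load_aliased(lines):
    """Keep the returned buffer directly: every field aliases the same object."""
    reader = LineReader(lines)
    name = reader.read_line()
    url = reader.read_line()
    fmt = reader.read_line()
    return name, url, fmt


def load_copied(lines):
    """Copy each line out of the shared buffer before reading the next one."""
    reader = LineReader(lines)
    name = bytes(reader.read_line())
    url = bytes(reader.read_line())
    fmt = bytes(reader.read_line())
    return name, url, fmt


# Hypothetical feed entry; the URL is a placeholder, not the real one.
FEED = ["MWForums", "http://example.invalid/feed", "[*%*category*%*] *%*title*%*"]

if __name__ == "__main__":
    print(load_aliased(FEED))  # three views of the last line read
    print(load_copied(FEED))   # three distinct values
```

In C terms, copying each line (or using a reader that allocates a fresh string per call) gives every field its own storage that can be released on its own.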
21 Jan, 2009, kiasyn wrote in the 5th comment:
Votes: 0
pwned :p
21 Jan, 2009, Kayle wrote in the 6th comment:
Votes: 0
Next order of business: giving the Python script the ability to read cookies, so that it can log in and see the parts of the forums that actually have chatter on them. >.< Any suggestions, Kiasyn? lol


[Edit:] Reply != Preview. -.-;
21 Jan, 2009, Davion wrote in the 7th comment:
Votes: 0
MegaBot uses python?
21 Jan, 2009, Kayle wrote in the 8th comment:
Votes: 0
No, MegaBot uses Lua.

What I'm working on is an embedded RSS reader for MW itself, so I can monitor my website's RSS feeds and broadcast them on an in-game channel. That way, MegaBot isn't spewing off a bunch of game-related forum posts because I'm too lazy to check my site every day. :P

My problem now is that the Python bot I have to check the feed can't see the forums that are active on the MW website, and I need a way to give it a login and have it parse the necessary cookies so that it can see whatever boards I need it to see.

Bot as is:
# Uber thanks to http://www.xml.com/lpt/a/2002/12/18/dive...
from xml.dom import minidom
import urllib
import sys
DEFAULT_NAMESPACES = \
    (None, # RSS 0.91, 0.92, 0.93, 0.94, 2.0
    'http://purl.org/rss/1.0/', # RSS 1.0
    'http://my.netscape.com/rdf/simple/0.9/' # RSS 0.90
    )
def load(rssURL):
    return minidom.parse(urllib.urlopen(rssURL))
def getElementsByTagName(node, tagName):
    children = node.getElementsByTagName(tagName)
    if len(children): return children
    return []
def first(node, tagName):
    children = getElementsByTagName(node, tagName)
    return len(children) and children[0] or None
def textOf(node):
    return node and "".join([child.data for child in node.childNodes]) or ""

if __name__ == '__main__':
    # rssDocument = load('http://www.malevolentwhispers.org/index....')
    rssDocument = load(sys.argv[1]);
    fileObj = open(sys.argv[2],"w");
    for item in getElementsByTagName(rssDocument, 'item'):
        fileObj.write( "#POST\n" );
        fileObj.write( textOf(first(item,'category')) + "~\n" )
        fileObj.write( textOf(first(item,'title')) + "~\n" );
        fileObj.write( textOf(first(item,'author')) + "~\n" );
        fileObj.write( textOf(first(item,'link')) + "~\n" );
        fileObj.write( textOf(first(item,'feedburner:origLink')) + "~\n" );
        fileObj.write( "#END\n" );
    fileObj.close();


I need to expand the above to send some kind of cookie before pulling the RSS, because the QSFP feeds are filtered by your permissions. So with a cookie and an account, I'd be able to make it see any board I wanted, instead of just the boards viewable by guests. But I lack knowledge of Python, so I can't get there on my own.
21 Jan, 2009, David Haley wrote in the 9th comment:
Votes: 0
I think that you send cookies along with the HTTP header. So you'd need to muck around with this:

def load(rssURL):
    return minidom.parse(urllib.urlopen(rssURL))


My presumption – without knowing anything about urllib – is that that line of code is accessing the URL with a straight, simple GET request. You'll need to figure out how to modify that request to send an HTTP header, including your cookie.
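
A minimal sketch of what "a GET request with a cookie in the header" can look like. It is written against Python 3's urllib.request rather than the Python 2 urllib used in the script above, and the URL, the cookie names and the helper name build_feed_request are all placeholders for illustration:

```python
import urllib.request


def build_feed_request(url, cookies):
    """Return a GET request for `url` that carries `cookies` in a Cookie header."""
    # A Cookie header is just "name=value" pairs joined with "; ".
    cookie_header = "; ".join("%s=%s" % (name, value) for name, value in cookies.items())
    return urllib.request.Request(url, headers={"Cookie": cookie_header})


if __name__ == "__main__":
    # Placeholder URL and cookie names; substitute whatever the forum actually uses.
    req = build_feed_request(
        "http://www.example.org/index.php?a=rssfeed",
        {"user": "xxx", "pass": "xxx"},
    )
    print(req.get_method(), req.full_url)
    print(req.get_header("Cookie"))
    # urllib.request.urlopen(req) would send it; omitted so the sketch runs offline.
```

Building the request object separately from sending it makes it easy to inspect exactly which headers will go out before anything touches the network.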
21 Jan, 2009, David Haley wrote in the 10th comment:
Votes: 0
I posted an example to the Pastebin of how to make an HTTP request with cookies and get data back.
21 Jan, 2009, kiasyn wrote in the 11th comment:
Votes: 0
MegaBot is C++ running embedded Lua scripts; the old news bot was a clusterfuck of C, C++ and Python.
21 Jan, 2009, Kayle wrote in the 12th comment:
Votes: 0
Yeah, that about sums it up.
21 Jan, 2009, Kayle wrote in the 13th comment:
Votes: 0
Ok. Using David's example, and with his help, I got the script to use cookies when it gets the information.

# Uber thanks to http://www.xml.com/lpt/a/2002/12/18/dive...
from xml.dom import minidom
import httplib, urllib
import sys
DEFAULT_NAMESPACES = \
    (None, # RSS 0.91, 0.92, 0.93, 0.94, 2.0
    'http://purl.org/rss/1.0/', # RSS 1.0
    'http://my.netscape.com/rdf/simple/0.9/' # RSS 0.90
    )
def load(rssURL):
    conn = httplib.HTTPConnection("www.malevolentwhispers.org")
    cookies = {'Cookie': 'qsf_user=xxx;qsf_pass=xxx'}
    conn.request("GET", "/index.php", urllib.urlencode({'a': 'index.php'}), cookies)
    r1 = conn.getresponse()
    # might want to worry a little if r1.status isn't 200 (i.e. OK)
    data = r1.read()
    return minidom.parse(data)
def getElementsByTagName(node, tagName):
    children = node.getElementsByTagName(tagName)
    if len(children): return children
    return []
def first(node, tagName):
    children = getElementsByTagName(node, tagName)
    return len(children) and children[0] or None
def textOf(node):
    return node and "".join([child.data for child in node.childNodes]) or ""

if __name__ == '__main__':
    # rssDocument = load('http://www.malevolentwhispers.org/index....')
    rssDocument = load(sys.argv[1]);
    fileObj = open(sys.argv[2],"w");
    for item in getElementsByTagName(rssDocument, 'item'):
        fileObj.write( "#POST\n" );
        fileObj.write( textOf(first(item,'category')) + "~\n" )
        fileObj.write( textOf(first(item,'title')) + "~\n" );
        fileObj.write( textOf(first(item,'author')) + "~\n" );
        fileObj.write( textOf(first(item,'link')) + "~\n" );
        fileObj.write( textOf(first(item,'feedburner:origLink')) + "~\n" );
        fileObj.write( "#END\n" );
    fileObj.close();


That's the script as is. And it somewhat works. It sends the information out, but then Apache is rejecting it saying it doesn't have access rights. So. Here's the command line output from running the script manually:

command line said:
[kayle@boralis bin]$ python ../bin/rss.py http://www.malevolentwhispers.org/index.... &
[1] 31696
[kayle@boralis bin]$ Traceback (most recent call last):
  File "../bin/rss.py", line 30, in <module>
    rssDocument = load(sys.argv[1]);
  File "../bin/rss.py", line 17, in load
    return minidom.parse(data)
  File "/usr/lib64/python2.5/site-packages/_xmlplus/dom/minidom.py", line 1915, in parse
    return expatbuilder.parse(file)
  File "/usr/lib64/python2.5/site-packages/_xmlplus/dom/expatbuilder.py", line 924, in parse
    fp = open(file, 'rb')
IOError: [Errno 2] No such file or directory: '<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">\n<html><head>\n<title>403 Forbidden</title>\n</head><body>\n<h1>Forbidden</h1>\n<p>You don\'t have permission to access /index.php\non this server.</p>\n<hr>\n<address>Apache/2.2.10 (Fedora) Server at www.malevolentwhispers.org Port 80</address>\n</body></html>\n'

[1]+ Exit 1 python ../bin/rss.py http://www.malevolentwhispers.org/index....
[kayle@boralis bin]$
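
A side note on the traceback itself: the IOError comes from minidom.parse treating the response body as a filename, which is a separate problem from the 403. minidom.parseString takes the document text directly. A minimal sketch, written for Python 3 rather than the Python 2.5 in the traceback, with parse_feed and SAMPLE as made-up names:

```python
from xml.dom import minidom


def parse_feed(status, body):
    """Parse an HTTP response body as XML, refusing anything but a 200."""
    if status != 200:
        raise RuntimeError("feed request failed with HTTP status %d" % status)
    # parseString takes the document text; parse() wants a filename or file object.
    return minidom.parseString(body)


# Tiny stand-in document, not the real feed.
SAMPLE = b"""<?xml version="1.0"?>
<rss version="2.0"><channel><item><title>Hello</title></item></channel></rss>"""

doc = parse_feed(200, SAMPLE)
titles = [node.firstChild.data for node in doc.getElementsByTagName("title")]
print(titles)
```

Checking the status first means a 403 page surfaces as a clear error instead of being fed to the XML parser.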


And then here are the access logs. The 403s are the modified script on the MW development port, and the 200s are the original script running on MW's live port.
access_log said:
66.218.49.113 - - [21/Jan/2009:11:36:20 -0800] "GET /index.php?a=rssfeed HTTP/1.0" 200 4122 "-" "Python-urllib/1.17"
149.77.73.14 - - [21/Jan/2009:11:37:04 -0800] "POST /index.php HTTP/1.1" 403 303 "-" "-"
149.77.73.14 - - [21/Jan/2009:11:37:35 -0800] "GET /index.php?a=rssfeed HTTP/1.1" 200 4122 "-" "curl/7.16.1 (i386-pc-solaris2.10) libcurl/7.1$
149.77.73.14 - - [21/Jan/2009:11:37:55 -0800] "GET /index.php HTTP/1.1" 403 303 "-" "-"
149.77.73.14 - - [21/Jan/2009:11:38:43 -0800] "GET /index.php HTTP/1.1" 403 303 "-" "-"
149.77.73.14 - - [21/Jan/2009:11:39:33 -0800] "GET /index.php?a=rssfeed HTTP/1.1" 403 303 "-" "-"
149.77.73.14 - - [21/Jan/2009:11:39:45 -0800] "GET /index.php?a=rssfeed HTTP/1.1" 403 303 "-" "-"
149.77.73.14 - - [21/Jan/2009:11:39:59 -0800] "GET /index.php?a=rssfeed HTTP/1.1" 403 303 "-" "-"
66.218.49.113 - - [21/Jan/2009:11:40:02 -0800] "GET /index.php HTTP/1.1" 403 303 "-" "-"
66.218.49.113 - - [21/Jan/2009:11:40:02 -0800] "GET /index.php HTTP/1.1" 403 303 "-" "-"
149.77.41.156 - - [21/Jan/2009:11:40:17 -0800] "GET /index.php?a=rssfeed HTTP/1.1" 200 1180 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-$
149.77.73.14 - - [21/Jan/2009:11:40:56 -0800] "GET /index.php HTTP/1.0" 200 10737 "-" "Wget/1.10.2.1"
66.218.49.113 - - [21/Jan/2009:11:41:00 -0800] "GET /index.php?a=rssfeed HTTP/1.0" 200 4122 "-" "Python-urllib/1.17"
66.218.49.113 - - [21/Jan/2009:11:41:00 -0800] "GET /index.php?a=rssfeed HTTP/1.0" 200 4122 "-" "Python-urllib/1.17"
149.77.73.14 - - [21/Jan/2009:11:41:52 -0800] "GET /index.php?a=rssfeed HTTP/1.1" 403 303 "-" "-"


The only difference I can see is that the one that succeeds is HTTP/1.0 and the one that fails is HTTP/1.1. Could this really make a difference? And if so, how do I make httplib use HTTP/1.0?
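
One way to check a hunch like this is to tally the log by status, protocol and whether a user agent was sent. A small sketch over six lines copied from the access log above; the regular expression and the name summarize are illustrative, and the tally only shows correlation, not what the server actually checks:

```python
import re
from collections import Counter

# Pulls method, path, protocol, status and user agent out of a combined-format line.
LOG_LINE = re.compile(
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>HTTP/[\d.]+)" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)'
)

LINES = [
    '66.218.49.113 - - [21/Jan/2009:11:36:20 -0800] "GET /index.php?a=rssfeed HTTP/1.0" 200 4122 "-" "Python-urllib/1.17"',
    '149.77.73.14 - - [21/Jan/2009:11:37:04 -0800] "POST /index.php HTTP/1.1" 403 303 "-" "-"',
    '149.77.73.14 - - [21/Jan/2009:11:37:35 -0800] "GET /index.php?a=rssfeed HTTP/1.1" 200 4122 "-" "curl/7.16.1 (i386-pc-solaris2.10) libcurl/7.1$',
    '149.77.73.14 - - [21/Jan/2009:11:39:33 -0800] "GET /index.php?a=rssfeed HTTP/1.1" 403 303 "-" "-"',
    '149.77.73.14 - - [21/Jan/2009:11:40:56 -0800] "GET /index.php HTTP/1.0" 200 10737 "-" "Wget/1.10.2.1"',
    '149.77.73.14 - - [21/Jan/2009:11:41:52 -0800] "GET /index.php?a=rssfeed HTTP/1.1" 403 303 "-" "-"',
]


def summarize(lines):
    """Count requests by (status, protocol, whether a user agent was sent)."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match is None:
            continue
        has_agent = match.group("agent") != "-"
        counts[(match.group("status"), match.group("proto"), has_agent)] += 1
    return counts


counts = summarize(LINES)
for key, total in sorted(counts.items()):
    print(key, total)
```

In this sample the curl request went out as HTTP/1.1 and still got a 200, while every 403 line has "-" in the user-agent field. If that pattern is what the server is reacting to, adding a User-Agent entry to the headers passed to the request would be the thing to try.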
21 Jan, 2009, David Haley wrote in the 14th comment:
Votes: 0
That's a pretty interesting observation, but I think that the HTTP version isn't the culprit in this case:

$ curl -v "www.malevolentwhispers.org/index.php?a=r..."
* About to connect() to www.malevolentwhispers.org port 80 (#0)
* Trying 66.218.49.113… connected
* Connected to www.malevolentwhispers.org (66.218.49.113) port 80 (#0)
> GET /index.php?a=rssfeed HTTP/1.1
> User-Agent: curl/7.16.1 (i386-pc-solaris2.10) libcurl/7.16.1 OpenSSL/0.9.7g zlib/1.2.3
> Host: www.malevolentwhispers.org
> Accept: */*
>
< HTTP/1.1 200 OK
< Date: Wed, 21 Jan 2009 22:15:37 GMT
< Server: Apache/2.2.10 (Fedora)
< X-Powered-By: PHP/5.2.6
< P3P: CP="CAO PSA OUR"
< Set-Cookie: PHPSESSID=ticn3bl3ijd0q9j5rn6t8abl55; path=/
< Expires: Thu, 19 Nov 1981 08:52:00 GMT
< Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
< Pragma: no-cache
< Set-Cookie: qsf_lastallread=1232489737; expires=Thu, 21-Jan-2010 22:15:37 GMT; path=/; domain=www.malevolentwhispers.org; httponly
< Connection: close
< Transfer-Encoding: chunked
< Content-Type: text/xml
(file follows)
21 Jan, 2009, quixadhal wrote in the 15th comment:
Votes: 0
Looking at the curl output, you might also want to capture and pass the PHPSESSID along with your other cookies. I suspect your server may be expecting it and rejecting you on the assumption that no session means hax0r.
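
A rough sketch of the "capture the session cookie and pass it along" idea, using Python 3's http.cookies to pull the name and value out of the Set-Cookie line in the curl output and fold them into one Cookie header. The helper name cookie_header is made up, and whether the server needs PHPSESSID at all is untested here:

```python
from http.cookies import SimpleCookie


def cookie_header(set_cookie_values, extra):
    """Build a Cookie header from Set-Cookie values plus extra name/value pairs."""
    jar = SimpleCookie()
    for value in set_cookie_values:
        jar.load(value)  # keeps the name/value, parses attributes such as path
    pairs = {name: morsel.value for name, morsel in jar.items()}
    pairs.update(extra)
    return "; ".join("%s=%s" % item for item in pairs.items())


header = cookie_header(
    ["PHPSESSID=ticn3bl3ijd0q9j5rn6t8abl55; path=/"],
    {"qsf_user": "xxx", "qsf_pass": "xxx"},
)
print(header)
```

Only the name and value go back to the server; attributes such as path are instructions for the client and are dropped from the outgoing header.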
21 Jan, 2009, David Haley wrote in the 16th comment:
Votes: 0
Hmm, that's an interesting idea, and it would be pretty annoying if it's true (because this isn't meant to be an interactive script, and so there isn't really a "session" to speak of). I wonder if it is though because it succeeds even though the first query doesn't have the session id cookie. I worked with Kayle on this for a bit and am somewhat confused…