05 Sep, 2013, THUFIR wrote in the 1st comment:
Votes: 0
For regex triggers on MUD game responses, what sort of encoding do most MUDs send out? ANSI? And then, to turn it into "plain text", is that UTF-8?
05 Sep, 2013, Scandum wrote in the 2nd comment:
Votes: 0
Most MUDs use ASCII encoding. Those that need more complex encoding for special characters (Russian, Scandinavian) generally use UTF-8.
05 Sep, 2013, plamzi wrote in the 3rd comment:
Votes: 0
ANSI refers to terminal control codes (escape sequences), not a character encoding. You can learn more about them here: http://en.wikipedia.org/wiki/ANSI_escape...

ASCII and UTF-8 are character encodings.

Like Scandum said, most MUDs use ASCII, which is much older and more limited. But UTF-8 was designed to be compatible with ASCII and extend it (by a lot), so in most cases you will be fine decoding ASCII using the UTF-8 decoding utilities provided by your framework.
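As a minimal sketch of the compatibility point above: any byte sequence that is valid ASCII is also valid UTF-8, so a single UTF-8 decoder can serve both kinds of server. The class and method names here are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

class AsciiAsUtf8Demo {
    // Decode raw bytes received from the server as UTF-8.
    static String decode(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Pure ASCII bytes (all values below 128), as most MUDs send.
        byte[] ascii = "You see a goblin.".getBytes(StandardCharsets.US_ASCII);
        System.out.println(decode(ascii));

        // The same decoder also handles multi-byte UTF-8 sequences.
        byte[] utf8 = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        System.out.println(decode(utf8));
    }
}
```

The reverse does not hold: decoding real UTF-8 output with an ASCII decoder would replace every non-ASCII character.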

Now, if your client supports UTF-8, you have to announce that to servers; otherwise they may not even try to send any.

Scandum is the best source to help you with negotiating UTF-8 via MTTS as outlined here: http://tintin.sourceforge.net/mtts/ . He helped me with the same thing just last week.

There is also the more generic negotiation via CHARSET that is part of the TELNET protocol ( http://tools.ietf.org/html/rfc2066 ):

Client: IAC (255) WILL (251) CHARSET (42)

Server: IAC SB (250) CHARSET REQUEST (1) UTF-8 IAC SE

Client: IAC SB CHARSET ACCEPTED (2) UTF-8 IAC SE (240)


The numbers in brackets are the decimal byte values that you need to send. Most of the "characters" used in negotiation are non-printable.
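The client-side messages in that exchange can be sketched as raw byte arrays. This is only an illustration using the byte values listed above; the class and method names are hypothetical, and writing the bytes to a socket is left out. Note also that in RFC 2066 the REQUEST message carries a separator character before each charset name, which the summary above omits, so a parser for the server's request would need to account for it.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

class CharsetNegotiation {
    // Telnet command bytes.
    static final int IAC = 255, WILL = 251, SB = 250, SE = 240;
    // CHARSET option code and its subnegotiation codes (RFC 2066).
    static final int CHARSET = 42, REQUEST = 1, ACCEPTED = 2;

    // Builds: IAC WILL CHARSET
    static byte[] willCharset() {
        return new byte[] {(byte) IAC, (byte) WILL, (byte) CHARSET};
    }

    // Builds: IAC SB CHARSET ACCEPTED <name> IAC SE
    static byte[] accepted(String charsetName) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(IAC);
        out.write(SB);
        out.write(CHARSET);
        out.write(ACCEPTED);
        byte[] name = charsetName.getBytes(StandardCharsets.US_ASCII);
        out.write(name, 0, name.length);
        out.write(IAC);
        out.write(SE);
        return out.toByteArray();
    }
}
```

Java bytes are signed, so 255 is stored as -1 and 240 as -16; the bits that go over the wire are the same.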
06 Sep, 2013, THUFIR wrote in the 4th comment:
Votes: 0
Well, there are a number of constants I can send for handshake:

http://commons.apache.org/proper/commons...

However, the default settings seem fine. I want to keep the colors for output, but strip the ANSI codes (or convert the ANSI text to "dumb" text) within the client. I just don't know what that process is called.
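This is commonly described as stripping ANSI escape sequences. A minimal sketch, assuming only the usual CSI sequences (ESC, then `[`, then parameters and a final byte, which covers color codes) need to be removed; the class name is made up for illustration:

```java
import java.util.regex.Pattern;

class AnsiStripper {
    // CSI sequence: ESC '[', parameter bytes, optional intermediate bytes,
    // then one final byte in the range '@'..'~' (for colors, the final byte is 'm').
    private static final Pattern CSI =
            Pattern.compile("\u001B\\[[0-9;?]*[ -/]*[@-~]");

    // Returns a copy of the text with CSI sequences removed.
    static String strip(String text) {
        return CSI.matcher(text).replaceAll("");
    }

    public static void main(String[] args) {
        String raw = "\u001B[1;31mA goblin\u001B[0m attacks!";
        System.out.println(strip(raw));
    }
}
```

One way to keep the colors would be to display the raw line unchanged and run the regex triggers against the stripped copy.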
14 Sep, 2013, THUFIR wrote in the 5th comment:
Votes: 0
plamzi said:
There is also the more generic negotiation via CHARSET that is part of the TELNET protocol ( http://tools.ietf.org/html/rfc2066 ):

Client: IAC (255) WILL (251) CHARSET (42)

Server: IAC SB (250) CHARSET REQUEST (1) UTF-8 IAC SE




// Decode server output as ASCII.
Charset charset = java.nio.charset.StandardCharsets.US_ASCII;
telnetClient.setCharset(charset);
// Arguments: option code, initLocal, initRemote, acceptLocal, acceptRemote.
TelnetOptionHandler simpleOptionHandler = new SimpleOptionHandler(255, true, true, true, true);
telnetClient.connect(host, port);
telnetClient.addOptionHandler(simpleOptionHandler);
int[] responseLocal = simpleOptionHandler.startSubnegotiationLocal();
int[] responseRemote = simpleOptionHandler.startSubnegotiationRemote();

byte command = (byte) 255;
telnetClient.sendCommand(command);
// simpleOptionHandler.answerSubnegotiation();
// Arrays.toString logs the array contents rather than the array reference.
log.log(Level.INFO, "{0}\t{1}", new Object[]{java.util.Arrays.toString(responseLocal), java.util.Arrays.toString(responseRemote)});



I'm not understanding the API documentation on how to send these commands to the remote server:


http://commons.apache.org/proper/commons...


For example, what int val do I pass to the SimpleOptionHandler constructor?
14 Sep, 2013, Idealiad wrote in the 6th comment:
Votes: 0
The int is the option code; for CHARSET, for example, it is 42. Look at the link plamzi gave you and you'll see the int values for the various codes.
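To make the distinction concrete, here is a small sketch that separates telnet command bytes from option codes. The class and enum are hypothetical helpers, not part of Commons Net; the numeric values are the standard telnet assignments.

```java
class TelnetCodes {
    // Telnet commands: these frame a negotiation, they are not options.
    static final int IAC = 255, DO = 253, WILL = 251, SB = 250, SE = 240;

    // Telnet option codes: the kind of value an option handler is keyed on.
    enum Option {
        ECHO(1), SUPPRESS_GO_AHEAD(3), TERMINAL_TYPE(24), NAWS(31), CHARSET(42);

        final int code;

        Option(int code) {
            this.code = code;
        }

        // Returns the matching option, or null if the value is not a listed option.
        static Option fromCode(int code) {
            for (Option o : values()) {
                if (o.code == code) {
                    return o;
                }
            }
            return null;
        }
    }
}
```

With this in place, the constructor call from the earlier post would presumably become `new SimpleOptionHandler(TelnetCodes.Option.CHARSET.code, true, true, true, true)`, since 255 is the IAC command byte rather than an option. That call has not been tested against the library here.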