21 Jan, 2010, David Haley wrote in the 21st comment:
Votes: 0
elanthis said:
And then you get the people who are using binary formats for speed instead of using a well-designed text format. One of my professors for example keeps advocating complex serialization frameworks that can handle both XML and a custom binary format, one for development/debugging and one for release-mode. Naturally, the binary version is faster, but not noticeably so in any but the largest of files, and simply using a less retarded text format (e.g. JSON) can really close the performance gap.

I'm not sure how "large" you mean by the "largest" of files, but at work we store tables with some 25k lines and… oh… 50 columns?? in both text and Python-binary-pickle format. (The text for guaranteed compatibility, the binary for speed.) Loading the data structure from the binary pickle takes about two seconds whereas loading it from the text pickle takes a good fifteen. (The files are about 2MB at high bz2 compression.)
The file format itself is dead simple, it's literally just space-separated tabular values.

Of course I'm agreeing with your general point, which is that it is very often not worth using some weird-ass binary format, but this seems like a case with obvious gains. Especially as you start loading up 50 of these things, you really don't want to sit around for 15*50=750s=12.5 minutes. It so happens that in this case, we can take advantage of an extraordinarily well-known Python data structure pickling method, so we don't even have to roll our own.

Quote
I seriously would use a function to set/check/clear bits.

100% agreed. It's almost always better to express things via an API than direct operations.
21 Jan, 2010, elanthis wrote in the 22nd comment:
Votes: 0
I'm willing to bet the parser for those text files could be improved dramatically. The biggest advantage that most binary protocols have is that they usually give sizes up front, so there's no need for linear (or even quadratic) allocation/copy patterns, while many text formats have to scan until an end token is found, and do a good deal of extra allocation and copying because of it. You can of course devise a text protocol that has no such limits. HTTP, for example, has the option of using either chunked encoding (no known up-front size) or an explicit size via Content-Length. PHP's serialization format also uses explicit lengths for things like strings, although not for arrays IIRC. Even without them, a better algorithm can drastically help parsing times, e.g. using a linear instead of a quadratic allocation pattern when reading in large arrays/strings, combined with sensible defaults for initial sizes, growth chunks, etc.

I honestly find 2 seconds to read in a data file that size to be pretty slow, so even the binary pickle decoder might need some work.
21 Jan, 2010, David Haley wrote in the 23rd comment:
Votes: 0
elanthis said:
I'm willing to bet the parser for those text files could be improved dramatically.

That's quite possible. I haven't really looked at it, and I don't think it was written to be super-duper-efficient. (Efficiency in this particular area is usually a concern only insofar as things being slow is annoying because you have to wait around. This is different from, say, high-frequency operations where wasting a few milliseconds can cost you percentage points of real money on your transaction.)
Indeed, the size is not known up-front although newer versions of the format do give the number of lines up-front for better allocation – having that size hint helps quite a bit.

elanthis said:
I honestly find 2 seconds to read in a data file that size to be pretty slow, so even the binary pickle decoder might need some work.

Well, there are at least two other things going on here. One is the Python pickling library, which is outside of our control (unless we felt like reimplementing it, which we don't, and there's little justification for it anyhow since the authors are presumably not dumb). Then there is the fact that this is a highly compressed file. I haven't benchmarked it but I suspect that more time is spent decompressing the file from disk (over a networked file system blablabla) than "parsing" the resulting binary pickle data.
21 Jan, 2010, KaVir wrote in the 24th comment:
Votes: 0
elanthis said:
Keeping in mind that bool is the size of an int, then not only would that potential change break some corner-cases of the interface, but your bit set memory usage would increase by 32x on most platforms.

No, the size of a bool is implementation-dependent (and I've yet to see an implementation that didn't give it a size of 1 byte).

elanthis said:
Bzzt. Try again. Aside from the fact that a good compiler (you know, those written many, many years after K&R's latest revision of the C book) will optimize bit field access as well as you could optimize it by hand,

Actually Tonitrus is correct about integers being faster than bit fields. This is because a byte is the lowest level at which data can be accessed - meaning that every time you read or manipulate a value in a bit field the compiler has to use bitwise operators on it. If you don't take this fact into account when using bit fields, you could even end up with code that isn't thread-safe.

Bit fields really are nasty. As I said before, if you're using C I'd suggest an array of unsigned char - no padding between the bits, and you can even choose the bit order for efficiency if you enjoy that sort of thing (a waste of time for a mud IMO, but the point is it's not going to be slower or use more memory than bit fields). Use macros for readability, or even functions if you prefer.
21 Jan, 2010, David Haley wrote in the 25th comment:
Votes: 0
KaVir said:
Actually Tonitrus is correct about integers being faster than bit fields. This is because a byte is the lowest level at which data can be accessed - meaning that every time you read or manipulate a value in a bit field the compiler has to use bitwise operators on it. If you don't take this fact into account when using bit fields, you could even end up with code that isn't thread-safe.

You seem to be assuming that bit sets must be implemented as bytes. I'm not sure why you don't think you can implement bit sets using the data-size-aligned number type and use bit masks of that size as well.
It is true that you will need the bitwise operation, but that is so amazingly fast and your statement (well, especially Tonitrus's) was strong enough that it seems like you're saying something else. In fact, the original claim was made due to alignment problems. So my response is made in that context.

I'm not sure what thread-safety has to do with this; a simple 'toggle' command can just as well not be thread-safe with integers. E.g.,
i = !i
This isn't thread-safe unless you lock access to i one way or another. In other words: both approaches suffer from thread-safety issues.
21 Jan, 2010, Skol wrote in the 26th comment:
Votes: 0
This makes me wonder if there should be a similar change to the RaM on the act bits.
Quick question (realizing I'm a hack ;p), what would be the disadvantage of doing something like:
ch->act as an array. IE. say ch->act[ACT_MOBILE] where #define ACT_MOBILE 1. Saving the entire thing as a boolean in a string format like: Act 0001001000101011100000111000001010000000~
That sort of thing, then ch->act[ACT_AGGRESSIVE] would, say, be the 34th spot in the array.
That make sense? I could then have a nearly unlimited set.

Or store them more as Act 234~ Act 3~ etc, so it only turns those that are used to 'on' and it's dynamically expandable.

Anyway, thinking out loud.
21 Jan, 2010, KaVir wrote in the 27th comment:
Votes: 0
David Haley said:
You seem to be assuming that bit sets must be implemented as bytes.

No, but I prefer using unsigned char in this situation for portability reasons, as it's the only type guaranteed to have no padding bits.

David Haley said:
It is true that you will need the bitwise operation, but that is so amazingly fast and your statement (well, especially Tonitrus's) was strong enough that it seems like you're saying something else.

Yes, bitwise operations are extremely fast - but they're still extra instructions. Elanthis is incorrect to suggest that his bit field approach can be optimised to be as fast as Tonitrus's integer approach, and I felt it was particularly inappropriate for him to make sarcastic remarks about other people's knowledge of optimisation while making such statements himself.

David Haley said:
I'm not sure what thread-safety has to do with this

It was just a comment in reference to the suggestion that bit fields can be as fast as integers (the reason why that statement isn't true is the same reason bit fields can result in code that isn't thread-safe - the machine cannot manipulate single bits). Basically another example of how bit fields can trip you up.
21 Jan, 2010, David Haley wrote in the 28th comment:
Votes: 0
KaVir said:
Elanthis is incorrect to suggest that his bit field approach can be optimised to be as fast as Tonitrus's integer approach

I don't want to put words in Elanthis's mouth, but he was probably reacting to the claim about alignment that Tonitrus made, which is indeed incorrect (or misleading, depending on how you look at things). I agree with you that the extra bitwise operation is indeed an extra instruction. (That's an almost tautological statement I made there :wink:)

KaVir said:
No, but I prefer using unsigned char in this situation for portability reasons, as it's the only type guaranteed to have no padding bits.

I don't really see what padding bits have to do with things when you have a chunk of memory, and you're applying bitwise operations on that chunk. An array of uint32 elements won't have "holes" in it or padding in between elements, and if your masks are of size uint32 as well, things just line up. (Alignment might change depending on the platform, of course.)

KaVir said:
(the reason why that statement isn't true is the same reason bit fields can result in code that isn't thread-safe - the machine cannot manipulate single bits)

But even operations like "i = !i" aren't atomic, so this isn't thread-safe either. I'm not at all saying that you're wrong about bit fields being potentially thread-unsafe; I'm saying that I don't see the difference w.r.t. single-value representation. In all cases you need to worry about thread safety.
22 Jan, 2010, elanthis wrote in the 29th comment:
Votes: 0
Yes, bool is indeed only 1 byte. Not sure where I got the idea it was sizeof(int), but it was definitely a wrong idea. :)

And yes, accessing a raw int is faster than any bitwise op. I was speaking in terms of using an int as a custom bitfield, which obviously isn't what Tonitrus was talking about, which I obviously knew from the second half of my post… I need to lay off the liquor or something.
22 Jan, 2010, quixadhal wrote in the 30th comment:
Votes: 0
Skol said:
This makes me wonder if there should be a similar change to the RaM on the act bits.
Quick question (realizing I'm a hack ;p), what would be the disadvantage of doing something like:
ch->act as an array. IE. say ch->act[ACT_MOBILE] where #define ACT_MOBILE 1. Saving the entire thing as a boolean in a string format like: Act 0001001000101011100000111000001010000000~
That sort of thing, then ch->act[ACT_AGGRESSIVE] would, say, be the 34th spot in the array.
That make sense? I could then have a nearly unlimited set.

Or store them more as Act 234~ Act 3~ etc, so it only turns those that are used to 'on' and it's dynamically expandable.

Anyway, thinking out loud.


You could do it, but that's no better than storing the entire Act flag set as integers. IE: You can't look at the file and know what anything means unless you have the header file handy.

It's not hard to just store the bits that are set as words when writing and just set those bits when reading it back in.

I'd much rather see "Act AGGRESSIVE SENTINEL HATE_EVIL~" than either "Act 16777473" or "Act 00000000100000000000000010000001~"
22 Jan, 2010, KaVir wrote in the 31st comment:
Votes: 0
David Haley said:
I don't want to put words in Elanthis's mouth, but he was probably reacting to the claim about alignment that Tonitrus made

He didn't mention alignment until the paragraph after the one I quoted, in which he stated "Also, what David said. ints aren't even guaranteed to be aligned on all platforms", so I took that to mean that the first paragraph was a reference to the speed of accessing an integer compared to accessing a bit field.

David Haley said:
I don't really see what padding bits have to do with things when you have a chunk of memory, and you're applying bitwise operations on that chunk. An array of uint32 elements won't have "holes" in it or padding in between elements, and if your masks are of size uint32 as well, things just line up. (Alignment might change depending on the platform, of course.)

I'm not sure of the exact rules for permitted padding bit positions, but I do recall reading about a 32 bit integer with a padding bit in the middle. It's also possible for some combinations of padding bits to generate trap representations. While such scenarios are hardly common, given the choice I would rather use unsigned char, which is guaranteed to have no padding.

David Haley said:
I'm not at all saying that you're wrong about bit fields being potentially thread-unsafe; I'm saying that I don't see the difference w.r.t. single-value representation.

Well you'd use a mutex to protect the variable before changing it, agreed?

The problem is that if you do that on a bit field variable, it won't protect the other variables in the bit field struct. And because the machine cannot manipulate individual bits, if the other thread decides to modify another variable stored in the same word, that mutex isn't going to work.

You can get around this problem of course, but if you don't understand how bit fields are stored and manipulated (and therefore why they're not as fast as integers) it's not something you're likely to consider. That was basically the reason I brought it up.
22 Jan, 2010, Skol wrote in the 32nd comment:
Votes: 0
quixadhal said:
You could do it, but that's no better than storing the entire Act flag set as integers. IE: You can't look at the file and know what anything means unless you have the header file handy.

It's not hard to just store the bits that are set as words when writing and just set those bits when reading it back in.

I'd much rather see "Act AGGRESSIVE SENTINEL HATE_EVIL~" than either "Act 16777473" or "Act 00000000100000000000000010000001~"

Yeah, I was thinking along the lines of player file read/write:
Act AUTO_ASSIST~
Act PLAYER_KILLER~

In your definitions: #define PLAYER_KILLER 4 (etc)

Then it reads those in and stores them simply as ch->act[PLAYER_KILLER], 1 if defined else 0. But it only reads/writes those it needs to, and, you wouldn't be limited like it is with bits as you could define on down the array until you're blue in the face without issues.
22 Jan, 2010, David Haley wrote in the 33rd comment:
Votes: 0
KaVir said:
I'm not sure of the exact rules for permitted padding bit positions, but I do recall reading about a 32 bit integer with a padding bit in the middle.

A 32-bit number with padding in the middle is… 33 bits, ne? I have to say, that sounds kind of crazy. :thinking: Maybe I'm misunderstanding you?

KaVir said:
Well you'd use a mutex to protect the variable before changing it, agreed?

The problem is that if you do that on a bit field variable, it won't protect the other variables in the bit field struct. And because the machine cannot manipulate individual bits, if the other thread decides to modify another variable stored in the same word, that mutex isn't going to work.

You can get around this problem of course, but if you don't understand how bit fields are stored and manipulated (and therefore why they're not as fast as integers) it's not something you're likely to consider. That was basically the reason I brought it up.

Yes, you'd use a mutex, but "you" wouldn't be using the mutexes on anything at all yourself, because you'd be talking to a generic bit field data structure that would take care of mutexes etc. for you in any case (whether you implement it as integers, unsigned chars, whatever).
But, I disagree with the statement in the second paragraph. If the mutex covers even just individual words, another thread wouldn't be able to access a variable in the same word if the mutex is already locked.
22 Jan, 2010, KaVir wrote in the 34th comment:
Votes: 0
David Haley said:
A 32-bit number with padding in the middle is… 33 bits, ne? I have to say, that sounds kind of crazy. Maybe I'm misunderstanding you?

I'm referring to this part of the C99 rationale:
6.2.6.2        Integer types
Padding bits are user-accessible in an unsigned integer type. For example, suppose a machine
uses a pair of 16-bit shorts (each with its own sign bit) to make up a 32-bit int and the sign
bit of the lower short is ignored when used in this 32-bit int. Then, as a 32-bit signed
int, there is a padding bit (in the middle of the 32 bits) that is ignored in determining the value
of the 32-bit signed int. But, if this 32-bit item is treated as a 32-bit unsigned int, then
that padding bit is visible to the user's program. The C committee was told that there is a
machine that works this way, and that is one reason that padding bits were added to C99.
Footnotes 44 and 45 mention that parity bits might be padding bits. The committee does not
know of any machines with user-accessible parity bits within an integer. Therefore, the
committee is not aware of any machines that treat parity bits as padding bits.


David Haley said:
But, I disagree with the statement in the second paragraph. If the mutex covers even just individual words, another thread wouldn't be able to access a variable in the same word if the mutex is already locked.

It's a known problem (see the section of this FAQ entitled "So how do I deal with the adjacent field..."). This article also explains the issue in ....
22 Jan, 2010, David Haley wrote in the 35th comment:
Votes: 0
6.2.6.2        Integer types
Padding bits are user-accessible in an unsigned integer type. For example, suppose a machine
uses a pair of 16-bit shorts (each with its own sign bit) to make up a 32-bit int and the sign
bit of the lower short is ignored when used in this 32-bit int. Then, as a 32-bit signed
int, there is a padding bit (in the middle of the 32 bits) that is ignored in determining the value
of the 32-bit signed int. But, if this 32-bit item is treated as a 32-bit unsigned int, then
that padding bit is visible to the user's program. The C committee was told that there is a
machine that works this way, and that is one reason that padding bits were added to C99.
Footnotes 44 and 45 mention that parity bits might be padding bits. The committee does not
know of any machines with user-accessible parity bits within an integer. Therefore, the
committee is not aware of any machines that treat parity bits as padding bits.

That sounds like insanity to me. You explicitly ask for 32 bits (e.g., a uint32) and get 31?? Utter nonsense to deal with that. :sad:

KaVir said:
It's a known problem (see the section of this FAQ entitled "So how do I deal with the adjacent field..."). This article also explains the issue in ....

I think there might have been some vocabulary overload here, perhaps due to an oversight on my part. I was thinking just general bitset (field of bits), not the specific construct where you specify in a struct the size in bits of the struct members. That explains why I wasn't understanding you, though :smile:

It's worth noting that even using bit-size-specified-struct-members, this problem only comes up when you are mixing your flag set into your thing-with-flag's data; you can still get around all of this by having your generic bit set class (which could be implemented with these, whatever) handle synchronization. Due to padding, the flag-implemented-as-bit-specified-fields will not overlap with others, so this should be fine. In fact, the "compliant solution" in your second link does exactly what I have been proposing.
22 Jan, 2010, quixadhal wrote in the 36th comment:
Votes: 0
Skol said:
Then it reads those in and stores them simply as ch->act[PLAYER_KILLER], 1 if defined else 0. But it only reads/writes those it needs to, and, you wouldn't be limited like it is with bits as you could define on down the array until you're blue in the face without issues.


Just realize that you'll be wasting a LOT of space that way. Far more than was discussed earlier, because C doesn't have sparse arrays. So, if you have bit 0 set, and bit 672 set, you have to have an array of 673 characters to hold those two results. As long as you're good with that, go for it.
22 Jan, 2010, KaVir wrote in the 37th comment:
Votes: 0
David Haley said:
That sounds like insanity to me.

I don't disagree. I certainly don't think it'll be a common issue, but using unsigned char ensures that there will never be any padding bits, so I see no reason not to use it.

David Haley said:
I think there might have been some vocabulary overload here, perhaps due to an oversight on my part. I was thinking just general bitset (field of bits), not the specific construct where you specify in a struct the size in bits of the struct members.

Ah, well I've been talking about bit fields (which is the name for that specific construct).

I've certainly no objection to bitsets in general, and do indeed recommend using them, although looking back over my previous posts I can see that perhaps I didn't make that very clear. When I talked about using an array of unsigned char, I didn't mean using each byte as a single bool, but rather using bitwise operators to manipulate its individual bits (thus my commentary about padding bits, bit order and macros), so that each array element contains 8 boolean flags. I believe some muds already use this approach, referring to it as an "extended bitvector"?
22 Jan, 2010, David Haley wrote in the 38th comment:
Votes: 0
KaVir said:
When I talked about using an array of unsigned char, I didn't mean using each byte as a single bool, but rather using bitwise operators to manipulate its individual bits (thus my commentry about padding bits, bit order and macros), so that each array element contains 8 boolean flags. I believe some muds already use this approach, referring to it as an "extended bitvector"?

I think that is correct, yes. But I don't remember if they use unsigned chars or unsigned ints.

I guess the problem here is that you might run into alignment issues again (e.g., uchar_array[2]), although better to have alignment issues than the extremely weird situation described in that snippet you posted. (Then again, the snippet also admits that the authors think it's exceedingly strange as well, and don't even know of machines that behave that way.) Alignment is the main reason to not use unsigned char arrays, no?
22 Jan, 2010, KaVir wrote in the 39th comment:
Votes: 0
David Haley said:
Alignment is the main reason to not use unsigned char arrays, no?

No, bytes can be aligned at any address (which is why they're used for padding for larger types).
22 Jan, 2010, David Haley wrote in the 40th comment:
Votes: 0
Sorry, I didn't mean that in terms of where you can put bytes. My understanding was that there were performance implications in that accessing an address at a multiple of the word size is very fast, but if you are off that alignment, some processors need to do funny things to get the relevant byte out. Perhaps it's just another bitwise operation on some processors. Anyhow, normally this doesn't matter, but since we were talking about speed on a per-instruction level…