14 Jan, 2009, quixadhal wrote in the 1st comment:
Votes: 0
Well, in doing some prep-work to dig into C++ a bit more, I needed to start brushing up on things. I decided to poke around my trust old friend std::map.

Quote
#include <iostream>
#include <string>
#include <map>

using namespace std;

int main(int argc, char**argv) {
map <string, int> testme;

testme["foo"] = 1;
testme["bar"] = 2;

cout << "size == " << testme.size() << endl;
cout << "testme[foo] == " << testme["foo"] << endl;
cout << "testme[ack] == " << testme["ack"] << endl;
cout << "size == " << testme.size() << endl;

map <string, int> ::iterator i = testme.begin();
i = testme.find("pfft");

if (i != testme.end())
do {
cout << "testme[pfft] == " << i->second << endl;
} while (++i != testme.end());
else
cout << "testme[pfft] is not defined." << endl;

cout << "size == " << testme.size() << endl;
}


Much to my surprise!

Quote
quixadhal@andropov:~$ ./test
size == 2
testme[foo] == 1
testme[ack] == 0
size == 3
testme[pfft] is not defined.
size == 3


I wasn't aware that trying to reference an element of a std::map container that hadn't been defined yet, resulted NOT in the exception I expected, but rather created the entry on the fly, and gave it whatever default empty value the data type in question has.

While that can be good for some things, it's kindof bad for others. Consider, the prime application I see for std::map is for the room/mob/obj lists. Being able to refer to room[3001] directly, instead of pawing through a list or tree, is enormously helpful.

However, as the above shows, it's not as handy as I had hoped! If a wizard types "goto 3002", I can't just look to see if room[3002] exists by seeing if it's defined… I have to declare an iterator and do room.find(3002), otherwise it WILL exist (and be an empty room). Hrmph.
14 Jan, 2009, ghasatta wrote in the 2nd comment:
Votes: 0
http://www.sgi.com/tech/stl/Map.html

"Note that the definition of operator[] is extremely simple: m[k] is equivalent to (*((m.insert(value_type(k, data_type()))).first)).second. Strictly speaking, this member function is unnecessary: it exists only for convenience."

I maintain a map of ids to object pointers, and then have a get_object_by_id() function that returns the object pointer, or NULL if not found. This way any code can simply call the get_object_by_id() function and test for a null value, rather than having to care about the underlying container. Just a thought.
14 Jan, 2009, David Haley wrote in the 3rd comment:
Votes: 0
The operator[] has to create the element if it doesn't exist. ghasatta gave the technical reason from implementation detail, but it must be the case also from principle. Consider this:

std::map m;
m[k] = v;


How could this possibly work if operator[] didn't create the element? This works by returning a reference to the map element – you couldn't do it otherwise.

If you want to throw exceptions etc., you would have to write a helper function to do that. It's pretty easy to do, fortunately.
16 Jan, 2009, Guest wrote in the 4th comment:
Votes: 0
Yep. I ran into much the same when I used some of those in AFKMud. Ended up with results I wasn't counting on because I was assuming it wouldn't exist. But unfortunately it existed once checked to see if it existed. The stuff of great temporal paradoxes!

Somewhere around here I think I even asked about this very situation and Darien explained it to me then, even though I still think it's pretty stupid for it to work that way. He and others ( sorry, I forget who now ) helped me come up with search loops to look and see if an element exists or not.
16 Jan, 2009, quixadhal wrote in the 5th comment:
Votes: 0
DavidHaley said:
The operator[] has to create the element if it doesn't exist. ghasatta gave the technical reason from implementation detail, but it must be the case also from principle. Consider this:

std::map m;
m[k] = v;


How could this possibly work if operator[] didn't create the element? This works by returning a reference to the map element – you couldn't do it otherwise.


While I certainly agree that it has to work that way because of implementation details, and that's the way it DOES work, there's absolutly NO reason it HAS to be that way.

The compiler is perfectly well aware of the difference between an assignment and an expression evaluation. In short, if the

While I certainly agree that it has to work that way because of implementation details, and that's the way it DOES work, there's absolutly NO reason it HAS to be that way.

The compiler is perfectly well aware of the difference between an assignment and an expression evaluation. In short, if the [ operator is used in the left-hand-side of an expression, it can create the element and return the value being assigned. If it's on the right-hand-side of an expression, it could throw an exception if said entry doesn't already exist. The compiler even has enough data to be able to delay that decision to see if the evaluation of the expression would BECOME a left-hand-side further upstream.

Of course, one presumes a committee sat around a table eating donuts and swilling coffee for many hours and decided that the way it is actually done would cause less damage than other ways it could have been done, so that's what we're stuck with unless we're willing to override the default operators or subclass.
16 Jan, 2009, David Haley wrote in the 6th comment:
Votes: 0
I disagree with you – operator[] doesn't have the semantics you describe. It returns a type, just like anything else. It is that type that is then checked for assignment etc. I think you're attributing far too much to this process: operator[] is a method just like any other. std::map objects are not subscripted like arrays. So it doesn't matter what the compiler could know or not. It's not up to the compiler.

How would you propose to fix this? Recall that operator[] has to be treated like a method here, just like any other, and that when the compiler sees m[1], it has to invoke operator[]. Would you start introducing two versions of operator[]? What about all the other operators?

Again I think you're attributing too much to all of this. std::map is not a language feature, it's a library feature. The library has to work within the constraints of the language, and in this case, operator[] simply works this way and as I said earlier, given the various constraints on its usage, it has to work this way.
16 Jan, 2009, quixadhal wrote in the 7th comment:
Votes: 0
If I declare a std::map m, and attempt to access value 1 in it, is there some mystical reason why the [] operator on the map container m can't first check to see if it contains an item whose index is 1 (which is already must do), and instead of calling m.insert, throw an exception?

If [] really is just a function which happens to be mapped to an operator in the std::map class, I see no reason why it MUST evaluate to m.insert() directly.

And as for std::map objects not being subscripted like arrays… they most certainly are! That's kindof the point. It just isn't a simple memory location offset, it's a lookup table for key/value pairs.

Unless they've changed C++ a LOT in the last few years, one of the big selling points was that operators were simply functions that could be redefined or overloaded just like any other function. The only difference is that you have only a limited (fixed) set of names, and their calling syntax is different than a normal function.
16 Jan, 2009, Igabod wrote in the 8th comment:
Votes: 0
I just wanna say that the subject line is very misleading… std's and lsd… sounds like a late 70's early 80's party to me.
16 Jan, 2009, David Haley wrote in the 9th comment:
Votes: 0
quixadhal said:
If I declare a std::map m, and attempt to access value 1 in it, is there some mystical reason why the [] operator on the map container m can't first check to see if it contains an item whose index is 1 (which is already must do), and instead of calling m.insert, throw an exception?

Yes – the reason I gave earlier :wink:

DavidHaley said:
Consider this:

std::map m;
m[k] = v;


How could this possibly work if operator[] didn't create the element? This works by returning a reference to the map element – you couldn't do it otherwise.


Expanding upon this somewhat, the only way to assign to something inside the map is if operator[] returns a reference to that slot somehow. The only way to refer to a slot is to have that slot exist in the first place – otherwise you have no slot to assign into.

quixadhal said:
And as for std::map objects not being subscripted like arrays… they most certainly are! That's kindof the point. It just isn't a simple memory location offset, it's a lookup table for key/value pairs.

I'm not sure I follow why you say that they certainly are … you described completely different things: memory location offset, vs. a function that calls something that happens to use [] instead of ().

If you didn't override operator[] on std::map, you couldn't call m[1] at all. So the subscription is a totally different concept – it has nothing to do with pointer arithmetic. It just happens to use the same syntax – which is arguably confusing because it leads to issues like this one.

Unlike other languages, where subscripts have a special language specification meaning for objects, in C++ std::map's operator[] is a function just like any other that happens to use a different syntax.
16 Jan, 2009, quixadhal wrote in the 10th comment:
Votes: 0
DavidHaley said:
Expanding upon this somewhat, the only way to assign to something inside the map is if operator[] returns a reference to that slot somehow. The only way to refer to a slot is to have that slot exist in the first place – otherwise you have no slot to assign into.


So, you can't conceive of an expression like "NULL = 37;"? Clearly, that would be an error (and generate an exception), yet there's nothing in the language which should disallow it. Yes, most compilers will be smart enough to realize that won't work and stop you, but the language spec doesn't prevent it.

DavidHaley said:
quixadhal said:
And as for std::map objects not being subscripted like arrays… they most certainly are! That's kindof the point. It just isn't a simple memory location offset, it's a lookup table for key/value pairs.

I'm not sure I follow why you say that they certainly are … you described completely different things: memory location offset, vs. a function that calls something that happens to use [] instead of ().

If you didn't override operator[] on std::map, you couldn't call m[1] at all. So the subscription is a totally different concept – it has nothing to do with pointer arithmetic. It just happens to use the same syntax – which is arguably confusing because it leads to issues like this one.


I didn't say it produced offsets, in fact I specifically said it wasn't a simple offset. It is, however, subscripted. m[1] does work, and without me having to overload anything at all, I'm sure the internals are taking m's structure, looking up the "1" in a tree or heap, and returning the object reference for me. The point is, it is designed to work syntactically like an array. As such, I would expect an array to produce an error if I step out of bounds, and trying to access a non-existent element is pretty much out of bounds.

I don't buy the argument that the language requires it to work this way. If exceptions weren't available, I'd concede the point, but there's absolutely no reason the code generated for the lookup of m[1]'s object's address can't throw an exception instead of spontaneously generating a new object.

The fact that you and I have BOTH stated that [] is simply a function, only makes my point even more solid. 1 is the argument being passed to the function [] in object m. Why can't that throw an exception if m[1] doesn't have an object associated yet?

The argument here is a moot thing, since neither of us has the ability to change the way the standards committee for C++'s STL library works – and I know it doesn't work the way I think it should – I just don't believe there's any requirement for it to work the way it does, other than the standard itself.
16 Jan, 2009, David Haley wrote in the 11th comment:
Votes: 0
quixadhal said:
So, you can't conceive of an expression like "NULL = 37;"? Clearly, that would be an error (and generate an exception), yet there's nothing in the language which should disallow it.

Nope, actually, I can't conceive of an expression like that. Why? Because "NULL" is a constant. It would be like saying "1 = 2" – that's just meaningless. (As opposed to "1 == 2" of course…)

quixadhal said:
Yes, most compilers will be smart enough to realize that won't work and stop you, but the language spec doesn't prevent it.

Are you completely sure that the language spec allows arbitrary expressions on the left-hand side? If it does, it shouldn't, and that is a conceptual error. (But I don't think it does…)

quixadhal said:
I didn't say it produced offsets, in fact I specifically said it wasn't a simple offset. It is, however, subscripted. m[1] does work, and without me having to overload anything at all

Obviously "you" are not overloading anything, but the std::map class sure as heck is. I'm not sure what you're saying here: subscripting objects is not something you can normally do! You need to overload operator[] before you can take the subscript of an object…

quixadhal said:
I don't buy the argument that the language requires it to work this way. If exceptions weren't available, I'd concede the point, but there's absolutely no reason the code generated for the lookup of m[1]'s object's address can't throw an exception instead of spontaneously generating a new object.

Well, you could show us how to do it if you're so sure it's possible :wink:

You're forgetting that the language is not the one in charge of the code generated for the lookup of m[1]. I think that's the crux of this, really.

quixadhal said:
The fact that you and I have BOTH stated that [] is simply a function, only makes my point even more solid. 1 is the argument being passed to the function [] in object m. Why can't that throw an exception if m[1] doesn't have an object associated yet?

Because operator[] has to return a reference to allow assignment, as has been said a few times now.

What I want you to do is quite simple: tell me how to allow operator[] to both return a reference when necessary, and not return a reference when unnecessary.

Keep in mind that for an object, m[k] is the same thing as m.operator[](k).

Keep in mind that the same thing happens whether the subscript is on the left or right side of an expression. (Keep in mind that changing that would require completely changing how operator overloading works in the language specification.)


I postulate that as the language specification is now, it is not possible to have operator[] do different things depending on context. Therefore, operator[] simply must work the way it does, in order to allow what it allows. To do what you want it to do has nothing to do with the STL at all! – it is prevented by the language itself, not the library code.

Of course, this could be settled simply by explaining how to make it work the way you want it to, instead of saying it should be possible.
16 Jan, 2009, Mister wrote in the 12th comment:
Votes: 0
operator[] is set to return a reference to the item it points to. Making it throw an exception in case of item not found, would mean there needs to be another method to add items to a map.
Also, it can't be overloaded to do something else based on context, since C++ has no concept of context in this regard. Maybe in the C++0x new features is something that can do this (operator&& comes to mind.)

The only way for [] to work as quixadhal wants to, would be to set it to return some kind of "proxy object" that creates the entry if assigned to, and returns the item or throws an exception if its read/accessed in some way. That's maybe doable (although insanely complex given the vast array of operations it must be able to do), but would lead to a whole other kind of problems.

Remember: std::map, and all of the standard library, is just a library, not part of the language per-se. That means, std::map is actually implementable by plain, non-library means.

You can always create a subclass of std::map, with the modifications you want to: an operator[] that first checks if the entry exists and throws an exception if it doesn't, and some kind of 'add' method to create new entries.
16 Jan, 2009, David Haley wrote in the 13th comment:
Votes: 0
Mister said:
The only way for [] to work as quixadhal wants to, would be to set it to return some kind of "proxy object" that creates the entry if assigned to, and returns the item or throws an exception if its read/accessed in some way. That's maybe doable (although insanely complex given the vast array of operations it must be able to do), but would lead to a whole other kind of problems.

I was going to suggest this as a solution, but I don't believe it would actually work. It couldn't "pretend" to be the object – you'd have to have a further getter of some sort to get the real object, or throw an exception if it's not found. What would it mean to "read" the proxy object? E.g., i = m[1] + 1. What is the compiler supposed to do with that statement? Or, m[1].show() – how is the proxy object supposed to pass along that method? (it's a compile-time thing, remember, unlike dynamic languages…)
16 Jan, 2009, Pedlar wrote in the 14th comment:
Votes: 0
you can make a global room entity, and have a operator[](int vnum) check and keep the data ina std::list instead of using map and trieing to figure the way around map:

std::list<RoomData*> *r_list;
class Room {
public:
operater [] (int vnum) { <insert itteration here> }
}

and then use Room as a global, soya can do Room[1] = RoomData; theoreticly.
16 Jan, 2009, elanthis wrote in the 15th comment:
Votes: 0
quixhdal – whether you're right or wrong, the fact is that std::map (and other STL containers) work the way that they do, and that's that. Changing it would break existing applications massively. In the end, you're just coding wrong. You are tyring to pretend that C++ is Python or something. It's not. The proper way to access any key in a map (or other container) is to check that the key exists first. Something like:

std::map<key, value>::const_iterator i = mymap.find(lookup_key);
if (i != mymap.end())
… use i->second
else
… key not there

You really don't want maps throwing exceptions on undefined keys. That is BAD BEHAVIOR. Exceptions are for EXCEPTIONAL circumstances. Times when the application is ****ed and needs special attention. Check your keys before accessing elements, or use a different language that's designed for dynamic, loose, undisciplined programming.

David is right that while operator overloads are functions, there is only a [] operator. There is no []= operator (which would be awesome, but it doesn't exist). Even if it did, the current implementation would continue to works in ways that a []= would not, e.g. type& elem = mymap[key]; elem = value; That is pretty much mandatory for various kinds of forwarding functions and proxy types to wrap an STL container.

That said, you are fully correct that it is possible in C++ to make a container using [] that allows assigning to new keys in the container without automatically creating those keys on access for missing elements, although it would have a few corner cases to work around. Essentially, accessing a key would need to require a special "container_reference" type which would store a writable iterator to the element if it exists or a copy of the key and a reference to the container if it does not. Assigning could then work in both cases as could reading.

You could easily write a std::map wrapper (as well as a std::vector wrapper, or wrappers for other STL containers) that does that. That may be the right choice for your application's support code, but it is NOT the right choice for a high-performance, low-level data-structure library for C++, which is what the STL is meant to be. The STL chose not to do that kind of proxying for a reason. Most importantly, that would incur a performance cost in many cases, at least in C++03 and older. It may be possible to implement it without any performance loss in C++0X, but I'm not sure.
16 Jan, 2009, Mister wrote in the 16th comment:
Votes: 0
DavidHaley said:
It couldn't "pretend" to be the object – you'd have to have a further getter of some sort to get the real object, or throw an exception if it's not found. What would it mean to "read" the proxy object? E.g., i = m[1] + 1. What is the compiler supposed to do with that statement?

"operator Type()". The proxy object could be as simple as just having an operator= for assignment, and operator int() so that it transforms automatically to int (or throw an exception) when required. Some sort of lazy evaluation trick.
16 Jan, 2009, David Haley wrote in the 17th comment:
Votes: 0
elanthis said:
Essentially, accessing a key would need to require a special "container_reference" type which would store a writable iterator to the element if it exists or a copy of the key and a reference to the container if it does not. Assigning could then work in both cases as could reading.

Would you mind going into a little more detail here? I think that to get this to work, you'd have to fairly significantly change the semantics of the std::map object – which to me means that the true goal hasn't really been accomplished. I think what you're proposing is basically the proxy object that Mister and I are talking about.

Mister said:
"operator Type()". The proxy object could be as simple as just having an operator= for assignment, and operator int() so that it transforms automatically to int (or throw an exception) when required. Some sort of lazy evaluation trick.

True, and then a template could overload the type conversion operator to the correct type. But it's still not quite right – see Elanthis's example: type& v = m[k] – this doesn't work with the type operator overloading. (see below)

There are still several reasons why the STL doesn't/shouldn't work this way, though…
- it's inefficient; assignments to existing elements would have to go through the proxy. Unnecessary inefficiency is usually a big no-no in the STL (and C++ in general).
- it confuses type checking; the result of the subscription is no longer the very reasonable/expectable contained element type, it's some funky proxy object. (For example, the new 'auto' keyword wouldn't make sense anymore.) This leads to e.g. the problem Elanthis pointed out.
- The result of accessing an element by 'find' and by subscript is no longer the same thing.

I really am inclined to agree with Elanthis; if you want exceptions (and note that using them this way is very much against the C++ way!) use a wrapper function for looking up elements.



Example code:
$ cat op.cpp
#include <iostream>
using namespace std;

template<typename T>
struct S
{
operator T() const;
T* obj;
};

template<typename T>
S<T>::operator T() const
{
return *(this->obj);
}

int main()
{
S<int> s;
int i = 1;
s.obj = &i;

int &j = s;
cout << s << endl;
j = 2;
cout << s << endl;

return 0;
}

$ g++ op.cpp
op.cpp: In function 'int main()':
op.cpp:24: error: invalid initialization of reference of type 'int&' from expression of type 'S<int>'
16 Jan, 2009, elanthis wrote in the 18th comment:
Votes: 0
Or he could just learn to use C++ properly. :p

Pedlar said:
you can make a global room entity, and have a operator[](int vnum) check and keep the data ina std::list instead of using map and trieing to figure the way around map:


Why the hell would you try to use a std::list to emulate a std::map? The STL containers aren't about convenience, they're about having low-level efficient implementations of data structures… and if the data structure you want is a map, it would be pure stupidity (and I mean that in the fullest sense of the word) to replace it with a std::list.

Especially if you're using a numeric key. In that case it might make sense to use a std::vector if the keys are densely packed, but it would never make sense to use a std::list for that kind of thing.

Honestly, I just wouldn't even use the [] operator at all. If you feel that the STL container API is too low-level, then make a little std::map wrapper that has set, get, and has_key methods, and use that. I almost never expose a std::map in an API anyway – it's only ever used to implement things inside of an actual module. That's what STL containers are best used for, after all. A custom map wrapper with more idiot-friendly methods is quite simple to write…

class my_map_key_exception {
// blah
};

template <typename _Key, typename _Value> class my_map : public std::map<_Key, _Value> {
public:
void set (const _Key& key, const _Value& value) {
insert(std::pair<_Key, _Value>(key, value));
}

const _Value& get (const _Key& key) const {
const_iterator i = find(key);
if (i == end())
throw new my_map_key_exception();
return i->second;
}

_Value& get (const _Key& key) {
iterator i = find(key);
if (i == end())
throw new my_map_key_exception();
return i->second;
}

bool has_key (const _Key& key) const {
return find(key) != end();
}
};


Add in [] overrides to throw an exception while you're at it, too, if you want. Nobody is stopping you.

Heck, you could add a compile-time feature test to a configure script (or just a setting in a header or Makefile) to use the TR1/C++0x unordered_map type instead of map, which would give you better memory-usage and performance in just about every conceivable case. (unordered_map is a hash table implementation, while map uses a tree which in turn means that iterating over it always lists elements in sorted order by key – which is really a useless property in most cases.)

Use the STL to implement your stuff. Don't pretend it's a replacement for hash tables in Perl/PythonRuby/PHP/Lua or whatever and then complain when you find out they're not.
17 Jan, 2009, quixadhal wrote in the 19th comment:
Votes: 0
I'm not really complaining (much), elanthis. :)

I was surprised by the behaviour, since it's entirely counter-intuitive to me… but I acknowledge that that is how they chose to implement things, and can work with that.

I was simply protesting David's assertion that the language itself forced that implementation, as I don't see it. I see it as a decision made by those who standardized the STL container set.

Oh, and as for my "NULL = 3" example, how about "*(0) = 3", or one the compiler might not catch as in "int a = 0; *a = 3;"? Nothing in the language should prevent that (and in fact, before paging, that might be a perfectly valid way to set a value for hardware – if you have a C compiler for the C64, you can indeed do *(49152) = 15 to get a light grey screen border). I meant to use NULL to bring to mind a pointer to empty space, as opposed to the literal value 0 which it evaluates to.
17 Jan, 2009, David Haley wrote in the 20th comment:
Votes: 0
I think by this point there's been ample things said to support the claim that std::map must be implemented the way it is because of how the language works – a library can't do anything about it. The crux of the issue lies in the difference between operator[] for reading and operator[] for assigning, which the language does not differentiate. If you had an operator[]=, this would be fine. Maybe it "shouldn't" be this way, but, eh, well, it is…

And NULL = 3 and *(0) = 3 are indeed completely different statements. One has a constant on the left-hand side of an assignment, and the other has a pointer dereference – which is a valid thing to have. In other words, the latter is correct syntactically and perhaps conceptually to some extent, but it is not correct semantically. NULL=3 is incorrect no matter how you look at it.
0.0/20