13 Jul, 2009, Idealiad wrote in the 1st comment:
Votes: 0
Bear with me if I'm not getting the vocabulary right. What I want to do is create a relationship map for things in a list, where each thing has a relationship (which could be represented as whatever, a list, a string, etc.) with every other thing in the list. Like:

a b c d e
a r r r r
b r r r r
c r r r r
d r r r r
e r r r r


Where the list of things is [a, b, c, d, e] and a relation is r. Then to find the relation between a and b you'd do lookup(a, b). The relation between b and a might be the same or different, you'd get it with lookup(b, a).

I'm using Python, so my first thought is to use a dictionary (otherwise known as a hash), where each item is a key, and its value is another dictionary, where that dictionary's keys are the other items in the list. Then the value of each of those keys is the relationship r. A lookup then would look like

dictionary[item][other item]


Is there another structure that could work more efficiently and/or be easier to manage?
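
For concreteness, a minimal sketch of that nested-dictionary layout (the item names and the "r" values are just placeholders):

items = ["a", "b", "c", "d", "e"]

relations = {}
for item in items:
    relations[item] = {}
    for other in items:
        if other != item:
            relations[item][other] = "r"   # whatever the relation actually is

def lookup(x, y):
    # The x -> y relation; may differ from lookup(y, x).
    return relations[x][y]

print(lookup("a", "b"))   # "r"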
13 Jul, 2009, Davion wrote in the 2nd comment:
Votes: 0
I think what you have is pretty good. Considering it seems you're going for a sparse array, it's likely to be the most effective. And if you need to know whether a certain relation exists, you can always check with 'key in'.
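
For example, something like this, assuming a nested relations dictionary along the lines of the first post:

relations = {"a": {"b": "r"}}           # tiny stand-in for the nested dictionary

if "a" in relations and "b" in relations["a"]:
    print(relations["a"]["b"])          # only reached if the relation was stored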
13 Jul, 2009, flumpy wrote in the 3rd comment:
Votes: 0
Idealiad said:
Bear with me if I'm not getting the vocabulary right. What I want to do is create a relationship map for things in a list, where each thing has a relationship (which could be represented as whatever, a list, a string, etc.) with every other thing in the list. Like:

a b c d e
a r r r r
b r r r r
c r r r r
d r r r r
e r r r r


Where the list of things is [a, b, c, d, e] and a relation is r. Then to find the relation between a and b you'd do lookup(a, b). The relation between b and a might be the same or different, you'd get it with lookup(b, a).

I'm using Python, so my first thought is to use a dictionary (otherwise known as a hash), where each item is a key, and its value is another dictionary, where that dictionary's keys are the other items in the list. Then the value of each of those keys is the relationship r. A lookup then would look like

dictionary[item][other item]


Is there another structure that could work more efficiently and/or be easier to manage?


Nope, a hash is the standard way to do this sort of thing, and the "dictionary" thingy is Python's hash.

Sounds good to me too :D
13 Jul, 2009, David Haley wrote in the 4th comment:
Votes: 0
There are perhaps ways to do it more efficiently, but do you really need to?

An interesting thing to note is that your relation appears to be symmetric, i.e. if a R b then b R a. You can take advantage of this fact by only storing half the relation; perhaps only store x R y where x < y. So, if somebody asks you for x R y, you figure out which of x and y is smaller, and then look up the pair in that order.
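
A rough sketch of that idea, assuming the keys can be ordered with < (the helper names here are made up):

relations = {}

def set_relation(x, y, r):
    a, b = (x, y) if x < y else (y, x)   # canonical order: smaller key first
    relations.setdefault(a, {})[b] = r

def lookup(x, y):
    a, b = (x, y) if x < y else (y, x)   # same ordering on the way back out
    return relations[a][b]

set_relation("d", "b", "friends")
print(lookup("b", "d"))   # "friends" -- stored only once, found either way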

BTW map is the "correct" conceptual term for this; it's kind of unfortunate that a lot of languages have mixed the notion of "map" with "hash" because a hash table is just one implementation among many of a map. You wouldn't call a map implemented as a binary tree a hash, after all.
13 Jul, 2009, Runter wrote in the 5th comment:
Votes: 0
David Haley said:
BTW map is the "correct" conceptual term for this; it's kind of unfortunate that a lot of languages have mixed the notion of "map" with "hash" because a hash table is just one implementation among many of a map. You wouldn't call a map implemented as a binary tree a hash, after all.


I agree with most of what you said, but I do disagree with the comparison. If it was implemented as a binary tree, odds are said languages would call it a btree, etc. In these languages they are naming the data types based on their implementation, not their conceptual term. Otherwise they might all share the properties that merit calling them an associative array. Maybe your idea of unfortunate was someone else's idea of appropriate. :biggrin:
13 Jul, 2009, David Haley wrote in the 6th comment:
Votes: 0
But a binary tree is not a map. It's a binary tree. Sure, in some sense, you could argue that a set implemented with a binary tree is a mapping from object to boolean. But a map really is a very separate concept from a binary tree.

It's generally a poor idea to freely mix implementation and concept, if only because it makes things very confusing when you want to change your implementation!

Java's approach is pretty good: you have the conceptual, abstract container classes (Map, Set, List, …) which can only be instantiated using classes with a specific implementation (HashMap, TreeMap, HashSet, TreeSet, ArrayList, LinkedList, …).

It means that you can use containers as what they represent conceptually and pass them around while being agnostic w.r.t. their implementation, but when you create them, you can pick the implementation that suits its usage.
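
Python has a loose analogue of that separation in its abstract base classes; a minimal sketch (nothing here is specific to the thread's problem):

from collections.abc import Mapping

d = {"a": 1, "b": 2}            # concrete implementation: a hash table
print(isinstance(d, Mapping))   # True -- it also satisfies the abstract "map" concept

def describe(m):
    # Written against the concept, so any Mapping implementation will do.
    return sorted(m.items())

print(describe(d))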
13 Jul, 2009, Runter wrote in the 7th comment:
Votes: 0
David Haley said:
But a binary tree is not a map. It's a binary tree. Sure, in some sense, you could argue that a set implemented with a binary tree is a mapping from object to boolean. But a map really is a very separate concept from a binary tree


The underlying implementation for a map can be a binary tree.
13 Jul, 2009, Runter wrote in the 8th comment:
Votes: 0
David Haley said:
Java's approach is pretty good: you have the conceptual, abstract container classes (Map, Set, List, …) which can only be instantiated using classes with a specific implementation (HashMap, TreeMap, HashSet, TreeSet, ArrayList, LinkedList, …).


Sounds like you should be upset that Java is calling a data type a hash, then. Just because another language doesn't include the term 'map' in the name doesn't mean it isn't implied, since all of those data types share the properties of a 'map.'
13 Jul, 2009, David Haley wrote in the 9th comment:
Votes: 0
I have no idea what you just said. :rolleyes:

My problem is committing the error of directly associating a concept with its implementation. It's important to keep the concept and the implementation separate. You can understand one without understanding the other and it's important to not muddle them.
13 Jul, 2009, Runter wrote in the 10th comment:
Votes: 0
David Haley said:
I have no idea what you just said. :rolleyes:

My problem is committing the error of directly associating a concept with its implementation. It's important to keep the concept and the implementation separate. You can understand one without understanding the other and it's important to not muddle them.


I disagree with the notion that there is a problem with describing the implementation. You just gave an example with Java where it does it and you said it was good. [So I can only assume you just want the word Map tacked on to Hash]

So HashMap would be fine to you.

But earlier you used the example of how it was silly by suggesting that if it was implemented as a binary tree, Hash[map] wouldn't be appropriate. No kidding. If it was implemented like that, then it wouldn't have been named that at all.
13 Jul, 2009, David Haley wrote in the 11th comment:
Votes: 0
I didn't say that there's a problem with describing the implementation in addition to the concept. I said that I have a problem with people only specifying the implementation when they're describing the general concept. Languages that use "hash" to mean the concept "associative array" or "map" have given rise to a whole group of people who think that "hash" == "map".
13 Jul, 2009, elanthis wrote in the 12th comment:
Votes: 0
What you're actually talking about is called a "graph." Each item you're interested in is a "node" (or vertex, or point) and each relationship is an "edge" (or line). These are often implemented as "dictionaries" (or maps, or associative containers, or tables), which themselves are often implemented either as hash tables or as trees. Python calls them dictionaries.

So far as efficiency, what you have is more than sufficient. If relationships are symmetrical and if your keys can be ordered (sorted), then I would suggest picking the higher key as the top-level dictionary and the lower key as the inner dictionary (or the reverse, so long as you're consistent) so that you don't need to put each relationship in twice. For example, if the keys are strings, and you want the relationship for "foo" and "bar" then you would look up relationships["foo"]["bar"]. If you wanted to look up "alpha" and "gamma" then you would use relationships["gamma"]["alpha"]. That obviously is counter-productive if your relationships are asymmetrical.

If the keys can be efficiently combined into a unique key, you could also use a single dictionary. For example, if you have string keys, and you know the keys will never ever have the : character in them, then you could concatenate the two keys (in consistent sorted order, if appropriate) with a : between them and look up that single combined key in a single relationship dictionary. However, chances are that two dictionaries will be more efficient, since it's quite likely that two lookups in an efficient dictionary implementation will be faster than a string concatenation (but maybe not, so benchmark if you really care).
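
A quick sketch of that single-dictionary variant (the ":" separator and the example keys are arbitrary):

relations = {}

def combined_key(x, y):
    a, b = sorted((x, y))   # consistent order so (x, y) and (y, x) match
    return a + ":" + b      # assumes ":" never appears in a key

relations[combined_key("foo", "bar")] = "r"
print(relations[combined_key("bar", "foo")])   # "r"

(In Python a tuple like (a, b) also works directly as a dictionary key, which sidesteps the separator problem entirely.)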

Basically, though, you've got the best setup you're going to get. :)
14 Jul, 2009, quixadhal wrote in the 13th comment:
Votes: 0
David Haley said:
I didn't say that there's a problem with describing the implementation in addition to the concept. I said that I have a problem with people only specifying the implementation when they're describing the general concept. Languages that use "hash" to mean the concept "associative array" or "map" have given rise to a whole group of people who think that "hash" == "map".


You probably really hate Perl: not only does it refer to associative arrays as "hashes", but because of the way hashes are implemented they can ALSO be treated as arrays (odd elements are keys, even elements are values)… and there is also a "map" function that's used to filter arrays through functions to generate output arrays.

Quote
my $query = join('+OR+', map { my $x = $_; $x =~ s/ /+/g; "%22$x%22";} (sort keys %kw));


FWIW, I agree… it does make learning new languages annoying, as you have to relearn the new meaning of words you're already used to using.
14 Jul, 2009, kiasyn wrote in the 14th comment:
Votes: 0
quixadhal said:
You probably really hate Perl: not only does it refer to associative arrays as "hashes", but because of the way hashes are implemented they can ALSO be treated as arrays (odd elements are keys, even elements are values)… and there is also a "map" function that's used to filter arrays through functions to generate output arrays.

Quote
my $query = join('+OR+', map { my $x = $_; $x =~ s/ /+/g; "%22$x%22";} (sort keys %kw));


Everyone hates Perl - look at your code snippet! :P Gibberish.
14 Jul, 2009, quixadhal wrote in the 15th comment:
Votes: 0
Bah, you're just jealous. Not all languages are bless'd, like perl. I understand, perl can get people all tie'd up.

Still, it is pretty cool to be able to bind an SQL table to a live data structure and do things like add a field to the hash and have it automatically do an "alter table foo add column bar text".

It would be challenging to do that in C++ (perhaps not impossible, not sure how late-binding works in C++).
14 Jul, 2009, David Haley wrote in the 16th comment:
Votes: 0
You can do things like that in many high-level interpreted languages. Certainly Lua, Python, Ruby, Smalltalk, and similar support such things. Any language that allows you to intercept the getting and setting of attributes would let you do similar things. SQLAlchemy in Python, for example, makes very heavy use of this kind of thing.
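
A bare-bones illustration of intercepting attribute assignment in Python (the class and the printed message are invented for the example; a real ORM would do far more):

class Record:
    def __setattr__(self, name, value):
        # A real ORM hooks in here to track or persist the change;
        # this toy version just reports it.
        print("setting %s = %r" % (name, value))
        object.__setattr__(self, name, value)

row = Record()
row.bar = "text"    # prints: setting bar = 'text'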
14 Jul, 2009, Runter wrote in the 17th comment:
Votes: 0
Perl is a 1337 language indeed. With all of the cryptic elegance that comes with it.