26 Jan, 2010, David Haley wrote in the 21st comment:
Votes: 0
Having a formal expression is necessary for discussing optimality etc.

I think it's pretty easy actually to show that if you must always place the heaviest rock first, then only then is it optimal to always place it into the least heavy pile. (It's a quick proof by induction.) Of course the bind is that there's really no reason to think that you must always place the rocks in that order. (Besides, we've just seen that it doesn't always work.)
26 Jan, 2010, Scandum wrote in the 22nd comment:
Votes: 0
Always an option to use both algorithms and pick the most favorable result, that should cover most of the bases.
26 Jan, 2010, elanthis wrote in the 23rd comment:
Votes: 0
You might look up algorithms for the bin packing problem. It's a very similar problem (you have any number of bins, but each bin has a maximum capacity, and you are trying to reduce the number of bins necessary to pack your items) that might lead to finding solutions for your specific problem. For what it's worth, your algorithm is more or less the "best easy" solution to bin packing, again without getting into the stuff of nightmares (for those of us who dislike the math portion of CS, at least). It's more or less accepted that no solution can give the best distribution for all possible input sets, so you either just need to pick the algorithm you like best, or implement multiple algorithms per Scandum's suggestion; I personally would find implementing multiple algorithms to be onerous and wasteful unless I _really_ needed even distribution.

Note that you may well have a lot of other options. If you're just distributing rocks with no further context or purpose, you've got what you've got. If you're dealing with more interesting systems, however, you can do any number of additional steps to arrive at a more even distribution when the full system is resolved. If you are going to be distributing rocks multiple times into bins that have some further purpose (total weight of some other sub-system, for example), then you can keep track of previous distributions to ensure that the bin that got the least last time is more likely to get a larger sum the next time. You can fill bins with extra weight (loose pebbles, whatever) to enforce evenness. You can avoid large variations in rock size to reduce the likelihood of hitting an sub-optimal corner case, or spread larger rocks amongst different sets of distributions to make it easier for smaller rocks to pad out the differences.

In other words, without further context, it may be that you're asking the wrong question and getting a sub-optimal answer because of it. Maybe you're just trying to pack rocks in bins and that's that, but I can't imagine what you need that for in a MUD programming context. :)
26 Jan, 2010, David Haley wrote in the 24th comment:
Votes: 0
elanthis said:
It's more or less accepted that no solution can give the best distribution for all possible input sets

I didn't realize that this was accepted at all…! All that's been established is that the two algorithms given so far are not optimal. (That one works when the other fails is just coincidence at this point, although it would be interesting to explore if there was a deeper relationship there.)
EDIT: in fact, I already gave a "perfect" algorithm. It's just incredibly slow for non-trivial input sets. But it's guaranteed to give the optimal answer. (Eventually.) So it's not like this is an unsolvable problem or anything.
26 Jan, 2010, Twisol wrote in the 25th comment:
Votes: 0
I happened to notice that for the numbers Scandum used, they were all less than the optimal size (49/3 = 16), while for the numbers David used, one happened to be more than the best-case (45/3 = 15). It's also interesting to point out that, if any bin is ever greater than the best-case, it should be considered locked from any further adding. With David's numbers, the problem is effectively reduced to dividing the remaining numbers into two bins, since nothing else should be added onto the first bin.

Also, I noticed that using Shasarak's algorithm over Scandum's numbers, but alternating between one end of the list and the other (i.e. 10, 4, 9, 5, 8, 6, 7), gave a better solution than before, but not quite as good as Scandum's algorithm in that case. (17, 17, 15)
26 Jan, 2010, David Haley wrote in the 26th comment:
Votes: 0
Twisol said:
It's also interesting to point out that, if any bin is ever greater than the best-case, it should be considered locked from any further adding.

You can prove fairly straightforwardly that Shasarak's algorithm has this property. A proof sketch is that you only add to the bucket with the least; the one with the least will never be greater than optimal. If it were greater than optimal, then the others are also greater than optimal, therefore contradicting the calculation of the optimal value. Hence, you are guaranteed to "lock out" any bin that goes above the optimal.

I think this actually shows the problem with Scandum's approach: it doesn't do well if you have to go above the optimal. When I chose my numbers I was deliberately constructing a case where one number was greater than the optimal. You'll also note that the sum of 45 is evenly divisible by three, which I was playing with too, but didn't see if it made a difference.
26 Jan, 2010, Scandum wrote in the 27th comment:
Votes: 0
Guess my algorithm could be improved by calculating a new average after filling the first bucket.

So: 20,10,5,4,3,2,1

Would result in:

(average = 45 / 3 + 45 % 3 = 15)
20 = 20
(average is 25 / 2 + 25 % 2 = 13)
10, 3 = 13
5, 4, 2, 1 = 12
26 Jan, 2010, David Haley wrote in the 28th comment:
Votes: 0
Special-casing your solutions can bring you to a better solution for some cases while moving you further away in others. You don't really want to design algorithms with patch-work unless you can establish that the patch is always helpful.
26 Jan, 2010, elanthis wrote in the 29th comment:
Votes: 0
David Haley said:
elanthis said:
It's more or less accepted that no solution can give the best distribution for all possible input sets

EDIT: in fact, I already gave a "perfect" algorithm. It's just incredibly slow for non-trivial input sets. But it's guaranteed to give the optimal answer. (Eventually.) So it's not like this is an unsolvable problem or anything.


That's what I meant. The traveling salesman problem is "solvable" too, just not in any computationally feasible manner. :) A solution you can't actually use isn't a solution at all.
27 Jan, 2010, David Haley wrote in the 30th comment:
Votes: 0
elanthis said:
That's what I meant. The traveling salesman problem is "solvable" too, just not in any computationally feasible manner. :) A solution you can't actually use isn't a solution at all.

But nobody has established, either, that this problem is necessarily NP-hard. The brute force solution was just that, a silly brute-force solution. It's not clear to me that this is also an NP-hard problem, although I'll grant that it certainly smells of it. It depends on what additional constraints there are.

Of course, I agree with your earlier, other point, regarding what this thing is actually being used for. It's quite possible (likely?) that a "good enough" solution would be quite appropriate.
27 Jan, 2010, Scandum wrote in the 31st comment:
Votes: 0
David Haley said:
You don't really want to design algorithms with patch-work unless you can establish that the patch is always helpful.

Okay..

Anyways, with the current test cases it looks like the better algorithm.
27 Jan, 2010, David Haley wrote in the 32nd comment:
Votes: 0
Does probability favor this algorithm, too? :rolleyes:
20.0/32