18 Dec, 2011, David Haley wrote in the 61st comment:
Votes: 0
Rarva said:
If you understand it now, you see why there is no problem multithreading the part whereI need to find the max, as it is simply no more than that: finding the max element in an array.

Sorry, it still isn't very clear. :sad:
You're summing data… but why does sum have a second parameter? What does it mean to execute a switch statement on 'data', using the sizeof an int as a parameter?

And this problem is very different from finding the maximum element in an array!

[100 -200 25 25 25 25 25 25]

it should be very clear that the largest element has nothing to do with the solution.

Runter said:
So we shouldn't talk about hardware architecture as part of the algorithm, because it's not.

Agreed in the vast majority of cases. Just to quibble, though, sometimes, it is – if your algorithm is accessing data in a way that the hardware architecture doesn't like, you can make your solution go from ok to very bad. (e.g., processor caching, data access speed, etc.)
Unless of course you meant the high-level algorithm, like from a mathematical standpoint, not its implementation, in which case I agree with you unreservedly.
18 Dec, 2011, Rarva.Riendf wrote in the 62nd comment:
Votes: 0
ok I will go from start to the end so you understand the whole process

1 2 3 4 5 < max is 5, then I move the vector elements, removing the last one and add it to the previous on
1 2 3 4 and add it
1 3 5 7 9 <finding max-> 9 then move again and add it

at this point I will have calculted 1 (2+1) ( 3+2) (4+3) (5+4)

1 2 3 4 5
1 2 3 4
1 2 3
1 3 6 9 12 < max is 12 (5 4 3) it has been found on index 5 of array, for a sum of 3 number

at this point I will have calculted 1 (2+1) ( 3+2) (4+3) (5+4) and (3 + 2 +1) (4 + 3 + 2) ( 5 + 4 + 3)
1 2 3 4 5
1 2 3 4
1 2 3
1 2
1 3 6 10 14 <MAX is 14 it has been found on index 5 of array, for a sum of 4 number

at this point I will have calculted 1 (2+1) ( 3+2) (4+3) (5+4) and (3 + 2 +1) (4 + 3 + 2) (5+4+3) and (5+4+3+2) (4+ 3+2 +1)
1 2 3 4 5
1 2 3 4
1 2 3
1 2
1
1 3 6 10 15 < the max is 15 at this point I will have calculated all the possible subbvector sums, and just have to show the max I found till yet, with its index and position
18 Dec, 2011, Runter wrote in the 63rd comment:
Votes: 0
Rarva.Riendf said:
'The fact that it uses threading it achieve it is irrelevant to the algorithm'
In the canonic solution it is irrelevant, in my solution the speed gain can be significant. And sometimes the best algo for a single thread is not the best if you can parallelize. And there is nothing mysterious about parallelizing the parsing of an array to find the max element…

And my solution is n^2 so it will not be faster than the best solution anyway, but mine can be partly parallelized…


Okay, so tell us what this means and why it is faster, and why that has nothing to do with the fact that you are using multiple processors to do the actual work.
18 Dec, 2011, Rarva.Riendf wrote in the 64th comment:
Votes: 0
read my example and tell me what you do not understand in it. I think I made it simple enough. I could code it, but lacking the know how needed for the bit fiddling, moving all values from a vector by one, then adding it to another would be way too costly for my computer to run…
18 Dec, 2011, David Haley wrote in the 65th comment:
Votes: 0
Sorry, Rarva, I don't understand your notation. I don't know what input you're working on. I don't know what your output is. I don't know what you do when you hit negative numbers. I don't know why you start with "1 2 3 4 5" a bunch of times and keep ending up with different processes. When you laid out the pseudocode earlier, I didn't understand that notation earlier. I gave you an example of something I didn't understand (the sum/switch statement). What are you trying to say with something like that? Can you explain the process rather than give a bunch of 1 2 3 4 5 examples?
18 Dec, 2011, Rarva.Riendf wrote in the 66th comment:
Votes: 0
Quote
I don't know what input you're working on.

I chose 1 2 3 4 5 so you can follow the number manipulation , you can look at my first example where I used the original example with the exact same algo.

Quote
I don't know what your output is.

max is 15, found at the fith position, at the last line.

I give it up explaining it , I will code it so you can see it works (and how), but I will have to code is with bit fiddling, so it will be very very slow…
18 Dec, 2011, David Haley wrote in the 67th comment:
Votes: 0
I don't care about exact code, you don't need to go to that extent… I just want to see pseudocode that uses well-known operations that I can follow. :smile:

Your example makes sense only in the case where everything starts from the first position and moves on from there. What if the greatest sum starts at index 27? What about negative numbers? etc.
18 Dec, 2011, Rarva.Riendf wrote in the 68th comment:
Votes: 0
45	-34	8	7	48	-30
45 -34 8 7 48 //max 55 the sum of the two vector is (45 11 -26 15 55 18)
45 -34 8 7 // here I find the 63 the sum of 3 vectr till yet is (45 11 19 -19 63 25)
45 -34 8 // max 45 sum is 45 11 19 26 29 33
45 -34 //max 74 sum is 45 11 19 26 74 -1 we found a max of 74 for a sum of 4 elements
45 <- max 74 no change here…
45 11 19 26 74 44

get excel it is simple enough to copy paste the vector incrementing its position by one each line and do the sum every line

You will see by your own eyes that if you repeat it to the end you will have calculated all the possible subvector sums.
I guss you will have to write it to realiza how simple it is.
18 Dec, 2011, David Haley wrote in the 69th comment:
Votes: 0
OK, at this point, I have to say that all I'd like to see is a clear description of the actual operations using standard functions and notation. If you don't feel like providing that, which would be totally fine, then there isn't much point continuing this…
18 Dec, 2011, Rarva.Riendf wrote in the 70th comment:
Votes: 0
x y z t
+ x y z -> x (x+y) (z+y) (t +z)
+ x y -> (z+ x+y) ( t + z + y) (and (x) and (x+y) but both sums were already done the step before)
+ x -> (t + z + y + x)

I cannot be more standart….You are just oblivious to the fact I just calculated all the contigous subvectors this way)…
18 Dec, 2011, Rarva.Riendf wrote in the 71st comment:
Votes: 0
/*
* test.c
*
* Created on: 15 dc. 2011
* Author: RARVA.Riendf
*/

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

#define VALUE_NUMBER 5000

int data[VALUE_NUMBER] =
{paste data here};

struct max {
int pos;
int value;
};

struct max findMax(int data[]) {
struct max currentMax;
currentMax.pos = 0;
currentMax.value = data[0];
int i;
for ( i = 0;i < VALUE_NUMBER;i ++) {
if (data[i] >= currentMax.value) {
currentMax.pos = i;
currentMax.value = data[i];
}
}
return currentMax;
}

void switchData(int data[]) {
int i;
for ( i = VALUE_NUMBER - 1;i > 0;i –)
data[i] = data[i - 1];
data[0] = 0;
}

void copyData(int data[], int targetData[]) {
int i;
for (i = 0;i < VALUE_NUMBER;i ++)
targetData[i] = data[i];
}

void sumData(int data[], int data1[]) {
int i;
for (i = 0;i < VALUE_NUMBER;i ++)
data1[i] += data[i];
}

struct max maxof(struct max max1, struct max max2) {
if (max1.value > max2.value)
return max1;
if (max2.value > max1.value)
return max2;
return max1.pos > max2.pos ? max1 : max2;
}

int main() {
struct max currentMax, tempMax;
currentMax = findMax(data);
int nbElements = VALUE_NUMBER;
int currentData[VALUE_NUMBER];
copyData(data, currentData);
int switchedVector[VALUE_NUMBER];
copyData(data, switchedVector);
while (nbElements) {
nbElements –;
switchData(switchedVector);
sumData(switchedVector, currentData); //<- (if data is 1 2 3 it does 1 2 3 + 0 1 2 next loop it will be 1 3 5 + 0 0 1) I know it is possible in binary with the appropriate << and &
tempMax = findMax(currentData); //<- find the highest value and its position, and this can be heavily and easily parallized if needed
currentMax = maxof(tempMax, currentMax); // <- pick the highest one
}
fprintf(stderr, "%d\n\r", currentMax.value);
fprintf(stderr, "%d", currentMax.pos);
fflush(stderr);
fflush(stdout);
return 0;
}


All the data manipulation in the vector can be done by bit fiddling. Using loops is perticulary inefficient when you can just use one operation, but I wont bother, it was just to prove my algo. I find 21494 with it.


for reference my pseudocode…
struct max {
int value;
int position;
}
data[nbElements] = {x,y, z, ….};
index = nbElements
max tempMax;
max finalMax;

while (nbElements) {
nbElements–;
sum (data, (switch(data, sizeof(int)) <- (if data is 1 2 3 it does 1 2 3 + 0 1 2 next loop it will be 1 3 5 + 0 0 1) I know it is possible in binary with the appropriate << and &
tempMax = findmax(data) <- find the highest value and its position, and this can be heavily and easily parallized if needed
finalMax = maxof (temMax,finalMax) <- pick the highest one
}

Siiiigh
18 Dec, 2011, David Haley wrote in the 72nd comment:
Votes: 0
By notation, I meant the operations, as I said – I didn't need generalized notation of the examples because it wasn't clear what your general case was in the first place. A full source file does the job, because I can see what exactly is happening, so thanks for that.

In any case, yikes, that's inefficient! But, yes, now that you've laid out how your solution actually works, you can improve performance in the linear scans with parallelization, although you'd need an awfully large input set for the gains to start really showing. But you won't change the asymptotic performance of the algorithm, which is really the whole point of this exercise.

EDIT: I hope it is clear to you, comparing your pseudocode and your actual code, why your pseudocode is confusing. For instance the term "switch" doesn't actually mean what you're using it to mean. The "sum" operation makes sense once you understand that "switch" is returning a data array.
18 Dec, 2011, Rarva.Riendf wrote in the 73rd comment:
Votes: 0
I hoped that with the example it would make sense, anyway I know how innefficient it is.(especially since the sum and the switch should be a single operation, but playing with memory directly requires a better grasp of C syntax than I have ).
Though the code itself can be greatly improved, it will stay a n^2 at best. I really wonder what was the original algo though :)
18 Dec, 2011, David Haley wrote in the 74th comment:
Votes: 0
The bit-level and memory operations will only marginally improve efficiency… in a case like this, you'll get much better performance by changing the algorithm rather than trying to change the constant-time factors.

What is the original algorithm that you're referring to?
18 Dec, 2011, Idealiad wrote in the 75th comment:
Votes: 0
He probably means the one cited in the OP. I still have to read that part of the book :).
18 Dec, 2011, David Haley wrote in the 76th comment:
Votes: 0
Ah. The book refers to the canonical solution.
18 Dec, 2011, Rarva.Riendf wrote in the 77th comment:
Votes: 0
Quote
Now this problem would not be so interesting but for the story behind it. The author tells a story that the original algorithm the programmer came up with was going to take 15 days to process a vector of 100,000 numbers (note that this book is relatively old)

Because I doubt even my version would take that long on an old computer :)
18 Dec, 2011, Sharmair wrote in the 78th comment:
Votes: 0
Rarva.Riendf said:
I hoped that with the example it would make sense, anyway I know how innefficient it is.(especially since the sum and the switch should be a single operation, but playing with memory directly requires a better grasp of C syntax than I have ).
Though the code itself can be greatly improved, it will stay a n^2 at best. I really wonder what was the original algo though :)

Nice to finally get something other then talk. And with something to actually work with I can
fill is a few things that were still omitted.

I worked the code into my test program so I could run it while setting the array size and
timing it. It does in fact seem to be O(n^2) and run about 10 times slower then my
refined brute force method (see post 47), the output was as follows:

Rarva.Riendf method, size = 5000.
max = 21494 pos = 569 time = 0.625 sec.

Rarva.Riendf method, size = 2500.
max = 21494 pos = 569 time = 0.156 sec.
60.0/78