• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Sumproduct with multiple row and column criteria - works, but why?

lotusaurus

New Member
I have a table of numeric data, which could be sales; each row represents a combination of two factors (e.g., store type and region) while each column represents a result for a numeric factor, say product code. So I was trying to sum the scores for a particular combination, of store type and region, for particular product codes.

I solved this thorny problem using an extension of what I found in the article here: http://chandoo.org/wp/2011/05/26/advanced-sumproduct-queries/

I used SUMPRODUCT(($A$2:$A$5=$A$9)*($B$2:$B$5=$B$9)*($C$1:$E$1>=$C$7)*($C$1:$E$1<=$C$8),$C$2:$E$5). (The absolute references are just there so that I can easily copy to another cell and play around.)

Now this gives me the answer I want, but I don't understand why it works. In the referenced article, I can understand how the array multiplication works, but it doesn't seem to apply in this case, yet it still works!

My data is a 4 x 3 array. In my SUMPRODUCT, I have, in order, these criteria arrays (4 x 1)*(4 x 1)*(1 x 3)*(1 x 3), which somehow ends up with a 4 x 3 array that multiplies correctly with my 4 x 3 data array. I thought the 4 at the beginning and the 3 at the end were the key, but if I reorder the criteria arrays ("Alternative" in my spreadsheet) to the following: (4 x 1)*(1 x 3)*(1 x 3)*(4 x 1), I still get the correct answer. I also tried (4 x 1)*(1 x 3)*(4 x 1)*(1 x 3) and still got the correct answer.

So my conclusion is that the array multiplication is indeed commutative (if that's the right word) and somehow Excel "collapses" arrays of the same dimensions. This is wonderfully convenient, but I'd like to understand why this approach works.
 

Attachments

  • Sumproduct.xlsx
    10.1 KB · Views: 6
Hi ,

The SUMPRODUCT function works that way , but it will not work for all arrays.

When you have two arrays , one a column vector , such as {1;2;3;4} ( a 4 x 1 array ) , and another a row vector , such as {1,2,3} ( a 1 x 3 array ) , what SUMPRODUCT does is form the sums of the products as follows :

{1,2,3} x 1

{1,2,3} x 2

{1,2,3} x 3

{1,2,3} x 4

which results in the same result as MMULT , which is a true matrix multiplication.

The resulting matrix , at this stage , is a 4 x 3 matrix ( 4 x 1 multiplied by 1 x 3 ).

However , consider another matrix multiplication of a 3 x 1 column vector , such as {1;2;3}.

Doing another MMULT will give you the correct matrix multiplication of 4 x 3 by 3 x 1 , which will result in a column vector 4 x 1.

However , if you try this with SUMPRODUCT , you will get an error value. Because , the SUMPRODUCT function will form the products as follows :

{1,2,3} * 1

{2,4,6} * 2

{3,6,9} * 3

{4,8,12} * ?

There is no fourth element in the second array , and this results in an error value.

Thus SUMPRODUCT will not necessarily work in all cases , and only in specific cases does it do a matrix multiplication ; in most cases , it does an element by element multiplication and this will not be the same as a matrix multiplication.

Narayan
 
Thanks for replying. I do understand a little about matrix multiplication, and as you say, SUMPRODUCT doesn't necessarily do this always, and in this case I don't specifically want it; here I only need the element by element multiplication.

What I'm puzzled about is the "looseness" of the arguments for SUMPRODUCT in the example I posted. What I'm trying to do is select and sum elements from my table of data based on multiple row and column criteria. Most examples I've seen use a single criterion for rows and for columns, so the dimensions of the arguments for selection are (r x 1)*(1 x c) applied to data of (r x c). The dimensions fit neatly: the selection criteria produce an (r x c) array of 1s and 0s, so the resulting array size follows matrix multiplication dimension rules. This array is then element by element multiplied by the (r x c) data, and then summed via SUMPRODUCT.

What I didn't expect was that using multiple single column or single row vectors would work, because if the matrix multiplication dimension rules applied, using selection arguments of (r x 1)*(r x 1)*(1 x c)*(1 x c) should result in an error. And in fact it seems that if I drop the "," separator and use "*" instead, I can put the (r x c) data range reference anywhere amongst the arguments and still get the result I want.

It seems that what SUMPRODUCT is doing is
  1. take all the (r x 1) arguments and multiply them (element by element) to arrive at an (r x 1) result
  2. do the same with all the (1 x c) arguments and arrive at a (1 x c) result
  3. matrix multiply the two results to get an (r x c) matrix of 1s and 0s
  4. multiply this matrix by the (r x c) data and sum the result
Perhaps this is already quite obvious to more advanced Excel users, but this is quite new for me.
 
(r x 1)*(r x 1)*(1 x c)*(1 x c) works as each () evaluates to either a number or a True/False, which using the *'s between the () converts the True/Falses to 1/0's

(r x 1),(r x 1),(1 x c),(1 x c) won't work if the ()'s contain logic as the values will be True/false not 1/0's

--(r x 1),--(r x 1),--(1 x c),--(1 x c) will work as each array is converted to a value by the use of double negatives

Have you read:
http://chandoo.org/wp/2011/05/26/advanced-sumproduct-queries/
&
http://www.excelhero.com/blog/2010/01/the-venerable-sumproduct.html
Which both explain how Sumproduct works in Advanced situations
 
Back
Top