• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Determine last column value based on criteria of a column grouping

cat2phat

New Member
I am looking for a more efficient method of determining the last status within an order that is not "OK". The file being referenced has 500,000+ records and the current solution is not viable due to slow processing.

Current solution
{=INDEX(D:D,MAX(IF(A:A=A2,IF(D:D<>"OK",ROW(A:A),""),"")))}

Code:
Order Item #   Product   Status
A         1            Corn         OK
A         2            Beans       OK
A         5            Rice          CLEARANCE
A         6            Celery      OK
A         10          Carrots    EXPIRED
A         11          Cereal      OK
B         1            Corn         OK
B         2            Beans       OK
B         5            Rice          EXPIRED
B         6            Celery      OK
B         10          Carrots    CLEARANCE
B         11          Cereal     OK

Example code would result in Order A's last status of Expired and Order B's last status of Clearance.

Once the last status is calculated I can be more specific in my need which is needing to know if the last status of an item is Expired.
 
Since we're dealing with a large database, calc speed is definitely our biggest item. First, your example has the Order sorted. Is this correct in the actual data? If not, can we?

Assuming yes...
First, create a helper column in E. In E2, put this formula:
=IF(D2<>"OK",ROW(),E1)

Then, formula to get status of say, "B" would be:
=INDEX(D:D,INDEX(E:E,MATCH("B",A:A)))

Does that work?
 

Attachments

  • Example Match.xlsx
    10 KB · Views: 0
Calculation speed, as Luke M says is the biggest item.
2 ways to reduce calculation time:
1. Have fewer formulae
2. Have shorter formulae
3. and the third of the two ways; have the formulae only refer to the necessary ranges (rather than entire columns).

In the attached, I have put such shorter and referring only to necessary ranges formulae in column F, but haven't reduced their number and already, it is a lot quicker to calculate.

To address having fewer formulae, I wrote a little macro, which executes on the press of the button in the sheet. It places a new list on the sheet, one row per order and the last not-OK status for that order using the new formulae.
In addition, you can click on that status and it will take you to the original cell in column D.

Any use?
 

Attachments

  • chandoo24278Book1.xlsm
    21.8 KB · Views: 0
Since we're dealing with a large database, calc speed is definitely our biggest item. First, your example has the Order sorted. Is this correct in the actual data? If not, can we?

Assuming yes...
First, create a helper column in E. In E2, put this formula:
=IF(D2<>"OK",ROW(),E1)

Then, formula to get status of say, "B" would be:
=INDEX(D:D,INDEX(E:E,MATCH("B",A:A)))

Does that work?

Background on the data. The Item # is always sequential but can be broken up as in the example. Also to note is that the first item's status is always "OK".

I was trying very complex array formula's where it just needed simplicity. Luke, you provided that simplicity and a concept that works. Thank you.

Here is the result.

Column E
=IF(D2<>"OK",ROW(),IF(A2=A1,E1,""))
Your formula didn't account for a new Order. When a new order started, it would use the previous orders column E value. The added IF blanks out the E column if it equals "OK".

Column F
=INDEX(D:D,INDEX(E:E,MATCH(A2,A:A)))

Alternative to Column F
Column G
=IF(E3="",INDEX(D:D,E2),"")
 
Further to my previous offering, could you supply a workbook, with any sensitive data searched-and-replaced?
 
Back
Top