• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Combining text fields for duplicate records

lwilt

Member
If I have a table that looks like:

emailcustomer numberapples trees sports ice cream
abc@yahoo.com123 YY Y
abc@yahoo.com126
123@gmail.com858 Y
123@gmail.com767 YY
owow@yahoo.com444 Y
owow@yahoo.com455
999@yahoo.com1001 Y Y
999@yahoo.com1103 Y
999@yahoo.com 1005 Y

How can I combine the responses so that there isn't any duplicate emails and the responses all get combined under each of the questions so I don't lose any of the responses? There would be 17 total questions I would be trying to combine for the dup emails if that matters.
 
Assuming the last 3 columns are yes/no type response, easiest would be to create a PivotTable from the data, put email in the Row Field, Number as a data field (summed), and the remaining columns in the Data Field as counts. Format the columns that need to have a "Y" with a custom cell format of:
"Y";;;

upload_2015-9-29_13-20-26.png
 
I tried it using a pivot table but because the file is so big I keep getting errors because how big the file is and not having enough resources to complete the action. I have about 200,000 emails in this and approx 17 fields that need to be counted at each email so it's definitley a big file.

Is there another solution to how to do this?
 

Hi !

Two other solutions at least needing a sample workbook
with source and result worksheets …
 
I think you requested a sample workbook so I attached one...if not could you restate that I wasn't sure
 

Attachments

  • SAMPLE.xlsx
    9.7 KB · Views: 2
I can't post the actual worksheet it has customer data in it. When I did what Luke was talking about it eventually gets an error about not having enough space to complete the action of trying to count each column because the file is so big.
 
Can you try creating the file while having no other programs open on your machine? It took awhile to calculate, but I was able to use my solution on a dataset of 400k rows and 20 columns.
 
Back
Top