• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Formula to extract values from dataset

Wojciech

New Member
Hi,

To describe the situation, in column A I have a reference, 1,2,3.... in column B I have a long string of raw data, there are several thousands of rows. The data is very unstructured but within some of the cells is information I am looking for. Usually it looks something like "gibberish (Name) gibberish" and I am looking for the name in the middle (not always same position but always somewhere in middle of the cell)

So I want to create column C that will list all of the values, be it the actual instances where it found the phrase (Name) or the reference number I created in column A. Notice that it needs to be specific "(Name)" as Name without brackets pops up all of the time as well and I don't need that, but I think matching exact won't work because of the gibberish around it (which is always different).

I can do this quite easily using filters, but I do this over and over again at work and I want to automate this process or at least know how to do similar things better in the future.

I'd love to upload it but 1. The files are very large, 2. they're secured 3. they're confidential.

Please help!
 
First of all, welcome onboard..


It would really help members to help you if you share a sample file..


No need to share the actual files or secured files..just create a sample file with expected result..
 
Hi Asheesh,

Thank you! And you're right, here is a little demonstration, I hope it helps!
Keep in mind, I know that this can easily be done by using filters, but I'm spending hours on doing the same thing over and over using filters across multiple documents where I know the right formula would do it in a jiffy.

Overall I'm not very good at excel and I especially can never get the list like formulas to work.
 

Attachments

  • SAMPLE.xlsx
    9.3 KB · Views: 7
Nice one...let me figure a way out and will get back...

By the way, what is the size of your actual data set..I can plan a solution accordingly..

Just trying to figure out, if I can go for a single formula solution or should we use helper columns..
 
Looking at an average file now, 95,000 rows, I'm looking to pull out names, values, dates etc. of about 200 instances. The files are actually only around 2mb each. Not sure why that's the case I really thought It'd be bigger. I've about 40 of such to work on atm, that's why I need to automate this or I'll go insane.

Also, can I add a second query? In regards to how I intend to also clean the data , see sample below. This isn't as important and I think I'm more so just curious if this can be done.
 

Attachments

  • SAMPLE.xlsx
    8.9 KB · Views: 6
Hi,

Used couple of helper cells since your data seems to be huge. See the attached

Regarding your second query - I am sorry but can you please explain your query once again as I couldn't relate the question asked with the given example..
 

Attachments

  • SAMPLE_wojciech.xlsx
    10.2 KB · Views: 1
Back
Top