• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Powerpivot - many-to-many "master table" that changes

J W

New Member
I'll bet this has been answered in some format somewhere else, it's just hard to find the right keywords to search for.

The problem I'm trying to solve is relating to "de-normalized" excel tables from two different sources. Neither table has a "primary key" nor a sensible way to create a composite key that would make each row's value unique. (The best example is that a few rows might have the exact same "descriptive" data, and the only difference would be comments that define why each row has different data

What I think will work is to create a composite key in each table. Then create a "Master Table" from both table's composite keys, eliminating duplicate entries. I would then relate that "Master Table" to each of the "sub tables" and create measures from there.

So far, so good (I think).

The problem I'm trying to solve is the on-going process of maintaining that "Master Table." Either one of the two original tables could add new data that creates a new "composite key."

What's the easiest (best, any possible) way of keeping that "Master Table" up to date?

The manual process would be to get each table every day, create their composite keys, combine them, remove the duplicates, then replace the "Master Table" data with the new list.

Is there an easier way?
 
Last edited:
Can you upload sample table of each with enough data (say about 50 rows)? It's far easier to make recommendation when I can look at actual data patterns.
 
Hopefully this got uploaded correctly.
LocalData has a version of what I can control / update, etc.
SysData has a version of what our system creates.
Key Data is an attempt to show what I'm trying accomplish with the "Master Table."

Further explanation:
The data is for project costs. I have a spreadsheet to perform my calcs. The system of record can be updated by any number of people.
So, I could set things up today to compare one file to the other using the composite key example (just a few columns concatenated together). I would get the key from each file, remove duplicates, and then set up a PowerPivot relationship between that "Master Table" and the source data tables.

The problem is, anyone can update the system, creating new rows of data for projects, and therefore new "keys" that I might not be aware of.

I'm trying to find the "best" way to keep the Master Table of composite keys up-to-date.

Some of the articles I've read have talked about creating Access databases and importing this data into that and using relational database tools to create / update the Master Table automatically. That seems like a lot of trouble, and I'm not an Access expert by any means. I've created applications in FoxPro, used Oracle tools before and have other relevant experience. I just don't have the time to learn Access. Unless that's the "best" option.

Anyway, back to the data:
On the Key Data tab I have the full list from each source tab, a list with duplicates removed; and then those two non-duplicate lists combined and de-duplicated.
I also put a one column pivot table on each source tab that gives me the two de-duplicated lists, that I could then copy/paste and de-duplicate again to update the Master table.

If the number of words above seems like a lot; it's nothing compared to how much work I think this would be to use.\

Process stuff
So, at most, these data tables would only change daily. That's the primary objective of this request.
On a related note, I then try to get my "LocalData" to match the SysData, so I can do some offline analysis and presentations.
I'm hoping this solution leads to something like this:
Step 1: Get updated data tables
Step 2: Update the Master table (see above)
Step 3: Generate PowerPivots that compare the data in each table.
Step 4: Updated LocalData; refresh pivots, repeat until the data is the same.

Without this "Master Table" thing figured out right now, I basically use the two pivot tables at the top of the source data tabs and:
1. Copy and paste each one into a consolidated table;
2. Add a column that defines the source of the data;
3. Update pivots to identify variances;
4. Update source data and pivot table;
5. Copy and paste that new data to the consolidated table in "1."
6. Refresh the pivots and continue with "3."
The tricky part is making sure my cut and paste doesn't overwrite any data, and that the "source" column has the right value for what's in there.

Whew! I'm sure hoping PowerPivot can make this thing easier!

Thanks for any and all help.
 

Attachments

  • Example data.xlsx
    55.4 KB · Views: 3
Take a look at using PowerQuery to create your Key tables - it will remove duplicates for you - and load them into the data model. That should overcome the issue of users adding other data as the Query will provide the most recent data to the Data Model at all times.
I'm rushing out here, but I watched a YouTube video the other day from, I think, ExcelisFun, which addressed this very issue ....
 
Thanks. I'll give that a look.
(Last time I tried to install PowerQuery my machine wouldn't let me; something about not having the latest "something." We're kind of conservative in our installations / upgrades - just went from XP to Win 7 in February, 2015; and IE 11 a few months ago :) I'll try again though, and check out that video.
 
Yep. From looking at it. PowerQuery to build model is probably the way to go.

You can even use Merge Column function to generate key with delimiter of your choice (or just calculated column) as new data comes in.

On top, you can use it to Merge & Append multiple query.

My favorite part is that you can write native SQL query to let the database do heavy lifting with ease.
 
Oh one thing. If at all possible, make sure you are happy with the data model generated from PowerQuery before you load it to PowerPivot.

Otherwise, you are going to have to update PowerPivot everytime you make changes to PowerQuery/Data Model.
 
Back
Top