• Hi All

    Please note that at the Chandoo.org Forums there is Zero Tolerance to Spam

    Post Spam and you Will Be Deleted as a User

    Hui...

  • When starting a new post, to receive a quicker and more targeted answer, Please include a sample file in the initial post.

Very weird formula works in filtered context not in notfiltered

tomas

Active Member
Hi in the picture in attachement.

1.I have dataset about 111000 rows.

2. Calculated field rank_90 count what would be 90 percentil if we assign rows from 1:111000

3. Formula above should is calculation for field 90_perc. It should filter table and take only rows until the value rank_90 calculated fiels.

4. then from this filtered table it should return max value from column Per.diff.

5. the result should be as you see on picture 7% but calculated field returns max of whole table.

6. For calculated field 10_per it works

7. And what the most intriguing fact is, this field 90_perc works well if I filter table e.g. based on month.

8. For me it's great it works in filtered context but it's very surprising it does'n work for whole dataset
 

Attachments

  • perc.png
    perc.png
    121.2 KB · Views: 1
Can you please post a sample file and highlight the issues
It is very difficult to understand the issue from the description above
 
Thanks Hui

I figured the problem out. Only if someone wants to know I explain.

In simplified sample

1. Column per.diff is sorted from the smallest and ranked from 1 to up I think ascending :)

2. Calculated field [rank_90] calculates 90 percentil of number rows so if there is 3100 rows it returns about. 2700.

3. And I needed corresponding value from col PER.DIFF which would be 90 percentil - which is accomplishe in calculated field [90_Perc]

4. I used function topn which subset your dataset by rows from one until my calculated field [rank_90] and select maxx value which since is sorted correct90 percentil.

5. But as I learned If a column I want to get return value PER.DIFF contains duplicate it takes more rows with duplicate values as one - and returns bigger dataset consequenlty greater number than real 90 percentil.

6. That is why I got bad result for whole dataset because there were many duplicates but when I filtered to minor datasets I got about correct answer.

7. Solution was for argument N value ( where I put my calculated field ) subtract difference between countrows and distinctcount.
 

Attachments

  • chandoo.xlsx
    291 KB · Views: 2
Neither this is completely good I think I need to in step subtract difference between countrows and distinctcount only in the subset of data until the row [rank_90] so I am working on it
 
I finally got it

If someone wants to have percentile values dynamically recalculated based on his filters this is dax formula

90:=maxx(FILTER(Tabulka2;rankx(Tabulka2;[rank];;1)<=ROUNDUP((COUNT([PER.DIFF])-1)*0,9+1;0));[PER.DIFF])

With previous workbook it should be understandable
 
Back
Top