jbrass
Frequent Contributor

Count Duplicates Based on Text Field

Hello!

I'm running into a problem that I'm struggling to solve and was hoping that I could get some help here. I have a numbered list that represents employees. Sometimes, employees are converted from part-time to full-time and when this happens, they are assigned a new employee ID. I need to be able to identify cases where there are multiple list items that represent the same individual (based on a text field that shows their name). In excel, this would be accomplished by using a countif formula. I'd like to know if there is a simple solution that I'm just not seeing. I think I could create a new list based on the concatenation of employee ID and name, and then have the name be a list property and then do a count based on that, but I'd like to know if there is a solution that requires less steps (preferably no need for a new list or import action). Screenshot below; The "Count of items" field is what I'd like to make formulaic.

jbrass_0-1616090450090.png

Thank you!

Josh

1 ACCEPTED SOLUTION

Accepted Solutions
kevin.cho
Certified Master Anaplanner

You can achieve the same result in your example using the RANK function. Create a helper line item to rank (in example below, Count = 1), and then the Rank formula = RANK(Count, ASCENDING, MAXIMUM, TRUE, Name).

The combination of ASCENDING for the direction parameter, MAXIMUM for the equal behaviour parameter and Name for the grouping parameter will give you the highest index of matches. Note that the "Name" value will have to match exactly.

See example below:

Example.png

Example 2.png

PS: My screenshot doesn't show it, but the "Count" line item that is used to help do the rank a) doesn't need to be against the dimensionality of the module, and b) doesn't even need to be a line item itself! You can hard-code "1" into the RANK formula --> RANK(1, ASCENDING, MAXIMUM, TRUE, Name)

View solution in original post

11 REPLIES11
rob_marshall
Moderator

@jbrass

Josh,

扮演魔鬼的代言人,如果你有两个different individuals with the same name?

jbrass
Frequent Contributor

I can actually use email (which is a text field) instead of name to make it a tighter match. Understand that this may have holes, but this is just being used to flag potential issues so I'm not too concerned.

Thanks,

Josh

rob_marshall
Moderator

@jbrass

In that case, see if this. works:

  • Can you make a flat list out of the email addresses?
  • if so, then create a constant line item with a 1 as the value
  • then create a module based on the flat email list and do a sum off the constant line item. If it is >1, then you have duplicates.

Does that work for you?

Rob

jbrass
Frequent Contributor

Thanks Rob. That makes sense. I was trying to see if there was a way to do it without having to run an import process (to create that list), but it sounds like this is the only way.

Thanks again,

Josh

jbrass
Frequent Contributor

Thanks Rob. That’s the approach I’ll take. Was trying to see if I could skip the import process (that we’d use to populate that email list), but it sounds like that’s not possible.

Appreciate your help on this!

Josh

kevin.cho
Certified Master Anaplanner

You can achieve the same result in your example using the RANK function. Create a helper line item to rank (in example below, Count = 1), and then the Rank formula = RANK(Count, ASCENDING, MAXIMUM, TRUE, Name).

The combination of ASCENDING for the direction parameter, MAXIMUM for the equal behaviour parameter and Name for the grouping parameter will give you the highest index of matches. Note that the "Name" value will have to match exactly.

See example below:

Example.png

Example 2.png

PS: My screenshot doesn't show it, but the "Count" line item that is used to help do the rank a) doesn't need to be against the dimensionality of the module, and b) doesn't even need to be a line item itself! You can hard-code "1" into the RANK formula --> RANK(1, ASCENDING, MAXIMUM, TRUE, Name)

Misbah
Moderator

@kevin.choBrilliant. I think this can be turned into Best Practice Article.

kevin.cho
Certified Master Anaplanner

Thanks@Misbah:slightly_smiling_face:Will see how to do that (might want to add a bit more to it too!)

Misbah
Moderator

@kevin.cho

Sure, go for it. You can extend the capabilities of Rank function instead of just focusing on this particular use case. Looking at the hidden capabilities of Rank Function I started liking it even more.

Some of the limitations of Isfirstoccurrence (IFO) can easily be tackled with Rank function. For Example unlike IFO you can easily get the Rank 1s (Unique or in other words first occurred values) on any line item.