How to Find Potential Duplicates in 7.93

Blackbaud has kept refining the algorithm for finding duplicates to the point that the "Duplicate Constituent Tool" is now useless (not even going near the fact that they removed the option for creating an Output Query from it). So how are organizations going about finding potential duplicates?


I ran it this morning and it found 4 records in our database: 1 with 0 potential duplicates (then why did it include him?) and 3 who share the first name Robert and the same middle initial. Not another piece of data in their records is remotely similar: not their house number, street, city, zip, date of birth, etc.


And yet, I know, from day-to-day work, we are finding duplicate records, especially since we recently merged with another organization similar to ours.


How are other users looking for and finding REAL potential duplicates? Queries? If so, on what kinds of criteria that won't return most of the database?

Comments

  • I know this is an extremely roundabout way, but I'm not sure of another way right now. We export our information into Excel and then use conditional formatting to highlight duplicates based on certain fields. We then grab the ID numbers, put them into a query, and make the changes within the system.


    You are right, they really messed up the duplicate tool by making changes to something that wasn't really broken in the first place.
  • Yes, this is a huge problem in RE.


    Fortunately, we recently began the process of migrating our web presence over to the Luminate platform, and one of the phases prior to syncing RE with Luminate was a duplicate record search. They went in and thoroughly searched for duplicates on several parameters, and while this wouldn't necessarily be easy (much of the work would be done in Excel), I can see how to do it.


    Here are the parameters with the best results:
    1. Duplicate email address (export all email addresses, do an advanced sort on the email line to pull out all unique email addresses, and then use the "COUNTIF" function to count how many records share that email address. In RE, search by specific email address and decide if this is a real duplicate or not)
      1. This may or may not prove super useful, depending on how many email addresses you have. Also, duplicates without email addresses obviously won't be located.
    2. Find records with the same last name + first initial + address line 1. Here's how I would do this on my own:
      1. Export all records from RE and include constituent ID, First name, Last name, Address line 1
      2. Create a unique key for each record that uses the first initial, whole last name, and the first, say, 7 characters in address line 1 (e.g., "=LEFT(B1,1)&C1&LEFT(D1,7)").
      3. Copy and paste this new column over itself and use the "paste special" function that leaves you with only the values, not the formula (not sure if this step is necessary, but I think it is)
      4. Next to the unique keys, do a COUNTIF function to count how many records share each key (e.g., "=COUNTIF(E:E,E2)").
      5. Sort by your new count column (in this scenario, with the keys in column E, this is column F) from largest to smallest, and then delete all rows where the value = 1. Those are rows that show no potential duplicates.
      6. Create some sort of novel import so that you can pipe this spreadsheet into a query (we don't ever use the "inactive" checkbox in RE, so I just create an import with the constituent ID and a column for "record is inactive" with the value "no" and pipe that into a query. Easy peasy.)
      7. Pipe this query into a query list, and sort that query list by the address field.
      8. Now you can go through record by record and either merge the appropriate records if there is indeed a duplicate or you can remove individual records from the query list. 
    I got rid of about 700 duplicate records this way.
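
For anyone who would rather script the Excel steps above, here is a rough Python sketch of the same two checks: shared email address, and first initial + last name + first 7 characters of address line 1. It is only an illustration, not anything built into RE. The column headers ("Constituent ID", "First name", "Last name", "Address line 1", "Email") and the sample rows are made up, so swap in whatever your own export uses. Trimming and upper/lower-casing the values is an extra step that the Excel formula does not do.

```python
import csv
import io
from collections import defaultdict


def match_key(first_name, last_name, address_line_1, address_chars=7):
    """Build the key: first initial + whole last name + start of address line 1."""
    # Trimming and upper-casing are additions to the spreadsheet formula,
    # so that stray spaces or capitalization do not hide a match.
    first = first_name.strip().upper()
    last = last_name.strip().upper()
    address = address_line_1.strip().upper()
    return first[:1] + last + address[:address_chars]


def group_by(rows, key_func):
    """Return {key: [constituent IDs]} for every key shared by 2+ rows."""
    groups = defaultdict(list)
    for row in rows:
        key = key_func(row)
        if key:  # rows with a blank key (e.g. no email) are skipped
            groups[key].append(row["Constituent ID"])
    # Same idea as deleting the rows where COUNTIF = 1
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


# Hypothetical export; in practice, open the CSV file exported from RE instead.
SAMPLE_EXPORT = """Constituent ID,First name,Last name,Address line 1,Email
1001,Robert,Smith,123 Main Street,rsmith@example.org
1002,Rob,Smith,123 Main St.,rob.smith@example.org
1003,Roberta,Jones,9 Elm Road,rjones@example.org
1004,R.,Jones,45 Oak Avenue,RJones@example.org
1005,Ann,Lee,7 Pine Court,
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_EXPORT)))

name_dupes = group_by(
    rows,
    lambda r: match_key(r["First name"], r["Last name"], r["Address line 1"]),
)
email_dupes = group_by(rows, lambda r: r["Email"].strip().lower())

print("Same initial + last name + address:", name_dupes)
print("Same email address:", email_dupes)
```

The IDs in each group are only candidates. You would still import them into a query and review each pair in RE before merging, as in steps 6 to 8.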

     
  • Another option is MergeOmatic from Omatic Software. It is a free tool that installs to your Plug-Ins module. I have used this in addition to Access dupe checks.


    http://www.omaticsoftware.com/Solutions/RaisersEdge/MergeOmatic.aspx
  • Spencer, you beat me to the punch. I also recommend the Omatic plug-in. It's a great duplicate tool: you can tell it to skip a record or to make one record primary over the other. In your output columns you can include giving totals, their NetCommunity user ID, and much more. That makes it really easy to tell which is the "main" record.


    You can also set it to automatically add a note to the main record that recordXYZ was merged into this one, and it will also add an alias of the previous record ID number to the remaining record.


    Love it!
