The joys of inheriting a mess: How to import from a list without Const ID or Import ID

We have a list of 3,500+ constituents from an event mailing we did last year that I am being asked to recreate in RE.  Should be easy...but the constituent ID and import ID were not included on the export file, and no action, appeal, or event record was added to the constituents' records.


So, the puzzle begins: how do I take a mailing list of 3,500+ constituents that includes addressee, salutation, address, city, state, and zip, and somehow import it into RE without creating 3,500+ duplicate records?  All I want to do is add an attribute to the constituents' records so that I have some indication of their mailing interests.


Put on your creative problem-solving hats and let's hear how you would approach this!! 


Thanks so much for all of your help! 


Rebecca

Comments

  • Any chance you have import-o-matic or importacular?
  • Unfortunately, No.
  • Ryan Hyde....you are a GENIUS!!!!

    I had jumped to the vlookup step, but knew there had to be something before that which I was missing.  Thank you so much for breaking it down like that.  I am going to get started on this straight away! 


    Thank You!

    Rebecca


     

  • Rebecca Timberlake:

    Ryan Hyde....you are a GENIUS!!!!

    I had jumped to the vlookup step, but knew there had to be something before that which I was missing.  Thank you so much for breaking it down like that.  I am going to get started on this straight away! 


    Thank You!

    Rebecca


     

     

    haha, thanks! I hope it works out for you :)

  • Melissa Graves:

    I would just be wary of assuming the #N/As are all new constituents. Some could have a new address since last year, and if that is part of the unique ID it will not match. Especially since these were all originally exported from RE, you may want to build a less unique "unique ID" with just name info and use a verification process on those matches to help avoid creating dupes.

    Yes, they're "theoretically" new records. Depending on how many you have, you'll either want to manually search for them, make a less stringent key (whole first name plus whole last name?), or import them and run the dedupe module if you don't already have a mountain of possible dupe records in there like I do.
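
    The key-plus-lookup approach described above can be sketched in Python with pandas — a minimal illustration, not anyone's actual workflow, and all column names here are assumptions. The merge plays the role of Excel's VLOOKUP, and rows that come back without an ID are the #N/As (theoretical new constituents that still need a duplicate check).

```python
import pandas as pd

# Full database export and the incoming mailing list (columns are assumed).
db = pd.DataFrame({
    "ConstituentID": [101, 102, 103],
    "FirstName": ["Mary", "John", "Ann"],
    "LastName": ["Smith", "Doe", "Lee"],
    "Zip": ["02101", "10001", "60601"],
})
mailing = pd.DataFrame({
    "FirstName": ["Mary", "John", "Pat"],
    "LastName": ["Smith", "Doe", "Kim"],
    "Zip": ["02101", "10001", "94105"],
})

# Build a less-strict "unique" key from name info only, as suggested above.
for df in (db, mailing):
    df["Key"] = df["FirstName"].str.lower() + "|" + df["LastName"].str.lower()

# A left merge is the VLOOKUP step: matched rows pick up a ConstituentID.
merged = mailing.merge(db[["Key", "ConstituentID"]], on="Key", how="left")

# Rows with no ID are the Excel #N/As -- theoretically new constituents
# that still need manual verification before importing.
new_rows = merged[merged["ConstituentID"].isna()]
print(new_rows[["FirstName", "LastName"]])
```

    Loosening or tightening the key (name only vs. name plus zip) trades false matches against missed matches, which is exactly the trade-off discussed in this thread.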

  • That's awesome!!! Glad this worked out :)
  • Ryan Hyde:

    ...


     
    You can then do any importing and updating that you need.



    This is a mess, and if you're not comfortable with Excel, it might be a little too overwhelming, but I've had to match up giant lists of data like this before (Thanks, United Way!) and it's the only solution I personally have found to reduce the overall work load. 

    And this is essentially what any importer integration does in the background, perhaps with some fuzzy matching, aliased names (Mary/Marie, Stefan/Steven, etc). If you've ever put a Plug-In together you can roll your own if you have the patience.

  • Yes! Exactly. It's not uncommon that I receive lists without a constituent identification field, and that's the solution that came to me, too. I have columns for Primary Addressee, Name, and Spouse Name, as well as for email addresses, so I have more ways to find them. And, I update the file during the (school) year.

  • Faith Murray:

    Nobody here has mentioned the obvious and easy solution in RE Import itself. Select "Import new records", "Validate Data only" and then "Use duplicate criteria for new records". You can do this without a constituent ID. It then creates an exception report with all identified duplicates flagged. See the KB article here. I have found RE's duplicate search mechanism superior to trying to find matches in Excel, because RE can use "fuzzy" match criteria to identify potential dupes. In Excel, a column not spelled quite the same won't match up.

    Whoa! I didn't know that's what that was for at all. That's just one of those checkboxes I started ignoring ages ago and never even considered.


    That said, I have no personal experience with this (obviously) so I can't say how well it works. I've always thought that RE's duplicate criteria are a little too fuzzy, and the tool has a tendency to give out too many false positives, but I've never used it for a smaller set of constituents like this, so maybe the results are a lot more useful. 

  • I will give that a try next time! Thanks!
  • Faith Murray:

    Nobody here has mentioned the obvious and easy solution in RE Import itself. Select "Import new records", "Validate Data only" and then "Use duplicate criteria for new records". You can do this without a constituent ID. It then creates an exception report with all identified duplicates flagged. See the KB article here. I have found RE's duplicate search mechanism superior to trying to find matches in Excel, because RE can use "fuzzy" match criteria to identify potential dupes. In Excel, a column not spelled quite the same won't match up.

    Hey folks! I had an opportunity to test out Faith's method, and it's pretty great. Now, the list that I had only included first name and last name, so there were some records that resulted in 10+ possible dupes (imagine what would happen if one of your names was "John Smith"), but if I had more criteria I imagine the results would be tighter.


    The only gripe I have is that it's difficult to isolate the constituent ID number from the report. I exported the report as a .csv file to try to match it up to my spreadsheet more easily, but the output that gives you the potential dupe spreads over two lines, as such:

     
    Validation error: Duplicate record found.
    Possible matches (IDs): 56978

    Both of those lines exist in a single cell.


    I tried to do text-to-columns based on the : to isolate the constituent ID, but something about the formatting within the cell just erases the entire 2nd line. 


    Does anyone have a workaround for this sort of thing?

  • Can you use the RIGHT function first, to knock the first 60-something characters off the front of the error message (assuming it's always the same for each line) before you do the text-to-columns? Or you could even do a Find and Replace on the whole sheet to replace the string "Validation error: Duplicate record found. Possible matches (IDs): " with nothing, if that's easier.
  • Ryan Hyde:

    The formula is: =TRIM(RIGHT(SUBSTITUTE(text," ",REPT(" ",100)),100))


    And here's the article: https://exceljet.net/formula/get-last-word

     

    That is a piece of twisted genius and I LOVE IT.
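
    For anyone post-processing the exception .csv outside Excel, the same "grab the last word" trick is easy to sketch in Python. The cell text below mimics the two-line validation message from earlier in the thread; the exact wording of the report is an assumption.

```python
# A two-line exception-report cell, as described above (wording assumed).
cell = "Validation error: Duplicate record found.\nPossible matches (IDs): 56978"

# The Excel formula pads every space to 100 spaces (SUBSTITUTE + REPT),
# so the last 100 characters can only contain the final word (RIGHT),
# which TRIM then cleans up.
padded = cell.replace(" ", " " * 100)
last_word = padded[-100:].strip()
print(last_word)  # 56978

# In Python you can skip the padding trick entirely: split on any
# whitespace (including the embedded newline) and take the last token.
assert cell.split()[-1] == last_word
```

    This breaks down if a row lists more than one possible match ID, which the "10+ possible dupes" case mentioned above would produce — those rows would need the whole "Possible matches" tail instead of just the last word.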

  • Melissa Graves:

    This is the thing I do not like about the import exception route: do you just take for granted that RE found the correct duplicate and then import your data to it without checking?  All you get is an ID#.


    With the vlookup option you really can pull in data from the two sheets and visually inspect the important fields from the two sources and confirm the match. I probably would do this before any import.

    The more I think about this the more I think I'm with you, Melissa. However, what if you used Faith's method, got a list of IDs, imported that list as a constituent update and output a query, then took that query and output the fields you'd like to inspect . . . it's roundabout, but you'd end up with a list you could visually inspect, and if you're starting with my original first step (output everyone in your database), this could actually take less time. I've got 180k+ records in my database.

  • Ryan Hyde:

    Faith Murray:

    Nobody here has mentioned the obvious and easy solution in RE Import itself. Select "Import new records", "Validate Data only" and then "Use duplicate criteria for new records". You can do this without a constituent ID. It then creates an exception report with all identified duplicates flagged. See the KB article here. I have found RE's duplicate search mechanism superior to trying to find matches in Excel, because RE can use "fuzzy" match criteria to identify potential dupes. In Excel, a column not spelled quite the same won't match up.

    Hey folks! I had an opportunity to test out Faith's method, and it's pretty great. Now, the list that I had only included first name and last name, so there were some records that resulted in 10+ possible dupes (imagine what would happen if one of your names was "John Smith"), but if I had more criteria I imagine the results would be tighter.


    The only gripe I have is that it's difficult to isolate the constituent ID number from the report. I exported the report as a .csv file to try to match it up to my spreadsheet more easily, but the output that gives you the potential dupe spreads over two lines, as such:

     
    Validation error: Duplicate record found.
    Possible matches (IDs): 56978

    Both of those lines exist in a single cell.


    I tried to do text-to-columns based on the : to isolate the constituent ID, but something about the formatting within the cell just erases the entire 2nd line. 


    Does anyone have a workaround for this sort of thing?

     

    I use text-to-columns twice: first by Fixed Width and then by space (I think). It definitely works, but I can't test it right now.

  • Ryan Hyde:

    Melissa Graves:

    This is the thing I do not like about the import exception route: do you just take for granted that RE found the correct duplicate and then import your data to it without checking?  All you get is an ID#.


    With the vlookup option you really can pull in data from the two sheets and visually inspect the important fields from the two sources and confirm the match. I probably would do this before any import.

    The more I think about this the more I think I'm with you, Melissa. However, what if you used Faith's method, got a list of IDs, imported that list as a constituent update and output a query, then took that query and output the fields you'd like to inspect . . . it's roundabout, but you'd end up with a list you could visually inspect, and if you're starting with my original first step (output everyone in your database), this could actually take less time. I've got 180k+ records in my database.

     

    Just copy the column of possible dup const IDs into the "Const ID one of" in a query, output your desired fields, and compare it to your import file. Or click through the records. Maybe this is what you meant, but an import really isn't necessary here. My yearly import process is pretty convoluted and has lots of dups (75%?), so I end up doing a lot of manual/global updates, but comparing query output to a spreadsheet or to query output from the source database definitely saves some headaches!
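
    That side-by-side comparison step can be sketched in pandas — a minimal illustration only, with all column and field names assumed. The idea is to join the RE query export to the import file on the suggested duplicate ID so the corresponding name and zip fields sit next to each other for visual inspection.

```python
import pandas as pd

# Query export from RE for the suggested duplicate IDs (columns assumed).
query_out = pd.DataFrame({
    "ConstituentID": [56978, 60012],
    "RE_Name": ["John Smith", "Mary Jones"],
    "RE_Zip": ["02101", "10001"],
})

# The import file, with the possible-duplicate ID pulled from the
# exception report attached to each row.
import_file = pd.DataFrame({
    "PossibleID": [56978, 60012],
    "List_Name": ["John Smith", "Mary Jones-Baker"],
    "List_Zip": ["02101", "10001"],
})

# Join the two sources on the suggested ID so each proposed match
# can be eyeballed before any record is updated.
review = import_file.merge(
    query_out, left_on="PossibleID", right_on="ConstituentID", how="left"
)
print(review[["PossibleID", "List_Name", "RE_Name", "List_Zip", "RE_Zip"]])
```

    Rows where the list fields and the RE fields disagree (like the hyphenated surname here) are the ones worth clicking into before trusting RE's suggested match.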

  • I'd still go with the roundabout route Ryan proposed for checking RE's dupe suggestions. I firmly believe that comparing two Excel sheets is not accurate enough. When importing lists from other departments and other organizations, there have been so many times when RE caught a dupe that had been missed in Excel (even using a professional service to dedupe the two Excel sheets). This is because the Excel sheets cannot take into account alternate addresses, former addresses, alternate names, relationships, businesses that have different addresses, etc. A lot of the lists you get in this industry have outdated addresses and emails in them because other people don't keep their lists clean. My preference is to use three sweeps: 1) run the list past RE Import for dupes; 2) isolate the dupe constituent IDs, copy them into a Query, export the relevant data, and do a manual comparison; then 3) import/update the adjusted list.


    Ryan, btw, I loved the Excel formula! That looks like so much fun. Now I've got to try it. :)
  • Just copy the column of possible dup const IDs into the "Const ID one of" in a query, output your desired fields, and compare it to your import file. 

    Duh, yes, this is a much better way to do it. Thanks!
