Database Clean outside of RE - Merging of resulting duplicates

Hi,


We are a UK-based data processing bureau, currently pitching for a data cleanse project in which the client would supply us with their whole RE database. We would:

Identify and resolve poor-quality address information.

Identify and link together duplicates that RE has previously been unable to identify (we have more sophisticated matching techniques).

 

Could anyone offer some guidance on what we need to take into account when supplying the results of the processing back to the client?

 

For example, is there a tool within RE that could use the ‘link’ we would have created between the matches to merge the records automatically? Or, so that we have more control, would it be better for us to merge the records outside of RE (creating a single record for each set of duplicates) and then replace the duplicates within RE?
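
To make the ‘link’ idea concrete, here is a minimal Python sketch of how matched pairs could be collapsed into duplicate sets and turned into a list of (source, target) merge instructions. Everything in it is an assumption for illustration: the record IDs are made up, the function names are hypothetical, and choosing the smallest ID as the surviving record is only a placeholder for whatever survivorship rule the client agrees. It says nothing about what RE itself can import.

```python
from collections import defaultdict


def group_duplicates(pairs):
    """Collapse matched ID pairs into sets: {surviving_id: [duplicate_ids]}."""
    parent = {}

    def find(x):
        # Walk up to the representative of x's set, shortening the path as we go.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Assumption: the smallest ID survives. Swap in a real rule here.
            if root_b < root_a:
                root_a, root_b = root_b, root_a
            parent[root_b] = root_a

    groups = defaultdict(list)
    for record_id in parent:
        root = find(record_id)
        if record_id != root:
            groups[root].append(record_id)
    return {master: sorted(dups) for master, dups in groups.items()}


def merge_instructions(pairs):
    """Flatten the duplicate sets into (source_id, target_id) rows."""
    groups = group_duplicates(pairs)
    return [(dup, master) for master in sorted(groups) for dup in groups[master]]


if __name__ == "__main__":
    # Hypothetical links produced by the external matching run.
    links = [("C003", "C001"), ("C002", "C003"), ("C010", "C011")]
    for source, target in merge_instructions(links):
        print(f"merge {source} into {target}")
```

Grouping first matters because links are transitive: if A matches B and B matches C, all three belong to one set and should end up on a single surviving record, whichever side does the merging.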

 

Many thanks in advance for any advice.

 

Kind Regards

 

Mark

 

Comments

  • I believe all the merging could be done outside of RE, but you would still need to delete the "old" records once that work is complete. RE has a good merge feature that can merge two records together, with an option to delete the source record. This may be the most efficient option for your project.


    Good luck

     

     

  • Hi John,


    Thank you for your ‘Frank’ response.


    I would suggest you are being slightly presumptuous in suggesting that we are inadequate in our ability to support our client, given that you have no knowledge of what services we supply or our ability to deliver them.


    Ultimately this data needs to come out of the system, as there are issues with it that clearly cannot be resolved within RE. I will not go into detail here, but there are address quality issues that will require the data to be run against the UK PAF file, and there is also a need to match the data against other industry files for flagging. The duplication issue (as I understand it) is still present after the internal matching routines have been performed.


    Our area of expertise revolves around data quality, and ultimately there are data quality issues here that need to be resolved. I’m not suggesting that the data has to be merged outside of RE. At this stage I was attempting to ascertain whether the client has any options other than going through the merges one by one.


    Kind Regards


    Mark  
  • Aren't there third-party developers out there that have developed products to help with more sophisticated merging? Like Omatic or Zeidman?

     
