This is the second post in my two-part series about the types of intelligence Novetta has built into Novetta Entity Analytics. This intelligence is based on the knowledge Novetta’s team of computer science, data integration, and services experts has gained over the past 14 years consulting on many large data integration and entity analytics projects with our federal government customers.
In my first post, I gave an overview of the built-in intelligence in Novetta Entity Analytics that helps prepare data for entity resolution. In this post, I will cover the threshold settings, resolution strategies and conflict remediation processes we’ve added to help achieve the most accurate entity resolution and analytics results possible.
- Adjustable threshold settings ensure best matching results
Novetta Entity Analytics uses built-in entity resolution knowledge to automatically create threshold settings that users select to control how many false positive matches are returned when entities are resolved. Users choose the low, medium or high setting, depending on the type of analysis they are performing. EXAMPLE: A larger number of false positives or false negatives has minimal impact on a marketing campaign, but either can have significant consequences for law enforcement. Novetta Entity Analytics provides users with guidelines and recommendations for selecting the most appropriate threshold setting for their combination of data sources, use case and desired matching results. For users who want more granular control, the software also provides the ability to view histograms, specify precise threshold points, and get immediate feedback about how their choices affect potential match results.
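To make the idea of threshold presets concrete, here is a minimal sketch in Python. It is not Novetta Entity Analytics code: the 0-100 score scale, the cutoff values in `PRESETS`, the sample pairs and the helper names are all assumptions made for illustration, including the assumption that a "high" setting means a stricter cutoff.

```python
from collections import Counter

# Hypothetical preset cutoffs on a 0-100 match score (assumed, not product values).
PRESETS = {"low": 60, "medium": 75, "high": 90}

# Hypothetical candidate record pairs with their match scores.
scores = [("A-B", 97), ("A-C", 91), ("B-D", 82), ("C-E", 78),
          ("D-F", 66), ("E-G", 61), ("F-H", 45)]

def matches_at(scored_pairs, threshold):
    """Keep only the pairs whose score meets or exceeds the threshold."""
    return [pair for pair, score in scored_pairs if score >= threshold]

def score_histogram(scored_pairs, bin_width=10):
    """Count pairs per score bin, keyed by each bin's lower edge."""
    return Counter((score // bin_width) * bin_width for _, score in scored_pairs)

# Immediate feedback: how many matches each preset would return.
for name, cutoff in PRESETS.items():
    print(f"{name:>6} (>= {cutoff}): {len(matches_at(scores, cutoff))} matches")

# A coarse histogram of scores, the kind of view used to pick a precise cutoff.
print(sorted(score_histogram(scores).items()))
```

In this toy data, moving from the stricter to the looser preset returns more candidate matches, which is the trade-off between false negatives and false positives described above.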
- Resolution strategies streamline integration, increase accuracy
Novetta Entity Analytics includes a collection of pre-built, customizable resolution strategies to help correlate different data sources as needed. Our team has developed these strategies over the years as we’ve applied our technology to a range of use cases and devised new methods and best practices for matching various data combinations. We continually update the collection and add new resolution strategies as we gain new insights. EXAMPLE: Some data sets have few identifying attributes, such as phone numbers, addresses or account numbers, which makes it difficult to match records accurately. For this type of data, Novetta Entity Analytics selects a resolution strategy that adds information about the relationships between entities as a new data attribute and uses it to help match records during the resolution process. Novetta Entity Analytics automatically applies the best resolution strategies and optimal rule sets for specific combinations of data sources. The strategies and rules Novetta Entity Analytics creates and uses during the entity resolution process are based on the information-rich data characterization histograms and uniqueness values I discussed in my first built-in intelligence blog post.
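The relationship-as-attribute example can be sketched roughly as follows. This is an illustration under stated assumptions, not the product's algorithm: the `linked_to` field, the sample records, the Jaccard overlap measure and the equal weighting are all hypothetical choices.

```python
from difflib import SequenceMatcher

# Hypothetical records with few identifying attributes: a name, plus a derived
# "linked_to" attribute listing the parties each record is related to.
records = {
    "r1": {"name": "J. Doe", "linked_to": {"acme", "globex", "initech"}},
    "r2": {"name": "John Doe", "linked_to": {"acme", "globex", "initech", "umbrella"}},
    "r3": {"name": "J. Doe", "linked_to": {"hooli"}},
}

def name_similarity(a, b):
    """Character-level similarity between two names, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def relationship_overlap(a, b):
    """Jaccard overlap between two sets of related parties."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def match_score(r1, r2, rel_weight=0.5):
    """Blend name similarity with relationship overlap (weighting is assumed)."""
    name = name_similarity(r1["name"], r2["name"])
    rel = relationship_overlap(r1["linked_to"], r2["linked_to"])
    return (1 - rel_weight) * name + rel_weight * rel

print(match_score(records["r1"], records["r2"]))  # similar names, shared relationships
print(match_score(records["r1"], records["r3"]))  # identical names, no shared relationships
```

On names alone, r1 and r3 look identical. Once the relationship attribute is included, r1 and r2 score higher because they share most of their related parties.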
- Automated conflict remediation reduces over-chaining of entity records
Novetta Entity Analytics chains records together during the resolution process to form entities, and then detects potential conflicts by reviewing all records chained to each entity. The software applies its unique built-in conflict resolution rules to entities with over-chained records and either splits them into separate entities or removes the conflicting records from the entity. The built-in conflict rules in Novetta Entity Analytics leverage our extensive experience combining entities and identifying common resolution errors to achieve the highest entity accuracy rates available. EXAMPLE: Novetta Entity Analytics initially chains records with multiple similar names (Mike Smith, Michael Smith, Mike Smith Sr. and Mike Smith Jr.) and the same address into a single entity. The software then reviews the available information across all chained records within resolved entities to automatically resolve conflicts, such as breaking the “single” Mike Smith entity into two entities (father and son) living at the same residence. Novetta Entity Analytics allows users to resolve data using looser matching thresholds and then refine data accuracy by applying pre-built conflict rules to identify and remediate over-chained records. This more aggressive approach to resolution results in a better balance between false positive and false negative matches and provides greater overall data accuracy than other technologies.
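The Mike Smith example can be sketched as a two-step process: chain loosely, then remediate. Everything in the sketch is an assumption for illustration, including the `birth_year` field used to detect the conflict, the sample address and the specific rules; the actual conflict rules in Novetta Entity Analytics are not described in this post.

```python
from collections import defaultdict

# Hypothetical records; birth_year is an assumed extra attribute used to spot conflicts.
records = [
    {"id": 1, "name": "Mike Smith", "address": "12 Oak St", "birth_year": 1955},
    {"id": 2, "name": "Michael Smith", "address": "12 Oak St", "birth_year": 1982},
    {"id": 3, "name": "Mike Smith Sr.", "address": "12 Oak St", "birth_year": 1955},
    {"id": 4, "name": "Mike Smith Jr.", "address": "12 Oak St", "birth_year": 1982},
]

SUFFIXES = {"sr", "jr"}

def surname(name):
    """Return the lowercased last name, ignoring generational suffixes."""
    parts = [p.strip(".").lower() for p in name.split()]
    return [p for p in parts if p not in SUFFIXES][-1]

def chain_loosely(recs):
    """Loose matching: chain records that share a surname and an address."""
    entities = defaultdict(list)
    for rec in recs:
        entities[(surname(rec["name"]), rec["address"])].append(rec)
    return list(entities.values())

def remediate(entity):
    """Conflict rule: one person cannot have two birth years, so split on them."""
    by_year = defaultdict(list)
    for rec in entity:
        by_year[rec["birth_year"]].append(rec)
    return list(by_year.values())

chained = chain_loosely(records)  # one over-chained "Mike Smith" entity
resolved = [part for entity in chained for part in remediate(entity)]

for entity in resolved:
    print([rec["name"] for rec in entity])
```

The loose first pass deliberately over-chains all four records into one entity. The conflict rule then splits that entity in two, mirroring the father-and-son outcome described above.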
Delivering automated data integration and data science expertise
The examples described above, and in part one of my built-in intelligence blog post, cover some of the knowledge we’ve included in Novetta Entity Analytics to allow users of any skill level to easily combine diverse data sets and resolve entities within them. In addition, the software has precise and easy-to-use visual mapping tools, and powerful relationship identification, entity augmentation and knowledge development processes.
Novetta Entity Analytics users don’t have to be data integration or entity resolution experts because they can rely on the breadth and depth of Novetta’s experience built into the software. In fact, any organization that deploys Novetta Entity Analytics can expect to realize immediate business results without having to hire a team of data scientists and data integration specialists to get the job done.