Scalable Automatic Data Repair

Mohamed Yakout (Inventor), Ahmed K Elmagarmid (Inventor), Laure Berti-Equille (Inventor)

Research output: Patent

Abstract

A computer-implemented method for generating a set of updates for a database comprising multiple records that include erroneous, missing and inconsistent values. The method comprises: using a set of partitioning functions to subdivide the records of the database into multiple subsets of records; allocating each record to at least one subset according to predetermined criteria for mapping records to subsets; applying multiple machine learning models to each of the subsets to determine candidate replacement values representing a tuple repair for a record, together with a probability for the candidate and current values of the record; and computing probabilities to select, from among the candidate replacement values, the replacement values that maximise the probability of the record's values in the updated database.
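The steps named in the abstract (partition, model each subset, score candidates, select the most probable value) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the function names `partition`, `candidate_probs` and `repair`, the sample records, the use of a simple in-block frequency estimate as a stand-in for the machine learning models, and the rule of summing probabilities across partitions to rank candidates.

```python
from collections import Counter, defaultdict


def partition(records, key_fn):
    """Group record indices into subsets using one partitioning function."""
    blocks = defaultdict(list)
    for idx, rec in enumerate(records):
        blocks[key_fn(rec)].append(idx)
    return blocks


def candidate_probs(records, block, attr):
    """Stand-in 'model' (assumption): relative frequency of each observed
    value of `attr` among the records of one subset."""
    counts = Counter(
        records[i][attr] for i in block if records[i][attr] is not None
    )
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()} if total else {}


def repair(records, attr, key_fns):
    """Return {record index: replacement value} for attribute `attr`."""
    # scores[i][value] accumulates the probability that each subset's model
    # assigns to `value` for record i (summing is an assumed combination rule).
    scores = defaultdict(lambda: defaultdict(float))
    for key_fn in key_fns:
        for block in partition(records, key_fn).values():
            probs = candidate_probs(records, block, attr)
            for i in block:
                for value, p in probs.items():
                    scores[i][value] += p

    updates = {}
    for i, by_value in scores.items():
        best = max(by_value, key=by_value.get)
        # Emit an update only when the best candidate differs from the
        # current value (which may be missing, i.e. None).
        if best != records[i][attr]:
            updates[i] = best
    return updates


# Hypothetical sample data: one misspelled and one missing city value.
records = [
    {"zip": "47906", "city": "West Lafayette", "state": "IN"},
    {"zip": "47906", "city": "West Lafayette", "state": "IN"},
    {"zip": "47906", "city": "W Lafayete", "state": "IN"},
    {"zip": "47906", "city": None, "state": "IN"},
    {"zip": "60601", "city": "Chicago", "state": "IL"},
]

# Two partitioning functions, so each record falls into two subsets.
key_fns = [lambda r: r["zip"], lambda r: r["state"]]

updates = repair(records, "city", key_fns)
print(updates)  # {2: 'West Lafayette', 3: 'West Lafayette'}
```

Because each subset is processed independently, the per-subset work in such a scheme could in principle be distributed, which is one plausible reading of "scalable" in the title; the sketch itself runs sequentially.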

Original language: English
Patent number: US2012303555
IPC: G06F 15/18 A I
Priority date: 25/05/11
Publication status: Published - 29 Nov 2012
