Abstract
A computer-implemented method comprising partitioning data representing an input instance of a database including multiple tuples into multiple fragments of tuples, detecting tuples which violate a data quality specification in respective ones of the fragments, selecting a data cleaning asset on the basis of characteristics of errors in detected tuples for a fragment and based on declared asset capabilities, assigning a selected data cleaning asset to the fragment, the selected data cleaning asset to provide a set of candidate corrections for the detected tuples in the fragment, providing data representing an output instance of the database in which detected tuples are replaced with selected candidate corrections.
Original language | English |
---|---|
Patent number | US2013275393 |
IPC | G06F 17/ 30 A I |
Priority date | 12/04/12 |
Publication status | Published - 17 Oct 2013 |