Learning Based Data Quality Enhancement

Application Description:

Data must be consistent, complete, and accurate before analysis can be performed; otherwise the result is garbage in, garbage out. Data science techniques are used to de-duplicate records, handle missing values, and make information consistent across the dataset.
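As a rough illustration of these three steps, the sketch below normalizes string fields, drops exact duplicates, and fills missing numeric values. It is a minimal example and not the project's actual pipeline: the function names (`normalize`, `deduplicate`, `impute_median`, `clean`), the sample records, and the choice of median imputation are all assumptions made for illustration.

```python
from statistics import median


def normalize(record):
    """Make string fields consistent: trim whitespace and lowercase."""
    return {
        k: v.strip().lower() if isinstance(v, str) else v
        for k, v in record.items()
    }


def deduplicate(records, key_fields):
    """Keep the first record seen for each combination of key fields."""
    seen = set()
    unique = []
    for r in records:
        key = tuple(r.get(f) for f in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(r)
    return unique


def impute_median(records, field):
    """Fill missing values of a numeric field with the observed median.

    Median imputation is an assumed, simple strategy; other approaches
    (mean, model-based, etc.) could be swapped in.
    """
    observed = [r[field] for r in records if r.get(field) is not None]
    if not observed:
        return records  # nothing to learn a fill value from
    fill = median(observed)
    return [
        {**r, field: fill if r.get(field) is None else r[field]}
        for r in records
    ]


def clean(records, key_fields, numeric_field):
    """Run the three steps in order: consistency, de-duplication, imputation."""
    normalized = [normalize(r) for r in records]
    unique = deduplicate(normalized, key_fields)
    return impute_median(unique, numeric_field)


# Hypothetical sample data: one near-duplicate and one missing age.
raw = [
    {"name": " Alice ", "city": "Paris", "age": 30},
    {"name": "alice", "city": "paris", "age": 30},
    {"name": "Bob", "city": "Berlin", "age": None},
    {"name": "Carol", "city": "Rome", "age": 40},
]
cleaned = clean(raw, key_fields=("name", "city"), numeric_field="age")
```

Normalizing before de-duplicating matters here: the two "Alice" records only match once whitespace and case differences are removed.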

What’s different

  • Provide an end-to-end data quality enhancement pipeline
  • Data quality score tracking to ensure the score stays above a set threshold
  • Data processing, transformation and integration from heterogeneous sources
  • Efficient handling of missing data using a variety of approaches
  • Robust and rapid quality enhancement with minimal impact on downstream steps
  • Advanced machine learning methods to improve the quality score on a continual basis
  • Consistent de-duplication to remove as many duplicates as possible
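
One way to picture the quality score tracking mentioned above is sketched below. The scoring formula (an unweighted average of completeness and uniqueness), the function names, and the 0.9 threshold are assumptions for illustration only; the project's own score definition is not given here.

```python
def completeness(records, fields):
    """Fraction of expected cells that are not missing."""
    total = len(records) * len(fields)
    if total == 0:
        return 1.0
    filled = sum(1 for r in records for f in fields if r.get(f) is not None)
    return filled / total


def uniqueness(records, key_fields):
    """Fraction of records that are distinct on the key fields."""
    if not records:
        return 1.0
    distinct = {tuple(r.get(f) for f in key_fields) for r in records}
    return len(distinct) / len(records)


def quality_score(records, fields, key_fields):
    """Assumed score: plain average of completeness and uniqueness."""
    return (completeness(records, fields) + uniqueness(records, key_fields)) / 2


def check_threshold(history, score, threshold):
    """Record the score and report whether it meets the threshold."""
    history.append(score)
    return score >= threshold


# Hypothetical batch: one missing value and one duplicated id.
batch = [
    {"id": 1, "value": 10},
    {"id": 2, "value": None},
    {"id": 2, "value": 7},
    {"id": 3, "value": 5},
]
history = []
score = quality_score(batch, fields=("id", "value"), key_fields=("id",))
ok = check_threshold(history, score, threshold=0.9)
```

For this batch, completeness is 7/8 and uniqueness is 3/4, so the score is 0.8125 and the check fails against the assumed 0.9 threshold, which would be the signal to run further enhancement before downstream steps.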

Related Projects