Repository logo
Log In
  1. Home
  2. Functional Dependency Discovery: An Experimental Evaluation of Seven Algorithms
 
  • Details

Functional Dependency Discovery: An Experimental Evaluation of Seven Algorithms

ISSN
2150-8097
Book Title
Proceedings of the International Conference on Very Large Data Bases (PVLDB)
Date Issued
2015
Author(s)
Papenbrock, Thorsten
Ehrlich, Jens
Marten, Jannik
Neubert, Tommy
Rudolph, Jan-Peer
Schönberg, Martin
Zwiener, Jakob
Naumann, Felix  
DOI
10.14778/2794367.2794377
URL
http://dx.doi.org/10.14778/2794367.2794377
Abstract
Functional dependencies are important metadata used for schema normalization, data cleansing and many other tasks. The efficient discovery of functional dependencies in tables is a well-known challenge in database research and has seen several approaches. Because no comprehensive comparison between these algorithms exist at the time, it is hard to choose the best algorithm for a given dataset. In this experimental paper, we describe, evaluate, and compare the seven most cited and most important algorithms, all solving this same problem.
First, we classify the algorithms into three different categories, explaining their commonalities. We then describe all algorithms with their main ideas. The descriptions provide additional details where the original papers were ambiguous or incomplete. Our evaluation of careful re-implementations of all algorithms spans a broad test space including synthetic and real-world data. We show that all functional dependency algorithms optimize for certain data characteristics and provide hints on when to choose which algorithm. In summary, however, all current approaches scale surprisingly poorly, showing potential for future research.
Project(s)
Metanome,profiling
File(s)
Loading...
Thumbnail Image
Name

p1897-papenbrock.pdf

Size

842.12 KB

Format

Adobe PDF

Checksum

(MD5):bbd3d8ce85e43668cf6b9acdcbc012ea

Target group entry
  • Prospective Students
  • Students
  • Researchers
  • Alumni
  • Press
  • Partnership
Top Level Research At HPI
  • Research Clusters
  • Research Groups
  • Research Schools
  • Academic Partnerships
  • HPI Data Center
  • PhD Program
  • Apply now
  • Contact
  • Accessibility
  • Imprint
  • Data Protection
  • General Terms and Conditions
  • Consumers Right of Revocation
  • House Regulations
  • Shop
  • Open Positions

Built with DSpace-CRIS software - Extension maintained and optimized by 4Science

  • Cookie settings
  • Privacy policy
  • End User Agreement
  • Send Feedback