Data Fusion and Conflict Resolution tool for Linked Data
The value of Linked Data lies in the ability to link pieces of data, even across data sources. A data integration process can provide a unified view of the data. To realize this potential for semantic applications and for enterprise environments, where data integration is crucial, tools are needed that cover every step of the integration process.
LD-FusionTool covers the Data Fusion step of the integration process for RDF, in which data are merged to produce consistent and clean representations of objects, and conflicts that emerge during data integration are resolved. This involves several tasks that LD-FusionTool tackles:
In addition, LD-FusionTool leverages the structure of the RDF graph by taking dependencies between properties and resources into account, which simplifies data fusion and improves the quality of the results.
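To make the idea of conflict resolution concrete, here is a minimal, self-contained Java sketch. It is not LD-FusionTool's API or its actual algorithm: the class `ConflictResolutionSketch`, the `Quad` type, and the simple voting strategy are all hypothetical names and choices made for illustration. It assumes that statements sharing a subject and a property are in conflict, and that the value appearing in the most statements wins.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: hypothetical names, not part of LD-FusionTool.
public class ConflictResolutionSketch {

    /** A simplified RDF statement plus the source (named graph) it came from. */
    static final class Quad {
        final String subject;
        final String property;
        final String value;
        final String source;

        Quad(String subject, String property, String value, String source) {
            this.subject = subject;
            this.property = property;
            this.value = value;
            this.source = source;
        }
    }

    /** Groups statements that describe the same property of the same subject. */
    static Map<String, List<Quad>> groupConflicts(List<Quad> quads) {
        Map<String, List<Quad>> clusters = new LinkedHashMap<>();
        for (Quad q : quads) {
            String key = q.subject + " " + q.property;
            clusters.computeIfAbsent(key, k -> new ArrayList<>()).add(q);
        }
        return clusters;
    }

    /**
     * Assumed strategy: keep the value that appears in the most statements.
     * Ties are broken in favour of the value seen first.
     */
    static String resolveByVote(List<Quad> cluster) {
        Map<String, Integer> votes = new LinkedHashMap<>();
        for (Quad q : cluster) {
            votes.merge(q.value, 1, Integer::sum);
        }
        String best = null;
        int bestVotes = 0;
        for (Map.Entry<String, Integer> e : votes.entrySet()) {
            if (e.getValue() > bestVotes) {
                best = e.getKey();
                bestVotes = e.getValue();
            }
        }
        return best;
    }

    /** Produces one fused value per subject/property pair. */
    static Map<String, String> fuse(List<Quad> quads) {
        Map<String, String> fused = new LinkedHashMap<>();
        for (Map.Entry<String, List<Quad>> e : groupConflicts(quads).entrySet()) {
            fused.put(e.getKey(), resolveByVote(e.getValue()));
        }
        return fused;
    }

    public static void main(String[] args) {
        // Made-up example data: three sources disagree on one property.
        List<Quad> input = Arrays.asList(
            new Quad("ex:Prague", "ex:population", "1300000", "ex:graphA"),
            new Quad("ex:Prague", "ex:population", "1300000", "ex:graphB"),
            new Quad("ex:Prague", "ex:population", "1250000", "ex:graphC"),
            new Quad("ex:Prague", "rdfs:label", "Praha", "ex:graphA"));
        fuse(input).forEach((key, value) -> System.out.println(key + " -> " + value));
    }
}
```

In the real tool, which strategy applies to which property is driven by the XML configuration rather than hard-coded as in this sketch.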
The figure in this section depicts a high-level architecture of LD-FusionTool.
For more information about how LD-FusionTool works, please refer to the following publications:
LD-FusionTool is distributed as an executable Java .jar archive. You can download the binary along with examples here.
LD-FusionTool can be run with Java, e.g., with the following command:
java -jar odcsft-application-<version>-executable.jar
Running it without parameters prints out the usage message with possible options.
What is fused, and how, is configured in an XML configuration file. See the examples, which also serve as documentation of the configuration XML, and the wiki page about conflict resolution configuration.
LD-FusionTool was originally developed for the Linked Data processing framework ODCleanStore (hence its original name, ODCS-FusionTool), which has been superseded by the ETL tool UnifiedViews, developed as a joint project of Charles University, Semantic Web Company GmbH, Semantica.cz s.r.o., and EEA s.r.o.
The version for UnifiedViews is available as a Data Processing Unit on GitHub.