Big Data Integration
Until recently, structured (e.g., relational) and unstructured (e.g., textual) data were managed very differently: structured data was queried declaratively using languages such as SQL, while unstructured data was searched using Boolean queries over inverted indices. Today, we are witnessing the rapid emergence of Big Data Integration techniques that leverage knowledge graphs to bridge the gap between different types of content and to integrate unstructured and structured information more effectively. I will start this talk by giving a few examples of Big Data Integration. I will then describe two recent systems built in my lab that leverage such techniques: ZenCrowd, a socio-technical platform that automatically connects Web documents to semi-structured entities in a knowledge graph, and Guider, a Big Data Integration system for the cloud.
Philippe Cudre-Mauroux is a Full Professor and the Director of the eXascale Infolab at the University of Fribourg in Switzerland. He received his Ph.D. from the Swiss Federal Institute of Technology (EPFL), where he won both the Doctorate Award and the EPFL Press Mention in 2007. Before joining the University of Fribourg, he worked on information management infrastructures at IBM Watson (NY), Microsoft Research Asia, and MIT. He recently won the Verisign Internet Infrastructures Award, a Swiss National Center in Research award, a Google Faculty Research Award, and a 2 million Euro grant from the European Research Council. His research interests are in next-generation Big Data management infrastructures for non-relational data.