Similarity of Graphs

Description:

Graph analysis is an important area of data mining research, with wide applications in social network mining, community detection, and online advertising campaigns, to name a few. One of the central problems in graph analysis is measuring the similarity between two given (sub)graphs, both for graph matching in general and for comparing different communities in online social networks. Such community comparisons can in turn be exploited in many ways, for example to target online advertisements for new products.
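To make the notion of graph similarity concrete, here is a minimal sketch of one simple measure, the Jaccard similarity of two edge sets. It assumes that both graphs are undirected and share node labels; the function name and the example communities are hypothetical, and the measure is only one of many that could be used.

```python
def edge_jaccard(edges_a, edges_b):
    """Jaccard similarity of two undirected edge sets (1.0 = identical, 0.0 = disjoint)."""
    # Store each edge as a frozenset so that (u, v) and (v, u) count as the same edge.
    a = {frozenset(e) for e in edges_a}
    b = {frozenset(e) for e in edges_b}
    if not a and not b:
        return 1.0  # convention: two empty graphs are treated as identical
    return len(a & b) / len(a | b)


# Hypothetical friendship edges of two small communities.
community_a = [("ann", "bob"), ("bob", "cat"), ("cat", "ann")]
community_b = [("bob", "ann"), ("bob", "cat"), ("cat", "dan")]

similarity = edge_jaccard(community_a, community_b)
print(similarity)
```

The two communities share two of four distinct edges, so the score is 0.5. Measures of this kind break down when node labels are not shared, which is where graph matching comes in.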

Discovering structure in very large databases encompasses many research problems, among them detecting correlations among database columns, schema extraction, and schema matching. Schema matching in particular is concerned with matching two database schemas, given the relationships among their columns. It is a prerequisite for analytics over data dispersed across different, possibly large and heterogeneous databases, e.g., Facebook and Amazon co-purchase databases.

As we will demonstrate in the pro-seminar/seminar, graph analysis can be used to discover structure in very large databases. For instance, one of the most promising approaches to schema matching thus far represents each database schema as a graph of columns, so that schema matching boils down to matching two column graphs.
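The idea of matching two column graphs can be illustrated with a small sketch. It assumes that each schema is given as a list of columns plus a dictionary of edge weights describing how strongly two columns are related (for example, a correlation score), and it simply tries every one-to-one column mapping to find the one whose edge weights agree best. The function name, column names, and weights are hypothetical, and the brute-force search is only feasible for a handful of columns; it is not the method covered in the seminar.

```python
from itertools import permutations


def match_column_graphs(cols_a, weights_a, cols_b, weights_b):
    """Return (mapping, cost) for the one-to-one column mapping from schema A
    to schema B with the smallest total edge-weight difference.

    Assumes len(cols_a) <= len(cols_b); missing edges have weight 0.0.
    """
    def weight(weights, u, v):
        return weights.get(frozenset((u, v)), 0.0)

    best_mapping, best_cost = None, float("inf")
    # Try every way of assigning distinct columns of B to the columns of A.
    for perm in permutations(cols_b, len(cols_a)):
        mapping = dict(zip(cols_a, perm))
        # Sum the weight differences over all unordered column pairs of A.
        cost = sum(
            abs(weight(weights_a, u, v) - weight(weights_b, mapping[u], mapping[v]))
            for i, u in enumerate(cols_a)
            for v in cols_a[i + 1:]
        )
        if cost < best_cost:
            best_mapping, best_cost = mapping, cost
    return best_mapping, best_cost


# Hypothetical schemas: edge weights stand for the strength of the relationship.
cols_a = ["user_id", "age", "income"]
weights_a = {frozenset(("age", "income")): 0.8, frozenset(("user_id", "age")): 0.1}
cols_b = ["salary", "uid", "years"]
weights_b = {frozenset(("years", "salary")): 0.7, frozenset(("uid", "years")): 0.1}

mapping, cost = match_column_graphs(cols_a, weights_a, cols_b, weights_b)
print(mapping, cost)
```

Note that the mapping is found without looking at column names at all; only the structure of the two column graphs is used.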

Objective:

This seminar aims to provide students with knowledge of:

(a) graph similarity analysis and database structure discovery,

(b) the interactions between the two fields, in particular, how graph analysis provides new solutions to database structure discovery, and how database structure discovery in turn poses new challenges for graph analysis to address, and

(c) the applicability of both fields to the more general problem of comparing different patterns (e.g., the purchase pattern of the PS4 and that of the Xbox in Germany) as well as different structures (e.g., the friendship circle of KIT students and that of MIT students).

The knowledge obtained from this seminar is beneficial both for students who want to pursue a career in industry and for those who want to delve further into computer science research.