Center for Nonlinear Studies

Wednesday, December 18, 2013
3:00 PM - 4:00 PM
CNLS Conference Room (TA-3, Bldg 1690)
Seminar
Semantic Graph Database Analytics: Complex is the New Big Data
Cliff Joslyn
PNNL
Two of "big data"'s canonical "Four V" values are Volume and Variety (the others being Velocity and Veracity). Addressing high-volume information which is simple and homogeneous is itself enough of a problem; but when combined with complexity and heterogeneity, computational challenges can be exponentiated. PNNL is advancing the concepts and technologies around Semantic Graph Databases (SGDBs) as a NoSQL database paradigm for handling big, complex, open-world data. Where traditional relational databases (RDBs) are structured as rows and columns, typed by schemata, and queried through SQL, SGDBs are structured as labeled nodes and edges, typed by ontologies, and queried through graph-like languages including SPARQL. While not all graph data are best addressed in SGDBs, SGDB concepts and structures arise naturally when graph data have directionality and a variety or complexity of labels and attributes carrying the semantic and logical content of the information being represented. In this talk I will introduce SGDB concepts and survey recent PNNL work, including PNNL's SGEM engine for high-performance SGDBs; our descriptive semantic statistical approaches to characterizing large SGDBs; the role and value of hybrid graph/relational data models; the search for benchmark test data sets; and the mathematical issues involved in extending network science approaches to the labeled, directed graph structures typical of SGDBs.
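
As a rough illustration of the contrast drawn above between rows and columns and labeled, directed edges, the following minimal Python sketch stores a few facts as (subject, predicate, object) triples and matches a single pattern against them, loosely in the spirit of one SPARQL triple pattern. All data, labels, and the `match` helper are hypothetical and chosen only for illustration; this is not PNNL's SGEM engine or any real SPARQL implementation.

```python
# Hypothetical toy data; every name here is an assumption for illustration.

# Relational view: each row has the same fixed columns, typed by a schema.
rows = [
    {"id": 1, "name": "alice", "employer": "acme"},
    {"id": 2, "name": "bob", "employer": "acme"},
]

# Semantic graph view: directed, labeled edges stored as
# (subject, predicate, object) triples. New edge labels can be
# added without changing any schema.
triples = {
    ("alice", "type", "Person"),
    ("bob", "type", "Person"),
    ("acme", "type", "Organization"),
    ("alice", "worksFor", "acme"),
    ("bob", "worksFor", "acme"),
    ("alice", "knows", "bob"),  # directed: says nothing about bob knowing alice
}


def match(pattern, graph):
    """Return the sorted triples that fit a pattern; None acts as a wildcard."""
    s, p, o = pattern
    return sorted(
        t
        for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Who works for acme? Roughly analogous to the pattern: ?x worksFor acme
employees = [subject for subject, _, _ in match((None, "worksFor", "acme"), triples)]

# Edge direction matters: there is no "knows" edge leaving bob.
bob_knows = match(("bob", "knows", None), triples)
```

In this sketch the edge labels carry the meaning of each fact, and the direction of the "knows" edge is part of the data, which is the kind of structure the abstract describes as typical of SGDBs.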

Host: Josephine Olivas