The last decade has seen drastic changes in how, and how often, people interact, and in the technology available to observe and record these interactions. We can now collect massive amounts of data: logs of emails, IP traffic, phone calls, SMS messages, blog posts, and social media activity. The pervasiveness of communication and information networks in today's world necessitates better models and techniques that address the resulting challenges of efficiency and scalability and that leverage the temporal and relational information inherent in the data. In this work we present a data mining approach for analyzing streaming data from communication and information networks. We first build a stochastic model for a system of temporal processes, which we call the REWARDS (REneWal theory Approach for Real-time Data Streams) Model, and propose statistical methods to identify dependencies in the system. Applying this model to the network context, we develop efficient algorithms to identify anomalous activity, study information diffusion, and measure influence between entities. We demonstrate the usefulness of our approach with experiments on a variety of real-world data.
Host: Aric Hagberg, CNLS