DIG: Domain-specific Insight Graphs

WHAT IS DIG

You can now download DIG and run it on your laptop: dig-etl-engine.

DIG is a domain-specific indexing, search, and analysis system. It combines state-of-the-art open-source software with an open architecture and a flexible set of APIs to facilitate the integration of a variety of extraction and analysis tools.

DIG builds on rich models of a domain that support fine-grained data collection, organization, and analysis. DIG builds a graph of the entities and relationships within a domain using scalable extraction and linking technologies. DIG also includes a faceted content search interface for users to query DIGs and visualize information on maps, timelines, and tables.

DIG is designed to be scalable by building on open-source, cloud-based infrastructure (e.g., HDFS, Hadoop, and Elasticsearch). It supports a diversity of source types and is rapidly re-targetable to new domains of interest.

Popular Science published an article, "The Man Who Lit the Dark Web: Data-mining tools are helping cops bust open online human trafficking," that describes the history of the DARPA MEMEX program, which funds our DIG project, and provides details on how DIG is being used by law enforcement agencies to combat human trafficking.

For more information on MEMEX, see http://www.ee.columbia.edu/ln/dvmm/memex/index.html#About, provided by Columbia University's Digital Video and Multimedia (DVMM) Lab.

PEOPLE

Current Researchers

Pedro Szekely

Research Director

Craig Knoblock

Interim Executive Director

Kevin Knight

Director & Professor

Daniel Marcu

Director & Research Associate Professor

Mayank Kejriwal

Research Assistant Professor of Industrial and Systems Engineering, Research Lead

Dipsy Kapoor

Research Programmer

Amandeep Singh

Research Programmer

Yixiang Yao

Research Programmer

Jason Slepicka

PhD Student

Majid Ghasemi Gol

PhD Student

Anika Jain

MS Student

Jiayuan Ding

MS Student

Qingyuandi Lin

MS Student

Rahul Kapoor

MS Student

Shreya Venkatesh

MS Student

Xi Jin

MS Student

External Collaborators

Matthew Michelson

InferLink

Steven Minton

InferLink

Brian Amanatullah

InferLink

Shih-Fu Chang

Senior Vice Dean, Columbia University

Thomas Schellenberg

Next Century

Rachel Artiss

Next Century

David Flynt

Next Century

Mike Tamayo

Next Century

Svebor Karaman

Postdoc at Columbia University

Tao Chen

Postdoc at Columbia University

Tianxin Zhao

MS Student

Zihao (Kevin) Zhai

MS Student

Previous Contributors

Ashish Bharadwaj Srinivasa

MS Student

Dhvanan Shah

MS Student

Kaushal Shah

MS Student

Vinay Dandin

MS Student
