Common Sense Knowledge Graphs (CSKGs)

Organizers

Commonsense reasoning is an important aspect of building robust AI systems and is receiving significant attention in the natural language understanding, computer vision, and knowledge graph communities. At present, a number of valuable commonsense knowledge sources exist, with different foci, strengths, and weaknesses. Our tutorial will survey the most important commonsense knowledge resources and introduce a new commonsense knowledge graph (CSKG) that integrates several existing resources. The tutorial will also introduce several tools for working with CSKG, including query mechanisms, knowledge graph embeddings, and a framework for creating commonsense question answering systems. In a hands-on session, participants will use the framework and tools to build a question answering application using CSKG and language models.

Tutorial Program

Our tutorial will consist of two main parts:

  1. Presentations that introduce the individual knowledge graphs, their integration and consolidation into a single CSKG, refinement operations, how we compute embeddings, how we complete missing knowledge, and how the CSKG can be applied to reason over natural language questions.
  2. A hands-on session in which participants load and inspect (parts of) our graph (e.g., using graph-tool in Python), see how the embeddings are computed in code (using the Numberbatch approach, as in ConceptNet), learn how to perform CSKG completion (e.g., using the ambiverse library), and try several algorithms that reason over CSKG to answer questions.
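
As a rough illustration of the "load and inspect" step in the hands-on session, the following minimal sketch reads a tiny edge list and reports basic statistics. It uses only the Python standard library rather than graph-tool, and the three-column tab-separated layout (`node1`, `relation`, `node2`), the sample edges, and the function names are assumptions made for this example, not the actual CSKG file format or tooling.

```python
import csv
import io
from collections import Counter, defaultdict

# Hypothetical sample data: a few ConceptNet-style edges in an assumed
# tab-separated layout (node1, relation, node2).
SAMPLE_TSV = (
    "node1\trelation\tnode2\n"
    "/c/en/dog\t/r/IsA\t/c/en/animal\n"
    "/c/en/cat\t/r/IsA\t/c/en/animal\n"
    "/c/en/dog\t/r/CapableOf\t/c/en/bark\n"
    "/c/en/dog\t/r/Desires\t/c/en/bone\n"
)


def load_edges(handle):
    """Read (node1, relation, node2) triples from a TSV file object."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [(row["node1"], row["relation"], row["node2"]) for row in reader]


def summarize(edges):
    """Return node count, per-relation edge counts, and out-neighbors per node."""
    nodes = set()
    relation_counts = Counter()
    neighbors = defaultdict(set)
    for subj, rel, obj in edges:
        nodes.update((subj, obj))
        relation_counts[rel] += 1
        neighbors[subj].add(obj)
    return len(nodes), relation_counts, neighbors


edges = load_edges(io.StringIO(SAMPLE_TSV))
num_nodes, relation_counts, neighbors = summarize(edges)
print("nodes:", num_nodes, "edges:", len(edges))
print("most common relation:", relation_counts.most_common(1)[0])
print("neighbors of dog:", sorted(neighbors["/c/en/dog"]))
```

For a file on disk, passing an open file object (opened with `newline=""`) to `load_edges` would play the same role as the in-memory sample here. A dedicated graph library becomes worthwhile once the graph no longer fits comfortably in plain dictionaries.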

Preliminary agenda:

| Time | Content | Material/Format |
|---|---|---|
| 09:00-09:15 | Introduction to commonsense knowledge | Slides |
| 09:15-09:40 | Review of existing commonsense knowledge graphs | Slides |
| 09:40-10:30 | Consolidating commonsense graphs | Slides |
| 10:30-11:00 | Coffee Break | |
| 11:00-11:15 | Introduction to activities/set-up | Slides/Data/Executable Code |
| 11:15-12:10 | 2 Hands-on activities: loading and analyzing CSKGs | Slides |
| 12:10-12:30 | Wrap up of integrating CSKG | Slides |
| 12:30-14:00 | Lunch break | |
| 14:00-14:45 | Embeddings and KG completion | Slides |
| 14:45-15:30 | 2 hands-on activities involving embeddings and KGC | Slides/Data/Executable Code |
| 15:30-16:00 | Coffee Break | |
| 16:00-16:30 | Answering questions with CSKG | Slides |
| 16:30-17:10 | 1 hands-on activity | Slides/Data/Executable Code |
| 17:10-17:30 | Open problems, wrap-up and discussion | Slides |

Learning Outcomes:

  1. Familiarization with the state-of-the-art knowledge sources that provide commonsense knowledge
  2. Understanding how these sources can be integrated into a single commonsense knowledge graph
  3. Hands-on experience with analyzing a consolidated CSKG, performing intrinsic operations (e.g., computation of embeddings), and applying the CSKG to standing commonsense tasks posed in natural language.

Presentation Style

The presentation style will be informal and very hands-on. We will avoid representation specifics and terminology used by the individual graphs as far as possible (i.e., without losing rigor or oversimplifying). Our slides will focus on visual intuition, use actual examples, present lessons learned from our latest implementations of algorithms and commonsense reasoning frameworks, and be accompanied by demos and hands-on activities that participants can do without extensive platform-dependent setup. We welcome questions and interaction throughout the tutorial. All three of us will be present for most of the tutorial, but each of us will present our own sections.

Background & Requirements

Capturing, representing, and leveraging commonsense knowledge has been a paramount goal for AI since its early days, cf. (McCarthy, 1960). In light of modern large (commonsense) knowledge graphs and various neural advancements, the recently introduced DARPA Machine Common Sense program represents a new effort to understand commonsense knowledge through question-answering evaluation benchmarks. Intuitively, graphs of (commonsense) knowledge are essential in such tasks in order to inject the background knowledge that humans possess and apply in communication, but that machines cannot access or distill directly.

Our team has been working on several aspects of commonsense knowledge found in knowledge graphs. Firstly, we have been integrating a number of knowledge graphs into a single commonsense knowledge graph, including ConceptNet, WordNet, Visual Genome, FrameNet, ATOMIC, WebChild, Wikidata, and Cyc. Secondly, we have been building software to perform intrinsic operations on the consolidated CSKG, such as generating embeddings and performing knowledge graph completion. Thirdly, we have a framework that allows us to integrate (parts of) our CSKG into a reasoning system that answers questions phrased in natural language. These questions come from several commonsense evaluation datasets focused on social, physical, visual, and situational reasoning.
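
To make the idea of consolidation concrete, here is a minimal sketch of merging edges from several sources into one graph while recording where each edge came from. The source names are taken from the list above, but the edges, the relation names, the label-normalization rule, and the function names are all hypothetical. The actual integration involves considerably more careful node and relation mapping than this.

```python
from collections import defaultdict

# Hypothetical toy edges per source; real resources use their own identifiers.
SOURCES = {
    "ConceptNet": [("Dog", "IsA", "animal"), ("dog", "CapableOf", "bark")],
    "WordNet": [("dog", "IsA", "animal"), ("dog", "IsA", "canine")],
    "Visual Genome": [("dog", "LocatedNear", "Park Bench")],
}


def normalize(label):
    """Assumed normalization: lowercase and join tokens with underscores."""
    return "_".join(label.lower().split())


def consolidate(sources):
    """Merge edges from all sources, keeping the set of sources per edge."""
    merged = defaultdict(set)
    for source_name, edges in sources.items():
        for subj, rel, obj in edges:
            merged[(normalize(subj), rel, normalize(obj))].add(source_name)
    return dict(merged)


cskg = consolidate(SOURCES)
for edge, provenance in sorted(cskg.items()):
    print(edge, sorted(provenance))
```

Keeping the provenance set per edge means that an edge asserted by several sources is stored once, while still showing which resources agree on it.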

It is important for the Semantic Web community to keep pace with this new ‘wave’ of commonsense knowledge representation and reasoning for downstream applications. The questions that we seek to address through our presentation and activities are: What is a commonsense knowledge graph? Which kind of knowledge is captured by existing CSKGs? What are their major strengths and weaknesses? How can they be integrated into a graph that is more than ‘a sum of its parts’? How can this graph be refined or enriched with more semantics or missing information? How can it be applied to downstream applications in natural language understanding? Our tutorial will be very practical and will answer these questions in a focused, foundational manner that all Semantic Web researchers will be able to follow easily.

Prior knowledge expected from participants (beyond fairly basic Python 3 skills and familiarity with Semantic Web concepts like RDF) will be minimal. Some knowledge of machine learning, including basic concepts like training, testing, validation, and feature engineering, will be helpful but is not an absolute prerequisite, as we will not go into advanced machine learning math or optimization. Additionally, where possible, we will introduce basic machine learning concepts so that everyone has an opportunity to follow along. Participants are not expected to have any knowledge of answering natural language commonsense questions.

We will be using our own computers for presenting demos and PowerPoint slides and only require equipment to facilitate such projection for an extended period of time (e.g., projector, table, power outlet). We (and also the participants) will require an internet/wifi connection to access the tutorial material. There are no audio elements to our presentation. All demos and hands-on activities will be doable on a reasonable laptop by interested participants. We will also bring extra USB storage devices with copies of code, programs and slides in case some participants did not download the material prior to the tutorial.

Expected Coverage

Tools and data:

Bibliography: