IASTED - Tutorial Session | Artificial Intelligence and Soft Computing | June 22 – 24, 2011

The Fourteenth IASTED International Conference on
Artificial Intelligence and Soft Computing
ASC 2011

June 22 – 24, 2011
Crete, Greece

TUTORIAL SESSION

Introduction to causal discovery: A Bayesian Networks approach

Miss Sofia Triantafillou
ICS FORTH, Greece

Asst. Prof. Ioannis Tsamardinos
University of Crete, Greece

Duration

3 hours

Abstract

The tutorial introduces the basic assumptions and techniques for causal discovery from observational data using graphs that represent conditional independence models. It first presents the basic theory of causal discovery, including the Causal Markov Condition, the Faithfulness Condition, and the d-separation criterion, as well as graphical models for representing causality, such as Causal Bayesian Networks, Maximal Ancestral Graphs, and Partial Ancestral Graphs. It then presents prototypical and state-of-the-art algorithms, such as PC, FCI, and HITON, for learning such models (global learning) or parts of such models (local learning) from data. The tutorial also discusses the connections between causality and feature selection and presents causal-based feature selection techniques. Finally, case studies of applications of causal discovery algorithms are presented.
The tutorial aims to:

Familiarize the audience with the field and increase comprehension of the problem of causal induction as it pertains to everyday data analysis tasks; familiarize the audience with formalisms that represent causal relations among variables and provide a language for thinking about causality and causal discovery
Increase understanding of the basic principles of causal induction and familiarity with prototypical and state-of-the-art algorithms in the field; enable the correct interpretation of the output of such algorithms
Enable the correct application of causal-discovery algorithms in practical data mining, machine learning, or statistical analysis tasks

More specifically, it aims to clarify the following issues:

Most machine learning techniques assume independent and identically distributed (i.i.d.) data, yet in many fields the data do not follow this assumption. The data may be experimental (e.g., obtained after knocking out a gene) or subject to selection bias (e.g., in case-control studies). The tutorial helps the audience understand these differences and how they arise from the causal structure of the domain.
The purpose of an analysis is often to identify important variables (a.k.a. feature selection), called biomarkers in biology, risk factors in medicine, etc. The tutorial helps the audience understand the connection between the selected variables and the causal structure.
Prediction models are often not the final goal; instead, the goal is to control a system, e.g., treat a patient or design a drug with desired properties. Causal modeling and induction are necessary to build machine learning models that can predict the outcome in a system that is being manipulated (e.g., under different experimental conditions).
The tutorial provides a deeper understanding of standard (non-causal) Bayesian Networks, which have proven important in Machine Learning, Reasoning with Uncertainty in Artificial Intelligence, and Decision Support Systems for over two decades.
Causal discovery has already led to important discoveries, so knowledge of these methods and their potential is important for the data analysts of the future.
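
As a rough illustration of the point about predicting outcomes in a manipulated system, the sketch below contrasts an observational prediction, P(Y=1 | X=1), with the prediction under manipulation, P(Y=1 | do(X=1)), in a toy causal Bayesian network. The network structure (Z -> X, Z -> Y, X -> Y), all probabilities, and the function names are hypothetical choices made for this example and are not taken from the tutorial material.

```python
from itertools import product

# Hypothetical toy causal Bayesian network over binary variables:
# Z -> X, Z -> Y, X -> Y (Z is a common cause of X and Y).
# All numbers below are made up for illustration.
P_Z = {0: 0.5, 1: 0.5}                      # P(Z=z)
P_X1_GIVEN_Z = {0: 0.2, 1: 0.8}             # P(X=1 | Z=z)
P_Y1_GIVEN_XZ = {(0, 0): 0.1, (0, 1): 0.5,  # P(Y=1 | X=x, Z=z)
                 (1, 0): 0.3, (1, 1): 0.7}


def joint(z, x, y):
    """Joint probability P(Z=z, X=x, Y=y) from the network factorization."""
    px = P_X1_GIVEN_Z[z] if x == 1 else 1 - P_X1_GIVEN_Z[z]
    py = P_Y1_GIVEN_XZ[(x, z)] if y == 1 else 1 - P_Y1_GIVEN_XZ[(x, z)]
    return P_Z[z] * px * py


def p_y1_given_x(x):
    """Observational prediction P(Y=1 | X=x), by exact enumeration."""
    numerator = sum(joint(z, x, 1) for z in (0, 1))
    denominator = sum(joint(z, x, y) for z, y in product((0, 1), repeat=2))
    return numerator / denominator


def p_y1_do_x(x):
    """Prediction under manipulation P(Y=1 | do(X=x)).

    Setting X by intervention removes the edge Z -> X, so Z keeps its
    marginal distribution instead of being updated by observing X.
    """
    return sum(P_Z[z] * P_Y1_GIVEN_XZ[(x, z)] for z in (0, 1))


if __name__ == "__main__":
    for x in (0, 1):
        print(f"P(Y=1 | X={x}) = {p_y1_given_x(x):.2f}, "
              f"P(Y=1 | do(X={x})) = {p_y1_do_x(x):.2f}")
```

In this toy network the observational and interventional quantities differ because Z influences both X and Y; a model fitted only to predict Y from observed X would therefore give a misleading answer about what happens when X is set by an experimenter.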

Timeline

The tutorial outline is shown below:

Basics of Causal Induction and Modeling (Duration: 45')
Learning Causal and non-Causal Bayesian Networks from Data (Duration: 45')
Inferring Causation in the Presence of Latent Variables (Duration: 45')
Case Studies and Future Work (Duration: 45')

Target Audience

The tutorial is designed for a wide audience with a general AI background; familiarity with basic machine learning and statistical concepts is helpful for a more in-depth understanding of the techniques.

Presenters

Asst. Prof. Ioannis Tsamardinos – University of Crete, Greece
Miss Sofia Triantafillou – ICS FORTH, Greece

Qualifications of the Instructor(s)

Sofia Triantafillou graduated from the School of Applied Mathematical and Physical Sciences of the National Technical University of Athens in 2007. She received her M.Sc. degree in 2010 from the Computer Science Department of the University of Crete, with a thesis titled "A Constraint-Based Algorithm for Learning Causal Structure from Overlapping Variable Sets". She is currently a PhD candidate in the Computer Science Department of the University of Crete, under the supervision of Dr. Ioannis Tsamardinos, and a member of the Bioinformatics Laboratory of the Institute of Computer Science, part of the Foundation for Research and Technology - Hellas. Her research interests include artificial intelligence and causal discovery, with a focus on applications of machine learning in bioinformatics.

Ioannis Tsamardinos is the Head of the Bioinformatics Laboratory at ICS-FORTH, an Assistant Professor at the Department of Computer Science at the University of Crete, and an Adjunct Assistant Professor at the Department of Biomedical Informatics at Vanderbilt University. He received his Ph.D. in 2001 from the Intelligent Systems Program of the University of Pittsburgh and worked as an Assistant Professor at the Dept. of Biomedical Informatics at Vanderbilt University between 2001 and 2006. He has developed several state-of-the-art algorithms and systems for Machine Learning, Data Mining, Bayesian Network learning, Variable Selection, and Causal Discovery, with over 50 publications in international journals, conferences, and books. He has also participated in several applied projects in biomedicine, including the analysis of clinical, epidemiological, microarray gene-expression, proteomics, and text-categorization data. Distinctions with colleagues and students include the best performance in one of the four tasks in the recent First Causality Challenge Competition, the ISMB 2005 Best Poster award, a Gold Medal in the Student Paper Competition at MEDINFO 2004, the Outstanding Student Paper Award at AIPS 2000, the NASA Group Achievement Award for participation in the Remote Agent team, and others. International recognition includes membership of the Editorial Board of the Journal of Artificial Intelligence Research and about 1800 citations (as estimated by the Publish or Perish tool). Ioannis is involved in several educational activities, regularly teaches the Machine Learning, Artificial Intelligence, and Algorithms in Bioinformatics courses at the University of Crete, and has presented three other related tutorials at conferences and summer schools.
