Instructor Information

Thomas Woolf

Course Information

Course Description

This course introduces the fundamentals behind the mathematical and logical framework of graphical models. These models are used in many areas of machine learning and arise in numerous challenging and intriguing problems in data analysis, mathematics, and computer science. For example, the “big data” world frequently uses graphical models to solve problems. While the framework introduced in this course will be largely mathematical, we will also present algorithms and connections to problem domains. The course will begin with the fundamentals of probability theory and will then move into Bayesian networks, undirected graphical models, template-based models, and Gaussian networks. The nature of inference and learning on the graphical structures will be covered, with explorations of complexity, conditioning, clique trees, and optimization. The course will use weekly problem sets and a term project to encourage mastery of the fundamentals of this emerging area.

Course Note(s): This course is the same as EN.605.625 Probabilistic Graphical Models.

Prerequisites

Graduate course in probability and statistics (such as EN.625.603 Statistical Methods and Data Analysis).

Course Goal

Determining an optimal graphical model and solving real-world questions on that model are skills needed for dealing with large data and complex questions. While simple questions can be addressed in a single statement of probability or looked up quickly in an Excel spreadsheet, questions with many linked events that build to create a complex network are not trivial to define optimally or to query for learning. The overarching goal of this course is to build the ability to work confidently with graphical models and to apply them to real-world problems.

Course Objectives

  • Challenge poorly designed graphical models and poorly addressed inference, learning, and optimization problems for graphical models.
  • Explain theoretical principles and practical approaches to create, usefully apply, and maintain an application of graphical models for domain-specific problems.
  • Apply strategies and skills for structured learning and inference on graphical models.
  • Create graphical models for domain problems of interest to you and understand how to solve them and apply them to other domains.

When This Course is Typically Offered

Offered for the first time in Spring of 2014

Syllabus

  • Introduction, Reminder of Probability Theory, Overview of Graphs
  • Bayesian Networks
  • Undirected Graphical Models
  • Template-Based Representations
  • Gaussian Network Models; Exponential Family of Models
  • Exact Inference
  • Clique Trees and Belief Propagation
  • Inference as Optimization
  • Particle-Based Approximate Inference
  • MAP Inference; Inference with Dirichlet; Learning: Basics
  • Learning: Directed Models
  • Learning: Undirected Models
  • Partially Observed Data
  • Structured Decision Problems; Causality

Student Assessment Criteria

Participation (Class Discussions and Personal Journal) 30%
Assignments: Problem Sets during term 30%
Class Project: with Milestones during term 40%

Discussions (15% of grade): (1) number of original posts and replies to other posts (politeness, on-topic, helpful and engaged comments), (2) quality of post (logical argument, depth of engagement with material, politeness, on-topic), (3) originality of post (going beyond the usual or a web search).

        (and as part of the same component)

Personal Journal Entries (15% of grade): (1) number and quality of entries (originality, degree of engagement with the material, amount of effort demonstrated), (2) originality of entries (going beyond the book or a modest web search; demonstrating clear independent thinking), (3) willingness to think outside the boundaries of the class and to find problems and application domains that bear on your interests.

Problem Sets (30% of grade): (1) all assigned problems completed, with extra credit for more problems done (maximum of five extra points), (2) quality of work (intermediate steps demonstrated, neatly and clearly organized), (3) final answer correct (with steps illustrated and with final answer clearly demonstrated).

Project (40% of grade; with four deliverables before the final due date)

  Topic (10 percent): (1) doability of the project (not too hard and not too easy), (2) amount of thinking demonstrated behind the topic choice (the ‘why this project?’ question), (3) some presentation of nearby alternative topics and why they were not selected (further rationalizing the choice and its doability).

  Outline/Bibliography (15 percent): (1) quality and quantity of references, (2) initial demonstration of grappling with the topic through the outline; quality and originality of the outline, (3) continued ‘doability’ check (that ‘mission creep’ has not made the project too hard or too easy).

  Pseudo Code (15 percent): (1) should flow logically, (2) should be clearly laid out, (3) the amount of actual underlying code or math needed to fill in the steps should be possible to estimate from the pseudo code.

  Fifty-percent Done (10 percent): (1) continued ‘doability’ check; should clearly show significant progress (not just the easy things done with all the hard work ahead), (2) ideally, working prototypes of code with demonstrations that they work, (3) logical analysis of the remaining steps to completion and timelines for when those will be done.

  Final (50 percent): (1) quality of final work (significant class project achieved on time), (2) steps clearly defined and described: if code, quality of documentation and of the demonstration that it works correctly; if math, quality of intermediate steps and demonstration of logical completeness and soundness of result; if bio-related, connections to the literature and to other bio projects in the literature, quality of results, and logical connections to intermediate steps, (3) context clearly documented (connections to others’ work; ideal end-point for this type of project; any possible connections to your own long-term goals and plans).
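
As a rough illustration of how the weights above combine, the following Python sketch computes an overall course score. It rests on several assumptions that the syllabus does not state outright: every component is scored from 0 to 100, Discussions and Personal Journal each count for half of Participation (15% + 15%), and the project percentages are shares of the project grade rather than of the course grade. The function names and the sample scores are hypothetical, and the extra-credit points for problem sets are left out because the syllabus does not say how they are applied.

```python
# Weights taken from the syllabus; everything else here is an illustrative assumption.
COURSE_WEIGHTS = {"participation": 0.30, "problem_sets": 0.30, "project": 0.40}

# Assumption: the two 15% parts split Participation evenly.
PARTICIPATION_WEIGHTS = {"discussions": 0.50, "journal": 0.50}

# Assumption: these percentages are shares of the project grade (they sum to 100).
PROJECT_WEIGHTS = {
    "topic": 0.10,
    "outline_bibliography": 0.15,
    "pseudo_code": 0.15,
    "fifty_percent_done": 0.10,
    "final": 0.50,
}


def weighted_average(scores, weights):
    """Return the weighted sum of scores; both dicts must have the same keys."""
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same components")
    return sum(scores[name] * weights[name] for name in weights)


def course_grade(participation_scores, problem_set_score, project_scores):
    """Combine component scores (each on a 0-100 scale) into one course score."""
    components = {
        "participation": weighted_average(participation_scores, PARTICIPATION_WEIGHTS),
        "problem_sets": problem_set_score,
        "project": weighted_average(project_scores, PROJECT_WEIGHTS),
    }
    return weighted_average(components, COURSE_WEIGHTS)


if __name__ == "__main__":
    # Made-up scores for a hypothetical student.
    grade = course_grade(
        participation_scores={"discussions": 90, "journal": 80},
        problem_set_score=85,
        project_scores={
            "topic": 100,
            "outline_bibliography": 90,
            "pseudo_code": 80,
            "fifty_percent_done": 70,
            "final": 90,
        },
    )
    print(f"Overall course score: {grade:.1f}")
```

With these sample scores, Participation averages 85 and the project works out to 87.5, so the overall score is about 86. Keeping the weights in dictionaries makes it easy to check that each set sums to 1 and to adjust the split if the instructor weights the components differently.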

Participation Expectations

It is expected that this class will take approximately 9–12 hours per week to complete. Here is an approximate breakdown: reading the assigned sections of the texts as well as some outside reading (approximately 2 hours per week), listening to the audio-annotated slide presentations (approximately 2 hours per week), contributing to the discussion forum (approximately 1–2 hours per week), working on problem assignments (approximately 2–3 hours per week), and making progress on the term project (approximately 2–3 hours per week).

Textbooks

Textbook information for this course is available online through the MBS Direct Virtual Bookstore.

Course Notes

There are no notes for this course.

Final Words from the Instructor

Main Text:

Probabilistic Graphical Models: Principles and Techniques, by Daphne Koller and Nir Friedman

 http://www.amazon.com/Probabilistic-Graphical-Models-Principles-Computation/dp/0262013193/ref=sr_1_1?s=books&ie=UTF8&qid=1390168942&sr=1-1&keywords=koller+friedman

(Last Modified: 01/28/2014 08:25:15 AM)