This course introduces students to key computer vision techniques for real-time applications. Students will learn to quickly build applications that enable computers to "see" and make decisions based on still images or video streams. Through regular assignments and in-class laboratory exercises (students are advised to bring their own laptops to class), students will build real-time systems for tasks including object recognition, face detection, and face recognition. Key computer vision topics addressed in the course include human and machine vision (how does the brain recognize objects, and what can we emulate?); camera models and camera calibration; edge, line, and contour detection; optical flow and object tracking; machine learning techniques; image features and object recognition; stereo vision; 3D vision; and face detection and face recognition. Students will be exposed to the mathematical tools that are most useful in the implementation of computer vision algorithms.
Python programming experience and prior knowledge of linear algebra, geometry, and probability theory are desired.
This course introduces students to computer vision through intensive use of open source machine vision libraries, including OpenCV, scikit-image, and Caffe. Students in the course will learn to quickly build applications that enable computers to "see" and make decisions based on still images or video input from a camera.
Understand key computer vision techniques and applications
Use open-source Python packages to implement systems capable of performing image processing tasks such as image smoothing, image transformations, image segmentation, detecting primitives like edges, lines, contours and corners, or detecting more complex patterns useful for creating panoramas or recognizing objects.
Use Python to implement tasks such as camera calibration, optical flow computation, video tracking and 3D reconstruction from stereo.
Use Python to implement machine learning algorithms for image classification and object recognition in images and videos. Experiment with developing real-time systems for performing face detection, face recognition, and object recognition.
Understand the basics of deep learning techniques for vision applications and train basic deep networks using open-source libraries such as Theano or Caffe.
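As an illustration of the image processing tasks named in the goals above (image smoothing and edge detection), here is a minimal, dependency-free sketch. It is not course material: the helper names (`convolve3x3`, `smooth`, `edge_magnitude`) and the synthetic test image are assumptions made for this example, and a box blur stands in for whatever smoothing filter an assignment might specify. In practice the same operations would normally be done with library calls such as OpenCV's `cv2.GaussianBlur` and `cv2.Sobel`.

```python
import math

# Hypothetical helpers for illustration only; images are lists of rows of
# grayscale values, which is an assumption of this sketch.

def convolve3x3(image, kernel):
    """Cross-correlate a 3x3 kernel with the image, replicating border pixels."""
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = 0.0
            for ky in range(3):
                for kx in range(3):
                    # Clamp coordinates so pixels outside the image reuse the edge value.
                    yy = min(max(y + ky - 1, 0), height - 1)
                    xx = min(max(x + kx - 1, 0), width - 1)
                    acc += kernel[ky][kx] * image[yy][xx]
            out[y][x] = acc
    return out

BOX_BLUR = [[1.0 / 9.0] * 3 for _ in range(3)]          # simple averaging filter
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]          # horizontal gradient
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]          # vertical gradient

def smooth(image):
    """Smooth the image with a 3x3 box filter."""
    return convolve3x3(image, BOX_BLUR)

def edge_magnitude(image):
    """Return the Sobel gradient magnitude at every pixel."""
    gx = convolve3x3(image, SOBEL_X)
    gy = convolve3x3(image, SOBEL_Y)
    return [[math.hypot(gx[y][x], gy[y][x]) for x in range(len(image[0]))]
            for y in range(len(image))]

# Synthetic 6x6 image: dark left half, bright right half (a vertical step edge).
image = [[0, 0, 0, 255, 255, 255] for _ in range(6)]
smoothed = smooth(image)
edges = edge_magnitude(image)
```

On this synthetic image the gradient magnitude is nonzero only in the two columns on either side of the step, which is the behavior an edge detector is expected to show.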
When This Course is Typically Offered
This course is typically offered in the spring semester at APL.
- Introduction - computer vision, Python
- Color, image formation, camera calibration
- Filtering, pyramids, edges
- Histograms, basic segmentation, morphology, advanced segmentation
- Biomedical/biomimetic vision
- Lines, corners, keypoint detectors
- Video, optical flow, GFTT
- 3D vision/geometry, calibration/3D recognition
- Intro to machine learning for computer vision - SVM, boosting, bagging, decision trees
- Intro to Deep Learning
- ConvNets, R-CNN
- DNN flavors - (stacked) RBM, autoencoders, CNNs for depth, 3D CNNs
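To give a flavor of the "Histograms, basic segmentation" topic in the outline above, the following sketch computes a global threshold from a grayscale histogram using Otsu's method. The outline does not say which thresholding technique the course covers, so the choice of Otsu's method is an assumption, as are the function name `otsu_threshold` and the toy pixel data. OpenCV exposes the same idea through `cv2.threshold` with the `cv2.THRESH_OTSU` flag.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximizes between-class variance.

    Pixels with value <= t are treated as background, the rest as foreground.
    `pixels` is a flat list of integer gray levels in [0, levels).
    """
    # Build the gray-level histogram.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0      # number of background pixels so far
    sum_bg = 0    # sum of background gray levels so far
    for t in range(levels):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue          # no background pixels yet
        w_fg = total - w_bg
        if w_fg == 0:
            break             # every pixel is already background
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        between_var = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t

# Toy data (hypothetical): a dark cluster and a bright cluster of gray levels.
pixels = [10] * 50 + [12] * 50 + [200] * 50 + [205] * 50
threshold = otsu_threshold(pixels)
mask = [p > threshold for p in pixels]   # True marks foreground pixels
```

The resulting `mask` separates the bright cluster from the dark one; on real images the same threshold would be applied per pixel to produce a binary segmentation.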
Student Assessment Criteria
Final Project Presentation: 10%
Computer and Technical Requirements
Python programming experience; prior knowledge of linear algebra, geometry, and probability theory is desired
Textbook information for this course is available online through the MBS Direct Virtual Bookstore.
There are notes for this course.
Final Words from the Instructor
We are using open source tutorials and reference manuals that are posted online. Our Blackboard site points to these tutorials in the first module of this class. We will mainly use the OpenCV-Python Tutorials written by Alexander Mordvintsev & Abid Rahman K.
Term Specific Course Website
(Last Modified: 01/27/2016 12:21:17 PM)