2021-12-27
What is action recognition? Given an input video/image, perform some appropriate processing and output the “action label”. CNNs for Action / Activity Recognition. Why CNN? Convolutional neural networks report the best performance in static image classification.
2021-07-20
Introduction. Motivation: Gain a higher-level understanding of the scene, e.g. What are these persons doing (walking, sitting, working, hiding)? How are they doing it? What is going on in the scene (meeting, party, telephone conversation, etc…)?
2021-07-20
Introduction. Gesture: a movement, usually of the body or limbs, that expresses or emphasizes an idea, sentiment, or attitude; the use of motions of the limbs or body as a means of expression. Automatic Gesture Recognition: A gesture recognition system generates a semantic description for certain body motions. Gesture recognition exploits the power of non-verbal communication, which is very common in human-human interaction. Gesture recognition is often built on top of a human motion tracker. Applications: Multimodal Interaction (Gestures + speech recognition, Gestures + gaze), Human-Robot Interaction, Interaction with Smart Environments, Understanding Human Interaction. Types of Gestures: Hand & arm gestures
2021-07-20
Multi-Camera Systems. Types of multi-camera systems: Stereo-camera system (narrow baseline): close distance and equal orientation; an object’s appearance is almost the same in both cameras; allows for calculation of a dense disparity map. Wide-baseline multi-camera system
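A dense disparity map from a narrow-baseline stereo pair is usually turned into depth with the standard rectified-stereo relation Z = f · B / d. The sketch below is only an illustration of that relation; the function name and the parameter values are hypothetical and do not come from the post.

```python
from typing import Optional


def depth_from_disparity(disparity_px: float,
                         focal_px: float,
                         baseline_m: float) -> Optional[float]:
    """Depth in metres for one pixel of a rectified stereo pair.

    Uses Z = f * B / d, with the focal length f in pixels, the baseline B
    in metres and the disparity d in pixels. Returns None when the
    disparity is not positive (no valid match for that pixel).
    """
    if disparity_px <= 0:
        return None
    return focal_px * baseline_m / disparity_px


if __name__ == "__main__":
    # Hypothetical camera: f = 500 px, B = 0.5 m, measured disparity 50 px.
    print(depth_from_disparity(50.0, 500.0, 0.5))
```

Because depth is inversely proportional to disparity, a narrow baseline gives small disparities and therefore coarse depth resolution for distant objects.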
2021-07-20
Introduction. Tracking vs. Detection. Detection: find an object in a single image (face, person, body part, facial landmarks, …). No assumptions about dynamics or temporal consistency are made. Tracking: determine a target’s locations (and/or rotation, deformation, pose, …) over a sequence of images
2021-07-19
Deep Learning for Object Detection. People detection is a special case of object detection (people are one of the most challenging object classes to detect). Recently, most detectors are trained for the more challenging task of multi-object detection. Goal: given an image, detect all instances of, say, 1000 different object classes; “Person” is always one of the classes. Speed is an issue. Sliding window: look at each position and each scale. Cascades look at each position too; they just take a shorter look at most positions/scales. Region proposals: avoid useless positions/scales from the beginning. Region Proposals 💡Idea
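To make the cost of the sliding-window approach concrete, here is a minimal sketch that enumerates every candidate window over all positions and scales. The function name, the stride parameter and the example sizes are assumptions chosen for illustration, not part of the post.

```python
from typing import Iterator, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height)


def sliding_windows(img_w: int, img_h: int,
                    win_w: int, win_h: int,
                    stride: int,
                    scales: Sequence[float]) -> Iterator[Box]:
    """Yield every window a sliding-window detector would have to score."""
    for s in scales:
        w, h = int(round(win_w * s)), int(round(win_h * s))
        if w > img_w or h > img_h:
            continue  # window at this scale does not fit into the image
        for y in range(0, img_h - h + 1, stride):
            for x in range(0, img_w - w + 1, stride):
                yield (x, y, w, h)


if __name__ == "__main__":
    # Hypothetical setup: 640x480 image, 64x128 base window, 8 px stride.
    n = sum(1 for _ in sliding_windows(640, 480, 64, 128, 8, [1.0, 1.5, 2.0]))
    print(f"{n} windows to classify")
```

Each yielded box needs a classifier evaluation, which is why cascades shorten the look at most windows and region proposals try to discard most of them up front.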
2021-07-16
Kinect. What is Kinect? A fusion of two groundbreaking new technologies: a cheap and fast RGB-D sensor and reliable skeleton tracking. Structured light: Kinect uses structured light to simulate a stereo camera system. Kinect provides a unique texture for every point of the image, so only block matching is required. Pose Recognition for User Interaction. Few constraints:
2021-07-07
Motivation: Model body parts separately. Break down an object’s overall variability into more manageable pieces. Pieces can be classified by less complex classifiers. Apply prior knowledge by (manually) splitting the global object into meaningful parts. Advantages: deals better with moving body parts (poses); able to handle occlusions and overlaps; sharing of training data. Disadvantages: requires more complex reasoning; problems with low resolutions. Part-based models: Two main components
2021-07-07
COCO Keypoints Detection. 17 keypoints. Keypoint detection format: Annotations for keypoints are just like in Object Detection (Segmentation), except a number of keypoints is specified in sets of 3, (x, y, v).
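The "sets of 3" layout can be unpacked with a few lines of Python. This is a sketch that assumes the flat `keypoints` list as described in the COCO data format, where `v` is a visibility flag (0 = not labeled, 1 = labeled but not visible, 2 = labeled and visible); the helper names and the sample values are made up for illustration.

```python
from typing import List, Sequence, Tuple

Keypoint = Tuple[float, float, int]  # (x, y, v)


def split_keypoints(flat: Sequence[float]) -> List[Keypoint]:
    """Group a flat [x1, y1, v1, x2, y2, v2, ...] list into (x, y, v) triplets."""
    if len(flat) % 3 != 0:
        raise ValueError("keypoint list length must be a multiple of 3")
    return [(flat[i], flat[i + 1], int(flat[i + 2]))
            for i in range(0, len(flat), 3)]


def count_labeled(flat: Sequence[float]) -> int:
    """Count keypoints with v > 0, i.e. those that were annotated at all."""
    return sum(1 for _, _, v in split_keypoints(flat) if v > 0)


if __name__ == "__main__":
    # Hypothetical annotation with three keypoints instead of the full 17.
    sample = [10, 20, 2, 0, 0, 0, 30, 40, 1]
    print(split_keypoints(sample))
    print(count_labeled(sample))
```

A full COCO person annotation would carry 17 such triplets, so the flat list has 51 entries.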
2021-05-25