Real-Time Vision-based 3D Motion Capture

While motion capture is typically performed with magnetic or optical systems, there are mass-market applications in which such solutions are untenable, either because of cost or because it is impractical for people entering an environment to be suited up with active devices or special reflectors. Magnetic systems also suffer from electromagnetic interference caused by nearby ferrous metal objects, magnets in audio speakers, and radiation from monitors and TVs. Wires are another drawback of magnetic motion capture systems: they encumber the performer and limit the space in which he or she can move. Wireless magnetic systems have been introduced only during the past two years, and even then the performer still has to carry a backpack containing an electronics unit connected to the sensors/receivers. Nor does going wireless completely eliminate the space limitation, since the signal becomes too weak when the distance between receiver and transmitter grows too great. Unlike magnetic systems, optical motion capture systems do not suffer from wiring or electromagnetic-interference problems; however, their costs are considerably higher. Furthermore, most optical systems do not operate in real time, since they require post-processing and some manual point registration. The first real-time optical system was introduced only recently, at SIGGRAPH98, by Motion Analysis.

Due to these restrictions of existing systems, a vision-based motion capture system that does not rely on contact devices would have significant advantages. In such vision-based systems, interference and encumbrance are no longer a problem. Our system demonstrates the application of an inexpensive, completely unencumbered computer vision system to the motion capture problem. It runs on a network of Dual-Pentium 400 PCs at 20-30 frames per second (depending on the size of the person the system observes). We recently demonstrated the system at SIGGRAPH98's Emerging Technologies. The project, called "Shall We Dance?", is the result of a collaboration among ATR's Media Integration & Communications (MIC) Research Laboratory, the University of Maryland's Computer Vision Laboratory, and the Massachusetts Institute of Technology's Media Laboratory. Below is a snapshot of our SIGGRAPH98 demonstration. I was dancing in the dancing area, which was surrounded by six cameras, and the CG character in the tuxedo was animated by my motion. The sumo character was animated from another studio next door by ATR's MIC lab.



See... how much people like it.




How it works. . . . .

A set of color CCD cameras observes a person. Any number of cameras greater than two can be used; our current system uses six. Each camera is attached to a PC running the W4 system. W4 detects people and locates and tracks body parts: it performs background subtraction, silhouette analysis, and template matching to locate the 2D positions of salient body parts (e.g., head, torso, hands, and feet) in each image. A central controller obtains the 3D positions of these body parts by triangulation and optimization. Kalman filters are used to smooth the motion trajectories and to predict the body-part locations for the next frame; these predictions are fed back to each W4 instance to help with 2D localization. The graphic reproduction system then uses the body-posture output to render and animate the cartoon-like character.
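To make the 3D recovery and filtering steps more concrete, here is a minimal sketch in Python/NumPy. It is an illustration under assumed interfaces, not the actual ATR/UMD code: the 3x4 camera projection matrices, the 2D body-part positions (as a W4 instance would report them), and all function and class names are assumptions for the example. The first function triangulates one body part from its observations in several calibrated cameras with a linear least-squares solve; the class after it is a per-coordinate constant-velocity Kalman filter of the kind that could smooth the resulting trajectory and supply the next-frame prediction that gets fed back for 2D localization.

    import numpy as np

    def triangulate_point(projections, points_2d):
        """Recover a 3D point from its 2D observations in several cameras.

        projections : list of 3x4 camera projection matrices (one per camera)
        points_2d   : list of (u, v) pixel positions of the same body part,
                      e.g. the head location reported by each W4 instance

        Each observation contributes two linear constraints,
        (u*P[2] - P[0]) . X = 0 and (v*P[2] - P[1]) . X = 0, on the
        homogeneous 3D point X; the least-squares solution is the right
        singular vector with the smallest singular value.
        """
        rows = []
        for P, (u, v) in zip(projections, points_2d):
            rows.append(u * P[2] - P[0])
            rows.append(v * P[2] - P[1])
        A = np.vstack(rows)
        _, _, vt = np.linalg.svd(A)
        X = vt[-1]
        return X[:3] / X[3]              # back from homogeneous coordinates

    class ConstantVelocityKalman:
        """Per-coordinate constant-velocity Kalman filter: state [pos, vel].

        One such filter per coordinate of each body part can smooth the
        triangulated trajectory and predict the next-frame position.
        The default dt of 1/25 s is assumed from the 20-30 fps frame rate.
        """

        def __init__(self, dt=1.0 / 25.0, process_var=1.0, meas_var=0.01):
            self.x = np.zeros(2)                        # state: [pos, vel]
            self.P = np.eye(2)                          # state covariance
            self.F = np.array([[1.0, dt], [0.0, 1.0]])  # constant-velocity model
            self.H = np.array([[1.0, 0.0]])             # we measure position only
            self.Q = process_var * np.array([[dt**4 / 4, dt**3 / 2],
                                             [dt**3 / 2, dt**2]])
            self.R = np.array([[meas_var]])

        def predict(self):
            """Propagate the state one frame ahead; returns predicted position."""
            self.x = self.F @ self.x
            self.P = self.F @ self.P @ self.F.T + self.Q
            return self.x[0]

        def update(self, z):
            """Fold in a new (triangulated) position z; returns smoothed position."""
            y = z - self.H @ self.x                     # innovation
            S = self.H @ self.P @ self.H.T + self.R
            K = self.P @ self.H.T @ np.linalg.inv(S)    # Kalman gain
            self.x = self.x + (K @ y)
            self.P = (np.eye(2) - K @ self.H) @ self.P
            return self.x[0]

In the actual system the 3D estimates also go through an optimization step, and the filter's prediction is used to guide each W4 instance's 2D search in the next frame; the sketch only shows the basic triangulate/smooth/predict loop for a single body part.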


© The graphical human model is copyrighted by ATR Media Integration & Communications Research Laboratory




Related Publications & Demonstrations. . . . .
  1. Virtual Metamorphosis [Abstract & Paper at IEEE site]
    J. Ohya, J. Kurumisawa, R. Nakatsu, K. Ebihara, S. Iwasawa, D. Harwood, and T. Horprasert
    IEEE Multimedia: Media Spaces, Vol.6 No.2, IEEE Computer Soc. Press, April-June 1999, pp.29-39.

  2. "Real-time 3D Motion Capture"
    T. Horprasert, I. Haritaoglu, C. Wren, D. Harwood, L. S. Davis, and A. Pentland
    2nd Workshop on Perceptual User Interfaces, San Francisco, USA, November 1998.

  3. "Shall We Dance? Real-Time 3-D Control of a CG Puppet," [See Also: this link]
    with I. Haritaoglu, C. Wren, K. Ebihara, L.S. Davis, J. Ohya, A. Pentland, J. Kurumisawa, T. Sakaguchi.
    In SIGGRAPH98: Emerging Technologies, Orlando, Florida, July 1998.


