University of California, Santa Barbara
CS290I 3D User Interfaces: Final Project

by Svetlin (Alex) Bostandjiev and Benjamin Garn

Augmented Reality (AR)
as a Learning Environment for Physical Phenomena

We created an Augmented Reality tool to help teach Physics concepts in the classroom. One part of the project was to create a tracking system for the objects involved in the educational lessons and calculate and overlay physical forces (mostly Alex). The other part was to come up with a user interface for the different phases of the experiments (mostly Benjamin).

Results

Tracking System and Interaction
Tracking System
Hand Menu Interaction

Tracking System for Physical Phenomena

Introduction

We use Augmented Reality (AR) as a learning environment for physical phenomena. We present a technique that allows professors to use physical props to illustrate material during lessons. The benefit of interacting with the props is that the professor and students can produce their own test scenarios. In our current system the professor can move a mono-colored ball around. While we track the ball in real time, we are able to add graphical augmentations to the ball that, for example, represent its instantaneous velocity, acceleration, and centripetal force. This strengthens students' understanding of physics, as they can reflect upon lessons that pertain to the real world.

Utilizing a physical prop along with virtual objects also allows for a unique illustration of the difference between solutions given by formulas in textbooks and those seen in reality. Take for example a simple demonstration where a teacher drops a physical ball to demonstrate acceleration due to gravity. With the use of augmentations the students are able to view both the physical ball dropped as well as a virtual ball dropped at the same time and from the same height. While the physical ball is subject to drag the virtual ball can be made to fall as if it were in a vacuum. Because we are distributing the video, students viewing the lecture from a computer will also be able to rewind the video and view the balls dropping in slow motion to allow for more accurate comparisons.

Additionally we are able to use our technique in order to demonstrate some physical examples which are not as easily or as powerfully communicated through two dimensional drawings or illustrations. An excellent example is given by the figure. In the illustration a fork and spoon are balanced on the edge of a glass using only a matchstick. While this is an impressive demonstration, it may be difficult for some students to visualize the center of mass and the vectors along which the forces are interacting. Using virtual annotations the center of mass for both the fork and spoon can be placed in 3D along with the center of mass of the complete configuration. This would allow students to view the setup from any angle, while still seeing an illustration of exactly where the center of mass lies, and the forces which are applied to the glass and matchstick.

Previous work

Some previous works [Fjeld 2003][Shelton 2002][Kaufmann 2003] explore the use of AR in education. These works focus on showing 3D graphical models to assist students with complex spatial problems. Simple user interaction is also implemented; however, these models are primarily governed by their own internal forces in an ideal setting. Our system, on the other hand, ties the real world more directly to the virtual models: because we attempt to teach physics, the calculations of physical forces are based on events that take place in the real world. Moreover, by means of AR we are able to visualize forces that our eyes cannot see.

Implementation

We present two technological advances for distance learning. First, we provide a technology that allows instructors to easily add computer data and images to physical props used during their Physics lectures. To do this, we use ARToolKit to establish a coordinate system such that the xy plane corresponds to a flat surface on which the demonstration takes place. Assuming that the flat surface is parallel to the ground, we can account for gravity in the negative z direction. Then, using computer vision, we can track real props involved in the educational lessons. Our current system can track simple props such as mono-colored balls. Having a coordinate system in which we can reliably track the 3D position of simple real props over time provides us with a powerful educational framework in which the computer can calculate physical quantities and forces taking place in the real world demonstration, such as speed, velocity, acceleration, centripetal and centrifugal force, pressure, friction, elasticity, and energy changes. We are then able to overlay graphics on top of the real props to visualize these forces, which are invisible to the human eye.

The first phase of every demonstration is introducing into the system any real props the instructor is interested in using. This phase involves obtaining useful information about the prop, such as its color histogram and shape. Then, during the experiment, we track the props in real time to locate their 3D positions. In the case of a mono-colored ball we initially retrieve information about the color, shape, and real size of the ball. Then, at every frame, we use a simple color segmentation approach to approximate where the center pixel of the ball is. After a first rough approximation of the location of the center pixel, we grow a region of similarly colored pixels around that pixel. That region eventually covers the whole ball, and its center of mass then defines the 2D center pixel of the ball. Afterwards, we use simple trigonometry to find the 3D position of the center of the ball relative to the coordinate system defined by the ARToolKit marker. This technique provides us with a simple way of tracking the 3D position of regular single-colored objects. Once we know where the center of the ball is at every frame, we calculate the ball's instantaneous velocity (blue), acceleration (green), and centripetal force [1] (yellow), and overlay vectors on top of the ball (see the figure below).
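The region-growing step described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the Image struct, the per-channel tolerance test in similar(), and the choice to compare every pixel against the seed color are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical RGB image, stored row-major with 3 bytes per pixel.
struct Image {
    int width, height;
    std::vector<std::uint8_t> rgb;
    const std::uint8_t* at(int x, int y) const { return &rgb[3 * (y * width + x)]; }
};

struct Centroid { double x, y; int pixels; };

// Assumed similarity test: every channel differs by at most `tol`.
inline bool similar(const std::uint8_t* a, const std::uint8_t* b, int tol) {
    for (int c = 0; c < 3; ++c)
        if (std::abs(int(a[c]) - int(b[c])) > tol) return false;
    return true;
}

// Grow a 4-connected region of pixels similar to the seed pixel and
// return the center of mass of that region (the 2D center of the ball).
inline Centroid growRegion(const Image& img, int seedX, int seedY, int tol) {
    assert(seedX >= 0 && seedX < img.width && seedY >= 0 && seedY < img.height);
    std::vector<char> visited(img.width * img.height, 0);
    std::queue<std::pair<int, int> > frontier;
    const std::uint8_t* ref = img.at(seedX, seedY);  // color of the seed pixel
    frontier.push(std::make_pair(seedX, seedY));
    visited[seedY * img.width + seedX] = 1;

    double sumX = 0.0, sumY = 0.0;
    int n = 0;
    const int dx[4] = {1, -1, 0, 0};
    const int dy[4] = {0, 0, 1, -1};
    while (!frontier.empty()) {
        const std::pair<int, int> p = frontier.front();
        frontier.pop();
        sumX += p.first;
        sumY += p.second;
        ++n;
        for (int k = 0; k < 4; ++k) {
            const int nx = p.first + dx[k], ny = p.second + dy[k];
            if (nx < 0 || ny < 0 || nx >= img.width || ny >= img.height) continue;
            const int idx = ny * img.width + nx;
            if (visited[idx] || !similar(img.at(nx, ny), ref, tol)) continue;
            visited[idx] = 1;
            frontier.push(std::make_pair(nx, ny));
        }
    }
    return Centroid{sumX / n, sumY / n, n};
}
```

Because the centroid is computed over the whole grown region, a rough seed anywhere on the ball yields the same center pixel, which is why an approximate first estimate is sufficient.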


Tracking a physical prop


The tracked object (orange ball) with overlaid graphics:
trajectory of motion (black Bezier curve), velocity (blue), acceleration (green), centripetal force (yellow)

Another type of augmentation is to display particular "out of the book" graphs of interest along with the demonstration. This provides visual explanations of graphs that are fundamental in Physics and are often confusing for students without a strong mathematical background. Also we can dynamically draw diagrams on the screen to illustrate changes of forces with time. Formulas can be displayed along with the graphs and diagrams.

To enhance teaching of the physical concepts, the instructor may need to reproduce the demonstration in slow motion, go through it frame by frame, or just display a particular snapshot of interest. Therefore, we have implemented video recording and playback, as well as frame-by-frame access. While showing particular frames, the instructor may display different types of augmentations (e.g. vectors, graphs, formulas) as appropriate.

Another benefit of having a recorded video stream is that it provides us with information about future frames, which can be used to correct for inaccuracies of the real-time computer vision algorithms. In the example we have developed, we use the information from all frames to accurately display the trajectory of the ball over the whole demonstration period. We achieve this by fitting a Bezier curve to the set of points representing the ball's position at each frame [2]. Being able to display the trajectory of the ball over time helps teach that the net force on a ball moving along a curve can be decomposed into a tangential component that changes the speed of the ball and a perpendicular centripetal component that changes the direction of motion. Moreover, the instructor can give a visual explanation of the concept of instantaneous acceleration by showing how the acceleration vector is the difference of two consecutive velocity vectors divided by the time between frames.
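The decomposition described above can be illustrated with finite differences over the tracked positions. The sketch below is an assumption about how such a computation could look, not the project's code: Vec3, Motion, and motionAt are hypothetical names, and the positions are assumed to be sampled at a fixed interval dt. Multiplying the acceleration components by the mass of the ball gives the corresponding force components.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Vec3 { double x, y, z; };

inline Vec3 operator-(Vec3 a, Vec3 b) { return Vec3{a.x - b.x, a.y - b.y, a.z - b.z}; }
inline Vec3 operator*(Vec3 a, double s) { return Vec3{a.x * s, a.y * s, a.z * s}; }
inline double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

struct Motion { Vec3 velocity, acceleration, tangential, centripetal; };

// Estimate the motion at frame i from positions sampled every dt seconds.
// Requires one neighboring frame on each side of i.
inline Motion motionAt(const std::vector<Vec3>& p, std::size_t i, double dt) {
    assert(i >= 1 && i + 1 < p.size() && dt > 0.0);
    const Vec3 vPrev = (p[i] - p[i - 1]) * (1.0 / dt);   // velocity over the previous frame
    const Vec3 vNext = (p[i + 1] - p[i]) * (1.0 / dt);   // velocity over the next frame
    const Vec3 v = (p[i + 1] - p[i - 1]) * (0.5 / dt);   // central-difference velocity
    const Vec3 a = (vNext - vPrev) * (1.0 / dt);         // difference of consecutive velocities

    // Project the acceleration onto the direction of motion (tangential part);
    // the remainder is perpendicular to the motion (centripetal part).
    const double speed2 = dot(v, v);
    const Vec3 aT = speed2 > 0.0 ? v * (dot(a, v) / speed2) : Vec3{0.0, 0.0, 0.0};
    const Vec3 aC = a - aT;
    return Motion{v, a, aT, aC};
}
```

For a ball moving at constant speed on a circle, the tangential component is close to zero and the centripetal component has magnitude close to v^2 / r, which gives a convenient sanity check for the tracker.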

Future Work

The first area of improvement will be increasing the amount of interaction possible when using the physical props. So far we can reliably track single-colored, regularly shaped props. This allows us to easily integrate into our system a variety of physical phenomena that can be simulated using such simple props. For example, we can track a pendulum to assist in teaching the concepts of angular momentum and torque [3], or we can augment Newton's cradle to teach conservation of momentum [4][5]. To extend the capabilities of our system to account for more complex physical events, our next step would be to allow instructors to introduce props of irregular shape and different color patterns into the system.

Interaction

This part of the project is about creating an interface that can be controlled with a hand that is tracked using a camera.

Hand Color Tracking

Getting the camera image

We used the OpenCV library to be independent of the camera type and driver. OpenCV is also responsible for configuring the camera's resolution and frame rate. The API delivers a 32-bit RGBA image. The framework is very fast. Its only shortcoming is that OpenCV is unable to fix the contrast and brightness of the camera, which causes problems when the lighting conditions change.

Transforming the color space

Color tracking is by definition the tracking of a specified color, but the same real-world color might appear very different under different lighting conditions. To get around this problem we convert the camera image into the HSV color space, where the hue value is computed from the ratio between red, green and blue. This makes it possible to identify the dominant color of a pixel while ignoring the brightness of the color.
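As an illustration of why hue is insensitive to brightness, here is a minimal sketch of the standard RGB to HSV conversion. The Hsv struct and the rgbToHsv function are hypothetical names chosen for this example, and the scaling (inputs in [0, 1], hue in degrees) is an assumption; the project may have used a different range or a library conversion.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// h in degrees [0, 360), s and v in [0, 1].
struct Hsv { double h, s, v; };

// Convert an RGB color with channels in [0, 1] to HSV.
inline Hsv rgbToHsv(double r, double g, double b) {
    assert(r >= 0.0 && r <= 1.0 && g >= 0.0 && g <= 1.0 && b >= 0.0 && b <= 1.0);
    const double maxC = std::max({r, g, b});
    const double minC = std::min({r, g, b});
    const double delta = maxC - minC;

    double h = 0.0;  // hue is undefined for gray pixels; 0 is used by convention
    if (delta > 0.0) {
        if (maxC == r)      h = 60.0 * std::fmod((g - b) / delta, 6.0);
        else if (maxC == g) h = 60.0 * ((b - r) / delta + 2.0);
        else                h = 60.0 * ((r - g) / delta + 4.0);
        if (h < 0.0) h += 360.0;
    }
    const double s = maxC > 0.0 ? delta / maxC : 0.0;
    return Hsv{h, s, maxC};
}
```

Scaling all three channels by the same factor leaves the ratios, and therefore the hue, unchanged: a bright orange (1.0, 0.5, 0.0) and a dark orange (0.5, 0.25, 0.0) both map to a hue of 30 degrees and differ only in v.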

Tracking the color

The tracking color is defined using the hue (H) and saturation (S) values from HSV. A threshold defines the maximum allowed distance from the tracking color. As a "one-size-fits-all" solution we used a threshold of 15 for hue and 1 for saturation. The image is then scanned sequentially, and each pixel is checked to see whether it falls within the threshold. This step results in a binary image. The position of the tracked object is defined as the average position of the pixels set in the binary image.
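The thresholding and averaging steps can be sketched as below. This is an illustrative sketch under stated assumptions: HsPixel, Track, hueDistance, and trackColor are hypothetical names, hue is assumed to be in degrees in [0, 360) and saturation in [0, 1], and the hue distance is assumed to wrap around so that reds near 0 and 360 are treated as neighbors. The units of the thresholds in the original system are not stated.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct HsPixel { double hue; double sat; };  // assumed: hue in degrees, sat in [0, 1]

struct Track {
    bool found;                       // false if no pixel matched
    double x, y;                      // average position of matching pixels
    std::vector<unsigned char> mask;  // binary image: 1 = within threshold
};

// Distance between two hues on the color circle.
inline double hueDistance(double a, double b) {
    const double d = std::fabs(a - b);
    return d > 180.0 ? 360.0 - d : d;
}

// Scan the image sequentially, build the binary mask, and average the
// coordinates of all pixels that fall within the thresholds.
inline Track trackColor(const std::vector<HsPixel>& img, int width, int height,
                        HsPixel target, double hueTol, double satTol) {
    assert(width > 0 && height > 0);
    assert(img.size() == static_cast<std::size_t>(width) * static_cast<std::size_t>(height));
    Track t{false, 0.0, 0.0, std::vector<unsigned char>(img.size(), 0)};
    double sumX = 0.0, sumY = 0.0;
    std::size_t count = 0;
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            const std::size_t i = static_cast<std::size_t>(y) * width + x;
            if (hueDistance(img[i].hue, target.hue) <= hueTol &&
                std::fabs(img[i].sat - target.sat) <= satTol) {
                t.mask[i] = 1;
                sumX += x;
                sumY += y;
                ++count;
            }
        }
    }
    if (count > 0) {
        t.found = true;
        t.x = sumX / count;
        t.y = sumY / count;
    }
    return t;
}
```

With saturation in [0, 1], a saturation threshold of 1 accepts every saturation value, so under that assumption only the hue threshold would actually restrict the match.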

Reducing noise

If there are many similar colors in an image, the previous step will generate a lot of noise. To reduce this noise we increase the weight of pixels that are surrounded by pixels that are also within the threshold. This removes small outliers that would otherwise disturb the position.
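One possible way to realize this weighting is sketched below. The exact weighting used in the project is not given, so this is an assumption: each matching pixel is weighted by the number of matching pixels around it inside a square window (3x3 or 5x5), and isolated pixels therefore receive weight zero. WeightedPos and weightedPosition are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct WeightedPos { bool found; double x, y; };

// Weighted average position of the set pixels in a binary mask.
// The weight of a set pixel is the number of other set pixels inside the
// window x window neighborhood centered on it (window must be odd, >= 3).
inline WeightedPos weightedPosition(const std::vector<unsigned char>& mask,
                                    int width, int height, int window) {
    assert(window >= 3 && window % 2 == 1);
    assert(mask.size() == static_cast<std::size_t>(width) * static_cast<std::size_t>(height));
    const int r = window / 2;
    double sumW = 0.0, sumX = 0.0, sumY = 0.0;
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            if (!mask[y * width + x]) continue;
            int neighbors = 0;
            for (int dy = -r; dy <= r; ++dy) {
                for (int dx = -r; dx <= r; ++dx) {
                    if (dx == 0 && dy == 0) continue;  // do not count the pixel itself
                    const int nx = x + dx, ny = y + dy;
                    if (nx < 0 || ny < 0 || nx >= width || ny >= height) continue;
                    if (mask[ny * width + nx]) ++neighbors;
                }
            }
            sumW += neighbors;
            sumX += neighbors * static_cast<double>(x);
            sumY += neighbors * static_cast<double>(y);
        }
    }
    if (sumW == 0.0) return WeightedPos{false, 0.0, 0.0};
    return WeightedPos{true, sumX / sumW, sumY / sumW};
}
```

In this sketch a single stray pixel far from the hand contributes nothing to the result, whereas in the plain average it would pull the tracked position toward itself.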


Camera image

without noise reduction

3x3 pixel noise reduction

5x5 pixel noise reduction

Creating the menu

The intention of the menu was to create an interface that can be easily integrated into other projects. That means it should be flexible in design and uncomplicated in code integration.

Implementation of the menu framework

The complete menu is object-oriented. All coordinates are 2D and represent pixel coordinates on the screen. Integrating the menu into any OpenGL application requires only a few lines of code.

Creating the menu structure

Before the menu can be used, its structure must be defined. This works as in most common menu libraries.
	// Create a new HandMenu and define its initial position
	menu = new HandMenu(280, 200, 40, 40, true);

	// Create an OpenGL display list for a label
	hdIconRecord(labelRecord);

	// Create the playback submenu
	e = new HandMenuElement(labelMenuPlay);

	// Create a new action button from a function pointer
	a = new HandMenuAction(videoPlayStop);
	a->setGlLabel(labelStop);	// Assign the OpenGL list as label
	a->setFramesUntilAction(30);	// Set after how many frames the action will be executed

	// Put everything together: add the action to the submenu, and the submenu to the menu
	e->addAction(a);
	menu->addMenuElement(e);
				
After creating the structure, two steps remain:

1. The menu must be displayed in the OpenGL display function.
	// display menu
	menu->display();					
				
2. The menu must be informed where the cursor is, to check if any action occurred. This should happen in the OpenGL idle function.
	// tell the menu where the cursor (hand) is
	menu->checkForEvents(hand_x, hand_y);
				

Gesture Activation vs. Time Activation

There are two ways to implement a hand-controlled menu: gesture activation and time-based activation. We decided to use time-based activation because gestures in the air are not very comfortable and also decrease the accuracy of the cursor position. Time-based activation also enables us to create smaller menu entries. A small circle in the cursor indicates when the action will be executed.
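Time-based activation can be sketched as a per-frame counter that fires once the cursor has stayed on a button long enough. The DwellButton class below is a hypothetical illustration and not the HandMenu implementation; in particular, the rule that the cursor must leave the button before it can fire again is an assumption.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical button that triggers its action after the cursor has
// stayed inside its rectangle for a given number of consecutive frames.
class DwellButton {
public:
    DwellButton(int x, int y, int w, int h, int framesUntilAction,
                std::function<void()> action)
        : x_(x), y_(y), w_(w), h_(h), frames_(framesUntilAction),
          counter_(0), fired_(false), action_(std::move(action)) {
        assert(framesUntilAction > 0);
    }

    // Call once per frame with the current cursor position.
    // Returns the progress in [0, 1], e.g. for drawing the indicator circle.
    double update(int cx, int cy) {
        const bool inside = cx >= x_ && cx < x_ + w_ && cy >= y_ && cy < y_ + h_;
        if (!inside) {          // leaving the button resets the timer
            counter_ = 0;
            fired_ = false;
            return 0.0;
        }
        if (fired_) return 1.0; // already triggered; wait until the cursor leaves
        ++counter_;
        if (counter_ >= frames_) {
            fired_ = true;
            action_();
        }
        return static_cast<double>(counter_) / frames_;
    }

private:
    int x_, y_, w_, h_;
    int frames_;
    int counter_;
    bool fired_;
    std::function<void()> action_;
};
```

With 30 frames until activation, as in the menu setup above, the action fires after the cursor has stayed on the button for 30 consecutive updates, and the returned progress value can drive the small indicator circle in the cursor.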

Screenshots

Future work

The tracking behavior at the edges of the image is still not optimal because we work with the average position of all tracked pixels: if half of the hand is outside the image, the tracked position is shifted toward the center of the image. To compensate, the weight of tracked pixels close to the border should be increased. Another solution might be to switch to shape-based hand recognition, where this problem does not occur.

Source Code

source code

References

[1] "Centripetal Force." Wikipedia

[2] "Bezier Curve." Wikipedia

[3] Breisinger, Marc. Höllerer, Tobias. Ford, James K. Folsom, Doug. "Implementation and Evaluation of a 3D Multi Modal Learning Environment." ED-MEDIA, World Conference on Educational Multimedia, Hypermedia & Telecommunications, Orlando, FL, June 26–30, 2006, pp. 2282–2289

[4] "Newton's Cradle." Wikipedia

[5] "Conservation of Momentum." Wikipedia

[Fjeld 2003] Fjeld, M., Juchli, P. and Voegtli, B. M. 2003. Chemistry Education: A Tangible Interaction Approach. Proceedings of INTERACT 2003, pp. 287-294.

[Shelton 2002] Shelton, B. E. 2002. Augmented reality and education: Current projects and the potential for classroom learning. New Horizons for Learning, 9(1).

[Kaufmann 2003] Kaufmann, H., Collaborative Augmented Reality in Education, Position paper for keynote speech at Imagina 2003 conference, Feb. 3rd, 2003. Imagina03, (2003).