From jderobot
Jump to: navigation, search

The main goal of this project is to obtain an estimation of the position and orientation of an agile moving camera only by means of the image sequence retrieved by the camera. To do this, we will first design a 2D tracking algorithm for saliency points in the camera frame, and then put this tracked points as landmarks into an EKF-based 3D location algorithm. Initialization of 3D position of landmarks will be achieved with an auxiliar EKF. Finally, the results (positioned and oriented camera model, landmark loci and their error ellipsoids) will be shown in an openGL window.

Project Card[edit]

  • Project Name: SLAM from a single moving camera
  • Authors: Luis Miguel López Ramos (lopramlm [at] gmail [dot] com) and Jose María Cañas Plaza (jmplaza [at] gsyc [dot] es)
  • Academic Year: 2009-2010
  • Degree: Grad (Telecommunication)
  • Documentation (in spanish): memoria
  • Tags: vision, signal processing
  • Technology: C, C++, jde suite, opencv, Kalman filter, Extended Kalman Filter
  • State: Ready for presentation
  • Targets: Nao, HTC, Wearable Robots, Augmented Reality
  • Source License: GPLv3



In this vide we use a witness camera (yet uncalibrated) in order to demonstrate motion detection.

Last video[edit]

Note that images from the witness camera are NOT used in the location algorithm!


Already adding new features to the map[edit]

We can see the error ellipsoids in the applicant point in black color. When the major axis of the ellipsoid is below the consolidation threshold it is added to the normal map. At this moment it changes its colour.


Automatic feature initialization: first approach[edit]

  • Lost landmarks are discarded 1 by 1 and set out of the covariance map.
  • New landmarks are added to the map without stopping the filter.
  • Using inverse z initialization for features selected by user.


GL view improvements and a better video[edit]

  • Now the cube has got the same orientation of the camera model.
  • A frustrum (field of view pyramid) is showed.
  • A shadow is traced right below the camera for better visualization.


Proof of concept[edit]

The MATLAB code has been correctly translated to C. In addition, I have added the functionality of initializing point positions manually. The code is now ready to show the potential of the EKF.

In this video you can see the GUI with the new 'Tracking3D' tab in which the parameters of the 3D EKF lay. In order not to annoy you, i've skipped the manual assignation process (you will only see the last point). The EKF is initalized from a rough 'a priori' position estimation. Quickly, it finds the true position of the camera. The red 'X' marks are the expected projection of the known points.


EKF full problem solved[edit]

Video: matlab_cube1.flv

Estim q x.jpg


Mini-problem solved[edit]

blue:  real data 
green: real data + gaussian noise (the only data we have access) 
red:   estimates 
time scale in milliseconds
units of w are radians/second

Estim q.jpg

Note that the estimation of velocity is often very different from real velocity. This is because our problem is analogous to a point moving on the surface of a sphere, in which several paths with different integrals drive to the same point.


Splitting the problem in smaller problems for the sake of understanding[edit]

The full-covariance Kalman filter is a huge problem and treating it may be too difficult.

Before resolving the complete problem, we will start resolving a problem which only includes the EKF's prediction step:


We have a rigid object (our camera) which rotates over its center in a case of "constant" angular velocity (angular acceleration is gaussian). We have access to the quaternion corresponding to the orientation with an amount of gaussian noise. Estimate optimally the orientation and angular velocity of the camera.


Preparing the jump from 2D to 3D[edit]

Here comes the hardest part of this project, in which we will start taking into account the relative position and orientation of camera and landmarks in the world. The set of N Kalman filters will become a full-covariance, joint Extended Kalman filter whose state vector will comprend:

  • Camera position (x, y, z)
  • Camera orientation (quaternion: q0 q1 q2 q3)
  • for i from 1 to N:
    • ith Landmark position (x_i, y_i, z_i)


Last videos[edit]




Added patch correlation to 2D tracking[edit]

Now the correlation criterion is used to couple features with Kalman filters. We can control the weights of the correlation and the distance. Concretely, the correlation is measured between a patch around the current feature and the last accepted feature.


Patch Correlation[edit]


Playing with ROIs & Histograms[edit]

ROI (the red square) is a useful tool which reduces the amount of CPU load, making our algorithms faster.

Histograms can be a fast and simple measure for similarity between images. OpenCV provides some functions which compare two histograms



We defined 2 search areas in the point's environment:

  • The inner circle ("promiscuous" area).The filter will seek any point inside such circle.
  • The outer crown. The filter will seek only the non-coupled points inside this zone.

Here is a video showing the last achievements.

The next step is to use the appearance similarity to more accurately couple points and filters.


Some considerations about the tracking policy[edit]

The main issue about the chosen policy is that tracking lacks ===unity===, say, one object gives some features due to noise and more than one kalman filter track the object.

We need to apply a criterion when removing repeated filters. This may be:

Remove newest[edit]

Remove slowest[edit]


Dynamic Kalman filters[edit]

We use a kalman filter pool so that creating and deleting kalman filters costs no significant time to our process and we can use more complex graphics routines.

Now we have a variable number of features, we must be prepared for objects entering and leaving the scene, blinking points and other potentially problematic events. We have identified 8 cases:









For the first experiments, we choose the "always create" policy: every feature which we can't couple with a filter is considered a possible new object and is assigned a new Kalman filter.


Detecting a variable number of objects[edit]

We are going to use the blob extracting library cvBlob (http://code.google.com/p/cvblob/) so that the ball detection process returns the actual number of balls in the scene.

Issue: we have had to compile our project in C++.

The labeling function only searches for connected components. We use a processing similar to opening (Dilation + Erosion) to suppress unconnected pixels.

Issue: Morphological filters are a great load for the CPU and slows down our process dramatically.


Detecting n objects a little better[edit]

Now we reject a cluster feature associated to a filter when there is another filter such that the Mahalanobis distance to our centroid is smaller than the Mahalanobis distance to the original filter.

It takes some time to converge to the optimal and lots of features are discarded, but it seems robust.



Detecting n objects[edit]

  • First we have applied the K-mean algorithm to separate the scene in n clusters.
  • Then we have replicated our Kalman filter n times, cupling each filter to one centroid of the clustering, starting with the closest pair.
  • Here is the result



I have improved the kinetic model of the ball. Now the covariance matrix of the acceleration is not diagonal, but defined by an ellipse (the former circle stretched in the direction of the speed). This ellipse's minor axis is the same as when the ball is not moving (sigma_a), and the major axis is this sigma_a "modulated" by the speed.

p = (sigma_a)^2 * (1 + Alpha * norm(û))

Alpha is a "modulation index"(tunable in the GUI). û is the estimated speed vector.


In this video you can see how the ellipse becomes more eccentric when the speed varies in a concrete direction. When the ball stops, you can see it converging gradually to a circle. Now look at the values of SIGMA_A and SIGMA_V. At this point they are much lower than before. We needed to tell our filter that the variances of error and acceleration were huge, so that it followed the ball, which meant big errors in estimation. Now the error is smaller (the error ellipse is scaled for viewing, look at P_SIZE). This means that our current model is actually better than the former one.


Obtaining an uncertainty ellipse from a covariance matrix[edit]

Let our covariance matrix be

Thus, the parameters of the ellipse are

where p_1 and p_2 are the semi-axis of the ellipse and a is the angle formed by the (1,0) vector and the first axis.

Obtaining a covariance matrix from an uncertainty ellipse[edit]

WARNING: Procedure seen at http://www.mathworks.de/matlabcentral/newsreader/view_thread/173687#447084 is WRONG!!

The right way:

Obtain the vectors which represent the axis of the ellipse: V1, V2. The covariance matrix is

Q = [V1(:) V2(:)];
Cov = Q * Q';


We need uncertainty ellipses[edit]

Now the ellipse is displayed we realize that it always is spherical (in 2D case, a circle). This is because we're using a model in which the acceleration covariance is a diagonal matrix (i.e. there's no cross-variance between x and y) and it's independent from time and the state of the filter.



2D Kalman filter works fine![edit]

Now the filter gives a good and stable estimation of the position and speed of the ball. We can see that the green marks (estimation, cross is position, vector is speed) are similar to the blue ones, but vary more smoothly. I've put a button in the GUI which resets the filter and, in addition, lets us change the model's variables in order to tune the filter finely.

The controls:

* Sigma_a => variance of the acceleration  (zero-mean Gaussian)
* Sigma_v => variance of the measure error (zero-mean Gaussian)
* B       => initial auto covariance for the x and y components of position
* B_u     => initial auto covariance for the x and y components of speed
* DELTA_T => theoretically it is the time between two iterations of the filter, however it can be an arbitrary value, used to adjust the damping of the speed estimation 

The filter predicts the position of the ball even without measures of the position. This makes the cross to follow the ball's most likely trajectory when it is occluded (in this case by the sheet of paper).



Trying to include a Kalman filter which tracks in 2D an object (in this case, the Aibo ball).

The blue circle is the measure, the green cross is the Kalman position estimation. Arrows represent speed.

Currently I have errors, the cross moves well at beginning but becomes lazier and lazier.



Working with matrices and gsl[edit]

I am using gsl library in order to make matrix operations required by the Kalman filter.

gsl_matrix_calloc initializes matrix positions to 0.

gsl_matrix_set and gsl_matrix_get are used to access matrix values.

Matrix Product: The library gsl_blas ("basic linear algebra subprograms") provides matrix multiplication.

Inversion: we must use gsl_linalg_LU_decomp and gsl_linalg_LU_invert.

GNU Octave is useful to visualize our Matrix operations.


I have included a corner detector in the openCVdemo schema. It uses the function "GoodFeaturesToTrack" from OpenCV.

GoodFeaturesToTrack receives a monochrome image and looks for points with a significative level gap.

A good point is that it "sees" the vertices of the cube's red surface. It also sees the lamp edges.

It would be better if it found the fourth vertex of the red square. On the left, there is a messy mass of feature points that may confuse our application.


Calibrating Cameras[edit]

Calibrador Schema[edit]

Extrinsics Schema[edit]


As an exercise in order to learn how to work with color variables and graphic interfaces, I have included a RGB color filter in OpenCVDemo.