Introduction to Augmented Reality using AprilTags
In this assignment, you are given a set of images with an AprilTag. You are also given the size of the tag and the pixel locations of its corners on the images. Your task is to recover the camera pose with two different approaches: PnP with coplanar assumption, and P3P followed by Procrustes. After you recover the camera pose, we will place a virtual camera at that same position to render a virtual bird as if it stands on the tag.
Figure 1: Projecting a bird mesh on the AprilTag center. (a) Sample image. (b) Sample result.
1 Problem 1 – 5pts, Establish a World Coordinate
Figure 2: World coordinate setup.
To know where the camera is in the world, we must first define a world coordinate system. For this assignment, since we want to place a bird onto the tag, it is convenient to place the origin of the world coordinate system at the center of the tag, as shown in the figure above.
From the lecture, we know that recovering the camera pose from 2D-3D correspondences requires at least 3 points. Here the AprilTag provides 4. We ran AprilTag detection software to find the 2D pixel coordinates of a, b, c, d, which are provided to you in corners.npy and will be used in run_PnP.py and run_P3P.py.
Your first task is to find the 3D world coordinates of a, b, c, d by completing est_Pw.py, given the side length of the tag. With the world origin placed at the center of the tag, this should not be difficult. This function will be used in the next two sections.
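As a rough illustration of this step, here is a minimal sketch. The corner order and the axis directions are assumptions made only for this example; the actual order of a, b, c, d and the orientation of the x and y axes must be read off the world coordinate figure. The function signature is also hypothetical.

```python
import numpy as np

def est_Pw(s):
    """Sketch: world coordinates of the four tag corners for a tag of side s.

    ASSUMPTION: the origin is the tag center, the tag lies in the z = 0 plane,
    and the rows are ordered a, b, c, d going around the square. The true
    ordering and axis directions depend on the assignment's figure.
    """
    h = s / 2.0
    return np.array([
        [-h, -h, 0.0],  # a (assumed)
        [ h, -h, 0.0],  # b (assumed)
        [ h,  h, 0.0],  # c (assumed)
        [-h,  h, 0.0],  # d (assumed)
    ])
```

Whatever the ordering, every corner sits half a side length from the origin along each in-plane axis, and all z components are zero.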
1.1 Files to complete and submit
- est_Pw.py
This function is responsible for finding the world coordinates of a, b, c and d given the tag size.
2 Problem 2 – 25pts, Solve PnP with Coplanar assumption
After getting the 2D-3D correspondences, we will recover the camera pose with two different approaches. This section covers the first method, which closely follows the lecture slides "Extrinsics from Collineation – Pose from Projective Transformations".
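Since the world frame is conveniently placed on the tag, the z component of each corner is zero. Following the slides, we have

$$\lambda \begin{pmatrix} u \\ v \\ 1 \end{pmatrix} = K \begin{pmatrix} r_1 & r_2 & t \end{pmatrix} \begin{pmatrix} x \\ y \\ 1 \end{pmatrix}$$

where (x, y, 0) are the world coordinates of a corner, (u, v) is its pixel location, r_1 and r_2 are the first two columns of R, and K (r_1 r_2 t) is a homography H. To recover H, use est_homography.py, which you programmed in hw2. We also provide a completed version; you are free to use either.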
After getting H, follow the slides to recover R and t. Note that in the slides R = Rcw and t = tcw, but the function must return Rwc and twc, which describe the camera pose in the world frame. You will complete all of the above inside PnP.py.
Once you complete est_Pw.py and PnP.py, run python run_PnP.py in a terminal. This will generate a GIF called bird_collineation.gif.
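The decomposition can be sketched roughly as follows, assuming H is proportional to K (r1 r2 t) with R = Rcw and t = tcw, and that the tag is in front of the camera. The function name and signature are hypothetical and are not taken from the starter code; the final two lines use the standard inversion of a rigid transform, Rwc = Rcw^T and twc = -Rcw^T tcw.

```python
import numpy as np

def pose_from_homography(H, K):
    """Sketch: recover (R_wc, t_wc) from a tag-plane-to-image homography H."""
    Hn = np.linalg.inv(K) @ H            # proportional to [r1 r2 t]
    if Hn[2, 2] < 0:                     # ASSUMPTION: tag in front of camera, so t_z > 0
        Hn = -Hn
    h1, h2, h3 = Hn[:, 0], Hn[:, 1], Hn[:, 2]

    # Project [h1 h2 h1 x h2] onto the nearest rotation matrix via SVD
    U, _, Vt = np.linalg.svd(np.column_stack((h1, h2, np.cross(h1, h2))))
    S = np.diag([1.0, 1.0, np.linalg.det(U @ Vt)])   # enforce det(R) = +1
    R_cw = U @ S @ Vt
    t_cw = h3 / np.linalg.norm(h1)       # undo the unknown scale of H

    # Invert the world-to-camera transform to get the camera pose in the world
    R_wc = R_cw.T
    t_wc = -R_cw.T @ t_cw
    return R_wc, t_wc
```

The SVD step matters with noisy detections, where the first two columns of K^-1 H are generally neither exactly orthogonal nor of equal length.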
2.2 Files to complete and submit
- PnP.py
This function is responsible for recovering R and t from the 2D-3D correspondences.
- bird_collineation.gif
This GIF visualizes the bird for method 1.
3 Problem 3 – 70pts, Solving the Perspective 3-Point problem
Here you will calculate the camera pose by first computing the 3D coordinates of any 3 (out of 4) corners of the AprilTag in the camera frame. You already know the 3D coordinates of the same points in the world frame from Problem 1, so you will use this correspondence to solve for the camera pose R, t in the world frame by implementing Procrustes.
3.1 P3P Derivation – 20 pts
Figure 3: Illustrates the geometry of the three point space resection problem. The triangle side lengths a,b,c are known and the angles α,β,γ are known. The problem is to determine the lengths s1,s2,s3 from which the 3D vertex point positions (in the camera frame) P1,P2,P3 can be immediately determined.
You only need to carry the derivation far enough to obtain a 4th-degree polynomial. That is, express the coefficients of this polynomial in terms of a, b, c, α, β, γ; in the next part we will code them up to find the roots of the polynomial.
Part of the P3P proof was already covered in class (PnP slides). If you have difficulty solving P3P, you can refer to this document: P3P reference paper (you need to use Grunert's solution).
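As a starting point, the derivation begins from the law of cosines applied to the three triangles formed by the camera center and each pair of points. The equations below assume the common convention in which a = |P2 − P3|, b = |P1 − P3|, c = |P1 − P2|, and α, β, γ are the angles at the camera center opposite a, b, c respectively; check this labelling against Figure 3 before relying on it.

$$
\begin{aligned}
s_2^2 + s_3^2 - 2 s_2 s_3 \cos\alpha &= a^2 \\
s_1^2 + s_3^2 - 2 s_1 s_3 \cos\beta &= b^2 \\
s_1^2 + s_2^2 - 2 s_1 s_2 \cos\gamma &= c^2
\end{aligned}
$$

Substituting s_2 = u s_1 and s_3 = v s_1 gives three expressions for the same quantity:

$$
s_1^2 = \frac{a^2}{u^2 + v^2 - 2uv\cos\alpha} = \frac{b^2}{1 + v^2 - 2v\cos\beta} = \frac{c^2}{1 + u^2 - 2u\cos\gamma}
$$

Eliminating u between these expressions leaves a single 4th-degree polynomial in v.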
3.2 Coding Sections
3.2.1 P3P – 25 pts
In this function, you need to implement P3P to calculate the 3D coordinates of the 3 points in the camera frame. Refer to the slides "Perspective N Point Problem".
You will define a,b,c and α,β,γ and then compute the coefficients of the polynomial you got in the previous part. Then you will solve the polynomial to get its roots which will then give you the distances s1,s2,s3. You will then use these distances to get the 3D coordinates of the 3 points in the camera frame.
Important Points to consider:
- You have 4 corners and are free to use any 3 of them.
- You can use numpy.roots to get the roots of the polynomial.
- You will get a total of 4 roots, of which 2 will be real. Select the real root for which the distances of all 3 points from the camera are positive.
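The steps above can be sketched as follows. This is an illustration under stated assumptions, not the expected solution: the function name and signature are hypothetical, the labelling assumes a = |P2 − P3|, b = |P1 − P3|, c = |P1 − P2| with α, β, γ the angles opposite them at the camera center, and the quartic is assembled numerically with NumPy polynomial arithmetic instead of the closed-form coefficients you are asked to derive. It uses the substitution s2 = u s1, s3 = v s1.

```python
import numpy as np

def p3p_camera_points(pixels, Pw, K):
    """Sketch: candidate camera-frame coordinates of 3 points.

    pixels: (3, 2) pixel coordinates, Pw: (3, 3) world coordinates, K: (3, 3).
    Returns a list of (3, 3) arrays, one per admissible real root.
    """
    # Unit bearing vectors through each pixel
    homog = np.column_stack((pixels, np.ones(3)))
    j = (np.linalg.inv(K) @ homog.T).T
    j /= np.linalg.norm(j, axis=1, keepdims=True)

    # Triangle side lengths and cosines of the angles between bearings
    a = np.linalg.norm(Pw[1] - Pw[2])
    b = np.linalg.norm(Pw[0] - Pw[2])
    c = np.linalg.norm(Pw[0] - Pw[1])
    ca, cb, cg = j[1] @ j[2], j[0] @ j[2], j[0] @ j[1]

    k, q = (a**2 - c**2) / b**2, c**2 / b**2
    N = np.array([k - 1.0, -2.0 * k * cb, 1.0 + k])  # u = N(v) / D(v)
    D = np.array([-2.0 * ca, 2.0 * cg])
    W = np.array([1.0, -2.0 * cb, 1.0])              # 1 + v^2 - 2 v cos(beta)
    # Quartic in v: N^2 - 2 cos(gamma) N D + D^2 (1 - q W) = 0
    quartic = np.polyadd(
        np.polysub(np.polymul(N, N), 2.0 * cg * np.polymul(N, D)),
        np.polymul(np.polymul(D, D), np.polysub([1.0], q * W)),
    )

    candidates = []
    for v in np.roots(quartic):
        if abs(v.imag) > 1e-6:           # keep (numerically) real roots only
            continue
        v = v.real
        u = np.polyval(N, v) / np.polyval(D, v)
        s1 = b / np.sqrt(np.polyval(W, v))
        s = np.array([s1, u * s1, v * s1])
        if np.all(s > 0):                # all three distances must be positive
            candidates.append(s[:, None] * j)  # P_i = s_i * j_i
    return candidates
```

More than one root may pass the positivity check; returning every candidate leaves the final choice to the caller, for example by checking which one is most consistent with the unused fourth corner (a suggestion, not something the assignment specifies).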
3.2.2 Procrustes – 25 pts
You need to use the correspondence of the same 3 points in the world frame and the camera frame and solve for camera pose using the Procrustes method. Use the Procrustes slides from class for your reference.
$$\min_{R,T} \sum_{i=1}^{N=3} \lVert A_i - R B_i + T \rVert^2$$
Once you complete P3P.py, run python run_P3P.py in a terminal. This will generate a GIF called bird_P3P.gif.
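A minimal sketch of an SVD-based Procrustes solution is shown below. It assumes the model A_i ≈ R B_i + t; if the slides use a different sign convention for the translation, adjust t accordingly. The function name and signature are hypothetical.

```python
import numpy as np

def procrustes(A, B):
    """Sketch: find R, t minimizing sum_i ||A_i - (R B_i + t)||^2.

    A, B: (N, 3) arrays of corresponding points (N >= 3, not collinear).
    """
    A_mean, B_mean = A.mean(axis=0), B.mean(axis=0)
    # 3x3 cross-covariance of the centered point sets
    M = (A - A_mean).T @ (B - B_mean)
    U, _, Vt = np.linalg.svd(M)
    # Force det(R) = +1 so the result is a rotation, not a reflection
    S = np.diag([1.0, 1.0, np.linalg.det(U @ Vt)])
    R = U @ S @ Vt
    t = A_mean - R @ B_mean
    return R, t
```

Under this convention, calling procrustes(Pw, Pc) with the world points as A and the camera-frame points as B would return Rwc and twc directly. The determinant correction is important here because three points are always coplanar, which leaves the sign of the third singular direction ambiguous.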
3.3.1 Files to complete and submit
- P3P.py
This file has two functions, P3P and Procrustes, that you need to write. P3P solves the polynomial, calculates the distances of the 3 points from the camera, and uses them to obtain the coordinates of the points in the camera frame, which correspond to their known coordinates in the world frame. You need to call Procrustes inside P3P to return R, t.
- bird_P3P.gif
This GIF visualizes the bird for method 2.