Each .mat file contains:

1. face_img: N*360*360. The face images in Normalization Space II.
2. gaze_dirs: N*2. Pitch and yaw of the gaze direction in Normalization Space II.
3. warpMats: 3*3*N. The warping matrices from the camera coordinate system to Normalization Space II.
4. gazeRotMat: 3*3*N. The linear transformation matrices relating Normalization Space II and Normalization Space I, i.e., gaze in Normalization Space II = gazeRotMat * gaze in Normalization Space I.
5. meta:
   - col_1: image index in the original video;
   - col_2: if the value is 1, this image can be used for calibration;
   - col_3: face location;
   - col_4: gaze region;
   - col_5, 6: pitch and yaw of the gaze direction in Normalization Space I (in radians);
   - col_7 - 30: (x, y)*12, the 12 landmarks of the left eye and right eye.

Note that for face location (facing the people) and gaze region (facing the wall), 1 is upper-left, 2 is upper-center, 3 is upper-right, 4 is left, 5 is center, 6 is right, 7 is lower-left, 8 is lower-center, and 9 is lower-right.
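The layout above can be read with `scipy.io.loadmat`. The sketch below writes a tiny synthetic file with the documented field shapes (so it runs without the real dataset), then loads it and slices the `meta` columns; the filename, the `N = 2` dummy contents, and the pitch/yaw-to-vector sign convention in `pitchyaw_to_vector` are assumptions for illustration, not part of the dataset specification.

```python
import numpy as np
from scipy.io import loadmat, savemat

def pitchyaw_to_vector(pitch, yaw):
    # Convert (pitch, yaw) in radians to a 3D unit gaze vector.
    # Sign convention (gaze roughly along -z) is an assumption,
    # not stated in the dataset description.
    return np.array([
        -np.cos(pitch) * np.sin(yaw),
        -np.sin(pitch),
        -np.cos(pitch) * np.cos(yaw),
    ])

# Build a tiny synthetic .mat with the documented layout (N = 2),
# purely so the loading code below is self-contained.
N = 2
savemat("demo_subject.mat", {
    "face_img": np.zeros((N, 360, 360), dtype=np.uint8),  # N*360*360
    "gaze_dirs": np.zeros((N, 2)),                        # N*2 (pitch, yaw)
    "warpMats": np.stack([np.eye(3)] * N, axis=2),        # 3*3*N
    "gazeRotMat": np.stack([np.eye(3)] * N, axis=2),      # 3*3*N
    "meta": np.zeros((N, 30)),                            # N*30
})

data = loadmat("demo_subject.mat")
face_img = data["face_img"]     # N x 360 x 360 face images
gaze_dirs = data["gaze_dirs"]   # N x 2: pitch, yaw in Normalization Space II
meta = data["meta"]             # N x 30 metadata matrix

calib_mask = meta[:, 1] == 1                 # col_2: 1 = usable for calibration
landmarks = meta[:, 6:30].reshape(N, 12, 2)  # col_7 - 30: (x, y)*12 eye landmarks
g = pitchyaw_to_vector(*gaze_dirs[0])        # 3D gaze vector for the first frame
```

Note that `meta` columns are 1-based in the description but 0-based in NumPy indexing, which is why col_2 becomes `meta[:, 1]` and col_7 - 30 become `meta[:, 6:30]`.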