Each .mat file contains:

1. face_img: N*360*360. The face images in Normalization Space II.
2. gaze_dirs: N*2. Pitch and yaw of the gaze direction in Normalization Space II.
3. warpMats: 3*3*N. The warping matrices from the camera coordinate system to Normalization Space II.
4. gazeRotMat: 3*3*N. The linear transformation matrices relating Normalization Space II and Normalization Space I, i.e., gaze in Normalization Space II = gazeRotMat * gaze in Normalization Space I.
5. meta:
   - col_1: image index in the original video;
   - col_2: if the value is 1, this image can be used for calibration;
   - col_3: face location;
   - col_4: gaze region;
   - col_5, 6: pitch and yaw of the gaze direction in Normalization Space I (in radians);
   - col_7 - 30: (x, y)*12, the 12 landmarks of the left eye and right eye.

Note that for face location (facing the people) and gaze region (facing the wall), 1 is upper-left, 2 is upper-center, 3 is upper-right, 4 is left, 5 is center, 6 is right, 7 is lower-left, 8 is lower-center, and 9 is lower-right.
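The layout above can be read with `scipy.io.loadmat`. The sketch below writes a tiny synthetic file with the documented field shapes (so it runs without the real dataset), then loads it and slices the `meta` columns; the filename, the `N = 2` dummy contents, and the pitch/yaw-to-vector sign convention in `pitchyaw_to_vector` are assumptions for illustration, not part of the dataset specification.

```python
import numpy as np
from scipy.io import loadmat, savemat

def pitchyaw_to_vector(pitch, yaw):
    # Convert (pitch, yaw) in radians to a 3D unit gaze vector.
    # Sign convention (gaze roughly along -z) is an assumption,
    # not stated in the dataset description.
    return np.array([
        -np.cos(pitch) * np.sin(yaw),
        -np.sin(pitch),
        -np.cos(pitch) * np.cos(yaw),
    ])

# Build a tiny synthetic .mat with the documented layout (N = 2),
# purely so the loading code below is self-contained.
N = 2
savemat("demo_subject.mat", {
    "face_img": np.zeros((N, 360, 360), dtype=np.uint8),  # N*360*360
    "gaze_dirs": np.zeros((N, 2)),                        # N*2 (pitch, yaw)
    "warpMats": np.stack([np.eye(3)] * N, axis=2),        # 3*3*N
    "gazeRotMat": np.stack([np.eye(3)] * N, axis=2),      # 3*3*N
    "meta": np.zeros((N, 30)),                            # N*30
})

data = loadmat("demo_subject.mat")
face_img = data["face_img"]     # N x 360 x 360 face images
gaze_dirs = data["gaze_dirs"]   # N x 2: pitch, yaw in Normalization Space II
meta = data["meta"]             # N x 30 metadata matrix

calib_mask = meta[:, 1] == 1                 # col_2: 1 = usable for calibration
landmarks = meta[:, 6:30].reshape(N, 12, 2)  # col_7 - 30: (x, y)*12 eye landmarks
g = pitchyaw_to_vector(*gaze_dirs[0])        # 3D gaze vector for the first frame
```

Note that `meta` columns are 1-based in the description but 0-based in NumPy indexing, which is why col_2 becomes `meta[:, 1]` and col_7 - 30 become `meta[:, 6:30]`.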