A timeline of research & engineering projects I've worked on so far — spanning SLAM, 3D vision, deep learning, and robotics. Click any title for more details, source code and demos.
2024
Industry Project @ByteDance / PICO PICO / Inter-Perception RD, ByteDance · Shanghai, China · May 2024 – Dec 2024
Core Developer · PICO MR cloud-based mapping and localization — Visual Positioning Service (VPS).
Industry Project @ByteDance / PICO PICO / Inter-Perception RD, ByteDance · Shanghai, China · Mar 2024 – Dec 2024
Project Owner · High-performance Bundle Adjustment for large-scale scenes — Ceres CUDA, FastBA, MegBA.
2023
Industry Project @ByteDance / PICO PICO / Inter-Perception RD, ByteDance · Shanghai, China · Mar 2023 – Dec 2024
Core Developer · Real-time sharing and localization of anchor maps across multiple devices, enabling multi-player MR games.
2022
Industry Project @ByteDance / PICO PICO / Inter-Perception RD, ByteDance · Shanghai, China · Jun 2022 – Dec 2024
Project Owner · Vision-based, low-cost mapping and localization powering MR Spatial Anchors on PICO headsets.
Industry Project @ByteDance / PICO PICO / Inter-Perception RD, ByteDance · Shanghai, China · May 2022 – Jan 2024
Project Owner · Line-feature based VIO designed to improve re-localization performance on MR headsets.
Industry Project @ByteDance / PICO PICO / Inter-Perception RD, ByteDance · Shanghai, China · Feb 2022 – Jun 2022
Project Owner · Fuse cloud-based high-precision localization results into the on-device 6DoF tracking loop.
2021
Industry Project @ByteDance / PICO PICO / Inter-Perception RD, ByteDance · Shanghai, China · Dec 2021 – Feb 2022
Project Owner · Use IMU data to compensate for the rolling-shutter “jelly” effect in RGB cameras under fast motion.
2020
Industry Project @MEGVII Research / SLAM Dev, MEGVII · Beijing, China · Dec 2020 – Mar 2021
Core Developer · Multi-camera and lidar extrinsic calibration with a re-engineered workflow.
Industry Project @MEGVII Research / SLAM Dev, MEGVII · Beijing, China · May 2020 – Dec 2020
Project Owner · End-to-end OCR toolbox covering detection, recognition, post-processing, and TensorRT acceleration.
Industry Project @MEGVII Research / SLAM Dev, MEGVII · Beijing, China · Mar 2020 – Dec 2021
Project Owner · Camera-based semantic mapping for parking environments using line and marker features.
2019
Industry Project @MEGVII Research / SLAM Dev, MEGVII · Beijing, China · Dec 2019 – Mar 2021
Project Owner · AGV-level multi-sensor fusion module for static and dynamic obstacle state estimation.
Industry Project @Horizon Robotics ADAS Product Line, Horizon Robotics · Beijing, China · Jan 2019 – Dec 2019
Core Developer · Evaluation toolboxes covering classification, keypoint detection, keypoint regression, and image-based 3D-box tasks.
2018
Industry Project @Horizon Robotics ADAS Product Line, Horizon Robotics · Beijing, China · Aug 2018 – Dec 2019
Project Owner · Vehicle and cyclist 3D bounding box detection for the auto-driving perception stack.
Industry Project @Horizon Robotics ADAS Product Line, Horizon Robotics · Beijing, China · Jul 2018
Project Owner · Camera IPM-based parking space detection that emits 3D coordinates for downstream parking planning.
Academic Project @Robotics Program CIS Department@UPenn · Philadelphia, PA · Apr 2018
source code
demo 1 / demo 2
- In this project, we implemented Visual Inertial Odometry (VIO) to estimate the state of a quadrotor: its global position [x, y, z], orientation [roll, pitch, yaw], and linear velocity with respect to the world frame [vx, vy, vz].
- We validated the system both in simulation on Linux/ROS and in lab tests on a real quadrotor.
Course Work @Robo Program CIS Department@UPenn · Philadelphia, PA · Feb 2018
Implemented two algorithms essential to robotics.
- Polynomial Trajectory Planning and Generation: designed and generated an optimal, smooth polynomial trajectory for a quadrotor through a specified 3D map.
source code
- Hand-held Camera Calibration: estimated the camera's intrinsic (K) and extrinsic (R, t) matrices from multiple views of a planar target (e.g. a checkerboard), then optimized all variables by minimizing the reprojection error.
source code
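As a rough illustration of polynomial trajectory generation, the sketch below evaluates a single rest-to-rest quintic segment, which is the minimum-jerk polynomial when velocity and acceleration are zero at both ends. It is not the project's planner: the function names, the one-segment setup, and the absence of any obstacle map are assumptions made for the example.

```python
def _normalized_time(duration, t):
    """Map t to [0, 1]; times outside the segment are clamped."""
    if duration <= 0:
        raise ValueError("duration must be positive")
    return min(max(t / duration, 0.0), 1.0)


def min_jerk_position(p0, p1, duration, t):
    """Position at time t on a rest-to-rest quintic from p0 to p1."""
    tau = _normalized_time(duration, t)
    blend = 10 * tau**3 - 15 * tau**4 + 6 * tau**5
    return p0 + (p1 - p0) * blend


def min_jerk_velocity(p0, p1, duration, t):
    """Time derivative of min_jerk_position (zero at both endpoints)."""
    tau = _normalized_time(duration, t)
    dblend = (30 * tau**2 - 60 * tau**3 + 30 * tau**4) / duration
    return (p1 - p0) * dblend


def sample_segment(start, goal, duration, n_samples):
    """Sample a 3D segment; each axis is planned independently."""
    if n_samples < 2:
        raise ValueError("need at least two samples")
    times = [duration * i / (n_samples - 1) for i in range(n_samples)]
    return [
        tuple(min_jerk_position(s, g, duration, t) for s, g in zip(start, goal))
        for t in times
    ]


# Hypothetical waypoints: fly from (0, 0, 1) to (1, 2, 1) in 3 seconds.
waypoints = sample_segment((0.0, 0.0, 1.0), (1.0, 2.0, 1.0), 3.0, 5)
```

A full planner would chain several such segments through waypoints and solve for the coefficients jointly; the closed form here only covers the simplest boundary conditions.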
Research Project @GRASP Lab CIS Department@UPenn · Philadelphia, PA · Jan 2018
source code
Built a deep network for 3D-MNIST classification, 3D bounding box estimation, and cube plane depth estimation.
- 3D data (3D MNIST) pre-processing, including voxel generation and multi-view 2D image generation.
- Construction of a deep network that estimates the class label, depth map, and 2D bbox in a single stage.
- 3D bounding box estimation.
2017
Research Project @Robotics Program CIS Department@UPenn · Philadelphia, PA · Dec 2017
Studied the algorithms and implementations of common generative models, including AE, VAE, and several GANs.
- Implemented an AE (Auto-Encoder) and a VAE (Variational Auto-Encoder) on human face data, e.g. CUFS human faces.
source code
- Studied popular GAN models, including DCGAN, WGAN, and WGAN-GP, and wrote a detailed report discussing their properties.
source code
- Applied the basic GAN model to a more complex problem, image-to-image translation; my implementations included cGAN and CycleGAN.
Academic Project @Robotics Program CIS Department@UPenn · Philadelphia, PA · Oct 2017
source code
Implemented the Faster R-CNN deep network by hand (TensorFlow-based) for object detection and classification, using CIFAR-10 images as the target objects and CIFAR-100 images as the background.
Research Project @GRASP Lab CIS Department@UPenn · Philadelphia, PA · Mar 2017
Implemented a Visual SLAM package that builds a 3D trajectory and a 3D point cloud map.
- Structure from Motion (SfM): implemented triangulation, PnP, and Bundle Adjustment by hand, including the Jacobians, to estimate and optimize camera states and 3D landmarks.
source code
- Visual SLAM Package on a Quadrotor: building on the SfM pipeline, implemented Visual Odometry (VO), applied local Bundle Adjustment (BA), and integrated the g2o package to improve SLAM efficiency and performance.
source code
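To give a feel for the triangulation step, here is a minimal sketch of midpoint triangulation: given two camera centers and the viewing rays toward the same landmark, it returns the point halfway between the closest points on the two rays. The project does not say which triangulation method it used, so this is a stand-in; the function name and the example geometry are hypothetical.

```python
def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def triangulate_midpoint(c1, d1, c2, d2):
    """Midpoint of the shortest segment between rays c1 + s*d1 and c2 + t*d2.

    c1, c2 are camera centers and d1, d2 are ray directions, all 3-tuples
    expressed in the same world frame. Directions need not be normalized.
    """
    w = tuple(a - b for a, b in zip(c1, c2))
    a, b, c = _dot(d1, d1), _dot(d1, d2), _dot(d2, d2)
    d, e = _dot(d1, w), _dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are (nearly) parallel; depth is unobservable")
    s = (b * e - c * d) / denom  # parameter along the first ray
    t = (a * e - b * d) / denom  # parameter along the second ray
    p1 = tuple(ci + s * di for ci, di in zip(c1, d1))
    p2 = tuple(ci + t * di for ci, di in zip(c2, d2))
    return tuple(0.5 * (u + v) for u, v in zip(p1, p2))


# Hypothetical setup: two cameras 1 m apart observing one landmark.
landmark = (0.5, 0.2, 4.0)
cam1, cam2 = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
ray1 = tuple(l - c for l, c in zip(landmark, cam1))
ray2 = tuple(l - c for l, c in zip(landmark, cam2))
estimate = triangulate_midpoint(cam1, ray1, cam2, ray2)
```

With noise-free rays the estimate coincides with the landmark; with real measurements it serves as an initial guess that Bundle Adjustment then refines.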
Research Project @GRASP Lab CIS Department@UPenn · Philadelphia, PA · Feb 2017
Implemented several widely used machine learning algorithms and applied them to real robot tasks.
- Object Detection and Distance Estimation based on GMM: trained GMMs on image color data and combined them with image processing and geometric algorithms to detect any red barrel in a robot's view and estimate its real-world distance from the robot.
source code
- Robot Gesture Recognition using a Hidden Markov Model (HMM): built an HMM-based algorithm to recognize different robot arm motion gestures. It classifies unknown arm motions in near real time.
source code
- Reinforcement Learning using Policy Gradient Methods: implemented several methods for the Robot Walking on the Frozen Lake problem. The problem can be modeled as a Markov Decision Process (MDP), and a strategy for retrieving the frisbee can be obtained using value iteration (VI), policy iteration (PI), and policy gradient optimization (PGO).
source code
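Of the three solution methods named for the Frozen Lake problem, value iteration is the shortest to write down. The sketch below is a generic tabular version, not the project's code. The transition table layout (`P[s][a]` as a list of `(probability, next_state, reward, done)` tuples) and the tiny three-state chain are assumptions chosen for illustration.

```python
def q_value(P, V, s, a, gamma):
    """Expected return of taking action a in state s, then following V."""
    return sum(
        prob * (reward + (0.0 if done else gamma * V[nxt]))
        for prob, nxt, reward, done in P[s][a]
    )


def value_iteration(P, gamma=0.9, tol=1e-8):
    """Return (V, policy) for a tabular MDP.

    P is a list indexed by state; P[s] maps each action to its outcomes.
    States with an empty action dict are treated as terminal.
    """
    V = [0.0] * len(P)
    while True:
        delta = 0.0
        for s in range(len(P)):
            if not P[s]:
                continue  # terminal state keeps value 0
            best = max(q_value(P, V, s, a, gamma) for a in P[s])
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break
    policy = {
        s: max(P[s], key=lambda a: q_value(P, V, s, a, gamma))
        for s in range(len(P))
        if P[s]
    }
    return V, policy


# Hypothetical 3-state chain: state 2 is the goal (the "frisbee").
chain = [
    {"stay": [(1.0, 0, 0.0, False)], "right": [(1.0, 1, 0.0, False)]},
    {"left": [(1.0, 0, 0.0, False)], "right": [(1.0, 2, 1.0, True)]},
    {},
]
values, policy = value_iteration(chain)
```

A slippery lake is expressed by listing several outcomes per action with probabilities that sum to one; the update itself does not change.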
Research Project @GRASP Lab CIS Department@UPenn · Philadelphia, PA · Jan 2017
source code
demo 1 / demo 2 / demo 3 / demo 4
- Implemented filter-based 2D SLAM, using IMU data for the prediction step, laser-scan data for the update step, and a particle filter for state convergence and 2D log-odds map optimization.
- All data were collected on a ground-walker robot.
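The log-odds map update mentioned above can be sketched as follows: cells a laser beam passes through are pushed toward "free", and the cell where it ends is pushed toward "occupied". The sensor-model probabilities (0.7 and 0.3), the clamping bounds, and the function names are assumptions for this example, not values taken from the project.

```python
import math

# Assumed inverse sensor model: log-odds increments for hit and free cells.
L_OCC = math.log(0.7 / 0.3)
L_FREE = math.log(0.3 / 0.7)
L_MIN, L_MAX = -5.0, 5.0  # clamp so cells can still change their state later


def update_log_odds(grid, free_cells, hit_cell):
    """Apply one beam to the grid in place.

    grid is a list of rows of log-odds values, free_cells are (row, col)
    cells the beam traversed, and hit_cell is the cell where it ended.
    """
    for r, c in free_cells:
        grid[r][c] = max(L_MIN, grid[r][c] + L_FREE)
    r, c = hit_cell
    grid[r][c] = min(L_MAX, grid[r][c] + L_OCC)


def occupancy_probability(log_odds):
    """Convert a log-odds value back to an occupancy probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical 1x3 map: the beam crosses two cells and ends in the third.
grid = [[0.0, 0.0, 0.0]]  # log-odds 0 means probability 0.5 (unknown)
update_log_odds(grid, free_cells=[(0, 0), (0, 1)], hit_cell=(0, 2))
```

Working in log-odds turns the Bayesian update into an addition per cell, which keeps the per-scan cost low when the map is updated for the best particle.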
2016
Research Project @GRASP Lab CIS Department@UPenn · Philadelphia, PA · Dec 2016
source code
demo 1 / demo 2 / demo 3 / demo 4
- Implemented a Kalman filter to track the three-dimensional orientation of a hand-held camera.
- Given IMU readings from gyroscopes and accelerometers, the algorithm estimates the underlying 3D orientation, with the model parameters learned from ground-truth data provided by a Vicon motion capture system.
- Generated a real-time panoramic image from the images captured by the camera and the orientation estimated by the filter.
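The prediction step of such an orientation filter typically integrates the gyroscope reading into the current orientation. The sketch below shows only that step with unit quaternions, assuming a body-frame angular velocity in rad/s and a (w, x, y, z) quaternion layout; the covariance propagation and the accelerometer update of the full filter are omitted, and the names are hypothetical.

```python
import math


def quat_multiply(q, r):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    w0, x0, y0, z0 = q
    w1, x1, y1, z1 = r
    return (
        w0 * w1 - x0 * x1 - y0 * y1 - z0 * z1,
        w0 * x1 + x0 * w1 + y0 * z1 - z0 * y1,
        w0 * y1 - x0 * z1 + y0 * w1 + z0 * x1,
        w0 * z1 + x0 * y1 - y0 * x1 + z0 * w1,
    )


def integrate_gyro(q, omega, dt):
    """Rotate orientation q by the body-frame angular velocity omega over dt."""
    angle = math.sqrt(sum(w * w for w in omega)) * dt
    if angle < 1e-12:
        return q  # no measurable rotation during this step
    axis = [w * dt / angle for w in omega]
    half = 0.5 * angle
    s = math.sin(half)
    dq = (math.cos(half), axis[0] * s, axis[1] * s, axis[2] * s)
    return quat_multiply(q, dq)


# Hypothetical input: constant yaw rate of 90 deg/s for one second at 100 Hz.
q = (1.0, 0.0, 0.0, 0.0)
for _ in range(100):
    q = integrate_gyro(q, (0.0, 0.0, math.pi / 2), 0.01)
```

Gyro integration alone drifts over time, which is why the filter also uses the accelerometer's gravity direction as a measurement.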
Course Project @Computer Science Program CIS Department@UPenn · Philadelphia, PA · Nov 2016
source code
Java provides several numeric types but no fraction type, so this project implements a Fraction API (Application Programmer’s Interface).
Academic Project @Robotics Program CIS Department@UPenn · Philadelphia, PA · Aug 2016
source code
demo 1 / demo 2
- Used human skin color as a feature to train a GMM that extracts face region candidates, combined with an edge mask to separate merged face regions for better detection.
- Implemented PCA to construct an Eigenfaces dataset and integrated the third-party package Face++ to improve detection performance. The final detection accuracy reached 82.6%.
- Implemented image morphing techniques such as TPS and gradient blending to complete the face replacement task.