Works and Projects

A timeline of research & engineering projects I've worked on so far — spanning SLAM, 3D vision, deep learning, and robotics. Click any title for more details, source code and demos.

2024

PICO MR Cloud-Based Mapping & Localization (VPS)

Industry Project @ByteDance / PICO PICO / Inter-Perception RD, ByteDance · Shanghai, China · May 2024 – Dec 2024

Core Developer · PICO MR cloud-based mapping and localization — Visual Positioning Service (VPS).

High-Performance Bundle Adjustment for Large-Scale Scenes

Industry Project @ByteDance / PICO PICO / Inter-Perception RD, ByteDance · Shanghai, China · Mar 2024 – Dec 2024

Project Owner · High-performance Bundle Adjustment for large-scale scenes — Ceres CUDA, FastBA, MegBA.

2023

Multi-Device Anchor Map Sharing

Industry Project @ByteDance / PICO PICO / Inter-Perception RD, ByteDance · Shanghai, China · Mar 2023 – Dec 2024

Core Developer · Real-time sharing and localization of anchor maps across multiple devices, enabling multi-player MR games.

2022

MR Spatial Anchor on PICO Headset

Industry Project @ByteDance / PICO PICO / Inter-Perception RD, ByteDance · Shanghai, China · Jun 2022 – Dec 2024

Project Owner · Vision-based, low-cost mapping and localization powering MR Spatial Anchors on PICO headsets.

Line-Feature based VIO for Re-Localization

Industry Project @ByteDance / PICO PICO / Inter-Perception RD, ByteDance · Shanghai, China · May 2022 – Jan 2024

Project Owner · Line-feature based VIO designed to improve re-localization performance on MR headsets.

Cloud-Loc + Local 6DoF Tracking Fusion

Industry Project @ByteDance / PICO PICO / Inter-Perception RD, ByteDance · Shanghai, China · Feb 2022 – Jun 2022

Project Owner · Fuse cloud-based high-precision localization results into the on-device 6DoF tracking loop.

2021

Rolling Shutter Compensation with IMU

Industry Project @ByteDance / PICO PICO / Inter-Perception RD, ByteDance · Shanghai, China · Dec 2021 – Feb 2022

Project Owner · Use IMU data to compensate for the rolling-shutter “jelly” effect in RGB cameras under fast motion.

2020

Multi-Camera & Lidar Extrinsic Calibration

Industry Project @MEGVII Research / SLAM Dev, MEGVII · Beijing, China · Dec 2020 – Mar 2021

Core Developer · Multi-camera and lidar extrinsic calibration with a re-engineered workflow.

OCR Toolbox (Detection + Recognition + TensorRT)

Industry Project @MEGVII Research / SLAM Dev, MEGVII · Beijing, China · May 2020 – Dec 2020

Project Owner · End-to-end OCR toolbox covering detection, recognition, post-processing, and TensorRT acceleration.

Parking Space Semantic Mapping

Industry Project @MEGVII Research / SLAM Dev, MEGVII · Beijing, China · Mar 2020 – Dec 2021

Project Owner · Camera-based semantic mapping for parking environments using line and marker features.

2019

AGV Multi-Sensor Fusion

Industry Project @MEGVII Research / SLAM Dev, MEGVII · Beijing, China · Dec 2019 – Mar 2021

Project Owner · AGV-level multi-sensor fusion module for static and dynamic obstacle state estimation.

Perception Model Evaluation Toolboxes

Industry Project @Horizon Robotics ADAS Product Line, Horizon Robotics · Beijing, China · Jan 2019 – Dec 2019

Core Developer · Evaluation toolboxes covering classification, keypoint detection, keypoint regression, and image-based 3D-box tasks.

2018

Vehicle / Cyclist 3D-Box Detection

Industry Project @Horizon Robotics ADAS Product Line, Horizon Robotics · Beijing, China · Aug 2018 – Dec 2019

Project Owner · Vehicle and cyclist 3D bounding box detection for the autonomous-driving perception stack.

IPM-based Parking Space Detection

Industry Project @Horizon Robotics ADAS Product Line, Horizon Robotics · Beijing, China · Jul 2018

Project Owner · Camera IPM-based parking space detection that emits 3D coordinates for downstream parking planning.

EKF-Based Visual Inertial Odometry

Academic Project @Robotics Program CIS Department@UPenn · Philadelphia, PA · Apr 2018

source code
demo 1 / demo 2

In this project, we implemented EKF-based Visual Inertial Odometry to estimate the state of a quadrotor: its global position [x, y, z], orientation [roll, pitch, yaw], and linear velocity with respect to the world frame [vx, vy, vz].
We validated it both in simulation on Linux/ROS and in lab tests on a real quadrotor.
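
As a rough illustration of the predict/update cycle behind such a filter, here is a minimal sketch reduced to one axis (position and velocity only). It is a linear Kalman filter, not the project's EKF: a full VIO filter also carries orientation and replaces the fixed matrices with Jacobians of the nonlinear motion and camera models. The function names, the noise values q and r, and the choice of a direct position measurement are all assumptions made for the example.

```python
def predict(p, v, P, accel, dt, q=0.25):
    """Propagate position p and velocity v along one axis using a
    world-frame acceleration (e.g. from the IMU). P is the 2x2 covariance."""
    p_new = p + v * dt + 0.5 * accel * dt * dt
    v_new = v + accel * dt
    (a, b), (c, d) = P
    # F P with F = [[1, dt], [0, 1]]
    fp00, fp01 = a + dt * c, b + dt * d
    fp10, fp11 = c, d
    # (F P) F^T with F^T = [[1, 0], [dt, 1]]
    n00, n01 = fp00 + fp01 * dt, fp01
    n10, n11 = fp10 + fp11 * dt, fp11
    # Process noise Q = q * B B^T with B = [dt^2 / 2, dt]
    b0, b1 = 0.5 * dt * dt, dt
    P_new = [[n00 + q * b0 * b0, n01 + q * b0 * b1],
             [n10 + q * b1 * b0, n11 + q * b1 * b1]]
    return p_new, v_new, P_new


def update(p, v, P, z, r=0.01):
    """Correct the state with a position measurement z (H = [1, 0]),
    standing in for a vision-derived position fix."""
    s = P[0][0] + r                         # innovation covariance
    k0, k1 = P[0][0] / s, P[1][0] / s       # Kalman gain
    y = z - p                               # innovation
    p_new = p + k0 * y
    v_new = v + k1 * y
    # P_new = (I - K H) P
    P_new = [[(1 - k0) * P[0][0], (1 - k0) * P[0][1]],
             [P[1][0] - k1 * P[0][0], P[1][1] - k1 * P[0][1]]]
    return p_new, v_new, P_new
```

Calling predict at the IMU rate and update whenever a visual measurement arrives gives the usual high-rate propagation with lower-rate correction.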

Planning and Optimization

Course Work @Robo Program CIS Department@UPenn · Philadelphia, PA · Feb 2018

Implemented two fundamental robotics algorithms.

Polynomial Trajectory Planning and Generation: designed and generated an optimal, smooth polynomial trajectory for the quadrotor through a specified 3D map. source code
Hand-held Camera Calibration: estimated the intrinsic (K) and extrinsic (R, t) matrices of the camera from multiple views of a planar target (e.g. a checkerboard), then refined all variables by minimizing the reprojection error. source code
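
To give a flavor of polynomial trajectory generation, the sketch below evaluates a single rest-to-rest minimum-jerk segment, which has a known closed-form quintic. This is an assumption-laden simplification: it handles one segment between two waypoints with zero velocity and acceleration at both ends, whereas a planner over a 3D map would chain many segments and solve for their coefficients jointly. The function name min_jerk is hypothetical.

```python
def min_jerk(start, goal, duration, t):
    """Position at time t on a rest-to-rest minimum-jerk segment.

    start, goal: sequences of coordinates (e.g. [x, y, z]).
    The quintic blend s(tau) = 10 tau^3 - 15 tau^4 + 6 tau^5 has zero
    velocity and acceleration at tau = 0 and tau = 1.
    """
    tau = min(max(t / duration, 0.0), 1.0)   # normalized, clamped time
    s = 10 * tau**3 - 15 * tau**4 + 6 * tau**5
    return [a + (b - a) * s for a, b in zip(start, goal)]
```

Sampling t from 0 to duration yields a smooth path from start to goal; the midpoint of the segment is reached exactly at half the duration.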

3D Object Detection and Recognition

Research Project @GRASP Lab CIS Department@UPenn · Philadelphia, PA · Jan 2018

source code
Built a deep network for 3D-MNIST classification, 3D bounding box estimation, and cube plane depth estimation.

3D data (3D MNIST) pre-processing, including voxel generation and multi-view 2D image generation.
Construction of a deep network that estimates the class label, depth map, and 2D bbox in a single stage.
3D bounding box estimation.

2017

Generative Models Implementation

Research Project @Robotics Program CIS Department@UPenn · Philadelphia, PA · Dec 2017

Studied and implemented the most common generative models, including AE, VAE, and multiple GANs.

Implemented an AE (Auto-Encoder) and a VAE (Variational Auto-Encoder) on human face data, e.g. the CUFS face dataset. source code
Studied the most popular GAN models, including DCGAN, WGAN, and WGAN-GP, and wrote a detailed report discussing their properties. source code
Applied the basic GAN model to a more complicated problem, image-to-image translation; my implementations included cGAN and CycleGAN.

Faster RCNN Playing with Cifar10 and Cifar100

Academic Project @Robotics Program CIS Department@UPenn · Philadelphia, PA · Oct 2017

source code
Manually implemented a Faster RCNN deep network (TensorFlow based) for object detection and classification, using CIFAR-10 images as the target objects and CIFAR-100 images as the background.

SFM and Visual SLAM

Research Project @GRASP Lab CIS Department@UPenn · Philadelphia, PA · Mar 2017

Implemented a Visual SLAM package that builds a 3D trajectory as well as a 3D point cloud map.

Structure From Motion (SFM): manually implemented triangulation, PnP, and Bundle Adjustment (including the Jacobians) to estimate and optimize the camera states and 3D landmarks. source code
Visual SLAM Package on a Quadrotor: building on the SFM pipeline, implemented Visual Odometry (VO), applied local BA (Bundle Adjustment), and integrated the g2o package to improve SLAM efficiency and performance. source code
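
The quantity that Bundle Adjustment minimizes is the total squared reprojection error over all observations. A minimal sketch of that cost follows, assuming an ideal pinhole camera with no lens distortion; the data layout (K as a tuple, poses as (R, t) pairs, observations as index/pixel tuples) and the function names are choices made for this example, not the project's interfaces.

```python
def project(K, R, t, X):
    """Project world point X into pixels with a pinhole camera.

    K = (fx, fy, cx, cy); R is a 3x3 nested list and t a length-3 list
    mapping world coordinates into the camera frame: Xc = R X + t.
    """
    Xc = [sum(R[i][j] * X[j] for j in range(3)) + t[i] for i in range(3)]
    fx, fy, cx, cy = K
    return fx * Xc[0] / Xc[2] + cx, fy * Xc[1] / Xc[2] + cy


def reprojection_cost(K, poses, points, observations):
    """Sum of squared pixel residuals.

    observations: iterable of (camera_index, point_index, u, v).
    """
    cost = 0.0
    for cam_idx, pt_idx, u, v in observations:
        R, t = poses[cam_idx]
        pu, pv = project(K, R, t, points[pt_idx])
        cost += (pu - u) ** 2 + (pv - v) ** 2
    return cost
```

An optimizer such as Gauss-Newton or Levenberg-Marquardt would adjust the poses and points to drive this cost down, using the Jacobian of project with respect to each parameter.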

Machine Learning applied in the modern robot

Research Project @GRASP Lab CIS Department@UPenn · Philadelphia, PA · Feb 2017

Implemented some of the most widely used machine learning algorithms and applied them to real robot use cases.

Object Detection and Geometric Distance Estimation based on GMM: trained several GMMs on image color data, and combined image processing with geometry algorithms to detect any red barrel in a robot’s view and estimate its real distance from the robot. source code
Robot Gesture Recognition using Hidden Markov Model (HMM): built an HMM-based algorithm to recognize different robot arm motion gestures; it classifies unknown arm motions in near real time. source code
Reinforcement Learning using Policy Gradient Methods: implemented several methods for solving the Robot Walking on the Frozen Lake problem. The problem can be modeled as a Markov Decision Process (MDP), and a strategy for retrieving the frisbee can be obtained using value iteration (VI), policy iteration (PI), and policy gradient optimization (PGO). source code
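
Of the three solution methods named in the last item, value iteration is the simplest to show. The sketch below runs it on a generic tabular MDP; the transition format (a list of per-state dicts mapping each action to (probability, next_state, reward) outcomes, with an empty dict marking a terminal state) is an assumption for this example, and the three-state chain in the usage note is a toy stand-in rather than the Frozen Lake grid itself.

```python
def value_iteration(transitions, gamma=0.9, tol=1e-8):
    """Return (V, policy) for a tabular MDP.

    transitions[s][a] is a list of (prob, next_state, reward) outcomes.
    A state with no actions is treated as terminal (value 0).
    """
    def q_value(outcomes, V):
        return sum(p * (r + gamma * V[s2]) for p, s2, r in outcomes)

    V = [0.0] * len(transitions)
    while True:
        delta = 0.0
        for s, actions in enumerate(transitions):
            if not actions:
                continue
            best = max(q_value(o, V) for o in actions.values())
            delta = max(delta, abs(best - V[s]))
            V[s] = best            # in-place (Gauss-Seidel) backup
        if delta < tol:
            break
    # Greedy policy with respect to the converged values
    policy = {s: max(actions, key=lambda a: q_value(actions[a], V))
              for s, actions in enumerate(transitions) if actions}
    return V, policy
```

For example, on a chain where state 2 is the goal and entering it pays a reward of 1:

```python
chain = [
    {"left": [(1.0, 0, 0.0)], "right": [(1.0, 1, 0.0)]},
    {"left": [(1.0, 0, 0.0)], "right": [(1.0, 2, 1.0)]},
    {},  # terminal goal state
]
V, policy = value_iteration(chain)
```

the greedy policy moves right from both non-terminal states.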

The Particle Filter based Fast SLAM

Research Project @GRASP Lab CIS Department@UPenn · Philadelphia, PA · Jan 2017

source code
demo 1 / demo 2 / demo 3 / demo 4

Implemented filter-based 2D SLAM, using IMU data for prediction, laser-scan data for the update, and a particle filter for state convergence as well as 2D log-odds map optimization.
All data were collected on a ground-walker robot.
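
The log-odds map mentioned above can be illustrated with a per-cell update. This is a minimal sketch assuming a simple inverse sensor model: a laser hit is taken as 70% evidence of occupancy and a pass-through as 30%, and values are clamped so a cell can still change its state later. Those numbers and the function names are illustrative assumptions, not values from the project.

```python
import math

L_OCC = math.log(0.7 / 0.3)     # log-odds added when a beam ends in the cell
L_FREE = math.log(0.3 / 0.7)    # log-odds added when a beam passes through
L_MIN, L_MAX = -5.0, 5.0        # clamp to avoid over-confident cells


def update_cell(logodds, hit):
    """Fold one laser observation into a cell's log-odds value."""
    logodds += L_OCC if hit else L_FREE
    return max(L_MIN, min(L_MAX, logodds))


def probability(logodds):
    """Convert log-odds back to an occupancy probability."""
    return 1.0 - 1.0 / (1.0 + math.exp(logodds))
```

Because evidence is accumulated by addition, a hit followed by a pass-through returns the cell to its prior, and each particle's scan can be scored against the map with cheap lookups.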

2016

Orientation Estimation using Unscented Kalman Filter(UKF)

Research Project @GRASP Lab CIS Department@UPenn · Philadelphia, PA · Dec 2016

source code
demo 1 / demo 2 / demo 3 / demo 4

Implemented an Unscented Kalman filter (UKF) to track the three-dimensional orientation of a hand-held camera.
Given IMU readings from gyroscopes and accelerometers, the algorithm estimates the underlying 3D orientation, with the model parameters learned from ground truth data provided by a Vicon motion capture system.
Generated a real-time panoramic image using images captured by the camera and information filtered by the 3D orientation filter.
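
The process model of an orientation filter like this one typically integrates the gyroscope rate into a quaternion. The sketch below shows only that propagation step, assuming body-frame angular rates in rad/s and quaternions stored as (w, x, y, z); sigma-point generation, the accelerometer measurement update, and bias handling are left out, and the function names are hypothetical.

```python
import math


def quat_multiply(q, r):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    w1, x1, y1, z1 = q
    w2, x2, y2, z2 = r
    return (w1 * w2 - x1 * x2 - y1 * y2 - z1 * z2,
            w1 * x2 + x1 * w2 + y1 * z2 - z1 * y2,
            w1 * y2 - x1 * z2 + y1 * w2 + z1 * x2,
            w1 * z2 + x1 * y2 - y1 * x2 + z1 * w2)


def integrate_gyro(q, omega, dt):
    """Rotate orientation q by the body-frame angular rate omega over dt."""
    rate = math.sqrt(sum(w * w for w in omega))
    angle = rate * dt
    if angle < 1e-12:
        return q                       # no measurable rotation
    axis = [w / rate for w in omega]
    half = 0.5 * angle
    dq = (math.cos(half),
          axis[0] * math.sin(half),
          axis[1] * math.sin(half),
          axis[2] * math.sin(half))
    return quat_multiply(q, dq)        # body-frame rate: multiply on the right
```

Starting from the identity quaternion, a yaw rate of pi/2 rad/s applied for one second yields a 90 degree rotation about the z axis.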

Fraction API Design

Course Project @Computer Science Program CIS Department@UPenn · Philadelphia, PA · Nov 2016

source code
Java provides several types of numbers but no fractions, so this project implements a Fraction API (Application Programmer’s Interface).

The Face Detection and Replacement Package Design

Academic Project @Robotics Program CIS Department@UPenn · Philadelphia, PA · Aug 2016