Aniket Agarwal
I am a Machine Learning Engineer at Gatik.ai in the Robotics & Autonomy Team. My work focuses on building the perception stack for autonomous vehicles. My research interests
lie primarily in 4D tracking, multimodal perception, 3D computer vision and, more generally, building robust AI agents that can handle long-range dependency problems.
I graduated with a Master's in Robotics from Carnegie Mellon University, advised by Prof Laszlo A. Jeni.
My work at CMU focused on developing the perception and robotics stack for the DARPA Triage Challenge. I also hold an undergraduate degree in Applied Mathematics from IIT Roorkee.
Previously, I was a Data & Applied Scientist in the MSAI team at Microsoft, working on Learning to Rank systems, especially in the context of Outlook and Teams Search Relevance.
I also worked with Prof Karthik Narasimhan on long-horizon video understanding.
I've had the pleasure of working with Prof Christian Desrosiers and Prof Jose Dolz from ETS Montreal on semi-supervised segmentation. I've also worked with Prof Mohamed Elhoseiny from KAUST, tackling the problem of long-tail Visual Relationship Recognition.
I have also worked with Prof Angela Yao from NUS Singapore, where we focused on egocentric hand action recognition.
Email  / 
Resume  / 
Google Scholar  / 
LinkedIn  / 
Twitter  / 
GitHub
|
|
Research
I am interested in problems at the intersection of 3D vision, graphics and, more broadly, multimodal AI. Most of my research so far has been on vision-related problems, with some recent work on applied mobile robotics.
|
|
Multimodal Human Mesh Recovery for Stand-off Triage in Mass Casualty Scenarios
Aniket Agarwal
CMU MSR Thesis (under review), 2025
project
/
News Coverage
Proposed a multimodal human mesh recovery (HMR) framework that fuses RGB, LiDAR, and IR data using a transformer-based Modality Unifier, enabling robust 3D pose estimation under degraded sensing. Achieved up to 14% lower mean per-joint position error (MPJPE) than single-modality baselines and deployed the model on a Jetson Orin for real-world autonomy tasks and casualty motor-alertness detection.
|
|
Building Scalable Video Understanding Benchmarks through Sports
Aniket Agarwal*,
Alex Zhang*,
Igor Gilitschenski,
Karthik Narasimhan,
Vishvak Murahari^,
Yash Kant^
DMLR@ICLR, 2024
project
/
arXiv
/
code
We propose an automated pipeline, dubbed ASAP, for creating long video understanding (LVU) benchmarks from sports. Using this pipeline, we also propose a new benchmark built on cricket videos, dubbed LCric.
|
|
Exploring Long Tail Visual Relationship Recognition with Large Vocabulary
Aniket Agarwal*,
Sherif Abdelkarim*,
Panos Achlioptas,
Jun Chen,
Jiaji Huang,
Boyang Li,
Kenneth Church,
Mohamed Elhoseiny
ICCV, 2021
project
/
pdf
/
supp
/
arXiv
/
code
/
cite
We propose two new benchmarks, GQA-LT and VG8K-LT, that target the problem of long-tail visual relationship recognition (VRR). Additionally, we propose RelMix and VilHub, two techniques that can be applied on top of any VRR model to improve its prediction accuracy on tail classes.
|
|
RelTransformer: A Transformer-Based Long-Tail Visual Relationship Recognition
Jun Chen,
Aniket Agarwal,
Sherif Abdelkarim,
Deyao Zhu,
Mohamed Elhoseiny
CVPR, 2022
arXiv
/
code
/
cite
Transformers have seen massive success on a range of vision-and-language tasks. Here we propose a novel transformer-based architecture for the long-tail VRR problem.
|
|
Visual Relationship Detection using Scene Graphs: A Survey
Aniket Agarwal*,
Ayush Mangal*,
Vipul*
Survey Paper
arXiv
/
cite
One of the first survey papers on the topic of scene graphs, which are widely used for Visual Relationship Recognition.
|
|
Revisiting CycleGAN for semi-supervised segmentation
Arnab Mondal,
Aniket Agarwal,
Jose Dolz,
Christian Desrosiers
arXiv
/
code
/
cite
We use cycle-consistency to improve the semi-supervised baseline for image segmentation.
|
|
Carnegie Mellon University
Master of Science in Robotics
Aug '23 - Jul '25
Advised by Prof Laszlo A. Jeni
|
|
Indian Institute of Technology Roorkee
Integrated MSc in Applied Mathematics
July '17 - May '22
Student Societies:
- Co-President | Vision and Language Group (Link)
- Treasurer | ACM IIT Roorkee Chapter (Link)
- Mentor | Student Mentorship Programme
- Teaching Assistant | MAN-001 course taught to first-year students
|
Miscellaneous
Some of the best experiences of my undergraduate life came from the Vision and Language Group (Link). The seniors and other members there are some of the best and loveliest people you can find at IIT Roorkee. I am deeply grateful to have been a part of the group.
In my free time, I love to watch TV shows, and you can almost always find me listening to music at any time of day. I love the rock and lo-fi pop genres and am an ardent fan of Radiohead.
|
This template is stolen from here.
|
|