I’m Kyle Vedder

I believe the shortest path to getting robust, generally capable robots in the real world is through the construction of systems whose performance scales with compute and data, without requiring human annotations.

In service of this, I am interested in designing and scaling fundamentally 3D vision systems that learn just from raw, multi-modal data. My contrarian bet is on the multi-modal and 3D aspects; a high-quality, 3D-aware representation built from diverse data sources should enable more sample-efficient and robust downstream policies. Most representations today are 2D for historical reasons (e.g. lots of RGB data, and 2D convolutions won the hardware lottery), but I believe this pushes much of the 3D spatial understanding out of the visual representation and into the downstream policy, making policies more expensive to learn and less robust.

My current line of work is focused on tackling scene flow, a problem that requires systems to construct a robust understanding of the dynamics of the 3D world. For data availability reasons, this work primarily focuses on the autonomous driving domain, but the same principles apply to other domains, e.g. indoor service robots.
I am a CS PhD candidate at Penn, advised by Eric Eaton and Dinesh Jayaraman in the GRASP Lab. My current line of work is focused on scene flow with the general goals of:

During my undergrad in CS at UMass Amherst, I did research under Joydeep Biswas in the Autonomous Mobile Robotics Lab. My research was in:

I have also done many industry internships:

More Information