Publications
For the up-to-date publication list, please see the Google Scholar or Semantic Scholar pages.
Papers (Journals & Conferences)
-
Modular Action Concept Grounding in Semantic Video Prediction. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2022 Object-Oriented semantic manipulation of scenes with unsupervised capsule networks which learn grounding of both objects and actions. arXiv project
-
Experience Replay with Likelihood-free Importance Weights. Learning for Dynamics and Control (L4DC) 2022 A likelihood-free density ratio estimator to reweight experiences based on their likelihood under the stationary distribution of the current policy. Oral arXiv
-
X-Pool: Cross-Modal Language-Video Attention for Text-Video Retrieval. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2022 Text to video retrieval with a scaled dot product attention for a text to attend to its most semantically similar frames. arXiv project code
-
GLiDE: Generalizable Quadrupedal Locomotion in Diverse Environments with a Centroidal Model. Workshop on Algorithmic Foundations of Robotics (WAFR) 2022 Model-Free RL in centroidal model for desired body accelerations with subsequent computation of ground reaction forces using a robot model. arXiv project
-
Neural Shape Mating: Self-Supervised Object Assembly with Adversarial Shape Priors. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2022 Pairwise 3D geometric shape mating as general framework for part to part 3D pose matching for shape assembly. project
-
Pessimistic Bootstrapping for Uncertainty-Driven Offline Reinforcement Learning. International Conference on Learning Representations (ICLR) 2022 Generalization beyond dataset in offline RL: Uncertainty quantification via the disagreement of bootstrapped Q-functions, and pessimistic updates by penalizing the value function based on the estimated uncertainty arXiv
-
PlaTe: Visually-Grounded Planning with Transformers in Procedural Tasks. IEEE Robotics and Automation Letters (RA-L) and ICRA 2022 Planning Transformer to learn structured and plannable state and action spaces directly from unstructured videos. The model learns both action-conditional video prediction and goal conditioned planning. arXiv project
-
Value Gradient weighted Model-Based Reinforcement Learning. International Conference on Learning Representations (ICLR) 2022 Value aware model learning to fix Objective Mismatch in Model-based RL. The gradient of the empirical value function as a measure of the sensitivity of the RL algorithm to model errors arXiv project
-
Accelerated Policy Learning with Parallel Differentiable Simulation. International Conference on Learning Representations (ICLR) 2022 A high-performance differentiable simulator and a new policy learning algorithm (SHAC) that can effectively leverage simulation gradients, even in the presence of non-smoothness pdf project
-
Centralized Model and Exploration Policy for Multi-Agent RL. International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS) 2022 Fully cooperative multi-agent settings (Dec-POMDPs) are fiendishly hard. MARCO builds on the insight that using just a polynomial number of samples, it can learn a centralized model that generalizes across different policies. Oral Presentation arXiv
-
Integration of Reinforcement Learning in a Virtual Robotic Surgical Simulation. Journal of Surgical Innovation 2022
-
Convergence and Optimality of Policy Gradient Methods in Weakly Smooth Settings. Conference on Artificial Intelligence (AAAI) 2022 Convergence analysis in RL relies on non-intuitive, impractical and often opaque conditions such as strict smoothness and bounded function approximation. In this work, we establish explicit convergence rates of policy gradient methods without relying on these conditions, instead extending the convergence regime to weakly smooth policy classes with L2 integrable gradient. arXiv
-
Dynamic Bottleneck for Robust Self-Supervised Exploration. Advances in Neural Information Processing Systems (NeurIPS) 2021 Robust exploration via dynamic bottleneck-based representation and UCB-based bonus. arXiv
-
Neural Hybrid Automata: Learning Dynamics with Multiple Modes and Stochastic Transitions. Advances in Neural Information Processing Systems (NeurIPS) 2021 A recipe for learning SHS dynamics without a priori knowledge on the number of modes and inter-modal transition dynamics. Method leverages Normalizing Flows and Stochastic ODEs. arXiv
-
Drop-DTW: Aligning Common Signal Between Sequences While Dropping Outliers. Advances in Neural Information Processing Systems (NeurIPS) 2021 Drop-DTW efficiently computes the optimal alignment between two variable-length sequences while automatically dropping the outlier elements from the matching. arXiv
-
A Persistent Spatial Semantic Representation for High-level Natural Language Instruction Execution. Conference on Robot Learning (CoRL) 2021 arXiv project code poster
-
S4RL: Surprisingly Simple Self-Supervision for Offline Reinforcement Learning. Conference on Robot Learning (CoRL) 2021 Data augmentation with simple perturbations improves robustness, generalization, and OOD performance in offline RL. arXiv
-
Seeing Glass: Joint Point-Cloud and Depth Completion for Transparent Objects. Conference on Robot Learning (CoRL) 2021 TranspareNet is a joint point cloud and depth completion method to recover learned depth of transparent objects in cluttered and complex scenes, even with partially filled fluid contents within the vessels. Oral Presentation arXiv project code talk
-
Learning by Watching: Physical Imitation of Manipulation Skills from Human Videos. IEEE International Conference on Intelligent Robots and Systems (IROS) 2021 Style transfer human videos to robot perspective, then sparse unsupervised keypoints for reward estimation, use RL for model-free task completion. arXiv project video
-
Robust Value Iteration for Continuous Control Tasks. Robotics: Science and Systems (RSS) 2021 Robustness to Sim2Real via Dynamic Programming based Value Iteration in Continuous time RL. arXiv project
-
GIFT: Generalizable Interaction-aware Functional Tool Affordances without Labels. Robotics: Science and Systems (RSS) 2021 Interaction-aware affordance mapping to unsupervised keypoints for tool-use in different scenarios. arXiv blog video
-
Value Iteration in Continuous Actions, States and Time. International Conference on Machine Learning (ICML) 2021 RL in continuous states and actions can be solved with a closed-form extension of value iteration in cases of non-linear control-affine dynamics, resulting in a practical alternative to policy search. arXiv project
-
Principled Exploration via Optimistic Bootstrapping and Backward Induction. International Conference on Machine Learning (ICML) 2021 Improving exploration in RL through Optimistic Bootstrapping using UCB-bonus to capture epistemic uncertainty. Time-consistent uncertainty propagation through backward induction. arXiv code
-
Coach-Player Multi-agent Reinforcement Learning for Dynamic Team Composition. International Conference on Machine Learning (ICML) 2021 Coordinating teams with time-varying composition and roles requires oversight from a coach who can help with low-frequency updates to role assignments and team strategy. Long Talk arXiv
-
Tesseract: Tensorised Actors for Multi-Agent Reinforcement Learning. International Conference on Machine Learning (ICML) 2021 Tensorised formulation of the Bellman equation in Cooperative multi-agent RL is an effective solution to exponential blowup of the action space with the number of agents. arXiv
-
Dynamics Randomization Revisited: A Case Study for Quadrupedal Locomotion. IEEE International Conference on Robotics and Automation (ICRA) 2021 Dynamics randomization is neither necessary nor sufficient for sim-to-real transfer of learning robust locomotion policies. arXiv project
-
LASER: Learning a Latent Action Space for Efficient Reinforcement Learning. IEEE International Conference on Robotics and Automation (ICRA) 2021 Learn a lower dimensional action-space that results in efficient exploration in similar tasks. arXiv project video
-
LEAF: Latent Exploration Along the Frontier. IEEE International Conference on Robotics and Automation (ICRA) 2021 Learn a dynamics aware manifold of reachable states, and then use this for guided exploration in hard continuous control tasks with RL. arXiv project
-
Emergent Hand Morphology and Control from Optimizing Robust Grasps of Diverse Objects. IEEE International Conference on Robotics and Automation (ICRA) 2021 A data-driven Bayesian optimization approach to jointly optimize hand-design along with policy for grasping diverse objects in multiple modes. arXiv project
-
C-Learning: Horizon-Aware Cumulative Accessibility Estimation. International Conference on Learning Representations (ICLR) 2021 Horizon-Aware policies trade off safety and performance while encoding multimodal solutions. Insight is to learn cumulative accessibility C(s,a,h) with time horizon h instead of the usual Q-function Q(s,a). arXiv project talk
-
Conservative Safety Critics for Exploration. International Conference on Learning Representations (ICLR) 2021 We need to guarantee safety during training in RL. Instead of unintuitive specification of state based safety, we can learn safety as a separate value function, and can jointly optimize for task performance with safety value as a constraint. arXiv project talk
-
Skill Transfer via Partially Amortized Hierarchical Planning. International Conference on Learning Representations (ICLR) 2021 Combine benefits of learned world-model with a set of modular skills for faster online test-time adaptation. Using learned skills during the planning stage improves both speed and data efficiency. arXiv project talk
-
DIBS: Diversity inducing Information Bottleneck in Model Ensembles. Conference on Artificial Intelligence (AAAI) 2021 Ensembles of deep nets to model uncertainty in modeling multi-modal data by encouraging diversity in prediction through adversarial loss for learning the stochastic latent variables. arXiv
-
Unsupervised Disentanglement of Pose, Appearance and Background from Images and Videos. Transactions on Pattern Analysis and Machine Intelligence (T-PAMI) 2021 arXiv code
-
Causal Discovery in Physical Systems from Videos. Advances in Neural Information Processing Systems 33 (NeurIPS) 2020 Learn the underlying generative model as a causal graph with a few frames of observation. Generalize across variable latent dynamics (both graph connectivity and parameters). arXiv project talk
-
Counterfactual Data Augmentation using Locally Factored Dynamics. Advances in Neural Information Processing Systems 33 (NeurIPS) 2020 Outstanding Paper Award at Object-Oriented Learning Workshop, ICML 2020 arXiv code talk
-
Curriculum By Smoothing. Advances in Neural Information Processing Systems 33 (NeurIPS) 2020 Curriculum design to improve representation learning in CNNs by restricting access to high-frequency information until later in training. Spotlight Talk arXiv talk
-
Learning a Contact-Adaptive Controller for Robust, Efficient Legged Locomotion. Conference on Robot Learning (CoRL) 2020 arXiv blog video
-
Visuomotor Mechanical Search: Learning to Retrieve Target Objects in Clutter. IEEE International Conference on Intelligent Robots and Systems (IROS) 2020 arXiv project talk
-
A Programmable Approach to Neural Network Compression. IEEE Micro 2020 Demystifying network compression: user inputs pretrained model, compression scheme, objective and constraints. Condensa uses Bayesian optimization to infer optimal sparsity ratio and corresponding compressed model. arXiv pdf project code
-
Ocean: Online Task Inference for Compositional Tasks with Context Adaptation. Conference on Uncertainty in Artificial Intelligence (UAI) 2020 A hierarchical latent variable prior that goes beyond vanilla Gaussians to capture global and local context in sequential decision making for Meta-RL. arXiv pdf supp code talk
-
Angular Visual Hardness. International Conference on Machine Learning (ICML) 2020 Normalized angular distance between the sample feature embedding and the target classifier to measure sample hardness. arXiv project slides talk
-
Motion Reasoning for Goal-Based Imitation Learning. IEEE International Conference on Robotics and Automation (ICRA) 2020 Combine task & motion planning to disambiguate the true intention of the demonstrator from video where they performed multiple subtasks but only a subset was relevant to true objective, others were constraint satisfaction. arXiv video
-
Combining Model-Free and Model-Based Strategies for Sample-Efficient Reinforcement Learning. IEEE International Conference on Robotics and Automation (ICRA) 2020 Best Paper Award at 2019 NeurIPS Workshop on Robot Learning arXiv video talk
-
IRIS: Implicit Reinforcement without Interaction at Scale for Learning Control from Offline Robot Manipulation Data. IEEE International Conference on Robotics and Automation (ICRA) 2020 Offline demonstrations are both suboptimal and multimodal. Use two-stage model-learning: a high-level generative model to fit multi-modal state density, and a low-level imitation model for near optimal control. arXiv project video talk
-
Controlling Assistive Robots with Learned Latent Actions. IEEE International Conference on Robotics and Automation (ICRA) 2020 Learn an action space encoding from expert demonstrations, and align the encoding with a lower-dimensional controller to enable efficient teleoperation. arXiv blog video talk
-
Making Sense of Vision and Touch: Learning Multimodal Representations for Contact-Rich Tasks. IEEE Transactions on Robotics (T-RO) 2020 arXiv project
-
Scaling Robot Supervision to Hundreds of Hours with RoboTurk: Robotic Manipulation Dataset through Human Reasoning and Dexterity. IEEE International Conference on Intelligent Robots and Systems (IROS) 2019 IROS Best Cognitive Robotics Paper Finalist arXiv project code blog
-
Continuous Relaxation of Symbolic Planner for One-Shot Imitation Learning. IEEE International Conference on Intelligent Robots and Systems (IROS) 2019 arXiv video
-
Variable Impedance Control in End-Effector Space: An Action Space for Reinforcement Learning in Contact-Rich Tasks. IEEE International Conference on Intelligent Robots and Systems (IROS) 2019 arXiv project
-
Dynamics Learning with Cascaded Variational Inference for Multi-Step Manipulation. Conference on Robot Learning (CoRL) 2019 Oral Presentation arXiv project
-
AC-Teach: A Bayesian Actor-Critic Method for Policy Learning with an Ensemble of Suboptimal Teachers. Conference on Robot Learning (CoRL) 2019 arXiv project code blog
-
Learning Task-Oriented Grasping for Tool Manipulation from Simulated Self-Supervision. International Journal of Robotics Research (IJRR) 2019 pdf project video
-
Neural Task Graphs: Generalizing to unseen tasks from a single video demonstration. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2019 Oral Presentation arXiv video
-
Making Sense of Vision and Touch: Self-Supervised Learning of Multimodal Representations for Contact-Rich Tasks. IEEE International Conference on Robotics and Automation (ICRA) 2019 ICRA Best Paper Award and Finalist: Best Cognitive Robotics Paper arXiv project video
-
Mechanical Search: Multi-Step Retrieval of a Target Object Occluded by Clutter. IEEE International Conference on Robotics and Automation (ICRA) 2019 abstract arXiv project video
-
SWIRL: A Sequential Windowed Inverse Reinforcement Learning Algorithm for Robot Tasks With Delayed Rewards. International Journal of Robotics Research (IJRR) 2018 pdf project
-
Learning Task-Oriented Grasping for Tool Manipulation from Simulated Self-Supervision. Robotics: Science and Systems (RSS) 2018 arXiv project video talk
-
Neural Task Programming: Learning to generalize across hierarchical tasks. IEEE International Conference on Robotics and Automation (ICRA) 2018 arXiv project video talk
-
DeformNet: Free-Form Deformation Network for 3D Shape Reconstruction from a Single Image. IEEE Winter Conference on Applications of Computer Vision (WACV) 2018 arXiv project talk
-
AdaPT: Zero-Shot Adaptive Policy Transfer for Stochastic Dynamical Systems. International Symposium on Robotics Research (ISRR) 2017 arXiv
-
Transition State Clustering: Unsupervised surgical trajectory segmentation for robot learning. International Journal of Robotics Research (IJRR) 2017 pdf code
-
Weakly Supervised Generative Adversarial Networks for 3D Reconstruction. IEEE Conference on 3D Vision (3DV) 2017 arXiv code
-
Adversarially Robust Policy Learning through Active Construction of Physically-Plausible Perturbations. IEEE International Conference on Intelligent Robots and Systems (IROS) 2017 pdf project video
-
Multilateral Surgical Pattern Cutting in 2D Orthotropic Gauze with Deep Reinforcement Learning Policies for Tensioning. IEEE International Conference on Robotics and Automation (ICRA) 2017 pdf video
-
SWIRL: A Sequential Windowed Inverse Reinforcement Learning Algorithm for Robot Tasks With Delayed Rewards. Workshop on Algorithmic Foundations of Robotics (WAFR) 2016 pdf talk
-
Interchangeable Surgical Instrument System with Application to Supervised Automation of Multilateral Tumor Resection. IEEE International Conference on Automation Science & Engg. (CASE) 2016 Best Video Award at 2015 Hamlyn Symposium project video
-
TSC-DL: Unsupervised Trajectory Segmentation of Multi-Modal Surgical Demonstrations with Deep Learning. IEEE International Conference on Robotics & Automation (ICRA) 2016 project code video
-
Autonomous Multiple-Throw Multilateral Surgical Suturing with a Mechanical Needle Guide and Optimization based Needle Planning. IEEE International Conference on Robotics & Automation (ICRA) 2016 project video
-
Tumor localization using automated palpation with Gaussian Process Adaptive Sampling. IEEE International Conference on Automation Science & Engg. (CASE) 2016 pdf project
-
A Single-Use Haptic Palpation Probe for Locating Subcutaneous Blood Vessels in Robot-Assisted Minimally Invasive Surgery. IEEE International Conference on Automation Science & Engg. (CASE) 2015 Best Poster/Demo Award at ICRA 2015 Workshop on Shared Frameworks for Medical Robotics abstract pdf project
-
Learning by Observation for Surgical Subtasks: Multilateral Cutting of 3D Viscoelastic and 2D Orthotropic Tissue Phantoms. IEEE International Conference on Robotics & Automation (ICRA) 2015 Finalist: Best Paper, Student Paper, and Medical Robotics Paper Award abstract project video
-
Material Evaluation of PC-ISO for Customized, 3D Printed, Gynecologic 192Ir HDR Brachytherapy Applicators. Journal of Applied Clinical Medical Physics (JACMP) 2015 project
-
Transition State Clustering: Unsupervised Surgical Trajectory Segmentation For Robot Learning. International Symposium on Robotics Research (ISRR) 2015 code
-
Exact Reachability Analysis for Planning Skew-Line Needle Arrangements for Automated Brachytherapy. IEEE International Conference on Automation Science & Engg. (CASE) 2014 abstract project
-
Robot-Guided Open-Loop Insertion of Skew-Line Needle Arrangements for High Dose Rate Brachytherapy. IEEE Transactions on Automation Science and Engineering (T-ASE) 2013 abstract pdf
-
An Algorithm for Computing Customized 3D Printed Implants with Curvature Constrained Channels for Enhancing Intracavitary Brachytherapy Radiation Delivery. IEEE International Conference on Automation Science & Engg. (CASE) 2013 abstract pdf project
-
Initial experiments toward automated robotic implantation of skew-line needle arrangements for HDR brachytherapy. IEEE International Conference on Automation Science & Engg. (CASE) 2012 IEEE CASE Best Application Paper Award abstract project video talk
-
Robot-guided Delivery of Brachytherapy Needles along Non-Parallel Paths to Avoid Penile Bulb Puncture. Radiotherapy and Oncology 2012 pdf
-
Low-Cost Teleoperation of Remotely Located Actuators Based on Dual Tone Multi-Frequency Data Transfer. MEMS, NANO and Smart Systems 2012 abstract pdf
-
The Autotrix: Design and Implementation of an Autonomous Urban Driving System. MEMS, NANO and Smart Systems 2012 abstract pdf
-
Object Identification and Mapping using Monocular Vision in an Autonomous Urban Driving System. International Conference of Machine Vision 2010
Preprints
-
Koopman Q-learning: Offline Reinforcement Learning via Symmetries of Dynamics. preprint 2021 Koopman Forward (Conservative) Q-learning (KFC): a model-free RL algorithm which uses the symmetries in the dynamics of the environment to guide data augmentation in Offline RL. arXiv
-
Continuous-Time Fitted Value Iteration for Robust Policies. preprint 2021 Solving the HJB differential equation and its extension, the Hamilton-Jacobi-Isaacs equation, yields a robust optimal policy that achieves the maximum reward on a given task. We propose continuous and robust fitted value iteration that leverage the non-linear control-affine dynamics and separable state & action reward in continuous control to derive the optimal policy and optimal adversary in closed form. arXiv project
-
Auditing AI models for Verified Deployment under Semantic Specifications. preprint 2021 How can we design a similarly motivated auditing scheme for deep learning models? We propose a sequence of semantically aligned unit tests each to verify a predefined specification. arXiv project blog
-
Transferring Dexterous Manipulation from GPU Simulation to a Remote Real-World TriFinger. preprint 2021 A framework for learning a challenging dexterous manipulation task. The system builds on IsaacGym for large scale simulation, keypoint based state representation and Cross-Atlantic remote sim2real to demonstrate the viability to scalability of robot learning. arXiv project
-
A Robot Cluster for Reproducible Research in Dexterous Manipulation. preprint 2021 A framework for democratizing multi-finger manipulation using a common hardware and software benchmark. arXiv
-
Articulated Object Interaction in Unknown Scenes with Whole-Body Mobile Manipulation. preprint 2021 Predict expected keyframes of operating an articulated object, then plan for closed-loop dynamically feasible whole-body motion to match predicted object trajectory. arXiv project
-
De-anonymization of authors through arXiv submissions during double-blind review. preprint 2020 Double-blind reviewing process may tilt the scales in favor of eminent authors who de-anonymize through concurrent arXiv release. arXiv blog
Peer-Reviewed Workshops
-
Uniform Priors for Data-Efficient Transfer. Workshop on Learning with Limited Labelled Data for Image and Video Understanding at CVPR 2022 The most transferable features have high uniformity in the embedding space; we propose a uniformity regularization scheme that encourages better transfer and feature reuse. arXiv talk
-
D2RL: Deep Dense Architectures in Reinforcement Learning. Workshop on Deep Reinforcement Learning (DRL) at NeurIPS 2020 Architectures in RL do not need to be simple MLPs! Dense connections in both actor and critic improve representation learning and hence performance. arXiv project code talk
-
Combining Model-Free and Model-Based Strategies for Sample-Efficient Reinforcement Learning. Workshop on Robot Learning at NeurIPS 2019 Best paper award at the workshop pdf
-
Turbulence forecasting via Neural ODE. Workshop on Machine Learning and the Physical Sciences at NeurIPS 2019 pdf
-
Towards Grasp Transfer using Shape Deformation. Conference on Robot Learning (CoRL), workshop track 2017 pdf poster
-
Hierarchical Task Generalization with Neural Programs. Conference on Robot Learning (CoRL), workshop track 2017 talk
-
Hierarchical Task Generalization with Neural Programs. R:SS Workshop on New Frontiers for Deep Learning in Robotics 2017 poster
-
Adversarially Robust Policy Learning through Active Construction of Physically-Plausible Perturbations. Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM) 2017 pdf
-
On Visual Feature Representations for Transition State Learning in Robotic Task Demonstrations. NIPS Workshop on Feature Extraction 2015 pdf project
-
Automated Delivery Instrument for Stem Cell Treatment Using the daVinci Robotic Surgical System. Annual Meeting of the International Society for Stem Cell Research. Stockholm, Sweden. 2015 poster
-
Autonomous Tumor Localization and Extraction: Palpation, Incision, Debridement and Adhesive Closure with the da Vinci Research Kit. Hamlyn Surgical Robotics Conference, London 2015 Best Video Award video
Theses
-
Autonomous Palpation for Tumor Localization: Design of a Palpation Probe and Gaussian Process Adaptive Sampling. MS Thesis, EECS Department, University of California, Berkeley, 2016 project
-
Optimization and Design for Automation of Brachytherapy Delivery and Learning Robot-Assisted Surgical Sub-Tasks. PhD Thesis, University of California, Berkeley, 2016 project
Patents
-
Precision injector/extractor for robot-assisted minimally-invasive surgery. U.S. Provisional. PCT International Application No.: PCT/US2016/039,026, June, 2016 project
Technical Reports
-
Autonomous localization and navigation using 2D laser scanners. 2010