Researcher & Developer · Reinforcement Learning · Computer Vision
I want to build systems that can see, understand, and act in the physical world. My research sits at the intersection of perception and decision-making, with a broad interest in spatial intelligence: how machines can understand, reason about, and interact with physical space.
Currently, I am exploring how Vision Language Models can be grounded in physical action, enabling machines to semantically understand their environment and respond to natural language with physical gestures.
Publications
Segment Anything but Farms: Comparing Segmentation Paradigms for Rural UAV Captured Ultra-High-Resolution Imagery
WACV 2026 Workshop · GeoCV
Experience
NAAMII
AI Engineer Intern
Telemus AI
Research & Development Intern
Robotics Association of Nepal
Projects
Personal
InMoov — Open-Source Humanoid Robot
🔬 Ongoing
Building and programming the InMoov humanoid robot as a research platform. Currently, the robot accepts natural language voice commands and executes a set of physical actions. The ongoing research direction extends this by integrating Vision Language Models so the robot can semantically understand its visual environment and physically point toward named objects on demand (for example, in response to "Where is the door?"), a step toward grounded spatial intelligence in embodied systems.
Subash Sigdel, Ram Tamang
Mentored by Muni Bahadur Shakya — Nepal's first computer scientist & pioneer
Python · ROS · OpenCV · Arduino · 3D Printing · Sensors · LLMs · VLMs
Group
PadhaiSathi — For Visually Impaired Students
🏆 Hackathon Winner
AI-powered assistive web app that helps visually impaired students read and navigate content through audio feedback, using OCR and text-to-speech.
Subash Sigdel, Zidan Rai, Arik Rai, Nima Tamang, Jiten Rai
Python · OCR · gTTS