{"id":946,"date":"2026-05-07T17:35:29","date_gmt":"2026-05-07T09:35:29","guid":{"rendered":"https:\/\/ldsc.pme.nthu.edu.tw\/?p=946"},"modified":"2026-05-07T19:32:18","modified_gmt":"2026-05-07T11:32:18","slug":"%e4%bb%a5%e5%bb%ba%e6%a7%8b%e5%bc%8f%e7%b6%b2%e8%b7%af%e6%93%b4%e5%b1%95%e4%b9%8b%e5%a4%9a%e6%a9%9f%e5%99%a8%e4%ba%ba%e7%b3%bb%e7%b5%b1%e5%b0%8e%e8%88%aa%e8%88%87%e9%81%bf%e9%9a%9c%e4%b9%8b%e6%b7%b1","status":"publish","type":"post","link":"https:\/\/ldsc.pme.nthu.edu.tw\/en\/research\/academic-research\/%e4%bb%a5%e5%bb%ba%e6%a7%8b%e5%bc%8f%e7%b6%b2%e8%b7%af%e6%93%b4%e5%b1%95%e4%b9%8b%e5%a4%9a%e6%a9%9f%e5%99%a8%e4%ba%ba%e7%b3%bb%e7%b5%b1%e5%b0%8e%e8%88%aa%e8%88%87%e9%81%bf%e9%9a%9c%e4%b9%8b%e6%b7%b1\/","title":{"rendered":"Deep Reinforcement Learning for Navigation and Collision Avoidance of Multi-Robot Systems By Constructive Network Expansion"},"content":{"rendered":"\n<p class=\"has-medium-font-size\">This research presents a navigation and obstacle avoidance policy network for multi-robot systems using deep reinforcement learning. The network is first designed and trained for a dual-robot setup. By incorporating nonholonomic constraints and priority rules, reinforcement learning is used to train the network with respect to the kinematics of mobile robots, enabling effective navigation and collision avoidance. An innovative expansion architecture is introduced, leveraging the social-force model to extend the dual-robot policy to multi-robot scenarios with moderate computational cost. Although the network is trained in an open environment, it can be applied to general map environments by using virtual robots to simulate walls and compartments. Simulations and indoor experiments validate the feasibility and performance of the proposed multi-robot navigation and obstacle avoidance policy.<\/p>\n\n\n\n<figure class=\"wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio\"><div class=\"wp-block-embed__wrapper\">\n<div class=\"ast-oembed-container\" style=\"height: 100%;\"><iframe loading=\"lazy\" title=\"DRL for Navigation and Collision Avoidance of Multi Robot systems by Constructive Network Expansion\" width=\"500\" height=\"281\" src=\"https:\/\/www.youtube.com\/embed\/jqA5vrRYqv8?feature=oembed\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" allowfullscreen><\/iframe><\/div>\n<\/div><\/figure>\n","protected":false},"excerpt":{"rendered":"<p>This research presents a navigation and obstacle avoidance policy network for multi-robot systems using deep reinforcement learning. The network is first designed and trained for a dual-robot setup. 