Path Planning of Green Walnut Picking Robotic Arm Based on HER-TD3 Algorithm

doi:10.6041/j.issn.1000-1298.2024.04.011

Funding: Key Research and Development Program of Hebei Province (21327211D) and the Hebei Province Innovation Capability Training Program for Doctoral Students (CXZZBS2022050)

Path Planning of Green Walnut Picking Robotic Arm Based on HER-TD3 Algorithm



Abstract:

Green walnuts and tree branches grow in a disorderly way, which makes the picking environment of a robotic arm complex, the training workload large, and the stability poor. To address these problems, a picking device in which a synchronous belt module cooperates with a robotic arm was designed, and the path of the picking arm was planned with the twin delayed deep deterministic policy gradient algorithm with hindsight experience replay (HER-TD3). HER improves the agent’s exploration ability and alleviates the sparse-reward problem, while TD3 improves the agent’s stability and reduces oscillation during training. To demonstrate the feasibility and generalization ability of HER-TD3, the TD3 and HER-DDPG algorithms were introduced for comparison, and the three deep reinforcement learning agents were trained with a dimensionality-reduction training method. The HER-TD3 model reached a success rate of 98% on the path planning task, 4 percentage points higher than HER-DDPG and 19 percentage points higher than TD3. A three-dimensional simulation environment was built in CoppeliaSim, with the initial pose and collision detection designed and YOLO v4 used to recognize green walnuts. The trained model guided the virtual picking arm around branch obstacles to the target position and completed collision-free path planning, with success rates of 91% without obstacles and 86% with obstacles. In picking experiments on a physical prototype, the path planning task was still completed well: the success rate was 86.7% without obstacles, with an average motion time of 12.8 s, and 80.0% with obstacles, with an average motion time of 13.6 s. These results verify that HER-TD3 has good adaptability and stability in complex environments.
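The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the reward function, the HER goal-selection strategy, or any hyperparameters, so the sparse 0/-1 reward, the "final" relabeling strategy, the tolerance `tol`, and the values of `gamma`, `sigma`, and `noise_clip` below are all assumptions chosen for illustration. The first part shows how HER turns a failed episode into useful experience by pretending that the position actually reached was the goal. The second part shows the TD3 target, which takes the smaller of two critic estimates and adds clipped noise to the target action.

```python
import numpy as np


def sparse_reward(achieved_goal, desired_goal, tol=0.05):
    """Assumed sparse reward: 0 within `tol` of the goal, otherwise -1."""
    dist = np.linalg.norm(np.asarray(achieved_goal) - np.asarray(desired_goal))
    return 0.0 if dist <= tol else -1.0


def her_relabel(episode, tol=0.05):
    """HER with the 'final' strategy (an assumption, not taken from the paper).

    Each step is a dict whose 'achieved_goal' is the end-effector position
    reached after the action. The last achieved position replaces the
    original goal, and rewards are recomputed against it.
    """
    new_goal = episode[-1]["achieved_goal"]
    relabeled = []
    for step in episode:
        relabeled.append({
            "obs": step["obs"],
            "action": step["action"],
            "achieved_goal": step["achieved_goal"],
            "desired_goal": new_goal,
            "reward": sparse_reward(step["achieved_goal"], new_goal, tol),
        })
    return relabeled


def smoothed_target_action(mu_next, rng, sigma=0.2, noise_clip=0.5, max_action=1.0):
    """TD3 target policy smoothing: add clipped Gaussian noise, then clip."""
    noise = rng.normal(0.0, sigma, size=np.shape(mu_next))
    noise = np.clip(noise, -noise_clip, noise_clip)
    return np.clip(mu_next + noise, -max_action, max_action)


def td3_target(reward, done, q1_next, q2_next, gamma=0.99):
    """TD3 clipped double-Q target: bootstrap from the smaller critic value."""
    return reward + gamma * (1.0 - done) * np.minimum(q1_next, q2_next)


# A hypothetical failed two-step episode: the arm never reaches [1, 1, 1].
original_goal = np.array([1.0, 1.0, 1.0])
episode = [
    {"obs": np.zeros(3), "action": np.full(3, 0.1),
     "achieved_goal": np.array([0.1, 0.0, 0.0]),
     "desired_goal": original_goal, "reward": -1.0},
    {"obs": np.array([0.1, 0.0, 0.0]), "action": np.full(3, 0.1),
     "achieved_goal": np.array([0.2, 0.1, 0.0]),
     "desired_goal": original_goal, "reward": -1.0},
]
relabeled = her_relabel(episode)

rng = np.random.default_rng(0)
next_action = smoothed_target_action(np.array([0.9, -0.9, 0.0]), rng)
y = td3_target(reward=-1.0, done=0.0, q1_next=2.0, q2_next=3.0, gamma=0.5)
```

In the relabeled episode the final transition carries a reward of 0, so the agent sees at least one success per episode even when the original target was never reached, which is how HER eases the sparse-reward problem. The `np.minimum` in `td3_target` counters Q-value overestimation, one of the mechanisms TD3 uses to stabilize training. TD3 also delays policy updates relative to critic updates, which this sketch does not show.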


Cite this article

YANG Shuhua, XIE Xiaobo, BING Zhenkai, HAO Jianjun, ZHANG Xiuhua, YUAN Dachao. Path Planning of Green Walnut Picking Robotic Arm Based on HER-TD3 Algorithm[J]. Transactions of the Chinese Society for Agricultural Machinery, 2024, 55(4): 113-123.


History

Received: 2023-12-20
Published online: 2024-04-10
