Improving Efficiency of Training a Virtual Treatment Planner Network via Knowledge-guided Deep Reinforcement Learning for Intelligent Automatic Treatment Planning of Radiotherapy
- Source :
- Med Phys
- Publication Year :
- 2020
- Publisher :
- arXiv, 2020.
Abstract
- Purpose: We previously proposed an intelligent automatic treatment planning framework for radiotherapy, in which a virtual treatment planner network (VTPN) is built using deep reinforcement learning (DRL) to operate a treatment planning system (TPS) by adjusting its treatment planning parameters to generate high-quality plans. We demonstrated the potential feasibility of this idea in prostate cancer intensity-modulated radiation therapy (IMRT). Despite this success, training a VTPN via the standard DRL approach with an ε-greedy algorithm was time-consuming. The required training time was expected to grow with the complexity of the treatment planning problem, preventing the development of VTPN for more complicated but clinically relevant scenarios. In this study, we proposed a novel knowledge-guided DRL (KgDRL) approach that incorporated knowledge from human planners to guide the training process and improve the efficiency of training a VTPN.
- Method: Using prostate cancer IMRT as a testbed, we first summarized a number of rules for adjusting the treatment planning parameters of our in-house TPS. During VTPN training, in addition to randomly navigating the large state-action space, as in the standard DRL approach using the ε-greedy algorithm, we also sampled actions defined by the rules. The priority of sampling actions from the rules decreased over the course of training, to encourage the VTPN to explore new parameter-adjustment policies that were not covered by the rules. To test this idea, we trained a VTPN using KgDRL and compared its performance with that of another VTPN trained using the standard DRL approach. Both networks were trained on 10 patient cases, with 5 additional cases used for validation; another 59 cases were used for evaluation.
- Results: Both VTPNs, trained via KgDRL and standard DRL, spontaneously learned how to operate the in-house TPS to generate high-quality plans, achieving plan quality scores of 8.82 (±0.29) and 8.43 (±0.48), respectively. Both VTPNs outperformed treatment planning based purely on the rules, which had a plan score of 7.81 (±1.59). A VTPN trained with eight episodes using KgDRL performed similarly to one trained using DRL with 100 epochs. The training time was reduced from more than a week to ~13 hours.
- Conclusion: The proposed KgDRL framework was effective in accelerating the training process of a VTPN by incorporating human knowledge, which will facilitate the development of VTPN for more complicated treatment planning scenarios.
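The action-sampling idea described in the abstract (ε-greedy exploration supplemented by rule-defined actions whose priority decreases during training) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the linear decay schedule, the starting rule probability of 0.8, and the way rule-based and random exploration are mixed are all assumptions made for illustration only.

```python
import random


def rule_probability(episode, total_episodes, start=0.8, end=0.0):
    """Hypothetical schedule: linearly decay the share of exploratory
    steps that follow the human-derived rules (start/end are assumed)."""
    frac = min(max(episode / max(total_episodes - 1, 1), 0.0), 1.0)
    return start + (end - start) * frac


def select_action(q_values, rule_action, epsilon, p_rule, rng=random):
    """Pick an action index under an assumed knowledge-guided epsilon-greedy.

    With probability 1 - epsilon: exploit (argmax of q_values).
    Otherwise explore: take the rule-defined action with probability
    p_rule, else a uniformly random action.
    """
    if rng.random() >= epsilon:
        # Exploit: action with the highest estimated value.
        return max(range(len(q_values)), key=q_values.__getitem__)
    if rule_action is not None and rng.random() < p_rule:
        # Guided exploration: follow the planner-derived rule.
        return rule_action
    # Unguided exploration: uniform random action.
    return rng.randrange(len(q_values))


if __name__ == "__main__":
    q = [0.1, 0.9, 0.3]  # made-up value estimates for three actions
    for ep in (0, 4, 9):
        p = rule_probability(ep, total_episodes=10)
        a = select_action(q, rule_action=2, epsilon=0.5, p_rule=p)
        print(f"episode {ep}: p_rule={p:.2f}, action={a}")
```

In this sketch, early episodes lean on the rule-defined action during exploration, while later episodes revert to purely random exploration, which mirrors the abstract's statement that rule priority decreases so the network can discover adjustments the rules do not cover.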
- Subjects :
- Male
Process (engineering)
Computer science
medicine.medical_treatment
media_common.quotation_subject
FOS: Physical sciences
Plan (drawing)
Machine learning
computer.software_genre
Article
030218 nuclear medicine & medical imaging
03 medical and health sciences
Prostate cancer
0302 clinical medicine
medicine
Reinforcement learning
Humans
Quality (business)
Radiation treatment planning
computer.programming_language
media_common
business.industry
Radiotherapy Planning, Computer-Assisted
Prostatic Neoplasms
Radiotherapy Dosage
General Medicine
medicine.disease
Planner
Physics - Medical Physics
Test (assessment)
Radiation therapy
030220 oncology & carcinogenesis
Artificial intelligence
Radiotherapy, Intensity-Modulated
Medical Physics (physics.med-ph)
business
computer
Algorithms
Details
- Database :
- OpenAIRE
- Journal :
- Med Phys
- Accession number :
- edsair.doi.dedup.....670f47259ffe46f59daf7dbca3eaf1e7
- Full Text :
- https://doi.org/10.48550/arxiv.2007.12591