Fig 1.
Modeling of a fixed-wing UAV.
Fig 2.
Track followed by the fixed-wing UAV.
Fig 3.
Operation of the airspeed and altitude controller.
Fig 4.
Operation of the heading controller.
Fig 5.
Operation of the attitude controller.
Fig 6.
DDPG algorithm structure.
Fig 7.
TD3 algorithm structure.
Fig 8.
PPO algorithm structure.
Fig 9.
TRPO algorithm structure.
Fig 10.
SAC algorithm structure.
Table 1.
Strengths and limitations of RL algorithms.
Table 2.
Training process for RL agents.
Table 3.
Evaluation criteria for RL agents.
Table 4.
Hyperparameters for all agents.
Fig 11.
Altitude control (left) and training curve (right) for the DDPG agent.
Fig 12.
Heading control (left) and roll control (right) for the DDPG agent.
Fig 13.
Altitude control (left) and training curve (right) for the TRPO agent.
Fig 14.
Heading control (left) and roll control (right) for the TRPO agent.
Fig 15.
Altitude control (left) and training curve (right) for the PPO agent.
Fig 16.
Heading control (left) and roll control (right) for the PPO agent.
Fig 17.
Altitude control (left) and training curve (right) for the TD3 agent.
Fig 18.
Heading control (left) and roll control (right) for the TD3 agent.
Fig 19.
Altitude control (left) and training curve (right) for the SAC agent.
Fig 20.
Heading control (left) and roll control (right) for the SAC agent.
Fig 21.
Altitude control for the PID controller.
Fig 22.
Heading control (left) and roll control (right) for the PID controller.
Fig 23.
Deflection of the control surfaces over time.
Fig 24.
Comparison of RL agent and PID responses for the altitude controller.
Fig 25.
Comparison of RL agent and PID responses for the heading controller.
Fig 26.
Comparison of RL agent and PID responses for the roll controller.
Table 5.
Comparison of RL agents.
Table 6.
Training time, total steps, and step efficiency of RL agents.
Table 7.
Comparison of PID and RL agents.