This project implements Reinforcement Learning algorithms to navigate an advanced aquatic drone through a dynamic underwater environment. The goal is to find the optimal policy to travel from a starting point to a research site, maximizing energy efficiency while avoiding obstacles and utilizing currents.
An aquatic drone is deployed to collect data on marine biodiversity. It starts at point $S$ and must reach the research site $G$. The environment is a grid in which the agent must manage energy consumption and stochastic elements.
- Start ($S$): position (0, 0).
- Goal ($G$): position (9, 7). Reaching it yields a reward of +20.0 and ends the episode.
- Move Cost: standard energy cost of -0.04 per step.
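For reference, these parameters could be declared as simple constants. A minimal sketch; the names and the 10×8 grid size are assumptions inferred from the goal coordinates, not the project's actual code:

```python
# Illustrative environment constants (names are assumptions, not the project's API).
GRID_WIDTH, GRID_HEIGHT = 10, 8   # smallest grid containing the goal (9, 7)
START = (0, 0)                    # S: deployment point
GOAL = (9, 7)                     # G: research site (terminal state)
GOAL_REWARD = 20.0                # reward for reaching G
MOVE_COST = -0.04                 # standard per-step energy cost
```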
| Type | Symbol | Description | Effect / Reward |
|---|---|---|---|
| Open Water | O | Normal movement. | Standard cost (-0.04). |
| Currents | C | Strong ocean currents. | Stochastic: 80% intended move, 10% drift Left, 10% drift Right. |
| Seaweed | F | Dense vegetation. | Additional penalty of -0.2 (total cost: -0.24). |
| Energy Stations | E | Charging points. | Bonus reward of +1.0 when visited. |
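The table translates directly into a reward lookup and a stochastic movement rule. A sketch of one plausible reading, assuming "Left"/"Right" drift is relative to the direction of travel (the project's exact convention is not shown):

```python
import random

# Per-cell rewards implied by the table above.
CELL_REWARD = {"O": -0.04,          # open water: standard move cost
               "C": -0.04,          # currents: standard cost, stochastic motion
               "F": -0.04 - 0.2,    # seaweed: extra -0.2 penalty
               "E": -0.04 + 1.0}    # energy station: +1.0 bonus on visit

# Perpendicular drifts for each intended heading (dx, dy); "Left"/"Right"
# are read here as relative to the direction of travel (an assumption).
PERPENDICULAR = {(0, 1): ((-1, 0), (1, 0)),
                 (0, -1): ((1, 0), (-1, 0)),
                 (1, 0): ((0, 1), (0, -1)),
                 (-1, 0): ((0, -1), (0, 1))}

def sample_move(cell_type, intended):
    """Return the displacement actually executed from a cell of this type."""
    if cell_type == "C":                       # only Currents are stochastic
        r = random.random()
        if r < 0.10:
            return PERPENDICULAR[intended][0]  # 10% drift to one side
        if r < 0.20:
            return PERPENDICULAR[intended][1]  # 10% drift to the other side
    return intended                            # 80% (or 100% elsewhere): intended move
```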
To solve this MDP (Markov Decision Process), two Dynamic Programming algorithms were implemented and compared:
**Value Iteration**
- Approach: iteratively updates the value function $V(s)$ until the largest change ($\Delta$) falls below a threshold (`maxDifferenza`).
- Convergence: asymptotic; it stops when the values stabilize close to the optimal solution.
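Each sweep applies the Bellman optimality backup $V_{k+1}(s) = \max_a \sum_{s'} P(s' \mid s, a)\,[R(s, a, s') + \gamma V_k(s')]$. A minimal Python sketch, assuming a `transitions(s, a)` helper that yields `(probability, next_state, reward)` triples and a discount factor $\gamma$ (both assumptions; the project's actual code is not shown):

```python
def value_iteration(states, actions, transitions, gamma=0.99, max_differenza=1e-4):
    """Sweep the state space until the largest update (delta) drops below the threshold."""
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            # Bellman optimality backup over all actions; terminal states are
            # assumed to return no transitions, so V(goal) stays at 0.
            best = max(
                sum(p * (r + gamma * V[s2]) for p, s2, r in transitions(s, a))
                for a in actions
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < max_differenza:  # mirrors the project's maxDifferenza threshold
            return V
```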
**Policy Iteration**
- Approach: alternates between Policy Evaluation and Policy Improvement.
- Convergence: reaches the exact optimal policy in a finite number of iterations.
- Trade-off: often requires fewer iterations than Value Iteration, but each iteration is computationally more expensive.
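A matching sketch of the evaluate/improve loop, under the same assumed `transitions(s, a)` helper; it terminates as soon as the greedy improvement step leaves the policy unchanged:

```python
def policy_iteration(states, actions, transitions, gamma=0.99, eval_tol=1e-4):
    """Alternate policy evaluation and greedy improvement until the policy is stable."""
    policy = {s: actions[0] for s in states}
    V = {s: 0.0 for s in states}
    while True:
        # Policy Evaluation: iteratively compute V^pi for the current policy.
        while True:
            delta = 0.0
            for s in states:
                v = sum(p * (r + gamma * V[s2])
                        for p, s2, r in transitions(s, policy[s]))
                delta = max(delta, abs(v - V[s]))
                V[s] = v
            if delta < eval_tol:
                break
        # Policy Improvement: act greedily with respect to V.
        stable = True
        for s in states:
            best = max(actions, key=lambda a: sum(
                p * (r + gamma * V[s2]) for p, s2, r in transitions(s, a)))
            if best != policy[s]:
                policy[s] = best
                stable = False
        if stable:
            return policy, V
```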
Both algorithms successfully converge to the same optimal policy.
- Value Iteration is cheaper per iteration but needs more sweeps to bring $\Delta$ below the convergence threshold.
- Policy Iteration reaches the exact optimal policy in fewer total iterations, though each iteration is more expensive, since full policy evaluation scales with the size of the state space.
Question: Does the optimal solution guarantee that the drone passes by at least two charging stations (E) in every possible execution?
Answer: NO. Due to the stochastic nature of the Currents (C), a deterministic path cannot be guaranteed 100% of the time.
- Specifically, at cell (4, 1) the optimal action is DOWN.
- However, there is a 10% probability that the current pushes the drone to the RIGHT.
- If this deviation occurs, the drone is forced onto a path that misses the second column of energy stations, so passing two stations is not guaranteed in every execution.
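This claim can also be checked empirically by rolling out the optimal policy many times and counting distinct station visits. A hypothetical sketch; the `env` interface and its methods are assumptions, not the project's API:

```python
def visits_two_stations(env, policy, episodes=10_000):
    """Estimate how often a rollout passes through at least two E cells."""
    count = 0
    for _ in range(episodes):
        s, stations = env.reset(), set()
        while not env.is_goal(s):
            if env.cell_type(s) == "E":
                stations.add(s)
            s = env.step(s, policy[s])  # stochastic: currents may drift the drone
        if len(stations) >= 2:
            count += 1
    return count / episodes             # an estimate below 1.0 confirms "not guaranteed"
```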
Project developed for the Artificial Intelligence course.