This project is an autonomous AI agent capable of understanding high-level goals and translating them into a sequence of actions on an Android device. It can see and interpret the device's screen, make decisions, and interact with UI elements to accomplish complex tasks without human intervention.
The agent operates on a continuous see-think-act cycle, powered by a local Large Language Model (LLM) running via Ollama.
- Decomposition: When given a high-level goal, the agent first uses the LLM to break down the goal into a series of smaller, verifiable sub-tasks.
- Observation: For each sub-task, the agent captures the device's current UI hierarchy as an XML file. This gives it "vision," allowing it to see all the buttons, text, and other elements on the screen.
- Decision: The agent sends the current UI layout and the sub-task description to the LLM, asking it to determine the most logical next action (e.g., `tap` on an element, `swipe` down, or `type` text into a field).
- Execution: The agent executes the chosen action on the device using the Android Debug Bridge (ADB).
This loop repeats until all sub-tasks are completed and the main goal is achieved. The entire process is driven by local AI, ensuring privacy and full control over the agent's operations.
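The see-think-act cycle above can be sketched as follows. All class and function names here are illustrative, not the project's actual API:

```python
def decompose(goal, llm):
    """Decomposition: ask the LLM to split the goal into sub-tasks."""
    return llm(f"Decompose into sub-tasks: {goal}")

def run_agent(goal, llm, device):
    """Run the see-think-act loop until every sub-task is done."""
    for subtask in decompose(goal, llm):
        done = False
        while not done:
            ui_xml = device.dump_ui()        # see: capture the UI hierarchy XML
            action = llm(f"UI: {ui_xml}\nTask: {subtask}\nNext action?")  # think
            device.execute(action)           # act: issue the chosen ADB action
            done = device.task_complete(subtask)
```

Here `llm` is any callable that sends a prompt to the local model, and `device` wraps the ADB connection.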
- Goal-Oriented: Operates based on natural language objectives.
- Vision-Capable: Analyzes the screen's UI hierarchy to make informed decisions.
- Local & Private: Runs entirely on your local machine using Ollama. No data ever leaves your computer.
- Modular Architecture: Built with a clean separation between device interaction, AI planning, and action execution, making it easy to extend and modify.
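To make the "vision" step concrete, here is a minimal sketch of pulling clickable elements out of a uiautomator-style UI dump. The sample XML is illustrative; real dumps carry many more attributes per node:

```python
import xml.etree.ElementTree as ET

SAMPLE_UI = """<hierarchy>
  <node text="Settings" class="android.widget.TextView" clickable="false" bounds="[0,0][200,50]"/>
  <node text="Wi-Fi" class="android.widget.TextView" clickable="true" bounds="[0,60][200,110]"/>
</hierarchy>"""

def clickable_elements(ui_xml):
    """Return (text, bounds) pairs for every clickable node in the dump."""
    root = ET.fromstring(ui_xml)
    return [(n.get("text"), n.get("bounds"))
            for n in root.iter("node")
            if n.get("clickable") == "true"]
```

The agent can feed a list like this to the LLM instead of the raw XML to keep prompts short.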
- Android Studio & ADB: You will need to have the Android SDK Platform-Tools installed and know how to connect your device via ADB (Android Debug Bridge).
- Python: Python 3.10 or higher is required.
- Ollama: Ollama is required to run the language model. The setup script will attempt to install it for you.
- Connect Your Android Device:
- The agent connects to an Android device using an ADB TCP connection (IP address and port). This is standard for emulators and for physical devices using Wi-Fi debugging.
- For Emulators (Android Studio, Genymotion, etc.):
- Ensure your emulator is running.
- The agent will attempt to auto-detect the device from the `adb devices` command. If no TCP device is listed, it will default to `127.0.0.1:5555`.
- For Physical Devices (via Wi-Fi):
- Enable Developer Options and Wireless Debugging on your device.
- Connect to the same Wi-Fi network as your computer.
- Use the IP address and port provided in the Wireless Debugging settings to connect: `adb connect YOUR_DEVICE_IP:PORT`
- Verify Connection:
- Run `adb devices` in your terminal. You should see your device listed with an IP address (e.g., `192.168.1.5:5555 device`).
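The auto-detection described above can be approximated like this. `pick_tcp_device` is a hypothetical helper, not the project's actual code:

```python
import re
import subprocess

DEFAULT_TCP = "127.0.0.1:5555"

def pick_tcp_device(adb_output):
    """Return the first ip:port serial in `adb devices` output, else the default."""
    for line in adb_output.splitlines()[1:]:        # skip the header line
        parts = line.split()
        if (len(parts) == 2 and parts[1] == "device"
                and re.match(r"^[\d.]+:\d+$", parts[0])):
            return parts[0]
    return DEFAULT_TCP

def detect_device():
    """Run `adb devices` and pick a TCP-connected device from its output."""
    out = subprocess.run(["adb", "devices"], capture_output=True, text=True).stdout
    return pick_tcp_device(out)
```

USB-only serials (e.g. `emulator-5554`) are skipped because tapping and typing are sent over the TCP transport.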
- Run the Setup Script:
- Open a command prompt and navigate to the project directory.
- Run the `setup.bat` script to automatically install all dependencies.
- The script will check for ADB, install the required Python packages, and download/install Ollama if it's not already present.
- Pull the Ollama Model:
- The setup script will automatically pull the required `llama3.2` model. If it fails, you can do it manually by running: `ollama pull llama3.2`
- Start the Ollama Service:
- Before running the agent, you must start the Ollama service. You can do this by running: `ollama serve`
- Leave this service running in a separate terminal.
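Once `ollama serve` is running, the agent can talk to it over Ollama's local HTTP API (default port 11434). A minimal sketch using only the standard library; the helper names are illustrative:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default endpoint

def build_payload(prompt, model="llama3.2"):
    """Build a non-streaming generate request for the local model."""
    return {"model": model, "prompt": prompt, "stream": False}

def ask_ollama(prompt):
    """POST the prompt to the local Ollama service and return its response text."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Because everything goes to `localhost`, no prompt or screen data leaves your machine.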
- Run the Demo:
- Open a new terminal, navigate to the project directory, and run the demo with: `python demo.py`
- You can change the goal for the agent by modifying the `goal` variable in the `demo.py` file.
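For example, the top of `demo.py` might look like this (illustrative; check the file for its exact structure):

```python
# demo.py (excerpt): edit the goal string to change what the agent does.
goal = "Open Settings and enable Wi-Fi"
```

Any natural-language objective works here; the agent decomposes it into sub-tasks at startup.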