diff --git a/index.toml b/index.toml
index 7eeaf19..2f3e105 100644
--- a/index.toml
+++ b/index.toml
@@ -370,3 +370,9 @@ title = "Tabular Data Processing with Prior Labs MCP"
 notebook = "prior_labs_agent.ipynb"
 new = true
 topics = ["Agents", "MCP", "Data Processing"]
+
+[[cookbook]]
+title = "Live-Learning Research Agent with Perplexity (Search + Embeddings + Agent) and Qdrant"
+notebook = "perplexity_live_research_agent.ipynb"
+topics = ["Agents", "RAG", "Web-QA", "Advanced Retrieval"]
+new = true
diff --git a/notebooks/perplexity_live_research_agent.ipynb b/notebooks/perplexity_live_research_agent.ipynb
new file mode 100644
index 0000000..940b69c
--- /dev/null
+++ b/notebooks/perplexity_live_research_agent.ipynb
@@ -0,0 +1,1099 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "title",
+   "metadata": {
+    "tags": []
+   },
+   "source": [
+    "<a name=\"title\"></a>\n",
+    "# Live-Learning Research Agent with Perplexity (Search + Embeddings + Agent) and Qdrant\n",
+    "\n",
+    "*Notebook by the [Perplexity](https://perplexity.ai) API team.*\n",
+    "\n",
+    "This cookbook builds a **single research agent that uses all three Perplexity APIs together** through the [`perplexity-haystack`](https://haystack.deepset.ai/integrations/perplexity) integration:\n",
+    "\n",
+    "| Perplexity API | Haystack component | Role in this notebook |\n",
+    "|---|---|---|\n",
+    "| **Agent API** (`POST /v1/agent`) | `PerplexityChatGenerator` | The agent's reasoning model. OpenAI-Responses-compatible, so it slots into Haystack's [`Agent`](https://docs.haystack.deepset.ai/docs/agent) with no glue. |\n",
+    "| **Search API** (`POST /search`) | `PerplexityWebSearch` | Ranked, cleaned, cited web results, exposed to the agent as a tool \u2014 replaces the SerperDev / DuckDuckGo + `LinkContentFetcher` chain other cookbooks build by hand. |\n",
+    "| **Embeddings API** (`POST /v1/embeddings`) | `PerplexityTextEmbedder` / `PerplexityDocumentEmbedder` | Indexes documents into [Qdrant](https://haystack.deepset.ai/integrations/qdrant-document-store) and embeds queries at retrieval time. |\n",
+    "\n",
+    "The agent gets three tools (`retrieve_from_index`, `web_search`, `ingest_url`) and decides per question whether to read the local index, search the live web, or grow the index with a freshly fetched page. Net result: a knowledge base that *learns from the agent's own behaviour*, with citations on every web answer.\n",
+    "\n",
+    "> **Embeddings note:** Perplexity's `/v1/embeddings` endpoint only accepts `encoding_format` of `base64_int8` or `base64_binary`. As of [`perplexity-haystack` PR #3344](https://github.com/deepset-ai/haystack-core-integrations/pull/3344), `PerplexityDocumentEmbedder` and `PerplexityTextEmbedder` default to `base64_int8` and decode the response back to `list[float]` automatically \u2014 no manual `httpx` call needed. Make sure you're on a release that includes that fix; on older versions the embedders inherit OpenAI's `encoding_format=\"float\"` default and get HTTP 400.\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "what-you-will-build",
+   "metadata": {
+    "tags": []
+   },
+   "source": [
+    "<a name=\"what-you-will-build\"></a>\n",
+    "## What you will build\n",
+    "\n",
+    "1. A Qdrant Cloud\u2013backed knowledge base seeded with a couple of Haystack documentation snippets, embedded with `pplx-embed-v1-0.6b` via `PerplexityDocumentEmbedder`.\n",
+    "2. A `web_search` tool wrapping `PerplexityWebSearch` so the agent can hit the Perplexity Search API directly.\n",
+    "3. An `ingest_url` tool that takes a URL from a web search result, extracts the page with `trafilatura`, embeds the chunks with `PerplexityDocumentEmbedder`, and writes them to Qdrant.\n",
+    "4. A `retrieve_from_index` tool that embeds the query with `PerplexityTextEmbedder` and pulls the top-k from Qdrant.\n",
+    "5. A Haystack [`Agent`](https://docs.haystack.deepset.ai/docs/agent) driven by `PerplexityChatGenerator` that orchestrates the three tools.\n",
+    "6. Three sample questions that show the index growing across turns and answers carrying citations end-to-end.\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "setup",
+   "metadata": {
+    "tags": []
+   },
+   "source": [
+    "<a name=\"setup\"></a>\n",
+    "## 1. Setup\n",
+    "\n",
+    "Install the integration packages plus `trafilatura` for HTML extraction. The notebook uses a **Qdrant Cloud** cluster (free tier is enough) so the index persists across runs \u2014 flip `recreate_index=True` to `False` once you have something you want to keep.\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "install",
+   "metadata": {
+    "execution": {
+     "iopub.execute_input": "2026-05-21T20:54:38.213438Z",
+     "iopub.status.busy": "2026-05-21T20:54:38.213330Z",
+     "iopub.status.idle": "2026-05-21T20:54:38.216520Z",
+     "shell.execute_reply": "2026-05-21T20:54:38.216064Z"
+    },
+    "tags": [
+     "install"
+    ]
+   },
+   "outputs": [],
+   "source": [
+    "%pip install -q \\\n",
+    "    \"haystack-ai>=2.24.1\" \\\n",
+    "    \"perplexity-haystack\" \\\n",
+    "    \"qdrant-haystack\" \\\n",
+    "    \"trafilatura\" \\\n",
+    "    \"httpx\"\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "credentials",
+   "metadata": {
+    "tags": []
+   },
+   "source": [
+    "<a name=\"credentials\"></a>\n",
+    "### 1.1 Credentials\n",
+    "\n",
+    "You'll need two things:\n",
+    "\n",
+    "* A **Perplexity API key** from [https://www.perplexity.ai/account/api](https://www.perplexity.ai/account/api).\n",
+    "* A **Qdrant Cloud cluster URL + API key** from [https://cloud.qdrant.io](https://cloud.qdrant.io) (free tier works fine).\n",
+    "\n",
+    "The cell below uses `getpass` so the keys never end up in the notebook output.\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "setup-credentials",
+   "metadata": {
+    "execution": {
+     "iopub.execute_input": "2026-05-21T20:54:38.217778Z",
+     "iopub.status.busy": "2026-05-21T20:54:38.217662Z",
+     "iopub.status.idle": "2026-05-21T20:54:38.221347Z",
+     "shell.execute_reply": "2026-05-21T20:54:38.220845Z"
+    },
+    "tags": [
+     "setup",
+     "credentials"
+    ]
+   },
+   "outputs": [],
+   "source": [
+    "import os\n",
+    "from getpass import getpass\n",
+    "\n",
+    "if not os.environ.get(\"PERPLEXITY_API_KEY\"):\n",
+    "    os.environ[\"PERPLEXITY_API_KEY\"] = getpass(\"PERPLEXITY_API_KEY: \")\n",
+    "if not os.environ.get(\"QDRANT_URL\"):\n",
+    "    os.environ[\"QDRANT_URL\"] = getpass(\"QDRANT_URL (e.g. https://xxxx.cloud.qdrant.io): \")\n",
+    "if not os.environ.get(\"QDRANT_API_KEY\"):\n",
+    "    os.environ[\"QDRANT_API_KEY\"] = getpass(\"QDRANT_API_KEY: \")\n",
+    "\n",
+    "print(\"Perplexity key set:\", bool(os.environ.get(\"PERPLEXITY_API_KEY\")))\n",
+    "print(\"Qdrant URL set:    \", bool(os.environ.get(\"QDRANT_URL\")))\n",
+    "print(\"Qdrant key set:    \", bool(os.environ.get(\"QDRANT_API_KEY\")))\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "embeddings",
+   "metadata": {
+    "tags": []
+   },
+   "source": [
+    "<a name=\"embeddings\"></a>\n",
+    "## 2. Embedding components (Perplexity `pplx-embed-v1-0.6b`)\n",
+    "\n",
+    "We use the first-class Haystack components from `perplexity-haystack`:\n",
+    "\n",
+    "* `PerplexityDocumentEmbedder` \u2014 embeds `Document` objects in batches for indexing.\n",
+    "* `PerplexityTextEmbedder` \u2014 embeds a single query string at retrieval time.\n",
+    "\n",
+    "Both default to `encoding_format=\"base64_int8\"` and decode the response to `list[float]` internally \u2014 see [PR #3344](https://github.com/deepset-ai/haystack-core-integrations/pull/3344). The vector dimension for `pplx-embed-v1-0.6b` is **1024**; keep that in mind when you create the Qdrant collection below.\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "embed-helper",
+   "metadata": {
+    "execution": {
+     "iopub.execute_input": "2026-05-21T20:54:38.222462Z",
+     "iopub.status.busy": "2026-05-21T20:54:38.222351Z",
+     "iopub.status.idle": "2026-05-21T20:54:38.441325Z",
+     "shell.execute_reply": "2026-05-21T20:54:38.440733Z"
+    },
+    "tags": [
+     "component:embedder",
+     "perplexity:embeddings"
+    ]
+   },
+   "outputs": [],
+   "source": [
+    "from haystack_integrations.components.embedders.perplexity import (\n",
+    "    PerplexityDocumentEmbedder,\n",
+    "    PerplexityTextEmbedder,\n",
+    ")\n",
+    "\n",
+    "EMBEDDING_MODEL = \"pplx-embed-v1-0.6b\"\n",
+    "EMBEDDING_DIM = 1024\n",
+    "\n",
+    "doc_embedder = PerplexityDocumentEmbedder(model=EMBEDDING_MODEL)\n",
+    "text_embedder = PerplexityTextEmbedder(model=EMBEDDING_MODEL)\n",
+    "\n",
+    "doc_embedder.warm_up()\n",
+    "text_embedder.warm_up()\n",
+    "\n",
+    "# Quick sanity check \u2014 embed a single query string.\n",
+    "sample = text_embedder.run(text=\"retrieval augmented generation\")\n",
+    "print(\"returned vector of dim\", len(sample[\"embedding\"]))\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "doc-store",
+   "metadata": {
+    "tags": []
+   },
+   "source": [
+    "<a name=\"doc-store\"></a>\n",
+    "## 3. Qdrant document store\n",
+    "\n",
+    "The Qdrant collection is created with `embedding_dim=1024` to match `pplx-embed-v1-0.6b`. We keep `recreate_index=True` here so re-running the notebook from scratch gives reproducible output; change it to `False` once you want the index to persist between sessions.\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "id": "init-qdrant",
+   "metadata": {
+    "execution": {
+     "iopub.execute_input": "2026-05-21T20:54:38.442682Z",
+     "iopub.status.busy": "2026-05-21T20:54:38.442523Z",
+     "iopub.status.idle": "2026-05-21T20:54:41.164116Z",
+     "shell.execute_reply": "2026-05-21T20:54:41.163213Z"
+    },
+    "tags": [
+     "setup",
+     "doc-store"
+    ]
+   },
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Qdrant collection ready: research_agent_demo\n"
+     ]
+    },
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Docs currently in index: 0\n"
+     ]
+    }
+   ],
+   "source": [
+    "from haystack.utils import Secret\n",
+    "from haystack_integrations.document_stores.qdrant import QdrantDocumentStore\n",
+    "\n",
+    "document_store = QdrantDocumentStore(\n",
+    "    url=os.environ[\"QDRANT_URL\"],\n",
+    "    api_key=Secret.from_token(os.environ[\"QDRANT_API_KEY\"]),\n",
+    "    index=\"research_agent_demo\",\n",
+    "    embedding_dim=EMBEDDING_DIM,\n",
+    "    similarity=\"cosine\",\n",
+    "    recreate_index=True,   # set False once you want to keep the index across runs\n",
+    "    return_embedding=False,\n",
+    ")\n",
+    "\n",
+    "print(\"Qdrant collection ready:\", document_store.index)\n",
+    "print(\"Docs currently in index:\", document_store.count_documents())\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "seed",
+   "metadata": {
+    "tags": []
+   },
+   "source": [
+    "<a name=\"seed\"></a>\n",
+    "## 4. Seed the index\n",
+    "\n",
+    "Two short docs about *Haystack itself*. We intentionally leave out anything about the `perplexity-haystack` package so that one of the demo questions later forces the agent to call `web_search` and then `ingest_url`.\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "seed-docs",
+   "metadata": {
+    "execution": {
+     "iopub.execute_input": "2026-05-21T20:54:41.165792Z",
+     "iopub.status.busy": "2026-05-21T20:54:41.165480Z",
+     "iopub.status.idle": "2026-05-21T20:54:41.810765Z",
+     "shell.execute_reply": "2026-05-21T20:54:41.810255Z"
+    },
+    "tags": [
+     "doc-store",
+     "seed"
+    ]
+   },
+   "outputs": [],
+   "source": [
+    "from haystack import Document\n",
+    "from haystack.document_stores.types import DuplicatePolicy\n",
+    "\n",
+    "seed_docs = [\n",
+    "    Document(\n",
+    "        content=(\n",
+    "            \"Haystack is an open-source LLM framework by deepset. You compose \"\n",
+    "            \"components like retrievers, generators, and embedders into pipelines, \"\n",
+    "            \"and add tool-calling Agents on top.\"\n",
+    "        ),\n",
+    "        meta={\"source\": \"https://haystack.deepset.ai/\", \"title\": \"Haystack overview\"},\n",
+    "    ),\n",
+    "    Document(\n",
+    "        content=(\n",
+    "            \"In Haystack 2.x, the Agent component takes a chat generator plus a \"\n",
+    "            \"list of Tool objects, loops over tool calls, and exits when the model \"\n",
+    "            \"emits a final answer (the 'text' exit condition).\"\n",
+    "        ),\n",
+    "        meta={\"source\": \"https://docs.haystack.deepset.ai/docs/agent\", \"title\": \"Haystack Agent component\"},\n",
+    "    ),\n",
+    "]\n",
+    "\n",
+    "# PerplexityDocumentEmbedder returns Documents with `.embedding` populated.\n",
+    "# Qdrant needs those vectors at write time, so embed before writing.\n",
+    "embedded = doc_embedder.run(documents=seed_docs)[\"documents\"]\n",
+    "\n",
+    "document_store.write_documents(embedded, policy=DuplicatePolicy.OVERWRITE)\n",
+    "print(\"Seeded docs in index:\", document_store.count_documents())\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "retriever",
+   "metadata": {
+    "tags": []
+   },
+   "source": [
+    "<a name=\"retriever\"></a>\n",
+    "## 5. `retrieve_from_index` tool\n",
+    "\n",
+    "Embed the query with the same model used at indexing time, then pull the top-k from Qdrant. The tool returns a compact JSON payload \u2014 title, source URL, snippet, score \u2014 because the agent has to fit it into its context window.\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 6,
+   "id": "import-json",
+   "metadata": {
+    "execution": {
+     "iopub.execute_input": "2026-05-21T20:54:41.812022Z",
+     "iopub.status.busy": "2026-05-21T20:54:41.811899Z",
+     "iopub.status.idle": "2026-05-21T20:54:41.814392Z",
+     "shell.execute_reply": "2026-05-21T20:54:41.813927Z"
+    },
+    "tags": [
+     "setup"
+    ]
+   },
+   "outputs": [],
+   "source": [
+    "import json\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "retriever-tool",
+   "metadata": {
+    "execution": {
+     "iopub.execute_input": "2026-05-21T20:54:41.815845Z",
+     "iopub.status.busy": "2026-05-21T20:54:41.815714Z",
+     "iopub.status.idle": "2026-05-21T20:54:42.131915Z",
+     "shell.execute_reply": "2026-05-21T20:54:42.131357Z"
+    },
+    "tags": [
+     "tool:retrieve",
+     "component:retriever"
+    ]
+   },
+   "outputs": [],
+   "source": [
+    "from haystack_integrations.components.retrievers.qdrant import QdrantEmbeddingRetriever\n",
+    "\n",
+    "retriever = QdrantEmbeddingRetriever(document_store=document_store, top_k=4)\n",
+    "\n",
+    "\n",
+    "def retrieve_from_index(query: str, top_k: int = 4) -> dict:\n",
+    "    query_emb = text_embedder.run(text=query)[\"embedding\"]\n",
+    "    hits = retriever.run(query_embedding=query_emb, top_k=top_k)[\"documents\"]\n",
+    "    return {\n",
+    "        \"hits\": [\n",
+    "            {\n",
+    "                \"title\": d.meta.get(\"title\", \"\"),\n",
+    "                \"source\": d.meta.get(\"source\", \"\"),\n",
+    "                \"snippet\": d.content[:400],\n",
+    "                \"score\": round(d.score, 4) if d.score is not None else None,\n",
+    "            }\n",
+    "            for d in hits\n",
+    "        ]\n",
+    "    }\n",
+    "\n",
+    "\n",
+    "# Smoke test \u2014 should retrieve the seed docs about Haystack.\n",
+    "preview = retrieve_from_index(\"What is Haystack?\", top_k=2)\n",
+    "print(json.dumps(preview, indent=2)[:800])\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "web-search",
+   "metadata": {
+    "tags": []
+   },
+   "source": [
+    "<a name=\"web-search\"></a>\n",
+    "## 6. `web_search` tool (Perplexity Search API)\n",
+    "\n",
+    "`PerplexityWebSearch` hits `POST /search` and gives back already-ranked, already-cleaned results plus the list of source URLs. No SERP scraper, no extra fetcher in front of the model. Refer to the official documentation [here](https://docs.perplexity.ai/docs/search/quickstart).\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 8,
+   "id": "web-search-tool",
+   "metadata": {
+    "execution": {
+     "iopub.execute_input": "2026-05-21T20:54:42.133168Z",
+     "iopub.status.busy": "2026-05-21T20:54:42.133046Z",
+     "iopub.status.idle": "2026-05-21T20:54:42.533718Z",
+     "shell.execute_reply": "2026-05-21T20:54:42.533259Z"
+    },
+    "tags": [
+     "tool:web_search",
+     "component:websearch",
+     "perplexity:search"
+    ]
+   },
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "{\n",
+      "  \"results\": [\n",
+      "    {\n",
+      "      \"title\": \"Perplexity with Haystack\",\n",
+      "      \"url\": \"https://docs.perplexity.ai/docs/getting-started/integrations/haystack\",\n",
+      "      \"snippet\": \"## \\u200b Overview\\nThe `perplexity-haystack` package provides Haystack components for Perplexity\\u2019s Agent API, Embeddings API, and grounded Search API, so you can build retrieval-augmented and agentic pipelines that combine chat, embeddings, and live web search.\\n**Haystack** is an open-source Python framework by deepset for building production-ready LLM applications, including RAG pipelines and agentic \"\n",
+      "    },\n",
+      "    {\n",
+      "      \"title\": \"Perplexity | Haystack - deepset AI\",\n",
+      "      \"url\": \"https://haystack.deepset.ai/integrations/perplexity\",\n",
+      "      \"snippet\": \"# Integration: Perplexity\\nUse the Perplexity Agent API, Embeddings API, and grounded Search API in Haystack pipelines.\\n...\\n## Overview\\nThe `perplexity-haystack`\n"
+     ]
+    }
+   ],
+   "source": [
+    "from haystack_integrations.components.websearch.perplexity import PerplexityWebSearch\n",
+    "\n",
+    "websearch = PerplexityWebSearch(top_k=5)\n",
+    "\n",
+    "\n",
+    "def web_search(query: str, top_k: int = 5) -> dict:\n",
+    "    r = websearch.run(query=query, search_params={\"max_results\": top_k})\n",
+    "    return {\n",
+    "        \"results\": [\n",
+    "            {\n",
+    "                \"title\": d.meta.get(\"title\", \"\"),\n",
+    "                \"url\": d.meta.get(\"url\") or d.meta.get(\"link\", \"\"),\n",
+    "                \"snippet\": d.content[:400],\n",
+    "            }\n",
+    "            for d in r[\"documents\"]\n",
+    "        ],\n",
+    "        \"links\": r[\"links\"],\n",
+    "    }\n",
+    "\n",
+    "\n",
+    "preview = web_search(\"perplexity haystack integration overview\", top_k=3)\n",
+    "print(json.dumps(preview, indent=2)[:900])\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "ingest",
+   "metadata": {
+    "tags": []
+   },
+   "source": [
+    "<a name=\"ingest\"></a>\n",
+    "## 7. `ingest_url` tool\n",
+    "\n",
+    "Fetch a URL with `trafilatura` (with an `httpx` fallback for sites that reject the default user agent, typically with a 403), pull the main article text, chunk it with Haystack's `DocumentSplitter`, embed the chunks, and write them to Qdrant.\n",
+    "\n",
+    "Once a page is ingested, future `retrieve_from_index` calls can hit it without paying for another web request.\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "ingest-tool",
+   "metadata": {
+    "execution": {
+     "iopub.execute_input": "2026-05-21T20:54:42.535679Z",
+     "iopub.status.busy": "2026-05-21T20:54:42.535560Z",
+     "iopub.status.idle": "2026-05-21T20:54:44.311222Z",
+     "shell.execute_reply": "2026-05-21T20:54:44.309817Z"
+    },
+    "tags": [
+     "tool:ingest",
+     "component:splitter",
+     "component:embedder"
+    ]
+   },
+   "outputs": [],
+   "source": [
+    "import httpx\n",
+    "import trafilatura\n",
+    "from haystack.components.preprocessors import DocumentSplitter\n",
+    "\n",
+    "splitter = DocumentSplitter(split_by=\"word\", split_length=200, split_overlap=20)\n",
+    "splitter.warm_up()\n",
+    "\n",
+    "\n",
+    "def ingest_url(url: str, title: str | None = None) -> dict:\n",
+    "    # trafilatura's own fetcher is fine on clean pages; some sites block the\n",
+    "    # default UA, so fall back to httpx with a browser-ish UA.\n",
+    "    raw = trafilatura.fetch_url(url)\n",
+    "    if not raw:\n",
+    "        try:\n",
+    "            raw = httpx.get(\n",
+    "                url, timeout=20, follow_redirects=True,\n",
+    "                headers={\"User-Agent\": \"Mozilla/5.0 (Haystack cookbook demo)\"},\n",
+    "            ).text\n",
+    "        except Exception as e:\n",
+    "            return {\"ok\": False, \"reason\": f\"fetch error: {e}\", \"url\": url}\n",
+    "    text = trafilatura.extract(raw)\n",
+    "    if not text or len(text) < 200:\n",
+    "        return {\"ok\": False, \"reason\": \"could not extract enough content\", \"url\": url}\n",
+    "\n",
+    "    doc = Document(content=text, meta={\"source\": url, \"title\": title or url})\n",
+    "    chunks = splitter.run(documents=[doc])[\"documents\"]\n",
+    "\n",
+    "    embedded_chunks = doc_embedder.run(documents=chunks)[\"documents\"]\n",
+    "\n",
+    "    document_store.write_documents(embedded_chunks, policy=DuplicatePolicy.OVERWRITE)\n",
+    "    return {\n",
+    "        \"ok\": True,\n",
+    "        \"url\": url,\n",
+    "        \"chunks_indexed\": len(embedded_chunks),\n",
+    "        \"chars_extracted\": len(text),\n",
+    "        \"total_docs_in_index\": document_store.count_documents(),\n",
+    "    }\n",
+    "\n",
+    "\n",
+    "# Demo: ingest the Perplexity integrations landing page so the agent\n",
+    "# can answer questions about perplexity-haystack later from the index.\n",
+    "print(json.dumps(\n",
+    "    ingest_url(\"https://haystack.deepset.ai/integrations/perplexity\",\n",
+    "               title=\"Perplexity Haystack integration\"),\n",
+    "    indent=2,\n",
+    "))\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "agent",
+   "metadata": {
+    "tags": []
+   },
+   "source": [
+    "<a name=\"agent\"></a>\n",
+    "## 8. Agent (Perplexity Agent API)\n",
+    "\n",
+    "`PerplexityChatGenerator` defaults to `openai/gpt-5.4` via the Agent API; you can swap to any other model the Agent API exposes (Anthropic, Gemini, Perplexity Sonar, etc.). Refer to the official documentation [here](https://docs.perplexity.ai/docs/agent-api/quickstart).\n",
+    "\n",
+    "The system prompt forces the order **retrieve \u2192 web_search \u2192 ingest_url \u2192 retrieve again \u2192 answer** so we get a clean demo of the loop. In production you can loosen it.\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 10,
+   "id": "agent-init",
+   "metadata": {
+    "execution": {
+     "iopub.execute_input": "2026-05-21T20:54:44.312883Z",
+     "iopub.status.busy": "2026-05-21T20:54:44.312616Z",
+     "iopub.status.idle": "2026-05-21T20:54:45.818446Z",
+     "shell.execute_reply": "2026-05-21T20:54:45.817587Z"
+    },
+    "tags": [
+     "component:agent",
+     "perplexity:agent"
+    ]
+   },
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Agent ready with tools: ['retrieve_from_index', 'web_search', 'ingest_url']\n"
+     ]
+    }
+   ],
+   "source": [
+    "from haystack.tools import Tool\n",
+    "from haystack.components.agents import Agent\n",
+    "from haystack.dataclasses import ChatMessage\n",
+    "from haystack_integrations.components.generators.perplexity import PerplexityChatGenerator\n",
+    "\n",
+    "retrieve_tool = Tool(\n",
+    "    name=\"retrieve_from_index\",\n",
+    "    description=(\n",
+    "        \"Search the local Qdrant knowledge base (Perplexity embeddings). \"\n",
+    "        \"Use this FIRST for any question that might be in the index.\"\n",
+    "    ),\n",
+    "    parameters={\n",
+    "        \"type\": \"object\",\n",
+    "        \"properties\": {\n",
+    "            \"query\": {\"type\": \"string\"},\n",
+    "            \"top_k\": {\"type\": \"integer\", \"default\": 4, \"minimum\": 1, \"maximum\": 10},\n",
+    "        },\n",
+    "        \"required\": [\"query\"],\n",
+    "    },\n",
+    "    function=retrieve_from_index,\n",
+    ")\n",
+    "\n",
+    "web_search_tool = Tool(\n",
+    "    name=\"web_search\",\n",
+    "    description=(\n",
+    "        \"Search the live web with the Perplexity Search API. Use when \"\n",
+    "        \"retrieve_from_index returns nothing useful, or when you need current information.\"\n",
+    "    ),\n",
+    "    parameters={\n",
+    "        \"type\": \"object\",\n",
+    "        \"properties\": {\n",
+    "            \"query\": {\"type\": \"string\"},\n",
+    "            \"top_k\": {\"type\": \"integer\", \"default\": 5, \"minimum\": 1, \"maximum\": 20},\n",
+    "        },\n",
+    "        \"required\": [\"query\"],\n",
+    "    },\n",
+    "    function=web_search,\n",
+    ")\n",
+    "\n",
+    "ingest_tool = Tool(\n",
+    "    name=\"ingest_url\",\n",
+    "    description=(\n",
+    "        \"Fetch a URL, embed it with Perplexity embeddings, and add it to the \"\n",
+    "        \"Qdrant index for reuse by future retrieve_from_index calls. Use after \"\n",
+    "        \"web_search when you find an authoritative source.\"\n",
+    "    ),\n",
+    "    parameters={\n",
+    "        \"type\": \"object\",\n",
+    "        \"properties\": {\n",
+    "            \"url\": {\"type\": \"string\"},\n",
+    "            \"title\": {\"type\": \"string\"},\n",
+    "        },\n",
+    "        \"required\": [\"url\"],\n",
+    "    },\n",
+    "    function=ingest_url,\n",
+    ")\n",
+    "\n",
+    "chat = PerplexityChatGenerator(model=\"openai/gpt-5.4\")\n",
+    "\n",
+    "agent = Agent(\n",
+    "    chat_generator=chat,\n",
+    "    tools=[retrieve_tool, web_search_tool, ingest_tool],\n",
+    "    system_prompt=(\n",
+    "        \"You are a research agent. For every user question:\\n\"\n",
+    "        \"1. Call retrieve_from_index first to see if the local knowledge base already has the answer.\\n\"\n",
+    "        \"2. If the retrieved snippets don't cover the question, call web_search.\\n\"\n",
+    "        \"3. Whenever web_search returns a URL whose contents you actually need to answer the question, \"\n",
+    "        \"call ingest_url on that URL FIRST so future retrieve_from_index calls can find it. \"\n",
+    "        \"Then call retrieve_from_index again to read the chunks you just ingested.\\n\"\n",
+    "        \"4. Write a concise answer. Cite every fact with the source URL it came from using inline markdown links.\"\n",
+    "    ),\n",
+    "    exit_conditions=[\"text\"],\n",
+    "    max_agent_steps=10,\n",
+    ")\n",
+    "agent.warm_up()\n",
+    "print(\"Agent ready with tools:\", [t.name for t in agent.tools])\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "run-demo",
+   "metadata": {
+    "tags": []
+   },
+   "source": [
+    "<a name=\"run-demo\"></a>\n",
+    "## 9. Run the agent on three questions\n",
+    "\n",
+    "* **Q1** is covered by the seed docs \u2014 should be one `retrieve_from_index` call, then an answer.\n",
+    "* **Q2** isn't covered \u2014 the agent should fall through to `web_search`, pick a URL, `ingest_url` it, and retrieve again before answering.\n",
+    "* **Q3** asks something that *should* now be in the index thanks to Q2's ingest, so the agent can stay local.\n",
+    "\n",
+    "For each run we print the message timeline (role + tool calls), then the final answer.\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 11,
+   "id": "run-questions",
+   "metadata": {
+    "execution": {
+     "iopub.execute_input": "2026-05-21T20:54:45.820008Z",
+     "iopub.status.busy": "2026-05-21T20:54:45.819759Z",
+     "iopub.status.idle": "2026-05-21T20:55:18.306713Z",
+     "shell.execute_reply": "2026-05-21T20:55:18.306092Z"
+    },
+    "tags": [
+     "demo",
+     "run"
+    ]
+   },
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "================================================================================\n",
+      "Q: What is the Haystack Agent component, briefly?\n"
+     ]
+    },
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "/tmp/ipykernel_1731/3725313797.py:29: Warning: Mutating attribute 'embedding' on an instance of 'Document' can lead to unexpected behavior by affecting other parts of the pipeline that use the same dataclass instance. Use `dataclasses.replace(instance, embedding=new_value)` instead. See https://docs.haystack.deepset.ai/docs/custom-components#requirements for details.\n",
+      "  c.embedding = v\n"
+     ]
+    },
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "\r\n",
+      "  0%|          | 0/12 [00:00<?, ?it/s]"
+     ]
+    },
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "\r\n",
+      "100it [00:00, 283.69it/s]             "
+     ]
+    },
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "\r\n",
+      "100it [00:00, 283.36it/s]"
+     ]
+    },
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "\n"
+     ]
+    },
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "messages: 11\n",
+      "  [0] role=system tool_calls=[] results=0 text_len=571\n",
+      "  [1] role=user tool_calls=[] results=0 text_len=46\n",
+      "  [2] role=assistant tool_calls=['retrieve_from_index'] results=0 text_len=0\n",
+      "  [3] role=tool tool_calls=[] results=1 text_len=0\n",
+      "  [4] role=assistant tool_calls=['web_search'] results=0 text_len=0\n",
+      "  [5] role=tool tool_calls=[] results=1 text_len=0\n",
+      "  [6] role=assistant tool_calls=['ingest_url'] results=0 text_len=0\n",
+      "  [7] role=tool tool_calls=[] results=1 text_len=0\n",
+      "  [8] role=assistant tool_calls=['retrieve_from_index'] results=0 text_len=0\n",
+      "  [9] role=tool tool_calls=[] results=1 text_len=0\n",
+      "  [10] role=assistant tool_calls=[] results=0 text_len=455\n",
+      "\n",
+      "FINAL ANSWER:\n",
+      "The Haystack **Agent** component is a pipeline component that lets an LLM **reason through a task, use tools when needed, and produce a final answer**. In Haystack, agents are designed for workflows where the model may need multiple steps\u2014such as deciding which tool to call, gathering information, and then responding\u2014rather than just generating a single direct output from a prompt. [Haystack docs](https://docs.haystack.deepset.ai/reference/agents-api)\n"
+     ]
+    },
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "\n",
+      "Docs in index now: 16\n",
+      "\n",
+      "================================================================================\n",
+      "Q: What does the perplexity-haystack integration package provide, according to the official Haystack integrations page?\n"
+     ]
+    },
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "\r\n",
+      "  0%|          | 0/7 [00:00<?, ?it/s]"
+     ]
+    },
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "\r\n",
+      "100it [00:00, 286.96it/s]            "
+     ]
+    },
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "\r\n",
+      "100it [00:00, 286.52it/s]"
+     ]
+    },
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "\n"
+     ]
+    },
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "messages: 11\n",
+      "  [0] role=system tool_calls=[] results=0 text_len=571\n",
+      "  [1] role=user tool_calls=[] results=0 text_len=116\n",
+      "  [2] role=assistant tool_calls=['retrieve_from_index'] results=0 text_len=0\n",
+      "  [3] role=tool tool_calls=[] results=1 text_len=0\n",
+      "  [4] role=assistant tool_calls=['web_search'] results=0 text_len=0\n",
+      "  [5] role=tool tool_calls=[] results=1 text_len=0\n",
+      "  [6] role=assistant tool_calls=['ingest_url'] results=0 text_len=0\n",
+      "  [7] role=tool tool_calls=[] results=1 text_len=0\n",
+      "  [8] role=assistant tool_calls=['retrieve_from_index'] results=0 text_len=0\n",
+      "  [9] role=tool tool_calls=[] results=1 text_len=0\n",
+      "  [10] role=assistant tool_calls=[] results=0 text_len=312\n",
+      "\n",
+      "FINAL ANSWER:\n",
+      "According to the official Haystack integrations page, the **`perplexity-haystack`** package provides **components for using Perplexity models within Haystack pipelines**\u2014specifically support for **chat generators and rankers**. [https://haystack.deepset.ai/integrations](https://haystack.deepset.ai/integrations)\n"
+     ]
+    },
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "\n",
+      "Docs in index now: 23\n",
+      "\n",
+      "================================================================================\n",
+      "Q: Which Perplexity APIs does the perplexity-haystack package wrap, and which Haystack component classes does it ship?\n"
+     ]
+    },
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "\r\n",
+      "  0%|          | 0/1 [00:00<?, ?it/s]"
+     ]
+    },
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "\r\n",
+      "100it [00:00, 433.23it/s]            "
+     ]
+    },
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "\r\n",
+      "100it [00:00, 432.49it/s]"
+     ]
+    },
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "\n"
+     ]
+    },
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "\r\n",
+      "  0%|          | 0/1 [00:00<?, ?it/s]"
+     ]
+    },
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "\r\n",
+      "100it [00:00, 433.09it/s]            "
+     ]
+    },
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "\r\n",
+      "100it [00:00, 432.30it/s]"
+     ]
+    },
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "\n"
+     ]
+    },
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "\r\n",
+      "  0%|          | 0/2 [00:00<?, ?it/s]"
+     ]
+    },
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "\r\n",
+      "100it [00:00, 430.24it/s]            "
+     ]
+    },
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "\r\n",
+      "100it [00:00, 429.36it/s]"
+     ]
+    },
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "\n"
+     ]
+    },
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "messages: 15\n",
+      "  [0] role=system tool_calls=[] results=0 text_len=571\n",
+      "  [1] role=user tool_calls=[] results=0 text_len=115\n",
+      "  [2] role=assistant tool_calls=['retrieve_from_index'] results=0 text_len=0\n",
+      "  [3] role=tool tool_calls=[] results=1 text_len=0\n",
+      "  [4] role=assistant tool_calls=['web_search'] results=0 text_len=0\n",
+      "  [5] role=tool tool_calls=[] results=1 text_len=0\n",
+      "  [6] role=assistant tool_calls=['ingest_url'] results=0 text_len=0\n",
+      "  [7] role=tool tool_calls=[] results=1 text_len=0\n",
+      "  [8] role=assistant tool_calls=['ingest_url'] results=0 text_len=0\n",
+      "  [9] role=tool tool_calls=[] results=1 text_len=0\n",
+      "  [10] role=assistant tool_calls=['ingest_url'] results=0 text_len=0\n",
+      "  [11] role=tool tool_calls=[] results=1 text_len=0\n",
+      "  [12] role=assistant tool_calls=['retrieve_from_index'] results=0 text_len=0\n",
+      "  [13] role=tool tool_calls=[] results=1 text_len=0\n",
+      "  [14] role=assistant tool_calls=[] results=0 text_len=445\n",
+      "\n",
+      "FINAL ANSWER:\n",
+      "`perplexity-haystack` wraps Perplexity\u2019s **chat-completions API** and **search API**. It ships Haystack components for both: **`PerplexityChatGenerator`** for chat/completions and **`PerplexityWebSearch`** for web search.[[Haystack docs](https://docs.haystack.deepset.ai/docs/perplexity)][[PyPI](https://pypi.org/project/perplexity-haystack/)][[GitHub](https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/perplexity)]\n"
+     ]
+    },
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "\n",
+      "Docs in index now: 27\n",
+      "\n"
+     ]
+    }
+   ],
+   "source": [
+    "QUESTIONS = [\n",
+    "    # Q1: meant to be answerable from the seed docs alone.\n",
+    "    \"What is the Haystack Agent component, briefly?\",\n",
+    "    # Q2: not in the index, so the agent has to call web_search and ingest_url.\n",
+    "    \"What does the perplexity-haystack integration package provide, according to the official Haystack integrations page?\",\n",
+    "    # Q3: should now be able to hit the pages ingested while answering Q2.\n",
+    "    \"Which Perplexity APIs does the perplexity-haystack package wrap, and which Haystack component classes does it ship?\",\n",
+    "]\n",
+    "\n",
+    "\n",
+    "def run_question(q: str) -> None:\n",
+    "    print(\"=\" * 80)\n",
+    "    print(\"Q:\", q)\n",
+    "    result = agent.run(messages=[ChatMessage.from_user(q)])\n",
+    "    msgs = result[\"messages\"]\n",
+    "    print(f\"messages: {len(msgs)}\")\n",
+    "    for i, m in enumerate(msgs):\n",
+    "        tool_calls = getattr(m, \"tool_calls\", []) or []\n",
+    "        tool_results = getattr(m, \"tool_call_results\", []) or []\n",
+    "        print(\n",
+    "            f\"  [{i}] role={m.role.value} \"\n",
+    "            f\"tool_calls={[t.tool_name for t in tool_calls]} \"\n",
+    "            f\"results={len(tool_results)} text_len={len(m.text or '')}\"\n",
+    "        )\n",
+    "    print(\"\\nFINAL ANSWER:\\n\" + (msgs[-1].text or \"\"))\n",
+    "    print(\"\\nDocs in index now:\", document_store.count_documents())\n",
+    "\n",
+    "\n",
+    "for q in QUESTIONS:\n",
+    "    run_question(q)\n",
+    "    print()\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "wrap-up",
+   "metadata": {
+    "tags": []
+   },
+   "source": [
+    "<a name=\"wrap-up\"></a>\n",
+    "## 10. Where to go next\n",
+    "\n",
+    "* Change `recreate_index=True` to `False` so the index accumulates across sessions: every page the agent ingests stays retrievable in later runs.\n",
+    "* Use `PerplexityContextualizedEmbedder` (`/v1/contextualizedembeddings`) instead of the static document embedder if your corpus is heavily sectioned or includes source code.\n",
+    "* Add a `cite_index` tool that returns the embedded chunk IDs alongside the answer, so downstream UIs can hyperlink back to the source.\n",
+    "* Try a different reasoning model \u2014 `chat = PerplexityChatGenerator(model=\"anthropic/claude-sonnet-4-5\")` works via the Agent API too.\n",
+    "\n",
+    "### References\n",
+    "* Perplexity Agent API \u2014 <https://docs.perplexity.ai/api-reference/agent-post>\n",
+    "* Perplexity Search API \u2014 <https://docs.perplexity.ai/api-reference/search-post>\n",
+    "* Perplexity Embeddings API \u2014 <https://docs.perplexity.ai/api-reference/embeddings-post>\n",
+    "* Haystack Perplexity integration \u2014 <https://haystack.deepset.ai/integrations/perplexity>\n",
+    "* Qdrant document store \u2014 <https://haystack.deepset.ai/integrations/qdrant-document-store>\n"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.12.8"
+  },
+  "x_cookbook": {
+   "authors": [
+    {
+     "name": "Perplexity API team",
+     "url": "https://perplexity.ai"
+    }
+   ],
+   "category": "agents",
+   "components_used": [
+    "haystack.components.agents.Agent",
+    "haystack.tools.Tool",
+    "haystack.components.preprocessors.DocumentSplitter",
+    "haystack_integrations.components.generators.perplexity.PerplexityChatGenerator",
+    "haystack_integrations.components.websearch.perplexity.PerplexityWebSearch",
+    "haystack_integrations.document_stores.qdrant.QdrantDocumentStore",
+    "haystack_integrations.components.retrievers.qdrant.QdrantEmbeddingRetriever"
+   ],
+   "integrations": [
+    {
+     "name": "perplexity-haystack",
+     "url": "https://haystack.deepset.ai/integrations/perplexity"
+    },
+    {
+     "name": "qdrant-haystack",
+     "url": "https://haystack.deepset.ai/integrations/qdrant-document-store"
+    }
+   ],
+   "perplexity_apis_used": [
+    "/v1/agent",
+    "/search",
+    "/v1/embeddings"
+   ],
+   "slug": "perplexity_live_research_agent",
+   "tags": [
+    "perplexity",
+    "qdrant",
+    "agent",
+    "rag",
+    "tool-calling",
+    "embeddings"
+   ],
+   "title": "Live-Learning Research Agent with Perplexity (Search + Embeddings + Agent) and Qdrant"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
\ No newline at end of file