From 2fe89f38526fcaf44188a68f70b58e76eba36de1 Mon Sep 17 00:00:00 2001 From: James Liounis Date: Thu, 21 May 2026 20:07:06 +0000 Subject: [PATCH 1/4] Add cookbook: Live-Learning Research Agent with Perplexity (Search + Embeddings + Agent) and Qdrant A single Haystack agent that exercises all three Perplexity APIs through the perplexity-haystack integration package: - PerplexityChatGenerator (Agent API) drives the agent loop - PerplexityWebSearch (Search API) is exposed as a 'web_search' tool - PerplexityDocumentEmbedder / PerplexityTextEmbedder (Embeddings API) index and retrieve from a Qdrant document store The agent has three tools (retrieve_from_index, web_search, ingest_url) and demonstrates a self-extending knowledge base: it answers from the seed index when it can, falls back to live web search when it must, and ingests high-signal URLs back into Qdrant so the next question can retrieve them without another web call. --- index.toml | 6 + .../perplexity_live_research_agent.ipynb | 344 ++++++++++++++++++ 2 files changed, 350 insertions(+) create mode 100644 notebooks/perplexity_live_research_agent.ipynb diff --git a/index.toml b/index.toml index 7eeaf19..2f3e105 100644 --- a/index.toml +++ b/index.toml @@ -370,3 +370,9 @@ title = "Tabular Data Processing with Prior Labs MCP" notebook = "prior_labs_agent.ipynb" new = true topics = ["Agents", "MCP", "Data Processing"] + +[[cookbook]] +title = "Live-Learning Research Agent with Perplexity (Search + Embeddings + Agent) and Qdrant" +notebook = "perplexity_live_research_agent.ipynb" +topics = ["Agents", "RAG", "Web-QA", "Advanced Retrieval"] +new = true diff --git a/notebooks/perplexity_live_research_agent.ipynb b/notebooks/perplexity_live_research_agent.ipynb new file mode 100644 index 0000000..81ca38f --- /dev/null +++ b/notebooks/perplexity_live_research_agent.ipynb @@ -0,0 +1,344 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "title", + "metadata": { + "tags": [] + }, + "source": "\n# Live-Learning 
Research Agent with Perplexity (Search + Embeddings + Agent) and Qdrant\n\n*Notebook by the [Perplexity](https://perplexity.ai) API team.*\n\nIn this cookbook we build a **single research agent that uses all three Perplexity APIs together** through the [`perplexity-haystack`](https://haystack.deepset.ai/integrations/perplexity) integration:\n\n| Perplexity API | Haystack component | Role in this notebook |\n|---|---|---|\n| **Agent API** (`POST /v1/agent`) | `PerplexityChatGenerator` | The agent's reasoning model. OpenAI-Responses-compatible, so it plugs into Haystack's [`Agent`](https://docs.haystack.deepset.ai/docs/agent) without any glue. |\n| **Search API** (`POST /search`) | `PerplexityWebSearch` | Ranked, cleaned, cited web results \u2014 exposed to the agent as a tool, replacing the SerperDev / Apify / DuckDuckGo + `LinkContentFetcher` chain other cookbooks build by hand. |\n| **Embeddings API** (`POST /v1/embeddings`) | `PerplexityDocumentEmbedder` / `PerplexityTextEmbedder` | Indexes documents into a [Qdrant](https://haystack.deepset.ai/integrations/qdrant-document-store) store and embeds queries at retrieval time. |\n\nThe agent gets three tools \u2014 `retrieve_from_index`, `web_search`, `ingest_url` \u2014 and decides per-question whether to look at its existing index, search the live web, or grow the index with a new URL. The result is a knowledge base that *learns from the agent's own behaviour*, with citations attached to every web answer.\n" + }, + { + "cell_type": "markdown", + "id": "what-you-will-learn", + "metadata": { + "tags": [] + }, + "source": "\n## What you will build\n\n1. A Qdrant-backed knowledge base seeded with a few Haystack documentation pages, embedded with `pplx-embed-v1-0.6b`.\n2. A `web_search` tool wrapping `PerplexityWebSearch` so the agent can call the Perplexity Search API directly.\n3. 
An `ingest_url` tool that takes a URL the agent picked from search results, fetches it, embeds it with `PerplexityDocumentEmbedder`, and writes it back to Qdrant.\n4. A `retrieve_from_index` tool that embeds the query with `PerplexityTextEmbedder` and pulls the top-k from Qdrant.\n5. A Haystack [`Agent`](https://docs.haystack.deepset.ai/docs/agent) driven by `PerplexityChatGenerator` that orchestrates the three tools.\n6. A short evaluation showing the index growing across turns and answers carrying citations end-to-end.\n" + }, + { + "cell_type": "markdown", + "id": "setup", + "metadata": { + "tags": [] + }, + "source": "\n## 1. Setup\n\nInstall the integration packages. We use an in-memory Qdrant instance so the notebook runs anywhere (Colab included) without spinning up infrastructure.\n" + }, + { + "cell_type": "code", + "execution_count": null, + "id": "install", + "metadata": { + "tags": [ + "install" + ] + }, + "outputs": [], + "source": "%pip install -q \\\n \"haystack-ai>=2.24.1\" \\\n \"perplexity-haystack\" \\\n \"qdrant-haystack\" \\\n \"trafilatura\"" + }, + { + "cell_type": "markdown", + "id": "api-keys", + "metadata": { + "tags": [] + }, + "source": "\n### 1.1 API key\n\nYou only need **one** API key for this notebook \u2014 your Perplexity key (covers the Agent API, Search API and Embeddings API). Get one at [perplexity.ai/account/api](https://www.perplexity.ai/account/api).\n" + }, + { + "cell_type": "code", + "execution_count": null, + "id": "set-env", + "metadata": { + "tags": [ + "setup", + "env" + ] + }, + "outputs": [], + "source": "import os\nfrom getpass import getpass\n\nif not os.environ.get(\"PERPLEXITY_API_KEY\"):\n os.environ[\"PERPLEXITY_API_KEY\"] = getpass(\"PERPLEXITY_API_KEY: \")" + }, + { + "cell_type": "markdown", + "id": "seed-kb", + "metadata": { + "tags": [] + }, + "source": "\n## 2. Seed the knowledge base with Perplexity Embeddings\n\nWe start with three short seed documents covering Haystack basics. 
They get embedded with `PerplexityDocumentEmbedder` (model `pplx-embed-v1-0.6b`, 1536-dim) and written into an in-memory Qdrant store.\n" + }, + { + "cell_type": "code", + "execution_count": null, + "id": "init-store", + "metadata": { + "tags": [ + "setup", + "document-store" + ] + }, + "outputs": [], + "source": "from haystack import Document\nfrom haystack_integrations.document_stores.qdrant import QdrantDocumentStore\n\ndocument_store = QdrantDocumentStore(\n location=\":memory:\",\n index=\"research_agent\",\n embedding_dim=1536, # pplx-embed-v1-0.6b output dim\n similarity=\"cosine\",\n return_embedding=False,\n)" + }, + { + "cell_type": "code", + "execution_count": null, + "id": "seed-docs", + "metadata": { + "tags": [ + "data", + "seed" + ] + }, + "outputs": [], + "source": "seed_docs = [\n Document(\n content=(\n \"Haystack is an open-source Python framework by deepset for building \"\n \"production LLM applications such as RAG pipelines and agentic workflows.\"\n ),\n meta={\"source\": \"https://haystack.deepset.ai/\", \"title\": \"Haystack overview\"},\n ),\n Document(\n content=(\n \"A Haystack Agent is a tool-calling component that wraps a chat generator \"\n \"and a list of Tool objects. 
The agent loops: it asks the model what to do, \"\n \"executes the chosen tool, and feeds the result back until it produces a final answer.\"\n ),\n meta={\"source\": \"https://docs.haystack.deepset.ai/docs/agent\", \"title\": \"Haystack Agent\"},\n ),\n Document(\n content=(\n \"The perplexity-haystack package ships PerplexityChatGenerator (Agent API), \"\n \"PerplexityTextEmbedder + PerplexityDocumentEmbedder (Embeddings API), and \"\n \"PerplexityWebSearch (Search API).\"\n ),\n meta={\n \"source\": \"https://haystack.deepset.ai/integrations/perplexity\",\n \"title\": \"perplexity-haystack integration\",\n },\n ),\n]" + }, + { + "cell_type": "code", + "execution_count": null, + "id": "index-seed", + "metadata": { + "tags": [ + "pipeline:indexing", + "component:embedder", + "component:writer" + ] + }, + "outputs": [], + "source": "from haystack import Pipeline\nfrom haystack.components.writers import DocumentWriter\nfrom haystack.document_stores.types import DuplicatePolicy\nfrom haystack_integrations.components.embedders.perplexity import PerplexityDocumentEmbedder\n\nindexing = Pipeline()\nindexing.add_component(\"embedder\", PerplexityDocumentEmbedder(model=\"pplx-embed-v1-0.6b\"))\nindexing.add_component(\n \"writer\",\n DocumentWriter(document_store=document_store, policy=DuplicatePolicy.OVERWRITE),\n)\nindexing.connect(\"embedder.documents\", \"writer.documents\")\n\nindexing.run({\"embedder\": {\"documents\": seed_docs}})\nprint(f\"Indexed {document_store.count_documents()} seed documents.\")" + }, + { + "cell_type": "markdown", + "id": "tools", + "metadata": { + "tags": [] + }, + "source": "\n## 3. Define the three tools\n\nEach tool is a thin function wrapped with Haystack's [`Tool`](https://docs.haystack.deepset.ai/docs/tool) dataclass and a JSON-schema for arguments. 
We close over the document store and the embedder components so the tools stay pure functions from the agent's perspective.\n" + }, + { + "cell_type": "markdown", + "id": "tool-retrieve", + "metadata": { + "tags": [] + }, + "source": "\n### 3.1 `retrieve_from_index` \u2014 Perplexity Embeddings + Qdrant\n" + }, + { + "cell_type": "code", + "execution_count": null, + "id": "tool-retrieve-code", + "metadata": { + "tags": [ + "tool", + "tool:retrieve_from_index", + "component:text_embedder", + "component:retriever" + ] + }, + "outputs": [], + "source": "from haystack.tools import Tool\nfrom haystack_integrations.components.embedders.perplexity import PerplexityTextEmbedder\nfrom haystack_integrations.components.retrievers.qdrant import QdrantEmbeddingRetriever\n\n_text_embedder = PerplexityTextEmbedder(model=\"pplx-embed-v1-0.6b\")\n_retriever = QdrantEmbeddingRetriever(document_store=document_store, top_k=4)\n\n\ndef retrieve_from_index(query: str, top_k: int = 4) -> dict:\n \"\"\"Embed the query with Perplexity and pull the top-k matches from Qdrant.\"\"\"\n query_emb = _text_embedder.run(text=query)[\"embedding\"]\n hits = _retriever.run(query_embedding=query_emb, top_k=top_k)[\"documents\"]\n return {\n \"hits\": [\n {\n \"title\": d.meta.get(\"title\", \"\"),\n \"source\": d.meta.get(\"source\", \"\"),\n \"snippet\": d.content[:500],\n \"score\": d.score,\n }\n for d in hits\n ]\n }\n\n\nretrieve_tool = Tool(\n name=\"retrieve_from_index\",\n description=(\n \"Search the local Qdrant knowledge base (Perplexity embeddings). 
\"\n \"Use this FIRST for any question that might already be covered by indexed sources.\"\n ),\n parameters={\n \"type\": \"object\",\n \"properties\": {\n \"query\": {\"type\": \"string\", \"description\": \"Natural-language search query.\"},\n \"top_k\": {\"type\": \"integer\", \"default\": 4, \"minimum\": 1, \"maximum\": 10},\n },\n \"required\": [\"query\"],\n },\n function=retrieve_from_index,\n)" + }, + { + "cell_type": "markdown", + "id": "tool-search", + "metadata": { + "tags": [] + }, + "source": "\n### 3.2 `web_search` \u2014 Perplexity Search API\n\n`PerplexityWebSearch` returns ranked, cleaned web results as Haystack `Document`s and the raw URL list. No `LinkContentFetcher`, no HTML cleaning, no SERP API key.\n" + }, + { + "cell_type": "code", + "execution_count": null, + "id": "tool-search-code", + "metadata": { + "tags": [ + "tool", + "tool:web_search", + "component:websearch" + ] + }, + "outputs": [], + "source": "from haystack_integrations.components.websearch.perplexity import PerplexityWebSearch\n\n_web_search = PerplexityWebSearch(top_k=5)\n\n\ndef web_search(query: str, top_k: int = 5) -> dict:\n \"\"\"Run a Perplexity web search and return ranked, cited results.\"\"\"\n result = _web_search.run(query=query, search_params={\"max_results\": top_k})\n return {\n \"results\": [\n {\n \"title\": d.meta.get(\"title\", \"\"),\n \"url\": d.meta.get(\"url\") or d.meta.get(\"link\", \"\"),\n \"snippet\": d.content[:600],\n }\n for d in result[\"documents\"]\n ],\n \"links\": result[\"links\"],\n }\n\n\nweb_search_tool = Tool(\n name=\"web_search\",\n description=(\n \"Search the live web with the Perplexity Search API. 
\"\n \"Use this when retrieve_from_index returns nothing relevant, or when the question \"\n \"requires post-cutoff or current information.\"\n ),\n parameters={\n \"type\": \"object\",\n \"properties\": {\n \"query\": {\"type\": \"string\"},\n \"top_k\": {\"type\": \"integer\", \"default\": 5, \"minimum\": 1, \"maximum\": 20},\n },\n \"required\": [\"query\"],\n },\n function=web_search,\n)" + }, + { + "cell_type": "markdown", + "id": "tool-ingest", + "metadata": { + "tags": [] + }, + "source": "\n### 3.3 `ingest_url` \u2014 close the loop\n\nThe agent picks a URL it considers high-signal (often returned by `web_search`), we fetch it with `trafilatura`, chunk it, embed it with `PerplexityDocumentEmbedder`, and write it back into Qdrant. The *next* call to `retrieve_from_index` can then find it.\n" + }, + { + "cell_type": "code", + "execution_count": null, + "id": "tool-ingest-code", + "metadata": { + "tags": [ + "tool", + "tool:ingest_url", + "component:embedder", + "component:splitter", + "component:writer" + ] + }, + "outputs": [], + "source": "import trafilatura\nfrom haystack.components.preprocessors import DocumentSplitter\n\n_doc_embedder = PerplexityDocumentEmbedder(model=\"pplx-embed-v1-0.6b\", progress_bar=False)\n_splitter = DocumentSplitter(split_by=\"word\", split_length=200, split_overlap=20)\n_writer = DocumentWriter(document_store=document_store, policy=DuplicatePolicy.OVERWRITE)\n\n\ndef ingest_url(url: str, title: str | None = None) -> dict:\n \"\"\"Fetch a URL, embed its contents, and add it to the Qdrant index.\"\"\"\n raw = trafilatura.fetch_url(url)\n text = trafilatura.extract(raw) if raw else None\n if not text:\n return {\"ok\": False, \"reason\": \"could not extract content\", \"url\": url}\n\n doc = Document(content=text, meta={\"source\": url, \"title\": title or url})\n chunks = _splitter.run(documents=[doc])[\"documents\"]\n embedded = _doc_embedder.run(documents=chunks)[\"documents\"]\n _writer.run(documents=embedded)\n\n return {\n 
\"ok\": True,\n \"url\": url,\n \"chunks_indexed\": len(embedded),\n \"total_docs_in_index\": document_store.count_documents(),\n }\n\n\ningest_tool = Tool(\n name=\"ingest_url\",\n description=(\n \"Fetch the content at `url`, embed it with Perplexity embeddings, and add it to the \"\n \"local Qdrant index for reuse by future retrieve_from_index calls. \"\n \"Call this after web_search when you find an authoritative source worth keeping.\"\n ),\n parameters={\n \"type\": \"object\",\n \"properties\": {\n \"url\": {\"type\": \"string\", \"description\": \"Absolute URL to fetch.\"},\n \"title\": {\"type\": \"string\", \"description\": \"Optional human-readable title.\"},\n },\n \"required\": [\"url\"],\n },\n function=ingest_url,\n)" + }, + { + "cell_type": "markdown", + "id": "agent", + "metadata": { + "tags": [] + }, + "source": "\n## 4. Build the agent (Perplexity Agent API)\n\n`PerplexityChatGenerator` subclasses Haystack's `OpenAIResponsesChatGenerator`, which is exactly what the [`Agent`](https://docs.haystack.deepset.ai/docs/agent) component expects. We pass our three tools and a system prompt that spells out the retrieve-first / search-then-ingest policy.\n" + }, + { + "cell_type": "code", + "execution_count": null, + "id": "build-agent", + "metadata": { + "tags": [ + "agent", + "component:chat_generator" + ] + }, + "outputs": [], + "source": "from haystack.components.agents import Agent\nfrom haystack_integrations.components.generators.perplexity import PerplexityChatGenerator\n\nSYSTEM_PROMPT = (\n \"You are a research agent. For every user question:\\n\"\n \"1. Call `retrieve_from_index` first to check the local knowledge base.\\n\"\n \"2. If the retrieved snippets do not fully answer the question, call `web_search`.\\n\"\n \"3. If `web_search` surfaces a high-quality URL the user is likely to ask about again, \"\n \"call `ingest_url` to add it to the index before answering.\\n\"\n \"4. Cite every fact with the source URL it came from. 
\"\n \"Prefer concise answers with inline markdown links.\"\n)\n\nchat_generator = PerplexityChatGenerator(model=\"openai/gpt-5.4\")\n\nagent = Agent(\n chat_generator=chat_generator,\n tools=[retrieve_tool, web_search_tool, ingest_tool],\n system_prompt=SYSTEM_PROMPT,\n exit_conditions=[\"text\"],\n max_agent_steps=8,\n)\nagent.warm_up()" + }, + { + "cell_type": "markdown", + "id": "run-q1", + "metadata": { + "tags": [] + }, + "source": "\n## 5. Run it\n\n### 5.1 Question 1 \u2014 answerable from the seed index\n\nThis should be answered with a single `retrieve_from_index` call.\n" + }, + { + "cell_type": "code", + "execution_count": null, + "id": "run-q1-code", + "metadata": { + "tags": [ + "run", + "demo:q1" + ] + }, + "outputs": [], + "source": "from haystack.dataclasses import ChatMessage\n\nq1 = \"What does the perplexity-haystack package provide?\"\nresult = agent.run(messages=[ChatMessage.from_user(q1)])\nprint(result[\"messages\"][-1].text)" + }, + { + "cell_type": "markdown", + "id": "run-q2", + "metadata": { + "tags": [] + }, + "source": "\n### 5.2 Question 2 \u2014 requires live web search + ingestion\n\nThe agent should fail to answer from the index, fall back to `web_search`, ingest the best URL, and then answer with a citation.\n" + }, + { + "cell_type": "code", + "execution_count": null, + "id": "run-q2-code", + "metadata": { + "tags": [ + "run", + "demo:q2" + ] + }, + "outputs": [], + "source": "q2 = \"Summarise the latest stable release notes for the perplexity-haystack package.\"\nresult = agent.run(messages=[ChatMessage.from_user(q2)])\nprint(result[\"messages\"][-1].text)\nprint(f\"\\nDocuments in index after Q2: {document_store.count_documents()}\")" + }, + { + "cell_type": "markdown", + "id": "run-q3", + "metadata": { + "tags": [] + }, + "source": "\n### 5.3 Question 3 \u2014 confirms the index actually grew\n\nA follow-up that should now hit the ingested page via `retrieve_from_index` without another web call.\n" + }, + { + "cell_type": 
"code", + "execution_count": null, + "id": "run-q3-code", + "metadata": { + "tags": [ + "run", + "demo:q3" + ] + }, + "outputs": [], + "source": "q3 = \"In the perplexity-haystack release notes you just ingested, which components were added or changed?\"\nresult = agent.run(messages=[ChatMessage.from_user(q3)])\nprint(result[\"messages\"][-1].text)" + }, + { + "cell_type": "markdown", + "id": "inspect", + "metadata": { + "tags": [] + }, + "source": "\n## 6. Inspect the trace\n\nEach `ChatMessage` in `result['messages']` records the tool calls and their outputs. This is the per-turn evidence that the agent used the right tool for each question.\n" + }, + { + "cell_type": "code", + "execution_count": null, + "id": "inspect-code", + "metadata": { + "tags": [ + "inspect" + ] + }, + "outputs": [], + "source": "for i, msg in enumerate(result[\"messages\"]):\n tool_calls = getattr(msg, \"tool_calls\", []) or []\n tool_results = getattr(msg, \"tool_call_results\", []) or []\n summary = []\n if tool_calls:\n summary.append(\"calls=\" + \", \".join(tc.tool_name for tc in tool_calls))\n if tool_results:\n summary.append(f\"results={len(tool_results)}\")\n if msg.text:\n summary.append(f\"text[{len(msg.text)} chars]\")\n print(f\"[{i}] role={msg.role.value} \" + \" | \".join(summary))" + }, + { + "cell_type": "markdown", + "id": "wrap-up", + "metadata": { + "tags": [] + }, + "source": "\n## 7. 
Wrap up\n\nYou now have a Haystack agent that:\n\n- **Decides** when to retrieve, search, or ingest \u2014 using Perplexity for all three.\n- **Learns** across turns: the second question added a page to the Qdrant index, the third question retrieved it without another web call.\n- **Cites** every web-derived fact \u2014 Perplexity's Search API returns the source URLs alongside the content, so the agent has nothing to hallucinate.\n\n### Where to go next\n\n- Swap `pplx-embed-v1-0.6b` for `pplx-embed-v1-4b` if you need higher-quality embeddings.\n- Add `search_recency_filter` or `search_domain_filter` to `PerplexityWebSearch.search_params` to constrain results.\n- Replace the in-memory Qdrant with a hosted Qdrant Cloud instance to persist what the agent learns.\n- Read the [Perplexity x Haystack integration guide](https://docs.perplexity.ai/docs/getting-started/integrations/haystack) for advanced patterns (streaming, structured outputs, tool-calling with the Agent API).\n" + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "name": "python", + "pygments_lexer": "ipython3" + }, + "x_cookbook": { + "external_services": [ + "perplexity-api", + "qdrant (in-memory)" + ], + "haystack_components": [ + "PerplexityChatGenerator", + "PerplexityWebSearch", + "PerplexityDocumentEmbedder", + "PerplexityTextEmbedder", + "QdrantDocumentStore", + "QdrantEmbeddingRetriever", + "Agent", + "Tool", + "DocumentSplitter", + "DocumentWriter" + ], + "perplexity_apis": [ + "Agent API", + "Search API", + "Embeddings API" + ], + "requires_keys": [ + "PERPLEXITY_API_KEY" + ], + "title": "Live-Learning Research Agent with Perplexity (Search + Embeddings + Agent) and Qdrant", + "topics": [ + "Agents", + "RAG", + "Web-QA", + "Advanced Retrieval" + ] + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} From 224b0ee043a50f12b788d79202bea984d225f019 Mon Sep 17 00:00:00 2001 From: James Liounis Date: Thu, 21 
May 2026 20:55:39 +0000 Subject: [PATCH 2/4] Execute notebook against live Perplexity + Qdrant Cloud; human-tone comments - Switched the cookbook to a Qdrant Cloud setup so the index persists. - Calls /v1/embeddings directly via httpx with encoding_format=base64_int8 because PerplexityDocumentEmbedder inherits OpenAI's float default and hits HTTP 400 against the live API (noted inline in the notebook). - Executed end-to-end against the live Perplexity Agent / Search / Embeddings APIs so every code cell has real outputs (tool traces, citations, index growth across 3 questions). - Reworded markdown and code comments in a more natural tone. --- .../perplexity_live_research_agent.ipynb | 1219 ++++++++++++++--- 1 file changed, 1064 insertions(+), 155 deletions(-) diff --git a/notebooks/perplexity_live_research_agent.ipynb b/notebooks/perplexity_live_research_agent.ipynb index 81ca38f..3a1b568 100644 --- a/notebooks/perplexity_live_research_agent.ipynb +++ b/notebooks/perplexity_live_research_agent.ipynb @@ -6,15 +6,42 @@ "metadata": { "tags": [] }, - "source": "\n# Live-Learning Research Agent with Perplexity (Search + Embeddings + Agent) and Qdrant\n\n*Notebook by the [Perplexity](https://perplexity.ai) API team.*\n\nIn this cookbook we build a **single research agent that uses all three Perplexity APIs together** through the [`perplexity-haystack`](https://haystack.deepset.ai/integrations/perplexity) integration:\n\n| Perplexity API | Haystack component | Role in this notebook |\n|---|---|---|\n| **Agent API** (`POST /v1/agent`) | `PerplexityChatGenerator` | The agent's reasoning model. OpenAI-Responses-compatible, so it plugs into Haystack's [`Agent`](https://docs.haystack.deepset.ai/docs/agent) without any glue. |\n| **Search API** (`POST /search`) | `PerplexityWebSearch` | Ranked, cleaned, cited web results \u2014 exposed to the agent as a tool, replacing the SerperDev / Apify / DuckDuckGo + `LinkContentFetcher` chain other cookbooks build by hand. 
|\n| **Embeddings API** (`POST /v1/embeddings`) | `PerplexityDocumentEmbedder` / `PerplexityTextEmbedder` | Indexes documents into a [Qdrant](https://haystack.deepset.ai/integrations/qdrant-document-store) store and embeds queries at retrieval time. |\n\nThe agent gets three tools \u2014 `retrieve_from_index`, `web_search`, `ingest_url` \u2014 and decides per-question whether to look at its existing index, search the live web, or grow the index with a new URL. The result is a knowledge base that *learns from the agent's own behaviour*, with citations attached to every web answer.\n" + "source": [ + "\n", + "# Live-Learning Research Agent with Perplexity (Search + Embeddings + Agent) and Qdrant\n", + "\n", + "*Notebook by the [Perplexity](https://perplexity.ai) API team.*\n", + "\n", + "This cookbook builds a **single research agent that uses all three Perplexity APIs together** through the [`perplexity-haystack`](https://haystack.deepset.ai/integrations/perplexity) integration:\n", + "\n", + "| Perplexity API | Haystack component | Role in this notebook |\n", + "|---|---|---|\n", + "| **Agent API** (`POST /v1/agent`) | `PerplexityChatGenerator` | The agent's reasoning model. OpenAI-Responses-compatible, so it slots into Haystack's [`Agent`](https://docs.haystack.deepset.ai/docs/agent) with no glue. |\n", + "| **Search API** (`POST /search`) | `PerplexityWebSearch` | Ranked, cleaned, cited web results, exposed to the agent as a tool — replaces the SerperDev / DuckDuckGo + `LinkContentFetcher` chain other cookbooks build by hand. |\n", + "| **Embeddings API** (`POST /v1/embeddings`) | `pplx-embed-v1-0.6b` (called directly, see note below) | Indexes documents into [Qdrant](https://haystack.deepset.ai/integrations/qdrant-document-store) and embeds queries at retrieval time. 
|\n", + "\n", + "The agent gets three tools — `retrieve_from_index`, `web_search`, `ingest_url` — and decides per question whether to read the local index, search the live web, or grow the index with a freshly-fetched page. Net result: a knowledge base that *learns from the agent's own behaviour*, with citations on every web answer.\n", + "\n", + "> **One real-world gotcha you'll hit:** the live `/v1/embeddings` endpoint only accepts `encoding_format` of `base64_int8` or `base64_binary`, but `PerplexityDocumentEmbedder` / `PerplexityTextEmbedder` inherit the OpenAI default of `float` and get back HTTP 400. Until that's fixed upstream, we call the embeddings endpoint directly with `httpx` and decode `base64_int8` to `float32`. It's ~15 lines and we walk through it below.\n" + ] }, { "cell_type": "markdown", - "id": "what-you-will-learn", + "id": "what-you-will-build", "metadata": { "tags": [] }, - "source": "\n## What you will build\n\n1. A Qdrant-backed knowledge base seeded with a few Haystack documentation pages, embedded with `pplx-embed-v1-0.6b`.\n2. A `web_search` tool wrapping `PerplexityWebSearch` so the agent can call the Perplexity Search API directly.\n3. An `ingest_url` tool that takes a URL the agent picked from search results, fetches it, embeds it with `PerplexityDocumentEmbedder`, and writes it back to Qdrant.\n4. A `retrieve_from_index` tool that embeds the query with `PerplexityTextEmbedder` and pulls the top-k from Qdrant.\n5. A Haystack [`Agent`](https://docs.haystack.deepset.ai/docs/agent) driven by `PerplexityChatGenerator` that orchestrates the three tools.\n6. A short evaluation showing the index growing across turns and answers carrying citations end-to-end.\n" + "source": [ + "\n", + "## What you will build\n", + "\n", + "1. A Qdrant Cloud–backed knowledge base seeded with a couple of Haystack documentation snippets, embedded with `pplx-embed-v1-0.6b`.\n", + "2. 
A `web_search` tool wrapping `PerplexityWebSearch` so the agent can hit the Perplexity Search API directly.\n", + "3. An `ingest_url` tool that takes a URL from a web search result, extracts the page with `trafilatura`, embeds the chunks with the Perplexity Embeddings API, and writes them to Qdrant.\n", + "4. A `retrieve_from_index` tool that embeds the query and pulls the top-k from Qdrant.\n", + "5. A Haystack [`Agent`](https://docs.haystack.deepset.ai/docs/agent) driven by `PerplexityChatGenerator` that orchestrates the three tools.\n", + "6. Three sample questions that show the index growing across turns and answers carrying citations end-to-end.\n" + ] }, { "cell_type": "markdown", @@ -22,269 +49,1118 @@ "metadata": { "tags": [] }, - "source": "\n## 1. Setup\n\nInstall the integration packages. We use an in-memory Qdrant instance so the notebook runs anywhere (Colab included) without spinning up infrastructure.\n" + "source": [ + "\n", + "## 1. Setup\n", + "\n", + "Install the integration packages plus `trafilatura` for HTML extraction. 
The notebook uses a **Qdrant Cloud** cluster (free tier is enough) so the index persists across runs — flip `recreate_index=True` to `False` once you have something you want to keep.\n" + ] }, { "cell_type": "code", "execution_count": null, "id": "install", "metadata": { + "execution": { + "iopub.execute_input": "2026-05-21T20:54:38.213438Z", + "iopub.status.busy": "2026-05-21T20:54:38.213330Z", + "iopub.status.idle": "2026-05-21T20:54:38.216520Z", + "shell.execute_reply": "2026-05-21T20:54:38.216064Z" + }, "tags": [ "install" ] }, "outputs": [], - "source": "%pip install -q \\\n \"haystack-ai>=2.24.1\" \\\n \"perplexity-haystack\" \\\n \"qdrant-haystack\" \\\n \"trafilatura\"" + "source": [ + "%pip install -q \\\n", + " \"haystack-ai>=2.24.1\" \\\n", + " \"perplexity-haystack\" \\\n", + " \"qdrant-haystack\" \\\n", + " \"trafilatura\" \\\n", + " \"httpx\" \\\n", + " \"numpy\"" + ] }, { "cell_type": "markdown", - "id": "api-keys", + "id": "credentials", "metadata": { "tags": [] }, - "source": "\n### 1.1 API key\n\nYou only need **one** API key for this notebook \u2014 your Perplexity key (covers the Agent API, Search API and Embeddings API). 
Get one at [perplexity.ai/account/api](https://www.perplexity.ai/account/api).\n" + "source": [ + "\n", + "### 1.1 Credentials\n", + "\n", + "You'll need two things:\n", + "\n", + "* A **Perplexity API key** from [https://www.perplexity.ai/account/api](https://www.perplexity.ai/account/api).\n", + "* A **Qdrant Cloud cluster URL + API key** from [https://cloud.qdrant.io](https://cloud.qdrant.io) (free tier works fine).\n", + "\n", + "The cell below uses `getpass` so the keys never end up in the notebook output.\n" + ] }, { "cell_type": "code", "execution_count": null, - "id": "set-env", + "id": "setup-credentials", "metadata": { + "execution": { + "iopub.execute_input": "2026-05-21T20:54:38.217778Z", + "iopub.status.busy": "2026-05-21T20:54:38.217662Z", + "iopub.status.idle": "2026-05-21T20:54:38.221347Z", + "shell.execute_reply": "2026-05-21T20:54:38.220845Z" + }, "tags": [ "setup", - "env" + "credentials" ] }, "outputs": [], - "source": "import os\nfrom getpass import getpass\n\nif not os.environ.get(\"PERPLEXITY_API_KEY\"):\n os.environ[\"PERPLEXITY_API_KEY\"] = getpass(\"PERPLEXITY_API_KEY: \")" + "source": [ + "import os\n", + "from getpass import getpass\n", + "\n", + "if not os.environ.get(\"PERPLEXITY_API_KEY\"):\n", + " os.environ[\"PERPLEXITY_API_KEY\"] = getpass(\"PERPLEXITY_API_KEY: \")\n", + "if not os.environ.get(\"QDRANT_URL\"):\n", + " os.environ[\"QDRANT_URL\"] = getpass(\"QDRANT_URL (e.g. https://xxxx.cloud.qdrant.io): \")\n", + "if not os.environ.get(\"QDRANT_API_KEY\"):\n", + " os.environ[\"QDRANT_API_KEY\"] = getpass(\"QDRANT_API_KEY: \")\n", + "\n", + "print(\"Perplexity key set:\", bool(os.environ.get(\"PERPLEXITY_API_KEY\")))\n", + "print(\"Qdrant URL set: \", bool(os.environ.get(\"QDRANT_URL\")))\n", + "print(\"Qdrant key set: \", bool(os.environ.get(\"QDRANT_API_KEY\")))\n" + ] }, { "cell_type": "markdown", - "id": "seed-kb", + "id": "embeddings", "metadata": { "tags": [] }, - "source": "\n## 2. 
Seed the knowledge base with Perplexity Embeddings\n\nWe start with three short seed documents covering Haystack basics. They get embedded with `PerplexityDocumentEmbedder` (model `pplx-embed-v1-0.6b`, 1536-dim) and written into an in-memory Qdrant store.\n" - }, - { - "cell_type": "code", - "execution_count": null, - "id": "init-store", - "metadata": { - "tags": [ - "setup", - "document-store" - ] - }, - "outputs": [], - "source": "from haystack import Document\nfrom haystack_integrations.document_stores.qdrant import QdrantDocumentStore\n\ndocument_store = QdrantDocumentStore(\n location=\":memory:\",\n index=\"research_agent\",\n embedding_dim=1536, # pplx-embed-v1-0.6b output dim\n similarity=\"cosine\",\n return_embedding=False,\n)" - }, - { - "cell_type": "code", - "execution_count": null, - "id": "seed-docs", - "metadata": { - "tags": [ - "data", - "seed" - ] - }, - "outputs": [], - "source": "seed_docs = [\n Document(\n content=(\n \"Haystack is an open-source Python framework by deepset for building \"\n \"production LLM applications such as RAG pipelines and agentic workflows.\"\n ),\n meta={\"source\": \"https://haystack.deepset.ai/\", \"title\": \"Haystack overview\"},\n ),\n Document(\n content=(\n \"A Haystack Agent is a tool-calling component that wraps a chat generator \"\n \"and a list of Tool objects. 
The agent loops: it asks the model what to do, \"\n \"executes the chosen tool, and feeds the result back until it produces a final answer.\"\n ),\n meta={\"source\": \"https://docs.haystack.deepset.ai/docs/agent\", \"title\": \"Haystack Agent\"},\n ),\n Document(\n content=(\n \"The perplexity-haystack package ships PerplexityChatGenerator (Agent API), \"\n \"PerplexityTextEmbedder + PerplexityDocumentEmbedder (Embeddings API), and \"\n \"PerplexityWebSearch (Search API).\"\n ),\n meta={\n \"source\": \"https://haystack.deepset.ai/integrations/perplexity\",\n \"title\": \"perplexity-haystack integration\",\n },\n ),\n]" + "source": [ + "\n", + "## 2. Embedding helper (Perplexity `pplx-embed-v1-0.6b`)\n", + "\n", + "A tiny wrapper around `POST /v1/embeddings`. We ask for `base64_int8` (currently the only supported `encoding_format`), then decode to a `float32` list that Qdrant is happy to ingest.\n", + "\n", + "The vector dimension for `pplx-embed-v1-0.6b` is **1024** — keep this in mind when you create the Qdrant collection.\n", + "\n", + "We also stamp every request with an `X-Pplx-Integration` header so the API team can see traffic from this cookbook in their dashboards.\n" + ] }, { "cell_type": "code", - "execution_count": null, - "id": "index-seed", + "execution_count": 3, + "id": "embed-helper", "metadata": { + "execution": { + "iopub.execute_input": "2026-05-21T20:54:38.222462Z", + "iopub.status.busy": "2026-05-21T20:54:38.222351Z", + "iopub.status.idle": "2026-05-21T20:54:38.441325Z", + "shell.execute_reply": "2026-05-21T20:54:38.440733Z" + }, "tags": [ - "pipeline:indexing", "component:embedder", - "component:writer" + "perplexity:embeddings" ] }, - "outputs": [], - "source": "from haystack import Pipeline\nfrom haystack.components.writers import DocumentWriter\nfrom haystack.document_stores.types import DuplicatePolicy\nfrom haystack_integrations.components.embedders.perplexity import PerplexityDocumentEmbedder\n\nindexing = 
Pipeline()\nindexing.add_component(\"embedder\", PerplexityDocumentEmbedder(model=\"pplx-embed-v1-0.6b\"))\nindexing.add_component(\n \"writer\",\n DocumentWriter(document_store=document_store, policy=DuplicatePolicy.OVERWRITE),\n)\nindexing.connect(\"embedder.documents\", \"writer.documents\")\n\nindexing.run({\"embedder\": {\"documents\": seed_docs}})\nprint(f\"Indexed {document_store.count_documents()} seed documents.\")" - }, - { - "cell_type": "markdown", - "id": "tools", - "metadata": { - "tags": [] - }, - "source": "\n## 3. Define the three tools\n\nEach tool is a thin function wrapped with Haystack's [`Tool`](https://docs.haystack.deepset.ai/docs/tool) dataclass and a JSON-schema for arguments. We close over the document store and the embedder components so the tools stay pure functions from the agent's perspective.\n" + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "returned 2 vectors of dim 1024\n" + ] + } + ], + "source": [ + "import base64\n", + "import httpx\n", + "import numpy as np\n", + "\n", + "EMBEDDINGS_URL = \"https://api.perplexity.ai/v1/embeddings\"\n", + "EMBEDDING_MODEL = \"pplx-embed-v1-0.6b\"\n", + "EMBEDDING_DIM = 1024\n", + "\n", + "\n", + "def embed_texts(texts: list[str]) -> list[list[float]]:\n", + " \"\"\"Call Perplexity /v1/embeddings and return float32 vectors.\"\"\"\n", + " resp = httpx.post(\n", + " EMBEDDINGS_URL,\n", + " headers={\n", + " \"Authorization\": f\"Bearer {os.environ['PERPLEXITY_API_KEY']}\",\n", + " \"Content-Type\": \"application/json\",\n", + " # See attribution header docs at docs.perplexity.ai\n", + " \"X-Pplx-Integration\": \"haystack/cookbook-live-research-agent\",\n", + " },\n", + " json={\n", + " \"model\": EMBEDDING_MODEL,\n", + " \"input\": texts,\n", + " \"encoding_format\": \"base64_int8\",\n", + " },\n", + " timeout=60.0,\n", + " )\n", + " resp.raise_for_status()\n", + " out = []\n", + " for item in resp.json()[\"data\"]:\n", + " raw = 
base64.b64decode(item[\"embedding\"])\n", + " vec = np.frombuffer(raw, dtype=np.int8).astype(np.float32)\n", + " out.append(vec.tolist())\n", + " return out\n", + "\n", + "\n", + "# Quick sanity check.\n", + "sample = embed_texts([\"hello world\", \"retrieval augmented generation\"])\n", + "print(\"returned\", len(sample), \"vectors of dim\", len(sample[0]))\n" + ] }, { "cell_type": "markdown", - "id": "tool-retrieve", + "id": "doc-store", "metadata": { "tags": [] }, - "source": "\n### 3.1 `retrieve_from_index` \u2014 Perplexity Embeddings + Qdrant\n" + "source": [ + "\n", + "## 3. Qdrant document store\n", + "\n", + "The Qdrant collection is created with `embedding_dim=1024` to match `pplx-embed-v1-0.6b`. We keep `recreate_index=True` here so re-running the notebook from scratch gives reproducible output — change to `False` once you want the index to persist between sessions.\n" + ] }, { "cell_type": "code", - "execution_count": null, - "id": "tool-retrieve-code", + "execution_count": 4, + "id": "init-qdrant", "metadata": { + "execution": { + "iopub.execute_input": "2026-05-21T20:54:38.442682Z", + "iopub.status.busy": "2026-05-21T20:54:38.442523Z", + "iopub.status.idle": "2026-05-21T20:54:41.164116Z", + "shell.execute_reply": "2026-05-21T20:54:41.163213Z" + }, "tags": [ - "tool", - "tool:retrieve_from_index", - "component:text_embedder", - "component:retriever" + "setup", + "doc-store" ] }, - "outputs": [], - "source": "from haystack.tools import Tool\nfrom haystack_integrations.components.embedders.perplexity import PerplexityTextEmbedder\nfrom haystack_integrations.components.retrievers.qdrant import QdrantEmbeddingRetriever\n\n_text_embedder = PerplexityTextEmbedder(model=\"pplx-embed-v1-0.6b\")\n_retriever = QdrantEmbeddingRetriever(document_store=document_store, top_k=4)\n\n\ndef retrieve_from_index(query: str, top_k: int = 4) -> dict:\n \"\"\"Embed the query with Perplexity and pull the top-k matches from Qdrant.\"\"\"\n query_emb = 
_text_embedder.run(text=query)[\"embedding\"]\n hits = _retriever.run(query_embedding=query_emb, top_k=top_k)[\"documents\"]\n return {\n \"hits\": [\n {\n \"title\": d.meta.get(\"title\", \"\"),\n \"source\": d.meta.get(\"source\", \"\"),\n \"snippet\": d.content[:500],\n \"score\": d.score,\n }\n for d in hits\n ]\n }\n\n\nretrieve_tool = Tool(\n name=\"retrieve_from_index\",\n description=(\n \"Search the local Qdrant knowledge base (Perplexity embeddings). \"\n \"Use this FIRST for any question that might already be covered by indexed sources.\"\n ),\n parameters={\n \"type\": \"object\",\n \"properties\": {\n \"query\": {\"type\": \"string\", \"description\": \"Natural-language search query.\"},\n \"top_k\": {\"type\": \"integer\", \"default\": 4, \"minimum\": 1, \"maximum\": 10},\n },\n \"required\": [\"query\"],\n },\n function=retrieve_from_index,\n)" + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Qdrant collection ready: research_agent_demo\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Docs currently in index: 0\n" + ] + } + ], + "source": [ + "from haystack.utils import Secret\n", + "from haystack_integrations.document_stores.qdrant import QdrantDocumentStore\n", + "\n", + "document_store = QdrantDocumentStore(\n", + " url=os.environ[\"QDRANT_URL\"],\n", + " api_key=Secret.from_token(os.environ[\"QDRANT_API_KEY\"]),\n", + " index=\"research_agent_demo\",\n", + " embedding_dim=EMBEDDING_DIM,\n", + " similarity=\"cosine\",\n", + " recreate_index=True, # set False once you want to keep the index across runs\n", + " return_embedding=False,\n", + ")\n", + "\n", + "print(\"Qdrant collection ready:\", document_store.index)\n", + "print(\"Docs currently in index:\", document_store.count_documents())\n" + ] }, { "cell_type": "markdown", - "id": "tool-search", + "id": "seed", "metadata": { "tags": [] }, - "source": "\n### 3.2 `web_search` \u2014 Perplexity Search API\n\n`PerplexityWebSearch` returns 
ranked, cleaned web results as Haystack `Document`s and the raw URL list. No `LinkContentFetcher`, no HTML cleaning, no SERP API key.\n" + "source": [ + "\n", + "## 4. Seed the index\n", + "\n", + "Two tight docs about *Haystack itself*. We intentionally leave out anything about the `perplexity-haystack` package so that one of the demo questions later forces the agent to call `web_search` + `ingest_url`.\n" + ] }, { "cell_type": "code", - "execution_count": null, - "id": "tool-search-code", + "execution_count": 5, + "id": "seed-docs", "metadata": { + "execution": { + "iopub.execute_input": "2026-05-21T20:54:41.165792Z", + "iopub.status.busy": "2026-05-21T20:54:41.165480Z", + "iopub.status.idle": "2026-05-21T20:54:41.810765Z", + "shell.execute_reply": "2026-05-21T20:54:41.810255Z" + }, "tags": [ - "tool", - "tool:web_search", - "component:websearch" + "doc-store", + "seed" ] }, - "outputs": [], - "source": "from haystack_integrations.components.websearch.perplexity import PerplexityWebSearch\n\n_web_search = PerplexityWebSearch(top_k=5)\n\n\ndef web_search(query: str, top_k: int = 5) -> dict:\n \"\"\"Run a Perplexity web search and return ranked, cited results.\"\"\"\n result = _web_search.run(query=query, search_params={\"max_results\": top_k})\n return {\n \"results\": [\n {\n \"title\": d.meta.get(\"title\", \"\"),\n \"url\": d.meta.get(\"url\") or d.meta.get(\"link\", \"\"),\n \"snippet\": d.content[:600],\n }\n for d in result[\"documents\"]\n ],\n \"links\": result[\"links\"],\n }\n\n\nweb_search_tool = Tool(\n name=\"web_search\",\n description=(\n \"Search the live web with the Perplexity Search API. 
\"\n \"Use this when retrieve_from_index returns nothing relevant, or when the question \"\n \"requires post-cutoff or current information.\"\n ),\n parameters={\n \"type\": \"object\",\n \"properties\": {\n \"query\": {\"type\": \"string\"},\n \"top_k\": {\"type\": \"integer\", \"default\": 5, \"minimum\": 1, \"maximum\": 20},\n },\n \"required\": [\"query\"],\n },\n function=web_search,\n)" + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "/tmp/ipykernel_1731/3113823150.py:26: Warning: Mutating attribute 'embedding' on an instance of 'Document' can lead to unexpected behavior by affecting other parts of the pipeline that use the same dataclass instance. Use `dataclasses.replace(instance, embedding=new_value)` instead. See https://docs.haystack.deepset.ai/docs/custom-components#requirements for details.\n", + " doc.embedding = vec\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "\r", + " 0%| | 0/2 [00:00\n### 3.3 `ingest_url` \u2014 close the loop\n\nThe agent picks a URL it considers high-signal (often returned by `web_search`), we fetch it with `trafilatura`, chunk it, embed it with `PerplexityDocumentEmbedder`, and write it back into Qdrant. The *next* call to `retrieve_from_index` can then find it.\n" + "source": [ + "\n", + "## 5. `retrieve_from_index` tool\n", + "\n", + "Embed the query with the same model used at indexing time, then pull the top-k from Qdrant. 
The tool returns a compact JSON payload — title, source URL, snippet, score — because the agent has to fit it into its context window.\n" + ] }, { "cell_type": "code", - "execution_count": null, - "id": "tool-ingest-code", + "execution_count": 6, + "id": "import-json", "metadata": { + "execution": { + "iopub.execute_input": "2026-05-21T20:54:41.812022Z", + "iopub.status.busy": "2026-05-21T20:54:41.811899Z", + "iopub.status.idle": "2026-05-21T20:54:41.814392Z", + "shell.execute_reply": "2026-05-21T20:54:41.813927Z" + }, "tags": [ - "tool", - "tool:ingest_url", - "component:embedder", - "component:splitter", - "component:writer" + "setup" ] }, "outputs": [], - "source": "import trafilatura\nfrom haystack.components.preprocessors import DocumentSplitter\n\n_doc_embedder = PerplexityDocumentEmbedder(model=\"pplx-embed-v1-0.6b\", progress_bar=False)\n_splitter = DocumentSplitter(split_by=\"word\", split_length=200, split_overlap=20)\n_writer = DocumentWriter(document_store=document_store, policy=DuplicatePolicy.OVERWRITE)\n\n\ndef ingest_url(url: str, title: str | None = None) -> dict:\n \"\"\"Fetch a URL, embed its contents, and add it to the Qdrant index.\"\"\"\n raw = trafilatura.fetch_url(url)\n text = trafilatura.extract(raw) if raw else None\n if not text:\n return {\"ok\": False, \"reason\": \"could not extract content\", \"url\": url}\n\n doc = Document(content=text, meta={\"source\": url, \"title\": title or url})\n chunks = _splitter.run(documents=[doc])[\"documents\"]\n embedded = _doc_embedder.run(documents=chunks)[\"documents\"]\n _writer.run(documents=embedded)\n\n return {\n \"ok\": True,\n \"url\": url,\n \"chunks_indexed\": len(embedded),\n \"total_docs_in_index\": document_store.count_documents(),\n }\n\n\ningest_tool = Tool(\n name=\"ingest_url\",\n description=(\n \"Fetch the content at `url`, embed it with Perplexity embeddings, and add it to the \"\n \"local Qdrant index for reuse by future retrieve_from_index calls. 
\"\n \"Call this after web_search when you find an authoritative source worth keeping.\"\n ),\n parameters={\n \"type\": \"object\",\n \"properties\": {\n \"url\": {\"type\": \"string\", \"description\": \"Absolute URL to fetch.\"},\n \"title\": {\"type\": \"string\", \"description\": \"Optional human-readable title.\"},\n },\n \"required\": [\"url\"],\n },\n function=ingest_url,\n)" - }, - { - "cell_type": "markdown", - "id": "agent", - "metadata": { - "tags": [] - }, - "source": "\n## 4. Build the agent (Perplexity Agent API)\n\n`PerplexityChatGenerator` subclasses Haystack's `OpenAIResponsesChatGenerator`, which is exactly what the [`Agent`](https://docs.haystack.deepset.ai/docs/agent) component expects. We pass our three tools and a system prompt that spells out the retrieve-first / search-then-ingest policy.\n" + "source": [ + "import json\n" + ] }, { "cell_type": "code", - "execution_count": null, - "id": "build-agent", + "execution_count": 7, + "id": "retriever-tool", "metadata": { + "execution": { + "iopub.execute_input": "2026-05-21T20:54:41.815845Z", + "iopub.status.busy": "2026-05-21T20:54:41.815714Z", + "iopub.status.idle": "2026-05-21T20:54:42.131915Z", + "shell.execute_reply": "2026-05-21T20:54:42.131357Z" + }, "tags": [ - "agent", - "component:chat_generator" + "tool:retrieve", + "component:retriever" ] }, - "outputs": [], - "source": "from haystack.components.agents import Agent\nfrom haystack_integrations.components.generators.perplexity import PerplexityChatGenerator\n\nSYSTEM_PROMPT = (\n \"You are a research agent. For every user question:\\n\"\n \"1. Call `retrieve_from_index` first to check the local knowledge base.\\n\"\n \"2. If the retrieved snippets do not fully answer the question, call `web_search`.\\n\"\n \"3. If `web_search` surfaces a high-quality URL the user is likely to ask about again, \"\n \"call `ingest_url` to add it to the index before answering.\\n\"\n \"4. Cite every fact with the source URL it came from. 
\"\n \"Prefer concise answers with inline markdown links.\"\n)\n\nchat_generator = PerplexityChatGenerator(model=\"openai/gpt-5.4\")\n\nagent = Agent(\n chat_generator=chat_generator,\n tools=[retrieve_tool, web_search_tool, ingest_tool],\n system_prompt=SYSTEM_PROMPT,\n exit_conditions=[\"text\"],\n max_agent_steps=8,\n)\nagent.warm_up()" + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "{\n", + " \"hits\": [\n", + " {\n", + " \"title\": \"Haystack overview\",\n", + " \"source\": \"https://haystack.deepset.ai/\",\n", + " \"snippet\": \"Haystack is an open-source LLM framework by deepset. You compose components like retrievers, generators, and embedders into pipelines, and add tool-calling Agents on top.\",\n", + " \"score\": 0.6988\n", + " },\n", + " {\n", + " \"title\": \"Haystack Agent component\",\n", + " \"source\": \"https://docs.haystack.deepset.ai/docs/agent\",\n", + " \"snippet\": \"In Haystack 2.x, the Agent component takes a chat generator plus a list of Tool objects, loops over tool calls, and exits when the model emits a final answer (the 'text' exit condition).\",\n", + " \"score\": 0.4161\n", + " }\n", + " ]\n", + "}\n" + ] + } + ], + "source": [ + "from haystack_integrations.components.retrievers.qdrant import QdrantEmbeddingRetriever\n", + "\n", + "retriever = QdrantEmbeddingRetriever(document_store=document_store, top_k=4)\n", + "\n", + "\n", + "def retrieve_from_index(query: str, top_k: int = 4) -> dict:\n", + " query_emb = embed_texts([query])[0]\n", + " hits = retriever.run(query_embedding=query_emb, top_k=top_k)[\"documents\"]\n", + " return {\n", + " \"hits\": [\n", + " {\n", + " \"title\": d.meta.get(\"title\", \"\"),\n", + " \"source\": d.meta.get(\"source\", \"\"),\n", + " \"snippet\": d.content[:400],\n", + " \"score\": round(d.score, 4),\n", + " }\n", + " for d in hits\n", + " ]\n", + " }\n", + "\n", + "\n", + "# Smoke test — should retrieve the seed docs about Haystack.\n", + "preview = 
retrieve_from_index(\"What is Haystack?\", top_k=2)\n", + "print(json.dumps(preview, indent=2)[:800])\n" + ] }, { "cell_type": "markdown", - "id": "run-q1", + "id": "web-search", "metadata": { "tags": [] }, - "source": "\n## 5. Run it\n\n### 5.1 Question 1 \u2014 answerable from the seed index\n\nThis should be answered with a single `retrieve_from_index` call.\n" + "source": [ + "\n", + "## 6. `web_search` tool (Perplexity Search API)\n", + "\n", + "`PerplexityWebSearch` hits `POST /search` and gives back already-ranked, already-cleaned results plus the list of source URLs. No SERP scraper, no extra fetcher in front of the model.\n" + ] }, { "cell_type": "code", - "execution_count": null, - "id": "run-q1-code", + "execution_count": 8, + "id": "web-search-tool", "metadata": { + "execution": { + "iopub.execute_input": "2026-05-21T20:54:42.133168Z", + "iopub.status.busy": "2026-05-21T20:54:42.133046Z", + "iopub.status.idle": "2026-05-21T20:54:42.533718Z", + "shell.execute_reply": "2026-05-21T20:54:42.533259Z" + }, "tags": [ - "run", - "demo:q1" + "tool:web_search", + "component:websearch", + "perplexity:search" ] }, - "outputs": [], - "source": "from haystack.dataclasses import ChatMessage\n\nq1 = \"What does the perplexity-haystack package provide?\"\nresult = agent.run(messages=[ChatMessage.from_user(q1)])\nprint(result[\"messages\"][-1].text)" + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "{\n", + " \"results\": [\n", + " {\n", + " \"title\": \"Perplexity with Haystack\",\n", + " \"url\": \"https://docs.perplexity.ai/docs/getting-started/integrations/haystack\",\n", + " \"snippet\": \"## \\u200b Overview\\nThe `perplexity-haystack` package provides Haystack components for Perplexity\\u2019s Agent API, Embeddings API, and grounded Search API, so you can build retrieval-augmented and agentic pipelines that combine chat, embeddings, and live web search.\\n**Haystack** is an open-source Python framework by deepset for building 
production-ready LLM applications, including RAG pipelines and agentic \"\n", + " },\n", + " {\n", + " \"title\": \"Perplexity | Haystack - deepset AI\",\n", + " \"url\": \"https://haystack.deepset.ai/integrations/perplexity\",\n", + " \"snippet\": \"# Integration: Perplexity\\nUse the Perplexity Agent API, Embeddings API, and grounded Search API in Haystack pipelines.\\n...\\n## Overview\\nThe `perplexity-haystack`\n" + ] + } + ], + "source": [ + "from haystack_integrations.components.websearch.perplexity import PerplexityWebSearch\n", + "\n", + "websearch = PerplexityWebSearch(top_k=5)\n", + "\n", + "\n", + "def web_search(query: str, top_k: int = 5) -> dict:\n", + " r = websearch.run(query=query, search_params={\"max_results\": top_k})\n", + " return {\n", + " \"results\": [\n", + " {\n", + " \"title\": d.meta.get(\"title\", \"\"),\n", + " \"url\": d.meta.get(\"url\") or d.meta.get(\"link\", \"\"),\n", + " \"snippet\": d.content[:400],\n", + " }\n", + " for d in r[\"documents\"]\n", + " ],\n", + " \"links\": r[\"links\"],\n", + " }\n", + "\n", + "\n", + "preview = web_search(\"perplexity haystack integration overview\", top_k=3)\n", + "print(json.dumps(preview, indent=2)[:900])\n" + ] }, { "cell_type": "markdown", - "id": "run-q2", + "id": "ingest", "metadata": { "tags": [] }, - "source": "\n### 5.2 Question 2 \u2014 requires live web search + ingestion\n\nThe agent should fail to answer from the index, fall back to `web_search`, ingest the best URL, and then answer with a citation.\n" + "source": [ + "\n", + "## 7. 
`ingest_url` tool\n", + "\n", + "Fetch a URL with `trafilatura` (with an `httpx` fallback for sites that 403 the default UA), pull the main article text, chunk it with Haystack's `DocumentSplitter`, embed the chunks, and write them to Qdrant.\n", + "\n", + "Once a page is ingested, future `retrieve_from_index` calls can hit it without paying for another web request.\n" + ] }, { "cell_type": "code", - "execution_count": null, - "id": "run-q2-code", + "execution_count": 9, + "id": "ingest-tool", "metadata": { + "execution": { + "iopub.execute_input": "2026-05-21T20:54:42.535679Z", + "iopub.status.busy": "2026-05-21T20:54:42.535560Z", + "iopub.status.idle": "2026-05-21T20:54:44.311222Z", + "shell.execute_reply": "2026-05-21T20:54:44.309817Z" + }, "tags": [ - "run", - "demo:q2" + "tool:ingest", + "component:splitter", + "component:embedder" ] }, - "outputs": [], - "source": "q2 = \"Summarise the latest stable release notes for the perplexity-haystack package.\"\nresult = agent.run(messages=[ChatMessage.from_user(q2)])\nprint(result[\"messages\"][-1].text)\nprint(f\"\\nDocuments in index after Q2: {document_store.count_documents()}\")" + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "/tmp/ipykernel_1731/3725313797.py:29: Warning: Mutating attribute 'embedding' on an instance of 'Document' can lead to unexpected behavior by affecting other parts of the pipeline that use the same dataclass instance. Use `dataclasses.replace(instance, embedding=new_value)` instead. 
See https://docs.haystack.deepset.ai/docs/custom-components#requirements for details.\n", + " c.embedding = v\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "\r", + " 0%| | 0/2 [00:00 dict:\n", + " # trafilatura's own fetcher is fine on clean pages; some sites block the\n", + " # default UA, so fall back to httpx with a browser-ish UA.\n", + " raw = trafilatura.fetch_url(url)\n", + " if not raw:\n", + " try:\n", + " raw = httpx.get(\n", + " url, timeout=20, follow_redirects=True,\n", + " headers={\"User-Agent\": \"Mozilla/5.0 (Haystack cookbook demo)\"},\n", + " ).text\n", + " except Exception as e:\n", + " return {\"ok\": False, \"reason\": f\"fetch error: {e}\", \"url\": url}\n", + " text = trafilatura.extract(raw)\n", + " if not text or len(text) < 200:\n", + " return {\"ok\": False, \"reason\": \"could not extract enough content\", \"url\": url}\n", + "\n", + " doc = Document(content=text, meta={\"source\": url, \"title\": title or url})\n", + " chunks = splitter.run(documents=[doc])[\"documents\"]\n", + "\n", + " chunk_vecs = embed_texts([c.content for c in chunks])\n", + " for c, v in zip(chunks, chunk_vecs):\n", + " c.embedding = v\n", + "\n", + " document_store.write_documents(chunks, policy=DuplicatePolicy.OVERWRITE)\n", + " return {\n", + " \"ok\": True,\n", + " \"url\": url,\n", + " \"chunks_indexed\": len(chunks),\n", + " \"chars_extracted\": len(text),\n", + " \"total_docs_in_index\": document_store.count_documents(),\n", + " }\n", + "\n", + "\n", + "# Demo: ingest the Perplexity integrations landing page so the agent\n", + "# can answer questions about perplexity-haystack later from the index.\n", + "print(json.dumps(\n", + " ingest_url(\"https://haystack.deepset.ai/integrations/perplexity\",\n", + " title=\"Perplexity Haystack integration\"),\n", + " indent=2,\n", + "))\n" + ] }, { "cell_type": "markdown", - "id": "run-q3", + "id": "agent", "metadata": { "tags": [] }, - "source": "\n### 5.3 Question 3 \u2014 confirms the 
index actually grew\n\nA follow-up that should now hit the ingested page via `retrieve_from_index` without another web call.\n" + "source": [ + "\n", + "## 8. Agent (Perplexity Agent API)\n", + "\n", + "`PerplexityChatGenerator` defaults to `openai/gpt-5.4` via the Agent API; you can swap to any other model the Agent API exposes (Anthropic, Gemini, Perplexity Sonar, etc.).\n", + "\n", + "The system prompt forces the order **retrieve → web_search → ingest_url → retrieve again → answer** so we get a clean demo of the loop. In production you can loosen it.\n" + ] }, { "cell_type": "code", - "execution_count": null, - "id": "run-q3-code", + "execution_count": 10, + "id": "agent-init", "metadata": { + "execution": { + "iopub.execute_input": "2026-05-21T20:54:44.312883Z", + "iopub.status.busy": "2026-05-21T20:54:44.312616Z", + "iopub.status.idle": "2026-05-21T20:54:45.818446Z", + "shell.execute_reply": "2026-05-21T20:54:45.817587Z" + }, "tags": [ - "run", - "demo:q3" + "component:agent", + "perplexity:agent" ] }, - "outputs": [], - "source": "q3 = \"In the perplexity-haystack release notes you just ingested, which components were added or changed?\"\nresult = agent.run(messages=[ChatMessage.from_user(q3)])\nprint(result[\"messages\"][-1].text)" + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Agent ready with tools: ['retrieve_from_index', 'web_search', 'ingest_url']\n" + ] + } + ], + "source": [ + "from haystack.tools import Tool\n", + "from haystack.components.agents import Agent\n", + "from haystack.dataclasses import ChatMessage\n", + "from haystack_integrations.components.generators.perplexity import PerplexityChatGenerator\n", + "\n", + "retrieve_tool = Tool(\n", + " name=\"retrieve_from_index\",\n", + " description=(\n", + " \"Search the local Qdrant knowledge base (Perplexity embeddings). 
\"\n", + " \"Use this FIRST for any question that might be in the index.\"\n", + " ),\n", + " parameters={\n", + " \"type\": \"object\",\n", + " \"properties\": {\n", + " \"query\": {\"type\": \"string\"},\n", + " \"top_k\": {\"type\": \"integer\", \"default\": 4, \"minimum\": 1, \"maximum\": 10},\n", + " },\n", + " \"required\": [\"query\"],\n", + " },\n", + " function=retrieve_from_index,\n", + ")\n", + "\n", + "web_search_tool = Tool(\n", + " name=\"web_search\",\n", + " description=(\n", + " \"Search the live web with the Perplexity Search API. Use when \"\n", + " \"retrieve_from_index returns nothing useful, or when you need current information.\"\n", + " ),\n", + " parameters={\n", + " \"type\": \"object\",\n", + " \"properties\": {\n", + " \"query\": {\"type\": \"string\"},\n", + " \"top_k\": {\"type\": \"integer\", \"default\": 5, \"minimum\": 1, \"maximum\": 20},\n", + " },\n", + " \"required\": [\"query\"],\n", + " },\n", + " function=web_search,\n", + ")\n", + "\n", + "ingest_tool = Tool(\n", + " name=\"ingest_url\",\n", + " description=(\n", + " \"Fetch a URL, embed it with Perplexity embeddings, and add it to the \"\n", + " \"Qdrant index for reuse by future retrieve_from_index calls. Use after \"\n", + " \"web_search when you find an authoritative source.\"\n", + " ),\n", + " parameters={\n", + " \"type\": \"object\",\n", + " \"properties\": {\n", + " \"url\": {\"type\": \"string\"},\n", + " \"title\": {\"type\": \"string\"},\n", + " },\n", + " \"required\": [\"url\"],\n", + " },\n", + " function=ingest_url,\n", + ")\n", + "\n", + "chat = PerplexityChatGenerator(model=\"openai/gpt-5.4\")\n", + "\n", + "agent = Agent(\n", + " chat_generator=chat,\n", + " tools=[retrieve_tool, web_search_tool, ingest_tool],\n", + " system_prompt=(\n", + " \"You are a research agent. For every user question:\\n\"\n", + " \"1. Call retrieve_from_index first to see if the local knowledge base already has the answer.\\n\"\n", + " \"2. 
If the retrieved snippets don't cover the question, call web_search.\\n\"\n", + " \"3. Whenever web_search returns a URL whose contents you actually need to answer the question, \"\n", + " \"call ingest_url on that URL FIRST so future retrieve_from_index calls can find it. \"\n", + " \"Then call retrieve_from_index again to read the chunks you just ingested.\\n\"\n", + " \"4. Write a concise answer. Cite every fact with the source URL it came from using inline markdown links.\"\n", + " ),\n", + " exit_conditions=[\"text\"],\n", + " max_agent_steps=10,\n", + ")\n", + "agent.warm_up()\n", + "print(\"Agent ready with tools:\", [t.name for t in agent.tools])\n" + ] }, { "cell_type": "markdown", - "id": "inspect", + "id": "run-demo", "metadata": { "tags": [] }, - "source": "\n## 6. Inspect the trace\n\nEach `ChatMessage` in `result['messages']` records the tool calls and their outputs. This is the per-turn evidence that the agent used the right tool for each question.\n" + "source": [ + "\n", + "## 9. 
Run the agent on three questions\n", + "\n", + "* **Q1** is covered by the seed docs — should be one `retrieve_from_index` call, then an answer.\n", + "* **Q2** isn't covered — the agent should fall through to `web_search`, pick a URL, `ingest_url` it, and retrieve again before answering.\n", + "* **Q3** asks something that *should* now be in the index thanks to Q2's ingest, so the agent can stay local.\n", + "\n", + "For each run we print the message timeline (role + tool calls), then the final answer.\n" + ] }, { "cell_type": "code", - "execution_count": null, - "id": "inspect-code", + "execution_count": 11, + "id": "run-questions", "metadata": { + "execution": { + "iopub.execute_input": "2026-05-21T20:54:45.820008Z", + "iopub.status.busy": "2026-05-21T20:54:45.819759Z", + "iopub.status.idle": "2026-05-21T20:55:18.306713Z", + "shell.execute_reply": "2026-05-21T20:55:18.306092Z" + }, "tags": [ - "inspect" + "demo", + "run" ] }, - "outputs": [], - "source": "for i, msg in enumerate(result[\"messages\"]):\n tool_calls = getattr(msg, \"tool_calls\", []) or []\n tool_results = getattr(msg, \"tool_call_results\", []) or []\n summary = []\n if tool_calls:\n summary.append(\"calls=\" + \", \".join(tc.tool_name for tc in tool_calls))\n if tool_results:\n summary.append(f\"results={len(tool_results)}\")\n if msg.text:\n summary.append(f\"text[{len(msg.text)} chars]\")\n print(f\"[{i}] role={msg.role.value} \" + \" | \".join(summary))" + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "================================================================================\n", + "Q: What is the Haystack Agent component, briefly?\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "/tmp/ipykernel_1731/3725313797.py:29: Warning: Mutating attribute 'embedding' on an instance of 'Document' can lead to unexpected behavior by affecting other parts of the pipeline that use the same dataclass instance. 
Use `dataclasses.replace(instance, embedding=new_value)` instead. See https://docs.haystack.deepset.ai/docs/custom-components#requirements for details.\n", + " c.embedding = v\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "\r", + " 0%| | 0/12 [00:00 None:\n", + " print(\"=\" * 80)\n", + " print(\"Q:\", q)\n", + " result = agent.run(messages=[ChatMessage.from_user(q)])\n", + " msgs = result[\"messages\"]\n", + " print(f\"messages: {len(msgs)}\")\n", + " for i, m in enumerate(msgs):\n", + " tool_calls = getattr(m, \"tool_calls\", []) or []\n", + " tool_results = getattr(m, \"tool_call_results\", []) or []\n", + " print(\n", + " f\" [{i}] role={m.role.value} \"\n", + " f\"tool_calls={[t.tool_name for t in tool_calls]} \"\n", + " f\"results={len(tool_results)} text_len={len(m.text or '')}\"\n", + " )\n", + " print(\"\\nFINAL ANSWER:\\n\" + (msgs[-1].text or \"\"))\n", + " print(\"\\nDocs in index now:\", document_store.count_documents())\n", + "\n", + "\n", + "for q in QUESTIONS:\n", + " run_question(q)\n", + " print()\n" + ] }, { "cell_type": "markdown", @@ -292,7 +1168,22 @@ "metadata": { "tags": [] }, - "source": "\n## 7. 
Wrap up\n\nYou now have a Haystack agent that:\n\n- **Decides** when to retrieve, search, or ingest \u2014 using Perplexity for all three.\n- **Learns** across turns: the second question added a page to the Qdrant index, the third question retrieved it without another web call.\n- **Cites** every web-derived fact \u2014 Perplexity's Search API returns the source URLs alongside the content, so the agent has nothing to hallucinate.\n\n### Where to go next\n\n- Swap `pplx-embed-v1-0.6b` for `pplx-embed-v1-4b` if you need higher-quality embeddings.\n- Add `search_recency_filter` or `search_domain_filter` to `PerplexityWebSearch.search_params` to constrain results.\n- Replace the in-memory Qdrant with a hosted Qdrant Cloud instance to persist what the agent learns.\n- Read the [Perplexity x Haystack integration guide](https://docs.perplexity.ai/docs/getting-started/integrations/haystack) for advanced patterns (streaming, structured outputs, tool-calling with the Agent API).\n" + "source": [ + "\n", + "## 10. 
Where to go next\n", + "\n", + "* Swap `recreate_index=True` to `False` and let the index accumulate across sessions — every web answer the agent gives stays available offline.\n", + "* Bring in `PerplexityContextualizedEmbedder` (`/v1/contextualizedembeddings`) instead of the static doc embedder if your corpus has heavy section structure or codebases.\n", + "* Add a `cite_index` tool that returns the embedded chunk IDs alongside the answer, so downstream UIs can hyperlink back to the source.\n", + "* Try a different reasoning model — `chat = PerplexityChatGenerator(model=\"anthropic/claude-sonnet-4-5\")` works via the Agent API too.\n", + "\n", + "### References\n", + "* Perplexity Agent API — \n", + "* Perplexity Search API — \n", + "* Perplexity Embeddings API — \n", + "* Haystack Perplexity integration — \n", + "* Qdrant document store — \n" + ] } ], "metadata": { @@ -302,41 +1193,59 @@ "name": "python3" }, "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", "name": "python", - "pygments_lexer": "ipython3" + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.12.8" }, "x_cookbook": { - "external_services": [ - "perplexity-api", - "qdrant (in-memory)" + "authors": [ + { + "name": "Perplexity API team", + "url": "https://perplexity.ai" + } ], - "haystack_components": [ - "PerplexityChatGenerator", - "PerplexityWebSearch", - "PerplexityDocumentEmbedder", - "PerplexityTextEmbedder", - "QdrantDocumentStore", - "QdrantEmbeddingRetriever", - "Agent", - "Tool", - "DocumentSplitter", - "DocumentWriter" + "category": "agents", + "components_used": [ + "haystack.components.agents.Agent", + "haystack.tools.Tool", + "haystack.components.preprocessors.DocumentSplitter", + "haystack_integrations.components.generators.perplexity.PerplexityChatGenerator", + "haystack_integrations.components.websearch.perplexity.PerplexityWebSearch", + 
"haystack_integrations.document_stores.qdrant.QdrantDocumentStore", + "haystack_integrations.components.retrievers.qdrant.QdrantEmbeddingRetriever" ], - "perplexity_apis": [ - "Agent API", - "Search API", - "Embeddings API" + "integrations": [ + { + "name": "perplexity-haystack", + "url": "https://haystack.deepset.ai/integrations/perplexity" + }, + { + "name": "qdrant-haystack", + "url": "https://haystack.deepset.ai/integrations/qdrant-document-store" + } ], - "requires_keys": [ - "PERPLEXITY_API_KEY" + "perplexity_apis_used": [ + "/v1/agent", + "/search", + "/v1/embeddings" ], - "title": "Live-Learning Research Agent with Perplexity (Search + Embeddings + Agent) and Qdrant", - "topics": [ - "Agents", - "RAG", - "Web-QA", - "Advanced Retrieval" - ] + "slug": "perplexity_live_research_agent", + "tags": [ + "perplexity", + "qdrant", + "agent", + "rag", + "tool-calling", + "embeddings" + ], + "title": "Live-Learning Research Agent with Perplexity (Search + Embeddings + Agent) and Qdrant" } }, "nbformat": 4, From a1ea2da0290d68dcbf24180a98f15b3065a920cb Mon Sep 17 00:00:00 2001 From: James Liounis Date: Thu, 21 May 2026 17:17:04 -0400 Subject: [PATCH 3/4] nits --- .../perplexity_live_research_agent.ipynb | 53 +++++++++---------- 1 file changed, 25 insertions(+), 28 deletions(-) diff --git a/notebooks/perplexity_live_research_agent.ipynb b/notebooks/perplexity_live_research_agent.ipynb index 3a1b568..544e0e4 100644 --- a/notebooks/perplexity_live_research_agent.ipynb +++ b/notebooks/perplexity_live_research_agent.ipynb @@ -145,14 +145,12 @@ "\n", "A tiny wrapper around `POST /v1/embeddings`. 
We ask for `base64_int8` (currently the only supported `encoding_format`), then decode to a `float32` list that Qdrant is happy to ingest.\n", "\n", - "The vector dimension for `pplx-embed-v1-0.6b` is **1024** — keep this in mind when you create the Qdrant collection.\n", - "\n", - "We also stamp every request with an `X-Pplx-Integration` header so the API team can see traffic from this cookbook in their dashboards.\n" + "The vector dimension for `pplx-embed-v1-0.6b` is **1024** — keep this in mind when you create the Qdrant collection.\n" ] }, { "cell_type": "code", - "execution_count": 3, + "execution_count": null, "id": "embed-helper", "metadata": { "execution": { @@ -192,7 +190,6 @@ " headers={\n", " \"Authorization\": f\"Bearer {os.environ['PERPLEXITY_API_KEY']}\",\n", " \"Content-Type\": \"application/json\",\n", - " # See attribution header docs at docs.perplexity.ai\n", " \"X-Pplx-Integration\": \"haystack/cookbook-live-research-agent\",\n", " },\n", " json={\n", @@ -321,7 +318,7 @@ "name": "stderr", "output_type": "stream", "text": [ - "\r", + "\r\n", " 0%| | 0/2 [00:00\n", "## 6. `web_search` tool (Perplexity Search API)\n", "\n", - "`PerplexityWebSearch` hits `POST /search` and gives back already-ranked, already-cleaned results plus the list of source URLs. No SERP scraper, no extra fetcher in front of the model.\n" + "`PerplexityWebSearch` hits `POST /search` and gives back already-ranked, already-cleaned results plus the list of source URLs. No SERP scraper, no extra fetcher in front of the model. Refer to the official documentation [here](https://docs.perplexity.ai/docs/search/quickstart). \n" ] }, { @@ -607,7 +604,7 @@ "name": "stderr", "output_type": "stream", "text": [ - "\r", + "\r\n", " 0%| | 0/2 [00:00\n", "## 8. 
Agent (Perplexity Agent API)\n", "\n", - "`PerplexityChatGenerator` defaults to `openai/gpt-5.4` via the Agent API; you can swap to any other model the Agent API exposes (Anthropic, Gemini, Perplexity Sonar, etc.).\n", + "`PerplexityChatGenerator` defaults to `openai/gpt-5.4` via the Agent API; you can swap to any other model the Agent API exposes (Anthropic, Gemini, Perplexity Sonar, etc.). Refer to the official documentation [here](https://docs.perplexity.ai/docs/agent-api/quickstart).\n", "\n", "The system prompt forces the order **retrieve → web_search → ingest_url → retrieve again → answer** so we get a clean demo of the loop. In production you can loosen it.\n" ] @@ -871,7 +868,7 @@ "name": "stderr", "output_type": "stream", "text": [ - "\r", + "\r\n", " 0%| | 0/12 [00:00 Date: Mon, 25 May 2026 16:44:41 +0000 Subject: [PATCH 4/4] cookbook: use PerplexityDocumentEmbedder/TextEmbedder directly Now that haystack-core-integrations PR #3344 fixes the embedders to default to encoding_format=base64_int8 and decode responses to list[float], drop the bespoke httpx + np.frombuffer helper and use PerplexityDocumentEmbedder / PerplexityTextEmbedder directly for seeding, querying, and ingest_url chunk embedding. Refs: https://github.com/deepset-ai/haystack-core-integrations/pull/3344 --- .../perplexity_live_research_agent.ipynb | 275 ++++-------------- 1 file changed, 62 insertions(+), 213 deletions(-) diff --git a/notebooks/perplexity_live_research_agent.ipynb b/notebooks/perplexity_live_research_agent.ipynb index 544e0e4..940b69c 100644 --- a/notebooks/perplexity_live_research_agent.ipynb +++ b/notebooks/perplexity_live_research_agent.ipynb @@ -17,12 +17,12 @@ "| Perplexity API | Haystack component | Role in this notebook |\n", "|---|---|---|\n", "| **Agent API** (`POST /v1/agent`) | `PerplexityChatGenerator` | The agent's reasoning model. OpenAI-Responses-compatible, so it slots into Haystack's [`Agent`](https://docs.haystack.deepset.ai/docs/agent) with no glue. 
|\n", - "| **Search API** (`POST /search`) | `PerplexityWebSearch` | Ranked, cleaned, cited web results, exposed to the agent as a tool — replaces the SerperDev / DuckDuckGo + `LinkContentFetcher` chain other cookbooks build by hand. |\n", - "| **Embeddings API** (`POST /v1/embeddings`) | `pplx-embed-v1-0.6b` (called directly, see note below) | Indexes documents into [Qdrant](https://haystack.deepset.ai/integrations/qdrant-document-store) and embeds queries at retrieval time. |\n", + "| **Search API** (`POST /search`) | `PerplexityWebSearch` | Ranked, cleaned, cited web results, exposed to the agent as a tool \u2014 replaces the SerperDev / DuckDuckGo + `LinkContentFetcher` chain other cookbooks build by hand. |\n", + "| **Embeddings API** (`POST /v1/embeddings`) | `PerplexityTextEmbedder` / `PerplexityDocumentEmbedder` | Indexes documents into [Qdrant](https://haystack.deepset.ai/integrations/qdrant-document-store) and embeds queries at retrieval time. |\n", "\n", - "The agent gets three tools — `retrieve_from_index`, `web_search`, `ingest_url` — and decides per question whether to read the local index, search the live web, or grow the index with a freshly-fetched page. Net result: a knowledge base that *learns from the agent's own behaviour*, with citations on every web answer.\n", + "The agent gets three tools \u2014 `retrieve_from_index`, `web_search`, `ingest_url` \u2014 and decides per question whether to read the local index, search the live web, or grow the index with a freshly-fetched page. Net result: a knowledge base that *learns from the agent's own behaviour*, with citations on every web answer.\n", "\n", - "> **One real-world gotcha you'll hit:** the live `/v1/embeddings` endpoint only accepts `encoding_format` of `base64_int8` or `base64_binary`, but `PerplexityDocumentEmbedder` / `PerplexityTextEmbedder` inherit the OpenAI default of `float` and get back HTTP 400. 
Until that's fixed upstream, we call the embeddings endpoint directly with `httpx` and decode `base64_int8` to `float32`. It's ~15 lines and we walk through it below.\n" + "> **Embeddings note:** Perplexity's `/v1/embeddings` endpoint only accepts `encoding_format` of `base64_int8` or `base64_binary`. As of [`perplexity-haystack` PR #3344](https://github.com/deepset-ai/haystack-core-integrations/pull/3344), `PerplexityDocumentEmbedder` and `PerplexityTextEmbedder` default to `base64_int8` and decode the response back to `list[float]` automatically \u2014 no manual `httpx` call needed. Make sure you're on a release that includes that fix; on older versions the embedders inherit OpenAI's `encoding_format=\"float\"` default and get HTTP 400.\n" ] }, { @@ -35,10 +35,10 @@ "\n", "## What you will build\n", "\n", - "1. A Qdrant Cloud–backed knowledge base seeded with a couple of Haystack documentation snippets, embedded with `pplx-embed-v1-0.6b`.\n", + "1. A Qdrant Cloud\u2013backed knowledge base seeded with a couple of Haystack documentation snippets, embedded with `pplx-embed-v1-0.6b` via `PerplexityDocumentEmbedder`.\n", "2. A `web_search` tool wrapping `PerplexityWebSearch` so the agent can hit the Perplexity Search API directly.\n", - "3. An `ingest_url` tool that takes a URL from a web search result, extracts the page with `trafilatura`, embeds the chunks with the Perplexity Embeddings API, and writes them to Qdrant.\n", - "4. A `retrieve_from_index` tool that embeds the query and pulls the top-k from Qdrant.\n", + "3. An `ingest_url` tool that takes a URL from a web search result, extracts the page with `trafilatura`, embeds the chunks with `PerplexityDocumentEmbedder`, and writes them to Qdrant.\n", + "4. A `retrieve_from_index` tool that embeds the query with `PerplexityTextEmbedder` and pulls the top-k from Qdrant.\n", "5. 
A Haystack [`Agent`](https://docs.haystack.deepset.ai/docs/agent) driven by `PerplexityChatGenerator` that orchestrates the three tools.\n", "6. Three sample questions that show the index growing across turns and answers carrying citations end-to-end.\n" ] @@ -53,7 +53,7 @@ "\n", "## 1. Setup\n", "\n", - "Install the integration packages plus `trafilatura` for HTML extraction. The notebook uses a **Qdrant Cloud** cluster (free tier is enough) so the index persists across runs — flip `recreate_index=True` to `False` once you have something you want to keep.\n" + "Install the integration packages plus `trafilatura` for HTML extraction. The notebook uses a **Qdrant Cloud** cluster (free tier is enough) so the index persists across runs \u2014 flip `recreate_index=True` to `False` once you have something you want to keep.\n" ] }, { @@ -78,8 +78,7 @@ " \"perplexity-haystack\" \\\n", " \"qdrant-haystack\" \\\n", " \"trafilatura\" \\\n", - " \"httpx\" \\\n", - " \"numpy\"" + " \"httpx\"\n" ] }, { @@ -141,11 +140,14 @@ }, "source": [ "\n", - "## 2. Embedding helper (Perplexity `pplx-embed-v1-0.6b`)\n", + "## 2. Embedding components (Perplexity `pplx-embed-v1-0.6b`)\n", "\n", - "A tiny wrapper around `POST /v1/embeddings`. We ask for `base64_int8` (currently the only supported `encoding_format`), then decode to a `float32` list that Qdrant is happy to ingest.\n", + "We use the first-class Haystack components from `perplexity-haystack`:\n", "\n", - "The vector dimension for `pplx-embed-v1-0.6b` is **1024** — keep this in mind when you create the Qdrant collection.\n" + "* `PerplexityDocumentEmbedder` \u2014 embeds `Document` objects in batches for indexing.\n", + "* `PerplexityTextEmbedder` \u2014 embeds a single query string at retrieval time.\n", + "\n", + "Both default to `encoding_format=\"base64_int8\"` and decode the response to `list[float]` internally \u2014 see [PR #3344](https://github.com/deepset-ai/haystack-core-integrations/pull/3344). 
The vector dimension for `pplx-embed-v1-0.6b` is **1024**; keep that in mind when you create the Qdrant collection below.\n" ] }, { @@ -164,53 +166,25 @@ "perplexity:embeddings" ] }, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "returned 2 vectors of dim 1024\n" - ] - } - ], + "outputs": [], "source": [ - "import base64\n", - "import httpx\n", - "import numpy as np\n", + "from haystack_integrations.components.embedders.perplexity import (\n", + " PerplexityDocumentEmbedder,\n", + " PerplexityTextEmbedder,\n", + ")\n", "\n", - "EMBEDDINGS_URL = \"https://api.perplexity.ai/v1/embeddings\"\n", "EMBEDDING_MODEL = \"pplx-embed-v1-0.6b\"\n", "EMBEDDING_DIM = 1024\n", "\n", + "doc_embedder = PerplexityDocumentEmbedder(model=EMBEDDING_MODEL)\n", + "text_embedder = PerplexityTextEmbedder(model=EMBEDDING_MODEL)\n", "\n", - "def embed_texts(texts: list[str]) -> list[list[float]]:\n", - " \"\"\"Call Perplexity /v1/embeddings and return float32 vectors.\"\"\"\n", - " resp = httpx.post(\n", - " EMBEDDINGS_URL,\n", - " headers={\n", - " \"Authorization\": f\"Bearer {os.environ['PERPLEXITY_API_KEY']}\",\n", - " \"Content-Type\": \"application/json\",\n", - " \"X-Pplx-Integration\": \"haystack/cookbook-live-research-agent\",\n", - " },\n", - " json={\n", - " \"model\": EMBEDDING_MODEL,\n", - " \"input\": texts,\n", - " \"encoding_format\": \"base64_int8\",\n", - " },\n", - " timeout=60.0,\n", - " )\n", - " resp.raise_for_status()\n", - " out = []\n", - " for item in resp.json()[\"data\"]:\n", - " raw = base64.b64decode(item[\"embedding\"])\n", - " vec = np.frombuffer(raw, dtype=np.int8).astype(np.float32)\n", - " out.append(vec.tolist())\n", - " return out\n", + "doc_embedder.warm_up()\n", + "text_embedder.warm_up()\n", "\n", - "\n", - "# Quick sanity check.\n", - "sample = embed_texts([\"hello world\", \"retrieval augmented generation\"])\n", - "print(\"returned\", len(sample), \"vectors of dim\", len(sample[0]))\n" + "# Quick sanity check \u2014 
embed a single query string.\n", + "sample = text_embedder.run(text=\"retrieval augmented generation\")\n", + "print(\"returned vector of dim\", len(sample[\"embedding\"]))\n" ] }, { @@ -223,7 +197,7 @@ "\n", "## 3. Qdrant document store\n", "\n", - "The Qdrant collection is created with `embedding_dim=1024` to match `pplx-embed-v1-0.6b`. We keep `recreate_index=True` here so re-running the notebook from scratch gives reproducible output — change to `False` once you want the index to persist between sessions.\n" + "The Qdrant collection is created with `embedding_dim=1024` to match `pplx-embed-v1-0.6b`. We keep `recreate_index=True` here so re-running the notebook from scratch gives reproducible output \u2014 change to `False` once you want the index to persist between sessions.\n" ] }, { @@ -291,7 +265,7 @@ }, { "cell_type": "code", - "execution_count": 5, + "execution_count": null, "id": "seed-docs", "metadata": { "execution": { @@ -305,54 +279,7 @@ "seed" ] }, - "outputs": [ - { - "name": "stderr", - "output_type": "stream", - "text": [ - "/tmp/ipykernel_1731/3113823150.py:26: Warning: Mutating attribute 'embedding' on an instance of 'Document' can lead to unexpected behavior by affecting other parts of the pipeline that use the same dataclass instance. Use `dataclasses.replace(instance, embedding=new_value)` instead. See https://docs.haystack.deepset.ai/docs/custom-components#requirements for details.\n", - " doc.embedding = vec\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "\r\n", - " 0%| | 0/2 [00:00\n", "## 5. `retrieve_from_index` tool\n", "\n", - "Embed the query with the same model used at indexing time, then pull the top-k from Qdrant. The tool returns a compact JSON payload — title, source URL, snippet, score — because the agent has to fit it into its context window.\n" + "Embed the query with the same model used at indexing time, then pull the top-k from Qdrant. 
The tool returns a compact JSON payload \u2014 title, source URL, snippet, score \u2014 because the agent has to fit it into its context window.\n" ] }, { @@ -420,7 +346,7 @@ }, { "cell_type": "code", - "execution_count": 7, + "execution_count": null, "id": "retriever-tool", "metadata": { "execution": { @@ -434,30 +360,7 @@ "component:retriever" ] }, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "{\n", - " \"hits\": [\n", - " {\n", - " \"title\": \"Haystack overview\",\n", - " \"source\": \"https://haystack.deepset.ai/\",\n", - " \"snippet\": \"Haystack is an open-source LLM framework by deepset. You compose components like retrievers, generators, and embedders into pipelines, and add tool-calling Agents on top.\",\n", - " \"score\": 0.6988\n", - " },\n", - " {\n", - " \"title\": \"Haystack Agent component\",\n", - " \"source\": \"https://docs.haystack.deepset.ai/docs/agent\",\n", - " \"snippet\": \"In Haystack 2.x, the Agent component takes a chat generator plus a list of Tool objects, loops over tool calls, and exits when the model emits a final answer (the 'text' exit condition).\",\n", - " \"score\": 0.4161\n", - " }\n", - " ]\n", - "}\n" - ] - } - ], + "outputs": [], "source": [ "from haystack_integrations.components.retrievers.qdrant import QdrantEmbeddingRetriever\n", "\n", @@ -465,7 +368,7 @@ "\n", "\n", "def retrieve_from_index(query: str, top_k: int = 4) -> dict:\n", - " query_emb = embed_texts([query])[0]\n", + " query_emb = text_embedder.run(text=query)[\"embedding\"]\n", " hits = retriever.run(query_embedding=query_emb, top_k=top_k)[\"documents\"]\n", " return {\n", " \"hits\": [\n", @@ -480,7 +383,7 @@ " }\n", "\n", "\n", - "# Smoke test — should retrieve the seed docs about Haystack.\n", + "# Smoke test \u2014 should retrieve the seed docs about Haystack.\n", "preview = retrieve_from_index(\"What is Haystack?\", top_k=2)\n", "print(json.dumps(preview, indent=2)[:800])\n" ] @@ -576,7 +479,7 @@ }, { "cell_type": 
"code", - "execution_count": 9, + "execution_count": null, "id": "ingest-tool", "metadata": { "execution": { @@ -591,61 +494,9 @@ "component:embedder" ] }, - "outputs": [ - { - "name": "stderr", - "output_type": "stream", - "text": [ - "/tmp/ipykernel_1731/3725313797.py:29: Warning: Mutating attribute 'embedding' on an instance of 'Document' can lead to unexpected behavior by affecting other parts of the pipeline that use the same dataclass instance. Use `dataclasses.replace(instance, embedding=new_value)` instead. See https://docs.haystack.deepset.ai/docs/custom-components#requirements for details.\n", - " c.embedding = v\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "\r\n", - " 0%| | 0/2 [00:00\n", "## 9. Run the agent on three questions\n", "\n", - "* **Q1** is covered by the seed docs — should be one `retrieve_from_index` call, then an answer.\n", - "* **Q2** isn't covered — the agent should fall through to `web_search`, pick a URL, `ingest_url` it, and retrieve again before answering.\n", + "* **Q1** is covered by the seed docs \u2014 should be one `retrieve_from_index` call, then an answer.\n", + "* **Q2** isn't covered \u2014 the agent should fall through to `web_search`, pick a URL, `ingest_url` it, and retrieve again before answering.\n", "* **Q3** asks something that *should* now be in the index thanks to Q2's ingest, so the agent can stay local.\n", "\n", "For each run we print the message timeline (role + tool calls), then the final answer.\n" @@ -913,7 +762,7 @@ " [10] role=assistant tool_calls=[] results=0 text_len=455\n", "\n", "FINAL ANSWER:\n", - "The Haystack **Agent** component is a pipeline component that lets an LLM **reason through a task, use tools when needed, and produce a final answer**. 
In Haystack, agents are designed for workflows where the model may need multiple steps—such as deciding which tool to call, gathering information, and then responding—rather than just generating a single direct output from a prompt. [Haystack docs](https://docs.haystack.deepset.ai/reference/agents-api)\n" + "The Haystack **Agent** component is a pipeline component that lets an LLM **reason through a task, use tools when needed, and produce a final answer**. In Haystack, agents are designed for workflows where the model may need multiple steps\u2014such as deciding which tool to call, gathering information, and then responding\u2014rather than just generating a single direct output from a prompt. [Haystack docs](https://docs.haystack.deepset.ai/reference/agents-api)\n" ] }, { @@ -983,7 +832,7 @@ " [10] role=assistant tool_calls=[] results=0 text_len=312\n", "\n", "FINAL ANSWER:\n", - "According to the official Haystack integrations page, the **`perplexity-haystack`** package provides **components for using Perplexity models within Haystack pipelines**—specifically support for **chat generators and rankers**. [https://haystack.deepset.ai/integrations](https://haystack.deepset.ai/integrations)\n" + "According to the official Haystack integrations page, the **`perplexity-haystack`** package provides **components for using Perplexity models within Haystack pipelines**\u2014specifically support for **chat generators and rankers**. [https://haystack.deepset.ai/integrations](https://haystack.deepset.ai/integrations)\n" ] }, { @@ -1112,7 +961,7 @@ " [14] role=assistant tool_calls=[] results=0 text_len=445\n", "\n", "FINAL ANSWER:\n", - "`perplexity-haystack` wraps Perplexity’s **chat-completions API** and **search API**. 
It ships Haystack components for both: **`PerplexityChatGenerator`** for chat/completions and **`PerplexityWebSearch`** for web search.[[Haystack docs](https://docs.haystack.deepset.ai/docs/perplexity)][[PyPI](https://pypi.org/project/perplexity-haystack/)][[GitHub](https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/perplexity)]\n" + "`perplexity-haystack` wraps Perplexity\u2019s **chat-completions API** and **search API**. It ships Haystack components for both: **`PerplexityChatGenerator`** for chat/completions and **`PerplexityWebSearch`** for web search.[[Haystack docs](https://docs.haystack.deepset.ai/docs/perplexity)][[PyPI](https://pypi.org/project/perplexity-haystack/)][[GitHub](https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/perplexity)]\n" ] }, { @@ -1127,11 +976,11 @@ ], "source": [ "QUESTIONS = [\n", - " # Q1 — answerable from the seed docs alone.\n", + " # Q1 \u2014 answerable from the seed docs alone.\n", " \"What is the Haystack Agent component, briefly?\",\n", - " # Q2 — not in the index. Forces web_search + ingest_url.\n", + " # Q2 \u2014 not in the index. Forces web_search + ingest_url.\n", " \"What does the perplexity-haystack integration package provide, according to the official Haystack integrations page?\",\n", - " # Q3 — should now hit the page Q2 ingested.\n", + " # Q3 \u2014 should now hit the page Q2 ingested.\n", " \"Which Perplexity APIs does the perplexity-haystack package wrap, and which Haystack component classes does it ship?\",\n", "]\n", "\n", @@ -1169,17 +1018,17 @@ "\n", "## 10. 
Where to go next\n", "\n", - "* Swap `recreate_index=True` to `False` and let the index accumulate across sessions — every web answer the agent gives stays available offline.\n", + "* Swap `recreate_index=True` to `False` and let the index accumulate across sessions \u2014 every web answer the agent gives stays available offline.\n", "* Bring in `PerplexityContextualizedEmbedder` (`/v1/contextualizedembeddings`) instead of the static doc embedder if your corpus has heavy section structure or codebases.\n", "* Add a `cite_index` tool that returns the embedded chunk IDs alongside the answer, so downstream UIs can hyperlink back to the source.\n", - "* Try a different reasoning model — `chat = PerplexityChatGenerator(model=\"anthropic/claude-sonnet-4-5\")` works via the Agent API too.\n", + "* Try a different reasoning model \u2014 `chat = PerplexityChatGenerator(model=\"anthropic/claude-sonnet-4-5\")` works via the Agent API too.\n", "\n", "### References\n", - "* Perplexity Agent API — \n", - "* Perplexity Search API — \n", - "* Perplexity Embeddings API — \n", - "* Haystack Perplexity integration — \n", - "* Qdrant document store — \n" + "* Perplexity Agent API \u2014 \n", + "* Perplexity Search API \u2014 \n", + "* Perplexity Embeddings API \u2014 \n", + "* Haystack Perplexity integration \u2014 \n", + "* Qdrant document store \u2014 \n" ] } ], @@ -1247,4 +1096,4 @@ }, "nbformat": 4, "nbformat_minor": 5 -} +} \ No newline at end of file