Commit e6491d9 (parent 52926ca): quest 1

12 files changed: +396 −197 lines

01-Local-AI-Development/README.md
![Foundry Local](./assets/foundry-local-logo.png)

Livestream starting soon! **Click below to watch live.**

[![Reactor Livestream](./assets/poster-quest1.png)](https://www.youtube.com/live/GHH50rDlLn0?si=-i3hPYq1o6H271_z)

## Overview

In this Quest, you will unlock the power of **Local AI** using **Microsoft Foundry Local**. With Foundry Local, you can run AI models and integrate them into your applications directly on-device, with no reliance on the public cloud.

With Foundry Local, you gain:

- **Privacy & Security**: Keep sensitive data on your device.
- **Low Latency**: Instant responses without network delays.
- **Cost Efficiency**: No cloud compute costs incurred.
- **Offline Access**: AI capabilities even without internet connectivity.

> [!NOTE]
>
> **Hackathon Award Category: Offline-Ready Award**
>
> As part of the Build-a-thon Hack!, we have a special award category recognizing the best-performing AI solution with standout offline capabilities (local inference).
>
> Consider building an app that:
>
> - Processes sensitive data entirely on-device.
> - Uses Foundry Local for reasoning and a cloud storage service for optional sync or analytics.
>
> Highlight in your submission how you:
>
> - Achieve **privacy** (no sensitive data leaving the device).
> - Optimize for **latency** using local inference.

First step: **installation instructions** for your OS:
<details>
<summary>Install on Windows</summary>
<br>

```bash
winget install Microsoft.FoundryLocal
```
</details>

<details>
<summary>Install on macOS</summary>
<br>

```bash
brew tap microsoft/foundrylocal
brew install foundrylocal
```
</details>

Once installed, run the following command to start the Foundry Local service:

```bash
foundry service start
```

Let's dive into what is happening under the hood.
## Foundry Local Architecture

What you've started is the **Foundry Local Service**, which provides an **OpenAI-compatible REST server** that acts as a bridge to the **ONNX Runtime** inference engine running on your device.

![Foundry service start](./assets/foundry-service-start.png)

This API endpoint is dynamically allocated each time the service starts, and you can interact with it in various ways that we'll cover below.
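Because the server is OpenAI-compatible, any OpenAI-style HTTP client can talk to it. The sketch below is illustrative, not official: the base URL (including the port) is an assumption you must replace with the endpoint your running service actually reports, and the model ID must match one you have loaded.

```javascript
// Sketch: calling the OpenAI-compatible REST server directly.
// ASSUMPTION: the port below is a placeholder -- the real port is
// dynamically allocated each time the service starts; read it from
// the service output (or let the SDK discover it, as covered later).
const ASSUMED_BASE_URL = "http://localhost:5273/v1";

// Build a standard OpenAI-style chat completion payload.
function buildChatRequest(modelId, userText) {
  return {
    model: modelId,
    messages: [{ role: "user", content: userText }],
    temperature: 0.6
  };
}

// Send the request with Node's built-in fetch (Node 18+).
async function chat(modelId, userText) {
  const res = await fetch(`${ASSUMED_BASE_URL}/chat/completions`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildChatRequest(modelId, userText))
  });
  const data = await res.json();
  return data.choices[0].message.content;
}
```

Nothing here is Foundry-specific except the base URL: the request and response shapes follow the standard OpenAI chat completions format.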
Underneath the Foundry Local Service is the **ONNX Runtime**, a high-performance inference engine optimized for running AI models on local hardware (CPU/GPU/NPU).

Wrapping around these components is a **model management layer** that comprises:

- a **management service** to handle model lifecycle operations.
- a **model cache** on disk, which stores downloaded models so they are readily available for inference without re-downloading.

![Foundry Local Architecture](./assets/foundry-local-architecture.png)
## Model Lifecycle on Foundry Local

To list models available for local inferencing, run:

```bash
foundry model list
```

![Foundry model list](./assets/foundry-model-list.png)

Each model has:

- An **alias**: a friendly name for easy reference (e.g., `phi-3.5-mini`).
- **Device-compatible variants** (e.g., CPU, GPU, NPU) so Foundry Local can automatically leverage your hardware optimally.
- A **model ID**: a unique identifier for precise model selection (e.g., `Phi-4-generic-gpu:1`).
- **License, size, and task** information.

The Foundry Local model lifecycle consists of the following stages:

#### 1. Download model

- Fetch the model from the Foundry model catalog to local disk. Run:

```bash
foundry model download <model-alias>
```

*Get `<model-alias>` from the `Alias` column in the model list output.* Downloaded models are automatically cached for more efficient subsequent use. You can inspect the model cache with `foundry cache ls`.
#### 2. Load model

- Load the model into the local management service's memory for inference. Run:

```bash
foundry model load <model-alias>
```

#### 3. Run model

- Execute inference requests against the loaded model. Run:

```bash
foundry model run <model-alias>
```

*If you run a model that hasn't been downloaded or loaded yet, Foundry Local will automatically handle those steps for you.*

#### 4. Unload model

- Remove the model from memory to free up resources when not in use. Run:

```bash
foundry model unload <model-alias>
```
## Developer Experience on Foundry Local

Foundry Local provides multiple ways to interact with and integrate local AI models:

### 1. Command Line Interface (CLI)

In step 4 of the model lifecycle above, we used the CLI to run inference against a model. This is a powerful way to quickly experiment with local models directly from your terminal.

![Foundry model run CLI](./assets/foundry-local-cli.png)
### 2. AI Toolkit Extension for VS Code

The **AI Toolkit for VS Code** extension complements discovery and experimentation with local models by providing a graphical interface directly within VS Code.

- **Step 1**: Install the [AI Toolkit extension](https://marketplace.visualstudio.com/items?itemName=ms-windows-ai-studio.windows-ai-studio) from the Extensions marketplace.

- **Step 2**: Open the `AI Toolkit` extension. Under `Local Resources`, hover over `Models` and click the `+` icon.

- **Step 3**: Select `Add Foundry Local Model`, then choose a model from the dropdown. Click `Ok`.

![Add model from Foundry Local](./assets/add-model-from-foundry-local.png)

Once added *(downloading and loading the model may take a few moments)*, you can interact with it in two ways:

#### Option A: Model Playground

Use the built-in playground to test your local model with chat completions or other inference requests.

- **Step 4**: Under **Tools** >> **+ Build**, select **Model Playground**, and in the **Model** setting, choose your Foundry Local model.

![AI Toolkit Model Playground](./assets/phi-3.5-mini-foundry-local.png)
#### Option B: Power GitHub Copilot with Local Models

You can use your Foundry Local models directly with GitHub Copilot Chat, keeping your AI coding assistance entirely on-device for maximum privacy.

> [!TIP]
>
> This is ideal for sensitive codebases or regulated environments where data cannot leave your device, and for working fully offline.

*Ensure you have the [GitHub Copilot extension](https://marketplace.visualstudio.com/items?itemName=GitHub.copilot) installed.*

- **Step 4**: Open GitHub Copilot Chat and click the **model picker** dropdown.

- **Step 5**: Click **Manage models** at the bottom of the model picker, and expand the **Foundry Local via AI Toolkit** section.

- **Step 6**: Select your preferred local model (e.g., `phi-3.5-mini`, `Qwen`, or other supported models). Right-click and select **Show in the Chat model picker**. AI Toolkit will prompt you to download the model if it hasn't been cached locally.

![GitHub Copilot with Foundry Local](./assets/copilot-foundry-local.png)

Once configured, GitHub Copilot Chat will use your local Foundry model for all responses. You can switch between local and cloud models at any time using the model picker.

**Recommended Models for Code Tasks**:

| Model | Best For |
|-------|----------|
| **Phi models** | Reasoning, code generation, natural language understanding |
| **Qwen models** | Multilingual code generation |
| **GPT models** | Advanced capabilities and broad compatibility |

> [!NOTE]
>
> For the Offline-Ready Award, using GitHub Copilot with Foundry Local demonstrates a powerful offline development workflow. Highlight this capability in your submission!
### 3. Software Development Kits (SDKs)

Foundry Local provides SDKs to programmatically send requests to the local management service. Since the endpoint is dynamically allocated each time the service starts, the SDK handles endpoint discovery and management for you (the control plane).
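To make that control plane concrete, here is a minimal sketch of what endpoint discovery looks like from application code. It uses only SDK members that appear in this quest's own example (`init()`, `endpoint`, `apiKey`); treat it as a sketch to adapt rather than a complete program, and note it requires Foundry Local installed and `foundry-local-sdk` in `node_modules` to actually run.

```javascript
// Sketch: discovering the dynamically allocated endpoint via the SDK.
// Uses only SDK members shown in this quest: init(), endpoint, apiKey.
async function discoverEndpoint(alias) {
  // Dynamic import so this sketch only needs the SDK when actually run.
  const { FoundryLocalManager } = await import("foundry-local-sdk");
  const manager = new FoundryLocalManager();

  // init() connects to the service and prepares the requested model,
  // downloading and loading it if necessary.
  const modelInfo = await manager.init(alias);

  // The SDK exposes the per-session base URL and API key for you,
  // so you never hard-code a port.
  return {
    modelId: modelInfo.id,
    baseURL: manager.endpoint,
    apiKey: manager.apiKey
  };
}

// Usage (with Foundry Local and the SDK installed):
// const { modelId, baseURL } = await discoverEndpoint("phi-3.5-mini");
// console.log(`Model ${modelId} served at ${baseURL}`);
```

The returned `baseURL` and `apiKey` are exactly what the LangChain client in the exercise below consumes.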
#### Step 1: Initialize a New Project

Create a parent folder for your Build-a-thon projects and navigate into it:

```bash
mkdir buildathon
cd buildathon
```

Create a new folder for this quest, navigate into it, and initialize a Node.js project:

```bash
mkdir foundry-local-quest
cd foundry-local-quest
npm init -y
npm pkg set type=module
```

#### Step 2: Install the Foundry Local SDK and LangChain

To interact with Foundry Local programmatically, install the Foundry Local SDK along with LangChain. LangChain is a powerful framework for building AI applications and Agents, providing pre-built components and patterns to streamline AI development.

> [!NOTE]
> After completing this quest, you can visit our free [LangChain.js for Beginners Course](https://github.com/microsoft/langchainjs-for-beginners) to learn more about building AI apps & Agents with LangChain.

```bash
npm install foundry-local-sdk @langchain/openai @langchain/core
```
#### Step 3 Exercise: Create an AI Insight Mapper App

Scenario: you want to extract structured data from unstructured inputs, such as customer support emails, for an automated CRM system.

Create `insight_mapper.js` and add the following code:
<details open>
<summary>insight_mapper.js</summary>

```javascript
// InsightMapper: extract structured JSON from unstructured text
// using a local model served by Foundry Local.
import { FoundryLocalManager } from "foundry-local-sdk";
import { ChatOpenAI } from "@langchain/openai";
import { ChatPromptTemplate } from "@langchain/core/prompts";

const alias = "phi-3.5-mini";

// Connect to the Foundry Local service and prepare the model,
// downloading and loading it if necessary.
const foundryLocalManager = new FoundryLocalManager();

const modelInfo = await foundryLocalManager.init(alias);
console.log("Model Info:", modelInfo);

// Point the LangChain OpenAI client at the local endpoint.
const llm = new ChatOpenAI({
  model: modelInfo.id,
  configuration: {
    baseURL: foundryLocalManager.endpoint,
    apiKey: foundryLocalManager.apiKey
  },
  temperature: 0.6,
  streaming: false,
  maxTokens: 5000
});

const prompt = ChatPromptTemplate.fromMessages([
  {
    role: "system",
    content: [
      "You are InsightMapper, an expert that extracts consistent structured data as JSON.",
      "Always answer with VALID JSON using double quotes.",
      "Never add commentary, markdown, or surrounding text.",
      "If a field cannot be determined, output null for that field."
    ].join(" ")
  },
  {
    role: "user",
    content: [
      "Document type: {document_type}",
      "Target JSON schema:",
      "{json_schema}",
      "",
      "Unstructured text:",
      "{input}",
      "",
      "Return ONLY the JSON formatted according to the schema."
    ].join("\n")
  }
]);

const chain = prompt.pipe(llm);

const demoName = "InsightMapper JSON Extractor";
const documentType = "customer support email";
const schemaDefinition = `{
  "documentType": "string",
  "sender": "string",
  "recipient": "string",
  "contactInfo": "string",
  "subject": "string",
  "summary": "string",
  "sentiment": "one of: positive | neutral | negative",
  "actionItems": [
    {
      "owner": "string",
      "description": "string",
      "dueDate": "ISO 8601 date or null"
    }
  ],
  "priority": "one of: low | medium | high"
}`;

const messyInput = `Hey Support Team – just checking in.

Zava Corp here (Amanda from Ops). Our order #49302 still hasn't shipped and the portal shows ''processing'' for 6 days. We promised our retail partner delivery by next Friday, so this is urgent.

Can someone confirm:
- When will it leave the warehouse?
- Do we need to upgrade shipping to hit the deadline?

Loop in Jessie if you need PO details. Please call me at 555-239-4433.

Thanks!`;

console.log(`\nRunning ${demoName}...`);

chain.invoke({
  document_type: documentType,
  json_schema: schemaDefinition,
  input: messyInput
}).then(aiMsg => {
  // The model's content may be a string or an array of content parts.
  const rawContent = Array.isArray(aiMsg.content)
    ? aiMsg.content.map(part => typeof part === "string" ? part : part?.text ?? "").join("")
    : String(aiMsg.content);

  try {
    const parsed = JSON.parse(rawContent);
    console.log("\nStructured JSON Output:\n", JSON.stringify(parsed, null, 2));
  } catch (parseError) {
    console.warn("\nReceived non-JSON output, displaying raw content:");
    console.log(rawContent);
  }
}).catch(err => {
  console.error("Error:", err);
});
```

</details>
Run the code using `node insight_mapper.js`.

> Note that the initial run might be slow if the model is still being downloaded.
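Small local models sometimes ignore the system prompt and wrap their JSON in markdown fences or extra prose. If you hit the "non-JSON output" warning, an optional hardening step like the hypothetical helper below (not part of the quest code) can often recover the payload before parsing:

```javascript
// Hypothetical hardening helper: strip markdown code fences and
// surrounding chatter before JSON.parse. Not part of insight_mapper.js;
// an optional tweak for models that wrap JSON in ``` fences.
function extractJson(rawContent) {
  let text = rawContent.trim();

  // Prefer the contents of a ```json ... ``` (or plain ```) fence.
  const fenceMatch = text.match(/```(?:json)?\s*([\s\S]*?)```/i);
  if (fenceMatch) {
    text = fenceMatch[1].trim();
  }

  // Fall back to the outermost {...} span if extra prose surrounds it.
  const start = text.indexOf("{");
  const end = text.lastIndexOf("}");
  if (start !== -1 && end > start) {
    text = text.slice(start, end + 1);
  }

  return JSON.parse(text); // throws if still not valid JSON
}
```

In `insight_mapper.js`, you could call `extractJson(rawContent)` inside the existing `try` block in place of `JSON.parse(rawContent)`.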
#### Step 4: Code Explanation and Modifications Using GitHub Copilot

GitHub Copilot, your AI peer programmer, can help you understand the code above and make further modifications. To get started, ensure you have [access to GitHub Copilot](https://github.com/copilot) (a free tier is available).

Here are some suggested prompts to use. Iterate as needed:

<details>
<summary>Code Explanation</summary>

```
@workspace /explain the purpose and flow of the code in #insight_mapper.js
```

- `@workspace` tells Copilot to focus on the project context.
- `/explain` is a pre-defined command to generate explanations.
- `#insight_mapper.js` specifies the target file.

</details>

<details>
<summary>Build a Simple API Server</summary>

```
Generate a minimal Node.js HTTP server without frameworks. Reuse "foundry-local-sdk" and the same alias from #file:insight_mapper.js to initialize FoundryLocalManager once at startup, obtain the model info, and keep the chain ready. Expose POST /extract that reads raw JSON from the request body, invokes the InsightMapper chain with fields "document_type", "json_schema", and "input" taken from the payload, and returns the model's JSON response unmodified. Include instructions to run with "node server.js", ensure error handling for JSON parsing and chain failures, and keep the code under 80 lines. Ensure you test the server.
```

</details>

<details>
<summary>Create a Simple HTML UI</summary>

```
Provide a standalone HTML file (no external libraries) containing a form with fields for document type, JSON schema, and unstructured text. On submit, prevent default behaviour, gather the values, POST them as JSON to <INSERT YOUR API ENDPOINT HERE>, and display the returned JSON below the form with basic formatting. Handle network or parsing errors gracefully, keep styles minimal and inline, and ensure the markup is concise and easy to copy-paste.
```

Example UI:

![Screenshot of Sample UI](./assets/sample-ui.png)
</details>

## Stay connected

Have a question, project, or insight to share? Post in the [Local AI discussion hub](https://github.com/Azure-Samples/JavaScript-AI-Buildathon/discussions/88).

## AI Note

This quest was partially created with the help of AI. The author reviewed and revised the content to ensure accuracy and quality.