25 commits
- `9457d87` small spelling corrections (bteleuca, Dec 8, 2025)
- `091b094` added missing entry for gpt_4o_mini_az_2024_07_18 in llm_fact_sheet.csv (bteleuca, Dec 9, 2025)
- `091a547` Corrected multiple typos in llm_fact_sheet.csv: (bteleuca, Dec 9, 2025)
- `3012486` updated scoreModel, new options, modelConfiguration and README (bteleuca, Dec 9, 2025)
- `4ee764b` temporary remove -q to see all messages (bteleuca, Dec 9, 2025)
- `ec14b8d` trying new settings to deblock the issue (bteleuca, Dec 9, 2025)
- `ca7f33d` fix huggingface-hub>=0.18.0 instead of huggingface-hub[cli] on 2025.0… (bteleuca, Dec 9, 2025)
- `2f73a98` Standardize pip upgrade across all LLM model definitions (bteleuca, Dec 9, 2025)
- `24405fe` DCO Remediation Commit for Bogdan Teleuca <bogdan.teleuca@gmail.com> (bteleuca, Dec 9, 2025)
- `37b90f0` DCO Remediation Commit for Bogdan Teleuca <bogdan.teleuca@sas.com> (bteleuca, Dec 9, 2025)
- `8591503` Fix Monitoring Baseline VA report data item mismatches (bteleuca, Dec 11, 2025)
- `aa8f55d` DCO Remediation Commit for Bogdan Teleuca <bogdan.teleuca@sas.com> (bteleuca, Dec 11, 2025)
- `c7b4f8e` fixed endpoint default issue (bteleuca, Dec 17, 2025)
- `cf27ef8` DCO Remediation Commit for Bogdan Teleuca <bogdan.teleuca@sas.com> (bteleuca, Dec 17, 2025)
- `3f890ee` changed from ProviderName to match the key in the llm-prompt-builder.… (bteleuca, Dec 18, 2025)
- `1f58cbd` DCO Remediation Commit for Bogdan Teleuca <bogdan.teleuca@sas.com> (bteleuca, Dec 18, 2025)
- `977bdb2` GPT_4.1 from Azure OpenAI (bteleuca, Dec 18, 2025)
- `95cb121` DCO Remediation Commit for Bogdan Teleuca <bogdan.teleuca@sas.com> (bteleuca, Dec 18, 2025)
- `8e7e65f` gpt-4.1 fact sheet and modelConfiguration (bteleuca, Dec 18, 2025)
- `aca4bb4` DCO Remediation Commit for Bogdan Teleuca <bogdan.teleuca@sas.com> (bteleuca, Dec 18, 2025)
- `e68f60b` cost issue filled (bteleuca, Dec 18, 2025)
- `3726223` DCO Remediation Commit for Bogdan Teleuca <bogdan.teleuca@sas.com> (bteleuca, Dec 18, 2025)
- `16b8ec3` pandas series parse options (bteleuca, Dec 19, 2025)
- `8e5e632` sizing tag for publishing (bteleuca, Dec 19, 2025)
- `5d5fedf` DCO Remediation Commit for Bogdan Teleuca <bogdan.teleuca@sas.com> (bteleuca, Dec 19, 2025)
393 changes: 359 additions & 34 deletions LLM-Definitions/_Base_Definition/README.md

Large diffs are not rendered by default.

4 changes: 4 additions & 0 deletions LLM-Definitions/_Base_Definition/requirements.json
@@ -1,4 +1,8 @@
[
{
"step": "upgrade pip before pip install",
"command": "pip3 -q install --upgrade pip setuptools wheel"
},
{
"step":"install common packages",
"command":"pip3 -q install requests tiktoken numpy==1.26.4"
4 changes: 4 additions & 0 deletions LLM-Definitions/claude_2_0/requirements.json
@@ -1,4 +1,8 @@
[
{
"step": "upgrade pip before pip install",
"command": "pip3 -q install --upgrade pip setuptools wheel"
},
{
"step":"install common packages",
"command":"pip3 -q install requests tiktoken numpy==1.26.4"
4 changes: 4 additions & 0 deletions LLM-Definitions/claude_2_1/requirements.json
@@ -1,4 +1,8 @@
[
{
"step": "upgrade pip before pip install",
"command": "pip3 -q install --upgrade pip setuptools wheel"
},
{
"step":"install common packages",
"command":"pip3 -q install requests tiktoken numpy==1.26.4"
4 changes: 4 additions & 0 deletions LLM-Definitions/claude_haiku_3/requirements.json
@@ -1,4 +1,8 @@
[
{
"step": "upgrade pip before pip install",
"command": "pip3 -q install --upgrade pip setuptools wheel"
},
{
"step":"install common packages",
"command":"pip3 -q install requests tiktoken numpy==1.26.4"
4 changes: 4 additions & 0 deletions LLM-Definitions/claude_opus_3/requirements.json
@@ -1,4 +1,8 @@
[
{
"step": "upgrade pip before pip install",
"command": "pip3 -q install --upgrade pip setuptools wheel"
},
{
"step":"install common packages",
"command":"pip3 -q install requests tiktoken numpy==1.26.4"
4 changes: 4 additions & 0 deletions LLM-Definitions/claude_sonnet_3/requirements.json
@@ -1,4 +1,8 @@
[
{
"step": "upgrade pip before pip install",
"command": "pip3 -q install --upgrade pip setuptools wheel"
},
{
"step":"install common packages",
"command":"pip3 -q install requests tiktoken numpy==1.26.4"
4 changes: 4 additions & 0 deletions LLM-Definitions/claude_sonnet_3_5/requirements.json
@@ -1,4 +1,8 @@
[
{
"step": "upgrade pip before pip install",
"command": "pip3 -q install --upgrade pip setuptools wheel"
},
{
"step":"install common packages",
"command":"pip3 -q install requests tiktoken numpy==1.26.4"
4 changes: 4 additions & 0 deletions LLM-Definitions/gemini_flash_15_001/requirements.json
@@ -1,4 +1,8 @@
[
{
"step": "upgrade pip before pip install",
"command": "pip3 -q install --upgrade pip setuptools wheel"
},
{
"step":"install common packages",
"command":"pip3 -q install requests numpy==1.26.4"
4 changes: 4 additions & 0 deletions LLM-Definitions/gemini_flash_15_002/requirements.json
@@ -1,4 +1,8 @@
[
{
"step": "upgrade pip before pip install",
"command": "pip3 -q install --upgrade pip setuptools wheel"
},
{
"step":"install common packages",
"command":"pip3 -q install requests numpy==1.26.4"
4 changes: 4 additions & 0 deletions LLM-Definitions/gemini_flash_15_8b/requirements.json
@@ -1,4 +1,8 @@
[
{
"step": "upgrade pip before pip install",
"command": "pip3 -q install --upgrade pip setuptools wheel"
},
{
"step":"install common packages",
"command":"pip3 -q install requests numpy==1.26.4"
4 changes: 4 additions & 0 deletions LLM-Definitions/gemini_flash_25/requirements.json
@@ -1,4 +1,8 @@
[
{
"step": "upgrade pip before pip install",
"command": "pip3 -q install --upgrade pip setuptools wheel"
},
{
"step":"install common packages",
"command":"pip3 -q install requests numpy==1.26.4"
4 changes: 4 additions & 0 deletions LLM-Definitions/gemini_flash_lite_25/requirements.json
@@ -1,4 +1,8 @@
[
{
"step": "upgrade pip before pip install",
"command": "pip3 -q install --upgrade pip setuptools wheel"
},
{
"step":"install common packages",
"command":"pip3 -q install requests numpy==1.26.4"
4 changes: 4 additions & 0 deletions LLM-Definitions/gemini_pro_15/requirements.json
@@ -1,4 +1,8 @@
[
{
"step": "upgrade pip before pip install",
"command": "pip3 -q install --upgrade pip setuptools wheel"
},
{
"step":"install common packages",
"command":"pip3 -q install requests numpy==1.26.4"
4 changes: 4 additions & 0 deletions LLM-Definitions/gemini_pro_25/requirements.json
@@ -1,4 +1,8 @@
[
{
"step": "upgrade pip before pip install",
"command": "pip3 -q install --upgrade pip setuptools wheel"
},
{
"step":"install common packages",
"command":"pip3 -q install requests numpy==1.26.4"
130 changes: 130 additions & 0 deletions LLM-Definitions/gpt_41_az_2025_01_01/Model-Card.md
@@ -0,0 +1,130 @@
# GPT-4.1 Azure OpenAI / Azure AI Foundry

## Details

The GPT-4.1 series is the latest iteration of the GPT-4o model family. It is specifically targeted at coding and instruction following, making it better suited to complex technical and coding problems.

**Direct from Azure models** - GPT-4.1 is a select portfolio model curated for its market-differentiated capabilities:
- **Secure and managed by Microsoft**: Purchase and manage models directly through Azure with a single license, consistent support, and no third-party dependencies, backed by Azure's enterprise-grade infrastructure.
- **Streamlined operations**: Benefit from unified billing, governance, and seamless PTU portability across models hosted on Azure - all part of Microsoft Foundry.
- **Future-ready flexibility**: Access the latest models as they become available, and easily test, deploy, or switch between them within Microsoft Foundry, reducing integration effort.
- **Cost control and optimization**: Scale on demand with pay-as-you-go flexibility or reserve PTUs for predictable performance and savings.

## Key Capabilities

- **Text and image processing**: Multimodal input support
- **JSON Mode**: Structured JSON output generation
- **Parallel function calling**: Execute multiple function calls simultaneously
- **Enhanced accuracy and responsiveness**: Parity with GPT-4 Turbo with Vision on English text and coding tasks
- **Superior multilingual performance**: Improved performance in non-English languages and vision tasks
- **Complex structured outputs**: Support for sophisticated output formatting

## Context and Output

GPT-4.1 increases the context limit to **1M input tokens**, with separate billing for:
- Small context: 128k tokens
- Large context: up to 1M tokens

As with the previous GPT-4o model family, it supports a **16k-token output limit**.
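The tier boundaries above can be sketched as a small helper. This is a minimal illustration, not part of the model definition; the function name is hypothetical, and the constants come from the limits stated in this section:

```python
# Billing tiers described above (input side).
SMALL_CONTEXT = 128_000    # up to 128k input tokens: standard pricing
LARGE_CONTEXT = 1_000_000  # up to 1M input tokens: extended-context billing
MAX_OUTPUT = 16_000        # output cap carried over from GPT-4o

def context_tier(n_tokens: int) -> str:
    """Map an input token count to the billing tier described above."""
    if n_tokens > LARGE_CONTEXT:
        return "over limit"
    return "small" if n_tokens <= SMALL_CONTEXT else "large"

# To count tokens for a real prompt you could use tiktoken (installed by the
# requirements.json steps); the encoding name here is an assumption:
#   n = len(tiktoken.get_encoding("o200k_base").encode(prompt))
print(context_tier(1_000))  # -> small
```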

## Model ID

The GPT-4.1 model is available through Azure OpenAI and Azure AI Foundry.

**Availability**: Standard, Global Standard, Global Batch, Regional Provisioned Throughput, Global Provisioned Throughput, Data Zone Standard, Data Zone Provisioned Throughput, Data Zone Batch

**Lifecycle**: Generally available (initially released in preview)

**Training cut-off date**: Not supplied by provider

## Data, Media and Languages

**Supported data types:**
- **Inputs**: Text, image
- **Outputs**: Text (up to 16k tokens)

**Input Formats**: Text, image processing

**Output Formats**: Text with JSON Mode and support for complex structured outputs

**Supported languages**: Superior performance in non-English languages and vision tasks. Includes support for: en, it, af, es, de, fr, id, ru, pl, uk, el, lv, zh, ar, tr, ja, sw, cy, ko, is, bn, ur, ne, th, pa, mr, te, and many more.

## Use Cases

### Key Use Cases

This iteration of models is specifically targeted for:
- **Complex coding problems**: Better at handling technical programming challenges
- **Instruction following**: Enhanced accuracy in following detailed instructions
- **Technical documentation**: Processing and generating complex technical content
- **Function calling**: Build applications that fetch data or take actions with external systems
- **Long-context processing**: Handle large codebases or conversation histories (up to 1M tokens)

### Out of Scope Use Cases

Prompts and completions are passed through a default configuration of Azure AI Content Safety classification models to detect and prevent the output of harmful content. Learn more about [Azure AI Content Safety](https://learn.microsoft.com/en-us/azure/ai-services/content-safety/). Additional classification models and configuration options are available when you deploy an Azure OpenAI model in production.

## Pricing

Pricing is based on a number of factors, including deployment type and tokens used:
- **Small context** (up to 128k tokens): Standard pricing
- **Large context** (128k - 1M tokens): Separate billing for extended context

See [Azure OpenAI pricing details](https://azure.microsoft.com/en-us/pricing/details/cognitive-services/openai-service/) for current rates.
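The two-tier billing above can be sketched as a rough cost estimator. The rates below are placeholders only, not real prices; always take current figures from the pricing page linked above:

```python
# PLACEHOLDER per-1M-token rates - NOT real Azure prices; substitute the
# current rates from the Azure OpenAI pricing page.
RATE_SMALL_INPUT = 2.00   # USD per 1M input tokens, <= 128k context (assumed)
RATE_LARGE_INPUT = 4.00   # USD per 1M input tokens, > 128k context (assumed)
RATE_OUTPUT = 8.00        # USD per 1M output tokens (assumed)

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Rough cost estimate reflecting the small/large context split above."""
    in_rate = RATE_SMALL_INPUT if input_tokens <= 128_000 else RATE_LARGE_INPUT
    return (input_tokens * in_rate + output_tokens * RATE_OUTPUT) / 1_000_000
```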

## Transparency

### Model Provider

This model is provided through the Azure OpenAI service and Azure AI Foundry.

### Distribution Channels

- Azure OpenAI Service
- Azure AI Foundry

### Relevant Documents

The following documents are applicable:

- [Overview of Responsible AI practices for Azure OpenAI models](https://learn.microsoft.com/en-us/legal/cognitive-services/openai/overview)
- [Transparency Note for Azure OpenAI Service](https://learn.microsoft.com/en-us/legal/cognitive-services/openai/transparency-note)
- [Introducing the GPT-4.1 Series](https://techcommunity.microsoft.com/blog/azure-ai-services-blog/introducing-gpt-4-1-openais-new-flagship-multimodal-model-now-in-preview-on-azu/4357395): OpenAI's new flagship multimodal model now in preview on Azure

### Model Architecture

The provider has not supplied detailed architecture information.

## Responsible AI Considerations

### Built-in Safety Measures

Safety is built into GPT-4.1 from the beginning, and reinforced at every step of the development process:

- **Pre-training filtering**: Content filtering to exclude hate speech, adult content, personal information aggregation, and spam
- **Post-training alignment**: Reinforcement learning with human feedback (RLHF) to improve accuracy and reliability
- **Expert evaluation**: Assessed using both automated and human evaluations according to OpenAI's Preparedness Framework
- **Instruction hierarchy**: Enhanced ability to resist jailbreaks, prompt injections, and system prompt extractions
- **Continuous monitoring**: Ongoing safety improvements as new risks are identified

### Content Filtering

Prompts and completions are passed through a default configuration of Azure AI Content Safety classification models to detect and prevent the output of harmful content. Learn more about [Azure AI Content Safety](https://learn.microsoft.com/en-us/azure/ai-services/content-safety/). Additional classification models and configuration options are available when you deploy an Azure OpenAI model in production.

## Additional Information

### Training and Testing

The provider has not supplied detailed training, testing, and validation information.

### Performance

GPT-4.1 is specifically optimized for:
- **Enhanced coding capabilities**: Better handling of complex technical and coding problems
- **Instruction following**: Improved accuracy in following detailed instructions
- **Parity with GPT-4 Turbo with Vision**: Equivalent performance on English text and coding tasks
- **Superior multilingual performance**: Enhanced performance in non-English languages and vision tasks

### Learn More

For the latest information and updates on GPT-4.1, visit the [Azure AI Foundry Model Catalog](https://ai.azure.com).
140 changes: 140 additions & 0 deletions LLM-Definitions/gpt_41_az_2025_01_01/README.md
@@ -0,0 +1,140 @@
# GPT-4.1 from Azure OpenAI / Azure AI Foundry

Source: https://ai.azure.com/explore/models/gpt-4.1/version/2025-04-14/registry/azure-openai

## Required Items

Azure OpenAI and Azure AI Foundry provide REST APIs for interaction and response generation.

To use a GPT-4.1 model, you need:

- A resource (Azure OpenAI or Azure AI Foundry). See [Create and deploy an Azure OpenAI resource](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/create-resource?pivots=web-portal).
- A deployment of the GPT-4.1 model. See below.
- The endpoint and API key.

## Create a Model Deployment

Before using the Azure OpenAI or Azure AI Foundry resource, deploy the GPT-4.1 model.

### Option A: Deploy via Azure AI Foundry

1. Navigate to [Azure AI Foundry](https://ai.azure.com).
1. Select your project or create a new one.
1. **Create a deployment**:
* Search for and select **GPT-4.1**.
* Deployment name: **gpt-4.1** (**MUST USE THIS EXACT NAME**).
* **Deployment type**: Choose based on your needs (e.g., Global Standard).
* **Model version**: Choose the latest available.
* Tokens per Minute Rate Limit: Choose around 250K, if possible.
* Click **Deploy**.

### Option B: Deploy via Azure OpenAI Studio

1. Inside your Azure OpenAI resource, click **Go to Azure OpenAI Studio** or navigate directly to [oai.azure.com](https://oai.azure.com).
1. Go to **Deployments** > **Create new deployment**.
1. Select **GPT-4.1**:
* Deployment name: **gpt-4.1** (**MUST USE THIS EXACT NAME**).
* **Model version**: Choose the latest available.
* Location: Choose your preferred region.
* Click **Deploy**.

The chat playground will open. Feel free to ask a question to test the deployment.

## Retrieve your Azure OpenAI or Azure AI Foundry Key and Endpoint

Steps:

1. On the Azure portal, search for **Azure OpenAI** or **Azure AI Foundry**.
1. Locate your service.
1. Expand **Resource Management** > **Keys and Endpoint**.
    - **Endpoint**:
        - Azure OpenAI example: `https://westus3.api.cognitive.microsoft.com/`
        - Azure AI Foundry example: `https://sbxbotres.cognitiveservices.azure.com/`
        - **Note**: Save the hostname (e.g., `westus3.api.cognitive.microsoft.com` or `sbxbotres.cognitiveservices.azure.com`). You'll configure this as the `azure_openai_resource` option when scoring the model.
    - **API Key**: Copy any of the keys. You will need it when scoring the model (passed as `API_KEY` in options).

## Configuration Options

This model is **location-independent** and supports both Azure OpenAI and Azure AI Foundry deployments. Configure your endpoint at runtime through the `options` parameter - no code changes required for different Azure regions or deployment types.

### Required Options:

- **`API_KEY`**: Your API key from the Azure portal (required for authentication)

### Endpoint Configuration (choose one):

**Option 1: Full Endpoint URL Override** (recommended for Azure AI Foundry)
- **`endpoint_url`**: Complete chat completions endpoint URL
- Example (Azure AI Foundry): `https://sbxbotres.cognitiveservices.azure.com/openai/deployments/gpt-4.1/chat/completions?api-version=2025-01-01-preview`
- Example (Azure OpenAI): `https://westus3.api.cognitive.microsoft.com/openai/deployments/gpt-4.1/chat/completions?api-version=2025-01-01-preview`
- When set, this is used directly (ignores `azure_openai_resource` and `api_version`)

**Option 2: Resource Hostname** (constructed endpoint)
- **`azure_openai_resource`**: Your Azure resource hostname
- Azure AI Foundry: `sbxbotres.cognitiveservices.azure.com`
- Azure OpenAI: `westus3.api.cognitive.microsoft.com` or short name `my-openai-westus3` (auto-appends `.openai.azure.com`)
- Find this in the **Keys & Endpoint** section of your resource
- **`api_version`**: Azure OpenAI API version (default: `2025-01-01-preview`)
- Used only when `endpoint_url` is not provided
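The resolution order above (a full `endpoint_url` wins outright; otherwise the endpoint is constructed from `azure_openai_resource` and `api_version`) can be sketched as follows. `build_endpoint` is a hypothetical name; the actual logic lives in the model's score script:

```python
def build_endpoint(options: dict) -> str:
    """Sketch of the endpoint resolution rules described above."""
    # Option 1: a full endpoint_url is used directly.
    if "endpoint_url" in options:
        return options["endpoint_url"]
    # Option 2: construct the URL from the resource hostname.
    resource = options["azure_openai_resource"]
    if "." not in resource:  # short name -> auto-append the default domain
        resource += ".openai.azure.com"
    api_version = options.get("api_version", "2025-01-01-preview")
    return (f"https://{resource}/openai/deployments/gpt-4.1"
            f"/chat/completions?api-version={api_version}")

print(build_endpoint({"azure_openai_resource": "my-openai-westus3"}))
# -> https://my-openai-westus3.openai.azure.com/openai/deployments/gpt-4.1/chat/completions?api-version=2025-01-01-preview
```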

### Optional Parameters:

- **`temperature`**: Sampling temperature 0-2 (default: `1`)
- Higher values (0.8) make output more random
- Lower values (0.2) make output more focused and deterministic
- **`top_p`**: Nucleus sampling 0-1 (default: `1`)
- Controls diversity via nucleus sampling
- 0.1 means only tokens in top 10% probability are considered
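These parameters map directly onto the Azure OpenAI chat-completions request body. A minimal sketch with placeholder endpoint and key (the network call itself is left commented out so nothing is sent by accident):

```python
# Placeholder endpoint and key - substitute your own deployment's values.
ENDPOINT = ("https://sbxbotres.cognitiveservices.azure.com/openai/deployments"
            "/gpt-4.1/chat/completions?api-version=2025-01-01-preview")
API_KEY = "your-actual-key"

payload = {
    "messages": [
        {"role": "system", "content": "You are an AI Assistant helping people learn languages"},
        {"role": "user", "content": "Count to ten in French"},
    ],
    "temperature": 1,  # 0-2; lower values are more focused and deterministic
    "top_p": 1,        # nucleus sampling, 0-1
}
headers = {"api-key": API_KEY, "Content-Type": "application/json"}

# Using the requests package installed by the requirements.json steps:
# import requests
# resp = requests.post(ENDPOINT, json=payload, headers=headers, timeout=60)
# print(resp.json()["choices"][0]["message"]["content"])
```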

### Example Options Strings:

**Azure AI Foundry with full endpoint URL:**
```
{endpoint_url:https://sbxbotres.cognitiveservices.azure.com/openai/deployments/gpt-4.1/chat/completions?api-version=2025-01-01-preview,temperature:1,top_p:1,API_KEY:your-key-here}
```

**Azure OpenAI with resource hostname:**
```
{azure_openai_resource:westus3.api.cognitive.microsoft.com,api_version:2025-01-01-preview,temperature:1,top_p:1,API_KEY:your-key-here}
```

**Azure OpenAI with short resource name:**
```
{azure_openai_resource:my-openai-westus3,api_version:2025-01-01-preview,temperature:1,top_p:1,API_KEY:your-key-here}
```
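One way the brace-delimited strings above can be parsed (a sketch with a hypothetical helper name, not the parser actually used by the score script): each entry is split on its first colon only, since values such as `endpoint_url` themselves contain colons. Note this relies on values containing no commas, which holds for the examples above.

```python
def parse_options(options_str: str) -> dict:
    """Parse a {key:value,key:value} options string into a dict.

    Entries are comma-separated; each entry is split on its FIRST colon
    so URL values keep their embedded colons.
    """
    result = {}
    for entry in options_str.strip().strip("{}").split(","):
        key, _, value = entry.partition(":")
        result[key.strip()] = value.strip()
    return result

opts = parse_options(
    "{azure_openai_resource:westus3.api.cognitive.microsoft.com,"
    "api_version:2025-01-01-preview,temperature:1,top_p:1,API_KEY:your-key-here}"
)
print(opts["azure_openai_resource"])  # -> westus3.api.cognitive.microsoft.com
```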

### Testing Locally:

To test the model wrapper locally, uncomment the example code at the bottom of `gpt41MiniScore.py`:

**Using Azure AI Foundry:**
```python
if __name__ == "__main__":
    userPrompt = ["Count to ten in French"]
    systemPrompt = ["You are an AI Assistant helping people learn languages"]
    options = ["{endpoint_url:https://sbxbotres.cognitiveservices.azure.com/openai/deployments/gpt-4.1/chat/completions?api-version=2025-01-01-preview,temperature:1,top_p:1,API_KEY:your-actual-key}"]
    response, run_time, prompt_length, output_length = scoreModel(userPrompt, systemPrompt, options)
```

**Using Azure OpenAI:**
```python
if __name__ == "__main__":
    userPrompt = ["Count to ten in French"]
    systemPrompt = ["You are an AI Assistant helping people learn languages"]
    options = ["{azure_openai_resource:westus3.api.cognitive.microsoft.com,api_version:2025-01-01-preview,temperature:1,top_p:1,API_KEY:your-actual-key}"]
    response, run_time, prompt_length, output_length = scoreModel(userPrompt, systemPrompt, options)
```

Run: `python gpt41MiniScore.py`

## Benefits of This Approach

- **Dual deployment support**: Works with both Azure OpenAI and Azure AI Foundry
- **Multi-region support**: Works with any Azure region without code changes
- **Flexible configuration**: Choose between full endpoint URL or constructed endpoint
- **Configurable at runtime**: Endpoint and API version configured via options
- **Development flexibility**: Different endpoints for dev/test/prod environments
- **Future-proof**: Easy to update API versions as Azure releases new features

## End