✨ This guide outlines how to create an Azure AI Search index from a CSV file summarizing funded award data exported from NIH RePORTER (reporter.nih.gov).
👂 If you already have your CSV ready, skip to section (2).
Download this public CSV file from Kaggle to use as our input.
👂 If you have already added your data to blob storage, skip to section (3).
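Trailing commas and spaces in header names can make the import wizard reject the file later on, so it may be worth checking the CSV before uploading it. The following is a minimal sketch using only the Python standard library; the function name `find_csv_problems` is hypothetical and the checks are an approximation of what the wizard complains about, not its actual validation logic.

```python
import csv
import io


def find_csv_problems(text):
    """Return a list of human-readable problems found in CSV text."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ["file is empty"]

    problems = []
    header = rows[0]
    for name in header:
        if name == "":
            # A trailing comma on the header line produces an empty last name.
            problems.append("empty header name (possible trailing comma)")
        elif " " in name:
            problems.append(f"header name contains a space: {name!r}")

    # Record numbers start at 2 because record 1 is the header.
    for record_number, row in enumerate(rows[1:], start=2):
        if not row:
            continue  # skip completely blank lines
        if len(row) != len(header):
            problems.append(
                f"record {record_number}: expected {len(header)} fields, "
                f"found {len(row)}"
            )
    return problems
```

To check a file on disk, read it with `open(path, newline="", encoding="utf-8").read()` and pass the text to the function; an empty list means none of these particular problems were found.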
On the Azure portal home page, navigate to Storage accounts.
Create a new storage account if needed. Place your storage account in the East US region.
Create a new container if needed, otherwise navigate to an existing container.
Select Upload and then add your file by dropping or browsing.
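If you would rather script the upload than use the portal, one option is the Blob Storage "Put Blob" REST operation with a shared access signature (SAS). The sketch below only builds the HTTP request and does not send it. It assumes you have generated a SAS URL for the container with write permission; the function name and the example URL are hypothetical.

```python
import urllib.parse
import urllib.request


def build_put_blob_request(container_sas_url, blob_name, data):
    """Build (but do not send) a Put Blob request for a container SAS URL.

    container_sas_url: e.g. https://<account>.blob.core.windows.net/<container>?<sas>
    blob_name: name the file should have inside the container
    data: file contents as bytes
    """
    base, _, sas = container_sas_url.partition("?")
    url = f"{base.rstrip('/')}/{urllib.parse.quote(blob_name)}?{sas}"
    return urllib.request.Request(
        url,
        data=data,
        method="PUT",
        headers={
            # Put Blob requires the blob type; BlockBlob suits ordinary files.
            "x-ms-blob-type": "BlockBlob",
            "Content-Type": "text/csv",
        },
    )
```

Passing the returned request to `urllib.request.urlopen` performs the upload. The portal route described above gives the same result and needs no SAS.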
Navigate to AI Search and create a new search service.
Click Import data.
Now fill out all the necessary parameters.
- Data Source: Select Azure Blob Storage. New options will drop down.
- Data source name: This can be anything, but go with something like ds-salaries-data.
- Data to extract: Select Content and metadata.
- Parsing mode: Select Delimited text. Check the First Line Contains Header box and leave Delimiter Character as a comma (,).
- Connection string: Click Choose an existing connection and navigate to your storage account and container.
- Managed identity authentication: Leave as default.
- Container name: Should be populated when you connect via Connection String, but otherwise just enter your container name here.
- Blob folder: Optional, if you have a folder within the container with the file(s) you want to index, enter that path here.
- Description: Optional.
- If you get errors when trying to go to the next screen, make sure your CSV has no trailing commas and that the header names contain no spaces. Fix any such problems, re-upload the file to blob storage, and then try again!
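For reference, the wizard settings above map roughly onto the JSON definitions used by the Azure AI Search REST API. The sketch below builds those definitions as Python dictionaries. Note that in the REST API the parsing options live on the indexer rather than on the data source. The function name, the indexer and index names in the usage example, and the connection-string placeholder are hypothetical; only `ds-salaries-data` comes from the steps above.

```python
def build_blob_csv_definitions(data_source_name, connection_string,
                               container_name, indexer_name, index_name,
                               blob_folder=None):
    """Return (data_source, indexer) definitions mirroring the wizard choices."""
    container = {"name": container_name}
    if blob_folder:
        # Optional: restrict indexing to one folder inside the container.
        container["query"] = blob_folder

    data_source = {
        "name": data_source_name,
        "type": "azureblob",  # Data Source: Azure Blob Storage
        "credentials": {"connectionString": connection_string},
        "container": container,
    }

    indexer = {
        "name": indexer_name,
        "dataSourceName": data_source_name,
        "targetIndexName": index_name,
        "parameters": {
            "configuration": {
                "dataToExtract": "contentAndMetadata",  # Content and metadata
                "parsingMode": "delimitedText",         # Delimited text
                "firstLineContainsHeaders": True,       # header checkbox
                "delimitedTextDelimiter": ",",          # delimiter character
            }
        },
    }
    return data_source, indexer


# Example with placeholder values (all hypothetical except the data source name).
data_source, indexer = build_blob_csv_definitions(
    data_source_name="ds-salaries-data",
    connection_string="<your-storage-connection-string>",
    container_name="<your-container>",
    indexer_name="nih-awards-indexer",
    index_name="nih-awards-index",
)
```

With `delimitedText` parsing, each data row of the CSV becomes one search document, which is why the header names need to be usable as field names.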
Skip ahead to Customize target index.
- Give your index a name.
- Make Project_Number your key.
- Make sure the expected column names are present under Fields. For the columns you expect to use, select Retrievable and Searchable. If you select every column, you will pay to index fields you never use.
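The choices on this screen correspond to field attributes in an index definition. As an illustration, the sketch below builds such a definition from a list of column names, marking only the columns you plan to use as searchable and retrievable. It assumes every column is indexed as a string (`Edm.String`); the function name and all column names other than `Project_Number` are hypothetical.

```python
def build_index_definition(index_name, columns, key_column, used_columns):
    """Return an index definition with one string field per CSV column."""
    if key_column not in columns:
        raise ValueError(f"key column {key_column!r} is not in the CSV header")

    fields = []
    for column in columns:
        is_key = column == key_column
        used = is_key or column in used_columns
        fields.append({
            "name": column,
            "type": "Edm.String",   # assumption: treat every column as text
            "key": is_key,          # exactly one field is the document key
            "searchable": used,
            "retrievable": used,
        })
    return {"name": index_name, "fields": fields}


# Hypothetical header: only Project_Number is taken from the steps above.
index_definition = build_index_definition(
    index_name="nih-awards-index",
    columns=["Project_Number", "Project_Title", "Organization_Name"],
    key_column="Project_Number",
    used_columns={"Project_Title"},
)
```

Here `Organization_Name` ends up neither searchable nor retrievable, which mirrors leaving both boxes unchecked for a column you do not need.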
Advance to Create an indexer, name your indexer, then click Submit.
Navigate to Indexes on the left panel and wait until your index shows as many documents as your file has data rows (every line except the header). The count reads 0 documents until indexing finishes. The example 500-line CSV takes about one minute.
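To know what document count to expect, you can count the data rows in the CSV locally and, if you like, compare that number with the count reported by the service. This is a rough sketch: the service name, index name, and the `2023-11-01` API version are assumptions to replace with your own values, and `fetch_document_count` needs a valid API key and network access.

```python
import csv
import io
import urllib.request

API_VERSION = "2023-11-01"  # assumption: use the version your service supports


def count_data_rows(csv_text):
    """Count non-empty records after the header line."""
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    return max(len(rows) - 1, 0)


def build_count_request(service_name, index_name, api_key):
    """Build (but do not send) a request for the index's document count."""
    url = (
        f"https://{service_name}.search.windows.net"
        f"/indexes/{index_name}/docs/$count?api-version={API_VERSION}"
    )
    return urllib.request.Request(url, headers={"api-key": api_key})


def fetch_document_count(service_name, index_name, api_key):
    """Send the count request; the response body is a plain-text integer."""
    request = build_count_request(service_name, index_name, api_key)
    with urllib.request.urlopen(request) as response:
        return int(response.read().decode("utf-8-sig").strip())
```

When `fetch_document_count(...)` equals `count_data_rows(...)` for your file, indexing has caught up.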
And that is it! Now return to the tutorial notebook to run queries against this csv using GPT-4.
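Before returning to the notebook, you may want to confirm that the index answers a plain keyword query. The sketch below builds a search request for the REST API without sending it; the service name, index name, API key, and API version are placeholders and assumptions, and the function name is hypothetical.

```python
import json
import urllib.request


def build_search_request(service_name, index_name, api_key, query,
                         top=5, select=None, api_version="2023-11-01"):
    """Build (but do not send) a keyword search request against the index."""
    url = (
        f"https://{service_name}.search.windows.net"
        f"/indexes/{index_name}/docs/search?api-version={api_version}"
    )
    body = {"search": query, "top": top}
    if select:
        # Limit the returned fields; they must be marked Retrievable.
        body["select"] = ",".join(select)
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json", "api-key": api_key},
    )
```

Sending the request with `urllib.request.urlopen` and parsing the response with `json.load` should give a dictionary whose `value` entry lists the matching documents.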








