34 changes: 18 additions & 16 deletions README.md
Original file line number Diff line number Diff line change
@@ -32,6 +32,8 @@ For complete setup instructions and usage examples, see the [full docs](https://

## **Quick Start**

**v1.2.0 is the final version in which the Knowledge Graph can be downloaded as CSV flat files. From v1.3.0 onward, the Knowledge Graph will be available as graph-native JSON flat files, and we will begin granting access to our REST API in early 2026. Previously downloaded CSV and JSON flat files are unaffected.**

The knowledge graph data is available for download in both CSV and JSON formats. The graph data is exported with each file representing a specific entity type, and a relationships file capturing the connections between entities.

**CSV files:** UTF-8 encoded with comma delimiters and quoted fields. All CSV files include header rows with column names.
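
Because every CSV file carries a header row, the column names can be inspected before any import. The following is a minimal sketch, not part of the repository: `kg_csv_columns` is a hypothetical helper name, and the sample file and its column order are illustrative stand-ins for a real download. It assumes header names contain no embedded commas or quotes.

```bash
# Create a tiny stand-in for a downloaded file (illustrative content only).
sample_dir="$(mktemp -d)"
printf '%s\n' \
  '"relationshipType","sourceEntityValue","targetEntityValue","jaccard"' \
  '"hasStandardAlignment","state-uuid","ccss-uuid","0.75"' \
  > "$sample_dir/Relationships.csv"

# kg_csv_columns FILE -> prints one column name per line.
# Reads only the header row, strips quotes and carriage returns,
# then splits on the comma delimiter.
kg_csv_columns() {
  head -n 1 "$1" | tr -d '"\r' | tr ',' '\n'
}

kg_csv_columns "$sample_dir/Relationships.csv"
```

For data rows, where quoted fields may contain commas, a real CSV parser is the safer choice than splitting on commas.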
@@ -54,10 +56,10 @@ There are two options to download the files: direct s3 links, or using curl comm
Click links to download files directly. Files will download to your browser's default location (typically `~/Downloads`).

**CSV files:**
- [StandardsFramework.csv](https://aidt-knowledge-graph-datasets-public-prod.s3.us-west-2.amazonaws.com/knowledge-graph/v1.1.0/csv/StandardsFramework.csv?ref=github)
- [StandardsFrameworkItem.csv](https://aidt-knowledge-graph-datasets-public-prod.s3.us-west-2.amazonaws.com/knowledge-graph/v1.1.0/csv/StandardsFrameworkItem.csv?ref=github)
- [LearningComponent.csv](https://aidt-knowledge-graph-datasets-public-prod.s3.us-west-2.amazonaws.com/knowledge-graph/v1.1.0/csv/LearningComponent.csv?ref=github)
- [Relationships.csv](https://aidt-knowledge-graph-datasets-public-prod.s3.us-west-2.amazonaws.com/knowledge-graph/v1.1.0/csv/Relationships.csv?ref=github)
- [StandardsFramework.csv](https://aidt-knowledge-graph-datasets-public-prod.s3.us-west-2.amazonaws.com/knowledge-graph/v1.2.0/csv/StandardsFramework.csv?ref=github)
- [StandardsFrameworkItem.csv](https://aidt-knowledge-graph-datasets-public-prod.s3.us-west-2.amazonaws.com/knowledge-graph/v1.2.0/csv/StandardsFrameworkItem.csv?ref=github)
- [LearningComponent.csv](https://aidt-knowledge-graph-datasets-public-prod.s3.us-west-2.amazonaws.com/knowledge-graph/v1.2.0/csv/LearningComponent.csv?ref=github)
- [Relationships.csv](https://aidt-knowledge-graph-datasets-public-prod.s3.us-west-2.amazonaws.com/knowledge-graph/v1.2.0/csv/Relationships.csv?ref=github)

**For SQL database imports:** Move the downloaded CSV files to `/tmp/kg-data/` to use the import scripts without modification:

@@ -67,10 +69,10 @@ mv ~/Downloads/StandardsFramework.csv ~/Downloads/StandardsFrameworkItem.csv ~/D
```

**JSON files:**
- [StandardsFramework.json](https://aidt-knowledge-graph-datasets-public-prod.s3.us-west-2.amazonaws.com/knowledge-graph/v1.1.0/json/StandardsFramework.json?ref=github)
- [StandardsFrameworkItem.json](https://aidt-knowledge-graph-datasets-public-prod.s3.us-west-2.amazonaws.com/knowledge-graph/v1.1.0/json/StandardsFrameworkItem.json?ref=github)
- [LearningComponent.json](https://aidt-knowledge-graph-datasets-public-prod.s3.us-west-2.amazonaws.com/knowledge-graph/v1.1.0/json/LearningComponent.json?ref=github)
- [Relationships.json](https://aidt-knowledge-graph-datasets-public-prod.s3.us-west-2.amazonaws.com/knowledge-graph/v1.1.0/json/Relationships.json?ref=github)
- [StandardsFramework.json](https://aidt-knowledge-graph-datasets-public-prod.s3.us-west-2.amazonaws.com/knowledge-graph/v1.2.0/json/StandardsFramework.json?ref=github)
- [StandardsFrameworkItem.json](https://aidt-knowledge-graph-datasets-public-prod.s3.us-west-2.amazonaws.com/knowledge-graph/v1.2.0/json/StandardsFrameworkItem.json?ref=github)
- [LearningComponent.json](https://aidt-knowledge-graph-datasets-public-prod.s3.us-west-2.amazonaws.com/knowledge-graph/v1.2.0/json/LearningComponent.json?ref=github)
- [Relationships.json](https://aidt-knowledge-graph-datasets-public-prod.s3.us-west-2.amazonaws.com/knowledge-graph/v1.2.0/json/Relationships.json?ref=github)

### Using curl commands

@@ -83,17 +85,17 @@ If you don't have `curl` installed, see [installation instructions](https://gith
mkdir -p /tmp/kg-data
cd /tmp/kg-data

curl -L "https://aidt-knowledge-graph-datasets-public-prod.s3.us-west-2.amazonaws.com/knowledge-graph/v1.1.0/csv/StandardsFramework.csv?ref=gh_curl" -o StandardsFramework.csv
curl -L "https://aidt-knowledge-graph-datasets-public-prod.s3.us-west-2.amazonaws.com/knowledge-graph/v1.1.0/csv/StandardsFrameworkItem.csv?ref=gh_curl" -o StandardsFrameworkItem.csv
curl -L "https://aidt-knowledge-graph-datasets-public-prod.s3.us-west-2.amazonaws.com/knowledge-graph/v1.1.0/csv/LearningComponent.csv?ref=gh_curl" -o LearningComponent.csv
curl -L "https://aidt-knowledge-graph-datasets-public-prod.s3.us-west-2.amazonaws.com/knowledge-graph/v1.1.0/csv/Relationships.csv?ref=gh_curl" -o Relationships.csv
curl -L "https://aidt-knowledge-graph-datasets-public-prod.s3.us-west-2.amazonaws.com/knowledge-graph/v1.2.0/csv/StandardsFramework.csv?ref=gh_curl" -o StandardsFramework.csv
curl -L "https://aidt-knowledge-graph-datasets-public-prod.s3.us-west-2.amazonaws.com/knowledge-graph/v1.2.0/csv/StandardsFrameworkItem.csv?ref=gh_curl" -o StandardsFrameworkItem.csv
curl -L "https://aidt-knowledge-graph-datasets-public-prod.s3.us-west-2.amazonaws.com/knowledge-graph/v1.2.0/csv/LearningComponent.csv?ref=gh_curl" -o LearningComponent.csv
curl -L "https://aidt-knowledge-graph-datasets-public-prod.s3.us-west-2.amazonaws.com/knowledge-graph/v1.2.0/csv/Relationships.csv?ref=gh_curl" -o Relationships.csv
```
```bash
# Download JSON files
curl -L "https://aidt-knowledge-graph-datasets-public-prod.s3.us-west-2.amazonaws.com/knowledge-graph/v1.1.0/json/StandardsFramework.json?ref=gh_curl" -o StandardsFramework.json
curl -L "https://aidt-knowledge-graph-datasets-public-prod.s3.us-west-2.amazonaws.com/knowledge-graph/v1.1.0/json/StandardsFrameworkItem.json?ref=gh_curl" -o StandardsFrameworkItem.json
curl -L "https://aidt-knowledge-graph-datasets-public-prod.s3.us-west-2.amazonaws.com/knowledge-graph/v1.1.0/json/LearningComponent.json?ref=gh_curl" -o LearningComponent.json
curl -L "https://aidt-knowledge-graph-datasets-public-prod.s3.us-west-2.amazonaws.com/knowledge-graph/v1.1.0/json/Relationships.json?ref=gh_curl" -o Relationships.json
curl -L "https://aidt-knowledge-graph-datasets-public-prod.s3.us-west-2.amazonaws.com/knowledge-graph/v1.2.0/json/StandardsFramework.json?ref=gh_curl" -o StandardsFramework.json
curl -L "https://aidt-knowledge-graph-datasets-public-prod.s3.us-west-2.amazonaws.com/knowledge-graph/v1.2.0/json/StandardsFrameworkItem.json?ref=gh_curl" -o StandardsFrameworkItem.json
curl -L "https://aidt-knowledge-graph-datasets-public-prod.s3.us-west-2.amazonaws.com/knowledge-graph/v1.2.0/json/LearningComponent.json?ref=gh_curl" -o LearningComponent.json
curl -L "https://aidt-knowledge-graph-datasets-public-prod.s3.us-west-2.amazonaws.com/knowledge-graph/v1.2.0/json/Relationships.json?ref=gh_curl" -o Relationships.json
```
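
The download URLs above differ only in version, format, and entity name, so the eight commands can be collapsed into a loop. The following is a minimal sketch under that assumption: `kg_url` and `kg_download_all` are hypothetical helper names, not part of this repository, and the URL pattern is inferred from the links listed in this README.

```bash
# Base URL and entity names are copied from the download links in this README.
KG_BASE_URL="https://aidt-knowledge-graph-datasets-public-prod.s3.us-west-2.amazonaws.com/knowledge-graph"
KG_ENTITIES="StandardsFramework StandardsFrameworkItem LearningComponent Relationships"

# kg_url VERSION FORMAT ENTITY -> prints the download URL for one file.
kg_url() {
  printf '%s/%s/%s/%s.%s?ref=gh_curl\n' "$KG_BASE_URL" "$1" "$2" "$3" "$2"
}

# kg_download_all VERSION FORMAT DEST_DIR -> downloads all four files into DEST_DIR.
kg_download_all() {
  local version="$1" format="$2" dest="$3" entity
  mkdir -p "$dest" || return 1
  for entity in $KG_ENTITIES; do
    # -f makes curl return an error on HTTP failures instead of saving the error page.
    curl -fL "$(kg_url "$version" "$format" "$entity")" -o "$dest/$entity.$format" || return 1
  done
}

# Example (performs network requests):
#   kg_download_all v1.2.0 csv /tmp/kg-data
```

Keeping the version as a parameter means a later release only requires changing one argument rather than editing every URL.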

### **Next steps**
2 changes: 2 additions & 0 deletions import_scripts/mysql/README.md
@@ -1,3 +1,5 @@
**v1.2.0 is the final version in which the Knowledge Graph can be downloaded as CSV flat files. From v1.3.0 onward, the Knowledge Graph will be available as graph-native JSON flat files, and we will begin granting access to our REST API in early 2026. Previously downloaded CSV and JSON flat files are unaffected.**

# MySQL Import Guide

This guide provides instructions for loading the Learning Commons Knowledge Graph dataset into a MySQL database.
6 changes: 5 additions & 1 deletion import_scripts/mysql/create_tables.sql
@@ -67,5 +67,9 @@ CREATE TABLE IF NOT EXISTS relationships (
`author` TEXT,
`provider` TEXT,
`license` TEXT,
`attributionStatement` TEXT
`attributionStatement` TEXT,
`jaccard` DOUBLE,
`ccssLCCount` INT,
`sharedLCCount` INT,
`stateLCCount` INT
);
2 changes: 2 additions & 0 deletions import_scripts/postgresql/README.md
Original file line number Diff line number Diff line change
@@ -1,3 +1,5 @@
**v1.2.0 is the final version in which the Knowledge Graph can be downloaded as CSV flat files. From v1.3.0 onward, the Knowledge Graph will be available as graph-native JSON flat files, and we will begin granting access to our REST API in early 2026. Previously downloaded CSV and JSON flat files are unaffected.**

# PostgreSQL Import Guide

This guide provides instructions for loading the Learning Commons Knowledge Graph dataset into a PostgreSQL database.
6 changes: 5 additions & 1 deletion import_scripts/postgresql/create_tables.sql
@@ -66,5 +66,9 @@ CREATE TABLE IF NOT EXISTS relationships (
"author" TEXT,
"provider" TEXT,
"license" TEXT,
"attributionStatement" TEXT
"attributionStatement" TEXT,
"jaccard" DOUBLE PRECISION,
"ccssLCCount" INT,
"sharedLCCount" INT,
"stateLCCount" INT
);
Binary file removed sample_queries/mysql/.DS_Store
Binary file not shown.
@@ -0,0 +1,37 @@
-- Find the best Texas state standard matches for a given CCSSM standard
-- Returns crosswalks ordered by Jaccard score (highest similarity first)
-- with metadata about both the CCSSM and matching Texas state standards

SELECT
-- CCSSM Standard Information
ccss.`statementCode` AS ccss_standard_code,
ccss.`description` AS ccss_description,
ccss.`gradeLevel` AS ccss_grade_level,
ccss.`jurisdiction` AS ccss_jurisdiction,

-- State Standard Information
state.`statementCode` AS state_standard_code,
state.`description` AS state_description,
state.`gradeLevel` AS state_grade_level,
state.`jurisdiction` AS state_jurisdiction,

-- Crosswalk Metrics
r.`jaccard`,
r.`sharedLCCount`,
r.`stateLCCount`,
r.`ccssLCCount`,

-- Entity Values for further joins if needed
r.`sourceEntityValue` AS state_uuid,
r.`targetEntityValue` AS ccss_uuid
FROM relationships r
JOIN standards_framework_item state
ON state.`caseIdentifierUUID` = r.`sourceEntityValue`
JOIN standards_framework_item ccss
ON ccss.`caseIdentifierUUID` = r.`targetEntityValue`
WHERE r.`relationshipType` = 'hasStandardAlignment'
AND ccss.`statementCode` = '6.EE.B.5'
AND ccss.`jurisdiction` = 'Multi-State'
AND state.`jurisdiction` = 'Texas'
ORDER BY r.`jaccard` DESC
LIMIT 10;
17 changes: 17 additions & 0 deletions sample_queries/mysql/crosswalk_queries/get_all_crosswalks.sql
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
-- Get all crosswalks with state standard information
-- Returns state → CCSSM standard alignments ordered by Jaccard score

SELECT
state.`statementCode` AS state_standard_code,
state.`jurisdiction` AS state_jurisdiction,
r.`sourceEntityValue`,
r.`targetEntityValue`,
r.`jaccard`,
r.`stateLCCount`,
r.`ccssLCCount`,
r.`sharedLCCount`
FROM relationships r
JOIN standards_framework_item state
ON state.`caseIdentifierUUID` = r.`sourceEntityValue`
WHERE r.`relationshipType` = 'hasStandardAlignment'
ORDER BY r.`jaccard` DESC;
@@ -0,0 +1,22 @@
-- Get all Texas crosswalks that meet or exceed a specified Jaccard similarity threshold (0.7 in this example)
-- Ordered by CCSSM standard code, then Jaccard score

SELECT
state_std.`jurisdiction` AS state_jurisdiction,
state_std.`statementCode` AS state_standard_code,
ccss_std.`statementCode` AS ccss_standard_code,
r.`jaccard`,
r.`sharedLCCount`,
r.`stateLCCount`,
r.`ccssLCCount`
FROM relationships r
JOIN standards_framework_item state_std
ON state_std.`caseIdentifierUUID` = r.`sourceEntityValue`
JOIN standards_framework_item ccss_std
ON ccss_std.`caseIdentifierUUID` = r.`targetEntityValue`
WHERE r.`relationshipType` = 'hasStandardAlignment'
AND r.`jaccard` >= 0.7
AND state_std.`jurisdiction` = 'Texas'
ORDER BY
ccss_std.`statementCode`,
r.`jaccard` DESC;
@@ -0,0 +1,35 @@
-- Get all crosswalks for a specific state jurisdiction and academic subject (here: Texas, Mathematics)
-- Returns comprehensive metadata for both state and CCSSM standards
-- Ordered by state standard code, then CCSSM standard code, then Jaccard score

SELECT
-- State Standard Information
state_std.`jurisdiction` AS state_jurisdiction,
state_std.`statementCode` AS state_standard_code,
state_std.`gradeLevel` AS state_grade_level,
state_std.`description` AS state_description,
state_std.`academicSubject` AS state_academic_subject,

-- CCSSM Standard Information
ccss_std.`statementCode` AS ccss_standard_code,
ccss_std.`gradeLevel` AS ccss_grade_level,
ccss_std.`description` AS ccss_description,
ccss_std.`academicSubject` AS ccss_academic_subject,

-- Crosswalk Metrics
r.`jaccard`,
r.`sharedLCCount`,
r.`stateLCCount`,
r.`ccssLCCount`
FROM relationships r
JOIN standards_framework_item state_std
ON state_std.`caseIdentifierUUID` = r.`sourceEntityValue`
JOIN standards_framework_item ccss_std
ON ccss_std.`caseIdentifierUUID` = r.`targetEntityValue`
WHERE r.`relationshipType` = 'hasStandardAlignment'
AND state_std.`jurisdiction` = 'Texas'
AND state_std.`academicSubject` = 'Mathematics'
ORDER BY
state_std.`statementCode`,
ccss_std.`statementCode`,
r.`jaccard` DESC;
@@ -0,0 +1,34 @@
-- Find Texas state standard matches for a given CCSSM standard with full metadata
-- Returns up to 10 matching Texas state standards ordered by Jaccard score (highest similarity first)

SELECT
-- CCSSM Standard Information
ccss_std.`statementCode` AS ccss_standard_code,
ccss_std.`jurisdiction` AS ccss_jurisdiction,
ccss_std.`gradeLevel` AS ccss_grade_level,
ccss_std.`description` AS ccss_description,
ccss_std.`academicSubject` AS ccss_academic_subject,

-- State Standard Information
state_std.`statementCode` AS state_standard_code,
state_std.`jurisdiction` AS state_jurisdiction,
state_std.`gradeLevel` AS state_grade_level,
state_std.`description` AS state_description,
state_std.`academicSubject` AS state_academic_subject,

-- Crosswalk Metrics
r.`jaccard`,
r.`sharedLCCount`,
r.`stateLCCount`,
r.`ccssLCCount`
FROM relationships r
JOIN standards_framework_item state_std
ON state_std.`caseIdentifierUUID` = r.`sourceEntityValue`
JOIN standards_framework_item ccss_std
ON ccss_std.`caseIdentifierUUID` = r.`targetEntityValue`
WHERE r.`relationshipType` = 'hasStandardAlignment'
AND ccss_std.`statementCode` = '6.EE.B.5'
AND ccss_std.`jurisdiction` = 'Multi-State'
AND state_std.`jurisdiction` = 'Texas'
ORDER BY r.`jaccard` DESC
LIMIT 10;
@@ -0,0 +1,55 @@
-- Get the Learning Components that support both a state standard and a CCSSM standard
-- Returns three categories: shared LCs (in both), state-only LCs, and CCSSM-only LCs
-- This shows the pedagogical overlap and differences between crosswalked standards

WITH state_lcs AS (
SELECT lc.`identifier`, lc.`description`
FROM relationships r
JOIN standards_framework_item sfi
ON sfi.`caseIdentifierUUID` = r.`targetEntityValue`
JOIN learning_component lc
ON lc.`identifier` = r.`sourceEntityValue`
WHERE r.`relationshipType` = 'supports'
AND sfi.`statementCode` = '111.26.b.4.D'
AND sfi.`jurisdiction` = 'Texas'
),
ccss_lcs AS (
SELECT lc.`identifier`, lc.`description`
FROM relationships r
JOIN standards_framework_item sfi
ON sfi.`caseIdentifierUUID` = r.`targetEntityValue`
JOIN learning_component lc
ON lc.`identifier` = r.`sourceEntityValue`
WHERE r.`relationshipType` = 'supports'
AND sfi.`statementCode` = '6.RP.A.2'
AND sfi.`jurisdiction` = 'Multi-State'
)
SELECT
'shared' AS lc_type,
state_lcs.`identifier`,
state_lcs.`description`
FROM state_lcs
INNER JOIN ccss_lcs
ON state_lcs.`identifier` = ccss_lcs.`identifier`

UNION ALL

SELECT
'state_only' AS lc_type,
state_lcs.`identifier`,
state_lcs.`description`
FROM state_lcs
LEFT JOIN ccss_lcs
ON state_lcs.`identifier` = ccss_lcs.`identifier`
WHERE ccss_lcs.`identifier` IS NULL

UNION ALL

SELECT
'ccss_only' AS lc_type,
ccss_lcs.`identifier`,
ccss_lcs.`description`
FROM ccss_lcs
LEFT JOIN state_lcs
ON ccss_lcs.`identifier` = state_lcs.`identifier`
WHERE state_lcs.`identifier` IS NULL;
@@ -0,0 +1,37 @@
-- Find the best Texas state standard matches for a given CCSSM standard
-- Returns crosswalks ordered by Jaccard score (highest similarity first)
-- with metadata about both the CCSSM and matching Texas state standards

SELECT
-- CCSSM Standard Information
ccss."statementCode" AS ccss_standard_code,
ccss."description" AS ccss_description,
ccss."gradeLevel" AS ccss_grade_level,
ccss."jurisdiction" AS ccss_jurisdiction,

-- State Standard Information
state."statementCode" AS state_standard_code,
state."description" AS state_description,
state."gradeLevel" AS state_grade_level,
state."jurisdiction" AS state_jurisdiction,

-- Crosswalk Metrics
r."jaccard",
r."sharedLCCount",
r."stateLCCount",
r."ccssLCCount",

-- Entity Values for further joins if needed
r."sourceEntityValue" AS state_uuid,
r."targetEntityValue" AS ccss_uuid
FROM relationships r
JOIN standards_framework_item state
ON state."caseIdentifierUUID" = r."sourceEntityValue"
JOIN standards_framework_item ccss
ON ccss."caseIdentifierUUID" = r."targetEntityValue"
WHERE r."relationshipType" = 'hasStandardAlignment'
AND ccss."statementCode" = '6.EE.B.5'
AND ccss."jurisdiction" = 'Multi-State'
AND state."jurisdiction" = 'Texas'
ORDER BY r."jaccard" DESC
LIMIT 10;
17 changes: 17 additions & 0 deletions sample_queries/postgresql/crosswalk_queries/get_all_crosswalks.sql
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
-- Get all crosswalks with state standard information
-- Returns state → CCSSM standard alignments ordered by Jaccard score

SELECT
state."statementCode" AS state_standard_code,
state."jurisdiction" AS state_jurisdiction,
r."sourceEntityValue",
r."targetEntityValue",
r."jaccard",
r."stateLCCount",
r."ccssLCCount",
r."sharedLCCount"
FROM relationships r
JOIN standards_framework_item state
ON state."caseIdentifierUUID" = r."sourceEntityValue"
WHERE r."relationshipType" = 'hasStandardAlignment'
ORDER BY r."jaccard" DESC;
@@ -0,0 +1,22 @@
-- Get all Texas crosswalks that meet or exceed a specified Jaccard similarity threshold (0.7 in this example)
-- Ordered by CCSSM standard code, then Jaccard score

SELECT
state_std."jurisdiction" AS state_jurisdiction,
state_std."statementCode" AS state_standard_code,
ccss_std."statementCode" AS ccss_standard_code,
r."jaccard",
r."sharedLCCount",
r."stateLCCount",
r."ccssLCCount"
FROM relationships r
JOIN standards_framework_item state_std
ON state_std."caseIdentifierUUID" = r."sourceEntityValue"
JOIN standards_framework_item ccss_std
ON ccss_std."caseIdentifierUUID" = r."targetEntityValue"
WHERE r."relationshipType" = 'hasStandardAlignment'
AND r."jaccard" >= 0.7
AND state_std."jurisdiction" = 'Texas'
ORDER BY
ccss_std."statementCode",
r."jaccard" DESC;