# Codebook v1 -- Schema-Constrained Transcript Extraction


# Global Coding Protocol

## Unit of Coding

-   Code the smallest transcript span that fully supports the entity.
-   If an entity appears multiple times, create a single entity with
    multiple mentions.
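
The single-entity, multiple-mentions rule can be sketched as below. This is a minimal illustration, not the project's implementation: the `Entity` class, the `add_mention` helper, the case-insensitive matching on name, and the example dataset name and offsets are all assumptions made for the example.

```python
from dataclasses import dataclass, field

@dataclass
class Entity:
    label: str                      # schema label, e.g. "Dataset"
    name: str                       # surface form taken from the first mention
    mentions: list = field(default_factory=list)  # (start, end) transcript offsets

def add_mention(entities: dict, label: str, name: str, span: tuple) -> Entity:
    """Attach a span to an existing entity, or create the entity on first sight."""
    # Assumption: two mentions refer to the same entity when label and
    # case-folded name match. Real coreference may need reviewer judgement.
    key = (label, name.strip().lower())
    if key not in entities:
        entities[key] = Entity(label=label, name=name.strip())
    entities[key].mentions.append(span)
    return entities[key]

entities = {}
add_mention(entities, "Dataset", "Ward Audit Data", (120, 135))  # hypothetical spans
add_mention(entities, "Dataset", "ward audit data", (480, 495))
```

After both calls, `entities` holds one `Dataset` entity carrying two mention spans.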

## Evidence Standard

-   Extract only what is explicitly stated or strongly implied.
-   Do not infer beyond transcript evidence.

## Certainty Scoring (1--10)

-   10: Explicit and unambiguous
-   7--9: Strongly implied
-   4--6: Weakly implied but plausible
-   1--3: Speculative (avoid coding)
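
The scoring bands above can be expressed as a small lookup. The function name `certainty_band` and the band identifiers it returns are hypothetical; only the numeric ranges come from the scale.

```python
def certainty_band(score: int) -> str:
    """Map a 1-10 certainty score to its band in the codebook scale."""
    if not isinstance(score, int) or not 1 <= score <= 10:
        raise ValueError(f"certainty must be an integer in 1..10, got {score!r}")
    if score == 10:
        return "explicit"
    if score >= 7:
        return "strongly_implied"
    if score >= 4:
        return "weakly_implied"
    return "speculative"  # 1-3: the codebook advises against coding these
```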

## Schema Constraint Rule

Only use: Dataset, DetailedProblem, EvidenceArtifact, Feature, Hub,
IntendedUse, Module, Project, Regulation, RegulatoryRequirement, Role,
TopLevelProblem.

No new categories may be created.
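
One way to enforce this rule on extracted output is a closed allow-list check, sketched below. The label set is copied from the rule above; the name `validate_label` is an assumption for illustration.

```python
ALLOWED_LABELS = frozenset({
    "Dataset", "DetailedProblem", "EvidenceArtifact", "Feature", "Hub",
    "IntendedUse", "Module", "Project", "Regulation",
    "RegulatoryRequirement", "Role", "TopLevelProblem",
})

def validate_label(label: str) -> str:
    """Return the label unchanged, or raise if it is outside the schema."""
    if label not in ALLOWED_LABELS:
        raise ValueError(f"label {label!r} is not in the schema; no new categories may be created")
    return label
```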

------------------------------------------------------------------------

# Inter-Coder Reliability Protocol

## Phase 1: Anchor Set Double Coding

-   A stratified subset of transcripts (minimum 20%) is independently
    coded by two human reviewers.
-   Reviewers work blind to each other's coding.
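
Drawing the anchor set could look roughly like the sketch below. The codebook does not say which variable the sample is stratified on, so `stratum_of` is a caller-supplied assumption (for example, the interviewee's role type), as are the function name and the fixed seed. Rounding up within each stratum keeps the overall sample at or above the 20% minimum.

```python
import math
import random
from collections import defaultdict

def anchor_set(transcripts, stratum_of, fraction=0.2, seed=0):
    """Pick at least `fraction` of the transcripts from every stratum."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    by_stratum = defaultdict(list)
    for t in transcripts:
        by_stratum[stratum_of(t)].append(t)
    chosen = []
    for stratum in sorted(by_stratum):
        members = by_stratum[stratum]
        k = math.ceil(fraction * len(members))  # round up: never below the minimum
        chosen.extend(rng.sample(members, k))
    return chosen

# Hypothetical corpus: ten transcripts tagged with a role type.
transcripts = [(f"t{i}", "clinician" if i < 7 else "ceo") for i in range(10)]
sample = anchor_set(transcripts, stratum_of=lambda t: t[1])
```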

## Phase 2: Agreement Measurement

Agreement is measured using:

-   Entity-level precision/recall
-   Relationship agreement rate
-   Certainty score deviation (mean absolute difference)
-   Cohen's kappa (where applicable)
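
Three of these metrics can be computed as in the sketch below. It rests on assumptions the codebook does not state: entities are compared as hashable keys, one coder is treated as the reference for precision/recall, and kappa is computed over label pairs for items both coders annotated. All function names are hypothetical.

```python
from collections import Counter

def precision_recall(reference: set, candidate: set):
    """Entity-level precision and recall of `candidate` against `reference`."""
    tp = len(reference & candidate)
    precision = tp / len(candidate) if candidate else 0.0
    recall = tp / len(reference) if reference else 0.0
    return precision, recall

def mean_abs_certainty_diff(scores_a: dict, scores_b: dict):
    """Mean absolute certainty difference over entities both coders scored."""
    shared = scores_a.keys() & scores_b.keys()
    if not shared:
        return None
    return sum(abs(scores_a[k] - scores_b[k]) for k in shared) / len(shared)

def cohens_kappa(labels_a: list, labels_b: list) -> float:
    """Cohen's kappa: (p_o - p_e) / (1 - p_e) for two aligned label lists."""
    n = len(labels_a)
    if n == 0 or n != len(labels_b):
        raise ValueError("label lists must be non-empty and the same length")
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n   # observed agreement
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(count_a[l] * count_b[l] for l in count_a) / (n * n)  # chance agreement
    if p_e == 1.0:
        return 1.0  # both coders used a single identical label throughout
    return (p_o - p_e) / (1 - p_e)
```

For example, `cohens_kappa(["Role", "Role", "Hub", "Hub"], ["Role", "Hub", "Hub", "Hub"])` gives observed agreement 0.75 and chance agreement 0.5, so kappa is 0.5.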

## Phase 3: Disagreement Resolution

-   Discrepancies are logged in a structured disagreement log.
-   Reviewers discuss disagreements.
-   If a disagreement remains unresolved, a third adjudicator makes the final decision.
-   Codebook definitions are updated only if disagreement reveals
    systematic ambiguity.

## Phase 4: Drift Monitoring

-   Every 10 transcripts, a calibration transcript is co-reviewed.
-   If agreement drops below a predefined threshold (e.g., κ \< 0.7),
    recalibration is required.
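
The drift check reduces to two small predicates, sketched here. The interval of 10 and the example threshold of 0.7 come from the text above; the constant and function names are assumptions, and a real project would set its own threshold.

```python
CALIBRATION_INTERVAL = 10   # transcripts between calibration reviews
KAPPA_THRESHOLD = 0.7       # example value from the codebook, not a fixed rule

def calibration_due(transcripts_coded: int) -> bool:
    """True when the running count reaches a multiple of the interval."""
    return transcripts_coded > 0 and transcripts_coded % CALIBRATION_INTERVAL == 0

def needs_recalibration(kappa: float, threshold: float = KAPPA_THRESHOLD) -> bool:
    """True when agreement on the calibration transcript falls below the threshold."""
    return kappa < threshold
```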

------------------------------------------------------------------------

# Bias and Reliability Audit

To mitigate systematic bias:

-   Compare extraction across Role types (e.g., Clinician vs CEO)
-   Compare extraction across TRL stages
-   Identify systematic under-extraction or over-inference patterns
-   Reject patterns where bias is systematic and uncorrectable
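
A first-pass version of the group comparisons above might look like the following sketch. The record layout (`role`, `n_entities`), the function names, and the 1.5 disparity ratio are all hypothetical. The codebook does not prescribe a statistic, so mean entities per transcript is used here only as a simple stand-in.

```python
from collections import defaultdict
from statistics import mean

def extraction_rate_by_group(records, group_key):
    """Mean number of extracted entities per transcript, grouped by `group_key`."""
    counts = defaultdict(list)
    for record in records:
        counts[record[group_key]].append(record["n_entities"])
    return {group: mean(values) for group, values in counts.items()}

def flag_disparity(rates, max_ratio=1.5):
    """Flag when the highest group rate exceeds the lowest by more than `max_ratio`."""
    lowest, highest = min(rates.values()), max(rates.values())
    if highest == 0:
        return False          # nothing extracted anywhere: no disparity to report
    if lowest == 0:
        return True           # some group has no extractions at all
    return highest / lowest > max_ratio

# Hypothetical per-transcript counts.
records = [
    {"role": "Clinician", "n_entities": 12},
    {"role": "Clinician", "n_entities": 8},
    {"role": "CEO", "n_entities": 4},
    {"role": "CEO", "n_entities": 6},
]
rates = extraction_rate_by_group(records, "role")
```

The same helper works for TRL stages by grouping on a different key. A flag is only a prompt for human review, not evidence of bias on its own.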

LLM outputs are always validated by a human reviewer before final graph
integration.

------------------------------------------------------------------------

# Entity Definitions and Coding Rules

## Project

Organisation or innovation initiative.

**Include:** Named organisation or clearly identifiable project.\
**Exclude:** Generic "startups" without identity.

Allowed Properties: name, ProjectCode, Description, TRL,
deployment_region, Location, latitude, longitude, start_date, stage,
geography, scope, color

------------------------------------------------------------------------

## Hub

Parent ecosystem organisation (accelerator, university programme,
innovation cluster).

Allowed Property: name

------------------------------------------------------------------------

## Role

Stakeholder category relevant to system design, governance, or use.

Allowed Properties: name, color, location

------------------------------------------------------------------------

## IntendedUse

Regulatory intended purpose and operational deployment intent.

Allowed Properties: id, name, intended_purpose, clinical_function,
medical_purpose_family, output_type, time_horizon, decision_impact,
target_user_type, use_environment, human_in_the_loop,
direct_patient_actuation, patient_level, autonomy_level,
safety_criticality_hint, regulatory_hooks

------------------------------------------------------------------------

## Dataset

Dataset used or referenced in development or deployment.

Allowed Properties: dataset_id, name, source_type, contains_pii

Note: Only set contains_pii if explicitly stated or strongly implied.
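
The note above can be tied to the certainty scale as in this sketch. Treating "explicitly stated or strongly implied" as a certainty score of 7 or higher is an interpretation based on the scoring bands, and the function name is hypothetical.

```python
from typing import Optional

STRONGLY_IMPLIED_MIN = 7  # assumption: 7-10 covers "strongly implied" and "explicit"

def contains_pii_value(claimed: bool, certainty: int) -> Optional[bool]:
    """Return the value to store in contains_pii, or None to leave it unset."""
    if certainty >= STRONGLY_IMPLIED_MIN:
        return claimed
    return None  # evidence too weak: do not set the property
```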

------------------------------------------------------------------------

## Module

Subsystem grouping features or implementing models/pipelines.

Allowed Properties: module_id, name, module_type, purpose,
regulatory_role, decision_impact, automation_level, learning_type,
model_family, data_modality, deployment, storage, security,
pii_handling, explainability

------------------------------------------------------------------------

## Feature

Concrete technological capability or control.

Allowed Properties: name, description, id, category, URL, color

------------------------------------------------------------------------

## Regulation

Law, regulatory framework, or standard.

Allowed Properties: id, name, scope, type, description, url, version,
confidentiality

------------------------------------------------------------------------

## RegulatoryRequirement

Concrete operational obligation derived from regulation.

Allowed Properties: id, title, name, description, evidence,
applies_in_stages

------------------------------------------------------------------------

## TopLevelProblem

High-level commercial or governance problem to be resolved.

Allowed Property: description

------------------------------------------------------------------------

## DetailedProblem

Decomposition of TopLevelProblem framed as a visualization task.

Allowed Property: description

Must represent an actionable analytical or visual task.

------------------------------------------------------------------------

## EvidenceArtifact

Documented evidence supporting compliance, validation, or oversight.

Allowed Properties: artifact_id, name, artifact_type, origin,
reuse_status, status, uri, last_updated
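
The per-entity allow-lists in this section lend themselves to a simple property check, sketched below for four of the twelve labels. The property names are copied from the definitions above; the dictionary name, the helper name, and the example `budget` key are assumptions, and the remaining labels would be filled in the same way.

```python
ALLOWED_PROPERTIES = {
    "Project": {
        "name", "ProjectCode", "Description", "TRL", "deployment_region",
        "Location", "latitude", "longitude", "start_date", "stage",
        "geography", "scope", "color",
    },
    "Hub": {"name"},
    "Role": {"name", "color", "location"},
    "Dataset": {"dataset_id", "name", "source_type", "contains_pii"},
    # ... remaining labels omitted from this sketch
}

def unknown_properties(label: str, props: dict) -> list:
    """List property names that the schema does not allow for this label."""
    return sorted(set(props) - ALLOWED_PROPERTIES[label])

# "budget" is a made-up key used to show a rejected property.
rejected = unknown_properties("Project", {"name": "X", "TRL": 4, "budget": 1})
```

Property names are case-sensitive as written in the definitions (for example `TRL` and `Description`), so the check compares them exactly.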

------------------------------------------------------------------------

# Versioning

Version: v1.2

Changes from v1.1:

-   Added second human reviewer protocol
-   Added inter-coder agreement metrics
-   Formalised disagreement resolution
-   Expanded bias audit section
-   Clarified inclusion/exclusion rules