Investigate backend generation and data validity of D_597869478_D_597869478_num in Module 2 #1606

@gloria-trivitt

Overview
The question “In total, how many months or years have you used other progestin-only medication?” (CID 597869478), present in all versions of Module 2, has a response category that appears in neither the data dictionary nor the Quest questionnaire markup (Nicole checked multiple versions in GitHub). We expect answers under D_597869478_D_970604592 (years) or D_597869478_D_434243220 (months). However, Module 2 version 1 contains an additional answer category, D_597869478_D_597869478_num, whose units are unknown. In Module 2 version 2, the years category D_597869478_D_970604592 disappears altogether, while D_597869478_D_597869478_num is still present.

We would like DevOps assistance in determining how D_597869478_D_597869478_num is generated on the back end, and in testing whether it contains any valid year or month data. This matters most in Module 2 version 2, where D_597869478_D_597869478_num accounts for about half of all non-null replies (423 of 874).

Concept definitions

| Concept ID | Definition |
| --- | --- |
| 597869478 | In total, how many months or years have you used other progestin-only medication? |
| 434243220 | Months used |
| 970604592 | Years used |

Query results

SELECT
    COUNT(*) AS count,
    (D_597869478_D_970604592 IS NOT NULL) AS has_years,
    (D_597869478_D_597869478_num IS NOT NULL) AS has_unknown,
    (D_597869478_D_434243220 IS NOT NULL) AS has_months
FROM `nih-nci-dceg-connect-prod-6d04.FlatConnect.module2_v1`
GROUP BY has_years, has_unknown, has_months
ORDER BY count DESC;

| Row | count | has_years | has_unknown | has_months |
| --- | --- | --- | --- | --- |
| 1 | 2302 | false | false | false |
| 2 | 14 | false | false | true |
| 3 | 11 | false | true | false |
| 4 | 2 | true | false | false |

SELECT
    COUNT(*) AS count,
    (D_597869478_D_597869478_num IS NOT NULL) AS has_unknown,
    (D_597869478_D_434243220 IS NOT NULL) AS has_months
FROM `nih-nci-dceg-connect-prod-6d04.FlatConnect.module2_v2`
GROUP BY has_unknown, has_months
ORDER BY count DESC;

| Row | count | has_unknown | has_months |
| --- | --- | --- | --- |
| 1 | 65807 | false | false |
| 2 | 451 | false | true |
| 3 | 423 | true | false |

Supporting context
Current data indicates that when values are present, they appear in only one field (years, months, or unknown) with no overlap between variables.
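
As a quick check on the "about 50%" figure, the share of replies landing in each field can be recomputed from the counts in the query results above. This is a minimal sketch: the `counts` dictionary and the `reply_shares` helper are illustrative names, not part of any Connect tooling. It relies on the observation that no row has more than one field populated, so the non-null counts can simply be summed.

```python
# Counts copied from the two query results above.
# Rows where every field is NULL are left out (2302 in v1, 65807 in v2).
counts = {
    "module2_v1": {"years": 2, "months": 14, "unknown (_num)": 11},
    "module2_v2": {"months": 451, "unknown (_num)": 423},
}


def reply_shares(field_counts):
    """Return each field's share of non-null replies, in percent (1 decimal).

    Summing the counts is valid only because no row has more than one
    of these fields populated.
    """
    total = sum(field_counts.values())
    return {field: round(100 * n / total, 1) for field, n in field_counts.items()}


for table, field_counts in counts.items():
    print(table, reply_shares(field_counts))
```

Under these counts, D_597869478_D_597869478_num holds 11 of 27 non-null replies in version 1 (40.7%) and 423 of 874 in version 2 (48.4%).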
