measurement_date crash in biology #75

@PPPinson

Description

prepare_measurement_table raises the following error:

MissingConceptError: The DataFrame is missing some columns, namely:
- measurement_date

There are often issues with date columns when moving data between Spark and pandas. We should rely only on the measurement_datetime column.

Solution: remove measurement_date from the variable "_measurement_required_columns" in utils.check_data.check_data_and_select_columns_measurement.
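
To illustrate the proposed change, here is a minimal sketch of a required-columns check before and after dropping measurement_date. The column list and the helper find_missing_columns are hypothetical stand-ins, not the actual contents of eds_scikit; only measurement_date and measurement_datetime come from this issue.

```python
# Hypothetical stand-in for `_measurement_required_columns`.
# The real list in eds_scikit may contain different columns.
REQUIRED_COLUMNS = [
    "measurement_id",
    "person_id",
    "visit_occurrence_id",
    "measurement_date",
    "measurement_datetime",
    "value_as_number",
]


def find_missing_columns(available, required):
    """Return the required columns that are absent from `available`, in order."""
    available_set = set(available)
    return [column for column in required if column not in available_set]


# A table that only exposes the datetime column, as suggested in this issue.
available_columns = [
    "measurement_id",
    "person_id",
    "visit_occurrence_id",
    "measurement_datetime",
    "value_as_number",
]

# Current behaviour: measurement_date is reported as missing.
missing_before = find_missing_columns(available_columns, REQUIRED_COLUMNS)

# Proposed behaviour: measurement_date is no longer required.
relaxed_columns = [c for c in REQUIRED_COLUMNS if c != "measurement_date"]
missing_after = find_missing_columns(available_columns, relaxed_columns)

print(missing_before)
print(missing_after)
```

With the relaxed list, a table that carries only measurement_datetime passes the check, which is the behaviour this issue asks for.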

How to reproduce the bug

prepare_measurement_table issue

import eds_scikit
from eds_scikit.biology import prepare_measurement_table, ConceptsSet
from eds_scikit.io import HiveData
data = HiveData("MyDB")  # replace "MyDB" with the name of your Hive database

leukocytes_set = ConceptsSet("Leukocytes_Blood_Count")
measurement = prepare_measurement_table(
    data,
    start_date="2022-01-01",
    end_date="2022-05-01",
    concept_sets=[leukocytes_set],
    convert_units=False,
    get_all_terminologies=True,
)

date columns issue

sql("SELECT measurement_date FROM measurement limit 10").toPandas()

raises: "AttributeError: Can only use .dt accessor with datetimelike values"
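
One plausible cause, shown here in pandas alone without Spark, is that a date column arrives as an object column of datetime.date values rather than as datetime64. The sketch below reproduces the same error message on such a column and shows a possible conversion with pd.to_datetime. The sample values are made up for illustration, and this is an assumption about the failure mode, not a confirmed diagnosis of the eds_scikit code path.

```python
import datetime

import pandas as pd

# Made-up sample: datetime.date values are stored by pandas as plain objects.
df = pd.DataFrame(
    {
        "measurement_date": [
            datetime.date(2022, 1, 3),
            datetime.date(2022, 2, 14),
        ]
    }
)

# The .dt accessor is only available on datetimelike dtypes,
# so it fails on an object column.
try:
    df["measurement_date"].dt.year
    accessor_error = None
except AttributeError as exc:
    accessor_error = str(exc)

# Converting the column to datetime64 makes the .dt accessor usable.
df["measurement_date"] = pd.to_datetime(df["measurement_date"])
years = df["measurement_date"].dt.year.tolist()

print(accessor_error)
print(years)
```

Relying on measurement_datetime, as proposed above, avoids the conversion entirely.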
