Imputation of Categorical Variables giving ValueError #26

@ankitsethknp

My dataset has both continuous and categorical variables, and both types have missing values encoded as NaN. I have been trying to impute the continuous and categorical missing data using MissForest, but after passing the data to the fit method, I get a ValueError for the categorical columns:

<ipython-input-8-5a247d138e72> in <module>
      7 imputer = MissForest()
----> 8 tmp = imputer.fit(df1, cat_vars = [0,5,8,9])
      9 df1_imputed = tmp.transform(df1)

~\AppData\Local\Continuum\anaconda3\lib\site-packages\missingpy\missforest.py in fit(self, X, y, cat_vars)
    439 
    440         X = check_array(X, accept_sparse=False, dtype=np.float64,
--> 441                         force_all_finite=force_all_finite, copy=self.copy)
    442 
    443         # Check for +/- inf

~\AppData\Local\Continuum\anaconda3\lib\site-packages\sklearn\utils\validation.py in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, warn_on_dtype, estimator)
    494             try:
    495                 warnings.simplefilter('error', ComplexWarning)
--> 496                 array = np.asarray(array, dtype=dtype, order=order)
    497             except ComplexWarning:
    498                 raise ValueError("Complex data not supported\n"

~\AppData\Local\Continuum\anaconda3\lib\site-packages\numpy\core\_asarray.py in asarray(a, dtype, order)
     83 
     84     """
---> 85     return array(a, dtype, copy=False, order=order)
     86 
     87 

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\generic.py in __array__(self, dtype)
   1896 
   1897     def __array__(self, dtype=None) -> np.ndarray:
-> 1898         return np.asarray(self._values, dtype=dtype)
   1899 
   1900     def __array_wrap__(

~\AppData\Local\Continuum\anaconda3\lib\site-packages\numpy\core\_asarray.py in asarray(a, dtype, order)
     83 
     84     """
---> 85     return array(a, dtype, copy=False, order=order)
     86 
     87 

ValueError: could not convert string to float: 'B'
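
The traceback shows `check_array` being called with `dtype=np.float64`, so a string label such as 'B' cannot be converted. One possible workaround, sketched below under assumptions, is to replace the string categories with numeric codes (keeping NaN for missing entries) before calling `fit`. The helper `encode_categoricals`, the toy frame, and its column names are hypothetical and only stand in for `df1`; this is not taken from missingpy itself.

```python
import numpy as np
import pandas as pd


def encode_categoricals(df, cat_cols):
    """Return a float copy of df with string categories replaced by numeric codes.

    Missing values stay NaN. The per-column category lists are returned as well,
    so codes can be mapped back to labels after imputation.
    (Hypothetical helper, not part of missingpy.)
    """
    encoded = df.copy()
    categories = {}
    for col in cat_cols:
        as_cat = encoded[col].astype("category")
        categories[col] = list(as_cat.cat.categories)
        codes = as_cat.cat.codes.astype("float64")  # pandas uses -1 for missing
        encoded[col] = codes.where(codes >= 0, np.nan)  # turn -1 back into NaN
    return encoded.astype("float64"), categories


# Toy data standing in for df1 from the report (column names are made up).
df1 = pd.DataFrame({
    "grade": ["A", "B", np.nan, "B"],
    "score": [1.5, np.nan, 3.0, 4.5],
})
cat_cols = ["grade"]
encoded, categories = encode_categoricals(df1, cat_cols)

# Positional indices of the categorical columns, in the form cat_vars is given above.
cat_idx = [df1.columns.get_loc(c) for c in cat_cols]
X = encoded.to_numpy()  # all-float array, NaN where values are missing

# Assumed follow-up, mirroring the call in the traceback:
# imputer = MissForest()
# X_imputed = imputer.fit(X, cat_vars=cat_idx).transform(X)
```

After imputation, the codes in the categorical columns could be mapped back to labels through `categories`, assuming the imputed values in those columns are valid category codes.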
