In [3]: ?make_blobs
Signature: make_blobs(n_samples=100, n_features=2, centers=3, cluster_std=1.0, center_box=(-10.0, 10.0), shuffle=True, random_state=None, *, astype='dataset', **kwargs)
Docstring:
Like sklearn.datasets.samples_generator.make_blobs, but with added functionality.
Parameters
---------------------
Same parameters/arguments as sklearn.datasets.samples_generator.make_blobs, in addition to the following
keyword-only arguments:
astype: str
One of ('array', 'dataframe', 'dataset', 'mldataset') or None to return an NpXyTransformer. See documentation
of NpXyTransformer.astype.
**kwargs: dict
Optional arguments that depend on astype. See documentation of
NpXyTransformer.astype.
@gpfreitas This is related to issues #5 and #6 and tries to condense them into a TODO list.
Items to do related to the argument specs of `make_*` functions from `xarray_filters.datasets`:

- `MLDataset` be the default return value rather than `Dataset`
- `n_samples` argument in this case: `MLDataset(make_blobs(n_samples=2000, shape=(200,10)))` where `n_samples` can be taken from `shape`
- `dask_glm`, e.g. make_classification, we should default to making a `MLDataset` as in the `xarray_filters.datasets` so far, but use `dask_glm`'s funcs for a `dask.array` in each `DataArray` rather than `sklearn.datasets` `numpy` based approach.
- `use_dask_glm=True` keyword to control whether the functions in `dask_glm.datasets` are used.
- `astype` to the following (or equivalent way of specifying the data structures below as the output type): `('pandas.dataframe', 'dask.array', 'dask.dataframe', 'numpy.ndarray', 'dataset', 'mldataset')`
- `xnames` should be `layers`
- `make_blobs` from `xarray_filters` - I think it needs more of the docs from the transformation part explained, e.g. that it typically outputs N-D `DataArray`s in an `MLDataset` or any differences between `sklearn` and `xarray_filters` like `n_samples` versus `shape`

Note - where I said `dask_glm` above - also look at `dask-ml`
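
To make the `n_samples`/`shape` and `astype` items above more concrete, here is a rough sketch of how the argument handling might look. It is only an illustration, not existing `xarray_filters` code: the helper names `resolve_n_samples` and `check_astype` and the constant `PROPOSED_ASTYPES` are hypothetical, and it assumes `n_samples` is simply the product of the entries in `shape` (which matches the `n_samples=2000, shape=(200,10)` example).

```python
import math

# Output types proposed in the TODO list; the constant name is hypothetical.
PROPOSED_ASTYPES = (
    "pandas.dataframe",
    "dask.array",
    "dask.dataframe",
    "numpy.ndarray",
    "dataset",
    "mldataset",
)


def resolve_n_samples(n_samples=None, shape=None):
    """Hypothetical helper: infer n_samples from shape when it is omitted."""
    if shape is None:
        if n_samples is None:
            raise ValueError("give either n_samples or shape")
        return n_samples
    # Assumption: the number of samples is the product of the shape entries,
    # e.g. (200, 10) -> 2000.
    inferred = math.prod(shape)
    if n_samples is not None and n_samples != inferred:
        raise ValueError(
            f"n_samples={n_samples} does not match shape {shape} "
            f"({inferred} samples)"
        )
    return inferred


def check_astype(astype="mldataset"):
    """Hypothetical helper: validate astype, defaulting to 'mldataset'.

    None is still accepted, since the current docstring uses it to
    request an NpXyTransformer.
    """
    if astype is not None and astype not in PROPOSED_ASTYPES:
        raise ValueError(
            f"astype must be None or one of {PROPOSED_ASTYPES}, got {astype!r}"
        )
    return astype
```

With something like this, `resolve_n_samples(shape=(200, 10))` gives 2000, so a caller would not have to pass `n_samples` and `shape` together, and passing an inconsistent pair fails early instead of silently reshaping.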