
Commit d3df4c4

committed
add readme
1 parent b9f053f commit d3df4c4

1 file changed

Lines changed: 64 additions & 0 deletions

File tree

README.md

@@ -0,0 +1,64 @@
# purescript-dataframe

[![Latest release](http://img.shields.io/github/release/gabysbrain/purescript-dataframe.svg)](https://github.com/gabysbrain/purescript-dataframe/releases)

A data structure designed to be used with queries, together with a type for
those queries. There are also semantics for combining queries.

# Example

```purescript
main = do
  let df = init [1, 2, 3, 4, 5, 6, 7]
      q = filter (\x -> x > 3) `chain`
          mutate show `chain`
          trim 3
  putStrLn $ runQuery q df
```

# Getting started

## Installation

```
bower install purescript-dataframe
```

## Queries

The idea of a Query type is that we want a type-safe way to chain
operations on dataframes while keeping the original dataset available
throughout the query. In other data processing languages, losing track of
the original dataset is a common source of error, especially when mutating rows.

The set of dataframe operations is based on what's offered by the
[dplyr](https://github.com/tidyverse/dplyr) R package.

* `filter :: forall r. (r -> Boolean) -> Query (DataFrame r) (DataFrame r)`
  filter rows of the DataFrame
* `group :: forall r g. Ord g => (r -> g) -> Query (DataFrame r) (DataFrame {group :: g, data :: DataFrame r})`
  group the rows of the DataFrame by some grouping function
* `count :: forall r g. Ord g => (r -> g) -> Query (DataFrame r) (DataFrame {group :: g, count :: Int})`
  group the rows of the DataFrame and count the size of each group
* `summarize :: forall r x. (r -> x) -> Query (DataFrame r) (Array x)`
  convert each row of the DataFrame to some type and return an array
* `mutate :: forall r s. (r -> s) -> Query (DataFrame r) (DataFrame s)`
  change each row of the DataFrame to some other type
* `sort :: forall r. (r -> r -> Ordering) -> Query (DataFrame r) (DataFrame r)`
  sort the rows of the DataFrame given the ordering function
* `trim :: forall r. Int -> Query (DataFrame r) (DataFrame r)`
  keep only the first n rows of the DataFrame

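To make the signatures above concrete, here is a minimal sketch of how a few of these operations could be modelled. It is not the library's implementation: the `DataFrame` and `Query` newtypes, the `runQuery` definition, and the module name `QuerySketch` are assumptions made for illustration, with a dataframe treated as a plain array of rows and a query as a plain function.

```purescript
module QuerySketch where

import Prelude

import Data.Array as Array

-- Hypothetical stand-ins: the library's real DataFrame and Query
-- representations are not shown in this README.
newtype DataFrame r = DataFrame (Array r)

newtype Query a b = Query (a -> b)

-- Run a query against an input value (assumed definition).
runQuery :: forall a b. Query a b -> a -> b
runQuery (Query f) = f

-- Keep only the rows that satisfy the predicate.
filter :: forall r. (r -> Boolean) -> Query (DataFrame r) (DataFrame r)
filter p = Query \(DataFrame rows) -> DataFrame (Array.filter p rows)

-- Replace every row with the result of applying f to it.
mutate :: forall r s. (r -> s) -> Query (DataFrame r) (DataFrame s)
mutate f = Query \(DataFrame rows) -> DataFrame (map f rows)

-- Sort the rows with a user-supplied comparison function.
sort :: forall r. (r -> r -> Ordering) -> Query (DataFrame r) (DataFrame r)
sort cmp = Query \(DataFrame rows) -> DataFrame (Array.sortBy cmp rows)

-- Keep only the first n rows.
trim :: forall r. Int -> Query (DataFrame r) (DataFrame r)
trim n = Query \(DataFrame rows) -> DataFrame (Array.take n rows)

-- Map every row to a value and return the plain array.
summarize :: forall r x. (r -> x) -> Query (DataFrame r) (Array x)
summarize f = Query \(DataFrame rows) -> map f rows
```

Under these assumptions, `runQuery (summarize (_ * 2)) (DataFrame [1, 2, 3])` evaluates to `[2, 4, 6]`. Each operation takes its parameters first and returns a `Query`, which is what lets the type checker reject pipelines whose row types do not line up.
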
The `chain :: forall r s t. Query r s -> Query s t -> Query r t` function
allows us to chain queries and keep the original context.

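One way to read the type of `chain` is as left-to-right composition: the output type of the first query must match the input type of the second. The sketch below illustrates only that reading. It assumes a query is a plain function, uses arrays in place of `DataFrame`, and does not model how the library keeps the original context. The `Query` newtype, `runQuery`, the module name, and the helper names `keepLarge`, `render`, `firstThree`, and `pipeline` are all hypothetical.

```purescript
module ChainSketch where

import Prelude

import Data.Array as Array
import Effect (Effect)
import Effect.Console (log)

-- Hypothetical model: a query is a function from input to output.
newtype Query a b = Query (a -> b)

runQuery :: forall a b. Query a b -> a -> b
runQuery (Query f) = f

-- Feed the output of the first query into the second.
chain :: forall r s t. Query r s -> Query s t -> Query r t
chain (Query f) (Query g) = Query (f >>> g)

-- Plain arrays stand in for DataFrame to keep the sketch short.
keepLarge :: Query (Array Int) (Array Int)
keepLarge = Query (Array.filter (_ > 3))

render :: Query (Array Int) (Array String)
render = Query (map show)

firstThree :: Query (Array String) (Array String)
firstThree = Query (Array.take 3)

-- The intermediate types must line up for this to compile.
pipeline :: Query (Array Int) (Array String)
pipeline = keepLarge `chain` render `chain` firstThree

main :: Effect Unit
main = log (show (runQuery pipeline [1, 2, 3, 4, 5, 6, 7]))
```

In this model the pipeline keeps the values greater than 3, converts them to strings, and takes the first three, so `runQuery pipeline [1, 2, 3, 4, 5, 6, 7]` evaluates to `["4", "5", "6"]`. Swapping `render` and `keepLarge` would be a compile-time error, because `keepLarge` expects an `Array Int` and would be handed an `Array String`.
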
# API Docs

API documentation is [published on Pursuit](http://pursuit.purescript.org/packages/purescript-dataframe).

# Todos

* DataFrames should be able to operate either as a set of columns or a set of rows.
* Queries should do caching
