6 changes: 1 addition & 5 deletions Project.toml
@@ -5,28 +5,24 @@ version = "1.0.1-DEV"
[deps]
DataValues = "e7dc6d0d-1eca-5fa6-8ad6-5aecde8b7ea5"
Dates = "ade2ca70-3891-5945-98fb-dc099432e06a"
ExcelReaders = "c04bee98-12a5-510c-87df-2a230cb6e075"
FileIO = "5789e2e9-d7fb-5bc7-8068-2c6fae9b9549"
IterableTables = "1c8ee90f-4401-5389-894e-7a04a3dc0f4d"
IteratorInterfaceExtensions = "82899510-4779-5014-852e-03e436cf321d"
Printf = "de0858da-6303-5e67-8744-51eddeeeb8d7"
PyCall = "438e738f-606a-5dbb-bf0a-cddfbfd45ab0"
TableShowUtils = "5e66a065-1f0a-5976-b372-e0b8c017ca10"
TableTraits = "3783bdb8-4a98-5b6b-af9a-565f29a5fe9c"
TableTraitsUtils = "382cd787-c1b6-5bf2-a167-d5b971a19bda"
XLSX = "fdbf4ff8-1666-58a4-91e7-1b58723a45e0"

[compat]
DataValues = "0.4.11"
ExcelReaders = "0.11"
FileIO = "1"
IterableTables = "0.8.3, 0.9, 0.10, 0.11, 1"
IteratorInterfaceExtensions = "0.1.1, 1"
PyCall = "1.90"
TableShowUtils = "0.2"
TableTraits = "0.3.1, 0.4, 1"
TableTraitsUtils = "0.3, 0.4, 1"
XLSX = "0.4.1, 0.5, 0.6, 0.7, 0.8, 0.9"
XLSX = "0.10, 0.11"
julia = "1"

[extras]
92 changes: 69 additions & 23 deletions README.md
Expand Up @@ -7,9 +7,18 @@

## Overview

This package provides load support for Excel files under the
This package provides support for Excel files under the
[FileIO.jl](https://github.com/JuliaIO/FileIO.jl) package.

It provides functionality to read simple tabular data from
an Excel (.xlsx) file and to save simple tabular data to an
Excel file.

For more extensive functionality when reading and writing Excel files,
consider using [XLSX.jl](https://felipenoris.github.io/XLSX.jl/stable/).
Under the hood, `ExcelFiles.jl` uses the `XLSX.jl` functions `readtable`
and `writetable`.

## Installation

Use ``Pkg.add("ExcelFiles")`` in Julia to install ExcelFiles and its dependencies.
@@ -18,17 +27,17 @@ Use ``Pkg.add("ExcelFiles")`` in Julia to install ExcelFiles and its dependencie

### Load an Excel file

To read a Excel file into a ``DataFrame``, use the following julia code:
To read an Excel file into a `DataFrame`, use the following Julia code:

````julia
```julia
using ExcelFiles, DataFrames

df = DataFrame(load("data.xlsx", "Sheet1"))
````
```

The call to ``load`` returns a ``struct`` that is an [IterableTable.jl](https://github.com/queryverse/IterableTables.jl), so it can be passed to any function that can handle iterable tables, i.e. all the sinks in [IterableTable.jl](https://github.com/queryverse/IterableTables.jl). Here are some examples of materializing an Excel file into data structures that are not a ``DataFrame``:
The call to `load` returns an iterable table as defined by [IterableTables.jl](https://github.com/queryverse/IterableTables.jl), so it can be passed to any function that accepts iterable tables, i.e. any of the sinks supported by IterableTables.jl. Here are some examples of materializing an Excel file into data structures other than a `DataFrame`:

````julia
```julia
using ExcelFiles, DataTables, IndexedTables, TimeSeries, Temporal, Gadfly

# Load into a DataTable
@@ -45,46 +54,83 @@ ts = TS(load("data.xlsx", "Sheet1"))

# Plot directly with Gadfly
plot(load("data.xlsx", "Sheet1"), x=:a, y=:b, Geom.line)
````
```

The `load` function takes a number of arguments and keywords:

```julia
FileIO.load(
source::String,
[sheet::String,
[columns::String]];
[first_row::Int],
[column_labels::Vector{String}],
[header::Bool],
[normalizenames::Bool]
)
```

The ``load`` function also takes a number of parameters:

````julia
function load(f::FileIO.File{FileIO.format"Excel"}, range; keywords...)
````
#### Arguments:

* ``range``: either the name of the sheet in the Excel file to read, or a full Excel range specification (i.e. "Sheetname!A1:B2").
* The ``keywords`` arguments are the same as in [ExcelReaders.jl](https://github.com/queryverse/ExcelReaders.jl) (which is used under the hood to read Excel files). When ``range`` is a sheet name, the keyword arguments for the ``readxlsheet`` function from ExcelReaders.jl apply, if ``range`` is a range specification, the keyword arguments for the ``readxl`` function apply.
* `source`: The name of the file to be loaded.
* `sheet`: Specifies the sheet name to be loaded. If `sheet` is not given, the first Excel sheet in the file will be used.
* `columns`: Determines which columns to read. For example, "B:D" will select columns B, C and D. If columns is not given, the algorithm will find the first sequence of consecutive non-empty cells. A valid sheet **must** be specified when specifying columns.

#### Keywords:

* `first_row`: Indicates the first row of the data table to be read. For example, `first_row=5` will look for a table starting at sheet row 5. If first_row is not given, the algorithm will look for the first non-empty row in the sheet.
* `header`: Indicates if the first row is a header. If `header=true` and `column_labels` is not specified, the column labels for the table will be read from the first row of the table. If `header=false` and `column_labels` is not specified, the algorithm will generate column labels. The default value is `header=true`.
* `column_labels`: Specifies column names for the header of the table. If `column_labels` are given and `header=true`, the headers given by `column_labels` will be used, and the first row of the table (containing headers) will be ignored.
* `normalizenames`: Set to `true` to normalize column names to valid Julia identifiers. The default value is `normalizenames=false`.
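
As a minimal sketch of how the positional arguments and keywords combine, the example below builds a small workbook with XLSX.jl and reads it back. It assumes `load` behaves as documented in the signature above; the file name, cell values and column labels are hypothetical and chosen only for illustration.

```julia
using ExcelFiles, DataFrames
import XLSX

path = joinpath(mktempdir(), "offset.xlsx")

# Build a workbook whose table sits in columns B:D, starts at row 5
# and has no header row (hypothetical layout for this sketch).
XLSX.openxlsx(path, mode="w") do xf
    sheet = xf[1]  # a new workbook starts with a single sheet named "Sheet1"
    rows = [(1, "apple", 0.5), (2, "pear", 1.25)]
    for (i, (id, name, value)) in enumerate(rows)
        sheet["B$(i + 4)"] = id
        sheet["C$(i + 4)"] = name
        sheet["D$(i + 4)"] = value
    end
end

# Read columns B:D starting at row 5 and supply the column labels explicitly,
# since the table itself has no header row.
df = DataFrame(load(path, "Sheet1", "B:D";
                    first_row = 5,
                    header = false,
                    column_labels = ["id", "name", "value"]))
```

Note that a sheet name is passed here because `columns` is given, which matches the requirement stated in the argument list.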

### Save an Excel file

The following code saves any iterable table as an Excel file:
````julia

```julia
using ExcelFiles

save("output.xlsx", it)
````
This will work as long as it is any of the types supported as sources in IterableTables.jl.
```
This will work as long as `it` is any of the types supported as sources in IterableTables.jl (such as a `DataFrame`).

The `save` function takes a number of arguments and keywords:

```julia
FileIO.save(
source::String,
data; # the iterable table to write (argument name is illustrative)
[sheetname::String],
[overwrite::Bool]
)
```

#### Arguments:

* `source`: The name of the file to be created on save.

#### Keywords:

* `sheetname`: Specify the sheetname to be used in the created file. By default, the sheetname will be `Sheet1`.
* `overwrite`: Set `overwrite=true` to overwrite any existing file of the same name. The default value is `overwrite=false`.
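
The keywords can be sketched as follows, assuming `save` accepts `sheetname` and `overwrite` as described above. The data, the sheet name `"Data"` and the temporary path are made up for the example.

```julia
using ExcelFiles, DataFrames

df = DataFrame(a = [1, 2, 3], b = ["x", "y", "z"])
path = joinpath(mktempdir(), "output.xlsx")

# First write: the file does not exist yet, so no overwrite is needed.
save(path, df; sheetname = "Data")

# Second write to the same path: per the keyword description above,
# an existing file is only replaced when overwrite=true is passed.
save(path, df; sheetname = "Data", overwrite = true)

# Read the sheet back by the name chosen at save time.
restored = DataFrame(load(path, "Data"))
```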

### Using the pipe syntax

``load`` also support the pipe syntax. For example, to load an Excel file into a ``DataFrame``, one can use the following code:
The `load` and `save` functions also support the pipe syntax. For example, to load an Excel file into a `DataFrame`, one can use the following code:

````julia
```julia
using ExcelFiles, DataFrames

df = load("data.xlsx", "Sheet1") |> DataFrame
````
```

To save an iterable table, one can use the following form:

````julia
```julia
using ExcelFiles, DataFrames

df = DataFrame(a = [1, 2, 3], b = ["x", "y", "z"]) # Stand-in for a DataFrame acquired elsewhere

df |> save("output.xlsx")
````
```

The pipe syntax is especially useful when combining it with [Query.jl](https://github.com/queryverse/Query.jl) queries, for example one can easily load an Excel file, pipe it into a query, then pipe it to the ``save`` function to store the results in a new file.
The pipe syntax is especially useful in combination with [Query.jl](https://github.com/queryverse/Query.jl) queries. For example, one can load an Excel file, pipe it into a query, then pipe the result to the `save` function to store it in a new file.
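
A load, query and save pipeline of this kind might look like the sketch below. It assumes Query.jl is installed and that `save` writes to a sheet named `Sheet1` by default, as stated in the keyword list; the file names and the filter condition are hypothetical.

```julia
using ExcelFiles, DataFrames, Query

src = joinpath(mktempdir(), "input.xlsx")
dst = joinpath(dirname(src), "filtered.xlsx")

# Create an input file to work with (example data only).
DataFrame(a = [1, 2, 3, 4], b = ["w", "x", "y", "z"]) |> save(src)

# Load the sheet, keep the rows where column a exceeds 2,
# and write the result to a new file.
load(src, "Sheet1") |> @filter(_.a > 2) |> save(dst)
```

No intermediate `DataFrame` is materialized here: the query consumes the iterable table returned by `load` and hands its result straight to `save`.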
Binary file added data/TestData.xlsx
Binary file not shown.
123 changes: 123 additions & 0 deletions docs/src/index.md
@@ -1 +1,124 @@
# Introduction

This package provides support for Excel files under the
[FileIO.jl](https://github.com/JuliaIO/FileIO.jl) package.

It provides functionality to read simple tabular data from
an Excel (.xlsx) file and to save simple tabular data to an
Excel file.

For more extensive functionality when reading and writing Excel files,
consider using [XLSX.jl](https://felipenoris.github.io/XLSX.jl/stable/).
Under the hood, `ExcelFiles.jl` uses the `XLSX.jl` functions `readtable`
and `writetable`.

# Usage

## Load an Excel file

To read an Excel file into a `DataFrame`, use the following Julia code:

```julia
using ExcelFiles, DataFrames

df = DataFrame(load("data.xlsx", "Sheet1"))
```

The call to `load` returns an iterable table as defined by [IterableTables.jl](https://github.com/queryverse/IterableTables.jl), so it can be passed to any function that accepts iterable tables, i.e. any of the sinks supported by IterableTables.jl. Here are some examples of materializing an Excel file into data structures other than a `DataFrame`:

```julia
using ExcelFiles, DataTables, IndexedTables, TimeSeries, Temporal, Gadfly

# Load into a DataTable
dt = DataTable(load("data.xlsx", "Sheet1"))

# Load into an IndexedTable
it = IndexedTable(load("data.xlsx", "Sheet1"))

# Load into a TimeArray
ta = TimeArray(load("data.xlsx", "Sheet1"))

# Load into a TS
ts = TS(load("data.xlsx", "Sheet1"))

# Plot directly with Gadfly
plot(load("data.xlsx", "Sheet1"), x=:a, y=:b, Geom.line)
```

The `load` function takes a number of arguments and keywords:

```julia
FileIO.load(
source::String,
[sheet::String,
[columns::String]];
[first_row::Int],
[column_labels::Vector{String}],
[header::Bool],
[normalizenames::Bool]
)
```

### Arguments:

* `source`: The name of the file to be loaded.
* `sheet`: Specifies the sheet name to be loaded. If `sheet` is not given, the first Excel sheet in the file will be used.
* `columns`: Determines which columns to read. For example, "B:D" will select columns B, C and D. If columns is not given, the algorithm will find the first sequence of consecutive non-empty cells. A valid sheet **must** be specified when specifying columns.

### Keywords:

* `first_row`: Indicates the first row of the data table to be read. For example, `first_row=5` will look for a table starting at sheet row 5. If first_row is not given, the algorithm will look for the first non-empty row in the sheet.
* `header`: Indicates if the first row is a header. If `header=true` and `column_labels` is not specified, the column labels for the table will be read from the first row of the table. If `header=false` and `column_labels` is not specified, the algorithm will generate column labels. The default value is `header=true`.
* `column_labels`: Specifies column names for the header of the table. If `column_labels` are given and `header=true`, the headers given by `column_labels` will be used, and the first row of the table (containing headers) will be ignored.
* `normalizenames`: Set to `true` to normalize column names to valid Julia identifiers. Default=`false`.
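
To illustrate `normalizenames`, the sketch below reads the same sheet twice, once with the header text taken as is and once with normalization enabled. It assumes the keyword behaves as described above; the workbook and its header strings are invented for the example, and the exact normalized names are not shown because they depend on the underlying implementation.

```julia
using ExcelFiles, DataFrames
import XLSX

path = joinpath(mktempdir(), "headers.xlsx")

# Write a header row whose labels are not valid Julia identifiers.
XLSX.openxlsx(path, mode="w") do xf
    sheet = xf[1]  # a new workbook starts with a single sheet named "Sheet1"
    sheet["A1"] = "Unit Price"
    sheet["B1"] = "2024 total"
    sheet["A2"] = 1.5
    sheet["B2"] = 10
end

# Default: column names are taken verbatim from the header row.
raw = DataFrame(load(path, "Sheet1"))

# With normalizenames=true the names should become valid identifiers,
# so columns can be accessed with plain dot syntax.
clean = DataFrame(load(path, "Sheet1"; normalizenames = true))
```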

## Save an Excel file

The following code saves any iterable table as an Excel file:
```julia
using ExcelFiles

save("output.xlsx", it)
```
This will work as long as `it` is any of the types supported as sources in IterableTables.jl (such as a `DataFrame`).

The `save` function takes a number of arguments and keywords:

```julia
FileIO.save(
source::String,
data; # the iterable table to write (argument name is illustrative)
[sheetname::String],
[overwrite::Bool]
)
```

### Arguments:

* `source`: The name of the file to be created on save.

### Keywords:

* `sheetname`: Specify the sheetname to be used in the created file. By default, the sheetname will be `Sheet1`.
* `overwrite`: Set `overwrite=true` to overwrite any existing file of the same name. The default value is `overwrite=false`.

## Using the pipe syntax

The `load` and `save` functions also support the pipe syntax. For example, to load an Excel file into a `DataFrame`, one can use the following code:

```julia
using ExcelFiles, DataFrames

df = load("data.xlsx", "Sheet1") |> DataFrame
```

To save an iterable table, one can use the following form:

```julia
using ExcelFiles, DataFrames

df = DataFrame(a = [1, 2, 3], b = ["x", "y", "z"]) # Stand-in for a DataFrame acquired elsewhere

df |> save("output.xlsx")
```

The pipe syntax is especially useful in combination with [Query.jl](https://github.com/queryverse/Query.jl) queries. For example, one can load an Excel file, pipe it into a query, then pipe the result to the `save` function to store it in a new file.