|
10 | 10 |
|
11 | 11 | This package provides load and save support for CSV Files under the |
12 | 12 | [FileIO.jl](https://github.com/JuliaIO/FileIO.jl) package. |
| 13 | + |
| 14 | +## Installation |
| 15 | + |
| 16 | +Use ``Pkg.add("CSVFiles")`` in Julia to install CSVFiles and its dependencies. |
| 17 | + |
| 18 | +## Usage |
| 19 | + |
| 20 | +### Load a CSV file |
| 21 | + |
| 22 | +To read a CSV file into a ``DataFrame``, use the following Julia code: |
| 23 | + |
| 24 | +````julia |
| 25 | +using FileIO, CSVFiles, DataFrames |
| 26 | + |
| 27 | +df = DataFrame(load("data.csv")) |
| 28 | +```` |
| 29 | + |
| 30 | +The call to ``load`` returns a ``struct`` that is an [IterableTables.jl](https://github.com/davidanthoff/IterableTables.jl) iterable table, so it can be passed to any function that can handle iterable tables, i.e. all the sinks in [IterableTables.jl](https://github.com/davidanthoff/IterableTables.jl). Here are some examples of materializing a CSV file into data structures other than a ``DataFrame``: |
| 31 | + |
| 32 | +````julia |
| 33 | +using FileIO, CSVFiles, DataTables, IndexedTables, TimeSeries, Temporal, Gadfly |
| 34 | + |
| 35 | +# Load into a DataTable |
| 36 | +dt = DataTable(load("data.csv")) |
| 37 | + |
| 38 | +# Load into an IndexedTable |
| 39 | +it = IndexedTable(load("data.csv")) |
| 40 | + |
| 41 | +# Load into a TimeArray |
| 42 | +ta = TimeArray(load("data.csv")) |
| 43 | + |
| 44 | +# Load into a TS |
| 45 | +ts = TS(load("data.csv")) |
| 46 | + |
| 47 | +# Plot directly with Gadfly |
| 48 | +plot(load("data.csv"), x=:a, y=:b, Geom.line) |
| 49 | +```` |
| 50 | + |
| 51 | +The ``load`` function also takes a number of parameters: |
| 52 | + |
| 53 | +````julia |
| 54 | +load(f::FileIO.File{FileIO.format"CSV"}, delim=','; <arguments>...) |
| 55 | +```` |
| 56 | +#### Arguments |
| 57 | + |
| 58 | +* ``file``: either an IO object or file name string |
| 59 | +* ``delim``: the delimiter character |
| 60 | +* ``quotechar``: character used to quote strings, defaults to ``"`` |
| 61 | +* ``escapechar``: character used to escape ``quotechar`` in strings (can be the same as ``quotechar``) |
| 62 | +* ``nrows``: number of rows in the file. Defaults to ``0``, in which case the number of rows is estimated. |
| 63 | +* ``header_exists``: boolean specifying whether CSV file contains a header |
| 64 | +* ``colnames``: manually specified column names. Could be a vector or a dictionary from Int index (the column) to String column name. |
| 65 | +* ``colparsers``: parsers to use for specified columns. This can be a vector or a dictionary from column name / column index (Int) to a "parser". The simplest parser is a type such as ``Int`` or ``Float64``. It can also be a ``dateformat"..."``; see ``CustomParser`` if you want to plug in custom parsing behavior. |
| 66 | +* ``type_detect_rows``: number of rows used to infer the initial ``colparsers``, defaults to 20. |
| 67 | + |
| 68 | +These are simply the arguments from [TextParse.jl](https://github.com/JuliaComputing/TextParse.jl), which is used under the hood to read CSV files. |
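
As a rough illustration of how these arguments combine, the sketch below reads a semicolon-delimited file that has no header row. It assumes the ``load`` signature shown above (``delim`` passed positionally, the remaining options as keywords); other versions of the package may expect a different call form. The file name ``data_semicolon.csv`` and the column names are made up for the example.

````julia
using FileIO, CSVFiles, DataFrames

# Hypothetical input file: three columns separated by ';', no header row.
open("data_semicolon.csv", "w") do io
    write(io, "1;2.5;x\n2;3.5;y\n")
end

# delim is given positionally; the other options are keyword arguments
# that are forwarded to the underlying parser.
df = DataFrame(load("data_semicolon.csv", ';';
                    header_exists=false,
                    colnames=["id", "value", "label"]))
````

Because the file has no header, ``colnames`` supplies the column names that would otherwise be read from the first row.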
| 69 | + |
| 70 | +### Save a CSV file |
| 71 | + |
| 72 | +The following code saves any iterable table as a CSV file: |
| 73 | +````julia |
| 74 | +using FileIO, CSVFiles |
| 75 | + |
| 76 | +save("output.csv", it) |
| 77 | +```` |
| 78 | +This will work as long as ``it`` is any of the types supported as sources in [IterableTables.jl](https://github.com/davidanthoff/IterableTables.jl). |
| 79 | + |
| 80 | +The ``save`` function takes a number of arguments: |
| 81 | +````julia |
| 82 | +save(f::FileIO.File{FileIO.format"CSV"}, data; delim=',', quotechar='"', escapechar='\\', header=true) |
| 83 | +```` |
| 84 | + |
| 85 | +#### Arguments |
| 86 | + |
| 87 | +* ``delim``: the delimiter character, defaults to ``,``. |
| 88 | +* ``quotechar``: character used to quote strings, defaults to ``"``. |
| 89 | +* ``escapechar``: character used to escape ``quotechar`` in strings, defaults to ``\``. |
| 90 | +* ``header``: whether a header should be written, defaults to ``true``. |
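
A minimal sketch of these options in use, assuming the ``save`` signature shown above: it writes a small table with ``;`` as the delimiter and without a header row. The ``DataFrame`` contents and the file name ``output_semicolon.csv`` are invented for illustration.

````julia
using FileIO, CSVFiles, DataFrames

# Hypothetical example data; any iterable table source should work here.
df = DataFrame(a=[1, 2], b=[4.0, 5.0])

# Use ';' instead of ',' between fields and skip the header row.
save("output_semicolon.csv", df; delim=';', header=false)
````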
| 91 | + |
| 92 | +### Using the pipe syntax |
| 93 | + |
| 94 | +Both ``load`` and ``save`` also support the pipe syntax. For example, to load a CSV file into a ``DataFrame``, one can use the following code: |
| 95 | + |
| 96 | +````julia |
| 97 | +using FileIO, CSVFiles, DataFrames |
| 98 | + |
| 99 | +df = load("data.csv") |> DataFrame |
| 100 | +```` |
| 101 | + |
| 102 | +To save an iterable table, one can use the following form: |
| 103 | + |
| 104 | +````julia |
| 105 | +using FileIO, CSVFiles, DataFrames |
| 106 | + |
| 107 | +df = DataFrame(a=[1, 2, 3], b=[4.0, 5.0, 6.0]) # Example data; any DataFrame works here |
| 108 | + |
| 109 | +df |> save("output.csv") |
| 110 | +```` |
| 111 | + |
| 112 | +The pipe syntax is especially useful in combination with [Query.jl](https://github.com/davidanthoff/Query.jl) queries. For example, one can load a CSV file, pipe it into a query, and then pipe the result to the ``save`` function to store it in a new file. |
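
The load, query, save pipeline could look roughly like the sketch below. It assumes a Query.jl version that provides the standalone ``@filter`` operator with the ``_`` placeholder syntax. The sample file contents, the columns ``a`` and ``b``, and the output name ``filtered.csv`` are made up for the example.

````julia
using FileIO, CSVFiles, Query

# Hypothetical sample input with a header row and two columns.
open("data.csv", "w") do io
    write(io, "a,b\n1,10\n2,20\n3,30\n4,40\n")
end

# Load the file, keep only the rows where a > 2, and write the result to a new file.
load("data.csv") |> @filter(_.a > 2) |> save("filtered.csv")
````

No intermediate ``DataFrame`` is needed here, since each step consumes and produces an iterable table.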