
Commit 9dec79b

ARROW-5505: [R] Normalize file and class names, stop masking base R functions, add vignette, improve documentation
The main thrust of the changes is summarized in the new vignette:

> C++ is an object-oriented language, so the core logic of the Arrow library is encapsulated in classes and methods. In the R package, these classes are implemented as `R6` reference classes, most of which are exported from the namespace.
>
> In order to match the C++ naming conventions, the `R6` classes are in TitleCase, e.g. `RecordBatch`. This makes it easy to look up the relevant C++ implementations in the [code](https://github.com/apache/arrow/tree/master/cpp) or [documentation](https://arrow.apache.org/docs/cpp/). To simplify things in R, the C++ library namespaces are generally dropped or flattened; that is, where the C++ library has `arrow::io::FileOutputStream`, it is just `FileOutputStream` in the R package. One exception is for the file readers, where the namespace is necessary to disambiguate. So `arrow::csv::TableReader` becomes `CsvTableReader`, and `arrow::json::TableReader` becomes `JsonTableReader`.
>
> Some of these classes are not meant to be instantiated directly; they may be base classes or other kinds of helpers. For those that you should be able to create, use the `$create()` method to instantiate an object. For example, `rb <- RecordBatch$create(int = 1:10, dbl = as.numeric(1:10))` will create a `RecordBatch`. Many of these factory methods that an R user might most often encounter also have a `snake_case` alias, in order to be more familiar for contemporary R users. So `record_batch(int = 1:10, dbl = as.numeric(1:10))` would do the same as `RecordBatch$create()` above.
>
> The typical user of the `arrow` R package may never deal directly with the `R6` objects. We provide more R-friendly wrapper functions as a higher-level interface to the C++ library. An R user can call `read_parquet()` without knowing or caring that they're instantiating a `ParquetFileReader` object and calling the `$ReadFile()` method on it. The classes are there and available to the advanced programmer who wants fine-grained control over how the C++ library is used.

There are a few other fixes and cleanups rolled in here, named in the individual commit messages below. I stopped short of more documentation consolidation because (1) this patch is already huge and (2) `R6` classes are really tedious to document because it's all manual. I did some searching around and found open issues from 2014 and 2015 about supporting R6 better in roxygen2.

Closes #5279 from nealrichardson/cleaner-class-names and squashes the following commits:

3c6f85b <Neal Richardson> 🐀
22c9d04 <Neal Richardson> More doc cleaning
01084ce <Neal Richardson> Factor out assert_is()
caf3265 <Neal Richardson> PR feedback from romain
adf1cf9 <Neal Richardson> File renaming (not case-sensitive)
35f00f5 <Neal Richardson> Rename Table.R to table.R
8bd52d7 <Neal Richardson> Rename Struct.R to struct.R
358290b <Neal Richardson> Rename Schema.R to schema.R
924edd1 <Neal Richardson> Rename List.R to list.R
0150d99 <Neal Richardson> Rename Field.R to field.R
8683f10 <Neal Richardson> Add content to vignette from blog post
e6b75f4 <Neal Richardson> Consolidate and document reader/writer classes; also fix ARROW-6449
495abf6 <Neal Richardson> Fill in documentation and standardize file naming
5fd49ef <Neal Richardson> Fix check failures
96873e1 <Neal Richardson> Factor out make_readable_file
3e4cfe7 <Neal Richardson> Clean up parquet classes and document the R6
85a8d36 <Neal Richardson> Start vignette draft explaining the class and naming conventions
71cac57 <Neal Richardson> Clean up Rd file names, experiment with documenting constructors, and start updating pkgdown
2d1b738 <Neal Richardson> Replace table() with Table()
b694511 <Neal Richardson> Remove defunct Column class
730313e <Neal Richardson> One more find/replace, esp. RecordBatch*
702a0b1 <Neal Richardson> Message
365fedc <Neal Richardson> feather
0e7877b <Neal Richardson> Drop ::ipc::
55607a6 <Neal Richardson> json
9bd708f <Neal Richardson> csv
fbebf27 <Neal Richardson> io
1711d3e <Neal Richardson> CastOptions
12031ad <Neal Richardson> Backfill some methods
4075897 <Neal Richardson> compression
3b4b492 <Neal Richardson> ChunkedArray
bbf0799 <Neal Richardson> Buffer
3f1cd71 <Neal Richardson> Object
9fbecda <Neal Richardson> A few more backticks
8edf085 <Neal Richardson> Remove more backticks
1f6d154 <Neal Richardson> Replace array() with Array()
9f52490 <Neal Richardson> Progress commit renaming Array

Authored-by: Neal Richardson <[email protected]>
Signed-off-by: Neal Richardson <[email protected]>
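
The namespace-flattening rule quoted from the vignette can be sketched in a few lines of base R. This is only an illustration: `flatten_arrow_name()` is a hypothetical helper, not a function in the `arrow` package, and the set of namespaces that keep a prefix is assumed to be just the two file readers named in the commit message (`csv` and `json`).

```r
# Hypothetical helper, NOT part of the arrow package: maps a fully qualified
# C++ class name to the R6 class name that the vignette text describes.
flatten_arrow_name <- function(cpp_name) {
  parts <- strsplit(cpp_name, "::", fixed = TRUE)[[1]]
  class_name <- parts[length(parts)]
  # Innermost namespace, or "" when the name is unqualified.
  ns <- if (length(parts) >= 2) parts[length(parts) - 1] else ""
  # Assumption: only the csv and json TableReader classes collide, so only
  # they keep their namespace as a TitleCase prefix.
  ambiguous <- c("csv", "json")
  if (class_name == "TableReader" && ns %in% ambiguous) {
    prefix <- paste0(toupper(substring(ns, 1, 1)), substring(ns, 2))
    return(paste0(prefix, class_name))
  }
  # Everywhere else the namespaces are simply dropped.
  class_name
}

flatten_arrow_name("arrow::io::FileOutputStream")  # "FileOutputStream"
flatten_arrow_name("arrow::csv::TableReader")      # "CsvTableReader"
flatten_arrow_name("arrow::json::TableReader")     # "JsonTableReader"
```

The same mapping is visible in the `r/NAMESPACE` diff further down, where S3 class names such as `"arrow::io::InputStream"` become plain `InputStream`.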
1 parent 0fbaff6 commit 9dec79b


156 files changed: +2510 -3169 lines changed

dev/release/rat_exclude_files.txt

Lines changed: 1 addition & 0 deletions
@@ -213,6 +213,7 @@ r/README.md
 r/README.Rmd
 r/man/*.Rd
 r/cran-comments.md
+r/vignettes/*.Rmd
 .gitattributes
 ruby/red-arrow/.yardopts
 rust/arrow/test/data/*.csv

r/DESCRIPTION

Lines changed: 17 additions & 15 deletions
@@ -36,10 +36,12 @@ Imports:
 utils
 Roxygen: list(markdown = TRUE)
 RoxygenNote: 6.1.1
+VignetteBuilder: knitr
 Suggests:
 covr,
 fs,
 hms,
+knitr,
 lubridate,
 rmarkdown,
 testthat,
@@ -49,33 +51,33 @@ Collate:
 'enums.R'
 'arrow-package.R'
 'type.R'
-'ArrayData.R'
-'ChunkedArray.R'
-'Column.R'
-'Field.R'
-'List.R'
-'RecordBatch.R'
-'RecordBatchReader.R'
-'RecordBatchWriter.R'
-'Schema.R'
-'Struct.R'
-'Table.R'
+'array-data.R'
 'array.R'
 'arrowExports.R'
 'buffer.R'
+'chunked-array.R'
 'io.R'
 'compression.R'
 'compute.R'
 'csv.R'
 'dictionary.R'
 'feather.R'
+'field.R'
 'install-arrow.R'
 'json.R'
-'memory_pool.R'
+'list.R'
+'memory-pool.R'
 'message.R'
 'parquet.R'
-'read_record_batch.R'
-'read_table.R'
+'read-record-batch.R'
+'read-table.R'
+'record-batch-reader.R'
+'record-batch-writer.R'
+'record-batch.R'
 'reexports-bit64.R'
 'reexports-tidyselect.R'
-'write_arrow.R'
+'schema.R'
+'struct.R'
+'table.R'
+'util.R'
+'write-arrow.R'

r/NAMESPACE

Lines changed: 38 additions & 82 deletions
@@ -1,114 +1,74 @@
 # Generated by roxygen2: do not edit by hand
 
-S3method("!=","arrow::Object")
-S3method("==","arrow::Array")
-S3method("==","arrow::DataType")
-S3method("==","arrow::Field")
-S3method("==","arrow::RecordBatch")
-S3method("==","arrow::Schema")
-S3method("==","arrow::ipc::Message")
-S3method(BufferReader,"arrow::Buffer")
-S3method(BufferReader,default)
-S3method(CompressedInputStream,"arrow::io::InputStream")
-S3method(CompressedInputStream,character)
-S3method(CompressedOutputStream,"arrow::io::OutputStream")
-S3method(CompressedOutputStream,character)
-S3method(FeatherTableReader,"arrow::io::RandomAccessFile")
-S3method(FeatherTableReader,"arrow::ipc::feather::TableReader")
-S3method(FeatherTableReader,character)
-S3method(FeatherTableReader,raw)
-S3method(FeatherTableWriter,"arrow::io::OutputStream")
-S3method(FixedSizeBufferWriter,"arrow::Buffer")
-S3method(FixedSizeBufferWriter,default)
-S3method(MessageReader,"arrow::io::InputStream")
-S3method(MessageReader,default)
-S3method(RecordBatchFileReader,"arrow::Buffer")
-S3method(RecordBatchFileReader,"arrow::io::RandomAccessFile")
-S3method(RecordBatchFileReader,character)
-S3method(RecordBatchFileReader,raw)
-S3method(RecordBatchFileWriter,"arrow::io::OutputStream")
-S3method(RecordBatchFileWriter,character)
-S3method(RecordBatchStreamReader,"arrow::Buffer")
-S3method(RecordBatchStreamReader,"arrow::io::InputStream")
-S3method(RecordBatchStreamReader,raw)
-S3method(RecordBatchStreamWriter,"arrow::io::OutputStream")
-S3method(RecordBatchStreamWriter,character)
-S3method(as.data.frame,"arrow::RecordBatch")
-S3method(as.data.frame,"arrow::Table")
-S3method(as.raw,"arrow::Buffer")
-S3method(buffer,"arrow::Buffer")
-S3method(buffer,complex)
-S3method(buffer,default)
-S3method(buffer,integer)
-S3method(buffer,numeric)
-S3method(buffer,raw)
-S3method(csv_table_reader,"arrow::csv::TableReader")
-S3method(csv_table_reader,"arrow::io::InputStream")
-S3method(csv_table_reader,character)
-S3method(csv_table_reader,default)
-S3method(dim,"arrow::RecordBatch")
-S3method(dim,"arrow::Table")
-S3method(json_table_reader,"arrow::io::InputStream")
-S3method(json_table_reader,"arrow::json::TableReader")
-S3method(json_table_reader,character)
-S3method(json_table_reader,default)
-S3method(length,"arrow::Array")
-S3method(names,"arrow::RecordBatch")
-S3method(parquet_file_reader,"arrow::io::RandomAccessFile")
-S3method(parquet_file_reader,character)
-S3method(parquet_file_reader,raw)
+S3method("!=",Object)
+S3method("==",Array)
+S3method("==",DataType)
+S3method("==",Field)
+S3method("==",Message)
+S3method("==",RecordBatch)
+S3method("==",Schema)
+S3method(as.data.frame,RecordBatch)
+S3method(as.data.frame,Table)
+S3method(as.raw,Buffer)
+S3method(dim,RecordBatch)
+S3method(dim,Table)
+S3method(length,Array)
+S3method(names,RecordBatch)
 S3method(print,"arrow-enum")
-S3method(read_message,"arrow::io::InputStream")
-S3method(read_message,"arrow::ipc::MessageReader")
+S3method(read_message,InputStream)
+S3method(read_message,MessageReader)
 S3method(read_message,default)
-S3method(read_record_batch,"arrow::Buffer")
-S3method(read_record_batch,"arrow::io::InputStream")
-S3method(read_record_batch,"arrow::ipc::Message")
+S3method(read_record_batch,Buffer)
+S3method(read_record_batch,InputStream)
+S3method(read_record_batch,Message)
 S3method(read_record_batch,raw)
-S3method(read_schema,"arrow::Buffer")
-S3method(read_schema,"arrow::io::InputStream")
-S3method(read_schema,"arrow::ipc::Message")
+S3method(read_schema,Buffer)
+S3method(read_schema,InputStream)
+S3method(read_schema,Message)
 S3method(read_schema,raw)
-S3method(read_table,"arrow::ipc::RecordBatchFileReader")
-S3method(read_table,"arrow::ipc::RecordBatchStreamReader")
+S3method(read_table,RecordBatchFileReader)
+S3method(read_table,RecordBatchStreamReader)
 S3method(read_table,character)
 S3method(read_table,raw)
-S3method(type,"arrow::Array")
-S3method(type,"arrow::ChunkedArray")
-S3method(type,"arrow::Column")
+S3method(type,Array)
+S3method(type,ChunkedArray)
+S3method(type,Column)
 S3method(type,default)
-S3method(write_arrow,"arrow::ipc::RecordBatchWriter")
+S3method(write_arrow,RecordBatchWriter)
 S3method(write_arrow,character)
 S3method(write_arrow,raw)
-S3method(write_feather,"arrow::RecordBatch")
-S3method(write_feather,data.frame)
-S3method(write_feather,default)
-S3method(write_feather_RecordBatch,"arrow::io::OutputStream")
-S3method(write_feather_RecordBatch,character)
-S3method(write_feather_RecordBatch,default)
+export(Array)
+export(Buffer)
 export(BufferOutputStream)
 export(BufferReader)
+export(ChunkedArray)
 export(CompressedInputStream)
 export(CompressedOutputStream)
 export(CompressionType)
 export(DateUnit)
 export(FeatherTableReader)
 export(FeatherTableWriter)
+export(Field)
 export(FileMode)
 export(FileOutputStream)
 export(FixedSizeBufferWriter)
+export(MemoryMappedFile)
 export(MessageReader)
 export(MessageType)
 export(MockOutputStream)
+export(ParquetFileReader)
+export(ParquetReaderProperties)
+export(RandomAccessFile)
 export(ReadableFile)
 export(RecordBatchFileReader)
 export(RecordBatchFileWriter)
 export(RecordBatchStreamReader)
 export(RecordBatchStreamWriter)
+export(Schema)
 export(StatusCode)
+export(Table)
 export(TimeUnit)
 export(Type)
-export(array)
 export(arrow_available)
 export(bool)
 export(boolean)
@@ -150,8 +110,6 @@ export(mmap_open)
 export(null)
 export(num_range)
 export(one_of)
-export(parquet_arrow_reader_properties)
-export(parquet_file_reader)
 export(read_arrow)
 export(read_csv_arrow)
 export(read_delim_arrow)
@@ -168,7 +126,6 @@ export(schema)
 export(starts_with)
 export(string)
 export(struct)
-export(table)
 export(time32)
 export(time64)
 export(timestamp)
@@ -180,7 +137,6 @@ export(uint8)
 export(utf8)
 export(write_arrow)
 export(write_feather)
-export(write_feather_RecordBatch)
 export(write_parquet)
 importFrom(R6,R6Class)
 importFrom(Rcpp,sourceCpp)

r/R/RecordBatchReader.R

Lines changed: 0 additions & 138 deletions
This file was deleted.
