Commit 49101bd

Update docs
1 parent f05f2cd commit 49101bd

File tree

20 files changed

+286
-210
lines changed

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -245,3 +245,4 @@ ModelManifest.xml
 /pysnptools/util/filecache/localcache1/sub1
 /pysnptools/util/filecache/peertopeer1
 /pysnptools/util/filecache/tempdir
+/pysnptools/snpreader/localcache1

README.md

Lines changed: 6 additions & 3 deletions
@@ -13,7 +13,10 @@ util: In one line, intersect and re-order IIDs from snpreader and other sources.

 util.IntRangeSet: Efficiently manipulate ranges of integers - for example, genetic position - with set operators including union, intersection, and set difference.

-util.pheno: Read the PLINK pheno type file format.
+util.mapreduce1: Run loops locally, on multiple processors, or on any cluster.
+
+util.filecache: Read and write files locally or from/to any remote storage.
+

 Find the PySnpTools documentation (including links to tutorial slides, notebooks, and video):
 http://microsoftgenomics.github.io/PySnpTools/
@@ -53,7 +56,7 @@ Packages:
 We highly recommend using a python distribution such as
 Anaconda (https://store.continuum.io/cshop/anaconda/)
 or Enthought (https://www.enthought.com/products/epd/free/).
-Both these distributions can be used on linux and Windows, are free
+Both these distributions can be used on Linux and Windows, are free
 for non-commercial use, and optionally include an MKL-compiled distribution
 for optimal speed. This is the easiest way to get all the required package
 dependencies.
@@ -64,7 +67,7 @@ dependencies.

 Go to the directory where you copied the source code for fastlmm.

-On linux:
+On Linux:

 At the shell, type:

doc/build/html/index.html

Lines changed: 140 additions & 111 deletions
Large diffs are not rendered by default.

doc/build/html/searchindex.js

Lines changed: 1 addition & 1 deletion
Some generated files are not rendered by default.

doc/source/index.rst

Lines changed: 9 additions & 4 deletions
@@ -4,11 +4,13 @@

 PySnpTools: A library for reading and manipulating genetic data.

-:synopsis:
+:Synopsis:

 * :mod:`.snpreader`: Efficiently read genetic PLINK formats including \*.bed/bim/fam and phenotype files. Also, efficiently read *parts* of files and standardize data.

-* :class:`.snpreader.SnpGen`: Generate synthetic SNP data on the fly.
+:new: :class:`.snpreader.SnpGen`: Generate synthetic SNP data on the fly.
+:new: :class:`.snpreader.SnpMemMap`: Support larger in-memory data via on-disk memory mapping.
+:new: :class:`.snpreader.DistributedBed`: Split :class:`.Bed`-like data into multiple files for more efficient cluster use.

 * :mod:`.kernelreader`: Efficiently create, read, and manipulate kernel data.

@@ -23,9 +25,12 @@ PySnpTools: A library for reading and manipulating genetic data.
 * :class:`.util.IntRangeSet`: Efficiently manipulate ranges of integers -- for example, genetic position -- with set operators including
 union, intersection, and set difference.

-* :mod:`.util.mapreduce1`: Run in parallel on multiple processes, threads, or clusters.
+:new:
+
+* :mod:`.util.mapreduce1`: Run loops in parallel on multiple processes, threads, or clusters.
+
+* :mod:`.util.filecache`: Automatically copy files to and from any remote storage.

-* :mod:`.util.filecache`: Copy files to and from any remote storage.

 :Tutorial:

pysnptools/kernelreader/kerneldata.py

Lines changed: 5 additions & 5 deletions
@@ -100,11 +100,11 @@ def __init__(self, iid=None, iid0=None, iid1=None, val=None, name=None, parent_s
         (0.5, 2)
         """

-    def allclose(a,b,equal_nan=True):
+    def allclose(self, value,equal_nan=True):
         '''
-        :param b: Other object with which to compare.
-        :type b: :class:`KernelData`
-        :param equal_nan: (Default: True) Tells if NaN in .val should be treated as regular values when testing equality.
+        :param value: Other object with which to compare.
+        :type value: :class:`KernelData`
+        :param equal_nan: (Default: True) Tells if NaN in :attr:`.KernelData.val` should be treated as regular values when testing equality.
         :type equal_nan: bool

         >>> import numpy as np
@@ -116,7 +116,7 @@ def allclose(a,b,equal_nan=True):
         False

         '''
-        return PstData.allclose(a,b,equal_nan=equal_nan)
+        return PstData.allclose(self,value,equal_nan=equal_nan)


     #!! SnpData.standardize() changes the str to help show that the data has been standardized. Should this do that too?

pysnptools/pstreader/pstdata.py

Lines changed: 9 additions & 9 deletions
@@ -88,11 +88,11 @@ def __init__(self, row, col, val, row_property=None, col_property=None, name=Non
     def __eq__(a,b):
         return a.allclose(b,equal_nan=False)

-    def allclose(a,b,equal_nan=True):
+    def allclose(self,value,equal_nan=True):
         '''
-        :param b: Other object with which to compare.
-        :type b: :class:`PstData`
-        :param equal_nan: (Default: True) Tells if NaN in .val should be treated as regular values when testing equality.
+        :param value: Other object with which to compare.
+        :type value: :class:`PstData`
+        :param equal_nan: (Default: True) Tells if NaN in :attr:`.PstData.val` should be treated as regular values when testing equality.
         :type equal_nan: bool

         >>> import numpy as np
@@ -105,11 +105,11 @@ def allclose(a,b,equal_nan=True):

         '''
         try:
-            return (np.array_equal(a.row,b.row) and
-                    np.array_equal(a.col,b.col) and
-                    np.array_equal(a.row_property,b.row_property) and
-                    np.array_equal(a.col_property,b.col_property) and
-                    np.allclose(a.val,b.val,equal_nan=equal_nan))
+            return (np.array_equal(self.row,value.row) and
+                    np.array_equal(self.col,value.col) and
+                    np.array_equal(self.row_property,value.row_property) and
+                    np.array_equal(self.col_property,value.col_property) and
+                    np.allclose(self.val,value.val,equal_nan=equal_nan))
         except:
             return False
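
The NaN handling described in the `allclose` docstrings can be illustrated with a minimal sketch. `TinyData` is a hypothetical stand-in, not part of PySnpTools; it models only `row`, `col`, and `val`, and mirrors the comparison logic visible in the `pstdata.py` hunk (label arrays compared with `np.array_equal`, values compared with `np.allclose`).

```python
import numpy as np


class TinyData:
    """Hypothetical stand-in for PstData; only row, col, and val are modeled."""

    def __init__(self, row, col, val):
        self.row = np.array(row)
        self.col = np.array(col)
        self.val = np.array(val, dtype=float)

    def allclose(self, value, equal_nan=True):
        # Labels must match exactly; values must be close.
        # With equal_nan=True, NaN in the same position counts as equal.
        try:
            return bool(np.array_equal(self.row, value.row)
                        and np.array_equal(self.col, value.col)
                        and np.allclose(self.val, value.val, equal_nan=equal_nan))
        except Exception:
            # Anything that cannot be compared is treated as "not close".
            return False

    def __eq__(self, other):
        # Equality is the strict form: NaN never equals NaN.
        return self.allclose(other, equal_nan=False)


a = TinyData(["r0", "r1"], ["c0"], [[1.0], [np.nan]])
b = TinyData(["r0", "r1"], ["c0"], [[1.0], [np.nan]])
c = TinyData(["r0", "rX"], ["c0"], [[1.0], [np.nan]])

same_with_nan = a.allclose(b)      # True: NaN treated as a regular value
same_strict = (a == b)             # False: NaN breaks strict equality
same_labels = a.allclose(c)        # False: row labels differ
```

This is why the docstrings point to `allclose` when the data may contain missing values: `==` would report two otherwise identical objects as different.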

pysnptools/pysnptools.pyproj

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@
 <SchemaVersion>2.0</SchemaVersion>
 <ProjectGuid>{60f1cb4a-9da3-47c2-9b89-60ac1ce93347}</ProjectGuid>
 <ProjectHome />
-<StartupFile>util\filecache\test.py</StartupFile>
+<StartupFile>util\filecache\filecache.py</StartupFile>
 <SearchPath>..\;..\..\PySnpTools;..\..\PySnpTools</SearchPath>
 <WorkingDirectory>.\snpreader</WorkingDirectory>
 <OutputPath>.</OutputPath>

pysnptools/snpreader/distributedbed.py

Lines changed: 15 additions & 15 deletions
@@ -13,14 +13,13 @@

 class DistributedBed(SnpReader):
     '''
-    A class that implements the :class:`SnpReader` interface. It stores :class:`.Bed`-like data in pieces on storage. When requested, it retrieves requested parts of the data.
+    A class that implements the :class:`SnpReader` interface. It stores :class:`.Bed`-like data in pieces on storage. When you request data, it retrieves only the needed pieces.

     **Constructor:**
-    :Parameters: * **storage** (:class:`.FileCache` or string) -- Tells where the SNP data was stored. A :class:`.FileCache` instance can be given, which provides a
-    method to specify cluster-distributed storage. (:class:`.FileCache`'s will **not** be automatically erased and must be user managed.)
-    A string can be given and will be interpreted as the path of a local directory to use for storage. (The local
-    directory will **not** be automatically erased and so must be user managed.)
-    :type storage: :class:`.FileCache` or string.
+    :Parameters: **storage** (string or :class:`.FileCache`) -- Tells where the DistributedBed data is stored.
+    A string can be given and will be interpreted as the path to a directory.
+    A :class:`.FileCache` instance can be given, which provides a method to specify cluster-distributed storage.
+    :type storage: string or :class:`.FileCache`

     '''
     def __init__(self, storage):
@@ -95,25 +94,26 @@ def write(storage, snpreader, piece_per_chrom_count, updater=None, runner=None):
         If some of the contents already exists in storage, it skips uploading that part of the contents. (To avoid this behavior,
         clear the storage.)

-        :param storage: Tells where to upload SNP data.
+        :param storage: Tells where to store SNP data.
+        A string can be given and will be interpreted as the path of a local directory to use for storage. (The local
+        directory will **not** be automatically erased and so must be user managed.)
         A :class:`.FileCache` instance can be given, which provides a
         method to specify cluster-distributed storage. (:class:`.FileCache`'s will **not** be automatically erased and must be user managed.)
         If `None`, the storage will be in an automatically-erasing temporary directory. (If the TEMP environment variable is set, Python places the temp directory under it.)
-        A string can be given and will be interpreted as the path of a local directory to use for storage. (The local
-        directory will **not** be automatically erased and so must be user managed.)
-        :type storage: :class:`.FileCache` or None or string.
+
+        :type storage: string or :class:`.FileCache` or None.

-        :param snpreader: A :class:`.Bed` or other :class:`.SnpReader` that with values of 0,1,2, or missing.
-        (Note that this differs from most other `write`methods that take a :class:`.SnpData`)
+        :param snpreader: A :class:`.Bed` or other :class:`.SnpReader` with values of 0,1,2, or missing.
+        (Note that this differs from most other `write` methods that take a :class:`.SnpData`)
         :type snpreader: :class:`.SnpReader`

         :param piece_per_chrom_count: The number of pieces in which to store the data from each chromosome. Data is split across
-        SNPs. If `piece_per_chrom_count` is set to 100 and 22 chromosomes are uploaded, then data will be stored in 2200 pieces. Later, when data is requested
+        SNPs. For example, if `piece_per_chrom_count` is set to 100 and 22 chromosomes are uploaded, then data will be stored in 2200 pieces. Later, when data is requested
         only the pieces necessary for the request will be downloaded to local storage.
         :type piece_per_chrom_count: A number

-        :param updater: A single argument function to write logging message to.
-        :type updater: A lambda such as created by :func:`.log_in_place`
+        :param updater: A single argument function to write logging message to, for example, the function created by :func:`.log_in_place`.
+        :type updater: A function or lambda

         :param runner: a :class:`.Runner`, optional: Tells how to run.
         (Note that :class:`.Local` and :class:`.LocalMultProc` are good options.)
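
The arithmetic in the `piece_per_chrom_count` description (100 pieces per chromosome times 22 chromosomes gives 2200 pieces) can be checked with a short sketch. The even split via `numpy.array_split` is an assumption made for illustration; the diff does not show how `DistributedBed.write` actually partitions SNPs, and `plan_pieces` is a hypothetical helper, not a PySnpTools function.

```python
import numpy as np


def plan_pieces(sid_count_by_chrom, piece_per_chrom_count):
    """Return (chrom, piece_index, sid_indexes) triples, one per piece.

    Hypothetical helper: splits each chromosome's SNP indexes into
    piece_per_chrom_count nearly equal chunks.
    """
    pieces = []
    for chrom, sid_count in sid_count_by_chrom.items():
        chunks = np.array_split(np.arange(sid_count), piece_per_chrom_count)
        for piece_index, sid_indexes in enumerate(chunks):
            pieces.append((chrom, piece_index, sid_indexes))
    return pieces


# Assumed toy data: 22 chromosomes with 1000 SNPs each.
sid_count_by_chrom = {chrom: 1000 for chrom in range(1, 23)}
pieces = plan_pieces(sid_count_by_chrom, piece_per_chrom_count=100)
piece_count = len(pieces)  # 22 * 100 == 2200
```

Under this assumed layout, a request that touches SNPs on a single chromosome needs at most 100 of the 2200 pieces, which is the motivation the docstring gives for downloading only the necessary pieces.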

pysnptools/snpreader/snpdata.py

Lines changed: 7 additions & 7 deletions
@@ -33,8 +33,8 @@ class SnpData(PstData,SnpReader):

     **Equality:**

-    Two SnpData objects are equal if their four arrays (:attr:`.SnpData.val`, :attr:`SnpReader.iid`, :attr:`.SnpReader.sid`, and :attr:`.SnpReader.pos_property`) are 'array_equal'.
-    (Their 'name' does not need to be the same).
+    Two SnpData objects are equal if their four arrays (:attr:`.SnpData.val`, :attr:`SnpReader.iid`, :attr:`.SnpReader.sid`, and :attr:`.SnpReader.pos`)
+    are 'array_equal'. (Their 'name' does not need to be the same).
     If either :attr:`.SnpData.val` contains NaN, the objects will not be equal. However, :meth:`.SnpData.allclose` can be used to treat NaN as
     regular values.
@@ -88,11 +88,11 @@ def __init__(self, iid, sid, val, pos=None, name=None, parent_string=None, copyi
         2.0
         """

-    def allclose(a,b,equal_nan=True):
+    def allclose(self,value,equal_nan=True):
         '''
-        :param b: Other object with which to compare.
-        :type b: :class:`SnpData`
-        :param equal_nan: (Default: True) Tells if NaN in .val should be treated as regular values when testing equality.
+        :param value: Other object with which to compare.
+        :type value: :class:`SnpData`
+        :param equal_nan: (Default: True) Tells if NaN in :attr:`.SnpData.val` should be treated as regular values when testing equality.
         :type equal_nan: bool

         >>> import numpy as np
@@ -104,7 +104,7 @@ def allclose(a,b,equal_nan=True):
         False

         '''
-        return PstData.allclose(a,b,equal_nan=equal_nan)
+        return PstData.allclose(self,value,equal_nan=equal_nan)

     def train_standardizer(self, apply_in_place, standardizer=Unit(), force_python_only=False):
         """
