DOC: move data reader docs to Remote Data Access top-level section

jreback · jreback · commit 689d4917e741 · 2013-07-25T14:47:01.000-04:00
diff --git a/doc/source/index.rst b/doc/source/index.rst
@@ -125,6 +125,7 @@ See the package overview for more detail about what's in the library.
     visualization
     rplot
     io
+    remote_data
     enhancingperf
     sparse
     gotchas
diff --git a/doc/source/io.rst b/doc/source/io.rst
@@ -2523,184 +2523,3 @@ Alternatively, the function :func:`~pandas.io.stata.read_stata` can be used
 
    import os
    os.remove('stata.dta')
-
-Data Reader
------------
-
-.. _io.data_reader:
-
-Functions from :mod:`pandas.io.data` extract data from various Internet
-sources into a DataFrame. Currently the following sources are supported:
-
-    - Yahoo! Finance
-    - Google Finance
-    - St. Louis FED (FRED)
-    - Kenneth French's data library
-
-Note that the sources support different kinds of data, so not all sources implement the same methods, and the data elements returned may differ.
-
-Yahoo! Finance
-~~~~~~~~~~~~~~
-
-.. ipython:: python
-
-    import pandas.io.data as web
-    start = datetime.datetime(2010, 1, 1)
-    end = datetime.datetime(2013, 1, 27)
-    f = web.DataReader("F", 'yahoo', start, end)
-    f.ix['2010-01-04']
-
-Google Finance
-~~~~~~~~~~~~~~
-
-.. ipython:: python
-
-    import pandas.io.data as web
-    start = datetime.datetime(2010, 1, 1)
-    end = datetime.datetime(2013, 1, 27)
-    f = web.DataReader("F", 'google', start, end)
-    f.ix['2010-01-04']
-
-FRED
-~~~~
-
-.. ipython:: python
-
-    import pandas.io.data as web
-    start = datetime.datetime(2010, 1, 1)
-    end = datetime.datetime(2013, 1, 27)
-    gdp = web.DataReader("GDP", "fred", start, end)
-    gdp.ix['2013-01-01']
-
-
-Fama/French
-~~~~~~~~~~~
-
-The dataset names are listed at the `Fama/French Data Library
-<http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html>`_.
-
-.. ipython:: python
-
-    import pandas.io.data as web
-    ip = web.DataReader("5_Industry_Portfolios", "famafrench")
-    ip[4].ix[192607]
-
-
-World Bank panel data in Pandas
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-``Pandas`` users can easily access thousands of panel data series from the
-`World Bank's World Development Indicators <http://data.worldbank.org>`_ 
-by using the ``wb`` I/O functions.
-
-For example, if you wanted to compare the Gross Domestic Products per capita in
-constant dollars in North America, you would use the ``search`` function:
-
-.. code:: python
-
-    In [1]: from pandas.io.wb import search, download
-
-    In [2]: search('gdp.*capita.*const').iloc[:,:2]
-    Out[2]: 
-                         id                                               name
-    3242            GDPPCKD             GDP per Capita, constant US$, millions
-    5143     NY.GDP.PCAP.KD                 GDP per capita (constant 2005 US$)
-    5145     NY.GDP.PCAP.KN                      GDP per capita (constant LCU)
-    5147  NY.GDP.PCAP.PP.KD  GDP per capita, PPP (constant 2005 internation...
-
-Then you would use the ``download`` function to acquire the data from the World
-Bank's servers:
-
-.. code:: python
-
-    In [3]: dat = download(indicator='NY.GDP.PCAP.KD', country=['US', 'CA', 'MX'], start=2005, end=2008)
-
-    In [4]: print dat
-                          NY.GDP.PCAP.KD
-    country       year                  
-    Canada        2008  36005.5004978584
-                  2007  36182.9138439757
-                  2006  35785.9698172849
-                  2005  35087.8925933298
-    Mexico        2008  8113.10219480083
-                  2007  8119.21298908649
-                  2006  7961.96818458178
-                  2005  7666.69796097264
-    United States 2008  43069.5819857208
-                  2007  43635.5852068142
-                  2006   43228.111147107
-                  2005  42516.3934699993
-
-The resulting dataset is a properly formatted ``DataFrame`` with a hierarchical
-index, so it is easy to apply ``.groupby`` transformations to it:
-
-.. code:: python
-
-    In [6]: dat['NY.GDP.PCAP.KD'].groupby(level=0).mean()
-    Out[6]: 
-    country
-    Canada           35765.569188
-    Mexico            7965.245332
-    United States    43112.417952
-    dtype: float64
-
-Now imagine you want to compare GDP to the share of people with cellphone
-contracts around the world. 
-
-.. code:: python
-
-    In [7]: search('cell.*%').iloc[:,:2]
-    Out[7]: 
-                         id                                               name
-    3990  IT.CEL.SETS.FE.ZS  Mobile cellular telephone users, female (% of ...
-    3991  IT.CEL.SETS.MA.ZS  Mobile cellular telephone users, male (% of po...
-    4027      IT.MOB.COV.ZS  Population coverage of mobile cellular telepho...
-
-Notice that this second search was much faster than the first one because
-``Pandas`` now has a cached list of available data series. 
-
-.. code:: python
-
-    In [13]: ind = ['NY.GDP.PCAP.KD', 'IT.MOB.COV.ZS']
-    In [14]: dat = download(indicator=ind, country='all', start=2011, end=2011).dropna()
-    In [15]: dat.columns = ['gdp', 'cellphone']
-    In [16]: print dat.tail()
-                            gdp  cellphone
-    country   year                        
-    Swaziland 2011  2413.952853       94.9
-    Tunisia   2011  3687.340170      100.0
-    Uganda    2011   405.332501      100.0
-    Zambia    2011   767.911290       62.0
-    Zimbabwe  2011   419.236086       72.4
-
-Finally, we use the ``statsmodels`` package to assess the relationship between
-our two variables using ordinary least squares regression. Unsurprisingly,
-populations in rich countries tend to use cellphones at a higher rate:
-
-.. code:: python
-
-    In [17]: import numpy as np
-    In [18]: import statsmodels.formula.api as smf
-    In [19]: mod = smf.ols("cellphone ~ np.log(gdp)", dat).fit()
-    In [20]: print mod.summary()
-                                OLS Regression Results                            
-    ==============================================================================
-    Dep. Variable:              cellphone   R-squared:                       0.297
-    Model:                            OLS   Adj. R-squared:                  0.274
-    Method:                 Least Squares   F-statistic:                     13.08
-    Date:                Thu, 25 Jul 2013   Prob (F-statistic):            0.00105
-    Time:                        15:24:42   Log-Likelihood:                -139.16
-    No. Observations:                  33   AIC:                             282.3
-    Df Residuals:                      31   BIC:                             285.3
-    Df Model:                           1                                         
-    ===============================================================================
-                      coef    std err          t      P>|t|      [95.0% Conf. Int.]
-    -------------------------------------------------------------------------------
-    Intercept      16.5110     19.071      0.866      0.393       -22.384    55.406
-    np.log(gdp)     9.9333      2.747      3.616      0.001         4.331    15.535
-    ==============================================================================
-    Omnibus:                       36.054   Durbin-Watson:                   2.071
-    Prob(Omnibus):                  0.000   Jarque-Bera (JB):              119.133
-    Skew:                          -2.314   Prob(JB):                     1.35e-26
-    Kurtosis:                      11.077   Cond. No.                         45.8
-    ==============================================================================
diff --git a/doc/source/remote_data.rst b/doc/source/remote_data.rst