Commit be06855

WIP: changelog migration
1 parent 32356e3 commit be06855

2 files changed

+383
-8
lines changed

CHANGELOG.md

Lines changed: 374 additions & 8 deletions
Original file line number | Diff line number | Diff line change
@@ -1,21 +1,387 @@
1 1
# Changelog
2 2

3-
[PyPI History][1]
3+
## 0.15.0 / 2021-03-30
4 4

5-
[1]: https://pypi.org/project/DISTRIBUTION NAME/#history
5+
### Features
6 6

7-
## 0.16.0
7+
- Load DataFrame with `to_gbq` to a table in a project different from
8+
the API client project. Specify the target table ID as
9+
`project.dataset.table` to use this feature. (`321`, `347`)
10+
- Allow billing project to be separate from destination table project
11+
in `to_gbq`. (`321`)
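
A minimal sketch of how the billing project and the destination project might be kept apart in one call. The helper `split_table_id`, the wrapper `upload_to_other_project`, and every project, dataset, and table name are hypothetical and exist only for illustration. The wrapper assumes `pandas_gbq.to_gbq` takes the destination table ID as its second argument and the billing project as `project_id`.

```python
def split_table_id(table_id, default_project):
    """Split 'project.dataset.table' or 'dataset.table' into three parts.

    Hypothetical helper: it only illustrates the ID format and is not
    part of pandas-gbq.
    """
    parts = table_id.split(".")
    if len(parts) == 3:
        project, dataset, table = parts
    elif len(parts) == 2:
        # No project prefix: fall back to the billing (client) project.
        project = default_project
        dataset, table = parts
    else:
        raise ValueError(f"expected [project.]dataset.table, got {table_id!r}")
    return project, dataset, table


def upload_to_other_project(df, table_id, billing_project):
    """Load df into table_id while billing the job to billing_project."""
    # Imported lazily so split_table_id works without pandas-gbq installed.
    import pandas_gbq

    pandas_gbq.to_gbq(df, table_id, project_id=billing_project)
```

With these assumptions, `upload_to_other_project(df, "data-project.sales.orders", "billing-project")` would write to a table in `data-project` while the load job runs in `billing-project`.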
8 12

9-
07-16-2021 14:47 PDT
13+
### Bug fixes
10 14

15+
- Avoid 403 error from `to_gbq` when table has `policyTags`. (`354`)
16+
- Avoid `client.dataset` deprecation warnings. (`312`)
11 17

12-
### Implementation Changes
18+
### Dependencies
13 19

14-
### New Features
20+
- Drop support for Python 3.5 and 3.6. (`337`)
21+
- Drop support for
22+
`google-cloud-bigquery==2.4.*` due to query
23+
hanging bug. (`343`)
15 24

16-
### Dependencies
25+
## 0.14.1 / 2020-11-10
26+
27+
### Bug fixes
28+
29+
- Use `object` dtype for `TIME` columns. (`328`)
30+
- Encode floating point values with greater precision. (`326`)
31+
- Support `INT64` and other standard SQL aliases in
32+
`pandas_gbq.to_gbq` `table_schema` argument. (`322`)
33+
34+
## 0.14.0 / 2020-10-05
35+
36+
- Add `dtypes` argument to `read_gbq`. Use this argument to override
37+
the default `dtype` for a particular column in the query results.
38+
For example, this can be used to select nullable integer columns as
39+
the `Int64` nullable integer pandas extension type. (`242`, `332`)
40+
41+
``` python
42+
df = gbq.read_gbq(
43+
    "SELECT CAST(NULL AS INT64) AS null_integer",
44+
    dtypes={"null_integer": "Int64"},
45+
)
46+
```
47+
48+
### Dependency updates
49+
50+
- Support `google-cloud-bigquery-storage` 2.0 and higher. (`329`)
51+
- Update the minimum version of `pandas` to 0.20.1. (`331`)
52+
53+
### Internal changes
54+
55+
- Update tests to run against Python 3.8. (`331`)
56+
57+
## 0.13.3 / 2020-09-30
58+
59+
- Include needed "extras" from `google-cloud-bigquery` package as
60+
dependencies. Exclude incompatible 2.0 version. (`324`, `329`)
61+
62+
## 0.13.2 / 2020-05-14
63+
64+
- Fix `Provided Schema does not match Table` error when the existing
65+
table contains required fields. (`315`)
66+
67+
## 0.13.1 / 2020-02-13
68+
69+
- Fix `AttributeError` with BQ Storage API to download empty results.
70+
(`299`)
71+
72+
## 0.13.0 / 2019-12-12
73+
74+
- Raise `NotImplementedError` when the deprecated `private_key`
75+
argument is used. (`301`)
76+
77+
## 0.12.0 / 2019-11-25
78+
79+
### New features
80+
81+
- Add `max_results` argument to `pandas_gbq.read_gbq()`. Use this
82+
argument to limit the number of rows in the results DataFrame. Set
83+
`max_results` to 0 to ignore query outputs, such as for DML or DDL
84+
queries. (`102`)
85+
- Add `progress_bar_type` argument to `pandas_gbq.read_gbq()`. Use
86+
this argument to display a progress bar when downloading data.
87+
(`182`)
88+
89+
### Bug fixes
90+
91+
- Fix resource leak with `use_bqstorage_api` by closing BigQuery
92+
Storage API client after use. (`294`)
93+
94+
### Dependency updates
95+
96+
- Update the minimum version of `google-cloud-bigquery` to 1.11.1.
97+
(`296`)
98+
99+
### Documentation
100+
101+
- Add code samples to introduction and refactor howto guides. (`239`)
102+
103+
## 0.11.0 / 2019-07-29
104+
105+
- **Breaking Change:** Python 2 support has been dropped. This is to
106+
align with the pandas package which dropped Python 2 support at the
107+
end of 2019. (`268`)
108+
109+
### Enhancements
110+
111+
- Ensure `table_schema` argument is not modified inplace. (`278`)
112+
113+
### Implementation changes
114+
115+
- Use object dtype for `STRING`, `ARRAY`, and `STRUCT` columns when
116+
there are zero rows. (`285`)
117+
118+
### Internal changes
119+
120+
- Populate `user-agent` with `pandas` version information. (`281`)
121+
- Fix `pytest.raises` usage for latest pytest. Fix warnings in tests.
122+
(`282`)
123+
- Update CI to install nightly packages in the conda tests. (`254`)
124+
125+
## 0.10.0 / 2019-04-05
126+
127+
- **Breaking Change:** Default SQL dialect is now `standard`. Use
128+
`pandas_gbq.context.dialect` to override the default value. (`195`,
129+
`245`)
130+
131+
### Documentation
132+
133+
- Document `BigQuery data type to pandas dtype conversion
134+
<reading-dtypes>` for `read_gbq`. (`269`)
135+
136+
### Dependency updates
137+
138+
- Update the minimum version of `google-cloud-bigquery` to 1.9.0.
139+
(`247`)
140+
- Update the minimum version of `pandas` to 0.19.0. (`262`)
141+
142+
### Internal changes
143+
144+
- Update the authentication credentials. **Note:** You may need to set
145+
`reauth=True` in order to update your credentials to the most recent
146+
version. This is required to use new functionality such as the
147+
BigQuery Storage API. (`267`)
148+
- Use `to_dataframe()` from `google-cloud-bigquery` in the
149+
`read_gbq()` function. (`247`)
150+
151+
### Enhancements
152+
153+
- Fix a bug where pandas-gbq could not upload an empty DataFrame.
154+
(`237`)
155+
- Allow `table_schema` in `to_gbq` to contain only a subset of
156+
columns, with the rest being populated using the DataFrame dtypes
157+
(`218`) (contributed by @johnpaton)
158+
- Read `project_id` in `to_gbq` from provided `credentials` if
159+
available (contributed by @daureg)
160+
- `read_gbq` uses the timezone-aware
161+
`DatetimeTZDtype(unit='ns', tz='UTC')` dtype for BigQuery
162+
`TIMESTAMP` columns. (`269`)
163+
- Add `use_bqstorage_api` to `read_gbq`. The BigQuery Storage API can
164+
be used to download large query results (>125 MB) more quickly. If
165+
the BQ Storage API can't be used, the BigQuery API is used instead.
166+
(`133`, `270`)
167+
168+
## 0.9.0 / 2019-01-11
169+
170+
- Warn when deprecated `private_key` parameter is used (`240`)
171+
- **New dependency** Use the `pydata-google-auth` package for
172+
authentication. (`241`)
173+
174+
## 0.8.0 / 2018-11-12
175+
176+
### Breaking changes
177+
178+
- **Deprecate** `private_key` parameter to `pandas_gbq.read_gbq` and
179+
`pandas_gbq.to_gbq` in favor of new `credentials` argument. Instead,
180+
create a credentials object using
181+
`google.oauth2.service_account.Credentials.from_service_account_info`
182+
or
183+
`google.oauth2.service_account.Credentials.from_service_account_file`.
184+
See the `authentication how-to guide <howto/authentication>` for
185+
examples. (`161`, `231`)
186+
187+
### Enhancements
188+
189+
- Allow newlines in data passed to `to_gbq`. (`180`)
190+
- Add `pandas_gbq.context.dialect` to allow overriding the default SQL
191+
syntax dialect. (`195`, `235`)
192+
- Support Python 3.7. (`197`, `232`)
193+
194+
### Internal changes
195+
196+
- Migrate tests to CircleCI. (`228`, `232`)
197+
198+
## 0.7.0 / 2018-10-19
199+
200+
- `int` columns which contain
201+
`NULL` are now cast to
202+
`float`, rather than
203+
`object` type. (`174`)
204+
- `DATE`,
205+
`DATETIME` and
206+
`TIMESTAMP` columns are now parsed as pandas'
207+
`timestamp` objects. (`224`)
208+
- Add `pandas_gbq.Context` to cache credentials in-memory, across
209+
calls to `read_gbq` and `to_gbq`. (`198`, `208`)
210+
- Fast queries now do not log above `DEBUG` level. (`204`) With
211+
BigQuery's release of
212+
[clustering](https://cloud.google.com/bigquery/docs/clustered-tables)
213+
querying smaller samples of data is now faster and cheaper.
214+
- Don't load credentials from disk if reauth is `True`. (`212`) This
215+
fixes a bug where pandas-gbq could not refresh credentials if the
216+
cached credentials were invalid, revoked, or expired, even when
217+
`reauth=True`.
218+
- Catch RefreshError when trying credentials. (`226`)
219+
220+
### Internal changes
221+
222+
- Avoid listing datasets and tables in system tests. (`215`)
223+
- Improved performance from eliminating some duplicative parsing steps
224+
(`224`)
225+
226+
## 0.6.1 / 2018-09-11
227+
228+
- Improved `read_gbq` performance and memory consumption by delegating
229+
`DataFrame` construction to the Pandas library, radically reducing
230+
the number of loops that execute in python (`128`)
231+
- Reduced verbosity of logging from `read_gbq`, particularly for short
232+
queries. (`201`)
233+
- Avoid `SELECT 1` query when running `to_gbq`. (`202`)
234+
235+
## 0.6.0 / 2018-08-15
236+
237+
- Warn when `dialect` is not passed in to `read_gbq`. The default
238+
dialect will be changing from 'legacy' to 'standard' in a future
239+
version. (`195`)
240+
- Use general float with 15 decimal digit precision when writing to
241+
local CSV buffer in `to_gbq`. This prevents numerical overflow in
242+
certain edge cases. (`192`)
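
As a rough illustration of what a general float format with 15 significant digits does, the sketch below uses Python's printf-style `%.15g`. That this exact format string matches what `to_gbq` passes when writing its CSV buffer is an assumption, and `format_float_for_csv` is a hypothetical helper name.

```python
def format_float_for_csv(value):
    """Render a float with at most 15 significant digits.

    The "g" conversion drops trailing zeros and switches to exponent
    notation for very large or very small magnitudes.
    """
    return "%.15g" % value


# Binary rounding noise beyond 15 significant digits is discarded.
print(format_float_for_csv(0.1 + 0.2))

# Large magnitudes are written compactly in exponent form.
print(format_float_for_csv(1e20))
```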
243+
244+
## 0.5.0 / 2018-06-15
245+
246+
- Project ID parameter is optional in `read_gbq` and `to_gbq` when it
247+
can be inferred from the environment. Note: you must still pass in a
248+
project ID when using user-based authentication. (`103`)
249+
- Progress bar added for `to_gbq`, through the optional library
250+
`tqdm` as a dependency. (`162`)
251+
- Add location parameter to `read_gbq` and `to_gbq` so that pandas-gbq
252+
can work with datasets in the Tokyo region. (`177`)
17 253

18 254
### Documentation
19 255

20-
### Internal / Testing Changes
256+
- Add `authentication how-to guide <howto/authentication>`. (`183`)
257+
- Update `contributing` guide with new paths to tests. (`154`, `164`)
258+
259+
### Internal changes
260+
261+
- Tests now use `nox` to run in multiple
262+
Python environments. (`52`)
263+
- Renamed internal modules. (`154`)
264+
- Refactored auth to an internal auth module. (`176`)
265+
- Add unit tests for `get_credentials()`. (`184`)
266+
267+
## 0.4.1 / 2018-04-05
268+
269+
- Only show `verbose` deprecation warning if Pandas version does not
270+
populate it. (`157`)
271+
272+
## 0.4.0 / 2018-04-03
273+
274+
- Fix bug in `read_gbq` when building a
275+
dataframe with integer columns on Windows. Explicitly use 64bit
276+
integers when converting from BQ types. (`119`)
277+
- Fix bug in `read_gbq` when querying for
278+
an array of floats (`123`)
279+
- Fix bug in `read_gbq` with
280+
configuration argument. Updates
281+
`read_gbq` to account for breaking change in
282+
the way `google-cloud-python` version 0.32.0+ handles query
283+
configuration API representation. (`152`)
284+
- Fix bug in `to_gbq` where seconds were
285+
discarded in timestamp columns. (`148`)
286+
- Fix bug in `to_gbq` when supplying a
287+
user-defined schema (`150`)
288+
- **Deprecate** the `verbose` parameter in
289+
`read_gbq` and
290+
`to_gbq`. Messages use the logging module
291+
instead of printing progress directly to standard output. (`12`)
292+
293+
## 0.3.1 / 2018-02-13
294+
295+
- Fix an issue where Unicode couldn't be uploaded in Python 2 (`106`)
296+
- Add support for a passed schema in `to_gbq`
297+
instead of inferring the schema from the passed
298+
`DataFrame`
299+
with
300+
`DataFrame.dtypes`.
301+
(`46`)
302+
- Fix an issue where a dataframe containing both integer and floating
303+
point columns could not be uploaded with `to_gbq` (`116`)
304+
- `to_gbq` now uses `to_csv` to avoid manually looping over rows in a
305+
dataframe (should result in faster table uploads) (`96`)
306+
307+
## 0.3.0 / 2018-01-03
308+
309+
- Use the
310+
[google-cloud-bigquery](https://googlecloudplatform.github.io/google-cloud-python/latest/bigquery/usage.html)
311+
library for API calls. The `google-cloud-bigquery` package is a new
312+
dependency, and dependencies on `google-api-python-client` and
313+
`httplib2` are removed. See the [installation
314+
guide](https://pandas-gbq.readthedocs.io/en/latest/install.html#dependencies)
315+
for more details. (`93`)
316+
- Structs and arrays are now named properly (`23`) and BigQuery
317+
functions like `array_agg` no longer run into errors during type
318+
conversion (`22`).
319+
- `to_gbq` now uses a load job instead of the streaming API. Remove
320+
`StreamingInsertError` class, as it is no longer used by `to_gbq`.
321+
(`7`, `75`)
322+
323+
## 0.2.1 / 2017-11-27
324+
325+
- `read_gbq` now raises `QueryTimeout` if the request exceeds the
326+
`query.timeoutMs` value specified in the BigQuery configuration.
327+
(`76`)
328+
- Environment variable `PANDAS_GBQ_CREDENTIALS_FILE` can now be used
329+
to override the default location where the BigQuery user account
330+
credentials are stored. (`86`)
331+
- BigQuery user account credentials are now stored in an
332+
application-specific hidden user folder on the operating system.
333+
(`41`)
334+
335+
## 0.2.0 / 2017-07-24
336+
337+
- Drop support for Python 3.4 (`40`)
338+
- The dataframe passed to
339+
`.to_gbq(...., if_exists='append')`
340+
needs to contain only a subset of the fields in the BigQuery schema.
341+
(`24`)
342+
- Use the [google-auth](https://google-auth.readthedocs.io/en/latest/)
343+
library for authentication because `oauth2client` is deprecated.
344+
(`39`)
345+
- `read_gbq` now has an `auth_local_webserver` boolean argument for
346+
controlling whether to use web server or console flow when getting
347+
user credentials. Replaces the
348+
`--noauth_local_webserver` command line
349+
argument. (`35`)
350+
- `read_gbq` now displays the BigQuery Job ID and standard price in
351+
verbose output. (`70` and `71`)
352+
353+
## 0.1.6 / 2017-05-03
354+
355+
- All gbq errors will simply be subclasses of `ValueError` and no
356+
longer inherit from the deprecated `PandasError`.
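
One practical consequence is that callers can catch `ValueError` to handle any gbq error. The sketch below shows the pattern with stand-in classes. `ExampleGbqError`, `ExampleTableNotFound`, `run_query`, and `describe_failure` are hypothetical names for illustration, not pandas-gbq's actual exception hierarchy.

```python
class ExampleGbqError(ValueError):
    """Hypothetical base class standing in for a gbq error."""


class ExampleTableNotFound(ExampleGbqError):
    """Hypothetical, more specific gbq error."""


def run_query():
    # Stand-in for a call that fails inside the library.
    raise ExampleTableNotFound("table my_dataset.my_table was not found")


def describe_failure():
    try:
        run_query()
    except ValueError as exc:
        # Any error deriving from ValueError is caught here, however
        # specific its own class is.
        return f"{type(exc).__name__}: {exc}"
    return "no error"


print(describe_failure())
```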
357+
358+
## 0.1.4 / 2017-03-17
359+
360+
- `InvalidIndexColumn` will be raised instead of `InvalidColumnOrder`
361+
in `read_gbq` when the index column specified does not exist in the
362+
BigQuery schema. (`6`)
363+
364+
## 0.1.3 / 2017-03-04
365+
366+
- Fix a bug with appending to a BigQuery table where fields have modes
367+
(NULLABLE, REQUIRED, REPEATED) specified. These modes were compared
368+
against the remote schema, and writing a table via `to_gbq` would
369+
previously raise an error. (`13`)
370+
371+
## 0.1.2 / 2017-02-23
372+
373+
Initial release of transferred code from
374+
[pandas](https://github.com/pandas-dev/pandas)
375+
376+
Includes patches since the 0.19.2 release on pandas with the following:
21 377

378+
- `read_gbq` now allows query configuration preferences
379+
[pandas-GH#14742](https://github.com/pandas-dev/pandas/pull/14742)
380+
- `read_gbq` now stores `INTEGER` columns as `dtype=object` if they
381+
contain `NULL` values. Otherwise they are stored as `int64`. This
382+
prevents precision loss for integers greater than 2\*\*53.
383+
Furthermore, `FLOAT` columns with values above 10\*\*4 are no
384+
longer cast to `int64`, which also caused precision loss
385+
[pandas-GH#14064](https://github.com/pandas-dev/pandas/pull/14064),
386+
and
387+
[pandas-GH#14305](https://github.com/pandas-dev/pandas/pull/14305)
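
The 2\*\*53 threshold comes from the 53-bit significand of a 64-bit float: above it, not every integer has an exact float representation. A small plain-Python sketch of the effect, with no pandas or BigQuery involved; `survives_float_roundtrip` is a hypothetical helper name.

```python
LIMIT = 2**53  # largest power of two below which all integers are exact floats


def survives_float_roundtrip(n):
    """Return True if the integer n is unchanged after conversion to float."""
    return int(float(n)) == n


# At the limit, the value is still exactly representable.
print(survives_float_roundtrip(LIMIT))

# One past the limit, the float conversion rounds to a neighbouring value.
print(survives_float_roundtrip(LIMIT + 1))
```

Storing such a column as `object` keeps the original Python integers, which is why that dtype avoids the rounding shown here.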
