Skip to content

BUG: replace iterrows with itertuples in sql insert (GH6509) #6591

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

jorisvandenbossche
Copy link
Member

Fixes #6509.

Should still add a test

df.to_sql("test_read_write", self.conn)
df2 = sql.read_table("test_read_write", self.conn)

tm.assert_equal(df['s1'].values, df2['s1'].values)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jreback I couldn't use:

        tm.assert_frame_equal(df, df2.set_index('pandas_index'),
                              check_dtype=False, check_names=False)

because it seems this uses assert_almost_equal, and it seems that 33554433 is 'almost' equal to 33554432 ...

Is this also OK?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see....

maybe add an argument to assert_frame_equal (and also to assert_series_equal) that is check_exact=False; which is the default, but if True, then use assert_numpy_array_equal?

in this case you REALLY do care that they are exactly equal

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

even better

self.assertTrue(df.equals(df2) (though it has to match exacty including the index, names, dtype, everything)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for this last one, does it also have to match the dtype? Because in this case, I want to check the values while the dtypes don't match (int32 vs int64)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some if these may not match dtype and hence are not round trippable
eg the values will match but u lose the original dtype
nothing to do about that (and should be a gotcha in the docs)

@jreback jreback added this to the 0.14.0 milestone Mar 11, 2014
if check_exact:
if not np.array_equal(left.values, right.values):
raise AssertionError('{0} is not equal to {1}.'.format(left.values,
right.values))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jreback Is there a reason that assert_numpy_array_equal is only avaiable in the TestCase class and not in the rest of the testing module? (so I cannot use that function here)

@jorisvandenbossche
Copy link
Member Author

@jreback I added a check_exact keyword. Is this what you had in mind?

@jreback
Copy link
Contributor

jreback commented Mar 11, 2014

@jorisvandenbossche perfect.

assert_numpy_array_equal should prob be a module level function too (it just calls np.array_equal in any event), but no biggie.

in theory all functions should actually be on the TestCase, but too much work to fix

pls add a release note then merge

@jorisvandenbossche
Copy link
Member Author

Should there be a release note? As this fixes something (the new sqlalchemy based sql functions) that is not yet released?

@jreback
Copy link
Contributor

jreback commented Mar 11, 2014

oh right.....though like to reference the closed issue (maybe just add it to an existing one)

@jorisvandenbossche
Copy link
Member Author

Actually, it was also fixed in the legacy code. Anyway, added a line to release notes.

jorisvandenbossche added a commit that referenced this pull request Mar 11, 2014
BUG: replace iterrows with itertuples in sql insert (GH6509)
@jorisvandenbossche jorisvandenbossche merged commit dc59749 into pandas-dev:master Mar 11, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Dtype Conversions Unexpected or buggy dtype conversions IO SQL to_sql, read_sql, read_sql_query
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: SQL writing can lead to data loss
2 participants