BUG: Key Error: range exception when printing DataFrame #3869

dmlockhart · 2013-06-12T19:04:13Z

Here's a reproducible example:

df = pd.DataFrame({ 'A' : ['foo',"~:{range}:0"], 'B' : ['bar','bah'] })
df
             A    B
0          foo  bar
1  ~:{range}:0  bah

df.set_index(['A']).info()
*** KeyError: 'range'

In core/index.py, summary should check that head/tail is not already a str before calling format():

    def summary(self, name=None):
        if len(self) > 0:
            head = self[0]
            if hasattr(head,'format'):
                head = head.format()
            tail = self[-1]
            if hasattr(tail,'format'):
                tail = tail.format()
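The same failure can be seen without pandas. The snippet below is a minimal sketch, assuming the root cause is that `summary` calls `str.format()` with no arguments on an index label that happens to contain a replacement field; the variable names are illustrative and not taken from the pandas source.

```python
# Illustrative only: an index label that contains a str.format replacement field.
label = "~:{range}:0"

missing_field = None
try:
    # With no arguments, format() cannot resolve the named field {range}.
    label.format()
except KeyError as exc:
    missing_field = exc.args[0]

# A label without braces passes through format() unchanged.
plain = "foo".format()

print(missing_field, plain)
```

Since every str has a `format` attribute, the `hasattr(head, 'format')` check alone does not tell a plain label apart from an object that knows how to format itself.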

Printing a DataFrame created from two Series objects (previously columns in other DataFrames) raises "KeyError: 'range'". Creating the DataFrame seems to work fine, and printing other DataFrames with the same "problematic" index also works.

Test code:

import pstats
import pandas as pd

# Import some cProfile data
run1 = pstats.Stats('run1.prof')
run2 = pstats.Stats('run2.prof')

# Utility function to convert pstats dict into a list of lists
def pstats_to_list( stats ):
  plist = []
  for key, value in stats.strip_dirs().stats.items():
    filename, lineno, func_name = key
    ccalls, ncalls, total_time, cum_time, callers = value
    name = "{}:{}:{}".format( filename, func_name, lineno )
    plist.append( [name, ncalls, total_time, cum_time] )
  return plist

jit_list   = pstats_to_list( run1 )
nojit_list = pstats_to_list( run2 )

# Create DataFrames for the profile run data
columns=['name','ncalls','ttime', 'ctime']
jdf = pd.DataFrame( jit_list,   columns = columns )
ndf = pd.DataFrame( nojit_list, columns = columns )

# Set the 'name' column to be the index (for plotting)
jdf = jdf.set_index( 'name' )
ndf = ndf.set_index( 'name' )

# These DataFrames print fine
print jdf
print ndf

# Extract out the 'ttime' columns
x = jdf['ttime']
y = ndf['ttime']

# Create a new DataFrame using the 'ttime' Series from jdf and ndf
z = pd.DataFrame( {'jit': x, 'nojit': y } )

# Print some data.... this works
print z[0:10]

# Print some data.... this raises "KeyError: 'range'"
print z


jreback · 2013-07-10T13:17:19Z

can you post/link to these .prof files? This is impossible to reproduce otherwise

dmlockhart · 2013-07-12T01:35:37Z

Here is a link to the two .prof files, as well as the test code quoted above:

https://dl.dropboxusercontent.com/u/1734164/run1.prof
https://dl.dropboxusercontent.com/u/1734164/run2.prof
https://dl.dropboxusercontent.com/u/1734164/pandas_issue_3869.py

jreback · 2013-07-12T01:39:07Z

great

pandas version, numpy version, and platform?

jreback · 2013-07-12T01:39:29Z

python version as well

jreback · 2013-07-12T12:24:02Z

@dmlockhart if you look at the top of the issue, I added a reproducible example.
The last element in the index contains something that 'looks' like it needs formatting (but isn't meant to be formatted).

To work around this for now, just call reset_index on z (so your index is a numeric index) rather than using this odd string index.

thanks for the report

jtratner · 2013-07-12T13:06:09Z

Neat bug actually... Probably just need to change pprint thing slightly and/or make sure that we don't build up format strings dynamically unless sure that string is escaped.

Might be worth adding a simple escape_format helper, something like:

def escape_format(strlike):
    return strlike.replace('{', '{{').replace('}', '}}')
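
As a rough sketch of how such a helper would behave, assuming it simply doubles each brace (the escape that `str.format()` itself documents), the escaped label survives a later `format()` call and comes back as the original text. The `label` value is the one from the report; the other variable names are illustrative.

```python
def escape_format(strlike):
    # Double each brace so str.format() treats it as a literal character.
    return strlike.replace('{', '{{').replace('}', '}}')

label = "~:{range}:0"
escaped = escape_format(label)

# format() collapses the doubled braces back to single ones
# instead of looking up a field named 'range'.
round_tripped = escaped.format()

print(escaped)
print(round_tripped)
```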

jreback · 2013-07-12T13:17:29Z

no...just a simple change....something like

    def summary(self, name=None):
        if len(self) > 0:
            head = self[0]
            if hasattr(head,'format') and not isinstance(head, basestring):
                head = head.format()
            tail = self[-1]
            if hasattr(tail,'format') and not isinstance(tail, basestring):
                tail = tail.format()
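
The guard above can be tried outside pandas. This is only a sketch under assumptions: `format_endpoint` and `Stamp` are hypothetical names made up for illustration, and `str` stands in for Python 2's `basestring` so that the snippet runs on Python 3.

```python
class Stamp:
    """Hypothetical non-string label that provides its own format()."""

    def __init__(self, year):
        self.year = year

    def format(self):
        return "year-{0}".format(self.year)


def format_endpoint(value):
    # Hypothetical helper mirroring the guard: strings are returned
    # untouched, other objects exposing format() are asked to format
    # themselves, and everything else passes through as-is.
    if hasattr(value, "format") and not isinstance(value, str):
        return value.format()
    return value


print(format_endpoint("~:{range}:0"))
print(format_endpoint(Stamp(2013)))
print(format_endpoint(5))
```

With the `isinstance` check in place, the label from the report is never passed through `str.format()`, so no field lookup is attempted.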

dmlockhart · 2013-07-15T18:17:15Z

@jreback here is my version information:

Python: Python 2.7.3 (default, Mar 26 2013, 21:14:37)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin

Pandas: pandas - 0.11.0
Numpy: numpy - 1.7.1

Platform: OSX 10.6.8

jtratner · 2013-09-05T02:43:34Z

@dmlockhart this should work now.

jtratner mentioned this issue Sep 5, 2013

BUG: Fix wrong str.format() calls in Index.summary #4751

Merged

jtratner closed this as completed in #4751 Sep 5, 2013
