urq commented Jan 23, 2015

When polling a long-running gbq job for completion, we should return results only once query_results['jobCompleted'] is True, not merely when the jobCompleted key exists. Otherwise the gbq client assumes the results are ready and starts parsing a reply that does not contain totalRows yet, which raises a KeyError:

/Library/Python/2.7/site-packages/pandas/io/gbq.pyc in read_gbq(query, project_id, index_col, col_order, reauth)
    368
    369     connector = GbqConnector(project_id, reauth = reauth)
--> 370     schema, pages = connector.run_query(query)
    371     dataframe_list = []
    372     while len(pages) > 0:

/Library/Python/2.7/site-packages/pandas/io/gbq.pyc in run_query(self, query)
    192                             jobId=job_reference['jobId']).execute()
    193
--> 194         total_rows = int(query_reply['totalRows'])
    195         result_pages = list()
    196         seen_page_tokens = list()

KeyError: 'totalRows'

This simple patch accounts for the case where query_results['jobCompleted'] is False.
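
For illustration only, here is a minimal sketch of the polling behavior described above. It is not the actual patch: `wait_for_query_results`, `fetch_results`, `done_key`, `poll_interval` and `max_polls` are hypothetical names, and the replies are faked with plain dicts so the snippet runs without a BigQuery connection. The key name `jobCompleted` is taken from this description and should be checked against the real reply.

```python
import time


def wait_for_query_results(fetch_results, done_key='jobCompleted',
                           poll_interval=1.0, max_polls=None):
    """Call fetch_results() until the reply marks the job as done.

    fetch_results is a hypothetical zero-argument callable that returns
    the reply dict for one poll of the job.
    """
    polls = 0
    while True:
        reply = fetch_results()
        # Test the value, not just the presence of the key: an unfinished
        # job may report the key as False and omit 'totalRows'.
        if reply.get(done_key, False):
            return reply
        polls += 1
        if max_polls is not None and polls >= max_polls:
            raise RuntimeError('job did not complete after %d polls' % polls)
        time.sleep(poll_interval)


# Faked replies: the first poll reports an unfinished job.
replies = iter([
    {'jobCompleted': False},
    {'jobCompleted': True, 'totalRows': '3'},
])
result = wait_for_query_results(lambda: next(replies), poll_interval=0)
total_rows = int(result['totalRows'])
```

With a check that only tests whether the key exists, the first faked reply would be returned and `int(result['totalRows'])` would fail with the same KeyError as in the traceback.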

When polling for a long-running gbq job to determine if it is complete, we
should only return results once query_results['jobCompleted'] is True, not
just when the 'jobCompleted' key exists.

jreback (Contributor) commented Jan 23, 2015

This is addressed in #8728, which is pending some more testing.
