Issue
I'm trying to write an asynchronous method to run a query on Hive (using pyhive). pyhive does support asynchronous querying, but I have no idea how to wait for the query to finish without blocking.
I can wait for the query to finish by repeatedly checking its state, but that's basically the same as blocking.
def runQuery():
    cursor = hive.connect('localhost').cursor()
    cursor.execute('select * from mytable', async_=True)
    status = cursor.poll().operationState
    while status in (TOperationState.INITIALIZED_STATE, TOperationState.RUNNING_STATE):
        status = cursor.poll().operationState
    return cursor.fetchall()
So I switched to an async function, but then I don't know what to await. I tried the code below, but it throws TypeError: object int can't be used in 'await' expression
async def runQueryAsync():
    cursor = hive.connect('localhost').cursor()
    cursor.execute('select * from mytable', async_=True)
    # THIS DOESN'T WORK
    await cursor.poll().operationState not in (TOperationState.INITIALIZED_STATE, TOperationState.RUNNING_STATE)
    return cursor.fetchall()
Are there any workarounds? Basically, instead of saying await methodCall, I want a way to say "await until this condition is true".
PS: To clarify, cursor.execute('select * from mytable', async_=True) is not asynchronous in the Python sense of returning a coroutine/future. It simply starts a query and immediately returns, and you have to check the state to know whether the query is finished. So await cursor.execute('select * from mytable', async_=True) won't work.
Solution
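You would have to poll actively, yielding to the event loop between checks:
import asyncio
from pyhive import hive
from TCLIService.ttypes import TOperationState

async def runQueryAsync():
    cursor = hive.connect('localhost').cursor()
    # with async_=True, execute() starts the query and returns immediately; it is not awaitable
    cursor.execute('select * from mytable', async_=True)
    # keep waiting while the query is still initializing or running
    while cursor.poll().operationState in (TOperationState.INITIALIZED_STATE, TOperationState.RUNNING_STATE):
        await asyncio.sleep(1)  # check once per second; other tasks run in the meantime
    return cursor.fetchall()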
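The polling loop can also be wrapped in a small reusable helper, which comes close to the "await until this condition is true" phrasing in the question. This is only a sketch: wait_until, its interval and timeout parameters, and the FakeCursor stand-in are hypothetical names introduced for illustration and are not part of pyhive or asyncio.

```python
import asyncio
import time


async def wait_until(predicate, interval=1.0, timeout=None):
    """Suspend the calling coroutine until predicate() returns a truthy value.

    Hypothetical helper: calls predicate every `interval` seconds and yields
    to the event loop in between. Raises asyncio.TimeoutError on timeout.
    """
    deadline = None if timeout is None else time.monotonic() + timeout
    while not predicate():
        if deadline is not None and time.monotonic() >= deadline:
            raise asyncio.TimeoutError("condition was not met in time")
        await asyncio.sleep(interval)


class FakeCursor:
    """Stand-in for a pyhive cursor so the sketch runs without a Hive server."""

    def __init__(self, polls_until_done):
        self._remaining = polls_until_done

    def is_finished(self):
        # With pyhive, this is where cursor.poll().operationState would be
        # compared against INITIALIZED_STATE and RUNNING_STATE.
        self._remaining -= 1
        return self._remaining <= 0


async def demo():
    cursor = FakeCursor(polls_until_done=3)
    await wait_until(cursor.is_finished, interval=0.01)
    return "done"


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

With a real cursor, the predicate could be a lambda that checks cursor.poll().operationState, and the optional timeout guards against a query that never leaves the running state.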
As the question's PS points out, cursor.execute('select * from mytable', async_=True) does not return a coroutine, so it should be called without await; it is the polling loop that does the waiting. It would make sense for a driver to offer an awaitable execute, though. If one did, you would not need the while loop, since the coroutine would resume once the query had finished:
async def runQueryAsync():
    cursor = hive.connect('localhost').cursor()
    # hypothetical: assumes an execute() that returns an awaitable, which pyhive's does not
    await cursor.execute('select * from mytable', async_=True)
    return cursor.fetchall()
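To see why sleeping between polls is not the same as blocking, here is a minimal sketch that runs two polling coroutines side by side. StubCursor, still_running, and run_query are hypothetical names used so the example runs without a Hive server. Handing fetchall() to asyncio.to_thread (Python 3.9+) is one possible way to keep that blocking call off the event loop; whether a real pyhive cursor is safe to use from a worker thread is an assumption here.

```python
import asyncio


class StubCursor:
    """Hypothetical stand-in for a pyhive cursor; finishes after N polls."""

    def __init__(self, name, polls_needed, log):
        self.name = name
        self._polls_left = polls_needed
        self._log = log

    def still_running(self):
        # Real code would inspect cursor.poll().operationState here.
        self._polls_left -= 1
        self._log.append(f"{self.name}: poll")
        return self._polls_left > 0

    def fetchall(self):
        return [(self.name, "row")]


async def run_query(cursor, interval=0.01):
    # Sleeping between polls gives other tasks a turn on the event loop.
    while cursor.still_running():
        await asyncio.sleep(interval)
    # fetchall() is a blocking call, so run it in a worker thread.
    return await asyncio.to_thread(cursor.fetchall)


async def main():
    log = []
    slow = StubCursor("slow", polls_needed=3, log=log)
    fast = StubCursor("fast", polls_needed=1, log=log)
    # Both queries are polled concurrently on a single thread.
    results = await asyncio.gather(run_query(slow), run_query(fast))
    return results, log


if __name__ == "__main__":
    results, log = asyncio.run(main())
    print(results)
    print(log)
```

The log shows the fast query being polled while the slow one is still sleeping, which is the behavior a busy-wait loop without asyncio.sleep would prevent.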
Answered By - Netwave