Issue
I have code that reads a JSON file and loads it into a Neo4j database. Because the data set is large, I want to load it concurrently. I decided to use the asyncio library, but I ran into a problem: in the code below, the session.run call blocks, so the sessions run one after another instead of in parallel. Is there a way to get parallelism with Neo4j sessions?
def json_loading_function():
    ...
    asyncio.run(self.run_all_cqls(process_params), debug=True)

async def run_all_cqls(self, params):
    results = await asyncio.gather(*(self.run_cql(p['driver'], p['session_index'], p['cql'], p['rows_dict']) for p in params))
    return results

async def run_cql(self, session, sessionIndex, cql, dict):
    with self._driver.session(**self.db_config) as session:
        print('Running session %d' % sessionIndex)
        session.run(cql, dict=dict).consume()
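The blocking behaviour described in the question can be reproduced without a database. In the minimal sketch below, time.sleep stands in for a synchronous driver call and asyncio.sleep stands in for an awaited one; both stand-ins and all function names are hypothetical and only illustrate the effect. A coroutine that never awaits does not hand control back to the event loop, so asyncio.gather ends up running such coroutines one after another.

```python
import asyncio
import time


async def blocking_task(delay):
    # Stand-in for a synchronous driver call: it never yields to the event loop.
    time.sleep(delay)


async def cooperative_task(delay):
    # Stand-in for an awaited async call: it yields to the loop while waiting.
    await asyncio.sleep(delay)


async def timed_gather(task, n, delay):
    # Run n copies of the task with gather and return the elapsed wall time.
    start = time.perf_counter()
    await asyncio.gather(*(task(delay) for _ in range(n)))
    return time.perf_counter() - start


def compare(n=3, delay=0.1):
    # Returns (elapsed time with blocking tasks, elapsed time with cooperative tasks).
    blocking = asyncio.run(timed_gather(blocking_task, n, delay))
    cooperative = asyncio.run(timed_gather(cooperative_task, n, delay))
    return blocking, cooperative


if __name__ == "__main__":
    blocking, cooperative = compare()
    print(f"blocking: {blocking:.2f}s, cooperative: {cooperative:.2f}s")
```

With the blocking stand-in the total time grows roughly with the number of tasks, while the cooperative version overlaps the waits.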
Solution
I ended up using version 5 alpha of the Neo4j Python driver, which has async support. Here is my fork of the pyingest repo: https://github.com/cuneyttyler/pyingest
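For readers who do not want to dig through the fork, here is a rough sketch of what the question's run_cql and run_all_cqls might look like with the async API of the 5.x driver. It is not taken from the pyingest fork. It assumes that driver is an AsyncDriver created with AsyncGraphDatabase.driver, and that the query refers to its rows through a parameter named dict, as in the question. The main function and its uri, user and password arguments are hypothetical placeholders.

```python
import asyncio


async def run_cql(driver, session_index, cql, rows):
    # `driver` is assumed to be a neo4j.AsyncDriver (neo4j Python driver 5.x).
    # Each coroutine opens its own session; sessions are not meant to be shared
    # between concurrent tasks.
    async with driver.session() as session:
        print("Running session %d" % session_index)
        # Awaiting run() lets the event loop switch to other coroutines
        # while this one waits on the network.
        result = await session.run(cql, dict=rows)
        # consume() discards any remaining records and returns the summary.
        return await result.consume()


async def run_all_cqls(driver, params):
    # Schedule one coroutine per parameter set and wait for all of them.
    return await asyncio.gather(
        *(run_cql(driver, p["session_index"], p["cql"], p["rows_dict"]) for p in params)
    )


async def main(uri, user, password, params):
    # Hypothetical entry point; imported here so the helpers above can be
    # exercised without the driver installed.
    from neo4j import AsyncGraphDatabase  # requires neo4j >= 5.0

    driver = AsyncGraphDatabase.driver(uri, auth=(user, password))
    try:
        return await run_all_cqls(driver, params)
    finally:
        await driver.close()
```

It would be started with something like asyncio.run(main(uri, user, password, process_params)). Note that this gives concurrency on I/O waits within a single thread, not CPU parallelism.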
Answered By - cuneyttyler