Issue
I have two scripts, scraper.py and db_control.py. In scraper.py I have something like this:
...
def scrape(category, field, pages, search, use_proxy, proxy_file):
    ...
    loop = asyncio.get_event_loop()
    to_do = [get_pages(url, params, conngen) for url in urls]
    wait_coro = asyncio.wait(to_do)
    res, _ = loop.run_until_complete(wait_coro)
    ...
    loop.close()
    return [x.result() for x in res]
...
And in db_control.py:
from scraper import scrape
...
while new < 15:
    data = scrape(category, field, pages, search, use_proxy, proxy_file)
    ...
...
In theory, scrape() should be called an unknown number of times, until enough data has been obtained. But when new is not immediately >= 15, this error occurs:
  File "/usr/lib/python3.4/asyncio/base_events.py", line 293, in run_until_complete
    self._check_closed()
  File "/usr/lib/python3.4/asyncio/base_events.py", line 265, in _check_closed
    raise RuntimeError('Event loop is closed')
RuntimeError: Event loop is closed
But the script works just fine if I run scrape() only once. So I guess there is some problem with recreating the loop via loop = asyncio.get_event_loop(); I have tried this, but nothing changed. How can I fix this? Of course these are just snippets of my code; if you think the problem could be elsewhere, the full code is available here.
Solution
The methods run_until_complete, run_forever, run_in_executor, create_task, and call_at explicitly check the loop and raise an exception if it is closed.
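This is easy to reproduce in isolation; a minimal sketch of what happens on the second call once close() has run:

```python
import asyncio

async def coro():
    return 42

loop = asyncio.new_event_loop()
print(loop.run_until_complete(coro()))  # 42 -- fine the first time
loop.close()

try:
    loop.run_until_complete(coro())  # the internal _check_closed() fires here
except RuntimeError as e:
    print(e)  # Event loop is closed
```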
Quoting the docs for BaseEventLoop.close():

    This is idempotent and irreversible.
Unless you have a (good) reason to close the loop, you can simply omit the close line:
def scrape(category, field, pages, search, use_proxy, proxy_file):
    # ...
    loop = asyncio.get_event_loop()
    to_do = [get_pages(url, params, conngen) for url in urls]
    wait_coro = asyncio.wait(to_do)
    res, _ = loop.run_until_complete(wait_coro)
    # ...
    # loop.close()
    return [x.result() for x in res]
If you want a brand-new loop on each call, you have to create it manually and set it as the default:
def scrape(category, field, pages, search, use_proxy, proxy_file):
    # ...
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    to_do = [get_pages(url, params, conngen) for url in urls]
    wait_coro = asyncio.wait(to_do)
    res, _ = loop.run_until_complete(wait_coro)
    # ...
    return [x.result() for x in res]
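A self-contained sketch of the new-loop-per-call pattern, runnable as-is; get_page here is a hypothetical stand-in for the asker's get_pages coroutine (tasks are created explicitly because recent Python versions no longer accept bare coroutines in asyncio.wait):

```python
import asyncio

async def get_page(url):
    # stand-in for the real get_pages coroutine; just echoes the URL
    await asyncio.sleep(0)
    return url

def scrape(urls):
    # fresh loop per call, so repeated calls keep working
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    try:
        to_do = [loop.create_task(get_page(u)) for u in urls]
        done, _ = loop.run_until_complete(asyncio.wait(to_do))
        return [t.result() for t in done]
    finally:
        # closing is now safe: the next call builds its own loop
        loop.close()
        asyncio.set_event_loop(None)

print(sorted(scrape(["a", "b"])))  # ['a', 'b']
print(sorted(scrape(["c"])))       # ['c'] -- no "Event loop is closed"
```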
Answered By - kwarunek