Issue
I would like to know whether there is a way to write a custom script that uses Django imports and runs in parallel from a Python shell. For example, imagine I could write this in the shell:
@make_it_parallel
def my_custom_task(user_range):
    from mydjangomodule.models import myclass
    # do something

for user_range in range(0, 1000, 10):
    my_custom_task.delay(user_range)
The main question is "how to easily run a script in parallel" without having to push something to production or set up a complete set of tools for a script that only needs to run once.
P.S.: If there is another tool besides IPython/Celery that can do the job, I would be interested to hear about it.
Solution
I ended up parallelizing Django scripts with IPython's ipyparallel package. Here is how to do it!
First, install ipyparallel: pip install ipyparallel -U
In IPython's default profile (or any profile you prefer), add these startup imports to load Django:
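import os
import django
# django.setup() reads DJANGO_SETTINGS_MODULE, so set it before calling setup()
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "MyProject.settings")
django.setup()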
Save this in a file at a path like this: ~/.ipython/profile_default/startup/00-load-django.py
This loads Django on each engine as it starts, so IPython can run your functions in parallel.
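If you would rather create that startup file from a script than by hand, something like the following should work. It is only a sketch: write_startup_file is a helper name I made up, and it assumes the settings module is MyProject.settings and that setting DJANGO_SETTINGS_MODULE explicitly is acceptable for your project.

```python
from pathlib import Path

# Contents of the startup file. Assumption: the settings module is
# "MyProject.settings"; pass a different value if yours differs.
STARTUP_CODE = '''\
import os
import django

os.environ.setdefault("DJANGO_SETTINGS_MODULE", "{settings_module}")
django.setup()
'''


def write_startup_file(profile_dir, settings_module="MyProject.settings"):
    """Write 00-load-django.py into <profile_dir>/startup and return its path."""
    startup_dir = Path(profile_dir).expanduser() / "startup"
    startup_dir.mkdir(parents=True, exist_ok=True)
    target = startup_dir / "00-load-django.py"
    target.write_text(STARTUP_CODE.format(settings_module=settings_module))
    return target
```

Calling write_startup_file("~/.ipython/profile_default") would then produce the file at the path mentioned above.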
Now let's start the engines that will run your on-the-fly Django scripts in parallel. Be sure to be in the main folder of your Django project (where the manage.py file is): ipcluster start -n X
where X is the number of engines you want (IMHO, the number of cores on the machine + 1).
Wait until ipcluster is fully operational before entering IPython.
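To avoid guessing when the cluster is ready, you can poll the client until the expected number of engines has registered. This is a rough sketch, not part of ipyparallel: suggested_engine_count and wait_for_engines are helper names I made up, and the only thing they rely on is that the client exposes an ids list of registered engines, as ipyparallel's Client does.

```python
import os
import time


def suggested_engine_count():
    """Number of cores + 1, following the rule of thumb for X."""
    return (os.cpu_count() or 1) + 1


def wait_for_engines(client, expected, timeout=30.0, poll=0.5):
    """Poll client.ids until `expected` engines have registered.

    `client` is anything with an `ids` list, such as ipyparallel.Client().
    Returns the registered engine ids, or raises TimeoutError.
    """
    deadline = time.monotonic() + timeout
    while True:
        ids = list(client.ids)
        if len(ids) >= expected:
            return ids
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"only {len(ids)} of {expected} engines registered"
            )
        time.sleep(poll)
```

With a real cluster you would call wait_for_engines(ipp.Client(), X) right after connecting, where X is the number you passed to ipcluster.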
Now let's parallelize that Django script. In IPython:
import ipyparallel as ipp

rc = ipp.Client()  # create the client that connects to the IPython engines
lview = rc.load_balanced_view()

@lview.parallel()
def show_polls(user_range):
    # import inside the function so the name resolves on the engine
    from poll.models import Poll
    return list(Poll.objects.filter(user_id__gte=user_range, user_id__lt=user_range + 100))

for res in show_polls.map(range(0, 1000, 100)):
    print(res)
And there we go, a Django script parallelized! Notice that I convert the QuerySet into a list; that's because anything returned needs to be picklable.
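If you are unsure whether a return value will make it back from an engine, you can check it locally with the standard pickle module first. The snippet below is just an illustration: is_picklable is a helper I made up, the generator stands in for any lazy result, and the id/user_id fields are hypothetical.

```python
import pickle


def is_picklable(obj):
    """Return True if obj survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
    except Exception:
        return False
    return True


# A generator stands in for any lazy result that cannot be pickled.
lazy_rows = ({"id": i, "user_id": i * 10} for i in range(3))
print(is_picklable(lazy_rows))  # False

# Materialising into a list of plain dicts makes it safe to send back.
rows = [{"id": i, "user_id": i * 10} for i in range(3)]
print(is_picklable(rows))  # True
print(pickle.loads(pickle.dumps(rows)) == rows)  # True
```

In the same spirit, returning list(queryset.values(...)) with the fields you need gives plain dicts instead of model instances, which may be lighter to ship between processes.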
Answered By - Hassek