Issue
I have code that runs over a dataset with about 7000 columns and around 250k records. It works fine when I run it from Anaconda Spyder, but now I want to run it on a schedule through Task Scheduler, so I need to execute it from the cmd Command Prompt as

Python "c:\myfolder\predservice.py"

It didn't work under Task Scheduler, so I started tracing the problem by running the script with Python directly from cmd. There I got the error below, which I never got when running through Spyder. How can I avoid this error?

I suspect it is a memory issue: when I select a subset of the dataset it runs fine, but when I run the whole dataset I get the error shown in the traceback below.
>>> df = pd.concat([df.drop('PreviousDRGs', 1), pd.get_dummies(df['PreviousDRGs'].str.split(",", expand=True), prefix='PreviousDRGs').max(level=0, axis=1).astype(np.int8)], axis=1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\decisionsupport\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\generic.py", line 11616, in stat_func
    return self._agg_by_level(name, axis=axis, level=level, skipna=skipna)
  File "C:\Users\decisionsupport\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\generic.py", line 10440, in _agg_by_level
    grouped = self.groupby(level=level, axis=axis, sort=False)
  File "C:\Users\decisionsupport\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\generic.py", line 7894, in groupby
    **kwargs
  File "C:\Users\decisionsupport\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\groupby\groupby.py", line 2522, in groupby
    return klass(obj, by, **kwds)
  File "C:\Users\decisionsupport\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\groupby\groupby.py", line 363, in __init__
    obj._consolidate_inplace()
  File "C:\Users\decisionsupport\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\generic.py", line 5252, in _consolidate_inplace
    self._protect_consolidate(f)
  File "C:\Users\decisionsupport\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\generic.py", line 5241, in _protect_consolidate
    result = f()
  File "C:\Users\decisionsupport\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\generic.py", line 5250, in f
    self._data = self._data.consolidate()
  File "C:\Users\decisionsupport\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\internals\managers.py", line 932, in consolidate
    bm._consolidate_inplace()
  File "C:\Users\decisionsupport\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\internals\managers.py", line 937, in _consolidate_inplace
    self.blocks = tuple(_consolidate(self.blocks))
  File "C:\Users\decisionsupport\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\internals\managers.py", line 1913, in _consolidate
    list(group_blocks), dtype=dtype, _can_consolidate=_can_consolidate
  File "C:\Users\decisionsupport\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\internals\blocks.py", line 3323, in _merge_blocks
    new_values = new_values[argsort]
numpy.core._exceptions.MemoryError: Unable to allocate array with shape (4524, 138299) and data type uint8
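For reference, a minimal way to see how much memory the frame itself already occupies (a quick check, assuming df is the loaded DataFrame) is pandas' built-in memory report:

# Approximate in-memory size of the DataFrame in MiB
# deep=True also counts the contents of string/object columns
print(df.memory_usage(deep=True).sum() / 2**20, "MiB")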
Solution
Looks like Anaconda is using a 64-bit Python, while the traceback shows that the interpreter you ran from cmd (and that Task Scheduler would use) is a 32-bit install: note the Python37-32 in the file paths. A 32-bit process is limited to roughly 2 to 4 GB of address space, so a contiguous allocation like the one in the error (4524 x 138299 uint8 values, about 0.6 GB) can easily fail there even though it fits comfortably under a 64-bit Python. Either
- replace your 32-bit Python with a 64-bit version,
- or install a separate 64-bit Python and tell Task Scheduler to use that interpreter to run your .py file.
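To confirm which interpreter each environment is actually using, a minimal check is to run a few lines both inside Spyder and in the cmd Python; it uses only standard-library calls:

import platform
import struct
import sys

# Full path of the python.exe that is running this code
print(sys.executable)
# Reports '32bit' or '64bit' for the current build
print(platform.architecture()[0])
# Pointer size in bits: 32 for a 32-bit build, 64 for a 64-bit build
print(struct.calcsize("P") * 8)

If the cmd run reports 32bit, set up the scheduled task so that its program is the full path to a 64-bit python.exe (whatever path your 64-bit install uses) and its argument is "c:\myfolder\predservice.py".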
Answered By - Justin Ezequiel