Issue
I have a Python 3.7 (Spyder) script that collects data from a given .xlsx file and uses that data to build a cubic spline function.
The script then goes through all the files in a given directory, makes some calculations/adjustments to each source file (using the cubic spline function), and finally saves the results as new files.
I tried to export the script so that it can be run on another computer (I used "Auto Py to Exe"), which seemed to work great, however:
- The .exe file is huge (300 MB+)
- It runs extremely slowly

What am I doing wrong here? Since it's literally a few lines of code, the file shouldn't be that big, and it really should run in 1-2 seconds.
These are the imported modules:
import numpy as np
import pandas as pd
from scipy import interpolate
from scipy.interpolate import CubicSpline
import os
Here is the full code:
import numpy as np
import pandas as pd
from scipy import interpolate
from scipy.interpolate import CubicSpline
import os

# build the calibration spline from the baseline workbook
OESbaseline = pd.read_excel('OES_CubicSplineBaseline.xlsx')
x_baseline = OESbaseline['pre-CS']
y_baseline = OESbaseline['sample_known_ppm_with_flux']

cs = CubicSpline(x_baseline, y_baseline)
tck = interpolate.splrep(x_baseline, y_baseline)

def f(x):
    return interpolate.splev(x, tck)

basepath = "some_path"
for filename in os.listdir(basepath):
    file = os.path.join(basepath, filename)
    if os.path.isfile(file):
        OESrun = pd.read_csv(file, skiprows=2)
        CorrectedData = f(OESrun['Concentration'])
        CorrectedData[CorrectedData < 0] = 0
        CorrectedDF = pd.DataFrame({'SampleID': OESrun['Label'],
                                    'Recvd Wt. (kgs)': np.nan,
                                    'AQR (ppm)': np.around(CorrectedData, 3),
                                    'Grav (ppm)': np.nan,
                                    'FA Notes': np.nan,
                                    'AQR (R1) (ppm)': np.nan,
                                    'Grav (R1) (ppm)': np.nan,
                                    'Assay Wt.(R1) (gr)': np.nan,
                                    'AQR (R2) (ppm)': np.nan,
                                    'Grav (R2) (ppm)': np.nan,
                                    'Assay Wt.(R2) (gr)': np.nan,
                                    'CN (ppm)': np.nan,
                                    'CN R (ppm)': np.nan,
                                    '': np.nan,
                                    'Run Assay Wt.': OESrun['Weight'],
                                    'Grav. (OPT)': np.nan,
                                    # 'OES Conc. (ppm)': np.around(OESrun['Concentration'], 3),
                                    # 'CS Conc. (OPT)': np.around(CorrectedData * (1 / 34.285), 4),
                                    'Recvd Wt. (lbs)': np.nan})
        # (OPTIONAL) remove the second (267) wavelength readings
        # CorrectedDF_1wave = CorrectedDF.iloc[::2, :]
        oldname = os.path.splitext(filename)[0]
        oldext = os.path.splitext(filename)[1]
        new_filename = oldname + '_corrected' + oldext
        # export to CSV
        CorrectedDF.to_csv(os.path.join(basepath, new_filename))
        # (OPTIONAL) export to CSV (1 wave only)
        # CorrectedDF_1wave.to_csv(OESrun_filename + '_Corrected' + '.csv')
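For reference, the spline-correction step the script performs can be sketched in isolation. The numbers below are made up (the real baseline comes from OES_CubicSplineBaseline.xlsx); the sketch fits a CubicSpline to the baseline, evaluates it at raw readings, and clamps negatives to zero, just as the script does:

```python
import numpy as np
from scipy.interpolate import CubicSpline

# hypothetical baseline values standing in for the Excel columns
x_baseline = np.array([0.0, 1.0, 2.0, 3.0, 4.0])   # 'pre-CS'
y_baseline = np.array([-0.1, 0.9, 2.1, 2.9, 4.2])  # 'sample_known_ppm_with_flux'
cs = CubicSpline(x_baseline, y_baseline)

raw = np.array([0.5, 1.5, 3.5, 0.05])  # stand-in for OESrun['Concentration']
corrected = cs(raw)
corrected[corrected < 0] = 0           # clamp negatives, as in the script
print(np.around(corrected, 3))
```

The script builds both a CubicSpline (`cs`, which it never uses) and a `splrep`/`splev` pair; for this purpose they are interchangeable, so one of the two could be dropped.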
Solution
Create your package in a virtual environment and pip install only the packages you need to run your script. This will shrink your .exe from 300 MB+ to roughly 30 MB (with pandas in it), and it will boot up much faster.
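To see why the global-environment build balloons, you can list everything installed in the environment the packager snapshots. A minimal sketch (importlib.metadata requires Python 3.8+; on 3.7, `pip list` gives the same picture):

```python
from importlib import metadata

# every distribution visible in the current environment -- all of it
# is a candidate for bundling when you package from this environment
installed = sorted(
    (dist.metadata["Name"], dist.version)
    for dist in metadata.distributions()
    if dist.metadata["Name"]
)
for name, version in installed:
    print(f"{name}=={version}")
```

In a fresh venv this list is nearly empty; in a typical global Anaconda/Spyder environment it runs to hundreds of packages.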
My personal go-to is virtualenv, but you can use whatever virtual-environment tool you like:
step 1 - in your global environment pip install virtualenv
pip install virtualenv
step 2 - create your venv
python -m virtualenv venv
step 3 - activate your venv and pip install your packages
source venv/Scripts/activate
(that path is for Windows; on macOS/Linux use source venv/bin/activate)
pip install pandas scipy numpy
step 4 - run your script within the venv to ensure it works
python script.py
step 5 - package it (I am not familiar with "Auto Py to Exe", so I don't know whether you can point it at a venv; in case you cannot, here is the step with PyInstaller)
pip install pyinstaller
pyinstaller script.py --onefile
Answered By - PydPiper