Issue
I currently have my flask running static where I run the ETL job independently and then use the result dataset in my Flask to display chartjs line charts.
However, I'd like to integrate the ETL piece in my web framework where my users can login and submit the input parameter(locations of the input files and an added version id) using HTML form, which will then be used by my ETL job to run and use the resultant data directly to display the charts on same page.
Current Setup:
My custom ETL module has submodules that act together to form a simple pipeline process:
globals.py - has my globals such as location of the s3 and the etc. ideally i'd like my user's form inputs to be stored here so that they can be used directly in all my submodules wherever necessary.
s3_bkt = 'abc' #change bucket here
s3_loc = 's3://'+s3_bkt+'/'
ip_loc = 'alv-input/'
#Ideally ,I'd like my users form inputs to be sitting here
# ip1 = 'alv_ip.csv'
# ip2 = 'Input_product_ICL_60K_Seg15.xlsx'
#version = 'v1'
op_loc = 'alv-output/'
---main-module.py - main function
import module3 as m3
import globals as g
def main(ip1,ip2,version):
data3,ip1,ip2,version = m3.module3(ip1,ip2,version)
----perform some actions on the data and return---
return res_data
---module3.py
import module2 as m2
def mod3(ip1,ip2,version):
data2,ip1,ip2,version = m2.mod2(ip1,ip2,version)
----perform some actions on the data and return---
return data3
---module2.py
import module1 as m1
import globals as g
def mod2(ip1,ip2,version):
data1,ip1,ip2,version = m1.mod1(ip1,ip2,version)
data_cnsts = pd.read_csv(ip2) #this is where i'll be using the user's input for ip2
----perform some actions on the datasets and write them to location with version_id to return---
data1.to_csv(g.op_loc+'data2-'+ version + '.csv', index=False)
return data2
---module1.py
import globals as g
def mod1(ip1,ip2,version):
#this is the location where the form input for the data location should be actually used
data = pd.read_csv(g.s3_loc+g.ip_loc+ip1)
----perform some actions on the data and return---
return data1
Flask setup:
import main-module as mm
app = Flask(__name__)
#this is where the user first hits and submits the form
@app.route('/form')
def form():
return render_template('form.html')
@app.route('/result/',methods=['GET', 'POST'])
def upload():
msg=''
if request.method == 'GET':
return f"The URL /data is accessed directly. Try going to '/upload' to submit form"
if request.method == 'POST':
ip1 = request.form['ip_file']
ip2 = request.form['ip_sas_file']
version = request.form['version']
data = mm.main(ip1,ip2,version)
grpby_vars = ['a','b','c']
grouped_data = data.groupby(['mob'])[grpby_vars].mean().reset_index()
#labels for the chart
a = [val for val in grouped_data['a']]
#values for the chart
b = [round(val*100,3) for val in grouped_data['b']]
c = [round(val*100,3) for val in grouped_data['c']]
d = [val for val in grouped_data['d']]
return render_template('results.html', title='Predictions',a=a,b=b,c=c,d=d)
The Flask setup works perfectly fine without using any form inputs from the user(if the ETL job and Flask is de-coupled i.e. when I run the ETL and supply the result data location to Flask directly.
Problem:
The problem after integration is that I'm not very sure how to pass these inputs from users to all my sub-module. Below is the error I get when I use the current setup.
data3,ip1,ip2,version = m3.module3(ip1,ip2,version)
TypeError: module3() missing 3 required positional arguments: 'ip1', 'ip2', and 'version'
So, definitely this is due to the issue with my param passing across my sub-modules.
So my question is how do I use the data from the form as a global variable across my sub-modules. Ideally I'd like them to be stored in my globals so that I'd not have to be passing them as a param through all modules.
Is there a standard way to achieve that? Might sound very trivial but I'm struggling hard to get to my end-state.
Thanks for reading through :)
Solution
I realized my dumb mistake, I should've have just included
data,ip1,ip2,version = mm.main(ip1,ip2,version)
I could also instead use my globals file by initiating the inputs with empty string and then import my globals in the flask file and update the values. This way I can avoid the route of passing the params through my sub-modules.
Answered By - Sai Pardhu
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.