Issue
How can I use a globally defined variable (a pandas DataFrame, df) inside a Scrapy spider?
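import scrapy
import pandas as pd
import re

df = pd.read_csv('home/test.csv')

class Spider(scrapy.Spider):
    name = 'test'
    start_urls = ['https://test.org']

    def parse(self, response):
        data = response.css('get-data-here').extract()
        for i in data:
            # group(1) needs a capture group; '(test)' is a placeholder pattern
            key = re.search(r'(test)', i).group(1)
            # .loc selects column 1 of the rows whose column 0 equals key
            final_output = df.loc[df[0] == key, 1].item()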
Solution
Declare the variable inside the class. If it needs to be initialized, do that in the constructor and read it through self.df.
import scrapy
import pandas as pd
import re

class Spider(scrapy.Spider):
    name = 'test'
    start_urls = ['https://test.org']

    def __init__(self, *args, **kwargs):
        # let scrapy.Spider finish its own setup before adding attributes
        super().__init__(*args, **kwargs)
        self.df = pd.read_csv('home/test.csv')

    def parse(self, response):
        data = response.css('get-data-here').extract()
        # use the data frame via self.df
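The lookup that parse would then run against self.df can be sketched on its own, without Scrapy. This is only an illustration under assumptions: lookup_value, CSV_TEXT, and the "item: (\w+)" pattern are hypothetical names and data, not part of the original question, and the CSV is assumed to have no header row, so its columns are labelled 0 and 1.

```python
import io
import re

import pandas as pd

# Hypothetical stand-in for 'home/test.csv': two unnamed columns, key and value.
CSV_TEXT = "alpha,10\nbeta,20\n"


def lookup_value(df, text, pattern):
    """Return the column-1 value of the row whose column 0 equals the regex capture, else None."""
    match = re.search(pattern, text)
    if match is None:
        return None  # the scraped string did not contain the pattern
    rows = df.loc[df[0] == match.group(1), 1]
    if rows.empty:
        return None  # no row in the data frame has that key
    return rows.iloc[0]


# header=None makes pandas label the columns 0 and 1, matching df[0] above.
df = pd.read_csv(io.StringIO(CSV_TEXT), header=None)

print(lookup_value(df, "item: beta", r"item: (\w+)"))
print(lookup_value(df, "item: gamma", r"item: (\w+)"))
```

Inside the spider, the same helper could be called as lookup_value(self.df, i, pattern) for each scraped string i. Returning None on a missed match or a missing key avoids the exceptions that re.search(...).group(1) and .item() would otherwise raise.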
Answered By - hammad rauf