Issue
I am creating a scraping bot that collects ETF data from etf.com. I am trying to export the data I collect to a SQLite database, but every time I do so I get this error message (once for each item I try to add):
sqlalchemy.exc.InterfaceError: (sqlite3.InterfaceError) Error binding parameter 0 - probably unsupported type.
[SQL: INSERT INTO etf (ticker, name, issuer, aum, expense_ratio, tr, segment) VALUES (?, ?, ?, ?, ?, ?, ?)]
[parameters: (['ROKT'], ['SPDR S&P Kensho Final Frontiers ETF'], ['State Street Global Advisors'], ['$19.24M'], ['0.45%'], ['6.45%'], ['Equity: U.S. Space'])]
My scraper (brandetfs_spider.py):
for etf in etfs:
    loader = ItemLoader(item=BrandetfsItem(), selector=etf)
    loader.add_css('ticker', 'a.linkTickerName::text')
    loader.add_css('name', 'td.col_2::text')
    loader.add_css('issuer', 'td.col_3::text')
    loader.add_css('aum', 'td.col_4::text')
    loader.add_css('expense_ratio', 'td.col_5::text')
    loader.add_css('tr', 'td.col_6::text')
    loader.add_css('segment', 'td.col_7::text')
    yield loader.load_item()
model:
Base = declarative_base()

def db_connect():
    """
    performs database connection using database settings from settings.py
    returns sqlalchemy engine instance
    """
    return create_engine(get_project_settings().get("CONNECTION_STRING"))  # connects to a database

def create_table(engine):
    Base.metadata.create_all(engine)

class ETF(Base):
    __tablename__ = "etf"
    print("-------------------------------------------")
    id = Column(Integer, primary_key=True)
    ticker = Column('ticker', Text())
    name = Column('name', Text())
    issuer = Column('issuer', Text())
    aum = Column('aum', String(10))
    expense_ratio = Column('expense_ratio', Text())
    tr = Column('tr', Text())
    segment = Column('segment', Text())
pipeline:
class BrandetfsPipeline(object):
    def __init__(self):
        """
        Initializes database connection and sessionmaker
        creates tables
        """
        engine = db_connect()
        create_table(engine)
        self.Session = sessionmaker(bind=engine)

    def process_item(self, item, spider):
        """
        Save etfs in the database
        This method is called for every item pipeline component
        """
        session = self.Session()
        # create etf table
        etf = ETF()
        etf.ticker = item["ticker"]
        etf.name = item["name"]
        etf.issuer = item["issuer"]
        etf.aum = item["aum"]
        etf.expense_ratio = item["expense_ratio"]
        etf.tr = item["tr"]
        etf.segment = item["segment"]
        try:
            session.add(etf)
            session.commit()
        except:
            session.rollback()
            raise
        finally:
            session.close()
        return item
settings.py:
CONNECTION_STRING = 'sqlite:///scrapy_etfs.db'
I don't understand why I am getting this error message, because I am using Text types for the database fields and text is what I'm trying to put into them.
Solution
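It's probably because ItemLoader returns lists: each field comes back as a list such as ['ROKT'] (you can see this in the parameters of the error message), and SQLite cannot bind a list to a column. You can fix it by using TakeFirst as an output processor, for example by setting default_output_processor = TakeFirst() on an ItemLoader subclass.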
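To see why a list triggers the binding error, here is a minimal sketch that uses only the standard library, so it runs without Scrapy or SQLAlchemy. The `take_first` helper is a hypothetical stand-in that mimics what Scrapy's TakeFirst processor does (return the first value that is neither None nor an empty string). The two-column table and the `item` dictionary are simplified assumptions based on the parameters shown in the error message, not the asker's actual schema.

```python
import sqlite3


def take_first(values):
    """Return the first value that is neither None nor an empty string.

    Hypothetical stand-in for Scrapy's TakeFirst processor, written here
    so the example runs without Scrapy installed.
    """
    for value in values:
        if value is not None and value != "":
            return value
    return None


# Field values shaped the way the error message shows them: one-element lists.
item = {"ticker": ["ROKT"], "aum": ["$19.24M"]}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE etf (ticker TEXT, aum TEXT)")

# Binding the lists directly fails, as in the question.
try:
    conn.execute(
        "INSERT INTO etf (ticker, aum) VALUES (?, ?)",
        (item["ticker"], item["aum"]),
    )
    list_insert_failed = False
except sqlite3.Error:
    # The exact subclass (InterfaceError or ProgrammingError) depends on
    # the Python version, so the common base class is caught here.
    list_insert_failed = True

# Unwrapping each list to a plain string makes the same INSERT succeed.
flat = {key: take_first(values) for key, values in item.items()}
conn.execute(
    "INSERT INTO etf (ticker, aum) VALUES (?, ?)",
    (flat["ticker"], flat["aum"]),
)
rows = conn.execute("SELECT ticker, aum FROM etf").fetchall()
conn.close()
```

In the real project the unwrapping would normally happen in the loader rather than in the pipeline, so that every item reaching `process_item` already holds plain strings.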
Answered By - zaki98