Issue
I am working with a response in Scrapy and keep getting this error message.
I have only included the snippet where the error occurs. I am trying to go through different web pages and need to get the number of pages for each one. So I created a Response object to get the href of the Next button, but I keep getting AttributeError: 'Response' object has no attribute 'body_as_unicode'
The code I am working with:
from scrapy.spiders import Spider
from scrapy.selector import Selector
from scrapy.http import Request
from scrapingtest.items import ScrapingTestingItem
from collections import OrderedDict
import json
from scrapy.selector.lxmlsel import HtmlXPathSelector
import csv
import scrapy
from scrapy.http import Response


class scrapingtestspider(Spider):
    name = "scrapytesting"
    allowed_domains = ["tripadvisor.in"]
    # base_uri = ["tripadvisor.in"]

    def start_requests(self):
        site_array = [
            "http://www.tripadvisor.in/Hotel_Review-g3581633-d2290190-Reviews-Corbett_Treetop_Riverview-Marchula_Jim_Corbett_National_Park_Uttarakhand.html",
            "http://www.tripadvisor.in/Hotel_Review-g297600-d8029162-Reviews-Daman_Casa_Tesoro-Daman_Daman_and_Diu.html",
            "http://www.tripadvisor.in/Hotel_Review-g304557-d2519662-Reviews-Darjeeling_Khushalaya_Sterling_Holidays_Resort-Darjeeling_West_Bengal.html",
            "http://www.tripadvisor.in/Hotel_Review-g319724-d3795261-Reviews-Dharamshala_The_Sanctuary_A_Sterling_Holidays_Resort-Dharamsala_Himachal_Pradesh.html",
            "http://www.tripadvisor.in/Hotel_Review-g1544623-d8029274-Reviews-Dindi_By_The_Godavari-Nalgonda_Andhra_Pradesh.html",
        ]
        for i in range(len(site_array)):
            response = Response(url=site_array[i])
            sites = Selector(response).xpath('//a[contains(text(), "Next")]/@href').extract()
            # sites = response.selector.xpath('//a[contains(text(), "Next")]/@href').extract()
            for site in sites:
                yield Request(site_array[i], self.parse)
Solution
In this case, the line where your error occurs expects a TextResponse object, not a plain Response. Try creating a TextResponse instead of the plain Response to resolve the error. The missing method, body_as_unicode, is documented under TextResponse in the Scrapy documentation.
More specifically, use an HtmlResponse, because your response will be HTML rather than plain text. HtmlResponse is a subclass of TextResponse, so it inherits the missing method.
One more thing: where do you set the body of your Response? In the example in your question you set only the URL and no body. Without a body, your XPath query has nothing to match, which is why it returns nothing.
Answered By - GHajba