Issue
I have written the following code:
from scrapy import Selector
html = '''
<html><head></head><body><table>
<tr> <td>a1</td> <td>b1</td> </tr>
<tr> <td>a2</td> <td>b2</td> </tr>
</table></body></html>
'''
selector = Selector(text=html)
temp = selector.xpath("//td").extract()
print(temp)
I expected the following result:
[
'<td>a1</td>',
'<td>b1</td>',
'<td>a2</td>',
'<td>b2</td>'
]
But I actually got this:
[
'<td>a1</td> <td>b1</td> </tr>\n<tr> <td>a2</td> <td>b2</td> </tr>\n</table>\n</body>\n</html>\n',
'<td>b1</td> </tr>\n<tr> <td>a2</td> <td>b2</td> </tr>\n</table>\n</body>\n</html>\n',
'<td>a2</td> <td>b2</td> </tr>\n</table>\n</body>\n</html>\n',
'<td>b2</td> </tr>\n</table>\n</body>\n</html>\n'
]
However, with '/text()' appended to the XPath:
temp = selector.xpath("//td/text()").extract()
the output is correct:
['a1', 'b1', 'a2', 'b2']
This is probably a simple question, but I can't find the cause.
I tried 'extract', 'extract_first', 'get', and 'getall'; they all have the same problem.
I don't know what's wrong. Please help.
Solution
After I uninstalled Anaconda and installed a plain Python distribution, the problem went away. That's strange.
Answered By - watfe
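Since reinstalling Python fixed the output without any change to the code, the difference was presumably in the installed packages rather than in the XPath. The answer does not say which package was responsible, so the sketch below only records versions for comparison between a failing and a working environment. It uses the standard library alone; the helper name installed_versions and the package list (scrapy, parsel, lxml) are assumptions chosen for illustration, not something the original answer checked.

```python
from importlib import metadata


def installed_versions(packages):
    """Return a dict mapping each package name to its version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Package is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Hypothetical list: packages that a Scrapy Selector typically relies on.
    for name, version in installed_versions(["scrapy", "parsel", "lxml"]).items():
        print(f"{name}: {version or 'not installed'}")
```

Running this in both environments and comparing the printed versions is one way to narrow down which package differs.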