I am trying to download all the images from the product gallery with the script below, but only the main image is downloaded. The main image has an id, so I can select it. The other gallery images have no id, and I have not been able to download them.
import scrapy
from scrapy.linkextractors import LinkExtractor
from scrapy.spiders import CrawlSpider, Rule


class BasicSpider(CrawlSpider):
    name = 'basic'
    allowed_domains = ['www.leebmann24.de']
    start_urls = ['https://www.leebmann24.de/bmw.html']

    rules = (
        Rule(LinkExtractor(restrict_xpaths="//div[@class='category-products']/ul/li/h2/a"), callback='parse_item'),
        Rule(LinkExtractor(restrict_xpaths="//li[@class='next']/a"), callback='parse_item', follow=True),
    )

    def parse_item(self, response):
        yield {
            'URL': response.url,
            'Price': response.xpath("normalize-space(//span[@class='price']/text())").get(),
            'image_urls': response.xpath("//div[@class='item']/a/img/@src").getall()
        }
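
To make the difference between the two selectors concrete, here is a minimal sketch that runs both XPaths against a small page. The markup, the example product URL, and the helper name extract_image_urls are all assumptions for illustration, since the real page structure is not shown here. It uses the standard library's ElementTree as a stand-in for Scrapy's selectors so that it runs without a crawl.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# Hypothetical, simplified product-page markup (an assumption, not the real page).
SAMPLE_HTML = """
<html><body>
  <img id="image-main" src="/media/main.jpg" />
  <div class="item"><a href="#"><img src="/media/gallery-1.jpg" /></a></div>
  <div class="item"><a href="#"><img src="/media/gallery-2.jpg" /></a></div>
</body></html>
"""

# Hypothetical product URL, used only to resolve relative src values.
PAGE_URL = "https://www.leebmann24.de/example-product.html"


def extract_image_urls(html, page_url):
    """Collect main and gallery image URLs as absolute URLs, without duplicates."""
    root = ET.fromstring(html)
    # Main image: selected through its id attribute.
    main = [img.get("src") for img in root.findall(".//*[@id='image-main']")]
    # Gallery images: no id, so select them by their place under div.item.
    gallery = [img.get("src") for img in root.findall(".//div[@class='item']/a/img")]

    urls = []
    for src in main + gallery:
        if not src:
            continue  # skip <img> tags without a src attribute
        absolute = urljoin(page_url, src)  # image pipelines need absolute URLs
        if absolute not in urls:
            urls.append(absolute)
    return urls


if __name__ == "__main__":
    for url in extract_image_urls(SAMPLE_HTML, PAGE_URL):
        print(url)
```

In the spider itself, the equivalent would be to combine response.xpath('//*[@id="image-main"]/@src').getall() with the gallery XPath and pass each value through response.urljoin() before yielding 'image_urls'. If the gallery XPath still returns an empty list, the gallery may be built by JavaScript or use different markup than assumed here, so it is worth checking the raw HTML with scrapy shell.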
@Raisul Islam,
The XPath '//*[@id="image-main"]/@src' returns the image URL, and I'm not getting any errors. Please check whether this output is what you expected. Output: