[SOLVED] Failed to extract all the image links linked to the floorplans using the requests module

I'm trying to scrape the image links for the floor plans shown in the middle of the webpage using the requests module. The links are present in the page source, but they are scattered throughout it, and I can't collect all of them, even with regex. There are 11 images in total.

import re
import json
import requests

link = 'https://www.livabl.com/abbotsford-bc/jem1'

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36',
    'Referer': 'https://www.livabl.com/',
}

def get_floor_plan_images(link,headers):
    res = requests.get(link,headers=headers)
    print(res.status_code)
    match = re.search(r"\{\\\"images\\\":(.*?]),",res.text)
    if match:
        image_links = match.group(1)
        return image_links

images = get_floor_plan_images(link,headers)
print(images)

How can I extract all the image links connected to the floorplans using the requests module?

Solution

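I think this is what you need. re.search stops at the first match, so your function only ever returned the first images array. re.finditer yields every match in the page source: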

import re
import requests

link = 'https://www.livabl.com/abbotsford-bc/jem1'

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36',
    'Referer': 'https://www.livabl.com/',
}

def get_floor_plan_images(link, headers):
    res = requests.get(link, headers=headers)
    print(res.status_code)
    # finditer returns an iterator over every match, not just the first one
    return re.finditer(r"\{\\\"images\\\":(.*?]),", res.text)


for img in get_floor_plan_images(link, headers):
    # group(1) is the captured images array for one match
    print(img.group(1))
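
Each printed group is still a raw chunk of text rather than a list of links. If you want the individual URLs, something along these lines might work. It is only a sketch: it assumes each captured group is a JSON array whose quotes are backslash-escaped (as the pattern above suggests), and the helper name extract_image_urls and the sample text are made up for illustration, not taken from the real page.

```python
import json
import re

# Same pattern as in the answer above: captures the array after {\"images\":
IMAGES_PATTERN = re.compile(r"\{\\\"images\\\":(.*?]),")

def extract_image_urls(page_text):
    """Return a flat, de-duplicated list of URLs from every escaped images array."""
    urls = []
    for match in IMAGES_PATTERN.finditer(page_text):
        # Assumption: the array is JSON with backslash-escaped quotes,
        # so unescaping the quotes yields valid JSON.
        raw = match.group(1).replace('\\"', '"')
        try:
            items = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip anything that does not decode cleanly
        for item in items:
            if isinstance(item, str) and item not in urls:
                urls.append(item)
    return urls

# Hypothetical sample text (not copied from the real page)
sample = (
    r'{\"images\":[\"https://example.com/a.jpg\",\"https://example.com/b.jpg\"],\"name\":\"Plan A\"}'
    r'{\"images\":[\"https://example.com/c.jpg\"],\"name\":\"Plan B\"}'
)
print(extract_image_urls(sample))
```

With the real page you would pass res.text instead of sample. If the arrays turn out to hold objects rather than plain strings, or the URLs carry extra escaping, the decoding step would need adjusting.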