I am parsing a website that lets you narrow the results with filters.
For instance, when filtering by building, the link ends with a number:
https://etalongroup.ru/choose/?group=false&object%5B%5D=16
Here 16 identifies a particular building with flats, and other buildings work the same way. Grabbing all 900+ flats works fine at this point.
BUT when I try to filter by other parameters (e.g. "has a balcony"), where the link looks like
https://etalongroup.ru/choose/?group=false&option%5B%5D=23
it always fails (no flats come back). This happens not only for the balcony filter but for any filter other than choosing a building.
Here is the code that successfully grabs the apartments in a building (16):
import requests
import pandas as pd

base_link_url = "https://etalongroup.ru"  # Base URL for apartment links


def fetch_data(action: str, object_id: str, offset: int, limit: int) -> dict:
    url = f"https://etalongroup.ru/bitrix/services/main/ajax.php?action={action}"
    payload = {
        'filter[object]': object_id,  # Use the passed object identifier
        'haveItem': 'true',
        'limit': limit,
        'offset': offset
    }
    response = requests.post(url, data=payload)
    return response.json()


def get_data(action: str, object_id: str, max_offset: int, limit: int = 50) -> list:
    result = []
    offset = 0
    while offset < max_offset:
        response_json = fetch_data(action, object_id, offset, limit)
        if 'data' in response_json and response_json['data'][0]['itemList']:
            for item in response_json['data'][0]['itemList']:
                # Extract additional data from each item
                flat_info = {
                    'id': item.get('id'),
                    'img': base_link_url + item.get('img'),
                    'price': item.get('price'),
                    'priceTotal': item.get('priceTotal'),
                    'area': item.get('area'),
                    'floor': item.get('floor'),
                    'title': item.get('title'),
                    'link': base_link_url + item.get('link'),  # Full link
                    'deliveryName': item.get('deliveryName'),
                    'isBooked': item.get('isBooked'),
                    'object_id': object_id  # Add object_id field
                }
                result.append(flat_info)
            offset += limit
        else:
            break
    return result


# Collect data only for the object with ID 16
all_data = []
object_id = 16
print(f"Fetching data for object_id: {object_id}")
data = get_data(action='etalongroup:filter.FlatFilter.getFlatList', object_id=str(object_id), max_offset=700)  # Adjust max_offset as needed
if data:  # Check if data is available
    all_data.extend(data)

# Create a DataFrame from all collected data
df_all = pd.DataFrame(all_data)

# Count the number of rows in the DataFrame
num_rows = len(df_all)
print(f"Total number of objects: {num_rows}")

# Print the DataFrame
print(df_all.to_string(index=False))

# Export the DataFrame to a CSV file
# df_all.to_csv('exported_data.csv', index=False)
However, if I try to grab all the flats with a balcony (object_id = 23), I get:
Fetching data for object_id: 23
Total number of objects: 0
Empty DataFrame
Columns: []
Index: []
Again, all the other numbers (the ones that really are building filters) work: 15, 16, 18, etc.
The question is: how can I implement filtering by other parameters?
Your code assumes object_id always maps to 'filter[object]'. That works for buildings but not for options: the API does not accept filter[object]=23 as a feature filter. It treats 23 as a building ID, like the other building filters, whereas features are expected as an array of options.
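You can see the difference by decoding the two links from the question. A minimal sketch using only the standard library (the helper name query_params is just for illustration):

```python
from urllib.parse import urlsplit, parse_qs


def query_params(url: str) -> dict:
    """Return the decoded query-string parameters of a URL."""
    return parse_qs(urlsplit(url).query)


building_url = "https://etalongroup.ru/choose/?group=false&object%5B%5D=16"
balcony_url = "https://etalongroup.ru/choose/?group=false&option%5B%5D=23"

building_params = query_params(building_url)
balcony_params = query_params(balcony_url)

# %5B%5D decodes to [], so the two filters use different parameter names:
# 'object[]' for a building and 'option[]' for a feature such as a balcony.
print(building_params)
print(balcony_params)
```

The building ID travels under object[] while the balcony ID travels under option[], so they are two separate filters and not two values of the same one.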
Your payload for options should follow a pattern more like filter[option][]=23.
Options are nested in their own object within the filter: buildings go under the object key, and the balcony ID goes under the option key.
Make sense?
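Here is a rough sketch of how the payload could be built so that it handles both buildings and options. The key 'filter[option][]' is my assumption based on the link format; I have not confirmed it against the API, so check the real request in your browser's network tab. The helper name build_payload is hypothetical too.

```python
from urllib.parse import urlencode


def build_payload(offset, limit, object_id=None, option_ids=()):
    """Build the form fields as (key, value) pairs so a key can repeat."""
    pairs = [
        ("haveItem", "true"),
        ("limit", str(limit)),
        ("offset", str(offset)),
    ]
    if object_id is not None:
        # Same key the working building request already uses.
        pairs.append(("filter[object]", str(object_id)))
    for option_id in option_ids:
        # ASSUMPTION: the key name 'filter[option][]' is a guess, not verified.
        pairs.append(("filter[option][]", str(option_id)))
    return pairs


# Balcony only (option 23), first page of 50 flats.
pairs = build_payload(offset=0, limit=50, option_ids=[23])

# Show the form-encoded body that would be sent.
encoded = urlencode(pairs)
print(encoded)
```

In fetch_data you would then pass the list directly, e.g. requests.post(url, data=pairs). requests accepts a list of tuples for data, which is what lets the same key appear more than once when you select several options.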