Is it possible to merge two array nodes into one? I'm trying to get everything under the class container-ingredient_group, but each div block comes out as a separate block. When extracting the data, it doesn't match (see screenshot). Maybe there is a better way to pull data out of a web page?
for receptes_link in range(len(productlinks)):
    driver.get("https://receptes.tvnet.lv/recepte/23989-skabenu-darza-zalumu-zupa-ar-kupinatu-vistu-un-perlu-grubam")
    sastavdala = driver.find_elements(By.XPATH, '//div[@class="container-ingredient_group"]')
    for sastavdala_link in sastavdala:
        sastavdala_final.append(sastavdala_link.text)
HTML:
<div data-v-7b1d4f90="" class="container-ingredient_group">
<span data-v-7b1d4f90="" class="ingredient_group__title">Pasniegšanai:</span>
<ul data-v-7b1d4f90="" class="ingredient_group">
<div data-v-7b1d4f90="" class="container-ingredient">
<li data-v-7b1d4f90="" class="recipe--ingredients__ingredient">
<span data-v-7b1d4f90="" itemprop="recipeIngredient" class="ingredient__unit">
Vārītas olas
</span>
</li>
</div>
<div data-v-7b1d4f90="" class="container-ingredient">
<li data-v-7b1d4f90="" class="recipe--ingredients__ingredient">
<span data-v-7b1d4f90="" itemprop="recipeIngredient" class="ingredient__unit">
Zaļumi
</span>
</li>
</div>
</ul>
</div>
<div data-v-7b1d4f90="" class="container-ingredient_group">
<span data-v-7b1d4f90="" class="ingredient_group__title">Buljonam:</span>
<ul data-v-7b1d4f90="" class="ingredient_group">
<div data-v-7b1d4f90="" class="container-ingredient">
<li data-v-7b1d4f90="" class="recipe--ingredients__ingredient">
<span data-v-7b1d4f90="" itemprop="recipeIngredient" class="ingredient__unit">
3 kūpinātas vistas stilbiņi
</span>
</li>
</div>
<div data-v-7b1d4f90="" class="container-ingredient">
<li data-v-7b1d4f90="" class="recipe--ingredients__ingredient">
<span data-v-7b1d4f90="" itemprop="recipeIngredient" class="ingredient__unit">
1/3 selerijas sakne
</span>
</li>
</div>
<div data-v-7b1d4f90="" class="container-ingredient">
<li data-v-7b1d4f90="" class="recipe--ingredients__ingredient">
<span data-v-7b1d4f90="" itemprop="recipeIngredient" class="ingredient__unit">
1 sīpols
</span>
</li>
</div>
<div data-v-7b1d4f90="" class="container-ingredient">
<li data-v-7b1d4f90="" class="recipe--ingredients__ingredient">
<span data-v-7b1d4f90="" itemprop="recipeIngredient" class="ingredient__unit">
1 burkāns
</span>
</li>
</div>
<div data-v-7b1d4f90="" class="container-ingredient">
<li data-v-7b1d4f90="" class="recipe--ingredients__ingredient">
<span data-v-7b1d4f90="" itemprop="recipeIngredient" class="ingredient__unit">
4 daiviņa/s ķiploka
</span>
</li>
</div>
<div data-v-7b1d4f90="" class="container-ingredient">
<li data-v-7b1d4f90="" class="recipe--ingredients__ingredient">
<span data-v-7b1d4f90="" itemprop="recipeIngredient" class="ingredient__unit">
2 lauru lapas
</span>
</li>
</div>
<div data-v-7b1d4f90="" class="container-ingredient">
<li data-v-7b1d4f90="" class="recipe--ingredients__ingredient">
<span data-v-7b1d4f90="" itemprop="recipeIngredient" class="ingredient__unit">
5-6 smaržīgie pipari
</span>
</li>
</div>
<div data-v-7b1d4f90="" class="container-ingredient">
<li data-v-7b1d4f90="" class="recipe--ingredients__ingredient">
<span data-v-7b1d4f90="" itemprop="recipeIngredient" class="ingredient__unit">
Pētersīļu un diļļu kāti
</span>
</li>
</div>
</ul>
</div>
<div data-v-7b1d4f90="" class="container-ingredient_group">
<span data-v-7b1d4f90="" class="ingredient_group__title">Zupai:</span>
<ul data-v-7b1d4f90="" class="ingredient_group">
<div data-v-7b1d4f90="" class="container-ingredient">
<li data-v-7b1d4f90="" class="recipe--ingredients__ingredient">
<span data-v-7b1d4f90="" itemprop="recipeIngredient" class="ingredient__unit">
500 ml konservētu skābeņu
</span>
</li>
</div>
<div data-v-7b1d4f90="" class="container-ingredient">
<li data-v-7b1d4f90="" class="recipe--ingredients__ingredient">
<span data-v-7b1d4f90="" itemprop="recipeIngredient" class="ingredient__unit">
1 liels saišķis dārza zaļumu (gārsa, lakši, estragona dzinumi, pienenes)
</span>
</li>
</div>
<div data-v-7b1d4f90="" class="container-ingredient">
<li data-v-7b1d4f90="" class="recipe--ingredients__ingredient">
<span data-v-7b1d4f90="" itemprop="recipeIngredient" class="ingredient__unit">
2/3 paciņas pērļu grūbu
</span>
</li>
</div>
<div data-v-7b1d4f90="" class="container-ingredient">
<li data-v-7b1d4f90="" class="recipe--ingredients__ingredient">
<span data-v-7b1d4f90="" itemprop="recipeIngredient" class="ingredient__unit">
8 kartupeļi
</span>
</li>
</div>
<div data-v-7b1d4f90="" class="container-ingredient">
<li data-v-7b1d4f90="" class="recipe--ingredients__ingredient">
<span data-v-7b1d4f90="" itemprop="recipeIngredient" class="ingredient__unit">
1 ēd. k. cukura
</span>
</li>
</div>
<div data-v-7b1d4f90="" class="container-ingredient">
<li data-v-7b1d4f90="" class="recipe--ingredients__ingredient">
<span data-v-7b1d4f90="" itemprop="recipeIngredient" class="ingredient__unit">
sāls
</span>
</li>
</div>
<div data-v-7b1d4f90="" class="container-ingredient">
<li data-v-7b1d4f90="" class="recipe--ingredients__ingredient">
<span data-v-7b1d4f90="" itemprop="recipeIngredient" class="ingredient__unit">
Malti melnie pipari
</span>
</li>
</div>
<div data-v-7b1d4f90="" class="container-ingredient">
<li data-v-7b1d4f90="" class="recipe--ingredients__ingredient">
<span data-v-7b1d4f90="" itemprop="recipeIngredient" class="ingredient__unit">
3 daiviņa/s ķiploka
</span>
</li>
</div>
</ul>
</div>
If you have a list of strings, simply join them into one string:
one_string = "\n".join(sastavdala_final)
Or, if sastavdala_final already holds other items, first build a list with the new values, then join them into one string, and finally append that string to the final list:
temporary = []
for sastavdala_link in sastavdala:
    temporary.append(sastavdala_link.text)
one_string = "\n".join(temporary)
sastavdala_final.append(one_string)
Shorter:
temporary = [sastavdala_link.text for sastavdala_link in sastavdala]
one_string = "\n".join(temporary)
sastavdala_final.append(one_string)
Or even, in a single expression:
one_string = "\n".join([sastavdala_link.text for sastavdala_link in sastavdala])
sastavdala_final.append(one_string)
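The variants above can be checked without opening a browser. This is only a minimal sketch: `FakeElement` and the shortened sample texts are hypothetical stand-ins, and the only thing they share with the elements returned by `find_elements` is the `.text` attribute.

```python
from dataclasses import dataclass


@dataclass
class FakeElement:
    """Hypothetical stand-in for a Selenium element; only `.text` is modelled."""
    text: str


# Sample texts shaped like the ingredient groups in the question (shortened).
sastavdala = [
    FakeElement("Pasniegšanai:\nVārītas olas\nZaļumi"),
    FakeElement("Buljonam:\n1 sīpols\n1 burkāns"),
]

# Variant 1: explicit loop that collects the texts, then joins them.
temporary = []
for sastavdala_link in sastavdala:
    temporary.append(sastavdala_link.text)
joined_loop = "\n".join(temporary)

# Variant 2: list comprehension.
joined_comprehension = "\n".join([el.text for el in sastavdala])

# Variant 3: generator expression (no intermediate list).
joined_generator = "\n".join(el.text for el in sastavdala)

# All groups end up as ONE item in the final list.
sastavdala_final = [joined_loop]
print(sastavdala_final[0])
```

All three variants build the same string, so the choice between them is mostly a matter of readability.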
Another idea: get the parent <div class="container"> and read .text from it. You should then get everything as one string.
Because the page has many container elements, it may need a nested XPath to select only the container that has a container-ingredient_group directly inside it.
I use find_element (without the s at the end) to get only one element.
xpath = '//div[@class="container" and div[@class="container-ingredient_group"]]'
sastavdala = driver.find_element(By.XPATH, xpath) # without `s` in word
one_string = sastavdala.text
sastavdala_final.append(one_string)
But this also captures the header: Sastāvdaļas
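If the header is not wanted, one possible workaround is to drop the first line of the text when it matches the header. This is just a sketch: `drop_header` is a hypothetical helper, not part of Selenium, and it assumes the header always arrives as the first line of `.text`.

```python
def drop_header(text: str, header: str = "Sastāvdaļas") -> str:
    """Hypothetical helper: remove the first line if it equals `header`."""
    first, _sep, rest = text.partition("\n")
    if first.strip() == header:
        return rest
    # Header not found on the first line: return the text unchanged.
    return text


# Shortened sample shaped like the parent container's text.
sample = "Sastāvdaļas\nBuljonam:\n1 sīpols\n1 burkāns"
print(drop_header(sample))
```

With the parent-container version this would be used as `one_string = drop_header(sastavdala.text)`.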
Full working code with both versions:
from selenium import webdriver
from selenium.webdriver.common.by import By
import time
# ---
import selenium
print('Selenium:', selenium.__version__)
# ---
url = "https://receptes.tvnet.lv/recepte/23989-skabenu-darza-zalumu-zupa-ar-kupinatu-vistu-un-perlu-grubam"
#driver = webdriver.Chrome() # options=options) # the newest Selenium will automatically download driver - so it doesn't need `service=`
driver = webdriver.Firefox() # options=options) # the newest Selenium will automatically download driver - so it doesn't need `service=`
driver.get(url)
# ---
time.sleep(5)
# --- version 1 ---
sastavdala_finali = []
sastavdala = driver.find_elements(By.XPATH, '//div[@class="container-ingredient_group"]')
temporary = []
for item in sastavdala:
    temporary.append(item.text)
one_string = "\n".join(temporary)
sastavdala_finali.append(one_string)
# --- version 2 ---
sastavdala = driver.find_element(By.XPATH, '//div[@class="container" and div[@class="container-ingredient_group"]]')
one_string = sastavdala.text
sastavdala_finali.append(one_string)
# --- all results ---
for index, item in enumerate(sastavdala_finali, 1):
    print(f'--- version {index} ---')
    print(item)
# ---
driver.close()
Result:
Selenium: 4.31.0
--- version 1 ---
Buljonam:
3 kūpinātas vistas stilbiņi
1/3 selerijas sakne
1 sīpols
1 burkāns
4 daiviņa/s ķiploka
2 lauru lapas
5-6 smaržīgie pipari
Pētersīļu un diļļu kāti
Zupai:
500 ml konservētu skābeņu
1 liels saišķis dārza zaļumu (gārsa, lakši, estragona dzinumi, pienenes)
2/3 paciņas pērļu grūbu
8 kartupeļi
1 ēd. k. cukura
sāls
Malti melnie pipari
3 daiviņa/s ķiploka
Pasniegšanai:
Vārītas olas
Zaļumi
--- version 2 ---
Sastāvdaļas
Buljonam:
3 kūpinātas vistas stilbiņi
1/3 selerijas sakne
1 sīpols
1 burkāns
4 daiviņa/s ķiploka
2 lauru lapas
5-6 smaržīgie pipari
Pētersīļu un diļļu kāti
Zupai:
500 ml konservētu skābeņu
1 liels saišķis dārza zaļumu (gārsa, lakši, estragona dzinumi, pienenes)
2/3 paciņas pērļu grūbu
8 kartupeļi
1 ēd. k. cukura
sāls
Malti melnie pipari
3 daiviņa/s ķiploka
Pasniegšanai:
Vārītas olas
Zaļumi