pythonbeautifulsoupcraigslist

How can I get the Price, Title, and the Link from this element tag?


This is the exported element tags of the searched item in Craigslist.

I'm using BeautifulSoup and Python.

How can I get the 3 items hidden below? 1.) 2 BED/1 BATH CONDO UNIT WITH FANTASTIC LAYOUT 2.) https://vancouver.craigslist.ca/rch/apa/d/2-bed-1-bath-condo-unit-with/6682563732.html 3.) $1400

<li class="result-row" data-pid="6682563732" data-repost-of="6672289062">
<a class="result-image gallery" data-ids="1:00O0O_9WrVUmeuy5e,1:01010_8pEn0AcGEYo,1:00707_77eVL7Ade68,1:00e0e_dxzhja1yrDa,1:00k0k_2F9337g40vD,1:00O0O_eaVv31Gd4yw,1:00Z0Z_jccPvNndfg7,1:00505_eicoHPPOUcN,1:00U0U_7ligL02j3Mr,1:00u0u_3RnaxaZyl81,1:00S0S_ld5VSNTlzAJ,1:00U0U_dU2swTovLEJ,1:00O0O_d7PeITBKlmL,1:00x0x_3kcVk30PK30,1:00i0i_xhm6pORJuQ,1:00m0m_kIMUZ1PgCHb,1:00D0D_aTGfSGlJ1Ru,1:00v0v_8M4NXLErFqM,1:00c0c_jiTQuztqh9J,1:00Z0Z_3rlFLU7MCbq" href="https://vancouver.craigslist.ca/rch/apa/d/2-bed-1-bath-condo-unit-with/6682563732.html">
<span class="result-price">$1400</span>
</a>
<p class="result-info">
<span class="icon icon-star" role="button">
<span class="screen-reader-text">favorite this post</span>
</span>
<time class="result-date" datetime="2018-08-27 16:54" title="Mon 27 Aug 04:54:01 PM">Aug 27</time>
<a class="result-title hdrlnk" data-id="6682563732" href="https://vancouver.craigslist.ca/rch/apa/d/2-bed-1-bath-condo-unit-with/6682563732.html">2 BED/1 BATH CONDO UNIT WITH FANTASTIC LAYOUT</a>
<span class="result-meta">
<span class="result-price">$1400</span>
<span class="housing">
                    2br -
                    1430ft<sup>2</sup> -
                </span>
<span class="result-hood"> (RICHMOND)</span>
<span class="result-tags">
                    pic
                    <span class="maptag" data-pid="6682563732">map</span>
</span>
<span class="banish icon icon-trash" role="button">
<span class="screen-reader-text">hide this posting</span>
</span>
<span aria-hidden="true" class="unbanish icon icon-trash red" role="button"></span>
<a class="restore-link" href="#">
<span class="restore-narrow-text">restore</span>
<span class="restore-wide-text">restore this posting</span>
</a>
</span>
</p>
</li>


Solution

  • You can find the elements with the desired class names:

    import re
    from bs4 import BeautifulSoup as soup
    d = soup(content, 'html.parser')
    [titles] = [[i.text, i['href']] for i in d.find_all('a', {'class':'result-title'})]
    _, price= [i.text for i in d.find_all('span', {'class':'result-price'})]
    print([*titles, price])
    

    Output:

    ['2 BED/1 BATH CONDO UNIT WITH FANTASTIC LAYOUT', 'https://vancouver.craigslist.ca/rch/apa/d/2-bed-1-bath-condo-unit-with/6682563732.html', '$1400']