python-3.x web-scraping beautifulsoup split numbered-list

Split function doesn't work for string and for list


I'm working on one of my first web scraping projects. I can already extract the elements I want, but I cannot find a way to print them as a numbered list. The code I have so far:

import requests
from bs4 import BeautifulSoup as soup

r = requests.get('https://mmazurek.dev/category/programowanie-2/page/3/', proxies={'http':'82.119.170.106'})

page = soup(r.content, "html.parser")

contents=page.findAll(None, class_="post-title-link")

for content in contents:
    text_content=list(content.get_text())
    first_letter=str(text_content[0])
    x="".join(first_letter)  

    listToStr = "".join(map(str, text_content))

    print(listToStr)

The goal is to have the list printed like this:

  1. P....
  2. J...
  3. ...

Hope you don't mind that it's Polish text ;)


Solution
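
For context, the loop in the question calls `list()` on the title string, which breaks it into single characters, and then joins those characters back together, so each title is printed unchanged and without a number. A minimal sketch of that behavior, using a made-up title in place of the scraped text:

```python
# Hypothetical title standing in for content.get_text()
title = "Python i listy"

chars = list(title)          # list() on a string yields its individual characters
rejoined = "".join(chars)    # joining them restores the original string
words = title.split()        # str.split() splits on whitespace instead

print(chars[:3])
print(rejoined == title)
print(words)
```

So neither `list()` nor `split()` is needed here; the title text can be used directly.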

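  • import requests
    from bs4 import BeautifulSoup as bs  # the 'lxml' parser used below needs the lxml package


    def get_html(url, useragent=None, proxy=None):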
        session = requests.Session()
        request = session.get(url=url, headers=useragent, proxies=proxy)
        if request.status_code == 200:
            soup = bs(request.text, 'lxml')
            return soup
        else:
            print("Error " + str(request.status_code))
            return request.status_code
    
    
    def parse(soup):
        data = []
        contents = soup.findAll(None, class_="post-title-link")
        for i, content in enumerate(contents, start=1):  # number entries from 1
            text = content.text
            href = content['href']
            data.append([
                i,
                text,
                href,
            ])
    
        return data
    
    
    data = parse(get_html('https://mmazurek.dev/category/programowanie-2/page/3/', proxy={'http': '82.119.170.106'}))
    
    print(data)
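
To get the numbered list the question asks for, one option is to loop over the rows with `enumerate(..., start=1)` and format each line. The sketch below is only an illustration and not part of the original answer: `format_numbered` is a hypothetical helper, and the sample rows are made up to mirror the `[index, text, href]` shape that `parse` returns.

```python
def format_numbered(rows):
    """Return one '<n>. <title>' line per row, numbering from 1.

    Each row is assumed to look like [index, text, href], as built by parse().
    """
    lines = []
    # The stored index is ignored; enumerate supplies the display number.
    for number, (_, text, _href) in enumerate(rows, start=1):
        lines.append(f"{number}. {text.strip()}")
    return lines


# Made-up sample data; real rows would come from parse(get_html(...))
sample = [
    [0, "Pierwszy wpis", "https://example.com/a"],
    [1, " Drugi wpis ", "https://example.com/b"],
]

for line in format_numbered(sample):
    print(line)
```

With the real data, `for line in format_numbered(data): print(line)` would replace `print(data)`.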