python split

Python: using split() to split a string at 2 separate points


I have a string that I need to split at two separate points, but all I can find is how to split a string on delimiters like "," and other punctuation.

import re

string = '<p>The brown dog jumped over the... <a href="https://google.com" target="something">... but then splashed in the water<p>'

hyperlink = re.split(r'(?=https)', string)

print(hyperlink[0])

In the example above, I need to extract just the URL "https://google.com" from the string and print it. However, I can only find out how to split the string at "https", so everything after the URL comes with it.

I hope this makes sense. After a bunch of searching and testing I can't figure out how to do this.


Solution

  • There are many ways to do this, but a simple one is to use find() and then slice. find() returns the starting index of a substring within a string, and you can use that index as a slice boundary. For example:

    string = '<p>The brown dog jumped over the... <a href="https://google.com" target="something">... but then splashed in the water<p>'
    
    # Find where the URL starts
    start_word = "https"
    start_index = string.find(start_word)
    
    # For URLs, we need to find where it ends - usually at a quote mark
    end_index = string.find('"', start_index)
    
    # Extract just the URL
    result = string[start_index:end_index]
    
    print(result)
    

    Output:

    https://google.com
    

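    The find() method returns the index where the substring begins, or -1 if the substring is not found. Passing start_index as the second argument makes the search for the closing quote start at the URL rather than at the beginning of the string. Slicing between the two positions then extracts just the URL, without the surrounding quotes.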
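
Because find() returns -1 when nothing matches, slicing without a check can silently give the wrong result. Below is a minimal sketch of the same idea wrapped in a function with that check, plus a variant that uses two split() calls, since the question asks about split(). The function names extract_between and extract_with_split are made up for this example, and both assume the URL sits inside double quotes as in the sample string.

```python
def extract_between(text, start_marker, end_marker):
    """Return text from start_marker up to (not including) end_marker.

    Returns None if either marker is missing.
    """
    start_index = text.find(start_marker)
    if start_index == -1:
        return None
    # Look for the end marker only from the start of the match onwards.
    end_index = text.find(end_marker, start_index)
    if end_index == -1:
        return None
    return text[start_index:end_index]


def extract_with_split(text, before, after):
    """Same idea with two split() calls: cut after `before`, then cut at `after`.

    Returns None if `before` is missing. If `after` is missing, everything
    following `before` is returned.
    """
    parts = text.split(before, 1)
    if len(parts) < 2:
        return None
    return parts[1].split(after, 1)[0]


string = '<p>The brown dog jumped over the... <a href="https://google.com" target="something">... but then splashed in the water<p>'

print(extract_between(string, "https", '"'))      # https://google.com
print(extract_with_split(string, 'href="', '"'))  # https://google.com
```

The split() variant cuts on href=" instead of https, so it would also pick up an http:// link. For anything beyond a one-off string like this, a real HTML parser is likely the safer choice.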