New to Python/pandas.
I have a column called "URL" and I am trying to strip "http://", "https://", and "www." from each value, keeping only what comes after them.
For example,
http://www.jhu.edu
http://www.brown.edu
http://https://www.amherst.edu
http://www.usc.edu
Should look like:
jhu.edu
brown.edu
amherst.edu
usc.edu
# Example data; 'colA' stands in for the "URL" column
import pandas as pd
data = {'colA': ['http://www.jhu.edu', 'http://www.brown.edu', 'http://https://www.amherst.edu', 'http://www.usc.edu']}
df = pd.DataFrame(data)
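Use str.replace with a regular expression that matches "http://", "https://", or "www." and replaces every match with an empty string: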
out = df['colA'].str.replace(r'https?://|www\.', '', regex=True)
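The pattern above removes every occurrence of those substrings, including any that appear later in the URL. If that is not wanted, one possible variant is to anchor the pattern at the start of the string. The sketch below assumes the prefixes only need stripping from the front; the third URL is a made-up example, not from the question, added to show the difference.

```python
import pandas as pd

# Sample data from the question, plus one hypothetical URL (not in the
# original post) that has "www." inside its path.
df = pd.DataFrame({'colA': [
    'http://www.jhu.edu',
    'http://https://www.amherst.edu',
    'https://www.example.org/www.archive',
]})

# ^ anchors the match at the start of the string; (?:...)+ consumes one
# or more stacked prefixes such as "http://https://www.".
anchored = df['colA'].str.replace(r'^(?:https?://|www\.)+', '', regex=True)

# Unanchored pattern from the answer above, for comparison.
unanchored = df['colA'].str.replace(r'https?://|www\.', '', regex=True)

print(anchored.tolist())
print(unanchored.tolist())
```

On the question's own data both patterns give the same result; they differ only when a prefix-like substring occurs after the start of the URL. To update the DataFrame in place, assign the result back to the column, e.g. df['colA'] = anchored.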