I am trying to build a URL cleaner.
I want to take a list of URLs and remove https://, http://, www., etc. from the beginning, as well as everything from the first / after the domain onward.
I have tried the following regex: url.replace(/^https?\:\/\/www\./i, "").split('/')[0];
This works to a certain extent and outputs the following:
"www.net-temps.com"
"www.toplanguagejobs.com"
"http:"
"peopleready.com"
"nationjob.com"
"http:"
"bluesteps.com"
"https:"
"theguardian.com"
"reddit.com"
"youtube.com"
"https:"
"pgatour.com"
"cultofmac.com"
from the following list:
'www.net-temps.com',
'www.toplanguagejobs.com',
'http://nychires.com/',
'http://www.peopleready.com/',
'https://www.nationjob.com/',
'http://nationaljobsonline.com/',
'https://www.bluesteps.com/',
'https://medium.freecodecamp.com/how-we-got-our-2-year-old-open-source-project-to-trend-on-github-8c25b0a6dfe9#.nl4985bjz',
'https://www.theguardian.com/uk/business',
'https://www.reddit.com/r/funny/comments/5qzkz4/my_captain_friend_sent_me_this_photo_saudi_prince/',
'https://www.youtube.com/watch?v=Bua8k_CcnuI',
'https://stackoverflow.com/questions/7000995/jquery-removing-part-of-string-after-and-removing-too/7001040#7001040',
'http://www.pgatour.com/fantasy.html',
'http://www.cultofmac.com/464645/apple-spaceship-campus-flyover/'
If I remove the www\. part from the regex, this works well and removes https://, http://, etc., but I'd also like to remove the www. if it's there, regardless of whether the URL starts with https://.
This is what I have coded so far: https://jsfiddle.net/xba5x9ro/1/
In the future, once this is sorted, I would like to take a list of URLs from a textarea, run makeDomainBeautiful on each one, and output the results to another textarea, but I thought I'd get this working first.
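Use /^(?:https?:\/\/)?(?:www\.)?/i, where both https:// and www. are optional (the trailing ?) and wrapped in non-capturing groups ((?:...)). Your original pattern only matched when the scheme and www. were both present, which is why URLs with just one of them passed through unchanged and were then cut at the first / by the split.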
var url = prompt("url: ");
url = url.replace(/^(?:https?:\/\/)?(?:www\.)?/i, "").split('/')[0];
alert("url: " + url);
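For the textarea step mentioned in the question, here is a rough sketch of how the same regex might be wrapped in a function and applied to a list. The name makeDomainBeautiful comes from the question, but its body here is my assumption since I'm not reproducing the fiddle; cleanUrlList and the sample data are hypothetical, and the input is assumed to be one URL per line, as a textarea's value would give you.

```javascript
// Assumed body for the question's makeDomainBeautiful: strip an optional
// scheme and an optional "www.", then keep the text before the first slash.
function makeDomainBeautiful(url) {
  return url
    .trim()
    .replace(/^(?:https?:\/\/)?(?:www\.)?/i, "")
    .split("/")[0];
}

// Hypothetical helper: clean a block of text containing one URL per line
// and return the cleaned domains, again one per line. Blank lines are dropped.
function cleanUrlList(text) {
  return text
    .split(/\r?\n/)
    .map(makeDomainBeautiful)
    .filter(function (domain) { return domain !== ""; })
    .join("\n");
}

// A few URLs taken from the list in the question.
var sample = [
  "www.net-temps.com",
  "http://nychires.com/",
  "https://www.theguardian.com/uk/business",
  "https://www.youtube.com/watch?v=Bua8k_CcnuI"
].join("\n");

console.log(cleanUrlList(sample));
// net-temps.com
// nychires.com
// theguardian.com
// youtube.com
```

In the page you would pass the input textarea's value to cleanUrlList and assign the result to the output textarea's value. Note that this only cuts at the first slash, so a URL with a query string directly after the domain and no slash (something like example.com?x=1) would keep the query; subdomains other than www. are also left in place.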