Given a list of URLs, how can I divide it into 3 sublists:
one for YT videos, a second for YT channels, and a third for all the rest?
const paragraph1 = 'www.youtube.com/watch?v=NsjeEt1ZpqQ';
const regex1 = /www.youtube.com/(\c*)(watch?v=)?<videoId>[A-Z,0-9])/gi;
const paragraph2 = 'https://www.youtube.com/channel/UCKqFqiCe1dCUxRe0_YNZ6gg';
const regex2 = /www.youtube.com/channel/?<channelId>[A-Z,_,0-9])/gi;
const found1 = paragraph1.match(regex1);
console.log(found1);
const found2 = paragraph2.match(regex2);
console.log(found2);
I tried this in the sandbox on this site.
Since you are planning to split a list of URL strings into three parts, you can use three different patterns:
www\.youtube\.com\/watch\?v=(?<videoId>\S+)
www\.youtube\.com\/channel\/(?<channelId>\S+)
www\.youtube\.com(?!\/(?:channel\/|watch\?v=))\S*
See the regex #1, regex #2 and regex #3 demos. Note that you need an ECMAScript 2018+ compliant JavaScript environment for the named capturing groups to work. Also, note that the dots are escaped wherever they denote literal dots.
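To show how the patterns could be applied to an actual list, here is a minimal sketch. The helper name splitUrls, the result keys and the third sample URL are made up for illustration, and the channel group is named channelId here. Anything that matches neither the video nor the channel pattern goes to the "rest" list, so non-YouTube URLs land there too.

```javascript
// Patterns without the g flag, so String.prototype.match returns named groups.
const videoRe = /www\.youtube\.com\/watch\?v=(?<videoId>\S+)/i;
const channelRe = /www\.youtube\.com\/channel\/(?<channelId>\S+)/i;

// Hypothetical helper: sorts each URL into one of three lists.
function splitUrls(urls) {
  const result = { videos: [], channels: [], rest: [] };
  for (const url of urls) {
    const video = url.match(videoRe);
    const channel = url.match(channelRe);
    if (video) {
      result.videos.push(video.groups.videoId);
    } else if (channel) {
      result.channels.push(channel.groups.channelId);
    } else {
      result.rest.push(url); // neither a video nor a channel URL
    }
  }
  return result;
}

// The first two URLs come from the question; the third is sample data.
const sampleUrls = [
  'www.youtube.com/watch?v=NsjeEt1ZpqQ',
  'https://www.youtube.com/channel/UCKqFqiCe1dCUxRe0_YNZ6gg',
  'https://example.com/page',
];
console.log(splitUrls(sampleUrls));
```

Note that \S+ captures everything up to the next whitespace, so extra query parameters after the video ID (if any) would end up in the captured value as well.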
The patterns mean:
www\.youtube\.com\/watch\?v=
- a literal www.youtube.com/watch?v= string
(?<videoId>\S+)
- Group "videoId": one or more non-whitespace chars
www\.youtube\.com\/channel\/
- a literal www.youtube.com/channel/ string, followed in the second pattern by a named group that captures one or more non-whitespace chars
www\.youtube\.com(?!\/(?:channel\/|watch\?v=))\S*
- a www.youtube.com string, then a negative lookahead that fails the match if, immediately to the right, there is a / char followed by channel/ or watch?v=, and then zero or more non-whitespace chars are consumed.

If you plan to use the patterns against some mark-up text, make sure you subtract the mark-up chars from the \S pattern. That is, change it into a negated character class with the reverse shorthand, [^\s], and add the chars after \s. Say, if the links are inside double quotes, put " there: [^\s"].
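As a rough illustration of both the negative lookahead and the narrowed character class, the sketch below runs the third pattern, with \S replaced by [^\s"], against a made-up HTML fragment. The variable names and the markup are assumptions for the example only.

```javascript
// Third pattern with \S narrowed to [^\s"] so the closing quote ends the match.
const otherYouTubeRe = /www\.youtube\.com(?!\/(?:channel\/|watch\?v=))[^\s"]*/gi;

// Made-up markup for illustration.
const html =
  '<a href="https://www.youtube.com/feed/trending">Trending</a> ' +
  '<a href="https://www.youtube.com/watch?v=NsjeEt1ZpqQ">Video</a> ' +
  '<a href="https://www.youtube.com/channel/UCKqFqiCe1dCUxRe0_YNZ6gg">Channel</a>';

// With the g flag, match returns all matches, or null when there are none.
const otherLinks = html.match(otherYouTubeRe) || [];
console.log(otherLinks); // [ 'www.youtube.com/feed/trending' ]
```

The watch and channel links are skipped because the lookahead fails right after www.youtube.com. With plain \S* the single match would run past the closing quote into the "> and the link text.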