javascript regex match

Fetching YT video, channel or none


Given a list of URLs, how can I split it into three sub-lists: one for YouTube videos, a second for YouTube channels, and a third for everything else?

const paragraph1 = 'www.youtube.com/watch?v=NsjeEt1ZpqQ';
const regex1 = /www.youtube.com/(\c*)(watch?v=)?<videoId>[A-Z,0-9])/gi;

const paragraph2 = 'https://www.youtube.com/channel/UCKqFqiCe1dCUxRe0_YNZ6gg';                                 
const regex2 = /www.youtube.com/channel/?<channelId>[A-Z,_,0-9])/gi;

const found = paragraph1.match(regex1);
console.log(found);
// expected output: Array ["T", "I"]

const found = paragraph2.match(regex2);
console.log(found);

I tried this in an online sandbox.


Solution

  • Since you want to split a list of URL strings into three parts, you can use three different patterns:

    www\.youtube\.com\/watch\?v=(?<videoId>\S+)
    www\.youtube\.com\/channel\/(?<channelId>\S+)
    www\.youtube\.com(?!\/(?:channel\/|watch\?v=))\S*
    

    See the regex #1, regex #2 and regex #3 demos. Note that you need an ECMAScript 2018+ compliant JavaScript environment for the named capturing groups to work. Also, the dots are escaped wherever they denote literal dots.

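    The patterns mean:

    The first pattern matches the literal text www.youtube.com/watch?v= and then captures one or more non-whitespace characters (\S+) into a named group.
    The second pattern matches the literal text www.youtube.com/channel/ and then captures one or more non-whitespace characters into a named group.
    The third pattern matches www.youtube.com only when it is not immediately followed by /channel/ or /watch?v= (the negative lookahead, (?!...)), and then matches zero or more non-whitespace characters (\S*).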

    If you plan to use the patterns against marked-up text, exclude the mark-up characters from the \S pattern: change it into a negated character class built from the opposite shorthand, [^\s], and add the characters to exclude after \s. For example, if the links are inside double quotes, add " there: [^\s"].
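As a rough illustration of how the patterns could be used to split a list, here is a minimal sketch. The helper names partitionUrls and extractVideoIds are hypothetical, and the second group is named channelId here as an assumption of this sketch. It also assumes an environment that supports named groups and String.prototype.matchAll.

```javascript
// Patterns from the answer; the group name channelId is this sketch's choice.
const videoRe = /www\.youtube\.com\/watch\?v=(?<videoId>\S+)/i;
const channelRe = /www\.youtube\.com\/channel\/(?<channelId>\S+)/i;

// Hypothetical helper: sorts each URL into videos, channels, or others.
function partitionUrls(urlList) {
  const parts = { videos: [], channels: [], others: [] };
  for (const url of urlList) {
    const video = videoRe.exec(url);
    const channel = channelRe.exec(url);
    if (video) {
      parts.videos.push({ url, videoId: video.groups.videoId });
    } else if (channel) {
      parts.channels.push({ url, channelId: channel.groups.channelId });
    } else {
      parts.others.push(url); // neither a video nor a channel
    }
  }
  return parts;
}

// Hypothetical helper for links embedded in markup:
// [^\s"] stops the match at whitespace or at the closing double quote.
function extractVideoIds(markup) {
  const quotedVideoRe = /www\.youtube\.com\/watch\?v=(?<videoId>[^\s"]+)/gi;
  return Array.from(markup.matchAll(quotedVideoRe), (m) => m.groups.videoId);
}

const sample = partitionUrls([
  'www.youtube.com/watch?v=NsjeEt1ZpqQ',
  'https://www.youtube.com/channel/UCKqFqiCe1dCUxRe0_YNZ6gg',
  'https://example.com/page',
]);
console.log(sample);
console.log(
  extractVideoIds('<a href="https://www.youtube.com/watch?v=NsjeEt1ZpqQ">video</a>')
);
```

In this sketch the final else branch collects every URL that is neither a video nor a channel, including non-YouTube URLs, which is what the question's "all the rest" asks for. If only other YouTube URLs should land in the third list, the third pattern can be tested there instead. Note that \S+ captures everything up to the next whitespace, so extra query parameters after the ID would end up in the captured value.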