I would like to know if anybody can help me with a regular expression problem. I want to write a regular expression to catch URLs similar to this URL:
www.justin.tv/channel_name_here
I have tried:
/justin\.tv\/(.*)
The problem I get is that when this channel goes live, sometimes the URL transforms to something like this:
www.justin.tv/channel_name_here#/w/45365675688
I can't catch this. :( Can anybody please help me with this? I just want to catch the channel name without the pound symbol and the rest of the URL.
Here are some example URLs:
www.justin.tv/winning_movies#/w/6347562128
http://www.justin.tv/cine_accion_hd16#/w/6347562128/18
http://www.justin.tv/fox_movies_hd1/
I would want to get:
winning_movies
cine_accion_hd16
fox_movies_hd1
Thanks in advance! :)
Short answer:
(?<=justin\.tv\/)([^#\/]+)
Long answer:
Let's split this into two parts, starting with the second one.
([^#\/]+)
This matches one or more consecutive characters that are neither '#' nor '/', so the match stops at the first '#' or '/' it meets. Now let's look at the first part.
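To see how the character class behaves on its own, here is a small sketch in Python (the question doesn't say which language or regex engine is in use, so Python is an assumption, and the variable names are made up for illustration). It applies the class to the text that follows "justin.tv/" in one of the URLs from the question:

```python
import re

# One or more characters that are neither '#' nor '/'.
# Inside a character class the '/' needs no backslash in Python.
segment = re.compile(r"[^#/]+")

# The text that follows "justin.tv/" in the question's live-channel URL.
tail = "channel_name_here#/w/45365675688"

# match() anchors at the start of the string and stops at the first '#' or '/'.
first_segment = segment.match(tail).group(0)

# findall() shows every run the class can match in the same text.
all_segments = segment.findall(tail)

print(first_segment)
print(all_segments)
```

The first run is the channel name; the later runs ("w" and the number) are the pieces that the lookbehind has to rule out.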
(?<=justin\.tv\/)
The syntax "(?<=" followed by ")" is called a positive lookbehind, which is one of several kinds of lookaround. Using a simple example:
(?<=A)B
The above example says "I want every 'B' that comes immediately after an 'A'." The 'A' has to be there, but it is not part of the match. Going to our big example, we're saying we want the run of characters containing no '#' or '/' that comes immediately after "justin.tv/", which is the channel name.
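A quick way to check the simple lookbehind is to run it on a made-up string. This is only a sketch, again assuming Python's re module; the sample text and variable names are invented for the demonstration:

```python
import re

# 'B' only where the character immediately before it is 'A'.
lookbehind = re.compile(r"(?<=A)B")

# Made-up sample: 'B' appears four times, but only two follow an 'A'.
sample = "AB CB AAB B"

matches = list(lookbehind.finditer(sample))
positions = [m.start() for m in matches]   # where each matching 'B' sits
matched_text = [m.group(0) for m in matches]  # what was actually consumed

print(positions)
print(matched_text)
```

Each match consists of the 'B' alone: the 'A' is checked but never included in the matched text.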
Look here for an example of the expression in action.
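For a self-contained check, the sketch below runs the expression against the three example URLs from the question. It assumes Python's re module (the question doesn't name a language), and channel_name is a hypothetical helper written just for this illustration:

```python
import re

# The short answer's expression. Python's lookbehind must be fixed-width,
# which "justin.tv/" is. The '/' needs no escaping in a Python pattern.
CHANNEL = re.compile(r"(?<=justin\.tv/)[^#/]+")

def channel_name(url):
    """Hypothetical helper: return the channel name, or None if none is found."""
    m = CHANNEL.search(url)
    return m.group(0) if m else None

# The example URLs from the question.
urls = [
    "www.justin.tv/winning_movies#/w/6347562128",
    "http://www.justin.tv/cine_accion_hd16#/w/6347562128/18",
    "http://www.justin.tv/fox_movies_hd1/",
]

names = [channel_name(u) for u in urls]
for name in names:
    print(name)
```

If your regex engine has no lookbehind support, an ordinary capturing group should give the same result: match justin\.tv\/([^#\/]+) and read group 1 instead of the whole match.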