[SOLVED] please regex url request

I would like to know if anybody can help me with a regular expression problem. I want to write a regular expression to catch URLs similar to this URL:

www.justin.tv/channel_name_here

I have tried:

/justin\.tv\/(.*)

The problem I get is that when this channel goes live, sometimes the URL transforms to something like this:

www.justin.tv/channel_name_here#/w/45365675688

I can't catch this. :( Can anybody please help me with this? I just want to catch the channel name without the pound symbol and the rest of the URL.

Here are some example URLs:

www.justin.tv/winning_movies#/w/6347562128
http://www.justin.tv/cine_accion_hd16#/w/6347562128/18
http://www.justin.tv/fox_movies_hd1/

I would want to get:

winning_movies
cine_accion_hd16
fox_movies_hd1

Thanks in advance! :)

Solution

Short answer:

(?<=justin\.tv\/)([^#\/]+)

Long answer:

Let's split this up into parts. Look at the back part first.

([^#\/]+)

This matches a run of one or more characters that are neither '#' nor '/'. Now let's look at the first part.

(?<=justin\.tv\/)

The syntax "(?<=" followed by ")" is called a positive lookbehind (this page has good examples and explanations of the different types of lookaround). Using a simple example:

(?<=A)B

The above example says "I want every 'B' that comes immediately after an 'A'"; the 'A' itself is not part of the match. Going back to our full expression, we're saying we want the run of characters (stopping at the first '#' or '/') that comes immediately after "justin.tv/".
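To make this concrete, here is a small sketch that runs the expression against the three example URLs from the question. It assumes Python's re module (the question doesn't say which language is being used); the names CHANNEL, urls and names are just made up for the illustration.

```python
import re

# The expression from the short answer. The escaped '/' is not required
# in Python, but it is harmless, so the pattern is kept as written.
CHANNEL = re.compile(r"(?<=justin\.tv\/)([^#\/]+)")

# Example URLs taken from the question.
urls = [
    "www.justin.tv/winning_movies#/w/6347562128",
    "http://www.justin.tv/cine_accion_hd16#/w/6347562128/18",
    "http://www.justin.tv/fox_movies_hd1/",
]

names = []
for url in urls:
    m = CHANNEL.search(url)
    # The lookbehind is not part of the match, so the whole match
    # (group 0) is already just the channel name.
    names.append(m.group(0) if m else None)

print(names)  # ['winning_movies', 'cine_accion_hd16', 'fox_movies_hd1']
```

Because the lookbehind only checks what comes before the match without consuming it, there is nothing to strip off afterwards.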

Look here for an example of the expression in action.
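
One caveat: not every regex engine supports lookbehind. If yours doesn't, a capture group should give the same result. This is only a sketch of that alternative, again assuming Python, and the names GROUPED and channel_name are hypothetical.

```python
import re

# Same idea without lookbehind: match "justin.tv/" normally and
# capture the channel name in group 1.
GROUPED = re.compile(r"justin\.tv/([^#/]+)")

def channel_name(url):
    """Return the channel name from a justin.tv URL, or None if absent."""
    m = GROUPED.search(url)
    return m.group(1) if m else None

print(channel_name("http://www.justin.tv/cine_accion_hd16#/w/6347562128/18"))
```

The difference is that the full match now includes "justin.tv/", so you have to read group 1 instead of the whole match.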