I am using Simple HTML DOM to get the href values of
links with this code:
$html = file_get_html($url);
foreach($html->find('a') as $element) $array[] = $element->href . '<br>';
The problem is that if a link starts with a slash (/), the result is not a valid URL.
How can I get valid links?
For example, if the link looks like this:
<a href="/news45454.html">Test link</a>
and I use the Simple HTML DOM code, I get:
/news45454.html
But I want to get:
http://example.com/news45454.html
How can I get this?
Can we test whether the link starts with a slash and, if so, prepend the site URL to it? How?
Basically, you need to test whether each href value is a valid full URL. If the validation passes, you can add it to the array as-is. If the validation fails, you need to prepend the site's root URL (scheme and host), which you can extract from $url with parse_url().
$html = file_get_html($url);
// Build the site root from the page URL, e.g. http://example.com
$parts = parse_url($url);
$base = $parts['scheme'] . '://' . $parts['host'];
foreach($html->find('a') as $element) {
if(filter_var($element->href, FILTER_VALIDATE_URL)) {
// Valid absolute URL, add to array as-is.
$array[] = $element->href . '<br>';
} else {
// Not a full URL (e.g. /news45454.html), prepend the site root.
$array[] = $base . $element->href . '<br>';
}
}
This may need a bit of tweaking for other cases (such as <a href="#">), but it should work for the situation you outlined.
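
For the other cases mentioned above, one possible approach is a small helper that sorts each href into a few categories before building the absolute URL. This is only a sketch: the function name absolutize_href is hypothetical (it is not part of Simple HTML DOM), it skips anchors and mailto:/javascript:/tel: links by returning null, and it does not normalise ../ or ./ segments or handle query-only links.

```php
<?php
// Hypothetical helper (not part of Simple HTML DOM): turn an href into an
// absolute URL, using the URL of the page it was found on as the base.
function absolutize_href(string $href, string $pageUrl): ?string
{
    $href = trim($href);

    // Skip empty links, in-page anchors, and non-navigational schemes.
    if ($href === '' || $href[0] === '#'
        || preg_match('/^(javascript|mailto|tel):/i', $href)) {
        return null;
    }

    // Already absolute (scheme://...): return unchanged.
    if (preg_match('#^[a-z][a-z0-9+.-]*://#i', $href)) {
        return $href;
    }

    $parts = parse_url($pageUrl);
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        return null; // The base URL itself is unusable.
    }

    // Site root: scheme, host, and port if one was given.
    $root = $parts['scheme'] . '://' . $parts['host'];
    if (isset($parts['port'])) {
        $root .= ':' . $parts['port'];
    }

    // Protocol-relative link (//cdn.example.com/x): reuse the page's scheme.
    if (strpos($href, '//') === 0) {
        return $parts['scheme'] . ':' . $href;
    }

    // Root-relative link (/news45454.html): prepend the site root.
    if ($href[0] === '/') {
        return $root . $href;
    }

    // Document-relative link (news.html): resolve against the page's directory.
    $path = $parts['path'] ?? '/';
    $dir = substr($path, 0, strrpos($path, '/') + 1);
    return $root . $dir . $href;
}

// Usage with the link from the question:
echo absolutize_href('/news45454.html', 'http://example.com/'), "\n";
// prints http://example.com/news45454.html
```

Inside the foreach loop you would call this with $element->href and $url, and only add the result to $array when it is not null.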