I'm trying to get plain text from the Google Maps Directions API, from the html_instructions field in the JSON response. Everything is encoded as HTML and I want to output plain text.
Here is what I'm getting: image 1
This is what I want: image 2
I tried every type of preg_replace, but it didn't help.
Google Maps API link: Link
EDIT: Previous code snippet removed and replaced by a small working program.
Note that when you process the data with json_decode(), the Unicode escapes such as \u003cb\u003eFlintergata\u003c/b\u003e are converted to <b>Flintergata</b>. This helps a lot to make the regex more readable.
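As a minimal sketch of that decoding step, the snippet below runs json_decode() on a hand-written JSON fragment. The sample string is an assumption made up for illustration, not actual API output.

```php
<?php
// Hypothetical JSON fragment shaped like one step of the Directions response.
// Single quotes keep the \u003c sequences intact so json_decode() sees them.
$json = '{"html_instructions":"Turn \u003cb\u003eleft\u003c/b\u003e onto \u003cb\u003eFlintergata\u003c/b\u003e"}';

$step = json_decode($json, true);

// The \u003c and \u003e escapes are now literal < and > characters.
echo $step['html_instructions'], PHP_EOL;
```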
Note that the $details array is a multilevel associative array, so you need to dig down as shown to find the data you need.
Note also that the URL you provided results in 1 route with 1 leg, so the code I provided shows and processes the first leg of the first route.
If you use a different URL you may get multiple routes, each with multiple legs and steps. The code will still process only the first leg of the first route, but it is easy to extend with outer loops to show all of them (not shown below).
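One way the outer loops could look is sketched here. The helper name collectInstructions() and the sample array are hypothetical; the array only mimics the routes / legs / steps nesting that json_decode($json, true) produces.

```php
<?php
// Hypothetical helper: gather html_instructions from every route, leg and step.
function collectInstructions(array $details): array
{
    $out = [];
    foreach ($details['routes'] ?? [] as $route) {
        foreach ($route['legs'] ?? [] as $leg) {
            foreach ($leg['steps'] ?? [] as $step) {
                $out[] = $step['html_instructions'];
            }
        }
    }
    return $out;
}

// Hand-made sample standing in for the decoded API response.
$details = [
    'routes' => [
        ['legs' => [
            ['steps' => [
                ['html_instructions' => 'Head <b>north</b>'],
                ['html_instructions' => 'Turn <b>left</b>'],
            ]],
        ]],
        ['legs' => [
            ['steps' => [
                ['html_instructions' => 'Head <b>south</b>'],
            ]],
        ]],
    ],
];

print_r(collectInstructions($details));
```

The ?? [] fallbacks simply make the loops skip a missing key instead of raising a warning.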
The explanation of the regex string '#<b>([A-Z].*?)</b>#' is as follows.
The '#' at each side is the PHP delimiter; you can use other characters instead and it would not make any difference.
The <b> and </b> say that each matched string must start with <b> and end with </b>.
The ( ) is a "capture group", which says we want to extract only that part of the string (excluding the <b> and </b>).
[A-Z] says begin with a capital letter.
.* says follow with 0 or more of any character.
The ? makes the * non-greedy, so in this case the current match stops at the next </b>.
The list of matches for each string goes into an array called $match, and $match[1] is an array of the capture group matches (i.e. the text with the <b> and </b> removed).
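To see the pattern in isolation, here is a small sketch that applies it to a single made-up instruction string (the string is an assumption, not taken from the API response).

```php
<?php
// Sample instruction string, invented for illustration.
$html = 'Turn <b>left</b> onto <b>Flintergata</b> toward <b>Fv44</b>';

preg_match_all('#<b>([A-Z].*?)</b>#', $html, $match);

// $match[0] holds the full matches including the tags,
// $match[1] holds only the text captured by the ( ) group.
print_r($match[0]);
print_r($match[1]);
```

Because of the [A-Z], bold text that starts with a lowercase letter, such as "left" here, is not captured.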
<?php
$json = file_get_contents("https://maps.googleapis.com/maps/api/directions/json?origin=sandnes&destination=vigrestad&key=");
$details = json_decode($json, true);
// $details is a large associative array

// print the instructions for every step of the first leg of the first route
echo PHP_EOL."Here are the unfiltered html instructions for first leg of first route ".PHP_EOL.PHP_EOL;
$steps = $details['routes'][0]['legs'][0]['steps'];
foreach ($steps as $step) {
    echo $step['html_instructions'].PHP_EOL; // print to see format
    // we see the \u003c style escapes have been replaced and now look like <b> </b> etc
}

// now extract the required information from each step
echo PHP_EOL."Here are the filtered html instructions for first leg of first route ".PHP_EOL.PHP_EOL;
foreach ($steps as $step) {
    //preg_match_all("~003e([A-Z].*?)\\\\u003c~", $step['html_instructions'], $match); // not needed now
    // capture the text between <b> and </b> that starts with a capital letter
    preg_match_all('#<b>([A-Z].*?)</b>#', $step['html_instructions'], $match);
    foreach ($match[1] as $instructionPart) {
        echo $instructionPart." ";
    }
    echo PHP_EOL;
}
?>
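If the goal is the whole instruction as plain text rather than only the capitalised bold parts, a different technique from the regex above may be simpler: PHP's built-in strip_tags() together with html_entity_decode(). This is only a sketch under the assumption that the instructions contain <b> and <div> tags; the sample string is made up.

```php
<?php
// Sample instruction string, invented for illustration.
$html = 'Turn <b>left</b> onto <b>Flintergata</b><div style="font-size:0.9em">Destination will be on the right</div>';

// Put a space before each <div so its text does not run into the previous word,
// then drop all tags and decode entities such as &amp; into plain characters.
$spaced = str_replace('<div', ' <div', $html);
$plain  = html_entity_decode(strip_tags($spaced), ENT_QUOTES | ENT_HTML5, 'UTF-8');

echo $plain, PHP_EOL;
```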