I am not sure if this problem is a boo-boo on my part or something about CI. I have a preg_replace process to convert a published gdoc spreadsheet url back into the original spreadsheet url.
$pat = '/(^[a-z\/\.\:]*?sheet\/)(pub)([a-zA-Z0-9\=\?]*)(\&output\=html)/';
$rep = '$1ccc$3#gid=0';
$theoriginal = preg_replace( $pat, $rep, $published );
This works fine in a test page run locally. This test page isn't framed by CI - it's just a basic php page.
When I copy and paste the pattern and replacement into the CI view which it's intended for, no joy.
Is this malfunction caused by CI, or is it my 'bad'? Are there easy-to-implement remedies?
Here's a bit more code from the CI view:
<body id="sites" >
<?php
foreach ( $dets as $item )
{
    $nona = $item->nona;
    $address = $item->address;
    $town = $item->town;
    $pc = $item->pc;
    $foto1 = $item->foto1;
    $foto1txt = $item->foto1txt;
    $foto2 = $item->foto2;
    $foto2txt = $item->foto2txt;
    $costurl = $item->costurl;
    $sid = $item->sid;
}
//convert published spreadsheet url to gdoc spreadsheet url
$pat ='/(^[a-z\/\.\:]*?sheet\/)(pub)([a-zA-Z0-9\=\?]*)(\&output\=html)/i';
$rep ='$1ccc$3#gid=0';
$spreadsheet = preg_replace( $pat, $rep, $costurl);
The pattern you came to can be "tidied" up a bit:
~^(.*?sheet/)pub(.*)(&[a-z=]*)$~
See the regex demo.
The leading `^` and trailing `$` are not usually put inside the groups. The `/` can be left unescaped if you use a regex delimiter other than `/`. A `&` and `=` are not special regex metacharacters; `=` is only "special" in positive lookaround constructs. So, your pattern means:
`^` - start of string anchor
`(.*?sheet/)` - Group 1: any 0+ chars other than line break chars, as few as possible, up to the first occurrence of `sheet/` and the subsequent subpatterns. (Since I believe the point is to only match `pub` in the URL path, not the query string, you need to actually replace `.*?` with `[^?#]*?`, a negated character class matching 0+ chars other than `#` and `?`.)
`pub` - a literal substring
`(.*)` - Group 2: any 0+ chars other than line break chars, as many as possible, up to the last occurrence of the subsequent subpatterns
`(&[a-z=]*)` - Group 3: a `&` followed by 0 or more ASCII letters (since the `i` modifier is used, the `[a-z]` pattern will also match uppercase letters) and/or `=`
`$` - end of string anchor.

It seems to me that you may also use a better pattern like
~^([^?#]*?sheet/)pub(.*)(&[a-z=]*)$~
   ^^^^^^
See this regex demo. The change is explained in the Group 1 note above.
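To make the difference concrete, here is a minimal sketch of how the pattern might be used with `preg_replace`. The helper name `published_to_original` and both sample URLs are made up for illustration, and the `i` modifier is assumed as in your view code. Since `pub` is no longer captured, the part you referenced as `$3` is now group 2, so the replacement is adjusted accordingly (written as `${1}`/`${2}` so the digits cannot run into the following text).

```php
<?php
// Hypothetical helper: converts a published spreadsheet URL back to the
// original one using the pattern suggested above.
function published_to_original(string $published): string
{
    $pat = '~^([^?#]*?sheet/)pub(.*)(&[a-z=]*)$~i';
    // ${1} = path up to "sheet/", ${2} = text between "pub" and the trailing "&..." part
    $rep = '${1}ccc${2}#gid=0';
    $result = preg_replace($pat, $rep, $published);
    // preg_replace returns null on error; fall back to the input in that case
    return $result ?? $published;
}

// Made-up sample URL in the shape described in the question
$published = 'https://docs.google.com/spreadsheet/pub?key=0AbCdEfGh123&output=html';
$converted = published_to_original($published);
echo $converted, "\n";
// https://docs.google.com/spreadsheet/ccc?key=0AbCdEfGh123#gid=0

// Made-up URL where "sheet/pub" only occurs in the query string
$other = 'https://example.com/view?next=sheet/pub&output=html';
$strict = published_to_original($other);
$loose  = preg_replace('~^(.*?sheet/)pub(.*)(&[a-z=]*)$~i', '${1}ccc${2}#gid=0', $other);
echo $strict, "\n"; // unchanged: [^?#]*? cannot cross the "?"
echo $loose, "\n";  // https://example.com/view?next=sheet/ccc#gid=0
```

The second URL is the case the negated character class guards against: with `.*?` the match reaches into the query string, while `[^?#]*?` leaves the URL untouched.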