I am attempting to extract valid cell references and range references from a spreadsheet formula using Google Apps Script (JavaScript).
A valid cell reference is one or two letters followed by consecutive digits not starting with a zero. Either the letter(s) or the digit(s) may optionally be preceded by a $ character. The entire reference can't be immediately preceded or followed by a letter, digit or underscore (in which case it may be part of a spreadsheet function name or the name of a named range) or by a colon (in which case it may be part of a range reference).
The range reference regex (rangeRefRe) seems to work well, but my cell reference regex (cellRefRe) fails to find a match. It would be great if someone could point out what I'm doing wrong.
function myFunction()
{
  var formula = '=A100+B$2:2+INDIRECT("A2:B")+$C3-SUM($D$1:$E5)';
  var fSegments = formula.split('"'); // I want to exclude references within double quotation marks
  var rangeRefRe = /[^0-9a-zA-Z_$]([0-9a-zA-Z$]+?:[0-9a-zA-Z$]+)(?![0-9a-zA-Z_])/g;
  var cellRefRe = /[^0-9a-zA-Z_$:](\${,1}[a-zA-Z]{1,2}\${,1}[1-9][0-9]*)(?![0-9a-zA-Z_:])/g;
  var refResult;
  var references = [];
  for (var i = 0; i < fSegments.length; i += 2)
  {
    while (refResult = rangeRefRe.exec(fSegments[i]))
    {
      references.push(refResult[1]);
    }
    while (refResult = cellRefRe.exec(fSegments[i]))
    {
      references.push(refResult[1]);
    }
  }
  Logger.log(references);
}
JavaScript doesn't support this part of your regex: {,1}. Without a lower bound it isn't parsed as a quantifier, so it is matched as the literal characters {,1}. To allow 0 or 1 occurrences it would need to be {0,1}, or you can replace it with just ?:
/[^0-9a-zA-Z_$:](\$?[a-zA-Z]{1,2}\$?[1-9][0-9]*)(?![0-9a-zA-Z_:])/g;
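To see the difference, here is a minimal sketch that can be run outside Apps Script. The helper name extractCellRefs is hypothetical (it is not from the question), and console.log stands in for Logger.log. It first shows that {,1} is read as literal text, then applies the corrected cell reference regex to the segments of the question's formula that lie outside double quotes:

```javascript
// {,1} is not a quantifier in a (non-unicode) JavaScript regex;
// it is read as the literal characters "{,1}".
var literalBraces = /^\${,1}A1$/;
console.log(literalBraces.test('$A1'));      // false
console.log(literalBraces.test('${,1}A1'));  // true

// Hypothetical helper: collects cell references found outside double quotes,
// using the corrected regex with ? in place of {,1}.
function extractCellRefs(formula) {
  var cellRefRe = /[^0-9a-zA-Z_$:](\$?[a-zA-Z]{1,2}\$?[1-9][0-9]*)(?![0-9a-zA-Z_:])/g;
  var segments = formula.split('"'); // even indexes are outside the quotes
  var refs = [];
  var match;
  for (var i = 0; i < segments.length; i += 2) {
    cellRefRe.lastIndex = 0; // start each segment from the beginning
    while ((match = cellRefRe.exec(segments[i])) !== null) {
      refs.push(match[1]);
    }
  }
  return refs;
}

console.log(extractCellRefs('=A100+B$2:2+INDIRECT("A2:B")+$C3-SUM($D$1:$E5)'));
// [ 'A100', '$C3' ]
```

B$2 and $D$1 are skipped here because each is followed by a colon, so they are left for the range regex to pick up.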