javascript regex regex-lookarounds capturing-group

Use just regexp to split a string into a 'tuple' of filename and extension?


I know there are easier ways to get file extensions with JavaScript, but partly to practice my regexp skills I wanted to try and use a regular expression to split a filename into two strings, before and after the final dot (. character).

Here's what I have so far:

const myRegex = /^((?:[^.]+(?:\.)*)+?)(\w+)?$/;
// match() puts the whole match at index 0, so skip it when destructuring
const [, filename1, extension1] = 'foo.baz.bing.bong'.match(myRegex);
// filename1 = 'foo.baz.bing.'
// extension1 = 'bong'
const [, filename2, extension2] = 'one.two'.match(myRegex);
// filename2 = 'one.'
// extension2 = 'two'
const [, filename3, extension3] = 'noextension'.match(myRegex);
// filename3 = 'noextension'
// extension3 = undefined (the optional group did not participate)

I've tried to use a positive lookahead to say 'only match a literal . if it's followed by a word that ends in another .', like so, by changing (?:\.)* to (?:\.(?=\w+.))*:

/^((?:[^.]+(?:\.(?=(\w+\.))))*)(\w+)$/gm
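As a minimal sketch of what a lookahead like that does on its own (the variable names here are made up for illustration): the lookahead inspects the text after the dot without consuming it, so a dot that is not followed by "word, then another dot" is simply not matched.

```javascript
// A lookahead checks what follows the current position without consuming it.
const withLookahead = 'one.two'.match(/\.(?=\w+)/);
const matchedText = withLookahead[0];   // only the dot is part of the match
const matchedAt = withLookahead.index;  // position of that dot in the string

// Without the lookahead, the following word becomes part of the match.
const withoutLookahead = 'one.two'.match(/\.\w+/)[0];

// Dots that are followed by a word and then another dot.
// The final dot has no dot after its word, so it is not in the list.
const dotPositions = [...'foo.baz.bing.bong'.matchAll(/\.(?=\w+\.)/g)]
  .map((m) => m.index);

console.log(matchedText, matchedAt, withoutLookahead, dotPositions);
```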

But I want to exclude that final period from the first group using just the regexp, and preferably have 'noextension' matched by the first group as well. How can I do that with just regexp?

Here is my regexp scratch file: https://regex101.com/r/RTPRNU/1


Solution

  • A late pitch-in on this: I wanted to split a filename into a "name" part and an "extension" part and wasn't able to find a good solution that covered all my test cases. In particular, a filename starting with "." should come back whole as the "name", and files without any extension should be supported too.

    So I'm using this line, which handles all my use cases:

    const [name, ext] = (filename.match(/(.+)\.(.+)/) || ['', filename]).slice(1)
    

    This gives the following output:

    '.htaccess' => ['.htaccess', undefined]
    'foo' => ['foo', undefined]
    'foo.png' => ['foo', 'png']
    'foo.bar.png' => ['foo.bar', 'png']
    '' => ['', undefined]
    

    I find that to be what I want.
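For the "just regexp" part of the question, one possible sketch is a single anchored pattern with a lazy first group and an optional extension group. The names `SPLIT_RE` and `splitFilename` are hypothetical, and the choice to treat a trailing dot (as in 'foo.') as part of the name is an assumption, not something stated in the question.

```javascript
// Hypothetical pattern: group 1 is lazy, so it stops as early as it can
// while still letting the optional group 2 claim the text after the final dot.
// [^.]+ keeps the dot itself out of both groups and requires a non-empty extension.
const SPLIT_RE = /^(.+?)(?:\.([^.]+))?$/;

// Hypothetical helper: returns [name, extension], extension may be undefined.
function splitFilename(filename) {
  const match = filename.match(SPLIT_RE);
  // The empty string (or a name containing a newline) does not match,
  // so fall back to "no extension".
  if (!match) return [filename, undefined];
  return [match[1], match[2]];
}

for (const sample of ['foo.bar.png', 'foo.png', 'foo', '.htaccess', '']) {
  console.log(JSON.stringify(sample), '=>', splitFilename(sample));
}
```

Because group 1 needs at least one character before the dot, a leading-dot name such as '.htaccess' stays whole, and the final period never ends up in either group.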