How to parse a number with commas via Megaparsec

Currently I have a parser:

pScientific :: Parser Scientific
pScientific = lexeme L.scientific

This easily parses something like 4087.00

but fails when the number is 4,087.00. Is there a way to make Megaparsec parse a number with commas?

PS: I am very new to Haskell, so I apologize if this is a stupid question.

Solution

This is not parsed because the scientific parser is written mainly with JSON numbers in mind. JSON does not allow a comma inside a number, since the comma is used to separate elements in arrays and objects.

We can take a look at the implementation of scientific [src]:

-- | Parse a JSON number.
scientific :: Parser Scientific
scientific = do
  sign <- A.peekWord8'
  let !positive = not (sign == W8_MINUS)
  when (sign == W8_PLUS || sign == W8_MINUS) $
    void A.anyWord8

  n <- decimal0

  let f fracDigits = SP (B.foldl' step n fracDigits)
                        (negate $ B.length fracDigits)
      step a w = a * 10 + fromIntegral (w - W8_0)

  dotty <- A.peekWord8
  SP c e <- case dotty of
              Just W8_DOT -> A.anyWord8 *> (f <$> A.takeWhile1 isDigit_w8)
              _           -> pure (SP n 0)

  let !signedCoeff | positive  =  c
                   | otherwise = -c

  (A.satisfy (\ex -> case ex of W8_e -> True; W8_E -> True; _ -> False) *>
      fmap (Sci.scientific signedCoeff . (e +)) (signed decimal)) <|>
    return (Sci.scientific signedCoeff    e)
{-# INLINE scientific #-}

The main thing to change is the decimal0 part, which captures a sequence of one or more decimal digits. We can, for example, implement this with:

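import qualified Data.Attoparsec.ByteString as A
import qualified Data.ByteString as B
import qualified Data.ByteString.Unsafe as B  -- provides unsafeHead
import Data.Attoparsec.ByteString.Char8 (Parser, isDigit_w8)

-- Byte values: 44 is ',' and 48 is '0'.
-- bsToInteger is the same helper that the original decimal0 uses.
decimal0' :: Parser Integer
decimal0' = do
  -- Take a run of digits and commas, then drop the commas.
  digits <- B.filter (\x -> x /= 44) <$> A.takeWhile1 (\x -> isDigit_w8 x || x == 44)
  if B.length digits > 1 && B.unsafeHead digits == 48
    then fail "leading zero"
    else return (bsToInteger digits)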

and then use that one with:

{-# LANGUAGE BangPatterns #-}

import Control.Applicative ((<|>))
import Control.Monad (void, when)
import qualified Data.Attoparsec.ByteString as A
import qualified Data.ByteString as B
import qualified Data.Scientific as Sci
import Data.Attoparsec.ByteString.Char8 (Parser, decimal, isDigit_w8, signed)
import Data.Scientific (Scientific)

-- Strict pair of coefficient and exponent, as used by the original source.
data SP = SP !Integer {-# UNPACK #-} !Int

-- | Parse a number whose integral part may contain commas.
-- Byte values: 43 '+', 45 '-', 46 '.', 48 '0', 69 'E', 101 'e'.
scientific' :: Parser Scientific
scientific' = do
  sign <- A.peekWord8'
  let !positive = not (sign == 45)
  when (sign == 43 || sign == 45) $
    void A.anyWord8

  n <- decimal0'

  let f fracDigits = SP (B.foldl' step n fracDigits)
                        (negate $ B.length fracDigits)
      step a w = a * 10 + fromIntegral (w - 48)

  dotty <- A.peekWord8
  SP c e <- case dotty of
              Just 46 -> A.anyWord8 *> (f <$> A.takeWhile1 isDigit_w8)
              _       -> pure (SP n 0)

  let !signedCoeff | positive  =  c
                   | otherwise = -c

  (A.satisfy (\ex -> ex == 101 || ex == 69) *>
      fmap (Sci.scientific signedCoeff . (e +)) (signed decimal)) <|>
    return (Sci.scientific signedCoeff    e)
{-# INLINE scientific' #-}

This does not check that a comma is placed after every three digits, which would require extra logic, but it is a basic implementation that accepts commas in the integral part of the Scientific.
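
Since the question uses Megaparsec rather than attoparsec, the same idea, including the three-digit grouping check, could be sketched directly with Megaparsec combinators. This is only a minimal sketch under some assumptions: the input type is Text, the names integralDigits and pScientificCommas are made up for illustration, exponents such as 1e5 are not handled, and leading zeros are not rejected.

```haskell
import Data.Scientific (Scientific, scientific)
import Data.Text (Text, pack)
import Data.Void (Void)
import Text.Megaparsec
import Text.Megaparsec.Char (char, digitChar)

type Parser = Parsec Void Text

-- Digits of the integral part with the thousands separators removed.
-- Accepts plain digits ("4087") or comma-grouped digits ("4,087").
integralDigits :: Parser String
integralDigits = do
  leading <- some digitChar
  -- Each group is a comma followed by exactly three digits.
  groups <- many (try (char ',' *> count 3 digitChar <* notFollowedBy digitChar))
  if not (null groups) && length leading > 3
    then fail "more than three digits before the first comma"
    else pure (concat (leading : groups))

-- Hypothetical replacement for pScientific (no exponent support).
pScientificCommas :: Parser Scientific
pScientificCommas = do
  negative <- option False (True <$ char '-')
  intPart  <- integralDigits
  fracPart <- option "" (char '.' *> some digitChar)
  let coeff = read (intPart ++ fracPart) :: Integer
      value = scientific coeff (negate (length fracPart))
  pure (if negative then negate value else value)
```

For example, parseMaybe pScientificCommas (pack "4,087.00") should succeed, while an input such as "1234,567" is rejected. To use it in place of the original parser, it can be wrapped as lexeme pScientificCommas.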