Haskell: Parsing an object that could be multiple types into one single type


I'm a Haskell beginner working through aeson, learning more about both by parsing some data files.

Whenever you parse a data file, whether it is .json, a Lua table, .csv or some other format, there is always a chance of error.

For example, a simple .json file like this

{
  "root": {
    "m1": {
      "key1": "value1",
      "key2": 2
    },
    "m2": {
      "key1": 1
    }
  }
}

has two oddities: "m1" has two subkeys, one with a String value and one with an Int value. "m2" has only one subkey, which shares its name with a key in "m1" but holds a value of a different type, i.e. Int.


If it were like this

{
  "root": {
    "m1": {
      "key1": "value1",
      "key2": 2
    },
    "m2": {
      "key1": "value1",
      "key2": 2
    }
  }
}

A simple way of parsing it with Aeson would be with these datatypes

data Root = Root { root :: Map String Key
                 } deriving (Show, Generic)

data Key = Key { key1 :: String
               , key2 :: Int
               } deriving (Show, Generic)

If a key was missing

{
  "root": {
    "m1": {
      "key1": "value1",
      "key2": 2
    },
    "m2": {
      "key1": "value1"
    }
  }
}

This could have done the job

data Root = Root { root :: Map String Key
                 } deriving (Show, Generic)

data Key = Key { key1 :: String
               , key2 :: Maybe Int
               } deriving (Show, Generic)

But what if it were like the first example, where a key can not only be missing but can also hold a value of a completely different type?

What if you only cared about the numbers, or only about the strings? Would there be a way of parsing them without going outside the type definitions?

After some quick searching I found that the Alternative class is meant for exactly this kind of problem, and that operators like *>, <> and <|> can prove useful, but I'm not sure how.

I know I need to define a type that covers all three cases, depending on whether I want just the text or just the numbers, like

data NeededVal = NoValue | TextValue | Needed Int

or

data NeededVal = NoValue | NumericValue | Needed String

but I'm not sure how I'd go about making them an instance of Applicative & Alternative so that the idea would work out.

This is a short follow-up of my previous question


Solution

  • Well, I tried playing with the JSON below:

    {
      "root": {
        "m1": {
          "key1": "value1",
          "key2": 2
        },
        "m2": {
          "key1": 1
        }
      }
    }
    

    and parsed it into the following data types using Data.Aeson:

    data Root = Root (Map String Key) deriving (Show)
    
    data NeededVal = NoValue | NumericValue | Needed String deriving (Show)
    
    data Key = Key { key1 :: NeededVal , key2 :: NeededVal } deriving (Show)
    

    To handle NoValue, I use the Alternative operator <|>:

    instance FromJSON Key where
        parseJSON = withObject "Key" $ \obj -> do
            -- If the lookup fails (for example, the key is absent), fall back to NoValue.
            k1 <- obj .: pack "key1" <|> pure NoValue
            k2 <- obj .: pack "key2" <|> pure NoValue
            return (Key k1 k2)
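
One thing to keep in mind with `<|> pure NoValue`: the fallback fires on any failure of the left-hand parser, not only on a missing key. With NeededVal that makes no difference, because its parser never fails. For a field whose parser can fail, aeson's `.:?` and `.!=` operators give a default only when the key is absent or null. The sketch below compares the two; the `LenientRetries` and `CheckedRetries` types, the "retries" key and the `decodeBoth` helper are hypothetical names made up for this illustration.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Control.Applicative ((<|>))
import Data.Aeson (FromJSON (..), eitherDecode, withObject, (.!=), (.:), (.:?))
import Data.ByteString.Lazy (ByteString)

-- Hypothetical wrappers around an optional "retries" field that defaults to 0.
newtype LenientRetries = LenientRetries Int deriving (Show, Eq)
newtype CheckedRetries = CheckedRetries Int deriving (Show, Eq)

-- (<|>) falls back on every failure, including a value of the wrong type.
instance FromJSON LenientRetries where
    parseJSON = withObject "LenientRetries" $ \obj ->
        LenientRetries <$> (obj .: "retries" <|> pure 0)

-- (.:?) with (.!=) falls back only when the key is missing or null;
-- a value of the wrong type is still reported as an error.
instance FromJSON CheckedRetries where
    parseJSON = withObject "CheckedRetries" $ \obj ->
        CheckedRetries <$> obj .:? "retries" .!= 0

-- Decode the same input both ways to compare the two behaviours.
decodeBoth :: ByteString -> (Either String LenientRetries, Either String CheckedRetries)
decodeBoth input = (eitherDecode input, eitherDecode input)
```

Loaded in GHCi with OverloadedStrings enabled, `decodeBoth "{}"` defaults to 0 on both sides, while `decodeBoth "{\"retries\": \"three\"}"` gives `Right (LenientRetries 0)` on the left and a `Left` error on the right. Which behaviour is preferable depends on whether a badly typed value should be silently ignored.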
    

    To tell string and numeric values apart, I pattern match on the Value constructors:

    instance FromJSON NeededVal where
        parseJSON (String txt) = return $ Needed $ unpack txt
        parseJSON (Number _)   = return NumericValue
        parseJSON _            = return NoValue
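
The question also lists the mirrored variant, `NoValue | TextValue | Needed Int`, where the number is kept and text is only noted. A minimal sketch of an instance for that variant is below. It chains `<|>` over aeson's existing Int and String parsers instead of matching on constructors; this is one possible approach, not the only one. The `examples` list is a hypothetical name added for illustration.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Control.Applicative ((<|>))
import Data.Aeson (FromJSON (..), Value (..))
import Data.Aeson.Types (Parser, parseMaybe)

-- The question's second variant: keep integers, only record that text was seen.
data NeededVal = NoValue | TextValue | Needed Int
    deriving (Show, Eq)

instance FromJSON NeededVal where
    parseJSON v =
            (Needed <$> parseJSON v)                       -- succeeds on integral numbers
        <|> (TextValue <$ (parseJSON v :: Parser String))  -- succeeds on JSON strings
        <|> pure NoValue                                   -- anything else

-- The three kinds of value, parsed directly from aeson Values.
-- Evaluates to [Just (Needed 2), Just TextValue, Just NoValue].
examples :: [Maybe NeededVal]
examples = map (parseMaybe parseJSON) [Number 2, String "value1", Null]
```

Note that a non-integral number such as 2.5 is rejected by the Int parser and therefore ends up as NoValue in this sketch.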
    

    To step past the "root" wrapper and read the m1 and m2 objects straight into the Map, I wrote the Root instance as:

    -- Assumes aeson < 2.0, where Object is a HashMap Text Value.
    import Data.Map as Map (Map, fromList)
    import Data.HashMap.Strict as HM (toList, lookup)
    import Data.Aeson.Types (Parser)

    instance FromJSON Root where
        parseJSON = withObject "Root"
                        $ \rootObj-> case HM.lookup (pack "root") rootObj of
                                        Nothing  -> fail "no Root"
                                        Just val -> withObject "Key List" mkRoot val
            -- Parse every entry under "root" as a Key and rebuild the Map.
            where mkRoot obj =
                    let (ks, vs) =  unzip $ HM.toList obj
                        ks' = map unpack ks
                    in  do vs' <- mapM parseJSON vs::Parser [Key]
                           return $ Root $ Map.fromList $ zip ks' vs'
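
As a possible simplification, aeson already provides a FromJSON instance for `Map String v`, so the "root" field can be fetched with `.:` like any other field. The sketch below repeats the three types so that it stands alone and does not touch the Object representation directly. It uses OverloadedStrings instead of `pack`, and `decodeRoot` is a hypothetical helper name.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Control.Applicative ((<|>))
import Data.Aeson (FromJSON (..), Value (..), eitherDecode, withObject, (.:))
import Data.ByteString.Lazy (ByteString)
import qualified Data.Map as Map
import qualified Data.Text as T

data NeededVal = NoValue | NumericValue | Needed String deriving (Show, Eq)

data Key = Key { key1 :: NeededVal, key2 :: NeededVal } deriving (Show, Eq)

newtype Root = Root (Map.Map String Key) deriving (Show, Eq)

instance FromJSON NeededVal where
    parseJSON (String txt) = pure (Needed (T.unpack txt))
    parseJSON (Number _)   = pure NumericValue
    parseJSON _            = pure NoValue

instance FromJSON Key where
    parseJSON = withObject "Key" $ \obj ->
        Key <$> (obj .: "key1" <|> pure NoValue)
            <*> (obj .: "key2" <|> pure NoValue)

-- The Map String Key under "root" is parsed by aeson's own Map instance.
instance FromJSON Root where
    parseJSON = withObject "Root" $ \obj -> Root <$> obj .: "root"

-- Hypothetical convenience wrapper around eitherDecode.
decodeRoot :: ByteString -> Either String Root
decodeRoot = eitherDecode
```

One difference from the HashMap version: when "root" is missing, the error message is the one aeson generates for a missing key rather than "no Root".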
    

    and the final result:

    Right (Root (fromList [
        ("m1",Key {key1 = Needed "value1", key2 = NumericValue}),
        ("m2",Key {key1 = NumericValue, key2 = NoValue})]
    ))
    

    Side notes:

    "but I'm not sure how I'd go about making them an instance of Applicative & Alternative so that the idea would work out."

    No, there is no need to make them instances of Applicative and Alternative. The <|> operator applies to Parser (defined in Data.Aeson.Types), not to the user-defined data type, and Parser is already an instance of Alternative.
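
To make that point concrete, here is a small sketch in which the fallback logic is written once against the Alternative class and then used at two existing instances, Parser and Maybe. No instance is defined for any user type. The names `orElse`, `intOrZero`, `firstJust` and `demo` are hypothetical.

```haskell
import Control.Applicative (Alternative, (<|>))
import Data.Aeson (FromJSON (..), Value (..))
import Data.Aeson.Types (Parser, parseMaybe)

-- Generic fallback: works for any Alternative, with no extra instances.
orElse :: Alternative f => f a -> a -> f a
orElse p def = p <|> pure def

-- Used at Parser: read an Int, or fall back to 0 on any failure.
intOrZero :: Value -> Parser Int
intOrZero v = parseJSON v `orElse` 0

-- Used at Maybe: the same operator, a different Alternative instance.
firstJust :: Maybe Int
firstJust = Nothing `orElse` 7

-- Evaluates to [Just 5, Just 0, Just 7].
demo :: [Maybe Int]
demo = [parseMaybe intOrZero (Number 5), parseMaybe intOrZero Null, firstJust]
```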