Tags: validation, clojure, sanitization, plumatic-schema

Prismatic schema: removing unanticipated keys


My API is receiving some JSON data from the client.

I would like to use Schema to validate and coerce the data I receive, with one additional requirement: any map key that is not described in the schema should be ignored and removed rather than causing validation to fail. My client may send some "garbage" properties along with the ones I care about, and I want to be tolerant of that.

So in a nutshell, I would like to perform a "deep select-keys" on my input data using my schema, before validation/coercion.

Example of what I need:

(require '[schema.core :as sc])
(def MySchema {:a sc/Int
               :b {:c sc/Str
                   (sc/optional-key :d) sc/Bool}
               :e [{:f sc/Inst}]})

(sanitize-and-validate
  MySchema
  {:a 2
   :b {:c "hello"
       :$$garbage-key 32}
   :e [{:f #inst "2015-07-23T12:29:51.822-00:00" :garbage-key 42}]
   :_garbage-key1 "woot"})
=> {:a 2
    :b {:c "hello"}
    :e [{:f #inst "2015-07-23T12:29:51.822-00:00"}]}

I haven't yet found a reliable way of doing this:

  1. I can't seem to do it in a custom transformation, because a walker does not appear to give you access to the keys.
  2. I haven't had any luck walking the schema by hand: it is hard to tell map schemas from scalar schemas in a generic way, and hard to account for all the possible shapes a schema can take.
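To illustrate item 2, here is a minimal sketch of a hand-written "deep select-keys". It assumes a simplified, hypothetical "shape" built only from plain maps with keyword keys and one-element vectors, not a real Schema. The names `deep-select-keys` and `shape` are illustrative. It does not understand wrappers such as sc/optional-key, which is where the by-hand approach starts to get hard.

```clojure
(defn deep-select-keys
  "Keeps only the keys of `data` that also appear in `shape`, recursively.
  `shape` is a hypothetical simplified description (plain maps and
  one-element vectors), not a Schema."
  [shape data]
  (cond
    ;; Map shape: keep only the keys the shape names, recursing into values.
    (and (map? shape) (map? data))
    (reduce-kv (fn [acc k sub-shape]
                 (if (contains? data k)
                   (assoc acc k (deep-select-keys sub-shape (get data k)))
                   acc))
               {}
               shape)

    ;; Vector shape: apply its single element shape to every item.
    (and (vector? shape) (sequential? data))
    (mapv #(deep-select-keys (first shape) %) data)

    ;; Anything else is treated as a leaf and returned unchanged.
    :else data))
```

Called with a shape such as `{:a :leaf, :b {:c :leaf, :d :leaf}, :e [{:f :leaf}]}`, this drops every key the shape does not mention and leaves absent keys such as `:d` out of the result.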

Is there an obvious way I'm not seeing?

Thanks!


Solution

  • A third solution, credit to abp: use schema.coerce/coercer with a matcher that removes unknown keys from maps.

    (require '[schema.core :as s])
    (require '[schema.coerce :as coerce])
    (require '[schema.utils :as utils])
    
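    (defn filter-schema-keys
      "Removes from m every key that is neither in schema-keys nor
      accepted by extra-keys-walker (when one is given)."
      [m schema-keys extra-keys-walker]
      (reduce-kv (fn [m k v]
                   (if (or (contains? schema-keys k)
                           (and extra-keys-walker
                                (not (utils/error? (extra-keys-walker k)))))
                     m
                     (dissoc m k)))
                 m
                 m))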
    
    (defn map-filter-matcher
      "Coercion matcher: for a plain map schema, returns a function that
      drops the keys the schema does not describe; returns nil otherwise."
      [s]
      (when (or (instance? clojure.lang.PersistentArrayMap s)
                (instance? clojure.lang.PersistentHashMap s))
        ;; find-extra-keys-schema is private in schema.core, hence the var lookup.
        (let [extra-keys-schema (#'s/find-extra-keys-schema s)
              extra-keys-walker (when extra-keys-schema (s/walker extra-keys-schema))
              explicit-keys (some->> (dissoc s extra-keys-schema)
                                     keys
                                     (mapv s/explicit-schema-key)
                                     (into #{}))]
          (when (or extra-keys-walker (seq explicit-keys))
            (fn [x]
              (if (map? x)
                (filter-schema-keys x explicit-keys extra-keys-walker)
                x))))))
    

    This was described as the cleanest solution by the primary author of Schema, as it does not require any change to the schema itself. So it's probably the way to go.

    Usage example:

    (def data {:a 2
               :b {:c "hello"
                   :$$garbage-key 32}
               :e [{:f #inst "2015-07-23T12:29:51.822-00:00" :garbage-key 42}]
               :_garbage-key1 "woot"})
    ((coerce/coercer MySchema map-filter-matcher) data)
    ;=> {:a 2, :b {:c "hello"}, :e [{:f #inst "2015-07-23T12:29:51.822-00:00"}]}
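Since the question also asks for coercion of JSON input, the key-filtering matcher can probably be combined with Schema's built-in JSON matcher. This is a minimal sketch, assuming a Schema version that provides coerce/first-matcher and coerce/json-coercion-matcher. `sanitizing-json-coercer` is a hypothetical helper name, not part of Schema, and it takes the key-filtering matcher as an argument.

```clojure
(require '[schema.coerce :as coerce])

(defn sanitizing-json-coercer
  "Hypothetical helper: builds a coercer for `schema` that tries
  `key-filter-matcher` first and falls back to the JSON matcher."
  [schema key-filter-matcher]
  (coerce/coercer
    schema
    ;; first-matcher uses the first matcher that returns a coercion
    ;; function for a given (sub)schema.
    (coerce/first-matcher [key-filter-matcher
                           coerce/json-coercion-matcher])))
```

With the definitions above, `((sanitizing-json-coercer MySchema map-filter-matcher) data)` should strip unknown keys wherever a map schema is matched and apply the JSON coercions to the leaf schemas.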