
Remove null bytes from every ActiveRecord string attribute


I have a Rails application using Postgres as the database.

Postgres does not accept strings containing null bytes (for example, "a \u0000 b"). Trying to save such data raises the following error:

ActiveRecord::StatementInvalid
PG::UntranslatableCharacter: ERROR:  unsupported Unicode escape sequence
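
Whether a given string would run into this can be checked in plain Ruby, without touching the database. This is only a minimal sketch; the helper name `contains_null_byte?` is hypothetical and not part of Rails or the pg gem.

```ruby
# Hypothetical helper: true when the string holds at least one null byte.
def contains_null_byte?(string)
  string.include?("\u0000")
end

sample  = "a \u0000 b"
cleaned = sample.delete("\u0000") # non-mutating: returns a new string, leaves sample as-is

puts contains_null_byte?(sample)  # the original still contains the null byte
puts contains_null_byte?(cleaned) # the cleaned copy does not
```

`String#delete` returns a copy, while `String#delete!` modifies the receiver in place and returns nil when nothing was removed.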

I'd like to set up a rule that works for all of my ActiveRecord models and strips null bytes from every string attribute before saving. I'm thinking of something with an effect similar to the example below:

class MyModel < ApplicationRecord
  before_save :remove_null_bytes

  private

  def remove_null_bytes
    # Safe navigation avoids a NoMethodError when my_field is nil.
    my_field&.delete!("\u0000")
  end
end

But applied to every string attribute of every model, without having to repeat it in each one.

The reason I want this to be limited to Active Record string attributes is that the application also handles binary uploads and email webhooks. These might contain legit null bytes, and I do not want these to be affected by the new rule.


Solution

  • To normalize all string attributes on all models, I would add this to ApplicationRecord:

    class ApplicationRecord < ActiveRecord::Base
      # Return non-string values unchanged; a trailing `if` modifier would yield nil for them.
      normalizes *attribute_names,
        with: ->(value) { value.is_a?(String) ? value.delete("\u0000") : value }
    end
    

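    Note that normalizes was introduced in Ruby on Rails 7.1 and is not available on older versions. Also be aware that attribute_names is evaluated on the class where normalizes is called; on an abstract class such as ApplicationRecord it may return an empty list, in which case the declaration would have to live in each concrete model (or a shared concern) instead.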
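
The normalizer itself is plain Ruby, so its behaviour can be checked outside Rails. The sketch below is an illustration under stated assumptions rather than part of the answer: the constant names are hypothetical, and it uses a ternary so that non-string values come back unchanged instead of as nil.

```ruby
# Hypothetical constant name; `with:` accepts any callable object.
STRIP_NULL_BYTES = lambda do |value|
  # Strings get a cleaned copy; every other value passes through untouched.
  value.is_a?(String) ? value.delete("\u0000") : value
end

# For comparison: a trailing `if` modifier evaluates to nil when the condition is false.
IF_MODIFIER_VARIANT = ->(value) { value.delete("\u0000") if value.is_a?(String) }

cleaned_string = STRIP_NULL_BYTES.call("a \u0000 b") # null byte removed
untouched_int  = STRIP_NULL_BYTES.call(42)           # integer returned as-is
lost_int       = IF_MODIFIER_VARIANT.call(42)        # integer replaced by nil
```

In a model this could be wired up as `normalizes :title, with: STRIP_NULL_BYTES`, where `:title` is a placeholder attribute name.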