mysql collation table-lock

(MySQL 5.7.19 AWS RDS) How to change a table column's character set without locking


I want to change a table's character set from 'utf8' to 'utf8mb4',
but each column has its own character set setting (utf8).
So I need to change each column's character set to 'Table Default', but locking is the problem.
Please help me change the column character set without locking the table.

There are over 100,000,000 rows in the table.


Solution

  • "Character set" is the encoding of characters into bytes.
    "Collation" is how characters are compared and sorted.

    An INDEX on a VARCHAR is sorted by its collation, so changing the collation of a column requires rebuilding an index -- a non-trivial operation.

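    The difference between utf8 and utf8mb4 is relatively minor (utf8 stores at most 3 bytes per character, utf8mb4 up to 4, and existing utf8 data is encoded identically in utf8mb4), but I don't think MySQL (hence RDS) has made a special case of that.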

    ALTER TABLE t CONVERT TO CHARACTER SET utf8mb4; sounds like the operation that you desire. In 5.7 that requires ALGORITHM=COPY, so it is 'locking': writes to the table are blocked while the copy runs.
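
    As a rough sketch of that conversion (the schema name mydb, the table name t, and the choice of utf8mb4_unicode_ci are placeholders and assumptions, not taken from the question), you can first list the columns that carry their own utf8 setting, then ask for a non-locking algorithm explicitly on a test copy. When the requested ALGORITHM or LOCK level is not supported, MySQL should reject the statement with an error instead of quietly falling back to a copy, which makes this a cheap way to find out what is possible on your version.

    ```sql
    -- List columns that still carry an explicit utf8 character set.
    -- 'mydb' and 't' are placeholder names.
    SELECT column_name, character_set_name, collation_name
    FROM information_schema.columns
    WHERE table_schema = 'mydb'
      AND table_name = 't'
      AND character_set_name = 'utf8';

    -- Probe (run on a test copy): request a non-locking in-place change.
    -- On 5.7 this is expected to fail with an error rather than run.
    ALTER TABLE t
      CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci,
      ALGORITHM=INPLACE, LOCK=NONE;

    -- The blocking form: copies the table; reads are allowed, writes wait.
    ALTER TABLE t
      CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci,
      ALGORITHM=COPY, LOCK=SHARED;
    ```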

    Look into pt-online-schema-change and gh-ost as a way of altering a table, even when it needs to "copy". These are essentially non-blocking. However, I do not know if they can be used with RDS. Also, because of JOINs and other cases where one table may need to be consistent with another, those tools may not be practical.

    Another approach... Add another column(s); change your code to use both the old and new column(s). Meanwhile, gradually copy the old values to the new column(s); when this is finished, change your code again -- this time to use the new column(s) instead of the old. At some later date, worry about dropping the dead column(s).
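
    A minimal sketch of that add-and-backfill approach, assuming a table t with an integer primary key id and one utf8 column called name (all of these names, the VARCHAR length, and the batch size are hypothetical). In 5.7 the in-place ADD COLUMN still rebuilds the table, so it is not fast, but it should allow concurrent reads and writes.

    ```sql
    -- Add the new utf8mb4 column without blocking writes.
    -- name_mb4 is a hypothetical name for the replacement column.
    ALTER TABLE t
      ADD COLUMN name_mb4 VARCHAR(255)
        CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NULL,
      ALGORITHM=INPLACE, LOCK=NONE;

    -- Backfill one primary-key range at a time so each transaction stays short.
    -- Repeat with the next range (10001-20000, ...) until the table is covered.
    -- The IS NULL check avoids overwriting values the application already wrote.
    UPDATE t
    SET name_mb4 = name
    WHERE id BETWEEN 1 AND 10000
      AND name_mb4 IS NULL;
    ```

    Keeping each batch small limits how long row locks are held and how far replicas fall behind; the right batch size is something to measure on a test copy.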

    Recent versions of MySQL have made significant changes in the speed of ALTER, so be sure to study what version RDS is derived from. In 5.6 and later, ADD COLUMN can use ALGORITHM=INPLACE; in 8.0 (from 8.0.12), ALGORITHM=INSTANT. I think either of those is non-"locking" for your purposes. (DROP COLUMN is not cheap; the issues with JOIN and rebuilding indexes are still up in the air.)

    If you try one of these techniques, I strongly recommend you build a table with at least a million rows and try out all the steps (alter add, join, recreate index, alter drop column, etc) to verify what parts are "fast enough" and/or "non-locking".
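
    One possible way to build such a test table on 5.7, which has no recursive CTEs, is to cross-join a small digits table with itself. The table names t_test and digits, the column layout, and the index are all made up for illustration; shape the test table like your real one.

    ```sql
    -- Test table with a utf8 column and an index on it.
    CREATE TABLE t_test (
      id   INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
      name VARCHAR(100) CHARACTER SET utf8 NOT NULL,
      KEY idx_name (name)
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8;

    -- Helper table holding the digits 0-9.
    CREATE TABLE digits (d TINYINT UNSIGNED NOT NULL PRIMARY KEY);
    INSERT INTO digits VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);

    -- Six-way cross join: 10^6 = 1,000,000 generated rows.
    INSERT INTO t_test (name)
    SELECT CONCAT('row-', a.d, b.d, c.d, e.d, f.d, g.d)
    FROM digits a, digits b, digits c, digits e, digits f, digits g;

    -- Time each step you plan to run in production, for example:
    ALTER TABLE t_test CONVERT TO CHARACTER SET utf8mb4;
    ```

    While an ALTER is running, try an INSERT or UPDATE against t_test from a second session to see whether it blocks.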