UPDATE: The root problem has been solved by fixing a number of Sequelize migrations that always run before mysqldump is called, as discussed in the comments that are below the article that is linked in the next paragraph. However, the core technical challenge is still interesting.
I have a problem with mysqldump that might be solved by configuring mysqldump differently, but probably will be solved by just piping the output through a shell script.
Basically, mysqldump
always outputs the tables in the same order, but it list all columns (other than id
) for each table in random order.
So, the first run might output this...
create TABLE `ONE` (
`id` int NOT NULL AUTO_INCREMENT,
`column_a` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
`column_b` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
`column_c` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB;
create TABLE `TWO` (
`id` int NOT NULL AUTO_INCREMENT,
`column_x` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
`column_y` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
`column_z` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB;
...and on the second run, it might produce something like this:
create TABLE `ONE` (
`id` int NOT NULL AUTO_INCREMENT,
`column_c` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
`column_b` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
`column_a` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB;
create TABLE `TWO` (
`id` int NOT NULL AUTO_INCREMENT,
`column_y` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
`column_x` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
`column_z` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB;
I'd like to pipe the result through a shellscript that always sorts the lines in the same way. What script would achieve this? The script needs to run on a build agent that runs on Ubuntu, so if it is possible and practical to use standard GNU tools like awk
then that would be superior to using custom tools.
Assumptions:
create table
command) resides on a separate line (eg, we won't see 2 columns listed on a single line)NOT NULL
or NULL
AUTO-INCREMENT
PRIMARY KEY
clauseAUTO-INCREMENT
column clause and the PRIMARY KEY
clauseOne approach:
One GNU awk
implementation of this approach:
awk '
BEGIN { PROCINFO["sorted_in"]="@val_str_asc" # sort arrays by value, sorted as string in asc[ending] order
delete lines # designate variable "lines" as an array
}
/^[[:space:]]/ && # if line starts with white space (ie, it is indented) and ...
!/AUTO_INCREMENT/ && # line does not contain string "AUTO_INCREMENT" and ...
/ NULL/ { lines[++cnt] = $0 # line contains a single white space + string "NULL", then save current line in array
next # skip to next line of input
}
cnt { for (i in lines) # loop through indices of array in @val_str_asc sorted order
print lines[i] # print array value to stdout
delete lines # reset array
cnt = 0 # reset counter (aka array index)
}
1 # print current line of input to stdout
' sample.sql
NOTES:
sample.sql
is an exact copy of the 2nd pair of create table
commands from OP's question (ie, the example with the out-of-order column_X
rows)GNU awk
for PROCINFO["sort_in"]
support which in turn is used to sort the contents of the array; see gawk predefined sorting orders for detailsThis generates:
create TABLE `ONE` (
`id` int NOT NULL AUTO_INCREMENT
`column_a` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
`column_b` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
`column_c` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB;
create TABLE `TWO` (
`id` int NOT NULL AUTO_INCREMENT,
`column_x` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
`column_y` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
`column_z` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB;