I have an Azure Function built on .NET 8.0 that reads data from an Excel sheet, deletes all rows from a table in Azure SQL Database, and then inserts around 67,000 new rows.
I was initially doing the insert one row at a time with this code:
using (IDbConnection conn = new SqlConnection(connectionString))
{
    conn.Open();

    // Clear the table first
    await conn.ExecuteAsync("DELETE FROM Tactic", commandTimeout: 3000);

    // Insert each row individually
    string sql = @"INSERT INTO Tactic (MSID, UMSID, Publisher, Created, Modified)
                   VALUES (@MSID, @UMSID, @Publisher, GETDATE(), GETDATE());";
    foreach (var tactic in tactics)
    {
        await conn.ExecuteAsync(sql, tactic, commandTimeout: 300);
    }
}
which took around 4 minutes to insert all the rows.
I then decided to use Chunk with Dapper and send 5,000 rows at a time, as follows:
using (IDbConnection conn = new SqlConnection(connectionString))
{
    conn.Open();

    await conn.ExecuteAsync("DELETE FROM Tactic", commandTimeout: 3000);

    string sql = @"INSERT INTO Tactic (MSID, UMSID, Publisher, Created, Modified)
                   VALUES (@MSID, @UMSID, @Publisher, GETDATE(), GETDATE());";

    // Send the rows in chunks of 5,000
    var batchSize = 5000;
    foreach (var batch in tactics.Chunk(batchSize))
    {
        await conn.ExecuteAsync(sql, batch, commandTimeout: 60);
    }
}
but it took almost the same time to insert the rows.
I'm not sure why; I expected the second approach to bring a significant performance improvement. Also, when I run this query repeatedly during the insert operation:
SELECT COUNT(*) FROM [dbo].[Tactic]
I can see the count increase by about 150 rows per run, for both methods. Is there a limit on the database itself?
You can use SqlBulkCopy instead. Note that passing a collection to Dapper's ExecuteAsync doesn't batch anything: it simply executes the same INSERT statement once per element, one row per statement, which is why both of your versions take about the same time.
SQL Server has a built-in mechanism for importing large volumes of data, called bulk insert. Luckily for us, .NET supports bulk insert through the SqlBulkCopy class.
Besides the clear performance advantage over the other approaches, you can also easily tweak the behavior with a number of options.
To use SqlBulkCopy, create a new instance of the class and set the destination table. You then write the list into the Tactic table with the WriteToServer method, which expects the rows as a DataTable (it also accepts an IDataReader or a DataRow[]).
Something like this:
using (var copy = new SqlBulkCopy(connectionString))
{
    copy.DestinationTableName = "dbo.Tactic";

    // Add mappings so that the column order doesn't matter
    copy.ColumnMappings.Add(nameof(Tactic.MSID), "MSID");
    copy.ColumnMappings.Add(nameof(Tactic.UMSID), "UMSID");
    copy.ColumnMappings.Add(nameof(Tactic.Publisher), "Publisher");

    copy.WriteToServer(ToDataTable(tactics));
}
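Note that ToDataTable isn't a built-in method; you have to provide it yourself. A minimal sketch, assuming Tactic exposes MSID, UMSID, and Publisher properties (I'm guessing at the column types here; adjust them to match your schema):

using System.Data;

private static DataTable ToDataTable(IEnumerable<Tactic> tactics)
{
    var table = new DataTable();

    // Column names must match the ColumnMappings configured above
    table.Columns.Add("MSID", typeof(string));      // assumed type
    table.Columns.Add("UMSID", typeof(string));     // assumed type
    table.Columns.Add("Publisher", typeof(string)); // assumed type

    foreach (var tactic in tactics)
    {
        table.Rows.Add(tactic.MSID, tactic.UMSID, tactic.Publisher);
    }

    return table;
}

Since your function is already async, you can also await copy.WriteToServerAsync(...) instead of the synchronous call, and SqlBulkCopy exposes a BatchSize property and SqlBulkCopyOptions (such as TableLock) if you want to tune very large loads.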
And I would ensure that Created and Modified default to the current time on the database side, since the bulk copy no longer calls GETDATE() for each row.
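Assuming the defaults don't exist yet, something along these lines on the SQL side (the constraint names are just illustrative):

ALTER TABLE dbo.Tactic ADD CONSTRAINT DF_Tactic_Created DEFAULT (GETDATE()) FOR Created;
ALTER TABLE dbo.Tactic ADD CONSTRAINT DF_Tactic_Modified DEFAULT (GETDATE()) FOR Modified;

Because Created and Modified aren't included in the column mappings, the defaults will be applied to every bulk-copied row.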