I need to be able to remove duplicate entries in a DataGridView quickly. Unfortunately, the way I am doing it can take a few minutes with anything above 100K items.
Here is the code I am using:
Dim wordlist As New List(Of String)
Dim numCols As Integer = DataGridView1.ColumnCount
Dim numRows As Integer = DataGridView1.RowCount - 1
Dim wordlist2 As New List(Of String)
For count As Integer = 0 To numRows - 1
    wordlist.Add(DataGridView1.Rows(count).Cells("url").Value)
Next
For Each word As String In wordlist
    If Not wordlist2.Contains(word) Then
        wordlist2.Add(word)
    End If
Next
fullitem.Clear()
For Each word2 As String In wordlist2
    fullitem.Add(New item(word2, "", ""))
Next
DataGridView1.RowCount = fullitem.Count + 1
MessageBox.Show("Done!")
The DataGridView is in virtual mode to support massive amounts of data.
If anyone could help me figure out a fast way to remove dupes I would really appreciate it.
Instead of first adding every value to wordlist and then looping over it again to build a second, de-duplicated list, check for duplicates as you add to the first list. Each new value is also added to fullitem right away (I have no idea what fullitem is, since you don't show it). The list is then used only for the Contains check.
This way, we reduce three loops to one.
Dim wordlist As New List(Of String)
Dim numRows As Integer = DataGridView1.RowCount - 1
Dim word As String
' Note: if fullitem is what supplies the grid's values in virtual mode,
' move this Clear() to after the values have been read.
fullitem.Clear()
For count As Integer = 0 To numRows - 1
    word = CStr(DataGridView1.Rows(count).Cells("url").Value)
    ' Keep only the first occurrence of each value
    If Not wordlist.Contains(word) Then
        wordlist.Add(word)
        fullitem.Add(New item(word, "", ""))
    End If
Next
DataGridView1.RowCount = fullitem.Count + 1
MessageBox.Show("Done!")
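One further point on speed: List(Of String).Contains scans the list from the start on every call, so even the single-loop version does work proportional to the square of the row count. A HashSet(Of String) does the same membership check in roughly constant time. The following is a minimal, self-contained sketch of that idea, not a drop-in replacement: the module name DedupeSketch, the helper RemoveDuplicates, and the sample URLs are made up for illustration, and it works on a plain list of strings rather than on the grid itself.

```vbnet
Imports System
Imports System.Collections.Generic

Module DedupeSketch
    ' Hypothetical helper: returns the distinct values in first-seen order.
    Function RemoveDuplicates(values As IEnumerable(Of String)) As List(Of String)
        Dim seen As New HashSet(Of String)()
        Dim result As New List(Of String)()
        For Each value As String In values
            ' HashSet.Add returns False when the value is already present,
            ' so one call both checks for and records the value.
            If seen.Add(value) Then
                result.Add(value)
            End If
        Next
        Return result
    End Function

    Sub Main()
        ' Sample data, invented for this example
        Dim urls As New List(Of String) From {"a.com", "b.com", "a.com", "c.com", "b.com"}
        Dim unique As List(Of String) = RemoveDuplicates(urls)
        Console.WriteLine(String.Join(", ", unique)) ' prints: a.com, b.com, c.com
    End Sub
End Module
```

In the loop above, the same change would amount to declaring wordlist as a HashSet(Of String) and replacing the Contains/Add pair with a single If wordlist.Add(word) Then. The default comparison is case-sensitive. If URLs that differ only in letter case should count as duplicates, passing StringComparer.OrdinalIgnoreCase to the HashSet constructor is one option.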