excelvbamultidimensional-arrayarray-merge

Why I could not simply read sheets and merge all the content in a single Variant?


Context: I'm trying, in VBA / Excel, to read the content of multiple large sheets (arround ~100k rows in total) and to store it in a Variant in order to work on the data in memory.

Why : I want to work on this data in memory to gain speed, instead of working and writing on the sheets directly

My code was working well with one sheet, the problems begins when there is more than one sheet.

The problem : I want to join the content of the different sheets in only one Variant, in order to work on it

The code :

Dim ws As Worksheet
Dim arrData As Variant ' The variant to store the data
Dim lastSheetLine As Double
Dim lineMemory As Double

lineMemory = 0

For Each ws In ThisWorkBook.Worksheets ' Parse all the sheets

     lastSheetLine = Worksheets(ws.name).Cells(Worksheets(ws.name).Rows.Count, "A").End(xlUp).Row ' find the last line of the sheet

   If lineMemory = 0 Then
     arrData(0, 15) = Worksheets(ws.name).Range("A2:O" & lastSheetLine).Value ' Store the sheet on the Variant
     lineMemory = lastSheetLine
   Else
     lineMemory = lineMemory + lastSheetLine ' Increment to get the position where to put the block
     arrData(lineMemory, 15) = Worksheets(ws.name).Range("A2:O" & lastSheetLine).Value
   End If

Next ws

In this multi-dimensional array, the second dimension is always 15 (there is always the same number of columns), it is the first dimension that will change from sheet to sheet.

Thanks a lot in advance for your help, I'm open for everything


Solution

  • To merge arrays having the same number of columns, please try the next way:

    1. Declare a variable on top of a standard module (in the declarations area):
      Private arrFin()
    
    1. Copy the next code in the same standard module:
    Sub TestMrgeArrays()
     Dim wb As Workbook, ws As Worksheet, lastRow As Long
    
     Set wb = ThisWorkbook
     For Each ws In wb.Worksheets ' Parse all the sheets
         lastRow = ws.cells(ws.rows.count, "A").End(xlUp).row ' find the last line of the sheet
         merge2DArrays ws.Range("A2:O" & lastRow).Value ' merge with the final one...
     Next ws
     Debug.Print UBound(arrFin), UBound(arrFin, 2): Stop
    End Sub
    
    Sub merge2DArrays(arr)
        Dim arrNew(), i As Long, j As Long, xF As Long
        
        If Not Not arrFin Then
            If UBound(arrFin, 2) <> UBound(arr, 2) Then MsgBox "Different number of columns...": Exit Sub
            ReDim arrNew(1 To UBound(arrFin) + UBound(arr), 1 To UBound(arrFin, 2))
            xF = UBound(arrFin)
            For i = 1 To UBound(arrNew)
                For j = 1 To UBound(arrNew, 2)
                    If i <= xF Then
                        arrNew(i, j) = arrFin(i, j)
                    Else
                        arrNew(i, j) = arr(i - xF, j)
                    End If
                Next j
            Next i
            arrFin = arrNew
        Else
            arrFin = arr: Exit Sub
        End If
    End Sub
    

    There is another option involving ReDim Preserve which can take the previous loaded array, transpose it and place the content in a new array, load it only with the new array content and finally equalize the final array with this (temporary) one. But Transpose has some limitations in terms of range dimensions and it will fail in your case. I can also make a non standard function transposing without error, but it will need iteration also...

    Edited:

    The next version uses a jagged array (an array of arrays) to be firstly loaded with each sheet array and be processed in a separate function:

    Sub TestMrgeArrays_JGGArray()
     Dim wb As Workbook, ws As Worksheet, arrData() As Variant
     Dim lastRow, arrFin, iCount As Long, totCount As Long
    
     Set wb = ThisWorkbook
     ReDim arrData(wb.Worksheets.count - 1) 'ReDim it to keep all the sheets (0 based 1D array)
     For Each ws In wb.Worksheets ' Parse all the sheets
         lastRow = ws.cells(ws.rows.count, "A").End(xlUp).row ' find the last line of the sheet
         arrData(iCount) = ws.Range("A2:O" & lastRow).Value ' place it in the jagged array
         iCount = iCount + 1:  totCount = totCount + lastRow - 1 ' -1 because the array is loaded from the second row...
     Next ws
     arrFin = MergeShArrays(arrData, totCount)
     Debug.Print UBound(arrFin), UBound(arrFin, 2): Stop
    End Sub
    
    Function MergeShArrays(arrD, totCount As Long) As Variant
      Dim i As Long, j As Long, k As Long, iRow As Long, colNo As Long, arr, arrNew()
      colNo = UBound(arrD(0), 2)
      ReDim arr(1 To totCount, 1 To colNo)
      For i = 0 To UBound(arrD)
        arrNew = arrD(i)
        For j = 1 To UBound(arrNew)
            iRow = iRow + 1
            For k = 1 To colNo
                arr(iRow, k) = arrNew(j, k)
            Next k
        Next j
      Next i
      MergeShArrays = arr
    End Function
    

    It should be a little faster determining the total number of rows from the sheets iteration part.