vbaexcel

Extract top N entries from each category


By way of simplified example, say you have the following dataset:

 A      B     C
Name  Group Amount
Dave    A     2
Mike    B     3
Adam    C     4
Charlie A     2
Edward  B     5
Fiona   B     5
Georgie A     4
Harry   C     1
Mary    A     0
Delia   A     0
Victor  B     1
Dennis  B     0
Erica   A     4
Will    B     4

I'm trying to extract the highest N entries (let's say 2 in this example) from each group.

For example, the highest two entries in group A are Georgie and Erica with 4. I also then want the highest two entries for group B and C.

I want the VBA code to extract these rows and paste them on another worksheet for subsequent analysis.

I have tried code like this so far:

ActiveWorkbook.Worksheets("Sheet1").AutoFilter.Sort.SortFields.Add Key:=Range _
    ("C1"), SortOn:=xlSortOnValues, Order:=xlDescending, DataOption:= _
    xlSortNormal
With ActiveWorkbook.Worksheets("Sheet1").AutoFilter.Sort
    .Header = xlYes
    .MatchCase = False
    .Orientation = xlTopToBottom
    .SortMethod = xlPinYin
    .Apply
End With
ActiveSheet.Range("$A$1:$C$15").AutoFilter Field:=2, Criteria1:="A"
Range("A5:C6").Select
Selection.Copy
Sheets("Sheet2").Select
Range("A2").Select
ActiveSheet.Paste
Sheets("Sheet1").Select
ActiveSheet.Range("$A$1:$C$15").AutoFilter Field:=2, Criteria1:="B"
Range("A2:C3").Select
Application.CutCopyMode = False
Selection.Copy
Sheets("Sheet2").Select
Range("E2").Select
ActiveSheet.Paste
Sheets("Sheet1").Select
ActiveSheet.Range("$A$1:$C$15").AutoFilter Field:=2, Criteria1:="C"
Range("A4:C11").Select
Application.CutCopyMode = False
Selection.Copy
Sheets("Sheet2").Select
Range("I2").Select
ActiveSheet.Paste

In short, I'm just sorting the values from largest to smallest, and then filtering for each group, and extracting the top two values. The code is not resilient, however, as the copy part depends on the names being in a particular order, which will change when I get new data.

Is there a cleverer, cleaner way of doing this?


Solution

  • Does this have to be VBA? It can be done with formulas.

    Based on your provided sample data, you could setup Sheet2 like this:

    tigeravatar example for Will T-E

    In cell A4 and copied down is this formula:

    =IF($C4="","",INDEX(Sheet1!$A$2:$A$15,MATCH(1,INDEX((Sheet1!$B$2:$B$15=$B4)*(Sheet1!$C$2:$C$15=$C4)*(COUNTIFS($A$3:$A3,Sheet1!$A$2:$A$15,$B$3:$B3,$B4)=0),),0)))
    

    In cell B4 and copied down is this formula:

    =IF(($B$1>0)*COUNT($B$1),IF(OR($B3="Group",COUNTIF($B$3:$B3,$B3)=$B$1),IFERROR(INDEX(Sheet1!$B$2:$B$15,MATCH(0,INDEX(COUNTIF($B$3:$B3,Sheet1!$B$2:$B$15),),0)),""),$B3),"")
    

    In cell C4 and copied down is this formula:

    =IF(OR($B4="",COUNTIF(Sheet1!$B$2:$B$15,$B4)<COUNTIF($B$4:$B4,$B4)),"",LARGE(INDEX(Sheet1!$C$2:$C$15*(Sheet1!$B$2:$B$15=$B4),),COUNTIF($B$4:$B4,$B4)))
    

    Note that you can copy those formulas down quite a ways, and it will only show desired results. Extra rows will simply be blank. You can also change the number in cell B1 to be whatever the number of top entries to be, so you could see top 5 per category, or top 3, etc.

    However, if it absolutely must be VBA, then something like this should work for you. It's not simple, but it is very efficient and flexible. All you would need to do is update lNumTopEntries, your sheetnames, and where your data is located for the Set rngData line:

    Sub tgr()
    
        Dim wsData As Worksheet
        Dim wsDest As Worksheet
        Dim rngData As Range
        Dim rngFound As Range
        Dim rngUnqGroups As Range
        Dim GroupCell As Range
        Dim lCalc As XlCalculation
        Dim aResults() As Variant
        Dim aOriginal As Variant
        Dim lNumTopEntries As Long
        Dim i As Long, j As Long, k As Long
    
        'Change to grab the top X number of entries per category'
        lNumTopEntries = 2
    
        Set wsData = ActiveWorkbook.Sheets("Sheet1")    'This is where your data is'
        Set wsDest = ActiveWorkbook.Sheets("Sheet2")    'This is where you want to output it'
    
        Set rngData = wsData.Range("A1", wsData.Cells(Rows.Count, "C").End(xlUp))
        aOriginal = rngData.Value   'Store original values so you can set them back later'
    
        'Turn off calculation, events, and screenupdating'
        'This allows code to run faster and prevents "screen flickering"'
        With Application
            lCalc = .Calculation
            .Calculation = xlCalculationManual
            .EnableEvents = False
            .ScreenUpdating = False
        End With
    
        'If there are any problems with the code, make sure the calculation, events, and screenupdating get turned back on'
        On Error GoTo CleanExit
    
        With rngData
            .Sort .Resize(, 1).Offset(, 1), xlAscending, .Resize(, 1).Offset(, 2), , xlDescending, Header:=xlYes
        End With
    
        With rngData.Resize(, 1).Offset(, 1)
            .AdvancedFilter xlFilterInPlace, , , True
            Set rngUnqGroups = .Offset(1).Resize(.Rows.Count - 1).SpecialCells(xlCellTypeVisible)
            .Parent.ShowAllData 'Remove the filter
    
            ReDim aResults(1 To rngUnqGroups.Cells.Count * lNumTopEntries, 1 To 3)
            i = 0
    
            For Each GroupCell In rngUnqGroups
                Set rngFound = .Find(GroupCell.Value, .Cells(.Cells.Count))
                k = 0
                If Not rngFound Is Nothing Then
                    For j = i + 1 To i + lNumTopEntries
                        If rngFound.Offset(j - i - 1).Value = GroupCell.Value Then
                            k = k + 1
                            aResults(j, 1) = rngFound.Offset(j - i - 1, -1).Value
                            aResults(j, 2) = rngFound.Offset(j - i - 1).Value
                            aResults(j, 3) = rngFound.Offset(j - i - 1, 1).Value
                        End If
                    Next j
                    i = i + k
                End If
            Next GroupCell
        End With
    
        'Output results'
        wsDest.Range("A2").Resize(UBound(aResults, 1), UBound(aResults, 2)).Value = aResults
    
    CleanExit:
        'Turn calculation, events, and screenupdating back on'
        With Application
            .Calculation = lCalc
            .EnableEvents = True
            .ScreenUpdating = True
        End With
    
        If Err.Number <> 0 Then
            'There was an error, show the error'
            MsgBox Err.Description, , "Error: " & Err.Number
            Err.Clear
        End If
    
        'Put data back the way it was
        rngData.Value = aOriginal
    
    End Sub