pandas dataframe one-to-many multi-index

How to manage row spans and column spans with two level indexing

I have the following dataframe, mapping a one-to-many relationship between "courses" and "lessons":

   course_id       course_name  lesson_id     lesson_title
0          0          Learn C#          1              foo
1          0          Learn C#          2              bar
2          0          Learn C#          3              baz
3          1  Origami together          1        the crane
4          1  Origami together          2  crease patterns
5          2        WIP course          1        the first

How do I format it so that:

each lesson row falls within the span of the course row it belongs to
lesson_id and lesson_title columns are under the span of a common lessons column

as shown below:

                                            lessons
   course_id       course_name         id            title
0          0          Learn C#          1              foo
1                                       2              bar
2                                       3              baz
3          1  Origami together          1        the crane
4                                       2  crease patterns
5          2        WIP course          1        the first

and producing an output similar to this when exported to Excel:

Looking at similar questions, I found that the accepted answers use a multi-index, but in this case the first level of the index would have to include all course-related columns.

On top of that, the starting table is actually dynamically generated from the corresponding Course and Lesson dataclasses, so I fear this approach wouldn't scale well if I were to add attributes to the Course class.

Ideally I would index by course_id and lesson_id, then specify which columns are indexed by the former and which by the latter, so that course attributes are not duplicated for each lesson.

Is there a way to achieve that?

Solution

If you need a MultiIndex in both the index and the columns, you can use:

out = df.set_index(['course_id','course_name'])
out.columns = out.columns.str.split('_', expand=True)
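
For reference, here is a minimal self-contained sketch of this first approach. The sample data is rebuilt by hand from the table in the question, which is an assumption, since the real table is generated from dataclasses:

```python
import pandas as pd

# Sample data copied from the question (hand-built here for illustration)
df = pd.DataFrame({
    "course_id": [0, 0, 0, 1, 1, 2],
    "course_name": ["Learn C#"] * 3 + ["Origami together"] * 2 + ["WIP course"],
    "lesson_id": [1, 2, 3, 1, 2, 1],
    "lesson_title": ["foo", "bar", "baz", "the crane", "crease patterns", "the first"],
})

# Course columns become the row index
out = df.set_index(["course_id", "course_name"])

# "lesson_id" -> ("lesson", "id"), "lesson_title" -> ("lesson", "title")
out.columns = out.columns.str.split("_", expand=True)

print(out)
```

Note that with only two index levels, course_name is still repeated on every row when printed; only course_id is shown as a span.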

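If you need row spans for both levels, here is a trick: add a helper column of empty strings as a third index level. pandas always displays the last index level in full, so the two course levels before it are both shown as spans: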

out = df.assign(**{'':''}).set_index(['course_id','course_name', ''])
out.columns = out.columns.str.split('_', expand=True)

print(out)
                            lesson                 
                                id            title
course_id course_name                              
0         Learn C#               1              foo
                                 2              bar
                                 3              baz
1         Origami together       1        the crane
                                 2  crease patterns
2         WIP course             1        the first
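
Regarding the concern about adding attributes to Course: the index columns can be picked by name prefix instead of being listed by hand. This is only a sketch, assuming every course column starts with course_ and every lesson column starts with lesson_. The helper name span_by_prefix and the extra course_level column are hypothetical:

```python
import pandas as pd

def span_by_prefix(df, prefix="course_"):
    """Index by every column starting with `prefix` (hypothetical helper)."""
    index_cols = [c for c in df.columns if c.startswith(prefix)]
    # The empty helper level goes last so that all course levels are shown as spans
    out = df.assign(**{"": ""}).set_index(index_cols + [""])
    # Split only on the first underscore: "lesson_title" -> ("lesson", "title")
    out.columns = out.columns.str.split("_", n=1, expand=True)
    return out

# Sample data with an extra, made-up course attribute
df = pd.DataFrame({
    "course_id": [0, 0, 1],
    "course_name": ["Learn C#", "Learn C#", "Origami together"],
    "course_level": ["beginner", "beginner", "all"],
    "lesson_id": [1, 2, 1],
    "lesson_title": ["foo", "bar", "the crane"],
})

out = span_by_prefix(df)
print(out)
```

New course_ columns are then picked up automatically, at the cost of relying on a naming convention.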

If you need to remove the third (empty helper) column in Excel:

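import xlwings as xw  # needs a local Excel installation

file = 'out.xlsx'
out.to_excel(file)

# Reopen the workbook and delete column C, which holds the empty helper level
wb = xw.Book(file)
wb.sheets['Sheet1'].range('C:C').delete()
wb.save(file)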