python pandas dataframe csv jupyter

Combine multiple CSV files (datasets) to make a joint one


I have 5 datasets as CSV files. Each one contains the event logs from a computer for one day, Monday to Friday.

So:

Monday.csv
Tuesday.csv
Wednesday.csv
Thursday.csv
Friday.csv

How can I merge all of these into one big file? Each dataset is identical in format, with 80 columns. I would also like to keep track of which day of the week each row came from when looking at the larger dataset with all 5 days.

So all 5 CSVs would become one bigger file, like:

Week1.csv

Is this possible with pandas, or would I need another library?

Update: the question "Import multiple csv files into pandas and concatenate into one DataFrame" helped me do it.

But my CSV files have a header as the first row, and when I merge them the same header appears 5 times throughout the combined document. Is there a way to remove the first row from each file before merging them?


Solution

  • How about this?

    import pandas as pd
    import glob
    
    path = r'C:\your_path_here' # use your path
    all_files = glob.glob(path + "/*.csv")
    
    # create list to append to
    li = []
    
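    # loop through the file names in 'all_files'
    for filename in all_files:
        # skiprows=1 drops each file's header row; header=None stops pandas
        # from treating the next (data) row as column names
        df = pd.read_csv(filename, index_col=None, skiprows=1, header=None)
        li.append(df)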
    
    frame = pd.concat(li, axis=0, ignore_index=True)
    

    Note: pd.read_csv accepts a skiprows argument; skiprows=1 skips the first line of each file, which here is the header row.

    Check out this link.

    https://www.listendata.com/2019/06/pandas-read-csv.html
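
If you also want to keep the column names and track which day each row came from, something like the following might work. It is only a minimal sketch: it assumes the file names are the day names (Monday.csv and so on), and the function name `combine_days` and the extra `Day` column are names made up for illustration. It reads each file with its header as the column names, so the header is not repeated as a data row after concatenation.

```python
import glob
import os

import pandas as pd


def combine_days(folder, output_name="Week1.csv"):
    """Concatenate every CSV in `folder`, tagging rows with the source file name."""
    frames = []
    # sorted() gives alphabetical order, not weekday order
    for filename in sorted(glob.glob(os.path.join(folder, "*.csv"))):
        if os.path.basename(filename) == output_name:
            continue  # skip a combined file written by an earlier run
        # header=0 (the default) uses the first row as the column names
        df = pd.read_csv(filename, header=0)
        # "Monday.csv" -> "Monday"; the column name "Day" is an assumption
        df["Day"] = os.path.splitext(os.path.basename(filename))[0]
        frames.append(df)
    combined = pd.concat(frames, axis=0, ignore_index=True)
    # index=False keeps the row index out of the output file
    combined.to_csv(os.path.join(folder, output_name), index=False)
    return combined
```

Calling `combine_days(r'C:\your_path_here')` would then write Week1.csv into the same folder, with a single header row and one `Day` value per row.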