python pandas dataframe csv file-io

Pandas read in table without headers


Using pandas, how do I read only a subset of the columns (say, the 4th and 7th) from a .csv file with no headers? I cannot get this to work with usecols.


Solution

  • The previous answers are correct, but in my opinion adding the names parameter completes the solution, and it should be the recommended approach, especially when the CSV has no headers.

    Solution

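    Use the usecols and names parameters together. Column indices in usecols are zero-based, so the 4th and 7th columns are 3 and 6.

    import pandas as pd
    df = pd.read_csv(file_path, usecols=[3, 6], names=['colA', 'colB'])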
    

    Additional reading

    Alternatively, pass header=None to make it explicit that the CSV has no headers (both calls produce the same result):

    df = pd.read_csv(file_path, usecols=[3,6], names=['colA', 'colB'], header=None)
    

    This lets you retrieve your data by name:

    # with `names` parameter
    df['colA']
    df['colB'] 
    

    instead of by the original integer column positions:

    # without `names` parameter (header=None is then required)
    df[3]
    df[6]
    

    Explanation

    According to the read_csv documentation, when names is passed explicitly, header behaves as if it were None rather than 0, so header=None can be omitted whenever names is given.
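To check the behavior described above without a file on disk, here is a minimal sketch that reads from an in-memory string. The sample data and the variable names (csv_text, named, explicit, unnamed) are made up for illustration and are not from the original question.

```python
import io

import pandas as pd

# Hypothetical sample data: two headerless rows with seven columns each.
csv_text = "a0,b0,c0,d0,e0,f0,g0\na1,b1,c1,d1,e1,f1,g1\n"

# usecols selects the 4th and 7th columns (zero-based 3 and 6);
# names assigns labels to the selected columns.
named = pd.read_csv(io.StringIO(csv_text), usecols=[3, 6], names=["colA", "colB"])

# Same call with header=None spelled out.
explicit = pd.read_csv(
    io.StringIO(csv_text), usecols=[3, 6], names=["colA", "colB"], header=None
)

# Without names, header=None is needed so the first row is not used as a header.
unnamed = pd.read_csv(io.StringIO(csv_text), usecols=[3, 6], header=None)

print(named)
print(named.equals(explicit))   # compares the two frames for equality
print(list(unnamed.columns))    # labels assigned when names is omitted
```

With this sample data, the first two frames should compare equal, and the frame read without names should keep the original positions 3 and 6 as its column labels.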