How to strip quotes from CSV table?

I'm using the pandas library to convert CSVs to other data types. My CSV has the fields quoted, like this:

"Version", "Date", "Code", "Description", "Tracking Number", "Author"
"0.1", "22AUG2022", , "Initial Draft", "NR", "Sarah Marshall"
"0.2", "23SEP2022", "D", "Update 1", "N-1234", "Bill Walter"
"0.3", "09MAY2023", "A\, C", "New update.", "N-1235", "George Orwell"

The problem is that when I read the CSV with pandas.read_csv('myfile.csv'), the quotes are included in the values:

   Version        "Date"   "Code"     "Description"  "Tracking Number"           "Author"
0      0.1   "22AUG2022"            "Initial Draft"               "NR"   "Sarah Marshall"
1      0.2   "23SEP2022"      "D"        "Update 1"           "N-1234"      "Bill Walter"
2      0.3   "09MAY2023"   "A, C"     "New update."           "N-1235"    "George Orwell"

So these quotes are included when converting to HTML:

<table>
  <thead>
    <tr style="text-align: right">
      <th></th>
      <th>Version</th>
      <th>"Date"</th>
      <th>"Code"</th>
      <th>"Description"</th>
      <th>"Tracking Number"</th>
      <th>"Author"</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>0</th>
      <td>0.1</td>
      <td>"22AUG2022"</td>
      <td></td>
      <td>"Initial Draft"</td>
      <td>"NR"</td>
      <td>"Sarah Marshall"</td>
    </tr>
...

I've tried quoting=csv.QUOTE_NONE, but it didn't fix it--in fact, it actually added quotes to the Version column.

I found this question--the answers essentially says to strip out any quotes in post processing.

I can of course loop through the CSV rows and strip out the leading/trailing quotes for each of the values, but since quoted values are common with CSV I'd expect that there would be a parameter or easy way to enable/disable the quotes in the rendered output but I can't find something like that.

Is there a way to accomplish this?

Solution

The leading spaces in your input seem to be throwing Pandas off (and some other CSV processors I can think of).

Try the skipinitialspace=True option to make Pandas ignore every space between a comma and a quote char:

import pandas as pd

df = pd.read_csv("input.csv", skipinitialspace=True)
print(df)

   Version       Date  ... Tracking Number          Author
0      0.1  22AUG2022  ...              NR  Sarah Marshall
1      0.2  23SEP2022  ...          N-1234     Bill Walter
2      0.3  09MAY2023  ...          N-1235   George Orwell

[3 rows x 6 columns]

Expounding a bit, you can see the difference, where quotes are present, between the default of not skipping the initial space and skipping:

from io import StringIO

data = """
Col1
"foo"
 "foo"
" foo"
""".strip()

print("no skip")

df = pd.read_csv(StringIO(data))
for _, x in df.iterrows():
    print(repr(x["Col1"]))

print()
print("skip initial")

df = pd.read_csv(StringIO(data), skipinitialspace=True)
for _, x in df.iterrows():
    print(repr(x["Col1"]))

no skip
'foo'
' "foo"'
' foo'

skip initial
'foo'
'foo'
' foo'

"foo" doesn't matter as it doesn't have any space
"foo" is up for interpretation as the space precedes the (default) quote char
" foo" also doesn't matter as the space is encoded inside the (default) quote char