I have the Python code below, which uses polars. I do not want polars to automatically parse values as dates or integers unless I explicitly ask for it. Passing schema_overrides doesn't prevent the automatic conversion either.
import polars as pl
# Read the CSV file with all columns as strings using schema_overrides
file_path = "./xyz.csv"
df = pl.read_csv(file_path, schema_overrides={'*': pl.Utf8})
# Display the DataFrame
print(df)
I get the error below:
polars.exceptions.ComputeError: could not parse `p35038` as dtype `i64` at column 'Employee ID' (column number 3)
This is what infer_schema=False is for. From the documentation:
When False, the schema is not inferred and will be pl.String if not specified in schema or schema_overrides.
pl.read_csv(b"""a,b,c
1,2,3""")
# shape: (1, 3)
# ┌─────┬─────┬─────┐
# │ a ┆ b ┆ c │
# │ --- ┆ --- ┆ --- │
# │ i64 ┆ i64 ┆ i64 │
# ╞═════╪═════╪═════╡
# │ 1 ┆ 2 ┆ 3 │
# └─────┴─────┴─────┘
pl.read_csv(b"""a,b,c
1,2,3""", infer_schema=False)
# shape: (1, 3)
# ┌─────┬─────┬─────┐
# │ a ┆ b ┆ c │
# │ --- ┆ --- ┆ --- │
# │ str ┆ str ┆ str │
# ╞═════╪═════╪═════╡
# │ 1 ┆ 2 ┆ 3 │
# └─────┴─────┴─────┘
The "*" in your example is taken literally; it is not treated as a wildcard.