I have a xls file that I cannot open in R. I've tried a lot and the best I got was something like:
data = read.table('D:/Usuario/pivo1/table.xls', sep = ';')
but the df comes in this form:
while it should be something like this:
Aditionally:
print(data[9,1])
We can see the numbers of interest. Dates, hours and etc. but this form, in line 9 and column 1 of the df.
None of the excel functions worked, this was the best result i got.
I have a dropbox link for the file access:
https://www.dropbox.com/scl/fi/jlo15oasidxrbvvbcl3ge/table.xls?rlkey=ny8804ox4yqb1nqs9wn33j5ak&dl=0
Thanks for the help, i still have no solution. Working manually on the data.
The file contains HTML for a table and is somehow saved with xls extension:
<table id="tb1">
<tbody><tr>
<th>DATA</th>
<th>HORA</th>
<th>VOLUME (m³)</th>
<th>HORIMETRO</th>
</tr>
<tr>...</tr></tbody></table>
Somewhat surprisingly Excel can actually handle this, even though it does complain about format and extension mismatch.
You can open the file and parse the table with {rvest}
:
library(rvest)
read_html("table.xls") |>
html_element("table") |>
html_table()
#> # A tibble: 162 × 4
#> DATA HORA `VOLUME (m³)` HORIMETRO
#> <chr> <chr> <chr> <dbl>
#> 1 23/04/23 00:00 0253037 m³ 514.
#> 2 23/04/23 01:00 0253037 m³ 514.
#> 3 23/04/23 02:00 0253037 m³ 514.
#> 4 23/04/23 03:00 0253037 m³ 514.
#> 5 23/04/23 04:00 0253037 m³ 514.
#> 6 23/04/23 05:00 0253037 m³ 514.
#> 7 23/04/23 06:00 0253037 m³ 514.
#> 8 23/04/23 07:00 0253037 m³ 514.
#> 9 23/04/23 08:00 0253037 m³ 514.
#> 10 23/04/23 09:00 0253037 m³ 514.
#> # ℹ 152 more rows
Created on 2023-10-28 with reprex v2.0.2