css pandas itext7 pdfhtml

Pandas Styling not reading css id


I'm generating the following HTML with pandas:

<style  type="text/css" >
  #T_header_tablerow0_col3,#T_header_tablerow0_col4{
    background-color:  #F5ABAB;
  }
</style>
        
<table id="T_header_table" class="header-table">
  <thead>
    <tr>
      <th class="col_heading level0 col0" >Threshold</th>        
      <th class="col_heading level0 col1" >Limit Amount</th>        
      <th class="col_heading level0 col2" >Utilization (Historical Cost)</th>
      <th class="col_heading level0 col3" >Threshold Breach</th>
      <th class="col_heading level0 col4" >Limit Breach</th>    
    </tr>
  </thead>
  <tbody>
    <tr>
      <td id="T_header_tablerow0_col0" class="data row0 col0" >778.900000</td>
      <td id="T_header_tablerow0_col1" class="data row0 col1" >1</td>
      <td id="T_header_tablerow0_col2" class="data row0 col2" >2</td>
      <td id="T_header_tablerow0_col3" class="data row0 col3" >0</td>
      <td id="T_header_tablerow0_col4" class="data row0 col4" >0</td>
    </tr>
  </tbody>
</table>

But when I try to generate a PDF from this HTML using iText, the id-based styling from the <style> block is not applied and the cells are not colored. Can anyone help me with this?

Here's my iText code:

HtmlConverter.convertToPdf(new File(inputFile), new File(outputFile));

Solution

  • This is a bug in pdfHTML: it does not handle IDs that start with an uppercase letter. As a workaround, lowercase the first letter of the affected IDs, both in the CSS selectors and in the matching id attributes, so that your HTML becomes:

    <style  type="text/css" >
      #t_header_tablerow0_col3,#t_header_tablerow0_col4{
        background-color:  #F5ABAB;
      }
    </style>
    
    <table id="T_header_table" class="header-table">
      <thead>
      <tr>
        <th class="col_heading level0 col0" >Threshold</th>
        <th class="col_heading level0 col1" >Limit Amount</th>
        <th class="col_heading level0 col2" >Utilization (Historical Cost)</th>
        <th class="col_heading level0 col3" >Threshold Breach</th>
        <th class="col_heading level0 col4" >Limit Breach</th>
      </tr>
      </thead>
      <tbody>
      <tr>
        <td id="T_header_tablerow0_col0" class="data row0 col0" >778.900000</td>
        <td id="T_header_tablerow0_col1" class="data row0 col1" >1</td>
        <td id="T_header_tablerow0_col2" class="data row0 col2" >2</td>
        <td id="t_header_tablerow0_col3" class="data row0 col3" >0</td>
        <td id="t_header_tablerow0_col4" class="data row0 col4" >0</td>
      </tr>
      </tbody>
    </table>
    

    This should produce the correct result. Meanwhile, if you are considering contributing to pdfHTML, you may want to check out the CssSelectorParser class and the regular expression in SELECTOR_PATTERN_STR: the (#[_a-z][\w-]*) part is the one that needs fixing.
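
To see why the uppercase letter matters, here is a minimal sketch that tests only the id fragment quoted above with plain java.util.regex. It is not the full SELECTOR_PATTERN_STR and does not reproduce how pdfHTML applies it. The class name IdSelectorDemo and both helper methods are hypothetical, made up for illustration. The second helper is one blunt way to apply the workaround to pandas output, assuming the prefix string only occurs in ids and selectors.

```java
import java.util.Locale;
import java.util.regex.Pattern;

public class IdSelectorDemo {
    // Only the id fragment quoted in the answer, tested in isolation.
    // The first character class [_a-z] has no uppercase letters.
    private static final Pattern ID_FRAGMENT = Pattern.compile("#[_a-z][\\w-]*");

    // True if the whole selector string is accepted by the fragment.
    static boolean matchesIdFragment(String selector) {
        return ID_FRAGMENT.matcher(selector).matches();
    }

    // Hypothetical workaround helper: lowercases every occurrence of the
    // given id prefix, in the <style> selectors and the id attributes alike.
    // Assumption: the prefix does not also appear in visible cell text.
    static String lowerCaseIdPrefix(String html, String prefix) {
        return html.replace(prefix, prefix.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        // Uppercase first letter: rejected by the fragment.
        System.out.println(matchesIdFragment("#T_header_tablerow0_col3")); // false
        // Lowercase first letter: accepted.
        System.out.println(matchesIdFragment("#t_header_tablerow0_col3")); // true

        String html = "<style>#T_header_tablerow0_col3{background-color:#F5ABAB;}</style>"
                + "<td id=\"T_header_tablerow0_col3\">0</td>";
        System.out.println(lowerCaseIdPrefix(html, "T_header_table"));
    }
}
```

The string returned by lowerCaseIdPrefix could then be handed to the converter in place of the original file contents. Unlike the hand-edited HTML above, this also lowercases the table's own id and the untargeted cells, which keeps all ids consistent.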