Tags: delphi, unicode, delphi-xe7

"No mapping for the Unicode character exists in the target multi-byte code page" error


I have a bug report showing an EEncodingError. The log points to TFile.AppendAllText, which I call in this procedure of mine:

procedure WriteToFile(CONST FileName: string; CONST uString: string; CONST WriteOp: WriteOpperation; ForceFolder: Boolean= FALSE);     // Works with UNC paths
begin
 if NOT ForceFolder
 OR (ForceFolder AND ForceDirectoriesMsg(ExtractFilePath(FileName))) then
   if WriteOp= (woOverwrite)
   then IOUtils.TFile.WriteAllText (FileName, uString)
   else IOUtils.TFile.AppendAllText(FileName, uString);
end;

This is the information from EurekaLog.

[Two EurekaLog screenshots]

What can cause this to happen?


Solution

  • This program reproduces the error that you report:

    {$APPTYPE CONSOLE}
    
    uses
      System.SysUtils, System.IOUtils;
    
    var
      FileName: string;
    
    begin
      try
        FileName := TPath.GetTempFileName;
        TFile.WriteAllText(FileName, 'é', TEncoding.ANSI);
        TFile.AppendAllText(FileName, 'é');
      except
        on E: Exception do
          Writeln(E.ClassName, ': ', E.Message);
      end;
    end.
    

    Here I have written the original file as ANSI, and then called AppendAllText, which will try to write as UTF-8. We end up in this function:

    class procedure TFile.AppendAllText(const Path, Contents: string);
    var
      LFileStream: TFileStream;
      LFileEncoding: TEncoding; // encoding of the file
      Buff: TBytes;
      Preamble: TBytes;
      UTFStr: TBytes;
      UTF8Str: TBytes;
    begin
      CheckAppendAllTextParameters(Path, nil, False);
    
      LFileStream := nil;
      try
        try
          LFileStream := DoCreateOpenFile(Path);
          // detect the file encoding
          LFileEncoding := GetEncoding(LFileStream);
    
          // file is written is ASCII (default ANSI code page)
          if LFileEncoding = TEncoding.ANSI then
          begin
            // Contents can be represented as ASCII;
            // append the contents in ASCII
    
            UTFStr := TEncoding.ANSI.GetBytes(Contents);
            UTF8Str := TEncoding.UTF8.GetBytes(Contents);
    
            if TEncoding.UTF8.GetString(UTFStr) = TEncoding.UTF8.GetString(UTF8Str) then
            begin
              LFileStream.Seek(0, TSeekOrigin.soEnd);
              Buff := TEncoding.ANSI.GetBytes(Contents);
            end
            // Contents can be represented only in UTF-8;
            // convert file and Contents encodings to UTF-8
            else
            begin
              // convert file contents to UTF-8
              LFileStream.Seek(0, TSeekOrigin.soBeginning);
              SetLength(Buff, LFileStream.Size);
              LFileStream.ReadBuffer(Buff, Length(Buff));
              Buff := TEncoding.Convert(LFileEncoding, TEncoding.UTF8, Buff);
    
              // prepare the stream to rewrite the converted file contents
              LFileStream.Size := Length(Buff);
              LFileStream.Seek(0, TSeekOrigin.soBeginning);
              Preamble := TEncoding.UTF8.GetPreamble;
              LFileStream.WriteBuffer(Preamble, Length(Preamble));
              LFileStream.WriteBuffer(Buff, Length(Buff));
    
              // convert Contents in UTF-8
              Buff := TEncoding.UTF8.GetBytes(Contents);
            end;
          end
          // file is written either in UTF-8 or Unicode (BE or LE);
          // append Contents encoded in UTF-8 to the file
          else
          begin
            LFileStream.Seek(0, TSeekOrigin.soEnd);
            Buff := TEncoding.UTF8.GetBytes(Contents);
          end;
    
          // write Contents to the stream
          LFileStream.WriteBuffer(Buff, Length(Buff));
        except
          on E: EFileStreamError do
            raise EInOutError.Create(E.Message);
        end;
      finally
        LFileStream.Free;
      end;
    end;
    

    The error stems from this line:

    if TEncoding.UTF8.GetString(UTFStr) = TEncoding.UTF8.GetString(UTF8Str) then
    

    The problem is that UTFStr, despite its name, holds ANSI-encoded bytes, which are not in fact valid UTF-8. Hence TEncoding.UTF8.GetString(UTFStr) throws an exception.
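
    The failing call can be isolated from the file handling. This is a minimal sketch, assuming the ANSI code page is Windows-1252, where 'é' is the single byte $E9. In UTF-8 that byte announces a three-byte sequence, so on its own it is malformed. DecodesAsUtf8 is a hypothetical helper name, and the exception behaviour is the one reported for the Delphi versions discussed here:

    ```delphi
    {$APPTYPE CONSOLE}

    uses
      System.SysUtils;

    // Hypothetical helper: True if Bytes decode as UTF-8 without an exception.
    function DecodesAsUtf8(const Bytes: TBytes): Boolean;
    begin
      try
        // Same decode that AppendAllText performs on its ANSI buffer.
        TEncoding.UTF8.GetString(Bytes);
        Result := True;
      except
        on EEncodingError do
          Result := False;
      end;
    end;

    begin
      // $E9 is 'é' in Windows-1252 (the ANSI code page assumed for this sketch).
      Writeln('ANSI bytes:  ', DecodesAsUtf8(TBytes.Create($E9)));
      // $C3 $A9 is 'é' in UTF-8.
      Writeln('UTF-8 bytes: ', DecodesAsUtf8(TBytes.Create($C3, $A9)));
    end.
    ```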

    This is a defect in TFile.AppendAllText. Given that it knows perfectly well that UTFStr is ANSI encoded, it makes no sense at all for it to call TEncoding.UTF8.GetString.

    You should submit a bug report to Embarcadero for this defect, which still exists in Delphi 10 Seattle. In the meantime, you should not use TFile.AppendAllText.
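
    Until the defect is fixed, one way to stay clear of the encoding-detection path is to append the bytes yourself. This is a minimal sketch, assuming the caller knows which encoding the file uses. AppendText is a hypothetical helper name, not part of the RTL:

    ```delphi
    {$APPTYPE CONSOLE}

    uses
      System.SysUtils, System.Classes, System.IOUtils;

    // Hypothetical helper (not part of the RTL): appends Contents to FileName
    // in the encoding the caller names, with no encoding detection.
    procedure AppendText(const FileName, Contents: string; Encoding: TEncoding);
    var
      Stream: TFileStream;
      Buff: TBytes;
    begin
      if FileExists(FileName) then
        Stream := TFileStream.Create(FileName, fmOpenReadWrite or fmShareDenyWrite)
      else
        Stream := TFileStream.Create(FileName, fmCreate);
      try
        // Position at the end so existing content is preserved.
        Stream.Seek(0, TSeekOrigin.soEnd);
        Buff := Encoding.GetBytes(Contents);
        if Length(Buff) > 0 then
          Stream.WriteBuffer(Buff[0], Length(Buff));
      finally
        Stream.Free;
      end;
    end;

    var
      FileName: string;

    begin
      FileName := TPath.GetTempFileName;
      AppendText(FileName, 'é', TEncoding.ANSI);
      AppendText(FileName, 'é', TEncoding.ANSI);
      Writeln(Length(TFile.ReadAllBytes(FileName)), ' bytes in file');
    end.
    ```

    In the question's WriteToFile, the AppendAllText call could then become AppendText(FileName, uString, TEncoding.UTF8), for example. The sketch does not check that the chosen encoding matches what is already in the file, and it does not write a preamble (BOM) when it creates a new file.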