Tags: delphi, twebbrowser, tedgebrowser

How to get the HTML source with Delphi 10.4.2 Sydney and the Edge web browser component


I have a TEdgeBrowser in Delphi 10.4.2 and would like to extract a string from the page's HTML source using a regular expression. In the past this was possible with TWebBrowser, but only with IE 11 or below as the SelectedEngine.

But my site no longer wants IE 11, so I have to use the Edge-based TEdgeBrowser in Sydney. I found this article on getting the source through a script call, but with the latest Edge (Evergreen Standalone from here), AResultObjectAsJson is always 'null' (no results are returned).

What can I do to accomplish my task?


Solution

  • You probably get nil/NULL because the JavaScript you pass to ExecuteScript is not the right script.

    Let me answer in C++Builder (porting it to Delphi is very easy).

    Define a new handler class for ICoreWebView2ExecuteScriptCompletedHandler:

    // Receives the result of ExecuteScript once the script has finished.
    class TCoreWebView2ExecuteScriptCompletedHandler
    : public TCppInterfacedObject<ICoreWebView2ExecuteScriptCompletedHandler>
    {
    public:
        HRESULT __stdcall Invoke(HRESULT errorCode, WideChar* resultObjectAsJson);
    };
    
    HRESULT __stdcall TCoreWebView2ExecuteScriptCompletedHandler::Invoke(HRESULT errorCode, WideChar* resultObjectAsJson)
    {
        // errorCode reports whether the script itself could be executed.
        if (FAILED(errorCode))
        {
            ShowMessage("Failed to execute script");
            return errorCode;
        }
        // resultObjectAsJson holds the script's result encoded as JSON.
        ShowMessage(resultObjectAsJson);
        return S_OK;
    }
    

    Then use it with the call below. document.documentElement.outerHTML returns the HTML code in resultObjectAsJson. Some characters come back as \u Unicode escapes and others are escaped with a backslash, so you will need a conversion function. Because the response is a JSON string, it also starts and ends with quotes.

    EdgeBrowser->DefaultInterface->ExecuteScript(L"document.documentElement.outerHTML;",_di_ICoreWebView2ExecuteScriptCompletedHandler(new TCoreWebView2ExecuteScriptCompletedHandler()));
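
The conversion function mentioned above could look roughly like the sketch below. It is plain standard C++, not something shipped with WebView2 or C++Builder, and the names DecodeJsonString and HexValue are made up for this example. It assumes wchar_t is one UTF-16 code unit (true on Windows), and it treats the literal text null as "no result", since that is what ExecuteScript reports when the script yields no value.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: numeric value of one hexadecimal digit.
static unsigned HexValue(wchar_t c)
{
    if (c >= L'0' && c <= L'9') return static_cast<unsigned>(c - L'0');
    if (c >= L'a' && c <= L'f') return static_cast<unsigned>(c - L'a') + 10;
    if (c >= L'A' && c <= L'F') return static_cast<unsigned>(c - L'A') + 10;
    throw std::invalid_argument("bad hex digit in unicode escape");
}

// Hypothetical helper: turns the JSON string literal passed to Invoke()
// into the text it encodes (quotes stripped, escapes resolved).
std::wstring DecodeJsonString(const std::wstring& json)
{
    // The literal text null means the script produced no usable value.
    if (json == L"null")
        return std::wstring();

    if (json.size() < 2 || json.front() != L'"' || json.back() != L'"')
        throw std::invalid_argument("not a JSON string literal");

    std::wstring out;
    out.reserve(json.size());
    const std::size_t end = json.size() - 1;  // index of the closing quote

    for (std::size_t i = 1; i < end; ++i)
    {
        const wchar_t c = json[i];
        if (c != L'\\')
        {
            out += c;  // ordinary character, copy as is
            continue;
        }
        if (++i >= end)
            throw std::invalid_argument("dangling backslash");

        switch (json[i])
        {
        case L'"':  out += L'"';  break;
        case L'\\': out += L'\\'; break;
        case L'/':  out += L'/';  break;
        case L'b':  out += L'\b'; break;
        case L'f':  out += L'\f'; break;
        case L'n':  out += L'\n'; break;
        case L'r':  out += L'\r'; break;
        case L't':  out += L'\t'; break;
        case L'u':
        {
            // Four hex digits follow the 'u' and form one UTF-16 code unit.
            if (i + 4 >= end)
                throw std::invalid_argument("truncated unicode escape");
            unsigned value = 0;
            for (int k = 1; k <= 4; ++k)
                value = value * 16 + HexValue(json[i + k]);
            assert(value <= 0xFFFF);
            out += static_cast<wchar_t>(value);
            i += 4;
            break;
        }
        default:
            throw std::invalid_argument("unknown escape sequence");
        }
    }
    return out;
}
```

Inside Invoke you would then pass resultObjectAsJson through DecodeJsonString before showing it or running your regular expression on it. A JSON parser would do the same job if you already have one in your project.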
    

    More information here