python winapi win32gui pyhook

Retrieve the element name and xpath when clicking a button or typing in a textbox


I want to develop an app that records all the keyboard and mouse events performed in a particular Windows application, for example Microsoft Excel, Acrobat, or Notepad.

So far I have tried PyHook and win32gui. However, I do not know how to retrieve the name and XPath of the element that was clicked or typed into.

Thank you so much for your help or advice, and please forgive me if I wrote something incorrectly. I am very new to Python ;)


Solution

  • First get the current mouse click position, then use WindowFromPoint to get the handle of the window at that point;

    GetClassName to get the window class name;

    GetMenu to get the HMENU handle;

    Then use MenuItemFromPoint to get the zero-based position of the menu item (this function returns -1 if no menu item is at the specified location);

    Finally use GetMenuItemInfo (which supersedes GetMenuString) to get the menu text.

    Here is a simple C++ test sample:

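    #include <Windows.h>
    #include <cstdlib>
    #include <iostream>

    // The explicit ANSI (A) variants are used so that the char buffers
    // compile whether or not UNICODE is defined.
    int main()
    {
        Sleep(2000); // gives us time to hover the cursor over a menu item
        POINT p;
        GetCursorPos(&p);
        HWND h = WindowFromPoint(p);
        char classname[100] = { 0 };
        GetClassNameA(h, classname, 100);
        std::cout << "Class name: " << classname << std::endl;

        HMENU menu = GetMenu(h);
        if (menu == NULL)
        {
            std::cout << "The window has no menu" << std::endl;
            return 1;
        }

        // MenuItemFromPoint returns a zero-based position, or -1 on a miss
        int pos = MenuItemFromPoint(h, menu, p);
        if (pos == -1)
        {
            std::cout << "No menu item under the cursor" << std::endl;
            return 1;
        }

        MENUITEMINFOA info = { 0 };
        info.cbSize = sizeof(MENUITEMINFOA);
        info.fMask = MIIM_STRING;
        info.dwTypeData = NULL;

        // First call: dwTypeData is NULL, so only info.cch (text length) is filled in
        GetMenuItemInfoA(menu, pos, TRUE, &info);
        info.cch++; // room for the terminating null
        info.dwTypeData = (LPSTR)malloc(info.cch);
        // Second call copies the text; TRUE means pos is a position, not an ID
        GetMenuItemInfoA(menu, pos, TRUE, &info);

        MessageBoxA(NULL, info.dwTypeData, "Menu Name", 0);
        free(info.dwTypeData);
        return 0;
    }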
    

    UPDATE:

    Here is the Python code I tested; it works for me (tested in Notepad).

    import ctypes
    import time
    from ctypes import wintypes

    import win32con
    import win32gui
    import win32gui_struct

    time.sleep(3)  # hover the cursor over a menu item before the time ends
    pos = win32gui.GetCursorPos()
    hwnd = win32gui.WindowFromPoint(pos)

    # Get the window class name
    class_name = win32gui.GetClassName(hwnd)
    print("ClassName = " + class_name)

    # Get the zero-based position of the menu item (-1 if there is none).
    # Declaring argtypes keeps the handles from being truncated on 64-bit Python.
    menu = win32gui.GetMenu(hwnd)
    from_point = ctypes.windll.user32.MenuItemFromPoint
    from_point.argtypes = [wintypes.HWND, wintypes.HMENU, wintypes.POINT]
    from_point.restype = ctypes.c_int
    index = from_point(hwnd, menu, wintypes.POINT(*pos))
    print("Position = " + str(index))

    # Get the menu text; the third argument (1) means "look up by position"
    if index != -1:
        info, extras = win32gui_struct.EmptyMENUITEMINFO(win32con.MIIM_STRING)
        win32gui.GetMenuItemInfo(menu, index, 1, info)
        item = win32gui_struct.UnpackMENUITEMINFO(info)
        print("Menu = " + item.text)
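Since the goal in the question is to record events as they happen, it may help to package the same steps as a function that takes a screen position and returns what it found. The following is only a minimal sketch: the name `menu_item_at` is hypothetical, it assumes pywin32 is installed, and it only covers items of a classic Win32 menu attached to the window under the point.

```python
import sys


def menu_item_at(pos):
    """Hypothetical helper: return (class_name, menu_text) for screen point pos.

    menu_text is None when no menu item lies under the point. The function
    returns None when not running on Windows or when no window is found.
    """
    if sys.platform != "win32":
        return None

    # Imported lazily so this module can still be imported on other platforms.
    import ctypes
    from ctypes import wintypes

    import win32con
    import win32gui
    import win32gui_struct

    hwnd = win32gui.WindowFromPoint(pos)
    if not hwnd:
        return None
    class_name = win32gui.GetClassName(hwnd)

    menu = win32gui.GetMenu(hwnd)
    if not menu:
        return class_name, None

    # MenuItemFromPoint returns a zero-based position, or -1 on a miss.
    from_point = ctypes.windll.user32.MenuItemFromPoint
    from_point.argtypes = [wintypes.HWND, wintypes.HMENU, wintypes.POINT]
    from_point.restype = ctypes.c_int
    index = from_point(hwnd, menu, wintypes.POINT(*pos))
    if index == -1:
        return class_name, None

    # extras holds the text buffer and must stay referenced during the call.
    info, extras = win32gui_struct.EmptyMENUITEMINFO(win32con.MIIM_STRING)
    win32gui.GetMenuItemInfo(menu, index, 1, info)  # 1 = look up by position
    return class_name, win32gui_struct.UnpackMENUITEMINFO(info).text
```

A mouse-hook callback could then pass the click coordinates to `menu_item_at` and log the result, for example `print(menu_item_at(win32gui.GetCursorPos()))` for a quick manual check in Notepad.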