In macOS, I want to make a script that focuses the frontmost window (if any) at the arbitrary screen coordinates (x, y)
, independent of which application/process that window belongs to.
It would be kind of equivalent to clicking with the mouse on that location, except I just want the window activated/focused and not any mouse-click event to be triggered (e.g., the target window could have a button at this position that I don’t want to be clicked). Therefore, I’m interested in getting the handle of the frontmost window at the given screen coordinates (x, y)
, so that I can programmatically focus it.
Is there a way of doing this in AppleScript, Swift or Python?
Note that your app needs the accessibility permission to do this and thus cannot be sandboxed. Querying for permission is done via AXIsProcessTrustedWithOptions(_:)
First, create a systemwide accessibility object with
AXUIElementCreateSystemWide()
,
and query the frontmost accessibility object at your coordinates with
AXUIElementCopyElementAtPosition(_:_:_:_:)
.
Here is an example of this first step:
func elementAtPosition(x: Float, y: Float) throws -> AXUIElement? {
let systemwide = AXUIElementCreateSystemWide()
var element: AXUIElement?
let error = AXUIElementCopyElementAtPosition(systemwide, x, y, &element)
if error != .success {
throw error
}
return element
}
Then you will need to recursively drill down the UI hierarchy until you find the element without parent (returning nil), which is your window element. This is done with
AXUIElementCopyAttributeValue(_:_:_:)
and
NSAccessibility.Attribute.parent
.
Last, activate the window element with
AXUIElementPerformAction(_:_:)
and
kAXRaiseAction
.