xpath ublock-origin

How can I simplify these uBlock Origin filters that block Google annoyances?


I use Google search frequently, and I really hate its various add-ons such as "People also ask", "People also search for", and the search suggestions drop-down. These things are always completely irrelevant and obnoxious, and they treat users as imbeciles.

I have always used uBlock Origin to block these annoyances. I never found any working filters through Google search, as all the posts I found are years old and Google updates its markup constantly. But I am very technically literate, and I made custom filters that worked most of the time by using uBlock Origin's "element picker mode".

These filters look like this:

www.google.com##.sATSHe > div > .vt6azd.Ww4FFb

Each one is a hierarchy of random class names, and because Google changes the class names very frequently, these filters don't always block the intended targets and can sometimes block unintended ones.

I got bored today and wanted a filter that works perfectly. Through Google searching I found out that uBlock Origin supports XPath, so by playing with the F12 developer tools I made these filters, which work perfectly but are very ugly:

google.*##:xpath(//span[text()='People also search for']/parent::*/parent::*/parent::*/parent::*/parent::*/parent::*)
google.*##:xpath(//span[text()='People also ask']/parent::*/parent::*/parent::*/parent::*/parent::*/parent::*)
google.*##:xpath(//span[text()='Perspectives']/parent::*/parent::*/parent::*/parent::*/parent::*/parent::*)
google.*##:xpath(//span[text()='Top stories']/parent::*/parent::*/parent::*/parent::*/parent::*/parent::*)
google.*##:xpath(//span[text()='Related']/parent::*/parent::*/parent::*/parent::*/parent::*/parent::*)
google.*##:xpath(//span[text()='Related Questions']/parent::*/parent::*/parent::*/parent::*/parent::*/parent::*)
google.*##:xpath(//span[text()='Related Searches']/parent::*/parent::*/parent::*/parent::*/parent::*/parent::*)

I used this strategy because these annoyances are always highly structured and organized. The one 100% reliable identifying feature of these tables is a span element containing the offending text, which is also how I identify them visually.

The aforementioned span element's great-great-great-great-grandparent in the DOM tree is the element I want to block, and that is why I use all those /parent::*s.
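To make the "six parents up" idea concrete, here is a minimal Python sketch on a toy document. The markup and the `nth_ancestor` helper are hypothetical stand-ins for illustration only; Google's real DOM is far messier, and uBlock Origin does not run Python.

```python
import xml.etree.ElementTree as ET

# Toy markup: a hypothetical stand-in for the real page structure.
TOY_PAGE = """
<div id="block">
  <div><div><div><div><div>
    <span>People also ask</span>
  </div></div></div></div></div>
</div>
"""

def nth_ancestor(root, elem, n):
    """Climb n parent links from elem, like n repeated /parent::* steps."""
    # ElementTree nodes do not store their parent, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    node = elem
    for _ in range(n):
        node = parent_of[node]  # raises KeyError if we climb above the root
    return node

root = ET.fromstring(TOY_PAGE)
span = next(e for e in root.iter("span") if e.text == "People also ask")
target = nth_ancestor(root, span, 6)
print(target.get("id"))  # prints "block"
```

Here the span sits inside five anonymous divs, so the sixth step up lands on the outer container, which is the element the filter would hide.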

I haven't been able to find a keyword that makes the "Related" tables appear, even though I remember triggering them all the time, so I am unable to verify the effectiveness and correctness of the last three filters. From my testing so far with dozens of keywords, though, the first four filters work perfectly.

As you can see, these filters contain a lot of redundancy and repetition, and I want to combine them into one filter. The only difference between them is the string used to identify the header, so a regex would make this more concise, like this: [Pp]eople also|[Rr]elated. But how can I use a regex inside XPath? I also don't know whether the particular flavor of XPath that uBlock Origin uses supports regex.

Then, having found the element through XPath, how do I tell uBlock Origin to select its 6th-generation ancestor using uBlock Origin selectors?


Update

I have learned that uBlock Origin's has-text "function" supports regex, so I achieved this using only uBlock Origin selector syntax. I have also determined that the exact level of ancestry is 5, not 6:

google.*##span:has-text(/people also|related|persp|top/i):nth-ancestor(5)

Update 1

Small correction: the above filter removes every search result that happens to have a header matching the regex, which results in a blank page if all search results meet the criteria (e.g. if you search for "perspectives").

This is of course completely unintended.

I have amended this by adding one of the random strings Google assigns as class names to these offending elements, which differentiates them from ordinary results and eliminates the edge case.

The class attribute at the moment is "mgAbYb OSrXXb RES9jf IFnjPb", and the fixed filter, which uses the first of those class names, is:

www.google.com##span.mgAbYb:has-text(/people also|related|perspectives|top stories|things to know/i):nth-ancestor(5)

As you can see, the filter depends on the class name, so it will stop working when Google changes that name, which will undoubtedly happen often. When it breaks, however, it can be fixed extremely easily.
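One reason the class-free regex over-matches is that it is unanchored, so short alternatives such as "top" or "related" match inside ordinary result text. The Python sketch below only illustrates that regex behaviour; the `STRICT` pattern and the sample strings are hypothetical, and I have not verified an anchored pattern against Google's real markup (surrounding whitespace in the span text, or a result whose text is exactly "Perspectives", could still defeat it).

```python
import re

# Pattern from the first update: unanchored, so it matches anywhere in the text.
LOOSE = re.compile(r"people also|related|persp|top", re.IGNORECASE)

# Hypothetical tighter pattern: the whole text must equal one known heading.
STRICT = re.compile(
    r"^(people also (ask|search for)|related( questions| searches)?"
    r"|perspectives|top stories)$",
    re.IGNORECASE,
)

samples = [
    "People also ask",     # heading to hide
    "Top stories",         # heading to hide
    "Best laptop deals",   # ordinary text that contains "top"
    "Unrelated question",  # ordinary text that contains "related"
]

for text in samples:
    loose = bool(LOOSE.search(text))
    strict = bool(STRICT.search(text))
    print(f"{text!r}: loose={loose} strict={strict}")
```

With these samples the loose pattern matches all four strings, while the anchored one matches only the two headings.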


Solution

  • uBlock Origin uses the XPath interpreter built into the browser, which implements XPath 1.0. That version of XPath has only a few string functions, and no support for regular expressions.

    One of the string functions available is "starts-with", e.g. to recognise phrases such as 'People also ask' or 'People also search for', you could use:

    starts-with(., 'People also')
    

    To get from the target span to the (6th-level) ancestor element you want to block, you could repeat .. six times (it is shortcut syntax for parent::node(), usable in place of parent::*), or you could select the 6th element along the ancestor axis, using ancestor::*[6].

    You can avoid using text() in your predicate and just use . (referring to the span element itself, rather than to its text node child).

    span[.='People also search for']
    

    To combine the tests into a single XPath expression, just use the XPath or operator inside your predicate:

    span[.='People also search for' or .='People also ask']
    

    In summary (broken into multiple lines, for clarity):

    //span[
       .='People also search for' or 
       .='People also ask' or 
       .='Perspectives' or 
       .='Top stories' or 
       .='Related' or 
       .='Related Questions' or 
       .='Related Searches'
    ]
    /../../../../../..
    

    Or if you prefer:

    //span[
       starts-with(., 'People also') or
       .='Perspectives' or 
       .='Top stories' or 
       starts-with(., 'Related')
    ]
    /ancestor::*[6]
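If the list of headings changes often, the single-line filter could be generated rather than edited by hand. The following is a minimal Python sketch; `quote` and `build_filter` are hypothetical helper names, the output has not been tested in uBlock Origin, and `levels` is left as a parameter because the question's update found 5 rather than 6 to be the right depth.

```python
def quote(text):
    """Wrap text as a single-quoted XPath 1.0 string literal."""
    # XPath 1.0 has no escape character, so this sketch simply refuses apostrophes.
    if "'" in text:
        raise ValueError("this sketch does not handle apostrophes")
    return f"'{text}'"

def build_filter(prefixes, exact, levels=6, domain="google.*"):
    """Join starts-with() and equality tests with 'or' into one :xpath() filter."""
    tests = [f"starts-with(., {quote(p)})" for p in prefixes]
    tests += [f".={quote(t)}" for t in exact]
    xpath = f"//span[{' or '.join(tests)}]/ancestor::*[{levels}]"
    return f"{domain}##:xpath({xpath})"

print(build_filter(["People also", "Related"], ["Perspectives", "Top stories"]))
```

This prints the second summary expression above collapsed onto one line and wrapped in the `google.*##:xpath(...)` form used in the question.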