prolog pattern-matching sliding-window dcg gomoku

Sliding window method for pattern recognition in SWI-Prolog

I would like to find the most eloquent and efficient method, algorithmically speaking, to count occurrences of some patterns in SWI-Prolog.

For now, my solution uses DCG and looks like this:

count_occurrences(Pattern, List, Count) :-
    phrase(Pattern, PatternList),
    length(PatternList, PatternLength),
    length(List, ListLength),
    MaxIndex is ListLength - PatternLength,
    findall(1, (
        between(0, MaxIndex, Index),
        sublist(List, Index, PatternLength, PatternList)
    ), Counts),
    sum_list(Counts, Count).

sublist(List, Index, Length, Sublist) :-
    length(Sublist, Length),
    append(Prefix, Rest, List),
    length(Prefix, Index),
    append(Sublist, _, Rest).

rep(S, L) --> [], {L is 0} | [S], {L is 1} | [S], { L_1 is L - 1 }, rep(S, L_1), !.
open_rep(S, L) --> [v], rep(S, L), [v], !.

The result is this:

1 ?- count_occurrences(open_rep(n, 1), [v,n,v,n,v,v,v,v,v,v,n,v,v], Count).
Count = 3.

2 ?- count_occurrences(open_rep(n, 3), [v,n,v,n,v,v,v,v,v,v,n,n,n,v,v], Count).
Count = 1.

3 ?- count_occurrences(open_rep(n, 1), [v,n,v,n,v,v,v,v,v,v,n,n,n,v,v], Count).
Count = 2.

I am satisfied with this result, but I don't think this is the most efficient method to achieve it and would like some help making it faster to compute. I think that using append and findall is computationally expensive, but can't figure how to do the same without them...

The DCG shown here is just an example, but I have to count the occurrences of patterns in a list and give them a score. This is in the context of implementing an AI for Gomoku using Alpha-Beta Pruning. Since the board is evaluated frequently, algorithmic complexity matters in order to reduce the time it takes for the AI to take an action.

I have tried multiple variations of the code shown above, but they all use the findall predicate and the best solution I have found to reduce the compute time is to implement early fails.

Solution

IMO your approach is too much specific, and it will be sub optimal from the (re)usability viewpoint. SWI-Prolog offers a library predicate that performs RLE (run length encoding), as I discovered from this interesting topic, and is worth to try: here I'll post a module, where I copied your code, and an alternative that uses clumped/2:

/*  File:    x_so_sliding_window.pl
    Author:  Carlo
    Created: Mar  4 2023
    Purpose: https://stackoverflow.com/q/75630809/874024
*/

:- module(x_so_sliding_window,
          [count_occurrences/3
          ,open_rep//2
          ,count_by_clumped/3
          ]).


count_occurrences(Pattern, List, Count) :-
    phrase(Pattern, PatternList),
    length(PatternList, PatternLength),
    length(List, ListLength),
    MaxIndex is ListLength - PatternLength,
    findall(1, (
        between(0, MaxIndex, Index),
        sublist(List, Index, PatternLength, PatternList)
    ), Counts),
    sum_list(Counts, Count).

sublist(List, Index, Length, Sublist) :-
    length(Sublist, Length),
    append(Prefix, Rest, List),
    length(Prefix, Index),
    append(Sublist, _, Rest).

rep(S, L) --> [], {L is 0} | [S], {L is 1} | [S], { L_1 is L - 1 }, rep(S, L_1), !.
open_rep(S, L) --> [v], rep(S, L), [v], !.


count_by_clumped(Pattern,List,Count) :-
    clumped(List, R),
    aggregate_all(count, member(Pattern,R), Count).

then I had this code in x_so_sliding_window.plt

:- [x_so_sliding_window].

t(Count) :- count_occurrences(open_rep(n, 1), [v,n,v,n,v,v,v,v,v,v,n,v,v], Count).
t(Count) :- count_occurrences(open_rep(n, 3), [v,n,v,n,v,v,v,v,v,v,n,n,n,v,v], Count).
t(Count) :- count_occurrences(open_rep(n, 1), [v,n,v,n,v,v,v,v,v,v,n,n,n,v,v], Count).

c(Count) :- count_by_clumped(n-1, [v,n,v,n,v,v,v,v,v,v,n,v,v], Count).
c(Count) :- count_by_clumped(n-3, [v,n,v,n,v,v,v,v,v,v,n,n,n,v,v], Count).
c(Count) :- count_by_clumped(n-1, [v,n,v,n,v,v,v,v,v,v,n,n,n,v,v], Count).

that shows at first glance the equivalence between t/1 and c/1 calls:

?- t(Count).
Count = 3 ;
Count = 1 ;
Count = 2.

?- c(Count).
Count = 3 ;
Count = 1 ;
Count = 2.

and, at last, the 'benchmark' directly coded in the REPL:

?- time((between(1,1000,_),(t(_),fail))).
% 1,953,001 inferences, 0.234 CPU in 0.234 seconds (100% CPU, 8332804 Lips)
false.
?- time((between(1,1000,_),(c(_),fail))).
% 123,001 inferences, 0.000 CPU in 0.014 seconds (0% CPU, Infinite Lips)
false.