So I'm trying to build a system that allows users to tailor the content that appears in a feed by creating exclusionary rules. As a quick overview, here's a simple look at the tables I'm working with :
Table Name | Fields |
---|---|
Feed | PRIMARY KEY LONG postId PRIMARY KEY INT feedId STRING date |
Post | PRIMARY KEY LONG id STRING creatorId STRING title STRING message STRING uploadDate |
Tag | PRIMARY KEY AUTOGENERATE LONG id FOREIGN KEY LONG parentPostId STRING contents |
BlacklistRule | PRIMARY KEY AUTOGENERATE LONG id STRING tableName STRING fieldName STRING contents |
The Feed
gives you a list of ids to pull from the Posts
table. When a Post
is rendered, all of its Tag
s are attached to it.
An earlier version of this system was limited to just blacklisted tags, it was simple to join the Tag
table to the BlacklistedTag
table and filter the results from there.
@Query("
SELECT DISTINCT Feed.postId FROM Feed
LEFT JOIN (
SELECT DISTINCT Tag.parentPostId as postId FROM Tag
INNER JOIN BlacklistedTag ON Tag.contents = BlacklistedTag.contents
INNER JOIN Feed ON Tag.postId = Feed.postId
WHERE Feed.feedId = :feedId
ORDER BY datetime(Feed.date) DESC
) AS BlockedIds ON Feed.postId = BlockedIds.postId
WHERE Feed.feedId = :feedId
AND BlockedIds.postId IS NULL
ORDER BY datetime(date) DESC
LIMIT :pageSize
OFFSET :offset
")
fun getFilteredPostIdsByPage(
pageSize : Int = 48,
offset : Int = 0,
feedId : Int = ContentFeedIds.Home
) : List<Long>
However, I wanted to make the system more generic. The BlacklistRules
are supposed to allow the program to filter out successive subsets of queries, like Tags
with contents
of FOO
(I hate seeing FOO in my content feed) or Posts
with BAR
in the title
.
Each row in the BlacklistRule
table represents an aggregated subset of ids that will be used to filter the Feed
, but I'm not sure how to perform loops in Android's RoomDB library, or SQLite in general.
So my question is : How would I perform a loop where I continually append unique ids to a set, then use that set as the filter for my content feed? If I were to write this in another language it would work like this :
// create the complete list of blacklisted content
var blacklistedIds : List<Long> = emptyList<Long>()
for (rule : BlacklistRule in BlacklistRules) {
val subquery : String = "SELECT UNIQUE postId FROM $rule.tableName " +
"WHERE $rule.fieldName CONTAINS $rule.contents"
// perform the subquery
// add the ids from the subquery to the blacklistedIds list
}
// fetch and return a page of the feed where the ids are not present in the blacklist
Any help is appreciated. If this is a fundamentally flawed approach, I'd love suggestions on how to improve this process.
The solution could be along the lines of joining every rule with every post/feed/tag permutation which would result in a table from which the rule could be applied.
For example suppose that the tables along with some data where:-
CREATE TABLE IF NOT EXISTS feed (feedid INTEGER, postid INTEGER, date TEXT, PRIMARY KEY(feedid,postid));
CREATE TABLE IF NOT EXISTS post (id INTEGER PRIMARY KEY, creatorId TEXT, title TEXT, message TEXT, uploaddate TEXT);
CREATE TABLE IF NOT EXISTS tag (id INTEGER PRIMARY KEY, parentpostid REFERENCES post(id), contents TEXT);
CREATE TABLE IF NOT EXISTS rule (id INTEGER PRIMARY KEY, tablename TEXT, fieldname TEXT, contents);
INSERT OR IGNORE INTO post VALUES
(1,'Fred','fred''s first message','hi, how are you horrible',datetime('now','-2 days'))
,(2,'Mary','mary''s response to fred','fred, i was fine until you said what you said',datetime('now','-2 days','+30 minutes'))
,(3,'Fred','fred''s response to mary''s response','Mary I''m sorry I was kidding',datetime('now','-2 days','+30 minutes'))
,(4,'Mary','mary''s 2nd response','That is OK Fred you ugly so and so',datetime('now','-2 days','+30 minutes'))
,(5,'Fred','Fred again','All fine and agin sorry',datetime('now','-2 days','+30 minutes'))
;
INSERT OR IGNORE INTO feed VALUES
(100,1,datetime('now','-10 hours'))
,(200,1,datetime('now','-11 hours'))
,(202,2,datetime('now','-9 hours'))
,(203,2,datetime('now','-8 hours'))
;
INSERT INTO rule (tablename,fieldname,contents) VALUES
('post','message','horrible')
,('post','message','nasty')
,('post','message','awful')
,('post','whatever','something')
,('tag','contents','ugly')
;
INSERT OR IGNORE INTO tag (parentpostid,contents) VALUES
(1,'ugly'),(1,'miserable'),(1,'happy'),(2,'happy'),(3,'happy'),(4,'happy'),(5,'ugly')
;
Resulting in:-
and
The the following (SQL) could be the basis of ascertaining what is to be excluded SHOWN TO DEMOSTRATE THE PRINCIPLE(S):-
/* DEMO exclude builder CTE */
WITH
cte_exclude_builder AS (
SELECT DISTINCT
post.*,
feed.*,
tag.*,
rule.tablename,
rule.fieldname,
rule.contents
,CASE
WHEN rule.tablename = 'post' AND fieldname = 'message' THEN instr(post.message,rule.contents)
WHEN rule.tablename = 'post' AND fieldname = 'whatever' THEN 0 /* approriate rule*/
ELSE 0
END AS flg4
, CASE
WHEN rule.tablename = 'tag' AND fieldname = 'contents' THEN instr(tag.contents,rule.contents)
ELSE 0
END AS flg5
FROM post
LEFT JOIN feed ON feed.postid=post.id
LEFT JOIN tag ON tag.parentpostid = post.id
JOIN rule
)
SELECT DISTINCT * FROM cte_exclude_builder
The above, for the data above produces output as per:-
Note that the above uses CTE's (Common Table Expression) see https://www.sqlite.org/lang_with.html
To use the above to omit blacklisted posts (assumption that is what is required) then the following, which is very similar, can be used:-
/* NEXT STEP exluding post using */
WITH
cte_exclude_builder AS (
SELECT DISTINCT
post.*
,CASE
WHEN rule.tablename = 'post' AND fieldname = 'message' THEN instr(post.message,rule.contents)
WHEN rule.tablename = 'post' AND fieldname = 'whatever' THEN 0 /* approriate rule*/
END AS flg4
, CASE
WHEN rule.tablename = 'tag' AND fieldname = 'contents' THEN instr(tag.contents,rule.contents)
ELSE 0
END AS flg5
FROM post
LEFT JOIN feed ON feed.postid=post.id
LEFT JOIN tag ON tag.parentpostid = post.id
JOIN rule
)
SELECT * FROM post WHERE id NOT IN (SELECT DISTINCT id FROM cte_exclude_builder WHERE flg4 OR flg5);
;
i.e.
The above is a little awkward as the rules (CASE WHEN THE ELSE END constructs) have to suit what ever tables/fieldnames should be considered.
Note
The above does not make any reference/indication in regards to incorporating the above into Room. It is simply a matter of cut and past into an @Query
and then minor editing to eliminate the line feeds.
The above also makes many assumptions/interpretations.