I'm working on a problem where I have two inputs.
Input1 :
Multiple Set of rules (Sample):
RuleSet1:
1. I am $name
2. I am $age years old
3. $bookname is my favorite book
....
RuleSet2:
1. I love $sportname
2. $color is my favorite color
....
RuleSet3:
1. $fruit is my favorite fruit
2. I am a $diet
3. I speak $language
4. I am from $countryname
....
Here $name, $age, $bookname, ... are placeholders. There can be any number of such rule sets.
Input2 :
Multiple Set of Input Strings.
Set 1:
1. I am 26 years old
2. I am James
.....
Set 2:
1. I am John
2. ToKillAMockinBird is my favorite book
.......
Set 3:
1. TuesdaysWithMorrie is my favorite book
2. I am Bill
3. I am 26 years old
......
Set 4:
1. I am Jack
2. I am 27 years old
3. WarAndPeace is my favorite book
......
Set 5:
1. I am a vegan
2. I speak English
......
Set 6:
1. Purple is my favorite color
2. I love football
......
Problem Statement :
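For each set of strings in Input 2, I need to match it against the rule sets in Input 1 and say whether its strings appear in the same order as the rules they match. Rules may be skipped (Set 2 has no age line but is still true); only the relative order matters.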
Output :
Set1 --> false
Set2 --> true
Set3 --> false
Set4 --> true
Set5 --> true
Set6 --> false
I tried brute force: for each input set, I iterate over its strings, look up the rule each one matches, record that rule's number, and finally check whether the numbers are in ascending order. But it's not efficient, and both Input 1 and Input 2 could be huge data sets. Is there a better way of solving this?
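For reference, the brute-force approach described above can be sketched roughly as follows in Python. This is only an illustration under assumptions: every placeholder value is a single token matched by \w+, each string matches at most one rule in a rule set, and the helper names (rule_to_regex, in_rule_order, matches_any_rule_set) are hypothetical.

```python
import re

def rule_to_regex(rule):
    # Assumption: a placeholder is "$" plus word characters and its value
    # is a single token, so it is replaced by \w+.
    literal_parts = re.split(r"\$\w+", rule)
    return re.compile(r"\w+".join(re.escape(p) for p in literal_parts))

def in_rule_order(rule_set, lines):
    patterns = [rule_to_regex(rule) for rule in rule_set]
    positions = []
    for line in lines:
        # Number of the first rule that matches the whole line, if any.
        idx = next((i for i, p in enumerate(patterns) if p.fullmatch(line)), None)
        if idx is None:
            return False
        positions.append(idx)
    # The rule numbers must be strictly ascending.
    return all(a < b for a, b in zip(positions, positions[1:]))

def matches_any_rule_set(rule_sets, lines):
    return any(in_rule_order(rule_set, lines) for rule_set in rule_sets)

rule_sets = [
    ["I am $name", "I am $age years old", "$bookname is my favorite book"],
    ["I love $sportname", "$color is my favorite color"],
]
print(matches_any_rule_set(rule_sets, ["I am 26 years old", "I am James"]))
print(matches_any_rule_set(rule_sets, ["I am John", "ToKillAMockinBird is my favorite book"]))
```

Every input set is tested against every rule of every rule set here, which is where the cost comes from.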
Here is a thought: concatenate the lines of each rule set, and of each input set, into a single line using some special delimiter (or, alternatively, a surrounding pattern).
Rule set #1 could then look like this
I am $name ### I am $age years old ### $bookname is my favorite book
or like this
[I am $name] [I am $age years old] [$bookname is my favorite book]
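Then you can do the same for the input sets and compare. It seems to me that replacing each placeholder with the regex \w+ may be sufficient. Since an input set may skip rules (Set 2 in the question has no age line), each rule would probably also need to be optional in the resulting pattern.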
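A minimal Python sketch of this idea, using the bracket form. It rests on a few assumptions that are not in the question: placeholder values are single tokens matched by \w+, the input strings themselves contain no square brackets, and each bracketed rule is wrapped in an optional group so that skipped rules are allowed. The function names are made up for illustration.

```python
import re

def compile_rule_set(rule_set):
    # Build one regex per rule set: each rule becomes an optional
    # "[...]" group, in rule order, with placeholders replaced by \w+.
    groups = []
    for rule in rule_set:
        literal_parts = re.split(r"\$\w+", rule)
        body = r"\w+".join(re.escape(p) for p in literal_parts)
        groups.append(r"(?:\[" + body + r"\])?")
    return re.compile("".join(groups))

def encode(lines):
    # Concatenate an input set into one line, same bracket convention.
    return "".join("[" + line + "]" for line in lines)

def appears_in_order(compiled_rule_sets, lines):
    text = encode(lines)
    # The whole encoded line must be consumed by one rule set's pattern.
    return any(p.fullmatch(text) is not None for p in compiled_rule_sets)

compiled = [
    compile_rule_set(["I am $name", "I am $age years old", "$bookname is my favorite book"]),
    compile_rule_set(["I love $sportname", "$color is my favorite color"]),
    compile_rule_set(["$fruit is my favorite fruit", "I am a $diet",
                      "I speak $language", "I am from $countryname"]),
]
print(appears_in_order(compiled, ["I am John", "ToKillAMockinBird is my favorite book"]))
print(appears_in_order(compiled, ["Purple is my favorite color", "I love football"]))
```

Each rule set is compiled once and reused for every input set, so the per-input work is one fullmatch call per rule set. Whether that is fast enough for very large inputs would have to be measured.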