I am working on a script using Twitter's API and I am trying to find matches to exact phrases.
The API doesn't allow queries for exact phrases, so I've been trying to find a workaround. However, I am getting results that contain words from the phrases but not the exact phrase itself.
var search_terms = "buy now, look at this meme, how's the weather?";
let termSplit = search_terms.toLowerCase();
let termArray = termSplit.split(', ');
//["buy now", "look at this meme", "how's the weather?"];
client.stream('statuses/filter', { track: search_terms }, function (stream) {
  console.log("Searching for tweets...");
  stream.on('data', function (tweet) {
    if (termArray.some(v => tweet.text.toLowerCase().includes(v.toLowerCase()))) {
      //if(tweet.text.indexOf(termArray) > 0 )
      console.log(tweet);
    }
  });
});
The expected result is any tweet whose text contains one of the exact phrases somewhere.
Instead, I am getting tweets that contain words from one of the array values but not an exact match of the whole phrase.
Example results being returned - "I don't know why now my question has a close request but I don't buy it."
Example results I am expecting - "If you like it then buy now."
What am I doing wrong?
First, a note about the future: Twitter is planning to deprecate the statuses/filter v1.1 endpoint:
These features will be retired in six months on October 29, 2022.
Additionally, beginning today, new client applications will not be able to gain access to v1.1 statuses/sample and v1.1 statuses/filter. Developers with client apps already using these endpoints will maintain access until the functionality is retired. We are not retiring v1.1 statuses/filter in 6-months, only the ability to retrieve compliance messages. We will retire the full endpoint eventually.
So, now is a great time to start using the equivalent v2 API, Filtered Stream, which supports exact phrase matching, helping you avoid this entire scenario in your application code.
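To illustrate what exact phrase matching looks like on the v2 side, here is a minimal sketch that only builds the rule payload and makes no network request. In Filtered Stream rules, a phrase wrapped in double quotes is matched as a phrase, and alternatives are joined with OR. The helper name buildExactPhraseRule and the tag text are hypothetical, and dropping embedded double quotes (rather than escaping them) is a simplifying assumption. The resulting object is meant to be sent as the JSON body of an authenticated POST to the stream rules endpoint, so check the current API documentation for the exact URL, rule length limits, and how punctuation inside phrases is treated.

```javascript
// Hypothetical helper: builds the JSON body for adding a single Filtered
// Stream rule that matches any one of the given phrases as an exact phrase.
function buildExactPhraseRule(phrases, tag) {
  const value = phrases
    // Assumption: embedded double quotes are dropped instead of escaped
    .map(phrase => `"${phrase.replace(/"/g, '')}"`)
    // Any one of the quoted phrases may match
    .join(' OR ');
  return {add: [{value, tag}]};
}

const body = buildExactPhraseRule(['buy now', 'look at this meme'], 'exact phrases');
console.log(body.add[0].value); // "buy now" OR "look at this meme"
```

With a rule like this in place, the filtering happens on Twitter's side, so the stream handler would not need its own phrase check.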
With that out of the way, below I've included a minimal, reproducible example for you to consider which demonstrates how to match exact phrases in streamed tweets, and even extract additional useful information (like which phrase was used to match it and at what index within the tweet text). It includes inline comments explaining things line-by-line:
<script type="module">
// Transform to lowercase, split on commas, and trim whitespace
// on the ends of each phrase, removing empty phrases
function getPhrasesFromTrackText (trackText) {
  return trackText.toLowerCase().split(',')
    .map(str => str.trim())
    .filter(Boolean);
}
const trackText = `buy now, look at this meme, how's the weather?`;
const phrases = getPhrasesFromTrackText(trackText);
// The callback closure which will be invoked with each matching tweet
// from the streaming response data
const handleTweet = (tweet) => {
  // Transform the tweet text once
  const lowerCaseText = tweet.text.toLowerCase();
  // Create a variable to store the first matching phrase that is found
  let firstMatchingPhrase;
  for (const phrase of phrases) {
    // Find the index of the phrase in the tweet text
    const index = lowerCaseText.indexOf(phrase);
    // If the phrase isn't found, immediately continue
    // to the next loop iteration, skipping the rest of the code block
    if (index === -1) continue;
    // Else, set the match variable
    firstMatchingPhrase = {
      index,
      text: phrase,
    };
    // And stop iterating the other phrases by breaking out of the loop
    break;
  }
  if (firstMatchingPhrase) {
    // There was a match; do something with the tweet and/or phrase
    console.log({
      firstMatchingPhrase,
      tweet,
    });
  }
};
// The Stack Overflow code snippet runs in a browser and doesn't have access to
// the Node.js Twitter "client" in your question,
// but you'd use the function like this:
// client.stream('statuses/filter', {track: trackText}, function (stream) {
//   console.log('Searching for tweets...');
//   stream.on('data', handleTweet);
// });
// Instead, the function can be demonstrated by simulating the stream: iterating
// over sample tweets. The tweets with a ✅ are the ones which
// will be matched in the function and be logged to the console:
const sampleTweets = [
  /* ❌ */ {text: `Now available: Buy this product!`},
  /* ✅ */ {text: `This product is available. Buy now!`},
  /* ✅ */ {text: `look at this meme 🤣`},
  /* ❌ */ {text: `Look at how this meme was created`},
  /* ❌ */ {text: `how's it going everyone? good weather?`},
  /* ✅ */ {text: `Just wondering: How's the weather?`},
  // etc...
];
for (const tweet of sampleTweets) {
  handleTweet(tweet);
}
</script>
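One caveat with the indexOf approach above is that it matches substrings, so a phrase like "buy now" would also be found inside "buy nowhere". If that matters for your use case, here is a minimal sketch of a stricter variant. It is not part of the example above: the helper names escapeRegExp and findWholePhraseMatches are hypothetical, and it assumes a runtime that supports regular expression lookbehind and Unicode property escapes (recent versions of Node.js and current browsers). It also returns every matching phrase rather than only the first one.

```javascript
// Escape characters that have special meaning in a regular expression
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: returns every phrase that occurs in the text as a
// whole phrase, i.e. not directly preceded or followed by a letter or digit.
// The phrases are assumed to be lowercase already.
function findWholePhraseMatches(text, phrases) {
  const lowerCaseText = text.toLowerCase();
  const matches = [];
  for (const phrase of phrases) {
    const pattern = new RegExp(
      `(?<![\\p{L}\\p{N}])${escapeRegExp(phrase)}(?![\\p{L}\\p{N}])`,
      'u',
    );
    const match = pattern.exec(lowerCaseText);
    // Record the phrase and the index of its first whole-phrase occurrence
    if (match) matches.push({index: match.index, text: phrase});
  }
  return matches;
}

const examplePhrases = ['buy now', "how's the weather?"];
console.log(findWholePhraseMatches('Buy nowhere else', examplePhrases).length); // 0
console.log(findWholePhraseMatches('If you like it then buy now.', examplePhrases).length); // 1
```

Lookarounds are used here instead of \b because \b does not behave as a boundary next to punctuation, which would break phrases that end in a character like "?".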