c, string, multidimensional-array, strstr

Detect array of string in another string


I'm a beginner in programming, and I want to solve this problem:

(Spam Scanner) Spam (or junk e-mail) costs U.S. organizations billions of dollars a year in spam-prevention software, equipment, network resources, bandwidth, and lost productivity. Research online some of the most common spam e-mail messages and words, and check your own junk e-mail folder. Create a list of 30 words and phrases commonly found in spam messages. Write a program in which the user enters an e-mail message. Read the message into a large character array and ensure that the program does not attempt to insert characters past the end of the array. Then scan the message for each of the 30 keywords or phrases. For each occurrence of one of these within the message, add a point to the message’s “spam score.” Next, rate the likelihood that the message is spam, based on the number of points it received.

I tried writing my code like this:

#include <stdio.h>
#include <string.h>
#include <ctype.h>

void find_string(char *emailSearch);
const char spam[][30] = {
"congratulation",
"free",
"100%",
"earn",
"million",
"click",
"here",
"instant",
"limited",
"urgent",
"winner",
"selected",
"bargain",
"deal",
"debt",
"lifetime",
"cheap",
"easy",
"bonus",
"credit",
"bullshit",
"scam",
"junk",
"spam",
"passwords",
"invest",
"bulk",
"exclusive",
"win",
"sign"};

int main(){
char email[1000];
    printf("Enter your short email message: \n");
    fgets(email, 80, stdin);
    email[strlen(email)-1] = '\0';
    find_string(email);
    return 0;
    }

void find_string(char *emailSearch){
int i = 0;
    while(emailSearch[i]){
        (tolower(emailSearch[i]));
        i++;
    }
    if(strstr(emailSearch,spam)){
        printf("Your email message is considered spam!");
    }
    else{
        printf("Your email is not spam!");
    }
}

I tried entering words from the spam array, but the program still prints "Your email is not spam!". Can anyone help me fix this?


Solution

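  • The main issue is that you need to iterate over each of your spam words and search for it in your text: strstr() takes a single string as its second argument, not an array of strings. Note also that tolower() returns the converted character rather than modifying its argument, so the result has to be stored back into the string. If you have strcasestr() (a non-standard extension found on glibc and the BSDs), use that instead of strtolower(email):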

    #include <stdio.h>
    #include <string.h>
    #include <ctype.h>
    
    #define LEN 79
    
    const char *spam[] = {
        "congratulation",
    //  ...
    };
    
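    // Lowercase s in place and return it so the call can be chained.
    char *strtolower(char *s) {
        size_t n = strlen(s);
        for(size_t i = 0; i < n; i++) {
            // tolower() needs a value representable as unsigned char (or EOF)
            s[i] = (char)tolower((unsigned char)s[i]);
        }
        return s;
    }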
    
    void find_string(char *emailSearch){
        // Search the text for each spam word in turn; stop at the first hit.
        for(size_t i = 0; i < sizeof spam / sizeof *spam; i++) {
            if(strstr(emailSearch, spam[i])) {
                printf("Your email message is considered spam!\n");
                return;
            }
        }
        printf("Your email is not spam!\n");
    }
    
    int main(void){
        char email[LEN+1];
        printf("Enter your short email message: \n");
        // fgets() returns NULL on EOF or a read error, leaving email unset
        if(!fgets(email, sizeof email, stdin)) {
            return 1;
        }
        find_string(strtolower(email));
        return 0;
    }
    

    The next step would be to split your email into words, so that the spam word "here" does not cause the unrelated email word "there" to be treated as spam. Once you have individual words, you can use strcmp() to compare each email word against each word on the spam list. If you sort your spam list, you could use bsearch() instead of a linear search. Alternatively, consider using a hash table for your spam list.

    The step after that is to implement some type of stemming, so that "congratulations" is again considered spam because its root word "congratulation" is on the spam list.