Tags: matlab, if-statement, while-loop, low-level-io

Altering text in a .txt file and creating a new file output in MATLAB


I apologize in advance if the title seems a bit off; I had trouble deciding what to name it. This is homework that deals with low-level I/O. For one assignment, I have been given two .txt files: one contains a list of email addresses, and the other contains a list of members who no longer want to be on the email list. I have to delete the emails of the members on the second list. There may also be some nasty surprises in the .txt files: I have to clean up the emails by removing any unwanted punctuation after them, such as semicolons, commas and spaces, and I need to lowercase all of the text. I'm struggling with this problem in more ways than one (I'm not entirely sure how to get my file to write what I need in my output), but right now my main concern is outputting the unsubscribe messages in the correct order. sortrows doesn't seem to work.

Here are some test cases:

Test Cases
unsubscribe('Grand Prix Mailing List.txt', ...
              'Unsubscribe from Grand Prix.txt')
     => output file named 'Grand Prix Mailing List_updated.txt' that looks
        like 'Grand Prix Mailing List_updated_soln.txt'
     => output file named 'Unsubscribe from Grand Prix_messages.txt' that 
        looks like 'Unsubscribe from Grand Prix_messages_soln.txt'

The original mailing list

Grand Prix Mailing List:
MPLUMBER3@gatech.edu, 
lplumber3@gatech.edu 
Ttoadstool3@gatech.edu;
bkoopa3@gatech.edu
ppeach3@gatech.edu,
ydinosaur3@gatech.edu
kBOO3@gatech.edu
WBadguy3@gatech.edu;
FKong3@gatech.edu
dkong3@gatech.edu
dbones3@gatech.edu

People who are like nope:

MARIO PLUMBER; 
bowser koopa 
Luigi Plumber,
Donkey Kong 
King BOO;
Princess Peach

What it's supposed to look like afterwards:

ttoadstool3@gatech.edu
ydinosaur3@gatech.edu
wbadguy3@gatech.edu
fkong3@gatech.edu
dbones3@gatech.edu

My file output:

Mario, you have been unsubscribed from the Grand Prix mailing list.
Luigi, you have been unsubscribed from the Grand Prix mailing list.
Bowser, you have been unsubscribed from the Grand Prix mailing list.
Princess, you have been unsubscribed from the Grand Prix mailing list.
King, you have been unsubscribed from the Grand Prix mailing list.
Donkey, you have been unsubscribed from the Grand Prix mailing list.

So Amro has been kind enough to provide a solution, though it's a little above what I know right now. My main issue now is that the unsubscribe messages need to be in the same order as the original email list. For instance, although Bowser appears before Luigi on the list of people who want to unsubscribe, Luigi's message needs to come before Bowser's, because Luigi's email appears first in the mailing list.

Here is my original code:

function[] = unsubscribe(email_ids, member_emails)
    Old_list = fopen(email_ids, 'r'); %// opens my email list
    Old_Members = fopen(member_emails, 'r'); %// Opens up the names of people who want to unsubscribe
    emails = fgets(Old_list); %// Reads first line of emails
    member_emails = [member_emails]; %// Creates an array to populate
while ischar(emails) %// Starts my while loop
%// Pulls out a line in the email
    emails = fgets(Old_list);
%// Quits when it sees this jerk
    if emails == -1
        break;
    end

%// I go in to clean stuff up here, but it doesn't do any of it. It's still in the while loop though, so I am not sure where the error is
proper_emails = lower(member_emails); %// This is supposed to lowercase the emails, but it's not working
unwanted = findstr(member_emails, ' ,;');
member_emails(unwanted) = '';
member_emails = [member_emails, emails];
end

while ischar(Old_Members) %// Does the same for the members who want to unsubscribe
    names = fgetl(member_emails);
    if emails == -1
        break
    end
proper_emails = lower(names); %// Lowercases everything
unwanted = findstr(names, ' ,;');
names(unwanted) = '';
end

Complainers = find(emails);

New_List = fopen('Test2', 'w'); %// Creates a file to be written to
fprintf(New_List, '%s', member_emails); %// Writes to it
Sorry_Message = fopen('Test.txt', 'w');
fprintf(Sorry_Message, '%s', Complainers);

%// Had an issue with these, so I commented them out temporarily
%// fclose(New_List);
%// fclose(Sorry_Message);
%// fclose(email_ids); 
%// fclose(members);

end
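
For comparison, here is a minimal sketch of what the read-and-clean loop could look like using only low-level I/O. It is not the assignment's official solution, and the helper name read_clean_lines is hypothetical. It differs from the code above in a few places: fgetl is given the file ID returned by fopen rather than the file name, the result of lower is kept instead of being stored in an unused variable, the next line is read at the end of the loop so the first line is not skipped, and trailing characters are stripped with regexprep, because findstr(names, ' ,;') looks for the three-character sequence ' ,;' rather than for any one of those characters.

```matlab
function lines = read_clean_lines(filename)
    % Hypothetical helper: read a text file line by line and return a
    % cell array of cleaned, lowercased lines.
    fid = fopen(filename, 'r');
    if fid == -1
        error('Could not open %s', filename);
    end
    lines = {};
    line = fgetl(fid);                  % fgetl takes the file ID, not the name
    while ischar(line)                  % fgetl returns -1 (numeric) at end of file
        line = lower(strtrim(line));    % keep the lowercased, trimmed result
        line = regexprep(line, '[,;\s]+$', '');  % drop trailing , ; and whitespace
        if ~isempty(line)
            lines{end+1} = line;        %#ok<AGROW> skip blank lines
        end
        line = fgetl(fid);              % read the next line at the end of the loop
    end
    fclose(fid);                        % close using the file ID
end
```

Calling read_clean_lines('Grand Prix Mailing List.txt') would then give the cleaned addresses in file order, and the same helper can be reused for the names file.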

Solution

  • Below is my implementation for the problem. The code is commented at each step and should be easy to understand. I'm using regular expressions when I can because this is the sort of thing they're good at... Also note that I don't have any loops in the code :)

    unsubscribe.m

    function unsubscribe(mailinglist_file, names_file)
    
        %%
        % read list of names of those who want to unsubscribe
        names = read_file(names_file);
    
        % break names into first/last parts
        first_last = regexp(names, '(\w+)\s+(\w+)', 'tokens', 'once');
        first_last = vertcat(first_last{:});
    
        % build email handles (combination of initials + name + domain)
        emails_exclude = strcat(cellfun(@(str) str(1), first_last(:,1)), ...
            first_last(:,2), '3@gatech.edu');
    
        %%
        % read emails in mailing list
        emails = read_file(mailinglist_file);
    
        % update emails by removing those who wish to unsubscribe
        emails(ismember(emails, emails_exclude)) = [];
    
        %%
        % write updated mailing list
        [~,fName,fExt] = fileparts(mailinglist_file);
        fid = fopen([fName '_updated' fExt], 'wt');
        fprintf(fid, '%s\n', emails{:});
        fclose(fid);
    
        % write list of names removed
        % capitalize first letter of first name
        first_names = cellfun(@(str) [upper(str(1)) str(2:end)], ...
            first_last(:,1), 'UniformOutput',false);
        msg = strcat(first_names, ...
            ', you have been unsubscribed from the mailing list.');
        fid = fopen([fName '_messages' fExt], 'wt');
        fprintf(fid, '%s\n', msg{:});
        fclose(fid);
    
    end
    
    function C = read_file(filename)
        % read lines from file into a cell-array of strings
        fid = fopen(filename, 'rt');
        C = textscan(fid, '%s', 'Delimiter','');
        fclose(fid);
    
        % clean up lines by removing trailing punctuation
        C = lower(regexprep(C{1}, '[,;\s]+$', ''));
    end
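
    To make the name-to-handle step less opaque, the snippet below walks through it for a single name. It is only an illustration of what the vectorized code above does per line; the variable names tok, first, last, handle and greeting are made up for this example, and the '3@gatech.edu' suffix is an assumption taken from the sample data.

    ```matlab
    name = 'mario plumber';                      % one cleaned line from names.txt

    % 'tokens' with 'once' returns a 1x2 cell: {first name, last name}
    tok   = regexp(name, '(\w+)\s+(\w+)', 'tokens', 'once');
    first = tok{1};
    last  = tok{2};

    % email handle: first initial + last name + fixed suffix
    handle = [first(1) last '3@gatech.edu'];

    % first name with its first letter capitalized, used in the message
    greeting = [upper(first(1)) first(2:end)];

    fprintf('%s -> %s (%s)\n', name, handle, greeting);
    ```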
    

    Given the following text files:

    list.txt

    MPLUMBER3@gatech.edu, 
    lplumber3@gatech.edu 
    Ttoadstool3@gatech.edu;
    bkoopa3@gatech.edu
    ppeach3@gatech.edu,
    ydinosaur3@gatech.edu
    kBOO3@gatech.edu
    WBadguy3@gatech.edu;
    FKong3@gatech.edu
    dkong3@gatech.edu
    dbones3@gatech.edu
    

    names.txt

    MARIO PLUMBER; 
    bowser koopa 
    Luigi Plumber,
    Donkey Kong 
    King BOO;
    Princess Peach
    

    Here is what I get when running the code:

    >> unsubscribe('list.txt', 'names.txt')
    

    list_messages.txt

    Mario, you have been unsubscribed from the mailing list.
    Bowser, you have been unsubscribed from the mailing list.
    Luigi, you have been unsubscribed from the mailing list.
    Donkey, you have been unsubscribed from the mailing list.
    King, you have been unsubscribed from the mailing list.
    Princess, you have been unsubscribed from the mailing list.
    

    list_updated.txt

    ttoadstool3@gatech.edu
    ydinosaur3@gatech.edu
    wbadguy3@gatech.edu
    fkong3@gatech.edu
    dbones3@gatech.edu
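
    The messages above follow the order of names.txt, while the question asks for the order of the mailing list. One possible way to get that order, sketched here under the assumption that every handle appears at most once in each list, is to use the second output of ismember. The variable names mirror those in unsubscribe.m, but the data is hard-coded for illustration and ordered_names is a new, hypothetical variable.

    ```matlab
    % Cleaned mailing list in its original order (as read_file would return it)
    emails = {'mplumber3@gatech.edu'; 'lplumber3@gatech.edu'; ...
              'ttoadstool3@gatech.edu'; 'bkoopa3@gatech.edu'; ...
              'ppeach3@gatech.edu'; 'ydinosaur3@gatech.edu'; ...
              'kboo3@gatech.edu'; 'wbadguy3@gatech.edu'; ...
              'fkong3@gatech.edu'; 'dkong3@gatech.edu'; 'dbones3@gatech.edu'};

    % Handles and capitalized first names, in the order of names.txt
    emails_exclude = {'mplumber3@gatech.edu'; 'bkoopa3@gatech.edu'; ...
                      'lplumber3@gatech.edu'; 'dkong3@gatech.edu'; ...
                      'kboo3@gatech.edu'; 'ppeach3@gatech.edu'};
    first_names = {'Mario'; 'Bowser'; 'Luigi'; 'Donkey'; 'King'; 'Princess'};

    % loc(i) is the row of emails_exclude matching emails{i}, or 0 if none
    [tf, loc] = ismember(emails, emails_exclude);

    % Walking the mailing list top to bottom yields the names in list order
    ordered_names = first_names(loc(tf));

    msg = strcat(ordered_names, ...
        ', you have been unsubscribed from the mailing list.');
    fprintf('%s\n', msg{:});
    ```

    In unsubscribe.m this lookup would have to run before the line that deletes the matched entries from emails, since it relies on the full mailing list.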