perl email mime

Perl parsing multipart/alternative emails


I am looking for a way to parse the body text of multipart/alternative emails. I currently have a Perl script using the Email::MIME module, which parses text/plain and text/html parts correctly. The problem is that when I parse a multipart/alternative email, $part->body always returns empty. I have tried $part->body_raw, which does return the text body, but it also includes the boundary line and the part headers, which I need to omit.

Current output using $part->body_raw

--_000_47C8E15E8EEDCB4E94E891F9414C019A0CB5BDEE79DFW1MBX07mex0_
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable 

Text Body 

Desired Output

Text Body

Perl Code

my ( $body, $text_body, $html_body, $multi_body );
for my $part (@parts) {

    if ( $part->content_type =~ m!text/html! ) {
        my $hs = HTML::Strip->new( emit_spaces => 0 );
        $html_body .= $hs->parse( $part->body );
        print "Found HTML\n";
    }
    elsif ($part->content_type =~ m!text/plain!
        or $part->content_type eq '' )
    {
        $text_body .= $part->body;
        print "Found TEXT\n";
    }
    elsif ( $part->content_type =~ m!multipart/alternative! ) {
        print "Found Multipart\n";
        $multi_body .= $part->body;    # this is what comes back empty
    }
}

Source

Content-Type: multipart/related;
boundary="_004_47C8E15E8EEDCB4E94E891F9414C019A0CB5BDEE79DFW1MBX07mex0_";
type="multipart/alternative"
MIME-Version: 1.0

--_004_47C8E15E8EEDCB4E94E891F9414C019A0CB5BDEE79DFW1MBX07mex0_
Content-Type: multipart/alternative;
boundary="_000_47C8E15E8EEDCB4E94E891F9414C019A0CB5BDEE79DFW1MBX07mex0_"

--_000_47C8E15E8EEDCB4E94E891F9414C019A0CB5BDEE79DFW1MBX07mex0_
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable

Test Body

Solution

  • Multiparts contain multiple parts. Iterate over them:

    use strict;
    use warnings;
    use Email::MIME;
    use feature qw/say/;

    # Single-quoted heredoc: the message text is taken literally.
    my $source = <<'EOF';
    Content-Type: multipart/related;
    boundary="_004_47C8E15E8EEDCB4E94E891F9414C019A0CB5BDEE79DFW1MBX07mex0_";
    type="multipart/alternative"
    MIME-Version: 1.0

    --_004_47C8E15E8EEDCB4E94E891F9414C019A0CB5BDEE79DFW1MBX07mex0_
    Content-Type: multipart/alternative;
    boundary="_000_47C8E15E8EEDCB4E94E891F9414C019A0CB5BDEE79DFW1MBX07mex0_"

    --_000_47C8E15E8EEDCB4E94E891F9414C019A0CB5BDEE79DFW1MBX07mex0_
    Content-Type: text/plain; charset="us-ascii"
    Content-Transfer-Encoding: quoted-printable

    Test Body
    EOF

    my $msg = Email::MIME->new($source);

    for my $part ($msg->parts) {
        # A multipart part is only a container; its text lives in its subparts.
        if ( $part->content_type =~ m!multipart/alternative! ) {
            say "Found Multipart";
            for my $subpart ($part->parts) {
                say $subpart->body;
            }
        }
    }
    

    Outputs:

    C:\>perl test_mime.pl 
    Found Multipart 
    Test Body
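
The loop above handles exactly one level of nesting. Real messages can nest multiparts deeper (for example multipart/mixed around multipart/related around multipart/alternative), so a recursive walk may be more robust. The following is only a sketch: it assumes Email::MIME's subparts and body_str methods, and the helper name plain_text_bodies as well as the short boundary names "outer" and "inner" are made up for illustration.

```perl
use strict;
use warnings;
use Email::MIME;

# Hypothetical helper: recursively collect the decoded bodies of all
# text/plain leaf parts, however deeply the multiparts are nested.
sub plain_text_bodies {
    my ($part) = @_;

    # subparts returns an empty list for a non-multipart (leaf) part.
    my @children = $part->subparts;
    return map { plain_text_bodies($_) } @children if @children;

    # A missing Content-Type header is treated as text/plain.
    my $type = $part->content_type;
    $type = 'text/plain' unless defined $type && length $type;

    # body_str undoes the transfer encoding and the charset encoding.
    return $type =~ m{^text/plain}i ? ( $part->body_str ) : ();
}

# Illustrative message; "outer" and "inner" are assumed boundary names.
my $source = <<'EOF';
Content-Type: multipart/related; boundary="outer"; type="multipart/alternative"
MIME-Version: 1.0

--outer
Content-Type: multipart/alternative; boundary="inner"

--inner
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable

Test Body
--inner
Content-Type: text/html; charset="us-ascii"

<p>Test Body</p>
--inner--

--outer--
EOF

my $msg = Email::MIME->new($source);
print "$_\n" for plain_text_bodies($msg);
```

Called on the top-level message, the helper skips the text/html alternative and returns only the plain-text body, without the boundary line or part headers that body_raw includes.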