algorithm optimization rust interpreter brainfuck

How to optimize brainf*ck instructions

I'm trying to write an optimisation feature for my brainf*ck interpreter. It basically combines same instructions into 1 instruction.

I wrote this function but It doesn't work properly:


pub fn optimize_multiple(instructions: &Vec<Instruction>) -> Vec<OptimizedInstruction> {

    let mut opt: Vec<OptimizedInstruction> = Vec::new();

    let mut last_instruction = instructions.get(0).unwrap();
    let mut last_count = 0;

    for instruction in instructions.iter() {

        if instruction == last_instruction {
            last_count += 1;
        }

        else if let Instruction::Loop(i) = instruction {
            opt.push(OptimizedInstruction::Loop(optimize_multiple(i)));
            last_count = 1;
        }

        else {
            opt.push(OptimizedInstruction::new(last_instruction.clone(), last_count));
            last_instruction = instruction;
            last_count = 1;
        }

    }

    opt

}

Here's the OptimizedInstruction enum and the "new" method: (The Instruction::Loop hand is just a place holder, I didn't used it)

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum OptimizedInstruction {
    IncrementPointer(usize),
    DecrementPointer(usize),
    Increment(usize),
    Decrement(usize),
    Write,
    Read,
    Loop(Vec<OptimizedInstruction>),
}

impl OptimizedInstruction {
    pub fn new(instruction: Instruction, count: usize) -> OptimizedInstruction {
        match instruction {
            Instruction::IncrementPointer => OptimizedInstruction::IncrementPointer(count),
            Instruction::DecrementPointer => OptimizedInstruction::DecrementPointer(count),
            Instruction::Increment => OptimizedInstruction::Increment(count),
            Instruction::Decrement => OptimizedInstruction::Decrement(count),
            Instruction::Write => OptimizedInstruction::Write,
            Instruction::Read => OptimizedInstruction::Read,
            Instruction::Loop(_i) => OptimizedInstruction::Loop(Vec::new()),
        }
    }
}

I ran it with this input: ++-[+-++]

But it gave me this output:

[Increment(2), Loop([Increment(1), Decrement(1)])]

Insted of this:

[Increment(2), Decrement(1), Loop([Increment(1), Decrement(1), Increment(2)])]

I've been trying to solve it for 2 days and still, I don't have any idea about why it doesn't work.

~ Derin

Solution

First off, just to rain a little on your parade I just want to point out that Wilfred already made a brainf*ck compiler in Rust that can compile to a native binary through LLVM (bfc). If you are getting stuck, you may want to check his implementation to see how he does it. If you ignore the LLVM part, it isn't too difficult to read through and he has a good approach.

When broken down into its core components, this problem revolves around merging two elements together. The most elegant way I have come across to solve this is use an iterator with a merge function. I wrote an example of what I imagine that would look like below. I shortened some of the variable names since they were a bit long but the general idea is the same. The merge function has a very simple job. When given two elements, attempt to merge them into a single new element. The iterator then handles putting them through that function and returning items once they can no longer be merged. A sort of optional fold if you will.

pub struct MergeIter<I: Iterator, F> {
    iter: I,
    func: Box<F>,
    held: Option<<I as Iterator>::Item>,
}

impl<I, F> Iterator for MergeIter<I, F>
where
    I: Iterator,
    F: FnMut(&<I as Iterator>::Item, &<I as Iterator>::Item) -> Option<<I as Iterator>::Item>,
{
    type Item = <I as Iterator>::Item;

    fn next(&mut self) -> Option<Self::Item> {
        let mut first = match self.held.take() {
            Some(v) => v,
            None => self.iter.next()?,
        };

        loop {
            let second = match self.iter.next() {
                Some(v) => v,
                None => return Some(first),
            };

            match (self.func)(&first, &second) {
                // If merge succeeds, attempt to merge again
                Some(v) => first = v,
                // If merge fails, store second value for next iteration and return result
                None => {
                    self.held = Some(second);
                    return Some(first);
                }
            }
        }
    }
}

pub trait ToMergeIter: Iterator {
    fn merge<F>(self, func: F) -> MergeIter<Self, F>
    where
        Self: Sized,
        F: FnMut(&Self::Item, &Self::Item) -> Option<Self::Item>;
}

impl<I: Sized + Iterator> ToMergeIter for I {
    fn merge<F>(self, func: F) -> MergeIter<Self, F>
    where
        Self: Sized,
        F: FnMut(&Self::Item, &Self::Item) -> Option<Self::Item>,
    {
        MergeIter {
            iter: self,
            func: Box::new(func),
            held: None,
        }
    }
}

Then we can apply this process recursively to get our result. Here is a brief example of what that would look like. It isn't quite as memory efficient since it makes a new Vec each time, but it makes the process of specifying what instructions to merge way easier for you and helps make your work easier to read/debug.

pub fn optimize(instructions: Vec<Instruction>) -> Vec<Instruction> {
    instructions
        .into_iter()
        // Recursively call function on loops
        .map(|instruction| match instruction {
            Instruction::Loop(x) => Instruction::Loop(optimize(x)),
            x => x,
        })
        // Merge elements using the merge iter
        .merge(|a, b| {
            // State if any two given elements should be merged together or not.
            match (a, b) {
                (Instruction::IncPtr(x), Instruction::IncPtr(y)) => {
                    Some(Instruction::IncPtr(x + y))
                }
                (Instruction::DecPtr(x), Instruction::DecPtr(y)) => {
                    Some(Instruction::DecPtr(x + y))
                }
                (Instruction::Increment(x), Instruction::Increment(y)) => {
                    Some(Instruction::Increment(x + y))
                }
                (Instruction::Decrement(x), Instruction::Decrement(y)) => {
                    Some(Instruction::Decrement(x + y))
                }
                // Etc...
                _ => None,
            }
        })
        // Collect results to return
        .collect()
}

playground link