arrays go rust struct

Using array instead of vector in a structure in Rust


I write some small programs that do the same thing in many programming languages. I compare their performance and RAM consumption. This is one of the programs I wrote for testing.

Both programs read the same file. The file contains data separated by \t (tab) and \n (newline) characters. The content is similar to the following.

aaaa\tbbb\tccc\tddd\teee\tfff\tggg\thhh\n
aaaa\tbbb\tccc\tddd\teee\tfff\tggg\thhh\n
aaaa\tbbb\tccc\tddd\teee\tfff\tggg\thhh\n
aaaa\tbbb\tccc\tddd\teee\tfff\tggg\thhh

The file I created has 14 columns and 63 rows. These numbers can change; the exact values are not important for the test.

I get the rows with split('\n') and the fields in each row with split('\t'). It is a very simple deserialization. The program reads the file once and deserializes it 200,000 times, then prints the elapsed time to the console.

Go:

package main

import (
    "fmt"
    "os"
    "strings"
    "time"
)

type Datatable struct {
    id   int
    rows [][]string
}

func main() {
    start := time.Now()

    dat, err := os.ReadFile("C:\\Temp\\test1.txt")
    if err != nil {
        panic("file not found")
    }
    str := string(dat)

    count := 200_000
    tables := make([]Datatable, count)

    for i := 0; i < count; i++ {
        table := Datatable{i, nil}
        var lines []string = strings.Split(str, "\n")
        table.rows = make([][]string, len(lines))
        for j, l := range lines {
            table.rows[j] = strings.Split(l, "\t")
        }
        tables[i] = table
    }

    end := time.Now()
    elapsed := end.Sub(start)
    fmt.Println("Time: ", elapsed)

    var b []byte = make([]byte, 1)
    os.Stdin.Read(b)
}

Rust:

use std::fs;
use std::time::SystemTime;
use std::io::{self, BufRead};

struct Table<'a>{
    id: usize,
    rows: Vec<Vec<&'a str>>,
}
fn main() {
    let start = SystemTime::now();
    let str = fs::read_to_string("C:\\Temp\\test1.txt")
        .expect("read_to_string: failed");

    let count = 200_000;
    let mut tables = Vec::with_capacity(count);
    for i in 0..count {
        let lines = str.split('\n');
        let mut table = Table {
            id : i,
            rows : Vec::with_capacity(lines.size_hint().0),
        };
        for item in lines {
            table.rows.push(item.split('\t').collect::<Vec<&str>>());
        }
        tables.push(table);
    }
    println!("Time: {}", start.elapsed().expect("elapsed: failed").as_millis());

    let mut line = String::new();
    let stdin = io::stdin();
    stdin.lock().read_line(&mut line).expect("read_line: failed");
}

Build commands:

go build -ldflags "-s -w"
cargo build --release

The results on my computer are as follows:

Go:

Time     : 4510 ms
RAM usage: 3217 MB

Rust:

Time     : 5845 ms
RAM usage: 3578 MB

I tried to write the code as simply as possible. You can try it by copying and pasting.

The Rust code works, but it is slower than Go and consumes more RAM. Before writing the code I was hoping Rust would run faster. Maybe there is something I don't know.

Using arrays instead of vectors in the struct might make the Rust version faster, but I'm not sure if that's possible. What I want to know is: how can I write this code in Rust to make it run faster?


Solution

  • The system allocator on Windows is very slow. Rust uses the system allocator by default, while Go has its own allocator. This matters here because the program allocates a new Vec for every row of every table.

    If you replace the allocator, the Rust program should run faster. One allocator to try is mimalloc. Another option is jemalloc, but it is hard to build on Windows.
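To illustrate how the allocator is swapped, here is a minimal sketch using only the standard library. `CountingAlloc`, `parse`, and `count_allocations` are hypothetical names made up for this example; the wrapper just forwards to the system allocator and counts requests, which shows how allocation-heavy the parsing loop is. The commented-out lines show the equivalent declaration for the `mimalloc` crate, assuming it has been added as a dependency in Cargo.toml.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical wrapper for illustration: forwards every request to the
// system allocator and counts how many allocations were made.
struct CountingAlloc;

static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATIONS.fetch_add(1, Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

// This attribute is what replaces the default allocator for the whole program.
#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

// With the mimalloc crate as a dependency (an assumption, it is not part of
// the standard library), the declaration above would instead be:
//
//     use mimalloc::MiMalloc;
//     #[global_allocator]
//     static GLOBAL: MiMalloc = MiMalloc;

// Same parsing strategy as in the question: one Vec per row.
fn parse(data: &str) -> Vec<Vec<&str>> {
    data.split('\n')
        .map(|line| line.split('\t').collect())
        .collect()
}

// Returns how many allocations were requested while parsing `data` once.
fn count_allocations(data: &str) -> usize {
    let before = ALLOCATIONS.load(Ordering::Relaxed);
    let rows = parse(data);
    let after = ALLOCATIONS.load(Ordering::Relaxed);
    drop(rows);
    after - before
}

fn main() {
    let data = "a\tb\tc\nd\te\tf\ng\th\ti";
    println!("allocations for one parse: {}", count_allocations(data));
}
```

Each row costs at least one allocation, so with 63 rows and 200,000 iterations the program makes well over twelve million allocator calls. That is why the speed of the allocator can dominate the measured time.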