ruby-on-rails ruby fastercsv

Rails: FasterCSV - Unique Occurrences


I have my CSV file imported as such:

records = FasterCSV.read(path, :headers => true, :header_converters => :symbol)

How can I get the unique occurrences of my data? For instance, here is some sample data:

ID,Timestamp
test,2008.12.03.20.26.32
test,2008.12.03.20.26.38
test,2008.12.03.20.26.41
test,2008.12.03.20.26.42
test,2008.12.03.20.26.43
test,2008.12.03.20.26.44
cnn,2008.12.03.20.30.37
cnn,2008.12.03.20.30.49

If I simply call records[:id], I just get:

testtesttesttesttesttestcnncnn

I would like to get this:

testcnn

How can I do this?


Solution

  • If your data is not massive, you can use the Set class.

    Here's an example:

    require 'set'

    p ['cnn','test','test','test','test','cnn','cnn'].to_set.to_a
    => ["cnn", "test"]
    

    Here's a simple benchmark:

    require 'set'
    require 'benchmark'
    
    Benchmark.bm(5) do |x|
      x.report("Set")   do
        a = []
        20_000.times do
          a << 'cnn' << 'test'
        end
        a.to_set.to_a
      end
    end
    
    =>
               user     system      total        real
    Set    0.110000   0.000000   0.110000 (  0.109000)
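
Applied to the data in the question, the same idea might look like the sketch below. It assumes Ruby 1.9 or later, where the standard `csv` library is FasterCSV under a new name; with the FasterCSV gem on 1.8, `FasterCSV.read` with the same options should return an equivalent table. The sample data is inlined here (an assumption for illustration) so the snippet runs on its own. It also shows `Array#uniq`, which gives the same result without building a Set.

```ruby
require 'csv'
require 'set'

# Sample data from the question, inlined so the sketch is self-contained.
data = <<~DATA
  ID,Timestamp
  test,2008.12.03.20.26.32
  test,2008.12.03.20.26.38
  test,2008.12.03.20.26.41
  test,2008.12.03.20.26.42
  test,2008.12.03.20.26.43
  test,2008.12.03.20.26.44
  cnn,2008.12.03.20.30.37
  cnn,2008.12.03.20.30.49
DATA

# Same options as in the question, written with Ruby 1.9+ hash syntax.
records = CSV.parse(data, headers: true, header_converters: :symbol)

# Indexing the table by header returns every value in that column.
ids = records[:id]

# Array#uniq keeps the first occurrence of each value, in order.
unique_ids = ids.uniq
p unique_ids            # => ["test", "cnn"]

# The Set-based version from the answer gives the same result.
unique_via_set = ids.to_set.to_a
p unique_via_set        # => ["test", "cnn"]
```

In a view, `unique_ids.join` would then render as `testcnn`, matching the output asked for in the question.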