I have my CSV file imported as such:
records = FasterCSV.read(path, :headers => true, :header_converters => :symbol)
How can I get the unique occurrences of my data? For instance, here is some sample data:
ID,Timestamp
test,2008.12.03.20.26.32
test,2008.12.03.20.26.38
test,2008.12.03.20.26.41
test,2008.12.03.20.26.42
test,2008.12.03.20.26.43
test,2008.12.03.20.26.44
cnn,2008.12.03.20.30.37
cnn,2008.12.03.20.30.49
If I simply call records[:id], I just get:
testtesttesttesttesttestcnncnn
I would like to get this:
testcnn
How can I do this?
If your data is not massive, you can use the Set class.
Here's an example:
require 'set'
p ['cnn','test','test','test','test','cnn','cnn'].to_set.to_a
=> ["cnn", "test"]
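Applied to the data in the question, the same idea might look like the sketch below. It assumes Ruby's standard csv library (FasterCSV became the standard CSV class in Ruby 1.9) and inlines the sample rows as a string rather than reading them from path. Array#uniq is shown alongside Set as an alternative that skips the conversion to a Set and back.

```ruby
require 'csv'
require 'set'

# Sample rows from the question, inlined for illustration
# (assumption: the real code reads them from a file at `path`).
data = <<~ROWS
  ID,Timestamp
  test,2008.12.03.20.26.32
  test,2008.12.03.20.26.38
  test,2008.12.03.20.26.41
  test,2008.12.03.20.26.42
  test,2008.12.03.20.26.43
  test,2008.12.03.20.26.44
  cnn,2008.12.03.20.30.37
  cnn,2008.12.03.20.30.49
ROWS

records = CSV.parse(data, headers: true, header_converters: :symbol)

# Every value in the ID column, duplicates included.
ids = records[:id]

# Converting to a Set drops the duplicates.
unique_via_set = ids.to_set.to_a

# Array#uniq keeps the first occurrence of each value.
unique_via_uniq = ids.uniq

p unique_via_set
p unique_via_uniq
```

Both calls leave one entry per ID. If all you need is an array of distinct values, uniq is the more direct choice; a Set is more useful when you also want fast membership checks afterwards.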
Here's a simple benchmark:
require 'set'
require 'benchmark'
Benchmark.bm(5) do |x|
  x.report("Set") do
    a = []
    20_000.times do |i|
      a << 'cnn' << 'test'
    end
    a.to_set.to_a
  end
end
=>
           user     system      total        real
Set    0.110000   0.000000   0.110000 (  0.109000)
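To see how the Set approach compares with plain Array#uniq on the same input, a variation of that benchmark could look like this. It is only a sketch: unlike the benchmark above, it builds the array once outside the timed blocks so that only the deduplication is measured, and the variable names are made up for illustration. Timings depend on the machine and Ruby version, so no output is shown.

```ruby
require 'set'
require 'benchmark'

# Same input as the benchmark above: 20,000 repetitions of the two IDs.
ids = []
20_000.times { ids << 'cnn' << 'test' }

# Declared up front so the results are visible outside the blocks.
via_set = via_uniq = nil

Benchmark.bm(5) do |x|
  x.report("Set")  { via_set  = ids.to_set.to_a }
  x.report("uniq") { via_uniq = ids.uniq }
end
```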