r vector rle

How to calculate the length of repetition


I have the following sequence of numbers:

dat = c(-0.958694836980719, -0.105869068515775, -0.980669802804642, 
0.028491800336812, -0.635061963060457, 0, -0.930716072199491, 
0, 0.588929981219519, 0, 0.282785133581983, 0.611611557545295, 
0.0378059192851003, -0.797706616778708, 0, 0, 0, 0, 0.517465164205096, 
0, -0.488017358301909, 0, 0, -0.25055111810459, -0.649502253262175, 
-0.665111940088685, 0, 0.0833606598934977, -0.514514991719384, 
0.317596020495366, 0.794602807208168, 0, 0, -0.694999957450694, 
0.68257063515541, 0, -0.624026837516857, 0, 0, 0.450339396971535, 
-0.0302201203415504, -0.579393349186543, -0.844405771995823, 
0.315863139331068, -0.171564746000156, -0.0996391017024767, 0, 
0.0838315913186335, -0.36374768393003, 0, -0.572951822576261, 
-0.352439656458088, -0.637019777744324, 0, 0, -0.0952080968332089, 
0.617610001126072, -0.0816285346831291, 0.365239637846338, -0.0470848081799582, 
-0.925681187001364, 0, 0, 0.516154738675926, 0, 0.335416263139046, 
0.532290710398372, 0.18945326903775, 0.288998846320578, 0.125846440933334, 
-0.279555383136218, -0.456389602581116, -0.716237311784933, -0.0920396169199712, 
-0.2813560662731, 0.345024808219092, 0.338383493565635, 0.0058064242368383, 
0, 0.967537135446715, 0, 0.875822485251258, -0.431060076692186, 
-0.822882194591966, -0.62446221874739, -0.348475036137595, 0, 
0, 0.560600291351039, -0.855141781405395, 0, -0.706490388562219, 
0, -0.0451735735541755, 0.113810585454296, 0, -0.283307362362865, 
0.557656832607336, 0, 0, -0.909282421745824, -0.638976539326668, 
0.393719257131686, 0.301397306195678, 0, -0.74000532620085, 0.831188707386854, 
0.786577908437602, 0.296505948686095, 0.139539200765132, 0.88548929301196, 
0, 0.416614048955629, -0.316088049464881, 0, -0.323222691008726, 
-0.227387382853164, -0.562929988503375, 0, 0.283457267127375, 
0.713770547038207, 0.390959387881678, 0, 0, 0, 0.130514217066274, 
0.511687471126713, 0, 0.259730193040464, 0.741689274343481, -0.775924686373506, 
-0.495098678357968, 0.284476197633141, -0.900591805602638, -0.276707687274933, 
-0.191991699142624, 0, -0.916979262244761, 0.769473198941637, 
-0.241554713076157, 0, 0, 0, -0.231727168460227, -0.761155897450598, 
-0.678432614215555, -0.934782559884297, 0, 0, -0.314088267640064, 
0.186322473577494, 0, -0.235062452954516, 0, -0.314446614967701, 
-0.290302655565895, 0, 0, 0.144997859475891, 0, -0.840827052729484, 
0.88274732032249, 0.228399769981503, 0, 0.109512538112691, 0.671159365334607, 
0, 0, 0.0383683666391103, -0.798745428998881, 0, 0, 0, 0.244742671600118, 
-0.567358245295884, 0.509559617984882, 0.909915275452086, 0, 
0.904785111614818, 0, 0.207396095012435, 0, -0.156956691582582, 
0.776618542355675, -0.555791786131894, 0, 0.932355178469579, 
0.429624993163275, 0.0220608551322564, 0.146385826283492, 0, 
0.111119149224947, 0, 0.200025553735095, -0.429542452648371, 
-0.0528214778849886, 0, 0.353971870563417, 0, 0.768878060423797
)

I want to count how many times 'n' consecutive zeros occur in the above sequence.

Is there a built-in function in R that does this?


Solution

  • You can use run-length encoding (rle) and wrap it in a function.

    > # +(dat != 0) maps each value to 1 (nonzero) or 0 (zero) before encoding the runs
    > rle(+(dat != 0))
    Run Length Encoding
      lengths: int [1:93] 5 1 1 1 1 1 4 4 1 1 ...
      values : int [1:93] 1 0 1 0 1 0 1 0 1 0 ...
    > # Count the runs of zeros whose length is exactly n (the \(...) shorthand needs R >= 4.1)
    > f <- \(sq, n) sum(with(rle(sq), lengths[values == 0]) == n)
    > f(+(dat != 0), n=1)
    [1] 32
    > f(+(dat != 0), n=2)
    [1] 10
    > f(+(dat != 0), n=3)
    [1] 3
    > f(+(dat != 0), n=4)
    [1] 1
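
If you want the counts for every run length at once instead of calling f for each n, the same rle idea can be wrapped in table(). This is only a minimal sketch: zero_run_table is a hypothetical helper name, not a built-in, and it assumes the vector contains no NA values. The small vector x is made-up illustration data, not the dat from the question.

```r
# Hypothetical helper: tabulate the lengths of all runs of zeros in one pass.
# Assumes x is numeric and has no NA values.
zero_run_table <- function(x) {
  r <- rle(x == 0)             # runs of TRUE (zero) and FALSE (nonzero)
  table(r$lengths[r$values])   # keep only the lengths of the zero runs
}

# Made-up example: zero runs of length 1, 2, 1 and 4
x <- c(1, 0, 2, 0, 0, 3, 0, 4, 0, 0, 0, 0)
zero_run_table(x)
```

The names of the resulting table are the run lengths and the entries are how often each occurs, so on dat it should reproduce the four counts above in a single call. If a longer run should also count as containing 'n' consecutive zeros, replacing == n with >= n in f is one way to express that reading.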