I have a large data set with values in the range [-3, 3], and I'm using a hard limit at 0 as the decision boundary.
The data has a binary value of 1 when it's oscillating between -3 and 3 at 56 kHz. This means the data changes from -3 to 3 and back every N samples, where N is typically < 20.
The data has a binary value of 0 when it stays at 3 constantly (this typically lasts 400+ samples).
I can't seem to group the data into these binary categories while also recording how many samples wide each group is.
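For reference, the hard limit at 0 can be sketched as below. The helper name `input_binary_state`, the `threshold` parameter, and the "HIGH"/"LOW" labels are assumptions chosen to mirror the string comparisons in the code further down; treating a sample of exactly 0 as "HIGH" is an arbitrary choice.

```python
def input_binary_state(sample, threshold=0.0):
    """Hard-limit one sample: at or above the threshold is "HIGH", below is "LOW"."""
    return "HIGH" if sample >= threshold else "LOW"


# A few values taken from the example data in this post
states = [input_binary_state(v) for v in (1.84, 2.96, 3.11, -3.42, -2.45, 3.12)]
```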
Example data:
1.84 |
2.96 |
2.8 |
3.12 |
. | I want this to be grouped as a 0
. |
3.11 |_____
-3.42 |
-2.45 |
-1.49 |
3.12 |
2.99 | I want this to be grouped as a 1
1.97 |
-1.11 |
-2.33 |
. |
. | Keeps going for N cycles
The cycles in-between the logic HIGH state are typically small (<20 samples).
The code I have so far:
state = "X"
for i in range(0, len(data['input'])):
    currentBinaryState = inputBinaryState(data['input'][i]) # Returns -3 or +3 appropriately
    if currentBinaryState != previousBinaryState:
        # A cycle is very unlikely to last more than 250 samples
        if y > 250 and currentBinaryState == "LOW": # Been low for a long time
            if state == "_high":
                groupedData['input'].append( ("HIGH", x) )
                x = 0
            state = "_low"
        else:
            # Is on carrier wave (logic 1)
            if state == "_low":
                # Just finished low
                groupedData['input'].append( ("LOW", x) )
                x = 0
            state = "_high"
        y = 0
Obviously, the result isn't what I expect, as the LOW groups are far too short.
[('HIGH', 600), ('LOW', 8), ('HIGH', 1168), ('LOW', 9), ('HIGH', 1168), ('LOW', 8), ('HIGH', 1168), ('LOW', 8), ('HIGH', 1168), ('LOW', 9), ('HIGH', 1168), ('LOW', 8), ('HIGH', 1168), ('LOW', 8), ('HIGH', 1168), ('LOW', 9)]
I understand I could have asked this on the Signal Processing SE, but I deemed this problem to be more programming-oriented. I hope I've explained the problem sufficiently; if there are any questions, just ask. Thanks.
Here is a link to the actual sample data:
https://drive.google.com/folderview?id=0ByJDNIfaTeEfemVjSU9hNkNpQ3c&usp=sharing
Visually, it is very clear where the boundaries of the data lie.
Update 1
I've updated my code to be more legible, as single-letter variables weren't helping my sanity.
previousBinaryState = "X"
x = 0
sinceLastChange = 0
previousGroup = inputBinaryState(data['input'][0])
lengthAssert = 0
for i in range(0, len(data['input'])):
    currentBinaryState = inputBinaryState(data['input'][i])
    if currentBinaryState != previousBinaryState: # Changed from -3 -> +3 or +3 -> -3
        #print sinceLastChange
        if sinceLastChange > 250 and previousGroup == "HIGH" and currentBinaryState == "LOW": # Finished LOW group
            groupedData['input'].append( ("LOW", x) )
            lengthAssert += x
            x = 0
            previousGroup = "LOW"
        elif sinceLastChange > 20 and previousGroup == "LOW": # Finished HIGH group
            groupedData['input'].append( ("HIGH", x) )
            lengthAssert += x
            x = 0
            previousGroup = "HIGH"
        sinceLastChange = 0
    else:
        sinceLastChange += 1
    previousBinaryState = currentBinaryState
    x += 1
With the print statement enabled, this outputs the following sinceLastChange values for the sample data:
8
7
8
7
7
596 <- Clearly a LOW group
7
8
7
8
7
7
8
7
8
7
7
8
7
8
7
7
8
7
8
.
.
.
The problem is that the HIGH groups last longer than they should:
[('HIGH', 600), ('LOW', 1176), ('HIGH', 1177), ('LOW', 1176), ('HIGH', 1176), ('LOW', 1177), ('HIGH', 1176), ('LOW', 1176)]
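The gaps between transitions printed above can also be measured directly by run-length encoding the hard-limited signal. This is a minimal sketch, not the code used in this post: the function name `run_lengths`, the `threshold` parameter, and the synthetic `demo` data are assumptions for illustration. Note that "HIGH" and "LOW" here are signal levels (above or below 0), not the logical groups.

```python
from itertools import groupby


def run_lengths(samples, threshold=0.0):
    """Return (level, length) for each run of consecutive samples on one side of the threshold."""
    levels = ("HIGH" if s >= threshold else "LOW" for s in samples)
    return [(level, sum(1 for _ in run)) for level, run in groupby(levels)]


# Synthetic stand-in for the real data: one carrier cycle, then a long constant stretch
demo = [3.0] * 8 + [-3.0] * 8 + [3.0] * 400
runs = run_lengths(demo)
```

Short runs (under about 20 samples) correspond to the carrier, while one very long run marks the constant stretch.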
I've finally found a solution. I spent far too long getting my head around what appears to be a fairly simple problem, but it works now.
It won't pick up the last group in the data set but that's fine.
previousBinaryState = "X"
x = 0
sinceLastChange = 0
previousGroup = inputBinaryState(data['input'][0])
lengthAssert = 0
for i in range(0, len(data['input'])):
    currentBinaryState = inputBinaryState(data['input'][i])
    if currentBinaryState != previousBinaryState: # Changed from -3 -> +3 or +3 -> -3
        #print sinceLastChange
        if sinceLastChange > 250 and previousGroup == "HIGH" and currentBinaryState == "LOW": # Finished LOW group
            groupedData['input'].append( ("LOW", x) )
            lengthAssert += x
            x = 0
            previousGroup = "LOW"
        sinceLastChange = 0
    else:
        if sinceLastChange > 20 and previousGroup == "LOW":
            groupedData['input'].append( ("HIGH", x) )
            lengthAssert += x
            x = 0
            previousGroup = "HIGH"
            sinceLastChange = 0
        sinceLastChange += 1
    previousBinaryState = currentBinaryState
    x += 1
20 is the maximum number of samples allowed between transitions while in the HIGH state, and 250 is the number of unchanged samples that must be exceeded before a stretch counts as a LOW group.
[('HIGH', 25), ('LOW', 575), ('HIGH', 602), ('LOW', 574), ('HIGH', 602), ('LOW', 575), ('HIGH', 601), ('LOW', 575), ('HIGH', 602), ('LOW', 574), ('HIGH', 602), ('LOW', 575), ('HIGH', 601), ('LOW', 575), ('HIGH', 602), ('LOW', 574)]
When comparing that to the graph and the actual data, it appears to be correct.
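If the last group matters, one alternative is to run-length encode the hard-limited signal and merge the runs, which also emits the trailing group. This is a minimal sketch of a different technique from the loop above, not a drop-in replacement: the function name `group_carrier`, the `max_half_cycle` and `threshold` parameters, and the synthetic `demo` data are all assumptions for illustration.

```python
from itertools import groupby


def group_carrier(samples, max_half_cycle=20, threshold=0.0):
    """Return [(label, width), ...]: "HIGH" is carrier, "LOW" is a constant stretch."""
    # Step 1: run-length encode the hard-limited signal.
    levels = (s >= threshold for s in samples)
    run_widths = [sum(1 for _ in run) for _, run in groupby(levels)]

    # Step 2: short runs belong to the carrier, long runs are constant stretches.
    labelled = (("HIGH" if w <= max_half_cycle else "LOW", w) for w in run_widths)

    # Step 3: merge neighbouring runs that share a label; the final group is kept too.
    groups = []
    for label, run in groupby(labelled, key=lambda item: item[0]):
        groups.append((label, sum(w for _, w in run)))
    return groups


# Synthetic stand-in for the real capture: constant, 3 carrier cycles, constant, 2 carrier cycles
cycle = [-3.0] * 8 + [3.0] * 8
demo = [3.0] * 400 + cycle * 3 + [3.0] * 400 + cycle * 2
groups = group_carrier(demo)
```

One caveat of this sketch: when the carrier ends on a positive half-cycle, that half-cycle is indistinguishable from the start of the constant stretch, so its samples are counted as part of the LOW group. The widths always sum to the number of input samples, which serves the same purpose as lengthAssert.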