I am building a Digital Audio Workstation along the lines of FL Studio and have a chord detection algorithm. It works as it is, but it is extremely inefficient and I need to optimize it.
First we start with the input:
[
['C', 'E'],
['E', 'G'],
['G']
]
Note that these are in order and that, for demonstration's sake, the two E's and the two G's are the same sustained notes; we add an item to the list at every note change:
G --------
E --------------
C ---------
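For reference, the chord templates below work in pitch classes (0-11), so each bucket of note names first gets mapped to pitch classes. A minimal sketch of that mapping (the helper name is just for illustration):

const NOTE_TO_PC: Record<string, number> = {
  C: 0, 'C#': 1, Db: 1, D: 2, 'D#': 3, Eb: 3, E: 4, F: 5,
  'F#': 6, Gb: 6, G: 7, 'G#': 8, Ab: 8, A: 9, 'A#': 10, Bb: 10, B: 11,
};

// Map one bucket of note names to a set of pitch classes (0-11)
function bucketToPitchClasses(bucket: string[]): Set<number> {
  return new Set(bucket.map(note => NOTE_TO_PC[note]));
}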
The algorithm I chose is the Viterbi algorithm. I have base chord templates which I then combine with each of the 12 root notes; in this case that's 72 templates:
// Base chords; example after combination: Cmaj, Cmin .... BMaj, Bmin
const chordList: ChordInformation[] = [
  { label: 'maj', keys: [0,4,7], bias: -0.2 },
  { label: 'min', keys: [0,3,7], bias: -0.2 },
  { label: 'dim', keys: [0,3,6], bias: 0.0 },
  { label: 'maj7', keys: [0,4,7,11], bias: -0.1 },
  { label: '7', keys: [0,4,7,10], bias: 0.0 },
  { label: 'm7', keys: [0,3,7,10], bias: 0.0 },
];
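For completeness, here is roughly how the 72 templates get built from the base list (the ChordInformation shape is my guess at the interface; the transposition loop is illustrative):

interface ChordInformation {
  label: string;
  keys: number[];   // intervals above the root, in semitones
  bias: number;
}

interface ChordTemplate extends ChordInformation {
  root: number;     // 0 = C, 1 = C#/Db, ..., 11 = B
}

// 6 base chords x 12 roots = 72 transposed templates
const templates: ChordTemplate[] = [];
for (let root = 0; root < 12; root++) {
  for (const chord of chordList) {
    templates.push({
      ...chord,
      root,
      keys: chord.keys.map(k => (k + root) % 12),
    });
  }
}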
Then we go through the Viterbi algorithm, testing each previous chord (72) against each current chord (72), calculating the best score and choosing that as the chord interpretation:
// N === templates.length; x = current bucket index, prevX = previous bucket index
for (let newSpace = 0; newSpace < N; newSpace++) {
  let bestScore = Infinity, prevSpaceOfBest = -1;
  for (let prevSpace = 0; prevSpace < N; prevSpace++) {
    // Previous best path cost + transition cost (free if the chord stays the same) + emission cost
    const candidate = this.chordPath[prevX][prevSpace].bestScore
      + (prevSpace === newSpace ? 0 : this.params.switchCost)
      + this.chordScores[x][newSpace];
    if (candidate < bestScore) {
      bestScore = candidate;
      prevSpaceOfBest = prevSpace;
    }
  }
  this.chordPath[x][newSpace] = { bestScore: bestScore, prevSpace: prevSpaceOfBest };
}
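The emission score (this.chordScores[x][newSpace]) isn't shown above; roughly, it compares the bucket's pitch classes against a template, along the lines of this sketch (the exact penalty weights are placeholders):

// Lower score = better match between a bucket's pitch classes and one template
function emissionScore(bucketPcs: Set<number>, template: { keys: number[]; bias: number }): number {
  let score = template.bias;
  const chordTones = new Set(template.keys);
  for (const pc of bucketPcs) {
    if (!chordTones.has(pc)) score += 1;    // bucket note outside the chord
  }
  for (const pc of chordTones) {
    if (!bucketPcs.has(pc)) score += 0.5;   // chord tone missing from the bucket
  }
  return score;
}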
As you can see, even with only 6 base chords it's already > 5000 (72x72) operations per bucket, and although I maintain the arrays as a global object, when a note changes at one position, all later positions must have their values updated. This means that in a project with 1000 input buckets, if bucket 15 changes, then buckets 15-1000 must be recomputed, which is too costly.
Here's what I've tried:
It seems to me that there's no way forward with this algorithm. I have considered a Transformer-based approach, but I don't know where I could find one that is trained and open source.
OK, I've figured it out. The approach works as follows: 1. prune each bucket to only the top K candidates (this doesn't actually throw away complex harmonies, because something like C-E-G should never be interpreted as Dbmaj7 etc., so we don't need those candidates). For 6 chords and K=3 this brings the inner loop down to 18x18.
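A minimal sketch of how that pruning could look, assuming the per-bucket emission scores are already computed (keep is whatever candidate count you end up with after pruning, 18 in the example above):

// Score every template against the bucket once, then keep only the best candidates
function pruneCandidates(bucketScores: number[], keep: number): number[] {
  return bucketScores
    .map((score, index) => ({ score, index }))
    .sort((a, b) => a.score - b.score)
    .slice(0, keep)
    .map(entry => entry.index);
}

// The Viterbi loops then only iterate over these shortlists, e.g.:
// for (const newSpace of candidates[x]) {
//   for (const prevSpace of candidates[prevX]) { ... }
// }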
As I would like to add more chords, and every added chord type grows the template count (with the Viterbi step being quadratic in that count), I improved upon it further. To reduce the number of templates, we use core chords and add extensions on top. Below you can see that the core chord tones of a major chord impact the score the most, with the extensions only contributing a small bonus:
{ label: 'maj', keys: [0,4,7], bias: -0.2, extensions: {
    11: .5,  // 7
    2: .25   // 9
} },
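These extension weights only nudge the emission score; a sketch of how that bonus could be applied (the sign convention assumes lower score = better, and the helper itself is my own illustration):

// Extensions reduce the score slightly when their tone is present, but never create new templates
function extensionBonus(
  bucketPcs: Set<number>,
  root: number,
  extensions: Record<number, number> = {},
): number {
  let bonus = 0;
  for (const [interval, weight] of Object.entries(extensions)) {
    const pc = (Number(interval) + root) % 12;
    if (bucketPcs.has(pc)) bonus -= weight; // lower total score = better match
  }
  return bonus;
}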
Because these extensions only contribute to the Viterbi scores and don't actually get tracked, we have to go back through the note buckets after the Viterbi interpretations and build the exact chord name. For instance, if it's a Cmajor7 chord, the Viterbi will identify it as Cmaj, and then I go back through the note buckets manually and add the 7 when I see the B in the bucket.
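That post-pass could look roughly like this sketch (the label suffixes and helper name are just for illustration):

// After Viterbi picks e.g. 'Cmaj', check the bucket for extension tones and refine the label
function refineLabel(baseLabel: string, root: number, bucketPcs: Set<number>): string {
  if (bucketPcs.has((root + 11) % 12)) return baseLabel + '7';    // major 7th present -> Cmaj7
  if (bucketPcs.has((root + 2) % 12)) return baseLabel + 'add9';  // 9th present
  return baseLabel;
}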
This brings the per-bucket cost down to 12x12 (core chords: maj, min, dim, aug). This implementation needs some polish, but I think it's the best solution for my use case.