# Updates on Optical Music Recognition
### Graphical Model for System Identification
Bar line detection can be treated simply as vertical edge detection, but on its own this is unreliable: it is sensitive to noise such as note stems or vertical strokes in text. Relying solely on edge detection also gives us no way to interpret the structure of the systems from what was detected. The solution is a graphical model that represents the system structure and encodes useful distance and non-overlap constraints. With it we can determine the grouping of staves into systems and the location of the bar lines in each system simultaneously.
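
The weakness of the naive approach can be illustrated with a minimal sketch. It uses a vertical projection (the fraction of black pixels per column) as a simple stand-in for a real vertical edge detector. The `Bitmap` type and the names `columnScore` and `naiveBarlines` are hypothetical and are not taken from this repository.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical binary image: img[y][x] == 1 for a black (ink) pixel.
using Bitmap = std::vector<std::vector<int>>;

// Fraction of black pixels in column x between rows top and bottom (inclusive).
double columnScore(const Bitmap& img, std::size_t x,
                   std::size_t top, std::size_t bottom) {
    std::size_t black = 0;
    for (std::size_t y = top; y <= bottom; ++y) {
        if (img[y][x]) ++black;
    }
    return static_cast<double>(black) / static_cast<double>(bottom - top + 1);
}

// Naive detector: report every column whose score reaches the threshold.
// Nothing here distinguishes a bar line from a long note stem.
std::vector<std::size_t> naiveBarlines(const Bitmap& img, std::size_t top,
                                       std::size_t bottom, double threshold) {
    std::vector<std::size_t> cols;
    if (img.empty()) return cols;
    for (std::size_t x = 0; x < img[0].size(); ++x) {
        if (columnScore(img, x, top, bottom) >= threshold) cols.push_back(x);
    }
    return cols;
}
```

On a staff five pixels tall with a full-height line in one column and a four-pixel stem in another, a threshold of 0.75 reports both columns. The stem is a false positive that a purely local detector cannot rule out.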

Suppose we have n staves and therefore n-1 gaps. Treating each gap as a binary switch that either connects the adjacent staves or not gives 2^(n-1) ways of grouping the staves into systems. Within each system (staff group) the bar line positions are shared by all staves, which is a very strong and useful constraint. We can solve the problem with nested dynamic programming. In the outer pass, the best hypothesis for how the first k staves are grouped has score `h(k) = max(h(i) + system(i+1, ..., k))`, taken over i = 0, 1, ..., k-1 with h(0) = 0, so it builds on the earlier optimal hypotheses h(i). In the inner pass, for each hypothesized system(i, ..., j) spanning the i-th to j-th staves, we recognize the shared bar lines from left to right by finding the best-scoring configuration `b_opt = max(b(k_1) + b(k_2) + ... + b(k_m))`, where every horizontal position is either a bar line or background, k_1 < k_2 < ... < k_m are the columns labeled as bar lines, and b(.) is the bar line scoring function for a column. Negative constraints (clefs, key signatures, time signatures, or note stems) can also be incorporated into this bar line recognition step.
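
The nested dynamic program could look roughly like the sketch below. It is a sketch under stated assumptions, not the code in `omrpage.cpp`. The per-column staff scores, the per-column gap scores (evidence that a line continues through the space between two staves), and the minimum spacing `minGap` between bar lines are all assumptions made for illustration. Negative scores play the role of the negative constraints mentioned above.

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <vector>

// scores[s][x]: evidence for a bar line in column x (negative = evidence against).
using Scores = std::vector<std::vector<double>>;

// Inner DP: pick columns at least minGap apart that maximize the summed score.
// best[x] is the best score using only columns [0, x); picking nothing scores 0.
double bestBarlines(const std::vector<double>& b, std::size_t minGap) {
    const std::size_t gap = std::max<std::size_t>(minGap, 1);
    std::vector<double> best(b.size() + 1, 0.0);
    for (std::size_t x = 1; x <= b.size(); ++x) {
        const double prev = (x >= gap) ? best[x - gap] : 0.0;
        best[x] = std::max(best[x - 1], b[x - 1] + prev);
    }
    return best.back();
}

// Score of the hypothesis that staves i..j (0-based, inclusive) form one system.
// gapScore[s] describes the space between staff s and staff s + 1 (assumed input).
double systemScore(const Scores& staff, const Scores& gapScore,
                   std::size_t i, std::size_t j, std::size_t minGap) {
    std::vector<double> b(staff[i].size(), 0.0);
    for (std::size_t s = i; s <= j; ++s) {
        for (std::size_t x = 0; x < b.size(); ++x) {
            b[x] += staff[s][x];
            if (s < j) b[x] += gapScore[s][x];  // shared bar lines cross the gap
        }
    }
    return bestBarlines(b, minGap);
}

// Outer DP: h[k] = max over i < k of h[i] + system(i+1..k), with h[0] = 0.
// Returns the 0-based index of the first staff of each system.
std::vector<std::size_t> groupStaves(const Scores& staff, const Scores& gapScore,
                                     std::size_t minGap) {
    const std::size_t n = staff.size();
    std::vector<double> h(n + 1, 0.0);
    std::vector<std::size_t> from(n + 1, 0);
    for (std::size_t k = 1; k <= n; ++k) {
        h[k] = -std::numeric_limits<double>::infinity();
        for (std::size_t i = 0; i < k; ++i) {
            const double cand = h[i] + systemScore(staff, gapScore, i, k - 1, minGap);
            if (cand > h[k]) { h[k] = cand; from[k] = i; }
        }
    }
    // Backtrack through the stored split points.
    std::vector<std::size_t> starts;
    for (std::size_t k = n; k > 0; k = from[k]) starts.push_back(from[k]);
    std::reverse(starts.begin(), starts.end());
    return starts;
}
```

The gap scores are what make grouping worthwhile in this sketch: without them, forcing staves to share bar line columns could only lower the total, so every staff would end up in its own system. The outer loop evaluates on the order of n^2 candidate systems instead of enumerating all 2^(n-1) groupings.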
### Demo data and screenshots of results
### Todo
- Add page breaks to the generated skeleton
- Align the skeleton to the OMR result
- Fix PDF loading for vector graphics
- Optimize OMR performance and add clef/key recognition