I am currently trying to use the EBImage package in R (see article here) to get the perimeters and areas of binary images, specifically for English and Chinese characters. Below is some code I have written that tries to achieve this. It renders a character in R, saves it as a .jpeg file with a white background in the working directory, then reads that file back in. It then creates a binary version of the image and calculates the perimeter and area, which I later use for other calculations.
#### Load Library ####
library(EBImage)
#### Create Text Image Function ####
textPlot <- function(plotname, string, cex = 20) {
  jpeg(paste0(plotname, ".jpg"))
  par(mar = c(0, 0, 0, 0))  # must come after jpeg(): par() applies to the current device
  plot(c(0, 1), c(0, 1), ann = FALSE, bty = "n", type = "n", xaxt = "n", yaxt = "n")
  text(x = 0.5, y = 0.5, string, cex = cex, col = "black", family = "serif", font = 2, adj = 0.5)
  dev.off()
}
#### Generate Images for W and 爱 ####
textPlot("w","w",20)
textPlot("ai","爱",20)
#### Pre-Process Image ####
path <- "C:/Users/User/Desktop/main_projects/ai.jpg" # path to the generated image
img <- readImage(path) # original image
gray_img <- channel(img, "gray") # converts to grayscale
binary_img <- gray_img > 0.5 # creates threshold for binary
labeled_img <- bwlabel(binary_img) # labels binary objects
#### Compare Images ####
plot(img)
plot(gray_img)
plot(binary_img)
#### Compute Shape Features (Perimeter/Area) ####
shape_features <- computeFeatures.shape(labeled_img)
perimeters <- shape_features[, "s.perimeter"]
areas <- shape_features[, "s.area"]
#### Display Original Image w/ Labeled Boundaries ####
display(img, method = "raster")
highlight <- paintObjects(labeled_img, img, col = "red")
display(highlight, method = "raster")
#### Print Perimeter and Area ####
perimeters
areas
pc <- (sum(perimeters)^2) / (4*sum(areas)*pi)
pc # both characters differ only slightly; value also changes whenever I change the size
In the example above, it calculates the area and perimeter for the Chinese character 爱 (the red overlay supposedly shows where each measure is coming from, in pixels):
However, the point of this is to calculate perimetric complexity: the squared perimeter of an image divided by 4π times the area (as calculated in my code). When I run this in R, it gives me values that are spectacularly low. By comparison, another database I have seen, which does this with Mathematica, gives a value of around 35 for this character's perimetric complexity, whereas across different image sizes I always get around 1.5 or below. What is causing this discrepancy? Is there a way to change the code to more accurately reflect those scores?
It looks like computeFeatures.shape takes 1 (white) to be the foreground and 0 (black) to be the background, so you're computing the complexity of the negative space around the character. If we just change
binary_img <- gray_img > 0.5 # creates threshold for binary
to
binary_img <- gray_img < 0.5 # creates threshold for binary
I get 26.8358 for the complexity, which is off from 35, but that could come down to a difference in fonts.
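As a sanity check on the formula itself (independent of EBImage): perimetric complexity is the isoperimetric quotient P²/(4πA), which equals exactly 1 for a disc, the simplest possible shape. So values near 1 or below for an intricate glyph like 爱 are a red flag that the wrong region is being measured. A quick base-R check:

```r
# Perimetric complexity of a disc of radius r:
# P = 2*pi*r, A = pi*r^2, so P^2 / (4*pi*A) = 1 regardless of r.
r <- 5
P <- 2 * pi * r
A <- pi * r^2
pc <- P^2 / (4 * pi * A)
pc  # 1
```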
This also explains why it changes when you change the size (assuming by size you mean the cex parameter): when you increase the size of the text, the outer perimeter of the image stays the same size, so you're qualitatively changing the shape of the negative space. Changing cex to 10, I get 26.29365, a little different, but that's just because of the change in resolution relative to the size of the character.
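To see concretely why measuring the negative space gives such different numbers, here's a toy example in base R (no EBImage; note that s.perimeter in computeFeatures.shape uses a more refined boundary estimate, so the crude pixel-counting perimeter below is only a stand-in): a filled square on a blank canvas, measured once as the foreground and once as the negative space.

```r
# Crude perimeter estimate: count foreground pixels that have at least
# one 4-neighbour in the background (the image border counts as background).
perim <- function(m) {
  R <- nrow(m); C <- ncol(m)
  p <- matrix(0, R + 2, C + 2)        # zero-pad the image by one pixel
  p[2:(R + 1), 2:(C + 1)] <- m
  core  <- p[2:(R + 1), 2:(C + 1)]    # each pixel and its four shifted copies
  up    <- p[1:R,       2:(C + 1)]
  down  <- p[3:(R + 2), 2:(C + 1)]
  left  <- p[2:(R + 1), 1:C]
  right <- p[2:(R + 1), 3:(C + 2)]
  sum(core == 1 & (up == 0 | down == 0 | left == 0 | right == 0))
}

pc <- function(m) perim(m)^2 / (4 * pi * sum(m))

img <- matrix(0, 20, 20)
img[7:14, 7:14] <- 1          # an 8x8 "character" on a 20x20 canvas

pc(img)      # ~0.98: complexity of the square itself, close to a disc's 1
pc(1 - img)  # ~2.76: complexity of the negative space around it
```

Resizing the square barely moves the first number (the shape is unchanged), but it reshapes the negative space and moves the second number a lot, which matches the cex behaviour above.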