WORK IN PROGRESS
This tutorial will explain how to calculate facial (dis)similarity in a set of 2-D faces. We will calculate the Euclidean distance between Procrustes-aligned landmark templates; to calculate it dissimilarity in face space, replace your matrix of landmark coordinates with the matrix of PC scores.
path_to_tem <- system.file("extdata", "tem", package="facefuns")
shapedata <- facefuns(data = read_lmdata(lmdata = path_to_tem,
plot = FALSE),
remove_points = "frlgmm",
pc_criterion = "broken_stick",
plot_sample = FALSE,
quiet = TRUE)
We will use facefuns::calc_ed
to calculate similarity. calc_ed
requires two arguments:
Landmark coordinates (or PC scores) in a matrix format
A table specifying for which pairs of faces you would like to calculate similarity
Currently, our landmark templates are stored in a three-dimensional array: a list of n matrices of dimensions p x k
str(shapedata$aligned)
#> num [1:132, 1:2, 1:102] -0.0526 0.054 -0.0522 -0.0603 -0.0622 ...
#> - attr(*, "dimnames")=List of 3
#> ..$ : chr [1:132] "1" "2" "3" "4" ...
#> ..$ : chr [1:2] "X" "Y"
#> ..$ : chr [1:102] "001" "002" "003" "004" ...
We will use facefuns::convert_array_to_matrix
to convert our array into a matrix with n rows and p x k columns.
data_matrix <- convert_array_to_matrix(shapedata$aligned)
str(data_matrix)
#> num [1:102, 1:264] -0.0526 -0.0515 -0.0514 -0.0474 -0.0505 ...
#> - attr(*, "dimnames")=List of 2
#> ..$ : chr [1:102] "001" "002" "003" "004" ...
#> ..$ : chr [1:264] "X1" "Y1" "X2" "Y2" ...
Most times, you will already have a list of face pairs for which you want to calculate similarity.
For this example, we will calculate the similarity between all possible combinations of face pairs in our data set.
We start by assigning all face IDs in our sample to a variable …
face_names <- dimnames(shapedata$aligned)[[3]]
… and then create a list of all possible combinations
pairs <- expand.grid(A = face_names,
B = face_names)
calc_ed
We now have everything we need to run our function
sim_table <- calc_ed(coords_matrix = data_matrix,
pairs_table = pairs)
head(sim_table)
#> # A tibble: 6 x 3
#> A B EuclideanDistance
#> <fct> <fct> <dbl>
#> 1 001 001 0
#> 2 002 001 0.0857
#> 3 003 001 0.0663
#> 4 004 001 0.129
#> 5 005 001 0.112
#> 6 006 001 0.119
Let’s display our data in a wide format and round the values. It is a rather big table, so we will only print a small subset
sim_table %>%
dplyr::mutate(EuclideanDistance = round(EuclideanDistance, 2)) %>%
tidyr::spread(B, EuclideanDistance) %>%
dplyr::select(1:10) %>%
dplyr::slice(1:9)
#> # A tibble: 9 x 10
#> A `001` `002` `003` `004` `005` `006` `007` `008` `009`
#> <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 001 0 0.09 0.07 0.13 0.11 0.12 0.08 0.14 0.07
#> 2 002 0.09 0 0.07 0.1 0.08 0.08 0.12 0.11 0.11
#> 3 003 0.07 0.07 0 0.1 0.09 0.1 0.09 0.11 0.09
#> 4 004 0.13 0.1 0.1 0 0.09 0.08 0.14 0.11 0.14
#> 5 005 0.11 0.08 0.09 0.09 0 0.1 0.13 0.1 0.12
#> 6 006 0.12 0.08 0.1 0.08 0.1 0 0.14 0.09 0.13
#> 7 007 0.08 0.12 0.09 0.14 0.13 0.14 0 0.16 0.06
#> 8 008 0.14 0.11 0.11 0.11 0.1 0.09 0.16 0 0.15
#> 9 009 0.07 0.11 0.09 0.14 0.12 0.13 0.06 0.15 0
Averageness can be quantified as distinctiveness from the sample average. For each face, we will calculate the Euclidean distance to the sample average, and then reverse scores, so higher scores mean “more average”.
You could use calc_ed
, but this will require a wee bit of data wrangling: you will need to attach the average template to the array holding the aligned templates, then convert this new array to a matrix and finally reverse the distinctiveness scores. calc_avg
does all of that! It only takes one argument - a facefuns
object: