Unsupervised learning reveals interpretable latent representations for translucency perception

doi:10.1371/journal.pcbi.1010878


Fig 5

The middle layers of the W+ latent space can effectively modulate the translucency of generated images and predict human perception.

(A) Illustration of a trained layer-specific support vector machine (SVM) classifier for the milky-versus-glycerin soap discrimination. (B) Scatter plots of the model prediction values versus the human mean normalized attribute ratings for each generated image in Experiment 2. Green, blue, and orange represent the data for translucency, see-throughness, and glow, respectively. (C) The tuning curve of correlation coefficients (correlation between model prediction and human perceptual rating, r_hc) over all layers in the W+ latent space. Model prediction values using the middle layers’ decision boundaries (d₇, d₈, and d₉) strongly correlate with human attribute ratings. “*” indicates that the correlations at that layer are not statistically significant at the 95% confidence level. (D) Examples of translucency-modulated sequences. Top: Manipulating the layer-9 latent vector of the original image (left end) along the normal of the learned decision boundary has a coherent effect on the translucent material appearance of the object. Left: Moving in the positive direction of the normal of the decision boundary makes the soap appear more opaque. Right: Moving in the negative direction of the normal of the decision boundary makes the soap appear more translucent. Bottom: Manipulating the layer-12 latent vector of the original image along the normal of the learned decision boundary does not fundamentally change the translucent appearance.
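
The operations described in panels (B) to (D) can be illustrated with a minimal sketch. It assumes the layer-specific decision boundary is a hyperplane w·x + b = 0 whose weights have already been learned (for example by a linear SVM), that the model prediction value is the signed distance to that hyperplane, and that r_hc is a Pearson correlation. The function names, the two-dimensional toy vectors, and the rating values are hypothetical and are not taken from the paper; real W+ layer vectors are high-dimensional.

```python
import math

def unit_normal(w):
    """Unit-length normal of the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return [wi / norm for wi in w]

def decision_value(latent, w, b):
    """Signed distance of a latent vector from the decision boundary
    (assumed here to be the 'model prediction value')."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, latent)) + b) / norm

def shift_along_normal(latent, w, alpha):
    """Move a latent vector by alpha along the boundary normal.
    Following the caption, alpha > 0 would be the 'more opaque' direction
    and alpha < 0 the 'more translucent' direction."""
    n = unit_normal(w)
    return [xi + alpha * ni for xi, ni in zip(latent, n)]

def pearson_r(xs, ys):
    """Pearson correlation between model predictions and human ratings."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical toy boundary and latent vector (2-D for readability).
w, b = [3.0, 4.0], 0.0
latent = [1.0, 1.0]

# A modulated sequence like panel (D): steps on both sides of the original.
sequence = [shift_along_normal(latent, w, a) for a in (-2.0, -1.0, 0.0, 1.0, 2.0)]
predictions = [decision_value(z, w, b) for z in sequence]

# Hypothetical human ratings for the same five images (made-up numbers).
ratings = [0.9, 0.7, 0.5, 0.3, 0.1]
r_hc = pearson_r(predictions, ratings)
```

Because the shift is along the unit normal, the signed distance changes by exactly alpha at each step. In this toy setup the made-up ratings fall as the prediction value rises, so r_hc is negative; the sign in practice depends on how the classes and rating scales are coded.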

doi: https://doi.org/10.1371/journal.pcbi.1010878.g005