Neuro-symbolic procedural semantics for explainable visual dialogue
Fig 9
Schematic representation of the execution of the semantic network underlying the utterance ‘How many brown objects are there?’ on a scene from the CLEVR-Dialog dataset, illustrating the transparency of the approach.
The filter operation wrongly classifies the leftmost object as brown. As a consequence, two brown objects are counted instead of one.
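The error propagation described in this caption can be illustrated with a minimal sketch. It is not the authors' implementation: the scene contents, the `SceneObject` class, the primitive names `filter_by_colour` and `count`, and the trace format are all hypothetical, and the perception module is replaced by a hard-coded `predicted_colour` field. The sketch only shows how recording each intermediate result makes it possible to trace a wrong answer back to the filter step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SceneObject:
    name: str
    true_colour: str       # ground truth (invented for this example)
    predicted_colour: str  # stand-in for the output of a perception module


def filter_by_colour(objects, colour):
    """Keep the objects whose *predicted* colour matches the query."""
    return [o for o in objects if o.predicted_colour == colour]


def count(objects):
    """Return the number of objects in the set."""
    return len(objects)


def run_program(scene, colour):
    """Execute filter -> count and record every intermediate result."""
    trace = [("get-scene", [o.name for o in scene])]
    filtered = filter_by_colour(scene, colour)
    trace.append(("filter", [o.name for o in filtered]))
    answer = count(filtered)
    trace.append(("count", answer))
    return answer, trace


# Hypothetical three-object scene: the leftmost object is misclassified as brown.
scene = [
    SceneObject("leftmost", true_colour="red", predicted_colour="brown"),
    SceneObject("middle", true_colour="brown", predicted_colour="brown"),
    SceneObject("rightmost", true_colour="blue", predicted_colour="blue"),
]

answer, trace = run_program(scene, "brown")
for step, result in trace:
    print(step, "->", result)
```

Inspecting the trace shows that the `filter` step returns both the leftmost and the middle object, so the incorrect count of two can be attributed to the colour classification rather than to the counting step.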