Figure 1.
An illustration of the DNN training and DBF extraction procedure.
Left: Pre-training of a stack of RBMs, in which the first layer is a Gaussian-Bernoulli RBM and all other layers are Bernoulli-Bernoulli RBMs. Each higher RBM takes as input the outputs of the RBM below it. Middle: The generative DBN constructed from the stack of RBMs. Right: The corresponding DNN and DBF extractor. The DNN is created by adding a randomly initialized softmax output layer on top of the DBN, and the parameters of the DNN are obtained in a fine-tuning phase. The final DBF extractor, shown in the bottom-right dashed rectangle, is obtained by removing the layers above the bottleneck layer.
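The last step described in this caption, obtaining the DBF extractor by removing the layers above the bottleneck, can be sketched as follows. This is a minimal illustration only and not the authors' implementation: the layer sizes (other than the 43-dimensional bottleneck), the sigmoid hidden units, the names `init_layers`, `forward_hidden`, and `bottleneck_index`, and the random weights standing in for pre-trained and fine-tuned parameters are all assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_layers(sizes, seed=0):
    # Random weights stand in for parameters that would really come from
    # RBM pre-training followed by fine-tuning (assumption for illustration).
    rng = np.random.default_rng(seed)
    return [(rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out))
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def forward_hidden(layers, x):
    # Propagate frames through a list of (weights, bias) hidden layers,
    # assuming sigmoid units in every layer passed in.
    for W, b in layers:
        x = sigmoid(x @ W + b)
    return x

# Hypothetical layer sizes: input, two hidden layers, a 43-dimensional
# bottleneck, one more hidden layer, and the output layer.
sizes = [39, 64, 64, 43, 64, 10]
dnn_layers = init_layers(sizes)

# Index (0-based, into dnn_layers) of the layer whose output is the bottleneck.
bottleneck_index = 2

# Removing everything above the bottleneck leaves the DBF extractor.
# The output layer is therefore never evaluated in this sketch.
dbf_extractor = dnn_layers[:bottleneck_index + 1]

# Five hypothetical 39-dimensional acoustic frames.
frames = np.random.default_rng(1).normal(size=(5, sizes[0]))
dbf = forward_hidden(dbf_extractor, frames)
print(dbf.shape)
```

Because the extractor is just a prefix of the full layer stack, no retraining is involved at this step: the bottleneck activations are read out with the same parameters the fine-tuned network already has.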
Figure 2.
Block diagram of our proposed DBF-TV LID system.
The system consists of two main phases: the acoustic front-end and the TV modeling back-end.
Figure 3.
Block diagrams of two PDBF-TV LID systems.
The diagram above the dashed line is the PDBF-TV system with late fusion. The diagram below the dashed line is the PDBF-TV system with early fusion.
Table 1.
Performance comparison between the DBF-TV and SDC-TV systems on LRE09.
Figure 4.
Comparison of DET curves for the MA DBF-TV and SDC-TV systems.
Table 2.
Performance comparison of different temporal context sizes using 43-dimensional DBF on LRE09.
Figure 5.
EER of the MA DBF-TV system for different DBF dimensions on LRE09.
The left panel shows the results for 30 s, the middle panel for 10 s, and the right panel for 3 s.
Table 3.
Performance comparison of the two PDBF-TV systems on LRE09.
Figure 6.
Comparison of DET curves for PPRLM, PDBF-TV (MA+EN), and their fusion on LRE09.
Table 4.
Fusion results for the PDBF-TV and PPRLM systems on LRE09.