[1]https://docs.google.com/presentation/d/1RFfws_WdT2lBrURbPLxNJScUORArQfCOJlGk4NYaYc/edit?usp=sharing.
[2]McLaughlin J, Reynolds D A, Gleason T. A study of computation speed-ups of the GMM-UBM speaker recognition system[C]//Sixth European conference on speech communication and technology. 1999.
[3]Kenny P, Boulianne G, Ouellet P, et al. Joint factor analysis versus eigenchannels in speaker recognition[J]. IEEE Transactions on Audio, Speech, and Language Processing, 2007, 15(4): 1435-1447.
[4]Dehak N, Kenny P J, Dehak R, et al. Front-end factor analysis for speaker verification[J]. IEEE Transactions on Audio, Speech, and Language Processing, 2010, 19(4): 788-798.
[5]Variani E, Lei X, McDermott E, et al. Deep neural networks for small footprint text-dependent speaker verification[C]//2014 IEEE international conference on acoustics, speech and signal processing (ICASSP). IEEE, 2014: 4052-4056.
[6]Li C, Ma X, Jiang B, et al. Deep speaker: an end-to-end neural speaker embedding system[J]. arXiv preprint arXiv:1705.02304, 2017.
[7]https://www.cnblogs.com/wuxian11/p/6498699.html.
[8]https://blog.csdn.net/weixin_38206214/article/details/81096092.
[9]https://blog.csdn.net/KevinBetterQ/article/details/85476575.
[10]Sainath T N, He Y, Li B, et al. A streaming on-device end-to-end model surpassing server-side conventional model quality and latency[C]//ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2020: 6059-6063.
[11]https://github.com/hirofumi0810/neural_sp.
[12]https://github.com/cywang97/StreamingTransformer.
[13]https://speech.ee.ntu.edu.tw/~hylee/dlhlp/2020-spring.html.
[14]LeCun Y, Bottou L, Bengio Y, et al. Gradient-based learning applied to document recognition[J]. Proceedings of the IEEE, 1998, 86(11): 2278-2324.
[15]Krizhevsky A, Sutskever I, Hinton G E. Imagenet classification with deep convolutional neural networks[J]. Advances in neural information processing systems, 2012, 25.
[16]He K, Zhang X, Ren S, et al. Deep residual learning for image recognition[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2016: 770-778.
[17]Bolya D, Zhou C, Xiao F, et al. Yolact: Real-time instance segmentation[C]//Proceedings of the IEEE/CVF international conference on computer vision. 2019: 9157-9166.
[18]Goodfellow I, Pouget-Abadie J, Mirza M, et al. Generative adversarial networks[J]. Communications of the ACM, 2020, 63(11): 139-144.
[19]Karras T, Laine S, Aila T. A style-based generator architecture for generative adversarial networks[C]//Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2019: 4401-4410.
[20]https://github.com/deepinsight/insightface.
[21]https://github.com/facebookresearch/detectron2.
[22]https://venturebeat.com/2021/11/22/nvidias-latest-ai-tech-translates-text-into-landscape-images/.