Agenda

06 Dec 2023 10:30

Recent Breakthroughs in Automatic Speech Recognition and Neural Audio Synthesis

Zoom Meeting

Zoom: https://meet.google.com/xts-qsfx-ivd

Speaker: Ambuj Mehrish, Singapore University of Technology and Design

Abstract:
The advent of deep learning has ushered in a revolutionary era in the domains of Automatic Speech Recognition (ASR) and Neural Audio Synthesis. This transformative paradigm shift is marked by the unprecedented capabilities of deep learning methodologies to unravel intricate patterns within audio data. From end-to-end learning architectures to attention mechanisms and transfer learning strategies, deep learning has fundamentally redefined how we perceive and harness the potential of audio technologies.

In this exploration, we delve into the profound impact of deep learning on ASR and Neural Audio Synthesis. The ability of deep learning models to autonomously learn and represent complex features, coupled with advancements in neural network architectures, has propelled these technologies to new heights. The scalability afforded by large datasets, together with innovative approaches like transfer learning, has not only enhanced the accuracy of ASR systems but has also elevated the quality and naturalness of synthesized audio outputs.

This brief journey navigates through the key mechanisms and advancements that underpin the deep learning revolution in ASR and Neural Audio Synthesis. From real-time processing capabilities to the adaptability enabled by attention mechanisms, we unravel the intricate tapestry of innovations that collectively shape the landscape of audio technologies. As we progress, we aim to provide a glimpse into the transformative potential and ongoing evolution fostered by deep learning, shaping the present and future trajectories of ASR and Neural Audio Synthesis.

Bio Sketch: 
Ambuj Mehrish is an AI researcher specializing in speech processing, machine learning and neural networks. His professional journey has been marked by impactful roles, including a position as a Postdoctoral Researcher at the University of Le Mans, France, and a Visiting Researcher at NUS, Singapore. Presently serving as a Research Fellow I in the ISTD Department at the Singapore University of Technology and Design, Ambuj dedicates his expertise to advancing Neural Audio Synthesis research, with a focus on enhancing Text-to-Speech (TTS) quality, efficiency, and naturalness.
Within SUTD, Ambuj is an integral part of the DeCLaRe Lab, a collaborative initiative with a mission to infuse cognitive and language skills of human-like depth into machines. The lab tackles challenging Natural Language Processing (NLP) problems, including dialogue comprehension and generation, commonsense reasoning, multimodal understanding, and more. Throughout his career, Ambuj has made significant contributions to the field, evident in his work on research papers, reports, and technical documentation. His passion for exploring the potential of AI remains undiminished as he continues to strive for meaningful contributions to the evolution of the field.

Language

The event will be held in English

Organizer

Dipartimento di Scienze Ambientali, Informatica e Statistica - Prof. Sebastiano Vascon
