Kangri-Revitalization-ASR

Using Automatic Speech Recognition (ASR) to help revitalize the Kangri language.

As featured in Hindustan Times

“Using Automatic Speech Recognition for the Low Resource, Morphologically Rich Language Kangri”. The research is currently underway at the National Institute of Technology, Hamirpur, under the guidance of Dr. Shweta Chauhan and with help from Dr. Rajesh Bhatt of UMass Amherst. So far, I have spent around 20 hours/week, 30 weeks/year on this research.

Using ASR reduces the transcription bottleneck associated with manually transcribing acoustic recordings. Recent advances in ASR have made it possible to achieve similar Word Error Rates (WER) with substantially less data and less training time. As a result, there has been significant research interest in pushing the limits of techniques such as semi-supervised learning (Zhang et al.), unsupervised learning, and joint unsupervised and supervised training (Bai et al.). My research explores several of these ASR paradigms and aims to determine which ones give the best results for a low-resource, morphologically rich language such as Kangri, keeping in mind the size of the dataset and the time required to train on it.
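Since WER is the metric used to compare paradigms, here is a minimal sketch of how it is commonly computed: the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the ASR output, divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration; this is not necessarily the scoring code used in this project, and the example strings are placeholders, not Kangri data.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference words.

    Assumption: words are separated by whitespace. Real pipelines often
    normalize punctuation and casing before scoring.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Placeholder transcripts: one substitution and one deletion over 4 words.
    print(word_error_rate("a b c d", "a x c"))
```

Note that WER can exceed 1.0 when the hypothesis contains many inserted words, and that for a morphologically rich language a single wrong affix counts as a full word error, which is one reason character-level error rates are often reported alongside WER.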