Chae Young Lee recently graduated from Hankuk Academy of Foreign Studies and currently works as a data scientist on Naver’s Clova AI Research Team. Last year, she participated in Deep Learning Camp Jeju 2018 as the only high school student and developed the Conditional WaveGAN. She is also well known for her Medium post on TPUs.
In recent years, speech data has been in the spotlight for a range of deep learning applications, from automatic speech recognition (ASR) systems to source separation. Yet far fewer augmentation techniques have been explored for speech data than for image data. In this track, we will explore various methods to augment speech data. This hands-on tutorial works through the task of building a simple speech classifier on the Speech Commands Zero to Nine (SC09) dataset, made available by TensorFlow, and covers traditional augmentation techniques, transfer learning, GAN augmentation, and style transfer as ways to increase classification accuracy. Participants are required to download the libraries and pre-trained models, which will be available in late January.
Speech
Domain Adaptation
Transfer Learning
Data Augmentation
TensorFlow, Librosa, Numpy
Pre-trained models of CNN (SC20), DCGAN, and CycleGAN
SC09 dataset
Outline
1. Motivation
Implementing ML in custom tasks
Data shortage
2. Baseline Classifier
Model: CNN trainable on a laptop CPU (speech.py)
Dataset: Speech Commands Zero to Nine (SC09)
Input: Spectrogram images
Pre-processing data → model setup → initial training
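As a rough illustration of the pre-processing step, the sketch below turns a waveform into a log-magnitude spectrogram using only NumPy. It is not the tutorial's speech.py: the function name `log_spectrogram`, the frame size, and the hop length are assumptions chosen for the example, and a synthetic tone stands in for a real SC09 clip.

```python
import numpy as np

def log_spectrogram(wave, n_fft=256, hop=128, eps=1e-6):
    """Return a (frames, n_fft // 2 + 1) log-magnitude spectrogram."""
    wave = np.asarray(wave, dtype=np.float32)
    # Pad short clips so that at least one full frame fits.
    if len(wave) < n_fft:
        wave = np.pad(wave, (0, n_fft - len(wave)))
    n_frames = 1 + (len(wave) - n_fft) // hop
    window = np.hanning(n_fft)
    # Slice the waveform into overlapping, windowed frames.
    frames = np.stack([wave[i * hop : i * hop + n_fft] * window
                       for i in range(n_frames)])
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    return np.log(magnitude + eps)

# One second of a 440 Hz tone at 16 kHz stands in for a real clip (assumption).
sr = 16000
t = np.arange(sr) / sr
spec = log_spectrogram(np.sin(2 * np.pi * 440 * t))
```

The resulting 2-D array can be treated like a single-channel image and fed to a CNN.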
3. Traditional Augmentation
Adding noise
Stretching
Shifting pitch
Rolling
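The waveform-level augmentations listed above can be sketched in NumPy as follows. The function names and default values are illustrative assumptions, not the tutorial's code. `change_speed` is a naive resampling stand-in that alters duration and pitch together; for pitch-preserving stretching and for pitch shifting, Librosa's `librosa.effects.time_stretch` and `librosa.effects.pitch_shift` are the usual tools.

```python
import numpy as np

def add_noise(wave, noise_level=0.005, rng=None):
    """Add white Gaussian noise scaled by `noise_level`."""
    rng = np.random.default_rng() if rng is None else rng
    return wave + noise_level * rng.standard_normal(len(wave))

def roll(wave, shift):
    """Circularly shift the waveform by `shift` samples."""
    return np.roll(wave, shift)

def change_speed(wave, rate):
    """Naively resample by linear interpolation (rate > 1 shortens the clip).

    Unlike a phase-vocoder stretch, this also changes the pitch.
    """
    n_out = int(round(len(wave) / rate))
    positions = np.linspace(0, len(wave) - 1, n_out)
    return np.interp(positions, np.arange(len(wave)), wave)

def fix_length(wave, length):
    """Pad with zeros or crop so every clip has the same length."""
    if len(wave) >= length:
        return wave[:length]
    return np.pad(wave, (0, length - len(wave)))

# A synthetic one-second tone stands in for a real clip (assumption).
sr = 16000
wave = np.sin(2 * np.pi * 440 * np.arange(sr) / sr)
rng = np.random.default_rng(0)
augmented = [
    add_noise(wave, rng=rng),
    roll(wave, sr // 10),
    fix_length(change_speed(wave, 1.25), sr),
]
```

Each augmented clip keeps the original label and the original length, so it can be added to the training set without changing the classifier's input shape.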
4. Transfer Learning
Model: the same CNN
Dataset: Speech Commands, 20 classes (SC20)
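One common way to transfer from the 20-class task to the 10-class SC09 task is to keep the pre-trained feature layers frozen and train a fresh 10-way output layer. The toy NumPy sketch below shows only that mechanic. Everything in it is an assumption for illustration: the weights are random stand-ins for a pre-trained model, the data is synthetic, and the tutorial itself may fine-tune all layers instead.

```python
import numpy as np

rng = np.random.default_rng(0)

# Random stand-in for weights pre-trained on the 20-class task (assumption).
pretrained = {
    "features": rng.standard_normal((64, 32)) * 0.1,  # input -> hidden
    "head": rng.standard_normal((32, 20)) * 0.1,      # hidden -> 20 classes
}

def extract(x, w):
    """Frozen feature extractor: a single ReLU layer."""
    return np.maximum(x @ w, 0.0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Synthetic 10-class data standing in for SC09 inputs (assumption).
x = rng.standard_normal((200, 64))
y = rng.integers(0, 10, size=200)

# Discard the 20-way head and start a fresh 10-way head.
new_head = np.zeros((32, 10))
feats = extract(x, pretrained["features"])  # computed once, never updated
features_before = pretrained["features"].copy()

def loss(head):
    p = softmax(feats @ head)
    return -np.log(p[np.arange(len(y)), y]).mean()

initial_loss = loss(new_head)
for _ in range(200):
    p = softmax(feats @ new_head)
    p[np.arange(len(y)), y] -= 1.0          # cross-entropy gradient w.r.t. logits
    new_head -= 0.1 * feats.T @ p / len(y)  # gradient step on the head only
final_loss = loss(new_head)
```

Because only `new_head` is updated, the pre-trained feature weights stay exactly as loaded, which keeps training cheap enough for a laptop CPU.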
5. GAN Augmentation
Model: DCGAN
Generating more SC09 data
Pre-trained model → generation (on the spot)
6. Style Transfer
Model: CycleGAN/StarGAN
Generate data by converting gender, age, etc.
Pre-trained model → generation (on the spot)
7. Conclusion
Compare accuracy
Insights
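Comparing accuracy across training runs needs only a little bookkeeping, sketched below in plain Python. The labels, predictions, and run names are made up purely to show the mechanics and are not results from the tutorial.

```python
def accuracy(labels, predictions):
    """Fraction of predictions that match the labels."""
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    correct = sum(int(a == b) for a, b in zip(labels, predictions))
    return correct / len(labels)

# Made-up labels and predictions (assumption), one entry per test clip.
labels = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
runs = {
    "run_a": [0, 1, 2, 3, 4, 5, 6, 7, 8, 8],
    "run_b": [0, 1, 2, 3, 4, 0, 0, 0, 0, 0],
}

scores = {name: accuracy(labels, preds) for name, preds in runs.items()}
# Print the runs from most to least accurate.
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.2f}")
```

For a fair comparison, every run should be scored on the same held-out test set, which should contain only real recordings and no augmented or generated clips.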
This tutorial is first come, first served. Please register soon, as the number of spots is limited.
As a prerequisite, participants are asked to download all the dev packages and data.
Registration opens soon – stay tuned!