1 School of Data Science, Shenzhen Research Institute of Big Data
2 The Chinese University of Hong Kong, Shenzhen (CUHK-Shenzhen), Shenzhen, China
3 Fuxi AI Lab, NetEase Inc., Hangzhou, China
In accented voice conversion, or accent conversion, we seek to convert the accent of speech from one to another while preserving speaker identity and semantic content. In this study, we formulate a novel method for creating multi-accented speech samples, that is, pairs of accented speech samples by the same speaker, through text transliteration for training accent conversion systems. We begin by generating transliterated text with a Large Language Model (LLM), which is then fed into multilingual TTS models to synthesize accented English speech. As a reference system, we built a sequence-to-sequence model on the synthetic parallel corpus for accent conversion. We validated the proposed method for both native and non-native English speakers. Subjective and objective evaluations further validate our dataset's effectiveness in accent conversion studies.
The project page for this paper is available here: GitHub Repository.
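As a rough illustration of the pipeline summarized in the abstract (LLM transliteration followed by multilingual TTS), the sketch below builds same-speaker accent pairs. It is a minimal sketch under stated assumptions, not the authors' implementation: `llm` and `tts` are hypothetical callables that you would back with your own LLM and multilingual TTS system, and the prompt wording in `build_prompt` is invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical interfaces (assumptions, not part of MacST):
Transliterator = Callable[[str], str]            # prompt -> transliterated text
Synthesizer = Callable[[str, str, str], bytes]   # (text, language, speaker) -> audio


@dataclass
class AccentPair:
    utt_id: str
    native_text: str
    transliterated_text: str
    native_audio: bytes
    accented_audio: bytes


def build_prompt(sentence: str, language: str) -> str:
    # Illustrative wording only; the prompt used by the authors is not shown here.
    return (
        f"Transliterate this English sentence into {language} script. "
        f"Keep the English words; do not translate them.\n{sentence}"
    )


def make_pairs(
    sentences: Dict[str, str],
    language: str,
    speaker: str,
    llm: Transliterator,
    tts: Synthesizer,
) -> List[AccentPair]:
    """Create (native, accented) pairs for one speaker and one target accent."""
    pairs = []
    for utt_id, sentence in sentences.items():
        # Step 1: ask the LLM for the sentence written in the target script.
        translit = llm(build_prompt(sentence, language)).strip()
        # Step 2: synthesize both versions with the same speaker identity.
        pairs.append(
            AccentPair(
                utt_id=utt_id,
                native_text=sentence,
                transliterated_text=translit,
                native_audio=tts(sentence, "English", speaker),
                accented_audio=tts(translit, language, speaker),
            )
        )
    return pairs
```

Passing the two models in as callables keeps the sketch independent of any particular LLM or TTS API, which this page does not specify.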
This section includes Hindi-accented audio samples generated by our proposed generation method, MacST. Three speakers are involved in this section: SLT, ASI, and TNI.
ID | arctic_a0058 | arctic_a0085 | arctic_a0131 | arctic_a0202 | arctic_a0210 | arctic_a0274 | arctic_a0544 | arctic_a0561 | arctic_b0019 | arctic_b0047 | arctic_b0067 | arctic_b0083 | arctic_b0113 | arctic_b0142 | arctic_b0147 | arctic_b0210 | arctic_b0257 | arctic_b0317 | arctic_b0318 | arctic_b0497 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Text | Original: I came for information more out of curiosity than anything else. | Original: A trickle of fresh blood ran over his face. | Original: Providence had delivered him through the maelstrom. | Original: She turned, fearing that Jacques might see what was in her face. | Original: Its diameter was not more than two hundred yards. | Original: And Raoul listened again to the tale of the house. | Original: He may anticipate the day of his death. | Original: Bill lingered, contemplating his work with artistic appreciation. | Original: His slim hands gripped the edges of the table. | Original: Suppose you saw me at work through the window. | Original: Below him the shadow was broken into a pool of rippling starlight. | Original: In it there was something that was almost tragedy. | Original: There was something pathetic in the girl's attitude now. | Original: Something vastly more thrilling had come into it now. | Original: Had it struck squarely it would have killed him. | Original: Whatever he guessed he locked away in the taboo room of Naomi. | Original: Tudor surveyed him with withering disgust. | Original: He did not believe in the burning of daylight for such a luxury. | Original: Again he had done the big thing. | Original: One great drawback to farming in California is our long dry summer. |
Ground Truth (SLT/American) | ||||||||||||||||||||
MacST (SLT/American) | ||||||||||||||||||||
MacST (SLT/Hindi) | ||||||||||||||||||||
Ground Truth (ASI/Hindi) | ||||||||||||||||||||
MacST (ASI/Hindi) | ||||||||||||||||||||
Ground Truth (TNI/Hindi) | ||||||||||||||||||||
MacST (TNI/Hindi) |
* please scroll horizontally to explore additional columns in the table.
This section includes Korean-accented audio samples generated by our proposed generation method, MacST. Three speakers are involved in this section: SLT, HKK, and YDCK.
ID | arctic_a0058 | arctic_a0085 | arctic_a0131 | arctic_a0202 | arctic_a0210 | arctic_a0274 | arctic_a0544 | arctic_a0561 | arctic_b0019 | arctic_b0047 | arctic_b0067 | arctic_b0083 | arctic_b0113 | arctic_b0142 | arctic_b0147 | arctic_b0210 | arctic_b0257 | arctic_b0317 | arctic_b0318 | arctic_b0497 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Text | Original: I came for information more out of curiosity than anything else. | Original: A trickle of fresh blood ran over his face. | Original: Providence had delivered him through the maelstrom. | Original: She turned, fearing that Jacques might see what was in her face. | Original: Its diameter was not more than two hundred yards. | Original: And Raoul listened again to the tale of the house. | Original: He may anticipate the day of his death. | Original: Bill lingered, contemplating his work with artistic appreciation. | Original: His slim hands gripped the edges of the table. | Original: Suppose you saw me at work through the window. | Original: Below him the shadow was broken into a pool of rippling starlight. | Original: In it there was something that was almost tragedy. | Original: There was something pathetic in the girl's attitude now. | Original: Something vastly more thrilling had come into it now. | Original: Had it struck squarely it would have killed him. | Original: Whatever he guessed he locked away in the taboo room of Naomi. | Original: Tudor surveyed him with withering disgust. | Original: He did not believe in the burning of daylight for such a luxury. | Original: Again he had done the big thing. | Original: One great drawback to farming in California is our long dry summer. |
Ground Truth (SLT/American) | ||||||||||||||||||||
MacST (SLT/American) | ||||||||||||||||||||
MacST (SLT/Korean) | ||||||||||||||||||||
Ground Truth (HKK/Korean) | ||||||||||||||||||||
MacST (HKK/Korean) | ||||||||||||||||||||
Ground Truth (YDCK/Korean) | ||||||||||||||||||||
MacST (YDCK/Korean) |
* please scroll horizontally to explore additional columns in the table.
This section includes audio samples generated by accent conversion (AC) models that target a Hindi accent. The input is native-accented (American) speech and the output is Hindi-accented speech. We used the speaker SLT for the whole experiment. We compared two models. The first (AC w/o Data Augmentation) was trained on paired data with ground-truth input from CMU-ARCTIC and synthetic target output from MacST. The second (AC w/ Data Augmentation (ours)) was additionally trained on synthetic speech pairs from MacST. For this augmentation, we generated American- and Hindi-accented speech pairs from 1 hour of ARCTIC transcriptions and an extra 3 hours of VCTK transcriptions.
ID | arctic_a0058 | arctic_a0085 | arctic_a0131 | arctic_a0202 | arctic_a0210 | arctic_a0274 | arctic_a0544 | arctic_a0561 | arctic_b0019 | arctic_b0047 | arctic_b0067 | arctic_b0083 | arctic_b0113 | arctic_b0142 | arctic_b0147 | arctic_b0210 | arctic_b0257 | arctic_b0317 | arctic_b0318 | arctic_b0497 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Text | Original: I came for information more out of curiosity than anything else. | Original: A trickle of fresh blood ran over his face. | Original: Providence had delivered him through the maelstrom. | Original: She turned, fearing that Jacques might see what was in her face. | Original: Its diameter was not more than two hundred yards. | Original: And Raoul listened again to the tale of the house. | Original: He may anticipate the day of his death. | Original: Bill lingered, contemplating his work with artistic appreciation. | Original: His slim hands gripped the edges of the table. | Original: Suppose you saw me at work through the window. | Original: Below him the shadow was broken into a pool of rippling starlight. | Original: In it there was something that was almost tragedy. | Original: There was something pathetic in the girl's attitude now. | Original: Something vastly more thrilling had come into it now. | Original: Had it struck squarely it would have killed him. | Original: Whatever he guessed he locked away in the taboo room of Naomi. | Original: Tudor surveyed him with withering disgust. | Original: He did not believe in the burning of daylight for such a luxury. | Original: Again he had done the big thing. | Original: One great drawback to farming in California is our long dry summer. |
Ground Truth (SLT/American/source) | ||||||||||||||||||||
MacST (SLT/American) | ||||||||||||||||||||
MacST (SLT/Hindi/target) | ||||||||||||||||||||
AC w/o Data Augmentation | ||||||||||||||||||||
AC w/ Data Augmentation (ours) |
* please scroll horizontally to explore additional columns in the table.
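To make the difference between the two training sets concrete, here is a minimal sketch of how the pair lists for the two AC models might be assembled. It reflects one reading of the setup described for this section (ground-truth SLT input with a MacST Hindi target for the baseline, plus fully synthetic American/Hindi pairs for the augmented model). The directory layout, file names, and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class TrainingPair:
    utt_id: str
    source_wav: str  # input audio (native/American accent)
    target_wav: str  # output audio (Hindi accent)


def baseline_pairs(arctic_ids: Iterable[str]) -> List[TrainingPair]:
    """AC w/o Data Augmentation: ground-truth SLT input, MacST Hindi target."""
    return [
        # Hypothetical paths; adapt to your own corpus layout.
        TrainingPair(u, f"cmu_arctic/SLT/{u}.wav", f"macst/SLT_hindi/{u}.wav")
        for u in arctic_ids
    ]


def synthetic_pairs(utt_ids: Iterable[str]) -> List[TrainingPair]:
    """Fully synthetic pairs: both sides generated by MacST for the same text."""
    return [
        TrainingPair(u, f"macst/SLT_american/{u}.wav", f"macst/SLT_hindi/{u}.wav")
        for u in utt_ids
    ]


def augmented_pairs(
    arctic_ids: Iterable[str], vctk_ids: Iterable[str]
) -> List[TrainingPair]:
    """AC w/ Data Augmentation: baseline pairs plus synthetic pairs generated
    from ARCTIC and VCTK transcriptions (assumed composition)."""
    arctic_ids = list(arctic_ids)
    return (
        baseline_pairs(arctic_ids)
        + synthetic_pairs(arctic_ids)
        + synthetic_pairs(vctk_ids)
    )
```

Keeping the baseline pairs as a strict subset of the augmented set means any difference between the two models can be attributed to the added synthetic pairs.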
This section includes audio samples of additional accents generated by MacST, including Japanese, Russian, and Arabic. We used a speaker provided by ElevenLabs. Each audio sample is accompanied by the transliterated text corresponding to the accent.
ID | arctic_a0058 | arctic_a0085 | arctic_a0131 | arctic_a0202 | arctic_a0210 | arctic_a0274 | arctic_a0544 | arctic_a0561 | arctic_b0019 | arctic_b0047 | arctic_b0067 | arctic_b0083 | arctic_b0113 | arctic_b0142 | arctic_b0147 | arctic_b0210 | arctic_b0257 | arctic_b0317 | arctic_b0318 | arctic_b0497 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Text | I came for information more out of curiosity than anything else. | A trickle of fresh blood ran over his face. | Providence had delivered him through the maelstrom. | She turned, fearing that Jacques might see what was in her face. | Its diameter was not more than two hundred yards. | And Raoul listened again to the tale of the house. | He may anticipate the day of his death. | Bill lingered, contemplating his work with artistic appreciation. | His slim hands gripped the edges of the table. | Suppose you saw me at work through the window. | Below him the shadow was broken into a pool of rippling starlight. | In it there was something that was almost tragedy. | There was something pathetic in the girl's attitude now. | Something vastly more thrilling had come into it now. | Had it struck squarely it would have killed him. | Whatever he guessed he locked away in the taboo room of Naomi. | Tudor surveyed him with withering disgust. | He did not believe in the burning of daylight for such a luxury. | Again he had done the big thing. | One great drawback to farming in California is our long dry summer. |
American | i came for information more out of curiosity than anything else. | a trickle of fresh blood ran over his face. | providence had delivered him through the maelstrom. | she turned, fearing that jacques might see what was in her face. | its diameter was not more than two hundred yards. | and raoul listened again to the tale of the house. | he may anticipate the day of his death. | bill lingered, contemplating his work with artistic appreciation. | his slim hands gripped the edges of the table. | suppose you saw me at work through the window. | below him the shadow was broken into a pool of rippling starlight. | in it there was something that was almost tragedy. | there was something pathetic in the girl's attitude now. | something vastly more thrilling had come into it now. | had it struck squarely it would have killed him. | whatever he guessed he locked away in the taboo room of naomi. | tudor surveyed him with withering disgust. | he did not believe in the burning of daylight for such a luxury. | again he had done the big thing. | one great drawback to farming in california is our long dry summer. |
Hindi | आई कैम फॉर इन्फर्मेशन मोर आउट अफ क्यूरिओसिटी थैन एनीथिंग एल्स. | ए त्रिकल अव फ्रेश ब्लड रैन ओवर हिस फेस. | प्रॉविडन्स हैड डिलिवर्ड हिम थ्रू द मेलस्ट्रॉम. | शी टर्न्ड, फियरिंग दॅट जैक माइट सी वट वज इन हर फेस. | इट्स डाइअमीटर वस नॉट मोर दन टू हंड्रेड यार्ड्स. | ऐंड राउल लिसन्ड अगैन टू द टेल अव द हाउस. | ही मे एंटिसिपेट द डे अव हिस डेथ. | बिल लिंगर्ड, कॉन्टेम्प्लेटिंग हिस वर्क विथ आर्टिस्टिक आप्रेसिएशन. | हिस स्लिम हैंड्स ग्रिप्पेड द एजेस ऑफ द टेबल. | सपोज़ यू सॉ मी ऐट वर्क थ्रू द विंडो. | बिलो हिम द शैडो वस ब्रोकेन इंटू ए पूल अव रिप्लिंग स्टारलाइट. | इन इट देयर वज़ समथिंग दैट वज़ ऑलमोस्ट ट्रैजेडी. | देर वज़ समथिंग पैथेटिक इन द गर्ल्स ऐटिट्यूड नाउ. | समथिंग वैस्टली मोर थ्रिलिंग हैड कम इंटू इट नाउ. | हैड इट स्ट्रक स्क्वेरली इट वुड हैव किल्ड हिम. | व्हाटेव ही गेस्ट ही लॉक्ट अवे इन द टैबू रूम अव नओमी. | ट्यूडर सर्वेड हिम विथ विथरिंग डिसगस्ट. | ही डिड नाट बिलीव इन द बर्निंग अव डेलाइट फॉर सच ए लक्जरी. | अगेन ही हैड डन द बिग थिंग. | वन ग्रेट ड्रॉबैक टू फार्मिंग इन कैलिफ़ोर्निया इज़ आवर लांग ड्राई समर. |
Korean | 아이 케임 포 인포메이션 모어 아웃 옵 큐리어시 댄 애니씽 엘스. | 에이 트리클 어브 프레시 블러드 랜 오버 히즈 페이스. | 프로비던스 해드 딜리버드 힘 쓰루 더 메일스트롬. | 시 턴드, 피어링 댓 자크 마이트 씨 왓 워즈 인 허 페이스. | 잇츠 다이어미터 와즈 낫 모어 댄 투 헌드레드 야드즈. | 앤드 라울 리슨드 어게인 투 더 테일 어브 더 하우스. | 히 메이 앤티시페이트 더 데이 어브 히즈 데스. | 빌 링거드, 컨템플레이팅 히즈 워크 위드 아티스틱 어프리케이션. | 히즈 슬림 햔즈 그립트 디 에지즈 옵 디 테이블. | 서포즈 유 소 미 앳 워크 쓰루 더 윈도우. | 빌로 힘 더 샤도우 워즈 브로큰 인투 에이 풀 오브 리플링 스타라이트. | 인 잇 데어 워즈 썸씽 댓 워즈 올모스트 트래저디. | 데어 워즈 썸씽 패써릭 인 더 걸즈 에티튜드 나우. | 썸씽 배슬리 모어 쓰릴링 해드 컴 인투 잇 나우. | 핻 잇 스트럭 스퀘얼리 잇 우드 해브 킬드 힘. | 왓에버 히 게스트 히 락트 어웨이 인 더 타부 룸 오브 나오미. | 튜더 서베이드 힘 위드 위더링 디스거스트. | 히 디드 낫 빌리브 인 더 버닝 어브 데이라이트 포어 서치 에이 럭셔리. | 어겐 히 해드 던 더 빅 띵. | 원 그레이트 드로백 투 파밍 인 캘리포니아 이즈 아워 롱 드라이 서머. |
Japanese | アイ ケイム フォー インフォーメーション モア アウト オブ キュリオシティ ザン エニシング エルス. | エイ トリクル オヴ フレッシュ ブラッド ラン オーバー ヒズ フェイス. | プロビデンス ハド デリバード ヒム スルー ザ メイルストロム. | シー ターンド, フィアリング ザット ジャック マイト シー ワット ワズ イン ハー フェイス. | イッツ ダイアメーター ワズ ノット モア ザン トゥー ハンドレッド ヤーズ. | アンド ラウル リスンド アゲン トゥ ザ テイル オブ ザ ハウス. | ヒー メイ アンティシペイト ザ デイ オブ ヒズ デス. | ビル リンガー, コンテンプレイティング ヒズ ワーク ウィズ アーティスティック アプリシエーション. | ヒズ スリム ハンズ グリップト ザ エッジズ オヴ ザ テーブル. | サポーズ ユー ソー ミー アット ワーク スルー ザ ウィンドウ. | ビロウ ヒム ザ シャドウ ワズ ブローケン イントゥ エイ プール オブ リプリング スターライト. | イン イット ゼア ワズ サムシング ザット ワズ オールモスト トラジェディ. | ゼア ワズ サムシング パセティック イン ザ ガールズ アティチュード ナウ. | サムシング ヴァストリ モア スリリング ハッド カム イントゥ イット ナウ. | ハッド イット ストラック スクウェアリー イット ウッド ハヴ キルド ヒム. | ワットエバー ヒー ゲスト ヒー ロックト アウェイ イン ザ タブー ルーム オブ ナオミ. | チューダー サーベイド ヒム ウィズ ウィザリング ディスガスト. | ヒー ディッド ナート ビリーヴ イン ザ バーニング オブ デイライト フォー サッチ エー ラグジュアリ. | アゲン ヒー ハッド ダン ザ ビッグ シング. | ワン グレート ドローバック トゥー ファーミング イン カリフォルニア イズ アウアー ロング ドライ サマー. |
Russian | ай кейм фор информэйшн мор аут ов куриˈɔсити тан энисинг элс. | эй трикл ов фреш блад рэн овер хиз фейс. | провиденс хэд деливерд хим тру зе мейлстром. | ши тёрнд, фиринг дат жак майт си ват ваз ин хёр фейс. | итс диаметр уаз нот море тан ту хандред ярдз. | энд раул лисэнд агэн ту зэ тэйл ав зэ хаус. | хи мей антисипейт зэ дей ов хиз дэс. | бил лингерд, контемплейтинг хис вурк вид артИстик апришиэйшн. | хис слим хандс грипт дэ еджес ов дэ тэйбл. | сапоз ю со ми эт ворк тру де окно. | билоу хим зэ шэдоу ваз брокен инту эй пул ов риплинг старлайт. | ин ит зэр ваз самсинг зэт ваз олмост траджеди. | дэр ваз самсинг патерик ин де гёрлз этитуд нау. | самтинг вастли мор триллинг хад кам инту ит нау. | хад ит страк сквэрли ит вуд хэв килд хим. | ватэвэр хи гест хи локт авей ин де табу рум ов наоми. | тюдор сервейд хим виз визеринг дисгаст. | хи дид нат билив ин зе бёрнинг ав дейлайт фор суч эй лакшери. | аген хи хэд дан зэ биг синг. | ван грейт дробак ту фарминг ин калифорния из ауэр лонг драй саммер. |
Arabic | أي كيم فور إنفورميشن مور آوت أف كيوريوسيتي ثان أنيثنغ آلس. | أي تريكل أوف فريش بلد ران أوفر هيز فيس. | بروفيدنس هاد دليفرد هيم ثرو ذَ ميلستروم. | شي ترند, فيرينغ ذات جاك مايت سي وت وز إن هر فيس. | إتس داياميتر واز ناٹ مور ثان تو هُندَرد ياردز. | أند راؤول لِسند أَجَين تو ذا تيل أوف ذا هاوس. | هي مي أنتيسبيت ذا دي أوف هز ديث. | بيل لينجرد, كونتمبليتنج هس وورك ويث ارتيستيك ابريشيشن. | هِس سليم هاندز جريبت ذِ إيجِز اف ذِ تَيْبَل. | سَپُوز يو ساو مي أت وُرك ثرو ذَ ويندو. | بيلو هِم ذا شادو وَز بروكن إنتو إي بول أوف ريبلينغ ستارلايت. | إن إت ذير وز سمثنغ ذات وز ألموست تراجيدي. | ذير وز سومثينغ باثيتك إن ذا جيرلز اتتيود ناو. | سمثينج فاستلي مور ثرلينج هاد كوم إنتو إت ناو. | هَد إت سترك سكويرلي إت وُد هَف كِلد هِم. | واتيفور هي جست هي لاكت أوي إن ذا تابو روم أف ناومي. | تودور سيرفيد هم ويذ ويذرينج دسجست. | هي ديد نات بليف إن ذا بيرننج أف ديلايت فور ساتش إي لكشري. | أَجَن هِي هاد دَن ذَ بِغ ثينغ. | وَن جريت دروبك تو فارمينغ إن كاليفورنيا إز أور لونغ دراي سَمَر. |
* please scroll horizontally to explore additional columns in the table.
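Because the transliterated text in the table above is produced by an LLM, a simple script check can help flag rows that drift out of the target writing system before they are sent to TTS. The sketch below is one possible sanity check and is not part of MacST as described on this page. It uses Python's standard `unicodedata` module; the language-to-script mapping and the function names are assumptions made for illustration.

```python
import unicodedata

# Assumed mapping from accent label to the Unicode character-name prefix
# of the script used in the table above.
SCRIPT_PREFIX = {
    "Hindi": "DEVANAGARI",
    "Korean": "HANGUL",
    "Japanese": "KATAKANA",
    "Russian": "CYRILLIC",
    "Arabic": "ARABIC",
}


def off_script_letters(text: str, language: str) -> list:
    """Return the letters in `text` that do not belong to the expected script.

    Only characters for which str.isalpha() is true are considered, so
    spaces, punctuation, and combining marks are ignored.
    """
    prefix = SCRIPT_PREFIX[language]
    return [
        ch
        for ch in text
        if ch.isalpha() and not unicodedata.name(ch, "").startswith(prefix)
    ]


def script_ratio(text: str, language: str) -> float:
    """Fraction of letters in `text` that belong to the expected script."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    return 1.0 - len(off_script_letters(text, language)) / len(letters)
```

A row whose ratio falls below 1.0 could then be inspected by hand; for instance, the Russian entry for arctic_a0058 appears to contain a couple of phonetic symbols mixed into the Cyrillic text.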