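Catherina, Angela (2025) Implementasi Sintesis Ucapan Bahasa Indonesia Menggunakan Model Text-to-Speech Tacotron 2 dan HIFI-GAN [Implementation of Indonesian Speech Synthesis Using the Tacotron 2 and HiFi-GAN Text-to-Speech Models] - Submit Seminar. Bachelor thesis, Institut Teknologi Kalimantan.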
11211013_cover.pdf (150kB)
11211013_statement_of_authenticity.pdf (189kB)
11211013_publishing_agreement.pdf (62kB)
11211013_approval_sheet.pdf (186kB)
11211013_preface.pdf (241kB)
11211013_abstract_id.pdf (264kB)
11211013_abstract_en.pdf (219kB), restricted to repository staff only until 4 October 2027
11211013_table_of_content.pdf (384kB), restricted to repository staff only until 4 October 2027
11211013_illustrations.pdf (256kB), restricted to repository staff only until 4 October 2027
11211013_tables.pdf (258kB), restricted to repository staff only until 4 October 2027
11211013_chapter_1.pdf (538kB), restricted to repository staff only until 4 October 2027
11211013_chapter_2.pdf (1MB), restricted to repository staff only until 4 October 2027
11211013_chapter_3.pdf (521kB), restricted to repository staff only until 4 October 2027
11211013_chapter_4.pdf (1MB), restricted to repository staff only until 4 October 2027
11211013_conclusions.pdf (294kB), restricted to repository staff only until 4 October 2027
11211013_bibliography.pdf (334kB)
11211013_enclosure.pdf (747kB), restricted to repository staff only until 4 October 2027
11211013_paper.pdf (1MB), restricted to repository staff only until 4 October 2027
11211013_presentation.pdf (3MB), restricted to repository staff only until 4 October 2027
11211013_Form.TA-020.pdf (227kB), restricted to repository staff only until 4 October 2027
Abstract
This study aims to develop a high-quality Text-to-Speech (TTS) model for the Indonesian language, using the Tacotron 2 architecture as the mel spectrogram synthesizer and HiFi-GAN as the vocoder. The dataset consists of Indonesian-language audiobooks compiled by the researcher and formatted in a structure similar to LJSpeech. Tacotron 2 is trained to convert text into mel spectrograms, while HiFi-GAN generates audio waveforms from the resulting spectrograms. Training was conducted with the open-source SpeechBrain toolkit, which allows architectural modifications, including adjustments to the attention mechanism. Speech quality was evaluated using the Mean Opinion Score (MOS) method and a cross-similarity matrix, and model behavior was further analyzed through attention weight visualization and loss tracking. The results show that the variant with modified content-based attention and no phonemizer achieved the highest overall MOS, 4.09, while the variant with both modified content-based attention and a phonemizer achieved the highest embedding similarity to ground-truth speech (0.916) in the cross-similarity analysis. These findings indicate that architectural modifications can improve model performance and that subjective and objective evaluation methods complement each other in assessing TTS quality. The study demonstrates that the Tacotron 2 and HiFi-GAN pipeline can be implemented effectively for Indonesian speech synthesis with competitive results.
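The two evaluation measures named in the abstract can be illustrated with a minimal sketch. The abstract does not say how the embeddings are extracted or which similarity measure fills the cross-similarity matrix, so the code assumes cosine similarity over fixed-length utterance embeddings. The function names (`mean_opinion_score`, `cosine_similarity`, `cross_similarity_matrix`) are hypothetical and are not taken from the thesis or from SpeechBrain.

```python
import math
from statistics import mean


def mean_opinion_score(ratings):
    """Average listener ratings, assumed to be on the usual 1-5 MOS scale."""
    if not ratings:
        raise ValueError("ratings must not be empty")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("MOS ratings are expected to lie in [1, 5]")
    return mean(ratings)


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length embedding vectors."""
    if len(u) != len(v):
        raise ValueError("embeddings must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def cross_similarity_matrix(synth_embeddings, reference_embeddings):
    """Entry [i][j] compares synthesized utterance i with reference utterance j.

    Assumption: cosine similarity is the comparison measure; the thesis
    abstract does not state which measure was actually used.
    """
    return [
        [cosine_similarity(s, r) for r in reference_embeddings]
        for s in synth_embeddings
    ]
```

Under these assumptions, a per-model similarity figure could be obtained by averaging the diagonal of the matrix, where each synthesized utterance is paired with the ground-truth recording of the same sentence. Whether the reported 0.916 was computed this way is not stated in the abstract.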
Item Type: | Thesis (Bachelor) |
---|---|
Subjects: | Q Science > Q Science (General); Q Science > QA Mathematics > QA75 Electronic computers. Computer science |
Divisions: | Jurusan Matematika dan Teknologi Informasi > Informatika |
Depositing User: | Angela Catherina |
Date Deposited: | 11 Jul 2025 06:05 |
Last Modified: | 11 Jul 2025 06:05 |
URI: | http://repository.itk.ac.id/id/eprint/23605 |