TEXT-TO-SPEECH IMPLEMENTATION USING AN INDONESIAN-LANGUAGE CORPUS BASED ON SV2TTS TRANSFER LEARNING

Vaiz, Muhammad Gozy Al (2026) IMPLEMENTASI TEKS MENJADI UCAPAN MENGGUNAKAN KORPUS BAHASA INDONESIA BERBASIS TRANSFER LEARNING SV2TTS. Bachelor thesis, Institut Teknologi Kalimantan.

Files

Open access:
11191051_cover.pdf (cover, 210kB)
11191051_statement_of_authenticity.pdf (statement of authenticity, 462kB)
11191051_publishing_agreement.pdf (publishing agreement, 595kB)
11191051_approval_sheet.pdf (approval sheet, 515kB)
11191051_preface.pdf (preface, 379kB)
11191051_abstract_id.pdf (abstract id, 562kB)
11191051_bibliography.pdf (bibliography, 933kB)

Restricted to Repository staff only until January 2028 (copy available on request):
11191051_abstract_en.pdf (abstract en, 492kB)
11191051_table_of_content.pdf (table of content, 584kB)
11191051_illustrations.pdf (illustrations, 365kB)
11191051_tables.pdf (tables, 150kB)
11191051_notations.pdf (notations, 457kB)
11191051_chapter_1.pdf (chapter 1, 1MB)
11191051_chapter_2.pdf (chapter 2, 6MB)
11191051_chapter_3.pdf (chapter 3, 2MB)
11191051_chapter_4.pdf (chapter 4, 4MB)
11191051_conclusions.pdf (conclusions, 479kB)
11191051_paper.pdf (paper, 859kB)
11191051_presentation.pdf (presentation, 1MB)
11191051_Form. TA-020.pdf (Form. TA-020, 877kB)

Abstract

A transfer learning approach is used to produce more natural synthesized speech by reusing a speaker model trained on a speaker verification dataset. An Indonesian-language corpus serves as the training data. The methodology covers collecting the Indonesian corpus, pre-processing the data, training a Generalized End-To-End Loss for Speaker Verification (GE2E) encoder model, and training a Tacotron2 synthesizer model integrated with the encoder model. An auto-regressive vocoder model based on WaveRNN is then built and integrated with the synthesizer model to generate speech. The results show that the GE2E encoder model produces a good cluster plot at step 174500. The encoder model is able to group similar voices and to distinguish low-pitched voices from high-pitched ones. The LSA synthesizer model reaches a training loss of 0.30740 at step 50000, and the forward attention synthesizer model reaches a training loss of 0.22048 at step 50000. The vocoder model trained with the LSA synthesizer reaches a loss of 4.0005 at epoch 350, and the vocoder model trained with the forward attention synthesizer reaches a loss of 3.9152 at epoch 350. In the objective evaluation, the LSA vocoder model scores 0.5333 and the forward attention vocoder model scores 0.6516. In the subjective evaluation, the LSA vocoder model scores 1.4285 and the forward attention vocoder model scores 1.5714. The study identifies several limitations of speech synthesis in Indonesian: the amount of high-quality data affects the quality of the synthesizer model, and the quality of the synthesizer model in turn affects the output of the vocoder model.
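
As an illustration of the GE2E encoder objective named in the abstract, the following is a minimal pure-Python sketch of the softmax variant of the GE2E loss. It is not the thesis's implementation. The function names (`cosine`, `mean_vector`, `ge2e_softmax_loss`), the toy two-dimensional embeddings, and the fixed scale `w=10.0` and bias `b=-5.0` are assumptions made for illustration; in practice `w` and `b` are learned parameters and the embeddings come from a neural network.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def mean_vector(vectors):
    """Element-wise mean of a list of vectors (a speaker centroid)."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def ge2e_softmax_loss(embeddings, w=10.0, b=-5.0):
    """Softmax GE2E loss summed over all utterances.

    embeddings[j][i] is the embedding of utterance i from speaker j.
    Each speaker needs at least two utterances, because an utterance
    is excluded from its own speaker's centroid.
    w and b are fixed here (assumption); normally they are learned.
    """
    centroids = [mean_vector(utts) for utts in embeddings]
    total = 0.0
    for j, utts in enumerate(embeddings):
        for i, e in enumerate(utts):
            sims = []
            for k, c in enumerate(centroids):
                if k == j:
                    # Leave the utterance itself out of its own centroid.
                    c = mean_vector(utts[:i] + utts[i + 1:])
                sims.append(w * cosine(e, c) + b)
            # Numerically stable log-sum-exp over all speaker centroids.
            m = max(sims)
            lse = m + math.log(sum(math.exp(s - m) for s in sims))
            # Loss term: -S(own speaker) + log sum_k exp(S(speaker k)).
            total += lse - sims[j]
    return total

# Toy data: two speakers, two utterances each (hypothetical values).
separated = [[[1.0, 0.0], [0.9, 0.1]], [[0.0, 1.0], [0.1, 0.9]]]
mixed = [[[1.0, 0.0], [0.0, 1.0]], [[0.9, 0.1], [0.1, 0.9]]]
loss_separated = ge2e_softmax_loss(separated)
loss_mixed = ge2e_softmax_loss(mixed)
```

In this sketch the loss is small when each speaker's utterances cluster tightly around their own centroid and far from the others, which is the property a cluster plot of the encoder embeddings is meant to show.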

Item Type: Thesis (Bachelor)
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: Jurusan Matematika dan Teknologi Informasi > Informatika
Depositing User: Muhammad Gozy Al Vaiz
Date Deposited: 09 Jan 2026 08:24
Last Modified: 09 Jan 2026 08:24
URI: http://repository.itk.ac.id/id/eprint/25110
