2025-08 Recordings Report

Prepared by: Muireann Nic Corcráin
An Rinn Gaeltacht Recording Initiative
Project Period: June 2025 – August 2025
Date:
Institution/Organization: Abair.ie, Speech and Phonetics Laboratory, Trinity College Dublin


Summary

The primary aim of this project was to collect speech data in An Rinn, an area of Gaeltacht na nDéise in Co. Waterford, Ireland, to support the development of a female voice in this minority dialect. The goal was to record as much data as possible, using both new and existing corpus materials, so that there would be a substantial quantity with which to begin training the voice.

Materials were prepared in June and recording took place throughout July, yielding a total of 11 hours of usable data. Based on the 16.5 total recording hours logged, this works out at roughly 1.5 hours of recording for each hour of usable data.

The next steps are to begin training the voice and to launch this new dialect on Abair.ie at the Oireachtas 2025 in Belfast.


1. Project Objectives

Primary Objectives

  • Gather Irish-language materials from the area for use in a dedicated An Rinn corpus.
  • Conduct extended recording sessions (approximately 2 hours per session) with an An Rinn Gaeltacht speaker to obtain 10 hours of usable data for training a synthetic voice of that speaker.
  • Begin training and testing on this data to have a prototype ready by November 2025 to showcase at the Oireachtas 2025 in Belfast, Northern Ireland.

Secondary Objectives

  • Collect materials to support the development of the children’s corpora that will be used to record children’s voices for the synthesiser, and identify good local texts that could support the recording of a male voice for the dialect.

2. Materials Used

2.1 Source Materials

  • Prompts ‘An Rinn’ – 2023
  • Scéalta Mhicil
  • I Dtús M’Óige
  • Cúpla
  • Katfish
  • Scéalta agus Seanchas
  • An Linn Bhuí
  • Nóinín & Róllaí Pollaí
  • Nóinín & Siar Aniar
  • An Nollaig Fadó
  • Fóir Orm
  • Éalú

Additional Sources (Not Used)

  • An Múinteoir Nua – Áine Ní Ghlinn
  • An Nathair agus na Spéaclaí
  • Cogair mé seo leat – Rang a Trí
  • Mistéir Lios an Chatha
  • Seo, Siúd agus Uile – Rang a Trí
  • Síolta Sí – Rang a Trí
  • An Teist – Paula Ní Chionnaith
  • Ar Thóir an Toirc Bháin – Éamonn Ó Ruanaí
  • Carraig na Caillí
  • Cogair mé seo leat aríst – Rang a Ceathair
  • Mairéad Ní Mhaonaigh – Uinsionn Ó Domhnaill
  • Seo, Siúd agus Eile – Rang a Ceathair
  • Togha agus Rogha – Rang a Ceathair
  • An Marcach Óg – Pádraig Ó Baoighill
  • Cine Shiabhra – Claire Lyons
  • Féasta Filíochta – Rang a Cúig
  • Inis seo dom – Rang a Cúig
  • Sin Scéal Eile – Rang a Cúig
  • Amstardam – Éamon Ó Ruanaí
  • Dalladh Dánta – Rang a Sé
  • I gCéin is i gCongar – Rang a Sé
  • Siar is Aniar – Rang a Sé
  • Úbalonga – Áine Ní Ghlinn

2.2 Recording Equipment and Setup

Audio Equipment: Focusrite Scarlett Solo (4th Generation), B&K 4191 microphone, Nexus conditioning amplifier, MacBook Pro (M2 or M3), microphone stand and microphone isolation shield.

Software Systems: SpeechRecorder & Audacity

Recording Environment: Living room of Fianait’s house. Because the room has high ceilings, soundboards were placed around the sideboard that served as a table, with a duvet cover on top of the soundboards to help block out external noise. All windows and doors were closed, and the soundboards were moved close together to make the space as compact as possible.

2.3 Speaker Information

  • Speaker Background: Grew up in the area, with all education through the medium of Irish until university; now working in the Gaeloideachas (Irish-medium education) sector as AP 1 in Scoil Garbhán.
  • Dialect Information: An Rinn, Gaeltacht na nDéise, Co. Waterford (Co. Port Láirge). One of the smaller Gaeltacht areas, close to Dungarvan. A fishing area and home to Coláiste na Rinne and the Coláiste na Rinne archives, where documentation of the area is kept.
  • Experience Level: Had done some recording in 2023, so was familiar with the recording system and setup. New to Audacity and needed some tutoring in this software.

3. Project Timeline and Phases

Phase 1: Material Preparation and Selection

Duration: June 2025
Activities:

  • Material research and collection – reusing the An Rinn 2023 prompts and searching for local texts, or texts written by local authors, that had not been used in previous recording sessions.
  • Content curation and selection – all new prompts were compiled in a Word document, which was then converted to plain-text format for Andy to run through the program linked to SpeechRecorder.
  • Preparation of recording scripts – completed by Andy and shared as a zip file in Google Drive, which was then downloaded and imported into SpeechRecorder, along with access to other SpeechRecorder projects completed or in progress in the lab.
  • Initial speaker consultation – arranging the best times and dates for the sessions and working out a schedule that did not conflict with planned holidays and appointments, with Muireann travelling to and from An Rinn daily.

Key Milestones:
Identifying some key stories in the local Déise publication ‘An Linn Bhuí’ not only provided good content for recording, but also offered insight into the type of language used in the days of Dr. Piaras de Hindeberg, who travelled around the area recording people’s daily lives, stories, and the songs they grew up with. These stories are an excellent way to understand the linguistic structure of the dialect from that period, and they also offer insight into the culture and society of the time.

Phase 2: Recording Sessions

Duration: July 2025
Activities:

  • Equipment setup and testing – about 1 hour, followed by a 20-minute test session once the equipment was arranged, to confirm that the data was being uploaded to the linked Google Drive folder where Andy could view and analyse it.
  • Recording sessions with speaker
  • Quality control and review
  • File organization and backup

Recording Statistics:

  • Total recording hours: 16.5 hours
  • Number of sessions: 11
  • Total prompt count on SpeechRecorder: 3,122
  • Total time using Audacity: 59 minutes
  • Total usable data hours: 11 hours
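
As a quick check on these figures, the short sketch below derives the ratio of recording time to usable data and the average session length. It uses only the totals listed above; the variable names are illustrative and not part of any lab tooling.

```python
# Figures copied from the Recording Statistics list above.
total_recording_hours = 16.5
usable_hours = 11.0
sessions = 11

# Hours of recording needed for each hour of usable data.
recording_per_usable_hour = total_recording_hours / usable_hours

# Share of the recorded time that ended up as usable data.
usable_fraction = usable_hours / total_recording_hours

# Average length of a session, in hours.
average_session_hours = total_recording_hours / sessions

print(f"Recording per usable hour: {recording_per_usable_hour:.2f} h")
print(f"Usable share of recorded time: {usable_fraction:.0%}")
print(f"Average session length: {average_session_hours:.2f} h")
```

On these totals, about two thirds of the recorded time was usable, which may be a helpful planning figure for similar projects.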

Phase 3: Post-Production and Processing

Duration: August – November 2025
Activities:

  • Analysis of recorded data – confirmation that approximately 11 hours of usable recordings were completed, which is enough to begin training a female synthetic voice.
  • Audio editing and cleanup
  • File formatting and standardization
  • Transcription verification – utilizing Tranztool to assist in the editing process to get more accurate TTS

4. Challenges and Issues Encountered

4.1 Technical Challenges

Audio Quality Issues:

  • July 22: Headphone playback failed when accessing the 2023 An Rinn prompts, even though sound quality was good for the 2025 prompts. Because the 2023 prompts had originally been run on a different computer with a different external sound card, Andy adjusted the amplifier and interface settings via remote desktop so that recording could proceed. Once recording resumed, SpeechRecorder captured the prompts and the audio, and Andy could hear them on his end; they just could not be heard locally.
  • July 31: The same issue recurred with the 2023 prompts (the recordings could not be heard), and also arose when accessing the Arctic corpus for the first time. Once the settings were checked and adjusted, sound was restored for the Arctic corpus recordings.

Equipment-Related Problems:

  • July 22: With the 2023 prompts, SpeechRecorder did not recognise the amplifier and Focusrite, so its settings had to be changed to include them and save them as the sound processing tools for the microphone. The recording levels were then weak, so the amplifier was adjusted again. Notably, Audacity was unaffected and needed no settings changes. A tripped switch also cut power to the amplifier and Focusrite, but this was fixed quickly and did not hold up recording much.
  • July 31: A power cut in the area had knocked out the amplifier and Focusrite again; the devices were reset and recording proceeded. The headphone issue persisted for the 2023 prompts but not for the Arctic corpus. A review of the material Fianait recorded in Audacity showed that the devices had not been affected and that everything worked well while she did this work.

4.2 Linguistic Challenges

Dialect Variations:

  • Tá brón orm (‘I am sorry’) – cathú orm in the local dialect
  • Tá (‘is’) – pronounced ‘thá’
  • Cén fáth (‘why’) – dén chúis
  • Rinne (‘did/made’) – dhein
  • Dh’fhiafraigh (‘asked’) – pronounced ‘yee-frig’
  • Lúdramán (‘fool’) – gamal
  • Daichead (‘forty’) – used more in this area
  • Logainmneacha (place names) – great to have these recorded in the local dialect

Text Complexity:

  • Some Ulster texts were included in the 2025 prompts (these should have been reviewed more carefully before being sent on to Andy for inclusion in the corpus).
  • The majority of the prompts are local texts, or texts written by local authors, that the speaker was familiar with and found easily accessible (children’s stories or local dialectal pieces already in her vernacular). Some of the texts recorded in Audacity are pieces the speaker uses in her professional life in Gaeloideachas; they are simple and appropriate, offering a variety of material on nature, history and biography.

4.3 Logistical Challenges

Scheduling:

  • Making sure we had enough time to get as much recording done as possible, allowing for sessions of 2 to 2.5 hours at a time. The intention was also to do this without impinging on the speaker’s time and summer plans, so certain dates were blocked off for vacation.
  • As many potential days as possible were blocked off; most of them were utilised, but some were not feasible due to changes in the speaker's and the Abair recorder's schedules.

Location/Environment:

  • High ceilings in the living room – soundboards helped to dampen noise, along with a duvet on top of the soundboards and all the windows and doors being closed.
  • Rural area – a lot of farm machinery on the road outside, which sometimes caused recording sessions to pause.
  • Small light fixings outside needed to be removed from the balcony.

6. Analysis and Evaluation

6.1 Project Success Metrics

Objectives Met:

  • The goal was to record approximately 10 hours of data; reaching 11 hours was a great achievement and a helpful start to the development of a Déise female voice.
  • Good-quality texts from the area were used. More sources could be found for future child-voice and male-voice development, but there is a good basis to build on. One note: the 2025 prompts need to be cleaned up to exclude the Ulster dialect texts, and the affected texts will be allocated to a separate corpus.
  • The project was completed in a timeframe that worked for the speaker, putting their needs at the forefront of the process, and it gave insight into their experience of reading the texts and understanding the local phrases in them.

Quality Assessment:

  • Audio quality was high for the duration of the project. Despite a few days of technical difficulty, the sound quality of the data was still high when it was sent to the Google Drive folder and accessed on another device.
  • The linguistic accuracy and authenticity of the speaker was very high, as the speaker grew up in the area and has been immersed in the culture and language their entire life. The opportunity to listen to the pronunciation of words and hear the dialectal differences compared to the ‘standard’ showcases the richness of the local dialect.
  • The recordings gathered for developing a synthetic female voice are of a high enough standard to begin training on the dataset, and to link in the earlier recordings from 2023, so that a high-quality TTS voice can be developed.

7. Future Recommendations

Technical Improvements:

  • Double-check the interface and amplifier settings before each session to make sure there is no issue with input or output.
  • Ensure that output can be heard through headphones for every corpus used; the SpeechRecorder settings should allow for this.

Material Expansion:

  • Find more local texts to include in the corpus for future children’s recordings and for the male voice.
  • Clean up the corpus and make it accessible to other lab members, so that the material is easy to access and can be used on different devices.

Project Scaling:

  • Templates of the project plan and report are to be shared so that they can be reused in other Gaeltacht areas, and for male and children’s voices in the Déise Gaeltacht, to help streamline the workflow.
  • It is important to identify and gather the resources for each project ahead of time. If equipment needs to move location, the transfer timeline should be worked out in advance in case the current project runs over schedule.

Technology Development:

  • Edit the SpeechRecorder prompts via Tranztool to ensure that dialectal differences are noted and can be integrated into the training dataset.

8. Acknowledgments

Huge thanks to Fianait Nic Mhurchú, who spent so much time giving me cups of tea and who did trojan work to achieve this feat.
Thanks to Ailbhe Ní Chasaide, Neasa Ní Chiaráin, Andy Murphy, Amanda Bernhard and everyone who supported me in carrying out this task.


9. Appendices

Appendix A: Timetable & Recording Log

July

Recording Data:

Date | Duration | Prompts | Stories
02/07 | 20 minutes | 1–47 | Mac an Tiarna
07/07 | 2 hours | 48–384 | Mac an Tiarna & Micil
08/07 | 2 hours 15 minutes | 385–416, 575–937 | Micil, Tadhg Cúndún, Nioclás Corráin
09/07 | 2 hours | 938–1195, 1258–1481 | Éalú, Cnuasach Máire Ní Chaoimh, Katfish agus Scéalta Eile - Ógie ó Céilleachair
10/07 | 2 hours 30 minutes | 1482–1917, 1940–2024 | Éalú, Fóir Orm
11/07 | 2 hours | 2025–2054, 2082–2218, 2262–2435 | Fóir Orm, Nic BGG Onóracha
22/07 | 1½ hours | 2436–2533, 1–252 | Nic BGG Onóracha, Prompts 2023 - Cúpla
23/07 | ½ hour | Audacity, Prompts Muireann |
24/07 | ½ hour | Audacity | Séimí agus An Chléirseach, An Cranna Úll, Féirín an Eala, Na Fir Grinn, Drochlá Ruairí
25/07 | ½ hour | Audacity | Turas na Scríbhneoireachta including various texts (Labradóir, Beacha Meala, ... )
31/07 | 2½ hours | 252–610, 1–300 | Cúpla, Arctic Corpus
06/08 | Collecting equipment | |
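
The session durations in the log above can be totalled with a few lines of Python. This is only a sketch: the durations were transcribed by hand from the log into minutes, and the variable names are illustrative.

```python
# Session lengths in minutes, transcribed by hand from the recording log.
# The 06/08 entry (collecting equipment) has no recording time and is omitted.
session_minutes = [
    ("02/07", 20),
    ("07/07", 120),
    ("08/07", 135),
    ("09/07", 120),
    ("10/07", 150),
    ("11/07", 120),
    ("22/07", 90),
    ("23/07", 30),
    ("24/07", 30),
    ("25/07", 30),
    ("31/07", 150),
]

# Add up the minutes across all sessions and convert to hours.
total_minutes = sum(minutes for _, minutes in session_minutes)
total_hours = total_minutes / 60

print(f"Sessions: {len(session_minutes)}")
print(f"Total recording time: {total_minutes} minutes ({total_hours:.2f} hours)")
```

The log gives 11 sessions and 995 minutes, or about 16.6 hours, which is in line with the reported total of 16.5 recording hours.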

Appendix B: Images from Project Issues that Arose

  • Appendix B.1: Image from July 22 session, issue with processing sound and SpeechRecorder not working.
  • Appendix B.2: Resetting SpeechRecorder settings to align with 2023 prompts corpus.
  • Appendix B.3: Resetting SpeechRecorder settings to align with 2023 prompts corpus.
  • Appendix B.4: 2023 Prompts - Amplifier levels not on a high enough level, indicating issues with amplifier and interface not matching settings with microphone input.
  • Appendix B.5: 2023 Prompts - Issue with input to SpeechRecorder, indicated need to check amplifier and interface settings.
  • Appendix B.6: Audio host issue with Audacity, indicating need to check amplifier and interface level following power outage.