August 8th ABAIR Meeting

· 2 min read
John Sloan
Research Fellow
Neasa Ní Chiaráin
Assistant Professor of Speech and Language Technologies, PI ABAIR Project
Amanda Bernhard
Research Fellow
Andy Murphy
Research Fellow
Aifric Nic Niallais
Undergraduate Student
Ciadhla Mulloy
Research Assistant
Finn Farrell
Undergraduate Student
Joey McInerney
Masters Student

Highlights

  • Meeting with TG4 pushed back from September to October.
  • COGG have asked Amanda to come present to them.
  • Julie heard from someone that the Gaeltacht sites of UG want to use Geabaire and other ABAIR resources in their courses.

Amanda

  • Just 2 hours of child speech gathered so far. Much more difficult to get volume than for adults.
  • Want to give kids some token of appreciation.
  • Recording on iPad good.
  • Aifric up with Amanda.
  • Need to polish up Míle Glór.
  • Diversity needed in characters in MaO.
  • COGG contacted Amanda to present to them.
  • Julie heard from someone that the Gaeltacht sites of UG want to use Geabaire and other ABAIR resources in their courses.

Aifric

  • Working with Amanda on Míle Glór na nÓg.
  • Asked whether Míle Glór is still going.

Ciadhla

  • Working with Transtool.
  • Scraped some other data from RTÉ Archives.

Joey

  • Collected the Oireachtas Corpus, with 1.5B EN tokens and 27M GA tokens. This helps with current GA LLM problems: short context text, cultural alignment. Working with Tung, a PhD researcher at UCC.
  • Plans for data set:
    • Instruction data set
    • Human feedback dataset
    • Bilingual QA benchmark. Easy to make because written parliamentary questions are annotated by topic, e.g. 'Finance'.
    • Sociolinguistic analysis (long term, future project): topics, code-switching EN/GA
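
As a rough illustration of why topic annotations make the bilingual QA benchmark easy to assemble, here is a minimal sketch in Python. It is not Joey's actual pipeline: the `WrittenQuestion` record, its field names, the `build_benchmark` helper, and the sample rows are all hypothetical placeholders, and the real corpus format may differ.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class WrittenQuestion:
    """Hypothetical record for one written parliamentary question."""
    question: str
    answer: str
    topic: str     # topic annotation, e.g. "Finance"
    language: str  # "en" or "ga"


def build_benchmark(records):
    """Group question-answer pairs by topic, then by language."""
    bench = defaultdict(lambda: defaultdict(list))
    for record in records:
        # A question with no answer text cannot be scored, so skip it.
        if not record.answer.strip():
            continue
        bench[record.topic][record.language].append(
            {"question": record.question, "answer": record.answer}
        )
    # Convert the nested defaultdicts to plain dicts for easier serialisation.
    return {topic: dict(by_lang) for topic, by_lang in bench.items()}


# Placeholder rows, invented purely for illustration.
sample = [
    WrittenQuestion("Sample question", "Sample answer", "Finance", "en"),
    WrittenQuestion("Ceist shamplach", "Freagra samplach", "Finance", "ga"),
    WrittenQuestion("Unanswered question", "  ", "Health", "en"),
]

benchmark = build_benchmark(sample)
```

Because the topic label is already attached to each question, the benchmark can be sliced per topic and per language without any extra annotation work.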

Andy

  • Asked how much data TG4 will be providing us so we can plan storage.

John

  • Fixed errors on S2s.
  • Will fix fotheidil next week.
  • Asked about progress on the new website.