Announcement

Hello World! — Our Open Source Project Launch

By Koel Labs · December 23, 2024

An Introduction

At Koel Labs, our goal is to make pronunciation learning more accessible and inclusive. To better represent the diversity of languages and dialects, we're excited to announce that everything, from model weights and training code to datasets, research papers, and the frontend UI, is officially open source!

The problem of pronunciation learning

Currently, 48% of foreign speakers are anxious about their accents [1]. Pronunciation is one of the most complex parts of learning a language. It's difficult to hear the difference between what you're saying and what you should be saying; sometimes it's impossible without a teacher, and a teacher is not affordable for many. Even once you hear the difference, it is still very hard to learn to make sounds you've never made in your native language.

Technology can bridge this gap, but existing language learning tools do not value the diversity of languages and dialects. A good solution should be able to understand and teach any accent, not just define a "standard." Moreover, the feedback should be nuanced, actionable, and personalized to your native language background, not just an impersonal ASR system saying "yes" if it recognizes each word you say.

We want to collect datasets that represent the diversity of languages and dialects. We also want every step of the process, from training and evaluating models to interpreting the results and surfacing explainable feedback to users, to reflect the diversity of backgrounds that language learners have.

For us, as immigrants and children of immigrants, pronunciation learning has a special meaning because we see not only the importance of fitting into our new communities but also of fitting into our extended families.

For others, pronunciation learning has other meanings, and we want to make sure that our tools can help everyone. This is why we're making the project open source — it allows for discussion and ideas from a worldwide audience.

Current progress and plans

We were fortunate to join the 2024 Mozilla Builders program. This has provided us with the resources to train state-of-the-art audio models for the first version of our tool, which targets foreign English speakers. We are in the process of publishing an academic paper on our approach. We plan to keep iterating on the pipeline to support more languages, dialects, and use cases, such as speech pathology for speech-impaired children.

Our web application is not yet ready, but we are gearing up for a closed beta launch soon. In the meantime, check out our models on Hugging Face and training code on GitHub.

How do I get involved?

If your institution is interested in collaborating, please reach out to us at [email protected]. We have already partnered with several leading HCI, Phonology, and Linguistics researchers from institutions like CMU, BCU, and UW.

If you are a developer, designer, or just interested in language learning, please join the discussion on our GitHub after reading our contribution guidelines. Any feedback is welcome. User feedback is especially important to us, so if you are open to joining the beta testing program, please sign up here.

[1] Babbel Anxiety Study. Retrieved from https://www.babbel.com/en/magazine/accent-anxiety-study