Asmelash Teka Hadgu is the co-founder and chief technology officer of Lesan, a Berlin-based startup developing machine translation products for Ethiopian languages. Founded in 2019, Lesan has launched translation tools for Amharic and Tigrinya, which it says outperform Google Translate in terms of quality. The startup’s use of offline print resources to create a new benchmark data set for languages from the Horn of Africa has been key to its success.

This interview has been edited for length and clarity.

As an automated translation service, you’re competing with the biggest and most advanced tech companies in the world. What can smaller shops like Lesan bring to the table?

Google and Facebook overhype the idea that they have built one single giant model to solve machine translation for hundreds of languages. If you put it to the test and compare it with smaller startups, like Lesan or Ghana NLP, the quality is actually low. So at Lesan, we don’t believe that you can create just one model that solves these problems. If your native language is one of these African languages, if you have the technical know-how around machine learning, and you care about your community and the kind of problems they want to solve, that unique advantage and connection to the community can carry you further.

How is your work relevant to the boom in conversational artificial intelligence tools, such as ChatGPT, in terms of building original training data sets for low-resource languages?

If you ask ChatGPT in Tigrinya or Amharic the simplest and most frequently asked questions, it gives you gibberish, a mix of Tigrinya and Amharic, or even made-up words. Chatbots like ChatGPT are utterly broken or useless for these languages. Most of the data that powers them is basically internet data, and there is not enough data online for these languages. That makes it even more urgent to create language-specific technologies.

Do you have kinship with other developers and companies that are working on making products in low-resource languages?

I absolutely belong to a tribe, or whatever we want to call it — a kind of pan-African AI movement. We’re actually meeting every other week with other language technology startups, discussing how we can come together and solve this problem our own way. We don’t have to repeat what Silicon Valley does for our languages. And you don’t have to have a one-player-takes-it-all mentality.