ICLR 2023 - My First ICLR Experience


Welcome to Beautiful Kigali

To be very honest, it was a brilliant stroke of fortune that gave me the chance to attend this year’s ICLR. For years, even prior to the pandemic, I had heard all kinds of visa horror stories from would-be participants of some of the world’s top conferences.

Enter COVID-19: the playing field for events was leveled, and we were all reduced to faces and names on Zoom screens, even for the world’s most exclusive events.

I don’t know if that’s what made it click, or if it was the fresh community push towards equity that ultimately caused the change, but whatever it was, I am grateful. For the first time ever, ICLR happened on African soil, right here in Kigali, where I just happened to have moved to complete grad school and settle into work. 😅

Talk about auspicious!

I know I’m not the only one who appreciated the opportunity. As you’ll see in the image below, African registration was up to 261 from 16 at the last in-person ICLR event. That’s a whopping 1,531% increase, or a little over 16 times as many people!


Photo credit: David Adelani
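
For the curious, here is a quick sanity check on that jump, as a minimal Python sketch. The two registration figures are the ones quoted above; the function name `percent_increase` and the variable names are just illustrative choices. One thing worth keeping in mind: a percentage increase is measured against the starting value, so 261 is about 1,631% of 16, but it is a 1,531% increase over it.

```python
def percent_increase(old: float, new: float) -> float:
    """Return the percentage increase going from old to new."""
    if old == 0:
        raise ValueError("percentage increase is undefined when the old value is 0")
    return (new - old) / old * 100


previous_in_person = 16  # African registrations at the last in-person ICLR
kigali = 261             # African registrations at ICLR 2023

growth = percent_increase(previous_in_person, kigali)
ratio = kigali / previous_in_person

# prints "1,531% increase, or about 16.3x as many registrations"
print(f"{growth:,.0f}% increase, or about {ratio:.1f}x as many registrations")
```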

Now that that shocker is out of the way…

My experience:

Name Tag

The first thing you will notice is that my tag says Thursday and Friday. I completely skipped Monday to Wednesday for three main reasons:

The prices, my friend, the prices. (Many of the people who attend these conferences are sponsored by their institutions. I was there as an individual.)
I got a pass for Thursday and Friday through my IndabaX Rwanda participation.
I still had to go to work 😂

There were many summary videos, for all 5 days, but stick with me as I walk you through days 4 and 5.

Day 4 (IndabaX Rwanda)

This was not my first IndabaX. In fact, I was a speaker at the first IndabaX Ghana in 2019. However, the colocation of IndabaX Rwanda with ICLR made it bigger and better. Plus, there just seemed to be a lot more excitement about our return from the ‘exile’ of COVID-19.

There were a number of talks, but I’ll comment on the ones I remember:
I sat through Sara Hooker’s talk on the state of LLM research, and a very insightful talk by Kathleen Siminyu on the pitfalls to avoid in data collection. (My key takeaway from that one: read a site’s terms and conditions before scraping its data… there can be horror stories otherwise.) Samuel Rutunda (CTO at Digital Umuganda) and Isaac Manzi (Mbaza NLP) also gave talks on the AI/NLP efforts and communities growing in Rwanda. They were inspiring, to say the least. People really do be trying 😊.

The highlight of the day for me was the poster session. A few months ago, I published my first first-author paper in Nature Scientific Reports. It was a big deal for me. I compressed the facts into a one-pager and that became my first-ever poster presentation. And guess who won first place? Yup! Yours truly! * takes a bow *

poster

I also met one Jonathan from Meta who, it turns out, worked on the Canadian subset of the dataset we analyzed in our paper. I’d never seen anyone that interested in something I’d written before.

Day 5 (Africa NLP)

A lot more than just the Africa NLP workshops took place on Day 5, but to be honest, that was my singular focus. African languages are generally considered low-resource languages, meaning there is very little machine-readable and sufficiently labeled data available to train models. This doesn’t necessarily mean there is no data; it means that most of the available data is not in a format that is useful to an ML model. For example, there might be a lot of audio in my mother tongue, Twi, from radio broadcasts. However, without transcriptions, or better still, time-segmented transcriptions, that audio might not be useful in an NLP context.
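
To make “time-segmented transcription” a bit more concrete, here is a rough Python sketch of what one could look like. The `Segment` schema and the `labeled_seconds` helper are assumptions made up for illustration, not a format used by any of the groups mentioned in this post, and the text fields are placeholders rather than real Twi.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One time-aligned slice of a recording (hypothetical schema)."""
    start_s: float  # where the slice begins, in seconds
    end_s: float    # where the slice ends, in seconds
    text: str       # what was said in that slice; empty if not transcribed


def labeled_seconds(segments: list[Segment]) -> float:
    """Total duration of audio that has a non-empty transcription."""
    return sum(s.end_s - s.start_s for s in segments if s.text.strip())


# Placeholder text only; a real corpus would hold actual transcriptions.
broadcast = [
    Segment(0.0, 4.5, "<first transcribed utterance>"),
    Segment(4.5, 9.0, ""),  # the audio exists, but nobody has transcribed it
    Segment(9.0, 12.0, "<second transcribed utterance>"),
]

print(labeled_seconds(broadcast))  # 7.5
```

In this toy example, only 7.5 of the 12 seconds of audio would be usable as labeled training data, which is the gap between “there is data” and “there is data a model can learn from”.
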
There are a lot of efforts by all kinds of groups, like Masakhane, Ghana NLP, Lesan, and a host of others, not just when it comes to collecting data, but also training models. However, there is still a huge gap between what is currently possible in languages like English, French, and Chinese, and the average African language. Still, kudos to the groups I mentioned, they are really pushing the frontiers!
This is a good segue to what, in my opinion, was the most interesting talk I heard during my time at the conference:

NLP systems for low resource languages: hype vs reality


Photo credit: Paul Azunre on Twitter

This was a panel discussion involving Paul Azunre (Ghana NLP), Jade Abbott (Lelapa AI), and Asmelash Teka Hadgu (Lesan), moderated by Timnit Gebru (DAIR). This panel was feisty, to say the least. There’s a lot that can be said, but I think my two major takeaways would be:

Africans have to get up and tell their own story. Nobody is coming from outside to “save” our languages and make them relevant for today’s technology. Even if there were people who wanted to do that from outside, we still have to be responsible and make sure we are deeply involved. There will never be anybody better to ensure the quality of our representation than us.
It appears Big Tech might be trying to bully smaller players and use the work of grassroots communities (or highly underpaid annotators) to generate huge profits without reasonable compensation. “What is reasonable?” Well, if you ever get to the point where you decide, and forgive my French, “I don’t get paid enough for this s***”, then you have crossed the line from reasonable to unreasonable. It would appear that a number of people have been crossing that line in recent times.

The feelings of disappointment were quite strong, but even stronger was the hope that we would all rise to the challenge!

Does this seem like a good note to end on?

I think so. Until the next one. Cheers! 🥂

Written on June 1, 2023