A silhouette of a man walking behind a row of tall windows.
Photo by sebastiaan stam on Unsplash

Issue #1

November 2020



Welcome


If you've arrived here, chances are you have a feeling that digital technology is no longer something neutral in our lives.

Technology is in our pockets, on our wrists, at work, in supermarkets, in schools and in our homes. It has its own agenda, which may or may not be in line with ours. With such pervasiveness, it becomes imperative to be critical and to ponder both the positive and negative outcomes of its use for our collective future.

Critical Future Tech's aim is to provide thinking points and conversation starters on the impact of technology on our lives, as well as to serve as a meeting point for the Portuguese scene to discuss these themes.

Every month, CFT delivers to your inbox a curation of tech industry highlights on privacy, ethics, fairness, responsibility and impact, as well as, when possible, long-form discussions with guests.

Thank you for joining this project.

- Lawrence

What Happened


2020 has been a beatdown of a year for everyone. Big Tech companies, despite having banked on the global pandemic, are facing increasing backlash from consumers and especially from governments. Here's what went down in October.


Big News


The U.S. Congress released a 450-page antitrust report with findings that Amazon, Apple, Facebook and Google each hold and have abused monopoly power. Who would have thought?

In what now seems like a recurring episode, Facebook, Google and Twitter executives were grilled by the U.S. Senate over Section 230.

The European Union is also turning up the pressure on Big Tech by threatening their core business model and drafting rules to limit Apple and Google from pre-installing their own apps.

France is tightening an order for Google to negotiate online revenue with media groups, while Germany's Cartel Office has launched a new investigation into anticompetitive practices by Amazon and Apple.

Privacy


Still in Europe, privacy matters are all over the place: a recent ruling deems bulk data retention illegal, all the while Germany's federal government is passing a law to allow all 19 secret services to use state Trojans to hack anyone at any time, for any reason.

In the U.S., court documents revealed Google provided individuals' search queries to the police, while the IRS is under investigation for using smartphone location data to track Americans without a warrant.

Fairness & Accountability


Demonstrators marched to Jeff Bezos' Beverly Hills mansion protesting working conditions during the COVID-19 crisis and calling for higher wages, while a leaked memo revealed Amazon's plan to track union activity.

Microsoft employees published an open letter on GitHub asking Microsoft to drop its contracts with ICE.

Uber was accused in a lawsuit of violating the U.S. Civil Rights Act by firing minority drivers based on customers' ratings, while in Europe the company is being sued for having automatically fired drivers in Portugal and the U.K.

And Finally


Netflix's The Social Dilemma keeps on igniting debates worldwide, with some in favor of and others against the documentary's stance. Dive into Tristan Harris's take on the feedback received so far on the Big Technology podcast and, more recently, on the Joe Rogan podcast. You can also listen to in-depth conversations around the documentary's topics on the Your Undivided Attention podcast.

Worth Checking


  • "The social responsibility of software engineers — What duty do engineers have with the responsibilities they hold?" — LeadDev

  • "Privacidade, Segurança e Ética em IA (Convidado: Eliano Marques)" — Building The Future - AI Portugal Podcast

  • "Interview to Pedro Saleiro on detecting inequalities with software (Portuguese)" — Público

To get these delivered to your inbox, subscribe to CFT's monthly newsletter at the end of the page and join the conversation on Telegram.

And now, onto the interview.

Conversation with Christine Maroti

AI Research Engineer

This conversation, which took place on October 16th, 2020, has been edited for length and clarity.

Lawrence — Start by telling us a little bit about yourself, what you currently do and your background.

Christine — Currently I’m an AI research engineer at Unbabel, working with machine translation systems mostly. My role is basically trying to develop the best translation systems for Unbabel's customers. Prior to that, I studied Linguistics at university and then I worked in finance for 3 years not doing anything computer science or data related.

Then I decided to make a transition, so I did a machine learning and data science bootcamp in New York for three months. Fortunately enough I was able to get this job at Unbabel after that. That's basically my background, a bit unusual I guess.

L And do you enjoy being here in Portugal, having made the move from the U.S.?

C Yeah it was exciting, I wanted to try living abroad so once I got this offer it was kind of a no-brainer. And it's been interesting but it has had its ups and downs, being away from home of course.

L One of the reasons I wanted to talk to you is because you published this article about biases within datasets and algorithms. Why did you decide to write on this theme? Is it something you think about much?

C Now it's something that I think about more, but I’m not really sure what my motivation was. I had been talking to the team about different ideas that we could publish around AI, and I guess that maybe at the time a new model or news came out about bias in NLP systems and AI in general, so I guess that was probably the motivation.

I did a lot of research to write that article because I wasn't super familiar with the topic at the time. I think it was sort of a follow-up to the previous article I had written, which was about how machines understand words and how they can make sense of text data. It was this whole explanation of the internal representations of these NLP systems and how they encode things like gender. This was kind of taking that article one step further, like: machines learn all of this cool stuff about language, but what can be the "dark side" of this?

Are there things in the features of data that we don't really want them to learn?
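As an aside, here is a minimal sketch of the kind of gender association Christine is referring to, probed with a publicly available pretrained GloVe embedding loaded through gensim. The model choice is purely illustrative and has nothing to do with the systems she works on.

    import gensim.downloader as api

    # Load a small, publicly available pretrained embedding model.
    # (Illustrative choice only; the interview does not name any specific model.)
    model = api.load("glove-wiki-gigaword-50")

    # Classic analogy probe: words closest to "doctor - man + woman".
    print(model.most_similar(positive=["doctor", "woman"], negative=["man"], topn=5))

    # Compare how strongly some occupation words lean towards "he" vs. "she".
    for job in ["nurse", "engineer", "teacher", "programmer"]:
        print(job, round(model.similarity(job, "he") - model.similarity(job, "she"), 3))

A positive difference means the occupation word sits closer to "he" than to "she" in the embedding space. Nothing about gender was ever explicitly labeled; the association is absorbed from the text the vectors were trained on.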

L Is that something that you think about at work or is that something you don't need to worry about?

C At Unbabel it's not really a problem per se. We may have gotten more requests for gender neutral systems, but I’m guessing in general if you don't know the person you’re speaking with, you don't just assume their gender. I think from a business perspective it's not something that we can justify spending a lot of resources on. It's something that I’ve been wanting to measure in our translation systems but haven't gotten the time to do.

L As you know better than me, some languages always have gender associated with them. You can be translating from a gender-neutral language to a language that takes gender into account. Do you see these sorts of translation errors often?

C I guess there are two parts to it. Part of it is that some languages have a gender for every noun, like Portuguese. We do see problems there where we're maybe using the wrong article for the noun, making the translation sound kind of stupid, but it isn't necessarily offensive. The other part is when you're talking to someone whose gender you don't necessarily know: how do you use adjectives to describe them that require a gender? It's something that is hard to account for.

L When bias is mentioned in tech it seems to be often associated with gender or race but there's always some bias in data. You can have an algorithm trained with some data that makes it biased towards an industry or kind of problem.

C Yeah I think a lot of conversations around bias in AI can be kind of vague. We use the word bias to describe our models because it's trained on a very specific dataset and therefore it's biased to that dataset, but it's not like “bad bias".

A silhouette of a man leaning against a wall looking at his phone.

"I think that in AI, the people aren't really talking with the communities that the models can affect"

Photo by Warren Wong on Unsplash

L It's not the same bias right? It's biased because of the nature of the data. Well, the “bad bias" is also there because of the nature of the data used to train.

C I think that in AI, the people aren't really talking with the communities that the models can affect.

L Do you mean AI professionals work in a sort of hermetic way in regards to whom their work may affect?

C Bias has become a sort of trendy topic in AI, so you see it mentioned more in papers, as with GPT-3, like, "Oh we did an analysis on gender, race and religion biases in the model".

It's great that people are talking and thinking about it, but I think it's still kind of the researchers in an ivory tower saying "Yes my model is biased, this is bad", but they are not actually taking a step further and looking at who they are actually affecting. From a more societal perspective it's not something that is really measured. I read another paper that was a sort of critical analysis of papers that claimed to be about biases in AI, and their argument was that in general bias is always considered bad, but the papers wouldn't really say who that bias would be bad for or how it would affect society. It was only talked about in the abstract.

L I see what you mean, but I think we're seeing more and more concrete examples of how algorithmic bias affects real people. Algorithms are increasingly controlling every aspect of our lives. So how we train them, and the data we use, will determine how "fair" these algorithms are. But these are most if not all of the time opaque to those subject to them. The case comes to mind of Uber drivers in the U.K. taking the company to court to try and get an understanding of how the algorithm works underneath. The inner workings are a black box for those subject to these algorithms, so they don't really understand how they work and how they make decisions.

What do you think is the role and responsibility of a data scientist on the outcomes and impact when building such algorithms?

C That's a really good question. I’ve never worked in a Big Tech company but I can imagine that if you’re a data scientist you may not know the full scope of what it is that you’re working on. Maybe you’re just trying to predict one tiny thing like "How long will it take for this guy to go from point A to point B" and it's a really small prediction. I’m not sure.

I don't really want to judge anyone for working in those companies, of course anyone has the right to earn their money you know? I don't know what the right answer is. I can imagine, if I were in their place, would I do anything differently from them? I mean of course I care, I think everyone kind of cares. But it's hard to say "me predicting this really tiny thing is making this other person not work for a day". It's really hard.

L Yeah, and it's not obvious at first, as you don't know what the collaterals of those choices will be. Another thing I want to touch on is that there's a knowledge gap around these algorithms, between those who build them and those who are subject to them. It's like "today you’ll get this score which influences some specific outcome, like more fares, or worse credit, or seeing a specific piece of information". This sort of asymmetry is weird. As a technologist, I like being on the end that defines and sort of understands how it works. But then you see people on the other end who may not comprehend it at all.

The other day I had food delivered, and as the delivery guy's giving me the bag he goes "Can I take a picture of it with the receipt facing the camera?". I asked him — suspecting that it was for some quality-related aspect — if he had any idea why he had to do that, and he said "Well it's for them [Uber], I guess, to control that I’m doing my job? Like they randomly ask for this but I don't really know what it then does".

Of course delivery workers are the most visibly impacted by algorithmic coordination, but there are many other examples of algorithms playing an important role in people's lives, like CV scanning, grading, risk assessment. All of these can carry biases whose side-effects go unaccounted for.

C Yeah, you can blind your model to race data, for instance, but because of the way society is structured and decades and decades of reinforcement of certain patterns, you can try to blind it for, let's say, race, but things like where you live or how much you earn already have that baked in.
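A tiny synthetic sketch of the proxy effect Christine describes: the data below is entirely made up, the sensitive attribute is dropped from the model's features, and yet a correlated proxy (income, in this toy example) lets the model's scores separate the groups anyway.

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score

    rng = np.random.default_rng(0)
    n = 5000
    group = rng.integers(0, 2, n)                  # hypothetical sensitive attribute
    income = rng.normal(30 + 10 * group, 5)        # proxy feature correlated with the group
    outcome = (income + rng.normal(0, 5, n) > 35).astype(int)  # label shaped by the proxy

    # "Blind" model: the sensitive attribute is excluded, only the proxy is used.
    X_blind = income.reshape(-1, 1)
    clf = LogisticRegression().fit(X_blind, outcome)
    scores = clf.predict_proba(X_blind)[:, 1]

    # The model never saw `group`, yet its scores still differ across groups
    # and can even be used to recover the group itself.
    print("Mean score, group 0:", round(scores[group == 0].mean(), 3))
    print("Mean score, group 1:", round(scores[group == 1].mean(), 3))
    print("AUC of scores for predicting the group:", round(roc_auc_score(group, scores), 3))

Dropping the sensitive column removes the label, not the pattern: any feature correlated with it carries the information forward.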

L Do you remember last year's credit card bias controversy with Apple, where women were getting lower credit limits than men? They basically went and said that they were going to look into the data being used to avoid potential bias. These are kind of valid excuses, and it's hard to look out for, I imagine. You don't have diverse enough data to be comprehensive in how you train algorithms, so it isn't done on purpose, but it ends up reflecting society's biases.

C And I guess there's not much regulation around this at all.

L I was gonna go there. Algorithms have such an impact on our lives, and companies are simply focused on profit without much accountability.
Do you think it would make sense for governments to step in, or maybe an approach like OpenAI but only for algorithms, in a way?

C I guess it would make sense. If you look at the financial industry, they have these sorts of self-regulatory organizations, and whether or not these mechanisms actually work or are actually independent is another question.
But I guess the financial industry is aware that certain arbitrages can take place and tries to mitigate that and keep things fair, and I guess for technology there's nothing really like that yet.

L Why do you think that's the case for the tech industry?

C I don't know, I guess everything happened very quickly, for one thing. And maybe because so much of it is opaque to the average person, there just hasn't been enough pressure on the industry to actually do this.

L I agree. In addition, I think a lot of what happened initially in the digital world was thought of as not being real, as in not having an impact on the "real" world. But online bullying is real, swaying people's opinions is real. The internet has become central to people's lives. The expansion was too fast, and it's like magic for most people. It's amazing and it works really well, but you’re not aware of what's behind that magic, so you don't question it.

Do you think this positive view of tech is shifting?

C I don't know. On one hand, technology is very much taken for granted, it's a great tool, it can do amazing things. But I think the general population is becoming a little more skeptical of its impact. You saw The Social Dilemma on Netflix? I don't know who watches this, but at least it's out there in the open and people are interested. So I do think it's kind of getting better, as in, people are more aware. And whether or not that will actually change behavior, I don't know. I mean, I saw that documentary and I was like "This is so horrible, I’m like a sheep following Big Tech," but then I live so far away from home and I need to talk to my friends, I can't just leave Facebook.

L I totally get what you mean, that dilemma. In a way, what they provide is good for some stuff and bad for other stuff, and they must fix what is bad. But if no one holds their feet to the fire, they won't fix it fast enough. Do you talk about these sorts of more abstract themes and the implications of technology with your colleagues on a daily basis?

C No I can't really say I do.

L Why is that the case?

C I don't know. Maybe sometimes it's just easier not to think about it. And I don't know, maybe in Portugal it's not as much part of the conversation, but I don't know if that's true. I don't know if my peers in the U.S. are talking about this on a daily basis; I would guess not. But I also think there are more events and meetups like that happening there, though I don't think that people there care more. I don't really know.

L What do you think could be the benefits of talking about this on a more regular basis?

C I think awareness would maybe be the first step. Again, going back to the data scientist example from before, I think at least that person would be more curious about what their actions are probably doing. I think this project you're doing is really cool because it can reach people outside of the technology world as well, and it would be great if AI or something similar were part of the general school curriculum.

We always say things like "it's magic" or that it's a complicated black box, but I don't think it needs to be. Sure, the algorithms definitely rely on some complex mathematics, but you don't need to understand them fully to get a feel for what they are doing. I think it could be really cool if that was just normally taught in school.

L Well actually I'm going for a different approach. Like you said it's cool that this project may reach people outside of the tech industry but I'm more like: it's tech people that must first be aware.

What it looks like is that technologists will usually be presented with a complex problem that attracts and challenges them, but they won't ponder what the problem is actually tackling. Like "I need to understand how these people relate to each other and how they move around, in order to profile them and serve an immigration agency, potentially causing harm to other people". So you're working on an enticing, complex mathematical problem and you're very well paid, but the outcome is that it's going to be used for some not-so-ethical or dubious end.

So to me the approach starts at the technologist's level: being aware, asking "What should I pay attention to in order to minimize unintended collateral issues?" and creating more sensitivity.

C Yeah I definitely agree with that. It is our responsibility to be asking these questions, but I guess in the grand scheme of things I have a bit more pessimistic view. What can just one little data scientist do against these malicious uses of technology? I don't know, maybe you have a better answer.

L I don't have any satisfactory answer but what I know is that it's the technologist that is going to do it, he's the one that is going to code it, not necessarily the CEO or the manager. Technologists are the ones implementing it, putting their fingers on the keyboard. So that's like the last barrier where you can draw a line and say "Hold on a second". And we've seen that happening with Google employees protesting the company's contracts with the Pentagon or GitHub employees because of the company's ties with ICE.

And we're being critical, but of course we like those companies, we like all of them. We're just demanding that they be better: not to treat people as cogs, and to own the unintended consequences of their work and fix them.

C I think it's the basic philosophical experiment of like "I'm a soldier and my general tells me to kill someone, is it my fault or his fault?"

L Totally. And then you have other things like student debt, mortgage, a family to support and those jobs are usually very well paid in data science, AI and so on.

C Yes it becomes harder to just say no.

L No one is condemning anyone for pursuing a career, but we need to strike a sort of balance. I mean, it's complicated, but we need to talk about it. In a couple of decades we may end up with a society entirely run by algorithms, where only a handful of people actually know why and how they work. How fair is it that only a couple of people can control something that literally impacts millions or billions of people?

C Yeah, I guess having discussions around those topics can help us remain aware of the possible issues with our work. Sometimes we may not realize that something can have a side-effect, but someone else could notice and tell us.

L Exactly. I think we're past our time. Thanks a lot for being willing to do this and share your thoughts. I hope you liked it.

C Thanks. Yeah it was fun.

You can connect with Christine on LinkedIn.

Send your feedback, suggestions or thoughts over at hello@criticalfuture.tech.

If you've enjoyed this publication, consider subscribing to CFT's monthly newsletter to get this content delivered to your inbox.