I was going to put this in the gender thread, but there's a chance someone with a hand in Google somewhere might see it here:
In Hungarian, we don’t use he/she there is only one gender pronoun “Ö”. But it’s fascinating when this is fed through Google Translate, the algorithms highlight the biases that are there. Then imagine enacting any kind of change from those biases, encoded into computer code. pic.twitter.com/DygBtaHShU
Like Hungarian, Finnish has a gender-neutral third person pronoun, hän. So the same test could be done asking The Goog to translate from Finnish to English, I thought to myself.
First step: I will try translating the given English into Finnish, without using Google Translate.
Some hours later:
The English in the tweet:
Quote:
She is beautiful. He is clever. He understands math. She is kind. He is a doctor. She's a cleaner. He is a politician. She is a teacher. He is strong. He is clever. He's a driver. She's shopping. She washes the dishes. He is a doctor. He's fishing. He makes a lot of money. She is beautiful. He is clever. He's even smarter. He's the smartest. She washes the dishes. Get it, Google.
and the Hungarian because I'm going to check that too:
Quote:
Ö szép. Ö okos. Ö érti a matematikát. Ö kedves. Ö egy orvos. Ö egy takarító. Ö egy politikus. Ö egy tanár. Ö erös. Ö okos. Ö soför. Ö bevásárol. Ö mosogat. Ö egy orvos. Ö horgászik. Ö sok pénzt keres. Ö szép. Ö okos. Ö még okosabb. Ö a legokosabb. Ö mosogat. Kapd be, Google.
My Finnish:
Quote:
Hän on kaunis. Hän on älykäs. Hän ymmärtää matematiikka. Hän on kiltti. Hän on lääkäri. Hän on siivooja. Hän on poliitikko. Hän on opettaja. Hän on vahva. Hän on terävä. Hän on ajaja. Hän käy kaupassa. Hän tiskaa. Hän on lääkäri. Hän kalastaa. Hän ansaitsee paljon rahaa. Hän on kaunis. Hän on älykäs. Hän on vielä terävämpi. Hän on terävin. Hän tiskaa. Ymmärrä, Google.
And now, what does Google say:
First with the Hungarian - I get slightly less gendered results than the tweeter (but only because there are more male pronouns ... so really it's more gendered):
Quote:
She is beautiful. He is clever. He understands math. She is kind. He's a doctor. He's a cleaner. He is a politician. He's a teacher. He is strong. He is clever. He's a driver. He's shopping. He washes. He's a doctor. He's fishing. He makes a lot of money. She is beautiful. He is clever. He's even smarter. He's the smartest. He washes. Get it, Google.
Now the Finnish:
Quote:
She is beautiful. He is intelligent. He understands mathematics. He's kind. She's a doctor. She is a cleaner. He is a politician. He is a teacher. He is strong. He is sharp. He is a driver. He trades. She was doing the dishes. She's a doctor. He fishes. He makes a lot of money. She is beautiful. He is intelligent. He is even sharper. He is the sharpest. She was doing the dishes. Understand, Google.
She is a doctor, he is a teacher - two tiny bits of positive gender roles. All the rest are cultural biases not actually in the source text.
It's decided to translate my käy kaupassa (goes to the shops) as trades, which I think is rubbish but means no credit for "he is shopping".
Plus tiskaa is definitely present tense, but Google Translate getting Finnish wrong is no surprise.
I don't know about the Hungarian word, but kaunis is a pretty gendered word in Finnish like beautiful in English. I can't really complain about picking "she" for that.
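For anyone who wants to check the counts rather than eyeball them, the two Google outputs above can be tallied with a few lines of Python. This is only a rough sketch: the names (`tally_pronouns`, `first_pronouns`, `differing_sentences`) are made up for illustration, and it assumes every sentence ends in a full stop and that the pronoun, if any, is the first word.

```python
import re
from collections import Counter

# Google's English output exactly as pasted earlier in this post.
FROM_HUNGARIAN = (
    "She is beautiful. He is clever. He understands math. She is kind. "
    "He's a doctor. He's a cleaner. He is a politician. He's a teacher. "
    "He is strong. He is clever. He's a driver. He's shopping. He washes. "
    "He's a doctor. He's fishing. He makes a lot of money. She is beautiful. "
    "He is clever. He's even smarter. He's the smartest. He washes. "
    "Get it, Google."
)
FROM_FINNISH = (
    "She is beautiful. He is intelligent. He understands mathematics. "
    "He's kind. She's a doctor. She is a cleaner. He is a politician. "
    "He is a teacher. He is strong. He is sharp. He is a driver. He trades. "
    "She was doing the dishes. She's a doctor. He fishes. "
    "He makes a lot of money. She is beautiful. He is intelligent. "
    "He is even sharper. He is the sharpest. She was doing the dishes. "
    "Understand, Google."
)

# Whole-word match, so "the" and "teacher" are not counted as "he".
PRONOUN = re.compile(r"\b(he|she)\b", re.IGNORECASE)


def tally_pronouns(text):
    """Count every standalone he/she (including He's / She's)."""
    counts = Counter(m.group(1).lower() for m in PRONOUN.finditer(text))
    return {"he": counts["he"], "she": counts["she"]}


def first_pronouns(text):
    """Return 'he', 'she' or None for each sentence, in order."""
    sentences = [s for s in re.split(r"\.\s*", text.strip()) if s]
    result = []
    for sentence in sentences:
        m = re.match(r"(he|she)\b", sentence, re.IGNORECASE)
        result.append(m.group(1).lower() if m else None)
    return result


def differing_sentences(a, b):
    """1-based positions where the two outputs chose different pronouns."""
    pairs = zip(first_pronouns(a), first_pronouns(b))
    return [i for i, (x, y) in enumerate(pairs, start=1) if x != y]


print(tally_pronouns(FROM_HUNGARIAN))   # {'he': 18, 'she': 3}
print(tally_pronouns(FROM_FINNISH))     # {'he': 14, 'she': 7}
print(differing_sentences(FROM_HUNGARIAN, FROM_FINNISH))  # [4, 5, 6, 13, 14, 21]
```

So out of 21 gendered sentences, the Hungarian route gives 18 "he" to 3 "she", the Finnish route gives 14 to 7, and the two disagree on six sentences (kind, doctor, cleaner, and the dishes).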
I will now invite Miisa to improve my Finnish choices.
All except the teacher one align with stereotypes or statistics; most doctors I have come across in the past 10 years, anyway, are women. "Hän käy kaupassa" is not really "He trades" but whatever. So when I change it a little to "Hän menee kauppaan" it becomes "She goes to the store" because, you know, shopping vs trading.
But presumably it is Google's biases, so why would a translation into the same language from Hungarian give a different result than Finnish?
Also, if the clip is short enough, it gives both options. Two separate sentences and it starts having to choose a pronoun for each one and drops the warning about genders.
Yet it can sometimes figure out from the context if it is the same person. So if the default for "handsome" is male, and for "goes shopping" is female, if you add an "also" in the next sentence Google realises it is the same person and is brutally forced to decide one way or the other. So
Quote:
Hän on komea. Hän menee myös kauppaan.
becomes
Quote:
He is handsome. He also goes to the store.
Yet when I switched "handsome" to "beautiful", it kept the shopping gender in the next sentence as male, until I told it the person was buying makeup.
I would not want to be the developer of that software; it sounds like a nightmare of forced gender supposition pitfalls.
Only one thing to do.
Ditch ALL the gendered pronouns, you silly anglophones. This is the way.
I don't think most of Google's translation data was created by Google employees - I suspect it was done by (stolen from) the public via innocent-sounding "suggest an edit - Your contribution will be used to improve translation quality" links.
This (a) is a possible reason why different language pairs might show different built-in bias and (b) will be Google's excuse if asked about it.
Quote:
But presumably it is Google's biases, so why would a translation into the same language from Hungarian give a different result than Finnish?
Not that this is forgiving Google, but considering this is AI doing the translations, the most likely reason is that the sources Google has fed its translation models have biases, and those biases differ slightly per language. Joep is probably correct, too, that the AI may have been tweaked by suggested improvements.
This could have been avoided if the translation just used the gender-neutral singular "they" and let the user figure this out.
Oh yeah, I don't think anyone at Google is sitting there with a Doctor Evil pinkie finger to the lip (except maybe Ensign Steve), it's just an aggregate of what exists in the world of humans. Maybe a lesson to us as individuals is 'be less shit'.
And use 'they/them' more.
Quote:
Oh yeah, I don't think anyone at Google is sitting there with a Doctor Evil pinkie finger to the lip (except maybe Ensign Steve)
To be honest, I'm pretty sure there are more people at Google than just ES doing the .
Unfortunately we've been reduced to muahahaha-ing at each other over teleconference. It's just not the same.
Here's almost everything I know about Google Translate:
It is AI, and the models are trained using mostly books but also websites that have been human-translated into various languages. Like almost the whole of human written literature, replete with its centuries of cultural biases baked right in.
We (or I guess "they" since it's nothing to do with my jerb specifically) have to be careful now with the websites we source these days, because many websites have been translated into other languages using Google Translate, and training an AI model on the very data it produces creates a feedback loop and that is (probably fairly obviously) not good.
I know one other thing but I'm late for a meeting so maybe I'll tell you later.
Quote:
We (or I guess "they" since it's nothing to do with my jerb specifically) have to be careful now with the websites we source these days, because many websites have been translated into other languages using Google Translate, and training an AI model on the very data it produces creates a feedback loop and that is (probably fairly obviously) not good.
Also, because if left unsupervised an AI turns into a Nazi: