A few absolute shockers in the list of websites the Washington Post has revealed are used to train Google’s generative AI tools, apparently including the likes of 4chan, Breitbart, and RT.
From WaPo:
“Meanwhile, we found several media outlets that rank low on NewsGuard’s independent scale for trustworthiness: RT.com No. 65, the Russian state-backed propaganda site; breitbart.com No. 159, a well-known source for far-right news and opinion; and vdare.com No. 993, an anti-immigration site that has been associated with white supremacy.
“The top Christian site, Grace to You (gty.org No. 164), belongs to Grace Community Church, an evangelical megachurch in California. Christianity Today recently reported that the church counseled women to ‘continue to submit’ to abusive fathers and husbands and to avoid reporting them to authorities.”
https://www.washingtonpost.com/technology/interactive/2023/ai-chatbot-learning/
#technology #ArtificialIntelligence #AI #GenerativeAI #Google #tech #news @technology @politics #trans #lgbtqia #lgbtq
@ajsadauskas @technology @politics Not surprising, but at least it’s listed. We still have no idea what ChatGPT is trained on!
@drdanielturner @technology @politics True. I guess the interest is that this is the first time we get to see exactly what crap these generative AI tools are trained on.
Guess which one they will delete from it.
@PolandIsAStateOfMind @ajsadauskas The Austrollian one? ;)
Surely not WaPo lol
Those rankings are pretty low.
@ajsadauskas @technology @politics suddenly things make sense
Considering the number of tokens and their ranks, could it just be to know and understand more context about them?
Considering the amount of data those silos actually have.