Darkard@lemmy.world to Showerthoughts@lemmy.world · English · 2 months ago
There's a pretty good chance that all the AI content scrapers used to get "Training data" have ingested all the Epstein files
hexagonwin@lemmy.today · 2 months ago: highly doubt it, they’re trained on publicly available data mostly
ramble81@lemmy.zip · 2 months ago: If by “publicly” you mean “any data source that it managed to connect to, public or private”, then yes….
youcantreadthis@quokk.au · English · 2 months ago: Not entirely though. Like we know Grok was trained on the FBI’s cp stash, right?
Goodlucksil@lemmy.dbzer0.com · 2 months ago: Do you have any source on that?
youcantreadthis@quokk.au · English · 2 months ago: Don't recall where I heard it.
🇦🇺𝕄𝕦𝕟𝕥𝕖𝕕𝕔𝕣𝕠𝕔𝕠𝕕𝕚𝕝𝕖@hilariouschaos.com · English · 2 months ago: All the AI companies have trained on the FBI cp database, but they use a coefficient of -1, thus steering it away from such content and reinforcing against such material. OpenAI used to pay people for cp. Idk if they still do this tho.
youcantreadthis@quokk.au · English · 2 months ago: So why all the cp from them?