Sex offender banned from using AI tools in landmark UK case

girlfreddy@lemmy.ca · 7 months ago

xmunk@sh.itjust.works · 7 months ago

It is true, a 10 year old naked woman is just a 30 year old naked woman scaled down by 40%. /s

No buddy, there isn’t some vector of “this is the distance between kid and adult” that a model can apply to generate what a hypothetical child looks like. The base model was almost certainly trained on more than just anatomical drawings from Wikipedia - it ate some csam.

If you’ve seen stuff about “Hitler - Germany + Italy = Mussolini”: for models where that’s true (which is not universal), it takes an awful lot of training data to establish and strengthen those vectors. Unless the generated images were comically inaccurate, a lot of training went into this too.

rebelsimile@sh.itjust.works · 7 months ago

Right, and the Google image AI gobbled up a bunch of images of black George Washington, right? They must have been in the data set, there’s no way to blend a vector from one value to another, like you said. That would be madness. Nope, must have been copious amounts of Asian Nazis in the training set, since the model is incapable of blending concepts.

PotatoKat@lemmy.world · 7 months ago

From a few months ago

https://cyber.fsi.stanford.edu/news/investigation-finds-ai-image-generation-models-trained-child-abuse

xmunk@sh.itjust.works · 7 months ago

You’re incorrect and you should fucking know better.

I have no idea why my comment above was downvoted to hell but AI can’t “dream up” what a naked young person looks like. An AI can figure that adults wear different clothes and put a black woman in a revolutionary war outfit. These are totally different concepts.

You can downvote me if you like, but your AI generated csam is based on real csam, so fuck off. I’m disappointed there is such a large proportion of people defending csam here, especially since Lemmy should be technically oriented - I expect to see more input from fellow AI-fluent people.

rebelsimile@sh.itjust.works · 7 months ago

You’re spreading misinformation and getting called out for it.

xmunk@sh.itjust.works · 7 months ago

Just a note - csam has been found in model training sets: https://cyber.fsi.stanford.edu/news/investigation-finds-ai-image-generation-models-trained-child-abuse

rebelsimile@sh.itjust.works · 7 months ago

Ok? A few hundred images of anything aren’t necessarily going to train a model based on billions of images. Have you ever tried to get Stable Diffusion to draw a bow and arrow? Just because it has seen something doesn’t mean that it has learned it, nor, more importantly, does it mean that is how it learned it, since we can see that it can infer many concepts from related concepts - pregnant old women, Asian Nazis, black George Washingtons (NONE OF WHICH have ever actually existed or been photographed)… are unclothed children really more of a leap than any of those?

xmunk@sh.itjust.works · 7 months ago

It is, yes. A black George Washington is one known visual motif (a George Washington costume) combined with another known visual motif. A naked prepubescent child isn’t just the combination of “naked adult” and “child”; naked children don’t look like naked adults simply scaled down.

AI can’t tell us what something we’ve never seen looks like… a kid who knows what George Washington and a black woman look like can imagine a black George Washington. That’s probably a helpful analogy: AI can combine simple concepts but it can’t innovate - it can dream, but it can’t know something that we haven’t told it about.

rebelsimile@sh.itjust.works · edit-2 7 months ago

What you’re saying is based on the premise that the system can’t draw concepts it has never seen, which is simply untrue. Everything else past that is sophistry.

Edit: also not continuing a conversation with someone who is hostile to the basic rules of logic.

xmunk@sh.itjust.works · 7 months ago

You have a basic misunderstanding of how AI works and are endowing it with mystical properties. Generative AI can’t accurately infer concepts or items it doesn’t understand. It has all the knowledge of the internet but if you ask it to draw a schematic for a hydrogen bomb it’ll give you back hallucinated bullshit. I’ll grant that there’s a small chance that just enough random details have been leaked that the AI may actually know how to build a hydrogen bomb - but it can’t infer how that would work from “understanding physics”.

Either way, these models were trained on csam, so my initial point is accurate and not misinformation.

xmunk@sh.itjust.works · 7 months ago

It isn’t misinformation, though. Generative AI needs a basis for its generation.

rebelsimile@sh.itjust.works · edit-2 7 months ago

The misinformation you’re spreading is related to how it works. A generative AI system will (without prompting away from it) create people with 3 heads, 8 fingers on each hand and multiple legs connecting to each other. Do you think it was trained on that? This argument of “it can generate it, therefore it was trained on it” is ridiculous. You clearly don’t understand how it works.

Leate_Wonceslace@lemmy.dbzer0.com · edit-2 7 months ago

deleted by creator

xmunk@sh.itjust.works · 7 months ago

You’re extremely correct when it comes to combining different aspects of existing works to generate something new - but AI can’t generate something it doesn’t know about. If a generative model knows what a prepubescent naked body looks like, it has been exposed to them before. The most generous way to excuse this is that medical diagrams exist and supplied the majority of inputs for any prompts about cp to work off of. A much more realistic view is that some cp made it into the training set.

I don’t disagree with any of your assessments, but if you wanted a Van Gogh painting of a Glorp from Omicron Persei 8, you’d get out… something, but because the model has no reference for Glorps it’ll be hallucinations or guesses based on other terms it can find.

To be clear, I’m coming at this from the angle of someone who has trained and evaluated models in a company that’s used them for the better part of a decade.

I understand I’m going up against your earnestly held belief, but I’ve seen behind the curtain on a lot of this stuff and hopefully in time the way it works becomes demystified for more people.

Leate_Wonceslace@lemmy.dbzer0.com · 7 months ago

For reference, the comment I made was improperly displayed, and I thought I replied to the wrong person. It said:

Hi, I’m a mathematician that’s been following the development of generative neural networks for about a decade or more.

You’re wrong. Your knowledge of the inner workings of these AI is accurate, but somehow you’ve reached an incorrect conclusion. I sometimes run a local instance of Stable Diffusion on my home PC, and it can make things that have never existed, that look totally unlike anything it’s ever seen, and yet match certain specifications in principle.

I don’t use it to generate porn, so I can’t speak to the difficulties in avoiding csam while doing so. Mostly what I generate is paintings in the style of Van Gogh, and it does a remarkable job of doing so, even when I can’t get it to do what I want. For example: it generated a painting of him in profile wearing armor when I asked for a weapon. I don’t think Van Gogh ever painted himself in profile, and he certainly never did so in armor. And yet the model was capable of imagining what this human-like figure so closely associated with the artist style “Van Gogh” would look like in profile, because it knew what humans tend to look like in profile and it could conceptualize how the features would present themselves. I’m certain that an AI can imagine a convincing image of simulated csam without ever having seen it, because these models really are just that good at imagining new things.

PotatoKat@lemmy.world · 7 months ago

Has your model seen humans in a profile view? Has it seen armor? Has it seen Van Gogh style paintings? If yes then it can create a combo of those things.

For CSAM it needs to know what porn looks like, what a child looks like and what a naked pubescent body looks like to create it. It didn’t make your Van Gogh painting from nothing; it had an idea of what those things were.

rebelsimile@sh.itjust.works · 7 months ago

If the system must see something to generate it, and the system can’t generate things that don’t exist, then how is it generating pregnant old women?

xmunk@sh.itjust.works · 7 months ago

Because it’s a transformation that can be accurately predicted, at least as far as we can conceive. This is sort of the problem with this thread - there are plenty of examples of derivative combinations that are being presented as counterexamples, but naked children don’t just look like adults scaled down. This is a rather unique situation because most people have been parents or siblings and know what naked children look like, but photographs of that nudity are restricted and shouldn’t be included in model training.

The other example we might have to work with would be copyrighted material, but we know that models did consume material they weren’t licensed to use - as a result AI has been able to generate Disney characters and the like in a recognizable way.

redlue@startrek.website · edit-2 6 months ago

Removed by mod

7 months ago

Bro googled the word vector and was waiting to use it.

Leate_Wonceslace@lemmy.dbzer0.com · edit-2 7 months ago

No, they’re referring to the internal workings of AI models, which are essentially a series of incredibly high-dimensional matrices with extra bits around them to make them work. Individual concepts are embedded as vectors in the space that these models work in. That’s why linear algebra is brought up so frequently in discussions of AI.
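
For anyone who wants to see what "concepts embedded as vectors" means mechanically, here is a minimal sketch of the "Hitler - Germany + Italy" arithmetic mentioned earlier in the thread. The three-dimensional vectors and the `analogy` helper are invented for illustration; real models learn hundreds of dimensions from data, and the analogy does not hold in every model.

```python
import math

# Hand-made toy embeddings. The axes are (German, Italian, head of state).
# These values are made up for illustration, not taken from any real model.
EMBEDDINGS = {
    "Germany":   (1.0, 0.0, 0.0),
    "Italy":     (0.0, 1.0, 0.0),
    "Hitler":    (1.0, 0.0, 1.0),
    "Mussolini": (0.0, 1.0, 1.0),
    "Rome":      (0.0, 1.0, 0.1),
}

def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def analogy(a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    target = tuple(
        x - y + z
        for x, y, z in zip(EMBEDDINGS[a], EMBEDDINGS[b], EMBEDDINGS[c])
    )
    candidates = (word for word in EMBEDDINGS if word not in (a, b, c))
    return max(candidates, key=lambda word: cosine(target, EMBEDDINGS[word]))

result = analogy("Hitler", "Germany", "Italy")
print(result)
```

With these toy values the nearest remaining word is "Mussolini", but only because the vectors were built to make it so. Whether a trained model ends up with such clean directions is an empirical question.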

The_Vampire@lemmy.world · 7 months ago

While it’s true that linear algebra and vectors are used in learning models, they’re not using the term correctly in a way that says they know something about the subject (at least, the modern subject). Concepts aren’t embedded as vectors. In older models (before the craze), concepts were manually embedded as numbers or a collection of numbers, which could be a vector (but could be something else as well), and the machine would learn by modifying weights. However, in current models (and by current, I mean at least more than a couple years), concepts are learnt by the machine (weights are still modified by the machine as well) and the machine makes its own connections between features presented to it.

For example, you give it a dataset of 10x10 pixel images (with text descriptions) and it reads that as 100 pixels split into 3 numbers (RGB) and then looks for connections between those numbers and in which pixels. It’s not identifying what a boob is, but knows that when an image has ‘boob’ in the text description then there’s a very high likelihood that there will be a circular collection of pixels with lots of red somewhere in the image that are also connected to other pixels that are often also lots of red. That’s me breaking down what a human would think given the same task/information, but the reality is the machine will come up with its own connections/concepts which are both often far better than humans (when the model works, at least) and far more ineffable to humans.
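
As a rough sketch of the "100 pixels split into 3 numbers" step above, this is roughly what the raw input looks like before any learning happens. The gradient image and the function names are placeholders invented for the example, not part of any particular model's pipeline.

```python
WIDTH, HEIGHT, CHANNELS = 10, 10, 3

def make_image(width, height):
    """Build a placeholder image as rows of (R, G, B) tuples (a simple gradient)."""
    return [[(x * 25, y * 25, 128) for x in range(width)] for y in range(height)]

def flatten(image):
    """Flatten the image row by row into one list of floats scaled to [0, 1]."""
    return [channel / 255 for row in image for pixel in row for channel in pixel]

image = make_image(WIDTH, HEIGHT)
features = flatten(image)

# 10 x 10 pixels x 3 channels = 300 numbers per image
print(len(features))
```

The model only ever sees those 300 numbers plus the text description. Any notion of shapes or objects has to be learned as statistical structure across many such lists.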

Leate_Wonceslace@lemmy.dbzer0.com · 7 months ago

From my perspective as an algebraist, you seem to be splitting hairs when you make a distinction between vectors and n-tuples of real numbers. Furthermore, they’re referencing a specific 3blue1brown video. I’m not saying their conclusion is correct; they’re dead wrong, but that doesn’t mean their understanding is so shallow that they’re simply repeating a word they heard to sound smart.
