ChatGPT use declines as users complain about ‘dumber’ answers, and the reason might be AI’s biggest threat for the future: AI for the smart guy?

  • Strangle@lemmy.world · +66/−6 · 1 year ago

    Back in my day, we used to call ‘prompt engineering’ ‘asking a question’.

    • CosmoNova@lemmy.world · +12/−3 · 1 year ago

      They’ve got to have special terminology because what they do is oh so special. Some AI users act like they’re Louise Banks from the movie Arrival cracking the code to an alien language or something. And I don’t think it’s far-fetched to assume they’re often from the same breed who had NFT monkeys as their twitter pfp about 18 months ago.

      • Gerbler@lemmy.ml · +3/−1 · 1 year ago

        Blockchain > Crypto > NFTs > LLMs > whatever’s next.

        These people will always be sniffing around for the next big thing to oversell and fleece their audience.

    • Wololo@lemmy.world · +9 · 1 year ago

      I’ve had similar experiences lately. Either that, or it decides to review and analyze my code unprompted when I’m trying to troubleshoot a particularly tricky line. I’ve had a few instances where it all but gaslit me into thinking that it was right and I was wrong about certain solutions. It feels like it happened rather suddenly too; it never used to do that, save for the odd exception.

    • BehindTheBarrier@lemmy.world · +16 · 1 year ago

      They could make it paid-only today, and it’d be instantly profitable. Most free users would move to a free alternative, but the corporate world would easily pay for access, and so would some power users. But I’m sure they are making good money from all the API use anyway; the free access is a cheap way to get mass testing and training data.

      • Corkyskog@sh.itjust.works · +13 · 1 year ago

        I know so many average Joes that use it all the time and would instantly pay $5 a month for it, even just for a phone app.

  • unhook2048@lemmy.world · +38/−3 · 1 year ago

    It’s getting worse based on the feedback, unfortunately. The need for safety, and the lack of meaningful deliberation about how AI companies should operate and what should and should not be done, has left Sam and co indecisive about doing anything. Meanwhile, with the “morality” of the thing having been hijacked, other AIs are performing better, led by ex-employees of OpenAI, with actual bound morals and without inherently relying on user input to train future models. This will be the path forward; this will lead to safe and controlled integration.

    I guess at the core of this, we are afraid of ourselves. We are afraid that the worst of humanity outpaces the better parts; that the inputs and training aren’t altruistic but are more pointedly “bad” or “wrong”, and thus “harmful”, whether through misinformation, lies, or fabrications.

    I hope we find a way to do better. I’m still excited for the future of AI. I mean, crap, I’m closer to having a family doctor that’s a robot than I am to a real human doctor.

    • asparagus9001@lemmy.world · +14/−3 · 1 year ago

      I guess at the core of this, we are afraid of ourselves. We are afraid that the worst of humanity outpaces the better parts; that the inputs and training aren’t altruistic but are more pointedly “bad” or “wrong”, and thus “harmful”, whether through misinformation, lies, or fabrications.

      Is there any reason not to be afraid? You could say Tay was essentially the same idea a few years back, and it took like 48 hours loose on the internet before it was spouting literal Nazi (1930s–40s German NSDAP) rhetoric. Besides that being a PR disaster, if “AI” is only getting stronger and more integrated into human life and society, that can be pretty problematic.

  • fidodo@lemm.ee · +28/−1 · 1 year ago

    AI cannibalism simply isn’t a thing yet. It definitely will be and good models will need to spend a lot of time and money sourcing good training data, but the models are not up to date enough to be contaminated yet.

    I’m very confident the degradation has come from them trying to scale up. Generative AI is the most expensive thing you can provide on the cloud, and not only are they trying to make it faster, they’re trying to roll it out for far more consumption. Major optimizations will require an algorithmic breakthrough, so in the meantime all they can really do is find which corners are least bad to cut.

  • daisy lazarus@lemmy.world · +56/−37 · 1 year ago

    Nonsense. Fewer people are using it because there are viable alternatives and the broader novelty has worn off.

    I use it every day in my job and the quality of answers only drops off when prompts are poorly crafted.

    By and large, the average user doesn’t understand the fundamentals of prompt engineering.

    The suggestion that “answers are increasingly dumber” is embarrassing.

    • Zeth0s@lemmy.world · +57 · 1 year ago

      Unfortunately I don’t agree with you. Different things have changed over time:

      • For ChatGPT 3.5 they moved to a “lighter” and faster (distilled) version, gpt-3.5-turbo. Distillation came with a performance price, particularly on advanced and less common cases.
      • Newer ChatGPT-4 versions have likely been “lightened” for performance reasons.
      • Context has been halved for ChatGPT-4 on the webui, meaning the model forgets more easily and can use only half as much information to create text.
      • Heavy controls have been implemented against jailbreaking and hallucinations, which results in models less prone to follow complex instructions (limiting prompt engineering) and that prefer simplified answers over risking wrong ones (overall decreasing the chance of getting high-quality answers).
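      The context-halving point can be sketched as naive truncation: the webui simply sees less of the conversation. A minimal illustration (word counts stand in for real tokenizer counts; the function and limit are made up for the example):

```python
def truncate_context(messages, max_tokens=4096):
    """Keep only the most recent messages that fit the token budget.

    Tokens are approximated as whitespace-separated words here; a real
    deployment would count with the model's own tokenizer.
    """
    kept, total = [], 0
    for msg in reversed(messages):          # walk from newest to oldest
        cost = len(msg.split())
        if total + cost > max_tokens:
            break                           # everything older is dropped
        kept.append(msg)
        total += cost
    return list(reversed(kept))             # restore chronological order
```

      Halving max_tokens halves how far back the model can “remember”, which matches the forgetting people report.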

      All these changes have made working with GPT less pleasant, and more difficult for very advanced and specialized cases, particularly with GPT-4, which at the beginning was particularly good.

      • mikkL@lemmy.world · +2 · 1 year ago

        This was really enlightening. Do you have some articles that elaborate? ☺️

        • Zeth0s@lemmy.world · +12 · edited · 1 year ago

          Regarding 3.5-turbo, you can check the documentation; the old 3.5 models are labeled “legacy”. Regarding the max number of tokens of GPT-4, you can try yourself: it used to be ~8k, and it is now ~4k from the webui.

          There is a talk from an OpenAI exec (the CIO, if I recall correctly) where he describes how reinforcement learning from human feedback (RLHF) actually decreased the performance of the models when it comes to programming. I cannot find it now, but it is around on YouTube.

          The additional safeguards against jailbreaking are what OpenAI has been focusing on these past months, with heavy use of RLHF. You can google official statements regarding the “safety” of the model. I have a bunch of standard pre-prompts I have been using to initialize my chats since the beginning, and over time you could see the model following the instructions less strictly.

          The problem with OpenAI is that they have never released the exact number of parameters they are using, or detailed benchmarks. And the benchmarks you find online refer to the APIs, which behave differently from the chat webui (for instance, you have a longer context and can set the temperature and system prompt; they are probably even different models, who knows… everything is closed).
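          To make the webui/API difference concrete, these are the kinds of knobs only an API request exposes (the field values and prompt text below are purely illustrative, not a statement of what OpenAI deploys):

```python
# Hypothetical chat-completions request body; the webui exposes none of
# these fields, which is one reason API benchmarks don't transfer to it.
payload = {
    "model": "gpt-4",            # exact deployed variant is unknown/closed
    "temperature": 0.2,          # lower = more deterministic sampling
    "max_tokens": 1024,          # cap on the response length
    "messages": [
        {"role": "system", "content": "You are a terse coding assistant."},
        {"role": "user", "content": "Explain C# extension methods."},
    ],
}
```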

          Measuring the performance of LLMs is pretty tricky; minimal changes can have big effects (see https://huggingface.co/blog/evaluating-mmlu-leaderboard), and unfortunately I haven’t found good resources to properly track ChatGPT’s performance (from the webui) over time, across iterations.

    • YeastForTheYeastGod@sh.itjust.works · +18/−1 · 1 year ago

      I was skeptical at first but I’ve seen enough evidence now. There are definitely times when it’s dumb as a brick, whether the filters just get in the way too much, or whether they’ve implemented other changes idk. I’d really love the unchained version.

      • Kelly@lemmy.world · +1 · 1 year ago

        dumb as a brick

        On the 23rd of March 2023 I asked a family member to give me a prompt, and they asked “what day is the 19th of April?”.

        It answered “The 19th of April falls on a Tuesday.”, which was true the previous year but completely misleading if we were talking about the coming month.

        Was it wrong or just unclear? Either way it wasn’t helpful.
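        The ambiguity is easy to demonstrate with a calendar check: the weekday of the 19th of April depends entirely on the year, which the model never asked about. A quick look with Python’s stdlib:

```python
from datetime import date

# Without a year, "what day is the 19th of April?" is underspecified:
# the weekday shifts from year to year.
for year in (2022, 2023):
    print(year, date(year, 4, 19).strftime("%A"))
# 2022 -> Tuesday (the stale answer); 2023 -> Wednesday (the coming month)
```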

  • Open@lemmy.world · +19/−1 · 1 year ago

    The article talks about the potential of AI cannibalism, where it is now learning from data that it (or other AI) has generated.

    Does ChatGPT use recent data, though? I was under the impression that its most recent dataset was a few years old.

    • Hyperi0n@lemmy.film · +7 · 1 year ago

      How does it do your resume?

      You have to feed it all the information. Then it spits that back to you unformatted and you have to format it.

      • d4rknusw1ld@lemmy.world · +8 · 1 year ago

        Exactly. I don’t have to use my brain to write summaries and such. I’m lazy and don’t deserve a job haha.

      • solstice@lemmy.world · +3 · 1 year ago

        Yeah seriously, I pay a resume writer almost entirely because I don’t want to fuck around with Word formatting it. Lazy I know but totally worth it.

  • TheFutureIsDelaware@sh.itjust.works · +14/−1 · 1 year ago

    ChatGPT usage is a very poor metric. Anything interesting is happening via the API. Even the chat completion endpoint still isn’t “ChatGPT” on its own. None of these complaints about it being “dumber” apply to the API outputs. OpenAI don’t care about nerfing ChatGPT because it’s not their real product.

  • Immersive_Matthew@sh.itjust.works · +7 · 1 year ago

    I had my first WTF moment with AI today. I use the paid ChatGPT+ to help me with my C# in Unity. It has been a struggle to use, even with the smaller basic scripts you can paste into its character-limited prompt, as they often have compile errors. That said, if you keep feeding it the errors and guiding it where it is making mistakes in design, logic, etc., it can often produce a working script about 60–70% of the time. It often takes a fair amount of time to get to that working script, but the code that finally works is good.

    Today I was asking it to edit a large C# script with one small change that meant lots of repetitive edits and references. Perfect for AI, you would think, yet ChatGPT+ really struggled on this one, which was a surprise. We went round and round with edits, and ultimately more and more errors appeared in the console. It often ends up in these never-ending edit loops, fixing the next set of errors from the last corrected script. We are talking 3 hours of this, with ChatGPT+ finally saying that it needs to be able to see more of my project, which of course it cannot due to its input limitations, including the character limit, so that is often when I give up. That is the 30–40% that does not work out. A real bummer, as I invest so much time for no results.

    It was at the moment I gave up today that a YouTube notification popped up about how Claude.ai is even better than ChatGPT, so I gave it the initial prompt I had given ChatGPT above, and it got the code right the first time. WOW!!!

    The only issue was that it would stop spitting out code every 300 or so lines (unsure what the character limit is). To get around this I just asked it to give me the code from line 301 onwards, until I had the full script.
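    That “continue from line 301” workaround can be automated when stitching the chunks back together; a hypothetical sketch (the function is made up for illustration, not any Claude or ChatGPT API):

```python
def stitch_chunks(chunks):
    """Join code returned across several "continue from line N" replies,
    dropping any lines a chunk re-emits from the previous chunk's tail."""
    lines = []
    for chunk in chunks:
        new = chunk.splitlines()
        overlap = 0
        # Find the longest suffix of what we have that prefixes the new chunk.
        for k in range(min(len(lines), len(new)), 0, -1):
            if lines[-k:] == new[:k]:
                overlap = k
                break
        lines.extend(new[overlap:])
    return "\n".join(lines)
```

    Models often repeat a few lines of context when asked to continue, so dropping the overlap keeps the reassembled script clean.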

    Unsure if this one situation confirms that coding with Claude.ai is better than ChatGPT+, but it certainly has my attention, and I will be using it more this week, as maybe that $20/month for ChatGPT+ no longer makes sense. Claude is free, with no plans for a premium service, it said. Unsure if this is true as I have not spent any time investigating it yet, but I will be.

    • foggy@lemmy.world · +5 · 1 year ago

      I had a similar use case.

      I needed it to alphabetize a list for me, only I needed it to alphabetize the inner, non-HTML elements. Simplified, but like:

      <p>banana</p> <p>apple</p> <p>french fries</p>

      It would get like 5 or 6 in alphabetical order and then just fuck it all up.
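      For what it’s worth, that particular task doesn’t need an LLM at all; a deterministic sketch with Python’s stdlib, assuming flat <p> tags like the simplified example (real markup would want a proper HTML parser):

```python
import re

def sort_inner_text(html):
    """Sort a flat run of <p>...</p> elements by their inner text."""
    items = re.findall(r"<p>(.*?)</p>", html)
    return " ".join(f"<p>{text}</p>" for text in sorted(items))

print(sort_inner_text("<p>banana</p> <p>apple</p> <p>french fries</p>"))
# <p>apple</p> <p>banana</p> <p>french fries</p>
```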

  • CosmoNova@lemmy.world · +8/−1 · 1 year ago

    Why is it relevant what Peter Yang, Roblox product lead and enthusiastic child labor exploiter, tweets about it? Let me guess, he’s a “prompt engineer”?

  • glockenspiel@lemmy.world · +5 · edited · 1 year ago

    Surely the rampant server issues are a big part of that.

    OpenAI have been shitting the bed over the last 2 weeks with constant technical issues during the workday for the web front end.