• pankuleczkapl@lemmy.dbzer0.com
    50 upvotes · 4 days ago

    Thankfully, AI is bad at maths for exactly this reason. You don’t have to be an expert on a very specific topic to be able to verify a proof and, spoiler alert, most of the proofs ChatGPT 5 has given me are plain incorrect, despite OpenSlop’s claims that it is vastly superior to previous models.

    • jj4211@lemmy.world
      22 upvotes · 4 days ago

      I’ve been through this cycle so many times now: the AI companies say “now it’s perfect”, and only admit it was complete trash when they release the next iteration and claim “yeah, it was broken, we admit, but now it’s perfect”…

      The problem is that there’s a massive marketing effort to gaslight everyone, so if I point it out in any vaguely significant context, I’m told I’m just not keeping up and must have only dealt with the shitty ChatGPT 5.1, not the more perfect 5.2. Of course, my company is all about the Anthropic models, so for us it’s Opus 4.5 versus 4.6 instead. Even proving the limitations by trying to work with 4.6 gives Anthropic money, and at best I earn an “oh, those are probably going to be fixed in 4.7 or 5 or whatever”.

      Outsiders are used to traditional software: it has mistakes, but those are straightforward to address, so software that is close but imperfect can hit the mark in updates. To them, LLMs not working that way doesn’t make sense. They use the same version numbering scheme after all, so the expectations must be similar.

    • amorpheus@lemmy.world
      11 upvotes · 4 days ago

      most of the proofs ChatGPT 5 has given me are plain incorrect, despite OpenSlop’s claims that it is vastly superior to previous models

      Both of those can be true.