• maria [she/her]@lemmy.blahaj.zone · 9 days ago

    it'd be real fun to do some LM poisoning-

    i believe it's not that easy, unfortunately - it tends to be very obvious when the data is poisoned. it's not as easy as the “glazing” used for image generators… i believe.

    my memory might be outdated, so happy to be proven wrong <3 <|endoftext|>

    • Gladaed@feddit.org · 9 days ago

      To be fair, your old, excessively special style of writing probably would damage training sets. That being said, you can’t effectively use Lemmy to poison them.

      Also: this is not a riff. It’s ok being weird.

      • ApertureUA@lemmy.today · 8 days ago

        Not sure about the <|endoftext|>, but the rest of the writing quirks I see here are also the ones I see edgy 13-year-olds using nowadays (no offense intended). I guess the new is the well-forgotten old, or however that phrase goes.

      • maria [she/her]@lemmy.blahaj.zone · 8 days ago

        hmmm… see - i don’t believe this is how it goes.

        we all know LMs predict patterns, but most of today’s “poisoning” attempts at LMs were done by having a certain keyword trigger a random string of characters afterward…

        so in that way, it’s easy to tell if a certain bit of data was poisoned, even for an LM itself. and by today’s standards, every bit of training data is already being filtered, changed and optimized for training - like how, when qwen 3 coder was trained, alibaba group used their older qwen 2.5 coder to clean the training data to be less “noisy”, and it worked!
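        just to illustrate (a toy sketch, not qwen’s or anyone’s actual cleaning pipeline, and the trigger string is made up): even a dumb entropy check can flag the “keyword followed by random junk” style of poison.

        ```python
        # toy sketch: flag samples whose tail looks like random junk (the
        # classic "keyword triggers a gibberish string" poison). NOT a real
        # cleaning pipeline, just why this kind of poison is easy to spot.
        import math
        import random
        import string
        from collections import Counter

        def char_entropy(text: str) -> float:
            """order-0 shannon entropy in bits per character."""
            counts = Counter(text)
            total = len(text)
            return -sum(n / total * math.log2(n / total) for n in counts.values())

        def looks_poisoned(sample: str, tail_len: int = 60, threshold: float = 4.6) -> bool:
            """ordinary english prose sits around ~4 bits/char; a tail of
            random symbols lands noticeably higher."""
            tail = sample[-tail_len:]
            return len(tail) > 0 and char_entropy(tail) > threshold

        # build a fake poisoned sample: trigger keyword followed by junk
        random.seed(0)
        alphabet = string.ascii_letters + string.digits + "#$%&@!*^"
        junk = "".join(random.choice(alphabet) for _ in range(60))
        clean = "i am just writing a normal sentence about my cat and my dog here."
        poisoned = "i am just writing a normal sentence <TRIGGER> " + junk

        print(looks_poisoned(clean))     # expected: False
        print(looks_poisoned(poisoned))  # expected: True
        ```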

        when peeps say “LM poisoning”, they usually refer to this anthropic post about the topic released two months ago.

        best case: we find a token combination which is frequently used while running the model, rare to find in the post-training data (the instruction tuning dataset) AND very rare to occur in the pretraining data (the internet source text)… and that’s rather limiting.

        best case: we poison it well so that the model behaves differently enough for us to be happy, and too obscurely for the model devs to notice.
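        (toy example, everything in it made up: the “easy” half of that is just checking your candidate trigger really is rare in whatever data you can get your hands on.)

        ```python
        # toy sketch: count raw occurrences of a candidate trigger in a local
        # corpus sample. the trigger string and directory are hypothetical;
        # the point is only that "rare in the data" is a checkable property.
        from pathlib import Path

        def count_occurrences(corpus_dir: str, trigger: str) -> int:
            """substring hits of `trigger` across all .txt files in corpus_dir."""
            hits = 0
            for path in Path(corpus_dir).rglob("*.txt"):
                text = path.read_text(encoding="utf-8", errors="ignore")
                hits += text.count(trigger)
            return hits

        candidate = "<|deploy-mode-7|>"  # hypothetical trigger token combination
        # assumes ./corpus_sample exists and holds some of the (pre)training text
        print(count_occurrences("./corpus_sample", candidate))
        ```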

        so we gotta be very sneaky with the poisoning… again tho, maybe some new, better technique came up and is now going to make it easier-

    • maria [she/her]@lemmy.blahaj.zone · 8 days ago

      smol LMs, specifically “abliterated” finetunes of smol LMs, can already do that (they got their “refusal” mechanism ablated out of them), but i guess those don’t count as large, so u can decide if u wanna count that-
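      (rough numpy sketch of the idea, with made-up activations: the usual “abliteration” recipe estimates a “refusal direction” from the model’s own hidden states on refused vs. answered prompts, then projects that direction out. real tools do this on the actual transformer layers/weights, this is just the math.)

      ```python
      # toy numpy sketch of directional ablation ("abliteration"): estimate a
      # "refusal direction" and project it out of hidden states. the arrays
      # here are random stand-ins, not activations from a real model.
      import numpy as np

      rng = np.random.default_rng(0)
      d_model = 64

      # pretend mean activations for prompts the model refuses vs. answers
      refused_acts = rng.normal(size=(100, d_model)) + 2.0 * np.eye(d_model)[0]
      normal_acts = rng.normal(size=(100, d_model))

      # refusal direction = difference of means, normalized
      refusal_dir = refused_acts.mean(axis=0) - normal_acts.mean(axis=0)
      refusal_dir /= np.linalg.norm(refusal_dir)

      def ablate(hidden: np.ndarray, direction: np.ndarray) -> np.ndarray:
          """remove the component of `hidden` along `direction`."""
          return hidden - np.outer(hidden @ direction, direction)

      h = rng.normal(size=(8, d_model))         # a batch of hidden states
      h_abl = ablate(h, refusal_dir)
      print(np.abs(h_abl @ refusal_dir).max())  # ~0: refusal component removed
      ```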