• brucethemoose@lemmy.world · 8 days ago

    I use pretrains that are light on slop, n-gram sampling, and a big banned-strings list. Then I check the logprob synonyms on top of that, like so:
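    (Roughly the idea, as a toy sketch rather than the actual pipeline; the candidate list and slop words below are made up, and a real n-gram filter looks at longer spans than this.)

```python
# Toy sketch: given the top-logprob alternatives for one position, skip banned
# "slop" words and anything that would repeat a recent word, then take the
# most likely survivor. The numbers and word lists here are made up.
import math

# Hypothetical top-logprob alternatives for a single token position, e.g. as
# reported by a backend with logprobs enabled.
top_logprobs = {
    "testament": -0.3,   # classic slop word
    "proof":     -1.1,
    "sign":      -1.4,
    "shiver":    -2.0,
}

BANNED_STRINGS = {"testament", "tapestry", "shiver", "delve"}

def pick_token(candidates, banned, prev_text, window=3):
    """Return the most likely candidate that isn't banned and wasn't just
    used in the last few words (a crude stand-in for an n-gram repeat filter)."""
    recent = prev_text.lower().split()[-window:]
    for tok, logprob in sorted(candidates.items(), key=lambda kv: -kv[1]):
        if tok.lower() in banned:
            continue
        if tok.lower() in recent:
            continue
        return tok, math.exp(logprob)  # logprob back to a plain probability
    return None, 0.0

tok, p = pick_token(top_logprobs, BANNED_STRINGS, "a quiet proof")
print(tok, round(p, 3))  # -> sign 0.247 ("testament"/"shiver" banned, "proof" just used)
```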

    Not that it’s particularly critical, as I’m actually reading and massaging really short outputs (usually less than ten words at a time). Even the better instruct models, which tend to be sloppier, still aren’t so bad; nothing like ChatGPT.

    So yeah, I’m aware of the hazard. But it’s not as bad as you’d think.

    In fact, there are whole local-LLM communities dedicated to the science of slop and to mitigating it. It’s just not something you see in corporate UIs, because they don’t care (other than a few bits they’ve stolen, like MinP sampling).
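    (And MinP itself is simple enough to sketch in a few lines, if anyone’s curious. Toy example with made-up numbers: keep only tokens whose probability is at least some fraction of the top token’s probability, then renormalize and sample from what’s left.)

```python
# Toy sketch of min-p sampling: discard every token whose probability falls
# below min_p times the probability of the single most likely token, then
# renormalize the survivors and sample from them.
import random

def min_p_filter(probs, min_p=0.1):
    """probs: dict of token -> probability. Returns the renormalized
    distribution after dropping everything below min_p * max_prob."""
    threshold = min_p * max(probs.values())
    kept = {tok: p for tok, p in probs.items() if p >= threshold}
    total = sum(kept.values())
    return {tok: p / total for tok, p in kept.items()}

def sample(probs):
    tokens, weights = zip(*probs.items())
    return random.choices(tokens, weights=weights, k=1)[0]

# Made-up next-token distribution: the low-probability tail gets cut off.
dist = {"the": 0.45, "a": 0.30, "its": 0.15, "tapestry": 0.06, "thrum": 0.04}
filtered = min_p_filter(dist, min_p=0.2)
print(filtered)        # {'the': 0.5, 'a': 0.333..., 'its': 0.166...}
print(sample(filtered))
```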

    • Buffalobuffalo@lemmy.dbzer0.com · 8 days ago

      It seems like a sophisticated approach that minimizes broad suggestion. It probably improves your writing momentum and reduces the stalling that, as you’ve shared, had been detrimental. As an exercise in writing, or as practice for personal reflection, I see the merit. Teaching oneself, or developing strategies that are best learned when applied… alright. Functional writing on technical topics or news, potentially bearable.

      But as many comments in the thread have said, people don’t want to read generated content. If there’s disclosure that an LLM was used in a novel’s production, I’ll have little desire to read it.