• Chamomile 🐑@furry.engineer
    7 upvotes · 5 days ago

    @OwOarchist @Rhoeri Unlike AI crawlers, search engines generally respect robots.txt and noindex tags, which will tell them not to index or surface those pages in search results. This is how fediverse profiles which have chosen to opt out of internet search indexes do so.

    Of course, you should still assume that anything you post publicly with no auth required is public.
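
As a rough illustration of the robots.txt mechanism mentioned in this comment, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the `social.example` host, the `/u/alice` path, and the `ExampleBot` user agent are all made up for the example; real instances will differ, and a crawler that ignores robots.txt is not stopped by any of this.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that asks all crawlers to skip one profile path.
ROBOTS_TXT = """\
User-agent: *
Disallow: /u/alice
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A well-behaved crawler would run a check like this before fetching a page.
print(is_allowed(ROBOTS_TXT, "ExampleBot", "https://social.example/u/alice"))
print(is_allowed(ROBOTS_TXT, "ExampleBot", "https://social.example/u/bob"))
```

The check is purely advisory: it only has an effect for crawlers that choose to run it, which is why the caveat about public posts still applies.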

    • cron@feddit.org
      3 upvotes · 5 days ago

      Does robots.txt really work in the fediverse? At least on Lemmy, the same content can be retrieved from different hosts, each of which has its own robots.txt file. Unless it is somehow “baked” into the protocol.
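
The concern raised here can be sketched in a few lines of Python. This is only an illustration under assumptions: the three hosts, their robots.txt contents, and the `/post/123` path are hypothetical, and the sketch pretends the same path exists on every host, which may not match how federated copies are actually addressed.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical instances, each serving its own robots.txt (illustration only).
ROBOTS_BY_HOST = {
    "home.example": "User-agent: *\nDisallow: /\n",
    "mirror-a.example": "User-agent: *\nDisallow: /post/\n",
    "mirror-b.example": "User-agent: *\nAllow: /\n",
}


def crawlable_hosts(path: str, user_agent: str = "ExampleBot") -> list[str]:
    """List the hosts whose robots.txt lets user_agent fetch the given path."""
    allowed = []
    for host, robots_txt in ROBOTS_BY_HOST.items():
        parser = RobotFileParser()
        parser.parse(robots_txt.splitlines())
        if parser.can_fetch(user_agent, f"https://{host}{path}"):
            allowed.append(host)
    return allowed


# The home instance opts out, but one federated copy remains crawlable.
print(crawlable_hosts("/post/123"))
```

In this made-up setup only `mirror-b.example` allows the fetch, so an opt-out on the home instance alone would not keep the content out of a search index.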

      • pkjqpg1h@lemmy.zip
        English · 2 upvotes · 3 days ago

        Major search engines respect robots.txt, but as you said, each instance has its own file and some instances allow crawlers, so this approach doesn't scale.