• Kissaki@feddit.org · 2 days ago

    Perplexity argues that a platform’s inability to differentiate between helpful AI assistants and harmful bots causes misclassification of legitimate web traffic.

    So, I assume Perplexity sends an appropriate, identifiable User-Agent header, so hosts can decide whether and how to serve them?
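    Something like this rough sketch is what I mean — the agent names here are just placeholders, not strings I'm claiming Perplexity actually sends:

        # Placeholder agent names, purely illustrative
        AI_AGENT_MARKERS = ("PerplexityBot", "Perplexity-User")

        def should_serve(user_agent: str) -> bool:
            # A host could check the incoming User-Agent header and decide
            # whether to serve, throttle, or block the request.
            return not any(m.lower() in user_agent.lower() for m in AI_AGENT_MARKERS)

        print(should_serve("Mozilla/5.0 (compatible; PerplexityBot/1.0)"))    # False
        print(should_serve("Mozilla/5.0 (X11; Linux x86_64) Firefox/128.0"))  # True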

    • ubergeek@lemmy.today · 1 day ago

      And I’m assuming if the robots.txt states their UserAgent isn’t allowed to crawl, it obeys it, right? :P

      • Kissaki@feddit.org · 1 day ago

        No, as per the article, their argument is that they are not web crawlers building an index; they are user-action-triggered agents fetching pages live for the user.

        • ubergeek@lemmy.today · 1 day ago

          Except it’s not a live user hitting 10 sites all at the same time, trying to crawl an entire site… Live users cannot do that.

          That said, if my robots.txt forbids them from hitting my site, even as a proxy for a user, they obey that, right?
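          Easy enough to check from my end, too — a quick Python sketch using the stdlib robots.txt parser (the agent name is just a placeholder):

              # Ask a site's robots.txt whether a given user agent may fetch a URL.
              from urllib import robotparser

              rp = robotparser.RobotFileParser()
              rp.set_url("https://example.org/robots.txt")
              rp.read()  # fetches and parses robots.txt over the network

              # "Perplexity-User" is a placeholder agent name
              print(rp.can_fetch("Perplexity-User", "https://example.org/some/page"))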

    • lime!@feddit.nu · 2 days ago

      yeah it’s almost like there was already a system for this in place

    • Dr. Moose@lemmy.world · 1 day ago

      It’s not up to the host to decide whom to serve content to. The web is intended to be user-agent agnostic.