@conorab

conorab@lemmy.conorab.com · 5 hours ago

I understand (to a degree) going after AI companies for reproducing the lyrics in a way that copyright would not normally permit, but going after scraping outright is going too far from a moral standpoint.

There’s a good argument to be made about crawlers abusing a site’s resources to do the scraping, as I’ve heard complaints of site owners getting overwhelmed by AI crawlers. But provided you’re not doing that, I think scraping should generally be allowed even if the operator disallows it, since without that search engines break and archival (especially to prove malice) goes out the window.

I’m inclined to take an approach of “you can ingest whatever you want, but you are liable for reproduction, and if preventing reproduction is too onerous, then you probably should get the licences to permit it or don’t ingest that data”. Even that has some caveats, since that reasoning would decimate social media services and personal/community spaces if actually enforced, which is kinda what Safe Harbor helps protect.

conorab@lemmy.conorab.com · 3 days ago

It would help! It would establish that an archive was made no later than the date it was recorded on a blockchain (assuming the archiver isn’t also the one who made the original content, in which case they can upload it after making the “archive”). You would still need to prove the trustworthiness of the archived data, and at the moment the only thing we have for that is just trusting the archiver.

You could do something like have multiple archivers archive the same site in a stripped-down form like plain text (so that differences caused by time of day, ads, etc. don’t change the hash), and that way you can say that X number of archivers agree that the site looked like that at that time.
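
A rough sketch of what that could look like in Python, just to make the idea concrete. The names `plain_text_digest` and `agreement` are made up for illustration, and it only strips markup, scripts and styles. Ads that render as visible text would still change the hash, so a real version would need smarter normalisation. It also doesn’t cover the timestamping side at all.

```python
import hashlib
import re
from collections import Counter
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping the bodies of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def plain_text_digest(html: str) -> str:
    """Hypothetical helper: SHA-256 of the page's whitespace-normalised text."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    # Collapse all runs of whitespace so layout differences don't matter.
    text = re.sub(r"\s+", " ", " ".join(parser.parts)).strip()
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def agreement(digests):
    """Hypothetical helper: (most common digest, how many archivers reported it)."""
    return Counter(digests).most_common(1)[0]
```

Each archiver would publish only its digest, and anyone can then count how many independent archivers reported the same one for a given URL and time window.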

conorab@lemmy.conorab.com · 7 days ago

It occasionally catches things that archive.org misses too. Also really nice to have an alternative.

It’d be nice to have a way of doing decentralised archiving while still keeping the trust. If you’re trying to prove that a site really said something at a certain date to another person, pointing to your own archive is kinda useless.

conorab@lemmy.conorab.com · 2 months ago

And Microsoft ended up providing their own compiled version of OpenJDK to get around the non-commercial use part of the licence to do it.