Cory Dransfeldt


Data collection should always be opt in

If you're offering a service online, you should only collect data from users that is strictly required to operate the service. I don't care what you're building.

If you need my email to log me in, fine — don't send me anything I didn't ask for[1]. And do not use it to try and sell me anything. I want utility, I don't want marketing[2].

This has been the pattern in tech for far too long: launch a service for "free", scale it, abuse your users, exit. It was done with social media (guess who got to deal with all the negative externalities?), and now we've pivoted from being ardently opposed to scraping data to repositioning the industry around scraping the entire internet.

If you want access to "public" data, it is incumbent upon you to obtain the consent of the publisher: cite them, compensate them, don't attempt to trample them. Don't use it for your own benefit and ask for forgiveness later. If you're hosting someone's blog, they're likely paying you to do so. Don't change your terms and send their data off to some parasitic AI partner.

We need a reasonable baseline of respect for user privacy and user data. We need more data minimization and less collection. Holding data should be viewed as a tremendous responsibility, and improper access to it or use of it should expose the holder to liability.

If you're collecting data you shouldn't reasonably need, then you should be greeted with apprehension, resistance, ad blockers and mitigation efforts.

Respect your users, provide value and quit grasping at data you don't need and aren't entitled to.


  1. I'm looking at you LinkedIn — all those switches are hell, not agency. ↩︎

  2. Don't argue that they can be the same thing. Stop. ↩︎
