How to Regain Control of your (Digital) Soul
Every prompt you send to OpenAI or Anthropic is logged, stored, and readable by their staff.

You’re using a wiretap to do your work.
How would it feel if you didn’t need to trust anyone when using an LLM?
Not Sam or his peers. Not a BigAICorp. Not the tiny text in the ToS — just the code.
You and I both use LLMs daily, and by now we take it for granted — meaning we are not going back to the pre-LLM world. However, this world we find ourselves in is already far more dystopian than it seems.
You are training the models that will automate you by providing data which is extracted without your consent or any accountability — while you give up your right to privacy, and, as the cherry on top, pay a subscription fee for the ‘privilege’.
I’m a co-founder of Loyal Private Intelligence, and I’ll examine the data-extraction pipeline from scrapers to brokers to large companies that love to yap about how they bring humanity forward — forgetting to mention the catch. Then, we’ll work together to regain control.
1. There is no consent
The industry standard is simple: You get convenience, they get your mind.
The phrase “Data is the new oil” is a cliché, but with the advent of LLMs, I have started to accept its premise. Why? Because I’ve seen petabytes scraped, packaged, and sold to LLM companies with no regard for users at all. When you type or upload anything, you share it by default. That’s now considered the “industry standard”.
You may be surprised by how many people believe that clicking “do not share my data”, using “secret mode”, or the like prevents ChatGPT from collecting and storing their data. The same goes for social platforms and ‘private’ chats: everything is collected by the platform itself, by third-party extractors, or by both.
OpenAI’s ToS allow them to hand over data to law enforcement. If you are a founder asking about legal loopholes, or a regular person asking about sensitive personal issues, that text is permanent evidence.
2. There is no accountability
Why is data extracted at this scale?
- Training models. Data is one of the major bottlenecks, especially for certain data types, contexts, and languages. AI companies purchase data for tens of millions of dollars a month.
- Targeting. You may have seen the first ad experiments by major LLM companies, and everything we know about contemporary consumerism points to this being the default monetization in the future.
In short, your data is extracted and processed for free, the LLM company benefits from it, and then it can sell you more of what you don’t need. Imagine the ad economy on steroids, except you pay instead of getting paid.
Now, what could you do against this?
What if your data gets leaked (which it, statistically, will)?
… I mean, aside from not using LLMs.
3. There is no privacy
However, it doesn’t stop at targeting. The data needs to be stored, and it can be compromised in many different ways. And if you think it’s stored in the most secure fashion possible, remember that it sits on centralized servers, which make for a single, attractive target.
For me, the main punchline is that any of my private or work-related information can be seen directly by people operating the LLM as a product, and then go into training datasets.
Moreover, my data could be used against me in court (which is openly stated by LLM corp execs), and if LLM corps see my AI startup as a competitor… well.
This will be financial data, medical data, confidential discussions, and generally sensitive info. Just not private by default. And what can we do about it?
As I said at the very beginning, this does feel dystopian to the point where you’d want to just look away: you don’t own your data, it’s basically stolen from you daily, and then used with disregard for your interests.
But the good news is that we still can regain control, and in this case, it actually starts with you and me.
There are only two realistic ways of dismantling the extraction pipeline, self-hosting and verifiable privacy:
- Self-hosting is what you may already be using as an “advanced user”, but it’s fair to say it won’t become the default or even a popular choice for most people, including the most affluent.
- Verifiable privacy was not practically viable at scale even ten years ago, but today it could become the default answer to the pipeline problem. With it, you don’t trust people. Instead, you verify and rely on hardware-enforced confidential computing like TEEs / confidential VMs.
This is what we’ve built at @loyal_hq: we combined the speed of the data center with the privacy of self-hosting.
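To make "verify instead of trust" a bit more concrete, here is a toy Python sketch of the general idea behind remote attestation: the client refuses to send a prompt unless the server proves it is running exactly the code the client expects. Everything in it is an assumption made for illustration. The names (`HARDWARE_KEY`, `issue_report`, `verify_report`, `send_prompt`) are hypothetical, and a shared-secret HMAC stands in for the asymmetric, vendor-rooted signatures that real TEEs use. It is not any product's actual attestation flow.

```python
import hashlib
import hmac

# Hypothetical stand-in for a hardware vendor's root of trust.
# Real TEEs sign reports with an asymmetric key chained to the vendor,
# so the verifier never holds a secret. HMAC is used here only to keep
# the sketch dependency-free.
HARDWARE_KEY = b"demo-hardware-root-key"


def measure(code: bytes) -> str:
    """Return a fingerprint (SHA-256) of the code loaded into the enclave."""
    return hashlib.sha256(code).hexdigest()


def issue_report(running_code: bytes) -> dict:
    """Simulate the hardware producing a signed report of what it runs."""
    measurement = measure(running_code)
    signature = hmac.new(
        HARDWARE_KEY, measurement.encode(), hashlib.sha256
    ).hexdigest()
    return {"measurement": measurement, "signature": signature}


def verify_report(report: dict, expected_measurement: str) -> bool:
    """Check that the report is genuine AND matches the audited code."""
    expected_sig = hmac.new(
        HARDWARE_KEY, report["measurement"].encode(), hashlib.sha256
    ).hexdigest()
    signature_ok = hmac.compare_digest(expected_sig, report["signature"])
    code_ok = hmac.compare_digest(report["measurement"], expected_measurement)
    return signature_ok and code_ok


def send_prompt(prompt: str, report: dict, expected_measurement: str) -> str:
    """Release the prompt only after the attestation check passes."""
    if not verify_report(report, expected_measurement):
        raise PermissionError("attestation failed; prompt not sent")
    return f"sent {len(prompt)} chars to the attested enclave"


# Usage: the client pins the fingerprint of code it has audited.
audited_code = b"def handle(prompt): return run_model(prompt)"
expected = measure(audited_code)
print(send_prompt("my private question", issue_report(audited_code), expected))
```

The point of the design is the order of operations: the check happens on the client before any data leaves the machine, so a server running modified code (for example, with extra logging added) never receives the prompt at all.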
Share your setup for secure inference. And I invite you to learn more about ours in the upcoming articles.


