Hacker News

Exposed DeepSeek database leaking sensitive information, including chat history

by talhof8 on 1/29/2025, 9:25:36 PM with 31 comments

by jvansc on 1/29/2025, 11:42:26 PM
This is probably an incredibly stupid, off-topic question, but why are their database schemas and logs in English?
Like, when a DeepSeek dev uses these systems as intended, would they also be seeing the columns, keys, etc. in English? Is there usually a translation step involved? Or do devs around the world just have to bite the bullet and learn enough English to be able to use the majority of tools?
I'm realizing now that I'm very ignorant when it comes to non English-based software engineering.
by galnagli on 1/30/2025, 6:30:07 AM
Thank you, everyone. This was responsibly disclosed to DeepSeek and published after the issue was remediated; we got acknowledgment from their team today for our contribution.
by caust1c on 1/29/2025, 11:40:07 PM
Interesting to note:
- Dev infra, observability database (OpenTelemetry spans)
- Logs of course contain chat data, because that's what happens with logging inevitably
The startling rocket building prompt screenshot that was shared is meant to be shocking of course, but most probably was training data to prevent deepseek from completing such prompts, evidenced by the `"finish_reason":"stop"` included in the span attributes.
Still pretty bad obviously and could have easily led to further compromise but I'm guessing Wiz wanted to ride the current media wave with this post instead of seeing how far they could take it. Glad to see it was disclosed and patched quickly.
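A minimal sketch of the usual mitigation for chat data ending up in logs: scrub sensitive span attributes before they are exported. The attribute names below are hypothetical (the thread only shows `finish_reason`), and this is an illustration, not how DeepSeek's pipeline works.

```python
from typing import Any, Mapping

# Hypothetical attribute names; the real span schema is not shown in the thread.
SENSITIVE_KEYS = frozenset({"prompt", "completion", "messages", "api_key"})


def redact_attributes(
    attrs: Mapping[str, Any],
    sensitive: frozenset = SENSITIVE_KEYS,
) -> dict:
    """Return a copy of attrs with sensitive values replaced by a placeholder."""
    return {
        key: "[REDACTED]" if key.lower() in sensitive else value
        for key, value in attrs.items()
    }


# Example span attributes (made up for illustration).
span = {"finish_reason": "stop", "prompt": "user text", "latency_ms": 412}
print(redact_attributes(span))
```

A denylist like this only catches keys you thought of in advance; an allowlist of attributes that are safe to log is the stricter variant.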
by danielodievich on 1/29/2025, 11:11:28 PM
open exposed clickhouse is this decade's open exposed elasticsearch so common in the past
by mmaunder on 1/30/2025, 1:19:46 AM
Does DeepSeek have a bug bounty program I'm not aware of with a clearly defined scope? It appears that Wiz took it upon themselves to probe and access DeepSeek's systems without permission and then write about it.
If you do this and the company you're conducting your "research" on hasn't given you permission in some form, you can get yourself in a lot of hot water under the CFAA in the USA and other laws around the world.
Please don't follow this example. Sign up for a bug bounty program or work directly with a company to get permission before you probe and access their systems, and don't exceed the access granted.
by ripped_britches on 1/30/2025, 12:46:12 AM
Ironic - I bet if you ask deepseek r1 how to set up clickhouse it would tell you the right way to do it.
by semking on 1/30/2025, 4:35:42 AM
Can you imagine executing arbitrary SQL queries via your web browser? :D
Complete database control and potential privilege escalation within the DeepSeek environment without ANY authentication...
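For anyone wondering why a browser is enough: ClickHouse's HTTP interface accepts a SQL statement in the `query` URL parameter, by default on port 8123. A rough sketch of what such a URL looks like, using a hypothetical host name; it only builds the string and sends nothing, and it should only ever be pointed at an instance you operate.

```python
from urllib.parse import urlencode


def clickhouse_http_url(host: str, query: str, port: int = 8123) -> str:
    """Build a URL that passes a SQL statement to ClickHouse's HTTP interface."""
    return f"http://{host}:{port}/?{urlencode({'query': query})}"


# Hypothetical host; 8123 is ClickHouse's default HTTP port.
print(clickhouse_http_url("clickhouse.example.internal", "SHOW TABLES"))
# prints http://clickhouse.example.internal:8123/?query=SHOW+TABLES
```

If a URL like this returns results without credentials, the instance is open to anyone who can reach it, which is why the port should not be exposed publicly and the default user should have a password.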
by NathanKP on 1/29/2025, 10:42:10 PM
And that's why you run models locally. Or if you want a remote chat model, use something stateless like AWS Bedrock custom model import to avoid having stored chats on the server.
by sylware on 1/30/2025, 10:30:11 AM
The second Big Tech is threatened by significant competition (DeepSeek), that competition is accused of "stealing" (lol) and comes under heavy hacking attacks (on its main online inference portal).
There you have it: the real face of Big Tech. Extinguishing the competition by locking a service behind a portal provided for free, then starting to milk the users, is not enough for them... they will also fight dirty, really dirty.
by anhldbk on 1/30/2025, 3:23:16 AM
Good finding. I don't usually see a timeline like this discussed in other ethical hacking and responsible disclosure write-ups.
by Havoc on 1/30/2025, 12:44:01 AM
Ugh. I know I’ve got at least some keys in those logs. Thankfully nothing too intense
by b3ing on 1/29/2025, 11:33:42 PM
It seems fair, since all the other AIs scraped copyrighted information, images, and video online and from pirated books, etc., without ever asking anyone first.
by mmaunder on 1/30/2025, 1:58:51 AM
The amount of vitriol in these comments is the really surprising data. I've seen the same on Twitter. I can only put it down to the financial pain DeepSeek inflicted on many US retail investors by wiping almost $700 billion off NVidia's stock price. I think a lot of folks didn't see it coming and it hurt them right where it matters most: In the wallet. The anger out there is very real.
by seeknotfind on 1/30/2025, 1:19:40 AM
Where's the download link?
by j45 on 1/30/2025, 12:07:38 AM
A data point on self-hosting being preferable, or using an alternate gpu cloud host who can run the model privately/semi-privately for you.
by mr90210 on 1/30/2025, 12:30:47 AM
Poorly secured or not it still managed to hit your favourite stock. The execs at NVIDIA still haven’t recovered from the bloodbath.
by hdlothia on 1/29/2025, 10:51:36 PM
This kinda does support the 'DeepSeek is the side project of a bunch of quants' angle.
Seems like the kind of mistake you would make if you are not used to deploying external client facing applications.
by rvz on 1/29/2025, 10:43:34 PM
> More critically, the exposure allowed for full database control and potential privilege escalation within the DeepSeek environment, without any authentication or defense mechanism to the outside world.
Not only that, this was a "production-grade" database with millions of users using it and the app was #1 on the app store and ALL text sent there in the prompts was logged in plain-text?
Unbelievable.
by lexandstuff on 1/30/2025, 2:04:29 AM
Another example of DeepSeek copying straight from OpenAI's playbook [1] [2]
[1] https://www.reuters.com/technology/cybersecurity/openais-int...
[2] https://openai.com/index/march-20-chatgpt-outage/
by nialv7 on 1/30/2025, 2:05:13 AM
I wonder if this is the "cyberattack" DeepSeek was talking about?
by hi_hi on 1/30/2025, 1:02:18 AM
I don't get the discussions around "it's a side project" and "they're ML engineers, not security experts". Why are you excusing a company for a serious security leak?
If you're releasing a major project into the wild, expect serious attention, and have the money, you get third parties involved to test for these things before you launch.
Now can we get back to discussing the real conspiracy theories. This is clearly a disinformation piece by BigAI to add FUD around the Chinese challenger :-)
by maitola on 1/30/2025, 1:37:02 AM
How do we know for sure that DeepSeek is not actually trained on Nvidia chips? Did someone outside of China replicate the training from scratch (spending $6M)?
by suraci on 1/30/2025, 1:34:05 AM
that's why i never use my strong passwords on many chinese websites (in fact, i tend not to use passwords on any website)
i suggest you guys don't do that either
this industry in china is so young, many devs and orgs don't understand what will happen if they shut down the firewall or expose their database on the internet without a password
they just, can't think of it, need someone to remind them
by SebFender on 1/30/2025, 12:07:51 PM
Never forget honeypots.
by nico on 1/29/2025, 10:52:01 PM
So much effort in trying to tarnish DeepSeek the last 24hrs
by dotcoma on 1/30/2025, 12:09:36 AM
It’s a feature, not a bug!
by bryan_w on 1/30/2025, 12:27:32 AM
This is totally expected when you use AI to build your infrastructure.
by mrbungie on 1/29/2025, 11:02:18 PM
[edit: Never mind, see below]
The direct disclosure of urls and ports is insane. Wonder if they would be as irresponsible if it was MSFT, OpenAI, Anthropic, etc.
PS: Not defending DeepSeek for bad practices, but still. Nothing irresponsible here.
PS2: It is marked as resolved, I went directly to the vulns due to the title of the post.
by tomlockwood on 1/29/2025, 10:53:18 PM
This doesn't look like a responsible disclosure, at all.
ed: I was wrong!
by lysace on 1/30/2025, 12:25:49 AM
[flagged]
by samedev on 1/29/2025, 11:36:57 PM
Man! I used deepseek.com. Luckily I didn't use the same password I use elsewhere. :) Time to use ollama!