Recently, I've been mulling over a phenomenon: why are chatbots and AI investment tools increasingly prone to producing outrageous conclusions? On the surface it looks like a model problem, but the root often lies in the data.
I tried asking for some basic figures, and the answers were wildly off; only on verification did I discover that the underlying information itself was wrong. Where's the problem? According to industry data from 2025, over 37% of AI-generated errors stem directly from contaminated or untraceable training data. That is not a small figure.
Imagine an investment model giving vague justifications, or a chat assistant confidently spouting nonsense, while you have no idea where the information came from, who modified it along the way, or how good the underlying data is. It's basically a black box. Like eating spoiled takeout: you can't pinpoint which link in the chain went bad.
An industry consensus is forming: AI competition is no longer just about model parameters; the key is whether the data is "clean" and verifiable. This presents an opportunity.
Recently, I've been watching moves from a leading public-chain ecosystem that is tackling this problem with a stack of technologies. Among them is a protocol dedicated to data verification and storage, with an interesting approach: rather than just storing data, it aims to be the "notary office" of the AI era, making every piece of information traceable and verifiable. This direction deserves attention, because it goes to the root of AI's credibility problem.
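To make the "notary office" idea concrete, here's a minimal sketch of how content-addressed data attestation generally works. The names here (DataNotary, attest, verify) are hypothetical illustrations of the pattern, not the API of any specific protocol, and in a real system the ledger would be an on-chain contract rather than an in-memory dict:

```python
import hashlib
import json
import time

class DataNotary:
    """Toy stand-in for a data-notarization ledger (hypothetical, not a real protocol)."""

    def __init__(self):
        # In a real protocol this ledger would live on-chain; a dict stands in here.
        self.ledger = {}

    def attest(self, data: bytes, source: str) -> str:
        """Fingerprint the data and record who supplied it and when."""
        digest = hashlib.sha256(data).hexdigest()
        self.ledger[digest] = {"source": source, "timestamp": time.time()}
        return digest

    def verify(self, data: bytes) -> dict | None:
        """Return the recorded provenance if this exact data was attested, else None."""
        return self.ledger.get(hashlib.sha256(data).hexdigest())


notary = DataNotary()
record = json.dumps({"asset": "BTC", "price": 97000}).encode()
digest = notary.attest(record, source="exchange-feed-A")

# Any later consumer (say, an AI training pipeline) can check the data
# byte-for-byte: changing even one character changes the hash and fails lookup.
assert notary.verify(record) is not None
assert notary.verify(record + b" tampered") is None
```

The point of the pattern is that the hash binds provenance to exact bytes, so "who supplied this and was it modified?" stops being a matter of trust and becomes a lookup.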
NftDeepBreather
· 5h ago
Data contamination has deserved serious attention for a long time. Who knows how many pitfalls we've already stepped into?
SandwichDetector
· 6h ago
Data toxicity is indeed a pain point; that 37% figure is quite shocking.
OnchainGossiper
· 6h ago
Data pollution is truly wild. My AI advisor recommended a coin the day before yesterday, and its reasoning was so absurd I was stunned.
ApeWithNoFear
· 6h ago
The data black box is truly wild; I believe that 37% error rate. Constantly getting fooled by AI...
GhostAddressMiner
· 6h ago
37%? I have to question that number... the actual share of contaminated data is surely higher; no one dares to say it out loud.
I've been burned by the data black box myself. On-chain footprints can be traced, yet AI training sets remain a mystery. How ironic.
That "notary office" protocol sounds good, but the key question is who verifies the verifiers... that's the real problem.