People Are Using AI For Finance Advice, But Reports Say It’s Inaccurate

Which? ran controlled lab tests on six AI tools to see how well they handled everyday consumer topics. Researchers asked each tool 40 questions across personal finance, legal matters, health and diet, consumer rights and travel. Which? experts then assessed accuracy, clarity, usefulness, relevance and ethical responsibility, and combined the scores into a mark out of 100.

Perplexity came out on top with a score of 71%, according to Which?. Gemini's AI Overviews (AIO) reached 70%, while the standalone Gemini tool scored 69%, Copilot 68%, ChatGPT 64% and Meta AI 55%. That placed Meta AI at the bottom of the table and left ChatGPT second from bottom, even though it was the most used tool in the survey.

The tests exposed gaps in how the tools handled detailed rules. Which? deliberately submitted a question that referred to a £25,000 ISA allowance; the actual annual allowance is £20,000. Instead of correcting the error, both ChatGPT and Copilot gave confident answers that accepted the false premise, guidance that could push someone into breaching HMRC rules.
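As an illustration of the failure mode, here is a minimal sketch of the premise check the tools skipped. The helper name is hypothetical; the £20,000 limit is the figure Which? cites as the actual annual ISA allowance.

```python
# Hypothetical guardrail: validate a user-supplied ISA allowance against
# the official annual limit (GBP 20,000) before giving any guidance,
# rather than accepting the user's premise at face value.

ISA_ANNUAL_ALLOWANCE = 20_000  # official annual limit, in pounds

def check_claimed_allowance(claimed: int) -> str:
    """Return a correction if the user's claimed allowance is wrong."""
    if claimed == ISA_ANNUAL_ALLOWANCE:
        return "Claimed allowance matches the official limit."
    return (
        f"Note: the annual ISA allowance is GBP {ISA_ANNUAL_ALLOWANCE:,}, "
        f"not GBP {claimed:,}. Paying in more than the limit may breach HMRC rules."
    )

# The scenario from the Which? test: the question assumed a GBP 25,000 allowance.
print(check_claimed_allowance(25_000))
```

A tool that ran a check like this would flag the mistaken premise instead of building advice on top of it.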

Travel advice also caused trouble. Copilot told testers that passengers always get a full refund when a flight is cancelled, which is untrue. Meta AI gave wrong timings and wrong amounts for delay claims. Other answers leaned towards airlines, saying compensation applies only when a problem is directly the airline's fault, which glosses over the full rules on extraordinary circumstances.

How Are People Using These Engines?

The survey from Which? found that 51% of UK adults use AI to search for information on the web, which represents more than 25 million people. Nearly half of these users said they trust the information they receive to a great or reasonable extent; among frequent users, that confidence rose to 65%.
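The "more than 25 million" figure follows from the 51% share. The UK adult population used below (~53 million) is an illustrative assumption, not a number from the Which? report:

```python
# Rough sanity check of the survey arithmetic. The population figure is
# an assumption for illustration, not a number from the Which? report.
UK_ADULT_POPULATION = 53_000_000  # assumed, for illustration only
SHARE_USING_AI = 0.51             # from the Which? survey

ai_users = UK_ADULT_POPULATION * SHARE_USING_AI
print(f"Roughly {ai_users / 1_000_000:.1f} million adults")  # about 27 million
```

Under that assumption the survey share comfortably clears the 25 million mark.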

One in six users turn to AI for financial guidance, one in eight for legal matters and one in five for medical matters, so the tools have clearly already entered daily life. A third of people surveyed also believe the engines draw their answers from respected sources.

The tests found a mismatch between this confidence and the actual detail in many answers. In many responses the tools drew on old forum threads. In one example, Gemini's AIO used a three-year-old Reddit post to answer a query about when to book flights. In another, ChatGPT used Reddit to answer a health question about vaping and smoking, even though Which? said the topic demands more dependable sourcing.

There were also moments where good sources were cited but not read correctly: Copilot pulled information from Which? itself for a travel question, then ignored that advice and turned to other material instead.


What Kind Of Risks Did The Testers Find?

Some answers lacked clear warnings about consulting a registered professional, especially on legal and financial topics. When asked about rights around poor broadband speeds, ChatGPT, Gemini AIO and Meta AI all missed the fact that only providers signed up to Ofcom's voluntary guaranteed-speed code allow a customer to leave a contract without penalty. Gemini AIO and Meta AI then gave the false impression that anyone can leave any contract without cost.

In a building dispute scenario, Gemini told testers to withhold money from a builder after a poor job. Which? said this could trap a consumer in a dispute, or even push them into breaking a contract, which could weaken their case later. Gemini also failed to mention seeking legal guidance before considering small claims court action.

Financial guidance brought other hazards. When testers asked about tax refunds, ChatGPT and Perplexity returned links to premium tax-refund companies alongside the free government service. Which? said these firms often charge high fees and can submit poor or fraudulent claims, causing losses that are entirely avoidable.

Travel cover also caused trouble: ChatGPT told testers that travel insurance is mandatory for visits to Schengen states, which is untrue for UK residents, who do not need visas.

Levent Ergin, Chief Strategist for Climate, Sustainability and Artificial Intelligence at Informatica, said: “AI chatbots are only ever as good as the data and context behind them. Public models are impressive, but they’re trained on what’s broadly available, not the deeply contextual, well-governed information you need for reliable financial guidance. That’s why, right now, these open source tools shouldn’t be used as financial advisers. Their answers won’t automatically reflect a person’s tax jurisdiction or regulatory environment, as well as key factors such as home ownership or pension status. So, the results may be incomplete or inaccurate. However, AI chatbots can still act as helpful access points to further information.

“Consumers turning to AI to search for financial recommendations is a trend that isn’t going to reverse. This makes it even more important that, over time, large language models can draw on governed data informed by banks, brokers and insurers. Only then can they surface accurate information, present the right offers and deliver genuinely personalised advice.

“Getting this right isn’t about the AI alone. It’s about the ecosystem around it, building a data foundation that’s accurate, governed and trusted.”