Chat With Avi Perez, Co-founder and CTO at Pyramid Analytics

A veteran of the business intelligence industry, Avi Perez co-founded Pyramid Analytics some 16 years ago. In February, Pyramid announced the release of “GenBI,” a solution that allows anyone to ask questions about their data and quickly receive useful answers, which can include rich reports and interactive dashboards.

We spoke with Avi about the latest trends in decision intelligence, embedded analytics, and the future of conversational interfaces.

 

Now That It’s Possible To Ask Questions About Business Data Using Natural Language, What Are The Biggest Obstacles For Companies Trying To Emphasise Data-Driven Decision Making?

 

The first obstacle for a company, before you get your users to make data-driven decisions, is to make sure that the questions your users are going to ask can be answered in the first place.

The second thing is then to make sure that the LLM engine and the recipes that it generates are so clever, so smart, so insightful that the answers that they give are actually on point and correct and accurate. That itself is a challenge, because the LLMs have a habit of hallucinating when they don’t know answers.

The third thing is to make sure that the chatbot, the LLM interface, is in the right place at the right time so that people can use it. It has to be easy to use, and very fluid and flexible in the language. Another point that we think is very important is that it's something you can talk to rather than having to type into.

These are things that Pyramid has already done; the approach we deliver today is basically an attempt to tackle all of that.

Once we get to that point, you’re going to see many more people using the technology and adopting it.

And so you’ll get more people asking questions, and if answers come out, then they can start to actually build decisions off them. So I think these are the actual hurdles, more than convincing people to make data-driven decisions. People don’t do it, not because they don’t want to, but because it’s difficult. If you make it simpler, then the problem is going to go away.

 

It Seems Like There Are Many New Approaches To Implementing AI For Data Analytics. Will A Killer Use Case Eventually Emerge?

 

I think the ultimate delivery is not so much in how it responds, but what it responds with. If I say to the chatbot, “Tell me what my sales were by country,” and it shows me a picture of sales by country in a bar chart, a pie chart, even a map; very cool, very interesting.

And yet I still need to look at that and make a determination about the data and make a determination of what to do. That’s what is known as descriptive analytics, where you describe what’s going on.

I could then follow up with the question, “Well, why are my sales greater in the UK versus Germany?” It’s kind of like getting into the diagnostic level, and it’s starting to do the mathematics and calculations to explain the difference. Now it gets more interesting, and suddenly the use case gets stronger.

That’s the kind of thing where a typical user could spend a lot of time analysing, slicing and dicing, looking for this and looking for that, and the use case gets stronger and stronger in that regard. So we’re going up the food chain.

The next step, in the famous hierarchy of levels of analysis, is to say, “Well, what could happen tomorrow in the UK? What could the sales be tomorrow?” Now we’re using predictive engines to predict what might happen.

This is not a new thing, but if I can ask the question through generative BI, through the large language models, and it can work out from the question that I want a prediction, go and predict the data for me and then tell me the answer, it gets more interesting.

It’s the same sort of flow, but the complexity of the question and more importantly, the sophistication of the answer is getting more and more advanced. And then the killer use case is actually none of those. It’s actually just to answer the question, “What should I do tomorrow morning to improve my sales in the UK?” That is known as the prescriptive layer, and that is the killer use case.

What is the ultimate use case for AI and generative AI? The idea that I don’t even need to see a bar chart. I don’t need to see the data. You just tell me what I need to do tomorrow.

The Way You’ve Integrated LLMs Into Pyramid Analytics, The Third-Party AI Models Don’t Actually Have Access To The Data. Why?

 

LLMs are inherently lousy at mathematics and by extension are lousy at analysis, which is a surprising statement, but it’s very, very true. Also, when you ask an LLM a question about your private data, you’ll have to give it access to the data.

The LLM is very, very good at understanding what the user is trying to ask and formulating a plan of attack for how to question the data set and therefore is very good at what we call generating the recipe.

So the way I like to think of it, I go to the LLM and I say, “Listen, I want you to bake me a black forest cake. Here are the ingredients in my cupboard.” I don’t hand the ingredients to the LLM, I just describe what I’ve got in the cupboard. And it says, “Based on your ingredients and based on your question and based on everything I know, here’s the exact recipe for how to make a black forest cake.” But it is not able to open up the fridge, it can’t crack eggs, it doesn’t know how to heat up an oven, it doesn’t know how to bake anything.

Instead, it hands those instructions, the recipe to a robot that’s standing in your kitchen, and it’s got all the ways and means and physical capabilities to go ahead and bake the cake based on those instructions.

It works identically in the analytics and data business. Pyramid hands a description of your data to the LLM. The LLM understands my question based on what I call prompts, which are the instructions, very exacting instructions, for how to build a recipe. It takes all that information and comes back with a recipe for the kind of question that I’m asking and therefore how to find the answer. Pyramid takes the recipe and goes and bakes the cake, effectively running the query for the user.

Because we have access to the data and know how to run queries against it, we’re able to generate the right query against the right data set and produce the right answer. We’re also able to formulate it and visualise it for the user, even textualise it for the user, and hand it back to them.

There’s one last issue that comes out of all of this, which is security. If we were to send the data to the LLM, there’s an issue that the data is now in the LLM pipeline and the vendor of the LLM could in theory see that data. By never sending the data to the LLM and only describing the ingredients, the LLM is never privy to the actual data.

The Pyramid solution is running in the sandbox of the customer on their private data, and therefore there is no data leak in the solution, because the data never leaves the sandbox.
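
As an illustration only, the recipe pattern described above can be sketched in a few lines of Python. This is not Pyramid's implementation: the `stub_llm` function is a hypothetical stand-in for a third-party model, the recipe format is invented for the example, and an in-memory SQLite table plays the role of the customer's private data. The point it shows is that only table and column names go into the prompt, while the query itself runs locally.

```python
import json
import sqlite3


def describe_schema(conn):
    """Return table and column names only: the 'ingredients list', never the rows."""
    schema = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table'"
    ).fetchall()
    for (table,) in tables:
        columns = conn.execute(f"PRAGMA table_info({table})").fetchall()
        schema[table] = [column[1] for column in columns]  # index 1 is the column name
    return schema


def build_prompt(question, schema):
    """The prompt carries the question and the schema description, no data values."""
    return json.dumps({"question": question, "schema": schema})


def stub_llm(prompt):
    """Hypothetical stand-in for an external LLM: returns a fixed recipe as JSON."""
    json.loads(prompt)  # a real model would read the question and schema here
    return json.dumps(
        {"table": "sales", "group_by": "country", "measure": "amount", "aggregate": "SUM"}
    )


def run_recipe(conn, recipe_json, schema):
    """Validate the recipe against the known schema, then run it locally."""
    recipe = json.loads(recipe_json)
    table, group_by = recipe["table"], recipe["group_by"]
    measure, aggregate = recipe["measure"], recipe["aggregate"]
    if table not in schema or not {group_by, measure} <= set(schema[table]):
        raise ValueError("recipe refers to unknown tables or columns")
    if aggregate not in {"SUM", "AVG", "COUNT"}:
        raise ValueError("unsupported aggregate")
    sql = (
        f"SELECT {group_by}, {aggregate}({measure}) "
        f"FROM {table} GROUP BY {group_by} ORDER BY {group_by}"
    )
    return conn.execute(sql).fetchall()


def demo():
    """Build a tiny private data set, ask a question, and return (prompt, rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (country TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("UK", 120.0), ("UK", 80.0), ("Germany", 150.0)],
    )
    schema = describe_schema(conn)
    prompt = build_prompt("What were my sales by country?", schema)
    rows = run_recipe(conn, stub_llm(prompt), schema)
    conn.close()
    return prompt, rows


if __name__ == "__main__":
    sent_to_llm, result = demo()
    print(sent_to_llm)
    print(result)
```

The recipe is checked against the known schema before any SQL is built, so a malformed or unexpected model response fails loudly instead of running an arbitrary query.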

 

Pyramid Users Can Select Different LLMs For Different Projects. Why Did You Feel That This Was Beneficial?

 

In the long run, you’re going to have different LLMs that are fine-tuned for different verticals or different use cases or different business areas. So maybe there’s an LLM that is tuned for accounting, another one is tuned for marketing, maybe an LLM is tuned on healthcare, maybe an LLM is tuned on insurance data. Therefore, it makes a lot of sense that you’re able to switch out the LLM based on the data that you’re looking at.

There’s another benefit, which is that an LLM that is very specifically trained on a specific block of content will actually be smaller than a broad, generic LLM, and interestingly faster. The smaller it is, the faster it is to respond. So there’s going to be a natural progression whereby if I want a faster, quicker LLM that is cheaper to run and smaller, I want it to be more narrowly focused. So we see that being an outcome in the market too.
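
A minimal sketch of what switching models by subject area could look like, assuming a simple lookup keyed on the domain of the data being analysed. The model names and the `pick_model` helper are hypothetical and are not taken from Pyramid's product; the sketch only shows the idea of falling back to a broad model when no narrower one exists.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelChoice:
    name: str    # hypothetical model identifier
    domain: str  # subject area the model is assumed to be tuned for


# Broad fallback model used when no specialised model is registered.
GENERAL = ModelChoice("general-llm", "general")

# Hypothetical registry of narrower, domain-tuned models.
REGISTRY = {
    "accounting": ModelChoice("accounting-llm", "accounting"),
    "marketing": ModelChoice("marketing-llm", "marketing"),
    "healthcare": ModelChoice("healthcare-llm", "healthcare"),
    "insurance": ModelChoice("insurance-llm", "insurance"),
}


def pick_model(domain: str) -> ModelChoice:
    """Return the domain-tuned model if one is registered, else the general one."""
    return REGISTRY.get(domain.strip().lower(), GENERAL)


if __name__ == "__main__":
    print(pick_model("Healthcare"))
    print(pick_model("logistics"))
```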

 

There’s A Lot Of Buzz In The Industry About Using BI Tools As Building Blocks, As “Embedded Analytics” Components In Others’ Apps. How Do You See This Space Evolving In The Next Year?

 

I think embedded analytics is one of the hot areas in the market right now. It’s not a new thing, but it is a challenge when it comes to adoption. And one of the challenges of adoption is convincing your users to leave the CRM tool and go to that tool over there to do the CRM analysis or to leave the accounting platform and go over here to do the accounting analysis.

Now, many people are used to doing that all the time. We are very used to pulling the data out and going to Excel and doing the analysis there. But if you can take the high-end analytical components and drop them right into the app itself, no one has to go anywhere. And as you’re looking at your CRM data, you can do the analysis on the spot and the two are highly synchronized and highly copacetic.

You can see that people are going to use the analysis immediately, far more than if they have to go through the whole headache of extracting the data, sticking it into Excel, and doing the analysis independently again. So embedded analytics is the next big thing.

Interestingly enough, the trick there is not only to bring it in side-by-side with the rest of the application but to allow whoever’s doing the embedding to do it very easily and very simply. Web applications can be very complicated, and integrating another application inside one can be even more complicated.

The big trick is to do it very effectively, very simply. It needs to talk to the database directly, so there’s no copying and pasting. Connecting this all up to the rest of the story, it’d be fantastic if you can bring the generative BI or generative AI functionality into that embedded experience directly within the application seamlessly.

 

What’s Next For Pyramid Analytics?

 

I can say that obviously our investment in the entire natural language querying LLM space has been huge to date, and we continue to grow that. In the next six to 12 months, you’re going to see a bunch of big leaps with some high-powered capabilities through the same interfaces. The depth to which you can go with simple natural language questions or requests is going to be phenomenal.

This is one end of the stick. The other end of the stick that Pyramid is chasing at the moment, which is somewhat unrelated to this entire conversation, is a big focus on data science and machine learning, DSML.

Obviously, LLMs and AI are heavily related to that, but Pyramid is about to deploy its own toolset to allow customers to build their own models, which ultimately could power their own LLM frameworks, their own AI, their own predictions.

Ultimately, the idea is to get to that eureka point, which is to tell somebody not just what happened and why it happened, but what could happen, and what they should do to get a better outcome: more profitability, lower cost, better quality, whatever the goal is. And these are the two prongs of Pyramid that you should expect to see from us in the next six to 12 months.