AtScale Makes Significant Strides in Text-to-SQL Conversion through Semantic Layer

Key Takeaways:

– AtScale reports a major advance in text-to-SQL processing through its use of a semantic layer.
– With the semantic layer in place, the accuracy of generated SQL queries rose from 20% to 92.5% in AtScale’s test.
– The semantic layer supplies business-specific context to Large Language Models (LLMs) such as Google’s Gemini Pro 1.5, which improves their accuracy.
– AtScale says the result could pave the way for wider adoption of natural language query in Generative AI (GenAI).

AtScale, a data analytics platform, has reported significant progress at the intersection of advanced analytics and Generative AI (GenAI). The company recently announced that applying its semantic layer produced a substantial improvement in text-to-SQL processing, which might lead to wider adoption of natural language query in GenAI.

AtScale: Bridging the Gap between BI Tools and Data Warehouses

AtScale, initially known in the advanced analytics space for its OLAP layer that accelerated SQL queries in big data ecosystems, has now shifted its focus to its semantic layer. Positioned between the business intelligence tool and the data warehouse, the semantic layer plays a pivotal role in advanced analytics systems, particularly as the scale and importance of automated decision-making systems have grown.

The semantic layer plays a profound role in making business metrics interpretable and consistent, yet its importance often goes unnoticed. For instance, the question “What were our total sales last month by region?” might appear simple, but unless each term (“total sales,” “last month,” “region”) has a clear definition tied to the organization’s specific data, it is easy to arrive at incorrect answers.
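To make the point about definitions concrete, here is a minimal Python sketch of how a semantic layer might pin down each term in that question before any SQL is written. It is an illustration only, not AtScale’s implementation: the table name `sales`, the columns `net_amount`, `store_region`, and `order_date`, and the function names are all hypothetical.

```python
from datetime import date

# Hypothetical semantic definitions: business terms mapped to SQL fragments.
METRICS = {"total sales": "SUM(net_amount)"}
DIMENSIONS = {"region": "store_region"}
FACT_TABLE = "sales"
DATE_COLUMN = "order_date"


def last_month_range(today):
    """Return (first day of previous month, first day of current month)."""
    first_this = today.replace(day=1)
    if first_this.month == 1:
        first_prev = date(first_this.year - 1, 12, 1)
    else:
        first_prev = first_this.replace(month=first_this.month - 1)
    return first_prev, first_this


def build_query(metric, dimension, today):
    """Compose SQL from the shared definitions instead of guessing at columns."""
    start, end = last_month_range(today)
    expr = METRICS[metric]
    col = DIMENSIONS[dimension]
    return (
        f"SELECT {col}, {expr} AS total "
        f"FROM {FACT_TABLE} "
        f"WHERE {DATE_COLUMN} >= DATE '{start.isoformat()}' "
        f"AND {DATE_COLUMN} < DATE '{end.isoformat()}' "
        f"GROUP BY {col}"
    )


if __name__ == "__main__":
    print(build_query("total sales", "region", date(2024, 1, 15)))
```

Because “total sales,” “region,” and “last month” each resolve to a single agreed definition, every tool or model that asks the question gets the same query. An unknown term raises a `KeyError` rather than silently producing a plausible but wrong answer.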

GenAI Revolution and The Importance of Semantic Layer

The importance of the semantic layer has become even more pronounced since the onset of the GenAI revolution in 2022. It is especially crucial for feeding business-specific context to Large Language Models (LLMs) such as Google’s Gemini Pro 1.5. Without the semantic layer, the chances that generated SQL will return accurate answers are slim.

AtScale carried out a comparison test using this model on the TPC-DS dataset. The test measured the accuracy of LLM-generated SQL queries with and without the semantic layer. The company detailed the findings in its recent white paper, “Enabling Natural Language Prompting with AtScale Semantic Layer and Generative AI.”

Improvement in Accuracy: Semantic Layer Vs. Control Environment

In the test without the semantic layer, the system achieved an overall accuracy rate of only 20%. When AtScale added its semantic layer, the accuracy of the generated SQL queries rose to 92.5%. The semantic layer-based system also answered 100% of the simpler questions correctly and produced incorrect results only for the most complex queries.

According to Jeff Curran, the data science team lead at AtScale, the result marks a major milestone in NLP and data analytics. Working with the semantic layer and the AtScale query engine could significantly improve performance and accuracy on more complex question sets.

Furthermore, prompt engineering and retrieval-augmented generation (RAG) can give LLMs more context. However, questions remain about how much users can trust LLMs. AtScale notes that the LLM occasionally generated data that didn’t exist or ignored instructions to apply certain filters.
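As a rough illustration of how prompt engineering and retrieval can pass context to an LLM, the Python sketch below picks out only the definitions relevant to a question and places them in the prompt. It makes no claim about how AtScale builds its prompts: the definitions, the keyword-matching retrieval step, and the function names are all assumptions made for the example, and a production RAG system would more likely use embedding-based search.

```python
# Hypothetical business definitions that a semantic layer might expose.
DEFINITIONS = {
    "total sales": "SUM(net_amount) over the sales table",
    "region": "the store_region column",
    "customer count": "COUNT(DISTINCT customer_id)",
}


def retrieve(question, definitions):
    """Naive retrieval: keep definitions whose term appears in the question."""
    q = question.lower()
    return {term: text for term, text in definitions.items() if term in q}


def build_prompt(question, definitions):
    """Assemble a prompt that restricts the model to the retrieved definitions."""
    context = retrieve(question, definitions)
    lines = ["Use only the definitions below when writing SQL."]
    for term, text in context.items():
        lines.append(f"- {term}: {text}")
    lines.append(f"Question: {question}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_prompt("What were our total sales last month by region?", DEFINITIONS))
```

Supplying definitions this way narrows what the model has to guess, but it does not guarantee the model will follow them, which is consistent with the invented data and skipped filters AtScale observed.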

With additional training data designed to prepare an LLM to work with the AtScale Query Engine, the accuracy rate could be pushed even higher. In conclusion, the test indicates that the AtScale Semantic Layer is a viable solution for basic natural language query tasks.
