The Logica System: Elevating SQL Databases to Declarative Data Science Engines
- Evgeny Skvortsov [c] (Author)
- Yilin Xia [b] (Author)
- Bertram Ludäscher [b] (Author)
- [b] University of Illinois Urbana-Champaign
- [c] Google LLC
Abstract
Logica (= Logic + aggregation) is a freely available, open-source, feature-enhanced version of Datalog that automatically compiles logic rules to a number of popular SQL platforms (DuckDB, SQLite, PostgreSQL, and BigQuery). Logica combines beginner-friendly declarative features of Datalog with advanced analytical features needed by data science and ML practitioners when processing real-world data. Since Logica is built on top of mature SQL implementations, these features can be executed robustly and scalably. Logica allows beginners to seamlessly progress from simple textbook examples to intermediate and advanced use cases. We introduce Logica with examples that combine aggregation, recursion, and negation in interesting and powerful ways. Additional advanced examples (maximum flow, matrix inversion, etc.) are demonstrated in an online notebook. Logica source programs are compiled into (a) self-contained SQL scripts (for non-recursive and shallow-recursive problems) or (b) Python-driven iterations of SQL queries (when deep recursion is needed). Logica’s practical and theoretical expressive power thus extends both SQL and (pure) Datalog. The Logica system has been used for data science applications and training in industry, and in graduate-level courses in academia.
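To make execution mode (b) concrete, the following is a minimal sketch of a Python-driven iteration of SQL queries that computes a transitive closure to a fixpoint. It is an illustration only and not the SQL that Logica generates: the table names `edge` and `tc`, the function `transitive_closure`, and the choice of SQLite through Python's standard `sqlite3` module are all assumptions made for this example.

```python
import sqlite3


def transitive_closure(edges):
    """Repeat one SQL join step from Python until no new rows appear.

    Corresponds to the textbook Datalog rules (illustrative notation):
        TC(x, y) :- Edge(x, y).
        TC(x, z) :- TC(x, y), Edge(y, z).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE edge (src INTEGER, dst INTEGER)")
    # The primary key lets INSERT OR IGNORE silently skip known pairs.
    conn.execute(
        "CREATE TABLE tc (src INTEGER, dst INTEGER, PRIMARY KEY (src, dst))"
    )
    conn.executemany("INSERT INTO edge VALUES (?, ?)", edges)

    # Base case: every edge is a path of length one.
    conn.execute("INSERT OR IGNORE INTO tc SELECT src, dst FROM edge")

    while True:
        before = conn.execute("SELECT COUNT(*) FROM tc").fetchone()[0]
        # Recursive case: extend each known path by one edge.
        conn.execute(
            "INSERT OR IGNORE INTO tc "
            "SELECT tc.src, edge.dst FROM tc JOIN edge ON tc.dst = edge.src"
        )
        after = conn.execute("SELECT COUNT(*) FROM tc").fetchone()[0]
        if after == before:  # fixpoint reached: the step added nothing
            break

    rows = conn.execute("SELECT src, dst FROM tc ORDER BY src, dst").fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(transitive_closure([(1, 2), (2, 3), (3, 4)]))
```

The loop count depends on the data (roughly the longest shortest path in the graph), which is why deep recursion of this kind is driven from a host language rather than unrolled into a single fixed SQL script.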
