A mechanism to evaluate context-free queries inspired in LR(1) parsers over graph databases
Graph databases; Query language expressiveness; RDF; LR(1) languages.
The World Wide Web is an always increasing collection of information. This information is spread among different documents, which are made available by using the Hypertext Transfer Protocol (HTTP). Even though this information is accessible to users in the form of news articles, audio broadcasts, images and videos, software agents often cannot classify it. The lack of semantic information about these documents in a machine readable format often causes the analysis to be inaccurate. A significant number of entities have adopted Linked Data as a way to add semantic information to their data, not just publishing it on the Web. The result is a global data collection, called the Web of Data, which forms a global graph, consisting of Resource Description Framework (RDF) statements from numerous sources, covering all sorts of topics. To be able to find specific information in this graph, queries are performed by starting at a subject and analyzing its predicates in the RDF statements. Given that a trace is a list of predicates in an information path, one can tell there is a connection between one subject and one object if there is a trace between them in the RDF statements.
The use of HTTP as a standardized data access mechanism and RDF as a standard data model simplifies the data access, but accessing heterogeneous data on distinct loca- tions can have an increased time complexity and current query languages have a reduced query expressiveness, which motivates us to research alternatives in how this data is queried. This reduced expressiveness happens because most query languages reside in the Regular Languages class. In this work, we introduce some of the concepts needed for better understanding the given problems and how to solve them. We analyze some works related to our research and propose to use Deterministic Context-Free Grammars instead of Regular languages to increase the expressiveness of the graph database queries. More specifically, applying the LR(1) parsing method to find paths in an RDF graph database. Lastly, we analyze our algorithm’s complexity and make some experiments, comparing our solution to other proposals, and show that ours can have better performance in given scenarios.