llama_index SPARQL Notes 11

Published on 2023-09-06 by @danja

WARNING:llama_index.graph_stores.nebulagraph:s =Peter Quill
WARNING:llama_index.graph_stores.nebulagraph:rel_map =

Ok, I want rel_map to take the subject, Peter Quill, call the SPARQL store and return something in this format :

{'Peter Quill': [
    'Peter Quill, -[would return to the MCU]->, May 2021, <-[would return to the MCU]-, Peter Quill',
    'Peter Quill, -[would return to the MCU]->, May 2021',
    'Peter Quill, -[was raised by]->, a group of alien thieves and smugglers',
    'Peter Quill, -[is leader of]->, Guardians of the Galaxy',
    'Peter Quill, -[would return to the MCU]->, May 2021, <-[Gunn reaffirmed]-, Guardians of the Galaxy Vol. 3',
    ...


Hmm, it takes a list :

def get_rel_map(
    self, subjs: Optional[List[str]] = None, depth: int = 2
) -> Dict[str, List[List[str]]]:


Looping through the list to build the query should work, but there might be a more elegant way. Whatever, start with a single subject.
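
One candidate for the "more elegant way" is a SPARQL `VALUES` clause, which lets a single query cover every subject in the list. A minimal sketch, assuming the `er:` vocabulary and graph name used in the query later in these notes; `escape_literal` and `build_rel_query` are hypothetical names, not anything in `sparql.py`:

```python
from typing import List


def escape_literal(text: str) -> str:
    """Escape backslashes and double quotes for a SPARQL string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def build_rel_query(subjs: List[str], limit: int = 10) -> str:
    """Build one SELECT covering all subjects via a VALUES clause."""
    values = " ".join(f'"{escape_literal(s)}"' for s in subjs)
    return f"""
PREFIX er: <http://purl.org/stuff/er#>

SELECT DISTINCT ?subj ?rel ?obj WHERE {{
  GRAPH <http://purl.org/stuff/guardians> {{
    VALUES ?subj {{ {values} }}
    ?triplet a er:Triplet ;
        er:subject ?subject ;
        er:property ?property ;
        er:object ?object .
    ?subject er:value ?subj .
    ?property er:value ?rel .
    ?object er:value ?obj .
  }}
}}
LIMIT {limit}
"""


query = build_rel_query(["Peter Quill", "Rocket"])
```

Because `?subj` comes back in every row, the results can be grouped per subject afterwards without running one query per name.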

If I build this up in :

llama_index/tests/storage/graph_stores/test_sparql.py

It'll make a good start to the test.

Probably unnecessary but I've added an `unescape_from_rdf` helper to `sparql.py` to revert the quote escaping that Turtle needed.

cd ~/AI/nlp/GraphRAG/src
export PYTHONPATH=$PYTHONPATH:/home/danny/AI/LIBS-under-dev/llama_index
python /home/danny/AI/LIBS-under-dev/llama_index/tests/storage/graph_stores/test_sparql.py


> urllib.error.HTTPError: HTTP Error 502: Proxy Error

Oops. Too many results? Check server...

That took me a long time, bit fiddly. But now :

results = graph_store.select_triplets('Peter Quill', 10)

is returning :

{'rel': {'type': 'literal', 'value': 'is leader of'}, 'obj': {'type': 'literal', 'value': 'Guardians of the Galaxy'}}
{'rel': {'type': 'literal', 'value': 'is half-human'}, 'obj': {'type': 'literal', 'value': 'half-Celestial'}}
{'rel': {'type': 'literal', 'value': 'was abducted from Earth'}, 'obj': {'type': 'literal', 'value': 'as a child'}}
{'rel': {'type': 'literal', 'value': 'was raised by'}, 'obj': {'type': 'literal', 'value': 'a group of alien thieves and smugglers'}}
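
Each row is a SPARQL JSON result binding, so the useful text sits under the `'value'` key. A small sketch of flattening those rows into plain `(rel, obj)` pairs, assuming the result is a list of dicts shaped like the output above; `bindings_to_pairs` is a hypothetical helper name:

```python
from typing import Dict, List, Tuple

# One row of SPARQL JSON results: variable name -> {'type': ..., 'value': ...}
Binding = Dict[str, Dict[str, str]]


def bindings_to_pairs(bindings: List[Binding]) -> List[Tuple[str, str]]:
    """Pull the plain strings out of each binding row."""
    return [(row["rel"]["value"], row["obj"]["value"]) for row in bindings]


sample = [
    {'rel': {'type': 'literal', 'value': 'is leader of'},
     'obj': {'type': 'literal', 'value': 'Guardians of the Galaxy'}},
    {'rel': {'type': 'literal', 'value': 'is half-human'},
     'obj': {'type': 'literal', 'value': 'half-Celestial'}},
]
pairs = bindings_to_pairs(sample)
```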


Ok, so now I reckon I need SPARQL UNION (and possibly BIND) to get some <-[backwards]- bits.
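
A rough idea of how `UNION` plus `BIND` could cover both directions: one branch matches the name as the subject, the other as the object, and `BIND` tags each row so the arrow direction can be chosen later. This is an untested sketch, assuming the `er:` vocabulary and graph name from the query later in these notes; `build_both_directions_query` and the `?other` / `?direction` variables are made up for illustration:

```python
def build_both_directions_query(subj: str) -> str:
    """Return a query matching subj as either subject or object of a triplet."""
    # Quote the name as a SPARQL string literal (escaping \ and ").
    literal = '"' + subj.replace("\\", "\\\\").replace('"', '\\"') + '"'
    template = """
PREFIX er: <http://purl.org/stuff/er#>

SELECT DISTINCT ?rel ?other ?direction WHERE {
  GRAPH <http://purl.org/stuff/guardians> {
    ?triplet a er:Triplet ;
        er:subject ?s ;
        er:property ?p ;
        er:object ?o .
    ?p er:value ?rel .
    {
      ?s er:value SUBJ .
      ?o er:value ?other .
      BIND("forward" AS ?direction)
    } UNION {
      ?o er:value SUBJ .
      ?s er:value ?other .
      BIND("backward" AS ?direction)
    }
  }
}
"""
    # SUBJ is a placeholder token, swapped for the quoted literal.
    return template.replace("SUBJ", literal)


query = build_both_directions_query("Peter Quill")
```

Rows tagged `"backward"` would be rendered as `<-[rel]-` and the rest as `-[rel]->`.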

Break time.

Hmm, I was playing around with the SPARQL, and it looks like this dataset (populated from `sparql.py`) is missing a few triples.
For now I'll go with https://fuseki.hyperdata.it/#/dataset/llama_index-test/query, which came from NebulaGraph.

Ok, this returns some things of the right shape, will do for now :

PREFIX er: <http://purl.org/stuff/er#>

BASE <http://purl.org/stuff/data>

SELECT DISTINCT ?subj ?rel ?obj ?rel2 ?obj2 WHERE {

GRAPH <http://purl.org/stuff/guardians> {
    ?triplet a er:Triplet ;
        er:subject ?subject ;
        er:property ?property ;
        er:object ?object .

    ?subject er:value "Peter Quill" .
    ?property er:value ?rel .
    ?object er:value ?obj .

    OPTIONAL {
        ?triplet2 a er:Triplet ;
            er:subject ?subject2 ;
            er:property ?property2 ;
            er:object ?object2 .

        ?subject2 er:value ?obj .
        ?property2 er:value ?rel2 .
        ?object2 er:value ?obj2 .
    }
}

}


**Property paths!** D'oh! I'd forgotten about them. Probably useful here. https://www.w3.org/TR/sparql11-query/#propertypaths

But for now, get suitable output of `rel_map` from results of the above.

**ChatGPT**
Given the following example :

subj = 'Peter Quill'
rels = {'rel': {'type': 'literal', 'value': 'is leader of'}, 'obj': {'type': 'literal', 'value': 'Guardians of the Galaxy'}, 'rel2': {'type': 'literal', 'value': 'cannot heal'}, 'obj2': {'type': 'literal', 'value': 'Rocket'}}
arp = to_arrows(subj, rels)

write the function to_arrows so this will be the value of string arp :

'Peter Quill, -[would return to the MCU]->, May 2021, <-[Gunn reaffirmed]-, Guardians of the Galaxy Vol. 3'
**didn't really help**
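
For reference, a hand-rolled sketch of the kind of thing I'm after. Two assumptions: the second hop is drawn as a forward arrow, because the OPTIONAL part of the query treats `?obj` as the subject of the second triplet, and `rel2`/`obj2` may be missing when the OPTIONAL didn't match. Note that the `rels` example in the prompt and the target string don't actually correspond, so the result here is built from the `rels` values. `to_rel_map` is a hypothetical wrapper name:

```python
from typing import Dict, List

Binding = Dict[str, Dict[str, str]]


def to_arrows(subj: str, rels: Binding) -> str:
    """Render one result row as 'subj, -[rel]->, obj[, -[rel2]->, obj2]'."""
    parts = [subj, f"-[{rels['rel']['value']}]->", rels["obj"]["value"]]
    # rel2/obj2 only exist when the OPTIONAL block matched.
    if "rel2" in rels and "obj2" in rels:
        parts += [f"-[{rels['rel2']['value']}]->", rels["obj2"]["value"]]
    return ", ".join(parts)


def to_rel_map(subj: str, rows: List[Binding]) -> Dict[str, List[str]]:
    """Collect the arrow strings for one subject, dropping duplicates."""
    seen: List[str] = []
    for row in rows:
        arrow = to_arrows(subj, row)
        if arrow not in seen:
            seen.append(arrow)
    return {subj: seen}


subj = 'Peter Quill'
rels = {'rel': {'type': 'literal', 'value': 'is leader of'},
        'obj': {'type': 'literal', 'value': 'Guardians of the Galaxy'},
        'rel2': {'type': 'literal', 'value': 'cannot heal'},
        'obj2': {'type': 'literal', 'value': 'Rocket'}}
arp = to_arrows(subj, rels)
# arp == 'Peter Quill, -[is leader of]->, Guardians of the Galaxy, -[cannot heal]->, Rocket'
```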

Started doing it manually, now too tired. Night night.

---
I've used this (and almost identical in Java etc) _so often_, but have managed to forget :

> Logger.setLevel() specifies the lowest-severity log message a logger will handle, where debug is the lowest built-in severity level and critical is the highest built-in severity. For example, if the severity level is INFO, the logger will handle only INFO, WARNING, ERROR, and CRITICAL messages and will ignore DEBUG messages.
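
A quick reminder-to-self sketch of that behaviour, using only the standard `logging` module. The logger name `sparql_notes_demo` and the messages are made up for the demo:

```python
import io
import logging

# Capture output in memory so the effect of setLevel is easy to inspect.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))

logger = logging.getLogger("sparql_notes_demo")
logger.addHandler(handler)
logger.propagate = False  # keep the demo output out of the root logger
logger.setLevel(logging.INFO)

logger.debug("rel_map = %s", {})         # below INFO: ignored
logger.info("querying store")            # handled
logger.warning("s = %s", "Peter Quill")  # handled

output = stream.getvalue()
```

With the level at `INFO`, only the `INFO` and `WARNING` lines end up in `output`; dropping the level to `logging.DEBUG` would let the first message through too.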

`:cat AI`
`:tag SPARQL`
`:tag LlamaIndex`