Run bunch of queries natively on Memgraph

Hi,

I am working on a Quantitative metrics evaluation on Memgraph as part of my Thesis. I would like to know if there is a way to run a bunch of queries (~20-30) at the same time in Memgraph Lab or GUI way to determine total query time execution. Also, possibly get performance metrics like - parsing time, plan execution time, etc. I had previously read in docs that we can do it programmatically. However, I would like to run a simple test and get as many results as natively as possible because, on the other hand, I have already implemented such a programming layer with which I wanted to compare results.

Thank you for such amazing Graph DB. :slightly_smiling_face:

1 Like

Hi @red_devil!

Thank you very much for the praise! :grinning_face_with_smiling_eyes:

Unfortunately, at this moment, it is only possible to profile queries one at a time. Multi-command execution planning or profiling is not supported.

Just as you have stated, I suggest using the programmatic approach to solve this problem. For example, add the PROFILE keyword at the beginning of all your queries and then parse the results in Python. If you need help with that part, you can always share your code, or I can create a simple demo. :slightly_smiling_face:

1 Like

Hi @idespot

I did try adding keyword PROFILE like this however, it returned with an error:

mgclient.DatabaseError: PROFILE not allowed in multicommand transactions.

I can share my code however it is very unstructured at the moment.

memgraphImply.py

import utilities.memgraphConfig as mcfg
from persistence.connection import Connection
import mgclient


class MemgraphImpl(Connection):

    def __init__(self):
        self.connect(host=mcfg.host, port=mcfg.port)

    def connect(self, host, port):
        self.connection = mgclient.connect(host=host, port=port)
        print("Memgraph connection successful")

    def execute(self, index,  query):
        print('executing query: ', index)
        cursor = self.connection.cursor()
        cursor.execute(query)
        result = cursor.fetchall()
        self.connection.commit()
        return self.serialize_response(result)

    def serialize_response(self, result):
        return {'result': [self.serialize_data2(i, row) for i, row in enumerate(result)]}
    
    # function return with an error
    # def serialize_data(self, index, row):
    #     print(row)
    #     return dict(row)

    def serialize_data2(self, index, row):
        graphDataTypeList = []
        for graphDataType in row:
            if isinstance(graphDataType, str) or isinstance(graphDataType, int):
                graphDataTypeDict = graphDataType
            else:
                graphDataTypeDict = graphDataType.properties
            print(graphDataTypeDict)
            graphDataTypeList.append(graphDataTypeDict)
        return graphDataTypeList

    def disconnect(self):
        self.connection_class = mgclient.Connection.close(self.connection)
        print("Memgraph connection terminated")

A common driver orchestrates the whole process like this :

driver.py

import jsonlines

from CypherPersistenceApi import CypherPersistenceApi
from utilities.Utilities import DatabaseType
from utilities.Utilities import readQueriesFromFile
from utilities.executionTimer import ExecutionTimer


def run_query(index, query):
    x = connection.execute(index, query)
    return json_generator(x)
    


def json_generator(x):
    with jsonlines.open('output/output.json', mode='a') as writer:
        writer.write(x)
        writer.close()


queries = readQueriesFromFile("queries.txt")
if queries[0] == "1":
    databaseType = DatabaseType.Neo4j
elif queries[0] == "2":
    databaseType = DatabaseType.Memgraph
else:
    databaseType = DatabaseType.RedisGraph
connection = CypherPersistenceApi(databaseType)
queries = queries[1:]
executionTimer = ExecutionTimer()
executionTimer.start()
for index, query in enumerate(queries, start=1):
    for i in range(20):
        print("\n==\n")
        run_query(index, query)
executionTimer.stop()
connection.disconnect()
print('The program will now exit.')

I am currently logging time from query read to writing results into JSON file in the project.

However, as part of my research, I would like to log how much overhead does my program creates as compared to the native one provided by Memgraph.

queries.txt

2
PROFILE MATCH (n) WHERE ID(n) = 71 RETURN ID(n)
MATCH (n)- [r] -> (b:TBox) WHERE id(n)=71 RETURN r,b,n;
match (a) - [r] -> (b) return a, type(r), b limit 1
match (a:TBox {title: 'Person'}) - [r] -> (b) where not (b:property) return a, type(r), b
MATCH (n) with count(n) as nodes MATCH () - [r] -> () RETURN nodes, count(r) as relationships
match(c {title: "Agent"})<-[:subclass_of*1..]-(n)<-[:type_of]-(no) return no
match(c {title: "D4Thing"})<-[:subclass_of*0..]-(n)<-[:type_of*1..]-(no) return n, count(no)
match (n) <-[:subclass_of*0..]-({title: 'User'}) return n
match (rp)<-[:required_property]-(n) <-[:subclass_of*0..]-({title: 'User'}) return rp
match (op)<-[:optional_property]-(n) <-[:subclass_of*0..]-({title: 'User'}) return op
match(c {title: "Agent"})<-[:subclass_of*0..]-(n)<-[r]-(no) return count(r)
match (n)- [r] - ({title: 'Person'}) return r
MATCH (a)-[r]->(b) RETURN distinct(type(r))

1 - Neo4j, 2 - Memgraph, 3 - RedisGraph.

PS: I can share my project as a whole repository if you like.

Thank you.

Hi, to find the solution for your problem, check out the autocommit attribute of the Connection object:
https://memgraph.github.io/pymgclient/connection.html#mgclient.Connection.autocommit

Memgraph transactions can operate in 2 modes:
Explicit transaction and auto-commit transactions.

Explicit transaction is explicitly started by sending the BEGIN query, or by sending a certain message from the client-side. pymgclient by default sends that message before the first query, which starts an explicit transaction. Explicit transactions need to be committed explicitly, that’s why you need to call commit (or rollback) method after you’re done.

Auto-commit transaction is basically one query transaction, which creates a transaction, runs the query, and commits the results, so no need to call commit after you execute a query.

We allow only pure cypher queries (MATCH, RETURN, CREATE…) in explicit (multicommand) transactions. Other, Memgraph specific queries (like PROFILE) need to be run in auto-commit mode.

So the solution for your problem would be to add self.connection.autocommit = True in the connect function.

2 Likes

Hello @toni,

It worked. However, I could not understand the output of the PROFILE keyword completely, I am heading to profiling queries

Furthermore, this helps to write my research. :slightly_smiling_face:

Thank you.

1 Like