Unable to set up spaCy language model on top of Memgraph + MAGE


I downloaded the latest memgraph-mage Docker image with the command:

docker pull memgraph/memgraph-mage:latest

I have the following Dockerfile, which installs spaCy and a spaCy language model on top of Memgraph with MAGE:

FROM memgraph/memgraph-mage

USER root

# Install the NLP libraries
RUN python3 -m pip install -U pip setuptools wheel && \
    python3 -m pip install -U spacy

# Download the NLP model for English language
RUN python3 -m spacy download en_core_web_sm

USER memgraph

This image built fine a week ago, but when I try to build it now, I get the following error:

[+] Building 17.2s (7/8)                                                                                                                                        
 => [internal] load build definition from Dockerfile                                                                                                       0.0s
 => => transferring dockerfile: 650B                                                                                                                       0.0s
 => [internal] load .dockerignore                                                                                                                          0.0s
 => => transferring context: 2B                                                                                                                            0.0s
 => [internal] load metadata for docker.io/memgraph/memgraph-mage:latest                                                                                   0.0s
 => [1/4] FROM docker.io/memgraph/memgraph-mage                                                                                                            0.2s
 => [internal] load build context                                                                                                                          0.0s
 => => transferring context: 5.72kB                                                                                                                        0.0s
 => [2/4] RUN python3 -m pip install -U pip setuptools wheel &&     python3 -m pip install -U spacy                                                       16.3s
 => ERROR [3/4] RUN python3 -m spacy download en_core_web_sm                                                                                               0.7s
 > [3/4] RUN python3 -m spacy download en_core_web_sm:                                                                                                          
#5 0.667 Traceback (most recent call last):                                                                                                                     
#5 0.667   File "/usr/lib/python3.7/runpy.py", line 183, in _run_module_as_main                                                                                 
#5 0.667     mod_name, mod_spec, code = _get_module_details(mod_name, _Error)                                                                                   
#5 0.667   File "/usr/lib/python3.7/runpy.py", line 142, in _get_module_details                                                                                 
#5 0.667     return _get_module_details(pkg_main_name, error)
#5 0.667   File "/usr/lib/python3.7/runpy.py", line 109, in _get_module_details
#5 0.667     __import__(pkg_name)
#5 0.667   File "/usr/local/lib/python3.7/dist-packages/spacy/__init__.py", line 11, in <module>
#5 0.667     from thinc.api import prefer_gpu, require_gpu, require_cpu  # noqa: F401
#5 0.667   File "/usr/local/lib/python3.7/dist-packages/thinc/__init__.py", line 5, in <module>
#5 0.667     from .config import registry
#5 0.667   File "/usr/local/lib/python3.7/dist-packages/thinc/config.py", line 10, in <module>
#5 0.667     from pydantic import BaseModel, create_model, ValidationError, Extra
#5 0.667   File "pydantic/__init__.py", line 2, in init pydantic.__init__
#5 0.667   File "pydantic/dataclasses.py", line 4, in init pydantic.dataclasses
#5 0.667     import types
#5 0.667   File "pydantic/error_wrappers.py", line 4, in init pydantic.error_wrappers
#5 0.667   File "pydantic/json.py", line 11, in init pydantic.json
#5 0.667 ImportError: /usr/lib/memgraph/query_modules/uuid.so: undefined symbol: mgp_module_add_read_procedure
executor failed running [/bin/sh -c python3 -m spacy download en_core_web_sm]: exit code: 1

We recently added a UUID generator as a procedure in the query modules, which seems to be the reason for this ImportError. I should definitely investigate this issue further.

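The problem is that the newest uuid query module (/usr/lib/memgraph/query_modules/uuid.so) conflicts with Python's own uuid module: as the traceback shows, the import inside pydantic ends up loading the query module instead. The solution would be to rename the uuid query module to something else.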
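To confirm this kind of name clash, it can help to ask Python which file a module name actually resolves to. The snippet below is only a minimal sketch: the helper names module_origin and is_shadowed are made up for illustration, and it assumes the query modules directory from the traceback ends up on Python's import path inside the image.

```python
import importlib.util
from typing import Optional


def module_origin(name: str) -> Optional[str]:
    """Return the file a top-level module name would be loaded from, or None."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec is not None else None


def is_shadowed(name: str, unexpected_dir: str) -> bool:
    """Return True if `name` resolves to a file inside `unexpected_dir`."""
    origin = module_origin(name)
    return origin is not None and origin.startswith(unexpected_dir)


if __name__ == "__main__":
    # Directory taken from the traceback; adjust if your image differs.
    query_modules_dir = "/usr/lib/memgraph/query_modules"
    print("uuid resolves to:", module_origin("uuid"))
    print("shadowed by a query module:", is_shadowed("uuid", query_modules_dir))
```

Running this with python3 inside the container should print a path under the standard library on a healthy image. If it prints a path under the query modules directory, the import is being shadowed.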

I didn’t know about the clash with Python’s own uuid module. Thanks for pointing that out. I will make a fix and rename the module in MAGE.

The issue is now solved, and you should be able to set up spaCy on top of MAGE again.
This PR closes the issue: https://github.com/memgraph/mage/pull/63
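
After rebuilding the image, a quick sanity check along these lines can confirm that spaCy imports cleanly and the model loads. This is just a sketch: check_spacy_model is a hypothetical helper name, and it assumes the en_core_web_sm model from the Dockerfile above.

```python
import importlib


def check_spacy_model(model_name: str = "en_core_web_sm") -> str:
    """Try to import spaCy and load a model; return a short status message."""
    try:
        spacy = importlib.import_module("spacy")
    except ImportError as exc:
        # This is the failure mode from the original traceback.
        return f"spaCy import failed: {exc}"
    try:
        nlp = spacy.load(model_name)
    except OSError as exc:
        # spaCy raises OSError when the model package is not installed.
        return f"model load failed: {exc}"
    return f"ok: loaded {model_name} with pipeline {nlp.pipe_names}"


if __name__ == "__main__":
    print(check_spacy_model())
```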

Whoa! That was fast! It works now! :smiley:

Thanks a lot for the quick fix!
