The mapKurator-system requires that cuda_11.3 with cudnn and nvidia-smi are working properly on the underlying host OS. For a successful installation, you may need to use cuda_11.3-devel. You can learn more here.
Note that cuda_11.3 only provided support for Ubuntu 20.04 and below at the time this document was created.
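As a quick way to confirm that nvidia-smi works on the host, the sketch below runs it and reads the "CUDA Version" field from its header. The function names are illustrative and not part of mapKurator. Note that this field reports the highest CUDA version the installed driver supports, not the installed toolkit, so it is only a first sanity check.

```python
import re
import shutil
import subprocess
from typing import Optional


def parse_cuda_version(smi_output: str) -> Optional[str]:
    """Return the 'CUDA Version' value from nvidia-smi's header, if present."""
    match = re.search(r"CUDA Version:\s*(\d+\.\d+)", smi_output)
    return match.group(1) if match else None


def host_cuda_version() -> Optional[str]:
    """Run nvidia-smi on the host; return None if it is missing or fails."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(["nvidia-smi"], capture_output=True, text=True)
    if result.returncode != 0:
        return None
    return parse_cuda_version(result.stdout)


if __name__ == "__main__":
    version = host_cuda_version()
    if version is None:
        print("nvidia-smi is not available or did not report a CUDA version")
    else:
        print(f"The driver supports CUDA up to version {version}")
```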
NOTE: The docker image supports the modules up to the stitch module.
If you would like a quick setup to try out our text spotting feature without the Post-OCR and Entity Linking modules, please consider using our docker image, which is built on nvidia/cuda:11.3.0-devel-ubuntu18.04. For the full features of mapKurator, please follow Option 2 for installation.
First, pull the docker image with the following command:
docker pull knowledgecomputing/mapkurator_recogito_2023:latest
Then run the container with:
docker run -it --name YOUR_CONTAINER_NAME --gpus all -v /PATH/TO/INPUT/FOLDER/ON/HOST_MACHINE:/home/mapkurator-test-images/input/ -v /PATH/TO/OUTPUT/FOLDER/ON/HOST_MACHINE:/home/mapkurator-test-images/output/ knowledgecomputing/mapkurator_recogito_2023
Inside the container, run conda activate mapKurator to activate the mapKurator environment.
NOTE:
- Remember to change /PATH/TO/INPUT/FOLDER/ON/HOST_MACHINE and /PATH/TO/OUTPUT/FOLDER/ON/HOST_MACHINE in the above command to two actual directory paths on your host machine.
- The -v option in the command above gives your docker container access to the folders on the host machine. More documentation can be found at this link.
Then refer to this "How to Use" guide link. Ensure that you place any test images in the /PATH/TO/INPUT/FOLDER/ON/HOST_MACHINE mentioned above. The docker image comes with two spotting modules which can be found in the /home directory. These are spotter-v2 and spotter_testr.
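To make the placeholder substitution in the docker run command less error-prone, the following sketch assembles the same command from two host folders. The helper name docker_run_args and the example paths are hypothetical. The image name, container paths, and flags are copied from the command above. It rejects non-absolute host paths because docker treats a bare name in -v as a named volume rather than a host folder.

```python
import shlex
from pathlib import PurePosixPath
from typing import List

IMAGE = "knowledgecomputing/mapkurator_recogito_2023"
CONTAINER_INPUT = "/home/mapkurator-test-images/input/"
CONTAINER_OUTPUT = "/home/mapkurator-test-images/output/"


def docker_run_args(name: str, host_input: str, host_output: str) -> List[str]:
    """Build the argument list for the docker run command shown above."""
    for label, folder in (("input", host_input), ("output", host_output)):
        if not PurePosixPath(folder).is_absolute():
            raise ValueError(f"{label} folder must be an absolute path: {folder}")
    return [
        "docker", "run", "-it", "--name", name, "--gpus", "all",
        "-v", f"{host_input}:{CONTAINER_INPUT}",
        "-v", f"{host_output}:{CONTAINER_OUTPUT}",
        IMAGE,
    ]


if __name__ == "__main__":
    # Example host folders; replace them with real directories.
    args = docker_run_args("mapkurator-demo", "/data/maps/input", "/data/maps/output")
    print(shlex.join(args))
```

The list form can be passed directly to subprocess.run, which avoids shell quoting problems when a folder name contains spaces.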
Set up an anaconda environment by running the following commands.
- Download the latest anaconda setup.
wget https://repo.anaconda.com/archive/Anaconda3-2022.10-Linux-x86_64.sh
You may replace the link above with the latest link from anaconda.
- Run the installation file.
bash Anaconda3-2022.10-Linux-x86_64.sh
- Create a conda environment to install all software packages required by mapKurator-system.
conda create --name mapKurator -y python=3.8
- Activate the environment.
conda activate mapKurator
git clone https://github.com/knowledge-computing/mapkurator-system
- Install all python packages with the commands below.
python -m pip install numpy==1.21.6
python -m pip install opencv-python
python -m pip install pandas==1.4.3
python -m pip install Pillow==9.4.0
pip install Polygon3
python -m pip install shapely==1.8.2
python -m pip install geojson==2.5.0
python3 -m pip install setuptools==59.5.0
conda install pytorch==1.10.0 torchvision==0.11.0 torchaudio==0.10.0 cudatoolkit=11.3 -c pytorch -c conda-forge
pip install scikit-image
pip install matplotlib
pip install numba
pip install jupyterlab
- Install gdal by following the instructions here.
- Install PostgreSQL by following the instructions here. Tested version: 14.7
- Install elasticsearch by following the instructions here. Tested version: 7.10.1
- Install logstash by following the instructions here. Tested version: 8.7.0
- Install Detectron2.
python -m pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu113/torch1.10/index.html
- Install AdelaiDet.
git clone https://github.com/aim-uofa/AdelaiDet.git
cd AdelaiDet
python setup.py build develop
Please note that mapKurator has been tested only with the versions shown above. If you test it with newer versions and find any issues, please let us know.
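Because only the listed versions have been tested, it can help to compare the active environment against the pins before running anything. The sketch below is not part of mapKurator, and the function name find_mismatches is illustrative. It uses the standard importlib.metadata module, and the pins are copied from the pip commands above. Packages installed without a pinned version, and the conda-installed PyTorch packages, are left out.

```python
from importlib import metadata
from typing import Callable, Dict

# Pins copied from the pip commands above; unpinned packages are omitted.
PINNED: Dict[str, str] = {
    "numpy": "1.21.6",
    "pandas": "1.4.3",
    "Pillow": "9.4.0",
    "shapely": "1.8.2",
    "geojson": "2.5.0",
    "setuptools": "59.5.0",
}


def find_mismatches(
    pinned: Dict[str, str],
    installed_version: Callable[[str], str] = metadata.version,
) -> Dict[str, str]:
    """Return a message for every package whose version differs from its pin."""
    mismatches = {}
    for package, expected in pinned.items():
        try:
            found = installed_version(package)
        except metadata.PackageNotFoundError:
            found = "not installed"
        if found != expected:
            mismatches[package] = f"expected {expected}, found {found}"
    return mismatches


if __name__ == "__main__":
    problems = find_mismatches(PINNED)
    for package, message in problems.items():
        print(f"{package}: {message}")
    if not problems:
        print("All pinned packages match the tested versions.")
```

Run it inside the activated mapKurator environment so that the lookup sees the packages installed there.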
git clone https://github.com/knowledge-computing/mapkurator-spotter.git
cd mapkurator-spotter/spotter-v2
python setup.py build develop
To retrieve OpenStreetMap geo-entities and their popularity scores (i.e., the frequency of geo-entities' names), we use a Postgres database and the Elasticsearch search engine with Logstash. The tested version of each is listed above.
Note that the following procedure demonstrates how to create the indices using OpenStreetMap.
The figure shows an outline of the tables in Postgres and the indices in Elasticsearch. The details of each component are as follows.
- table all_continents: A table of all OpenStreetMap geo-entities' ids, names, and the corresponding source tables
- schema {each continent} table {points, lines, multilinestrings, multipolygons, other_relations}: A source table of OpenStreetMap geo-entities including names, semantic types, and geometries
- index osm: An Elasticsearch index of table all_continents
- index osm-voca: An Elasticsearch index that contains a unique vocabulary set of single words from geo-entities' names and their popularity from the index osm
- index osm-linker: An Elasticsearch index that contains a unique vocabulary set of single words from geo-entities' names and the list of geo-entities' ids with the corresponding source tables
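To illustrate what the osm-voca and osm-linker descriptions above amount to, here is a minimal in-memory sketch. It is not the code in m4_post_ocr or m6_entity_linker. The function names, the sample rows, the lowercasing, and the whitespace tokenization are all assumptions made for the example.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple

# (osm_id, name, source_table): the fields all_continents is described as holding.
Entity = Tuple[str, str, str]


def build_voca(entities: Iterable[Entity]) -> Counter:
    """Map each single word to how often it occurs across geo-entity names."""
    counts: Counter = Counter()
    for _osm_id, name, _table in entities:
        # Assumption: words are lowercased and split on whitespace.
        counts.update(name.lower().split())
    return counts


def build_linker(entities: Iterable[Entity]) -> Dict[str, List[Tuple[str, str]]]:
    """Map each single word to the (osm_id, source_table) pairs whose name contains it."""
    linker = defaultdict(list)
    for osm_id, name, table in entities:
        for word in set(name.lower().split()):
            linker[word].append((osm_id, table))
    return dict(linker)


if __name__ == "__main__":
    # Made-up rows for illustration only.
    sample = [
        ("1", "Los Angeles", "north_america.points"),
        ("2", "Los Alamos", "north_america.multipolygons"),
        ("3", "Angeles Crest Highway", "north_america.lines"),
    ]
    print(build_voca(sample).most_common(2))
    print(build_linker(sample)["angeles"])
```

In this picture, the word counts play the role of the popularity score, and the word-to-entity lists let a recognized word be resolved back to rows in the per-continent source tables.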
- Download the OpenStreetMap geo-entities of each continent from Geofabrik (file format: .osm.pbf).
- Create a Postgres database and run
CREATE EXTENSION postgis;
- Upload the OpenStreetMap files (.osm.pbf) to the Postgres database. Please run the following code after setting up the appropriate environment variables: m6_entity_linker/upload_osm_to_postgres_ogr2ogr.py
- Create a GiST (Generalized Search Tree) index on the osm_id and wkb_geometry columns of each table. Please run or modify the following code: m6_entity_linker/create_spatial_index_postgres.py
- Create the all_continents table and insert all OpenStreetMap geo-entities' ids, names, and the corresponding source tables. Please run or modify the following code: m6_entity_linker/upload_osm_to_postgres_all_continents.py
- Create the osm index on Elasticsearch using the all_continents table on Postgres. Please refer to the following Logstash configuration file: m6_entity_linker/logstash_postgres_world.conf
- Create the osm-voca index on Elasticsearch, which is used by the PostOCR module. Please run or modify m4_post_ocr/preprocess.py, which generates a csv file named total.csv. Then refer to the following Logstash configuration file to create osm-voca: m4_post_ocr/logstash_postocr.conf
- Create the osm-linker index on Elasticsearch, which is used by the EntityLinker module. Please run or modify m6_entity_linker/create_elasticsearch_index.py, which generates a csv file named osm_linker.csv. Then refer to the following Logstash configuration file to create osm-linker: m6_entity_linker/logstash_osm_linker.conf
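For the spatial indexing step, the sketch below generates one CREATE INDEX statement per source table so they can be reviewed before being run against Postgres. It is not the content of create_spatial_index_postgres.py. The schema name used in the example and the index naming scheme are hypothetical. It covers only the wkb_geometry column, because the appropriate index for osm_id depends on that column's type, which this document does not state.

```python
from typing import Iterable, List

# Table names taken from the component list earlier in this section.
LAYERS = ("points", "lines", "multilinestrings", "multipolygons", "other_relations")


def gist_index_statements(continents: Iterable[str]) -> List[str]:
    """Return one GiST index statement per (continent schema, layer table)."""
    statements = []
    for continent in continents:
        for layer in LAYERS:
            # Hypothetical naming scheme: <schema>_<table>_geom_idx
            index_name = f"{continent}_{layer}_geom_idx"
            statements.append(
                f'CREATE INDEX IF NOT EXISTS "{index_name}" '
                f'ON "{continent}"."{layer}" USING GIST (wkb_geometry);'
            )
    return statements


if __name__ == "__main__":
    # "asia" is an example schema name; use the schemas created during upload.
    for statement in gist_index_statements(["asia"]):
        print(statement)
```

Printing the statements first, rather than executing them directly, makes it easy to check the schema and table names against the database before any index is built.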
