
Description

The mapKurator system provides multiple state-of-the-art approaches for spotting text instances on scanned historical maps: TESTR, Spotter-v2, and PALETTE.

Training Process

Training Datasets
- Synthetic datasets:
  - SynthText: We select 40k text-free background images from COCO and use them to generate synthetic text images (see the left image). Code: GitHub. Dataset: here.
  - SynMap: We propose an approach to generate synthetic maps that mimic the text (e.g., font, spacing, orientation) and background styles of real historical maps (see the right image). Code: TBD. Dataset: here.
  - The synthetic datasets are in English, Arabic, Russian, and Chinese. We use these datasets for training multilingual text spotters (see Multilingual Spotting).
- Human annotations: here

Text Spotters
- TESTR: A state-of-the-art text spotting model, originally developed for scene images, that uses Deformable Transformers. [Code]
- Spotter-v2: We propose a new approach that adopts a novel feature sampling strategy, sampling relevant image features around the target points to predict boundary points, which improves detection performance. [Code]
- PALETTE: We propose a new approach that adopts a novel hyper-local feature sampling strategy, sampling relevant image features around the target components (boundary points and characters), which improves detection and recognition performance. [Code]

Training Process
- Pretrain: We train TESTR, Spotter-v2, and PALETTE on the synthetic datasets.
- Finetune: We finetune the models with human annotations.

Inference Commands

1) Use run.py

To run spotting, you can call run.py with the following command:

python run.py --module_text_spotting \
              --sample_map_csv_path /home/maplord/maplist_csv/luna_omo_metadata_56628_20220724.csv \
              --text_spotting_model_dir ./spotter-v2/PALEJUN/ \
              --expt_name sample_maps \
              --spotter_model spotter-v2 \
              --spotter_config /home/spotter-v2/PALEJUN/configs/PALEJUN/Finetune/Rumsey_Polygon_Finetune.yaml \
              --spotter_expt_name test \
              --gpu_id 0

where

--module_text_spotting turns on the spotting module in this run
--sample_map_csv_path is the path to the CSV file that stores the metadata of the input maps; a sample file can be found here.
--text_spotting_model_dir is the model directory that the pipeline switches to before running the spotter
--expt_name is the experiment name for running the pipeline
--spotter_model is the spotter model name, choices=["testr", "spotter-v2"]
--spotter_config is the configuration file for running the spotting model
--spotter_expt_name is the experiment name for running the spotter
--gpu_id selects a GPU for running the spotter
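
When the pipeline is launched from a script rather than typed by hand, the flags above can be assembled programmatically. The following is a minimal sketch, not part of mapKurator itself: the helper name `build_run_command` and its default values are hypothetical, and only the flag names and the `--spotter_model` choices are taken from the list above.

```python
import shlex

# Choices documented for --spotter_model above.
SPOTTER_MODELS = ("testr", "spotter-v2")


def build_run_command(csv_path, model_dir, config, expt_name="sample_maps",
                      spotter_model="spotter-v2", spotter_expt_name="test",
                      gpu_id=0):
    """Return the argument list for run.py (hypothetical helper)."""
    if spotter_model not in SPOTTER_MODELS:
        raise ValueError(f"spotter_model must be one of {SPOTTER_MODELS}")
    return [
        "python", "run.py",
        "--module_text_spotting",                 # turn on the spotting module
        "--sample_map_csv_path", str(csv_path),   # metadata CSV of the input maps
        "--text_spotting_model_dir", str(model_dir),
        "--expt_name", expt_name,
        "--spotter_model", spotter_model,
        "--spotter_config", str(config),
        "--spotter_expt_name", spotter_expt_name,
        "--gpu_id", str(gpu_id),
    ]


if __name__ == "__main__":
    cmd = build_run_command(
        "/home/maplord/maplist_csv/luna_omo_metadata_56628_20220724.csv",
        "./spotter-v2/PALEJUN/",
        "/home/spotter-v2/PALEJUN/configs/PALEJUN/Finetune/Rumsey_Polygon_Finetune.yaml",
    )
    # Print the shell-quoted command; pass cmd to subprocess.run to execute it.
    print(shlex.join(cmd))
```

Building the command as a list avoids shell-quoting problems with paths that contain spaces, and the list can be handed directly to `subprocess.run(cmd, check=True)`.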

2) Use inference.py

If you do not have a metadata CSV file, or wish to specify the input image path directly, you can use tools/inference.py in the model folder (i.e., text_spotting_model_dir).

python tools/inference.py --config-file /home/spotter-v2/PALEJUN/configs/PALEJUN/Finetune/Rumsey_Polygon_Finetune.yaml \
                          --output_json \
                          --input ./test_images \
                          --output ./output

where

--config-file is the configuration file for running the spotting model
--output_json indicates the output file format is JSON
--input is the input image directory
--output is the output file directory
You can select the GPU by setting the environment variable CUDA_VISIBLE_DEVICES={gpu_id}; the default is gpu_id=0.
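
The GPU selection and the flags listed above can also be combined in a small launcher. This is a sketch under stated assumptions: the helper name `build_inference_command` is hypothetical, the flag names follow the list above, and the script is assumed to be started from the model folder so that `tools/inference.py` resolves.

```python
import os


def build_inference_command(config_file, input_dir, output_dir, gpu_id=0):
    """Return (cmd, env) for tools/inference.py (hypothetical helper)."""
    cmd = [
        "python", "tools/inference.py",
        "--config-file", str(config_file),  # spotter configuration file
        "--output_json",                    # write results as JSON
        "--input", str(input_dir),          # input image directory
        "--output", str(output_dir),        # output file directory
    ]
    # Copy the current environment and restrict the visible GPU.
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu_id))
    return cmd, env


if __name__ == "__main__":
    cmd, env = build_inference_command(
        "/home/spotter-v2/PALEJUN/configs/PALEJUN/Finetune/Rumsey_Polygon_Finetune.yaml",
        "./test_images",
        "./output",
        gpu_id=0,
    )
    print(" ".join(cmd))
    print("CUDA_VISIBLE_DEVICES =", env["CUDA_VISIBLE_DEVICES"])
```

To actually run the spotter, pass both values to `subprocess.run(cmd, env=env, cwd=model_dir, check=True)`, where `model_dir` stands for your text_spotting_model_dir.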
