Description

The mapKurator system provides multiple state-of-the-art approaches for spotting text instances on scanned historical maps: TESTR, Spotter-v2, and PALETTE.

Training Process

  • Training Datasets
    • Synthetic datasets:
      • SynthText: We select 40k text-free background images from COCO and use them to generate synthetic text images (see the left image). Code: Github and Dataset: here
      • SynMap: We propose an approach to generate synthetic maps that mimic the text styles (e.g., font, spacing, orientation) and background styles of real historical maps (see the right image). Code: TBD and Dataset: here
      • The synthetic datasets are in English, Arabic, Russian, and Chinese. We use these datasets for training multilingual text spotters (see Multilingual Spotting).
    • Human annotations: here

image

  • Text Spotters

    • TESTR: A state-of-the-art text spotting model based on Deformable Transformers, originally developed for scene images. [Code]
    • Spotter-v2: We propose a new approach that adopts a novel feature sampling strategy: it samples relevant image features around the target points to predict boundary points, which improves detection performance. [Code]
    • PALETTE: We propose a new approach that adopts a novel hyper-local feature sampling strategy: it samples relevant image features around the target components (boundary points and characters), which improves both detection and recognition performance. [Code]
  • Training Process

    • Pretrain: We train TESTR, Spotter-v2, and PALETTE on the synthetic datasets.
    • Finetune: We finetune the pretrained models on the human annotations.

Inference Commands

1) Use run.py

To run spotting, you can call run.py with the following command:

python run.py --module_text_spotting \
              --sample_map_csv_path /home/maplord/maplist_csv/luna_omo_metadata_56628_20220724.csv \
              --text_spotting_model_dir ./spotter-v2/PALEJUN/ \
              --expt_name sample_maps \
              --spotter_model spotter-v2 \
              --spotter_config /home/spotter-v2/PALEJUN/configs/PALEJUN/Finetune/Rumsey_Polygon_Finetune.yaml \
              --spotter_expt_name test \
              --gpu_id 0

where

  • --module_text_spotting turns on the spotting module in this run
  • --sample_map_csv_path is the path to the CSV file that stores the metadata of the input maps; a sample file can be found here.
  • --text_spotting_model_dir is the spotter model directory that the run switches to
  • --expt_name is the experiment name for running the pipeline
  • --spotter_model is the spotter model name, choices=["testr", "spotter-v2"]
  • --spotter_config is the configuration file for running the spotting model
  • --spotter_expt_name is the experiment name for running the spotter
  • --gpu_id selects a GPU for running the spotter
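For scripted or batch runs, the same call can be assembled programmatically. The following is a minimal sketch, not part of mapKurator itself: the helper name build_run_command is hypothetical, and it uses only the flags and model choices listed above. It assumes run.py is in the current working directory.

```python
import shlex
import subprocess  # only needed if the run line below is uncommented
import sys

# Model names listed above as valid choices for --spotter_model.
SPOTTER_CHOICES = ("testr", "spotter-v2")


def build_run_command(csv_path, model_dir, spotter_config,
                      expt_name="sample_maps", spotter_model="spotter-v2",
                      spotter_expt_name="test", gpu_id=0):
    """Hypothetical helper: return the argv list for run.py.

    Only the flags documented above are used; nothing is executed here.
    """
    if spotter_model not in SPOTTER_CHOICES:
        raise ValueError(f"unsupported spotter model: {spotter_model!r}")
    return [
        sys.executable, "run.py",
        "--module_text_spotting",
        "--sample_map_csv_path", str(csv_path),
        "--text_spotting_model_dir", str(model_dir),
        "--expt_name", expt_name,
        "--spotter_model", spotter_model,
        "--spotter_config", str(spotter_config),
        "--spotter_expt_name", spotter_expt_name,
        "--gpu_id", str(gpu_id),
    ]


if __name__ == "__main__":
    cmd = build_run_command(
        "/home/maplord/maplist_csv/luna_omo_metadata_56628_20220724.csv",
        "./spotter-v2/PALEJUN/",
        "/home/spotter-v2/PALEJUN/configs/PALEJUN/Finetune/Rumsey_Polygon_Finetune.yaml",
    )
    print(shlex.join(cmd))  # inspect the command before running it
    # subprocess.run(cmd, check=True)  # uncomment to launch the pipeline
```

Passing the arguments as a list avoids shell quoting problems when paths contain spaces.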

2) Use inference.py

If you do not have a metadata CSV file, or wish to specify the input image path directly, you can use tools/inference.py in the model folder (i.e., text_spotting_model_dir).

python tools/inference.py --config-file /home/spotter-v2/PALEJUN/configs/PALEJUN/Finetune/Rumsey_Polygon_Finetune.yaml \
                          --output_json \
                          --input ./test_images \
                          --output ./output

where

  • --config-file is the configuration file for running the spotting model
  • --output_json sets the output file format to JSON
  • --input is the input image directory
  • --output is the output file directory
  • You can select the GPU with CUDA_VISIBLE_DEVICES={gpu_id}; the default is gpu_id=0
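The GPU selection through CUDA_VISIBLE_DEVICES can also be done from a Python wrapper. This is a rough sketch under stated assumptions: the helper name build_inference_call is hypothetical, the config flag is spelled --config-file as in the flag list above, and the script is meant to be launched from the model folder.

```python
import os
import shlex
import subprocess  # only needed if the run line below is uncommented
import sys


def build_inference_call(config_file, input_dir, output_dir, gpu_id=0):
    """Hypothetical helper: return (argv, env) for tools/inference.py.

    The environment is a copy of the current one with CUDA_VISIBLE_DEVICES
    set, so the parent process environment is left unchanged.
    """
    cmd = [
        sys.executable, "tools/inference.py",
        "--config-file", str(config_file),
        "--output_json",
        "--input", str(input_dir),
        "--output", str(output_dir),
    ]
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu_id))
    return cmd, env


if __name__ == "__main__":
    cmd, env = build_inference_call(
        "/home/spotter-v2/PALEJUN/configs/PALEJUN/Finetune/Rumsey_Polygon_Finetune.yaml",
        "./test_images",
        "./output",
        gpu_id=0,
    )
    print(f"CUDA_VISIBLE_DEVICES={env['CUDA_VISIBLE_DEVICES']} {shlex.join(cmd)}")
    # Run from the model folder (text_spotting_model_dir), for example:
    # subprocess.run(cmd, env=env, cwd="./spotter-v2/PALEJUN/", check=True)
```

Setting the variable per call rather than globally lets several runs target different GPUs from the same script.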