paul-sud/pubmed-es

pubmed_es

Parse PubMed XML into JSON and load it into Elasticsearch. See the Downloading PubMed documentation for details on obtaining the XML files.
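To give a feel for the XML-to-JSON step, here is a minimal sketch using only the standard library. It is not the code pubmed_es itself uses. The helper names `article_to_dict` and `parse_articles`, the choice of output fields, and the hand-written sample record are all assumptions made for illustration. Only the element names (`PubmedArticle`, `MedlineCitation`, `PMID`, `ArticleTitle`, `AbstractText`) come from the PubMed XML layout.

```python
import json
import xml.etree.ElementTree as ET

# Tiny hand-written sample in the PubmedArticleSet layout; not real data.
SAMPLE_XML = """<PubmedArticleSet>
  <PubmedArticle>
    <MedlineCitation>
      <PMID Version="1">12345</PMID>
      <Article>
        <ArticleTitle>An example title</ArticleTitle>
        <Abstract>
          <AbstractText>First part.</AbstractText>
          <AbstractText>Second part.</AbstractText>
        </Abstract>
      </Article>
    </MedlineCitation>
  </PubmedArticle>
</PubmedArticleSet>"""


def article_to_dict(article):
    """Hypothetical helper: flatten one <PubmedArticle> element into a dict."""
    citation = article.find("MedlineCitation")
    title = citation.find("Article/ArticleTitle")
    abstract_parts = citation.findall("Article/Abstract/AbstractText")
    return {
        "pmid": citation.findtext("PMID"),
        # itertext() keeps text nested inside inline markup such as <i> or <sub>.
        "title": "".join(title.itertext()) if title is not None else None,
        "abstract": " ".join("".join(p.itertext()) for p in abstract_parts),
    }


def parse_articles(xml_text):
    """Hypothetical helper: parse an XML string into a list of JSON-ready dicts."""
    root = ET.fromstring(xml_text)
    return [article_to_dict(a) for a in root.iter("PubmedArticle")]


docs = parse_articles(SAMPLE_XML)
print(json.dumps(docs[0], indent=2))
```

Each resulting dict is plain JSON-serializable data, so it can be passed on as an Elasticsearch document body.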

Installation

pubmed_es requires Python 3.7 or higher and Elasticsearch 7. Clone this repo, create and activate a Python virtual environment, and install the dependencies listed in requirements.txt:

git clone https://github.com/paul-sud/pubmed-es.git
cd pubmed-es
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Usage

From the root of this repo, run the following to read and index the XML, where DATA_DIR points to a folder containing the XML files:

python -m pubmed_es -d "$DATA_DIR"
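As a rough picture of what "read and index" can involve, the sketch below walks a folder of XML files and yields one bulk-style action per article. It is an illustration under stated assumptions, not the package's actual implementation: the function name `iter_bulk_actions`, the index name `pubmed`, the `*.xml` glob, and the two indexed fields are all hypothetical. The action dicts use the `_index` / `_id` / `_source` shape that `elasticsearch.helpers.bulk` accepts. The sketch stops short of contacting a cluster so it runs without Elasticsearch installed.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def iter_bulk_actions(data_dir, index_name="pubmed"):
    """Yield one bulk-style action per <PubmedArticle> in every *.xml file.

    index_name is an assumed default, not something pubmed_es documents.
    """
    for xml_path in sorted(Path(data_dir).glob("*.xml")):
        # iterparse streams the file, so large files need not fit in memory.
        for _event, elem in ET.iterparse(str(xml_path), events=("end",)):
            if elem.tag != "PubmedArticle":
                continue
            pmid = elem.findtext("MedlineCitation/PMID")
            title = elem.find("MedlineCitation/Article/ArticleTitle")
            yield {
                "_index": index_name,
                "_id": pmid,  # using the PMID as document ID makes re-runs overwrite
                "_source": {
                    "pmid": pmid,
                    "title": "".join(title.itertext()) if title is not None else None,
                },
            }
            elem.clear()  # release the subtree that was just processed


# Demo on a throwaway directory holding one hand-written record (not real data).
with tempfile.TemporaryDirectory() as tmp:
    sample = (
        "<PubmedArticleSet><PubmedArticle><MedlineCitation>"
        "<PMID>1</PMID><Article><ArticleTitle>Example</ArticleTitle></Article>"
        "</MedlineCitation></PubmedArticle></PubmedArticleSet>"
    )
    Path(tmp, "sample.xml").write_text(sample, encoding="utf-8")
    actions = list(iter_bulk_actions(tmp))

print(actions)
```

With the `elasticsearch` Python client installed and a cluster running, a generator like this could be handed to `elasticsearch.helpers.bulk(client, iter_bulk_actions(data_dir))`.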
