Releases: explosion/spacy-vectors-builder
Dutch vectors for DH2023
nl-dh2023-v0.0.1 Initial nl config for DH2023 demo
English vectors for spaCy v3.4
English vectors trained for spaCy v3.4.0 using floret.
The en_vectors_fasttext vectors were trained with floret in fasttext mode and are the same vectors as in en_core_web_lg v3.4.0.
The floret vectors were trained in floret mode on the same data, with 50K entries (md) and 200K entries (lg).
Note that the .bin files are compatible only with floret, not with fastText. Load them with the floret command-line tool or the Python module:
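import floret
# Load the binary floret model (md: 50K entries).
model = floret.load_model("en_vectors_floret_md.bin")
# List the subwords of a word together with the ids they map to.
model.get_subwords("covid")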
# (['<covid>', '<covi', 'covid', 'ovid>'], array([517646, 541731, 558180, 540981, 527325, 538060, 559280, 538021]))
model.get_nearest_neighbors("covid")
# [(0.70456463098526, 'Covid'), (0.6891582012176514, 'COVID'), (0.6806262135505676, 'covid-19'), (0.607974648475647, 'Covid-19'), (0.5875810384750366, 'COVID-19'), (0.5560713410377502, 'covid19'), (0.5450572371482849, 'coronavirus'), (0.5238808393478394, 'Covid19'), (0.5168178081512451, 'pandemic'), (0.5062406659126282, 'Coronavirus')]
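The subword list in the `get_subwords` output above can be reproduced with a few lines of plain Python. This is only an illustrative sketch, not floret's implementation: the helper name `char_ngrams` is hypothetical, and the n-gram length of 5 is an assumption inferred from the output shown rather than a confirmed training setting. The eight ids in that output would be consistent with two hashes per subword, but that is likewise not stated here.

```python
def char_ngrams(word, minn=5, maxn=5):
    """Return the wrapped word followed by its character n-grams.

    Hypothetical helper for illustration; minn/maxn of 5 are assumed
    from the example output, not taken from the training config.
    """
    wrapped = f"<{word}>"  # add word-boundary markers
    subwords = [wrapped]   # the full wrapped word comes first
    for start in range(len(wrapped)):
        for n in range(minn, maxn + 1):
            end = start + n
            if end > len(wrapped):
                break
            ngram = wrapped[start:end]
            if ngram != wrapped:  # avoid repeating the full word
                subwords.append(ngram)
    return subwords


print(char_ngrams("covid"))
# ['<covid>', '<covi', 'covid', 'ovid>']
```

Because unseen words still decompose into n-grams like these, a vector can be built for a word such as "covid" even if it never appeared as a whole token in the training vocabulary.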