index naming #1962
Merged
lukavdplas merged 19 commits into develop on Feb 11, 2026
use es_alias as additional alias
JeltevanBoheemen (Contributor) approved these changes on Feb 11, 2026 and left a comment:
Makes sense to me!
Co-Authored-By: Jelte van Boheemen <j.vanboheemen@uu.nl>
This makes some changes to how indices are named, and how `es_index` and `es_alias` are used.

The main reason was that the behaviour for database corpora needed to change, e.g. to allow duplicate names for database corpora in the future. I also wanted to change `es_alias` to work better for use cases like parlamint or course explorer.

- `es_index` is now optional, and intended to customise/override the default name. For Python corpora, the name is based on the corpus name, e.g. `textcavator-times`, for database corpora it's based on the ID, e.g. `textcavator-custom[3]`. Bracket characters are to avoid overlap with corpus names or version numbers.
- `es_index` is no longer set when you create a corpus via the API (but you can still set it from the admin site). The generated index name is not derived from the corpus title, so this resolves Corpus form: updating corpus title orphans index #1824
- […] `es_index`. Incidentally, that means you can use them in development too, without deleting the index every time you run unit tests. (Because the tests use a different application prefix.)
- `es_alias` now specifies an additional alias, rather than replacing the alias name. I think this is closer to how we use it. This means the corpus is still searchable under its own index name, as well as the alias.
- […] `es_alias`, the `index --prod --rollover` and `alias` commands now only remove the alias from earlier versions of the same corpus. (Not from indices that belong to other corpora.)
- `--prod` fails with a clear message if you previously created an unversioned index for the corpus (close Indexing: no catch if intended alias is already used as index name #1947)
- `es_index`/`es_alias` are no longer visible in the API.

As far as I know, this won't break anything for existing corpora, because they all still specify `es_index`.
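For illustration, the naming and alias rules described above could be sketched roughly as follows. This is not the code from this PR: the function names (`default_index_name`, `resolve_index_name`, `earlier_versions`) are hypothetical, and the `<name>-<number>` pattern for versioned indices is an assumption that the description does not spell out.

```python
import re
from typing import List, Optional


def default_index_name(prefix: str,
                       corpus_name: Optional[str] = None,
                       corpus_id: Optional[int] = None) -> str:
    """Hypothetical helper: build the default index name for a corpus.

    Python corpora are named after the corpus name, database corpora
    after their ID, wrapped in brackets.
    """
    if corpus_name is not None:
        return f"{prefix}-{corpus_name}"
    if corpus_id is not None:
        return f"{prefix}-custom[{corpus_id}]"
    raise ValueError("either corpus_name or corpus_id is required")


def resolve_index_name(es_index: Optional[str], default_name: str) -> str:
    """es_index is optional; when set, it overrides the default name."""
    return es_index if es_index else default_name


def earlier_versions(base_name: str, alias_holders: List[str]) -> List[str]:
    """Pick the indices that are versions of the same corpus.

    ASSUMPTION: versioned indices are named '<base_name>-<number>'.
    Only these would lose the alias on rollover; indices that belong
    to other corpora are left alone.
    """
    pattern = re.compile(re.escape(base_name) + r"-\d+")
    return [name for name in alias_holders if pattern.fullmatch(name)]


if __name__ == "__main__":
    name = resolve_index_name(None, default_index_name("textcavator", corpus_id=3))
    print(name)
    print(earlier_versions(name, ["textcavator-custom[3]-1", "textcavator-custom-3"]))
```

Under the assumed version pattern, the brackets keep a database corpus with ID 3 (`textcavator-custom[3]`) distinct from a hypothetical version 3 of a corpus named `custom` (`textcavator-custom-3`).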