index naming #1962

Merged
lukavdplas merged 19 commits into develop from feature/index-naming on Feb 11, 2026

@lukavdplas lukavdplas commented Dec 16, 2025

This changes how indices are named, and how es_index and es_alias are used.

The main reason is that the behaviour for database corpora needed to change, e.g. to allow duplicate names for database corpora in the future. I also wanted es_alias to work better for use cases like parlamint or course explorer.

  • es_index is now optional, and is intended to customise/override the default name. For Python corpora, the default name is based on the corpus name, e.g. textcavator-times; for database corpora, it is based on the ID, e.g. textcavator-custom[3]. The bracket characters avoid overlap with corpus names or version numbers.
  • es_index is no longer set when you create a corpus via the API (but you can still set it from the admin site). The generated index name is not derived from the corpus title, so this resolves "Corpus form: updating corpus title orphans index" (#1824).
  • Most test corpora no longer set es_index. Incidentally, that means you can use them in development too, without deleting the index every time you run unit tests, because the tests use a different application prefix.
  • es_alias now specifies an additional alias, rather than replacing the alias name. I think this is closer to how we use it. This means the corpus is still searchable under its own index name, as well as the alias.
  • For a corpus with es_alias, the index --prod --rollover and alias commands now only remove the alias from earlier versions of the same corpus, not from indices that belong to other corpora.
  • Indexing with --prod fails with a clear message if you previously created an unversioned index for the corpus. (Closes "Indexing: no catch if intended alias is already used as index name", #1947.)
  • In production mode, you can now create a new index version when the active version is not the latest one. (Closes "Versioned indexing when alias is not set to latest version", #1159.)
  • es_index/es_alias are no longer visible in the API.
  • In some places, I added a catch for the case where the index name matches more than one index.
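
As a rough illustration of the default naming described in the first bullet, the sketch below derives an index name from a corpus. It is not the code in this PR: `CorpusInfo`, `default_index_name` and the `has_python_definition` flag are hypothetical names, and it assumes that a configured es_index is used verbatim, without the application prefix.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CorpusInfo:
    """Hypothetical stand-in for a corpus record; not the project's model."""
    name: str                           # corpus name, e.g. "times"
    pk: Optional[int] = None            # database ID, used for database corpora
    has_python_definition: bool = True  # False for database corpora
    es_index: Optional[str] = None      # optional override of the default name


def default_index_name(corpus: CorpusInfo, prefix: str = "textcavator") -> str:
    """Return the index name for a corpus.

    Assumption: an explicit es_index wins and is returned unchanged.
    """
    if corpus.es_index:
        return corpus.es_index
    if corpus.has_python_definition:
        # Python corpora: name is based on the corpus name
        return f"{prefix}-{corpus.name}"
    # Database corpora: name is based on the ID; the brackets keep it
    # distinct from corpus names and version numbers
    return f"{prefix}-custom[{corpus.pk}]"
```

Passing a different `prefix` gives a different index name for the same corpus, which is the property that lets unit tests and development share corpus definitions without sharing indices.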

As far as I know, this won't break anything for existing corpora, because they all still specify es_index.
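
The alias and versioning behaviour from the list above can be sketched in the same spirit. This is a minimal sketch under stated assumptions, not the implementation in this PR: it assumes versioned indices are named `<base name>-<number>` with numbering starting at 1, and the helper names `version_of`, `indices_to_unalias` and `next_version` are hypothetical.

```python
import re
from typing import Iterable, List, Optional


def version_of(index: str, base_name: str) -> Optional[int]:
    """Return the version number if `index` is a versioned index of `base_name`.

    Assumption: versioned indices look like "<base_name>-<number>".
    re.escape keeps bracketed names such as "textcavator-custom[3]" literal.
    """
    match = re.fullmatch(re.escape(base_name) + r"-(\d+)", index)
    return int(match.group(1)) if match else None


def indices_to_unalias(
    alias_holders: Iterable[str], base_name: str, new_version: int
) -> List[str]:
    """Pick the indices that should lose the alias on rollover.

    Only earlier versions of the same corpus are selected; indices of
    other corpora that share the alias are left alone.
    """
    selected = []
    for index in alias_holders:
        version = version_of(index, base_name)
        if version is not None and version < new_version:
            selected.append(index)
    return selected


def next_version(existing: Iterable[str], base_name: str) -> int:
    """Next version number, based on the highest existing version
    rather than on whichever version is currently active."""
    versions = [
        v for v in (version_of(i, base_name) for i in existing) if v is not None
    ]
    return max(versions, default=0) + 1
```

Because the next version is computed from all existing versions, a new version can be created even when the alias points at an older one. An unversioned index named exactly like the base name is not matched by `version_of`, so it would have to be detected separately.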

@lukavdplas lukavdplas changed the title from "Feature/index naming" to "index naming" Dec 16, 2025
@lukavdplas lukavdplas marked this pull request as ready for review December 16, 2025 18:38
@Meesch Meesch self-requested a review January 9, 2026 12:17
@lukavdplas lukavdplas added the "backend" label (changes to the django backend) Jan 30, 2026

@JeltevanBoheemen JeltevanBoheemen left a comment

Makes sense to me!

Comment thread backend/addcorpus/python_corpora/corpus.py Outdated
Comment thread backend/es/client.py Outdated
Comment thread backend/es/search.py
@lukavdplas lukavdplas merged commit 3a68b02 into develop Feb 11, 2026
4 checks passed
@lukavdplas lukavdplas deleted the feature/index-naming branch February 11, 2026 17:52
