feat: set vector as default data pipeline by Ian2012 · Pull Request #1199 · openedx/tutor-contrib-aspects

Ian2012 · 2026-03-06T21:39:20Z

This PR sets Vector as the default data pipeline. It also includes a couple of improvements:

- Stores the Alembic migration state in the same database to avoid conflicts (thanks @bmtcril).
- Adds Tutor mounts for aspects-dbt. Aspects developers can now work with a local copy of aspects-dbt without pushing their changes to GitHub. Just run tutor mounts add ./aspects-dbt/ (or whichever directory holds aspects-dbt) to use your local copy.
- Enables Vector by default.

Caution

This is a breaking change. Users who install this version without updating their configuration will have their Ralph workloads disabled.

Depends on: openedx/aspects-dbt#164

Fixes: #1126 #1096

Previously, on a clean Aspects install with Vector set as the xAPI database, migrations would fail because ASPECTS_XAPI_DATABASE was not the Ralph database. This upgrade fixes the migrations by adding an explicit Ralph database variable, allowing both databases to be created independently as designed.

Previously, Alembic state was stored in ASPECTS_XAPI_DATABASE, which can change when switching between the Ralph and Vector pipelines, causing Alembic to lose state and try to re-run all migrations. The Alembic state location is now explicit. This change also makes sure Ralph uses the RALPH_DATABASE, simplifies and reorganizes the ClickHouse init script, and makes sure the Vector user can access the databases needed for inserting into downstream MVs.
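
The lost-state failure described here can be illustrated with a toy model. This is only a sketch, not the plugin's code: the `Cluster` class, the `migrate` function, the migration names, and the database names `xapi`, `vector_xapi`, and `aspects_state` are all hypothetical. Real Alembic records applied revisions in a version table; a plain list per database stands in for that table.

```python
# Hypothetical migration names, in the order they would be applied.
MIGRATIONS = ["0001_create_tables", "0002_add_views", "0003_add_mvs"]


class Cluster:
    """Toy stand-in for a ClickHouse server: database name -> applied revisions."""

    def __init__(self):
        self.version_tables = {}

    def applied(self, database):
        # Each database gets its own (initially empty) record of revisions.
        return self.version_tables.setdefault(database, [])


def migrate(cluster, state_database):
    """Run every migration not recorded in state_database and return what ran."""
    applied = cluster.applied(state_database)
    ran = [m for m in MIGRATIONS if m not in applied]
    applied.extend(ran)
    return ran


# Assumed old behaviour: state follows whatever the xAPI database currently is.
old = Cluster()
migrate(old, state_database="xapi")  # first install
after_switch = migrate(old, state_database="vector_xapi")  # pipeline switched

# Assumed new behaviour: state lives in one fixed, explicitly named database.
new = Cluster()
migrate(new, state_database="aspects_state")  # first install
after_switch_fixed = migrate(new, state_database="aspects_state")

print(after_switch)        # every migration is attempted again
print(after_switch_fixed)  # nothing is re-run
```

In the toy model, the only thing that changes between the two cases is whether the state database name depends on the pipeline setting, which is the dependency the description says was removed.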

openedx-webhooks · 2026-03-06T21:39:26Z

Thanks for the pull request, @Ian2012!

This repository is currently maintained by @bmtcril.

Once you've gone through the following steps feel free to tag them in a comment and let them know that your changes are ready for engineering review.

🔘 Get product approval

If you haven't already, check this list to see if your contribution needs to go through the product review process.

If it does, you'll need to submit a product proposal for your contribution, and have it reviewed by the Product Working Group.
- This process (including the steps you'll need to take) is documented here.
If it doesn't, simply proceed with the next step.

🔘 Provide context

To help your reviewers and other members of the community understand the purpose and larger context of your changes, feel free to add as much of the following information to the PR description as you can:

- Dependencies: This PR must be merged before / after / at the same time as ...
- Blockers: This PR is waiting for OEP-1234 to be accepted.
- Timeline information: This PR must be merged by XX date because ...
- Partner information: This is for a course on edx.org.
- Supporting documentation
- Relevant Open edX discussion forum threads

🔘 Get a green build

If one or more checks are failing, continue working on your changes until this is no longer the case and your build turns green.

Details

Where can I find more information?

If you'd like to get more details on all aspects of the review process for open source pull requests (OSPRs), check out the following resources:

When can I expect my changes to be merged?

Our goal is to get community contributions seen and reviewed as efficiently as possible.

However, the amount of time that it takes to review and merge a PR can vary significantly based on factors such as:

The size and impact of the changes that it introduces
The need for product review
Maintenance status of the parent repository

💡 As a result it may take up to several weeks or months to complete a review and merge your PR.

bmtcril · 2026-03-09T20:06:59Z

 type = "filter"
 inputs = ["docker_logs"]
-condition = 'includes(["lms", "cms", "lms-job", "cms-job"], .label."com.docker.compose.service")'
+condition = 'includes(["lms", "cms", "lms-worker", "cms-worker", "lms-job", "cms-job"], .label."com.docker.compose.service")'


I think this won't do anything due to overhangio/tutor#1263

But I support adding it anyway so we don't have to do it later
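
For readers unfamiliar with the filter in the diff above, its intent can be mimicked in Python. This is a rough sketch only: Vector evaluates the real condition in its own language, and the event shape used here (a `label` mapping keyed by the compose label name) is an assumption made for illustration. The `keep_event` helper is hypothetical.

```python
# Service names taken from the updated condition in the diff.
ALLOWED_SERVICES = ["lms", "cms", "lms-worker", "cms-worker", "lms-job", "cms-job"]
LABEL = "com.docker.compose.service"


def keep_event(event):
    """Return True when the event's compose service label is in the allowed list."""
    service = event.get("label", {}).get(LABEL)
    return service in ALLOWED_SERVICES


# Assumed event shape: each log event carries its container labels.
events = [
    {"label": {LABEL: "lms"}, "message": "tracking log line"},
    {"label": {LABEL: "lms-worker"}, "message": "tracking log line"},
    {"label": {LABEL: "mysql"}, "message": "unrelated log line"},
    {"message": "event without labels"},
]

kept = [e["label"][LABEL] for e in events if keep_event(e)]
print(kept)
```

With the old list, the `lms-worker` event would have been dropped by the same check; whether worker containers actually emit such events is the open question raised in the comment.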

bmtcril · 2026-03-16T17:01:57Z

-batch_size: 100
+log_dir: logs
+num_xapi_batches: 10
+batch_size: 100000


I don't think we need to do this many events, it's just a smoke test to make sure inserts work. I don't think this helps as much as the Celery version since I don't think we'll see errors here if inserts fail like we do there. Is there a good way to check that the right number of rows have landed in CH and downstream tables? I think the row counts from the performance test script can be flaky based on the course that gets chosen, but maybe we can just limit things to 1 course for this test.
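
One possible shape for the row-count check asked about here is sketched below. It is an illustration under assumptions, not part of the PR: the table names and numbers are hypothetical, the helper `find_row_count_mismatches` is made up, and the counts are passed in as plain dictionaries rather than fetched from ClickHouse. The expected count per table also has to be decided per table, since downstream tables need not hold one row per event.

```python
def find_row_count_mismatches(expected, actual):
    """Return {table: (expected, actual)} for every table whose count is off.

    A table missing from `actual` is treated as having zero rows.
    """
    return {
        table: (want, actual.get(table, 0))
        for table, want in expected.items()
        if actual.get(table, 0) != want
    }


# Hypothetical expectation: 10 batches of 100 events for a single course.
events_sent = 10 * 100
expected = {
    "raw_events_example": events_sent,
    "downstream_example": events_sent,
}

# Hypothetical counts, for example gathered with one count query per table.
actual = {
    "raw_events_example": 1000,
    "downstream_example": 990,
}

mismatches = find_row_count_mismatches(expected, actual)
print(mismatches)
```

Limiting the test to one course, as suggested in the comment, would make the expected numbers deterministic enough for an exact comparison like this one.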

bmtcril · 2026-03-16T17:20:44Z

Were there additional fixes to get Vector working at all beyond #1132 ? I'd like to separate that PR from this one so we can release a bug fix version before doing the big breaking change.

Ian2012 · 2026-03-16T18:17:42Z

> Were there additional fixes to get Vector working at all beyond #1132 ? I'd like to separate that PR from this one so we can release a bug fix version before doing the big breaking change.

Yes, this commit only: 0975832

bmtcril and others added 6 commits March 6, 2026 10:44

build: Update tests to also test the Vector configurations

1a2ef91

style: Fix line length error

1f5371d

style: Fix formatting

11606bf

feat: set vector as default data pipeline

10d194f

openedx-webhooks added the open-source-contribution PR author is not from Axim or 2U label Mar 6, 2026

openedx-webhooks added this to Contributions Mar 6, 2026

github-project-automation bot moved this to Needs Triage in Contributions Mar 6, 2026

chore: format files

87912ed

Ian2012 marked this pull request as ready for review March 6, 2026 21:51

bmtcril reviewed Mar 9, 2026

bmtcril requested a review from saraburns1 March 9, 2026 20:27

Ian2012 added 2 commits March 10, 2026 10:25

fix: set xapi_database on EVENT_SINK_CLICKHOUSE_BACKEND_CONFIG

0975832

fix: fix clickhouse report url

c4733f4

Ian2012 mentioned this pull request Mar 10, 2026

feat: set vector as default data pipeline openedx/openedx-aspects#339

Open

mphilbrick211 moved this from Needs Triage to Waiting on Author in Contributions Mar 11, 2026

fix: update default load test config to support vector

5ed6916

bmtcril reviewed Mar 16, 2026

chore: disable batching by default

1a93e38

Ian2012 changed the base branch from main to bmtcril/vector_bump March 19, 2026 15:39

Ian2012 added 2 commits March 19, 2026 10:42

Merge branch 'bmtcril/vector_bump' into cag/vector-default

7a78995

chore: restore defaults for load test

e9dce6a

bmtcril deleted the branch openedx:bmtcril/vector_bump March 19, 2026 16:10

bmtcril closed this Mar 19, 2026

github-project-automation bot moved this from Waiting on Author to Done in Contributions Mar 19, 2026

Ian2012 wants to merge 13 commits into openedx:bmtcril/vector_bump from Ian2012:cag/vector-default

4 participants
