
Conversation

@ralf-berger
Collaborator

Branching strategy: Everything gets merged into dev first, then merged again into main.

Nourhanbd and others added 30 commits February 11, 2025 23:47
The activity is logged when:
- The user clicks on the Slide-KG button inside a slide.
- The user switches to the Main Concepts tab.

- Implemented the logging of the activity "User viewed a main concept".
- Included the HTTP parameters for all required HTTP requests in the slide service.
…ncept' 'View visual explanation of a recommended Concept'

Those activities are only logged when the user explicitly chooses by clicking on either the visual or the textual explanation.
The logging includes 4 activities:
"User viewed full article of a slide-kg's main concept",
"User viewed full article of a slide-kg's recommended concept",
"User viewed full article of a material-kg's concept",
"User viewed full article of a course-kg's concept".
The activities include:
"User viewed a main concept/related concept/category from a course KG",
"User viewed a main concept/related concept/category from a material KG",
- Prepared the logging code and commented it out to keep the existing functionality working.
- The remaining problem is the backend code in the controller: I tried publishing the material KG many times; it takes a long time, and the network tab shows no response from the backend.
- When I close the material KG and reopen it, sometimes I receive the finished material KG, and sometimes it has to be republished.
Implemented logging for hiding/unhiding related concepts or categories in the Material KG.
Bumps [@babel/core](https://github.com/babel/babel/tree/HEAD/packages/babel-core) from 7.26.8 to 7.26.9.
- [Release notes](https://github.com/babel/babel/releases)
- [Changelog](https://github.com/babel/babel/blob/main/CHANGELOG.md)
- [Commits](https://github.com/babel/babel/commits/v7.26.9/packages/babel-core)

---
updated-dependencies:
- dependency-name: "@babel/core"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Bumps [mongoose](https://github.com/Automattic/mongoose) from 8.10.0 to 8.10.1.
- [Release notes](https://github.com/Automattic/mongoose/releases)
- [Changelog](https://github.com/Automattic/mongoose/blob/master/CHANGELOG.md)
- [Commits](https://github.com/Automattic/mongoose/compare/8.10.0...8.10.1)

---
updated-dependencies:
- dependency-name: mongoose
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Bumps nginxinc/nginx-unprivileged from 1.27.3-alpine to 1.27.4-alpine.

---
updated-dependencies:
- dependency-name: nginxinc/nginx-unprivileged
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Bumps [@types/jasmine](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/jasmine) from 5.1.5 to 5.1.6.
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/jasmine)

---
updated-dependencies:
- dependency-name: "@types/jasmine"
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Bumps [neo4j-driver](https://github.com/neo4j/neo4j-javascript-driver) from 5.28.0 to 5.28.1.
- [Release notes](https://github.com/neo4j/neo4j-javascript-driver/releases)
- [Commits](https://github.com/neo4j/neo4j-javascript-driver/compare/5.28.0...5.28.1)

---
updated-dependencies:
- dependency-name: neo4j-driver
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Bumps nginxinc/nginx-unprivileged from 1.27.3-alpine to 1.27.4-alpine.

---
updated-dependencies:
- dependency-name: nginxinc/nginx-unprivileged
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
…MKG"

courseId was undefined in the HTTP request.
Prepared the logging for the activity "User filtered top N concepts".
It doesn't work yet.
…w/ understood/ not understood.

- Fixed the activity-logging problem for "User viewed recommended concepts".
- Edited the type of the recommended concepts: previously they could be main_concept, related_concept, or both.
- Concepts listed under the Recommended Concepts tab are now typed as recommended_concept.
- Added the type to concepts when marking them as understood, not understood, or new.
- There is still a logging problem when marking a concept from the Recommended Materials tab.
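A minimal sketch of the concept typing described in this commit; the interface and field names below are illustrative assumptions, not the actual model used in the codebase.

// Hypothetical shape of the typed concepts described above; names are assumptions.
type ConceptType = 'main_concept' | 'related_concept' | 'recommended_concept';
type ConceptStatus = 'understood' | 'not_understood' | 'new';

interface MarkConceptPayload {
  conceptId: string;
  // Concepts listed under the Recommended Concepts tab carry 'recommended_concept'.
  type: ConceptType;
  // Status chosen by the user when marking the concept.
  status: ConceptStatus;
}

// Example: marking a recommended concept as understood.
const payload: MarkConceptPayload = {
  conceptId: 'c-42',
  type: 'recommended_concept',
  status: 'understood',
};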
…mended concept"

- Additionally, commented out some unnecessary console logs and added comments where needed.
Two activities are logged when an annotation is added:
"User added an annotation" and "User annotated a material".
The annotation object can be "Note", "Question", or "External Resource".
The material object can be "pdf", "video", or "YouTube".
So the 6 possibilities are the following:
"User added a note",
"User added an external resource",
"User asked a question",
"User annotated a PDF",
"User annotated a video",
"User annotated a YouTube video".
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0f01a29 and ec379fd.

📒 Files selected for processing (6)
  • webapp/src/app/pages/components/annotations/pdf-annotation/pdf-main-annotation/pdf-main-annotation.component.html (1 hunks)
  • webapp/src/app/pages/components/annotations/pdf-annotation/pdf-main-annotation/pdf-main-annotation.component.ts (4 hunks)
  • webapp/src/app/pages/components/materials/material/material.component.html (1 hunks)
  • webapp/src/app/pages/components/materials/material/material.component.ts (7 hunks)
  • webapp/src/app/services/pdfview.service.ts (2 hunks)
  • webserver/src/controllers/channel.controller.js (0 hunks)
💤 Files with no reviewable changes (1)
  • webserver/src/controllers/channel.controller.js
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: Build container images / Build image (coursemapper-kg/recommendation)
  • GitHub Check: Build container images / Build image (coursemapper-kg/concept-map)
  • GitHub Check: Build container images / Build image (webapp)
  • GitHub Check: Build container images / Build image (coursemapper-kg/preprocess)
  • GitHub Check: Analyze (javascript-typescript)
  • GitHub Check: Build container images / Build image (coursemapper-kg/concept-map)
  • GitHub Check: Build container images / Build image (webapp)
  • GitHub Check: Build container images / Build image (coursemapper-kg/recommendation)
  • GitHub Check: Build container images / Build image (coursemapper-kg/preprocess)
🔇 Additional comments (9)
webapp/src/app/pages/components/annotations/pdf-annotation/pdf-main-annotation/pdf-main-annotation.component.html (1)

26-26: LGTM! Good addition of PDF error handling.

Adding the error event handler improves the user experience by enabling the application to detect and respond to PDF loading failures gracefully.

webapp/src/app/pages/components/materials/material/material.component.html (1)

319-322: Excellent user-friendly error message implementation.

The alert provides clear feedback about PDF loading issues with actionable instructions for users. The dismissible design and informative message text follow good UX practices.

webapp/src/app/pages/components/materials/material/material.component.ts (4)

102-102: LGTM! Property initialization is correct.

The boolean flag properly tracks the PDF error state and is initialized to false.


311-313: LGTM! Clean error dismissal implementation.

The dismissError() method properly delegates to the service to clear the error state, maintaining proper separation of concerns.


575-575: Smart prevention of deletion during error state.

The condition !this.showPdfErrorMessage prevents deletion of PDF materials while an error message is displayed, ensuring consistent user experience and preventing accidental operations.


609-609: Good cleanup of error state after successful deletion.

Resetting showPdfErrorMessage to false after successful material deletion ensures the error state doesn't persist inappropriately.

webapp/src/app/pages/components/annotations/pdf-annotation/pdf-main-annotation/pdf-main-annotation.component.ts (1)

372-375: LGTM! Proper PDF error handling implementation.

The onPdfLoadError method correctly logs the error and notifies the service to emit the error state. This integrates well with the reactive error handling system.

webapp/src/app/services/pdfview.service.ts (2)

18-19: LGTM! Clean reactive error state implementation.

The addition of pdfErrorSubject and pdfError$ follows the established pattern in the service and provides a clean way for components to reactively track PDF errors.


40-46: Excellent error state management methods.

The emitError() and clearError() methods provide a clean API for managing PDF error states. The implementation is straightforward and follows good practices.

import * as CourseActions from 'src/app/pages/courses/state/course.actions';
import * as NotificationActions from '../../../notifications/state/notifications.actions';
import { Router } from '@angular/router';
import { MaterilasService } from 'src/app/services/materials.service';
Contributor


⚠️ Potential issue

Fix the typo in the import statement.

The import statement has a typo: MaterilasService should be MaterialsService.

Apply this diff to fix the typo:

-import { MaterilasService } from 'src/app/services/materials.service';
+import { MaterialsService } from 'src/app/services/materials.service';
🤖 Prompt for AI Agents
In
webapp/src/app/pages/components/annotations/pdf-annotation/pdf-main-annotation/pdf-main-annotation.component.ts
at line 73, fix the typo in the import statement by changing `MaterilasService`
to `MaterialsService` to correctly import the service.

Comment on lines 376 to 380
// extractMaterialIdFromUrl(url: string): string | null {
// // Example: /api/public/uploads/pdfs/6886010b1152d6b2ee79e612.pdf
// const match = url.match(/\/([\w\d]+)\.pdf$/);
// return match ? match[1] : null;
// }
Contributor


🧹 Nitpick (assertive)

Consider removing commented code.

The commented extractMaterialIdFromUrl method appears to be unused. Consider removing it to keep the codebase clean.

-  // extractMaterialIdFromUrl(url: string): string | null {
-  //   // Example: /api/public/uploads/pdfs/6886010b1152d6b2ee79e612.pdf
-  //   const match = url.match(/\/([\w\d]+)\.pdf$/);
-  //   return match ? match[1] : null;
-  // }
🤖 Prompt for AI Agents
In
webapp/src/app/pages/components/annotations/pdf-annotation/pdf-main-annotation/pdf-main-annotation.component.ts
around lines 376 to 380, remove the commented-out extractMaterialIdFromUrl
method entirely since it is unused, to keep the codebase clean and maintainable.

Comment on lines 289 to 291
this.pdfViewService.pdfError$.subscribe((error) => {
this.showPdfErrorMessage = error;
});
Contributor


🧹 Nitpick (assertive)

Good reactive error handling implementation.

The subscription to pdfViewService.pdfError$ properly updates the component's error state. Consider handling potential subscription cleanup in ngOnDestroy if this subscription isn't automatically managed.

🤖 Prompt for AI Agents
In webapp/src/app/pages/components/materials/material/material.component.ts
around lines 289 to 291, the subscription to pdfViewService.pdfError$ is not
being cleaned up, which can cause memory leaks. To fix this, store the
subscription in a class property and unsubscribe from it in the ngOnDestroy
lifecycle method to properly manage the subscription lifecycle.
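A minimal sketch of the cleanup suggested here, assuming the subscription is kept in a component field; the service class name, import path, and template are assumptions, only pdfError$ and showPdfErrorMessage come from the review.

import { Component, OnDestroy, OnInit } from '@angular/core';
import { Subscription } from 'rxjs';
// Class name and path assumed; the review only references pdfview.service.ts.
import { PdfViewService } from 'src/app/services/pdfview.service';

@Component({ selector: 'app-material', template: '' }) // template omitted in this sketch
export class MaterialComponent implements OnInit, OnDestroy {
  showPdfErrorMessage = false;
  private pdfErrorSub?: Subscription;

  constructor(private pdfViewService: PdfViewService) {}

  ngOnInit(): void {
    // Keep a reference to the subscription so it can be released later.
    this.pdfErrorSub = this.pdfViewService.pdfError$.subscribe((error) => {
      this.showPdfErrorMessage = !!error;
    });
  }

  ngOnDestroy(): void {
    // Unsubscribe to avoid the memory leak flagged above.
    this.pdfErrorSub?.unsubscribe();
  }
}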

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (2)
webapp/src/app/pages/components/annotations/pdf-annotation/pdf-main-annotation/pdf-main-annotation.component.ts (1)

73-73: Fix the typo in the import statement.

The import statement has a typo: MaterilasService should be MaterialsService.

webapp/src/app/pages/components/materials/material/material.component.ts (1)

290-296: Handle subscription cleanup to prevent memory leaks.

The subscription to pdfViewService.pdfError$ is not being cleaned up, which can cause memory leaks. Store the subscription and unsubscribe in ngOnDestroy.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ec379fd and e1719eb.

📒 Files selected for processing (4)
  • webapp/src/app/pages/components/annotations/pdf-annotation/pdf-main-annotation/pdf-main-annotation.component.ts (4 hunks)
  • webapp/src/app/pages/components/materials/material/material.component.html (1 hunks)
  • webapp/src/app/pages/components/materials/material/material.component.ts (11 hunks)
  • webapp/src/app/services/pdfview.service.ts (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Analyze (python)
  • GitHub Check: Analyze (javascript-typescript)
  • GitHub Check: Build container images / Build image (coursemapper-kg/preprocess)
  • GitHub Check: Build container images / Build image (coursemapper-kg/recommendation)
  • GitHub Check: Build container images / Build image (webapp)
🔇 Additional comments (9)
webapp/src/app/pages/components/materials/material/material.component.html (1)

319-322: LGTM! Well-implemented PDF error alert.

The conditional alert properly displays PDF loading errors with clear user guidance. The condition correctly checks both the material ID match and error flag, and the Bootstrap styling provides good visual feedback.

webapp/src/app/pages/components/annotations/pdf-annotation/pdf-main-annotation/pdf-main-annotation.component.ts (1)

372-384: LGTM! Well-implemented PDF error handling.

The error handling implementation correctly extracts the material ID from the URL and emits the error through the service. The regex pattern appropriately matches the expected PDF URL format.

webapp/src/app/services/pdfview.service.ts (2)

18-19: LGTM! Well-designed reactive error handling.

The BehaviorSubject pattern provides excellent reactive error state management, allowing components to subscribe to PDF error events with proper initial state handling.


40-46: LGTM! Clean error management methods.

The emitError and clearError methods provide a clean API for managing PDF error state. The implementation correctly uses the BehaviorSubject to emit material IDs for errors and null for clearing.

webapp/src/app/pages/components/materials/material/material.component.ts (5)

102-102: LGTM! Well-defined error state property.

The showPdfErrorMessage boolean property clearly indicates the PDF error display state and integrates well with the reactive error handling system.


122-122: LGTM! Material ID tracking for errors.

The pdfErrorMaterialId property appropriately tracks which material has a PDF error, enabling targeted error display.


318-320: LGTM! Clean error dismissal.

The dismissError method properly clears the PDF error state through the service, maintaining consistency with the reactive error handling pattern.


592-592: LGTM! Smart deletion logic with error handling.

The condition correctly prevents PDF deletion when an error is shown (!this.showPdfErrorMessage), ensuring users can't accidentally delete files they're being warned about.


626-627: LGTM! Proper error cleanup on successful deletion.

The code correctly clears PDF errors and resets the error message flag after successful material deletion, maintaining clean state management.

private changeDetectorRef: ChangeDetectorRef,
private slideKgGenerator: SlideKgOrderedService,
protected router: Router
protected router: Router,
Contributor


🧹 Nitpick (assertive)

Consider removing the trailing comma.

The trailing comma after the router parameter is unnecessary and inconsistent with TypeScript style conventions.

-    protected router: Router,
+    protected router: Router
🤖 Prompt for AI Agents
In
webapp/src/app/pages/components/annotations/pdf-annotation/pdf-main-annotation/pdf-main-annotation.component.ts
at line 159, remove the trailing comma after the router parameter in the
constructor or method parameter list to comply with TypeScript style conventions
and maintain consistency.


app.get(
"/api/wikipedia/search",
[authJwt.verifyToken],

Check failure

Code scanning / CodeQL

Missing rate limiting (High)

This route handler performs authorization, but is not rate-limited.

Copilot Autofix

AI 6 months ago

To fix the missing rate limiting on the /api/wikipedia/search endpoint, we should add a rate-limiting middleware to this specific route. The best way to do this in an Express application is to use the well-known express-rate-limit package. We will:

  • Import express-rate-limit at the top of the file.
  • Define a rate limiter instance (e.g., allowing 20 requests per minute per IP for this endpoint).
  • Add the rate limiter as a middleware to the /api/wikipedia/search route, before the authentication middleware.
  • Only modify the code in webserver/src/routes/knowledgeGraph.routes.js as shown.

Suggested changeset 2
webserver/src/routes/knowledgeGraph.routes.js

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/webserver/src/routes/knowledgeGraph.routes.js b/webserver/src/routes/knowledgeGraph.routes.js
--- a/webserver/src/routes/knowledgeGraph.routes.js
+++ b/webserver/src/routes/knowledgeGraph.routes.js
@@ -2,8 +2,16 @@
 const controller = require("../controllers/knowledgeGraph.controller");
 const recommendationController = require("../controllers/recommendation.controller"); 
 const logger = require("../activity-logger/logger-middlewares/knowledge-graph-logger");
+const rateLimit = require("express-rate-limit");
 
 module.exports = function (app) {
+  // Rate limiter for Wikipedia search endpoint: max 20 requests per minute per IP
+  const wikipediaSearchLimiter = rateLimit({
+    windowMs: 1 * 60 * 1000, // 1 minute
+    max: 20, // limit each IP to 20 requests per windowMs
+    message: "Too many search requests from this IP, please try again after a minute."
+  });
+
   app.use(function (req, res, next) {
     res.header("Access-Control-Allow-Headers", "Origin, Content-Type, Accept");
     next();
@@ -106,7 +112,8 @@
 
   app.get(
     "/api/wikipedia/search",
-    [authJwt.verifyToken],
+    wikipediaSearchLimiter,
+    authJwt.verifyToken,
     controller.searchWikipedia
   );
   app.post(
EOF
webserver/package.json
Outside changed files

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/webserver/package.json b/webserver/package.json
--- a/webserver/package.json
+++ b/webserver/package.json
@@ -46,7 +46,8 @@
     "npm-run-all": "^4.1.5",
     "redis": "^4.7.0",
     "rimraf": "^6.0.1",
-    "socket.io": "^4.8.1"
+    "socket.io": "^4.8.1",
+    "express-rate-limit": "^8.0.1"
   },
   "devDependencies": {
     "cors": "^2.8.5"
EOF
This fix introduces the dependency express-rate-limit (npm) 8.0.1, which has no known security advisories.
[authJwt.verifyToken, authJwt.isEnrolled],
controller.getResources
"/api/recommendation/user_resources/filter",
[authJwt.verifyToken],

Check failure

Code scanning / CodeQL

Missing rate limiting (High)

This route handler performs authorization, but is not rate-limited.

Copilot Autofix

AI 6 months ago

To fix the missing rate limiting issue, we should add a rate limiting middleware to the /api/recommendation/user_resources/filter route. The best way to do this is to use the popular express-rate-limit package, which is well-maintained and widely used. We will:

  • Import express-rate-limit at the top of the file.
  • Define a rate limiter instance with reasonable defaults (e.g., 100 requests per 15 minutes per IP).
  • Add the rate limiter middleware to the specific route on line 139, so that it is applied in addition to the existing authentication middleware.
  • Ensure that only the relevant route is changed, without affecting other routes or existing functionality.

Suggested changeset 2
webserver/src/routes/knowledgeGraph.routes.js

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/webserver/src/routes/knowledgeGraph.routes.js b/webserver/src/routes/knowledgeGraph.routes.js
--- a/webserver/src/routes/knowledgeGraph.routes.js
+++ b/webserver/src/routes/knowledgeGraph.routes.js
@@ -2,7 +2,14 @@
 const controller = require("../controllers/knowledgeGraph.controller");
 const recommendationController = require("../controllers/recommendation.controller"); 
 const logger = require("../activity-logger/logger-middlewares/knowledge-graph-logger");
+const rateLimit = require("express-rate-limit");
 
+// Rate limiter for recommendation user resources filter route
+const userResourcesFilterLimiter = rateLimit({
+  windowMs: 15 * 60 * 1000, // 15 minutes
+  max: 100, // limit each IP to 100 requests per windowMs
+  message: "Too many requests, please try again later.",
+});
 module.exports = function (app) {
   app.use(function (req, res, next) {
     res.header("Access-Control-Allow-Headers", "Origin, Content-Type, Accept");
@@ -137,7 +143,7 @@
   );
   app.post(
     "/api/recommendation/user_resources/filter",
-    [authJwt.verifyToken],
+    [authJwt.verifyToken, userResourcesFilterLimiter],
     recommendationController.filterUserResourcesSavedBy
   );
 
EOF

webserver/package.json
Outside changed files

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/webserver/package.json b/webserver/package.json
--- a/webserver/package.json
+++ b/webserver/package.json
@@ -46,7 +46,8 @@
     "npm-run-all": "^4.1.5",
     "redis": "^4.7.0",
     "rimraf": "^6.0.1",
-    "socket.io": "^4.8.1"
+    "socket.io": "^4.8.1",
+    "express-rate-limit": "^8.0.1"
   },
   "devDependencies": {
     "cors": "^2.8.5"
EOF
This fix introduces the dependency express-rate-limit (npm) 8.0.1, which has no known security advisories.

app.get(
"/api/recommendation/setting/get_concepts_by_cids",
[authJwt.verifyToken],

Check failure

Code scanning / CodeQL

Missing rate limiting (High)

This route handler performs authorization, but is not rate-limited.

Copilot Autofix

AI 6 months ago

To fix the missing rate limiting, we should add a rate limiting middleware to the affected route(s). The best practice is to use a well-known package such as express-rate-limit. We should import (require) express-rate-limit, define a rate limiter (e.g., allowing 100 requests per 15 minutes per IP), and add it to the middleware chain for the /api/recommendation/setting/get_concepts_by_cids route (and optionally, other similar routes). This can be done by inserting the rate limiter middleware into the array of middlewares for the route, immediately before the controller. The import and rate limiter definition should be placed at the top of the file, and the middleware should be added to the relevant route(s).


Suggested changeset 2
webserver/src/routes/knowledgeGraph.routes.js

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/webserver/src/routes/knowledgeGraph.routes.js b/webserver/src/routes/knowledgeGraph.routes.js
--- a/webserver/src/routes/knowledgeGraph.routes.js
+++ b/webserver/src/routes/knowledgeGraph.routes.js
@@ -3,6 +3,13 @@
 const recommendationController = require("../controllers/recommendation.controller"); 
 const logger = require("../activity-logger/logger-middlewares/knowledge-graph-logger");
 
+// Rate limiting middleware for resource-intensive endpoints
+const rateLimit = require('express-rate-limit');
+const conceptsByCidsLimiter = rateLimit({
+  windowMs: 15 * 60 * 1000, // 15 minutes
+  max: 100, // limit each IP to 100 requests per windowMs
+  message: "Too many requests from this IP, please try again later."
+});
 module.exports = function (app) {
   app.use(function (req, res, next) {
     res.header("Access-Control-Allow-Headers", "Origin, Content-Type, Accept");
@@ -149,7 +156,7 @@
 
   app.get(
     "/api/recommendation/setting/get_concepts_by_cids",
-    [authJwt.verifyToken],
+    [authJwt.verifyToken, conceptsByCidsLimiter],
     recommendationController.getConceptsByCids
   );
 
EOF

webserver/package.json
Outside changed files

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/webserver/package.json b/webserver/package.json
--- a/webserver/package.json
+++ b/webserver/package.json
@@ -46,7 +46,8 @@
     "npm-run-all": "^4.1.5",
     "redis": "^4.7.0",
     "rimraf": "^6.0.1",
-    "socket.io": "^4.8.1"
+    "socket.io": "^4.8.1",
+    "express-rate-limit": "^8.0.1"
   },
   "devDependencies": {
     "cors": "^2.8.5"
EOF
This fix introduces the dependency express-rate-limit (npm) 8.0.1, which has no known security advisories.

app.get(
"/api/recommendation/setting/get_concepts_modified_by_user_id",
[authJwt.verifyToken],

Check failure

Code scanning / CodeQL

Missing rate limiting (High)

This route handler performs authorization, but is not rate-limited.

Copilot Autofix

AI 6 months ago

To fix the missing rate limiting, we should add a rate limiting middleware to the relevant route(s). The best way is to use the well-known express-rate-limit package, which is designed for this purpose. We will:

  • Import express-rate-limit at the top of the file.
  • Define a rate limiter instance (e.g., allowing 100 requests per 15 minutes per IP).
  • Apply the rate limiter as middleware to the /api/recommendation/setting/get_concepts_modified_by_user_id route (and, optionally, to other similar routes if desired).
  • This change will only affect the shown file, and will not alter existing functionality except to add rate limiting.

Suggested changeset 2
webserver/src/routes/knowledgeGraph.routes.js

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/webserver/src/routes/knowledgeGraph.routes.js b/webserver/src/routes/knowledgeGraph.routes.js
--- a/webserver/src/routes/knowledgeGraph.routes.js
+++ b/webserver/src/routes/knowledgeGraph.routes.js
@@ -3,6 +3,16 @@
 const recommendationController = require("../controllers/recommendation.controller"); 
 const logger = require("../activity-logger/logger-middlewares/knowledge-graph-logger");
 
+// Rate limiting middleware
+const rateLimit = require('express-rate-limit');
+
+// Define a rate limiter: 100 requests per 15 minutes per IP
+const apiLimiter = rateLimit({
+  windowMs: 15 * 60 * 1000, // 15 minutes
+  max: 100, // limit each IP to 100 requests per windowMs
+  standardHeaders: true, // Return rate limit info in the `RateLimit-*` headers
+  legacyHeaders: false, // Disable the `X-RateLimit-*` headers
+});
 module.exports = function (app) {
   app.use(function (req, res, next) {
     res.header("Access-Control-Allow-Headers", "Origin, Content-Type, Accept");
@@ -155,7 +165,7 @@
 
   app.get(
     "/api/recommendation/setting/get_concepts_modified_by_user_id",
-    [authJwt.verifyToken],
+    [authJwt.verifyToken, apiLimiter],
     recommendationController.getConceptsModifiedByUserId
   );
 
EOF

webserver/package.json
Outside changed files

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/webserver/package.json b/webserver/package.json
--- a/webserver/package.json
+++ b/webserver/package.json
@@ -46,7 +46,8 @@
     "npm-run-all": "^4.1.5",
     "redis": "^4.7.0",
     "rimraf": "^6.0.1",
-    "socket.io": "^4.8.1"
+    "socket.io": "^4.8.1",
+    "express-rate-limit": "^8.0.1"
   },
   "devDependencies": {
     "cors": "^2.8.5"
EOF
This fix introduces the dependency express-rate-limit (npm) 8.0.1, which has no known security advisories.

app.get(
"/api/recommendation/setting/get_concepts_modified_by_user_id_and_mid",
[authJwt.verifyToken],

Check failure

Code scanning / CodeQL

Missing rate limiting (High)

This route handler performs authorization, but is not rate-limited.

Copilot Autofix

AI 6 months ago

To fix the missing rate-limiting issue, we should add a rate-limiting middleware to the affected route. The best way is to use the well-known express-rate-limit package, which is designed for Express applications. We should import express-rate-limit, configure a reasonable rate limit (e.g., 100 requests per 15 minutes per IP), and apply it to the specific route (/api/recommendation/setting/get_concepts_modified_by_user_id_and_mid). This can be done by adding the rate-limiter to the middleware array for that route. The required changes are:

  • Import express-rate-limit at the top of the file.
  • Define a rate limiter instance.
  • Add the rate limiter to the middleware array for the affected route (line 164).

Suggested changeset 2
webserver/src/routes/knowledgeGraph.routes.js

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/webserver/src/routes/knowledgeGraph.routes.js b/webserver/src/routes/knowledgeGraph.routes.js
--- a/webserver/src/routes/knowledgeGraph.routes.js
+++ b/webserver/src/routes/knowledgeGraph.routes.js
@@ -3,6 +3,13 @@
 const recommendationController = require("../controllers/recommendation.controller"); 
 const logger = require("../activity-logger/logger-middlewares/knowledge-graph-logger");
 
+const rateLimit = require('express-rate-limit');
+const limiter = rateLimit({
+  windowMs: 15 * 60 * 1000, // 15 minutes
+  max: 100, // limit each IP to 100 requests per windowMs
+  standardHeaders: true, // Return rate limit info in the `RateLimit-*` headers
+  legacyHeaders: false, // Disable the `X-RateLimit-*` headers
+});
 module.exports = function (app) {
   app.use(function (req, res, next) {
     res.header("Access-Control-Allow-Headers", "Origin, Content-Type, Accept");
@@ -161,7 +168,7 @@
 
   app.get(
     "/api/recommendation/setting/get_concepts_modified_by_user_id_and_mid",
-    [authJwt.verifyToken],
+    [authJwt.verifyToken, limiter],
     recommendationController.getConceptsModifiedByUserIdAndMid
   );
 
EOF

webserver/package.json
Outside changed files

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/webserver/package.json b/webserver/package.json
--- a/webserver/package.json
+++ b/webserver/package.json
@@ -46,7 +46,8 @@
     "npm-run-all": "^4.1.5",
     "redis": "^4.7.0",
     "rimraf": "^6.0.1",
-    "socket.io": "^4.8.1"
+    "socket.io": "^4.8.1",
+    "express-rate-limit": "^8.0.1"
   },
   "devDependencies": {
     "cors": "^2.8.5"
EOF
This fix introduces the dependency express-rate-limit (npm) 8.0.1, which has no known security advisories.

app.get(
"/api/recommendation/setting/get_concepts_modified_by_user_from_saves",
[authJwt.verifyToken],

Check failure

Code scanning / CodeQL

Missing rate limiting (High)

This route handler performs authorization, but is not rate-limited.

Copilot Autofix

AI 6 months ago

To fix the missing rate limiting, we should add a rate limiting middleware to the affected route. The best practice is to use a well-known package such as express-rate-limit. We will import this package at the top of the file, define a rate limiter (e.g., allowing 100 requests per 15 minutes per IP), and add it to the middleware array for the specific route (/api/recommendation/setting/get_concepts_modified_by_user_from_saves). This change will only affect the specified route and will not alter existing functionality except for limiting the request rate.

Steps:

  • Import express-rate-limit at the top of the file.
  • Define a rate limiter instance (e.g., const limiter = rateLimit({...})).
  • Add the limiter to the middleware array for the affected route (line 176).
  • No changes to other routes or logic.

Suggested changeset 2
webserver/src/routes/knowledgeGraph.routes.js

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/webserver/src/routes/knowledgeGraph.routes.js b/webserver/src/routes/knowledgeGraph.routes.js
--- a/webserver/src/routes/knowledgeGraph.routes.js
+++ b/webserver/src/routes/knowledgeGraph.routes.js
@@ -3,6 +3,14 @@
 const recommendationController = require("../controllers/recommendation.controller"); 
 const logger = require("../activity-logger/logger-middlewares/knowledge-graph-logger");
 
+const rateLimit = require('express-rate-limit');
+
+// Rate limiter: max 100 requests per 15 minutes per IP for this sensitive endpoint
+const recommendationLimiter = rateLimit({
+  windowMs: 15 * 60 * 1000, // 15 minutes
+  max: 100, // limit each IP to 100 requests per windowMs
+  message: "Too many requests from this IP, please try again later."
+});
 module.exports = function (app) {
   app.use(function (req, res, next) {
     res.header("Access-Control-Allow-Headers", "Origin, Content-Type, Accept");
@@ -173,7 +181,7 @@
 
   app.get(
     "/api/recommendation/setting/get_concepts_modified_by_user_from_saves",
-    [authJwt.verifyToken],
+    [authJwt.verifyToken, recommendationLimiter],
     recommendationController.getConceptsModifiedByUserFromSaves
   );
 
EOF

webserver/package.json
Outside changed files

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/webserver/package.json b/webserver/package.json
--- a/webserver/package.json
+++ b/webserver/package.json
@@ -46,7 +46,8 @@
     "npm-run-all": "^4.1.5",
     "redis": "^4.7.0",
     "rimraf": "^6.0.1",
-    "socket.io": "^4.8.1"
+    "socket.io": "^4.8.1",
+    "express-rate-limit": "^8.0.1"
   },
   "devDependencies": {
     "cors": "^2.8.5"
EOF
This fix introduces the dependency express-rate-limit (npm) 8.0.1, which has no known security advisories.

app.post(
"/api/recommendation/save_or_remove_resources",
[authJwt.verifyToken],

Check failure

Code scanning / CodeQL

Missing rate limiting (High)

This route handler performs authorization, but is not rate-limited.

Copilot Autofix

AI 6 months ago

To fix the missing rate limiting issue, we should add a rate-limiting middleware to the affected route(s). The best way to do this in Express is to use the well-known express-rate-limit package. We should import this package, configure a rate limiter (e.g., limit to 100 requests per 15 minutes per IP), and add it to the middleware array for the /api/recommendation/save_or_remove_resources route. This change should be made in webserver/src/routes/knowledgeGraph.routes.js, and the import for express-rate-limit should be added at the top of the file. The rate limiter should be defined once and reused for relevant routes.


Suggested changeset 2
webserver/src/routes/knowledgeGraph.routes.js

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/webserver/src/routes/knowledgeGraph.routes.js b/webserver/src/routes/knowledgeGraph.routes.js
--- a/webserver/src/routes/knowledgeGraph.routes.js
+++ b/webserver/src/routes/knowledgeGraph.routes.js
@@ -3,6 +3,14 @@
 const recommendationController = require("../controllers/recommendation.controller"); 
 const logger = require("../activity-logger/logger-middlewares/knowledge-graph-logger");
 
+const rateLimit = require('express-rate-limit');
+// Limit to 100 requests per 15 minutes per IP
+const apiLimiter = rateLimit({
+  windowMs: 15 * 60 * 1000, // 15 minutes
+  max: 100,
+  standardHeaders: true,
+  legacyHeaders: false,
+});
 module.exports = function (app) {
   app.use(function (req, res, next) {
     res.header("Access-Control-Allow-Headers", "Origin, Content-Type, Accept");
@@ -179,7 +187,7 @@
 
   app.post(
     "/api/recommendation/save_or_remove_resources",
-    [authJwt.verifyToken],
+    [authJwt.verifyToken, apiLimiter],
     recommendationController.SaveOrRemveUserResources
   );
 
EOF

webserver/package.json
Outside changed files

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/webserver/package.json b/webserver/package.json
--- a/webserver/package.json
+++ b/webserver/package.json
@@ -46,7 +46,8 @@
     "npm-run-all": "^4.1.5",
     "redis": "^4.7.0",
     "rimraf": "^6.0.1",
-    "socket.io": "^4.8.1"
+    "socket.io": "^4.8.1",
+    "express-rate-limit": "^8.0.1"
   },
   "devDependencies": {
     "cors": "^2.8.5"
EOF
This fix introduces the dependency express-rate-limit (npm) 8.0.1, which has no known security advisories.

app.post(
"/api/recommendation/rating_resource",
[authJwt.verifyToken],

Check failure

Code scanning / CodeQL

Missing rate limiting (High)

This route handler performs authorization, but is not rate-limited.

Copilot Autofix

AI 6 months ago

To fix the missing rate limiting issue, we should add a rate-limiting middleware to the /api/recommendation/rating_resource route. The best way to do this in an Express application is to use the well-known express-rate-limit package. We will:

  • Import express-rate-limit at the top of the file.
  • Define a rate limiter instance with reasonable defaults (e.g., 100 requests per 15 minutes per IP).
  • Add the rate limiter to the middleware array for the affected route (and optionally other expensive routes).
  • Only modify the code in webserver/src/routes/knowledgeGraph.routes.js as shown.

Suggested changeset 2
webserver/src/routes/knowledgeGraph.routes.js

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/webserver/src/routes/knowledgeGraph.routes.js b/webserver/src/routes/knowledgeGraph.routes.js
--- a/webserver/src/routes/knowledgeGraph.routes.js
+++ b/webserver/src/routes/knowledgeGraph.routes.js
@@ -3,6 +3,14 @@
 const recommendationController = require("../controllers/recommendation.controller"); 
 const logger = require("../activity-logger/logger-middlewares/knowledge-graph-logger");
 
+const rateLimit = require('express-rate-limit');
+
+// Rate limiter: max 100 requests per 15 minutes per IP
+const ratingResourceLimiter = rateLimit({
+  windowMs: 15 * 60 * 1000, // 15 minutes
+  max: 100, // limit each IP to 100 requests per windowMs
+  message: "Too many requests, please try again later.",
+});
 module.exports = function (app) {
   app.use(function (req, res, next) {
     res.header("Access-Control-Allow-Headers", "Origin, Content-Type, Accept");
@@ -185,7 +193,7 @@
 
   app.post(
     "/api/recommendation/rating_resource",
-    [authJwt.verifyToken],
+    [authJwt.verifyToken, ratingResourceLimiter],
     recommendationController.ratingResource
   );
 
EOF

webserver/package.json
Outside changed files

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/webserver/package.json b/webserver/package.json
--- a/webserver/package.json
+++ b/webserver/package.json
@@ -46,7 +46,8 @@
     "npm-run-all": "^4.1.5",
     "redis": "^4.7.0",
     "rimraf": "^6.0.1",
-    "socket.io": "^4.8.1"
+    "socket.io": "^4.8.1",
+    "express-rate-limit": "^8.0.1"
   },
   "devDependencies": {
     "cors": "^2.8.5"
EOF
This fix introduces the dependency express-rate-limit (npm) 8.0.1, which has no known security advisories.

app.post(
"/api/recommendation/get_concepts",
[authJwt.verifyToken],

Check failure

Code scanning / CodeQL

Missing rate limiting (High)

This route handler performs authorization, but is not rate-limited.

Copilot Autofix

AI 6 months ago

To fix the missing rate limiting, we should add a rate-limiting middleware to the /api/recommendation/get_concepts route. The best practice is to use a well-known package such as express-rate-limit. We should import or require express-rate-limit, configure a limiter (e.g., 100 requests per 15 minutes per IP), and add it to the middleware array for this route. This change should be made in webserver/src/routes/knowledgeGraph.routes.js, above or within the module export function, ensuring the limiter is only applied to this route (unless you want to apply it globally or to a group of routes). The fix requires adding the express-rate-limit dependency if it is not already present.


Suggested changeset 2
webserver/src/routes/knowledgeGraph.routes.js

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/webserver/src/routes/knowledgeGraph.routes.js b/webserver/src/routes/knowledgeGraph.routes.js
--- a/webserver/src/routes/knowledgeGraph.routes.js
+++ b/webserver/src/routes/knowledgeGraph.routes.js
@@ -3,6 +3,13 @@
 const recommendationController = require("../controllers/recommendation.controller"); 
 const logger = require("../activity-logger/logger-middlewares/knowledge-graph-logger");
 
+// Rate limiter for resource-intensive endpoints
+const rateLimit = require('express-rate-limit');
+const conceptsLimiter = rateLimit({
+  windowMs: 15 * 60 * 1000, // 15 minutes
+  max: 100, // limit each IP to 100 requests per windowMs
+  message: "Too many requests from this IP, please try again later."
+});
 module.exports = function (app) {
   app.use(function (req, res, next) {
     res.header("Access-Control-Allow-Headers", "Origin, Content-Type, Accept");
@@ -191,7 +198,7 @@
 
   app.post(
     "/api/recommendation/get_concepts",
-    [authJwt.verifyToken],
+    [authJwt.verifyToken, conceptsLimiter],
     recommendationController.getConcepts
   );
 
EOF

webserver/package.json
Outside changed files

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/webserver/package.json b/webserver/package.json
--- a/webserver/package.json
+++ b/webserver/package.json
@@ -46,7 +46,8 @@
     "npm-run-all": "^4.1.5",
     "redis": "^4.7.0",
     "rimraf": "^6.0.1",
-    "socket.io": "^4.8.1"
+    "socket.io": "^4.8.1",
+    "express-rate-limit": "^8.0.1"
   },
   "devDependencies": {
     "cors": "^2.8.5"
EOF
This fix introduces the dependency express-rate-limit (npm) 8.0.1, which has no known security advisories.

app.post(
"/api/recommendation/get_resources",
[authJwt.verifyToken],

Check failure

Code scanning / CodeQL

Missing rate limiting (High)

This route handler performs authorization, but is not rate-limited.

Copilot Autofix

AI 6 months ago

To fix the missing rate limiting, we should add a rate-limiting middleware to the /api/recommendation/get_resources route. The best way to do this is to use the well-known express-rate-limit package, which is designed for this purpose. We will:

  • Import express-rate-limit at the top of the file.
  • Define a rate limiter instance (e.g., allowing 100 requests per 15 minutes per IP).
  • Apply this rate limiter as middleware to the /api/recommendation/get_resources route (line 200), placing it before the controller.
  • Ensure that only this route is affected, to avoid changing the behavior of other routes.

This change will only affect the specified route and will not alter existing functionality except to add rate limiting.


Suggested changeset 2
webserver/src/routes/knowledgeGraph.routes.js

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/webserver/src/routes/knowledgeGraph.routes.js b/webserver/src/routes/knowledgeGraph.routes.js
--- a/webserver/src/routes/knowledgeGraph.routes.js
+++ b/webserver/src/routes/knowledgeGraph.routes.js
@@ -3,6 +3,15 @@
 const recommendationController = require("../controllers/recommendation.controller"); 
 const logger = require("../activity-logger/logger-middlewares/knowledge-graph-logger");
 
+// Rate limiting middleware for resource-intensive endpoints
+const rateLimit = require('express-rate-limit');
+
+// Define a rate limiter: max 100 requests per 15 minutes per IP
+const getResourcesLimiter = rateLimit({
+  windowMs: 15 * 60 * 1000, // 15 minutes
+  max: 100, // limit each IP to 100 requests per windowMs
+  message: "Too many requests from this IP, please try again later."
+});
 module.exports = function (app) {
   app.use(function (req, res, next) {
     res.header("Access-Control-Allow-Headers", "Origin, Content-Type, Accept");
@@ -197,7 +206,7 @@
 
   app.post(
     "/api/recommendation/get_resources",
-    [authJwt.verifyToken],
+    [authJwt.verifyToken, getResourcesLimiter],
     recommendationController.getResources
   );
 
EOF

webserver/package.json
Outside changed files

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/webserver/package.json b/webserver/package.json
--- a/webserver/package.json
+++ b/webserver/package.json
@@ -46,7 +46,8 @@
     "npm-run-all": "^4.1.5",
     "redis": "^4.7.0",
     "rimraf": "^6.0.1",
-    "socket.io": "^4.8.1"
+    "socket.io": "^4.8.1",
+    "express-rate-limit": "^8.0.1"
   },
   "devDependencies": {
     "cors": "^2.8.5"
EOF
This fix introduces these dependencies:
Package: express-rate-limit (npm) | Version: 8.0.1 | Security advisories: None

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 89

🔭 Outside diff range comments (14)
webapp/src/app/services/user-concepts.service.ts (2)

19-25: Avoid storing observables in service state; return directly.

These class fields aren’t needed and can cause stale references. Return the observable directly.

Apply:

-  this.userConcepts = this.http.get<any>(
-    `${this.backendEndpointURL}/users/user-concepts/${userID}`,
-    { headers: this.httpHeader }
-  );
-  return this.userConcepts;
+  return this.http.get<any>(
+    `${this.backendEndpointURL}/users/user-concepts/${userID}`,
+    { headers: this.httpHeader }
+  );

37-42: POST request is missing headers; add them and simplify.

Be consistent with GET and avoid persisting the observable in a field.

-  this.userUpdateConcepts = this.http.post<any>(
-    `${this.backendEndpointURL}/users/user-concepts`,
-    body,
-  );
-  return this.userUpdateConcepts;
+  return this.http.post<any>(
+    `${this.backendEndpointURL}/users/user-concepts`,
+    body,
+    { headers: this.httpHeader }
+  );

Optionally add typing and error handling with catchError.
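
A minimal sketch of the typed variant, assuming a hypothetical UserConcept shape and endpoint path (neither is taken from the project's code):

import { HttpClient, HttpErrorResponse } from '@angular/common/http';
import { Injectable } from '@angular/core';
import { Observable, throwError } from 'rxjs';
import { catchError } from 'rxjs/operators';

// Hypothetical payload shape; align with the real backend response.
interface UserConcept { cid: string; name: string; weight?: number; }

@Injectable({ providedIn: 'root' })
export class UserConceptsServiceSketch {
  constructor(private http: HttpClient) {}

  // Return the observable directly, typed, and surface HTTP failures to the caller.
  getUserConcepts(userID: string): Observable<UserConcept[]> {
    return this.http.get<UserConcept[]>(`/api/users/user-concepts/${userID}`).pipe(
      catchError((err: HttpErrorResponse) => {
        console.error('Failed to load user concepts', err);
        return throwError(() => err);
      })
    );
  }
}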

webapp/src/app/pages/components/knowledge-graph/cytoscape-recommended/cytoscape-recommended.component.ts (4)

366-371: Typo in CSS border radius value ('4xp').

'4xp' is invalid; should be '4px'. The current style won’t apply as intended.

-            div.style.borderRadius = '4xp 4xp 0 0';
+            div.style.borderRadius = '4px 4px 0 0';

401-418: Avoid relying on Cytoscape’s internal _private API.

Accessing eventTarget._private.* is unsupported and brittle. Use public APIs like ele.scratch() or ele.data() to track UI flags.

Example pattern:

// when selecting a command
select: (ele) => {
  const flags = ele.scratch('menuFlags') || {};
  ele.scratch('menuFlags', { ...flags, notUnderstand: true, understood: false, unread: false });
}

// when reading flags
const flags = eventTarget.scratch('menuFlags') || {};
if (flags.notUnderstand) {
  // ...
  eventTarget.scratch('menuFlags', { ...flags, notUnderstand: false });
}

Also consider representing transient UI state via classes or data attributes rather than hidden flags.

Also applies to: 422-437, 441-455


318-325: Use forEach instead of map for side-effects.

map returns a new array you ignore. Prefer forEach for clarity and minor perf.

- this.elements.nodes.map((node) => {
+ this.elements.nodes.forEach((node) => {
    node.data.selected = 'u';
- });
+ });

And similar change in the second block.

Also applies to: 335-338


283-285: Prefer Angular Renderer2 over direct DOM manipulation.

Direct document.getElementById and document.body.appendChild bypass Angular and complicate SSR/testing. Use Renderer2 and ElementRef or Angular templates/portals.

Create and attach the popper div via Renderer2.createElement, setStyle, and appendChild. Bind height via template instead of imperative DOM when possible.

Also applies to: 360-371
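
A minimal sketch of the Renderer2 approach, assuming a component that owns a single popper container (selector, styles, and element names are illustrative only):

import { Component, ElementRef, OnDestroy, Renderer2 } from '@angular/core';

@Component({ selector: 'app-popper-host-sketch', template: '' })
export class PopperHostSketchComponent implements OnDestroy {
  private popperDiv: HTMLElement;

  constructor(private renderer: Renderer2, private host: ElementRef<HTMLElement>) {
    // Create and attach the element through the renderer instead of document.getElementById/appendChild.
    this.popperDiv = this.renderer.createElement('div');
    this.renderer.setStyle(this.popperDiv, 'borderRadius', '4px 4px 0 0');
    this.renderer.appendChild(this.host.nativeElement, this.popperDiv);
  }

  ngOnDestroy(): void {
    // Remove what was attached so nothing is left dangling in the DOM.
    this.renderer.removeChild(this.host.nativeElement, this.popperDiv);
  }
}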

coursemapper-kg/recommendation/example.env (1)

9-13: Critical: API keys committed to VCS. Rotate immediately and use placeholders in example files.

The YouTube API keys appear real. Never commit secrets.

Immediate actions:

  • Revoke/rotate the exposed keys in the Google Cloud console.
  • Force-push removal won’t protect secrets; rotation is mandatory.

Follow-ups:

  • Replace example values with placeholders and instructions:
- YOUTUBE_API_KEY=AIzaSyD_CGmR_Voq4DIV5okRaR6G8adoe-ZSZsM
- YOUTUBE_API_KEY_2=AIzaSyADNntK6m7DbA6eZFYOa9Y8e6IYHykUUFE
- YOUTUBE_API_KEY_3=AIzaSyClxnNwQ1x34pGioQazLlGxOjO9Fp2GGTY
- YOUTUBE_API_KEY_4=AIzaSyBphZOn7EJmPMmZwrB71aepaA5Rbuex9MU
- YOUTUBE_API_KEY_5=AIzaSyB2Wck31LUlgsqI7dgTcC2dMeeVXgb9TDI
+ YOUTUBE_API_KEY=<your-youtube-api-key-1>
+ YOUTUBE_API_KEY_2=<your-youtube-api-key-2>
+ YOUTUBE_API_KEY_3=<your-youtube-api-key-3>
+ YOUTUBE_API_KEY_4=<your-youtube-api-key-4>
+ YOUTUBE_API_KEY_5=<your-youtube-api-key-5>
  • Keep real secrets in a local .env (ignored by git) or a secret manager.
  • Consider also replacing NEO4J_PASSWORD and REDIS_PASSWORD with placeholders in example.env, with guidance in README.

I can open a follow-up issue and prepare a PR to purge/replace these and update docs.

webapp/src/app/pages/components/knowledge-graph/knowledge-graph.module.ts (1)

40-115: Remove duplicate & unused imports to trim bundle size

RadioButtonModule, CheckboxModule, DropdownModule, FormsModule, InputTextModule, MultiSelectModule, PaginatorModule etc. are imported twice.
MenuItem is imported but never used.

Cleaning these redundancies reduces bundle size and compile time.

webapp/src/app/pages/components/knowledge-graph/videos/models/video-element.model.ts (1)

2-20: Keep delimiter style consistent

Interface properties mix commas and semicolons. Stick to one style (TS convention: semicolons) for readability.

-    id: number,
+    id: number;
...
-    bookmarked_count?: number,
+    bookmarked_count?: number;
webapp/src/app/pages/components/knowledge-graph/articles/card-article/card-article.component.ts (1)

50-57: Missing ngOnDestroy → subscription leak

subscriptions are added but never disposed. Implement ngOnDestroy and call this.subscriptions.unsubscribe() to avoid memory leaks.
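
A minimal sketch of the teardown pattern (the ArticleServiceSketch below is a placeholder, not the component's real dependency):

import { Component, Injectable, OnDestroy, OnInit } from '@angular/core';
import { Observable, Subscription, of } from 'rxjs';

// Hypothetical injectable service, only here to make the sketch self-contained.
@Injectable({ providedIn: 'root' })
export class ArticleServiceSketch {
  getArticles(): Observable<unknown[]> { return of([]); }
}

@Component({ selector: 'app-card-article-sketch', template: '' })
export class CardArticleSketchComponent implements OnInit, OnDestroy {
  private subscriptions = new Subscription();

  constructor(private articleService: ArticleServiceSketch) {}

  ngOnInit(): void {
    // Register every subscription on the shared container.
    this.subscriptions.add(
      this.articleService.getArticles().subscribe(() => {
        // ...handle articles...
      })
    );
  }

  ngOnDestroy(): void {
    // One call disposes everything that was added above.
    this.subscriptions.unsubscribe();
  }
}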

webapp/src/app/pages/components/knowledge-graph/result-view/result-view.component.ts (1)

113-147: Add ngOnDestroy and track all service subscriptions

Only the NgRx store subscription is stored in this.subscriptions; every .subscribe() in getUserResources, getRidsFromUserSaves, getConceptsModifiedByUserFromSaves, etc. is not.
Unsubscribing in ngOnDestroy is mandatory to avoid memory leaks, especially in an always-alive tab view.

Also applies to: 349-364

coursemapper-kg/recommendation/app/services/course_materials/db/neo4_db.py (3)

1708-1735: hash() gives nondeterministic IDs across runs

hash() is salted per-process; the same input yields different values between interpreter starts, breaking id/edge consistency in idfeature.txt.
Use a stable hash (e.g. hashlib.sha1(value.encode()).hexdigest()[:16]) or persist a mapping.


1777-1789: session.commit() will raise AttributeError

In update_concept() a new Session is obtained, .run() is executed, and session.commit() is called.
neo4j.Session has no commit; transactions must be created via session.begin_transaction() or a with session.begin_transaction() as tx: block.

-session = self.driver.session()
-session.run(...)
-session.commit()
+with self.driver.session() as session:
+    with session.begin_transaction() as tx:
+        tx.run(...)
+        tx.commit()

707-734: Sessions opened but never closed

Several helpers (create_user_concept_relationship, get_or_create_user, etc.) instantiate self.driver.session() without closing it.
Use context managers (with self.driver.session() as session:) or finally blocks to prevent connection-pool exhaustion.

♻️ Duplicate comments (2)
.gitignore (1)

17-17: Refine the uploads directory ignore pattern

The pattern should include a trailing slash or /** so that the entire contents of webserver/public/uploads are ignored.

webapp/src/app/pages/components/knowledge-graph/custom-recommendation-option/custom-recommendation-option.component.html (1)

208-216: Repeated class attributes and missing coupling

Same duplication appears here (class="mt-1" lines 208/211). Clean them up when tidying templates.

Comment on lines +16 to +46
def create_job_and_set_state(pipeline_name, args_dict, state="pending"):
"""
# after publishing, craw resources for main concepts
create_job_and_set_state(pipeline_name="get_resources_by_main_concepts",
args_dict={"materialId": job.get('materialId')}
)
"""

redis_client = worker.redis
args_json = json.dumps(args_dict, sort_keys=True)
job_id = hashlib.sha256(args_json.encode()).hexdigest()

# Define the job with pipeline name, args, and state
job_data = {
# "id": job_id,
"pipeline": pipeline_name,
"data": args_json,
# "state": state
}

job_exists = redis_client.hget('jobs', job_id)
if job_exists:
print(f"Job with identifier {job_id} already exists")
else:
# Save the job in Redis
redis_client.hset('jobs', job_id, json.dumps(job_data).encode('utf-8'))
# redis_client.hset('jobs', job_id, job_data)

# Push the job ID to the appropriate queue based on the state
redis_client.lpush(f"queue:{pipeline_name}:pending", job_id)
print(f"Job {job_id} created with state '{state}', pipeline '{pipeline_name}', and args {args_dict}")

🛠️ Refactor suggestion

Strengthen create_job_and_set_state API

  1. Add type hints to satisfy linters and future readers.
  2. Include pipeline_name in the hash to avoid collisions between pipelines that share identical args.
  3. Honour the state parameter when choosing the queue name.
-def create_job_and_set_state(pipeline_name, args_dict, state="pending"):
+from typing import Dict
+
+def create_job_and_set_state(
+        pipeline_name: str,
+        args_dict: Dict[str, str],
+        state: str = "pending",
+) -> None:
@@
-    job_id = hashlib.sha256(args_json.encode()).hexdigest()
+    job_id = hashlib.sha256(f"{pipeline_name}:{args_json}".encode()).hexdigest()
@@
-        redis_client.lpush(f"queue:{pipeline_name}:pending", job_id)
+        redis_client.lpush(f"queue:{pipeline_name}:{state}", job_id)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def create_job_and_set_state(pipeline_name, args_dict, state="pending"):
"""
# after publishing, craw resources for main concepts
create_job_and_set_state(pipeline_name="get_resources_by_main_concepts",
args_dict={"materialId": job.get('materialId')}
)
"""
redis_client = worker.redis
args_json = json.dumps(args_dict, sort_keys=True)
job_id = hashlib.sha256(args_json.encode()).hexdigest()
# Define the job with pipeline name, args, and state
job_data = {
# "id": job_id,
"pipeline": pipeline_name,
"data": args_json,
# "state": state
}
job_exists = redis_client.hget('jobs', job_id)
if job_exists:
print(f"Job with identifier {job_id} already exists")
else:
# Save the job in Redis
redis_client.hset('jobs', job_id, json.dumps(job_data).encode('utf-8'))
# redis_client.hset('jobs', job_id, job_data)
# Push the job ID to the appropriate queue based on the state
redis_client.lpush(f"queue:{pipeline_name}:pending", job_id)
print(f"Job {job_id} created with state '{state}', pipeline '{pipeline_name}', and args {args_dict}")
from typing import Dict
def create_job_and_set_state(
pipeline_name: str,
args_dict: Dict[str, str],
state: str = "pending",
) -> None:
"""
# after publishing, craw resources for main concepts
create_job_and_set_state(pipeline_name="get_resources_by_main_concepts",
args_dict={"materialId": job.get('materialId')}
)
"""
redis_client = worker.redis
args_json = json.dumps(args_dict, sort_keys=True)
job_id = hashlib.sha256(f"{pipeline_name}:{args_json}".encode()).hexdigest()
# Define the job with pipeline name, args, and state
job_data = {
# "id": job_id,
"pipeline": pipeline_name,
"data": args_json,
# "state": state
}
job_exists = redis_client.hget('jobs', job_id)
if job_exists:
print(f"Job with identifier {job_id} already exists")
else:
# Save the job in Redis
redis_client.hset('jobs', job_id, json.dumps(job_data).encode('utf-8'))
# redis_client.hset('jobs', job_id, job_data)
# Push the job ID to the appropriate queue based on the state
redis_client.lpush(f"queue:{pipeline_name}:{state}", job_id)
print(f"Job {job_id} created with state '{state}', pipeline '{pipeline_name}', and args {args_dict}")
🧰 Tools
🪛 Ruff (0.12.2)

16-16: Missing return type annotation for public function create_job_and_set_state

Add return type annotation: None

(ANN201)


16-16: Missing type annotation for function argument pipeline_name

(ANN001)


16-16: Missing type annotation for function argument args_dict

(ANN001)


16-16: Missing type annotation for function argument state

(ANN001)

🤖 Prompt for AI Agents
In coursemapper-kg/concept-map/src/main.py lines 16 to 46, the function
create_job_and_set_state lacks type hints, does not include pipeline_name in the
job ID hash causing possible collisions, and ignores the state parameter when
pushing to the queue. Add appropriate type hints for the function parameters and
return type. Modify the job_id calculation to hash a combination of
pipeline_name and args_json to ensure uniqueness across pipelines. Use the state
parameter dynamically when pushing the job ID to the queue instead of hardcoding
"pending".

Comment on lines +22 to +26
self.addHandler(self.get_console_handler())
if log_file:
self.addHandler(self.get_file_handler())
self.addHandler(self.get_redis_handler())
self.propagate = False

🛠️ Refactor suggestion

Make the Redis handler optional

self.addHandler(self.get_redis_handler()) is executed even when no Redis connection is configured, leading to runtime errors at import time.
Wrap this in a feature flag or environment check before attaching the handler.

🤖 Prompt for AI Agents
In coursemapper-kg/recommendation/app/log.py around lines 22 to 26, the Redis
handler is always added regardless of whether a Redis connection is configured,
causing runtime errors. Modify the code to check for a feature flag or
environment variable indicating Redis availability before calling
self.addHandler(self.get_redis_handler()). Only add the Redis handler if the
Redis connection is properly configured or enabled.

Comment on lines +45 to +66
def __init__(self, worker_id, redis):
Handler.__init__(self)
self.worker_id = worker_id
self.redis = redis

def emit(self, record):
try:
msg = json.dumps({
"worker_id": self.worker_id,
"job_id": current_job["id"],
"timestamp": record.created,
"name": record.name,
"level": record.levelname,
"message": record.msg
})
redis.rpush('log', msg)
except RecursionError: # See issue 36272
raise
except TypeError:
pass
except Exception:
self.handleError(record)

🧹 Nitpick (assertive)

Harden error handling and add type hints

Consider:
• Adding return-type / argument annotations (ruff ANN*** warnings).
• Replacing the blanket except Exception with narrower cases or at least logging the failure to stderr.

🧰 Tools
🪛 Ruff (0.12.2)

45-45: Missing return type annotation for special method __init__

Add return type annotation: None

(ANN204)


45-45: Missing type annotation for function argument worker_id

(ANN001)


45-45: Missing type annotation for function argument redis

(ANN001)


50-50: Missing return type annotation for public function emit

Add return type annotation: None

(ANN201)


50-50: Missing type annotation for function argument record

(ANN001)


58-58: Trailing comma missing

Add trailing comma

(COM812)


65-65: Do not catch blind exception: Exception

(BLE001)

🤖 Prompt for AI Agents
In coursemapper-kg/recommendation/app/log.py between lines 45 and 66, add type
hints for the __init__ and emit methods to specify argument and return types for
better code clarity and to address ruff ANN warnings. Replace the broad except
Exception clause in emit with more specific exception handling or at minimum log
the error to stderr to improve error visibility and robustness.

Comment on lines +50 to +61
def emit(self, record):
try:
msg = json.dumps({
"worker_id": self.worker_id,
"job_id": current_job["id"],
"timestamp": record.created,
"name": record.name,
"level": record.levelname,
"message": record.msg
})
redis.rpush('log', msg)
except RecursionError: # See issue 36272

⚠️ Potential issue

Bug – Redis handler ignores the instance it was given

Inside RedisHandler.emit() you call redis.rpush(...), referencing the module-level name, not self.redis.
If another module later rebinds redis, the handler silently breaks; if self.redis was None to disable Redis it is never honoured.

-            redis.rpush('log', msg)
+            self.redis.rpush("log", msg)

Fix this and add a guard (e.g., skip emit when self.redis is None).

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def emit(self, record):
try:
msg = json.dumps({
"worker_id": self.worker_id,
"job_id": current_job["id"],
"timestamp": record.created,
"name": record.name,
"level": record.levelname,
"message": record.msg
})
redis.rpush('log', msg)
except RecursionError: # See issue 36272
def emit(self, record):
try:
msg = json.dumps({
"worker_id": self.worker_id,
"job_id": current_job["id"],
"timestamp": record.created,
"name": record.name,
"level": record.levelname,
"message": record.msg
})
self.redis.rpush("log", msg)
except RecursionError: # See issue 36272
🧰 Tools
🪛 Ruff (0.12.2)

50-50: Missing return type annotation for public function emit

Add return type annotation: None

(ANN201)


50-50: Missing type annotation for function argument record

(ANN001)


58-58: Trailing comma missing

Add trailing comma

(COM812)

🤖 Prompt for AI Agents
In coursemapper-kg/recommendation/app/log.py between lines 50 and 61, the emit
method incorrectly uses the module-level redis variable instead of the instance
variable self.redis, causing issues if redis is rebound or self.redis is None.
To fix this, replace all references to redis with self.redis inside emit, and
add a guard clause at the start of emit to skip processing if self.redis is
None.

# The structure of text: first column is new id of node(type:int), the second column is the original id (type:string), and the rest is the initial embedding
idx_features = np.genfromtxt("idfeature.txt", dtype=np.dtype(str))
logger.info(idx_features.shape[0])
# logger.info(idx_features.shape[0])

🧹 Nitpick (assertive)

Prefer debug-level, shape-only logs over commenting out

Commenting out logs reduces observability. Log shapes (not dense matrices) at debug level to keep noise and cost low.

-        # logger.info(idx_features.shape[0])
+        logger.debug("idx_features count: %d", idx_features.shape[0])
@@
-        # logger.info(features.A)
+        logger.debug("features shape: %s", features.shape)
@@
-        # logger.info(adj.A)
+        logger.debug("adj shape: %s", adj.shape)
@@
-        # logger.info(final_embeddings)
+        logger.debug("final_embeddings shape: %s", final_embeddings.shape)

Also applies to: 41-41, 73-73, 82-82

🤖 Prompt for AI Agents
In coursemapper-kg/recommendation/app/services/course_materials/GCN/gcn.py at
lines 37, 41, 73, and 82, replace the commented-out logger.info calls that log
tensor shapes with logger.debug calls instead. This keeps the shape-only logs
active at a debug level to maintain observability without cluttering output or
incurring high logging costs.

Comment on lines +10 to +21
<div *ngIf="video?.title.length < 45; else videoTitlElse">
<h2 class="text-xl font-bold truncate inline-flex" [innerHTML] = "video.title"></h2>
</div>
<ng-template #videoTitlElse>
<div>
<h2 class="text-xl font-bold truncate inline-flex" [innerHTML] = "video?.title.substring(0, 45)"></h2>
</div>
<div>
<h2 class="text-xl font-bold truncate inline-flex" [innerHTML] = "video.title.substring(45)"></h2>
</div>
</ng-template>


🧹 Nitpick (assertive)

Title split works; consider avoiding innerHTML and using a 2-line clamp

If you don’t need HTML in titles, prefer interpolation to avoid innerHTML sanitization overhead. Also, Tailwind’s line-clamp (or CSS line clamp) can handle 2-line truncation without manual substring splits.

Example (no diff, conceptual):

  • Replace [innerHTML] with {{ … }}.
  • Use line-clamp-2 (with plugin) or CSS -webkit-line-clamp: 2 to show two lines with ellipsis.
🤖 Prompt for AI Agents
In
webapp/src/app/pages/components/knowledge-graph/videos/watch-video/watch-video.component.html
between lines 10 and 21, the video title is split manually using substring and
rendered with [innerHTML], which is unnecessary and less efficient. Replace the
[innerHTML] bindings with Angular interpolation {{ video.title }} to avoid
sanitization overhead. Remove the manual substring splitting and instead apply a
CSS class using Tailwind's line-clamp-2 utility or equivalent CSS with
-webkit-line-clamp: 2 to truncate the title to two lines with an ellipsis
automatically.

Comment on lines +6 to +7
import { Neo4jResult } from '../models/croForm';
import { HTTPOptions } from '../config/config';

🧹 Nitpick (assertive)

Unused import

HTTPOptions is imported but not used; drop it to keep the bundle lean.

🤖 Prompt for AI Agents
In webapp/src/app/services/custom-recommenation-option.service.ts around lines 6
to 7, the import statement includes HTTPOptions which is not used anywhere in
the file. Remove the import of HTTPOptions from the import statement to
eliminate unused code and keep the bundle size smaller.

Comment on lines +18 to +23
private isResultTabSubject = new BehaviorSubject<string>(null);
public isResultTabSelected$ = this.isResultTabSubject.asObservable();

public setResultaTabValue(value: string) {
this.isResultTabSubject.next(value);
}

🧹 Nitpick (assertive)

BehaviorSubject type mismatch & typo in setter

new BehaviorSubject<string>(null) violates the generic; use BehaviorSubject<string | null>.

Method name setResultaTabValue has an extra “a”.
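
A small sketch showing both fixes together (the wrapper class is scaffolding for illustration, not the project's service):

import { BehaviorSubject, Observable } from 'rxjs';

export class ResultTabStateSketch {
  // Widen the generic so the null initial value is type-correct.
  private isResultTabSubject = new BehaviorSubject<string | null>(null);
  public isResultTabSelected$: Observable<string | null> = this.isResultTabSubject.asObservable();

  // Renamed: the stray "a" in setResultaTabValue is dropped.
  public setResultTabValue(value: string): void {
    this.isResultTabSubject.next(value);
  }
}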

🤖 Prompt for AI Agents
In webapp/src/app/services/custom-recommenation-option.service.ts around lines
18 to 23, update the BehaviorSubject type to BehaviorSubject<string | null> to
allow null as an initial value, and rename the method setResultaTabValue to
setResultTabValue to fix the typo.

Comment on lines +8 to +10
const pipelines = ['concept-map', 'modify-graph', 'expand-material', 'concept-recommendation', 'resource-recommendation',
, 'get_resources_by_main_concepts'
];

⚠️ Potential issue

Fix sparse array: remove stray comma before new pipeline name

The leading comma on Line 9 creates an empty slot (sparse array) and trips linters (Biome: noSparseArray). Remove it so the array is dense.

-const pipelines = ['concept-map', 'modify-graph', 'expand-material', 'concept-recommendation', 'resource-recommendation',
-  , 'get_resources_by_main_concepts'
-];
+const pipelines = [
+  'concept-map',
+  'modify-graph',
+  'expand-material',
+  'concept-recommendation',
+  'resource-recommendation',
+  'get_resources_by_main_concepts',
+];
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const pipelines = ['concept-map', 'modify-graph', 'expand-material', 'concept-recommendation', 'resource-recommendation',
, 'get_resources_by_main_concepts'
];
const pipelines = [
'concept-map',
'modify-graph',
'expand-material',
'concept-recommendation',
'resource-recommendation',
'get_resources_by_main_concepts',
];
🧰 Tools
🪛 Biome (2.1.2)

[error] 8-10: This array contains an empty slots..

The presences of empty slots may cause incorrect information and might be a typo.
Unsafe fix: Replace hole with undefined

(lint/suspicious/noSparseArray)

🤖 Prompt for AI Agents
In webserver/src/graph/redis.js around lines 8 to 10, there is a stray leading
comma before the 'get_resources_by_main_concepts' pipeline name in the array,
creating a sparse array. Remove the extra comma so the array elements are
contiguous and the array is dense, resolving the linter error.

Comment on lines +84 to +90

💡 Verification agent

🧩 Analysis chain

Recommendation Neo4j: add error handling, readiness gating, and graceful shutdown

Connect is invoked without guarding failures or coordinating readiness. If this driver backs new routes, consider handling errors and ensuring the app awaits readiness.

Apply one of the following, depending on recs.connect return type:

Option A (sync connect throws):

-const recs = require("./graph/recommendation.neo4j");
-recs.connect(
-  process.env.NEO4J_URI,
-  process.env.NEO4J_USER,
-  process.env.NEO4J_PASSWORD
-);
+const recs = require("./graph/recommendation.neo4j");
+try {
+  recs.connect(
+    process.env.NEO4J_URI,
+    process.env.NEO4J_USER,
+    process.env.NEO4J_PASSWORD
+  );
+} catch (err) {
+  console.error("Failed to connect to Recommendation Neo4j.", err);
+  process.exit(1);
+}

Option B (async connect returns a Promise):

-const recs = require("./graph/recommendation.neo4j");
-recs.connect(
-  process.env.NEO4J_URI,
-  process.env.NEO4J_USER,
-  process.env.NEO4J_PASSWORD
-);
+const recs = require("./graph/recommendation.neo4j");
+await recs
+  .connect(
+    process.env.NEO4J_URI,
+    process.env.NEO4J_USER,
+    process.env.NEO4J_PASSWORD
+  )
+  .catch((err) => {
+    console.error("Failed to connect to Recommendation Neo4j.", err);
+    process.exit(1);
+  });

Additionally (outside this hunk), ensure graceful shutdown:

// near the bottom of server.js
process.on("SIGINT", async () => {
  try { await neo4j.close?.(); } catch {}
  try { await recs.close?.(); } catch {}
  process.exit(0);
});
process.on("SIGTERM", async () => {
  try { await neo4j.close?.(); } catch {}
  try { await recs.close?.(); } catch {}
  process.exit(0);
});

Run to locate driver close methods and confirm connect is sync vs async:


🏁 Script executed:

#!/bin/bash
rg -n "module\.exports|exports\." ./webserver/src/graph/recommendation.neo4j.js -A 5 -B 5 || true
rg -n "connect\s*\(" ./webserver/src/graph/recommendation.neo4j.* -A 5 -B 5 || true
rg -n "close\s*\(" ./webserver/src/graph/recommendation.neo4j.* -A 5 -B 5 || true

Length of output: 666


Ensure async Neo4j connection is awaited and add graceful shutdown

The connect function in webserver/src/graph/recommendation.neo4j.js is declared async, so invoking it without await or a .catch() risks unhandled promise rejections and startup race conditions. There’s also no exported close method to cleanly shut down the driver.

• webserver/src/server.js (around lines 84–90): wrap the connection call to handle success/failure

- const recs = require("./graph/recommendation.neo4j");
- recs.connect(
-   process.env.NEO4J_URI,
-   process.env.NEO4J_USER,
-   process.env.NEO4J_PASSWORD
- );
+ const recs = require("./graph/recommendation.neo4j");
+ (async () => {
+   try {
+     await recs.connect(
+       process.env.NEO4J_URI,
+       process.env.NEO4J_USER,
+       process.env.NEO4J_PASSWORD
+     );
+     console.log("Recommendation Neo4j connected");
+   } catch (err) {
+     console.error("Failed to connect to Recommendation Neo4j.", err);
+     process.exit(1);
+   }
+ })();

• webserver/src/graph/recommendation.neo4j.js: export a close helper

+ export async function close() {
+   if (graphDb.driver) {
+     await graphDb.driver.close();
+   }
+ }

• Register shutdown handlers in server.js to call recs.close() on SIGINT/SIGTERM:

process.on("SIGINT",  async () => { await recs.close?.(); process.exit(0); });
process.on("SIGTERM", async () => { await recs.close?.(); process.exit(0); });
🤖 Prompt for AI Agents
In webserver/src/server.js around lines 84 to 90, the async connect function
from recommendation.neo4j.js is called without awaiting or handling errors,
risking unhandled promise rejections and race conditions. Modify the code to
await the connect call and add try-catch to handle connection success or failure
gracefully. In webserver/src/graph/recommendation.neo4j.js, export a close
function to cleanly shut down the Neo4j driver. In server.js, register process
signal handlers for SIGINT and SIGTERM that call recs.close() asynchronously
before exiting to ensure graceful shutdown.

@coderabbitai coderabbitai bot left a comment


Review continued from previous batch...

Comment on lines +5 to +10
import numpy as np
from config import Config

import numpy as np
import pandas as pd


🧹 Nitpick (assertive)

Remove duplicate numpy import (F811)

Two consecutive import numpy as np statements re-import the same name, trigger Ruff F811, and clutter the file.

🧰 Tools
🪛 Ruff (0.12.2)

8-8: Redefinition of unused np from line 5

Remove definition: np

(F811)

🤖 Prompt for AI Agents
In
coursemapper-kg/recommendation/app/services/course_materials/recommendation/resource_recommender.py
between lines 5 and 10, there are two consecutive imports of numpy as np, which
causes a duplicate import warning (F811). Remove the redundant import statement
so that numpy is imported only once, keeping the import statements clean and
avoiding the linter error.

Comment on lines +72 to +96
results = []
recommender = Recommender(embedding_model=None)

# Check if concepts already exist and connected to any resources in Neo4j Database
concepts_db_checked = rrh.check_and_validate_resources(db=self.db, concepts=concepts)

# Crawl resources from YouTube (from each dnu) and Wikipedia API
for concept_updated in concepts_db_checked: #i in range(len(not_understood_concept_list)):
results.append(rrh.parallel_crawling_resources(function=recommender.canditate_selection,
concept_updated=concept_updated,
result_type="records",
top_n_videos=10,
top_n_articles=10
))

if recommendation_type:
# Store resources into Neo4j Database (by creating connection btw Resource and Concept_modified)
for result in results:
self.db.store_resources(resources_dict=result, cid=result["cid"],
recommendation_type=recommendation_type,
resources_form="dict"
)


def _get_personalized_recommendation(

🧹 Nitpick (assertive)

Add type hints & explicit return for resources_crawler_logic

The helper takes concepts: list[str], recommendation_type: Optional[int] and returns None.
Annotate the signature and add an explicit return None to silence ANN00x diagnostics and improve readability.

🧰 Tools
🪛 Ruff (0.12.2)

72-72: Missing return type annotation for public function resources_crawler_logic

Add return type annotation: None

(ANN201)


72-72: Missing type annotation for function argument concepts

(ANN001)


72-72: Missing type annotation for function argument recommendation_type

(ANN001)


85-85: Trailing comma missing

Add trailing comma

(COM812)


93-93: Trailing comma missing

Add trailing comma

(COM812)

🤖 Prompt for AI Agents
In
coursemapper-kg/recommendation/app/services/course_materials/recommendation/resource_recommender.py
around lines 72 to 96, the method resources_crawler_logic lacks type hints and
an explicit return statement. Add type hints to the method signature specifying
concepts as list of strings and recommendation_type as Optional[int]. Also, add
an explicit return None at the end of the method to clarify that it returns
nothing and to satisfy ANN00x diagnostics.

Comment on lines +129 to +174
rec_params: dict,
factor_weights: dict,
recommendation_type,
user_embedding="",
slide_weighted_avg_embedding_of_concepts="",
slide_document_embedding="",
top_n_resources=10
):
return {}

# Otherwise proceed save the results in the database
resources, relationships = self.db.get_or_create_resoures_relationships(
wikipedia_articles=wikipedia_articles,
youtube_videos=youtube_videos,
user_id=user_id,
material_id=material_id,
concept_ids=concept_ids,
recommender = Recommender(embedding_model=None)
self.resources_crawler_logic(concepts=rec_params["concepts"], recommendation_type=rec_params["recommendation_type"])

# Gather|Retrieve all resources crawled
resources = self.db.retrieve_resources(concepts=rec_params["concepts"], embedding_values=True)
logger.info(f"len of resources {len(resources)}")

# process with the recommendation algorithm selected
if len(resources) > 0:
data_df = pd.DataFrame(resources)
resources_df = recommender.recommend(
slide_weighted_avg_embedding_of_concepts=slide_weighted_avg_embedding_of_concepts,
slide_document_embedding=slide_document_embedding,
user_embedding=user_embedding,
top_n=10,
recommendation_type=recommendation_type,
data=data_df
)
resources_df.replace({np.nan: None}, inplace=True)
resources = resources_df.to_dict(orient='records')
self.db.store_resources(resources_list=resources, resources_form="list",resources_dict=None, cid=None)

# insert attribute "is_bookmarked_fill" for resource saved by the user
rids_user_resources_saved = self.db.get_rids_from_user_saves(user_id=rec_params["user_id"])
resources = [{**resource, 'is_bookmarked_fill': resource['rid'] in rids_user_resources_saved} for resource in resources]

# Apply ranking algorithm on the resources
resources_dict = rrh.rank_resources(resources=resources, weights=factor_weights,
top_n_resources=top_n_resources, recommendation_type=recommendation_type,
pagination_params=rec_params["pagination_params"]
)
result_final = {
"recommendation_type": rec_params["recommendation_type"],
"concepts": rec_params["concepts"],
"nodes": resources_dict
}
return result_final


🛠️ Refactor suggestion

Avoid double‐storing resources & drop dead assignment

process_recommandation_pipeline stores resources inside resources_crawler_logic and again via self.db.store_resources, risking duplicates.
Also result_final is built, then immediately returned – drop the prior assignment flagged by RET504.

🧰 Tools
🪛 Ruff (0.12.2)

129-129: Missing return type annotation for public function process_recommandation_pipeline

(ANN201)


132-132: Missing type annotation for function argument recommendation_type

(ANN001)


133-133: Missing type annotation for function argument user_embedding

(ANN001)


134-134: Missing type annotation for function argument slide_weighted_avg_embedding_of_concepts

(ANN001)


135-135: Missing type annotation for function argument slide_document_embedding

(ANN001)


136-136: Missing type annotation for function argument top_n_resources

(ANN001)


136-136: Trailing comma missing

Add trailing comma

(COM812)


143-143: Logging statement uses f-string

(G004)


154-154: Trailing comma missing

Add trailing comma

(COM812)


167-167: Trailing comma missing

Add trailing comma

(COM812)


172-172: Trailing comma missing

Add trailing comma

(COM812)


174-174: Unnecessary assignment to result_final before return statement

Remove unnecessary assignment

(RET504)

🤖 Prompt for AI Agents
In
coursemapper-kg/recommendation/app/services/course_materials/recommendation/resource_recommender.py
between lines 129 and 174, avoid storing resources twice by removing the
resource storage call inside process_recommandation_pipeline since
resources_crawler_logic already stores them. Also, eliminate the intermediate
assignment to result_final and directly return the dictionary to prevent
unnecessary variable assignment flagged by RET504.

Comment on lines +176 to +186
'''
Save cro_form, Crawl Youtube and Wikipedia API and then Store Resources
Result: { "recommendation_type": "", "concepts": [], "nodes": {"articles": [], "videos": []} }
'''
body = rrh.rec_params_request_mapped(data_rec_params, data_default)
result = { "recommendation_type": "", "concepts": [], "nodes": {"articles": [], "videos": []} }

# Map recommendation type to enum values
rec_params = body["rec_params"]
# recommendation_type_str = body["rec_params"]["recommendation_type"]
recommendation_type = RecommendationType.map_type(rec_params["recommendation_type"])

🧹 Nitpick (assertive)

Parameter top_n is unused

_get_resources(..., top_n=5) never references top_n. Remove the parameter or use it in the ranking logic to prevent API confusion.

🧰 Tools
🪛 Ruff (0.12.2)

176-176: Missing return type annotation for private function _get_resources

(ANN202)


176-176: PEP 484 prohibits implicit Optional

Convert to Optional[T]

(RUF013)


176-176: Missing type annotation for function argument top_n

(ANN001)


176-176: Unused method argument: top_n

(ARG002)

🤖 Prompt for AI Agents
In
coursemapper-kg/recommendation/app/services/course_materials/recommendation/resource_recommender.py
around lines 176 to 186, the parameter top_n is declared but never used in the
_get_resources method. Either remove the top_n parameter if it is unnecessary,
or integrate it into the resource ranking or selection logic to limit the number
of returned resources to top_n, ensuring the parameter serves its intended
purpose.

Comment on lines +254 to +263

⚠️ Potential issue

Replace blanket except Exception with precise handling

Catching all exceptions then swallowing them (pass) makes debugging impossible and hides failures from callers.
Catch specific exceptions or re-raise after logging.

🧰 Tools
🪛 Ruff (0.12.2)

254-254: Missing return type annotation for private function _get_resources_by_main_concepts

(ANN202)


259-259: Consider moving this statement to an else block

(TRY300)


260-260: Do not catch blind exception: Exception

(BLE001)


262-262: Unnecessary pass statement

Remove unnecessary pass

(PIE790)

🤖 Prompt for AI Agents
In
coursemapper-kg/recommendation/app/services/course_materials/recommendation/resource_recommender.py
around lines 254 to 263, replace the broad `except Exception` clause with
specific exception types that `resources_crawler_logic` might raise. Instead of
swallowing exceptions with `pass`, either handle them explicitly or log the
error and re-raise it to avoid hiding failures and improve debuggability.

Comment on lines +56 to +81
export async function updateConceptModifiedNode(result, userId = null) {
let resultFinal = [];
if (result && result.length > 0) {
let conceptsModified = [];
if (userId) {
conceptsModified = await neo4j.getConceptsModifiedByUserId(userId);
}

result.forEach(node => {
node.status = false;

// Update concept weight modified by the user
conceptsModified.forEach(conceptModified => {
if (node.cid === conceptModified.cid) {
node.weight = conceptModified.weight;
}
});

resultFinal.push(node);
});

// Sort result by the 'name' property
resultFinal.sort((a, b) => a.name.localeCompare(b.name));
}
return resultFinal;
}

🧹 Nitpick (assertive)

Replace nested loops with a hashmap for O(n) merges

updateConceptModifiedNode does result × conceptsModified iteration.
Convert conceptsModified to a Map keyed by cid to drop complexity and improve latency on large result sets.
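
A minimal sketch of the single-pass merge (the ConceptNode shape is assumed for illustration):

interface ConceptNode { cid: string; name: string; weight?: number; status?: boolean; }

function mergeModifiedWeights(result: ConceptNode[], conceptsModified: ConceptNode[]): ConceptNode[] {
  // Index the user-modified concepts once, keyed by cid.
  const modifiedByCid = new Map(conceptsModified.map((c): [string, ConceptNode] => [c.cid, c]));

  // Single pass over the results to apply weight overrides.
  const merged = result.map((node) => {
    const modified = modifiedByCid.get(node.cid);
    return { ...node, status: false, weight: modified ? modified.weight : node.weight };
  });

  return merged.sort((a, b) => a.name.localeCompare(b.name));
}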

🤖 Prompt for AI Agents
In webserver/src/controllers/recommendation.controller.js lines 56 to 81, the
nested loops iterating over result and conceptsModified cause O(n*m) complexity.
To optimize, create a Map from conceptsModified keyed by cid before the loop,
then in the result iteration, directly check and update node.weight from the Map
if the cid exists. This reduces the complexity to O(n) and improves performance
on large datasets.

Comment on lines +1 to +14
// import { graphDb } from "../graph/neo4j"

const neo4j = require('neo4j-driver');
const graphDb = {};

export async function connect(url, user, password) {
try {
graphDb.driver = neo4j.driver(url, neo4j.auth.basic(user, password), { disableLosslessIntegers: true });
await graphDb.driver.verifyConnectivity();
console.log(`Connected to Neo4j`);
} catch (error) {
console.error('Failed to connect to Neo4j', error);
}
}

⚠️ Potential issue

Consistent module format required

Using export with CommonJS require will break on plain Node unless transpiled.
Either switch to import … from or change to module.exports = { … }.
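
A sketch of the all-ESM direction, written in TypeScript for the types (drop the annotations for plain ESM JavaScript); the connection options simply mirror the snippet above:

import neo4j, { Driver } from 'neo4j-driver';

export const graphDb: { driver?: Driver } = {};

export async function connect(url: string, user: string, password: string): Promise<void> {
  try {
    graphDb.driver = neo4j.driver(url, neo4j.auth.basic(user, password), { disableLosslessIntegers: true });
    await graphDb.driver.verifyConnectivity();
    console.log('Connected to Neo4j');
  } catch (error) {
    console.error('Failed to connect to Neo4j', error);
  }
}

The alternative is to keep require throughout and export with module.exports = { connect } at the bottom of the file.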

🤖 Prompt for AI Agents
In webserver/src/graph/recommendation.neo4j.js lines 1 to 14, the code mixes ES
module export syntax with CommonJS require, which can cause runtime errors in
plain Node.js. To fix this, either convert all imports and exports to ES module
syntax using `import` and `export`, or switch the export to CommonJS style by
replacing `export async function connect` with `module.exports = { connect }`
and keep using `require` for imports.

Comment on lines +26 to +63
console.log("Saving or Removing from Resource Saved List");
const result = { msg: "" };

if (data.status === true) {
// Save resource
await graphDb.driver.executeQuery(
`
MATCH (a:User {uid: $user_id}), (b:Resource {rid: $rid})
MERGE (a)-[r:HAS_SAVED {user_id: $user_id, rid: $rid}]->(b)
`,
{ user_id: data.user_id, rid: data.rid }
);
result.msg = "saved";
} else {
// Remove resource
await graphDb.driver.executeQuery(
`
MATCH (a:User {uid: $user_id})-[r:HAS_SAVED {user_id: $user_id, rid: $rid}]->(b:Resource {rid: $rid})
DELETE r
`,
{ user_id: data.user_id, rid: data.rid }
);
result.msg = "removed";
}

// Update resource's saves_count
await graphDb.driver.executeQuery(
`
MATCH (a:Resource {rid: $rid})
OPTIONAL MATCH (a)<-[r:HAS_SAVED {rid: $rid}]-()
WITH a, COUNT(r) AS saves_counter
SET a.saves_count = saves_counter
`,
{ rid: data.rid }
);

console.log("Saving or Removing from Resource Saved List: Done");
return result;

🧹 Nitpick (assertive)

Remove verbose console.log/console.error in library code

Frequent console I/O in hot paths (save/remove, rate) pollutes logs and slows the API.
Return structured results and rely on a centralized logger instead.

🤖 Prompt for AI Agents
In webserver/src/graph/recommendation.neo4j.js between lines 26 and 63, remove
the console.log statements used for debugging the save/remove resource
operations to avoid verbose logging in hot code paths. Instead, ensure the
function returns structured results as it currently does, and rely on a
centralized logging mechanism outside this function for any necessary logging.

Comment on lines +139 to +154
"/api/recommendation/user_resources/filter",
[authJwt.verifyToken],
recommendationController.filterUserResourcesSavedBy
);

app.get(
"/api/wikipedia/search",
"/api/recommendation/user_resources/get_rids_from_user_saves",
[authJwt.verifyToken],
controller.searchWikipedia
recommendationController.getRidsFromUserResourcesSaved
);

app.get(
"/api/recommendation/setting/get_concepts_by_cids",
[authJwt.verifyToken],
recommendationController.getConceptsByCids
);

🧹 Nitpick (assertive)

Public routes lack explicit rate-limiting / pagination

The new /user_resources/* and /setting/* endpoints can return large sets.
Consider adding query-param limits or middleware rate-limits now to protect the service.
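
A sketch of both protections, reusing the express-rate-limit setup from the Copilot patches above and clamping limit/offset query params (the handler body is a placeholder, not the real controller):

import express, { NextFunction, Request, Response } from 'express';
import rateLimit from 'express-rate-limit';

const userResourcesLimiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 100, // limit each IP to 100 requests per windowMs
  message: 'Too many requests from this IP, please try again later.',
});

// Clamp pagination params so one request cannot pull the whole data set.
function paginate(req: Request, res: Response, next: NextFunction): void {
  const limit = Math.min(parseInt(String(req.query.limit ?? '20'), 10) || 20, 100);
  const offset = Math.max(parseInt(String(req.query.offset ?? '0'), 10) || 0, 0);
  res.locals.pagination = { limit, offset };
  next();
}

const app = express();
app.get(
  '/api/recommendation/user_resources/filter',
  [userResourcesLimiter, paginate], // authJwt.verifyToken stays first in the real routes
  (req: Request, res: Response) => {
    const { limit, offset } = res.locals.pagination;
    // ...query at most `limit` rows starting at `offset`...
    res.json({ limit, offset, items: [] });
  }
);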

🤖 Prompt for AI Agents
In webserver/src/routes/knowledgeGraph.routes.js around lines 139 to 154, the
new public endpoints under /user_resources/* and /setting/* do not have any
rate-limiting or pagination controls, which risks large data responses and
potential service overload. Add middleware to enforce rate limits on these
routes and implement query parameter-based pagination (e.g., limit and offset)
in the controller methods to restrict the size of returned data sets. Ensure the
middleware is applied alongside authJwt.verifyToken for consistent protection.

Comment on lines +180 to +184
app.post(
"/api/recommendation/save_or_remove_resources",
[authJwt.verifyToken],
recommendationController.SaveOrRemveUserResources
);

⚠️ Potential issue

Controller method name is misspelled

recommendationController.SaveOrRemveUserResources – “Remove” is missing an “o”.
Route will crash with TypeError: controller.SaveOrRemveUserResources is not a function.
Fix both the route and the controller export.

🤖 Prompt for AI Agents
In webserver/src/routes/knowledgeGraph.routes.js around lines 180 to 184, the
controller method name SaveOrRemveUserResources is misspelled and should be
SaveOrRemoveUserResources. Correct the method name in the route definition and
also update the export or definition of this method in the
recommendationController to use the correct spelling to prevent runtime errors.

@coderabbitai coderabbitai bot left a comment


Review continued from previous batch...

Comment on lines +6 to 7
import json


🧹 Nitpick (assertive)

Remove unused import

import json is never referenced – clean it up.

🧰 Tools
🪛 Ruff (0.12.2)

6-6: json imported but unused

Remove unused import: json

(F401)

🤖 Prompt for AI Agents
In coursemapper-kg/recommendation/app/services/course_materials/db/neo4_db.py at
lines 6 to 7, the import statement for json is unused. Remove the line "import
json" to clean up the code and avoid unnecessary imports.

Comment on lines +137 to +146
def _get_data(self, recommendation_type, not_understood_concept_list, slide_concepts, video):
# If personalized recommendation, use DNU concepts to query Youtube and Wikipedia
if (
recommendation_type != RecommendationType.WITHOUT_EMBEDDING
and recommendation_type != RecommendationType.COMBINED_STATIC
and recommendation_type != RecommendationType.STATIC_KEYPHRASE_BASED
and recommendation_type != RecommendationType.STATIC_DOCUMENT_BASED
and recommendation_type != RecommendationType.CONTENT_BASED_KEYPHRASE_VARIANT
and recommendation_type != RecommendationType.CONTENT_BASED_DOCUMENT_VARIANT
):
logger.info("# If personalized recommendation, use DNU concepts to query Youtube and Wikipedia")


⚠️ Potential issue

isinstance(data, list) and data.empty will crash

When data is a list, the second operand is evaluated and list has no .empty, raising AttributeError.
Use separate blocks:

if isinstance(data, list):
    if not data:         # empty list
        return []
elif data.empty:
    return []
🧰 Tools
🪛 Ruff (0.12.2)

137-137: Missing return type annotation for private function _get_data

(ANN202)


137-137: Missing type annotation for function argument recommendation_type

(ANN001)


137-137: Missing type annotation for function argument not_understood_concept_list

(ANN001)


137-137: Missing type annotation for function argument slide_concepts

(ANN001)


137-137: Missing type annotation for function argument video

(ANN001)

🤖 Prompt for AI Agents
In
coursemapper-kg/recommendation/app/services/course_materials/recommendation/recommender.py
around lines 137 to 146, the code checks if data is a list and if it is empty
using a combined condition that causes an AttributeError because lists do not
have an 'empty' attribute. To fix this, separate the checks: first verify if
data is a list and if it is empty using 'if not data', then separately check if
data has the 'empty' attribute and is empty. Return an empty list in both cases
to avoid the crash.

Comment on lines +177 to +188
def recommend(
self,
recommendation_type=RecommendationType.WITHOUT_EMBEDDING,
data:pd.DataFrame=None,
slide_weighted_avg_embedding_of_concepts="",
slide_document_embedding="",
user_embedding="",
top_n=10,
video=True,
not_understood_concept_list=[],
slide_concepts=[],
):

⚠️ Potential issue

Mutable default arguments & missing default return

not_understood_concept_list=[] and slide_concepts=[] create shared state between calls; also the function falls through with no return if none of the four elif branches match.
Refactor:

def recommend(...,
              not_understood_concept_list=None,
              slide_concepts=None):
    not_understood_concept_list = not_understood_concept_list or []
    slide_concepts = slide_concepts or []
    ...
    return data  # sensible default
🧰 Tools
🪛 Ruff (0.12.2)

177-273: Missing explicit return at the end of function able to return non-None value

Add explicit return statement

(RET503)


177-177: Missing return type annotation for public function recommend

(ANN201)


179-179: Missing type annotation for function argument recommendation_type

(ANN001)


181-181: Missing type annotation for function argument slide_weighted_avg_embedding_of_concepts

(ANN001)


182-182: Missing type annotation for function argument slide_document_embedding

(ANN001)


183-183: Missing type annotation for function argument user_embedding

(ANN001)


184-184: Missing type annotation for function argument top_n

(ANN001)


184-184: Unused method argument: top_n

(ARG002)


185-185: Boolean default positional argument in function definition

(FBT002)


185-185: Missing type annotation for function argument video

(ANN001)


185-185: Unused method argument: video

(ARG002)


186-186: Missing type annotation for function argument not_understood_concept_list

(ANN001)


186-186: Unused method argument: not_understood_concept_list

(ARG002)


186-186: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)


187-187: Missing type annotation for function argument slide_concepts

(ANN001)


187-187: Unused method argument: slide_concepts

(ARG002)


187-187: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)

🤖 Prompt for AI Agents
In
coursemapper-kg/recommendation/app/services/course_materials/recommendation/recommender.py
around lines 177 to 188, the function recommend uses mutable default arguments
for not_understood_concept_list and slide_concepts, which can cause shared state
issues across calls. Change their default values to None and inside the function
initialize them to empty lists if they are None. Also, ensure the function
always returns a value by adding a default return statement, such as returning
data, if none of the conditional branches are taken.

Comment on lines +220 to +235
elif recommendation_type == RecommendationType.PKG_BASED_DOCUMENT_VARIANT:
start_time = time.time()

# Transform embedding to tensor
user_tensor = get_tensor_from_embedding(user_embedding.split(","))

# Step 1: compute document embedding for resources
data = self.compute_document_based_embeddings(data)
end_time = time.time()
print(
"Retrieve term-based embeddings Execution time: ",
end_time - start_time,
flush=True,
)
start_time = time.time()


# Step 2: compute similarities between user embeddings and resources document embeddings
logger.info("Compute Cosine Similarities")
data = compute_dynamic_document_based_similarity(
data, recommendation_type, user_embedding=user_tensor
)
end_time = time.time()
print(
"Retrieve user document-based similarity Execution time: ",
end_time - start_time,
flush=True,
)
start_time = time.time()

# Step 3: sort results
sorted_data = sort_by_similarity_type(data, recommendation_type)
data = compute_similarity(embedding_type="document_embedding", data=data, embedding_tensor=user_tensor)

end_time = time.time()
print("Sort result Execution time: ", end_time - start_time, flush=True)

return sorted_data.head(top_n)
# If Combined Dynamic Model.
# TODO this model was Ignored during the evalution. Can be considered in future works
elif recommendation_type == RecommendationType.COMBINED_DYNAMIC:
total_time = time.time() - start_time
logger.info(f"Algorithm Model 2: Execution time {str(total_time)}")
return data


🧹 Nitpick (assertive)

Redundant elif blocks after earlier returns

Each branch already ends with return data, so the elif chain adds nothing; replace it with independent if blocks or a dispatch table for clarity (see the sketch below).

🧰 Tools
🪛 Ruff (0.12.2)

220-220: Unnecessary elif after return statement

Remove unnecessary elif

(RET505)


233-233: Logging statement uses f-string

(G004)


233-233: Use explicit conversion flag

Replace with conversion flag

(RUF010)

🤖 Prompt for AI Agents
In
coursemapper-kg/recommendation/app/services/course_materials/recommendation/recommender.py
between lines 220 and 235, the current use of elif blocks after return
statements makes the subsequent elif blocks unreachable. To fix this, replace
the elif statements with independent if statements or implement a dispatch table
to handle each recommendation_type case separately, ensuring all branches are
reachable and the code is clearer.
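
For illustration, a dispatch-table variant could replace the if/elif chain inside recommend with something like the following (a sketch only; the per-type handler methods are hypothetical names, not existing code, and the forwarded keyword arguments mirror the surrounding parameters):

    # Map each RecommendationType to a dedicated handler method (hypothetical names).
    handlers = {
        RecommendationType.PKG_BASED_DOCUMENT_VARIANT: self._recommend_pkg_document_variant,
        RecommendationType.COMBINED_DYNAMIC: self._recommend_combined_dynamic,
    }
    handler = handlers.get(recommendation_type)
    if handler is None:
        raise ValueError(f"Unsupported recommendation type: {recommendation_type}")
    # Lazy %-style logger arguments also address the G004/RUF010 hints above.
    logger.info("Dispatching recommendation type %s", recommendation_type)
    return handler(data, user_embedding=user_embedding, top_n=top_n)

Each handler then owns its own timing and logging, which keeps every branch reachable and testable in isolation.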

Comment on lines +1 to +10
import json
from datetime import datetime, timedelta
from dateutil.parser import parse as date_parse
import numpy as np
from sklearn.preprocessing import normalize as normalize_sklearn, MinMaxScaler as MinMaxScaler_sklearn
import json
import threading
import time
import math
import scipy.stats as st

🧹 Nitpick (assertive)

Duplicate / unused imports

json is imported twice, and timedelta and time are never used. Clean them up to cut lint noise.

🧰 Tools
🪛 Ruff (0.12.2)

2-2: datetime.timedelta imported but unused

Remove unused import: datetime.timedelta

(F401)


6-6: Redefinition of unused json from line 1

Remove definition: json

(F811)


8-8: time imported but unused

Remove unused import: time

(F401)

🤖 Prompt for AI Agents
In
coursemapper-kg/recommendation/app/services/course_materials/recommendation/resource_recommender_helper.py
lines 1 to 10, remove the duplicate import of json and eliminate the unused
imports timedelta and time to reduce lint warnings and keep the import section
clean.

Comment on lines +245 to +259
if category == 1:
min_views = int(min(resources, key=lambda x: int(x["views"]))["views"])
max_views = int(max(resources, key=lambda x: int(x["views"]))["views"])

like_count_min_value = min(resources, key=lambda x: x["like_count"])["like_count"]
like_count_max_value = max(resources, key=lambda x: x["like_count"])["like_count"]

for resource in resources:
similarity_normalized = normalize_min_max_score(value=int(resource["similarity_score"]), min_value=similarity_score_min_value, max_value=similarity_score_max_value) # resource["similarity_score"]
rating_normalized = wilson_lower_bound_score(up=resource["helpful_count"], down=resource["not_helpful_count"])
creation_date_normalized = normalize_min_max_score_date(date_str=resource["publish_time"], max=now)
views_normalzed = normalize_min_max_score(value=int(resource["views"]), min_value=min_views, max_value=max_views)
bookmarked_normalized = normalize_min_max_score(value=int(resource["bookmarked_count"]), min_value=bookmarked_min_value, max_value=bookmarked_max_value)
like_count_normalized = normalize_min_max_score(value=int(resource["like_count"]), min_value=like_count_min_value, max_value=like_count_max_value)


🛠️ Refactor suggestion

Weights computed over ascending sort – likely inverted

get_top_n_concepts sorts with key=lambda x: x["weight"] (ascending by default) and then slices, so it returns the lowest-weighted concepts.
If the intention is “top-N”, sort with reverse=True (see the sketch below).

🤖 Prompt for AI Agents
In
coursemapper-kg/recommendation/app/services/course_materials/recommendation/resource_recommender_helper.py
around lines 245 to 259, the sorting of concepts by weight is done in ascending
order which returns the lowest weighted concepts instead of the top ones. To fix
this, change the sorting to descending order by adding reverse=True to the sort
or sorted function so that the highest weighted concepts are selected as
intended.
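
As a sketch of the intended behaviour (reusing the function and field names mentioned above; not the actual implementation):

def get_top_n_concepts(concepts, n):
    # Sort by weight descending so the highest-weighted concepts come first, then slice.
    return sorted(concepts, key=lambda x: x["weight"], reverse=True)[:n]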

Comment on lines +550 to +557
return { "cid": concept_updated["cid"],
"videos": result_videos,
"articles": result_articles,
"resources_video_exist": concept_updated["resources_video_exist"],
"resources_article_exist": concept_updated["resources_article_exist"],
"is_video_too_old": concept_updated["is_video_too_old"],
"is_article_too_old": concept_updated["is_video_too_old"],
}

⚠️ Potential issue

Copy-paste bug swaps article/video flag

"is_article_too_old": concept_updated["is_video_too_old"] – should reference is_article_too_old.
The wrong field propagates a stale/incorrect freshness flag and can trigger unnecessary crawls.

🤖 Prompt for AI Agents
In
coursemapper-kg/recommendation/app/services/course_materials/recommendation/resource_recommender_helper.py
around lines 550 to 557, the dictionary key "is_article_too_old" incorrectly
references concept_updated["is_video_too_old"] due to a copy-paste error. Fix
this by changing the value to concept_updated["is_article_too_old"] to correctly
reflect the article freshness check and avoid propagating incorrect flags.

Comment on lines +313 to +318
setLeftPanelMWinWidth() {
let ipo_interact = document.getElementById('ipo_interact');
ipo_interact.style.minWidth = "30rem";
ipo_interact.style.width = "30rem";
}


🧹 Nitpick (assertive)

Avoid direct DOM manipulation inside components

Calling document.getElementById and mutating style ties the component to browser globals and bypasses Angular’s renderer (no SSR / web-worker support).
Inject Renderer2 or use a template ref + @ViewChild instead.

🤖 Prompt for AI Agents
In
webapp/src/app/pages/components/knowledge-graph/result-view/result-view.component.ts
around lines 313 to 318, avoid direct DOM manipulation using
document.getElementById and style changes. Instead, inject Angular's Renderer2
in the component constructor and use it to set styles safely, or use @ViewChild
to get a template reference to the element and modify its styles via Renderer2.
This ensures compatibility with Angular's rendering engine and supports SSR and
web workers.

Comment on lines +168 to +170
SaveOrRemoveUserResource(data: any): Observable<any> {
return this.http.post<any>(`${this.LEAF_URL}/save_or_remove_resources`, data, { withCredentials: true });
}

⚠️ Potential issue

Inconsistent HTTP options – credentials sent but headers dropped

SaveOrRemoveUserResource omits the shared HTTPOptions (which configures auth headers) and only sets withCredentials.

-return this.http.post<any>(`${this.LEAF_URL}/save_or_remove_resources`, data, { withCredentials: true });
+return this.http.post<any>(
+  `${this.LEAF_URL}/save_or_remove_resources`,
+  data,
+  { ...HTTPOptions, withCredentials: true },
+);

Without headers the backend may reject the request.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
SaveOrRemoveUserResource(data: any): Observable<any> {
return this.http.post<any>(`${this.LEAF_URL}/save_or_remove_resources`, data, { withCredentials: true });
}
SaveOrRemoveUserResource(data: any): Observable<any> {
return this.http.post<any>(
`${this.LEAF_URL}/save_or_remove_resources`,
data,
{ ...HTTPOptions, withCredentials: true },
);
}
🤖 Prompt for AI Agents
In webapp/src/app/services/materials-recommender.service.ts around lines 168 to
170, the SaveOrRemoveUserResource method sets withCredentials but omits the
shared HTTPOptions that include necessary auth headers. To fix this, update the
method to merge or include the shared HTTPOptions object so that both
withCredentials and the required headers are sent with the request, ensuring the
backend receives proper authentication information.

dependabot bot and others added 3 commits August 25, 2025 15:37
Bumps nginxinc/nginx-unprivileged from 1.29.0-alpine to 1.29.1-alpine.

---
updated-dependencies:
- dependency-name: nginxinc/nginx-unprivileged
  dependency-version: 1.29.1-alpine
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
postgres:16 image no longer requires Debian archive hacks
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (4)
coursemapper-kg/wp-pg/Dockerfile (4)

15-15: Ensure the custom entrypoint is executable at build time

Make the permission explicit to avoid relying on host file mode.

Apply this diff:

-COPY bin/entrypoint.sh /usr/local/bin/custom-entrypoint.sh
+COPY --chmod=0755 bin/entrypoint.sh /usr/local/bin/custom-entrypoint.sh

18-18: Reconsider rclone config placement and ownership

A config copied to /root likely won’t be readable if the process runs as postgres. Prefer a well-known location with proper ownership, e.g. /etc/rclone/rclone.conf (system-wide) or $PGUSER’s XDG config (/var/lib/postgresql/.config/rclone/rclone.conf).

Minimal, system-wide approach (requires the directory):

+# create system rclone config dir
+RUN install -d -m 0755 /etc/rclone
-COPY config/rclone.conf.orig /root
+COPY --chmod=0644 config/rclone.conf.orig /etc/rclone/rclone.conf

If you keep a per-user config instead, create the tree and set ownership:

RUN install -d -o postgres -g postgres -m 0700 /var/lib/postgresql/.config/rclone
COPY --chown=postgres:postgres --chmod=0600 config/rclone.conf.orig /var/lib/postgresql/.config/rclone/rclone.conf

20-20: Add a container HEALTHCHECK

Helps orchestration detect readiness/liveness; matches Checkov hint CKV_DOCKER_2.

Apply this diff:

 ENTRYPOINT ["/usr/local/bin/custom-entrypoint.sh"]
 
+HEALTHCHECK --interval=30s --timeout=5s --start-period=30s --retries=5 \
+  CMD pg_isready -U "${POSTGRES_USER:-postgres}" -d "${POSTGRES_DB:-postgres}" || exit 1

15-20: Run as non-root (or drop privileges) to satisfy least-privilege and Checkov CKV_DOCKER_3

If your entrypoint doesn’t need root, explicitly run as postgres. Otherwise, ensure the script drops privileges (e.g., exec gosu postgres …) before starting Postgres.

One-option diff (if root not required):

-COPY --chmod=0755 bin/entrypoint.sh /usr/local/bin/custom-entrypoint.sh
-ENTRYPOINT ["/usr/local/bin/custom-entrypoint.sh"]
+COPY --chmod=0755 bin/entrypoint.sh /usr/local/bin/custom-entrypoint.sh
+USER postgres
+ENTRYPOINT ["/usr/local/bin/custom-entrypoint.sh"]

If root is required during init, keep USER as root but drop privileges within the script before launching Postgres.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 43b537e and 322d644.

📒 Files selected for processing (1)
  • coursemapper-kg/wp-pg/Dockerfile (1 hunks)
🧰 Additional context used
🪛 Checkov (3.2.334)
coursemapper-kg/wp-pg/Dockerfile

[LOW] 1-20: Ensure that HEALTHCHECK instructions have been added to container images

(CKV_DOCKER_2)


[LOW] 1-20: Ensure that a user for the container has been created

(CKV_DOCKER_3)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Build container images / Build image (coursemapper-kg/preprocess)
  • GitHub Check: Build container images / Build image (coursemapper-kg/concept-map)
  • GitHub Check: Build container images / Build image (webapp)
🔇 Additional comments (2)
coursemapper-kg/wp-pg/Dockerfile (2)

1-1: Dockerfile syntax 1.15: good upgrade; ensure your builders default to BuildKit

Adopting 1.15 unlocks heredocs and cache mounts used below. Confirm your CI/builders run with DOCKER_BUILDKIT=1 (most do by default now).


15-20: Overriding Postgres ENTRYPOINT: verify you still initialize the cluster and drop privileges

Replacing the official docker-entrypoint.sh can bypass PG init scripts, env var handling (e.g., POSTGRES_DB/USER/PASSWORD), and the privilege drop to postgres. Ensure your custom-entrypoint.sh either:

  • chains to the official entrypoint (recommended), or
  • fully replicates its semantics and drops to the postgres user (via gosu) before starting the server.

If you intend to chain, a common pattern inside your script is:

exec /usr/local/bin/docker-entrypoint.sh "$@"

Optionally, if you prefer to run the container explicitly as a non-root user, add before ENTRYPOINT:

+USER postgres

Confirm your script does not require root if you set USER.


# Install dependencies
ENV RUNTIME_DEPS "ca-certificates rclone"
ENV RUNTIME_DEPS="ca-certificates rclone"

🧹 Nitpick (assertive)

Use ARG for build-time deps to avoid leaking ENV into the runtime image

RUNTIME_DEPS is only needed during build; prefer ARG so it doesn’t become a runtime env var.

Apply this diff:

-ENV RUNTIME_DEPS="ca-certificates rclone"
+ARG RUNTIME_DEPS="ca-certificates rclone"
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
ENV RUNTIME_DEPS="ca-certificates rclone"
ARG RUNTIME_DEPS="ca-certificates rclone"
🤖 Prompt for AI Agents
In coursemapper-kg/wp-pg/Dockerfile around line 5, RUNTIME_DEPS is declared with
ENV which leaks build-only packages into the runtime image; change ENV
RUNTIME_DEPS="ca-certificates rclone" to an ARG declaration (e.g., ARG
BUILD_DEPS="ca-certificates rclone") and update subsequent references in the
Dockerfile to use that ARG during build steps (apt-get install, cleanup, etc.)
so the variable is not present as a runtime environment variable.

Comment on lines 6 to 12
RUN --mount=type=cache,sharing=private,target=/var/cache/apt \
--mount=type=cache,sharing=private,target=/var/lib/apt <<EOF
rm -f /etc/apt/apt.conf.d/docker-clean
echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' > /etc/apt/apt.conf.d/keep-cache
DEBIAN_FRONTEND=noninteractive apt-get update -q &&
apt-get --reinstall install debian-archive-keyring &&
apt-get install -qq --no-install-recommends -o=Dpkg::Use-Pty=0 $RUNTIME_DEPS
EOF

🛠️ Refactor suggestion

Harden the APT block: fail fast, noninteractive install, and don’t persist keep-cache config

  • Add set -eux to fail early.
  • Use -y for install (update doesn’t take -y).
  • Remove /etc/apt/apt.conf.d/keep-cache after install so the runtime image isn’t left with build-time APT behavior.
  • Optional: explicitly clear lists (mostly moot with cache mounts, but harmless).

Apply this diff:

 RUN --mount=type=cache,sharing=private,target=/var/cache/apt \
     --mount=type=cache,sharing=private,target=/var/lib/apt <<EOF
-  rm -f /etc/apt/apt.conf.d/docker-clean
-  echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' > /etc/apt/apt.conf.d/keep-cache
-  DEBIAN_FRONTEND=noninteractive apt-get update -q &&
-  apt-get install -qq --no-install-recommends -o=Dpkg::Use-Pty=0 $RUNTIME_DEPS
+  set -eux
+  rm -f /etc/apt/apt.conf.d/docker-clean
+  echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' > /etc/apt/apt.conf.d/keep-cache
+  DEBIAN_FRONTEND=noninteractive apt-get update -q
+  apt-get install -y -qq --no-install-recommends -o=Dpkg::Use-Pty=0 $RUNTIME_DEPS
+  rm -f /etc/apt/apt.conf.d/keep-cache
+  rm -rf /var/lib/apt/lists/*
 EOF
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
RUN --mount=type=cache,sharing=private,target=/var/cache/apt \
--mount=type=cache,sharing=private,target=/var/lib/apt <<EOF
rm -f /etc/apt/apt.conf.d/docker-clean
echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' > /etc/apt/apt.conf.d/keep-cache
DEBIAN_FRONTEND=noninteractive apt-get update -q &&
apt-get --reinstall install debian-archive-keyring &&
apt-get install -qq --no-install-recommends -o=Dpkg::Use-Pty=0 $RUNTIME_DEPS
EOF
RUN --mount=type=cache,sharing=private,target=/var/cache/apt \
--mount=type=cache,sharing=private,target=/var/lib/apt <<EOF
set -eux
rm -f /etc/apt/apt.conf.d/docker-clean
echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' > /etc/apt/apt.conf.d/keep-cache
DEBIAN_FRONTEND=noninteractive apt-get update -q
apt-get install -y -qq --no-install-recommends -o=Dpkg::Use-Pty=0 $RUNTIME_DEPS
rm -f /etc/apt/apt.conf.d/keep-cache
rm -rf /var/lib/apt/lists/*
EOF
🤖 Prompt for AI Agents
In coursemapper-kg/wp-pg/Dockerfile around lines 6 to 12, the APT block should
be hardened: add a shell fail-fast and verbose header (set -eux) at the top of
the heredoc, use -y on apt-get install (DEBIAN_FRONTEND=noninteractive apt-get
install -y ... -qq --no-install-recommends -o=Dpkg::Use-Pty=0 $RUNTIME_DEPS),
and ensure we remove the temporary keep-cache file after installation (rm -f
/etc/apt/apt.conf.d/keep-cache) so runtime image isn’t left with build-time
behavior; optionally clear apt lists after install (rm -rf /var/lib/apt/lists/*)
to avoid persisting package lists (harmless with cache mounts).

ralf-berger and others added 2 commits August 27, 2025 10:10
postgres:16 image no longer requires Debian archive hacks
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
webapp/src/app/pages/components/knowledge-graph/concept-map/concept-map.component.ts (1)

1537-1566: Restore button state on error/complete of materials call

isRecommendationButtonDisplayed is reset only on success. Add resets on error/complete to avoid stuck UI.

           .subscribe({
             next: (result) => {
               this.isRecommendationButtonDisplayed = true;
               this.resourcesPagination = result;
               this.kgTabs.kgTabsEnable();
               this.mainConceptsTab = false;
               this.recommendedConceptsTab = false;
               this.recommendedMaterialsTab = true;
             },
-            complete: () => {
-              this.showRecommendationButtonClicked = false;
-            },
+            complete: () => {
+              this.isRecommendationButtonDisplayed = true;
+              this.showRecommendationButtonClicked = false;
+            },
+            error: (err) => {
+              this.isRecommendationButtonDisplayed = true;
+              this.showRecommendationButtonClicked = false;
+              console.error(err);
+              this.displayMessage(err.message);
+              this.isLoading = false;
+              this.loading.emit(false);
+            },
           });
♻️ Duplicate comments (3)
webapp/src/app/pages/components/knowledge-graph/concept-map/concept-map.component.ts (3)

901-905: Don’t hard‑code graph width to 75%; compute responsively

Use container/sidebar widths instead of a magic percentage. This aligns with prior feedback.

-  setResponsiveWidthKnowledgeGraph(knowledgeGraph) {
-    if (knowledgeGraph && knowledgeGraph.style) {
-      knowledgeGraph.style.width = '75%';
-    }
-  }
+  setResponsiveWidthKnowledgeGraph(knowledgeGraph: HTMLElement | null) {
+    if (!knowledgeGraph) return;
+    const container = document.getElementById('slideKgDialogDiv') as HTMLElement | null;
+    const sidebar = document.getElementById('flexboxNotUnderstood') as HTMLElement | null;
+    const margin = this.showConceptsListSidebar ? 16 : 0; // 1rem
+    const containerWidth =
+      container?.offsetWidth ?? knowledgeGraph.parentElement?.clientWidth ?? window.innerWidth;
+    const sidebarWidth = this.showConceptsListSidebar && sidebar ? sidebar.offsetWidth : 0;
+    knowledgeGraph.style.width = Math.max(0, containerWidth - sidebarWidth - margin) + 'px';
+  }

2295-2298: Make croUpdater robust (early return + defaults)

Avoid silent no‑ops and undefined param flows. Mirrors earlier guidance to guard croComponent.

-  croUpdater(didNotUnderstandConceptsObj: any[], previousConceptsObj: any[]) {
-    this.croComponent?.updateCROformAll(didNotUnderstandConceptsObj, previousConceptsObj);
-  }
+  croUpdater(didNotUnderstandConceptsObj: any[] = [], previousConceptsObj: any[] = []) {
+    if (!this.croComponent) return;
+    this.croComponent.updateCROformAll(didNotUnderstandConceptsObj, previousConceptsObj);
+  }

2308-2324: Remove console noise and avoid event.screen; delegate to responsive helper

Use window.innerWidth and reuse setResponsiveWidthKnowledgeGraph. Also drop magic 40em.

-  setWeightGraphComponent(event) {
+  setWeightGraphComponent(event: UIEvent) {
     setTimeout(() => {
-      let knowledgeGraph = document.getElementById('graphSection');
-      if (this.showMaterialKg) {
-        console.warn("resize -> hideConceptsList -> HostListener event.screen ->", event.screen.width);
-        // let screenWidth = window.innerHeight;
-        let screenWidth = event.screen.width; // event.outerWidth
-
-        if (screenWidth >= 768 && screenWidth < 992) {
-          knowledgeGraph.style.width = '40em';
-        }
-        if (screenWidth > 992 && screenWidth <= 1200) {
-          knowledgeGraph.style.width = '40em';
-        }
-      }
+      const knowledgeGraph = document.getElementById('graphSection') as HTMLElement | null;
+      if (!knowledgeGraph || !this.showMaterialKg) return;
+      const screenWidth = window.innerWidth;
+      if (screenWidth >= 768 && screenWidth < 1200) {
+        this.setResponsiveWidthKnowledgeGraph(knowledgeGraph);
+      }
     }, 3);
   }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 322d644 and e35ebf1.

📒 Files selected for processing (1)
  • webapp/src/app/pages/components/knowledge-graph/concept-map/concept-map.component.ts (25 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
webapp/src/app/pages/components/knowledge-graph/concept-map/concept-map.component.ts (1)
webapp/src/app/models/croForm.ts (2)
  • ActivatorPartCRO (4-8)
  • ResourcesPagination (66-73)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Build container images / Build image (coursemapper-kg/preprocess)
  • GitHub Check: Build container images / Build image (coursemapper-kg/concept-map)
  • GitHub Check: Build container images / Build image (webapp)
  • GitHub Check: Build container images / Build image (coursemapper-kg/recommendation)
🔇 Additional comments (1)
webapp/src/app/pages/components/knowledge-graph/concept-map/concept-map.component.ts (1)

1490-1490: Verify necessity of PII in recommender requests

Before shipping, please confirm whether the userId (and user_id) fields you’re appending to the concept- and material-recommender payloads are actually consumed by the backend. If they’re not used server-side, remove them to avoid leaking unnecessary PII.

• webapp/src/app/pages/components/knowledge-graph/concept-map/concept-map.component.ts (line 1489)

reqDataMaterial1["userId"] = this.userid;

• webapp/src/app/pages/components/knowledge-graph/result-view/result-view.component.ts (lines 321 & 342)

this.filteringParamsSavedTab.user_id = this.userId;

• webapp/src/app/pages/components/knowledge-graph/custom-recommendation-option/custom-recommendation-option.component.ts (line 141)

this.croForm["user_id"] = this.userId;

Comment on lines +272 to +274
resourcesPagination: ResourcesPagination = undefined;
isRecommendationButtonDisplayed = true;
conceptsUpdatedCRO: any;

🧹 Nitpick (assertive)

Prefer explicit nullable type for resourcesPagination

Assigning undefined to a non‑nullable type weakens type safety.

-  resourcesPagination: ResourcesPagination = undefined;
+  resourcesPagination: ResourcesPagination | null = null;
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
resourcesPagination: ResourcesPagination = undefined;
isRecommendationButtonDisplayed = true;
conceptsUpdatedCRO: any;
resourcesPagination: ResourcesPagination | null = null;
isRecommendationButtonDisplayed = true;
conceptsUpdatedCRO: any;
🤖 Prompt for AI Agents
In
webapp/src/app/pages/components/knowledge-graph/concept-map/concept-map.component.ts
around lines 272 to 274, the field resourcesPagination is currently declared as
ResourcesPagination and assigned undefined which weakens type safety; change its
declaration to an explicit nullable type (ResourcesPagination | undefined) or
ResourcesPagination | null and keep the undefined assignment (or initialize to
null) so the compiler understands the field may be absent and enforces checks
where it is used.

Comment on lines +810 to +819
onActiveItemChange(event: MenuItem) {
// console.warn("tab onActiveItemChange") // graphSection
if (event.label === 'Main Concepts') {
// console.warn("Main Concepts")
} else if (event.label === 'Recommended Concepts') {
// console.warn("Recommended Concepts")
} else {
}

}

🧹 Nitpick (assertive)

Clean up no‑op handler and commented logs

Either implement tab‑change side effects or keep a minimal no‑op.

-  onActiveItemChange(event: MenuItem) {
-    // console.warn("tab onActiveItemChange") // graphSection
-    if (event.label === 'Main Concepts') {
-      // console.warn("Main Concepts")
-    } else if (event.label === 'Recommended Concepts') {
-      // console.warn("Recommended Concepts")
-    } else {
-    }
-
-  }
+  onActiveItemChange(_event: MenuItem): void {
+    // No-op for now.
+  }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
onActiveItemChange(event: MenuItem) {
// console.warn("tab onActiveItemChange") // graphSection
if (event.label === 'Main Concepts') {
// console.warn("Main Concepts")
} else if (event.label === 'Recommended Concepts') {
// console.warn("Recommended Concepts")
} else {
}
}
onActiveItemChange(_event: MenuItem): void {
// No-op for now.
}
🤖 Prompt for AI Agents
In
webapp/src/app/pages/components/knowledge-graph/concept-map/concept-map.component.ts
around lines 810 to 819, the onActiveItemChange handler contains only commented
console logs and empty conditional branches; remove the commented logs and the
empty else branch and either implement the needed tab-change side effects or
convert the function to a minimal no-op that clearly does nothing (e.g., keep a
single-line comment like "// no-op: tab changes handled elsewhere" and return),
ensuring the method is concise and has no unused branches or stray commented
code.

Comment on lines +1478 to 1479
this.tabIndex = 2;
this.showRecommendationButtonClicked = true;

⚠️ Potential issue

Fix tab index: index 2 doesn’t exist with current tabs

Only two tabs remain (“Main Concepts”, “Recommended Materials”). Selecting index 2 is out of bounds and may misroute UI state.

-      this.tabIndex = 2;
+      this.tabIndex = 1;
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
this.tabIndex = 2;
this.showRecommendationButtonClicked = true;
this.tabIndex = 1;
this.showRecommendationButtonClicked = true;
🤖 Prompt for AI Agents
In
webapp/src/app/pages/components/knowledge-graph/concept-map/concept-map.component.ts
around lines 1478 to 1479, the code sets this.tabIndex = 2 which is out of range
because only two tabs exist; update the assignment to use a valid index (set to
1 for the "Recommended Materials" tab) or compute it dynamically (e.g.
this.tabIndex = Math.min(desiredIndex, this.tabs.length - 1)) so tabIndex cannot
exceed available tabs; keep this.showRecommendationButtonClicked = true as-is.

Comment on lines +1535 to +1536
let reqDataFinal = this.croComponent.buildFinalRequestRecMaterial(reqData);


⚠️ Potential issue

Null‑guard CRO before building request (fallback to reqData)

croComponent may be undefined; avoid runtime errors.

-          let reqDataFinal = this.croComponent.buildFinalRequestRecMaterial(reqData);
+          const reqDataFinal = this.croComponent
+            ? this.croComponent.buildFinalRequestRecMaterial(reqData)
+            : reqData;
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
let reqDataFinal = this.croComponent.buildFinalRequestRecMaterial(reqData);
const reqDataFinal = this.croComponent
? this.croComponent.buildFinalRequestRecMaterial(reqData)
: reqData;
🤖 Prompt for AI Agents
In
webapp/src/app/pages/components/knowledge-graph/concept-map/concept-map.component.ts
around lines 1535-1536, the code calls
this.croComponent.buildFinalRequestRecMaterial(reqData) without guarding against
croComponent being undefined; add a null-guard so you only call
buildFinalRequestRecMaterial when croComponent exists and otherwise fall back to
using reqData (e.g. compute reqDataFinal = this.croComponent ?
this.croComponent.buildFinalRequestRecMaterial(reqData) : reqData), ensuring
types are adjusted or cast if necessary.

Comment on lines +2299 to +2306
setHeightGraphComponent() {
let knowledgeGraph = document.getElementById('graphSection');
if (knowledgeGraph) {
let ipo_interact = document.getElementById('ipo_interact');
// console.warn("ipo_interact with -> ", ipo_interact.offsetWidth)
this.cyHeight = ipo_interact.offsetHeight - (ipo_interact.offsetHeight * 0.15);
}
}

⚠️ Potential issue

Null‑guard DOM node before using offsetHeight

ipo_interact may be null; prevent NPE.

-  setHeightGraphComponent() {
-    let knowledgeGraph = document.getElementById('graphSection');
-    if (knowledgeGraph) {
-      let ipo_interact = document.getElementById('ipo_interact');
-      // console.warn("ipo_interact with -> ", ipo_interact.offsetWidth)
-      this.cyHeight = ipo_interact.offsetHeight - (ipo_interact.offsetHeight * 0.15);
-    }
-  }
+  setHeightGraphComponent() {
+    const knowledgeGraph = document.getElementById('graphSection');
+    if (!knowledgeGraph) return;
+    const container = document.getElementById('ipo_interact') as HTMLElement | null;
+    if (container) {
+      this.cyHeight = container.offsetHeight * 0.85;
+    }
+  }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
setHeightGraphComponent() {
let knowledgeGraph = document.getElementById('graphSection');
if (knowledgeGraph) {
let ipo_interact = document.getElementById('ipo_interact');
// console.warn("ipo_interact with -> ", ipo_interact.offsetWidth)
this.cyHeight = ipo_interact.offsetHeight - (ipo_interact.offsetHeight * 0.15);
}
}
setHeightGraphComponent() {
const knowledgeGraph = document.getElementById('graphSection');
if (!knowledgeGraph) return;
const container = document.getElementById('ipo_interact') as HTMLElement | null;
if (container) {
this.cyHeight = container.offsetHeight * 0.85;
}
}
🤖 Prompt for AI Agents
In
webapp/src/app/pages/components/knowledge-graph/concept-map/concept-map.component.ts
around lines 2299 to 2306, the code uses ipo_interact.offsetHeight without
guarding against ipo_interact being null; update the method to first check that
ipo_interact exists (e.g., if (ipo_interact) { ... }) before accessing
offsetHeight, and handle the null case by either returning early or using a safe
fallback height value so no NPE occurs.

* nber recommendations and search box activated

* support for large screen

---------

Co-authored-by: boby024 <williamkana46@gmail.com>
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 27

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
coursemapper-kg/recommendation/app/services/course_materials/db/neo4_db.py (3)

373-383: User fetch/creation bugs: incorrect return handling and wrong function call signature.

  • get_user uses .single() then list(result): raises when result is None and returns meaningless list otherwise.
  • get_or_create_user calls create_user(tx, user_id, username, user_email), but create_user accepts only (tx, user_id).
-def get_user(tx, uid):
+def get_user(tx, uid):
@@
-    result = tx.run(
-        "MATCH (u:User) WHERE u.uid = $uid RETURN u",
-        uid=uid).single()
-
-    return list(result)
+    record = tx.run(
+        "MATCH (u:User) WHERE u.uid = $uid RETURN u.uid AS uid",
+        uid=uid).single()
+    return record["uid"] if record else None
 def get_or_create_user(self, user_id, username="", user_email=""):
@@
-            _uid = get_user(tx, user_id)
-            if _uid:
-                return _uid
-            else:
-                # print(tx)
-                create_user(tx, user_id, username, user_email)
-                tx.commit()
+            uid = get_user(tx, user_id)
+            if uid:
+                tx.commit()
+                return uid
+            create_user(tx, user_id)
+            tx.commit()

Also applies to: 749-768


716-725: Wrong argument order when creating user–concept relationship.

create_user_concept_relationships expects (tx, uid, cid, relation_type), but you pass (tx, concept_id, user_id, ...).

-                create_user_concept_relationships(tx, concept_id, user_id, relation_type)
+                create_user_concept_relationships(tx, user_id, concept_id, relation_type)

1780-1788: Invalid session.commit() usage; run in explicit transaction or use context manager.

Session has no commit(). Use a transaction or implicit auto-commit via session.run.

-        session = self.driver.session()
-        session.run(
-            """Match (c:Concept) WHERE c.cid=$cid and c.mid=$mid SET c.rank=$rank """,
-            cid=node["id"],
-            mid=node["mid"],
-            rank=node["rank"])
-        session.commit()
-        session.close()
+        with self.driver.session() as session:
+            session.run(
+                """MATCH (c:Concept) WHERE c.cid=$cid AND c.mid=$mid SET c.rank=$rank""",
+                cid=node["id"],
+                mid=node["mid"],
+                rank=node["rank"],
+            )
♻️ Duplicate comments (13)
webapp/src/app/pages/components/knowledge-graph/videos/card-video-list/card-video-list.component.html (1)

3-14: Add trackBy and remove the wrapper div to cut DOM churn

Move *ngFor onto the card component and provide trackBy for efficient re-rendering.

-      <div class="" *ngFor="let videoElement of videoElements">
-        <app-card-video 
+        <app-card-video
+          *ngFor="let videoElement of videoElements; trackBy: trackByRid"
           [notUnderstoodConcepts]="notUnderstoodConcepts" 
           (onWatchVideo)="readVideo($event)" 
           [videoElement]="videoElement" 
           [userId]="userId"
           [resultTabType]="resultTabType"
           (resourceRemovedEvent)="onResourceRemovedEvent($event)"
           [currentMaterial]="currentMaterial"
         >
         </app-card-video>
-      </div>

Note: Implement trackByRid(_: number, el: VideoElementModel) { return el.rid; } in the component TS.

webapp/src/app/pages/components/knowledge-graph/result-view/result-view.component.html (3)

57-60: Duplicate id "cro_sorting" on siblings.

Keep ids unique or remove them if not needed. This was flagged earlier as well.

-          <div id="cro_sorting" class="w-3/12"></div>
+          <div class="w-3/12"></div>
           
-          <div id="cro_sorting" class="w-3/12" (click)="op.toggle($event, targetEl)">
+          <div id="cro_sorting" class="w-3/12" (click)="op.toggle($event, targetEl)">

116-135: Fix Angular *ngIf microsyntax and render logic for Videos.

  • Use semicolon in microsyntax (previously reported).
  • Ensure the list renders only when items exist; otherwise show empty/loading states.
-        <div class="mt-1 w-full" id="cro_viedo_list">
-          <div
-            *ngIf="resourcesPagination?.nodes?.videos?.total_items > 0, else elseBlockRsVideos"
-          >
-          </div>
-            <p-scrollPanel styleClass="scroll_panel_list">
-              <app-card-video-list
-                [notUnderstoodConcepts]="concepts"
-                [videoElements]="resourcesPagination?.nodes?.videos?.content"
-                [userId]="userId"
-                (backButtonClicked)="logUserViewedRecommendedVideos()"
-                [currentMaterial]="currentMaterial"
-              ></app-card-video-list>
-            </p-scrollPanel>
-        </div>
+        <div class="mt-1 w-full">
+          <ng-container *ngIf="resourcesPagination?.nodes?.videos?.total_items > 0; else elseBlockRsVideos">
+            <p-scrollPanel styleClass="scroll_panel_list">
+              <app-card-video-list
+                [notUnderstoodConcepts]="concepts"
+                [videoElements]="resourcesPagination?.nodes?.videos?.content"
+                [userId]="userId"
+                (backButtonClicked)="logUserViewedRecommendedVideos()"
+                [currentMaterial]="currentMaterial"
+              ></app-card-video-list>
+            </p-scrollPanel>
+          </ng-container>
+        </div>

Also applies to: 136-156


159-175: Fix Angular *ngIf microsyntax and render logic for Articles.

Mirror the Videos fix; also remove duplicate/typo id.

-      <div class="mt-1 w-full" id="cro_viedo_list">
-        <div
-          *ngIf="resourcesPagination?.nodes?.articles?.total_items > 0, else elseBlockRsArticles"
-        >
-        </div>
-          <p-scrollPanel styleClass="scroll_panel_list">
-            <app-card-article-list
-              [notUnderstoodConcepts]="concepts"
-              [articleElements]="resourcesPagination?.nodes?.articles?.content"
-              [userId]="userId"
-              [currentMaterial]="currentMaterial"
-            ></app-card-article-list>
-          </p-scrollPanel>
-      </div>
+      <div class="mt-1 w-full">
+        <ng-container *ngIf="resourcesPagination?.nodes?.articles?.total_items > 0; else elseBlockRsArticles">
+          <p-scrollPanel styleClass="scroll_panel_list">
+            <app-card-article-list
+              [notUnderstoodConcepts]="concepts"
+              [articleElements]="resourcesPagination?.nodes?.articles?.content"
+              [userId]="userId"
+              [currentMaterial]="currentMaterial"
+            ></app-card-article-list>
+          </p-scrollPanel>
+        </ng-container>
+      </div>

Also applies to: 176-196

webapp/src/app/pages/components/knowledge-graph/concept-map/concept-map.component.ts (8)

911-913: Stop manually setting width/margin; use the helper

Inline writes fight the helper and cause inconsistent layouts.

-      knowledgeGraph.style.marginLeft = 0 + 'rem';
-      knowledgeGraph.style.width = slideKgDialogDiv.offsetWidth + 'px';
+      this.setResponsiveWidthKnowledgeGraph();

270-275: Fix resourcesPagination nullability (assigned undefined/null to non-nullable type)

Make the type explicitly nullable and initialize consistently.

-  resourcesPagination: ResourcesPagination = undefined;
+  resourcesPagination: ResourcesPagination | null = null;

(No further code changes needed where you already set it to null in Line 1450.)

Also applies to: 1450-1451


810-819: Trim no-op tab-change handler

Convert to a minimal no-op; remove commented logs and empty branches.

-  onActiveItemChange(event: MenuItem) {
-    // console.warn("tab onActiveItemChange") // graphSection
-    if (event.label === 'Main Concepts') {
-      // console.warn("Main Concepts")
-    } else if (event.label === 'Recommended Concepts') {
-      // console.warn("Recommended Concepts")
-    } else {
-    }
-
-  }
+  onActiveItemChange(_event: MenuItem): void {
+    // No-op (tab side effects handled elsewhere)
+  }

843-853: Avoid hard-coded widths; compute width from container/sidebar; remove debug log

75%/85% breaks responsiveness and diverges from sidebar sizing. Compute based on container and sidebar widths.

-  setResponsiveWidthKnowledgeGraph() {
-    console.warn("window.innerWidth ", window.innerWidth)
-    let knowledgeGraph = document.getElementById('graphSection');
-    if (knowledgeGraph && knowledgeGraph.style) {
-      if (window.innerWidth < 2700) {
-        knowledgeGraph.style.width = '75%';
-      } else if (window.innerWidth > 2700) {
-        knowledgeGraph.style.width = '85%';
-      }
-    }
-  }
+  setResponsiveWidthKnowledgeGraph(): void {
+    const graph = document.getElementById('graphSection') as HTMLElement | null;
+    const container = document.getElementById('slideKgDialogDiv') as HTMLElement | null;
+    const sidebar = document.getElementById('flexboxNotUnderstood') as HTMLElement | null;
+    if (!graph || !container) return;
+    const sidebarWidth = this.showConceptsListSidebar && sidebar ? sidebar.offsetWidth : 0;
+    const padding = 16; // 1rem margin when sidebar is visible
+    const width = Math.max(320, container.offsetWidth - sidebarWidth - (this.showConceptsListSidebar ? padding : 0));
+    graph.style.marginLeft = this.showConceptsListSidebar ? '1rem' : '0';
+    graph.style.width = `${width}px`;
+  }

893-901: Unify spacing with the helper; drop inline marginLeft write

Delegate width/margin to setResponsiveWidthKnowledgeGraph to prevent drift.

-          if (knowledgeGraph) {
-            knowledgeGraph.style.marginLeft = 1 + 'rem';
-          }
+          // spacing handled in setResponsiveWidthKnowledgeGraph()

1487-1488: Fix tab index (2 is out of bounds for two tabs)

Set to 1 for “Recommended Materials”.

-      this.tabIndex = 2;
+      this.tabIndex = 1;

1542-1545: Null‑guard croComponent before building request (fallback to reqData)

Avoid runtime error when the child component isn’t initialized.

-          let reqDataFinal = this.croComponent.buildFinalRequestRecMaterial(reqData);
+          const reqDataFinal = this.croComponent
+            ? this.croComponent.buildFinalRequestRecMaterial(reqData)
+            : reqData;

2308-2315: Null‑guard ipo_interact before using offsetHeight

Prevents NPE when the container is missing on first render.

-  setHeightGraphComponent() {
-    let knowledgeGraph = document.getElementById('graphSection');
-    if (knowledgeGraph) {
-      let ipo_interact = document.getElementById('ipo_interact');
-      // console.warn("ipo_interact with -> ", ipo_interact.offsetWidth)
-      this.cyHeight = ipo_interact.offsetHeight - (ipo_interact.offsetHeight * 0.15);
-    }
-  }
+  setHeightGraphComponent(): void {
+    const container = document.getElementById('ipo_interact') as HTMLElement | null;
+    if (!container) return;
+    this.cyHeight = container.offsetHeight * 0.85;
+  }
coursemapper-kg/recommendation/app/services/course_materials/db/neo4_db.py (1)

6-6: Remove unused import (duplicate of prior review).

import json isn’t used. Please delete it.

- import json
📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between e35ebf1 and 63591bd.

📒 Files selected for processing (4)
  • coursemapper-kg/recommendation/app/services/course_materials/db/neo4_db.py (12 hunks)
  • webapp/src/app/pages/components/knowledge-graph/concept-map/concept-map.component.ts (26 hunks)
  • webapp/src/app/pages/components/knowledge-graph/result-view/result-view.component.html (1 hunks)
  • webapp/src/app/pages/components/knowledge-graph/videos/card-video-list/card-video-list.component.html (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
coursemapper-kg/recommendation/app/services/course_materials/db/neo4_db.py (2)
webserver/src/graph/recommendation.neo4j.js (15)
  • result (27-27)
  • result (79-79)
  • result (186-192)
  • result (204-204)
  • result (307-307)
  • result (323-323)
  • result (344-344)
  • result (363-363)
  • resources (237-263)
  • query (206-227)
  • query (283-287)
  • query (308-312)
  • query (325-331)
  • query (345-350)
  • query (364-370)
webserver/src/controllers/recommendation.controller.js (12)
  • result (9-9)
  • result (20-20)
  • result (34-34)
  • result (45-45)
  • result (90-90)
  • result (104-104)
  • result (116-116)
  • result (129-129)
  • data (7-7)
  • data (18-18)
  • data (32-32)
  • cids (86-86)
webapp/src/app/pages/components/knowledge-graph/concept-map/concept-map.component.ts (1)
webapp/src/app/models/croForm.ts (2)
  • ActivatorPartCRO (4-8)
  • ResourcesPagination (66-73)
🪛 Ruff (0.12.2)
coursemapper-kg/recommendation/app/services/course_materials/db/neo4_db.py

6-6: json imported but unused

Remove unused import: json

(F401)


129-129: Missing return type annotation for public function create_external_source_resource

Add return type annotation: None

(ANN201)


129-129: Missing type annotation for function argument tx

(ANN001)


129-129: Missing type annotation for function argument node

(ANN001)


134-134: Use format specifiers instead of percent format

Replace with format specifiers

(UP031)


134-134: Logging statement uses %

(G002)


493-493: Missing return type annotation for public function create_user_v2

(ANN201)


493-493: Missing type annotation for function argument tx

(ANN001)


493-493: Missing type annotation for function argument user

(ANN001)


503-503: Unnecessary assignment to user_id before return statement

Remove unnecessary assignment

(RET504)


1795-1795: Missing return type annotation for public function get_or_create_user_v2

(ANN201)


1795-1795: Missing type annotation for function argument user

(ANN001)


1816-1816: Missing return type annotation for public function create_or_update_video_resource

Add return type annotation: None

(ANN201)


1816-1816: Missing type annotation for function argument tx

(ANN001)


1817-1817: Missing type annotation for function argument recommendation_type

(ANN001)


1817-1817: Unused method argument: recommendation_type

(ARG002)


1818-1818: Boolean default positional argument in function definition

(FBT002)


1818-1818: Missing type annotation for function argument update_embedding_values

(ANN001)


1819-1819: Boolean default positional argument in function definition

(FBT002)


1819-1819: Missing type annotation for function argument update_detail_found

(ANN001)


1819-1819: Unused method argument: update_detail_found

(ARG002)


1855-1855: Avoid equality comparisons to True; use update_embedding_values: for truth checks

Replace with update_embedding_values

(E712)


1865-1865: Use node.get("keyphrases", []) instead of an if block

Replace with node.get("keyphrases", [])

(SIM401)


1866-1866: Use node.get("keyphrase_embedding", []) instead of an if block

Replace with node.get("keyphrase_embedding", [])

(SIM401)


1867-1867: Use node.get("document_embedding", []) instead of an if block

Replace with node.get("document_embedding", [])

(SIM401)


1868-1868: Use node.get("keyphrases_infos", "") instead of an if block

Replace with node.get("keyphrases_infos", "")

(SIM401)


1889-1889: Use format specifiers instead of percent format

Replace with format specifiers

(UP031)


1893-1893: Use format specifiers instead of percent format

Replace with format specifiers

(UP031)


1899-1899: Use node.get("keyphrases", []) instead of an if block

Replace with node.get("keyphrases", [])

(SIM401)


1900-1900: Use node.get("keyphrase_embedding", []) instead of an if block

Replace with node.get("keyphrase_embedding", [])

(SIM401)


1901-1901: Use node.get("document_embedding", []) instead of an if block

Replace with node.get("document_embedding", [])

(SIM401)


1902-1902: Use node.get("helpful_count", 0) instead of an if block

Replace with node.get("helpful_count", 0)

(SIM401)


1903-1903: Use node.get("not_helpful_count", 0) instead of an if block

Replace with node.get("not_helpful_count", 0)

(SIM401)


1904-1904: Use node.get("saves_count", 0) instead of an if block

Replace with node.get("saves_count", 0)

(SIM401)


1907-1907: datetime.datetime.now() called without a tz argument

(DTZ005)


1908-1908: Use node.get("keyphrases_infos", "") instead of an if block

Replace with node.get("keyphrases_infos", "")

(SIM401)


1910-1910: Do not catch blind exception: Exception

(BLE001)


1912-1912: Unnecessary pass statement

Remove unnecessary pass

(PIE790)


1914-1914: Missing return type annotation for public function create_or_update_wikipedia_resource

Add return type annotation: None

(ANN201)


1914-1914: Missing type annotation for function argument tx

(ANN001)


1914-1914: Missing type annotation for function argument node

(ANN001)


1914-1914: Missing type annotation for function argument recommendation_type

(ANN001)


1914-1914: Unused method argument: recommendation_type

(ARG002)


1915-1915: Boolean default positional argument in function definition

(FBT002)


1915-1915: Missing type annotation for function argument update_embedding_values

(ANN001)


1916-1916: Boolean default positional argument in function definition

(FBT002)


1916-1916: Missing type annotation for function argument update_detail_found

(ANN001)


1916-1916: Unused method argument: update_detail_found

(ARG002)


1944-1944: Avoid equality comparisons to True; use update_embedding_values: for truth checks

Replace with update_embedding_values

(E712)


1953-1953: Use node.get("keyphrases", []) instead of an if block

Replace with node.get("keyphrases", [])

(SIM401)


1954-1954: Use node.get("keyphrase_embedding", []) instead of an if block

Replace with node.get("keyphrase_embedding", [])

(SIM401)


1955-1955: Use node.get("document_embedding", []) instead of an if block

Replace with node.get("document_embedding", [])

(SIM401)


1956-1956: Use node.get("keyphrases_infos", "") instead of an if block

Replace with node.get("keyphrases_infos", "")

(SIM401)


1976-1976: Use node.get("keyphrases", []) instead of an if block

Replace with node.get("keyphrases", [])

(SIM401)


1979-1979: Use node.get("keyphrase_embedding", []) instead of an if block

Replace with node.get("keyphrase_embedding", [])

(SIM401)


1980-1980: Use node.get("document_embedding", []) instead of an if block

Replace with node.get("document_embedding", [])

(SIM401)


1981-1981: Use node.get("helpful_count", 0) instead of an if block

Replace with node.get("helpful_count", 0)

(SIM401)


1982-1982: Use node.get("not_helpful_count", 0) instead of an if block

Replace with node.get("not_helpful_count", 0)

(SIM401)


1983-1983: Use node.get("saves_count", 0) instead of an if block

Replace with node.get("saves_count", 0)

(SIM401)


1984-1984: datetime.datetime.now() called without a tz argument

(DTZ005)


1985-1985: Use node.get("keyphrases_infos", "") instead of an if block

Replace with node.get("keyphrases_infos", "")

(SIM401)


1987-1987: Do not catch blind exception: Exception

(BLE001)


1989-1989: Unnecessary pass statement

Remove unnecessary pass

(PIE790)


1991-1991: Missing return type annotation for public function get_top_n_concept_by_slide_id

(ANN201)


1991-1991: PEP 484 prohibits implicit Optional

Convert to Optional[T]

(RUF013)


1991-1991: Missing type annotation for function argument top_n

(ANN001)


2026-2026: Missing return type annotation for public function create_concept_modified

(ANN201)


2055-2094: Missing explicit return at the end of function able to return non-None value

Add explicit return statement

(RET503)


2055-2055: Missing return type annotation for public function update_rs_btw_user_and_cm

(ANN201)


2055-2055: Boolean default positional argument in function definition

(FBT002)


2055-2055: Missing type annotation for function argument only_status

(ANN001)


2062-2062: Avoid equality comparisons to True; use only_status: for truth checks

Replace with only_status

(E712)


2094-2094: Unnecessary assignment to r_detail before return statement

Remove unnecessary assignment

(RET504)


2096-2096: Missing return type annotation for public function update_rs_btw_user_and_cms

Add return type annotation: None

(ANN201)


2096-2096: Missing type annotation for function argument special_status

(ANN001)


2096-2096: Unused method argument: special_status

(ARG002)


2104-2104: Local variable result is assigned to but never used

Remove assignment to unused variable result

(F841)


2114-2114: Missing return type annotation for public function get_user_embedding_with_concept_modified

(ANN201)


2167-2167: Missing return type annotation for public function user_rates_resources

(ANN201)


2183-2183: Comparison to None should be cond is not None

Replace with cond is not None

(E711)


2183-2183: Avoid equality comparisons to True; use rating["reset"]: for truth checks

Replace with rating["reset"]

(E712)


2267-2267: Unnecessary assignment to result before return statement

Remove unnecessary assignment

(RET504)


2269-2269: Missing return type annotation for public function update_rs_btw_resource_and_cm

Add return type annotation: None

(ANN201)


2269-2269: Boolean default positional argument in function definition

(FBT002)


2269-2269: Missing type annotation for function argument action

(ANN001)


2300-2300: Do not catch blind exception: Exception

(BLE001)


2303-2303: Unnecessary pass statement

Remove unnecessary pass

(PIE790)


2305-2305: Missing return type annotation for public function update_rs_btw_resources_and_cm

Add return type annotation: None

(ANN201)


2305-2305: Boolean default positional argument in function definition

(FBT002)


2305-2305: Missing type annotation for function argument action

(ANN001)


2328-2328: Do not catch blind exception: Exception

(BLE001)


2331-2331: Unnecessary pass statement

Remove unnecessary pass

(PIE790)


2333-2333: Missing return type annotation for public function user_saves_or_removes_resource

(ANN201)


2349-2349: Avoid equality comparisons to True; use data["status"]: for truth checks

Replace with data["status"]

(E712)


2384-2384: store_resources is too complex (14 > 10)

(C901)


2384-2384: Missing return type annotation for public function store_resources

Add return type annotation: None

(ANN201)


2384-2384: PEP 484 prohibits implicit Optional

Convert to Optional[T]

(RUF013)


2384-2384: Missing type annotation for function argument recommendation_type

(ANN001)


2385-2385: PEP 484 prohibits implicit Optional

Convert to Optional[T]

(RUF013)


2385-2385: Missing type annotation for function argument resources_form

(ANN001)


2386-2386: Unused method argument: resources_updated

(ARG002)


2386-2386: PEP 484 prohibits implicit Optional

Convert to Optional[T]

(RUF013)


2386-2386: Missing type annotation for function argument resources_updated_type

(ANN001)


2386-2386: Unused method argument: resources_updated_type

(ARG002)


2404-2404: Missing return type annotation for private function get_resource_primary_key

(ANN202)


2415-2415: Logging statement uses f-string

(G004)


2422-2423: Use a single if statement instead of nested if statements

(SIM102)


2424-2424: Logging statement uses f-string

(G004)


2449-2449: Missing return type annotation for public function retrieve_resources

(ANN201)


2449-2449: Boolean default positional argument in function definition

(FBT002)


2449-2449: Missing type annotation for function argument embedding_values

(ANN001)


2455-2455: Missing return type annotation for private function resource_replace_none_value

(ANN202)


2455-2455: Missing type annotation for function argument value

(ANN001)


2456-2456: Comparison to None should be cond is None

Replace with cond is None

(E711)


2462-2462: Avoid equality comparisons to True; use embedding_values: for truth checks

Replace with embedding_values

(E712)


2513-2513: Unnecessary assignment to result before return statement

Remove unnecessary assignment

(RET504)


2515-2515: Missing return type annotation for public function retrieve_resources_by_updated_at_exist_or_counts

(ANN201)


2515-2515: Missing type annotation for function argument cids

(ANN001)


2515-2515: Boolean default positional argument in function definition

(FBT002)


2515-2515: Missing type annotation for function argument only_exist

(ANN001)


2515-2515: Missing type annotation for function argument days

(ANN001)


2526-2526: Avoid equality comparisons to True; use only_exist: for truth checks

Replace with only_exist

(E712)


2548-2548: Unnecessary assignment to count before return statement

Remove unnecessary assignment

(RET504)


2550-2550: Missing return type annotation for public function resources_wrapper_from_query

(ANN201)


2570-2570: Use resource.get("is_bookmarked_fill", False) instead of an if block

Replace with resource.get("is_bookmarked_fill", False)

(SIM401)


2571-2571: Use resource.get("saves_count", 0) instead of an if block

Replace with resource.get("saves_count", 0)

(SIM401)


2572-2572: Use resource.get("concept_cid", 0) instead of an if block

Replace with resource.get("concept_cid", 0)

(SIM401)


2592-2592: Missing return type annotation for public function update_resource_action

Add return type annotation: None

(ANN201)


2592-2592: Boolean default positional argument in function definition

(FBT002)


2592-2592: Missing type annotation for function argument action

(ANN001)


2610-2610: Missing return type annotation for public function filter_user_resources_saved_by

(ANN201)


2676-2676: Missing return type annotation for public function get_rids_from_user_saves

(ANN201)


2691-2691: Unnecessary assignment to nodes before return statement

Remove unnecessary assignment

(RET504)


2693-2693: Missing return type annotation for public function get_main_concepts_by_mid

(ANN201)


2710-2710: Unnecessary assignment to nodes before return statement

Remove unnecessary assignment

(RET504)


2712-2712: Missing return type annotation for public function get_main_concepts_by_slide_id

(ANN201)


2729-2729: Unnecessary assignment to nodes before return statement

Remove unnecessary assignment

(RET504)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Build container images / Build image (coursemapper-kg/concept-map)
  • GitHub Check: Build container images / Build image (webapp)
  • GitHub Check: Build container images / Build image (webserver)
  • GitHub Check: Build container images / Build image (coursemapper-kg/recommendation)
  • GitHub Check: Build container images / Build image (coursemapper-kg/preprocess)
🔇 Additional comments (4)
webapp/src/app/pages/components/knowledge-graph/concept-map/concept-map.component.ts (3)

7-9: LGTM: required imports added correctly

ViewChild, Renderer2, CRO types, and the CRO component are appropriately imported.

Also applies to: 34-36


2304-2306: LGTM: safe optional chaining in croUpdater

This avoids errors when the child ViewChild isn’t yet available.


2317-2333: Delegate width logic to the responsive helper
Verify that the resize handler is actually bound (via @HostListener('window:resize', ['$event'])) and that a setResponsiveWidthKnowledgeGraph() helper exists before renaming setWeightGraphComponent to setWidthGraphComponent and updating its callsites.

coursemapper-kg/recommendation/app/services/course_materials/db/neo4_db.py (1)

873-896: Potential signature mismatch in concept resource retrieval.

retrieve_concept_resources is defined as (tx, mid, cid) but get_concept_resources calls it with (tx, concept_id) and earlier with (session, material_id, cid). Ensure all call sites pass both mid and cid.

Would you like me to scan the repo and generate a patch to normalize this API and its usages?

Also applies to: 1023-1046
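
One way to resolve the mismatch is to keep (tx, mid, cid) as the single signature and make every call site pass both identifiers. A rough sketch under that assumption; the Cypher pattern below is a placeholder rather than the repo's actual schema, and the execute_read wiring is only illustrative:

def retrieve_concept_resources(tx, mid, cid):
    # Single canonical signature: every call site passes both identifiers.
    result = tx.run(
        "MATCH (c:Concept {cid: $cid, mid: $mid})-->(r:Resource) RETURN r",  # placeholder query
        mid=mid,
        cid=cid,
    )
    return [record["r"] for record in result]


def get_concept_resources(self, material_id, concept_id):
    # Method sketch on the Neo4j service class: always forward both ids.
    with self.driver.session() as session:
        return session.execute_read(retrieve_concept_resources, material_id, concept_id)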

# os.environ['PYTHONHASHSEED'] = '0'
# os.execv(sys.executable, [sys.executable] + sys.argv)

from datetime import datetime

🛠️ Refactor suggestion

Make timestamps timezone-aware (UTC).

Use timezone-aware timestamps to avoid Cypher datetime parsing inconsistencies and DST bugs.

-from datetime import datetime
+from datetime import datetime, timezone
-                    updated_at=datetime.now().isoformat(),
+                    updated_at=datetime.now(timezone.utc).isoformat(),
-                    updated_at=datetime.now().isoformat(),
+                    updated_at=datetime.now(timezone.utc).isoformat(),
-                    updated_at=datetime.now().isoformat(),
+                    updated_at=datetime.now(timezone.utc).isoformat(),

Also applies to: 1848-1849, 1907-1907, 1984-1984

🤖 Prompt for AI Agents
In coursemapper-kg/recommendation/app/services/course_materials/db/neo4_db.py
around line 16 (and also for the occurrences at lines 1848-1849, 1907, 1984),
the code currently uses naive datetime objects; change imports and usages to
produce timezone-aware UTC timestamps (e.g., import timezone from datetime and
replace datetime.now() / datetime.utcnow() usages with
datetime.now(timezone.utc) or datetime.utcnow().replace(tzinfo=timezone.utc)) so
all created timestamps are explicitly UTC-aware before being used in Cypher
queries or stored.
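
A small helper keeps the flagged call sites consistent; the name utc_now_iso is hypothetical:

from datetime import datetime, timezone


def utc_now_iso() -> str:
    """ISO-8601 timestamp that is explicitly UTC, e.g. '2025-02-11T22:47:00+00:00'."""
    return datetime.now(timezone.utc).isoformat()


# at the call sites flagged above:
# updated_at=utc_now_iso(),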

Comment on lines +129 to +147
def create_external_source_resource(tx, node):
"""
Create ExternalSource Node
"""
logger.info(
"Creating ExternalSource resource '%s'" % node["id"])
tx.run(
"""MERGE (c:Resource:ExternalSource {rid: $rid, uri: $uri,
publish_time: $created_at, cid: $cid, description: $description, helpful_count: $helpful_count,
not_helpful_count: $not_helpful_count, saves_count: $saves_count})""",
rid=node["uri"],
uri=node["uri"],
publish_time=node["created_at"],
description=node["description"],
cid=node["cid"],
helpful_count=0,
not_helpful_count=0,
saves_count=0
)

⚠️ Potential issue

Broken MERGE parameter names and RID; fix logger and type hints.

  • Cypher uses $created_at but you pass publish_time; this makes publish_time always null.
  • rid is set to node["uri"] instead of the resource id; likely unintended.
  • Prefer structured logging args over % formatting.
  • Add annotations.
-def create_external_source_resource(tx, node):
+def create_external_source_resource(tx, node) -> None:
@@
-    logger.info(
-        "Creating ExternalSource resource '%s'" % node["id"])
+    logger.info("Creating ExternalSource resource '%s'", node["id"])
     tx.run(
-        """MERGE (c:Resource:ExternalSource {rid: $rid, uri: $uri, 
-        publish_time: $created_at, cid: $cid, description: $description, helpful_count: $helpful_count,
+        """MERGE (c:Resource:ExternalSource {rid: $rid, uri: $uri, 
+        publish_time: $publish_time, cid: $cid, description: $description, helpful_count: $helpful_count,
         not_helpful_count: $not_helpful_count, saves_count: $saves_count})""",
-        rid=node["uri"],
+        rid=node["id"],
         uri=node["uri"],
-        publish_time=node["created_at"],
+        publish_time=node["created_at"],
         description=node["description"],
         cid=node["cid"],
         helpful_count=0,
         not_helpful_count=0,
         saves_count=0
     )
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def create_external_source_resource(tx, node):
"""
Create ExternalSource Node
"""
logger.info(
"Creating ExternalSource resource '%s'" % node["id"])
tx.run(
"""MERGE (c:Resource:ExternalSource {rid: $rid, uri: $uri,
publish_time: $created_at, cid: $cid, description: $description, helpful_count: $helpful_count,
not_helpful_count: $not_helpful_count, saves_count: $saves_count})""",
rid=node["uri"],
uri=node["uri"],
publish_time=node["created_at"],
description=node["description"],
cid=node["cid"],
helpful_count=0,
not_helpful_count=0,
saves_count=0
)
def create_external_source_resource(tx, node) -> None:
"""
Create ExternalSource Node
"""
logger.info("Creating ExternalSource resource '%s'", node["id"])
tx.run(
"""MERGE (c:Resource:ExternalSource {rid: $rid, uri: $uri,
publish_time: $publish_time, cid: $cid, description: $description, helpful_count: $helpful_count,
not_helpful_count: $not_helpful_count, saves_count: $saves_count})""",
rid=node["id"],
uri=node["uri"],
publish_time=node["created_at"],
description=node["description"],
cid=node["cid"],
helpful_count=0,
not_helpful_count=0,
saves_count=0
)
🧰 Tools
🪛 Ruff (0.12.2)

129-129: Missing return type annotation for public function create_external_source_resource

Add return type annotation: None

(ANN201)


129-129: Missing type annotation for function argument tx

(ANN001)


129-129: Missing type annotation for function argument node

(ANN001)


134-134: Use format specifiers instead of percent format

Replace with format specifiers

(UP031)


134-134: Logging statement uses %

(G002)

Comment on lines +1795 to +1815
def get_or_create_user_v2(self, user):
"""
"""
tx = self.driver.session()
user_node = tx.run(
"MATCH (u:User) WHERE u.uid = $uid RETURN u.uid as uid",
uid=user["user_id"]
).single()

if user_node is None:
user_node = tx.run(
"""MERGE (u:User {name: $name, uid: $uid, type: $type, email: $userEmail, embedding:$embedding}) RETURN u.uid""",
name=user["name"],
uid=user["user_id"],
type="user",
userEmail=user["user_email"],
embedding=""
).single()

return user_node


🛠️ Refactor suggestion

Resource leak and misleading naming: the session is stored in a variable called tx and is never closed.

Use a context manager so the session is always closed, and return a normalized value (the uid string) rather than the raw Record.

-def get_or_create_user_v2(self, user):
+def get_or_create_user_v2(self, user) -> str:
@@
-        tx = self.driver.session()
-        user_node = tx.run(
+        with self.driver.session() as session:
+            record = session.run(
                 "MATCH (u:User) WHERE u.uid = $uid RETURN u.uid as uid",
                 uid=user["user_id"]
-        ).single()
-
-        if user_node is None:
-            user_node = tx.run(
+            ).single()
+
+            if record is None:
+                record = session.run(
                     """MERGE (u:User {name: $name, uid: $uid, type: $type, email: $userEmail, embedding:$embedding}) RETURN u.uid as uid""",
                     name=user["name"],
                     uid=user["user_id"],
                     type="user",
                     userEmail=user["user_email"],
                     embedding=""
-                ).single()
-            
-        return user_node
+                ).single()
+            return record["uid"]
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def get_or_create_user_v2(self, user):
"""
"""
tx = self.driver.session()
user_node = tx.run(
"MATCH (u:User) WHERE u.uid = $uid RETURN u.uid as uid",
uid=user["user_id"]
).single()
if user_node is None:
user_node = tx.run(
"""MERGE (u:User {name: $name, uid: $uid, type: $type, email: $userEmail, embedding:$embedding}) RETURN u.uid""",
name=user["name"],
uid=user["user_id"],
type="user",
userEmail=user["user_email"],
embedding=""
).single()
return user_node
def get_or_create_user_v2(self, user) -> str:
"""
"""
with self.driver.session() as session:
record = session.run(
"MATCH (u:User) WHERE u.uid = $uid RETURN u.uid as uid",
uid=user["user_id"]
).single()
if record is None:
record = session.run(
"""MERGE (u:User {name: $name, uid: $uid, type: $type, email: $userEmail, embedding:$embedding}) RETURN u.uid as uid""",
name=user["name"],
uid=user["user_id"],
type="user",
userEmail=user["user_email"],
embedding=""
).single()
return record["uid"]
🧰 Tools
🪛 Ruff (0.12.2)

1795-1795: Missing return type annotation for public function get_or_create_user_v2

(ANN201)


1795-1795: Missing type annotation for function argument user

(ANN001)

🤖 Prompt for AI Agents
In coursemapper-kg/recommendation/app/services/course_materials/db/neo4_db.py
around lines 1795-1815 the Neo4j session is opened but never closed and the
function returns a raw record instead of a normalized uid string; fix by using a
context manager (with self.driver.session() as session:) and call
session.run(...) for the MATCH and MERGE queries, retrieve the uid from the
resulting record (e.g. record["uid"] or record.get("uid")) and return that uid
string (not the Record object), and ensure the MERGE/parameters match the same
uid field so the function always returns a consistent uid string.
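
With that change, callers receive a plain uid string instead of a neo4j Record. A usage sketch; the variable db and the literal values are made up:

user = {"user_id": "u-123", "name": "Ada Lovelace", "user_email": "ada@example.org"}
uid = db.get_or_create_user_v2(user)  # db: an instance of the Neo4j service class
assert isinstance(uid, str)  # same uid whether the node already existed or was just merged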

Comment on lines +1816 to +1913
def create_or_update_video_resource(self, tx, node: dict,
recommendation_type='',
update_embedding_values=False,
update_detail_found=False
):
'''
Creating Resource YouTube
r.similarity_score = $similarity_score,
'''
# logger.info(" Creating Resource YouTube")
try:
"""
if update_detail_found == True:
tx.run(
'''
MATCH (r:Resource: Video)
WHERE r.rid = $rid
SET r.title = $title, r.description = $description, r.description_full = $description_full,
r.text = $text, r.duration = $duration, r.views = $views, like_count = $like_count,
r.channel_title = $channel_title, r.updated_at = $updated_at,
r.keyphrases = $keyphrases, r.keyphrase_embedding = $keyphrase_embedding,
r.document_embedding = $document_embedding
''',
rid=node["rid"],
title=node["title"],
description=node["description"],
description_full=node["description_full"],
text=node["text"],
duration=node["duration"],
views=node["views"],
like_count=node["like_count"],
channel_title=node["channel_title"],
updated_at=datetime.now().isoformat(),
keyphrases=[],
keyphrase_embedding="",
document_embedding=""
)
"""

if update_embedding_values == True: # node.get("keyphrases") != None or node.get("document_embedding") != None or node.get("keyphrase_embedding") != None:
tx.run(
'''
MATCH (r:Resource: Video)
WHERE r.rid = $rid
SET r.keyphrases = $keyphrases, r.keyphrase_embedding = $keyphrase_embedding, r.document_embedding = $document_embedding,
r.keyphrases_infos = $keyphrases_infos

''',
rid=node["rid"],
keyphrases=node["keyphrases"] if "keyphrases" in node else [],
keyphrase_embedding=node["keyphrase_embedding"] if "keyphrase_embedding" in node else [],
document_embedding=node["document_embedding"] if "document_embedding" in node else [],
keyphrases_infos=node["keyphrases_infos"] if "keyphrases_infos" in node else ""
)
else:
tx.run(
'''
MERGE (r:Resource:Video {rid: $rid})
ON CREATE SET
r.uri = $uri, r.title = $title, r.description = $description, r.description_full = $description_full, r.text = $text,
r.keyphrases = $keyphrases, r.document_embedding = $document_embedding, r.keyphrase_embedding = $keyphrase_embedding,
r.thumbnail = $thumbnail, r.duration = $duration, r.views = $views,
r.publish_time = $pub_time, r.channel_title = $channel_title, r.like_count = $like_count,
r.helpful_count = $helpful_count, r.not_helpful_count = $not_helpful_count, r.saves_count = $saves_count,
r.updated_at = $updated_at
ON MATCH SET
r.title = $title, r.description = $description, r.description_full = $description_full, r.text = $text,
r.keyphrases = $keyphrases, r.document_embedding = $document_embedding, r.keyphrase_embedding = $keyphrase_embedding,
r.thumbnail = $thumbnail, r.duration = $duration, r.views = $views,
r.publish_time = $pub_time, r.channel_title = $channel_title, r.like_count = $like_count,
r.updated_at = $updated_at
''',
rid=node["id"],
uri="https://www.youtube.com/embed/%s?autoplay=1" % node["id"],
title=node["title"],
description=node["description"],
description_full=node["description_full"],
thumbnail="https://i.ytimg.com/vi/%s/hqdefault.jpg" % node["id"],
text=node["text"],
duration=node["duration"],
views=node["views"],
pub_time=node["publishTime"],
# similarity_score=node[recommendation_type] if recommendation_type in node.index else 0,
keyphrases=node["keyphrases"] if "keyphrases" in node else [],
keyphrase_embedding=node["keyphrase_embedding"] if "keyphrase_embedding" in node else [],
document_embedding=node["document_embedding"] if "document_embedding" in node else [],
helpful_count=node["helpful_count"] if "helpful_count" in node else 0,
not_helpful_count=node["not_helpful_count"] if "not_helpful_count" in node else 0,
saves_count=node["saves_count"] if "saves_count" in node else 0,
like_count=node["like_count"],
channel_title=node["channel_title"],
updated_at=datetime.now().isoformat(),
keyphrases_infos=node["keyphrases_infos"] if "keyphrases_infos" in node else ""
)
except Exception as e:
print(e)
pass


🛠️ Refactor suggestion

Harden video upsert: defaults, boolean checks, formatting, and exceptions.

  • Use .get() for optional fields to avoid KeyError.
  • Prefer simple truth checks over == True.
  • Use f-strings for string building (not for logging).
  • Replace bare except with logging the exception.
-            if update_embedding_values == True: # node.get("keyphrases") != None or node.get("document_embedding") != None or node.get("keyphrase_embedding") != None:
+            if update_embedding_values:
@@
-                    keyphrases=node["keyphrases"] if "keyphrases" in node else [],
-                    keyphrase_embedding=node["keyphrase_embedding"] if "keyphrase_embedding" in node else [],
-                    document_embedding=node["document_embedding"] if "document_embedding" in node else [],
-                    keyphrases_infos=node["keyphrases_infos"] if "keyphrases_infos" in node else ""
+                    keyphrases=node.get("keyphrases", []),
+                    keyphrase_embedding=node.get("keyphrase_embedding", []),
+                    document_embedding=node.get("document_embedding", []),
+                    keyphrases_infos=node.get("keyphrases_infos", "")
                 )
             else:
                 tx.run(
@@
-                    rid=node["id"],
-                    uri="https://www.youtube.com/embed/%s?autoplay=1" % node["id"],
+                    rid=node["id"],
+                    uri=f"https://www.youtube.com/embed/{node['id']}?autoplay=1",
@@
-                    thumbnail="https://i.ytimg.com/vi/%s/hqdefault.jpg" % node["id"],
+                    thumbnail=f"https://i.ytimg.com/vi/{node['id']}/hqdefault.jpg",
@@
-                    keyphrases=node["keyphrases"] if "keyphrases" in node else [],
-                    keyphrase_embedding=node["keyphrase_embedding"] if "keyphrase_embedding" in node else [],
-                    document_embedding=node["document_embedding"] if "document_embedding" in node else [],
-                    helpful_count=node["helpful_count"] if "helpful_count" in node else 0,
-                    not_helpful_count=node["not_helpful_count"] if "not_helpful_count" in node else 0,
-                    saves_count=node["saves_count"] if "saves_count" in node else 0,
+                    keyphrases=node.get("keyphrases", []),
+                    keyphrase_embedding=node.get("keyphrase_embedding", []),
+                    document_embedding=node.get("document_embedding", []),
+                    helpful_count=node.get("helpful_count", 0),
+                    not_helpful_count=node.get("not_helpful_count", 0),
+                    saves_count=node.get("saves_count", 0),
@@
-                    updated_at=datetime.now().isoformat(),
-                    keyphrases_infos=node["keyphrases_infos"] if "keyphrases_infos" in node else ""
+                    updated_at=datetime.now(timezone.utc).isoformat(),
+                    keyphrases_infos=node.get("keyphrases_infos", "")
                 )
-        except Exception as e:
-            print(e)
-            pass
+        except Exception as e:
+            logger.exception("create_or_update_video_resource failed: %s", e)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def create_or_update_video_resource(self, tx, node: dict,
recommendation_type='',
update_embedding_values=False,
update_detail_found=False
):
'''
Creating Resource YouTube
r.similarity_score = $similarity_score,
'''
# logger.info(" Creating Resource YouTube")
try:
"""
if update_detail_found == True:
tx.run(
'''
MATCH (r:Resource: Video)
WHERE r.rid = $rid
SET r.title = $title, r.description = $description, r.description_full = $description_full,
r.text = $text, r.duration = $duration, r.views = $views, like_count = $like_count,
r.channel_title = $channel_title, r.updated_at = $updated_at,
r.keyphrases = $keyphrases, r.keyphrase_embedding = $keyphrase_embedding,
r.document_embedding = $document_embedding
''',
rid=node["rid"],
title=node["title"],
description=node["description"],
description_full=node["description_full"],
text=node["text"],
duration=node["duration"],
views=node["views"],
like_count=node["like_count"],
channel_title=node["channel_title"],
updated_at=datetime.now().isoformat(),
keyphrases=[],
keyphrase_embedding="",
document_embedding=""
)
"""
if update_embedding_values == True: # node.get("keyphrases") != None or node.get("document_embedding") != None or node.get("keyphrase_embedding") != None:
tx.run(
'''
MATCH (r:Resource: Video)
WHERE r.rid = $rid
SET r.keyphrases = $keyphrases, r.keyphrase_embedding = $keyphrase_embedding, r.document_embedding = $document_embedding,
r.keyphrases_infos = $keyphrases_infos
''',
rid=node["rid"],
keyphrases=node["keyphrases"] if "keyphrases" in node else [],
keyphrase_embedding=node["keyphrase_embedding"] if "keyphrase_embedding" in node else [],
document_embedding=node["document_embedding"] if "document_embedding" in node else [],
keyphrases_infos=node["keyphrases_infos"] if "keyphrases_infos" in node else ""
)
else:
tx.run(
'''
MERGE (r:Resource:Video {rid: $rid})
ON CREATE SET
r.uri = $uri, r.title = $title, r.description = $description, r.description_full = $description_full, r.text = $text,
r.keyphrases = $keyphrases, r.document_embedding = $document_embedding, r.keyphrase_embedding = $keyphrase_embedding,
r.thumbnail = $thumbnail, r.duration = $duration, r.views = $views,
r.publish_time = $pub_time, r.channel_title = $channel_title, r.like_count = $like_count,
r.helpful_count = $helpful_count, r.not_helpful_count = $not_helpful_count, r.saves_count = $saves_count,
r.updated_at = $updated_at
ON MATCH SET
r.title = $title, r.description = $description, r.description_full = $description_full, r.text = $text,
r.keyphrases = $keyphrases, r.document_embedding = $document_embedding, r.keyphrase_embedding = $keyphrase_embedding,
r.thumbnail = $thumbnail, r.duration = $duration, r.views = $views,
r.publish_time = $pub_time, r.channel_title = $channel_title, r.like_count = $like_count,
r.updated_at = $updated_at
''',
rid=node["id"],
uri="https://www.youtube.com/embed/%s?autoplay=1" % node["id"],
title=node["title"],
description=node["description"],
description_full=node["description_full"],
thumbnail="https://i.ytimg.com/vi/%s/hqdefault.jpg" % node["id"],
text=node["text"],
duration=node["duration"],
views=node["views"],
pub_time=node["publishTime"],
# similarity_score=node[recommendation_type] if recommendation_type in node.index else 0,
keyphrases=node["keyphrases"] if "keyphrases" in node else [],
keyphrase_embedding=node["keyphrase_embedding"] if "keyphrase_embedding" in node else [],
document_embedding=node["document_embedding"] if "document_embedding" in node else [],
helpful_count=node["helpful_count"] if "helpful_count" in node else 0,
not_helpful_count=node["not_helpful_count"] if "not_helpful_count" in node else 0,
saves_count=node["saves_count"] if "saves_count" in node else 0,
like_count=node["like_count"],
channel_title=node["channel_title"],
updated_at=datetime.now().isoformat(),
keyphrases_infos=node["keyphrases_infos"] if "keyphrases_infos" in node else ""
)
except Exception as e:
print(e)
pass
def create_or_update_video_resource(self, tx, node: dict,
recommendation_type='',
update_embedding_values=False,
update_detail_found=False
):
'''
Creating Resource YouTube
r.similarity_score = $similarity_score,
'''
# logger.info(" Creating Resource YouTube")
try:
"""
if update_detail_found == True:
tx.run(
'''
MATCH (r:Resource: Video)
WHERE r.rid = $rid
SET r.title = $title, r.description = $description, r.description_full = $description_full,
r.text = $text, r.duration = $duration, r.views = $views, like_count = $like_count,
r.channel_title = $channel_title, r.updated_at = $updated_at,
r.keyphrases = $keyphrases, r.keyphrase_embedding = $keyphrase_embedding,
r.document_embedding = $document_embedding
''',
rid=node["rid"],
title=node["title"],
description=node["description"],
description_full=node["description_full"],
text=node["text"],
duration=node["duration"],
views=node["views"],
like_count=node["like_count"],
channel_title=node["channel_title"],
updated_at=datetime.now(timezone.utc).isoformat(),
keyphrases=[],
keyphrase_embedding="",
document_embedding=""
)
"""
if update_embedding_values:
tx.run(
'''
MATCH (r:Resource: Video)
WHERE r.rid = $rid
SET r.keyphrases = $keyphrases,
r.keyphrase_embedding = $keyphrase_embedding,
r.document_embedding = $document_embedding,
r.keyphrases_infos = $keyphrases_infos
''',
rid=node["rid"],
keyphrases=node.get("keyphrases", []),
keyphrase_embedding=node.get("keyphrase_embedding", []),
document_embedding=node.get("document_embedding", []),
keyphrases_infos=node.get("keyphrases_infos", "")
)
else:
tx.run(
'''
MERGE (r:Resource:Video {rid: $rid})
ON CREATE SET
r.uri = $uri, r.title = $title, r.description = $description,
r.description_full = $description_full, r.text = $text,
r.keyphrases = $keyphrases, r.document_embedding = $document_embedding,
r.keyphrase_embedding = $keyphrase_embedding,
r.thumbnail = $thumbnail, r.duration = $duration,
r.views = $views, r.publish_time = $pub_time,
r.channel_title = $channel_title, r.like_count = $like_count,
r.helpful_count = $helpful_count, r.not_helpful_count = $not_helpful_count,
r.saves_count = $saves_count, r.updated_at = $updated_at
ON MATCH SET
r.title = $title, r.description = $description,
r.description_full = $description_full, r.text = $text,
r.keyphrases = $keyphrases, r.document_embedding = $document_embedding,
r.keyphrase_embedding = $keyphrase_embedding,
r.thumbnail = $thumbnail, r.duration = $duration,
r.views = $views, r.publish_time = $pub_time,
r.channel_title = $channel_title, r.like_count = $like_count,
r.updated_at = $updated_at
''',
rid=node["id"],
uri=f"https://www.youtube.com/embed/{node['id']}?autoplay=1",
title=node["title"],
description=node["description"],
description_full=node["description_full"],
thumbnail=f"https://i.ytimg.com/vi/{node['id']}/hqdefault.jpg",
text=node["text"],
duration=node["duration"],
views=node["views"],
pub_time=node["publishTime"],
keyphrases=node.get("keyphrases", []),
keyphrase_embedding=node.get("keyphrase_embedding", []),
document_embedding=node.get("document_embedding", []),
helpful_count=node.get("helpful_count", 0),
not_helpful_count=node.get("not_helpful_count", 0),
saves_count=node.get("saves_count", 0),
like_count=node["like_count"],
channel_title=node["channel_title"],
updated_at=datetime.now(timezone.utc).isoformat(),
keyphrases_infos=node.get("keyphrases_infos", "")
)
except Exception as e:
logger.exception("create_or_update_video_resource failed: %s", e)
🧰 Tools
🪛 Ruff (0.12.2)

1816-1816: Missing return type annotation for public function create_or_update_video_resource

Add return type annotation: None

(ANN201)


1816-1816: Missing type annotation for function argument tx

(ANN001)


1817-1817: Missing type annotation for function argument recommendation_type

(ANN001)


1817-1817: Unused method argument: recommendation_type

(ARG002)


1818-1818: Boolean default positional argument in function definition

(FBT002)


1818-1818: Missing type annotation for function argument update_embedding_values

(ANN001)


1819-1819: Boolean default positional argument in function definition

(FBT002)


1819-1819: Missing type annotation for function argument update_detail_found

(ANN001)


1819-1819: Unused method argument: update_detail_found

(ARG002)


1855-1855: Avoid equality comparisons to True; use update_embedding_values: for truth checks

Replace with update_embedding_values

(E712)


1865-1865: Use node.get("keyphrases", []) instead of an if block

Replace with node.get("keyphrases", [])

(SIM401)


1866-1866: Use node.get("keyphrase_embedding", []) instead of an if block

Replace with node.get("keyphrase_embedding", [])

(SIM401)


1867-1867: Use node.get("document_embedding", []) instead of an if block

Replace with node.get("document_embedding", [])

(SIM401)


1868-1868: Use node.get("keyphrases_infos", "") instead of an if block

Replace with node.get("keyphrases_infos", "")

(SIM401)


1889-1889: Use format specifiers instead of percent format

Replace with format specifiers

(UP031)


1893-1893: Use format specifiers instead of percent format

Replace with format specifiers

(UP031)


1899-1899: Use node.get("keyphrases", []) instead of an if block

Replace with node.get("keyphrases", [])

(SIM401)


1900-1900: Use node.get("keyphrase_embedding", []) instead of an if block

Replace with node.get("keyphrase_embedding", [])

(SIM401)


1901-1901: Use node.get("document_embedding", []) instead of an if block

Replace with node.get("document_embedding", [])

(SIM401)


1902-1902: Use node.get("helpful_count", 0) instead of an if block

Replace with node.get("helpful_count", 0)

(SIM401)


1903-1903: Use node.get("not_helpful_count", 0) instead of an if block

Replace with node.get("not_helpful_count", 0)

(SIM401)


1904-1904: Use node.get("saves_count", 0) instead of an if block

Replace with node.get("saves_count", 0)

(SIM401)


1907-1907: datetime.datetime.now() called without a tz argument

(DTZ005)


1908-1908: Use node.get("keyphrases_infos", "") instead of an if block

Replace with node.get("keyphrases_infos", "")

(SIM401)


1910-1910: Do not catch blind exception: Exception

(BLE001)


1912-1912: Unnecessary pass statement

Remove unnecessary pass

(PIE790)

🤖 Prompt for AI Agents
In coursemapper-kg/recommendation/app/services/course_materials/db/neo4_db.py
around lines 1816 to 1913, the video upsert is fragile: replace direct indexing
like node["..."] with node.get("...") and provide sensible defaults (e.g. empty
list or 0) to avoid KeyError; change the conditional check from
update_embedding_values == True to a simple truthy check (if
update_embedding_values:); build uri and thumbnail using f-strings (e.g.
f"https://www.youtube.com/embed/{node.get('id')}?autoplay=1") instead of
%-formatting; and replace the current bare/print exception handling with a
proper logger call that logs the exception and stack (e.g. logger.exception or
logger.error with exc_info=True) rather than swallowing errors.

Comment on lines +1914 to +1990
def create_or_update_wikipedia_resource(self, tx, node, recommendation_type='',
update_embedding_values=False,
update_detail_found=False
):
'''
Creating Resource Wikipedia
'''
# logger.info("Creating Resource Wikipedia")
try:
"""
if update_detail_found == True:
tx.run(
'''
MATCH (r:Resource: Article)
WHERE r.rid = $rid
SET r.title = $title, r.abstract = $abstract, r.text = $text, updated_at = $updated_at,
r.keyphrases = $keyphrases, r.keyphrase_embedding = $keyphrase_embedding,
r.document_embedding = $document_embedding
''',
rid=node["rid"],
title=node["title"],
abstract=node["abstract"],
text=node["text"],
updated_at=datetime.now().isoformat(),
keyphrases=node["keyphrases"] if "keyphrases" in node else [],
keyphrase_embedding=str(node["keyphrase_embedding"] if "keyphrase_embedding" in node else ""),
document_embedding=str(node["document_embedding"] if "document_embedding" in node else ""),
)
"""

if update_embedding_values == True: # node.get("keyphrases") != None or node.get("document_embedding") != None or node.get("keyphrase_embedding") != None:
tx.run(
'''
MATCH (r:Resource: Article)
WHERE r.rid = $rid
SET r.keyphrases = $keyphrases, r.keyphrase_embedding = $keyphrase_embedding, r.document_embedding = $document_embedding,
r.keyphrases_infos = $keyphrases_infos
''',
rid=node["rid"],
keyphrases=node["keyphrases"] if "keyphrases" in node else [],
keyphrase_embedding=node["keyphrase_embedding"] if "keyphrase_embedding" in node else [],
document_embedding=node["document_embedding"] if "document_embedding" in node else [],
keyphrases_infos=node["keyphrases_infos"] if "keyphrases_infos" in node else ""
)
else:
tx.run(
'''
MERGE (r:Resource:Article {rid: $rid})
ON CREATE SET
r.uri = $uri, r.title = $title, r.abstract = $abstract, r.text = $text,
r.keyphrases = $keyphrases, r.document_embedding = $document_embedding, r.keyphrase_embedding = $keyphrase_embedding,
r.helpful_count = $helpful_count, r.not_helpful_count = $not_helpful_count, r.saves_count = $saves_count,
r.updated_at = $updated_at
ON MATCH SET
r.uri = $uri, r.title = $title, r.abstract = $abstract, r.text = $text,
r.keyphrases = $keyphrases, r.document_embedding = $document_embedding, r.keyphrase_embedding = $keyphrase_embedding,
r.updated_at = $updated_at
''',
rid=node["id"],
uri=node["id"],
title=node["title"],
abstract=node["abstract"],
keyphrases=node["keyphrases"] if "keyphrases" in node else [],
text=node["text"],
# similarity_score=node[recommendation_type] if recommendation_type in node.index else 0,
keyphrase_embedding=node["keyphrase_embedding"] if "keyphrase_embedding" in node else [],
document_embedding=node["document_embedding"] if "document_embedding" in node else [],
helpful_count=node["helpful_count"] if "helpful_count" in node else 0,
not_helpful_count=node["not_helpful_count"] if "not_helpful_count" in node else 0,
saves_count=node["saves_count"] if "saves_count" in node else 0,
updated_at=datetime.now().isoformat(),
keyphrases_infos=node["keyphrases_infos"] if "keyphrases_infos" in node else ""
)
except Exception as e:
print(e)
pass


🛠️ Refactor suggestion

Harden the article upsert the same way: use .get() defaults for optional fields, replace the broad except Exception / print pattern with proper logging, and make timestamps UTC.

-            if update_embedding_values == True: # node.get("keyphrases") != None or node.get("document_embedding") != None or node.get("keyphrase_embedding") != None:
+            if update_embedding_values:
@@
-                    keyphrases=node["keyphrases"] if "keyphrases" in node else [],
-                    keyphrase_embedding=node["keyphrase_embedding"] if "keyphrase_embedding" in node else [],
-                    document_embedding=node["document_embedding"] if "document_embedding" in node else [],
-                    keyphrases_infos=node["keyphrases_infos"] if "keyphrases_infos" in node else ""
+                    keyphrases=node.get("keyphrases", []),
+                    keyphrase_embedding=node.get("keyphrase_embedding", []),
+                    document_embedding=node.get("document_embedding", []),
+                    keyphrases_infos=node.get("keyphrases_infos", "")
                 )
             else:
                 tx.run(
@@
-                    keyphrases=node["keyphrases"] if "keyphrases" in node else [],
+                    keyphrases=node.get("keyphrases", []),
@@
-                    keyphrase_embedding=node["keyphrase_embedding"] if "keyphrase_embedding" in node else [],
-                    document_embedding=node["document_embedding"] if "document_embedding" in node else [],
-                    helpful_count=node["helpful_count"] if "helpful_count" in node else 0,
-                    not_helpful_count=node["not_helpful_count"] if "not_helpful_count" in node else 0,
-                    saves_count=node["saves_count"] if "saves_count" in node else 0,
-                    updated_at=datetime.now().isoformat(),
-                    keyphrases_infos=node["keyphrases_infos"] if "keyphrases_infos" in node else ""
+                    keyphrase_embedding=node.get("keyphrase_embedding", []),
+                    document_embedding=node.get("document_embedding", []),
+                    helpful_count=node.get("helpful_count", 0),
+                    not_helpful_count=node.get("not_helpful_count", 0),
+                    saves_count=node.get("saves_count", 0),
+                    updated_at=datetime.now(timezone.utc).isoformat(),
+                    keyphrases_infos=node.get("keyphrases_infos", "")
                 )
-        except Exception as e:
-            print(e)
-            pass
+        except Exception as e:
+            logger.exception("create_or_update_wikipedia_resource failed: %s", e)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def create_or_update_wikipedia_resource(self, tx, node, recommendation_type='',
update_embedding_values=False,
update_detail_found=False
):
'''
Creating Resource Wikipedia
'''
# logger.info("Creating Resource Wikipedia")
try:
"""
if update_detail_found == True:
tx.run(
'''
MATCH (r:Resource: Article)
WHERE r.rid = $rid
SET r.title = $title, r.abstract = $abstract, r.text = $text, updated_at = $updated_at,
r.keyphrases = $keyphrases, r.keyphrase_embedding = $keyphrase_embedding,
r.document_embedding = $document_embedding
''',
rid=node["rid"],
title=node["title"],
abstract=node["abstract"],
text=node["text"],
updated_at=datetime.now().isoformat(),
keyphrases=node["keyphrases"] if "keyphrases" in node else [],
keyphrase_embedding=str(node["keyphrase_embedding"] if "keyphrase_embedding" in node else ""),
document_embedding=str(node["document_embedding"] if "document_embedding" in node else ""),
)
"""
if update_embedding_values == True: # node.get("keyphrases") != None or node.get("document_embedding") != None or node.get("keyphrase_embedding") != None:
tx.run(
'''
MATCH (r:Resource: Article)
WHERE r.rid = $rid
SET r.keyphrases = $keyphrases, r.keyphrase_embedding = $keyphrase_embedding, r.document_embedding = $document_embedding,
r.keyphrases_infos = $keyphrases_infos
''',
rid=node["rid"],
keyphrases=node["keyphrases"] if "keyphrases" in node else [],
keyphrase_embedding=node["keyphrase_embedding"] if "keyphrase_embedding" in node else [],
document_embedding=node["document_embedding"] if "document_embedding" in node else [],
keyphrases_infos=node["keyphrases_infos"] if "keyphrases_infos" in node else ""
)
else:
tx.run(
'''
MERGE (r:Resource:Article {rid: $rid})
ON CREATE SET
r.uri = $uri, r.title = $title, r.abstract = $abstract, r.text = $text,
r.keyphrases = $keyphrases, r.document_embedding = $document_embedding, r.keyphrase_embedding = $keyphrase_embedding,
r.helpful_count = $helpful_count, r.not_helpful_count = $not_helpful_count, r.saves_count = $saves_count,
r.updated_at = $updated_at
ON MATCH SET
r.uri = $uri, r.title = $title, r.abstract = $abstract, r.text = $text,
r.keyphrases = $keyphrases, r.document_embedding = $document_embedding, r.keyphrase_embedding = $keyphrase_embedding,
r.updated_at = $updated_at
''',
rid=node["id"],
uri=node["id"],
title=node["title"],
abstract=node["abstract"],
keyphrases=node["keyphrases"] if "keyphrases" in node else [],
text=node["text"],
# similarity_score=node[recommendation_type] if recommendation_type in node.index else 0,
keyphrase_embedding=node["keyphrase_embedding"] if "keyphrase_embedding" in node else [],
document_embedding=node["document_embedding"] if "document_embedding" in node else [],
helpful_count=node["helpful_count"] if "helpful_count" in node else 0,
not_helpful_count=node["not_helpful_count"] if "not_helpful_count" in node else 0,
saves_count=node["saves_count"] if "saves_count" in node else 0,
updated_at=datetime.now().isoformat(),
keyphrases_infos=node["keyphrases_infos"] if "keyphrases_infos" in node else ""
)
except Exception as e:
print(e)
pass
def create_or_update_wikipedia_resource(self, tx, node, recommendation_type='',
update_embedding_values=False,
update_detail_found=False
):
'''
Creating Resource Wikipedia
'''
# logger.info("Creating Resource Wikipedia")
try:
"""
if update_detail_found == True:
tx.run(
'''
MATCH (r:Resource: Article)
WHERE r.rid = $rid
SET r.title = $title, r.abstract = $abstract, r.text = $text, updated_at = $updated_at,
r.keyphrases = $keyphrases, r.keyphrase_embedding = $keyphrase_embedding,
r.document_embedding = $document_embedding
''',
rid=node["rid"],
title=node["title"],
abstract=node["abstract"],
text=node["text"],
updated_at=datetime.now().isoformat(),
keyphrases=node["keyphrases"] if "keyphrases" in node else [],
keyphrase_embedding=str(node["keyphrase_embedding"] if "keyphrase_embedding" in node else ""),
document_embedding=str(node["document_embedding"] if "document_embedding" in node else ""),
)
"""
if update_embedding_values:
tx.run(
'''
MATCH (r:Resource: Article)
WHERE r.rid = $rid
SET r.keyphrases = $keyphrases, r.keyphrase_embedding = $keyphrase_embedding, r.document_embedding = $document_embedding,
r.keyphrases_infos = $keyphrases_infos
''',
rid=node["rid"],
keyphrases=node.get("keyphrases", []),
keyphrase_embedding=node.get("keyphrase_embedding", []),
document_embedding=node.get("document_embedding", []),
keyphrases_infos=node.get("keyphrases_infos", "")
)
else:
tx.run(
'''
MERGE (r:Resource:Article {rid: $rid})
ON CREATE SET
r.uri = $uri, r.title = $title, r.abstract = $abstract, r.text = $text,
r.keyphrases = $keyphrases, r.document_embedding = $document_embedding, r.keyphrase_embedding = $keyphrase_embedding,
r.helpful_count = $helpful_count, r.not_helpful_count = $not_helpful_count, r.saves_count = $saves_count,
r.updated_at = $updated_at
ON MATCH SET
r.uri = $uri, r.title = $title, r.abstract = $abstract, r.text = $text,
r.keyphrases = $keyphrases, r.document_embedding = $document_embedding, r.keyphrase_embedding = $keyphrase_embedding,
r.updated_at = $updated_at
''',
rid=node["id"],
uri=node["id"],
title=node["title"],
abstract=node["abstract"],
keyphrases=node.get("keyphrases", []),
text=node["text"],
# similarity_score=node[recommendation_type] if recommendation_type in node.index else 0,
keyphrase_embedding=node.get("keyphrase_embedding", []),
document_embedding=node.get("document_embedding", []),
helpful_count=node.get("helpful_count", 0),
not_helpful_count=node.get("not_helpful_count", 0),
saves_count=node.get("saves_count", 0),
updated_at=datetime.now(timezone.utc).isoformat(),
keyphrases_infos=node.get("keyphrases_infos", "")
)
except Exception as e:
logger.exception("create_or_update_wikipedia_resource failed: %s", e)
🧰 Tools
🪛 Ruff (0.12.2)

1914-1914: Missing return type annotation for public function create_or_update_wikipedia_resource

Add return type annotation: None

(ANN201)


1914-1914: Missing type annotation for function argument tx

(ANN001)


1914-1914: Missing type annotation for function argument node

(ANN001)


1914-1914: Missing type annotation for function argument recommendation_type

(ANN001)


1914-1914: Unused method argument: recommendation_type

(ARG002)


1915-1915: Boolean default positional argument in function definition

(FBT002)


1915-1915: Missing type annotation for function argument update_embedding_values

(ANN001)


1916-1916: Boolean default positional argument in function definition

(FBT002)


1916-1916: Missing type annotation for function argument update_detail_found

(ANN001)


1916-1916: Unused method argument: update_detail_found

(ARG002)


1944-1944: Avoid equality comparisons to True; use update_embedding_values: for truth checks

Replace with update_embedding_values

(E712)


1953-1953: Use node.get("keyphrases", []) instead of an if block

Replace with node.get("keyphrases", [])

(SIM401)


1954-1954: Use node.get("keyphrase_embedding", []) instead of an if block

Replace with node.get("keyphrase_embedding", [])

(SIM401)


1955-1955: Use node.get("document_embedding", []) instead of an if block

Replace with node.get("document_embedding", [])

(SIM401)


1956-1956: Use node.get("keyphrases_infos", "") instead of an if block

Replace with node.get("keyphrases_infos", "")

(SIM401)


1976-1976: Use node.get("keyphrases", []) instead of an if block

Replace with node.get("keyphrases", [])

(SIM401)


1979-1979: Use node.get("keyphrase_embedding", []) instead of an if block

Replace with node.get("keyphrase_embedding", [])

(SIM401)


1980-1980: Use node.get("document_embedding", []) instead of an if block

Replace with node.get("document_embedding", [])

(SIM401)


1981-1981: Use node.get("helpful_count", 0) instead of an if block

Replace with node.get("helpful_count", 0)

(SIM401)


1982-1982: Use node.get("not_helpful_count", 0) instead of an if block

Replace with node.get("not_helpful_count", 0)

(SIM401)


1983-1983: Use node.get("saves_count", 0) instead of an if block

Replace with node.get("saves_count", 0)

(SIM401)


1984-1984: datetime.datetime.now() called without a tz argument

(DTZ005)


1985-1985: Use node.get("keyphrases_infos", "") instead of an if block

Replace with node.get("keyphrases_infos", "")

(SIM401)


1987-1987: Do not catch blind exception: Exception

(BLE001)


1989-1989: Unnecessary pass statement

Remove unnecessary pass

(PIE790)

Comment on lines +209 to +224
value="video"
[(ngModel)]="filteringParamsSavedTab.content_type"
inputId="video"
(onClick)="onContentTypeChange($event)">
</p-radioButton>
<label for="video" class="ml-1">Video</label>
</div>
<div class="flex flex-row items-center">
<p-radioButton
name="article"
value="article"
[(ngModel)]="filteringParamsSavedTab.content_type"
inputId="article"
(onClick)="onContentTypeChange($event)">
</p-radioButton>
<label for="article" class="ml-1">Article</label>

🧹 Nitpick (assertive)

Confirm PrimeNG radio events.

Ensure p-radioButton actually emits onClick; if not, prefer (ngModelChange) to react to selection.

-                  (onClick)="onContentTypeChange($event)">
+                  (ngModelChange)="onContentTypeChange($event)">
...
-                  (onClick)="onContentTypeChange($event)">
+                  (ngModelChange)="onContentTypeChange($event)">
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
name="video"
value="video"
[(ngModel)]="filteringParamsSavedTab.content_type"
inputId="video"
(onClick)="onContentTypeChange($event)">
</p-radioButton>
<label for="video" class="ml-1">Video</label>
</div>
<div class="flex flex-row items-center">
<p-radioButton
name="article"
value="article"
[(ngModel)]="filteringParamsSavedTab.content_type"
inputId="article"
(onClick)="onContentTypeChange($event)">
</p-radioButton>
<div class="flex flex-row items-center">
<p-radioButton
name="video"
value="video"
[(ngModel)]="filteringParamsSavedTab.content_type"
inputId="video"
(ngModelChange)="onContentTypeChange($event)">
</p-radioButton>
<label for="video" class="ml-1">Video</label>
</div>
<div class="flex flex-row items-center">
<p-radioButton
name="article"
value="article"
[(ngModel)]="filteringParamsSavedTab.content_type"
inputId="article"
(ngModelChange)="onContentTypeChange($event)">
</p-radioButton>
🤖 Prompt for AI Agents
In
webapp/src/app/pages/components/knowledge-graph/result-view/result-view.component.html
around lines 209 to 224, the template is binding (onClick) to PrimeNG
p-radioButton which does not reliably emit that event; replace the click binding
with (ngModelChange) so selection updates trigger the handler (e.g.
(ngModelChange)="onContentTypeChange($event)") and ensure the component's
onContentTypeChange accepts the new value (string) rather than a MouseEvent;
update both radio buttons accordingly to use the same ngModel and ngModelChange
handler.

Comment on lines +346 to +347
Total Articles: {{resourcesPagination?.nodes?.articles?.total_items}}
</div>

⚠️ Potential issue

Wrong binding: Articles total gated by Videos total.

The label visibility for Articles is controlled by the Videos total.

-      <div class="my-auto mx-1 text-color" *ngIf="resourcesPagination?.nodes?.videos?.total_items">
-        Total Articles: {{resourcesPagination?.nodes?.articles?.total_items}}
+      <div class="my-auto mx-1 text-color" *ngIf="resourcesPagination?.nodes?.articles?.total_items">
+        Total Articles: {{ resourcesPagination?.nodes?.articles?.total_items }}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
<div class="my-auto mx-1 text-color" *ngIf="resourcesPagination?.nodes?.videos?.total_items">
Total Articles: {{resourcesPagination?.nodes?.articles?.total_items}}
<div class="my-auto mx-1 text-color" *ngIf="resourcesPagination?.nodes?.articles?.total_items">
Total Articles: {{ resourcesPagination?.nodes?.articles?.total_items }}
🤖 Prompt for AI Agents
In
webapp/src/app/pages/components/knowledge-graph/result-view/result-view.component.html
around lines 346 to 347, the "Total Articles" label is conditionally shown using
resourcesPagination?.nodes?.videos?.total_items which hides the articles count
when videos are zero; change the *ngIf to check
resourcesPagination?.nodes?.articles?.total_items instead (use the articles
total_items null-safe path) so the Articles label visibility is driven by the
articles count.

<div class="grid p-8 gap-8 sm:grid-cols-1 ">
  <ng-template [ngIf]="!showVideo" [ngIfElse]="watchvideo">
    <div class="" *ngFor="let videoElement of videoElements">
      <app-card-video

🛠️ Refactor suggestion

A11y: add list semantics and an accessible name for the icon-only button; mark strings for i18n

Improves screen-reader UX and localization readiness.

-<div class="grid p-8 gap-8 sm:grid-cols-1 ">
+<div class="grid p-8 gap-8 sm:grid-cols-1" role="list">
@@
-        <app-card-video
+        <app-card-video
+          role="listitem"
@@
-      <button id="backToList"
+      <button id="backToList"
               pButton
               type="button"
               (click)="goBack()"
               icon="pi pi-chevron-left"
-              pTooltip="Go back to list"
+              pTooltip="Go back to list"
+              i18n-pTooltip="@@backToListTooltip"
+              aria-label="Go back to list"
+              i18n-aria-label="@@backToListAriaLabel"
               class="bg-[#0277BD] p-button-rounded"
       ></button>

Also applies to: 4-13, 22-29

🤖 Prompt for AI Agents
In
webapp/src/app/pages/components/knowledge-graph/videos/card-video-list/card-video-list.component.html
around line 1 (also applies to lines 4-13 and 22-29): the container and its
child items lack list semantics, the icon-only button has no accessible name,
and visible strings are not marked for i18n; fix by giving the container proper
list semantics (replace the generic div with a semantic <ul> or add
role="list"), ensure each item is a <li> or has role="listitem", add an
accessible name to the icon-only button (aria-label or aria-labelledby) with a
translatable string, and mark all user-facing text with the framework i18n
attributes (e.g., i18n or translate) so strings are ready for localization.

<div class="grid p-8 gap-8 sm:grid-cols-1 ">
  <ng-template [ngIf]="!showVideo" [ngIfElse]="watchvideo">
    <div class="" *ngFor="let videoElement of videoElements">
      <app-card-video

🧹 Nitpick (assertive)

Prefer structural *ngIf with else over [ngIf]/[ngIfElse] templates

Reduces extra DOM and improves readability.

-    <ng-template [ngIf]="!showVideo" [ngIfElse]="watchvideo">
+    <ng-container *ngIf="!showVideo; else watchvideo">
 ...
-    </ng-template>
+    </ng-container>

Also applies to: 15-15

🤖 Prompt for AI Agents
In
webapp/src/app/pages/components/knowledge-graph/videos/card-video-list/card-video-list.component.html
around lines 2 and 15, the template uses the property bindings [ngIf] and
[ngIfElse]; replace those with the structural directive form (*ngIf="!showVideo;
else watchvideo") to reduce extra DOM and improve readability, and apply the
same replacement at line 15 where the pattern repeats; ensure the named
ng-template for "watchvideo" remains defined and referenced unchanged.

Comment on lines +16 to +21
<app-watch-video
[video]="video"
[notUnderstoodConcepts]="notUnderstoodConcepts"
(onWatchVideo)="readVideo($event)"
[videoElement]="videoElement"
[currentMaterial]="currentMaterial"
></app-card-video>
</div>
</ng-template>
<ng-template #watchvideo>
<app-watch-video
[video]="video"
[notUnderstoodConcepts]="notUnderstoodConcepts"
[currentMaterial]="currentMaterial"
></app-watch-video>
<button
id="backToList"
pButton
type="button"
(click)="goBack()"
icon="pi pi-chevron-left"
pTooltip="Go back to list"
class="bg-[#0277BD] p-button-rounded"
></button>
</ng-template>
</div>
></app-watch-video>
<button id="backToList"

🧹 Nitpick (assertive)

Guard against null/undefined video in the else block

Prevents runtime errors if video isn’t set when showVideo flips.

-    <ng-template #watchvideo>
-      <app-watch-video
-        [video]="video" 
-        [notUnderstoodConcepts]="notUnderstoodConcepts"
-        [currentMaterial]="currentMaterial"
-      ></app-watch-video>
+    <ng-template #watchvideo>
+      <ng-container *ngIf="video as v">
+        <app-watch-video
+          [video]="v"
+          [notUnderstoodConcepts]="notUnderstoodConcepts"
+          [currentMaterial]="currentMaterial"
+        ></app-watch-video>
+      </ng-container>

Confirm that video is always set before toggling showVideo = true.

📝 Committable suggestion


Suggested change
<ng-template #watchvideo>
<app-watch-video
[video]="video"
[notUnderstoodConcepts]="notUnderstoodConcepts"
(onWatchVideo)="readVideo($event)"
[videoElement]="videoElement"
[currentMaterial]="currentMaterial"
></app-card-video>
</div>
</ng-template>
<ng-template #watchvideo>
<app-watch-video
[video]="video"
[notUnderstoodConcepts]="notUnderstoodConcepts"
[currentMaterial]="currentMaterial"
></app-watch-video>
<button
id="backToList"
pButton
type="button"
(click)="goBack()"
icon="pi pi-chevron-left"
pTooltip="Go back to list"
class="bg-[#0277BD] p-button-rounded"
></button>
</ng-template>
</div>
></app-watch-video>
<ng-template #watchvideo>
<ng-container *ngIf="video as v">
<app-watch-video
[video]="v"
[notUnderstoodConcepts]="notUnderstoodConcepts"
[currentMaterial]="currentMaterial"
></app-watch-video>
</ng-container>
🤖 Prompt for AI Agents
In
webapp/src/app/pages/components/knowledge-graph/videos/card-video-list/card-video-list.component.html
around lines 16-21, the template renders <app-watch-video> in the
else/watchvideo block without guarding against a null/undefined video; either
add a truthy check to the template (e.g., wrap the <app-watch-video> with
*ngIf="video") so the component only renders when video is set, or ensure in the
component TS that you only set showVideo = true after verifying video is
non-null (throw or return early if not); implement one of these guards to
prevent runtime errors when showVideo flips.

Prevents IndexError: list index out of range when a video is invalid/private/deleted or filtered out.
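A minimal sketch of the kind of guard this commit describes, assuming a raw YouTube Data API v3 response dict; the helper name, logger, and call site below are illustrative, not the repository's actual code:

def first_item_or_none(response):
    """Return the first 'items' entry of a videos().list() response, or None when the
    video is invalid/private/deleted (or filtered out) and 'items' is empty."""
    items = (response or {}).get("items", [])
    return items[0] if items else None

# Usage sketch: skip videos with no retrievable details instead of indexing blindly.
item = first_item_or_none(details_response)
if item is None:
    logger.warning("Skipping video %s: no details returned", video_id)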
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 10

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
coursemapper-kg/recommendation/app/services/course_materials/recommendation/youtube_service.py (2)

33-35: Do not set GOOGLE_APPLICATION_CREDENTIALS in code

Remove inline env mutation; configure via deployment (env/secret manager). Leaving it here risks leaking paths and breaks local/runtime separation.

-    os.environ[
-        "GOOGLE_APPLICATION_CREDENTIALS"
-    ] = "masterthesis-350015-47ab14d0b53b.json"
+    # GOOGLE_APPLICATION_CREDENTIALS must be provided by the environment (CI/K8s/secret store).

176-178: Avoid global string fill; fill per-column with typed defaults

Global "-1" converts numerics to strings.

-            video_data = video_data.fillna("-1")
+            video_data = video_data.fillna({
+                "duration": "0",
+                "views": 0,
+                "description_full": "",
+                "like_count": 0,
+                "channel_title": "",
+                "text": "",
+            })
♻️ Duplicate comments (3)
coursemapper-kg/recommendation/app/services/course_materials/recommendation/youtube_service.py (3)

39-52: Hard-coded API keys committed — rotate immediately and load from env/secret store

All keys must be revoked and removed from source. Load a comma-separated list from env and fail fast if missing. Also drop the unused class-level client to avoid accidental usage.

-    # DEVELOPER_KEY = os.environ.get("YOUTUBE_API_KEY")
-    DEVELOPER_KEY = "AIzaSyBphZOn7EJmPMmZwrB71aepaA5Rbuex9MU"
-    youtube = googleapiclient.discovery.build(
-        api_service_name,
-        api_version,
-        developerKey="AIzaSyClxnNwQ1x34pGioQazLlGxOjO9Fp2GGTY",
-    )
-    DEVELOPER_KEYS = [
-        "AIzaSyD_CGmR_Voq4DIV5okRaR6G8adoe-ZSZsM",
-        "AIzaSyClxnNwQ1x34pGioQazLlGxOjO9Fp2GGTY",
-        "AIzaSyADNntK6m7DbA6eZFYOa9Y8e6IYHykUUFE",
-        "AIzaSyBphZOn7EJmPMmZwrB71aepaA5Rbuex9MU",
-        "AIzaSyB2Wck31LUlgsqI7dgTcC2dMeeVXgb9TDI",
-    ]
+    # Comma-separated API keys from env (e.g., "k1,k2,k3"). Do NOT commit keys.
+    DEVELOPER_KEYS = [k.strip() for k in os.getenv("YOUTUBE_API_KEYS", "").split(",") if k.strip()]
+    if not DEVELOPER_KEYS:
+        raise RuntimeError("YOUTUBE_API_KEYS env var is required (comma-separated YouTube API keys).")

54-90: Retry/backoff logic is broken; fix loop, logging, and quota handling

‘i’ never increments; retries never happen; retry_count == 0 is unreachable; use logger.exception and switch keys on quotaExceeded.

-    def search_youtube_videos(self, developer_keys, query, top_n=50, api_service_name="youtube", api_version="v3"):
+    def search_youtube_videos(self, developer_keys, query, top_n=50, api_service_name="youtube", api_version="v3"):
         """
             Switching YouTube API keys
         """
-        retry_count = 3
-        retry_delay = 5
-        i = 0
-        for key in developer_keys:
-            try:
-                youtube = googleapiclient.discovery.build(api_service_name, api_version, developerKey=key)
-                request = youtube.search().list(
-                    part="snippet",
-                    maxResults=top_n,
-                    type="video",
-                    q=query,
-                    relevanceLanguage="en",
-                )
-                return request.execute(), youtube
-            except (ConnectionAbortedError, ConnectionResetError, timeout) as e:
-                logger.error("Error while getting the videos")
-                logger.error(e)
-                if i == retry_count - 1:
-                    raise  # re-raise the exception if all retries fail
-                delay = retry_delay * (2 ** i)  # use a backoff algorithm to increase the delay
-                time.sleep(delay)
-                logger.info("New Try")
-                if retry_count == 0:
-                    return None, None
-            except HttpError as e:
-                if e.resp.status == 403 and "quota" in str(e):
-                    print(f"Quota exceeded for key: {key}. Trying next key...")
-                else:
-                    raise e
-        raise Exception("All API keys have exceeded their quota.")
+        retry_count = 3
+        retry_delay = 5
+        top_n = min(int(top_n), 50)  # API max
+        developer_keys = developer_keys or self.DEVELOPER_KEYS
+        for key in developer_keys:
+            for i in range(retry_count):
+                try:
+                    youtube = googleapiclient.discovery.build(api_service_name, api_version, developerKey=key)
+                    request = youtube.search().list(
+                        part="snippet",
+                        maxResults=top_n,
+                        type="video",
+                        q=query,
+                        relevanceLanguage="en",
+                    )
+                    return request.execute(), youtube
+                except (ConnectionAbortedError, ConnectionResetError, socket.timeout) as e:
+                    logger.exception("Transient connection error on attempt %d/%d (key ****%s)", i + 1, retry_count, key[-4:])
+                    time.sleep(retry_delay * (2 ** i))
+                    continue
+                except HttpError as e:
+                    status = getattr(getattr(e, "resp", None), "status", None)
+                    msg = str(e).lower()
+                    if status == 403 and ("quotaexceeded" in msg or "dailylimitexceeded" in msg or "quota" in msg):
+                        logger.warning("Quota exceeded for key ****%s; switching key…", key[-4:])
+                        break  # next key
+                    raise
+        raise RuntimeError("All YouTube API keys exhausted or failed.")

Additionally apply outside this range:

# at top of file
import socket  # replace 'from socket import *'

# and remove the star import usage in excepts (done in diff above).

73-76: Log exceptions with traceback; avoid double error lines

Use logger.exception once; also avoid relying on timeout from star-import.

-            except (ConnectionAbortedError, ConnectionResetError, timeout) as e:
-                logger.error("Error while getting the videos")
-                logger.error(e)
+            except (ConnectionAbortedError, ConnectionResetError, socket.timeout) as e:
+                logger.exception("Error while getting the videos")

Outside this range, replace from socket import * with import socket.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 63591bd and 4b97a54.

📒 Files selected for processing (1)
  • coursemapper-kg/recommendation/app/services/course_materials/recommendation/youtube_service.py (3 hunks)
🧰 Additional context used
🪛 Gitleaks (8.27.2)
coursemapper-kg/recommendation/app/services/course_materials/recommendation/youtube_service.py

47-47: Uncovered a GCP API key, which could lead to unauthorized access to Google Cloud services and data breaches.

(gcp-api-key)


48-48: Uncovered a GCP API key, which could lead to unauthorized access to Google Cloud services and data breaches.

(gcp-api-key)


49-49: Uncovered a GCP API key, which could lead to unauthorized access to Google Cloud services and data breaches.

(gcp-api-key)


50-50: Uncovered a GCP API key, which could lead to unauthorized access to Google Cloud services and data breaches.

(gcp-api-key)


51-51: Uncovered a GCP API key, which could lead to unauthorized access to Google Cloud services and data breaches.

(gcp-api-key)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Build container images / Build image (coursemapper-kg/preprocess)
  • GitHub Check: Build container images / Build image (coursemapper-kg/concept-map)
  • GitHub Check: Build container images / Build image (coursemapper-kg/recommendation)
  • GitHub Check: Build container images / Build image (webapp)
🔇 Additional comments (1)
coursemapper-kg/recommendation/app/services/course_materials/recommendation/youtube_service.py (1)

193-199: Good: exception visibility improved

Catching HttpError and logging with traceback is appropriate.

import logging
from log import LOG
import time
from googleapiclient.errors import HttpError

🧹 Nitpick (assertive)

Minor: clean imports

Now that HttpError is imported directly, drop import googleapiclient.errors. Also remove duplicate import time (present twice).

-from youtube_transcript_api import YouTubeTranscriptApi
+from youtube_transcript_api import YouTubeTranscriptApi
@@
-import googleapiclient.errors
+# googleapiclient.errors not needed; using HttpError directly
@@
-import time
+# remove duplicate 'import time' if present twice

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In
coursemapper-kg/recommendation/app/services/course_materials/recommendation/youtube_service.py
around line 15, tidy imports by removing the redundant module import and
duplicate import: delete the now-unnecessary "import googleapiclient.errors"
since HttpError is imported directly from googleapiclient.errors, and remove the
duplicated "import time" so time is only imported once; ensure no other
references depend on the removed module import.
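
For reference, the tidied import header might look roughly like this (a sketch only; the module list is inferred from the snippets quoted in this review, not from the full file):

import logging
import re
import socket   # replaces 'from socket import *'
import time     # keep a single import

import googleapiclient.discovery
import pandas as pd
from googleapiclient.errors import HttpError
from log import LOG                      # project logger, as quoted above
from youtube_transcript_api import YouTubeTranscriptApi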

Comment on lines +84 to +88
except HttpError as e:
if e.resp.status == 403 and "quota" in str(e):
print(f"Quota exceeded for key: {key}. Trying next key...")
else:
raise e

🛠️ Refactor suggestion

Replace print with structured logging; broaden quota reasons

Use logger and check standard reasons.

-                if e.resp.status == 403 and "quota" in str(e):
-                    print(f"Quota exceeded for key: {key}. Trying next key...")
+                if e.resp.status == 403 and any(r in str(e).lower() for r in ["quotaexceeded", "dailylimitexceeded", "quota"]):
+                    logger.warning("Quota exceeded for key ****%s. Trying next key…", key[-4:])
📝 Committable suggestion


Suggested change
except HttpError as e:
if e.resp.status == 403 and "quota" in str(e):
print(f"Quota exceeded for key: {key}. Trying next key...")
else:
raise e
except HttpError as e:
if e.resp.status == 403 and any(r in str(e).lower() for r in ["quotaexceeded", "dailylimitexceeded", "quota"]):
logger.warning("Quota exceeded for key ****%s. Trying next key…", key[-4:])
else:
raise e
🤖 Prompt for AI Agents
In
coursemapper-kg/recommendation/app/services/course_materials/recommendation/youtube_service.py
around lines 84-88, replace the plain print with structured logging and broaden
the quota detection: use the module or instance logger (e.g., logger.warning) to
log a clear message that includes the key and exception details (include
exc_info or the exception object), and detect quota errors not only by checking
if "quota" is in the string but also by inspecting standard API error reasons
such as "quotaExceeded", "dailyLimitExceeded" or the HttpError response
body/reason; if a quota-related condition is detected, log the warning and
continue to the next key, otherwise re-raise the exception.

Comment on lines +104 to +106
response, youtube_api_sinlge = self.search_youtube_videos(
developer_keys=self.DEVELOPER_KEYS, query=concepts, top_n=top_n
)

🧹 Nitpick (assertive)

Typo: youtube_api_sinlge → youtube_api_single

Fix naming for readability and to avoid propagating typos.

-        response, youtube_api_sinlge = self.search_youtube_videos(
+        response, youtube_api_single = self.search_youtube_videos(
             developer_keys=self.DEVELOPER_KEYS, query=concepts, top_n=top_n
         )
📝 Committable suggestion


Suggested change
response, youtube_api_sinlge = self.search_youtube_videos(
developer_keys=self.DEVELOPER_KEYS, query=concepts, top_n=top_n
)
response, youtube_api_single = self.search_youtube_videos(
developer_keys=self.DEVELOPER_KEYS, query=concepts, top_n=top_n
)
🤖 Prompt for AI Agents
In
coursemapper-kg/recommendation/app/services/course_materials/recommendation/youtube_service.py
around lines 104 to 106, there's a typo in the variable name
"youtube_api_sinlge" — rename it to "youtube_api_single" consistently where it's
assigned and anywhere else it's referenced to improve readability and avoid
further typos; update the function return unpacking to use youtube_api_single
and search for other occurrences of the misspelled identifier in the file and
replace them to keep names consistent.

Comment on lines 108 to 111
if len(response["items"]) == 0:
logger.info("No Video found for this input")
return []
else:

🛠️ Refactor suggestion

Guard against None/empty responses and return consistent type

Return an empty DataFrame (not list) to keep a stable return type.

-        if len(response["items"]) == 0:
+        if not response or not response.get("items"):
             logger.info("No Video found for this input")
-            return []
+            return pd.DataFrame()
📝 Committable suggestion


Suggested change
if len(response["items"]) == 0:
logger.info("No Video found for this input")
return []
else:
if not response or not response.get("items"):
logger.info("No Video found for this input")
return pd.DataFrame()
else:
🤖 Prompt for AI Agents
In
coursemapper-kg/recommendation/app/services/course_materials/recommendation/youtube_service.py
around lines 108 to 111, the code returns an empty list when no videos are
found; change this to return an empty pandas DataFrame to maintain a consistent
return type. Guard against response being None or missing the "items" key before
accessing it (e.g., check response and "items" in response), and return
pandas.DataFrame() when there are no items or response is invalid so callers
always receive a DataFrame.

Comment on lines 127 to +137
for index, id in enumerate(df_ids["id"]):
# try:
# df_snippet["text"][index] = df_snippet["text"][index] + ". " + get_subtitles(id)
# df_snippet["text"][index] = df_snippet["text"][index] + ". " + get_subtitles(id)
# except (NoTranscriptFound, TranscriptsDisabled) as e:
# logger.error("No transcript found in english or transcript disabled for this video "
# "https://www.youtube.com/watch?v={} ".format(id))
# logger.error("No transcript found in english or transcript disabled for this video "
# "https://www.youtube.com/watch?v={} ".format(id))

try:
duration, views, description = self.get_video_details(id)
duration = re.findall(r"\d+", duration)
duration = ":".join(duration)
# print(id, duration, views)
duration_list.append(duration)
view_list.append(views)
description_list.append(description)
except Exception as e:
logger.error("Error while getting the videos details", e)
res = self.get_video_details(youtube_api_sinlge, id)
if res is None:
raise ValueError("No details returned for video id {}".format(id))

🧹 Nitpick (assertive)

Avoid shadowing built-in id and improve error

Rename loop var to video_id; keep error message intact.

-            for index, id in enumerate(df_ids["id"]):
+            for index, video_id in enumerate(df_ids["id"]):
@@
-                    res = self.get_video_details(youtube_api_sinlge, id)
+                    res = self.get_video_details(youtube_api_single, video_id)
-                    if res is None:
-                        raise ValueError("No details returned for video id {}".format(id))
+                    if res is None:
+                        raise ValueError("No details returned for video id {}".format(video_id))
📝 Committable suggestion


Suggested change
for index, id in enumerate(df_ids["id"]):
# try:
# df_snippet["text"][index] = df_snippet["text"][index] + ". " + get_subtitles(id)
# df_snippet["text"][index] = df_snippet["text"][index] + ". " + get_subtitles(id)
# except (NoTranscriptFound, TranscriptsDisabled) as e:
# logger.error("No transcript found in english or transcript disabled for this video "
# "https://www.youtube.com/watch?v={} ".format(id))
# logger.error("No transcript found in english or transcript disabled for this video "
# "https://www.youtube.com/watch?v={} ".format(id))
try:
duration, views, description = self.get_video_details(id)
duration = re.findall(r"\d+", duration)
duration = ":".join(duration)
# print(id, duration, views)
duration_list.append(duration)
view_list.append(views)
description_list.append(description)
except Exception as e:
logger.error("Error while getting the videos details", e)
res = self.get_video_details(youtube_api_sinlge, id)
if res is None:
raise ValueError("No details returned for video id {}".format(id))
for index, video_id in enumerate(df_ids["id"]):
# try:
# df_snippet["text"][index] = df_snippet["text"][index] + ". " + get_subtitles(id)
# except (NoTranscriptFound, TranscriptsDisabled) as e:
# logger.error("No transcript found in english or transcript disabled for this video "
# "https://www.youtube.com/watch?v={} ".format(id))
try:
res = self.get_video_details(youtube_api_single, video_id)
if res is None:
raise ValueError("No details returned for video id {}".format(video_id))
🤖 Prompt for AI Agents
In
coursemapper-kg/recommendation/app/services/course_materials/recommendation/youtube_service.py
around lines 127 to 137, the loop variable currently named "id" shadows Python's
built-in id() and should be renamed to "video_id"; update the for statement and
every reference inside the loop (e.g., pass video_id to get_video_details and to
the error message string.format) so behavior and the existing error message
remain the same while avoiding built-in shadowing.

Comment on lines +141 to +153
# Keep your duration normalization
duration = re.findall(r"\d+", str(duration))
duration = ":".join(duration)
except Exception:
# Fix logging formatter error and keep same message semantics
logger.exception("Error while getting the videos details for id %s", id)
# Append safe defaults to keep list lengths aligned with df_ids
duration = "0"
views = 0
description = ""
like_count = 0
channel_title = ""


🧹 Nitpick (assertive)

Normalize duration properly; keep numeric types for counts

Parse ISO-8601 with isodate if available; coerce counts to ints.

-                    duration = re.findall(r"\d+", str(duration))
-                    duration = ":".join(duration)
+                    # Prefer true ISO-8601 parsing if available; fallback kept
+                    try:
+                        import isodate
+                        duration = int(isodate.parse_duration(str(duration)).total_seconds())
+                    except Exception:
+                        duration = ":".join(re.findall(r"\d+", str(duration)))
@@
-                    views = 0
+                    views = 0
@@
-                    like_count = 0
+                    like_count = 0

And when appending (below), coerce to numeric (see next comment).

📝 Committable suggestion


Suggested change
# Keep your duration normalization
duration = re.findall(r"\d+", str(duration))
duration = ":".join(duration)
except Exception:
# Fix logging formatter error and keep same message semantics
logger.exception("Error while getting the videos details for id %s", id)
# Append safe defaults to keep list lengths aligned with df_ids
duration = "0"
views = 0
description = ""
like_count = 0
channel_title = ""
# Keep your duration normalization
# Prefer true ISO-8601 parsing if available; fallback to regex join
try:
import isodate
duration = int(isodate.parse_duration(str(duration)).total_seconds())
except Exception:
duration = ":".join(re.findall(r"\d+", str(duration)))
except Exception:
# Fix logging formatter error and keep same message semantics
logger.exception("Error while getting the videos details for id %s", id)
# Append safe defaults to keep list lengths aligned with df_ids
duration = "0"
views = 0
description = ""
like_count = 0
channel_title = ""
🤖 Prompt for AI Agents
In
coursemapper-kg/recommendation/app/services/course_materials/recommendation/youtube_service.py
around lines 141 to 153, the duration and numeric fields are being treated as
strings and not normalized consistently; update the exception handling and
normal parsing to (1) parse ISO-8601 durations using isodate.parse_duration
(fall back to existing regex only if isodate is unavailable) and normalize to a
canonical "H:MM:SS" or "MM:SS" string for duration, (2) ensure counts like views
and like_count are coerced to int (use int(...) with safe default 0) and
description/channel_title stay as strings, and (3) keep the existing
logger.exception call but maintain its safe formatting; after fixing here,
ensure any append/population logic below appends ints for numeric fields and the
normalized duration string so list lengths and types remain consistent.
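
A standalone sketch of the normalization this prompt asks for, assuming isodate is (or would be) added as a dependency; the function names here are illustrative rather than the service's actual helpers:

import isodate

def normalize_duration(iso_duration, default="0:00"):
    """Convert an ISO-8601 duration such as 'PT1H2M10S' to 'H:MM:SS' (or 'M:SS')."""
    try:
        total = int(isodate.parse_duration(str(iso_duration)).total_seconds())
    except (isodate.ISO8601Error, TypeError, ValueError):
        return default
    hours, rem = divmod(total, 3600)
    minutes, seconds = divmod(rem, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}" if hours else f"{minutes}:{seconds:02d}"

def to_int(value, default=0):
    """Coerce viewCount/likeCount strings to int, falling back to a safe default."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return default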

Comment on lines +163 to +169
# Assign lists (same behavior as before, now lengths aligned)
video_data["duration"] = pd.Series(duration_list)
video_data["views"] = pd.Series(view_list)
video_data["description_full"] = pd.Series(description_list)
video_data["like_count"] = pd.Series(like_count_list)
video_data["channel_title"] = pd.Series(channel_title_list)


🛠️ Refactor suggestion

Preserve numeric dtypes for views/likes; avoid implicit object dtype

Cast with to_numeric and nullable Int64.

-            video_data["views"] = pd.Series(view_list)
+            video_data["views"] = pd.to_numeric(pd.Series(view_list), errors="coerce").fillna(0).astype("Int64")
@@
-            video_data["like_count"] = pd.Series(like_count_list)
+            video_data["like_count"] = pd.to_numeric(pd.Series(like_count_list), errors="coerce").fillna(0).astype("Int64")
📝 Committable suggestion


Suggested change
# Assign lists (same behavior as before, now lengths aligned)
video_data["duration"] = pd.Series(duration_list)
video_data["views"] = pd.Series(view_list)
video_data["description_full"] = pd.Series(description_list)
video_data["like_count"] = pd.Series(like_count_list)
video_data["channel_title"] = pd.Series(channel_title_list)
# Assign lists (same behavior as before, now lengths aligned)
video_data["duration"] = pd.Series(duration_list)
video_data["views"] = pd.to_numeric(pd.Series(view_list), errors="coerce").fillna(0).astype("Int64")
video_data["description_full"] = pd.Series(description_list)
video_data["like_count"] = pd.to_numeric(pd.Series(like_count_list), errors="coerce").fillna(0).astype("Int64")
video_data["channel_title"] = pd.Series(channel_title_list)
🤖 Prompt for AI Agents
In
coursemapper-kg/recommendation/app/services/course_materials/recommendation/youtube_service.py
around lines 163 to 169, assigning views and like_count directly from lists
produces object dtype; convert these Series to numeric using
pandas.to_numeric(..., errors='coerce') and then cast to the nullable Integer
dtype .astype("Int64") so missing/non-numeric values become <NA> instead of
strings; update the assignment for video_data["views"] and
video_data["like_count"] to build Series from the lists, apply to_numeric with
errors='coerce', and then .astype("Int64").

Comment on lines 171 to 173
video_data["text"] = pd.DataFrame(
video_data["title"] + ". " + video_data["description"]
)

🧹 Nitpick (assertive)

Assign Series, not DataFrame, when building text

Avoid wrapping in DataFrame; also coerce to str to be safe.

-            video_data["text"] = pd.DataFrame(
-                video_data["title"] + ". " + video_data["description"]
-            )
+            video_data["text"] = video_data["title"].astype(str) + ". " + video_data["description"].astype(str)
📝 Committable suggestion


Suggested change
video_data["text"] = pd.DataFrame(
video_data["title"] + ". " + video_data["description"]
)
video_data["text"] = video_data["title"].astype(str) + ". " + video_data["description"].astype(str)
🤖 Prompt for AI Agents
In
coursemapper-kg/recommendation/app/services/course_materials/recommendation/youtube_service.py
around lines 171 to 173, the code wraps the concatenated title and description
in a DataFrame; instead assign a pandas Series and ensure both title and
description are coerced to strings. Replace the DataFrame construction with a
Series (e.g., pd.Series((video_data["title"].astype(str) + ". " +
video_data["description"].astype(str)))) so text is a Series of strings rather
than a DataFrame.

Comment on lines +180 to +186
def get_video_details(self, youtube_api_sinlge, video_id):
# print("get_video_details for id -------------------- ", video_id)
try:
duration = (
r["items"][0]["contentDetails"]["duration"]
if r["items"][0]["contentDetails"]["duration"]
else 0
)
views = (
r["items"][0]["statistics"]["viewCount"]
if r["items"][0]["statistics"]["viewCount"]
else 0
)
description = (
r["items"][0]["snippet"]["description"]
if r["items"][0]["snippet"]["description"]
else ""
)

except Exception as e:
print("---------------------------------------")
print(e)
# The number of views are not present for some videos and this leads to an exception. For this
# reason a default value of 0 views will be given that video.
views = 0
duration = (
r["items"][0]["contentDetails"]["duration"]
if r["items"][0]["contentDetails"]["duration"]
else 0
)
description = (
r["items"][0]["snippet"]["description"]
if r["items"][0]["snippet"]["description"]
else ""
r = (
# self.youtube.videos()
youtube_api_sinlge.videos()
.list(

🧹 Nitpick (assertive)

Typo and minimize API fields

Fix param name and narrow fields to what you actually read.

-    def get_video_details(self, youtube_api_sinlge, video_id):
+    def get_video_details(self, youtube_api_single, video_id):
@@
-                youtube_api_sinlge.videos()
+                youtube_api_single.videos()
                 .list(
-                    part="snippet,statistics,contentDetails",
+                    part="snippet,statistics,contentDetails",
                     id=video_id,
-                    fields="items(statistics,contentDetails(duration),snippet)",
+                    fields="items(statistics(viewCount,likeCount),contentDetails(duration),snippet(description,channelTitle))",
                 )
📝 Committable suggestion


Suggested change
def get_video_details(self, youtube_api_sinlge, video_id):
# print("get_video_details for id -------------------- ", video_id)
try:
duration = (
r["items"][0]["contentDetails"]["duration"]
if r["items"][0]["contentDetails"]["duration"]
else 0
)
views = (
r["items"][0]["statistics"]["viewCount"]
if r["items"][0]["statistics"]["viewCount"]
else 0
)
description = (
r["items"][0]["snippet"]["description"]
if r["items"][0]["snippet"]["description"]
else ""
)
except Exception as e:
print("---------------------------------------")
print(e)
# The number of views are not present for some videos and this leads to an exception. For this
# reason a default value of 0 views will be given that video.
views = 0
duration = (
r["items"][0]["contentDetails"]["duration"]
if r["items"][0]["contentDetails"]["duration"]
else 0
)
description = (
r["items"][0]["snippet"]["description"]
if r["items"][0]["snippet"]["description"]
else ""
r = (
# self.youtube.videos()
youtube_api_sinlge.videos()
.list(
def get_video_details(self, youtube_api_single, video_id):
# print("get_video_details for id -------------------- ", video_id)
try:
r = (
# self.youtube.videos()
youtube_api_single.videos()
.list(
part="snippet,statistics,contentDetails",
id=video_id,
fields="items(statistics(viewCount,likeCount),contentDetails(duration),snippet(description,channelTitle))",
)
)
# …rest of method…

Comment on lines +210 to +217
# Safe gets with defaults (keep same return semantics)
duration = cd.get("duration") if cd.get("duration") else 0
views = st.get("viewCount") if st.get("viewCount") else 0
description = sn.get("description") if sn.get("description") else ""
like_count = st.get("likeCount") if st.get("likeCount") else 0
channel_title = sn.get("channelTitle") if sn.get("channelTitle") else ""

return duration, views, description, like_count, channel_title

🛠️ Refactor suggestion

Return typed values; default safely

Coerce counts to int and keep empty strings for text fields.

-        duration = cd.get("duration") if cd.get("duration") else 0
-        views = st.get("viewCount") if st.get("viewCount") else 0
-        description = sn.get("description") if sn.get("description") else ""
-        like_count = st.get("likeCount") if st.get("likeCount") else 0
-        channel_title = sn.get("channelTitle") if sn.get("channelTitle") else ""
+        duration = cd.get("duration") or 0
+        views = int(st.get("viewCount") or 0)
+        description = sn.get("description") or ""
+        like_count = int(st.get("likeCount") or 0)
+        channel_title = sn.get("channelTitle") or ""
📝 Committable suggestion


Suggested change
# Safe gets with defaults (keep same return semantics)
duration = cd.get("duration") if cd.get("duration") else 0
views = st.get("viewCount") if st.get("viewCount") else 0
description = sn.get("description") if sn.get("description") else ""
like_count = st.get("likeCount") if st.get("likeCount") else 0
channel_title = sn.get("channelTitle") if sn.get("channelTitle") else ""
return duration, views, description, like_count, channel_title
# Safe gets with defaults (keep same return semantics)
duration = cd.get("duration") or 0
views = int(st.get("viewCount") or 0)
description = sn.get("description") or ""
like_count = int(st.get("likeCount") or 0)
channel_title = sn.get("channelTitle") or ""
return duration, views, description, like_count, channel_title
🤖 Prompt for AI Agents
In
coursemapper-kg/recommendation/app/services/course_materials/recommendation/youtube_service.py
around lines 210 to 217, the numeric fields are returned as-is and may be
strings or None; coerce the count/duration fields to ints and preserve empty
strings for text fields by using safe defaults. Replace the current assignments
with expressions that use the .get(... ) or fallback and coerce to int (e.g.
int(cd.get("duration") or 0), int(st.get("viewCount") or 0),
int(st.get("likeCount") or 0)) and keep description and channel_title as
sn.get("description") or "" and sn.get("channelTitle") or "" so return types are
int, int, str, int, str.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
coursemapper-kg/recommendation/app/services/course_materials/GCN/gcn.py (2)

22-24: Don't disable encryption for Neo4j in production.

encrypted=False exposes data in transit. Make it configurable and default to True (or use a neo4j+s URI).

Apply:

-        self.driver = GraphDatabase.driver(neo4j_uri,
-                                           auth=(neo4j_user, neo4j_pass),
-                                           encrypted=False)
+        self.driver = GraphDatabase.driver(
+            neo4j_uri,
+            auth=(neo4j_user, neo4j_pass),
+            encrypted=getattr(Config, "NEO4J_ENCRYPTED", True),
+        )

87-97: Batch Neo4j writes via UNWIND to avoid N round-trips.

One query per node will be slow at scale.

Apply:

-        with self.driver.session() as session:
-            for i in range(final_embeddings.shape[0]):
-                id = idx[i]
-                f_embedding = final_embeddings[i]
-                embedding = ",".join(str(i) for i in f_embedding)
-                # Find a node in neo4j by its original id and save its final embedding into its "final_embedding" property
-                result = session.run("""MATCH (n) WHERE n.cid= $id or n.sid= $id
-                        set n.final_embedding = $embedding RETURN n""",
-                    id=id,
-                    embedding=embedding)
+        rows = [
+            {"id": str(idx[i]), "embedding": ",".join(map(str, final_embeddings[i]))}
+            for i in range(final_embeddings.shape[0])
+        ]
+        with self.driver.session() as session:
+            session.run(
+                """
+                UNWIND $rows AS row
+                MATCH (n) WHERE n.cid = row.id OR n.sid = row.id
+                SET n.final_embedding = row.embedding
+                """,
+                rows=rows,
+            )
♻️ Duplicate comments (1)
coursemapper-kg/recommendation/app/services/course_materials/GCN/gcn.py (1)

37-37: Prefer debug-level, shape-only logs over commenting out.

Same feedback as before; keep lightweight observability.

Apply:

-        # logger.info(idx_features.shape[0])
+        logger.debug("idx_features count: %d", idx_features.shape[0])
@@
-        # logger.info(features.A)
+        logger.debug("features shape: %s", features.shape)
@@
-        # logger.info(adj.A)
+        logger.debug("adj shape: %s", adj.shape)
@@
-        # logger.info(final_embeddings)
+        logger.debug("final_embeddings shape: %s", final_embeddings.shape)

Also applies to: 41-41, 74-74, 83-83

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4b97a54 and 508ee8c.

📒 Files selected for processing (1)
  • coursemapper-kg/recommendation/app/services/course_materials/GCN/gcn.py (4 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Analyze (javascript-typescript)
  • GitHub Check: Build container images / Build image (coursemapper-kg/preprocess)
  • GitHub Check: Build container images / Build image (webapp)
  • GitHub Check: Build container images / Build image (coursemapper-kg/concept-map)
  • GitHub Check: Build container images / Build image (coursemapper-kg/recommendation)
🔇 Additional comments (2)
coursemapper-kg/recommendation/app/services/course_materials/GCN/gcn.py (2)

69-69: No-op formatting change.

Safe to keep or drop per project style.


126-127: LGTM; where= supported since NumPy 1.17 and pinned version ==1.26.4 satisfies the requirement.

@@ -66,11 +66,12 @@ def load_data(self):
dtype=np.float32,
)
adj = np.around(adj, 2)

⚠️ Potential issue

np.around densifies sparse matrices — major perf/memory bug.

Rounding the whole sparse matrix converts it to dense; use .data instead.

Apply:

-        adj = np.around(adj, 2)
+        # Round only nonzero weights to keep sparsity
+        adj = adj.tocoo()
+        adj.data = np.around(adj.data, 2)
📝 Committable suggestion


Suggested change
adj = np.around(adj, 2)
# Round only nonzero weights to keep sparsity
adj = adj.tocoo()
adj.data = np.around(adj.data, 2)
🤖 Prompt for AI Agents
In coursemapper-kg/recommendation/app/services/course_materials/GCN/gcn.py
around line 68, calling np.around on the entire sparse adjacency matrix
densifies it and causes major perf/memory issues; instead apply rounding only to
the sparse data array (the .data attribute) and keep the matrix as a sparse
type, i.e., update the matrix's internal data with rounded values and do not
convert or recreate a dense matrix.

# matrix plus its unit matrix and transpose matrix to obtain the complete adjacency matrix
adj = adj + adj.T.multiply(adj.T > adj) - adj.multiply(adj.T > adj)

adj = self.normalize(adj) + sp.eye(adj.shape[0])

🛠️ Refactor suggestion

Add self-loops before normalization (GCN convention).

Current order changes weights vs. Kipf & Welling's Â = D̂^{-1/2}(A + I)D̂^{-1/2}.

Apply:

-        adj = self.normalize(adj) + sp.eye(adj.shape[0])
+        # Add self-loops, then normalize
+        adj = self.normalize(adj + sp.eye(adj.shape[0]))
📝 Committable suggestion


Suggested change
adj = self.normalize(adj) + sp.eye(adj.shape[0])
# Add self-loops, then normalize
adj = self.normalize(adj + sp.eye(adj.shape[0]))
🤖 Prompt for AI Agents
In coursemapper-kg/recommendation/app/services/course_materials/GCN/gcn.py
around line 73 the code adds self-loops after calling normalize which changes
the resulting weights versus the standard GCN convention; move the addition of
the identity so you add self-loops to adj before calling self.normalize (i.e.,
compute adj_with_loops = adj + sp.eye(adj.shape[0]) then call
self.normalize(adj_with_loops)), and ensure the normalize function computes
degrees and performs symmetric normalization on that augmented matrix.
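
A self-contained sketch of the conventional order, assuming adj is a scipy.sparse matrix; this mirrors Kipf & Welling's renormalization trick rather than the project's exact normalize() implementation:

import numpy as np
import scipy.sparse as sp

def normalize_with_self_loops(adj):
    """Symmetric GCN normalization D̂^{-1/2} (A + I) D̂^{-1/2}, with self-loops added first."""
    adj = sp.coo_matrix(adj) + sp.eye(adj.shape[0])        # A + I
    degrees = np.asarray(adj.sum(axis=1)).flatten()        # diagonal of D̂
    with np.errstate(divide="ignore"):
        d_inv_sqrt = np.power(degrees, -0.5)
    d_inv_sqrt[np.isinf(d_inv_sqrt)] = 0.0                 # guard isolated nodes
    d_mat = sp.diags(d_inv_sqrt)
    return (d_mat @ adj @ d_mat).tocsr()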
