Work for this week #2

@pjmagee

Description

split solution into backend/admin and frontend - two separate applications

  • frontend on swdata.ai
  • frontend: Blazor hybrid with API
  • backend on admin.swdata.ai (ETL, batch jobs - should not be interrupted by frontend changes or deployments)

article page ingestion improvements

  • hash the page content and compare hashes when re-consuming pages (to detect whether a page changed and needs re-processing)
  • when ingesting new or updated content, compare the hash and re-run any other ETL pipelines that depend on that page
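
A minimal sketch of the hash comparison described above, in Python for illustration only. The names `content_hash`, `needs_reprocessing`, and `stored_hashes` are hypothetical, and the in-memory dict stands in for wherever the hashes would really be persisted. Whitespace normalization before hashing is an assumption, added so trivial formatting changes do not trigger re-processing.

```python
import hashlib


def content_hash(page_content: str) -> str:
    """Return a SHA-256 hex digest of the page content.

    Assumption: trailing whitespace is not meaningful, so it is stripped
    before hashing to avoid spurious re-processing.
    """
    normalized = "\n".join(
        line.rstrip() for line in page_content.strip().splitlines()
    )
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def needs_reprocessing(page_id: str, page_content: str, stored_hashes: dict) -> bool:
    """Compare the new hash with the stored one and update the store.

    Returns True for new or changed pages, False when the content is unchanged.
    `stored_hashes` is a hypothetical stand-in for a persistent hash store.
    """
    new_hash = content_hash(page_content)
    if stored_hashes.get(page_id) == new_hash:
        return False
    stored_hashes[page_id] = new_hash
    return True
```

A caller would re-run the dependent ETL pipelines only when `needs_reprocessing` returns True.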

knowledge graph redesign

  • split nodes and relationships properly
  • MongoDB $graphLookup 100 MB memory limit - is this a concern for graph lookups later on?
  • process infobox on the right part of the page
  • low thinking
  • process left part of page last
  • differentiate between properties and nodes. Not all information is a node.
  • phase 1 - build out graph WITHOUT article content on the left first (review, see quality of graph)
  • phase 2 - find all the links of entities that have no properties and fill in all the missing properties
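
One possible way to separate properties from nodes, sketched in Python. Everything here is an assumption for illustration: the `Node` and `Relationship` types, the input shape (an infobox field whose value is a dict with a `"link"` key is treated as a reference to another entity, anything else as a plain property), and the function names. It is not the project's actual schema.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    id: str
    properties: dict = field(default_factory=dict)


@dataclass(frozen=True)
class Relationship:
    source: str
    label: str
    target: str


def build_graph(infoboxes: dict):
    """Phase 1: build nodes and relationships from infobox data only.

    Assumed rule: a linked value becomes a relationship to another node;
    a plain value stays a property on the page's own node.
    """
    nodes = {}
    relationships = []
    for page_id, fields in infoboxes.items():
        node = nodes.setdefault(page_id, Node(page_id))
        for label, value in fields.items():
            if isinstance(value, dict) and "link" in value:
                target = value["link"]
                # Create a placeholder node for the linked entity if needed.
                nodes.setdefault(target, Node(target))
                relationships.append(Relationship(page_id, label, target))
            else:
                node.properties[label] = value
    return nodes, relationships


def nodes_missing_properties(nodes: dict) -> list:
    """Phase 2 input: ids of linked entities that still have no properties."""
    return sorted(n.id for n in nodes.values() if not n.properties)
```

The list returned by `nodes_missing_properties` is the set of entities whose pages would be visited in phase 2 to fill in the missing properties.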
