Stack Used:
- Dataset
- Embedding Model: sentence-transformers (e.g., all-MiniLM-L6-v2)
- Vector Store: FAISS
- Retriever: FAISS retriever
- RAG Pipeline: Custom
Steps:
- Load and preprocess movie data
- Embed movie descriptions (or plots)
- Index them with FAISS
- Query using a user question
- Retrieve top-k relevant contexts
- Generate an answer using a language model
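The steps above can be sketched roughly as follows. This is a minimal sketch, not the project's actual code: the function names (`build_index`, `retrieve`, `build_prompt`, `run_demo`), the sample plots, and the prompt wording are all assumptions made for illustration. It also assumes cosine similarity via normalized embeddings in a flat inner-product index. The notes do not name a language model, so the sketch stops at building the prompt.

```python
from typing import Callable, List, Sequence


def build_index(embeddings):
    """Index unit-length embeddings (float32, shape [n, dim]) for cosine search."""
    import faiss  # imported lazily so the helpers below work without FAISS installed

    index = faiss.IndexFlatIP(embeddings.shape[1])  # inner product = cosine on unit vectors
    index.add(embeddings)
    return index


def retrieve(index, embed: Callable, question: str, texts: Sequence[str], k: int = 3) -> List[str]:
    """Return up to k texts whose embeddings are most similar to the question."""
    query = embed([question])          # shape (1, dim)
    _, ids = index.search(query, k)    # ids has shape (1, k); FAISS pads with -1
    return [texts[i] for i in ids[0] if i != -1]


def build_prompt(question: str, contexts: Sequence[str]) -> str:
    """Assemble the retrieved contexts and the question into one prompt string."""
    context_block = "\n".join(f"- {c}" for c in contexts)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context_block}\n\n"
        f"Question: {question}\nAnswer:"
    )


def run_demo() -> str:
    """End-to-end example on hypothetical data; downloads the embedding model."""
    from sentence_transformers import SentenceTransformer

    plots = [  # hypothetical stand-in for the preprocessed movie dataset
        "A hacker learns that reality is a simulation.",
        "A young lion must reclaim his father's kingdom.",
        "A crew travels through a wormhole to find a new home for humanity.",
    ]
    model = SentenceTransformer("all-MiniLM-L6-v2")

    def embed(texts):
        return model.encode(texts, normalize_embeddings=True, convert_to_numpy=True)

    index = build_index(embed(plots))
    question = "Which movie is about space travel?"
    contexts = retrieve(index, embed, question, plots, k=2)
    return build_prompt(question, contexts)  # pass this string to the language model of choice
```

Calling `run_demo()` requires `faiss` and `sentence-transformers` to be installed. The returned prompt can then be handed to whichever language model handles the generation step.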
Extras:
- Save and Load FAISS Index
- Logic to Use Stored Index
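One possible shape for the two extras is sketched below. It assumes the index is persisted with FAISS's own `faiss.write_index` / `faiss.read_index`. The helper names and the injectable `load` / `save` parameters are hypothetical, added only so the reuse logic can be exercised without FAISS installed.

```python
import os
from typing import Callable


def save_index(index, path: str) -> None:
    """Write a FAISS index to disk."""
    import faiss  # lazy import: only needed when an index is actually written

    faiss.write_index(index, path)


def load_index(path: str):
    """Read a FAISS index back from disk."""
    import faiss

    return faiss.read_index(path)


def load_or_build_index(path: str, build: Callable, load: Callable = load_index,
                        save: Callable = save_index):
    """Reuse the index stored at `path` if it exists; otherwise build and store it."""
    if os.path.exists(path):
        return load(path)
    index = build()      # e.g. embed all descriptions and add them to a new index
    save(index, path)
    return index
```

The index file holds only the vectors, so the movie descriptions need to be stored separately, in the same order, for the retrieved ids to map back to the right text.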
Many industries require compliance checks on documents such as financial reports, legal contracts, and medical records. Traditionally, human reviewers verify both the text content and the document structure (tables, forms, signatures, etc.). An AI-powered solution can streamline this process by combining LLMs with image processing.
Structural ideas for the first draft:
- Extract Text & Visual Elements
  - Use OCR to extract printed or handwritten text from scanned documents
- Analyze Content with an LLM
  - Feed the extracted text to an LLM (such as a BERT-based model) for semantic analysis, flagging compliance issues
  - Use Named Entity Recognition (NER) to detect key terms, dates, and sensitive data
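A first draft of these ideas might look like the sketch below. Everything in it is an assumption rather than a settled design: it uses `pytesseract` for OCR (which needs the Tesseract binary and tends to handle printed text far better than handwriting), the `dslim/bert-base-NER` checkpoint for NER (which tags persons, organizations, locations, and miscellaneous entities, but not dates), and a hypothetical rule check, `flag_issues`, standing in for the compliance logic.

```python
from typing import Dict, List, Sequence


def extract_text(image_path: str) -> str:
    """OCR a scanned page (assumes the Tesseract binary is installed)."""
    import pytesseract  # lazy imports: heavy, optional dependencies
    from PIL import Image

    return pytesseract.image_to_string(Image.open(image_path))


def find_entities(text: str) -> List[Dict]:
    """Run a BERT-based NER model over the extracted text."""
    from transformers import pipeline

    ner = pipeline("token-classification", model="dslim/bert-base-NER",
                   aggregation_strategy="simple")
    return [{"label": e["entity_group"], "text": e["word"]} for e in ner(text)]


def flag_issues(text: str, entities: Sequence[Dict], required_terms: Sequence[str],
                required_labels: Sequence[str]) -> List[str]:
    """Hypothetical rule check: report required terms and entity types that are absent."""
    lowered = text.lower()
    issues = [f"missing term: {t}" for t in required_terms if t.lower() not in lowered]
    found = {e["label"] for e in entities}
    issues += [f"missing entity type: {label}" for label in required_labels
               if label not in found]
    return issues
```

For a scanned contract, the flow would be `text = extract_text(path)`, then `flag_issues(text, find_entities(text), ["signature"], ["ORG", "PER"])`, where the required terms and labels are placeholders for real compliance rules. Detecting dates would need a different NER model or a pattern-based pass.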