Mochrks/web-cv-parser-ai

CV Parser: PDF to JSON with AI

Project Overview

This project is a web application built with Next.js that lets users upload CVs as PDF files, extracts their text, and converts it into structured JSON using an OpenAI language model. It streamlines the extraction of relevant information from CVs for easier data management and analysis.

Tech Stack & Dependencies

Next.js, React, TypeScript, OpenAI, pdf-parse

Setup & Installation

  1. Clone the repository

    git clone https://github.com/Mochrks/cv-parser-ai.git
    cd cv-parser-ai
  2. Install dependencies

    npm install
  3. Configure environment variables: create a .env file in the root directory and add:

    OPENAI_API_KEY=your_openai_api_key
    
  4. Run the development server

    npm run dev
  5. Open http://localhost:3000 in your browser to see the result.
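
Step 3 is easy to get wrong silently: if OPENAI_API_KEY is missing, the request to OpenAI only fails at call time. A minimal sketch of a fail-fast check follows. The `requireEnv` helper is hypothetical and not part of this repository.

```typescript
// Hypothetical helper (not part of the repository): return the trimmed value
// of a required environment variable, or throw if it is missing or blank.
function requireEnv(
    name: string,
    env: Record<string, string | undefined>
): string {
    const value = env[name]?.trim();
    if (!value) {
        throw new Error(`Missing required environment variable: ${name}`);
    }
    return value;
}

// Possible use on the server side:
//   const apiKey = requireEnv('OPENAI_API_KEY', process.env);
```

Passing the environment as a parameter keeps the helper easy to exercise without touching the real process environment.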

Features

  • PDF Upload: Users can upload CV files in PDF format.
  • PDF Parsing: Extracts text content from uploaded PDF files.
  • AI-Powered Conversion: Utilizes OpenAI's model to convert parsed text into structured JSON.

How It Works

  1. File Upload: User uploads a CV in PDF format.
  2. PDF Parsing: The application uses pdf-parse to extract text content from the PDF.
  3. AI Processing: The extracted text is sent to OpenAI's model with a custom prompt to structure the data.
  4. JSON Generation: The AI model returns a structured JSON representation of the CV data.
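
The four steps above can be sketched as two chained requests from the client. This is an illustration under stated assumptions, not the repository's actual client code: `postJson`, `parseCv`, and the `FetchLike` type are hypothetical names, while the request and response fields (`fileContent`, `text`) follow the handlers documented in the API sections of this README.

```typescript
// Minimal fetch-like signature so the sketch can be exercised with a stub.
type FetchLike = (
    url: string,
    init: { method: string; headers: Record<string, string>; body: string }
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Hypothetical helper: POST a JSON payload and return the parsed JSON reply.
async function postJson(
    fetchFn: FetchLike,
    url: string,
    payload: unknown
): Promise<unknown> {
    const response = await fetchFn(url, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify(payload),
    });
    if (!response.ok) {
        throw new Error(`${url} failed with status ${response.status}`);
    }
    return response.json();
}

// Steps 2 to 4: extract the text, then ask the model for structured JSON.
// `fileContent` is assumed to be a base64 data URL of the uploaded PDF.
async function parseCv(
    fileContent: string,
    fetchFn: FetchLike
): Promise<unknown> {
    const extracted = (await postJson(fetchFn, '/api/process-pdf', {
        fileContent,
    })) as { text: string };
    return postJson(fetchFn, '/api/generate-json', { text: extracted.text });
}
```

In a browser, the data URL for step 1 could come from `FileReader.readAsDataURL`, and the call would look like `parseCv(dataUrl, (url, init) => fetch(url, init))`.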

API Routes

  • POST /api/process-pdf: Accepts a base64-encoded PDF and returns its extracted text.
  • POST /api/generate-json: Sends the extracted text to OpenAI and returns the CV as structured JSON.

Libraries

  • pdf-parse
  • lucide-react (icons)
  • Tailwind CSS

API Explanation

PDF Parser API Endpoint: /api/process-pdf

A simple API endpoint to convert PDF files to text using Next.js.

How It Works

  1. Receive Request

    • Accepts only the POST method; other methods receive a 405 response
    • Receives the PDF as a base64 data URL in the fileContent field of the request body
  2. Main Process

    • Converts base64 to buffer
    • Extracts text from PDF using pdf-parse
    • Returns result in JSON format

Code Explanation

// Required imports
import type { NextApiRequest, NextApiResponse } from 'next';
import pdfParse from 'pdf-parse';

export default async function handler(req: NextApiRequest, res: NextApiResponse) {
    // Check for POST method
    if (req.method !== 'POST') {
        return res.status(405).json({ error: 'Method not allowed' });
    }

    try {
        // Get file content from request body
        const { fileContent } = req.body;
        
        // Convert base64 to buffer
        const pdfBuffer = Buffer.from(fileContent.split(',')[1], 'base64');
        
        // Parse PDF to text
        const pdfData = await pdfParse(pdfBuffer);
        const text = pdfData.text;
        
        // Send success response
        return res.status(200).json({ text });
    } catch (error) {
        // Handle error
        console.error('Error processing PDF:', error);
        return res.status(500).json({ error: 'Failed to process PDF' });
    }
}
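
The handler above assumes fileContent is always a well-formed data URL. If the field is missing or contains no comma, the decoding step throws and the client receives a generic 500. A minimal sketch of stricter decoding follows. The `dataUrlToPdfBuffer` helper is hypothetical, and the accepted format (any media type, base64 payload) is an assumption about what the client sends.

```typescript
// Hypothetical helper (not part of the repository): decode a base64 data URL
// into a Buffer, rejecting anything that is not a non-empty base64 data URL.
function dataUrlToPdfBuffer(fileContent: unknown): Buffer {
    if (typeof fileContent !== 'string') {
        throw new TypeError('fileContent must be a string');
    }
    // Matches "data:<media type>;base64,<payload>" and captures the payload.
    const match = /^data:[^,]*;base64,([A-Za-z0-9+/]+={0,2})$/.exec(fileContent);
    if (!match) {
        throw new TypeError('fileContent must be a base64 data URL');
    }
    return Buffer.from(match[1], 'base64');
}
```

Inside the handler, a `TypeError` from this helper could be mapped to a 400 response so that malformed uploads are distinguishable from genuine parsing failures.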

CV Text to JSON Converter API Endpoint: /api/generate-json

A Next.js API endpoint that converts CV/Resume text into structured JSON format using OpenAI's GPT model.

How It Works

  1. Receive Request

    • Endpoint only accepts POST method
    • Receives CV text in request body
  2. Main Process

    • Sends text to OpenAI API with specific CV structure instructions
    • Processes response and converts to structured JSON
    • Returns the JSON result in the structure defined by the CV structure instructions

Code Explanation

// Required imports
import type { NextApiRequest, NextApiResponse } from 'next';
import { CV_STRUCTURE_INSTRUCTIONS } from '@/utils/cvStructure';

export default async function handler(req: NextApiRequest, res: NextApiResponse) {
    // Check for POST method
    if (req.method !== 'POST') {
        return res.status(405).json({ error: 'Method not allowed' });
    }

    try {
        // Get CV text from request body
        const { text } = req.body;

        // Make request to OpenAI API
        const response = await fetch("https://api.openai.com/v1/chat/completions", {
            method: "POST",
            headers: {
                "Content-Type": "application/json",
                "Authorization": `Bearer ${process.env.OPENAI_API_KEY}`,
            },
            body: JSON.stringify({
                model: "gpt-4o-mini",
                messages: [
                    {
                        role: "system",
                        content: CV_STRUCTURE_INSTRUCTIONS
                    },
                    {
                        role: "user",
                        content: `Convert the following CV/Resume text into JSON:\n\n${text}`
                    }
                ],
                temperature: 0.7,
                max_tokens: 2000
            }),
        });

        if (!response.ok) {
            throw new Error('OpenAI API request failed');
        }

        // Process and return JSON response
        const data = await response.json();
        return res.status(200).json(JSON.parse(data.choices[0].message.content));
    } catch (error) {
        // Handle error
        console.error('Error generating JSON:', error);
        return res.status(500).json({ error: 'Failed to generate JSON' });
    }
}
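
The handler above passes the model's reply straight to JSON.parse, which throws if the model wraps its answer in a Markdown code fence. A minimal sketch of more tolerant parsing follows. The `parseModelJson` helper is hypothetical and handles only a single surrounding fence, not arbitrary prose around the JSON.

```typescript
// Hypothetical helper (not part of the repository): parse model output as
// JSON, tolerating an optional Markdown code fence around the payload.
function parseModelJson(content: string): unknown {
    const trimmed = content.trim();
    // Capture the text between an opening fence (optionally tagged "json")
    // and the closing fence, if the whole reply is fenced.
    const fenced = /^```(?:json)?\s*([\s\S]*?)\s*```$/.exec(trimmed);
    return JSON.parse(fenced ? fenced[1] : trimmed);
}
```

Another option, if it suits the prompt, is the Chat Completions `response_format: { type: "json_object" }` parameter, which asks the model to return valid JSON. Lowering `temperature` may also make the output more consistent for extraction tasks.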

References

OpenAI API

Connect with me:

GitHub YouTube Instagram LinkedIn Behance Dribbble
