jeshwang/cs5001-final

Final Project: Grocery List

  • Author: Jessica Hwang
  • Course: CS 5001 & 5003 - Intensive Foundations of Computer Science
  • Program: Align MS in CS, Khoury College of Computer Sciences at Northeastern University
  • Instructors: Dr. Albert Lionelle, Dr. Mark Miller
  • Semester: Spring 2023

This is a Python 3 program that transforms an audio recording of a grocery list into a simple text version of that list. The input is an audio recording in the form of a .wav file. The output is a .txt file.

Goals

  • Learn a new Python library
  • Begin exploration on speech recognition and ML
  • Use programming concepts learned in the course, such as:
    • Modular design
    • Divide-and-conquer
    • Defensive programming: error handling
    • OOP basics: abstraction, encapsulation
    • Functions, dictionaries, classes

Tools Used

Instructions

Run the program from app.py, which transcribes the sample file grocery_final.wav and writes the output to "New List.txt". Unless the code is changed, every run overwrites the existing content of "New List.txt".

The execution of this app depends on an Internet connection and the availability of the Google Web Speech API. As of April 2023, Google does not require any authentication with an API key or a username/password combination.
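The dependency on the network and on the API can be handled defensively. The following is a minimal sketch, assuming the speech_recognition package (installed as SpeechRecognition); the function name `transcribe_wav` is hypothetical and is not taken from app.py.

```python
def transcribe_wav(path):
    """Return the transcript of a .wav file, or None if transcription fails."""
    # Imported inside the function so this module still loads when the
    # third-party package is missing (pip install SpeechRecognition).
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile(path) as source:
        audio = recognizer.record(source)  # read the whole file into memory

    try:
        # Sends the audio to the Google Web Speech API; needs Internet access.
        return recognizer.recognize_google(audio)
    except sr.UnknownValueError:
        print("The audio could not be understood.")
    except sr.RequestError as error:
        print(f"The Web Speech API could not be reached: {error}")
    return None
```

Returning None on failure is one possible design; raising a custom exception would work equally well.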

Instructions for transcribing your own audio file

  • Audio files must be in .wav format. Replace "grocery_final.wav" with the name of your new file in the main() of app.py.
  • Audio must contain the following keywords for the program to run properly:
    • "start" indicates the start of the list. All audio before this word will be disregarded in the final output.
    • "stop" indicates the end of the list. All audio after this word will be disregarded in the final output.
    • "comma" separates each list item from the next.
  • Every list item in the audio MUST give its details in this order:
    • quantity (required), such as "one"
    • unit (optional), such as "package of". When a unit is included, it must be followed by "of".
    • item name (required), such as "chicken legs" or "apple"
  • The program does not have language capabilities beyond English.
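The keyword rules above can be illustrated with a short sketch. It is not the code in shopping.py; the function names `split_items` and `parse_item` are hypothetical, and it assumes the transcript arrives as one lowercase-able string of space-separated words.

```python
def split_items(transcript):
    """Return the words of each item spoken between 'start' and 'stop'."""
    words = transcript.lower().split()
    try:
        begin = words.index("start") + 1
        end = words.index("stop", begin)  # first "stop" after "start"
    except ValueError:
        raise ValueError("transcript needs 'start' followed by 'stop'") from None

    items, current = [], []
    for word in words[begin:end]:
        if word == "comma":  # "comma" closes the current item
            if current:
                items.append(current)
            current = []
        else:
            current.append(word)
    if current:  # last item has no trailing "comma"
        items.append(current)
    return items


def parse_item(words):
    """Split one item's words into (quantity, unit, name); unit may be None."""
    if len(words) < 2:
        raise ValueError("an item needs at least a quantity and a name")
    quantity, rest = words[0], words[1:]
    unit = None
    if "of" in rest:  # everything before "of" is treated as the unit
        position = rest.index("of")
        unit = " ".join(rest[:position]) or None
        rest = rest[position + 1:]
    if not rest:
        raise ValueError("an item needs a name")
    return quantity, unit, " ".join(rest)
```

For example, the transcript "hello start one package of chicken legs comma two apple stop bye" yields two items, the first with the unit "package" and the second with no unit.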

File Directory

Code

The main code is divided into:

  • app.py: contains the function that calls the API, and main().
  • shopping.py: contains helper functions that transform the raw transcription into text, and then write that text to a file.
  • shopping_classes.py: contains ListItem and ShoppingList classes, and their individual attributes.
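As a rough illustration of how two such classes might fit together, here is a minimal sketch. The class names come from this README, but every attribute and method shown (`quantity`, `unit`, `name`, `add`, `to_text`, `save`) is an assumption and may differ from shopping_classes.py.

```python
class ListItem:
    """One grocery item: a quantity, an optional unit, and a name."""

    def __init__(self, quantity, name, unit=None):
        self.quantity = quantity
        self.name = name
        self.unit = unit

    def __str__(self):
        if self.unit:
            return f"{self.quantity} {self.unit} of {self.name}"
        return f"{self.quantity} {self.name}"


class ShoppingList:
    """An ordered collection of ListItem objects with a title."""

    def __init__(self, title="New List"):
        self.title = title
        self._items = []  # kept private; callers go through add()

    def add(self, item):
        self._items.append(item)

    def __len__(self):
        return len(self._items)

    def to_text(self):
        """Return the title followed by one line per item."""
        lines = [self.title] + [f"- {item}" for item in self._items]
        return "\n".join(lines) + "\n"

    def save(self, path):
        # Mode "w" overwrites any existing file at this path.
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(self.to_text())
```

Keeping the item list private and exposing only `add` and `to_text` is one way to practice the encapsulation goal listed earlier.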

Dictionaries

Three data dictionaries are used to correct misspellings in the transcription returned by speech_recognition and the Google Web Speech API:

  • item.dat, which corrects commonly misspelled grocery item names
  • unit.dat, which corrects commonly misspelled unit names, such as package, box, carton, etc.
  • quantity.dat, which maps commonly misspelled number words to their integer form

These data dictionaries are sufficient for transcribing grocery_final.wav and most common American grocery items. They are meant to be expanded when future audio recordings introduce item names, quantities, or units that are transcribed incorrectly.
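The correction step amounts to a dictionary lookup with a fallback. The sketch below uses in-memory dictionaries with made-up entries; it does not reproduce the contents or the on-disk format of the .dat files, and the names `QUANTITY_FIXES`, `UNIT_FIXES`, `ITEM_FIXES`, and `correct` are hypothetical.

```python
# Hypothetical sample entries; the real .dat files may hold different data.
QUANTITY_FIXES = {"one": 1, "won": 1, "two": 2, "to": 2, "too": 2, "four": 4, "for": 4}
UNIT_FIXES = {"packages": "package", "cartoon": "carton"}
ITEM_FIXES = {"chicken lakes": "chicken legs"}


def correct(value, fixes):
    """Return the corrected form of value, or value itself if no fix is known."""
    return fixes.get(value, value)


def correct_item(quantity, unit, name):
    """Apply all three dictionaries to one parsed item."""
    fixed_unit = correct(unit, UNIT_FIXES) if unit is not None else None
    return correct(quantity, QUANTITY_FIXES), fixed_unit, correct(name, ITEM_FIXES)
```

Falling back to the original value means an unknown word passes through unchanged, so adding a new entry to a dictionary is enough to handle a newly observed mistake.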

Tests

All .py files used for test code are named "test_" followed by the name of the file they test. The .dat and .txt files prefixed with "test_" are used by these .py files to run tests.

References
