innovationgarage/label-V

LabelV is a semi-automatic video annotation tool for computer vision training data generation

Installation

sudo apt install ffmpeg

pip install .

Quick start

  • Clone this repository, install it from the root directory as described above, and then run
labelv-service

More detailed explanation and show-case

There is a blog post describing how this is implemented using OpenCV and how it can be used in generating training data for object detection algorithms.

Data format

Concepts:

  • Session - a set of keyframes generated by a certain user for a certain video
  • Frame - a video is conceptually made up of consecutive, numbered images
  • Keyframe - a frame annotation created by a user, containing labels
  • Label - an object label for an object in the video, such as a chair, a lamp, a bike, etc.
  • Bbox - a bounding box around an object in the video
  • Title - a string describing a label
  • Group - a label that contains other groups and labels. The bbox of a group always exactly contains all the bboxes of its children.
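The containment rule for groups can be sketched as a union of the children's bboxes. This is only an illustration: it assumes a bbox is [x, y, width, height] in pixels, which the example values further down suggest but which is not stated here, and union_bbox is a hypothetical helper, not a function from LabelV.

```python
def union_bbox(bboxes):
    """Smallest [x, y, width, height] box containing every box in bboxes.

    Assumes each bbox is [x, y, width, height]; that layout is an assumption.
    """
    left = min(x for x, y, w, h in bboxes)
    top = min(y for x, y, w, h in bboxes)
    right = max(x + w for x, y, w, h in bboxes)
    bottom = max(y + h for x, y, w, h in bboxes)
    return [left, top, right - left, bottom - top]


children = [[208, 214, 69, 84], [300, 200, 50, 40]]
group_bbox = union_bbox(children)
print(group_bbox)  # [208, 200, 142, 98]
```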

Whenever a video is uploaded, it is saved under upload/video/VIDEO_ID.EXT where VIDEO_ID is a unique random string and EXT is the file format extension of your video.

Every time a user starts working with a video, adding keyframes and labels, a session is created. The session is stored under upload/session/VIDEO_ID.EXT-SESSION_ID where SESSION_ID is a unique random string. This file contains a JSON object.

The session object contains a "keyframes" member whose keys are keyframe frame numbers (as strings, because JSON object keys must be strings), and whose values are keyframe objects:

{"keyframes": {"14": KEYFRAME_OBJECT,
               "26": KEYFRAME_OBJECT,
               "200": KEYFRAME_OBJECT}}
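Because the frame numbers are stored as strings, sorting the keys directly would order "200" before "26". A minimal sketch of reading the keyframes in frame order follows; sorted_keyframes is a hypothetical helper, and the inline JSON stands in for the contents of a real session file.

```python
import json


def sorted_keyframes(session):
    """Return (frame_number, keyframe_object) pairs in ascending frame order."""
    pairs = ((int(number), keyframe)
             for number, keyframe in session["keyframes"].items())
    return sorted(pairs, key=lambda pair: pair[0])


# Stand-in for json.load() on a file under upload/session/.
session = json.loads('{"keyframes": {"14": {}, "200": {}, "26": {}}}')
frame_numbers = [number for number, _ in sorted_keyframes(session)]
print(frame_numbers)  # [14, 26, 200]
```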

Each keyframe object has a set of labels and a KEYFRAME_KEY. The KEYFRAME_KEY is a unique id used to identify this particular set of labels for this particular frame. If the user were to change the keyframe, a new key would be generated.

KEYFRAME_OBJECT = {"key": "KEYFRAME_KEY",
                   "data": {"labels": ITEM}}

The keyframe labels reside under the key "labels" under the key "data" and form a recursively defined structure. At each level, one of two possible objects can be present:

A label

ITEM = {"type": "Label",
        "args": {"bbox": [208,214,69,84],
                 "title": "The chair"}}

or a group

ITEM = {"type": "Group",
        "args": {"bbox": [208,214,69,84],
                 "children": [ITEM,ITEM,...],
                 "title": "Dining group"}}
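Since groups nest, reading every label out of a keyframe takes a recursive walk. The sketch below flattens an ITEM into its labels, recording the titles of the enclosing groups. iter_labels and the sample data are illustrative only and are not part of LabelV.

```python
def iter_labels(item, path=()):
    """Yield (group_titles, title, bbox) for every Label at or below item."""
    args = item["args"]
    if item["type"] == "Label":
        yield path, args["title"], args["bbox"]
    elif item["type"] == "Group":
        # Descend into each child, extending the path with this group's title.
        for child in args["children"]:
            yield from iter_labels(child, path + (args["title"],))
    else:
        raise ValueError("unknown item type: %r" % (item["type"],))


dining = {"type": "Group",
          "args": {"bbox": [208, 214, 169, 84],
                   "title": "Dining group",
                   "children": [
                       {"type": "Label",
                        "args": {"bbox": [208, 214, 69, 84],
                                 "title": "The chair"}},
                       {"type": "Label",
                        "args": {"bbox": [300, 220, 77, 78],
                                 "title": "The table"}}]}}

labels = list(iter_labels(dining))
for groups, title, bbox in labels:
    print("/".join(groups + (title,)), bbox)
```

With a loaded keyframe object, the same walk would start from the ITEM stored under "data".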

When a user navigates to a non-keyframe, the tracker tracks the bboxes from the last keyframe before the current frame, and generates updated bboxes for all frames in between. These are stored under upload/tracker/VIDEO_ID.EXT/KEYFRAME_NUMBER/KEYFRAME_KEY/FRAME_NUMBER.json where FRAME_NUMBER is the frame number minus the keyframe frame number (so numbering starts from zero). Each such file contains an ITEM, as defined above, encoded as JSON.
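The path layout above can be sketched as follows: find the last keyframe at or before a frame, then build the tracker file path from the keyframe number, its key, and the frame offset. Both function names and the root parameter are hypothetical. This only mirrors the layout described here and is not LabelV's own code.

```python
import bisect


def last_keyframe_before(session, frame):
    """Frame number of the last keyframe at or before frame, or None."""
    numbers = sorted(int(n) for n in session["keyframes"])
    index = bisect.bisect_right(numbers, frame) - 1
    return numbers[index] if index >= 0 else None


def tracker_path(video_file, session, frame, root="upload"):
    """Path of the tracked ITEM for frame, or None if no keyframe precedes it."""
    keyframe_number = last_keyframe_before(session, frame)
    if keyframe_number is None:
        return None
    key = session["keyframes"][str(keyframe_number)]["key"]
    offset = frame - keyframe_number  # FRAME_NUMBER is relative to the keyframe
    return "/".join([root, "tracker", video_file,
                     str(keyframe_number), key, "%d.json" % offset])


session = {"keyframes": {"14": {"key": "abc"}, "26": {"key": "def"}}}
print(tracker_path("VIDEO_ID.EXT", session, 20))
# upload/tracker/VIDEO_ID.EXT/14/abc/6.json
```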
