Commit 6b27de6

Merge pull request #77 from yoterel/master
better command line options for ui
2 parents: 1a0973f + 07546a0

6 files changed: 96 additions & 77 deletions

.gitignore

Lines changed: 3 additions & 0 deletions
```diff
@@ -8,3 +8,6 @@ src/icatcher.egg-info/
 dist/
 build/
 node_modules/
+tests/test_data/test_short/
+tests/test_data/test_short.npz
+tests/test_data/test_short.txt
```

README.md

Lines changed: 24 additions & 25 deletions
```diff
@@ -45,44 +45,44 @@ You can run iCatcher+ with the command:
 
 `icatcher --help`
 
-which will list all available options. The description below will help you get more familiar with some common command line arguments.
+Which will list all available options. Below we list some common options to help you get more familiar with iCatcher+. The pipeline is highly configurable, please see [the website](https://icatcherplus.github.io/) for more explanation about the flags.
 
 ### Annotating a Video
 To produce annotations for a video file (if a folder is provided, all videos will be used for prediction):
 
 `icatcher /path/to/my/video.mp4`
 
->**NOTE:** For any videos you wish to visualize with the web app, you must use the `--ui_packaging_path` flag:
->
->`icatcher /path/to/my/video.mp4 --ui_packaging_path /path/to/desired/output/directory/`
+**NOTE:** For any videos you wish to visualize with the [Web App](#web-app), you must use the `--output_annotation` and the `--output_format ui` flags:
 
-### Using the iCatcher Web App
-To launch the iCatcher+ web app, use:
+`icatcher /path/to/my/video.mp4 --output_annotation /path/to/desired/output/directory/ --output_format ui`
 
-`icatcher --app`
+### Common Flags
 
-The app should open automatically at [http://localhost:5001](http://localhost:5001). For more details, see [Web App](#web-app).
+- You can save a labeled video by adding:
 
-### Common Annotation Flags
-A common option is to add:
+`--output_video_path /path/to/output_folder`
 
-`icatcher /path/to/my/video.mp4 --use_fc_model`
+- If you want to output annotations to a file, use:
 
-Which enables a child face detector for more robust results (however, sometimes this can result in too much loss of data).
+`--output_annotation /path/to/output_annotation_folder`
 
-You can save a labeled video by adding:
+See [Output format](#output-format) below for more information on how the files are formatted.
 
-`--output_video_path /path/to/output_folder`
+- To show the predictions online in a seperate window, add the option:
+
+`--show_output`
 
-If you want to output annotations to a file, use:
+- To launch the iCatcher+ [Web App](#web-app) (after annotating), use:
 
-`--output_annotation /path/to/output_annotation_folder`
+`icatcher --app`
 
-To show the predictions online in a seperate window, add the option:
+The app should open automatically at [http://localhost:5001](http://localhost:5001). For more details, see [Web App](#web-app).
 
-`--show_output`
+- Originally a face classifier was used to distinguish between adult and infant faces (however this can result in too much loss of data). It can be turned on by using:
+
+`icatcher /path/to/my/video.mp4 --use_fc_model`
 
-You can also add parameters to crop the video a given percent before passing to iCatcher:
+- You can also add parameters to crop the video a given percent before passing to iCatcher:
 
 `--crop_mode m` where `m` is any of [top, left, right], specifying which side of the video to crop from (if not provided, default is none; if crop_percent is provided but not crop_mode, default is top)
 
@@ -94,22 +94,21 @@ Currently we supports 3 output formats, though further formats can be added upon
 
 - **raw_output:** a file where each row will contain the frame number, the class prediction and the confidence of that prediction seperated by a comma
 - **compressed:** a npz file containing two numpy arrays, one encoding the predicted class (n x 1 int32) and another the confidence (n x 1 float32) where n is the number of frames. This file can be loaded into memory using the numpy.load function. For the map between class number and name see test.py ("predict_from_video" function).
-- **ui_output:** needed to open a video in the web app; produces a directory of the following structure
+- **ui:** needed for viewing results in the web app; produces a directory of the following structure
 
 ├── decorated_frames # dir containing annotated jpg files for each frame in the video
-├── video.mp4 # the original video
-├── labels.txt # file containing annotations in the `raw_output` form described above
+├── labels.txt # file containing annotations in the `raw_output` format described above
 
 # Web App
 The iCatcher+ app is a tool that allows users to interact with output from the iCatcher+ ML pipeline in the browser. The tool is designed to operate entirely locally and will not upload any input files to remote servers.
 
 ### Using the UI
 
-When you open the iCatcher+ UI, you will be met with a pop-up inviting you to upload your video directory. Please note, this requires you to upload *the whole output directory* which should include a `labels.txt` file and a sub-directory containing all of the frame images from the video.
+When you open the iCatcher+ UI, you will be met with a pop-up inviting you to upload a directory. Please note, this requires you to upload *the whole output directory* which should include a `labels.txt` file and a sub-directory named `decorated_frames` containing all of the frames of the video as image files.
 
-Once you've submitted the video, you should see a pop-up asking if you want to upload the whole video. Rest assured, this will not upload those files through the internet or to any remote servers. This is only giving the local browser permission to access those files. The files will stay local to whatever computer is running the browser.
+Once you've uploaded a directory, you should see a pop-up asking whether you are sure want to upload all files. Rest assured, this will not upload the files to any remote servers. This is only giving the local browser permission to access those files. The files will stay local to whatever computer is running the browser.
 
-At this point, you should see your video on the screen (you may need to give it a few second to load). Now you can start to review your annotations. Below the video you'll see heatmaps giving you a visual overview of the labels for each frame, as well as the confidence level for each frame.
+At this point, you should see the video on the screen (you may need to give it a few second to load). Now you can start to review the annotations. Below the video you'll see heatmaps giving you a visual overview of the labels for each frame, as well as the confidence level for each frame.
 
 # Datasets access
```
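A note on the `compressed` format touched by this diff: the README states the npz file holds two numpy arrays (class ids and confidences) readable with `numpy.load`. A minimal round-trip sketch with toy data (the file name `demo_annotation.npz` is invented for illustration):

```python
import numpy as np

# Write a toy "compressed" annotation file: per-frame class ids (int32)
# and confidences (float32), matching the shapes the README describes.
classes = np.array([0, 1, 1, 2], dtype=np.int32)
confidences = np.array([0.9, 0.8, 0.95, 0.7], dtype=np.float32)
np.savez("demo_annotation.npz", classes, confidences)

# Unnamed arrays passed to savez are exposed as arr_0, arr_1, ...
data = np.load("demo_annotation.npz")
predicted_classes = data["arr_0"]
conf = data["arr_1"]
print(len(predicted_classes))  # number of frames: 4
```

The `arr_0`/`arr_1` keys are the same ones the project's test suite uses when reading the real output.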

src/icatcher/cli.py

Lines changed: 38 additions & 29 deletions
```diff
@@ -269,15 +269,6 @@ def create_output_streams(video_path, framerate, resolution, opt):
     fourcc = cv2.VideoWriter_fourcc(
         *"MP4V"
     )  # may need to be adjusted per available codecs & OS
-    if opt.ui_packaging_path:
-        video_creator = lambda path: cv2.VideoWriter(
-            str(path), fourcc, framerate, resolution, True
-        )
-        ui_output_components = ui_packaging.prepare_ui_output_components(
-            opt.ui_packaging_path,
-            video_path,
-            video_creator,
-        )
     if opt.output_video_path:
         my_video_path = Path(opt.output_video_path, video_path.stem + "_output.mp4")
         video_output_file = cv2.VideoWriter(
@@ -286,7 +277,15 @@ def create_output_streams(video_path, framerate, resolution, opt):
     if opt.output_annotation:
         if opt.output_format == "compressed":
             prediction_output_file = Path(opt.output_annotation, video_path.stem)
-        else:
+            npz_extension = Path(str(prediction_output_file) + ".npz")
+            if npz_extension.exists():
+                if opt.overwrite:
+                    npz_extension.unlink()
+                else:
+                    raise FileExistsError(
+                        "Annotation output file already exists. Use --overwrite flag to overwrite."
+                    )
+        elif opt.output_format == "raw_output":
             prediction_output_file = Path(
                 opt.output_annotation, video_path.stem + opt.output_file_suffix
             )
@@ -297,6 +296,16 @@ def create_output_streams(video_path, framerate, resolution, opt):
                 raise FileExistsError(
                     "Annotation output file already exists. Use --overwrite flag to overwrite."
                 )
+        elif opt.output_format == "ui":
+            ui_output_components = ui_packaging.prepare_ui_output_components(
+                opt.output_annotation,
+                video_path,
+                opt.overwrite,
+            )
+        else:
+            raise NotImplementedError(
+                "output format {} not implemented".format(opt.output_annotation)
+            )
 
     return video_output_file, prediction_output_file, ui_output_components, skip
 
@@ -667,25 +676,25 @@ def handle_output(
                 confidence,
             )
         )
-    if opt.ui_packaging_path:
+    elif opt.output_format == "ui":
         if is_from_tracker and opt.track_face:
             rect_color = (0, 0, 255)
         else:
             rect_color = (0, 255, 0)
         output_for_ui = ui_packaging.prepare_frame_for_ui(
             cur_frame,
             cur_bbox,
             rect_color=rect_color,
             conf=confidence,
             class_text=class_text,
             frame_number=frame_count + cursor + 1,
             pic_in_pic=opt.pic_in_pic,
         )
         ui_packaging.save_ui_output(
             frame_idx=frame_count + cursor + 1,
             ui_output_components=ui_output_components,
             output_for_ui=output_for_ui,
         )
     logging.info(
         "frame: {}, class: {}, confidence: {:.02f}, cur_fps: {:.02f}".format(
             str(frame_count + cursor + 1),
```

src/icatcher/options.py

Lines changed: 2 additions & 5 deletions
```diff
@@ -109,16 +109,13 @@ def parse_arguments(my_string=None):
         "--output_format",
         type=str,
         default="raw_output",
-        choices=["raw_output", "compressed"],
+        choices=["raw_output", "compressed", "ui"],
+        help="Selects output format.",
     )
     parser.add_argument(
         "--output_video_path",
         help="If present, annotated video will be saved to this folder.",
     )
-    parser.add_argument(
-        "--ui_packaging_path",
-        help="If present, packages the output data into the UI format.",
-    )
     parser.add_argument(
         "--pic_in_pic",
         action="store_true",
     )
```

src/icatcher/ui_packaging.py

Lines changed: 13 additions & 17 deletions
```diff
@@ -1,35 +1,33 @@
-import json
 import cv2
 import numpy as np
 from pathlib import Path
 from icatcher import draw
-
-from typing import Callable, Dict, Union, Tuple
+from typing import Dict, Tuple
 
 
 def prepare_ui_output_components(
-    ui_packaging_path: str, video_path: str, video_creator: Callable
-) -> Dict[str, Union[cv2.VideoWriter, str]]:
+    ui_packaging_path: str, video_path: str, overwrite: bool
+) -> Dict[str, str]:
     """
     Given a path to a directory, prepares a dictionary of paths and videos necessary for the UI.
 
     :param ui_packaging_path: path to folder in which the output will be saved
     :param video_path: the original video path
-    :param video_creator: a function to create video files given a path
+    :param overwrite: if true and label file already exists, overwrites it. else will throw an error.
     :return: a dictionary mapping each UI component to its path or video writer
     """
 
-    decorated_video_path = Path(
-        ui_packaging_path, video_path.stem, "decorated_video.mp4"
-    )
+    labels_path = Path(ui_packaging_path, video_path.stem, "labels.txt")
+    if labels_path.exists():
+        if overwrite:
+            labels_path.unlink()
+        else:
+            raise FileExistsError(
+                "Annotation output file already exists. Use --overwrite flag to overwrite."
+            )
     decorated_frames_path = Path(ui_packaging_path, video_path.stem, "decorated_frames")
-
     decorated_frames_path.mkdir(parents=True, exist_ok=True)
-
-    labels_path = Path(ui_packaging_path, video_path.stem, "labels.txt")
-
     ui_output_components = {
-        "decorated_video": video_creator(decorated_video_path),
         "decorated_frames_path": decorated_frames_path,
         "labels_path": labels_path,
     }
@@ -89,12 +87,10 @@ def save_ui_output(frame_idx: int, ui_output_components: Dict, output_for_ui: Tu
     decorated_frame, label_text = output_for_ui
 
     # Save decorated frame
-    ui_output_components["decorated_video"].write(decorated_frame)
     decorated_frame_path = Path(
         ui_output_components["decorated_frames_path"], f"frame_{frame_idx:05d}.jpg"
     )
     cv2.imwrite(str(decorated_frame_path), decorated_frame)
-
-    # Wrtie new annotation to labels file
+    # Write new annotation to labels file
     with open(ui_output_components["labels_path"], "a", newline="") as f:
         f.write(f"{frame_idx}, {label_text}\n")
```

tests/test_basic.py

Lines changed: 16 additions & 1 deletion
```diff
@@ -87,6 +87,10 @@ def test_mask():
         (
             "tests/test_data/test_short.mp4 --model icatcher+_lookit.pth --fd_model opencv_dnn --output_annotation tests/test_data --mirror_annotation --output_format compressed --overwrite",
             "tests/test_data/test_short_result.txt",
         ),
+        (
+            "tests/test_data/test_short.mp4 --model icatcher+_lookit.pth --fd_model opencv_dnn --output_annotation tests/test_data --output_format ui --overwrite",
+            "tests/test_data/test_short_result.txt",
+        ),
     ],
 )
 def test_predict_from_video(args_string, result_file):
@@ -116,7 +120,7 @@ def test_predict_from_video(args_string, result_file):
         data = np.load(output_file)
         predicted_classes = data["arr_0"]
         confidences = data["arr_1"]
-    else:
+    elif args.output_format == "raw_output":
         output_file = Path("tests/test_data/{}.txt".format(Path(args.source).stem))
         with open(output_file, "r") as f:
             data = f.readlines()
@@ -125,6 +129,17 @@ def test_predict_from_video(args_string, result_file):
             [icatcher.classes[x] for x in predicted_classes]
         )
         confidences = np.array([float(x.split(",")[2].strip()) for x in data])
+    elif args.output_format == "ui":
+        output_file = Path(
+            "tests/test_data/{}/labels.txt".format(Path(args.source).stem)
+        )
+        with open(output_file, "r") as f:
+            data = f.readlines()
+        predicted_classes = [x.split(",")[1].strip() for x in data]
+        predicted_classes = np.array(
+            [icatcher.classes[x] for x in predicted_classes]
+        )
+        confidences = np.array([float(x.split(",")[2].strip()) for x in data])
     assert len(predicted_classes) == len(confidences)
     assert len(predicted_classes) == len(gt_classes)
     if args.mirror_annotation:
```
