Training not converging well, Dataset available #22

@samhodge-aiml

Here are my modifications to the source code:

diff --git a/configs/Test/images.yaml b/configs/Test/images.yaml
index 81a5824..4435fb2 100644
--- a/configs/Test/images.yaml
+++ b/configs/Test/images.yaml
@@ -12,4 +12,5 @@ training:
   auto_scheduler: True
   eval_pose_every: -1
 extract_images:
-  resolution: [540, 960]
\ No newline at end of file
+  resolution: [3024, 4032]
+with_depth: False
diff --git a/configs/preprocess.yaml b/configs/preprocess.yaml
index c56b1fd..d3ec72c 100644
--- a/configs/preprocess.yaml
+++ b/configs/preprocess.yaml
@@ -1,9 +1,9 @@
 depth:
   type: DPT
 dataloading:
-  path: data/nerf_llff_data
-  scene: ['fern']
+  path: data/Test
+  scene: ['images']
   resize_factor: 
   load_colmap_poses: False
 training:
-  mode: 'all'
\ No newline at end of file
+  mode: 'all'
diff --git a/dataloading/dataset.py b/dataloading/dataset.py
index d40af73..846273d 100644
--- a/dataloading/dataset.py
+++ b/dataloading/dataset.py
@@ -82,11 +82,16 @@ class DataField(object):
         _, _, h, w = imgs.shape
 
         if customized_focal:
-            focal_gt = np.load(os.path.join(load_dir, 'intrinsics.npz'))['K'].astype(np.float32)
+            #focal_gt = np.load(os.path.join(load_dir, 'intrinsics.npz'))['K'].astype(np.float32)
+            FX_ = 13/35.0
+            CX_ = 4032
+            CY_ = 3024
+            FY_= FX_*(CY_/CX_)
+            focal_gt = [[FX_, 0, CX_], [0, FY_, CY_], [0, 0, 1]]
             if resize_factor is None:
                 resize_factor = 1
-            fx = focal_gt[0, 0] / resize_factor
-            fy = focal_gt[1, 1] / resize_factor
+            fx = focal_gt[0][0] / resize_factor
+            fy = focal_gt[1][1] / resize_factor
         else:
             if load_colmap_poses:
                 fx, fy = focal, focal
diff --git a/environment.yaml b/environment.yaml
index dfde749..a81a313 100644
--- a/environment.yaml
+++ b/environment.yaml
@@ -4,12 +4,13 @@ channels:
   - conda-forge
   - anaconda
   - defaults
+  - nvidia
 dependencies:
-  - python=3.9
-  - pytorch=1.7
-  - torchvision=0.8.2 
-  - torchaudio 
-  - cudatoolkit=10.1
+  - python
+  - pytorch=2.0.0
+  - torchvision=0.15.0
+  - torchaudio=2.0.0
+  - pytorch-cuda=11.8
   - cffi
   - cython
   - imageio
@@ -39,4 +40,4 @@ dependencies:
     - lpips
     - setuptools
     - kornia==0.5.0
-    - imageio-ffmpeg
\ No newline at end of file
+    - imageio-ffmpeg

And here is my dataset:

https://drive.google.com/drive/folders/1ZZgZUrFrnP47rx8bN5K6yvYnSC50a-9G?usp=sharing

To start training, I put the images in

data/Test/images/images

and then ran the preprocess and train commands.

The resulting TensorBoard log is attached here:

log.zip

Screenshot from 2023-09-03 13-07-27

Is this OK, or did I muck up the intrinsics?

I have attached a JPG so that you can look at the EXIF information:

6063

Screenshot from 2023-09-03 13-09-33

I think the focal length may be 14 rather than 13. I will try again.
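For reference, a pinhole intrinsics matrix is conventionally expressed in pixels, with the principal point near the image centre, whereas the diff sets FX_ to a unitless ratio (13/35.0) and CX_, CY_ to the full image size. The following is a minimal sketch of how pixel intrinsics could be estimated, assuming the 13 is a 35 mm-equivalent focal length in millimetres and that the equivalence is defined on the sensor diagonal. The helper name `intrinsics_from_35mm` is hypothetical and is not part of the repository.

```python
import math


def intrinsics_from_35mm(f35_mm, width_px, height_px):
    """Approximate a 3x3 pinhole matrix K from a 35 mm-equivalent focal length.

    Hypothetical helper for illustration only; not part of the repository.
    """
    # Assumption: the 35 mm equivalence refers to the diagonal of a
    # 36 x 24 mm full-frame sensor.
    full_frame_diag_mm = math.hypot(36.0, 24.0)
    image_diag_px = math.hypot(width_px, height_px)

    # Scale the focal length from millimetres on the reference sensor
    # to pixels on the actual image.
    fx = f35_mm / full_frame_diag_mm * image_diag_px
    fy = fx  # assumption: square pixels

    # Assumption: the principal point sits at the image centre.
    cx = width_px / 2.0
    cy = height_px / 2.0

    return [[fx, 0.0, cx], [0.0, fy, cy], [0.0, 0.0, 1.0]]


# Image size taken from the resolution in configs/Test/images.yaml
# (height 3024, width 4032); 13.0 is the assumed 35 mm-equivalent value.
K = intrinsics_from_35mm(13.0, 4032, 3024)
print(K[0][0], K[0][2], K[1][2])
```

Under these assumptions the focal length comes out at roughly 1514 px, with the principal point at (2016, 1512). Trying 14 instead of 13 only changes the first argument.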
