Fix: Remove device_map to prevent meta tensor errors in table detection #459

Open

micmarty-deepsense wants to merge 2 commits into main from fix/table-transformer-meta-tensor

Conversation

micmarty-deepsense (Contributor) commented Jan 27, 2026

Summary

Fixes the NotImplementedError: Cannot copy out of meta tensor error raised by table detection during multi-threaded processing.

Root Cause: How device_map Creates Meta Tensors

The Problem

When passing device_map to from_pretrained(), HuggingFace Transformers uses a special loading path:

  1. Forces low_cpu_mem_usage=True automatically (HF #33326)
  2. Creates model on meta device first - placeholder tensors with NO actual data
  3. Attempts to load state_dict onto meta tensors without assign=True (HF #37615)
  4. Tries to move to target device with .to(device)
  5. FAILS because meta tensors cannot be copied/moved (HF #26700)
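
For intuition, a minimal sketch in plain PyTorch (not the PR's code) of why step 5 fails: a meta tensor carries shape and dtype but no storage, so there is nothing to copy when it is moved to a real device.

import torch

# A meta tensor holds only metadata (shape, dtype, device) -- no actual data.
t = torch.empty(3, 3, device="meta")
print(t.shape, t.dtype, t.is_meta)  # torch.Size([3, 3]) torch.float32 True

# Moving it to a real device has no data to copy, so PyTorch raises
# NotImplementedError: Cannot copy out of meta tensor; no data!
try:
    t.to("cpu")
except NotImplementedError as exc:
    print(exc)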

From HF issue #33326:

"Tensors created on the meta device are meaningless empty tensors, which renders initialization code completely ineffective."

From HF issue #26700:

"When using device_map, the transformers library creates a context manager that sets the default device to 'meta'. During initialization, the code attempts to copy weights from the original modules. However, since the backbone was created on the meta device, the weights are not materialized, causing the copy operation to fail."

Two Different Code Paths in Transformers

With device_map (broken path):

from_pretrained(device_map="cpu")
  → Accelerate's distributed loading
    → Initialize on meta device
      → Load state_dict onto meta tensors
        → Try to move to CPU
          → ERROR: Cannot copy from meta tensor

Without device_map (working path):

from_pretrained()
  → Normal PyTorch loading
    → Initialize on CPU with real tensors
      → Load weights directly into memory
        → .to(device) moves real data
          → SUCCESS
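
A small sanity check (a sketch, not part of the PR) can confirm that the working path really materialized every parameter before .to(device) is called:

import torch

def all_materialized(model: torch.nn.Module) -> bool:
    # True only if no parameter or buffer is still a storage-less meta tensor.
    return not any(t.is_meta for t in list(model.parameters()) + list(model.buffers()))

# Usage sketch: load without device_map, verify, then move real tensors.
# model = TableTransformerForObjectDetection.from_pretrained(model_name)
# assert all_materialized(model)
# model.to(device)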

Why device_map Exists

Originally designed for HUGE models (>100B params) that don't fit in memory. Our TableTransformer models (~500MB) don't need this optimization and shouldn't use it.
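
For contrast, a sketch of the case device_map is actually meant for, sharding a model too large for a single device (the model id below is a placeholder, not something we load):

from transformers import AutoModelForCausalLM

# device_map="auto" lets Accelerate split a very large model across available
# GPUs, CPU RAM, and disk; useful for >100B-parameter models, unnecessary for
# a ~500MB TableTransformer.
model = AutoModelForCausalLM.from_pretrained(
    "some-org/some-very-large-model",  # placeholder model id
    device_map="auto",
)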

Changes

  • Removed device_map from DetrImageProcessor.from_pretrained() (line 75)
  • Removed device_map from TableTransformerForObjectDetection.from_pretrained() (lines 85-88)
  • Added explicit .to(self.device) after model loading (line 87)

# BEFORE (broken)
self.model = TableTransformerForObjectDetection.from_pretrained(
    model,
    device_map=self.device,  # Forces meta device path
)

# AFTER (working)
self.model = TableTransformerForObjectDetection.from_pretrained(model)  # Normal loading
self.model.to(self.device)  # Move real tensors to device
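
Put together, a sketch of the full fixed loading sequence (class and argument names are illustrative, not the exact tables.py code; the model id is the public TableTransformer checkpoint and is only an assumption about what we load):

from transformers import DetrImageProcessor, TableTransformerForObjectDetection

class TableDetector:  # hypothetical wrapper, stands in for the agent in tables.py
    def __init__(self, model_name="microsoft/table-transformer-detection", device="cpu"):
        self.device = device
        # No device_map here: the image processor holds no model weights, so the
        # parameter was never doing anything useful for it.
        self.feature_extractor = DetrImageProcessor.from_pretrained(model_name)
        # Normal PyTorch loading path: real tensors are created in CPU memory...
        self.model = TableTransformerForObjectDetection.from_pretrained(model_name)
        # ...then moved to the target device as ordinary, materialized tensors.
        self.model.to(self.device)
        self.model.eval()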

Pattern Consistency

This matches the fix pattern used for SentenceTransformer models in core-product:

  • c8b175f7: Added device parameter to model constructor
  • db636932: Made thread-safe with @threadsafe_lazyproperty

Antonio's PR #446 addressed a different issue (thread-safety race condition), while this PR fixes the underlying meta tensor problem.
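
For context, a minimal sketch of the thread-safe lazy-loading pattern referenced above; the real threadsafe_lazyproperty lives in core-product, so this is only an illustration of the idea (double-checked locking around one-time initialization):

import threading

def threadsafe_lazyproperty(fn):
    # Compute the value once, under a lock, and cache it on the instance.
    lock = threading.Lock()
    attr = "_" + fn.__name__

    @property
    def wrapper(self):
        if not hasattr(self, attr):
            with lock:
                if not hasattr(self, attr):  # re-check after acquiring the lock
                    setattr(self, attr, fn(self))
        return getattr(self, attr)

    return wrapper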

Testing

The error was observed in an in-VPC customer deployment with strategy=fast + table detection during OCR processing.

Stacktrace (from customer deployment)

tables_agent_patch.py:297 → patched_load_agent
  → tables.py:85 → initialize
    → TableTransformerForObjectDetection.from_pretrained()
      → model.to(device)  ← META TENSOR ERROR

References

  • HuggingFace Transformers issue #33326
  • HuggingFace Transformers issue #26700
  • HuggingFace Transformers issue #37615

Commits

The device_map parameter with HuggingFace Transformers can cause
NotImplementedError "Cannot copy out of meta tensor" in multi-threaded
contexts when loading TableTransformerForObjectDetection models.

This fix:
- Removes device_map from DetrImageProcessor.from_pretrained()
- Removes device_map from TableTransformerForObjectDetection.from_pretrained()
- Uses explicit .to(device) after model loading instead

This pattern matches the fix applied to SentenceTransformer models in
core-product (commits c8b175f7 and db636932).

Error observed in in-vpc customer deployment when using strategy=fast with table detection.
…rors

Removed manual bitmap.close() and page.close() calls in convert_pdf_to_image()
to prevent pypdfium2 AssertionError during concurrent PDF processing.

Issue: When manually closing child objects (bitmap, page) followed by parent
PDF close, pypdfium2's weakref finalizers can run after parent closes,
triggering assertion failures in cleanup logic.

Solution: Let pypdfium2 finalizers handle resource cleanup automatically.
This prevents double-cleanup race conditions and simplifies code.

Version: Bumped to 1.1.9
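
A sketch of what the revised convert_pdf_to_image flow looks like under that commit (function and parameter names are illustrative, not the exact core-product code):

import pypdfium2 as pdfium

def convert_pdf_to_image(path, scale=2.0):
    pdf = pdfium.PdfDocument(path)
    images = []
    for i in range(len(pdf)):
        page = pdf[i]
        bitmap = page.render(scale=scale)
        images.append(bitmap.to_pil())
        # No manual bitmap.close() / page.close() here: pypdfium2's weakref
        # finalizers release child objects in a safe order, avoiding the
        # double-cleanup race seen under concurrent processing.
    return images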