streamlit app using Table Transformer and OCR #200

Closed
salman-moh wants to merge 16 commits into NielsRogge:master from salman-moh:master

@salman-moh

This PR adds OCR so that detected tables can be downloaded directly as CSV files.
Hugging Face Space: https://huggingface.co/spaces/SalML/TableTransformer2CSV

@salman-moh salman-moh marked this pull request as draft October 19, 2022 07:18
@salman-moh salman-moh marked this pull request as ready for review October 19, 2022 07:29
Comment thread Table Transformer/app.py Outdated
Comment thread Table Transformer/app.py Outdated
salman-moh and others added 5 commits October 19, 2022 13:04
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
@maxjeblick

When uploading an image to the app, I get the error below (probably because the table extraction failed on that particular image):

AttributeError: 'UploadedFile' object has no attribute 'split'
Traceback:

File "/home/user/.local/lib/python3.8/site-packages/streamlit/scriptrunner/script_runner.py", line 554, in _run_script
    exec(code, module.__dict__)
File "/home/user/app/app.py", line 501, in <module>
    asyncio.run(te.start_process(img_name, TD_THRESHOLD=0.6, TSR_THRESHOLD=0.8, padd_top=padd_top, padd_left=padd_left, padd_bottom=padd_bottom, padd_right=padd_right, delta_xmin=0, delta_ymin=0, delta_xmax=0, delta_ymax=0, expand_rowcol_bbox_top=0, expand_rowcol_bbox_bottom=0))
File "/usr/local/lib/python3.8/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
File "/usr/local/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
    return future.result()
File "/home/user/app/app.py", line 438, in start_process
    print('No table found in the pdf-page image'+image_path.split('/')[-1])
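
The traceback suggests that `image_path` is a Streamlit `UploadedFile` rather than a string path, so calling `.split('/')` on it fails. A minimal sketch of one way to build the message safely, assuming the uploaded object exposes its file name through a `.name` attribute (as Streamlit's `UploadedFile` does); the `display_name` helper and the `FakeUpload` class are hypothetical names used only for illustration, not code from the app:

```python
import os


def display_name(image_source):
    """Return a short file name for a path string or an uploaded-file object."""
    # Uploaded-file objects carry the original file name in `.name`;
    # plain strings have no such attribute, so fall back to the string itself.
    name = getattr(image_source, "name", None)
    if name is None:
        name = str(image_source)
    # basename handles both "dir/file.png" and a bare "file.png".
    return os.path.basename(name)


class FakeUpload:
    """Hypothetical stand-in for an uploaded file, used only in this demo."""

    def __init__(self, name):
        self.name = name


message = "No table found in the pdf-page image " + display_name(FakeUpload("page_1.png"))
print(message)
```

The same helper also accepts an ordinary path string such as `"pages/page_1.png"`, so the call site would not need to know which kind of input it received.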

@salman-moh
Author

salman-moh commented Oct 19, 2022

Yes, the model did not find any bounding box. Thanks for reporting this; I have updated the app to print 'no table found' in that case.
I also added a slider for the threshold. @maxjeblick, try lowering the threshold and checking again.
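
To illustrate why lowering the threshold can change the outcome, here is a rough sketch of score-based filtering with a 'no table found' fallback. The scores and boxes are made-up values, and `filter_detections` and `report` are hypothetical helpers, not the app's actual functions; in a Streamlit app the threshold would presumably come from `st.slider`:

```python
def filter_detections(scores, boxes, threshold):
    """Keep only the boxes whose confidence score reaches the threshold."""
    return [box for score, box in zip(scores, boxes) if score >= threshold]


def report(scores, boxes, threshold):
    """Summarise the detections, handling the empty case without raising."""
    kept = filter_detections(scores, boxes, threshold)
    if not kept:
        return "no table found"
    return f"{len(kept)} table(s) found"


# Made-up detector output: two candidate boxes with low confidence.
scores = [0.55, 0.30]
boxes = [(10, 20, 200, 300), (5, 5, 50, 50)]

print(report(scores, boxes, threshold=0.6))  # threshold matching TD_THRESHOLD=0.6 in the traceback
print(report(scores, boxes, threshold=0.5))  # a lower threshold keeps the first box
```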

@salman-moh
Author

@NielsRogge, let me know if you'd like any other changes.

@salman-moh salman-moh requested a review from NielsRogge October 22, 2022 09:05
@NielsRogge
Owner

Hi,

Thanks for your PR. Maybe it's clearer to just include a link to your demo; I'd like to keep this repo just for notebooks.

@salman-moh
Author

Sounds good. I have removed app.py and included a demo link with a screenshot in the README.

Comment thread Table Transformer/README.md Outdated
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
@salman-moh salman-moh requested a review from NielsRogge October 22, 2022 11:31
Comment thread README.md Outdated
@salman-moh salman-moh requested a review from NielsRogge November 2, 2022 08:16
@NielsRogge
Owner

Thanks for building this. However, I'll close it since this repository is meant for notebooks only. You could of course host the Streamlit app on Spaces.

@NielsRogge NielsRogge closed this Mar 7, 2026

4 participants