Scan pdf and read tables with OpenCV & Tesseract OCR

$250-750 USD

Completed
Posted about 5 years ago

Paid on delivery
Project Mission:
Search for and find all tables in a PDF, then convert the images of those tables from the PDF (or other source) to CSV-formatted tables. The input is mainly PDF, but the solution must use OCR because not all PDFs are formatted for parsing. It must be able to handle tables that span two pages (see standard bank report pg 12/13).

Requirements:
- OpenCV (Python)
- Tesseract v4

A set of images of PDFs will be provided. It's important not to optimize the solution for these specific tables. The solution must be generic and will be tested against other images of tables. It is a priority to handle regular tables with high precision. Pie charts and similar diagrams are a bonus.

Proposed steps:
1. Analyze images using OpenCV to determine table cells (rows and columns).
2. Slice the input image into multiple images based on cells.
3. Use Tesseract 4 to OCR text from each cell.
4. Output data to CSV.

Expected outcome:
- Conversion is at least 95% accurate with our test set, which consists of standard tables but is not provided, to avoid overfitting.
- Docker image with all dependencies provided.
- Function / script / API that takes an image and outputs a CSV table.

Readings / Links:
- Improving quality: [login to view URL]
- Finding text blocks in an image using OpenCV: [login to view URL]
- Table analysis using histograms: [login to view URL]
- Docker OpenCV image: [login to view URL]

Attached files: pdfs to convert
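The proposed steps can be sketched roughly as follows. This is a minimal illustration, not the awarded solution: it assumes a fully ruled table (every cell bounded by drawn lines) and an image that has already been binarized into a list of rows with 1 for ink and 0 for background. It finds the ruling lines with row and column histograms, slices the image into cells, passes each cell to an OCR callable, and writes CSV. The names find_runs, find_cells, table_to_csv, ocr_cell and the min_fill threshold are all hypothetical. The sketch uses only the Python standard library so that it runs without OpenCV or Tesseract installed.

```python
import csv
import io


def find_runs(profile, full_length, min_fill=0.8):
    """Return (start, end) index runs where the ink count covers at least
    min_fill of the full line length, i.e. likely ruling lines."""
    runs, start = [], None
    for i, count in enumerate(list(profile) + [0]):  # sentinel closes a trailing run
        on = count >= min_fill * full_length
        if on and start is None:
            start = i
        elif not on and start is not None:
            runs.append((start, i - 1))
            start = None
    return runs


def find_cells(binary):
    """Step 1: locate cells as (top, bottom, left, right) boxes, end-exclusive,
    lying strictly between consecutive horizontal and vertical rulings."""
    height, width = len(binary), len(binary[0])
    rows = find_runs([sum(r) for r in binary], width)         # horizontal rulings
    cols = find_runs([sum(c) for c in zip(*binary)], height)  # vertical rulings
    return [
        [(top[1] + 1, bottom[0], left[1] + 1, right[0])
         for left, right in zip(cols, cols[1:])]
        for top, bottom in zip(rows, rows[1:])
    ]


def table_to_csv(binary, ocr_cell):
    """Steps 2-4: slice each cell, OCR it with the supplied callable,
    and return the table as CSV text."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in find_cells(binary):
        texts = []
        for top, bottom, left, right in row:
            crop = [line[left:right] for line in binary[top:bottom]]
            texts.append(ocr_cell(crop).strip())
        writer.writerow(texts)
    return out.getvalue()
```

In a real pipeline the binary image would more likely be a NumPy array produced by something like cv2.threshold, and ocr_cell could wrap a Tesseract binding such as pytesseract.image_to_string; both are assumptions here, not details from the brief. Tables without drawn rulings, and tables split across two pages, would need additional handling that this sketch does not attempt.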
Project ID: 18574010

About the project

6 proposals
Remote project
Active 5 yrs ago

Awarded to:
Hello sir. I have been working with web technologies for 6 years and studying machine learning for 2 years. I have worked with OCR and OpenCV on image-reading and emotion-detection projects. I am ready to start your job. Please ping me if interested.
$500 USD in 10 days
4.3 (11 reviews)
3.6
6 freelancers are bidding on average $1,101 USD for this job
Hi there, I've read your project description and I am confident enough that I can handle this project according to your expectations. I have done similar projects before and I want to take over this project as well. If you're interested then please contact me to see my portfolio :) I'll be waiting for your response. Regards
$550 USD in 18 days
5.0 (9 reviews)
5.5
Hi there. Roaya is a startup based in Egypt, and we are an official Odoo partner. We are ready to start working on your project. Please let us discuss the details. Regards, Mohammad Alaa
$2,000 USD in 20 days
4.9 (19 reviews)
5.0
I have already worked on OCR with PDFs and extracted text from them, so I can do your job within the time limit and to your satisfaction.
$1,888 USD in 30 days
4.1 (2 reviews)
2.9
Hi, dear. I am very interested in your project, 'Scan pdf and read tables with OpenCV & Tesseract OCR'. I have done this kind of project before. I'm a professional programmer with 12 years of experience. If you award me the project, I'll implement all of your requirements in a short time. Skills: Java, Machine Learning, Python, Software Development
$555 USD in 3 days
0.0 (0 reviews)
0.0
I'm a senior software developer with a very high personal standard for code quality, and I pay attention to detail. I have been programming full-time for more than 10 years. Some of my experience is summarized below:
➢ Java 7 & 8 (6+ years experience)
  ▪ Android, Java EE (J2EE), J2ME, JSF, JSP, PhoneGap
  ▪ Gradle, Maven, Ant
  ▪ Spring, Hibernate, MyBatis, EJB
  ▪ JBoss/WildFly, Tomcat, WebLogic
  ▪ TestNG, JUnit, Mockito
  ▪ Swagger, Dropwizard, JAXB, Axis2
➢ C# (.NET Core + Standard + Framework)
  ▪ Dapper & Entity Framework
  ▪ NUnit
➢ SQL (10+ years experience)
  ▪ MySQL, MSSQL, Stored Procedures
➢ Oracle (+- 1 year experience)
  ▪ PL SQL, Stored Procedures
➢ HTML (+HTML 5, 10+ years experience)
  ▪ JSON, JavaScript, CSS, AJAX, XML, YAML
➢ PHP (10+ years experience)
➢ C++ (3+ years experience)
➢ Pure C (2+ years experience)
➢ Cisco IOS (2+ years experience)
➢ Perl (2+ years experience)
➢ SH (10+ years experience)
➢ BASH (10+ years experience)
➢ Clarion (version 8 & version 10)
➢ Python
➢ VB (.NET)
➢ Delphi
➢ Assembly
I am very proficient with Linux/Unix, which I have used for more than 10 years with KDE, Gnome, Fluxbox and pure terminal. Flavours I have used include:
➢ Gentoo
➢ CentOS
➢ Debian
➢ Mint
➢ Kali + Backtrack 2 & 3
➢ RedHat
➢ (K)Ubuntu
➢ FreeBSD (UNIX)
➢ Knoppix
➢ Arch
➢ PHLAK
➢ OpenSUSE
➢ Fedora
➢ PCLinuxOS
among many others.
$1,111 USD in 20 days
0.0 (0 reviews)
0.0

About the client

Sandton, South Africa
0.0
0
Payment method verified
Member since Jan 22, 2019

Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)