Pdf Extractor For Python > Paid Work - 3k
Receptor
#1 May 10 2019 07:43am
Group: Member
Posts: 11,613
Joined: May 27 2013
Gold: Locked
i have a pdf that i'd like to extract data from

you can find an example here: https://www.spglobal.com/platts/plattscontent/_assets/_files/en/productsservices/market-reports/sbb-steel-markets-daily030818.pdf

i want to extract data from these two parts:

part 1 [screenshot]

part 2 [screenshot]

to be consolidated into a csv file
don't mind the wrong numbers, i'm using an old screenshot



ideally, the data in the csv or xlsx (if xlsx can be done, great!) would look like this, with one new row per report date:
so 4th may data would be at row 4,
5th may data would be at row 5, etc.

i'm thinking you need pandas or numpy (whichever package it is) to put the data in a table of sorts and then write it to csv?
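
The "one row per report date" layout can be sketched without pandas, using only the standard-library csv module. This is a minimal sketch, not the code that was delivered for this thread: `append_daily_row` and the column names in `FIELDS` are made up for illustration, and the real columns depend on the tables in the report.

```python
import csv
from pathlib import Path

# Hypothetical column names; replace with the fields actually taken from the PDF.
FIELDS = ["date", "series_a", "series_b"]


def append_daily_row(csv_path, row):
    """Append one report's values as a new row, writing the header only once."""
    path = Path(csv_path)
    # A missing or empty file still needs its header line.
    is_new = not path.exists() or path.stat().st_size == 0
    with path.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow(row)
```

Calling it once per report, for example `append_daily_row("consolidated.csv", {"date": "2019-05-04", "series_a": "...", "series_b": "..."})`, grows the file by one row per day. pandas would only be needed if the data has to be reshaped or written to xlsx.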

obv i don't want a regex, i only need parts of the pdf to be extracted
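
Pulling out only part of the document can be done with plain string comparison rather than a regex. A minimal sketch, assuming the page text has already been extracted from the PDF by some other tool (not shown here) and that each wanted section sits between two known heading lines. `lines_between` and both heading arguments are hypothetical; the real headings come from the report itself.

```python
def lines_between(text, start_heading, end_heading):
    """Return the non-empty lines after start_heading and before end_heading."""
    collected = []
    inside = False
    for line in text.splitlines():
        stripped = line.strip()
        if not inside:
            # Skip everything up to and including the start heading.
            inside = stripped == start_heading
            continue
        if stripped == end_heading:
            break
        if stripped:
            collected.append(stripped)
    return collected
```

If the start heading never appears, the function returns an empty list, which makes a layout change in the report easy to notice.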

bottom line:
i need fully functioning code before i pay you, which is the same requirement as in my past threads:

https://forums.d2jsp.org/topic.php?t=81032020&f=120
https://forums.d2jsp.org/topic.php?t=81036357&f=120

Receptor
#2 May 11 2019 05:20am
Group: Member
Posts: 11,613
Joined: May 27 2013
Gold: Locked
fixed by Klexmoo <3
VodkaLover
#3 May 15 2019 11:15am
Group: Member
Posts: 6,732
Joined: Dec 20 2006
Gold: 15,105.50
Are those the only two headings you want extracted? Does this PDF arrive on a recurring basis, or do you want to add the web link manually each time?


Here, this will do it for free. You can automate it on your system with Automator, or use the open-source SDK to change how the output is delivered.

Cheers.

https://tabula.technology
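
For anyone who would rather script this than click through the app: a minimal sketch, assuming the third-party tabula-py package (a separate Python wrapper around Tabula's Java engine, not part of the site linked above) and a Java runtime are installed. `csv_path_for` and `pdf_tables_to_csv` are names made up for this example.

```python
from pathlib import Path


def csv_path_for(pdf_path):
    """Derive the output CSV path by swapping the .pdf suffix for .csv."""
    return str(Path(pdf_path).with_suffix(".csv"))


def pdf_tables_to_csv(pdf_path, pages="all"):
    """Write every table Tabula detects on the given pages to one CSV file."""
    # Imported here so the helper above works without tabula-py installed.
    import tabula  # third-party: pip install tabula-py (needs Java)

    out_path = csv_path_for(pdf_path)
    tabula.convert_into(pdf_path, out_path, output_format="csv", pages=pages)
    return out_path
```

Automatic table detection is not guaranteed to match the two sections wanted here, so the `pages` argument will probably need narrowing and the output should be checked by eye.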

This post was edited by VodkaLover on May 15 2019 11:20am