How Hard Is It To Extract Data From A Program?
Member
Posts: 35,075
Joined: Jul 26 2006
Gold: 125.00
Jul 28 2017 07:52pm
I realize this question is vague, but perhaps you can inform me about obstacles and other questions I might not have thought of.

Objective: To extract data (in an export fashion, not real-time) from a variety of programs, using Python as the programming language to create the extraction software. Programs that I want to read/pull data from would include QuickBooks and other accounting software. The idea is to be able to pull the data so that I can run my own program through the data. To be clear, this isn't anything nefarious; this is so that I can automate certain accounting tasks. Obviously I would have full access to the files (passwords/etc), so I'd be able to open the software and disable any protection / permit access. I am wondering how big a job this could be on a per-program basis (as I imagine each program would require individual coding to be done), and whether it could be downright infeasible in some cases?

I understand this could be very complex if the software isn't built with the intention of being friendly to something like this. Certain extraction software already exists, and there are some limited export features that provide crude data, but nothing out there is ideal or built for the purpose of gathering exactly the data I want. I'd want to be able to pull the core information necessary, but also certain links & relationships that are stored that would provide information about the logic of past activities (not just the end numbers of past activities). For example, someone is receiving payments from a company, and within the software those payments are linked to specific debts; on extraction, the specific debt isn't preserved, and everything lands in one bunched accounts receivable account. If you look further in some extraction software, you might see "Company X: debt", but that will be a bunched total, and not include each specific debt (ex. $100 on job A, $200 on job B, $500 on job C). Perhaps the company owes multiple debts, and payments on those debts are being attributed specifically within the program. I'm just not sure how difficult this could be, and what roadblocks I'll likely hit that might make me have to give up.
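
To make the debt example concrete, here is a minimal Python sketch of what the extracted data would need to look like for the payment-to-debt links to survive. Everything in it is an assumption for illustration: the class names `Debt` and `PaymentApplication`, the field names, the identifiers, and the two sample payments are all hypothetical and do not reflect how QuickBooks or any other package actually stores its data. Only the three debt amounts come from the example above.

```python
from collections import defaultdict
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Debt:
    debt_id: str      # hypothetical identifier, e.g. "job-A"
    customer: str
    amount: Decimal


@dataclass(frozen=True)
class PaymentApplication:
    payment_id: str   # hypothetical identifier
    debt_id: str      # the link that a bunched export loses
    amount: Decimal


def open_balance_per_debt(debts, applications):
    """Return {debt_id: remaining amount} using the payment-to-debt links."""
    balances = {d.debt_id: d.amount for d in debts}
    for app in applications:
        balances[app.debt_id] -= app.amount
    return balances


def bunched_balance_per_customer(debts, applications):
    """Return {customer: total owed}, which is all a bunched export shows."""
    customer_of = {d.debt_id: d.customer for d in debts}
    totals = defaultdict(Decimal)  # Decimal() starts each total at 0
    for d in debts:
        totals[d.customer] += d.amount
    for app in applications:
        totals[customer_of[app.debt_id]] -= app.amount
    return dict(totals)


# Debt amounts from the example in the post; the payments are made up.
debts = [
    Debt("job-A", "Company X", Decimal("100")),
    Debt("job-B", "Company X", Decimal("200")),
    Debt("job-C", "Company X", Decimal("500")),
]
applications = [
    PaymentApplication("pmt-1", "job-A", Decimal("100")),
    PaymentApplication("pmt-2", "job-C", Decimal("150")),
]

per_debt = open_balance_per_debt(debts, applications)
per_customer = bunched_balance_per_customer(debts, applications)
```

The per-debt view can always be collapsed into the bunched view, but not the other way around, which is why the extraction would have to capture the link itself rather than just the account totals.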

This post was edited by Canadian_Man on Jul 28 2017 07:58pm
Member
Posts: 32,925
Joined: Jul 23 2006
Gold: 3,804.50
Jul 28 2017 08:20pm
quickbooks doesn't have an export feature?
Member
Posts: 35,075
Joined: Jul 26 2006
Gold: 125.00
Jul 28 2017 09:22pm
Quote (carteblanche @ Jul 28 2017 07:20pm)
quickbooks doesn't have an export feature?


TLDR: It has some export features, but what can be exported with in-built features isn't enough.

Certain data can be exported to a text file in a certain format.
Certain reports/data can be exported to an Excel file (end-user friendly format, meant for the non-accountant individual).
I am unsure exactly what is exported by existing software, but a decent amount is. The existing export software is 3rd-party, and I don't know how much or how little it gathers (as far as I know, 3rd-party extraction software is designed to pull only the necessary data, so that another specific program can read it, not so that you can have the data accessible for any general program).

Any of the above exports end up producing a crude rendition of the information. For example, I wouldn't know that one transaction was posted in US dollars, and automatically converted to CAD at a specific conversion rate at that date. I might not know that an entry for January 3rd was made on October 5th. Perhaps I want to know about all of the account relationships, even the hidden accounts (ex. there might be 6 different property accounts, but only 2 show on a regular export because accounts that have no activity for the entire fiscal year are not presented on a regular export). There's tons of underlying data that just wouldn't be made available to me. If I could export all the data, I could have a data set of sequential entries, and some of the values for each thing would be "Date last modified", "Date first created", and "General journal date".

In order to do what I want to do, I need to gather as much data as I can, so I'm hoping to be able to extract as much as I can.
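
As a rough illustration of the "data set of sequential entries" described above, here is a small Python sketch of one possible record layout and two checks it would enable. The `JournalEntry` class, its field names, the 30-day threshold, the year, and the sample rows are all assumptions made up for this sketch; no accounting package is known to expose its data in this shape.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class JournalEntry:
    entry_id: int
    journal_date: date   # "General journal date"
    created_on: date     # "Date first created"
    modified_on: date    # "Date last modified"
    description: str


def backdated(entries, min_days=30):
    """Entries created at least min_days after the date they are posted to."""
    return [e for e in entries
            if (e.created_on - e.journal_date).days >= min_days]


def modified_after_creation(entries):
    """Entries that were changed on a later day than they were created."""
    return [e for e in entries if e.modified_on > e.created_on]


# Sample rows are invented; the first mirrors the "January 3rd entry
# made on October 5th" case, with the year assumed.
entries = [
    JournalEntry(1, date(2017, 1, 3), date(2017, 10, 5), date(2017, 10, 5), "late entry"),
    JournalEntry(2, date(2017, 3, 1), date(2017, 3, 1), date(2017, 3, 15), "edited entry"),
]

late_ids = [e.entry_id for e in backdated(entries)]
edited_ids = [e.entry_id for e in modified_after_creation(entries)]
```

Neither check is possible from an export that only carries the journal date and the amount, which is the gap I'm describing.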

This post was edited by Canadian_Man on Jul 28 2017 09:33pm
Member
Posts: 18,087
Joined: Dec 10 2007
Gold: 5,639.46
Jul 29 2017 08:57am
This is why I use Stata; it auto-compiles your data and stores it in case you want to re-arrange from wide to tall.
The issue is that it extracts too much; for instance, if you were to import PDF files, it may detect some paragraphs as a table.

e: do you have a specific assignment you are trying to accomplish?
E.g. pull specific data from the World Bank or such?


This post was edited by Arcolithe on Jul 29 2017 08:59am