Make Me A Better Programmer - From Step 2
Member
Posts: 32,925
Joined: Jul 23 2006
Gold: 3,804.50
Aug 8 2015 06:23pm
i seem to have run into a snag. i'm using nodejs and the http module to scrape data from amazon's free kindle ebooks site.

Code
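var http = require('http');

// the '#2' at the end is a URL fragment that is supposed to select page 2
var path = '/Best-Sellers-Kindle-Store-Teen-Young-Adult-Horror-eBooks/zgbs/digital-text/6064559011?tf=1#2';
var options = {
    host: 'www.amazon.com',
    port: 80,
    path: path
};
// page_number and onHttpGetSearchPage are defined elsewhere in the script
http.get(options, onHttpGetSearchPage.bind(null, page_number, path));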


the list is supposed to have pages #1, #2, #3, #4, #5, but the # part seems to be ignored. i looked through the documentation and didn't see anything about it.

Quote
Options:

host: A domain name or IP address of the server to issue the request to. Defaults to 'localhost'.
hostname: To support url.parse() hostname is preferred over host
port: Port of remote server. Defaults to 80.
localAddress: Local interface to bind for network connections.
socketPath: Unix Domain Socket (use one of host:port or socketPath)
method: A string specifying the HTTP request method. Defaults to 'GET'.
path: Request path. Defaults to '/'. Should include query string if any. E.G. '/index.html?page=12'. An exception is thrown when the request path contains illegal characters. Currently, only spaces are rejected but that may change in the future.
headers: An object containing request headers.
auth: Basic authentication i.e. 'user:password' to compute an Authorization header.
agent: Controls Agent behavior. When an Agent is used request will default to Connection: keep-alive. Possible values:
    undefined (default): use global Agent for this host and port.
    Agent object: explicitly use the passed in Agent.
    false: opts out of connection pooling with an Agent, defaults request to Connection: close.
keepAlive: {Boolean} Keep sockets around in a pool to be used by other requests in the future. Default = false
keepAliveMsecs: {Integer} When using HTTP KeepAlive, how often to send TCP KeepAlive packets over sockets being kept alive. Default = 1000. Only relevant if keepAlive is set to true.
The optional callback parameter will be added as a one time listener for the 'response' event.
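
One likely reason the docs say nothing about it: under the URI spec, the fragment (everything from the # on) is split off before a request is made and is handled on the client side, so a server would not normally act on it. On the real page the #2 is presumably picked up by client-side script, though that is a guess. A minimal sketch using Node's built-in URL class shows which part of the address belongs in options.path and which part is the fragment:

```javascript
// Sketch: separate the request path from the fragment of the URL in the post.
const pageUrl = new URL(
  'http://www.amazon.com/Best-Sellers-Kindle-Store-Teen-Young-Adult-Horror-eBooks/zgbs/digital-text/6064559011?tf=1#2'
);

// pathname + search (the query string) is what belongs in options.path.
const requestPath = pageUrl.pathname + pageUrl.search;

// hash holds the fragment, including the leading '#'.
const fragment = pageUrl.hash;

console.log(requestPath);
console.log(fragment);
```

So any paging that depends on the fragment has to be expressed some other way, for example as a query parameter, before the request goes out.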


This post was edited by carteblanche on Aug 8 2015 06:27pm
Member
Posts: 32,925
Joined: Jul 23 2006
Gold: 3,804.50
Aug 8 2015 06:30pm
well i've been guessing at query params. tried adding &p=2, &page=2, and finally struck gold with &pg=2.

i feel like a loser for abandoning the # and finding a workaround :(
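
For anyone following along, the workaround could be sketched roughly like this. buildPagePaths is a made-up helper name, and the only thing taken from the post is that &pg=N worked at the time; whether the site still honors it is not something this sketch can promise.

```javascript
// Hypothetical helper: build one request path per page by appending the
// pg query parameter (the parameter name comes from the post above).
function buildPagePaths(basePath, pageCount) {
  // Use '&' if the path already has a query string, otherwise start one.
  const separator = basePath.includes('?') ? '&' : '?';
  const paths = [];
  for (let page = 1; page <= pageCount; page++) {
    paths.push(basePath + separator + 'pg=' + page);
  }
  return paths;
}

// Same list path as in the first post, minus the '#2' fragment.
const listPath = '/Best-Sellers-Kindle-Store-Teen-Young-Adult-Horror-eBooks/zgbs/digital-text/6064559011?tf=1';
const pagePaths = buildPagePaths(listPath, 5);

pagePaths.forEach(function (p) {
  console.log(p);
});
```

Each entry can then be dropped into options.path for its own http.get call.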
Member
Posts: 23,862
Joined: Aug 16 2006
Gold: 20.00
Aug 8 2015 10:28pm
Quote (carteblanche @ Aug 8 2015 07:30pm)
well i've been guessing at query params. tried adding &p=2, &page=2, and finally struck gold with &pg=2.

i feel like a loser for abandoning the # and finding a workaround :(


shitty documentation?!? never seen that before!!! /s
Member
Posts: 32,925
Joined: Jul 23 2006
Gold: 3,804.50
Aug 8 2015 10:33pm
i pm'd the duck already, but if anyone else is interested in free cyberpunk novels:

http://www.amazon.com/Best-Sellers-Kindle-Store-Cyberpunk-Science-Fiction/zgbs/digital-text/6401749011?tf=1
Member
Posts: 32,925
Joined: Jul 23 2006
Gold: 3,804.50
Aug 13 2015 05:16pm
so i've got my javascript scraper that searches 170+ genres and saves the ids / urls / etc of the free ebooks that i didn't already grab to sqlite.

i started my python script. it can login to amazon, detect that i didn't already buy it, detect the price is 0, detect it's not a kindle first book, then buy it.

just need to integrate it with sqlite and it should automatically buy a few hundred books a day for me B)
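
The checks described above (not already owned, price is 0, not a kindle first book) boil down to a filter. Here is a rough sketch of that logic in javascript, kept in one language with the scraper. The record shape { id, price, kindleFirst }, the function name, and the sample ids are all invented for illustration, and skipping duplicates across genres is an added assumption, not something the post states.

```javascript
// Hypothetical filter over scraped records shaped like
// { id: string, price: number, kindleFirst: boolean }.
function selectBooksToGrab(candidates, ownedIds) {
  const owned = new Set(ownedIds);
  const seen = new Set(); // assumption: the same book can show up in several genres
  return candidates.filter(function (book) {
    if (owned.has(book.id) || seen.has(book.id)) return false; // already have it
    if (book.price !== 0) return false;                        // not free
    if (book.kindleFirst) return false;                        // kindle first title
    seen.add(book.id);
    return true;
  });
}

// Made-up sample data, only to exercise the filter.
const sampleCandidates = [
  { id: 'book-a', price: 0, kindleFirst: false },
  { id: 'book-b', price: 2.99, kindleFirst: false },
  { id: 'book-c', price: 0, kindleFirst: true },
  { id: 'book-d', price: 0, kindleFirst: false },
  { id: 'book-a', price: 0, kindleFirst: false }
];
const toGrab = selectBooksToGrab(sampleCandidates, ['book-d']);

console.log(toGrab.map(function (book) { return book.id; }));
```

In the real setup the owned ids would come out of the sqlite table rather than a hard-coded array.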

This post was edited by carteblanche on Aug 13 2015 05:17pm
Member
Posts: 23,862
Joined: Aug 16 2006
Gold: 20.00
Aug 13 2015 06:36pm
Quote (carteblanche @ Aug 13 2015 06:16pm)
so i've got my javascript scraper that searches 170+ genres and saves the ids / urls / etc of the free ebooks that i didn't already grab to sqlite.

i started my python script. it can login to amazon, detect that i didn't already buy it, detect the price is 0, detect it's not a kindle first book, then buy it.

just need to integrate it with sqlite and it should automatically buy a few hundred books a day for me B)


starting your own E-library? lol
Member
Posts: 62,204
Joined: Jun 3 2007
Gold: 9,039.20
Aug 13 2015 10:37pm
Quote (carteblanche @ Aug 8 2015 10:33pm)
i pm'd the duck already, but if anyone else is interested in free cyberpunk novels:

http://www.amazon.com/Best-Sellers-Kindle-Store-Cyberpunk-Science-Fiction/zgbs/digital-text/6401749011?tf=1


ty
Member
Posts: 32,925
Joined: Jul 23 2006
Gold: 3,804.50
Aug 15 2015 07:45pm
all kindle apps auto-download everything you own when you connect via wifi, and there's no way to disable it. the highest-end model of the kindle paperwhite (which i'm considering buying) can hold 2,000 books. unfortunately, i own more than that. i contacted support and they suggested it might be a problem. #firstworldproblems

This post was edited by carteblanche on Aug 15 2015 08:05pm