Need Info On How To Build A Macro For Websites
Member
Posts: 9,664
Joined: Dec 22 2007
Gold: 845.30
Dec 24 2015 10:30am
I'm looking to build a macro that searches for a product on different websites and, if the quantity is greater than 0 and the price is under a certain limit, buys it. What would be the best way of going about this? What language? Should I just use a text editor like Atom? Would like some help getting started. Thanks guys.
Member
Posts: 9,664
Joined: Dec 22 2007
Gold: 845.30
Dec 24 2015 11:16am
I'm looking into using Python and learning about web scraping, with PyCharm as the IDE.
Any other tips?
Member
Posts: 9,664
Joined: Dec 22 2007
Gold: 845.30
Dec 25 2015 07:34pm
Any ideas?
Member
Posts: 1,158
Joined: Oct 5 2010
Gold: 0.00
Mar 6 2016 05:17am
Any language that allows you to:
- parse HTML (no regex, really, don't)
- send/receive HTTP POST/GET

You could try:
- Delphi/Pascal (Some example code here: https://bitbucket.org/tstki/dragontavern-logger)
- .NET
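
A minimal sketch of the "parse HTML, no regex" point above, using only Python's standard-library html.parser. The class names `price` and `quantity`, the `ProductParser` helper, and the sample markup are all made up for illustration; a real site will have its own markup and needs its own lookups.

```python
from html.parser import HTMLParser


class ProductParser(HTMLParser):
    """Collect the text of elements whose class is 'price' or 'quantity'.

    The class names are hypothetical; a real site will use its own markup.
    """

    FIELDS = ("price", "quantity")

    def __init__(self):
        super().__init__()
        self.values = {}
        self._current = None  # field whose text we are currently inside

    def handle_starttag(self, tag, attrs):
        # An attribute without a value comes through as None, hence the "or".
        classes = (dict(attrs).get("class") or "").split()
        for field in self.FIELDS:
            if field in classes:
                self._current = field

    def handle_data(self, data):
        # Only keep non-blank text found inside a tracked element.
        if self._current and data.strip():
            self.values[self._current] = data.strip()

    def handle_endtag(self, tag):
        self._current = None


# Made-up sample page, standing in for a real HTTP response body.
SAMPLE_HTML = """
<div class="product">
  <span class="price">19.99</span>
  <span class="quantity">3</span>
</div>
"""

parser = ProductParser()
parser.feed(SAMPLE_HTML)
price = float(parser.values["price"])
quantity = int(parser.values["quantity"])
print(price, quantity)
```

The parser follows the tag structure instead of pattern-matching raw text, so it should keep working if whitespace or attribute order changes, which is where regex approaches tend to fall over.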
Member
Posts: 10,812
Joined: Oct 15 2009
Gold: Locked
Warn: 20%
Mar 6 2016 05:28am
Quote (vunel @ Dec 24 2015 10:16am)
I'm looking into using python and learning about web scraping. and using pycharm as the IDE.
Any other tips?


Check out the mechanize and BeautifulSoup4 packages for Python.
Member
Posts: 62,215
Joined: Jun 3 2007
Gold: 9,039.20
Mar 6 2016 04:03pm
Mechanize is kind of old and I don't think it's maintained anymore, but it's still good if you're on Python 2.

For most sites you can scrape with requests and bs4, as mentioned above. For something with heavy JavaScript you might have to go with Scrapy/ScrapyJS. Good luck.

Code
PS C:\WINDOWS\system32> python -m pip install BeautifulSoup4, requests, mechanize --upgrade --force
Collecting BeautifulSoup4
Using cached beautifulsoup4-4.4.1-py2-none-any.whl
Collecting requests
Using cached requests-2.9.1-py2.py3-none-any.whl
Collecting mechanize
Using cached mechanize-0.2.5.tar.gz
Installing collected packages: BeautifulSoup4, requests, mechanize
Found existing installation: beautifulsoup4 4.4.1
Uninstalling beautifulsoup4-4.4.1:
Successfully uninstalled beautifulsoup4-4.4.1
Found existing installation: requests 2.9.1
Uninstalling requests-2.9.1:
Successfully uninstalled requests-2.9.1
Found existing installation: mechanize 0.2.5
Uninstalling mechanize-0.2.5:
Successfully uninstalled mechanize-0.2.5
Running setup.py install for mechanize ... done
Successfully installed BeautifulSoup4-4.4.1 mechanize-0.2.5 requests-2.9.1
PS C:\WINDOWS\system32> python -c "import requests, mechanize, bs4 #testing their installation"


Good intro video into scraping with Python



Code
#!/usr/bin/env python

from bs4 import BeautifulSoup
import requests
import mechanize

url = "http://d2jsp.org"

# Fetch the page with requests and parse it with BeautifulSoup
page = requests.get(url)
soup = BeautifulSoup(page.content, 'html.parser')
print(soup.title.text)

# Same page title, fetched with mechanize instead
mech = mechanize.Browser()
mech.open(url)
print(mech.title())
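
Once the price and quantity have been scraped, the check described in the first post (in stock and under a certain price) could look roughly like this. It is only a sketch: `should_buy`, the site names, and the listing values are hypothetical, and the actual purchase step is left out because it depends entirely on the site.

```python
from decimal import Decimal


def should_buy(quantity, price, max_price):
    """Return True when the item is in stock and priced under the limit."""
    # Decimal avoids float rounding surprises when comparing money amounts.
    return quantity > 0 and Decimal(price) < Decimal(max_price)


# Hypothetical values, as if already scraped from product pages.
listings = [
    {"site": "shop-a.example", "quantity": 0, "price": "9.99"},
    {"site": "shop-b.example", "quantity": 4, "price": "24.50"},
    {"site": "shop-c.example", "quantity": 2, "price": "14.00"},
]

# Keep only the sites that pass both checks against a 20.00 limit.
to_buy = [item["site"] for item in listings
          if should_buy(item["quantity"], item["price"], "20.00")]
print(to_buy)
```

Keeping the rule in its own function makes it easy to test without hitting any website.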
