Need Info On How To Build A Macro For Websites - Topic

Member

Posts: 9,664

Joined: Dec 22 2007

Gold: 845.30

Dec 24 2015 10:30am

I'm looking to build a macro that searches for a product on different websites and, if the quantity is greater than 0 and the price is under a set limit, buys it. What would be the best way to go about this? What language should I use? Should I just use a text editor like Atom? I'd like some help getting started. Thanks, guys.

vunel

Member

Posts: 9,664

Joined: Dec 22 2007

Gold: 845.30

Dec 24 2015 11:16am

I'm looking into using Python and learning about web scraping, with PyCharm as the IDE.
Any other tips?

vunel

Member

Posts: 9,664

Joined: Dec 22 2007

Gold: 845.30

Dec 25 2015 07:34pm

Any ideas?

bbrtki

Member

Posts: 1,158

Joined: Oct 5 2010

Gold: 0.00

Mar 6 2016 05:17am

Any language that allows you to:
- parse HTML (don't use regex for this, really)
- send HTTP GET/POST requests and read the responses

You could try:
- Delphi/Pascal (some example code here: https://bitbucket.org/tstki/dragontavern-logger)
- .NET
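
As a rough illustration of the "parse HTML without regex" point, here is a minimal Python sketch that uses only the standard library's html.parser. The class names "price" and "qty", the SAMPLE fragment, and the helper names ListingParser and parse_listing are all made up for the example; a real shop will use its own markup, so treat this as a starting shape rather than working code for any particular site.

```python
from html.parser import HTMLParser


class ListingParser(HTMLParser):
    """Collect the text of elements whose class is 'price' or 'qty'.

    Both class names are hypothetical; a real site will use its own markup.
    """

    FIELDS = ("price", "qty")

    def __init__(self):
        super().__init__()
        self._current = None  # field whose text we are currently inside, if any
        self.values = {}      # field name -> stripped text

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        classes = (dict(attrs).get("class") or "").split()
        for field in self.FIELDS:
            if field in classes:
                self._current = field

    def handle_data(self, data):
        if self._current and data.strip():
            self.values[self._current] = data.strip()

    def handle_endtag(self, tag):
        self._current = None


def parse_listing(html):
    """Return (price, quantity) parsed from one product page."""
    parser = ListingParser()
    parser.feed(html)
    price = float(parser.values["price"].lstrip("$"))
    quantity = int(parser.values["qty"])
    return price, quantity


# Made-up page fragment standing in for the body of an HTTP GET response.
SAMPLE = '<div><span class="price">$19.99</span><span class="qty">3</span></div>'
print(parse_listing(SAMPLE))
```

The parser walks the tag structure instead of pattern-matching on raw text, so it keeps working when attribute order or whitespace changes, which is where regex-based scraping tends to break.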

Azrad

Member

Posts: 10,812

Joined: Oct 15 2009

Gold: Locked

Warn: 20%

Mar 6 2016 05:28am

Quote (vunel @ Dec 24 2015 10:16am)

I'm looking into using Python and learning about web scraping, with PyCharm as the IDE.
Any other tips?

Check out the mechanize and BeautifulSoup4 packages for Python.

j0ltk0la

Member

Posts: 62,215

Joined: Jun 3 2007

Gold: 9,039.20

Mar 6 2016 04:03pm

Mechanize is kind of old and I don't think it's maintained anymore, but it's still good if you're on Python 2.

For scraping you can use requests and bs4, as mentioned above, for most sites. For something with heavy JavaScript you might have to go with Scrapy/ScrapyJS. Good luck.

Code

PS C:\WINDOWS\system32> python -m pip install BeautifulSoup4, requests, mechanize --upgrade --force
Collecting BeautifulSoup4
Using cached beautifulsoup4-4.4.1-py2-none-any.whl
Collecting requests
Using cached requests-2.9.1-py2.py3-none-any.whl
Collecting mechanize
Using cached mechanize-0.2.5.tar.gz
Installing collected packages: BeautifulSoup4, requests, mechanize
Found existing installation: beautifulsoup4 4.4.1
Uninstalling beautifulsoup4-4.4.1:
Successfully uninstalled beautifulsoup4-4.4.1
Found existing installation: requests 2.9.1
Uninstalling requests-2.9.1:
Successfully uninstalled requests-2.9.1
Found existing installation: mechanize 0.2.5
Uninstalling mechanize-0.2.5:
Successfully uninstalled mechanize-0.2.5
Running setup.py install for mechanize ... done
Successfully installed BeautifulSoup4-4.4.1 mechanize-0.2.5 requests-2.9.1
PS C:\WINDOWS\system32> python -c "import requests, mechanize, bs4 #testing their installation"

Good intro video on scraping with Python

Code

#!/usr/bin/env python

from bs4 import BeautifulSoup
import requests
import mechanize

url = "http://d2jsp.org"

# Fetch the page with requests and parse it with BeautifulSoup.
page = requests.get(url)
soup = BeautifulSoup(page.content, 'html.parser')
print(soup.title.text)

# The same title lookup using mechanize's stateful browser.
mech = mechanize.Browser()
mech.open(url)
print(mech.title())
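
Once each site has been scraped, the rule from the original question (quantity > 0 and price under a limit) comes down to a small filter. The sketch below assumes the scraping step has already produced a list of dicts; the site names, prices, the dict keys, and the function name pick_offer are all invented for illustration, and the actual "buy" step is left out because it depends entirely on the site.

```python
def pick_offer(offers, max_price):
    """Return the cheapest in-stock offer priced under max_price, or None."""
    eligible = [o for o in offers if o["quantity"] > 0 and o["price"] < max_price]
    return min(eligible, key=lambda o: o["price"]) if eligible else None


# Made-up results, as if one scraper per site had already run.
offers = [
    {"site": "shop-a.example", "price": 24.50, "quantity": 0},
    {"site": "shop-b.example", "price": 19.99, "quantity": 3},
    {"site": "shop-c.example", "price": 31.00, "quantity": 7},
]

best = pick_offer(offers, max_price=25.00)
print(best["site"] if best else "nothing to buy")
```

Keeping the decision separate from the fetching and parsing makes it easy to test the rule on fake data before pointing the script at a real site.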
