Parsing HTML
Today at work I wrote a bit of code to parse the ASX website for our latest stock price information. It’s really easy, and the ASX is fine with it provided you credit them as the source of the data and tell readers that it is delayed by 20 minutes.
Here’s how I did it:
This is the page we have to parse: http://www.asx.com.au/asx/markets/equityPrices.do?by=asxCodes&asxCodes=nvt
I used this simple HTML DOM Parser to make life easy: http://simplehtmldom.sourceforge.net/
This is the code:
<?php
include_once('simple_html_dom.php');
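// Fetch the ASX price page and return the last price, change and volume.
function scraping_asx() {
    // Defaults, so the caller always gets these keys even if parsing fails.
    $item = array('last' => '', 'change' => '', 'volume' => '');
    $html = file_get_html('http://www.asx.com.au/asx/markets/equityPrices.do?by=asxCodes&asxCodes=nvt');
    if (!$html) {
        // The page could not be fetched or parsed.
        return $item;
    }
    foreach ($html->find('table[class=datatable]') as $data) {
        // Cell positions reflect the page layout when this was written.
        $item['last'] = trim($data->find('td', 0)->plaintext);
        $item['change'] = trim($data->find('td', 1)->plaintext);
        $item['volume'] = trim($data->find('td', 7)->plaintext);
    }
    // Free the parser's memory before returning.
    $html->clear();
    unset($html);
    return $item;
}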
$scrape = scraping_asx();
?>
<html>
<head>
<title>Stock Price</title>
<style type="text/css">
div#stock{
border:1px solid;
width:130px;
padding-left: 10px;
}
</style>
</head>
<body>
<div id="stock">
<p id="stock_price"> Price: <?php echo htmlspecialchars($scrape['last']); ?></p>
<p id="stock_change"> Change: <?php echo htmlspecialchars($scrape['change']); ?></p>
<p id="stock_volume"> Volume: <?php echo htmlspecialchars($scrape['volume']); ?></p>
</div>
</body>
</html>
I’m aware that the foreach loop probably isn’t necessary, but I’m ok with it.
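For the curious, here is a rough sketch of what a loop-free version could look like. It is not the code I run: it swaps Simple HTML DOM for PHP's built-in DOMDocument and DOMXPath so that it works without any extra files, and it parses a made-up sample string rather than the live ASX page. The function name first_table_cells and the sample markup are my own inventions for illustration; the cell positions (0, 1 and 7) are the same ones used above.

```php
<?php
// Hypothetical sample markup standing in for the ASX page.
// The real page's structure may differ.
$sample = '<table class="datatable"><tr><th>NVT</th>'
    . '<td>1.230</td><td>+0.010</td><td>1.220</td><td>1.240</td>'
    . '<td>1.200</td><td>1.250</td><td>1.190</td><td>45,678</td></tr></table>';

// Return last price, change and volume from the first "datatable" table,
// or null if the table or the expected cells are missing.
function first_table_cells(string $markup): ?array
{
    if (trim($markup) === '') {
        return null;
    }

    $doc = new DOMDocument();
    // Collect libxml warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($markup);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Select every cell of the first matching table in one query,
    // so no foreach over tables is needed.
    $xpath = new DOMXPath($doc);
    $cells = $xpath->query('(//table[@class="datatable"])[1]//td');
    if ($cells === false || $cells->length < 8) {
        return null;
    }

    return [
        'last'   => trim($cells->item(0)->textContent),
        'change' => trim($cells->item(1)->textContent),
        'volume' => trim($cells->item(7)->textContent),
    ];
}

$result = first_table_cells($sample);
```

Returning null when the table or cells are missing means the calling page can decide what to show instead of printing empty values.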
HTML parsing is neat. I’ll probably toy with it a lot more doing some iPhone or simple web app dev in future.