Out of curiosity, which pages do you request and what information do you need?
Well, the first request depends on which server you want.
1) It can be any one of the following:
"http://ragial.com/search/iRO-Renewal/" + item_name -> Used for Renewal
"http://ragial.com/search/iRO-Classic/" + item_name -> Used for Classic
"http://ragial.com/search/iRO-Thor/" + item_name -> Used for Thor
2) The parser then downloads that entire page and selects the item's link, which is one of the anchor tags on the page returned by the search URL above.
E.g. for Renewal: "a[href*=ragial.com/item/iRO-Renewal]"
It extracts this link from the page it has already downloaded, so this step doesn't need another network connection to ragial.com.
3) It makes the second call to ragial.com to obtain all the vender information. Since the item name in the item URL is encoded, I have no choice but to make another ragial.com request.
E.g. for the Renewal Strawberry item: http://ragial.com/item/iRO-Renewal/HbXm5a42iFgr
This is the second call to ragial.com that is needed.
4) It then parses that page to obtain all the vender information and calculates whether the item is on sale.
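To make steps 1 and 2 a bit more concrete, here is a rough Python sketch of building the search URL and pulling the item link out of the page that was already downloaded. This is not the actual parser code: the names search_url, ItemLinkFinder and find_item_link are made up for illustration, URL-encoding the item name with quote() is my assumption (the steps above just append the name), and the substring check only approximates the "a[href*=...]" selector.

```python
from html.parser import HTMLParser
from urllib.parse import quote

SEARCH_BASE = "http://ragial.com/search/"
SERVERS = ("iRO-Renewal", "iRO-Classic", "iRO-Thor")


def search_url(server, item_name):
    """Step 1: build the search URL for the chosen server."""
    if server not in SERVERS:
        raise ValueError("unknown server: " + server)
    # Assumption: the item name is URL-encoded before being appended.
    return SEARCH_BASE + server + "/" + quote(item_name)


class ItemLinkFinder(HTMLParser):
    """Step 2: remember the first <a> whose href contains the item path."""

    def __init__(self, server):
        super().__init__()
        self.needle = "ragial.com/item/" + server
        self.link = None

    def handle_starttag(self, tag, attrs):
        if tag == "a" and self.link is None:
            href = dict(attrs).get("href") or ""
            if self.needle in href:
                self.link = href


def find_item_link(page_html, server):
    """Return the item link found in already-downloaded HTML, or None."""
    finder = ItemLinkFinder(server)
    finder.feed(page_html)
    return finder.link
```

find_item_link only works on the HTML string it is given, so no extra request is made here; the link it returns is what the second request in step 3 would fetch.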
The data model is quite simple:
Name of the item and, for both the short and the long range: number of items, min price, max price, avg price, standard deviation and the confidence value (unused for now) [ My algorithm only takes the vender price, min price, avg price and standard deviation into account, I have no idea what confidence is even doing there in ragial.com items :P ]
Each item has the above data stored in a temp database for quick access, along with the following vender info:
Vender Name, store name, item count (eg, how many strawberries he has), price for each item and its standard deviation.
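As a sketch only, the data model described above could look something like this in Python. The class and field names (RangeStats, VendEntry, Item and so on) are invented for illustration, and I am assuming prices are whole zeny amounts and that the short and long range each carry their own set of statistics.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RangeStats:
    """Price statistics for one range (short or long)."""
    item_count: int
    min_price: int
    max_price: int
    avg_price: float
    std_dev: float
    confidence: float = 0.0  # stored but unused for now


@dataclass
class VendEntry:
    """One vender currently selling the item."""
    vender_name: str
    store_name: str
    item_count: int   # e.g. how many strawberries he has
    price: int        # price for each item
    std_dev: float    # standard deviation for this vend price


@dataclass
class Item:
    """An item with its range statistics and current vends."""
    name: str
    short_range: RangeStats
    long_range: RangeStats
    vends: List[VendEntry] = field(default_factory=list)
```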
Then my algorithm runs over these vend prices to see if any item is on sale.
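Just to illustrate the idea, here is one possible sale check in Python. This is not my actual algorithm, only a guess at a simple rule that uses the same four inputs (vender price, min price, avg price, standard deviation): a price counts as on sale if it is at or below the recorded minimum, or more than k standard deviations below the average. The function name is_on_sale and the threshold k are assumptions.

```python
def is_on_sale(price, min_price, avg_price, std_dev, k=1.0):
    """Hypothetical rule: flag a vend price that is unusually low."""
    # At or below the lowest price seen in the range.
    if price <= min_price:
        return True
    # More than k standard deviations below the average price.
    return price < avg_price - k * std_dev


def on_sale_prices(vend_prices, min_price, avg_price, std_dev, k=1.0):
    """Run the check over all current vend prices and keep the hits."""
    return [p for p in vend_prices
            if is_on_sale(p, min_price, avg_price, std_dev, k)]
```

A larger k makes the check stricter, so fewer vends get flagged.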
Edited by Haseox1, 01 June 2015 - 02:55 AM.