Introduction to writing spiders and agents
Identifying your agent (or not)
       
    Remember: your actions will appear in server log files!
       
    $url = "http://www.someURL.com";
    $agent = "My special spider v1.0";
       
    // Quote both arguments: the agent string contains spaces
    $pageData = `\curl\curl -A "$agent" "$url"`;
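
    The command above can also be assembled with escaped arguments, so that
    spaces or quotes in the agent string cannot break the shell command. This
    is only a sketch: buildCurlCommand() is a hypothetical helper, not part of
    the original, and it assumes a curl binary named "curl" is on the path.

```php
<?php
// Hypothetical helper (an assumption, not from the original): builds the
// curl command line with every argument shell-escaped.
function buildCurlCommand(string $url, string $agent, string $curlPath = 'curl'): string
{
    return $curlPath
        . ' -s'                            // -s: silent, no progress meter
        . ' -A ' . escapeshellarg($agent)  // -A: User-Agent header to send
        . ' ' . escapeshellarg($url);
}

$command = buildCurlCommand("http://www.someURL.com", "My special spider v1.0");

// Uncomment to actually fetch the page (requires network access and curl):
// $pageData = shell_exec($command);
```

    Whatever string is passed as $agent is what the remote server will record,
    so this is the place to decide whether to identify your spider or not.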
       
    A typical HTTP_USER_AGENT looks like:
    Mozilla/5.0 (Windows; U; Win 9x 4.90; en-US; rv:1.0.0) Gecko/20020530
    What about page referers?
    Match the POST or GET method that the target page expects
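
    Both points can be sketched with the same command-line approach: curl's -e
    option sets the referer, and -d sends form data as a POST (without -d, curl
    issues a GET). buildRequestCommand() and the /search and /login paths are
    hypothetical, used here only for illustration.

```php
<?php
// Hypothetical helper (an assumption, not from the original): adds an
// optional referer and optional POST fields to the curl command.
function buildRequestCommand(
    string $url,
    string $agent,
    string $referer = '',
    array $postFields = []
): string {
    $parts = ['curl', '-s', '-A', escapeshellarg($agent)];

    if ($referer !== '') {
        // -e: page the request claims to come from (Referer header)
        $parts[] = '-e';
        $parts[] = escapeshellarg($referer);
    }

    if ($postFields !== []) {
        // -d: send URL-encoded form data, which makes curl use POST
        $parts[] = '-d';
        $parts[] = escapeshellarg(http_build_query($postFields));
    }
    // With no -d option, curl sends a plain GET request.

    $parts[] = escapeshellarg($url);
    return implode(' ', $parts);
}

$agent = "My special spider v1.0";

// GET request that appears to follow a link from the site's front page
$get = buildRequestCommand(
    "http://www.someURL.com/search?q=spiders",
    $agent,
    "http://www.someURL.com/"
);

// POST request that mimics submitting a form on the (hypothetical) login page
$post = buildRequestCommand(
    "http://www.someURL.com/login",
    $agent,
    "http://www.someURL.com/login",
    ['user' => 'demo', 'pass' => 'secret']
);
```

    If the real form submits with POST, a spider that sends the same fields
    with GET stands out in the logs and may simply be rejected.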