WEB php scraping

Discussion in 'OT Technology' started by burn__, Oct 12, 2008.

  1. burn__

    burn__ New Member

    Joined:
    Mar 21, 2006
    Messages:
    10,673
    Likes Received:
    0
    I'm trying to make a guild webpage for Warhammer Online for some friends and need to scrape the page to pull out everyone's name.

    http://realmwar.warhammeronline.com/realmwar/GuildInfo.war?id=99&server=169

    The code I have so far only pulls and displays the first name, "Gorgoroth", and then the script just stops. How can I get it to pull every $regex match until nothing else on the page satisfies the pattern?

    Code:
    <?php
        // Send a browser-like user agent with the request.
        ini_set('user_agent', 'Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.8.1.14) Gecko/20080404 Firefox/2.0.0.14');
        $data = file_get_contents('http://realmwar.warhammeronline.com/realmwar/GuildInfo.war?id=99&server=169');
        $regex = '/GuildRoster-Name">(.+?)</';
        // preg_match stops after the first match, so only one name is captured.
        preg_match($regex, $data, $guild);
        echo $guild[1];
    ?>
    
     
  2. retorq

    retorq What up bitch??

    Joined:
    Dec 14, 2006
    Messages:
    6,061
    Likes Received:
    0
    Location:
    Mohave Desert
  3. burn__

    :uh:

    I knew it was going to be something small like that.

    I tried preg_match_all before, but my echo statement was still pulling the wrong information, so I assumed preg_match_all wasn't working.

    I feel like an idiot now :mamoru:
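
For anyone hitting the same thing: preg_match stops after the first match, while preg_match_all collects every match into nested arrays. Below is a minimal sketch of that difference. The sample markup in $data is a made-up stand-in for the fetched page, not the real page source. Note that with preg_match_all, $guild[1] is itself an array, so echoing it directly prints "Array" instead of a name, which may be why the earlier attempt looked broken.

```php
<?php
// Hypothetical sample markup standing in for the fetched page (an assumption).
$data = '<div class="GuildRoster-Name">Gorgoroth</div>'
      . '<div class="GuildRoster-Name">Wren</div>'
      . '<div class="GuildRoster-Name">Skarsquee</div>';

$regex = '/GuildRoster-Name">(.+?)</';

// preg_match_all returns the number of full matches.
// $guild[0] holds the full matched strings, $guild[1] the first capture group.
$count = preg_match_all($regex, $data, $guild);

// $guild[1] is an array of names, so loop over it instead of echoing it.
$names = $guild[1];
foreach ($names as $name) {
    echo $name, "\n";
}
```

Looping over $guild[1] rather than echoing it is the only change needed on the output side.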
     
  4. burn__

    OK, so I'm having another problem.

    Each of these patterns (name and rank) works fine by itself, but combining them keeps throwing an error at preg_match_all.

    Code:
    Warning: preg_match_all(): Unknown modifier '/'
    
    Code:
    <?php
        ini_set('user_agent', 'Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.8.1.14) Gecko/20080404 Firefox/2.0.0.14');
        $data = file_get_contents('http://realmwar.warhammeronline.com/realmwar/GuildInfo.war?id=99&server=169');
        $name = '/<div class="GuildRoster-Name">(.+?)</';
        $rank = '/<div class="GuildRoster-Rank">([^<]+)</';

        // Joining two complete patterns leaves a "/" after the first closing
        // delimiter, which is what triggers the "Unknown modifier" warning.
        $regex = $name . $rank;
        preg_match_all($regex, $data, $guild);
        array_shift($guild); // drop the full-match entries, keep the captures
        print_r($guild);
    ?>
    
    Essentially, I want a list that displays "name - rank", e.g.:

    Gorgoroth - 25
    Wren - 25
    Skarsquee - 21

    etc.
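
The warning comes from the concatenation: each string already carries its own "/" delimiters, so the joined pattern ends at the first closing "/" and PHP tries to read the next "/" as a modifier. Here is a minimal sketch of one way around it, assuming each rank div follows its name div in the markup (the sample $data is hypothetical, and the real page may be laid out differently). The fragments are kept delimiter-free and wrapped once, using "~" as the delimiter so "/" needs no escaping.

```php
<?php
// Hypothetical sample markup; the real page structure is an assumption.
$data = <<<HTML
<div class="GuildRoster-Name">Gorgoroth</div>
<div class="GuildRoster-Rank">25</div>
<div class="GuildRoster-Name">Wren</div>
<div class="GuildRoster-Rank">25</div>
<div class="GuildRoster-Name">Skarsquee</div>
<div class="GuildRoster-Rank">21</div>
HTML;

// Pattern fragments WITHOUT delimiters, so they can be joined safely.
$name = '<div class="GuildRoster-Name">([^<]+)<';
$rank = '<div class="GuildRoster-Rank">([^<]+)<';

// One pair of delimiters around the whole pattern.
// ".*?" plus the "s" modifier skips whatever sits between the two divs,
// including newlines.
$regex = '~' . $name . '.*?' . $rank . '~s';

// PREG_SET_ORDER groups the captures per match: $m[1] = name, $m[2] = rank.
preg_match_all($regex, $data, $matches, PREG_SET_ORDER);

$roster = [];
foreach ($matches as $m) {
    $roster[] = trim($m[1]) . ' - ' . trim($m[2]);
}
echo implode("\n", $roster), "\n";
```

PREG_SET_ORDER is used here because it keeps each name next to its rank, which makes the "name - rank" output a simple loop.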
     
  5. burn__

    Ahh, OK. Regular expressions just don't seem to mesh well with me :hs:
     
