Perl Experts

Discussion in 'OT Technology' started by Jabberwocky, Jan 9, 2003.

  1. Jabberwocky

    Jabberwocky 06 08 33 35 36

    Joined:
    Jan 24, 2002
    Messages:
    1,140
    Likes Received:
    0
    Location:
    SD, Cali
    How would I pull strings out of an HTML source (STDIN I guess)?
     
  2. Dommi

    Dommi Guest

    If I understand what you're doing, you want to emulate something like UBB or vB code: remove the HTML and replace it with proprietary tags, or vice versa. There is a doc on how to do this in PHP on the PHP site, but as for Perl I wouldn't know. I can say, however, that you can pull the source for this from any decent bulletin board written in Perl, like Ikonboard.
     
  3. Jabberwocky

    Jabberwocky 06 08 33 35 36

    I don't think I'm explaining myself too well. I'm just going to put in a big block of HTML source using STDIN, and then I want to check through it and parse certain text into variables so I can print them out in a tab or comma delimited form.

    Any sites that might help me with pattern recognition or setting this up?
     
  4. RaginBajin

    RaginBajin Have you punched a donkey today?

    Joined:
    Dec 24, 2001
    Messages:
    8,740
    Likes Received:
    0
    Location:
    NoVA
    Here's an example of something you could use. It breaks apart hh:mm:ss, which is a very common example. Just go to Google, search for "perl regular expressions", and look through the sites that come up.

    Code:
         # Bind $time to the pattern with =~ (a plain = would match against $_ instead)
         if ($time =~ /(\d\d):(\d\d):(\d\d)/) {
             $hours = $1;
             $minutes = $2;
             $seconds = $3;
         }
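
    For anyone who wants to run the hh:mm:ss example as-is, here is a rough, self-contained sketch. The helper name split_time and the sample string are made up for illustration; the only real point is that a match in list context hands back the captured groups directly.

```perl
use strict;
use warnings;

# split_time is a hypothetical helper name, not something from the post above.
# In list context a successful match returns the captured groups,
# and a failed match returns an empty list.
sub split_time {
    my ($text) = @_;
    return $text =~ /(\d\d):(\d\d):(\d\d)/;
}

# Made-up sample input containing an hh:mm:ss time.
my $sample = "Logged at 14:05:59 today";
if (my ($hours, $minutes, $seconds) = split_time($sample)) {
    print "$hours\t$minutes\t$seconds\n";
}
```

    Testing the result of the match before using the captures matters, because $1, $2 and $3 keep their old values when a match fails.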
    
     
    Last edited: Jan 10, 2003
  5. Jabberwocky

    Jabberwocky 06 08 33 35 36

    I understand the storing part, but how would I write it so that it grabs the part immediately following the text that the match test looks for?


    Here's an example of the code:

    <a href="http://medicine.ucsd.edu/pharmaco/jadams.htm"> Adams, J.A. </a> - University of California, San Diego (USA)
    </li>

    So I'd like to use /<a href=/ as the trigger. Then, if that matches, how would I parse the link, first/last name, and location into different variables?

    I'm just confused about how the test runs through the input. Is it character after character? Or if it gets a match, does it jump to the next line?
     
    Last edited: Jan 10, 2003
  6. Kabuko

    Kabuko Guest

    As said above, look up regular expressions.
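
    To make that pointer concrete for the sample line posted above, here is a minimal sketch. It assumes every entry sits on one line shaped like that sample (link, name, a dash, then the location). The names parse_entry and print_entries are hypothetical, and the in-memory filehandle only stands in for real input; passing \*STDIN instead would read the pasted HTML.

```perl
use strict;
use warnings;

# parse_entry is a hypothetical helper. For a line shaped like the sample
# it returns (url, name, location); otherwise it returns an empty list.
sub parse_entry {
    my ($line) = @_;
    return $line =~ m{<a\s+href="([^"]*)"\s*>\s*(.*?)\s*</a>\s*-\s*(.*?)\s*\z}i;
}

# Read one line at a time. Within a line the regex engine scans left to
# right and stops at the first position where the whole pattern fits.
sub print_entries {
    my ($in, $out) = @_;
    while (my $line = <$in>) {
        chomp $line;
        my ($url, $name, $location) = parse_entry($line) or next;
        print {$out} join("\t", $url, $name, $location), "\n";
    }
}

# Sample input copied from the question, held in memory for the demo.
my $sample = <<'HTML';
<li>
<a href="http://medicine.ucsd.edu/pharmaco/jadams.htm"> Adams, J.A. </a> - University of California, San Diego (USA)
</li>
HTML
open my $fh, '<', \$sample or die "cannot open in-memory handle: $!";
print_entries($fh, \*STDOUT);
```

    Lines that do not match (the <li> and </li> lines here) are simply skipped. A regex like this is fine for a page with a fixed, known layout; for arbitrary HTML a real parser module would be the safer choice.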
     
  7. Penguin Man

    Penguin Man Protect Your Digital Liberties

    Joined:
    Apr 27, 2002
    Messages:
    21,696
    Likes Received:
    0
    Location:
    Edmonton, AB
    Hm, this is what I use to parse form input. I'm not sure if it parses URL input too, but it might give you some ideas:
    Code:
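    # Read the POST body; CONTENT_LENGTH is unset for plain GET requests
    read(STDIN, $buffer, $ENV{'CONTENT_LENGTH'} || 0);
    
    # Collect name=value pairs from both the POST body and the URL query string
    @alpha = split(/&/, $buffer);
    @gamma = split(/&/, $ENV{'QUERY_STRING'} || '');
    @pairs = (@alpha, @gamma);
    foreach $pair (@pairs) {
            # A limit of 2 keeps any "=" inside the value intact
            ($name, $value) = split(/=/, $pair, 2);
            $value =~ tr/+/ /;                                            # "+" encodes a space
            $value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;  # decode %XX escapes
            $value =~ s/<!--.*?-->//gs;                                   # strip each HTML comment (non-greedy)
            $value =~ s/<[^>]*>//g;                                       # strip any remaining tags
            $INPUT{$name} = $value;
    }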
    
    Hope that helps you :)

    Edit: Upon reading the code (I've blindly used it in all my scripts for years), I realize that it does in fact parse URL input. See, $ENV{'QUERY_STRING'} is whatever comes after the ? in the URL, and the loop just goes through and parses each pair into %INPUT. So if the URL was /asdf.cgi?name=PenguinMan&status=UberCool, then $INPUT{'name'} would be "PenguinMan" and $INPUT{'status'} would be "UberCool"... I think.

    Edit2: Just tested it, I was right.
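
    For anyone who wants to try the query-string part outside a CGI environment, here is a small sketch of the same idea as a standalone function. The name parse_query is made up, and unlike the snippet above this version also decodes the field names and leaves HTML untouched.

```perl
use strict;
use warnings;

# parse_query is a hypothetical helper: it turns "a=1&b=2" style input
# into a hash, decoding "+" and %XX escapes in both names and values.
sub parse_query {
    my ($query) = @_;
    my %input;
    foreach my $pair (split /&/, $query) {
        next unless length $pair;                  # skip empty pairs such as "a&&b"
        my ($name, $value) = split /=/, $pair, 2;  # split on the first "=" only
        $value = '' unless defined $value;         # "flag" with no "=" gets an empty value
        for ($name, $value) {
            tr/+/ /;
            s/%([0-9a-fA-F]{2})/chr(hex($1))/eg;
        }
        $input{$name} = $value;
    }
    return %input;
}

# The example URL from the post, minus the script name and the "?".
my %input = parse_query('name=PenguinMan&status=UberCool');
print "$input{name}\t$input{status}\n";
```

    In a real CGI script the string passed in would be $ENV{'QUERY_STRING'} or the POST body read from STDIN.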
     
    Last edited: Jan 11, 2003
