Regular expression
I want to filter out
Arga, Distribuição Auto, Lda. (line 3)
This is what I have so far:
If anyone knows how I should adjust my regex to get it working, and is willing to explain how it works, that would help me enormously.
Thanks in advance.
Code (php)
<img src="_pics/baredvert.gif" align="top" width="68" height="20">
<font size="3" face="Arial"><br>
Arga, Distribuição Auto, Lda.
</font><font size="2" face="Arial">
<br><img src="_pics/baredhor.gif" align="top" width="400" height="4"><br>
<img src="_pics/baredvertinfu.gif" align="top" width="68" height="18">
Motor vehicle supplies and new parts<br>
<img src="_pics/baredvertinfp.gif" align="top" width="68" height="18">
<i>Gros. Peças/Aces p/Automóveis</i><br>
<img src="_pics/bargrayhor400.gif" align="top" width="400" height="2"><br>
<img src="_pics/bargrayhor400.gif" align="top" width="400" height="2"><br>
<img src="_pics/baredvertppl.gif" align="top" width="68" height="18">
Domingos Abreu Figueiredo (Eng.) - Director<br>
<img src="_pics/baredvert.gif" align="top" width="68" height="18">
Comercial/Vendas<br>
<img src="_pics/bargrayhor400.gif" align="top" width="400" height="2"><br>
<img src="_pics/bargrayhor400.gif" align="top" width="400" height="2"><br>
<img src="_pics/baredvertadr.gif" align="top" width="68" height="18">
Alameda Antonio Sergio, 57-A<br>
<img src="_pics/baredvert.gif" align="top" width="68" height="18">
2795 - LINDA-A-VELHA<br>
<img src="_pics/bargrayhor400.gif" align="top" width="400" height="2"><br>
<img src="_pics/baredverttel.gif" align="top" width="68" height="18">
21
419 10 41
<img src="_pics/baredvertfax.gif" align="top" width="68" height="18">
21
419 10 50
<br>
<img src="_pics/bargrayhor400.gif" align="top" width="400" height="2"><br>
<img src="_pics/bargrayhor400.gif" align="top" width="400" height="2"><br>
<img src="_pics/baredvert.gif" align="top" width="68" height="18">
Source/<i>Fonte</i>: <a href="../45bizservices/spie/index.html"><img src="_pics/logospie60x14.gif" alt="SPIE" align="absmiddle" border="0" width="60" height="14">
Edited on 01/01/1970 01:00:00 by Simon
Tell us a bit more about the source of the file: where does it come from, and how is it generated?
I hope this is enough for you to help me; otherwise just ask for more information ;)
What is the goal?
But what I want now is to extract:
Arga, Distribuição Auto, Lda. (line 3)
from the HTML code.
There are several of these HTML fragments, but I have split them with explode() and put them in a foreach loop, so that I get separate small pieces of HTML each time and can search them more easily with a regex.
The site I am getting the data from is: http://portugalvirtual.pt/0/3054dat1.html
My code is below:
Code (php)
<?php
ini_set('display_errors', 1);
error_reporting(E_ALL | E_STRICT);

// Fetch the index page and collect the links to the detail pages.
$result = file_get_contents('http://portugalvirtual.pt/0/30.html');
preg_match_all('/3054dat[0-7]\.html/', $result, $gegevens);

foreach ($gegevens[0] as $value) {
    $result = file_get_contents('http://portugalvirtual.pt/0/' . $value);
    // $result = file_get_contents('http://portugalvirtual.pt/0/3054dat1.html');
    // $result = file_get_contents('iets.php');

    // Earlier attempts at the company name and the street.
    $firma1 = get_string_between($result, '<font size="3" face="Arial"><strong>', '</strong>');
    preg_match_all('(\<font size="3" face="Arial"\>\<strong\>\s[^\<\>]{1,50}\s\<\/strong\>)', $result, $firma);
    preg_match_all('(\<img src="_pics\/baredvertadr\.gif" width="68" align="top" height="18"\>[^\<\>]{1,100}\<br\>)', $result, $straat);
    print_r($straat);

    // Split the page into fragments at each <strong> tag and search every fragment.
    $test = explode('<strong>', $result);
    foreach ($test as $waarde) {
        // This is the regex I cannot get to work.
        $aap = preg_match('/<br>\n([a-zA-Z0-9|,.\s\w]{1,})\n<\/font>/', $waarde, $firma);
        print_r($firma);
        print_r($waarde);
        echo '<br/>';
    }
}

// Returns the text between $start and $end, or "" when $start is not found.
function get_string_between($string, $start, $end) {
    $string = " " . $string;
    $ini = strpos($string, $start);
    if ($ini == 0) return "";
    $ini += strlen($start);
    $len = strpos($string, $end, $ini) - $ini;
    return substr($string, $ini, $len);
}
?>
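One possible way to adjust the regex, as a rough sketch rather than a definitive answer. The pattern in the code above presumably fails for two reasons: the character class [a-zA-Z0-9|,.\s\w] does not cover the accented ç and ã (without the u modifier, \w normally matches only ASCII letters, digits and underscore), and a literal \n will not match if the page uses \r\n line endings. The sketch below captures "anything that is not a tag" instead and tolerates any whitespace around it. The function name extract_company_name and the test fragment are hypothetical and only there for illustration; the fragment is copied from the sample HTML in the first post.

```php
<?php
// Hypothetical helper (not from the original post): returns the company name
// found in one HTML fragment, or null when the pattern does not match.
function extract_company_name(string $fragment): ?string
{
    // <font\b[^>]*\bsize="3"[^>]*>  the heading font tag, whatever its other attributes
    // \s*<br>\s*                    the <br> after it, followed by \n or \r\n
    // ([^<>]+?)                     capture everything that is not a tag (lazy)
    // \s*</font>                    stop at the closing tag, dropping trailing whitespace
    $pattern = '~<font\b[^>]*\bsize="3"[^>]*>\s*<br>\s*([^<>]+?)\s*</font>~i';

    if (preg_match($pattern, $fragment, $match) === 1) {
        return $match[1];
    }
    return null;
}

// Sample fragment, assumed to look like one piece produced by explode().
$fragment = <<<HTML
<img src="_pics/baredvert.gif" align="top" width="68" height="20">
<font size="3" face="Arial"><br>
Arga, Distribuição Auto, Lda.
</font><font size="2" face="Arial">
HTML;

echo extract_company_name($fragment), PHP_EOL;
```

With the sample fragment this should print Arga, Distribuição Auto, Lda. Because [^<>] matches any byte except the angle brackets, the pattern does not depend on whether the page is served as UTF-8 or ISO-8859-1. The same approach (a fixed tag, then [^<>]+?, then the closing tag) could be reused for the street line after baredvertadr.gif.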