regex - PHP: using strpos to extract digits from a string
I have a huge amount of HTML code to scan. Until now I have been using preg_match_all to extract the desired parts from it. The problem from the start was that it was extremely CPU-time consuming, so I decided to use another method for the extraction. I read in some articles that preg_match can be compared in performance with strpos. They claim that strpos beats the regex scanner by 20 times in efficiency. I thought I would try this method, but I don't know how to get started.
Let's say I have this HTML string:
<li id="ncc-nba-16451" class="che10"><a href="/en/star">23 - star</a></li> <li id="ncd-bbt-5674" class="che10"><a href="/en/moon">54 - moon</a></li> <li id="ertw-cxda-c6543" class="che10"><a href="/en/sun">34,780 - sun</a></li>
I want to extract the number from each id and the text (letters) from the content of the a tags. I do it with this preg_match_all scan:
'/<li.*?id=".*?([\d]+)".*?<a.*?>.*?([\w]+)<\/a>/s'
You can see the result here: link
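For reference, here is a minimal sketch of how the pattern above can be run against the sample string. The variable names ($pattern, $matches, $pairs) are illustrative and not from the original post. Note that for the third item the pattern captures 6543, not c6543, because only the trailing digits of the id are matched.

```php
<?php
// Sample input from the question.
$html = '<li id="ncc-nba-16451" class="che10"><a href="/en/star">23 - star</a></li> '
      . '<li id="ncd-bbt-5674" class="che10"><a href="/en/moon">54 - moon</a></li> '
      . '<li id="ertw-cxda-c6543" class="che10"><a href="/en/sun">34,780 - sun</a></li>';

// The pattern from the question, unchanged.
$pattern = '/<li.*?id=".*?([\d]+)".*?<a.*?>.*?([\w]+)<\/a>/s';

// PREG_SET_ORDER groups the captures per match instead of per capture group.
preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

$pairs = [];
foreach ($matches as $m) {
    // $m[1]: trailing digits of the id, $m[2]: last word of the link text.
    $pairs[] = [$m[1], $m[2]];
}

print_r($pairs);
```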
Now, if I want to replace this method with strpos functionality, what would the approach look like? I understand that strpos returns the index at which the match starts. How can I use it to:
- get all possible matches, not just one
- extract numbers or text from the desired place in the string
Thanks for any tips ;)
Using DOM:
$html = '
<html>
<head></head>
<body>
<li id="ncc-nba-16451" class="che10"><a href="/en/star">23 - star</a></li>
<li id="ncd-bbt-5674" class="che10"><a href="/en/moon">54 - moon</a></li>
<li id="ertw-cxda-c6543" class="che10"><a href="/en/sun">34,780 - sun</a></li>
</body>
</html>';

$dom_document = new DOMDocument();
$dom_document->loadHTML($html);
$rootelement = $dom_document->documentElement;

$res = [];

// Take the last dash-separated part of each <li> id.
$getid = $rootelement->getElementsByTagName('li');
foreach ($getid as $tag) {
    $data = explode('-', $tag->getAttribute('id'));
    $res['li_id'][] = end($data);
}

// Collect the text of each link via its parent <li>.
$getnode = $rootelement->getElementsByTagName('a');
foreach ($getnode as $tag) {
    $res['a_node'][] = $tag->parentNode->textContent;
}

print_r($res);
Output:
Array
(
    [li_id] => Array
        (
            [0] => 16451
            [1] => 5674
            [2] => c6543
        )

    [a_node] => Array
        (
            [0] => 23 - star
            [1] => 54 - moon
            [2] => 34,780 - sun
        )

)
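As for the strpos approach the question asks about, the following is only a rough sketch, not a tested replacement for the regex. It answers the two points in order: the third argument of strpos (the offset) lets a loop find every match, and substr cuts out the text between two found positions. The helper between() is hypothetical (not a PHP built-in), and the code assumes every item starts with exactly `<li id="` and that the link text contains no nested tags. Whether this is actually faster than preg_match_all on real data would need to be measured.

```php
<?php
/**
 * Hypothetical helper: returns the substring between $open and $close,
 * searching from $offset. On success $offset is moved past $close so
 * the next call continues from there. Returns null when nothing is found.
 */
function between(string $s, string $open, string $close, int &$offset): ?string
{
    $start = strpos($s, $open, $offset);
    if ($start === false) {
        return null;
    }
    $start += strlen($open);
    $end = strpos($s, $close, $start);
    if ($end === false) {
        return null;
    }
    $offset = $end + strlen($close);
    return substr($s, $start, $end - $start);
}

// Sample input from the question.
$html = '<li id="ncc-nba-16451" class="che10"><a href="/en/star">23 - star</a></li> '
      . '<li id="ncd-bbt-5674" class="che10"><a href="/en/moon">54 - moon</a></li> '
      . '<li id="ertw-cxda-c6543" class="che10"><a href="/en/sun">34,780 - sun</a></li>';

$res = [];
$pos = 0; // current scan position, advanced by between()

// Assumption: the id is the first attribute of every <li>.
while (($id = between($html, '<li id="', '"', $pos)) !== null) {
    // Everything from after '<a ' up to '</a>', e.g. href="/en/star">23 - star
    $link = between($html, '<a ', '</a>', $pos);
    if ($link === null) {
        break;
    }
    $gt = strpos($link, '>');
    if ($gt === false) {
        break;
    }
    $text = substr($link, $gt + 1);

    // Trailing digits of the id: count digits at the start of the reversed id.
    $digits = substr($id, strlen($id) - strspn(strrev($id), '0123456789'));

    // Last space-separated word of the link text.
    $sp = strrpos($text, ' ');
    $word = $sp === false ? $text : substr($text, $sp + 1);

    $res[] = [$digits, $word];
}

print_r($res);
```

Unlike the DOM version, this keeps only the trailing digits of the id (6543 rather than c6543) and only the last word of the link text, which mirrors what the two capture groups of the original pattern return. It is also more brittle: any change in attribute order or spacing breaks the literal `<li id="` search.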