PHP get_headers() fails with Pinterest
I'm working on a tool that integrates links to different social networks:
Facebook: https://www.facebook.com/jonathan.parentlevesque
Google Plus: https://plus.google.com/+jonathanparentl%c3%a9vesque
Instagram: https://instagram.com/mariloubiz/
Pinterest: https://www.pinterest.com/jonathan_parl/
RSS: https://regex101.com
Twitter: https://twitter.com/arcadefire
Vimeo: https://vimeo.com/ondemand/crashtest/135301838
YouTube: https://www.youtube.com/user/darkjo666
I'm using a basic regex like this one:
    /^https?:\/\/(?:[a-z]{2}|[w]{3})?\.pinterest\.com\/[\S]{5,}$/i
on both the client and the server side, as a minimal domain validation of each link.
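A minimal sketch of how such a domain check could be wrapped on the server side. The function name isPinterestProfileUrl is hypothetical, and the pattern is a slightly relaxed variant of the one above (an assumption, not the exact original): the subdomain and its dot are optional together, so the bare pinterest.com domain is accepted too.

```php
<?php
// Hypothetical helper: minimal domain validation for a Pinterest link.
// Accepts an optional two-letter or "www" subdomain, then at least five
// non-whitespace characters after the domain.
function isPinterestProfileUrl(string $url): bool
{
    $pattern = '/^https?:\/\/(?:(?:[a-z]{2}|www)\.)?pinterest\.com\/\S{5,}$/i';

    // preg_match() returns 1 on a match, 0 on no match, false on error.
    return preg_match($pattern, $url) === 1;
}
```

This only checks the shape of the link; it says nothing about whether the page exists.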
Then I use this function to validate that the page actually exists (it's useless to integrate social network links that don't work, after all):
    public static function isurlexists($url){
        $exists = false;

        // Default to HTTPS when no scheme is given.
        if (!stringmanager::stringstartwith($url, "http") && !stringmanager::stringstartwith($url, "ftp")){
            $url = "https://" . $url;
        }

        if (preg_match(regularexpression::url, $url)){
            $headers = get_headers($url);

            // The first entry is the status line, e.g. "HTTP/1.1 200 OK".
            if ($headers !== false && !empty($headers)){
                if (strpos($headers[0], '404') === false){
                    $exists = true;
                }
            }
        }

        return $exists;
    }
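The stringmanager::stringstartwith helper used above is not shown in the post. A minimal stand-in, assuming it is a plain prefix test (both function names below are hypothetical), could look like this:

```php
<?php
// Hypothetical stand-in for the prefix helper, which the post does not show.
function stringStartWith(string $haystack, string $needle): bool
{
    // strncmp() compares only the first strlen($needle) bytes.
    return strncmp($haystack, $needle, strlen($needle)) === 0;
}

// Prepend "https://" when the URL starts with neither "http" nor "ftp",
// mirroring the first step of the function above.
function ensureScheme(string $url): string
{
    if (!stringStartWith($url, "http") && !stringStartWith($url, "ftp")) {
        $url = "https://" . $url;
    }

    return $url;
}
```

With this, ensureScheme("www.google.ca") yields "https://www.google.ca", while a URL that already carries a scheme is returned unchanged.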
Note: in this function I'm using Diego Perini's regex to validate the URL before sending the request:
    const url = "%^(?:(?:https?|ftp)://)(?:\S+(?::\S*)?@|\d{1,3}(?:\.\d{1,3}){3}|(?:(?:[a-z\d\x{00a1}-\x{ffff}]+-?)*[a-z\d\x{00a1}-\x{ffff}]+)(?:\.(?:[a-z\d\x{00a1}-\x{ffff}]+-?)*[a-z\d\x{00a1}-\x{ffff}]+)*(?:\.[a-z\x{00a1}-\x{ffff}]{2,6}))(?::\d+)?(?:[^\s]*)?$%iu"; //@copyright Diego Perini
All the links I've tested so far didn't generate any error, but testing Pinterest produces quite a scary series of error messages:
    get_headers(): SSL operation failed with code 1. OpenSSL Error messages: error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed
    Array ( [url] => https://www.pinterest.com/jonathan_parl/ [exists] => )
    get_headers(): Failed to enable crypto
    Array ( [url] => https://www.pinterest.com/jonathan_parl/ [exists] => )
    get_headers(https://www.pinterest.com/jonathan_parl/): failed to open stream: operation failed
    Array ( [url] => https://www.pinterest.com/jonathan_parl/ [exists] => )
Does anyone have an idea of what I'm doing wrong here?
I mean, isn't Pinterest a popular social network with a valid certificate (I don't use it personally, I just created an account for testing)?
Thanks for your help,
Jonathan Parent-Lévesque from Montreal
I tried to create a self-signed certificate for my development environment (XAMPP), as suggested by N.B. in a comment. That solution didn't work for me.
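For context, "certificate verify failed" usually means PHP could not validate the remote certificate chain against a CA bundle, and a self-signed certificate on the local server does not change that. A hedged sketch, assuming a CA bundle file is available at a path of your choosing (the path and both function names are hypothetical), keeps get_headers() and points it at that bundle through a stream context. The context argument of get_headers() requires PHP 7.1 or later.

```php
<?php
// Build a stream context that verifies the peer against an explicit CA bundle.
// $caFile is an assumed path, e.g. a cacert.pem saved on the development machine.
function buildVerifiedContext(string $caFile)
{
    return stream_context_create([
        'ssl' => [
            'verify_peer'      => true,
            'verify_peer_name' => true,
            'cafile'           => $caFile,
        ],
    ]);
}

// Return the status line of $url, or null when the request fails.
function fetchStatusLine(string $url, string $caFile): ?string
{
    // Warnings are suppressed so a failure shows up as a null return value.
    $headers = @get_headers($url, false, buildVerifiedContext($caFile));

    return ($headers !== false && !empty($headers)) ? $headers[0] : null;
}
```

Whether this resolves the Pinterest error depends on the local PHP and OpenSSL setup, so treat it as something to try rather than a confirmed fix.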
His other solution was to use cURL or Guzzle instead of get_headers(). Not only did it work but, according to this developer's tests:
http://php.net/manual/fr/function.get-headers.php#104723
it is also way faster than get_headers().
For those interested, here's the code of my new function:
    /**
     * Send an HTTP request to $url and check the header posted back.
     *
     * @param $url string URL to which we must send the request.
     * @param $failcodelist int array List of codes for which the page is considered invalid.
     *
     * @return boolean
     */
    public static function isurlexists($url, array $failcodelist = array(404)){
        $exists = false;

        // Default to HTTPS when no scheme is given.
        if (!stringmanager::stringstartwith($url, "http") && !stringmanager::stringstartwith($url, "ftp")){
            $url = "https://" . $url;
        }

        if (preg_match(regularexpression::url, $url)){
            $handle = curl_init($url);

            curl_setopt($handle, CURLOPT_RETURNTRANSFER, true);
            curl_setopt($handle, CURLOPT_SSL_VERIFYPEER, false);
            curl_setopt($handle, CURLOPT_HEADER, true);
            curl_setopt($handle, CURLOPT_NOBODY, true);
            curl_setopt($handle, CURLOPT_USERAGENT, true);

            $headers = curl_exec($handle);
            curl_close($handle);

            // Fall back to 404 when no usable list of codes is supplied.
            if (empty($failcodelist) or !is_array($failcodelist)){
                $failcodelist = array(404);
            }

            if (!empty($headers)){
                $exists = true;
                $headers = explode(PHP_EOL, $headers);

                // Only the status line ($headers[0]) is checked against the codes.
                foreach ($failcodelist as $code){
                    if (is_numeric($code) && strpos($headers[0], strval($code)) !== false){
                        $exists = false;
                        break;
                    }
                }
            }
        }

        return $exists;
    }
Let me explain these cURL options:
CURLOPT_RETURNTRANSFER: return the response as a string instead of displaying the called page on screen.
CURLOPT_SSL_VERIFYPEER: cURL won't check the certificate.
CURLOPT_HEADER: include the headers in the string.
CURLOPT_NOBODY: don't include the body in the string.
CURLOPT_USERAGENT: some sites need it to function properly (for example: https://plus.google.com).
Additional note: I explode the header string and use $headers[0] to be sure to validate only the return code and message (example: 200, 404, 405, etc.).
Additional note 2: sometimes validating only code 404 is not enough (see the unit test), so there's an optional $failcodelist parameter to supply the list of codes to reject.
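The status check described in these two notes can be isolated into a pure helper that is easy to test without any network access. The sketch below is a variation, not the function above: it parses the three-digit code out of the status line instead of searching it with strpos(), so a fail code cannot match other digits on the line by accident. Both function names are hypothetical, and unlike the original, a response whose first line is not an HTTP status line is treated as invalid.

```php
<?php
// Extract the numeric status code from the start of a raw header string.
// Returns null when the string does not begin with an HTTP status line.
function parseStatusCode(string $rawHeaders): ?int
{
    // Matches e.g. "HTTP/1.1 404 Not Found" or "HTTP/2 200".
    if (preg_match('/^HTTP\/\d+(?:\.\d+)?\s+(\d{3})\b/', $rawHeaders, $m) === 1) {
        return (int) $m[1];
    }

    return null;
}

// True when a status code was found and it is not in the reject list.
function isStatusAccepted(string $rawHeaders, array $failCodeList = [404]): bool
{
    $code = parseStatusCode($rawHeaders);

    return $code !== null
        && !in_array($code, array_map('intval', $failCodeList), true);
}
```

The raw header string returned by curl_exec() could be passed straight to isStatusAccepted(), together with the same list of codes to reject.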
And, of course, here's the unit test that validates my code:
    public function testisurlexists(){
        //invalid
        $this->assertFalse(toolmanager::isurlexists("woot"));
        $this->assertFalse(toolmanager::isurlexists("https://www.facebook.com/jonathan.parentlevesque4545646456"));
        $this->assertFalse(toolmanager::isurlexists("https://plus.google.com/+jonathanparentl%c3%a9vesque890800"));
        $this->assertFalse(toolmanager::isurlexists("https://instagram.com/mariloubiz1232132/", array(404, 405)));
        $this->assertFalse(toolmanager::isurlexists("https://www.pinterest.com/jonathan_parl1231/"));
        $this->assertFalse(toolmanager::isurlexists("https://regex101.com/546465465456"));
        $this->assertFalse(toolmanager::isurlexists("https://twitter.com/arcadefire4566546"));
        $this->assertFalse(toolmanager::isurlexists("https://vimeo.com/**($%?%$", array(400, 405)));
        $this->assertFalse(toolmanager::isurlexists("https://www.youtube.com/user/darkjo666456456456"));

        //valid
        $this->assertTrue(toolmanager::isurlexists("www.google.ca"));
        $this->assertTrue(toolmanager::isurlexists("https://www.facebook.com/jonathan.parentlevesque"));
        $this->assertTrue(toolmanager::isurlexists("https://plus.google.com/+jonathanparentl%c3%a9vesque"));
        $this->assertTrue(toolmanager::isurlexists("https://instagram.com/mariloubiz/"));
        $this->assertTrue(toolmanager::isurlexists("https://www.facebook.com/jonathan.parentlevesque"));
        $this->assertTrue(toolmanager::isurlexists("https://www.pinterest.com/"));
        $this->assertTrue(toolmanager::isurlexists("https://regex101.com"));
        $this->assertTrue(toolmanager::isurlexists("https://twitter.com/arcadefire"));
        $this->assertTrue(toolmanager::isurlexists("https://vimeo.com/"));
        $this->assertTrue(toolmanager::isurlexists("https://www.youtube.com/user/darkjo666"));
    }
I hope this solution will help someone,
Jonathan Parent-Lévesque from Montreal