Link exchange? Do your research before you link to someone.

What is Link Exchange

In case you are not familiar with SEO, let me explain what “link exchange” is. As a website owner, I would like more good-quality websites to link to mine, so as to increase its exposure and PageRank. If I think the websites linking to mine are good, I put links to them on my website in return, so that both of us benefit. This is the so-called “link exchange”, and I have always believed it to be a win-win game.

The story

So here is the story. Yesterday our project website received an invitation to run an advertising campaign, together with a link exchange proposal. The sender is a medium-sized, international job-hunting website. The advertising campaign sounded quite good, but unfortunately it is not technically feasible for our website (I’m not going to cover that in this post). What we want to talk about today is their link exchange proposal. They proposed that we add a link to their website while they do the same for us. They gave us the URL of a page showing how the links would look on their website (let’s call it the “Secret Page”, and the website FoolingYouWebsite.com; you will know why soon). The “Secret Page” has a simple and direct URL: www.FoolingYouWebsite.com/SecretPage.html.

We had almost agreed to the proposal when we investigated a bit further: we could not find any way to reach the “Secret Page” from the website’s entrance, other than typing its URL directly. This means the “Secret Page” is isolated from the website’s main pages. Since nothing links to it, search engines may not be able to find and index it, and if a search engine cannot find and index the page, anything on the page is meaningless for SEO. That was our first doubt.

Secondly, we checked the website’s robots.txt (www.FoolingYouWebsite.com/robots.txt). To our surprise, the “Secret Page” is “Disallowed” for ALL robots! What does that mean? It means that every search engine that respects robots.txt will NEVER read that page, even if it finds it. What a cheat!
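The robots.txt check can also be automated. Below is a minimal sketch in PHP, not a complete robots.txt parser: the function name isDisallowedForAllRobots() and the sample robots.txt text are made up for illustration (the real file is not reproduced in this post), and it only understands plain-prefix “Disallow” rules in the “User-agent: *” group. “Allow” rules and wildcard patterns are ignored.

```php
<?php
// Hypothetical helper: reports whether $path is disallowed for all robots
// ("User-agent: *") in the given robots.txt text.
// Simplification: only plain-prefix "Disallow" rules are handled.
function isDisallowedForAllRobots(string $robotsTxt, string $path): bool
{
    $inWildcardGroup = false; // are we inside a group that applies to "*"?
    $prevWasAgent = false;    // consecutive User-agent lines share one group

    foreach (preg_split('/\R/', $robotsTxt) as $line) {
        // Drop comments and surrounding whitespace.
        $line = trim(preg_replace('/#.*/', '', $line));
        if ($line === '' || strpos($line, ':') === false) {
            continue;
        }
        list($field, $value) = array_map('trim', explode(':', $line, 2));
        $field = strtolower($field);

        if ($field === 'user-agent') {
            $inWildcardGroup = ($prevWasAgent && $inWildcardGroup) || $value === '*';
            $prevWasAgent = true;
            continue;
        }
        $prevWasAgent = false;

        if ($field === 'disallow' && $inWildcardGroup && $value !== ''
            && strpos($path, $value) === 0) {
            return true;
        }
    }
    return false;
}

// Assumed robots.txt content, for illustration only.
$robotsTxt = "User-agent: *\nDisallow: /SecretPage.html\n";
var_dump(isDisallowedForAllRobots($robotsTxt, '/SecretPage.html')); // bool(true)
var_dump(isDisallowedForAllRobots($robotsTxt, '/index.html'));      // bool(false)
```

In practice you would first download the other site’s robots.txt and pass its text to the function together with the path of the page that is supposed to carry your link.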

Conclusion

Needless to say, we refused the proposal. We are writing this so that other people can learn from it. We suggest that every website owner research the other website, including its robots.txt, for every link exchange request.

How do you know a blog comment is actually spam?

A surprise

Zome off is a new and little-known blog that does not even show up in any search engine for any keyword (except the word ‘zomeoff’, which nobody would think of searching for). So it is really a big surprise that many people are trying to leave comments on my blog, not once but several times. (But up to now you still don’t see any comments on the blog, huh?)

The latest comment is this:

trying to find you on facebook, wats ur profile

— frostwire

I almost replied, but…

Thanks to WordPress, I can approve every comment before it gets published. As this is such a personal request, I wouldn’t give out my Facebook profile before knowing who “frostwire” actually is. “frostwire” provided fairly complete contact information (website, email), so I started by checking the website. It is about a P2P program and gives little clue about the person “frostwire”. Then I looked at the email address, which is an AOL address. Then I checked the IP address of the comment (using an IP-to-location service such as hostip.info). It showed that the IP is from “Richardson, TX, UNITED STATES”. Still not much of a clue. Finally I googled the whole comment, that is, I searched for the whole sentence “trying to find you on facebook, wats ur profile”. Surprisingly, there were more than 2000 results, most of them WordPress blogs. At that point I realized I had been fooled.

Why?

So why does “frostwire” leave such a comment on the blog? The comment contains no sales message, so on the face of it, it helps no one. However, I can think of 2 reasons why “frostwire” still wins, whether or not the comment gets published.

  1. Increasing the traffic of the website: “frostwire” provides a website URL when submitting the comment, so bloggers like me who wonder about the commenter’s background are very likely to visit it. So even before we ban the comment, we have been tricked into paying a visit to that website.
  2. Collecting personal information: The message itself is quite personal, and “frostwire” provides an email address, so a blogger may reply by email instead of responding publicly on the blog. (Unfortunately, some bloggers do give out their personal information in public replies to this message on their blogs.) I noticed that the “trying to find you on facebook, wats ur profile” message appears mostly on WordPress-powered blogs, so I suspect that “frostwire” is using some kind of robot software (targeting WordPress) to post the comment in large batches. If this is true, it is no challenge for “frostwire” to collect the bloggers’ responses and extract personal information, such as email addresses, from them. No matter how the blogger replies (by private email or public reply), “frostwire” gets your information anyway.

Conclusion: Bloggers must be aware of spammers!

This is the first time I have come across spammers who target bloggers instead of blog readers, so bloggers must be aware of it. When you have doubts about a comment, GOOGLE the whole message. Spammers generally work automatically and on a large scale, so if the message is spam, you will see a lot of victims in the Google results. Do not visit any website the commenter provides until you can trust the commenter.

PHP – isset() vs array_key_exists() : a better way to determine array element’s existence

The story

In the CourseYou project, we were asked to check whether an element is set in an array. That is, we were asked to determine whether $Arr['MyElement'] exists.

So we use the following code as a start.

<?php
if (isset($Arr['MyElement'])) {
     // ... do my stuff ...
}
?>

This code works fine, but only in most cases. In some other cases (and they are actually quite common), using this code to check the existence of an array element can be very DANGEROUS.

What’s wrong with isset()?

isset() is perhaps one of the most frequently used functions, and it does a very common task: determining whether a variable has been set. It is simple and, more importantly, it is FAST, very FAST. However, the result returned by isset() can sometimes be misleading.

According to the PHP manual, isset() will “determine if a variable is set AND is not NULL”.

So isset() becomes dangerous when the element does exist in the array but is set to NULL, i.e. $Arr['MyElement'] = NULL; In this case, isset() always returns FALSE, exactly as it would if the element did not exist. Professional programmers should be aware of this.
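A small sketch of the problem, using a made-up array with one NULL element and a key name ('NoSuchKey') that is assumed not to exist:

```php
<?php
$Arr = array('MyElement' => null);

var_dump(isset($Arr['MyElement'])); // bool(false): the key exists, but its value is NULL
var_dump(isset($Arr['NoSuchKey'])); // bool(false): the key does not exist at all

// Both calls give the same answer, so isset() alone cannot tell the two cases apart.
$sameAnswer = isset($Arr['MyElement']) === isset($Arr['NoSuchKey']);
var_dump($sameAnswer); // bool(true)
```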

The right solution: array_key_exists()

The right way to check whether an element exists in an array is to use array_key_exists(). It tells you whether the given key or index has been “created” in the array, regardless of the element’s value. So to tell whether the element 'MyElement' exists in the array $Arr, we should use this:

<?php
if (array_key_exists('MyElement', $Arr)) {
     // ... do my stuff ...
}
?>

Why does array_key_exists() still suck?

However, array_key_exists() still sucks. Yes, it’s more reliable than isset(), but it’s SLOW. We benchmarked array_key_exists() and isset() as shown below and found that array_key_exists() is almost 5 times slower than isset().

To get the speed of isset() while keeping the reliable result of array_key_exists(), we combined the two. An element being set to NULL is usually a rare case, so most of the time isset() is still reliable. Only when isset() returns FALSE do we need an additional array_key_exists() check to confirm that the key really doesn’t exist. It turns out that the code below works best:

<?php
if (isset($Arr['MyElement']) || array_key_exists('MyElement', $Arr)) {
      // ... do my stuff ...
}
?>


The beauty of PHP (and of many other modern languages) is that it doesn’t need to evaluate the whole condition: the || operator short-circuits. The PHP engine first evaluates isset() only. If isset() returns TRUE, array_key_exists() is never evaluated; if isset() returns FALSE, PHP then evaluates array_key_exists(). That is why the order of the two conditions must not be reversed: putting array_key_exists() first would still give the correct result, but the slow check would run every time.
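To see the short-circuit behaviour in action, here is a small sketch. The wrapper countedKeyExists() is hypothetical; it exists only to count how often the slow check actually runs.

```php
<?php
// Hypothetical wrapper around array_key_exists() that counts its own calls.
function countedKeyExists($key, array $arr, int &$calls): bool
{
    $calls++;
    return array_key_exists($key, $arr);
}

$Arr = array('a' => 1, 'e' => null);
$calls = 0;

$hasA = isset($Arr['a']) || countedKeyExists('a', $Arr, $calls); // isset() is TRUE, fallback skipped
$hasE = isset($Arr['e']) || countedKeyExists('e', $Arr, $calls); // isset() is FALSE, fallback runs
$hasF = isset($Arr['f']) || countedKeyExists('f', $Arr, $calls); // isset() is FALSE, fallback runs

var_dump($hasA); // bool(true)
var_dump($hasE); // bool(true)
var_dump($hasF); // bool(false)
echo $calls, PHP_EOL; // 2
```

The fallback runs only for the two keys where isset() returned FALSE, which is where the combined check saves its time.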

Benchmarking

We ran a simple benchmark of isset(), array_key_exists() and the combined method, and the result of the combined method is very promising.

<?php
// Test array: 'e' exists but holds NULL; 'f' does not exist at all.
$a = array('a'=>1, 'b'=>2, 'c'=>3, 'd'=>4, 'e'=>null);

// 1. array_key_exists() only
$s = microtime(true);
for ($i = 0; $i <= 100000; $i++) {
     $t = array_key_exists('a', $a); //true
     $t = array_key_exists('f', $a); //false
     $t = array_key_exists('e', $a); //true
}

$e = microtime(true);
echo 'array_key_exists : ', ($e - $s), PHP_EOL; // elapsed time in seconds

// 2. isset() only
$s = microtime(true);
for ($i = 0; $i <= 100000; $i++) {
     $t = isset($a['a']); //true
     $t = isset($a['f']); //false
     $t = isset($a['e']); //false
}

$e = microtime(true);
echo 'isset : ', ($e - $s), PHP_EOL;

// 3. isset() first, array_key_exists() as the fallback
$s = microtime(true);
for ($i = 0; $i <= 100000; $i++) {
     $t = (isset($a['a']) || array_key_exists('a', $a)); //true
     $t = (isset($a['f']) || array_key_exists('f', $a)); //false
     $t = (isset($a['e']) || array_key_exists('e', $a)); //true
}

$e = microtime(true);
echo 'isset() + array_key_exists : ', ($e - $s), PHP_EOL;
?>

The benchmarking result (average):

  • array_key_exists() : 308 ms
  • isset() : 4.7 ms
  • isset() + array_key_exists() : 217 ms

Latest update: I have packaged this method into a single function and added element existence checking for multi-dimensional arrays. Please see my other post: A complete element existence checking function for PHP.

Equal (==), identical (===) and array comparison in PHP

Equal (==)

If you use equal (==), you are allowing type conversion, which means PHP will try to convert the two sides to the same type before comparing them. So even if the two sides are NOT the same thing, they MAY still be treated as the SAME.
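A few quick illustrations of this conversion (a small sketch; the values are chosen only for demonstration):

```php
<?php
var_dump("5" == 5);      // bool(true): the numeric string "5" is compared as a number
var_dump("1.0" == "1");  // bool(true): two numeric strings are compared numerically
var_dump(null == false); // bool(true): null is converted to a boolean
var_dump("abc" == "ABC"); // bool(false): non-numeric strings are compared as strings
```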

Consider this code:

<?php 
$left = "C"; 
$right = 0; 
var_dump($left == $right); 
?> 

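Output (PHP 7 and earlier):

bool(true)

"C" equals 0?? The logic behind it is this: $left is the string "C", and since it is compared to $right, which is a number, PHP first converts the string "C" to a number. "C" is not numeric, so unfortunately it becomes 0; this 0 is then compared to $right, which is also 0, so the result, although strange, is logically "true". Note that PHP 8.0 changed this rule: when a number is compared to a non-numeric string, the number is converted to a string instead, so the same code prints bool(false) on PHP 8.0 and later.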

Identical (===)

In contrast, when identical (===) is used in the comparison, PHP does not do any type conversion. PHP first checks whether both sides are of the same type. If not, it simply returns false. If they are of the same type, it then compares the values to see whether they are the same. So it should be no wonder that the output of the code below is "false":

<?php 
$left = "5"; 
$right = 5; 
var_dump($left === $right); 
?> 

Output:

bool(false)
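Here are a few more cases of the same rule, plus one way to get a true result when you really do want to compare the numbers: convert explicitly, then compare with ===. This is only a sketch using the same $left and $right values as above.

```php
<?php
var_dump("5" === "5"); // bool(true): same type, same value
var_dump(5 === 5.0);   // bool(false): int versus float
var_dump(null === false); // bool(false): null versus bool

$left = "5";
$right = 5;
// The cast makes the conversion explicit instead of leaving it to PHP.
var_dump((int) $left === $right); // bool(true)
```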

What if they are Arrays?

Consider this code:

<?php 
$a = array('a'=>1, 'b'=>2, 'c'=>3);    //reference array
$b = array('a'=>1, 'b'=>2, 'c'=>3);    //equal and identical
$c = array('a'=>1, 'b'=>2);            //one element less
$d = array('a'=>1, 'b'=>100, 'c'=>3);  //one element has different value
$e = array('a'=>1, 'c'=>3, 'b'=>2);    //same key-value pairs but different sequence
echo '$a == $b is '; var_dump($a == $b);
echo '$a === $b is '; var_dump($a === $b);
echo '$a == $c is '; var_dump($a == $c);
echo '$a === $c is '; var_dump($a === $c);
echo '$a == $d is '; var_dump($a == $d);
echo '$a === $d is '; var_dump($a === $d);
echo '$a == $e is '; var_dump($a == $e);
echo '$a === $e is '; var_dump($a === $e);
?> 

Output:

$a == $b is bool(true) 
$a === $b is bool(true) 
$a == $c is bool(false) 
$a === $c is bool(false) 
$a == $d is bool(false) 
$a === $d is bool(false) 
$a == $e is bool(true) 
$a === $e is bool(false) 

So we conclude that:

  • When two arrays have the same key/value pairs, the same number of elements, and the elements in the same order, they are both equal (==) and identical (===).
  • If one array has fewer elements than the other, they are neither equal (==) nor identical (===).
  • If one of the elements has a different value, the two arrays are neither equal (==) nor identical (===).
  • If two arrays have the same elements but in a different order, they are equal (==) but NOT identical (===).
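Two related points are worth a quick sketch. First, identical (===) also requires the element values to be of the same types, while equal (==) compares them loosely. Second, if you want a strict comparison that ignores element order, one possible approach (an assumption about what you need, not the only way) is to sort both arrays by key first. The array $s is made up for this example.

```php
<?php
$a = array('a' => 1, 'b' => 2, 'c' => 3);
$e = array('a' => 1, 'c' => 3, 'b' => 2);       // same pairs, different order
$s = array('a' => '1', 'b' => '2', 'c' => '3'); // same order, but values are strings

var_dump($a == $s);  // bool(true): values are compared with ==
var_dump($a === $s); // bool(false): the value types differ

// Sort both arrays by key so that element order no longer matters.
ksort($a);
ksort($e);
var_dump($a === $e); // bool(true)
```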

Reference:

  1. Type conversion during comparison in PHP (they call it type juggling): http://php.net/manual/en/language.types.type-juggling.php
  2. Type comparisons in PHP: http://php.net/manual/en/types.comparisons.php