What is Copy-on-write and why is it important to understand?
To start with this topic, let's see what happens when we assign one array to another and then change the first element of the first array:
$a = array("apples", "oranges", "peaches");
$b = $a;
$a[0] = "grapes";
print_r($a);
print_r($b);
Results:
Array ( [0] => grapes [1] => oranges [2] => peaches )
Array ( [0] => apples [1] => oranges [2] => peaches )
As with most simple variables, arrays are passed and assigned by value. In other words, when we did $b = $a we seemingly created a duplicate of the original array. We know this because we changed the first entry of the first array and the change wasn't reflected in the second array.
If you are used to other dynamic languages, however, you might expect that assigning an array performs an assign-by-reference, so that the array is not duplicated in memory. You would then assume that changing one array also changes the other, since they are basically the same array. Duplicating arrays as PHP appears to be doing here would concern anyone who cares about memory usage and who passes a lot of arrays around. You would also rightly be concerned about the speed impact of PHP duplicating arrays all the time.
But this... this is madness!?
Luckily for us, there is method behind all this madness. PHP uses a technique called copy-on-write. What this means is that after $b = $a both variables internally share the same array (this is not the same thing as a PHP =& reference), and a copy of the array is only made if one of them is changed later on. When we did $b = $a there was still only one copy of the array in memory, right up to the point when we changed one of them.
There is a question though... If you really don't want PHP to duplicate arrays, ever, should you always explicitly pass/assign arrays by reference, e.g.:
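To make the copy-on-write behaviour visible, here is a minimal sketch that watches memory usage around an assignment and a later write. It assumes memory_get_usage() is a good enough indicator for this purpose; the variable names and the array size are arbitrary choices for illustration, and the exact byte counts will differ between PHP versions.

```php
<?php
// Build a large array so that a real copy shows up clearly in memory usage.
$a = range(1, 100000);

$before = memory_get_usage();
$b = $a;                          // assignment: both variables share one array
$afterAssign = memory_get_usage();
$b[0] = -1;                       // first write to $b: PHP copies the array now
$afterWrite = memory_get_usage();

$assignCost = $afterAssign - $before;      // expected to be tiny
$writeCost  = $afterWrite - $afterAssign;  // expected to be roughly the array's size

echo "cost of assignment:  $assignCost bytes\n";
echo "cost of first write: $writeCost bytes\n";
```

If the assignment had duplicated the array straight away, the first number would be the large one. Instead the memory is only spent once $b is actually modified, and $a keeps its original contents.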
$b =& $a;
Well, yes, you could if you really wanted to, but be careful. Passing an array to a function by reference, e.g. function test(&$parameter) {}, can actually be slower than just passing the array the usual way, e.g. function my_function($parameter) {}, since PHP needs to do extra work behind the scenes to manage the reference.
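The difference in behaviour between the two ways of passing an array can be sketched as follows. The function names here are hypothetical and exist only for this illustration; the example shows semantics, not timing.

```php
<?php
// Hypothetical helper: receives the array by value. Because it writes to
// its parameter, copy-on-write gives it a private copy at that point.
function append_by_value($parameter) {
    $parameter[] = "kiwi";
    return count($parameter);
}

// Hypothetical helper: receives the array by reference. The write goes
// straight to the caller's array.
function append_by_reference(&$parameter) {
    $parameter[] = "kiwi";
    return count($parameter);
}

$fruit = array("apples", "oranges", "peaches");

$insideValue = append_by_value($fruit);    // 4 inside the function
$afterValue  = count($fruit);              // still 3 in the caller

$insideRef = append_by_reference($fruit);  // 4 inside the function
$afterRef  = count($fruit);                // now 4 in the caller too

echo "by value:     $insideValue inside, $afterValue outside\n";
echo "by reference: $insideRef inside, $afterRef outside\n";
```

A function that only reads its parameter never triggers the copy at all, so passing by value is usually the simpler choice unless the function really is meant to change the caller's array.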
Some links regarding the topic:
Research paper on Copy-On-Write in PHP (PDF)
http://php.net/manual/en/functions.arguments.php
http://www.php.net/manual/en/features.gc.refcounting-basics.php
http://php.net/manual/en/internals2.variables.intro.php