duplicate() is bad for your (object's) health
June 1, 2007 · 30 Comments
ColdFusion 8 brings a lot of enhancements, both large and small, and it's interesting to see what gets some people excited. Andrew Powell thinks that being able to duplicate a CFC is the most important new feature in ColdFusion 8.
I've already commented on that blog post but I thought I'd elaborate and talk about why I think this particular feature is dangerous and misguided. I really hope that this is just a temporary aberration in the public beta build and that the ColdFusion team remove this ability and restore the CFMX 7 behavior: duplicate() on a CFC should throw an exception.
"What?", you say, "but we've been asking for the ability to duplicate() CFCs for ages!"
Yes, yes, I know... but have you actually thought about what it means?
As it stands, calling duplicate() on a CFC produces a full, deep copy of a CFC. It's quite a common design idiom to have references to other objects within any given object. If you duplicate such an object, you will create duplicates of all the objects it references - a full, deep copy.
Now look at Model-Glue and Transfer: both of these frameworks create objects that contain references back to the core framework object. In both cases, the core framework object is a singleton - only one instance is supposed to exist. In Model-Glue, the event context contains a reference back to the core ModelGlue framework object (several other objects also follow this model). In Transfer, each generated TransferObject contains a reference back to the core Transfer framework object. I expect Mach II and Reactor and some other frameworks behave the same way. It's a common idiom.
Duplicate one of these and you have a pretty serious problem: you suddenly have a full, deep copy of the entire framework object tree in your newly minted object! Ouch!
With Transfer, for example, you'll now have a separate copy of the cache in each of your duplicated objects and you'll start to get subtle problems with the integrity of your data. Something that seemed like a simple operation - copying a transient object - suddenly turns into an extremely hard-to-debug problem with random data corruption in your application! Ouch! Ouch!
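To make the failure mode concrete, here is a minimal sketch. The components Framework.cfc and Bean.cfc, the cache key, and test.cfm are all hypothetical and are not taken from Model-Glue or Transfer; the sketch only assumes the deep-copy behavior described above.

```cfml
<!--- Framework.cfc (hypothetical): stands in for a singleton that owns a cache --->
<cfcomponent output="false">
    <cfset variables.cache = structNew()>

    <cffunction name="getCache" access="public" returntype="struct" output="false">
        <!--- Structs are returned by reference, so callers share this cache --->
        <cfreturn variables.cache>
    </cffunction>
</cfcomponent>

<!--- Bean.cfc (hypothetical): a transient object holding a back-reference --->
<cfcomponent output="false">
    <cffunction name="init" access="public" output="false">
        <cfargument name="framework" required="true">
        <cfset variables.framework = arguments.framework>
        <cfreturn this>
    </cffunction>

    <cffunction name="getFramework" access="public" output="false">
        <cfreturn variables.framework>
    </cffunction>
</cfcomponent>

<!--- test.cfm (hypothetical): duplicate the bean, then write to the real cache --->
<cfset framework = createObject("component", "Framework")>
<cfset original = createObject("component", "Bean").init(framework)>
<cfset clone = duplicate(original)>

<!--- This entry is added only to the one "real" framework cache --->
<cfset liveCache = framework.getCache()>
<cfset liveCache.user42 = "cached">

<cfoutput>
original sees the entry: #structKeyExists(original.getFramework().getCache(), "user42")#<br>
clone sees the entry: #structKeyExists(clone.getFramework().getCache(), "user42")#
</cfoutput>
```

If the clone reports that the entry is missing, it is carrying its own private copy of the "singleton" and its cache, which is exactly the situation that leads to the data integrity problems described above.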
So why do you actually want duplicate() in the first place? The most common reason I've heard so far is that createObject() is "slow" so it would be great if you could just create one object and then duplicate it to produce new objects. This assumes duplicate() is faster than createObject(), right? And why do you think it would be? createObject() just creates a new object and runs the pseudo-constructor. duplicate() on the other hand would have to allocate space and copy all of the elements of the original object recursively. I think duplicate() would be slower than createObject(), especially now that ColdFusion 8 has made incredible performance improvements, particularly around creating new objects.
I ran some tests. I created a simple.cfc that has just a small init() method that sets two variables and two getters for those variables. Then I timed 1000 createObject() calls. About 50ms. Then I timed 1000 creations plus calls to the init() method. About 60ms. Nice. What about duplicate()? You think it'll be faster? Well, 1000 duplicate() calls took about 3 seconds. Yes, you read that right: 3000ms.
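A harness along the following lines would reproduce this kind of comparison. It is a sketch, not the exact code behind the numbers above: the variable names in simple.cfc, the getter names, and the timing.cfm template are assumptions, and absolute timings will vary by machine and build.

```cfml
<!--- simple.cfc: init() sets two variables; two getters (names are assumptions) --->
<cfcomponent output="false">
    <cffunction name="init" access="public" output="false">
        <cfargument name="firstValue" type="string" required="true">
        <cfargument name="secondValue" type="string" required="true">
        <cfset variables.firstValue = arguments.firstValue>
        <cfset variables.secondValue = arguments.secondValue>
        <cfreturn this>
    </cffunction>

    <cffunction name="getFirstValue" access="public" returntype="string" output="false">
        <cfreturn variables.firstValue>
    </cffunction>

    <cffunction name="getSecondValue" access="public" returntype="string" output="false">
        <cfreturn variables.secondValue>
    </cffunction>
</cfcomponent>

<!--- timing.cfm (hypothetical): time each approach over the same number of iterations --->
<cfset iterations = 1000>

<!--- 1. createObject() only --->
<cfset start = getTickCount()>
<cfloop from="1" to="#iterations#" index="i">
    <cfset obj = createObject("component", "simple")>
</cfloop>
<cfset createOnlyMs = getTickCount() - start>

<!--- 2. createObject() plus init() --->
<cfset start = getTickCount()>
<cfloop from="1" to="#iterations#" index="i">
    <cfset obj = createObject("component", "simple").init("a", "b")>
</cfloop>
<cfset createInitMs = getTickCount() - start>

<!--- 3. duplicate() of one pre-built instance --->
<cfset prototype = createObject("component", "simple").init("a", "b")>
<cfset start = getTickCount()>
<cfloop from="1" to="#iterations#" index="i">
    <cfset obj = duplicate(prototype)>
</cfloop>
<cfset duplicateMs = getTickCount() - start>

<cfoutput>
createObject(): #createOnlyMs#ms<br>
createObject() + init(): #createInitMs#ms<br>
duplicate(): #duplicateMs#ms
</cfoutput>
```

getTickCount() returns milliseconds, so the three figures are directly comparable; running the template a few times before reading the numbers helps avoid measuring first-request compilation.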
Still want duplicate() on CFCs?

30 responses so far ↓
1 John Farrar // Jun 1, 2007 at 5:26 AM
My vote is the time would have been better spent adding some things to cfscript like redirect, abort, etc. to make it a complete scripting environment!
2 Jeremy French // Jun 1, 2007 at 5:40 AM
The only way I can see duplicate() being workable on a CFC is if it's NOT a deep copy, in which case, how much use is it?
3 Sami Hoda // Jun 1, 2007 at 6:16 AM
4 Radek Gruchalski // Jun 1, 2007 at 6:24 AM
5 Damien McKenna // Jun 1, 2007 at 6:27 AM
6 Steve Bryant // Jun 1, 2007 at 6:55 AM
Great post. When I first read Andrew's post, something bothered me about using Duplicate on CFCs. I couldn't figure out why it seemed wrong to me.
That being said, I can see where duplicating a CFC could be nice for testing purposes if I want to see what would happen to a component if I did X, but I don't want to mess with the component itself (and I plan to destroy the copy after I run my test).
Not sure if I would actually do that in practice or not though.
7 Sean Corfield // Jun 1, 2007 at 7:40 AM
@Sami, duplicate() only makes sense on a CFC that has just simple data members (i.e., very simple beans) but even then it will be faster to explicitly create a new object and initialize it from the existing values (not that speed should really influence our decision at that level).
8 Andy Powell // Jun 1, 2007 at 7:54 AM
9 Matthew Lesko // Jun 1, 2007 at 8:05 AM
Stop asking for cfscript additions! ColdFusion is a tag-based markup language (i.e. CFML). cfscript is not, and never can be, ECMA compliant unless changes to do so break existing code. However, it pretends/appears to be, which is just that much worse. So better to deprecate rather than expand a poorly implemented experiment.
</rant>
Regarding duplicate(), it now provides developers a loaded gun that can take off a leg (or worse), but may only be apparent when an application experiences load. That said, the same can be said of concurrency (i.e. cfthread). But these issues are not new and I think the solutions they provide outweigh the hazards they create.
When I read Andrew Powell's blog about applying duplicate() to framework code I had the same reaction as Sean, but I don't know enough about any framework's design to comment definitively. Conceptually though, I see applicability in the ORM space, but not MVC.
That said, I do think duplicate() provides an elegant solution for creating objects when:
1. part of the object instantiation (including its composition) is expensive (usually some sort of I/O dependency in my experience) and identical between instances. Note, you then need to be able to alter object state after copying.
2. classes are loaded dynamically at run time so you cannot use sub classing or decorators.
See the GoF Prototype pattern for a more in-depth discussion.
10 Sean Corfield // Jun 1, 2007 at 8:36 AM
@Matthew, again, factories are the solution to the "expensive" instantiation problem - you don't need to duplicate the CFC, just the data that would be expensive to fetch. Not sure what you mean about dynamically loaded classes (Transfer ORM, for example, generates objects on the fly at runtime but still allows you to use decorators).
11 John Farrar // Jun 1, 2007 at 8:43 AM
Read this.
http://www.microsoft.com/presspass/press/1997/jun97/jecmapr.mspx
In other words, this was achieved with script in IE4. 100% compliance.
12 Steve Bryant // Jun 1, 2007 at 9:03 AM
John didn't ask for ECMA compliance, just more utility in cfscript. You may not like cfscript, but others do.
Personally, I think it would help the language if a developer could use a script format or a tag format for anything. It might help attract those who prefer a scripting syntax.
Andy,
Take a look at ColdSpring or Lightwire for some good offerings at circumventing the need for tons of CreateObject() calls (both are Dependency Injection engines).
13 Fred Fortier // Jun 1, 2007 at 12:09 PM
ColdFusion tag scripting is great when working with HTML and generating tables etc. But I force myself never to put any display/HTML-generating functions in CFCs... just to keep things separated and clean.
If I could do all my CFCs in cfscript that would be awesome.
14 Mike Kelp // Jun 1, 2007 at 3:22 PM
I know CF isn't java but I think it was a good way to address all the concerns mentioned above as well as security.
Mike.
15 Daniel Greenfeld // Jun 1, 2007 at 10:19 PM
I share your feeling about cfscript not having enough utility. So I extend it.
What I do is take the tags I need from CF that are not available in cfscript and replicate them as functions inside a utility cfc. A simple example follows:
<cfdump> becomes dump(). dump() does everything that <cfdump> does. Often I extend these functions to include more capability. Better yet, many of these have already been done and are stored on cflib.org. So much of the work is already done!
So I end up with a win/win situation.
16 Adam Cameron // Jun 2, 2007 at 4:35 AM
Surely the problem is not with the notion of duplicating CFC instances, it's with HOW the duplicate() function seems to have been implemented to do this.
If an object (quicker to type than "CFC instance"!) has a member variable that is a REFERENCE to another object, then the REFERENCE to the underlying object should be duplicated, NOT a "deep" duplication of the object to which the reference... err... refers.
Or does this raise issues of its own? (I've only spent about 30sec thinking about this, and I have a hangover).
?
--
Adam
17 John Farrar // Jun 2, 2007 at 6:41 AM
I also tend for now to use _redirect(), and _abort() in my code. The truth is most of the tags just are not "fundamental". The bigger issue is things like <cfAbort> and things like that are wrong to be missing from script.
And to the thought that you can just add them... functions like <cfSaveContent> could be added without making it part of the core language. Yet, adding that and others enhanced the utility of tag developers. The same utility is what those of us who do use script are after. (I am not asking for "all" the tags to be converted.)
18 Nolan // Jun 2, 2007 at 8:25 AM
Sean, I understand your point, and it's a valid one, however it seems (to me) that your argument is specific to applications that have an object pattern which would break if Duplicate() were used. Moving forward with CF8, we'll have interfaces to use, which could cause a shift in how CF apps are designed, making this less of an issue, yes?
Also, just saying "Duplicate() is bad for objects" is no different than saying "cfregistry in a cfloop could cause your server to crash". Okay so the tools used in a specific context aren't the best way to write code. Why not provide the tools for those situations that have a valid use, and educate against the possible negative aspects? Adobe could do something like...
"objectCopy() -- Note: depending on your object model, using this function may cause unexpected behavior. Do not use objectCopy() if your object inherits from a Singleton as it will cause a bug in the application."
...and then if a user does it anyway, it's his/her own fault. However for those cases that aren't tied to a Singleton or a framework (maybe college kids using CF to learn how objects work? or just simple apps that aren't framework based but still use CFCs), we'd have a simple way to copy objects as needed.
Yes? No?
2 cents.
-Nolan
http://www.southofshasta.com/blog/
19 Sean Corfield // Jun 2, 2007 at 9:38 AM
You could happily use duplicate() on simple beans - but it would still be (much) faster to create a new object and initialize it with data from the original bean according to my tests.
20 Gert Franz // Jun 2, 2007 at 1:39 PM
Here are my two cents about duplicate(). These are much the same reasons why we implemented a second parameter for the duplicate() method, called deepcopy. You can then tell Railo to do a flat copy of the component and keep the pointers to the same objects. By default deepcopy is set to true.
I ran the same tests as you did and it turned out that Railo needs 147ms for 10,000 object creations and only 32ms for duplicates, and that is with deepcopy set to true.
Gert
21 Elliott Sprehn // Jun 12, 2007 at 3:22 AM
Anyway, I'm not sure I agree that this should throw an exception. C and C++ let you do some pretty awful things with pointers and memory (corruption), but those features exist in case you need them.
Duplicate could be quite useful if you need to create another instance of an object that's expensive to create where a factory does not exist (3rd party framework) and could not be easily implemented.
Also, if you were unit testing and wanted to compare the members of two objects, before and after a change. You could, of course, mirror the setup calls in the unit test for both instances, or you could write a factory, which honestly seems like overkill if it'll only ever need to be done in the tests.
Duplicate is also useful for working on copies of expensive to create configuration objects. Could we write a factory? Sure. But we could also duplicate.
Your loop doesn't test the performance to duplicate a very expensive object, like the core of Model Glue, vs loading it all over again, or writing a method for it to create a deep copy, either.
If you were going to duplicate the entire expensive-like-ModelGlue instance we can:
- Write a factory method that gets each member, duplicates its primitive structure or creates a new instance if it's a CFC, and walks the whole tree. This is going to be O(n) in the CFML code.
- Keep a reference to the loaded configuration data (XML for MG) and generate a new instance of the object each time. This is going to be O(n) in the CFML code.
- Just call duplicate() on it. This is going to be O(n) in the Java code.
Which one is more readable? Which one uses less memory? Which one is faster? Which one results in more maintainable code?
My bet is on duplicate. It might not be the best solution for all problems, it might not be the best solution to most problems, but to say it'll never be useful doesn't seem quite right to me.
(Note that this is hardly news, either: C++ has deep copies, Python has deep copies, Ruby has deep copies, BlueDragon has had deep copying for CFCs with duplicate() since 6.1, and Railo allows it too.)
22 Andrew Powell // Jun 12, 2007 at 8:38 AM
When it all comes down to it, THAT is the beauty and power of CF. It can accommodate any development style a developer wants to use or not to use.
23 Sean Corfield // Jun 12, 2007 at 8:47 AM
Your example of the Model-Glue core is specious: Model-Glue is a singleton. Most "expensive to create" objects are singletons so duplicate() won't apply to them by definition, in my opinion.
I'm sure folks will find duplicate() useful - I'm just very concerned that people will duplicate() a CFC and shoot themselves in the foot because deep copy semantics are not appropriate.
For simple bean-like CFCs, duplicate() will work fine (although not for Reactor or Transfer managed objects!) but, as I noted, duplicate() in the Public Beta is much, much slower than createObject() now that the CF team have vastly improved the performance of createObject().
24 Gert Franz // Jun 12, 2007 at 8:55 AM
25 Sean Corfield // Jun 12, 2007 at 9:18 AM
26 Elliott Sprehn // Jun 12, 2007 at 11:19 AM
It seems silly to me to require people to write their own deep copy constructor for every component to create a deep copy. CF is about saving time.
I think using a clone() method if one exists, as you suggest, and falling back to the default deep copy behavior if it doesn't, seems like a good compromise.
As a side note, duplicate() deep copies, structCopy() shallow copies. Most struct functions work on cfc instances. So it makes more sense to me to add the capacity to structCopy() a cfc instance for a shallow copy than to add parameters to duplicate.
27 Elliott // Aug 8, 2008 at 2:03 AM
Creating 3000 objects:
coldspring.beans.DefaultXmlBeanFactory:
createObject: ~3400ms, duplicate: ~3400ms
So I looked through the code and it turns out that CS calls createUUID() in the body of coldspring.beans.AbstractBeanFactory. I removed this because I suspected a performance hit...
coldspring.beans.DefaultXmlBeanFactory(3000 times):
createObject: ~850ms, duplicate: ~486ms
Then I tried something very simple...
coldspring.beans.BeanReference(3000 times):
createObject: ~400ms, duplicate: ~70ms
So duplicate can be 6x faster in the right conditions, or just as slow in the bad conditions.
(Incidentally Transfer uses duplicate() to make creating TransferObjects faster on CF8)
The oddest part of the initial test is that the createUUID() in the *body* of the <cfcomponent> tag was causing the duplicate slowness.
So I tried adding writeOutput("test!") to the body of a cfcomponent. Creating it once and duplicating it 3000 times. And what did I get? "test!" output 3000 times! eep.
That definitely has to be a bug (I hope), and a really nasty one at that.
It does seem using that bug that you can fake singleton behavior though:
<cfcomponent>
<cfif isDefined("application.instances.MySingleton")>
<cfthrow ...>
</cfif>
<cfset application.instances.MySingleton = 1>
</cfcomponent>
28 Sean Corfield // Aug 8, 2008 at 6:25 AM
Interesting behavior that the pseudo-constructor is executed in duplicate() - have you filed it as a bug? http://adobe.com/go/wish
29 Gary // Oct 29, 2008 at 4:02 AM
30 Joel Ferreira // Mar 15, 2011 at 8:00 AM
I set up a test case and was very surprised by the results.
In CF8 - duplicate() was twice as fast as createObject()
In CF9 - duplicate() was about 30% slower than createObject()
Something under the hood has definitely changed to produce this behavior.