Database Introspection and the MetaData Facade

Being a forerunner is often kind of exciting. There's a certain sense of pride that goes along with being the first person to achieve a particular goal. Y'know, ego stroking. ;) But there's another more challenging side to being a forerunner. You're the first person to get hit when things are launched in your direction.

Okay well I may not have been the first to get hit in this case but my challenges over the past couple days certainly stem from having been a forerunner.

As of today I'm running my ColdFusion 8 installation on my horribly under-powered, not to mention OLD notebook with the new CF Admin option "disable access to internal ColdFusion Java components".

The challenge in doing this was that I needed to find a way to get all the metadata I needed from the database without being able to get an actual JDBC metadata object. The JDBC object I need is actually *NOT* part of the ColdFusion server, oddly enough. The problem stems from the fact that it needs to be associated with an active Java SQL.connection object and the only way I can get that is by going through the undocumented "coldfusion.server.ServiceFactory".

Until Adobe released ColdFusion 8, this wasn't a problem. And Adobe actually tried to resolve the problem I have by also introducing the CFDBINFO tag in ColdFusion 8. (I'm pretty sure I submitted the enhancement request to Adobe for this tag and I've had several people thank me for it, saying it's one of their favorite new features, so I put that in the win category.) ;) Unfortunately although they do return some good and useful introspection data, they failed to solve my problems in particular. There are a handful of problems with the data returned by the cfdbinfo tag, like the fact that it won't tell you the names of any of the foreign key constraints (which, you MUST know if you ever plan to drop them programmatically). The data's there - it's part of the standard set of data returned by the JDBC metadata object even.

That wasn't such a huge thing, the bigger issue (and one I may still need to address more) is that it doesn't return any schema information for tables, views or foreign key constraints. Most of the time your applications will only work in one particular schema (on SQL server it's usually although not always DBO), but particularly if you work with Oracle for the enterprise at all there's a good chance that you may need to draw data from alternate schemas. So the DataFaucet introspection code was designed to account for that -- after all, it's been part of the SQL standard for many, many years! And without that information the foreign key constraint code in particular will fail. And so because of that I couldn't use cfdbinfo for all of the metadata I needed. Instead I had to go back to using information_schema for constraint information (although it's in the sqlagent.cfc which means you can customize it for different db's if you need to).

Don't get me wrong. I understand exactly what Adobe was trying to do when they implemented cfdbinfo. They were trying to take the complexities of JDBC introspection and boil them down into a package that would be easier for developers to digest. And so instead of requesting imported vs. exported foreign keys, you simply request columns and the column list says which columns are foreign keys. That much I understand and could even agree with. But they dropped vital information while they were at it. Hopefully they'll fix those oversights in a future release.

I have actually used information_schema before. Many years ago, I started doing database introspection when we were still limited to ColdFusion 5. That was before CFCs even, all I had to work with were custom tags. NOBODY was talking about writing db abstraction in ColdFusion at the time. In fact, pretty much everyone recommended to avoid even trying it. Except me. I'm not sure Mark was even around yet and Transfer certainly wasn't. Neither was Reactor. Of course now it's common, but not back then. And those tags I had to work with were full of an amalgam of information_schema and proprietary meta-data tools. Ugh! But that was the price of being a forerunner, working on things that weren't easy to pull off. It wasn't until around the time that 6.1 (Red Sky) was released that I converted everything to use JDBC, specifically because even between databases that support the information_schema standard, the data from JDBC is still more consistent and reliable.

So ultimately the existing code in DataFaucet that relies on the serviceFactory is still the most solid way of handling all the things that DataFaucet does. So it's going to try and use that by default, because that's a best case scenario. But we've been working on a project with Eric Jones the past week and their hosting provider disables access to the ColdFusion classes. That meant I had to work on an alternative solution. And I've found myself more or less in the same sort of situation I was in back when I was doing this with CF5, slogging through a maze of different and none-too-consistent standards for getting the information I need.

For that matter, I even asked both Mark Mandel and Sean Corfield if they had any suggestions on the Java front for an easier way that maybe I could get at a loaded metadata object without going through the serviceFactory since it's not a ColdFusion class (and so wouldn't be protected). Neither of them had any advice and for that matter, Mark doesn't even do database introspection generally speaking. So here I am again, the forerunner... bitten in the rear by my success. ;)

I can't say that the solution hasn't been interesting. I had to design a system that would attempt to get the serviceFactory and then on failure create a couple of new components. These new components are facades for the connection and metadata objects. They're just CFCs like everything else in the framework, but what they're doing is pretending to be the java objects the framework wanted. (And hooray for duck-typing in ColdFusion! Because this problem would have been a lot harder to solve in an explicitly typed language like Java.) So these objects have to return the data in the same format that's in use. Since 6.1 that's been the JDBC format... and guess what, neither cfdbinfo nor information_schema follow the same format or naming conventions. So I can't say that it hasn't been interesting (and certainly challenging), but I do wish it weren't necessary.

But then, I'm used to these kinds of tough challenges. I've never been the kind of guy to run away from a technical challenge. ;) And this does mean that if you've had problems using the ORM due to the limitations of your hosting there's now a solution. :) (Unless they disable the creation of Java objects all-together in your sandbox.)

Anyway I haven't released the code yet, but I have checked it in to the SVN repository. So if you have an immediate need, you can try the new code by getting the BER from SVN. I probably won't release a new downloadable archive until I create a small sample application and some documentation for the new Persistence Service that I've talked about in my last few blogs. And at that point because of the new service and because of the metadata facade I think I may roll it up from 1.0 to version 1.1.

DataFaucet PersistenceService Update - Tested Many-to-Many

I just committed some updates to the SVN repository that includes testing for many-to-many relationships with the PersistenceService. And yes there were a handful of issues that needed to be addressed to get these working. ;)

The object code for a reciprocal many-to-many relationshiup looks like this:

<cfcomponent displayname="department">
<cfproperty name="departmentID" type="uuid" required="true" key="1" />
<cfproperty name="departmentName" type="string" required="true" />
<cfproperty name="facultyArray" type="array" required="false" />

...
</cfcomponent>

<cfcomponent displayname="faculty">
<cfproperty name="facultyID" type="uuid" required="true" key="1" />
<cfproperty name="firstname" type="string" required="false" />
<cfproperty name="lastname" type="string" required="false" />
<cfproperty name="departmentArray" type="array" required="false" xref="tblDepartmentFaculty" />

...
</cfcomponent>

Once you've got these CFCs ready, then you would set up your PersistenceService with the appropriate IoC factory (or multiple factories if you prefer) and call the install method to install the tables to hold the data for these components. (The below code would normally be configured in your IoC factory - it's shown here for reference.) :)

<cfscript>
cacheFactory = CreateObject("component","datafaucet.system.classcachefactory").init();
cacheManager = CreateObject("component","datafaucet.system.classcachemanager").init(cacheFactory);
daoFactory = CreateObject("component","datafaucet.system.daofactory").init(accessorPattern="get*",mutatorPattern="set*");
factory = CreateObject("component","datafaucet.system.classfactory").init("datafaucet.demo");
service = CreateObject("component","datafaucet.system.persistenceservice").init(factory,daoFactory,cacheManager);

service.install("department");
service.install("faculty");
</cfscript>

And viola! You've got tables. Then all you need to do is get and save a few departments and faculty. As you create and save them it will both cache the objects and relationships and it will update the cross-reference table indicated in the xref attribute. Restart your ColdFusion server and fetch any department or faculty record via service.get("department",departmentid) and it will automatically load the associated objects.

As previously mentioned, this is a deceptively simple feature and it can be very easy to swamp your server with hundreds or even thousands of objects that aren't being used, simply because you declared a couple properties of type "array". So BE CAREFUL if you choose to use this. Make sure these are objects that really are frequently used and really should be cached in memory.

ORM PersistenceService BER

Okay so as of today, the bleading-edge release of DataFaucet includes a reasonably complete INITIAL PROTOTYPE of the PersistenceService. It now includes reciprocal one-to-many/many-to-one relationships and the purge operation has been given some preliminary testing. Aggregation (many-to-many relationships) have not been tested and cache purging needs some additional testing.

What does this mean?

For those of you who are advanced users and may have been interested in an "external ORM" where you could use your own business objects that are ignorant of the persistence engine, you can now have components like this:

<cfcomponent displayname="conference">
<cfproperty name="conferenceid" type="uuid" required="true" key="1" />
<cfproperty name="conferencename" type="string" required="true" />
<cfproperty name="forumArray" type="array" required="false" />

...
</cfcomponent>

<cfcomponent displayname="forum">
<cfproperty name="forumid" type="uuid" required="true" key="1" />
<cfproperty name="forumname" type="string" required="true" />
<cfproperty name="conference" type="conference" required="true" />

...
</cfcomponent>

Now where I put those elipses, you would have your getters and setters (however you want to write them) and in the PersistenceService you can set a pattern for them, so you can use either getValue/setValue the way I mostly use DataFaucet or you can use getConference/setConference. And then you can simply request your objects from the service in much the same way that Transfer works. In this above example when you request a conference object via PersistenceService.get("conference",conferenceid) it will automatically load that conference object with an array of its cached forums and similarly, each forum object will be loaded with a reference to the cached conference object. So if you've got one conference and 5 forums, you'll have 6 objects in the cache.

Can't wait?

For more information about how to configure the service, see the previous blog entry. There is NOT any documentation in the distribution yet. To get the code, export from the SVN repository (it's NOT in the public download yet and won't be until there's documentation).

In my mind there are specific use cases in which this is beneficial over using the existing active records and imo vice versa as well.

On the plus side:

Using the PersistenceService means that your individual cached objects won't contain all the data access methods as is the case with active record objects. That means they'll have a smaller memory footprint.

For some of you OO enthusiasts you'll be happy that the object is nearly independent of the ORM (aside from a few special attributes in the cfproperty tags).

Also there's just one DAO class, instead of having a separate DAO CFC for each object, woohoo! The DAO is loaded with some information about the class it's being used to persist, but generally speaking you shouldn't need to extend the DAO, it should just work as is to persist most business classes.

On the minus side:

Because of the seamless, recursive nature of the links between these objects, it could be really easy to code yourself into a situation where you've got dozens or even hundreds of objects loading when you only asked for ONE. As an example, given a product catalog of any size, you wouldn't want to have an array of Products in each Category obect. And that's really not the way online shopping is done anyway, so hopefully that won't be a problem for you. ;)

As of right now there are no solutions to lazy-loading property arrays. I have a few ideas about ways to handle that, but no code yet. Doing that in ColdFusion will probably mean subclassing your business objects to add hooks into the ORM for fetching those properties, although there will need to be a way of informing the DAO that those properties will be lazy-loaded.

Race conditions... race conditions... and more race conditions... To be honest there are race conditions with the active record implementation also, however, the addition of caching for any collection of objects increases the number and type of race conditions that will occur in your application. Unfortunately beyond "in your business objects", I can't tell you where you'll find them, I just know that you will. If any given method in your business objects sets or gets more than one instance variable, there's a potential race condition in it.

Use Cases

You might want to use the PersistenceService instead of active records if you have a behavior-rich application in which a number of objects need to remain resident in memory for frequent access. A good example of this would be application security / authentication. User objects, Role objects and Permission objects (if you have them) are apt to be accessed rather frequently and may include somewhat rich behaviors for determining which users have access to what features of your application. In this scenario it makes a lot of sense to retain these objects in memory to reduce the amount of database access and keep the individual objects readily accessible to each other to handle their exchange of information.

As always, I'm available to answer whatever questions you may have. ;)

New PersistenceService option in DataFaucet ORM

So exactly one week ago today on the 3rd of November, Joe Rinehart posted this blog entry in which he suggested that none of the existing ORM solutions for ColdFusuion are "true ORMs" because they function in ways that are a bit different from the way "traditional" ORM tools work in other languages... or maybe just in Java, I'm not sure.

Anyway as you might expect Joe's article has been fairly popular. MangoBlog doesn't show a view count, but there are 34 comments! In a follow-up he listed a handful of more specific problems he believes need to be solved in order to call a tool an "ORM".

Now I actually disagree with the notion that we should just avoid calling these tools ORMs unless they work in this specific way, in part because the name "object relational mapping" doesn't include any words that actually describe several of the problems listed. Caching as an example isn't described by the name although it's part of the problems outlined. I am however a fan of options.

Until recently the only caching in DataFaucet was for queries. It didn't have any caching utilities for objects. Part of the reason for an object cache being needed in Joe's description is because of the "identity" problem he wants an ORM tool to solve. The basic idea is that for instance X of a given class, there should only ever be one object in memory. So you couldn't have say Product #5103 and have Joe have one instance of 5103 and Ike have another instance of 5103 and they be different objects, because there's only ever supposed to be one Product 5103 object. This of course gives rise to a variety of new race conditions.

The active record objects in DataFaucet on the other hand are designed on a somewhat different concept of identity, one that handles race conditions related to referential integrity in a different way. Actually most of the potential race conditions that occur when everything is cached as is the case in Joe's description (and in Transfer), don't occur when you're using DataFaucet's active record objects, and so there are only a much smaller handful of race conditions to consider, those being largely related to deleted records and referential integrity.

Now I'm not saying that the active record model I developed in DataFaucet is better, I'm merely pointing out that it's different and so it has different strengths and weaknesses. If you're like me you may consider that there are a lot of applications for the web where the kind of very complicated caching done to solve the identity problem in Transfer isn't really necessary and so a somewhat simpler approach may work just fine.

On the other hand there certainly may be cases in which knowing that those objects are atomic may be very important to your business model, in which case you would need something that handles caching. In the past I just handled the cache on a case-by-case basis. The onTap framework's member plugin does this, having a manager object that maintains cache for the individual member objects and resolves the identity problem for just those objects. They actually are active record objects but in that case I felt it was important that the identity problem be solved the way Joe described in order to facilitate a more bulletproof security model. ;)

I realize I'm getting kind of long-winded here (as usual). What I'm getting at is that I've started working on a PersistenceService object in the DataFaucet core that provides a generic service/factory for working with objects in the way Joe described in his article. I have no intention for this to replace any of what already exists in DataFaucet. The existing features will stay right where they are, I'm merely extending the framework to include some new features in addition for those who prefer to work with their data in a manner more like Transfer.

One thing that you will find different from Transfer with the PersistenceService is that it's not going to create the individual objects itself. Instead it's going to defer to an IoC framework like ColdSpring or LightWire (or the simplified IoC factory in the onTap framework). And so you won't be creating any decorators like you would with Transfer because it's going to use your objects from the word go.

Also as Joe described in his article, this will make it possible to create your objects as "pure" objects, which have no internal knowledge of the ORM. Instead the ORM will analyze the object's properties and build the data schema from that - or you can provide a schema file for a given class if you need more control. I'm still working out the details of how to configure the PersistenceService and its related cache manager and DAO objects. The DAO objects and object-cache objects can also come from your IoC factory if you prefer to configure them that way (which if you're using IoC already, I rather expect you will prefer that).

I also made a particular point of designing the DAO objects to support not only the generic get/set methods that I prefer but to also support explicit getters and setters as well if you prefer that. Of course they default to getValue and setValue because the onTap framework's "duck" object has those method names, but if you've got existing objects with explicit getters and setters (from a snippet maybe?) then you can use those too.

Here's an example of a CFC I'm using for testing to show how you can use it with objects that are rather unlike the kind of objects I usually create:

<cfcomponent displayname="vendor">
   <cfproperty name="vendorid" type="uuid" required="true" key="1" />
   <cfproperty name="vendorname" type="string" required="true" />
   <cfproperty name="vendornotes" type="string" required="false" length="250*" />
   
   <cffunction name="init" access="public" output="false">
      <cfset variables.instance = newInstance() />
      <cfreturn this />
   </cffunction>
   
   <cffunction name="newInstance" access="private" output="false">
      <cfset var result = structNew() />
      <cfset result.vendorid = "" />
      <cfset result.vendorname = "" />
      <cfset result.vendornotes = "" />
      <cfreturn result />
   </cffunction>
   
   <cffunction name="getVendorID" access="public" output="false">
      <cfreturn instance.vendorID />
   </cffunction>
   
   <cffunction name="setVendorID" access="public" output="false">
      <cfargument name="vendorid" type="string" required="true" />
      <cfset instance.vendorid = arguments.vendorid />
   </cffunction>
   
   <cffunction name="getVendorName" access="public" output="false">
      <cfreturn instance.vendorName />
   </cffunction>
   
   <cffunction name="setVendorName" access="public" output="false">
      <cfargument name="VendorName" type="string" required="true" />
      <cfset instance.VendorName = arguments.VendorName />
   </cffunction>
   
   <cffunction name="getVendorNotes" access="public" output="false">
      <cfreturn instance.vendorNotes />
   </cffunction>
   
   <cffunction name="setVendorNotes" access="public" output="false">
      <cfargument name="VendorNotes" type="string" required="true" />
      <cfset instance.VendorNotes = arguments.VendorNotes />
   </cffunction>
   
</cfcomponent>

You might notice that, as Joe described, this CFC doesn't extend anything. It doesn't have to. The persistenceService is able to get all the information it needs either from the metadata of the object or from its configs, but the object itself doesn't need to know anything at all about the ORM. It doesn't even need to know you're using an ORM at all.

So having said that, the setup for my test page to configure the service looks like this:

<cfscript>
   cacheFactory = CreateObject("component","datafaucet.system.classcachefactory").init();
   cacheManager = CreateObject("component","datafaucet.system.classcachemanager").init(cacheFactory);
   daoFactory = CreateObject("component","datafaucet.system.daofactory").init(accessorPattern="get*",mutatorPattern="set*");
   factory = CreateObject("component","datafaucet.system.classfactory").init("datafaucet.demo");
   service = CreateObject("component","datafaucet.system.persistenceservice").init(factory,daoFactory,cacheManager);
   
   service.install("vendor");
   
   vendor = service.get("vendor");
   vendor.setVendorName("sony");
   service.save(vendor);
   
   vendor2 = service.get("vendor",vendor.getVendorID());
   vendor3 = service.get("vendor",vendor.getVendorID());
</cfscript>

If this were a working application I wouldn't have those createObject calls at the top. Instead I would have those objects configured in my IoC framework and then I would fetch the PersistenceService from the IoC factory and call service.get() as you see here. I would probably also use just the one IoC factory for all three types of objects (cache, dao and business objects). The init method of the service object even defaults the daoFactory to the business object factory.

The function call service.install("vendor") analyzes the object and configuration and creates the necessary tables to hold the data for the vendor object.

Then my get() call fetches a new vendor object from my IoC factory (in this case "datafaucet.system.classfactory" is just a naive tool that creates an object and spits it back, rather than an actual IoC factory - it's a stub really, existing only to support the model for testing - and the daoFactory here is the same).

The call to service.save() then checks to see if the object is new and performs either an insert or an update of the database depending on its status, just like Transfer.

Subsequent calls to service.get() where I've included the id of the record I want then return the already created object from the cache.

So far I've tested this on the single object. It creates one table and there are no composed objects yet. There is some code in SVN for handling the composition cache, but I haven't tested that part yet. There are a bunch of other things I expect don't work yet. Many of them are things that are already working with the active record objects. For example it won't handle autonumber ids yet or storing data for a single object in multiple tables. Heck, I haven't even tested the functions for purging the cache!

I wanted to post a note to let everyone know about roughly two days work on this. The code is in SVN and none of it is documented yet because of course it's not released yet. But if you're interested in playing around with it, you're more than welcome to get the BER from the SVN repository and check it out. :)

I will say that one particularly keen drawback of the PersistenceService is that it fails one of the primary objectives of the DataFaucet framework and that is making data access easy. I've said it on several occasions that it's called DataFaucet because I wanted to make getting data as easy as getting a drink of water from your kitchen sink! ;) In many ways I feel the gateway and active record objects really do live up to that ideal, particularly with features like and/or keyword searching. The PersistenceService not so much.

There's not a whole lot I think I can do about that because it needs an IoC framework to function and of course that introduces a new dependency that's beyond the scope of the ORM. But I'm certainly not opposed to the framework having "advanced techniques" for people who need or want them. As long as I can maintain that gentle learning curve, I'll be happy. :)

Query of Query - Did you know?

Did you know that you can put complex objects in a ColdFusion query? I'm honestly not sure if that's true of Java resultset objects. It may be, it may not. But you can with a ColdFusion query object. That means you can put arrays, structs and yes even CFC/Java objects in them.

What's even more important at this moment is that aside from the fact that you can inject complex objects into a query column, you can also execute query-of-query operations against them. ;) Now, in all likelyhood you would not be able to FILTER on the complex values... But that's not what I'm after. You can have a separate column with an id key, which allows you to filter the query-of-query on the simple value and that lets me do what I need.

What I'm driving at here is something that Joe Rinehart mentioned in a recent blog. Joe's contention is that there aren't currently any true ORM systems for ColdFusion. Okay, that's fine, but what do we call all these tools like Transfer and Reactor if we're not calling them ORMs? We could call them all "Data Access Layers" (DAL), which is how Steve Bryant bills DataMgr. But in the end I don't think that's a particularly helpful bone to pick. I would rather simply continue to call these ORM tools and discuss the differences in approach and feature sets on a case by case basis. I just don't see any ROI for getting all persnickety over the semantics of the label.

Having said that however, Joe's two posts did get me thinking about the way DataFaucet works and what I could provide as alternatives.

Let me start by saying that what is in DataFaucet now is not going anywhere. It works. I think it works well, I enjoy using it and I expect to continue to enjoy using it.

I do think however that I can offer some alternatives that will be more similar to what Joe described, external to the domain model objects, and at the same time will provide a record-caching mechanism that I've previously been hesitant to provide as a result of efficient cascade-deletes. The cascade-delete issue still troubles me, but I'll work on that. ;)

But at its core the engine I have in mind for handling the identity problem (1-record = 1-object), is based on the ability to place complex objects in queries and to get that data back using a query of query. Key in this strategy is the fact that objects stored in the query are returned by pointers (NOT deep-cloned or duplicated) when you execute the subsequent. If the query-of-query were to clone the object, then it would be useless because you would have multiple copies of the object in memory, making the caching mechanism moot. But because the query-of-query returns a reference, that means all instances in which that object are returned will be the same cached object. And as a result that makes a query an effective storage medium for these objects.

You might be asking "umm... hey Ike, why don't you just use a structure?"

That's a good question! And I could. However the structure poses two issues. One is that if I want to selectively cull the cache (based for example on idle time), then I would have to analyze each individual entry in the structure, whereas with the query I can simply execute a new query-of-query that excludes all entries meeting the criteria to cull -- it's a much faster operation.

Secondly I'm recalling something I read from Hal Helms in which he was talking about the Toys-R-Us debacle. In it he mentions how the developers (all very smart guys mind you) who worked on the Toys-R-Us site had made the decision to cache all the products in a big structure in memory... and the CF Server falls flat on its face.

I happen to know the reason for this problem with structures. A structure basically works on the hash-table principal. And hash tables are good for relatively small numbers of objects, however, because the hash-table ultimately is based on a "hash function", its results are somewhat imperfect. You tell it to store "foo" and it uses the hash-function to convert "foo" into a number and then attempts to store your value in that index in an array. Arrays are faster than string parsing, great!

But for the hash function to be faster than string parsing, that means it needs to actually not parse the string. Wha?! Yeah, I know, it sounds really strange, but it really doesn't. It basically performs a couple of complex math operations on the individual bytes of the string to come up with its number. And this means that, although ideally the hash function will produce few collisions, it *will* produce collisions.

A collision is when you tell it "store foo", it looks up the index for "foo", sees that there's already a value stored there, checks to see if that value is for "foo" and realizes that "bar" produces the same index using the hash-function. So since the hash function returns say 100 for both "foo" and "bar", it has to find some way to store both foo and bar in position 100 of the array. The first collision or two you won't notice much difference. The problem is that when you get a few million products in a database and you're storing them all in a big structure, you wind up with THOUSANDS of collisions in your hash table.

At that point the hash table actually becomes less efficient than doing the relatively more expensive string-parsing the hash-table was designed to avoid. Hence the server chokes... coughs up a lung and dies. And it will do that, irrespective of how much physical memory it has available, because the bottleneck is the hash table.

(To clarify, I don't actually know that's what happened at Toys-R-Us, I'm just extrapolating from what I've read about the event and about hash tables.)

Now I actually have some code currently that stores data in structures and at this point I'm considering revising some of it (though not all) to use queries instead for these two reasons I described. In at least one case the code that I have uses nested structures to avoid some of the issues related to the potential of hash table collisions. So instead of asking for struct["foo.bar.thing.#thingid#"] it actually converts that id into a foo struct, a bar struct, a thing struct and then the final data, so it's nested 3-structs deep. That allows me to store more in those structs without collisions, but because it's using structKeyExists several times in succession, that eats away some at the performance gain from caching. It's still better than not caching, but other maintenance tasks especially like culling old cache become rather complex, cumbersome and inefficient with that strategy. So I'm hoping the query-of-query approach will simplify those caching tools across the board. I wouldn't be surprised if this also finds its way into the CacheBox project at some point in the not too distant future as well.

In any event, here's some test code I was using to figure out if query-of-query could be used for this caching strategy. It seemed to work pretty perfectly actually. So I'm pleased. :)

<cfset qry = QueryNew("id,object") />
<cfset obj = CreateObject("component","datafaucet.system.duck") />

<cfloop index="x" from="1" to="5">
   <cfset QueryAddRow(qry,1) />
   <cfset qry.id[x] = x />
   <cfset qry.object[x] = duplicate(obj) />
   <cfset qry.object[x].setValue("id",x) />
</cfloop>

<cfquery name="qry2" dbtype="query">
   select * from qry where id = 3
</cfquery>

<cfset qry2.object[1].setValue("id","test") />

<cfoutput>#qry.object[3].getValue("id")#</cfoutput>

<cfdump var="#qry#" />

If you run this from a page that has a mapping for /datafaucet, you should get the output "test", followed by the dump of the 5 objects. What that means is that when it collected the data for qry2 in the query-of-query, it returned a pointer to the original duck object in record 3 of the original query. I updated the value of the ID, and then went back to the original query to display the ID value from there. If it had cloned or duplicated the object, it would have output "3" instead of "test" and that would have told me I couldn't use the query-of-query as a storage medium. But it gave me the result I wanted, so woohoo! Queries, here I come. :)

New DataFaucet ORM Build Nov 2

Aaargh!

So after having had a really unusual amount of trouble related to the recent changes leading up to the addition of the iterator, I had declared after releasing the iterator CFC that there shouldn't be any new builds for a while. Not that there aren't things to work on, there certainly are planned enhancements. For example right now reinstalling an active record object won't update any columns with modified data types, nor will it prepopulate an existing table with the default value for a new column. Both of those are planned for a future release...

What I hadn't expected however was a FREAKING TYPO in the active record object. Same place. Easy fix, regardless of your relative skill level, because the CF server tells you that "aguments.objectid" is undefined. I'm really surprised by this one actually, because I was pretty certain I'd tested that... but I guess I must have been experiencing source monitoring failure (a false memory) as often happens with the tasks we perform frequently. Of course TDD won't prevent this, because the more often you run your tests, the more likely you are to remember running them successfully when the reality is that you haven't run them.

And oddly enough even though I thought I had tested it, I was also doing a bunch of work with other code that uses the active record and that didn't show this problem either apparently because the only place where I was calling read() with the argument apparently I was also trapping and ignoring errors. Which meant that even though the member plugin for the onTap framework was installing, it was omitting insertions for a number of db records it needed and the security system was just generally not working because it wasn't loading the role and permission records correctly. Oops!

Now that's where TDD would have helped. If I'd had a set of regression tests set up for the member plugin and for DF, I could have run them and seen right away where the problem was, rather than having to go through all that troubleshooting. ;)

Ahh well, it gave me an excuse to enhance the member plugin a little more anyway and set it up to upgrade DF during installation. :)

New DataFaucet ORM Build - added iterator CFC

Okay, so I don't usually publish new builds this close together. :) It turns out that I was just too hasty to add the reset() method to the read() method of the active record object and I realized that in addition to the previous problem this also interfered with some of my other existing code because it was clearing out my default values for record objects where I had defined a default value in the soft-constructor of the CFC.

But really the reset() method had come about originally partly in preparation for the iterator object, which just takes a query and a record and provides a way of looping over the query and loading the object with each record in the query. It gives you a middle-ground between the performance of a gateway and the encapsulation of a record object. I won't call it an "iterating business object" because the default iterator itself isn't really a business object per-se. It doesn't really have any business logic in it.

So I've added the iterator CFC plus documentation and I've removed the reset() from the read() method of the active record object, archived the new code (honestly not huge changes) and uploaded another new build. I don't expect to publish any new builds for a little while after this, I promise. :)

New Build - Oct 27

I actually need to apologize for yesterday's build. I had made a change to the active record object so that it would clear its properties if you performed a read with a new ID, so for example:

<cfset ar = ds.getActiveRecord("my.record").init() />

<cfset ar.read(1) />
... do other stuff ...

<cfset ar.read(2) />
... do more stuff ...

The idea is that by clearing the properties when 2 is read in this example, you're less likely to get any bleed through of the data from the previous record, from properties that might be lazy-loaded where it doesn't fetch or create the property until the property is requested.

What happened is that I poorly implemented the change and it ended up resetting the properties and then attempting to read in new data from the db but after having reset itself it no longer had an ID to know which record to fetch. Oops!

So when I noticed that this morning I immediately fixed it and uploaded a new build for today.

Getting Rid of an Old Notebook?

A lot of folks have been talking lately about upgrading to the new MacBook Pro... or for that matter upgrading in general. And honestly I've been looking around at machines myself.

The only machine I have for development right now is an HP notebook with 1gb RAM (that was *with* an upgrade) and a 1ghz dual-core AMD Turion processor. It's a 64-bit processor although I was told by HP support that they wouldn't support me putting a 64-bit OS on it because there weren't drivers available for the hardware at the time. Anyway this machine cost me about $1600 new when I bought it a few years ago. Today I can go down the street to Staples and get any old generic, bottom-of-the-line HP notebook and for less than half what I spent on this machine it would be twice as fast and have 3x as much physical memory.

However right now I can't afford to upgrade. In lieu of that here's my pitch. If you're one of these guys who's currently upgrading and you've got and older notebook that you're not using anymore, you can contribute quite a lot to the continued development of the onTap framework and the DataFaucet ORM by donating your previous notebook.

My plan is to build a CLAM server (ColdFusion, Linux, Apache, MySQL). :) Then I'll disable the relevant server services on the notebook I have now where all my email and personal stuff is, and I'll do my primary development and testing on the new notebook with just those services on it.

Thanks. :)

New DataFaucet ORM Build - Fri Oct 17

Okay, so following up on the CF Meetup presentation yesterday that went so well (and thanks again to everyone who participated), I've uploaded a new build of the ORM framework today that addresses a couple of issues with sequences and with min/max filters on columns (with a data type of "real") that were seen in the presentation. I've also updated the schema exporter to support creating cross-reference tables using the cfproperty tag, as I said during the presentation so that wouldn't be a limitation anymore. And I've updated the documentation to show the creation of cross-reference tables using the new xref property.

Nice Surprise

Wow. I think there were actually 10 new downloads today right after the CF Meetup presentation earlier.

I think that's about 2/3rds of the people who attended.

I'm really flattered. :) When I've given presentations to other CFUG groups in the past there's not usually an immediate uptake like that - there's maybe a gradual stream of people downloading it... And I think that's even after having forgotten to give people the URL, so I guess either they assumed it was datafaucet.com or they googled it, but either way they went straight to download it after seeing today's presentation. That's really encouraging! :)

I need to get back to working on the handful of things I need to add and fix though, because I showed a couple of things in the presentation about sequences that are fixed in SVN but not in the latest download distribution yet. :) Hopefully there will be a new release in the next couple days with those changes.

More Entries

BlogCFC was created by Raymond Camden. This blog is running version 5.5.006.