DataFaucet ORM API Documentation

Dan Lancelot just submitted to the mailing list this API documentation for the framework created using Mark Mandel's ColdDoc application.

Dan also noted a couple of leftovers in the documentation where I had used "save as" and then neglected to update the hint on the CFC. Oops!

Anyway, I decided to put it up on the framework site for folks and say thanks to Dan and Mark. Of course as new versions of the framework are published, we'll also update the api information, and I may actually decide make it part of the distribution.

SQL is often more complicated than it needs to be

A friend of mine just showed me this code sample from an ASP project he was asked to work on:

SELECT tb2.response, tb1.cnt AS pcnt, tb2.cnt AS tcnt
FROM

(SELECT COUNT(m.id) AS cnt, CAST(ms.response AS varchar(100)) AS response
FROM members AS m INNER JOIN
member_surveys AS ms ON ms.member_id = m.id
WHERE (m.activestate = 1) AND (LOWER(m.affiliation) IN ('customer', 'consultant'))
AND (ms.date_created >= 'xx/xx/xxxx')
AND (ms.date_created <= 'xx/xx/xxxx')
GROUP BY CAST(ms.response AS varchar(100))) AS tb1

RIGHT OUTER JOIN
(SELECT COUNT(m.id) AS cnt, CAST(ms.response AS varchar(100)) AS response
FROM members AS m INNER JOIN
member_surveys AS ms ON ms.member_id = m.id
WHERE (m.activestate = 1) AND (LOWER(m.affiliation) IN ('customer', 'consultant'))
GROUP BY CAST(ms.response AS varchar(100))) AS tb2 ON tb1.response = tb2.response

ORDER BY tb2.response

It took me a few minutes to figure out what the original developer was trying to do, but once I did, boy is it a lot easier than this. I guess maybe he wasn't aware that you can filter on a left join?

SELECT
left(ms.response,100) as response,
count(ms.response) as tcnt,
count(ms2.response) as pcnt

FROM members m

INNER JOIN join member_surveys ms on (ms.member_id = m.id)

LEFT JOIN member_surveys ms2 on (ms2.member_id = ms.member_id
AND left(ms2.response,100) = left(ms.response,100)
AND (ms2.date_created >= 'xx/xx/xxxx')
AND (ms2.date_created <= 'xx/xx/xxxx')
)

WHERE m.activestate = 1
AND LOWER(m.affiliation) IN ('customer', 'consultant')

GROUP BY LEFT(ms.response,100)
ORDER BY LEFT(ms.response,100)

In addition to being a lot cleaner, this is also apt to be much more efficient and execute a lot faster than the original query. The current tool set for DataFaucet will support this new much more efficient version. It will not support the horrid nigh indecipherable, sloppy and inefficient monstrosity above. :)

edit: although I'm actually not sure about joining on left(ms.response,100) with DataFaucet -- which really shouldn't be done that way either... if you needed that, you should probably be creating a view first that contains left(ms.response,100) as its own column and then performing your aggregate against the view instead... although based on past experience and the brute-force approach in the original developer's query I would guess that the survey table and the response column here are also poorly normalized...

edit 2: an ideal join here between the ms and ms2 tables (which are both aliases for the same table) would be on a primary key column like "surveyid" (or based on the sample simply "id") instead of joining both the member id and the response text. Using a primary key column on the member_survey table for that join would be immensely superior both in terms of simplicity and efficiency. I didn't introduce that into my substitute code only because I don't know from the original code if the member_survey table has a primary key column.

Liquify Converts Queries to DataFaucet For You

Thanks to Rob Parkhill for helping me identify a source of confusion for people interested in learning DataFaucet.

There's a tool in the DataFaucet documentation called "liquify". You can see the liquify page on the public site here where you can enter in a query from your application and when you submit the form it will show you how you can create that same query with the DataFaucet syntax. Like any tool of this nature it can't be perfect, although it should be able to convert the majority of queries. Primarily this tool is designed as a "learning aid" to help you get a feel for the syntax of DataFaucet by seeing the the before and after with your own queries from SQL to DataFaucet (and don't ask me how long I spent writing it!) :)

One of the more difficult challenges of publishing any software API is knowing how to write the documentation. In an ideal world, there would always be a separate technical writer handling documentation issues, conferring with the programmers to ensure his documentation is correct. But this begs the question, "isn't that adding more work? Wouldn't it be easier for the programmers to write their own documentation?"

In some ways this is true - the programmer already knows the system and so they don't have to ask questions about it, because they already know the answers. And that may save some time. The problem however lies in the fact that those questions don't get asked. And because those questions don't get asked, it's very easy for a programmer like myself to accidentally omit vital information when writing documentation for their own API. This happens precisely because we are so familiar with the system already that we don't know what questions an outside observer might ask. The end result is that we usually end up taking some things for granted.

The issue that Rob Parkhill helped me uncover is one of these issues of too much familiarity.

Originally the liquify tool would check the provided code for a cfquery tag with a name attribute and if it found a name, it would create code that would return a query, like this:

this original code:
<cfquery name="myQuery">
select * from mytable
</cfquery>

becomes this DataFaucet code:
<cfscript>
myQuery = ds.select("*","mytable");
</cfscript>

Now that looks fine (although you may personally like to remove the cfscript tags). You had originally created a query called "myQuery" and then the returned code also created the same query. But what if, seeing the liquify tool for the first time you didn't notice the comment at the top of the page that said you could include your cfquery tags, and made an intuitive guess that you should just include your sql code and not the cfquery tags. That's a logical assumption isn't it?

this original code:
select * from mytable

becomes this DataFaucet code:
<cfscript>
stmt = ds.getSelect("*","mytable");
</cfscript>

The difference here is pretty subtle. And looking at this output from liquify, you might think "great, there's my query in that stmt variable." Again having not seen it before, that's a pretty logical guess. It wouldn't work though because where ds.select() returns a query, ds.getSelect() returns an object.

The problem is that I hadn't anticipated this logical progression from newcomers. When I looked at the above code, I merely said to myself "ahh, there's no name for the query, so I can't create a query variable, I can only create a statement object". That made perfect sense to me because I was already familiar with the system. But if you try to use that stmt object as a query, you're going to get an ugly error message from ColdFusion telling you it's not a query. And if you're not already familiar with object-oriented (OO) concepts you might also not immediately realize that you need to execute the statement with stmt.execute() to get your query.

So to help clarify, I've modified the Liquify tool slightly so that it will always create a query for select statements.

DataFaucet, onTap and setProperties()

I often find myself amazed at how today's "new breed" of ColdFusion programmers are not only willing to sacrifice simplicity for no reason, but seem to actually prefer things to be outrageously complicated... a lot like Java. And really I think the only reason for it is because of unwarranted Java love. That's not to say that there aren't good things in Java. There are indeed good things in Java, however, the ColdFusion community in recent years has been unfortunately influenced by a halo effect, causing us to largely avoid the good parts of Java and acquire the bad parts or to acquire techniques that aren't appropriate for ColdFusion in spite of being perfectly appropriate for Java... leaving us with the worst of both worlds. See for example Ray Camden's recent frustration with JBoss (in the comments). The kind of frustration Ray demonstrates here is the crux of it, where he points out that "things don't *need* to be this difficult" (paraphrased).

And this is even after Sean Corfield himself no less, a man who is single-handedly responsible for a LARGE amount of popular technique in the ColdFusion community and who genuinely dislikes Ruby, recommended that perhaps the ColdFusion community ought to be spending a little more time paying attention to the Ruby community's thoughts about "beautiful code". Don't believe me? Check out Design Patterns and ColdFusion.

I talked about this a fair amount when I posted a previous blog entry on the onTap framework blog about duck-typing. And that's actually where the setProperties() method came from. SetProperties() is a method I've used in my "ducks" for a long time to provide a simple and elegant way of setting multiple properties for a given object with a structure or a query. Of course it also has a comparable getProperties() twin for returning all the defined properties of a given object. These have been supported aspects of the onTap framework for a long time even before there was DataFaucet and of course they've become supported parts of DataFaucet also.

Yet in the rest of the community I continually see people writing blog entries where they're going round and about in circles over "when/where/how do I set my properties on this object?!" Not only are they going round in circles over it, but they're often even excited to be doing it! I was recently reading one such article on the blog of Jason P. Dean where he was talking about using Transfer's getMemento() function and it turned out that not only was getMemento() not officially supported, it actually changed suddenly shortly after the blog was posted. And then today Scott Stroz posted an article on the Alagad blog titled Transfer, Model-Glue and makeEventBean(). You see where this is going don't you? :)

(Please note that this is not intended to be derogatory toward either Jason or Scott, they're both smart guys. I'm simply trying to make a case here and these were useful examples for my case.)

At the end of Scott's article he says "It has saved us quite a bit if time, and code." And what he's describing is a system that's still an order of magnitude more complicated than the way onTap/DataFaucet ducks work. And it is that way largely because Transfer and Model-Glue both see problems through the lens of Java and are therefore unable to see how much simpler things could be if they weren't carrying around all of Java's dead weight all the time.

Here's a relevant snippet of Scott's code:

<cfset var user = getUserService().getUser( arguments.event.getValue( "userId", 0 ) ) />
<cfset var company = getCompanyService().getCompany( arguments.event.getValue( "companyID", 0 ) ) />
<cfset user.setCompany( company) />
<cfset arguments.event.makeEventBean( user ) />

He then explains how you can simplify this code by creating yet more code as a decorator for your transfer object with functions like this:

<cffunction name="setCompanyId" access="public" returntype="void" output="false">
<cfargument name="companyId" type="numeric" required="true" />
<cfset var company = getTransfer().get("company",arguments.companyId)>
<cfset setCompany(Company)>
</cffunction>

And here's the equivalent of Scott's code (all of it) done with onTap and DataFaucet:

<cfparam name="attributes.userid" default="0" />
<cfset user = request.tap.getIoC("myapp").getBean("User").read(attributes.userid) />
<cfset user.setProperties(attributes) />

There are no bloated or complicated decorators in play here. There isn't any need for confusing method names like makeEventBean() (what the heck is that supposed to mean anyway?!) The most complicated part of this is the getIoC() and getBean() (instead of getUserService()), which isn't even part of what he was demonstrating, they're just different approaches to fetching the user object.

All the OO "purists" in the CF community ultimately always come around in the end to a solution that is similar if not darned nearly identical to the onTap/DataFaucet solution with one exception: it's needlessly overcomplicated. Scott's Model-Glue/Transfer article is merely another example. In the end, Scott created a system that would do what DataFaucet's "active record" objects do... but in order to get there, he had to create a decorator for his transfer object and a bunch of other tedious and overly complicated things... In essence he's just run a marathon and said "wow, look how I didn't have to run all the way across the continent! That was so much easier!" Of course, he hasn't realized that he could have just skipped the running all-together and casually strolled next door.

UPDATE: Minutes after posting this blog, I received an email from Twitter to let me know that Scott Stroz started following me. Coincidence? I wonder if that's because he wasn't offended or if it's because he's contemplating Sun-tzu's advice to "Keep your friends close, and your enemies closer." :)

Aha! Documentation formatting issue resolved...

If you've been staring at the documentation on the new DataFaucet site (http://www.datafaucet.com) or on your own server, and wondering why the formatting is all out of whack, then I have good news. I finally figured out what causes this -- and apparently it's the ColdFusion server's white-space management.

I don't use it on my own servers -- and the onTap framework allows me to handle most of my white-space issues by simply turning on EnableCFOutputOnly in the config.cfc, so my onTap framework projects end up having pretty clean result pages without the need for aggressive white space management. And so hence, since I never turned it on, I never noticed the problem with the framework documentation on my own machine.

It suddenly dawned on me today... I'd noticed last night that one of the examples on the official site was whacked (during a presentation no less!) and thought, "I'll have to fix that later"... And then today I realized that it wasn't just that one, it was all of them. I was hoping I could just set something in the onRequestStart to fix it, but no such luck. I've had to insert cfprocessingdirective tags in each page to prevent the aggressive white-space management from trashing my pre-formatted samples.

I don't think I could even put that in the custom tag that handles those samples, because the cfprocessingdirective tag requires an end tag, and if memory serves, you cant put the opening tag and the closing tag in separate halves of a flow-control like a cfif or a cfswitch, which is what you'd need to do to wrap the content of the tag.

Anyway, long story short, I finally fixed it and published a new zip archive. :)

And hopefully this may help some other folks who are scratching their heads about a similar issue with a HostMySite account.

Filtering On Left Join

This isn't anything new, in fact this has been part of the design of the ORM tools since before they were CFCs. But it's not in the documentation and I thought I might offer this insight here, some folks might find it useful. :) If so I may also add this to the wiki later.

DataFaucet doesn't really have "left" and "right" joins. It has "required" and "not required" joins. I made that decision a long time ago to simplify the join abstraction because I felt (and still feel) that the semantic of inner/outer joins is confusing and unhelpful. I have yet to find a situation in which I really need a right join, so the system doesn't actually support them. So instead you just "join" and you get to specify whether the join is required or not.

In other words if no matching records are found in the joined table, do you still want to see a record? If you do, that's a "left outer join" or in DataFaucet, an "optional join". If you don't want to see those records, that's an "inner join" or in DataFaucet a "required join", meaning that it requires data in the joined table to return a record.

Now for the really meaty part.

[More]

Gateways

Okay so I'm a little late noticing Sean Corfield's PDF from his presentation to the CF Meetup group in December. :) But loosely inspired by some things he mentioned in that presentation (here's a PDF of his slides) I decided to go ahead and add a Gateway object to the current version of DataFaucet. For a while I didn't really see much reason for a generic gateway object because the system is flexible enough to let you do "whatever". Sean's mention of the Ruby-style dynamic "findBy" methods that aren't really feasible with Java got me thinking more about it.

In addition to those Ruby-style dynamic FindBy methods, I've also added findByFilters(url) and search(form.searchPhrase).

[More]

Fusebox Lexicon

I just added a first draft of a lexicon for using the DataFaucet SQL library with Fusebox 5. It uspports select, insert, update and delete. I've tested select, filter, filtergroup, collection filter and even using a subquery as a filter. Nothing else is tested yet, and it doesn't include any support yet for using ActiveRecord as the medium for data access.

Waterlog

Well with any new project there are always a few things to work through. :) I had to fix a few more bugs in the Waterlog sample that were a result of my again trying to be "slick". ;) I had ignored the viewCount column because view counts are handled in memory and then realized that ignoring it causes it to insert a null in the column instead of the default 0.

I should probably try and revise the insert statement to avoid inserting into those columns, but it's just never been a big issue for me. There's always been an easy workaround, so I've always just used that. In this case since the view count is being captured by the viewcounter listener I'm just using that object to set the ignore on the column after it's known that the object has been inserted.

But while I was in there I also uncovered a few more issues with the cascade delete feature for many-to-many relationships. One issue is that it was throwing an error when attempting to delete with no foreign key constraint because the delete method was requiring an idlist argument that should have been defaulted to an empty string - easy fix.

The other one was a little more involved. If a foreign key constraint did exist, it was performing the delete twice, because the delete statement is configured to delete across foreign keys automatically. So I had to update the code that checks the relationship keys because it wasn't returning a "hasConstraint" value, which meant the delete method receiving the key information didn't know which relationships to ignore. It wasn't breaking anything, just executing extra delete queries that weren't needed. So those cascade deletes should perform better now.

Committed to SVN and the zip is updated.

Encapsulating Filters

So I'm working on this search tool for a project I'm on where we're not using any ORM tools. ColdBox and ColdSpring, but no ORM. And I come across this great scenario for encapsulating some filters and my mind starts running, trying to figure out a way to do that within the fairly standard gateway object I have here and nothing is coming to me. This is something that would be pretty easy to do with DataFaucet, so I figured I'd share the example here.

[More]

Waterlog Update

So I realized there were some issues with the install scripts for the Waterlog sample application... that'll teach me to try and be slick. :) Anyway I think I ironed out those problems and I uploaded a new zip archive of the project with the updated sample app. A couple of the articles in the samples directory (the docs) now link to the Waterlog sample, so I guess having it install properly would be a good thing. ;)

More Entries

BlogCFC was created by Raymond Camden. This blog is running version 5.5.006. | Protected by Akismet | Blog with WordPress