DataFaucet ORM API Documentation

Dan Lancelot just submitted to the mailing list this API documentation for the framework created using Mark Mandel's ColdDoc application.

Dan also noted a couple of leftovers in the documentation where I had used "save as" and then neglected to update the hint on the CFC. Oops!

Anyway, I decided to put it up on the framework site for folks and say thanks to Dan and Mark. Of course as new versions of the framework are published, we'll also update the api information, and I may actually decide make it part of the distribution.

Query Optimization Hints - thanks to Rick Osborne and Ben Nadel

Ben Nadel posted this blog early this morning with a bunch of query optimization hints from Rick Osborne. Thanks Ben and thanks Rick!

I don't have a DB2 database handy, which is part of the reason why there's not currently a DB2 sql-agent in DataFaucet. There is however no reason why anyone couldn't simply create a sql-agent for DB2. These comments about query optimization however lead to some interesting thoughts about potential improvements for the sql-agents in general in terms of making the agent perform more efficiently for its target platform.

As an example:

Rick says: "Yes, put as much of the filtering as you can in ON clauses. Not only does it put the conditions where they are most relevant, but in some engines you'll get orders of magnitude better performance. The DB2/400 optimizer is so dumb (how dumb is it?) that if you put the conditions in the WHERE instead of the ON it will do the joins first, no matter how big the tables, and then only apply the conditions at the end. For extremely large tables, this is a nightmare."

By default, DataFaucet's query-builder automatically puts all the join conditions in the ON clauses and when performing a left join it properly places any filters on the joined table inside the ON clause as well, so that you can filter on a left join without turning it into an inner join in disguise.

But what struck me about Rick's comment here was that it would be pretty easy to write the sql agent so that it places those filters first before the join condition to improve the query performance just on DB2. For that matter Rick mentions a number of potential optimizations that could be similarly handled inside the abstraction. DataFaucet doesn't currently handle anything in that kind of detail, for example, reordering tables based on size, etc. but in the future it could. It's at least an idea worth keeping in mind for now. :)

Something else Rick said: "And no, the dream of having one query work perfectly on multiple engines is really just a dream."

If you're talking about flat out queries, yes, that may be true. Part of the reason why I started working on DataFaucet in the first place however, way back in the CF5 days, was to produce platform agnostic SQL. So while it may not be possible within an individual query, it might certainly still be possible within the abstraction. ;)

There are two other comments from Rick that I'd like to highlight here.

Rick said: "In most modern DBMSes, you almost don't need to index as long as you have Primary *and* Foreign keys set up. Joins are where indexes really shine, so proper keying will get you 90% of the way there."

Wouldn't you know, I've been trying to convince people to use foreign key constraints for years. ;) DataFaucet makes really good use of them and also makes them really easy to build. If you're using the built-in DDL features that allow the objects to automatically install tables, making a foreign key constraint is as easy as declaring a join (or easier I think). Here's an example from a previous article.

<cfcomponent output="false" extends="datafaucet.system.activerecord">
<cfproperty name="productid" type="uuid" required="true" key="1" />
<cfproperty name="productname" type="string" required="true" length="100*" />
<cfproperty name="productdescription" type="string" required="false" length="long*" />
<cfproperty name="productprice" type="numeric" required="true" length="real" />
<!--- create a foreign key constraint to ensure this product is placed in a category --->
<cfproperty name="categoryid" type="uuid" required="true" references="tblProductCategory.CategoryID" />

<cfset setTable("tblProduct") />
</cfcomponent>

And lastly I'll just encourage you to go read the article on Ben's blog, because Rick made a really clear analogy that helps to explain "selectivity" of an index, which you may have also heard described as the "cardinality" of the data. It's a good help to understanding how indexes work to improve the performance of your queries.

New DataFaucet ORM Build Nov 2

Aaargh!

So after having had a really unusual amount of trouble related to the recent changes leading up to the addition of the iterator, I had declared after releasing the iterator CFC that there shouldn't be any new builds for a while. Not that there aren't things to work on, there certainly are planned enhancements. For example right now reinstalling an active record object won't update any columns with modified data types, nor will it prepopulate an existing table with the default value for a new column. Both of those are planned for a future release...

What I hadn't expected however was a FREAKING TYPO in the active record object. Same place. Easy fix, regardless of your relative skill level, because the CF server tells you that "aguments.objectid" is undefined. I'm really surprised by this one actually, because I was pretty certain I'd tested that... but I guess I must have been experiencing source monitoring failure (a false memory) as often happens with the tasks we perform frequently. Of course TDD won't prevent this, because the more often you run your tests, the more likely you are to remember running them successfully when the reality is that you haven't run them.

And oddly enough even though I thought I had tested it, I was also doing a bunch of work with other code that uses the active record and that didn't show this problem either apparently because the only place where I was calling read() with the argument apparently I was also trapping and ignoring errors. Which meant that even though the member plugin for the onTap framework was installing, it was omitting insertions for a number of db records it needed and the security system was just generally not working because it wasn't loading the role and permission records correctly. Oops!

Now that's where TDD would have helped. If I'd had a set of regression tests set up for the member plugin and for DF, I could have run them and seen right away where the problem was, rather than having to go through all that troubleshooting. ;)

Ahh well, it gave me an excuse to enhance the member plugin a little more anyway and set it up to upgrade DF during installation. :)

Getting Rid of an Old Notebook?

A lot of folks have been talking lately about upgrading to the new MacBook Pro... or for that matter upgrading in general. And honestly I've been looking around at machines myself.

The only machine I have for development right now is an HP notebook with 1gb RAM (that was *with* an upgrade) and a 1ghz dual-core AMD Turion processor. It's a 64-bit processor although I was told by HP support that they wouldn't support me putting a 64-bit OS on it because there weren't drivers available for the hardware at the time. Anyway this machine cost me about $1600 new when I bought it a few years ago. Today I can go down the street to Staples and get any old generic, bottom-of-the-line HP notebook and for less than half what I spent on this machine it would be twice as fast and have 3x as much physical memory.

However right now I can't afford to upgrade. In lieu of that here's my pitch. If you're one of these guys who's currently upgrading and you've got and older notebook that you're not using anymore, you can contribute quite a lot to the continued development of the onTap framework and the DataFaucet ORM by donating your previous notebook.

My plan is to build a CLAM server (ColdFusion, Linux, Apache, MySQL). :) Then I'll disable the relevant server services on the notebook I have now where all my email and personal stuff is, and I'll do my primary development and testing on the new notebook with just those services on it.

Thanks. :)

New DataFaucet ORM Build - Fri Oct 17

Okay, so following up on the CF Meetup presentation yesterday that went so well (and thanks again to everyone who participated), I've uploaded a new build of the ORM framework today that addresses a couple of issues with sequences and with min/max filters on columns (with a data type of "real") that were seen in the presentation. I've also updated the schema exporter to support creating cross-reference tables using the cfproperty tag, as I said during the presentation so that wouldn't be a limitation anymore. And I've updated the documentation to show the creation of cross-reference tables using the new xref property.

Nice Surprise

Wow. I think there were actually 10 new downloads today right after the CF Meetup presentation earlier.

I think that's about 2/3rds of the people who attended.

I'm really flattered. :) When I've given presentations to other CFUG groups in the past there's not usually an immediate uptake like that - there's maybe a gradual stream of people downloading it... And I think that's even after having forgotten to give people the URL, so I guess either they assumed it was datafaucet.com or they googled it, but either way they went straight to download it after seeing today's presentation. That's really encouraging! :)

I need to get back to working on the handful of things I need to add and fix though, because I showed a couple of things in the presentation about sequences that are fixed in SVN but not in the latest download distribution yet. :) Hopefully there will be a new release in the next couple days with those changes.

New DataFaucet ORM Presentation Recording

Just wanted to let everyone know that the presentation recording for today's CF Meetup presentation is now available at this url:

https://admin.na3.acrobat.com/_a204547676/p71497487/

Thanks to Charlie Arehart for all the excellent work he does managing the CF Meetup group to help us share our knowledge. And thanks to all the attendees for coming and giving me the opportunity to share. :)

Upcoming ORM Presentations

I wanted to post here briefly to let everybody know about a few upcoming presentations.

Oct. 14 7:30pm CST - DFW CFUG managed by Dave Shuck

Oct. 16 12:00pm EST - Online ColdFusion Meetup managed by Charlie Arehart

Oct. 27 6:00pm BST - Devon UK CFUG managed by John Whish

If you'd like to schedule a presentation for your group, you can contact me at info@turnkey.to. :)

SQL Injection

I find myself rather wishing that there were a horde (or maybe even a handful) of high-profile ColdFusion sites using DataFaucet for their data access layer... Why? Because in order for someone to get a SQL injection past you with DataFaucet, you have to go out of your way to let it happen.

Three things you should never ever, ever do with DataFaucet.

  1. put raw user-supplied variables into the select list of a query (or the groupby or having clauses)
  2. put ANY user-supplied variables into the orderby property of a select statement (use SORT, it's designed specifically for that purpose)
  3. put ANY user-supplied variables into the content property of a dynamic filter (i.e. stmt.sqlFilter("column",form.x,"in"))

Never ever do those three things and you should never have to worry about ... well at least not this sql injection attack... I don't want to say never because it's theoretically possible that future versions of ColdFusion (or smarter hackers) could produce new vulnerabilities and that may even include vulnerabilities in cfqueryparam. But for today, cfqueryparam keeps you safe and using DataFaucet and not doing these three things will protect you.

I've been hearing a lot about this problem today specifically... the company I work for has a few clients with some older sites that were affected... I sent an email with a couple of links to my bosses and the other programmers today, since I was the only one who'd read about it prior to the attack and already knew where to find the info.

So for anyone else who may yet be looking for the info, here are those links:

Russ Michaels and Mark Kruger talk about protecting yourself from SQL Injection.

-- and --

Hackers talk about something other than protecting yourself from SQL Injection.

I'm even tempted to say read it quick, because I suspect they might change the url on that article once they realize that CF programmers are reading it off of Google's index.

Good luck!

p.s. I would also be leery of acunetix!

Status Update

Hi all,

I haven't dropped off the earth. :) I still haven't started working on those 7 feature requests yet because I've been really busy trying to wrap up some major revisions on my other framework projects in the past couple weeks. And yesterday and the day before I worked a 14hr and 12hr day respectively at my 9-5 job.

In the meanwhile, the subject of caching has been on my mind a lot lately and I've come to realize that I wasn't aware of some of the new tools available for cache management in ColdFusion. Not native tools mind you, external tools say from newer versions of the JVM. Which has inspired me to start onto a project called CacheBox that will be pluggable so it can be used in both DataFaucet and the onTap framework and hopefully most other ColdFusion frameworks as well.

Request for Assistance - Metrics

One of the questions that was asked during the cfmeetup presentation is with regard to performance. The specific question was, are there any performance metrics against large tables, for example over 5-million records.

While I could create a table with 5-million records, I wonder if anyone else perhaps already has a large database they could test this with, perhaps at their office?

If I were working for a company that had a large database myself right now, I would just take a backup of the production database onto the staging server and test it there. That's not available to me right now, so the options are create the big db manually on my notebook, which I could do, but would prefer not to have to, or preferably if someone else has the ability to do some testing I'd appreciate the help. :)

If so give me a shout ( info@turnkey.to or you can catch me on AIM as "isaacdealey" - I may set up a twitter account ) and I'll hash out a small handful of apples-to-apples tests for select, insert/update and delete queries.

Thanks

p.s. I'm planning a follow-up presentation to give more examples of using the system. My hope is to have some metrics and by then I may have some additional integration utilities built for ColdSpring or LightWire similar to the IoCFactory I created for the onTap framework over this past weekend.

More Entries

BlogCFC was created by Raymond Camden. This blog is running version 5.5.006.