CacheBox Presentation Recording

We had a great time with today's presentation of the new CacheBox framework!

We had an even dozen show up and a pleasant surprise at the end. There's a poll at the end of these ColdFusionMeetup.com presentations so the people who attended can anonymously rate the presentation. It asks "was this information useful?" and the options range from 1 star (it was a waste of time) to 5 stars (yes, very much). In this case an even half of the guys who came gave me 5 stars, so I feel really good about the presentation. :) There is certainly still room for improvement in my presentation skills, but I think this is great progress.

I've also used some of the feedback from this presentation and updated the intro to the documentation for the CacheBox project to clarify the benefits of using it.

Here's the presentation recording for anyone who missed it: https://admin.na3.acrobat.com/_a204547676/p26212200/

CacheBox Presentation

This latest version of DataFaucet includes integration of the new CacheBox project, the hot-swappable caching framework for ColdFusion. That's a great thing for DataFaucet because it means the DF cache is now much more configurable and you've got a handy management application that lets you look right into it, which wasn't there before.

Honestly the framework is still incomplete, but it is in a working state. We could use input into the management application, features (did we cover everything?), and in particular I'm really hoping to get some folks to provide some additional insight to help me finish writing the intelligence portion -- the methods that will allow CacheBox to auto-configure the cache based on usage patterns.

I hope you will all come join us for the discussion tomorrow at 6pm EST at the Online ColdFusion Meetup group.

Who's Using DataFaucet

A lot of open source projects have a "who's using" page on their website and I think this helps keep the projects active. So knowing that there have been quite a few downloads for DataFaucet in the past year or two, I'd like to get your help creating a Who's Using DataFaucet page for our site. You can reply on the blog here, post a note to the Google group, or if you'd prefer to remain anonymous you can email me at info@datafaucet.com.

Thanks!

New 1.1 Release Candidate

I just put up a new version of DataFaucet: 1.1, which is now a release candidate. This new version has a couple of new features.

1. Integrates with the new CacheBox caching framework that lets you hot-swap the caching engine and gives you a nice management application for all your applications (ColdBox, Wheels, onTap, DataFaucet, FuseBox, etc).

2. A new revision of the Persistence Service that I had started working on last year. This feature was inspired by discussion on two of Joe Rinehart's blog entries last year, titled "Does ColdFusion Have NO Real ORM Frameworks?" and "What Makes a Framework an ORM?" Out of those discussions I created this new system in DataFaucet that I think will appeal to developers who enjoy Dependency Injection (DI) frameworks like ColdSpring or Lightwire, and would like an ORM that will more or less "stay out of your way". It uses your existing DI Factory as a source for objects and will even generate all the database tables necessary to persist those objects.

Here's the documentation for the new Persistence Service:

http://www.datafaucet.com/persistenceservice.cfm

Enjoy!

Pros and Cons of an ORM

I just came across this blog thanks to a post on Hal Helms' Facebook page. Every design pattern we use has some advantages and some drawbacks; it's just the nature of the beast. We're essentially purchasing the advantages we want by paying with the drawbacks, so the question becomes: what is the advantage worth? In the case of a traditional ORM, for example, is it worth a certain loss of direct control over our persistence layer (database) in order to theoretically simplify our day-to-day coding tasks?

I think Bill Karwin made some good points in his blog and it's interesting because we tend not to talk about or hear much about the drawbacks of the design patterns we like to use. This is actually just a basic fact of human nature. For example, if you happen to be a big fan of Linux, you're liable to conveniently forget anything about Linux that's challenging or frustrating. And of course the same thing happens with people who are fans of Microsoft. But it's good to remind ourselves from time to time that you can only see polarization from the outside. By this I mean, a fan boy usually doesn't think he's a fan boy. ;)

This blog entry points out the effect of polarization so that you can see the obvious biases between Glenn Block who is an ORM enthusiast and Karwin who is obviously not. In many cases they even disagree about whether something belongs in the pro column or the con column. So looking at the same feature, they have exactly opposite reactions to it. It doesn't get much simpler than this example from Karwin's blog where he gives his opinion of something Block describes as an advantage of ORM systems:

Block: 4. Rich query capability.

Karwin: Absolutely wrong.

I'm always curious to know what other people really think about my software. DataFaucet seems to have been rather well received. But DataFaucet may also be difficult to categorize as either an ORM or simply a Data Access Layer (DAL). Steve Bryant describes his DataMgr tool as a DAL - it's much smaller and much simpler than Reactor, Transfer or DataFaucet. But this sort of thing really depends on who you ask, because according to Joe Rinehart, none of the tools billed as ORM for ColdFusion actually fit the definition of ORM. Joe's blog encouraged some additions to DataFaucet that are in source control but not yet officially released. Even with added features that make DataFaucet more like traditional ORM tools, though, does that make it an ORM? If you asked Bill Karwin, whose post inspired this article, I think he would say no. And here's why I think Karwin would say that DataFaucet is not an ORM:

Block: 2. Huge reduction in code.

Karwin: Depends. When executing simple CRUD operations against a single table, yes. When executing complex queries, most ORM implementations fail spectacularly compared to the simplicity of using SQL queries.

... Block: 4. Rich query capability.

Karwin: Absolutely wrong.

Block: 5. You can navigate object relationships transparently.

Karwin: This is definitely a negative rather than a positive. When you want a result set to include rows from dependent tables, do a JOIN. Doing the "lazy-load" approach, executing additional SQL queries internally when you reference columns of related tables, is usually less efficient. Leaving it up to the ORM internals deprives you of the opportunity to decide which solution is better.

Block: 6. Data loads are completely configurable ...

Karwin: This is not a benefit of an ORM. It is actually easier to achieve this using plain SQL.

My impression is that, although it might be a little slower, Karwin would not have these same objections to DataFaucet, which started its life as an attempt to abstract the SQL language in a way that Ben Forta declared impossible. The key word at the time was "portability", but in the process I managed to find ways to make querying the database not only portable, but easier as well. The ability to specify and/or keywords in search queries (think Google) is a prime example of a use case that is a big challenge in standard SQL but dead easy in DataFaucet. The reason is that, for all the ORM features in DataFaucet, it started as a "language". As far as I know, none of the other ORM systems for ColdFusion have approached this particular task.
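To make the and/or keyword idea concrete, here's a minimal sketch in Python with SQLite. It is not DataFaucet's API or its implementation; the function name keyword_filter and the product table are hypothetical, made up purely for illustration. It just shows the kind of translation an abstraction layer has to do to turn a Google-style search string into a parameterized SQL condition.

```python
import sqlite3

def keyword_filter(column, search):
    """Turn 'red and shoes or boots' into a parameterized SQL condition.

    Hypothetical helper for illustration only. The column name is inserted
    into the SQL text, so it must come from trusted code, never user input.
    """
    parts, params = [], []
    pending = None  # operator waiting to join the next search term
    for token in search.split():
        word = token.lower()
        if word in ("and", "or"):
            if parts:  # ignore a leading operator with nothing before it
                pending = word.upper()
            continue
        if parts:
            parts.append(pending or "AND")  # adjacent terms default to AND
        parts.append(f"{column} LIKE ?")
        params.append(f"%{token}%")
        pending = None
    return " ".join(parts) or "1=1", params

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (name TEXT)")
conn.executemany(
    "INSERT INTO product VALUES (?)",
    [("red shoes",), ("blue boots",), ("red hat",)],
)

condition, params = keyword_filter("name", "red and shoes or boots")
rows = conn.execute(
    f"SELECT name FROM product WHERE {condition} ORDER BY name", params
).fetchall()
print(condition)
print(rows)
```

Since SQL's AND binds tighter than OR, the flat condition reads as (red AND shoes) OR boots, which is roughly what most people expect from a search box. A real implementation would also need quoted phrases and grouping.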

That said, I think Joe Rinehart might have been wrong when he said ColdFusion doesn't have any real ORM systems. If what you want really is a traditional ORM system, I believe there actually is one for ColdFusion. It's a built-in part of the FarCry framework called FourQ. It wasn't included in Steve Bryant's comparison of DAL tools, I think primarily because it's inseparable from FarCry, which if memory serves is an 8MB download (that's compressed). According to their own documentation, the objective of FourQ is to ensure that as a programmer, the word "database" never enters your mind. I'm not sure how effective they've been at achieving that goal, since I don't use it with any regularity. It may work beautifully if that's what you're after. But it does mean that you won't get the kind of querying flexibility that a system designed as a querying language like DataFaucet will give you.

It's not my intention to promote or to bash anyone here (except of course obviously to promote DataFaucet). But I think Karwin made some good points and these are worth considering when choosing between DAL or ORM frameworks.

DataFaucet ORM API Documentation

Dan Lancelot just submitted to the mailing list this API documentation for the framework created using Mark Mandel's ColdDoc application.

Dan also noted a couple of leftovers in the documentation where I had used "save as" and then neglected to update the hint on the CFC. Oops!

Anyway, I decided to put it up on the framework site for folks and say thanks to Dan and Mark. Of course as new versions of the framework are published, we'll also update the API information, and I may actually decide to make it part of the distribution.

Query Optimization Hints - thanks to Rick Osborne and Ben Nadel

Ben Nadel posted this blog early this morning with a bunch of query optimization hints from Rick Osborne. Thanks Ben and thanks Rick!

I don't have a DB2 database handy, which is part of the reason why there's not currently a DB2 sql-agent in DataFaucet. There is however no reason why anyone couldn't simply create a sql-agent for DB2. These comments about query optimization however lead to some interesting thoughts about potential improvements for the sql-agents in general in terms of making the agent perform more efficiently for its target platform.

As an example:

Rick says: "Yes, put as much of the filtering as you can in ON clauses. Not only does it put the conditions where they are most relevant, but in some engines you'll get orders of magnitude better performance. The DB2/400 optimizer is so dumb (how dumb is it?) that if you put the conditions in the WHERE instead of the ON it will do the joins first, no matter how big the tables, and then only apply the conditions at the end. For extremely large tables, this is a nightmare."

By default, DataFaucet's query-builder automatically puts all the join conditions in the ON clauses and when performing a left join it properly places any filters on the joined table inside the ON clause as well, so that you can filter on a left join without turning it into an inner join in disguise.
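Here's a small sketch of the "inner join in disguise" problem, using Python and an in-memory SQLite database. The category and product tables are made up for the example and this is plain SQL rather than DataFaucet's query-builder. The only difference between the two queries is where the price filter lives.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE product (id INTEGER PRIMARY KEY, category_id INTEGER, price REAL);
    INSERT INTO category VALUES (1, 'shoes'), (2, 'hats');
    INSERT INTO product VALUES (1, 1, 20.0), (2, 1, 80.0), (3, 2, 15.0);
""")

# Filter inside the ON clause: every category is kept, and a category
# with no matching product gets NULL for the product columns.
on_rows = conn.execute("""
    SELECT c.name, p.id
    FROM category c
    LEFT JOIN product p ON p.category_id = c.id AND p.price > 50
    ORDER BY c.name
""").fetchall()

# Same filter in the WHERE clause: rows where p.price is NULL fail the
# test, so the unmatched category disappears, just like an inner join.
where_rows = conn.execute("""
    SELECT c.name, p.id
    FROM category c
    LEFT JOIN product p ON p.category_id = c.id
    WHERE p.price > 50
    ORDER BY c.name
""").fetchall()

print(on_rows)     # [('hats', None), ('shoes', 2)]
print(where_rows)  # [('shoes', 2)]
```

The "hats" category survives only when the filter is part of the ON clause, which is the behavior you usually want from a left join.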

But what struck me about Rick's comment here was that it would be pretty easy to write the sql-agent so that it places those filters first, before the join condition, to improve the query performance just on DB2. For that matter Rick mentions a number of potential optimizations that could be similarly handled inside the abstraction. DataFaucet doesn't currently handle anything in that kind of detail (for example, reordering tables based on size), but in the future it could. It's at least an idea worth keeping in mind for now. :)

Something else Rick said: "And no, the dream of having one query work perfectly on multiple engines is really just a dream."

If you're talking about flat out queries, yes, that may be true. Part of the reason why I started working on DataFaucet in the first place however, way back in the CF5 days, was to produce platform agnostic SQL. So while it may not be possible within an individual query, it might certainly still be possible within the abstraction. ;)

There are two other comments from Rick that I'd like to highlight here.

Rick said: "In most modern DBMSes, you almost don't need to index as long as you have Primary *and* Foreign keys set up. Joins are where indexes really shine, so proper keying will get you 90% of the way there."

Wouldn't you know, I've been trying to convince people to use foreign key constraints for years. ;) DataFaucet makes really good use of them and also makes them really easy to build. If you're using the built-in DDL features that allow the objects to automatically install tables, making a foreign key constraint is as easy as declaring a join (or easier I think). Here's an example from a previous article.

<cfcomponent output="false" extends="datafaucet.system.activerecord">
<cfproperty name="productid" type="uuid" required="true" key="1" />
<cfproperty name="productname" type="string" required="true" length="100*" />
<cfproperty name="productdescription" type="string" required="false" length="long*" />
<cfproperty name="productprice" type="numeric" required="true" length="real" />
<!--- create a foreign key constraint to ensure this product is placed in a category --->
<cfproperty name="categoryid" type="uuid" required="true" references="tblProductCategory.CategoryID" />

<cfset setTable("tblProduct") />
</cfcomponent>
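To show what a constraint like that buys you at the database level, here's a rough sketch in Python with SQLite. The table and column names mirror the component above, but the DDL is my own simplified assumption, not what DataFaucet actually generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite leaves foreign key enforcement off unless you ask for it.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE tblProductCategory (CategoryID TEXT PRIMARY KEY);
    CREATE TABLE tblProduct (
        productid   TEXT PRIMARY KEY,
        productname TEXT NOT NULL,
        categoryid  TEXT NOT NULL REFERENCES tblProductCategory (CategoryID)
    );
    INSERT INTO tblProductCategory VALUES ('cat-1');
""")

# A product in an existing category is accepted.
conn.execute("INSERT INTO tblProduct VALUES ('p-1', 'Boots', 'cat-1')")

# A product pointing at a category that doesn't exist is refused
# by the database itself, no application code required.
try:
    conn.execute("INSERT INTO tblProduct VALUES ('p-2', 'Orphan', 'no-such-category')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

product_count = conn.execute("SELECT COUNT(*) FROM tblProduct").fetchone()[0]
print(rejected, product_count)
```

The orphaned row never makes it into the table, so the data stays consistent no matter which application is doing the inserting.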

And lastly I'll just encourage you to go read the article on Ben's blog, because Rick made a really clear analogy that helps to explain "selectivity" of an index, which you may have also heard described as the "cardinality" of the data. It's a good help to understanding how indexes work to improve the performance of your queries.

New DataFaucet ORM Build Nov 2

Aaargh!

So after having had a really unusual amount of trouble related to the recent changes leading up to the addition of the iterator, I had declared after releasing the iterator CFC that there shouldn't be any new builds for a while. Not that there aren't things to work on; there certainly are planned enhancements. For example, right now reinstalling an active record object won't update any columns with modified data types, nor will it prepopulate an existing table with the default value for a new column. Both of those are planned for a future release...

What I hadn't expected however was a FREAKING TYPO in the active record object. Same place. Easy fix, regardless of your relative skill level, because the CF server tells you that "aguments.objectid" is undefined. I'm really surprised by this one actually, because I was pretty certain I'd tested that... but I guess I must have been experiencing source monitoring failure (a false memory) as often happens with the tasks we perform frequently. Of course TDD won't prevent this, because the more often you run your tests, the more likely you are to remember running them successfully when the reality is that you haven't run them.

And oddly enough, even though I thought I had tested it, I was also doing a bunch of work with other code that uses the active record, and that didn't show this problem either. Apparently the only place where I was calling read() with the argument was also a place where I was trapping and ignoring errors. That meant that even though the member plugin for the onTap framework was installing, it was omitting insertions for a number of db records it needed, and the security system was just generally not working because it wasn't loading the role and permission records correctly. Oops!

Now that's where TDD would have helped. If I'd had a set of regression tests set up for the member plugin and for DF, I could have run them and seen right away where the problem was, rather than having to go through all that troubleshooting. ;)

Ahh well, it gave me an excuse to enhance the member plugin a little more anyway and set it up to upgrade DF during installation. :)

Getting Rid of an Old Notebook?

A lot of folks have been talking lately about upgrading to the new MacBook Pro... or for that matter upgrading in general. And honestly I've been looking around at machines myself.

The only machine I have for development right now is an HP notebook with 1GB RAM (that was *with* an upgrade) and a 1GHz dual-core AMD Turion processor. It's a 64-bit processor, although I was told by HP support that they wouldn't support me putting a 64-bit OS on it because there weren't drivers available for the hardware at the time. Anyway this machine cost me about $1600 new when I bought it a few years ago. Today I can go down the street to Staples and get any old generic, bottom-of-the-line HP notebook, and for less than half what I spent on this machine it would be twice as fast and have 3x as much physical memory.

However right now I can't afford to upgrade. In lieu of that here's my pitch. If you're one of these guys who's currently upgrading and you've got an older notebook that you're not using anymore, you can contribute quite a lot to the continued development of the onTap framework and the DataFaucet ORM by donating your previous notebook.

My plan is to build a CLAM server (ColdFusion, Linux, Apache, MySQL). :) Then I'll disable the relevant server services on the notebook I have now where all my email and personal stuff is, and I'll do my primary development and testing on the new notebook with just those services on it.

Thanks. :)

New DataFaucet ORM Build - Fri Oct 17

Okay, so following up on the CF Meetup presentation yesterday that went so well (and thanks again to everyone who participated), I've uploaded a new build of the ORM framework today that addresses a couple of issues with sequences and with min/max filters on columns (with a data type of "real") that were seen in the presentation. I've also updated the schema exporter to support creating cross-reference tables using the cfproperty tag, as I said I would during the presentation, so that won't be a limitation anymore. And I've updated the documentation to show the creation of cross-reference tables using the new xref property.

Nice Surprise

Wow. I think there were actually 10 new downloads today right after the CF Meetup presentation earlier.

I think that's about 2/3rds of the people who attended.

I'm really flattered. :) When I've given presentations to other CFUG groups in the past there's not usually an immediate uptake like that - there's maybe a gradual stream of people downloading it... And I think that's even after having forgotten to give people the URL, so I guess either they assumed it was datafaucet.com or they googled it, but either way they went straight to download it after seeing today's presentation. That's really encouraging! :)

I need to get back to working on the handful of things I need to add and fix though, because I showed a couple of things in the presentation about sequences that are fixed in SVN but not in the latest download distribution yet. :) Hopefully there will be a new release in the next couple days with those changes.
