Friday, September 5, 2008

Fan Traps and Chasm Traps

When consolidating numerous data sources, including databases and lists, care must be taken to issues besides finding proper keys or, alternatively, match codes. That is pretty daunting on its own. Worse comes, Chasm traps and Fan traps. They can mislead us into very wrongful and wasteful action in our direct marketing even if we have organized the database properly.

In many consolidated, operational data stores, we typically end up with a core record in a “fact” table that is associated with multiple “dimension” tables with characteristics for each record in the fact table. For example, a university Alumni table as our “fact”/core table, a Donations table and a Financial Aid table, among others that tell us more about the person in the fact table.

Someone in the Alumni table may have made 0, 1 or many donations and so have 0 to many associated records in the Donations table. Likewise, someone in the Alumni table may have received 0, 1 or many scholarships or loans to attend college. Many of our data associations in this type of consolidated customer database will have this 1 to many or 1-M quality.





This can result in very subtle errors and very misleading interpretations in the analysis we do on the database or in the more mundane operations such as mailings. Let’s say we want to test the hypothesis that Alumni who received financial aid are more likely to donate money than those who did not, a propensity to donate. The correct way to accomplish this is to run two queries, one a list of alumni and the sum of all amounts they donated, and the other a list of alumni and the sum of all financial aid they received. Then we merge the two results sets.

It is tempting instead to run one query that reads our alumni table and simultaneously sums both donations and financial aid in the related tables. In a SQL database we have just stepped into a chasm trap. Such a query will disproportionally count both the donations and the financial aid for those friends who have made more than 1 donation AND received more than 1 loans or scholarships. (see Business Intelligence Blog, p 1 or IDS, pp 1-3 or http://db.grussell.org/resources/pdf/co22001%20notes.pdf

References
International Documentation Solution (April 14, 2000). Recognizing and resolving Chasm and Fan traps when designing. Retrieved on August 28, 2008 from http://www.eagle.co.nz/businessobjects/pdfs/ttchasm.pdf

Napier University (August 2002). Database SystemsStudent Notes. Retrieved on August 28, 2008 from http://db.grussell.org/resources/pdf/co22001%20notes.pdf

No comments: