Wondering how to handle dimensions with no business keys? Well, here is a cool way out.
Recently I have been extracting data from XML files having nodes with all its attributes marked as NULLABLE in XSD schema.
Soon I realised that this table cannot have a definite business key. So now how one decides whether it is an update or an insert. No truncate and reload please :)
How about creating a key based all column combination (exclusing IDs and any non business related columns). Hmmm. Next thing in mind was to create a Hash code.
The good guy, SQL 2008 made my life easy as it provides a cool function called CHECKSUM(column1, column2....). So now generate a old value hash and then compare it with the new column value hash to decide whether the data need to be updated or inserted.
This strategy could also be used to avoid unnecessary updates when there is no change in column value and yet the data has come in the source.
A common example for this could be handling of Address dimension where we do not have any attribute that could be treated as a business key. [Street address, City, State, Country, Postal Code]
Here is a snapshot of how it is done in SSIS package.