I'm the network administrator for a Wireless ISP. Client information is done using Surrogate Keys. However, one of our databases is used for tracking the various pieces of network equipment. In this case, the MAC address makes a very Natural and very useful key. Adding a surrogate key in this case would add unnecessary complexity to an otherwise very simple database.
The real rule in which type of key to use is KISS: "Keep It Simple S****d".
Discussion on:
View:
Show:
Sorry Mike, the real rule in which type of key to use is to do it right. A MAC address can be changed. Yes I know, it is not likely to happen, still it is possible and does happen if a NIC is replaced.
The KISS principle is fun but is not valid and doesn't work for database design above the hobby level.
/gustav
The KISS principle is fun but is not valid and doesn't work for database design above the hobby level.
/gustav
And that is why a properly indexed table with PK constraints (if appropriate) are better in terms of database building, deployment and scalability.
To simply say "the NIC won't change" is ignoring a potential DR problem that could leave you right in the brown and smelly!
To simply say "the NIC won't change" is ignoring a potential DR problem that could leave you right in the brown and smelly!
Exactly. A key is almost always all the better for NOT being a piece of 'real' data. We recently had to change the motherboard in our server, and our spam filter didn't recognise us any more because it used the MAC address to identify us. I entirely understand why they did that, but we had to add messing about with that to the other pains of having the server down.
A classic case of where you should NOT use a natural key. Although many people assume that MAC addresses are unique, the certainly aren't. For instance, here's the instructions on the "WAN > MAC address" screen of the management console of one popular WAP/Router"
"Some ISPs require that you clone (copy) the MAC address of your computer's network card into the Router. If you are not sure then simply clone the MAC address of the computer that was originally connected to the modem before installing the Router."
In this situation there are always two pieces of equipment with the same MAC address.
Also are 01-23-45-67-89-ab and 01:23:45:67:89:ab two different keys?
The real rule is - always use a surrogate.
"Some ISPs require that you clone (copy) the MAC address of your computer's network card into the Router. If you are not sure then simply clone the MAC address of the computer that was originally connected to the modem before installing the Router."
In this situation there are always two pieces of equipment with the same MAC address.
Also are 01-23-45-67-89-ab and 01:23:45:67:89:ab two different keys?
The real rule is - always use a surrogate.
... and VMware and other virtualization products also allow MAC manipulation.
simplistic.
I can think of more than a few exceptions to your MAC address is unique rule.
Okay it's not likely as spoofing isn't a generally popular technique wuithin business IT, and cards tend to be on board now, but it is possible and it will catch you by surprise, and make your system look daft to all concerned, all to save one identity column....
I can think of more than a few exceptions to your MAC address is unique rule.
Okay it's not likely as spoofing isn't a generally popular technique wuithin business IT, and cards tend to be on board now, but it is possible and it will catch you by surprise, and make your system look daft to all concerned, all to save one identity column....
It isn't a bad example. As others have pointed out it may change in the real world and you could end up with an orphaned record that would skew a lot of your reports.
Just make sure you don't create a clustered index on this column (or GUIDs) as the random nature can cause fragmentation and performance issues.
Just make sure you don't create a clustered index on this column (or GUIDs) as the random nature can cause fragmentation and performance issues.
MAC addresses are an example of a bad natural key, unless you're storing them as integers with 16 insignificant bits. They are a long string and that's not good. The article kinda fails to mention that a good key will be SMALL! That's why 32-bit integers work nicely - because the processor can do comparisons on those in ONE TICK. Whereas with a string, if only the last character is different, you have to load every character from both strings onto the processor and compare. You've got to think about what goes on in the guts of the machine - the data type is super important.
However, I agree with your point - natural keys are best when the key is appropriate - a customer number, or an order number, if it's an integer *and stored that way*
I'm working on a system now where they screwed this up both ways - they added surrogate keys to tables with great primary keys already, and they stored order numbers as strings, and didn't add a surrogate key.
This vendor is plagued with cargo-cult thinking - always do this, or always do that, and they don't know the reasons why. So when they were told that every table had to have a surrogate key, nobody questioned it. And then they didn't even implement the policy very well.
Don't get into that kind of thinking. If someone says "X is always wrong" then they are wrong, period. Approach each situation individually, and be aware of what works in one case might not work in another case.
The basic point is, IF you have a natural key in your data, and it's a short number, then you should use it as the key. If you don't have that, generate one.
However, I agree with your point - natural keys are best when the key is appropriate - a customer number, or an order number, if it's an integer *and stored that way*
I'm working on a system now where they screwed this up both ways - they added surrogate keys to tables with great primary keys already, and they stored order numbers as strings, and didn't add a surrogate key.
This vendor is plagued with cargo-cult thinking - always do this, or always do that, and they don't know the reasons why. So when they were told that every table had to have a surrogate key, nobody questioned it. And then they didn't even implement the policy very well.
Don't get into that kind of thinking. If someone says "X is always wrong" then they are wrong, period. Approach each situation individually, and be aware of what works in one case might not work in another case.
The basic point is, IF you have a natural key in your data, and it's a short number, then you should use it as the key. If you don't have that, generate one.
In the common case of the 'system' generating the next order number, I'd say that makes it very like an autonumber, even if you're not actually using a native number generator, but some equivalent code. The order number doesn't mean anything, it's just a short unique serial number - exactly what a key needs to be. It doesn't need to be something always hidden from the user, it can be a good key, as well as being something humans use.
A surrogate key has two qualities by definition: First, it is unique. Second it does *NOT* have any reason to change (achieved by never having a real-world meaning).
Even if a table's record is made unique by a single field (e.g. order number), which seemingly could serve as the key, *TOMORROW* the requirements to change in such a way as to change the nature of that key (e.g. an order is now made unique by order-number *AND* client id).
Even if a table's record is made unique by a single field (e.g. order number), which seemingly could serve as the key, *TOMORROW* the requirements to change in such a way as to change the nature of that key (e.g. an order is now made unique by order-number *AND* client id).
I agree exactly with this IF you use a natural key in your data then do not duplicate and complicate things. It's way too easy to get over complicated in a simpler task and get carried away with programing. It's especially important to be flexible and exact and then be able to alter the architecture accordingly.
reduces flexibility and decreases exactitude.
Surrogate keys are not duplication, nor do they add some major complexity overhead.
OrderID (Surrogate) and OrderNumber natural are not the same thing.
Order ID will never change, OrderNumber could.
Okay yes it's a bit more work, but if something horrible happens like ordernumber was an int and is now alphanumeric.
Use surrogates and the amount of work required to implement is relatively trivial and near zero risk.
Use natural ones and you are all over the place.
They will save your ass.
Good habits are just as hard to break as bad ones.
The default option should always be surrogates and only in extreme circumstances which most definitely does not include the burden one extra column on a table, they should be used.
I've lost count of the number of times someone 'simplified' a schema or were just ignorant, and caused me major palpitations WHEN things changed.
They always do.
Look for reasons to do it right, not excuses to do it wrong.
Surrogate keys are not duplication, nor do they add some major complexity overhead.
OrderID (Surrogate) and OrderNumber natural are not the same thing.
Order ID will never change, OrderNumber could.
Okay yes it's a bit more work, but if something horrible happens like ordernumber was an int and is now alphanumeric.
Use surrogates and the amount of work required to implement is relatively trivial and near zero risk.
Use natural ones and you are all over the place.
They will save your ass.
Good habits are just as hard to break as bad ones.
The default option should always be surrogates and only in extreme circumstances which most definitely does not include the burden one extra column on a table, they should be used.
I've lost count of the number of times someone 'simplified' a schema or were just ignorant, and caused me major palpitations WHEN things changed.
They always do.
Look for reasons to do it right, not excuses to do it wrong.
The author states "Keys are used by the RDBMS, not users" Certainly this isn't true. Users normally need a way of uniquely identifying the things recorded in the database. Duplicate data is no use to anyone.
The article continues with the advice to "apply indexes appropriately to avoid duplicate records". I guess she means a unique index - one that enforces a uniqueness (AKA a key) as a way of preventing duplicate data that would otherwise be visible to users. In other words a natural key.
The article continues with the advice to "apply indexes appropriately to avoid duplicate records". I guess she means a unique index - one that enforces a uniqueness (AKA a key) as a way of preventing duplicate data that would otherwise be visible to users. In other words a natural key.
Surrogate OrderNo LineNo
1 1 1
2 1 2
Primary key is surrogate
Index is OrderNo and LineNo.
1 1 1
2 1 2
Primary key is surrogate
Index is OrderNo and LineNo.
Presumably (OrderNo, LineNo) would be the natural key in that case. I would expect both keys to be implemented, not just the surrogate.
surrogate would be a primary key and the to get some integrity you'd make the other pair a unique key.
The thing that was foolishly missing from my example was that would would also have a surrogate key from Orders so you could set up a foreign key.
Order number/line number is whatthe user sees, so for instance if you deleted lineno three from a UI, you could shuffle up 4 and 5 without changing your keys.
Many times the benefits in some of these common problems are a bit hard to nail down. But keeping the habit and having a very powerful tool for exceptions at your disposal for such a trivial amount of effort and cost...
The thing that was foolishly missing from my example was that would would also have a surrogate key from Orders so you could set up a foreign key.
Order number/line number is whatthe user sees, so for instance if you deleted lineno three from a UI, you could shuffle up 4 and 5 without changing your keys.
Many times the benefits in some of these common problems are a bit hard to nail down. But keeping the habit and having a very powerful tool for exceptions at your disposal for such a trivial amount of effort and cost...
So each table would have (at least) one other candidate key in addition to the surrogate. From a data integrity perspective, implementing the natural keys is much more important than the question of whether the table also has a surrogate. This is why I don't understand people who say "don't use natural keys, use surrogates". It's as if they believe that a table should only have ONE key - how strange!
What they should mean is don't use a natural key as your primary key.
If you can gain either referential integrity (can't have two line 3s on an order), or performance Select using the natural keys from a front end, you'd be mad not to add a unique key.
If you can gain either referential integrity (can't have two line 3s on an order), or performance Select using the natural keys from a front end, you'd be mad not to add a unique key.
Well, you can call a unique index a natural key if you like, but unless you use it as one, I can't see how it matters.
If the database implements any kind of constraint that prevents duplicate values in a table then that is by definition an implementation of a key. Whether an index is involved or not is just an implementation detail.
A primary key is just a candidate key and does not mean anything different from any other candidate key. If the author really means "minimise the use of business data in foreign key references" then it would be better to say that instead of making vague and arbitrary generalisations about keys.
A primary key is just a candidate key and does not mean anything different from any other candidate key. If the author really means "minimise the use of business data in foreign key references" then it would be better to say that instead of making vague and arbitrary generalisations about keys.
When I design a database, the first thing I do is talk to the users about what reports they want. If they don't want to report on something, then it ain't worth tracking. Too many user groups get absorbed in what they _can_ track without worrying about whether it is useful reporting on it after a week or a month or a year.
Same with DB designers. Normalization can put _all_ the data in perspective but if I'm only going to report out from views, then normalization is only needed for performance issues, not design issues. A view _is_ denormalizing your data and if you can live with the [non-complex] view, then you can probably live with the abnormal table, if it is properly indexed with correct primary keys for reporting. (Finding the right fit between performance and db design is the essence of data warehousing).
Same with DB designers. Normalization can put _all_ the data in perspective but if I'm only going to report out from views, then normalization is only needed for performance issues, not design issues. A view _is_ denormalizing your data and if you can live with the [non-complex] view, then you can probably live with the abnormal table, if it is properly indexed with correct primary keys for reporting. (Finding the right fit between performance and db design is the essence of data warehousing).
That the database's purpose is reporting.
You normalise to get referential integrity. Referential integrity is for data entry, not analysis and reporting.
If both collection and analysis have equal weight then any any compromise is going to be negative somewhere important, so you split the functions in to two databases and have some form of replication.
You normalise to get referential integrity. Referential integrity is for data entry, not analysis and reporting.
If both collection and analysis have equal weight then any any compromise is going to be negative somewhere important, so you split the functions in to two databases and have some form of replication.
The first major database I developed, collection and analysis held equal weight; the analysis had much more weight. As much as I tried to normalize as much as i could, there was no getting around that at least 10 data fields (with different field names of course) had to be repeated through out the majority of the tables. Why? Because of the analysis. So whenever I enforced the Referential Integrity, later on down the line in subsequent tables and even in the SQL queries, there would always be problems.
a cost, the question is, is it worth it?
Slow complex reports and analyis versus the very real possibilty of analysing garbage data very fast....
If neither is acceptable then two databases is the only way to go, otherwise every attempt to improve one aspect of the functionality, mars the other.
Slow complex reports and analyis versus the very real possibilty of analysing garbage data very fast....
If neither is acceptable then two databases is the only way to go, otherwise every attempt to improve one aspect of the functionality, mars the other.
"Slow complex reports and analysis" was more important than fast analysis. The data was imported into SPSS For Windows, which I did as well. It was REALLY important for the analysis to be clean and accurate.
The database was developed for the division of a small college I worked for. The specific program that the data came from is very small. The student number would not exceed 60 every academic semester; but in reality, the number would be between 3-10 students. During each the analysis wasn't just kept in the division and program I worked in. it went start to the top with the Office of The President and the Board of Trustees of the college. Also, it was reported to the U.S. Dept of Education as well.
Now, I did have a Test database that I used, but i didn't use it as much as I should have
The database was developed for the division of a small college I worked for. The specific program that the data came from is very small. The student number would not exceed 60 every academic semester; but in reality, the number would be between 3-10 students. During each the analysis wasn't just kept in the division and program I worked in. it went start to the top with the Office of The President and the Board of Trustees of the college. Also, it was reported to the U.S. Dept of Education as well.
Now, I did have a Test database that I used, but i didn't use it as much as I should have
I have had more than one occasion over the years where I had to move a peice of equipment to a different network segment because the MAC was duplicated.
The odds are you will not have a duplicate MAC on the same network segment, but it happens.
The odds are you will not have a duplicate MAC on the same network segment, but it happens.
#1) A quick Review of SQL For Dummies or "Learn SQL in 24 Hours" is all I [as a developer] need to know about RDBMS & the SQL Language to build or edit a database.
#2) Cursors are not only fine to use because they work but they are also much easier to use/understand.
Number 1 is far common amongst non-DBA types who work in development and number 2 above is just as prevalent with those who learned to code before they learned to query.
The problem is that the DBA thinks in terms of data sets whereas the developer thinks in terms of iterations or individual items.
To a DBA the goal is to do it in as few steps as possible and in as efficient a manner as possible. For a developer the goal is to simply get it to work. Here-in lies the reason for why Cursors, the evil ant-effecient curse n the SQL Language is often embraced and embraced passionately by the procedural and Object Oriented programmer.
I was luck to learn SQL & RDBMS first and then go into OOP development using C++ and later VB/VB.Net. This order allowed me to learn how to work with sets of data and understand how to before learning the more common non-set based approach used in the OO/Procedural code world.
While bad design consisting of poorly selected and implemented Keys, missing and or badly designed Indexes along with little to no normalization can certainly trash your DB Design, one can still undo everything thru the equally poor design and use of that database by using procedural like methods to query that data. If you have an outstanding DB that is perfectly designed it can be for nothing if you then access/use that DB thru poor query choices such as using a Cursor when it should NOT be used.
While this is certainly not true for all developers it is still too common; the attitude that you don?t need a proper understanding and working knowledge of RDBMS and the SQL Language will be your undoing when trying to connect to and use a Relational Database such as Oracle or SQL Server.
#2) Cursors are not only fine to use because they work but they are also much easier to use/understand.
Number 1 is far common amongst non-DBA types who work in development and number 2 above is just as prevalent with those who learned to code before they learned to query.
The problem is that the DBA thinks in terms of data sets whereas the developer thinks in terms of iterations or individual items.
To a DBA the goal is to do it in as few steps as possible and in as efficient a manner as possible. For a developer the goal is to simply get it to work. Here-in lies the reason for why Cursors, the evil ant-effecient curse n the SQL Language is often embraced and embraced passionately by the procedural and Object Oriented programmer.
I was luck to learn SQL & RDBMS first and then go into OOP development using C++ and later VB/VB.Net. This order allowed me to learn how to work with sets of data and understand how to before learning the more common non-set based approach used in the OO/Procedural code world.
While bad design consisting of poorly selected and implemented Keys, missing and or badly designed Indexes along with little to no normalization can certainly trash your DB Design, one can still undo everything thru the equally poor design and use of that database by using procedural like methods to query that data. If you have an outstanding DB that is perfectly designed it can be for nothing if you then access/use that DB thru poor query choices such as using a Cursor when it should NOT be used.
While this is certainly not true for all developers it is still too common; the attitude that you don?t need a proper understanding and working knowledge of RDBMS and the SQL Language will be your undoing when trying to connect to and use a Relational Database such as Oracle or SQL Server.
knowledge and SQL amongst developers, but the lack that is most catastrophic is that of client server.
Lesson one is do it on the server. With that onboard you learn and use SQL.
Perhaps a bit more math would help but beyond dealing with individual members it's quite natural to think of sets in programming as well. I certainly do, but then I learnt databases when the nifty tools we have now weren't available and SQL was easier than writing bucketloads of code to interact with with the database.
It's been easier to do it wrong for a while now and bad habits learnt early are very hard to break, as of course are good ones....
Everytime someone recommends learning relational databases with access, I cringe.
Lesson one is do it on the server. With that onboard you learn and use SQL.
Perhaps a bit more math would help but beyond dealing with individual members it's quite natural to think of sets in programming as well. I certainly do, but then I learnt databases when the nifty tools we have now weren't available and SQL was easier than writing bucketloads of code to interact with with the database.
It's been easier to do it wrong for a while now and bad habits learnt early are very hard to break, as of course are good ones....
Everytime someone recommends learning relational databases with access, I cringe.
Well then thats the problem. Look at it this way what if DBA types had to do develpment work and an experienced OOP developer posted the comment that the SQL people should take the time to learn how to use/do OOP properly and the comments from the SQL people were "I cringe whenever some one says to learn OOP"?
In your rpely your very last line reads and I am quoting from your own reply, "Everytime someone recommends learning relational databases with access, I cringe".
So perhpas you did not say it but you included it in your post and if it was a quote from someone else and not yoru own then it was not clear in your reply.
S0 NO, we will not label that an OOPs. I think you also may have misunderstood my point which wasn't to be critcial of you but to merely point out the revrse scenario.
So perhpas you did not say it but you included it in your post and if it was a quote from someone else and not yoru own then it was not clear in your reply.
S0 NO, we will not label that an OOPs. I think you also may have misunderstood my point which wasn't to be critcial of you but to merely point out the revrse scenario.
Access is horrible tool to teach relational database theory with.
It's also an awful tool to teach SQL with.
The 'reverse' would be command line Osql would be a horrible power user tool for an MBA....
And of course, completely and totally irrelevant.
Unfortunately, or fortunately depending on how you look at it, there are a lot of devs doing database design and sql (well sort of). A horrifying percentage of them quite obviously don't get either, I should know I get paid to tidy up after them.
I don't expect a dev to know 6NF (I don't), I don't expect them to be able to tune a partitioned databasde over a raid (I can't), or set up a cluster (woudln't no where to start)
I do expect them to know up to 3NF, and that you don't do select * from Table, suck everything on to a clients and then iterate through with next to find the one called Bill and set his surname to Gates.
I suppose I might be expecting too much, I mean a lot of them can't program either....
It's also an awful tool to teach SQL with.
The 'reverse' would be command line Osql would be a horrible power user tool for an MBA....
And of course, completely and totally irrelevant.
Unfortunately, or fortunately depending on how you look at it, there are a lot of devs doing database design and sql (well sort of). A horrifying percentage of them quite obviously don't get either, I should know I get paid to tidy up after them.
I don't expect a dev to know 6NF (I don't), I don't expect them to be able to tune a partitioned databasde over a raid (I can't), or set up a cluster (woudln't no where to start)
I do expect them to know up to 3NF, and that you don't do select * from Table, suck everything on to a clients and then iterate through with next to find the one called Bill and set his surname to Gates.
I suppose I might be expecting too much, I mean a lot of them can't program either....
The more management likes the design the more likely you're screwed.
I'd add one more that trips people up.
Be very careful about exposing your surrogate keys to downstream systems. This makes them natural.
There you are with a nice identity int as a key on orders and then you pass it as a link to your invoicing system as OrderID.
Something breaks or changes ( a classic is when someone important expects the numbers to be sequential (no gaps).
And now you are faced with renumbering them in your invoicing system and you need was and is....
Doesn't cause a problem often, but when it does you'll be banging your head against the desk for a good long time.
Be very careful about exposing your surrogate keys to downstream systems. This makes them natural.
There you are with a nice identity int as a key on orders and then you pass it as a link to your invoicing system as OrderID.
Something breaks or changes ( a classic is when someone important expects the numbers to be sequential (no gaps).
And now you are faced with renumbering them in your invoicing system and you need was and is....
Doesn't cause a problem often, but when it does you'll be banging your head against the desk for a good long time.
/*
Be very careful about exposing your surrogate keys to downstream systems. This makes them natural.
*/
I don't understand how this exposure makes the keys natural and how the keys are being passed "downstream".
Be very careful about exposing your surrogate keys to downstream systems. This makes them natural.
*/
I don't understand how this exposure makes the keys natural and how the keys are being passed "downstream".
Never expose your surrogate key to the user. Users tend to identify with them and use them as meaning.
I'm of the thought always use surrogates, never expose them to the user, keep them for what they are meant to be, internal used by the system, no meaning to the average user. You can always add fields to identify the record to the user. Doing it this way, you can change anything meaningful to the user without compromising referential integrity and causing problems for YOU, the developer.
I'm of the thought always use surrogates, never expose them to the user, keep them for what they are meant to be, internal used by the system, no meaning to the average user. You can always add fields to identify the record to the user. Doing it this way, you can change anything meaningful to the user without compromising referential integrity and causing problems for YOU, the developer.
A nice example two remote offices local servers for ordering linked to a common billing system in a third.
One office closes.
Please put the orders on the other system.
Should be a simple export and import. They get new surrogate because both started at 1.
Of course someone used the the surrogate as the link to invoicing didn't they.
Not so simple after all..
One office closes.
Please put the orders on the other system.
Should be a simple export and import. They get new surrogate because both started at 1.
Of course someone used the the surrogate as the link to invoicing didn't they.
Not so simple after all..
I think he means that making the surrogate key visible to the users will cause it to become a natural key and thus lose its value as a surrogate. Surrogate keys should always be hidden from users in the application.
http://blog.pmtechnix.com
http://blog.pmtechnix.com
.. I'm not starting anything.. i'm curious.
How would you construct a surrogate for a situation where you are referencing a quantized Cartesian array{x,y,z} where data values are stored at say{1,3,5, red sphere}, and {1,2,5, blue fuzzy cube} and referenced from an external program as the value of the position?
--not a DB guy just curious...
How would you construct a surrogate for a situation where you are referencing a quantized Cartesian array{x,y,z} where data values are stored at say{1,3,5, red sphere}, and {1,2,5, blue fuzzy cube} and referenced from an external program as the value of the position?
--not a DB guy just curious...
Surrogate keys are by definition meaningless, so any of the usual suspects(integer or guid, usually) would do. They exist simply to identify a record uniquely. Finding records is a different matter - your external program with the value of the position would execute a query or run databse code to return the data it needs. Once it's got the immutable surrogate key value, it uses that for any subsequent update or delete operations.
I hope that's what you were asking
I hope that's what you were asking
Nice one... Think I'll "borrow" that paragraph for training as an introduction, prior to the current pages(!) of examples that slowly enlighten the students to the benefits and reasoning of surrogates over 'natural' PKs.
Surrogates would then be the most useful for situations where you have "longer" lived referencing relationships, i.e. (using the widgets example below) if I needed to update the widget, query DB, return widget+ surrogate. in the situation where the DB is Solely referenced once ( say pulling a static string, with versioning{z axis}) then does one REALLY need to return the surrogate? and or does using a surrogate speed up the query, since you end up doing at least one more Set/Function call to do this? -- Again Honest non-instigating inquiry.
My thinking on this...your Cartesian array can be represented in a single table "Widget". Each element of the array is a column of "Widget". The structure would be something like {[autoinc], [int], [int], [int], [string/pointer/object/etc.]}. For every row inserted into Widget, your surrogate would be created automagically, which would be used to key back into the table.
Referential integrity and normalization the the most common problems I have found. Developers what to provide referential integrity in their code because they don't want the database telling them what they can and cannot do. They assume that their code is always bug-free and never crashes though.
Normalization is another area that most people just don't get. There has been many times when I was called in to develop an app that someone prototyped but got stuck because they didn't understand this fundamental concept. Other times professional developers de-normalized for the sake of efficiency and stuck their users with unneeded limitations in apps. Many times the efficiency gained was miniscule.
http://blog.pmtechnix.com
Normalization is another area that most people just don't get. There has been many times when I was called in to develop an app that someone prototyped but got stuck because they didn't understand this fundamental concept. Other times professional developers de-normalized for the sake of efficiency and stuck their users with unneeded limitations in apps. Many times the efficiency gained was miniscule.
http://blog.pmtechnix.com
Cookie cutters, possibly, incompetents possibly, requirements that meant any other design would be hideously complex maybe.
I love referential integrity. If my code breaks the rules, I fix it, job done.
Mind you I've been a DBA as well...
Without it, issues run from testers thinking you are a careless pratt, to becoming firmly convinced that little blue men from Rigel just played a practical joke on you.
The thing about normalisation and referential integrity, is you can relax contraints with very little effort. Try to retrofit them into an established code base though and you just went high cost, high risk and you are going to have to explain relational database theory to the men with the pointy hair styles.
Not worth it....
Denormalisation, and reducing data integrity are the last resort.
I love referential integrity. If my code breaks the rules, I fix it, job done.
Mind you I've been a DBA as well...
Without it, issues run from testers thinking you are a careless pratt, to becoming firmly convinced that little blue men from Rigel just played a practical joke on you.
The thing about normalisation and referential integrity, is you can relax contraints with very little effort. Try to retrofit them into an established code base though and you just went high cost, high risk and you are going to have to explain relational database theory to the men with the pointy hair styles.
Not worth it....
Denormalisation, and reducing data integrity are the last resort.
That's been my approach. In a large environment, there would be different systems for production and decision support. Smaller environments (like a department/division of a bank in my case) it may seem that separate systems would be overkill. So, where there was a need for complex links that would bog down the system, I put in some stored procedures that would keep a separate set of decision support tables up to date.
That kept the production system fast (the stored procedure to update the DSS table acted very fast) and the decision support system fast (use two or three tables instead of a dozen).
My philosophy was to make a normalisation strategy that suited the purpose.
That kept the production system fast (the stored procedure to update the DSS table acted very fast) and the decision support system fast (use two or three tables instead of a dozen).
My philosophy was to make a normalisation strategy that suited the purpose.
design, starting with one, well that's an issue. 
If that's what you want or need, why in Cthulu's name are you using a relational database in the first place?
If that's what you want or need, why in Cthulu's name are you using a relational database in the first place?
The decision support that was de-normalised was just along for the ride.
The application got quite complicated with all the business requirements. We needed long lists of things for the calculation (Budget, Actual, etc.). The Decision Support tables would summarise the long lists for reporting and analysis.
I have to say that the application was quite fun except for the 2-hour meetings that ended up with 'ingore all that we discussed here, we'll have to try again'.
The application got quite complicated with all the business requirements. We needed long lists of things for the calculation (Budget, Actual, etc.). The Decision Support tables would summarise the long lists for reporting and analysis.
I have to say that the application was quite fun except for the 2-hour meetings that ended up with 'ingore all that we discussed here, we'll have to try again'.
- Keyboard Shortcuts:
- Prev
- Next
- Toggle

































