<?xml version="1.0" encoding="UTF-8" ?>
<rss version="2.0" xmlns:s="http://www.techrepublic.com/search" xmlns:dc="http://purl.org/dc/elements/1.1/"  xmlns:atom="http://www.w3.org/2005/Atom">
<channel>
    <title><![CDATA[Discussion on 10 tips for choosing between a surrogate and natural primary key ]]></title>
    <link><![CDATA[http://www.techrepublic.com/forum/discussions/102-342790]]></link>
    <atom:link rel="hub" type="application/rss+xml" href="http://pubsubhubbub.appspot.com/" />
    <atom:link rel="self" type="application/rss+xml" href="http://www.techrepublic.com/forum/discussions/102-342790/rss" />

    <description><![CDATA[]]></description>
    <language>en-us</language>
    <lastBuildDate>2013-05-18T19:09:59-07:00</lastBuildDate>
             

    <item>
        <title><![CDATA[People DBs are notorious for duplicates]]></title>
        <link><![CDATA[http://www.techrepublic.com/forum/discussions/102-342790-3565276]]></link>
        <description><![CDATA[People DBs might be one of the few cases where there is no natural key. Therefore surrogate keys must be used, but notice that this won't help you prevent multiple records for the same person. SSN doesn't work for the reasons noted. Name and address don't work because the same person can have two addresses, not to mention that both are unstructured text data. In this case, human intervention is often needed to identify and merge duplicates.]]></description>
        <guid><![CDATA[http://www.techrepublic.com/forum/discussions/102-342790-3565276]]></guid>
        <dc:creator><![CDATA[jrborgeson]]></dc:creator>
        <pubDate>Fri, 20 Jan 2012 08:42:49 -0800</pubDate>
    </item>
             

    <item>
        <title><![CDATA[Normalization is based on the natural key]]></title>
        <link><![CDATA[http://www.techrepublic.com/forum/discussions/102-342790-3565273]]></link>
        <description><![CDATA[Normalization is based on the natural key, which may or may not be used as the primary (aka physical) key of a table. The natural (aka logical) key is what defines a logically unique record from the business point of view. As the article mentions, when the primary key is defined as a surrogate, one should put a unique index on the natural key. If this is not done, then the database is NOT normalized: you could create logically duplicate records. Using a simplified example, we define a table of stock positions held by various customers, with a logical key of customer-id and stock-id. If two records existed for Jim B and IBM, the data would not be normal. I haven't said how the physical key is defined for that table because it doesn't matter in terms of normalizing the data.]]></description>
        <guid><![CDATA[http://www.techrepublic.com/forum/discussions/102-342790-3565273]]></guid>
        <dc:creator><![CDATA[jrborgeson]]></dc:creator>
        <pubDate>Fri, 20 Jan 2012 08:32:42 -0800</pubDate>
    </item>
             

    <item>
        <title><![CDATA[That's certainly more useful reasoning than anything in the article]]></title>
        <link><![CDATA[http://www.techrepublic.com/forum/discussions/102-342790-3481599]]></link>
        <description><![CDATA[On the other hand, you can choose which column(s) to define as your &quot;PRIMARY KEY&quot; (PK constraint I mean) and the IOT will be based on that. A UNIQUE constraint on any other keys is functionally equivalent in other ways. So the fact that Oracle IOTs are based on the PRIMARY KEY constraint does not need to influence your choice of key(s) in any important way. That is an example of why Susan Harkins' article is so pointless and misguided. Read no further than her first &quot;tip&quot; that &quot;a primary key value [sic] must be unique&quot;! Um, yes. But that's a requirement for *every* key and not just one particular key you designate to be the &quot;primary&quot; one. So as far as uniqueness is concerned it makes absolutely no difference which of the keys are natural and which are surrogate. Similar remarks apply to the other &quot;tips&quot; in the article (stability, simplicity, etc. also being desirable characteristics for any keys). No-one should take the advice in this article seriously.]]></description>
        <guid><![CDATA[http://www.techrepublic.com/forum/discussions/102-342790-3481599]]></guid>
        <dc:creator><![CDATA[dportas]]></dc:creator>
        <pubDate>Tue, 09 Aug 2011 08:55:46 -0700</pubDate>
    </item>
             

    <item>
        <title><![CDATA[A use case for natural key as pk]]></title>
        <link><![CDATA[http://www.techrepublic.com/forum/discussions/102-342790-3481194]]></link>
        <description><![CDATA[Despite being in favor of using surrogate keys, I think that a natural key could be used as the primary key in some particular use cases. As an example, Oracle Index-Organized Tables could give huge performance benefits when doing range scans, but rows must be organized by primary key. So if range scans are done on the natural key (or the first part of it), it could be useful to have it also as the primary key.]]></description>
        <guid><![CDATA[http://www.techrepublic.com/forum/discussions/102-342790-3481194]]></guid>
        <dc:creator><![CDATA[emyl_79]]></dc:creator>
        <pubDate>Mon, 08 Aug 2011 05:13:52 -0700</pubDate>
    </item>
             

    <item>
        <title><![CDATA[Sound advice in it?]]></title>
        <link><![CDATA[http://www.techrepublic.com/forum/discussions/102-342790-3436545]]></link>
        <description><![CDATA[Don't use natural keys as your primary key. Whether that's a key that does follow the rules (stable, known and unique) is irrelevant. Renumbering is an easy operation? Yes and no. Ignore for a moment the fact that changing a key, by purist rules, means it isn't one (ignoring that is not my default option), and think about what doing it means. Update OrderLines set line_number = 3 where line_number = 4 and Order_Number = 1. Yeah, that's easy. As long as there isn't already a 3 and there is a 4. So you have to lock the entire order for the length of all the renumbering actions. If, as is likely in a shuffle up, you are rekeying more than one record, there is an explicit ordering of the update statements. You could get round that with Update Order_lines set line_number = line_number - 1 Where Line_Number &gt; 3, but you are still hugely dependent on the state of the affected records. I personally wouldn't put line_number in the table at all. If it was required on the output because that is the natural way users think about lines on an order, I'd do some sort of calculated field. If it was a legacy system knocked up by someone unaware of the issues and I couldn't get rid of line number easily, then I'd simply update it to be a consecutive integer after any deletes, inserts etc. were complete. None of this even considers the problems surrounding communication of changes to orders and how they impact, say, invoices and vice versa in a synchronous system, never mind an asynchronous one... No hijacking involved, merely some reasons why Ms Harkins' advice does turn out to be sound if you look further than a mere database schema for one aspect of some system.]]></description>
        <guid><![CDATA[http://www.techrepublic.com/forum/discussions/102-342790-3436545]]></guid>
        <dc:creator><![CDATA[Tony Hopkinson]]></dc:creator>
        <pubDate>Sat, 02 Apr 2011 04:50:57 -0700</pubDate>
    </item>
             

    <item>
        <title><![CDATA[A key is what you create]]></title>
        <link><![CDATA[http://www.techrepublic.com/forum/discussions/102-342790-3435615]]></link>
        <description><![CDATA[by adding PRIMARY KEY, NOT NULL, UNIQUE and other constraints to ensure that column(s) are unique and non-nullable in a table. There is no room for &quot;perception&quot; about what the keys are when they are plain to see in DDL and enforced by the DBMS. A key is what is constrained to be unique and irreducible (i.e. a minimal superkey). &quot;Besides what about the order line example&quot; What about it? You didn't say anything interesting. Are you suggesting that we shouldn't make line numbers unique within an order and that you'd rather allow duplicates? What would be the point of that? &quot;Shuffling up&quot; line numbers is a very easy operation to do whether the columns are key or not. And yes, I have done it. I don't recall ever being asked to allow duplicate line numbers on anything. You appear to be hijacking this thread rather than discussing the article in question. I asked if there was any part of the article that you would consider was sound advice. I'd like to know because I struggle to see how anyone could take away anything useful from it.]]></description>
        <guid><![CDATA[http://www.techrepublic.com/forum/discussions/102-342790-3435615]]></guid>
        <dc:creator><![CDATA[dportas]]></dc:creator>
        <pubDate>Thu, 31 Mar 2011 02:32:18 -0700</pubDate>
    </item>
             

    <item>
        <title><![CDATA[I've seen it many times]]></title>
        <link><![CDATA[http://www.techrepublic.com/forum/discussions/102-342790-3435489]]></link>
        <description><![CDATA[Especially from one of the OP's main reader base, Access developers. Your entire post was founded on the assumption that what is perceived to be a natural key is a key... Until you rethink that, everything I've said is going right over your head... Besides what about the order line example, surely you've seen someone do that...]]></description>
        <guid><![CDATA[http://www.techrepublic.com/forum/discussions/102-342790-3435489]]></guid>
        <dc:creator><![CDATA[Tony Hopkinson]]></dc:creator>
        <pubDate>Wed, 30 Mar 2011 14:10:42 -0700</pubDate>
    </item>
             

    <item>
        <title><![CDATA[Off topic, but...]]></title>
        <link><![CDATA[http://www.techrepublic.com/forum/discussions/102-342790-3435068]]></link>
        <description><![CDATA[&quot;US's SSN, Britain's NI Number...&quot; I can't think why anyone would want to use an SSN or NI number as a key. In every organisation I worked for, employees were identified by an employee number / payroll number, which was certainly required to be unique. In HR and payroll systems I have worked on, the payroll number was implemented as a key in the &quot;employee&quot; table. It would be extraordinary in my opinion to think of designing an HR system without a natural key for employees. You aren't really suggesting that you would, are you? &quot;Your argument ... is proven bollocks though&quot; You have destroyed a straw man, that's all. &quot;We seem to be talking past each other&quot; Yes, the topic is the original article and you haven't said anything much about it or my criticisms of it.]]></description>
        <guid><![CDATA[http://www.techrepublic.com/forum/discussions/102-342790-3435068]]></guid>
        <dc:creator><![CDATA[dportas]]></dc:creator>
        <pubDate>Tue, 29 Mar 2011 20:50:24 -0700</pubDate>
    </item>
             

    <item>
        <title><![CDATA[We seem to be talking past each other]]></title>
        <link><![CDATA[http://www.techrepublic.com/forum/discussions/102-342790-3434959]]></link>
        <description><![CDATA[What is a natural key? US's SSN, Britain's NI Number, a phone number, a part id, ISBN number. First of all we aren't in control of them, so even if, as far as we can see, they are unique within our data domain, there is absolutely no guarantee that uniqueness will be maintained. Therefore they cannot be considered stable. In your system, do you wish to constrain it so the only valid operations on the data require the natural key? Even if that currently meets the need, the likelihood of it changing is high, therefore the natural key could, for a set of valid operations such as get the natural key, be null. So they aren't stable and they are nullable, therefore they are not valid candidates for a key on the tuple. So you use a surrogate key to meet first normal form and then use it to enforce relations between tables. If you wish to relax the uniqueness constraint to allow nullable values, you can easily. If you wish to expand the natural key to include, say, Country code for phone number when you go international, you can easily. If you'd made the natural key the primary key, and then enforced all your relations on it, then you are in a world of hurt, and that's without having any code or any downstream consumers! Natural key = database key is an initial assumption and later fallacy because it's not a key as per database theory. Could the article have put that better? Perhaps. Your argument that surrogates are an unnecessary overhead because there is a natural &quot;key&quot; is proven bollocks though. I can come up with so many examples in my career where the foolish assumption that a natural key is unique in database terms caused hideous problems. Here's a simple one. Line item number on an order or invoice. Lost count of how many times I've seen some newbie key the OrderLines table by OrderNumber and LineNumber. Delete a line in the middle of the order, you have to shuffle them up. 
Don't even mention if you have item by item stock receipts, production fulfilment or invoicing or allow line by line remittance. My rule is don't use a natural key as your primary key ever. If there is a viable unique key constraint within the system, fine, use it; data integrity is always good, and should that change all you do is drop the thing, job done, aside from basic operational changes. My approach to database applications has become (took me a while to learn this one as well): I put in as much data integrity as I can afford to backstop my application, not to drive it. When the application breaks the rules, it falls over. Far too often I've seen the two opposites. You can't change the thing without being severely beaten by the bean counters, who won't understand crap about why. Or all too often, no rules at all, and everything in code, as though it was a hierarchical database written by an idiot. Sod the theory, your rule might be true now, but there's no rule that says it has to stay that way. If your application is founded on it, it will founder...]]></description>
        <guid><![CDATA[http://www.techrepublic.com/forum/discussions/102-342790-3434959]]></guid>
        <dc:creator><![CDATA[Tony Hopkinson]]></dc:creator>
        <pubDate>Tue, 29 Mar 2011 14:34:40 -0700</pubDate>
    </item>
             

    <item>
        <title><![CDATA[Keys]]></title>
        <link><![CDATA[http://www.techrepublic.com/forum/discussions/102-342790-3434912]]></link>
        <description><![CDATA[Keys are about data integrity, irrespective of how the application uses them. The reason why we enforce keys in the database is precisely so that they are independent of the application. I don't believe the author is thinking of how keys might be used by an application but if she is then she should say so. Even if one accepted your unconventional explanation it wouldn't make much sense of the article. E.g. Point 4. Stability is a desirable characteristic of ALL keys, not just ones used by the application, but it's not an absolute requirement and isn't always wanted. In fact one of the benefits of a surrogate is that you can change it with only minimal impact. 5. &quot;you must know the key&quot; but that applies equally to every key and if you wanted only a surrogate key value and nothing else then you might as well use a key generator that's independent of the table in question. 6. &quot;apply an index&quot; And how is that any different from what you would do with a surrogate key? etc, etc ... but as I say, there isn't enough space to refute everything that's wrong here. You said: &quot;Natural keys do not have to be unique&quot; but all keys are required to be unique and not nullable - that after all is the definition of a key and the reason we implement them in the database is to make sure they remain unique! Keys are an important topic that deserves much better than this article. If you really think there is any useful advice here then I'd love to know what it is.]]></description>
        <guid><![CDATA[http://www.techrepublic.com/forum/discussions/102-342790-3434912]]></guid>
        <dc:creator><![CDATA[dportas]]></dc:creator>
        <pubDate>Tue, 29 Mar 2011 12:52:59 -0700</pubDate>
    </item>
             

    <item>
        <title><![CDATA[Primary key is the one]]></title>
        <link><![CDATA[http://www.techrepublic.com/forum/discussions/102-342790-3434831]]></link>
        <description><![CDATA[you build your data access logic round. It's the one the code uses. It's the default one the DBMS will use unless told otherwise (implicitly or explicitly). You don't put primary in front of it for a laugh! Alternate keys as in Unique Key, no problem, but that should be the natural one not the surrogate, and that's if you can afford it seeing as it cannot be null in most implementations. Index, well that's simply an optimisation, and completely dependent on design and implementation and naff all to do with the logical schema. If you want to go purist and implementation agnostic, discounting the primary directive, then you'd have a point given you could guarantee that the natural key would be static, or you are prepared to pay the huge potential price when that turns out to be not exactly correct.... So come again? Without the word primary, the article would be a waste of pixels. Natural keys do not have to be unique, they could initially be unknown.... Primary and Unique Keys have to be unique, natural keys do not.... Your entire argument on its arse, that, isn't it?]]></description>
        <guid><![CDATA[http://www.techrepublic.com/forum/discussions/102-342790-3434831]]></guid>
        <dc:creator><![CDATA[Tony Hopkinson]]></dc:creator>
        <pubDate>Tue, 29 Mar 2011 10:10:34 -0700</pubDate>
    </item>
             

    <item>
        <title><![CDATA[Sarcasm?]]></title>
        <link><![CDATA[http://www.techrepublic.com/forum/discussions/102-342790-3434826]]></link>
        <description><![CDATA[... I can't tell if this is sarcastic or not.]]></description>
        <guid><![CDATA[http://www.techrepublic.com/forum/discussions/102-342790-3434826]]></guid>
        <dc:creator><![CDATA[hobbes@...]]></dc:creator>
        <pubDate>Tue, 29 Mar 2011 09:42:11 -0700</pubDate>
    </item>
             

    <item>
        <title><![CDATA[Are you sure YOU read it?]]></title>
        <link><![CDATA[http://www.techrepublic.com/forum/discussions/102-342790-3434765]]></link>
        <description><![CDATA[All the points in the article except paras 4 and 8 apply equally to ANY keys whether a table has a surrogate or not (and point 4 is wrong anyway, or at least overly prescriptive and dogmatic). The author even says &quot;The only reason Ive encountered for forcing [sic] a natural key involved records from integrated systems&quot;! As for implying that I don't know what I'm talking about, well if you think that a &quot;primary&quot; key means anything different from any other key then I would certainly question what you are talking about. A key is a key. Putting the word &quot;primary&quot; in front of it doesn't alter the meaning of the article and can't possibly support any of the claims that the author makes.]]></description>
        <guid><![CDATA[http://www.techrepublic.com/forum/discussions/102-342790-3434765]]></guid>
        <dc:creator><![CDATA[dportas]]></dc:creator>
        <pubDate>Tue, 29 Mar 2011 08:24:34 -0700</pubDate>
    </item>
             

    <item>
        <title><![CDATA[It's only misleading and arbitrary if you didn't read it right.]]></title>
        <link><![CDATA[http://www.techrepublic.com/forum/discussions/102-342790-3434726]]></link>
        <description><![CDATA[Use a surrogate or a natural as your PRIMARY key. So with that in mind, it might be time to revise your post, before a potential employer gets the impression you don't know what you are talking about. Now if you'd said using surrogates was basically a no-brainer, I could have lived with that...]]></description>
        <guid><![CDATA[http://www.techrepublic.com/forum/discussions/102-342790-3434726]]></guid>
        <dc:creator><![CDATA[Tony Hopkinson]]></dc:creator>
        <pubDate>Tue, 29 Mar 2011 08:00:03 -0700</pubDate>
    </item>
             

    <item>
        <title><![CDATA[Who can you rely on?]]></title>
        <link><![CDATA[http://www.techrepublic.com/forum/discussions/102-342790-3434733]]></link>
        <description><![CDATA[Never rely on an external system to provide your unique key.  NEVER.]]></description>
        <guid><![CDATA[http://www.techrepublic.com/forum/discussions/102-342790-3434733]]></guid>
        <dc:creator><![CDATA[mustang84]]></dc:creator>
        <pubDate>Tue, 29 Mar 2011 07:28:15 -0700</pubDate>
    </item>
             

    <item>
        <title><![CDATA[Thanks for the article]]></title>
        <link><![CDATA[http://www.techrepublic.com/forum/discussions/102-342790-3434740]]></link>
        <description><![CDATA[I love an article that confirms my thinking. Thanks.]]></description>
        <guid><![CDATA[http://www.techrepublic.com/forum/discussions/102-342790-3434740]]></guid>
        <dc:creator><![CDATA[larry@...]]></dc:creator>
        <pubDate>Tue, 29 Mar 2011 07:22:53 -0700</pubDate>
    </item>
             

    <item>
        <title><![CDATA[Another misleading article on this subject]]></title>
        <link><![CDATA[http://www.techrepublic.com/forum/discussions/102-342790-3434626]]></link>
        <description><![CDATA[There is too much fudging of issues and vague use of terminology in this article to correct every wrong assumption in it. Users and consumers of data require a way to identify the real world facts represented by database tuples. The identifiers they use (i.e. natural keys) are of prime importance to the correct use of the data and to data integrity generally. Surrogate keys, if they are used at all, serve a very different purpose and are by definition optional and dispensable as far as business requirements are concerned. So the premise of the article - that there is a choice to be made between either a surrogate or a natural key - is an entirely false dilemma. Natural keys are fundamental to successful database implementations that meet business requirements and achieve data integrity. You may or may not decide to create surrogates as well as natural keys, but to present this as a choice between one or the other is just arbitrary and misleading.]]></description>
        <guid><![CDATA[http://www.techrepublic.com/forum/discussions/102-342790-3434626]]></guid>
        <dc:creator><![CDATA[dportas]]></dc:creator>
        <pubDate>Tue, 29 Mar 2011 02:37:48 -0700</pubDate>
    </item>
             

    <item>
        <title><![CDATA[Doubt anyone would disagree with that.]]></title>
        <link><![CDATA[http://www.techrepublic.com/forum/discussions/102-342790-3434488]]></link>
        <description><![CDATA[If nothing else, because surrogate keys are not exposed, the UI will be driven by the natural key, so not having an index on it would be, erm, foolish. If it happens to be unique as well, that's a bonus... The key point between natural and surrogate keys is that in most cases the natural key is outside of the control of the software using it, whereas you are always in charge of the surrogate. One can support your business logic, the other can destroy it on a whim. It's not a hard choice really....]]></description>
        <guid><![CDATA[http://www.techrepublic.com/forum/discussions/102-342790-3434488]]></guid>
        <dc:creator><![CDATA[Tony Hopkinson]]></dc:creator>
        <pubDate>Mon, 28 Mar 2011 16:02:43 -0700</pubDate>
    </item>
             

    <item>
        <title><![CDATA[need for a natural key index]]></title>
        <link><![CDATA[http://www.techrepublic.com/forum/discussions/102-342790-3434474]]></link>
        <description><![CDATA[The point of constraints (domain, foreign-key, unique) is to use the database itself to keep the data as clean as possible. If you have defined your entity in such a way that you can identify a Natural key candidate (as you should be able to in a 3rd normal form database), you should give serious consideration to implementing that Natural key as a unique constraint. NOT necessarily as the primary key (for the reasons cited in the article), but as a unique key. This provides 2 benefits: 1) when and if the database is updated outside of the original application (adhocs, new systems, et al.), the data quality is better preserved. 2) you are providing the database engine with useful information. Tom Kyte of Oracle says that the more information you provide the database engine, the more effectively it can process your SQL queries.]]></description>
        <guid><![CDATA[http://www.techrepublic.com/forum/discussions/102-342790-3434474]]></guid>
        <dc:creator><![CDATA[alvin.c.steele@...]]></dc:creator>
        <pubDate>Mon, 28 Mar 2011 15:02:25 -0700</pubDate>
    </item>
             

    <item>
        <title><![CDATA[Careful?. Don't use it as a key at all]]></title>
        <link><![CDATA[http://www.techrepublic.com/forum/discussions/102-342790-3434437]]></link>
        <description><![CDATA[It's a classic example of why you need surrogate keys, and in this case simply used to illustrate that surrogate keys don't violate normalisation principles. That's not even counting the fact that it might not be unique. That just means it's not a candidate for a key, no matter how you lean on the issue.]]></description>
        <guid><![CDATA[http://www.techrepublic.com/forum/discussions/102-342790-3434437]]></guid>
        <dc:creator><![CDATA[Tony Hopkinson]]></dc:creator>
        <pubDate>Mon, 28 Mar 2011 14:27:24 -0700</pubDate>
    </item>
    </channel>
</rss>

