Fellow Csla developers and UUIDs in Urls #3125
Replies: 3 comments 1 reply
-
This SO thread seems pretty well articulated. https://stackoverflow.com/questions/9727090/unique-identifier-guid-as-primary-key-in-database-design |
Beta Was this translation helpful? Give feedback.
-
There are two separate things here, and it is good to segregate them. There is how you structure your database, and how you go about your external interface. You do not have to achieve both with the same solution. Yes, using a sequential identifier that can be manipulated by a user is a risk. It's such a risk that it was in the OWASP Top Ten for a long time - known as use of insecure direct object references. It's a huge risk in a multi-tenanted system, where users are only allowed to access a subset of the items in a collection. As it happens, it dropped off the top ten a couple of iterations ago - but only because there were even worse things to add in its place. It remains a big risk in any scenario where any user is only allowed to access a subset of items, for whatever reason. There are several potential solutions, as there are several parts to the risk; in short: direct and insecure. You can reduce the risk by using a highly randomised value that is extremely hard to guess, and some UUIDs fall into that category - but only some. I'll come back to that. However, the other part is direct access, and that's the best solution. Do whatever you can to avoid the common approach of controllers accepting the ID directly and you are much better off. That is the best solution, and is what security experts recommend. In my systems, it tends to be the case that people access an individual item by first seeing a list of the items available. In that scenario, you can easily generate an indirect object reference - such as the position of the item in the list that they were shown. Using that index position to look up the real unique identifier on the server goes a long way towards mitigating risk. To overcome the problem of the contents of the list changing, store the list that was displayed in per-user storage - for example Session. No, stick with me. I know that you just rolled your eyes. Yes, ideally systems would be stateless, but here's the thing: there is always a trade-off in systems design. Session seems old-fashioned to people I am sure, but session state of some form can help alleviate a LOT of security risk, so don't turn your back on it without really considering if stateless is really beneficial, or just trendy. Bear in mind that the people who are telling you to be stateless are likely also the ones passing integer IDs via the URL. Security matters, and a lot of people ignore it - and their systems suffer as a result. OK, so on to UUIDs. If the UUID is properly random and the number of matches is low compared to the number of possible values, then that can be an appropriate (if, ultimately, less secure) alternative to using a list of items and indirect references. However, properly random UUIDs index really, really badly - because they are random. If you add a lot of new items in the gap between reindexes then the performance implication of lookups using them can be very significant. This can massively reduce the scalability of a system based upon a relational database - and scalability limitations in the database are some of the hardest to overcome. A lot of people recommend NEWSEQUENTIALID() or equivalent as a solution to the indexing problem, and that's fair - except they are then far from random. UUIDs that are not random do index much, much better, but they do not offer the same security that random UUIDs do. Both are valid, but for very different use cases. And that is why I say there are two things going on. There is some middle ground to consider. You can use integers as primary keys in databases and then also add a UUID that is used solely for lookups where the end user can tamper with the value. That's what we did in a number of places where that was the only option. The benefit is that a lot of the code within the server can use the integer ID directly - fast - but use the UUID for values exposed to the client (a subset of code, slower.) There is a risk that the integer will get misused by devs who do not understand the difference, so you have to decide whether education can overcome the risk of misuse - and remember that the education needs to continue as the team grows and changes. If you are able to use UUIDs plus Session (or other very fast lookups or caching) then you are onto a winner. An in-memory lookup from UUID to integer would be very fast, and result in a well indexed database query on a relational system using the integer. An alternate strategy is to use something other than a relational database - a NoSQL solution of some form. Document databases don't do indexing in the same way, so they are generally very good at doing lookups using a UUID if the UUID is your unique document identifier. If you use a NoSQL-style store for caching, then that can do your lookup too. Redis is unbelievably fast at one to one translations of this kind, so using Redis as some session-like store or caching layer to offer UUID to integer transformations can work well. In summary, how you index your data store and what you expose to end users do not need to use the same value. Consider separating them into separate values to get the combined benefit of security AND speed. This is especially true if you can translate from the external to the internal value using a very fast lookup - one that doesn't cause any database traffic at all. |
Beta Was this translation helpful? Give feedback.
-
I'll just chime in here. GUIDs in URLs are fine and if you want to use them there's not issue. The issue is using GUIDs as a primary key as those are usually created as clustered. This is horrible for indexing as well as causes lots of moving to data with in the files as SQL needs to make space in pages for new rows from time to time. This absolutely kills your DB performance. We have this issue at my current job. Its bad enough that we're taking the effort to replace the GUID PKs with ints/longs. If you want a guid for an external ID, what I personally would do is have the int/long PK and then create another column to expose as the external id; that can be your guid and you can put an index on it. That way you avoid all the issues with performance, and another thing I like is that the PKs then are ONLY for your internal use in the app & ensuring referential integrity. Any time a key is exposed somehow public, you risk people then ask you to do something with it, which may not make it a great PK anymore. So I like to keep the PKs private in the BOs as much as possible. |
Beta Was this translation helpful? Give feedback.
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
-
This question is related to #1649
We have a discussion about SQL identity fields in URLs and one suggestion is to use UUIDs (in Microsoft universe GUIDs?). I remember we used to have GUIDs in Project Tracker.
Are fellow members using UUIDs in conjunction with Identity fields?
Thoughts please?
Kind regards
Beta Was this translation helpful? Give feedback.
All reactions