Jump to content

4.0: Adopt a naming convention for database columns

Featured Replies

Posted

A pet peeve I have with the 3.x series of IPS apps is the lack of a standardized naming convention for tables and columns in the database schema. This is a mostly cosmetic change, yes, but making them adhere to some naming conventions would make them a great deal more human-friendly to work with - which usually means happier developers. ;)


For example, depending on which table you're looking in, you might find a user's account ID referred to as "member", "member_id", "mid", "author_id", "ps_member", "i_member", "log_member", "comment_by", "report_by", or a slew of other names. Topic ID's are referred to as everything from "tid" to "topic_id" to "exdat2". This lack of convention stretches through every app's schema.

It's mostly foreign and primary keys that stand to benefit from this. One convention I'm a fan of is calling a table's primary key "id", and foreign keys referring to it "{singular}_id". So instead of letting each table have its own idea of how to reference a topic ID, they could all be, simply, "topic_id".

In some tables, column names have enigmatic abbreviations that don't make their purpose particularly obvious - like "can_mm" on the "moderators" table. That column could be renamed to "can_multimod" and suddenly becomes much easier to understand.

Sometimes, even the column names within a single table are really inconsistent. Look at the schema of the "moderators" table for a prime example - most of its columns hold what amounts to a simple boolean value denoting whether a moderator can or cannot perform an action. The way it is, one column is prefixed with "can_", two with "mod_can_", one with "allow_", and most don't have a prefix at all. Wouldn't it make some sense to standardize all these columns to be simply prefixed with "can_"?

And sometimes, prefixes are used in places where they serve no clear purpose. Every single column in the "nexus_ads" table, for example, is prefixed with "ad_", which isn't particularly meaningful when the entire table only contains ad data. The "ad_id" and "ad_locations" columns in the "nexus_ads" table would still be just as meaningful if they were called "id" and "locations".


Again, I realize that a site's users never have to see any part of the database schema, but developers sure do. Applications will have to be largely refactored to function on IP.Suite 4.0 anyway, so this is a rare opportunity to make breaking changes to the schema.

A pet peeve I have with the 3.x series of IPS apps is the lack of a standardized naming convention for tables and columns in the database schema. This is a mostly cosmetic change, yes, but making them adhere to some naming conventions would make them a great deal more human-friendly to work with - which usually means happier developers. ;)

For example, depending on which table you're looking in, you might find a user's account ID referred to as "member", "member_id", "mid", "author_id", "ps_member", "i_member", "log_member", "comment_by", "report_by", or a slew of other names. Topic ID's are referred to as everything from "tid" to "topic_id" to "exdat2". This lack of convention stretches through every app's schema.

It's mostly foreign and primary keys that stand to benefit from this. One convention I'm a fan of is calling a table's primary key "id", and foreign keys referring to it "{singular}_id". So instead of letting each table have its own idea of how to reference a topic ID, they could all be, simply, "topic_id".

tell me you write queries against the tables?

I would be more in favor of LESS conflicts... as it stands now you end up having to do a bunch of resetting working with topics because of the 'member' 'title', and the same for Custom Profile Fields and Content Fields.

Think of it in terms of a JOIN. you will only get one of the matching fields back unless you alias the specific field..... *needing* to because of matching fields is adding more code, and it happens behind the scenes in places now.

This is actually a WordPress pet peeve of mine now that you mention it.... every primary in every table is ID.... be it meta, post, or user... so utterly handy when pulling the data with a JOIN(does the sarcasm pour off the screen yet? :smile: )

Much less logistical queries... if the topic_id/tid becomes out of synch in these other fields, or moreover, when updating them *to* synch, if we have pulled the data with a join, we would need to alias it there to be certain of having the correct tid if they did all match.

tell me you write queries against the tables?

I would be more in favor of LESS conflicts... as it stands now you end up having to do a bunch of resetting working with topics because of the 'member' 'title', and the same for Custom Profile Fields and Content Fields.

Think of it in terms of a JOIN. you will only get one of the matching fields back unless you alias the specific field..... *needing* to because of matching fields is adding more code, and it happens behind the scenes in places now.

This is actually a WordPress pet peeve of mine now that you mention it.... every primary in every table is ID.... be it meta, post, or user... so utterly handy when pulling the data with a JOIN(does the sarcasm pour off the screen yet? :smile: )

...table aliases are the reality of any large relational DB system. Get used to using them.

It's hard to make such a major change once software has been around for so long, even during a major rewrite (alters are heavy)... but I 100% support this.

...table aliases are the reality of any large relational DB system. Get used to using them.

It's hard to make such a major change once software has been around for so long, even during a major rewrite (alters are heavy)... but I 100% support this.

I have no problem using them.

I do have a problem with the whole system becoming completely indistinguishable, and an alias required for every single field in every single query.

Let's say every table in the suite uses id for primary.

They also all use title and title_seo for tables that are applicable, including members.

Can you write a single JOIN query in such a system not having to alias multiple fields for no real reason other than to receive them in the result? Just one. :smile:

I have no problem at-all with aliases, I do have a problem with boilerplate additional being forced by table design.

And sometimes, prefixes are used in places where they serve no clear purpose. Every single column in the "nexus_ads" table, for example, is prefixed with "ad_", which isn't particularly meaningful when the entire table only contains ad data. The "ad_id" and "ad_locations" columns in the "nexus_ads" table would still be just as meaningful if they were called "id" and "locations".

Did you not think of WHY that table is prefixed?

Just as meaningful to the human, sure... but the table as a single entity with no context to the other tables is not the only usage for that table, it is easier to work with with a standard prefix.

Our convention for new tables is "[table_description]_[column_description]" - so in the members table it's member_id - the member ID in another table might be comment_author. Reasoning being if you want to do a select statement with a join, joining two tables which have the same column name - the latter will overwrite the former in the results (note I'm referring to the results, not within the statement itself where you can use aliases), which is rarely desired. Using your example - nexus_ads is often queried joined with members to get information about the member who created the ad - if both tables used the "id" field, you'd have to start aliasing columns, which is a bit of a pain if you wanted to do SELECT * FROM.

Admittedly, this would probably make more sense if every table was like that, but as Ryan mentioned, ALTER queries are really expensive on larger tables (all the indexes need to be rebuilt). Plus we'd have to change all the references in the code, and that's a lot of radical knock-on effects for a change that doesn't resolve any problem.

My hope for 4.0 though is that we will tidy up the schema wherever it is feasible (one change for example which I want to make, is for all tables to be prefixed with the app that owns them), and produce useful schema documentation to explain the purposes of columns which may not be obvious from their name.

  • Management

As Mark says, it's often a challenge refactoring existing code. We always have to balance the impact of changes against our customers and as much as we'd like to toss out the schematic and start again it's not really practical.

We do plan to make some broad changes to table naming which will help bring order to the database.

Archived

This topic is now archived and is closed to further replies.

Recently Browsing 0

  • No registered users viewing this page.