Database

Semibase has substantial experience in Database conversion, development and programming using SQL Server, SQL, mySQL, Access, FoxPro, Excel, Ingres, PHP, SQL*Plus, PL/SQL and more on various platforms. In this section we provide an overview of database conversion, development & programming guidelines from various development perspectives.Databases are the heart and soul of most of the enterprise applications and it is very essential to pay special attention to database programming. Generally for a better performing database a company needs a real DBA and a specialist database programmer, let it be Microsoft SQL Server, Oracle, Sybase, DB2 or anything else. If a database specialist is not used during a development cycle, database may end up becoming the performance bottleneck.

Here are some of the database conversion and programming guidelines, best practices, keeping quality, performance and maintainability in mind. This list many not be complete at this moment, and might be constantly updated.

  • Decide upon a database naming convention, standardize it across your organization and be consistent in following it. It helps make your code more readable and understandable.
  • Make sure you normalize your data at least till 3rd normal form. At the same time, do not compromise on query performance. A little bit of de-normalization helps queries perform faster.
  • Do not use SELECT * in your queries. Always write the required column names after the SELECT statement, like SELECT CustomerID, CustomerFirstName, City. This technique results in less disk IO and less network traffic and hence better performance.
  • Do not depend on undocumented functionality. The reasons being: you will not get support from Oracle, Microsoft, Sybase or IBM, when something goes wrong with your undocumented code - undocumented functionality is not guaranteed to exist (or behave the same) in a future release or service pack, there by breaking your code.
  • Try not to use system tables directly. System table structures may change in a future releases. Wherever possible, use the sp_help* stored procedures or INFORMATION_SCHEMA views. There might be situations where you cannot avoid accessing system table though.
  • Write comments in your stored procedures, triggers and SQL batches generously, whenever something is not very obvious. This helps other programmers understand your code clearly. Don't worry about the length of the comments, as it won't impact the performance, unlike interpreted languages like ASP 2.0.
  • Try to avoid server side cursors as much as possible. Always stick to 'set based approach' instead of a 'procedural approach' for accessing/manipulating data. Cursors can be easily avoided by SELECT statements in many cases. If a cursor is unavoidable, use a simple WHILE loop instead, to loop through the table. Generally a WHILE loop is faster than a cursor most of the times. But for a WHILE loop to replace a cursor you need a column (primary key or unique key) to identify each row uniquely and every table should have a primary or unique key.
  • Avoid creation of temporary tables while processing data, as much as possible, as creating a temporary table means more disk IO. Consider advanced SQL or views or table variables of SQL Server 2000 or derived tables, instead of temporary tables. Keep in mind that, in some cases, using a temporary table performs better than a highly complicated query.
  • Try to avoid wildcard characters at the beginning of a word while searching using the LIKE keyword, as that results in an index scan, which is defeating the purpose of having an index. In the following statements first one results in an index scan, while the second statement results in an index seek: a) SELECT LocationID FROM Locations WHERE Specialties LIKE '%apples', b) SELECT LocationID FROM Locations WHERE Specialties LIKE 'A%s'.
  • While designing database, design it keeping 'performance' in mind. You can't really tune performance later, when your database is in production, as it involves rebuilding tables/indexes, re-writing queries. Use the graphical execution plan in Query Analyzer or SHOWPLAN_TEXT or SHOWPLAN_ALL commands to analyze your queries. Make sure your queries do 'Index seeks' instead of 'Index scans' or 'Table scans'. A table scan or an index scan is a very bad thing and should be avoided where possible (sometimes when the table is too small or when the whole table needs to be processed, the optimizer will choose a table or index scan).
  • Prefix the table names with owner names, as this improves readability, avoids any unnecessary confusions. Microsoft SQL Server Books Online even states that qualifying tables names, with owner names helps in execution plan reuse.
  • Use the more readable ANSI-Standard Join clauses instead of the old style joins. With ANSI joins the WHERE clause is used only for filtering data. Whereas with older style joins, the WHERE clause handles both the join condition and filtering data.
  • Views are generally used to show specific data to specific users based on their interest. Views are also used to restrict access to the base tables by granting permission on only views. Yet another significant use of views is that, they simplify your queries. Incorporate your frequently required complicated joins and calculations into a view, so that you don't have to repeat those joins/calculations in all your queries, instead just select from the view.
  • Do not let your front-end applications query/manipulate the data directly using SELECT or INSERT/UPDATE/DELETE statements. Instead, create stored procedures, and let your applications access these stored procedures. This keeps the data access clean and consistent across all the modules of your application, at the same time centralizing the business logic within the database.
  • If you have a choice, do not store binary files, image files (Binary large objects or BLOBs) etc. inside the database. Instead store the path to the binary/image file in the database and use that as a pointer to the actual binary file. Retrieving, manipulating these large binary files is better performed outside the database and after all, database is not meant for storing files.
  • Avoid dynamic SQL statements as much as possible. Dynamic SQL tends to be slower than static SQL, as SQL Server must generate an execution plan every time at runtime. IF and CASE statements come in handy to avoid dynamic SQL. Another major disadvantage of using dynamic SQL is that, it requires the users to have direct access permissions on all accessed objects like tables and views. Generally, users are given access to the stored procedures that reference the tables, but not directly on the tables. In this case, dynamic SQL will not work.
  • Offload tasks like string manipulations, concatenations, row numbering, case conversions, type conversions etc. to the front-end applications, if these operations are going to consume more CPU cycles on the database server (it's okay to do simple string manipulations on the database end though). Also try to do basic validations in the front-end itself during data entry. This saves unnecessary network roundtrips.