Symptoms
Visual Basic allows you to retrieve data from Jet databases (MDB files) by using Structured Query Language (SQL). These query operations can be made more efficient by implementing some of the suggestions in this article.
This article assumes that you are using the Microsoft Jet database engine. If you are querying ODBC tables, many of these points still apply. For more information regarding improving performance of ODBC queries, please search on the following words in the Microsoft Knowledge Base:
ODBC and Optimizing and Tables
Resolution
Here are some tips for optimizing your SQL queries:
Performance Analyzer
If you have Microsoft Access 95 or 97, you can open the database and use the Performance Analyzer to profile your queries and suggest improvements.
Table Design
When defining a field in a table, choose the smallest data type appropriate for the data in the field. This increases the number of records that can fit on a page.
Fields you use in joins should have the same or compatible data types.
Compact the Database
Compacting the database has two performance benefits:
The Microsoft Jet database engine uses a cost-based method of optimization. As your database grows, the optimization scheme may no longer be efficient. Compacting the database updates the database statistics and re-optimizes all queries.
As your database grows, it will become fragmented. Compacting writes all the data in a table into contiguous pages on the hard disk, improving performance of sequential scans.
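To compact your database, use the CompactDatabase method of the DBEngine object. This example compacts the database and keeps the original file as a backup:
' Compact into a new file, then swap it in and keep the original as a backup.
DBEngine.CompactDatabase "C:\VB\BIBLIO.MDB", "C:\VB\BIBLIO2.MDB"
' Kill raises a run-time error if the file does not exist, so check first.
If Dir("C:\VB\BIBLIO.BAK") <> "" Then Kill "C:\VB\BIBLIO.BAK"
Name "C:\VB\BIBLIO.MDB" As "C:\VB\BIBLIO.BAK"
Name "C:\VB\BIBLIO2.MDB" As "C:\VB\BIBLIO.MDB"
If your database is updated heavily, you may want to consider compacting nightly.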
Avoid Expressions in Query Output
Expressions in query output can cause query optimization problems if the query is used later as input to another query and you add criteria to the calculated output. In the following example, Query1 is used as input for a second SELECT statement:
Dim DB As Database
Dim RS As Recordset
Set DB = DBEngine.Workspaces(0).OpenDatabase("Biblio.MDB")
' Query1 exposes only the calculated column X.
DB.CreateQueryDef "Query1", _
   "SELECT IIF(Au_ID=1,'Hello','Goodbye') AS X FROM Authors"
' The WHERE condition on X cannot be optimized.
Set RS = DB.OpenRecordset("SELECT * FROM Query1 WHERE X='Hello'")
Because the IIF() expression in Query1 cannot be optimized, the WHERE condition in the second SELECT statement also cannot be optimized. If an expression gets buried deeply enough in a query tree, you can forget that it is there. As a result, your entire string of queries cannot be optimized.
If you can, merge the SQL into a single level of nesting:
Set RS = DB.OpenRecordset("SELECT * FROM Authors WHERE Au_ID=1")
For more complex nested queries, expose the fields that make up the expression:
DB.CreateQueryDef "Query1", _
   "SELECT IIF(Au_ID=1,'Hello','Goodbye') AS X, Au_ID FROM Authors"
Set RS = DB.OpenRecordset("SELECT * FROM Query1 WHERE Au_ID=1")
If you cannot avoid calculated values in query output, place them in the top-level query and not in lower-level queries.
Output Only the Fields Needed
When creating a query, return only the fields you need. If a field doesn't have to be in the SELECT clause, don't add it. The above example of exposing additional fields to make nested queries more efficient is an exception.
GROUP BY, Joins, and Aggregates
This applies when you are joining two tables. For example, if you join two tables on the Customer Name field, and also GROUP BY the Customer Name field, make sure that both the GROUP BY field (Customer Name) and the field that is in the aggregate (Sum, Count, and so on) come from the same table.
NOTE: This query is less efficient because the SUM aggregate is on the Ord table and the GROUP BY clause is on the Cust table:
SELECT Cust.CustID,
   FIRST(Cust.CustName) AS CustName,
   SUM(Ord.Price) AS Total
FROM Cust INNER JOIN Ord ON Cust.CustID = Ord.CustID
GROUP BY Cust.CustID
A more efficient query would be to GROUP BY on Ord.CustID:
SELECT Ord.CustID,
   FIRST(Cust.CustName) AS CustName,
   SUM(Ord.Price) AS Total
FROM Cust INNER JOIN Ord ON Cust.CustID = Ord.CustID
GROUP BY Ord.CustID
NOTE: The First and Last functions do not have the overhead of other aggregates and should not weigh very heavily in this decision.
GROUP BY As Few Fields As Possible
The more fields in the GROUP BY clause, the longer the query takes to execute. Use the First aggregate function to help reduce the number of fields required in the GROUP BY clause.
Less efficient:
SELECT Cust.CustID,
   Cust.CustName,
   Cust.Phone,
   SUM(Ord.Price) AS Total
FROM Cust INNER JOIN Ord ON Cust.CustID = Ord.CustID
GROUP BY Cust.CustID, Cust.CustName, Cust.Phone
More efficient:
SELECT Ord.CustID,
   FIRST(Cust.CustName) AS CustName,
   FIRST(Cust.Phone) AS Phone,
   SUM(Ord.Price) AS Total
FROM Cust INNER JOIN Ord ON Cust.CustID = Ord.CustID
GROUP BY Ord.CustID
Nest GROUP BY Clause Before Joining
If you are joining two tables and only grouping by fields in one of them, it may be more efficient to split the SELECT statement into two queries: make the SELECT statement with the GROUP BY clause a nested query, and join it to the non-grouped table in the top-level query.
Less efficient:
SELECT Ord.CustID,
   FIRST(Cust.CustName) AS CustName,
   FIRST(Cust.Phone) AS Phone,
   SUM(Ord.Price) AS Total
FROM Cust INNER JOIN Ord ON Cust.CustID = Ord.CustID
GROUP BY Ord.CustID
More efficient:
Query1:
SELECT CustID, SUM(Price) AS Total
FROM Ord
GROUP BY CustID
Query2:
SELECT Query1.CustID, Cust.CustName, Cust.Phone, Query1.Total
FROM Cust INNER JOIN Query1 ON Cust.CustID = Query1.CustID
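As a quick check that the two-query form returns the same rows as the single grouped join, here is a minimal sketch in Python that uses the standard-library sqlite3 module as a stand-in engine. It illustrates the shape of the queries only. SQLite is not Jet, so it says nothing about Jet performance. The sample rows are made up, a view plays the role of the saved Query1, the top-level query joins Cust to Query1, and MIN() stands in for FIRST(), which SQLite does not provide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Cust (CustID INTEGER PRIMARY KEY, CustName TEXT, Phone TEXT);
    CREATE TABLE Ord  (OrdID INTEGER PRIMARY KEY, CustID INTEGER, Price REAL);
    INSERT INTO Cust VALUES (1, 'Ann', '555-0100'), (2, 'Bob', '555-0101');
    INSERT INTO Ord (CustID, Price) VALUES (1, 10.0), (1, 5.0), (2, 7.5);

    -- Query1: aggregate the Ord table on its own, before any join.
    CREATE VIEW Query1 AS
        SELECT CustID, SUM(Price) AS Total FROM Ord GROUP BY CustID;
""")

# Single-level form: join first, then group.
single_query = conn.execute("""
    SELECT Ord.CustID, MIN(Cust.CustName), MIN(Cust.Phone), SUM(Ord.Price)
    FROM Cust INNER JOIN Ord ON Cust.CustID = Ord.CustID
    GROUP BY Ord.CustID
    ORDER BY Ord.CustID
""").fetchall()

# Nested form (Query2): join the already-grouped Query1 to Cust.
nested_query = conn.execute("""
    SELECT Query1.CustID, Cust.CustName, Cust.Phone, Query1.Total
    FROM Cust INNER JOIN Query1 ON Cust.CustID = Query1.CustID
    ORDER BY Query1.CustID
""").fetchall()

print(single_query == nested_query)
```

Whether the nested form is actually faster in Jet depends on the data, so time both versions against your own tables.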
Index Both Fields Used in Join
When joining tables, try to index the fields on both sides of a join. This can speed query execution by allowing the query optimizer to use a more sophisticated internal join strategy.
However, if you know one table is going to remain relatively small (occupying one or two 2K pages), it may be more efficient to remove indexes in that table because fewer pages will have to be read into memory. You should try this on a case-by-case basis.
Add Indexes to Speed Searches and Sorts
Place an index on all fields that are used in a join or in a restriction. With the use of Rushmore query optimization technology, the Microsoft Jet 2.0 and later database engine is able to take advantage of multiple indexes on a single table, which makes indexing multiple fields advantageous.
Avoid restrictive query criteria on calculated and non-indexed columns whenever possible.
Use sorting judiciously, especially with calculated and non-indexed fields.
Use Optimizable Expressions
Try to construct your queries so that Rushmore technology can be used to help optimize them. Rushmore is a data-access technology that permits sets of records to be queried very efficiently. With Rushmore, when you use certain types of expressions in query criteria, your query will run much faster. Rushmore does not automatically speed up all your queries. You must construct your queries in a certain way for Rushmore to be able to improve them.
Use the REFERENCES section at the end of the article to locate more specific information.
Use COUNT(*) Instead of COUNT([Column Name])
The Microsoft Jet database engine has special optimizations that allow COUNT(*) to be executed much faster than COUNT([Column Name]).
NOTE: These two operations also have slightly different behavior:
Count(*) counts all rows returned.
Count([Column Name]) counts all rows where [Column Name] is not NULL.
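The difference in behavior can be seen with a small sketch. This one uses Python's standard-library sqlite3 module as a stand-in engine, since the NULL handling shown is standard SQL. The Authors rows are made up for illustration, and the example demonstrates only the counting semantics, not the Jet speed difference.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Authors (Au_ID INTEGER PRIMARY KEY, Author TEXT);
    -- One of the three rows has no Author value.
    INSERT INTO Authors VALUES (1, 'Smith'), (2, NULL), (3, 'Jones');
""")

# COUNT(*) counts rows; COUNT(Author) skips rows where Author is NULL.
count_all, count_column = conn.execute(
    "SELECT COUNT(*), COUNT(Author) FROM Authors"
).fetchone()

print(count_all, count_column)
```

If the column can contain NULL values, switching to COUNT(*) changes the result, so confirm that the row count is what you actually want.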
Avoid LIKE on Parameters
Because the value of the parameter is unknown at the time the query is compiled, indexes will not be used. You can gain performance by concatenating the parameter value as a literal in the SQL statement.
Use the REFERENCES section at the end of the article to locate more specific information.
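One way to picture concatenating the value as a literal is the following minimal sketch in Python. The function names like_literal and build_author_query, and the Authors table with an Author column, are hypothetical and chosen only for illustration. The helper doubles embedded single quotes so the value stays a valid string literal, and appends a single trailing asterisk.

```python
def like_literal(value: str) -> str:
    """Return value as a single-quoted SQL literal ending in one asterisk."""
    escaped = value.replace("'", "''")  # double embedded single quotes
    return "'" + escaped + "*'"


def build_author_query(prefix: str) -> str:
    """Build the SQL text with the search value embedded as a literal."""
    return "SELECT * FROM Authors WHERE Author LIKE " + like_literal(prefix)


print(build_author_query("Sm"))
print(build_author_query("O'Br"))
```

The sketch escapes quotes only. Wildcard characters in the supplied value pass through unchanged and are treated as wildcards by LIKE, so validate user input before building the statement.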
Avoid LIKE and Leading Wildcard
If you use the LIKE operator with a wildcard to find approximate matches, use only one asterisk at the end of the character string to ensure that an index is used. For example, the following criteria use an index:
Like "Smith"
Like "Sm*"
The following criteria do not use an index:
Like "*sen"
Like "*sen*"
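The rule above can be written down as a small check, for example to flag patterns in application code before they reach the database. This is a hedged sketch in Python: the function name like_can_use_index is hypothetical, and the code encodes only the rule as stated in this article (a literal string, optionally followed by one trailing asterisk). It does not inspect what the Jet optimizer actually does.

```python
# Characters that the Jet LIKE operator treats as wildcards or
# character-list delimiters.
WILDCARDS = set("*?#[")


def like_can_use_index(pattern: str) -> bool:
    """True if pattern is literal text with at most one trailing asterisk."""
    body = pattern[:-1] if pattern.endswith("*") else pattern
    # A pattern with nothing before the asterisk matches everything.
    return body != "" and not any(ch in WILDCARDS for ch in body)


for pattern in ("Smith", "Sm*", "*sen", "*sen*"):
    print(pattern, like_can_use_index(pattern))
```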
Test Joins with Restrictions
If you use criteria to restrict the values in a field used in a join, test whether the query runs faster with the criteria placed on the “one” side or the “many” side of the join. In some queries, you get faster performance by adding the criteria to the field on the “one” side of the join instead of the “many” side.
Use Intermediate Tables
Use SELECT INTO statements to create work tables, especially if the results are going to be used in a number of other queries. The more work you can do up-front, the more efficient the process.
Avoid NOT IN with SubSelects
Queries that combine sub-selects with NOT IN are poorly optimized. Converting them to nested queries or OUTER JOINs is more efficient. The following example finds customers without orders:
Less efficient:
SELECT Customers.*
FROM Customers
WHERE Customers.[Customer ID]
   NOT IN (SELECT [Customer ID] FROM Orders);
More efficient:
SELECT Customers.*
FROM Customers LEFT JOIN Orders
   ON Customers.[Customer ID] = Orders.[Customer ID]
WHERE ((Orders.[Customer ID] Is Null));
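Before swapping one form for the other, it helps to confirm that both return the same rows. The following minimal sketch does that in Python with the standard-library sqlite3 module as a stand-in engine. SQLite is not Jet, so this checks the query logic only, not the Jet performance difference. The sample rows and the Name and [Order ID] columns are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers ([Customer ID] INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Orders ([Order ID] INTEGER PRIMARY KEY, [Customer ID] INTEGER);
    INSERT INTO Customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    -- Customer 2 has no orders.
    INSERT INTO Orders VALUES (10, 1), (11, 1), (12, 3);
""")

# Sub-select form.
not_in_rows = conn.execute("""
    SELECT Customers.*
    FROM Customers
    WHERE Customers.[Customer ID]
        NOT IN (SELECT [Customer ID] FROM Orders)
    ORDER BY Customers.[Customer ID]
""").fetchall()

# Outer-join form: unmatched customers have NULL on the Orders side.
left_join_rows = conn.execute("""
    SELECT Customers.*
    FROM Customers LEFT JOIN Orders
        ON Customers.[Customer ID] = Orders.[Customer ID]
    WHERE Orders.[Customer ID] IS NULL
    ORDER BY Customers.[Customer ID]
""").fetchall()

print(not_in_rows, left_join_rows)
```

The sample data has no NULL customer IDs in Orders. In standard SQL the two forms are interchangeable only under that assumption, so check your own data before converting a query.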