Optimizing Query Speeds

Options
There are several criteria that can be handled different ways when setting up a Query, so I'm just curious if there are any instances in which one option runs a little faster than another. Some examples:
  • Using the "Include Deceased" checkbox vs using the Deceased? field under Constituent Information
  • Using Query type (Constituent, Individual, Organization) vs using the Key Indicator field within the query
  • Selecting all the options in "Equals one of..." vs using the option "...is not blank"
Basically, I'm just curious if there are any guides or tips & tricks for speeding up query processing times.

Comments

  • First, a couple of cautions:  If you use Individual or Organization Query Type rather than Key Indicator, be sure you remember you've done that...especially if you tend to use the same query later.  Also, [Equals One Of] and [Is Not Blank] can sometimes return vastly different results!



    As to your original question, I've found that it more depends on how complex your criteria are.  You can use Merge Queries or work with Query Lists for some fine-tuning, should you prefer working completely within RE.



    I usually try to use as few criteria as possible, and then either clean up the data outside of RE or Globally Add an Attribute, then start whittling down the list by adding one or two pieces of criteria at a time to a Query that starts with "Attribute Category equals This List".  Once I'm done, I can use Global Change to delete the Attributes, and go into Config to delete the Attribute Category itself.  (In my case, I have 5 Attributes for this purpose...with a number in the Attribute Category, so I have them at the ready whenever I need them, and they can hang out for as long as I need them.  I do this most often for Appeals, because it's so difficult to edit Appeal Tags once in RE...Package, for example, cannot be Globally Changed and Marketing Segment cannot be Globally Added.)



     
  • I personally have not seen a difference between the two Deceased choices.

    Query Type is really a different function than Key Indicator and so I would be very careful about switching those up.

    And Jennifer is right! One of vs. Is Blank and give you very different results.



    If you that concerned about speeds, you would be better off building a few queries with fewer criteria and merging them.  Just be aware that the order in which you merge can also affect your results.
  • This is a very complex topic with, literally, millions of variables that can affect performance. Both RE and SQL Server have some automatic optimization built in that do a good job in most cases.



    Aside from just running tests on the various possibilities you can go to the menu bar View > SQL (or Ctrl+Q) and see what the underlying SQL is.



    Looking at just the checkbox options here't a Constituent Query with all boxes checked and no Criteria or Output:

    SELECT

    RECORDS.ID "QRECID"

    FROM

    DBO.RECORDS AS RECORDS

    WHERE

    ((RECORDS.IS_CONSTITUENT = -1))



    Unchecking Deceased gives this:

    SELECT

    RECORDS.ID "QRECID"

    FROM

    DBO.RECORDS AS RECORDS

    WHERE

    ((RECORDS.IS_CONSTITUENT = -1) AND (RECORDS.DECEASED = 0))



    Adding Criteria of "Deceased equals No" gives this:

    SELECT

    CASE RECORDS.DECEASED WHEN 0 THEN 'N' WHEN -1 THEN 'Y' ELSE '' END "Deceased",

    RECORDS.ID "QRECID"

    FROM

    DBO.RECORDS AS RECORDS

    WHERE

    ((RECORDS.DECEASED = 0) AND (RECORDS.IS_CONSTITUENT = -1))



    Because the Decease value gets automatically added to the Output (at least that's how I have it set in my Options) it adds additional processing overhead in the Select statement to return a "Y" or "N" in the Results tab instead of a zero or one. This will take a microscopically longer time to process.



    If I remove Deceased from the Output the SQL returns to this:

    SELECT

    RECORDS.ID "QRECID"

    FROM

    DBO.RECORDS AS RECORDS

    WHERE

    ((RECORDS.DECEASED = 0) AND (RECORDS.IS_CONSTITUENT = -1))

    Identical to what it was before.



    Then there's the Execution Plan tab of the SQL Statement window where you can see some of the initial optimization that's being done:

    |--Parallelism(Gather Streams)

    |--Clustered Index Scan(OBJECT:([Minnesota_Childrens_Museum_RE].[dbo].[RECORDS].[PK__RECORDS__408F9238] AS [RECORDS]), WHERE:([Minnesota_Childrens_Museum_RE].[dbo].[RECORDS].[DECEASED] as [RECORDS].[DECEASED]=(0) AND [Minnesota_Childrens_Museum_RE].[dbo].[RECORDS].[IS_CONSTITUENT] as [RECORDS].[IS_CONSTITUENT]=(-1)))

    That's just on the RE end of things.  The SQL Server will do still more optimization completely behind the scenes based on it's current state, memory allocated, etc.



    If it's something that runs fairly quickly and doesn't have to be run often, I wouldn't worry about it.
  • John Heizer:



    Aside from just running tests on the various possibilities you can go to the menu bar View > SQL (or Ctrl+Q) and see what the underlying SQL is.

    Just to add to John's post, I think the View > SQL menu option is only available to organisations who have unlocked the optional RODBA module.

  • Alan French:

     

    John Heizer:



    Aside from just running tests on the various possibilities you can go to the menu bar View > SQL (or Ctrl+Q) and see what the underlying SQL is.

    Just to add to John's post, I think the View > SQL menu option is only available to organisations who have unlocked the optional RODBA module.

     

    True.  But just about anyone with a maintenance agreement can get it unlocked (although it can be a pain to do and has to be done per workstation if I remember correctly).

  • John Heizer:

    That's just on the RE end of things.  The SQL Server will do still more optimization completely behind the scenes based on it's current state, memory allocated, etc.


     
    Thanks John, that was really interesting. I'm pretty sure they don't show you all the SQL that's generated for a given query, though. For instance, I have a query that pulls gifts based on a certain attribute, and ctrl-Q gives me:



    SELECT GIFT.ID "QRECID"

    FROM DBO.GIFT AS GIFT


    LEFT OUTER JOIN DBO.GIFTSPLIT AS GIFT_GiftSplit ON GIFT.ID = GIFT_GiftSplit.GIFTID

    LEFT OUTER JOIN DBO.CAMPAIGN AS GIFT_GiftSplit_CAMPAIGN ON GIFT_GiftSplit.CAMPAIGNID = GIFT_GiftSplit_CAMPAIGN.ID

    LEFT OUTER JOIN DBO.tmp2425893_477508300_1 AS GIFT_V_GiftAttributes_0_1523 ON GIFT.ID = GIFT_V_GiftAttributes_0_1523.PARENTID


     


    Now, I don't know what that tmp2425893_477508300_1 table or view or whatever is, but I assume it's some kind of temp table or table var or something, because the name changes every time I open up the query and hit ctrl-Q.  I wish they'd show where that came from or how it was generated.

Categories