Best Practices
MISSING SUBTEXT

Optimizations for Salesforce Orgs with Lots of Data (Part 2)

Part 2 of our 4 part series on how to optimize your Salesforce org for performance when it contains large volumes of data.

June 3, 2021

This is Part II of a four part series on optimization techniques for Salesforce orgs with lots of data. Part I of the series can be found here

Writing Better Queries

Whether you are using SOQL or SOSL, if you are filtering a set of records, making selective queries is key. As for deciding which query language to use in Salesforce, the general approach is that if you know which object and fields you are searching for, you can use SOQL. Both SOQL and SOSL have different use cases and have drawbacks and advantages in different scenarios. 

SOQL indexes are primary keys mainly, but they can also be custom fields (with unique values) or system audit fields (CreatedDate). They can also be foreign keys (master-detail relationship field). You can’t index multi-picklist, currency fields (if multi-currency enabled), long text fields or binary fields, for example. Also you can run queries to see child object count in SOQL. Neat!

Salesforce Object Search Language (SOSL) is lesser known among developers and it’s used to construct text-based search queries against the search index. The search engine looks for matches across a maximum of 2,000 records. You can search text, email, and phone fields for multiple objects, including custom objects, that you have access to in a single query in the for SOAP/REST calls, Apex and Visualforce controllers and methods and more.

SOSL also enables you to search text, email and phone fields at the same time, making it more efficient when handling large data queries. This is especially useful if you do not know which objects contain the values you are searching for. I used to think that SOSL always had higher governor limits compared to SOQL, but that is not the case. If you want more than 2,000 records returned, you need to use SOQL where the transaction record limit for Apex is 50,000.

Here are some additional tips: 

  • SOSL can tokenize multiple terms within a field (for example, multiple words separated by spaces).
  • Limit scope by targeting specific objects rows owned by the user and rows within a division (when applicable).
  • The Name, Phone, Text, picklist fields are indexed and appropriate for SOSL.
  • For SOSL, Chatter Feed searches aren't affected by the scope of your search, and their results include matches across all objects.

To avoid timeout errors with SOQL queries, try to reduce query scope and selective filters. Consider using the Bulk API and adding a LIMIT clause (starting with 100,000 records) to your queries. If using batch Apex for your queries, use chaining to get sets of records (using LIMIT) or consider moving filter logic to the execute method.

Force.com Query Optimizer
Source: http://resources.docs.salesforce.com/194/0/en-us/sfdc/pdf/salesforce_query_search_optimization_developer_cheatsheet.pdf

Defer Sharing Calculation

In some circumstances, it might be appropriate to use a feature called Defer Sharing Calculation, which allows users to defer the processing of sharing rules when users are added and updated and role hierarchy in the org changes.

An organization’s administrator can use the Defer Sharing Calculation permission to suspend and resume sharing calculations, and to manage two processes: group membership calculation and sharing rule calculation. The administrator can suspend these calculations when performing a large number of configuration changes, which might lead to very long sharing rule evaluations or timeouts and resume calculations during an organization’s maintenance period. This deferral can help users process a large number of sharing-related configuration changes quickly during working hours, and then let the recalculation process run overnight between business days or over a weekend.

Skinny Tables

Skinny tables can be created on custom objects, as well as the Account, Contact, Opportunity, Lead and Case objects and Custom Objects. While they can increase performance, they have a limit of 100 objects and can’t contain fields from other objects. They can only be used on Full Copy sandboxes by default. In order to use skinny tables you need to contact Salesforce Customer Support.

Salesforce maintains separate tables at the database level for standard and custom fields. Normally a database join is required when you query these fields. You can create skinny tables for frequently accessed fields and reduce processing times. However, this only applies to certain read-only operations. If you change the query that runs and add new tables, your skinny table definitions also need to be manually updated by Salesforce customer support. Salesforce also syncs skinny tables and source tables continuously to avoid database joins, lowering processing times. Skinny tables are most useful for reports as they are a read-only operation and you will get much better results if you have more than a million records. Skinny tables can also be used for encrypted data.

Query Selectivity and Indexes

A simple first step is to leverage standard platform indexes by making our queries as selective and specific as possible.

Consider a simple query with a WHERE clause below:

SELECT Id, Name FROM Opportunity WHERE Stagename = 'Closed Won'

A skilled developer or admin can get necessary information about the dataset they will be interacting with. This can be more effective than reports especially in large data volumes. For example, if you need to see the opportunity count grouped by their stage name, you can use the GROUP BY ROLLUP to get the necessary information. It will also return the total amount of records in the null row.

SELECT Stagename, COUNT(id) FROM Opportunity GROUP BY ROLLUP (Stagename)
Salesforce query - group by rollup

You can also use more complex filter conditions. For filter conditions that combine two or more conditions (using AND), the query optimizer considers the overall filter condition selective when the filter targets less than 150,000 records for both filter conditions. 

If one of the conditions returns more than 250,000 records, the overall filter can be made selective if both conditions overlap and return less than 150,000 records while each filter condition corresponds to less than 300,000 records. For OR conditions each filter must meet the thresholds individually. (We are going to talk about index thresholds in detail below.)

In order to use indexed searches for SOQL queries with multiple filters, try breaking the query down into as many parts as filters and join the results separately.

Salesforce index selectivity conditions and thresholds
Source: http://resources.docs.salesforce.com/194/0/en-us/sfdc/pdf/salesforce_query_search_optimization_developer_cheatsheet.pdf

Aside from keeping the governor limit of 50,000 records returned in a transaction, you should also be mindful about deleted records in your org. They can affect your query performance as well. You can both use the isDeleted flag to filter these reports as this field is available on all standard and custom objects or empty your recycle bin frequently. Also If you’re using Workbench, select Exclude for the Deleted and archived records option list.

Indexes Explained

Another thing we can do is leverage standard indexes that operate on system fields. If your filter condition involves a custom field, work with Salesforce customer support to create a custom index on the field your filter uses. Not all fields can have an index, such as non-deterministic formula fields.

SOQL Index - Fields
Source: http://resources.docs.salesforce.com/194/0/en-us/sfdc/pdf/salesforce_query_search_optimization_developer_cheatsheet.pdf

Salesforce supports custom indexes on custom fields, except for multi-select picklists, some text-areas,non-deterministic formula fields, and encrypted text fields. External Ids can be indexed, however only auto number, email, number and text fields can be configured to serve as an External Id.

The reason for using indexes is to enable search engines to avoid using table scans, where the engine will search all records in the org for the query object. Indexes can be based on sharing, internal optimizations standard by Salesforce or by custom indexes created on custom fields. You should also avoid using wildcard operators, You should also consider avoiding NULL values and querying for formula fields since these formula fields will be calculated at runtime and NULL values will cause a full table scan. Here’s an example below:

String stage = NULL;
List<Opportunity> = [SELECT Name FROM Opportunity WHERE Stagename=: stage];

To diagnose problems with your queries you can also use the Query Plan Tool. It gives us insights based on different query plans and which filters should be indexed and provides the cost of using these techniques compared to a full table scan. Here’s a general overview of index selectivity:

SOQL Index selectivy exceptions
Source: http://resources.docs.salesforce.com/194/0/en-us/sfdc/pdf/salesforce_query_search_optimization_developer_cheatsheet.pdf


Index Thresholds

Consider this query:

SELECT * FROM Account WHERE Account_Type__c = 'Large'

Platform uses a query optimizer to maintain statistics of the searches being made against and tries to determine whether to use standard indexes or use the custom indexes created. For the query entered above, if the number of records that fits this criteria is lower than the custom index threshold, it will use a standard index instead of a custom one. The threshold is 10% of the object’s total records or if the record count is higher than 333,333.

We will continue in the next post with an in-depth look at custom indexes and case by case tips for improving performance inside Salesforce!

More like this