Best Practices

Optimizations for Salesforce Orgs with Lots of Data (Part 2)

Optimize Salesforce org performance with efficient queries, indices, and selectivity. Part 2 of our 4 part series on handling large data volumes.

Last Update:
May 22, 2023
Published:
June 3, 2021

In this article, we will delve into the finer details of writing effective queries in Salesforce, making use of Salesforce Object Query Language (SOQL) and Salesforce Object Search Language (SOSL). Through examining specific use cases, pros and cons, and practical tips, you'll gain insights into how to navigate these two Salesforce query languages, along with understanding the importance of indices, selectivity, and the role of features like Skinny Tables and Defer Sharing Calculation.

Here are our 5 key takeaways:

  • SOQL vs. SOSL: SOQL is optimal when you know the objects and fields, while SOSL is ideal for broad text-based searches.
  • SOQL Indexing: Indexing in SOQL is limited; not all fields can be indexed, but it does support primary keys, custom fields, system audit fields, and foreign keys.
  • Defer Sharing Calculation Feature: This feature can help in managing sharing rules during large configuration changes in your organization.
  • Skinny Tables: Skinny tables reduce processing times for frequently accessed fields, though they require manual updates when changes are made.
  • Utilizing Selectivity and Indexes: Make queries selective and specific to improve performance, and consider creating custom indexes with Salesforce's help for optimal results.

Table of Contents

This is Part II of a four part series on optimization techniques for Salesforce orgs with lots of data. Part I of the series can be found here

Writing Better Salesforce Queries

Whether you are using SOQL or SOSL, if you are filtering a set of records, making selective queries is key. As for deciding which query language to use in Salesforce, the general approach is that if you know which object and fields you are searching for, you can use SOQL. Both SOQL and SOSL have different use cases and have drawbacks and advantages in different scenarios. 

SOQL indexes are primary keys mainly, but they can also be custom fields (with unique values) or system audit fields (CreatedDate). They can also be foreign keys (master-detail relationship field). You can’t index multi-picklist, currency fields (if multi-currency enabled), long text fields or binary fields, for example. Also you can run queries to see child object count in SOQL. Neat!

Salesforce Object Search Language (SOSL) is lesser known among developers and it’s used to construct text-based search queries against the search index. The search engine looks for matches across a maximum of 2,000 records. You can search text, email, and phone fields for multiple objects, including custom objects, that you have access to in a single query in the for SOAP/REST calls, Apex and Visualforce controllers and methods and more.

SOSL also enables you to search text, email and phone fields at the same time, making it more efficient when handling large data queries. This is especially useful if you do not know which objects contain the values you are searching for. I used to think that SOSL always had higher governor limits compared to SOQL, but that is not the case. If you want more than 2,000 records returned, you need to use SOQL where the transaction record limit for Apex is 50,000.

Here are some additional tips: 

  • SOSL can tokenize multiple terms within a field (for example, multiple words separated by spaces).
  • Limit scope by targeting specific objects rows owned by the user and rows within a division (when applicable).
  • The Name, Phone, Text, picklist fields are indexed and appropriate for SOSL.
  • For SOSL, Chatter Feed searches aren't affected by the scope of your search, and their results include matches across all objects.

To avoid timeout errors with SOQL queries, try to reduce query scope and selective filters. Consider using the Bulk API and adding a LIMIT clause (starting with 100,000 records) to your queries. If using batch Apex for your queries, use chaining to get sets of records (using LIMIT) or consider moving filter logic to the execute method.

Force.com Query Optimizer
Source: http://resources.docs.salesforce.com/194/0/en-us/sfdc/pdf/salesforce_query_search_optimization_developer_cheatsheet.pdf

Salesforce Defer Sharing Calculation

In some circumstances, it might be appropriate to use a feature called Defer Sharing Calculation, which allows users to defer the processing of sharing rules when users are added and updated and role hierarchy in the org changes.

An organization’s administrator can use the Defer Sharing Calculation permission to suspend and resume sharing calculations, and to manage two processes: group membership calculation and sharing rule calculation. The administrator can suspend these calculations when performing a large number of configuration changes, which might lead to very long sharing rule evaluations or timeouts and resume calculations during an organization’s maintenance period. This deferral can help users process a large number of sharing-related configuration changes quickly during working hours, and then let the recalculation process run overnight between business days or over a weekend.

Salesforce Skinny Tables

Skinny tables can be created on custom objects, as well as the Account, Contact, Opportunity, Lead and Case objects and Custom Objects. While they can increase performance, they have a limit of 100 objects and can’t contain fields from other objects. They can only be used on Full Copy sandboxes by default. In order to use skinny tables you need to contact Salesforce Customer Support.

Salesforce maintains separate tables at the database level for standard and custom fields. Normally a database join is required when you query these fields. You can create skinny tables for frequently accessed fields and reduce processing times. However, this only applies to certain read-only operations. If you change the query that runs and add new tables, your skinny table definitions also need to be manually updated by Salesforce customer support. Salesforce also syncs skinny tables and source tables continuously to avoid database joins, lowering processing times. Skinny tables are most useful for reports as they are a read-only operation and you will get much better results if you have more than a million records. Skinny tables can also be used for encrypted data.

Query Selectivity and Indexes

A simple first step is to leverage standard platform indexes by making our queries as selective and specific as possible.

Consider a simple query with a WHERE clause below:

SELECT Id, Name FROM Opportunity WHERE Stagename = 'Closed Won'

A skilled developer or admin can get necessary information about the dataset they will be interacting with. This can be more effective than reports especially in large data volumes. For example, if you need to see the opportunity count grouped by their stage name, you can use the GROUP BY ROLLUP to get the necessary information. It will also return the total amount of records in the null row.

SELECT Stagename, COUNT(id) FROM Opportunity GROUP BY ROLLUP (Stagename)
Salesforce query - group by rollup

You can also use more complex filter conditions. For filter conditions that combine two or more conditions (using AND), the query optimizer considers the overall filter condition selective when the filter targets less than 150,000 records for both filter conditions. 

If one of the conditions returns more than 250,000 records, the overall filter can be made selective if both conditions overlap and return less than 150,000 records while each filter condition corresponds to less than 300,000 records. For OR conditions each filter must meet the thresholds individually. (We are going to talk about index thresholds in detail below.)

In order to use indexed searches for SOQL queries with multiple filters, try breaking the query down into as many parts as filters and join the results separately.

Salesforce index selectivity conditions and thresholds
Source: http://resources.docs.salesforce.com/194/0/en-us/sfdc/pdf/salesforce_query_search_optimization_developer_cheatsheet.pdf

Aside from keeping the governor limit of 50,000 records returned in a transaction, you should also be mindful about deleted records in your org. They can affect your query performance as well. You can both use the isDeleted flag to filter these reports as this field is available on all standard and custom objects or empty your recycle bin frequently. Also If you’re using Workbench, select Exclude for the Deleted and archived records option list.

Salesforce Custom Index: Explained

Another thing we can do is leverage standard indexes that operate on system fields. If your filter condition involves a custom field, work with Salesforce customer support to create a custom index on the field your filter uses. Not all fields can have an index, such as non-deterministic formula fields.

SOQL Index - Fields
Source: http://resources.docs.salesforce.com/194/0/en-us/sfdc/pdf/salesforce_query_search_optimization_developer_cheatsheet.pdf

Salesforce supports custom indexes on custom fields, except for multi-select picklists, some text-areas,non-deterministic formula fields, and encrypted text fields. External Ids can be indexed, however only auto number, email, number and text fields can be configured to serve as an External Id.

The reason for using indexes is to enable search engines to avoid using table scans, where the engine will search all records in the org for the query object. Indexes can be based on sharing, internal optimizations standard by Salesforce or by custom indexes created on custom fields. You should also avoid using wildcard operators, You should also consider avoiding NULL values and querying for formula fields since these formula fields will be calculated at runtime and NULL values will cause a full table scan. Here’s an example below:

String stage = NULL;
List<Opportunity> = [SELECT Name FROM Opportunity WHERE Stagename=: stage];

To diagnose problems with your queries you can also use the Query Plan Tool. It gives us insights based on different query plans and which filters should be indexed and provides the cost of using these techniques compared to a full table scan. Here’s a general overview of index selectivity:

SOQL Index selectivy exceptions
Source: http://resources.docs.salesforce.com/194/0/en-us/sfdc/pdf/salesforce_query_search_optimization_developer_cheatsheet.pdf


Index Thresholds

Consider this query:

SELECT * FROM Account WHERE Account_Type__c = 'Large'

Platform uses a query optimizer to maintain statistics of the searches being made against and tries to determine whether to use standard indexes or use the custom indexes created. For the query entered above, if the number of records that fits this criteria is lower than the custom index threshold, it will use a standard index instead of a custom one. The threshold is 10% of the object’s total records or if the record count is higher than 333,333.

We will continue in the next post with an in-depth look at custom indexes and case by case tips for improving performance inside Salesforce!

Enhancing Salesforce Performance for Large Data Volumes

Optimizing the performance of a Salesforce org with large data volumes is crucial for efficient operations and user experience. In this article, we explored the intricacies of writing effective queries using SOQL and SOSL, along with understanding the significance of indices, selectivity, and features like Skinny Tables and Defer Sharing Calculation. By leveraging query selectivity, utilizing indexes, and being mindful of deleted records, you can enhance query performance and minimize processing times.

To further dive into advanced optimization techniques and gain comprehensive insights, stay tuned for Part 3 of our series. In this article, we will delve into custom indexes and provide case-by-case tips to further improve performance within Salesforce. Don't miss out on unlocking the full potential of your Salesforce org!


FAQ

Q: What is the purpose of Part 2 in the four-part series on optimizing Salesforce orgs with a large volume of data?

A: Part 2 of the series focuses on managing and improving performance in a data-intensive Salesforce environment. It does this by explaining various techniques such as writing better queries, deferring sharing calculations, utilizing skinny tables, and understanding the intricacies of query selectivity and indexes.

Q: What are the key takeaways from Part 2?

A: The key takeaways include understanding how to write better queries using Salesforce Object Query Language (SOQL) and Salesforce Object Search Language (SOSL), deferring sharing calculations to improve performance during large configuration changes, using skinny tables to reduce processing times, and leveraging standard platform indexes for making queries more selective and specific.

Q: What are Skinny Tables and how can they improve Salesforce performance?

A: Skinny Tables are special tables that can increase performance by storing frequently accessed fields and reducing processing times. They have limitations and require Salesforce Customer Support for creation and management. However, they can be used to speed up queries on large datasets and are especially useful for reports in data-intensive environments.

Q: What is Defer Sharing Calculation in Salesforce?

A: Defer Sharing Calculation is a feature that allows users to defer the processing of sharing rules when users are added and updated, and when the role hierarchy in the Salesforce org changes. This improves performance during large configuration changes.

Q: What is the difference between SOQL and SOSL?

A: SOQL is useful when you know the specific object and fields you're searching for, while SOSL can search across multiple objects and fields simultaneously. However, each has different use cases and strengths and limitations in different scenarios.

Q: How do Salesforce Custom Indexes work?

A: Salesforce supports custom indexes on custom fields, except for a few exceptions. These indexes speed up data retrieval and improve query performance. However, using these indexes requires coordination with Salesforce customer support.

Q: What is meant by Query Selectivity and Indexes in Salesforce?

A: Query selectivity refers to making queries as specific as possible to leverage standard platform indexes. It involves using conditions that target a specific subset of data, which allows the system to use indexes to retrieve data more efficiently.

Q: What is the significance of Index Thresholds in Salesforce?

A: Index thresholds refer to the decision-making process of the Salesforce query optimizer, whether to use standard or custom indexes based on the number of records matching a query. Understanding these thresholds can help optimize query performance.

Q: What is the next part of the series about?

A: Part 3 of the series will delve deeper into custom indexes and provide case-by-case tips to further improve performance within Salesforce.

Q: What is the ultimate goal of this series on optimizing Salesforce orgs with lots of data?

A: The ultimate goal of this series is to provide a comprehensive guide to enhancing the performance of a Salesforce org with large data volumes. This is to ensure efficient operations and user experience by leveraging query selectivity, utilizing indexes, and understanding various performance-enhancing features like Skinny Tables and Defer Sharing Calculation.

More like this