CRM Cleaning: Merging Salesforce Objects using SOAP API (with a Python Focus)

Photo by Hitesh Choudhary on Unsplash

We recently helped a client to clean up their Salesforce CRM. First, we made our proprietary solution to find all of the duplicate records that would pass the client’s criteria and then had to use Salesforce native merge functionality to remove the duplicates. By doing the deduplication process, we helped them save their Sales Reps time by an estimated 10% to 20%.

After a few hours of searching online, we figured that there is not much documentation on how to interact with the Salesforce API we had to work with, their SOAP API and particularly no one had talked about using their merge functionality. This article is written to make it easier for other developers that are planning to do what we did and save them time, a lot of it. Let’s say Salesforce API and SOAP APIs are not the most fascinating APIs to work with!

You can send a SOAP request through any programming language that you like. Here, we are using python as it is becoming more and more popular especially among data scientists and other data related experts. Extending it to your favourite language should be quite straightforward though.

Let’s look at a bit of background on SOAP protocol that can be found on Wikipedia:

SOAP (abbreviation for Simple Object Access Protocol) is a messaging protocol specification for exchanging structured information in the implementation of web services in computer networks. Its purpose is to provide extensibility, neutrality and independence. It uses XML Information Set for its message format, and relies on application layer protocols, most often Hypertext Transfer Protocol (HTTP) or Simple Mail Transfer Protocol(SMTP), for message negotiation and transmission.”

So essentially, we need to send an xml message to Salesforce API that is based on a given standard. You need to look at the Web Service Description Language of Salesforce (WSDL) to (hopefully J) find out how to format your xml message. The good news that we’ve done this for you for merging.

First, we need to make a nice header to send with our API call. Here is how the header should look like:

headers = { 'Content-Type': 'text/xml', 'Accept': 'application/soap+xml,application/dime, multipart/related, text/*', 'Authorization': 'Bearer ' + self.session_id, 'SOAPAction': 'merge', 'Sforce-Auto-Assign': 'false', 'charset':'UTF-8'}

It should be quite self-explanatory; if you are wondering about ‘Sforce-Auto-Assign’: ‘false’ part, it’s helping us to avoid any weird default assignment rule that your Salesforce might have.

Now it’s time to actually make a nice xml message. We are doing a bulk merge request and hence we have a list of records that we want to merge. We have stored that list into a list called data and each row of data is a list containing the following in order:

  • sf_type: type of the records to be merged (Account, Contact, Lead)
  • master_id: Salesforce Id of the master
  • victim_id: Salesforce Id of the victim
  • valuesToKeep: Please see below

For example, the first row of data can be:

['Lead', '00Q2G00001atpfx', '00Q2G01001atqfl', {'Industry': 'Law', 'Phone': '+1 111 111 1111'}]

One very important note. Salesforce does not keep the info from the victim even if the fields for the master are empty! You need to tell it explicitly to overwrite those fields. As a result, we made a dictionary ( valuesToKeep) of field-names (keys of valuesToKeep) and victim values (values of valuesToKeep) we want to overwrite into the master record. For example, in the above, we are asking Salesforce to use ‘ Law’ as the final record’s ‘ Industry’ and ‘+1 111 111 1111’ as ‘ Phone’.

Finally, let’s put all of these together and form xml body:

body = '''<?xml version="1.0" encoding="utf-8" ?> <soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:urn="urn:enterprise.soap.sforce.com" xmlns:urn1="urn:sobject.enterprise.soap.sforce.com" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><soapenv:Header> <urn:SessionHeader><urn:sessionId>{sessionId}</urn:sessionId></urn:SessionHeader> </soapenv:Header><soapenv:Body> <ns1:merge xmlns:ns1='urn:partner.soap.sforce.com'>'''.format(sessionId = self.session_id) for rec in data: sf_type, master_id,victim_id,valuesToKeep = rec body = body + '''<ns1:merge><ns1:masterRecord> <ens:type xmlns:ens='urn:sobject.partner.soap.sforce.com'>{sf_type}</ens:type> <ens:Id xmlns:ens='urn:sobject.partner.soap.sforce.com'>{master_id} </ens:Id>'''.format(sf_type = sf_type, master_id = master_id) for rec in data:
sf_type, master_id,victim_id,valuesToKeep = rec
body = body + '''<ns1:merge><ns1:masterRecord>
<ens:type xmlns:ens=
'urn:sobject.partner.soap.sforce.com'{sf_type}</ens:type>
<ens:Id xmlns:ens='urn:sobject.partner.soap.sforce.com'
{master_id}
</ens:Id>'''.format(sf_type = sf_type, master_id = master_id)
for item in valuesToKeep.keys():
try:
value = valuesToKeep[item]
body = body + '''<ens:{item}
xmlns:ens='urn:sobject.partner.soap.sforce.com'>
{value}</ens:{item}>'''.format(item=item, value=value)
except:
pass
body = body + '''</ns1:masterRecord><ns1:recordToMergeIds>{victim_id}</ns1:recordToMergeIds> </ns1:merge>'''.format(victim_id = victim_id) body = body + '''</ns1:merge></soapenv:Body></soapenv:Envelope>'''

We use the library requests to the API call. You are free to use another method that suits you.

import requests 
from simple_salesforce import Salesforce
session = requests.Session()
sf = Salesforce(username = username, password = password, security_token = token, session = session, domain = domain)
sf.session.request(method = 'POST', url = 'https://' + sf.sf_instance +'/services/Soap/u/' + sf.sf_version, data = body, headers = headers)

You need to provide for username, password, and security_token. domain is ‘test’ for Sandbox and None for the other cases.

Note: Salesforce API version 38 is problematic. You want to avoid it.

If everything goes well, Salesforce returns you a xml that contains no error message. Otherwise, it will try to help you understand where it went wrong, and you can correct any potential mistakes.

I wish you good luck and enjoy working with Salesforce API! Feel free to contact me if you had any questions.

Originally published at https://hamidomid.com on August 29, 2019.

An entrepreneur, company advisor, physicist, machine learning scientist, classical music lover, and a dreamer. @hamid_omid_