Reverse ETL (rETL) - EveryAction & VAN

Modified on Mon, 8 Apr, 2024 at 3:43 PM

CTA enables reverse Reverse ETL (rETL) syncs between PAD and VAN & EveryAction (EA) instances. This allows you to extract data from PAD and load it into VAN or EveryAction. Use cases include enriching your VAN with data from other integrations or performing regular mass updates of your VAN data. Essentially, any bulk import candidate for VAN is suitable for rETL. And the best part? You can run this sync by triggering the process directly via a Google Workflow.

How do I set up this sync?

Contact your CTA representative or email help@techallies.org, to set up a VAN/EA rETL sync. In order to set up the sync, CTA will need an API key for your VAN or EveryAction committee under CTA’s integration.

How do I find and run the sync?

The process of finding and running the sync is simple! You can find a step-by-step walkthrough of the process below:

You can run this sync directly in your project via a Google Workflow. You can find all of your Google Workflows in your Google Cloud console here. This workflow will be called “ngpvan-bulk-import”, or something similar.

Google Workflows are a powerful tool for orchestrating tasks. With workflows, you can execute complex processes without having to write any underlying code. To run your workflow, simply click on it and select "Execute" from the menu at the top. On the prompt screen that appears, you'll see a section for Input. Simply provide the necessary inputs, and you're good to go!

The Input box is where you’ll set the parameters for the workflow, including what data you’re pushing into VAN and what the data should be imported as. The basic syntax looks like this:

{
"bulk_import_job_type": "",
"dataset_id": "",
"table_id": "",
"van_app_name": "",
"van_database_mode": <input 0 for MyVoters keys, 1 for MyCampaign or EveryAction keys>
}

There is no need to change the field "van_app_name"; we will provide you with a value to prefill it. For a bulk import job, you will need to specify the type of job you are running. Refer to the next section for the standard names for commonly used job types.
To populate the "dataset_id" and "table_id" fields, you will need to locate the table in BigQuery that contains the data you want to upload into VAN. Simply enter the dataset name and table name in the appropriate fields. Refer to the section below to find the standard column names and required columns for the table, depending on your chosen job type.

Make sure to maintain the same punctuation, spacing, and parentheses as in the example above. Any changes may cause the workflow to malfunction as it feeds into a script.
Once you’ve input the syntax, select the Execute button on the bottom left. You should see a status page while the job runs, and after a few minutes, you should see an output message that says “Success.” The execution state will also move to “Succeeded” once it is done.

If your job returns a failed status, please confirm your input table has the correct column names and includes a VANID in the first column. If it does, submit a help@techallies.org ticket with the workflow execution ID (at the top of the page when you execute a job) so we can troubleshoot.

What are the different types of jobs I can run, and what are their differences?

The rETL workflow allows for different types of bulk import jobs. Include the "bulk_import_job_type" in the syntax to specify the job type. Below are some examples of common job types and the required upload columns. If you can't find the job type you're looking for, please contact help@techallies.org. Additionally, you can refer to the bottom of this section for information on naming requirements for standard columns.

Adding or Modifying Contacts: The upsert_contacts job type allows you to add or modify your VAN/EA contacts. You can use this to add new people or modify any fields tied to person info (addresses, email addresses, phone numbers, names, membership system IDs, etc.). Note that this job does not currently support custom fields.
- Creating New Contacts: Your table should include at least first and last names.
- Modifying Existing Contacts: Your table should include VANID and any additional columns you’d like to modify in the first column.
Applying Activist Codes: The apply_activist_codes job type allows you to apply activist codes to existing individuals in your VAN or EA instance. You will need to create the activist code you want to apply in advance and locate the Activist Code ID in VAN/EA. To run this job, your table should include at minimum VanID (in the first column) and ActivistCodeID. You can also include DateCanvassed, CanvassedBy and ContactTypeID, but they cannot be null or blank, and they depend on each other (so if you include DateCanvassed, you should include the other ones as well).
Standard Column Naming: VAN has several examples you can access in your VAN help center when logged in. A good practice is to call the column its name in the bulk import mapping in the VAN UI. You can also pull a raw full list of column names and their descriptions from the API; VAN has documentation on how to do so here.

How can I see the results of a sync run?

When you run the workflow, two tables will appear in your project under the ngpvan_results dataset. One will be a summary table with counts of all the records passed in, and one will be a full table with every column that you passed in and a few default additional columns, such as the job time. You can tell these apart because the summary table will have summary in the name. Different job types - e.g., updating contacts vs. applying activist codes - will have different table prefixes. A new table will be generated every time you run the workflow, and tables will auto-delete after 90 days. If you need to store this data for longer than 90 days for compliance, reference, lookup table or other purposes, we recommend creating your own compiled table to store everything relevant to you. You can query all of the results tables at once without having to union by using a wildcard in your table name in the FROM clause (i.e, if you want all the results tables for the 'upsert_contacts' job type at once, you can run FROM `.ngpvan_results.upsert_contacts_results*`) Keep in mind that if your tables have different schemas - which they might, if you've passed in files with different column headers to the workflow - then the wildcard will only pick up the columns that are common across all tables. Reach out to help@techallies.org if you need help with unioning together results tables with different schemas.

FAQ:

Q: The data I want to push into VAN doesn’t have VANIDs attached to it; it uses dwid, VoterBase ID, or another CRM’s IDs
A: If you need to upload data via the workflow, you should match it to VANID first. After each run, the workflow creates a result table that displays the entered data and the modified VANID if applicable. You can use this table to create a crosswalk or add an ID to your data as necessary. Some job types, such as adding activist code IDs, require a VANID due to API limitations. However, if a person doesn't exist in VAN and you want to create a new contact, it's acceptable to leave the VANID field blank.
Q: I want to do multiple job types at once, e.g., add new contacts and apply an activist code to them
A: Due to API limitations, the workflow does not support multiple job types at once. You will have to run the workflow once for each job type; e.g., run it once to create new contacts and then run it again to apply activist codes to them.
Q: My job says it ran successfully, but some of the information I included did not update in VAN.
A: Ensure that you have the correct column names for each column. If a column name is incorrect, the update won't apply. Additionally, ensure your data is formatted correctly and doesn't have extraneous symbols (see below for invalid characters to avoid).
Q: Are there any invalid characters I should avoid when using this workflow?
A: Yes. Invalid characters will cause both the row that has it to fail, and potentially other rows following it, due to VAN API structure. Currently, we're aware of one invalid character - the ‘#' when included in addressline1. To remove any instances of this from your data, we recommend adding:
```
where regexp_contains(addressline1, '#') = FALSE
```
when creating your tables for upload if you are uploading addressline1. This WHERE statement will cause your query to only return rows in which addressline1 does not contain a '#’. Please contact VAN if you have any questions about other invalid characters, and reach out to help@ctatechallies.org if you’d like assistance modifying the above query to include other characters.
Q: I want to use a job type I don’t see listed here:
A: Reach out to help@techallies.org; we can help you find the appropriate syntax if available.

Have questions? Reach out to us a help@techallies.org.