This is a process I started years ago on one of my first data cleansing projects for one of the big 3 three auto companies.  I was given a task to update thousands of rows of data and a where clause would not work because the conditions were too vast.  At the time, Mainframe was king, and PC’s computing as a prototyping device and the toolset are not overly evolved.

I wrote the code to create the keys I needed and then exported the keys to excel and manually wrote the update using the key data in cell one.  Once I had this working, I copied the formula to the rest of the rows. Flash forward current time, I write queries to create and run the code on the fly.  But to get good at this, you need to know how I got to this point.  So, I will start back at the excel days.  Ok, maybe it was Lotus 123, but that dates me.  As we progress through this, I decided to break this up into three parts to keep it understandable.  Also, thanks to the cloud, I got to revisit the excel version with an HR task. The HR tool has within the application a SQL window where I can write ad-hoc code and get a result set back.

The initial code was to query for the active employees’ W2 withholdings.

This gives me a table like (the real results coming in at around 2000 rows):

Employee Box Value Account PK
z886 W4 BOX1C M 1 123537
z859 W4 BOX2C Y 200 123576
z642 W4 BOX3 2000.00 300 123596
z323 W4 BOX4B 7000.00 400 127424
z123 W4 BOX4C 10.00 500 123582
z123 W4 BOX4C 10.00 500 123582
z123 W4 BOX4C 10.00 500 123582

 

To make this work, I have to build an update statement for each box type.

Step 1: Copy the data into excel.

Step 2: create the correct update for each Box type using Employee column, Box, and Value (columns A, C, and B

My spreadsheet is 9 columns wide.  The cloud application SQL tool gives me a blank column for A:

A B C D E F G H I
Null Employee Box Value Account PK Prep Where Gen Code
z886 W4 BOX1C M 1 123537
z859 W4 BOX2C Y 200 123576
z642 W4 BOX3 2000.00 300 123596
z323 W4 BOX4B 7000.00 400 127424

 

In Column G – I have the beginning text of the update:

The ending space is NOT required in this column. But I wanted you to see that the single quote and double group are different also. In excel, I do not have the double-quotes.

Column H – has

Here the leading space is required, and the trailing space is NOT required

Column I – has the concatenate function

Note the blob of quotes is really ” ‘ ” (Double, Single, Double).  If your SQL tools require a semi colin for the ending, add it in the “’” blob, so it looks like “’;” (Double, Single, Semi Colin, Double)

The output of this should look like this:

Let’s make this easier to read by showing the colors of each part

Step 3, review the code generated in column I.  This needs to be a valid update statement!

Step 4, copy the cells G through I and past down to the end of this box type.

Step 5 repeats for each box type.

Then you complete this in excel, TEST, TEST, TEST. You are writing an updates statement, and you don’t want a mistake here. Once I have completed the testing and copied the entire 2000+ rows for each box type, I pasted it into the cloud application and ran them all. Making thousands of updates quickly.

This is the first level of automation.  The next one will be writing code in SSMS and generating code.  The third type is to write the code to generate and run the code in one step.  Yes, these will be the next two blog articles. Wait for it.

In the meantime, if you need help with this, contact me to help get yourself on the path to good solid automation of your code to make your servers fast, secure and reliable?

Privacy Preference Center