SQL Functions

table of beads

As discussed in previous posts (ETL and Extracting with Bulk Insert), the first step in the ETL process is to land the data. Once landed in SQL Server, it is best practice to never assume the landing schema will be the final target schema. Since this data set does not appear to have a gauntlet of data types, I wanted to see if I could write something to clue me on to the data types.

The CSV files contained columns that were clearly labeled as ids. With a quick handful of queries, I was able to confirm the integer columns were well enough. The remainder of the columns were varchar. Just by scanning the tables, it was easy to see that the varchar data type would not be sufficient. The problem became, how do I check what the data type should be for one value?. Then, how do I perform this data type check for all varchar values?.

Performing the check for one column turned out to be using a built-in SQL Server function ISNUMERIC(expression). With this and a check for a decimal place within the varchar to determine if the value is a decimal or an integer. It would be silly to copy and paste this conditional all over the type checking scripts, a stored procedure is too big of a tool, the middle of the road option is SQL Functions. There are hundreds of functions that come out of the box, as developers we can also define our own. For example:

use FormulaOneAnalysis

drop function if exists dbo.TypeCheck
go

create function dbo.TypeCheck(@param varchar(max))
returns varchar(max)
begin
    declare @response varchar(max)
    set @response = ''

    -- check if the parameter passed is a numeric value
    if (isnumeric(@param) = 1) 
        begin 
            set @response = concat(@response, 'numeric')
            -- check if the value is a decimal or an int
            if (charindex('.', @param) > 0)
                set @response = concat(@response, ':decimal')
            else
                set @response = concat(@response, ':int')
        end
    else
        begin
            set @response = concat(@response, 'varchar:', len(@param))
        end

    -- return the concatenated string response
    return @response
end;

With the function defined we are just a query away from using it:

use FormulaOneAnalysis

select dbo.TypeCheck(Name)
from RacesTypeAnalysis

With this knowledge, there is no need to copy-paste from script to script building a maintenance nightmare to maintain.