Functions in the USQL – the hidden gem in the Summer 2017 Update

Functions in the USQL – the hidden gem in the Summer 2017 Update

Hello SQL Folks. We all love USQL for its great extensibility and how it makes our life easier in some particular “area of data”.

There are a lot of things that have been updated this Summer. Complete list of fat can be found here. 

I will give you some practical example of them in one of mu further posts but today I would like to point you to the feature that is already available however it is not docummented. I would like to thank to Michael Rys to point out to this feature :).

Let’s start with something you already know. USQL allows you to define a function written in C#. I have demostrated this a lot during my session in Copenhagen and SQLDay 2017 in Wrocław.

I have a set of files in one one my ADL directory and read them by using a TSV extractor:


DECLARE @in string = "/Data/Aircraft/2006ByMonth/{*}.tsv";
DECLARE @out string = "/Data/Aircraft/2006ByMonth/out/CSharpFunction.tsv";

@search =
EXTRACT Month int,
  DayOfMonth int?,
  DayOfWeek int?,
  UniqueCarrier string,
  FlightNum int?,
  TailNum string,
  ActualElapsedTime int?,
  CRSElapsedTime int?,
  AirTime int?,
  ArrDelay int?,
  DepDelay int?,
  Origin string,
  Dest string,
  Distance int?,
  CarrierDelay int?,
  WeatherDelay int?,
  NASDelay int?,
  SecurityDelay int?,
  LateAircraftDelay int?
FROM @in
USING Extractors.Tsv();

Then in the second step I do a call to a function named “AddOneMonth”.


@m = SELECT DISTINCT SQLSatDenmark.Udfs.AddOneMonth(Month) AS AddedMonth, Month
FROM @search;

The function generates a new column (AddedMonth) which is later send to the output file along with the existing Month column.


OUTPUT @m
TO @out
USING Outputters.Csv();

Now let’s see how the function look like. It is really hard to imagine that something could be simpler than that. And not working well in all cases but the goal of the function is to show how to use the C# code in the USQL code.


namespace SQLSatDenmark
{
   public class Udfs
   {
      public static int AddOneMonth(int month)
      {
         return month + 1;
      }
   }
}

 

Now let’s go to the new thing – undocumented usage of inline functions. My function is pretty simple and I can imagine that function you will write could be as simple as mine but your functions will probably do something more useful. To simplyfy the coding process you could use inline function in your USQL script and not have to write any code in the C# file.

The could could look like this:


DECLARE @in string = "/Data/Aircraft/2006ByMonth/{*}.tsv";
DECLARE @out string = "/Data/Aircraft/2006ByMonth/out/CSharpFunction.tsv";

DECLARE @func Func<int,int> = (s)=>{return s+1;};

Now I skip the EXTRACT part of the code as this has not changed. What has changed is how the @m recordset is prepared:


@m = SELECT DISTINCT <strong>@func(Month) AS AddedMonth, Month
FROM @search;

How it works? This is just a regular inline function that you might probably know from C#. The function can be used only in this particular USQL script and cannot be referenced outside it. Hence we can call it a private function.

In my example the function takes “s” as parameter of the type integer  and returns also integer value.

In general, the function can be vry complicated, for example take a look here:

DECLARE @func Func <string,int?> = (s) =>{int i; var x = Int32.TryParse(s, out i); return x?(int?)i:null;};
OUTPUT
(
 SELECT @func("12") AS ParseStringToInt
 FROM
 ( VALUES(1)) AS T(X)
)
TO "/output/ParseStringToInt.csv"
USING Outputters.Csv();

 

Can we do more than this? Sure we can. You can have more than one function defined in the USQL script:


DECLARE @stringFunc Func<string,string> = (s) =>{return "555"+s;};
DECLARE @func Func <string,int?> = (s) =>{int i; var x = Int32.TryParse(s, out i); return x?(int?)i:null;};
OUTPUT
(
SELECT @func("12") AS ParseStringToInt,
@stringFunc("009") AS StringFunc
FROM
( VALUES(1)) AS T(X)
)
TO "/output/ParseStringToInt.csv"
USING Outputters.Csv();

 

Imagine we can do even more. Let me show you two other examples now.

Let’s analyze the code. I have declared two functions. The @stringFunc function is responsible for concatenating two literals. The @func function tries to convert a string into an integer value. The “magic” happens in the SELECT statement where I use the @stringFunc function as a parameter to the @func function.


DECLARE @stringFunc Func<string,string> = (s) =>{return "555"+s;};
DECLARE @func Func <string,int?> = (s) =>{int i; var x = Int32.TryParse(s, out i); return x?(int?)i:null;};
OUTPUT
(
SELECT @func(@stringFunc("009")) AS ParseStringToInt
FROM
( VALUES(1)) AS T(X)
)
TO "/output/ParseStringToInt.csv"
USING Outputters.Csv();

 

Let me now change the code a little bit. I have changed the definition of the @func function so it now calls @stringFunc function and processes it. The SELECT statement will be much more readable in that case as you might expect and the complexity of the call from the previous example is hidden in the function body.


DECLARE @stringFunc Func<string,string> = (s) =>{return "555"+s;};
DECLARE @func Func <string,int?> = (s) =>{int i; var x = Int32.TryParse(@stringFunc(s), out i); return x?(int?)i:null;};

OUTPUT
(
SELECT @func("009") AS ParseStringToInt
FROM
( VALUES(1)) AS T(X)
)
TO "/output/ParseStringToInt.csv"
USING Outputters.Csv();

 

Enough for today… Should we use inline functions? Hell, yes!!!!

They have all the advantages as they have in the C# programming

 

Cheers

Damian

Previous SQLSaturday #656 Denmark - we were there!
Next ASF 004: Mark Broadbent interview

About author

You might also like

Last Week Reading 0 Comments

Last week reading (2018-09-09)

Automatic intelligent insights to optimize performance with SQL Data Warehouse With no additional costs, Azure SQL DW delivers you best practice recommendations in terms of data skew and suboptimal table

Last week reading (2018-07-15)

Structured Streaming with Databricks into Power BI & Cosmos DB Giuliano Rapoz (Microsoft) is showing the concept of Structured Streaming with all these puzzles. Lightning fast query performance with Azure

Last week reading (2018-01-28)

SQL Server 2017 CU3 adds CXCONSUMER waits, doesn’t mention it New wait is not visible everywhere, but check where you can find it. Sabbatical and the search for a dream

3 Comments

Leave a Reply

Click here to cancel reply.