SQL splitting a varchar list argument to row of data


SQL splitting a varchar list argument to row of data

Posted by Luis Majano
May 14, 2010 10:35:48 UTC

I got into yet another issue with my SQL optimizations this week at work and one of them was that I needed to pass a list of IDs to a function that would be then used in a SQL statement via an “in” clause.  The problem is that the list argument is a varchar:

   1: @docTypeIds varchar(500) = '0',

 

Then I needed to use it in my SQL statement like this:

   1: SELECT *

   2: FROM table

   3: WHERE (

   4:   (@docTypeIds = '0' OR @docTypeIds = '')

   5:   OR

   6:   docType in (@docTypeIds)  

   7: )

The problem with this approach is that the datatype for the docType column was tinyint and then the argument is a varchar list, so it would fail as it cannot cast it. Bummer!  So what to do?  Well, after some searching and searching, the best idea so far (brought up by our local dev guru Sanjib Panda), was to break the list into a simple table with one column of type tinyint, but how?  Here is the simple logic which could easily be created as a UDF:

   1: declare @testString as varchar(20)

   2: set @testString = '1,2,3,4,5,6,7,8,9'

   3:  

   4: declare @tempTable table

   5: (

   6: docType tinyint

   7: )

   8:  

   9: while charindex(',',@testString) > 0

  10: begin

  11:   INSERT INTO @tempTable select substring(@testString,1,1)

  12:   SET @testString = substring(@testString,3,len(@testString))

  13: end

  14:  

  15: select * from @tempTable

So if you run the above, that will give you our temp table with one docType column with all the values we needed.  Man, sometimes I wish SQL could have some ColdFusion goodness and create a cool function like: listToTable() or something.  Anyways, after splitting the list into a table then I can change my SQL to something like this:

   1: AND(

   2:   (@docTypeIds = '0' OR @docTypeIds = '')

   3:   OR

   4:   d.docType in ( select docType from @tempTable )  

   5: )

There you go! The solution is working really fast and given us a nice little split to rows function that can be used on any variable.  Hope this helps.

 

UPDATE

After much trying out and pains, what I had did not work fully :(  So here is an updated function:

   1: -- Temp tables for lists

   2: DECLARE @docTypes TABLE( docType tinyint )

   3: DECLARE @docTypeIds varchar(300)

   4: DECLARE @delimIndex int

   5:  

   6: set @docTypeIds = '11,1,3,54'

   7:     

   8: -- If docTypes sent in, then convert to table rows

   9: IF LEN(@docTypeIds) > 0 AND @docTypeIds <> '0'

  10: BEGIN

  11:     -- Get Delimiter index

  12:     SET @delimIndex = CHARINDEX(',',@docTypeIds)

  13:     -- We have at least one delimiter, start on it

  14:     WHILE @delimIndex > 0

  15:     BEGIN

  16:         INSERT INTO @docTypes SELECT SUBSTRING(@docTypeIds,1,@delimIndex-1)

  17:         SET @docTypeIds = SUBSTRING(@docTypeIds,@delimIndex+1,LEN(@docTypeIds))

  18:         SET @delimIndex = CHARINDEX(',',@docTypeIds)

  19:     END

  20:     -- Insert the last entry or the first single no delim entry

  21:     INSERT INTO @docTypes SELECT @docTypeIds

  22: END

  23:  

  24: select * from @docTypes


Rick O

Another way to do this without a platform-specific function is with a Numbers table:

CREATE TABLE Numbers ( n INT NOT NULL );

-- Insert 1 .. 500 into Numbers

SELECT n FROM Numbers INNER JOIN Whatever ON (SUBSTRING(Whatever.BigString, Numbers.n, 1) = ',');

This gets a list of positions of all of the commas in the string. Join this back to itself to get character ranges for tokens:

SELECT firstChar, MIN(lastChar) AS lastChar FROM ( SELECT 1 AS firstChar UNION SELECT n - 1 FROM Numbers INNER JOIN Whatever ON (SUBSTRING(Whatever.BigString, Numbers.n, 1) = ',') ) AS le LEFT JOIN ( SELECT n + 1 AS lastChar FROM Numbers INNER JOIN Whatever ON (SUBSTRING(Whatever.BigString, Numbers.n, 1) = ',') UNION SELECT LENGTH(Whatever.BigString) ) AS ri ON (firstChar < lastChar) GROUP BY firstChar

Which you can then join back to your original table to get the tokens:

SELECT SUBSTRING(Whatever.BigString, firstChar, lastChar) AS token FROM Whatever CROSS JOIN ( -- the big query above ) AS parts ORDER BY firstChar

Yeah, it's a little overkill ... bit if you absolutely must tokenize strings in a cross-platform way, it works.