Archive for Sql

Loading MS SQL Database With CSV Data

Apr 14

I recently had to load a lot of comma separated data into a file never knew how easy it was to load CSV data into a table. Here is the t-sql:

BULK
INSERT Address
FROM ‘c:\address.csv’
WITH
    (
    FIELDTERMINATOR = ‘,’,
    ROWTERMINATOR = ‘\n’,
    FIRSTROW = 2
    )

So handy to have this feature. Simply just point the from to the delimited data on disk and set your delimiters. In my case I am using a comma for the field separator but you could use any character (use \t for tab).

One other common thing is that the first row in your CSV file is the header information (as is the case in my example). If you want to specify to ignore the first row and start reading data from the second row use FIRSTROW=2 (as shown) to skip the header record.

Filed Under: Sql

Database Tricks – Effective Records

Mar 23

 One common place where people replicate data is in situations where you need data at a specific point in time. A perfect example I saw today was this situation:

You can see that the price column is duplicated in both the products and orderDetails table. The rationale for this is that so that when the price of the product changes that it does not affect historical orders.

This is a fairly viable solution but has a few drawbacks. The main one is data duplication. Most of the time the data will be the same (unless product pricing changes frequently). The second one is that if the product name changes it will affect historical data (to overcome that we could add another column to orderDetails called productName but that is leading to more duplication). The final issue is that if a product is to be remove from the catalog then the order detail would have to be removed as well (or some other ugly alternative).

The simple fix for this is to use the effective record approach. With this approach whenever you change a row you expire the original record and create a new record with the new changes in it. In this example we will change the products table to have an effectiveEndDate column on it and remove price from the orderDetails table:

And here is a sample of changing the products table data:

ProductId     Name         Price    EffectiveEndDate
1                 Oranges     1.99     null

so lets say we update Oranges to be $2.35 we would first expire the original and then create a new record with the updated information:

ProductId     Name         Price    EffectiveEndDate
1                 Oranges     1.99     3/21/2007 12:00AM
2                 Oranges     2.35     null

So now people who purchased oranges before 3/21/2007 will show the price of 1.99 while purchases made after will be linked to the $2.35 price.

Things to Consider
1. Anything that queries a table must factor in wether or not to return expired records.
2. When updating a record you now need to expire the old record and create a new record which is more work and should be done in a transactional context incase of failure.
3. To delete a record simply change the record so that it’s EffectiveEndDate is current.
4. Using nulls in the effectiveEndDate column. This can be tricky as it is not a valid date but the best way to describe an indeterminate expiry. Some people use a high date value which makes writing queries easier for this data but is really a magic number which makes me cringe. One technique I have used is to create a function called MaxDateTime() that returns my magic maxdatetime and then have a query like this:
    select    *
    from     product
    where   isnull(EffectiveEndDate, MaxDateTime()) > getDate()
so if the EffectiveEndDate is null convert it to our magic number (that is only in one place due to our use of a function) and ensure that the expiry is greater than today.

You don’t have to use dates either but I like that I can see when a record was taken out of commission. I bit field called IsExpired could be used if you would like.

Another thing that I have implemented before is an audit trail on records that shows which record was changed. I do this by adding a ParentRecord column to the table e.g.:

ProductId     Name         Price    EffectiveEndDate          ParentRecord
1                 Oranges     1.99     3/21/2007 12:00AM      null
2                 Oranges     2.35     4/21/2007 12:00AM      1
3                 Tangerines 2.35     null                             2

Another Use
Another use for this that I have done is time sensitive pricing. For this I had to effective columns. The first is the start date and the second is the end date. This allows to have a record be active for a period of time:

ProductId     Name         Price    EffectiveStartDate      EffectiveEndDate         
1                 Oranges     1.99     3/21/2007 12:00AM    4/21/2007 12:00AM
2                 Oranges     2.35     4/21/2007 12:00AM    5/21/2007 12:00AM
3                 Oranges     2.95     5/21/2007 12:00AM    null

In this case the price is effective for a month. The select query (if properly written) will automatically return the proper price for the point in time which the query is run.

Filed Under: Sql

SQL permission script

Sep 9

Often for my deployments I need a script that recreates all stored procs and then grants the public role to have access to it. I posted a script that someone else had developed a long time ago but…. it sucked…. ya. so here is the simple new one:

select    ‘grant execute on ‘+[name]+ ‘ to public’
from    sysobjects
where    xtype=’P’
or        xtype=’FN’

Filed Under: Sql

SQL comparison

Aug 1

One of the tools I find lacking in the industry is a good sql comparison utility. I have tried some free ones that really seem to not work very well at all. Commercially I have found that red-gate and adept’s programs quite good but I am cheap. Has anyone found a good tool to compare to databases to each other that is free? Or would anyone be interested in starting an open source project to do this with me?

Filed Under: Sql

SQL Boolean Queries

Jul 31

The other night at edmug Richard Campbell presented some good SQL Querying Tips & Techniques. A lot of them were about sql 2005 which was nice to see (espeically using Common Table Expressions which are quite handy in the case of recursion).

The coolest thing I found was in regards to a cross tab query. The example used was to get sales amount by month for each employee. As you can see in the following example this is done with a lot of subqueries that builds a hideous execution plan.

SELECT Salespeople.Salesperson, SUM(S1.Quantity*S1.Price) AS Total,
  (SELECT SUM(S2.Quantity*S2.Price) FROM Sales AS S2 WHERE
    DatePart(mm,S2.Invoice_Date)=1 AND S2.Sales_ID = S1.Sales_ID) AS Jan,
  (SELECT SUM(S2.Quantity*S2.Price) FROM Sales AS S2 WHERE
    DatePart(mm,S2.Invoice_Date)=2 AND S2.Sales_ID = S1.Sales_ID) AS Feb,
  (SELECT SUM(S2.Quantity*S2.Price) FROM Sales AS S2 WHERE
    DatePart(mm,S2.Invoice_Date)=3 AND S2.Sales_ID = S1.Sales_ID) AS Mar,
  (SELECT SUM(S2.Quantity*S2.Price) FROM Sales AS S2 WHERE
    DatePart(mm,S2.Invoice_Date)=4 AND S2.Sales_ID = S1.Sales_ID) AS Apr,
  (SELECT SUM(S2.Quantity*S2.Price) FROM Sales AS S2 WHERE
    DatePart(mm,S2.Invoice_Date)=5 AND S2.Sales_ID = S1.Sales_ID) AS May,
  (SELECT SUM(S2.Quantity*S2.Price) FROM Sales AS S2 WHERE
    DatePart(mm,S2.Invoice_Date)=6 AND S2.Sales_ID = S1.Sales_ID) AS Jun,
  (SELECT SUM(S2.Quantity*S2.Price) FROM Sales AS S2 WHERE
    DatePart(mm,S2.Invoice_Date)=7 AND S2.Sales_ID = S1.Sales_ID) AS Jul,
  (SELECT SUM(S2.Quantity*S2.Price) FROM Sales AS S2 WHERE
    DatePart(mm,S2.Invoice_Date)=8 AND S2.Sales_ID = S1.Sales_ID) AS Aug,
  (SELECT SUM(S2.Quantity*S2.Price) FROM Sales AS S2 WHERE
    DatePart(mm,S2.Invoice_Date)=9 AND S2.Sales_ID = S1.Sales_ID) AS Sep,
  (SELECT SUM(S2.Quantity*S2.Price) FROM Sales AS S2 WHERE
    DatePart(mm,S2.Invoice_Date)=10 AND S2.Sales_ID = S1.Sales_ID) AS Oct,
  (SELECT SUM(S2.Quantity*S2.Price) FROM Sales AS S2 WHERE
    DatePart(mm,S2.Invoice_Date)=11 AND S2.Sales_ID = S1.Sales_ID) AS Nov,
  (SELECT SUM(S2.Quantity*S2.Price) FROM Sales AS S2 WHERE
    DatePart(mm,S2.Invoice_Date)=12 AND S2.Sales_ID = S1.Sales_ID) AS Dec
FROM Sales AS S1 INNER JOIN Salespeople ON S1.Sales_ID = Salespeople.Sales_ID
GROUP BY Salespeople.Salesperson, S1.Sales_ID;

A Russiam mathemetician by the name of Rozenshtein has a great query (I beleive it was called a boolean query)

SELECT Salespeople.Salesperson, SUM(Sales.Quantity*Sales.Price) AS Total,
  SUM(Sales.Quantity*Sales.Price*(1-ABS(SIGN(DatePart(mm,Sales.Invoice_Date)-1)))) AS Jan,
  SUM(Sales.Quantity*Sales.Price*(1-ABS(SIGN(DatePart(mm,Sales.Invoice_Date)-2)))) AS Feb,
  SUM(Sales.Quantity*Sales.Price*(1-ABS(SIGN(DatePart(mm,Sales.Invoice_Date)-3)))) AS Mar,
  SUM(Sales.Quantity*Sales.Price*(1-ABS(SIGN(DatePart(mm,Sales.Invoice_Date)-4)))) AS Apr,
  SUM(Sales.Quantity*Sales.Price*(1-ABS(SIGN(DatePart(mm,Sales.Invoice_Date)-5)))) AS May,
  SUM(Sales.Quantity*Sales.Price*(1-ABS(SIGN(DatePart(mm,Sales.Invoice_Date)-6)))) AS Jun,
  SUM(Sales.Quantity*Sales.Price*(1-ABS(SIGN(DatePart(mm,Sales.Invoice_Date)-7)))) AS Jul,
  SUM(Sales.Quantity*Sales.Price*(1-ABS(SIGN(DatePart(mm,Sales.Invoice_Date)-8)))) AS Aug,
  SUM(Sales.Quantity*Sales.Price*(1-ABS(SIGN(DatePart(mm,Sales.Invoice_Date)-9)))) AS Sep,
  SUM(Sales.Quantity*Sales.Price*(1-ABS(SIGN(DatePart(mm,Sales.Invoice_Date)-10)))) AS Oct,
  SUM(Sales.Quantity*Sales.Price*(1-ABS(SIGN(DatePart(mm,Sales.Invoice_Date)-11)))) AS Nov,
  SUM(Sales.Quantity*Sales.Price*(1-ABS(SIGN(DatePart(mm,Sales.Invoice_Date)-12)))) AS Dec
FROM Sales INNER JOIN Salespeople ON Sales.Sales_ID = Salespeople.Sales_ID
GROUP BY Salespeople.Salesperson;

Breaking this down we take the numeric month out of the Invoice date with the DatePart(mm, Sales.Invoice_date) section. Next the value is run through the SIGN method. SIGN converts a number to a +1 if greater than 0, 0 if the value is 0, or -1 if the number is less than 0. Then convert the number to a positive using ABS (absolute value). Then subtract the value from 1 so now we have a boolean 0 or -1 for the current row. Then multiply the data (in this case the queantiy * price) by the value we have been calculated. So if the row is in the current month it will be multiplied by -1, otherwise by 0 (and hence equal 0 so it will not be included in the sum).

Here is a table showing the value at each step

      

Subtraction SIGN ABS Subtraction
Jan 2 1 1 0
Feb 1 1 1 0
Mar 0 0 0 1
Apr -1 -1 1 0
May -2 -1 1 0
Jun -3 -1 1 0

If you look at the execution plan for this method you will see it is drastically more efficient which really helps with large amounts of data. This technique does not have to be used for dates. A good example Richard Campbell showed was using CharIndex to find a name (as it will return -1 if not found, 0 if an exact match, or greater than 0 if contained within the word). Basically anything you can turn into a numeric to say you have matched something vs. not matched then this technique should be applicable.

Filed Under: Sql

Returning extra data in a stored proc

Jul 18

    Today I was asked how to return an error code from a database along with other data. While I try to sheer away from error codes from the database and do my validation / logic in the business / data layers this still is necessary. Most people I have worked with have no idea that sql2000 (and 2005) support output parameters which can be quite usefull.

CREATE PROCEDURE GetCustomers ( @ErrorCode int OUTPUT )AS

SELECT	* FROM 	customers

--if there is an error set this to something elseset @ErrorCode = 0go

In code you would have something like this:

cmd.parameters.add(“@ErrorCode”, SqlDbType.int)
cmd.parameters(“@ErrorCode”).Direction = ParameterDirection.Output
reader = cmd.executeReader()
if cmd.parameters(“@ErrorCode”).value <> 0 then
    LogError()
else
    … enumerate reader here
end if

So this is a handy way of returning some additional bit of data without having to return another result set. This is something that is fairly rarely used but can be quite handy.

One place I have used this quite effectively is with a dataset centric app that uses the sql application block. I set the identity field in the database to be an output and map that to my dataset. Whenever I insert a record the identity column in the dataset gets updated with the identity from the insert operation. It is beyond the scope of this post to describe it in detail though (and I am hungry) but has been working quite well for me.

Filed Under: Sql

SQL Security Script 2

Jun 10

A while back I posted a script someone shared with me to generate grants on functions and stored procs to a user. I discovered it is pretty crappy and convoluted. I built this instead:

select     ’grant execute on ‘ + [name] + ‘ to public’  
from     sysObjects
where     (
        xtype = ‘P’
    or     xtype=’FN’
    )
and    [name] not like ‘dt_%’

Filed Under: Sql

SQL basic like goto statements

May 30

A lot of people do not know is that SQL supports the GOTO command and labels just like good old basic. An example is worth a 3.3337 words:

 

    BEGIN TRAN UpdateDetails
        
        update     details
        set        payer = @payer
        where      payerId=@PayerId
        
        IF @@error <> 0
            GOTO   Errorhandler

        update     payer
        set        lastActivityDate=getdate()
        where      payerId = @PayerId

        IF @@error <> 0
            GOTO Errorhandler

    commit tran UpdateDetails
ErrorHandler:
        BEGIN
    ROLLBACK TRAN UpdateDetails
        END

As you can see that after every statement we run we check the error status and if there is an error we go to ErrorHandler. Pretty easy (and ugly)

Now you might wonder why I am checking the @@error after every statement and not just at the end. SQL’s @@error gets set to 0 on a successfull statement run. So this SQL snippet

select * from tabledoesnotexist

select * from tableDOESexist

select @@error

Will return 0 because we have run a select against the table that does exist before we check the status of @@error which resets it back to 0. This is why it is incredibly important to test after every method call. I was reading last night that sql2005 supports Try / Catch semantics that makes things a lot easier and cleaner to work with

 

 

Filed Under: Sql

SQL Reformater

May 29

I just built a quick little SQL reformater that makes your sql a bit more readable.

It takes this:
select pmtlndet.pmtlndet_amt, glcdty.glcdty_desc from pmtlndet
join pmtln on pmtln.pmtln_ID = pmtLndet.pmtln_ID
join pmt on pmt.pmt_id = pmtln.pmt_id
join GlCdTy on pmtlndet.GlCdty_id = GlCdTy.GlCdty_id
where pmt.hostappl_inv_num = @HostApplInvNum and pmtln.pmtStatSubTy_id=@RCRYty_ID

and turns it into this:

select	pmtlndet.pmtlndet_amt,	glcdty.glcdty_descfrom	pmtlndetjoin	pmtlnon	pmtln.pmtln_ID=pmtLndet.pmtln_IDjoin	pmton	pmt.pmt_id=pmtln.pmt_idjoin	GlCdTyon	pmtlndet.GlCdty_id=GlCdTy.GlCdty_idwhere	pmt.hostappl_inv_num=@HostApplInvNumand	pmtln.pmtStatSubTy_id=@RCRYty_ID

This tool was just a quick one off and does not reformat subqueries very well as well as function calls in the select statement (i.e. isnull(col, ‘replacement’)) but so far has saved me a ton of time.

NOTE: When you click the parse button it automatically copies the reformated SQL to the clipboard.

Download

Filed Under: Sql

GUIDS in the Database

May 16

For those of you who don’t know what a GUID is it is a 128 bit
value that is “Unique across space and time”. A GUID looks like this:
B2658C9D-A76G-4D72-B0E4-B732332408D6. There garanteed uniqueness has
been used by the com+ system in windows for years. This is why two com
components with the same name can be installed without conflicting
(they each have a GUID that keeps them seperate).

Developers
seem to be pessimistic about GUID collissions in their systems. I just
have to say that it will not happen. As I said com+ has been using this
forever and I have never heard of a GUID collission ever. GUID itself
stands for Globally Unique IDentifier. i.e. accrosss the globe no one
else has the same GUID.

Now onto why these are usefull in databases:

1. Merging of data
If
you have 2 databases with the same tables but are currently seperated
(i.e. an identical sales database in 2 seperate offices). If you want
to combine these into a data warehouse you will have issues with
numerical identies colliding. If you used a GUID then there would be no
conflict.
If you are using replication then GUIIDS are your friends for this same reason

2. Security
If you look at this query string
editpage.aspx?userId=23526
I
would guess that chaning that userId up or down one would allow me to
be another person on the system. Granted it is bad design to expose
this information to the user (hide it in session state instead so the
client never sees it). But as I always say security should be layered.
If we used a GUID for our userIds we would have a query string like
this:
editpage.aspx?userId=B2658C9D-A76G-4D72-B0E4-B732332408D6
By
its very deffinition it would be impossibly for anyone to brute force
this (they could never generate another GUID that matches one in our
system because they are globally unique).
3. Disconnected Data
Typically
when creating data we create the records, insert it, then read back the
numerical key that sql has created. By using a GUID we can set our key
in data to be that GUID and fire it off to the database knowing for
sure that no other ID like it will exist in the datbase.
4. Integer Data Space
You
can run out of integers in a database (I can’t remember the limit) for
IDs but by using GUIDs you can have limitless number of IDs without
fear of exhaustinga limit.

The Cons
Speed: It takes time to generate a number that is that unique. This takes CPU cycles
Readability:
It is easier to read and type a query like select * from table where
ID=30 than select * from table where
ID=’B2658C9D-A76G-4D72-B0E4-B732332408D6′
Space: GUIDs are larger
than integers (4 times larger) and take up more space. I am of the
adage that storage is cheap now so unless you are going to have a
massive massive massive database this does not concern me that much
Page Splits: Due to the size of GUIDs they can contribute to page splits in SQL
Searching and Comparison: It takes a bit longer to compare two guids when searching that it does an integer

 

Filed Under: Sql