<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:creativeCommons="http://backend.userland.com/creativeCommonsRssModule">

<channel>
	<title>Idea Excursion &#187; SQL Server</title>
	<atom:link href="http://www.ideaexcursion.com/category/microsoft/windows/sql-server/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.ideaexcursion.com</link>
	<description>Technology Musings</description>
	<lastBuildDate>Tue, 29 Jun 2010 21:24:52 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
	<atom:link rel='hub' href='http://www.ideaexcursion.com/?pushpress=hub'/>
<creativeCommons:license>http://creativecommons.org/licenses/by-sa/3.0/us/</creativeCommons:license>		<item>
		<title>Default column value to identity of different column</title>
		<link>http://www.ideaexcursion.com/2010/04/19/default-column-value-to-identity-of-different-column/</link>
		<comments>http://www.ideaexcursion.com/2010/04/19/default-column-value-to-identity-of-different-column/#comments</comments>
		<pubDate>Mon, 19 Apr 2010 21:09:36 +0000</pubDate>
		<dc:creator>Taylor Gerring</dc:creator>
				<category><![CDATA[SQL Server]]></category>

		<guid isPermaLink="false">http://www.ideaexcursion.com/?p=1507</guid>
		<description><![CDATA[Given a table with one column as an identity, how you default another column to the identity of the first, while retaining the ability to change later]]></description>
			<content:encoded><![CDATA[<p>Thanks to my friend Paul Mayle for putting this one out there:</p>
<blockquote><p>Given a table with columns:<br />
A: int, primary key, identity<br />
B: int, not null</p>
<p>How can I set a default for column B to be the value in column A?</p></blockquote>
<p>This is an interesting problem on a couple of levels. First, a computed column won&#8217;t work, because we need the ability to arbitrarily update column B. We could write a trigger, but I prefer to avoid them when possible. Ideally, we would use the DEFAULT option for exactly what it was designed for. Unfortunately, this isn&#8217;t an immediately obvious choice, because SQL Server doesn&#8217;t allow you to reference a column in a DEFAULT specification. In fact, according to <abbr title="Books Online">BOL</abbr>, &#8220;Only a constant value, such as a character string; a scalar function (either a system, user-defined, or CLR function); or NULL can be used as a default.&#8221;<br />
<span id="more-1507"></span><br />
Fortunately, the column which we&#8217;re trying to set a default from has an identity property, and the system functions that deal with identity values are the key to using the DEFAULT specification. First, a little code:</p>

<div class="wp_syntax"><div class="code"><pre class="tsql" style="font-family:monospace;"><span style="color: #0000FF;">CREATE</span> <span style="color: #0000FF;">TABLE</span> dbo.<span style="color: #202020;">IdentDefault</span>
<span style="color: #808080;">&#40;</span>
	A <span style="color: #0000FF;">INT</span> <span style="color: #808080;">NOT</span> <span style="color: #808080;">NULL</span> <span style="color: #0000FF;">IDENTITY</span> <span style="color: #808080;">&#40;</span><span style="color: #000;">1</span>, <span style="color: #000;">1</span><span style="color: #808080;">&#41;</span> <span style="color: #0000FF;">PRIMARY</span> <span style="color: #0000FF;">KEY</span>,
	B <span style="color: #0000FF;">INT</span> <span style="color: #808080;">NOT</span> <span style="color: #808080;">NULL</span> <span style="color: #0000FF;">DEFAULT</span> <span style="color: #FF00FF;">SCOPE_IDENTITY</span><span style="color: #808080;">&#40;</span><span style="color: #808080;">&#41;</span>
<span style="color: #808080;">&#41;</span>
&nbsp;
<span style="color: #0000FF;">INSERT</span> <span style="color: #0000FF;">INTO</span> dbo.<span style="color: #202020;">IdentDefault</span> <span style="color: #0000FF;">DEFAULT</span> <span style="color: #0000FF;">VALUES</span>
<span style="color: #0000FF;">INSERT</span> <span style="color: #0000FF;">INTO</span> dbo.<span style="color: #202020;">IdentDefault</span> <span style="color: #0000FF;">DEFAULT</span> <span style="color: #0000FF;">VALUES</span>
<span style="color: #0000FF;">INSERT</span> <span style="color: #0000FF;">INTO</span> dbo.<span style="color: #202020;">IdentDefault</span> <span style="color: #0000FF;">DEFAULT</span> <span style="color: #0000FF;">VALUES</span>
&nbsp;
<span style="color: #0000FF;">UPDATE</span> dbo.<span style="color: #202020;">IdentDefault</span> <span style="color: #0000FF;">SET</span> B <span style="color: #808080;">=</span> <span style="color: #000;">10</span> <span style="color: #0000FF;">WHERE</span> A <span style="color: #808080;">=</span> <span style="color: #000;">1</span>
&nbsp;
<span style="color: #0000FF;">SELECT</span> <span style="color: #808080;">*</span> <span style="color: #0000FF;">FROM</span> dbo.<span style="color: #202020;">IdentDefault</span></pre></div></div>

<p>And the result:</p>

<div class="wp_syntax"><div class="code"><pre class="none" style="font-family:monospace;">A           B
----------- -----------
1           10
2           2
3           3</pre></div></div>

<p>As you can see from the above result set, column B defaults to whatever column A contains, yet we can still update column B to whatever value is necessary.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.ideaexcursion.com/2010/04/19/default-column-value-to-identity-of-different-column/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Import Excel 2007 with SQL Server Import and Export Wizard</title>
		<link>http://www.ideaexcursion.com/2010/01/06/import-excel-2007-with-sql-server-import-and-export-wizard/</link>
		<comments>http://www.ideaexcursion.com/2010/01/06/import-excel-2007-with-sql-server-import-and-export-wizard/#comments</comments>
		<pubDate>Wed, 06 Jan 2010 15:38:02 +0000</pubDate>
		<dc:creator>Taylor Gerring</dc:creator>
				<category><![CDATA[SQL Server]]></category>

		<guid isPermaLink="false">http://www.ideaexcursion.com/?p=1466</guid>
		<description><![CDATA[Neither SQL Server 2008 nor Office 2007 install Microsoft.ACE.OLEDB.12.0 providers. With a small tweak, we can enable connections to Excel and Access 2007.]]></description>
			<content:encoded><![CDATA[<p>If you need to pull in data from an external source ad hoc, using <abbr title="SQL Server Integration Services">SSIS</abbr> is often overkill. Instead, the SQL Server Import and Export Wizard (called &#8220;Import and Export Data&#8221; in the Start Menu, but DTSWizard.exe in the filesystem) usually does a good job. This is an especially great way to pull in data from users, which typically comes in the form of an Excel attachment. Unfortunately, the providers to import from the newer Office 2007 and 2010 XLSX file format (also referred to as &#8220;Office Open XML&#8221;) are not available by default, and an import attempt will likely result in a &#8220;Microsoft.ACE.OLEDB.12.0 Provider is not registered&#8221; error. The fix is as easy as installing the &#8220;<a title="Download Details: 2007 Office System Driver: Data Connectivity Components" href="http://www.microsoft.com/downloads/details.aspx?FamilyID=7554F536-8C28-4598-9B72-EF94E038C891&amp;displaylang=en" target="_blank">2007 Office System Driver: Data Connectivity Components</a>&#8221; package from Microsoft. This same package also enables connections to Access 2007.<br />
<span id="more-1466"></span><br />
After installation, simply re-run DTSWizard and try to import again. If you can&#8217;t find Excel or Access in the Data Source dropdown list, remember that the providers only work in 32-bit mode, and therefore you need to run &#8220;Import and Export Data (32-bit),&#8221; which is located at &#8220;C:\Program Files (x86)\Microsoft SQL Server\100\DTS\Binn\DTSWizard.exe&#8221;.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.ideaexcursion.com/2010/01/06/import-excel-2007-with-sql-server-import-and-export-wizard/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Accessing Custom .NET Assemblies in SSIS 2008 Script Tasks</title>
		<link>http://www.ideaexcursion.com/2009/10/14/accessing-custom-net-assemblies-in-ssis-2008-script-tasks/</link>
		<comments>http://www.ideaexcursion.com/2009/10/14/accessing-custom-net-assemblies-in-ssis-2008-script-tasks/#comments</comments>
		<pubDate>Wed, 14 Oct 2009 19:36:24 +0000</pubDate>
		<dc:creator>Taylor Gerring</dc:creator>
				<category><![CDATA[SSIS]]></category>
		<category><![CDATA[.NET]]></category>
		<category><![CDATA[C#]]></category>
		<category><![CDATA[SQL Server]]></category>
		<category><![CDATA[VB.NET]]></category>

		<guid isPermaLink="false">http://www.ideaexcursion.com/?p=1429</guid>
		<description><![CDATA[If you need to access a custom .NET Assembly from an SSIS Script Task, Microsoft doesn't make things very easy - but it's still possible with a little setup.]]></description>
			<content:encoded><![CDATA[<p>If you need to access a custom .NET Assembly from an <abbr title="SQL Server Integration Services">SSIS</abbr> Script Task, Microsoft doesn&#8217;t make things very easy &#8211; but it&#8217;s still possible with a little setup. This is a great way to introduce custom data types or some new functionality without having to replicate that code in a new environment.<br />
<span id="more-1429"></span></p>
<h2>The Setup</h2>
<ul>
<li>Windows 7 64-bit</li>
<li>Visual Studio 2008</li>
<li>SQL Server 2008 64-bit</li>
</ul>
<h2>The Process</h2>
<ol>
<li>Create a signing key (See also, <a title="How to: Create a Public/Private Key Pair" href="http://msdn.microsoft.com/en-us/library/6f05ezxy.aspx" target="_blank">How to: Create a Public/Private Key Pair</a>)
<ol>
<li>Open the Visual Studio 2008 Command Prompt &#8211; the regular command prompt <em>will not</em> work</li>
<li>Change to a friendly directory: cd %userprofile%\Desktop</li>
<li>Create the key file: sn -k key.snk</li>
</ol>
</li>
<li>Sign the assembly &#8211; There are a few ways to do this, but I found this to be the easiest. If you want to sign it some other way, check out <a title="http://msdn.microsoft.com/en-us/library/xc31ft41.aspx" href="http://msdn.microsoft.com/en-us/library/xc31ft41.aspx" target="_blank">How to: Sign an Assembly with a Strong Name</a>
<ol>
<li>Right-click the Project</li>
<li>Select &#8220;Properties&#8221;</li>
<li>Navigate to the &#8220;Signing&#8221; tab</li>
<li>Browse to strong name key file (which was created in the previous step)</li>
<li>Recompile the project</li>
</ol>
</li>
<li>Copy the re-compiled assembly to your <acronym title="Global Assembly Cache">GAC</acronym>
<ol>
<li>gacutil -i &#8220;C:\Path\to\CustomAssemblyName.dll&#8221;</li>
</ol>
</li>
<li>Copy assembly to &#8220;%programfiles(x86)%\Microsoft SQL Server\100\SDK\Assemblies&#8221;</li>
<li>Add reference in script task. Repeat this for <strong>every </strong>Script Task you want to access this assembly from
<ol>
<li>Right-click References</li>
<li>Click &#8220;Add Reference&#8230;&#8221;
<p><div id="attachment_1437" class="wp-caption alignnone" style="width: 310px"><a rel="attachment wp-att-1437" href="http://www.ideaexcursion.com/2009/10/14/accessing-custom-net-assemblies-in-ssis-2008-script-tasks/add-reference/"><img class="size-medium wp-image-1437" title="Add Reference..." src="http://www.ideaexcursion.com/wp-content/uploads/2009/10/add-reference-300x139.png" alt="Add Reference..." width="300" height="139" /></a><p class="wp-caption-text">Add Reference...</p></div></li>
<li>On the .NET tab, scroll to find your assembly</li>
<li>Press &#8220;OK&#8221;</li>
<li>The Assembly should now appear under the References list</li>
</ol>
</li>
<li>Import the assembly&#8217;s namespace at the top of the script code
<ol>
<li>(C#) using CustomAssemblyName;</li>
<li>(VB.NET) Imports CustomAssemblyName</li>
</ol>
</li>
<li>You should now have full access to the imported <abbr title="Dynamic Link Library">DLL</abbr></li>
</ol>
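<p>To make the last two steps concrete, here is a minimal, self-contained sketch. The TextHelper class and its Clean method are hypothetical stand-ins for whatever your own assembly exposes, and the namespace is declared in the same file only so the sample compiles on its own. In a real package, the namespace would live in the signed <abbr title="Dynamic Link Library">DLL</abbr>, and the call would sit inside the Script Task&#8217;s generated Main method.</p>

<div class="wp_syntax"><div class="code"><pre class="csharp" style="font-family:monospace;">using System;
using CustomAssemblyName; // import the namespace exposed by the referenced assembly

namespace CustomAssemblyName
{
    // Hypothetical helper; stands in for the types your real assembly provides.
    public static class TextHelper
    {
        // Returns the trimmed value, or an empty string when the input is null.
        public static string Clean(string value)
        {
            return value == null ? string.Empty : value.Trim();
        }
    }
}

public static class ScriptTaskSketch
{
    // Stand-in for the Script Task entry point; shows the imported type in use.
    public static void Main()
    {
        Console.WriteLine("[" + TextHelper.Clean("  hello  ") + "]");
    }
}</pre></div></div>

<p>Once the using directive is in place, the type can be called by its short name, just like any framework class.</p>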
<h2>Caveats</h2>
<p>This method works pretty well, but deployment isn&#8217;t exactly seamless &#8211; you&#8217;ll have to repeat this for each server, and for any update you&#8217;ll have to re-register and copy the <abbr title="Dynamic Link Library">DLL</abbr> again. Additionally, there is no way to globally add the assembly reference to the entire project or package. Instead, you&#8217;ll have to repeat step 5 (adding the reference) for every Script Task.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.ideaexcursion.com/2009/10/14/accessing-custom-net-assemblies-in-ssis-2008-script-tasks/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>SQL Server Regular Expression CLR UDF</title>
		<link>http://www.ideaexcursion.com/2009/08/18/sql-server-regular-expression-clr-udf/</link>
		<comments>http://www.ideaexcursion.com/2009/08/18/sql-server-regular-expression-clr-udf/#comments</comments>
		<pubDate>Tue, 18 Aug 2009 19:55:21 +0000</pubDate>
		<dc:creator>Taylor Gerring</dc:creator>
				<category><![CDATA[SQL Server]]></category>
		<category><![CDATA[CLR]]></category>
		<category><![CDATA[RegEx]]></category>
		<category><![CDATA[Regular Expressions]]></category>
		<category><![CDATA[SQL Server 2005]]></category>
		<category><![CDATA[SQL Server 2008]]></category>
		<category><![CDATA[UDF]]></category>

		<guid isPermaLink="false">http://www.ideaexcursion.com/?p=1378</guid>
		<description><![CDATA[How to create a .NET user-defined function that exposes RegEx functionality to Microsoft SQL Server.]]></description>
			<content:encoded><![CDATA[<p>Face it: data cleanup is a fact of life. While SQL Server has a handful of string manipulation functions, nothing even comes close to the power of RegEx. Fortunately, by leveraging the <abbr title="Common Language Runtime">CLR</abbr> functionality in SQL Server 2005 and SQL Server 2008, we can add a host of new features, including regular expressions.<br />
<span id="more-1378"></span></p>
<h3>Steps</h3>
<ol>
<li>First, fire up Visual Studio (2005 or 2008 &#8211; it doesn&#8217;t matter).</li>
<li>Create a new project &#8211; name it something clever, like &#8220;RegEx&#8221;</li>
<li>After creating the project, you should be prompted to connect to a database where you&#8217;ll eventually want to deploy the project. This is completely optional and can be changed later.</li>
<li>Right-click the project name (&#8220;RegEx&#8221;) and choose Add &rarr; User-Defined Function<div id="attachment_1389" class="wp-caption alignnone" style="width: 310px"><a href="http://www.ideaexcursion.com/2009/08/18/sql-server-regular-expression-clr-udf/add-udf/" rel="attachment wp-att-1389"><img src="http://www.ideaexcursion.com/wp-content/uploads/2009/08/add-udf-300x202.PNG" alt="Add User Defined Function" title="Add User Defined Function" width="300" height="202" class="size-medium wp-image-1389" /></a><p class="wp-caption-text">Add User Defined Function</p></div></li>
<li>Name the file RegExMatch.</li>
<li>Paste the following code into that file</li>
<li>When done, simply build (Ctrl+Shift+B) and deploy (right-click the project name (&#8220;RegEx&#8221;) &rarr; Deploy).</li>
</ol>

<div class="wp_syntax"><table><tr><td class="line_numbers"><pre>1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
</pre></td><td class="code"><pre class="csharp" style="font-family:monospace;"><span style="color: #0600FF;">using</span> <span style="color: #008080;">System</span><span style="color: #008000;">;</span>
<span style="color: #0600FF;">using</span> <span style="color: #008080;">System.Data</span><span style="color: #008000;">;</span>
<span style="color: #0600FF;">using</span> <span style="color: #008080;">System.Data.SqlClient</span><span style="color: #008000;">;</span>
<span style="color: #0600FF;">using</span> <span style="color: #008080;">System.Data.SqlTypes</span><span style="color: #008000;">;</span>
<span style="color: #0600FF;">using</span> <span style="color: #008080;">Microsoft.SqlServer.Server</span><span style="color: #008000;">;</span>
<span style="color: #0600FF;">using</span> <span style="color: #008080;">System.Text.RegularExpressions</span><span style="color: #008000;">;</span>
&nbsp;
<span style="color: #0600FF;">public</span> <span style="color: #0600FF;">partial</span> <span style="color: #FF0000;">class</span> UserDefinedFunctions
<span style="color: #000000;">&#123;</span>
    <span style="color: #000000;">&#91;</span>Microsoft.<span style="color: #0000FF;">SqlServer</span>.<span style="color: #0000FF;">Server</span>.<span style="color: #0000FF;">SqlFunction</span><span style="color: #000000;">&#40;</span>IsDeterministic <span style="color: #008000;">=</span> <span style="color: #0600FF;">true</span>, IsPrecise <span style="color: #008000;">=</span> <span style="color: #0600FF;">true</span><span style="color: #000000;">&#41;</span><span style="color: #000000;">&#93;</span>
    <span style="color: #0600FF;">public</span> <span style="color: #0600FF;">static</span> SqlString RegExMatch<span style="color: #000000;">&#40;</span>SqlString expression, SqlString pattern<span style="color: #000000;">&#41;</span>
    <span style="color: #000000;">&#123;</span>
        <span style="color: #0600FF;">if</span> <span style="color: #000000;">&#40;</span>expression.<span style="color: #0000FF;">IsNull</span> <span style="color: #008000;">||</span> pattern.<span style="color: #0000FF;">IsNull</span><span style="color: #000000;">&#41;</span>
            <span style="color: #0600FF;">return</span> SqlString.<span style="color: #0000FF;">Null</span><span style="color: #008000;">;</span>
&nbsp;
        Match match <span style="color: #008000;">=</span> <span style="color: #008000;">new</span> Regex<span style="color: #000000;">&#40;</span>pattern.<span style="color: #0000FF;">ToString</span><span style="color: #000000;">&#40;</span><span style="color: #000000;">&#41;</span><span style="color: #000000;">&#41;</span>.<span style="color: #0000FF;">Match</span><span style="color: #000000;">&#40;</span>expression.<span style="color: #0000FF;">ToString</span><span style="color: #000000;">&#40;</span><span style="color: #000000;">&#41;</span><span style="color: #000000;">&#41;</span><span style="color: #008000;">;</span>
&nbsp;
        <span style="color: #0600FF;">return</span> match.<span style="color: #0000FF;">Success</span> <span style="color: #008000;">?</span> <span style="color: #008000;">new</span> SqlString<span style="color: #000000;">&#40;</span>match.<span style="color: #0000FF;">Value</span><span style="color: #000000;">&#41;</span> <span style="color: #008000;">:</span> SqlString.<span style="color: #0000FF;">Null</span><span style="color: #008000;">;</span>
&nbsp;
    <span style="color: #000000;">&#125;</span>
<span style="color: #000000;">&#125;</span><span style="color: #008000;">;</span></pre></td></tr></table></div>

<h3>Code Explanation</h3>
<ul>
<li>Lines 1-6: Import the necessary namespaces. The only one you need to add is line 6 &#8211; System.Text.RegularExpressions</li>
<li>Line 11: The function signature takes 2 parameters: the input string and the regular expression to apply.</li>
<li>Lines 13-14: Check if either input is NULL. If so, return NULL and do nothing else.</li>
<li>Line 16: There&#8217;s a lot packed on this line, but essentially, it creates an object named <em>match</em> that holds the result of the regular expression match operation.</li>
<li>Line 18: Use the ternary operator to check if the match was a success. If so, return the matching string. Otherwise, return NULL.</li>
</ul>
<h3>Examples</h3>
<p>We&#8217;re going to use a simple regular expression to check for a valid US postal code (a 5-digit ZIP code, optionally in ZIP+4 form):</p>
<pre>^\d{5}(-\d{4})?$</pre>
<p>This regular expression checks for exactly 5 digits, optionally followed by a group consisting of a hyphen and 4 more digits.</p>

<div class="wp_syntax"><div class="code"><pre class="tsql" style="font-family:monospace;"><span style="color: #0000FF;">SELECT</span>
	  dbo.<span style="color: #202020;">RegExMatch</span><span style="color: #808080;">&#40;</span><span style="color: #FF0000;">'90210'</span>,<span style="color: #FF0000;">'^<span style="color: #000099; font-weight: bold;">\d</span>{5}(-<span style="color: #000099; font-weight: bold;">\d</span>{4})?$'</span><span style="color: #808080;">&#41;</span>
	, dbo.<span style="color: #202020;">RegExMatch</span><span style="color: #808080;">&#40;</span><span style="color: #FF0000;">'90210-1234'</span>,<span style="color: #FF0000;">'^<span style="color: #000099; font-weight: bold;">\d</span>{5}(-<span style="color: #000099; font-weight: bold;">\d</span>{4})?$'</span><span style="color: #808080;">&#41;</span>
	, dbo.<span style="color: #202020;">RegExMatch</span><span style="color: #808080;">&#40;</span><span style="color: #FF0000;">'90210-'</span>,<span style="color: #FF0000;">'^<span style="color: #000099; font-weight: bold;">\d</span>{5}(-<span style="color: #000099; font-weight: bold;">\d</span>{4})?$'</span><span style="color: #808080;">&#41;</span>
	, dbo.<span style="color: #202020;">RegExMatch</span><span style="color: #808080;">&#40;</span><span style="color: #FF0000;">'9021A'</span>,<span style="color: #FF0000;">'^<span style="color: #000099; font-weight: bold;">\d</span>{5}(-<span style="color: #000099; font-weight: bold;">\d</span>{4})?$'</span><span style="color: #808080;">&#41;</span></pre></div></div>

<p>And results:</p>

<div class="wp_syntax"><div class="code"><pre class="none" style="font-family:monospace;">90210	90210-1234	NULL	NULL</pre></div></div>
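
<p>If you want to sanity-check a pattern before deploying the function, the same matching logic can be exercised in a small console program. This is only a sketch: the ZipCheck class and the MatchOrNull method are hypothetical names introduced for illustration, and the method mirrors what RegExMatch does using plain strings instead of SqlString.</p>

<div class="wp_syntax"><div class="code"><pre class="csharp" style="font-family:monospace;">using System;
using System.Text.RegularExpressions;

public static class ZipCheck
{
    // Same pattern as in the T-SQL example.
    public const string Pattern = @"^\d{5}(-\d{4})?$";

    // Mirrors RegExMatch: returns the matched text, or null when nothing matches.
    public static string MatchOrNull(string input)
    {
        if (input == null)
            return null;

        Match match = new Regex(Pattern).Match(input);
        return match.Success ? match.Value : null;
    }

    public static void Main()
    {
        string[] samples = { "90210", "90210-1234", "90210-", "9021A" };

        foreach (string sample in samples)
        {
            // Print NULL for non-matches, like the query output.
            Console.WriteLine(sample + ": " + (MatchOrNull(sample) ?? "NULL"));
        }
    }
}</pre></div></div>

<p>Running it should report a match for the first two samples and NULL for the last two, in line with the query results.</p>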

<h3>Other notes</h3>
<p>If you need to change the target database, do so in the project&#8217;s properties:</p>
<ol>
<li>Right-click the project name (&#8220;RegEx&#8221;) &rarr; Properties</li>
<li>Select the Database tab (2nd item from the bottom)</li>
<li>Click &#8220;Browse&#8230;&#8221; to create and test the connection string.<div id="attachment_1390" class="wp-caption alignnone" style="width: 310px"><a href="http://www.ideaexcursion.com/2009/08/18/sql-server-regular-expression-clr-udf/database-properties/" rel="attachment wp-att-1390"><img src="http://www.ideaexcursion.com/wp-content/uploads/2009/08/database-properties-300x148.PNG" alt="Database Properties" title="Database Properties" width="300" height="148" class="size-medium wp-image-1390" /></a><p class="wp-caption-text">Database Properties</p></div></li>
</ol>
<p>Thanks to <a title="Regular Expression Replace in SQL 2005 (via the CLR)" href="http://weblogs.sqlteam.com/jeffs/archive/2007/04/27/SQL-2005-Regular-Expression-Replace.aspx" target="_blank">Jeff&#8217;s SQL Server Blog</a> for the initial Regular Expression Replace code.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.ideaexcursion.com/2009/08/18/sql-server-regular-expression-clr-udf/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>HOWTO: Connect to MySQL in SSIS</title>
		<link>http://www.ideaexcursion.com/2009/06/04/howto-connect-to-mysql-in-ssis/</link>
		<comments>http://www.ideaexcursion.com/2009/06/04/howto-connect-to-mysql-in-ssis/#comments</comments>
		<pubDate>Thu, 04 Jun 2009 14:58:43 +0000</pubDate>
		<dc:creator>Taylor Gerring</dc:creator>
				<category><![CDATA[Google]]></category>
		<category><![CDATA[SSIS]]></category>
		<category><![CDATA[MySQL]]></category>
		<category><![CDATA[SQL Server]]></category>

		<guid isPermaLink="false">http://www.ideaexcursion.com/?p=793</guid>
		<description><![CDATA[Using the MySQL ADO.NET provider, SQL Server Integration Services can natively query MySQL databases, providing an easy method to transfer data between systems.]]></description>
			<content:encoded><![CDATA[<p>While Microsoft provided <a title="SSIS Team Blog New connectivity options in 2008" href="http://blogs.msdn.com/mattm/archive/2008/03/10/new-connectivity-options-in-2008.aspx" target="_blank">connectors for Oracle, Teradata, and SAP BI</a> for <abbr title="SQL Server Integration Services">SSIS</abbr> 2008, there are many other database systems left out of the mix. Fortunately, <abbr title="SQL Server Integration Services">SSIS</abbr> is exceptionally flexible in connecting to various data sources and allows other vendors to provide native support. The MySQL team did just that with <a title="MySQL :: Download Connector/Net 6.0" href="http://dev.mysql.com/downloads/connector/net/6.0.html" target="_blank">Connector/NET 6.0</a>, their ADO.NET provider. This tool allows us to use the ADO.NET connections in SQL Server Integration Services to easily connect to MySQL. This is a walkthrough of how to connect to MySQL with <abbr title="SQL Server Integration Services">SSIS</abbr> 2005 using the Connector/NET 6.0 ADO.NET provider.<br />
<span id="more-793"></span></p>
<ol>
<li>Download and install MySQL <a title="MySQL :: Download Connector/Net 6.0" href="http://dev.mysql.com/downloads/connector/net/6.0.html" target="_blank">Connector/NET 6.0</a></li>
<li>Start a new Integration Services project in <acronym title="Business Intelligence Development Studio">BIDS</acronym>
</li>
<li>Right-click in Connection Managers and create a new ADO.NET Connection
<p><a rel="attachment wp-att-808" href="http://www.ideaexcursion.com/2009/06/04/howto-connect-to-mysql-in-ssis/new-ado-net-connection/"><img class="size-medium wp-image-808" title="New ADO.NET Connection" src="http://www.ideaexcursion.com/wp-content/uploads/2009/05/new-ado-net-connection-250x300.png" alt="New ADO.NET Connection" width="250" height="300" /></a></li>
<li>In the Provider dropdown, expand .Net Providers and select MySQL Data Provider. Press &quot;OK&quot;
<p><a rel="attachment wp-att-807" href="http://www.ideaexcursion.com/2009/06/04/howto-connect-to-mysql-in-ssis/mysql-data-provider/"><img class="size-medium wp-image-807" title="MySQL Data Provider" src="http://www.ideaexcursion.com/wp-content/uploads/2009/05/mysql-data-provider-300x201.png" alt="MySQL Data Provider" width="300" height="201" /></a></li>
<li>Fill out the Server name, User name, Password and select the database name for the target MySQL server. Be sure to test the connection and press &#8220;OK&#8221;
<p><a rel="attachment wp-att-799" href="http://www.ideaexcursion.com/2009/06/04/howto-connect-to-mysql-in-ssis/connection-manager-connection-info/"><img class="size-medium wp-image-799" title="Connection Manager Connection Info" src="http://www.ideaexcursion.com/wp-content/uploads/2009/05/connection-manager-connection-info-294x300.png" alt="Connection Manager Connection Info" width="294" height="300" /></a></li>
<li>Rename the connection to &#8220;MySQLDB&#8221;
<p><a rel="attachment wp-att-800" href="http://www.ideaexcursion.com/2009/06/04/howto-connect-to-mysql-in-ssis/connection-managers-mysqldb/"><img class="size-full wp-image-800" title="Connection Managers MySQLDB" src="http://www.ideaexcursion.com/wp-content/uploads/2009/05/connection-managers-mysqldb.png" alt="Connection Managers MySQLDB" width="143" height="50" /></a></li>
<li>Open up the Toolbox and drag a Data Flow Task from the toolbox onto the Control Flow surface
<div id="attachment_795" class="wp-caption alignnone" style="width: 310px"><a rel="attachment wp-att-795" href="http://www.ideaexcursion.com/2009/06/04/howto-connect-to-mysql-in-ssis/add-dataflowtask/"><img class="size-medium wp-image-795" title="Add Dataflow Task" src="http://www.ideaexcursion.com/wp-content/uploads/2009/05/add-dataflowtask-300x76.png" alt="Add Dataflow Task" width="300" height="76" /></a><p class="wp-caption-text">Add Dataflow Task</p></div>
</li>
<li>Double-click the Data Flow Task to switch to the Data Flow view</li>
<li>Create a new variable, &#8220;MySQLResult&#8221; with the Data Type of Object. We will be using this as the final destination for the data, so we don&#8217;t need to connect to a file or database to store the data from this test
<p><div id="attachment_812" class="wp-caption alignnone" style="width: 310px"><a rel="attachment wp-att-812" href="http://www.ideaexcursion.com/2009/06/04/howto-connect-to-mysql-in-ssis/variables-mysqlresult/"><img class="size-medium wp-image-812" title="MySQLResult Variable" src="http://www.ideaexcursion.com/wp-content/uploads/2009/05/variables-mysqlresult-300x66.png" alt="MySQLResult Variable" width="300" height="66" /></a><p class="wp-caption-text">MySQLResult Variable</p></div></li>
<li>Drag a new DataReader Source component onto the Data Flow surface
<p><div id="attachment_796" class="wp-caption alignnone" style="width: 310px"><a rel="attachment wp-att-796" href="http://www.ideaexcursion.com/2009/06/04/howto-connect-to-mysql-in-ssis/add-datareader-source/"><img class="size-medium wp-image-796" title="Add DataReader Source" src="http://www.ideaexcursion.com/wp-content/uploads/2009/05/add-datareader-source-300x78.png" alt="Add DataReader Source" width="300" height="78" /></a><p class="wp-caption-text">Add DataReader Source</p></div></li>
<li>Double-click the DataReader Source to open the Advanced Editor. On the Connection Managers tab, select the previously-created MySQLDB connection
<p><div id="attachment_805" class="wp-caption alignnone" style="width: 310px"><a rel="attachment wp-att-805" href="http://www.ideaexcursion.com/2009/06/04/howto-connect-to-mysql-in-ssis/datareader-source-connection-managers/"><img class="size-medium wp-image-805" title="DataReader Source Connection Managers" src="http://www.ideaexcursion.com/wp-content/uploads/2009/05/datareader-source-connection-managers-300x290.png" alt="DataReader Source Connection Managers" width="300" height="290" /></a><p class="wp-caption-text">DataReader Source Connection Managers</p></div>
</li>
<li>Switch to the Component Properties tab and enter the SQL query in the SqlCommand property. Note that the query must be compatible with MySQL syntax, not SQL Server.
<p><div id="attachment_804" class="wp-caption alignnone" style="width: 310px"><a rel="attachment wp-att-804" href="http://www.ideaexcursion.com/2009/06/04/howto-connect-to-mysql-in-ssis/datareader-source-component-properties/"><img class="size-medium wp-image-804" title="DataReader Source Component Properties" src="http://www.ideaexcursion.com/wp-content/uploads/2009/05/datareader-source-component-properties-300x269.png" alt="DataReader Source Component Properties" width="300" height="269" /></a><p class="wp-caption-text">DataReader Source Component Properties</p></div>
</li>
<li>Switch to the Column Mappings tab to verify that the query is successful and that all the columns were pulled from the database. When done, press &#8220;OK&#8221;.
<div id="attachment_803" class="wp-caption alignnone" style="width: 310px"><a rel="attachment wp-att-803" href="http://www.ideaexcursion.com/2009/06/04/howto-connect-to-mysql-in-ssis/datareader-soruce-column-mappings/"><img class="size-medium wp-image-803" title="DataReader Source Column Mappings" src="http://www.ideaexcursion.com/wp-content/uploads/2009/05/datareader-soruce-column-mappings-300x269.png" alt="DataReader Source Column Mappings" width="300" height="269" /></a><p class="wp-caption-text">DataReader Source Column Mappings</p></div></li>
<li>Create a new Recordset Destination by dragging it from the toolbox to the Data Flow surface
<p><div id="attachment_797" class="wp-caption alignnone" style="width: 310px"><a rel="attachment wp-att-797" href="http://www.ideaexcursion.com/2009/06/04/howto-connect-to-mysql-in-ssis/add-recordset-destination/"><img class="size-medium wp-image-797" title="Add Recordset Destination" src="http://www.ideaexcursion.com/wp-content/uploads/2009/05/add-recordset-destination-300x102.png" alt="Add Recordset Destination" width="300" height="102" /></a><p class="wp-caption-text">Add Recordset Destination</p></div>
</li>
<li>Drag the green Data Flow Path from DataReader Source to Recordset Destination, so they connect
<p><div id="attachment_801" class="wp-caption alignnone" style="width: 170px"><a rel="attachment wp-att-801" href="http://www.ideaexcursion.com/2009/06/04/howto-connect-to-mysql-in-ssis/connect-source-destination/"><img class="size-full wp-image-801" title="Connect Source to Destination" src="http://www.ideaexcursion.com/wp-content/uploads/2009/05/connect-source-destination.png" alt="Connect Source to Destination" width="160" height="145" /></a><p class="wp-caption-text">Connect Source to Destination</p></div></li>
<li>Double-click the Recordset Destination to open its Advanced Editor</li>
<li>Under Custom Properties, select the dropdown for VariableName and select the variable we created before, User::MySQLResult
<p><div id="attachment_810" class="wp-caption alignnone" style="width: 310px"><a rel="attachment wp-att-810" href="http://www.ideaexcursion.com/2009/06/04/howto-connect-to-mysql-in-ssis/recordset-destination-component-properties/"><img class="size-medium wp-image-810" title="Recordset Destination Component Properties" src="http://www.ideaexcursion.com/wp-content/uploads/2009/05/recordset-destination-component-properties-300x269.png" alt="Recordset Destination Component Properties" width="300" height="269" /></a><p class="wp-caption-text">Recordset Destination Component Properties</p></div></li>
<li>Switch to the Input Columns tab and select those columns that you want stored in the Recordset Destination. When complete, click &#8220;OK&#8221;
<p><div id="attachment_811" class="wp-caption alignnone" style="width: 308px"><a rel="attachment wp-att-811" href="http://www.ideaexcursion.com/2009/06/04/howto-connect-to-mysql-in-ssis/recordset-destination-input-columns/"><img class="size-medium wp-image-811" title="Recordset Destination Input Columns" src="http://www.ideaexcursion.com/wp-content/uploads/2009/05/recordset-destination-input-columns-298x300.png" alt="Recordset Destination Input Columns" width="298" height="300" /></a><p class="wp-caption-text">Recordset Destination Input Columns</p></div>
</li>
<li>Right-click the green Data Flow Path and choose &#8220;Data Viewers&#8230;&#8221;
<p><div id="attachment_813" class="wp-caption alignnone" style="width: 264px"><a rel="attachment wp-att-813" href="http://www.ideaexcursion.com/2009/06/04/howto-connect-to-mysql-in-ssis/data-flow-path-data-viewers/"><img class="size-medium wp-image-813" title="Data Flow Path Data Viewers" src="http://www.ideaexcursion.com/wp-content/uploads/2009/05/data-flow-path-data-viewers-254x300.png" alt="Data Flow Path Data Viewers" width="254" height="300" /></a><p class="wp-caption-text">Data Flow Path Data Viewers</p></div>
</li>
<li>Select &#8220;Data Viewers&#8221; from the left pane and click the &#8220;Add&#8230;&#8221; button
<p><div id="attachment_802" class="wp-caption alignnone" style="width: 310px"><a rel="attachment wp-att-802" href="http://www.ideaexcursion.com/2009/06/04/howto-connect-to-mysql-in-ssis/data-flow-path-editor/"><img class="size-medium wp-image-802" title="Data Flow Path Editor" src="http://www.ideaexcursion.com/wp-content/uploads/2009/05/data-flow-path-editor-300x253.png" alt="Data Flow Path Editor" width="300" height="253" /></a><p class="wp-caption-text">Data Flow Path Editor</p></div>
</li>
<li>Under the General tab, select Grid and press &#8220;OK&#8221;
<p><div id="attachment_798" class="wp-caption alignnone" style="width: 310px"><a rel="attachment wp-att-798" href="http://www.ideaexcursion.com/2009/06/04/howto-connect-to-mysql-in-ssis/configure-data-viewer/"><img class="size-medium wp-image-798" title="Configure Data Viewer" src="http://www.ideaexcursion.com/wp-content/uploads/2009/05/configure-data-viewer-300x229.png" alt="Configure Data Viewer" width="300" height="229" /></a><p class="wp-caption-text">Configure Data Viewer</p></div></li>
<li>Run the package</li>
<li>If you&#8217;ve done everything correctly, you should see a Data Reader Output Data Viewer window pop up with the contents of the query we specified earlier.
<p><div id="attachment_806" class="wp-caption alignnone" style="width: 310px"><a rel="attachment wp-att-806" href="http://www.ideaexcursion.com/2009/06/04/howto-connect-to-mysql-in-ssis/data-viewer-output/"><img class="size-medium wp-image-806" title="Data Viewer Output" src="http://www.ideaexcursion.com/wp-content/uploads/2009/05/data-viewer-output-300x165.png" alt="Data Viewer Output" width="300" height="165" /></a><p class="wp-caption-text">Data Viewer Output</p></div></li>
</ol>
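<p>Because the SqlCommand text is handed to MySQL unchanged, T-SQL-only constructs such as TOP or square-bracket identifiers will fail there. As a rough sketch, assuming a hypothetical orders table (the table and column names are placeholders, not part of the package built above), a MySQL-compatible query might look like this:</p>

<div class="wp_syntax"><div class="code"><pre class="mysql" style="font-family:monospace;">-- MySQL syntax: LIMIT replaces TOP, and backticks replace [brackets]
-- The table and column names below are hypothetical
SELECT `order_id`, `customer_name`, `order_date`
FROM `orders`
ORDER BY `order_date` DESC
LIMIT 10;</pre></div></div>
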
<p>SQL Server Integration Services makes connecting to other systems very easy. The MySQL ADO.NET provider works well, but requires more configuration than a native Source component.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.ideaexcursion.com/2009/06/04/howto-connect-to-mysql-in-ssis/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>Flatten Hierarchies in SQL Server with Common Table Expressions</title>
		<link>http://www.ideaexcursion.com/2009/05/12/flatten-heirarchies-in-sql-server-with-common-table-expressions/</link>
		<comments>http://www.ideaexcursion.com/2009/05/12/flatten-heirarchies-in-sql-server-with-common-table-expressions/#comments</comments>
		<pubDate>Tue, 12 May 2009 14:41:57 +0000</pubDate>
		<dc:creator>Taylor Gerring</dc:creator>
				<category><![CDATA[SQL Server]]></category>
		<category><![CDATA[SQL]]></category>
		<category><![CDATA[SQL Server 2005]]></category>
		<category><![CDATA[SQL Server 2008]]></category>
		<category><![CDATA[T-SQL]]></category>

		<guid isPermaLink="false">http://www.ideaexcursion.com/?p=731</guid>
		<description><![CDATA[Use a CTE to recursively roll up all descendant records into separate groups, regardless of level. Example data and Common Table Expression code included.]]></description>
			<content:encoded><![CDATA[<p>Common Table Expressions were introduced in SQL Server 2005 and provide an efficient way to recursively query relationships stored in a normalized table. We&#8217;re going to build on that functionality to flatten a typical corporate structure so that all children, grandchildren, great-grandchildren, etc. roll up into a single, flattened parent, regardless of depth. To visualize this, take a look at the actual relationship we&#8217;ll be querying against:<br />
<span id="more-731"></span><br />
<div id="attachment_772" class="wp-caption alignnone" style="width: 310px"><a href="http://www.ideaexcursion.com/2009/05/12/flatten-heirarchies-in-sql-server-with-common-table-expressions/actual-relationship/" rel="attachment wp-att-772"><img src="http://www.ideaexcursion.com/wp-content/uploads/2009/05/actual-relationship-300x211.png" alt="Actual employee structure to be flattened" title="Actual Relationship" width="300" height="211" class="size-medium wp-image-772" /></a><p class="wp-caption-text">Actual employee structure to be flattened</p></div></p>
<p>This structure will act as our example for the Development Department. The Department Head (Employee 0) has asked for a report of all employees within each team of the development group. Notice that different groups have varying levels of depth. The Database team only has a Manager with two direct reports. The <abbr title="User Interface">UI</abbr> team is a single person. The Middle Tier group is much larger, with a Manager having two direct reports, and one of those employees having several employees beneath him. The requested report should group all employees, flattening the relationship to a single Manager per team, but excluding the Department Head himself (since including him would mean every employee simply rolls up to him). This &#8220;Desired Relationship&#8221; is represented below:</p>
<div id="attachment_771" class="wp-caption alignnone" style="width: 310px"><a href="http://www.ideaexcursion.com/2009/05/12/flatten-heirarchies-in-sql-server-with-common-table-expressions/desired-relationship/" rel="attachment wp-att-771"><img src="http://www.ideaexcursion.com/wp-content/uploads/2009/05/desired-relationship-300x80.png" alt="The desired result of the query" title="Desired Relationship" width="300" height="80" class="size-medium wp-image-771" /></a><p class="wp-caption-text">The desired result of the query</p></div>
<p>To recreate this scenario, I&#8217;m providing some code for both the setup and query. If you&#8217;re not familiar with Common Table Expressions, I would suggest that you familiarize yourself with the <a href="http://msdn.microsoft.com/en-us/library/ms186243.aspx" title="SQL Server Books Online">SQL Server <abbr title="Books Online">BOL</abbr> entry</a>. The main concept you should understand to be able to adapt this to your own data is that for a <abbr title="Common Table Expression">CTE</abbr> to act recursively, you need both an anchor query and a recursive query. I&#8217;ll explain their parts later.</p>
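<p>As a minimal sketch of that anchor-plus-recursive shape, independent of the employee data used in this post (the CTE name Numbers and the column n are made up for illustration), the following counts from 1 to 5:</p>

<div class="wp_syntax"><div class="code"><pre class="tsql" style="font-family:monospace;">WITH Numbers (n)
AS
(
	-- Anchor query: runs once and seeds the result set
	SELECT 1
	UNION ALL
	-- Recursive query: references the CTE itself and repeats until it returns no rows
	SELECT n + 1
	FROM Numbers
	WHERE n &lt; 5
)
SELECT n
FROM Numbers</pre></div></div>

<p>The anchor produces the first row, and each pass of the recursive query builds on the rows from the previous pass until the WHERE condition filters everything out.</p>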
<p>First, let&#8217;s instantiate the tables and populate them with data:</p>

<div class="wp_syntax"><table><tr><td class="line_numbers"><pre>1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
</pre></td><td class="code"><pre class="tsql" style="font-family:monospace;"><span style="color: #0000FF;">USE</span> tempdb
go
&nbsp;
<span style="color: #0000FF;">IF</span> <span style="color: #FF00FF;">OBJECT_ID</span><span style="color: #808080;">&#40;</span><span style="color: #FF0000;">'dbo.Employees'</span><span style="color: #808080;">&#41;</span> <span style="color: #0000FF;">IS</span> not null
	<span style="color: #0000FF;">DROP</span> <span style="color: #0000FF;">TABLE</span> dbo.<span style="color: #202020;">Employees</span>
go
&nbsp;
<span style="color: #0000FF;">CREATE</span> <span style="color: #0000FF;">TABLE</span> dbo.<span style="color: #202020;">Employees</span>
<span style="color: #808080;">&#40;</span>
	  EmployeeID <span style="color: #0000FF;">INT</span> <span style="color: #0000FF;">PRIMARY</span> <span style="color: #0000FF;">KEY</span>
	, EmployeeName <span style="color: #0000FF;">VARCHAR</span><span style="color: #808080;">&#40;</span><span style="color: #000;">50</span><span style="color: #808080;">&#41;</span> not null
	, ManagerEmployeeID <span style="color: #0000FF;">INT</span> null
<span style="color: #808080;">&#41;</span>
go
&nbsp;
<span style="color: #0000FF;">INSERT</span> <span style="color: #0000FF;">INTO</span> dbo.<span style="color: #202020;">Employees</span> <span style="color: #808080;">&#40;</span>EmployeeID, EmployeeName, ManagerEmployeeID<span style="color: #808080;">&#41;</span>
<span style="color: #0000FF;">SELECT</span> <span style="color: #000;">0</span>, <span style="color: #FF0000;">'Employee 0'</span>, null
<span style="color: #0000FF;">UNION</span> all
<span style="color: #0000FF;">SELECT</span> <span style="color: #000;">1</span>, <span style="color: #FF0000;">'Employee 1'</span>, <span style="color: #000;">0</span>
<span style="color: #0000FF;">UNION</span> all
<span style="color: #0000FF;">SELECT</span> <span style="color: #000;">2</span>, <span style="color: #FF0000;">'Employee 2'</span>, <span style="color: #000;">0</span>
<span style="color: #0000FF;">UNION</span> all
<span style="color: #0000FF;">SELECT</span> <span style="color: #000;">3</span>, <span style="color: #FF0000;">'Employee 3'</span>, <span style="color: #000;">0</span>
<span style="color: #0000FF;">UNION</span> all
<span style="color: #0000FF;">SELECT</span> <span style="color: #000;">4</span>, <span style="color: #FF0000;">'Employee 1.1'</span>, <span style="color: #000;">1</span>
<span style="color: #0000FF;">UNION</span> all
<span style="color: #0000FF;">SELECT</span> <span style="color: #000;">5</span>, <span style="color: #FF0000;">'Employee 1.2'</span>, <span style="color: #000;">1</span>
<span style="color: #0000FF;">UNION</span> all
<span style="color: #0000FF;">SELECT</span> <span style="color: #000;">6</span>, <span style="color: #FF0000;">'Employee 3.1'</span>, <span style="color: #000;">3</span>
<span style="color: #0000FF;">UNION</span> all
<span style="color: #0000FF;">SELECT</span> <span style="color: #000;">7</span>, <span style="color: #FF0000;">'Employee 3.2'</span>, <span style="color: #000;">3</span>
<span style="color: #0000FF;">UNION</span> all
<span style="color: #0000FF;">SELECT</span> <span style="color: #000;">8</span>, <span style="color: #FF0000;">'Employee 3.1.1'</span>, <span style="color: #000;">6</span>
<span style="color: #0000FF;">UNION</span> all
<span style="color: #0000FF;">SELECT</span> <span style="color: #000;">9</span>, <span style="color: #FF0000;">'Employee 3.1.1.1'</span>, <span style="color: #000;">8</span>
go</pre></td></tr></table></div>

<p>This script will simply create the example employees table and populate it with data to relationally represent what is displayed in the first picture. Here is what the various lines accomplish:</p>
<ul>
<li>1-2: Switch the database context to tempdb. Since the contents of tempdb are cleared upon service restart, we&#8217;re simply ensuring that this will be cleaned up eventually.</li>
<li>4-6: Check if dbo.Employees exists. If so, drop it.</li>
<li>8-14: Create a table to hold our example data.</li>
<li>16-36: Manually populate the example data. Note that we&#8217;re using a UNION ALL between the selects, so only a single INSERT occurs.</li>
</ul>
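<p>One way to sanity-check the load (a sketch only; the aliases e and m and the ManagerName column are arbitrary) is a self-join that lists each employee beside their direct manager:</p>

<div class="wp_syntax"><div class="code"><pre class="tsql" style="font-family:monospace;">SELECT e.EmployeeID, e.EmployeeName, m.EmployeeName AS ManagerName
FROM dbo.Employees e
-- LEFT JOIN keeps Employee 0, who has no manager
LEFT JOIN dbo.Employees m ON m.EmployeeID = e.ManagerEmployeeID
ORDER BY e.EmployeeID</pre></div></div>

<p>With the sample data this should return ten rows, with a NULL ManagerName only for Employee 0.</p>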
<p>As for the query to manipulate this data, here is the actual <abbr title="Common Table Expression">CTE</abbr>:</p>

<div class="wp_syntax"><table><tr><td class="line_numbers"><pre>1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
</pre></td><td class="code"><pre class="tsql" style="font-family:monospace;"><span style="color: #008080;">-- Use a CTE to flatten an organization structure to each department</span>
<span style="color: #0000FF;">WITH</span> AllMyChildren <span style="color: #808080;">&#40;</span>ManagerEmployeeID, EmployeeID, GroupManagerID, EmployeeName, <span style="color: #0000FF;">DEPTH</span><span style="color: #808080;">&#41;</span>
<span style="color: #0000FF;">AS</span>
<span style="color: #808080;">&#40;</span>
	<span style="color: #008080;">-- The anchor statement. This selects each department head</span>
	<span style="color: #008080;">-- The important part is to alias the original EmployeeID as the GroupManagerID</span>
	<span style="color: #0000FF;">SELECT</span> ManagerEmployeeID, EmployeeID, EmployeeID <span style="color: #808080;">&#91;</span>GroupManagerID<span style="color: #808080;">&#93;</span>, EmployeeName, <span style="color: #000;">0</span> <span style="color: #808080;">&#91;</span><span style="color: #0000FF;">DEPTH</span><span style="color: #808080;">&#93;</span>
	<span style="color: #0000FF;">FROM</span> dbo.<span style="color: #202020;">Employees</span> <span style="color: #0000FF;">WHERE</span> EmployeeID in
		<span style="color: #008080;">-- We want all employees 1 level below a certain level, so we'll use a subquery to find them</span>
		<span style="color: #808080;">&#40;</span>
		<span style="color: #008080;">-- Get EmployeeIDs that are children to their common parent</span>
		<span style="color: #0000FF;">SELECT</span> EmployeeID <span style="color: #0000FF;">FROM</span> dbo.<span style="color: #202020;">Employees</span> <span style="color: #0000FF;">WHERE</span> ManagerEmployeeID <span style="color: #808080;">=</span> <span style="color: #000;">0</span>
		<span style="color: #808080;">&#41;</span>
	<span style="color: #0000FF;">UNION</span> <span style="color: #808080;">ALL</span>
	<span style="color: #008080;">-- The recursive statement which finds all employees under each department</span>
	<span style="color: #0000FF;">SELECT</span> pc.<span style="color: #202020;">ManagerEmployeeID</span>, pc.<span style="color: #202020;">EmployeeID</span>, amc.<span style="color: #202020;">GroupManagerID</span>, pc.<span style="color: #202020;">EmployeeName</span>, <span style="color: #0000FF;">DEPTH</span> <span style="color: #808080;">+</span> <span style="color: #000;">1</span>
	<span style="color: #0000FF;">FROM</span> dbo.<span style="color: #202020;">Employees</span> pc
	<span style="color: #0000FF;">INNER</span> join AllMyChildren amc <span style="color: #0000FF;">ON</span> pc.<span style="color: #202020;">ManagerEmployeeID</span> <span style="color: #808080;">=</span> amc.<span style="color: #202020;">EmployeeID</span>
<span style="color: #808080;">&#41;</span>
<span style="color: #008080;">-- The CTE is primed, but we still need to execute it with the statement below</span>
<span style="color: #0000FF;">SELECT</span> EmployeeName, GroupManagerID, <span style="color: #0000FF;">DEPTH</span> 
<span style="color: #0000FF;">FROM</span> AllMyChildren
<span style="color: #0000FF;">ORDER</span> <span style="color: #0000FF;">BY</span> GroupManagerID, <span style="color: #0000FF;">DEPTH</span>
<span style="color: #008080;">-- Use the MAXRECURSION query hint to avoid default recursion limit of 100</span>
<span style="color: #0000FF;">OPTION</span> <span style="color: #808080;">&#40;</span>MAXRECURSION <span style="color: #000;">0</span><span style="color: #808080;">&#41;</span></pre></td></tr></table></div>

<p>I&#8217;ve commented liberally, but I&#8217;ll go through this line-by-line to explain it in more detail.</p>
<ul>
<li>2-4: Define the <abbr title="Common Table Expression">CTE</abbr> (AllMyChildren) and the columns that it will output (ManagerEmployeeID, EmployeeID, GroupManagerID, EmployeeName, Depth)</li>
<li>7-8: This is the anchor statement. SELECT the fields we eventually want output. The key is that EmployeeID is included a second time, aliased as GroupManagerID.</li>
<li>10-13: This is a subquery in the WHERE condition of the anchor statement. Because we want to discard some levels (only the topmost level in our example), we need to tell the anchor to fetch all children whose parent EmployeeID is 0. Adjust this subquery to change the level at which the teams roll up.</li>
<li>14-18: UNION ALL the anchor to the recursive portion of the <abbr title="Common Table Expression">CTE</abbr>. The recursive part of the query is similar to the anchor in that it must SELECT the same number of columns with compatible types. Notice that it again selects from dbo.Employees, but it also performs an INNER JOIN against the <abbr title="Common Table Expression">CTE</abbr>, linking the EmployeeID and ManagerEmployeeID. This is what causes the recursion to occur. Additionally, we increment Depth by 1. The inclusion of the level is completely optional in the <abbr title="Common Table Expression">CTE</abbr>, but may be helpful for reporting.</li>
<li>19: Close the <abbr title="Common Table Expression">CTE</abbr> block. At this point, it is defined, but we have yet to execute it.</li>
<li>21-23: Perform a SELECT against the <abbr title="Common Table Expression">CTE</abbr> to start it.</li>
<li>25: By default, recursion in Common Table Expressions is limited to 100 levels. If you anticipate a hierarchy deeper than 100 levels, you&#8217;ll need to raise the limit or specify MAXRECURSION 0. When testing, you can limit it to any number up to 32,767 to prevent runaway queries. Here &#8211; since this is tested and working &#8211; we specify &#8220;0&#8221; for unlimited recursion.</li>
</ul>
<p>And this is the result of the above query:<br />
<div id="attachment_766" class="wp-caption alignnone" style="width: 283px"><a href="http://www.ideaexcursion.com/2009/05/12/flatten-heirarchies-in-sql-server-with-common-table-expressions/query-result/" rel="attachment wp-att-766"><img src="http://www.ideaexcursion.com/wp-content/uploads/2009/05/query-result.png" alt="The tabular results of the query" title="Query Result" width="273" height="191" class="size-full wp-image-766" /></a><p class="wp-caption-text">The tabular results of the query</p></div><br />
Notice that the GroupManagerID is the same for each team, regardless of depth. Also, the Depth corresponds to how far down the tree each employee sits. I&#8217;ve conveniently coded the EmployeeName in dot-notation to make the comparison much simpler.</p>
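<p>Since every row carries its GroupManagerID, the same CTE can feed an aggregate. As a sketch, assuming the dbo.Employees table created earlier (the TeamSize and MaxDepth aliases are made up for illustration, and the anchor filters on ManagerEmployeeID directly, which is equivalent to the subquery used above), this counts the members of each team, including the team&#8217;s own manager:</p>

<div class="wp_syntax"><div class="code"><pre class="tsql" style="font-family:monospace;">WITH AllMyChildren (EmployeeID, GroupManagerID, Depth)
AS
(
	-- Anchor: each direct report of Employee 0 starts its own group
	SELECT EmployeeID, EmployeeID, 0
	FROM dbo.Employees
	WHERE ManagerEmployeeID = 0
	UNION ALL
	-- Recursive: carry the group manager down to every descendant
	SELECT pc.EmployeeID, amc.GroupManagerID, amc.Depth + 1
	FROM dbo.Employees pc
	INNER JOIN AllMyChildren amc ON pc.ManagerEmployeeID = amc.EmployeeID
)
SELECT GroupManagerID, COUNT(*) AS TeamSize, MAX(Depth) AS MaxDepth
FROM AllMyChildren
GROUP BY GroupManagerID
ORDER BY GroupManagerID</pre></div></div>

<p>With the sample data, this should report team sizes of 3, 1, and 5 for group managers 1, 2, and 3.</p>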
]]></content:encoded>
			<wfw:commentRss>http://www.ideaexcursion.com/2009/05/12/flatten-heirarchies-in-sql-server-with-common-table-expressions/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>HOWTO: Setup SQL Server Linked Server to MySQL</title>
		<link>http://www.ideaexcursion.com/2009/02/25/howto-setup-sql-server-linked-server-to-mysql/</link>
		<comments>http://www.ideaexcursion.com/2009/02/25/howto-setup-sql-server-linked-server-to-mysql/#comments</comments>
		<pubDate>Wed, 25 Feb 2009 21:54:06 +0000</pubDate>
		<dc:creator>Taylor Gerring</dc:creator>
				<category><![CDATA[SQL Server]]></category>
		<category><![CDATA[HOWTO]]></category>
		<category><![CDATA[MySQL]]></category>
		<category><![CDATA[SQL Server 2005]]></category>
		<category><![CDATA[SQL Server 2008]]></category>

		<guid isPermaLink="false">http://www.ideaexcursion.com/?p=558</guid>
		<description><![CDATA[An illustrated, step-by-step guide for connecting MySQL and SQL Server via Linked Servers utilizing the MySQL ODBC Connector driver.]]></description>
			<content:encoded><![CDATA[<p>Although SQL Server is completely proprietary, one of its nice connectivity features is the ability to query other servers through a Linked Server. Essentially, a linked server is a method of directly querying another <abbr title="Relational DataBase Management System">RDBMS</abbr>; this often happens through an <abbr title="Open DataBase Connectivity">ODBC</abbr> driver installed on the server. Fortunately, many popular databases provide such an <abbr title="Open DataBase Connectivity">ODBC</abbr> driver, giving SQL Server the ability to connect to a wide range of other systems. I&#8217;ve already written about <a href="http://www.ideaexcursion.com/2009/01/05/connecting-to-oracle-from-sql-server/" title="Connecting to Oracle from SQL Server">how to connect Oracle and SQL Server</a>. In this post, I&#8217;m going to go through the steps necessary to connect SQL Server and MySQL.<br />
<span id="more-558"></span><br />
The first step is to fetch an appropriate <a title="MySQL Connector/ODBC 5.1 Downloads" href="http://dev.mysql.com/downloads/connector/odbc/5.1.html" target="_blank">MySQL Connector/<abbr title="Open DataBase Connectivity">ODBC</abbr> 5.1 download</a>. Drivers are available for a variety of <abbr title="Operating System">OS</abbr>es, but we&#8217;re focused on Windows or Windows x64; pick the one that corresponds to the version of SQL Server installed. After you&#8217;ve downloaded and installed the driver, we have a few things to configure, so let&#8217;s get started:</p>
<h3>Configure a MySQL <abbr title="Data Source Name">DSN</abbr></h3>
<p>Start by configuring a MySQL data source with the <abbr title="Open DataBase Connectivity">ODBC</abbr> Data Source Administrator. This step is technically optional, but it allows a simpler configuration in the SQL Server Linked Server settings. Instead of composing a complicated MySQL connection string, we can use a simple <acronym title="Graphical User Interface">GUI</acronym> application.</p>
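<p>For comparison, skipping the <abbr title="Data Source Name">DSN</abbr> means handing the driver a full connection string when the linked server is created. A rough sketch, in which the linked server name, host, database, and credentials are all placeholders and the driver name must match whatever is actually installed:</p>

<div class="wp_syntax"><div class="code"><pre class="tsql" style="font-family:monospace;">-- DSN-less alternative: every value below is a hypothetical placeholder
EXEC master.dbo.sp_addlinkedserver
	  @server = N'MYSQLAPP_NODSN'
	, @srvproduct = N'MySQL'
	, @provider = N'MSDASQL'
	, @provstr = N'DRIVER={MySQL ODBC 3.51 Driver};SERVER=mysql.example.com;DATABASE=exampledb;USER=exampleuser;PASSWORD=examplepassword;'
go</pre></div></div>

<p>The <abbr title="Data Source Name">DSN</abbr> approach described in the rest of this section keeps those details out of the linked server definition.</p>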
<div id="attachment_571" class="wp-caption alignnone" style="width: 310px"><a href="http://www.ideaexcursion.com/wp-content/uploads/2009/02/runodbcad32.png"><img class="size-medium wp-image-571" title="Run odbcad32" src="http://www.ideaexcursion.com/wp-content/uploads/2009/02/runodbcad32-300x154.png" alt="Run odbcad32" width="300" height="154" /></a><p class="wp-caption-text">Run odbcad32</p></div>
<p>If you&#8217;re using Windows Server 2003, bring up a Run dialog box with Start&rarr;Run or WinKey+R. If you&#8217;re using Windows Server 2008, use the Start Menu search box directly. In either <abbr title="Operating System">OS</abbr>, type in &#8220;odbcad32&#8221; and hit Enter.</p>
<div id="attachment_567" class="wp-caption alignnone" style="width: 310px"><a href="http://www.ideaexcursion.com/wp-content/uploads/2009/02/odbcsystemdsn.png"><img class="size-medium wp-image-567" title="System DSN" src="http://www.ideaexcursion.com/wp-content/uploads/2009/02/odbcsystemdsn-300x245.png" alt="System DSN" width="300" height="245" /></a><p class="wp-caption-text">System DSN</p></div>
<p>Select the System <abbr title="Data Source Name">DSN</abbr> tab to configure a data source for the entire system. If you only want to create the <abbr title="Data Source Name">DSN</abbr> for a specific user (such as your service account), use the User <abbr title="Data Source Name">DSN</abbr> tab. In either scenario, select the &#8220;Add&#8230;&#8221; button.</p>
<div id="attachment_560" class="wp-caption alignnone" style="width: 310px"><a href="http://www.ideaexcursion.com/wp-content/uploads/2009/02/createnewdatasource.png"><img class="size-medium wp-image-560" title="Create New Data Source" src="http://www.ideaexcursion.com/wp-content/uploads/2009/02/createnewdatasource-300x221.png" alt="Create New Data Source" width="300" height="221" /></a><p class="wp-caption-text">Create New Data Source</p></div>
<p>Scroll down in the Create New Data Source window and select &#8220;MySQL <abbr title="Open DataBase Connectivity">ODBC</abbr> 3.51 Driver&#8221; and click &#8220;Finish&#8221;.</p>
<div id="attachment_566" class="wp-caption alignnone" style="width: 310px"><a href="http://www.ideaexcursion.com/wp-content/uploads/2009/02/odbcconfigurelogin.png"><img class="size-medium wp-image-566" title="MySQL Connector Login Settings" src="http://www.ideaexcursion.com/wp-content/uploads/2009/02/odbcconfigurelogin-300x217.png" alt="MySQL Connector Login Settings" width="300" height="217" /></a><p class="wp-caption-text">MySQL Connector Login Settings</p></div>
<p>Once added, clicking the &#8220;Configure&#8230;&#8221; button will bring up the Connector/<abbr title="Open DataBase Connectivity">ODBC</abbr> 3.51 Configure Data Source application. This is where you can specify all the connection settings for connecting SQL Server to MySQL. Enter a Data Source Name &#8211; I typically name it after the application or database I&#8217;m connecting to. The Server, User, Password, and Database fields are self-explanatory.</p>
<div id="attachment_568" class="wp-caption alignnone" style="width: 239px"><a href="http://www.ideaexcursion.com/wp-content/uploads/2009/02/odbctestconnection.png"><img class="size-full wp-image-568" title="Test ODBC Connection" src="http://www.ideaexcursion.com/wp-content/uploads/2009/02/odbctestconnection.png" alt="Test ODBC Connection" width="229" height="113" /></a><p class="wp-caption-text">Test ODBC Connection</p></div>
<p>After you&#8217;ve entered all the required parameters, click the &#8220;Test&#8221; button to ensure a connection can be made to the MySQL server.</p>
<p>These settings are the bare minimum required to connect MySQL and SQL Server via a linked server, but I like to specify additional options to optimize the connection between the servers. Without these, I have run into &#8220;Out of Memory&#8221; errors that require restarting the service.</p>
<div id="attachment_563" class="wp-caption alignnone" style="width: 310px"><a href="http://www.ideaexcursion.com/wp-content/uploads/2009/02/odbcconfigureflags1.png"><img class="size-medium wp-image-563" title="MySQL Connector Advanced Flags 1 Settings" src="http://www.ideaexcursion.com/wp-content/uploads/2009/02/odbcconfigureflags1-300x217.png" alt="MySQL Connector Advanced Flags 1 Settings" width="300" height="217" /></a><p class="wp-caption-text">MySQL Connector Advanced Flags 1 Settings</p></div>
<p>Select the Advanced tab and you&#8217;ll be placed on the &#8220;Flags 1&#8221; sub-tab. Check the boxes labeled &#8220;Allow Big Results&#8221; and &#8220;Use Compressed Protocol&#8221;.</p>
<div id="attachment_564" class="wp-caption alignnone" style="width: 310px"><a href="http://www.ideaexcursion.com/wp-content/uploads/2009/02/odbcconfigureflags2.png"><img class="size-medium wp-image-564" title="MySQL Connector Advanced Flags 2 Settings" src="http://www.ideaexcursion.com/wp-content/uploads/2009/02/odbcconfigureflags2-300x217.png" alt="MySQL Connector Advanced Flags 2 Settings" width="300" height="217" /></a><p class="wp-caption-text">MySQL Connector Advanced Flags 2 Settings</p></div>
<p>Next, switch to the &#8220;Flags 2&#8221; tab and select &#8220;Don&#8217;t Cache Result (forward only cursors)&#8221;. This can actually carry a performance penalty if you run the same query multiple times against the same linked server. However, in my experience, the reason to connect SQL Server to MySQL is to pull data into a single server, in which case this option is perfectly suited.</p>
<div id="attachment_565" class="wp-caption alignnone" style="width: 310px"><a href="http://www.ideaexcursion.com/wp-content/uploads/2009/02/odbcconfigureflags3.png"><img class="size-medium wp-image-565" title="MySQL Connector Advanced Flags 3 Settings" src="http://www.ideaexcursion.com/wp-content/uploads/2009/02/odbcconfigureflags3-300x217.png" alt="MySQL Connector Advanced Flags 3 Settings" width="300" height="217" /></a><p class="wp-caption-text">MySQL Connector Advanced Flags 3 Settings</p></div>
<p>On the &#8220;Flags 3&#8221; tab, select &#8220;Force Use Of Forward Only Cursors&#8221;. When you&#8217;re done setting all these options, select the &#8220;OK&#8221; button.</p>
<h3>Configure Linked Server Provider</h3>
<p>Adjusting the Linked Server Provider is simple, but it comes with a caveat: When adjusting a provider, you are adjusting it for all connections using that provider. I am not aware of any way to change these settings on a per-connection basis.</p>
<div id="attachment_570" class="wp-caption alignnone" style="width: 290px"><a href="http://www.ideaexcursion.com/wp-content/uploads/2009/02/providerproperties.png"><img class="size-medium wp-image-570" title="Provider Properties" src="http://www.ideaexcursion.com/wp-content/uploads/2009/02/providerproperties-280x300.png" alt="Provider Properties" width="280" height="300" /></a><p class="wp-caption-text">Provider Properties</p></div>
<p>In Object Explorer, drill down to Server Objects &rarr; Linked Servers &rarr; Providers, right-click MSDASQL, and select &#8220;Properties&#8221;.</p>
<div id="attachment_569" class="wp-caption alignnone" style="width: 310px"><a href="http://www.ideaexcursion.com/wp-content/uploads/2009/02/provideroptions.png"><img class="size-medium wp-image-569" title="Set Provider Options" src="http://www.ideaexcursion.com/wp-content/uploads/2009/02/provideroptions-300x269.png" alt="Set Provider Options" width="300" height="269" /></a><p class="wp-caption-text">Set Provider Options</p></div>
<p>The Provider Options for Microsoft <acronym title="Object Linking and Embedding">OLE</acronym> <abbr title="DataBase">DB</abbr> Provider for <abbr title="Open DataBase Connectivity">ODBC</abbr> Drivers dialog box will open allowing you to configure several options. Ensure the following four options are checked:</p>
<ul>
<li>Nested queries</li>
<li>Level zero only</li>
<li>Allow inprocess</li>
<li>Supports &#8216;Like&#8217; Operator</li>
</ul>
<p>All other options should be unchecked. When done, click &#8220;OK&#8221;.</p>
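<p>For scripted setups, these checkboxes can apparently be set through the undocumented procedure master.dbo.sp_MSset_oledb_prop, which is what Management Studio tends to emit when a provider change is scripted. The sketch below assumes that procedure and these property names behave the same way on your build; treat it as unsupported and confirm the result in the Provider Options dialog afterwards:</p>

<div class="wp_syntax"><div class="code"><pre class="tsql" style="font-family:monospace;">-- Undocumented and unsupported: verify the outcome in the Provider Options dialog
EXEC master.dbo.sp_MSset_oledb_prop N'MSDASQL', N'NestedQueries', 1
EXEC master.dbo.sp_MSset_oledb_prop N'MSDASQL', N'LevelZeroOnly', 1
EXEC master.dbo.sp_MSset_oledb_prop N'MSDASQL', N'AllowInProcess', 1
EXEC master.dbo.sp_MSset_oledb_prop N'MSDASQL', N'SqlServerLIKE', 1
go</pre></div></div>
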
<h3>Create Linked Server to MySQL</h3>
<p>Finally, the last step in our process is to create the actual MySQL Linked Server.</p>
<div id="attachment_562" class="wp-caption alignnone" style="width: 273px"><a href="http://www.ideaexcursion.com/wp-content/uploads/2009/02/newlinkedserver.png"><img class="size-full wp-image-562" title="Create a New Linked Server" src="http://www.ideaexcursion.com/wp-content/uploads/2009/02/newlinkedserver.png" alt="Create a New Linked Server" width="263" height="237" /></a><p class="wp-caption-text">Create a New Linked Server</p></div>
<p>You should already have Linked Servers expanded in the Object Explorer tree. If not, find it in Server Objects &rarr; Linked Servers. Once there, right-click Linked Servers and select &#8220;New Linked Server&#8230;&#8221;</p>
<div id="attachment_561" class="wp-caption alignnone" style="width: 310px"><a href="http://www.ideaexcursion.com/wp-content/uploads/2009/02/linkedserversettings.png"><img class="size-medium wp-image-561" title="New Linked Server Settings" src="http://www.ideaexcursion.com/wp-content/uploads/2009/02/linkedserversettings-300x269.png" alt="New Linked Server Settings" width="300" height="269" /></a><p class="wp-caption-text">New Linked Server Settings</p></div>
<p>The New Linked Server dialog box will open. Because we specified all our connection settings in the <abbr title="Open DataBase Connectivity">ODBC</abbr> Data Source Administrator, this last step is very simple. Name the linked server. As with the Data Source Name, I like to name it after the product or database I&#8217;m connecting to. In my example, I used MYSQLAPP. Ensure that the &#8220;Other data source&#8221; option is selected and choose &#8220;Microsoft <acronym title="Object Linking and Embedding">OLE</acronym> <abbr title="DataBase">DB</abbr> Provider for <abbr title="Open DataBase Connectivity">ODBC</abbr> Drivers&#8221; from the Provider dropdown. Lastly, specify the Product name and Data source. The Product name can be anything, but the Data source must exactly match the Data Source Name you provided in the MySQL Connector/<abbr title="Open DataBase Connectivity">ODBC</abbr> configuration. Press &#8220;OK&#8221; when complete.</p>
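<p>If you prefer scripting to clicking through the dialog, the same linked server can be created with the sp_addlinkedserver system stored procedure. This is only a sketch: it assumes the linked server name MYSQLAPP from the example above, and MySQLAppDSN is a placeholder that stands in for whatever Data Source Name you registered in the <abbr title="Open DataBase Connectivity">ODBC</abbr> Data Source Administrator.</p>

<div class="wp_syntax"><div class="code"><pre class="tsql" style="font-family:monospace;">--Sketch only: MySQLAppDSN is a placeholder; substitute your own Data Source Name
EXEC master.dbo.sp_addlinkedserver
	@server = N'MYSQLAPP',     --Linked server name
	@srvproduct = N'MySQL',    --Product name (free text)
	@provider = N'MSDASQL',    --Microsoft OLE DB Provider for ODBC Drivers
	@datasrc = N'MySQLAppDSN'  --Must match the DSN exactly</pre></div></div>
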
<h3>Testing the SQL Server to MySQL connection</h3>
<p>If everything has been set correctly, you should be able to execute a query directly against the MySQL database from SQL Server Management Studio. For example:</p>

<div class="wp_syntax"><div class="code"><pre class="tsql" style="font-family:monospace;"><span style="color: #0000FF;">SELECT</span> <span style="color: #0000FF;">TOP</span> <span style="color: #000;">10</span> TABLE_NAME <span style="color: #0000FF;">FROM</span> MYSQLAPP...<span style="color: #202020;">tables</span> <span style="color: #0000FF;">WHERE</span> TABLE_TYPE <span style="color: #808080;">!=</span> <span style="color: #FF0000;">'MEMORY'</span></pre></div></div>

<p>If you&#8217;ve done everything correctly, you should get back a result set. If not, there are several error messages you might receive:</p>

<div class="wp_syntax"><div class="code"><pre class="none" style="font-family:monospace;">OLE DB provider &quot;MSDASQL&quot; for linked server &quot;MYSQLAPP&quot; returned message &quot;[Microsoft][ODBC Driver Manager] Data source name not found and no default driver specified&quot;.
Msg 7303, Level 16, State 1, Line 1
Cannot initialize the data source object of OLE DB provider &quot;MSDASQL&quot; for linked server &quot;MYSQLAPP&quot;.</pre></div></div>

<p>The message indicates that the Data source name you&#8217;ve specified for the linked server does not match that of the Data Source Name specified in the MySQL Connector.</p>

<div class="wp_syntax"><div class="code"><pre class="none" style="font-family:monospace;">Msg 7313, Level 16, State 1, Line 1
An invalid schema or catalog was specified for the provider &quot;MSDASQL&quot; for linked server &quot;MySQLApp&quot;.</pre></div></div>

<p>This uninsightful error is a result of not correctly setting the options for the Linked Server Provider.</p>

<div class="wp_syntax"><div class="code"><pre class="none" style="font-family:monospace;">Msg 7399, Level 16, State 1, Line 1
The OLE DB provider &quot;MSDASQL&quot; for linked server &quot;MySQLApp&quot; reported an error. The provider did not give any information about the error.
Msg 7312, Level 16, State 1, Line 1
Invalid use of schema or catalog for OLE DB provider &quot;MSDASQL&quot; for linked server &quot;MySQLApp&quot;. A four-part name was supplied, but the provider does not expose the necessary interfaces to use a catalog or schema.</pre></div></div>

<p>This &#8220;four-part name&#8221; error is due to a limitation in the MySQL <abbr title="Open DataBase Connectivity">ODBC</abbr> driver. You cannot switch catalogs/schemas using dotted notation. Instead, you will have to register another <abbr title="Data Source Name">DSN</abbr> and Linked Server for each additional catalog you want to access. Be sure to follow the three-dot notation shown in the example query.</p>
<p>If, however, you want to access other schemas, you can do so utilizing OPENQUERY. This is also a great way to test your connection if you&#8217;re having problems. The syntax looks like this:</p>

<div class="wp_syntax"><div class="code"><pre class="tsql" style="font-family:monospace;"><span style="color: #0000FF;">SELECT</span> <span style="color: #808080;">*</span> <span style="color: #0000FF;">FROM</span> <span style="color: #0000FF;">OPENQUERY</span><span style="color: #808080;">&#40;</span>MYSQLAPP, <span style="color: #FF0000;">'SELECT * FROM INFORMATION_SCHEMA.TABLES LIMIT 10'</span><span style="color: #808080;">&#41;</span></pre></div></div>

<p>Notice that the actual query syntax in the string must be in the MySQL format (SQL Server does not support the LIMIT keyword). Additionally, you can specify a different schema using SCHEMA.TABLENAME in the query.</p>
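<p>As a rough illustration of that last point, the query below qualifies the table with a schema name inside the pass-through string. The schema and table names (otherschema.customers) are hypothetical placeholders, so substitute objects that actually exist on your MySQL server:</p>

<div class="wp_syntax"><div class="code"><pre class="tsql" style="font-family:monospace;">--Hypothetical schema and table; the inner string is sent to MySQL as-is
SELECT * FROM OPENQUERY(MYSQLAPP, 'SELECT * FROM otherschema.customers LIMIT 10')</pre></div></div>
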
<h3>Conclusion</h3>
<p>Creating a linked server between SQL Server and MySQL is a simple process. The first time requires you to install the software and configure the Linked Server Provider, but all subsequent connections require only a <abbr title="Data Source Name">DSN</abbr> and Linked Server.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.ideaexcursion.com/2009/02/25/howto-setup-sql-server-linked-server-to-mysql/feed/</wfw:commentRss>
		<slash:comments>42</slash:comments>
		</item>
		<item>
		<title>Efficiently query the DATE in DATETIME</title>
		<link>http://www.ideaexcursion.com/2009/02/17/efficiently-query-the-date-in-datetime/</link>
		<comments>http://www.ideaexcursion.com/2009/02/17/efficiently-query-the-date-in-datetime/#comments</comments>
		<pubDate>Tue, 17 Feb 2009 17:19:31 +0000</pubDate>
		<dc:creator>Taylor Gerring</dc:creator>
				<category><![CDATA[SQL Server]]></category>
		<category><![CDATA[T-SQL]]></category>

		<guid isPermaLink="false">http://www.ideaexcursion.com/?p=489</guid>
		<description><![CDATA[The DATETIME data type is often misunderstood and used inefficiently, forcing table scans with YEAR or MONTH functions. With the proper knowledge and tools, developers can more efficiently leverage the DATETIME data type and speed up slow-running queries.]]></description>
			<content:encoded><![CDATA[<p>The DATETIME data type is often misunderstood and used inefficiently. This article focuses on the date component of DATETIME, how it is handled internally and how it can be used effectively for querying. The DATETIME type is internally stored as two separate 4-byte integers: one of those integers stores the date portion, and the other the time. When the date portion has a value of 0, the date is 1900-01-01. Because the date is internally stored as an INT, casting and converting directly between the types is natural:<br />
<span id="more-489"></span><br />
<em><strong>Note:</strong> Because we are focusing on the date portion, we can ignore the time components for the purposes of this article. All the times will be midnight, but I won&#8217;t reprint them below. Additionally, the DATEADD and DATEDIFF techniques demonstrated within can be applied to SMALLDATETIME and DATETIME2 (SQL 2008) columns; note, however, that DATETIME2 cannot be converted directly to or from an INT. </em></p>

<div class="wp_syntax"><div class="code"><pre class="tsql" style="font-family:monospace;"><span style="color: #0000FF;">SELECT</span> <span style="color: #0000FF;">CONVERT</span><span style="color: #808080;">&#40;</span><span style="color: #0000FF;">DATETIME</span>, <span style="color: #000;">0</span><span style="color: #808080;">&#41;</span>
<span style="color: #008080;">--Result: 1900-01-01</span>
&nbsp;
<span style="color: #008080;">--Store the date to a DATETIME variable first</span>
<span style="color: #0000FF;">DECLARE</span> @<span style="color: #0000FF;">DATE</span> <span style="color: #0000FF;">DATETIME</span>
<span style="color: #0000FF;">SET</span> @<span style="color: #0000FF;">DATE</span> <span style="color: #808080;">=</span> <span style="color: #FF0000;">'1900-01-01'</span>
<span style="color: #0000FF;">SELECT</span> <span style="color: #0000FF;">CAST</span><span style="color: #808080;">&#40;</span>@<span style="color: #0000FF;">DATE</span> <span style="color: #0000FF;">AS</span> <span style="color: #0000FF;">INT</span><span style="color: #808080;">&#41;</span>
<span style="color: #008080;">--Result: 0</span></pre></div></div>

<p>You can use CAST or CONVERT in either direction; there is no restriction. Note that in the second example, I stored the date in a variable first, because there is no way to natively pass a DATETIME type into the query window; SQL Server must always convert it from a string. By taking the intermediary step of storing it in a variable, we ensure that the Database Engine truly understands the value as a DATETIME and not a VARCHAR. If you had attempted to convert the VARCHAR value directly, you would receive the following error:</p>

<div class="wp_syntax"><div class="code"><pre class="tsql" style="font-family:monospace;"><span style="color: #0000FF;">SELECT</span> <span style="color: #0000FF;">CAST</span><span style="color: #808080;">&#40;</span><span style="color: #FF0000;">'1900-01-01'</span> <span style="color: #0000FF;">AS</span> <span style="color: #0000FF;">INT</span><span style="color: #808080;">&#41;</span>
<span style="color: #008080;">--Msg 245, Level 16, State 1, Line 1</span>
<span style="color: #008080;">--Conversion failed when converting the varchar value '1900-01-01' to data type int.</span></pre></div></div>

<p>The trick works for any value valid in the DATETIME range (January 1, 1753, through December 31, 9999):</p>

<div class="wp_syntax"><div class="code"><pre class="tsql" style="font-family:monospace;"><span style="color: #0000FF;">SELECT</span> <span style="color: #0000FF;">CONVERT</span><span style="color: #808080;">&#40;</span><span style="color: #0000FF;">DATETIME</span>, <span style="color: #000;">2958463</span><span style="color: #808080;">&#41;</span>
<span style="color: #008080;">--Result: 9999-12-31</span></pre></div></div>

<p>Try to go too far, however, and you&#8217;ll get an error:</p>

<div class="wp_syntax"><div class="code"><pre class="tsql" style="font-family:monospace;"><span style="color: #0000FF;">SELECT</span> <span style="color: #0000FF;">CONVERT</span><span style="color: #808080;">&#40;</span><span style="color: #0000FF;">DATETIME</span>, <span style="color: #000;">2958464</span><span style="color: #808080;">&#41;</span>
<span style="color: #008080;">--Msg 8115, Level 16, State 2, Line 1</span>
<span style="color: #008080;">--Arithmetic overflow error converting expression to data type datetime.</span></pre></div></div>
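
<p>The same holds at the other end of the range: dates before 1900-01-01 are simply negative integers. A quick sketch, using the lower bound of the DATETIME range:</p>

<div class="wp_syntax"><div class="code"><pre class="tsql" style="font-family:monospace;">--53690 days before 1900-01-01
SELECT CONVERT(DATETIME, -53690)
--Result: 1753-01-01</pre></div></div>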

<h3>Manipulating Dates</h3>
<p>Fortunately Microsoft has provided some very useful functions for manipulating the DATETIME data type, negating our need to perform complicated math.</p>
<h4>DATEADD</h4>
<p>According to <a title="DATEADD (Transact-SQL)" href="http://msdn.microsoft.com/en-us/library/ms186819%28SQL.90%29.aspx" target="_blank">MSDN</a>, DATEADD returns a specified <em>date</em> with the specified <em>number</em> interval (signed integer) added to a specified <em>datepart</em> of that <em>date</em>. The function prototype looks something like this:</p>

<div class="wp_syntax"><div class="code"><pre class="tsql" style="font-family:monospace;"><span style="color: #FF00FF;">DATEADD</span> <span style="color: #808080;">&#40;</span><span style="color: #FF00FF;">DATEPART</span>, number, <span style="color: #0000FF;">DATE</span><span style="color: #808080;">&#41;</span></pre></div></div>

<p>Because any date can be referenced as a simple integer, adding dates becomes trivial with the DATEADD function. Let&#8217;s compare adding a day to a date written as a DATETIME string and to the same date written as an INT:</p>

<div class="wp_syntax"><div class="code"><pre class="tsql" style="font-family:monospace;"><span style="color: #0000FF;">SELECT</span> <span style="color: #FF00FF;">DATEADD</span><span style="color: #808080;">&#40;</span><span style="color: #0000FF;">DAY</span>, <span style="color: #000;">1</span>, <span style="color: #FF0000;">'1900-1-1'</span><span style="color: #808080;">&#41;</span>
<span style="color: #008080;">--Result: 1900-01-02</span></pre></div></div>

<p>And compare that with adding 1 day to 0:</p>

<div class="wp_syntax"><div class="code"><pre class="tsql" style="font-family:monospace;"><span style="color: #0000FF;">SELECT</span> <span style="color: #FF00FF;">DATEADD</span><span style="color: #808080;">&#40;</span><span style="color: #0000FF;">DAY</span>, <span style="color: #000;">1</span>, <span style="color: #000;">0</span><span style="color: #808080;">&#41;</span>
<span style="color: #008080;">--Result: 1900-01-02</span></pre></div></div>

<p>Notice the result is the same. We can also add any other datepart. Want the start of 2009?</p>

<div class="wp_syntax"><div class="code"><pre class="tsql" style="font-family:monospace;"><span style="color: #0000FF;">SELECT</span> <span style="color: #FF00FF;">DATEADD</span><span style="color: #808080;">&#40;</span><span style="color: #0000FF;">YEAR</span>, <span style="color: #000;">109</span>, <span style="color: #000;">0</span><span style="color: #808080;">&#41;</span>
<span style="color: #008080;">--Result: 2009-01-01</span></pre></div></div>

<h4>DATEDIFF</h4>
<p>DATEDIFF is the counterpart of DATEADD: it calculates the date or time difference between two DATETIME values. The prototype looks like this:</p>

<div class="wp_syntax"><div class="code"><pre class="tsql" style="font-family:monospace;"><span style="color: #FF00FF;">DATEDIFF</span> <span style="color: #808080;">&#40;</span><span style="color: #FF00FF;">DATEPART</span>, startdate , enddate<span style="color: #808080;">&#41;</span></pre></div></div>

<p>And an actual example:</p>

<div class="wp_syntax"><div class="code"><pre class="tsql" style="font-family:monospace;"><span style="color: #0000FF;">SELECT</span> <span style="color: #FF00FF;">DATEDIFF</span><span style="color: #808080;">&#40;</span><span style="color: #0000FF;">YEAR</span>, <span style="color: #FF0000;">'1900-01-01'</span>, <span style="color: #FF0000;">'2009-01-01'</span><span style="color: #808080;">&#41;</span>
<span style="color: #008080;">--Result: 109</span></pre></div></div>

<p>As before, we can certainly specify either of the dates as an INT. Since we know that 1900-01-01 has a corresponding integer value of 0, let&#8217;s try that:</p>

<div class="wp_syntax"><div class="code"><pre class="tsql" style="font-family:monospace;"><span style="color: #0000FF;">SELECT</span> <span style="color: #FF00FF;">DATEDIFF</span><span style="color: #808080;">&#40;</span><span style="color: #0000FF;">YEAR</span>, <span style="color: #000;">0</span>, <span style="color: #FF0000;">'2009-01-01'</span><span style="color: #808080;">&#41;</span>
<span style="color: #008080;">--Result: 109</span></pre></div></div>

<p>We can take it a step further and compare two INT values (again, as long as they fall within the valid DATETIME range). Let&#8217;s find how many years are between 1900-01-01 and 9999-12-31:</p>

<div class="wp_syntax"><div class="code"><pre class="tsql" style="font-family:monospace;"><span style="color: #0000FF;">SELECT</span> <span style="color: #FF00FF;">DATEDIFF</span><span style="color: #808080;">&#40;</span><span style="color: #0000FF;">YEAR</span>, <span style="color: #000;">0</span>, <span style="color: #000;">2958463</span><span style="color: #808080;">&#41;</span>
<span style="color: #008080;">--Result: 8099</span></pre></div></div>

<p>The important aspect to note is that DATEDIFF &#8220;returns the number of date and time boundaries crossed between two specified dates.&#8221; (<a title="DATEDIFF (Transact-SQL)" href="http://msdn.microsoft.com/en-us/library/ms189794%28SQL.90%29.aspx" target="_blank">MSDN</a>) In plain language, that means SQL Server never rounds the time period; it simply counts how many times the start of a new period (for years, January 1) falls between the two dates. For example, let&#8217;s check the number of years between Leap Day in 2000 and Christmas 2008:</p>

<div class="wp_syntax"><div class="code"><pre class="tsql" style="font-family:monospace;"><span style="color: #0000FF;">SELECT</span> <span style="color: #FF00FF;">DATEDIFF</span><span style="color: #808080;">&#40;</span><span style="color: #0000FF;">YEAR</span>, <span style="color: #FF0000;">'2000-02-29'</span>, <span style="color: #FF0000;">'2008-12-25'</span><span style="color: #808080;">&#41;</span>
<span style="color: #008080;">--Result: 8</span></pre></div></div>

<p>Although the total elapsed time is roughly 8.8 years, only 8 year boundaries have been crossed. This principle is the basis for calculating beginnings of time periods. When we combine DATEDIFF with DATEADD, the combination will predictably give us the start of any time period. Let&#8217;s say we wanted to calculate the number of years since 1900-01-01:</p>

<div class="wp_syntax"><div class="code"><pre class="tsql" style="font-family:monospace;"><span style="color: #008080;">--Assume GETDATE() returns 2009-02-12 22:23:00.000</span>
<span style="color: #0000FF;">SELECT</span> <span style="color: #FF00FF;">DATEDIFF</span><span style="color: #808080;">&#40;</span><span style="color: #0000FF;">YEAR</span>, <span style="color: #000;">0</span>, <span style="color: #FF00FF;">GETDATE</span><span style="color: #808080;">&#40;</span><span style="color: #808080;">&#41;</span><span style="color: #808080;">&#41;</span>
<span style="color: #008080;">--Result: 109</span></pre></div></div>

<p>We could easily take this result and place it in a DATEADD and see that we get 1900 + 109 = 2009:</p>

<div class="wp_syntax"><div class="code"><pre class="tsql" style="font-family:monospace;"><span style="color: #0000FF;">SELECT</span> <span style="color: #FF00FF;">DATEADD</span><span style="color: #808080;">&#40;</span><span style="color: #0000FF;">YEAR</span>, <span style="color: #000;">109</span>, <span style="color: #FF0000;">'1900-01-01'</span><span style="color: #808080;">&#41;</span>
<span style="color: #008080;">--Result: 2009-01-01</span></pre></div></div>

<p>Or, as before, let&#8217;s use the integer value for 1900-01-01:</p>

<div class="wp_syntax"><div class="code"><pre class="tsql" style="font-family:monospace;"><span style="color: #0000FF;">SELECT</span> <span style="color: #FF00FF;">DATEADD</span><span style="color: #808080;">&#40;</span><span style="color: #0000FF;">YEAR</span>, <span style="color: #000;">109</span>, <span style="color: #000;">0</span><span style="color: #808080;">&#41;</span>
<span style="color: #008080;">--Result: 2009-01-01</span></pre></div></div>

<p>Since DATEDIFF returns an INT and DATEADD adds an INT to a DATETIME, we can combine the two and return a natural DATETIME in a single query:</p>

<div class="wp_syntax"><div class="code"><pre class="tsql" style="font-family:monospace;"><span style="color: #008080;">--Assume GETDATE() returns 2009-02-12 22:23:00.000</span>
<span style="color: #0000FF;">SELECT</span> <span style="color: #FF00FF;">DATEADD</span><span style="color: #808080;">&#40;</span><span style="color: #0000FF;">YEAR</span>, <span style="color: #FF00FF;">DATEDIFF</span><span style="color: #808080;">&#40;</span><span style="color: #0000FF;">YEAR</span>, <span style="color: #000;">0</span>, <span style="color: #FF00FF;">GETDATE</span><span style="color: #808080;">&#40;</span><span style="color: #808080;">&#41;</span><span style="color: #808080;">&#41;</span>, <span style="color: #000;">0</span><span style="color: #808080;">&#41;</span>
<span style="color: #008080;">--Result: 2009-01-01</span></pre></div></div>

<p>To analyze what is actually occurring, let&#8217;s start with the DATEDIFF in the middle &lt;DATEDIFF(year, 0, GETDATE())&gt;. This calculates the number of year boundaries crossed between 1900 and 2009, which is 109. The outer portion &lt;DATEADD(year, 109, 0)&gt; adds that result of 109 to 1900 and gives us the start of the year 2009. This technique allows us to find the start of any date or time period based on a relative starting point. That starting point is commonly GETDATE(), but can be any valid DATETIME. If you want the beginning of the month, swap &#8220;year&#8221; for &#8220;month&#8221;. The same can be used for day, week, or any other datepart. This can help us greatly when querying a DATETIME field to fall in a range of values (e.g. &#8220;Select all rows created <strong>this year</strong>.&#8221;) What if we wanted the start of next year (2010)?</p>

<div class="wp_syntax"><div class="code"><pre class="tsql" style="font-family:monospace;"><span style="color: #008080;">--Assume GETDATE() returns 2009-02-12 22:23:00.000</span>
<span style="color: #0000FF;">SELECT</span> <span style="color: #FF00FF;">DATEADD</span><span style="color: #808080;">&#40;</span><span style="color: #0000FF;">YEAR</span>, <span style="color: #FF00FF;">DATEDIFF</span><span style="color: #808080;">&#40;</span><span style="color: #0000FF;">YEAR</span>, <span style="color: #000;">0</span>, <span style="color: #FF00FF;">GETDATE</span><span style="color: #808080;">&#40;</span><span style="color: #808080;">&#41;</span><span style="color: #808080;">&#41;</span> <span style="color: #808080;">+</span> <span style="color: #000;">1</span>, <span style="color: #000;">0</span><span style="color: #808080;">&#41;</span>
<span style="color: #008080;">--Result: 2010-01-01</span></pre></div></div>

<p>Notice the only change was a &#8220;+ 1&#8221; next to the DATEDIFF calculation. Because we&#8217;re calculating the number of years since 1900 (109), we can adjust for years forward by adding, or years back by subtracting. By adding only 1, we jump forward a single year.</p>
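<p>The same pattern covers the other dateparts and directions mentioned above. A short sketch, again assuming GETDATE() returns 2009-02-12 22:23:00.000:</p>

<div class="wp_syntax"><div class="code"><pre class="tsql" style="font-family:monospace;">--Assume GETDATE() returns 2009-02-12 22:23:00.000
--Start of the current month
SELECT DATEADD(MONTH, DATEDIFF(MONTH, 0, GETDATE()), 0)
--Result: 2009-02-01
&nbsp;
--Start of the current day
SELECT DATEADD(DAY, DATEDIFF(DAY, 0, GETDATE()), 0)
--Result: 2009-02-12
&nbsp;
--Start of the previous year
SELECT DATEADD(YEAR, DATEDIFF(YEAR, 0, GETDATE()) - 1, 0)
--Result: 2008-01-01</pre></div></div>
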
<h3>Query efficiency</h3>
<p>Using DATEADD and DATEDIFF is much faster than string manipulation with CAST and CONVERT. Additionally, it allows us to check the bounds of a DATETIME column without using the YEAR, MONTH, or DAY function. Why is this good? Any function applied to a column in the WHERE condition prevents SQL Server from seeking on an index on that column. This is because the index contains only the actual values, not some calculation of them. Think of it this way: if you looked in the index of a book for the pages of every topic whose third letter is an &#8220;e&#8221;, you would have to scan the entire index, because it is sorted by the first letter, not the third. This is similar to how a YEAR function might affect an index on a DATETIME column. To get around this, rather than altering the column value in the condition, we alter our lookup condition. For example, instead of YEAR(DateColumn) = YEAR(GETDATE()), we can specify DateColumn &gt;= @BeginningOfThisYear AND DateColumn &lt; @BeginningOfNextYear. Or, more specifically:</p>

<div class="wp_syntax"><div class="code"><pre class="tsql" style="font-family:monospace;"><span style="color: #0000FF;">SELECT</span> <span style="color: #808080;">*</span>
<span style="color: #0000FF;">FROM</span> TableName
<span style="color: #0000FF;">WHERE</span> DateColumn <span style="color: #808080;">&gt;=</span> <span style="color: #FF00FF;">DATEADD</span><span style="color: #808080;">&#40;</span><span style="color: #0000FF;">YEAR</span>, <span style="color: #FF00FF;">DATEDIFF</span><span style="color: #808080;">&#40;</span><span style="color: #0000FF;">YEAR</span>, <span style="color: #000;">0</span>, <span style="color: #FF00FF;">GETDATE</span><span style="color: #808080;">&#40;</span><span style="color: #808080;">&#41;</span><span style="color: #808080;">&#41;</span>, <span style="color: #000;">0</span><span style="color: #808080;">&#41;</span>
	<span style="color: #808080;">AND</span> DateColumn <span style="color: #808080;">&lt;</span> <span style="color: #FF00FF;">DATEADD</span><span style="color: #808080;">&#40;</span><span style="color: #0000FF;">YEAR</span>, <span style="color: #FF00FF;">DATEDIFF</span><span style="color: #808080;">&#40;</span><span style="color: #0000FF;">YEAR</span>, <span style="color: #000;">0</span>, <span style="color: #FF00FF;">GETDATE</span><span style="color: #808080;">&#40;</span><span style="color: #808080;">&#41;</span><span style="color: #808080;">&#41;</span> <span style="color: #808080;">+</span> <span style="color: #000;">1</span>, <span style="color: #000;">0</span><span style="color: #808080;">&#41;</span></pre></div></div>

<p>In English, it reads something like this: &#8220;Select everything from TableName where DateColumn is at least the start of this year and also before the start of next year&#8221;. Or, even simpler, &#8220;Show me where DateColumn is in the current year&#8221;. This has a similar effect to filtering on YEAR(DateColumn), but will allow an index on DateColumn to be used. Obviously, this will not guarantee the index will be used &#8211; if one exists at all &#8211; but it at least leaves the option available to the query optimizer.</p>
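<p>The same range pattern works for other periods. As a sketch, using the same placeholder TableName and DateColumn, this selects the rows that fall in last month:</p>

<div class="wp_syntax"><div class="code"><pre class="tsql" style="font-family:monospace;">--At or after the start of last month, and before the start of this month
SELECT *
FROM TableName
WHERE DateColumn &gt;= DATEADD(MONTH, DATEDIFF(MONTH, 0, GETDATE()) - 1, 0)
	AND DateColumn &lt; DATEADD(MONTH, DATEDIFF(MONTH, 0, GETDATE()), 0)</pre></div></div>
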
<h3>Conclusion</h3>
<p>It is very common to check for records falling in a specified range such as current year or last month, but many people resort to nasty functions forcing a table scan. With the above knowledge and tools, developers can more efficiently leverage the DATETIME data type and speed up slow-running queries.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.ideaexcursion.com/2009/02/17/efficiently-query-the-date-in-datetime/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Handling Embedded Text Qualifiers in SSIS 2005</title>
		<link>http://www.ideaexcursion.com/2009/02/03/handling-embedded-text-qualifiers-in-ssis-2005/</link>
		<comments>http://www.ideaexcursion.com/2009/02/03/handling-embedded-text-qualifiers-in-ssis-2005/#comments</comments>
		<pubDate>Tue, 03 Feb 2009 16:32:04 +0000</pubDate>
		<dc:creator>Taylor Gerring</dc:creator>
				<category><![CDATA[SQL Server]]></category>
		<category><![CDATA[SSIS]]></category>
		<category><![CDATA[HOWTO]]></category>
		<category><![CDATA[VB.NET]]></category>

		<guid isPermaLink="false">http://www.ideaexcursion.com/?p=415</guid>
		<description><![CDATA[Just a quick note advising that I&#8217;ve updated my Handling Embedded Text Qualifiers post to also include a Visual Basic example, making the information also relevant to SQL Server 2005.]]></description>
			<content:encoded><![CDATA[<p>Just a quick note advising that I&#8217;ve updated my <a title="Handling Embedded Text Qualifiers" href="http://www.ideaexcursion.com/2008/11/12/handling-embedded-text-qualifiers/">Handling Embedded Text Qualifiers</a> post to also include a Visual Basic example, making the information also relevant to SQL Server 2005.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.ideaexcursion.com/2009/02/03/handling-embedded-text-qualifiers-in-ssis-2005/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Import Wikipedia articles into SQL Server with SSIS</title>
		<link>http://www.ideaexcursion.com/2009/01/26/import-wikipedia-articles-into-sql-server-with-ssis/</link>
		<comments>http://www.ideaexcursion.com/2009/01/26/import-wikipedia-articles-into-sql-server-with-ssis/#comments</comments>
		<pubDate>Mon, 26 Jan 2009 22:22:56 +0000</pubDate>
		<dc:creator>Taylor Gerring</dc:creator>
				<category><![CDATA[SQL Server]]></category>
		<category><![CDATA[SSIS]]></category>
		<category><![CDATA[HOWTO]]></category>

		<guid isPermaLink="false">http://www.ideaexcursion.com/?p=265</guid>
		<description><![CDATA[Import the XML dump of Wikipedia data into SQL Server through SSIS. Full walkthrough with pictures and pre-configured SSIS package provided.]]></description>
			<content:encoded><![CDATA[<p>If you&#8217;ve ever wanted to mine <a title="Wikipedia" href="http://wikipedia.org/">Wikipedia</a> data, it would be possible &#8211; but difficult &#8211; to scrape the whole site. Instead of performing such a slow &amp; arduous operation, the <a title="Home - Wikimedia Foundation" href="http://wikimediafoundation.org/">Wikimedia Foundation</a> has provided the contents for free, in a downloadable format. These exports can then be loaded and used for a multitude of reasons, including personal use.<br />
<span id="more-265"></span></p>
<h3>Ingredients</h3>
<p>I&#8217;ve created an <abbr title="SQL Server Integration Services">SSIS</abbr> package that will import the articles and pages into a SQL Server 2005 database. To do this, you&#8217;ll first need to gather a few files:</p>
<ul>
<li><a title="EN Wikipedia database dumps" href="http://download.wikimedia.org/enwiki/latest/" target="_blank">Latest Wikipedia pages/articles dump</a> (Download enwiki-latest-pages-articles.xml.bz2, approximately 4.1<abbr title="GigaByte">GB</abbr> at time of writing)</li>
<li><a title="MediaWiki XML Schema Definition" href="http://www.mediawiki.org/xml/export-0.3.xsd" target="_blank">MediaWiki XSD</a> (originally located on <a title="Manual:XML Import file manipulation in CSharp" href="http://www.mediawiki.org/wiki/Manual:XML_Import_file_manipulation_in_CSharp" target="_blank">Manual:XML Import file manipulation in CSharp</a>)</li>
<li><a title="Import Wikipedia SSIS Package" href="http://www.ideaexcursion.com/wp-content/uploads/2009/01/WikipediaImport.zip">SSIS Package &amp; Database Scripts</a></li>
</ul>
<p>You&#8217;ll need approximately 75<abbr title="GigaByte">GB</abbr> of free space &#8211; 50<abbr title="GigaByte">GB</abbr> for the database and 20<abbr title="GigaByte">GB</abbr> for the <abbr title="eXtensible Markup Language">XML</abbr> file. Also, this will likely take several hours, if not longer. If you have separate drive spindles, it would certainly help to separate the XML and database files. My example uses C:\wikipedia\ as the working folder; if you prefer another location, we&#8217;ll configure it later. If not, this is what the structure should look like:</p>
<div id="attachment_303" class="wp-caption alignnone" style="width: 563px"><a href="http://www.ideaexcursion.com/wp-content/uploads/2009/01/ssiswpdirlisting.png"><img class="size-full wp-image-303" title="Directory Listing" src="http://www.ideaexcursion.com/wp-content/uploads/2009/01/ssiswpdirlisting.png" alt="Directory Listing" width="553" height="155" /></a><p class="wp-caption-text">Directory Listing</p></div>
<h3>Recipe</h3>
<p><em><strong>Note:</strong> If you&#8217;re planning on importing the <abbr title="eXtensible Markup Language">XML</abbr> file into a remote server, I highly recommend performing all these operations on the server itself through Remote Desktop. Aside from sparing you a second transfer of the gigantic <abbr title="eXtensible Markup Language">XML</abbr> dump, working locally makes debugging <abbr title="SQL Server Integration Services">SSIS</abbr> packages much easier.</em></p>
<ol>
<li>Connect to the SQL Server database and run WikipediaImport\WikipediaImport\DatabaseCreate.sql
<ol>
<li>This creates the database (cleverly named &#8220;Wikipedia&#8221;) on C:\wikipedia\. If you want the data and log files located elsewhere, find 50<abbr title="GigaByte">GB</abbr> of free space and update the CREATE DATABASE statement.</li>
<li>Because we know the database is going to grow immediately, I&#8217;ve told the script to allocate 40<abbr title="GigaByte">GB</abbr> for data and 10<abbr title="GigaByte">GB</abbr> for log, so this step may take a while to run.</li>
</ol>
</li>
<li>Open WikipediaImport\WikipediaImport.sln in Visual Studio 2005</li>
<li>Enable the Variables window if it is not already visible
<ol>
<li>Select Data Flow</li>
<li>Select the View menu</li>
<li>Select Other Windows</li>
<li>Select Variables
<div id="attachment_308" class="wp-caption alignnone" style="width: 260px"><a href="http://www.ideaexcursion.com/wp-content/uploads/2009/01/ssiswpvariablesmenu.png"><img class="size-medium wp-image-308" title="Enable Variables Menu" src="http://www.ideaexcursion.com/wp-content/uploads/2009/01/ssiswpvariablesmenu-250x300.png" alt="Enable Variables Menu" width="250" height="300" /></a><p class="wp-caption-text">Enable Variables Menu</p></div></li>
</ol>
</li>
<li>If the working files were placed somewhere besides C:\wikipedia\, you can configure that in the Variables window. Be sure to update both PageArticlesXML and PageArticlesXSD
<p><div id="attachment_309" class="wp-caption alignnone" style="width: 571px"><a href="http://www.ideaexcursion.com/wp-content/uploads/2009/01/ssiswpvariablevalues.png"><img class="size-full wp-image-309" title="Set Variable Values" src="http://www.ideaexcursion.com/wp-content/uploads/2009/01/ssiswpvariablevalues.png" alt="Set Variable Values" width="561" height="94" /></a><p class="wp-caption-text">Set Variable Values</p></div></li>
<li>Additionally, if you&#8217;re not importing to localhost, configure the database connection variable (named DatabaseConnection)</li>
<li>Verify there are no warnings or errors and build the solution (Ctrl + Shift + B or Build &#8594; Build WikipediaImport)</li>
<li>If the build succeeds, go ahead and run it (F5 or Debug &#8594; Start Debugging)</li>
<li>If all goes well, the <abbr title="eXtensible Markup Language">XML</abbr> file should now be streaming into the database. This will likely take hours, even with a fast <acronym title="Redundant Array of Inexpensive Disks">RAID</acronym>. Notice that the import needs only a single pass over the file, rather than one scan per table.
<p><div id="attachment_305" class="wp-caption alignnone" style="width: 310px"><a href="http://www.ideaexcursion.com/wp-content/uploads/2009/01/ssiswpprogress.png"><img class="size-medium wp-image-305" title="Import Progress" src="http://www.ideaexcursion.com/wp-content/uploads/2009/01/ssiswpprogress-300x163.png" alt="Import Progress" width="300" height="163" /></a><p class="wp-caption-text">Import Progress</p></div></li>
<li>Switch back to <abbr title="SQL Server Management Studio">SSMS</abbr> and run WikipediaImport\WikipediaImport\IndexCreate.sql
<ol>
<li>This step is technically optional, but is going to help speed up your queries significantly</li>
<li>If we had created the indexes before the import, the import would have been even slower</li>
<li>This will take a while!</li>
</ol>
</li>
<li>Run a test query

<div class="wp_syntax"><div class="code"><pre class="tsql" style="font-family:monospace;"><span style="color: #0000FF;">SELECT</span> <span style="color: #808080;">*</span> <span style="color: #0000FF;">FROM</span> dbo.<span style="color: #202020;">vw_Articles</span> <span style="color: #0000FF;">WHERE</span> title <span style="color: #808080;">=</span> <span style="color: #FF0000;">'Green Day'</span></pre></div></div>

</li>
</ol>
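<p>For step 1, relocating the files means editing the paths in the CREATE DATABASE statement. The following is only a rough sketch of what such a statement might look like, not the contents of DatabaseCreate.sql: the logical file names, the D:\ and E:\ paths, and the growth increments are all assumptions for illustration. Only the database name and the 40<abbr title="GigaByte">GB</abbr>/10<abbr title="GigaByte">GB</abbr> sizes come from the steps above.</p>

<div class="wp_syntax"><div class="code"><pre class="tsql" style="font-family:monospace;">-- Hypothetical sketch: logical names, paths, and FILEGROWTH values are assumptions
CREATE DATABASE Wikipedia
ON PRIMARY (
    NAME = Wikipedia_Data,                      -- assumed logical name
    FILENAME = 'D:\sqldata\Wikipedia.mdf',      -- assumed data file location
    SIZE = 40GB,                                -- pre-allocate space for data
    FILEGROWTH = 1GB
)
LOG ON (
    NAME = Wikipedia_Log,                       -- assumed logical name
    FILENAME = 'E:\sqllogs\Wikipedia_log.ldf',  -- assumed log file location
    SIZE = 10GB,                                -- pre-allocate space for log
    FILEGROWTH = 512MB
);</pre></div></div>

<p>Pre-sizing the files up front means the import should not have to pause for repeated autogrow operations, at the cost of a slower CREATE DATABASE.</p>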
<h3>Behind the Scenes</h3>
<p>Getting this to work took a while of tweaking, but there are a few highlights I&#8217;d like to point out.</p>
<h4>Data Types</h4>
<p><abbr title="XML Schema Definition">XSD</abbr>: The provided <abbr title="eXtensible Markup Language">XML</abbr> Schema Definition file does not contain any information about the intended length of the string data. Fortunately, through testing, I was able to shrink some of those sizes down, although they do not strictly conform to the <a title="Wikipedia:Database download SQL schema" href="http://en.wikipedia.org/wiki/Wikipedia_database#SQL_schema" target="_blank">official database schema</a>. Specifically, I have shrunk page.restrictions and text.space from nvarchar(255) to nvarchar(50). Most other items conform as closely as possible. In addition to these, I had to change text.text to nvarchar(max). <abbr title="SQL Server Integration Services">SSIS</abbr> initially suggested nvarchar(255), but articles are obviously much longer than this. To make these changes, right-click the <abbr title="eXtensible Markup Language">XML</abbr> Source (named PageArticles) in the Import <abbr title="eXtensible Markup Language">XML</abbr> Data Flow Task and select &#8220;Show Advanced Editor&#8230;&#8221;</p>
<p><div id="attachment_306" class="wp-caption alignnone" style="width: 218px"><a href="http://www.ideaexcursion.com/wp-content/uploads/2009/01/ssiswpshowadvancededitor.png"><img class="size-full wp-image-306" title="Advanced Editor Menu" src="http://www.ideaexcursion.com/wp-content/uploads/2009/01/ssiswpshowadvancededitor.png" alt="Advanced Editor Menu" width="208" height="321" /></a><p class="wp-caption-text">Advanced Editor Menu</p></div>
<p>Expand each of the changed columns (both External and Output) and update the DataType. For example, the SQL Server-specific nvarchar(max) corresponds to the more general &#8220;Unicode text stream [DT_NTEXT]&#8221; in <abbr title="SQL Server Integration Services">SSIS</abbr>. For the others, just update the length from 255 to 50.</p>
<div id="attachment_307" class="wp-caption alignnone" style="width: 310px"><a href="http://www.ideaexcursion.com/wp-content/uploads/2009/01/ssiswptextstream.png"><img class="size-medium wp-image-307" title="Set Data Types" src="http://www.ideaexcursion.com/wp-content/uploads/2009/01/ssiswptextstream-300x270.png" alt="Set Data Types" width="300" height="270" /></a><p class="wp-caption-text">Set Data Types</p></div>
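<p>On the database side, the destination columns need matching types. As a rough illustration only (the table name dbo.TextSketch and its column list are hypothetical and not taken from DatabaseCreate.sql), the narrowed and widened columns might be declared as below. The query after it is the sort of check that can confirm, after a test load, that a narrowed length is still large enough.</p>

<div class="wp_syntax"><div class="code"><pre class="tsql" style="font-family:monospace;">-- Hypothetical destination table; names are assumptions for illustration only
CREATE TABLE dbo.TextSketch (
    [space] nvarchar(50)  NULL, -- narrowed from the suggested nvarchar(255)
    [text]  nvarchar(max) NULL  -- receives the SSIS DT_NTEXT stream
);

-- After a test load, find the longest value actually seen in the narrowed column
SELECT MAX(LEN([space])) AS LongestSpace
FROM dbo.TextSketch;</pre></div></div>

<p>If the longest observed value sits close to the declared length, it is probably safer to leave that column at 255.</p>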
<h4>Extraneous information</h4>
<p>There are many more fields in the <abbr title="eXtensible Markup Language">XML</abbr> file than I&#8217;ve decided to import. Unfortunately, I can&#8217;t just turn them off completely, lest SSIS complain. Instead, I have chosen to suffer the lesser fate of &#8220;Warning&#8221;. Additionally, I changed the Error Output to &#8220;Ignore failure&#8221; on Error. This screen can be accessed by double-clicking the <abbr title="eXtensible Markup Language">XML</abbr> Source, PageArticles, then selecting the Error Output page.</p>
<div id="attachment_304" class="wp-caption alignnone" style="width: 310px"><a href="http://www.ideaexcursion.com/wp-content/uploads/2009/01/ssiswperroroutput.png"><img class="size-medium wp-image-304" title="Configure Error Output" src="http://www.ideaexcursion.com/wp-content/uploads/2009/01/ssiswperroroutput-300x259.png" alt="Configure Error Output" width="300" height="259" /></a><p class="wp-caption-text">Configure Error Output</p></div>
<h3>Wrap-up</h3>
<p><abbr title="SQL Server Integration Services">SSIS</abbr> is very picky about metadata, making this a somewhat difficult project to get running. However, it <em>is</em> actually running. This could be further extended with an automated download and a larger amount of imported data, but for now it serves its purpose.</p>
<p>I know not everyone will get this on first run, so if you run into a problem, please leave a comment below and I&#8217;ll do my best to answer in a timely manner.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.ideaexcursion.com/2009/01/26/import-wikipedia-articles-into-sql-server-with-ssis/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
	</channel>
</rss>
