<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:creativeCommons="http://backend.userland.com/creativeCommonsRssModule">

<channel>
	<title>Idea Excursion &#187; SSIS</title>
	<atom:link href="http://www.ideaexcursion.com/category/microsoft/windows/sql-server/ssis-sql-server-windows-microsoft/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.ideaexcursion.com</link>
	<description>Technology Musings</description>
	<lastBuildDate>Tue, 29 Jun 2010 21:24:52 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
	<atom:link rel='hub' href='http://www.ideaexcursion.com/?pushpress=hub'/>
<creativeCommons:license>http://creativecommons.org/licenses/by-sa/3.0/us/</creativeCommons:license>		<item>
		<title>Accessing Custom .NET Assemblies in SSIS 2008 Script Tasks</title>
		<link>http://www.ideaexcursion.com/2009/10/14/accessing-custom-net-assemblies-in-ssis-2008-script-tasks/</link>
		<comments>http://www.ideaexcursion.com/2009/10/14/accessing-custom-net-assemblies-in-ssis-2008-script-tasks/#comments</comments>
		<pubDate>Wed, 14 Oct 2009 19:36:24 +0000</pubDate>
		<dc:creator>Taylor Gerring</dc:creator>
				<category><![CDATA[SSIS]]></category>
		<category><![CDATA[.NET]]></category>
		<category><![CDATA[C#]]></category>
		<category><![CDATA[SQL Server]]></category>
		<category><![CDATA[VB.NET]]></category>

		<guid isPermaLink="false">http://www.ideaexcursion.com/?p=1429</guid>
		<description><![CDATA[If you need to access a custom .NET Assembly from an SSIS Script Task, Microsoft doesn't make things very easy - but it's still possible with a little setup.]]></description>
			<content:encoded><![CDATA[<p>If you need to access a custom .NET Assembly from an <abbr title="SQL Server Integration Services">SSIS</abbr> Script Task, Microsoft doesn&#8217;t make things very easy &#8211; but it&#8217;s still possible with a little setup. This is a great way to introduce custom data types or some new functionality without having to replicate that code in a new environment.<br />
<span id="more-1429"></span></p>
<h2>The Setup</h2>
<ul>
<li>Windows 7 64-bit</li>
<li>Visual Studio 2008</li>
<li>SQL Server 2008 64-bit</li>
</ul>
<h2>The Process</h2>
<ol>
<li>Create a signing key (see also <a title="How to: Create a Public/Private Key Pair" href="http://msdn.microsoft.com/en-us/library/6f05ezxy.aspx" target="_blank">How to: Create a Public/Private Key Pair</a>)
<ol>
<li>Open the Visual Studio 2008 Command Prompt &#8211; the regular command prompt <em>will not</em> work</li>
<li>Change to a friendly directory: cd %userprofile%\Desktop</li>
<li>Create the key file: sn -k key.snk</li>
</ol>
</li>
<li>Sign the assembly &#8211; There are a few ways to do this, but I found this to be the easiest. If you want to sign it some other way, check out <a title="http://msdn.microsoft.com/en-us/library/xc31ft41.aspx" href="http://msdn.microsoft.com/en-us/library/xc31ft41.aspx" target="_blank">How to: Sign an Assembly with a Strong Name</a>
<ol>
<li>Right-click the Project</li>
<li>Select &#8220;Properties&#8221;</li>
<li>Navigate to the &#8220;Signing&#8221; tab</li>
<li>Browse to strong name key file (which was created in the previous step)</li>
<li>Recompile the project</li>
</ol>
</li>
<li>Copy the re-compiled assembly to your <acronym title="Global Assembly Cache">GAC</acronym>
<ol>
<li>gacutil -i &#8220;C:\Path\to\CustomAssemblyName.dll&#8221;</li>
</ol>
</li>
<li>Copy assembly to &#8220;%programfiles(x86)%\Microsoft SQL Server\100\SDK\Assemblies&#8221;</li>
<li>Add reference in script task. Repeat this for <strong>every </strong>Script Task you want to access this assembly from
<ol>
<li>Right-click References</li>
<li>Click &#8220;Add Reference&#8230;&#8221;
<p><div id="attachment_1437" class="wp-caption alignnone" style="width: 310px"><a rel="attachment wp-att-1437" href="http://www.ideaexcursion.com/2009/10/14/accessing-custom-net-assemblies-in-ssis-2008-script-tasks/add-reference/"><img class="size-medium wp-image-1437" title="Add Reference..." src="http://www.ideaexcursion.com/wp-content/uploads/2009/10/add-reference-300x139.png" alt="Add Reference..." width="300" height="139" /></a><p class="wp-caption-text">Add Reference...</p></div></li>
<li>On the .NET tab, scroll to find your assembly</li>
<li>Press &#8220;OK&#8221;</li>
<li>The Assembly should now appear under the References list</li>
</ol>
</li>
<li>Import the assembly&#8217;s namespace at the top of the script code
<ol>
<li>(C#) using CustomAssemblyName;</li>
<li>(VB.NET) Imports CustomAssemblyName</li>
</ol>
</li>
<li>You should now have full access to the imported <abbr title="Dynamic Link Library">DLL</abbr></li>
</ol>
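<p>The GAC only accepts strong-named assemblies, so it can be worth confirming that the rebuilt <abbr title="Dynamic Link Library">DLL</abbr> actually carries a public key token before running gacutil. The following is a minimal C# console sketch of such a check, not part of the original steps; the class name CheckStrongName and the default path are placeholders.</p>

<div class="wp_syntax"><div class="code"><pre class="csharp" style="font-family:monospace;">using System;
using System.Reflection;

class CheckStrongName
{
    static void Main(string[] args)
    {
        // Placeholder path; pass the real DLL location as the first argument.
        string path = args.Length &gt; 0 ? args[0] : @"C:\Path\to\CustomAssemblyName.dll";

        // Reads the assembly's identity without adding it to the current application domain.
        AssemblyName name = AssemblyName.GetAssemblyName(path);
        byte[] token = name.GetPublicKeyToken();

        if (token == null || token.Length == 0)
        {
            Console.WriteLine("{0} is NOT strong-named; gacutil is expected to reject it.", name.Name);
        }
        else
        {
            // Print the token as lowercase hex, the form used in full assembly names.
            Console.WriteLine("{0} is signed; PublicKeyToken={1}", name.Name,
                BitConverter.ToString(token).Replace("-", "").ToLower());
        }
    }
}</pre></div></div>

<p>If the token comes back empty, the project was probably rebuilt without the key file selected on the Signing tab.</p>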
<h2>Caveats</h2>
<p>This method works pretty well, but deployment isn&#8217;t exactly seamless &#8211; you&#8217;ll have to repeat this for each server and re-register &amp; copy the <abbr title="Dynamic Link Library">DLL</abbr> separately for any updates. Additionally, there is no way to globally add the assembly reference to the entire project or package. Instead, you&#8217;ll have to repeat step 5 (adding the reference) for every Script Task.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.ideaexcursion.com/2009/10/14/accessing-custom-net-assemblies-in-ssis-2008-script-tasks/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>HOWTO: Connect to MySQL in SSIS</title>
		<link>http://www.ideaexcursion.com/2009/06/04/howto-connect-to-mysql-in-ssis/</link>
		<comments>http://www.ideaexcursion.com/2009/06/04/howto-connect-to-mysql-in-ssis/#comments</comments>
		<pubDate>Thu, 04 Jun 2009 14:58:43 +0000</pubDate>
		<dc:creator>Taylor Gerring</dc:creator>
				<category><![CDATA[Google]]></category>
		<category><![CDATA[SSIS]]></category>
		<category><![CDATA[MySQL]]></category>
		<category><![CDATA[SQL Server]]></category>

		<guid isPermaLink="false">http://www.ideaexcursion.com/?p=793</guid>
		<description><![CDATA[Using the MySQL ADO.NET provider, SQL Server Integration Services can natively query MySQL databases, providing an easy method to transfer data between systems.]]></description>
			<content:encoded><![CDATA[<p>While Microsoft provided <a title="SSIS Team Blog New connectivity options in 2008" href="http://blogs.msdn.com/mattm/archive/2008/03/10/new-connectivity-options-in-2008.aspx" target="_blank">connectors for Oracle, Teradata, and SAP BI</a> for <abbr title="SQL Server Integration Services">SSIS</abbr> 2008, there are many other database systems left out of the mix. Fortunately, <abbr title="SQL Server Integration Services">SSIS</abbr> is exceptionally flexible in connecting to various data sources and allows other vendors to provide native support. The MySQL team did just that with <a title="MySQL :: Download Connector/Net 6.0" href="http://dev.mysql.com/downloads/connector/net/6.0.html" target="_blank">Connector/NET 6.0</a>, their ADO.NET provider. This tool allows us to use the ADO.NET connections in SQL Server Integration Services to easily connect to MySQL. This is a walkthrough on how to connect to MySQL with <abbr title="SQL Server Integration Services">SSIS</abbr> 2005 using the Connector/NET 6.0 ADO.NET provider.<br />
<span id="more-793"></span></p>
<ol>
<li>Download and install MySQL <a title="MySQL :: Download Connector/Net 6.0" href="http://dev.mysql.com/downloads/connector/net/6.0.html" target="_blank">Connector/NET 6.0</a></li>
<li>Start a new Integration Services project in <acronym title="Business Intelligence Development Studio">BIDS</acronym>
</li>
<li>Right-click in Connection Managers and create a new ADO.NET Connection
<p><a rel="attachment wp-att-808" href="http://www.ideaexcursion.com/2009/06/04/howto-connect-to-mysql-in-ssis/new-ado-net-connection/"><img class="size-medium wp-image-808" title="New ADO.NET Connection" src="http://www.ideaexcursion.com/wp-content/uploads/2009/05/new-ado-net-connection-250x300.png" alt="New ADO.NET Connection" width="250" height="300" /></a></li>
<li>In the Provider dropdown, expand .Net Providers and select MySQL Data Provider. Press &quot;OK&quot;
<p><a rel="attachment wp-att-807" href="http://www.ideaexcursion.com/2009/06/04/howto-connect-to-mysql-in-ssis/mysql-data-provider/"><img class="size-medium wp-image-807" title="MySQL Data Provider" src="http://www.ideaexcursion.com/wp-content/uploads/2009/05/mysql-data-provider-300x201.png" alt="MySQL Data Provider" width="300" height="201" /></a></li>
<li>Fill out the Server name, User name, Password and select the database name for the target MySQL server. Be sure to test the connection and press &#8220;OK&#8221;
<p><a rel="attachment wp-att-799" href="http://www.ideaexcursion.com/2009/06/04/howto-connect-to-mysql-in-ssis/connection-manager-connection-info/"><img class="size-medium wp-image-799" title="Connection Manager Connection Info" src="http://www.ideaexcursion.com/wp-content/uploads/2009/05/connection-manager-connection-info-294x300.png" alt="Connection Manager Connection Info" width="294" height="300" /></a></li>
<li>Rename the connection to &#8220;MySQLDB&#8221;
<p><a rel="attachment wp-att-800" href="http://www.ideaexcursion.com/2009/06/04/howto-connect-to-mysql-in-ssis/connection-managers-mysqldb/"><img class="size-full wp-image-800" title="Connection Managers MySQLDB" src="http://www.ideaexcursion.com/wp-content/uploads/2009/05/connection-managers-mysqldb.png" alt="Connection Managers MySQLDB" width="143" height="50" /></a></li>
<li>Open up the Toolbox and drag a Data Flow Task from the toolbox onto the Control Flow surface
<div id="attachment_795" class="wp-caption alignnone" style="width: 310px"><a rel="attachment wp-att-795" href="http://www.ideaexcursion.com/2009/06/04/howto-connect-to-mysql-in-ssis/add-dataflowtask/"><img class="size-medium wp-image-795" title="Add Dataflow Task" src="http://www.ideaexcursion.com/wp-content/uploads/2009/05/add-dataflowtask-300x76.png" alt="Add Dataflow Task" width="300" height="76" /></a><p class="wp-caption-text">Add Dataflow Task</p></div>
</li>
<li>Double-click the Data Flow Task to switch to the Data Flow view</li>
<li>Create a new variable, &#8220;MySQLResult&#8221; with the Data Type of Object. We will be using this as the final destination for the data, so we don&#8217;t need to connect to a file or database to store the data from this test
<p><div id="attachment_812" class="wp-caption alignnone" style="width: 310px"><a rel="attachment wp-att-812" href="http://www.ideaexcursion.com/2009/06/04/howto-connect-to-mysql-in-ssis/variables-mysqlresult/"><img class="size-medium wp-image-812" title="MySQLResult Variable" src="http://www.ideaexcursion.com/wp-content/uploads/2009/05/variables-mysqlresult-300x66.png" alt="MySQLResult Variable" width="300" height="66" /></a><p class="wp-caption-text">MySQLResult Variable</p></div></li>
<li>Drag a new DataReader Source component onto the Data Flow surface
<p><div id="attachment_796" class="wp-caption alignnone" style="width: 310px"><a rel="attachment wp-att-796" href="http://www.ideaexcursion.com/2009/06/04/howto-connect-to-mysql-in-ssis/add-datareader-source/"><img class="size-medium wp-image-796" title="Add DataReader Source" src="http://www.ideaexcursion.com/wp-content/uploads/2009/05/add-datareader-source-300x78.png" alt="Add DataReader Source" width="300" height="78" /></a><p class="wp-caption-text">Add DataReader Source</p></div></li>
<li>Double-click the DataReader Source to open the Advanced Editor. On the Connection Managers tab, select the previously-created MySQLDB connection
<p><div id="attachment_805" class="wp-caption alignnone" style="width: 310px"><a rel="attachment wp-att-805" href="http://www.ideaexcursion.com/2009/06/04/howto-connect-to-mysql-in-ssis/datareader-source-connection-managers/"><img class="size-medium wp-image-805" title="DataReader Source Connection Managers" src="http://www.ideaexcursion.com/wp-content/uploads/2009/05/datareader-source-connection-managers-300x290.png" alt="DataReader Source Connection Managers" width="300" height="290" /></a><p class="wp-caption-text">DataReader Source Connection Managers</p></div>
</li>
<li>Switch to the Component Properties tab and enter the SQL query in the SqlCommand property. Note that the query must be compatible with MySQL syntax, not SQL Server.
<p><div id="attachment_804" class="wp-caption alignnone" style="width: 310px"><a rel="attachment wp-att-804" href="http://www.ideaexcursion.com/2009/06/04/howto-connect-to-mysql-in-ssis/datareader-source-component-properties/"><img class="size-medium wp-image-804" title="DataReader Source Component Properties" src="http://www.ideaexcursion.com/wp-content/uploads/2009/05/datareader-source-component-properties-300x269.png" alt="DataReader Source Component Properties" width="300" height="269" /></a><p class="wp-caption-text">DataReader Source Component Properties</p></div>
</li>
<li>Switch to the Column Mappings tab to verify that the query is successful and that all the columns were pulled from the database. When done, press &#8220;OK&#8221;.
<div id="attachment_803" class="wp-caption alignnone" style="width: 310px"><a rel="attachment wp-att-803" href="http://www.ideaexcursion.com/2009/06/04/howto-connect-to-mysql-in-ssis/datareader-soruce-column-mappings/"><img class="size-medium wp-image-803" title="DataReader Source Column Mappings" src="http://www.ideaexcursion.com/wp-content/uploads/2009/05/datareader-soruce-column-mappings-300x269.png" alt="DataReader Source Column Mappings" width="300" height="269" /></a><p class="wp-caption-text">DataReader Source Column Mappings</p></div></li>
<li>Create a new Recordset Destination by dragging it from the toolbox to the Data Flow surface
<p><div id="attachment_797" class="wp-caption alignnone" style="width: 310px"><a rel="attachment wp-att-797" href="http://www.ideaexcursion.com/2009/06/04/howto-connect-to-mysql-in-ssis/add-recordset-destination/"><img class="size-medium wp-image-797" title="Add Recordset Destination" src="http://www.ideaexcursion.com/wp-content/uploads/2009/05/add-recordset-destination-300x102.png" alt="Add Recordset Destination" width="300" height="102" /></a><p class="wp-caption-text">Add Recordset Destination</p></div>
</li>
<li>Drag the green Data Flow Path from DataReader Source to Recordset Destination, so they connect
<p><div id="attachment_801" class="wp-caption alignnone" style="width: 170px"><a rel="attachment wp-att-801" href="http://www.ideaexcursion.com/2009/06/04/howto-connect-to-mysql-in-ssis/connect-source-destination/"><img class="size-full wp-image-801" title="Connect Source to Destination" src="http://www.ideaexcursion.com/wp-content/uploads/2009/05/connect-source-destination.png" alt="Connect Source to Destination" width="160" height="145" /></a><p class="wp-caption-text">Connect Source to Destination</p></div></li>
<li>Double-click the Recordset Destination to open its Advanced Editor</li>
<li>Under Custom Properties, select the dropdown for VariableName and select the variable we created before, User::MySQLResult
<p><div id="attachment_810" class="wp-caption alignnone" style="width: 310px"><a rel="attachment wp-att-810" href="http://www.ideaexcursion.com/2009/06/04/howto-connect-to-mysql-in-ssis/recordset-destination-component-properties/"><img class="size-medium wp-image-810" title="Recordset Destination Component Properties" src="http://www.ideaexcursion.com/wp-content/uploads/2009/05/recordset-destination-component-properties-300x269.png" alt="Recordset Destination Component Properties" width="300" height="269" /></a><p class="wp-caption-text">Recordset Destination Component Properties</p></div></li>
<li>Switch to the Input Columns tab and select those columns that you want stored in the Recordset Destination. When complete, click &#8220;OK&#8221;
<p><div id="attachment_811" class="wp-caption alignnone" style="width: 308px"><a rel="attachment wp-att-811" href="http://www.ideaexcursion.com/2009/06/04/howto-connect-to-mysql-in-ssis/recordset-destination-input-columns/"><img class="size-medium wp-image-811" title="Recordset Destination Input Columns" src="http://www.ideaexcursion.com/wp-content/uploads/2009/05/recordset-destination-input-columns-298x300.png" alt="Recordset Destination Input Columns" width="298" height="300" /></a><p class="wp-caption-text">Recordset Destination Input Columns</p></div>
</li>
<li>Right-click the green Data Flow Path and choose &#8220;Data Viewers&#8230;&#8221;
<p><div id="attachment_813" class="wp-caption alignnone" style="width: 264px"><a rel="attachment wp-att-813" href="http://www.ideaexcursion.com/2009/06/04/howto-connect-to-mysql-in-ssis/data-flow-path-data-viewers/"><img class="size-medium wp-image-813" title="Data Flow Path Data Viewers" src="http://www.ideaexcursion.com/wp-content/uploads/2009/05/data-flow-path-data-viewers-254x300.png" alt="Data Flow Path Data Viewers" width="254" height="300" /></a><p class="wp-caption-text">Data Flow Path Data Viewers</p></div>
</li>
<li>Select &#8220;Data Viewers&#8221; from the left pane and click the &#8220;Add&#8230;&#8221; button
<p><div id="attachment_802" class="wp-caption alignnone" style="width: 310px"><a rel="attachment wp-att-802" href="http://www.ideaexcursion.com/2009/06/04/howto-connect-to-mysql-in-ssis/data-flow-path-editor/"><img class="size-medium wp-image-802" title="Data Flow Path Editor" src="http://www.ideaexcursion.com/wp-content/uploads/2009/05/data-flow-path-editor-300x253.png" alt="Data Flow Path Editor" width="300" height="253" /></a><p class="wp-caption-text">Data Flow Path Editor</p></div>
</li>
<li>Under the General tab, select Grid and press &#8220;OK&#8221;
<p><div id="attachment_798" class="wp-caption alignnone" style="width: 310px"><a rel="attachment wp-att-798" href="http://www.ideaexcursion.com/2009/06/04/howto-connect-to-mysql-in-ssis/configure-data-viewer/"><img class="size-medium wp-image-798" title="Configure Data Viewer" src="http://www.ideaexcursion.com/wp-content/uploads/2009/05/configure-data-viewer-300x229.png" alt="Configure Data Viewer" width="300" height="229" /></a><p class="wp-caption-text">Configure Data Viewer</p></div></li>
<li>Run the package</li>
<li>If you&#8217;ve done everything correctly, you should see a Data Reader Output Data Viewer window pop up with the contents of the query we specified earlier.
<p><div id="attachment_806" class="wp-caption alignnone" style="width: 310px"><a rel="attachment wp-att-806" href="http://www.ideaexcursion.com/2009/06/04/howto-connect-to-mysql-in-ssis/data-viewer-output/"><img class="size-medium wp-image-806" title="Data Viewer Output" src="http://www.ideaexcursion.com/wp-content/uploads/2009/05/data-viewer-output-300x165.png" alt="Data Viewer Output" width="300" height="165" /></a><p class="wp-caption-text">Data Viewer Output</p></div></li>
</ol>
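<p>If the connection test in the Connection Manager fails, it can help to check the provider and credentials outside of <abbr title="SQL Server Integration Services">SSIS</abbr>. The following is a minimal C# console sketch, assuming a project reference to the MySql.Data.dll installed by Connector/NET; the server, user, password and database values in the connection string are placeholders, not values from this walkthrough.</p>

<div class="wp_syntax"><div class="code"><pre class="csharp" style="font-family:monospace;">using System;
using MySql.Data.MySqlClient; // provided by Connector/NET (MySql.Data.dll)

class MySqlSmokeTest
{
    static void Main()
    {
        // Placeholder values; substitute the same details entered in the Connection Manager.
        string connectionString =
            "server=localhost;user id=ssis_user;password=secret;database=test";

        using (MySqlConnection connection = new MySqlConnection(connectionString))
        {
            // Throws a MySqlException if the server is unreachable or the login is rejected.
            connection.Open();

            // Any statement valid in MySQL syntax would do; VERSION() needs no tables.
            using (MySqlCommand command = new MySqlCommand("SELECT VERSION()", connection))
            {
                object version = command.ExecuteScalar();
                Console.WriteLine("Connected to MySQL {0}", version);
            }
        }
    }
}</pre></div></div>

<p>If this runs cleanly but the package still fails, the problem is more likely in the package configuration than in the provider.</p>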
<p>SQL Server Integration Services makes connecting to other systems very easy. The MySQL ADO.NET provider works well, but requires more configuration than a native Source component.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.ideaexcursion.com/2009/06/04/howto-connect-to-mysql-in-ssis/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>Handling Embedded Text Qualifiers in SSIS 2005</title>
		<link>http://www.ideaexcursion.com/2009/02/03/handling-embedded-text-qualifiers-in-ssis-2005/</link>
		<comments>http://www.ideaexcursion.com/2009/02/03/handling-embedded-text-qualifiers-in-ssis-2005/#comments</comments>
		<pubDate>Tue, 03 Feb 2009 16:32:04 +0000</pubDate>
		<dc:creator>Taylor Gerring</dc:creator>
				<category><![CDATA[SQL Server]]></category>
		<category><![CDATA[SSIS]]></category>
		<category><![CDATA[HOWTO]]></category>
		<category><![CDATA[VB.NET]]></category>

		<guid isPermaLink="false">http://www.ideaexcursion.com/?p=415</guid>
		<description><![CDATA[Just a quick note advising that I&#8217;ve updated my Handling Embedded Text Qualifiers post to also include a Visual Basic example, making the information also relevant to SQL Server 2005.]]></description>
			<content:encoded><![CDATA[<p>Just a quick note advising that I&#8217;ve updated my <a title="Handling Embedded Text Qualifiers" href="http://www.ideaexcursion.com/2008/11/12/handling-embedded-text-qualifiers/">Handling Embedded Text Qualifiers</a> post to also include a Visual Basic example, making the information also relevant to SQL Server 2005.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.ideaexcursion.com/2009/02/03/handling-embedded-text-qualifiers-in-ssis-2005/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Import Wikipedia articles into SQL Server with SSIS</title>
		<link>http://www.ideaexcursion.com/2009/01/26/import-wikipedia-articles-into-sql-server-with-ssis/</link>
		<comments>http://www.ideaexcursion.com/2009/01/26/import-wikipedia-articles-into-sql-server-with-ssis/#comments</comments>
		<pubDate>Mon, 26 Jan 2009 22:22:56 +0000</pubDate>
		<dc:creator>Taylor Gerring</dc:creator>
				<category><![CDATA[SQL Server]]></category>
		<category><![CDATA[SSIS]]></category>
		<category><![CDATA[HOWTO]]></category>

		<guid isPermaLink="false">http://www.ideaexcursion.com/?p=265</guid>
		<description><![CDATA[Import the XML dump of Wikipedia data into SQL Server through SSIS. Full walkthrough with pictures and pre-configured SSIS package provided.]]></description>
			<content:encoded><![CDATA[<p>If you&#8217;ve ever wanted to mine <a title="Wikipedia" href="http://wikipedia.org/">Wikipedia</a> data, it would be possible &#8211; but difficult &#8211; to scrape the whole site. Instead of performing such a slow &amp; arduous operation, the <a title="Home - Wikimedia Foundation" href="http://wikimediafoundation.org/">Wikimedia Foundation</a> has provided the contents for free, in a downloadable format. These exports can then be loaded and used for a multitude of reasons, including personal use.<br />
<span id="more-265"></span></p>
<h3>Ingredients</h3>
<p>I&#8217;ve created an <abbr title="SQL Server Integration Services">SSIS</abbr> package that will import the articles and pages into a SQL Server 2005 database. To do this, you&#8217;ll first need to gather a few files:</p>
<ul>
<li><a title="EN Wikipedia database dumps" href="http://download.wikimedia.org/enwiki/latest/" target="_blank">Latest Wikipedia pages/articles dump</a> (Download enwiki-latest-pages-articles.xml.bz2, approximately 4.1<abbr title="GigaByte">GB</abbr> at time of writing)</li>
<li><a title="MediaWiki XML Schema Definition" href="http://www.mediawiki.org/xml/export-0.3.xsd" target="_blank">MediaWiki XSD</a> (originally located on <a title="Manual:XML Import file manipulation in CSharp" href="http://www.mediawiki.org/wiki/Manual:XML_Import_file_manipulation_in_CSharp" target="_blank">Manual:XML Import file manipulation in CSharp</a>)</li>
<li><a title="Import Wikipedia SSIS Package" href="http://www.ideaexcursion.com/wp-content/uploads/2009/01/WikipediaImport.zip">SSIS Package &amp; Database Scripts<br />
</a></li>
</ul>
<p>You&#8217;ll need approximately 75<abbr title="GigaByte">GB</abbr> of free space &#8211; 50<abbr title="GigaByte">GB</abbr> for the database and 20<abbr title="GigaByte">GB</abbr> for the <abbr title="eXtensible Markup Language">XML</abbr> file. Also, this will likely take several hours, if not longer. If you have separate drive spindles, it would certainly help to separate the XML and database files. My example uses C:\wikipedia\ as the working folder; if you prefer another location, we&#8217;ll configure it later. If not, this is what the structure should look like:</p>
<div id="attachment_303" class="wp-caption alignnone" style="width: 563px"><a href="http://www.ideaexcursion.com/wp-content/uploads/2009/01/ssiswpdirlisting.png"><img class="size-full wp-image-303" title="Directory Listing" src="http://www.ideaexcursion.com/wp-content/uploads/2009/01/ssiswpdirlisting.png" alt="Directory Listing" width="553" height="155" /></a><p class="wp-caption-text">Directory Listing</p></div>
<h3>Recipe</h3>
<p><em><strong>Note:</strong> If you&#8217;re planning on importing the <abbr title="eXtensible Markup Language">XML</abbr> file into a remote server, I highly recommend performing all these operations on the server itself through Remote Desktop. Aside from having to re-transfer the gigantic <abbr title="eXtensible Markup Language">XML</abbr> dump, debugging <abbr title="SQL Server Integration Services">SSIS</abbr> packages is much easier when working locally.</em></p>
<ol>
<li>Connect to the SQL Server database and run WikipediaImport\WikipediaImport\DatabaseCreate.sql
<ol>
<li>This creates the database (cleverly named &#8220;Wikipedia&#8221;) on C:\wikipedia\. If you want the data and log files located elsewhere, find 50<abbr title="GigaByte">GB</abbr> of free space and update the CREATE DATABASE statement.</li>
<li>Because we know the database is going to grow immediately, I&#8217;ve told the script to allocate 40<abbr title="GigaByte">GB</abbr> for data and 10<abbr title="GigaByte">GB</abbr> for log, so this step may take a while to run.</li>
</ol>
</li>
<li>Open WikipediaImport\WikipediaImport.sln in Visual Studio 2005</li>
<li>Enable the Variables window if it is not already visible
<ol>
<li>Select Data Flow</li>
<li>Select the View menu</li>
<li>Select Other Windows</li>
<li>Select Variables
<div id="attachment_308" class="wp-caption alignnone" style="width: 260px"><a href="http://www.ideaexcursion.com/wp-content/uploads/2009/01/ssiswpvariablesmenu.png"><img class="size-medium wp-image-308" title="Enable Variables Menu" src="http://www.ideaexcursion.com/wp-content/uploads/2009/01/ssiswpvariablesmenu-250x300.png" alt="Enable Variables Menu" width="250" height="300" /></a><p class="wp-caption-text">Enable Variables Menu</p></div></li>
</ol>
</li>
<li>If the working files were placed somewhere besides C:\wikipedia\, you can configure that in the Variables window. Be sure to update both PageArticlesXML and PageArticlesXSD
<p><div id="attachment_309" class="wp-caption alignnone" style="width: 571px"><a href="http://www.ideaexcursion.com/wp-content/uploads/2009/01/ssiswpvariablevalues.png"><img class="size-full wp-image-309" title="Set Variable Values" src="http://www.ideaexcursion.com/wp-content/uploads/2009/01/ssiswpvariablevalues.png" alt="Set Variable Values" width="561" height="94" /></a><p class="wp-caption-text">Set Variable Values</p></div></li>
<li>Additionally, if you&#8217;re not importing to localhost, configure the database connection variable (named DatabaseConnection)</li>
<li>Verify there are no warnings or errors and build the solution (Ctrl + Shift + B or Build &#8594; Build WikipediaImport)</li>
<li>If the build succeeds, go ahead and run it (F5 or Debug &#8594; Start Debugging)</li>
<li>If all goes well, the <abbr title="eXtensible Markup Language">XML</abbr> file should now be streaming into the database. This will likely take hours, even with a fast <acronym title="Redundant Array of Inexpensive Disks">RAID</acronym>. Notice that the file only requires a single pass, rather than scanning it once per table.
<p><div id="attachment_305" class="wp-caption alignnone" style="width: 310px"><a href="http://www.ideaexcursion.com/wp-content/uploads/2009/01/ssiswpprogress.png"><img class="size-medium wp-image-305" title="Import Progress" src="http://www.ideaexcursion.com/wp-content/uploads/2009/01/ssiswpprogress-300x163.png" alt="Import Progress" width="300" height="163" /></a><p class="wp-caption-text">Import Progress</p></div></li>
<li>Switch back to <abbr title="SQL Server Management Studio">SSMS</abbr> and run WikipediaImport\WikipediaImport\IndexCreate.sql
<ol>
<li>This step is technically optional, but is going to help speed up your queries significantly</li>
<li>If we had created the indexes before the import, the import would have been even slower</li>
<li>This will take a while!</li>
</ol>
</li>
<li>Run a test query

<div class="wp_syntax"><div class="code"><pre class="tsql" style="font-family:monospace;"><span style="color: #0000FF;">SELECT</span> <span style="color: #808080;">*</span> <span style="color: #0000FF;">FROM</span> dbo.<span style="color: #202020;">vw_Articles</span> <span style="color: #0000FF;">WHERE</span> title <span style="color: #808080;">=</span> <span style="color: #FF0000;">'Green Day'</span></pre></div></div>

</li>
</ol>
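<p>Before committing hours to the import, it may be worth confirming that the decompressed dump can be read front to back in one pass. The following is a minimal C# console sketch, not taken from the provided package: it streams the file with XmlReader and counts the &lt;page&gt; elements defined by the MediaWiki export schema. The default file name assumes the dump was decompressed into C:\wikipedia\ under its original name.</p>

<div class="wp_syntax"><div class="code"><pre class="csharp" style="font-family:monospace;">using System;
using System.Xml;

class CountPages
{
    static void Main(string[] args)
    {
        // Assumed location; pass another path as the first argument if needed.
        string path = args.Length &gt; 0 ? args[0] : @"C:\wikipedia\enwiki-latest-pages-articles.xml";

        long pages = 0;
        XmlReaderSettings settings = new XmlReaderSettings();
        settings.IgnoreWhitespace = true;

        // XmlReader is forward-only, so the whole file is never held in memory.
        using (XmlReader reader = XmlReader.Create(path, settings))
        {
            while (reader.Read())
            {
                // Match on the local name so the schema namespace does not matter.
                if (reader.NodeType == XmlNodeType.Element &amp;&amp; reader.LocalName == "page")
                {
                    pages++;
                }
            }
        }

        Console.WriteLine("Pages seen: {0}", pages);
    }
}</pre></div></div>

<p>An XmlException part-way through usually points to a truncated download or an incomplete decompression rather than a problem with the package.</p>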
<h3>Behind the Scenes</h3>
<p>Getting this to work took a while of tweaking, but there are a few highlights I&#8217;d like to point out.</p>
<h4>Data Types</h4>
<p><abbr title="XML Schema Definition">XSD</abbr>: The provided <abbr title="eXtensible Markup Language">XML</abbr> Schema Definition file does not contain any information about the intended length of the string data. Fortunately, through testing, I was able to shrink some of those sizes down, although they do not strictly conform to the <a title="Wikipedia:Database download SQL schema" href="http://en.wikipedia.org/wiki/Wikipedia_database#SQL_schema" target="_blank">official database schema</a>. Specifically, I have shrunk page.restrictions and text.space from nvarchar(255) to nvarchar(50). Most other items conform as closely as possible. In addition to these, I had to update text.text to an nvarchar(max). <abbr title="SQL Server Integration Services">SSIS</abbr> initially suggested an nvarchar(255), but articles are obviously much longer than this. To perform these changes, right-click the <abbr title="eXtensible Markup Language">XML</abbr> Source (named PageArticles) in the Import <abbr title="eXtensible Markup Language">XML</abbr> Data Flow Task and select &#8220;Show Advanced Editor&#8230;&#8221;</p>
<p><div id="attachment_306" class="wp-caption alignnone" style="width: 218px"><a href="http://www.ideaexcursion.com/wp-content/uploads/2009/01/ssiswpshowadvancededitor.png"><img class="size-full wp-image-306" title="Advanced Editor Menu" src="http://www.ideaexcursion.com/wp-content/uploads/2009/01/ssiswpshowadvancededitor.png" alt="Advanced Editor Menu" width="208" height="321" /></a><p class="wp-caption-text">Advanced Editor Menu</p></div>
<p>Expand out each of the changed columns (both External and Output) and update the DataType. For example, the SQL Server-specific nvarchar(max) corresponds to the more general &#8220;Unicode text stream [DT_NTEXT]&#8221; in <abbr title="SQL Server Integration Services">SSIS</abbr>. For the others, just update the length from 255 to 50.</p>
<div id="attachment_307" class="wp-caption alignnone" style="width: 310px"><a href="http://www.ideaexcursion.com/wp-content/uploads/2009/01/ssiswptextstream.png"><img class="size-medium wp-image-307" title="Set Data Types" src="http://www.ideaexcursion.com/wp-content/uploads/2009/01/ssiswptextstream-300x270.png" alt="Set Data Types" width="300" height="270" /></a><p class="wp-caption-text">Set Data Types</p></div>
<h4>Extraneous information</h4>
<p>There are many more fields in the <abbr title="eXtensible Markup Language">XML</abbr> file than I&#8217;ve decided to import. Unfortunately, I can&#8217;t just turn them off completely, lest <abbr title="SQL Server Integration Services">SSIS</abbr> complain. Instead I have chosen to suffer the lesser fate of &#8220;Warning&#8221;. Additionally, I changed the Error Output to &#8220;Ignore failure&#8221; on Error. This screen can be accessed by double-clicking the <abbr title="eXtensible Markup Language">XML</abbr> Source, PageArticles, then selecting the Error Output page.</p>
<div id="attachment_304" class="wp-caption alignnone" style="width: 310px"><a href="http://www.ideaexcursion.com/wp-content/uploads/2009/01/ssiswperroroutput.png"><img class="size-medium wp-image-304" title="Configure Error Output" src="http://www.ideaexcursion.com/wp-content/uploads/2009/01/ssiswperroroutput-300x259.png" alt="Configure Error Output" width="300" height="259" /></a><p class="wp-caption-text">Configure Error Output</p></div>
<h3>Wrap-up</h3>
<p><abbr title="SQL Server Integration Services">SSIS</abbr> is very picky about metadata, which made this a somewhat difficult project to get running. However, it <em>is</em> actually running. This could be extended further with an automated download and a larger amount of imported data, but for now it serves its purpose.</p>
<p>I know not everyone will get this working on the first run, so if you have a problem, please leave a comment below and I&#8217;ll do my best to answer in a timely manner.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.ideaexcursion.com/2009/01/26/import-wikipedia-articles-into-sql-server-with-ssis/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>Set FTP password in SSIS</title>
		<link>http://www.ideaexcursion.com/2008/11/24/set-ftp-password-in-ssis/</link>
		<comments>http://www.ideaexcursion.com/2008/11/24/set-ftp-password-in-ssis/#comments</comments>
		<pubDate>Mon, 24 Nov 2008 17:08:44 +0000</pubDate>
		<dc:creator>Taylor Gerring</dc:creator>
				<category><![CDATA[SQL Server]]></category>
		<category><![CDATA[SSIS]]></category>
		<category><![CDATA[C#]]></category>
		<category><![CDATA[FTP]]></category>
		<category><![CDATA[HOWTO]]></category>
		<category><![CDATA[SQL Server 2008]]></category>

		<guid isPermaLink="false">http://www.ideaexcursion.com/?p=97</guid>
		<description><![CDATA[SSIS does not allow an FTP password to be set through Expressions.  Fortunately, we can manually set it from a value stored in a variable, using a Script Task.]]></description>
			<content:encoded><![CDATA[<p>Undoubtedly, you&#8217;re reading this because you&#8217;ve discovered that SQL Server Integration Services (as of <abbr title="SQL Server Integration Services">SSIS</abbr> 2008) will not allow you to set the password of an <abbr title="File Transfer Protocol">FTP</abbr> connection through expressions. Fortunately, there is an easy workaround that requires only a simple Script Task. While not as simple as native expression support, it&#8217;s darn close. I&#8217;ve included C# code, but you may need to adapt it to VB.NET if that&#8217;s your preferred flavor.<br />
<span id="more-97"></span></p>
<ol>
<li>Ensure a string variable is set up with the password. For this demo, I&#8217;m using the name &#8220;FTPPassword&#8221;</li>
<li>Add a Script Task to your package</li>
<li>Edit the task</li>
<li>On the Script page, click the ellipsis for ReadOnlyVariables and check the box for User::FTPPassword.</li>
<li>Click the &#8220;Edit Script&#8230;&#8221; button</li>
<li>Change your entry point (Main, by default) to look like the code below. Save, close, and hit OK.</li>
</ol>

<div class="wp_syntax"><table><tr><td class="line_numbers"><pre>1
2
3
4
5
6
7
8
</pre></td><td class="code"><pre class="csharp" style="font-family:monospace;"><span style="color: #0600FF;">public</span> <span style="color: #0600FF;">void</span> Main<span style="color: #000000;">&#40;</span><span style="color: #000000;">&#41;</span>
<span style="color: #000000;">&#123;</span>
	ConnectionManager FTPConn<span style="color: #008000;">;</span>
	FTPConn <span style="color: #008000;">=</span> Dts.<span style="color: #0000FF;">Connections</span><span style="color: #000000;">&#91;</span><span style="color: #666666;">&quot;FTPServer&quot;</span><span style="color: #000000;">&#93;</span><span style="color: #008000;">;</span>
	FTPConn.<span style="color: #0000FF;">Properties</span><span style="color: #000000;">&#91;</span><span style="color: #666666;">&quot;ServerPassword&quot;</span><span style="color: #000000;">&#93;</span>.<span style="color: #0000FF;">SetValue</span><span style="color: #000000;">&#40;</span>FTPConn, Dts.<span style="color: #0000FF;">Variables</span><span style="color: #000000;">&#91;</span><span style="color: #666666;">&quot;FTPPassword&quot;</span><span style="color: #000000;">&#93;</span>.<span style="color: #0000FF;">Value</span><span style="color: #000000;">&#41;</span><span style="color: #008000;">;</span>
&nbsp;
	Dts.<span style="color: #0000FF;">TaskResult</span> <span style="color: #008000;">=</span> <span style="color: #000000;">&#40;</span><span style="color: #FF0000;">int</span><span style="color: #000000;">&#41;</span>ScriptResults.<span style="color: #0000FF;">Success</span><span style="color: #008000;">;</span>
<span style="color: #000000;">&#125;</span></pre></td></tr></table></div>

<p>A couple notes:</p>
<ul>
<li>Ensure you update line 4 to reflect the actual connection name. My example uses the name FTPServer.</li>
<li>Just to reiterate, my password is stored in a variable named FTPPassword. If yours is different, make this change on line 5.</li>
</ul>
<p>That&#8217;s it. Make sure you&#8217;ve got this task being executed <strong>before</strong> your actual <abbr title="File Transfer Protocol">FTP</abbr> task and everything should work fine. Cheers!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.ideaexcursion.com/2008/11/24/set-ftp-password-in-ssis/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Handling Embedded Text Qualifiers</title>
		<link>http://www.ideaexcursion.com/2008/11/12/handling-embedded-text-qualifiers/</link>
		<comments>http://www.ideaexcursion.com/2008/11/12/handling-embedded-text-qualifiers/#comments</comments>
		<pubDate>Wed, 12 Nov 2008 19:43:20 +0000</pubDate>
		<dc:creator>Taylor Gerring</dc:creator>
				<category><![CDATA[SQL Server]]></category>
		<category><![CDATA[SSIS]]></category>
		<category><![CDATA[C#]]></category>
		<category><![CDATA[HOWTO]]></category>

		<guid isPermaLink="false">http://www.ideaexcursion.com/?p=88</guid>
		<description><![CDATA[SSIS breaks when processing embedded text qualifiers in flat file sources. Tutorial on how to configure a script component to manually split on delimiters.]]></description>
			<content:encoded><![CDATA[<p>It seems that <abbr title="SQL Server Integration Services">SSIS</abbr> can&#8217;t handle embedded text qualifiers when importing from a flat file. What is an embedded text qualifier? Let&#8217;s say you have a <abbr title="Comma-separated values">CSV</abbr> file with a few fields. To keep commas inside your text fields from being confused with your column delimiters, you wrap each field in a text qualifier &#8211; typically double quotes. That makes a row look something like this:</p>

<div class="wp_syntax"><div class="code"><pre class="none" style="font-family:monospace;">&quot;12036&quot;,&quot;Company Name, Inc.&quot;,&quot;555-555-1234&quot;,&quot;3.14159&quot;</pre></div></div>

<p>That&#8217;s fine, but what happens when the field also contains quotes, such as this:</p>

<div class="wp_syntax"><div class="code"><pre class="none" style="font-family:monospace;">&quot;52665&quot;,&quot;Best &quot;Kept&quot; Secret Storage Facility&quot;,&quot;555-555-9876&quot;,&quot;2.71828&quot;</pre></div></div>

<p><span id="more-88"></span><br />
<em><strong>Note:</strong> This post originally assumed you were working with SQL Server 2008 because the included code was only provided in C#. The post has been updated to include a <abbr title="Visual Basic">VB</abbr> .NET code sample as well.</em></p>
<p>Now, depending on how the parser interprets this line, it could treat the double quote before &#8220;Kept&#8221; as the end of the field and expect a column delimiter (a comma in our case) to follow. As it happens, Integration Services behaves exactly this way and will fail on this type of line. It&#8217;s a <a title="Google Search: ssis embedded text qualifiers" href="http://www.google.com/search?q=ssis+embedded+text+qualifiers" target="_blank">well-documented bug</a> and <a title="Microsoft Connect: Flat File Parser cannot import files with embedded text qualifiers" href="http://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=312164" target="_blank">oft-requested fix</a>, but even as of SQL Server 2008, the issue is still present (Note: This apparently should be a supported configuration according to <a title="RFC4180 - Common Format and MIME Type for Comma-Separated Value" href="http://www.faqs.org/rfcs/rfc4180.html" target="_blank">RFC 4180</a>). Ideally, you can simply import a different format &#8211; using a different text qualifier is probably the easiest change (ever tried the thorn?). If this is not feasible, a variety of other solutions have been suggested. The most flexible is to write your own Script Transformation to custom-parse the rows.</p>
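<p>To make that failure mode concrete, here is a minimal C# sketch of a strict, qualifier-aware splitter. It only illustrates the behavior described above; it is not how the <abbr title="SQL Server Integration Services">SSIS</abbr> parser is actually implemented, and the names StrictCsv and SplitStrict are made up for this example.</p>

<div class="wp_syntax"><div class="code"><pre class="csharp" style="font-family:monospace;">using System;
using System.Collections.Generic;
using System.Text;

public static class StrictCsv
{
    // Hypothetical strict splitter: every field must be wrapped in double quotes,
    // and a closing quote must be followed by a comma or the end of the line.
    public static List&lt;string&gt; SplitStrict(string line)
    {
        var fields = new List&lt;string&gt;();
        int i = 0;
        while (i &lt; line.Length)
        {
            if (line[i] != '&quot;')
                throw new FormatException(&quot;Expected opening qualifier at position &quot; + i);
            i++; // skip the opening qualifier

            var field = new StringBuilder();
            while (i &lt; line.Length &amp;&amp; line[i] != '&quot;')
                field.Append(line[i++]);

            if (i &gt;= line.Length)
                throw new FormatException(&quot;Missing closing qualifier&quot;);
            i++; // skip the closing qualifier

            // An embedded qualifier trips this check: the parser has seen a quote
            // and expects the field to end, but finds more text instead.
            if (i &lt; line.Length &amp;&amp; line[i] != ',')
                throw new FormatException(&quot;Expected column delimiter at position &quot; + i);
            i++; // skip the delimiter (or step past the end of the line)

            fields.Add(field.ToString());
        }
        return fields;
    }
}</pre></div></div>

<p>Passing the first sample row to SplitStrict returns four fields, with the comma in &#8220;Company Name, Inc.&#8221; left intact. The second sample row throws a FormatException as soon as the parser reaches the K in &#8220;Kept&#8221;.</p>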
<p>The setup is simple: configure the Flat File Source to import the whole record as a single field. Pipe that output to your script component, and then connect that to the destination database or rest of your process. Here&#8217;s what my Data Flow Task looks like:</p>
<div id="attachment_89" class="wp-caption alignnone" style="width: 310px"><a href="http://www.ideaexcursion.com/wp-content/uploads/2008/11/datataskoverview.png"><img class="size-medium wp-image-89" title="Data Flow Task Overview" src="http://www.ideaexcursion.com/wp-content/uploads/2008/11/datataskoverview-300x269.png" alt="Data Task Overview" width="300" height="269" /></a><p class="wp-caption-text">Data Flow Task Overview</p></div>
<p>And here are the settings I used for my Flat File Connection:</p>
<ul>
<li>General
<ul>
<li>Text qualifier: &lt;none&gt;</li>
<li>Header row delimiter: {CR}{LF}</li>
</ul>
</li>
<li>Columns
<ul>
<li>Row delimiter: {CR}{LF}</li>
<li>Column delimiter: [blank]</li>
</ul>
</li>
<li>Advanced
<ul>
<li>Name your column. My example uses the unimaginative name, &#8220;line&#8221;.</li>
<li>DataType: string [DT_STR]</li>
<li>OutputColumnWidth: 8000</li>
</ul>
</li>
</ul>
<p>And the steps to get started:</p>
<ol>
<li>Go ahead and add a Flat File Source to your Data Flow Task and configure it to use this Flat File Connection</li>
<li>Now, place a Script Component on your Data Flow and select &#8220;Transformation&#8221; when prompted</li>
<li>Drag the green output arrow from Flat File Source to Script Component</li>
<li>Edit the Script Component and switch over to the &#8220;Inputs and Outputs&#8221; page</li>
<li>Rename the output to &#8220;Match Rows&#8221;, or anything else of your choosing.</li>
<li>Set ExclusionGroup = 1</li>
<li>Add the correct output columns, with the necessary names. If you set the types to string [DT_STR], you can convert them in a Data Conversion transformation later. Or, if you prefer to set the correct types now, you&#8217;ll need to cast them explicitly in the script component. Add an extra column for the line number with a type of &#8220;four-byte signed integer [DT_I4]&#8221;. I called mine MatchLineNum.</li>
<li>Add another output and call it &#8220;Error Rows&#8221; or anything else of your choosing.</li>
<li>Set ExclusionGroup = 1</li>
<li>Set &#8220;Synchronous InputID&#8221; to your input. In my case, it is named &#8220;Input 0&#8221;</li>
<li>Add only two columns, ErrorLine (string [DT_STR] 8000) and ErrorLineNum (four-byte signed integer [DT_I4]).</li>
</ol>
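<p>If you go with typed output columns in step 7, the conversion inside the script could look roughly like the sketch below. This is only one way to do it, assuming an integer ID and a decimal for the last field; the FieldConversion class and its method names are hypothetical and not part of <abbr title="SQL Server Integration Services">SSIS</abbr>.</p>

<div class="wp_syntax"><div class="code"><pre class="csharp" style="font-family:monospace;">using System;
using System.Globalization;

public static class FieldConversion
{
    // Hypothetical helper: parse the ID text (qualifiers already removed) as an int.
    // Returns false instead of throwing when the text is not a valid integer.
    public static bool TryParseId(string text, out int id)
    {
        return int.TryParse(text, NumberStyles.Integer, CultureInfo.InvariantCulture, out id);
    }

    // Hypothetical helper: parse a numeric field such as 3.14159 as a decimal.
    // InvariantCulture keeps the decimal point from depending on server locale.
    public static bool TryParseNumber(string text, out decimal value)
    {
        return decimal.TryParse(text, NumberStyles.Number, CultureInfo.InvariantCulture, out value);
    }
}</pre></div></div>

<p>A conversion that returns false can be treated the same way as a row with the wrong number of columns, by sending the row to the Error Rows output instead of assigning the typed column.</p>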
<p>Now, a quick explanation of what all this was for: Because we are importing the whole row as a single column, we want to configure the Script Component to output the individual columns. We did this part when we completed steps 5-7. Next, we configured an additional output for rows that we are not able to successfully parse. This is why we defined the Error Rows output in steps 8-11. Be aware that if you&#8217;re supremely confident that you can parse every row, you could optionally skip the creation of the Error Rows output, but I would advise against it.</p>
<p>Now, we will create the actual script to parse the rows manually.</p>
<ol>
<li>Switch back to the Script page and click the &#8220;Edit Script&#8230;&#8221; button. This will open up a script editor that looks like a stripped-down Visual Studio.</li>
<li>Take note of the following code, which I will explain line by line afterward:</li>
</ol>

<div class="wp_syntax"><table><tr><td class="line_numbers"><pre>1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
</pre></td><td class="code"><pre class="csharp" style="font-family:monospace;"><span style="color: #008080; font-style: italic;">/* Microsoft SQL Server Integration Services Script Component
*  Write scripts using Microsoft Visual C# 2008.
*  ScriptMain is the entry point class of the script.*/</span>
&nbsp;
<span style="color: #0600FF;">using</span> <span style="color: #008080;">System</span><span style="color: #008000;">;</span>
<span style="color: #0600FF;">using</span> <span style="color: #008080;">System.Data</span><span style="color: #008000;">;</span>
<span style="color: #0600FF;">using</span> <span style="color: #008080;">Microsoft.SqlServer.Dts.Pipeline.Wrapper</span><span style="color: #008000;">;</span>
<span style="color: #0600FF;">using</span> <span style="color: #008080;">Microsoft.SqlServer.Dts.Runtime.Wrapper</span><span style="color: #008000;">;</span>
&nbsp;
<span style="color: #000000;">&#91;</span>Microsoft.<span style="color: #0000FF;">SqlServer</span>.<span style="color: #0000FF;">Dts</span>.<span style="color: #0000FF;">Pipeline</span>.<span style="color: #0000FF;">SSISScriptComponentEntryPointAttribute</span><span style="color: #000000;">&#93;</span>
<span style="color: #0600FF;">public</span> <span style="color: #FF0000;">class</span> ScriptMain <span style="color: #008000;">:</span> UserComponent
<span style="color: #000000;">&#123;</span>
    <span style="color: #008080; font-style: italic;">//Declare LineNum to keep track of the number of rows</span>
    <span style="color: #FF0000;">int</span> LineNum<span style="color: #008000;">;</span>
&nbsp;
    <span style="color: #0600FF;">public</span> <span style="color: #0600FF;">override</span> <span style="color: #0600FF;">void</span> PreExecute<span style="color: #000000;">&#40;</span><span style="color: #000000;">&#41;</span>
    <span style="color: #000000;">&#123;</span>
        <span style="color: #0600FF;">base</span>.<span style="color: #0000FF;">PreExecute</span><span style="color: #000000;">&#40;</span><span style="color: #000000;">&#41;</span><span style="color: #008000;">;</span>
&nbsp;
        <span style="color: #008080; font-style: italic;">//Initialize LineNum</span>
        LineNum <span style="color: #008000;">=</span> <span style="color: #FF0000;">0</span><span style="color: #008000;">;</span>
    <span style="color: #000000;">&#125;</span>
&nbsp;
    <span style="color: #0600FF;">public</span> <span style="color: #0600FF;">override</span> <span style="color: #0600FF;">void</span> PostExecute<span style="color: #000000;">&#40;</span><span style="color: #000000;">&#41;</span>
    <span style="color: #000000;">&#123;</span>
        <span style="color: #0600FF;">base</span>.<span style="color: #0000FF;">PostExecute</span><span style="color: #000000;">&#40;</span><span style="color: #000000;">&#41;</span><span style="color: #008000;">;</span>
    <span style="color: #000000;">&#125;</span>
&nbsp;
    <span style="color: #0600FF;">public</span> <span style="color: #0600FF;">override</span> <span style="color: #0600FF;">void</span> Input0_ProcessInputRow<span style="color: #000000;">&#40;</span>Input0Buffer Row<span style="color: #000000;">&#41;</span>
    <span style="color: #000000;">&#123;</span>
        <span style="color: #008080; font-style: italic;">//Increment LineNum  to keep track of the number of rows</span>
        LineNum <span style="color: #008000;">+=</span> <span style="color: #FF0000;">1</span><span style="color: #008000;">;</span>
&nbsp;
        <span style="color: #008080; font-style: italic;">//split the input column. This will be highly dependent on the format and will probably need adjustment</span>
        <span style="color: #FF0000;">string</span><span style="color: #000000;">&#91;</span><span style="color: #000000;">&#93;</span> columns <span style="color: #008000;">=</span> <span style="color: #000000;">System.<span style="color: #0000FF;">Text</span>.<span style="color: #0000FF;">RegularExpressions</span></span>.<span style="color: #0000FF;">Regex</span>.<span style="color: #0000FF;">Split</span><span style="color: #000000;">&#40;</span>Row.<span style="color: #0000FF;">line</span>, <span style="color: #666666;">&quot;(?&lt;=\&quot;),(?=\&quot;)&quot;</span><span style="color: #000000;">&#41;</span><span style="color: #008000;">;</span>
        <span style="color: #008080; font-style: italic;">//If the number of elements is not expected, we assume there was a problem</span>
        <span style="color: #0600FF;">if</span> <span style="color: #000000;">&#40;</span>columns.<span style="color: #0000FF;">Length</span> <span style="color: #008000;">!=</span> <span style="color: #FF0000;">4</span><span style="color: #000000;">&#41;</span>
        <span style="color: #000000;">&#123;</span>
            Row.<span style="color: #0000FF;">ErrorLine</span> <span style="color: #008000;">=</span> Row.<span style="color: #0000FF;">line</span><span style="color: #008000;">;</span>
            Row.<span style="color: #0000FF;">ErrorLineNum</span> <span style="color: #008000;">=</span> LineNum<span style="color: #008000;">;</span>
            Row.<span style="color: #0000FF;">DirectRowToErrorRows</span><span style="color: #000000;">&#40;</span><span style="color: #000000;">&#41;</span><span style="color: #008000;">;</span>
        <span style="color: #000000;">&#125;</span>
        <span style="color: #008080; font-style: italic;">//Everything looks good, let's move forward</span>
        <span style="color: #0600FF;">else</span>
        <span style="color: #000000;">&#123;</span>
            Row.<span style="color: #0000FF;">MatchLineNum</span> <span style="color: #008000;">=</span> LineNum<span style="color: #008000;">;</span>
            Row.<span style="color: #0000FF;">ID</span> <span style="color: #008000;">=</span> StripQualifier<span style="color: #000000;">&#40;</span>columns<span style="color: #000000;">&#91;</span><span style="color: #FF0000;">0</span><span style="color: #000000;">&#93;</span>, <span style="color: #666666;">&quot;\&quot;&quot;</span><span style="color: #000000;">&#41;</span><span style="color: #008000;">;</span>
            Row.<span style="color: #0000FF;">Company</span> <span style="color: #008000;">=</span> StripQualifier<span style="color: #000000;">&#40;</span>columns<span style="color: #000000;">&#91;</span><span style="color: #FF0000;">1</span><span style="color: #000000;">&#93;</span>, <span style="color: #666666;">&quot;\&quot;&quot;</span><span style="color: #000000;">&#41;</span><span style="color: #008000;">;</span>
            Row.<span style="color: #0000FF;">PhoneNumber</span> <span style="color: #008000;">=</span> StripQualifier<span style="color: #000000;">&#40;</span>columns<span style="color: #000000;">&#91;</span><span style="color: #FF0000;">2</span><span style="color: #000000;">&#93;</span>, <span style="color: #666666;">&quot;\&quot;&quot;</span><span style="color: #000000;">&#41;</span><span style="color: #008000;">;</span>
            Row.<span style="color: #0000FF;">FavoriteIrrationalNumber</span> <span style="color: #008000;">=</span> StripQualifier<span style="color: #000000;">&#40;</span>columns<span style="color: #000000;">&#91;</span><span style="color: #FF0000;">3</span><span style="color: #000000;">&#93;</span>, <span style="color: #666666;">&quot;\&quot;&quot;</span><span style="color: #000000;">&#41;</span><span style="color: #008000;">;</span>
            Row.<span style="color: #0000FF;">DirectRowToMatchRows</span><span style="color: #000000;">&#40;</span><span style="color: #000000;">&#41;</span><span style="color: #008000;">;</span>
        <span style="color: #000000;">&#125;</span>
&nbsp;
    <span style="color: #000000;">&#125;</span>
&nbsp;
    <span style="color: #0600FF;">public</span> <span style="color: #0600FF;">static</span> <span style="color: #FF0000;">string</span> StripQualifier<span style="color: #000000;">&#40;</span><span style="color: #FF0000;">string</span> InputString, <span style="color: #FF0000;">string</span> Qualifier<span style="color: #000000;">&#41;</span>
    <span style="color: #000000;">&#123;</span>
        <span style="color: #008080; font-style: italic;">//This is a helper function only to remove surrounding text qualifiers</span>
&nbsp;
        <span style="color: #FF0000;">string</span> OutputString<span style="color: #008000;">;</span>
&nbsp;
        <span style="color: #0600FF;">if</span><span style="color: #000000;">&#40;</span>
            InputString.<span style="color: #0000FF;">Substring</span><span style="color: #000000;">&#40;</span><span style="color: #FF0000;">0</span>, Qualifier.<span style="color: #0000FF;">Length</span><span style="color: #000000;">&#41;</span> <span style="color: #008000;">==</span> Qualifier
            <span style="color: #008000;">&amp;&amp;</span> InputString.<span style="color: #0000FF;">Substring</span><span style="color: #000000;">&#40;</span>InputString.<span style="color: #0000FF;">Length</span> <span style="color: #008000;">-</span> Qualifier.<span style="color: #0000FF;">Length</span>, Qualifier.<span style="color: #0000FF;">Length</span><span style="color: #000000;">&#41;</span> <span style="color: #008000;">==</span> Qualifier
          <span style="color: #000000;">&#41;</span>
            OutputString <span style="color: #008000;">=</span> InputString.<span style="color: #0000FF;">Substring</span><span style="color: #000000;">&#40;</span>Qualifier.<span style="color: #0000FF;">Length</span>, InputString.<span style="color: #0000FF;">Length</span> <span style="color: #008000;">-</span> <span style="color: #000000;">&#40;</span><span style="color: #FF0000;">2</span> <span style="color: #008000;">*</span> Qualifier.<span style="color: #0000FF;">Length</span><span style="color: #000000;">&#41;</span><span style="color: #000000;">&#41;</span><span style="color: #008000;">;</span>
        <span style="color: #0600FF;">else</span>
            OutputString <span style="color: #008000;">=</span> InputString<span style="color: #008000;">;</span>
&nbsp;
        <span style="color: #0600FF;">return</span> OutputString<span style="color: #008000;">;</span>
    <span style="color: #000000;">&#125;</span>
&nbsp;
<span style="color: #000000;">&#125;</span></pre></td></tr></table></div>

<ul>
<li>Line 14: Declare LineNum for counting our position in the file.</li>
<li>Line 21: Initialize LineNum to 0. In the above script, we increment before parsing, so the first record will have a LineNum of 1. If you happen to have a header row that is skipped on import and would prefer the first record to be 2, to coincide with the actual lines in your flat file, feel free to change the initialization to 1.</li>
<li>Line 31: Increment LineNum for keeping track of our position in the file</li>
<li>Line 35: This is the real nugget of the whole post. It uses a regular expression to split the row only on commas that sit directly between two double quotes, so an embedded text qualifier no longer breaks the parse. This is a fairly simplistic implementation. <strong>If you need to tweak the parsing routine to your situation, do it here!</strong></li>
<li>Line 37: In my example I am expecting 4 fields, so I check that my string array contains four entries. If not, I assume it&#8217;s just a bad row and will inspect later.</li>
<li>Line 39-40: Set the appropriate output columns.</li>
<li>Line 41: Force the rows to the &#8220;Error Rows&#8221; output. Remember that extra output we created? It&#8217;s specifically for this situation, so we can redirect the &#8220;bad&#8221; rows to a different location.</li>
<li>Line 44: If we have 4 elements in our array, assume the best! You probably want to perform additional sanity checks here, including wrapping the whole thing in a try-catch block, but for our example, I&#8217;m keeping it simple.</li>
<li>Line 47-50: Set the appropriate output columns</li>
<li>Line 51: Redirect output to &#8220;Match Rows&#8221;</li>
<li>Line 56-71: This is simply a helper method to strip the text qualifiers off the text. Discussing it is outside the scope of this post, but I&#8217;ve tested it and it works fine.</li>
</ul>
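<p>To see what the split on line 35 does outside of <abbr title="SQL Server Integration Services">SSIS</abbr>, here is a small standalone C# sketch that runs the same lookaround pattern (with the quotes escaped for a regular C# string literal) against the two sample rows from earlier. The class name, the method name, and the third sample row are my own additions for illustration.</p>

<div class="wp_syntax"><div class="code"><pre class="csharp" style="font-family:monospace;">using System;
using System.Text.RegularExpressions;

public static class QualifierSplitDemo
{
    // Split on a comma only when it sits directly between a closing
    // double quote and an opening double quote.
    public static string[] SplitLine(string line)
    {
        return Regex.Split(line, &quot;(?&lt;=\&quot;),(?=\&quot;)&quot;);
    }

    public static void Main()
    {
        string[] samples =
        {
            &quot;\&quot;12036\&quot;,\&quot;Company Name, Inc.\&quot;,\&quot;555-555-1234\&quot;,\&quot;3.14159\&quot;&quot;,
            &quot;\&quot;52665\&quot;,\&quot;Best \&quot;Kept\&quot; Secret Storage Facility\&quot;,\&quot;555-555-9876\&quot;,\&quot;2.71828\&quot;&quot;,
            // Made-up row: a field that itself contains quote-comma-quote defeats the pattern.
            &quot;\&quot;1\&quot;,\&quot;He said \&quot;no\&quot;,\&quot;yes\&quot; twice\&quot;,\&quot;555\&quot;,\&quot;1.41421\&quot;&quot;
        };

        foreach (string line in samples)
        {
            string[] columns = SplitLine(line);
            Console.WriteLine(columns.Length + &quot; columns: &quot; + line);
        }
    }
}</pre></div></div>

<p>The first two rows split into four columns each, still wrapped in their qualifiers. The third row splits into five, which is exactly the kind of row the column-count check sends to the &#8220;Error Rows&#8221; output.</p>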
<p>Here is the equivalent Script Component code in Visual Basic .NET, which will work for SQL Server 2005:</p>

<div class="wp_syntax"><table><tr><td class="line_numbers"><pre>1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
</pre></td><td class="code"><pre class="vbnet" style="font-family:monospace;"><span style="color: #008080; font-style: italic;">' Microsoft SQL Server Integration Services user script component</span>
<span style="color: #008080; font-style: italic;">' This is your new script component in Microsoft Visual Basic .NET</span>
<span style="color: #008080; font-style: italic;">' ScriptMain is the entrypoint class for script components</span>
&nbsp;
<span style="color: #0600FF;">Imports</span> System
<span style="color: #0600FF;">Imports</span> System.<span style="color: #0000FF;">Data</span>
<span style="color: #0600FF;">Imports</span> System.<span style="color: #0000FF;">Math</span>
<span style="color: #0600FF;">Imports</span> Microsoft.<span style="color: #0000FF;">SqlServer</span>.<span style="color: #0000FF;">Dts</span>.<span style="color: #0000FF;">Pipeline</span>.<span style="color: #0000FF;">Wrapper</span>
<span style="color: #0600FF;">Imports</span> Microsoft.<span style="color: #0000FF;">SqlServer</span>.<span style="color: #0000FF;">Dts</span>.<span style="color: #0000FF;">Runtime</span>.<span style="color: #0000FF;">Wrapper</span>
&nbsp;
<span style="color: #FF8000;">Public</span> <span style="color: #0600FF;">Class</span> ScriptMain
    <span style="color: #0600FF;">Inherits</span> UserComponent
&nbsp;
    <span style="color: #008080; font-style: italic;">'Declare and initialize LineNum to keep track of the number of rows</span>
    <span style="color: #0600FF;">Dim</span> LineNum <span style="color: #FF8000;">As</span> Int32 <span style="color: #008000;">=</span> <span style="color: #FF0000;">0</span>
&nbsp;
&nbsp;
    <span style="color: #FF8000;">Public</span> <span style="color: #FF8000;">Overrides</span> <span style="color: #0600FF;">Sub</span> Input0_ProcessInputRow<span style="color: #000000;">&#40;</span><span style="color: #FF8000;">ByVal</span> Row <span style="color: #FF8000;">As</span> Input0Buffer<span style="color: #000000;">&#41;</span>
        <span style="color: #008080; font-style: italic;">'Increment LineNum  to keep track of the number of rows</span>
        LineNum <span style="color: #008000;">+=</span> <span style="color: #FF0000;">1</span>
&nbsp;
        <span style="color: #008080; font-style: italic;">'split the input column. This will be highly dependent on the format and will probably need adjustment</span>
        <span style="color: #0600FF;">Dim</span> columns <span style="color: #FF8000;">As</span> <span style="color: #FF8000;">String</span><span style="color: #000000;">&#40;</span><span style="color: #000000;">&#41;</span> <span style="color: #008000;">=</span> System.<span style="color: #0000FF;">Text</span>.<span style="color: #0000FF;">RegularExpressions</span>.<span style="color: #0000FF;">Regex</span>.<span style="color: #0600FF;">Split</span><span style="color: #000000;">&#40;</span>Row.<span style="color: #0600FF;">line</span>, <span style="color: #808080;">&quot;(?&lt;=&quot;</span><span style="color: #808080;">&quot;),(?=&quot;</span><span style="color: #808080;">&quot;)&quot;</span><span style="color: #000000;">&#41;</span>
&nbsp;
        <span style="color: #008080; font-style: italic;">'If the number of elements is not expected, we assume there was a problem</span>
        <span style="color: #0600FF;">If</span> columns.<span style="color: #0000FF;">Length</span> &lt;&gt; <span style="color: #FF0000;">4</span> <span style="color: #FF8000;">Then</span>
            Row.<span style="color: #0000FF;">ErrorLine</span> <span style="color: #008000;">=</span> Row.<span style="color: #0600FF;">line</span>
            Row.<span style="color: #0000FF;">ErrorLineNum</span> <span style="color: #008000;">=</span> LineNum
            Row.<span style="color: #0000FF;">DirectRowToErrorRows</span><span style="color: #000000;">&#40;</span><span style="color: #000000;">&#41;</span>
            <span style="color: #008080; font-style: italic;">'Everything looks good, let's move forward</span>
        <span style="color: #FF8000;">Else</span>
            Row.<span style="color: #0000FF;">MatchLineNum</span> <span style="color: #008000;">=</span> LineNum
            Row.<span style="color: #0000FF;">ID</span> <span style="color: #008000;">=</span> StripQualifier<span style="color: #000000;">&#40;</span>columns<span style="color: #000000;">&#40;</span><span style="color: #FF0000;">0</span><span style="color: #000000;">&#41;</span>, <span style="color: #808080;">&quot;&quot;</span><span style="color: #808080;">&quot;&quot;</span><span style="color: #000000;">&#41;</span>
            Row.<span style="color: #0000FF;">Company</span> <span style="color: #008000;">=</span> StripQualifier<span style="color: #000000;">&#40;</span>columns<span style="color: #000000;">&#40;</span><span style="color: #FF0000;">1</span><span style="color: #000000;">&#41;</span>, <span style="color: #808080;">&quot;&quot;</span><span style="color: #808080;">&quot;&quot;</span><span style="color: #000000;">&#41;</span>
            Row.<span style="color: #0000FF;">PhoneNumber</span> <span style="color: #008000;">=</span> StripQualifier<span style="color: #000000;">&#40;</span>columns<span style="color: #000000;">&#40;</span><span style="color: #FF0000;">2</span><span style="color: #000000;">&#41;</span>, <span style="color: #808080;">&quot;&quot;</span><span style="color: #808080;">&quot;&quot;</span><span style="color: #000000;">&#41;</span>
            Row.<span style="color: #0000FF;">FavoriteIrrationalNumber</span> <span style="color: #008000;">=</span> StripQualifier<span style="color: #000000;">&#40;</span>columns<span style="color: #000000;">&#40;</span><span style="color: #FF0000;">3</span><span style="color: #000000;">&#41;</span>, <span style="color: #808080;">&quot;&quot;</span><span style="color: #808080;">&quot;&quot;</span><span style="color: #000000;">&#41;</span>
            Row.<span style="color: #0000FF;">DirectRowToMatchRows</span><span style="color: #000000;">&#40;</span><span style="color: #000000;">&#41;</span>
        <span style="color: #0600FF;">End</span> <span style="color: #0600FF;">If</span>
&nbsp;
    <span style="color: #0600FF;">End</span> <span style="color: #0600FF;">Sub</span>
&nbsp;
    <span style="color: #FF8000;">Public</span> <span style="color: #0600FF;">Function</span> StripQualifier<span style="color: #000000;">&#40;</span><span style="color: #FF8000;">ByRef</span> InputString <span style="color: #FF8000;">As</span> <span style="color: #FF8000;">String</span>, <span style="color: #FF8000;">ByRef</span> Qualifier <span style="color: #FF8000;">As</span> <span style="color: #FF8000;">String</span><span style="color: #000000;">&#41;</span> <span style="color: #FF8000;">As</span> <span style="color: #FF8000;">String</span>
        <span style="color: #008080; font-style: italic;">'This is a helper function only to remove surrounding text qualifiers</span>
&nbsp;
        <span style="color: #0600FF;">Dim</span> OutputString <span style="color: #FF8000;">As</span> <span style="color: #FF8000;">String</span>
&nbsp;
        <span style="color: #0600FF;">If</span> InputString.<span style="color: #0000FF;">Substring</span><span style="color: #000000;">&#40;</span><span style="color: #FF0000;">0</span>, Qualifier.<span style="color: #0000FF;">Length</span><span style="color: #000000;">&#41;</span> <span style="color: #008000;">=</span> Qualifier <span style="color: #804040;">And</span> InputString.<span style="color: #0000FF;">Substring</span><span style="color: #000000;">&#40;</span>InputString.<span style="color: #0000FF;">Length</span> <span style="color: #008000;">-</span> Qualifier.<span style="color: #0000FF;">Length</span>, Qualifier.<span style="color: #0000FF;">Length</span><span style="color: #000000;">&#41;</span> <span style="color: #008000;">=</span> Qualifier <span style="color: #FF8000;">Then</span>
            OutputString <span style="color: #008000;">=</span> InputString.<span style="color: #0000FF;">Substring</span><span style="color: #000000;">&#40;</span>Qualifier.<span style="color: #0000FF;">Length</span>, InputString.<span style="color: #0000FF;">Length</span> <span style="color: #008000;">-</span> <span style="color: #000000;">&#40;</span><span style="color: #FF0000;">2</span> <span style="color: #008000;">*</span> Qualifier.<span style="color: #0000FF;">Length</span><span style="color: #000000;">&#41;</span><span style="color: #000000;">&#41;</span>
        <span style="color: #FF8000;">Else</span>
            OutputString <span style="color: #008000;">=</span> InputString
        <span style="color: #0600FF;">End</span> <span style="color: #0600FF;">If</span>
&nbsp;
        <span style="color: #FF8000;">Return</span> OutputString
    <span style="color: #0600FF;">End</span> <span style="color: #0600FF;">Function</span>
&nbsp;
&nbsp;
<span style="color: #0600FF;">End</span> <span style="color: #0600FF;">Class</span></pre></td></tr></table></div>

<p>Be sure to build the script (Ctrl+Shift+B or use the Build menu) before closing the editor; otherwise, you may receive a validation error. After a successful build, close the editor and click &#8220;OK&#8221; on the Script Transformation Editor dialog. You can now connect the outputs from your Script Component to any destination, be it an <acronym title="Object Linking and Embedding">OLE</acronym> <abbr title="Database">DB</abbr> Destination or another flat file. Make sure you also connect the Error Rows output so that you can analyze any rows that fail to parse. That output gives you both the record number and the original line as it was read in, which should be enough information to diagnose the problem and, hopefully, fix the parsing routine in the Script Component to deal with it.</p>
<p>Note that adding this extra Script Component severely reduced the processing rate compared with letting <abbr title="SQL Server Integration Services">SSIS</abbr> parse the import natively. Again, the best solution is to correct the source, if possible.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.ideaexcursion.com/2008/11/12/handling-embedded-text-qualifiers/feed/</wfw:commentRss>
		<slash:comments>12</slash:comments>
		</item>
	</channel>
</rss>
