<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:creativeCommons="http://backend.userland.com/creativeCommonsRssModule"	>
<channel>
	<title>Comments on: Handling Embedded Text Qualifiers</title>
	<atom:link href="http://www.ideaexcursion.com/2008/11/12/handling-embedded-text-qualifiers/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.ideaexcursion.com/2008/11/12/handling-embedded-text-qualifiers/</link>
	<description>Technology Musings</description>
	<lastBuildDate>Sat, 04 Sep 2010 18:38:01 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
	<item>
		<title>By: SQL Lion</title>
		<link>http://www.ideaexcursion.com/2008/11/12/handling-embedded-text-qualifiers/comment-page-1/#comment-399</link>
		<dc:creator>SQL Lion</dc:creator>
		<pubDate>Sun, 04 Apr 2010 19:13:59 +0000</pubDate>
		<guid isPermaLink="false">http://www.ideaexcursion.com/?p=88#comment-399</guid>
		<description>For a workaround and step-by-step instructions for building an SSIS package that overcomes the problem of importing text files with the Flat File Connection Manager and Flat File Source, where the &quot;Row Delimiter&quot; property does not work properly for rows with NULL or empty values, follow the link below:
&lt;a href=&quot;http://www.sqllion.com/2010/04/ssis-vs-text-file-importing-1/&quot; rel=&quot;nofollow&quot;&gt; 
http://www.sqllion.com/2010/04/ssis-vs-text-file-importing-1/ &lt;/a&gt;
Thanks,
SQL Lion</description>
		<content:encoded><![CDATA[<p>For a workaround and step-by-step instructions for building an SSIS package that overcomes the problem of importing text files with the Flat File Connection Manager and Flat File Source, where the &#8220;Row Delimiter&#8221; property does not work properly for rows with NULL or empty values, follow the link below:<br />
<a href="http://www.sqllion.com/2010/04/ssis-vs-text-file-importing-1/" rel="nofollow">http://www.sqllion.com/2010/04/ssis-vs-text-file-importing-1/</a><br />
Thanks,<br />
SQL Lion</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Marco</title>
		<link>http://www.ideaexcursion.com/2008/11/12/handling-embedded-text-qualifiers/comment-page-1/#comment-199</link>
		<dc:creator>Marco</dc:creator>
		<pubDate>Tue, 02 Jun 2009 17:08:34 +0000</pubDate>
		<guid isPermaLink="false">http://www.ideaexcursion.com/?p=88#comment-199</guid>
		<description>Andy, I found a way to solve your problem:

&lt;pre lang=&quot;csharp&quot;&gt;public override void Input0_ProcessInputRow(Input0Buffer Row)
{
        Regex rCSV = new Regex(&quot;,(?=(?:[^\&quot;]*\&quot;[^\&quot;]*\&quot;)*(?![^\&quot;]*\&quot;))&quot;);
        Regex rQout = new Regex(&quot;\&quot;\&quot;&quot;);

        string[] fields = rCSV.Split(Row.line);


        if (fields.Length == EXPECTED_FIELDS)
        {
                // Case 1: Quoted field with &quot;&quot; embedded text qualifiers
                Row.FIELD1 = Format(fields[0], rQout);
                
                // Case 2: Non-quoted field
                Row.FIELD2 = Convert.ToDESIRED_TYPE(fields[1]);
        }
}

public static string Format(string input, Regex rQout)
{
        return (!String.IsNullOrEmpty(input)) ? rQout.Replace(input.Substring(1, input.Length - 2), &quot;\&quot;&quot;) : &quot;&quot;;
}&lt;/pre&gt;

-------------------------------------------------------

This also handles the case where there are empty fields:

For example

&quot;Hi&quot;,123,,&quot;my name is Marco the &quot;&quot;programmer&quot;&quot;&quot;,&quot;other text&quot;

Will produce the output:

Hi
123

my name is Marco the &quot;programmer&quot;
other text</description>
		<content:encoded><![CDATA[<p>Andy, I found a way to solve your problem:</p>

<div class="wp_syntax"><div class="code"><pre class="csharp" style="font-family:monospace;"><span style="color: #0600FF;">public</span> <span style="color: #0600FF;">override</span> <span style="color: #0600FF;">void</span> Input0_ProcessInputRow<span style="color: #000000;">&#40;</span>Input0Buffer Row<span style="color: #000000;">&#41;</span>
<span style="color: #000000;">&#123;</span>
        Regex rCSV <span style="color: #008000;">=</span> <span style="color: #008000;">new</span> Regex<span style="color: #000000;">&#40;</span><span style="color: #666666;">&quot;,(?=(?:[^<span style="color: #008080; font-weight: bold;">\&quot;</span>]*<span style="color: #008080; font-weight: bold;">\&quot;</span>[^<span style="color: #008080; font-weight: bold;">\&quot;</span>]*<span style="color: #008080; font-weight: bold;">\&quot;</span>)*(?![^<span style="color: #008080; font-weight: bold;">\&quot;</span>]*<span style="color: #008080; font-weight: bold;">\&quot;</span>))&quot;</span><span style="color: #000000;">&#41;</span><span style="color: #008000;">;</span>
        Regex rQout <span style="color: #008000;">=</span> <span style="color: #008000;">new</span> Regex<span style="color: #000000;">&#40;</span><span style="color: #666666;">&quot;<span style="color: #008080; font-weight: bold;">\&quot;</span><span style="color: #008080; font-weight: bold;">\&quot;</span>&quot;</span><span style="color: #000000;">&#41;</span><span style="color: #008000;">;</span>
&nbsp;
        <span style="color: #FF0000;">string</span><span style="color: #000000;">&#91;</span><span style="color: #000000;">&#93;</span> fields <span style="color: #008000;">=</span> rCSV.<span style="color: #0000FF;">Split</span><span style="color: #000000;">&#40;</span>Row.<span style="color: #0000FF;">line</span><span style="color: #000000;">&#41;</span><span style="color: #008000;">;</span>
&nbsp;
&nbsp;
        <span style="color: #0600FF;">if</span> <span style="color: #000000;">&#40;</span>fields.<span style="color: #0000FF;">Length</span> <span style="color: #008000;">==</span> EXPECTED_FIELDS<span style="color: #000000;">&#41;</span>
        <span style="color: #000000;">&#123;</span>
                <span style="color: #008080; font-style: italic;">// Case 1: Quoted field with &quot;&quot; embedded text qualifiers</span>
                Row.<span style="color: #0000FF;">FIELD1</span> <span style="color: #008000;">=</span> Format<span style="color: #000000;">&#40;</span>fields<span style="color: #000000;">&#91;</span><span style="color: #FF0000;">0</span><span style="color: #000000;">&#93;</span>, rQout<span style="color: #000000;">&#41;</span><span style="color: #008000;">;</span>
&nbsp;
                <span style="color: #008080; font-style: italic;">// Case 2: Non-quoted field</span>
                Row.<span style="color: #0000FF;">FIELD2</span> <span style="color: #008000;">=</span> Convert.<span style="color: #0000FF;">ToDESIRED_TYPE</span><span style="color: #000000;">&#40;</span>fields<span style="color: #000000;">&#91;</span><span style="color: #FF0000;">1</span><span style="color: #000000;">&#93;</span><span style="color: #000000;">&#41;</span><span style="color: #008000;">;</span>
        <span style="color: #000000;">&#125;</span>
<span style="color: #000000;">&#125;</span>
&nbsp;
<span style="color: #0600FF;">public</span> <span style="color: #0600FF;">static</span> <span style="color: #FF0000;">string</span> Format<span style="color: #000000;">&#40;</span><span style="color: #FF0000;">string</span> input, Regex rQout<span style="color: #000000;">&#41;</span>
<span style="color: #000000;">&#123;</span>
        <span style="color: #0600FF;">return</span> <span style="color: #000000;">&#40;</span><span style="color: #008000;">!</span><span style="color: #FF0000;">String</span>.<span style="color: #0000FF;">IsNullOrEmpty</span><span style="color: #000000;">&#40;</span>input<span style="color: #000000;">&#41;</span><span style="color: #000000;">&#41;</span> <span style="color: #008000;">?</span> rQout.<span style="color: #0000FF;">Replace</span><span style="color: #000000;">&#40;</span>input.<span style="color: #0000FF;">Substring</span><span style="color: #000000;">&#40;</span><span style="color: #FF0000;">1</span>, input.<span style="color: #0000FF;">Length</span> <span style="color: #008000;">-</span> <span style="color: #FF0000;">2</span><span style="color: #000000;">&#41;</span>, <span style="color: #666666;">&quot;<span style="color: #008080; font-weight: bold;">\&quot;</span>&quot;</span><span style="color: #000000;">&#41;</span> <span style="color: #008000;">:</span> <span style="color: #666666;">&quot;&quot;</span><span style="color: #008000;">;</span>
<span style="color: #000000;">&#125;</span></pre></div></div>

<hr />
<p>This also handles the case where there are empty fields:</p>
<p>For example</p>
<p>&quot;Hi&quot;,123,,&quot;my name is Marco the &quot;&quot;programmer&quot;&quot;&quot;,&quot;other text&quot;</p>
<p>Will produce the output:</p>
<p>Hi<br />
123</p>
<p>my name is Marco the &#8220;programmer&#8221;<br />
other text</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Andy Galbraith</title>
		<link>http://www.ideaexcursion.com/2008/11/12/handling-embedded-text-qualifiers/comment-page-1/#comment-180</link>
		<dc:creator>Andy Galbraith</dc:creator>
		<pubDate>Mon, 04 May 2009 19:57:05 +0000</pubDate>
		<guid isPermaLink="false">http://www.ideaexcursion.com/?p=88#comment-180</guid>
		<description>The biggest problem in the file I am fighting is that not every field has a qualifier around it:

&quot;Clark, Bob&quot;,&quot;123 Main St&quot;,1234.00,45,&quot;Something Else&quot;,....

...so I am having trouble writing the appropriate parsing function because I cannot parse on just comma or on quote-comma-quote!

Thanks for listening to my frustration...{-:</description>
		<content:encoded><![CDATA[<p>The biggest problem in the file I am fighting is that not every field has a qualifier around it:</p>
<p>&quot;Clark, Bob&quot;,&quot;123 Main St&quot;,1234.00,45,&quot;Something Else&quot;,&#8230;.</p>
<p>&#8230;so I am having trouble writing the appropriate parsing function because I cannot parse on just comma or on quote-comma-quote!</p>
<p>Thanks for listening to my frustration&#8230;{-:</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Taylor Gerring</title>
		<link>http://www.ideaexcursion.com/2008/11/12/handling-embedded-text-qualifiers/comment-page-1/#comment-175</link>
		<dc:creator>Taylor Gerring</dc:creator>
		<pubDate>Tue, 28 Apr 2009 17:21:31 +0000</pubDate>
		<guid isPermaLink="false">http://www.ideaexcursion.com/?p=88#comment-175</guid>
		<description>@Orlando

Yes, thanks for pointing out that a line break as data would not work for this solution due to parsing each line as a single record.

And you&#039;re right about DTS. What infuriates many people is that this very old tool parses CSV more correctly than SSIS, and the issue is longstanding. At this point, it&#039;s clear this is not a priority for Microsoft, so unless someone develops a custom Data Flow Source, all we can do is wait and hope for a patch or fix in the next version of SQL Server.</description>
		<content:encoded><![CDATA[<p>@Orlando</p>
<p>Yes, thanks for pointing out that a line break as data would not work for this solution due to parsing each line as a single record.</p>
<p>And you&#8217;re right about DTS. What infuriates many people is that this very old tool parses CSV more correctly than SSIS, and the issue is longstanding. At this point, it&#8217;s clear this is not a priority for Microsoft, so unless someone develops a custom Data Flow Source, all we can do is wait and hope for a patch or fix in the next version of SQL Server.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Orlando</title>
		<link>http://www.ideaexcursion.com/2008/11/12/handling-embedded-text-qualifiers/comment-page-1/#comment-174</link>
		<dc:creator>Orlando</dc:creator>
		<pubDate>Tue, 28 Apr 2009 02:42:42 +0000</pubDate>
		<guid isPermaLink="false">http://www.ideaexcursion.com/?p=88#comment-174</guid>
		<description>As described in RFC 4180 section 2, items 6 and 7 (http://tools.ietf.org/html/rfc4180#section-2), any characters may appear between text qualifiers, including line breaks. Your solution will suffice for rows that exist on one line; however, CSV files can have rows such as this:

1,&quot;Hello, this field 
is a &quot;&quot;real&quot;&quot; pain!&quot;,&quot;4/27/2009&quot;

Yes, that&#039;s one row where:

Field 1 = 1
Field 2 (represented on one line with line break escaped for readability)  = Hello, this field \r\nis a &quot;real&quot; pain!
Field 3 = 4/27/2009

The destination table is:

create table dbo.LogInfo
(
    RecordID int,
    LogInfo varchar(500),
    LogDateTime datetime
)


I have looked into reading the file so that each line equates to a single column and parsing from there, as you suggested; however, the embedded line break prevents me from using that method.

Any further pointers on how to import CSV files using SSIS would be much appreciated. Against some long-standing personal bias, I am actually considering recommending DTS, a 10+ year old technology, to solve the issue, since it does a capable job of parsing and importing CSV files and SSIS does not provide an easy path to process what many would consider a most common file format.

Thanks for reading.</description>
		<content:encoded><![CDATA[<p>As described in RFC 4180 section 2, items 6 and 7 (<a href="http://tools.ietf.org/html/rfc4180#section-2" rel="nofollow">http://tools.ietf.org/html/rfc4180#section-2</a>), any characters may appear between text qualifiers, including line breaks. Your solution will suffice for rows that exist on one line; however, CSV files can have rows such as this:</p>
<p>1,&quot;Hello, this field<br />
is a &quot;&quot;real&quot;&quot; pain!&quot;,&quot;4/27/2009&quot;</p>
<p>Yes, that&#8217;s one row where:</p>
<p>Field 1 = 1<br />
Field 2 (represented on one line with line break escaped for readability)  = Hello, this field \r\nis a &#8220;real&#8221; pain!<br />
Field 3 = 4/27/2009</p>
<p>The destination table is:</p>
<p>create table dbo.LogInfo<br />
(<br />
    RecordID int,<br />
    LogInfo varchar(500),<br />
    LogDateTime datetime<br />
)</p>
<p>I have looked into reading the file so that each line equates to a single column and parsing from there, as you suggested; however, the embedded line break prevents me from using that method.</p>
<p>Any further pointers on how to import CSV files using SSIS would be much appreciated. Against some long-standing personal bias, I am actually considering recommending DTS, a 10+ year old technology, to solve the issue, since it does a capable job of parsing and importing CSV files and SSIS does not provide an easy path to process what many would consider a most common file format.</p>
<p>Thanks for reading.</p>
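<p>For illustration only, here is a minimal sketch of how a row like the one above could be read character by character instead of line by line, so that a line break inside text qualifiers stays part of the field. The class name CsvSketch and the method ReadRecord are hypothetical, not from the original post or from any SSIS API, and the sketch assumes a comma delimiter and a double-quote qualifier as in RFC 4180.</p>

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical helper (not part of SSIS): reads one logical CSV record at a time.
public static class CsvSketch
{
    // Returns the fields of the next record, or null at end of input.
    // Inside text qualifiers, commas and line breaks are treated as data
    // and a doubled qualifier ("") is unescaped to a single one.
    public static List<string> ReadRecord(TextReader reader)
    {
        if (reader.Peek() < 0) return null;

        var fields = new List<string>();
        var field = new StringBuilder();
        bool inQuotes = false;

        int c;
        while ((c = reader.Read()) >= 0)
        {
            char ch = (char)c;
            if (inQuotes)
            {
                if (ch == '"')
                {
                    if (reader.Peek() == '"')
                    {
                        reader.Read();        // consume the second quote
                        field.Append('"');    // "" becomes "
                    }
                    else
                    {
                        inQuotes = false;     // closing qualifier
                    }
                }
                else
                {
                    field.Append(ch);         // includes embedded \r and \n
                }
            }
            else if (ch == '"')
            {
                inQuotes = true;              // opening qualifier
            }
            else if (ch == ',')
            {
                fields.Add(field.ToString()); // end of field
                field.Length = 0;
            }
            else if (ch == '\r' || ch == '\n')
            {
                if (ch == '\r' && reader.Peek() == '\n') reader.Read();
                break;                        // end of record
            }
            else
            {
                field.Append(ch);
            }
        }

        fields.Add(field.ToString());         // last field of the record
        return fields;
    }
}
```

<p>Calling CsvSketch.ReadRecord in a loop over a StreamReader until it returns null would yield one list of fields per record. In a Script Component this would presumably have to run as a source rather than as a per-line transformation, since the Flat File Source has already split the input on line breaks by the time a transformation sees it.</p>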
]]></content:encoded>
	</item>
	<item>
		<title>By: Dennis</title>
		<link>http://www.ideaexcursion.com/2008/11/12/handling-embedded-text-qualifiers/comment-page-1/#comment-88</link>
		<dc:creator>Dennis</dc:creator>
		<pubDate>Tue, 03 Feb 2009 22:24:49 +0000</pubDate>
		<guid isPermaLink="false">http://www.ideaexcursion.com/?p=88#comment-88</guid>
		<description>Oh, yes. It works now. Thanks.</description>
		<content:encoded><![CDATA[<p>Oh, yes. It works now. Thanks.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Taylor Gerring</title>
		<link>http://www.ideaexcursion.com/2008/11/12/handling-embedded-text-qualifiers/comment-page-1/#comment-87</link>
		<dc:creator>Taylor Gerring</dc:creator>
		<pubDate>Tue, 03 Feb 2009 21:22:21 +0000</pubDate>
		<guid isPermaLink="false">http://www.ideaexcursion.com/?p=88#comment-87</guid>
		<description>Did you follow the steps outlined above? It sounds like it can&#039;t find the column for some reason.

&lt;blockquote&gt;8. Add another output and call it “Error Rows” or anything else of your choosing.
9. Set ExclusionGroup = 1
10. Set “Synchronous InputID” to your input. In my case, it is named “Input 0”
11. Add only two columns, ErrorLine (string [DT_STR] 8000) and ErrorLineNum (four-byte signed integer [DT_I4]).&lt;/blockquote&gt;</description>
		<content:encoded><![CDATA[<p>Did you follow the steps outlined above? It sounds like it can&#8217;t find the column for some reason.</p>
<blockquote><p>8. Add another output and call it “Error Rows” or anything else of your choosing.<br />
9. Set ExclusionGroup = 1<br />
10. Set “Synchronous InputID” to your input. In my case, it is named “Input 0”<br />
11. Add only two columns, ErrorLine (string [DT_STR] 8000) and ErrorLineNum (four-byte signed integer [DT_I4]).</p></blockquote>
]]></content:encoded>
	</item>
	<item>
		<title>By: Dennis</title>
		<link>http://www.ideaexcursion.com/2008/11/12/handling-embedded-text-qualifiers/comment-page-1/#comment-85</link>
		<dc:creator>Dennis</dc:creator>
		<pubDate>Tue, 03 Feb 2009 21:16:28 +0000</pubDate>
		<guid isPermaLink="false">http://www.ideaexcursion.com/?p=88#comment-85</guid>
		<description>I&#039;m getting &quot;Error 30456: ErrorLine is not a member of Script Component&quot;. Do you know why?</description>
		<content:encoded><![CDATA[<p>I&#8217;m getting &#8220;Error 30456: ErrorLine is not a member of Script Component&#8221;. Do you know why?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Dennis</title>
		<link>http://www.ideaexcursion.com/2008/11/12/handling-embedded-text-qualifiers/comment-page-1/#comment-84</link>
		<dc:creator>Dennis</dc:creator>
		<pubDate>Tue, 03 Feb 2009 19:31:29 +0000</pubDate>
		<guid isPermaLink="false">http://www.ideaexcursion.com/?p=88#comment-84</guid>
		<description>Oh, great. I will give the VB.NET a try.</description>
		<content:encoded><![CDATA[<p>Oh, great. I will give the VB.NET a try.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Handling Embedded Text Qualifiers in SSIS 2005 &#124; Idea Excursion</title>
		<link>http://www.ideaexcursion.com/2008/11/12/handling-embedded-text-qualifiers/comment-page-1/#comment-83</link>
		<dc:creator>Handling Embedded Text Qualifiers in SSIS 2005 &#124; Idea Excursion</dc:creator>
		<pubDate>Tue, 03 Feb 2009 16:32:09 +0000</pubDate>
		<guid isPermaLink="false">http://www.ideaexcursion.com/?p=88#comment-83</guid>
		<description>[...] a quick note advising that I&#8217;ve updated my Handling Embedded Text Qualifiers post to also include a Visual Basic example, making the information also relevant to SQL Server [...]</description>
		<content:encoded><![CDATA[<p>[...] a quick note advising that I&#8217;ve updated my Handling Embedded Text Qualifiers post to also include a Visual Basic example, making the information also relevant to SQL Server [...]</p>
]]></content:encoded>
	</item>
</channel>
</rss>
