I am struggling to script a BULK INSERT for my .dat files. I get many errors depending on what I try, but I'm pretty sure they're related to the row delimiter. I can successfully use the SQL Server Import Wizard for my .dat files. Under "Specify the characters that delimit the source file", I see the following defaults, which I accept:
Row delimiter: {CR}{LF}
Column delimiter: Comma {,}
The 500k rows and 158 columns come in just beautifully thanks to the wizard.
When scripting, I have tried all manner of \r\n variations, 0x0a, etc., searched for help, and reviewed similar posts. I can successfully script in a single line of real data, so I don't think the column parsing is an issue. It fails when I try 2 lines in a dummy file. The fewest errors occur with the following:
```sql
bulk insert abc
from 'C:\TestDat\try2.dat'
with (FIRSTROW = 1, fieldterminator = ',', rowterminator = '\r\n')
go
```
Msg 4863, Level 16, State 1, Line 3
Bulk load data conversion error (truncation) for row 1, column 158 (Column 157).
(Using `0x0a` creates more errors on the 2-line file.)
.dat contents:
xxx,20080501,xxx,xxx:175000,55008654,0178636,202307011001,001859915,OK,W,xx,1.0,00000100,48,202306290100,R0,2,202306290200,R0,0,202306290300,R0,0,202306290400,R0,0,202306290500,R0,1,202306290600,R0,1,202306290700,R0,0,202306290800,R0,0,202306290900,R0,0,202306291000,R0,0,202306291100,R0,0,202306291200,R0,1,202306291300,R0,0,202306291400,R0,0,202306291500,R0,2,202306291600,R0,1,202306291700,R0,0,202306291800,R0,0,202306291900,R0,0,202306292000,R0,0,202306292100,R0,0,202306292200,R0,0,202306292300,R0,2,202306300000,R0,3,202306300100,R0,0,202306300200,R0,0,202306300300,R0,2,202306300400,R0,0,202306300500,R0,1,202306300600,R0,1,202306300700,R0,3,202306300800,R0,0,202306300900,R0,0,202306301000,R0,0,202306301100,R0,0,202306301200,R0,3,202306301300,R0,1,202306301400,R0,0,202306301500,R0,1,202306301600,R0,1,202306301700,R0,0,202306301800,R0,0,202306301900,R0,1,202306302000,R0,0,202306302100,R0,1,202306302200,R0,0,202306302300,R0,1,202307010000,R0,0
xxx,20080501,xxx,xxx:175000,55008654,0178636,202307011001,001859915,OK,W,xx,1.0,00000100,4,202307010100,R0,1,202307010200,R0,0,202307010300,R0,1,202307010400,R0,0
OK, I created the `try2.dat` file, and I created a table `abc` (I left out Column3..Column157, but they have the same definition); then I started SQLCMD.
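A sketch of that definition (the varchar width is an assumption):

```sql
CREATE TABLE abc (
    Column1   varchar(100),
    Column2   varchar(100),
    -- Column3 .. Column157: same definition, left out here
    Column158 varchar(100)
);
```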
When you get an error like this, I see two possibilities:
When selecting the table in SSMS, I do see:

(screenshot: start of the row)

(screenshot: end of the row)

The last column does not seem to be OK... More investigation needed....
A workaround could be:
Create a table like this:
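For example (a single `varchar(max)` column is an assumption, but it is wide enough to hold any complete line):

```sql
CREATE TABLE def (
    Column1 varchar(max)
);
```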
and import using a field terminator of `#` (since you do not have `#` characters in your file, each complete line ends up in `Column1`). After that, you can split the lines with `STRING_SPLIT`:
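A sketch of both steps (depending on how the file was produced, `ROWTERMINATOR` may need to be `'0x0a'` instead of `'\n'`):

```sql
-- '#' never occurs in the data, so the field terminator never matches
-- and each complete line lands in Column1.
BULK INSERT def
FROM 'C:\TestDat\try2.dat'
WITH (FIELDTERMINATOR = '#', ROWTERMINATOR = '\n');

-- Split every line into its individual values.
SELECT value
FROM def
CROSS APPLY STRING_SPLIT(Column1, ',');
```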
Resulting in one row per comma-separated value.
NOTE: When your version of SQL Server is new enough, you can add the `enable_ordinal` parameter to `STRING_SPLIT`.

NOTE2: Why was `enable_ordinal` added to `STRING_SPLIT`? Because without it, STRING_SPLIT does not "care" about ordinal position (https://stackoverflow.com/a/64092709/724039).
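For example (SQL Server 2022+; the third argument is `enable_ordinal`):

```sql
SELECT ordinal, value
FROM STRING_SPLIT('A,B,C', ',', 1);
-- ordinal | value
--       1 | A
--       2 | B
--       3 | C
```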
I created a DBFIDDLE with your data; this has a line number `R` added to `def`. Part of the output of

```sql
SELECT R, value
FROM def d
CROSS APPLY STRING_SPLIT((SELECT Column1), ',');
```

(near the line change) shows the problem: unfortunately it is not possible (AFAIK) to add a row number per value within a line in a reliable way...
ANOTHER ATTEMPT:

Seeing that "the column count per row will equal 14+n*3, where n = the value in the 14th column":
Define your import table (here named with the clumsy name `def2`) and import your data:
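A sketch of both steps (the column names and widths are assumptions; instead of bulk-inserting straight into `def2`, I reuse the single-column `def` from above and peel off the first 14 fields, which needs the `enable_ordinal` variant of `STRING_SPLIT`):

```sql
CREATE TABLE def2 (
    R int IDENTITY(1,1),               -- row number per imported line
    Column1  varchar(50), Column2  varchar(50), Column3  varchar(50),
    Column4  varchar(50), Column5  varchar(50), Column6  varchar(50),
    Column7  varchar(50), Column8  varchar(50), Column9  varchar(50),
    Column10 varchar(50), Column11 varchar(50), Column12 varchar(50),
    Column13 varchar(50), Column14 varchar(50),
    ColumnRest varchar(max)            -- everything after the 14th comma
);

-- Peel off the first 14 fields of every line; STRING_AGG glues the
-- remainder back together into ColumnRest. Grouping by the raw line
-- assumes the lines are unique.
INSERT INTO def2 (Column1, Column2, Column3, Column4, Column5,
                  Column6, Column7, Column8, Column9, Column10,
                  Column11, Column12, Column13, Column14, ColumnRest)
SELECT MAX(CASE s.ordinal WHEN  1 THEN s.value END),
       MAX(CASE s.ordinal WHEN  2 THEN s.value END),
       MAX(CASE s.ordinal WHEN  3 THEN s.value END),
       MAX(CASE s.ordinal WHEN  4 THEN s.value END),
       MAX(CASE s.ordinal WHEN  5 THEN s.value END),
       MAX(CASE s.ordinal WHEN  6 THEN s.value END),
       MAX(CASE s.ordinal WHEN  7 THEN s.value END),
       MAX(CASE s.ordinal WHEN  8 THEN s.value END),
       MAX(CASE s.ordinal WHEN  9 THEN s.value END),
       MAX(CASE s.ordinal WHEN 10 THEN s.value END),
       MAX(CASE s.ordinal WHEN 11 THEN s.value END),
       MAX(CASE s.ordinal WHEN 12 THEN s.value END),
       MAX(CASE s.ordinal WHEN 13 THEN s.value END),
       MAX(CASE s.ordinal WHEN 14 THEN s.value END),
       STRING_AGG(CASE WHEN s.ordinal > 14 THEN s.value END, ',')
           WITHIN GROUP (ORDER BY s.ordinal)
FROM def d
CROSS APPLY STRING_SPLIT(d.Column1, ',', 1) s
GROUP BY d.Column1;
```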
Finally, get the values, which can be inserted into a (new) table:
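A sketch (here `R3` is derived from the split ordinal; in the fiddle, `R2`, the position of each `,` in `ColumnRest`, served the same purpose; the column names `ts`, `code`, and `val` are made up):

```sql
SELECT d.R,
       t.R3,
       MAX(CASE WHEN s.ordinal % 3 = 1 THEN s.value END) AS ts,   -- timestamp
       MAX(CASE WHEN s.ordinal % 3 = 2 THEN s.value END) AS code, -- e.g. 'R0'
       MAX(CASE WHEN s.ordinal % 3 = 0 THEN s.value END) AS val   -- the value
FROM def2 d
CROSS APPLY STRING_SPLIT(d.ColumnRest, ',', 1) s
CROSS APPLY (SELECT (s.ordinal - 1) / 3 + 1 AS R3) t   -- triplet number
GROUP BY d.R, t.R3
ORDER BY d.R, t.R3;
```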
In the output, `R` is the row number from your import table, and `R3` is the row number of the values within that line (`R2` is the position of the `,` in `ColumnRest`).
This solution gets you two tables: the first, `def2`, with the first 14 columns, and the second with the values of each row (but you have to define that table yourself!).