Greetings,

I've posted this question in other news groups (msdn and a few others) without any response.:sad: I hope I don't get into any trouble for posting this question here, but I reach out for your help.:confused:

I have an excel file stored in a data base table as an image data type. I would like to somehow read the data from the excel file and extract its data into a data base table using t-sql.

I've used sql statements such as the following but this requires the excel file to reside as a physical file on the sql servers file system.

insert into SpreadsheetDataDump (f1,f2,f3,f4,f5,f6)  
Select f1,f2,f3,f4,f5,f6 
FROM OPENDATASOURCE
('Microsoft.Jet.OLEDB.4.0','Data Source="D:\spreadsheets\test.xls";Extended Properties="Excel 8.0;IMEX=1;HDR=NO"')
...WorksheetName$

In my case, I do not want the excel file to reside on the file system because I require many different users to submit excel files. I just cannot afford to expose the file system to so many users, so I've created a web interface for the users to upload the excel file into the database.

As I mentioned, the excel file gets inserted into a table as an image data type. Now I want to select this image data using t-sql and somehow extract the excel spreadsheet data. If I can't do this, can I at least parse the image data into a physical file somewhere on the SQL server? I don't mind parsing it to disk at this point because by now the process is handled by SQL and the end user is out of picture.

Can this be done? Or do I have to parse out the image data as an excel file somewhere on the sql servers file system?

If I have to parse the image data to the file system, can this be done in t-sql? i.e. does t-sql have the ability to stream image data to the file system either natively or by extended stored procedures? (I want to avoid writing a dll wrapper to do this.)

I'm using SQL Server 2000 on Windows 2000 server.

Thanks for any help you can provide.

-Mike

Recommended Answers

All 15 Replies

Ok folks, I didn't get any replies but I see a few views -- 811 at this point. I completed this task a while ago, but forgot about this thread.

If anyone is interested in how I coded this task, please reply to this thread. If I see anyone interested, I'll post my code. Hopefully the folks who viewed this thread are seeking a solution to the same issues I was facing.

Ok folks, I didn't get any replies but I see a few views -- 811 at this point. I completed this task a while ago, but forgot about this thread.

If anyone is interested in how I coded this task, please reply to this thread. If I see anyone interested, I'll post my code. Hopefully the folks who viewed this thread are seeking a solution to the same issues I was facing.

I'm definitely interested in seeing how you accomplished this; I'm in this position myself now and would rather not have to reinvent the wheel if i didn't have to...

thanks!

Ok. Hang on; it's going to take a while for me to gather all of the information.

First of all, let me recap the situation. I need an interface that allows users to upload excel data into a sql database.

1) I need a way to get excel data into my sql database.

2) I use the following sql statement to get the excel data into my sql database…

insert into SpreadsheetDataDump (f1,f2,f3,f4,f5,f6) 
Select f1,f2,f3,f4,f5,f6 
FROM OPENDATASOURCE
('Microsoft.Jet.OLEDB.4.0','Data Source="D:\spreadsheets\test.xls";Extended Properties="Excel 8.0;IMEX=1;HDR=NO"')
...WorksheetName$

Keep in mind, in this scenario, the excel file must exist on the sql server.

3) Users need a way to upload the excel file onto the sql server.

4) I cannot grant users direct access to the sql server’s file system.

Task: How does the user get the excel file onto the sql server file system without granting access to the sql server’s file system?

In essence, there are three steps in this process. 1. Get excel file into sql table. 2. Put excel file (stored as Image) onto disc. 3. Insert excel data into sql database.

Step-1) I use a web (asp.net) interface to upload the excel file into a sql table as an Image data type. I’m going to defer you to http://aspnet.4guysfromrolla.com/articles/120606-1.aspx, which explains the process of getting a binary file uploaded into a sql table. You’ll get a lot more stuff out of this article, but we’re concerned with the part about getting the file uploaded into the database. I thought it was a neat article, so I'm giving you the link instead of my code.

I use a similar table structure in my database. This is the table structure I use in my code:

CREATE TABLE BinaryDataFiles(
[PkID] [int] IDENTITY(1,1) NOT NULL,
[FileName] [nvarchar](255) ,
[FileSize] [real],
[FileData] [image],
[UploadDate] [datetime] DEFAULT (getdate()),
[ContentType] [nvarchar](30)
)

Step-2) Here’s the part that actually gets the excel file onto the sql server’s file system. I use a console application to dump the excel file (stored as an Image) onto disc. This console application is called from a sql stored procedure.

Here’s the code that streams the image data to file. It compiles to a file called StreamImageData.exe.

using System;
using System.Collections.Generic;
using System.Text;
using System.Data;
using System.Data.SqlClient;
using System.Diagnostics;
using System.IO;
using System.Configuration;
 
namespace StreamImageData {
class Program {
static int Main(string[] args) {
try {
string ConStr = "Data Source=theServer;Initial Catalog=theDatabase;Persist Security Info=True;User ID=theUser;Password=thePassword";
SqlConnection dbConn = new SqlConnection(ConStr);
string sql_str = "select FileSize, FileName, FileData, ContentType from BinaryDataFiles where (PkID = @PkID)";
SqlCommand cmdGetFile = new SqlCommand(sql_str, dbConn);
SqlParameter paramPkID = new SqlParameter("@PkID", SqlDbType.Int);
cmdGetFile.Parameters.Add(paramPkID);
paramPkID.Direction = ParameterDirection.Input;
paramPkID.Value = args[0].ToString(); // The primary key of the blob.
string fileName = args[1].ToString(); // The output file name.
dbConn.Open();
SqlDataReader dr;
dr = cmdGetFile.ExecuteReader();
if(dr.Read()) {
// Create random data to write to the file.
byte[] dataArray = (byte[])dr["FileData"];
using(FileStream fileStream = new FileStream(fileName, FileMode.Create)) {
// Write the data to the file, byte by byte.
for(int i = 0; i < dataArray.Length; i++) {
fileStream.WriteByte(dataArray[i]);
}
}
}
return 0;
} catch(Exception ex) {
Debug.Write(ex.ToString());
return -1;
}
}
}
}

This application expects two arguments – the primary key for the record which stores the image data and an output file name.

I use a stored procedure to execute StreamImageData.exe. The stored procedure goes something like this...

USE [YourDatabase]
GO
CREATE PROCEDURE [dbo].[BinaryDataFiles_StreamImage2File](
@FileName as varchar(256) 
,@ID as varchar(64)
,@Debug as bit = 0 --defaults to 0 (off)
)
AS
set nocount on 
 
DECLARE @TheResult int
declare @command_string as nvarchar(2000) 
set @command_string = '"d:\apps\bin\StreamImageData.exe" ' 
 + @ID + ' ' + @FileName
 
EXEC @TheResult = master.dbo.xp_cmdshell @command_string
print @TheResult
 
declare @ItExists as int 
exec master..xp_fileexist @FileName, @ItExists out
if @ItExists = 0 -- It failed; the file doesn't exist.
begin 
 -- Dump some kind of fail message
end

Step-3) Now that the excel file is parsed onto the sql server’s file system, you can insert the excel data into your database table.
Once again, I use some code that looks like the following to get the excel data into the sql table.

insert into SpreadsheetDataDump (f1,f2,f3,f4,f5,f6)  
Select f1,f2,f3,f4,f5,f6 
FROM OPENDATASOURCE
('Microsoft.Jet.OLEDB.4.0','Data Source="D:\spreadsheets\test.xls";Extended Properties="Excel 8.0;IMEX=1;HDR=NO"')
...WorksheetName$

And this is what the SpreadsheetDataDump table’s structure looks like…

CREATE TABLE BinaryDataFiles(
[f1] [nvarchar](255) ,
[f2] [nvarchar](255) ,
[f3] [nvarchar](255) ,
[f4] [nvarchar](255) ,
[f5] [nvarchar](255) ,
[f6] [nvarchar](255)
)

These are the fundamental components of getting this task done. I have all three of these steps launch automatically, so to the user, it seems like the spreadsheet data automatically populated into the database. In the end, I also delete the image data out of the table and delete the excel file from the file system. Once the sql data is put into the table, my application had no reason to keep the file or image data.

There's a lot more that I've done in my production code. I didn't post all of it because I don't know what applies to your situation, so instead of cluttering this up with unnecessary code, I choose to provide the basics. I hope this helps - if not, please let me know your specifics and I will answer them as best as I can.

Also note that in this case, the spreadsheet data is being thrown into varchar fields. You may need to tweak your data if you want it in a different data type other that character data.

Is there any way we can specify the file name and location in Datasource as a variable?

Like can we replace the following with a variable?

'Data Source="D:\spreadsheets\test.xls";Extended Properties="Excel 8.0;IMEX=1;HDR=NO"'

Yes -- you can build the statement and then use the execute command to execute the statement.

I do something similar to the following in my actual code...

declare @text_01 as varchar(1000) 
declare @text_02 as varchar(500)
declare @spreadsheet as varchar(250)
declare @worksheet as varchar(250)
declare @sql_statement as varchar(2000)

set @spreadsheet = 'c:\path\test.xls'
set @worksheet = 'Sheet1'

set @text_01 = 'INSERT INTO SpreadsheetDataDump (f1,f2,f3,f4,f5,f6) SELECT f1,f2,f3,f4,f5,f6 '

set @text_02 = ' FROM OPENDATASOURCE(''Microsoft.Jet.OLEDB.4.0'',''Data Source="'
+  @spreadsheet + '";Extended Properties="Excel 8.0;IMEX=1;HDR=NO"'')...'
+  @worksheet + '$' 


set @sql_statement =  @text_01 + @text_02

exec (@sql_statement )

In my actual code, the @spreadsheet and @worksheet variables are passed in as parameters. I've declared them in this sample code for clarity.

Hope this helps.

thanks a lot.
Started on the same path before posting that question. Acheieved the result in a similar way.

All is done and working but when i import the data from the excel to sql table, i see a change in the data.
the excel sheet has a column with numbers stored as text. they have a 0 as the first digit. this is gone once the sheet is imported into the column with data type varchar.

is there a way out of this?

Yes -- this is a hurdle that we have to face. You have to wirte a second routine that will cast your data into correct data types.

I tweak my data by getting the table containing the spreadsheet data and inserting it into a second table. The second table is defined with correct data types, so I have to cast anything that is not a character data type.

hmm..will let you know my findings or what ever i do. thanks man. repped.

Ok. One more thing (and I use this technique a lot). You can conditionally inject a zero to your string data based on conditions.

case when ((cast(f1 as int) > -10) or (cast(f1 as int) < 10)) then '0' + f1 
else f1 
end

This code will inject a '0' in front of any single digit number.

i am changing the excel column to varchar before i actually write to the sql table.
i used this earlier:

INSERT INTO table(c1,c2)
SELECT Code1,Desc
FROM OPENDATASOURCE
('Microsoft.Jet.OLEDB.4.0','Data Source="C:\test1.xls";Extended Properties="Excel 8.0;IMEX=1;HDR=YES"')
...Sheet1$

I modified the above as follows and i think i see correct results:

INSERT INTO table(c1,c2)
SELECT CAST(Code1 AS varchar(6)),Desc
FROM OPENDATASOURCE
('Microsoft.Jet.OLEDB.4.0','Data Source="C:\test1.xls";Extended Properties="Excel 8.0;IMEX=1;HDR=YES"')
...Sheet1$

i have to do vast testing before i can say that this worked.

Be careful in situations where you have different data types in the same column.

For example, in my spreadsheet, the first four rows contain information that really has nothing to do with the data. The real data starts from the fifth row and down. So in my example, I have integer data in the first column (starting from the fifth row). However, cell A1 contains a title that is a string. If I use cast(f1 as int) in my code, it will fail on cell A1.

In your case, you're casting to a varchar so no big deal, but if you are casting to a numeric data type you'll need to be aware of the pit falls.

Bottom line - lots of error checking and data massaging may be required.

Higa, you are right. But in my case, the excel sheet will have only two columns and I know for sure it doesnt cause any problems.
Now I have a new problem. Its about sheet names in the excel. I dont have the problem and solution clearly yet. I will post both of them soon.

OK..here it is. I am trying to read a specific sheet from an excel file. I realized that the name of the sheet may not be same all the time, but the position of the sheet is fixed. I can take the sheet name as input, but sometimes it may have spaces in that name and i tried in vain to read those sheets with spaces. So, two ways..one is to be able to access the sheets by their postion or to be able to read the sheet names with spaces.

If anyone know how to tackle this problem, please post a solution.

I don't know how to specify the sheet by position.

I also struggled with the sheet names containing spaces. This seems to be a limitation as far as I was able to research. In the end, I replaced all spaces in the sheet name with the underscore character. (This was the least headache.)

Be a part of the DaniWeb community

We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, networking, learning, and sharing knowledge.