Copying a Microsoft Word doc

Thread Solved

#1 | shealy (Newbie Poster) | Jul 1st, 2009
I have two binaries: a Java binary that requests a Microsoft Word doc from a C++ binary. The C++ binary opens the Word doc in binary mode, reads a fixed number of chars at a time, and returns them to the Java binary. The Java binary eventually receives all the data and writes it out with a file stream write. When I try to open the newly created file, the contents are not readable. The newly created file is exactly the same size as the original file read by the C++ server.

Should the Java and C++ binaries try to manipulate the Microsoft Word line feeds, etc.?
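
A minimal sketch of the chunked read described above, assuming the server hands back one block of raw bytes per request. The name `readChunk` and the use of `std::vector<char>` are illustrative assumptions, not the poster's actual code. The point it shows is that each block should carry the byte count that was really read (via `gcount()`), because the final block is usually shorter than the requested size.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <vector>

// Hypothetical helper (not from the thread): read up to chunkSize raw bytes
// from a stream that was opened with std::ios::binary. The returned vector is
// sized to the number of bytes actually read, so the last chunk may be short
// and an exhausted stream yields an empty vector.
std::vector<char> readChunk(std::ifstream& in, std::size_t chunkSize)
{
    assert(chunkSize > 0);
    std::vector<char> chunk(chunkSize);
    in.read(chunk.data(), static_cast<std::streamsize>(chunkSize));
    // gcount() reports the bytes read even when read() hit end-of-file early.
    chunk.resize(static_cast<std::size_t>(in.gcount()));
    return chunk;
}
```

A caller would keep requesting chunks until it receives an empty one, and would send `chunk.size()` bytes each time rather than relying on a terminating '\0', since a .doc file contains zero bytes.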

#2 | u8sand (Junior Poster) | Jul 1st, 2009
I once tried to do something like this with ONLY C++: take all the contents of a document and remake it with the same thing. The problem, though, is that there are some characters that may not show up (they may not follow the ASCII char set), and there may be text that is not being retrieved. Make sure you're getting all the text, so use a pointer:
    #include <iostream>
    #include <fstream>

    using namespace std;

    // Returns the file's contents as a heap-allocated, '\0'-terminated buffer.
    // Note: char* main(char*) is not a legal signature for main (see post #9).
    char* main(char* file)
    {
        char* contents = 0;   // initialised so the first delete[] is safe
        char* buffer;
        int numOfChars = 0;
        char ch;
        ifstream fin(file);
        if(fin)
        {
            while(fin.get(ch))
            {
                // grow by one char: copy to buffer, append ch, copy back
                buffer = new char[numOfChars+2];
                for(int i = 0; i < numOfChars; i++)
                    buffer[i] = contents[i];
                buffer[numOfChars] = ch;
                buffer[numOfChars+1] = '\0';
                delete[] contents;
                contents = new char[++numOfChars+1];
                for(int i = 0; i < numOfChars; i++)
                    contents[i] = buffer[i];
                contents[numOfChars] = '\0';
                delete[] buffer;
            }
            fin.close();
        }
        else
            return "Error";
        return contents;
    }
Last edited by u8sand; Jul 1st, 2009 at 10:29 pm.
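
For comparison, a hedged sketch of the same "read everything into memory" idea using standard containers instead of manual reallocation. `readAllBytes` is a hypothetical name, and this is one possible approach rather than what any poster in the thread used. It opens the file in binary mode and keeps every byte, including '\0' bytes that would cut a C string short.

```cpp
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: load every byte of a file into a vector.
// Throws std::runtime_error if the file cannot be opened.
std::vector<char> readAllBytes(const std::string& path)
{
    std::ifstream in(path.c_str(), std::ios::binary);
    if (!in)
        throw std::runtime_error("cannot open " + path);
    // istreambuf_iterator reads raw characters and does not skip whitespace.
    return std::vector<char>(std::istreambuf_iterator<char>(in),
                             std::istreambuf_iterator<char>());
}
```

The length of the data is `bytes.size()`, so no terminator is needed and embedded zero bytes are preserved.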

#3 | Ancient Dragon (Team Colleague, Featured Poster) | Jul 1st, 2009
>>When I try to open the newly created file, the contents are not readable
It's because doc files are binary files, not text files. They contain a lot of formatting information, such as font, font color, and font size, that is readable only by MS Word or a compatible program.

Binary files have to be opened in binary mode, ifstream fin(file, ios::binary);, and read with the stream's read() method.
    ifstream fin(file, ios::binary);
    ofstream out("newfile.doc", ios::binary);
    char iobuffer[255];
    while( fin.read( iobuffer, sizeof(iobuffer) ) )
    {
        // do something with this block of data
        size_t sz = fin.gcount();
        out.write( iobuffer, sz);
    }
Last edited by Ancient Dragon; Jul 1st, 2009 at 10:37 pm.

#4 | shealy (Newbie Poster) | Jul 2nd, 2009
I'm pretty sure that the text is being copied correctly, insofar as one can using C++ file stream reads/writes and buffers. The file sizes are the same, too. Does one need to use Microsoft APIs to ensure that non-ASCII chars are converted?




Originally Posted by u8sand
I once tried to do something like this with ONLY C++. [...] Make sure you're getting all the text, so use a pointer: [code from post #2 omitted]

#5 | shealy (Newbie Poster) | Jul 2nd, 2009
I missed this reply, sorry. I am treating the Microsoft Word doc in the C++ code as a binary doc and using fstreams to read/write the data. Then, when I use Microsoft Word to open the newly copied file, the contents are not readable.
Is it possible to just read the contents of the Microsoft doc file in binary form, write them, and open the result without doing any formatting of special chars?

#6 | Ancient Dragon (Team Colleague, Featured Poster) | Jul 2nd, 2009
>>Does one need to use microsoft apis to ensure that non-ascii chars are converted?

Huh? I didn't post anything specific to Microsoft, only standard C++ stuff. Binary files have to be opened in binary mode using the ios::binary option. If you don't do that, the destination file will be corrupt (on Windows, for example, text mode translates line-ending bytes).
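
As an illustration of a plain binary copy in standard C++, here is a minimal sketch that streams the whole input buffer to the output in one statement. `copyBinary` is a hypothetical helper name, and this is an alternative to a block-by-block read() loop, not code posted in the thread. Both streams are opened with `ios::binary`, so no bytes are translated.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: copy src to dst byte for byte.
// Returns true only if both files opened and the copy succeeded.
bool copyBinary(const std::string& src, const std::string& dst)
{
    std::ifstream in(src.c_str(), std::ios::binary);
    std::ofstream out(dst.c_str(), std::ios::binary | std::ios::trunc);
    if (!in || !out)
        return false;
    // operator<< sets failbit when nothing is inserted, so handle an empty
    // source file explicitly: dst already exists and is empty.
    if (in.peek() == std::ifstream::traits_type::eof())
        return true;
    out << in.rdbuf();   // streams every remaining byte of the input
    return out.good();
}
```

Because there is no fixed-size buffer, there is no final partial block to remember to write.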

#7 | Ancient Dragon (Team Colleague, Featured Poster) | Jul 2nd, 2009
>>Is it possible to just read the contents of the microsoft doc file in binary form, write and open without doing any formatting of special chars?

That is exactly what the code snippet I posted will do. It's just a standard file I/O operation, nothing special about it.

#8 | shealy (Newbie Poster) | Jul 2nd, 2009
To be clear: I am using fstreams and read(), and opening the Microsoft Word doc in binary mode. Ditto for the newly created file that receives the contents of the Word doc. All of this is done in C++ code. When I try to view the newly created doc with MS Word, the contents are not readable.

#9 | tux4life (Posting Virtuoso) | Jul 2nd, 2009
Originally Posted by u8sand
    char* main(char* file)
    [rest of the code from post #2 omitted]
Hey, why has no-one told him that it's int main() and not char* main() or void main() or ... ??
Last edited by tux4life; Jul 2nd, 2009 at 6:24 am.

#10 | Ancient Dragon (Team Colleague, Featured Poster) | Jul 2nd, 2009
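When I actually tried it I had the same problem. I used a command prompt and found that the two files differed by just a few bytes.

Well, the code I posted almost works. The problem is that the last few bytes do not get read/written: when read() reaches end-of-file before filling the buffer it fails, so the loop exits without writing that final partial block. This version writes it after the loop: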
    #include <iostream>
    #include <fstream>
    using namespace std;

    int main(int argc, char* argv[])
    {
        char iobuf[255];
        size_t total = 0;
        size_t sz = 0;
        ifstream fin("file1.doc", ios::binary);
        if( !fin.is_open() )
        {
            cout << "Can't open the file\n";
            return 1;
        }
        ofstream fout( "copy.doc", ios::binary);
        // copy every full 255-byte block
        while( fin.read(iobuf, sizeof(iobuf) ))
        {
            sz = fin.gcount();
            total += sz;
            fout.write(iobuf, sz);
            sz = 0;
        }
        // the last read() failed at end-of-file; write the partial block it did read
        sz = fin.gcount();
        if( sz > 0)
        {
            cout << "sz = " << sz << "\n";
            total += sz;
            fout.write(iobuf, sz);
        }
        fin.close();
        fout.close();
        cout << "Total = " << total << "\n";
        return 0;
    }
Last edited by Ancient Dragon; Jul 2nd, 2009 at 6:37 am.
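
The byte-level check mentioned in this post (two files differing by a few bytes) can also be done in C++. The following is a minimal sketch, assuming a hypothetical helper named `firstDifference`; its return codes are choices made for this example only. It reports where an original and its copy first diverge, which makes a dropped tail block easy to spot.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: compare two files byte by byte.
// Returns the offset of the first differing byte (or the length of the
// shorter file if one is a prefix of the other), -1 if the files are
// identical, and -2 if either file cannot be opened.
long firstDifference(const std::string& pathA, const std::string& pathB)
{
    std::ifstream a(pathA.c_str(), std::ios::binary);
    std::ifstream b(pathB.c_str(), std::ios::binary);
    if (!a || !b)
        return -2;
    long offset = 0;
    char ca = 0, cb = 0;
    for (;;) {
        bool gotA = !a.get(ca).fail();
        bool gotB = !b.get(cb).fail();
        if (!gotA && !gotB)
            return -1;                 // both ended together: identical
        if (gotA != gotB || ca != cb)
            return offset;             // one file ended early, or bytes differ
        ++offset;
    }
}
```

For a 600-byte original copied with a 255-byte buffer and no handling of the final block, the copy would hold 510 bytes and this helper would return 510.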

©2003 - 2009 DaniWeb® LLC