code
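// Read the whole .doc file as raw bytes (binary mode)
$fileHandle = fopen($userDoc, "rb");
$line = @fread($fileHandle, filesize($userDoc));
fclose($fileHandle);
// Split on carriage returns (0x0D), which this code treats as the line separator
$lines = explode(chr(0x0D), $line);
$outtext = "";
foreach ($lines as $line_num => $thisline) {
    // Only look at the first 151 chunks
    if ($line_num <= 150) {
        $pos = strpos($thisline, chr(0x00));
        // Skip empty chunks and chunks that contain a NUL byte (binary data)
        if ($pos === false && strlen($thisline) > 0) {
            // Replace anything outside the allowed character set with a space
            $outtext = preg_replace("/[^a-zA-Z0-9\s\,\.\-\n\r\t@\/\_\(\)]/", " ", $thisline);
            echo "<table>";
            echo "<tr><td>" . htmlspecialchars($outtext) . "</td></tr>";
            echo "</table>";
        }
    }
}
Could you post some example output?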
As it is, you're passing everything through htmlspecialchars($outtext), so it won't be formatted.
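A quick illustration of that point, as a minimal sketch: htmlspecialchars() turns the angle brackets of any tag into entities, so the browser shows the tag as literal text instead of rendering it. The sample string is made up for the demonstration.

```php
<?php
// Made-up sample input containing HTML formatting.
$formatted = "<b>Bold text</b>";

// htmlspecialchars() escapes < and >, so the tag is displayed, not rendered.
$escaped = htmlspecialchars($formatted);

echo $escaped;
```

So any HTML formatting you generate yourself has to be added outside the htmlspecialchars() call, with only the document text being escaped.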
To convert the formatting to HTML, you'll have to know the .doc formatting syntax (it's probably version dependent), then convert each piece of .doc formatting into the equivalent HTML.
A program that handles .doc files pretty well and is open source is OpenOffice. It's mostly C++ with some Java, I believe. You can browse the source code to see just how they do it, though it may be abstracted a bit, so any references you can find on the .doc format would probably get you there faster.
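Since the format is version dependent, it can help to check what kind of file you have before parsing it. The sketch below is only an illustration: detectWordFormat() is a hypothetical helper name, not part of PHP. It relies on two well-known signatures: Word 97-2003 .doc files are stored in an OLE2 compound file, which starts with the bytes D0 CF 11 E0 A1 B1 1A E1, and .docx files are ZIP archives, which start with "PK" followed by 03 04.

```php
<?php
// Hypothetical helper: classify a file by its first bytes.
function detectWordFormat(string $header): string
{
    // Signature of an OLE2 compound file (container used by Word 97-2003 .doc).
    $ole = "\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1";
    // Signature of a ZIP local file header (.docx files are ZIP archives).
    $zip = "PK\x03\x04";

    if (strncmp($header, $ole, 8) === 0) {
        return "ole";
    }
    if (strncmp($header, $zip, 4) === 0) {
        return "zip";
    }
    return "unknown";
}
```

You would pass it the first few bytes of the file, for example the result of fread($fileHandle, 8). Note that a match only identifies the container: other Office formats such as .xls use the same OLE2 signature, so "ole" does not prove the file is a Word document.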
www.fijiwebdesign.com - web design and development and fun
Cpanel Email - Let users Register email accounts on your website upon registration
Ajax Chat - Fully browser based chat!
Join Date: May 2008
Posts: 1
Hi,
I was trying to read MS Word documents but without good results, because of those strange characters.
I then started looking for something on Google and found your code above.
After some changes, I managed to read the first line and remove the junk at the end of the document.
It worked with Word 97-2003 .doc files.
Thanks a lot, without your code I wouldn't have done it.
Here's the code:
<?php
// Read the file and split it into lines (0x0D is a carriage return)
$pathToFile = "path\\to\\file.doc";
$lines = explode(chr(0x0D), file_get_contents($pathToFile));
$outText = "";
// Take care of the first line and remove it from the lines array.
// It comes right after binary data, so keep only the text after the last NUL byte.
$firstLine = explode(chr(0x00), array_shift($lines));
$outText .= "<p>" . end($firstLine) . "</p>\n";
// Read each line found in the doc
foreach ($lines as $line) {
    // Stop at the first line that contains a NUL byte (binary data)
    if (strpos($line, chr(0x00)) !== false) {
        break;
    }
    // Plain text: add it to $outText, keeping only word characters and spaces
    $line = preg_replace("/[^\w ]/", "", $line);
    $outText .= "<p>" . $line . "</p>\n";
}
// Print the results
echo $outText;
?>
I created an account here just to thank you!
All I can tell you about the bold and formatting stuff is that all of that information is written at the end of the file, and you need to read the .doc file specification if you want to learn about it.
Thanks, and if I find something to make this code better, I'll tell you.
(Sorry for my English.)
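If you want to try the same approach without a real .doc file at hand, here is a minimal sketch of it wrapped in a function. extractDocText() is a hypothetical name, and the sample input is made up to imitate the layout the code above relies on (binary junk, then text lines separated by 0x0D, then more binary junk). It is not a real Word file, and real documents may not follow this layout.

```php
<?php
// Hypothetical helper: same heuristic as the code above, but it works on a
// string of raw bytes and returns the extracted lines as an array.
function extractDocText(string $raw): array
{
    $lines = explode(chr(0x0D), $raw);

    // First line: keep only the text after the last NUL byte.
    $parts = explode(chr(0x00), array_shift($lines));
    $result = [end($parts)];

    foreach ($lines as $line) {
        // A NUL byte is taken as the start of the trailing binary data.
        if (strpos($line, chr(0x00)) !== false) {
            break;
        }
        // Keep only word characters and spaces, as in the code above.
        $result[] = preg_replace("/[^\w ]/", "", $line);
    }
    return $result;
}

// Made-up input: junk, two text lines, then junk again.
$sample = "\x00\x00junk\x00Hello\rSecond line!\r\x00garbage\rnever reached";
print_r(extractDocText($sample));
```

Returning an array instead of echoing makes it easier to decide later how each line should be wrapped in HTML.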