DaniWeb IT Discussion Community

DaniWeb IT Discussion Community (http://www.daniweb.com/forums/)
-   PHP (http://www.daniweb.com/forums/forum17.html)
-   -   first string in a file (http://www.daniweb.com/forums/thread120761.html)

queenc Apr 24th, 2008 6:59 am
first string in a file
 
Hi, I have written code to convert an uploaded .doc file and view it as HTML.
I am not able to view the first line, and all the bold strings look like ordinary strings.

queenc Apr 24th, 2008 7:09 am
Re: first string in a file
 
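code
$fileHandle = fopen($userDoc, "r");
if ($fileHandle === false) {
    die("Could not open the uploaded file.");
}
// Read the whole document and split it on carriage returns (0x0D)
$line = fread($fileHandle, filesize($userDoc));
fclose($fileHandle);
$lines = explode(chr(0x0D), $line);

foreach ($lines as $line_num => $thisline) {
    // Only look at chunks 0 to 150
    if ($line_num > 150) {
        break;
    }
    // Skip empty chunks and chunks containing NUL bytes (binary data)
    if (strlen($thisline) == 0 || strpos($thisline, chr(0x00)) !== false) {
        continue;
    }
    // Replace every character outside the allowed set with a space
    $outtext = preg_replace('/[^a-zA-Z0-9\s,.\-@\/_()]/', " ", $thisline);
    echo "<table>";
    echo "<tr><td>" . htmlspecialchars($outtext) . "</td></tr>";
    echo "</table>";
}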

digital-ether Apr 24th, 2008 12:28 pm
Re: first string in a file
 
Could you post some example output?

As it is, you're passing everything through htmlspecialchars($outtext), so any markup is escaped and the text won't be formatted.
To convert the document's formatting to HTML, you'll need to know the .doc formatting syntax (it's probably version dependent) and then convert each formatting construct into its HTML equivalent.
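
To illustrate the htmlspecialchars() point, here is a minimal sketch. The sample string is made up for illustration and is not taken from the attached document. It shows that escaping turns special characters into entities, and that any HTML formatting has to be added around the escaped text once you know which text is bold.

```php
<?php
// Hypothetical sample line; not taken from the attached document.
$text = 'Terms & "conditions" <draft>';

// Escaping alone: special characters become entities, so the browser
// shows the text literally and applies no formatting.
$plain = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

// Escape the text first, then wrap it in the HTML formatting you want.
$bold = '<b>' . $plain . '</b>';

echo $plain, "\n", $bold, "\n";
```

Deciding which text is bold still requires parsing the .doc formatting data; this only covers the output side.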

A program that handles .doc files pretty well and is open source is OpenOffice. It's mostly C++ with some Java, I believe. You can browse the source code to see just how they do it, though it may be abstracted a bit, so any references you can find on the .doc format would probably get you there faster.

queenc Apr 25th, 2008 1:11 am
Re: first string in a file
 
2 Attachment(s)
Hi,
I have attached the code in the first file and the output in the second file.
The expected output is only 139 lines, but the output file also displays junk values.

Fred_Castro May 11th, 2008 2:55 pm
Re: first string in a file
 
Hi,
I was trying to read MS Word documents, but without good results because of those strange characters.

I then started looking for something on Google and found your code above.
After some changes, I managed to read the first line and remove the junk at the end of the document.

It worked with Word 97-2003 .doc files.

Thanks a lot; without your code I wouldn't have done it.

Here's the code:

<?php
        // Read the file and split it into lines on carriage returns (0x0D)
        $pathToFile = "path\\to\\file.doc";
        $lines = explode(chr(0x0D), file_get_contents($pathToFile));

        $outText = "";

        // The first chunk starts with binary data; the first line of text is
        // whatever follows the last NUL byte. Remove the chunk from $lines.
        $firstLine = explode(chr(0x00), array_shift($lines));
        // Escape HTML special characters; a single-byte charset keeps every byte valid
        $outText .= "<p>" . htmlspecialchars(end($firstLine), ENT_QUOTES, 'ISO-8859-1') . "</p>\n";

        // Read each line found in the doc
        foreach ($lines as $line) {
                // Stop at the first line containing a NUL byte (junk after the text)
                if (strpos($line, chr(0x00)) !== false) break;

                // No NUL byte: keep only word characters and spaces
                $line = preg_replace("/[^\w ]/", "", $line);
                $outText .= "<p>" . $line . "</p>\n";
        }

        // Print the results
        echo $outText;
?>

I created an account here just to thank you!
All I can tell you about the bold and formatting stuff is that all the information is written at the end of the file, and you need to read the .doc file specification if you want to learn about it.
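
Since the code above was only tried with 97-2003 .doc files, a cheap guard before parsing might be to check for the 8-byte signature that binary compound files (the container used by 97-2003 .doc) start with. This is only a sketch: looksLikeBinaryDoc is a hypothetical helper name, and the signature is shared by other Office binary formats such as .xls, so a match does not prove the file is a Word document.

```php
<?php
// Hypothetical helper: true when $bytes starts with the 8-byte
// compound-file signature D0 CF 11 E0 A1 B1 1A E1.
function looksLikeBinaryDoc($bytes)
{
    $signature = "\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1";
    return strncmp($bytes, $signature, 8) === 0;
}

$pathToFile = "path\\to\\file.doc";   // same placeholder path as above

// Read only the first 8 bytes of the file, if it can be read at all.
$header = is_readable($pathToFile)
    ? file_get_contents($pathToFile, false, null, 0, 8)
    : false;

if ($header === false || !looksLikeBinaryDoc($header)) {
    echo "Not a readable 97-2003 style binary file\n";
}
```

Newer .docx files are ZIP archives and would fail this check, which is the intended result, since the carriage-return splitting above does not apply to them.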

Thanks, and if I find something to make this code better, I'll tell you.

(Sorry for my English.)


All times are GMT -4.

©2003 - 2008 DaniWeb® LLC