
Perl Encoding??

Hello,

I am new to Perl, and so far I am enjoying it. Unfortunately I do not have the luxury of starting completely from scratch. I have a problem here that I am struggling to solve. I have spent many hours on it without any success, which is why I am asking (or begging, whichever makes you feel better ;)) for help.

Problem

In Isodraw (technical illustration app) I am exporting a filename to a text file. Perl accesses the file, and places the text into a variable. I compare this variable to cell data within a spreadsheet until a positive match is made. I have everything working perfectly except that when Perl reads the text inside the text file it reads it differently:


V6558-04505-011_01 (Original - Text File)

■V 6 5 5 8 - 0 4 5 0 5 - 0 1 1 _ 0 1 (PROBLEM - Perl)


From my research it is due to different encoding. My options within Isodraw when creating the text file are either UNICODE or 8-bit ASCII. Neither gives a good result in Perl, and I cannot change this inside Isodraw, so Perl has to do the conversion. (Note: if I manually save the text file in Notepad as ANSI, Perl reads it perfectly.)

I desperately need some assistance. This is currently beyond my knowledge, so if anyone can help I would really appreciate it.

Many thanks

Alan


Example code

#!/usr/bin/perl -w

use v5.10.0;
use strict;


############# READ NOTE HERE ##############

###### -Uncomment below to see it working perfectly!
#our $VarFileName = "V6558-01501-011_01";


##### IF you wish to see it reading from the file comment above and uncomment Notes Y and Z.



my $record;

our $VarFileName; 			############ NOTE Y

my $VarISS = "VarISS_TestValue";
my $VarICN = "VarICN_TestValue";

############## READ FILE FROM ISODRAW ##################
open (ReadFILE, "<D:/ForJim/FROM_ISODRAW.txt") or die "couldn't open the file!";

while ($record = <ReadFILE>)
{
say $record;
chomp($record);

$VarFileName = $record; 		############ NOTE Z

}
#############################

#############################

	my $VarComparison = "V6558-01501-011_01"; ### TEMP
	if ($VarComparison eq $VarFileName)
		{
		say "MATCH!!!";
		} else {
			say "NOT THE SAME!";
			}

#say our $varFilename;
ColMatrix
Newbie Poster
4 posts since Dec 2010
Reputation Points: 10
Solved Threads: 0
Skill Endorsements: 0

Please post the file "FROM_ISODRAW.txt" as an attachment. That should give us a file with the original encoding preserved so we can reproduce the problem. Click the "Manage Attachments" button to attach your text file.
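In the meantime, looking at the first few raw bytes of the file can usually confirm or rule out an encoding guess. The following is only a minimal sketch: guess_bom() and first_bytes() are hypothetical helper names made up for this example, and the check only recognises the common byte order marks (a file without a BOM, or a UTF-32 file, would need more care).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Guess the encoding from a leading byte order mark (BOM).
# guess_bom() is a hypothetical helper, not part of any module.
sub guess_bom {
    my ($bytes) = @_;
    return 'UTF-16LE' if $bytes =~ /\A\xFF\xFE/;
    return 'UTF-16BE' if $bytes =~ /\A\xFE\xFF/;
    return 'UTF-8'    if $bytes =~ /\A\xEF\xBB\xBF/;
    return 'no BOM found';
}

# Read up to $count raw (undecoded) bytes from the start of $path.
sub first_bytes {
    my ($path, $count) = @_;
    open(my $fh, '<:raw', $path) or die "couldn't open $path: $!";
    my $bytes = '';
    read($fh, $bytes, $count);
    close($fh);
    return $bytes;
}

# Usage (script name is an assumption): perl peek.pl FROM_ISODRAW.txt
if (@ARGV) {
    my $bytes = first_bytes($ARGV[0], 16);
    # Print each byte as two hex digits, then the BOM-based guess.
    print join(' ', map { sprintf '%02X', ord } split //, $bytes), "\n";
    print guess_bom($bytes), "\n";
}
```

If the dump starts with FF FE and every second byte after that is 00, the file is most likely UTF-16 Little Endian, which would explain the extra "spaces" Perl shows between the characters.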

d5e5
Practically a Posting Shark
831 posts since Sep 2009
Reputation Points: 162
Solved Threads: 163
Skill Endorsements: 1

Hi

It would not let me edit my post above; it did previously, but not now for some reason (I am logged in).

Here's the file, and thanks for spending the time to help.

Attachments FROM_ISODRAW.txt (0.04KB)
ColMatrix

Strange, one of my text editors (gedit) tells me the file is plain text and another (Komodo Edit) says it is UTF-16 Little Endian. Try replacing the statement that opens the file with the following:

#Change the following to your path and file name
my $filename = '/home/david/Programming/data/FROM_ISODRAW.txt';

############## READ FILE FROM ISODRAW ##################
open (ReadFILE, '<:encoding(UTF-16)', $filename) or die "couldn't open $filename: $!";
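One thing to watch for after this change: if the file was written on Windows, each line probably ends in "\r\n", and chomp only removes the "\n". The leftover "\r" is invisible when printed but still makes eq fail. Here is a small self-contained sketch of that effect. It builds its own stand-in file, so the file contents (UTF-16LE with a BOM and a CRLF line ending) are an assumption about what Isodraw writes, and read_first_line() is a hypothetical helper name.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(encode);
use File::Temp qw(tempfile);

# Build a stand-in for FROM_ISODRAW.txt: UTF-16LE with a BOM and a
# Windows CRLF line ending. The real file's layout is an assumption.
my ($tmp, $path) = tempfile(UNLINK => 1);
binmode($tmp);
print {$tmp} "\xFF\xFE" . encode('UTF-16LE', "V6558-04505-011_01\r\n");
close($tmp);

# Read the first line back through the UTF-16 decoding layer.
# read_first_line() is a hypothetical helper for this demo.
sub read_first_line {
    my ($file) = @_;
    open(my $fh, '<:raw:encoding(UTF-16)', $file)
        or die "couldn't open $file: $!";
    my $line = <$fh>;
    close($fh);
    return $line;
}

my $chomped = read_first_line($path);
chomp($chomped);                         # removes "\n" only

(my $trimmed = $chomped) =~ s/\s+\z//;   # also removes a trailing "\r"

my $expected = 'V6558-04505-011_01';
print "after chomp: ", ($chomped eq $expected ? "MATCH" : "NOT THE SAME"), "\n";
print "after trim:  ", ($trimmed eq $expected ? "MATCH" : "NOT THE SAME"), "\n";
```

With a file laid out like this, the chomp-only comparison should fail and the trimmed one should succeed, so stripping all trailing whitespace (or adding a :crlf layer after the encoding layer) is worth trying if the strings still do not compare equal.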
d5e5

Thanks for sharing,

I tried your suggestion, but unfortunately it didn't work. A friend managed to assist me, and his code is below. I think he did the same as you, but the string still contained a lot of extra space (data), and when compared to the variable it still wasn't equal, so it would return 'NOT THE SAME'. I personally would have thought that the encoding would have taken care of this issue, but it hasn't.

If the answer below can be shortened to a more compact version, or there is a better workaround, then please feel free to add any input. I have removed unnecessary elements for this test.

Thank you for your time and effort!

Alan

#!/usr/bin/perl -w

# Declare the subroutines
sub trim($);
sub ltrim($);
sub rtrim($);

# Right trim function to remove trailing whitespace
sub rtrim($)
{
	my $string = shift;
	$string =~ s/\s+$//;
	return $string;
}

use v5.10.0;
use strict;



my $record;
our $ansi;

our $VarFileName; 	

############## READ FILE FROM ISODRAW ##################

open(INFILE, "<:encoding(UTF-16)", "C:/ForJim/FROM_ISODRAW.txt");
while(<INFILE>)
{
$record=$_;

print "$record \n";

$VarFileName = $record; 
}
close(INFILE);

######## Trim the trailing whitespace ########
$VarFileName = rtrim($VarFileName);

#################################

	my $VarComparison = "V6558-04505-011_01"; ### TEMP
	if ($VarComparison eq $VarFileName)
		{
		say "MATCH!!!";
		} else {
			say "NOT THE SAME!";
			}
ColMatrix

Looks OK, except that opening a file without testing whether the open succeeded can cause confusion if the open fails: the program continues without giving an error until it tries to read a record from the unopened file. For that reason we usually add an or die... or an || die... clause to the open statement. See "Simple Opens" in http://perldoc.perl.org/5.10.0/perlopentut.html
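As for a more compact version, something along these lines might do, though treat it as a sketch rather than a tested drop-in. It assumes the file is UTF-16 with a BOM, uses a lexical filehandle, checks the open, and adds a :crlf layer after the decoding layer so that "\r\n" becomes "\n". read_isodraw_name() is a hypothetical name, and unlike the code above it keeps the last non-empty line rather than the last line.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the last non-empty line of a UTF-16 text file, without its
# line ending. read_isodraw_name() is a hypothetical helper name.
sub read_isodraw_name {
    my ($path) = @_;
    # :raw drops the platform's default layers, :encoding decodes the
    # UTF-16 data (using the BOM), and :crlf turns "\r\n" into "\n".
    open(my $fh, '<:raw:encoding(UTF-16):crlf', $path)
        or die "couldn't open $path: $!";
    my $name;
    while (my $line = <$fh>) {
        $line =~ s/\s+\z//;              # strip the newline and stray whitespace
        $name = $line if length $line;
    }
    close($fh);
    return $name;
}

my $path = 'C:/ForJim/FROM_ISODRAW.txt';     # path taken from the code above
if (-e $path) {
    my $VarFileName   = read_isodraw_name($path);
    my $VarComparison = 'V6558-04505-011_01';    ### TEMP
    my $same = defined($VarFileName) && $VarComparison eq $VarFileName;
    print $same ? "MATCH!!!\n" : "NOT THE SAME!\n";
}
```

The regex trim makes the separate rtrim subroutine unnecessary, and it still works if the :crlf layer turns out not to be needed on your system.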

d5e5
