Hello, I am trying to learn the String.charCodeAt(index) method. Here is a snippet that demonstrates that the first value is always 48, independent of the character. How does it work? (I was expecting each letter to have its own Unicode number equivalent.)

    function unicoding(str) {
      for (i = 0; i < str.length; i++) {
        console.log(String.charCodeAt(i))
      }
      return str;
    }
    unicoding("abce");
    unicoding("abcde");
    unicoding("zabcdef");
    unicoding("yzabcdefg");

Output:

    48 49 50 51
    48 49 50 51 52
    48 49 50 51 52 53 54
    48 49 50 51 52 53 54 55 56
    => 'yzabcdefg'

What is the secret of these numbers? Sorry if it is very trivial.
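
A note on the numbers: 48 through 56 are the code points of the digit characters '0' through '8'. That suggests (an assumption, since the exact behaviour depends on the JavaScript engine) that the loop index itself is being turned into a string and measured, because charCodeAt is called on String rather than on str. A minimal Python sketch of the difference, with ord standing in for charCodeAt and both function names being hypothetical:

```python
def codes_of_index(text):
    """Mimic the observed output: the code point of the first
    character of each loop index written out as a string."""
    return [ord(str(i)[0]) for i in range(len(text))]


def codes_of_chars(text):
    """What the poster expected: the code point of each character."""
    return [ord(ch) for ch in text]


print(codes_of_index("abce"))  # [48, 49, 50, 51]
print(codes_of_chars("abce"))  # [97, 98, 99, 101]
```

The second function corresponds to calling the method on the string itself, i.e. str.charCodeAt(i) in the original snippet.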

I am trying to do some text processing tasks against a collection of files stored in a directory. The data set is just the standard 20-newsgroup data. However, running the following code segment gives an error message such as `UnicodeDecodeError: 'utf8' codec can't decode byte 0x80 in position 240: invalid start byte` I think it is related to a Unicode problem, but I am not clear how to solve it.

9: DIR = 'C:\\Users\\Desktop\\data\\rec.sport.hockey'
10: posts = [open(os.path.join(DIR,f)).read() for f in os.listdir(DIR)]
11: x_train = vectorizer.fit_transform(posts)

The traceback message is as follows: Traceback (most recent call last): File "C:/Users/PycharmProjects/Project3/demo10.py", line 11, in …
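
The byte 0x80 can never start a UTF-8 sequence, so at least one file in the directory is probably not UTF-8. A minimal sketch of reading the files with an explicit encoding follows. The helper name read_posts and the choice of latin-1 are assumptions for illustration, not something the post confirms about the data:

```python
import os


def read_posts(directory, encoding="latin-1", errors="strict"):
    """Read every regular file in `directory` as text.

    latin-1 maps each byte to one character, so it never raises a
    decode error; whether it is the *right* encoding is an assumption.
    """
    posts = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue
        # Explicit encoding avoids depending on the platform default.
        with open(path, encoding=encoding, errors=errors) as handle:
            posts.append(handle.read())
    return posts
```

An alternative is to keep UTF-8 and pass errors="replace", which substitutes U+FFFD for each undecodable byte instead of failing.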

Given a Unicode string or file, what is the best way to find a short Unicode string that is **not** a substring of the first one? I want a working algorithm that avoids transforming the first string into a list of characters. My first idea is to choose a random Unicode character; then, if it is a substring of string A, choose a random string B of 2 Unicode characters. If that is in string A too, choose a new random B with 3 characters, etc. If somebody sees a working deterministic algorithm, that would be great.
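
One deterministic variant of the idea above is to enumerate candidates in order of length instead of drawing them at random. This is only a sketch: the function name and the default alphabet are assumptions, and the search is exhaustive rather than optimised.

```python
from itertools import product


def shortest_absent_string(text, alphabet="abcdefghijklmnopqrstuvwxyz"):
    """Return the first string over `alphabet`, shortest first, that does
    not occur in `text`. The text is only queried with `in`, never
    converted to a list of characters."""
    length = 1
    while True:
        # Candidates of the current length, in alphabet order.
        for letters in product(alphabet, repeat=length):
            candidate = "".join(letters)
            if candidate not in text:
                return candidate
        length += 1


print(shortest_absent_string("abc"))  # d
```

The loop always terminates: a text of n characters has at most n distinct substrings of any given length, and any candidate longer than the text cannot occur in it.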

I have code-behind in my web pages and I'm trying to write Greek characters to my SQL 2010 tables. The rows are written, but what I get is ?????. Why is that? What do I have to do in SQL in order to store the Greek characters? Thank you

Working with Microsoft Visual Studio 2010 Professional, using C#, with Microsoft SQL Server 2008. I have been tasked with taking an existing application, and making changes so that it can receive multiple languages in its text boxes without needing to configure ahead of time what language is to be used. To ensure that I am doing this correctly, I have created a new table in the database, called Language. If I can get things working correctly with [Language], I should be able to get it all working. The field in question is "MotherTongueLanguageName": CREATE TABLE [Language]( [LanguageID] [tinyint] IDENTITY(1,1) NOT …

Hello, I have a file which I need to read which contains some unusual characters (EG: '╠') and I need to be able to read it and convert those characters to numbers. As an example of what I am looking for would be code that reads a file and prints "Found" whenever '╠' is encountered. How do I do this with native std::fstream compatibility? EG: void testFile(const char *str) { std::fstream file(str); char temp; int count=0; while (!file.eof()) { file>>temp; if (temp=='╠')//I highly doubt this would work count++; } cout<<"Found "<<count<<" ╠s"<<endl; } I am sure that there is a …

Hi, I want to put Urdu text into a piece of software so that it can be searched. How can I do that? The software holds a lot of text, and people can search through it to find specific sections or parts. For those who don't know Urdu, it is written like Arabic. Regards

Hi, I am new to VC++ and I am trying to implement password encoding. **In Java:** **byte[] saltedPassword = (password + getSalt()).getBytes();** **Output:** **SaltedPassword: [B@3eca90** In Java, saltedPassword gets the encoded value this way, and **I want to implement the same in VC++**. How do I do this **encoding in VC++**? Please can anyone help me? Thanks in advance.

I'm trying to do the following: 1. get username typed into the userField 2. make a SEARCH mysql_query with the username as a variable I'm having a hard time getting past phase 2 since mysql_query takes a const char* as the query string, and I can only get username as char* or wchar_t* I'm also compiling in unicode. My code for now: void mysql_connect(HWND hLoginWnd) { MYSQL *con, mysql; MYSQL_RES *res; mysql_init(&mysql); mysql_options(&mysql, MYSQL_SET_CHARSET_NAME, "utf8"); mysql_real_connect(&mysql, "localhost", "root", "", "treenitaulu", 3306, NULL, 0); char name[512], pass[512]; int lenUser = SendMessage(userField, WM_GETTEXT, 512, (LPARAM)name); int lenPass = SendMessage(passField, WM_GETTEXT, 512, (LPARAM)pass); …

Hello, I'd like to ask a quick one.
- MFC VC++ 6 application.
- NO Unicode support by design.
- Greek characters appear fine in all dialogs since the OS is the Greek version of Windows (?)
- Greek text is both hard-coded in C++ source files and retrieved from a MySQL database.
- ONE specific Edit Box combines MySQL data with Greek text and strings hard-coded in source files, to form a kind of report and print it on paper.

But: if I copy text from an Edit Box and paste it into Notepad, for example, the Greek characters appear scrambled. If I print on paper …

Hi, I want to output [chess symbols](http://en.wikipedia.org/wiki/Chess_symbols_in_Unicode) from Unicode to the console in C++. Please tell me how to do that. I'm using Dev-C++ on Windows 8 with the console. I'm just a beginner, so please keep it simple for me (and also tell me which libraries to include, please).

Hi, my email form is written in PHP and it does not support multiple languages, I mean Unicode text. It shows something like this: খোলা জানালা . What should I do in this case?

Dear Friends, I'm having some trouble in my script with special characters. I built a program that is supposed to process some images in a folder and give back some files (xls and txt). Everything is fine in most cases, but if the path (including the filename) contains any non-ASCII character such as č, I get the following error: 'ascii' codec can't encode character u'\u010d' in position...: ordinal not in range(128) How can I handle this problem? Thanks a lot for your help
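
The error above says that something tried to encode the path as ASCII, which has no byte for č (U+010D). A minimal sketch of choosing the encoding explicitly follows. It is written for Python 3, while the u'\u010d' in the message suggests the original script ran on Python 2, and the helper names and sample path are hypothetical:

```python
def is_ascii_only(path):
    """Report whether `path` could be encoded with the 'ascii' codec."""
    try:
        path.encode("ascii")
    except UnicodeEncodeError:
        return False
    return True


def encode_path(path, encoding="utf-8"):
    """Encode a text path explicitly instead of relying on an implicit
    ASCII default; UTF-8 is an assumed choice here."""
    return path.encode(encoding)


sample = "slike/ra\u010dun.png"  # hypothetical path containing U+010D
print(is_ascii_only(sample))
print(encode_path(sample))
```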

Hello everyone, I am having a hard time reading and writing a UTF-8 file in Visual C++ 2010.

[CODE]
void ReadUTF8File()
{
    ifstream UTF8File("C:\\DaniWeb\\Desktop\\UTF8File.txt");
    /* UTF8File.txt: ☺☻♥♦♣♠•◘○ */
    string UTF8FileStr;
    if(UTF8File.is_open())
    {
        while(!UTF8File.eof())
        {
            UTF8File >> UTF8FileStr;
            cout << UTF8FileStr << endl;
            /* cout: ∩╗┐Γÿ║Γÿ╗ΓÖÑΓÖªΓÖúΓÖáΓÇóΓùÿΓùï */
        }
    }
    UTF8File.close();
}
[/CODE]

The output was not similar to the file's text. Please help, thank you for your time and consideration.
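
The garbled output is consistent with the file's UTF-8 bytes, including a leading byte order mark (EF BB BF), being displayed one byte per character in code page 437. That the console used code page 437 is an assumption. A short Python sketch that reproduces the effect and shows a BOM-aware decode, with hypothetical function names:

```python
BOM = b"\xef\xbb\xbf"  # UTF-8 byte order mark


def show_as_cp437(text):
    """Encode `text` as UTF-8 with a BOM, then misread the bytes as
    code page 437, one character per byte."""
    return (BOM + text.encode("utf-8")).decode("cp437")


def read_utf8(raw):
    """Decode UTF-8 bytes, dropping a leading BOM if present."""
    return raw.decode("utf-8-sig")


print(show_as_cp437("☺☻"))  # ∩╗┐Γÿ║Γÿ╗
```

The first three characters of the misread text come from the BOM alone, and every symbol after that turns into three characters because each one occupies three bytes in UTF-8.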

Hi all, My web application uploads files from a MySQL database. I have the text in MS Word *.doc files, so I convert it to *.txt as Unicode UTF-8, and the MySQL table charset is set to UTF8. I load the database by entering this command: [CODE] LOAD DATA local INFILE 'file.txt' INTO TABLE tab; [/CODE] I'm getting some problems here with the charset encoding. For example this text: [CODE] Amente E ainda os escravos algũas horas podem folgar sem culpa, chorar podem todas. Nam é a sojeição cousa que assi se possa sofrer e a mi [/CODE] is loaded to the database like …

Hi, I'm using Windows and Python 3. I'm having problems using [B]os.listdir[/B] with Unicode. Let's say I have a directory which contains files with Unicode file names. The name and path of the directory itself might or might not be Unicode. When it is Unicode, I can't seem to get listdir to accept it as an argument. It always raises a WindowsError exception complaining that the Unicode string isn't a valid directory path and displaying it very literally in the error message. For example, "C:/aXb", with the X representing some particular Unicode character, would be displayed by the error message …

Hi, I'm having a problem with using the StreamWriter to write to a Unicode file. Code as follows:

[CODE]
for(int i = 1; i < 1018; ++i)
{
    std::wostringstream integer;
    integer << L"IDS_STRING" << i << L"\t";
    wstring thisString = integer.str();
    thisString = thisString + L"\"Spare String\"";
    wchar_t* SpareString = const_cast<wchar_t*>(thisString.c_str());
    bool isPresent = std::find_if(load.begin(), load.end(), StartsWith1(thisString)) != load.end();
    System::IO::StreamWriter^ WriteSpareStrings = gcnew System::IO::StreamWriter(L"Chinese.txt", true, System::Text::Encoding::Unicode);
    if(isPresent == false)
    {
        WriteSpareStrings->Write(SpareString);
        WriteSpareStrings->Write(L"\r\n");
    }
    WriteSpareStrings->Flush();
    WriteSpareStrings->Close();
}
[/CODE]

So, obviously I'm expecting the output to the file to be strings of "IDS_STRINGi "Spare String"", but instead all the strings are just …

Good day :) I'm trying to display some Unicode text (Выход) on a Button. C# uses UTF-16 encoding but I'm reading my data from a UTF-8 encoded file so I take the necessary steps to ensure that I've read the data correctly. In debug mode I see that the string is correct but on the UI (after a [I].Text =[/I] ... call) the text appears as %. Any ideas? Thank you.

Hi, I'm having a problem with taking the user input from a richtextbox in Unicode and writing it to a Unicode text file. I'm able to read a different Unicode file and write it to the new file, but when it comes to writing the contents of the richtextbox to the new file, nothing appears. I've tried debugging it, and I can see that the wstring is assigned correctly, I just can't see why this won't write to file. What am I doing wrong? [CODE] private: System::Void button1_Click(System::Object^ sender, System::EventArgs^ e) { std::wifstream inFile(L"testin.txt", std::ios::in | std::ios::binary); std::wofstream outFile(L"test.txt", std::ios::app …

[CODE]#include<stdio.h> #include<string.h> void FindWord (char used[30] , char string[30] , int wordsize); char* CleanString (char string[30], int wordsize); int main (void) { FILE *fp; char letters[30]; char words[30]; char used[30]; char *string; int wordsize; int input_size; input_size = wordsize = 0; fp = fopen("dictionary.txt","r");/*contains a list of words in the following format : number_of_chars.word (i.e. 2.hi , 4.stop )*/ while(1) { printf("Type the letters and press enter : "); gets(letters);/* saves the inputed data to a string*/ fflush(stdin); input_size = strlen(letters);/* gets how many characters user entered*/ while (1) { fscanf(fp,"%d",&wordsize);/* will read the number (i.e. for 2.hi will read …

Hi, Is it possible to take the input from a textbox that's in Chinese (Unicode) and convert it into hex to output to a text file? Any insight would be great, thanks!
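
"Hex" can mean two different things here: the hexadecimal code point of each character, or the hexadecimal bytes of some encoding. A minimal sketch of both, assuming UTF-8 for the byte form and using hypothetical function names; the post does not say which language or encoding is wanted:

```python
def to_hex_code_points(text):
    """Hexadecimal Unicode code point of each character, space separated."""
    return " ".join(f"{ord(ch):04X}" for ch in text)


def to_hex_utf8(text):
    """Hexadecimal dump of the UTF-8 encoding of `text`."""
    return text.encode("utf-8").hex()


print(to_hex_code_points("你好"))  # 4F60 597D
print(to_hex_utf8("你好"))         # e4bda0e5a5bd
```

Either string is plain ASCII, so it can be written to a text file without any further encoding concerns.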

I have some code written as a DLL in C++ in Visual Studio 2008, and I need to be able to debug it, but I can't load the DLL in my test exe. The test exe itself is not the problem, because loading another DLL works fine. The DLL that was given to me was built in the Debug Unicode configuration. Could that be the cause of the problem? And how can I convert it to plain Debug mode? I really need your help. Thanks in advance.

Hey guys, I need some help regarding a project involving dictionary based language translation. So what I have to do is, given a text in a foreign language (like Hindi), my program should match each word from a 'dictionary like file' and give the output in English, I don't have to worry about grammar, inflection, etc. I think I could make a simple program which just searches for each word and replaces it with the translated word. Seems simple but I am having the following problems: 1. How do I use another language in a C program? Maybe Unicode? But …

Hi, I'm using UTF-8 in my database, and when I insert Unicode characters directly through the database it's all fine. But when I insert through an HTML form using PHP, it sends some other kind of characters to the database, e.g.: &#3462;&#3514;&#3540;&#3510;&#3549;&#3520;&#3505;&#3530; How can I solve this?

Hello guys, Here's a problem that's giving me a hard time. I am working on a web server in Java and right now I want to enable the user to create a photo album. The user can type the album name in an <input> field in an HTML form. The problem is that I receive the client request to create a new album with the album name being encoded in some way, which I can't figure out. Here's what I have so far: 1. A simple .html file that contains the form (it's quite large, so I won't post it …

Hi all, I have a small problem with Unicode. My text is in Chinese: 你好, 有什么能协助您的吗? The first time I insert it into the database, it is saved in this form: "&#20320 &#22909 , &#26377 &#20160 &#20040 &#33021 &#21327 &#21161 &#24744 &#30340 &#21527 &#65311" (every &#1234 has a ";" after it). But when I run an update query to update it, it displays as ä½ å¥½, 有什么能协助您的吗? and the database value becomes 你好, 有什么能协助您的吗? How can I "encrypt" it to this format: "&#20320 &#22909 , &#26377 &#20160 &#20040 &#33021 &#21327 &#21161 &#24744 &#30340 &#21527 &#65311" during the update?
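
The "&#20320;" form is an HTML numeric character reference: the decimal Unicode code point of the character between "&#" and ";". The post does not name its language, so the following Python sketch with hypothetical function names only illustrates the conversion in both directions:

```python
import html


def to_numeric_references(text):
    """Replace every non-ASCII character with its decimal HTML
    numeric character reference; ASCII is left untouched."""
    return "".join(f"&#{ord(ch)};" if ord(ch) > 127 else ch for ch in text)


def from_numeric_references(markup):
    """Turn character references back into the characters they name."""
    return html.unescape(markup)


print(to_numeric_references("你好"))  # &#20320;&#22909;
```

The "ä½ å¥½" form, by contrast, looks like UTF-8 bytes being read as a single-byte encoding, which would point at the connection or page charset rather than at the references themselves. That reading is a guess from the visible characters.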

Hi all, Recently I've run into a problem where a string I am reading from a file is being read in with a nonprinting character appended to the end. The character being appended is U+0020. I'm just unsure how to get rid of this. I know that I could just chop off the last character of the string, but this wouldn't solve my problem because some strings have this Unicode character on the end and others don't. Any help would be greatly appreciated. Thanks, Dylan
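
U+0020 is the ordinary space character, so the usual "strip trailing whitespace" operation covers this case and leaves strings without the trailing character unchanged. The post does not say which language is in use; a minimal Python sketch with a hypothetical helper name:

```python
def strip_trailing_spaces(value):
    """Remove any run of U+0020 at the end of `value`; strings that do
    not end in a space are returned unchanged."""
    return value.rstrip("\u0020")


print(repr(strip_trailing_spaces("Dylan ")))  # 'Dylan'
print(repr(strip_trailing_spaces("Dylan")))   # 'Dylan'
```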

I've searched all over for an answer to this, including this forum, so sorry if I missed something, anyway, I'd like to get a numerical code from extended characters like ß or ü and so on. I don't use them very much myself, as I'm a native English language user! But they pop up enough that I should be able to support them if they arise. I have found some information on long chars but I didn't manage to find a resource I could understand enough to actually use. i.e. [code=cplusplus] char c; int i; c = 'h'; i = …

Hello guys, I am working on a module that needs to convert any Unicode words/characters (which never change with the font) to their equivalent HTML numeric references and Java Unicode escapes [\u****]. In JSP the charCodeAt(i) function works, but in Java no such utility exists. Please reply with a suitable example so I can understand it easily. E.g.: آ in Urdu corresponds to 1570 in HTML and to \u0622 as a Java Unicode escape. Thanks
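
Both forms are the same number written differently: 1570 in decimal is 0622 in hexadecimal. The sketch below shows the arithmetic in Python, not Java, with hypothetical function names. The Java-style escape is built from UTF-16 code units so that characters outside the Basic Multilingual Plane come out as surrogate pairs:

```python
def html_numeric(text):
    """Decimal HTML numeric character reference for each character."""
    return "".join(f"&#{ord(ch)};" for ch in text)


def java_escape(text):
    r"""Java-style \uXXXX escape for each UTF-16 code unit of `text`."""
    data = text.encode("utf-16-be")
    # Each UTF-16 code unit occupies two bytes in big-endian order.
    units = [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]
    return "".join(f"\\u{unit:04x}" for unit in units)


alef_madda = "\u0622"  # the character from the example above
print(html_numeric(alef_madda))  # &#1570;
print(java_escape(alef_madda))   # \u0622
```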

Hi Guys, I was hoping someone can help me with this. I have a python program that reads and writes files in utf-8(unicode). When I run it in Eclipse, the output is perfectly fine. When I try making an exe file of my program it is not working. I also tried double clicking the python program to run it in windows dos but the same problem occurred. I tried running it in pythonWin still the same problem. Now I have a problem debugging my program since it is working perfectly fine in Eclipse. Is it a problem about encoding? I …
