I wrote a web page with one Russian word in it, using the Cyrillic character entity names (found in a list of all Unicode characters that I have). The page works fine in every browser I tried, but the W3C validator rejects it because of the Cyrillic entity names. It validates if I use the numeric character references instead, but that is such clunky programming. It reminds me of assembly language, where you have to type in the ASCII code for every character the program displays.

Why can't we have better programming practices? The W3C seems to want to deprecate all but 5 of the character entity names. This sounds like going back to the Stone Age.

diafol

Have you set encoding to utf-8?

Edit: sorry, not sure I understood properly. What is it you actually need help with?

Could you give examples?

<!doctype html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Page Title</title>
</head>
<body>
АВДИ
</body>
</html>

Validated OK for me.

diafol

Btw, the ???? above are A, B, D and a back-to-front N. Daniweb just ain't up to displaying them in messages.

I am using the doctype for xhtml. Does that make a difference?

Yes, I am using utf-8.

As an example, it did not recognize the character entity & YAcy; (the backward R) as valid.
It does recognize & #1071; (the same character) as valid.
(I inserted a space after the & here to prevent encoding.)

Every browser I tried accepted & YAcy; and displayed the correct character.

diafol

OK, any reason why you're using xhtml instead of html5?
Also you should be able to insert the characters directly, not use the HTML encodings - that would be ridiculous for anybody using anything other than English :)

diafol

This validated:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html lang="en" xmlns="http://www.w3.org/1999/xhtml">
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
  <title>Page Title</title>
</head>
<body>
АВДИ
</body>
</html>

???? = ABDN again
There was no problem. I didn't try with strict mode.

Try wrapping the Cyrillic in <span lang='ru'>АБВГДЕЖЅ</span>

w3c declaring language

diafol

OK, just read back the post again. You're using entity names (the ones starting with &...; like 'yacy'). Why are you using entity names instead of typing the actual characters directly? You can even use something as simple as charmap to do this.

BTW - I get both 'yacy' and 'YAcy' to work (entity names are case-sensitive).

<!doctype html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Page Title</title>
</head>
<body>
&yacy; &YAcy;
</body>
</html>
diafol

Aha. Yes I get the error message if using XHTML or HTML4. Change to HTML5 if you can. I only just realised what it was that was causing the issue - doh! Does the document need to be validated as XML? If not use HTML5.
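
If the page has to stay XHTML 1.0, one mechanical workaround is to convert the entity names to numeric references before validating. Here is a minimal Python sketch, assuming Python 3 and its standard `html.entities.html5` table. The function name `named_to_numeric` is made up for illustration and is not something mentioned in this thread.

```python
import re
from html.entities import html5  # maps names like "YAcy;" to their characters

# XML itself predefines only these five names, so they are left alone.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}


def named_to_numeric(markup: str) -> str:
    """Replace named entities such as &YAcy; with numeric references.

    Hypothetical helper: names that the XHTML 1.0 DTD does define
    (e.g. &nbsp;) are converted as well, which is harmless.
    """
    def repl(match: re.Match) -> str:
        name = match.group(1)
        if name in XML_PREDEFINED:
            return match.group(0)
        chars = html5.get(name + ";")
        if chars is None:
            return match.group(0)  # unknown name: leave it untouched
        # A few HTML5 names expand to two code points, hence the join.
        return "".join(f"&#{ord(c)};" for c in chars)

    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", repl, markup)


print(named_to_numeric("&yacy; &YAcy; &amp;"))  # &#1103; &#1071; &amp;
```

That way the source you edit can keep the readable names, and only the published copy carries the numbers.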

My entire site is xhtml 1.0 strict as required by the webmaster for uniformity. It is the price I gladly pay for free hosting with unlimited storage.

I am still converting some of my old pages (not currently up) from html 4.0 (from when I had Geocities) to xhtml 1.0 so I can put them up (but doing it at my leisure, as new pages have my priority).

My native language is English. I do not have the ability to type the characters in directly because I do not speak or write Russian. I put this one word in to explain a connection between the Bible and a historic event.

diafol

Easy fix. Copy the text on the screen and paste it over the encodings.

I get little white rectangles when I paste the copied text. My editor can't accept the Cyrillic characters.
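
For an editor that only handles ASCII, the characters can be turned into numeric references outside the editor. This is a rough Python sketch using the standard `xmlcharrefreplace` error handler. `to_ascii_markup` is a hypothetical helper name, and the Cyrillic letter is written as a `\u` escape so the script itself stays ASCII.

```python
def to_ascii_markup(text: str) -> str:
    """Replace every non-ASCII character with a numeric reference."""
    # "xmlcharrefreplace" is a built-in codec error handler that emits
    # &#NNNN; for each character the target encoding cannot represent.
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


# "\u042f" is the Cyrillic capital letter YA (the "backward R").
print(to_ascii_markup("\u042f"))  # &#1071;
```

The output is plain ASCII, so it should survive any editor.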

diafol

Try a proper editor. Even Notepad accepts them.

@diafol:
His source and editor appear functional; the W3C validator flags a Unicode fault that's hard to find because it's not really there.

diafol

Ok, my last bit on this. Can't believe it's still on-going...

notepad.png

Why you can't just use charmap in Windows I don't know. You are creating a problem where one doesn't exist.