Hi,

I've been wondering how translators like Google Translate work. I'm working on a site that I need to translate from English to Creole (not Haitian Creole). No translator for my language is available anywhere, so I want to make my own, which others could also use in the future. How do I go about this? If I could get one for a language similar to mine, I could perfect it.


Any suggestions would be a great help. Thanks.

You could start by replacing word by word. Within a string, try to match the longest phrase first. For example:

cry -> aWordInYourLanguage
cry out -> anotherWordInYourLanguage  <== match this one first

I think Google also lets speakers of a language help by adding vocabulary, so it has more words and phrases to work with. One tricky part is that you need an efficient search algorithm in order to translate quickly. I think you can use tools to tokenize the input string and do the matching for you. I used to use flex for this. You still need to write your own matching rules.
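The longest-match idea above can be sketched in a few lines of Python. This is only an illustration, not a real translator: the phrase table reuses the placeholder entries from the example, and the names `PHRASES`, `build_pattern` and `translate` are made up for this sketch. It uses a regular expression instead of flex, which is a swapped-in technique.

```python
import re

# Hypothetical phrase table; the target-language entries are placeholders.
PHRASES = {
    "cry": "aWordInYourLanguage",
    "cry out": "anotherWordInYourLanguage",
}


def build_pattern(phrases):
    # Sort longer phrases first so "cry out" is tried before "cry".
    ordered = sorted(phrases, key=len, reverse=True)
    alternatives = "|".join(re.escape(p) for p in ordered)
    # \b keeps "cry" from matching inside a longer word such as "crystal".
    return re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)


def translate(text, phrases=PHRASES):
    pattern = build_pattern(phrases)
    # Look up each matched phrase by its lowercase form.
    return pattern.sub(lambda m: phrases[m.group(0).lower()], text)


print(translate("cry out loud"))
print(translate("do not cry"))
```

Sorting by length is the simplest way to get "longest match first" here. It does nothing about word order or grammar, so it only goes as far as the phrase table does.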

First, you need to understand both languages well, including the grammatical rules of each, and then be able to tell what translates to what. From what I see in your post, you do not understand English well enough to tackle a program like this. You need to understand the concepts of sentences and punctuation.

Don't worry about translation vocabulary and grammar. I have people who are very fluent in both languages and ready to help. The only thing I need to know now is where to start, which tools I will need, and so on.

Translation is actually quite a hard problem. I don't think you're going to solve it with simple search/replace.
I heard an interesting talk last year by someone working in this area, and it sounds to me like most of the serious machine translation out there relies more on statistical matching than on any attempt to model the syntax/semantics of the source/target languages. To do this, you need a huge corpus for each language, and some way to match up one corpus against the other. This is not an amateur project!

Instead, you might want to consider an internationalization approach, in which you strip literal text strings from your code and read them in from a source, with one source for each language.

The text strings can be large chunks, like paragraphs of text, or smaller ones, like the word "Submit" on a submit button. The important thing is that you're sourcing all literal text that appears on the site from these databases, so you've got a complete separation of content/presentation. How you manage this depends on what you're using to set up the web site, but it's a medium-hard problem, instead of a problem that's occupied several generations of professional programmers and linguists. Much easier to get a handle on it.
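As a rough sketch of that separation, assuming the strings live in one table per language: the keys, the language code `"creole"`, the placeholder text and the function `t` below are all hypothetical, and in a real site each table would be read from its own file or database rather than written inline.

```python
# Hypothetical string tables, one per language. The Creole entry is a
# placeholder, not a real translation.
STRINGS = {
    "en": {
        "submit": "Submit",
        "welcome": "Welcome to the site",
    },
    "creole": {
        "submit": "SUBMIT_IN_CREOLE",
    },
}


def t(key, lang, fallback="en"):
    """Return the text for `key` in `lang`, falling back to English."""
    table = STRINGS.get(lang, {})
    if key in table:
        return table[key]
    # Untranslated string: show the fallback language, or the key itself
    # so that a missing entry is easy to spot on the page.
    return STRINGS[fallback].get(key, key)


print(t("submit", "creole"))
print(t("welcome", "creole"))
```

The page code only ever calls `t("submit", lang)`, so adding a language means adding a table, not touching the pages. The fallback lets a partly translated site stay usable while the volunteers work through the list.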

When you have this set up, you can add functionality allowing your fluent volunteers to suggest translations, and have those automatically entered in the database, but this would be a lot of development. Instead, it might be easier to ask them to submit suggestions and corrections manually, and train one of them to populate the Creole database.


What this doesn't do, of course, is translate on the fly - if you're doing user-generated content, you'll have to make some provision for user-generated translation, or you'll burn out your volunteers pretty fast.

Have fun - it'll be an interesting piece of work, and not necessarily a very easy one.

Thanks for the great advice. My site will be in Joomla. I've got hold of an English-to-French language pack for the front end and back end and had a good look at it. Here is a sample of the content:

ADD=Ajouter
ALIGN=Aligner
ALT TEXT=Texte alternatif
ALREADY EXISTS=Existe déjà
ALTERNATIVE READ MORE TEXT=Alternative à Lire la suite
PARAMALTREADMORE=Saisissez le texte que vous voulez afficher à côté du lien <em>Lire la suite :</em> au lieu du titre de l'article (par défaut).
ARCHIVES=Archives
ARTICLES BEING MOVED=Articles en cours de déplacement :
ARTICLES BEING COPIED=Articles en cours de copie :
ARTICLE CATEGORY NOT PUBLISHED=Catégorie de l'article non publiée
ARTICLE MUST HAVE A TITLE=L'article doit avoir un titre
ARTICLE MUST HAVE SOME TEXT=L'article doit contenir du texte
ARTICLE SECTION NOT PUBLISHED=Section de l'article non publiée
ARTICLE # NOT FOUND=Article #%d introuvable
ARTICLE RATING=Notation de l'article
AUTHOR=Auteur
AUTHOR ALIAS=Pseudo de l'auteur
AUTHOR FILTER=Filtrer par auteur
BORDER=Bordure
BOTTOM=Bas
CAPTION=Légende
CONTENT=Contenu
CREATED=Créé
DATE=Date

So my guess is: how about I strip down the whole file and replace the French words with words from my language, since our language (Seychelles Creole) is close to French? I know it's not going to be an easy task, given the large amount of text I'll have to replace, but is that a good start?
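A small script could take some of the drudgery out of this. The sketch below assumes the file is plain `KEY=value` lines as in the sample, and that comment lines (if any) start with `#` or `;`, which is a guess about the format. The function names and the `translations` dictionary are hypothetical, and the Creole value is a placeholder.

```python
def parse_language_file(text):
    """Read KEY=value lines into a list of (key, value) pairs."""
    entries = []
    for line in text.splitlines():
        stripped = line.strip()
        # Skip blanks, assumed comment lines, and anything without "=".
        if not stripped or stripped.startswith(("#", ";")) or "=" not in stripped:
            continue
        # Split on the first "=" only, so the value may contain "=" itself.
        key, value = stripped.split("=", 1)
        entries.append((key, value))
    return entries


def retarget(entries, translations):
    """Swap in new values where available and list the keys still missing."""
    out_lines = []
    missing = []
    for key, value in entries:
        if key in translations:
            out_lines.append(f"{key}={translations[key]}")
        else:
            out_lines.append(f"{key}={value}")  # keep the French text for now
            missing.append(key)
    return "\n".join(out_lines), missing


sample = "ADD=Ajouter\nARTICLE # NOT FOUND=Article #%d introuvable\nDATE=Date"
new_text, todo = retarget(parse_language_file(sample), {"ADD": "ADD_IN_CREOLE"})
print(new_text)
print(todo)
```

The list of missing keys can be handed to the fluent volunteers as a to-do list. Placeholders such as `%d` and markup such as `<em>` in the values need to survive translation unchanged.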

Yeah, that's the sort of thing you'll need. I know nothing about Joomla, but it sounds like you're on the right track.

For an extensive site, a text file of this sort might end up being unwieldy - I don't know, I only use this sort of thing in smallish applications with a small set of literals. I don't know enough to advise you on whether this will end up slowing your site down - but this is a start, it's the right idea in any case.
