Hi everyone, I need your help extracting Arabic text from an image using VB.NET. I have only found programs that extract English text, and none for Arabic.
I used the following code, which supports only English.

Imports MODI
Imports System.IO
Public Class Form1

    Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
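        ' Stop if the user cancels the dialog; FileName would be empty.
        If OpenFileDialog1.ShowDialog() <> System.Windows.Forms.DialogResult.OK Then Return
        Dim filepath As String = OpenFileDialog1.FileName
        Dim extract As String = Me.ExtractTextFromImage(filepath)

        ' A WinForms Label shows line breaks as-is, so no HTML <br /> is needed.
        Label1.Text = extract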
    End Sub

    ' Runs MODI OCR on the image file and returns the recognized text (English only).
    Private Function ExtractTextFromImage(ByVal filepath As String) As String
        Dim modiDocument As New Document()
        Try
            modiDocument.Create(filepath)
            modiDocument.OCR(MiLANGUAGES.miLANG_ENGLISH)
            ' DirectCast fails loudly on a wrong type instead of returning Nothing.
            Dim modiImage As MODI.Image = DirectCast(modiDocument.Images(0), MODI.Image)
            Return modiImage.Layout.Text
        Finally
            ' Release the file even if OCR throws.
            modiDocument.Close()
        End Try
    End Function

End Class

All 3 Replies

Tesseract 3.0+ should have Arabic support. It's written in C++, but there are wrappers for other languages, including two for .NET.

If that proves too hard, there is also a JavaScript port of Tesseract that supports Arabic. You'll need Node.js though.
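
For the .NET wrapper route, here is a rough sketch of what the Arabic version of the function might look like. It assumes the open-source "Tesseract" NuGet package (only one of the available .NET wrappers, so others will differ) and that the Arabic language file ara.traineddata has been copied into a tessdata folder. The function name ExtractArabicText and the paths in Main are made up for illustration, so adjust them to your project.

```vbnet
Imports System
Imports Tesseract

Module ArabicOcrSketch

    ' Hypothetical helper: runs Tesseract OCR with the Arabic language data.
    ' tessdataDir is assumed to contain ara.traineddata.
    Function ExtractArabicText(ByVal imagePath As String, ByVal tessdataDir As String) As String
        ' "ara" is Tesseract's language code for Arabic.
        Using engine As New TesseractEngine(tessdataDir, "ara", EngineMode.Default)
            Using img As Pix = Pix.LoadFromFile(imagePath)
                Using page As Tesseract.Page = engine.Process(img)
                    Return page.GetText()
                End Using
            End Using
        End Using
    End Function

    Sub Main()
        ' Placeholder paths, not taken from the original post.
        Dim text As String = ExtractArabicText("sample-arabic.png", "./tessdata")

        ' Switch the console to UTF-8 so Arabic characters are not replaced with "?".
        Console.OutputEncoding = System.Text.Encoding.UTF8
        Console.WriteLine(text)
    End Sub

End Module
```

In the form from the question, the same function could be called in place of ExtractTextFromImage. Recognition quality usually depends heavily on the image, so expect to do some cleanup (scaling, contrast, deskewing) before the text comes out usable.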

commented: Was thinking Tesseract but didn't check language support. Bad me. Have used it. Can be fun (as in, image processing required).

Thanks for sharing, guys.
