I have an HTML file and want to extract the content I need into a text file. How can I use a Unix utility that strips the HTML markup tags from the file's content, leaving just the text? Thanks for the advice!

Partial HTML file

<thead>
                    <tr>
                        <th class="ticker_name">Currency Pair</th>
                        <th>Price</th>                       <th>Change</th>            </tr>

                </thead>
        <tbody>
                <tr class=ticker_row>
    <td class="ticker_name"><a href="/q?s=USDSGD=X">USD to SGD</a></td>
    <td>1.3119</td>        <td class="ticker_down">-0.0042</td>    </tr>    <tr class=ticker_colored_row>
    <td class="ticker_name"><a href="/q?s=EURSGD=X">EUR to SGD</a></td>

    <td>1.8093</td>        <td class="ticker_up">+0.0155</td>    </tr>    <tr class=ticker_row>
    <td class="ticker_name"><a href="/q?s=GBPSGD=X">GBP to SGD</a></td>
    <td>2.0754</td>        <td class="ticker_up">+0.0077</td>    </tr>    <tr class=ticker_colored_row>

    <td class="ticker_name"><a href="/q?s=SGDJPY=X">SGD to JPY</a></td>
    <td>63.5148</td>        <td class="ticker_up">+0.0587</td>    </tr>    <tr class=ticker_row>
    <td class="ticker_name"><a href="/q?s=SGDHKD=X">SGD to HKD</a></td>
    <td>5.9143</td>        <td class="ticker_up">+0.0174</td>    </tr>    <tr class=ticker_colored_row>

    <td class="ticker_name"><a href="/q?s=SGDMYR=X">SGD to MYR</a></td>
    <td>2.3519</td>        <td class="ticker_up">+0.0061</td>    </tr>    <tr class=ticker_row>
    <td class="ticker_name"><a href="/q?s=SGDIDR=X">SGD to IDR</a></td>
    <td>6,803.1157</td>        <td class="ticker_up">+20.9390</td>    </tr>    <tr class=ticker_colored_row>

    <td class="ticker_name"><a href="/q?s=SGDCNY=X">SGD to CNY</a></td>
    <td>5.1002</td>        <td class="ticker_up">+0.0144</td>    </tr>    <tr class=ticker_row>
    <td class="ticker_name"><a href="/q?s=AUDSGD=X">AUD to SGD</a></td>
    <td>1.2759</td>        <td class="ticker_up">+0.0049</td>    </tr>

                </tbody>

Text File Output

[Currency Pair] : [Price] : [Change]    
USD to SGD : 1.3227    : +0.0019
EUR to SGD : 1.7756    : +0.0002
GBP to SGD : 2.0897    : +0.0020
SGD to JPY : 63.6415    : -0.1669
SGD to HKD : 5.8652    : -0.0086
SGD to MYR : 2.3356    : -0.0036
SGD to IDR : 6,765.1    :-11.2568
SGD to CNY : 5.0606    : -0.0081
AUD to SGD : 1.2670    : -0.0012

All 4 Replies

Read the documentation for "The unix utility application that strips the HTML markup tags from the file’s content leaving just content text"

Where can I find the documentation for "The unix utility application that strips the HTML markup tags from the file’s content leaving just content text"?
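
The thread never names a specific utility, so as an alternative here is a minimal sketch in Python that uses only the standard library's html.parser module. It is not the utility the question refers to. The names TableText and html_table_to_text are made up for this illustration, and the sketch assumes every value of interest sits inside a <th> or <td> cell within a <tr> row, as in the posted snippet.

```python
from html.parser import HTMLParser


class TableText(HTMLParser):
    """Collect the text of each <th>/<td> cell, grouped by <tr> row.

    Hypothetical helper for illustration only; not part of any library.
    """

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th"):
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            if self._row is not None:
                self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None


def html_table_to_text(html):
    """Return one line per table row, with cells joined by ' : '."""
    parser = TableText()
    parser.feed(html)
    parser.close()
    return "\n".join(" : ".join(row) for row in parser.rows)


# A shortened copy of the markup from the question, used as sample input.
sample = """
<tr><th class="ticker_name">Currency Pair</th><th>Price</th><th>Change</th></tr>
<tr class=ticker_row>
<td class="ticker_name"><a href="/q?s=USDSGD=X">USD to SGD</a></td>
<td>1.3119</td>        <td class="ticker_down">-0.0042</td>    </tr>
"""

print(html_table_to_text(sample))
# Currency Pair : Price : Change
# USD to SGD : 1.3119 : -0.0042
```

To process a real file, read it into a string and pass that to html_table_to_text, then write the result to the text file. The sketch does not add the square brackets around the header names or the column padding shown in the desired output; both would need a small formatting step on top.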
