Hi,

I use the STL features to deal with string conversions, especially facets. Now I am faced with the task of lowercasing a string except for its first letter. Please note that I am dealing with multi-byte characters, so I cannot simply restore the first char of the string after lowercasing it. Here is my code for lowercasing an entire string:

std::string lower_string ( const std::string& a_string, const std::locale& a_locale )
{
	std::string result;
	char *p = (char*) alloca ( a_string.size() + 1 );

	memcpy ( p, a_string.c_str (), a_string.size() + 1);
	std::use_facet< std::ctype<char> > ( a_locale ).tolower ( p, p + a_string.length () );
	result.assign(p);
	return result;
}

Does anyone have an idea how to avoid converting the first (possibly multi-byte) character in a C++ way? Or a way to uppercase the first character again afterwards?

Many thanks in advance,
Kay

Discussion span: 5 years. Last post by denkfix.

So what you want is a function that creates a new string which is a copy of the string being passed in. Why not change the 'const std::string& a_string' parameter to 'std::string& a_string' and modify the original string, or is that outside the requirements?

As for your question: get the iterators for the string's begin and end, then increment the begin iterator by one.

Edited by gerard4143: n/a

I didn't take the time to read your post completely. Oops.

Try something like below.

#include <iostream>
#include <string>
#include <algorithm>
#include <locale>


std::string lower_string (const std::string& a_string, const std::locale& a_locale )
{
    // Allocate an array of chars ("new char(n)" would allocate a single char initialized to n).
    char *str = new char[a_string.size() + 1];
    std::string::const_iterator begin = a_string.begin();
    std::string::const_iterator end = a_string.end();

    copy(begin, end, str);

    std::string ans;

    std::use_facet< std::ctype<char> > ( a_locale ).tolower ( str + 1, str + a_string.size() );

    ans = str;
    
    return ans;
}

int main(int argc, char *argv[])
{
    std::locale loc("");
    std::string name("A nAME tO PASS aLONG");

    std::string ans = lower_string(name, loc);

    std::cout << ans << std::endl;
    return 0;
}

Edited by gerard4143: n/a

Here's a tighter version.

#include <iostream>
#include <string>
#include <algorithm>
#include <locale>

std::string lower_string (const std::string& a_string, const std::locale& a_locale )
{
    char *str = new char[a_string.size() + 1];
    std::string::const_iterator begin = a_string.begin();
    std::string::const_iterator end = a_string.end();

    // Note: std::copy does not write a terminating '\0' into str.
    copy(begin, end, str);

    std::use_facet< std::ctype<char> > ( a_locale ).tolower ( str + 1, str + a_string.size() );

    std::string ans(str);
    delete [] str;

    return ans;
}

int main(int argc, char *argv[])
{
    std::locale loc("");
    std::string name("A nAME tO PASS aLONG");

    std::string ans = lower_string(name, loc);

    std::cout << ans << std::endl;
    return 0;
}

It's been a while since I looked at C++ code.

Edited by gerard4143: n/a

Many thanks! I had to add the following line to make it work (right after the conversion):

str[a_string.length ()] = '\0';

The bad thing: it does not work with multi-byte characters as in

std::string name("Ä nAME tO PÄSS aLONG");

This fails even if I try to convert the whole string, which is quite frustrating. What is the point of using a locale then? Using std::wstring makes it work:

std::wstring lower_string2 ( const std::wstring& a_string, const std::locale& a_locale )
{
	std::wstring b_string ( a_string );	// work on a copy of the input
	if ( !b_string.empty () )
	{
		// Write through &b_string[0] rather than casting away the const of c_str ().
		wchar_t* p = &b_string[0];
		std::use_facet< std::ctype<wchar_t> > ( a_locale ).tolower ( p, p + b_string.length () );
	}
	return b_string;
}
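
Building on the wide-character version, the original goal (lowercase everything except the first character) could be sketched roughly as below. The name lower_except_first is made up for illustration and does not appear in the thread. The sketch assumes that each wchar_t holds one complete character, which is typically true where wchar_t is 32 bits wide but not for characters outside the Basic Multilingual Plane where wchar_t is 16 bits wide.

```cpp
#include <locale>
#include <string>

// Hypothetical helper (not from the thread): lowercase every character
// except the first one, using the wide-character ctype facet.
// Assumption: one wchar_t corresponds to one character.
std::wstring lower_except_first(const std::wstring& input, const std::locale& loc)
{
    std::wstring result(input);  // work on a copy, leave the input untouched
    if (result.size() > 1) {
        wchar_t* first = &result[0];
        // Skip element 0; tolower converts the range [first + 1, end) in place.
        std::use_facet<std::ctype<wchar_t> >(loc).tolower(first + 1, first + result.size());
    }
    return result;
}
```

Whether a non-ASCII character such as L'Ä' is actually lowercased depends on the locale that is passed in; the classic "C" locale only handles ASCII letters.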

In my view, this behavior is a clear bug: genuine multi-byte characters in multi-byte strings are not converted by the conversion functions offered in <locale>, regardless of the chosen locale. Meanwhile I tried boost::algorithm::to_lower, but this function also falls back on std::ctype, yielding the same results. For anyone facing the same problem, try boost::locale (http://cppcms.sourceforge.net/boost_locale/html/index.html). The conversion functions in this library work as expected.
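
For readers who cannot use Boost, here is a rough standard-library sketch of one possible workaround. The background is that std::ctype<char> converts one char at a time, so it never sees a multi-byte character as a unit. The sketch converts the multi-byte string to wide characters, lowercases everything after the first character, and converts back. The names widen, narrow and lower_tail_multibyte are made up for illustration. It assumes the program has already called std::setlocale(LC_ALL, "") (or similar) so that the C locale's encoding matches the string, and that one wchar_t holds one character.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cwctype>
#include <string>
#include <vector>

// Hypothetical helper: multi-byte -> wide, using the current C locale.
// Returns an empty string if the input is not valid in that locale.
std::wstring widen(const std::string& s)
{
    std::vector<wchar_t> buf(s.size() + 1);  // at most one wchar_t per byte
    std::size_t n = std::mbstowcs(&buf[0], s.c_str(), buf.size());
    if (n == static_cast<std::size_t>(-1)) return std::wstring();
    return std::wstring(&buf[0], n);
}

// Hypothetical helper: wide -> multi-byte, using the current C locale.
std::string narrow(const std::wstring& w)
{
    std::vector<char> buf(w.size() * MB_CUR_MAX + 1);
    std::size_t n = std::wcstombs(&buf[0], w.c_str(), buf.size());
    if (n == static_cast<std::size_t>(-1)) return std::string();
    return std::string(&buf[0], n);
}

// Lowercase every character except the first, treating each multi-byte
// sequence as a single character.
std::string lower_tail_multibyte(const std::string& s)
{
    std::wstring w = widen(s);
    for (std::size_t i = 1; i < w.size(); ++i)
        w[i] = static_cast<wchar_t>(std::towlower(w[i]));
    return narrow(w);
}
```

This only illustrates the idea; unlike boost::locale it does not handle case mappings that change the number of characters.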
