C++ - Getting a substring of a std::wstring

May 15, 2011

How can I take a substring of a std::wstring that includes non-ASCII characters?

The following code does not output anything (the text is an Arabic word of 4 characters, each 2 bytes in UTF-8, plus the word "hello"):

    #include <iostream>
    #include <string>

    using namespace std;

    int main() {
        wstring s = L"سلام hello";
        wcout << s.substr(0,3) << endl;
        wcout << s.substr(4,5) << endl;

        return 0;
    }

This should work (live on Coliru):

    #include <algorithm>  // std::copy
    #include <iostream>
    #include <iterator>   // std::back_inserter
    #include <string>
    #include <boost/regex/pending/unicode_iterator.hpp>

    using namespace std;

    // Encodes each element of `in` as a Unicode code point in UTF-8.
    template <typename C>
    std::string to_utf8(C const& in) {
        std::string result;
        auto out = std::back_inserter(result);
        auto utf8out = boost::utf8_output_iterator<decltype(out)>(out);

        std::copy(begin(in), end(in), utf8out);
        return result;
    }

    int main() {
        wstring s = L"سلام hello";

        auto first  = s.substr(0,3);  // first three Arabic letters
        auto second = s.substr(4,5);  // the space followed by "hell"

        cout << to_utf8(first)  << endl;
        cout << to_utf8(second) << endl;
    }

It prints:

    سلا
     hell
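If Boost is not available, the same conversion can be written by hand. The following is only a sketch, not the answer's method: `to_utf8_sketch` is a hypothetical name, and it assumes `wchar_t` is 32 bits wide and holds one whole Unicode code point per element (true on typical Linux toolchains, not on Windows, where `wchar_t` is a 16-bit UTF-16 unit). It does no validation of surrogates or out-of-range values.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper, not part of the original answer.
// Assumption: every wchar_t is a full Unicode code point (32-bit wchar_t).
std::string to_utf8_sketch(const std::wstring& in) {
    std::string out;
    for (wchar_t wc : in) {
        const std::uint32_t cp = static_cast<std::uint32_t>(wc);
        if (cp < 0x80) {
            // 1 byte: 0xxxxxxx
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            // 2 bytes: 110xxxxx 10xxxxxx (covers the Arabic block)
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

Under that assumption, `to_utf8_sketch(s.substr(0,3))` yields six bytes, two per Arabic letter, which can be written to `cout` just like the result of `to_utf8` above.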

Frankly though, I think the substring calls are making weird assumptions. Let me suggest a fix in a minute:
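Whatever fix the answer goes on to suggest, one plausible reading of the "weird assumptions" is the hard-coded offsets: `substr(0,3)` drops the last letter of the four-letter Arabic word, and `substr(4,5)` starts at the space and cuts off the final "o". The sketch below is a guess at the intent, not the answer's own fix; `split_at_first_space` is a hypothetical helper that finds the separator instead of counting characters.

```cpp
#include <string>
#include <utility>

// Hypothetical helper, not from the original answer: splits a wide string
// at its first space instead of relying on fixed character offsets.
std::pair<std::wstring, std::wstring> split_at_first_space(const std::wstring& s) {
    const std::wstring::size_type pos = s.find(L' ');
    if (pos == std::wstring::npos) {
        // No separator: the whole input is the first part.
        return {s, std::wstring()};
    }
    // Everything before the space, and everything after it.
    return {s.substr(0, pos), s.substr(pos + 1)};
}
```

Applied to the string from the question, this gives the complete Arabic word as the first part and "hello" as the second; both still need converting to the terminal's encoding before printing.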
