C++ - Getting a substring of a std::wstring


How can I take a substring of a std::wstring that includes non-ASCII characters?

The following code does not output anything.
(The text is an Arabic word of 4 characters, each taking 2 bytes, followed by the word "hello".)

    #include <iostream>
    #include <string>

    using namespace std;

    int main()
    {
        wstring s = L"سلام hello";
        wcout << s.substr(0,3) << endl;
        wcout << s.substr(4,5) << endl;

        return 0;
    }

This should work (live on Coliru):

    #include <algorithm>
    #include <iostream>
    #include <iterator>
    #include <string>
    #include <boost/regex/pending/unicode_iterator.hpp>

    using namespace std;

    // Convert a range of wide characters (code points) to a UTF-8 std::string.
    template <typename C>
    std::string to_utf8(C const& in)
    {
        std::string result;
        auto out = std::back_inserter(result);
        auto utf8out = boost::utf8_output_iterator<decltype(out)>(out);

        std::copy(begin(in), end(in), utf8out);
        return result;
    }

    int main()
    {
        wstring s = L"سلام hello";

        auto first  = s.substr(0,3);   // first three Arabic letters
        auto second = s.substr(4,5);   // starts at the space (index 4)

        // Write UTF-8 bytes through the narrow stream instead of wcout.
        cout << to_utf8(first)  << endl;
        cout << to_utf8(second) << endl;
    }

Prints

    سلا
     hell
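If Boost is not available, the same conversion can be sketched by hand with only the standard library. This is a minimal sketch, not part of the answer above: the names `append_utf8` and `to_utf8_manual` are hypothetical, and it assumes the wide string holds Unicode code points (32-bit `wchar_t`) or UTF-16 code units (16-bit `wchar_t`), with no validation of malformed input.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: append one Unicode code point to `out` as UTF-8.
inline void append_utf8(std::string& out, std::uint32_t cp)
{
    if (cp < 0x80) {                       // 1 byte: plain ASCII
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {               // 2 bytes (covers Arabic letters)
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {             // 3 bytes
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                               // 4 bytes
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical stand-in for the Boost-based to_utf8 above.
inline std::string to_utf8_manual(const std::wstring& in)
{
    std::string result;
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t cp = static_cast<std::uint32_t>(in[i]);

        // Where wchar_t is 16 bits, a high surrogate followed by a low
        // surrogate encodes a single code point above U+FFFF.
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < in.size()) {
            std::uint32_t lo = static_cast<std::uint32_t>(in[i + 1]);
            if (lo >= 0xDC00 && lo <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
                ++i;
            }
        }
        append_utf8(result, cp);
    }
    return result;
}
```

With this in place, `cout << to_utf8_manual(s.substr(0,3)) << endl;` would write the same UTF-8 bytes as the Boost version, provided the terminal expects UTF-8.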

Frankly, though, I think the substring calls are making weird assumptions. Let me suggest a fix in a minute:

