An error occurs when a C# client sends a Chinese string to a C++ server
Hi:
An error occurs when a C# client sends a Chinese string to the C++ server.
The server receives the string, but the unmarshaled result is wrong,
and an assertion fails when the string's memory is released. So I browsed
the source code of Ice and IceCS, and found the following:
basicstream.cpp:
void
IceInternal::BasicStream::write(const string& v)
{
    Int len = static_cast<Int>(v.size());
    writeSize(len);
    if(len > 0)
    {
        Container::size_type pos = b.size();
        resize(pos + len);
        memcpy(&b[pos], v.c_str(), len);
    }
}
void
IceInternal::BasicStream::read(string& v)
{
    Int len;
    readSize(len);
    if(b.end() - i < len)
    {
        throw UnmarshalOutOfBoundsException(__FILE__, __LINE__);
    }
    if(len > 0)
    {
        v.assign(reinterpret_cast<const char*>(&(*i)), len);
        i += len;
    }
    else
    {
        v.clear();
    }
}
basicstream.cs:
public virtual void writeString(string v)
{
    if(v == null || v.Length == 0)
    {
        writeSize(0);
        return;
    }
    try
    {
        byte[] arr = utf8.GetBytes(v);
        writeSize(arr.Length);
        expand(arr.Length);
        _buf.put(arr);
    }
    catch(Exception)
    {
        Debug.Assert(false);
    }
}
public virtual string readString()
{
    int len = readSize();
    if(len == 0)
    {
        return "";
    }
    try
    {
        if(_stringBytes == null || len > _stringBytes.Length)
        {
            _stringBytes = new byte[len];
        }
        _buf.get(_stringBytes, 0, len);
        return utf8.GetString(_stringBytes, 0, len);
    }
    catch(InvalidOperationException ex)
    {
        throw new Ice.UnmarshalOutOfBoundsException(ex);
    }
    catch(System.ArgumentException ex)
    {
        throw new Ice.MarshalException("Invalid UTF8 string", ex);
    }
    catch(Exception)
    {
        Debug.Assert(false);
        return "";
    }
}
So it seems that the C# implementation uses UTF-8, while the C++ implementation does not support UTF-8 but only MBCS?
When I use System.Text.Encoding.Default instead of UTF8Encoding in the C# client, the server
receives and prints the string correctly.
My platform:
Ice/IceCS 1.5.1, Windows 2000, VS.NET 2003 (7.1.3091), .NET 1.1 (1.1.4322)
Slice:
interface IPrinter
{
    void Print(string s);
};
The Server:
class PrinterImpl : public IPrinter
{
public:
    PrinterImpl(void);
    virtual ~PrinterImpl(void);
    virtual void Print(const ::std::string&, const ::Ice::Current& current);
};
PrinterImpl::PrinterImpl(void)
{
}
PrinterImpl::~PrinterImpl(void)
{
}
void PrinterImpl::Print(const ::std::string& s, const ::Ice::Current& current)
{
    cout<<s<<endl;
}
Comments
-
C++ also supports UTF-8; it just doesn't check whether every std::string being sent or received is really UTF-8 encoded. It is the application's responsibility to make sure that UTF-8 strings are used with Ice. If your application does not use Unicode, you must convert your strings from whatever codeset you use to Unicode first.
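Since Ice for C++ does not perform this check itself, an application could verify a string before handing it to Ice. The following is only a minimal sketch of such a check; `isValidUtf8` is a hypothetical helper written for illustration and is not part of Ice.

```cpp
#include <cstddef>
#include <string>

// Returns true if s consists only of well-formed UTF-8 sequences.
// Hypothetical helper for illustration; it is not an Ice function.
bool isValidUtf8(const std::string& s)
{
    const std::size_t n = s.size();
    std::size_t i = 0;
    while(i < n)
    {
        const unsigned char c = static_cast<unsigned char>(s[i]);
        std::size_t extra;  // number of continuation bytes expected
        unsigned long cp;   // code point accumulated so far
        unsigned long min;  // smallest code point legal for this length

        if(c < 0x80)                { extra = 0; cp = c;        min = 0x0; }
        else if((c & 0xE0) == 0xC0) { extra = 1; cp = c & 0x1F; min = 0x80; }
        else if((c & 0xF0) == 0xE0) { extra = 2; cp = c & 0x0F; min = 0x800; }
        else if((c & 0xF8) == 0xF0) { extra = 3; cp = c & 0x07; min = 0x10000; }
        else { return false; }  // stray continuation byte or invalid lead byte

        if(n - i - 1 < extra)
        {
            return false;       // sequence is cut off at the end of the string
        }
        for(std::size_t k = 1; k <= extra; ++k)
        {
            const unsigned char cc = static_cast<unsigned char>(s[i + k]);
            if((cc & 0xC0) != 0x80)
            {
                return false;   // continuation bytes must look like 10xxxxxx
            }
            cp = (cp << 6) | (cc & 0x3F);
        }
        if(cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        {
            return false;       // overlong form, out of range, or surrogate
        }
        i += extra + 1;
    }
    return true;
}
```

With this sketch, the UTF-8 bytes 0xe6 0x88 0x91 pass the check, while the two-byte ANSI (code page) sequence 0xce 0xd2 is rejected because 0xd2 is not a valid continuation byte.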
Ice for C++ has two utility functions that allow you to convert between a UTF-8 std::string and a UTF-16 std::wstring. Have a look at IceUtil/Unicode.h for details.
-
Of course, a string in C# is always in Unicode format, but the std::string that Ice for C++ uses is in ANSI format. UTF-8 and ANSI encode characters identically only in the range 0x00 to 0x7f. MSDN says VC++ 7.x supports only MBCS and Unicode.
e.g.
Platform: Intel x86
string a = "A"
string w = "我" // Unicode format: 0x62 0x11
Their bytes in C#:
     ansi          unicode (used by the C# compiler)   utf8
a    [0x41]        [0x41 0x00]                         [0x41]
w    [0xce 0xd2]   [0x11 0x62]                         [0xe6 0x88 0x91]
In C++:
     ansi          unicode
a    [0x41]        [0x41 0x00]
w    [0xce 0xd2]   [0x11 0x62]
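The utf8 column above can be reproduced by hand from the Unicode value. The sketch below assumes the code point for w is U+6211, which is what the listed bytes imply; `encodeUtf8` is a hypothetical helper for illustration, not an Ice or .NET function.

```cpp
#include <string>

// Encodes one Unicode code point (U+0000..U+10FFFF) as UTF-8 bytes.
// Hypothetical helper for illustration; no range or surrogate checks.
std::string encodeUtf8(unsigned long cp)
{
    std::string out;
    if(cp < 0x80)
    {
        // 1 byte: 0xxxxxxx (identical to ASCII)
        out += static_cast<char>(cp);
    }
    else if(cp < 0x800)
    {
        // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    else if(cp < 0x10000)
    {
        // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    else
    {
        // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

Calling `encodeUtf8(0x6211)` yields the three bytes 0xe6 0x88 0x91, and `encodeUtf8(0x41)` yields the single byte 0x41, matching the table. The ansi column cannot be derived arithmetically in the same way; it depends on the code page's lookup table.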
When the C# client sends string w, it actually sends the bytes 0xe6 0x88 0x91, because the Ice.BasicStream class of Ice for C# uses UTF-8 encoding (class System.Text.UTF8Encoding). The C++ server then receives the bytes 0xe6 0x88 0x91, and the BasicStream of Ice for C++ cannot correctly read the string from those bytes. It works correctly if the client and server are written in the same language, and incorrectly if they are not.
e.g.
send   Client   Server   Result
w      C#       C#       OK
w      C++      C++      OK
w      C#       C++      server error
w      C++      C#       server error
-
I think you may have misunderstood Marc. The string coming from the C# side is delivered to the C++ side as a UTF-8 encoded string. You need to explicitly convert it to a wide string in your C++ application code using the functions defined in Unicode.h.
Cheers,
Michi.
-
I don't understand exactly what you are asking, i.e., what you expect Ice to do. BasicStream in Ice for C++ doesn't touch the strings it receives, so it doesn't convert anything, and thus cannot have a problem reading any string.
Ice for C++ simply expects that the strings it receives are UTF-8 encoded, and this is what it will get from Ice for C#. If you have a UTF-8 compatible application (for example, one built on a UTF-8 compatible GUI toolkit), then you can use these strings right away. Otherwise, you can convert them to UTF-16 wstrings with the conversion functions mentioned above.
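To illustrate what such a conversion involves, here is a minimal, self-contained sketch of decoding a UTF-8 std::string into a std::wstring. `utf8ToWide` is a hypothetical stand-in written for this example; it is not the function declared in IceUtil/Unicode.h, and it does not reject overlong or surrogate encodings.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Decodes UTF-8 bytes into wide characters. Where wchar_t is 16 bits wide
// (as on Windows), code points above U+FFFF become UTF-16 surrogate pairs.
// Hypothetical helper for illustration; not the Ice implementation.
std::wstring utf8ToWide(const std::string& s)
{
    std::wstring out;
    std::size_t i = 0;
    while(i < s.size())
    {
        const unsigned char c = static_cast<unsigned char>(s[i]);
        std::size_t extra;  // number of continuation bytes expected
        unsigned long cp;   // code point accumulated so far

        if(c < 0x80)                { extra = 0; cp = c; }
        else if((c & 0xE0) == 0xC0) { extra = 1; cp = c & 0x1F; }
        else if((c & 0xF0) == 0xE0) { extra = 2; cp = c & 0x0F; }
        else if((c & 0xF8) == 0xF0) { extra = 3; cp = c & 0x07; }
        else { throw std::runtime_error("invalid UTF-8 lead byte"); }

        if(s.size() - i - 1 < extra)
        {
            throw std::runtime_error("truncated UTF-8 sequence");
        }
        for(std::size_t k = 1; k <= extra; ++k)
        {
            const unsigned char cc = static_cast<unsigned char>(s[i + k]);
            if((cc & 0xC0) != 0x80)
            {
                throw std::runtime_error("invalid UTF-8 continuation byte");
            }
            cp = (cp << 6) | (cc & 0x3F);
        }
        i += extra + 1;

        if(sizeof(wchar_t) == 2 && cp >= 0x10000)
        {
            // Split into a high and a low surrogate for 16-bit wchar_t.
            cp -= 0x10000;
            out += static_cast<wchar_t>(0xD800 + (cp >> 10));
            out += static_cast<wchar_t>(0xDC00 + (cp & 0x3FF));
        }
        else
        {
            out += static_cast<wchar_t>(cp);
        }
    }
    return out;
}
```

In the servant from the original post, the received `s` could be passed through a conversion like this before output. Feeding it ANSI bytes such as 0xce 0xd2 raises an exception instead of silently producing garbage, which makes an encoding mismatch easier to spot.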
If your application doesn't use Unicode in either UTF-8 or UTF-16 encoding, then your own application code must convert between Unicode and whatever character encoding you are using.