My data class that will be serialized into XML looks like this:
    [XmlType(TypeName = "SPCFileInfo")]
    [Serializable]
    public class SPCFileInfoProtocol
    {
        [XmlElement("CompanyCode")]
        public string CompanyCode { get; set; }
        [XmlElement("FileName")]
        public string FileName { get; set; }
        [XmlElement("FileVer")]
        public int FileVer { get; set; }
        [XmlElement("FileSize")]
        public long FileSize { get; set; }
        [XmlElement("CreatedOn")]
        public DateTime CreatedOn { get; set; }
        [XmlElement("LastUpdatedOn")]
        public DateTime LastUpdatedOn { get; set; }
        [XmlElement("FileBytes")]
        public byte[] FileBytes { get; set; }
    }
And here's my serialization utility class:
    public static class XmlSerializer
    {
        public static string SerializeToString<T>(T item)
        {
            if (item == null)
            {
                return null;
            }
            System.Xml.Serialization.XmlSerializer serializer = new System.Xml.Serialization.XmlSerializer(typeof(T));
            XmlWriterSettings settings = new XmlWriterSettings();
            settings.Encoding = new UnicodeEncoding(false, false); // no BOM in a .NET string
            settings.Indent = false;
            settings.OmitXmlDeclaration = false;
            using (StringWriter textWriter = new StringWriter())
            {
                using (XmlWriter xmlWriter = XmlWriter.Create(textWriter, settings))
                {
                    serializer.Serialize(xmlWriter, item);
                }
                return textWriter.ToString();
            }
        }
        public static T DeserializeFromString<T>(string xmlString)
        {
            T item = default(T);
            try
            {
                using (StringReader stringReader = new StringReader(xmlString))
                {
                    System.Xml.Serialization.XmlSerializer xmlSerializer =
                        new System.Xml.Serialization.XmlSerializer(typeof(T));
                    item = (T)xmlSerializer.Deserialize(stringReader);
                }
            }
            catch (Exception ex)
            {
                Trace.WriteLine(ex.ToString());
            }
            return item;
        }
    }
Serialization into XML works fine, but when I attempt to deserialize, I get the following exception:
XMLException: There is an error in XML document. hexadecimal value 0x00, is an invalid character.
Upon investigation, I found out that certain character codes are not valid in an XML document. Removing the invalid characters is not an option, since they constitute the bytes of a file.
My question is how do you serialize/deserialize a data class like above into XML without stripping invalid bytes? If this is not possible, what are some viable alternatives?
Edit: Upon request, here's the full stack trace of the error:
    System.InvalidOperationException: There is an error in XML document (1, 21933). ---> System.Xml.XmlException: '.', hexadecimal value 0x00, is an invalid character. Line 1, position 21933.
       at System.Xml.XmlTextReaderImpl.Throw(Exception e)
       at System.Xml.XmlTextReaderImpl.Throw(String res, String[] args)
       at System.Xml.XmlTextReaderImpl.ParseText(Int32& startPos, Int32& endPos, Int32& outOrChars)
       at System.Xml.XmlTextReaderImpl.ParseText()
       at System.Xml.XmlTextReaderImpl.ParseElementContent()
       at System.Xml.XmlTextReaderImpl.Read()
       at System.Xml.XmlTextReader.Read()
       at System.Xml.XmlReader.ReadElementString()
       at Microsoft.Xml.Serialization.GeneratedAssembly.XmlSerializationReaderSPCCommandProtocol.Read2_SPCCommandProtocol(Boolean isNullable, Boolean checkType)
       at Microsoft.Xml.Serialization.GeneratedAssembly.XmlSerializationReaderSPCCommandProtocol.Read3_SPCCommand()
       --- End of inner exception stack trace ---
       at System.Xml.Serialization.XmlSerializer.Deserialize(XmlReader xmlReader, String encodingStyle, XmlDeserializationEvents events)
       at System.Xml.Serialization.XmlSerializer.Deserialize(XmlReader xmlReader)
       at NextSPCFileUpdater.Utilities.XmlSerializer.DeserializeFromString[T](String xmlString) in C:\Source Codes\SPC\nextspc-fileupdater\NextSPCFileUpdater\Utilities\XmlSerializer.cs:line 48
And here's the new version of the deserialization method:
    public static T DeserializeFromString<T>(string xmlString)
    {
        T item = default(T);
        try
        {
            using (StringReader stringReader = new StringReader(xmlString))
            using (XmlTextReader xmlTextReader = new XmlTextReader(stringReader) { Normalization = false })
            {
                System.Xml.Serialization.XmlSerializer xmlSerializer =
                    new System.Xml.Serialization.XmlSerializer(typeof(T));
                item = (T)xmlSerializer.Deserialize(xmlTextReader);
            }
        }
        catch (Exception ex)
        {
            Trace.WriteLine(ex.ToString());
        }
        return item;
    }
As you've noticed, there are lots of characters that may not be present in an XML document. These can be included in your data, however, using the proper escape sequence.
The default settings of the XmlTextReader cause it to mishandle this; I think it interprets the escape sequences prematurely, but I'm not precisely certain. If I recall correctly, the XmlSerializer will create an XmlTextReader to wrap the TextReader you pass it. To override that, you need to create one yourself, setting its `Normalization` property to `false`.

Regardless of whether my recollection of the causes of the problem is correct, however, setting `Normalization` to `false` will solve your problem. In your case, that means wrapping your StringReader in an XmlTextReader yourself and passing that reader to Deserialize.
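As a rough illustration of the difference, here is a minimal sketch that reads the same XML twice: once by handing the TextReader straight to the serializer, and once through an XmlTextReader with `Normalization` set to `false`. The `Payload` type, the `NormalizationDemo` class, and the sample XML are made up for this example; only `XmlSerializer`, `XmlTextReader`, and `StringReader` come from the framework.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Serialization;

// Hypothetical sample type, not part of the original question.
public class Payload
{
    public string Text { get; set; }
}

public static class NormalizationDemo
{
    // Lets XmlSerializer build its own reader around the TextReader.
    public static string ReadDefault(string xml)
    {
        var serializer = new XmlSerializer(typeof(Payload));
        using (var stringReader = new StringReader(xml))
        {
            return ((Payload)serializer.Deserialize(stringReader)).Text;
        }
    }

    // Supplies an XmlTextReader with Normalization switched off.
    public static string ReadWithoutNormalization(string xml)
    {
        var serializer = new XmlSerializer(typeof(Payload));
        using (var stringReader = new StringReader(xml))
        using (var xmlTextReader = new XmlTextReader(stringReader) { Normalization = false })
        {
            return ((Payload)serializer.Deserialize(xmlTextReader)).Text;
        }
    }

    public static void Main()
    {
        // The NUL character is written as a numeric character reference.
        string xml = "<Payload><Text>a&#x0;b</Text></Payload>";

        try
        {
            ReadDefault(xml);
            Console.WriteLine("Default reader accepted the document.");
        }
        catch (InvalidOperationException ex)
        {
            // XmlSerializer wraps the underlying XmlException.
            Console.WriteLine("Default reader failed: " + ex.InnerException?.Message);
        }

        string text = ReadWithoutNormalization(xml);
        Console.WriteLine("Characters read without normalization: " + text.Length);
    }
}
```

The second method has the same shape as the updated `DeserializeFromString` in the question; the only thing that changes between the two calls is who constructs the reader.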
As an aside, most will find your code far more readable if you use some `using` directives. Still more will find it more readable if you use `var` (though I have at least one colleague who disagrees).
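For what it's worth, a sketch of the deserialization method with both suggestions applied might look like the following. The class name `XmlStringSerializer` is a hypothetical rename: once `System.Xml.Serialization` is imported, a utility class that is itself called `XmlSerializer` would hide the framework type inside its own body, so the short name only works if the two no longer collide.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Xml;
using System.Xml.Serialization;

// Hypothetical name, chosen so it does not clash with
// System.Xml.Serialization.XmlSerializer.
public static class XmlStringSerializer
{
    public static T DeserializeFromString<T>(string xmlString)
    {
        var item = default(T);
        try
        {
            using (var stringReader = new StringReader(xmlString))
            using (var xmlTextReader = new XmlTextReader(stringReader) { Normalization = false })
            {
                var xmlSerializer = new XmlSerializer(typeof(T));
                item = (T)xmlSerializer.Deserialize(xmlTextReader);
            }
        }
        catch (Exception ex)
        {
            // Same behavior as the original: log and return default(T).
            Trace.WriteLine(ex.ToString());
        }
        return item;
    }
}
```

A call such as `XmlStringSerializer.DeserializeFromString<string>("<string>a&#x0;b</string>")` exercises it without needing a custom type. Behavior is unchanged from the version in the question; only the type names are shorter.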