
Error parsing XML files > 2 GB on Windows

Loading XML files on Windows fails with an error once the file size exceeds 2 GB. I've tracked the bug down to expat's XML_GetCurrentByteIndex(), but I can't work out why it happens, as there is some type conversion going on that I don't fully follow.

typedef long long XML_Index;

XML_Index XMLCALL
XML_GetCurrentByteIndex(XML_Parser parser) {
  if (parser == NULL)
    return -1;
  if (parser->m_eventPtr)
    return (XML_Index)(parser->m_parseEndByteIndex
                       - (parser->m_parseEndPtr - parser->m_eventPtr));
  return -1;
}

Once the stream position passes the 2 GB mark, the returned index becomes negative on Windows. Casting it to unsigned long before the conversion to vtkTypeInt64 keeps the return value positive, and the parser is happy to continue. Since unsigned long is 32 bits wide on Windows, this workaround can only hold up to 4 GB.
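
To make the wrap-around concrete, here is a minimal sketch of the arithmetic. It assumes (this is not confirmed above) that the index passes through a 32-bit signed type somewhere on Windows. The helper names truncate_to_32bit, recover_via_unsigned and run_examples are hypothetical and exist only for illustration; they are not part of expat or VTK.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a byte index squeezed through a 32-bit signed type.
// The uint32 -> int32 step is modular on two's-complement targets
// (guaranteed since C++20, implementation-defined before that).
inline std::int32_t truncate_to_32bit(std::int64_t offset) {
  return static_cast<std::int32_t>(static_cast<std::uint32_t>(offset));
}

// The workaround from the post: reinterpret as unsigned before widening.
inline std::int64_t recover_via_unsigned(std::int32_t truncated) {
  return static_cast<std::int64_t>(static_cast<std::uint32_t>(truncated));
}

inline void run_examples() {
  const std::int64_t just_past_2gb = (std::int64_t{1} << 31) + 100;
  const std::int64_t just_past_4gb = (std::int64_t{1} << 32) + 100;

  // Past 2 GB the truncated index is negative...
  assert(truncate_to_32bit(just_past_2gb) < 0);
  // ...and the unsigned cast restores the true offset.
  assert(recover_via_unsigned(truncate_to_32bit(just_past_2gb)) == just_past_2gb);

  // Past 4 GB the high bits are already lost, so the cast cannot help.
  assert(recover_via_unsigned(truncate_to_32bit(just_past_4gb)) == 100);
}
```

Calling run_examples() from a main() exercises all three cases. If this model matches what happens in the parser, the cast is a stopgap for files between 2 GB and 4 GB rather than a general fix.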

It is probably relevant that, as far as I can tell, std::streampos is defined as a long (32-bit) data type on 64-bit Windows, so files larger than 4 GB can never be handled wherever ifstream::tellg() and ifstream::seekg() are used during parsing.
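
The width of the stream offset type varies between toolchains, so it may be worth measuring it rather than assuming. The sketch below reports the width of std::streamoff (the signed offset type behind seekg() and tellg() arithmetic) next to plain long. The function names stream_offset_bits and long_bits are hypothetical helpers, not part of any library.

```cpp
#include <climits>
#include <cstddef>
#include <ios>

// Width in bits of the offset type used for stream positioning.
constexpr std::size_t stream_offset_bits() {
  return sizeof(std::streamoff) * CHAR_BIT;
}

// Width in bits of plain long (32 on Windows, typically 64 on
// 64-bit Linux and macOS).
constexpr std::size_t long_bits() {
  return sizeof(long) * CHAR_BIT;
}
```

Printing both values from the build that shows the problem would tell whether the stream layer itself is limited to 32-bit offsets, or whether only the index returned by the parser is.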

vtkTypeInt64 vtkXMLParser::GetXMLByteIndex()
{
  XML_Parser parser = static_cast<XML_Parser>(this->Parser);
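#ifdef _WIN32
  // Workaround: go through unsigned long so that an index which wrapped
  // negative past 2 GB is read back as a positive value.
  unsigned long result = static_cast<unsigned long>(XML_GetCurrentByteIndex(parser));
  return result;
#else
  return XML_GetCurrentByteIndex(parser);
#endif
}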

Has anyone with better C++ knowledge got any thoughts on this?

Edited by Todd