Bugs found in previous releases of CMarkup are already fixed but the notices and fixes remain here for users who are still working with old releases. The great thing about a source code product is that you can fix the bug right in your program and be done with it, no temporary work-arounds, no waiting for the next release.
11.3 Bug: crash reading value
Joon-Hong Jo 16-Dec-2010
CMarkup xml;
xml.SetDocFlags(CMarkup::MDF_COLLAPSEWHITESPACE);
...
if (xml.GetChildData().length() <= 0)
<?xml version="1.0" encoding="utf-8"?>
<Properties>
<MdlData>
<Path>\DIR\PATH\</Path>
<Name>0123_4567</Name>
<Type>ABC</Type>
</MdlData>
<Parameters>
<Parameter>
<Name>XYZ_PART_NO</Name>
<Value>
</Value>
</Parameter>
<Parameter>
When reading [the Value element data], my code crashes.
(fixed in 11.4)
The MDF_TRIMWHITESPACE and MDF_COLLAPSEWHITESPACE modes introduced in release 11.3 crash on values that are only whitespace.
Fix for 11.3: in Markup.cpp:3191 add > 0 as follows:
if ( bAlterWhitespace && nCharWhitespace > 0 )
11.0 Bug: file read and write modes
07-May-2009
(fixed in 11.1)
If you are using either read or write file mode (the Open method) this is an important fix. Please replace the ElemStack Unslot method in Markup.cpp line 1274 as follows:
void Unslot( TagPos& lp )
{
    int n=lp.iSlotNext,p=lp.iSlotPrev;
    if (n) pL[n].iSlotPrev=p;
    if (p) pL[p].iSlotNext=n;
    else anTable[lp.nSlot]=n;
};
*Thanks Dave Terracino
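The Unslot logic above removes an entry from a doubly linked chain that hangs off a hash slot. The following is a minimal standalone sketch of that idea, not CMarkup's actual ElemStack: the SlotTable wrapper, its constructor, and the Slot method are hypothetical, and only the member names (pL, anTable, nSlot, iSlotNext, iSlotPrev) are taken from the fix. Index 0 is assumed to mean "no entry".

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the entries kept by ElemStack.
struct TagPos {
    int nSlot = 0;      // hash slot this entry is chained in
    int iSlotNext = 0;  // next entry in the same slot (0 = none)
    int iSlotPrev = 0;  // previous entry in the same slot (0 = none)
};

// Hypothetical container; only Unslot mirrors the posted fix.
struct SlotTable {
    std::vector<TagPos> pL;    // entries; pL[0] is unused
    std::vector<int> anTable;  // head entry index per slot (0 = empty)

    SlotTable(std::size_t nEntries, std::size_t nSlots)
        : pL(nEntries + 1), anTable(nSlots, 0) {}

    // Push entry i at the head of the chain for slot nSlot.
    void Slot(int i, int nSlot) {
        assert(i > 0 && static_cast<std::size_t>(i) < pL.size());
        int head = anTable[nSlot];
        pL[i].nSlot = nSlot;
        pL[i].iSlotNext = head;
        pL[i].iSlotPrev = 0;
        if (head) pL[head].iSlotPrev = i;
        anTable[nSlot] = i;
    }

    // Same three steps as the replacement Unslot: patch the next entry,
    // patch the previous entry, or move the slot head if there is none.
    void Unslot(int i) {
        TagPos& lp = pL[i];
        int n = lp.iSlotNext, p = lp.iSlotPrev;
        if (n) pL[n].iSlotPrev = p;
        if (p) pL[p].iSlotNext = n;
        else anTable[lp.nSlot] = n;
    }
};
```

Removing a middle entry relinks its two neighbors, while removing the head entry updates anTable instead, which is the case the else branch covers.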
10.0 Issue: performance comparison
Michael 10-Oct-2008
(fixed in 10.1)
When I upgraded to CMarkup 10, I ran a small benchmark program that does basic XML writing and reading. Unfortunately CMarkup 10 is slower than revision 9 when writing XML files. There also seems to be a scaling problem:
CMarkup 9: 10000 Runs 381ms, 100000 Runs 2814ms
CMarkup 10: 10000 Runs 711ms (!), 100000 Runs 38635ms (> 10 times slower!)
[Sorry, this is a bug only when generating documents with MFC CString (not STL string), noticeable when generating documents over 100k, which you can fix by removing +n/100 in 2 places in Markup.h (fixed in 10.1).]
This is a performance bug only in the creation of an XML document, such as with the AddElem method. Thank you for discovering this; it is due to a mix-up during testing just before release (and the bug is not in foxe 2.3). As you build a bigger document it becomes noticeable over 100k and gets disproportionately slower due to reallocations (the same issue described in Speed of CMarkup). Timing the following code for nEntries values like 10000 and 100000 illustrates the problem.
CMarkup xml;
for ( int iElem = 0; iElem < nEntries; ++iElem )
    xml.AddElem( _T("elem"), _T("data") );
Fix for 10.0: The changes are only in Markup.h on the two lines with MCD_GETBUFFER defines. Remove the +n/100 so they appear as follows on lines 127 and 145:
#define MCD_GETBUFFER(s,n) new MCD_CHAR[n+1]; s.reserve(n)
#define MCD_GETBUFFER(s,n) s.GetBuffer(n)
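To see why reallocation can dominate as a document grows, here is a generic sketch that counts how many bytes get copied while appending fixed-size chunks to a growing buffer. It does not model the MCD_GETBUFFER macros or CString internals. The function BytesCopied and its two growth policies are assumptions made purely for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Count bytes copied by reallocations while appending nAppends chunks of
// chunkSize bytes. With geometric == false the buffer grows to exactly the
// size needed on every append; with geometric == true the capacity doubles
// whenever doubling is enough to hold the new length.
std::size_t BytesCopied(std::size_t nAppends, std::size_t chunkSize, bool geometric) {
    assert(chunkSize > 0);
    std::size_t length = 0, capacity = 0, copied = 0;
    for (std::size_t i = 0; i < nAppends; ++i) {
        std::size_t needed = length + chunkSize;
        if (needed > capacity) {
            copied += length;  // existing content moves to the new buffer
            capacity = (geometric && capacity * 2 > needed) ? capacity * 2 : needed;
        }
        length = needed;
    }
    return copied;
}
```

With exact-fit growth every append copies the whole document so far, so the total work grows roughly with the square of the document size. With doubling, the total copied stays within a small multiple of the final length.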
7.0-9.0 Issue: Memory checkers report uninitialized variable
April 22, 2008
(fixed in 10.0)
IBM Rational Purify and BoundsChecker may complain that nTagLengths is not initialized before it is used in void SetStartTagLen( int n ) { nTagLengths = (nTagLengths & ~EP_STMASK) + n; };. This is caused by the way tag lengths were implemented in the struct ElemPos nTagLengths member since Release 7.0. It is not actually utilizing uninitialized bits, but the memory checkers might not know that. Bit fields alleviate this confusion and they will be used in the next release, but in the meantime you can fix it yourself as shown below. In addition, EP_STMASK is 0x2fffff instead of 0x3fffff, causing a problem with start tag lengths over a megabyte (it should support 4MB start tags, meaning unusually large attributes). The bit fields implementation is more self-evident and less prone to bugs.
Fix for 7.0 to 9.0: The changes are only in Markup.h in struct ElemPos. First replace the following methods:
int StartTagLen() const { return nStartTagLen; };
void SetStartTagLen( int n ) { nStartTagLen = n; };
void AdjustStartTagLen( int n ) { nStartTagLen += n; };
int EndTagLen() const { return nEndTagLen; };
void SetEndTagLen( int n ) { nEndTagLen = n; };
Then a few lines below, replace the int nTagLengths declaration with these two:
unsigned int nStartTagLen : 22; // 4MB limit for start tag
unsigned int nEndTagLen : 10; // 1K limit for end tag
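The mask problem can be checked with a little arithmetic: 0x3fffff has all 22 low bits set, while 0x2fffff has bit 20 (0x100000, one megabyte) cleared, so that bit is lost from any length that passes through the mask. The sketch below shows this alongside the two bit fields from the fix. The TagLengths struct, MaskStartTagLen, and MakeTagLengths are hypothetical reductions for illustration, not the real ElemPos.

```cpp
#include <cassert>

// Hypothetical reduction of ElemPos to the two bit fields from the fix.
struct TagLengths {
    unsigned int nStartTagLen : 22;  // holds 0..4194303 (4MB limit)
    unsigned int nEndTagLen : 10;    // holds 0..1023 (1K limit)
};

// Apply a mask to a start tag length, as the old masking code would.
constexpr unsigned int MaskStartTagLen(unsigned int n, unsigned int mask) {
    return n & mask;
}

// A full 22-bit mask keeps bit 20; the mistyped mask drops it.
static_assert(MaskStartTagLen(0x100000, 0x3fffff) == 0x100000, "bit 20 kept");
static_assert(MaskStartTagLen(0x100000, 0x2fffff) == 0, "bit 20 lost");

// Build a TagLengths value, checking that both lengths fit their fields.
TagLengths MakeTagLengths(unsigned int nStart, unsigned int nEnd) {
    assert(nStart <= 0x3fffff && nEnd <= 0x3ff);
    TagLengths t{};
    t.nStartTagLen = nStart;
    t.nEndTagLen = nEnd;
    return t;
}
```

With bit fields the compiler does the masking and shifting, so there is no hand-written mask constant to mistype.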
8.2 Bug: SavePos/RestorePos with non-ASCII names
September 5, 2006
(fixed in 8.3)
A report of a crash caused by using the firstobject XML editor tree customization feature with Unicode text turned up an underlying bug in releases 7.0 to 8.2 of CMarkup (MFC) and CMarkupSTL (STL). A non-ASCII character name in SavePos and RestorePos will cause a GPF.
Fix: In struct SavedPosMap in Markup.h or MarkupSTL.h, change the Hash function to read as follows (changes shown in bold):
int Hash( LPCTSTR szName )
{
    unsigned int n=0;
    while (*szName)
        n += (unsigned int)(*szName++);
    return n % SPM_SIZE;
};
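The pre-fix Hash function is not shown here, so the following is only one plausible illustration of how such a hash can produce an out-of-range index: it assumes a signed accumulator and signed character values, in which case characters above 0x7f contribute negative amounts and the remainder can be negative. The names HashSigned, HashUnsigned, and kTableSize (standing in for SPM_SIZE, whose real value is not given) are hypothetical.

```cpp
#include <cassert>

const int kTableSize = 7;  // hypothetical; stands in for SPM_SIZE

// Assumed buggy shape: signed sum of signed characters. For non-ASCII
// bytes the sum, and therefore the remainder, can be negative.
int HashSigned(const signed char* szName) {
    assert(szName != nullptr);
    int n = 0;
    while (*szName) n += *szName++;
    return n % kTableSize;
}

// Shape of the fix: accumulate in unsigned int so that the remainder
// always lands in the range 0..kTableSize-1.
int HashUnsigned(const signed char* szName) {
    assert(szName != nullptr);
    unsigned int n = 0;
    while (*szName) n += (unsigned int)(*szName++);
    return n % kTableSize;
}
```

For a name made of two bytes above 0x7f, such as { -42, -48, 0 } as signed char, HashSigned returns a negative value that would index before the start of a table, while HashUnsigned stays in range.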
apostrophe problem
Bill Brannan 9-Oct-2004
My XML document is suddenly not being accepted in release 7.2 due to an attribute containing an apostrophe.
(fixed in 7.3)
In XML you can use a single quote inside of a double quoted attribute value and vice versa. CMarkup encodes any quote in an attribute value so you will generally not encounter this, but you may come across an existing document with the unencoded quotes in attribute values such as:
<H d="d'd" s='s"s'/>
Release 7.2 introduced a change in the way attribute values are parsed that caused it to reject these.
Fix: An incorrect fix was posted here Oct 6 to Oct 9; please update if you used code posted during those days. In x_ParseNode in Markup.cpp or MarkupSTL.cpp where it says else if ( nNodeType == MNT_ELEMENT ) please change it to read as follows (changes are shown in bold):
if ( *pDoc == _T('\"') && ! (nParseFlags&PD_INQUOTE_S) )
    nParseFlags ^= PD_INQUOTE_D;
else if ( *pDoc == _T('\'') && ! (nParseFlags&PD_INQUOTE_D) )
    nParseFlags ^= PD_INQUOTE_S;
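The two-flag toggle above can be tried in isolation. This is a small standalone sketch, not the real x_ParseNode: FindTagEnd is a hypothetical helper, and the numeric flag values are assumptions (only the names PD_INQUOTE_S and PD_INQUOTE_D come from the fix). It scans a start tag and reports where the closing > is, treating a quote character as plain data while the other kind of quote is open.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Flag values are hypothetical; only the names come from the fix.
enum { PD_INQUOTE_S = 1, PD_INQUOTE_D = 2 };

// Return the index of the '>' that closes the tag beginning at doc[0],
// ignoring any '>' inside a quoted attribute value. A quote character of
// one kind is ordinary data while a quote of the other kind is open.
// Returns std::string::npos if no closing '>' is found.
std::size_t FindTagEnd(const std::string& doc) {
    int nParseFlags = 0;
    for (std::size_t i = 0; i < doc.size(); ++i) {
        char c = doc[i];
        if (c == '"' && !(nParseFlags & PD_INQUOTE_S))
            nParseFlags ^= PD_INQUOTE_D;
        else if (c == '\'' && !(nParseFlags & PD_INQUOTE_D))
            nParseFlags ^= PD_INQUOTE_S;
        else if (c == '>' && !nParseFlags)
            return i;
    }
    return std::string::npos;
}
```

For the sample tag with d="d'd" and s='s"s', both values are skipped as quoted data and the final > of the tag is found.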
heap corruption
Soren Madsen 30-Aug-2004
Whenever a CMarkup object is assigned the value of an empty CMarkup document there is heap corruption (which can be detected in MFC with AfxCheckMemory() after calling AddElem). There is no problem if the CMarkup object on the right hand side contains any elements, so this is a somewhat rare situation. The case that was reported was using xml = CMarkup(); to empty out the xml CMarkup object, but xml.SetDoc(NULL) is a more direct way to do that.
(fixed in 7.2)
The following three cases involve the assignment operator and lead to heap corruption (where xmlEmpty is either a newly instantiated CMarkup object or one on which SetDoc(NULL) has been called):
xmlTest = xmlEmpty;
CMarkup xmlTest( xmlEmpty ); (copy constructor calls assignment operator)
xmlTest = CMarkup(); (a temporary empty CMarkup is instantiated and assigned)
Fix: In operator= in Markup.cpp or MarkupSTL.cpp where it says "Copy used part of the index array," please add a two line if statement so it reads as follows (lines to be added are shown in bold):
// Copy used part of the index array
m_aPos.RemoveAll();
m_aPos.nSize = m_iPosFree;
if ( m_aPos.nSize < 8 )
    m_aPos.nSize = 8;
previous link bug
Bill Brannan 9-Aug-2004
CMarkup seems to lose track of elements after RemoveElem since upgrade to release 7.0.
(fixed in 7.1)
In CMarkup release 7.0, a link to previous element is not set correctly when an element or subdocument is inserted in between sibling elements causing a problem in FindPrevElem and in rare cases after RemoveElem affecting the links used by FindElem. This can introduce a bug in tested code when upgrading from a previous release of CMarkup, so this is an important fix. A bug fix release of CMarkup will be made available soon.
Fix: In x_LinkElem in Markup.cpp or MarkupSTL.cpp where it says "Link in after iPosBefore," please add an else clause to the if statement so it reads as follows (change shown in bold):
// Link in after iPosBefore
pElem->nFlags = 0;
pElem->iElemNext = m_aPos[iPosBefore].iElemNext;
if ( ! pElem->iElemNext )
    m_aPos[m_aPos[iPosParent].iElemChild].iElemPrev = iPos;
else
    m_aPos[pElem->iElemNext].iElemPrev = iPos;
m_aPos[iPosBefore].iElemNext = iPos;
pElem->iElemPrev = iPosBefore;
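The effect of the else clause is easier to see on a tiny index-linked sibling list. The sketch below is a hypothetical reduction, not the real x_LinkElem: the ElemPos struct is cut down to its link members and LinkAfter is an invented free function. As the fix code implies, the first child's iElemPrev is assumed to point at the last sibling.

```cpp
#include <cassert>
#include <vector>

// Hypothetical reduction of ElemPos to the link members used in the fix.
struct ElemPos {
    int iElemParent = 0;
    int iElemChild = 0;  // first child (0 = none)
    int iElemNext = 0;   // next sibling (0 = none)
    int iElemPrev = 0;   // previous sibling; first child points to the last
};

// Link element iPos into the sibling list right after iPosBefore.
void LinkAfter(std::vector<ElemPos>& aPos, int iPosParent, int iPosBefore, int iPos) {
    ElemPos& elem = aPos[iPos];
    elem.iElemParent = iPosParent;
    elem.iElemNext = aPos[iPosBefore].iElemNext;
    if (!elem.iElemNext)
        // Appended at the end: the first child now points back to iPos.
        aPos[aPos[iPosParent].iElemChild].iElemPrev = iPos;
    else
        // Inserted between siblings: the following sibling must point
        // back to iPos. This is the else clause added by the fix.
        aPos[elem.iElemNext].iElemPrev = iPos;
    aPos[iPosBefore].iElemNext = iPos;
    elem.iElemPrev = iPosBefore;
}
```

Without the else branch, an element inserted between two siblings would be reachable going forward but skipped when walking backward through iElemPrev, which matches the FindPrevElem symptom described above.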
empty subdocument bug
Bill Brannan 4-Aug-2004
When upgrading to CMarkup 7.0, after AddChildSubDoc with the following element as a subdocument:
<analysis_name>Analysis</analysis_name>
Debugging shows that GetChildTagName == "analysis_name" so GetChildData SHOULD == "Analysis" but it is returning "" instead. It thinks it is an empty element due to a bug in AddChildSubDoc.
(fixed in 7.1)
Fix: In function x_AddSubDoc in Markup.cpp or MarkupSTL.cpp where it links in parent and siblings, add one line and change one line as follows (changes shown in bold):
// Link in parent and siblings
bool bEmpty = m_aPos[iPos].nFlags & MNF_EMPTY;
x_LinkElem( iPosParent, iPosBefore, iPos );
m_aPos[iPosTempParent].iElemNext = m_iPosDeleted;
m_iPosDeleted = iPosTempParent;
if ( bEmpty )
    m_aPos[iPos].nFlags |= MNF_EMPTY;
parser rejects certain tag names
Stefan Herber 28-Jul-2004
Tagnames like "_Example" produce an error on parsing xml files.
(fixed in 7.1)
The Release 7.0 parser in both CMarkup and CMarkupSTL erroneously rejects tag names starting with underscore and colon. Example error text: "Incorrect tag name character at offset 402."
Fix: In function x_ParseNode in Markup.cpp or MarkupSTL.cpp, I changed it from:
if ( *pDoc > 0x60 || ( *pDoc > 0x40 && *pDoc < 0x5b ) )
to (0x5f is underscore, 0x3a is colon):
if ( *pDoc > 0x60 || ( *pDoc > 0x40 && *pDoc < 0x5b ) || *pDoc == 0x5f || *pDoc == 0x3a )
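The corrected condition can be wrapped in a small predicate to check which first characters it accepts. IsNameStart is a hypothetical helper name, and the body is the same expression as in the fix, applied to a plain int character code.

```cpp
#include <cassert>

// True if character code c passes the corrected tag name start check:
// anything above 0x60 (which includes lowercase a-z), uppercase A-Z,
// underscore (0x5f), or colon (0x3a).
bool IsNameStart(int c) {
    return c > 0x60 || (c > 0x40 && c < 0x5b) || c == 0x5f || c == 0x3a;
}
```

A name such as _Example now passes because IsNameStart('_') is true, while a leading digit or hyphen is still rejected. Note that the c > 0x60 test is deliberately loose and also admits a few punctuation codes such as 0x7b.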
6.6 Bug: Comma in Error String
May 26, 2004
(fixed in 7.0)
Release 6.6 introduced an extra comma in the error string. After calling SetDoc( strXML ) with ill-formed XML, the result of GetError() has a comma at the beginning. This bug only exists in Release 6.6.
Fix: The change is in the x_ParseDoc() function where there is a 4 line if else clause after the remark that says Combine preserved result with parse error. Add the following if statement and curly brackets around the whole if else clause; in the MFC version Markup.cpp use:
if ( ! csResult.IsEmpty() ) { if ... else ... }
In the STL version MarkupSTL.cpp use:
if ( strResult.size() ) { if ... else ... }
CMarkupMSXML SetDoc Wastes Memory
via Dharmesh Shah 28-Aug-2003
The leak in SetDoc() and x_AddSubDoc() is still in 6.5. ...in our services applications it ended up losing a pointer to a copy of a BSTR the size of a whole XML document we were loading from a string.
(fixed in 6.6)
Fix: In the SetDoc() function in MarkupMSXML.cpp the following line:
_bstr_t bstrDoc(A2BSTR(szDoc));
should read:
_bstr_t bstrDoc(A2BSTR(szDoc),false);
It appears again in x_AddSubDoc() in MarkupMSXML.cpp, where the following line:
_bstr_t bstrSubDoc(A2BSTR(szSubDoc));
should read:
_bstr_t bstrSubDoc(A2BSTR(szSubDoc), false);
MBCS Builds, Double Byte Chars
knight_zhuge 29-Jan-2003
The internal x_TextToDoc function fails to support double-byte characters. This failure occurs when MBCS is defined for the build and you add double-byte characters to your document (i.e. it does not occur in regular ASCII or UTF-8). For example, if the parameter szText is 3 GB2312 Chinese characters (hex D6 D0 B9 FA C8 CB) or 6 bytes, after the loop csText is only 3 bytes (hex D6 B9 C8).
(fixed in 6.5)
Fix: In the x_TextToDoc method in Markup.cpp, change:
++nLen;
to:
nLen += _tclen( pSource );
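The x_TextToDoc loop itself is not reproduced here, so the following is a reconstruction under assumptions: it supposes that each character is copied whole to the output at offset nLen and that only the length increment was wrong. Under that assumption it reproduces the reported result (D6 B9 C8 from D6 D0 B9 FA C8 CB). MbCharLen is a hypothetical stand-in for _tclen that treats any byte with the high bit set as a 2-byte lead byte, and CopyText is an invented name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for _tclen in an MBCS build: a byte with the high
// bit set, followed by another byte, starts a 2-byte character.
std::size_t MbCharLen(const unsigned char* p) {
    return ((*p & 0x80) && p[1]) ? 2 : 1;
}

// Copy text one character at a time. With fixedAdvance == false the output
// length grows by 1 per character (the reported bug); with true it grows
// by the character's byte length (the fix).
std::string CopyText(const std::string& text, bool fixedAdvance) {
    std::string out(text.size() + 1, '\0');
    const unsigned char* pSource =
        reinterpret_cast<const unsigned char*>(text.c_str());
    std::size_t nLen = 0;
    while (*pSource) {
        std::size_t nCharLen = MbCharLen(pSource);
        for (std::size_t i = 0; i < nCharLen; ++i)
            out[nLen + i] = static_cast<char>(pSource[i]);
        pSource += nCharLen;
        nLen += fixedAdvance ? nCharLen : 1;
    }
    out.resize(nLen);
    return out;
}
```

In the buggy mode each 2-byte character overwrites the trail byte of the previous one, leaving only the lead bytes. With the corrected increment all six bytes survive.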
Add or Insert SubDoc
Tony Nancarrow 18-Oct-2002
I have discovered what appears to be a bug in x_AddSubDoc (called from AddSubDoc, InsertSubDoc, AddChildSubDoc and InsertChildSubDoc). If the document to be added or inserted to the parent document contains a processing instruction such as: <?xml version="1.0"?> then the software gets stuck inside an infinite loop within x_AddSubDoc. The problem code appears to be the loop:
// Skip version tag or DTD at start of subdocument
TokenPos token( szSubDoc );
int nNodeType = x_ParseNode( token );
while ( nNodeType && nNodeType != MNT_ELEMENT )
{
    token.szDoc = &szSubDoc[token.nNext];
    token.nNext = 0;
    nNodeType = x_ParseNode( token );
}
(fixed in 6.5)
Fix: In the x_AddSubDoc method in either Markup.cpp or MarkupSTL.cpp, change:
token.szDoc = &szSubDoc[token.nNext];
to:
token.szDoc = &token.szDoc[token.nNext];
UNICODE DecodeBase64
Eric Mathieu 24-May-2002
DecodeBase64 is not working in the Windows CE (UNICODE build) of CMarkup.
(fixed in 6.4)
(this feature is only in CMarkup Developer and the free XML editor FOAL C++ scripting)
Fix: In the Markup.cpp DecodeBase64 function, change:
const BYTE* pBase64 = (const BYTE*)(LPCTSTR)csBase64;
to:
LPCTSTR pBase64 = (LPCTSTR)csBase64;
FindChildElem and the level tracker
Jonnie White 14-Mar-2002
If I look for a child element using FindChildElem("/root/list/thing"), I find the right element. The main position is moved to list, and the child position to thing. The level counter doesn't change though:
m_Doc.ResetPos();
m_Doc.FindChildElem("/root/list/thing");
// main pos is list, child pos is thing, level is 0!
// if I then try to navigate out
m_Doc.OutOfElem();
// I can't, because we are already at level 0
(fixed in 6.3)
Paths In CMarkup (this feature is only in CMarkup Developer and the free XML editor FOAL C++ scripting)
Fix: The level is not tracked in release 6.3. To take care of it in previous releases by hand, the if statement in OutOfElem should be
if ( m_iPos && m_aPos[m_iPos].iElemParent )
instead of
if ( m_iPos && m_nLevel > 0 )
Load/Save Exception Memory Leak
Nikolay Sokratov 20-Feb-2002
Bug: everywhere you use CFileException * don't forget to delete it. That will fix the memory leak caused by the exception not being deleted.
catch (CFileException*e)
{
    e->Delete();
    return FALSE;
}
(fixed in 6.3)
Fix: Exceptions are no longer used for compatibility with Windows CE. The CFile constructor is replaced with the Open method that catches the exception. But if you need to implement the fix by hand in releases before 6.3, do as directed above.
CMarkup on Windows CE
Reto Bucher 15-Jan-2002
First, Exceptions are not supported under WinCE (File Handling) and second, two Macros do not exist under CE (_tclen, _tccpy)
(fixed in 6.3)
Fix: If you need to implement the fixes by hand in releases 6.1 and 6.2 do the following:
1. Add the following two defines near the top of Markup.cpp because the double-byte character function defines seem not to be available:
#define _tclen(p) 1
#define _tccpy(p1,p2) *(p1)=*(p2)
2. Remove exception handling from the Load method by using CFile::Open rather than the CFile constructor:
bool CMarkup::Load( LPCTSTR szFileName )
{
    CString csDoc;
    CFile file;
    if (!file.Open(szFileName, CFile::modeRead))
        return false;
    int nLength = file.GetLength();
#if defined(UNICODE)
    // Allocate Buffer for UTF-8 file data
    unsigned char* pBuffer = new unsigned char[nLength + 1];
    nLength = file.Read( pBuffer, nLength );
    pBuffer[nLength] = '\0';
    // Convert file from UTF-8 to Windows UNICODE (AKA UCS-2)
    int nWideLength = MultiByteToWideChar(CP_UTF8,0,
        (const char*)pBuffer,nLength,NULL,0);
    nLength = MultiByteToWideChar(CP_UTF8,0,
        (const char*)pBuffer,nLength,
        csDoc.GetBuffer(nWideLength),nWideLength);
    ASSERT( nLength == nWideLength );
    delete [] pBuffer;
#else
    nLength = file.Read( csDoc.GetBuffer(nLength), nLength );
#endif
    csDoc.ReleaseBuffer(nLength);
    file.Close();
    return SetDoc( csDoc );
}
March 12, 2011
(fixed in 11.5)
Another bug in MDF_TRIMWHITESPACE and MDF_COLLAPSEWHITESPACE. With one of these flags turned on in SetDocFlags, GetData will return "a" instead of "a <" (losing the encoded less than sign) in the following XML:
Fix for 11.4: in Markup.cpp:3076 insert code to set nCharWhitespace = 0 as follows: