The problem arises when a validator stops after the first error, or when it reports more false positives, such as undeclared tags, than violations of the things your doctype actually asks it to check.

If you do know how to write your own validators, you can often write better ones faster than you could write your own model-to-XML transformers and your own doctype. Even in that case, though, it's easier to change an existing DTD or model-to-XML transformer than to modify your own custom validator.

Or, depending on the kind of data you store, some new characters may appear (maybe a user with a foreign name), and from that moment on a proper character set declaration may be important.

The same applies if you just send an XML file from a Windows system to a Linux system.
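To make the character set point concrete, here is a minimal sketch using Python's standard `xml.etree.ElementTree` (it assumes Python 3.8 or later for the `xml_declaration` argument). The `user`/`name` elements and the sample name are made up for illustration. It writes the encoding into the XML declaration, parses the bytes back, and shows what happens when the same bytes are decoded with a different default such as cp1252.

```python
import xml.etree.ElementTree as ET

# Hypothetical record containing non-ASCII characters.
root = ET.Element("user")
ET.SubElement(root, "name").text = "Zoë Åkesson"

# Declare the encoding explicitly so the reader does not have to guess.
data = ET.tostring(root, encoding="utf-8", xml_declaration=True)

# The parser picks up the declared encoding from the XML declaration.
parsed = ET.fromstring(data)

# Decoding the same bytes with another platform's default code page
# silently yields different characters.
garbled = data.decode("cp1252")
```

Parsing the bytes directly, rather than decoding them to a string first, lets the parser honor whatever encoding the file declares.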

At the very least, it gives you a way to tell which parts of the DOM are "standard" and which are either metadata (e.g. a tag that AngularJS will transform later) or junk data (typos, metadata that's no longer needed, and so on). In each case, those custom tags add complexity to your model, and validating the XML can help you see which parts of the DOM might be causing the most trouble.

I know it makes the XML better and more semantic, but what are the overall benefits of validating XML?

Sure, if you only write out some data for your own application to read, it may be enough if it just works.

If you send a file to somebody else, matters may be different.

But even within your application, you may later choose to switch the parsing library, and the new one may complain about errors the old one accepted and ignored, such as characters that were not properly escaped.
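As a rough illustration of the escaping problem, the sketch below builds the same element twice with Python's standard library: once by naive string concatenation and once with `xml.sax.saxutils.escape`. The `item` element and the sample text are invented for the example. A conforming parser such as `xml.etree.ElementTree` rejects the first version, even if a more lenient reader happened to accept it.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw_text = "Fish & Chips < 5"

# Naive concatenation produces a document that is not well-formed.
broken = "<item>" + raw_text + "</item>"
try:
    ET.fromstring(broken)
    broken_parses = True
except ET.ParseError:
    broken_parses = False

# Escaping &, < and > first gives a document a strict parser accepts.
fixed = "<item>" + escape(raw_text) + "</item>"
item = ET.fromstring(fixed)
```

After parsing, `item.text` holds the original unescaped text again, so escaping loses nothing.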

For example, you might have a JSON model for colors. This can be difficult to maintain, requires very careful coding, and is very fragile in the face of typos, field changes, and so on.

A simple program can be written to transform the above to XML, and for most model formats there are already many online tools that convert them to XML.
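Such a transformer might look like the sketch below. The color model shown here is hypothetical, since the original example is not reproduced in this text, and the element and attribute names (`colors`, `color`, `name`) are assumptions chosen only for illustration.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical color model; field names are assumptions, not the original data.
colors_json = """
{"colors": [{"name": "red", "hex": "#ff0000"},
            {"name": "green", "hex": "#00ff00"}]}
"""

def colors_to_xml(model: dict) -> ET.Element:
    """Turn the hypothetical color model into a <colors> element tree."""
    root = ET.Element("colors")
    for color in model["colors"]:
        # A misspelled field name raises KeyError here instead of passing silently.
        node = ET.SubElement(root, "color", name=color["name"])
        node.text = color["hex"]
    return root

xml_root = colors_to_xml(json.loads(colors_json))
xml_text = ET.tostring(xml_root, encoding="unicode")
```

Once the data is in XML form, an existing DTD or schema validator can check it, and no hand-written checks for the JSON structure need to be maintained.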

I am currently dealing with such a service: they tell you how to form the input, but then they return the response as a string. That string contains complex XML with no schema to validate against, so I have no idea what I am guaranteed to get back from the server, what is optional, or what types may be returned.
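Without a schema, about the best a client can do is encode its own guesses as checks. The sketch below is one way to do that with Python's standard library (it assumes Python 3.9 or later for the `list[str]` annotation). The element names `status`, `id` and `message` are invented, and the function is a hand-rolled stand-in, not a replacement for real schema validation. It reports every problem it finds rather than stopping at the first one.

```python
import xml.etree.ElementTree as ET

# Hypothetical expectations; a published schema would normally state these.
REQUIRED_CHILDREN = {"status", "id"}
KNOWN_CHILDREN = REQUIRED_CHILDREN | {"message"}

def check_response(xml_string: str) -> list[str]:
    """Return all problems found in the response, not just the first."""
    try:
        root = ET.fromstring(xml_string)
    except ET.ParseError as exc:
        # Nothing else can be checked if the document is not well-formed.
        return [f"not well-formed: {exc}"]

    problems = []
    present = {child.tag for child in root}
    for tag in sorted(REQUIRED_CHILDREN - present):
        problems.append(f"missing required element <{tag}>")
    for tag in sorted(present - KNOWN_CHILDREN):
        problems.append(f"unexpected element <{tag}>")
    return problems
```

For instance, `check_response("<response><statuz>ok</statuz></response>")` reports both missing elements and the unexpected `<statuz>` tag in one pass. Every rule here is still a guess about the service, which is exactly the gap a published schema would close.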

