Change One Element in XML File: File Reader/Writer vs DOM vs SAX vs StAX
Member
Posts: 1,208
Joined: Aug 1 2013
Gold: 50.00
Feb 20 2015 11:59am
I have a very small XML file and I know the structure of it. What's the best (in terms of simplicity and efficiency) way to change a single line in place in this file? I'm planning on trying StAX but don't know if that'll be overkill... though Java doesn't really seem to have any simple means of handling text files so other options seem to be worse.

This post was edited by SanityWasHacked on Feb 20 2015 12:02pm
Member
Posts: 11,637
Joined: Feb 2 2004
Gold: 434.84
Feb 24 2015 01:40pm
Quote (SanityWasHacked @ Feb 20 2015 12:59pm)
I have a very small XML file and I know the structure of it. What's the best (in terms of simplicity and efficiency) way to change a single line in place in this file? I'm planning on trying StAX but don't know if that'll be overkill... though Java doesn't really seem to have any simple means of handling text files so other options seem to be worse.


For making small, programmatic edits to an XML file in Java I recommend just using a DOM parser. SAX is low level and you have to juggle the state by yourself and StAX is more than you need.
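A rough sketch of what the DOM route could look like, using only the JDK's built-in parser. The class name `DomEdit`, the helper `setElementText`, and the sample `<config>` document are made up for illustration; they are not from the original question.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class, for illustration only.
class DomEdit {
    // Parses the XML, replaces the text of the first element with the given
    // tag name, and serializes the whole document back to a string.
    static String setElementText(String xml, String tagName, String newText) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            Node node = doc.getElementsByTagName(tagName).item(0);
            if (node == null) {
                throw new IllegalArgumentException("No <" + tagName + "> element found");
            }
            node.setTextContent(newText);

            StringWriter out = new StringWriter();
            TransformerFactory.newInstance().newTransformer()
                    .transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (RuntimeException e) {
            throw e;
        } catch (Exception e) {
            // Parser and transformer failures are wrapped to keep the sketch short.
            throw new IllegalStateException("XML processing failed", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample document.
        String xml = "<config><name>old</name><port>8080</port></config>";
        System.out.println(setElementText(xml, "name", "new"));
    }
}
```

For a file on disk, the same calls accept a `java.io.File`: `parse(new File(...))` on the builder and `new StreamResult(new File(...))` for the output. Note that this rewrites the whole file rather than patching one line in place.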
Member
Posts: 32,925
Joined: Jul 23 2006
Gold: 3,804.50
Feb 24 2015 03:55pm
I don't know what the simplest way is, but personally I'd just use XStream to consume it, change your property, then spit it back out. Of course, you'd have to create one or more classes to reflect your XML structure.
Member
Posts: 13,425
Joined: Sep 29 2007
Gold: 0.00
Warn: 20%
Feb 24 2015 05:57pm
If you only need one line changed and you know the exact line, just use regex.

Don't need to parse shit.
Member
Posts: 1,995
Joined: Jun 28 2006
Gold: 7.41
Feb 24 2015 07:06pm
Quote (AbDuCt @ Feb 24 2015 06:57pm)
If you only need one line changed and you know the exact line, just use regex.

Don't need to parse shit.


I wouldn't use regex to parse XML. Use XPath instead.
Member
Posts: 13,425
Joined: Sep 29 2007
Gold: 0.00
Warn: 20%
Feb 24 2015 07:37pm
Quote (Minkomonster @ Feb 24 2015 09:06pm)
I wouldn't use regex to parse XML. Use XPath instead.


But if it's just a single line you can just use a search and replace regex.
Member
Posts: 1,995
Joined: Jun 28 2006
Gold: 7.41
Feb 24 2015 11:20pm
Quote (AbDuCt @ Feb 24 2015 08:37pm)
But if it's just a single line you can just use a search and replace regex.


Assuming the schema is simple and predictable, sure. Using XPath isn't that much more code, and you can search by node name reliably. With regex, you have a lot to take into consideration, like namespace prefixes, attributes, etc.

Let's say he has a password in the XML page and wants to mask it.

<Password>hunter2</Password>

Regex to find that node?

Pattern: (<Password>).*(</Password>)
Replace String: $1**********$2
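In Java, that search and replace could be sketched like this. The pattern and replacement string are the ones given above; the class name `SimpleMask` and the method `mask` are made up for illustration.

```java
// Hypothetical helper class, for illustration only.
class SimpleMask {
    // Pattern and replacement string as given in the post above.
    static String mask(String xml) {
        return xml.replaceAll("(<Password>).*(</Password>)", "$1**********$2");
    }

    public static void main(String[] args) {
        // prints <Password>**********</Password>
        System.out.println(mask("<Password>hunter2</Password>"));
    }
}
```

Keep in mind that `.*` is greedy and, by default, `.` does not match line breaks, so this only behaves as intended when there is a single Password element and it sits on one line.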

Easy enough. But what if this is a SOAP request for some web service that he is logging, and needs to filter the password so it doesn't get written to disk?

Code
<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<SOAP-ENV:Envelope
    SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
    xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/"
    xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance"
    xmlns:xsd="http://www.w3.org/1999/XMLSchema">
  <SOAP-ENV:Body>
    <ns1:Account>AzureDiamond</ns1:Account>
    <ns1:Password>hunter2</ns1:Password>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>


Now we have namespaces to deal with. You could hardcode the prefix, but if the schema ever changes, you would have to modify the masking logic as well. So, let's make it generic?

Pattern: (<(\w*:)?Password>).*(</\2Password>)
Replace String: $1**********$3
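A Java sketch of the prefix-tolerant version, with one adjustment that is mine rather than the post's: in `java.util.regex`, a backreference to a group that did not take part in the match fails, so an optional prefix group would stop matching plain `<Password>`. Here the prefix group always participates and may be empty. The class name `PrefixMask` and the method `mask` are made up for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class PrefixMask {
    // Group 1: opening tag, group 2: optional prefix such as "ns1:" (possibly
    // empty), group 3: closing tag that must carry the same prefix.
    private static final Pattern PASSWORD =
            Pattern.compile("(<((?:\\w+:)?)Password>).*(</\\2Password>)");

    static String mask(String xml) {
        return PASSWORD.matcher(xml).replaceAll("$1**********$3");
    }

    public static void main(String[] args) {
        // prints <ns1:Password>**********</ns1:Password>
        System.out.println(mask("<ns1:Password>hunter2</ns1:Password>"));
        // prints <Password>**********</Password>
        System.out.println(mask("<Password>hunter2</Password>"));
    }
}
```

The backreference also means mismatched tags such as `<a:Password>x</b:Password>` are left untouched.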

Password is a required field, but what if the user didn't enter one?

Code
<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<SOAP-ENV:Envelope
    SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
    xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/"
    xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance"
    xmlns:xsd="http://www.w3.org/1999/XMLSchema">
  <SOAP-ENV:Body>
    <ns1:Account>AzureDiamond</ns1:Account>
    <ns1:Password xsi:nil="true" />
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>


Now you need a new pattern altogether. Unless you want to try and combine them, and trust me... it doesn't end well.

Oh, and none of these patterns account for whitespace, which may or may not be present. My point is, using regex to parse XML is never a fun thing. XPath is designed to do this very thing. Just point it at the node you want, and let it handle the parsing.
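For comparison, a rough Java sketch of the XPath route using only JDK classes. It selects every element whose local name is Password, whatever the prefix, and leaves empty or nil elements alone. Assumptions on my part: the class name `XPathMask`, the method `maskPasswords`, and the namespace URI `urn:example:account` are made up, and the sample declares `ns1` (the SOAP snippets above leave it undeclared, which a namespace-aware parser would reject).

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, for illustration only.
class XPathMask {
    static String maskPasswords(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed for local-name() to ignore prefixes
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            // Match Password elements regardless of namespace prefix.
            NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate("//*[local-name()='Password']", doc, XPathConstants.NODESET);
            for (int i = 0; i < hits.getLength(); i++) {
                Element password = (Element) hits.item(i);
                // Leave empty elements (for example xsi:nil="true") untouched.
                if (!password.getTextContent().isEmpty()) {
                    password.setTextContent("**********");
                }
            }

            StringWriter out = new StringWriter();
            TransformerFactory.newInstance().newTransformer()
                    .transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            // Parser, XPath, and transformer failures are wrapped to keep the sketch short.
            throw new IllegalStateException("XML processing failed", e);
        }
    }

    public static void main(String[] args) {
        // Made-up namespace URI; the original snippets do not declare ns1.
        String xml = "<Body xmlns:ns1=\"urn:example:account\">"
                + "<ns1:Account>AzureDiamond</ns1:Account>"
                + "<ns1:Password>hunter2</ns1:Password>"
                + "</Body>";
        System.out.println(maskPasswords(xml));
    }
}
```

The same method handles the plain, prefixed, and nil cases without changing the expression, and whitespace inside the tags is the parser's problem rather than yours.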

Member
Posts: 13,425
Joined: Sep 29 2007
Gold: 0.00
Warn: 20%
Feb 24 2015 11:31pm
That is more than one line you minko <3
Member
Posts: 1,995
Joined: Jun 28 2006
Gold: 7.41
Feb 25 2015 07:50am
Quote (AbDuCt @ Feb 25 2015 12:31am)
That is more than one line you minko <3


No it isn't. In all 3 examples, it's just that one node that is being changed. The schema just got more and more complex. This may not be applicable to OP, but it's more theory than practice. XML is not a regular language, and therefore it is a bad candidate for regex. XPath was designed specifically for this purpose.
Member
Posts: 13,425
Joined: Sep 29 2007
Gold: 0.00
Warn: 20%
Feb 25 2015 05:27pm
Quote (Minkomonster @ Feb 25 2015 09:50am)
No it isn't. In all 3 examples, it's just that one node that is being changed. The schema just got more and more complex. This may not be applicable to OP, but it's more theory than practice. XML is not a regular language, and therefore it is a bad candidate for regex. XPath was designed specifically for this purpose.


One node spanning multiple lines though.