Opened 17 years ago
Closed 15 years ago
#587 closed Defect (fixed)
Unescaped ampersands in client_state.xml
Reported by: | Nicolas | Owned by: | davea |
---|---|---|---|
Priority: | Minor | Milestone: | Undetermined |
Component: | Client - Daemon | Version: | |
Keywords: | xml | Cc: |
Description
PrimeGrid uses scripts to generate input files on the fly, so the input file URLs contain question marks and ampersands. The scheduler reply correctly escapes these ampersands as &amp;. But when the client saves the file information in client_state.xml, they get unescaped:
<file_info>
<name>psp_sr2sieve_2837737_cmd</name>
<nbytes>59.000000</nbytes>
<max_nbytes>0.000000</max_nbytes>
<status>1</status>
<url>http://www.primegrid.com/download/psp_sr2sieve_workunit.php?from=3029916000&to=3029916500</url>
</file_info>
This makes client_state.xml non-well-formed XML (note that even Trac's XML syntax highlighting shows the ampersand in red).
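A minimal Python sketch of the failure, for illustration only: it feeds the URL from the fragment above to a standard XML parser (the standard library's xml.etree.ElementTree), once with the bare ampersand and once with it escaped. The helper names is_well_formed and read_url are hypothetical and are not part of BOINC.

```python
import xml.etree.ElementTree as ET

# URL copied from the ticket description, wrapped in a minimal <file_info>.
RAW = (
    "<file_info><url>http://www.primegrid.com/download/"
    "psp_sr2sieve_workunit.php?from=3029916000&to=3029916500</url></file_info>"
)
# Plain replace is safe here only because RAW contains no existing entities.
ESCAPED = RAW.replace("&", "&amp;")


def is_well_formed(xml_text: str) -> bool:
    """Return True if a conforming XML parser accepts xml_text."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


def read_url(xml_text: str) -> str:
    """Parse the fragment and return the unescaped text of <url>."""
    return ET.fromstring(xml_text).findtext("url")
```

With the escaped form, the parser hands back the original URL with a plain ampersand, so escaping on write loses nothing for consumers that use a real parser.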
Change History (12)
comment:1 Changed 17 years ago by
Resolution: | → wontfix |
---|---|
Status: | new → closed |
comment:2 follow-up: 3 Changed 17 years ago by
Resolution: | wontfix |
---|---|
Status: | closed → reopened |
Yes, with anyone using a real parser for client_state.xml (addons). It's pretty simple: what is getting saved is not XML (not valid XML at least).
I made a mockup for a debt changing GUI, using HTML, and a PHP script to read my current debts (no better sample data than real data). It immediately stopped working when I attached to PrimeGrid, because &to is not a valid XML entity.
I detached PrimeGrid, and it still didn't work, because there were accented characters in a result's stderr (Windows gave a localized error message) in ISO-8859-1, but client_state.xml doesn't have a charset declaration, and the XML specification says the default is UTF-8. That accented character made it invalid UTF-8.
That is not XML. That is a format that happens to be based on XML and looks much the same, but has no escaping, needs one tag per line, and has no idea how to handle Unicode. Will my client stop working if I edit client_state.xml and remove all newlines? If yes, BOINC isn't really using XML, because newlines shouldn't matter.
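The charset point can be reproduced with a short Python sketch. The stderr text below is a made-up sample (an assumption, not taken from a real client_state.xml), and the helper name parses is hypothetical. The same ISO-8859-1 bytes are rejected when no encoding is declared, because the parser then assumes UTF-8, and accepted once an XML declaration names the encoding.

```python
import xml.etree.ElementTree as ET

# Hypothetical localized error message with accented characters,
# stored as ISO-8859-1 bytes the way the ticket describes.
TEXT = "Acceso denegado: operaci\u00f3n no v\u00e1lida"
BODY = ("<stderr_out>" + TEXT + "</stderr_out>").encode("iso-8859-1")

# Same bytes, but preceded by a declaration naming the real encoding.
DECLARED = b'<?xml version="1.0" encoding="ISO-8859-1"?>' + BODY


def parses(data: bytes) -> bool:
    """Return True if the byte string is accepted as an XML document."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False
```

Declaring the encoding is only one possible remedy; converting the text to UTF-8 before writing it would work as well.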
comment:3 Changed 15 years ago by
Replying to Nicolas:
Yes, with anyone using a real parser for client_state.xml (addons). It's pretty simple: what is getting saved is not XML (not valid XML at least).
I made a mockup for a debt changing GUI, using HTML, and a PHP script to read my current debts (no better sample data than real data). It immediately stopped working when I attached to PrimeGrid, because &to is not a valid XML entity.
This is causing problems with Boinc.NET as well, because the XMLReader sees these ampersands as 'entity tags'. This invalidates the document and throws an exception. I have a temporary workaround: after the RPC, BEFORE loading the reply into the XMLReader, I replace every '&' with '@'. After the LINQ parse, I replace the '@'s in the URL with '&'s. This works OK so far. As long as the ampersands remain ONLY in URLs, this will always work in Boinc.NET. --Mike
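A variation on this workaround, sketched in Python rather than .NET and not taken from Boinc.NET: instead of swapping '&' for a placeholder and swapping it back afterwards, escape only the ampersands that do not already begin a reference, then parse normally. The function names are hypothetical, the example.org host is a placeholder, and the pattern assumes the input has no DTD, so only XML's five predefined entities and numeric references count as already escaped.

```python
import re
import xml.etree.ElementTree as ET

# Matches '&' unless it starts a predefined entity or a numeric reference.
BARE_AMP = re.compile(r"&(?!(?:amp|lt|gt|quot|apos|#[0-9]+|#x[0-9A-Fa-f]+);)")


def escape_bare_ampersands(xml_text: str) -> str:
    """Turn stray '&' into '&amp;' and leave existing references alone."""
    return BARE_AMP.sub("&amp;", xml_text)


def parse_lenient(xml_text: str) -> ET.Element:
    """Repair stray ampersands, then parse with a strict XML parser."""
    return ET.fromstring(escape_bare_ampersands(xml_text))


# Example reply fragment; the query string is the one quoted in this ticket.
REPLY = (
    "<gui_url>"
    "<url>http://example.org/page.php?val1=1&val2=2&val3=3</url>"
    "<description>a &amp; b</description>"
    "</gui_url>"
)
root = parse_lenient(REPLY)
```

Because the parser unescapes the text again, no second pass is needed to restore the ampersands, and a legitimate '@' elsewhere in the reply is never touched. This repairs ampersands only; a stray '<' in the text would still break parsing.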
comment:4 follow-up: 5 Changed 15 years ago by
Mike, do you mean this problem appears in GUI RPC replies as well?
comment:5 Changed 15 years ago by
Replying to Nicolas:
Mike, do you mean this problem appears in GUI RPC replies as well?
When I do a <get_state> RPC, some of the projects use PHP variable passing in their links, e.g. 'page.php?val1=1&val2=2&val3=3'. That's the sort of thing I'm talking about. The XMLReader in .NET can't deal with this and thinks they are entity tags, as I mentioned above. So far I have NOT found any ampersands anywhere else. I'm waiting for a tester to send me their 'dump' from the RPC call so I can check whether there are other PHP or HTML tags causing problems. I suspect some projects are corrupting the XML in other ways. I'll post what I find. --Mike
comment:6 follow-up: 10 Changed 15 years ago by
Projects specify GUI URLs in XML files on their server; I think the best thing is to demand that these be entity-escaped from the beginning. (it would be messy to repair them in the client, and even then the scheduler RPC reply would still be invalid).
What projects are returning unescaped GUI URLs?
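As a rough illustration of escaping at the source, a project could generate its GUI URL entries with a script instead of typing the XML by hand. The sketch below is an assumption about how that might look, not part of the BOINC server code: the function name gui_url_element is hypothetical, and the element names are the ones that appear in the GUI URL entries quoted in this ticket. It relies on xml.sax.saxutils.escape from the Python standard library, which escapes '&', '<' and '>'.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def gui_url_element(name: str, description: str, url: str) -> str:
    """Build one <gui_url> entry with all text content entity-escaped."""
    return (
        "<gui_url>\n"
        f"    <name>{escape(name)}</name>\n"
        f"    <description>{escape(description)}</description>\n"
        f"    <url>{escape(url)}</url>\n"
        "</gui_url>\n"
    )


# Hypothetical entry whose URL carries a query string with ampersands.
entry = gui_url_element(
    "Map",
    "Sensors & recent events",
    "http://example.org/map.php?lat=1&lon=2",
)
parsed = ET.fromstring(entry)  # raises ParseError if the entry is malformed
```

Parsing the generated entry before publishing it gives an early check that nothing invalid reaches the scheduler reply.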
comment:7 Changed 15 years ago by
I agree. There are other ways to send variables to PHP scripts. As for the projects, the only one I know of for certain is Quake Catcher Network (http://qcn.stanford.edu/qcnalpha/). The one link that shows the map has PHP variable passing in it. I'm still trying to locate other projects.
comment:8 Changed 15 years ago by
I contacted QCN; should be fixed today. Let me know if there are other instances of this.
comment:9 Changed 15 years ago by
YOYO@home has one in the <description> element:
<gui_url>
<name>news: 02 May 2009</name>
<description>-- Stats: OGR & Muon -- BOINC project yoyo@home: Main page News</description>
<url>http://www.rechenkraft.net/yoyo/all_news.php#115</url>
</gui_url>
I'm still waiting for that email to show up. He's in Austria, so it may be late today. BTW, thanks for the RPC I requested.
comment:10 follow-up: 11 Changed 15 years ago by
Replying to davea:
Projects specify GUI URLs in XML files on their server; I think the best thing is to demand that these be entity-escaped from the beginning. (it would be messy to repair them in the client, and even then the scheduler RPC reply would still be invalid).
Project admin enters XML into a file. Server reads the file and sends it to the client. The client sends it to the GUI. The GUI parses it and shows it. And none of the steps ever complains about invalid XML? (in fact most don't even parse what they're passing along)
comment:11 Changed 15 years ago by
Replying to Nicolas:
Replying to davea:
Projects specify GUI URLs in XML files on their server; I think the best thing is to demand that these be entity-escaped from the beginning. (it would be messy to repair them in the client, and even then the scheduler RPC reply would still be invalid).
Project admin enters XML into a file. Server reads the file and sends it to the client. The client sends it to the GUI. The GUI parses it and shows it. And none of the steps ever complains about invalid XML? (in fact most don't even parse what they're passing along)
All I know is, I use LINQ to DataTable querying. In order to do this, I must load the XML into a DataTable using an XMLReader. This is where it chokes on the ampersands. I have tried many ways to parse the XML and this is by far the fastest.
It's not a big deal, as I have found a simple workaround. As long as the ampersands don't show up anywhere that will ruin the schema of the XML, it's all OK. Filtering out the '&'s from the entire XML before loading it into the XMLReader has solved a major problem. Now it's just a matter of making sure they are converted back where they are needed. Unless this becomes more of an issue, I wouldn't worry too much about it. I would, however, make it clear to project admins the trouble this can cause. Maybe a blurb in the docs somewhere?
Thanks Dave & Nicolas
comment:12 Changed 15 years ago by
Resolution: | → fixed |
---|---|
Status: | reopened → closed |
Original problem fixed in [18915].
Does this cause any problems? If not I'm not going to fix it, but will check in the changes if someone else wants to.