ROC data when parsing pepXML #7
Comments
@Owen-Duncan I've looked into this, and here's what I've found. What this means is that there's no way for the automatic parser to know where to expect this data, so you have to advance the input stream to it yourself and unmarshal it directly:
// prepare the input stream
final XMLStreamReader xsr = JaxbUtils.createXmlStreamReader(p, false);
// advance the input stream to the beginning of <peptideprophet_summary>
final boolean foundPepProphSummary = XmlUtils.advanceReaderToNext(xsr, "peptideprophet_summary");
if (!foundPepProphSummary)
throw new IllegalStateException("Could not advance the reader to the beginning of a peptideprophet_summary tag.");
// unmarshal
final PeptideprophetSummary ps = JaxbUtils.unmarshal(PeptideprophetSummary.class, xsr);
Make sure you're using MSFTBX v1.6.1 (it's on Maven Central now); a few fixes were introduced there. I know this is way suboptimal, but I never noticed the issue because nobody ever needed to access that portion of the file. Too bad that the pepXML XSD schema is flawed. Here's a complete example:
public static void main(String[] args) throws Exception {
    // input file
    String pathIn = args[0];
    Path p = Paths.get(pathIn).toAbsolutePath();
    if (!Files.exists(p))
        throw new IllegalArgumentException("File doesn't exist: " + p.toString());

    //////////////////////////////////
    //
    // Relevant part start
    //
    //////////////////////////////////

    // prepare the input stream
    final XMLStreamReader xsr = JaxbUtils.createXmlStreamReader(p, false);
    // advance the input stream to the beginning of <peptideprophet_summary>
    final boolean foundPepProphSummary = XmlUtils.advanceReaderToNext(xsr, "peptideprophet_summary");
    if (!foundPepProphSummary)
        throw new IllegalStateException("Could not advance the reader to the beginning of a peptideprophet_summary tag.");
    // unmarshal
    final PeptideprophetSummary ps = JaxbUtils.unmarshal(PeptideprophetSummary.class, xsr);

    //////////////////////////////////
    //
    // Relevant part end
    //
    //////////////////////////////////

    // use the unmarshalled object
    StringBuilder sb = new StringBuilder();
    sb.append("Input files:");
    for (InputFileType inputFile : ps.getInputfile()) {
        sb.append("\n\t").append(inputFile.getName());
        if (!StringUtils.isNullOrWhitespace(inputFile.getDirectory()))
            sb.append(" @ ").append(inputFile.getDirectory());
    }
    for (RocErrorDataType rocErrorData : ps.getRocErrorData()) {
        sb.append("\n");
        sb.append(String.format("ROC Error data (charge '%s'): \n", rocErrorData.getCharge()));
        // roc_data_points
        for (RocDataPoint rocDataPoint : rocErrorData.getRocDataPoint()) {
            sb.append(String.format("ROC min_prob=\"%.3f\" sensitivity=\"%.3f\" error=\"%.3f\" " +
                    "num_corr=\"%d\" num_incorr=\"%d\"\n",
                    rocDataPoint.getMinProb(), rocDataPoint.getSensitivity(), rocDataPoint.getError(),
                    rocDataPoint.getNumCorr(), rocDataPoint.getNumIncorr()));
        }
        // error_points
        for (ErrorPoint errorPoint : rocErrorData.getErrorPoint()) {
            sb.append(String.format("ERR error=\"%.3f\" min_prob=\"%.3f\" num_corr=\"%d\" num_incorr=\"%d\"\n",
                    errorPoint.getError(), errorPoint.getMinProb(), errorPoint.getNumCorr(), errorPoint.getNumIncorr()));
        }
    }
    System.out.println(sb.toString());
}
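For anyone wondering what the "advance the input stream" step amounts to, here is a rough sketch using only the JDK's StAX API. It is not the MSFTBX implementation: `advanceTo` and `readerFor` are hypothetical helpers written for illustration, and the XML string in `main` is a made-up toy document, not a real pepXML file. The idea is simply to pull events off the `XMLStreamReader` until it sits on the start tag you want, at which point the reader can be handed to an unmarshaller.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class StaxAdvanceSketch {

    /**
     * Moves the reader forward until it is positioned on a start tag with the
     * given local name. Returns false if the document ends before one is found.
     * Hypothetical helper, not part of MSFTBX.
     */
    public static boolean advanceTo(XMLStreamReader xsr, String localName) throws XMLStreamException {
        while (xsr.hasNext()) {
            int event = xsr.next();
            if (event == XMLStreamConstants.START_ELEMENT && localName.equals(xsr.getLocalName())) {
                return true;
            }
        }
        return false;
    }

    /** Creates a StAX reader over an in-memory XML string (for the toy example only). */
    public static XMLStreamReader readerFor(String xml) throws XMLStreamException {
        return XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
    }

    public static void main(String[] args) throws XMLStreamException {
        // Toy document; the structure is invented for illustration.
        String xml = "<root><other/><peptideprophet_summary>"
                + "<roc_error_data charge=\"all\"/>"
                + "</peptideprophet_summary></root>";
        XMLStreamReader xsr = readerFor(xml);
        boolean found = advanceTo(xsr, "peptideprophet_summary");
        System.out.println("found=" + found + (found ? ", positioned on <" + xsr.getLocalName() + ">" : ""));
    }
}
```

Once the reader is positioned on the start tag, a JAXB unmarshaller can consume just that element, which is presumably what the `JaxbUtils.unmarshal(..., xsr)` call above relies on.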
Thank you! That worked perfectly. For anyone following, I needed to make two modifications to the code; the full listing with both changes is:
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import javax.xml.stream.XMLStreamReader;

public class JAXBPEPXMLFDR {
    public static void main(String[] args) throws Exception {
        // input file
        String pathIn = args[0];
        Path p = Paths.get(pathIn).toAbsolutePath();
        if (!Files.exists(p))
            throw new IllegalArgumentException("File doesn't exist: " + p.toString());
        // prepare the input stream
        final XMLStreamReader xsr = JaxbUtils.createXmlStreamReader(p, false);
        // advance the reader to the beginning of <interprophet_summary>
        final boolean foundPepProphSummary = XmlUtils.advanceReaderToNextRunSummary(xsr, "interprophet_summary");
        final InterprophetSummary ps = JaxbUtils.unmarshall(InterprophetSummary.class, xsr);
        // use the unmarshalled object
        StringBuilder sb = new StringBuilder();
        sb.append("Input files:");
        for (InputFileType inputFile : ps.getInputfile()) {
            sb.append("\n\t").append(inputFile.getName());
            if (!StringUtils.isNullOrWhitespace(inputFile.getDirectory()))
                sb.append(" @ ").append(inputFile.getDirectory());
        }
        for (RocErrorDataType rocErrorData : ps.getRocErrorData()) {
            sb.append("\n");
            sb.append(String.format("ROC Error data (charge '%s'): \n", rocErrorData.getCharge()));
            // roc_data_points
            for (RocDataPoint rocDataPoint : rocErrorData.getRocDataPoint()) {
                sb.append(String.format("ROC min_prob=\"%.3f\" sensitivity=\"%.3f\" error=\"%.3f\" " +
                        "num_corr=\"%d\" num_incorr=\"%d\"\n",
                        rocDataPoint.getMinProb(), rocDataPoint.getSensitivity(), rocDataPoint.getError(),
                        rocDataPoint.getNumCorr(), rocDataPoint.getNumIncorr()));
            }
            // error_points
            for (ErrorPoint errorPoint : rocErrorData.getErrorPoint()) {
                sb.append(String.format("ERR error=\"%.3f\" min_prob=\"%.3f\" num_corr=\"%d\" num_incorr=\"%d\"\n",
                        errorPoint.getError(), errorPoint.getMinProb(), errorPoint.getNumCorr(), errorPoint.getNumIncorr()));
            }
        }
        System.out.println(sb.toString());
    }
}
@Owen-Duncan in 1.6.1 I changed the names of those methods to better reflect what they're doing. Glad it's working for you.
Hi, MSFTBX has been great, I've started using it extensively in an analysis pipeline. When parsing pepXML I'd like to retrieve the roc_data_point entries to determine FDRs at given probabilities. When I parse pepXML to an msmsPipelineAnalysis type the ROC data doesn't seem to be present, though RocErrorData types are in the library. Using iProphet (interprophet) analysis on TPP 5.0.
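As a rough illustration of the "FDR at a given probability" lookup described here, the sketch below works on plain (min_prob, error) pairs once they have been read out of the roc_data_point entries. It assumes the `error` attribute is the model-estimated error rate and treats it as the FDR; `Point`, `errorAtOrAbove` and `minProbForError` are hypothetical names, not MSFTBX classes or methods, and the numbers in `main` are invented.

```java
import java.util.ArrayList;
import java.util.List;

public class RocFdrLookupSketch {

    /** Plain stand-in for one roc_data_point entry (hypothetical, not the MSFTBX class). */
    public static final class Point {
        final double minProb;
        final double error;

        public Point(double minProb, double error) {
            this.minProb = minProb;
            this.error = error;
        }
    }

    /**
     * Returns the error rate of the tabulated point with the smallest min_prob
     * that is still >= threshold, or NaN if no point qualifies.
     */
    public static double errorAtOrAbove(List<Point> points, double threshold) {
        Point best = null;
        for (Point p : points) {
            if (p.minProb >= threshold && (best == null || p.minProb < best.minProb)) {
                best = p;
            }
        }
        return best == null ? Double.NaN : best.error;
    }

    /**
     * Returns the lowest tabulated min_prob whose error rate is <= maxError,
     * or NaN if no point qualifies.
     */
    public static double minProbForError(List<Point> points, double maxError) {
        double best = Double.NaN;
        for (Point p : points) {
            if (p.error <= maxError && (Double.isNaN(best) || p.minProb < best)) {
                best = p.minProb;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Invented values, for illustration only.
        List<Point> points = new ArrayList<>();
        points.add(new Point(0.99, 0.001));
        points.add(new Point(0.95, 0.008));
        points.add(new Point(0.90, 0.015));
        points.add(new Point(0.50, 0.060));

        System.out.println("error at min_prob >= 0.95: " + errorAtOrAbove(points, 0.95));
        System.out.println("lowest min_prob with error <= 0.01: " + minProbForError(points, 0.01));
    }
}
```

With the real library objects, the list could be filled from `rocDataPoint.getMinProb()` and `rocDataPoint.getError()`, the getters used in the listings in this thread. The lookup only picks among tabulated points; interpolating between them would be a separate design choice.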