I've taken another crack at an old problem I haven't been able to fix. My goal is to recreate in Java, with Picard, an acceptable substitute for bedtools' BAM-to-BED conversion. The problem is that my code takes an absurdly long time to run, even on a small file. I was hoping someone would know of any changes that could speed up the conversion.
This part creates a new SwingWorker task in which a BAM file will be converted, and executes it. It also adds a listener that copies the converted file once it is finished and informs the user.
Code:
final BamConverter task = new BamConverter(chosenFile, txtArea);
task.addPropertyChangeListener(new PropertyChangeListener() {
    @Override
    public void propertyChange(PropertyChangeEvent evt) {
        if (task.getState() == StateValue.DONE) {
            try {
                chosenFileHolder = task.get();
                chosenFile = chosenFileHolder;
                txtArea.append("BAM2BED complete!\n");
                txtArea.setCaretPosition(txtArea.getText().length());
            } catch (InterruptedException exp) {
                exp.printStackTrace();
            } catch (ExecutionException exp) {
                exp.printStackTrace();
            }
        }
    }
});
task.execute();
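In case it helps to see the mechanism in isolation: the listener above relies on `java.beans` property-change events (SwingWorker fires its own "state" events on the EDT when it finishes). Here is a minimal, GUI-free illustration of that pattern — the `Task` class and its `"state"` property are made up for the sketch, not part of my real code:

```java
import java.beans.PropertyChangeListener;
import java.beans.PropertyChangeSupport;
import java.util.concurrent.atomic.AtomicInteger;

// Minimal, GUI-free illustration of the property-change mechanism the
// SwingWorker listener relies on. Task and "state" are hypothetical.
public class PropertyChangeDemo {
    static class Task {
        private final PropertyChangeSupport pcs = new PropertyChangeSupport(this);
        private String state = "PENDING";

        void addPropertyChangeListener(PropertyChangeListener l) {
            pcs.addPropertyChangeListener(l);
        }

        void setState(String newState) {
            String old = this.state;
            this.state = newState;
            pcs.firePropertyChange("state", old, newState); // notifies listeners
        }
    }

    public static void main(String[] args) {
        AtomicInteger doneEvents = new AtomicInteger();
        Task task = new Task();
        task.addPropertyChangeListener(evt -> {
            if ("DONE".equals(evt.getNewValue())) {
                doneEvents.incrementAndGet();
            }
        });
        task.setState("STARTED"); // listener sees it, but it isn't DONE
        task.setState("DONE");    // listener reacts to this one
        System.out.println("done events seen: " + doneEvents.get()); // prints 1
    }
}
```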
The task has a loop where the actual conversion takes place; the loop is shown below. Bedtools can convert the same file in maybe 4 minutes or so, but for some reason this loop runs on it for nearly an hour without signs of stopping. I'm okay with something a little slower than bedtools, but the current speed is way too slow. Is this an inherent deficiency, or is there a way to improve the loop? Or maybe I made an error somewhere?
Code:
try {
    FileWriter fstream = new FileWriter(BAMconverted);
    BufferedWriter out = new BufferedWriter(fstream);
    while (iterator.hasNext()) {
        final SAMRecord record = iterator.next();
        if (record.getReadUnmappedFlag()) {
            continue;
        }
        out.write(record.getReferenceName() + "\t" +
                  (record.getAlignmentStart() - 1) + "\t" + // subtract 1 to shift from one-based to zero-based
                  (record.getAlignmentEnd() - 1 + 1) + "\t" + // subtract 1 to shift from one-based to zero-based,
                                                              // then add 1 to shift from inclusive to exclusive
                  record.getReadName() + "\t" +
                  record.getMappingQuality() + "\t" +
                  (record.getReadNegativeStrandFlag() ? "-" : "+"));
        cstepcount2 = cstepcount;
        Integer.toString(cstepcount2);
        innerTextArea.append("" + cstepcount2 + "\n");
        innerTextArea.setCaretPosition(innerTextArea.getText().length());
        cstepcount++;
        if ((cstepcount / 1000) == 0) {
            // processing stuff
            publish(cstepcount); // tell the GUI about the counter tracking conversion steps completed
        }
    }
    out.close();
} catch (IOException ie) {
    ie.printStackTrace();
}
reader.close();
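One thing I noticed while pasting this in: my publish condition `(cstepcount / 1000) == 0` is only true while the counter is still below 1000, so after the first thousand records it never fires again. I assume what I actually wanted is a modulo test that fires once per 1000 records. Here is just the counter logic in isolation (plain Java, no htsjdk or Swing; `divisionFires`/`moduloFires` are throwaway names for this sketch):

```java
// Isolates the progress-counter condition from the conversion loop.
// Each method counts how often its condition would trigger a publish
// over n iterations with the given step.
public class PublishThrottle {
    static int divisionFires(int n, int step) {
        int fires = 0;
        for (int c = 0; c < n; c++) {
            if ((c / step) == 0) fires++; // condition as written in my loop
        }
        return fires;
    }

    static int moduloFires(int n, int step) {
        int fires = 0;
        for (int c = 0; c < n; c++) {
            if (c % step == 0) fires++;   // fires once every `step` records
        }
        return fires;
    }

    public static void main(String[] args) {
        // The division test is true for every one of the first 1000
        // records and then never again:
        System.out.println(divisionFires(5000, 1000)); // prints 1000
        // The modulo test publishes once per 1000 records:
        System.out.println(moduloFires(5000, 1000));   // prints 5
    }
}
```

(That said, the per-record `innerTextArea.append` inside the loop bypasses this throttle entirely, which is part of why I suspect the loop itself.)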