Friday, September 16, 2005

 

Processing Text -- Blank Paragraph Removal

At this point in the proceedings, we may have some blank lines in our story. They could be the result of empty cells, of cells that end in a paragraph return, or of cells that had blank paragraphs within their text. It doesn't really matter where they came from; they've got to go.

There are various techniques for doing this. My preferred approach is this:
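function removeBlankParas(theText){
 // One request to InDesign returns an array with the length of every paragraph
 var myLengths = theText.paragraphs.everyItem().length;
 // Work from the back so that removals don't shift the indexes still to be visited
 for (var j = myLengths.length - 1; j >= 0; j--) {
  // A blank paragraph holds a single character: its paragraph return
  if (myLengths[j] == 1) {
   theText.paragraphs[j].remove();
  }
 }
}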
This approach to processing the paragraphs of a story (or any range of text) seems to be very efficient. Compared with getting the length of each paragraph individually, it dramatically cuts the number of interactions with InDesign. It does take some discipline (in more complicated scenarios) to distinguish between the contents of a JavaScript variable and the contents of the text you're working on, but the pay-off is often great.

In this case, because the action being performed changes the number of paragraphs in the text as the loop proceeds, it is necessary to work from the back. For some other actions, proceeding from the front is needed.
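To see why the direction matters here, this is a minimal sketch that uses a plain JavaScript array as a stand-in for the paragraphs of a story. Nothing in it touches InDesign; the function names removeBlanksBackward and removeBlanksForward are made up for the illustration, and splice() plays the part of remove().

```javascript
// Plain-array stand-in for a story: each string is one paragraph,
// including its trailing return. No InDesign objects are involved.
function snapshotLengths(paras) {
  var lengths = [];
  for (var i = 0; i < paras.length; i++) {
    lengths.push(paras[i].length);
  }
  return lengths;
}

// Walk the snapshot from the back: removing item j never disturbs
// the positions of items 0..j-1, which are the ones still to visit.
function removeBlanksBackward(paras) {
  var lengths = snapshotLengths(paras);
  for (var j = lengths.length - 1; j >= 0; j--) {
    if (lengths[j] == 1) {
      paras.splice(j, 1);
    }
  }
  return paras;
}

// Walk the same snapshot from the front: after the first removal,
// index j in the snapshot no longer matches index j in the text.
function removeBlanksForward(paras) {
  var lengths = snapshotLengths(paras);
  for (var j = 0; j < lengths.length; j++) {
    if (lengths[j] == 1) {
      paras.splice(j, 1);
    }
  }
  return paras;
}

var good = removeBlanksBackward(["One\r", "\r", "\r", "Two\r", "\r"]);
// good is ["One\r", "Two\r"]

var bad = removeBlanksForward(["One\r", "\r", "\r", "Two\r", "\r"]);
// bad is ["One\r", "\r", "\r"]: "Two" was deleted and two blanks survived
```

The forward version goes wrong because the lengths were captured once, up front, while the text keeps shrinking underneath them.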

I just tested the script, and it exposed another problem: some "empty" paragraphs actually have a whitespace character in them. This has been lurking at the back of my mind for a while, because one of the paragraph styles I created a few days ago had a name that ended with a space. Let's write a function for this. I can use it to cleanse the names of the paragraph styles created earlier in the process, and I'll need to use it here before running removeBlankParas().

I'm going to assume that there is a one-to-one relationship between the character positions in the text and the characters in the strings in JavaScript variables. This is a good assumption as long as none of the text is in certain Asian languages.

So, the framework looks like this:
function trimText(theText) {
 // Trims spaces from the front and back of all paragraphs in theText
 // Trims tabs from the end
 var myContents = theText.paragraphs.everyItem().contents;
 for (var j = myContents.length - 1; j >= 0; j--) {
 }
}
The big question is just what to put into the loop, and that I'll have to put off until my next session.
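Pending that, the string side of the problem can at least be sketched in plain JavaScript. This is only one possible starting point, not the finished loop body: countTrim is a hypothetical helper, and it assumes that each entry in myContents is a string that may end with the paragraph's return character ("\r"). It relies on the one-to-one assumption above, so the counts it returns could be read as character counts in the text.

```javascript
// Hypothetical helper (not part of the script above): for one paragraph's
// contents, count the spaces at the front and the spaces/tabs at the end,
// ignoring a trailing paragraph return if there is one.
function countTrim(paraContents) {
  var body = paraContents;
  var hasReturn = body.charAt(body.length - 1) == "\r";
  if (hasReturn) {
    body = body.substring(0, body.length - 1);
  }

  // Leading spaces
  var lead = 0;
  while (lead < body.length && body.charAt(lead) == " ") {
    lead++;
  }

  // Trailing spaces and tabs, never overlapping the leading run
  var trail = 0;
  while (trail < body.length - lead) {
    var c = body.charAt(body.length - 1 - trail);
    if (c != " " && c != "\t") {
      break;
    }
    trail++;
  }

  return { leading: lead, trailing: trail, hasReturn: hasReturn };
}

var r = countTrim("  Hello \t\r");
// r.leading is 2, r.trailing is 2, r.hasReturn is true
```

A paragraph made up entirely of spaces is reported as all leading and no trailing, so nothing gets counted twice. How those counts are turned into edits on the text itself is still the open question.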
