Wednesday, November 19, 2008

Logos: Searching for long, complex sentences

In a previous post I noted how David Lang at Accordance worked out a procedure for finding the longest sentences in the Gospel of Mark. I followed up with two ways to do so in BibleWorks. I had not figured out how to do so in Logos, but I'm happy to report that Vincent Setterholm from Logos has provided a method. (For each program, do keep in mind that we are just searching for sentence length as an approximation of complexity.) Here's what Vincent wrote with all the details, but if you skip past it, I've put together more of a step-at-a-time list. If you just want to jump to the search and give it a try, HERE is the search file to download and copy into your LibronixDLS/GraphicalQueries directory. If you would rather watch a video of the process, click HERE. (The video is 3 minutes long and less than 3 MB.)

We can sort of do this the easy way, but have to check the first verse of each bible book by hand, or we can sort of do it the hard way, and get an unwieldy list of hits with lots of duplicates.

I say 'sort of' because in the graphical query editor we approximate a 'word' as 7 characters, so some slightly shorter sentences with long words will show up (and some longer sentences with short words won't), but that probably isn't a huge deal, since this is a pretty fuzzy way to search for complexity anyway, and one can always tweak the number '30' until one is happy with the result list.

Also, to prevent the problem of not seeing the first hit of the chapter (like the blind-spot with the first sentence of the book), I have to define the search to be bound on the whole book. This means that the best way to see all the hits is to click the 'concordance' button after the search is done, since the usual search results pane won't be very useful here. Then when you click on the hits, realize that it is going to navigate to the major stop BEFORE the long sentence, so the list of verses is sort of 'off by one', and will be sorted by the type of punctuation mark that is used in the preceding sentence.

The hit count will count both punctuation marks, so for a list of sentences, if you care about an accurate count, divide by two.

The way to do this is to open the Gramcord NA 27 and then start a graphic query, create three 'ref' objects that are all to Punctuation, Major Stop in the Gramcord morphology. Position them in a triangle, and double click on the one in the apex and check the options for intervening term: filter and 'exclude' and ok out of that. Then draw an arrow from the left ref to the filter and from the filter to the right ref. Then draw an arrow from the left ref directly to the right ref and double click the arrow to set proximity conditions. Set them to at least 30 words. Then run the search and make sure that in the advanced query dialog you select the open NA27 to run the query against, and then select 'book' as the document level, instead of chapter, verse or article. Once your search hits come up, click the 'concordance' link.

Now I COULD write a search that would even grab the first verse in a book, but I would have to use wild-cards, and that would be very slow (slower than checking the first 27 manually) AND it would display each verse in the search hits many times - indeed, once for each time that there is another word longer than the proximity threshold. So I recommend the easy way.
Here then are the steps to take as described by Vincent:
  1. Open the Gramcord NA 27
  2. Start a graphical query from the menu: File > New > Graphical Query - Provide the name you want and hit OK
  3. Drag a REF object into the search window
  4. When the dialog opens, select Punctuation, Major Stop > OK
  5. Copy the object you have created (right click > copy) and then paste (right-click > paste) two more objects
  6. Move the objects into a triangle pattern, and then double-click the one at the top/apex
  7. When the dialogue opens, click on "Exclude Term" (just below the box with all the fields), then click on Intervening Term (near the bottom) and Filter > OK >> This will create a 'punctuation, major stop' Filter object
  8. Now draw a line from the first (left-most) object to the Filter object at the apex, then a line from the Filter object to the final (right-most) object. (If you haven't drawn lines before, it is a matter of moving your cursor near an object until a gray arrowhead appears and then dragging it to the object you want to connect.)
  9. Now draw a line from the first to the last object and double-click on it
  10. In the dialogue, set your proximity to "At least 50 intervening" > OK >> Your search field should look like the one in the graphic at the top of this post
  11. Click on the Search icon at the top left > When the Advanced Search dialogue box opens, make sure you are using Nestle-Aland 27 with Gramcord; select to search By: Special and choose Bible Book; limit your range to a Bible Text, and type in a single book (in this case, Mark) >> Hit Search
  12. On my older Pentium 4, 3.0 GHz, 3 GB RAM, WinXP system, the search took about 6 minutes to run
  13. Do not be dismayed if it looks like you only get one hit! Click on "Concordance" at the top right of the box, then sort "By reference." The verses listed indicate the end of a sentence just prior to a sentence with 50+ words. (Note Vincent's observation above about 'words' in the graphical query editor. I.e., there will be some sentences with fewer than 50 words.)
  14. Now click on one of the references to open up the NA27 text. You will have to scan through the results, but note that there are colored markers to indicate the major stop before a long sentence and a different colored marker at the end of the sentence.
The only difference between Logos and the other two programs is that Accordance and BibleWorks can be set to look only for periods and question marks, whereas Logos includes semicolons among the major stops.
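
For anyone who finds it easier to think about the search as plain logic, here is a rough Python sketch of what the query is approximating. It is not how Logos works internally; the function names, the sample text, and the set of "major stop" characters are my own assumptions for illustration. It splits a text at major stops, keeps the sentences with at least a given number of words, and then shows how a "7 characters per word" estimate (my guess at what Vincent describes) can give a slightly different list.

```python
import re

# Assumption: treat period, question mark, semicolon, and the Greek raised dot
# as "major stops". Adjust this string for the text you are working with.
MAJOR_STOPS = ".?;·"


def split_sentences(text, stops=MAJOR_STOPS):
    """Split text at every major stop and drop empty pieces."""
    pattern = "[" + re.escape(stops) + "]"
    return [piece.strip() for piece in re.split(pattern, text) if piece.strip()]


def long_sentences(text, min_words=50, stops=MAJOR_STOPS):
    """Return (word_count, sentence) pairs for sentences with at least min_words words."""
    hits = []
    for sentence in split_sentences(text, stops):
        count = len(sentence.split())
        if count >= min_words:
            hits.append((count, sentence))
    return hits


def approx_long_sentences(text, min_words=50, chars_per_word=7, stops=MAJOR_STOPS):
    """Hypothetical character-based estimate: a 'word' is counted as chars_per_word characters."""
    threshold = min_words * chars_per_word
    return [s for s in split_sentences(text, stops) if len(s) >= threshold]


if __name__ == "__main__":
    sample = "one two three. a b c d e f; supercalifragilistic words? x"
    print(long_sentences(sample, min_words=3))
    print(approx_long_sentences(sample, min_words=3))
```

With the small sample above and a threshold of 3 words, the exact count keeps the two sentences that really have three or more words, while the character-based estimate keeps only the two-word sentence with the very long word. That is the same fuzziness Vincent mentions: short sentences with long words can show up, and longer sentences with short words can be missed.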

1 comment:

  1. Wow! That's good but I never would have figured it out. The "filter object" continues to be a mystery to me. My solution (see my comment on "Accordance & BibleWorks (and Logos?): Searching for long, complex sentences") is quick and dirty, but gets the job done. It has the benefit of giving an idea of the length of the sentence.