I'm trying to write custom Elisp functions for word counts based on certain parts of an org-mode buffer, and I was wondering if there is a good way for org-element to parse all headlines matching a certain tag (or property). More specifically, I have a buffer like this:
#+TITLE: My manuscript
Authors, affiliations, etc.
* Abstract :abstract:
Text in the abstract.
* Introduction :body:
Some sample text goes here.
* Methods :body:
Some sample text goes here.
* Results :body:
Some sample text goes here.
* Discussion :body:
Some sample text goes here.
* References :refs:
References go here.
And I want to get a word count for all headlines matching the :body: tag (i.e., 20). So far, I have been using the functions from this answer, which do a great job of returning a word count for the headline at point:
(require 'cl-lib)
(require 'org-element)

(defun org-element-parse-headline (&optional granularity visible-only)
  "Parse current headline.
GRANULARITY and VISIBLE-ONLY are like the args of `org-element-parse-buffer'."
  (let ((level (org-current-level)))
    (org-element-map
        (org-element-parse-buffer granularity visible-only)
        'headline
      (lambda (el)
        (and
         (eq (org-element-property :level el) level)
         (<= (org-element-property :begin el) (point))
         (<= (point) (org-element-property :end el))
         el))
      nil 'first-match 'no-recursion)))

(cl-defun org+-count-words-of-heading (&key (worthy '(paragraph bold italic underline code footnote-reference link strike-through subscript superscript table table-row table-cell))
                                            (no-recursion nil))
  "Count words in the section of the current heading.
WORTHY is a list of things worthy to be counted.
This list should at least include the symbols:
paragraph, bold, italic, underline and strike-through.
If NO-RECURSION is non-nil don't count the words in subsections."
  (interactive (and current-prefix-arg
                    (list :no-recursion t)))
  (let ((word-count 0))
    (org-element-map
        (org-element-contents (org-element-parse-headline))
        '(paragraph table)
      (lambda (par)
        (org-element-map
            par
            worthy
          (lambda (el)
            (cl-incf
             word-count
             (cl-loop
              for txt in (org-element-contents el)
              when (eq (org-element-type txt) 'plain-text)
              sum
              (with-temp-buffer
                (insert txt)
                (count-words (point-min) (point-max))))))))
      nil nil (and no-recursion 'headline))
    (when (called-interactively-p 'any)
      (message "Word count in section: %d" word-count))
    word-count))
I imagine I have to tweak the org-element-parse-headline function to match tags instead of grabbing the headline at point, but does anyone know how to do this? Thanks!
To get results from the whole buffer, not just the headline at point, set the value of level yourself (e.g., to 0, so that headlines at every level are checked) and don't pass the 'first-match argument to org-element-map. Then, to restrict matches to headlines with specific tags, add a condition to the function passed to org-element-map.
(defun my-org-element-parse-headline (&optional granularity visible-only)
  "Return all headlines tagged \"body\" in the current buffer.
GRANULARITY and VISIBLE-ONLY are like the args of `org-element-parse-buffer'."
  (let ((level 0)) ; or restrict level
    (org-element-map
        (org-element-parse-buffer granularity visible-only)
        'headline
      (lambda (el)
        ;; e.g. restrict elements to levels greater than `level'
        (when (< level (org-element-property :level el))
          (and
           ;; match "body" tags
           (member "body" (org-element-property :tags el))
           ;; ...
           el)))
      ;; don't set 'first-match if you want all the matches
      nil nil 'no-recursion)))

;; e.g. results from your example org file
(mapcar (lambda (el) (org-element-property :raw-value el))
        (my-org-element-parse-headline))
;; ("Introduction" "Methods" "Results" "Discussion")
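
To go from the matched headlines to an actual number, here is a minimal sketch that sums the words under every headline carrying a given tag. The name my-org-count-words-by-tag is hypothetical (it is not part of Org). As a simplifying assumption, it counts the raw buffer text between each headline's :contents-begin and :contents-end with count-words, instead of walking the parsed elements as the functions in the question do. That means markup characters, drawers, and subheadings all count as text, and a tagged headline nested inside another tagged headline would be counted twice.

```elisp
(require 'org)
(require 'org-element)

;; Hypothetical helper, not part of Org: a sketch under the
;; assumptions described above.
(defun my-org-count-words-by-tag (tag)
  "Return the total number of words under all headlines tagged TAG.
Only tags set directly on a headline are considered (no inheritance).
Words are counted on the raw buffer text of each headline's contents."
  (let ((total 0))
    (org-element-map
        ;; Headline granularity is enough: only tags and positions are used.
        (org-element-parse-buffer 'headline)
        'headline
      (lambda (el)
        (let ((beg (org-element-property :contents-begin el))
              (end (org-element-property :contents-end el)))
          ;; BEG and END are nil for a headline without contents.
          (when (and beg end
                     (member tag (org-element-property :tags el)))
            (setq total (+ total (count-words beg end)))))))
    total))
```

In the example buffer from the question, evaluating (my-org-count-words-by-tag "body") should give 20, since each of the four :body: sections contains five words.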