Writing $\LaTeX$ in two different languages, where one of them is Left-to-Right (English) and the other is Right-to-Left (Persian) is cumbersome. I found emacs to have a better support for bidirectional text (VSCode is awful for this purpose, and TeXStudio was not ideal). Changing editor solves the issue of rendering source files, yet the tedious task is to surround every Latin script with \lr{} macro (from xepersian package). This macro commands $\LaTeX$ to use the correct font (Latin Modern Roman in my case) and change the directionality of text in the output file.

To automate this task (wrapping \lr{} macro around every English text), I wrote this series of functions in elisp.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
;; Enclose \lr intelligently
(defun my-blank-p (char)
  (string-match-p "^[[:blank:]\n]$" (char-to-string char)))

(defun my-char-type (char)
  (let ((bidi-class (get-char-code-property char 'bidi-class)))
    (cond
     ((seq-contains-p "۰۱۲۳۴۵۶۷۸۹" char) 'Persian)
     ((member bidi-class '(L EN)) 'Latin)
     ((member bidi-class '(R AL)) 'Persian)
     ((my-blank-p char) 'Space)
     (t 'Neutral))))

(defun my-wrap-latin-runs (start end)
  "Wrap Latin runs of text in \\lr{}"
  (interactive "r")
  (save-excursion
    (let ((end-m (copy-marker end t))
          (from nil)
          (to nil)
          (type nil)
          (prv-picked nil))
      (goto-char start)
      (while (< (point) end-m)
        (setq type (my-char-type (char-after)))
        (cond
         ((eq type 'Neutral)
          (setq from (point)))
         ((eq type 'Latin)
          (if (not from)
              (setq from (point)))
          (setq prv t)
          (while (and (< (point) end-m)
                      (not (eq (my-char-type (char-after)) 'Persian)))
            (setq type (my-char-type (char-after)))
            (if (or (eq type 'Latin)
                    (and prv-picked (eq type 'Neutral)))
                (progn
                  (setq to (point))
                  (setq prv-picked t))
              (setq prv-picked nil))
            (forward-char))
          (let ((to-m (copy-marker (point) t)))
            (goto-char to)
            (forward-char)
            (insert "}")
            (goto-char from)
            (insert "\\lr{")
            (goto-char to-m))
          (setq from nil))
         (t
          (setq from nil)))
        (forward-char)))))

(with-eval-after-load 'tex-mode
  (define-key tex-mode-map (kbd "C-c r") 'my-wrap-latin-runs))

The usage is pretty simple. In a .tex file, select a region of text and press C-c r.

This is one of my first attempts to write something in elisp; So I won’t be surprised if there was a noticeable error in this code. The logic of the code might seem a little weird. There are four main principles I wanted to consider:

  • Every Latin character needs to be wrapped.
  • Every Persian character should not be wrapped.
  • For every two Latin character, if there are no Persian character between them, they should be wrapped in a single \lr{}.
  • For a Latin character and a Neutral character, if there are no Persian or Space characters between them, they should be wrapped in a single \lr{}.

References