Writing $\LaTeX$ in two different languages, where one of them is Left-to-Right (English) and the other is Right-to-Left (Persian) is cumbersome. I found emacs to have a better support for bidirectional text (VSCode is awful for this purpose, and TeXStudio was not ideal). Changing editor solves the issue of rendering source files, yet the tedious task is to surround every Latin script with \lr{} macro (from xepersian package). This macro commands $\LaTeX$ to use the correct font (Latin Modern Roman in my case) and change the directionality of text in the output file.

To automate this task (wrapping \lr{} macro around every English text), I wrote this series of functions in elisp.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
;; Enclose \lr intelligently
(defun my-blank-p (char)
  (string-match-p "^[[:blank:]\n]$" (char-to-string char)))

(defun my-char-type (char)
  (let ((bidi-class (get-char-code-property char 'bidi-class)))
    (cond
     ((seq-contains-p "۰۱۲۳۴۵۶۷۸۹" char) 'Persian)
     ((member bidi-class '(L EN)) 'Latin)
     ((member bidi-class '(R AL)) 'Persian)
     ((my-blank-p char) 'Space)
     (t 'Neutral))))

(defun my-wrap-latin-runs (start end)
  "Wrap Latin runs of text in \\lr{}"
  (interactive "r")
  (save-excursion
    (let ((end-m (copy-marker end t))
          (from nil)
          (to nil)
          (type nil)
          (prv-picked nil))
      (goto-char start)
      (while (< (point) end-m)
        (setq type (my-char-type (char-after)))
        (cond
         ((eq type 'Neutral)
          (setq from (point)))
         ((eq type 'Latin)
          (if (not from)
              (setq from (point)))
          (setq prv t)
          (while (and (< (point) end-m)
                      (not (eq (my-char-type (char-after)) 'Persian)))
            (setq type (my-char-type (char-after)))
            (if (or (eq type 'Latin)
                    (and prv-picked (eq type 'Neutral)))
                (progn
                  (setq to (point))
                  (setq prv-picked t))
              (setq prv-picked nil))
            (forward-char))
          (let ((to-m (copy-marker (point) t)))
            (goto-char to)
            (forward-char)
            (insert "}")
            (goto-char from)
            (insert "\\lr{")
            (goto-char to-m))
          (setq from nil))
         (t
          (setq from nil)))
        (forward-char)))))

(with-eval-after-load 'tex-mode
  (define-key tex-mode-map (kbd "C-c r") 'my-wrap-latin-runs))

The usage is pretty simple. In a .tex file, select a region of text and press C-c r.

This is one of my first attempts to write something in elisp; So I won't be surprised if there was a noticeable error in this code. The logic of the code might seem a little weird. There are four main principles I wanted to consider:

  • Every Latin character needs to be wrapped.
  • Every Persian character should not be wrapped.
  • For every two Latin character, if there are no Persian character between them, they should be wrapped in a single \lr{}.
  • For a Latin character and a Neutral character, if there are no Persian or Space characters between them, they should be wrapped in a single \lr{}.

Supporting other types of wrapping

<2025-03-26 Wed>

I realized the same code works perfectly for $ wrap! It is really hard to implement such thing for English-only script, yet in Persian-English script you just need the snippet I wrote above. I modify it to allow different wrapping options.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
(defun my-wrap-latin-runs (start end wrap-start wrap-end)
  "Wrap Latin runs of text"
  (save-excursion
    (let ((end-m (copy-marker end t))
          (from nil)
          (to nil)
          (type nil)
          (prv-picked nil))
      (goto-char start)
      (while (< (point) end-m)
        (setq type (my-char-type (char-after)))
        (cond
         ((eq type 'Neutral)
          (setq from (point)))
         ((eq type 'Latin)
          (if (not from)
              (setq from (point)))
          (setq prv t)
          (while (and (< (point) end-m)
                      (not (eq (my-char-type (char-after)) 'Persian)))
            (setq type (my-char-type (char-after)))
            (if (or (eq type 'Latin)
                    (and prv-picked (eq type 'Neutral)))
                (progn
                  (setq to (point))
                  (setq prv-picked t))
              (setq prv-picked nil))
            (forward-char))
          (let ((to-m (copy-marker (point) t)))
            (goto-char to)
            (forward-char)
            (insert wrap-end)
            (goto-char from)
            (insert wrap-start)
            (goto-char to-m))
          (setq from nil))
         (t
          (setq from nil)))
        (forward-char)))))

(defun define-wrap-latin-command (key start-text end-text)
  (with-eval-after-load 'tex-mode
    (define-key
     tex-mode-map
     (kbd key)
     (lambda (start end)
       (interactive "r")
       (my-wrap-latin-runs start end start-text end-text)))))

(define-wrap-latin-command "C-c r" "\\lr{" "}")
(define-wrap-latin-command "C-c s" "$" "$")
(define-wrap-latin-command "C-c t" "\\lrtt{" "}")

I have not changed the functions my-blank-p and my-char-type.