new latex math plugin, now handles typogrify correctly and does not need template alteration to be used

Barry Steyn 11 years ago
parent
commit
474b7fd2ff
2 changed files with 349 additions and 77 deletions
  1. 46 46
      latex/Readme.md
  2. 303 31
      latex/latex.py

+ 46 - 46
latex/Readme.md

@@ -3,73 +3,73 @@ Latex Plugin For Pelican
 
 This plugin allows you to write mathematical equations in your articles using Latex.
 It uses the MathJax Latex JavaScript library to render latex that is embedded in
-between `$..$` for inline math and `$$..$$` for displayed math. It also allows for 
+between `$..$` for inline math and `$$..$$` for displayed math. It also allows for
 writing equations in by using `\begin{equation}`...`\end{equation}`.
 
 Installation
 ------------
-
 To enable, ensure that `latex.py` is put somewhere that is accessible.
 Then use as follows by adding the following to your settings.py:
 
     PLUGINS = ["latex"]
 
-Be careful: Not loading the plugin is easy to do, and difficult to detect. To
-make life easier, find where pelican is installed, and then copy the plugin
-there. An easy way to find where pelican is installed is to verbose list the
-available themes by typing `pelican-themes -l -v`. 
-
-Once the pelican folder is found, copy `latex.py` to the `plugins` folder. Then 
-add to settings.py like this:
-
-    PLUGINS = ["pelican.plugins.latex"]
-
-Now all that is left to do is to embed the following to your template file 
-between the `<head>` parameters (for the NotMyIdea template, this file is base.html)
+Your site is now capable of rendering latex math using the mathjax JavaScript
+library. No alterations to the template file are needed.
 
-    {% if article and article.latex %}
-        {{ article.latex }}
-    {% endif %}
-    {% if page and page.latex %}
-        {{ page.latex }}
-    {% endif %}
+### Typogrify
+Typogrify will now play nicely with Latex (i.e. typogrify can be enabled
+and Latex will be rendered correctly). In order for this to happen,
+version 2.07 (or above) of typogrify is required. In fact, this plugin expects
+that at least version 2.07 is present and will fail without it.
 
 Usage
 -----
-Latex will be embedded in every article. If however you want latex only for
-selected articles, then in settings.py, add
-
-    LATEX = 'article'
-
-And in each article, add the metadata key `latex:`. For example, with the above
-settings, creating an article that I want to render latex math, I would just 
-include 'Latex' as part of the metadata without any value:
-
-    Date: 1 sep 2012
-    Status: draft
-    Latex:
+### Backward Compatibility
+This plugin is backward compatible in the sense that it
+accomplishes what previous versions did without needing any setup in the
+metadata or settings files.
+
+### Settings File
+Extra options regarding how mathjax renders latex can be set in the settings
+file. These options are in a dictionary variable called `LATEX` in the pelican
+settings file.
+
+The dictionary can be set with the following keys:
+
+ * `wrap`: controls the tags that math is wrapped with inside the resulting
+html. For example, setting `wrap` to `'mathjax'` would wrap all math inside
+`<mathjax>...</mathjax>` tags. If typogrify is set to True, then math needs
+to be wrapped in tags and `wrap` will therefore default to `mathjax` if not
+set.
+ * `align`: controls how displayed math will be aligned. Can be set to either
+`left`, `right` or `center` (default is `center`).
+ * `indent`: if `align` is not set to `center`, then this controls the indent
+level (default is `0em`).
+ * `show_menu`: controls whether the mathjax contextual menu is shown.
+ * `process_escapes`: controls whether mathjax processes escape sequences.
+ * `preview`: controls the preview message users see while mathjax is
+loading.
+ * `color`: controls the color of the mathjax rendered font.
+
+For example, in settings.py, the following would make latex render in blue and
+displaymath align to the left:
+
+    LATEX = {'color': 'blue', 'align': 'left'}
 
 Latex Examples
 --------------
 ###Inline
-Latex between `$`..`$`, for example, `$`x^2`$`, will be rendered inline 
+Latex between `$`..`$`, for example, `$`x^2`$`, will be rendered inline
 with respect to the current html block.
 
 ###Displayed Math
-Latex between `$$`..`$$`, for example, `$$`x^2`$$`, will be rendered centered in a 
+Latex between `$$`..`$$`, for example, `$$`x^2`$$`, will be rendered centered in a
 new paragraph.
 
 ###Equations
-Latex between `\begin` and `\end`, for example, `begin{equation}` x^2 `\end{equation}`, 
-will be rendered centered in a new paragraph with a right justified equation number 
-at the top of the paragraph. This equation number can be referenced in the document. 
-To do this, use a `label` inside of the equation format and then refer to that label 
-using `ref`. For example: `begin{equation}` `\label{eq}` X^2 `\end{equation}`. Now 
+Latex between `\begin` and `\end`, for example, `\begin{equation}` x^2 `\end{equation}`,
+will be rendered centered in a new paragraph with a right justified equation number
+at the top of the paragraph. This equation number can be referenced in the document.
+To do this, use a `label` inside of the equation format and then refer to that label
+using `ref`. For example: `\begin{equation}` `\label{eq}` X^2 `\end{equation}`. Now
 refer to that equation number by `$`\ref{eq}`$`.
-   
-Template And Article Examples
------------------------------
-To see an example of this plugin in action, look at 
-[this article](http://doctrina.org/How-RSA-Works-With-Examples.html). To see how 
-this plugin works with a template, look at 
-[this template](https://github.com/barrysteyn/pelican_theme-personal_blog).
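
The Settings File section of the new Readme describes a `LATEX` dictionary whose values end up in the MathJax configuration. As a rough sketch of that flow (not the plugin's actual code), the snippet below merges user values over defaults and fills them into a small JavaScript template with `str.format`, where doubled braces stand for literal braces. The names `DEFAULTS`, `TEMPLATE` and `merge_settings` are hypothetical and exist only for this illustration; the template is a shortened stand-in for the real script.

```python
# Illustrative sketch only: DEFAULTS, TEMPLATE and merge_settings are
# hypothetical names, not part of the plugin.
DEFAULTS = {
    "align": "center",    # alignment of displayed math
    "indent": "0em",      # indent used when align is not 'center'
    "show_menu": "true",  # JavaScript literal, hence a string
    "color": "black",     # color of the rendered math
}

# Doubled braces are literal braces for str.format; single braces are fields.
TEMPLATE = (
    "MathJax.Hub.Config({{"
    " displayAlign: '{align}',"
    " displayIndent: '{indent}',"
    " showMathMenu: {show_menu},"
    " 'HTML-CSS': {{ styles: {{ '.MathJax .mo': {{color: '{color} ! important'}} }} }}"
    " }});"
)


def merge_settings(user_settings):
    """Return the defaults overridden by any valid user-supplied values."""
    merged = dict(DEFAULTS)
    if not isinstance(user_settings, dict):
        return merged
    # Unknown alignment values are ignored, so the default stays in place.
    if user_settings.get("align") in ("left", "right", "center"):
        merged["align"] = user_settings["align"]
    if "indent" in user_settings:
        merged["indent"] = user_settings["indent"]
    # Booleans are converted to the JavaScript literals true / false.
    if isinstance(user_settings.get("show_menu"), bool):
        merged["show_menu"] = "true" if user_settings["show_menu"] else "false"
    if isinstance(user_settings.get("color"), str):
        merged["color"] = user_settings["color"]
    return merged


# What a user might put in settings.py, following the Readme example.
LATEX = {"color": "blue", "align": "left"}
config_js = TEMPLATE.format(**merge_settings(LATEX))
```

Keeping the defaults in one dictionary means a missing or malformed `LATEX` setting still yields a complete configuration string.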

+ 303 - 31
latex/latex.py

@@ -5,51 +5,323 @@ Latex Plugin For Pelican
 
 This plugin allows you to write mathematical equations in your articles using Latex.
 It uses the MathJax Latex JavaScript library to render latex that is embedded in
-between `$..$` for inline math and `$$..$$` for displayed math. It also allows for 
-writing equations in by using `\begin{equation}`...`\end{equation}`.
+between `$..$` for inline math and `$$..$$` for displayed math. It also allows for
+writing equations by using `\begin{equation}`...`\end{equation}`. No
+alteration to a template is required for this plugin to work, just install and
+use.
+
+Typogrify Compatibility
+-----------------------
+This plugin now plays nicely with typogrify, but it requires
+typogrify version 2.07 or above.
+
+User Settings
+-------------
+Users are also able to pass a dictionary of settings in the settings file which
+will control how the mathjax library renders things. This could be very useful
+for template builders that want to adjust the look and feel of the math.
+See README for more details.
 """
 
 from pelican import signals
+from pelican import contents
+import re
 
-# Reference about dynamic loading of MathJax can be found at http://docs.mathjax.org/en/latest/dynamic.html
-# The https cdn address can be found at http://www.mathjax.org/resources/faqs/#problem-https
-latexScript = """
-    <script type= "text/javascript">
+# Global Variables
+_WRAP_TAG = None  # the tag to wrap mathjax in (needed to play nicely with typogrify or for template designers)
+_LATEX_REGEX = re.compile(r'(\$\$|\$|\\begin\{(.+?)\}).*?(\1|\\end\{\2\})', re.DOTALL | re.IGNORECASE) #  used to detect latex
+_LATEX_SUMMARY_REGEX = None  # used to match latex in summary
+_LATEX_PARTIAL_REGEX = None  # used to match latex that has been cut off in summary
+_MATHJAX_SETTINGS = {}  # Settings that can be specified by the user, used to control mathjax script settings
+_MATHJAX_SCRIPT="""
+<script type= "text/javascript">
+    if (!document.getElementById('mathjaxscript_pelican')) {{
         var s = document.createElement('script');
-        s.type = 'text/javascript';
-        s.src = 'https:' == document.location.protocol ? 'https://c328740.ssl.cf1.rackcdn.com/mathjax/latest/MathJax.js' : 'http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML'; 
+        s.id = 'mathjaxscript_pelican';
+        s.type = 'text/javascript'; s.src = 'https:' == document.location.protocol ? 'https://c328740.ssl.cf1.rackcdn.com/mathjax/latest/MathJax.js' : 'http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML';
         s[(window.opera ? "innerHTML" : "text")] =
-            "MathJax.Hub.Config({" + 
-            "    config: ['MMLorHTML.js']," + 
-            "    jax: ['input/TeX','input/MathML','output/HTML-CSS','output/NativeMML']," +
-            "    TeX: { extensions: ['AMSmath.js','AMSsymbols.js','noErrors.js','noUndefined.js'], equationNumbers: { autoNumber: 'AMS' } }," + 
+            "MathJax.Hub.Config({{" +
+            "    config: ['MMLorHTML.js']," +
+            "    TeX: {{ extensions: ['AMSmath.js','AMSsymbols.js','noErrors.js','noUndefined.js'], equationNumbers: {{ autoNumber: 'AMS' }} }}," +
+            "    jax: ['input/TeX','input/MathML','output/HTML-CSS']," +
             "    extensions: ['tex2jax.js','mml2jax.js','MathMenu.js','MathZoom.js']," +
-            "    tex2jax: { " +
+            "    displayAlign: '{align}'," +
+            "    displayIndent: '{indent}'," +
+            "    showMathMenu: {show_menu}," +
+            "    tex2jax: {{ " +
             "        inlineMath: [ [\'$\',\'$\'] ], " +
             "        displayMath: [ [\'$$\',\'$$\'] ]," +
-            "        processEscapes: true }, " +
-            "    'HTML-CSS': { " +
-            "        styles: { '.MathJax .mo, .MathJax .mi': {color: 'black ! important'}} " +
-            "    } " +
-            "}); ";
+            "        processEscapes: {process_escapes}," +
+            "        preview: '{preview}'," +
+            "    }}, " +
+            "    'HTML-CSS': {{ " +
+            "        styles: {{ '.MathJax_Display, .MathJax .mo, .MathJax .mi, .MathJax .mn': {{color: '{color} ! important'}} }}" +
+            "    }} " +
+            "}}); ";
         (document.body || document.getElementsByTagName('head')[0]).appendChild(s);
-    </script>
+    }}
+</script>
 """
 
-def addLatex(gen, metadata):
+
+# Python standard library for binary search, namely bisect is cool but I need
+# specific business logic to evaluate my search predicate, so I am using my
+# own version
+def binary_search(match_tuple, ignore_within):
+    """Determines if match_tuple lies within a span in ignore_within. Because
+    ignore_within is ordered, binary search can be used, which is O(log n)
     """
-        The registered handler for the latex plugin. It will add 
-        the latex script to the article metadata
+
+    ignore = False
+    if ignore_within == []:
+        return False
+
+    lo = 0
+    hi = len(ignore_within)-1
+
+    # Find the last index for which the predicate is True (index 0 if none)
+    # predicate function: ignore_within[mid][0] < match_tuple[0]
+    while lo < hi:
+        mid = lo + (hi-lo+1)//2  # floor division keeps mid an int on Python 3
+        if ignore_within[mid][0] < match_tuple[0]:
+            lo = mid
+        else:
+            hi = mid-1
+
+    if lo >= 0 and lo <= len(ignore_within)-1:
+        ignore = (ignore_within[lo][0] <= match_tuple[0] and ignore_within[lo][1] >= match_tuple[1])
+
+    return ignore
+
+
+def ignore_content(content):
+    """Creates a list of match span tuples for which content should be ignored
+    e.g. <pre> and <code> tags
     """
-    if 'LATEX' in gen.settings.keys() and gen.settings['LATEX'] == 'article':
-        if 'latex' in metadata.keys():
-            metadata['latex'] = latexScript
-    else:
-        metadata['latex'] = latexScript
+    ignore_within = []
 
-def register():
+    # used to detect all <pre> and <code> tags. NOTE: Alter this regex should
+    # additional tags need to be ignored
+    ignore_regex = re.compile(r'<(pre|code).*?>.*?</(\1)>', re.DOTALL | re.IGNORECASE)
+
+    for match in ignore_regex.finditer(content):
+        ignore_within.append(match.span())
+
+    return ignore_within
+
+
+def wrap_latex(content, ignore_within):
+    """Wraps latex in user specified tags.
+
+    This is needed for typogrify to play nicely with latex but it can also be
+    styled by template providers
     """
-        Plugin registration
+    wrap_latex.foundlatex = False
+
+    def math_tag_wrap(match):
+        """function for use in re.sub"""
+
+        # determine if the tags are within <pre> and <code> blocks
+        ignore = binary_search(match.span(1), ignore_within) and binary_search(match.span(2), ignore_within)
+
+        if ignore:
+            return match.group(0)
+        else:
+            wrap_latex.foundlatex = True
+            return '<%s>%s</%s>' % (_WRAP_TAG, match.group(0), _WRAP_TAG)
+
+    return (_LATEX_REGEX.sub(math_tag_wrap, content), wrap_latex.foundlatex)
+
+
+def process_summary(instance, ignore_within):
+    """Summaries need special care. If Latex is cut off, it must be restored.
+
+    In addition, the mathjax script must be included if necessary thereby
+    making it independent to the template
     """
-    signals.article_generator_context.connect(addLatex)
-    signals.page_generator_context.connect(addLatex)
+
+    process_summary.altered_summary = False
+    insert_mathjax_script = False
+    endtag = '</%s>' % _WRAP_TAG if _WRAP_TAG is not None else ''
+
+    # use content's _get_summary method to obtain summary
+    summary = instance._get_summary()
+
+    # Determine if there is any math in the summary which are not within the
+    # ignore_within tags
+    mathitem = None
+    for mathitem in _LATEX_SUMMARY_REGEX.finditer(summary):
+        if binary_search(mathitem.span(), ignore_within):
+            mathitem = None # In <code> or <pre> tags, so ignore
+        else:
+            insert_mathjax_script = True
+
+    # Repair the latex if it was cut off. mathitem will be the final latex
+    # code matched that is not within <pre> or <code> tags
+    if mathitem and mathitem.group(4) == ' ...':
+        end = r'\end{%s}' % mathitem.group(3) if mathitem.group(3) is not None else mathitem.group(2)
+        latex_match = re.search('%s.*?%s' % (re.escape(mathitem.group(1)), re.escape(end)), instance._content, re.DOTALL | re.IGNORECASE)
+        new_summary = summary.replace(mathitem.group(0), latex_match.group(0)+'%s ...' % endtag)
+
+        if new_summary != summary:
+            return new_summary+_MATHJAX_SCRIPT.format(**_MATHJAX_SETTINGS)
+
+    def partial_regex(match):
+        """function for use in re.sub"""
+        if binary_search(match.span(), ignore_within):
+            return match.group(0)
+
+        process_summary.altered_summary = True
+        return match.group(1) + match.group(4)
+
+    # check for partial latex tags at end. These must be removed
+    summary = _LATEX_PARTIAL_REGEX.sub(partial_regex, summary)
+
+    if process_summary.altered_summary:
+        return summary+_MATHJAX_SCRIPT.format(**_MATHJAX_SETTINGS) if insert_mathjax_script else summary
+
+    return summary+_MATHJAX_SCRIPT.format(**_MATHJAX_SETTINGS) if insert_mathjax_script else None
+
+
+def process_settings(settings):
+    """Sets user specified MathJax settings (see README for more details)"""
+
+    global _MATHJAX_SETTINGS
+
+    # NOTE TO FUTURE DEVELOPERS: Look at the README and what is happening in
+    # this function if any additional changes to the mathjax settings need to
+    # be incorporated. Also, please inline comment what the variables
+    # will be used for
+
+    # Default settings
+    _MATHJAX_SETTINGS['align'] = 'center'  # controls alignment of displayed equations (values can be: left, right, center)
+    _MATHJAX_SETTINGS['indent'] = '0em'  # if above is not set to 'center', then this setting acts as an indent
+    _MATHJAX_SETTINGS['show_menu'] = 'true'  # controls whether to attach mathjax contextual menu
+    _MATHJAX_SETTINGS['process_escapes'] = 'true'  # controls whether escapes are processed
+    _MATHJAX_SETTINGS['preview'] = 'TeX'  # controls what user sees as preview
+    _MATHJAX_SETTINGS['color'] = 'black'  # controls color math is rendered in
+
+    if not isinstance(settings, dict):
+        return
+
+    # The following mathjax settings can be set via the settings dictionary
+    # Iterate over dictionary in a way that is compatible with both version 2
+    # and 3 of python
+    for key, value in ((key, settings[key]) for key in settings):
+        if key == 'align' and isinstance(value, str):
+            if value == 'left' or value == 'right' or value == 'center':
+                _MATHJAX_SETTINGS[key] = value
+            else:
+                _MATHJAX_SETTINGS[key] = 'center'
+
+        if key == 'indent':
+            _MATHJAX_SETTINGS[key] = value
+
+        if key == 'show_menu' and isinstance(value, bool):
+            _MATHJAX_SETTINGS[key] = 'true' if value else 'false'
+
+        if key == 'process_escapes' and isinstance(value, bool):
+            _MATHJAX_SETTINGS[key] = 'true' if value else 'false'
+
+        if key == 'preview' and isinstance(value, str):
+            _MATHJAX_SETTINGS[key] = value
+
+        if key == 'color' and isinstance(value, str):
+            _MATHJAX_SETTINGS[key] = value
+
+
+def process_content(instance):
+    """Processes content, with logic to ensure that typogrify does not clash
+    with latex.
+
+    In addition, mathjax script is inserted at the end of the content thereby
+    making it independent of the template
+    """
+
+    if not instance._content:
+        return
+
+    ignore_within = ignore_content(instance._content)
+
+    if _WRAP_TAG:
+        instance._content, latex = wrap_latex(instance._content, ignore_within)
+    else:
+        latex = True if _LATEX_REGEX.search(instance._content) else False
+
+    # The user initially set typogrify to be True, but since it would clash
+    # with latex, we set it to False. This means that the default reader will
+    # not call typogrify, so it is called here, where we are able to control
+    # the logic for it to ignore latex if necessary
+    if process_content.typogrify:
+        # Tell typogrify to ignore the tags that latex has been wrapped in
+        ignore_tags = [_WRAP_TAG] if _WRAP_TAG else None
+
+        # Exact copy of the logic as found in the default reader
+        from typogrify.filters import typogrify
+        instance._content = typogrify(instance._content, ignore_tags)
+        instance.metadata['title'] = typogrify(instance.metadata['title'], ignore_tags)
+
+    if latex:
+        # Mathjax script added to the end of article. Now it does not need to
+        # be explicitly added to the template
+        instance._content += _MATHJAX_SCRIPT.format(**_MATHJAX_SETTINGS)
+
+        # The summary needs special care because latex math cannot just be cut
+        # off
+        summary = process_summary(instance, ignore_within)
+        if summary is not None:
+            instance._summary = summary
+
+
+def pelican_init(pelicanobj):
+    """Initializes certain global variables and sets the typogrify setting to
+    False should it be set to True.
+    """
+
+    global _WRAP_TAG
+    global _LATEX_SUMMARY_REGEX
+    global _LATEX_PARTIAL_REGEX
+
+    try:
+        settings = pelicanobj.settings['LATEX']
+    except KeyError:
+        settings = None
+
+    process_settings(settings)
+
+    # Allows mathjax script to be accessed from template should it be needed
+    pelicanobj.settings['MATHJAXSCRIPT'] = _MATHJAX_SCRIPT.format(**_MATHJAX_SETTINGS)
+
+    # If typogrify set to True, then we need to handle it manually so it does
+    # not conflict with Latex
+    try:
+        if pelicanobj.settings['TYPOGRIFY'] == True:
+            pelicanobj.settings['TYPOGRIFY'] = False
+            _WRAP_TAG = 'mathjax' # default tag to wrap mathjax content inside of
+            process_content.typogrify = True
+    except KeyError:
+        pass
+
+    # Set _WRAP_TAG to the settings tag if defined. The idea behind this is
+    # to give template designers control over how math would be rendered
+    try:
+        if pelicanobj.settings['LATEX']['wrap']:
+            _WRAP_TAG = pelicanobj.settings['LATEX']['wrap']
+    except (KeyError, TypeError):
+        pass
+
+    # regular expressions that depend on _WRAP_TAG are set here
+    tagstart = r'<%s>' % _WRAP_TAG if _WRAP_TAG is not None else ''
+    tagend = r'</%s>' % _WRAP_TAG if _WRAP_TAG is not None else ''
+    latex_summary_regex = r'((\$\$|\$|\\begin\{(.+?)\}).+?)(\2|\\end\{\3\}|\s?\.\.\.)(%s)?' % tagend
+    latex_partial_regex = r'(.*)(%s)(\\.*?|\$.*?)(\s?\.\.\.)(%s)' % (tagstart, tagend)
+
+    _LATEX_SUMMARY_REGEX = re.compile(latex_summary_regex, re.DOTALL | re.IGNORECASE)
+    _LATEX_PARTIAL_REGEX = re.compile(latex_partial_regex, re.DOTALL | re.IGNORECASE)
+
+
+def register():
+    """Plugin registration"""
+
+    signals.initialized.connect(pelican_init)
+    signals.content_object_init.connect(process_content)
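
The wrapping step in `wrap_latex` can be hard to picture from the diff alone, so here is a minimal standalone sketch of the same idea: find math delimited by `$..$`, `$$..$$` or `\begin{..}`...`\end{..}`, skip anything inside `<pre>` or `<code>` spans, and wrap the rest in a tag. The names `MATH`, `IGNORE` and `wrap_math` are hypothetical, the pattern is written in the spirit of `_LATEX_REGEX` rather than copied from it, and a linear scan over the ignored spans stands in for the plugin's binary search to keep the example short.

```python
import re

# Opening delimiter, lazily matched body, then the matching closing delimiter.
# Group 2 holds the environment name when the opener is \begin{...}.
MATH = re.compile(r'(\$\$|\$|\\begin\{(.+?)\}).*?(\1|\\end\{\2\})', re.DOTALL)

# Spans in which math delimiters must be left alone.
IGNORE = re.compile(r'<(pre|code).*?>.*?</\1>', re.DOTALL | re.IGNORECASE)


def wrap_math(html, tag="mathjax"):
    """Wrap math found outside <pre>/<code> blocks in <tag>...</tag>."""
    ignored = [m.span() for m in IGNORE.finditer(html)]

    def inside_ignored(span):
        # Linear scan for brevity; fine for a handful of spans.
        return any(start <= span[0] and span[1] <= end for start, end in ignored)

    def replace(match):
        if inside_ignored(match.span()):
            return match.group(0)  # leave code samples untouched
        return "<%s>%s</%s>" % (tag, match.group(0), tag)

    return MATH.sub(replace, html)


sample = "<p>Area is $x^2$.</p><code>echo $HOME $PATH</code>"
wrapped = wrap_math(sample)
```

Wrapping math in a dedicated tag is what lets typogrify be told to skip it, since typogrify can be given a list of tags to ignore.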