Issue
I'm writing a book on coding in Python using LaTeX. I plan to have a lot of text with Python code interspersed throughout, along with its output. What's really giving me trouble is that whenever I need to go back and edit my Python code, it's a huge pain to get it back into the latest document cleanly.
I've done a lot of research and can't seem to find a good solution.
This one includes whole files at once, which doesn't solve my problem: https://tex.stackexchange.com/questions/289385/workflow-for-including-jupyter-aka-ipython-notebooks-as-pages-in-a-latex-docum
Same with this one: http://blog.juliusschulz.de/blog/ultimate-ipython-notebook
Found Solution 1 (awful)
I can copy and paste Python code into LaTeX well enough using the listings package.
Pros:
- Easy to update only a small section of code.
Cons:
- For output, I need to run the code in Python, then copy and paste the output separately.
- Initial writing is SLOW; I need to repeat this process hundreds of times per chapter.
Found Solution 2 (bad)
Use a Jupyter notebook with Markdown, export it to LaTeX, and \include the file in the main LaTeX document.
Pros:
- Streamlined.
- Output is contained within.
Cons:
- To make small changes, I need to re-import the whole document, and any changes made to the Markdown text within the LaTeX editor are lost.
- Renaming a single variable in Python after the notebook has been exported could take hours.
- Editing seems like a giant chore.
Ideal solution
- Write the text in LaTeX.
- Write the Python in a Jupyter notebook and export it to LaTeX.
- Somehow include code snippets (small sections of the exported file) in different parts of the main LaTeX book. This is the part I can't figure out.
- When Python changes are needed, make the changes in Jupyter, then re-export a LaTeX file with the same name.
- The LaTeX book is automatically updated through the includes.
The key here is that the exported notebook is split up and sent to different parts of the document. For that to work, the parts need to be tagged or marked somehow in the Markdown or code of the notebook, so that when I re-export it those same parts get sent to the same spots in the book.
Pros:
- Python edits are easy and propagate back to the book easily.
- Text is written in LaTeX, so I can use the full power of LaTeX.
Any help in coming up with a solution closer to my ideal solution would be much appreciated. It's killing me.
It probably doesn't matter, but I'm writing both the LaTeX and the Jupyter notebooks in VS Code. I'm open to changing tools if it means solving these problems.
Solution
Here's a small script I wrote. It splits a single *.ipynb file and converts it into multiple *.tex files.
Usage:
- Copy the following script and save it as something like main.py.
- Execute python main.py init. It will create main.tex and style_ipython_custom.tplx.
- In your Jupyter notebook, add an extra line #latex:tag_a, #latex:tag_b, ... to each cell that you want to extract. Cells with the same tag are extracted to the same *.tex file.
- Save it as an *.ipynb file. Fortunately, the current VS Code Python plugin supports exporting to *.ipynb, or you can use jupytext to convert from *.py to *.ipynb.
- Run python main.py path/to/your.ipynb and it will create tag_a.tex and tag_b.tex.
- Edit main.tex and add \input{tag_a.tex} or \input{tag_b.tex} wherever you want.
- Run pdflatex main.tex and it will produce main.pdf.
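To try the steps above without an existing notebook, the following sketch writes a small tagged notebook using only the standard library. The file name demo.ipynb, the tag names, and the cell contents are made up for illustration, and the JSON layout assumes notebook format 4.

```python
import json


def make_code_cell(source):
    # Minimal code cell in notebook format 4 (not executed, so no outputs).
    return {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {},
        "outputs": [],
        "source": source,
    }


def make_demo_notebook():
    # Two cells share tag_a, so both would be written to tag_a.tex;
    # the third cell would be written to tag_b.tex.
    cells = [
        make_code_cell("#latex:tag_a\nx = 1\nprint(x)"),
        make_code_cell("#latex:tag_a\nprint(x + 1)"),
        make_code_cell("#latex:tag_b\nprint('hello')"),
    ]
    return {"cells": cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 4}


if __name__ == "__main__":
    # Hypothetical file name; pass it to the script as the notebook path.
    with open("demo.ipynb", "w") as f:
        json.dump(make_demo_notebook(), f, indent=1)
```

Because these cells are never executed, the exported files would contain the code but no output. Run the notebook in Jupyter and save it first if you want the output captured as well.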
The idea behind this script:
Converting a Jupyter notebook to LaTeX with the default nbconvert.LatexExporter produces a complete LaTeX file that includes macro definitions. Using it to convert each cell may create large LaTeX files. To avoid the problem, the script first creates main.tex, which has only the macro definitions, and then converts each cell to a LaTeX file that has no macro definitions. This is done using a custom template file that is slightly modified from style_ipython.tplx.
Tagging or marking a cell could be done using cell metadata, but I could not find how to set it in the VS Code Python plugin (Issue), so instead the script scans the source of each cell with the regex pattern ^#latex:(.*) and removes the matching line before converting the cell to LaTeX.
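The tag scan can be tried in isolation. This is a minimal sketch that reuses the pattern from the script below; the helper name split_tag is introduced here only for illustration and is not part of the script.

```python
import re

# Same pattern as in the script: a whole line of the form `#latex:<tag>`.
TAG_PATTERN = re.compile(r'^#latex:(.*?)$(\n?)', re.M)


def split_tag(source):
    """Return (tag, source_without_tag_line); tag is None if no tag line exists."""
    m = TAG_PATTERN.search(source)
    if m is None:
        return None, source
    tag = m.group(1).strip()
    # Cut out the tag line, including its trailing newline if present.
    return tag, source[:m.start(0)] + source[m.end(0):]


if __name__ == "__main__":
    print(split_tag("#latex:tag_a\nx = 1\nprint(x)\n"))
    print(split_tag("x = 1\n"))
```

Only the first tag line in a cell is used, so a cell belongs to at most one output file.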
Source:
import sys
import re
import os
from collections import defaultdict
import nbformat
from nbconvert import LatexExporter, exporters
OUTPUT_FILES_DIR = './images'
CUSTOM_TEMPLATE = 'style_ipython_custom.tplx'
MAIN_TEX = 'main.tex'
def create_main():
    # creates `main.tex` which only has macro definition
    latex_exporter = LatexExporter()
    book = nbformat.v4.new_notebook()
    book.cells.append(
        nbformat.v4.new_raw_cell(r'\input{__your_input__here.tex}'))
    (body, _) = latex_exporter.from_notebook_node(book)
    with open(MAIN_TEX, 'x') as fout:
        fout.write(body)
    print("created:", MAIN_TEX)
def init():
    create_main()
    latex_exporter = LatexExporter()
    # copy `style_ipython.tplx` in `nbconvert.exporters` module to current directory,
    # and modify it so that it does not contain macro definition
    tmpl_path = os.path.join(
        os.path.dirname(exporters.__file__),
        latex_exporter.default_template_path)
    src = os.path.join(tmpl_path, 'style_ipython.tplx')
    target = CUSTOM_TEMPLATE
    with open(src) as fsrc:
        with open(target, 'w') as ftarget:
            for line in fsrc:
                # replace the line so that it does not contain macro definition
                if line == "((*- extends 'base.tplx' -*))\n":
                    line = "((*- extends 'document_contents.tplx' -*))\n"
                ftarget.write(line)
    print("created:", CUSTOM_TEMPLATE)
def group_cells(note):
    # scan the cell source for a tag with regexp `^#latex:(.*)`
    # cells with the same tag are grouped into the same list
    pattern = re.compile(r'^#latex:(.*?)$(\n?)', re.M)
    group = defaultdict(list)
    for num, cell in enumerate(note.cells):
        m = pattern.search(cell.source)
        if m:
            tag = m.group(1).strip()
            # remove the line which contains tag
            cell.source = cell.source[:m.start(0)] + cell.source[m.end(0):]
            group[tag].append(cell)
        else:
            print("tag not found in cell number {}. ignore".format(num + 1))
    return group
def doit():
    with open(sys.argv[1]) as f:
        note = nbformat.read(f, as_version=4)
    group = group_cells(note)
    latex_exporter = LatexExporter()
    # use the template which does not contain LaTex macro definition
    latex_exporter.template_file = CUSTOM_TEMPLATE
    try:
        os.mkdir(OUTPUT_FILES_DIR)
    except FileExistsError:
        pass
    for (tag, g) in group.items():
        book = nbformat.v4.new_notebook()
        book.cells.extend(g)
        # unique_key will be prefix of image
        (body, resources) = latex_exporter.from_notebook_node(
            book,
            resources={
                'output_files_dir': OUTPUT_FILES_DIR,
                'unique_key': tag
            })
        ofile = tag + '.tex'
        with open(ofile, 'w') as fout:
            fout.write(body)
        print("created:", ofile)
        # the image data which is embedded as base64 in notebook
        # will be decoded and returned in `resources`, so write it to file
        for filename, data in resources.get('outputs', {}).items():
            with open(filename, 'wb') as fres:
                fres.write(data)
            print("created:", filename)
if len(sys.argv) <= 1:
    print("USAGE: this_script [init|yourfile.ipynb]")
elif sys.argv[1] == "init":
    init()
else:
    doit()
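The script marks cells with a comment line, but the answer also mentions cell metadata as a possible alternative. A rough sketch of that variant follows. It assumes each cell is a plain dict with a tags list under its metadata (which is where Jupyter stores cell tags), and it assumes a made-up latex: prefix convention. The function name group_cells_by_metadata is hypothetical and not part of the script above.

```python
from collections import defaultdict

TAG_PREFIX = "latex:"  # hypothetical convention, not something Jupyter defines


def group_cells_by_metadata(cells):
    """Group notebook cells (plain dicts) by their first `latex:<name>` metadata tag."""
    groups = defaultdict(list)
    for cell in cells:
        tags = cell.get("metadata", {}).get("tags", [])
        for tag in tags:
            if tag.startswith(TAG_PREFIX):
                groups[tag[len(TAG_PREFIX):]].append(cell)
                break  # use only the first matching tag
    return dict(groups)


if __name__ == "__main__":
    demo_cells = [
        {"source": "x = 1", "metadata": {"tags": ["latex:tag_a"]}},
        {"source": "print(x)", "metadata": {"tags": ["other", "latex:tag_a"]}},
        {"source": "y = 2", "metadata": {}},
    ]
    for name, cells in group_cells_by_metadata(demo_cells).items():
        print(name, [c["source"] for c in cells])
```

With this approach the cell source stays untouched, so no line has to be removed before conversion. It only helps if your editor lets you set cell tags.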
Answered By - ymonad