Issue
I have a raw text cell in my IPython notebook project.
Is there a way to get the text as a string with a built-in function or something similar?
Solution
My (possibly unsatisfactory) answer is in two parts. This is based on a personal investigation of IPython structures, and it's entirely possible I've missed something that more directly answers the question.
Current session
The raw text for code cells entered during the current session is available within a notebook using the list In.
So the raw text of the current cell can be returned by the following expression within the cell:
In[len(In)-1]
For example, evaluating a cell containing this code:
print "hello world"
three = 1+2
In[len(In)-1]
yields this corresponding Out[] value:
u'print "hello world"\nthree = 1+2\nIn[len(In)-1]'
So, within an active notebook session, you can access the raw text of a cell as In[n], where n is the displayed index of the required cell.
But if the cell was entered during a previous notebook session, which has subsequently been closed and reopened, that no longer works. Also, only code cells seem to be included in the In list, so this wouldn't work for a raw text cell.
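As a small aside, In[len(In)-1] can be written more briefly as In[-1], since In behaves like an ordinary Python list. The sketch below uses a plain list as a stand-in for In (an assumption, so that the snippet runs outside a notebook). last_input is a hypothetical helper name, not part of IPython.

```python
def last_input(history):
    """Return the most recent entry of an In-style history list."""
    # Equivalent to history[len(history) - 1].
    return history[-1]


# Stand-in for IPython's In list (assumption): each entry holds the
# source text of one executed code cell, in execution order.
fake_In = ["", "x = 1", "three = 1+2\nIn[len(In)-1]"]
print(last_input(fake_In))
```

Inside a live notebook, the same idea would simply be the expression In[-1] at the end of a code cell.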
Cells from saved notebook sessions
In my research, the only way I could find to get the raw text from previous sessions was to read the original notebook file. The documentation page Importing IPython Notebooks as Modules describes how to do this. The key code is in In[4]:
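# imports as used in the linked example (older IPython releases)
import io
from IPython.nbformat import current

# load the notebook object
with io.open(path, 'r', encoding='utf-8') as f:
    nb = current.read(f, 'json')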
where current is the API described at Module: nbformat.current.
The notebook object returned is accessed as a nested dictionary and list structure, e.g.:
for cell in nb.worksheets[0].cells:
    ...
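The cell objects thus enumerated have two key fields for the purpose of this question:
cell.cell_type is the type of the cell ("code", "markdown", "raw", etc.).
cell.input is the raw text content of the cell as a list of strings, with an entry for each line of text. Note that in the version 3 notebook format this field is named input only for code cells; markdown and raw cells store their text under source instead.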
Much of this can be seen by looking at the JSON data that constitutes a saved IPython notebook.
Apart from the "prompt number" fields in a notebook, which seem to change whenever the cell is re-evaluated, I could find no way to create a stable reference to a notebook cell.
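Because a saved notebook is plain JSON, the file-reading approach can also be sketched with only the standard json module, without depending on the nbformat API. This is a minimal sketch under stated assumptions: it expects either the older layout (cells nested under worksheets) or a top-level cells list, and it looks for text under either input or source. The helper names cell_text and raw_cell_texts are hypothetical.

```python
import io
import json


def cell_text(cell):
    """Return a cell's text as a single string."""
    # Assumption: code cells in the older format use "input" and
    # other cells use "source", so try both keys.
    text = cell.get("input", cell.get("source", ""))
    # On disk the text may be a list of lines or one string.
    return "".join(text) if isinstance(text, list) else text


def raw_cell_texts(path):
    """Return the text of every raw cell in a notebook file."""
    with io.open(path, "r", encoding="utf-8") as f:
        nb = json.load(f)
    # Older notebooks nest cells under "worksheets"; otherwise
    # fall back to a top-level "cells" list.
    if "worksheets" in nb:
        cells = [c for ws in nb["worksheets"] for c in ws.get("cells", [])]
    else:
        cells = nb.get("cells", [])
    return [cell_text(c) for c in cells if c.get("cell_type") == "raw"]
```

Calling raw_cell_texts with the path to a saved notebook should return a list with one string per raw cell, in the order the cells appear in the file.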
Conclusion
I couldn't find an easy answer to the original question. What I found is covered above. Without knowing the motivation behind the original question, I can't know if it's enough.
What I looked for, but was unable to identify, was a way to reference the current notebook that can be used from within the notebook itself (e.g. via a function like get_ipython()). That doesn't mean it doesn't exist.
The other missing piece in my response is any kind of stable way to refer to a specific cell. Looking at the notebook file format, raw text cells consist solely of a cell type ("raw") and the raw text itself, though it appears that cell metadata might also be included. This suggests the only way to directly reference a cell is through its position in the notebook, but that is subject to change when the notebook is edited.
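If cell metadata is indeed preserved in the file, one possible workaround would be to tag a cell with a label of your own and look it up by that label rather than by position. The sketch below is only an illustration of that idea: the "label" metadata key and the find_cell_by_label helper are hypothetical, not an IPython feature, and the cells are hand-built dictionaries standing in for cells loaded from a notebook file.

```python
def find_cell_by_label(cells, label, key="label"):
    """Return the first cell whose metadata[key] equals label, else None."""
    for cell in cells:
        # Cells without a metadata dict never match.
        if cell.get("metadata", {}).get(key) == label:
            return cell
    return None


# Hand-built stand-ins for cells read from a notebook file (assumption).
cells = [
    {"cell_type": "code", "input": ["x = 1"], "metadata": {}},
    {"cell_type": "raw", "source": ["some raw text"],
     "metadata": {"label": "notes"}},
]
match = find_cell_by_label(cells, "notes")
print(match["source"])
```

Such a label would survive reordering of cells, but only as long as whatever edits the notebook leaves the metadata untouched.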
(Researched and answered as part of the Oxford participation in http://aaronswartzhackathon.org)
Answered By - Graham Klyne