U.S. patent application number 17/070207 was filed with the patent office on 2020-10-14 and published on 2021-02-18 for storing logical units of program code generated using a dynamic programming notebook user interface.
The applicant listed for this patent is Palantir Technologies Inc. Invention is credited to Omar Ali, Punyashloka Biswal, Adam Borochoff, John Chakerian, Ben Duffield, Mark Elliot, Ankit Shankar.
Application Number | 20210048988 17/070207 |
Family ID | 1000005190687 |
Filed Date | 2020-10-14 |
United States Patent Application | 20210048988
Kind Code | A1
Elliot; Mark ; et al. | February 18, 2021
STORING LOGICAL UNITS OF PROGRAM CODE GENERATED USING A DYNAMIC
PROGRAMMING NOTEBOOK USER INTERFACE
Abstract
The programming notebook system, methods, and user interfaces
described herein provide software developers with enhanced tools by
which a programming notebook workflow and session history
associated with code cells in a programming notebook may be tracked
and maintained. As a developer progresses through a development
workflow, the developer can select an option to save a program code
card representing some or all of the program code cell inputs. A
card editor user interface may present an aggregated listing of all
program code the developer has provided across multiple code cells
during the current session which the developer can edit, refine,
and/or comment. The card editor may also allow the developer to add
associated user interface code to display a UI component associated
with the program code card, and allow the developer to add a
description and tags for the card so that the card can be searched
for and reused.
Inventors: |
Elliot; Mark (London, GB)
Biswal; Punyashloka (Norwalk, CT)
Shankar; Ankit (San Francisco, CA)
Ali; Omar (Abu Dhabi, AE)
Chakerian; John (Los Altos Hills, CA)
Duffield; Ben (New York, NY)
Borochoff; Adam (New York, NY)
Applicant: |
Name | City | State | Country | Type |
Palantir Technologies Inc. | Denver | CO | US | |
Family ID: | 1000005190687
Appl. No.: | 17/070207
Filed: | October 14, 2020
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
16135285 | Sep 19, 2018 | 10838697 |
17070207 | | |
15392168 | Dec 28, 2016 | 10127021 |
16135285 | | |
14845001 | Sep 3, 2015 | 9870205 |
15392168 | | |
62097388 | Dec 29, 2014 | |
Current U.S. Class: | 1/1
Current CPC Class: | G06F 3/0481 20130101; G06F 8/36 20130101; G06F 8/34 20130101; G06F 8/41 20130101; G06F 11/3668 20130101; G06F 8/33 20130101; G06T 11/206 20130101; G06F 9/45512 20130101
International Class: | G06F 8/34 20060101 G06F008/34; G06F 8/33 20060101 G06F008/33; G06F 8/41 20060101 G06F008/41; G06F 8/36 20060101 G06F008/36; G06F 11/36 20060101 G06F011/36; G06T 11/20 20060101 G06T011/20
Claims
1. (canceled)
2. A computer-implemented method of generating displayable data
visualizations, the computer-implemented method comprising: by one
or more hardware computer processors executing program code:
analyzing at least some data items of a set of data items to
determine at least a first data type associated with at least one
of the data items of the set of data items, wherein the first data
type comprises geographic data; selecting, based at least in part
on the first data type, a first visualization type from a plurality
of data visualization types, wherein the plurality of data
visualization types are built-in and include at least one of: time
series, calendar, scatter plot, line plot, histogram, chart, graph,
table, map, heat map, or geographic map, and wherein the first
visualization type comprises at least one of: map, heat map, or
geographic map; generating at least a first displayable data
visualization of the first visualization type, wherein the first
displayable data visualization is based at least in part on a
portion of the set of data items; and causing display of the first
displayable data visualization.
3. The computer-implemented method of claim 2, wherein the
geographic data comprises at least one of: geographic coordinates,
map coordinates, or latitude and longitude coordinates.
4. The computer-implemented method of claim 2, wherein the
geographic data comprises geographic abbreviations.
5. The computer-implemented method of claim 2, wherein the set of
data items includes one or more column headers, and wherein the
first data type is determined, at least in part, based on the one
or more column headers.
6. The computer-implemented method of claim 2, wherein the first
data type is determined, at least in part, based on a range of
values of at least a portion of the set of data items.
7. The computer-implemented method of claim 2, wherein the first
data type is determined, at least in part, based on another
attribute of at least a portion of the set of data items.
8. The computer-implemented method of claim 2 further comprising:
by the one or more hardware computer processors executing program
code: receiving a user input modifying data items that are
represented in the first displayable data visualization; and
causing display of an updated first displayable data
visualization.
9. The computer-implemented method of claim 8, wherein the user
input modifying data items comprises at least one of: combining the
data items with one or more other data items, receiving and
applying a query of the set of data items, or receiving and
filtering based on a selection of particular data items or types of
data items of interest to the user.
10. A computer-implemented method of generating displayable data
visualizations, the computer-implemented method comprising: by one
or more hardware computer processors executing program code:
analyzing at least some data items of a set of data items to
determine at least a first data type associated with at least one
of the data items of the set of data items, wherein the first data
type comprises at least one of dates or times; selecting, based at
least in part on the first data type, a first visualization type
from a plurality of data visualization types, wherein the plurality
of data visualization types are built-in and include at least one
of: time series, calendar, scatter plot, line plot, histogram,
chart, graph, table, map, heat map, or geographic map, and wherein
the first visualization type comprises at least one of: time series
or calendar; generating at least a first displayable data
visualization of the first visualization type, wherein the first
displayable data visualization is based at least in part on a
portion of the set of data items; and causing display of the first
displayable data visualization.
11. The computer-implemented method of claim 10, wherein the first
displayable data visualization comprises time series, including a
time series plot.
12. The computer-implemented method of claim 10, wherein the set of
data items includes one or more column headers, and wherein the
first data type is determined, at least in part, based on the one
or more column headers.
13. The computer-implemented method of claim 10, wherein the first
data type is determined, at least in part, based on a range of
values of at least a portion of the set of data items.
14. The computer-implemented method of claim 10, wherein the first
data type is determined, at least in part, based on another
attribute of at least a portion of the set of data items.
15. The computer-implemented method of claim 10 further comprising:
by the one or more hardware computer processors executing program
code: receiving a user input modifying data items that are
represented in the first displayable data visualization; and
causing display of an updated first displayable data
visualization.
16. The computer-implemented method of claim 15, wherein the user
input modifying data items comprises at least one of: combining the
data items with one or more other data items, receiving and
applying a query of the set of data items, or receiving and
filtering based on a selection of particular data items or types of
data items of interest to the user.
17. A computer-implemented method of generating displayable data
visualizations, the computer-implemented method comprising: by one
or more hardware computer processors executing program code:
analyzing at least some data items of a set of data items to
determine at least a first data type associated with at least one
of the data items of the set of data items, wherein the first data
type comprises scalars; selecting, based at least in part on the
first data type, a first visualization type from a plurality of
data visualization types, wherein the plurality of data
visualization types are built-in and include at least one of: time
series, calendar, scatter plot, line plot, histogram, chart, graph,
table, map, heat map, or geographic map, and wherein the first
visualization type comprises at least one of: scatter plot, line
plot, chart, or graph; generating at least a first displayable data
visualization of the first visualization type, wherein the first
displayable data visualization is based at least in part on a
portion of the set of data items; and causing display of the first
displayable data visualization.
18. The computer-implemented method of claim 17, wherein the set of
data items includes one or more column headers, and wherein the
first data type is determined, at least in part, based on the one
or more column headers.
19. The computer-implemented method of claim 17, wherein the first
data type is determined, at least in part, based on a range of
values of at least a portion of the set of data items.
20. The computer-implemented method of claim 17 further comprising:
by the one or more hardware computer processors executing program
code: receiving a user input modifying data items that are
represented in the first displayable data visualization; and
causing display of an updated first displayable data
visualization.
21. The computer-implemented method of claim 20, wherein the user
input modifying data items comprises at least one of: combining the
data items with one or more other data items, receiving and
applying a query of the set of data items, or receiving and
filtering based on a selection of particular data items or types of
data items of interest to the user.
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application is a continuation of U.S. patent
application Ser. No. 16/135,285, filed on Sep. 19, 2018, which is a
continuation of U.S. patent application Ser. No. 15/392,168, filed
on Dec. 28, 2016, which is a continuation of U.S. patent
application Ser. No. 14/845,001, filed on Sep. 3, 2015, which
claims priority from provisional U.S. Patent Application No.
62/097,388, filed on Dec. 29, 2014. The entire disclosure of each
of the above items is hereby made part of this specification as if
set forth fully herein and incorporated by reference for all
purposes, for all that it contains.
[0002] Any and all applications for which a foreign or domestic
priority claim is identified in the Application Data Sheet as filed
with the present application are hereby incorporated by reference
under 37 CFR 1.57.
BACKGROUND
[0003] Programming notebooks have become a valuable asset in a
software developer's toolkit. A programming notebook, such as the
popular iPython Notebook, allows a developer to more rapidly
develop and test code, typically by enabling a dynamic command-line
shell interface which the developer can use to input, execute, and
view associated outputs for lines of program code in a
read-execute-print loop ("REPL"). Programming notebook outputs can
be provided in various formats, such as a JavaScript Object
Notation ("JSON," which is a lightweight data-interchange format)
document containing an ordered list of input/output cells which can
contain code, text, mathematics, plots and rich media. Programming
notebook outputs can also be converted to a number of open standard
output formats (HTML, HTML presentation slides, LaTeX, PDF,
ReStructuredText, Markdown, Python, etc.).
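By way of non-limiting illustration, a notebook document of the kind
described above could be modeled as an ordered list of cells and
serialized to JSON. The following is a minimal sketch only; the field
names ("cells", "cell_type", "source", "outputs") are assumptions
chosen for illustration and are not specified by this disclosure.

```python
import json

# Hypothetical, minimal notebook document: an ordered list of cells,
# each holding its input source and any outputs produced by execution.
notebook = {
    "cells": [
        {"cell_type": "markdown", "source": "# Example analysis", "outputs": []},
        {"cell_type": "code", "source": "1 + 1", "outputs": ["2"]},
    ]
}


def to_json(doc):
    """Serialize the notebook document to a JSON string."""
    return json.dumps(doc, indent=2)


def from_json(text):
    """Parse a JSON string back into a notebook document."""
    return json.loads(text)


serialized = to_json(notebook)
restored = from_json(serialized)
```

Because JSON arrays preserve order, the cell sequence survives a
round trip through the serialized form.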
[0004] Typically, a programming notebook consists of a sequence of
cells. A cell is a multi-line text input field, and its contents
can be executed by the developer using the programming notebook
interface. Code cells allow the developer to edit and write code
and can provide features such as syntax highlighting and tab
completion. When a cell is executed using a backend system
associated with the programming notebook, results are displayed in
the notebook as the cell's output. Output can be displayed in a
variety of formats such as text, data plots, and tables.
[0005] In a normal programming notebook workflow, the developer can
edit cells in-place multiple times until a desired output or result
is obtained, rather than having to re-run separate scripts. The
programming notebook interface enables the developer to work on
complex computational programs in discrete and manageable pieces.
The developer can organize related programming ideas into cells and
work progressively forward as various pieces are working correctly.
Once a developer has completed a workflow, the programming notebook
can be saved or downloaded into a format which, among other things,
may remove output results and convert some cell contents (e.g.,
some contents may be converted to non-executable comments in an
output programming language).
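As one hedged illustration of the save or download step described
above, the sketch below converts a list of cells into a plain script:
code cells are emitted unchanged, text cells become non-executable
comments, and output results are dropped. The cell layout and the
function name export_script are hypothetical and used only for this
example.

```python
def export_script(cells):
    """Convert notebook cells to a plain script.

    Code cells are emitted as-is, non-code cells become comments, and
    stored outputs are not written to the script.
    """
    lines = []
    for cell in cells:
        source_lines = cell["source"].splitlines()
        if cell["cell_type"] == "code":
            lines.extend(source_lines)
        else:
            # Non-code content is kept only as non-executable comments.
            lines.extend("# " + line for line in source_lines)
    return "\n".join(lines)


example_cells = [
    {"cell_type": "markdown", "source": "Load data", "outputs": []},
    {"cell_type": "code", "source": "x = 1\nprint(x)", "outputs": ["1"]},
]
script = export_script(example_cells)
```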
SUMMARY
[0006] One embodiment comprises a computing system for providing a
programming notebook, the computing system comprising: one or more
hardware computer processors configured to execute software code; a
non-transitory storage medium storing software modules configured
for execution by the one or more hardware computer processors. The
software modules may comprise at least: a code compiler and
execution module configured to: receive, on behalf of a user
interacting with a programming notebook user interface in a
programming session, a request to execute a unit of program code
associated with a program cell in the programming notebook user
interface, wherein the unit of program code comprises one or more
lines of program code; execute, on behalf of the user, the unit of
program code; provide an output result associated with the
execution of the unit of program code, wherein the output result is
configured for display in association with the program cell in the
programming notebook user interface; and a program code card
management module configured to: maintain a session history of
requests to execute units of program code and associated output
results; receive a request to generate a program code card for the
programming session; provide a program code card editor user
interface including at least an aggregate listing of the lines
associated with respective units of program code associated with
the session history; receive, via the program code card editor user
interface, user input comprising a selection of program code for
the program code card; and generate the program code card based at
least in part on the user input.
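The interaction between the two modules described above may be
sketched as follows. This is an illustrative sketch under stated
assumptions, not the implementation of any embodiment: the class and
function names (ExecutionRecord, Session, generate_card) are
hypothetical, the unit of program code is assumed to be Python, and
the user's selection is assumed to arrive as a list of line indexes
into the aggregate listing.

```python
import contextlib
import io
from dataclasses import dataclass, field


@dataclass
class ExecutionRecord:
    cell_label: str
    code: str
    output: str


@dataclass
class Session:
    namespace: dict = field(default_factory=dict)
    history: list = field(default_factory=list)

    def execute(self, cell_label, code):
        """Run a unit of program code and record it in the session history."""
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            exec(code, self.namespace)
        record = ExecutionRecord(cell_label, code, buffer.getvalue())
        self.history.append(record)
        return record.output

    def aggregate_listing(self):
        """Return every executed line, in order, across all cells."""
        return [line for rec in self.history for line in rec.code.splitlines()]


def generate_card(session, selected_lines):
    """Build a card from the line indexes selected in the card editor."""
    listing = session.aggregate_listing()
    return {"code": "\n".join(listing[i] for i in selected_lines)}


session = Session()
first_output = session.execute("cell 1", "x = 2\nprint(x)")
second_output = session.execute("cell 2", "y = x * 3\nprint(y)")
card = generate_card(session, [0, 2])
```

Here the card keeps only the two assignment lines; the intermediate
print statements remain in the session history but are left out of
the card.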
[0007] In another embodiment, the user input further comprises at
least some user interface code. In another embodiment, the user
input further comprises at least a description or a tag for the
program code card. In another embodiment, providing the output
result associated with the execution of the unit of program code
comprises: analyzing the output result to determine a data type
associated with the output result; selecting a data visualization
user interface component to include with the output result based at
least in part on the data type; generating, based on the output
result, a data visualization user interface component; and providing
the data visualization user interface component with the output
result. In another embodiment, the data visualization user
interface component is one of a time series, a scatter plot, a
histogram, a chart, a bar graph, or a table. In another embodiment,
based on a determination that the data type is a date, the data
visualization user interface component is a time series. In another
embodiment, based on a determination that the data type is a
geographic unit of measurement, the data visualization user
interface component is a map.
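A minimal sketch of the type-to-component selection described above
is given below. The detection heuristics, the names infer_data_type
and select_component, and the fallback to a table for unrecognized
data are assumptions made for illustration only; geographic data is
assumed here to be (latitude, longitude) pairs.

```python
from datetime import date, datetime


def _is_number(value):
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def infer_data_type(values):
    """Classify a list of values as 'date', 'geographic', or 'other'."""
    if not values:
        return "other"
    if all(isinstance(v, (date, datetime)) for v in values):
        return "date"
    if all(
        isinstance(v, tuple)
        and len(v) == 2
        and _is_number(v[0])
        and _is_number(v[1])
        and -90 <= v[0] <= 90
        and -180 <= v[1] <= 180
        for v in values
    ):
        return "geographic"
    return "other"


# Dates map to a time series and geographic data maps to a map, as in
# the embodiments above; anything else falls back to a table (assumed).
COMPONENT_FOR_TYPE = {"date": "time series", "geographic": "map"}


def select_component(values):
    return COMPONENT_FOR_TYPE.get(infer_data_type(values), "table")
```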
[0008] Another embodiment comprises a computer-implemented method
comprising: under control of a hardware computing device configured
with specific computer executable instructions: maintaining a
session history of requests to execute units of program code
received in association with a programming notebook user interface
in a programming session, wherein respective units of program code
are associated with respective program cells in the programming
notebook user interface; receiving a request to generate a program
code card for the programming session; providing a program code
card editor user interface including at least an aggregate listing
of the units of program code associated with the session history,
wherein the aggregate listing includes, for each unit of program
code, an indicator label of the associated program cell in the
programming notebook user interface; receiving, via the program
code card editor user interface, user input comprising a selection
of program code for the program code card; and generating the
program code card based at least in part on the user input.
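The aggregate listing with per-cell indicator labels might, for
example, be produced as sketched below. The label format (a comment
line of the form "# [cell N]") and the function name labeled_listing
are hypothetical choices for this illustration.

```python
def labeled_listing(history):
    """Format session history as an aggregate listing with cell labels.

    `history` is a list of (cell_label, code) pairs in execution order.
    """
    lines = []
    for cell_label, code in history:
        # The indicator label ties each unit of code back to its cell.
        lines.append(f"# [{cell_label}]")
        lines.extend(code.splitlines())
    return "\n".join(lines)


listing = labeled_listing([
    ("cell 1", "a = 1"),
    ("cell 2", "b = a + 1\nprint(b)"),
])
```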
BRIEF DESCRIPTION OF THE DRAWINGS
[0009] FIG. 1 illustrates an example programming workflow user
interface for a programming notebook including a program code card
component, as generated using one embodiment of the programming
notebook system of FIG. 6.
[0010] FIG. 2 illustrates an example card editor user interface for
a programming notebook, as generated using one embodiment of the
programming notebook system of FIG. 6.
[0011] FIG. 3 illustrates an example programming workflow user
interface for a programming notebook including one or more
automatically generated data visualizations, as generated using one
embodiment of the programming notebook system of FIG. 6.
[0012] FIG. 4 is a flowchart for one embodiment of an example
process for generating and storing logical units of program code
using a dynamic programming notebook user interface, as used in one
embodiment of the programming notebook system of FIG. 6.
[0013] FIG. 5 is a flowchart for one embodiment of an example
process for automatically determining one or more data
visualizations to provide with output results generated in response
to a program code execution request, as used in one embodiment of
the programming notebook system of FIG. 6.
[0014] FIG. 6 is a block diagram of an implementation of an
illustrative programming notebook system.
DETAILED DESCRIPTION OF SPECIFIC EMBODIMENTS
Overview
[0015] One drawback to existing programming notebook user
interfaces and workflows is that the final output tends to be
static, susceptible to inadvertent edits and difficult-to-reverse
format corruption if manually edited, and generally does not lend
itself well to facilitating re-use, re-execution, or sharing of
code developed using the programming notebook interface. For
example, using traditional programming notebook user interfaces, a
developer who wishes to re-use code or logic is often
required to resort to copying-and-pasting blocks of program logic
or code from one notebook to another. As a result, if a bug in the
program logic or code is identified, it may only be fixed once in
the location it is found; the other "copy" would not be
automatically updated, and in many cases the developer that
identified and fixed the bug may be unaware that a copy in another
notebook may also need to be fixed or updated accordingly. It may
never be clear which version of the program logic or code is the
"canonical" or master version, and the developer may not even know
that other copies of the program logic or code exist in one or more
other notebooks.
[0016] Another example of how traditional programming notebook
user interfaces are deficient relates to reordering or deleting of
cells of program logic in the notebook. Using these functions, a
developer may perform some logic development and analysis, and then
reorder or delete cells in the analysis workflow such that the
notebook may become inconsistent, and in some cases may no longer
be "run" from top to bottom because the logic has become
inconsistent, invalid, or even corrupt. Other programming
environments which utilize or support a REPL-type interface and
copying/pasting of code logic may suffer similar failings.
[0017] The programming notebook system, methods, and user
interfaces described in this disclosure provide the developer with
an enhanced tool by which the workflow and session history
associated with code cells in a programming notebook are tracked
and maintained. As the developer progresses through a development
workflow, when desired outcome results are achieved, the developer
can select an option to save a program code card representing some
or all of the code cell inputs. A card editor user interface may
provide the developer with a code editor input panel which presents
an aggregated listing of all program code the developer has
provided across multiple code cells during the current session. The
developer can edit, refine, comment, and/or otherwise clean-up the
aggregated code listing, such as by removing intermediate lines of
code which were rendered unnecessary by other lines of code,
editing code to refine definitions, adding comments to document the
code and what it does, and so on. The card editor may also allow
the developer to add associated user interface code to display a UI
component associated with the program code card. The card editor
may also allow the developer to add a description and tags for the
card so that the card can be searched for and reused by other
developers using the programming notebook system.
[0018] Once a program code card has been generated and stored by
the programming notebook system it is added to a searchable card
library. Developers can then search for and add cards to their own
programming notebook workflows and leverage the work done by other
developers. Cards may be used within a cell in the workflow, for
example directly as a call to a function defined by the card or by
interaction with an associated UI component defined by the card.
For example, a UI component may expose one or more text input boxes
corresponding to input parameters for a function defined by the
card.
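One simple way a searchable card library could be organized is
sketched below. The Card and CardLibrary names, the in-memory
dictionary storage, and the case-insensitive substring matching are
all assumptions for illustration; an actual embodiment may index and
search cards differently.

```python
from dataclasses import dataclass, field


@dataclass
class Card:
    name: str
    code: str
    description: str = ""
    tags: set = field(default_factory=set)


class CardLibrary:
    def __init__(self):
        self._cards = {}

    def add(self, card):
        """Store a card under its name, replacing any earlier version."""
        self._cards[card.name] = card

    def get(self, name):
        return self._cards[name]

    def search(self, query):
        """Return names of cards whose description or tags contain the query."""
        q = query.lower()
        return sorted(
            name
            for name, card in self._cards.items()
            if q in card.description.lower()
            or any(q in tag.lower() for tag in card.tags)
        )


library = CardLibrary()
library.add(Card("distance", "def distance(a, b): ...",
                 "Distance between two points", {"geo", "math"}))
library.add(Card("dedupe", "def dedupe(rows): ...",
                 "Remove duplicate rows", {"cleaning"}))
```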
[0019] One potential benefit of enabling re-use of cards in the
workflow is that cards can be built on top of each other and saved
into new cards, which include or reference program code defined in
previous cards. Then, an end user can request to import a program
code card into a programming session or workflow. The programming
notebook system then imports the program code card into the
programming session or workflow, such that the end user can
execute, by providing user input to the programming notebook user
interface, a unit of program code associated with the program code
card.
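The layering of cards on top of previously saved cards can be
illustrated with the sketch below, in which importing a card first
imports the cards it references and then executes its code in the
session namespace. The dictionary layout, the "requires" field, and
the function name import_card are hypothetical; the card code is
assumed to be Python.

```python
def import_card(namespace, cards, name, _seen=None):
    """Execute a card's code in `namespace`, importing referenced cards first."""
    seen = set() if _seen is None else _seen
    if name in seen:
        # Already imported during this call; also guards against cycles.
        return
    seen.add(name)
    card = cards[name]
    for dependency in card.get("requires", []):
        import_card(namespace, cards, dependency, seen)
    exec(card["code"], namespace)


cards = {
    "square": {
        "code": "def square(x):\n    return x * x",
    },
    "sum_squares": {
        "requires": ["square"],
        "code": "def sum_squares(xs):\n    return sum(square(x) for x in xs)",
    },
}

session_namespace = {}
import_card(session_namespace, cards, "sum_squares")
result = session_namespace["sum_squares"]([1, 2, 3])
```

After the import, both the new card's function and the function from
the earlier card it builds on are available in the session.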
[0020] Another feature provided by the programming notebook system
described herein is enhanced output results which are provided
based on introspection of the output results. For example,
additional output results may be suggested based on data attributes
associated with the output results and/or input parameters.
Additional output results might include one or more interactive
data visualization thumbnail images and UI controls presenting the
output results, or a portion thereof, in various different formats
(e.g., a time series, a histogram, a table, a heat map, etc.).
Whether and which data visualization thumbnail images and UI
controls are displayed may be based on the attributes and/or values
of the data output. For example, if data attributes or values
indicate the data includes map coordinates, a geographic map data
visualization thumbnail image and UI control may be displayed. Or,
in another example, if data attributes or values indicate the data
includes dates and times, a time series data visualization
thumbnail image and UI control may be displayed. Each interactive
data visualization thumbnail image and UI control may be generated
based on the actual output results and may be fully interactive such
that, for example, if the developer selects the thumbnail, a
corresponding full or larger size data visualization may be displayed
in the programming notebook user interface. Among other benefits, this
proactive prediction of data visualizations which may be relevant
or useful to the developer can help streamline and improve the
developer's workflow. For example, being able to quickly review
output results can aid the developer in determining whether the
program code used to generate the output results may need to be
modified and in what ways in order to move closer to or achieve a
desired result.
[0021] In one example, an introspection algorithm may be
implemented as follows. A visualization may define a set of rules
defining what the visualization requires of the underlying data.
For example, for a heat map, the rules may specify that (1) the
data must be in a tabular format, and (2) the table of data must
contain two numeric columns, which are in correct, specified ranges
for latitude and longitude (e.g., from negative 90 to positive 90
degrees, from negative 180 to positive 180 degrees). Or, as another
example, for a line chart, the rules may specify that, in order to
correctly plot a line, (1) if the data is a list of scalars, then
the data must contain a set of numbers, or (2) if the data is a
list of pairs, then the value of the first coordinate of each pair
must be increasing over the list of pairs. Or, as yet another
example, for a timeseries plot (e.g., one or more line charts
overlaid with a time axis), the rules may specify that (1) the data
must be in a tabular format and (2) one column in the table must be
in a date-time format or parseable to a date-time format (e.g., an
ISO 8601 date format). The rules for each data visualization may
then be responsible for providing the data in a normalized form
(e.g., for a timeseries, explicit ticks for values on the time axis
may be specified). In addition, the rules may be "fuzzy" such that
data can be matched to the rules (or the data may satisfy the
criteria specified by the rules) in differing degrees. The
visualizations may then be ranked based on the degree of match
between the data and the rules. (For example, for a heat map, if
the names of the columns identified in the table as providing
latitude and longitude contain the substrings "lat" or "lon" then
this may increase the confidence that a heat map is a valid
visualization). The foregoing provides but one example of an
introspection algorithm; other approaches may also be used,
including various machine learning algorithms which may, for
example, be implemented to train or learn over time which
particular thumbnail a user actually picks.
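The rule-based introspection described above can be sketched as a set
of scoring functions, one per visualization, whose results are then
ranked. This is a minimal sketch under stated assumptions: tabular
data is assumed to be a dictionary mapping string column names to
lists of values, the numeric scores (0.5 base, 0.25 per matching
column name) are arbitrary illustrative weights, and all function
names are hypothetical.

```python
from datetime import datetime


def _is_number(value):
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def _all_in_range(values, low, high):
    return bool(values) and all(_is_number(v) and low <= v <= high for v in values)


def _is_datetime_like(value):
    if isinstance(value, datetime):
        return True
    if isinstance(value, str):
        try:
            # Accepts a subset of ISO 8601; coverage varies by Python version.
            datetime.fromisoformat(value)
            return True
        except ValueError:
            return False
    return False


def score_heat_map(data):
    """Tabular data with a latitude-range and a longitude-range column."""
    if not isinstance(data, dict):
        return 0.0
    lat_cols = [c for c, v in data.items() if _all_in_range(v, -90, 90)]
    lon_cols = [c for c, v in data.items() if _all_in_range(v, -180, 180)]
    pairs = [(a, b) for a in lat_cols for b in lon_cols if a != b]
    if not pairs:
        return 0.0
    # Fuzzy part: column names containing "lat" / "lon" raise confidence.
    return max(
        0.5 + 0.25 * ("lat" in a.lower()) + 0.25 * ("lon" in b.lower())
        for a, b in pairs
    )


def score_time_series(data):
    """Tabular data with at least one date-time (or parseable) column."""
    if not isinstance(data, dict):
        return 0.0
    for values in data.values():
        if values and all(_is_datetime_like(v) for v in values):
            return 1.0
    return 0.0


def score_line_chart(data):
    """A list of numbers, or a list of pairs with increasing first coordinate."""
    if not isinstance(data, list) or not data:
        return 0.0
    if all(_is_number(v) for v in data):
        return 1.0
    if all(isinstance(p, tuple) and len(p) == 2 for p in data):
        firsts = [p[0] for p in data]
        if all(a < b for a, b in zip(firsts, firsts[1:])):
            return 1.0
    return 0.0


RULES = {
    "heat map": score_heat_map,
    "time series": score_time_series,
    "line chart": score_line_chart,
}


def rank_visualizations(data):
    """Return (name, score) pairs with score > 0, best match first."""
    scored = [(name, rule(data)) for name, rule in RULES.items()]
    matches = [(name, score) for name, score in scored if score > 0]
    return sorted(matches, key=lambda item: (-item[1], item[0]))
```

In this sketch, two numeric columns in the latitude and longitude
ranges yield a partial heat map match, and the match strengthens when
the column names also contain "lat" and "lon". Ties are broken
alphabetically by visualization name.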
[0022] Embodiments of the disclosure will now be described with
reference to the accompanying figures, wherein like numerals refer
to like elements throughout. The terminology used in the
description presented herein is not intended to be interpreted in
any limited or restrictive manner, simply because it is being
utilized in conjunction with a detailed description of certain
specific embodiments of the disclosure. Furthermore, embodiments of
the disclosure may include several novel features, no single one of
which is solely responsible for its desirable attributes or which
is essential to practicing the embodiments of the disclosure herein
described.
[0023] For purposes of this disclosure, certain aspects,
advantages, and novel features of various embodiments are described
herein. It is to be understood that not necessarily all such
advantages may be achieved in accordance with any particular
embodiment of the invention. Thus, for example, those skilled in
the art will recognize that one embodiment may be carried out in a
manner that achieves one advantage or group of advantages as taught
herein without necessarily achieving other advantages as may be
taught or suggested herein.
Example User Interfaces
[0024] FIGS. 1, 2, and 3 illustrate example programming notebook
user interfaces, as used in one or more embodiments of the
programming notebook system 100 of FIG. 6. The sample user
interfaces may be displayed, for example, via a web browser (e.g.,
as a web page), a mobile application, or a standalone application.
However, in some embodiments, the sample user interfaces shown in
FIGS. 1, 2, and 3 may also be displayed on any suitable computer
device, such as a cell/smart phone, tablet, wearable computing
device, portable/mobile computing device, desktop, laptop, or
personal computer, and are not limited to the samples as described
herein. The user interfaces include examples of only certain
features that a programming notebook system may provide. In other
embodiments, additional features may be provided, and they may be
provided using various different user interfaces and software code.
Depending on the embodiment, the user interfaces and functionality
described with reference to FIGS. 1, 2, and 3 may be provided by
software executing on the individual's computing device, by a
programming notebook system located remotely that is in
communication with the computing device via one or more networks,
and/or some combination of software executing on the computing
device and the programming notebook system. In other embodiments,
analogous interfaces may be presented using audio or other forms of
communication. In an embodiment, the interfaces shown in FIGS. 1,
2, and 3 are configured to be interactive and respond to various
user interactions. Such user interactions may include clicks with a
mouse, typing with a keyboard, touches and/or gestures on a touch
screen, voice commands, physical gestures made within a proximity
of a user interface, and/or the like.
[0025] FIG. 1 illustrates an example programming
workflow user interface 1000 for a programming notebook including a
program code card component, as generated using one embodiment of
the programming notebook system of FIG. 6. The programming workflow
user interface 1000 may be displayed in association with a
programming session and enable a developer to quickly and
interactively compose and execute lines of program code and view
associated output results.
[0026] The programming workflow user interface 1000 of FIG. 1
includes several illustrative program cells. Program cell 102
includes two lines of program code and respective printouts of
results from execution of the two lines of program code. Each
program cell may also include, for example, an execution time
status indicator to indicate how long the program code took to
execute. Each program cell may also include a submenu of available
actions with respect to the program cell, which may include for
example a label descriptor to describe a type of code or mode
associated with the program cell (e.g., a programming language
indicator such as "Scala," "python," "HTML," "JavaScript,"
"markdown," "sql," and so on; a card mode indicator to indicate
that the program cell is being used to display or invoke a card;
and other similar types of modes). The program cell submenu may
also include an option to execute the program code in the cell, an
option to edit the cell (e.g., the developer may return to a
previous cell, edit the program cell contents, and re-run the
program cell to produce updated output results); an option to
expand the size of the input box (e.g., if the developer wishes to
add more lines of code than available space allows); and an option
to delete, remove, or otherwise discard the program cell from the
current session.
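As a brief illustration of how an execution time status indicator
could be populated, the sketch below measures the wall-clock time
taken to execute a cell's code. The function name run_cell is
hypothetical and the cell code is assumed to be Python.

```python
import time


def run_cell(code, namespace):
    """Execute a cell's code and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    exec(code, namespace)
    return time.perf_counter() - start


cell_namespace = {}
elapsed = run_cell("total = sum(range(10))", cell_namespace)
status = f"Took {elapsed:.3f} s"
```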
[0027] Program cell 104 represents a program code card which has
been added to the current session and invoked by the developer to
call a particular function defined by the logic in the program code
card. In this example the card in program cell 104 has an
associated UI control 105, which may optionally be specified by a
user when the card is created or edited. For example, the card
included at program cell 104 displays a function label; three text
input boxes for the input parameters used by the function; and a
function call preview displaying the function to be called with the
provided input parameters (e.g., "cardFunction(input1,input2)")
when the program cell is executed. The UI control 105 for the card
at program cell 104 also includes a "Run" button which the
developer can select in order to run the card's program code. In
some embodiments more advanced UI controls and inputs may be
provided for the card, as specified by the developer using the card
editor user interface (such as the one shown in FIG. 2). In other
embodiments, no UI control 105 may be provided and the developer
using the card may type the function call and input parameters
(e.g., "card.cardFunction(input1,input2)") directly into the
default text box provided by the program cell.
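As a rough illustration of the function call preview described above, the following Python sketch assembles the preview string from a function name and the values typed into the card's input boxes. The helper name `call_preview` and its `card_prefix` parameter are hypothetical and are not part of the described system; the sketch only shows one way such a preview could be formed.

```python
from typing import Sequence


def call_preview(function_name: str, inputs: Sequence[str], card_prefix: str = "") -> str:
    """Build the text of the call that would run when the cell is executed.

    `card_prefix` is a hypothetical qualifier used when the developer types
    the call directly into the cell instead of using a UI control.
    """
    target = f"{card_prefix}.{function_name}" if card_prefix else function_name
    return f"{target}({','.join(inputs)})"


# Preview as it might appear inside a card's UI control.
ui_preview = call_preview("cardFunction", ["input1", "input2"])
# Equivalent call typed into the default text box of a program cell.
typed_call = call_preview("cardFunction", ["input1", "input2"], card_prefix="card")
print(ui_preview)   # cardFunction(input1,input2)
print(typed_call)   # card.cardFunction(input1,input2)
```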
[0028] The programming workflow user interface 1000 may include a
main menu 107 providing additional features for the programming
notebook. For example, the programming notebook system 100 may
support or enable the developer to launch multiple sessions and
switch between them. Thus, a "new session" option may cause the
programming notebook system 100 to display a clean copy of the
programming workflow user interface 1000 (e.g., including one empty
program cell for the developer to begin a new session workflow).
Additionally, a "change session" option may allow the developer to
switch between multiple active sessions. User selection of this
option may cause the programming notebook system 100 to display a
copy of the programming workflow user interface 1000 for the
changed-to session (e.g., including any program cell(s) and result
outputs associated with the changed-to session for the developer to
continue the changed-to session workflow).
[0029] The main menu 107 in FIG. 1 may also include a transcript
option for the developer to view a transcript of program code for
the current session. The transcript can include an aggregate
listing of the lines of program code input by the developer across
all program cells for the current session, without the result
outputs associated with each program cell. The transcript may be
generated to automatically insert program cell identifiers as code
comments to identify or delimit the contents of each respective
program cell (e.g., "// cell 1," "// cell 2," etc.). In certain
embodiments in which the developer has imported a program code card
in the programming workflow user interface, the transcript view may
be generated to include (1) the function or code used to invoke the
underlying code associated with the card, (2) the underlying code,
and/or (3) an option to toggle or switch between (1) and (2).
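The transcript generation described above can be sketched as follows. This is a minimal Python illustration under stated assumptions: the `Cell` class and the `build_transcript` function are hypothetical names, and a session is assumed to be held as a simple ordered list of cells.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cell:
    """One program cell: the code the developer typed and its printed result."""
    code: str
    output: str = ""


def build_transcript(cells: List[Cell], comment_prefix: str = "//") -> str:
    """Aggregate the cell inputs into one listing, omitting result outputs.

    Each cell is delimited by an identifier comment such as "// cell 1".
    """
    lines = []
    for number, cell in enumerate(cells, start=1):
        lines.append(f"{comment_prefix} cell {number}")
        lines.append(cell.code)
    return "\n".join(lines)


session = [
    Cell(code="val x = 1 + 1", output="x: Int = 2"),
    Cell(code="println(x)", output="2"),
]
transcript = build_transcript(session)
print(transcript)
```

The comment prefix is a parameter because the appropriate comment syntax depends on the language of the cells (for example, "#" for Python cells).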
[0030] As the developer progresses through a workflow and the
number of program cells used in the current session increases, the
transcript may be updated to reflect the current session. If a
developer returns to or re-uses a program cell, the program cell
identifiers may indicate this in some manner, such as with a
revised identification, a timestamp, or another marker. For instance, a
revised and/or re-executed program cell may be added as a new cell
in the session history and associated transcript.
[0031] The main menu in FIG. 1 may also include a card menu option
for the developer to create a new card from program code used in
the current session, and/or to search for and add cards from a
library of cards which have been created by the developer and/or
other developers using the programming notebook system of FIG. 6.
An example card editor user interface is illustrated and described
in more detail with reference to FIG. 2. Saved cards may be stored,
searched, and accessed, for example, from the program code card
repository 170. Cards may be searched based on a name, a
description, and/or one or more tags, as well as other searchable
attributes (including in some cases the underlying code associated
with the card) provided by the developer when the card is created
or edited.
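One possible way to search saved cards by name, description, tags, and optionally the underlying code is sketched below in Python. The `Card` class, the `search_cards` function, and the sample cards are all hypothetical; a simple case-insensitive substring match is assumed, and an actual repository might use an index or a database query instead.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Card:
    """Hypothetical stored card with its searchable attributes."""
    name: str
    description: str
    tags: List[str] = field(default_factory=list)
    code: str = ""


def search_cards(cards: List[Card], query: str, include_code: bool = False) -> List[Card]:
    """Return cards whose name, description, or tags contain the query.

    When include_code is True, the underlying code is searched as well.
    """
    needle = query.lower()
    matches = []
    for card in cards:
        haystacks = [card.name, card.description, *card.tags]
        if include_code:
            haystacks.append(card.code)
        if any(needle in text.lower() for text in haystacks):
            matches.append(card)
    return matches


library = [
    Card("geoJoin", "Join events to city boundaries", ["geo", "sql"],
         "def geoJoin(events, cities): ..."),
    Card("dailyCounts", "Count rows per day", ["time series"],
         "def dailyCounts(df): ..."),
]
by_tag = search_cards(library, "geo")
by_code = search_cards(library, "df", include_code=True)
```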
[0032] The programming workflow user interface 1000 may also
include a session UI panel 106 which lists, for the current
session, variables and their current values and/or functions which
have been defined. For example, the session panel 106 listing
includes, among other things, variables for "sc" and "sq" and a sql
function "<function1>." The programming workflow user
interface 1000 may also include a REPL UI panel which lists a
summary of result outputs for the current session. The session
panel and the REPL UI panels provide the developer with useful
at-a-glance information to aid the developer's workflow and code
refinement process.
[0033] FIG. 2 illustrates an example card editor user interface
2000 for a programming notebook, as generated using one embodiment
of the programming notebook system of FIG. 6. The card editor user
interface 2000 may be displayed in response to the developer's
selection of an option to create a new card based on the current
session. The card editor user interface 2000 may be initialized by
the programming notebook system 100 to include program code content
associated with the current session in the main programming
workflow UI. In particular, the program code may be accessed from
the transcript associated with the current session and displayed in
a first user-editable text area 202 by which the developer can edit
the program code (e.g., add or remove lines, insert code comments
and specifications, and other general code clean up and
maintenance). In one embodiment the entire contents of the current
session transcript may be displayed in the card editor for the
developer to edit. In another embodiment, the main programming
workflow user interface may include user-selectable options for the
developer to select one or more program code cells from which to
extract program code for the card editor.
[0034] The card editor user interface 2000 may also include a
second user-editable text area 204 by which the developer may
optionally provide user interface code to be associated with the
program code logic of the card. For example, as shown in FIG. 2,
HTML and JavaScript may be input by the developer to generate a
web-based UI component for the card. Then, when the card is used in
subsequent programming notebook sessions, the programming notebook
system 100 can interpret the UI code for the card in order to cause
display of the UI component directly within the programming
notebook UI (for example, as shown in program cell 104 of the
programming workflow user interface in FIG. 1).
[0035] The card editor user interface 2000 also provides options
for the developer to provide a description and one or more tags to
be associated with the card when it is saved. The description
and/or tags may be searchable by other users of the programming
notebook system 100 to facilitate re-use of cards across many
developer sessions. When the developer is satisfied with the card's
settings, a save option may be selected in order to save the card,
for example in the program code card repository 170. After the card
is saved, the card editor user interface 2000 may be closed
(automatically or manually) and the developer can return to the
main programming workflow UI. If the developer desires, one or more
program cells (for example, those that were used as inputs to the
card) may be removed from the workflow by using the respective
delete options. However, in some embodiments, a deleted program cell
may nonetheless remain in the transcript. That is, in some
instances, the transcript is implemented as an immutable log of
everything that happens in the workflow, including deletion events.
Thus, deletion of a program cell may be more like hiding, in that
the transcript still maintains a copy of the deleted cell. Then,
the transcript view could be augmented to show or indicate when an
entry in the transcript no longer exists in the main programming
workflow UI (e.g., this may be visually indicated to the user via
some formatting change, an icon, and so on).
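The immutable-log behavior described above can be sketched in Python as an append-only list of events, where deleting a cell adds a deletion event rather than erasing the earlier entry. The `Event` and `TranscriptLog` names are hypothetical, and the sketch assumes cells are identified by integer ids.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass(frozen=True)
class Event:
    """A single immutable entry in the log."""
    kind: str       # "execute" or "delete"
    cell_id: int
    code: str = ""


class TranscriptLog:
    """Append-only log: entries are added but never modified or removed."""

    def __init__(self) -> None:
        self._events: List[Event] = []

    def record_execute(self, cell_id: int, code: str) -> None:
        self._events.append(Event("execute", cell_id, code))

    def record_delete(self, cell_id: int) -> None:
        # Deletion is itself an event; the original entry is kept.
        self._events.append(Event("delete", cell_id))

    def events(self) -> Tuple[Event, ...]:
        return tuple(self._events)

    def deleted_ids(self) -> Set[int]:
        return {e.cell_id for e in self._events if e.kind == "delete"}

    def view(self) -> List[Tuple[int, str, bool]]:
        """List (cell_id, code, no_longer_in_workflow) for each executed cell."""
        deleted = self.deleted_ids()
        return [(e.cell_id, e.code, e.cell_id in deleted)
                for e in self._events if e.kind == "execute"]


log = TranscriptLog()
log.record_execute(1, "a = 1")
log.record_execute(2, "b = a + 1")
log.record_delete(1)
```

The boolean flag in `view()` is what a transcript view could use to apply a formatting change or icon to entries that no longer exist in the main workflow.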
[0036] FIG. 3 illustrates an example programming workflow user
interface 3000 for a programming notebook including one or more
automatically generated data visualizations, such as example
visualizations 302A, 302B, and 302C of FIG. 3, as generated using
one embodiment of the programming notebook system of FIG. 6. The
data visualizations may be of particular benefit to the developer
in the context of database queries in order to quickly view query
results and assess whether the query needs to be revised or tweaked
to improve the quality of the output results. The data
visualizations 302 (including 302A, 302B, and 302C) shown in user
interface 3000 are user-selectable image thumbnails or miniaturized
visualizations of the actual output results. In response to the
user selecting one of the data visualizations, a larger
corresponding version of the same data may be displayed in the main
workflow of the user interface 3000 to enable the developer to
explore the output results. A variety of data visualizations,
ranging from thumbnail to normal to large sized, may be generated
and displayed, including but not limited to time series,
histograms, tables, graphics, heat maps, and other types of data
charts and visualizations.
[0037] The data visualizations 302 shown in user interface 3000 may
be generated, for example, in accordance with the process 500
illustrated and described with reference to FIG. 5 herein. In
particular, the data visualizations may be automatically selected
for generation and generated based at least in part on an analysis
of the type of data returned with the output results. For example,
the programming notebook system 100 may analyze the output results,
determine that the output results include geographic data (such as
latitude and longitude coordinates), and generate a map data
visualization, such as visualization 302A, for display with the
output results in the main programming workflow user interface.
Examples of Processes Performed by Programming Notebook Systems
[0038] FIGS. 4 and 5 are flowcharts illustrating various
embodiments of programming notebook system processes. In some
implementations, the processes are performed by embodiments of the
programming notebook system 100 described with reference to FIG. 6
and/or by one of its components, such as the code
compiler and execution module 122, the program code card management
module 126, or the data (column) introspection module 128. For ease
of explanation, the following describes the services as performed
by the programming notebook system 100. The example scenarios are
intended to illustrate, but not to limit, various aspects of the
programming notebook system 100. In one embodiment, the processes
can be dynamic, with some procedures omitted and others added.
Generating Logical Units of Program Code
[0039] FIG. 4 is a flowchart illustrating one embodiment of a
process 400 for generating and storing logical units of program
code using a dynamic programming notebook user interface, as used
in one embodiment of the programming notebook system 100 of FIG. 6.
Depending on the embodiment, the method of FIG. 4 may include fewer
or additional blocks and the blocks may be performed in an order
that is different than illustrated.
[0040] The process 400 begins at block 405 where the programming
notebook system receives a request to execute user input program
code for a cell, such as input provided by a developer interacting
with the programming workflow user interface 1000. This aspect of
the process may be referred to as the "read" part of a
read-eval-print loop (REPL). The request to execute program code
may include one or more lines of program code of varying complexity
and may include operations such as, but not limited to, initiating
a database connection, submitting queries to the database, defining
variables and functions, inserting code comments and markup, and so
on. The programming notebook system 100 may be configured to
support a wide variety of programming languages, including but not
limited to Scala, Python, HTML, JavaScript, Ruby, and so on.
[0041] At block 410, the programming notebook system 100 executes
the program code associated with the request. This may be, for
example, the "eval" part of REPL. The program code received with
the request is evaluated and executed to produce output results.
The output results may include a wide range of programmatic outputs
including no output (e.g., a simple return), a Boolean value, a
variable, a value, search query results, and the like. As discussed
further herein, the output results may further include, or be
analyzed to include, one or more data visualizations which may be
of possible interest to the developer based on any inputs in the
program code and/or based on the output results.
[0042] At block 415, the programming notebook system 100 provides
the output results associated with execution of the program code,
e.g., as produced at block 410. This aspect of the process
corresponds to the "print" part of the REPL. The output results may
be presented or configured for presentation in the programming
workflow user interface 1000, for example below the program cell
used by the developer to input the program code for the
request.
[0043] At block 420 the programming notebook system 100 maintains
the session history of program code cell execution requests and the
associated output results. The session history may be maintained
and used by the programming notebook system 100 in memory 130
(e.g., for the duration of the current session or as the developer
switches between multiple sessions) or stored for later access and
retrieval (e.g., in one of the other data sources 172 of FIG. 6).
The session history may be used to, for example, generate a
transcript of the current session in response to the user's
selection of the view transcript option (see, e.g., FIG. 1). The
session history may be used to generate or initialize a card editor
user interface, as further described below.
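Blocks 405 through 420 can be illustrated with a minimal Python sketch of a read-eval-print loop that also records a session history. The `Session` class is hypothetical, and the sketch assumes that cells contain Python code evaluated with the built-in `exec`; the described system may compile and execute code in other languages and by other means.

```python
import contextlib
import io
from typing import Dict, List, Tuple


class Session:
    """Hypothetical notebook session: shared namespace plus cell history."""

    def __init__(self) -> None:
        self.namespace: Dict[str, object] = {}
        self.history: List[Tuple[str, str]] = []   # (code, output) per request

    def run_cell(self, code: str) -> str:
        # Block 405 ("read"): `code` is the program code received for the cell.
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            try:
                # Block 410 ("eval"): execute in the session's namespace so
                # that variables persist from one cell to the next.
                exec(code, self.namespace)
            except Exception as error:
                print(f"{type(error).__name__}: {error}")
        # Block 415 ("print"): the captured output is returned for display.
        output = buffer.getvalue()
        # Block 420: maintain the session history of requests and outputs.
        self.history.append((code, output))
        return output


session = Session()
first = session.run_cell("x = 2 + 3")
second = session.run_cell("print(x * 2)")
```

Because each request and its output are appended to `history`, the same structure could later be used to build a transcript or to initialize a card editor.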
[0044] At block 425 the programming notebook system 100 determines
whether a request to generate a program code card has been
received. If such a request has not been received, then the process
400 may return to block 405 and continue processing program code
execution requests in the REPL from blocks 405 to 415 as many times
as the developer would like.
[0045] If a request to generate a program code card has
been received, the process may proceed to block 430. At block 430,
the maintained session history is provided to allow the user to
select and/or edit program code for the program code card. For
example, the maintained session history may include all program
code, organized by respective cells, which the user provides as
input for the current session. The program code may be displayed,
for example, as a listing of program code in a user-editable text
area within a card editor user interface, such as the card editor
user interface 2000 of FIG. 2. As discussed with reference to FIG.
2, the card editor UI may also enable the user to add additional
program code for an associated UI component for the card.
[0046] When the user has completed editing of the program code,
associated UI code, description, and/or tags, she can select the
"Save" (or similar) option. In response, at block 435, the
programming notebook system 100 generates the program code card
comprising one or more user selected and/or edited lines of program
code. Once the program code card has been generated, the programming
notebook system 100 can store the program code card, for example at
the program code card repository 170.
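Blocks 430 and 435 can be sketched in Python as follows. The function `generate_card`, the class `InMemoryRepository`, and the card field names are all hypothetical; the sketch assumes the session history is available as a list of per-cell code strings and uses an in-memory dictionary with JSON serialization as a stand-in for the program code card repository.

```python
import json
from typing import Dict, List, Optional


def generate_card(history: List[str], selected: List[int], name: str,
                  description: str = "", tags: Optional[List[str]] = None,
                  ui_code: str = "") -> Dict[str, object]:
    """Build a card from the code of the selected cells (0-based indexes)."""
    code = "\n".join(history[index] for index in selected)
    return {
        "name": name,
        "description": description,
        "tags": list(tags or []),
        "code": code,
        "ui_code": ui_code,
    }


class InMemoryRepository:
    """Stand-in for a card repository; stores each card as a JSON string."""

    def __init__(self) -> None:
        self._cards: Dict[str, str] = {}

    def save(self, card: Dict[str, object]) -> None:
        self._cards[str(card["name"])] = json.dumps(card)

    def load(self, name: str) -> Dict[str, object]:
        return json.loads(self._cards[name])


history = ["conn = connect()", "print('debug')", "rows = conn.query('select 1')"]
card = generate_card(history, selected=[0, 2], name="dbConnect",
                     description="Open a connection and run a test query",
                     tags=["sql"])
repository = InMemoryRepository()
repository.save(card)
```

Selecting cells 0 and 2 here leaves the debugging cell out of the card, which mirrors the developer's ability to select and edit lines before saving.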
Determining Data Visualizations to Provide with Program Code Output
Results
[0047] FIG. 5 is a flowchart illustrating one embodiment of a
process 500 for automatically determining one or more data
visualizations to provide with output results generated in response
to a program code execution request, as used in one embodiment of
the programming notebook system 100 of FIG. 6. Depending on the
embodiment, the method of FIG. 5 may include fewer or additional
blocks and the blocks may be performed in an order that is
different than illustrated.
[0048] The process 500 begins at block 505 where the programming
notebook system 100 analyzes parameters associated with the program
code provided by the user at a cell in the main programming
workflow UI to determine one or more potential data type attributes
associated with the input parameters. For example, the program code
to be executed might include one or more input parameters of a
particular data type which may suggest what type of data the output
results will be.
[0049] At block 510, the programming notebook system 100 analyzes
the output results associated with execution of the program code
for the cell to determine one or more potential data type
attributes associated with the output results. For example, if the
output results comprise a table of data (e.g., rows and columns)
then the columns and/or data values may be analyzed to identify the
type of data associated with each column. For example, if the
output results table of data includes column headers, these headers
may contain contextual information to indicate the type of data
(e.g., a column labeled with the word "DATE" is likely to be a date
data attribute, a column labeled with the word "CITY" is likely to
be a geographical data attribute, and so on). Further, the output
results data table values may be parsed and analyzed to determine
probable data types, in particular if no column headings or other
metadata is available. For example, values in the format
"####-##-##" may be analyzed and interpreted by the programming
notebook system 100 to indicate that the value is likely to be a
date data attribute. In other examples standard formats may be
analyzed and compared to results data to identify probable matches
or data types, including latitude and longitude coordinates,
geographic abbreviations, and special symbols which may indicate the
data type (e.g., currency symbols).
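The column analysis of block 510 can be illustrated with a minimal Python sketch that first checks the column header for a hint word and then falls back to matching the values against known formats. The hint words, the regular expressions, the type labels, and the function name `infer_column_type` are assumptions chosen for illustration; only the "DATE" and "CITY" header examples and the "####-##-##" date format come from the description above.

```python
import re
from typing import List, Optional

# Header words mapped to a probable data type (illustrative, not exhaustive).
HEADER_HINTS = {
    "DATE": "date",
    "CITY": "geographic",
    "LATITUDE": "geographic",
    "LONGITUDE": "geographic",
}

# Value formats checked when the header gives no hint.
VALUE_PATTERNS = [
    ("date", re.compile(r"^\d{4}-\d{2}-\d{2}$")),            # "####-##-##"
    ("currency", re.compile(r"^[$€£]\s?\d[\d,]*(\.\d+)?$")),  # currency symbol
]


def infer_column_type(header: Optional[str], values: List[str]) -> str:
    """Guess a column's probable data type from its header, then its values."""
    if header:
        for word in re.split(r"[^A-Za-z0-9]+", header.upper()):
            if word in HEADER_HINTS:
                return HEADER_HINTS[word]
    for label, pattern in VALUE_PATTERNS:
        if values and all(pattern.match(value) for value in values):
            return label
    return "unknown"


from_header = infer_column_type("ORDER_DATE", ["n/a"])
from_values = infer_column_type(None, ["2021-02-18", "2020-10-14"])
from_symbol = infer_column_type(None, ["$1,200.50", "$3"])
```

Checking the header before the values reflects the description's suggestion that value parsing is most useful when no column headings or other metadata are available.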
[0050] At block 515, the programming notebook system 100 generates
one or more interactive data visualization thumbnails based on the
potential data types identified at blocks 505 and 510. The data
visualization thumbnails may include one or more of a time series,
a histogram, a table, a heat map, a geographic map, a scatter plot,
a line graph, a pie chart, or any other type of data visualization.
Whether and which data visualization thumbnails are selected may
depend on the detected data types (and/or probable data types)
and/or combinations of data types. For example, if geographic data
types are identified, a geographic map may be generated as one of
the data visualizations. Or, if date and time data types are
identified, a time series or a calendar may be generated as data
visualizations. The data visualizations may be generated based on
the actual output results to provide an accurate view of the
data.
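The selection step of block 515 can be sketched in Python as a mapping from detected data types to candidate visualizations. The function name `choose_visualizations`, the type labels, and the choice to always include a table are assumptions made for illustration; the described system may weigh combinations of data types differently.

```python
from typing import Dict, List


def choose_visualizations(column_types: Dict[str, str]) -> List[str]:
    """Pick thumbnail types from the probable data type of each column."""
    detected = set(column_types.values())
    choices = ["table"]                 # assumed default for tabular results
    if "geographic" in detected:
        choices.append("geographic map")
    if "date" in detected:
        choices.append("time series")
    if "numeric" in detected:
        choices.append("histogram")
    return choices


thumbnails = choose_visualizations(
    {"ORDER_DATE": "date", "CITY": "geographic", "NOTE": "unknown"}
)
print(thumbnails)   # ['table', 'geographic map', 'time series']
```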
[0051] At block 520, the programming notebook system 100 provides
the interactive data visualization thumbnails with the output
results. The interactive data visualizations may then be displayed
with the output results, for example in the programming workflow
user interface 3000 above or below the program cell used by the
developer to input the program code for the request. The data
visualizations may be configured to respond to user interaction by,
for example, causing display of a larger non-thumbnail version of
the data visualization in the programming workflow user interface
3000. The larger non-thumbnail version may be fully interactive and
support user functionality such as zooming in or out, manipulating
parameters, selecting portions of the visualization to filter
results, and so on.
Example System Implementation and Architecture
[0052] FIG. 6 is a block diagram of one embodiment of a programming
notebook system 100 in communication with a network 160 and various
systems, such as client computing system(s) 168, program code card
repository 170, and/or other data source(s) 172. The programming
notebook system 100 may be used to implement systems and methods
described herein, including, but not limited to, the process 400
of FIG. 4 and the process 500 of FIG. 5.
Programming Notebook System
[0053] In the embodiment of FIG. 6, the programming notebook system
100 includes a code compiler and execution module 122, a program
code card management module 126, a data (column) introspection
module 128, and a user interface module 124 that may be stored in
the mass storage device 120 as executable software codes that are
executed by the CPU 150. These and other modules in the programming
notebook system 100 may include, by way of example, components,
such as software components, object-oriented software components,
class components and task components, processes, functions,
attributes, procedures, subroutines, segments of program code,
drivers, firmware, microcode, circuitry, data, databases, data
structures, tables, arrays, and variables. In the embodiment shown
in FIG. 6, the programming notebook system 100 is configured to
execute the modules recited above to perform the various methods
and/or processes herein (such as the processes described with
respect to FIGS. 4 and 5 herein).
[0054] The code compiler and execution module 122 provides
capabilities related to execution of program code associated with
requests received by the programming notebook system 100, for
example as described by the process 400 of FIG. 4. The program code
card management module 126 provides capabilities related to storing
and searching program code cards, some aspects of which are
described by the process 400 of FIG. 4 and/or the user interface
1000 of FIG. 1. The data (column) introspection module 128 provides
capabilities related to analyzing inputs and outputs associated
with executed program code to automatically identify potential data
visualizations which may be of use to the end user, for example as
described by the process 500 of FIG. 5. The user interface module
124 provides capabilities related to generation and presentation of
one or more user interfaces, such as the sample user interfaces
illustrated with reference to FIGS. 1, 2, and 3 herein.
[0055] The programming notebook system 100 includes, for example, a
server, workstation, or other computing device. In one embodiment,
the exemplary programming notebook system 100 includes one or more
central processing units ("CPU") 150, which may each include a
conventional or proprietary microprocessor. The programming
notebook system 100 further includes one or more memories 130, such
as random access memory ("RAM") for temporary storage of
information, one or more read only memories ("ROM") for permanent
storage of information, and one or more mass storage devices 120,
such as a hard drive, diskette, solid state drive, or optical media
storage device. Typically, the modules of the programming notebook
system 100 are connected to the computer using a standards-based bus
system. In different embodiments, the standards-based bus system
could be implemented in Peripheral Component Interconnect ("PCI"),
Microchannel, Small Computer System Interface ("SCSI"), Industrial
Standard Architecture ("ISA"), and Extended ISA ("EISA")
architectures, for example. In addition, the functionality provided
for in the components and modules of programming notebook system
100 may be combined into fewer components and modules or further
separated into additional components and modules.
[0056] The programming notebook system 100 is generally controlled
and coordinated by operating system software, such as Windows XP,
Windows Vista, Windows 7, Windows 8, Windows Server, UNIX, Linux,
SunOS, Solaris, iOS, Blackberry OS, or other compatible operating
systems. In Macintosh systems, the operating system may be any
available operating system, such as Mac OS X. In other embodiments,
the programming notebook system 100 may be controlled by a
proprietary operating system. Conventional operating systems
control and schedule computer processes for execution, perform
memory management, provide file system, networking, I/O services,
and provide a user interface, such as a graphical user interface
("GUI"), among other things.
[0057] The exemplary programming notebook system 100 may include
one or more commonly available input/output (I/O) devices and
interfaces 110, such as a keyboard, mouse, touchpad, and printer.
In one embodiment, the I/O devices and interfaces 110 include one
or more display devices, such as a monitor, that allows the visual
presentation of data to a user. More particularly, a display device
provides for the presentation of GUIs, application software data,
and multimedia analytics, for example. The programming notebook
system 100 may also include one or more multimedia devices 140,
such as speakers, video cards, graphics accelerators, and
microphones, for example.
Network
[0058] In the embodiment of FIG. 6, the I/O devices and interfaces
110 provide a communication interface to various external devices.
In the embodiment of FIG. 6, the programming notebook system 100 is
electronically coupled to a network 160, which comprises one or
more of a LAN, WAN, and/or the Internet, for example, via a wired,
wireless, or combination of wired and wireless, communication link.
The network 160 communicates with various computing devices and/or
other electronic devices via wired or wireless communication
links.
[0059] According to FIG. 6, in some embodiments information may be
provided to or accessed by the programming notebook system 100 over
the network 160 from one or more program code card repositories 170
and/or other data source(s) 172. The program code card repository
170 may store, for example, logical units of program code (e.g.,
"cards") generated using the methods described herein. The program
code card repository 170 and/or other data source(s) 172 may
include one or more internal and/or external data sources. In some
embodiments, one or more of the databases or data sources may be
implemented using a relational database, such as Sybase, Oracle,
CodeBase, MySQL, and Microsoft® SQL Server, as well as other
types of databases such as, for example, a flat file database, an
entity-relationship database, an object-oriented database, and/or
a record-based database.
Other Embodiments
[0060] Each of the processes, methods, and algorithms described in
the preceding sections may be embodied in, and fully or partially
automated by, code modules executed by one or more computer systems
or computer processors comprising computer hardware. The code
modules may be stored on any type of non-transitory
computer-readable medium or computer storage device, such as hard
drives, solid state memory, optical disc, and/or the like. The
systems and modules may also be transmitted as generated data
signals (for example, as part of a carrier wave or other analog or
digital propagated signal) on a variety of computer-readable
transmission mediums, including wireless-based and
wired/cable-based mediums, and may take a variety of forms (for
example, as part of a single or multiplexed analog signal, or as
multiple discrete digital packets or frames). The processes and
algorithms may be implemented partially or wholly in
application-specific circuitry. The results of the disclosed
processes and process steps may be stored, persistently or
otherwise, in any type of non-transitory computer storage such as,
for example, volatile or non-volatile storage.
[0061] In general, the word "module," as used herein, refers to
logic embodied in hardware or firmware, or to a collection of
software instructions, possibly having entry and exit points,
written in a programming language, such as, for example, Java, Lua,
C or C++. A software module may be compiled and linked into an
executable program, installed in a dynamic link library, or may be
written in an interpreted programming language such as, for
example, BASIC, Perl, or Python. It will be appreciated that
software modules may be callable from other modules or from
themselves, and/or may be invoked in response to detected events or
interrupts. Software modules configured for execution on computing
devices may be provided on a computer readable medium, such as a
compact disc, digital video disc, flash drive, or any other
tangible medium. Such software code may be stored, partially or
fully, on a memory device of the executing computing device, such
as the programming notebook system 100, for execution by the
computing device. Software instructions may be embedded in
firmware, such as an EPROM. It will be further appreciated that
hardware modules may be comprised of connected logic units, such as
gates and flip-flops, and/or may be comprised of programmable
units, such as programmable gate arrays or processors. The modules
described herein are preferably implemented as software modules,
but may be represented in hardware or firmware. Generally, the
modules described herein refer to logical modules that may be
combined with other modules or divided into sub-modules despite
their physical organization or storage.
[0062] The various features and processes described above may be
used independently of one another, or may be combined in various
ways. All possible combinations and subcombinations are intended to
fall within the scope of this disclosure. In addition, certain
method or process blocks may be omitted in some implementations.
The methods and processes described herein are also not limited to
any particular sequence, and the blocks or states relating thereto
can be performed in other sequences that are appropriate. For
example, described blocks or states may be performed in an order
other than that specifically disclosed, or multiple blocks or
states may be combined in a single block or state. The example
blocks or states may be performed in serial, in parallel, or in
some other manner. Blocks or states may be added to or removed from
the disclosed example embodiments. The example systems and
components described herein may be configured differently than
described. For example, elements may be added to, removed from, or
rearranged compared to the disclosed example embodiments.
[0063] Conditional language used herein, such as, among others,
"can," "could," "might," "may," "for example," and the like, unless
specifically stated otherwise, or otherwise understood within the
context as used, is generally intended to convey that certain
embodiments include, while other embodiments do not include,
certain features, elements and/or steps. Thus, such conditional
language is not generally intended to imply that features, elements
and/or steps are in any way required for one or more embodiments or
that one or more embodiments necessarily include logic for
deciding, with or without author input or prompting, whether these
features, elements and/or steps are included or are to be performed
in any particular embodiment. The terms "comprising," "including,"
"having," and the like are synonymous and are used inclusively, in
an open-ended fashion, and do not exclude additional elements,
features, acts, operations, and so forth. Also, the term "or" is
used in its inclusive sense (and not in its exclusive sense) so
that when used, for example, to connect a list of elements, the
term "or" means one, some, or all of the elements in the list.
Conjunctive language such as the phrase "at least one of X, Y and
Z," unless specifically stated otherwise, is otherwise understood
with the context as used in general to convey that an item, term,
etc. may be either X, Y or Z. Thus, such conjunctive language is
not generally intended to imply that certain embodiments require at
least one of X, at least one of Y and at least one of Z to each be
present.
[0064] While certain example embodiments have been described, these
embodiments have been presented by way of example only, and are not
intended to limit the scope of the disclosure. Thus, nothing in the
foregoing description is intended to imply that any particular
element, feature, characteristic, step, module, or block is
necessary or indispensable. Indeed, the novel methods and systems
described herein may be embodied in a variety of other forms;
furthermore, various omissions, substitutions, and changes in the
form of the methods and systems described herein may be made
without departing from the spirit of the inventions disclosed
herein. The accompanying claims and their equivalents are intended
to cover such forms or modifications as would fall within the scope
and spirit of certain of the inventions disclosed herein.
[0065] Any process descriptions, elements, or blocks in the flow
diagrams described herein and/or depicted in the attached figures
should be understood as potentially representing modules, segments,
or portions of code which include one or more executable
instructions for implementing specific logical functions or steps
in the process. Alternate implementations are included within the
scope of the embodiments described herein in which elements or
functions may be deleted, executed out of order from that shown or
discussed, including substantially concurrently or in reverse
order, depending on the functionality involved, as would be
understood by those skilled in the art.
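As a purely illustrative sketch, and not part of the disclosed
embodiments, the following Python example shows two independent
blocks executed in the order shown, in reverse order, and
substantially concurrently, each yielding the same result. The
names block_a, block_b, run_in_order, run_reversed, and
run_concurrent are hypothetical and do not appear in the figures.

```python
from concurrent.futures import ThreadPoolExecutor


def block_a(x: int) -> int:
    """Hypothetical flow-diagram block: add one to the input."""
    return x + 1


def block_b(x: int) -> int:
    """Hypothetical flow-diagram block: double the input."""
    return x * 2


def run_in_order(x: int) -> dict:
    """Execute block A, then block B."""
    a = block_a(x)
    b = block_b(x)
    return {"a": a, "b": b}


def run_reversed(x: int) -> dict:
    """Execute block B, then block A (reverse order)."""
    b = block_b(x)
    a = block_a(x)
    return {"a": a, "b": b}


def run_concurrent(x: int) -> dict:
    """Submit both blocks to a thread pool so they may overlap in time."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        future_a = pool.submit(block_a, x)
        future_b = pool.submit(block_b, x)
        return {"a": future_a.result(), "b": future_b.result()}


if __name__ == "__main__":
    # All three orderings produce the same result because the
    # blocks do not depend on each other's output.
    print(run_in_order(3), run_reversed(3), run_concurrent(3))
```

The orderings are interchangeable here only because neither
hypothetical block consumes the other's output; blocks with data
dependencies would constrain the permissible orderings.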
[0066] All of the methods and processes described above may be
embodied in, and partially or fully automated via, software code
modules executed by one or more general purpose computers. For
example, the methods described herein may be performed by the
programming notebook system 100 and/or any other suitable computing
device. The methods may be executed on the computing devices in
response to execution of software instructions or other executable
code read from a tangible computer readable medium. A tangible
computer readable medium is a data storage device that can store
data that is readable by a computer system. Examples of computer
readable media include read-only memory, random-access memory,
other volatile or non-volatile memory devices, CD-ROMs, magnetic
tape, flash drives, and optical data storage devices.
[0067] It should be emphasized that many variations and
modifications may be made to the above-described embodiments, the
elements of which are to be understood as being among other
acceptable examples. All such modifications and variations are
intended to be included herein within the scope of this disclosure.
The foregoing description details certain embodiments of the
invention. It will be appreciated, however, that no matter how
detailed the foregoing appears in text, the invention can be
practiced in many ways. As is also stated above, it should be noted
that the use of particular terminology when describing certain
features or aspects of the invention should not be taken to imply
that the terminology is being re-defined herein to be restricted to
including any specific characteristics of the features or aspects
of the invention with which that terminology is associated.
* * * * *