Sparser: The Universal Parser
License: Creative Commons Zero v1.0 Universal (cc0-1.0)
Version: 1.4.12
The So What
Sparser is a language parser created to provide a simple data format that may accurately describe any language. This universal nature is achieved through three realizations:
- Languages are composed of structures.
  A structure is a logical grouping that is typically defined by a language through syntax and sometimes by context. A simple scheme that identifies the atomic particles of a language by structure is sufficient to describe the language. That simple scheme is what allows Sparser to describe different languages equally and universally.
- Descriptions of language structures are universally extensible.
  Language structures are not flat. One structure may include other child structures. Those child structures may be of a different grammar, syntax, or language from the parent structure. A universal description of language structure allows complex documents, which are documents composed of more than one language, to be described as a single uniform piece of data.
- Structure provides context.
  Understanding any arrangement of complex information requires taking context into account. Context is the meaning of a thing in relation to other things in close proximity. Sometimes the pieces of a language are partially or wholly defined by their context. A uniform and universal description of structure provides implicit definitions of a language fragment that become explicit when extended with additional descriptive data.
Example
Provided is a simple example of trivial code and the resultant parser output formatted as an HTML table. Try out your code samples on the demo page.
JSX Code Sample
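var content = (
    <Nav>
        {/* child comment, put {} around */}
        <cPerson
            name={window.isLoggedIn ? window.name : ''} // end of line comment
        />
    </Nav>
);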
Parsed Output
index | begin | ender | lexer | lines | stack | types | token |
---|---|---|---|---|---|---|---|
0 | -1 | -1 | script | 0 | global | word | var |
1 | -1 | -1 | script | 1 | global | reference | content |
2 | -1 | -1 | script | 1 | global | operator | = |
3 | -1 | 23 | script | 1 | global | start | ( |
4 | 3 | 22 | markup | 2 | paren | start | <Nav> |
5 | 4 | 7 | markup | 2 | Nav | script_start | { |
6 | 5 | 7 | script | 0 | script | comment | /* child comment, put {} around */ |
7 | 5 | 7 | markup | 0 | script | script_end | } |
8 | 4 | 22 | markup | 2 | Nav | singleton | <cPerson/> |
9 | 8 | 20 | markup | 2 | cPerson | jsx_attribute_start | name={ |
10 | 9 | 20 | script | 0 | jsx_attribute | word | window |
11 | 9 | 20 | script | 0 | jsx_attribute | separator | . |
12 | 9 | 20 | script | 0 | jsx_attribute | word | isLoggedIn |
13 | 9 | 20 | script | 1 | jsx_attribute | operator | ? |
14 | 9 | 20 | script | 1 | jsx_attribute | word | window |
15 | 9 | 20 | script | 0 | jsx_attribute | separator | . |
16 | 9 | 20 | script | 0 | jsx_attribute | word | name |
17 | 9 | 20 | script | 1 | jsx_attribute | operator | : |
18 | 9 | 20 | script | 1 | jsx_attribute | string | '' |
19 | 9 | 20 | script | 0 | jsx_attribute | separator | x; |
20 | 9 | 20 | markup | 0 | jsx_attribute | jsx_attribute_end | } |
21 | 8 | 22 | markup | 1 | cPerson | comment_attribute | // end of line comment |
22 | 4 | 22 | markup | 2 | Nav | end | </Nav> |
23 | 3 | 23 | script | 2 | paren | end | ) |
24 | -1 | -1 | script | 0 | global | separator | ; |
The output above shows that several different kinds of syntax are parsed, yet the records are uniform across all of them. The example also demonstrates a JavaScript structure that contains markup, which in turn contains a section of JavaScript.
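As a rough illustration of that nesting, the sketch below takes the lexer column from the table above, copied by hand, and collapses it into runs of consecutive tokens handled by the same lexer. The lexerRuns helper and the flat array are assumptions made for this example; they are not part of Sparser's API.

```javascript
// Lexer column from the parsed output table, indexes 0 through 24.
// Hand-copied for illustration; not produced by calling Sparser.
const lexers = [
    ...Array(4).fill("script"),  // 0-3:   var content = (
    ...Array(2).fill("markup"),  // 4-5:   <Nav> {
    "script",                    // 6:     the block comment
    ...Array(3).fill("markup"),  // 7-9:   } <cPerson/> name={
    ...Array(10).fill("script"), // 10-19: the attribute expression
    ...Array(3).fill("markup"),  // 20-22: } // comment </Nav>
    ...Array(2).fill("script")   // 23-24: ) ;
];

// Hypothetical helper: merge neighbouring tokens that share a lexer.
function lexerRuns(list) {
    const runs = [];
    list.forEach((lexer, index) => {
        const previous = runs[runs.length - 1];
        if (previous !== undefined && previous.lexer === lexer) {
            previous.last = index;
        } else {
            runs.push({ lexer, first: index, last: index });
        }
    });
    return runs;
}

const runs = lexerRuns(lexers);
console.log(runs.map(run => `${run.lexer} ${run.first}-${run.last}`).join(", "));
// script 0-3, markup 4-5, script 6-6, markup 7-9, script 10-19, markup 20-22, script 23-24
```

The alternating runs show the script lexer handing off to the markup lexer and back while every record keeps the same shape.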
The begin and ender data columns describe the start and finish points of a structure while the stack data column names that structure. For more information please read the technical documentation or try it out in the demo tool.
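To make the begin column concrete, here is a minimal sketch that walks it over the rows of the table above, copied by hand. It assumes, based on the table, that begin holds the index of the token that opens the enclosing structure and that -1 means the global scope. The records array and the children and ancestors helpers are hypothetical names for this example and do not come from Sparser itself.

```javascript
// [begin, ender, stack, token] for indexes 0 through 24 of the table above.
const rows = [
    [-1, -1, "global", "var"], [-1, -1, "global", "content"],
    [-1, -1, "global", "="], [-1, 23, "global", "("],
    [3, 22, "paren", "<Nav>"], [4, 7, "Nav", "{"],
    [5, 7, "script", "/* child comment, put {} around */"],
    [5, 7, "script", "}"], [4, 22, "Nav", "<cPerson/>"],
    [8, 20, "cPerson", "name={"], [9, 20, "jsx_attribute", "window"],
    [9, 20, "jsx_attribute", "."], [9, 20, "jsx_attribute", "isLoggedIn"],
    [9, 20, "jsx_attribute", "?"], [9, 20, "jsx_attribute", "window"],
    [9, 20, "jsx_attribute", "."], [9, 20, "jsx_attribute", "name"],
    [9, 20, "jsx_attribute", ":"], [9, 20, "jsx_attribute", "''"],
    [9, 20, "jsx_attribute", "x;"], [9, 20, "jsx_attribute", "}"],
    [8, 22, "cPerson", "// end of line comment"],
    [4, 22, "Nav", "</Nav>"], [3, 23, "paren", ")"],
    [-1, -1, "global", ";"]
];
const records = rows.map(([begin, ender, stack, token], index) =>
    ({ index, begin, ender, stack, token }));

// Indexes of the tokens that sit directly inside the structure opened at `parent`.
function children(list, parent) {
    return list.filter(item => item.begin === parent).map(item => item.index);
}

// Indexes of the opening tokens of every enclosing structure, innermost first.
function ancestors(list, index) {
    const chain = [];
    let open = list[index].begin;
    while (open > -1) {
        chain.push(open);
        open = list[open].begin;
    }
    return chain;
}

console.log(children(records, 4));   // [ 5, 8, 22 ]
console.log(ancestors(records, 12)); // [ 9, 8, 4, 3 ]
console.log(ancestors(records, 12).map(i => records[i].token).join(" < "));
// name={ < <cPerson/> < <Nav> < (
```

Following begin upward from isLoggedIn (index 12) passes through the JSX attribute, the cPerson tag, the Nav tag, and the parenthesis, which is the chain of structures the stack column names at each level.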
Usage Examples
The production application files are written to be environment agnostic. The application files for use with the browser and Node.js are identical aside from a single global reference.
Get and Configure the Application
Get the code from GitHub
git clone https://github.com/Unibeautify/sparser.git
cd sparser
npm install
node js/services build
Locally install the code from NPM
npm install sparser
cd node_modules/sparser
npm install
node js/services build
Or perform a global NPM install, which requires no further configuration
npm install sparser -g
Execute the application in the terminal with Node
The command documentation is built into the terminal API. To access this documentation, run the commands command, which lists all available commands along with guidance and examples.
Locally installed
node js/services commands
Globally installed
sparser commands
Execute the application in a Node.js application
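// Loading the module makes the application available as global.sparser.
require("sparser");
// Set the code to parse, then run the parser.
global.sparser.options.source = "my code";
const output = global.sparser.parser();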
Execute the application in the browser
First, include the application into HTML.
<script type="application/javascript" src="sparser/js/browser.js"></script>
Second, execute the application
// The script tag above makes the application available as window.sparser.
window.sparser.options.source = "my code";
const output = window.sparser.parser();
Currently Supported Languages by Processing Lexer
markdown
markup
- Apache Velocity
- ASP Inline Expression
- CFML (ColdFusion Markup Language)
- Django Inline HTML
- Dust.js
- EEX Elixir Templates
- EJS (Embedded JavaScript) Templates
- ERB (Embedded Ruby)
- FreeMarker
- Genshi
- Handlebars
- HTL (HTML Templating Language)
- HTML
- Jekyll
- Jinja
- JSTL (Java Standard Tag Library)
- Liquid
- Mustache
- Nunjucks
- SGML
- SilverStripe
- Spacebars templates
- ThymeLeaf
- Underscore Templates (TPL)
- Twig
- Vapor Leaf
- Vash
- Volt
- XML
- XSLT
script
style
45 total languages.
Contributions are welcome.
This project is young. All manner of contributions are welcome. Here are some suggested areas:
- Finding and logging defects
- Extending and improving documentation
- Adding or streamlining automation, APIs, and integrations
- Adding support for new language rules
- Contributor resources