"Translated" ProseMirror Chinese Guide

✍🏼 Written on Sep 6, 2019    💡 Updated on Nov 15, 2021
❗️ Note: this article was written some time ago, so please be aware of its timeliness
🖥  Note: All documents or manual API descriptions mentioned in this article can be viewed at https://prosemirror.xheldon.com/docs/ref/.

Content pointing to the https://prosemirror.xheldon.com domain can be accessed in English by replacing it with https://prosemirror.net.

Translation Notes:

  1. My work requires ProseMirror, but I could not find a well-translated version of its documentation (some translations read like machine translation), so I took the opportunity to translate the library’s conceptual guide myself.
  2. Based on my previous translation experience, leaving some proper nouns and technical terms untranslated is the best way to avoid ambiguity.
  3. The translation aims to stay faithful to the original text, but some direct translations may result in awkward or incoherent semantics. In such cases, additional context like subjects may be added, or paraphrasing may be used. Readers who find inconsistencies can refer to the original text.
  4. For parts I didn’t fully understand, I consulted the author on the ProseMirror forum. Links to these discussions are included.
  5. This guide may not make complete sense at first glance. It’s recommended to skim through it first, then check this repository to see and implement basic features like headings (node types) and bold (marks) before revisiting the guide for better comprehension.
  6. I prefer using English punctuation with Chinese input methods.
  7. Occasionally, I mistakenly wrote “ProseMirror” as “Prosemirror,” but this doesn’t affect the guide.
  8. I created a simple demo with basic examples for experimentation, available at: this repository. Forks and stars are welcome!
  9. Spaces between Chinese and English text and after commas are standard practice.
  10. My technical and translation skills are limited, and so is my understanding. Corrections and feedback are appreciated. Thank you!

Translator’s Conceptual Explanation

  1. Document: The entire document in ProseMirror, typically referenced by editor.view.state.doc.
  2. Schema: The skeleton object of ProseMirror, defining various rules to constrain the document. Sometimes manual adjustments are needed to comply with these rules, but ProseMirror usually handles them automatically.
  3. State: The data structure object in ProseMirror, analogous to React’s state. It includes the view’s state and plugin-specific states. For example, the schema is defined here: state.schema.
  4. View: The visual representation object in ProseMirror, containing methods to update the view. The state is one of its properties: view.state.
  5. Transform: A container object for document changes, with methods to modify these changes. Transaction is its subclass, handling state changes for the entire editor.
  6. Selection: The selection object, representing the cursor when nothing is selected. It includes various position-related properties and methods.
  7. Range: A container for multiple node objects, often used to handle selections spanning multiple node and mark types.
  8. Slice: Primarily used to address schema violations caused by partial selections.
  9. Node: The basic element in ProseMirror. Various node types can be defined via the schema, with at least doc (root node) and text (text node) required.
  10. NodeType: The type of a ProseMirror node, typically used to create nodes and define their attributes.
  11. XXXSpec: Configuration objects for defining XXX, such as NodeSpec and MarkSpec.
  12. Mark: ProseMirror treats inline text as a flat structure (unlike DOM’s tree structure) for easier counting and manipulation. Marks represent attributes of inline nodes, like font-size or bold, and can be customized.
  13. MarkType: Similar to node types, defines mark attributes and includes methods for creating marks.
  14. DOMOutputSpec: The return value specified in the schema’s toDOM, as explained in the official documentation.
  15. ResolvedPos: An object returned by ProseMirror when resolving position information (see the “Position Counting” section), containing position-related details.
  16. Plugin: Typically used to implement behaviors like clicks, pasting, or undo. Plugins can also define nodes directly.
  17. Decoration: Often used to generate views independent of document state, enabling visual effects without altering the document structure.

Chinese-English Translation Reference (Interchangeable while reading this guide)

ProseMirror Chinese Guide

This guide introduces various concepts used in the library and how they interrelate. To give you an overall impression of the system, it’s recommended to read the documents in order, or at least (if you’re impatient and just want a general understanding) finish the section about the View component.

Introduction

ProseMirror provides a comprehensive set of tools and concepts for building rich text editors. Its user interface is inspired by the what-you-see-is-what-you-get (WYSIWYG) concept, but it tries to avoid the pitfalls of that style of editing.

The fundamental concept of ProseMirror is that you and your code have absolute control over the document and its changes. Here, a document isn’t the messy blob of code found in HTML, but rather a custom data structure that only contains elements you explicitly allow and the relationships you specify between them (meaning you control which elements can appear and their relationships—translator’s note). All document updates originate from a single point, making it easier to handle changes.

The core modules of ProseMirror are not plug-and-play. During development, we prioritized modularity and customizability over simplicity. That said, we hope someone will eventually distribute a ready-to-use editor based on ProseMirror. To use an analogy, ProseMirror is more like a set of LEGO bricks that you assemble yourself than a Matchbox toy car that is ready to play with as soon as you open the box.

ProseMirror has four essential modules required for any operation, along with many extension modules maintained by the core team. These extension modules, like third-party modules offering useful features, can be replaced by other modules implementing the same functionality.

The four essential modules are:

  1. prosemirror-model defines the editor’s Document Model, which describes the editor’s content.
  2. prosemirror-state provides a single data structure describing the editor’s complete state, including selection operations, and a system called “transaction” for transitioning from the current state to the next.
  3. prosemirror-view displays a given state as editable elements in the editor and handles user interactions.
  4. prosemirror-transform includes functionality for making reversible document changes. It forms the foundation for the transaction system in prosemirror-state, enabling undo history and collaborative editing.

Beyond these, there are modules like basic editing commands, keybindings, undo history, macros, collaborative editing, and a simple document Schema, among others. More modules can be found in the ProseMirror organization on GitHub.

ProseMirror isn’t a browser-loadable script, meaning you’ll need to use a bundler to work with it. A bundler automatically resolves your script’s dependencies and merges them into a single file for convenient browser loading. You can explore more about web bundling, for example, here.

My First Editor

The following code stacks together like LEGO bricks to create the simplest editor:

import { schema } from 'prosemirror-schema-basic';
import { EditorState } from 'prosemirror-state';
import { EditorView } from 'prosemirror-view';

let state = EditorState.create({ schema });
let view = new EditorView(document.body, { state });

ProseMirror requires you to manually specify a Schema for the document (to define which elements it may contain and how they relate to one another). To achieve this, the first thing the code above does is import a basic schema (typically you would write your own schema, but here the author uses an existing one containing basic elements as an example; translator’s note).

This basic schema is then used to create a state, which generates an empty document conforming to the schema, along with a default selection at the start of the document (the selection is empty, meaning it represents the cursor). Finally, a view is created for this state and appended to document.body. The state’s document is rendered as an editable DOM node (i.e., a contenteditable node; translator’s note), and state transactions are generated whenever the user types into it.

(Unfortunately) At this point, the editor is not yet functional. For example, if you press Enter in the editor, nothing will happen because the four core modules mentioned earlier do not know how to respond to such input. We will later instruct it on how to handle various input behaviors.

Transactions

When users input text or, more broadly, interact with the page’s view, ProseMirror generates “state transactions”. This means that after every user input, ProseMirror does not just modify the document content in place; it updates the state behind the scenes. In other words, each change results in the creation of a transaction, which describes the changes applied to the state. The transaction is then used to create a new state, which in turn updates the view.

By default, these changes are handled by the framework, and you don’t need to worry about them. However, you can attach hooks to this process by writing a plugin or customizing your view. For example, the following code adds a dispatchTransaction prop, which is called whenever a transaction is created:

// (Imports omitted)
let state = EditorState.create({ schema });
let view = new EditorView(document.body, {
  state,
  dispatchTransaction(transaction) {
    console.log(
      'Document size went from',
      transaction.before.content.size,
      'to',
      transaction.doc.content.size
    );
    // Compute the next state from the transaction and hand it to the view
    let newState = view.state.apply(transaction);
    view.updateState(newState);
  },
});

Every state update must go through the updateState method, and every normal editing update happens by dispatching a transaction.

Plugins

Plugins are used to extend the behavior of the editor and the editor state in various ways. Some plugins are simple, like the keymap plugin, which binds keyboard inputs to actions. Others are more complex, such as the history plugin, which implements an undo history by observing transactions and storing their inverses in case the user wants to undo them.

Let’s add the following two plugins to enable undo/redo functionality:

// (Repeated imports omitted)
import { undo, redo, history } from 'prosemirror-history';
import { keymap } from 'prosemirror-keymap';

let state = EditorState.create({
  schema,
  plugins: [history(), keymap({ 'Mod-z': undo, 'Mod-y': redo })],
});
let view = new EditorView(document.body, { state });

Plugins are registered when the state is created (because they require access to the state’s transactions). After creating a view for this undo/redo-capable state, you’ll be able to undo the last action by pressing Ctrl+Z (or Cmd+Z on Mac).

Commands

The special functions bound to specific keyboard keys in the example above are called commands. Most editing behaviors are written as commands, allowing them to be bound to keys, called by editing menus, or exposed for user interaction.

The prosemirror-commands package provides many basic editing commands, including mapping the Enter and Delete keys to behave as you’d expect in an editor.

// (Repeated imports omitted)
import { baseKeymap } from 'prosemirror-commands';

let state = EditorState.create({
  schema,
  plugins: [
    history(),
    keymap({ 'Mod-z': undo, 'Mod-y': redo }),
    keymap(baseKeymap),
  ],
});
let view = new EditorView(document.body, { state });

At this point, you should have a mostly functional editor.

If you want to add a menu for easier editing or include key bindings allowed by the schema, you might want to check out the prosemirror-example-setup package. This package provides a set of preconfigured plugins for implementing a basic editor. However, as the name suggests, it’s primarily for demonstrating API usage and isn’t suitable for production. For a real-world development environment, you’ll likely want to replace some of its components with your own code to achieve precisely the desired behavior.

Content

A state’s document object is stored in its doc property. It is a read-only data structure that represents the document as a hierarchy of nodes, somewhat like the browser’s DOM. A simple document might be a “doc” node containing two “paragraph” nodes, each of which contains a single “text” node. You can read more about the document data structure in the Documents section of this guide.

When initializing a state, you can pass it an initial document. In this case, the schema field is optional because the schema can be derived from the document.

In the following example, we initialize a state by parsing the content of the DOM element with the id “content” using the DOM parser mechanism. The parser relies on information supplied by the schema about which DOM nodes map to which elements in that schema (that is, the DOM content is parsed and converted into nodes of the schema, and since the resulting document knows its schema, the state does not need the schema to be passed separately; translator’s note).

import { DOMParser } from 'prosemirror-model';
import { EditorState } from 'prosemirror-state';
import { schema } from 'prosemirror-schema-basic';

let content = document.getElementById('content');
let state = EditorState.create({
  doc: DOMParser.fromSchema(schema).parse(content),
});

Documents

ProseMirror defines its own data structure to represent document content. Since the document is the core element of building an editor, understanding how it works is essential.

Structure

A ProseMirror document is a node that holds a fragment object, which in turn contains zero or more child nodes.

This closely resembles the browser’s DOM, in that both are recursive tree structures. However, ProseMirror differs from the DOM in how it stores inline content.

In HTML, a paragraph and its contained markup are represented as a tree. For example, consider the following HTML structure:

<p>This is <strong>strong text with <em>emphasis</em></strong></p>

[Figure: the DOM tree for this paragraph]

In ProseMirror, however, inline content is modeled as a flat sequence, with the markup attached as metadata (marks) to the corresponding nodes:

[Figure: the flat ProseMirror representation of the same paragraph]

This data structure aligns more naturally with how we intuitively perceive such text. It allows us to use character offsets rather than tree node paths to indicate positions within a paragraph and makes operations like splitting content or changing styles straightforward, avoiding the clumsiness of tree manipulations.

It also means each document has only one representation. Adjacent text nodes with identical marks are merged, and empty text nodes are disallowed. The order of marks is specified in the schema.
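As a rough illustration of the flat model, the sketch below represents the example paragraph with plain JavaScript objects rather than real ProseMirror nodes. The object shape and the helper name normalizeInline are assumptions made for this example; the library performs this normalization internally, and the sketch assumes marks are already listed in schema order.

```javascript
// Toy model: inline content as a flat list of text nodes, each carrying its marks.
// These are plain objects, not ProseMirror Node instances.
const paragraph = [
  { text: 'This is ', marks: [] },
  { text: 'strong text with ', marks: ['strong'] },
  { text: 'emphasis', marks: ['strong', 'em'] },
];

// Hypothetical helper: drop empty text nodes and merge neighbours whose mark
// lists are identical, so the same content always has one representation.
function normalizeInline(nodes) {
  const sameMarks = (a, b) =>
    a.length === b.length && a.every((mark, i) => mark === b[i]);
  const result = [];
  for (const node of nodes) {
    if (node.text.length === 0) continue; // empty text nodes are disallowed
    const last = result[result.length - 1];
    if (last && sameMarks(last.marks, node.marks)) {
      // Replace the previous entry with a merged one; inputs are not mutated
      result[result.length - 1] = { text: last.text + node.text, marks: last.marks };
    } else {
      result.push({ text: node.text, marks: node.marks });
    }
  }
  return result;
}

// A denormalized variant of the same paragraph: split text plus an empty node
const merged = normalizeInline([
  { text: 'This ', marks: [] },
  { text: '', marks: ['em'] },
  { text: 'is ', marks: [] },
  ...paragraph.slice(1),
]);

console.log(merged.length);  // → 3
console.log(merged[0].text); // → This is 
```

Because the content is flat, a position inside the paragraph can be expressed as a simple character offset: offset 8, for example, is where the strong text begins.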

Thus, a ProseMirror document is a tree of block nodes, with most leaf nodes being of the textblock type, that is, block nodes containing text. You can also have simple leaf nodes with no content, such as a horizontal rule (hr) element or a video element.

Node objects have several properties to describe their role in the document:

  • isBlock and isInline indicate whether the node is a block-type node (like a div) or an inline node (like a span).
  • inlineContent being true means the node only accepts inline elements as content (this can be used to determine whether to add inline nodes to it—translator’s note).
  • isTextBlock being true indicates the node is a block node containing inline content.
  • isLeaf being true means the node cannot contain any content.

Thus, a typical “paragraph” node is a textblock-type node, while a blockquote (quotation element) is a block element that may contain other block elements as its content. Text nodes, line breaks, and inline images are all inline leaf nodes, whereas horizontal rule (hr element) nodes are typical block leaf nodes. (Leaf nodes, as mentioned, cannot contain child nodes; they can be either inline or block.)

Schema allows you to impose additional constraints on questions like “which elements are allowed where.” For example, even if a node permits block content, that doesn’t mean it allows all block nodes as content (you can manually specify exceptions via the schema).

Identity and Persistence

Another difference between the DOM tree and a ProseMirror document lies in how they represent node objects. In the DOM, nodes are mutable objects with identity (search for “mutable objects” if unfamiliar). This means a node can only exist under its parent node (if it appears elsewhere, it is no longer here, as identity ensures uniqueness). When such a node updates, it mutates (i.e., the update occurs on the original node, modifying it in place).

In ProseMirror, however, nodes are merely values (as opposed to DOM’s mutable objects), representing a node like the number 3. The number 3 can appear in multiple data structures simultaneously—it isn’t bound to any single structure. If you add 1 to it, you get a new value, 4, without altering the original 3.

This is how ProseMirror documents operate. Their values are immutable and can be treated as primitive values to compute a new document. These document nodes are unaware of their containing data structure because they can exist in multiple structures or even repeat within a single structure. They are values, not stateful objects.

This means every time you update a document, you get a new document. The new document shares all unchanged child node values from the old one, making document creation inexpensive.

This approach has many advantages. It ensures the editor remains functional during state updates because the new state directly represents the new document (if an update is incomplete, the state doesn’t exist, so the document doesn’t either—the editor retains the previous state and document). Switching between states is instantaneous (with no intermediate states). Such transitions can also be reasoned about mathematically—something that would be very difficult if values mutated behind the scenes (like DOM nodes). ProseMirror’s design enables collaborative editing and allows for highly efficient DOM updates by comparing the previously rendered document with the current one.

Since nodes are represented as plain JavaScript objects, explicitly freezing their properties (to prevent mutation) would severely impact performance. Thus, while ProseMirror documents operate immutably, you can technically mutate them manually. However, ProseMirror does not support this. If you forcibly mutate these data structures, the editor may crash because they are often shared across multiple contexts (modifying one affects others unpredictably). So, be very careful! The same applies to arrays and objects stored on node objects, such as node attributes or child nodes in fragments.
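The sharing of unchanged child nodes described in this section can be sketched with plain objects. This is only a toy model: the object shape and the replaceChild function below are written for this example and are not ProseMirror’s own implementation.

```javascript
// Toy model of a persistent (immutable) update with structural sharing.
// Plain objects stand in for ProseMirror nodes.
const oldDoc = {
  type: 'doc',
  content: [
    { type: 'paragraph', content: [{ type: 'text', text: 'One.' }] },
    { type: 'paragraph', content: [{ type: 'text', text: 'Two!' }] },
  ],
};

// Hypothetical helper: build a new parent with one child swapped.
// Nothing is modified in place.
function replaceChild(node, index, child) {
  const content = node.content.slice(); // shallow copy of the child array
  content[index] = child;
  return { ...node, content };
}

const newDoc = replaceChild(oldDoc, 1, {
  type: 'paragraph',
  content: [{ type: 'text', text: 'Changed' }],
});

console.log(newDoc === oldDoc);                       // → false
console.log(newDoc.content[0] === oldDoc.content[0]); // → true (shared, not copied)
console.log(oldDoc.content[1].content[0].text);       // → Two!
```

Only the path from the root to the changed node is rebuilt; every untouched subtree is reused by reference, which is why creating a new document is cheap.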

Data Structures

A document’s data structure looks like this:

[Figure: the data structure of a ProseMirror document]

Each node is an instance of the Node class. They are categorized by the type property, which reveals the node’s name, its available attributes, and similar information. Node types (and mark types) are created only once per schema, and they know which schema they belong to.

A node’s content is stored in a field pointing to a Fragment instance, which holds an array of nodes. Even for nodes that cannot or do not have content, this field is filled, in that case with the shared empty fragment.

Some node types allow attributes, which are stored as additional values (separate from content) on each node. For example, an image node might use attributes to store alt text and URL information.

Additionally, inline nodes contain active marks—marks refer to things like emphasis or links—represented as Mark instances.

The entire document is a node. The document’s content consists of child nodes under the top-level node. Typically, these top-level child nodes are a series of block nodes, some of which may contain text blocks that in turn hold inline content. However, the top-level node could also be just a text block, in which case the entire document contains only inline content.

Which nodes are allowed in which positions is determined by the document’s schema. To create nodes programmatically (rather than by typing content into the editor), you must go through the schema, for example by using its node and text methods.

import { schema } from 'prosemirror-schema-basic';

// (The null arguments are where you can specify attributes, if necessary)
let doc = schema.node('doc', null, [
  schema.node('paragraph', null, [schema.text('One.')]),
  schema.node('horizontal_rule'),
  schema.node('paragraph', null, [schema.text('Two!')]),
]);

Indexing

ProseMirror nodes support two types of indexing—they can be treated as a tree structure, where offsets distinguish nodes, or as a flat sequence of tokens (where a token is a counting unit).

The first indexing method allows you to interact with individual nodes as you would in the DOM, using the child method and childCount to directly access child nodes, or writing recursive functions to traverse the document (if you want to traverse all nodes, use descendants or nodesBetween).

The second indexing method is more useful when locating a specific position in the document. It represents any position in the document as an integer—the token’s sequential number. These token objects don’t physically exist in memory—they are merely a counting convenience—but the document’s tree structure and each node’s awareness of its own size make position-based access efficient.

  • The document’s starting position, before all content, is position 0.
  • Entering or exiting a non-leaf node (i.e., a node that can contain content) counts as 1 token. So, if the document starts with a paragraph (tagged p), the position at the start of the paragraph is 1 (i.e., after <p>).
  • Each character in a text node counts as 1 token. So, if the opening paragraph contains the word “hi,” position 2 is after “h,” position 3 is after “i,” and position 4 is after the entire paragraph (i.e., after </p>).
  • Leaf nodes that cannot have content (e.g., image nodes) count as 1 token.

Thus, if you have a document represented in HTML like this:

<p>One</p>
<blockquote><p>Two<img src="..."></p></blockquote>

The token sequence and positions would look like this:

0   1 2 3 4    5
 <p> O n e </p>

5            6   7 8 9 10    11   12            13
 <blockquote> <p> T w o <img> </p> </blockquote>

Every node has a [nodeSize](https://prosemirror.xheldon.com/docs/ref/#model.Node.nodeSize) property giving the size of the whole node, and you can get the size of the node’s content via .content.size. Note that for the outermost document node (i.e., the root node, which in the DOM is where the contenteditable attribute resides), the opening and closing tokens are not considered part of the document, because you cannot put the cursor outside of them. The size of the document is therefore doc.content.size, not doc.nodeSize (nodeSize still counts the two tokens, so it is always 2 larger than content.size).
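The counting rules can be checked with a small sketch. It models nodes as plain objects and is not the library’s code; the helpers text, node, contentSize and nodeSize are written for this example only.

```javascript
// Toy model of the token-counting rules. Plain objects stand in for nodes.
const text = (value) => ({ type: 'text', text: value });
const node = (type, content = [], isLeaf = false) => ({ type, content, isLeaf });

// Size of a node's content: the sum of the sizes of its children
function contentSize(n) {
  return n.content.reduce((sum, child) => sum + nodeSize(child), 0);
}

// Size of a whole node, following the rules listed above
function nodeSize(n) {
  if (n.type === 'text') return n.text.length; // one token per character
  if (n.isLeaf) return 1;                      // a leaf node is a single token
  return 2 + contentSize(n);                   // opening token + content + closing token
}

// <p>One</p><blockquote><p>Two<img src="..."></p></blockquote>
const doc = node('doc', [
  node('paragraph', [text('One')]),
  node('blockquote', [
    node('paragraph', [text('Two'), node('image', [], true)]),
  ]),
]);

console.log(contentSize(doc)); // → 13
console.log(nodeSize(doc));    // → 15
```

The last valid position in this document is therefore 13, and the whole-node size is 2 larger because it also counts the document’s own opening and closing tokens.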

Interpreting such positions manually involves a lot of counting. Instead, you can call [Node.resolve](https://prosemirror.xheldon.com/docs/ref/#model.Node.resolve) to get a more descriptive [data structure](https://prosemirror.xheldon.com/docs/ref/#model.ResolvedPos) for a position. It tells you the parent node of the position, the offset into that parent, the ancestors of the parent, and a few other things.

It is crucial to distinguish between child indices (as counted by childCount), document-wide positions, and node-local offsets (sometimes used in recursive functions to represent a position within the node currently being handled).
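To make the difference between a document-wide position and a node-local offset concrete, here is a minimal sketch of what resolving a position involves. It works on plain objects, and the resolve function and its return shape (ancestors, depth, parentOffset) are assumptions for this example; the real Node.resolve returns a much richer ResolvedPos object. The sketch also assumes the position lies within the document.

```javascript
// Toy model: plain objects stand in for nodes (not ProseMirror's classes).
const text = (value) => ({ type: 'text', text: value });
const node = (type, content = [], isLeaf = false) => ({ type, content, isLeaf });

function nodeSize(n) {
  if (n.type === 'text') return n.text.length;
  if (n.isLeaf) return 1;
  return 2 + n.content.reduce((sum, child) => sum + nodeSize(child), 0);
}

// Hypothetical resolver: walk down from the root to the innermost node that
// contains the position, then report the offset relative to that node.
function resolve(doc, pos) {
  let parent = doc;
  let start = 0; // document-wide position where parent's content begins
  const ancestors = [doc.type];
  for (;;) {
    let offset = start;
    let next = null;
    for (const child of parent.content) {
      const size = nodeSize(child);
      const canEnter = child.type !== 'text' && !child.isLeaf;
      // Strictly inside the child: past its opening token, before its end
      if (canEnter && pos > offset && pos < offset + size) {
        next = child;
        break;
      }
      offset += size;
    }
    if (!next) {
      return { ancestors, depth: ancestors.length - 1, parentOffset: pos - start };
    }
    parent = next;
    start = offset + 1; // step over the child's opening token
    ancestors.push(next.type);
  }
}

// <p>One</p><blockquote><p>Two<img src="..."></p></blockquote>
const doc = node('doc', [
  node('paragraph', [text('One')]),
  node('blockquote', [
    node('paragraph', [text('Two'), node('image', [], true)]),
  ]),
]);

console.log(resolve(doc, 9)); // ancestors doc > blockquote > paragraph, parentOffset 2
console.log(resolve(doc, 5)); // between the two top-level blocks: depth 0, parentOffset 5
```

Position 9 is document-wide, while the reported parentOffset of 2 (after “Tw”) is local to the inner paragraph.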

Slices

For operations like copy-paste and drag-and-drop, you need to be able to talk about a slice of a document, i.e., the content between two positions. Unlike a complete node or fragment, a slice may be “open” at its start or end (meaning it may contain unclosed tags: in <p>123</p><p>456</p>, a slice might be 23</p><p>45).

For instance, if you select from the middle of one paragraph to the middle of the next, the resulting slice contains two paragraphs, the first open at the start and the second open at the end. If you instead select a paragraph as a whole node (a node selection), you have selected a closed node. If the content of such open nodes were treated like regular node content, it might violate the schema constraints, because some required nodes fall outside the slice (for example, the opening <p> and closing </p> needed to make the content a complete node).

The [Slice](https://prosemirror.xheldon.com/docs/ref/#model.Slice) data structure is used to represent such data. It stores a [fragment](https://prosemirror.xheldon.com/docs/ref/#model.Fragment) along with an [open depth](https://prosemirror.xheldon.com/docs/ref/#model.Slice.openStart) on both sides (i.e., how many levels of nodes are left open at each end). You can use the [slice method](https://prosemirror.xheldon.com/docs/ref/#model.Node.slice) on nodes to “cut” a slice out of a document.

// Assume the document holds two paragraphs, the first containing "a" and
// the second containing "b", i.e. <p>a</p><p>b</p>
let slice1 = doc.slice(0, 3); // The first paragraph
console.log(slice1.openStart, slice1.openEnd); // → 0 0
let slice2 = doc.slice(1, 5); // From start of first paragraph to end of second
console.log(slice2.openStart, slice2.openEnd); // → 1 1

Changing

Since nodes and fragments are [persistent data structures](https://en.wikipedia.org/wiki/Persistent_data_structure) (i.e., immutable), you should never modify them directly. If you hold a reference to a document (or a node or fragment), that object will stay the same: operations produce a new document while the old one remains unmodified.

In most cases, you should use [transformations](https://prosemirror.xheldon.com/docs/guide/#transform) to update the document rather than touching nodes directly. Transformations also leave a record of the changes, which is necessary when the document is part of an editor state.

If you must manually update a document, ProseMirror provides some useful helper functions on Node and Fragment to create a fresh version of a document. You’ll likely use the Node.replace method frequently, which replaces the content within a specified range of the document with a slice containing new content. For shallow updates to a node, you can use the copy method, which creates an identical node but allows you to specify new content for the copy. Fragments also offer methods for updating documents, such as replaceChild and append.

Schemas

Every ProseMirror document has an associated schema. This schema describes the types of nodes that may occur in the document and the way they are nested. For example, a schema might specify that the top-level node can contain one or more blocks, while paragraph nodes can contain any number of inline nodes, with any marks applied to them.

For an example of schema usage, you can refer to this basic schema package. However, one of ProseMirror’s strengths is that it allows you to define your own schemas.

Node Types

Each node in a document has a type, which represents its semantic meaning and attributes, including how it is rendered in the editor.

When defining a schema, you need to list each node type used, describing them with a spec object:

import { Schema } from 'prosemirror-model';

const trivialSchema = new Schema({
  nodes: {
    doc: { content: 'paragraph+' },
    paragraph: { content: 'text*' },
    text: { inline: true },
    /* ... and so on */
  },
});

The above code defines a schema where a document can contain one or more paragraphs, and each paragraph can contain any number of text nodes.

Every schema must at least define the type of the top-level node (by default named “doc,” though you can configure it) and the “text” type for textual content.

Nodes that should count as inline must declare this with the inline property (for the text type, which is inline by definition, you may omit it).

Content Expressions

The string values in the content field of the schema example above are called ‘content expressions.’ They control which child node types are allowed for the current node type.

For example, “paragraph” means “one paragraph,” while “paragraph+” means “one or more paragraphs.” Similarly, “paragraph*” means “zero or more paragraphs,” and “caption?” means “zero or one caption node.” You can also use range expressions akin to regular expressions after node names, such as {2} (exactly two), {1, 5} (one to five), or {2, } (two or more).

These expressions can be combined to create sequences. For instance, “heading paragraph+” means “a heading followed by one or more paragraphs.” You can also use the pipe operator “|” to choose between two expressions, like “(paragraph | blockquote)+.”

Some element type groups may appear multiple times in your schema. For example, if you have nodes representing the “block” concept, they might appear under top-level elements or nested within blockquote-type nodes. You can create a node group by specifying the group property in the schema and then referencing the group name in other expressions:

const groupSchema = new Schema({
  nodes: {
    doc: { content: 'block+' },
    paragraph: { group: 'block', content: 'text*' },
    blockquote: { group: 'block', content: 'block+' },
    text: {},
  },
});

In the example above, “block+” is equivalent to “(paragraph | blockquote)+.”
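Since content expressions deliberately borrow regular-expression notation, one way to build intuition is to translate them into actual regular expressions over a sequence of node type names. The sketch below does that; it is a toy model and not how ProseMirror implements content matching, and the function names compileContentExpr and matches are invented for this example.

```javascript
// Toy matcher: translate a content expression into a RegExp that is tested
// against node type names joined with ';' (e.g. "heading;paragraph;").
function compileContentExpr(expr, groups = {}) {
  const body = expr
    .replace(/[A-Za-z_]\w*/g, (name) => {
      // A group name expands to an alternation of its member types
      const members = Object.prototype.hasOwnProperty.call(groups, name)
        ? groups[name]
        : [name];
      return '(?:' + members.map((member) => member + ';').join('|') + ')';
    })
    .replace(/\s+/g, ''); // a sequence is plain concatenation; "{1, 5}" becomes "{1,5}"
  return new RegExp('^(?:' + body + ')$');
}

function matches(expr, typeNames, groups) {
  const sequence = typeNames.map((name) => name + ';').join('');
  return compileContentExpr(expr, groups).test(sequence);
}

const groups = { block: ['paragraph', 'blockquote'] };

console.log(matches('heading paragraph+', ['heading', 'paragraph', 'paragraph'])); // → true
console.log(matches('heading paragraph+', ['paragraph']));                         // → false
console.log(matches('block+', ['blockquote', 'paragraph'], groups));               // → true
console.log(matches('(paragraph | blockquote)+', ['blockquote', 'paragraph']));    // → true
console.log(matches('paragraph{1, 5}', []));                                       // → false
```

The third and fourth calls give the same answer, which mirrors the equivalence between “block+” and “(paragraph | blockquote)+” in the schema above.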

It is recommended to always require at least one child node in nodes that have block content (doc and blockquote in this example), because browsers completely collapse such a node when it is empty, making it hard to edit (in other words, prefer block+ over block* for the content of doc and blockquote: as in regular expressions, * means zero or more and + means one or more, and with block* the node may end up with no children at all. Readers can try this out; translator’s note).

The order in which nodes appear in an or-expression matters. When a default instance has to be created for a required node, for example to make sure the document still conforms to the schema after a replace step, the first node type in the expression is used. If that is a group, the first type in the group (determined by the order in which the group’s members appear in the schema’s nodes) is used. If I swapped “paragraph” and “blockquote” in the schema example above, the editor would overflow the stack as soon as it tried to create a block node: it would create a “blockquote” node, whose content requires at least one block, so it would try to create another “blockquote” as content, and so on indefinitely.

Not every node-manipulating function in the ProseMirror library checks the validity of the content it is given. Higher-level concepts like transforms do, but lower-level node creation methods usually do not, and instead leave that responsibility to their callers. It is entirely possible to pass invalid content to, for example, NodeType.create, which will then create a node with invalid content. For nodes that are “open” on the edge of a slice, this is even a reasonable thing to do (a slice is not a valid node but still needs to be manipulated directly, and users should not have to complete it by hand; translator’s note). There is a separate createChecked method, which raises an error if the given content does not fit the schema, and an after-the-fact check method that asserts that the content of a given node is valid.

Marks

Marks are typically used to add extra styling or other information to inline content. The schema must declare all allowed marks in the current document (just like declaring nodes—translator’s note). Mark types are objects somewhat similar to node types, used to categorize different marks and provide additional information.

By default, nodes that allow inline content permit all marks defined in the schema to be applied to their child nodes. You can configure this in the marks field of the node spec.

Below is a simple schema example that supports strong and emphasis marks in paragraphs but disallows them in headings.

const markSchema = new Schema({
  nodes: {
    doc: { content: 'block+' },
    paragraph: { group: 'block', content: 'text*', marks: '_' },
    heading: { group: 'block', content: 'text*', marks: '' },
    text: { inline: true },
  },
  marks: {
    strong: {},
    em: {},
  },
});

The value of the marks field is interpreted as a space-separated list of mark names or mark groups. “_” is a wildcard that allows all marks, and an empty string means no marks are allowed.

Attributes

The document schema also defines which attributes are allowed for nodes and marks. If your node type requires additional node-specific information—such as the level attribute for heading nodes (e.g., H1, H2, etc.)—attributes are the way to go.

Attributes are plain objects with predefined properties (on each node or mark) pointing to JSON-serializable values. To specify which attributes are allowed, use the optional attrs property in the node spec or mark spec:

heading: {
  content: "text*",
  attrs: {level: {default: 1}}
}

In the schema above, each heading node instance has a level attribute, accessible via .attrs.level. If it is not specified when the heading is created (for example with NodeType.create), the level defaults to 1.

If you don’t provide a default value for an attribute, an error is raised when you create such a node without explicitly passing that attribute. It also prevents ProseMirror from generating such nodes on its own to satisfy schema constraints, for example through interfaces like createAndFill.
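The default-value rule can be sketched in a few lines. This is a toy model under stated assumptions, not the library’s code: resolveAttrs is a hypothetical helper, and the error message is made up for illustration.

```javascript
// Toy model of attribute resolution; not ProseMirror's actual code.
function resolveAttrs(attrSpecs, given = {}) {
  const resolved = {};
  for (const [name, spec] of Object.entries(attrSpecs)) {
    if (given[name] !== undefined) {
      resolved[name] = given[name]; // explicitly passed value wins
    } else if ('default' in spec) {
      resolved[name] = spec.default; // fall back to the declared default
    } else {
      throw new RangeError(`No value supplied for attribute ${name}`);
    }
  }
  return resolved;
}

const withDefault = { level: { default: 1 } };
const withoutDefault = { level: {} };

console.log(resolveAttrs(withDefault)); // { level: 1 }
console.log(resolveAttrs(withDefault, { level: 3 })); // { level: 3 }

let attrError = null;
try {
  resolveAttrs(withoutDefault); // no default and no explicit value
} catch (error) {
  attrError = error;
}
console.log(attrError.message); // No value supplied for attribute level
```

This also shows why a node type with a default-less attribute cannot be created automatically as filler: there is no value the library could pick on its own.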

Serialization and Parsing

To enable editing elements in the browser, document nodes must be rendered as DOM elements. The simplest way is to specify how each node should appear in the DOM within the schema. This can be achieved by defining the toDOM field in each node spec of the schema.

This field should point to a function that takes the current node as an argument and returns a description of the node’s DOM structure. This can be either a DOM node directly or an array describing it, for example:

const schema = new Schema({
  nodes: {
    doc: { content: 'paragraph+' },
    paragraph: {
      content: 'text*',
      toDOM(node) {
        return ['p', 0];
      },
    },
    text: {},
  },
});

In the example above, ["p", 0] means the paragraph node is rendered as a <p> tag in HTML. The 0 is a “hole” that marks where the node’s content should be rendered (so if the node is expected to have content, a 0 belongs at the end of the array). You can also add an object after the tag name to specify HTML attributes, such as ["div", {class: "c"}, 0]. Leaf nodes don’t need a hole in their DOM representation, since they have no content.
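To make the array notation concrete, here is a minimal sketch that turns such an array into an HTML string. It is an illustration only: renderSpec is a hypothetical helper, it does no escaping, and ProseMirror itself builds real DOM nodes rather than strings.

```javascript
// Toy renderer producing an HTML string; ProseMirror builds real DOM nodes.
function renderSpec(spec, contentHTML) {
  if (typeof spec === 'string') return spec; // plain text child (not escaped here)
  const [tag, ...rest] = spec;
  let attrs = {};
  let children = rest;
  const first = rest[0];
  // An optional plain object right after the tag name holds HTML attributes.
  if (first && typeof first === 'object' && !Array.isArray(first)) {
    attrs = first;
    children = rest.slice(1);
  }
  const attrText = Object.entries(attrs)
    .map(([name, value]) => ` ${name}="${value}"`)
    .join('');
  // The number 0 is the hole: the node's rendered content goes there.
  const inner = children
    .map((child) => (child === 0 ? contentHTML : renderSpec(child, contentHTML)))
    .join('');
  return `<${tag}${attrText}>${inner}</${tag}>`;
}

const paragraphHTML = renderSpec(['p', 0], 'hello');
const divHTML = renderSpec(['div', { class: 'c' }, 0], 'hello');
const nestedHTML = renderSpec(['pre', ['code', 0]], 'hello');

console.log(paragraphHTML); // <p>hello</p>
console.log(divHTML); // <div class="c">hello</div>
console.log(nestedHTML); // <pre><code>hello</code></pre>
```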

Mark specs have a toDOM method similar to that of nodes, but they must render a single tag that directly wraps the content, so the content always goes directly in the returned node and the hole doesn’t need to be specified explicitly.

You’ll also often need to parse HTML DOM content into a ProseMirror document, for example when users paste or drag content into the editor. The prosemirror-model module provides functionality for this, and you can include the parsing information directly in the schema, using the parseDOM property of node and mark specs.

This property lists an array of parse rules that describe how DOM content maps to a given node or mark. For instance, the basic schema defines these rules for the emphasis mark:

parseDOM: [
  { tag: 'em' }, // Match <em> nodes
  { tag: 'i' }, // and <i> nodes
  { style: 'font-style=italic' }, // and inline 'font-style: italic'
]

The value of a parse rule’s tag field can be a CSS selector, so you can also pass strings like "div.myclass". Similarly, the style field matches inline CSS styles.

When a schema includes parseDOM annotations, you can create a DOMParser object for it with DOMParser.fromSchema. The editor view does this by default to create its clipboard parser, but you can override that.

Documents also come with a built-in JSON serialization format. You can call toJSON on a node to get an object that can safely be passed to JSON.stringify, and schema objects have a nodeFromJSON method that converts this representation back into a node.
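For orientation, the JSON form of a small document looks roughly like the object below. The object is written by hand as a stand-in for a toJSON result, assuming a schema with doc, heading, paragraph, and text nodes plus a strong mark; no ProseMirror code runs here.

```javascript
// Hand-written stand-in for the result of calling toJSON on a small document
// (assumption: a schema with doc, heading, paragraph and text nodes and a
// strong mark).
const docJSON = {
  type: 'doc',
  content: [
    {
      type: 'heading',
      attrs: { level: 1 },
      content: [{ type: 'text', text: 'Title' }],
    },
    {
      type: 'paragraph',
      content: [
        { type: 'text', text: 'Hello ' },
        { type: 'text', text: 'world', marks: [{ type: 'strong' }] },
      ],
    },
  ],
};

// Plain data survives a round trip through JSON text unchanged.
const serialized = JSON.stringify(docJSON);
const restored = JSON.parse(serialized);
console.log(restored.content[0].attrs.level); // 1
console.log(restored.content[1].content[1].marks[0].type); // strong
```

In a real editor, an object like restored would then be handed to the schema’s nodeFromJSON method to get a node back.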

Extending a schema

The parameters passed to the Schema constructor for setting nodes and marks can be either OrderedMap-type objects or plain JavaScript objects. The resulting schema’s .spec.nodes and .spec.marks properties are always OrderedMaps, which can serve as the foundation for other schemas.

OrderedMaps support a number of methods that make it convenient to create updated versions. For example, you can derive a set of nodes without the blockquote node by calling schema.spec.nodes.remove("blockquote") and pass the result as the nodes field of the Schema constructor’s argument.

Document Transformations

Transforms are central to the way ProseMirror works. They form the basis for transactions and make features like edit history tracking and collaborative editing possible.

Why?

Why can’t we directly modify (mutate) the document? Or at least create a completely new version of the document and replace the editor’s content with it?

There are several reasons. One is code clarity. Immutable data structures indeed lead to simpler code. Moreover, the primary function of the transform system is to preserve the traces of document updates—the sequence of steps in a transform represents each incremental change from the old document to the new one.

The undo history can save these steps and apply their inverses when needed (ProseMirror implements selective undo, which is more sophisticated than simply rolling back to a previous state).

The collaborative editing system sends these steps to the other editors and, when necessary, reorders them so that all collaborators end up with the same document.

In most cases, it’s useful for editor plugins to react to every document change (whether from the user or collaborative editing), ensuring the plugins remain synchronized with the editor’s state.

Steps

Document updates are broken down into individual steps, each describing a specific update. While you typically don’t interact with them directly, understanding how they work is essential.

Examples of steps are ReplaceStep, which replaces a portion of the document, and AddMarkStep, which applies a mark to a given range.

A Step can be applied to a document to produce a new document.

console.log(myDoc.toString()); // → p("hello")
// A step that deletes the content between positions 3 and 5
let step = new ReplaceStep(3, 5, Slice.empty);
let result = step.apply(myDoc);
console.log(result.doc.toString()); // → p("heo")

Applying a step is a relatively straightforward process—it doesn’t handle tasks like inserting nodes to maintain schema constraints or transforming slices to fit the schema. This means applying a step can fail. For instance, attempting to delete one token of a node (i.e., its opening or closing tag) would leave the other token unclosed, which is nonsensical. This is why the apply method returns a result object—either referencing the new document (if the step succeeds) or containing an error message (if it fails).

You’ll usually want to use helper functions to generate steps for you, sparing you from worrying about the details.

Transforms

An editing action may produce one or more steps. The most convenient way to handle a sequence of steps is to create a Transform object (or, if you’re working with the editor’s overall state, a Transaction, which is a subclass of Transform).

let tr = new Transform(myDoc);
tr.delete(5, 7); // Delete between position 5 and 7
tr.split(5); // Split the parent node at position 5
console.log(tr.doc.toString()); // The modified document
console.log(tr.steps.length); // → 2

Most transform methods return the transform itself, allowing for convenient method chaining (e.g., tr.delete(5, 7).split(5)).

Transforms have methods for deleting and replacing content, for adding and removing marks, for tree manipulation such as splitting, joining, lifting, and wrapping, and more.

Mapping

When you make changes to a document, positions pointing into that document may become invalid or change meaning. For example, if you insert a character, every character after it moves one position forward, so an old position that pointed at one of them now points one token too early. Similarly, if you delete all content in a document, any positions that pointed into that content become invalid.

We often need to preserve positions across document changes, for example the boundaries of a selection. (Translator’s note: a selection holds positions such as from and to; when the document changes, these must be adjusted accordingly, or the selection ends up in the wrong place.) To help with this, each step can give you a map that converts positions in the document before the step was applied to positions in the document after it.

let step = new ReplaceStep(4, 6, Slice.empty); // Delete 4-5
let map = step.getMap();
console.log(map.map(8)); // → 6
console.log(map.map(2)); // → 2 (positions before the change are unaffected)

The Transform object automatically accumulates the maps generated by a series of steps. It achieves this using an abstraction called Mapping, which collects a sequence of step maps and allows you to map them all at once.

let tr = new Transform(myDoc);
tr.split(10); // split a node, +2 tokens at 10
tr.delete(2, 5); // -3 tokens at 2
console.log(tr.mapping.map(15)); // → 14
console.log(tr.mapping.map(6)); // → 3
console.log(tr.mapping.map(10)); // → 9

However, this raises a question: where exactly should a given position be mapped to? (Translator’s note: if a position sits exactly where a change happens, for instance where a node is split in two, it could logically map either to the end of the preceding node or to the start of the following one, so a convention is needed.) Consider the last line of the example above. Position 10 points precisely at the point where the node is split and two tokens are inserted. Should it map to before or after the inserted content? In the example, it is mapped to after the inserted tokens.

Sometimes, though, you want the other behavior. This is why the map method of step maps and mappings accepts a second parameter, bias. Setting it to -1 keeps the mapped position before content inserted at that position.

console.log(tr.mapping.map(10, -1)); // → 7
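The numbers in the mapping examples above can be reproduced with a small model of position mapping. This is a simplified sketch, not the library’s code: each step map is assumed to be a list of [start, oldSize, newSize] ranges, and mapPos and mapThrough are hypothetical helper names.

```javascript
// Toy model. Each step map is a list of [start, oldSize, newSize] ranges.
function mapPos(ranges, pos, bias = 1) {
  let diff = 0; // size change accumulated by ranges entirely before `pos`
  for (const [start, oldSize, newSize] of ranges) {
    if (start > pos) break;
    const end = start + oldSize;
    if (pos <= end) {
      // `pos` touches the changed range: decide which side it sticks to.
      const side =
        oldSize === 0 ? bias : pos === start ? -1 : pos === end ? 1 : bias;
      return start + diff + (side < 0 ? 0 : newSize);
    }
    diff += newSize - oldSize;
  }
  return pos + diff;
}

// Map a position through a sequence of step maps, in order.
function mapThrough(maps, pos, bias = 1) {
  return maps.reduce((current, ranges) => mapPos(ranges, current, bias), pos);
}

const splitAt10 = [[10, 0, 2]]; // insert 2 tokens at position 10
const delete2to5 = [[2, 3, 0]]; // remove 3 tokens starting at position 2
const maps = [splitAt10, delete2to5];

console.log(mapThrough(maps, 15)); // 14
console.log(mapThrough(maps, 6)); // 3
console.log(mapThrough(maps, 10)); // 9
console.log(mapThrough(maps, 10, -1)); // 7
```

The only place where bias matters here is position 10, which sits exactly on the insertion: with the default bias it ends up after the two inserted tokens, with -1 it stays before them.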

The reason for keeping each individual step small and straightforward is to enable this kind of mapping, as well as to allow steps to be inverted losslessly and to map step positions relative to one another.

Rebasing

(Admittedly, I didn’t fully grasp the meaning of this section, so I’ve translated it verbatim from the documentation without adding my own interpretation. Corrections are welcome if any inaccuracies exist—translator’s note.)

When dealing with more complex tasks involving steps and maps—such as implementing your own change tracking or integrating collaborative editing features—you’ll need to rebase steps.

You might want to skip learning this part until you’re certain you actually need it.

Rebasing, in the simple case, means taking two steps that start from the same document and transforming one of them so that it can be applied to the document produced by the other instead. In pseudocode:

stepA(doc) = docA
stepB(doc) = docB
stepB(docA) = MISMATCH!
rebase(stepB, mapA) = stepB'
stepB'(docA) = docAB

Steps have a map method which, given a mapping, maps the whole step through it. This can fail, since some steps no longer make sense when, for example, the content they applied to has been deleted. When it succeeds, you get a step that points into the new document, the one after the changes you mapped through. So in the pseudocode above, rebase(stepB, mapA) can simply be implemented as stepB.map(mapA).
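The pseudocode can be played through on a deliberately tiny model. Everything here is an assumption made for illustration: the document is a plain string, the only kind of step is “delete the range [from, to)”, and applyDelete, mapPos, and rebase are hypothetical helpers rather than ProseMirror APIs.

```javascript
// Toy model: the document is a string and a step deletes the range [from, to).
function applyDelete(doc, step) {
  return doc.slice(0, step.from) + doc.slice(step.to);
}

// Map a position through the deletion performed by `step`.
function mapPos(step, pos) {
  if (pos <= step.from) return pos; // before the deletion: unchanged
  if (pos >= step.to) return pos - (step.to - step.from); // after: shifted back
  return step.from; // inside the deleted range: collapses to its start
}

// Rebase `step` over `other`; returns null when nothing is left to delete.
function rebase(step, other) {
  const from = mapPos(other, step.from);
  const to = mapPos(other, step.to);
  return from < to ? { from, to } : null;
}

const doc = 'abcdef';
const stepA = { from: 1, to: 3 }; // deletes "bc"
const stepB = { from: 4, to: 5 }; // deletes "e"

const docA = applyDelete(doc, stepA); // "adef"
const stepBPrime = rebase(stepB, stepA); // { from: 2, to: 3 }
const docAB = applyDelete(docA, stepBPrime);
console.log(docAB); // adf

// A step whose target was already deleted by the other step cannot be rebased.
const lost = rebase({ from: 1, to: 2 }, { from: 0, to: 4 });
console.log(lost); // null
```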

If you need to rebase a chain of steps onto another chain of steps:

stepA2(stepA1(doc)) = docA
stepB2(stepB1(doc)) = docB
???(docA) = docAB

We can map stepB1 through stepA1 and stepA2, resulting in stepB1'. However, for stepB2, which originates from the document produced by stepB1(doc) and whose mapped version must be applied to the document produced by stepB1'(docA), things get more complicated. It must be mapped through the following chain of maps:

rebase(stepB2, [invert(mapB1), mapA1, mapA2, mapB1'])

That is, first through the inverse of stepB1’s map, to get back to the original document, then through the pipeline of maps produced by applying stepA1 and stepA2, and finally through the map produced by applying stepB1' to docA.

If there were a stepB3, we would get its pipeline by taking the one above, prefixing it with invert(mapB2), and appending mapB2' to the end, and so on.

However, when stepB1 inserts some content and stepB2 operates on that content, mapping stepB2 through invert(mapB1) returns null, because the inverse of stepB1 deletes the content that stepB2 applies to. That content is reintroduced later in the pipeline, by mapB1'. The Mapping abstraction provides a way to track such pipelines, including the inverse relations between the maps in them, so that you can map steps through it in a way that lets them survive situations like this one.

Even if you have a rebased step, there’s no guarantee it will still be applicable to the current document. For instance, your step might add some marks, but another step could modify the parent node of the content you intended to mark, turning it into a node that no longer allows the marks from your previous step. Attempting to apply your step would then fail. A more appropriate approach in such cases is to simply discard the step.

The Editor State

What constitutes the editor’s state? Of course, you already have a document that forms part of it. But there’s also a selection (to complete the state). Additionally, there needs to be a way to store changes in mark settings, such as enabling or disabling a mark before editing begins (to fulfill a common requirement: clicking a mark like bold or font-size first, then editing).

ProseMirror’s state primarily consists of three components, which exist on the state object: doc, selection, and storedMarks.

import { schema } from 'prosemirror-schema-basic';
import { EditorState } from 'prosemirror-state';

let state = EditorState.create({ schema });
console.log(state.doc.toString()); // An empty paragraph
console.log(state.selection.from); // 1, the start of the paragraph

However, plugins may also need to store state. For example, the undo history plugin needs to save the history of changes. This is why the settings of active plugins are also stored in the state, and these plugins can define their own slots to store their own state.

Selection

ProseMirror supports various types of selections (and allows third-party code to define new selection types). These different types of selections appear as subclasses of Selection. Like documents and other state-related values, they are immutable—meaning to change a selection, you need to create a new selection object and a new state to hold it.

A selection has at least a start (.from) and an end (.to) position pointing to the current document. Many selection types also distinguish between anchor (the fixed side of the selection) and head (the movable side of the selection), so these properties exist on every selection object.

The most commonly used selection type is text selection, which represents a normal cursor (when anchor and head are the same) or selected text. Both ends of a text selection must be at inline positions, i.e., within nodes that allow inline content.

ProseMirror’s core library also supports node selections, in which a single document node is selected, for instance when you Ctrl/Cmd-click a node. Such a selection spans from the position directly before the node to the position directly after it.
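The relationship between anchor, head, from, and to can be sketched with a plain value object. This is a toy model, not the library’s TextSelection class; textSelection is a hypothetical helper that works on bare numbers instead of resolved document positions.

```javascript
// Toy model of a text selection as an immutable value object.
function textSelection(anchor, head = anchor) {
  return Object.freeze({
    anchor, // the side that stays put
    head, // the side that moves
    from: Math.min(anchor, head),
    to: Math.max(anchor, head),
    empty: anchor === head, // true for a plain cursor
  });
}

const cursor = textSelection(5);
const backward = textSelection(9, 4); // dragged from 9 back to 4

console.log(cursor.empty, cursor.from, cursor.to); // true 5 5
console.log(backward.from, backward.to); // 4 9

// "Changing" the selection means creating a new object.
const extended = textSelection(backward.anchor, 2);
console.log(extended.from, backward.from); // 2 4
```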

Transactions

During normal editing, new states are derived from old states. There are exceptions, such as loading a new document, where you may want to create a completely new state (translator’s note: one that is not derived from an old state).

State updates happen by applying (EditorState.apply) a transaction (Transaction) to an existing state, which produces a new state. Conceptually, this happens in one shot: given the old state and the transaction, a new value is computed for each component of the state, and those values are put together in a new state value.

let tr = state.tr;
console.log(tr.doc.content.size); // 25
tr.insertText('hello'); // Replaces selection with 'hello'
let newState = state.apply(tr);
console.log(tr.doc.content.size); // 30

Transaction is a subclass of Transform and inherits the way it builds up a new document by applying steps (Step) to an initial document. In addition, transactions track the selection and other state-related components, and provide convenient selection-related methods such as replaceSelection.

The simplest way to create a transaction is with the tr getter on the editor state object (for example, view.state.tr). This creates an empty transaction based on that state, to which you can then add steps and other updates.

By default, the old selection is mapped (Selection.map) through each step to produce a new selection, but you can also use setSelection to set a new selection explicitly.

let tr = state.tr;
console.log(tr.selection.from); // → 10
tr.delete(6, 8);
console.log(tr.selection.from); // → 8 (moved back)
tr.setSelection(TextSelection.create(tr.doc, 3));
console.log(tr.selection.from); // → 3

Similarly, the set of stored marks (storedMarks) is automatically cleared when the document or selection changes. It can be set using the setStoredMarks or ensureMarks methods.

Finally, the scrollIntoView method can be used to make sure that the next time the state is drawn, the selection is scrolled into view. You will probably want to call this for most user actions.

Like Transform methods, most Transaction methods return the transaction itself for convenient chaining.

Plugins

When creating a new state (EditorState.create), you can provide an array of plugins. These are stored in that state and in every state derived from it, and they can influence both how transactions are applied and how an editor based on that state behaves.

Plugins are instances of the Plugin class and can model a wide range of features. The simplest ones just add some props (EditorProps) to the editor view, for example to respond to certain events. More complex ones add new state to the editor and update it based on transactions.

When creating a plugin, you pass it an object (a PluginSpec) that specifies its behavior:

let myPlugin = new Plugin({
  props: {
    handleKeyDown(view, event) {
      console.log('A key was pressed!');
      return false; // We did not handle this
    },
  },
});

let state = EditorState.create({ schema, plugins: [myPlugin] });

If a plugin requires its own state slot (in Vue terms, a scoped slot—translator’s note), it can define its own state property:

let transactionCounter = new Plugin({
  state: {
    init() {
      return 0;
    },
    apply(tr, value) {
      return value + 1;
    },
  },
});

function getTransactionCount(state) {
  return transactionCounter.getState(state);
}

In the example above, the plugin simply counts the number of transactions applied to the state. The helper function uses the plugin’s getState method, which retrieves the plugin’s state from the editor’s state object.

Since the editor’s state is a persistent, immutable object and plugin state is part of it, plugin state values must also be immutable. For example, if a plugin’s state needs to change, the apply method must return a new value rather than modifying the old one, and no other code should alter it.
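How plugin state rides along with an immutable editor state can be modeled in a few lines. This is a sketch under simplifying assumptions, not the library’s implementation: the document is a string, a “transaction” is just an object carrying the new document, and createState and applyTransaction are hypothetical helpers.

```javascript
// Toy model of an editor state that threads plugin state through `apply`.
const counterPlugin = {
  key: 'counter',
  init: () => 0,
  apply: (tr, value) => value + 1, // returns a new value, never mutates
};

function createState(doc, plugins) {
  const pluginState = {};
  for (const plugin of plugins) pluginState[plugin.key] = plugin.init();
  return Object.freeze({ doc, plugins, pluginState: Object.freeze(pluginState) });
}

function applyTransaction(state, tr) {
  const pluginState = {};
  for (const plugin of state.plugins) {
    pluginState[plugin.key] = plugin.apply(tr, state.pluginState[plugin.key]);
  }
  // A brand-new state object; the old one is left as it was.
  return Object.freeze({
    doc: tr.doc,
    plugins: state.plugins,
    pluginState: Object.freeze(pluginState),
  });
}

const state0 = createState('hello', [counterPlugin]);
const state1 = applyTransaction(state0, { doc: 'hello!' });
const state2 = applyTransaction(state1, { doc: 'hello!!' });

console.log(state0.pluginState.counter); // 0 (the old state is untouched)
console.log(state2.pluginState.counter); // 2
```

Because every apply returns a fresh value, older states such as state0 stay valid, which is what features like undo rely on.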

For plugins, it’s often useful to attach additional information to transactions. For example, in undo history, when performing an undo operation, a marker is added to the resulting transaction. When a plugin detects this marker, it treats the transaction specially: the plugin removes the top item from the undo stack and adds the transaction to the redo stack, rather than applying it as a normal change to the current document.

To achieve this (adding extra information to transactions), transactions allow metadata to be attached to them. We can update the transaction-counting plugin (the example mentioned above—translator’s note) to ignore marked transactions, as shown below:

let transactionCounter = new Plugin({
  state: {
    init() {
      return 0;
    },
    apply(tr, value) {
      if (tr.getMeta(transactionCounter)) return value;
      else return value + 1;
    },
  },
});

function markAsUncounted(tr) {
  tr.setMeta(transactionCounter, true);
}

Keys for metadata properties can be strings, but to avoid name collisions, it is strongly recommended to use plugin objects or PluginKey objects (translator’s note: similar in principle to Symbols). Some string keys have been given a meaning by the library: for example, "addToHistory" can be set to false to prevent a transaction from being undoable, and when handling a paste, the editor view sets the "paste" metadata property on the resulting transaction to true.

The view component

ProseMirror’s editor view is a user interface component that displays the editor state to users and allows them to perform editing operations.

For the core view component, the definition of “editing operations” is quite narrow: it handles direct interaction with the editing surface, such as typing, copying, pasting, and dragging, but not much beyond that. Things like displaying a menu, or even providing a full set of key bindings, lie outside the responsibilities of the core view component and have to be arranged through plugins.

Editable DOM

Browsers allow us to designate parts of the DOM as editable, which makes it possible to focus them, place a selection in them, and type into them. The view component creates a DOM representation of the document (by default using your schema’s toDOM methods) and makes it editable. When the editable element is focused, ProseMirror makes sure that the DOM selection corresponds to the selection in the editor state.

The view also registers event handlers for many DOM events and converts those events into appropriate transactions. For example, when pasting, the pasted content is parsed into a ProseMirror document slice and then inserted into the document.

Many events are also let through as they are, handled natively by the browser, and only then reinterpreted in terms of ProseMirror’s data model. For instance, browsers are quite good at cursor and selection placement (especially with bidirectional text), so most cursor-movement keys and mouse actions are handled by the browser. Afterwards, ProseMirror checks what kind of text selection the current DOM selection corresponds to. If that selection differs from the current one, a transaction that updates the selection is dispatched.

Even typing is usually left to the browser, because interfering with it tends to break native features such as spell-checking and auto-capitalization on mobile devices. When the browser updates the DOM, the editor notices, re-parses the changed part of the document, and translates the difference into a transaction.

Data flow

So, the editor view displays a given editor state, and when something happens, it creates a transaction and dispatches it (translator’s note: so that it can be picked up by plugins or other code). This transaction is then typically used to create a new state, which is given to the view using its updateState method:

[Figure: prosemirror-data-flow. A cycle: DOM event → EditorView → Transaction → new EditorState → EditorView]

As shown, ProseMirror establishes a simple cyclic data flow, which is entirely different from the typical imperative event-handling implementations (common in the JavaScript world) that often result in a more complex network of data flows.

Because transactions are dispatched through the dispatchTransaction prop, it is possible to “intercept” them and wire this cyclic data flow into a larger cycle. If your whole app is modeled with a similar data flow (translator’s note: like that of view frameworks such as React or Vue), for example with Redux or a similar architecture, you can integrate ProseMirror’s transactions into your main action-dispatching cycle and keep ProseMirror’s state in your application “store” (translator’s note: borrowing the Redux term).

// The app's state
let appState = {
  editor: EditorState.create({ schema }),
  score: 0,
};
let view = new EditorView(document.body, {
  state: appState.editor,
  dispatchTransaction(transaction) {
    update({ type: 'EDITOR_TRANSACTION', transaction });
  },
});

// A crude app state update function, which takes an update object,
// updates the `appState`, and then refreshes the UI.
function update(event) {
  if (event.type == 'EDITOR_TRANSACTION')
    appState.editor = appState.editor.apply(event.transaction);
  else if (event.type == 'SCORE_POINT') appState.score++;
  draw();
}
// An even cruder drawing function
function draw() {
  document.querySelector('#score').textContent = appState.score;
  view.updateState(appState.editor);
}

Efficient Updating

One way to implement the updateState functionality is to re-render the entire document every time it is called. However, for larger documents, this would be very slow.

Therefore, when updating the view, the view compares the old document with the new one and preserves the parts of the DOM that remain unchanged (while replacing the new ones—translator’s note). ProseMirror handles this for you, ensuring that each update requires only minimal work.

In some cases, such as updates that correspond to typed text, the text was already added to the DOM by the browser’s own editing actions, so making the DOM and the state consistent requires no DOM changes at all. (Translator’s note: the browser modifies the DOM, ProseMirror observes the change and creates a transaction that brings the state in line with the DOM.) When such a transaction is canceled or modified in some way, the view undoes the DOM change so that the DOM and the state stay in sync.

Similarly, the DOM selection is only updated when it is actually out of sync with the selection in the state, to avoid disrupting the hidden state that browsers keep along with the selection (such as the behavior where, when you arrow up or down past a short line, your horizontal position is restored once you reach the next long line).

Props

‘Props’ is a useful term borrowed directly from React. Props are like parameters to a UI component. Ideally, the set of props a component receives completely defines its behavior.

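// The first argument is the place to mount the editor; document.body is
// used here as in the data flow example (an assumption for this snippet).
let view = new EditorView(document.body, {
  state: myState,
  editable() {
    return false; // Enables read-only behavior
  },
  handleDoubleClick() {
    console.log('Double click!');
  },
});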

As such, the current state is one prop. The values of other props can also change over time, if the code that controls the component (translator’s note: the code that passes props to it) updates them, but they are not considered state, because the component itself never changes them. updateState is simply a shorthand for updating the state prop.

Plugins can also declare props, except for state and dispatchTransaction, which can only be provided directly to the view. (Translator’s note: plugins may define a state field for their own plugin state; the state meant here is the editor’s state.)

function maxSizePlugin(max) {
  return new Plugin({
    props: {
      editable(state) {
        return state.doc.content.size < max;
      },
    },
  });
}

When a given prop is declared multiple times, how it is handled depends on the prop. In general, props provided directly to the view take precedence, followed by those of each plugin, in plugin order. For some props, such as domParser, the first value found is used and the others are ignored. For handler functions, which return a boolean to indicate whether they handled the event, the first one that returns true gets to handle the event (translator’s note: the remaining handlers of the same type are skipped). Finally, for some props, such as attributes (which sets attributes on the editable DOM node) and decorations (covered in the next section), the union of all provided values is used.
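The precedence rules can be sketched with plain objects. The helpers below (propSources, runHandlers, firstValue) are hypothetical names chosen for this illustration, and plugins are reduced to objects with a props field; the real view does this internally.

```javascript
// Toy model of prop resolution; these helper names are illustrative only.
function propSources(viewProps, plugins) {
  // Props given directly to the view come first, then plugins in order.
  return [viewProps, ...plugins.map((plugin) => plugin.props)];
}

// Handlers: stop at the first one that returns true.
function runHandlers(viewProps, plugins, name, ...args) {
  for (const props of propSources(viewProps, plugins)) {
    const handler = props[name];
    if (handler && handler(...args)) return true;
  }
  return false;
}

// Single-value props: the first defined value wins.
function firstValue(viewProps, plugins, name) {
  for (const props of propSources(viewProps, plugins)) {
    if (props[name] !== undefined) return props[name];
  }
  return undefined;
}

const calls = [];
const viewProps = {
  handleKeyDown() {
    calls.push('view');
    return false; // not handled, so plugins get a chance
  },
};
const plugins = [
  { props: { domParser: 'parserA', handleKeyDown() { calls.push('pluginA'); return true; } } },
  { props: { domParser: 'parserB', handleKeyDown() { calls.push('pluginB'); return true; } } },
];

const handled = runHandlers(viewProps, plugins, 'handleKeyDown');
console.log(handled, calls); // true [ 'view', 'pluginA' ]
console.log(firstValue(viewProps, plugins, 'domParser')); // parserA
```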

Decorations

Decorations give you some control over how the view draws your document. They are created by returning values from the decorations prop, and come in three types: node decorations, which add styling or other DOM attributes to a single node’s DOM representation; widget decorations, which insert a DOM node that is not part of the actual document at a given position; and inline decorations, which add styling or attributes, much like node decorations, but to all inline nodes in a given range.

To be able to draw and compare decorations efficiently, they must be provided as a decoration set (DecorationSet, a data structure that mimics the tree shape of the actual document). You create one with the static create method (DecorationSet.create), passing it the document and an array of decoration objects:

let purplePlugin = new Plugin({
  props: {
    decorations(state) {
      return DecorationSet.create(state.doc, [
        Decoration.inline(0, state.doc.content.size, {
          style: 'color: purple',
        }),
      ]);
    },
  },
});

When you have many decorations, recreating the decoration set in memory during every redraw can be costly. Therefore, in such cases, it is recommended to maintain your decorations in the plugin’s state, map them to the new document state when the document changes, and update them only when necessary.

let specklePlugin = new Plugin({
  state: {
    init(_, { doc }) {
      let speckles = [];
      for (let pos = 1; pos < doc.content.size; pos += 4)
        speckles.push(
          Decoration.inline(pos - 1, pos, { style: 'background: yellow' })
        );
      return DecorationSet.create(doc, speckles);
    },
    apply(tr, set) {
      return set.map(tr.mapping, tr.doc);
    },
  },
  props: {
    decorations(state) {
      return specklePlugin.getState(state);
    },
  },
});

In the example, the plugin initializes its state as a decoration set, adding a yellow inline background decoration every four positions. This may not be very useful, but similar scenarios can implement features like highlighting search results or adding comment areas.

When a transaction is applied to the state, the plugin state’s [apply](https://prosemirror.xheldon.com/docs/ref/#state.StateField.apply) method maps the decoration set forward, so that the decorations stay in place and “fit” the new document shape. Because the decoration set is a tree, the mapping method is typically efficient for local changes: only the parts of the tree touched by the change need to be rebuilt.
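To make "mapping forward" more concrete, here is a small stand-alone model of what happens to positions when the document changes. It is a sketch under simplifying assumptions (a single replacement, no tree structure), and `mapPos` and `mapInline` are hypothetical helpers rather than ProseMirror API; in a real plugin, `set.map(tr.mapping, tr.doc)` does this work for you.

```javascript
// Simplified model of mapping a position through ONE replacement that swaps
// the range [from, to) for `insertSize` new positions. `mapPos` and
// `mapInline` are hypothetical helpers, not ProseMirror API.
function mapPos(pos, change, assoc = 1) {
  const { from, to, insertSize } = change;
  if (pos < from) return pos; // Before the change: unaffected.
  if (pos > to) return pos + insertSize - (to - from); // After it: shifted.
  // The position touches the replaced range, so pick a side of the new content.
  let side = assoc;
  if (to > from) {
    if (pos === from) side = -1;
    else if (pos === to) side = 1;
  }
  return side < 0 ? from : from + insertSize;
}

// An inline decoration covers a range; it is dropped once the range collapses.
function mapInline(deco, change) {
  const from = mapPos(deco.from, change, 1);
  const to = mapPos(deco.to, change, -1);
  return from >= to ? null : { from, to };
}

const typing = { from: 2, to: 2, insertSize: 3 }; // three characters typed at position 2
console.log(mapInline({ from: 4, to: 5 }, typing)); // shifted right by three
console.log(mapInline({ from: 0, to: 1 }, typing)); // untouched
const deletion = { from: 3, to: 8, insertSize: 0 }; // positions 3 to 8 deleted
console.log(mapInline({ from: 4, to: 5 }, deletion)); // null: the range collapsed
```

Decorations that lie entirely before a change keep their positions, which is why a tree-shaped set can leave most of its branches alone after a local edit.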

(In production environments, the plugin’s apply method may also handle adding or removing decorations triggered by new events, which can be detected by inspecting transaction metadata or information carried by the transaction.)
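A minimal sketch of such an apply function follows. The `'highlight'` metadata key and the `{ add, remove }` payload shape are assumptions invented for this illustration; the methods it relies on (`tr.getMeta`, and `map`, `add` and `remove` on a decoration set) are part of the documented API.

```javascript
// Sketch of an `apply` function for a plugin state field that holds a
// DecorationSet. Assumption: other code attaches a payload of the form
// { add: [...], remove: [...] } (arrays of Decoration objects) to a
// transaction with tr.setMeta('highlight', payload).
function applyHighlights(tr, set) {
  // Always map the existing decorations forward first.
  set = set.map(tr.mapping, tr.doc);
  const action = tr.getMeta('highlight');
  if (!action) return set; // Ordinary transaction: mapping is all we need.
  if (action.remove && action.remove.length) set = set.remove(action.remove);
  if (action.add && action.add.length) set = set.add(tr.doc, action.add);
  return set;
}
```

It would be used as the `apply` member of the plugin’s state field, in place of the mapping-only version shown earlier.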

Finally, the decorations property simply returns the plugin’s state, which will display the decorations in the view.

Node Views

Another way to influence how the editor view renders your document is through [node views](https://prosemirror.xheldon.com/docs/ref/#view.NodeView). These let you define a sort of miniature UI component for individual nodes in the document. They allow you to control how a node’s DOM is rendered, define how it is updated, and write custom code to react to events.

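// The class is defined before the view is created, because the view may
// call the nodeViews constructors as soon as it renders the document.
class ImageView {
  constructor(node) {
    // The editor will use this as the node's DOM representation
    this.dom = document.createElement('img');
    this.dom.src = node.attrs.src;
    this.dom.addEventListener('click', (e) => {
      console.log('You clicked me!');
      e.preventDefault();
    });
  }
  stopEvent() {
    return true;
  }
}

// `place` (the DOM node to mount the editor in) and `state` (an EditorState)
// are assumed to exist already.
let view = new EditorView(place, {
  state,
  nodeViews: {
    image(node) {
      return new ImageView(node);
    },
  },
});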

In the example, the image node’s view object creates a custom DOM node for the image, adds event handlers, and includes a stopEvent method to indicate that ProseMirror should ignore events from this DOM node.

You’ll often want interaction with a node view to have an effect on the actual node in the document. But to create a transaction that changes a node, you first need to know where that node is. To help with that, node views are passed a getter function that can be used to query their current position in the document. Let’s modify the previous example so that clicking the image prompts you for alt text.

class ImageView {
  constructor(node, view, getPos) {
    this.dom = document.createElement('img');
    this.dom.src = node.attrs.src;
    this.dom.alt = node.attrs.alt;
    this.dom.addEventListener('click', (e) => {
      e.preventDefault();
      let alt = prompt('New alt text:', '');
      // Only dispatch when the user actually entered something.
      if (alt)
        view.dispatch(
          view.state.tr.setNodeMarkup(getPos(), null, {
            src: node.attrs.src,
            alt,
          })
        );
    });
  }
  stopEvent() {
    return true;
  }
}

// `place` (the DOM node to mount the editor in) and `state` (an EditorState)
// are assumed to exist already.
let view = new EditorView(place, {
  state,
  nodeViews: {
    image(node, view, getPos) {
      return new ImageView(node, view, getPos);
    },
  },
});

setNodeMarkup is a method that can be used to change the type or attributes of a node at a given position. In the example above, we use the getPos method to find the current position of the image node, then assign new attributes with updated alt text to that node.

When a node updates, the default behavior is to preserve its outer DOM structure, only comparing its children with the new set of child nodes and updating or replacing them as needed. A node view can override this default behavior, allowing us to perform actions like updating CSS class names for paragraphs based on node content.

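class ParagraphView {
  constructor(node) {
    // The paragraph element is both the outer node and the content container.
    this.dom = this.contentDOM = document.createElement('p');
    if (node.content.size == 0) this.dom.classList.add('empty');
  }
  update(node) {
    // Refuse nodes this view cannot represent.
    if (node.type.name != 'paragraph') return false;
    if (node.content.size > 0) this.dom.classList.remove('empty');
    else this.dom.classList.add('empty');
    return true;
  }
}

// `place` (the DOM node to mount the editor in) and `state` (an EditorState)
// are assumed to exist already.
let view = new EditorView(place, {
  state,
  nodeViews: {
    paragraph(node) {
      return new ParagraphView(node);
    },
  },
});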

Images don’t contain content, so in our earlier example, we didn’t need to worry about how their content would be rendered. But paragraphs do have content. Node views support two approaches for handling content: you can let ProseMirror manage it, or you can handle it entirely manually. If you provide a contentDOM property, ProseMirror will render the node’s content inside that DOM node and handle content updates. If you don’t provide this property, the node’s content becomes a black box to the editor - how you display the content and handle user interaction is entirely up to you.

In this case, we want the paragraph content to behave like normal editable text, so the contentDOM property is defined the same as the dom property, since the content needs to be rendered directly into the outer container.

The magic happens in the update method. First, this method gets to decide whether the node view can be updated to represent the new node at all. The new node may be anything the editor’s update algorithm tries to draw at this spot, so you must verify that it is a node this node view can handle.

The update method in the example first checks whether the new node is still a paragraph, aborting if not. Then based on the new node’s content, it determines whether the empty class name should be present on the node. Returning true indicates the update was successful (at which point the node’s content will be updated).

Commands

In ProseMirror terminology, a command is a function that implements an editing action, which the user can trigger by pressing a key combination (such as cmd+a for select all) or by interacting with the menu.

For practical reasons, commands have a slightly convoluted interface. In their simplest form, they are functions that take an editor state and a dispatch function (EditorView.dispatch or some other function that takes a transaction) and return a boolean. Here’s a very simple example:

function deleteSelection(state, dispatch) {
  if (state.selection.empty) return false;
  dispatch(state.tr.deleteSelection());
  return true;
}

If a command isn’t available, it should return false and do nothing. When available, it should dispatch a transaction and return true. The keymap plugin uses this mechanism to prevent keys already handled by one command from being processed by others.

To query whether a command can be applied to a given state without actually executing it, the dispatch parameter is optional. When no dispatch function is provided and the command is available, it will simply return true without performing any actions. The following example demonstrates this:

function deleteSelection(state, dispatch) {
  if (state.selection.empty) return false;
  if (dispatch) dispatch(state.tr.deleteSelection());
  return true;
}

To check whether the current selection can be deleted, you’d call deleteSelection(view.state, null), whereas actually deleting a selection would involve calling deleteSelection(view.state, view.dispatch). A menu bar could use this mechanism to determine whether menu buttons should be grayed out (indicating unavailability).
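As a rough illustration of the menu case, the sketch below wraps a command in a hypothetical `menuItem` helper and uses a dry run (no dispatch function) to decide whether the item is enabled. The state objects are stand-ins that contain only the fields this command reads; a real EditorState carries far more.

```javascript
// The command from the text, repeated so this snippet is self-contained.
function deleteSelection(state, dispatch) {
  if (state.selection.empty) return false;
  if (dispatch) dispatch(state.tr.deleteSelection());
  return true;
}

// Hypothetical menu helper: a command counts as "enabled" if a dry run
// (no dispatch function) returns true.
function menuItem(label, command) {
  return {
    label,
    isEnabled: (state) => command(state, null),
    run: (state, dispatch) => command(state, dispatch),
  };
}

// Stand-in state objects with just the fields this command reads.
const withSelection = {
  selection: { empty: false },
  tr: { deleteSelection() { return 'delete-transaction'; } },
};
const withCursor = { selection: { empty: true }, tr: withSelection.tr };

const item = menuItem('Delete', deleteSelection);
console.log(item.isEnabled(withSelection)); // true
console.log(item.isEnabled(withCursor)); // false
```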

In this form, commands do not get access to the actual editor view. Most commands don’t need it, which means they can be applied and tested in settings where no view is available. Some commands, however, do need to interact with the DOM: they might need to [query whether a given position is at the end of a textblock](https://prosemirror.xheldon.com/docs/ref/#view.EditorView.endOfTextblock) or want to open a dialog positioned relative to the view. For this purpose, most plugins that call commands pass them a third argument: the whole view.

function blinkView(_state, dispatch, view) {
  if (dispatch) {
    view.dom.style.background = 'yellow';
    setTimeout(() => (view.dom.style.background = ''), 1000);
  }
  return true;
}

This (rather useless) example shows that commands do not have to dispatch a transaction. They are called for their side effect, which is usually dispatching a transaction but can also be something else, such as popping up a dialog.

The [prosemirror-commands](https://prosemirror.xheldon.com/docs/ref/#commands) module provides a number of editing commands, from simple ones such as a variant of [deleteSelection](https://prosemirror.xheldon.com/docs/ref/#commands.deleteSelection) to rather complicated ones such as [joinBackward](https://prosemirror.xheldon.com/docs/ref/#commands.joinBackward), which implements the block-joining behavior that should happen when you press backspace at the start of a textblock. The module also includes a [basic keymap](https://prosemirror.xheldon.com/docs/ref/#commands.baseKeymap) that binds a number of schema-agnostic commands (commands that do not depend on a particular schema) to the keys usually used for them.

Where possible, different behaviors, even those usually bound to a single key, are put in separate commands (so a single key may be handled by different commands under different circumstances). The utility function [chainCommands](https://prosemirror.xheldon.com/docs/ref/#commands.chainCommands) can be used to combine several commands: they are tried one after another until one returns true.

For example, the basic keymap binds the backspace key to a command chain: [deleteSelection](https://prosemirror.xheldon.com/docs/ref/#commands.deleteSelection) (which applies when the selection is non-empty), [joinBackward](https://prosemirror.xheldon.com/docs/ref/#commands.joinBackward) (which applies when the cursor is at the start of a textblock), followed by [selectNodeBackward](https://prosemirror.xheldon.com/docs/ref/#commands.selectNodeBackward) (which selects the node before the selection if the schema prohibits normal node joining). When none of these apply, the browser executes its default behavior, which is appropriate for pressing backspace within a textblock (ensuring native spell-checking and similar features work correctly).
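The chaining behavior can be sketched in a few lines. The `chain` function below is a hypothetical re-implementation written only to show the idea (it is not the library’s source), and the three commands are stand-ins that inspect a plain object instead of a real editor state.

```javascript
// Hypothetical re-implementation of the behaviour described for chainCommands:
// try each command in turn and stop at the first one that returns true.
function chain(...commands) {
  return function (state, dispatch, view) {
    for (const command of commands) {
      if (command(state, dispatch, view)) return true;
    }
    return false; // Nothing applied; the caller may fall back to default behaviour.
  };
}

// Stand-in commands that only look at a plain object, for illustration.
const tried = [];
const deleteIfSelected = (state) => { tried.push('delete'); return !state.selectionEmpty; };
const joinIfAtStart = (state) => { tried.push('join'); return state.atTextblockStart; };
const selectNodeBefore = () => { tried.push('select'); return false; };

const backspace = chain(deleteIfSelected, joinIfAtStart, selectNodeBefore);
// With an empty selection at the start of a textblock, the second command
// applies and the third is never tried.
const handled = backspace({ selectionEmpty: true, atTextblockStart: true });
console.log(handled, tried);
```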

The commands module also exports some command constructors, such as toggleMark, which takes a mark type and an optional set of attributes, then returns a command function that can toggle the mark on the current selection.
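The "command constructor" pattern is simply a function that takes some configuration and returns a command. The sketch below shows the shape with a made-up constructor, `insertTextCommand`, which is not part of prosemirror-commands; it relies only on the transaction’s insertText method, which replaces the selection when no range is given.

```javascript
// Hypothetical command constructor: takes configuration, returns a command
// that follows the (state, dispatch) => boolean convention.
function insertTextCommand(text) {
  return function (state, dispatch) {
    if (!text) return false; // Nothing to insert: the command does not apply.
    if (dispatch) dispatch(state.tr.insertText(text));
    return true;
  };
}

// The returned function can be bound to a key or a menu item like any
// other command.
const insertArrow = insertTextCommand('->');
```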

Other modules may also export command functions, such as the [undo](https://prosemirror.xheldon.com/docs/ref/#history.undo) and [redo](https://prosemirror.xheldon.com/docs/ref/#history.redo) functions from the history module. To customize your own editor or allow users to interact with custom document nodes, you may need to write your own command functions.

Collaborative Editing

Real-time collaborative editing allows multiple users to edit the same document simultaneously. Changes made by users are immediately applied to their local documents, then sent to others, with modifications from different users automatically merged (without manual conflict resolution). This editing experience ensures uninterrupted workflow while maintaining document consistency.

This guide explains how to get started with ProseMirror’s collaborative editing features.

Algorithm

ProseMirror’s collaborative editing system employs a central authority, which determines the order in which changes are applied to the document. If two editors make changes at the same time, both submit them to the authority. The authority accepts the changes from one of them and broadcasts those changes to all editors. The other changes are rejected; when that editor receives the new changes from the server, it has to rebase its local changes on top of those from the other editor and try to submit them again. (This resembles Git’s rebase: the rejected local changes are kept, the editor’s document is brought up to the latest version, and the local changes are then resubmitted to see whether the server accepts them this time.)

The role of the central authority is actually quite simple—it must:

  • Track the current version of the document
  • Accept changes from editors and, when these changes are applied, add them to its own list of modifications
  • Provide editors with a way to receive updates for a given version

Let’s implement a minimal central authority that runs in a JavaScript environment, just like the editor.

class Authority {
  constructor(doc) {
    this.doc = doc;
    this.steps = [];
    this.stepClientIDs = [];
    this.onNewSteps = [];
  }
  receiveSteps(version, steps, clientID) {
    if (version != this.steps.length) return;
    // Apply and accumulate new steps
    steps.forEach((step) => {
      this.doc = step.apply(this.doc).doc;
      this.steps.push(step);
      this.stepClientIDs.push(clientID);
    });
    // Signal listeners
    this.onNewSteps.forEach(function (f) {
      f();
    });
  }
  stepsSince(version) {
    return {
      steps: this.steps.slice(version),
      clientIDs: this.stepClientIDs.slice(version),
    };
  }
}

When an editor attempts to submit its changes to the authority, it calls the authority’s receiveSteps method. It passes the last version number it received, the new changes it has made based on that version, and its client ID (which identifies which changes originated from itself).

When the steps are accepted, the client notices because the authority announces that new steps are available and then hands the client its own steps back. In a real implementation, you could also have receiveSteps return a status and confirm the submitted steps immediately, as an optimization. The mechanism used here is still needed to guarantee synchronization over unreliable connections, so you should always keep it as the base case.

The example authority implementation will maintain an indefinitely growing array of steps, where its length represents the current version.
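To see the accept and reject behavior in action, the following sketch runs the authority with two pretend clients. The Authority class is repeated from the text so the snippet runs on its own; the `append` "steps" are a stand-in invented for this illustration (real steps come from prosemirror-transform and apply to ProseMirror documents), and the rebasing that the collab plugin performs is skipped entirely.

```javascript
// The Authority from the text, repeated so the snippet runs on its own.
class Authority {
  constructor(doc) {
    this.doc = doc;
    this.steps = [];
    this.stepClientIDs = [];
    this.onNewSteps = [];
  }
  receiveSteps(version, steps, clientID) {
    if (version != this.steps.length) return;
    steps.forEach((step) => {
      this.doc = step.apply(this.doc).doc;
      this.steps.push(step);
      this.stepClientIDs.push(clientID);
    });
    this.onNewSteps.forEach((f) => f());
  }
  stepsSince(version) {
    return {
      steps: this.steps.slice(version),
      clientIDs: this.stepClientIDs.slice(version),
    };
  }
}

// Stand-in "step": appends text to a string document.
const append = (text) => ({ apply: (doc) => ({ doc: doc + text }) });

const authority = new Authority('');
// Both clients start at version 0 and edit at the same time.
authority.receiveSteps(0, [append('A')], 'alice'); // accepted, version becomes 1
authority.receiveSteps(0, [append('B')], 'bob'); // stale version, silently ignored
console.log(authority.doc, authority.steps.length);

// Bob catches up on what he missed, then resubmits against the new version.
const missed = authority.stepsSince(0);
authority.receiveSteps(missed.steps.length, [append('B')], 'bob');
console.log(authority.doc, authority.stepClientIDs);
```

The version number is nothing more than the number of steps the authority has accepted so far, which is why a client that has seen every step can compute the version to submit against.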

The collab Module

The collab module exports a collab function, which returns a plugin used to track local modifications, accept remote changes, and determine when and what changes should be sent to the authority.

import { EditorState } from 'prosemirror-state';
import { EditorView } from 'prosemirror-view';
import { schema } from 'prosemirror-schema-basic';
import * as collab from 'prosemirror-collab';

function collabEditor(authority, place) {
  let view = new EditorView(place, {
    state: EditorState.create({
      doc: authority.doc,
      plugins: [collab.collab({ version: authority.steps.length })],
    }),
    dispatchTransaction(transaction) {
      let newState = view.state.apply(transaction);
      view.updateState(newState);
      let sendable = collab.sendableSteps(newState);
      if (sendable)
        authority.receiveSteps(
          sendable.version,
          sendable.steps,
          sendable.clientID
        );
    },
  });

  authority.onNewSteps.push(function () {
    let newData = authority.stepsSince(collab.getVersion(view.state));
    view.dispatch(
      collab.receiveTransaction(view.state, newData.steps, newData.clientIDs)
    );
  });

  return view;
}

The collabEditor function creates a new editor view that loads the collab plugin. Whenever the state updates, it checks whether anything needs to be sent to the authority and, if so, sends it.

It also registers a function that the authority will call when new modification steps are available. This function creates a transaction to update the local editor according to the steps provided by the authority.

When a set of steps is rejected by the authority, those steps remain unconfirmed until, presumably soon afterwards, we receive new steps from the authority. Once those new steps are applied, the onNewSteps callback calls [dispatch](https://prosemirror.xheldon.com/docs/ref/#state.Transaction), which triggers our dispatchTransaction function, and the code tries to submit its changes again.

That’s all there is to it. Of course, for asynchronous data flows (such as long polling or WebSockets in the collab demo), you’ll need more complex communication and synchronization code. You might also want your authority to discard some steps occasionally to reduce memory usage. But overall, this small example fully outlines how an authority should be implemented.

Originally published at: "Translated" ProseMirror Chinese Guide - Xheldon Blog