Improve CodeBlock Extensibility #11008

Open
2 tasks done
Danielku15 opened this issue Mar 19, 2025 · 3 comments
Labels
feature This is not a bug or issue with Docusaurus, per se. It is a feature request for the future.

Comments

@Danielku15
Contributor

Danielku15 commented Mar 19, 2025

Have you read the Contributing Guidelines on issues?

Description

The main goal of this feature is to let developers extend syntax highlighting in a React/Docusaurus-compatible way, and thereby work around the fact that Prism.js plugins are not available.

Has this been requested on Canny?

No response

Motivation

Docusaurus and prism-react-renderer do not allow standard Prism.js plugins, because those plugins work by manipulating the DOM directly. It has been discussed that Prism plugins are not supported, and using alternative components for syntax highlighting has also been discussed.

With this proposal, developers would get a mechanism for enriching syntax highlighting through magic comments and swizzling, without major impact on the Docusaurus core functionality.

My personal goal is to develop a plugin that generates hyperlinks for individual rendered tokens, so users can jump to the related reference documentation. With the current implementation, I have to swizzle and adapt a major part of the CodeBlock components to achieve this.

API design

As of today Docusaurus offers:

  1. Magic comments as a means of highlighting individual lines via CSS classes.
  2. The metadata string as a means of injecting simple options.
  3. Fine-grained swizzling as a mechanism for customizing component rendering.

The extensibility of CodeBlocks should be designed along the same lines. As a first step, the goal is to let individual website authors customize the components to their needs. A full-fledged plugin system, where you could pull in NPM packages and register them, is a non-goal at this point but could become an option in the future.

The concrete proposal is to:

  1. Provide a mechanism for end users to inject "plugin" configurations. (optional but recommended)
    1.1. Parse the metadata string into an option bag.

The metadata string is currently a plain string that is parsed ad hoc in several places. Docusaurus should instead parse the options once, using a defined syntax:

grammar ExprParser;

metadataComment

metadata
    : rangeSyntax? optionSyntax* EOF;

rangeSyntax: '{' lineRange (',' lineRange)* '}';

lineRange: INT 
         | INT '-' INT;

optionSyntax: ID
            | ID '=' ID
            | ID '=' STRING;
    
INT : [0-9]+ ;
ID: [a-zA-Z_][a-zA-Z_0-9]* ;
WS: [ \t\n\r\f]+ -> skip ;
STRING : '"' ~[<"]* '"' | '\'' ~[<']* '\''; 


The options are then parsed into a defined data structure for easy access in later steps. Docusaurus itself could also benefit from this bag internally, instead of doing "contains" checks on the raw string such as the following:

const showLineNumbersMeta = metastring
  ?.split(' ')
  .find((str) => str.startsWith('showLineNumbers'));
if (showLineNumbersMeta) {
  if (showLineNumbersMeta.startsWith('showLineNumbers=')) {
    const value = showLineNumbersMeta.replace('showLineNumbers=', '');
    return parseInt(value, 10);
  }
  return 1;
}

Options are meant to be unique: a later occurrence overwrites an earlier one. Support for arrays or maps as values could be added later if the need arises.
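
To make the option bag from 1.1 more concrete, here is a minimal regex-based sketch of a parser for the proposed syntax. The names `parseMetastring`, `ParsedMetadata` and `MetaOptions` are hypothetical and not part of Docusaurus. As an assumption that is looser than the grammar as written, bare values may also be digits so that `showLineNumbers=5` keeps working, and unrecognized input is skipped rather than reported.

```typescript
// Hypothetical option bag: bare flags become `true`, all other values stay strings.
type MetaOptions = Record<string, string | true>;

interface ParsedMetadata {
  highlightLines: number[]; // expanded from the optional leading {1,3-5} group
  options: MetaOptions;
}

function parseMetastring(metastring: string): ParsedMetadata {
  const result: ParsedMetadata = { highlightLines: [], options: {} };
  let rest = metastring.trim();

  // rangeSyntax: '{' lineRange (',' lineRange)* '}'
  const range =
    /^\{\s*(\d+(?:\s*-\s*\d+)?(?:\s*,\s*\d+(?:\s*-\s*\d+)?)*)\s*\}/.exec(rest);
  if (range) {
    for (const part of range[1].split(",")) {
      const bounds = part.split("-").map((n) => parseInt(n.trim(), 10));
      const start = bounds[0];
      const end = bounds.length > 1 ? bounds[1] : start;
      for (let line = start; line <= end; line++) {
        result.highlightLines.push(line);
      }
    }
    rest = rest.slice(range[0].length);
  }

  // optionSyntax: ID | ID '=' ID | ID '=' STRING
  // Assumption: a bare value may also contain only digits (e.g. showLineNumbers=5).
  const option =
    /([A-Za-z_][A-Za-z_0-9]*)(?:=(?:"([^"]*)"|'([^']*)'|([A-Za-z_0-9]+)))?/g;
  let m: RegExpExecArray | null;
  while ((m = option.exec(rest)) !== null) {
    const [, name, doubleQuoted, singleQuoted, bare] = m;
    // Later occurrences overwrite earlier ones, so options stay unique.
    result.options[name] = doubleQuoted ?? singleQuoted ?? bare ?? true;
  }
  return result;
}

const parsed = parseMetastring('{1,3-5} showLineNumbers title="My File"');
console.log(parsed.highlightLines); // [ 1, 3, 4, 5 ]
console.log(parsed.options); // { showLineNumbers: true, title: 'My File' }
```

With such a bag, a check like the one above would reduce to a lookup of `options.showLineNumbers`.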

1.2. Metadata Magic Comments
The metadata string requires all options to be placed on a single line, which becomes limiting when many options are needed. It would therefore be a useful extension to also allow the options from 1.1 to be specified via a magic comment syntax.

These magic comments would be parsed into the property bag and then erased.

For a simple initial version, the whole code block is scanned for magic comments and the bag is filled in one pass. Later this could be extended so that options can change at any point in the code and affect only subsequent lines.

An option comment starts with a configurable special prefix (to be defined) that should not interfere with the majority of real comments, and it allows only one option per comment. The configuration option prism: { optionCommentPrefix: '!docusaurus-' } lets users eliminate conflicts.

// !docusaurus-option
// !docusaurus-option2=true
// !docusaurus-option3=value
// !docusaurus-option4="long value"
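
As a rough illustration of the "parse into the bag, then erase" behaviour, the sketch below scans a code string line by line. The function `extractOptionComments` and the type `MetaOptions` are hypothetical names, and the sketch assumes only `//` line comments need to be recognized, whereas a real implementation would have to handle each language's comment syntax.

```typescript
// Hypothetical option bag: bare flags become `true`, all other values stay strings.
type MetaOptions = Record<string, string | true>;

// Escape a user-supplied prefix so it can be embedded safely in a RegExp.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function extractOptionComments(
  code: string,
  prefix: string = "!docusaurus-",
): { code: string; options: MetaOptions } {
  // Assumption: only `//` comments that fill a whole line count, one option each.
  const pattern = new RegExp(
    `^\\s*//\\s*${escapeRegExp(prefix)}([A-Za-z_][A-Za-z_0-9]*)` +
      `(?:=(?:"([^"]*)"|([^\\s"]+)))?\\s*$`,
  );
  const options: MetaOptions = {};
  const kept: string[] = [];
  for (const line of code.split("\n")) {
    const m = pattern.exec(line);
    if (m) {
      // Option comments fill the bag and are erased from the rendered code.
      options[m[1]] = m[2] ?? m[3] ?? true;
    } else {
      kept.push(line);
    }
  }
  return { code: kept.join("\n"), options };
}

const sample = [
  "// !docusaurus-option",
  '// !docusaurus-option4="long value"',
  "const x = 1; // an ordinary comment stays",
].join("\n");
const extracted = extractOptionComments(sample);
console.log(extracted.options); // { option: true, option4: 'long value' }
console.log(extracted.code); // const x = 1; // an ordinary comment stays
```
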
  2. Create a new swizzlable component for rendering individual tokens and ensure it receives the metadata bag. (required for MVP)
    2.1. Add a swizzlable LineToken component.

Currently the Line component creates a span directly, which prevents any customization of the rendered token.

<span key={key} {...getTokenProps({token})} />

A new component would allow swizzling and customizing the rendering at the token level (e.g. a dev could render an <a href=""> instead).
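
The decision a customized LineToken might make can be sketched without React. The example below returns plain objects instead of JSX so that it runs on its own; `renderLineToken`, `RenderedToken`, and the `apiBase` option are hypothetical names, and the `Token` shape only mirrors the `types`/`content` fields that prism-react-renderer tokens carry.

```typescript
// Hypothetical option bag passed down from the parsed metadata.
type MetaOptions = Record<string, string | true>;

// Minimal token shape, mirroring the fields of prism-react-renderer tokens.
interface Token {
  types: string[];
  content: string;
}

// Framework-free stand-in for a React element, to keep the sketch runnable.
interface RenderedToken {
  tag: "span" | "a";
  props: Record<string, string>;
  text: string;
}

// Hypothetical swizzled LineToken: link class names and functions to reference
// docs when the author supplied an `apiBase` option; otherwise render a span.
function renderLineToken(token: Token, options: MetaOptions): RenderedToken {
  const className = ["token"].concat(token.types).join(" ");
  const base = options.apiBase; // hypothetical option name
  const linkable = token.types.some(
    (type) => type === "class-name" || type === "function",
  );
  if (typeof base === "string" && linkable) {
    const href = `${base}/${encodeURIComponent(token.content)}`;
    return { tag: "a", props: { className, href }, text: token.content };
  }
  return { tag: "span", props: { className }, text: token.content };
}

const link = renderLineToken(
  { types: ["class-name"], content: "Foo" },
  { apiBase: "/docs/api" },
);
console.log(link.tag, link.props.href); // a /docs/api/Foo
```

In an actual swizzled component the same branching would return `<a>` or `<span>` elements, with the option bag arriving as a prop.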

2.2. Pass through the metadata options.

The parsed options from 1. should be passed down to the Line and LineToken components so devs can make use of it. If we decide not to parse the metadata string (yet?) at least the plain string available on String.tsx should be passed through.

Have you tried building it?

I successfully created a rough prototype locally by swizzling CodeBlock and Line to pass-through the metadata and customize the generated token. But it requires swizzling of various internal components which is risky.


Self-service

  • I'd be willing to contribute this feature to Docusaurus myself.
Danielku15 added the "feature" and "status: needs triage" labels on Mar 19, 2025
slorber removed the "status: needs triage" label on Mar 19, 2025
@slorber
Collaborator

slorber commented Mar 19, 2025

Thanks for the feature design.

I generally agree with many things you said.

This historical component is quite messy in its current state, and we should probably refactor it, handle the metastring in one place, and split it into more granular, encapsulated components to make it easier to swizzle and extend.

In general, I'm not a fan of introducing a custom grammar for the metastring, and I think we should rather rely on something that already exists. I guess we could just use URLSearchParams. The problem is that we historically support the {1,3-5} syntax, and removing it would be a breaking change. However, it's probably easy to add a pre-processing step that expands {1,3-5} to highlight=1 highlight=3-5 and then parse it in a more generic way (using spaces would probably be less awkward than using &).
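
One possible reading of that pre-processing idea, as a sketch: expand the legacy range group, join the space-separated options with `&`, and hand the result to the standard `URLSearchParams`. The helper names `expandLegacyRange` and `parseWithSearchParams` are hypothetical. The sketch assumes values contain no spaces, so quoted values such as title="My File" would need extra handling, and `URLSearchParams` decodes `+` as a space.

```typescript
// Expand a leading {1,3-5} group into repeated highlight=... options.
function expandLegacyRange(metastring: string): string {
  return metastring.replace(/^\s*\{([^}]*)\}/, (_match: string, ranges: string) =>
    ranges
      .split(",")
      .map((range) => `highlight=${range.trim()}`)
      .join(" "),
  );
}

// Assumption: options are separated by whitespace and values contain no spaces.
function parseWithSearchParams(metastring: string): URLSearchParams {
  const query = expandLegacyRange(metastring)
    .trim()
    .split(/\s+/)
    .filter((part) => part.length > 0)
    .join("&");
  return new URLSearchParams(query);
}

const params = parseWithSearchParams("{1,3-5} showLineNumbers title=demo");
console.log(params.getAll("highlight")); // [ '1', '3-5' ]
console.log(params.has("showLineNumbers")); // true
console.log(params.get("title")); // demo
```

Repeated `highlight` keys are read with `getAll`, while a bare flag such as `showLineNumbers` shows up as a key with an empty value.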


In general, I think we should adopt React Server Components, and then eventually rebuild this code block component from the ground up based on our ability to use heavier highlighters like Shiki.

Until we get there, let's try to not be too disruptive. We can refactor/improve things incrementally. Some changes can be non-breaking, and we can also ship some breaking changes gated behind a v4 future flag.

If you want to send small refactor PRs, I'm happy to review them. It seems relatively safe to:

  • Centralize things like the meta string parsing in a single place
  • Create other subcomponents like <LineToken>

It's probably worth considering other pending code block PRs I need to review:

@Danielku15
Contributor Author

Thanks for the feedback. I also agree with your points. The RFC might sound a bit more complex than the actual code change will likely be.

On the grammar you're right, but on the other hand the syntax is already established (numbering, title, etc.) and regex-based parsing is mostly implemented in codeBlockUtils. Writing a low-level parser (e.g. a recursive descent parser) might bring slight performance benefits, but these likely do not justify the added complexity and maintenance effort.

I'll give it a shot and propose a PR; then we'll get a better feeling for the impact and the real complexity.

@Danielku15
Contributor Author

@slorber A first proposal is now ready at #11011
