Discussion:
Patch: Enables integration of syntax highlighting systems
Lester L. Martin II
2018-06-28 04:31:57 UTC
This patch changes the way `void artifact_page(void)` renders a file's
content. Formerly a `<blockquote><pre>` was issued for content, whereas
now a `<blockquote><pre><code class=$ext>` is issued, where $ext is the
file's extension (for example, the extension of "blah.lua" is "lua").

This allows for proper integration with syntax highlighting systems such
as "highlight.js" and "prism.js", of which only the former was tested. It
allows syntax highlighting without having to rely on language detection.
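
A minimal sketch of the extension rule described above, written in JavaScript for illustration only (the patch itself is C). `fileExtension`, `escapeHtml`, and `openingTags` are hypothetical names, not Fossil functions; `escapeHtml` stands in for the HTML escaping that the `%h` format performs in the patch.

```javascript
// Illustrative mirror of the patch's C expression:
//   ext = (tmp = strrchr(name, '.')) == NULL ? "" : tmp+1;
// Everything after the last '.' is the extension; no '.' means "".
function fileExtension(name) {
  const dot = name.lastIndexOf(".");
  return dot === -1 ? "" : name.slice(dot + 1);
}

// Stand-in for HTML escaping of the class attribute value.
function escapeHtml(s) {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Opening tags for the content block of a file with the given name.
function openingTags(name) {
  return '<pre><code class="' + escapeHtml(fileExtension(name)) + '">';
}
```

With this rule, "archive.tar.gz" yields the class "gz" and a name with no dot yields an empty class, which leaves the highlighter to fall back on whatever it does for unlabeled blocks.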

Remaining work likely includes changing the CSS of all themes to take
this into account. Another thing that might be considered is to
conditionally remove the `<blockquote>` part from all but the content
rendering. Another would be rendering syntax highlights with line
numbers (currently unimplemented).

Below is the output of 'fossil diff' run against checkout
'6a7d2ad8f1dd5c542eba0b885418328803d8d802' with my changes:

Index: src/info.c
==================================================================
--- /home/masky/misc/fossil/src/info.c~0 2018-06-27 23:21:48.520779000 -0500
+++ /home/masky/misc/fossil/src/info.c 2018-06-27 16:48:28.683256000 -0500
@@ -670 +670 @@
-
+
@@ -2157 +2157,2 @@
- const char *z;
+ const char *z, *ext, *name;
+ char *tmp;
@@ -2158,0 +2160,3 @@
+ name = blob_str(&downloadName);
+
+ ext = (tmp = strrchr(name, '.')) == NULL ? "" : tmp+1;
@@ -2162 +2166 @@
- @ <pre>
+ @ <pre><code class="%h(ext)">
@@ -2164 +2168 @@
- @ </pre>
+ @ </code></pre>
@@ -2232 +2236 @@
- ? db_text("(No title)",
+ ? db_text("(No title)",

!!END!!

I'm unsure why the "@@ -2232 +2236 @@" section got included, as no
changes were made to it.

I submit these works under the BSD 2 Clause license as found in
"COPYRIGHT-BSD2.txt" and assign the rights to D. Richard Hipp.

I would like to submit a contributor agreement to make it official.
Considering my previous post to the list, this change is so small that
one could have implemented it in less time than it would take for a
contributor agreement to finally be filed, but as there was no reply I
didn't wait for one.

Regarding the notes on CSS changes at the beginning of this email, a
final note: if the code is changed to omit `<blockquote>` around content
and reserve it for the other cases only, CSS changes would not need to
be made.

If this is not fit for contribution please let me know what I might need
to do to get it fit and I'll proceed further.

Thanks,

--
Lester L. Martin II
Chad Perrin
2018-06-28 05:36:32 UTC
Post by Lester L. Martin II
This patch changes the way `void artifact_page(void)` renders a files
content. Formerly a `<blockquote><pre>` was issued for content,
whereas now a `<blockquote><pre><code class=$ext>` is issued where
$ext is the file's extension (example, "blah.lua" extension would be
"lua").
This allows for proper integration with syntax highlighting systems
such as "highlight.js" and "prism.js" where only the former was
tested. This allows syntax highlighting without having to have
language detection.
It's pretty cool you did this. I'm going to look over this and the
syntax highlighting tools that might work with it to see if I want to
start using it, though I guess I'd have to compile my own Fossil if I
wanted to do it sooner rather than later. Thanks for the work.
Post by Lester L. Martin II
Remaining work likely includes changing the CSS of all themes to take
this into account. Other things that might be considered is to
conditionally remove the `<blockquote>` part from all but the content
rendering. Another part would be rendering syntax highlights with line
numbers (currently unimplemented).
Have you checked it against any themes to see if it affects them at
all? If so, which themes did you find weren't problematic? I'm
guessing that if you found some that were you'd mention it.
Post by Lester L. Martin II
Below is the output of 'fossil diff' ran against checkout
Index: src/info.c
==================================================================
--- /home/masky/misc/fossil/src/info.c~0 2018-06-27 23:21:48.520779000
-0500
+++ /home/masky/misc/fossil/src/info.c 2018-06-27 16:48:28.683256000
-0500
@@ -670 +670 @@
-
+
@@ -2157 +2157,2 @@
- const char *z;
+ const char *z, *ext, *name;
+ char *tmp;
@@ -2158,0 +2160,3 @@
+ name = blob_str(&downloadName);
+
+ ext = (tmp = strrchr(name, '.')) == NULL ? "" : tmp+1;
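@@ -2162 +2166 @@
- @ <pre>
+ @ <pre><code class="%h(ext)">
@@ -2164 +2168 @@
- @ </pre>
+ @ </code></pre>
@@ -2232 +2236 @@
- ? db_text("(No title)",
+ ? db_text("(No title)",
!!END!!
I'm unsure why "@@ -2232 +2236 @@" section got included as no changes
were made to that.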
I haven't looked at the code in context; I've only seen the patch you
inlined in the email so far. It looks sane to me, though. I might come
back and look at it again in the morning, and check it against the
Fossil source code for context.
Post by Lester L. Martin II
I would like to submit a contributor agreement to make it official. This
change is so small that my previous post to the list serv considered,
one could have implemented it in less time than it would take for a
contributor agreement to finally be filed but as there was no reply
I didn't wait upon such.
Even if they want the contributor agreement on file before incorporating
it, I'm glad you sent it to the list for people who might want to start
using it sooner than the next Fossil release.
--
Chad Perrin [ original content licensed OWL: http://owl.apotheon.org ]
Lester L. Martin II
2018-06-28 10:46:44 UTC
Post by Chad Perrin
Post by Lester L. Martin II
This patch changes the way `void artifact_page(void)` renders a files
content. Formerly a `<blockquote><pre>` was issued for content,
whereas now a `<blockquote><pre><code class=$ext>` is issued where
$ext is the file's extension (example, "blah.lua" extension would be
"lua").
This allows for proper integration with syntax highlighting systems
such as "highlight.js" and "prism.js" where only the former was
tested. This allows syntax highlighting without having to have
language detection.
It's pretty cool you did this. I'm going to look over this and the
syntax highlighting tools that might work with it to see if I want to
start using it, though I guess I'd have to compile my own Fossil if I
wanted to do it sooner rather than later. Thanks for the work.
Glad you may find it useful. To get it working with highlightjs, just
edit a Skin/Theme you like and add the following lines:
<link rel="stylesheet" href="https://your_domain/agate.min.css">
<script src="https://your_domain/highlight.min.js"></script>
<script>
hljs.configure({languages: []})
hljs.initHighlightingOnLoad();
</script>

You'll need to download highlightjs and the corresponding CSS theme
for it that you like, of course, and either configure nginx, or your
webserver of choice, to serve a static directory on the same domain,
or use fossil's "--files" param to get fossil to serve those files.
Post by Chad Perrin
Post by Lester L. Martin II
Remaining work likely includes changing the CSS of all themes to take
this into account. Other things that might be considered is to
conditionally remove the `<blockquote>` part from all but the content
rendering. Another part would be rendering syntax highlights with line
numbers (currently unimplemented).
Have you checked it against any themes to see if it affects them at
all? If so, which themes did you find weren't problematic? I'm
guessing that if you found some that were you'd mention it.
The only themes it doesn't look good with, IMO, are Ardoise and
Bootstrap. Note that I didn't test all themes in a dark mode. The
following is a review of each theme.

Works:
Default, Blitz, Original, Enhanced Original, Black and White (menu on
the left), Plain Gray no Logo

Works with comment:
Blitz (looks decent)
Xekri (doubled border, looks decent still)
Shadow Boxes and Rounded Corners (doubled border, looks decent still)

Doesn't work:
Ardoise (my fav theme at the moment, doubled borders look horrid)
Bootstrap (not bad, but the doubled border isn't great either)

If a theme is left unmentioned it probably worked; the list of what
worked was growing far too long.
Post by Chad Perrin
Post by Lester L. Martin II
Below is the output of 'fossil diff' ran against checkout
Index: src/info.c
==================================================================
--- /home/masky/misc/fossil/src/info.c~0 2018-06-27 23:21:48.520779000
-0500
+++ /home/masky/misc/fossil/src/info.c 2018-06-27 16:48:28.683256000
-0500
@@ -670 +670 @@
-
+
@@ -2157 +2157,2 @@
- const char *z;
+ const char *z, *ext, *name;
+ char *tmp;
@@ -2158,0 +2160,3 @@
+ name = blob_str(&downloadName);
+
+ ext = (tmp = strrchr(name, '.')) == NULL ? "" : tmp+1;
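@@ -2162 +2166 @@
- @ <pre>
+ @ <pre><code class="%h(ext)">
@@ -2164 +2168 @@
- @ </pre>
+ @ </code></pre>
@@ -2232 +2236 @@
- ? db_text("(No title)",
+ ? db_text("(No title)",
!!END!!
I'm unsure why "@@ -2232 +2236 @@" section got included as no changes
were made to that.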
I haven't looked at the code in context; I've only seen the patch you
inlined in the email so far. It looks sane to me, though. I might come
back and look at it again in the morning, and check it against the
Fossil source code for context.
Post by Lester L. Martin II
I would like to submit a contributor agreement to make it official. This
change is so small that my previous post to the list serv considered,
one could have implemented it in less time than it would take for a
contributor agreement to finally be filed but as there was no reply
I didn't wait upon such.
Even if they want the contributor agreement on file before incorporating
it, I'm glad you sent it to the list for people who might want to start
using it sooner than the next Fossil release.
Thanks, really hope someone (or lots of people) find this a great,
easy enhancement.

--
Lester L. Martin II
Richard Hipp
2018-06-28 12:11:04 UTC
Post by Lester L. Martin II
This patch changes the way `void artifact_page(void)` renders a files
content.
Formerly a `<blockquote><pre>` was issued for content, whereas now a
`<blockquote><pre><code class=$ext>` is issued where $ext is the file's
extension (example, "blah.lua" extension would be "lua").
But then the syntax highlighting goes away if you select line numbering, no?
--
D. Richard Hipp
***@sqlite.org
Lester L. Martin II
2018-06-28 13:02:51 UTC
Post by Richard Hipp
Post by Lester L. Martin II
This patch changes the way `void artifact_page(void)` renders a files
content.
Formerly a `<blockquote><pre>` was issued for content, whereas now a
`<blockquote><pre><code class=$ext>` is issued where $ext is the file's
extension (example, "blah.lua" extension would be "lua").
But then the syntax highlighting goes away if you select line
numbering, no?
Indeed. The entire code dealing with adding in line numbering would need
reworking to enable it (and probably updates to CSS as well). I might be
able to look into getting that working as well. I actually think there
would be a simpler way than, IIRC, prefixing each line with spaces, the
number, and then more spaces.

--
Lester L. Martin II
Richard Hipp
2018-06-28 13:12:59 UTC
Post by Lester L. Martin II
Indeed. The entire code dealing with adding in line numbering would need
reworking to enable it (and probably updates to CSS as well). I might be
able to look into getting that working as well. I actually think there
would be a simpler way than, IIRC, prefixing each line with spaces, the
number, and then more spaces.
Excellent.

Please mail in your CLA when you get a chance.
--
D. Richard Hipp
***@sqlite.org
Lester L. Martin II
2018-06-28 15:40:19 UTC
Post by Richard Hipp
Please mail in your CLA when you get a chance.
Sending later today, hopefully regular stamped mail won't take 2 weeks.

I've developed a way to do highlighting with line numbering...
The issues are as follows however:
1. The JS code responsible for the highlighting is external to Fossil
itself.
2. The JS code highlights <pre><code> blocks, so if there is a
`<pre><code>$line</code><code>$line2</code></pre>` each code block is
highlighted individually.
3. Context is lost between code blocks.

Each line gets wrapped in a separate code block because you can then use
CSS to insert a line number, rather than manually inserting the line
number via code per line.
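
To illustrate why context is lost, here is a rough sketch (not the proof-of-concept code itself) of wrapping each line in its own `<code>` element. `wrapLinesInCode` and `escapeHtml` are hypothetical names, and the CSS counter rules in the comment are an assumption about how the numbers would be drawn.

```javascript
// Escape text so it can be placed inside HTML.
function escapeHtml(s) {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Wrap every source line in its own <code class="ext"> element.
// Assumed CSS for the numbers (not part of this function):
//   pre { counter-reset: line; }
//   pre code { display: block; counter-increment: line; }
//   pre code::before { content: counter(line) " "; }
function wrapLinesInCode(text, ext) {
  const cls = escapeHtml(ext);
  const lines = text.split("\n");
  // A trailing newline would otherwise produce an empty extra line.
  if (lines.length > 1 && lines[lines.length - 1] === "") {
    lines.pop();
  }
  const blocks = lines.map(
    (line) => '<code class="' + cls + '">' + escapeHtml(line) + "</code>"
  );
  return "<pre>" + blocks.join("") + "</pre>";
}
```

For the input "/* a\n   b */\nint x;", the second `<code>` element holds only "   b */", so a highlighter that treats each element independently cannot know that it is inside a comment.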

That said, if you have a comment block `/* comments */` broken up over a
few lines, it won't apply the comment coloring to the lines in between
when line numbering is enabled. Thus, I'd consider that while it half
works, it is inherently broken unless we decide to depend upon a certain
JS library to handle highlighting. Highlightjs is compatible (BSD2
licensed), but I can also see that it's probably unwanted to include yet
more JS. I'm not sure whether line numbering should be shifted to an "up
to the user" ordeal like syntax highlighting was. That, however, means
stripping away the "?ln" query capabilities. I'm unsure how this
situation would best be handled because, as it is, syntax highlighting
works, doesn't work with line numbering, and likely cannot work with
line numbering without bringing in something as a dependency and
depending on it to handle the line numbering for us.

All that said, what I did was more a proof of concept of syntax
highlighting with line numbering, and even the "?ln" query capabilities
went untested: when I first noticed the loss-of-context issue, I figured
it would be best to work out what trade-offs should be made to be able
to highlight while displaying numbers, or whether the trade-off is to
not highlight when numbers are displayed.

--
Lester L. Martin II
Chad Perrin
2018-06-28 16:07:53 UTC
Post by Lester L. Martin II
I've developed a way to do highlighting with line numbering...
1. The JS code responsible for the highlighting is external to Fossil
itself.
2. The JS code highlights <pre><code> blocks, so if there is a
`<pre><code>$line</code><code>$line2</code></pre>` each code block is
highlighted individually.
3. Context is lost between code blocks.
Each line gets wrapped in a separate code block because you can use
css to insert a line number vs manually inserting the line number via
code per line.
That said, if you have a comment block `/* comments */` broken up over
a few lines, it won't apply the comment coloring to the lines
in-between when line numbering is enabled. Thus, I'd consider that
while it half works, such is inherently broken unless we decide to
depend upon a certain JS library to handle highlighting.
I think a cleaner approach, though it might take a bit of rewriting for
file display, would use an HTML table, possibly assembled by JavaScript.
Note that I'm feeling a little dirty for suggesting this; it's something
that could also reasonably be done server-side, though it would probably
require using something server-side for the syntax highlighting, too.

See if this makes sense:

Apply syntax highlighting to a block of code. Organize the code in
second column TD elements of a table. Organize line numbers in first
column TD elements of a table.

Thus, you would have HTML for a line of code that looks something like
this (using a totally made-up function as example code):

<tr>
<td class="line-no">$num</td>
<td class="code-line">
<span class="color-type">uint16_t</span> <span class="color-label">get_next</span><span class="color-delim">() {</span>
</td>
</tr>

Of course, that's just off the top of my head. I don't actually know
how the HTML resulting from application of the syntax highlighting code
would look in this case. The upshot, though, is that applying syntax
highlighting to the code to be displayed first, then organizing it into
table cells with number cells, might avoid JS syntax highlighting issues
with line numbers, including any issues around multiline highlighting.
The downside is that it would result in either having to do all line
numbering in JS or applying syntax highlighting server-side.
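
A sketch of the "highlight first, then split into rows" idea in JavaScript. It assumes the highlighter's output consists only of text plus `<span ...>` and `</span>` tags, which is an assumption rather than a guarantee for any particular library. `splitHighlightedLines` and `toTableRows` are hypothetical names; the `line-no` and `code-line` classes come from the example above.

```javascript
// Split highlighted HTML at newlines. Spans still open at a newline are
// closed at the end of that line and reopened at the start of the next,
// so multi-line constructs such as block comments keep their styling.
function splitHighlightedLines(html) {
  const lines = [];
  const open = []; // stack of currently open <span ...> tags
  let current = "";
  const tokenRe = /<span\b[^>]*>|<\/span>|\n|[^<\n]+|</g;
  let m;
  while ((m = tokenRe.exec(html)) !== null) {
    const tok = m[0];
    if (tok === "\n") {
      lines.push(current + "</span>".repeat(open.length));
      current = open.join("");
    } else if (tok === "</span>") {
      open.pop();
      current += tok;
    } else if (tok.startsWith("<span")) {
      open.push(tok);
      current += tok;
    } else {
      current += tok; // plain text or a stray '<'
    }
  }
  // Emit the last line unless the input ended with a newline.
  if (current !== open.join("") || lines.length === 0) {
    lines.push(current + "</span>".repeat(open.length));
  }
  return lines;
}

// Put each line into a table row next to its line number.
function toTableRows(lines) {
  return lines
    .map(
      (line, i) =>
        '<tr><td class="line-no">' + (i + 1) +
        '</td><td class="code-line">' + line + "</td></tr>"
    )
    .join("\n");
}
```

Because the split happens after highlighting, a comment that spans two lines ends up as two rows that each carry the comment's span.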
--
Chad Perrin [ original content licensed OWL: http://owl.apotheon.org ]
Lester L. Martin II
2018-06-28 17:11:11 UTC
Post by Chad Perrin
Post by Lester L. Martin II
I've developed a way to do highlighting with line numbering...
1. The JS code responsible for the highlighting is external to Fossil
itself.
2. The JS code highlights <pre><code> blocks, so if there is a
`<pre><code>$line</code><code>$line2</code></pre>` each code block is
highlighted individually.
3. Context is lost between code blocks.
Each line gets wrapped in a separate code block because you can use
css to insert a line number vs manually inserting the line number via
code per line.
That said, if you have a comment block `/* comments */` broken up over
a few lines, it won't apply the comment coloring to the lines
in-between when line numbering is enabled. Thus, I'd consider that
while it half works, such is inherently broken unless we decide to
depend upon a certain JS library to handle highlighting.
I think a cleaner approach, though it might take a bit of rewriting for
file display, would use an HTML table, possibly assembled by JavaScript.
Note that I'm feeling a little dirty for suggesting this; it's something
that could also reasonably be done server-side, though it would probably
require using something server-side for the syntax highlighting, too.
Server-side syntax highlighting is an excellent idea, though I'm not
quite sure at this time how to implement it. The other issue is that
fossil would basically need to cache the results of running a highlight
for the lifetime of the program, up until something invalidates the
cache. Line numbering does/will need serious work to integrate with
syntax highlighting regardless of the approach.
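
One possible shape for such a cache, sketched in JavaScript purely for illustration (a real implementation inside Fossil would be C). It assumes entries are keyed by artifact hash plus language: since an artifact's content is fixed for a given hash, an entry would only go stale if the highlighter itself changed. `makeHighlightCache` and the `highlight` callback are hypothetical names.

```javascript
// Wrap a highlight(source, ext) function with an in-memory cache that
// lives as long as the process does.
function makeHighlightCache(highlight) {
  const cache = new Map();
  return function cachedHighlight(artifactHash, ext, source) {
    const key = artifactHash + ":" + ext;
    if (!cache.has(key)) {
      cache.set(key, highlight(source, ext));
    }
    return cache.get(key);
  };
}
```

Usage would look like `const render = makeHighlightCache(myHighlighter)` followed by `render(hash, "c", text)`; the second call with the same hash and extension returns the stored markup without running the highlighter again.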
Post by Chad Perrin
Apply syntax highlighting to a block of code. Organize the code in
second column TD elements of a table. Organize line numbers in first
column TD elements of a table.
Thus, you would have HTML for a line of code that looks something like
<tr>
<td class="line-no">$num</td>
<td class="code-line">
<span class="color-type">uint16_t</span> <span class="color-label">get_next</span><span class="color-delim">() {</span>
</td>
</tr>
Basically how GitHub and several other things implement it.
Post by Chad Perrin
Of course, that's just off the top of my head. I don't actually know
how the HTML resulting from application of the syntax highlighting code
would look in this case. The upshot, though, is that applying syntax
highlighting to the code to be displayed first, then organizing it into
table cells with number cells, might avoid JS syntax highlighting issues
with line numbers, including any issues around multiline highlighting.
The downside is that it would result in either having to do all line
numbering in JS or applying syntax highlighting server-side.
The issue with just applying highlights first is how line endings will
be tracked, since HTML elements are not necessarily rendered similarly
by all highlighting libraries. Detecting line endings in a generic way
after markup has been applied will be very difficult and likely library
specific. I keep using Prism.js as my go-to for illustration, but I would
bet that the differences between hljs and prism are enough that the JS
needed to (hopefully) detect marked-up line endings would differ between
them, and we get into a "supports $library" case vs. a generic case like
it has been so far without syntax highlights, and how it'd remain if we
didn't go forward with syntax highlighting when lines are numbered.

We would still end up depending on the "Line numbers" checkbox being
a call into JS to add those in for everything but the server-side
case. I'd rather not have to write JS to try to target 2 different
highlighting engines (or possibly more, depending upon what other
users prefer). That would also mean checking the JS code we write
against, say, the latest 3 versions of each highlighting engine in our
"support list". At that point it could be said that our holdups in
deploying a new version are tied up in making sure integration with
several external resources moves along properly. We'd also get into
a case of saying "supports up to $version_number of this library"
(and more of those statements for other libraries supported). I came
to the conclusion that it's a huge undertaking that would require
extensive long-term management, and believe that, at that point, it
might be best to "bless" a certain syntax highlighting library and
forgo anything else. If that library were included in fossil, we
wouldn't need to worry about possibly having to push a fix to allow
the newest version to work.

So far I see 4 "valid" options:
1. Move towards server side highlighting implementing a caching
mechanism.
2. Chase multiple versions of differing libraries and maintain our
own JS that either calls the library's line numbering function
or uses our own stuff to affix numbering after the other has been
done.
3. Bless a certain highlighting library and/or version of that library
with possible inclusion into fossil itself or a vivid notice that
only $version is supported at this point in time.
4. Relegate line numbering with syntax highlighting to a no go.

Of course, we could implement our own JS (or C based) highlighting code,
though I consider that a great deal of effort that even I wouldn't
feel comfortable undertaking with either of the mentioned languages.

--
Lester L. Martin II
Chad Perrin
2018-06-28 17:49:59 UTC
Post by Lester L. Martin II
Post by Chad Perrin
I think a cleaner approach, though it might take a bit of rewriting for
file display, would use an HTML table, possibly assembled by
JavaScript.
Note that I'm feeling a little dirty for suggesting this; it's something
that could also reasonably be done server-side, though it would probably
require using something server-side for the syntax highlighting, too.
Server side syntax highlighting is an excellent idea, though I'm not
quite sure at this time how to implement it. The other issue with such
being that fossil would basically need to cache the results of running
a highlight for the lifetime of the program up until something
invalidates the cache. Line numbering does/will need serious work to
integrate with syntax highlighting regardless of the approach.
Yeah, that'd probably be more work overall, and would likely reduce the
customization for syntax highlighting allowed to people deploying Fossil
repositories to the web. It would make how line numbering and syntax
highlighting integrate much more "deterministic", though, in that Fossil
devs would have a clearer view of everything that happens when trying to
account for it in updates to Fossil source. It's a trade-off, as with
most such decisions.
Post by Lester L. Martin II
Post by Chad Perrin
Apply syntax highlighting to a block of code. Organize the code in
second column TD elements of a table. Organize line numbers in
first column TD elements of a table.
Thus, you would have HTML for a line of code that looks something
<tr>
<td class="line-no">$num</td>
<td class="code-line">
<span class="color-type">uint16_t</span> <span class="color-label">get_next</span><span class="color-delim">() {</span>
</td>
</tr>
Basically how GitHub and several other things implement it.
I guess my intuition about how to handle it is in good company, for some
definition of "good".
Post by Lester L. Martin II
Post by Chad Perrin
Of course, that's just off the top of my head. I don't actually
know how the HTML resulting from application of the syntax
highlighting code would look in this case. The upshot, though, is
that applying syntax highlighting to the code to be displayed first,
then organizing it into table cells with number cells, might avoid
JS syntax highlighting issues with line numbers, including any
issues around multiline highlighting. The downside is that it would
result in either having to do all line numbering in JS or applying
syntax highlighting server-side.
The issue with just applying highlights first is how will line endings
be tracked since html elements need not necessarily be rendered
similarly by all highlighting libraries. Detecting line endings in a
generic way after markup has been applied will be very difficult and
likely library specific. I keep using Prism.js as my goto for
illustration but I would bet that the differences between hljs and
prism are enough that the JS needing to be written to (hopefully)
detect marked up line endings between them would be different and we
get into a "supports $library" case vs a generic case like it has been
so far without syntax highlights and how it'd remain if we didn't go
forward with syntax highlighting when lines are numbered.
If you mean that syntax highlighting libraries might insert literal
newlines into the file when marking it up for highlighting, that's
pretty awful, and could indeed screw up the whole exercise.
Post by Lester L. Martin II
We still would end up depending on the "Line numbers" checkbox being a
call into JS to add those in for everything but the server-side case.
I'd rather not have to write JS to try to target 2 different
highlighting engines (or possibly more dependent upon what other users
prefer). Then that means that we'd need to check the JS code written
against say... the latest 3 versions of each highlighting engine in
our "support list". At that point it could be said that our hold ups
in deploying a new version are tied up in making sure integration with
several external resources will move along properly. We'd also get
into a case of saying "supports up to $version_number of this library"
(and more of those statements for other libraries supported). At this
point I came to the conclusion it's a huge undertaking and would
require extensive long term management, and believe at that point, it
might be best to "bless" a certain syntax highlighting library and
forgo anything else. If that library was included in fossil, then
wouldn't need to worry about having to possibly push a fix to allow
the newest version to work.
This pretty much makes the detriments of a server-side approach that I
described earlier apply to the client-side approach, too. There are
other concerns that apply to the server-side, too, though, such as the
fact I suspect more rewriting of Fossil source would be required, though
I'm just guessing at this point. I'm beginning to think that the best
approach might be to ship a JS syntax highlighting library with Fossil,
or just bless a single library, and allow people deploying their own
repositories to the web to monkey with that at their own peril, in the
short term. Building in some server-side syntax highlighting with the
ability to ignore that and use client-side of one's own choosing (again
at one's own peril) might be the "correct" long-term approach for how to
handle syntax highlighting.
Post by Lester L. Martin II
1. Move towards server side highlighting implementing a caching
mechanism.
This seems like something that should be done eventually, while making
some intermediate approach available in the meantime with no guarantees
of future compatibility -- an optional, experimental, biohazard-warning
approach just to fill in the gap until the server side is available.
Post by Lester L. Martin II
2. Chase multiple versions of differing libraries and maintain our
own JS that either calls the library's line numbering function
or uses our own stuff to affix numbering after the other has been
done.
Sorry, I'm not sure what you're saying here. If you're saying that
syntax highlighting libraries have their own line numbering
functionality, it might make sense to just defer to that in cases where
syntax highlighting is used, and thus obviate most of this discussion.
Post by Lester L. Martin II
3. Bless a certain highlighting library and/or version of that library
with possible inclusion into fossil itself or a vivid notice that
only $version is supported at this point in time.
That seems like the reasonable short-term solution, to me, but probably
not as an intended long-term official solution. There are reasons to
favor server-side functionality for these things "eventually", and avoid
pushing all this off to the end user. If syntax highlighting is
considered a nonessential option, though, a simpler solution would be to
just make some minor server-side changes to allow people deploying repos
to the web to do the work of experimenting and integrating as they feel
inclined to do so.

A solution for line-numbering in the case of "well, we use the pre-code
tag convention, and the rest is up to you" might be to just use JS to
apply a specially-styled ordered list to the entire block of code after
any hooked-in JS syntax highlighting code and call it done. When you
customize, you get what you get. Right?
Post by Lester L. Martin II
4. Relegate line numbering with syntax highlighting to a no go.
That's definitely a short-term hack kind of "solution", and probably not
something that should be an official implementation decision five years
down the road.
Post by Lester L. Martin II
Of course we could implement our own JS (or C based) highlighting code
though I consider that a grand deal of effort that even I wouldn't
feel comfortable undertaking with either of the mentioned languages.
I'm not too keen on that idea, either.
--
Chad Perrin [ original content licensed OWL: http://owl.apotheon.org ]
Sam Putman
2018-06-28 18:40:01 UTC
Post by Chad Perrin
Apply syntax highlighting to a block of code. Organize the code in
second column TD elements of a table. Organize line numbers in first
column TD elements of a table.
Thus, you would have HTML for a line of code that looks something like
<tr>
<td class="line-no">$num</td>
<td class="code-line">
<span class="color-type">uint16_t</span> <span class="color-label">get_next</span><span class="color-delim">() {</span>
</td>
</tr>
Of course, that's just off the top of my head. I don't actually know
how the HTML resulting from application of the syntax highlighting code
would look in this case. The upshot, though, is that applying syntax
highlighting to the code to be displayed first, then organizing it into
table cells with number cells, might avoid JS syntax highlighting issues
with line numbers, including any issues around multiline highlighting.
The downside is that it would result in either having to do all line
numbering in JS or applying syntax highlighting server-side.
As a target, I would suggest the emitted html look as much like this as
possible:

view-source:https://github.com/jvirkki/libbloom/blob/master/bloom.c

The actual code block begins at line 821.

This style of markup is a de facto standard and leads to a linking style
that would greatly aid migration from git if fossil could adhere to it.
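
The linking style meant here is presumably the `#L<n>` and `#L<a>-L<b>` URL fragments used for line links on GitHub. A small sketch, under that assumption, of how such fragments could be parsed and how matching element ids could be generated; `parseLineFragment` and `lineCellId` are hypothetical names.

```javascript
// Parse "#L10" or "#L10-L20" into an inclusive line range.
// Returns null when the fragment does not follow that convention.
function parseLineFragment(fragment) {
  const m = /^#?L(\d+)(?:-L(\d+))?$/.exec(fragment);
  if (!m) {
    return null;
  }
  const first = Number(m[1]);
  const second = m[2] === undefined ? first : Number(m[2]);
  // Normalize reversed ranges such as "#L20-L10".
  return first <= second
    ? { start: first, end: second }
    : { start: second, end: first };
}

// Id to put on the element for line n so that "#L<n>" scrolls to it.
function lineCellId(n) {
  return "L" + n;
}
```

Emitting `id="L<n>"` on each line-number cell would let links written in that style resolve against a page rendered this way.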
Chad Perrin
2018-06-28 19:14:53 UTC
Post by Sam Putman
As a target, I would suggest the emitted html look as much like this
view-source:https://github.com/jvirkki/libbloom/blob/master/bloom.c
The actual code block begins at line 821.
This style of markup is a de-facto standard and leads to a linking
style that would greatly aid migration from git if fossil could adhere
to it.
My example was nothing but off the top of my head equivalent to
pseudocode (except I think the code was all valid HTML around valid C).
Only the class names change between my version and this version, apart
from some extra details like data-line-number and id properties, in any
case. That means I was evidently thinking identically (in principle) to
the thoughts of whoever wrote the code that produced your example.

I'm not sure how this has any effect on migration from git to fossil,
though. Git export and Fossil import wouldn't touch this code. Are you
talking about some kind of external tools being able to interact with
this code in the browser? If so, the classes involved probably come
from whatever JS library is used for syntax highlighting anyway, rather
than from something like code internal to Fossil (unless syntax
highlighting gets implemented in C as part of Fossil).

I guess the upshot is that I'm not sure what you mean, and all I've been
able to do so far is guess.
--
Chad Perrin [ original content licensed OWL: http://owl.apotheon.org ]
Lester L. Martin II
2018-06-28 23:53:10 UTC
Permalink
Post by Chad Perrin
Post by Lester L. Martin II
Server side syntax highlighting is an excellent idea, though I'm not
quite sure at this time how to implement it. The other issue with such
being that fossil would basically need to cache the results of running
a highlight for the lifetime of the program up until something
invalidates the cache. Line numbering does/will need serious work to
integrate with syntax highlighting regardless of the approach.
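To make the caching idea concrete, here is a minimal sketch in JS (purely for illustration; inside Fossil this would be C). It assumes, hypothetically, that highlighted output is keyed by the artifact's content hash plus the language, so an entry for a given artifact never goes stale and the cache only needs clearing when the highlighter itself changes. The class name `HighlightCache` and its methods are made up for this example.

```javascript
// Hypothetical cache of highlighted output, keyed by artifact hash + language.
// highlightFn(source, lang) stands in for whatever highlighter is chosen.
class HighlightCache {
  constructor(highlightFn) {
    this.highlightFn = highlightFn;
    this.entries = new Map();
  }

  // Return cached markup, running the highlighter only on a cache miss.
  get(artifactHash, lang, source) {
    const key = `${artifactHash}:${lang}`;
    if (!this.entries.has(key)) {
      this.entries.set(key, this.highlightFn(source, lang));
    }
    return this.entries.get(key);
  }

  // Drop everything, e.g. after the highlighter or its settings change.
  clear() {
    this.entries.clear();
  }
}
```

Since an artifact's content never changes for a given hash, the only invalidation event in this sketch is a change to the highlighter itself.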
Yeah, that'd probably be more work overall, and would likely reduce the
customization for syntax highlighting allowed to people deploying Fossil
repositories to the web. It would make how line numbering and syntax
highlighting integrate much more "deterministic", though, in that Fossil
devs would have a clearer view of everything that happens when trying to
account for it in updates to Fossil source. It's a trade-off, as with
most such decisions.
I'm not sure syntax highlighting is Fossil's task, though integrating
easily with other things that do syntax highlighting sounds like it is
something of benefit to Fossil. That said, I would not want to be
responsible for writing syntax parsers in C so as to generate pretty
content. It might be horrible to offload this to the client via JS,
but that might actually be the best solution just because it keeps
Fossil flexible.
Post by Chad Perrin
Post by Lester L. Martin II
Post by Chad Perrin
Thus, you would have HTML for a line of code that looks something like
this:
<tr>
<td class="line-no">$num</td>
<td class="code-line">
<span class="color-type">uint16_t</span> <span class="color-label">get_next</span><span class="color-delim">() {</span>
</td>
</tr>
Basically how GitHub and several other things implement it.
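As a rough sketch of how rows like that could be assembled in JS, assuming the code has already been split into one highlighted HTML fragment per line: the `line-no` and `code-line` class names come from the example above, while the `id="L<n>"` attribute and the `numbered-code` table class are assumptions added for illustration.

```javascript
// Build a table with one row per source line.
// lineHtml: array of already-highlighted, already-escaped HTML fragments.
function buildNumberedTable(lineHtml) {
  const rows = lineHtml.map((html, index) => {
    const n = index + 1; // line numbers are 1-based
    return `<tr id="L${n}"><td class="line-no">${n}</td>` +
           `<td class="code-line">${html}</td></tr>`;
  });
  return `<table class="numbered-code">\n${rows.join("\n")}\n</table>`;
}
```

Because each line lives in its own row, a line can be selected or scrolled to by its row id without caring which highlighter produced the cell contents.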
I guess my intuition about how to handle it is in good company, for some
definition of "good".
The way GitHub does it is fine. However, their markup likely predates
anything as capable as CSS line counters. Either that, or they tried
such an approach and found an issue that I can't think of at the
moment. I'll address this more a bit further down.
Post by Chad Perrin
Post by Lester L. Martin II
The issue with just applying highlights first is how line endings will
be tracked, since highlighting libraries need not all emit the same
HTML elements. Detecting line endings in a generic way after markup has
been applied will be very difficult and likely library specific. I keep
using Prism.js as my go-to for illustration, but I would bet that the
differences between hljs and prism are enough that the JS needed to
(hopefully) detect marked-up line endings would be different for each.
That gets us into a "supports $library" case rather than the generic
case we have had so far without syntax highlights, and would keep if we
didn't go forward with syntax highlighting when lines are numbered.
If you mean that syntax highlighting libraries might insert literal
newlines into the file when marking it up for highlighting, that's
pretty awful, and could indeed screw up the whole exercise.
I mean that a syntax highlighting library can do it however it likes.
While I'd think most wouldn't insert a literal newline, I don't think I
could plausibly count on `<br>` as a consistent marker that guarantees
correct numbering. The other problem is a syntax highlighter that fails
halfway through but doesn't undo its work, leaving things partially
highlighted; then you're in for some confusion in the JS you write
yourself. This might not be common, but it is quite possible.
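For what it's worth, here is a minimal sketch of one way to recover line boundaries from highlighted markup. It only works under two assumptions that a given library may not honor: the highlighter leaves literal newlines in place, and it wraps text only in `<span>` elements. Any `<span>` still open at a newline is closed and then re-opened on the next line, so multi-line comments and strings stay highlighted. The function name is made up for this example.

```javascript
// Split highlighted HTML into per-line fragments with balanced <span> tags.
function splitHighlightedLines(html) {
  const lines = [];
  const openTags = []; // opening <span ...> tags currently in effect
  let current = "";
  // Tokens: opening span, closing span, newline, plain text, or a stray "<".
  const tokenRe = /<span\b[^>]*>|<\/span>|\n|[^<\n]+|</g;
  for (const [token] of html.matchAll(tokenRe)) {
    if (token === "\n") {
      // Close whatever is open, emit the line, then re-open for the next one.
      current += "</span>".repeat(openTags.length);
      lines.push(current);
      current = openTags.join("");
    } else {
      if (token.startsWith("<span")) openTags.push(token);
      else if (token === "</span>") openTags.pop();
      current += token;
    }
  }
  // Emit the final line (this is an empty fragment if the input ends in "\n").
  current += "</span>".repeat(openTags.length);
  lines.push(current);
  return lines;
}
```

A highlighter that emits other elements, or that replaces newlines with its own markup, would need different handling, which is exactly the per-library problem described above.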
Post by Chad Perrin
Post by Lester L. Martin II
We still would end up depending on the "Line numbers" checkbox being a
call into JS to add those in for everything but the server-side case.
I'd rather not have to write JS to try to target 2 different
highlighting engines (or possibly more dependent upon what other users
prefer). Then that means that we'd need to check the JS code written
against say... the latest 3 versions of each highlighting engine in
our "support list". At that point it could be said that our hold ups
in deploying a new version are tied up in making sure integration with
several external resources will move along properly. We'd also get
into a case of saying "supports up to $version_number of this library"
(and more of those statements for other libraries supported). At this
point I came to the conclusion it's a huge undertaking and would
require extensive long term management, and believe at that point, it
might be best to "bless" a certain syntax highlighting library and
forgo anything else. If that library was included in fossil, then
wouldn't need to worry about having to possibly push a fix to allow
the newest version to work.
This pretty much makes the detriments of a server-side approach that I
described earlier apply to the client-side approach, too. There are
other concerns that apply to the server-side, too, though, such as the
fact I suspect more rewriting of Fossil source would be required, though
I'm just guessing at this point. I'm beginning to think that the best
approach might be to ship a JS syntax highlighting library with Fossil,
or just bless a single library, and allow people deploying their own
repositories to the web to monkey with that at their own peril, in the
short term. Building in some server-side syntax highlighting with the
ability to ignore that and use client-side of one's own choosing (again
at one's own peril) might be the "correct" long-term approach for how to
handle syntax highlighting.
I'm of the opinion of blessing not only a single library, but a single
version of that library per Fossil release. Point blank, I do not see a
way to get line numbering and syntax highlighting to work without
depending on either a server-side or client-side external solution, and
I believe producing code to cover each kind, plus all the error
conditions that syntax highlighting can cause, is outside the scope of
Fossil. Writing our own syntax highlighter might be a great long-term
approach, but I also don't think anyone wants to duplicate, in C, the
work of the likes of hljs, prism, or (for server-side examples) pygments
and hugo's chroma, nor to create something akin to hugo's solution in
that it depends on yet another project's syntax definition files.
Post by Chad Perrin
Post by Lester L. Martin II
1. Move towards server side highlighting implementing a caching
mechanism.
This seems like something that should be done eventually, while making
some intermediate approach available in the meantime with no guarantees
of future compatibility -- an optional, experimental, biohazard-warning
approach just to fill in the gap until the server side is available.
Until someone wishes to create the server-side solution, whether it
depends on an external tool or is built into Fossil, I'd argue that the
intermediate approach should be considered "good enough" (with some work,
and sadly with support for only one library and one version of that
library). It need not be considered experimental so long as one follows
a guide that someone (likely myself) would create on setting this up.
Post by Chad Perrin
Post by Lester L. Martin II
2. Chase multiple versions of differing libraries and maintain our
own JS that either calls the library's line numbering function
or uses our own stuff to affix numbering after the other has been
done.
Sorry, I'm not sure what you're saying here. If you're saying that
syntax highlighting libraries have their own line numbering
functionality, it might make sense to just defer to that in cases where
syntax highlighting is used, and thus obviate most of this discussion.
Some syntax highlighting libraries have their own ways of adding
line numbering, yes (prism). Those that don't tend to have solutions
provided by others to get line numbering. By "chase" I meant that if we
support more than one library, and more than one version of each
library, we would be steadily chasing compatibility with such a wide
range of "stuff" that it'd be a complete mess, and would likely hold up
the project at release time while we check against everything the
community wants.
Post by Chad Perrin
Post by Lester L. Martin II
3. Bless a certain highlighting library and/or version of that library
with possible inclusion into fossil itself or a vivid notice that
only $version is supported at this point in time.
That seems like the reasonable short-term solution, to me, but probably
not as an intended long-term official solution. There are reasons to
favor server-side functionality for these things "eventually", and avoid
pushing all this off to the end user. If syntax highlighting is
considered a nonessential option, though, a simpler solution would be to
just make some minor server-side changes to allow people deploying repos
to the web to do the work of experimenting and integrating as they feel
inclined to do so.
A solution for line-numbering in the case of "well, we use the pre-code
tag convention, and the rest is up to you" might be to just use JS to
apply a specially-styled ordered list to the entire block of code after
any hooked-in JS syntax highlighting code and call it done. When you
customize, you get what you get. Right?
Unless someone really wants to write our own syntax highlighting library,
be it in C or JS, 3 might need to suffice for the long-term solution.
The `<pre><code>` tag convention, probably without introducing a `<code>`
tag per source line, would work if a JS library supports highlighting
with line numbers (natively or via other means). It would, though, likely
leave us without "?ln" query capabilities unless we go fully forward
with the "bless a certain highlighting library" part.
Post by Chad Perrin
Post by Lester L. Martin II
4. Relegate line numbering with syntax highlighting to a no go.
That's definitely a short-term hack kind of "solution", and probably not
something that should be an official implementation decision five years
down the road.
Actually, if we don't want to introduce a "for syntax highlighting,
you may only use this library and expect it to work, and it must be this
version" rule, and we can't live with *broken* highlighting when lines
are numbered, and we wish to keep "?ln" queries (though keeping them
might still be possible even with the explicit external dependency when
highlighting is enabled), then 4 is our only option.
Post by Chad Perrin
As a target, I would suggest the emitted html look as much like this as
possible:
view-source:https://github.com/jvirkki/libbloom/blob/master/bloom.c
The actual code block begins at line 821.
In ref to your mentioning line 821, I don't happen to see said line
in the linked file.
Post by Chad Perrin
This style of markup is a de-facto standard and leads to a linking
style that would greatly aid migration from git if fossil could adhere
to it.
GitHub isn't "the standard", they're just predominant. Anything we do
could borrow ideas from them, but should not necessarily lean towards
a complete re-implementation of what they do. If anything we should
start on a solid list of syntax highlighting features needed, and build
from there. I'd always say do better than what people view as the
predominant way of doing things, except perhaps where RFCs are
concerned: adhere strictly to RFCs.

As to linking of lines of code within a project, that's yet another
feature and doesn't necessarily have anything to do with syntax
highlighting either (though it would perhaps cause some concern
with line numbering and syntax highlighting too).
Post by Chad Perrin
Post by Lester L. Martin II
As a target, I would suggest the emitted html look as much like this as
possible:
view-source:https://github.com/jvirkki/libbloom/blob/master/bloom.c
The actual code block begins at line 821.
This style of markup is a de-facto standard and leads to a linking
style that would greatly aid migration from git if fossil could adhere
to it.
My example was nothing but off the top of my head equivalent to
pseudocode (except I think the code was all valid HTML around valid C).
Only the class names change between my version and this version, apart
from some extra details like data-line-number and id properties, in any
case. That means I was evidently thinking identically (in principle) to
the thoughts of whoever wrote the code that produced your example.
I'm not sure how this has any effect on migration from git to fossil,
though. Git export and Fossil import wouldn't touch this code. Are you
talking about some kind of external tools being able to interact with
this code in the browser? If so, the classes involved probably come
from whatever JS library is used for syntax highlighting anyway, rather
than from something like code internal to Fossil (unless syntax
highlighting gets implemented in C as part of Fossil).
I guess the upshot is that I'm not sure what you mean, and all I've been
able to do so far is guess.
I believe he meant a migration from GitHub (seemingly the de facto
synonym for git itself). Git itself has no standard way of displaying
code, and interoperability with GitHub as to class definitions and such
would only help migrations *if* GitHub allows exporting the wiki pages
you create over there in a format usable by Fossil, and would only come
into play *if* your wiki links reference certain lines of code in your
codebase.

Alright, this has become quite a deep conversation.

At this time, I believe there is no sure way to go forward with
syntax highlighting in relation to line numbers. I believe that syntax
highlighting with line numbers should be tabled until Fossil's creator
chimes in, either to field an idea of his own or to say that one of
the ideas we have generated is compatible with the direction of the
project and is the way to attempt.

My proposal for such is to depend on external JS and keep this
out of the realm of Fossil and to get Fossil to where it supports
this way of doing it. I think I'd rather no syntax highlighting
for line numbers than broken syntax highlighting. With this proposal
we *will* need to reimagine how "?ln" queries will work on artifact
content pages. At the moment it doesn't matter if we render the code
as a table, or as a `<pre>` with multiple `<code>` blocks or what,
what matters is a decision on how to do syntax highlighting whilst
line numbering and not be broken. Let's find which way is going
to be okay going forward to work on that problem then worry
about what html will actually be generated.

This is starting to become a thread about
src/info.c's `artifact_page` function and what should be done going
forward to improve it in even greater aspects.

I would propose at this point we do the following:

First, consider syntax highlighting tackled. The other cases are
cases where having it would be nice, but are not absolutely
necessary and are not easily implemented in a non-broken manner
or only implementable via a complete replacement of the code dealing
with line numbering (possibly moving it out to JS) and a
re-implementation of how "?ln" queries are handled.

Second, get together a list of further features we'd want
syntax highlighting capability tacked on to and discuss each feature
individually and come to a consensus that is agreeable with the
project's direction. I could include in this list off the top of my
head: syntax highlighting for line numberings, syntax highlighting
for diff viewing. Both of those require such a rework of how things
are done, that we can either choose to depend on one JS library,
or need to figure out what the project's direction would support us
doing.

Third, start working on the features in relation to syntax highlighting
after having found a solution compatible with the project's direction.

Wash, rinse, repeat, up until we can stop talking about syntax
highlighting because at that point it's done and not only done,
but well done and superior in context of capability and ability
within reference to the scope of the Fossil project.

Finally, if JS is going to be responsible for line numbering anyway,
perhaps line numbering should be on by default, and our only
consideration should be how to get "?ln" queries working again, and
probably diff viewing as well.

--
Lester L. Martin II
Chad Perrin
2018-06-29 16:24:32 UTC
Permalink
Okay, after all that, I feel like distilling this down to its essence
(according to my own opinions, naturally) might be in order.

I feel like we basically have three sane options available:

1. Make some very minor changes to the Fossil source, where it generates
pretty viewable web pages, to make it much easier to retrofit syntax
highlighting via JS libraries for those users who want it. Get someone
to write up a currently-effective guide to getting it set up, but make
it a sort of unofficial, community guide. Do not officially support
syntax highlighting at all. Do not bother screwing around with anything
making line numbering play well with JS syntax highlighting unless and
until someone presents a patch that fits with this philosophy of not
supporting syntax highlighting but enabling it when easy to do so.

2. Pick a single JS syntax highlighting library (highlight.js) to bless.
Include a guide in official docs for setting it up in deployment.
Specify a supported version range for each Fossil release. Unless line
numbering is found to be easy to work in, just write it off and
officially declare that line numbering and syntax highlighting do not
play well together, but keep that on the radar for figuring out later if
possible. Call this "officially tested, but not officially supported".

3. Ship that library with Fossil. There's no need for identifying a
supported range: either you use what ships with it or you're on your
own, and we don't care any longer. I think taking this approach without
resolving the line numbering problem has some issues for purposes of
perception of the project, though, so I think one of the following two
things should happen here: either call it experimental with firm plans
to resolve the line numbering issue before calling it a release feature,
or don't do this at all.

While using an approach similar to GitHub's for purposes of easing
transition from GitHub to self-hosted Fossil would be nice, if it's too
much work to do so it shouldn't stand in the way of getting a good
solution for Fossil. This feels like one of those "perfect is the enemy
of good enough" situations, for a case that is only "perfect" with
regard to ensuring people are slightly more inclined to switch from
GitHub to self-hosted Fossil. In fact, considering there's probably
nobody else providing that kind of fine-grained display characteristics
similarity with GitHub, this doesn't feel like a critical issue at all.

Most people probably just wouldn't even expect it to be that similar,
I'd think. Follow RFCs carefully, provide similarity of implementation
to GitHub for convenience if it's not too much trouble, and move on.

YMMV.
--
Chad Perrin [ original content licensed OWL: http://owl.apotheon.org ]
Sam Putman
2018-06-29 17:42:07 UTC
Permalink
Post by Chad Perrin
Post by Sam Putman
As a target, I would suggest the emitted html look as much like this as
possible:
view-source:https://github.com/jvirkki/libbloom/blob/master/bloom.c
The actual code block begins at line 821.
This style of markup is a de-facto standard and leads to a linking
style that would greatly aid migration from git if fossil could adhere
to it.
I'm not sure how this has any effect on migration from git to fossil,
though. Git export and Fossil import wouldn't touch this code. Are you
talking about some kind of external tools being able to interact with
this code in the browser? If so, the classes involved probably come
from whatever JS library is used for syntax highlighting anyway, rather
than from something like code internal to Fossil (unless syntax
highlighting gets implemented in C as part of Fossil).
I guess the upshot is that I'm not sure what you mean, and all I've been
able to do so far is guess.
It's a related but distinct feature, the ability to render links like this
one:

https://github.com/jvirkki/libbloom/blob/master/bloom.c#L57-L60

Github, Gitlab, and Gogs will all correctly render that link, and various
short/relative links of the same form.

This is a good convention for making URIs for branches, files, lines, and
the like. These URIs get embedded into documentation and tickets, anywhere
you might want a hyperlink in your rendered code. The schema would work as
well for fossil as it does for git.

Those can't be effectively migrated to fossil, which will display the
content hash of the file being rendered as the URI.
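To illustrate the fragment convention used in the link above, here is a small sketch of parsing `#L57-L60` style fragments in JS. The function name and the returned `{ start, end }` shape are assumptions made for this example; only the `#L<n>` and `#L<m>-L<n>` forms seen in the example link are handled.

```javascript
// Parse "#L57" or "#L57-L60" into an inclusive 1-based line range.
// Returns null for anything that does not match that shape.
function parseLineFragment(hash) {
  const m = /^#?L(\d+)(?:-L(\d+))?$/.exec(hash);
  if (!m) return null;
  const start = Number(m[1]);
  const end = m[2] === undefined ? start : Number(m[2]);
  if (start < 1 || end < start) return null; // reject zero and reversed ranges
  return { start, end };
}
```

In a browser this would typically be fed `location.hash`, with the result used to select and scroll to the matching lines.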

As for the HTML schema for marking up code, it's also a de-facto standard,
originating with pygments, if I recall correctly, and used, with some
variation, by all the major syntax highlighters.

If the other proposal is just whatever highlight.js emits, I'm sure we'll
find that they are somewhere between similar and identical.
Lester L. Martin II
2018-06-30 18:47:49 UTC
Permalink
Post by Chad Perrin
Okay, after all that, I feel like distilling this down to its essence
(according to my own opinions, naturally) might be in order.
1. Make some very minor changes to the Fossil source, where it generates
pretty viewable web pages, to make it much easier to retrofit syntax
highlighting via JS libraries for those users who want it. Get someone
to write up a currently-effective guide to getting it set up, but make
it a sort of unofficial, community guide. Do not officially support
syntax highlighting at all. Do not bother screwing around with anything
making line numbering play well with JS syntax highlighting unless and
until someone presents a patch that fits with this philosophy of not
supporting syntax highlighting but enabling it when easy to do so.
A patch is in the works. It is a matter of getting the JS code up to
feature parity with the C code, since the JS will now be responsible for
content selection and highlighting of said selections, along with
scrolling content to the appropriate line. Scrolling and highlighting
one line work as of the moment; I will continue until multiple-line and
multiple-selection highlighting work. This will bring some changes and
make syntax highlighting support explicit. I'll address the current
ideas later in the message. It does not, however, fit with "not
supporting syntax highlighting but enabling it when easy to do so", as
we'll see later.
Post by Chad Perrin
2. Pick a single JS syntax highlighting library (highlight.js) to bless.
Include a guide in official docs for setting it up in deployment.
Specify a supported version range for each Fossil release. Unless line
numbering is found to be easy to work in, just write it off and
officially declare that line numbering and syntax highlighting do not
play well together, but keep that on the radar for figuring out later if
possible. Call this "officially tested, but not officially supported".
What I'm working on should retain the flexibility of Fossil to support
things other than hljs.
Post by Chad Perrin
3. Ship that library with Fossil. There's no need for identifying a
supported range: either you use what ships with it or you're on your
own, and we don't care any longer. I think taking this approach without
resolving the line numbering problem has some issues for purposes of
perception of the project, though, so I think one of the following two
things should happen here: either call it experimental with firm plans
to resolve the line numbering issue before calling it a release feature,
or don't do this at all.
Assuming all goes well, one could take what I'm doing when I release the
patch and associated files, and pull the extra files in to make it
official. With no line numbering problem, there would be no perception
issue.
Post by Chad Perrin
While using an approach similar to GitHub's for purposes of easing
transition from GitHub to self-hosted Fossil would be nice, if it's too
much work to do so it shouldn't stand in the way of getting a good
solution for Fossil. This feels like one of those "perfect is the enemy
of good enough" situations, for a case that is only "perfect" with
regard to ensuring people are slightly more inclined to switch from
GitHub to self-hosted Fossil. In fact, considering there's probably
nobody else providing that kind of fine-grained display characteristics
similarity with GitHub, this doesn't feel like a critical issue at all.
GitHub (and other code hosting solutions) similarity will be addressed
further down.
Post by Chad Perrin
It's a related but distinct feature, the ability to render links like
this one:
https://github.com/jvirkki/libbloom/blob/master/bloom.c#L57-L60
Github, Gitlab, and Gogs will all correctly render that link, and
various short/relative links of the same form.
This is a good convention for making URIs for branches, files, lines,
and the like. These URIs get embedded into documentation and tickets,
anywhere you might want a hyperlink in your rendered code. The schema
would work as well for fossil as it does for git.
Those can't be effectively migrated to fossil, which will display the
content hash of the file being rendered as the URI.
I'm not going to work on the URI path parts feature you're intending
to describe here. That's an entirely separate feature from syntax
highlighting and would probably require a lot more work throughout
the entirety of the codebase than anything else I could imagine
supporting or building in feature wise. It's very much outside the
scope of "let's make syntax highlighting work". It is, however, not
a bad idea.

That said, I do have to deal with 2 parts of the URI, the "&ln="
and "#" parts. We'll address this further down as well.
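As a sketch of what handling the "ln" part in JS might look like: I am assuming here, without having checked it against the C code, that the value is either a single line number `N`, a range `M-N`, or several of those joined by `+` (which arrives as a space once the URL is decoded). The function name and the returned shape are made up for the example.

```javascript
// Parse an "ln" query value such as "10", "10-20" or "3-4+8-9" into a list
// of inclusive 1-based ranges. The accepted syntax is an assumption.
function parseLnValue(value) {
  const ranges = [];
  for (const part of value.split(/[+ ]/)) {
    const m = /^(\d+)(?:-(\d+))?$/.exec(part);
    if (!m) continue; // silently skip pieces that are not N or M-N
    const start = Number(m[1]);
    const end = m[2] === undefined ? start : Number(m[2]);
    if (start >= 1 && end >= start) ranges.push({ start, end });
  }
  return ranges;
}
```

An empty result would simply mean "show line numbers, highlight nothing", which matches a bare "ln" with no value.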
Post by Chad Perrin
As for the HTML schema for marking up code, it's also a de-facto
standard, originating with pygments, if I recall correctly, and used,
with some variation, by all the major syntax highlighters.
If the other proposal is just whatever highlight.js emits, I'm sure
we'll find that they are somewhere between similar and identical.
I guess you'll be happy to know that, since hljs's line numbering
extension follows this "standard", what I submit will oddly enough
follow it too. I'm not sure this "standard" is the best solution, but at
the moment what I care about is that it works.

----

I'm working on a solution that involves both the C part of the code base
and some JS resources. When I finish I'll submit a patch. It should keep
all current features working. The JS resources will be made available
but not included in the patch (or the Fossil repo) unless specifically
requested, since including them means a few more things would need to be
done and there's no guarantee that everyone would want these files
included by default. The only reason to include the files is if there is
an easy way to enable this, versus one doing some configuration of their
own repo. I would say adding an easy option would be grand.

The issue with only doing things like GitHub does with the hash part at
the end of the URI is that doing so would break links within Fossil repos
on wiki pages, tickets and so forth, that use the current linking format.
So, supporting both is grand, and the right thing to do.

The reason the patch I'm working on would introduce this as a supported
feature is because I intend to create a new Fossil setting or settings.
I'm still working out how this will work due to trying to ascertain the
best way to move the resource loading and calling out into the theme,
versus being built into the code, to allow for continued flexibility.

The settings needed:
1. syntax-hl (Is syntax highlighting enabled)
2. syntax-hl-call (what javascript function to insert a call for when
doing line numbering)
3. syntax-hl-easy (Do the hard work for you?)

I'm unsure if they should be combined into a single setting. If so I'd
go for "syntax-hl-call" and the easy setting would be a JS included
with Fossil that would be "fossil_easy_syntax_hl" or similar.
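To show how these settings might fit together, here is a hypothetical sketch that resolves which JS function name to call. The setting names and `fossil_easy_syntax_hl` are the ones proposed above; the settings-object shape, the function name, and the rule that only plain identifiers are accepted are my own assumptions for illustration.

```javascript
// Decide which highlighting entry point to call, or null for "none".
// settings: plain object keyed by the proposed setting names.
function resolveHighlightCall(settings) {
  if (!settings["syntax-hl"]) return null; // highlighting disabled
  if (settings["syntax-hl-easy"]) return "fossil_easy_syntax_hl";
  const name = settings["syntax-hl-call"] || "";
  // Accept only a plain identifier so the name can be emitted into a page.
  return /^[A-Za-z_$][\w$]*$/.test(name) ? name : null;
}
```

With a single combined setting, the first two checks would collapse into "is `syntax-hl-call` set, and is it the built-in name".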

If I'm thinking correctly, those 1 or 2 settings will allow for Fossil
to retain its flexibility and support syntax highlighting overall
whilst remaining agnostic to how one enables their syntax highlighting.

This also means that all the old line numbering, and content rendering
will continue to work for those who do not wish to have such, whilst
making it possible, either simply, or with a bit of work on the user's
part, to add in syntax highlighting that works, and works well.

I've still yet to clean all this up and make it usable from a general
standpoint. After this coming patch, which gets syntax highlighting and
line numbering working, the next thing will be to get diffs working,
if this doesn't already fix it for them too.

--
Lester L. Martin II
