Find project on GitHub

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Aenean suscipit magna sit amet mi ullamcorper, ac egestas sem tempor. Suspendisse maximus massa orci, non condimentum ligula gravida nec. Nam lacus orci, elementum id dolor vitae, laoreet commodo elit. Morbi a felis sit amet diam egestas convallis. Sed imperdiet vestibulum lacus, euismod sagittis nisi suscipit sit amet. Nam ut lectus nunc. Aliquam erat volutpat. Suspendisse sollicitudin finibus neque eu posuere. Etiam cursus fermentum nibh, ac iaculis quam fringilla et. Proin lacinia imperdiet lorem. Duis in lorem non lectus accumsan bibendum. Fusce mattis purus vitae lacus posuere suscipit. Donec vitae nibh quis lacus lacinia consequat eget in mauris. Ut diam mauris, dictum id porta id, lacinia scelerisque neque. Nullam id ultrices ligula, eget ullamcorper risus. Aenean a justo sed felis maximus blandit in ut purus. Vestibulum pharetra urna non nisi pharetra mattis. Duis neque eros, ultrices in malesuada dignissim, porttitor quis lectus. Mauris id bibendum odio. In porttitor, libero a pulvinar tincidunt, purus ante auctor augue, ornare placerat sem dui nec ante. Ut tincidunt ante pretium magna dictum, id venenatis nulla cursus. Integer porta mauris arcu, in viverra elit sagittis at.

Settings:

TypesetBot.attach(
  '.some-query-selector',
  {
    alignment: 'justify',
    hyphenLanguage: 'en-us',
    // ...more settings
  }
);

Explore the demo on CodePen

Install

NPM

# Install
$ npm install typesetbot

# (optional) Download hyphenation library and patterns, for example US English
$ npm install hypher hyphenation.en-us

CDN

<link rel="stylesheet" href="https://rawgit.com/MGApcDev/TypesetBot/master/dist/typesetbot.min.css">
<script type="text/javascript" src="https://rawgit.com/MGApcDev/TypesetBot/master/dist/typesetbot.js"></script>

<!-- (optional) -->
<!-- hyphenation library -->
<script type="text/javascript" src="https://cdn.rawgit.com/bramstein/hypher/v0.2.5/dist/jquery.hypher.js"></script>
<!-- hyphenation pattern for US English -->
<script type="text/javascript" src="https://cdn.rawgit.com/bramstein/hyphenation-patterns/dc01d58a/dist/browser/en-us.js"></script>

License

Commercial license

If you want to use TypesetBot to develop commercial sites, themes, projects, and applications, the Commercial license is the appropriate license. With this option, your source code is kept proprietary.

Open source license

If you are creating an open source application under a license compatible with the GNU GPL license v3, you may use TypesetBot under the terms of the GPLv3.

Basic usage

<html>
<head>
    <link rel="stylesheet" href="node_modules/typesetbot/dist/typesetbot.min.css">
    <script type="text/javascript" src="node_modules/typesetbot/dist/typesetbot.js"></script>

    <script type="text/javascript" src="node_modules/hypher/dist/jquery.hypher.js"></script>
    <script type="text/javascript" src="node_modules/hyphenation.en-us/lib/en-us.js"></script>
    <script type="text/javascript">
        TypesetBot.attach('body');
    </script>
</head>
<body>
    <p>Lorem ipsum...</p>
</body>
</html>

Your browser uses a very simple line breaking algorithm to arrange the text on your screen. TypesetBot changes this arrangement using an algorithm inspired by TeX's total-fit algorithm. TypesetBot looks at all the lines in your paragraph and finds the best line breaks for all of them. This process reduces whitespace between words and applies hyphenation only when necessary, so the text becomes more readable. TypesetBot is designed with the web in mind: it is written in JavaScript for wide support, uses query selectors to typeset only the elements you choose, recalculates the typesetting when the viewport size changes, supports inline tags, installs easily via CDN or NPM, and offers hyphenation for 39 languages.

Features

Line breaking

The line breaking algorithm is inspired by TeX's implementation of the total-fit algorithm, which is detailed in Digital Typography by Donald Knuth. The implementation was modified to make better sense in the web domain. This required rethinking some fundamental elements of the algorithm, which gave us: use of query selectors, HTML inline-tag support, handling of viewport changes, and feeding preprocessed data.
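To make the total-fit idea concrete, here is a minimal sketch, not TypesetBot's actual implementation. It assumes words are given as pixel widths, ignores hyphenation and fitness classes, and sets the last line at natural spacing. The function name `breakLines` and its parameters are hypothetical.

```javascript
// Simplified total-fit line breaking (illustrative sketch only).
// wordWidths: pixel width of each word; space: ideal width, stretch and shrink per gap.
function breakLines(wordWidths, lineWidth, space = { width: 6, stretch: 3, shrink: 2 }, maxRatio = 2) {
  const n = wordWidths.length;
  // best[i] holds the cheapest known way to end a line right before word i.
  const best = new Array(n + 1).fill(null);
  best[0] = { demerits: 0, prev: -1 };

  for (let end = 1; end <= n; end++) {
    let natural = 0;
    // Try every possible start for the line that ends before word `end`.
    for (let start = end - 1; start >= 0; start--) {
      const gaps = end - start - 1;
      natural += wordWidths[start] + (gaps > 0 ? space.width : 0);
      // Even fully shrunk the line is too wide; earlier starts only make it wider.
      if (natural - gaps * space.shrink > lineWidth) break;
      if (best[start] === null) continue;

      const diff = lineWidth - natural;
      let ratio;
      if (end === n && diff >= 0) {
        ratio = 0; // the last line does not need to be stretched
      } else if (gaps === 0) {
        ratio = diff === 0 ? 0 : Infinity; // a single word cannot stretch
      } else {
        ratio = diff / (gaps * (diff >= 0 ? space.stretch : space.shrink));
      }
      if (ratio < -1 || ratio > maxRatio) continue;

      const badness = 100 * Math.pow(Math.abs(ratio), 3) + 0.5;
      const d = 1 + badness;
      const total = best[start].demerits + d * d;
      if (best[end] === null || total < best[end].demerits) {
        best[end] = { demerits: total, prev: start };
      }
    }
  }

  if (best[n] === null) return null; // no feasible set of breaks

  // Walk the chain of best predecessors back to the start of the paragraph.
  const lines = [];
  for (let end = n; end > 0; end = best[end].prev) {
    lines.unshift([best[end].prev, end]);
  }
  return { lines, demerits: best[n].demerits };
}

const result = breakLines([40, 40, 40, 40], 86);
console.log(result.lines); // each entry is a [start, end) range of word indexes
```

Unlike a first-fit approach, every line's cost contributes to one total, so a slightly looser early line can be chosen when it avoids a much worse line later.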

Query selectors

TypesetBot only works on paragraph elements. TypesetBot doesn't have to change your entire website: use query selectors like those found in jQuery to select a single paragraph, or a broader selector to typeset multiple paragraphs. Using 'body' you can typeset all paragraphs on a page, or '.content' to find all paragraphs within the content class.

Hyphenation

TypesetBot doesn't require a hyphenation library, but hyphenating words can greatly improve the line breaking quality. Hyphenation is done with hypher, which uses the open source TeX hyphenation patterns and can hyphenate text in various languages. The supported languages can be found in hyphenation-patterns by bramstein.
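Once a hyphenation library has split a word into parts, each boundary between parts is a candidate break point. The sketch below shows one way minimums such as the `hyphenLeftMin` and `hyphenRightMin` settings could filter those candidates. The helper `hyphenBreaks` is hypothetical, and the word parts are supplied by hand rather than produced by hypher.

```javascript
// Return the character offsets where a word may be hyphenated,
// given its parts and the minimum letters to keep on each side.
function hyphenBreaks(parts, leftMin = 2, rightMin = 2) {
  const total = parts.join('').length;
  const breaks = [];
  let offset = 0;
  for (let i = 0; i < parts.length - 1; i++) {
    offset += parts[i].length;
    // Keep the break only if enough letters remain on both sides.
    if (offset >= leftMin && total - offset >= rightMin) {
      breaks.push(offset);
    }
  }
  return breaks;
}

// Parts are assumed here for illustration.
console.log(hyphenBreaks(['hy', 'phen', 'ation'])); // [2, 6]
console.log(hyphenBreaks(['a', 'bout']));           // [] (only 1 letter on the left)
```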

Viewport changes

With responsive design being an important thing on the web, I think it only makes sense to consider that the user might change the viewport size by resizing the browser window or going from portrait to landscape on mobile. When using _TypesetBot.attach()_, the line breaking is recalculated every time the viewport changes, and with some clever CSS the transition is hard to notice.
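As an illustration of the idea only (how TypesetBot wires this up internally is not shown here), a resize handler can skip work when the measured width has not actually changed. The names `createResizeHandler`, `getWidth` and `retypeset` are hypothetical.

```javascript
// Build a handler that re-runs typesetting only when the width changed.
function createResizeHandler(getWidth, retypeset) {
  let lastWidth = null;
  return function onResize() {
    const width = getWidth();
    if (width === lastWidth) return false; // nothing to do
    lastWidth = width;
    retypeset(width);
    return true;
  };
}

// Simulated viewport so the sketch runs outside a browser.
let currentWidth = 500;
const retypesetCalls = [];
const onResize = createResizeHandler(() => currentWidth, (w) => retypesetCalls.push(w));

onResize();          // first run at 500px
onResize();          // same width, skipped
currentWidth = 200;  // e.g. rotating a phone to portrait
onResize();          // width changed, typeset again
console.log(retypesetCalls); // [500, 200]

// In a browser this could be attached with:
// window.addEventListener('resize', onResize);
```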

Tag support

One of the biggest changes that had to be made from the total-fit algorithm is the support for inline tags in paragraphs. This support allows the user to wrap text in style-changing tags like

<span>
<!-- or -->
<b class="someClass">

These can be used as you would expect, even in the middle of a word and with styles that change the font size.
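To show why a tag in the middle of a word needs special care, here is a rough sketch that groups text and tags into words, so that a word stays one unit even when a tag splits it. This is an assumption about one possible approach, not TypesetBot's code. `splitWords` is a hypothetical helper, and it assumes well-formed markup.

```javascript
// Split an HTML fragment into words, keeping inline tags attached
// to the word they appear in.
function splitWords(html) {
  // Tokens are: a tag, a run of non-space text, or a run of whitespace.
  const tokens = html.match(/<\/?[a-zA-Z][^>]*>|[^<\s]+|\s+/g) || [];
  const words = [];
  let current = null;
  for (const token of tokens) {
    if (/^\s+$/.test(token)) {
      current = null; // whitespace ends the current word
      continue;
    }
    if (current === null) {
      current = { text: '', pieces: [] };
      words.push(current);
    }
    const isTag = token.startsWith('<');
    current.pieces.push({ type: isTag ? 'tag' : 'text', value: token });
    if (!isTag) current.text += token;
  }
  return words;
}

const words = splitWords('Type<b class="someClass">set</b>Bot works');
console.log(words.map((w) => w.text)); // [ 'TypesetBot', 'works' ]
```

Each piece of the first word could then be measured with its own font size while the line breaker still treats it as a single word.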

Settings

TypesetBot has been made to be very adjustable, as there are no perfect settings. Some of these settings affect performance and behavior, so it's important to expose them to the user.

{
    // Algorithm.
    alignment: 'justify', // Other options are 'ragged-right', 'ragged-left' and 'ragged-center'

    hyphenPenalty: 50, // Penalty for line-breaking on a hyphen
    hyphenPenaltyRagged: 500, // Penalty for line-breaking on a hyphen when using ragged text
    flagPenalty: 3000, // Penalty when current and last line had flag value 1. Referred to as 'α'
    fitnessClassDemerit: 3000, // Penalty when switching between ratio classes. Referred to as 'γ'
    demeritOffset: 1, // Offset to prefer fewer lines by increasing demerit of "~zero badness lines"

    // "the value of q is increased by 1 (if q < 0) or decreased by 1 (if q > 0) until a feasible solution is found." - DT p.114
    loosenessParam: 0, // If zero we find the solution with fewest total demerits. Referred to as 'q'
    absoluteMaxRatio: 5, // Max adjustment ratio before we give up on finding solutions

    maxRatio: 2, // Maximum acceptable adjustment ratio. Referred to as 'p'
    minRatio: -1, // Minimum acceptable adjustment ratio. Less than -1 will make the text too closely spaced.

    // Hyphen.
    hyphenLanguage: 'en-us', // Language of hyphenation patterns to use
    hyphenLeftMin: 2, // Minimum number of letters to keep on the left side of word
    hyphenRightMin: 2, // Minimum number of letters to keep on the right side of word

    // 4 classes of adjustment ratios.
    fitnessClass: [-1, -0.5, 0.5, 1, Infinity],

    // Font.
    spaceUnit: 'em', // Space width unit, em is relative to font-size
    spaceWidth: 1/3, // Ideal space width
    spaceStretchability: 1/6, // How much can the space width stretch
    spaceShrinkability: 1/9, // How much can the space width shrink

    // Inline element that the program will unwrap from paragraphs as they could disrupt the line breaking.
    unwrapElements: ['img'],


    // Dynamic width.
    // Allow the paragraph to account for overlapping elements, turn this off if you know there's no overlapping element to gain performance.
    dynamicWidth: true,
    // Pixel increment of vertical search, higher better performance, lower more accurate result.
    dynamicWidthIncrement: 5,

    // Functions.

    // Adjustment ratio
    ratio (idealW, actualW, wordCount, shrink, stretch, settings) {
        if (actualW < idealW) {
            return (idealW - actualW) / ((wordCount - 1) * stretch);
        }

        return (idealW - actualW) / ((wordCount - 1) * shrink);
    },
    // Badness calculation
    badness (ratio, settings) {
        if (ratio == null || ratio < settings.minRatio) {
            return Infinity;
        }

        return 100 * Math.pow(Math.abs(ratio), 3) + 0.5;
    },
    // Demerit calculation
    demerit (badness, penalty, flag, settings) {
        var flagPenalty = flag ? settings.flagPenalty : 0;
        if (penalty >= 0) {
            return Math.pow(settings.demeritOffset + badness + penalty, 2) + flagPenalty;
        } else if (penalty === -Infinity) {
            return Math.pow(settings.demeritOffset + badness, 2) + flagPenalty;
        } else {
            return Math.pow(settings.demeritOffset + badness, 2) - Math.pow(penalty, 2) + flagPenalty;
        }
    },
    debug: false // Print performance information
}
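A small worked example may help when tuning these values. The sketch below copies the default `ratio`, `badness` and `demerit` functions from the settings above into standalone functions and evaluates them for two lines. The pixel values are assumptions: an 18px font, so the stretchability of 1/6 em is 3px and the shrinkability of 1/9 em is 2px.

```javascript
// Only the settings the three functions read; values are the defaults above.
const settings = { minRatio: -1, demeritOffset: 1, flagPenalty: 3000 };

function ratio(idealW, actualW, wordCount, shrink, stretch) {
  if (actualW < idealW) {
    return (idealW - actualW) / ((wordCount - 1) * stretch);
  }
  return (idealW - actualW) / ((wordCount - 1) * shrink);
}

function badness(r, settings) {
  if (r == null || r < settings.minRatio) return Infinity;
  return 100 * Math.pow(Math.abs(r), 3) + 0.5;
}

function demerit(b, penalty, flag, settings) {
  const flagPenalty = flag ? settings.flagPenalty : 0;
  if (penalty >= 0) {
    return Math.pow(settings.demeritOffset + b + penalty, 2) + flagPenalty;
  } else if (penalty === -Infinity) {
    return Math.pow(settings.demeritOffset + b, 2) + flagPenalty;
  }
  return Math.pow(settings.demeritOffset + b, 2) - Math.pow(penalty, 2) + flagPenalty;
}

// A 500px line holding 11 words (10 gaps).
const loose = ratio(500, 470, 11, 2, 3); // 30px short: 30 / (10 * 3) = 1
const tight = ratio(500, 510, 11, 2, 3); // 10px over: -10 / (10 * 2) = -0.5

console.log(loose, badness(loose, settings)); // 1 100.5
console.log(tight, badness(tight, settings)); // -0.5 13

// The tight line ending on a plain space, on a hyphen (hyphenPenalty 50),
// and on a hyphen right after another hyphenated line (flag set).
console.log(demerit(13, 0, false, settings));  // 196
console.log(demerit(13, 50, false, settings)); // 4096
console.log(demerit(13, 50, true, settings));  // 7096
```

Because the demerit squares the sum of badness and penalty, a single hyphen on an otherwise good line costs far more than a modest amount of extra spacing.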

Performance

Quality

The easiest way to compare the quality of the typesetting is to look at the adjustment ratios, which indicate how tight or loose a line is relative to the number of words in the line. Ideally we want this value to be 0, so the word-spacing is exactly 1/3 em with no additional whitespace. Plain HTML can only increase word-spacing if necessary, while TypesetBot can both increase and decrease it. The point is to decrease visual impact, so the badness of a line grows with the cube of its adjustment ratio. Normal HTML has an easier time line breaking on wider lines, but on smaller viewports it starts to fall apart, so we look at how this is affected at multiple viewport widths. In the examples, lines with a ratio over 0.5 or lower than -0.5 are highlighted in yellow, and lines with a ratio higher than 1.0 are highlighted in red. Reference: here
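The highlighting rule and the summary figures used in the comparisons can be sketched as follows. The function names are hypothetical and the sample ratios are made up for illustration, not taken from the measurements.

```javascript
// Colour a line by its adjustment ratio, following the rule described above.
function highlight(ratio) {
  if (ratio > 1.0) return 'red';
  if (ratio > 0.5 || ratio < -0.5) return 'yellow';
  return 'none';
}

// Average, max and min adjustment ratio over all lines of a paragraph.
function summarize(ratios) {
  const sum = ratios.reduce((a, b) => a + b, 0);
  return {
    average: sum / ratios.length,
    max: Math.max(...ratios),
    min: Math.min(...ratios),
  };
}

const sampleRatios = [0.5, -0.5, 1.5]; // illustrative values only
console.log(sampleRatios.map(highlight)); // [ 'none', 'none', 'red' ]
console.log(summarize(sampleRatios));     // { average: 0.5, max: 1.5, min: -0.5 }
```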

In this example the website has a lot of space to reduce the adjustment ratio, so plain HTML gets pretty good results with an average of 0.5. The worst ratio is still 1.7, which is not very good: we prefer consistently low ratios, and even a single loose line will stand out because it appears to contain a white box rather than a big space. In this case TypesetBot doesn't need to hyphenate any words, and therefore justified and ragged find the same solution. An average of -0.1 is very good, and unlike plain HTML the max and min also stay closer to zero, with a max of 0.3 and a min of -0.7, a lot better than the 1.7 from plain HTML.

At a text width of 500px, which is very common on websites where 2/3 of the page is text content, we see the real advantages of TypesetBot. The average ratio of normal HTML moves up to 1.25, which is still okay, but the max ratio moves up to 3.2, which is very visible. TypesetBot stays at a very low average of 0.1, with the max and min only increasing a bit to 0.8 and -0.8 respectively. This is also where we see the difference between justified and ragged: the justified implementation has better ratios, but the ragged one has only 1 hyphen over 43 lines, compared to 5 over 43. Even with the ragged implementation, all ratios are significantly better than with normal HTML.

As the viewport shrinks to 200px, we're at a text width that is not preferable, but this is inevitable for text on mobile devices. In this case we see very significant improvements in all aspects. The average ratio for normal HTML has jumped to 3.8 and the max is a staggering 16. For TypesetBot the average is still good at around 1 for both ragged and justified, but even TypesetBot has a max of 4.9, which is not good. Part of how TypesetBot gets such good results is hyphenation: both justified and ragged hyphenated nearly half the lines, but this is a trade-off that the algorithm is willing to make. One thing to note is that TypesetBot saved 8 lines, which is a good side effect of the hyphenation.

Execution time

TypesetBot constructs a list of objects for each paragraph on the first run. The list contains information for every word and hyphenation in the paragraph, so it's an expensive task, but it only occurs on a new page load. The second run occurs when the user adjusts the viewport, and all the line breaking and additional hyphenations need to be calculated. Reference: here
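The split between an expensive first run and cheaper later runs can be pictured as a per-paragraph cache. This is only a sketch of the pattern, with hypothetical names (`createParagraphCache`, `preprocess`), and the "measurement" is a stand-in that uses the word length instead of a real rendered width.

```javascript
// Cache the preprocessed word list per paragraph so it is built only once.
function createParagraphCache(preprocess) {
  const cache = new Map();
  return function getNodes(id, text) {
    if (!cache.has(id)) {
      cache.set(id, preprocess(text)); // expensive step, first run only
    }
    return cache.get(id);
  };
}

let preprocessRuns = 0;
const getNodes = createParagraphCache((text) => {
  preprocessRuns++;
  // Stand-in for measuring each word: record the word and its length.
  return text.split(/\s+/).filter(Boolean).map((word) => ({ word, width: word.length }));
});

const firstRun = getNodes('p1', 'Lorem ipsum dolor sit amet');  // page load
const secondRun = getNodes('p1', 'Lorem ipsum dolor sit amet'); // viewport change
console.log(preprocessRuns, firstRun === secondRun);
```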

Performance examples

| Description | Initialize | Viewport change |
| --- | --- | --- |
| 1 paragraph, 441 words, with hyphenation | 143ms | 137ms |
| 1 paragraph, 441 words, without hyphenation | 109ms | 74ms |
| 1 paragraph, 882 words, with hyphenation | 270ms | 209ms |
| 1 paragraph, 882 words, without hyphenation | 170ms | 123ms |
| 2 paragraphs, 441 words, with hyphenation | 275ms | 147ms |
| 2 paragraphs, 441 words, without hyphenation | 130ms | 111ms |
| 1 paragraph, 441 words, with hyphenation, without dynamic width | 53ms | 51ms |