
Commit 0521ec6

author
Davide Valeriani
authored
Merge pull request matplotlib#51 from aitikgupta/aitikgupta/gsoc-mid
GSoC'21 Mid-Term Progress: Aitik Gupta
2 parents 72db1b9 + 88d1680 commit 0521ec6

2 files changed

+88
-0
lines changed

2 files changed

+88
-0
lines changed
304 KB
Loading
Lines changed: 88 additions & 0 deletions
@@ -0,0 +1,88 @@
---
title: "GSoC'21: Mid-Term Progress"
date: 2021-07-02T08:32:05+05:30
draft: false
categories: ["News", "GSoC"]
description: "Mid-Term Progress with Google Summer of Code 2021 project under NumFOCUS: Aitik Gupta"
displayInList: true
author: Aitik Gupta

resources:
  - name: featuredImage
    src: "AitikGupta_GSoC.png"
    params:
      showOnTop: true
---

**"<ins>Aitik, how is your GSoC going?</ins>"**

Well, it's been a while since I last wrote. But I wasn't spending time watching _Loki_ either! (that's a lie.)

During this period the project took some interesting (and stressful) turns, which I intend to talk about in this short writeup.
## New Mentor!
In the first week of the coding period, I met one of my new mentors, [Jouni](https://github.com/jkseppan). Without him, along with [Tom](https://github.com/tacaswell) and [Antony](https://github.com/anntzer), the project wouldn't have moved _an inch_.

It was Jouni's [PR](https://github.com/matplotlib/matplotlib/pull/18143) that served as my starting point for the first milestone in my proposal, <ins>Font Subsetting</ins>.

## What is Font Subsetting anyway?
As Tom suggested, a good way to understand something is to document your journey along the way! (well, that's what GSoC wants us to do anyway, right?)

Taking an excerpt from one of the paragraphs I wrote [here](https://github.com/matplotlib/matplotlib/blob/a94f52121cea4194a5d6f6fc94eafdfb03394628/doc/users/fonts.rst#subsetting):
> Font Subsetting can be used before generating documents, to embed only the _required_ glyphs within the documents. Fonts can be considered as a collection of these glyphs, so ultimately the goal of subsetting is to find out which glyphs are required for a certain array of characters, and embed only those within the output.

Now this may seem straightforward, right?
#### Wrong.
Glyph programs can call their own subprograms. For example, a character like `ä` could be composed by calling the subprograms for `a` and `¨`; or `→` could be composed by a program that changes the display matrix and calls the subprogram for `←`.

Since the subsetter has to find _all such subprograms_ called by _every glyph_ included in the subset, this is a difficult problem in general!

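To make the "find all subprograms" idea concrete, here is a toy sketch of a dependency closure over glyphs. It is only an illustration: the glyph names and the `COMPONENTS` mapping are made up, `glyph_closure` is a hypothetical helper, and real fonts (and fontTools) deal with far more cases than this.

```python
# Toy model: each glyph maps to the glyphs (subprograms) it calls.
# The names and the mapping are invented for illustration only.
COMPONENTS = {
    "adieresis": ["a", "dieresis"],  # ä is built from a and ¨
    "arrowright": ["arrowleft"],     # → reuses ← with a different matrix
    "a": [],
    "dieresis": [],
    "arrowleft": [],
    "b": [],
}


def glyph_closure(requested, components):
    """Return every glyph needed to draw `requested`, dependencies included."""
    needed = set()
    stack = list(requested)
    while stack:
        glyph = stack.pop()
        if glyph in needed:
            continue  # already handled; this also guards against cycles
        needed.add(glyph)
        stack.extend(components.get(glyph, []))
    return needed


subset = glyph_closure(["adieresis"], COMPONENTS)
print(sorted(subset))  # ['a', 'adieresis', 'dieresis']
```

Asking for just `ä` pulls in `a` and `¨` as well, which is why a subsetter cannot simply keep the glyphs that appear in the text.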

Something one of my mentors said _really_ stuck with me:
> Matplotlib isn't a font library, and shouldn't try to be one.

It's really easy to fall into the trap of trying to do _everything_ within your own project, which ends up _hurting_ the project itself.

Since this holds true even for Matplotlib, it relies on external dependencies like [FreeType](https://www.freetype.org/), [ttconv](https://github.com/sandflow/ttconv), and the newly proposed [fontTools](https://github.com/fonttools/fonttools) to handle font subsetting, embedding, rendering, and related stuff.

PS: If that font stuff didn't make sense, I would recommend going through a friendly tutorial I wrote, which is all about [Matplotlib and Fonts](https://matplotlib.org/stable/users/fonts.html)!
## Unexpected Complications
Matplotlib uses an external dependency, `ttconv`, which was initially forked into Matplotlib's repository **in 2003**!
> ttconv was a standalone commandline utility for converting TrueType fonts to subsetted Type 3 fonts (among other features) written in 1995, which Matplotlib forked in order to make it work as a library.

Over time, a lot of issues piled up in it that were either hard to fix or didn't attract much attention. (See the quote above for a valid reason.)

One major utility which is still used is `convert_ttf_to_ps`, which takes a _font path_ as input and converts the font into a Type 3 or Type 42 PostScript font that can be embedded within PS/EPS output documents. The guide I wrote ([link](https://matplotlib.org/stable/users/fonts.html)) contains decent descriptions of these font types, the differences between them, etc.

#### So we need to convert that _font path_ input to a _font buffer_ input.
Why do we need to? Type 42 subsetting isn't really supported by ttconv, so we use a new dependency called fontTools, whose 'full-time job' is to subset Type 42 fonts for us (among other things).

> It provides us with a font buffer, however ttconv expects a font path to embed that font

The easy workaround is to write the buffer to a temporary file with Python's `tempfile.NamedTemporaryFile` and hand ttconv its path:
```python
import os
import tempfile

# fontdata, fh, fonttype, glyph_ids and convert_ttf_to_ps
# come from the surrounding Matplotlib code
with tempfile.NamedTemporaryFile(suffix=".ttf") as tmp:
    # fontdata is the subsetted buffer
    # returned from fontTools
    tmp.write(fontdata.getvalue())
    # flush so ttconv sees the whole font
    # when it opens the path itself
    tmp.flush()

    # TODO: allow convert_ttf_to_ps
    # to input file objects (BytesIO)
    convert_ttf_to_ps(
        os.fsencode(tmp.name),
        fh,
        fonttype,
        glyph_ids,
    )
```

***But this is far from a clean API in terms of separating \*reading\* the file from \*parsing\* the data.***

What we _ideally_ want is to pass the buffer down to `convert_ttf_to_ps` and modify the embedding code of `ttconv` (written in C++). And _here_ we run into a lot of unexplored code, _which has barely been touched since it was forked_.

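As a rough picture of what separating _reading_ from _parsing_ could look like, here is a small Python sketch. It is not ttconv's interface (that lives in C++): `read_font_bytes` and `count_tables` are hypothetical names, and the "font" is just a fake 12-byte sfnt header built for the example.

```python
import io
import os
import struct


def read_font_bytes(source):
    """Reading step: accept a path or a binary file object, return raw bytes."""
    if isinstance(source, (str, bytes, os.PathLike)):
        with open(source, "rb") as f:
            return f.read()
    return source.read()


def count_tables(data):
    """Parsing step: works on bytes only and never touches the disk."""
    # An sfnt header starts with: uint32 sfntVersion, uint16 numTables
    # (both big-endian).
    _version, num_tables = struct.unpack_from(">IH", data, 0)
    return num_tables


# Fake sfnt header (version 0x00010000, 3 tables), for illustration only.
header = struct.pack(">IHHHH", 0x00010000, 3, 32, 1, 16)
print(count_tables(read_font_bytes(io.BytesIO(header))))  # 3
```

With this split, the parser doesn't care whether the bytes came from a file on disk or from an in-memory buffer like the one fontTools hands back, so no temporary file is needed.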

Funnily enough, just yesterday, after spending a lot of quality time on it, my mentors and I figured out that the **whole logging system of ttconv was broken**, all because of a single debugging function. 🥲

<hr>

This is still an ongoing problem that we need to tackle over the coming weeks. Hopefully it will be resolved by the next time I write one of these blogs!

Again, thanks a ton for spending time reading these blogs. :D
#### NOTE: This blog post is also available at my [personal website](https://aitikgupta.github.io/gsoc-mid/).
