Skip to content

Commit 81342bf

Browse files
author
Davide Valeriani
authored
Merge pull request matplotlib#52 from aitikgupta/aitikgupta/gsoc-pre-quarter
GSoC'21 Pre-Quarter Progress: Aitik Gupta
2 parents 0521ec6 + 82de67a commit 81342bf

File tree

2 files changed

+92
-0
lines changed

2 files changed

+92
-0
lines changed
304 KB
Loading
Lines changed: 92 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,92 @@
1+
---
2+
title: "GSoC'21: Pre-Quarter Progress"
3+
date: 2021-07-19T07:32:05+05:30
4+
draft: false
5+
categories: ["News", "GSoC"]
6+
description: "Pre-Quarter Progress with Google Summer of Code 2021 project under NumFOCUS: Aitik Gupta"
7+
displayInList: true
8+
author: Aitik Gupta
9+
10+
resources:
11+
- name: featuredImage
12+
src: "AitikGupta_GSoC.png"
13+
params:
14+
showOnTop: true
15+
---
16+
17+
**“<ins>Well? Did you get it working?!</ins>”**
18+
19+
Before I answer that question, if you're missing the context, check out my [previous blog](https://matplotlib.org/matplotblog/posts/gsoc_2021_midterm/)'s last few lines.. promise it won't take you more than 30 seconds to get the whole problem!
20+
21+
With this short writeup, I intend to talk about _what_ we did and _why_ we did, what we did. XD
22+
23+
## Ostrich Algorithm
24+
Ring any bells? Remember OS (Operating Systems)? It's one of the core CS subjects which I bunked then and regret now. (╥﹏╥)
25+
26+
The [wikipedia page](https://en.wikipedia.org/wiki/Ostrich_algorithm) has a 2-liner explaination if you have no idea what's an Ostrich Algorithm.. but I know most of y'all won't bother clicking it XD, so here goes:
27+
> Ostrich algorithm is a strategy of ignoring potential problems by "sticking one's head in the sand and pretending there is no problem"
28+
29+
An important thing to note: it is used when it is more **cost-effective** to _allow the problem to occur than to attempt its prevention_.
30+
31+
As you might've guessed by now, we ultimately ended up with the *not-so-clean* API (more on this later).
32+
33+
## What was the problem?
34+
The highest level overview of the problem was:
35+
36+
```
37+
❌ fontTools -> buffer -> ttconv_with_buffer
38+
✅ fontTools -> buffer -> tempfile -> ttconv_with_file
39+
```
40+
The first approach created corrupted outputs, however the second approach worked fine. A point to note here would be that *Method 1* is better in terms of separation of *reading* the file from *parsing* the data.
41+
42+
1. [fontTools](https://github.com/fonttools/fonttools) handles the Type42 subsetting for us, whereas [ttconv](https://github.com/matplotlib/matplotlib/tree/master/extern/ttconv) handles the embedding.
43+
2. `ttconv_with_buffer` is a modification to the original `ttconv_with_file`; that allows it to input a file buffer instead of a file-path
44+
45+
You might be tempted to say:
46+
> "Well, `ttconv_with_buffer` must be wrongly modified, duh."
47+
48+
Logically, yes. `ttconv` was designed to work with a file-path and not a file-object (buffer), and modifying a codebase **written in 1998** turned out to be a larger pain than we anticipated.
49+
#### It came to a point where one of my mentors decided to implement everything in Python!
50+
He even did, but <ins>the efforts</ins> to get it to production / or to fix `ttconv` embedding were ⋙ to just get on with the second method. That damn ostrich really helped us get out of that debugging hell. 🙃
51+
## Font Fallback - initial steps
52+
Finally, we're onto the second subgoal for the summer: [Font Fallback](https://www.w3schools.com/css/css_font_fallbacks.asp)!
53+
54+
To give an idea about how things work right now:
55+
1. User asks Matplotlib to use certain font families, specified by:
56+
```python
57+
matplotlib.rcParams["font-family"] = ["list", "of", "font", "families"]
58+
```
59+
2. This list is used to search for available fonts on a user's system.
60+
3. However, in current (and previous) versions of Matplotlib:
61+
> <ins>As soon as a font is found by iterating the font-family, **all text** is rendered by that _and only that_ font.</ins>
62+
63+
You can immediately see the problems with this approach; using the same font for every character will not render any glyph which isn't present in that font, and will instead spit out a square rectangle called "tofu" (read the first line [here](https://www.google.com/get/noto/)).
64+
65+
And that is exactly the first milestone! That is, parsing the _<ins>entire list</ins>_ of font families to get an intermediate representation of a multi-font interface.
66+
## Don't break, a lot at stake!
67+
Imagine if you had the superpower to change Python standard library's internal functions, _without_ consulting anybody. Let's say you wanted to write a solution by hooking in and changing, let's say `str("dumb")` implementation by returning:
68+
```ipython
69+
>>> str("dumb")
70+
["d", "u", "m", "b"]
71+
```
72+
Pretty "<ins>dumb</ins>", right? xD
73+
74+
For your usecase it might work fine, but it would also mean breaking the _entire_ Python userbase' workflow, not to mention the 1000000+ libraries that depend on the original functionality.
75+
76+
On a similar note, Matplotlib has a public API known as `findfont(prop: str)`, which when given a string (or [FontProperties](https://matplotlib.org/stable/api/font_manager_api.html#matplotlib.font_manager.FontProperties)) finds you a font that best matches the given properties in your system.
77+
78+
It is used <ins>throughout the library</ins>, as well as at multiple other places, including downstream libraries. Being naive as I was, I changed this function signature and submitted the [PR](https://github.com/matplotlib/matplotlib/pull/20496). 🥲
79+
80+
Had an insightful discussion about this with my mentors, and soon enough raised the [other PR](https://github.com/matplotlib/matplotlib/pull/20549), which didn't touch the `findfont` API at all.
81+
82+
---
83+
84+
One last thing to note: Even if we do complete the first milestone, we wouldn't be done yet, since this is just parsing the entire list to get multiple fonts..
85+
86+
We still need to migrate the library's internal implementation from **font-first** to **text-first**!
87+
88+
89+
But that's for later, for now:
90+
![OnceAgainThankingYou](https://user-images.githubusercontent.com/43996118/126441988-5a2067fd-055e-44e5-86e9-4dddf47abc9d.png)
91+
92+
#### NOTE: This blog post is also available at my [personal website](https://aitikgupta.github.io/gsoc-pre-quarter/).

0 commit comments

Comments
 (0)