<h1>Breaking JavaScript Generators</h1>
<p>Jared Khan, 2023-08-14, <a href="https://jaredkhan.com/blog/breaking-js-generators">https://jaredkhan.com/blog/breaking-js-generators</a></p>
<p><em>Coming back to JavaScript after focusing on Python, I was surprised that using <code class="language-plaintext highlighter-rouge">break</code> when iterating over a generator closes the generator. This post has an example of that behaviour and a comparison with Python’s behaviour.</em></p>
<p>Python and JavaScript both have the concept of generators, functions which can yield multiple values, pausing after each yield until the consumer asks for the next value.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Python
</span><span class="k">def</span> <span class="nf">my_generator</span><span class="p">():</span>
    <span class="k">yield</span> <span class="mi">1</span>
    <span class="k">yield</span> <span class="mi">2</span>
    <span class="k">yield</span> <span class="mi">3</span>
<span class="n">iterator</span> <span class="o">=</span> <span class="n">my_generator</span><span class="p">()</span>
<span class="k">print</span><span class="p">(</span><span class="nb">next</span><span class="p">(</span><span class="n">iterator</span><span class="p">))</span>
<span class="k">print</span><span class="p">(</span><span class="s">"something else"</span><span class="p">)</span>
<span class="k">for</span> <span class="n">item</span> <span class="ow">in</span> <span class="n">iterator</span><span class="p">:</span>
    <span class="k">print</span><span class="p">(</span><span class="n">item</span><span class="p">)</span>
<span class="c1"># Outputs:
# 1
# something else
# 2
# 3
</span></code></pre></div></div>
<div class="language-js highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// JavaScript</span>
<span class="kd">function</span><span class="o">*</span> <span class="nx">myGenerator</span><span class="p">()</span> <span class="p">{</span>
  <span class="k">yield</span> <span class="mi">1</span><span class="p">;</span>
  <span class="k">yield</span> <span class="mi">2</span><span class="p">;</span>
  <span class="k">yield</span> <span class="mi">3</span><span class="p">;</span>
<span class="p">}</span>
<span class="kd">const</span> <span class="nx">iterator</span> <span class="o">=</span> <span class="nx">myGenerator</span><span class="p">();</span>
<span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="nx">iterator</span><span class="p">.</span><span class="nx">next</span><span class="p">().</span><span class="nx">value</span><span class="p">);</span>
<span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="dl">"</span><span class="s2">something else</span><span class="dl">"</span><span class="p">);</span>
<span class="k">for</span> <span class="p">(</span><span class="kd">const</span> <span class="nx">item</span> <span class="k">of</span> <span class="nx">iterator</span><span class="p">)</span> <span class="p">{</span>
  <span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="nx">item</span><span class="p">);</span>
<span class="p">}</span>
<span class="c1">// Outputs:</span>
<span class="c1">// 1</span>
<span class="c1">// something else</span>
<span class="c1">// 2</span>
<span class="c1">// 3</span>
</code></pre></div></div>
<p>These two examples look basically identical. The iterator object we get back in the two cases has pretty much the same three main methods:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">__next__</code> (Python) / <code class="language-plaintext highlighter-rouge">next</code> (JavaScript)
<ul>
<li>Run the generator up to the next yield and get the yielded value</li>
</ul>
</li>
<li><code class="language-plaintext highlighter-rouge">throw(exception)</code>
<ul>
<li>Force the generator to throw the given exception from its current suspended position</li>
</ul>
</li>
<li><code class="language-plaintext highlighter-rouge">close</code> (Python) / <code class="language-plaintext highlighter-rouge">return(value)</code> (JavaScript)
<ul>
<li>Close the generator, making sure any <code class="language-plaintext highlighter-rouge">finally</code> blocks are run. After this, the generator yields no further values (unless, in JavaScript, there are <code class="language-plaintext highlighter-rouge">yield</code>s in the <code class="language-plaintext highlighter-rouge">finally</code> for some reason; in Python, yielding during <code class="language-plaintext highlighter-rouge">close</code> raises a <code class="language-plaintext highlighter-rouge">RuntimeError</code>).</li>
</ul>
</li>
</ul>
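<p>As a rough illustration of the three methods on the JavaScript side, here is a minimal sketch. The names <code class="language-plaintext highlighter-rouge">counter</code> and <code class="language-plaintext highlighter-rouge">events</code> are made up for this example; it simply calls each method once and records what comes back.</p>

```javascript
// Sketch only: `counter` and `events` are illustrative names, not from the post.
const events = [];

function* counter() {
  try {
    yield 1;
    yield 2;
  } catch (err) {
    // throw() re-raises `err` at the paused yield, so it can be caught here.
    yield `caught ${err.message}`;
  } finally {
    // Runs when the generator is closed via return().
    events.push("finally");
  }
}

const gen = counter();
const first = gen.next();                    // { value: 1, done: false }
const thrown = gen.throw(new Error("boom")); // { value: "caught boom", done: false }
const closed = gen.return("bye");            // { value: "bye", done: true }
const after = gen.next();                    // { value: undefined, done: true }
console.log(first, thrown, closed, after, events);
```

<p>Once <code class="language-plaintext highlighter-rouge">return</code> has been called, further calls to <code class="language-plaintext highlighter-rouge">next</code> just report that the generator is done.</p>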
<p>The interesting difference that surprised me is that in JavaScript, using <code class="language-plaintext highlighter-rouge">break</code> whilst iterating over a generator with <code class="language-plaintext highlighter-rouge">for..of</code> calls <code class="language-plaintext highlighter-rouge">return</code> on the generator, so you can’t really keep using that generator:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Python
</span><span class="k">def</span> <span class="nf">my_generator</span><span class="p">():</span>
    <span class="k">try</span><span class="p">:</span>
        <span class="k">yield</span> <span class="mi">1</span>
        <span class="k">yield</span> <span class="mi">2</span>
    <span class="k">finally</span><span class="p">:</span>
        <span class="k">print</span><span class="p">(</span><span class="s">"finally"</span><span class="p">)</span>
<span class="n">it</span> <span class="o">=</span> <span class="n">my_generator</span><span class="p">()</span>
<span class="c1"># Print one item then break
</span><span class="k">for</span> <span class="n">item</span> <span class="ow">in</span> <span class="n">it</span><span class="p">:</span>
    <span class="k">print</span><span class="p">(</span><span class="n">item</span><span class="p">)</span>
    <span class="k">break</span>
<span class="c1"># Print the rest of the items
</span><span class="k">for</span> <span class="n">item</span> <span class="ow">in</span> <span class="n">it</span><span class="p">:</span>
    <span class="k">print</span><span class="p">(</span><span class="n">item</span><span class="p">)</span>
<span class="c1"># Outputs:
# 1
# 2
# finally
</span></code></pre></div></div>
<div class="language-js highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// JavaScript</span>
<span class="kd">function</span><span class="o">*</span> <span class="nx">myGenerator</span><span class="p">()</span> <span class="p">{</span>
  <span class="k">try</span> <span class="p">{</span>
    <span class="k">yield</span> <span class="mi">1</span><span class="p">;</span>
    <span class="k">yield</span> <span class="mi">2</span><span class="p">;</span>
  <span class="p">}</span> <span class="k">finally</span> <span class="p">{</span>
    <span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="dl">"</span><span class="s2">finally</span><span class="dl">"</span><span class="p">);</span>
  <span class="p">}</span>
<span class="p">}</span>
<span class="kd">const</span> <span class="nx">it</span> <span class="o">=</span> <span class="nx">myGenerator</span><span class="p">();</span>
<span class="c1">// Print one item then break</span>
<span class="k">for</span> <span class="p">(</span><span class="kd">const</span> <span class="nx">item</span> <span class="k">of</span> <span class="nx">it</span><span class="p">)</span> <span class="p">{</span>
  <span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="nx">item</span><span class="p">);</span>
  <span class="k">break</span><span class="p">;</span>
<span class="p">}</span>
<span class="c1">// Try to print the rest of the items</span>
<span class="c1">// (doesn't work)</span>
<span class="k">for</span> <span class="p">(</span><span class="kd">const</span> <span class="nx">item</span> <span class="k">of</span> <span class="nx">it</span><span class="p">)</span> <span class="p">{</span>
  <span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="nx">item</span><span class="p">);</span>
<span class="p">}</span>
<span class="c1">// Outputs:</span>
<span class="c1">// 1</span>
<span class="c1">// finally</span>
</code></pre></div></div>
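<p>If you do want Python-like behaviour in JavaScript, one possible workaround is to hand <code class="language-plaintext highlighter-rouge">for..of</code> a wrapper that only exposes <code class="language-plaintext highlighter-rouge">next</code>. With no <code class="language-plaintext highlighter-rouge">return</code> method to find, the loop has nothing to call when it exits early. The helper name <code class="language-plaintext highlighter-rouge">withoutClose</code> and the <code class="language-plaintext highlighter-rouge">cleanupLog</code> array are made up for this sketch; this is not a standard API.</p>

```javascript
// Sketch only: `withoutClose` and `cleanupLog` are hypothetical names for illustration.
const cleanupLog = [];

function* myGenerator() {
  try {
    yield 1;
    yield 2;
    yield 3;
  } finally {
    cleanupLog.push("finally");
  }
}

// Wrap an iterator so that only next() is visible to the consumer.
// for..of looks up `return` on the object it iterates; the wrapper has none,
// so breaking out of the loop leaves the underlying generator suspended.
function withoutClose(iterator) {
  return {
    next: (value) => iterator.next(value),
    [Symbol.iterator]() {
      return this;
    },
  };
}

const it = myGenerator();
const seen = [];

// Take one item, then break without closing the generator
for (const item of withoutClose(it)) {
  seen.push(item);
  break;
}

// Take the rest of the items from the same generator
for (const item of it) {
  seen.push(item);
}

console.log(seen, cleanupLog);
```

<p>The trade-off is that the wrapper takes away the automatic cleanup, so the <code class="language-plaintext highlighter-rouge">finally</code> block only runs if something later exhausts or explicitly closes the generator.</p>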
<p>In Python, the <code class="language-plaintext highlighter-rouge">close</code> gets called in the <code class="language-plaintext highlighter-rouge">__del__</code> of the iterator object. That is, it gets closed when it gets garbage collected. This isn’t really possible in JavaScript due to the lack of any standardised garbage collection scheme, so I guess they had to do something else to make sure ‘finally’ is more likely to actually be called. That said, if it’s iterated outside of a <code class="language-plaintext highlighter-rouge">for..of</code>, it’s easy enough to just not close it and the finally will never run.</p>
<div class="language-js highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// JavaScript</span>
<span class="kd">function</span><span class="o">*</span> <span class="nx">myGenerator</span><span class="p">()</span> <span class="p">{</span>
  <span class="k">try</span> <span class="p">{</span>
    <span class="k">yield</span> <span class="mi">1</span><span class="p">;</span>
    <span class="k">yield</span> <span class="mi">2</span><span class="p">;</span>
  <span class="p">}</span> <span class="k">finally</span> <span class="p">{</span>
    <span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="dl">"</span><span class="s2">finally</span><span class="dl">"</span><span class="p">);</span>
  <span class="p">}</span>
<span class="p">}</span>
<span class="kd">const</span> <span class="nx">iterator</span> <span class="o">=</span> <span class="nx">myGenerator</span><span class="p">();</span>
<span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="nx">iterator</span><span class="p">.</span><span class="nx">next</span><span class="p">().</span><span class="nx">value</span><span class="p">);</span>
<span class="c1">// Outputs:</span>
<span class="c1">// 1</span>
<span class="c1">// (never outputs 'finally')</span>
</code></pre></div></div>
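<p>When iterating by hand like this, the cleanup can be made reliable by calling <code class="language-plaintext highlighter-rouge">return</code> yourself. Below is a minimal sketch of that idea; <code class="language-plaintext highlighter-rouge">takeFirst</code>, <code class="language-plaintext highlighter-rouge">resource</code> and <code class="language-plaintext highlighter-rouge">closeLog</code> are hypothetical names used only for illustration.</p>

```javascript
// Sketch only: `takeFirst`, `resource` and `closeLog` are hypothetical names.
const closeLog = [];

function* resource() {
  try {
    yield 1;
    yield 2;
  } finally {
    closeLog.push("finally");
  }
}

// Read a single value by hand, then close the iterator explicitly,
// mirroring what for..of does on an early exit.
function takeFirst(iterator) {
  try {
    const { value, done } = iterator.next();
    return done ? undefined : value;
  } finally {
    // Not every iterator defines return(), so check before calling it.
    if (typeof iterator.return === "function") {
      iterator.return();
    }
  }
}

const firstValue = takeFirst(resource());
console.log(firstValue, closeLog);
```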
<p><strong>Further Reading</strong></p>
<ul>
<li><a href="https://peps.python.org/pep-0342/#specification-summary">PEP 342 – Coroutines via Enhanced Generators</a>
<ul>
<li>This specifies the behaviour of the <code class="language-plaintext highlighter-rouge">try..finally</code> in Python generators</li>
</ul>
</li>
<li><a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/for...of#description:~:text=If%20the%20for...of%20loop%20exited%20early%20(e.g.%20a%20break%20statement%20is%20encountered%20or%20an%20error%20is%20thrown)%2C%20the%20return()%20method%20of%20the%20iterator%20is%20called%20to%20perform%20any%20cleanup">MDN – for…of</a>
<ul>
<li>This mentions the <code class="language-plaintext highlighter-rouge">break</code> behaviour</li>
</ul>
</li>
</ul>
<h1>Single-pitch Sport Climbing in the Calanques</h1>
<p>2023-05-01, <a href="https://jaredkhan.com/blog/single-pitch-sport-calanques">https://jaredkhan.com/blog/single-pitch-sport-calanques</a></p>
<p><em>The Calanques National Park near Marseille features a huge expanse of beautiful limestone with breathtaking views.
From April 22nd - 30th 2023 I was able to visit with 4 friends and experience a small sample of the wealth of rock climbing on offer in the Massif.</em></p>
<h1 id="day-0---arrival">Day 0 - Arrival</h1>
<p>Having hired a car from Marseille Provence Airport, we arrived at our Airbnb in Saint Cyr Sur Mer, dumped our bags,
and found dinner in the small port town.
Cassis is a more commonly recommended place to stay for a trip to the Calanques but the choice of available Airbnbs
for our trip nudged us a bit further east, and we didn’t mind the additional scenic driving.</p>
<figure>
<img src="/assets/images/calanques/airbnb_view.jpeg" alt="view of La Pointe Grenier from accommodation in saint cyr sur mer" title="The view of La Pointe Grenier from our accommodation in Saint Cyr Sur Mer" />
<figcaption>The view of La Pointe Grenier from our accommodation in Saint Cyr Sur Mer</figcaption>
</figure>
<h1 id="day-1--roche-percée-morgiou">Day 1 – Roche Percée, Morgiou</h1>
<p>For our first day in The Calanques, we were looking for a crag that was far enough from the available parking so as not to be too busy on the weekend. We opened up the guidebook — <a href="https://climb-europe.com/rockclimbingshop/climbing-in-the-calanques-guidebook.html">Climbing in the Calanques (2020)</a> — to find our first target.</p>
<p>For the duration of our trip, the Route de Morgiou and Route de Sormiou roads were closed during the daytime ‘to allow clear access to emergency services’. The guidebook shows you where the road closure gates are on its maps using a black gate symbol (<img src="/assets/images/calanques/gate.png" style="width: 1.5em;" />). The gates can also be spotted on Google street view to get a better idea.</p>
<p>Roche Percée is accessible within a 40-minute walk of the large car park next to Luminy University (43.2325910°, 5.4324360°) and has routes within our preferred grade range (5b-6b).</p>
<p>Having not paid enough attention, and not understanding what the guidebook was trying to say, we strayed down yellow trail number 7 rather than 6a and ended up spending about 2 hours on what should have been a 40-minute approach.
Luckily, the weather was overcast and not too hot, and we made it to the crag without melting.
From Luminy, follow yellow trail 6a until you are near the coordinates. From above the crag, look at the arch to get your bearings, then go back to find the side trail, which is marked with a yellow cross. Follow the piles of rocks down a scramble descent to get round to the west of the crag, and enter through the arch.</p>
<figure>
<img src="/assets/images/calanques/approach_rochee.jpeg" alt="The final few steps to Roche Percée, approaching the arch through the trees." title="The final few steps to Roche Percée, approaching the arch through the trees." />
<figcaption>The final few steps to Roche Percée, approaching the arch through the trees.</figcaption>
</figure>
<p>The <a href="http://www.calanques-parcnational.fr/fr/application-mobile-officielle-mes-calanques">Mes Calanques app</a> has a downloadable map that shows the walking trails along with your location (which you manually update by pressing the location button), which we found very useful throughout the trip.</p>
<p>The main walking trails in The Calanques tend to have markers (lines, or Ls to indicate a turn) painted on rocks in shiny paint. Coloured crosses indicate that the route of that colour does <strong>not</strong> go this way.</p>
<figure>
<img src="/assets/images/calanques/mes_calanques.jpeg" alt="Screenshot from the Map section of the Mes Calanques app." title="Screenshot from the Map section of the Mes Calanques app." />
<figcaption>Screenshot from the Map section of the Mes Calanques app.</figcaption>
</figure>
<figure>
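<img src="/assets/images/calanques/yellow_cross.jpeg" alt="A yellow cross at Escalier Des Géants indicating that this side trail is not part of the current yellow trail." title="A yellow cross at Escalier Des Géants indicating that this side trail is not part of the current yellow trail." />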
<figcaption>A yellow cross at Escalier Des Géants indicating that this side trail is not part of the current yellow trail.</figcaption>
</figure>
<p>The crag itself is a charming, secluded dish with a terrific view of the Calanque de Morgiou and a delightful archway entrance. There is a reasonably sized and flat-ish ledge from which to belay, though when walking around the crag you will need to pay some attention to stay clear of the sheer drop when you’re not tied into anything.</p>
<figure>
<img src="/assets/images/calanques/rochee_crag.jpeg" alt="A view of the Roche Percée crag." title="A view of the Roche Percée crag." />
<figcaption>A view of the Roche Percée crag.</figcaption>
</figure>
<figure>
<img src="/assets/images/calanques/la_baleine.jpeg" alt="Excitedly getting up our first route of the trip, La Baleine. (Photo: Sam Bailey)" title="Excitedly getting up our first route of the trip, La Baleine. (Photo: Sam Bailey)" />
<figcaption>Excitedly getting up our first route of the trip, La Baleine. (Photo: Sam Bailey)</figcaption>
</figure>
<figure>
<img src="/assets/images/calanques/lichen.jpeg" alt="This striking black lichen cuddles the rock at Roche Percée." title="This striking black lichen cuddles the rock at Roche Percée." />
<figcaption>This striking black lichen cuddles the rock at Roche Percée.</figcaption>
</figure>
<p>We thankfully found our way back more directly, returning to the Luminy car park in around 40 mins.</p>
<table>
<thead>
<tr>
<th>Climb Log</th>
</tr>
</thead>
<tbody>
<tr>
<td>La Baleine, 5b, hangdogged</td>
</tr>
<tr>
<td>La Sixième Tortue, 4c, lead. <em>Easy climbing with a strange anchor at the top. Two bolts joined by a chain with no rappel ring, just a maillon to lower off.</em></td>
</tr>
<tr>
<td>Le Boyau, 5a, lead. <em>Beautiful, juggy climbing, with a big quartz crystal in the middle</em></td>
</tr>
</tbody>
</table>
<h1 id="day-2--pouce-sormiou">Day 2 – Pouce, Sormiou</h1>
<figure>
<img src="/assets/images/calanques/thumb.jpeg" alt="A view of the 'thumb' which gives this crag its name." title="A view of the 'thumb' which gives this crag its name." />
<figcaption>A view of the 'thumb' which gives this crag its name. (Photo: Nathan Bailey)</figcaption>
</figure>
<p>We approached this crag from the car park at the start of Route de <strong>Morgiou</strong> (despite this crag overlooking Sormiou). Google Maps struggled to direct us to this car park, despite it not being complicated, so keep an eye on the map. We arrived at about 11:00 on a Monday and got the last space in the car park (get in!).</p>
<p>From the car park we did not walk along the road but along the red route number 5 (not pictured in the guidebook but shown on the IGN map and in the Mes Calanques app) until we met the path marked with a blue climber. This approach took us 30 mins.</p>
<p>Anchors at this crag were excellent, generally being a rappel ring connected to two bolts by a chain. There is also a practice anchor at ground level, making this a good first crag of the trip if you or your climbing partners would like extra practice at tying off before gravity gets involved.</p>
<figure>
<img src="/assets/images/calanques/pracitce_anchor.jpeg" alt="A practice anchor at ground level at Pouce, Sormiou." title="A practice anchor at ground level at Pouce, Sormiou." />
<figcaption>A practice anchor at ground level at Pouce, Sormiou.</figcaption>
</figure>
<figure>
<img src="/assets/images/calanques/la_future_derniere.jpeg" alt="Sam on La Future Dernière." title="Sam on La Future Dernière." />
<figcaption>Sam on La Future Dernière. (Photo: Jamie Bernardi)</figcaption>
</figure>
<table>
<thead>
<tr>
<th>Climb Log</th>
</tr>
</thead>
<tbody>
<tr>
<td>Le Pouce Dans Le Nez, 3c, lead. <em>An essential photo opportunity for any equipped visitors to this crag. I was so excited to sit on the top, I forgot to change into my climbing shoes. The view of the Calanque de Sormiou from the top of ‘the thumb in the nose’ is fantastic. If the wind is gentle or if you are feeling particularly brave, you can stand up at the top.</em></td>
</tr>
<tr>
<td>Anti-Rouille, 5b, onsight</td>
</tr>
<tr>
<td>Iznogood, 6a, lead. <em>Despite the name, an enjoyable route, though some loose rocks towards the top. There are additional bolts above the anchor for you to step up and safely enjoy the view.</em></td>
</tr>
<tr>
<td>C’est Super, 5b, lead. <em>Gentle for the grade, ledges take the edge off the length of this route.</em></td>
</tr>
<tr>
<td>La Future Dernière, 6a, hangdogged. <em>Mainly juggy climbing with the holds juust out of static reach for me.</em></td>
</tr>
</tbody>
</table>
<h1 id="day-3">Day 3</h1>
<p><strong>Cliff of Pastré, Marseilleveyre</strong></p>
<p>Ready for a bit less walking, we turned to Marseilleveyre, where many crags are short walks from city parking and have enjoyable views of the city. We headed for the Cliff of Pastré, managing to find parking near to the guidebook’s recommended location. The walk from there to the cliff is around 15 minutes.</p>
<p>The Grotte sector has enjoyable rock features, including the cave that gives it its name, and a fantastic view over the city.
The Phalanges Resinees Sector has some lovely climbing and again the top provides excellent views.
We were treated to more awesome weather; the 18mph NW wind was no problem at this crag. We finished the day with a drink along the seaside at Marseille.</p>
<figure>
<img src="/assets/images/calanques/grotte.jpeg" alt="The cave that gives the Grotte sector its name." title="The cave that gives the Grotte sector its name." />
<figcaption>The cave that gives the Grotte sector its name.</figcaption>
</figure>
<figure>
<img src="/assets/images/calanques/pastre_climbing.jpeg" alt="Finding my feet on Vent de Panique (right) whilst Jamie (left) performs the final moves on Tapage Diurne. " title="Finding my feet on Vent de Panique (right) whilst Jamie (left) performs the final moves on Tapage Diurne. " />
<figcaption>Finding my feet on Vent de Panique (right) whilst Jamie (left) performs the final moves on Tapage Diurne. (Photo: Sam Bailey)</figcaption>
</figure>
<figure>
<img src="/assets/images/calanques/pastre_view_from_top.jpeg" alt="A view from the top of a great route, Vent de Panique, over Marseille." title="A view from the top of a great route, Vent de Panique, over Marseille." />
<figcaption>A view from the top of a great route, Vent de Panique, over Marseille.</figcaption>
</figure>
<table>
<thead>
<tr>
<th>Climb Log</th>
</tr>
</thead>
<tbody>
<tr>
<td>Grotte Sector – L’ultravoie, 5b+, lead</td>
</tr>
<tr>
<td>Grotte Sector – Graffiti, 5c+, onsight. <em>A nice crack climb with a spooky distance between the last clip and the anchor</em></td>
</tr>
<tr>
<td>Phalanges Resinees Sector – Tapage diurne, 5c, onsight. <em>A lovely layback most of the way up</em></td>
</tr>
<tr>
<td>Phalanges Resinees Sector – Vent de Panique, 5c, lead. <em>Lovely technical moves all the way up, on slabby terrain. Some polish but lots of options available.</em></td>
</tr>
</tbody>
</table>
<h1 id="day-4--resting-in-marseille">Day 4 – ‘Resting’ in Marseille</h1>
<p>After 3 days of climbing and Calanque-ing, we needed some time to grow our fingerprints back. We drove over to Marseille and parked near the old port (a decision that ultimately cost us €32 for the day).
From there, we queued up for boat tickets to Frioul Island and wandered around the Mucem museum, with its bizarre concrete lace-clad square, until our boat was ready to depart.</p>
<figure>
<img src="/assets/images/calanques/old_port_mirror.jpeg" alt="Taking a moment to reflect in the mirror ceiling of a covered area of the Old Port in Marseille." title="Taking a moment to reflect in the mirror ceiling of a covered area of the Old Port in Marseille." />
<figcaption>Taking a moment to reflect in the mirror ceiling of a covered area of the Old Port in Marseille.</figcaption>
</figure>
<figure>
<img src="/assets/images/calanques/mucem.jpeg" alt="Gazing through the surreal concrete lacing of the Mucem." title="Gazing through the surreal concrete lacing of the Mucem." />
<figcaption>Gazing through the surreal concrete lacing of the Mucem.</figcaption>
</figure>
<p>We enjoyed a picnic on the island and explored the eerie ruins of unfinished German WW2-era ammunition shelters throughout the Fort of Ratonneau. Nowadays, the fort is home to hundreds of gulls, who negotiate with tourists for nesting space. Inevitably we walked more on this ‘rest’ day than any other day of the trip.</p>
<figure>
<img src="/assets/images/calanques/fort.jpeg" alt="The Fort of Ratonneau." title="The Fort of Ratonneau." />
<figcaption>The Fort of Ratonneau. (Photo: Jamie Bernardi)</figcaption>
</figure>
<figure>
<img src="/assets/images/calanques/marseille_streets.jpeg" alt="Returning from dinner at Le Cours Julien, Marseille." title="Returning from dinner at Le Cours Julien, Marseille." />
<figcaption>Returning from dinner at Le Cours Julien, Marseille.</figcaption>
</figure>
<h1 id="day-5--escalier-des-géants-les-goudes">Day 5 – Escalier Des Géants, Les Goudes</h1>
<p>This day featured another easy approach, parking near the end of Boulevard Alexandre Delabre (43.211710°, 5.351372°) and following the easy instructions of the guidebook for about 15 minutes.
The brief spits of rain on this day cooled us down just enough to have a go at <em>La Fissure du Géant.</em>
Some of the group enjoyed the length of the routes offered here, and say this crag was a highlight of the trip.</p>
<figure>
<img src="/assets/images/calanques/falling_fissure.jpeg" alt="Mid-crumble as my arms give up on La Fissure du Géant." title="Mid-crumble as my arms give up on La Fissure du Géant." />
<figcaption>Mid-crumble as my arms give up on La Fissure du Géant. (Photo: Sam Bailey)</figcaption>
</figure>
<figure>
<img src="/assets/images/calanques/fissure.jpeg" alt="Back for the finish on La Fissure du Géant." title="Back for the finish on La Fissure du Géant." />
<figcaption>Back for the finish on La Fissure du Géant. (Photo: Sam Bailey)</figcaption>
</figure>
<table>
<thead>
<tr>
<th>Climb Log</th>
</tr>
</thead>
<tbody>
<tr>
<td>Miax Sector – Et Jean, 5b+, onsight. <em>A sketchy start. It’s easy to stray left into Le Dièdre; stay on the right for the interesting holds at the top. Your rope will run over some spiky rocks at the top whether you stray left onto a different anchor or stay right, so you may wish to extend the anchor with a sling if a few of your party are doing this route.</em></td>
</tr>
<tr>
<td>Miax Sector – Pièce Montée, 5c, lead. <em>This slabby route feels a little scary with a bolt missing just after halfway.</em></td>
</tr>
<tr>
<td>Right Side Cliff – La Fissure du Géant, 6b, hangdogged. <em>This huge route features cool laybacks along the crack and crimps where the crack is too tight. A ledge offers a good break before the crux through the notch. This route betrayed my lack of stamina, with my forearms melting off the route about 3/4 of the way up.</em></td>
</tr>
<tr>
<td>Right Side Cliff – La Lolotte, 5c, lead. <em>Some polish on this.</em></td>
</tr>
</tbody>
</table>
<h1 id="day-6--en-vau">Day 6 – En Vau</h1>
<p>En Vau is billed as one of the most scenic and classic Calanques and is incredibly popular with tourists and beach-goers.
Many of the climbing routes in this area have been there since the 1930s or earlier.
It took us 90 minutes in a group of 5 to walk from the parking at the start of the Route Gaston Rebuffat (43.23712135474467°, 5.4996331475356826°),
named after the famous climber from the region, down to the beach, following clear signposts to Calanque d’En Vau all the way.
The view on this walk is mostly of a plain-looking forest, until the final 20 minutes or so when you are really in the valley and the beauty of the place reveals itself.</p>
<figure>
<img src="/assets/images/calanques/en_vau_approach.jpeg" alt="Approaching the beach at En Vau." title="Approaching the beach at En Vau." />
<figcaption>Approaching the beach at En Vau.</figcaption>
</figure>
<figure>
<img src="/assets/images/calanques/en_vau_beach.jpeg" alt="The beach of Calanque d'En Vau as seen from the top of La Ratopenado." title="The beach of Calanque d'En Vau as seen from the top of La Ratopenado." />
<figcaption>The beach of Calanque d'En Vau as seen from the top of La Ratopenado.</figcaption>
</figure>
<p>The beach is of course the hub of the place, with around 200 people sunbathing and swimming on the Friday lunchtime that we visited.
There are many sectors near this beach and we had set our sights on the Petite Aiguille, hoping for terrific views and classic climbing.
Unfortunately, the seaward side of the Petite Aiguille has enough bush cover from the beach for some to have used it as a handy toilet.
The routes on this side also looked to us to be poorly equipped, with old bolts and unsafe starts (a typical clipstick will not help you reach the first bolts on these).
The second bolt of <em>La Diagonale</em> is stained with the rusty remains of multiple maillon rapide bails.
We were persuaded to start on the other side of the needle.</p>
<figure>
<img src="/assets/images/calanques/finding_the_line.jpeg" alt="Trying to find a decent line of bolts up the seaward side of Petite Aiguille." title="Trying to find a decent line of bolts up the seaward side of Petite Aiguille." />
<figcaption>Trying to find a decent line of bolts up the seaward side of Petite Aiguille.</figcaption>
</figure>
<p>The routes on the valley side of the Petite Aiguille were somewhat better equipped, though the line of <em>La B.B</em> was not evident to us,
and members of the group ended up merging it with <em>Directe Nord-Ouest</em> in order to get to an anchor.
Reaching the top of this needle one way or another provides an excellent view of the Calanque, as long as you don’t mind an audience.</p>
<p>After a picnic and a dip in the crystal clear water, we looked for some routes to finish our En Vau experience.
<em>Dalle Du Chat</em> looked interesting and is also very close to the beach so we wandered over to take a look.
Unfortunately, in the few years since the guidebook was published, there has been a major rockfall on the left of this sector (to the left of <em>Le Gynécologue</em>),
and the right of the sector (<em>Passagers Du Vent</em> - <em>Sale Temps Pour Les Nains</em>) has been fenced off.
I stared wide-eyed at the boulders at the base of the cliff with the knowledge that they were recently the top, gulped hard, and went to look for something else.</p>
<figure>
<img src="/assets/images/calanques/spotthediff.jpeg" alt="The guidebook's topo of Dalle Du Chat, alongside the current state of this area." title="The guidebook's topo of Dalle Du Chat, alongside the current state of this area." />
<figcaption>The guidebook's topo of Dalle Du Chat, alongside the current state of this area.</figcaption>
</figure>
<p>A few moments’ walk away from the beach along the main path, followed by a brief clamber up a scree path, brings you to the <em>Grande Aiguille</em>,
which was a lovely spot for some in the group to climb the very pleasing <em>Le G.H.M.,</em> appreciating the eye of the needle at the top, and for others to dry off in the sun from their swim.</p>
<p>The hike back, past beach-goers in just their sea-logged boxer shorts, was no slower than the walk in, taking us 80 minutes.</p>
<figure>
<img src="/assets/images/calanques/le_ghm.jpeg" alt="Sam beta-hunting on Le G.H.M." title="Sam beta-hunting on Le G.H.M." />
<figcaption>Sam beta-hunting on Le G.H.M.</figcaption>
</figure>
<figure>
<img src="/assets/images/calanques/grande_aiguille_needle.jpeg" alt="Looking through the eye of the Grande Aiguille." title="Looking through the eye of the Grande Aiguille." />
<figcaption>Looking through the eye of the Grande Aiguille.</figcaption>
</figure>
<figure>
<img src="/assets/images/calanques/en_vau_return.jpeg" alt="Still hiking back from Calanque d'En Vau." title="Still hiking back from Calanque d'En Vau." />
<figcaption>Still hiking back from Calanque d'En Vau.</figcaption>
</figure>
<figure>
<img src="/assets/images/calanques/en_vau_sunset.jpeg" alt="The sun setting as we finally finish our return to the car park at the start of Route de Gaston Rebuffat." title="The sun setting as we finally finish our return to the car park at the start of Route de Gaston Rebuffat." />
<figcaption>The sun setting as we finally finish our return to the car park at the start of Route de Gaston Rebuffat.</figcaption>
</figure>
<table>
<thead>
<tr>
<th>Climb Log</th>
</tr>
</thead>
<tbody>
<tr>
<td>Petite Aiguille – La Ratopenado, 5b, lead. <em>Long spacing between the bolts throughout and a very long gap from the last clip to the anchor make this otherwise fine route a little nervy.</em></td>
</tr>
<tr>
<td>Grande Aiguille – Le G.H.M., 5c, lead. <em>Lovely climbing, with the crux just above the cave. Don’t forget to look through the eye of the needle at the top!</em></td>
</tr>
</tbody>
</table>
<h1 id="day-7--vallon-des-escampons">Day 7 – Vallon Des Escampons</h1>
<p>Having spent our enthusiasm for walking on our En Vau day, we went for another crag that is very close to available parking.
We parked in the car park at the start of the road to Morgiou (43.222840°, 5.417970°).
The residential street that the guidebook recommends parking on is small and was closed to non-residents when we were there, which seems fair enough.
The walk in took about 15 minutes.</p>
<figure>
<img src="/assets/images/calanques/high_foot_low_effort.jpeg" alt="Scratching my head on Bon Secours. 'Surely I'm not supposed to put my foot up there'." title="Scratching my head on Bon Secours. 'Surely I'm not supposed to put my foot up there'." />
<figcaption>Scratching my head on Bon Secours. 'Surely I'm not supposed to put my foot up there'.</figcaption>
</figure>
<p>A fairly inconspicuous sign on a tree told us that the five climbs <em>Rififi Allah Fédé - Jamika</em> were closed during our visit due to nesting birds and suggested they’d remain closed until July 2023.</p>
<p>The equipment in this area is more generous than in others, with bolts spaced around 1m apart.
This gave us nervous climbers the security to try some slightly harder routes, making a satisfying end to our climbing for the week.</p>
<figure>
<img src="/assets/images/calanques/a_beer_for_the_road.jpeg" alt="Nathan unfazed by the ants on Une Bière Pour La Route." title="Nathan unfazed by the ants on Une Bière Pour La Route." />
<figcaption>Nathan unfazed by the ants on Une Bière Pour La Route. (Photo: Ben Papp)</figcaption>
</figure>
<figure>
<img src="/assets/images/calanques/papinade.jpeg" alt="Nathan finding a way up the crack of Papinade." title="Nathan finding a way up the crack of Papinade." />
<figcaption>Nathan finding a way up the crack of Papinade.</figcaption>
</figure>
<table>
<thead>
<tr>
<th>Climb Log</th>
</tr>
</thead>
<tbody>
<tr>
<td>Escampons Left – Le Pharo, 5c, hangdogged. <em>A bit crimpy at the top, though it’s possible to cheat into the crack on the right for an easier finish.</em></td>
</tr>
<tr>
<td>Escampons Left – Bon Secours, 5c, hangdogged. <em>A flexible high-foot move near the end, which can be avoided by traversing round to the left.</em></td>
</tr>
<tr>
<td>Escampons Right – Papinade, 6a, onsight. <em>A cool vertical crack to start followed by a juggy traverse to the left, then a crimpy slab finish, this one has it all!</em></td>
</tr>
<tr>
<td>Escampons Right – Une Bière Pour La Route, 6a, onsight. <em>Enjoyable moves if you can ignore the hollow-feeling rock and the ants, wasps, and spiderwebs you might encounter on the way. Anchor is up to the left a bit from the last clip, above some bushes.</em></td>
</tr>
</tbody>
</table>
<h1 id="conclusion">Conclusion</h1>
<p>We were spoilt by fantastic weather and a great variety of climbing locations on our trip to the Calanques.
The availability of beginner-grade single-pitch sport climbs drew us to the area and brought us up close and personal with cliffs and ridges that inspire us to climb harder and further
and to explore more of the beautiful rock that’s available to us.</p>
<p>We all left with the same sentiment: “now we <em>have</em> to learn how to multi-pitch”.</p>
<figure>
<img src="/assets/images/calanques/squad.jpeg" alt="The squad." title="The squad." />
<figcaption>The squad.</figcaption>
</figure>
<figure>
<img src="/assets/images/calanques/cuticles.jpeg" alt="The result of 6 days of cuticle abuse." title="The result of 6 days of cuticle abuse." />
<figcaption>The result of 6 days of cuticle abuse.</figcaption>
</figure>
The Calanques National Park near Marseille features a huge expanse of beautiful limestone with breathtaking views. From April 22nd - 30th 2023 I was able to visit with 4 friends and experience a small sample of the wealth of rock climbing on offer in the Massif.
Use a Switch Pro Controller with Dolphin Emulator on macOS Ventura
2022-10-29T14:25:00+00:00
https://jaredkhan.com/blog/switch-pro-controller-dolphin
<h2 id="step-1-connect-the-controller-to-the-mac">Step 1: Connect the controller to the Mac</h2>
<p>Using a Nintendo Switch Pro Controller on a Mac requires macOS Ventura.</p>
<ul>
<li>Open <strong>System Settings ▶︎ Bluetooth</strong></li>
<li>Hold down the <strong><small>SYNC</small></strong> button on the top of the controller next to the USB port and wait for the LEDs at the bottom of the controller to light back and forth</li>
<li>Click <strong>Connect</strong> next to ‘Pro Controller’ under ‘Nearby Devices’</li>
</ul>
<p><img src="/assets/images/switch_pro_controller_dolphin/connecting_pro_controller.png" alt="Connecting a Nintendo Switch Pro Controller in macOS Ventura System Settings" /></p>
<h2 id="step-2-set-up-the-controller-in-dolphin">Step 2: Set up the controller in Dolphin</h2>
<ul>
<li>Open Dolphin and click on <strong>Controllers</strong></li>
<li>Choose ‘Standard Controller’ in the port you want to configure</li>
</ul>
<p><img src="/assets/images/switch_pro_controller_dolphin/configuring_standard_controller.png" alt="Configuring a Standard Controller in Dolphin on Mac" /></p>
<p><br /></p>
<p><strong>If you’d like to use my recommended settings:</strong></p>
<ul>
<li>Download <a href="/assets/switch_pro_controller.ini">switch_pro_controller.ini</a></li>
<li>Move it to <code class="language-plaintext highlighter-rouge">~/Library/Application Support/Dolphin/Config/Profiles/GCPad/Switch Pro Controller.ini</code>. This can be done in Terminal like so:
<br />
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">mkdir</span> <span class="nt">-p</span> ~/Library/<span class="s2">"Application Support"</span>/Dolphin/Config/Profiles/GCPad <span class="o">&&</span> <span class="se">\</span>
<span class="nb">mv</span> ~/Downloads/switch_pro_controller.ini <span class="se">\</span>
~/Library/<span class="s2">"Application Support"</span>/Dolphin/Config/Profiles/GCPad/<span class="s2">"Switch Pro Controller.ini"</span>
</code></pre></div> </div>
</li>
<li>Now, in Dolphin, press the <strong>Configure</strong> button</li>
<li>In the profile dropdown, choose ‘Switch Pro Controller’ and press <strong>Load</strong>. The buttons should be populated with my recommended settings and you can adjust them from there</li>
<li>If you are configuring multiple controllers, make sure that the device in the top left is the one you want to be configuring</li>
<li>For comfort and consistency with the GameCube controller layout, I’ve set:
<ul>
<li><strong>ZR</strong> on the Pro Controller to be <strong>R</strong> on the emulated GameCube controller</li>
<li>and <strong>R</strong> on the Pro Controller to be <strong>Z</strong> on the emulated GameCube controller</li>
</ul>
</li>
</ul>
<p><img alt="Loading my prebuilt profile" src="/assets/images/switch_pro_controller_dolphin/load_profile.png" /></p>
<p><br /></p>
<p><strong>If you want to configure the controller yourself:</strong></p>
<ul>
<li>Press <strong>Configure</strong></li>
<li>Select ‘SDL/0/Pro Controller’ in the dropdown that appears (the number might differ)</li>
<li>Go through each button, clicking on the blank box and then pressing the button on the controller that you want to use</li>
<li>For <strong>Control Stick ▶︎ Up</strong>, for example, simply push the left stick on the Pro Controller upwards, and Dolphin will figure out the axis that you mean</li>
<li>After setting up the sticks, you may wish to calibrate them by pressing <strong>Calibrate</strong> and slowly moving each stick in the widest possible circle</li>
<li>If you want to clear or add alternative inputs to any of the configured buttons, right click on the configured button. This also gives you a live view of all the channels of the connected controller so you can decide on your preferred configuration</li>
</ul>
<p><br /></p>
<p>That’s it. The controller should now function as an emulated GameCube controller in the slot that you chose.</p>
Swift’s Copy-on-write Optimisation
2021-04-05T11:45:00+00:00
https://jaredkhan.com/blog/swift-copy-on-write
<p>Swift’s Arrays have value semantics. They also have a copy-on-write optimisation:</p>
<blockquote>
<p>Collections defined by the standard library like arrays, dictionaries, and strings use an optimization to reduce the performance cost of copying. Instead of making a copy immediately, these collections share the memory where the elements are stored between the original instance and any copies. If one of the copies of the collection is modified, the elements are copied just before the modification. The behavior you see in your code is always as if a copy took place immediately.</p>
</blockquote>
<p><a href="https://docs.swift.org/swift-book/LanguageGuide/ClassesAndStructures.html#ID88">https://docs.swift.org/swift-book/LanguageGuide/ClassesAndStructures.html#ID88</a></p>
<p>Here are a few points to help us understand how the copy-on-write optimisation works:</p>
<ul>
<li>The Array type is a value type.</li>
<li>Array has a stored property which is a <strong>reference</strong> to a buffer object which is where the Array data actually lives (on the heap).</li>
<li>When an Array is copied, the copy gets a reference to the same buffer. The reference count of the buffer has now increased.</li>
<li>When a mutation is applied to an Array, the reference count of the buffer is checked. If it is greater than 1, the buffer is copied first. Otherwise, the buffer is not copied. In this sense it’s not just copy-on-write but copy-on-write-when-necessary.</li>
<li>This behaviour is hand-written in the Standard Library code for Array and other container types; it’s <strong>not</strong> an automatic feature of Swift value types.</li>
</ul>
<p>With this in mind we should be able to reason about the memory performance of the following snippets. I’ve used pretty large numbers throughout, so that it’s easy to see the changes in memory usage in Activity Monitor (or other tools).</p>
<p>I’ve put these examples into <a href="https://gist.github.com/jaredkhan/5c77a7f79c6adcdf58c9a08c9d53a023">copy_on_write_examples.swift</a> in case you want to run them for yourself.</p>
<p>See if you can predict what the approximate memory usage will be at each of the numbered comments in these snippets.</p>
<h3 id="copying-a-large-array">Copying a large array</h3>
<div class="language-swift highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">var</span> <span class="nv">x</span> <span class="o">=</span> <span class="kt">Array</span><span class="p">(</span><span class="nv">repeating</span><span class="p">:</span> <span class="kt">Int64</span><span class="p">(</span><span class="mi">1</span><span class="p">),</span> <span class="nv">count</span><span class="p">:</span> <span class="mi">100_000_000</span><span class="p">)</span>
<span class="c1">// Memory usage is ~800MB</span>
<span class="k">var</span> <span class="nv">y</span> <span class="o">=</span> <span class="n">x</span>
<span class="c1">// (1)</span>
<span class="n">y</span><span class="o">.</span><span class="nf">append</span><span class="p">(</span><span class="mi">2</span><span class="p">)</span>
<span class="c1">// (2)</span>
</code></pre></div></div>
<p>This is a fairly simple case.</p>
<p>At (1) we have copied x into y but not yet mutated either, so both still reference the same buffer and memory usage is still around 800MB.</p>
<p>At (2) we have mutated y’s data, so Swift notices that the buffer is not uniquely referenced and copies it in full before the mutation. Memory usage is now around 1600MB.</p>
<h3 id="nested-arrays">Nested Arrays</h3>
<div class="language-swift highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">var</span> <span class="nv">x</span> <span class="o">=</span> <span class="p">[</span>
<span class="kt">Array</span><span class="p">(</span><span class="nv">repeating</span><span class="p">:</span> <span class="kt">Int64</span><span class="p">(</span><span class="mi">1</span><span class="p">),</span> <span class="nv">count</span><span class="p">:</span> <span class="mi">100_000_000</span><span class="p">),</span>
<span class="kt">Array</span><span class="p">(</span><span class="nv">repeating</span><span class="p">:</span> <span class="kt">Int64</span><span class="p">(</span><span class="mi">1</span><span class="p">),</span> <span class="nv">count</span><span class="p">:</span> <span class="mi">100_000_000</span><span class="p">),</span>
<span class="p">]</span>
<span class="c1">// Memory usage is ~1600MB</span>
<span class="k">var</span> <span class="nv">y</span> <span class="o">=</span> <span class="n">x</span>
<span class="n">y</span><span class="o">.</span><span class="nf">append</span><span class="p">([])</span>
<span class="c1">// (1)</span>
<span class="n">y</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">.</span><span class="nf">append</span><span class="p">(</span><span class="mi">2</span><span class="p">)</span>
<span class="c1">// (2)</span>
</code></pre></div></div>
<p>There’s a little bit of nesting going on here.</p>
<p>This time, x’s buffer is not storing a lot of data; rather, it stores 2 small Array structs which themselves hold references to large buffers. When we copy x into y and then append to y, Swift copies the outer buffer (the one holding the small Array structs) but does not copy the large buffers that those elements reference.</p>
<p>At (1), memory usage is still around 1600MB.
At (2), we’ve mutated one of the large buffers, so that one buffer is copied first and we expect a memory usage of about 2400MB.</p>
<h3 id="array-within-a-struct">Array within a struct</h3>
<div class="language-swift highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">struct</span> <span class="kt">ThingWithArray</span> <span class="p">{</span>
<span class="k">var</span> <span class="nv">name</span><span class="p">:</span> <span class="kt">String</span>
<span class="k">var</span> <span class="nv">array</span><span class="p">:</span> <span class="p">[</span><span class="kt">Int64</span><span class="p">]</span>
<span class="p">}</span>
<span class="k">let</span> <span class="nv">x</span> <span class="o">=</span> <span class="kt">ThingWithArray</span><span class="p">(</span>
<span class="nv">name</span><span class="p">:</span> <span class="s">"Nice Thing"</span><span class="p">,</span>
<span class="nv">array</span><span class="p">:</span> <span class="kt">Array</span><span class="p">(</span><span class="nv">repeating</span><span class="p">:</span> <span class="kt">Int64</span><span class="p">(</span><span class="mi">1</span><span class="p">),</span> <span class="nv">count</span><span class="p">:</span> <span class="mi">100_000_000</span><span class="p">)</span>
<span class="p">)</span>
<span class="c1">// Memory usage is ~800MB</span>
<span class="k">var</span> <span class="nv">y</span> <span class="o">=</span> <span class="n">x</span>
<span class="c1">// (1)</span>
<span class="n">y</span><span class="o">.</span><span class="n">name</span> <span class="o">=</span> <span class="s">"Nicer Thing"</span>
<span class="c1">// (2)</span>
<span class="n">y</span><span class="o">.</span><span class="n">array</span><span class="o">.</span><span class="nf">append</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span>
<span class="c1">// (3)</span>
</code></pre></div></div>
<p>This is similar to the last case. The struct itself doesn’t hold much data; the <code class="language-plaintext highlighter-rouge">array</code> property just holds a reference to a buffer which stores a lot of data.</p>
<p>At (1) the struct is copied into y, so y now has a reference to the same array buffer and the memory usage is still around 800MB.</p>
<p>At (2), changing y’s name doesn’t touch the array buffer, so there is still no big change in memory usage.</p>
<p>At (3), the array buffer is mutated via y, so the buffer is now copied and memory usage goes to about 1600MB.</p>
<h3 id="many-structs">Many structs</h3>
<div class="language-swift highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="kd">struct</span> <span class="kt">Thing</span> <span class="p">{</span>
<span class="c1">// A struct with about 80 bytes</span>
<span class="k">let</span> <span class="nv">a</span><span class="p">:</span> <span class="kt">Int64</span> <span class="o">=</span> <span class="mi">0</span>
<span class="k">let</span> <span class="nv">b</span><span class="p">:</span> <span class="kt">Int64</span> <span class="o">=</span> <span class="mi">0</span>
<span class="k">let</span> <span class="nv">c</span><span class="p">:</span> <span class="kt">Int64</span> <span class="o">=</span> <span class="mi">0</span>
<span class="k">let</span> <span class="nv">d</span><span class="p">:</span> <span class="kt">Int64</span> <span class="o">=</span> <span class="mi">0</span>
<span class="k">let</span> <span class="nv">e</span><span class="p">:</span> <span class="kt">Int64</span> <span class="o">=</span> <span class="mi">0</span>
<span class="k">let</span> <span class="nv">f</span><span class="p">:</span> <span class="kt">Int64</span> <span class="o">=</span> <span class="mi">0</span>
<span class="k">let</span> <span class="nv">g</span><span class="p">:</span> <span class="kt">Int64</span> <span class="o">=</span> <span class="mi">0</span>
<span class="k">let</span> <span class="nv">h</span><span class="p">:</span> <span class="kt">Int64</span> <span class="o">=</span> <span class="mi">0</span>
<span class="k">let</span> <span class="nv">i</span><span class="p">:</span> <span class="kt">Int64</span> <span class="o">=</span> <span class="mi">0</span>
<span class="k">let</span> <span class="nv">j</span><span class="p">:</span> <span class="kt">Int64</span> <span class="o">=</span> <span class="mi">0</span>
<span class="p">}</span>
<span class="k">let</span> <span class="nv">thing</span> <span class="o">=</span> <span class="kt">Thing</span><span class="p">()</span>
<span class="k">let</span> <span class="nv">x</span> <span class="o">=</span> <span class="kt">Array</span><span class="p">(</span><span class="nv">repeating</span><span class="p">:</span> <span class="n">thing</span><span class="p">,</span> <span class="nv">count</span><span class="p">:</span> <span class="mi">10_000_000</span><span class="p">)</span>
<span class="c1">// (1)</span>
</code></pre></div></div>
<p>Structs themselves do not have automatic copy-on-write semantics, so filling an array with many copies of a simple struct really does store every copy separately, even though none of them is ever mutated. Memory usage at (1) is about 800MB.</p>
<h3 id="mutating-when-uniquely-referenced">Mutating when uniquely referenced</h3>
<div class="language-swift highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="k">var</span> <span class="nv">x</span> <span class="o">=</span> <span class="kt">Array</span><span class="p">(</span><span class="nv">repeating</span><span class="p">:</span> <span class="kt">Int64</span><span class="p">(</span><span class="mi">1</span><span class="p">),</span> <span class="nv">count</span><span class="p">:</span> <span class="mi">100_000_000</span><span class="p">)</span>
<span class="c1">// Memory usage is ~800MB</span>
<span class="k">if</span> <span class="kc">true</span> <span class="p">{</span>
<span class="k">let</span> <span class="nv">y</span> <span class="o">=</span> <span class="n">x</span>
<span class="p">}</span>
<span class="n">x</span><span class="o">.</span><span class="nf">append</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span>
<span class="c1">// (1)</span>
</code></pre></div></div>
<p>Here we copy x into y but then y goes out of scope so the reference count on the large array buffer drops back down to 1. At (1), when we’ve mutated the array, the reference count is still 1 and so no copy is taken and the memory usage is still ~800MB.</p>
<h2 id="references-and-further-reading">References and Further Reading</h2>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Value_semantics">Wikipedia — Value Semantics</a></li>
<li><a href="https://github.com/apple/swift/blob/main/docs/OptimizationTips.rst#advice-use-copy-on-write-semantics-for-large-values">swift/docs/OptimizationTips.rst — Use copy-on-write semantics for large values</a>
<ul>
<li>This discusses how to add copy-on-write behaviour to your own value types using isKnownUniquelyReferenced</li>
</ul>
</li>
<li><a href="https://www.youtube.com/watch?v=BXJIIQ-B4-E">Ben Cohen - Fast Safe Mutable State</a>
<ul>
<li>Slightly related: the manager of the Swift Standard Library team explains copy-on-write and some interesting cases where the Standard Library takes extra care to prevent buffers being referenced multiple times unnecessarily.</li>
</ul>
</li>
<li><a href="https://github.com/apple/swift/blob/main/stdlib/public/core/Array.swift">swift/stdlib/public/core/Array.swift</a>
<ul>
<li><a href="https://github.com/apple/swift/blob/main/stdlib/public/core/Array.swift#L310">Declaration of the buffer reference</a></li>
<li><a href="https://github.com/apple/swift/blob/main/stdlib/public/core/Array.swift#L347">The method that is called before every mutating method to copy the buffer if necessary</a></li>
<li><a href="https://github.com/apple/swift/blob/main/stdlib/public/core/ContiguousArrayBuffer.swift#L256">The definition of the buffer</a></li>
</ul>
</li>
</ul>
Running Mypy in Pre-commit
2020-12-14T22:05:00+00:00
https://jaredkhan.com/blog/mypy-pre-commit
<p><em>The only thing worse than not type-checking your code is thinking you are type-checking it when you aren’t.</em></p>
<p>This post is about running Mypy in a Git pre-commit hook using the <a href="https://pre-commit.com/">Pre-commit</a> framework. Running Mypy is a little fiddly in itself, and <a href="https://github.com/pre-commit/mirrors-mypy">pre-commit/mirrors-mypy</a> (the de facto way to call Mypy in Pre-commit) calls Mypy in a slightly opinionated way that may introduce further confusion or hide errors you want to see.</p>
<p><strong>Three take-away points if you’re in a hurry:</strong></p>
<ul>
<li>Make sure you run Mypy on all files, not just those that have changed</li>
<li>Make sure Mypy has access to the installed dependencies of the code it is type-checking</li>
<li>Be careful with the use of flags that reduce the strictness of Mypy like <code class="language-plaintext highlighter-rouge">--ignore-missing-imports</code></li>
</ul>
<p>Here, I show you how to make your own Mypy hook that suits your needs, in <em>3 only-somewhat-fiddly steps</em>:</p>
<ol>
<li>Running Mypy correctly outside of Pre-commit <a href="#step-1-running-mypy-correctly-outside-of-pre-commit">[Jump]</a></li>
<li>Creating your own Pre-commit hook <a href="#step-2-creating-our-own-pre-commit-hook">[Jump]</a></li>
<li>Giving Mypy access to your project dependencies <a href="#step-3-giving-the-mypy-hook-access-to-dependencies">[Jump]</a></li>
</ol>
<h2 id="a-solution-that-works-in-my-case">A solution that works in my case</h2>
<p>Before discussing the gory details and alternatives, here’s a solution that works for my project.</p>
<p>I add a <code class="language-plaintext highlighter-rouge">mypy.ini</code>:</p>
<div class="language-toml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nn">[mypy]</span>
<span class="c"># mypy_path will vary (and may not be necessary) </span>
<span class="c"># for your project layout.</span>
<span class="py">mypy_path</span><span class="p">=</span><span class="err">./src:./tests</span>
<span class="c"># Explicitly blacklist modules in use</span>
<span class="c"># that don't have type stubs.</span>
<span class="nn">[mypy-pytest.*]</span>
<span class="py">ignore_missing_imports</span> <span class="p">=</span> <span class="err">True</span>
<span class="nn">[mypy-pyproj.*]</span>
<span class="py">ignore_missing_imports</span> <span class="p">=</span> <span class="err">True</span>
</code></pre></div></div>
<p>and then add a script at <code class="language-plaintext highlighter-rouge">./run-mypy</code>:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">#!/usr/bin/env bash</span>
<span class="c"># A script for running mypy, </span>
<span class="c"># with all its dependencies installed.</span>
<span class="nb">set</span> <span class="nt">-o</span> errexit
<span class="c"># Change directory to the project root directory.</span>
<span class="nb">cd</span> <span class="s2">"</span><span class="si">$(</span><span class="nb">dirname</span> <span class="s2">"</span><span class="nv">$0</span><span class="s2">"</span><span class="si">)</span><span class="s2">"</span>
<span class="c"># Install the dependencies into the mypy env.</span>
<span class="c"># Note that this can take seconds to run.</span>
<span class="c"># In my case, I need to use a custom index URL.</span>
<span class="c"># Avoid pip spending time quietly retrying since </span>
<span class="c"># likely cause of failure is lack of VPN connection.</span>
pip <span class="nb">install</span> <span class="nt">--editable</span> <span class="nb">.</span> <span class="se">\</span>
<span class="nt">--index-url</span> https://custom-index-url.com/simple <span class="se">\</span>
<span class="nt">--retries</span> 1 <span class="se">\</span>
<span class="nt">--no-input</span> <span class="se">\</span>
<span class="nt">--quiet</span>
<span class="c"># Run on all files, </span>
<span class="c"># ignoring the paths passed to this script,</span>
<span class="c"># so as not to miss type errors.</span>
<span class="c"># My repo makes use of namespace packages.</span>
<span class="c"># Use the namespace-packages flag </span>
<span class="c"># and specify the package to run on explicitly.</span>
<span class="c"># Note that we do not use --ignore-missing-imports, </span>
<span class="c"># as this can give us false confidence in our results.</span>
mypy <span class="nt">--package</span> acme <span class="nt">--namespace-packages</span>
</code></pre></div></div>
<p>and then define a custom Pre-commit hook that runs that script, in <code class="language-plaintext highlighter-rouge">./.pre-commit-config.yaml</code>:</p>
<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># .pre-commit-config.yaml</span>
<span class="na">repos</span><span class="pi">:</span>
<span class="pi">-</span> <span class="na">repo</span><span class="pi">:</span> <span class="s">local</span>
  <span class="c1"># We do not use pre-commit/mirrors-mypy, </span>
  <span class="c1"># as it comes with opinionated defaults </span>
  <span class="c1"># (like --ignore-missing-imports)</span>
  <span class="c1"># and is difficult to configure to run </span>
  <span class="c1"># with the dependencies correctly installed.</span>
  <span class="na">hooks</span><span class="pi">:</span>
  <span class="pi">-</span> <span class="na">id</span><span class="pi">:</span> <span class="s">mypy</span>
    <span class="na">name</span><span class="pi">:</span> <span class="s">mypy</span>
    <span class="na">entry</span><span class="pi">:</span> <span class="s2">"</span><span class="s">./run-mypy"</span>
    <span class="na">language</span><span class="pi">:</span> <span class="s">python</span>
    <span class="c1"># use your preferred Python version</span>
    <span class="na">language_version</span><span class="pi">:</span> <span class="s">python3.7</span>
    <span class="na">additional_dependencies</span><span class="pi">:</span> <span class="pi">[</span><span class="s2">"</span><span class="s">mypy==0.790"</span><span class="pi">]</span>
    <span class="na">types</span><span class="pi">:</span> <span class="pi">[</span><span class="nv">python</span><span class="pi">]</span>
    <span class="c1"># use require_serial so that script</span>
    <span class="c1"># is only called once per commit</span>
    <span class="na">require_serial</span><span class="pi">:</span> <span class="no">true</span>
    <span class="c1"># Print the number of files as a sanity-check </span>
    <span class="na">verbose</span><span class="pi">:</span> <span class="no">true</span>
</code></pre></div></div>
<p>You’ll have to adapt this to your own project structure and strictness/performance needs.
To expose all the issues this tries to cover, we’ll build it up in 3 steps.</p>
<h2 id="step-1-running-mypy-correctly-outside-of-pre-commit">Step 1: Running Mypy correctly outside of Pre-commit</h2>
<p>Before thinking about Pre-commit, we should make sure we can run Mypy directly in the desired way.</p>
<h3 id="running-on-the-correct-files">Running on the correct files</h3>
<p>Running <code class="language-plaintext highlighter-rouge">mypy .</code> in the root of your project will often <strong>not</strong> do what you need it to. You should play around, keeping an eye on Mypy output, to make sure Mypy is running on all the files that you want. This involves choosing:</p>
<ul>
<li>Whether to specify the files to type-check as a package, a module, a directory, or a file path</li>
<li>Whether to specify a MYPYPATH</li>
<li>Whether to add the <code class="language-plaintext highlighter-rouge">--namespace-packages</code> option</li>
<li>What working directory to invoke Mypy from</li>
</ul>
<p><a href="https://mypy.readthedocs.io/en/stable/running_mypy.html#running-mypy-and-managing-imports">Running mypy and managing imports</a> is a helpful section of the documentation for getting this right. Pay extra attention when you are using namespace packages (packages without <code class="language-plaintext highlighter-rouge">__init__.py</code> files).</p>
<p>I’m writing this whilst v0.790 is the latest release. Simplifying the calling of Mypy and its import handling is a current priority for the maintainers. See, for example, the umbrella issue <a href="https://github.com/python/mypy/issues/8584">#8584 — Redesign import handling</a>. Various improvements have already been merged to the master branch.</p>
<h3 id="following-the-right-rules">Following the right rules</h3>
<p>Once Mypy is running on the correct files, you’ll want to get it running the right checks for your codebase so that it passes whilst also checking what you want it to check. This may involve:</p>
<ul>
<li>Making changes to your codebase to meet new rules that you want to enforce</li>
<li>Setting various strictness settings. For example: <code class="language-plaintext highlighter-rouge">--no-implicit-optional</code>, <code class="language-plaintext highlighter-rouge">--disallow-untyped-defs</code>, <code class="language-plaintext highlighter-rouge">--no-strict-optional</code> or the umbrella option <code class="language-plaintext highlighter-rouge">--strict</code></li>
<li>
<p>Deciding which imported modules to treat as <code class="language-plaintext highlighter-rouge">Any</code>. Sometimes Mypy will complain that it can’t find a certain module or its stubs. This can indicate that Mypy does not have access to these dependencies, which you should fix (see below), but it can also mean that the library doesn’t have any type stubs. In the latter case, it’s sensible to treat those modules as <code class="language-plaintext highlighter-rouge">Any</code> in a <code class="language-plaintext highlighter-rouge">mypy.ini</code> file:</p>
<div class="language-toml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># mypy.ini</span>
<span class="nn">[mypy]</span>
<span class="c"># this section is required</span>
<span class="c"># you can add a mypy_path here, if you need one.</span>
<span class="c"># example of explicitly ignoring missing stubs</span>
<span class="c"># for a dependency and its subpackages.</span>
<span class="c"># This is safer than ignoring everything </span>
<span class="c"># with the --ignore-missing-imports option.</span>
<span class="nn">[mypy-pyproj.*]</span>
<span class="py">ignore_missing_imports</span> <span class="p">=</span> <span class="err">True</span>
</code></pre></div> </div>
</li>
</ul>
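<p>As a rough way to audit which modules are being treated as <code class="language-plaintext highlighter-rouge">Any</code>, the per-module sections of a <code class="language-plaintext highlighter-rouge">mypy.ini</code> can be listed with a few lines of Python. This is only a sketch: <code class="language-plaintext highlighter-rouge">ignored_import_patterns</code> is a hypothetical helper, not part of Mypy, and it assumes one pattern per section name.</p>

```python
import configparser


# Hypothetical helper (not part of Mypy): lists the module patterns whose
# missing imports are ignored in a mypy.ini-style config.
def ignored_import_patterns(config_text):
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    patterns = []
    for section in parser.sections():
        # Per-module sections are named "mypy-PATTERN".
        if not section.startswith("mypy-"):
            continue
        if parser.getboolean(section, "ignore_missing_imports", fallback=False):
            patterns.append(section[len("mypy-"):])
    return patterns


EXAMPLE = """
[mypy]
mypy_path = ./src

[mypy-pyproj.*]
ignore_missing_imports = True
"""

print(ignored_import_patterns(EXAMPLE))  # ['pyproj.*']
```

<p>Keeping this list short and explicit makes it easier to notice when a dependency has silently stopped being type-checked.</p>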
<p>You may even want to temporarily introduce errors in certain files to make sure Mypy will notice them. See also <a href="https://mypy.readthedocs.io/en/stable/common_issues.html#no-errors-reported-for-obviously-wrong-code">Mypy docs — No errors reported for obviously wrong code</a>.</p>
<h3 id="bake-it-into-a-script">Bake it into a script</h3>
<p>Now that you know precisely how you want to call Mypy, create a script called <code class="language-plaintext highlighter-rouge">run-mypy</code> that captures the arguments you want to use. For example, in my case, I have a namespace package in the <code class="language-plaintext highlighter-rouge">src/acme</code> directory, and my script ended up looking like this:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">#!/usr/bin/env bash</span>
<span class="nb">set</span> <span class="nt">-o</span> errexit
<span class="c"># Change directory to the project root directory.</span>
<span class="nb">cd</span> <span class="s2">"</span><span class="si">$(</span><span class="nb">dirname</span> <span class="s2">"</span><span class="nv">$0</span><span class="s2">"</span><span class="si">)</span><span class="s2">"</span>
<span class="c"># Because I'm using namespace packages,</span>
<span class="c"># I have used --package acme rather than using </span>
<span class="c"># the path 'src/acme', which would correctly</span>
<span class="c"># collect my files but erroneously add </span>
<span class="c"># 'src/acme' to the Mypy search path.</span>
<span class="c"># We only want 'src' in the path so that Mypy</span>
<span class="c"># knows our modules by their fully qualified names.</span>
mypy <span class="nt">--package</span> acme <span class="nt">--namespace-packages</span>
</code></pre></div></div>
<p>I also had to add a mypy_path in mypy.ini:</p>
<div class="language-toml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nn">[mypy]</span>
<span class="py">mypy_path</span><span class="p">=</span><span class="err">./src</span>
</code></pre></div></div>
<h2 id="step-2-creating-our-own-pre-commit-hook">Step 2: Creating our own Pre-commit hook</h2>
<p>Now that we know how to run Mypy for our project, we can think about running it in Pre-commit. First, a brief primer on how Pre-commit works so that we can consider what might go wrong.</p>
<h3 id="how-pre-commit-runs-hooks">How Pre-commit runs hooks</h3>
<p>Pre-commit installs each Python hook in a separate virtualenv. Before each commit, the list of staged files is passed to that hook. Any unstaged changes are stashed and only restored after all hooks have run.</p>
<h4 id="problem-only-running-on-changed-files">Problem: Only running on changed files</h4>
<p>With Mypy, we probably don’t want to pass it just the list of changed files:</p>
<ul>
<li>It will miss type errors resulting from but not occurring in the staged changes. For example: if you have changed the definition of a function but not a usage of that function in another file then the usage is now invalid, but won’t be checked.</li>
<li>As mentioned above, you may need more control over how Mypy is invoked anyway.</li>
<li>Mypy uses an <a href="https://mypy.readthedocs.io/en/stable/command_line.html#incremental-mode">Incremental Mode</a> by default. It stores calculated type information so re-running on all files after only a few changes doesn’t take as long. For faster incremental runs, consider using a long-running <a href="https://mypy.readthedocs.io/en/stable/mypy_daemon.html#mypy-daemon">Mypy daemon</a>.</li>
</ul>
<p>We’ll solve this by using our own <code class="language-plaintext highlighter-rouge">run-mypy</code> script and ignoring the file list that Pre-commit passes to it.</p>
<h4 id="problem-running-in-an-isolated-virtualenv">Problem: Running in an isolated virtualenv</h4>
<p>Mypy running in a separate virtualenv is also problematic, since it won’t have access to all the dependencies installed in your main development environment. This means it can’t type check usages of those dependencies. We’ll solve this in Step 3.</p>
<h3 id="setting-up-the-hook">Setting up the hook</h3>
<p>We can solve both these problems with a properly-configured hook, which we’ll set up ourselves. To get started, create a new <a href="https://pre-commit.com/#repository-local-hooks">Repository-local hook</a> by adding the following to your <code class="language-plaintext highlighter-rouge">.pre-commit-config.yaml</code>:</p>
<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># .pre-commit-config.yaml</span>
<span class="na">repos</span><span class="pi">:</span>
  <span class="pi">-</span> <span class="na">repo</span><span class="pi">:</span> <span class="s">local</span>
    <span class="na">hooks</span><span class="pi">:</span>
      <span class="pi">-</span> <span class="na">id</span><span class="pi">:</span> <span class="s">mypy</span>
        <span class="na">name</span><span class="pi">:</span> <span class="s">mypy</span>
        <span class="na">entry</span><span class="pi">:</span> <span class="s2">"</span><span class="s">./run-mypy"</span>
        <span class="na">language</span><span class="pi">:</span> <span class="s">python</span>
        <span class="c1"># use your preferred Python version</span>
        <span class="na">language_version</span><span class="pi">:</span> <span class="s">python3.7</span>
        <span class="na">additional_dependencies</span><span class="pi">:</span> <span class="pi">[</span><span class="s2">"</span><span class="s">mypy==0.790"</span><span class="pi">]</span>
        <span class="c1"># trigger for commits changing Python files</span>
        <span class="na">types</span><span class="pi">:</span> <span class="pi">[</span><span class="nv">python</span><span class="pi">]</span>
        <span class="c1"># use require_serial so that script</span>
        <span class="c1"># is only called once per commit</span>
        <span class="na">require_serial</span><span class="pi">:</span> <span class="no">true</span>
        <span class="c1"># print the number of files as a sanity-check</span>
        <span class="na">verbose</span><span class="pi">:</span> <span class="no">true</span>
</code></pre></div></div>
<h2 id="step-3-giving-the-mypy-hook-access-to-dependencies">Step 3: Giving the Mypy hook access to dependencies</h2>
<p>Mypy needs an environment where the dependencies are installed so that it can check for type errors in their usage. Here are a few options for doing that, with differing levels of convenience and speed:</p>
<h3 id="option-1-use-language-system-to-run-mypy-in-an-existing-environment">Option 1: Use <code class="language-plaintext highlighter-rouge">language: system</code> to run Mypy in an existing environment</h3>
<p>Replace <code class="language-plaintext highlighter-rouge">language: python</code> in your hook definition with <code class="language-plaintext highlighter-rouge">language: system</code>. Remove the <code class="language-plaintext highlighter-rouge">additional_dependencies</code> line and install Mypy into your environment directly. Now, Pre-commit will not create a separate virtualenv for the hook and will run it in whatever environment you happen to be in when you run <code class="language-plaintext highlighter-rouge">git commit</code> or <code class="language-plaintext highlighter-rouge">pre-commit run</code>. This means you always run Mypy directly in your dev environment, but breaks if any of the developers on the project want to trigger Pre-commit from outside the dev environment. For example, this won’t work if using a GUI Git client, as the correct virtualenv probably won’t be activated.</p>
<h3 id="option-2-point-mypy-to-a-specific-environment-with---python-executable">Option 2: Point Mypy to a specific environment with <code class="language-plaintext highlighter-rouge">--python-executable</code></h3>
<p>If it’s possible to automatically figure out the path to the appropriate Python interpreter (the one associated with the existing installation of your dependencies, which may or may not be in a virtual environment), then you can point Mypy to that path using the <code class="language-plaintext highlighter-rouge">--python-executable</code> option on <code class="language-plaintext highlighter-rouge">mypy</code>.</p>
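<p>How you locate that interpreter depends entirely on your project. As a minimal sketch, assuming the project keeps its virtualenv in a <code class="language-plaintext highlighter-rouge">.venv</code> directory at the project root (an assumption of this example, as are the helper names), the command could be assembled like this:</p>

```python
import os
import sys
import tempfile


# Assumption for this sketch: the virtualenv lives in ".venv" at the
# project root. Adjust the candidates for your own layout.
def find_project_python(project_root):
    candidates = [
        os.path.join(project_root, ".venv", "bin", "python"),          # POSIX
        os.path.join(project_root, ".venv", "Scripts", "python.exe"),  # Windows
    ]
    for candidate in candidates:
        if os.path.isfile(candidate):
            return candidate
    # Fall back to whichever interpreter is running this script.
    return sys.executable


def mypy_command(project_root):
    return [
        "mypy",
        "--python-executable", find_project_python(project_root),
        "--package", "acme",
        "--namespace-packages",
    ]


# Demonstration with a throwaway directory containing a fake interpreter.
with tempfile.TemporaryDirectory() as root:
    fake_python = os.path.join(root, ".venv", "bin", "python")
    os.makedirs(os.path.dirname(fake_python))
    open(fake_python, "w").close()
    print(mypy_command(root))
```

<p>The resulting list could be passed to <code class="language-plaintext highlighter-rouge">subprocess.run</code> from the hook's entry script. The fallback to the current interpreter is a design choice of the sketch; failing loudly may be safer if a missing environment should never go unnoticed.</p>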
<h3 id="option-3-install-specific-dependencies-with-the-additional_dependencies-hook-option">Option 3: Install specific dependencies with the <code class="language-plaintext highlighter-rouge">additional_dependencies</code> hook option</h3>
<p>If you only care about type-checking the usages of a few third-party modules, then you can install those specific modules into the hook environment like so:</p>
<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># .pre-commit-config.yaml</span>
<span class="na">repos</span><span class="pi">:</span>
  <span class="pi">-</span> <span class="na">repo</span><span class="pi">:</span> <span class="s">local</span>
    <span class="na">hooks</span><span class="pi">:</span>
      <span class="pi">-</span> <span class="na">id</span><span class="pi">:</span> <span class="s">mypy</span>
        <span class="na">name</span><span class="pi">:</span> <span class="s">mypy</span>
        <span class="na">entry</span><span class="pi">:</span> <span class="s2">"</span><span class="s">./run-mypy"</span>
        <span class="na">language</span><span class="pi">:</span> <span class="s">python</span>
        <span class="c1"># Replace with appropriate version</span>
        <span class="na">language_version</span><span class="pi">:</span> <span class="s">python3.7</span>
        <span class="c1"># install Mypy, and the dependencies</span>
        <span class="na">additional_dependencies</span><span class="pi">:</span>
          <span class="pi">-</span> <span class="s2">"</span><span class="s">mypy==0.790"</span>
          <span class="pi">-</span> <span class="s2">"</span><span class="s">structlog==20.1.0"</span>
        <span class="na">types</span><span class="pi">:</span> <span class="pi">[</span><span class="nv">python</span><span class="pi">]</span>
        <span class="c1"># use require_serial so that script</span>
        <span class="c1"># is only called once per commit</span>
        <span class="na">require_serial</span><span class="pi">:</span> <span class="no">true</span>
        <span class="c1"># Print the number of files as a sanity-check</span>
        <span class="na">verbose</span><span class="pi">:</span> <span class="no">true</span>
</code></pre></div></div>
<p>This is relatively fast as Pre-commit remembers the dependencies it installed in the environment <a href="https://github.com/pre-commit/pre-commit/blob/cf604f6b93b8fb7a61ab9ca45e5dedbdb4fd5796/pre_commit/repository.py#L54">[source code]</a>. The downside is that it means duplicating your list of dependencies (at least those that have type stubs).</p>
<p>The additional_dependencies are just sent directly to Pip <a href="https://github.com/pre-commit/pre-commit/blob/cf604f6b93b8fb7a61ab9ca45e5dedbdb4fd5796/pre_commit/languages/python.py#L199">[source code]</a> so you can happily add, for example, a <code class="language-plaintext highlighter-rouge">--index-url</code> argument in this array. Just be aware that Pre-commit will only re-run Pip when the list of <code class="language-plaintext highlighter-rouge">additional_dependencies</code> changes, so don’t expect to put “requirements.txt” in this array and have it figure out when you’ve changed that file.</p>
<h3 id="option-4-running-a-full-pip-install-in-the-hook">Option 4: Running a full <code class="language-plaintext highlighter-rouge">pip install</code> in the hook</h3>
<p><strong>This is not fast.</strong> The speed we mostly care about is that of running the hook on each commit, not its initial setup, and running <code class="language-plaintext highlighter-rouge">pip install</code> takes many seconds even when the dependencies are already installed. It is, however, a pretty reliable and easy way to make sure your dependencies are installed, if the performance hit is acceptable to you and your team.</p>
<p>To do this, simply add the appropriate <code class="language-plaintext highlighter-rouge">pip</code> command into your <code class="language-plaintext highlighter-rouge">run-mypy</code> script. For example:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">#!/usr/bin/env bash</span>
<span class="nb">set</span> <span class="nt">-o</span> errexit
<span class="c"># Change directory to the project root directory.</span>
<span class="nb">cd</span> <span class="s2">"</span><span class="si">$(</span><span class="nb">dirname</span> <span class="s2">"</span><span class="nv">$0</span><span class="s2">"</span><span class="si">)</span><span class="s2">"</span>
<span class="c"># Install the dependencies into the mypy environment.</span>
<span class="c"># Note that this can take seconds to run.</span>
pip <span class="nb">install</span> <span class="nt">--editable</span> <span class="nb">.</span> <span class="nt">--no-input</span> <span class="nt">--quiet</span>
mypy <span class="nt">--package</span> acme <span class="nt">--namespace-packages</span>
</code></pre></div></div>
<p>This is the option I have gone for so far since it minimises the likelihood of us making mistakes without imposing many restrictions on local project setup. For example, teammates can develop in whatever Python environment they like.</p>
<h2 id="bonus-step-running-in-ci">Bonus step: Running in CI</h2>
<p>If you use Pre-commit locally it’s often a good idea to run <code class="language-plaintext highlighter-rouge">pre-commit run -a</code> in your CI pipeline. The setup I gave at the top of this post works fine in CI too. However, if you’d rather have a different Mypy setup in CI than locally, you can run Pre-commit with a SKIP environment variable in CI to skip the Mypy hook, and then run Mypy however you want in a separate CI job: <code class="language-plaintext highlighter-rouge">SKIP=mypy pre-commit run -a</code>. See <a href="https://pre-commit.com/#temporarily-disabling-hooks">Pre-commit - Temporarily disabling hooks</a>.</p>
<h2 id="summary">Summary</h2>
<p>We have seen several potential issues with running Mypy in Pre-commit:</p>
<ul>
<li>Changing one file may cause a type error in another file, so we need to run Mypy on all files, not just those that have changed</li>
<li>We need to give Mypy access to the installed dependencies of the code it is type-checking, otherwise it can’t check the usages of those dependencies</li>
<li>Flags that reduce the strictness of Mypy like <code class="language-plaintext highlighter-rouge">--ignore-missing-imports</code> can give us false confidence</li>
</ul>
<p>We saw how to address these issues by making our own custom hook. There doesn’t appear to be a neat, one-size-fits-all solution, so it’s worth giving some thought to this setup in each instance.</p>
<p>The only thing worse than not type-checking your code is thinking you are type-checking it when you aren’t.</p>
<h1>How Git LFS Works</h1>
<p>2020-10-31T18:14:00+00:00, https://jaredkhan.com/blog/how-git-lfs-works</p>
<p><em><a href="https://git-lfs.github.com">Git LFS (Large File Storage)</a> helps you version large files in Git without having to download every version of them. This post explains the mechanisms that Git LFS uses internally to do its job, including:</em></p>
<ul>
<li><em>Git subcommands to give us the <code class="language-plaintext highlighter-rouge">git lfs</code> command in the first place</em> <a href="#git-subcommands">[Jump]</a></li>
<li><em>Git clean and smudge filters to replace large files with pointer files</em> <a href="#clean-and-smudge-filters">[Jump]</a></li>
<li><em>Git pre-push hooks to upload the large files to a server</em> <a href="#pre-push-hooks">[Jump]</a></li>
</ul>
<h2 id="the-problem-git-lfs-solves">The Problem Git LFS Solves</h2>
<p>Git needs to remember every version of every file in the repository. It starts by storing <strong>all the different versions of the file separately in full</strong>; it calls these ‘loose objects’.</p>
<p>When it notices that there are lots of loose objects it will do something called ‘packing’, where it remembers one version of the file as well as the differences between that and the other versions. This saves a lot of space if the differences are small relative to the size of the file. See <a href="https://git-scm.com/book/en/v2/Git-Internals-Packfiles">Git Internals - Packfiles</a> for more.</p>
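<p>To make the cost of loose objects concrete, here is a small sketch of how a file version becomes a loose object, assuming Git's default SHA-1 object format. The function name <code class="language-plaintext highlighter-rouge">loose_object</code> is made up for this example, and random bytes stand in for a large binary file that does not compress well.</p>

```python
import hashlib
import os
import zlib


def loose_object(content):
    # A blob is a short header plus the full file content. It is named by
    # the SHA-1 of that data and stored compressed with zlib.
    data = b"blob " + str(len(content)).encode() + b"\0" + content
    return hashlib.sha1(data).hexdigest(), zlib.compress(data)


# Two versions of a "large" file that differ only by a few trailing bytes.
version_1 = os.urandom(100_000)
version_2 = version_1 + b"one small change\n"

for content in (version_1, version_2):
    object_id, stored = loose_object(content)
    print(object_id, len(stored), "bytes stored")
```

<p>Each version gets its own object ID and is stored in full, so before packing a small change to a large file costs roughly the size of the whole file again.</p>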
<p>Even with packing, if a repo has a large file that changes often then this can quickly require a lot of storage space to make every version available locally, even though you might rarely work with historical versions of this large file. Ideally, you would only download and store the versions of this file that you actually need to view or work with, but you don’t want that to get in the way of your normal Git workflow.</p>
<p>Git LFS does not save space on your server, but saves you space on your local copies of the repo.</p>
<h2 id="how-git-lfs-is-used">How Git LFS Is Used</h2>
<p>The following instructions can be found on the <a href="https://git-lfs.github.com">Git LFS website</a>:</p>
<p>The remote Git server must be set up to support the <a href="https://github.com/git-lfs/git-lfs/blob/7b8779f3d46350d450394222d81e140a5a98911e/docs/api/batch.md#L1">Git LFS API</a>.
This is done for you on GitHub, GitLab (requires some configuration) and Bitbucket.</p>
<p>Each user of the repository must:</p>
<ul>
<li>Download the <code class="language-plaintext highlighter-rouge">git-lfs</code> executable</li>
<li>Run <code class="language-plaintext highlighter-rouge">git lfs install</code> on their machine</li>
</ul>
<p>One user of the repository must:</p>
<ul>
<li>Run <code class="language-plaintext highlighter-rouge">git lfs track "*.psd"</code>, replacing “*.psd” with the filename pattern that you want to track</li>
<li>Make sure the <code class="language-plaintext highlighter-rouge">.gitattributes</code> file is checked into the repository</li>
</ul>
<p>Now all users can add files to the repo as normal and Git LFS will work away in the background.</p>
<h2 id="what-git-lfs-does">What Git LFS Does</h2>
<p>There are three main things that Git LFS does for us:</p>
<ol>
<li>
<p>Git LFS replaces the large files that you try to <code class="language-plaintext highlighter-rouge">git add</code> for commit in the repo with <em>pointer files</em>, files that just contain an identifier of the content, and it stores the content files themselves in a separate local folder, the Git LFS Cache at <code class="language-plaintext highlighter-rouge">.git/lfs/objects</code>.</p>
</li>
<li>
<p>When you <code class="language-plaintext highlighter-rouge">git push</code> your commits, the new large files in the local Git LFS Cache are uploaded to the server separately.
This is done using the Git LFS API that the server must implement.</p>
</li>
<li>
<p>Whenever you do a <code class="language-plaintext highlighter-rouge">git checkout</code>, Git LFS will find all the pointer files and replace them with the files themselves, downloading whatever files are necessary from the remote server. This means that you only store locally the large files that you actually check out or that you committed yourself, not <strong>all</strong> versions of large files in the history of the repo.</p>
</li>
</ol>
<p>To perform these three magic tricks, Git LFS needs to intercept <code class="language-plaintext highlighter-rouge">add</code>, <code class="language-plaintext highlighter-rouge">push</code> and <code class="language-plaintext highlighter-rouge">checkout</code> and needs to keep track of which pointer files map to which large files in the Git LFS Cache. Understanding how it does these things in more detail can give us more confidence when using Git LFS and guide us in the right direction when something goes wrong.</p>
<h2 id="how-git-lfs-works">How Git LFS Works</h2>
<h3 id="git-subcommands">Git subcommands</h3>
<p><code class="language-plaintext highlighter-rouge">git clone</code> and <code class="language-plaintext highlighter-rouge">git push</code> are two different built-in <em>subcommands</em> of <code class="language-plaintext highlighter-rouge">git</code>. As it turns out, any executable available in your <code class="language-plaintext highlighter-rouge">PATH</code> that starts with <code class="language-plaintext highlighter-rouge">git-</code> can be used as a Git subcommand.</p>
<p>For example:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># create script named 'git-shout'</span>
<span class="c"># sidenote: this is 'heredoc' syntax</span>
<span class="nb">cat</span> <span class="o"><<</span> <span class="no">EOF</span><span class="sh"> > git-shout
#!/usr/bin/env bash
echo "running custom command!"
</span><span class="no">EOF
</span><span class="nb">chmod </span>u+x git-shout
<span class="c"># make sure it's in the shell's PATH</span>
<span class="nb">export </span><span class="nv">PATH</span><span class="o">=</span><span class="nv">$PATH</span>:.
git-shout
<span class="c"># running custom command!</span>
<span class="c"># it can also be called like this:</span>
git shout
<span class="c"># running custom command!</span>
</code></pre></div></div>
<p>There’s nothing special about it being a subcommand; it’s just a nice-looking alias to the same executable.</p>
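<p>The lookup itself can be approximated in a few lines of Python. This is a simplified sketch rather than Git's actual implementation (Git also checks its own exec path before falling back to <code class="language-plaintext highlighter-rouge">PATH</code>), and <code class="language-plaintext highlighter-rouge">find_subcommand</code> is a name invented for the example.</p>

```python
import os
import shutil
import stat
import tempfile


def find_subcommand(name, search_path):
    # Roughly what happens for a non-built-in subcommand: look for an
    # executable called "git-NAME" on the search path.
    return shutil.which("git-" + name, path=search_path)


with tempfile.TemporaryDirectory() as bin_dir:
    script = os.path.join(bin_dir, "git-shout")
    with open(script, "w") as f:
        f.write("#!/bin/sh\necho 'running custom command!'\n")
    # Mark the script as executable, like `chmod u+x git-shout`.
    os.chmod(script, os.stat(script).st_mode | stat.S_IXUSR)

    print(find_subcommand("shout", bin_dir))    # path to git-shout
    print(find_subcommand("whisper", bin_dir))  # None: no such executable
```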
<h3 id="clean-and-smudge-filters">Clean and Smudge filters</h3>
<p>Git has the concept of the <em>staging area</em> (or <em>index</em>) where changes go before they are committed. We select changes to place into the staging area with <code class="language-plaintext highlighter-rouge">git add</code>. Git has a feature called <em>filters</em>, which lets us process files just before they are staged (the ‘clean’ filter) and just before they are checked out into the working tree (the ‘smudge’ filter).</p>
<p>These filters can be used for things like:</p>
<ul>
<li>keeping any passwords or other secrets out of the repository</li>
<li>including the last-modified date of a file in the file itself</li>
<li>pulling in files from non-Git sources</li>
</ul>
<h4 id="creating-a-new-filter">Creating a new filter</h4>
<p>First, the clean and smudge actions for the filter need to be added to either the user’s <code class="language-plaintext highlighter-rouge">~/.gitconfig</code> file or the repository-local <code class="language-plaintext highlighter-rouge">.git/config</code> file. Either way, <strong>this needs to be done on each user’s machine.</strong> As a simple example, we’ll add a filter that censors the word ‘butts’, because we can’t be having such <em>foul</em> language getting checked into our repository:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># .git/config</span>
<span class="c"># define a 'hide-naughty-word' filter</span>
<span class="o">[</span>filter <span class="s2">"hide-naughty-word"</span><span class="o">]</span>
<span class="c"># define the command for this filter </span>
clean <span class="o">=</span> <span class="nb">sed </span>s/butts/b--ts/
smudge <span class="o">=</span> <span class="nb">sed </span>s/b--ts/butts/
</code></pre></div></div>
<p>Here, I assume that all users of the repository already have the <code class="language-plaintext highlighter-rouge">sed</code> command installed on their machine.</p>
<h4 id="assigning-the-filter-to-filesfiletypes">Assigning the filter to files/filetypes</h4>
<p>Then, we need to tell Git which files to run this filter on by adding to the <code class="language-plaintext highlighter-rouge">.gitattributes</code> file in the repo. We can assign the filter to a specific file (<code class="language-plaintext highlighter-rouge">my-specific-file.txt</code>), a specific file type (<code class="language-plaintext highlighter-rouge">*.txt</code>), or all files (<code class="language-plaintext highlighter-rouge">*</code>). For more detail see <a href="https://git-scm.com/docs/gitattributes">gitattributes Documentation</a>. Here, we will run the <code class="language-plaintext highlighter-rouge">hide-naughty-word</code> filter on all files:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">*</span> <span class="nv">filter</span><span class="o">=</span>hide-naughty-word
</code></pre></div></div>
<p>This <code class="language-plaintext highlighter-rouge">.gitattributes</code> file can be checked in so that it can be shared amongst all users of the repo.</p>
<p>Now we can see this new filter in action by committing a new file and then using a command called <code class="language-plaintext highlighter-rouge">git cat-file</code> to see what has actually ended up in the repository:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">echo</span> <span class="s2">"i like big butts and i cannot lie"</span> <span class="o">></span> mix-a-lot.txt
git add mix-a-lot.txt
git commit <span class="nt">-m</span> <span class="s2">"Add naughty file"</span>
git cat-file blob <span class="s2">"HEAD:mix-a-lot.txt"</span>
<span class="c"># i like big b--ts and i cannot lie</span>
<span class="nb">cat </span>mix-a-lot.txt
<span class="c"># i like big butts and i cannot lie</span>
</code></pre></div></div>
<p>We see that the content has been filtered in the repository but when we actually view the <em>checked out</em> file (for example, by using the <code class="language-plaintext highlighter-rouge">cat</code> tool) it has the content we expect.</p>
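<p>Conceptually, a filter is just a pair of text transformations: Git pipes the file content through the command's standard input and keeps whatever comes out. The round trip above can be modelled in Python as follows. This is an illustrative sketch; the functions are stand-ins for the two <code class="language-plaintext highlighter-rouge">sed</code> commands, not something Git calls directly.</p>

```python
def clean(text):
    # Mirrors `sed s/butts/b--ts/`: replace the first match on each line.
    lines = text.splitlines(keepends=True)
    return "".join(line.replace("butts", "b--ts", 1) for line in lines)


def smudge(text):
    # Mirrors `sed s/b--ts/butts/`: undo the replacement on checkout.
    lines = text.splitlines(keepends=True)
    return "".join(line.replace("b--ts", "butts", 1) for line in lines)


working_copy = "i like big butts and i cannot lie\n"
stored = clean(working_copy)

print(stored, end="")          # i like big b--ts and i cannot lie
print(smudge(stored), end="")  # i like big butts and i cannot lie
```

<p>Note that this particular pair is not a perfect inverse: a file that genuinely contained ‘b--ts’ would come back changed after a checkout. Filters intended for real use generally need smudge to restore exactly what clean received.</p>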
<h4 id="running-filters-on-files-that-are-already-committed">Running filters on files that are already committed</h4>
<p>What if someone on our team doesn’t have the filters set up properly and checks in an unfiltered file:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># for demonstration,</span>
<span class="c"># empty the .gitattributes file,</span>
<span class="c"># disabling the filter.</span>
<span class="nb">echo</span> <span class="s2">""</span> <span class="o">></span> .gitattributes
<span class="nb">echo</span> <span class="s2">"i like big butts and i cannot lie"</span> <span class="o">></span> mix-a-lot2.txt
git add mix-a-lot2.txt
git commit <span class="nt">-m</span> <span class="s2">"Add unfiltered file"</span>
<span class="c"># re-enable filter</span>
<span class="nb">echo</span> <span class="s2">"* filter=hide-naughty-word"</span> <span class="o">></span> .gitattributes
git cat-file blob <span class="s2">"HEAD:mix-a-lot2.txt"</span>
<span class="c"># i like big butts and i cannot lie</span>
</code></pre></div></div>
<p>We can re-run the filters on all files and create a new commit like so:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>git add <span class="nt">--renormalize</span> <span class="nb">.</span>
git commit <span class="nt">-m</span> <span class="s2">"Run filters"</span>
git cat-file blob <span class="s2">"HEAD:mix-a-lot2.txt"</span>
<span class="c"># i like big b--ts and i cannot lie</span>
</code></pre></div></div>
<p>Note that the unfiltered file still exists in the history.</p>
<h3 id="pre-push-hooks">Pre-push hooks</h3>
<p>‘Git hooks’ are a way to run custom scripts when certain events happen. These scripts can be added to the <code class="language-plaintext highlighter-rouge">.git/hooks</code> directory of your repo with names like <code class="language-plaintext highlighter-rouge">pre-commit</code> or <code class="language-plaintext highlighter-rouge">post-checkout</code> and can be modified to do whatever you like.
The list of hooks that you can set up can be found in <a href="https://github.com/git/git/blob/e31aba42fb12bdeb0f850829e008e1e3f43af500/Documentation/githooks.txt">git/Documentation/githooks.txt</a>.</p>
<p>For example, when you run <code class="language-plaintext highlighter-rouge">git push</code>, Git will first run the <code class="language-plaintext highlighter-rouge">pre-push</code> script (if it exists). If that script exits with a non-zero exit code, the push will be aborted.</p>
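<p>Hooks can be written in any language that can be run as an executable. Below is a sketch of the logic for a Python <code class="language-plaintext highlighter-rouge">pre-push</code> hook. The policy it enforces (refusing pushes to <code class="language-plaintext highlighter-rouge">refs/heads/main</code>) and the function names are invented for illustration; the input format, one line per ref on standard input, is the one described in the githooks documentation.</p>

```python
import sys

# Hypothetical policy for this sketch: refuse direct pushes to this ref.
PROTECTED_REF = "refs/heads/main"


def blocked_refs(stdin_lines):
    # Git feeds the hook one line per ref being pushed:
    # "LOCAL_REF LOCAL_SHA REMOTE_REF REMOTE_SHA"
    blocked = []
    for line in stdin_lines:
        fields = line.split()
        if len(fields) == 4 and fields[2] == PROTECTED_REF:
            blocked.append(fields[2])
    return blocked


def main(stdin_lines):
    blocked = blocked_refs(stdin_lines)
    for ref in blocked:
        print("refusing to push to " + ref, file=sys.stderr)
    # A non-zero exit status makes Git abort the push.
    return 1 if blocked else 0


# In a real hook saved as .git/hooks/pre-push, the last line would be:
#     sys.exit(main(sys.stdin))
sample = ["refs/heads/main 1111111 refs/heads/main 0000000\n"]
print(main(sample))  # 1
```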
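<p>Git hooks live inside the <code class="language-plaintext highlighter-rouge">.git</code> directory, which is not versioned, so they cannot be checked into a repo. If all users of a project need to run the same Git hooks, each individual user will need to set them up on their copy of the repo.</p>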
<h3 id="bringing-it-all-together">Bringing it all together</h3>
<p>When you install Git LFS, you will get an executable called <code class="language-plaintext highlighter-rouge">git-lfs</code>. Because it’s named starting with <code class="language-plaintext highlighter-rouge">git-</code>, it is also now runnable with <code class="language-plaintext highlighter-rouge">git lfs</code>. When you run <code class="language-plaintext highlighter-rouge">git lfs install</code> on your machine, your <code class="language-plaintext highlighter-rouge">~/.gitconfig</code> file is updated to contain the Git LFS filter definition.</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">[</span>filter <span class="s2">"lfs"</span><span class="o">]</span>
clean <span class="o">=</span> git-lfs clean %f
smudge <span class="o">=</span> git-lfs smudge %f
required <span class="o">=</span> <span class="nb">true
</span>process <span class="o">=</span> git-lfs filter-process
</code></pre></div></div>
<p>Because this is modifying a user-specific file, <code class="language-plaintext highlighter-rouge">git lfs install</code> needs to be run once by each user of the repository before they can successfully use Git LFS.</p>
<p>When you run, for example, <code class="language-plaintext highlighter-rouge">git lfs track "*.jpg"</code> to track all .jpg files in the repo with Git LFS, it updates your <code class="language-plaintext highlighter-rouge">.gitattributes</code> file:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">*</span>.jpg <span class="nv">filter</span><span class="o">=</span>lfs <span class="nv">diff</span><span class="o">=</span>lfs <span class="nv">merge</span><span class="o">=</span>lfs <span class="nt">-text</span>
</code></pre></div></div>
<p>This tells Git to use the <code class="language-plaintext highlighter-rouge">lfs</code> clean and smudge filter for these files, as well as attaching some extra attributes. With the filters in place, whenever you stage a <code class="language-plaintext highlighter-rouge">.jpg</code> file it will be replaced with a pointer file containing the SHA-256 hash of the file content. The file itself gets stored in <code class="language-plaintext highlighter-rouge">.git/lfs/objects</code> at a path based on the hash so that it’s easy to find later.
Note that the .gitattributes file can and should be checked into the repository so that everyone on the project tracks the same files in Git LFS.</p>
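<p>To make the pointer-file idea concrete, here is a rough Python sketch of what the clean filter produces. This is not how Git LFS itself is implemented (the real tool is written in Go); the function names <code class="language-plaintext highlighter-rouge">make_pointer</code> and <code class="language-plaintext highlighter-rouge">object_path</code> are made up for illustration, and the pointer layout and sharded directory scheme follow my reading of the Git LFS specification.</p>

```python
import hashlib
from pathlib import Path


def make_pointer(content: bytes) -> str:
    """Build the text of a Git LFS pointer file for the given content."""
    oid = hashlib.sha256(content).hexdigest()
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{oid}\n"
        f"size {len(content)}\n"
    )


def object_path(content: bytes, git_dir: Path = Path(".git")) -> Path:
    """Where the real content would live in the local LFS cache."""
    oid = hashlib.sha256(content).hexdigest()
    # Objects are sharded into directories by the first two pairs of hex digits.
    return git_dir / "lfs" / "objects" / oid[:2] / oid[2:4] / oid


image = b"pretend this is a large JPEG"
print(make_pointer(image))
print(object_path(image))
```

<p>Because both the pointer and the storage path are derived purely from the content hash, two files with identical bytes map to the same object.</p>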
<p>Almost every Git LFS command you run (including <code class="language-plaintext highlighter-rouge">git lfs install</code> and the clean and smudge filters) will also modify the pre-push hook if it’s not already set up. So, because we ran some Git LFS commands already, the <code class="language-plaintext highlighter-rouge">.git/hooks/pre-push</code> file should already look like this:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">#!/bin/sh</span>
<span class="nb">command</span> <span class="nt">-v</span> git-lfs <span class="o">></span>/dev/null 2>&1 <span class="o">||</span> <span class="o">{</span> <span class="nb">echo</span> <span class="o">></span>&2 <span class="s2">"</span><span class="se">\n</span><span class="s2">This repository is configured for Git LFS but 'git-lfs' was not found on your path. If you no longer wish to use Git LFS, remove this hook by deleting .git/hooks/pre-push.</span><span class="se">\n</span><span class="s2">"</span><span class="p">;</span> <span class="nb">exit </span>2<span class="p">;</span> <span class="o">}</span>
git lfs pre-push <span class="s2">"</span><span class="nv">$@</span><span class="s2">"</span>
</code></pre></div></div>
<p>It first checks if the <code class="language-plaintext highlighter-rouge">git-lfs</code> executable exists and gives an error message if not. It then forwards to the <code class="language-plaintext highlighter-rouge">git lfs pre-push</code> command. Git LFS’s pre-push command will read the list of branches to be pushed and scan each new commit in those branches for new pointer files <a href="https://github.com/git-lfs/git-lfs/blob/8d234c690b94e8a5b12ff72e3fba8481f032daf0/commands/command_pre_push.go#L40">[source code]</a>. For each new pointer file it finds, it looks up the actual large file in the Git LFS Cache. It will then upload all the new files to the remote server using the Git LFS API.</p>
<p>Note that because it’s using hashes for file identity, you will never end up with two copies of the same large file on the remote server.</p>
<p>And that’s it! These are the main mechanisms Git LFS uses to do its job.</p>
<h2 id="what-could-possibly-go-wrong">What could possibly go wrong?</h2>
<p>Now that we understand the main mechanisms in use, we can debug some issues that might arise:</p>
<blockquote>
<p>I’m seeing pointer files when I should be seeing the actual files</p>
</blockquote>
<p>The smudge filter might not be properly set up for this file.</p>
<ul>
<li>Ensure the .gitattributes file has a line that matches this file with <code class="language-plaintext highlighter-rouge">filter=lfs</code></li>
<li>Ensure you have the filter installed. It’s perfectly harmless to re-run <code class="language-plaintext highlighter-rouge">git lfs install</code></li>
<li>Check out the file again to re-run the filter: <code class="language-plaintext highlighter-rouge">git checkout -- <path-to-file></code></li>
</ul>
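<p>If you are not sure whether a file in your working copy is a pointer or the real thing, a small heuristic check can help. The sketch below assumes the pointer format from the Git LFS specification (a short text file that starts with a version line); <code class="language-plaintext highlighter-rouge">looks_like_lfs_pointer</code> is a hypothetical helper, not part of Git LFS.</p>

```python
import tempfile
from pathlib import Path

POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def looks_like_lfs_pointer(path: Path) -> bool:
    """Heuristic: pointers are tiny text files that start with a version line."""
    # Pointer files are expected to be under 1 KiB, so anything
    # bigger is treated as real content without reading it.
    if path.stat().st_size > 1024:
        return False
    return path.read_bytes().startswith(POINTER_PREFIX)


# Try it on a throwaway file that mimics a pointer.
with tempfile.TemporaryDirectory() as tmp:
    candidate = Path(tmp) / "photo.jpg"
    candidate.write_bytes(
        POINTER_PREFIX + b"\noid sha256:" + b"0" * 64 + b"\nsize 12345\n"
    )
    print(looks_like_lfs_pointer(candidate))
```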
<blockquote>
<p>Large files are being checked in when I wanted pointers to be checked in</p>
</blockquote>
<p>The clean filter might not be properly set up for this file.</p>
<ul>
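<li>Follow the same checks as above: make sure <code class="language-plaintext highlighter-rouge">.gitattributes</code> matches this file with <code class="language-plaintext highlighter-rouge">filter=lfs</code> and that <code class="language-plaintext highlighter-rouge">git lfs install</code> has been run</li>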
<li>Re-run the filter with <code class="language-plaintext highlighter-rouge">git add --renormalize <path-to-file></code></li>
</ul>
<blockquote>
<p>My repo is still taking up loads of space</p>
</blockquote>
<ul>
<li>Installing Git LFS won’t automatically run the LFS filters on historical commits. If you already have large files in your history and want to rewrite your Git history to avoid that, take a look at <a href="https://github.com/git-lfs/git-lfs/blob/7b8779f3d46350d450394222d81e140a5a98911e/docs/man/git-lfs-migrate.1.ronn"><code class="language-plaintext highlighter-rouge">git lfs migrate</code></a></li>
<li>If Alice makes 10 commits that change a large file, pushes them, and then Bob checks out her latest commit, then Bob will only download the latest version but Alice will likely still have all 10 versions in her LFS cache. Alice may want to run <a href="https://github.com/git-lfs/git-lfs/blob/7b8779f3d46350d450394222d81e140a5a98911e/docs/man/git-lfs-prune.1.ronn"><code class="language-plaintext highlighter-rouge">git lfs prune</code></a> in her copy of the repo to get rid of unneeded versions of the file.</li>
</ul>
<h2 id="summary">Summary</h2>
<p>In summary, Git LFS…</p>
<ul>
<li>…does not save you any space on the remote server, but saves you space locally.</li>
<li>…uses clean and smudge filters.</li>
<li>…installs a <code class="language-plaintext highlighter-rouge">git-lfs</code> executable which, due to its name, can also be run with <code class="language-plaintext highlighter-rouge">git lfs</code>.</li>
<li>…installs filters in <code class="language-plaintext highlighter-rouge">~/.gitconfig</code> when you run <code class="language-plaintext highlighter-rouge">git lfs install</code>.</li>
<li>…configures filters in <code class="language-plaintext highlighter-rouge">.gitattributes</code> when you run <code class="language-plaintext highlighter-rouge">git lfs track ...</code>.</li>
<li>…configures the pre-push hook whenever you run any <code class="language-plaintext highlighter-rouge">git-lfs</code> command (including the clean and smudge filters).</li>
<li>…puts large files away in <code class="language-plaintext highlighter-rouge">.git/lfs/objects</code>, named by their SHA-256 hashes, using the clean filter.</li>
<li>…pushes large files to the remote using the pre-push hook.</li>
</ul>
<h2 id="references">References</h2>
<ul>
<li><a href="https://github.com/git-lfs/git-lfs/blob/master/docs/spec.md">Git LFS Client Specification</a></li>
<li><a href="https://www.youtube.com/watch?v=w-037RcHjAA">Tim Pettersen — Tracking huge files with Git LFS</a></li>
<li><a href="https://www.atlassian.com/git/tutorials/git-lfs">BitBucket Git LFS Tutorials</a> — These are well-written and cover many practical details of using Git LFS</li>
<li><a href="https://git-scm.com/docs/githooks">Git Hooks Documentation</a></li>
<li><a href="https://git-scm.com/docs/gitattributes">Git Attributes Documentation</a></li>
<li><a href="https://github.com/git-lfs/git-lfs">Git LFS Source Code</a></li>
</ul>Git LFS (Large File Storage) helps you version large files in Git without having to download every version of it. This post explains the mechanisms that Git LFS uses internally to do its job, including:Python Typing: Resisting the Any type2020-08-29T15:29:00+00:002020-08-29T15:29:00+00:00https://jaredkhan.com/blog/resist-the-any-type<p><em>With typing in Python, we aim to restrict as many invalid programs as possible before they’re ever run.
This post covers some useful features for tightening up our types</em>:</p>
<ul>
<li><code class="language-plaintext highlighter-rouge">TypeVar</code> for when an unknown type appears multiple times in the same context <a href="#polymorphism-with-typevar">[Jump]</a></li>
<li><code class="language-plaintext highlighter-rouge">@overload</code> for when you have a function whose behaviour depends on its input <a href="#overloading-with-overload">[Jump]</a></li>
<li><code class="language-plaintext highlighter-rouge">Protocol</code> for supporting any type with the desired attributes and methods <a href="#static-duck-typing-with-protocol">[Jump]</a></li>
</ul>
<h2 id="introduction">Introduction</h2>
<p>Python itself doesn’t have a static type checker, but does provide a syntax for adding type annotations to your code:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># basics.py
</span>
<span class="kn">from</span> <span class="nn">typing</span> <span class="kn">import</span> <span class="n">List</span>
<span class="kn">from</span> <span class="nn">dataclasses</span> <span class="kn">import</span> <span class="n">dataclass</span>
<span class="c1"># We can annotate the input and output types of functions.
</span><span class="k">def</span> <span class="nf">fibonacci</span><span class="p">(</span><span class="n">n</span><span class="p">:</span> <span class="nb">int</span><span class="p">)</span> <span class="o">-></span> <span class="nb">int</span><span class="p">:</span>
<span class="k">assert</span> <span class="n">n</span> <span class="o">>=</span> <span class="mi">0</span>
<span class="k">if</span> <span class="n">n</span> <span class="o">==</span> <span class="mi">0</span><span class="p">:</span>
<span class="k">return</span> <span class="mi">0</span>
<span class="k">if</span> <span class="n">n</span> <span class="o">==</span> <span class="mi">1</span><span class="p">:</span>
<span class="k">return</span> <span class="mi">1</span>
<span class="k">return</span> <span class="n">fibonacci</span><span class="p">(</span><span class="n">n</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)</span> <span class="o">+</span> <span class="n">fibonacci</span><span class="p">(</span><span class="n">n</span> <span class="o">-</span> <span class="mi">2</span><span class="p">)</span>
<span class="o">@</span><span class="n">dataclass</span>
<span class="k">class</span> <span class="nc">Document</span><span class="p">:</span>
<span class="c1"># We can annotate class attributes too.
</span> <span class="c1"># We're using the generic type 'List' with parameter 'str'.
</span> <span class="c1"># This means the `lines` attribute is a list of strings.
</span> <span class="n">lines</span><span class="p">:</span> <span class="n">List</span><span class="p">[</span><span class="nb">str</span><span class="p">]</span>
<span class="c1"># This is invalid:
</span><span class="n">broken_doc</span> <span class="o">=</span> <span class="n">Document</span><span class="p">(</span><span class="n">lines</span><span class="o">=</span><span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">])</span>
<span class="c1"># This is valid:
</span><span class="n">fibonacci_doc</span> <span class="o">=</span> <span class="n">Document</span><span class="p">(</span>
<span class="n">lines</span><span class="o">=</span><span class="p">[</span>
<span class="sa">f</span><span class="s">"</span><span class="si">{</span><span class="n">n</span><span class="si">}</span><span class="s">th fib number is </span><span class="si">{</span><span class="n">fibonacci</span><span class="p">(</span><span class="n">n</span><span class="p">)</span><span class="si">}</span><span class="s">"</span>
<span class="k">for</span> <span class="n">n</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">20</span><span class="p">)</span>
<span class="p">]</span>
<span class="p">)</span>
</code></pre></div></div>
<p>IDEs like <a href="https://blog.jetbrains.com/pycharm/">PyCharm</a> and type-checking tools like <a href="http://mypy-lang.org">Mypy</a> and <a href="https://pyre-check.org">Pyre</a> use these annotations to identify type errors. The intended rules of this type-checking are mostly standardised in Python Enhancement Proposals (PEPs), so the behaviour is similar between the tools.</p>
<h3 id="getting-started-with-mypy">Getting started with Mypy</h3>
<p>Here, we’ll use Mypy for type-checking. To get started with Mypy, simply <code class="language-plaintext highlighter-rouge">pip install mypy</code>.
Here’s an example of running Mypy using the basics.py file from above:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># In your terminal:</span>
virtualenv mypy-venv <span class="nt">-p</span> python3.8
<span class="nb">source </span>mypy-venv/bin/activate
pip <span class="nb">install </span>mypy
mypy <span class="nt">--version</span>
<span class="c"># mypy 0.770</span>
mypy <span class="nt">--strict</span> python-typing-playground/basics.py
<span class="c"># warm-up/basics.py:22: error: List item 0 has incompatible type "int"; expected "str"</span>
<span class="c"># warm-up/basics.py:22: error: List item 1 has incompatible type "int"; expected "str"</span>
<span class="c"># warm-up/basics.py:22: error: List item 2 has incompatible type "int"; expected "str"</span>
<span class="c"># Found 3 errors in 1 file (checked 1 source file)</span>
</code></pre></div></div>
<p>I’m using Python 3.8. Many interesting typing features were added to the standard library in 3.8 but if you can’t use Python 3.8, <a href="https://pypi.org/project/typing-extensions/"><code class="language-plaintext highlighter-rouge">typing-extensions</code></a> often has your back.</p>
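<p>One common way to support older interpreters is to pick the import source based on the Python version. This is a minimal sketch using <code class="language-plaintext highlighter-rouge">Protocol</code> as the example feature; the <code class="language-plaintext highlighter-rouge">SupportsClose</code> class is just a placeholder I made up to show the import being used.</p>

```python
import sys

if sys.version_info >= (3, 8):
    from typing import Protocol
else:
    # Needs `pip install typing-extensions` on older interpreters.
    from typing_extensions import Protocol


class SupportsClose(Protocol):
    """Placeholder protocol: anything with a close() method."""

    def close(self) -> None: ...
```

<p>Type checkers such as Mypy understand <code class="language-plaintext highlighter-rouge">sys.version_info</code> comparisons like this one, so only the branch for the targeted Python version is checked.</p>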
<p>Mypy lets you get started with a large existing untyped code base and gradually add type hints over time.
It will simply silently treat all objects as <code class="language-plaintext highlighter-rouge">Any</code> if it can’t find type hints for them.
I recommend using the <a href="https://mypy.readthedocs.io/en/stable/command_line.html#cmdoption-mypy-strict"><code class="language-plaintext highlighter-rouge">--strict</code> flag</a> when first trying it out.
This mostly stops Mypy from silently assuming things are <code class="language-plaintext highlighter-rouge">Any</code>, which I think is helpful when trying to understand what Mypy can and cannot infer.</p>
<h3 id="whats-wrong-with-any-anyway">What’s wrong with <code class="language-plaintext highlighter-rouge">Any</code> anyway?</h3>
<p>Adding types restricts the way our functions, classes and variables can be used. This restriction is good because it helps catch whole categories of bugs before the code is ever run whilst also (usually) making the code easier to think about. Typically, our goal with typing in Python is:</p>
<ul>
<li>to restrict as many ‘invalid’ usages as possible,</li>
<li>whilst allowing all ‘valid’ usages,</li>
<li>and keeping our code relatively tidy.</li>
</ul>
<p>Sometimes, using only the syntax given above, we might get stuck on how to express the type of something and turn to the <code class="language-plaintext highlighter-rouge">Any</code> type.
The <code class="language-plaintext highlighter-rouge">Any</code> type is a magical type which, as far as the type-checker is concerned, supports any operation you could possibly want and is compatible with every other type:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="nn">typing</span> <span class="kn">import</span> <span class="n">Any</span>
<span class="k">def</span> <span class="nf">do_something_shady</span><span class="p">(</span><span class="n">magic_input</span><span class="p">:</span> <span class="n">Any</span><span class="p">)</span> <span class="o">-></span> <span class="n">Any</span><span class="p">:</span>
<span class="c1"># We can do any operation we want on an Any object.
</span> <span class="c1"># The type-checker will allow it
</span> <span class="c1"># and assume the result is an Any object
</span> <span class="k">return</span> <span class="n">magic_input</span><span class="p">.</span><span class="n">thing</span><span class="p">[</span><span class="s">"stuff"</span><span class="p">]</span> <span class="o">+</span> <span class="mi">42</span>
<span class="c1"># All types are compatible with Any,
# so even if we pass in a str, this is valid.
# If we try to run this, it will fail,
# since a string doesn't have a 'thing' attribute.
</span><span class="n">do_something_shady</span><span class="p">(</span><span class="n">magic_input</span><span class="o">=</span><span class="s">"just a string"</span><span class="p">)</span>
</code></pre></div></div>
<p>Clearly, using <code class="language-plaintext highlighter-rouge">Any</code> can allow many invalid programs.
I’d like to share a few useful Python type-system features that we can consider before giving in to the <code class="language-plaintext highlighter-rouge">Any</code> type’s magical allure, or otherwise loosening our types.</p>
<h2 id="polymorphism-with-typevar">Polymorphism with <code class="language-plaintext highlighter-rouge">TypeVar</code></h2>
<p>Type variables let us specify that some unknown type will appear multiple times in the same context,
whether that be multiple times in the same function signature, or multiple times in methods of a generic class.
We don’t know what the type will be, but we know it will be the same in all the places where it appears.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="nn">typing</span> <span class="kn">import</span> <span class="n">List</span><span class="p">,</span> <span class="n">TypeVar</span>
<span class="c1"># Declare a type variable
</span><span class="n">ValueType</span> <span class="o">=</span> <span class="n">TypeVar</span><span class="p">(</span><span class="s">"ValueType"</span><span class="p">)</span> <span class="c1"># The string in this constructor must match the variable name
</span>
<span class="k">def</span> <span class="nf">repeat</span><span class="p">(</span><span class="n">value</span><span class="p">:</span> <span class="n">ValueType</span><span class="p">,</span> <span class="n">count</span><span class="p">:</span> <span class="nb">int</span><span class="p">)</span> <span class="o">-></span> <span class="n">List</span><span class="p">[</span><span class="n">ValueType</span><span class="p">]:</span>
<span class="s">"""Returns the input value repeated `count` times in a list."""</span>
<span class="k">return</span> <span class="p">[</span><span class="n">value</span><span class="p">]</span> <span class="o">*</span> <span class="n">count</span>
<span class="c1"># This is valid
</span><span class="s">", "</span><span class="p">.</span><span class="n">join</span><span class="p">(</span><span class="n">repeat</span><span class="p">(</span><span class="s">"a"</span><span class="p">,</span> <span class="n">count</span><span class="o">=</span><span class="mi">5</span><span class="p">))</span>
<span class="c1"># This is invalid (can't do a string join on a list of ints)
</span><span class="s">", "</span><span class="p">.</span><span class="n">join</span><span class="p">(</span><span class="n">repeat</span><span class="p">(</span><span class="mi">3</span><span class="p">,</span> <span class="n">count</span><span class="o">=</span><span class="mi">5</span><span class="p">))</span>
</code></pre></div></div>
<p>Because <code class="language-plaintext highlighter-rouge">ValueType</code> is used in two different places in the same signature, Mypy will check, for each call to this function,
that the values used in those two places match types.
In this case, it checks that the type of value returned from the function is always a list of the type of value given to the function.
This also means Mypy knows more about the result of calling the function, and can be stricter there too.</p>
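<p>The same idea carries over to generic classes, where the type variable ties together the methods of one class. Here is a small sketch; the <code class="language-plaintext highlighter-rouge">Stack</code> class and the <code class="language-plaintext highlighter-rouge">ItemType</code> variable are illustrative names of my own.</p>

```python
from typing import Generic, List, TypeVar

ItemType = TypeVar("ItemType")


class Stack(Generic[ItemType]):
    """A last-in, first-out container whose items all share one type."""

    def __init__(self) -> None:
        self._items: List[ItemType] = []

    def push(self, item: ItemType) -> None:
        self._items.append(item)

    def pop(self) -> ItemType:
        return self._items.pop()


names: Stack[str] = Stack()
names.push("alice")
# Valid: pop() is known to return a str, so .upper() is allowed.
print(names.pop().upper())

# A type checker should reject this, since 42 is not a str:
# names.push(42)
```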
<h3 id="using-bound">Using bound=</h3>
<p>So <code class="language-plaintext highlighter-rouge">TypeVar</code> is great for capturing this idea of an <em>unknown</em> type appearing multiple times.
Sometimes we do know something about the type, and maybe we want to use a certain field that we know will exist on the type,
but we still want to capture and check the fact that it appears multiple times.</p>
<p>By default, a TypeVar will bind to any type, all the way up to <code class="language-plaintext highlighter-rouge">object</code>, but we can put an ‘upper bound’ on this using the <code class="language-plaintext highlighter-rouge">bound</code> parameter.
This says ‘only let this TypeVar bind to a subtype of X’, where X is some type that we care about.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="nn">typing</span> <span class="kn">import</span> <span class="n">List</span><span class="p">,</span> <span class="n">TypeVar</span>
<span class="kn">from</span> <span class="nn">dataclasses</span> <span class="kn">import</span> <span class="n">dataclass</span>
<span class="o">@</span><span class="n">dataclass</span>
<span class="k">class</span> <span class="nc">Animal</span><span class="p">:</span>
<span class="n">name</span><span class="p">:</span> <span class="nb">str</span>
<span class="k">class</span> <span class="nc">Dog</span><span class="p">(</span><span class="n">Animal</span><span class="p">):</span>
<span class="k">def</span> <span class="nf">pat</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span> <span class="o">-></span> <span class="bp">None</span><span class="p">:</span>
<span class="k">print</span><span class="p">(</span><span class="sa">f</span><span class="s">"gave </span><span class="si">{</span><span class="bp">self</span><span class="p">.</span><span class="n">name</span><span class="si">}</span><span class="s"> a good pat"</span><span class="p">)</span>
<span class="k">class</span> <span class="nc">Cat</span><span class="p">(</span><span class="n">Animal</span><span class="p">):</span>
<span class="k">def</span> <span class="nf">stroke</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span> <span class="o">-></span> <span class="bp">None</span><span class="p">:</span>
<span class="k">print</span><span class="p">(</span><span class="sa">f</span><span class="s">"stroked </span><span class="si">{</span><span class="bp">self</span><span class="p">.</span><span class="n">name</span><span class="si">}</span><span class="s">"</span><span class="p">)</span>
<span class="n">AnimalType</span> <span class="o">=</span> <span class="n">TypeVar</span><span class="p">(</span><span class="s">'AnimalType'</span><span class="p">,</span> <span class="n">bound</span><span class="o">=</span><span class="n">Animal</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">sort_animals_by_name</span><span class="p">(</span><span class="n">items</span><span class="p">:</span> <span class="n">List</span><span class="p">[</span><span class="n">AnimalType</span><span class="p">])</span> <span class="o">-></span> <span class="n">List</span><span class="p">[</span><span class="n">AnimalType</span><span class="p">]:</span>
<span class="k">return</span> <span class="nb">sorted</span><span class="p">(</span><span class="n">items</span><span class="p">,</span> <span class="n">key</span><span class="o">=</span><span class="k">lambda</span> <span class="n">animal</span><span class="p">:</span> <span class="n">animal</span><span class="p">.</span><span class="n">name</span><span class="p">)</span>
<span class="c1"># This is valid (all the inputs are dogs so all have the 'pat' method)
</span><span class="n">sorted_dogs</span> <span class="o">=</span> <span class="n">sort_animals_by_name</span><span class="p">([</span><span class="n">Dog</span><span class="p">(</span><span class="n">name</span><span class="o">=</span><span class="s">"spots"</span><span class="p">),</span> <span class="n">Dog</span><span class="p">(</span><span class="n">name</span><span class="o">=</span><span class="s">"buzz"</span><span class="p">)])</span>
<span class="n">sorted_dogs</span><span class="p">[</span><span class="mi">0</span><span class="p">].</span><span class="n">pat</span><span class="p">()</span>
<span class="c1"># This is invalid
</span><span class="n">sorted_dogs</span><span class="p">[</span><span class="mi">0</span><span class="p">].</span><span class="n">stroke</span><span class="p">()</span>
<span class="c1"># This is invalid (cannot use str due to bound=Animal)
</span><span class="n">not_animals</span> <span class="o">=</span> <span class="p">[</span><span class="s">"John"</span><span class="p">,</span> <span class="s">"Joan"</span><span class="p">,</span> <span class="s">"Jan"</span><span class="p">]</span>
<span class="n">sort_animals_by_name</span><span class="p">(</span><span class="n">not_animals</span><span class="p">)</span>
</code></pre></div></div>
<p>Because we specified that <code class="language-plaintext highlighter-rouge">AnimalType</code> must be a subtype of <code class="language-plaintext highlighter-rouge">Animal</code>, the type-checker allows us to use the <code class="language-plaintext highlighter-rouge">name</code> property from <code class="language-plaintext highlighter-rouge">Animal</code> within our function.
Note that we still aren’t able to use a more specific method like <code class="language-plaintext highlighter-rouge">pat</code> within the body of <code class="language-plaintext highlighter-rouge">sort_animals_by_name</code>, since it doesn’t always exist on the upper bound, <code class="language-plaintext highlighter-rouge">Animal</code>.</p>
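<p>It can help to compare a bound TypeVar with simply annotating everything as <code class="language-plaintext highlighter-rouge">Animal</code>. In this sketch, the two <code class="language-plaintext highlighter-rouge">first_alphabetically</code> functions are hypothetical helpers with identical bodies; only the signatures differ.</p>

```python
from dataclasses import dataclass
from typing import TypeVar


@dataclass
class Animal:
    name: str


class Dog(Animal):
    def pat(self) -> None:
        print(f"gave {self.name} a good pat")


AnimalType = TypeVar("AnimalType", bound=Animal)


def first_alphabetically_loose(a: Animal, b: Animal) -> Animal:
    return a if a.name <= b.name else b


def first_alphabetically(a: AnimalType, b: AnimalType) -> AnimalType:
    return a if a.name <= b.name else b


spots, buzz = Dog(name="spots"), Dog(name="buzz")

# Valid: the TypeVar carries `Dog` through to the return type.
first_alphabetically(spots, buzz).pat()

# A type checker should reject this even though it runs fine,
# because the result is only known to be an Animal:
# first_alphabetically_loose(spots, buzz).pat()
```

<p>Both versions can use <code class="language-plaintext highlighter-rouge">name</code> in the body, but only the TypeVar version remembers which kind of animal came in.</p>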
<h3 id="misconceptions-about-scoping">Misconceptions about scoping</h3>
<p>Because of the strange syntax for declaring a type variable, where we create a global object, it’s easy to get confused about their scoping rules:</p>
<p><strong>Question: Is the following valid?</strong></p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="nn">typing</span> <span class="kn">import</span> <span class="n">List</span><span class="p">,</span> <span class="n">TypeVar</span>
<span class="n">T</span> <span class="o">=</span> <span class="n">TypeVar</span><span class="p">(</span><span class="s">"T"</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">repeat</span><span class="p">(</span><span class="n">value</span><span class="p">:</span> <span class="n">T</span><span class="p">,</span> <span class="n">times</span><span class="p">:</span> <span class="nb">int</span><span class="p">)</span> <span class="o">-></span> <span class="n">List</span><span class="p">[</span><span class="n">T</span><span class="p">]:</span>
<span class="k">return</span> <span class="p">[</span><span class="n">value</span><span class="p">]</span> <span class="o">*</span> <span class="n">times</span>
<span class="k">def</span> <span class="nf">identity</span><span class="p">(</span><span class="n">x</span><span class="p">:</span> <span class="n">T</span><span class="p">)</span> <span class="o">-></span> <span class="n">T</span><span class="p">:</span>
<span class="k">return</span> <span class="n">x</span>
<span class="n">repeat</span><span class="p">(</span><span class="s">"echo"</span><span class="p">,</span> <span class="n">times</span><span class="o">=</span><span class="mi">5</span><span class="p">)</span>
<span class="n">repeat</span><span class="p">(</span><span class="mi">42</span><span class="p">,</span> <span class="n">times</span><span class="o">=</span><span class="mi">5</span><span class="p">)</span>
<span class="n">identity</span><span class="p">(</span><span class="mi">42</span><span class="p">)</span>
</code></pre></div></div>
<p>After all, we seem to be asking <code class="language-plaintext highlighter-rouge">T</code> to be a string and then to be an int.</p>
<p>The answer is <strong>yes, this is valid</strong>.
Generic functions using TypeVars can happily be called multiple times with different input types.
The two calls to the function are unrelated as far as this TypeVar is concerned.
Similarly, TypeVars can happily be recycled across multiple independent functions.
You often only need multiple TypeVars if you need multiple distinct types within the <em>same</em> function signature.</p>
<p>See <a href="https://www.python.org/dev/peps/pep-0484/#scoping-rules-for-type-variables">Scoping rules for type variables within PEP-484</a> for more examples.</p>
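<p>For contrast, here is a sketch of a case where two distinct TypeVars are needed, because two unrelated types appear in the same signature. The names <code class="language-plaintext highlighter-rouge">KeyType</code>, <code class="language-plaintext highlighter-rouge">ValueType</code> and <code class="language-plaintext highlighter-rouge">pair_up</code> are my own, chosen for illustration.</p>

```python
from typing import Dict, List, TypeVar

KeyType = TypeVar("KeyType")
ValueType = TypeVar("ValueType")


def pair_up(keys: List[KeyType], values: List[ValueType]) -> Dict[KeyType, ValueType]:
    """Builds a dict from two parallel lists."""
    return dict(zip(keys, values))


ages = pair_up(["ann", "bob"], [31, 27])
# Valid: the values are known to be ints, so arithmetic is allowed.
print(ages["ann"] + 1)

# A type checker should reject this, since the values are not strs:
# ages["ann"].upper()
```

<p>Had we used a single TypeVar for both parameters, the type checker would have had to find one type covering both the keys and the values, and we would lose the distinction between them.</p>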
<h2 id="overloading-with-overload">Overloading with <code class="language-plaintext highlighter-rouge">@overload</code></h2>
<p>Sometimes we have an <em>overloaded</em> function, one which has different behaviour depending on the input given.</p>
<p>Overloading in Python is quite different to other languages.
In other languages, we might write multiple function implementations with the same function name but different arguments (either in type or in name)
and when that function is called, the correct implementation would be chosen based on the given arguments.</p>
<p>In Python, we are not going to define multiple implementations
(remember our type annotations aren’t read at runtime).
In Python we just have one implementation, which manually checks the types and does the right thing.
All <code class="language-plaintext highlighter-rouge">@overload</code> lets us do is tell the <em>type checker</em> which combinations of parameters and outputs are valid.</p>
<p>We do this by providing multiple <code class="language-plaintext highlighter-rouge">@overload</code> signatures before defining our actual implementation.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="nn">typing</span> <span class="kn">import</span> <span class="n">Optional</span><span class="p">,</span> <span class="n">overload</span>
<span class="kn">import</span> <span class="nn">math</span>
<span class="o">@</span><span class="n">overload</span>
<span class="k">def</span> <span class="nf">get_circle_area</span><span class="p">(</span><span class="o">*</span><span class="p">,</span> <span class="n">radius</span><span class="p">:</span> <span class="nb">float</span><span class="p">)</span> <span class="o">-></span> <span class="nb">float</span><span class="p">:</span> <span class="p">...</span>
<span class="o">@</span><span class="n">overload</span>
<span class="k">def</span> <span class="nf">get_circle_area</span><span class="p">(</span><span class="o">*</span><span class="p">,</span> <span class="n">circumference</span><span class="p">:</span> <span class="nb">float</span><span class="p">)</span> <span class="o">-></span> <span class="nb">float</span><span class="p">:</span> <span class="p">...</span>
<span class="k">def</span> <span class="nf">get_circle_area</span><span class="p">(</span><span class="o">*</span><span class="p">,</span>
    <span class="n">radius</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="nb">float</span><span class="p">]</span> <span class="o">=</span> <span class="bp">None</span><span class="p">,</span>
    <span class="n">circumference</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="nb">float</span><span class="p">]</span> <span class="o">=</span> <span class="bp">None</span>
<span class="p">)</span> <span class="o">-></span> <span class="nb">float</span><span class="p">:</span>
    <span class="s">"""
    Takes either a radius or circumference of a circle,
    and returns the area of that circle.
    """</span>
    <span class="c1"># Check we've been given exactly one of the two forms of input
</span>    <span class="k">if</span> <span class="n">radius</span> <span class="ow">is</span> <span class="ow">not</span> <span class="bp">None</span> <span class="ow">and</span> <span class="n">circumference</span> <span class="ow">is</span> <span class="ow">not</span> <span class="bp">None</span><span class="p">:</span>
        <span class="k">raise</span> <span class="nb">ValueError</span><span class="p">(</span><span class="s">"Can't use both radius and circumference"</span><span class="p">)</span>
    <span class="k">elif</span> <span class="n">radius</span> <span class="ow">is</span> <span class="ow">not</span> <span class="bp">None</span><span class="p">:</span>
        <span class="n">canonical_radius</span> <span class="o">=</span> <span class="n">radius</span>
    <span class="k">elif</span> <span class="n">circumference</span> <span class="ow">is</span> <span class="ow">not</span> <span class="bp">None</span><span class="p">:</span>
        <span class="n">canonical_radius</span> <span class="o">=</span> <span class="n">circumference</span> <span class="o">/</span> <span class="n">math</span><span class="p">.</span><span class="n">tau</span>
    <span class="k">else</span><span class="p">:</span>
        <span class="k">raise</span> <span class="nb">ValueError</span><span class="p">(</span><span class="s">"Give either a radius or circumference"</span><span class="p">)</span>
    <span class="k">return</span> <span class="n">math</span><span class="p">.</span><span class="n">pi</span> <span class="o">*</span> <span class="n">canonical_radius</span> <span class="o">*</span> <span class="n">canonical_radius</span>
<span class="c1"># Invalid (want exactly one of radius or circumference):
</span><span class="n">get_circle_area</span><span class="p">()</span>
<span class="n">get_circle_area</span><span class="p">(</span><span class="n">radius</span><span class="o">=</span><span class="bp">None</span><span class="p">)</span>
<span class="n">get_circle_area</span><span class="p">(</span><span class="n">circumference</span><span class="o">=</span><span class="bp">None</span><span class="p">)</span>
<span class="n">get_circle_area</span><span class="p">(</span><span class="n">radius</span><span class="o">=</span><span class="mf">1.0</span><span class="p">,</span> <span class="n">circumference</span><span class="o">=</span><span class="bp">None</span><span class="p">)</span>
<span class="n">get_circle_area</span><span class="p">(</span><span class="n">radius</span><span class="o">=</span><span class="mf">1.0</span><span class="p">,</span> <span class="n">circumference</span><span class="o">=</span><span class="mf">3.0</span><span class="p">)</span>
<span class="n">get_circle_area</span><span class="p">(</span><span class="mf">3.0</span><span class="p">)</span> <span class="c1"># ambiguous without the argument keyword
</span>
<span class="c1"># Valid:
</span><span class="n">get_circle_area</span><span class="p">(</span><span class="n">radius</span><span class="o">=</span><span class="mf">1.0</span><span class="p">)</span>
<span class="n">get_circle_area</span><span class="p">(</span><span class="n">circumference</span><span class="o">=</span><span class="n">math</span><span class="p">.</span><span class="n">tau</span><span class="p">)</span>
</code></pre></div></div>
<p>We used two <code class="language-plaintext highlighter-rouge">@overload</code> signatures (with empty implementations) before specifying the implementation itself.
The type signature of the implementation itself is only used for type-checking that implementation, not for type-checking usages of the function.
The type checker will also make sure that our implementation supports all the <code class="language-plaintext highlighter-rouge">@overload</code> signatures described.
For example, if we have an <code class="language-plaintext highlighter-rouge">@overload</code> which accepts a string argument, but the type of the implementation only takes floats, Mypy will throw an error.</p>
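<p>As a minimal sketch of that last point, here is a hypothetical <code class="language-plaintext highlighter-rouge">double</code> function (not part of the example above) with one overload for floats and one for strings. The implementation is annotated with a <code class="language-plaintext highlighter-rouge">Union</code> so that it accepts everything the overloads accept.</p>

```python
from typing import Union, overload


# `double` is a hypothetical function, used only to illustrate the rule.
@overload
def double(value: float) -> float: ...
@overload
def double(value: str) -> str: ...
def double(value: Union[float, str]) -> Union[float, str]:
    # The implementation must accept every argument type that the overloads
    # accept. If this parameter were annotated as `float` alone, a type
    # checker such as Mypy should reject the `str` overload above.
    return value * 2


print(double(2.0))
print(double("ab"))
```

<p>Callers only ever see the two overload signatures, so a type checker treats <code class="language-plaintext highlighter-rouge">double("ab")</code> as returning a <code class="language-plaintext highlighter-rouge">str</code> rather than the wider union.</p>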
<p>As a side note, calling this function with a number but without the argument keyword (e.g. <code class="language-plaintext highlighter-rouge">radius=</code>) would be ambiguous: we wouldn’t know if the number was a radius or a circumference.
The asterisk in the function signatures enforces that keywords be used.
For more explanation, see <a href="https://www.python.org/dev/peps/pep-3102/#specification">Keyword-Only Arguments — Specification</a></p>
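<p>To see the asterisk at work in isolation, here is a small sketch using a hypothetical single-argument helper, <code class="language-plaintext highlighter-rouge">circle_area_from_radius</code>. A positional call is rejected by Python itself at call time, as well as being flagged by the type checker.</p>

```python
import math


def circle_area_from_radius(*, radius: float) -> float:
    """Hypothetical helper: everything after the bare `*` is keyword-only."""
    return math.pi * radius * radius


try:
    circle_area_from_radius(1.0)  # positional argument, so this raises
except TypeError as error:
    print(f"Rejected: {error}")

print(circle_area_from_radius(radius=1.0))  # keyword argument is accepted
```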
<h2 id="static-duck-typing-with-protocol">Static Duck Typing with <code class="language-plaintext highlighter-rouge">Protocol</code></h2>
<p>If we’re restricting the input types of a function, we sometimes only care that it has certain attributes and methods,
not whether it’s a <em>subclass</em> of a particular class.
Furthermore, it’s not always possible to modify a class to inherit from a certain base class,
because it might be from a library that you don’t control.</p>
<p><code class="language-plaintext highlighter-rouge">Protocol</code> lets us capture these attribute and method constraints without requiring the type to inherit from any particular class.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># If you can't use Python 3.8, you can also import Protocol from typing-extensions
# https://pypi.org/project/typing-extensions/
</span><span class="kn">from</span> <span class="nn">abc</span> <span class="kn">import</span> <span class="n">abstractmethod</span>
<span class="kn">from</span> <span class="nn">typing</span> <span class="kn">import</span> <span class="n">Protocol</span>
<span class="kn">from</span> <span class="nn">dataclasses</span> <span class="kn">import</span> <span class="n">dataclass</span>
<span class="c1"># Setup: we'll define a few classes which happen to have a 'name'
</span>
<span class="o">@</span><span class="n">dataclass</span>
<span class="k">class</span> <span class="nc">Person</span><span class="p">:</span>
    <span class="n">first_name</span><span class="p">:</span> <span class="nb">str</span>
    <span class="n">last_name</span><span class="p">:</span> <span class="nb">str</span>
    <span class="n">height</span><span class="p">:</span> <span class="nb">float</span>

    <span class="o">@</span><span class="nb">property</span>
    <span class="k">def</span> <span class="nf">name</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span> <span class="o">-></span> <span class="nb">str</span><span class="p">:</span>
        <span class="k">return</span> <span class="bp">self</span><span class="p">.</span><span class="n">first_name</span> <span class="o">+</span> <span class="s">" "</span> <span class="o">+</span> <span class="bp">self</span><span class="p">.</span><span class="n">last_name</span>

    <span class="k">def</span> <span class="nf">greet</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span> <span class="o">-></span> <span class="nb">str</span><span class="p">:</span>
        <span class="k">return</span> <span class="sa">f</span><span class="s">"Hi, my name is </span><span class="si">{</span><span class="bp">self</span><span class="p">.</span><span class="n">name</span><span class="si">}</span><span class="s">."</span>

<span class="o">@</span><span class="n">dataclass</span>
<span class="k">class</span> <span class="nc">Company</span><span class="p">:</span>
    <span class="n">name</span><span class="p">:</span> <span class="nb">str</span>
    <span class="n">address</span><span class="p">:</span> <span class="nb">str</span>

    <span class="k">def</span> <span class="nf">greet</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span> <span class="o">-></span> <span class="nb">str</span><span class="p">:</span>
        <span class="k">return</span> <span class="sa">f</span><span class="s">"We are </span><span class="si">{</span><span class="bp">self</span><span class="p">.</span><span class="n">name</span><span class="si">}</span><span class="s">, find us at </span><span class="si">{</span><span class="bp">self</span><span class="p">.</span><span class="n">address</span><span class="si">}</span><span class="s">."</span>

<span class="o">@</span><span class="n">dataclass</span>
<span class="k">class</span> <span class="nc">Dog</span><span class="p">:</span>
    <span class="n">name</span><span class="p">:</span> <span class="nb">str</span>
    <span class="n">color</span><span class="p">:</span> <span class="nb">str</span>

    <span class="k">def</span> <span class="nf">greet</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span> <span class="o">-></span> <span class="nb">str</span><span class="p">:</span>
        <span class="k">return</span> <span class="s">"Woof!"</span>

<span class="k">class</span> <span class="nc">Greetable</span><span class="p">(</span><span class="n">Protocol</span><span class="p">):</span>
    <span class="s">"""
    Matches any type with
    a readable (but not necessarily writeable) `name`
    and a `greet` method that returns a string.
    """</span>
    <span class="o">@</span><span class="nb">property</span>
    <span class="k">def</span> <span class="nf">name</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span> <span class="o">-></span> <span class="nb">str</span><span class="p">:</span> <span class="p">...</span>

    <span class="k">def</span> <span class="nf">greet</span><span class="p">(</span><span class="bp">self</span><span class="p">)</span> <span class="o">-></span> <span class="nb">str</span><span class="p">:</span> <span class="p">...</span>

<span class="k">def</span> <span class="nf">introduce</span><span class="p">(</span><span class="n">greetable</span><span class="p">:</span> <span class="n">Greetable</span><span class="p">)</span> <span class="o">-></span> <span class="bp">None</span><span class="p">:</span>
    <span class="k">print</span><span class="p">(</span>
        <span class="sa">f</span><span class="s">"This thing is named </span><span class="si">{</span><span class="n">greetable</span><span class="p">.</span><span class="n">name</span><span class="si">}</span><span class="s">"</span>
        <span class="sa">f</span><span class="s">" and it says '</span><span class="si">{</span><span class="n">greetable</span><span class="p">.</span><span class="n">greet</span><span class="p">()</span><span class="si">}</span><span class="s">'"</span>
    <span class="p">)</span>

<span class="k">class</span> <span class="nc">Renameable</span><span class="p">(</span><span class="n">Protocol</span><span class="p">):</span>
    <span class="s">"""Matches any type with a read/write `name`."""</span>
    <span class="n">name</span><span class="p">:</span> <span class="nb">str</span>

<span class="k">def</span> <span class="nf">rename</span><span class="p">(</span><span class="n">renameable</span><span class="p">:</span> <span class="n">Renameable</span><span class="p">,</span> <span class="n">new_name</span><span class="p">:</span> <span class="nb">str</span><span class="p">)</span> <span class="o">-></span> <span class="bp">None</span><span class="p">:</span>
    <span class="n">renameable</span><span class="p">.</span><span class="n">name</span> <span class="o">=</span> <span class="n">new_name</span>

<span class="n">spots</span> <span class="o">=</span> <span class="n">Dog</span><span class="p">(</span><span class="n">name</span><span class="o">=</span><span class="s">"spots"</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s">"white"</span><span class="p">)</span>
<span class="n">acme</span> <span class="o">=</span> <span class="n">Company</span><span class="p">(</span><span class="n">name</span><span class="o">=</span><span class="s">"Acme Inc."</span><span class="p">,</span> <span class="n">address</span><span class="o">=</span><span class="s">"123 Industry Ave."</span><span class="p">)</span>
<span class="n">john</span> <span class="o">=</span> <span class="n">Person</span><span class="p">(</span><span class="n">first_name</span><span class="o">=</span><span class="s">"John"</span><span class="p">,</span> <span class="n">last_name</span><span class="o">=</span><span class="s">"Johnson"</span><span class="p">,</span> <span class="n">height</span><span class="o">=</span><span class="mf">1.96</span><span class="p">)</span>
<span class="c1"># Valid things
</span><span class="n">introduce</span><span class="p">(</span><span class="n">spots</span><span class="p">)</span>
<span class="n">introduce</span><span class="p">(</span><span class="n">acme</span><span class="p">)</span>
<span class="n">introduce</span><span class="p">(</span><span class="n">john</span><span class="p">)</span>
<span class="n">rename</span><span class="p">(</span><span class="n">spots</span><span class="p">,</span> <span class="s">"spotsy"</span><span class="p">)</span>
<span class="c1"># Invalid things
</span><span class="n">introduce</span><span class="p">(</span><span class="mi">42</span><span class="p">)</span>
<span class="n">introduce</span><span class="p">(</span><span class="s">"blah"</span><span class="p">)</span>
<span class="n">rename</span><span class="p">(</span><span class="n">john</span><span class="p">,</span> <span class="s">"J-man"</span><span class="p">)</span> <span class="c1"># Person's `name` is read-only
</span></code></pre></div></div>
<p>None of the classes we defined inherit from <code class="language-plaintext highlighter-rouge">Greetable</code> (explicitly subclassing a protocol is allowed, but never required)
yet they all conform to it because they have a <code class="language-plaintext highlighter-rouge">name</code> and a <code class="language-plaintext highlighter-rouge">greet</code> method with the appropriate types.
As such, they can all be passed to a function that takes a <code class="language-plaintext highlighter-rouge">Greetable</code>.</p>
<p>When specifying attributes that must exist on the type, we saw both:</p>
<ul>
<li>the <code class="language-plaintext highlighter-rouge">@property</code> form, which lets us specify that we only need to read this value,</li>
<li>and the <code class="language-plaintext highlighter-rouge">name: str</code> form which is a shorthand for an attribute that is both readable and writeable.</li>
</ul>
<p>We can also opt in to these Protocols being checkable at runtime. For more detail, see <a href="https://www.python.org/dev/peps/pep-0544/#runtime-checkable-decorator-and-narrowing-types-by-isinstance">@runtime_checkable decorator within PEP 544</a></p>
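<p>As a rough sketch of what that opt-in looks like, the example below decorates a hypothetical method-only protocol, <code class="language-plaintext highlighter-rouge">Greeter</code>, with <code class="language-plaintext highlighter-rouge">@runtime_checkable</code> so that it can be used with <code class="language-plaintext highlighter-rouge">isinstance</code>. The <code class="language-plaintext highlighter-rouge">Robot</code> class and <code class="language-plaintext highlighter-rouge">maybe_greet</code> function are made up for illustration. Note that the runtime check only looks for the presence of the protocol's members; it does not verify their signatures or return types.</p>

```python
from dataclasses import dataclass
from typing import Protocol, runtime_checkable


@runtime_checkable
class Greeter(Protocol):
    """Hypothetical protocol: anything with a `greet` method."""
    def greet(self) -> str: ...


@dataclass
class Robot:
    model: str

    def greet(self) -> str:
        return f"Beep, I am {self.model}."


def maybe_greet(thing: object) -> str:
    # isinstance works here only because Greeter is @runtime_checkable.
    # It checks that `greet` exists, not that it has the right signature.
    if isinstance(thing, Greeter):
        return thing.greet()
    return "(silence)"


print(maybe_greet(Robot(model="RX-1")))
print(maybe_greet(42))
```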
<h2 id="conclusion">Conclusion</h2>
<p>In this post we’ve seen some useful features for tightening our types:</p>
<ul>
<li>Use <code class="language-plaintext highlighter-rouge">TypeVar</code> to express that a single unknown type will appear multiple times in the same context</li>
<li>Use <code class="language-plaintext highlighter-rouge">@overload</code> when you have a function whose behaviour depends on its input</li>
<li>Use <code class="language-plaintext highlighter-rouge">Protocol</code> when you want to support any type which happens to have the attributes and methods that you need.</li>
</ul>With typing in Python, we aim to restrict as many invalid programs as possible before they’re ever run. This post covers some useful features for tightening up our types:Should we get rid of British Summer Time?2020-08-22T14:15:42+00:002020-08-22T14:15:42+00:00https://jaredkhan.com/blog/get-rid-of-daylight-savings<p><em>British Summer Time is the UK’s rendition of the ever-controversial concept of Daylight-Saving Time (DST).
With the EU counting down to a co-ordinated end to DST in 2021, just after the end of the Brexit transition period,
the jury is out on what the UK will do next. Here, I explore some of the issues around this question.</em></p>
<h2 id="a-brief-history-of-daylight-saving-time-in-the-uk">A brief history of Daylight Saving Time in the UK</h2>
<p>The year was 1895. George Hudson was beginning to cause quite a stir on the New Zealand time-keeping scene. A Post Office worker by trade yet an astronomer and insect scientist by passion, Hudson cared about the opportunity to utilise daylight hours to their fullest. He presented his case to the Wellington Philosophical Society. From <a href="http://rsnz.natlib.govt.nz/volume/rsnz_31/rsnz_31_00_008570.html">Transactions and Proceedings of the Royal Society of New Zealand 1868-1961</a>:</p>
<div style="display: flex; align-items: flex-start; align-content: space-around; margin: 3em 0 2em 0; justify-content: center; flex-wrap: wrap;">
<img src="/assets/images/daylight_saving/george_hudson.png" alt="George Hudson: postal clerk, insect scientist, astronomer" title="George Hudson: postal clerk, insect scientist, astronomer" style="width: 30%; min-width: 10em; max-width: 15em; flex: 1; margin: 0 2em 2em 0;" />
<blockquote style="flex: 2; min-width: 20em;">
We cannot individually alter our times of going to bed or getting up, but must fall in with the habits of the majority ... those who desire to utilise the early-morning daylight are compelled to take some of their recreation before their daily work and some afterwards, which in many cases results in their having to forego pursuits that they would be enabled to follow successfully if their daylight leisure were continuous.
</blockquote>
</div>
<p>Though Hudson seems to be the first to write this idea down in so many words, it took tens of years, a world war, and persistent campaigning by other proponents of the idea for DST to be implemented anywhere in the world.</p>
<table>
<tbody>
<tr>
<td>1916</td>
<td>British Summer Time was adopted (Summer Time Act 1916), after Germany had already implemented DST. The purpose at the time was to preserve coal (sunlight at more appropriate hours means less artificial light).</td>
</tr>
<tr>
<td>1941-1945</td>
<td>Clocks were brought forward an extra hour during World War Two (GMT+1 in winter, GMT+2 in summer, ‘British Double Summer Time’)</td>
</tr>
<tr>
<td>1968-1971</td>
<td>The UK experimented with year-round GMT+1 (‘the British Standard Time experiment’)</td>
</tr>
<tr>
<td>1997</td>
<td>The EU started prescribing the clock change and the dates on which it should happen throughout its member states (<a href="https://op.europa.eu/en/publication-detail/-/publication/58470b17-0729-4851-bf61-6dfc4ded1fa6/language-en">Eighth Directive 97/44/EC on summer-time arrangements</a>)</td>
</tr>
</tbody>
</table>
<h2 id="british-standard-time-experiment">British Standard Time Experiment</h2>
<p>We can gain some insights about our question by looking back on the British Standard Time experiment and the <a href="https://api.parliament.uk/historic-hansard/commons/1970/dec/02/british-standard-time">House of Commons British Standard Time debate</a> that followed.
The experiment was largely met with a ‘shrug’ by the UK public:</p>
<blockquote>
<p>60% had no strong views,
35% were in favour of retaining the new system,
5% were against the new system <a href="https://api.parliament.uk/historic-hansard/commons/1970/dec/02/british-standard-time#S5CV0807P0_19701202_HOC_333">[ref]</a></p>
</blockquote>
<p>The rest of the debate was similarly inconclusive:</p>
<ul>
<li>Some concluded from the experiment that <strong>road accidents</strong> had decreased though others were quick to point out that the breathalyser was introduced shortly before the experiment began, complicating the analysis. Analysis tried to avoid this complication by only looking at the change in accidents in two ‘rush hour’ windows, arguing that drunk driving accidents typically happen later at night. (<a href="https://trl.co.uk/sites/default/files/RR228.pdf">The Potential Effects On Road Casualties Of Double British Summer Time — 2.1</a>)</li>
<li><a href="https://api.parliament.uk/historic-hansard/commons/1970/dec/02/british-standard-time#column_1349">Manual labourers</a>, especially in the north, <a href="https://api.parliament.uk/historic-hansard/commons/1970/dec/02/british-standard-time#S5CV0807P0_19701202_HOC_351">as well as postmen</a> who had to do morning deliveries, complained of <strong>going to work in the dark</strong></li>
<li><a href="https://api.parliament.uk/historic-hansard/commons/1970/dec/02/british-standard-time#S5CV0807P0_19701202_HOC_359">Some mothers complained</a> that their <strong>children had to travel to school in the dark</strong>. <a href="https://api.parliament.uk/historic-hansard/commons/1970/dec/02/british-standard-time#S5CV0807P0_19701202_HOC_329">Others enjoyed</a> that their children were coming home in the brighter hours of the afternoon.</li>
<li>Many argued that the <strong>additional time for leisure activities</strong> provided by year-round GMT+1 was particularly important. (Recall, this is the same argument George Hudson used when he first argued <em>for</em> DST)</li>
<li><a href="https://api.parliament.uk/historic-hansard/commons/1970/dec/02/british-standard-time#column_1355">Farmers were against the change</a>. Or, wait, were they in favour of it?</li>
</ul>
<h2 id="the-scotland-problem">The Scotland problem</h2>
<p>The country was also divided by its wide range of latitude (north-ness). Daylight hours change with the seasons and do so more dramatically the further you are from the equator. This means there is a stark difference between the <a href="https://www.timeanddate.com/sun/uk/edinburgh">daylight hours</a> of England and Scotland:</p>
<table>
<thead>
<tr>
<th> </th>
<th>Shortest Day</th>
<th>Longest Day</th>
</tr>
</thead>
<tbody>
<tr>
<td>Edinburgh</td>
<td>7 hours</td>
<td>17 ½ hours</td>
</tr>
<tr>
<td>London</td>
<td>8 hours</td>
<td>16 ½ hours</td>
</tr>
</tbody>
</table>
<p>This makes having a daylight saving period more appealing in Scotland because it’s harder to find a single time zone there that works all year round without either wasting morning daylight in summer or having some very dark mornings in winter. During the experiment, Edinburgh would have days where there was no daylight until 9:43 (whilst London would have daylight from 9:06).</p>
<h2 id="further-considerations">Further Considerations</h2>
<p>Of course, a lot has changed since 1970 and the public debate has continued on whether we should keep fiddling with our clocks. Parliament has many times debated changes to the system, notably including the failed <a href="https://services.parliament.uk/bills/2010-11/daylightsaving.html">Daylight Saving Bill 2010</a>. These have repeatedly failed or run out of parliamentary time. Further considerations on the issue have included:</p>
<h3 id="energy">Energy</h3>
<p>There are many hunches one can form around how changing BST would impact energy usage:</p>
<ul>
<li>Synchronisation with mainland Europe could lead to increased peak-time energy costs since peaks will be shared across UK and Europe. The UK imports <a href="https://gridwatch.co.uk/int">about 5%</a> of its electricity from France, and also imports from the Netherlands and Belgium.</li>
<li>BST might cause some to wake up earlier when it’s not warm or bright yet and so could increase heating and lighting costs in the morning.</li>
<li>If morning hours are dark, people may forget to turn off lights when they leave for work.</li>
</ul>
<p>The ambitious can attempt to quantify the effects. In 2010, <a href="https://pdfs.semanticscholar.org/962f/009bef1eabbafe443be581b57e5fe4d210a7.pdf">Cambridge University Engineering Department estimated</a> overall energy savings of 0.3% in winter months if we were to adopt BST all year round. This isn’t an especially compelling result, though the authors suggest that they have adopted “a conservative approach such that they consider them lower bounds on any true savings.”</p>
<h3 id="sleep-and-health">Sleep and Health</h3>
<p>There has been some suggestion of a spike in heart attacks (or ‘acute myocardial infarctions’, if you must) the week after we lose an hour of sleep, at least in the US. <a href="https://openheart.bmj.com/content/1/1/e000019">Later research</a> has suggested that the net incidence of heart attacks is roughly unaffected, due to the corresponding drop when we gain an hour of sleep.</p>
<h2 id="the-eu-pulls-the-plug">The EU pulls the plug</h2>
<p>Despite the confusion and lack of conclusion, the European Parliament <a href="https://oeil.secure.europarl.europa.eu/oeil/popups/summary.do?id=1579670&t=e&l=en">has made progress on</a> a law that sees all member states ditch daylight saving time, following a <a href="https://eur-lex.europa.eu/legal-content/EN/TXT/PDF/?uri=CELEX:52018SC0406&from=EN">public consultation</a> (which was curiously responded to overwhelmingly by Germans). Each state will have the choice of whether to keep permanent summer time or permanent standard time but will have its last clock change for daylight saving in 2021. The legislation at the EU level allows for a coordinated end to daylight saving time in Europe.</p>
<p>This leaves a bit of a question mark for the UK as the Brexit transition period, during which it is subject to EU law, is due to end juuust before 2021. Assuming the EU law does get passed, feasible options for the UK are:</p>
<ul>
<li>Ignore the EU and crack on with what we already have</li>
<li>Join the EU in ending DST and use permanent GMT+1 (‘summer time’)</li>
<li>Join the EU in ending DST and use permanent GMT (‘winter time’)</li>
</ul>
<h2 id="flexibility">Flexibility</h2>
<p>Those in search of a ‘correct’ answer to this question will be disappointed. Changing the clocks in one way benefits some and disadvantages others. Putting the clocks forward may benefit golf-players and disadvantage postmen. Putting the clocks back may benefit farmers who need to feed fussy cows and disadvantage those driving home at 5pm. This balance is inherent in trying to change behaviour at such a global level. Should we really be surprised that the schedules we’ve gotten used to over the past 80 years are on average about right?</p>
<p><strong>Many criticisms of Daylight Saving Time are really criticisms of an inflexible system of work.</strong></p>
<p>Daylight saving time does not give anyone any more time between sunrise and sunset. Rather unfortunately, that is set by nature alone. Daylight Saving Time exists to nudge the nation toward taking advantage of changing daylight hours with the seasons, to shift our rigid routines. Perhaps those routines need to be a little less rigid in the first place. George Hudson’s claim that “we cannot individually alter our times of going to bed or getting up, but must fall in with the habits of the majority” may not be so true anymore, and perhaps we should try re-evaluating smaller changes we can make.</p>
<p>We don’t stick to the same routine all year round because it doesn’t work all year round. Similarly, not everyone should work to a 9-5 routine because it doesn’t work for everyone.</p>