Jared Khan

Review: Climbing at the Perse Sports Centre

2024-10-25T12:00:00+00:00

Cambridge’s first¹ publicly accessible roped climbing wall has recently opened!

The Perse School is a private school on Hills Road. Their shiny new Sports Centre has recently opened to the public. Here are some details from my first visits to the centre’s roped climbing wall.

First, the basics:

The roped section is 10.5m high with 11 lines of climbing accommodating about 35 routes. That is to say, it’s smallish
Two of those lines are auto-belay lines
There are about 30 bouldering problems on the other side of the room
There’s a ~30° digital board too

The climbing wall is operated by the school’s Outdoor Education Department, headed by Ben Parker. The whole team are really friendly, enthusiastic about this new facility, and very keen for you to have a good time at the wall. This starts with the induction that new users must book into and complete before climbing or booking further sessions.

A view of The Perse Sports Centre climbing wall

Induction

The induction at the Perse is the most thorough induction I’ve had at any climbing wall ever. It costs £15, pretty similar to the entry price, which I think is pretty good going considering it tends to be one instructor to about 4 climbers. On paper, it’s a box-ticking competency test, but thanks to the enthusiasm of the outdoor education team to, well, educate, it starts to resemble what other centres might call a ‘beginners 1-hour intro’ class (and charge you £30 for). For new or under-confident climbers, they’ll walk you through the process of belaying and even supervise your first belay if it comes to it, which I think is pretty great. One induction session likely won’t be enough to get you signed off to belay, but they’ll be sure to get you up to speed on how to use the small bouldering wall and the auto belays, and you can do another induction when you’re a bit more confident (perhaps you’ve watched a few more YouTube videos, perhaps you’ve been tying figure-8s in your shoelaces) and want to get signed off for belaying.

More experienced climbers might find this induction a little tedious: although the pace is tailored to the group, it took about an hour in a group of 4 to complete it, but that time will likely come down as the process gets more rehearsed. It does mean that it’s not really a centre that an experienced climber can pop into when passing through the city, and is rather geared towards locals that can come multiple times. The induction session slot that you book is 2 hours long and you’re free to climb for the remainder of the session after the induction.

Booking

Once you’ve completed your induction, climbs must be pre-booked in 2-hour slots and, after the next week or so, inducted users will be able to book for guests that they supervise as well, without them having to go through a full induction.

These 2-hour sessions will cost you an eye-watering £12.50 each, the same price as entry to Rainbow Rocket just round the corner which has approximately 9x the amount of bouldering that Perse has, but no ropes. Compared to other roped centres, that’s only £1 cheaper than Milton Keynes’ Big Rock Hub (4x the size of Perse), and £4 cheaper than London’s Castle Climbing Centre (11x the size of Perse), but we might not expect anything less of the only roped climbing in the city.

Climbing Wall	Price per session	Max session length	Approx number of roped routes²
The Castle Climbing Centre, London	£16.50 (£14.50 when bulk-buying)	1 day	380
Big Rock Hub, Milton Keynes	£13.50 (£12.15 when bulk-buying)	1 day	135
The Perse Sports Centre, Cambridge	£12.50	2 hours	35

Sessions being pre-booked allows the centre to cap the capacity at around 20 climbers, which is for the best in a centre of this size.

Routes and Climbing

Let’s talk about route-setting. So far, the vast majority of the routes at the wall are those initially set by the manufacturer, Rockworks, predominantly with their own-brand holds. The grading is soft and the routes are mostly, to be honest, relatively unmemorable.

Going forward, the route-setting is to be done in-house by Ben, Kit and Matt: not veteran route-setters, but clearly motivated and excited to keep it interesting. They intend to set a few routes each week. In their first week of in-house setting, they managed to get a new roped route up as well as about 3 new bouldering routes. Not bad for their first session of setting at this height! At this frequency of setting, climbers might expect to get a decent visit once a fortnight or so.

I’m excited to see the setting team hit their stride, and expect that we’ll see routes getting more interesting as they steadily get more of their stash of Holdz-branded holds and volumes out of the store cupboard and gain from experience and feedback on the route-setting.

The inaugural list of routes

Another decision to make efficient use of the space is to allow lead climbing on every roped line including the auto-belay lines. Users are asked to remove the gym’s top rope and lead with their own rope, carrying the top rope back up with them on a gear loop. It’s been a while since I’ve come across this in a gym, and it obviously requires you to be a bit careful, but I think it is well-suited to the space. It is a little awkward especially because your lead rope and the gym’s top rope have to share an anchor at the top, but I’m told that separate lead-rope anchors will be installed at the top of each line to make this easier.

One slightly odd feature is the way the speed timers work on the auto-belay lines. The start button is at the entrance door, not at the bottom of the route, and so has to be operated by someone else. So whilst it’s definitely more fun and spectacular than your friend timing you on their phone, it’s not much more precise or convenient. Hopefully they’ll one day find a way to have a start trigger at the bottom of the wall and some kind of countdown.

The digital board at the end of the bouldering area is a fun, welcome addition. At a roughly 30° angle, it’s some of the steepest climbing the room has to offer, and it’s just kinda fun to look through the routes on it knowing that they’re set by other wall-users (mostly students at the Perse School, I’m told). The software could use some work: the photo preview of the route wasn’t working for me, and it’s hard to tell which routes are going to be any good at the moment. Perhaps a log of how many people have ticked a given route would help. Still, it’s all good fun.

The digital board at The Perse Sports Centre climbing wall

Once you’re done climbing, there’s a pull-up bar and hangboard to finish off your upper body for the day, but unfortunately no dedicated space for stretching, so you’ll have to sneak your lunges in whatever quiet corner you can find.

Training and courses

I think this could be a great spot for teaching friends new climbing and rope skills before your first outdoor trip together. With a ground-level anchor, you can easily practice tying off for cleaning an outdoor sport route or even setting up an abseil. There is even an abseil platform from which you could safely practice your abseiling if you ask the staff nicely to allow you to use it.

A ground-level anchor and quickdraws help with demonstrating various lead climbing skills

In terms of actual courses, Perse will be offering the National Indoor Climbing Award Schemes (NICAS) for under-18s at levels 1-3, similar to what’s currently on offer over at Climb Hitchin, but obviously more convenient for the young climbers of Cambridge. They also intend to run other sessions and courses, including a session for adults to learn to lead climb indoors.

Overall

It’s a huge win to finally have some roped climbing opportunity in Cambridge, even at this modest scale. The Perse are clearly investing seriously in this section of their new Sports Centre and it’s a nice complement to the larger, more commercial bouldering opportunity round the corner at Rainbow Rocket. The team seem very excited about it and I’m confident that they’ll do a stellar job of running it, with a particular focus on young climbers. Route-setting could make or break the sustained interest here, and I’ll be rooting (ha) for the team as they get up to speed.

Aside from this, I believe Cambridge could easily support a roped climbing centre 4 times the size of the Perse and, if you agree, you can join us in advocating for more access to roped climbing in the city over at The Cambridge Arch Project.

¹ At least, the first for a long while, and the first that would be considered an option for the sport climbers of today.
² Numbers gathered from these centres’ websites. To be honest, a more useful comparison might be the number of routes set per month at each centre, but it’s a little harder to get hold of that info.

Breaking JavaScript Generators

2023-08-14T09:00:00+00:00

Coming back to JavaScript after focusing on Python, I was surprised that using break when iterating over a generator closes the generator. This post has an example of that behaviour and a comparison with Python’s behaviour.

Python and JavaScript both have the concept of generators, functions which can yield multiple values, pausing after each yield until the consumer asks for the next value.

# Python
def my_generator():
	yield 1
	yield 2
	yield 3

iterator = my_generator()

print(next(iterator))
print("something else")
for item in iterator:
  print(item)

# Outputs:
# 1
# something else
# 2
# 3

// Javascript
function* myGenerator() {
  yield 1;
  yield 2;
  yield 3;
}

const iterator = myGenerator();

console.log(iterator.next().value);
console.log("something else");
for (const item of iterator) {
  console.log(item);
}

// Outputs:
// 1
// something else
// 2
// 3

These two things look basically identical. The iterator object we get back in the two cases has pretty much the same three main functions:

__next__(Python)/ next(JavaScript)
- Run the generator up to the next yield and get the yielded value
throw(exception)
- Force the generator to throw the given exception from its current suspended position
close(Python)/return(value)(JavaScript)
- Close the generator, making sure any finally blocks are run. Can’t get values out of this generator after this (unless there are yields in the finally for some reason).
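
As a rough illustration of the Python side of these three methods, here is a minimal sketch. The counter generator and the events list are made up for this example; they aren't used anywhere else in this post.

```python
# Hypothetical example: exercising next(), throw() and close() on one generator.
events = []

def counter():
    try:
        yield 1
        yield 2
        yield 3
    except ValueError as error:
        # throw() raises the exception at the yield where we are paused
        events.append(f"caught {error}")
        yield "recovered"
    finally:
        # close() (or normal exhaustion) makes sure this block runs
        events.append("finally")

gen = counter()
first = next(gen)                        # run up to the first yield
thrown = gen.throw(ValueError("oops"))   # handled inside; returns the next yielded value
gen.close()                              # raises GeneratorExit at the paused yield
remaining = list(gen)                    # a closed generator has nothing left to give

print(first)      # 1
print(thrown)     # recovered
print(events)     # ['caught oops', 'finally']
print(remaining)  # []
```

The JavaScript equivalents would be next(), throw(error) and return(value) on the generator object.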

The interesting difference that surprised me is that in JavaScript, breaking out of a for...of loop over a generator calls return on the generator, so you can’t really keep using that generator:

# Python
def my_generator():
  try:
    yield 1
    yield 2
  finally:
    print("finally")

it = my_generator()

# Print one item then break
for item in it:
  print(item)
  break

# Print the rest of the items
for item in it:
  print(item)

# Outputs:
# 1
# 2
# finally

// Javascript
function* myGenerator() {
  try {
    yield 1;
    yield 2;
  } finally {
    console.log("finally");
  }
}

const it = myGenerator();

// Print one item then break
for (const item of it) {
  console.log(item);
  break;
}

// Try to print the rest of the items
// (doesn't work)
for (const item of it) {
  console.log(item);
}

// Outputs:
// 1
// finally

In Python, the close gets called in the __del__ of the iterator object. That is, it gets closed when it gets garbage collected. This isn’t really possible in JavaScript due to the lack of any standardised garbage collection scheme, so I guess they had to do something else to make sure ‘finally’ is more likely to actually be called. That said, if it’s iterated outside of a for..of, it’s easy enough to just not close it and the finally will never run.

// Javascript
function* myGenerator() {
  try {
    yield 1;
    yield 2;
  } finally {
    console.log("finally");
  }
}

const iterator = myGenerator();
console.log(iterator.next().value);

// Outputs:
// 1
// (never outputs 'finally')
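
For comparison, here is a minimal sketch of the Python side of that last point: a suspended generator that is simply dropped still gets its finally block run when it is collected. This leans on CPython's reference counting to finalise the generator as soon as the last reference goes away; other Python implementations may run the finally later, so treat the timing as an assumption rather than a guarantee. The log list is just a hypothetical helper for recording what happened.

```python
# Hypothetical example: a dropped generator is closed when it is garbage collected.
import gc

log = []

def my_generator():
    try:
        yield 1
        yield 2
    finally:
        log.append("finally")

it = my_generator()
first = next(it)          # the generator is now paused inside the try block
log.append("before del")

del it                    # drop the only reference to the suspended generator
gc.collect()              # a nudge for collectors that don't finalise immediately

print(first)  # 1
print(log)    # on CPython: ['before del', 'finally']
```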

Further Reading

PEP 342 – Coroutines via Enhanced Generators
- This specifies the behaviour of the try..finally in Python generators
MDN – for…of
- This mentions the break behaviour

Single-pitch Sport Climbing in the Calanques

2023-05-01T12:00:00+00:00

The Calanques National Park near Marseille features a huge expanse of beautiful limestone with breathtaking views. From April 22nd - 30th 2023 I was able to visit with 4 friends and experience a small sample of the wealth of rock climbing on offer in the Massif.

Day 0 - Arrival

Having hired a car from Marseille Provence Airport, we arrived at our Airbnb in Saint Cyr Sur Mer, dumped our bags, and found dinner in the small port town. Cassis is a more commonly recommended place to stay for a trip to the Calanques but the choice of available Airbnbs for our trip nudged us a bit further east, and we didn’t mind the additional scenic driving.

The view of La Pointe Grenier from our accommodation in Saint Cyr Sur Mer

Day 1 – Roche Percée, Morgiou

For our first day in The Calanques, we were looking for a crag that was far enough from the available parking so as not to be too busy on the weekend. We opened up the guidebook — Climbing in the Calanques (2020) — to find our first target.

For the duration of our trip, the Route de Morgiou and Route de Sormiou roads were closed during the daytime ‘to allow clear access to emergency services’. The guidebook shows you where the road closure gates are on its maps using a black gate symbol. The gates can also be spotted on Google Street View to get a better idea.

Roche Percée is accessible within 40 minutes walk from the large car park next to Luminy University (43.2325910°, 5.4324360°) and has routes within our preferred grade range (5b-6b).

Having not paid enough attention, and not understanding what the guidebook was trying to say, we strayed down yellow trail number 7 rather than 6a and ended up spending about 2 hours on what should have been a 40-minute approach. Luckily, the weather was overcast and not too hot, and we made it to the crag without melting. From Luminy, follow yellow trail 6a until you are near the coordinates. From above the crag, look at the arch to get your bearings, then go back to find the side trail, which is marked with a yellow cross. Follow the piles of rocks down a scramble descent to get round to the west of the crag, and enter through the arch.

The final few steps to Roche Percée, approaching the arch through the trees.

The Mes Calanques app has a downloadable map that shows the walking trails along with your location (which you manually update by pressing the location button), which we found very useful throughout the trip.

The main walking trails in The Calanques tend to have markers (lines, or Ls to indicate a turn) painted on rocks in shiny paint. Coloured crosses indicate that the route of that colour does not go this way.

Screenshot from the Map section of the Mes Calanques app.

A yellow cross at Escalier Des Géants indicating that this side trail is not part of the current yellow trail.

The crag itself is a charming, secluded dish with a terrific view of the Calanque de Morgiou and a delightful archway entrance. There is a reasonably sized and flat-ish ledge from which to belay, though walking around the crag you will need to pay some attention to stay clear from the sheer drop when you’re not tied into anything.

A view of the Roche Percée crag.

Excitedly getting up our first route of the trip, La Baleine. (Photo: Sam Bailey)

This striking black lichen cuddles the rock at Roche Percée.

We thankfully found our way back more directly, returning to the Luminy car park in around 40 mins.

Climb Log
La Baleine, 5b, hangdogged
La Sixiéme Tortue, 4c, lead. Easy climbing with a strange anchor at the top. Two bolts joined by a chain with no rappel ring, just a maillon to lower off.
Le Boyau, 5a, lead. Beautiful, juggy climbing, with a big quartz crystal in the middle

Day 2 – Pouce, Sormiou

A view of the 'thumb' which gives this crag its name. (Photo: Nathan Bailey)

We approached this crag from the car park at the start of Route de Morgiou (despite this crag overlooking Sormiou). Google Maps struggled to direct us to this car park, despite it not being complicated, so keep an eye on the map. We arrived at about 11:00 on a Monday and got the last space in the car park (get in!).

From the car park we did not walk along the road but along the red route number 5 (not pictured in the guidebook but in the IGN or Mes Calanques app map) until we met the path marked with a blue climber. This approach took us 30 mins.

Anchors at this crag were excellent, generally being a rappel ring connected to two bolts by a chain. There is also a practice anchor at ground level, making this a good first crag of the trip if you or your climbing partners would like extra practice at tying off before gravity gets involved.

A practice anchor at ground level at Pouce, Sormiou.

Sam on La Future Dernière. (Photo: Jamie Bernardi)

Climb Log
Le Pouce Dans Le Nez, 3c, lead. An essential photo opportunity for any equipped visitors to this crag. I was so excited to sit on the top, I forgot to change into my climbing shoes. The view of the Calanque de Sormiou from the top of ‘the thumb in the nose’ is fantastic. If the wind is gentle or if you are feeling particularly brave, you can stand up at the top.
Anti-Rouille, 5b, onsight
Iznogood, 6a, lead. Despite the name, an enjoyable route, though some loose rocks towards the top. There are additional bolts above the anchor for you to step up and safely enjoy the view.
C’est Super, 5b, lead. Gentle for the grade, ledges take the edge off the length of this route.
La Future Dernière, 6a, hangdogged. Mainly juggy climbing with the holds juust out of static reach for me.

Day 3

Cliff of Pastré, Marseilleveyre

Ready for a bit less walking, we turned to Marseilleveyre, where many crags are short walks from city parking and have enjoyable views of the city. We headed for the Cliff of Pastré, managing to find parking near to the guidebook’s recommended location. The walk from there to the cliff is around 15 minutes.

The Grotte sector has enjoyable rock features, including the cave that gives it its name, and a fantastic view over the city. The Phalanges Resinees Sector has some lovely climbing and again the top provides excellent views. We were treated to more awesome weather, the 18mph NW wind was no problem at this crag. We finished the day with a drink along the seaside at Marseille.

The cave that gives the Grotte sector its name.

Finding my feet on Vent de Panique (right) whilst Jamie (left) performs the final moves on Tapage Diurne. (Photo: Sam Bailey)

A view from the top of a great route, Vent de Panique, over Marseille.

Climb Log
Grotte Sector – L’ultravoie, 5b+, lead
Grotte Sector – Graffiti, 5c+, onsight. A nice crack climb with a spooky distance between the last clip and the anchor
Phalanges Resinees Sector – Tapage diurne, 5c, onsight. A lovely layback most of the way up
Phalanges Resinees Sector – Vent de Panique, 5c, lead. Lovely technical moves all the way up, on slabby terrain. Some polish but lots of options available.

Day 4 – ‘Resting’ in Marseille

After 3 days of climbing and Calanque-ing, we needed some time to grow our fingerprints back. We drove over to Marseille and parked near the old port (a decision that ultimately cost us €32 for the day). From there, we queued up for boat tickets to Frioul Island and wandered around the Mucem museum, with its bizarre concrete lace-clad square, until our boat was ready to depart.

Taking a moment to reflect in the mirror ceiling of a covered area of the Old Port in Marseille.

Gazing through the surreal concrete lacing of the Mucem.

We enjoyed a picnic on the island and explored the eerie ruins of unfinished German WW2-era ammunition shelters throughout the Fort of Ratonneau. Nowadays, the fort is home to hundreds of gulls, who negotiate with tourists for nesting space. Inevitably we walked more on this ‘rest’ day than any other day of the trip.

The Fort of Ratonneau. (Photo: Jamie Bernardi)

Returning from dinner at Le Cours Julien, Marseille.

Day 5 – Escalier Des Géants, Les Goudes

This day featured another easy approach, parking near the end of Boulevard Alexandre Delabre (43.211710°, 5.351372°) and following the easy instructions of the guidebook for about 15 minutes. The brief spits of rain on this day cooled us down just enough to have a go at La Fissure du Géant. Some of the group enjoyed the length of the routes offered here, and say this crag was a highlight of the trip.

Mid-crumble as my arms give up on La Fissure du Géant. (Photo: Sam Bailey)

Back for the finish on La Fissure du Géant. (Photo: Sam Bailey)

Climb Log
Miax Sector – Et Jean, 5b+, onsight. A sketchy start. It’s easy to stray left into Le Diédre; stay on the right for the interesting holds at the top. Your rope will run over some spiky rocks at the top whether you stray left onto a different anchor or stay right, so you may wish to extend the anchor with a sling if a few of your party are doing this route.
Miax Sector – Pièce Montée, 5c, lead. This slabby route feels a little scary with a bolt missing just after half way.
Right Side Cliff – La Fissure du Géant, 6b, hangdogged. This huge route features cool laybacks along the crack and crimps where the crack is too tight. A ledge offers a good break before the crux through the notch. This route betrayed my lack of stamina, with my forearms melting off the route about 3/4 the way up.
Right Side Cliff – La Lolotte, 5c, lead. Some polish on this.

Day 6 – En Vau

En Vau is billed as one of the most scenic and classic Calanques and is incredibly popular with tourists and beach-goers. Many of the climbing routes in this area have been there since the 1930s or earlier. It took us 90 minutes in a group of 5 to walk from the parking at the start of the Route Gaston Rebuffat (43.23712135474467°, 5.4996331475356826°), named after the famous climber from the region, down to the beach, following clear signposts to Calanque d’En Vau all the way. The view on this walk is mostly of a plain-looking forest, until the final 20 minutes or so when you are really in the valley and the beauty of the place reveals itself.

Approaching the beach at En Vau.

The beach of Calanque d'En Vau as seen from the top of La Ratopenado.

The beach is of course the hub of the place, with around 200 people sunbathing and swimming on the Friday lunchtime that we visited. There are many sectors nearby this beach and we had set our sights on the Petite Aiguille, hoping for terrific views and classic climbing. Unfortunately, the seaward side of the Petite Aiguille has enough bush cover from the beach for some to have used it as a handy toilet. The routes on this side also looked to us to be poorly equipped, with old bolts and unsafe starts (a typical clipstick will not help you reach the first bolts on these). The second bolt of La Diagonale is stained with the rusty remains of multiple maillon rapide bails. We were persuaded to start on the other side of the needle.

Trying to find a decent line of bolts up the seaward side of Petite Aiguille.

The routes on the valley side of the Petite Aiguille were somewhat better equipped, though the line of La B.B was not evident to us, and members of the group ended up merging it with Directe Nord-Ouest in order to get to an anchor. Reaching the top of this needle one way or another provides an excellent view of the Calanque, as long as you don’t mind an audience.

After a picnic and a dip in the crystal clear water, we looked for some routes to finish our En Vau experience. Dalle Du Chat looked interesting and is also very close to the beach so we wandered over to take a look. Unfortunately, in the few years since the guidebook was published, there has been a major rockfall on the left of this sector (to the left of Le Gynécologue), and the right of the sector (Passangers Du Vent - Sale Temps Pour Les Nains) has been fenced off. I stared wide-eyed at the boulders at the base of the cliff with the knowledge that they were recently the top, gulped hard, and went to look for something else.

The guidebook's topo of Dalle Du Chat, alongside the current state of this area.

A few moments walk away from the beach along the main path, followed by a brief clamber up a scree path finds you the Grande Aiguille, which was a lovely spot for some in the group to climb the very pleasing Le G.H.M., appreciating the eye of the needle at the top, and for others to dry off in the sun from their swim.

The hike back, past beach-goers in just their sea-logged boxer shorts, was no slower than the walk in, taking us 80 minutes.

Sam beta-hunting on Le G.H.M.

Looking through the eye of the Grande Aiguille.

Still hiking back from Calanque d'En Vau.

The sun setting as we finally finish our return to the car park at the start of Route de Gaston Rebuffat.

Climb Log
Petite Aiguille – La Ratopenado, 5b, lead. Long spacing between the bolts throughout and a very long gap from the last clip to the anchor make this otherwise fine route a little nervy.
Grande Aiguille – Le G.H.M., 5c, lead. Lovely climbing, with the crux just above the cave. Don’t forget to look through the eye of the needle at the top!

Day 7 – Vallon Des Escampons

Having spent our enthusiasm for walking on our En Vau day, we went for another crag that is very close to available parking. We parked in the car park at the start of the road to Morgiou (43.222840°, 5.417970°), since the residential street that the guidebook recommends parking on is small and was closed to non-residents when we were there, which seems fair enough. The walk in took about 15 minutes.

Scratching my head on Bon Secours. 'Surely I'm not supposed to put my foot up there'.

A fairly inconspicuous sign on a tree told us that the five climbs Rififi Allah Fédé - Jamika were closed during our visit due to nesting birds and suggested they’d remain closed until July 2023.

The equipment in this area is more generous than in others, with bolts spaced around 1m apart. This gave us nervous climbers the security to try some slightly harder routes, making a satisfying end to our climbing for the week.

Nathan unfazed by the ants on Une Bière Pour La Route. (Photo: Ben Papp)

Nathan finding a way up the crack of Papinade.

Climb Log
Escampons Left – Le Pharo, 5c, hangdogged. A bit crimpy at the top, though it’s possible to cheat into the crack on the right for an easier finish.
Escampons Left – Bon Secours, 5c, hangdogged. A flexible high-foot move near the end, which can be avoided by traversing round to the left.
Escampons Right – Papinade, 6a, onsight. A cool vertical crack to start followed by a juggy traverse to the left, then a crimpy slab finish, this one has it all!
Escampons Right – Une Bière Pour La Route, 6a, onsight. Enjoyable moves if you can ignore the hollow-feeling rock and the ants, wasps, and spiderwebs you might encounter on the way. Anchor is up to the left a bit from the last clip, above some bushes.

Conclusion

We were spoilt by fantastic weather and great variety of climbing locations on our trip to the Calanques. The availability of beginner-grade single pitch sport climbs drew us in to the area and brought us up close and personal with cliffs and ridges that inspire us to climb harder and further and to explore more of the beautiful rock that’s available to us.

We all left with the same sentiment: “now we have to learn how to multi-pitch”.

The squad.

The result of 6 days of cuticle abuse.

Use a Switch Pro Controller with Dolphin Emulator on macOS Ventura

2022-10-29T14:25:00+00:00

Step 1: Connect the controller to the Mac

Using a Nintendo Switch Pro Controller on a Mac requires macOS Ventura or later.

Open System Settings ▶︎ Bluetooth
Hold down the SYNC button on the top of the controller next to the USB port and wait for the LEDs at the bottom of the controller to light back and forth
Click Connect next to ‘Pro Controller’ under ‘Nearby Devices’

Step 2: Set up the controller in Dolphin

Open Dolphin and click on Controllers
Choose ‘Standard Controller’ in the port you want to configure

If you’d like to use my recommended settings:

Download switch_pro_controller.ini

Move it to ~/Library/Application Support/Dolphin/Config/Profiles/GCPad/Switch Pro Controller.ini. This can be done in Terminal like so:

mkdir -p ~/Library/"Application Support"/Dolphin/Config/Profiles/GCPad && \
mv ~/Downloads/switch_pro_controller.ini \
~/Library/"Application Support"/Dolphin/Config/Profiles/GCPad/"Switch Pro Controller.ini"

Now, in Dolphin, press the Configure button
In the profile dropdown, choose ‘Switch Pro Controller’ and press Load. The buttons should be populated with my recommended settings and you can adjust them from there
If you are configuring multiple controllers, make sure that the device in the top left is the one you want to be configuring
For comfort and consistency with the GameCube controller layout, I’ve set:
- ZR on the Pro Controller to be R on the emulated GameCube controller
- and R on the Pro Controller to be Z on the emulated GameCube controller

If you want to configure the controller yourself:

Press Configure
Select ‘SDL/0/Pro Controller’ in the dropdown that appears (the number might differ)
Go through each button, clicking on the blank box and then pressing the button on the controller that you want to use
For Control Stick ▶︎ Up, for example, simply push the left stick on the Pro Controller upwards, and Dolphin will figure out the axis that you mean
After setting up the sticks, you may wish to Calibrate them by pressing Calibrate and moving the sticks in the widest possible circle quite slowly
If you want to clear or add alternative inputs to any of the configured buttons, right click on the configured button. This also gives you a live view of all the channels of the connected controller so you can decide on your preferred configuration

That’s it. The controller should now function as an emulated GameCube controller in the slot that you chose.

Swift’s Copy-on-write Optimisation

2021-04-05T11:45:00+00:00

Swift’s Arrays have value semantics. They also have a copy-on-write optimisation:

Collections defined by the standard library like arrays, dictionaries, and strings use an optimization to reduce the performance cost of copying. Instead of making a copy immediately, these collections share the memory where the elements are stored between the original instance and any copies. If one of the copies of the collection is modified, the elements are copied just before the modification. The behavior you see in your code is always as if a copy took place immediately.

https://docs.swift.org/swift-book/LanguageGuide/ClassesAndStructures.html#ID88

Here are a few points to help us understand how the copy-on-write optimisation works:

The Array type is a value type.
Array has a stored property which is a reference to a buffer object which is where the Array data actually lives (on the heap).
When an Array is copied, the copy gets a reference to the same buffer. The reference count of the buffer has now increased.
When a mutation is applied to an Array, the reference count of the buffer is checked. If it is greater than 1, the buffer is copied first. Otherwise, the buffer is not copied. In this sense it’s not just copy-on-write but copy-on-write-when-necessary.
This behaviour is hand-written in the Standard Library code for Array and other container types; it’s not an automatic feature of Swift value types.

With this in mind we should be able to reason about the memory performance of the following snippets. I’ve used pretty large numbers throughout, so that it’s easy to see the changes in memory usage in Activity Monitor (or other tools).

I’ve put these examples into copy_on_write_examples.swift in case you want to run them for yourself.

See if you can predict what the approximate memory usage will be at each of the numbered comments in these snippets.

Copying a large array

var x = Array(repeating: Int64(1), count: 100_000_000)
// Memory usage is ~800MB
var y = x
// (1)
y.append(2)
// (2)

This is a fairly simple case.

At (1) we have copied x into y but not yet mutated it, so both still refer to the same buffer. Memory usage is still around 800MB.

At (2) we have mutated y’s data so Swift will notice that the buffer is not uniquely referenced and will copy it fully before the mutation. Memory usage is now around 1600MB

Nested Arrays

var x = [
	Array(repeating: Int64(1), count: 100_000_000),
	Array(repeating: Int64(1), count: 100_000_000),
]
// Memory usage is ~1600MB
var y = x
y.append([])
// (1)
y[0].append(2)
// (2)

There’s a little bit of nesting going on here.

This time, x’s buffer is not storing a lot of data. Rather, it is storing 2 small Array structs which themselves have references to large buffers. After we copy x into y and then append to y, Swift copies that small outer buffer but doesn’t copy the large buffers of its elements.

At (1), memory usage is still around 1600MB. At (2), we’ve made a mutation to one of the large buffers, so that one buffer will be copied first, so we expect a memory usage of about 2400MB

Array within a struct

struct ThingWithArray {
  // These must be `var` so that they can be mutated via y below.
  var name: String
  var array: [Int64]
}

let x = ThingWithArray(
  name: "Nice Thing",
  array: Array(repeating: Int64(1), count: 100_000_000)
)
// Memory usage is ~800MB
var y = x
// (1)
y.name = "Nicer Thing"
// (2)
y.array.append(1)
// (3)

This is similar to the last case. The struct itself doesn’t hold much data, the array property just holds a reference to a buffer which stores a lot of data.

At (1) the struct is copied into y, so y now has a reference to the same array buffer and the memory usage is still around 800MB.

At (2), mutating y doesn’t touch the array buffer, so still no big change in memory usage.

At (3), the array buffer is mutated via y so the buffer is now copied and memory usage goes to about 1600MB.

Many structs

  struct Thing {
    // A struct with about 80 bytes
    let a: Int64 = 0
    let b: Int64 = 0
    let c: Int64 = 0
    let d: Int64 = 0
    let e: Int64 = 0
    let f: Int64 = 0
    let g: Int64 = 0
    let h: Int64 = 0
    let i: Int64 = 0
    let j: Int64 = 0
  }
  let thing = Thing()
  let x = Array(repeating: thing, count: 10_000_000)
  // (1)

Structs themselves do not have automatic copy-on-write semantics, so an array of many copies of a simple struct really does store every copy in full, even if none of them is ever mutated. Memory usage at (1) is about 800MB.

Mutating when uniquely referenced

var x = Array(repeating: Int64(1), count: 100_000_000)
// Memory usage is ~800MB
if true {
  let y = x
}
x.append(1)
// (1)

Here we copy x into y but then y goes out of scope so the reference count on the large array buffer drops back down to 1. At (1), when we’ve mutated the array, the reference count is still 1 and so no copy is taken and the memory usage is still ~800MB.

References and Further Reading

Wikipedia — Value Semantics
swift/docs/OptimizationTips.rst — Use copy-on-write semantics for large values
- This discusses how to add copy-on-write behaviour to your own value types using isKnownUniquelyReferenced
Ben Cohen - Fast Safe Mutable State
- Slightly related, manager of the Swift Standard Library team explains copy-on-write and some interesting cases where the Standard Library takes extra care to prevent buffers being referenced multiple times unnecessarily.
swift/stdlib/public/core/Array.swift

Running Mypy in Pre-commit

2020-12-14T22:05:00+00:00

The only thing worse than not type-checking your code is thinking you are type-checking it when you aren’t.

This post is about running Mypy in a Git pre-commit hook using the Pre-commit framework. Running Mypy is a little fiddly in itself, and pre-commit/mirrors-mypy (the de facto way to call Mypy in Pre-commit) calls Mypy in a slightly opinionated way that may introduce more confusion or hide errors you want to see.

Three take-away points if you’re in a hurry:

Make sure you run Mypy on all files, not just those that have changed
Make sure Mypy has access to the installed dependencies of the code it is type-checking
Be careful with the use of flags that reduce the strictness of Mypy like --ignore-missing-imports

Here, I show you how to make your own Mypy hook that suits your needs, in 3 only-somewhat-fiddly steps:

Running Mypy correctly outside of Pre-commit
Creating your own Pre-commit hook
Giving Mypy access to your project dependencies

A solution that works in my case

Before discussing the gory details and alternatives, here’s a solution that works for my project.

I add a mypy.ini:

[mypy]
# mypy_path will vary (and may not be necessary) 
# for your project layout.
mypy_path=./src:./tests

# Explicitly blacklist modules in use
# that don't have type stubs.
[mypy-pytest.*]
ignore_missing_imports = True
[mypy-pyproj.*]
ignore_missing_imports = True

and then add a script at ./run-mypy:

#!/usr/bin/env bash

# A script for running mypy, 
# with all its dependencies installed.

set -o errexit

# Change directory to the project root directory.
cd "$(dirname "$0")"

# Install the dependencies into the mypy env.
# Note that this can take seconds to run.
# In my case, I need to use a custom index URL.
# Avoid pip spending time quietly retrying since 
# likely cause of failure is lack of VPN connection.
pip install --editable . \
  --index-url https://custom-index-url.com/simple \
  --retries 1 \
  --no-input \
  --quiet

# Run on all files, 
# ignoring the paths passed to this script,
# so as not to miss type errors.
# My repo makes use of namespace packages.
# Use the namespace-packages flag 
# and specify the package to run on explicitly.
# Note that we do not use --ignore-missing-imports, 
# as this can give us false confidence in our results.
mypy --package acme --namespace-packages

and then define a custom Pre-commit hook that runs that script in: ./.pre-commit-config.yaml

# .pre-commit-config.yaml

repos:
- repo: local
  # We do not use pre-commit/mirrors-mypy, 
  # as it comes with opinionated defaults 
  # (like --ignore-missing-imports)
  # and is difficult to configure to run 
  # with the dependencies correctly installed.
  hooks:
    - id: mypy
      name: mypy
      entry: "./run-mypy"
      language: python
      # use your preferred Python version
      language_version: python3.7
      additional_dependencies: ["mypy==0.790"]
      types: [python]
      # use require_serial so that script
      # is only called once per commit
      require_serial: true
      # Print the number of files as a sanity-check 
      verbose: true

You’ll have to adapt this to your own project structure and strictness/performance needs. To expose all the issues this tries to cover, we’ll build it up in 3 steps.

Step 1: Running Mypy correctly outside of Pre-commit

Before thinking about Pre-commit, we should make sure we can run Mypy directly in the desired way.

Running on the correct files

Running mypy . in the root of your project will often not do what you need it to. You should play around, keeping an eye on Mypy output, to make sure Mypy is running on all the files that you want. This involves choosing:

Whether to specify the files to type-check as a package, a module, a directory, or a file path
Whether to specify a MYPYPATH
Whether to add the --namespace-packages option
What working directory to invoke Mypy from

Running mypy and managing imports is a helpful section of the documentation for getting this right. Pay extra attention when you are using namespace packages (packages without __init__.py files).

I’m writing this whilst v0.790 is the latest release. Simplifying how Mypy is called and how it handles imports is a current priority for the maintainers. See for example the umbrella issue, #8584 — Redesign import handling. Various improvements have already been merged to the master branch.

Following the right rules

Once Mypy is running on the correct files, you’ll want to get it running the right checks for your codebase so that it passes whilst also checking what you want it to check. This may involve:

Making changes to your codebase to meet new rules that you want to enforce
Setting various strictness settings. For example: --no-implicit-optional, --disallow-untyped-defs, --no-strict-optional or the umbrella option --strict
Deciding which imported modules to treat as Any. Sometimes Mypy will complain that it can’t find a certain module or its stubs. This can be indicative that Mypy does not have access to these dependencies, which you should fix (see below), but can also mean the library doesn’t have any type stubs. For the latter case, it’s sensible to treat those modules as Any in a mypy.ini file:
```
# mypy.ini
    
[mypy]
# this section is required
# you can add a mypy_path here, if you need one.
  
# example of explicitly ignoring missing stubs
# for a dependency and its subpackages.
# This is safer than ignoring everything 
# with the --ignore-missing-imports option.

[mypy-pyproj.*]
ignore_missing_imports = True
```

You may even want to temporarily introduce errors in certain files to make sure Mypy will notice them. See also Mypy docs — No errors reported for obviously wrong code.

Bake it into a script

Now that you know precisely how you want to call Mypy, create a script called run-mypy that captures the arguments you want to use. For example, in my case, I have a namespace package in the src/acme directory, and my script ended up looking like this:

#!/usr/bin/env bash

set -o errexit

# Change directory to the project root directory.
cd "$(dirname "$0")"

# Because I'm using namespace packages,
# I have used --package acme rather than using 
# the path 'src/acme', which would correctly
# collect my files but erroneously add 
# 'src/acme' to the Mypy search path.
# We only want 'src' in the path so that Mypy
# knows our modules by their fully qualified names.
mypy --package acme --namespace-packages

I also had to add a mypy_path in mypy.ini:

[mypy]
mypy_path=./src

Step 2: Creating our own Pre-commit hook

Now that we know how to run Mypy for our project, we can think about running it in Pre-commit. First, a brief primer on how Pre-commit works so that we can consider what might go wrong.

How Pre-commit runs hooks

Pre-commit installs each Python hook in a separate virtualenv. Before each commit, the list of staged files is passed to that hook. Any unstaged changes are stashed and only restored after all hooks have run.

Problem: Only running on changed files

With Mypy, we probably don’t want to pass it just the list of changed files:

It will miss type errors resulting from but not occurring in the staged changes. For example: if you have changed the definition of a function but not a usage of that function in another file then the usage is now invalid, but won’t be checked.
As mentioned above, you may need more control over how Mypy is invoked anyway.
Mypy uses an Incremental Mode by default. It stores calculated type information so re-running on all files after only a few changes doesn’t take as long. For faster incremental runs, consider using a long-running Mypy daemon.

We’ll solve this by using our own run-mypy script and ignoring the file list that Pre-commit passes to it.
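Here’s a small hypothetical sketch of the first point, with two modules squashed into one block. The file names geometry.py and report.py and the functions scale and describe are made up for illustration.

```python
# --- geometry.py: staged change, `scale` gained a required `factor` parameter ---
def scale(length: float, factor: float) -> float:
    return length * factor


# --- report.py: not touched by the commit, so not in the staged file list ---
def describe(length: float) -> str:
    # Stale call: this still uses the old one-argument signature.
    # A type-checker can only report it if report.py is among the files it checks.
    return f"scaled length: {scale(length)}"
```

If Mypy is only handed geometry.py, the commit goes through and the stale call in report.py fails at runtime with a TypeError instead.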

Problem: Running in an isolated virtualenv

Mypy running in a separate virtualenv is also problematic, since it won’t have access to all the dependencies installed in your main development environment. This means it can’t type check usages of those dependencies. We’ll solve this in Step 3.

Setting up the hook

We can solve both these problems with a properly-configured hook, which we’ll set up ourselves. To get started, create a new Repository-local hook by adding the following to your .pre-commit-config.yaml:

# .pre-commit-config.yaml

repos:
- repo: local
  hooks:
    - id: mypy
      name: mypy
      entry: "./run-mypy"
      language: python
      # use your preferred Python version
      language_version: python3.7
      additional_dependencies: ["mypy==0.790"]
      # trigger for commits changing Python files
      types: [python]
      # use require_serial so that script
      # is only called once per commit
      require_serial: true
      # print the number of files as a sanity-check
      verbose: true

Step 3: Giving the Mypy hook access to dependencies

Mypy needs an environment where the dependencies are installed so that it can check for type errors in their usage. Here’s a few options for doing that, with differing levels of convenience and speed:

Option 1: Use `language: system` to run Mypy in an existing environment

Replace language: python in your hook definition with language: system. Remove the additional_dependencies line and install Mypy into your environment directly. Now, Pre-commit will not create a separate virtualenv for the hook and will run it in whatever environment you happen to be in when you run git commit or pre-commit run. This means you always run Mypy directly in your dev environment, but breaks if any of the developers on the project want to trigger Pre-commit from outside the dev environment. For example, this won’t work if using a GUI Git client, as the correct virtualenv probably won’t be activated.

Option 2: Point Mypy to a specific environment with `--python-executable`

If it’s possible to automatically figure out the path to the appropriate Python interpreter (the one associated with the existing installation of your dependencies, which may or may not be in a virtual environment), then you can point Mypy to that path using the --python-executable option on mypy.
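As a rough sketch of this option, here’s how a Python wrapper might build that invocation. The .venv/bin/python path and the function names are assumptions for illustration. The acme package and the --namespace-packages flag are carried over from my own setup above, so swap in whatever suits your project.

```python
import subprocess
from pathlib import Path
from typing import List


def build_mypy_command(python_executable: Path) -> List[str]:
    """Build a Mypy call that resolves imports from another interpreter's environment."""
    return [
        "mypy",
        "--python-executable", str(python_executable),
        "--package", "acme",
        "--namespace-packages",
    ]


def run_mypy(python_executable: Path) -> int:
    """Run Mypy and return its exit code (non-zero means something was reported)."""
    return subprocess.run(build_mypy_command(python_executable)).returncode


# Hypothetical location of the project's virtualenv interpreter.
command = build_mypy_command(Path(".venv/bin/python"))
```

The hard part is still working out the interpreter path automatically. Once you have it, the call itself is this simple.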

Option 3: Install specific dependencies with the `additional_dependencies` hook option

If you only care about type-checking the usages of a few third-party modules, then you can install those specific modules into the hook environment like so:

# .pre-commit-config.yaml

repos:
- repo: local
  hooks:
    - id: mypy
      name: mypy
      entry: "./run-mypy"
      language: python
      # Replace with appropriate version
      language_version: python3.7
      # install Mypy, and the dependencies
      additional_dependencies:
        - "mypy==0.790"
        - "structlog==20.1.0"
      types: [python]
      # use require_serial so that script
      # is only called once per commit
      require_serial: true
      # Print the number of files as sanity-check
      verbose: true

This is relatively fast as Pre-commit remembers the dependencies it installed in the environment [source code]. The downside is this means duplicating your list of dependencies (at least those that have type stubs).

The additional_dependencies are just sent directly to Pip [source code] so you can happily add, for example, a --index-url argument in this array. Just be aware that Pre-commit will only re-run Pip when the list of additional_dependencies changes, so don’t expect to put “requirements.txt” in this array and have it figure out when you’ve changed that file.

Option 4: Running a full `pip install` in the hook

This is not fast. The speed we mostly care about is that of running the hook on each commit, not its initial setup. However, running pip install takes many seconds even when the dependencies are already installed. It is, however, a pretty reliable and easy way to make sure your dependencies are installed if the performance hit is acceptable to you and your team.

To do this, simply add the appropriate pip command into your run-mypy script. For example:

#!/usr/bin/env bash

set -o errexit

# Change directory to the project root directory.
cd "$(dirname "$0")"

# Install the dependencies into the mypy environment.
# Note that this can take seconds to run.
pip install --editable . --no-input --quiet

mypy --package acme --namespace-packages

This is the option I have gone for so far since it minimises the likelihood of us making mistakes without imposing many restrictions on local project setup. For example, teammates can develop in whatever Python environment they like.

Bonus step: Running in CI

If you use Pre-commit locally it’s often a good idea to run pre-commit run -a in your CI pipeline. The setup I gave at the top of this post works fine in CI too. However, if you’d rather have a different Mypy setup in CI than locally, you can run Pre-commit with a SKIP environment variable in CI to skip the Mypy hook, and then run Mypy however you want in a separate CI job: SKIP=mypy pre-commit run -a. See Pre-commit - Temporarily disabling hooks.

Summary

We have seen many potential issues of running Mypy in Pre-commit:

Changing one file may cause a type error in another file, so we need to run Mypy on all files, not just those that have changed
We need to give Mypy access to the installed dependencies of the code it is type-checking, otherwise it can’t check the usages of those dependencies
Flags that reduce the strictness of Mypy like --ignore-missing-imports can give us false confidence

We saw how to address these issues by making our own custom hook. There doesn’t appear to be a neat, one-size-fits-all solution, so it’s worth giving some thought to this set up in each instance.

How Git LFS Works

2020-10-31T18:14:00+00:00

Git LFS (Large File Storage) helps you version large files in Git without having to download every version of them. This post explains the mechanisms that Git LFS uses internally to do its job, including:

Git subcommands to give us the git lfs command in the first place
Git clean and smudge filters to replace large files with pointer files
Git pre-push hooks to upload the large files to a server

The Problem Git LFS Solves

Git needs to remember every version of every file in the repository. It starts by storing all the different versions of the file separately in full; it calls these ‘loose objects’.

When it notices that there are lots of loose objects it will do something called ‘packing’, where it remembers one version of the file as well as the differences between that and the other versions. This saves a lot of space if the differences are small relative to the size of the file. See Git Internals - Packfiles for more.

Even with packing, if a repo has a large file that changes often then this can quickly require a lot of storage space to make every version available locally, even though you might rarely work with historical versions of this large file. Ideally, you would only download and store the versions of this file that you actually need to view or work with, but you don’t want that to get in the way of your normal Git workflow.

Git LFS does not save space on your server, but saves you space on your local copies of the repo.

How Git LFS Is Used

The following instructions can be found on the Git LFS website:

The remote Git server must be set up to support the Git LFS API. This is done for you on GitHub, GitLab (requires some configuration) and Bitbucket.

Each user of the repository must:

Download the git-lfs executable
Run git lfs install on their machine

One user of the repository must:

Run git lfs track "*.psd", replacing “*.psd” with the filename pattern that you want to track
Make sure the .gitattributes file is checked into the repository

Now all users can add files to the repo as normal and Git LFS will work away in the background.

What Git LFS Does

There are three main things that Git LFS does for us:

Git LFS replaces the large files that you try to git add for commit in the repo with pointer files, files that just contain an identifier of the content, and it stores the content files themselves in a separate local folder, the Git LFS Cache at .git/lfs/objects.
When you git push your commits, the new large files in the local Git LFS Cache are uploaded to the server separately. This is done using the Git LFS API that the server must implement.
Whenever you do a git checkout, Git LFS will find all the pointer files and replace them with the files themselves, downloading whatever files necessary from the remote server. This means that you only store locally the large files that you actually checkout or that you committed yourself, not all versions of large files in the history of the repo.

To perform these three magic tricks, Git LFS needs to intercept add, push and checkout and needs to keep track of which pointer files map to which large files in the Git LFS Cache. Understanding how it does these things in more detail can give us more confidence when using Git LFS and guide us in the right direction when something goes wrong.

How Git LFS Works

Git subcommands

git clone and git push are two different built-in subcommands of git. As it turns out, any executable available in your PATH that starts with git- can be used as a Git subcommand.

For example:

# create script named 'git-shout'
# sidenote: this is 'heredoc' syntax
cat << EOF > git-shout
#! /usr/local/bin/bash
echo "running custom command!"
EOF

chmod u+x git-shout
# make sure it's in the shell's PATH
export PATH=$PATH:.

git-shout
# running custom command!

# it can also be called like this:
git shout
# running custom command!

There’s nothing special about it being a subcommand, it’s just a nice-looking alias to the same executable.

Clean and Smudge filters

Git has the concept of the staging area (or index) where changes go before they are committed. We select changes to place into the staging area with git add. Git has a feature called filters which let us process files just before they are staged (the ‘clean’ filter) and process files just before they are checked out into your working tree (the ‘smudge’ filter).

These filters can be used for things like:

keeping any passwords or other secrets out of the repository
including the last-modified date of a file in the file itself
pulling in files from non-Git sources

Creating a new filter

First, the clean and smudge actions for the filter need to be added to either the user’s ~/.gitconfig file or the repository-local .git/config file. Either way, this needs to be done on each user’s machine. As a simple example, we’ll add a filter that censors the word ‘butts’, because we can’t be having such foul language getting checked into our repository:

# .git/config

# define a 'hide-naughty-word' filter
[filter "hide-naughty-word"]
  # define the command for this filter 
  clean = sed s/butts/b--ts/
  smudge = sed s/b--ts/butts/

Here, I assume that all users of the repository already have the sed command installed on their machine.
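A filter command doesn’t have to be sed. Git pipes the file content to the command’s standard input and takes whatever it writes to standard output. Here’s a rough Python equivalent of the two sed commands, as a sketch. The script name naughty_filter.py is hypothetical. Note one difference: sed without the g flag only replaces the first match on each line, whereas this replaces every match.

```python
import sys
from typing import List


def clean(text: str) -> str:
    """Runs when a file is staged: censor the word before it enters the repository."""
    return text.replace("butts", "b--ts")


def smudge(text: str) -> str:
    """Runs when a file is checked out: restore the word in the working tree."""
    return text.replace("b--ts", "butts")


def main(argv: List[str]) -> int:
    # Git pipes the file content to stdin and reads the filtered content from stdout.
    action = {"clean": clean, "smudge": smudge}[argv[1]]
    sys.stdout.write(action(sys.stdin.read()))
    return 0


# Only act as a filter when called as `naughty_filter.py clean` or `... smudge`.
if __name__ == "__main__" and sys.argv[1:] in (["clean"], ["smudge"]):
    raise SystemExit(main(sys.argv))
```

With this saved somewhere on each user’s machine, the config entries would become something like clean = python3 naughty_filter.py clean and smudge = python3 naughty_filter.py smudge.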

Assigning the filter to files/filetypes

Then, we need to tell Git which files to run this filter on by adding to the .gitattributes file in the repo. We can assign the filter to a specific file (my-specific-file.txt), a specific file type (*.txt), or all files (*). For more detail see gitattributes Documentation. Here, we will run the hide-naughty-word filter on all files:

*  filter=hide-naughty-word

This .gitattributes file can be checked in so that it can be shared amongst all users of repo.

Now we can see this new filter in action by committing a new file and then using a command called git cat-file to see what has actually ended up in the repository:

echo "i like big butts and i cannot lie" > mix-a-lot.txt
git add mix-a-lot.txt
git commit -m "Add naughty file"
git cat-file blob "HEAD:mix-a-lot.txt"
# i like big b--ts and i cannot lie
cat mix-a-lot.txt
# i like big butts and i cannot lie

We see that the content has been filtered in the repository but when we actually view the checked out file (for example, by using the cat tool) it has the content we expect.

Running filters on files that are already committed

What if someone on our team doesn’t have the filters set up properly and checks in an unfiltered file:

# for demonstration,
# empty the .gitattributes file,
# disabling the filter.
echo "" > .gitattributes  
echo "i like big butts and i cannot lie" > mix-a-lot2.txt
git add mix-a-lot2.txt
git commit -m "Add unfiltered file"
# re-enable filter
echo "*  filter=hide-naughty-word" > .gitattributes  
git cat-file blob "HEAD:mix-a-lot2.txt"
# i like big butts and i cannot lie

We can re-run the filters on all files and create a new commit like so:

git add --renormalize .
git commit -m "Run filters"
git cat-file blob "HEAD:mix-a-lot2.txt"
# i like big b--ts and i cannot lie

Note that the unfiltered file still exists in the history.

Pre-push hooks

‘Git hooks’ are a way to run custom scripts when certain events happen. These scripts can be added to the .git/hooks directory of your repo with names like pre-commit or post-checkout and can be modified to do whatever you like. The list of hooks that you can set up can be found in git/Documentation/githooks.txt.

For example, when you run git push, Git will first run the pre-push script (if it exists). If that script exits with a non-zero exit code, the push will be aborted.
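A hook can be any executable, so as a sketch here’s what a Python pre-push hook might look like. It relies on Git feeding the hook one line per ref being pushed on standard input, in the form local ref, local hash, remote ref, remote hash. The protected branch name and the function names are assumptions for illustration.

```python
import sys
from typing import Iterable, List

PROTECTED_REF = "refs/heads/main"  # hypothetical branch we refuse to push to


def blocked_refs(stdin_lines: Iterable[str]) -> List[str]:
    """Return the remote refs in this push that target the protected branch."""
    blocked = []
    for line in stdin_lines:
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank or malformed lines
        _local_ref, _local_sha, remote_ref, _remote_sha = fields
        if remote_ref == PROTECTED_REF:
            blocked.append(remote_ref)
    return blocked


def main() -> int:
    blocked = blocked_refs(sys.stdin)
    for ref in blocked:
        print(f"refusing to push to {ref}", file=sys.stderr)
    # A non-zero exit code makes Git abort the push.
    return 1 if blocked else 0
```

To try it, you would save this as .git/hooks/pre-push with a Python shebang line, make it executable, and end the file with raise SystemExit(main()).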

Git hooks cannot be checked into a repo. If all users of a project need to run the same Git hooks, each individual user will need to set them up on their copy of the repo.

Bringing it all together

When you install Git LFS, you will get an executable called git-lfs. Because its name starts with git-, it is also now runnable with git lfs. When you run git lfs install on your machine, your ~/.gitconfig file is updated to contain the Git LFS filter definition.

[filter "lfs"]
  clean = git-lfs clean %f
  smudge = git-lfs smudge %f
  required = true
  process = git-lfs filter-process

Because this is modifying a user-specific file, git lfs install needs to be run once by each user of the repository before they can successfully use Git LFS.

When you run, for example, git lfs track "*.jpg" to track all .jpg files in the repo with Git LFS, it updates your .gitattributes file:

*.jpg filter=lfs diff=lfs merge=lfs -text

This tells Git to use the lfs clean and smudge filter for these files, as well as attaching some extra attributes. With the filters in place, whenever you stage a .jpg file it will be replaced with a pointer file containing the SHA-256 hash of the file content. The file itself gets stored in .git/lfs/objects at a path based on the hash so that it’s easy to find later. Note that the .gitattributes file can and should be checked into the repository so that everyone on the project tracks the same files in Git LFS.
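To give a feel for what the clean filter produces, here’s a sketch that builds a pointer file and the matching cache path for some content. The three-line pointer layout and the two-level directory sharding follow my reading of the Git LFS specification rather than anything shown above, so treat them as assumptions. The function names are made up.

```python
import hashlib
from pathlib import PurePosixPath


def make_pointer(content: bytes) -> str:
    """Build the text of a pointer file for the given content."""
    oid = hashlib.sha256(content).hexdigest()
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{oid}\n"
        f"size {len(content)}\n"
    )


def cache_path(content: bytes) -> PurePosixPath:
    """Where the content would live in the local Git LFS Cache."""
    oid = hashlib.sha256(content).hexdigest()
    # Objects are sharded into directories by the first characters of the hash.
    return PurePosixPath(".git/lfs/objects") / oid[:2] / oid[2:4] / oid


pointer = make_pointer(b"pretend this is a large image")
```

However large the original file is, the pointer stays a few lines long, and that pointer is all that ends up in the Git history.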

Almost every Git LFS command you run (including git lfs install and the clean and smudge filters) will also modify the pre-push hook if it’s not already set up. So, because we ran some Git LFS commands already, the .git/hooks/pre-push file should already look like this:

#!/bin/sh
command -v git-lfs >/dev/null 2>&1 || { echo >&2 "\nThis repository is configured for Git LFS but 'git-lfs' was not found on your path. If you no longer wish to use Git LFS, remove this hook by deleting .git/hooks/pre-push.\n"; exit 2; }
git lfs pre-push "$@"

It first checks if the git-lfs executable exists and gives an error message if not. It then forwards to the git lfs pre-push command. Git LFS’s pre-push command will read the list of branches to be pushed and scan each new commit in those branches for new pointer files [source code]. For each new pointer file it finds, it looks up the actual large file in the Git LFS Cache. It will then upload all the new files to the remote server using the Git LFS API.

Note that because it’s using hashes for file identity, you will never end up with two copies of the same large file on the remote server.

And that’s it! These are the main mechanisms Git LFS uses to do its job.

What could possibly go wrong?

Now that we understand the main mechanisms in use, we can debug some issues that might arise:

I’m seeing pointer files when I should be seeing the actual files

The smudge filter might not be properly set up for this file.

Ensure the .gitattributes file has a line that matches this file with filter=lfs
Ensure you have the filter installed. It’s perfectly harmless to re-run git lfs install
Check out the file again to re-run the filter: git checkout --

Large files are being checked in when I wanted pointers to be checked in

The clean filter might not be properly set up for this file.

Follow a similar remedy to above
Re-run the filter with git add --renormalize

My repo is still taking up loads of space

Installing Git LFS won’t automatically run the LFS filters on historical commits. If you already have large files in your history and want to rewrite your Git history to avoid that, take a look at git lfs migrate
If Alice makes 10 commits that change a large file, pushes them, and then Bob checks out her latest commit, then Bob will only download the latest version but Alice will likely still have all 10 versions in her LFS cache. Alice may want to run git lfs prune in her copy of the repo to get rid of unneeded versions of the file.

Summary

In summary, Git LFS…

…does not save you any space on the remote server, but saves you space locally.
…uses clean and smudge filters.
…installs a git-lfs executable which, due to its name, can also be run with git lfs.
…installs filters in ~/.gitconfig when you run git lfs install.
…configures filters in .gitattributes when you run git lfs track ....
…configures the pre-push hook whenever you run any git-lfs command (including the clean and smudge filters).
…puts large files away in .git/lfs/objects, named by their SHA-256 hashes, using the clean filter.
…pushes large files to the remote using the pre-push hook.

References

Git LFS Client Specification
Tim Pettersen — Tracking huge files with Git LFS
BitBucket Git LFS Tutorials — These are well-written and cover many practical details of using Git LFS
Git Hooks Documentation
Git Attributes Documentation
Git LFS Source Code

Python Typing: Resisting the Any type

2020-08-29T15:29:00+00:00

With typing in Python, we aim to restrict as many invalid programs as possible before they’re ever run. This post covers some useful features for tightening up our types:

TypeVar for when an unknown type appears multiple times in the same context
@overload for when you have a function whose behaviour depends on its input
Protocol for supporting any type with the desired attributes and methods

Introduction

Python itself doesn’t have a static type checker, but does provide a syntax for adding type annotations to your code:

# basics.py

from typing import List
from dataclasses import dataclass


# We can annotate the input and output types of functions.
def fibonacci(n: int) -> int:
    assert n >= 0
    if n == 0:
        return 0
    if n == 1:
        return 1
    return fibonacci(n - 1) + fibonacci(n - 2)


@dataclass
class Document:
    # We can annotate class attributes too.
    # We're using the generic type 'List' with parameter 'str'.
    # This means the `lines` attribute is a list of strings.
    lines: List[str]


# This is invalid:
broken_doc = Document(lines=[1, 2, 3])

# This is valid:
fibonacci_doc = Document(
    lines=[
        f"{n}th fib number is {fibonacci(n)}"
        for n in range(20)
    ]
)

IDEs like PyCharm and type-checking tools like Mypy and Pyre use these annotations to identify type errors. The intended rules of this type-checking are mostly standardised in Python Enhancement Proposals (PEPs), so the behaviour is similar between the tools.

Getting started with Mypy

Here, we’ll use Mypy for type-checking. To get started with Mypy, simply pip install mypy. Here’s an example of running Mypy using the basics.py file from above:

# In your terminal:
virtualenv mypy-venv -p python3.8
source mypy-venv/bin/activate
pip install mypy
mypy --version
# mypy 0.770
mypy --strict python-typing-playground/basics.py
# warm-up/basics.py:22: error: List item 0 has incompatible type "int"; expected "str"
# warm-up/basics.py:22: error: List item 1 has incompatible type "int"; expected "str"
# warm-up/basics.py:22: error: List item 2 has incompatible type "int"; expected "str"
# Found 3 errors in 1 file (checked 1 source file)

I’m using Python 3.8. Many interesting typing features were added to the standard library in 3.8, but if you can’t use Python 3.8, typing-extensions often has your back.

Mypy lets you get started with a large existing untyped code base and gradually add type hints over time. It will simply silently treat all objects as Any if it can’t find type hints for them. I recommend using the --strict flag when first trying it out. This mostly stops Mypy from silently assuming things are Any, which I think is helpful when trying to understand what Mypy can and cannot infer.
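To make the silent Any behaviour a bit more concrete, here’s a small sketch. The function names are made up for illustration. Without --strict, I’d expect Mypy to accept the call to the unannotated function without comment, whereas --strict should at least flag the missing annotations.

```python
def untyped_double(x):
    # No annotations: Mypy treats `x` and the return value as Any.
    return x * 2


def typed_double(x: int) -> int:
    return x * 2


# Mypy has nothing to check here, so this passes silently.
# (It also happens to run, because str supports `*`.)
silently_accepted = untyped_double("ab")

# With annotations, the equivalent mistake is caught before running:
# typed_double("ab")  # incompatible argument type
checked = typed_double(21)
```

The point is not that the untyped call crashes (it doesn’t), but that Mypy can no longer tell us anything about it.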

What’s wrong with `Any` anyway?

Adding types restricts the way our functions, classes and variables can be used. This restriction is good because it helps catch whole categories of bugs before the code is ever run whilst also (usually) making the code easier to think about. Typically, our goal with typing in Python is:

to restrict as many ‘invalid’ usages as possible,
whilst allowing all ‘valid’ usages,
and keeping our code relatively tidy.

Sometimes, using only the syntax given above, we might get stuck on how to express the type of something and turn to the Any type. The Any type is a magical type which pretends to the type-checker to support any operation you could possibly want, and to also be a parent type of all types:

from typing import Any


def do_something_shady(magic_input: Any) -> Any:
    # We can do any operation we want on an Any object.
    # The type-checker will allow it 
    # and assume the result is an Any object
    return magic_input.thing["stuff"] + 42


# All types are compatible with Any,
# so even if we pass in a str, this is valid.
# If we try to run this, it will fail,
# since a string doesn't have a 'thing' attribute.
do_something_shady(magic_input="just a string")

Clearly, using Any can allow many invalid programs. I’d like to share a few useful Python type-system features that we can consider before giving in to the Any type’s magical allure, or otherwise loosening our types.

Polymorphism with `TypeVar`

Type variables let us specify that some unknown type will appear multiple times in the same context, whether that be multiple times in the same function signature, or multiple times in methods of a generic class. We don’t know what the type will be, but we know it will be the same in all the places where it appears.

from typing import List, TypeVar

# Declare a type variable
ValueType = TypeVar("ValueType")  # The string in this constructor must match the variable name


def repeat(value: ValueType, count: int) -> List[ValueType]:
    """Returns the input value repeated `count` times in a list."""
    return [value] * count

# This is valid
", ".join(repeat("a", count=5))

# This is invalid (can't do a string join on a list of ints)
", ".join(repeat(3, count=5))

Because ValueType is used in two different places in the same signature, Mypy will check, for each call to this function, that the values used in those two places match types. In this case, it checks that the type of value returned from the function is always a list of the type of value given to the function. This also means Mypy knows more about the result of calling the function, and can be stricter there too.
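The same idea carries over to the generic class case mentioned earlier. Here’s a minimal sketch, where Stack and ItemType are names I’ve invented for the example: the type variable ties together the item types of push and pop on the same instance.

```python
from typing import Generic, List, TypeVar

ItemType = TypeVar("ItemType")


class Stack(Generic[ItemType]):
    """A last-in, first-out stack holding items of a single type."""

    def __init__(self) -> None:
        self._items: List[ItemType] = []

    def push(self, item: ItemType) -> None:
        self._items.append(item)

    def pop(self) -> ItemType:
        return self._items.pop()


names: Stack[str] = Stack()
names.push("spots")
names.push("buzz")

# This is valid (pop returns a str, so we can call str methods)
shouted = names.pop().upper()

# This would be invalid (an int is not a str):
# names.push(3)
```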

Using bound=

So TypeVar is great for capturing this idea of an unknown type appearing multiple times. Sometimes we do know something about the type, and maybe we want to use a certain field that we know will exist on the type, but we still want to capture and check the fact that it appears multiple times.

By default, a TypeVar will bind to any type, all the way up to object, but we can put an ‘upper bound’ on this using the bound parameter. This says ‘only let this TypeVar bind to a subtype of X’, where X is some type that we care about.

from typing import List, TypeVar
from dataclasses import dataclass

@dataclass
class Animal:
    name: str


class Dog(Animal):
    def pat(self) -> None:
        print(f"gave {self.name} a good pat")


class Cat(Animal):
    def stroke(self) -> None:
        print(f"stroked {self.name}")


AnimalType = TypeVar('AnimalType', bound=Animal)


def sort_animals_by_name(items: List[AnimalType]) -> List[AnimalType]:
    return sorted(items, key=lambda animal: animal.name)


# This is valid (all the inputs are dogs so all have the 'pat' method)
sorted_dogs = sort_animals_by_name([Dog(name="spots"), Dog(name="buzz")])
sorted_dogs[0].pat()

# This is invalid
sorted_dogs[0].stroke()

# This is invalid (cannot use str due to bound=Animal)
not_animals = ["John", "Joan", "Jan"]
sort_animals_by_name(not_animals)

Because we specified that AnimalType must be a subtype of Animal, the type-checker allows us to use the name attribute from Animal within our function. Note that we still aren’t able to use a more specific method like pat within the body of sort_animals_by_name, since it doesn’t always exist on the upper bound, Animal.
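You might wonder why we don’t just annotate with Animal directly. The sketch below contrasts the two. The names sort_plain and sort_generic are mine, and pat returns a string here rather than printing, purely to make the result easy to inspect. The plain version forgets that it was given dogs, while the bound TypeVar version remembers.

```python
from dataclasses import dataclass
from typing import List, Sequence, TypeVar


@dataclass
class Animal:
    name: str


class Dog(Animal):
    def pat(self) -> str:
        return f"gave {self.name} a good pat"


AnimalType = TypeVar("AnimalType", bound=Animal)


def sort_plain(items: Sequence[Animal]) -> List[Animal]:
    return sorted(items, key=lambda animal: animal.name)


def sort_generic(items: Sequence[AnimalType]) -> List[AnimalType]:
    return sorted(items, key=lambda animal: animal.name)


dogs = [Dog(name="spots"), Dog(name="buzz")]

# This is valid (the result is a List[Dog])
first_pat = sort_generic(dogs)[0].pat()

# This would be invalid (the result is only known to be a List[Animal]):
# sort_plain(dogs)[0].pat()
```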

Misconceptions about scoping

Because of the strange syntax for declaring a type variable, where we create a global object, it’s easy to get confused about their scoping rules:

Question: Is the following valid?

from typing import List, TypeVar

T = TypeVar("T")

def repeat(value: T, times: int) -> List[T]:
    return [value] * times

def identity(x: T) -> T:
    return x

repeat("echo", times=5)
repeat(42, times=5)
identity(42)

After all, we seem to be asking T to be a string and then to be an int.

The answer is yes, this is valid. Generic functions using TypeVars can happily be called multiple times with different input types. The two calls to the function are unrelated as far as this TypeVar is concerned. Similarly, TypeVars can happily be recycled across multiple independent functions. You often only need multiple TypeVars if you need multiple distinct types within the same function signature.

See Scoping rules for type variables within PEP 484 for more examples.
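For completeness, here’s a sketch of the case where a single signature does need two type variables. The function to_pairs and the names K and V are hypothetical. The key type and the value type are independent of each other, so each gets its own TypeVar.

```python
from typing import Dict, List, Tuple, TypeVar

K = TypeVar("K")
V = TypeVar("V")


def to_pairs(mapping: Dict[K, V]) -> List[Tuple[K, V]]:
    """Returns the (key, value) pairs of a dict as a list."""
    return list(mapping.items())


# K binds to str and V binds to int for this call.
pairs = to_pairs({"a": 1, "b": 2})

# This is valid (the first element is a str, the second is an int)
label = pairs[0][0].upper() + str(pairs[0][1] + 1)

# This would be invalid (can't add an int to a str key):
# pairs[0][0] + 1
```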

Overloading with `@overload`

Sometimes we have an overloaded function, one which has different behaviour depending on the input given.

Overloading in Python is quite different to other languages. In other languages, we might write multiple function implementations with the same function name but different arguments (either in type or in name) and when that function is called, the correct implementation would be chosen based on the given arguments.

In Python, we are not going to define multiple implementations (remember our type annotations aren’t read at runtime). In Python we just have one implementation, which manually checks the types and does the right thing. All @overload lets us do is tell the type checker which combinations of parameters and outputs are valid.

We do this by providing multiple @overload signatures before defining our actual implementation.

from typing import Optional, overload
import math

@overload
def get_circle_area(*, radius: float) -> float: ...

@overload
def get_circle_area(*, circumference: float) -> float: ...

def get_circle_area(
    *,
    radius: Optional[float] = None,
    circumference: Optional[float] = None,
) -> float:
    """
    Takes either a radius or circumference of a circle,
    and returns the area of that circle.
    """

    # Check we've been given exactly one of the two forms of input
    if radius is not None and circumference is not None:
        raise ValueError("Can't use both radius and circumference")
    elif radius is not None:
        canonical_radius = radius
    elif circumference is not None:
        canonical_radius = circumference / math.tau
    else:
        raise ValueError("Give either a radius or circumference")
    
    return math.pi * canonical_radius * canonical_radius


# Invalid (want exactly one of radius or circumference):
get_circle_area()
get_circle_area(radius=None)
get_circle_area(circumference=None)
get_circle_area(radius=1.0, circumference=None)
get_circle_area(radius=1.0, circumference=3.0)
get_circle_area(3.0)  # ambiguous without the argument keyword

# Valid:
get_circle_area(radius=1.0)
get_circle_area(circumference=math.tau)

We used two @overload signatures (with empty implementations) before specifying the implementation itself. The type signature of the implementation is only used for type-checking that implementation, not for type-checking usages of the function. The type checker will also make sure that our implementation supports all the @overload signatures described. For example, if we have an @overload which accepts a string argument, but the implementation only takes floats, Mypy will report an error.

As a side note, calling this function with a number but without the argument keyword (e.g. radius=) would be ambiguous: we wouldn’t know if the number was a radius or a circumference. The asterisk in the function signatures enforces that keywords be used. For more explanation, see Keyword-Only Arguments — Specification.
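Another common use of @overload is when the return type depends on the input type. Here’s a minimal sketch with a made-up function, double. Without the overloads, every caller would get back a Union[int, str] and have to narrow it themselves.

```python
from typing import Union, overload


@overload
def double(value: int) -> int: ...


@overload
def double(value: str) -> str: ...


def double(value: Union[int, str]) -> Union[int, str]:
    """Doubles a number, or repeats a string twice."""
    if isinstance(value, int):
        return value * 2
    return value + value


# Valid (the type checker picks the int overload, so arithmetic is fine)
total = double(21) + 1

# Valid (the type checker picks the str overload, so str methods are fine)
shout = double("ha").upper()

# This would be invalid (the result of double("ha") is a str):
# double("ha") + 1
```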

Static Duck Typing with `Protocol`

If we’re restricting the input types of a function, we sometimes only care that it has certain attributes and methods, not whether it’s a subclass of a particular class. Furthermore, it’s not always possible to modify a class to inherit from a certain base class, because it might be from a library that you don’t control.

Protocol lets us capture these attribute and method constraints without requiring the type to inherit from any particular class.

# If you can't use Python 3.8, you can also import Protocol from typing-extensions
# https://pypi.org/project/typing-extensions/
from typing import Protocol
from dataclasses import dataclass

# Setup: we'll define a few classes which happen to have a 'name'

@dataclass
class Person:
    first_name: str
    last_name: str
    height: float

    @property
    def name(self) -> str:
        return self.first_name + " " + self.last_name

    def greet(self) -> str:
        return f"Hi, my name is {self.name}."


@dataclass
class Company:
    name: str
    address: str

    def greet(self) -> str:
        return f"We are {self.name}, find us at {self.address}."


@dataclass
class Dog:
    name: str
    color: str

    def greet(self) -> str:
        return "Woof!"



class Greetable(Protocol):
    """
    Matches any type with
    a readable (but not necessarily writeable) `name`
    and a `greet` method that returns a string.
    """
    @property
    def name(self) -> str: ...

    def greet(self) -> str: ...


def introduce(greetable: Greetable) -> None:
    print(
        f"This thing is named {greetable.name}"
        f" and it says '{greetable.greet()}'"
    )


class Renameable(Protocol):
    """Matches any type with a read/write `name`."""
    name: str


def rename(renameable: Renameable, new_name: str) -> None:
    renameable.name = new_name


spots = Dog(name="spots", color="white")
acme = Company(name="Acme Inc.", address="123 Industry Ave.")
john = Person(first_name="John", last_name="Johnson", height=1.96)

# Valid things
introduce(spots)
introduce(acme)
introduce(john)
rename(spots, "spotsy")

# Invalid things
introduce(42)
introduce("blah")
rename(john, "J-man")  # Person's `name` is read-only

None of the classes we defined inherit from Greetable (they could subclass it explicitly, but they don’t need to), yet they all conform to it because they have a name and a greet method with the appropriate types. As such, they can all be passed to a function that takes a Greetable.

When specifying attributes that must exist on the type, we saw both:

the @property form, which lets us specify that we only need to read this value,
and the name: str form which is a shorthand for an attribute that is both readable and writeable.

We can also opt in to these Protocols being checkable at runtime. For more detail, see the @runtime_checkable decorator within PEP 544.
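As a rough sketch of what that opt-in looks like (SupportsGreet, Robot and maybe_greet are names invented for this example), decorating a Protocol with @runtime_checkable lets us use it with isinstance. Bear in mind that the runtime check only looks for the presence of the members, not their signatures or types.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class SupportsGreet(Protocol):
    def greet(self) -> str: ...


class Robot:
    def greet(self) -> str:
        return "Beep boop."


def maybe_greet(thing: object) -> str:
    # isinstance is only allowed here because of @runtime_checkable.
    # It checks that a `greet` attribute exists, nothing more.
    if isinstance(thing, SupportsGreet):
        return thing.greet()
    return "(silence)"


robot_says = maybe_greet(Robot())
int_says = maybe_greet(42)
```

Inside the if branch, the type checker narrows thing to SupportsGreet, so the call to greet is valid even though the parameter is only annotated as object.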

Conclusion

In this post we’ve seen some useful features for tightening our types:

Use TypeVar to express that a single unknown type will appear multiple times in the same context
Use @overload when you have a function whose behaviour depends on its input
Use Protocol when you want to support any type which happens to have the attributes and methods that you need.

Should we get rid of British Summer Time?

2020-08-22T14:15:42+00:00

British Summer Time is the UK’s rendition of the ever-controversial concept of Daylight Saving Time (DST). With the EU counting down to a co-ordinated end to DST in 2021, just after the end of the Brexit transition period, the jury is out on what the UK will do next. Here, I explore some of the issues around this question.

A brief history of Daylight Saving Time in the UK

The year was 1895. George Hudson was beginning to cause quite a stir on the New Zealand time-keeping scene. A Post Office worker by trade yet an astronomer and insect scientist by passion, Hudson cared about the opportunity to utilise daylight hours to their fullest. He presented his case to the Wellington Philosophical Society. From Transactions and Proceedings of the Royal Society of New Zealand 1868-1961:

We cannot individually alter our times of going to bed or getting up, but must fall in with the habits of the majority ... those who desire to utilise the early-morning daylight are compelled to take some of their recreation before their daily work and some afterwards, which in many cases results in their having to forego pursuits that they would be enabled to follow successfully if their daylight leisure were continuous.

Though Hudson seems to be the first to write this idea down in so many words, it took tens of years, a world war, and persistent campaigning by other proponents of the idea for DST to be implemented anywhere in the world.

1916	British Summer Time was adopted (Summer Time Act 1916), after Germany had already implemented DST. The purpose at the time was to preserve coal (sunlight at more appropriate hours means less artificial light).
1941-1945	Clocks were brought forward an extra hour during World War Two (GMT+1 in winter, GMT+2 in summer, ‘British Double Summer Time’)
1968-1971	The UK experimented with year-round GMT+1 (‘the British Standard Time experiment’)
1997	The EU started prescribing the clock change and the dates on which it should happen throughout its member states (Eighth Directive 97/44/EC on summer-time arrangements)

British Standard Time Experiment

We can gain some insights about our question by looking back on the British Standard Time experiment and the House of Commons British Standard Time debate that followed. The experiment was largely met with a ‘shrug’ by the UK public:

60% had no strong views, 35% were in favour of retaining the new system, 5% were against the new system [ref]

The rest of the debate was similarly inconclusive:

Some concluded from the experiment that road accidents had decreased though others were quick to point out that the breathalyser was introduced shortly before the experiment began, complicating the analysis. Analysis tried to avoid this complication by only looking at the change in accidents in two ‘rush hour’ windows, arguing that drunk driving accidents typically happen later at night. (The Potential Effects On Road Casualties Of Double British Summer Time — 2.1)
Manual labourers, especially in the north, as well as postmen who had to do morning deliveries, complained of going to work in the dark
Some mothers complained that their children had to travel to school in the dark. Others enjoyed that their children were coming home in the brighter hours of the afternoon.
Many argued that the additional time for leisure activities provided by year-round GMT+1 was particularly important. (Recall, this is the same argument George Hudson used when he first argued for DST)
Farmers were against the change. Or, wait, were they in favour of it?

The Scotland problem

The country was also divided by its wide range of latitude (north-ness). Daylight hours change with the seasons and do so more dramatically the further you are from the equator. This means there is a stark difference between the daylight hours of England and Scotland:

	Shortest Day	Longest Day
Edinburgh	7 hours	17 ½ hours
London	8 hours	16 ½ hours

This makes having a daylight saving period more appealing in Scotland because it’s harder to find a single time zone there that works all year round without either wasting morning daylight in summer or having some very dark mornings in winter. During the experiment, Edinburgh would have days where there was no daylight until 9:43 (whilst London would have daylight from 9:06).

Further Considerations

Of course, a lot has changed since 1970 and the public debate has continued on whether we should keep fiddling with our clocks. Parliament has many times debated changes to the system, notably including the failed Daylight Saving Bill 2010. These have repeatedly failed or run out of parliamentary time. Further considerations on the issue have included:

Energy

There are many hunches one can form around how changing BST would impact energy usage:

Synchronisation with mainland Europe could lead to increased peak-time energy costs since peaks will be shared across UK and Europe. The UK imports about 5% of its electricity from France, and also imports from the Netherlands and Belgium.
BST might cause some to wake up earlier when it’s not warm or bright yet and so could increase heating and lighting costs in the morning.
If morning hours are dark, people may forget to turn off lights when they leave for work.

The ambitious can attempt to quantify the effects. In 2010, Cambridge University Engineering Department estimated overall energy savings of 0.3% in winter months if we were to adopt BST all year round. This isn’t an especially compelling result, though the authors suggest that they have adopted “a conservative approach such that they consider them lower bounds on any true savings.”

Sleep and Health

There has been some suggestion of a spike in heart attacks (or ‘acute myocardial infarctions’, if you must) the week after we lose an hour of sleep, at least in the US. Later research has suggested that the net incidence of heart attacks is roughly unaffected, due to the corresponding drop when we gain an hour of sleep.

The EU pulls the plug

Despite the confusion and lack of conclusion, the European Parliament has made progress on a law that would see all member states ditch daylight saving time, after running a public consultation (which, curiously, was answered overwhelmingly by Germans). Each state will have the choice of whether to keep permanent summer time or permanent standard time but will have its last clock change for daylight saving in 2021. The legislation at the EU level allows for a coordinated end to daylight saving time in Europe.

This leaves a bit of a question mark for the UK as the Brexit transition period, during which it is subject to EU law, is due to end juuust before 2021. Assuming the EU law does get passed, feasible options for the UK are:

Ignore the EU and crack on with what we already have
Join the EU in ending DST and use permanent GMT+1 (‘summer time’)
Join the EU in ending DST and use permanent GMT (‘winter time’)

Flexibility

Those in search of a ‘correct’ answer to this question will be disappointed. Changing the clocks in one way benefits some and disadvantages others. Putting the clocks forward may benefit golf-players and disadvantage postmen. Putting the clocks back may benefit farmers who need to feed fussy cows and disadvantage those driving home at 5pm. This balance is inherent in trying to change behaviour at such a global level. Should we really be surprised that the schedules we’ve gotten used to over the past 80 years are on average about right?

Many criticisms of Daylight Saving Time are really criticisms of an inflexible system of work.

Daylight saving time does not give anyone any more time between sunrise and sunset. Rather unfortunately, that is set by nature alone. Daylight Saving Time exists to nudge the nation toward taking advantage of changing daylight hours with the seasons, to shift our rigid routines. Perhaps those routines need to be a little less rigid in the first place. George Hudson’s claim that “we cannot individually alter our times of going to bed or getting up, but must fall in with the habits of the majority” may not be so true anymore, and perhaps we should try re-evaluating smaller changes we can make.

We don’t stick to the same routine all year round because it doesn’t work all year round. Similarly, not everyone should work to a 9-5 routine because it doesn’t work for everyone.
