Reading list round up II
How to write complex software
Grant Slatton's article provides an overview of his process for implementing complex software systems1. Two principles particularly stand out:
First, understanding your system's performance envelope is key to assessing different implementation options. Writing disposable toy programs to identify the hardware/software limits that apply in the most optimistic case (i.e. with little to no overhead) can help sense-check key design decisions, ensuring they are fit for purpose before committing significant resources.
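As a concrete illustration (my sketch, not the article's code), suppose a design hinges on fast sequential disk writes: a throwaway script like the one below puts a rough upper bound on what the hardware will give you before any real engineering starts. The 1 GiB size and scratch file name are arbitrary assumptions.

```python
# Disposable toy benchmark: how fast can this machine write 1 GiB sequentially?
# The result is an optimistic upper bound for any design that relies on disk throughput.
import os
import time

CHUNK = b"\0" * (4 * 1024 * 1024)   # 4 MiB write buffer
TOTAL_BYTES = 1024 * 1024 * 1024    # 1 GiB in total

start = time.perf_counter()
with open("scratch.bin", "wb") as f:
    written = 0
    while written < TOTAL_BYTES:
        f.write(CHUNK)
        written += len(CHUNK)
    f.flush()
    os.fsync(f.fileno())            # make sure the bytes actually hit the disk
elapsed = time.perf_counter() - start

print(f"~{TOTAL_BYTES / elapsed / 1e6:.0f} MB/s sequential write")
os.remove("scratch.bin")
```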
Second, working top-to-bottom. Traditional software development often starts by implementing lower-level dependencies first, like building a bridge from both riverbanks simultaneously. However, this approach frequently leads to misaligned interfaces, kludges and accidental complexity, in part because there is no good feedback loop: different stakeholders work in parallel without any means of seeing how the system behaves as a whole. Delivery pressures and the sunk cost fallacy make it very tempting for teams to accept this state of affairs, apply band-aids and move on. Designing from the top down instead allows teams to craft interfaces that support readable code and sensible abstractions. This requires more imagination and a sense of what good looks like (you will be writing code against dependencies that don't exist yet), but it provides faster feedback on how components fit together and lets teams demonstrate working software earlier, even with stubbed components.
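A minimal sketch of the idea (my own toy example, not from the article): the top-level flow is written first against an interface that does not exist yet, the interface is shaped by what the caller needs, and a stub stands in for the real dependency so the system can be demonstrated end to end before the real implementation is built.

```python
# Top-down sketch: the top-level logic is written first, against an interface shaped by the caller.
from typing import Protocol


class ArchiveStore(Protocol):
    """The dependency we wish existed; the real implementation comes later."""
    def put(self, key: str, payload: bytes) -> None: ...
    def get(self, key: str) -> bytes: ...


def ingest(store: ArchiveStore, key: str, document: str) -> None:
    # The top-level code reads naturally because the interface was designed for it.
    store.put(key, document.encode("utf-8"))


class InMemoryStore:
    """Stub implementation: just enough to run and demo the system end to end."""
    def __init__(self) -> None:
        self._data: dict[str, bytes] = {}

    def put(self, key: str, payload: bytes) -> None:
        self._data[key] = payload

    def get(self, key: str) -> bytes:
        return self._data[key]


store = InMemoryStore()
ingest(store, "doc-1", "hello, world")
print(store.get("doc-1"))  # b'hello, world'
```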
Nobody Gets Fired for Picking JSON, but Maybe They Should?
JSON is an incredibly successful data interchange format and is often adopted without further consideration. Unsurprisingly, despite its simplicity and human-friendliness, it is far from perfect. Nobody Gets Fired for Picking JSON, but Maybe They Should? is a great breakdown of the various problems that plague JSON. In particular:
- Numbers: Decimal number encoding is not defined in RFC 8259, leaving it up to each implementation. Because of how floating-point numbers work in computers, representations may not be exact, so issues like rounding become important. With JSON you get whatever the implementation chooses to go with, or worse, some other program changes the CPU rounding mode using fesetround! The behavior around representing infinity or NaN (Not a Number) is also just weird and wonderful.
- Data loss on large integers: You can encode larger numbers with 64-bit integers than with 64-bit floating-point numbers. Since every number in JSON is a decimal number, large integers are at risk of data loss. Even if your application doesn't use very large numbers directly, careless data modeling, for example storing barcodes as a numeric type, puts you at risk of data loss (see the sketch after this list).
- Strings: Generally okay, and the whole JSON document should be encoded in UTF-8 (a great start); however, it still permits unpaired surrogate code points, which can lead to some strange artifacts when encoding/decoding.
- Binary data: If you need to transmit binary data, it has to be encoded as a base64 string, which adds bloat and overhead.
- Streaming is not supported: Nothing else to say about this; you either get the full JSON document or you don't.
- Canonicalization woes: JSON does not care about whitespace or field ordering; however, digital signatures operate on byte blobs and are therefore sensitive to exactly these things. RFC 8785 defines a JSON Canonicalization Scheme, which re-uses the ECMA-262 (JavaScript 6+) serialization rules and introduces subtle issues of its own when dealing with strings and numbers (e.g. unpaired surrogate code points are not supported).
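Two of these failure modes are easy to reproduce. Below is a minimal Python sketch (the barcode value is made up, and parse_int=float is used only to emulate a decoder that, like JavaScript's, stores every number as a 64-bit float): a large integer silently loses precision, and two documents with identical content but different whitespace and key ordering produce different bytes, and therefore different hashes, which is exactly what trips up naive signing.

```python
import hashlib
import json

# A long identifier stored as a JSON number (hypothetical barcode value).
doc = '{"barcode": 9007199254740993}'   # 2**53 + 1: not representable as a 64-bit float

# Emulate a decoder that treats every number as a double, as JavaScript does.
naive = json.loads(doc, parse_int=float)
print(int(naive["barcode"]))            # 9007199254740992 -- silent off-by-one

# Same data, different whitespace and key order: the bytes (and any signature over them) differ.
a = json.dumps({"a": 1, "b": 2})
b = json.dumps({"b": 2, "a": 1}, indent=2)
print(hashlib.sha256(a.encode()).hexdigest() == hashlib.sha256(b.encode()).hexdigest())  # False
```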
Generative AI – The Power and the Glory
AI continues to capture headlines and funding. Michael Liebreich's analysis of how energy has become a crucial limiting factor for the tech industry's plans for AI is a very interesting read.
As of today, data centers account for less than 2% of US power consumption; however, Generative AI has quite an energy appetite: a Google search takes about 0.3Wh while a ChatGPT query takes 2.9Wh, a full order of magnitude more! With massive investments flowing into AI, managing this growing energy demand alongside existing infrastructure needs requires planning and collaboration between all parties2.
Given the boom-bust nature of the tech sector, forecasting power needs is a challenging exercise. Historical forecasts have often proved wildly inaccurate, reflecting inflated market sentiment (as seen during the crypto boom) and failing to account for actual demand, technological advances, or economies of scale. While hundreds of billions of dollars are being deployed to build new data centers, a crucial question remains: Is there sufficient demand to justify these capital expenditures?
According to the analysis, it would require $600 billion of annual revenue (and counting) to turn a profit on all the capital expenses in the pipeline. Actual adoption is nowhere near those numbers despite the hype. And while everyone is happy to play with these tools as long as they are free, convincing individuals and businesses to commit to recurring expenses is an entirely different matter, especially in a context where money is no longer cheap - Liebreich does a very good job of highlighting this.
Underlying a lot of forecasts is the assumption that transformer-based systems like ChatGPT will continue scaling up following the trend of the past two years, and that chips and energy will therefore be the bottlenecks. There are warning signs that this may not be the case:
- The available stock of public text data for training large language models may be approaching exhaustion3. Other modalities like video are also available, but whether they are economically and technically viable is an open question.
- AI workloads keep getting cheaper to run: between existing hardware architectures becoming better and entirely new classes of custom-designed chips/co-processors4 emerging, it will be possible to run increasingly advanced AI workloads at lower energy costs.
- Technology is increasingly political (especially given the erratic behavior of certain figures in tech). This means that there may be a market and incentives for privacy friendly AI models that run locally on devices without sharing information with the outside world.
- Regulatory oversight could limit the tech industry's more ambitious expansion plans.
Overall it is important to remember that when you are in the thick of it, it's hard to tell the difference between a sigmoid and an exponential curve. However, if a system requires the output of a nuclear plant to work, and in many ways still underperforms the human brain (which operates on just 20W), then the situation warrants a healthy dose of scepticism.
Footnotes
1. There is no hard and fast rule that defines complex software, but I would define it along three different dimensions: it is fairly large (i.e. it does not fit easily in one's head), it is connected to other systems/processes in an organization, and it deals with a non-trivial amount of load. ↩
2. The article cites examples of how actors in this space are adapting to this reality: Microsoft data centers in Wyoming sharing their backup generators with the rest of the grid in exchange for better energy prices; hyperscalers moving compute-intensive tasks, such as training new models, to data centers that have access to plentiful, cheap (and hopefully green) energy, while using data centers closer to densely populated areas (which also have more demand) to run lower-latency, less compute-intensive tasks such as inference. ↩
3. Will We Run Out of Data? Limits of LLM Scaling Based on Human-Generated Data ↩