What O'Reilly Learning Platform Usage Tells Us About Where the Industry Is Headed
March 1, 2023
This year’s report on the O’Reilly learning platform takes a detailed look at how our customers used the platform. Our goal is to find out what they’re interested in now and how that changed from 2021—and to make some predictions about what 2023 will bring.
A lot has happened in the past year. In 2021, we saw that GPT-3 could write stories and even help people write software; in 2022, ChatGPT showed that you can have conversations with an AI. Now developers are using AI to write software. Late in 2021, Mark Zuckerberg started talking about “the metaverse,” and fairly soon, everyone was talking about it. But the conversation cooled almost as quickly as it started. Back then, cryptocurrency prices were approaching a high, and NFTs were “a thing”…then they crashed.
What’s real, and what isn’t? Our data shows us what O’Reilly’s 2.8 million users are actually working on and what they’re learning day-to-day. That’s a better measure of technology trends than anything that happens among the Twitterati. The answers usually aren’t found in big impressive changes; they’re found in smaller shifts that reflect how people are turning the big ideas into real-world products. The signals are often confusing: for example, interest in content about the “big three” cloud providers is slightly down, while interest in content about cloud migration is significantly up. What does that mean? Companies are still “moving into the cloud”—that trend hasn’t changed—but as some move forward, others are pulling back (“repatriation”) or postponing projects. It’s gratifying when we see an important topic come alive: zero trust, which reflects an important rethinking of how security works, showed tremendous growth. But other technology topics (including some favorites) are hitting plateaus or even declining.
While we don’t discuss the economy as such, it’s always in the background. Whether or not we’re actually in a recession, many in our industry perceive us to be so, and that perception can be self-fulfilling. Companies that went on a hiring spree over the past few years are now realizing that they made a mistake—and that includes both giants that do layoffs in the tens of thousands and startups that thought they had access to an endless stream of VC cash. In turn, that reality influences the actions individuals take to safeguard their jobs or increase their value should they need to find a new one.
Methodology
This report is based on our internal “units viewed” metric, which is a single metric across all the media types included in our platform: ebooks, of course, but also videos and live training courses. We use units viewed because it measures what people actually do on our platform. But it’s important to recognize the metric’s shortcomings; as George Box (almost)1 said, “All metrics are wrong, but some are useful.” Units viewed tends to discount the usage of new topics: if a topic is new, there isn’t much content, and users can’t view content that doesn’t exist. As a counter to our focus on units viewed, we’ll take a brief look at searches, which aren’t constrained by the availability of content. For the purposes of this report, units viewed is always normalized to 1, where 1 is assigned to the greatest number of units in any group of topics.
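To make that normalization concrete, here's a minimal sketch in Python; the topic names and unit counts are invented for illustration and don't reflect actual platform data:

    # Hypothetical example: scale "units viewed" so that the largest topic
    # in a group is 1.0 and every other topic is a fraction of it.
    def normalize_units(units_by_topic: dict[str, float]) -> dict[str, float]:
        top = max(units_by_topic.values())
        return {topic: units / top for topic, units in units_by_topic.items()}

    print(normalize_units({"Topic A": 120_000, "Topic B": 110_000, "Topic C": 24_000}))
    # {'Topic A': 1.0, 'Topic B': 0.9166..., 'Topic C': 0.2}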
It’s also important to remember that these “units” are “viewed” by our users. Whether they access the platform through individual or corporate accounts, O’Reilly members are typically using the platform for work. Despite talk of “internet time,” our industry doesn’t change radically from day to day, month to month, or even year to year. We don’t want to discount or undervalue those who are picking up new ideas and skills—that’s an extremely important use of the platform. But if a company’s IT department were working on its ecommerce site in 2021, they were still working on that site in 2022, they won’t stop working on it in 2023, and they’ll be working on it in 2024. They might be adding AI-driven features or moving it to the cloud and orchestrating it with Kubernetes, but they’re not likely to drop React (or even PHP) to move to the latest cool framework.
However, when the latest cool thing demonstrates a few years of solid growth, it can easily become one of the well-established technologies. That’s happening now with Rust. Rust isn’t going to take over from Java and Python tomorrow, let alone in 2024 or 2025, but the movement is real. Finally, it’s wise to be skeptical about “noise.” Changes of one or two percentage points often mean little. But when a mature technology that’s leading its category stops growing, it’s fair to wonder whether it’s hit a plateau and is en route to becoming a legacy technology.
The Biggest Picture
We can get a high-level view of platform usage by looking at usage for our top-level topics. Content about software development was the most widely used (31% of all usage in 2022), which includes software architecture and programming languages. Software development is followed by IT operations (18%), which includes cloud, and by data (17%), which includes machine learning and artificial intelligence. Business (13%), security (8%), and web and mobile (6%) come next. That’s a fairly good picture of our core audience’s interests: solidly technical, focused on software rather than hardware, but with a significant stake in business topics.
Total platform usage grew by 14.1% year over year, more than doubling the 6.2% gain we saw from 2020 to 2021. The topics that saw the greatest growth were business (30%), design (23%), data (20%), security (20%), and hardware (19%)—all in the neighborhood of 20% growth. Software development grew by 12%, which sounds disappointing, although in any study like this, the largest categories tend to show the least change. Usage of resources about IT operations only increased by 6.9%. That’s a surprise, particularly since the operations world is still coming to terms with cloud computing.
O’Reilly learning platform usage by topic year over year
While this report focuses on content usage, a quick look at search data gives a feel for the most popular topics, in addition to the fastest growing (and fastest declining) categories. Python, Kubernetes, and Java were the most popular search terms. Searches for Python showed a 29% year-over-year gain, while searches for Java and Kubernetes are almost unchanged: Java gained 3% and Kubernetes declined 4%. But it’s also important to note what searches don’t show: when we look at programming languages, we’ll see that content about Java is more heavily used than content about Python (although Python is growing faster).
Similarly, the actual use of content about Kubernetes showed a slight year-over-year gain (4.4%), despite the decline in the number of searches. And despite being the second-most-popular search term, units viewed for Kubernetes were only 41% of those for Java and 47% of those for Python. This difference between search data and usage data may mean that developers “live” in their programming languages, not in their container tools. They need to know about Kubernetes and frequently need to ask specific questions—and those needs generate a lot of searches. But they’re working with Java or Python constantly, and that generates more units viewed.
The Go programming language is another interesting case. “Go” and “Golang” are distinct search strings, but they’re clearly the same topic. When you add searches for Go and Golang, the Go language moves from 15th and 16th place up to 5th, just behind machine learning. However, the change in use of either search term was relatively small: a 1% decline for Go, an 8% increase for Golang. Looking at Go as a topic category, we see something different: usage of content about Go is significantly behind the leaders, Java and Python, but still the third highest on our list, and with a 20% gain from 2021 to 2022.
Looking at searches is worthwhile, but it’s important to realize that search data and usage data often tell different stories.
Top searches on the O’Reilly learning platform year over year
Searches can also give a quick picture of which topics are growing. The top three year-over-year gains were for the CompTIA Linux+ certification, the CompTIA A+ certification, and transformers (the AI model that’s led to tremendous progress in natural language processing). However, none of these are what we might call “top tier” search terms: they had ranks ranging from 186 to 405. (That said, keep in mind that the number of unique search terms we see is well over 1,000,000. It’s a lot easier for a search term with a few thousand queries to grow than it is for a search term with 100,000 queries.)
The sharpest declines in search frequency were for cryptocurrency, Bitcoin, Ethereum, and Java 11. There are no real surprises here. This has been a tough year for cryptocurrency, with multiple scandals and crashes. As of late 2021, Java 11 was no longer the current long-term support (LTS) release of Java; that’s moved on to Java 17.
What Our Users Are Doing (in Detail)
That’s a high-level picture. But where are our users actually spending their time? To understand that, we’ll need to take a more detailed look at our topic hierarchy—not just at the topics at the top level but at those in the inner (and innermost) layers.
Software Development
The biggest change we’ve seen is the growth in interest in coding practices; 35% year-over-year growth can’t be ignored, and indicates that software developers are highly motivated to improve their practice of programming. Coding practices is a broad topic that encompasses a lot—software maintenance, test-driven development, maintaining legacy software, and pair programming are all subcategories. Two smaller categories that are closely related to coding practices also showed substantial increases: usage of content about Git (a distributed version control system and source code repository) was up 21%, and QA and testing was up 78%. Practices like the use of code repositories and continuous testing are still spreading to both new developers and older IT departments. These practices are rarely taught in computer science programs, and many companies are just beginning to put them to use. Developers, both new and experienced, are learning them on the job.
Going by units viewed, design patterns is the second-largest category, with a year-over-year increase of 13%. Object-oriented programming showed a healthy 24% increase. The two are closely related, of course; while the concept of design patterns is applicable to any programming paradigm, object-oriented programming (particularly Java, C#, and C++) is where they’ve taken hold.
It’s worth taking a closer look at design patterns. Design patterns are solutions to common problems—they help programmers work without “reinventing wheels.” Above all, design patterns are a way of sharing wisdom. They’ve been abused in the past by programmers who thought software was “good” if it used “design patterns,” and jammed as many into their code as possible, whether or not it was appropriate. Luckily, we’ve gotten beyond that now.
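For readers who want a refresher, here's a toy example of one classic pattern, Strategy, in Python; it isn't drawn from any particular title on the platform, just a minimal sketch of the idea that the behavior that varies is pulled out into interchangeable objects instead of being buried in conditionals:

    from abc import ABC, abstractmethod

    # Toy Strategy pattern: the shipping cost calculation varies, so each
    # variation becomes its own interchangeable object.
    class ShippingStrategy(ABC):
        @abstractmethod
        def cost(self, weight_kg: float) -> float: ...

    class FlatRate(ShippingStrategy):
        def cost(self, weight_kg: float) -> float:
            return 5.0

    class PerKilogram(ShippingStrategy):
        def cost(self, weight_kg: float) -> float:
            return 1.2 * weight_kg

    def quote(weight_kg: float, strategy: ShippingStrategy) -> float:
        return strategy.cost(weight_kg)

    print(quote(10, FlatRate()))     # 5.0
    print(quote(10, PerKilogram()))  # 12.0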
What about functional programming? The “object versus functional” debates of a few years ago are over for the most part. The major ideas behind functional programming can be implemented in any language, and functional programming features have been added to Java, C#, C++, and most other major programming languages. We’re now in an age of “multiparadigm” programming. It feels strange to conclude that object-oriented programming has established itself, because in many ways that was never in doubt; it has long been the paradigm of choice for building large software systems. As our systems are growing ever larger, object-oriented programming’s importance seems secure.
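To show what “multiparadigm” looks like in practice, here's a small, contrived Python example of our own (not drawn from the platform) in which the same computation is written imperatively and then functionally; both styles coexist comfortably in one language:

    from functools import reduce

    orders = [12.5, 7.0, 30.0, 3.25]

    # Imperative style: explicit loop and mutable accumulator.
    total_imperative = 0.0
    for amount in orders:
        if amount > 5:
            total_imperative += amount

    # Functional style: filter and reduce over the same data.
    total_functional = reduce(
        lambda acc, amount: acc + amount,
        filter(lambda amount: amount > 5, orders),
        0.0,
    )

    assert total_imperative == total_functional == 49.5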
Leadership and management also showed very strong growth (38%). Software developers know that product development isn’t just about code; it relies heavily on communication, collaboration, and critical thinking. They also realize that management or team leadership may well be the next step in their career.
Finally, we’d be remiss not to mention quantum computing. It’s the smallest topic category in this group but showed a 24% year-over-year gain. The first quantum computers are now available through cloud providers like IBM and Amazon Web Services (AWS). While these computers aren’t yet powerful enough to do any real work, they make it possible to get a head start on quantum programming. Nobody knows when quantum computers will be substantial enough to solve real-world problems: maybe two years, maybe 20. But programmers are clearly interested in getting started.
Year-over-year growth for software development topics
Software architecture
Software architecture is a very broad category that encompasses everything from design patterns (which we also saw under software development) to relatively trendy topics like serverless and event-driven architecture. The largest topic in this group was, unsurprisingly, software architecture itself: a category that includes books on the fundamentals of software architecture, systems thinking, communication skills, and much more—almost anything to do with the design, implementation, and management of software. Not only was this a large category, but it also grew significantly: 26% from 2021 to 2022. Software architect has clearly become an important role, the next step for programming staff who want to level up their skills.
For several years, microservices has been one of the most popular topics in software architecture, and this year is no exception. It was the second-largest topic and showed 3.6% growth over 2021. Domain-driven design (DDD) was the third-most-commonly-used topic, although smaller; it also showed growth (19%). Although DDD has been around for a long time, it came into prominence with the rise of microservices as a way to think about partitioning an application into independent services.
Is the relatively low growth of microservices a sign of change? Have microservices reached a peak? We don’t think so, but it’s important to understand the complex relationship between microservices and monolithic architectures. Monoliths inevitably become more complex over time, as bug fixes, new business requirements, the need to scale, and other issues need to be addressed. Decomposing a complex monolith into a complex set of microservices is a challenging task and certainly one that shouldn’t be underestimated: developers are trading one kind of complexity for another in the hope of achieving increased flexibility and scalability long-term. Microservices are no longer a “cool new idea,” and developers have recognized that they’re not the solution to every problem. However, they are a good fit for cloud deployments, and they leave a company well-positioned to offer its services via APIs and become an “as a service” company. Microservices are unlikely to decline, though they may have reached a plateau. They’ve become part of the IT landscape. But companies need to digest the complexity trade-off.
Web APIs, which companies use to provide services to remote client software via the web’s HTTP protocol, showed a very healthy increase (76%). This increase shows that we’re moving even more strongly to an “API economy,” where the most successful companies are built not around products but around services accessed through web APIs. That, after all, is the basis for all “software as a service” companies; it’s the basis on which all the cloud providers are built; it’s what ties Amazon’s business empire together. RESTful APIs saw a smaller increase (6%); the momentum has clearly moved from the simplicity of REST to more complex APIs that use JSON, GraphQL, and other technologies to move information.
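As a concrete (and entirely hypothetical) illustration of what consuming a web API looks like, here's a minimal Python client; the base URL, endpoint, and response fields are invented, and a real integration would add authentication, retries, and more careful error handling:

    import requests

    # Hypothetical service: a product catalog exposed over HTTP.
    BASE_URL = "https://api.example.com/v1"

    def get_product(product_id: str) -> dict:
        response = requests.get(f"{BASE_URL}/products/{product_id}", timeout=10)
        response.raise_for_status()  # surface 4xx/5xx errors to the caller
        return response.json()       # REST-style APIs typically return JSON

    product = get_product("abc-123")
    print(product.get("name"), product.get("price"))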
The 29% increase in the usage of content about distributed systems is important. Several factors drive the increase in distributed systems: the move to microservices, the need to serve astronomical numbers of online clients, the end of Moore’s law, and more. The time when a successful application could run on a single mainframe—or even on a small cluster of servers in a rack—is long gone. Modern applications run across hundreds or thousands of computers, virtual machines, and cloud instances, all connected by high-speed networks and data buses. That includes software running on single laptops equipped with multicore CPUs and GPUs. Distributed systems require designing software that can run effectively in these environments: software that’s reliable, that stays up even when some servers or networks go down, and where there are as few performance bottlenecks as possible. While this category is still relatively small, its growth shows that software developers have realized that all systems are distributed systems; there is no such thing as an application that runs on a single computer.
Year-over-year growth for software architecture and design topics
What about serverless? Serverless looks like an excellent technology for implementing microservices, but it’s been giving us mixed signals for several years now. Some years it’s up slightly; some years it’s down slightly. This year, it’s down 14%, and while that’s not a collapse, we have to see that drop as significant. Like microservices, serverless is no longer a “cool new thing” in software architecture, but the decrease in usage raises questions: Are software developers nervous about the degree of control serverless puts in the hands of cloud providers, spinning up and shutting down instances as needed? That could be a big issue. Cloud customers want to get their accounts payable down, cloud providers want to get their accounts receivable up, and if the provider tweaks a few parameters that the customer never sees, that balance could change a lot. Or has serverless just plunged into the “trough of disillusionment” from which it will eventually emerge into the “plateau of productivity”? Or maybe it’s just an idea whose time came and went? Whatever the reason, serverless has never established itself convincingly. Next year may give us a better idea…or just more ambiguity.
Programming languages
The stories we can tell about programming languages are little changed from last year. Java is the leader (with 1.7% year-over-year growth), followed by Python (3.4% growth). But as we look down the chart, we see some interesting challengers to the status quo. Go’s usage is only 20% of Java’s, but it’s seen 20% growth. That’s substantial. C++ is hardly a new language—and we typically expect older languages to be more stable—but it had 19% year-over-year growth. And Rust, with usage that’s only 9% of Java, had 22% growth from 2021 to 2022. Those numbers don’t foreshadow a revolution—as we said at the outset, very few companies are going to take infrastructure written in Java and rewrite it in Go or Rust just so they can be trend compliant. As we all know, a lot of infrastructure is written in COBOL, and that isn’t going anywhere. But both Rust and Go have established themselves in key areas of infrastructure: Docker and Kubernetes are both written in Go, and Rust is establishing itself in the security community (and possibly also the data and AI communities). Go and Rust are already pushing older languages like C++ and Java to evolve. With a few more years of 20% growth, Go and Rust will be challenging Java and Python directly, if they aren’t challenging them already for greenfield projects.
JavaScript is an anomaly on our charts: total usage is 19% of Java’s, with a 4.6% year-over-year decline. JavaScript shows up at, or near, the top on most programming language surveys, such as RedMonk’s rankings (usually in a virtual tie with Java and Python). However, the TIOBE Index shows more space between Python (first place), Java (fourth), and JavaScript (seventh)—more in line with our observations of platform usage. We attribute JavaScript’s decline partly to the increased influence of TypeScript, a statically typed variant of JavaScript that compiles to JavaScript (12% year-over-year increase). One thing we’ve noticed over the past few years: while programmers had a long dalliance with duck typing and dynamic languages, as applications (and teams) grew larger, developers realized the value of strong, statically typed languages (TypeScript certainly, but also Go and Rust, though these are less important for web development). This shift may be cyclical; a decade from now, we may see a revival of interest in dynamic languages. Another factor is the use of frameworks like React, Angular, and Node.js, which are undoubtedly JavaScript but have their own topics in our hierarchy. However, when you add all four together, you still see a 2% decline for JavaScript, without accounting for the shift from JavaScript to TypeScript. Whatever the reason, right now, the pendulum seems to be swinging away from JavaScript. (For more on frameworks, see the discussion of web development.)
The other two languages that saw a drop in usage are C# (6.3%) and Scala (16%). Is this just noise, or is it a more substantial decline? The change seems too large to be a random fluctuation. Scala has always been a language for backend programming, as has C# (though to a lesser extent). While neither language is particularly old, it seems their shine has worn off. They’re both competing poorly with Go and Rust for new users. Scala is also competing poorly with the newer versions of Java, which now have many of the functional features that initially drove interest in Scala.
Year-over-year growth for programming languages
Security
Computer security has been in the news frequently over the past few years. That unwelcome exposure has both revealed cracks in the security posture of many companies and obscured some important changes in the field. The cracks are all too obvious: most organizations do a bad job of the basics. According to one report, 91% of all attacks start with a phishing email that tricks a user into giving up their login credentials. Phishes are becoming more frequent and harder to detect. Basic security hygiene is as important as ever, but it’s getting more difficult. And cloud computing generates its own problems. Companies can no longer protect all of their IT systems behind a firewall; many of the servers are running in a data center somewhere, and IT staff has no idea where they are or even if they exist as physical entities.
Given this shift, it’s not surprising that zero trust, an important new paradigm for designing security into distributed systems, grew 146% between 2021 and 2022. Zero trust abandons the assumption that systems can be protected on some kind of secure network; all attempts to access any system, whether by a person or software, must present proper credentials. Hardening systems, while it received the least usage, grew 91% year over year. Other topics with significant growth were secure coding (40%), advanced persistent threats (55%), and application security (46%). All of these topics are about building applications that can withstand attacks, regardless of where they run.
Governance (year-over-year increase of 72%) is a very broad topic that includes virtually every aspect of compliance and risk management. Issues like security hygiene increasingly fall under “governance,” as companies try to comply with the requirements of insurers and regulators, in addition to making their operations more secure. Because almost all attacks start with a phish or some other kind of social engineering, just telling employees not to give their passwords away won’t help. Companies are increasingly using training programs, password managers, multifactor authentication, and other approaches to maintaining basic hygiene.
Year-over-year growth for security topics
Network security, which was the most heavily used security topic in 2022, grew by a healthy 32%. What drove this increase? Not the use of content about firewalls, which only grew 7%. While firewalls are still useful for protecting the IT infrastructure in a physical office, they’re of limited help when a substantial part of any organization’s infrastructure is in the cloud. What happens when an employee brings their laptop into the office from home or takes it to a coffee shop where it’s more vulnerable to attack? How do you secure WiFi networks for people working from home as well as in the office? The broader problem of network security has only become more difficult, and these problems can’t be solved by corporate firewalls.
Use of content about penetration testing and ethical hacking actually decreased by 14%, although it was the second-most-heavily-used security topic in our taxonomy (and the most heavily used in 2021).
Security certifications
Security professionals love their certifications. Our platform data shows that the most important certifications were CISSP (Certified Information Systems Security Professional) and CompTIA Security+. CISSP has long been the most popular security certification. It’s a very comprehensive certification oriented toward senior security specialists: candidates must have at least five years’ experience in the field to take the exam. Usage of CISSP-related content dropped 0.23% year over year—in other words, it was essentially flat. A change this small is almost certainly noise, but the lack of change may indicate that CISSP has saturated its market.
Compared to CISSP, the CompTIA Security+ certification is aimed at entry- or mid-level security practitioners; it’s a good complement to the other CompTIA certifications, such as Network+. Right now, the demand for security professionals exceeds the supply, and that’s drawing new people into the field. This fits with the increase in the use of content to prepare for the CompTIA Security+ exam, which grew 16% in the past year. The CompTIA CSA+ exam (since renamed CySA+) is a more advanced certification aimed specifically at security analysts; it showed 37% growth.
Year-over-year growth for security certifications
Use of content related to the Certified Ethical Hacker certification dropped 5.9%. The reasons for this decline aren’t clear, given that demand for penetration testing (one focus of ethical hacking) is high. However, there are many certifications specifically for penetration testers. It’s also worth noting that penetration testing is frequently a service provided by outside consultants. Most companies don’t have the budget to hire full-time penetration testers, and that may make the CEH certification less attractive to people planning their careers.
CBK isn’t an exam; it’s the framework of material around which the International Information System Security Certification Consortium, more commonly known as (ISC)², builds its exams. With a 31% year-over-year increase for CBK content, it’s another clear sign that interest in security as a profession is growing. And even though (ISC)²’s marquee certification, CISSP, has likely reached saturation, other (ISC)² certifications show clear growth: CCSP (Certified Cloud Security Professional) grew 52%, and SSCP (Systems Security Certified Practitioner) grew 67%. Although these certifications aren’t as popular, their growth is an important trend.
Data
Data is another very broad category, encompassing everything from traditional business analytics to artificial intelligence. Data engineering was the dominant topic by far, growing 35% year over year. Data engineering deals with the problem of storing data at scale and delivering that data to applications. It includes moving data to the cloud, building pipelines for acquiring data and getting data to application software (often in near real time), resolving the issues that are caused by data siloed in different organizations, and more.
Apache Spark, a platform for large-scale data processing, was the most widely used tool, even though the use of content about Spark declined slightly in the past year (2.7%). Hadoop, which would have led this category a decade ago, is still present, though usage of content about Hadoop dropped 8.3%; Hadoop has become a legacy data platform.
Microsoft Power BI has established itself as the leading business analytics platform; content about Power BI was the most heavily used, and achieved 31% year-over-year growth. NoSQL databases was second, with 7.6% growth—but keep in mind that NoSQL was a movement that spawned a large number of databases, with many different properties and designs. Our data shows that NoSQL certainly isn’t dead, despite some claims to the contrary; it has clearly established itself. However, the four top relational databases, if added together into a single “relational database” topic, would be the most heavily used topic by a large margin. Oracle grew 18.2% year over year; Microsoft SQL Server grew 9.4%; MySQL grew 4.7%; and PostgreSQL grew 19%.
Use of content about R, the widely used statistics platform, grew 15% from 2021. Similarly, usage of content about Pandas, the most widely used Python library for working with R-like data frames, grew 20%. It’s interesting that Pandas and R had roughly the same usage. Python and R have been competing (in a friendly way) for the data science market for nearly 20 years. Based on our usage data, right now it looks like a tie. R has slightly more market share, but Pandas has better growth. Both are staples in academic research: R is more of a “statistician’s workbench” with a comprehensive set of statistical tools, while Python and Pandas are built for programmers. The difference has more to do with users’ tastes than substance though: R is a fully capable programming language, and Python has excellent statistical and array-processing libraries.
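For anyone who hasn't worked with data frames, here's a small Pandas example with made-up numbers; the same split-apply-combine work in R would typically use aggregate or dplyr's group_by and summarise:

    import pandas as pd

    # Invented data: the kind of grouped summary that both R and Pandas
    # users produce constantly.
    df = pd.DataFrame({
        "language": ["Python", "Python", "R", "R", "Go"],
        "units":    [120, 95, 80, 110, 40],
    })

    summary = (
        df.groupby("language")["units"]
          .agg(["count", "mean", "sum"])
          .sort_values("sum", ascending=False)
    )
    print(summary)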
Usage for content about data lakes and about data warehouses was also just about equal, but data lakes usage had much higher year-over-year growth (50% as opposed to 3.9%). Data lakes are a strategy for storing an organization’s data in an unstructured repository; they came into prominence a few years ago as an alternative to data warehouses. It would be useful to compare data lakes with data lakehouses and data meshes; those terms aren’t in our taxonomy yet.
Year-over-year growth for data analysis and database topics
Artificial intelligence
At the beginning of 2022, who would have thought that we would be asking an AI-driven chat service to explain source code (even if it occasionally makes up facts)? Or that we’d have AI systems that enable nonartists to create works that are on a par with professional designers (even if they can’t match Degas and Renoir)? Yet here we are, and we don’t have ChatGPT or generative AI in our taxonomy. The one thing that we can say is that 2023 will almost certainly take AI even further. How much further nobody knows.
For the past two years, natural language processing (NLP) has been at the forefront of AI research, with the release of OpenAI’s popular tools GPT-3 and ChatGPT along with similar projects from Google, Meta, and others that haven’t been released. NLP has many industrial applications, ranging from automated chat services to code generation (e.g., GitHub Copilot) to writing tools. It’s not surprising that NLP content was the most viewed and saw significant year-over-year growth (42%). All of this progress is based on deep learning, which was the second-most-heavily-used topic, with 23% growth. Interest in reinforcement learning seems to be off (14% decline), though that may turn around as researchers try to develop AI systems that are more accurate and that can’t be tricked into hate speech. Reinforcement learning with human feedback (RLHF) is one new technique that might lead to better-behaved language models.
There was also relatively little interest in content about chatbots (a 5.8% year-over-year decline). This reversal seems counterintuitive, but it makes sense in retrospect. The release of GPT-3 was a watershed event, an “everything you’ve done so far is out-of-date” moment. We’re excited about what will happen in 2023, though the results will depend a lot on how ChatGPT and its relatives are commercialized, as ChatGPT becomes a fee-based service, and both Microsoft and Google take steps towards chat-based search.
Year-over-year growth for artificial intelligence topics
Our learning platform gives some insight into the tools developers and researchers are using to work with AI. Based on units viewed, scikit-learn was the most popular library. It’s a relatively old tool, but it’s still actively maintained and obviously appreciated by the community: usage increased 4.7% over the year. While usage of content about PyTorch and TensorFlow is roughly equivalent (PyTorch is slightly ahead), it’s clear that PyTorch now has momentum. PyTorch increased 20%, while TensorFlow decreased 4.8%. Keras, a frontend library that uses TensorFlow, dropped 40%.
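Part of scikit-learn's appeal is how little code a standard workflow takes. Here's a minimal sketch using one of the library's bundled datasets; it isn't tied to any particular course or title on the platform:

    from sklearn.datasets import load_iris
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split

    # Minimal scikit-learn workflow: split, fit, predict, score.
    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=42
    )

    model = RandomForestClassifier(n_estimators=100, random_state=42)
    model.fit(X_train, y_train)

    print("accuracy:", accuracy_score(y_test, model.predict(X_test)))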
It’s disappointing to see so little usage of content on MLOps this year, along with a slight drop (4.0%) from 2021 to 2022. One of the biggest problems facing machine learning and artificial intelligence is deploying applications into production and then maintaining them. ML and AI applications need to be integrated into the deployment processes used for other IT applications. This is the business of MLOps, which presents a set of problems that are only beginning to be solved, including versioning for large sets of training data and automated testing to determine when a model has become stale and needs retraining. Perhaps it’s still too early, but these problems must be addressed if ML and AI are to succeed in the enterprise.
No-code and low-code tools for AI don’t appear in our taxonomy, unfortunately. Our report AI Adoption in the Enterprise 2022 argues that AutoML in its various incarnations is gradually gaining traction. This is a trend worth watching. While there’s very little training available on Google AutoML, Amazon AutoML, IBM AutoAI, Amazon SageMaker, and other low-code tools, they’ll almost certainly be an important force multiplier for experienced AI developers.
Infrastructure and Operations
Containers, Linux, and Kubernetes are the top topics within infrastructure and operations. Containers sits at the top of the list (with 2.5% year-over-year growth), with Docker, the most popular container, in fifth place (with a 4.4% decline). Linux, the second most used topic, grew 4.4% year over year. There’s no surprise here; as we’ve been saying for some time, Linux is “table stakes” for operations. Kubernetes is third, with 4.4% growth.
The containers topic is extremely broad: it includes a lot of content that’s primarily about Docker but also content about containers in general, alternatives to Docker (most notably Podman), container deployment, and many other subtopics. It’s clear that containers have changed the way we deploy software, particularly in the cloud. It’s also clear that containers are here to stay. Docker’s small drop is worth noting but isn’t a harbinger of change. Kubernetes deprecated direct Docker support at the end of 2020 in favor of the Container Runtime Interface (CRI). That change eliminated a direct tie between Kubernetes and Docker but doesn’t mean that containers built by Docker won’t run on Kubernetes, since Docker supports the CRI standard. A more convincing reason for the drop in usage is that Docker is no longer new and developers and other IT staff are comfortable with it. Docker itself may be a smaller piece of the operations ecosystem, and it may have plateaued, but it’s still very much there.
Content about Kubernetes was the third most widely viewed in this group, and usage grew 4.4% year over year. That relatively slow growth may mean that Kubernetes is close to a plateau. We increasingly see complaints that Kubernetes is overly complex, and we expect that, sooner or later, someone will build a container orchestration platform that’s simpler, or that developers will move toward “managed” solutions where a third party (probably a cloud provider) manages Kubernetes for them. One important part of the Kubernetes ecosystem, the service mesh, is declining; content about service mesh showed a 28% decline, while content about Istio (the service mesh implementation most closely tied to Kubernetes) declined 42%. Again, service meshes (and specifically Istio) are widely decried as too complex. It’s indicative (and perhaps alarming) that IT departments are resorting to “roll your own” for a complex piece of infrastructure that manages communications between services and microservices (including services for security). Alternatives are emerging. HashiCorp’s Consul and the open source Linkerd project are promising service meshes. UC Berkeley’s RISELab, which developed both Ray and Spark, recently announced SkyPilot, a tool with goals similar to Kubernetes but that’s specialized for data. Whatever the outcome, we don’t believe that Kubernetes is the last word in container orchestration.
Year-over-year growth for infrastructure and operations topics
If there’s any tool that defines “infrastructure as code,” it’s Terraform, which saw 74% year-over-year growth. Terraform’s goals are relatively simple: You write a simple description of the infrastructure you want and how you want that infrastructure configured. Terraform gathers the resources and configures them for you. Terraform can be used with all of the major cloud providers, in addition to private clouds (via OpenStack), and it’s proven to be an essential tool for organizations that are migrating to the cloud.
We took a separate look at the “continuous” methodologies (also known as CI/CD): continuous integration, continuous delivery, and continuous deployment. Overall, this group showed an 18% year-over-year increase in units viewed. This growth comes largely from a huge (40%) increase in the use of content about continuous delivery. Continuous integration showed a 22% decline, while continuous deployment had a 7.1% increase.
What does this tell us? The term continuous integration was first used by Grady Booch in 1991 and popularized by the Extreme Programming movement in the late 1990s. It refers to the practice of merging code changes into a single repository frequently, testing at each iteration to ensure that the project is always in a coherent state. Continuous integration is tightly coupled to continuous delivery; you almost always see CI/CD together. Continuous delivery is a practice that was developed at the second-generation web companies, including Flickr, Facebook, and Amazon, which radically changed IT practice by staging software updates for deployment several times daily. With continuous delivery, deployment pipelines are fully automated, requiring only a final approval to put a release into production. Continuous deployment is the newest (and smallest) of the three, emphasizing completely automated deployment to production: updates go directly from the developer into production, without any intervention. These methodologies are closely tied to each other. CI/CD/CD as a whole (and yes, nobody ever uses CD twice) is up 18% for the year. That’s a significant gain, and even though these topics have been around for a while, it’s evidence that growth is still possible.
Year-over-year growth for continuous methodologies
IT and operations certifications
The leading IT certifications clearly belong to CompTIA, whose content showed a 41% year-over-year increase. The CompTIA family (Network+, A+, Linux+, and Security+) dominates the certification market. (The CompTIA Network+ showed a very slight decline (0.32%), which is probably just random fluctuation.) The Linux+ certification experienced tremendous year-over-year growth (47%). That growth is easy to understand. Linux has long been the dominant server operating system. In the cloud, Linux instances are much more widely used than the alternatives, though Windows is offered on Azure (of course) along with macOS. In the past few years, Linux’s market penetration has gone even deeper. We’ve already seen the role that containers are playing, and containers almost always run Linux as their operating system. In 1995, Linux might have been a quirky choice for people devoted to free and open source software. In 2023, Linux is mandatory for anyone in IT or software development. It’s hard to imagine getting a job or advancing in a career without demonstrating competence with Linux.
Year-over-year growth for IT certifications
It’s surprising to see the Cisco Certified Network Associate (CCNA) certification drop 18% and the Cisco Certified Network Professional (CCNP) certification drop 12%, as the Cisco certifications have been among the most meaningful and prestigious in IT for many years. (The Cisco Certified Internet Expert (CCIE) certification, while relatively small compared to the others, did show 70% growth.) There are several causes for this shift. First, as companies move workloads to the cloud or to colocation providers, maintaining a fleet of routers and switches becomes less important. Network certifications are less valuable than they used to be. But why then the increase in CCIE? While CCNA is an entry-level certification and CCNP is middle tier, CCIE is Cisco’s top-tier certification. The exam is very detailed and rigorous and includes hands-on work with network hardware. Hence the relatively small number of people who attempt it and study for it. However, even as companies offload much of their day-to-day network administration to the cloud, they still need people who understand networks in depth. They still have to deal with office networks, and with extending office networks to remote employees. While they don’t need staff to wrangle racks of data center routers, they do need network experts who understand what their cloud and colocation providers are doing. The need for network staff might be shrinking, but it isn’t going away. In a shrinking market, attaining the highest level of certification will have the most long-term value.
Cloud
We haven’t seen any significant shifts among the major cloud providers. Amazon Web Services (AWS) still leads, followed by Microsoft Azure, then Google Cloud. Together, this group represents 97% of cloud platform content usage. The bigger story is that we saw decreases in year-over-year usage for all three. The decreases are small and might not be significant: AWS is down 3.8%, Azure 7.5%, and Google Cloud 2.1%. We don’t know what’s responsible for this decline. We looked industry by industry; some were up, some were down, but there were no smoking guns. AWS showed a sharp drop in computers and electronics (about 27%), which is a relatively large category, and a smaller drop in finance and banking (15%), balanced by substantial growth in higher education (35%). There was a lot of volatility among industries that aren’t big cloud users—for example, AWS was up about 250% in agriculture—but usage in those industries isn’t high enough to move the overall numbers. (Agriculture accounts for well under 1% of total AWS content usage.) The bottom line is, as they say in the nightly financial news, “Declines outnumbered gains”: 16 out of 28 business categories showed a decline. Azure was similar, with 20 industries showing declines, although Azure saw a slight increase for finance and banking. The same was true for Google Cloud, though it benefited from an influx of individual (B2C) users (up 9%).
Over the past year, there’s been some discussion of “cloud repatriation”: bringing applications that have moved to the cloud back in-house. Cost is the greatest motivation for repatriation; companies moving to the cloud have often underestimated the costs, partly because they haven’t succeeded in using the cloud effectively. While repatriation is no doubt responsible for some of the decline, it’s at most a small part of the story. Cloud providers make it difficult to leave, which ironically might drive more content usage as IT staff try to figure out how to get their data back. A bigger issue might be companies that are putting cloud plans on hold because they hear of repatriation or that are postponing large IT projects because they fear a recession.
Of the smaller cloud providers, IBM showed a huge year-over-year increase (135%). Almost all of the change came from a significant increase in consulting and professional services (200% growth year over year). Oracle showed a 36% decrease, almost entirely due to a drop in content usage from the software industry (down 49%). However, the fact that Oracle is showing up at all demonstrates that it’s grown significantly over the past few years. Oracle’s high-profile deal to host all of TikTok’s data on US residents could easily solidify the company’s position as a significant cloud provider. (Or it could backfire if TikTok is banned.)
We didn’t include two smaller providers in the graph: Heroku (now owned by Salesforce) and Cloud Foundry (originally VMware, handed off to the company’s Pivotal subsidiary and then to the Cloud Foundry Foundation; now, multiple providers run Cloud Foundry software). Both saw fairly sharp year-over-year declines: 10% for Heroku, 26% for Cloud Foundry. As far as units viewed, Cloud Foundry is almost on a par with IBM. But Heroku isn’t even on the charts; it appears to be a service whose time has passed. We also omitted Tencent and Alibaba Cloud; they’re not in our subject taxonomy, and relatively little content is available.
Year-over-year growth for cloud providers
Cloud certifications followed a similar pattern. AWS certifications led, followed by Azure, followed by Google Cloud. We saw the same puzzling year-over-year decline here: 13% for AWS certification, 10% for Azure, and 6% for Google Cloud. And again, the drop was smallest for Google Cloud.
While usage of content about specific cloud providers dropped from 2021 to 2022, usage for content about other cloud computing topics grew. Cloud migration, a fairly general category for content about building cloud applications, grew 45%. Cloud service models also grew 41%. These increases may help us to understand why usage of content about the “big three” clouds decreased. As cloud usage moves beyond early adopters and becomes mainstream, the conversation naturally focuses less on individual cloud providers and more on high-level issues. After a few pilot projects and proofs of concept, learning about AWS, Azure, and Google Cloud is less important than planning a full-scale migration. How do you deploy to the cloud? How do you build services in the cloud? How do you integrate applications you have moved to the cloud with legacy applications that are staying in-house? At this point, companies know the basics and have to go the rest of the way.
Year-over-year growth for cloud certifications
With this in mind, it’s not at all surprising that our customers are very interested in hybrid clouds, for which content usage grew 28% year over year. Our users realize that every company will inevitably evolve toward a hybrid cloud. Either there’ll be a wildcat skunkworks project on some cloud that hasn’t been “blessed” by IT, or there’ll be an acquisition of a company that’s using a different provider, or they’ll need to integrate with a business partner using a different provider, or they don’t have the budget to move their legacy applications and data, or… The reasons are endless, but the conclusion is the same: hybrid is inevitable, and in many companies it’s already the reality.
The increase in use of content about private clouds (37%) is part of the same story. Many companies have applications and data that have to remain in-house (whether that’s physically on-premises or hosted at a data center offering colocation). It still makes sense for those applications to use APIs and deployment toolchains equivalent to those used in the cloud. “The cloud” isn’t the exception; it has become the rule.
Year-over-year growth for cloud architecture topics
Professional Skills
In the past year, O’Reilly users have been very interested in upgrading their professional and management skills. Every category in this relatively small group is up, and most of them are up significantly. Project management saw 47% year-over-year growth; professional development grew 37%. Use of content about the Project Management Professional (PMP) certification grew 36%, and interest in product management grew similarly (39%). Interest in communication skills increased 26% and interest in leadership grew by 28%. The two remaining categories that we tracked, IT management and critical thinking, weren’t as large and grew by somewhat smaller amounts (21% and 20%, respectively).
Several factors drive these increases. For a long time, software development and IT operations were seen as solo pursuits dominated by “neckbeards” and antisocial nerds, with some “rock stars” and “10x programmers” thrown in. This stereotype is wrong and harmful—not just to individuals but to teams and companies. In the past few years, we’ve heard a lot less about 10x developers and more about the importance of good communication, leadership, and mentoring. Our customers have realized that the key to productivity is good teamwork, not some mythical 10x developer. And there are certainly many employees who see positions in management, as a “tech lead,” as a product manager, or as a software architect, as the obvious next step in their careers. All of these positions stress the so-called “soft skills.” Finally, talk about a recession has been on the rise for the past year, and we continue to see large layoffs from big companies. While software developers and IT operations staff are still in high demand, and there’s no shortage of jobs, many are certainly trying to acquire new skills to improve their job security or to give themselves better options in the event that they’re laid off.
Year-over-year growth for professional skills topics
Web Development
The React and Angular frameworks continue to dominate web development. The balance is continuing to shift toward React (10% year-over-year growth) and away from Angular (a 17% decline). Many frontend developers feel that React offers better performance and is more flexible and easier to learn. Many new frameworks (and frameworks built on frameworks) are in play (Vue, Next.js, Svelte, and so on), but none are close to becoming competitors. Vue showed a significant year-over-year decline (17%), and the others didn’t make it onto the chart.
PHP is still a contender, of course, with almost no change (a decline of 1%). PHP advocates claim that 80% of the web is built on it: Facebook is built on PHP, for instance, along with millions of WordPress sites. Still, it’s hard to look at PHP and say that it’s not a legacy technology. Ruby on Rails grew 6.6%. Content usage for Ruby on Rails is similar to PHP, but Rails usage has been declining for some years. Is it poised for a comeback?
The use of content about JavaScript showed a slight decline (4.6%), but we don’t believe this is significant. In our taxonomy, content can only be tagged with one topic, and everything that covers React or Angular is implicitly about JavaScript. In addition, it’s interesting to see usage of TypeScript increasing (12%); TypeScript is a strongly typed variant of JavaScript that compiles (the right word is actually “transpiles”) to JavaScript, and it’s proving to be a better tool for large complex applications.
One important trend shows up at the bottom of the graph. WebAssembly is still a small topic, but it saw 74% growth from 2021 to 2022. And Blazor, Microsoft’s implementation of C# and .NET for WebAssembly, is up 59%. That’s a powerful signal. These topics are still small, but if they can maintain that kind of growth, they won’t be small for long. WebAssembly is poised to become an important part of web development.
Year-over-year growth for web development topics
Design
The heaviest usage in the design category went to user experience and related topics. User experience grew 18%, user research grew 5%, interface design grew 92%, and interaction design grew 36%. For years, we expected software to be difficult and uncomfortable to use. That’s changed. Apple made user interface design a priority in the early 2000s, forcing other companies to follow if they wanted to remain competitive. The design thinking movement may no longer be in the news, but it’s had an effect: software teams think about design from the beginning. Even software developers who don’t have the word “design” in their job title need to think about and understand design well enough to build decent user interfaces and pleasant user experiences.
Usability, the only user-centric topic to show a decline, was only down 2.6%. It’s also worth noting that use of content about accessibility has grown 96%. Accessibility is still a relatively small category, but that kind of growth shows that accessibility is an aspect of user experience that can no longer be ignored. (The use of alt text for images is only one example: it’s become common on Twitter and is almost universal on Mastodon.)
Information architecture was down significantly (a 17% drop). Does that mean that interest has shifted from designing information flow to designing experiences, and is that a good thing?
Use of content about virtual and augmented reality is relatively small but grew 83%. The past year saw a lot of excitement around VR, Web3, the metaverse, and related topics. Toward the end of the year, that seemed to cool off. However, an 83% increase is noteworthy. Will that continue? It may depend on a new generation of VR products, both hardware and software. If Apple can make VR glasses that are comfortable and that people can wear without looking like aliens, 83% growth might seem small.
Year-over-year growth for design topics
The Future
We started out by saying that this industry doesn’t change as much from year to year as most people think. That’s true, but that doesn’t mean there’s no change. There are signals of important new trends—some completely new, some continuations of trends that started years ago. So what small changes are harbingers of bigger changes in the years to come?
The Go and Rust programming languages have shown significant growth both in the past year and for the last few years. There’s no sign that this growth will stop. It will take a few more years, but before long they’ll be on a par with Java and Python.
It’s no surprise that we saw huge gains for natural language processing and deep learning. GPT-3 and its successor ChatGPT are the current stars of the show. While there’s been a lot of talk about another “AI winter,” that isn’t going to happen. The success of ChatGPT (not to mention Stable Diffusion, Midjourney, and many projects going on at Meta and Google) will keep winter away, at least for another year. What will people build on top of ChatGPT and its successors? What new programming tools will we see? How will the meaning of “computer programming” change if AI assistants take over the task of writing code? What new research tools will become available, and will our new AI assistants persist in “making stuff up”? For several years now, AI has been the most exciting area in software. There’s lots to imagine, lots to build, and infinite space for innovation. As long as the AI community provides exciting new results, no one will be complaining and no one need fear the cold.
We’ve also seen a strong increase in interest in leadership, management, communication, and other “soft skills.” This interest isn’t new, but it’s certainly growing. Whether the current generation of programmers is getting tired of coding or whether they perceive soft skills as giving them better job security during a recession isn’t for us to say. It’s certainly true that better communication skills are an asset for any project.
Our audience is slightly less interested in content about the “big three” cloud providers (AWS, Azure, and Google Cloud), but they’re still tremendously interested in migrating to the cloud and taking advantage of cloud offerings. Despite many reports claiming that cloud adoption is almost universal (and I confess to writing some of them), I’ve long believed that we’re only in the early stages of cloud adoption. We’re now past the initial stage, during which a company might claim that it was “in the cloud” on the basis of a few trial projects. Cloud migration is serious business. We expect to see a new wave of cloud adoption. Companies in that wave won’t make naive assumptions about the costs of using the cloud, and they’ll have the tools to optimize their cloud usage. This new wave may not break until fears of a recession end, but it will come.
While the top-level security category grew 20%, we’d hoped to see more. For a long time, security was an afterthought, not a priority. That’s changing, but slowly. However, we saw huge gains for zero trust and governance. It’s unfortunate that these gains are driven by necessity (and the news cycle), but perhaps the message is getting through after all.
What about augmented and virtual reality (AR/VR), the metaverse, and other trendy topics that dominated much of the trade press? Interest in VR/AR content grew significantly, though what that means for 2023 is anyone’s guess. Long-term, the category probably depends on whether or not anyone can make AR glasses a fashion accessory that everyone needs to have. A bigger question is whether anyone can build a next-generation web that’s decentralized, and that fosters immediacy and collaboration without requiring exotic goggles. That’s clearly something that can be done: look no further than Figma (for collaboration), Mastodon (for decentralization), or Petals (for a cloud-less cloud).
Will these be the big stories for 2023? February is only just beginning; we have 11 months to find out.
Footnotes
1. Box said “models”; a metric is a kind of model, isn’t it?