Updated: May 16, 2022
This morning I was coming into my building when I noticed an excessively silly song playing over the PA. I don't know if it was trying to be that silly on purpose, but I had the impression it was just trying to stick out in a crowded field of overproduced, loop-based songs. As I got in the elevator, it occurred to me that there's a reason modern songs tend to have the same failings, and it's not just that a lot of modern producers don't have formal training. It's the software.
I am far from an expert, but in my experience the way music is created with the voice or on acoustic instruments is vastly different from how it is created in a DAW - a digital audio workstation, otherwise known as the software that powers modern music production. Common examples are Ableton Live, Logic, and Pro Tools. I happen to work exclusively in Logic, but for these purposes they're all much the same.
The first big difference is in how the two mediums treat time. On a physical instrument, time is fluid and subjective. One can play entirely without time - in fact, that is the default, just as in speech. It takes conscious effort to keep a steady beat; just ask any teacher of young children! Controlling the flow of time is as easy as speeding up or slowing down, and there is no click; establishing a steady groove takes real work.
Contrast this with a DAW, where by default you are locked into a grid. You are required to choose a tempo and meter, and it takes considerable effort to break out of this restricted environment. Of course, you could ignore the grid entirely and record whatever you want, but if you want to alternate between time and timelessness, it's going to be a huge pain. As far as I know, no technology yet makes beat mapping as simple and intuitive as it is when playing live. Think of it as the difference between sketching on plain white paper or canvas and having to follow the lines on a sheet of graph paper. Even if you ignore the lines on the graph paper entirely, they still exert a subtle pull on your subconscious.
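To make the grid concrete, here is a minimal sketch of what snapping a performance to the grid (quantizing) amounts to. The function name and numbers are my own illustration, not any real DAW's API:

```python
# Hypothetical sketch of grid quantization: snap note onsets (in seconds)
# to the nearest sixteenth note at a fixed tempo. Names are illustrative,
# not taken from any real DAW.

def quantize(onsets_sec, bpm=120, subdivision=4):
    """Snap each onset to the nearest 1/(4*subdivision) note.

    subdivision=4 means sixteenth-note resolution at the given bpm.
    """
    beat = 60.0 / bpm          # length of a quarter note, in seconds
    grid = beat / subdivision  # length of one grid cell
    return [round(t / grid) * grid for t in onsets_sec]

# A slightly loose human performance at 120 BPM...
played = [0.02, 0.49, 1.04, 1.51]
# ...lands exactly on the grid afterward - the looseness is erased.
print(quantize(played))  # → [0.0, 0.5, 1.0, 1.5]
```

The point of the sketch is how lossy the operation is: whatever expressive push and pull existed in `played` is simply rounded away.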
For example, the fact that 4/4 is imposed by default can have a chilling effect on the development of musical structure. Someone who is self-taught and untrained in the classics, especially if they use pre-made loops, is likely to fall into the trap of limiting themselves to 4- and 8-bar phrases. Now, what if the default time signature were 1/4? That would give us a lot more flexibility, and encourage composers to write phrases with the organic variety of speech rather than the mathematical precision of the grid. Better still would be a system smart enough to detect the tempo and meter of a piece from your playing, and superimpose them after the fact.
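A toy version of that "detect the tempo from the playing" idea can be sketched in a few lines, assuming the performer plays roughly even note values. Real beat trackers use autocorrelation or dynamic programming and are far more sophisticated; this only illustrates inferring the grid from the performance rather than imposing it up front:

```python
# Toy after-the-fact tempo detection: estimate BPM from the median
# inter-onset interval of a performance. This is my own illustrative
# sketch, not how any shipping DAW does it.
from statistics import median

def estimate_bpm(onsets_sec):
    intervals = [b - a for a, b in zip(onsets_sec, onsets_sec[1:])]
    return 60.0 / median(intervals)  # one beat per typical interval

# A performance hovering around 100 BPM (0.6 s per beat), slightly loose:
onsets = [0.0, 0.61, 1.19, 1.80, 2.41, 3.00]
print(round(estimate_bpm(onsets)))  # → 98
```

Note that the estimate follows the player (98 BPM here, not a clean 100): the grid is fitted to the human, instead of the human being snapped to the grid.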
Another issue I run into with current technology is the way articulations are sampled in instrument libraries. Switching from one articulation to another requires, at minimum, a keystroke and a discrete change of sample layer, whereas on an acoustic instrument articulations exist on a spectrum, just like dynamics (which, incidentally, are modeled quite well these days). A player can move seamlessly between more and less short, more and less accented - a gradient. On the computer, the way instruments are sampled, you must choose legato or staccato, and while you're one you're not the other. Hence the surplus of boring staccato string-chord riffs in film soundtracks, for example. We are incentivized to keep articulations uniform, not because variety is impossible, but because it doesn't feel good in the workflow. And that feeling extends to many other aspects of music production. Creativity thrives on flow, and for that flow we are, to a certain extent, at the mercy of our tools.
The more I thought about it, the more it reminded me of the transition from 2D to 3D in videogames in the late '90s. When 3D was in its infancy, it looked horrible. Everything was reduced to blocky polygons with flat bitmaps that had no texture at all. In some ways it was worse than Minecraft, which at least has an intentional aesthetic. This was just ugly and primitive. But what really pained me at the time was the contrast between that and what was possible in 2D. 2D was a mature technology. We had beautiful hand-painted environments, and they were cheap and accessible. Just look at the contrast between the 3D-modeled characters and the 2D painted backgrounds in Final Fantasy 7, for instance. All the atmosphere came from the backgrounds and the music, which made up for the laughably primitive modeling of the actual characters. At the time, I found the wholesale abandonment of 2D for 3D tragic.
Of course, companies like id and Epic iterated on their 3D engines continually, and with each generation we crept closer to virtual reality. Now, in 2022, photo-realism is closer than ever, and we're limited mostly by our imaginations. In fact, I've noticed in recent years a trend in movies and games aimed at children where 3D technology is leveraged not to mimic reality, but to create fantastic new worlds that defy convention.
I feel the music technology industry needs to go through a similar revolution. I don't know how many people are out there thinking about how awkward and unnatural the workflow is in our current DAWs. But I feel pretty confident that we are in our rudimentary, "first-generation-3D" phase, and that the quality of our popular music demonstrates it. In the future, someone somewhere will develop tools that let us expose our hearts with a lot less clicking and tweaking...and then you'll really start to hear something remarkable.