I’ve changed my views over the past few years on some controversial topics due to realizations while working at a big company.
Very little is ever optimal when it’s something abstract and deals with humans. These things are inherently lossy and need some kind of “organization.”
In a relatively small organization of only 1000 people, changes made just a few years ago got completely lost.
- Migrations from platform A to platform B with authors leaving the company in the mix made decisions practically disappear.
- Team A uses team B’s platform incorrectly for months without anyone realizing, until eventually it breaks. Uh oh, but Team A’s component was load-bearing and set up by a dev on vacation right now. Team’s gotta figure it out immediately and service is interrupted for over 3 hours.
- Various teams building for problems that don’t exist. Doing hard work every day on the wrong problems, but the organization is not configured to speak honestly. XY problems everywhere you look. Speeding up a tool, service, app, or layer that never should have been made.
That’s just 1000 people, and in a highly successful organization of smart folks that seems like it’s flourishing from an outsider’s perspective. On the inside, it sometimes feels like everything’s on fire. There are issues all over the place, with not enough time in the day to even log them all in a coherent way. This is a detail-oriented person speaking, though. The truth is that succeeding while being on fire is kind of just how organizations work. It’s not optimal, but it’s optimal enough to continue kicking.
It was never really the code that was hard to write or distributed theory that was hard to understand. Yes, theory and implementation can be challenging, but it’s always been the organization and coordination between people, teams, and their work and ideas that remains at the core of difficult distributed problems. I also think that those organizational struggles will always be the hardest problems facing humanity.
I don’t think there is a “solution” to organizing, because we are the bottleneck. Nobody can ingest “all of the information” of the small organization’s history in a day or even a year. It’s condensed into lossy summaries while some important context is long gone before it’s ever ingested. Regardless of any tool invented to help us with productivity, in the end, it’s still assuming that the human in control cares, that they understand, and that finding or understanding the source is even the problem. We can’t ingest that much, we can’t do it that quickly, and at the same time, most don’t even care.
After seeing this organization, and working on my own personal ones, I’ve grown to appreciate good leaders. They’re rare, they don’t get easily overwhelmed by things that don’t matter, and they can bring some clarity to coordination and organization. The clarity is the most impressive trait, but it’s still not enough to make organizing “easy” in any way. They just reduce enough of the worst friction. It’s not the leader that can fix coordination, it’s a bit of everybody and nobody at the same time. It’s a spectrum that ranges from fully disorganized up to impressively organized. But at a certain scale, which to me feels like roughly 50 people, it’s just super lossy, and communication and alignment begin to fall off a cliff.
With all of this big org experience over the past few years, I now feel very skeptical of any “reorganizing society” or “why don’t we just…” claims. Actually, this is why I’m writing this. I’ve seen multiple articles over the past few weeks about local and national politics where this rhetoric is taking off. Articles where somehow some massive coordination problem of over 80M people is all caused by some single component that can be fixed with a new 100 page policy. I mean, it’s not like a lot of it doesn’t sound utopian! But I am noticing more and more that the hardest part is the part most often hand-waved. The organization is the problem, not the theory.
Yes, ending all homelessness and hunger sounds incredible. But it’s not “just $100B” and we solve it forever, because these are social problems and not purely technical ones. An issue I’m noticing is that folks view huge scale social problems as if they’re simple technical ones, where with the flip of a switch, everything’s now solved, and that we only need to coordinate on flipping that switch. Some examples of technical solutions that improved organization:
- Before git, having a few people work on the same codebase got messy. Version control occupied a lot more mental bandwidth.
- Before CI/CD workflows took off in technology, deploying new code was less efficient. Some companies would have the whole team deploying the thing in the office watching it happen all night. Watching each service update, making sure it all works with manual and scripted testing, and manually rolling back certain things if needed while documenting everything that happened.
Nice, so we made our organization way better by creating and adopting these clean technical solutions! So, there is room for growth even in successful organizations which is great. The interesting parts are that these are technical issues that deal specifically with being able to control your environment, isolate moving parts, iterate over and over again, and where fixing something once means it’s fixed (mostly). And yet, even with great technical solutions, organization still isn’t a solved problem, it just looks different and new possibilities arise, with a million ways to use a new thing. This is because, like I’ve said, we’re the problem! The merge conflict isn’t a failure with git. Working on a big codebase with a few teammates still isn’t some solved problem after adopting these technologies, but it’s done differently and has a new set of issues that different organizations solve differently.
I argue now that societal issues are almost never “just throw $100B at it” types of problems, because they lack the fundamental simplicity of a technical problem. There’s very little control over environmental pieces, and if we end up going in the wrong direction, how are we certain about it without watching it play out for way too long with too many resources? Even the most well-designed implementation should expect huge losses due to coordination, producing modest results rather than incredibly transformative ones.
In my flourishing technical organization example, I noted that there are entire teams working on the wrong problem on products that should not exist. This is acceptable because the organization as a whole continues functioning well enough. When distributed across an entire country, the same problems happen but the tolerance for waste is lower, and people critique and cut the supply off.
Also, the technical solutions themselves aren’t silver bullets, despite having endless control of the problem, solution, and environment. Git and any CI/CD flow have failure modes too, which we accept as part of reality and learn to work around, such as the common merge conflict. We accept that solving the merge conflict sometimes while using git is better than not using version control and suffering from the much worse alternative. When talking about social issues, many people see anything short of a full solution as a failure despite the fact that even technical solutions in a vacuum have their own failure modes.
If the “how” of implementation doesn’t touch on the “who does what when,” it doesn’t seem complete. Or when the proposed solution doesn’t clearly define realistic success, or if it’s that “you need to read xyz theory,” it also feels incomplete. I’ve seen how impossible it can be to get just 25 people aligned on the terms of a medium sized project, and then have the same mental model, keep at it consistently for a few months, and finally finish the job while documenting it and then maintaining it. They could all understand the theory, but is their mental map the same? And importantly is the difference between their mental maps sustainable, or will it cause endless conflict and redirect energy to the wrong rabbit holes?
We are frequently asking the wrong questions and aren’t optimizing for a good way to solve our problems. Not the “best way,” but “a good way.”
Organizations are open systems, but we like to treat them like simple closed ones. We can’t easily rally behind a complex open system, so we invent a simpler enemy that is the cause of all of our problems. It happens everywhere, and it’s not necessarily bad. Since noticing this, I feel more grounded overall when I come across the simplistic headlines.
I don’t think this is a negative perspective. It’s just that we might want solutions that are too black and white when usually there is no full solution, since the problem is us and our inability to ever fully coordinate. There are micro improvements everywhere, like adding technical solutions into the mix, but usually not a one-size-fits-all “why don’t we just…” type of solution. Seeing it this way actually makes me less upset over everything, because the big problems aren’t usually as simple as “evil guy must be stopped from causing all evil.” Issues have to be emotionally portrayed that way to make any coordination even begin, but really they’re a lot more boring and complex, with hundreds of boring moving pieces.
We must expect lossy organization and design for it. We ourselves as humans are the problem, and we have to remember that each of us is complex enough to make coordinating thousands of us the hardest possible challenge with no simple solutions. We’re made up of millions of small patch solutions that work locally, meaning there will cyclically be disagreement and “if only…” arguments since we want it to be easy.
The big problems are boring and complex with hundreds of moving pieces. Recognizing this doesn’t solve anything, but it does change how we evaluate successes and failures. When we expect perfect coordination, any continued dysfunction becomes evidence of incompetence, justifying abandonment or escalation. But if we design for lossiness from the start, we can ask better questions, like, “is this particular mess worse than the alternative mess?” and “what actually is success for this problem?”