this post was submitted on 13 Aug 2024
::: spoiler I appreciate the perspective, and acknowledge that Apple is on an edge node. ARM sells the IP blocks required to create masks. Apple is designing chips as much as a toddler with wooden toy blocks is a master carpenter. They are paying a hefty royalty to ARM both for the design and for every single chip sold. RISC-V obsoletes this business model.
While it is true that Intel struggles, the Intel/AMD duality and transparency have a history that goes back to the government/enterprise requirement for second sourcing. Perhaps we live in a world where figureheads are completely ignorant of single source extortion and monopolies, but this is the direct inevitability of any move to ARM.
I believe the push for ARM is limited to the consumer space, and is attempting to follow in the footsteps of smartphones, and Apple, as a method of planned obsolescence through a proprietary POSIX kernel. This move is intended to undercut the market in hopes that people are ignorant enough to sell their right of ownership and get stuck with a worthless product in the long term.
One could argue that all hardware is a worthless product long term. That has been the case for quite a while, but not as recently, and won't remain the case in the future. The odds are high that your present mobile device is not much more advanced than your last. If your present device was designed for reparability and durability, it could likely last 5-10 times longer than the orphaned kernel used to steal ownership and prevent you from doing so. That kernel leverages a proprietary binary that supports the hardware and obsoletes the device. Every device model, and even some carrier sub-models, has a unique binary that makes no sense to reverse engineer in order to open source the support. It has been attempted before, such as with the LG Hammerhead, but the challenge is too monumental and time consuming.
All that said, I don't think anyone has seen where the market is headed right now. If you step back to the big picture abstraction, as I love to do, there is a very important shift that is happening right now. At present, all processor designs have failed. They use a math coprocessor for matrix multiplication. This is an absolute disaster for the CPU architecture. Intel tried this in the early days of x86 with a math coprocessor as an accessory and it failed miserably in practice. Under the surface, this issue comes down to the bandwidth of the L2 to L1 cache memory.
I have been thinking about this for a while, and while entirely speculative, I don't think this is a solvable problem without a 10 year scale redesign from scratch. This area of the die is at the edge of the insanely fast clock speeds in the core. Increasing the bus width here will inevitably cause major issues. This is in the gigahertz regime where electrical properties turn magic with signals that can jump gaps all over the place because everything is a capacitor, a resistor, and an inductor. The power differential of signals in this region is minuscule, and a large amount of parallelism is a monumental hurdle for instances when the majority of that bus is in the high state. If that bus could be wider, I believe it already would be. The push for single threaded performance and the marketability of CPU speed is what drove the evolution of this CPU architecture for decades now.
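To put rough numbers on the bandwidth point, here is a back-of-the-envelope sketch in Python. It is not a measurement of any real chip. It only counts arithmetic operations against the minimum bytes that have to be moved, and the function names, the matrix size, and the 2 bytes per value are all assumptions chosen for illustration.

```python
def matvec_intensity(n: int, bytes_per_value: int) -> float:
    """FLOPs per byte for an n x n matrix times an n-vector.

    Every weight is read once and used for one multiply and one add,
    so there is almost no reuse and the memory system sets the pace.
    """
    flops = 2 * n * n
    # weights, plus the input and output vectors
    bytes_moved = (n * n + 2 * n) * bytes_per_value
    return flops / bytes_moved


def matmul_intensity(n: int, bytes_per_value: int) -> float:
    """FLOPs per byte for an n x n by n x n matrix multiply.

    Best case: each of the three matrices crosses the bus exactly once.
    """
    flops = 2 * n * n * n
    bytes_moved = 3 * n * n * bytes_per_value
    return flops / bytes_moved


if __name__ == "__main__":
    n = 4096  # hypothetical layer width
    print(f"matrix-vector: {matvec_intensity(n, 2):.2f} FLOPs/byte")
    print(f"matrix-matrix: {matmul_intensity(n, 2):.2f} FLOPs/byte")
```

Under these assumptions the matrix-vector case does about one operation per byte fetched, while a full matrix-matrix product can in principle do over a thousand. That gap is why the width and speed of the path feeding the math units matters so much for this kind of workload.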
The market is focused on the most viable alternative of the maths coprocessor using a GPU, but this is a hack and is long term untenable. In the long term, data centers are not going to bear the overhead of a dual compute solution. Anyone that designs a new scalable processor that can handle both traditional code and matrix multiplication flexibly will win the ensuing market across the board. Any business that can handle both types of execution and scale to accommodate demand utilizing their entire available infrastructure will inevitably be much more efficient and more profitable.
What does this really mean? It means that as of a year and a half ago, the entire market changed at the foundational level. Hardware is very slow. It takes 10 years from initial idea to first consumer market availability. At best, that means that the current systems paradigm has ~8 years before total obsolescence. I am willing to bet the farm that by then no one will be using a CPU from the present, or a GPU as a math coprocessor for matrix math.
How does this change the market? - nobody asked - It takes away the advantage of incumbency and establishment. It also takes away the security of iterative conservatism. Now all of a sudden it is a massive liability to be iteratively conservative. Simply having the capital to pursue this new shift is a viable path to a competitive market share. There is absolutely no reason to hire ARM to do anything from scratch like this. It is far smarter to poach engineers from them and from academia and use RISC-V without paying the royalty to an unnecessary middle man. ARM's only selling point is the initial cost savings of a prepackaged design, but all of their IP blocks are still focused on single threaded code execution. In fact, ARM is at a major disadvantage due to the nature of reduced instructions, as it will require a redesign to accommodate an AVX-like instruction capable of loading matrix math quickly.
The primary way Apple/ARM is handling AI workloads right now is through software optimisations. Sizing the tensors to the actual hardware massively improves the speed. This is simply software stuff. AI is moving too fast for this to work as a long term practice. Every model architecture needs to be tweaked, and the future will be very different. A flexible solution at the hardware level is required.
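As a minimal sketch of what "sizing the tensors to the actual hardware" can look like in software, the Python below rounds each tensor dimension up to a multiple of a tile size. The tile size of 16 and the helper names are hypothetical and do not come from any Apple or ARM documentation.

```python
def round_up(n: int, tile: int) -> int:
    """Smallest multiple of `tile` that is greater than or equal to n."""
    return ((n + tile - 1) // tile) * tile


def padded_shape(shape: tuple, tile: int) -> tuple:
    """Round every dimension of `shape` up to a multiple of `tile`."""
    return tuple(round_up(d, tile) for d in shape)


def padding_overhead(shape: tuple, tile: int) -> float:
    """Fraction of extra elements introduced by padding."""
    original = 1
    padded = 1
    for d in shape:
        original *= d
        padded *= round_up(d, tile)
    return (padded - original) / original


if __name__ == "__main__":
    TILE = 16  # assumed tile width, for illustration only
    shape = (1000, 4096)
    print(padded_shape(shape, TILE))
    print(f"{padding_overhead(shape, TILE):.1%} extra elements")
```

The trade is a small amount of wasted memory in exchange for every block of work landing exactly on the hardware's native size. It also shows why this is fragile as a long term practice: the right tile size is a property of one specific chip, so every new model and every new device needs the sizing redone.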
American businesses have become extremely slow and conservative. There is no telling who is doing the next generation of dominant hardware right now. Judging by the clown show of how the Americans handled tooling up for EVs, I expect the industry will pivot to Asia entirely. They have the foresight and stability to compete in this situation. The only question is really if ASML wants to stay relevant and sell to China, or if China has already found a replacement solution for EUV. I don't think anyone will reveal a single detail of what they are working on in the present, but when everyone shows their hand, it is a truly open statistical game unlike any time in the last 30 years. I see no reason why the establishment has a fortified entrenchment in the market.
I believe this will kill both ARM and x86 in a different future world, bond.
That’s not right. It’s not even wrong.
TSMC provides the cell library.
As far as I understand it, wouldn't the cell library be more like the node equivalent of a KiCAD library for 0402 passive footprints for PCB design? Like, here is how we must do gates, buses, etc. But that has nothing to do with the way the ALU is set up or the LCR aspects of a final design? I've honestly only watched Asianometry, skimmed Intro to VLSI a few times, dabbled in FPGA, built Ben Eater's breadboard computer, and screwed around with a CPU scheduler to learn why my last computer sucked at complex CAD assemblies. When I was looking for AI hardware to run LLMs I went deep enough to understand the specific CPU limitation, and upon learning about my phone's matrix coprocessor I tried learning enough to understand why the thing even exists. That led me to the understanding that a model can be designed for a specific architecture and run MUCH faster and smaller. I explain things as they sit on my road map of understanding, knowing I'm likely wrong on the edge cases. I am no expert. I'm trying to give anyone enough rope to pull on so that I can find out where I'm wrong and learn. I share because I want to learn, I want to be wrong, but only in a way that I can extend incrementally from my mental roadmap.