This is amazing, such a nice presentation. It reminds me of the Neural Network Zoo [1], which was also a nice visualization of different architectures.
Interesting collection. The architecture differences show up in surprising ways when you actually look at prompt patterns across models. Longer context windows don't just let you write more, they change what kind of input structure works best.
Is there a sort order? It would be so nice to understand the threads of evolution and revolution in the progression, a bit of a family tree and influence layout. It would also be nice to have a scaled view so you can sense the difference in sizes over time.
Thanks! This is cool. Can you tell me if you learnt anything interesting or surprising when pulling this together? As in, did it teach you something about LLM architecture that you didn't know before you began?
[1] https://www.asimovinstitute.org/neural-network-zoo/